rDLB: A Novel Approach for Robust Dynamic Load Balancing of Scientific Applications with Parallel Independent Tasks
Conference Paper (articles published in conference proceedings)
 
ID 4514869
Author(s) Mohammed, Ali Omar Abdelazim; Cavelan, Aurélien; Ciorba, Florina M.
Author(s) at UniBasel Mohammed, Ali Omar Abdelazim; Cavelan, Aurélien; Ciorba, Florina M.
Year 2019
Title rDLB: A Novel Approach for Robust Dynamic Load Balancing of Scientific Applications with Parallel Independent Tasks
Book title (Conference Proceedings) International Conference on High Performance Computing & Simulation (HPCS)
Place of Conference Dublin, Ireland
Publisher IEEE
Abstract Scientific applications often contain large and computationally intensive parallel loops. Dynamic loop self-scheduling (DLS) is used to achieve a balanced load execution of such applications on high performance computing (HPC) systems. Large HPC systems are vulnerable to processor or node failures and to perturbations in the availability of resources. Most self-scheduling approaches either do not consider fault-tolerant scheduling or depend on failure or perturbation detection and react by rescheduling failed tasks. In this work, a robust dynamic load balancing (rDLB) approach is proposed for the robust self-scheduling of independent tasks. The proposed approach is proactive and does not depend on failure or perturbation detection. The theoretical analysis of the proposed approach shows that it is linearly scalable and that its cost decreases quadratically as the system size increases. rDLB is integrated into an MPI DLS library to evaluate its performance experimentally with two computationally intensive scientific applications. Results show that rDLB enables the tolerance of up to (P − 1) processor failures, where P is the number of processors executing an application. In the presence of perturbations, rDLB boosted the robustness of DLS techniques by up to 30 times and decreased application execution time by up to 7 times compared to their counterparts without rDLB.
edoc-URL https://edoc.unibas.ch/72155/
Full Text on edoc No
ISI-Number INSPEC:18972176
 
   
