Highly scalable SFC-based dynamic load balancing and its application to atmospheric modeling

Research output: Contribution to journalResearch articleContributedpeer-review


Load balance is one of the major challenges for efficient supercomputing, especially for applications that exhibit workload variations. Various dynamic load balancing and workload partitioning methods have been developed to handle this issue by migrating workload between nodes periodically during the runtime. However, on today’s top HPC systems–and even more so on future exascale systems–runtime performance and scalability of these methods becomes a concern, due to the costs exceeding the benefits of dynamic load balancing. In this work, we focus on methods based on space-filling curves (SFC), a well-established and comparably fast approach for workload partitioning. SFCs reduce the partitioning problem from n dimensions to one dimension. The remaining task, the so-called 1D partitioning problem or chains-on-chains partitioning problem, is to decompose a 1D workload array into consecutive, balanced partitions. While published parallel heuristics for this problem cannot reliably deliver the required workload balance, especially at large scale, exact algorithms are infeasible due to their sequential nature. We therefore propose a hierarchical method that combines a heuristic and an exact algorithm and allows to trade-off between these two approaches. We compare load balance, execution time, application communication, and task migration of the algorithms using real-life workload data from two different applications on two different HPC systems. The hierarchical method provides a significant speed-up compared to exact algorithms and yet achieves nearly the optimal load balance. On a Blue Gene/Q system, it is able to partition 2.6 million tasks for 524 288 processes with over 99% of the optimal balance in 23.4 ms only, while a published fast exact algorithm requires 6.4 s. We also provide a comparison to parallel load balancing methods implemented in the Zoltan library and present results from applying our methods to COSMO-SPECS+FD4, a detailed atmospheric simulation model that requires frequent dynamic load balancing to run efficiently at large scale.


Original languageEnglish
Pages (from-to)575-590
Number of pages16
JournalFuture Generation Computer Systems
Issue number82
Publication statusPublished - 2018

External IDs

Scopus 85019918232
ORCID /0000-0003-3137-0648/work/142238851



  • dynamic load balancing, Earth and atmospheric sciences, Massively parallel algorithms, space-filling curves, One-dimensional partitioning