Trace-based performance analysis for the petascale simulation code FLASH
Research output: Contribution to journal › Research article › Contributed › peer-review
Abstract
Performance analysis of applications on modern high-end petascale systems is increasingly challenging due to the rising complexity and number of computing units. This paper presents a performance-analysis study using the Vampir performance-analysis tool suite, which examines application behavior as well as fundamental system properties. The study was carried out on the Jaguar system at Oak Ridge National Laboratory, the fastest computer on the November 2009 Top500 list. We analyzed the FLASH simulation code, which is designed to scale to tens of thousands of CPU cores; at that scale, using existing performance-analysis tools becomes very complex. The study reveals two classes of performance problems that are relevant at very high CPU counts: MPI communication and scalable I/O. For both, solutions are presented and verified. Finally, the paper proposes improvements and extensions to event-tracing tools that would allow them to scale towards higher degrees of parallelism.
Details
Original language | English |
---|---|
Pages (from-to) | 428-439 |
Number of pages | 12 |
Journal | International Journal of High Performance Computing Applications |
Volume | 25 |
Issue number | 4 |
Publication status | Published - Nov 2011 |
Peer-reviewed | Yes |
External IDs
Scopus | 82955223396 |
---|---|
Keywords
- collective I/O, collective MPI operations, event tracing, libNBC, Vampir