Holistic Performance Analysis of Multi-layer I/O in Parallel Scientific Applications

Research output: Types of ThesisDoctoral thesis


Efficient usage of file systems poses a major challenge for highly scalable parallel applications. The performance of even the most sophisticated I/O subsystems lags behind the compute capabilities of current processors. To improve the utilization of I/O subsystems, several libraries, such as HDF5, facilitate the implementation of parallel I/O operations. These libraries abstract from low-level I/O interfaces (for instance, POSIX I/O) and may internally interact with additional I/O libraries. While improving usability, I/O libraries also add complexity and impede the analysis and optimization of application I/O performance. This thesis proposes a methodology to investigate application I/O behavior in detail. In contrast to existing approaches, this methodology captures I/O activities on multiple layers of the I/O software stack, correlates these activities across all layers explicitly, and identifies interactions between multiple layers of the I/O software stack. This allows users to identify inefficiencies at individual layers of the I/O software stack as well as to detect possible conflicts in the interplay between these layers. Therefor, a monitoring infrastructure observes an application and records information about I/O activities of the application during its execution. This work describes options to monitor applications and generate event logs reflecting their behavior. Additionally, it introduces concepts to store information about I/O activities in event logs that preserve hierarchical relations between I/O operations across all layers of the I/O software stack. In combination with the introduced methodology for multi-layer I/O performance analysis, this work provides the foundation for application I/O tuning by exposing patterns in the usage of I/O routines. This contribution includes the definition of I/O access patterns observable in the event logs of parallel scientific applications. These access patterns originate either directly from the application or from utilized I/O libraries. The introduced patterns reflect inefficiencies in the usage of I/O routines or reveal optimization strategies for I/O accesses. Software developers can use these patterns as a guideline for performance analysis to investigate the I/O behavior of their applications and verify the effectiveness of internal optimizations applied by high-level I/O libraries. After focusing on the analysis of individual applications, this work widens the scope to investigations of coordinated sequences of applications by introducing a top-down approach for performance analysis of entire scientific workflows. The approach provides summarized performance metrics covering different workflow perspectives, from general overview to individual jobs and their job steps. These summaries allow users to identify inefficiencies and determine the responsible job steps. In addition, the approach utilizes the methodology for performance analysis of applications using multi-layer I/O to record detailed performance data about job steps, enabling a fine-grained analysis of the associated execution to exactly pinpoint performance issues. The introduced top-down performance analysis methodology presents a powerful tool for comprehensive performance analysis of complex workflows. On top of their theoretical formulation, this thesis provides implementations of all proposed methodologies. For this purpose, an established performance monitoring infrastructure is enhanced by features to record I/O activities. These contributions complement existing functionality and provide a holistic performance analysis for parallel scientific applications covering computation, communication, and I/O operations. Evaluations with synthetic case studies, benchmarks, and real-world applications demonstrate the effectiveness of the proposed methodologies. The results of this work are distributed as open-source software. For instance, the measurement infrastructure including improvements introduced in this thesis is available for download and used in computing centers world-wide. Furthermore, research projects already employ the outcomes of this work.


Original languageEnglish
Awarding Institution
  • Nagel, Wolfgang Erwin, Mentor
Publication statusPublished - 2021
No renderer: customAssociatesEventsRenderPortal,dk.atira.pure.api.shared.model.researchoutput.Thesis



  • Multi-layer I/O, Parallel Scientific Applications, Mehrschichtige E/A, Leistungsanalyse