Critical-blame analysis for OpenMP 4.0 offloading on Intel Xeon Phi

Publication: Other publication > Other > Contributed > Peer-reviewed

Contributors

  • R. Dietrich - (Author)
  • F. Schmitt - (Author)
  • A. Grund - (Author)
  • J. Stolle - (Author)

Abstract

Recent supercomputers rated in the TOP500 list increasingly utilize accelerator or co-processor devices to improve performance and energy efficiency. Since version 4.0 of its specification, OpenMP has addressed this heterogeneity in computing with the target directives, which enable programmers to offload portions of the code to massively parallel target devices. Due to this new complexity in hardware and software design, performance optimization of large-scale parallel programs is becoming increasingly challenging. As manual performance analysis is becoming infeasible for complex high performance computing (HPC) codes, we propose an approach to automatically detect bottlenecks such as load imbalances in heterogeneous OpenMP applications. We developed a method to perform critical-path and root-cause analysis for the OpenMP 4.0 offloading model and integrated it into the tool CASITA. The post-mortem analysis is based on execution traces that are generated with an implementation of the evolving OpenMP Tools Interface in the measurement system Score-P. To validate the implementation of our method, we ported several existing codes to OpenMP 4.0, executed them with an Intel Xeon Phi as the target device, and analyzed the resulting trace files.
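
For readers unfamiliar with the target directives mentioned in the abstract, the following is a minimal sketch in C of an OpenMP 4.0 offload region. It is not taken from the publication: the function name saxpy_offload and the kernel are hypothetical and chosen only for illustration. The sketch assumes a compiler with OpenMP 4.0 offloading support; without it, the pragmas are ignored and the loop runs on the host.

```c
/* Hypothetical example: y = a * x + y, offloaded with an OpenMP 4.0
   target region. If no target device is available, execution falls
   back to the host. */
void saxpy_offload(float a, const float *x, float *y, int n)
{
    /* map(to:) copies x to the device before the region starts;
       map(tofrom:) copies y to the device and back afterwards. */
    #pragma omp target map(to: x[0:n]) map(tofrom: y[0:n])
    #pragma omp parallel for
    for (int i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

In a sketch like this, the host thread waits until the target region has finished. That kind of wait state between host and device is what a critical-path analysis of the offloading model would have to account for.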

Details

Original language: English
Number of pages: 8
Volume: 125
Publication status: Published - 2016
Peer-reviewed: Yes

External IDs

Scopus 84955312939

Keywords

  • hardware, energy efficiency, software