Performance Measurement for the OpenMP 4.0 Offloading Model - Lecture Notes in Computer Science
Research output: Contribution to conferences › Paper › Contributed › peer-review
Abstract
OpenMP is one of the most widely used standards for enabling thread-level parallelism in high performance computing codes. The recently released version 4.0 of the specification introduces directives that enable application developers to offload portions of the computation to massively parallel target devices. However, to efficiently utilize these devices, sophisticated performance analysis tools are required. The emerging OpenMP Tools Interface (OMPT) aids the development of portable tools, but currently lacks support for OpenMP 4.0 target directives. This paper presents a novel approach to measure the performance of applications utilizing OpenMP offloading. It introduces libmpti, an OMPT-based measurement library for Intel MIC target devices. For host-side analysis we extended the OPARI2 instrumenter and prototypically integrated the complete approach into the state-of-the-art tool infrastructure Score-P. We demonstrate the effectiveness of the presented method and implementation with a Conjugate-Gradient (CG) kernel on an Intel Xeon Phi coprocessor. Finally, we visualize the obtained performance data with Vampir.
Details
Original language | English |
---|---|
Pages | 291-301 |
Number of pages | 11 |
Publication status | Published - 2014 |
Peer-reviewed | Yes |
External IDs
Scopus | 84920092368 |
---|---|
Keywords
- performance analysis, offloading, OpenMP 4.0, Intel MIC, Score-P