Performance Tools for the NEC SX-Aurora Tsubasa

Publication: Contribution to book/conference proceedings/anthology/report › Contribution to conference proceedings › Contributed › Peer-reviewed

Abstract

Systems that use accelerators contribute 83% of the TOP500's performance share. While the overwhelming majority of accelerators in TOP500 systems are Graphics Processing Units (GPUs), other accelerators that excel in specialized applications are also present. The NEC SX-Aurora Tsubasa is the most common non-GPU accelerator in the November 2024 TOP500 systems and has proven to be efficient for certain codes, such as the weather and climate model ICON. However, applications have to be carefully optimized to achieve this efficiency. Performance analysis tools, which support developers in understanding the runtime behavior of their programs, can help in that regard. With profiling tools, programmers can see which functions contribute to the overall runtime and which of them cause large numbers of cache misses. However, the basic principle of profiling, the summarization of metrics over all invocations of a function, can hide differences between subsequent invocations of the same function. For example, because of its implementation, a function might exhibit a large number of cache misses, and thus an increased runtime, only on every fourth call. Hence, tools that allow developers to trace their applications by recording non-aggregated events in the time domain can provide further insight into an application's performance properties. However, as of today, only profiling tools exist for the NEC SX-Aurora Tsubasa. In this paper, we use its performance measurement capabilities to extend the Score-P measurement infrastructure and the lo2s system monitoring tool, enabling developers to record the exact execution of programs using tracing. We further demonstrate how developers can find performance issues in the ICON weather and climate model that cannot be detected with profiling-based tools. We also describe the overhead and perturbation that our tools introduce.
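
The difference between profiling and tracing described in the abstract can be illustrated with a small sketch. It is not taken from the paper and uses neither Score-P nor lo2s: the function `simulated_cache_misses` and its miss counts are hypothetical values, chosen so that every fourth call behaves badly. The profile summarizes all invocations, while the trace keeps one record per invocation.

```python
from statistics import mean


def simulated_cache_misses(call_index: int) -> int:
    """Hypothetical miss count: every fourth call misses heavily."""
    return 1000 if call_index % 4 == 3 else 10


# Trace: one non-aggregated event per invocation, kept in call order.
trace = [(i, simulated_cache_misses(i)) for i in range(12)]

# Profile: the same measurements summarized over all invocations.
profile = {
    "calls": len(trace),
    "total_misses": sum(misses for _, misses in trace),
    "mean_misses": mean(misses for _, misses in trace),
}

# Only the trace can tell us which invocations are responsible.
outliers = [i for i, misses in trace if misses > 2 * profile["mean_misses"]]

print(profile)
print("calls with unusually many misses:", outliers)
```

In this sketch the profile only reports an elevated average, whereas the per-call records show that the misses are concentrated in a periodic subset of the calls.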

Details

Original language: English
Title: ICPE Companion 2025 - Companion of the 16th ACM/SPEC International Conference on Performance Engineering
Publisher: Association for Computing Machinery, Inc
Pages: 215-222
Number of pages: 8
ISBN (electronic): 979-8-4007-1130-5
Publication status: Published - 5 May 2025
Peer-review status: Yes

Conference

Title: 16th ACM/SPEC International Conference on Performance Engineering
Short title: ICPE Companion 2025
Event number: 16
Duration: 5 - 9 May 2025
Location: York University
City: Toronto
Country: Canada

Keywords

  • high performance computing, measurement, performance engineering, vector computing