Scalable Tools for Non-Intrusive Performance Debugging of Parallel Linux Workloads

Publication: Contribution to conferences › Paper › Contributed › Peer-reviewed

Abstract

A variety of tools exist to measure the performance of Linux systems and the applications running on them. However, the resulting performance data is often presented in plain text or through only a very basic user interface. For large systems with many cores and concurrent threads, it is increasingly difficult to present the data clearly for analysis. Moreover, certain performance analysis and debugging tasks require a high-resolution, timeline-based approach, which again entails data visualization challenges. Tools in the area of High Performance Computing (HPC) have long been able to scale to hundreds or thousands of parallel threads and help find performance anomalies. We therefore present a solution to gather performance data using Linux performance monitoring interfaces. A combination of sampling and careful instrumentation allows us to obtain detailed performance traces with manageable overhead. We then convert the resulting output to the Open Trace Format (OTF) to bridge the gap between the recording infrastructure and HPC analysis tools. We explore ways to visualize the data using the graphical tool Vampir. The combination of established Linux and HPC tools allows us to create an interface for easy navigation through time-ordered performance data grouped by thread or CPU, and to help users find opportunities for performance optimization.
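
The grouping step mentioned at the end of the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the Sample record layout, the group_timelines helper, and the sample values are all hypothetical, and neither the Linux recording interfaces nor OTF output is touched. It only shows how time-ordered samples might be arranged into one timeline per thread or per CPU before visualization.

```python
from collections import defaultdict
from typing import NamedTuple


class Sample(NamedTuple):
    """One hypothetical performance sample (not the paper's record layout)."""
    time_ns: int   # timestamp in nanoseconds
    cpu: int       # CPU the sample was taken on
    tid: int       # ID of the thread that was running
    symbol: str    # function name resolved for the sample


def group_timelines(samples, key):
    """Group samples by `key` ("cpu" or "tid") and sort each group by time."""
    if key not in ("cpu", "tid"):
        raise ValueError("key must be 'cpu' or 'tid'")
    timelines = defaultdict(list)
    for sample in samples:
        timelines[getattr(sample, key)].append(sample)
    for events in timelines.values():
        events.sort(key=lambda s: s.time_ns)
    return dict(timelines)


# Made-up samples, deliberately out of time order.
SAMPLES = [
    Sample(300, 1, 42, "memcpy"),
    Sample(100, 0, 42, "main"),
    Sample(200, 0, 43, "read"),
    Sample(400, 1, 43, "write"),
]

by_thread = group_timelines(SAMPLES, "tid")  # one timeline per thread
by_cpu = group_timelines(SAMPLES, "cpu")     # one timeline per CPU
```

Both views are built from the same records, so the grouping key can be switched without recording the workload again.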

Details

Original language: English
Pages: 63-75
Number of pages: 13
Publication status: Published - 2014
Peer-review status: Yes

Conference

Title: Ottawa Linux Symposium
Short title: OLS
Event number:
Duration: 14 July 2014 - 19 February 2022
Degree of recognition: International event
Location:
City: Ottawa
Country: Canada

External IDs

ORCID /0000-0002-8491-770X/work/141543270
ORCID /0009-0003-0666-4166/work/151475563
ORCID /0000-0002-5437-3887/work/154740493

Keywords

  • tools, linux, performance