Scalable Tools for Non-Intrusive Performance Debugging of Parallel Linux Workloads

Research output: Contribution to conferences > Paper > Contributed > peer-review

Abstract

A variety of tools exists to measure the performance of Linux systems and the applications running on them. However, the resulting performance data is often presented in plain text format or with only a very basic user interface. For large systems with many cores and concurrent threads, it is increasingly difficult to present the data in a clear way for analysis. Moreover, certain performance analysis and debugging tasks require a high-resolution, timeline-based approach, which again entails data visualization challenges. Tools in the area of High Performance Computing (HPC) have long been able to scale to hundreds or thousands of parallel threads and to help find performance anomalies. We therefore present a solution to gather performance data using Linux performance monitoring interfaces. A combination of sampling and careful instrumentation allows us to obtain detailed performance traces with manageable overhead. We then convert the resulting output to the Open Trace Format (OTF) to bridge the gap between the recording infrastructure and HPC analysis tools. We explore ways to visualize the data using the graphical tool Vampir. The combination of established Linux and HPC tools allows us to create an interface for easy navigation through time-ordered performance data grouped by thread or CPU and to help users find opportunities for performance optimizations.
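
As a rough illustration of the "time-ordered performance data grouped by thread or CPU" mentioned in the abstract, the following Python sketch sorts a few made-up sample records by timestamp and groups them into per-CPU or per-thread timelines. The record fields (`timestamp_ns`, `cpu`, `tid`, `event`) and the helper `group_timeline` are assumptions chosen for illustration; they are not the paper's data model, the output format of any Linux tool, or the OTF layout.

```python
from collections import defaultdict
from typing import NamedTuple


class Sample(NamedTuple):
    """Hypothetical sample record; the field names are illustrative only."""
    timestamp_ns: int
    cpu: int
    tid: int
    event: str


def group_timeline(samples, key):
    """Return {key value: [samples in time order]} for key 'cpu' or 'tid'."""
    if key not in ("cpu", "tid"):
        raise ValueError("key must be 'cpu' or 'tid'")
    timelines = defaultdict(list)
    # Sort once by timestamp so every per-key list is already time-ordered.
    for sample in sorted(samples, key=lambda rec: rec.timestamp_ns):
        timelines[getattr(sample, key)].append(sample)
    return dict(timelines)


# Made-up records, deliberately out of time order.
SAMPLES = [
    Sample(300, 1, 42, "cycles"),
    Sample(100, 0, 42, "cycles"),
    Sample(200, 0, 43, "sched_switch"),
]

by_cpu = group_timeline(SAMPLES, "cpu")
by_tid = group_timeline(SAMPLES, "tid")
```

Grouping by `cpu` gives one timeline per core, while grouping by `tid` follows a thread as it moves between cores; a timeline viewer typically offers both views of the same trace.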

Details

Original language: English
Pages: 63-75
Number of pages: 13
Publication status: Published - 2014
Peer-reviewed: Yes

Symposium

Title: Ottawa Linux Symposium
Abbreviated title: OLS
Duration: 14 July 2014 - 19 February 2022
Degree of recognition: International event
Location
City: Ottawa
Country: Canada

External IDs

ORCID /0000-0002-8491-770X/work/141543270
ORCID /0009-0003-0666-4166/work/151475563
ORCID /0000-0002-5437-3887/work/154740493

Keywords

  • tools, linux, performance