An Interface for Integrated MPI Correctness Checking

Publication: Contribution to book/conference proceedings/anthology/report, conference paper, contributed, peer-reviewed

Abstract

Usage errors of the widely accepted Message-Passing Interface (MPI) are common and complicate the development process of parallel applications considerably. Some of these errors are hard to track, especially when they only occur in certain application runs or on certain platforms. Runtime correctness checking tools for MPI simplify the detection of these errors. However, they usually need the MPI profiling interface for their analysis. This paper addresses two issues related to correctness tools: First, due to the exclusive usage of the MPI profiling interface, it is not possible to use such tools in conjunction with other MPI tools, which are also based on the profiling interface. Second, correctness checking tools usually lack the ability to provide a detailed history of the events leading to an error, whereas such a history is provided naturally by tracing frameworks. We introduce the Universal MPI Correctness Interface (UniMCI) to overcome the first problem. This interface provides functions that invoke correctness checking and return detected errors in a manner that is independent of the correctness checker in use. Furthermore, we demonstrate the applicability of UniMCI with an implementation that uses the Marmot correctness checker and an exemplary integration of the interface into the VampirTrace performance analysis framework. As a result, we can provide a history for detected correctness events, which provides detailed information for debugging. Finally, we present a study using the SPEC MPI2007 benchmark to demonstrate the feasibility and applicability of our approach.
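
The layering described in the abstract can be sketched roughly as follows. This is an illustration only, not the UniMCI API: every name here (`CheckResult`, `CorrectnessChecker`, `RankRangeChecker`, `TracingWrapper`, `send`) is hypothetical, and MPI itself is replaced by a stub so the example runs without an MPI installation. The idea shown is that a single tool owns the interception layer, asks an exchangeable checker for a verdict through a checker-independent interface, and records any detected error in the same event history as the call that caused it.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Protocol


@dataclass
class CheckResult:
    """Checker-independent description of a detected usage error (hypothetical)."""
    call: str
    message: str


class CorrectnessChecker(Protocol):
    """Hypothetical checker-independent interface; any checker can implement it."""

    def check_send(self, dest: int, comm_size: int) -> Optional[CheckResult]: ...


class RankRangeChecker:
    """Toy checker standing in for a real tool: flags out-of-range destination ranks."""

    def check_send(self, dest: int, comm_size: int) -> Optional[CheckResult]:
        if not 0 <= dest < comm_size:
            return CheckResult(
                "send",
                f"invalid destination rank {dest} (communicator size {comm_size})",
            )
        return None


@dataclass
class TracingWrapper:
    """Toy tracing tool: the only layer that intercepts calls (assumption)."""
    checker: CorrectnessChecker
    comm_size: int
    trace: List[str] = field(default_factory=list)

    def send(self, dest: int, payload: bytes) -> None:
        self.trace.append(f"enter send dest={dest} bytes={len(payload)}")
        # Ask the checker through the independent interface instead of
        # letting the checker intercept the call itself.
        error = self.checker.check_send(dest, self.comm_size)
        if error is not None:
            # The error lands in the trace next to the events that led to it.
            self.trace.append(f"error {error.call}: {error.message}")
        # Stub: a real wrapper would forward the call to the MPI library here.
        self.trace.append("leave send")


tool = TracingWrapper(checker=RankRangeChecker(), comm_size=4)
tool.send(1, b"ok")   # valid destination rank
tool.send(7, b"bad")  # out of range, reported by the checker
for event in tool.trace:
    print(event)
```

Because the wrapper depends only on the `CorrectnessChecker` protocol, a different checker could be swapped in without changing the tracing code, which is the kind of independence the abstract attributes to the interface.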

Details

Original language: English
Title: Parallel Computing: From Multicores and GPU's to Petascale
Editors: B Chapman, F Desprez, GR Joubert, A Lichnewsky, F Peters, T Priol
Publisher: IOS Press, Amsterdam [et al.]
Pages: 693-700
Number of pages: 8
ISBN (Print): 978-1-60750-529-7
Publication status: Published - 2010
Peer review status: Yes

Publication series

Series: Advances in Parallel Computing
Volume: 19
ISSN: 0927-5452

Conference

Title: International Conference on Parallel Computing 2009
Short title: ParCo 2009
Duration: 1 - 4 September 2009
Location: École Normale Supérieure de Lyon
City: Lyon
Country: France

External IDs

Scopus 84906502143
ORCID /0000-0001-6520-4563/work/142236628

Keywords

  • Correctness checking, Message-Passing Interface, Tools, Marmot, Vampir