An Interface for Integrated MPI Correctness Checking

Research output: Contribution to book/conference proceedings/anthology/report > Conference contribution > Contributed > peer-review

Abstract

Usage errors of the widely accepted Message-Passing Interface (MPI) are common and complicate the development process of parallel applications considerably. Some of these errors are hard to track, especially when they only occur in certain application runs or on certain platforms. Runtime correctness checking tools for MPI simplify the detection of these errors. However, they usually need the MPI profiling interface for their analysis. This paper addresses two issues related to correctness tools. First, because these tools use the MPI profiling interface exclusively, they cannot be used in conjunction with other MPI tools that are also based on the profiling interface. Second, correctness checking tools usually lack the ability to provide a detailed history of the events leading to an error, whereas such a history is provided naturally by tracing frameworks. We introduce the Universal MPI Correctness Interface (UniMCI) to overcome the first problem. This interface provides functions that invoke correctness checking and return detected errors in a manner that is independent of the correctness checker in use. Furthermore, we demonstrate the applicability of UniMCI with an implementation that uses the Marmot correctness checker and an exemplary integration of the interface into the VampirTrace performance analysis framework. As a result, we can provide a history for detected correctness events, which gives detailed information for debugging. Finally, we present a study using the SPEC MPI2007 benchmark to demonstrate the feasibility and applicability of our approach.

Details

Original language: English
Title of host publication: PARALLEL COMPUTING: FROM MULTICORES AND GPU'S TO PETASCALE
Editors: B Chapman, F Desprez, GR Joubert, A Lichnewsky, F Peters, T Priol
Publisher: IOS Press, Amsterdam [u. a.]
Pages: 693-700
Number of pages: 8
ISBN (print): 978-1-60750-529-7
Publication status: Published - 2010
Peer-reviewed: Yes

Publication series

Series: Advances in Parallel Computing
Volume: 19
ISSN: 0927-5452

Conference

Title: International Conference on Parallel Computing 2009
Abbreviated title: ParCo 2009
Duration: 1 - 4 September 2009
Location: École Normale Supérieure de Lyon
City: Lyon
Country: France

External IDs

Scopus: 84906502143
ORCID: /0000-0001-6520-4563/work/142236628

Keywords

  • Correctness checking, Message-Passing Interface, Tools, Marmot, Vampir