Rapid fault-space exploration by evolutionary pruning

Publication: Contribution to book/conference proceedings/anthology/report › Contribution to conference proceedings › Contributed › Peer-reviewed

Contributors

  • Horst Schirmeier, Chair of Operating Systems, Technische Universität (TU) Dortmund (Author)
  • Christoph Borchert, Technische Universität (TU) Dortmund (Author)
  • Olaf Spinczyk, Technische Universität (TU) Dortmund (Author)

Abstract

Recent studies suggest that future microprocessors need low-cost fault-tolerance solutions for reliable operation. Several competing software-implemented error-detection methods have been shown to increase the overall resiliency when applied to critical spots in the system. Fault injection (FI) is a common approach to assess a system's vulnerability to hardware faults. In an FI campaign comprising multiple runs of an application benchmark, each run simulates the impact of a fault in a specific hardware location at a specific point in time. Unfortunately, exhaustive FI campaigns covering all possible fault locations are infeasible even for small target applications. Commonly used sampling techniques, while sufficient to measure overall resilience improvements, lack the level of detail and accuracy needed for the identification of critical spots, such as important variables or program phases. Many faults are sampled out, leaving the developer without any information on the application parts they would have targeted. We present a methodology and tool implementation that reduces experimentation effort in an application-specific way, lets the user freely trade the number of FI runs for result accuracy, and provides information on all possible fault locations. Once a set of Pareto-optimal heuristics has been trained, the experimenting user can specify a maximum number of FI experiments. A detailed evaluation with a set of benchmarks running on the eCos embedded OS, including MiBench's automotive benchmark category, demonstrates the applicability and effectiveness of our approach: for example, when the user chooses to run only 1.5% of all FI experiments, the average result accuracy is still 99.84%.

Details

Original language: English
Title: Computer Safety, Reliability, and Security - 33rd International Conference, SAFECOMP 2014, Proceedings
Publisher: Springer-Verlag
Pages: 17-32
Number of pages: 16
ISBN (Print): 9783319105055
Publication status: Published - 2014
Peer-reviewed: Yes

Publication series

Series: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 8666 LNCS
ISSN: 0302-9743

Conference

Title: 33rd International Conference on Computer Safety, Reliability, and Security, SAFECOMP 2014
Duration: 10-12 September 2014
City: Florence
Country: Italy

External IDs

ORCID /0000-0002-1427-9343/work/167216817

Keywords