Fault Tolerance of Sorting Algorithms Under Varying Input Characteristics

Publikation: Beitrag in Buch/Konferenzbericht/Sammelband/GutachtenBeitrag in KonferenzbandBeigetragenBegutachtung

Beitragende

Abstract

Bit flips in main memory or in the CPU can lead to silent data corruptions (SDCs) and therefore pose a threat to all modern computer systems. The selection of particularly fault-tolerant algorithms is known to have a strong impact on a system's resilience, but involves time-consuming fault-injection (FI) campaigns. This problem is exacerbated by the fact that the optimal selection also depends on input characteristics like data length or structure. In this paper, we exemplarily analyze different sorting algorithms with respect to their fault tolerance. Confirming and complementing other studies, we show that input-data length and ordering are relevant for algorithm selection. We find that existing hybrid algorithms, like Timsort or Introsort, also make favorable choices for fault tolerance and dominate the field. In a further consideration, we correlate the SDC probabilities with simpler metrics, such as the number of retired instructions. We find that these metrics can also be used to select the most resilient sorting algorithm, as the resulting algorithm ranking is strongly correlated to one obtained from an expensive FI campaign.

Details

OriginalspracheEnglisch
Titel2025 20th European Dependable Computing Conference (EDCC)
Herausgeber (Verlag)Institute of Electrical and Electronics Engineers (IEEE)
Seiten63-71
Seitenumfang9
ISBN (elektronisch)979-8-3315-1280-4
ISBN (Print)979-8-3315-1281-1
PublikationsstatusVeröffentlicht - 10 Apr. 2025
Peer-Review-StatusJa

Publikationsreihe

ReiheEuropean Dependable Computing Conference (EDCC)
ISSN2641-810X

Konferenz

Titel20th European Dependable Computing Conference
KurztitelEDCC 2025
Veranstaltungsnummer20
Dauer8 - 11 April 2025
Webseite
OrtUniversity of Lisboa (FCUL)
StadtLisbon
LandPortugal

Externe IDs

Scopus 105015528051
ORCID /0000-0002-1427-9343/work/193705399

Schlagworte

Schlagwörter

  • Europe, Fault tolerance, Fault tolerant systems, Lead, Measurement, Redundancy, Registers, Sorting, fault tolerance, bit flip, soft error, sorting algorithm, input characteristics, correlation, silent data corruption, fault injection