Automatically Tolerating Arbitrary Faults in Non-malicious Settings

Publikation: Beitrag zu KonferenzenPaperBeigetragenBegutachtung

Beitragende

Abstract

Arbitrary faults such as bit flips have been often observed in commodity-hardware data centers and have disrupted large services. Benign faults, such as crashes and message omissions, are nevertheless the standard assumption in practical fault-tolerant distributed systems. Algorithms tolerant to arbitrary faults are harder to understand and more expensive to deploy (requiring more machines). In this work, we introduce a non-malicious arbitrary fault model including transient and permanent arbitrary faults, such as bit flips and hardware-design errors, but no malicious faults, typically caused by security breaches. We then present a compiler-based framework that allows benign fault-tolerant algorithms to automatically tolerate arbitrary faults in non-malicious settings. Finally, we experimentally evaluate two fundamental algorithms: Paxos and leader election. At expense of CPU cycles, transformed algorithms use the same number of processes as their benign fault-tolerant counterparts, and have virtually no network overhead, while reducing the probability of failing arbitrarily by two orders of magnitude.

Details

OriginalspracheEnglisch
Seiten114-123
Seitenumfang10
PublikationsstatusVeröffentlicht - 2013
Peer-Review-StatusJa

Konferenz

TitelSixth Latin-American Symposium on Dependable Computing (LADC), IEEE Computer Society, 2013
KurztitelLADC 2013
Veranstaltungsnummer
Dauer1 Mai 2013
BekanntheitsgradInternationale Veranstaltung
Ort
StadtRio de Janeiro
LandBrasilien

Externe IDs

Scopus 84881137463

Schlagworte

Forschungsprofillinien der TU Dresden

DFG-Fachsystematik nach Fachkollegium

Schlagwörter

  • Byzantine faults, fault tolerance, hardware errors, algorithm transformation, arbitrary faults, Computer crashes, Encoding, Fault Tolerance, Fault tolerant systems, Hardware, distributed algorithms, Transforoms, Byzantine faults, hardware errors