Fail-Awareness: An Approach to Construct Fail-Safe Applications

Publication: Contribution to journal, Research article, Contributed, Peer-reviewed

Contributors

  • Christof Fetzer, European University at St Petersburg (Author)
  • Flaviu Cristian, University of California at San Diego (Author)

Abstract

We present a framework for building fail-safe hard real-time applications in timed asynchronous distributed systems subject to communication partitions and to performance, omission, and crash failures. Most distributed systems built from commercial off-the-shelf (COTS) processor and communication services are subject to such partitions because their COTS components do not provide hard real-time guarantees. Custom-designed systems can also be subject to partitions because of unmaskable link or router failures. The basic assumption behind our approach is that each processor has a local hardware clock that proceeds within a linear envelope of real time. This assumption allows one to compute an upper bound on the actual delay incurred by a particular processing sequence or message transmission. Services and applications can use these computed bounds to detect when excessive delays prevent them from guaranteeing all their standard properties. An application can thus be fail-aware, that is, it can detect when it cannot guarantee all its safety properties and, in particular, when to switch to a fail-safe mode.
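
The delay bound described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the article: the class `DriftBoundedClock`, the function `select_mode`, and the drift parameter `rho` are hypothetical names, and the sketch assumes the "linear envelope" means the local clock rate stays within [1 - rho, 1 + rho] of real time. Under that assumption, a duration measured on the local clock yields an upper bound on the real-time duration, which an application can compare against a deadline to decide when to enter fail-safe mode.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DriftBoundedClock:
    """Hypothetical model of a hardware clock with bounded drift.

    Assumption (not from the article): over any real-time interval of
    length d, the clock advances by between (1 - rho) * d and (1 + rho) * d.
    """

    rho: float  # assumed maximum drift rate, e.g. 1e-4

    def __post_init__(self) -> None:
        if not 0.0 <= self.rho < 1.0:
            raise ValueError("rho must satisfy 0 <= rho < 1")

    def real_time_upper_bound(self, measured: float) -> float:
        # The clock advances at least (1 - rho) per unit of real time,
        # so the real duration is at most measured / (1 - rho).
        return measured / (1.0 - self.rho)

    def real_time_lower_bound(self, measured: float) -> float:
        # The clock advances at most (1 + rho) per unit of real time,
        # so the real duration is at least measured / (1 + rho).
        return measured / (1.0 + self.rho)


def select_mode(clock: DriftBoundedClock, start_reading: float,
                end_reading: float, deadline: float) -> str:
    """Return "normal" if the deadline was provably met, else "fail-safe".

    Both readings must come from the same local clock. The decision uses
    the worst-case real-time duration, so clock drift alone can never hide
    a missed deadline.
    """
    measured = end_reading - start_reading
    worst_case = clock.real_time_upper_bound(measured)
    return "normal" if worst_case <= deadline else "fail-safe"
```

With `rho = 0.01` and a deadline of 1.0, a measured duration of 0.995 still selects the fail-safe mode, because the real duration may have been as long as 0.995 / 0.99, which exceeds the deadline. The same bound applies to a round trip timed on one clock: a one-way message delay cannot exceed the upper bound on the round-trip time.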

Details

Original language: English
Pages (from - to): 203–238
Number of pages: 36
Journal: Real-time systems: the international journal of time-critical computing systems
Volume: 24
Publication status: Published - 2003
Peer-review status: Yes
Externally published: Yes

Subject headings

Keywords

  • fail-safe systems, fail-awareness, timed asynchronous systems, synchronous systems