Rejuvenation and Failure Detection in Partitionable Systems

Research output: Contribution to conferences > Paper > Contributed > peer-review

Contributors

Abstract

Certain gateways (e.g., some cable or DSL modems) are known to have low reliability and low availability. Most failures of these devices can, however, be "fixed" by rejuvenating the device after a failure has been detected. Such a detection-based rejuvenation strategy increases the availability of these gateways. In the considered scenario, rejuvenation is non-trivial because a failed gateway is partitioned away from the network. In particular, network operators who want to rejuvenate these gateways are in a different network partition and therefore cannot initiate a remote rejuvenation. In this paper we propose a failure-detection-based rejuvenation service and a remote detection service. The rejuvenation service detects and fixes "soft" failures automatically (in one partition), and the detection service detects (in another partition) all rejuvenations exactly once, within a bounded amount of time, even when the gateway is rejuvenated several times in succession. The detection service also allows the detection of "hard" failures and the filtering of notifications of soft failures.
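
The exactly-once property claimed in the abstract can be illustrated with a small sketch. The abstract does not say how the paper achieves it, so the mechanism below is an assumption chosen for illustration: a rejuvenation counter that survives restarts, which the remote side compares against the last value it has seen. The names `Gateway`, `RemoteDetector`, `local_watchdog_step`, and `observe` are hypothetical, and the sketch does not model the bounded detection time or hard failures.

```python
from dataclasses import dataclass, field


@dataclass
class Gateway:
    """Hypothetical gateway; the counter is assumed to survive rejuvenation."""
    rejuvenation_count: int = 0   # assumption: kept in persistent storage
    healthy: bool = True

    def local_watchdog_step(self) -> None:
        # Local partition: detect a soft failure and rejuvenate automatically.
        if not self.healthy:
            self.rejuvenation_count += 1
            self.healthy = True


@dataclass
class RemoteDetector:
    """Hypothetical operator-side detector in the other partition."""
    last_seen: int = 0
    notifications: list = field(default_factory=list)

    def observe(self, reported_count: int) -> None:
        # Emit one notification per rejuvenation not yet reported, so that
        # consecutive rejuvenations are each counted exactly once.
        for n in range(self.last_seen + 1, reported_count + 1):
            self.notifications.append(n)
        # Stale or duplicate reports never move the marker backwards.
        self.last_seen = max(self.last_seen, reported_count)


def simulate() -> list:
    gw, det = Gateway(), RemoteDetector()
    det.observe(gw.rejuvenation_count)   # nothing has happened yet
    for _ in range(2):                   # two consecutive soft failures
        gw.healthy = False
        gw.local_watchdog_step()
    det.observe(gw.rejuvenation_count)   # connectivity is back
    det.observe(gw.rejuvenation_count)   # repeated report is ignored
    return det.notifications
```

In this sketch the detector hears nothing while the gateway is partitioned away, yet it still reports both rejuvenations once each when the counter becomes visible again, and a repeated report adds nothing.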

Details

Original language: English
Pages: 154-161
Number of pages: 8
Publication status: Published - 2001
Peer-reviewed: Yes

Conference

Title: 2001 Pacific Rim International Symposium on Dependable Computing
Abbreviated title: PRDC '01
Conference number
Duration: 17 December 2001
Degree of recognition: International event
Location
City: Seoul
Country: Korea, Republic of

External IDs

Scopus 60249089569

Keywords

  • failure detection, failure detection based rejuvenation, distributed systems, fault-tolerant systems, home networking, remote system management, local area networks, protocols, IP networks, modems, DSL, availability, fault detection, software maintenance costs, internetworking, gateways, reliability, remote detection, network manager