An adaptive replication scheme for elastic data stream processing systems
Publikation: Beitrag zu Konferenzen › Paper › Beigetragen › Begutachtung
Beitragende
Abstract
A major challenge for cloud-based systems is to be fault tolerant to cope with an increasing probability of faults in cloud environments. This is especially true for in-memory computing solutions like data stream processing systems, where a single host failure might result in an unrecoverable information loss.
In state of the art data streaming systems either active replication or upstream backup are applied to ensure fault tolerance, which have a high resource overhead or a high recovery time respectively. This paper combines these two fault tolerance mechanisms in one system to minimize the number of violations of a user-defined recovery time threshold and to reduce the overall resource consumption compared to active replication. The system switches for individual operators between both replication techniques dynamically based on the current workload characteristics. Our approach is implemented as an extension of an elastic data stream processing engine, which is able to reduce the number of used hosts due to the smaller replication overhead. Based on a real-world evaluation we show that our system is able to reduce the resource usage by up to 19% compared to an active replication scheme.
In state of the art data streaming systems either active replication or upstream backup are applied to ensure fault tolerance, which have a high resource overhead or a high recovery time respectively. This paper combines these two fault tolerance mechanisms in one system to minimize the number of violations of a user-defined recovery time threshold and to reduce the overall resource consumption compared to active replication. The system switches for individual operators between both replication techniques dynamically based on the current workload characteristics. Our approach is implemented as an extension of an elastic data stream processing engine, which is able to reduce the number of used hosts due to the smaller replication overhead. Based on a real-world evaluation we show that our system is able to reduce the resource usage by up to 19% compared to an active replication scheme.
Details
Originalsprache | Englisch |
---|---|
Seiten | 150-161 |
Seitenumfang | 12 |
Publikationsstatus | Veröffentlicht - Juni 2015 |
Peer-Review-Status | Ja |
Externe IDs
Scopus | 84960969423 |
---|---|
ORCID | /0000-0003-0768-6351/work/141545305 |
Schlagworte
Forschungsprofillinien der TU Dresden
DFG-Fachsystematik nach Fachkollegium
Schlagwörter
- Distributed Data Stream Processiong, Fault Tolerance, Upstream Backup, Active Replication