Speculation in Parallel and Distributed Event Processing Systems

Publication: Thesis / Dissertation

Contributors

  • Andrey Brito - (Author)

Abstract

Event stream processing (ESP) applications enable the real-time processing of continuous flows
of data. Algorithmic trading, network monitoring, and processing data from sensor networks are
good examples of applications that traditionally rely upon ESP systems. In addition, technological
advances are resulting in an increasing number of devices that are network enabled, producing
information that can be automatically collected and processed. This increasing availability of
on-line data motivates the development of new and more sophisticated applications that require
low-latency processing of large volumes of data.
ESP applications are composed of an acyclic graph of operators that is traversed by the data.
Inside each operator, the events can be transformed, aggregated, enriched, or filtered out. Some
of these operations depend only on the current input events; such operations are called stateless.
Other operations, however, depend not only on the current event, but also on a state built during
the processing of previous events. Such operations are, therefore, named stateful.
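
The distinction between stateless and stateful operations can be illustrated with a minimal sketch. The `Event` type, the threshold filter, and the `RunningAverage` operator are hypothetical names chosen for illustration; they are not taken from the dissertation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Event:
    timestamp: int
    key: str
    value: float


def stateless_filter(event: Event, threshold: float = 10.0) -> Optional[Event]:
    """Stateless: the result depends only on the current event."""
    return event if event.value >= threshold else None


@dataclass
class RunningAverage:
    """Stateful: the result depends on all events processed so far."""
    count: int = 0
    total: float = 0.0

    def process(self, event: Event) -> Event:
        # Update the accumulated state, then emit an event derived from it.
        self.count += 1
        self.total += event.value
        return Event(event.timestamp, event.key, self.total / self.count)


events = [Event(1, "s1", 8.0), Event(2, "s1", 12.0), Event(3, "s1", 16.0)]

# The filter can run on any replica and in any order without changing results.
filtered = [e for e in events if stateless_filter(e) is not None]

# The average carries state from one event to the next.
avg = RunningAverage()
averages = [avg.process(e).value for e in filtered]
```

Because the filter keeps no state, it could be replicated freely; the running average, in contrast, would lose its accumulated `count` and `total` if the operator instance failed.
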
As the number of ESP applications grows, the requirements placed on them become increasingly
demanding and are often difficult to satisfy. In this dissertation, we address two challenges created by the use of
stateful operations in an ESP application: (i) stateful operators can be bottlenecks because they
are sensitive to the order of events and cannot be trivially parallelized by replication; and (ii) if
failures are to be tolerated, the accumulated state of a stateful operator needs to be saved, and saving
this state traditionally imposes considerable performance costs.
Our approach is to evaluate the use of speculation to address these two issues. For handling
ordering and parallelization issues in a stateful operator, we propose a speculative approach
that both reduces latency when the operator must wait for the correct ordering of the events and
improves throughput when the operation at hand is parallelizable. In addition, our approach
does not require that users understand concurrent programming or consider
out-of-order execution when writing the operations.
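
As a rough illustration of speculation applied to event ordering, the sketch below processes events as they arrive instead of waiting for the correct order, and rolls back and replays when a late event invalidates earlier results. The abstract does not describe the dissertation's actual mechanism, so everything here is an assumption: the `SpeculativeOperator` class, its replay-based rollback, and the order-sensitive `step` function are hypothetical.

```python
import bisect
from typing import Callable, List


class SpeculativeOperator:
    """Applies an order-sensitive step function optimistically (hypothetical sketch)."""

    def __init__(self, step: Callable[[float, float], float], initial: float) -> None:
        self._step = step
        self._initial = initial
        self._timestamps: List[int] = []   # kept sorted
        self._values: List[float] = []     # values aligned with _timestamps
        self._states: List[float] = []     # state after each logged event
        self.rollbacks = 0

    def process(self, timestamp: int, value: float) -> float:
        """Process an event immediately and return the current speculative result."""
        pos = bisect.bisect_right(self._timestamps, timestamp)
        if pos < len(self._timestamps):
            # A late event arrived: later speculative states are now invalid.
            self.rollbacks += 1
        self._timestamps.insert(pos, timestamp)
        self._values.insert(pos, value)

        # Discard states from the insertion point onward and replay in order.
        del self._states[pos:]
        state = self._states[-1] if self._states else self._initial
        for v in self._values[pos:]:
            state = self._step(state, v)
            self._states.append(state)
        return self._states[-1]


# An order-sensitive operation: a simple decaying sum.
op = SpeculativeOperator(lambda state, v: 0.5 * state + v, initial=0.0)

# The event with timestamp 2 arrives after the event with timestamp 3.
results = [op.process(1, 4.0), op.process(3, 8.0), op.process(2, 6.0)]
```

The user-supplied `step` function is written as if events always arrive in order; the reordering and replay are handled by the operator wrapper. A real system would also need a point at which results become final, which this sketch omits.
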
For fault-tolerant applications, traditional approaches have imposed prohibitive performance
costs due to pessimistic schemes. We extend such approaches, using speculation to mask the cost
of fault tolerance.

Details

Original language: English
Supervisor / Advisor:
Place of publication: Dresden, Germany
Publication status: Published - 2010
Externally published: Yes

Keywords

Research priority areas of TU Dresden