Online Parameter Optimization for Elastic Data Stream Processing

Research output: Contribution to book/conference proceedings/anthology/report > Conference contribution > Contributed > peer-review

Abstract

Elastic scaling allows data stream processing systems to dynamically scale in and out in reaction to workload changes. As a consequence, unexpected load peaks can be handled and the extent of overprovisioning can be reduced. However, the strategies used for elastic scaling of such systems need to be tuned manually by the user. This is an error-prone and cumbersome task, because it requires detailed knowledge of the underlying system and workload characteristics. In addition, the resulting quality of service for a specific scaling strategy is unknown a priori and can be measured only at runtime. In this paper, we present an elastically scaling data stream processing prototype that allows monetary cost to be traded off against the offered quality of service. To that end, we use online parameter optimization, which minimizes the monetary cost for the user. Using our prototype, a user can specify the expected quality of service as an input to the optimization, which automatically detects significant changes in the workload pattern and adjusts the elastic scaling strategy based on the current workload characteristics. Our prototype reduces the costs for three real-world use cases by 19% compared to a naive parameter setting and by 10% compared to a manually tuned system. In contrast to state-of-the-art solutions, our system provides a stable and good trade-off between monetary cost and quality of service.
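
The trade-off described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes a simple threshold-based scaling strategy, a cost model of one unit per host per interval, and a quality-of-service measure equal to the share of overloaded intervals. The function names (simulate, optimize), the parameter grid, and capacity_per_host are all hypothetical. The search picks the cheapest threshold pair that keeps the violation ratio within a user-specified bound; workload change detection is omitted.

```python
from itertools import product


def simulate(workload, upper, lower, capacity_per_host=100.0):
    """Replay a workload trace under a threshold-based scaling strategy.

    Returns (cost, violation_ratio): cost is the number of host-intervals
    paid for, violation_ratio is the share of intervals in which the load
    exceeded the available capacity. Assumes a non-empty trace.
    """
    hosts, cost, violations = 1, 0, 0
    for load in workload:
        utilization = load / (hosts * capacity_per_host)
        if utilization > 1.0:
            violations += 1  # an overloaded interval counts as a QoS violation
        cost += hosts        # pay for every host in every interval
        # The scaling decision takes effect in the next interval.
        if utilization > upper:
            hosts += 1
        elif utilization < lower and hosts > 1:
            hosts -= 1
    return cost, violations / len(workload)


def optimize(workload, max_violation_ratio):
    """Grid-search the thresholds; return the cheapest feasible setting.

    Returns (cost, upper, lower), or None if no setting meets the bound.
    """
    best = None
    uppers = [0.6, 0.7, 0.8, 0.9]  # hypothetical scale-out thresholds
    lowers = [0.2, 0.3, 0.4, 0.5]  # hypothetical scale-in thresholds
    for upper, lower in product(uppers, lowers):
        cost, ratio = simulate(workload, upper, lower)
        if ratio <= max_violation_ratio and (best is None or cost < best[0]):
            best = (cost, upper, lower)
    return best
```

In an online setting, a search of this kind would be rerun on a recent window of the workload whenever a significant change in the workload pattern is detected, and the resulting thresholds would replace the current ones.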

Details

Original language: English
Title of host publication: SoCC '15: Proceedings of the Sixth ACM Symposium on Cloud Computing
Publisher: Association for Computing Machinery (ACM), New York
Pages: 276-287
Number of pages: 12
ISBN (print): 978-1-4503-3651-2
Publication status: Published - 2015
Peer-reviewed: Yes

Publication series

Series: MOD: International Conference on Management of Data (SoCC)

External IDs

Scopus: 84959036559

Keywords

  • Distributed applications, Distributed Data Stream Processing, Load Balancing, Elasticity, Parameter Optimization