Learned Selection Strategy for Lightweight Integer Compression Algorithms

Lucas Woltmann; Patrick Damme; Claudio Hartmann; Dirk Habich; Wolfgang Lehner

doi:10.48786/EDBT.2023.47

Research output: Contribution to book/Conference proceedings/Anthology/Report › Conference contribution › Contributed › peer-review

Contributors

Lucas Woltmann, Chair of Databases (Author)
Patrick Damme (Author)
Claudio Hartmann, Chair of Databases (Author)
Dirk Habich, Chair of Databases (Author)
Wolfgang Lehner, Chair of Databases (Author)

Abstract

Data compression has recently experienced a revival in the domain of in-memory column stores. In this field, a large corpus of lightweight integer compression algorithms plays a dominant role, since all columns are typically encoded as sequences of integer values. Unfortunately, there is no single best integer compression algorithm; the best choice depends on data and hardware properties. For this reason, selecting the best-fitting integer compression algorithm becomes more important and is an interesting tuning knob for optimization. However, traditional selection strategies require profound knowledge of the (de-)compression algorithms for decision-making, which limits their broad applicability. To counteract this, we propose a novel learned selection strategy that treats integer compression algorithms as independent black boxes. This black-box approach ensures broad applicability and requires machine learning-based methods to model the knowledge needed for decision-making. Most importantly, we show that a local approach, where every algorithm is modeled individually, plays a crucial role. Moreover, our learned selection strategy generalizes by being independent of user data. Finally, we evaluate our approach and compare it against existing selection strategies to show its benefits.

Details

Original language	English
Title of host publication	26th International Conference on Extending Database Technology (EDBT 2023)
Publisher	OpenProceedings.org
Pages	552-564
Number of pages	13
Volume	26
Edition	3
Publication status	Published - 28 Mar 2023
Peer-reviewed	Yes

External IDs

dblp	conf/edbt/WoltmannDHHL23
Scopus	85165055743
ORCID	/0000-0001-8107-2775/work/142253565
