Learned Selection Strategy for Lightweight Integer Compression Algorithms

Lucas Woltmann; Patrick Damme; Claudio Hartmann; Dirk Habich; Wolfgang Lehner

doi:10.48786/EDBT.2023.47

Publication: Contribution to book/conference proceedings/anthology/report › Contribution to conference proceedings › Contributed › Peer-reviewed

Contributors

Lucas Woltmann, Chair of Databases (Author)
Patrick Damme (Author)
Claudio Hartmann, Chair of Databases (Author)
Dirk Habich, Chair of Databases (Author)
Wolfgang Lehner, Chair of Databases (Author)

Abstract

Data compression has recently experienced a revival in the domain of in-memory column stores. In this field, a large corpus of lightweight integer compression algorithms plays a dominant role since all columns are typically encoded as sequences of integer values. Unfortunately, there is no single-best integer compression algorithm and the best algorithm depends on data and hardware properties. For this reason, selecting the best-fitting integer compression algorithm becomes more important and is an interesting tuning knob for optimization. However, traditional selection strategies require a profound knowledge of the (de-)compression algorithms for decision-making. This limits the broad applicability of the selection strategies. To counteract this, we propose a novel learned selection strategy by considering integer compression algorithms as independent black boxes. This black-box approach ensures broad applicability and requires machine learning-based methods to model the required knowledge for decision-making. Most importantly, we show that a local approach, where every algorithm is modeled individually, plays a crucial role. Moreover, our learned selection strategy is generalized by user-data-independence. Finally, we evaluate our approach and compare it against existing selection strategies to show the benefits of our learned selection strategy.
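
The idea of a local, black-box selection strategy can be illustrated with a minimal sketch. This is not the authors' implementation: the two toy size functions, the feature set, the nearest-neighbor model, and the synthetic data generator are all assumptions chosen for brevity. The sketch trains one model per algorithm on synthetic columns only (one possible reading of user-data-independence), observes each algorithm purely through its measured output size, and selects the algorithm with the smallest predicted size.

```python
import random

def bit_width(v):
    """Bits needed to represent a non-negative integer (at least 1)."""
    return max(1, v.bit_length())

# Toy stand-ins for compression algorithms (hypothetical, not from the paper).
# The selector only calls them and measures the resulting size in bits.
def bitpack_bits(values):
    # Static bit packing: every value uses the width of the largest value.
    return len(values) * max(bit_width(v) for v in values)

def varint_bits(values):
    # Byte-aligned varint: 7 payload bits per stored byte of 8 bits.
    return sum(8 * -(-bit_width(v) // 7) for v in values)

ALGORITHMS = {"bitpack": bitpack_bits, "varint": varint_bits}

def features(values):
    """Assumed data characteristics: mean and maximum bit width."""
    widths = [bit_width(v) for v in values]
    return (sum(widths) / len(widths), max(widths))

class NearestNeighborModel:
    """Local model for ONE algorithm: predicts bits per value via 1-NN."""
    def __init__(self):
        self.samples = []

    def fit(self, feats, bits_per_value):
        self.samples.append((feats, bits_per_value))

    def predict(self, feats):
        def squared_distance(sample):
            return sum((a - b) ** 2 for a, b in zip(sample[0], feats))
        return min(self.samples, key=squared_distance)[1]

def synthetic_column(rng, n=256):
    # Mostly small values with occasional outliers of larger magnitude.
    small_bits = rng.randint(1, 16)
    outlier_bits = rng.randint(small_bits, 32)
    outlier_rate = rng.choice([0.0, 0.01, 0.1])
    return [
        rng.getrandbits(outlier_bits if rng.random() < outlier_rate else small_bits)
        for _ in range(n)
    ]

def train(seed=0, columns=500):
    """Fit one model per algorithm using synthetic columns only."""
    rng = random.Random(seed)
    models = {name: NearestNeighborModel() for name in ALGORITHMS}
    for _ in range(columns):
        col = synthetic_column(rng)
        f = features(col)
        for name, algo in ALGORITHMS.items():
            models[name].fit(f, algo(col) / len(col))
    return models

def select(models, values):
    """Pick the algorithm whose local model predicts the smallest size."""
    f = features(values)
    return min(models, key=lambda name: models[name].predict(f))

if __name__ == "__main__":
    models = train()
    print(select(models, [3] * 256))                # uniform small values
    print(select(models, [5] * 255 + [2 ** 30]))    # one large outlier
```

Because each algorithm has its own model, adding a further algorithm in this sketch only requires registering another size function and training one more model; the existing models stay untouched.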

Details

Original language	English
Title	26th International Conference on Extending Database Technology (EDBT 2023)
Publisher	OpenProceedings.org
Pages	552-564
Number of pages	13
Volume	26
Edition	3
Publication status	Published - 28 March 2023
Peer-reviewed	Yes

External IDs

dblp	conf/edbt/WoltmannDHHL23
Scopus	85165055743
ORCID	/0000-0001-8107-2775/work/142253565
