Data compression has recently experienced a revival in the domain of in-memory column stores. In this field, a large corpus of lightweight integer compression algorithms plays a dominant role since all columns are typically encoded as sequences of integer values. Unfortunately, there is no single best integer compression algorithm; the best choice depends on data and hardware properties. For this reason, selecting the best-fitting integer compression algorithm is increasingly important and is an interesting tuning knob for optimization. However, traditional selection strategies require profound knowledge of the (de-)compression algorithms for decision-making, which limits their broad applicability. To counteract this, we propose a novel learned selection strategy that treats integer compression algorithms as independent black boxes. This black-box approach ensures broad applicability, but it requires machine learning-based methods to model the knowledge needed for decision-making. Most importantly, we show that a local approach, where every algorithm is modeled individually, plays a crucial role. Moreover, our learned selection strategy generalizes because it is independent of user data. Finally, we evaluate our approach and compare it against existing selection strategies to show the benefits of our learned selection strategy.
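The local black-box idea summarized above can be illustrated with a small sketch. It is not the paper's implementation. The two toy compressors, the two features (mean and maximum bit width), the synthetic data generator, and the 1-nearest-neighbour model are all assumptions chosen for brevity. Each algorithm is only ever run and measured, one model is fitted per algorithm, and training uses synthetic columns rather than user data.

```python
import random

# Hypothetical black-box compressors: each only reports the compressed
# size in bits for a list of non-negative integers.
def bitpack_size(values):
    # Static bit packing: every value uses the width of the largest one.
    width = max(max(values).bit_length(), 1)
    return width * len(values)

def varbyte_size(values):
    # Variable-byte coding: 7 payload bits per byte, at least one byte.
    return sum(8 * max((v.bit_length() + 6) // 7, 1) for v in values)

ALGORITHMS = {"bitpack": bitpack_size, "varbyte": varbyte_size}

def features(values):
    # Assumed data features: mean and maximum bit width of the column.
    widths = [v.bit_length() for v in values]
    return (sum(widths) / len(widths), max(widths))

class LocalModel:
    """1-nearest-neighbour regressor for a single algorithm."""
    def __init__(self):
        self.samples = []  # list of (features, bits per integer)

    def fit(self, feats, bits_per_int):
        self.samples.append((feats, bits_per_int))

    def predict(self, feats):
        def dist(sample):
            return sum((a - b) ** 2 for a, b in zip(sample[0], feats))
        return min(self.samples, key=dist)[1]

def synthetic_column(rng, n, base_width, outlier_width, outlier_rate):
    # Mostly values of base_width bits, with occasional wider outliers.
    return [
        rng.getrandbits(outlier_width if rng.random() < outlier_rate else base_width)
        for _ in range(n)
    ]

def train(seed=0, n=1000):
    # One model per algorithm (local approach), trained on synthetic data only.
    rng = random.Random(seed)
    models = {name: LocalModel() for name in ALGORITHMS}
    for base_width in range(1, 33):
        for outlier_width in (8, 16, 24, 32):
            for outlier_rate in (0.0, 0.01, 0.1):
                col = synthetic_column(rng, n, base_width, outlier_width, outlier_rate)
                feats = features(col)
                for name, algo in ALGORITHMS.items():
                    # Black box: run the algorithm and record the observed size.
                    models[name].fit(feats, algo(col) / len(col))
    return models

def select(models, values):
    # Choose the algorithm whose model predicts the smallest output.
    feats = features(values)
    return min(models, key=lambda name: models[name].predict(feats))
```

With `models = train()`, calling `select(models, column)` returns the name of the algorithm predicted to compress `column` best. In this toy setup, a column of uniformly small values should favour `bitpack`, while a column of small values with a few very large outliers should favour `varbyte`. Adding a further algorithm only requires one more entry in `ALGORITHMS` and its own model, which is the practical appeal of modeling each algorithm individually.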
|Title of host publication|26th International Conference on Extending Database Technology (EDBT 2023)|
|Number of pages|13|
|Publication status|Published - 28 Mar 2023|