Small Selectivities Matter: Lifting the Burden of Empty Samples

Publication: Contribution to book/conference proceedings/anthology/report > Contribution to conference proceedings > Contributed > Peer-reviewed

Contributors

  • Axel Hertzschuch, Chair of Databases (Author)
  • Guido Moerkotte, University of Mannheim (Author)
  • Wolfgang Lehner, Chair of Databases (Author)
  • Norman May, SAP Research (Author)
  • Florian Wolf, SAP Research (Author)
  • Lars Fricke, SAP Research (Author)

Abstract

Every year more and more advanced approaches to cardinality estimation are published, using learned models or other data and workload specific synopses. In contrast, the majority of commercial in-memory systems still relies on sampling. It is arguably the most general and easiest estimator to implement. While most methods do not seem to improve much over sampling-based estimators in the presence of non-selective queries, sampling struggles with highly selective queries due to limitations of the sample size. Especially in situations where no sample tuple qualifies, optimizers fall back to basic heuristics that ignore attribute correlations and lead to large estimation errors. In this work, we present a novel approach, dealing with these 0-Tuple Situations. It is ready to use in any DBMS capable of sampling, showing a negligible impact on optimization time. Our experiments on real world and synthetic data sets demonstrate up to two orders of magnitude reduced estimation errors. Enumerating single filter predicates according to our estimates reveals 1.3 to 1.8 times faster query responses for complex filters.
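
To make the 0-Tuple Situation described in the abstract concrete, the following is a minimal sketch and not the paper's method. It only shows how a sample-based estimate collapses to zero for a conjunctive filter and how an independence-based fallback behaves. The sample data, the predicates, and the helper names `sample_selectivity` and `independence_fallback` are hypothetical and chosen purely for illustration.

```python
from typing import Callable, Sequence, Tuple

Row = Tuple[int, str]
Predicate = Callable[[Row], bool]


def sample_selectivity(sample: Sequence[Row], predicate: Predicate) -> float:
    """Fraction of sample tuples that satisfy the predicate."""
    hits = sum(1 for row in sample if predicate(row))
    return hits / len(sample)


def independence_fallback(sample: Sequence[Row],
                          predicates: Sequence[Predicate]) -> float:
    """Multiply per-predicate sample selectivities.

    This is an assumed, generic heuristic: it treats the predicates as
    independent and therefore ignores attribute correlations.
    """
    estimate = 1.0
    for predicate in predicates:
        estimate *= sample_selectivity(sample, predicate)
    return estimate


# Hypothetical 10-tuple sample of a table with attributes (a, b).
SAMPLE = [
    (1, "y"), (1, "y"), (2, "x"), (2, "x"), (3, "y"),
    (3, "z"), (4, "z"), (4, "z"), (5, "y"), (5, "z"),
]


def a_is_1(row: Row) -> bool:
    return row[0] == 1


def b_is_x(row: Row) -> bool:
    return row[1] == "x"


def both(row: Row) -> bool:
    return a_is_1(row) and b_is_x(row)


# No sample tuple satisfies the conjunction: a 0-Tuple Situation.
direct = sample_selectivity(SAMPLE, both)

# The fallback multiplies the two single-predicate selectivities.
fallback = independence_fallback(SAMPLE, [a_is_1, b_is_x])
```

In this toy sample, `direct` is zero because no tuple satisfies both predicates, while `fallback` is roughly 0.2 × 0.2 = 0.04 whether or not `a` and `b` are correlated. In the sample, `a = 1` only ever occurs together with `b = "y"`, so the true selectivity of the conjunction could be far smaller than the fallback suggests. That is the kind of error the abstract attributes to heuristics that ignore attribute correlations.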

Details

Original language: English
Title: SIGMOD '21: Proceedings of the 2021 International Conference on Management of Data
Publisher: Association for Computing Machinery, Inc
Pages: 697-709
Number of pages: 13
Publication status: Published - 2021
Peer-reviewed: Yes

Conference

Title: 2021 International Conference on Management of Data, SIGMOD 2021
Duration: 20 - 25 June 2021
City: Virtual, Online
Country: China

External IDs

Scopus 85108979368
ORCID /0000-0001-8107-2775/work/142253441

Keywords

  • beta distribution, filter predicate ordering, in-memory, OLAP, sampling, small selectivity