Local Learning Strategies for Data Management Components

Research output: Types of ThesisDoctoral thesis

Contributors

Abstract

In a world with an ever-increasing amount of data processed, providing tools for highquality and fast data processing is imperative. Database Management Systems (DBMSs) are complex adaptive systems supplying reliable and fast data analysis and storage capabilities. To boost the usability of DBMSs even further, a core research area of databases is performance optimization, especially for query processing. With the successful application of Artificial Intelligence (AI) and Machine Learning (ML) in other research areas, the question arises in the database community if ML can also be beneficial for better data processing in DBMSs. This question has spawned various works successfully replacing DBMS components with ML models. However, these global models have four common drawbacks due to their large, complex, and inflexible one-size-fits-all structures. These drawbacks are the high complexity of model architectures, the lower prediction quality, the slow training, and the slow forward passes. All these drawbacks stem from the core expectation to solve a certain problem with one large model at once. The full potential of ML models as DBMS components cannot be reached with a global model because the model's complexity is outmatched by the problem's complexity. Therefore, we present a novel general strategy for using ML models to solve data management problems and to replace DBMS components. The novel strategy is based on four advantages derived from the four disadvantages of global learning strategies. In essence, our local learning strategy utilizes divide-and-conquer to place less complex but more expressive models specializing in sub-problems of a data management problem. It splits the problem space into less complex parts that can be solved with lightweight models. This circumvents the one-size-fits-all characteristics and drawbacks of global models. We will show that this approach and the lesser complexity of the specialized local models lead to better problem-solving qualities and DBMS performance. The local learning strategy is applied and evaluated in three crucial use cases to replace DBMS components with ML models. These are cardinality estimation, query optimizer hinting, and integer algorithm selection. In all three applications, the benefits of the local learning strategy are demonstrated and compared to related work. We also generalize the strategy's usability for a broader application and formulate best practices with instructions for others.

Details

Original languageEnglish
Qualification levelDr.-Ing.
Awarding Institution
Supervisors/Advisors
  • Lehner, Wolfgang, Reviewer
  • Schüle, Maximilian E., Reviewer, External person
Defense Date (Date of certificate)14 Dec 2023
Publication statusPublished - 18 Dec 2023
No renderer: customAssociatesEventsRenderPortal,dk.atira.pure.api.shared.model.researchoutput.Thesis

External IDs

ORCID /0000-0003-0720-8878/work/150329798

Keywords

Keywords

  • Databases, Machine Learning, Local Approach