De2Dup: Extended Deduplication for Multi-Tenant Databases
Publication: Contribution to book/conference proceedings/anthology/report › Conference contribution › Contributed › Peer-reviewed
Abstract
Content-based page sharing (de-duplication) is a heavily used technique to improve memory efficiency in virtualized systems by identifying and merging identical pages. For many years now, the Linux kernel has offered this de-duplication technique via the Kernel Samepage Merging (KSM) feature. Although KSM works well in general, it is not used in multi-tenant database systems, even though multiple tenants often manage similar data. First, pages must be binary identical, which is a severe restriction. Second, KSM is seemingly scheduled as a single-threaded process by the OS, independently of the database workload, which further limits its applicability for in-memory systems with terabytes of main memory. To overcome these limitations, we propose an extended de-duplication mechanism called De2Dup for memory-centric multi-tenant database engines. De2Dup extends de-duplication with a delta mechanism to significantly broaden its applicability, especially when pages are not binary identical. Moreover, our De2Dup mechanism allows steering the search for duplicates and has low overhead, as we are able to offload the complete execution asynchronously to a modern on-chip accelerator for memory operations on recent Intel server processors. In addition, De2Dup offers an efficient way to perform on-the-fly, tenant-aware data reconstruction during scan operations.
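The delta idea in the abstract can be illustrated with a minimal software sketch. It is not the authors' implementation and does not use the accelerator: the 8-byte comparison granularity, the function names (`create_delta`, `apply_delta`, `deduplicate`), the `max_words` threshold, and the choice of a single shared base page are all assumptions made for illustration. A page that is almost identical to a shared base page is stored as a short list of differing words, and the tenant's view is rebuilt on demand.

```python
from typing import Dict, List, Tuple, Union

WORD = 8  # comparison granularity in bytes (assumption for this sketch)

# A delta is a list of (byte offset, differing word) pairs.
Delta = List[Tuple[int, bytes]]
Entry = Tuple[str, Union[Delta, bytes]]


def create_delta(base: bytes, page: bytes) -> Delta:
    """Return the words of `page` that differ from `base`."""
    if len(base) != len(page) or len(base) % WORD != 0:
        raise ValueError("pages must have equal, word-aligned size")
    return [
        (off, page[off:off + WORD])
        for off in range(0, len(base), WORD)
        if base[off:off + WORD] != page[off:off + WORD]
    ]


def apply_delta(base: bytes, delta: Delta) -> bytes:
    """Rebuild a tenant's page from the shared base page and its delta."""
    out = bytearray(base)
    for off, word in delta:
        out[off:off + WORD] = word
    return bytes(out)


def deduplicate(base: bytes, tenant_pages: Dict[str, bytes],
                max_words: int) -> Dict[str, Entry]:
    """Keep a delta when it is small enough, otherwise keep the full page."""
    store: Dict[str, Entry] = {}
    for tenant, page in tenant_pages.items():
        delta = create_delta(base, page)
        if len(delta) <= max_words:
            store[tenant] = ("delta", delta)
        else:
            store[tenant] = ("full", page)
    return store


# Usage: tenant A matches the base exactly, tenant B differs in one word.
base = bytes(4096)
page_b = bytearray(base)
page_b[16:24] = b"tenantB!"
store = deduplicate(base, {"A": base, "B": bytes(page_b)}, max_words=4)
rebuilt_b = apply_delta(base, store["B"][1])
```

In this sketch a binary-identical page yields an empty delta, so classic page merging falls out as the special case where nothing differs, and a scan for tenant B only needs the base page plus one stored word.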
Details
| Original language | English |
|---|---|
| Title | 21st International Workshop on Data Management on New Hardware, DaMoN 2025 |
| Publisher | Association for Computing Machinery, Inc |
| Number of pages | 9 |
| ISBN (electronic) | 979-8-4007-1940-0 |
| Publication status | Published - 10 July 2025 |
| Peer-review status | Yes |
Workshop
| Title | 21st International Workshop on Data Management on New Hardware |
|---|---|
| Short title | DaMoN 2025 |
| Event number | 21 |
| Date | 23 June 2025 |
| Location | Intercontinental Berlin |
| City | Berlin |
| Country | Germany |
External IDs
| ORCID | /0000-0001-8107-2775/work/194824064 |
|---|
Keywords
- Data Access, De-Duplication, Intel DSA, Multi-Tenancy