Needles in the haystack — tackling bit flips in lightweight compressed data

Research output: Contribution to book/Conference proceedings/Anthology/Report > Conference contribution > Contributed > peer-review

Contributors

Abstract

Modern database systems are very often in the position to store their entire data in main memory. Aside from increased main memory capacities, a further driver for in-memory database systems has been the shift to a column-oriented storage format in combination with lightweight data compression techniques. Using both software concepts, large datasets can be held and efficiently processed in main memory with a low memory footprint. Unfortunately, hardware is becoming more and more vulnerable to random faults, so that, e.g., the probability of bit flips in main memory increases, and this rate is likely to escalate in future dynamic random-access memory (DRAM) modules. Since the data is highly compressed by the lightweight compression algorithms, multi-bit flips will have an extreme impact on the reliability of database systems. To tackle this reliability issue, we introduce our research on error-resilient lightweight data compression algorithms in this paper. Of course, our software approach lacks the efficiency of a hardware realization, but its flexibility and adaptability will play a more important role regarding differing error rates, e.g., due to hardware aging effects and aggressive processor voltage and frequency scaling. Arithmetic AN encoding is one family of codes that is an interesting candidate for effective software-based error detection. We present results of our research showing tradeoffs between compressibility and resiliency characteristics of data. We show that particular choices of the AN-code parameter lead to a moderate loss of performance. We provide an evaluation of two proposed techniques, namely AN-encoded Null Suppression and AN-encoded Run Length Encoding.
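
To make the idea of arithmetic AN encoding concrete, the following is a minimal Python sketch of the general technique, not the implementation evaluated in the paper. An integer is stored as its product with a constant A, and a stored word is accepted only if it is still divisible by A. The constant `A = 641`, all function names, and the way run-length pairs are protected are assumptions made purely for illustration.

```python
from itertools import groupby

# Hypothetical AN-code parameter. Any odd A > 1 detects every single-bit flip,
# because a flip changes the word by +/- 2^k and an odd A never divides 2^k.
A = 641


def an_encode(value: int, a: int = A) -> int:
    """Encode an integer as the code word value * a."""
    return value * a


def an_is_valid(code: int, a: int = A) -> bool:
    """A code word is considered valid only if it is a multiple of a."""
    return code % a == 0


def an_decode(code: int, a: int = A) -> int:
    """Check the code word and recover the original integer."""
    if not an_is_valid(code, a):
        raise ValueError("corrupted code word detected")
    return code // a


def an_rle_encode(values, a: int = A):
    """Run-length encode a sequence, AN-encoding both run value and run length.

    This pairing is an assumed layout for illustration only.
    """
    return [(an_encode(v, a), an_encode(len(list(run)), a))
            for v, run in groupby(values)]


def an_rle_decode(runs, a: int = A):
    """Verify and expand AN-encoded (value, length) pairs."""
    out = []
    for code_value, code_length in runs:
        out.extend([an_decode(code_value, a)] * an_decode(code_length, a))
    return out


if __name__ == "__main__":
    word = an_encode(42)
    flipped = word ^ (1 << 5)          # simulate a single bit flip in memory
    print(an_is_valid(word), an_is_valid(flipped))
    print(an_rle_decode(an_rle_encode([7, 7, 7, 3, 3])))
```

In this sketch, every single-bit flip is caught for any odd A greater than 1. How many multi-bit flips are caught depends on the choice of A, and a larger A also widens each code word, which is one way to picture the tradeoff between compressibility and resiliency mentioned in the abstract.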

Details

Original language: English
Title of host publication: Data Management Technologies and Applications
Editors: Orlando Belo, Andreas Holzinger, Markus Helfert, Chiara Francalanci
Publisher: Springer-Verlag
Pages: 135-153
Number of pages: 19
ISBN (electronic): 978-3-319-30162-4
ISBN (print): 978-3-319-30161-7
Publication status: Published - 2016
Peer-reviewed: Yes

Publication series

Series: Communications in Computer and Information Science
Volume: 584
ISSN: 1865-0929

Conference

Title: 4th International Conference on Data Management Technologies and Applications, DATA 2015
Duration: 20 - 22 July 2015
City: Colmar
Country: France

External IDs

ORCID /0000-0001-8107-2775/work/198592307
