AHEAD: Adaptable data hardening for on-the-fly hardware error detection during database query processing

Research output: Contribution to book/conference proceedings/anthology/report › Conference contribution › Contributed › peer-review

Contributors

Abstract

It has long been known that hardware components are not perfect and that soft errors in the form of single bit flips happen all the time. Until now, these single bit flips have mainly been addressed in hardware using general-purpose protection techniques. However, recent studies have shown that future hardware components will become less and less reliable overall, with multi-bit flips occurring regularly rather than exceptionally. Additionally, hardware aging effects will lead to error models that change during run-time. Scaling hardware-based protection techniques to cover changing multi-bit flips is possible, but it introduces large performance, chip area, and power overheads, which will become unaffordable in the future. To tackle this, an emerging research direction is to employ protection techniques in higher software layers such as compilers or applications. The knowledge available at these layers can be used to specialize and adapt protection techniques efficiently. Thus, in this paper we propose AHEAD, a novel adaptable and on-the-fly hardware error detection approach for database systems. AHEAD provides configurable end-to-end error detection and reduces the storage and computation overhead compared to other techniques at this level. Our approach uses an arithmetic error coding technique that allows query processing to work entirely on hardened data. This in turn enables on-the-fly detection, during query processing, of (i) errors that modify data stored in memory or transferred on an interconnect and (ii) errors induced during computations. Our exhaustive evaluation clearly shows the benefits of our AHEAD approach.
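To illustrate how an arithmetic error code can let a query operator work directly on hardened data, the following is a minimal sketch using AN coding, a well-known arithmetic code in which every value is multiplied by a constant A. The abstract does not name the specific code or any parameters, so the choice of AN coding here, the constant `A = 641`, and the helper names (`harden`, `is_valid`, `soften`, `hardened_sum`) are all assumptions made for illustration and not the paper's implementation.

```python
# Illustrative AN-coding sketch (assumed technique and names, not taken from the paper).
A = 641  # hypothetical code constant; odd and > 1, so no power of two is a multiple of it


def harden(value: int) -> int:
    """Encode a plain integer as a code word (a multiple of A)."""
    return value * A


def is_valid(code_word: int) -> bool:
    """A code word is valid only if it is still a multiple of A."""
    return code_word % A == 0


def soften(code_word: int) -> int:
    """Decode a code word, refusing to decode a corrupted one."""
    if not is_valid(code_word):
        raise ValueError("corrupted code word detected")
    return code_word // A


def hardened_sum(column):
    """Sum a hardened column without decoding it.

    Each input is checked while it is read (memory or transfer errors),
    and the result is checked once more (errors during the additions).
    The sum of multiples of A is again a multiple of A, so the result
    stays hardened.
    """
    total = 0
    for code_word in column:
        if not is_valid(code_word):
            raise ValueError("corrupted code word detected in input")
        total += code_word
    if not is_valid(total):
        raise ValueError("corrupted code word detected in result")
    return total


column = [harden(v) for v in (3, 14, 15)]
print(soften(hardened_sum(column)))  # decoded sum of 3 + 14 + 15

corrupted = list(column)
corrupted[1] ^= 1 << 5  # simulate a single bit flip in a stored value
try:
    hardened_sum(corrupted)
except ValueError as err:
    print("detected:", err)
```

In this sketch a single bit flip changes a stored code word by plus or minus a power of two, which is never a multiple of an odd A greater than 1, so the divisibility check catches it. Larger values of A would be expected to detect more multi-bit patterns at the price of wider code words, which is the kind of storage versus detection trade-off that a configurable scheme could expose.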

Details

Original language: English
Title of host publication: SIGMOD '18: Proceedings of the 2018 International Conference on Management of Data
Publisher: Association for Computing Machinery (ACM), New York
Pages: 1619-1634
Number of pages: 16
ISBN (print): 978-1-4503-4703-7
Publication status: Published - 27 May 2018
Peer-reviewed: Yes

Publication series

Series: MOD: International Conference on Management of Data (SIGMOD)

Conference

Title: 44th ACM SIGMOD International Conference on Management of Data, SIGMOD 2018
Duration: 10 - 15 June 2018
City: Houston
Country: United States of America

External IDs

dblp: conf/sigmod/KolditzHL0B18
Scopus: 85048809259
ORCID: /0000-0001-8107-2775/work/142253575

Keywords

  • Database systems, Error detection, Query processing, Reliability