Elzar: Triple Modular Redundancy using Intel Advanced Vector Extensions (technical report)
Research output: Contribution to book/Conference proceedings/Anthology/Report › Conference contribution › Contributed › peer-review
Abstract
Instruction-Level Redundancy (ILR) is a well-known approach to tolerate transient CPU faults. It replicates
instructions in a program and inserts periodic checks to detect
and correct CPU faults using majority voting, which essentially
requires three copies of each instruction and leads to high
performance overheads. As SIMD technology can operate simultaneously
on several copies of the data, it appears to be a good
candidate for decreasing these overheads. To verify this hypothesis,
we propose ELZAR, a compiler framework that transforms
unmodified multithreaded applications to support triple modular
redundancy using Intel AVX extensions for vectorization. Our
experience with several benchmark suites and real-world case
studies yields mixed results: while SIMD may be beneficial for
some workloads, e.g., CPU-intensive ones with many floating-point
operations, it exhibits higher overhead than ILR in many
applications we tested. We study the sources of overheads and
discuss possible improvements to Intel AVX that would lead to
better performance.
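The idea in the abstract, keeping three copies of a value and correcting a transient fault by majority voting, can be illustrated with a small sketch. This is not ELZAR's implementation: it simulates SIMD lanes with plain Python lists rather than AVX registers, and the names `replicate`, `lanewise_add`, `flip_bit`, and `majority_vote` are hypothetical helpers introduced only for illustration.

```python
from collections import Counter

LANES = 3  # assumption: three replicas, one per simulated SIMD lane


def replicate(value, lanes=LANES):
    """Broadcast a scalar into every lane of a simulated vector."""
    return [value] * lanes


def lanewise_add(a, b):
    """Apply the same addition independently in each lane."""
    return [x + y for x, y in zip(a, b)]


def flip_bit(vec, lane, bit):
    """Simulate a transient fault by flipping one bit in one lane."""
    faulty = list(vec)
    faulty[lane] ^= 1 << bit
    return faulty


def majority_vote(vec):
    """Return the value held by a strict majority of lanes, or raise if none."""
    value, count = Counter(vec).most_common(1)[0]
    if count * 2 <= len(vec):
        raise RuntimeError("no majority: fault cannot be corrected")
    return value


a = replicate(20)
b = replicate(22)
result = lanewise_add(a, b)                   # every lane holds the same sum
corrupted = flip_bit(result, lane=1, bit=3)   # one replica is now wrong
recovered = majority_vote(corrupted)          # the two intact lanes outvote it
```

With three replicas, a fault confined to a single lane is outvoted by the two intact lanes; if all three lanes disagree, the sketch raises an error instead of guessing.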
Details
Original language | Undefined |
---|---|
Title of host publication | arXiv:1604.00500 |
Number of pages | 13 |
Publication status | Published - 2016 |
Peer-reviewed | Yes |