SFU-driven transparent approximation acceleration on GPUs

Research output: Contribution to book/conference proceedings/anthology/report › Conference contribution › Contributed › peer-review

Contributors

  • Ang Li, Eindhoven University of Technology (Author)
  • Shuaiwen Leon Song, Pacific Northwest National Laboratory (Author)
  • Mark Wijtvliet, Eindhoven University of Technology (Author)
  • Akash Kumar, Chair of Processor Design (cfaed) (Author)
  • Henk Corporaal, Eindhoven University of Technology (Author)

Abstract

Approximate computing, the technique that sacrifices a certain amount of accuracy in exchange for a substantial performance boost or power reduction, is one of the most promising solutions to enable power control and performance scaling towards exascale. Although most existing approximation designs target emerging data-intensive applications that are comparatively more error-tolerant, there is still high demand for the acceleration of traditional scientific applications (e.g., weather and nuclear simulation), which often comprise intensive transcendental function calls and are very sensitive to accuracy loss. To address this challenge, we focus on a very important but long-ignored approximation unit on today's commercial GPUs, the special-function unit (SFU), and clarify its unique role in performance acceleration of accuracy-sensitive applications in the context of approximate computing. To better understand its features, we conduct a thorough empirical analysis on three generations of NVIDIA GPU architectures to evaluate all the single-precision and double-precision numeric transcendental functions that can be accelerated by SFUs, in terms of their performance, accuracy and power consumption. Based on the insights from the evaluation, we propose a transparent, tractable and portable design framework for SFU-driven approximate acceleration on GPUs. Our design is software-based and requires no hardware or application modifications. Experimental results on three NVIDIA GPU platforms demonstrate that our proposed framework can provide fine-grained tuning for performance and accuracy trade-offs, thus helping applications achieve the maximum performance under given accuracy constraints.

Details

Original language: English
Title of host publication: ICS '16: Proceedings of the 2016 International Conference on Supercomputing
Publisher: Association for Computing Machinery (ACM), New York
Publication status: Published - 1 Jun 2016
Peer-reviewed: Yes

Publication series

Series: ICS: International Conference on Supercomputing

Conference

Title: 30th International Conference on Supercomputing, ICS 2016
Duration: 1 - 3 June 2016
City: Istanbul
Country: Turkey

Keywords

  • Approximate computing, GPU, Performance/energy/accuracy trade-offs, Program transformation, Special-function-unit