SFU-driven transparent approximation acceleration on GPUs

Ang Li; Shuaiwen Leon Song; Mark Wijtvliet; Akash Kumar; Henk Corporaal

doi:10.1145/2925426.2926255

Research output: Contribution to book/Conference proceedings/Anthology/Report › Conference contribution › Contributed › peer-review

Contributors

Ang Li, Eindhoven University of Technology (Author)
Shuaiwen Leon Song, Pacific Northwest National Laboratory (Author)
Mark Wijtvliet, Eindhoven University of Technology (Author)
Akash Kumar, Chair of Processor Design (cfaed) (Author)
Henk Corporaal, Eindhoven University of Technology (Author)

Abstract

Approximate computing, the technique that sacrifices a certain amount of accuracy in exchange for a substantial performance boost or power reduction, is one of the most promising solutions to enable power control and performance scaling towards exascale. Although most existing approximation designs target emerging data-intensive applications that are comparatively more error-tolerant, there is still high demand for the acceleration of traditional scientific applications (e.g., weather and nuclear simulation), which often comprise intensive transcendental function calls and are very sensitive to accuracy loss. To address this challenge, we focus on a very important but long-ignored approximation unit on today's commercial GPUs, the special-function unit (SFU), and clarify its unique role in the performance acceleration of accuracy-sensitive applications in the context of approximate computing. To better understand its features, we conduct a thorough empirical analysis on three generations of NVIDIA GPU architectures to evaluate all the single-precision and double-precision numeric transcendental functions that can be accelerated by SFUs, in terms of their performance, accuracy and power consumption. Based on the insights from the evaluation, we propose a transparent, tractable and portable design framework for SFU-driven approximate acceleration on GPUs. Our design is software-based and requires no hardware or application modifications. Experimental results on three NVIDIA GPU platforms demonstrate that our proposed framework can provide fine-grained tuning of performance and accuracy trade-offs, thus helping applications achieve the maximum performance under given accuracy constraints.

Details

Original language	English
Title of host publication	ICS '16: Proceedings of the 2016 International Conference on Supercomputing
Publisher	Association for Computing Machinery (ACM), New York
Publication status	Published - 1 Jun 2016
Peer-reviewed	Yes

Publication series

Series	ICS: International Conference on Supercomputing

Conference

Title	30th International Conference on Supercomputing, ICS 2016
Duration	1 - 3 June 2016
City	Istanbul
Country	Turkey

Keywords

Research priority areas of TU Dresden

Information Technology and Microelectronics

ASJC Scopus subject areas

General Computer Science

Keywords

Approximate computing, GPU, Performance/energy/accuracy trade-offs, Program transformation, Special-function-unit
