SFU-driven transparent approximation acceleration on GPUs

Ang Li; Shuaiwen Leon Song; Mark Wijtvliet; Akash Kumar; Henk Corporaal

doi:10.1145/2925426.2926255

Research output: Contribution to book/conference proceedings/anthology/report › Conference contribution › Contributed › Peer-reviewed

Contributors

Ang Li, Eindhoven University of Technology (Author)
Shuaiwen Leon Song, Pacific Northwest National Laboratory (Author)
Mark Wijtvliet, Eindhoven University of Technology (Author)
Akash Kumar, Chair of Processor Design (cfaed) (Author)
Henk Corporaal, Eindhoven University of Technology (Author)

Abstract

Approximate computing, the technique that sacrifices a certain amount of accuracy in exchange for a substantial performance boost or power reduction, is one of the most promising solutions to enable power control and performance scaling towards exascale. Although most existing approximation designs target emerging data-intensive applications that are comparatively more error-tolerant, there is still high demand for accelerating traditional scientific applications (e.g., weather and nuclear simulation), which often comprise intensive transcendental function calls and are very sensitive to accuracy loss. To address this challenge, we focus on a very important but long-ignored approximation unit on today's commercial GPUs, the special-function unit (SFU), and clarify its unique role in accelerating accuracy-sensitive applications in the context of approximate computing. To better understand its features, we conduct a thorough empirical analysis on three generations of NVIDIA GPU architectures to evaluate all the single-precision and double-precision numeric transcendental functions that can be accelerated by SFUs, in terms of their performance, accuracy and power consumption. Based on the insights from the evaluation, we propose a transparent, tractable and portable design framework for SFU-driven approximate acceleration on GPUs. Our design is software-based and requires no hardware or application modifications. Experimental results on three NVIDIA GPU platforms demonstrate that our proposed framework can provide fine-grained tuning of performance and accuracy trade-offs, thus helping applications achieve the maximum performance under given accuracy constraints.

Details

Original language	English
Title	ICS '16: Proceedings of the 2016 International Conference on Supercomputing
Publisher	Association for Computing Machinery (ACM), New York
Publication status	Published - 1 June 2016
Peer-review status	Yes

Publication series

Series	ICS: International Conference on Supercomputing

Conference

Title	30th International Conference on Supercomputing, ICS 2016
Duration	1 - 3 June 2016
City	Istanbul
Country	Turkey

Keywords

Research priority areas of TU Dresden

Information Technology and Microelectronics

ASJC Scopus subject areas

General Computer Science

Keywords

Approximate computing, GPU, Performance/energy/accuracy trade-offs, Program transformation, Special-function-unit
