Increasing Throughput of In-Memory DNN Accelerators by Flexible Layerwise DNN Approximation.

Publication: Contribution to journal › Research article › Contributed › Peer-reviewed

Contributors

  • Cecilia De la Parra, Robert Bosch GmbH (Author)
  • Taha Soliman, Robert Bosch GmbH (Author)
  • Andre Guntoro, Robert Bosch GmbH (Author)
  • Akash Kumar, Chair of Processor Design (cfaed) (Author)
  • Norbert Wehn, Rheinland-Pfälzische Technische Universität Kaiserslautern-Landau (Author)

Abstract

Approximate computing and mixed-signal in-memory accelerators are promising paradigms for significantly reducing the computational requirements of deep neural network (DNN) inference without accuracy loss. In this work, we present a novel in-memory design for layerwise approximate computation at different approximation levels. A sensitivity-based high-dimensional search is performed to explore the optimal approximation level for each DNN layer. Our new methodology offers high flexibility and an optimal tradeoff between accuracy and throughput, which we demonstrate by an extensive evaluation on various DNN benchmarks for medium- and large-scale image classification with CIFAR10, CIFAR100, and ImageNet. With our novel approach, we reach an average speedup of 5×, and up to 8×, without accuracy loss.
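The abstract mentions a sensitivity-based search over per-layer approximation levels. The Python sketch below illustrates one way such a search could look; it is not the authors' implementation. The callbacks `evaluate_accuracy` and `set_layer_approximation`, the level encoding (0 = exact computation, higher = more aggressive approximation), and the greedy visiting order are all assumptions introduced here for illustration.

```python
# Illustrative sketch (hypothetical helpers, not the paper's code): greedily assign
# each layer the most aggressive approximation level that keeps accuracy within budget.

def search_layerwise_levels(model, layers, levels, evaluate_accuracy,
                            set_layer_approximation, max_accuracy_drop=0.0):
    """Return a {layer: level} assignment found by a sensitivity-guided greedy search."""
    baseline = evaluate_accuracy(model)          # accuracy with exact computation
    assignment = {layer: 0 for layer in layers}  # level 0 = exact computation

    # 1) Sensitivity analysis: accuracy drop when a single layer is fully approximated.
    sensitivity = {}
    for layer in layers:
        set_layer_approximation(model, layer, levels[-1])   # most aggressive level
        sensitivity[layer] = baseline - evaluate_accuracy(model)
        set_layer_approximation(model, layer, 0)            # restore exact computation

    # 2) Greedy assignment: visit the least sensitive layers first and raise their
    #    approximation level as far as the accuracy budget allows.
    for layer in sorted(layers, key=lambda l: sensitivity[l]):
        for level in reversed(levels):                      # try aggressive levels first
            set_layer_approximation(model, layer, level)
            if baseline - evaluate_accuracy(model) <= max_accuracy_drop:
                assignment[layer] = level
                break
            set_layer_approximation(model, layer, 0)        # roll back if too lossy

    return assignment
```

Under these assumptions, the search costs one accuracy evaluation per layer for the sensitivity pass plus at most one evaluation per (layer, level) pair in the greedy pass, which keeps the exploration of the high-dimensional level space tractable.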

Details

Original language: English
Pages (from - to): 17-24
Number of pages: 8
Journal: IEEE Micro
Volume: 42
Issue number: 6
Publication status: Published - 2022
Peer-review status: Yes

External IDs

Scopus 85135767385

Keywords

Research profile lines of TU Dresden

Library keywords