Full-Stack Optimization for CAM-Only DNN Inference

Research output: Contribution to book/Conference proceedings/Anthology/Report › Conference contribution › Contributed › Peer-reviewed

Abstract

The accuracy of neural networks has greatly improved across various domains in recent years. Their ever-increasing complexity, however, leads to prohibitively high energy demands and latency in von Neumann systems. Several computing-in-memory (CIM) systems have recently been proposed to overcome this, but trade-offs involving accuracy, hardware reliability, and scalability for large models remain a challenge. Additionally, for some CIM designs, the activation movement still requires considerable time and energy. This paper explores the combination of algorithmic optimizations for ternary weight neural networks and associative processors (APs) implemented using racetrack memory (RTM). We propose a novel compilation flow to optimize convolutions on APs by reducing their arithmetic intensity. By leveraging the benefits of RTM-based APs, this approach substantially reduces data transfers within the memory while addressing accuracy, energy efficiency, and reliability concerns. Concretely, our solution improves the energy efficiency of ResNet-18 inference on ImageNet by 7.5× compared to crossbar in-memory accelerators while retaining software accuracy.
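
As a rough illustration of why ternary weights lower the arithmetic intensity of convolutions (this sketch is not the paper's compilation flow, and the helper names ternarize and ternary_dot are hypothetical), the minimal NumPy example below shows how multiplications by weights restricted to {-1, 0, +1} reduce to additions, subtractions, or skips, which is the kind of multiplication-free bulk work an associative processor can carry out in memory.

    import numpy as np

    def ternarize(weights: np.ndarray, threshold: float = 0.05) -> np.ndarray:
        """Quantize real-valued weights to {-1, 0, +1} using a fixed threshold."""
        q = np.zeros_like(weights, dtype=np.int8)
        q[weights > threshold] = 1
        q[weights < -threshold] = -1
        return q

    def ternary_dot(activations: np.ndarray, tweights: np.ndarray) -> np.ndarray:
        """Dot product against one ternary filter using only additions and subtractions."""
        pos = activations[:, tweights == 1].sum(axis=1)   # sum activations where the weight is +1
        neg = activations[:, tweights == -1].sum(axis=1)  # sum activations where the weight is -1
        return pos - neg                                   # weights equal to 0 are skipped entirely

    x = np.random.randn(4, 64).astype(np.float32)  # four activation vectors (illustrative data)
    w = ternarize(np.random.randn(64) * 0.1)       # one ternary weight vector
    y = ternary_dot(x, w)                          # equivalent to x @ w, but without any multiplies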

Details

Original language: English
Title of host publication: 2024 Design, Automation and Test in Europe Conference and Exhibition, DATE 2024 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (electronic): 9798350348590
Publication status: Published - 2024
Peer-reviewed: Yes

Publication series

Series: Proceedings - Design, Automation and Test in Europe, DATE
ISSN: 1530-1591

Conference

Title: 2024 Design, Automation and Test in Europe Conference and Exhibition, DATE 2024
Duration: 25 - 27 March 2024
City: Valencia
Country: Spain

External IDs

ORCID /0000-0002-5007-445X/work/173985262

Keywords

  • Associative memory, compiler optimizations, neural network, racetrack memories