dCSR: A Memory-Efficient Sparse Matrix Representation for Parallel Neural Network Inference.

Elias Trommer; Bernd Waschneck; Akash Kumar

doi:10.1109/ICCAD51958.2021.9643506

Publication: Contribution to book/conference proceedings/anthology/report › Contribution to conference proceedings › Contributed › Peer-reviewed

Contributors

Elias Trommer, Chair of Processor Design (cfaed) (Author)
Bernd Waschneck (Author)
Akash Kumar, Chair of Processor Design (cfaed) (Author)

Abstract

Reducing the memory footprint of neural networks is a crucial prerequisite for deploying them in small and low-cost embedded devices. Network parameters can often be reduced significantly through pruning. We discuss how to best represent the indexing overhead of sparse networks for the coming generation of Single Instruction, Multiple Data (SIMD)-capable microcontrollers. From this, we develop Delta-Compressed Storage Row (dCSR), a storage format for sparse matrices that allows for both low-overhead storage and fast inference on embedded systems with wide SIMD units. We demonstrate our method on an ARM Cortex-M55 MCU prototype with M-Profile Vector Extension (MVE). A comparison of memory consumption and throughput shows that our method achieves competitive compression ratios and increases throughput over dense methods by up to 2.9x for sparse matrix-vector multiplication (SpMV)-based kernels and 1.06x for sparse matrix-matrix multiplication (SpMM). This is accomplished by generating index information directly in the SIMD unit, which increases effective memory bandwidth.
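
The abstract does not spell out the dCSR encoding, so the following is only a minimal sketch of the general idea behind delta-compressed row storage, not the authors' format. It assumes a CSR-like layout in which each nonzero stores the column distance to the previous nonzero in its row (the first delta in a row is the absolute column). The function names `to_delta_csr` and `spmv_delta_csr` are hypothetical, and the code is scalar Python; it does not model the SIMD index generation described above.

```python
from typing import List, Sequence, Tuple


def to_delta_csr(dense: Sequence[Sequence[float]]) -> Tuple[List[float], List[int], List[int]]:
    """Encode a dense matrix as (values, deltas, row_ptr).

    Assumption for illustration: each delta is the column offset from the
    previous nonzero in the same row; the first delta is the column itself.
    """
    values: List[float] = []
    deltas: List[int] = []
    row_ptr: List[int] = [0]
    for row in dense:
        prev_col = 0
        for col, v in enumerate(row):
            if v != 0:
                values.append(v)
                deltas.append(col - prev_col)
                prev_col = col
        row_ptr.append(len(values))
    return values, deltas, row_ptr


def spmv_delta_csr(values: Sequence[float], deltas: Sequence[int],
                   row_ptr: Sequence[int], x: Sequence[float]) -> List[float]:
    """Compute y = A @ x, rebuilding column indices by a running sum of deltas."""
    y: List[float] = []
    for r in range(len(row_ptr) - 1):
        col = 0
        acc = 0.0
        for k in range(row_ptr[r], row_ptr[r + 1]):
            col += deltas[k]  # recover the absolute column index
            acc += values[k] * x[col]
        y.append(acc)
    return y


dense = [
    [0, 2, 0, 0, 3],
    [0, 0, 0, 0, 0],
    [1, 0, 0, 4, 0],
]
values, deltas, row_ptr = to_delta_csr(dense)
y = spmv_delta_csr(values, deltas, row_ptr, [1, 2, 3, 4, 5])
```

Because deltas in a pruned layer tend to be small, they can usually be stored in fewer bits than absolute column indices, which is where the storage saving in such a scheme would come from. The bit widths and handling of oversized deltas are left out of this sketch.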

Details

Original language	English
Title	2021 40th IEEE/ACM International Conference on Computer-Aided Design, ICCAD 2021 - Proceedings
Pages	1-9
Number of pages	9
Publication status	Published - 2021
Peer-reviewed	Yes

External IDs

Scopus	85124129859

Keywords

Research priority areas of TU Dresden

Information Technology and Microelectronics

ASJC Scopus subject areas

Software
Computer Science Applications
Computer Graphics and Computer-Aided Design

Keywords

Compression, Embedded systems, Pruning, SIMD, Sparse neural networks
