NetPU-M: A Generic Reconfigurable Neural Network Accelerator Architecture for MLPs

Yuhao Liu; Shubham Rai; Salim Ullah; Akash Kumar

doi:10.1109/IPDPSW59300.2023.00026


Publication: Contribution to book/conference proceedings/anthology/report › Contribution to conference proceedings › Contributed › Peer-reviewed

Contributors

Yuhao Liu, Center for Advancing Electronics Dresden (cfaed), Chair of Processor Design (cfaed), Center for Scalable Data Analytics and Artificial Intelligence (ScaDS.AI Dresden) (Author)
Shubham Rai, Center for Advancing Electronics Dresden (cfaed), Chair of Processor Design (cfaed), Center for Scalable Data Analytics and Artificial Intelligence (ScaDS.AI Dresden) (Author)
Salim Ullah, Chair of Processor Design (cfaed), Center for Advancing Electronics Dresden (cfaed), Center for Scalable Data Analytics and Artificial Intelligence (ScaDS.AI Dresden) (Author)
Akash Kumar, Chair of Processor Design (cfaed), Center for Advancing Electronics Dresden (cfaed), Center for Scalable Data Analytics and Artificial Intelligence (ScaDS.AI Dresden) (Author)

Abstract

Recent research has widely deployed Neural Networks (NNs) in various scenarios, such as IoT systems, wearable devices, and smart sensors. However, increasingly complex application scenarios drive rapid growth in network model size and demand higher-performance hardware platforms. Related works apply Heterogeneous Streaming Dataflow (HSD) and Processing Element Matrix (PEM) architectures as the most popular schemes for FPGA-based implementation of NN accelerators: 1) the HSD architecture implements a complete network for a given trained model on the FPGA, with simplified control but higher hardware consumption; 2) the PEM architecture implements reusable neuron structures controlled by runtime environments/drivers, providing generic acceleration support for different network models. Our work explores a new hybrid architecture based on HSD and PEM that implements a reusable partial network structure on the FPGA and achieves generic acceleration support for different network models with simplified runtime control. This architecture supports scalable and mixable quantized precisions and selectable activation functions, including ReLU, Sigmoid, Tanh, Sign, and Multi-Thresholds. Data stream transmission can reset the accelerator configuration at runtime, without changes to the hardware implementation, for different networks. Our design fully supports generic inference acceleration for different Multi-Layer Perceptron (MLP) models.
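
The idea of one reusable structure that is reconfigured per layer by the incoming data stream can be illustrated in software. The following Python sketch is not the authors' hardware design; it is a minimal behavioural model under stated assumptions. All names (`run_layer`, `run_mlp`, the configuration dictionary keys) are hypothetical, the Sign function is assumed to map 0 to +1, and Multi-Thresholds is assumed to output the number of thresholds the accumulator meets or exceeds, as is common for quantized networks.

```python
import math

def multi_threshold(x, thresholds):
    # Assumed definition: count how many thresholds the value meets or exceeds.
    return sum(1 for t in thresholds if x >= t)

# Selectable activation functions named in the abstract.
ACTIVATIONS = {
    "relu": lambda x: max(0, x),
    "sigmoid": lambda x: 1.0 / (1.0 + math.exp(-x)),
    "tanh": math.tanh,
    "sign": lambda x: 1 if x >= 0 else -1,  # assumption: sign(0) = +1
}

def run_layer(inputs, weights, biases, activation, thresholds=None):
    """Evaluate one fully connected layer on the shared (reusable) engine."""
    outputs = []
    for row, bias in zip(weights, biases):
        acc = sum(w * x for w, x in zip(row, inputs)) + bias
        if activation == "multi_threshold":
            outputs.append(multi_threshold(acc, thresholds))
        else:
            outputs.append(ACTIVATIONS[activation](acc))
    return outputs

def run_mlp(inputs, layer_configs):
    """Feed a stream of per-layer configurations through the same engine.

    Each configuration reconfigures the engine (weights, biases, activation)
    without changing the engine itself, mimicking runtime reconfiguration.
    """
    data = inputs
    for cfg in layer_configs:
        data = run_layer(data, cfg["weights"], cfg["biases"],
                         cfg["activation"], cfg.get("thresholds"))
    return data

# Hypothetical two-layer MLP with integer (quantized) parameters.
example_configs = [
    {"weights": [[1, -1], [2, 1]], "biases": [0, -1], "activation": "relu"},
    {"weights": [[1, 1]], "biases": [0], "activation": "multi_threshold",
     "thresholds": [1, 2, 4]},
]
result = run_mlp([1, 2], example_configs)
```

A different MLP is run by passing a different list of layer configurations to `run_mlp`; the engine code stays unchanged, which is the software analogue of reconfiguring the accelerator without re-implementing the hardware.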

Details

Original language	English
Title	2023 IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2023
Publisher	Institute of Electrical and Electronics Engineers (IEEE)
Pages	85-92
Number of pages	8
ISBN (electronic)	979-8-3503-1199-0
Publication status	Published - 2023
Peer-review status	Yes

Conference

Title	37th IEEE International Parallel and Distributed Processing Symposium
Abbreviated title	IPDPS 2023
Conference number	37
Duration	15 - 19 May 2023
Website	https://www.ipdps.org/ipdps2023/2023-.html
Location	Hilton St. Petersburg Bayfront
City	St. Petersburg
Country	USA/United States

Keywords

FPGA, Generic Hardware Accelerator, Neural Network, Quantization

TU Dresden Research Portal