Multi-Precision Deep Neural Network Acceleration on FPGAs
Research output: Contribution to book/Conference proceedings/Anthology/Report › Conference contribution › Contributed › peer-review
Abstract
Quantization is a promising approach to reducing the computational load of neural networks. The minimum bit-width that preserves the original accuracy varies significantly across different neural networks, and even across different layers of a single network. Most existing designs over-provision neural network accelerators with enough bit-width to preserve the required accuracy across a wide range of neural networks. In this paper, we present mpDNN, a multi-precision multiplier with dynamically adjustable bit-width for deep neural network acceleration. The design supports splitting an arithmetic operator at run time into multiple independent operators of smaller bit-width, effectively increasing throughput when lower precision suffices. The proposed architecture targets FPGAs: both the multipliers and the bit-width adjustment mechanism are optimized for the LUT-based structure of FPGAs. Experimental results show that by enabling run-time precision adjustment, mpDNN offers a 3-15x improvement in throughput.
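To make the run-time splitting idea concrete, the sketch below shows one common way to obtain two independent low-precision products from a single wider multiplier: two 4-bit operands are packed with guard bits so their partial products do not overlap, and the individual results are recovered by slicing the wide product. This is an illustrative assumption, not the paper's LUT-based mpDNN circuit, and the operand values and bit positions are chosen only for the example.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical illustration of operator splitting (not the mpDNN design):
 * two independent unsigned 4-bit x 4-bit products computed with a single
 * wider multiplication by packing the operands with guard bits so that the
 * partial products do not overlap. */
int main(void) {
    uint8_t a0 = 11, a1 = 7;   /* two 4-bit activations */
    uint8_t w  = 13;           /* shared 4-bit weight   */

    /* Pack a0 and a1 eight bits apart: each 4x4 product fits in 8 bits. */
    uint32_t packed = (uint32_t)a0 | ((uint32_t)a1 << 8);

    uint32_t prod = packed * w;          /* one wide multiplication     */
    uint32_t p0 = prod & 0xFF;           /* lower slice  -> a0 * w      */
    uint32_t p1 = (prod >> 8) & 0xFF;    /* upper slice  -> a1 * w      */

    printf("p0=%u (expect %u), p1=%u (expect %u)\n",
           p0, (unsigned)(a0 * w), p1, (unsigned)(a1 * w));
    return 0;
}
```

The packing trick illustrates why lower precision can translate directly into higher throughput: one physical multiplier of fixed width completes several independent low-bit-width products at once, which is the general effect the run-time splitting in mpDNN exploits.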
Details
| Original language | English |
|---|---|
| Title of host publication | 2022 27th Asia and South Pacific Design Automation Conference (ASP-DAC) |
| Publisher | Institute of Electrical and Electronics Engineers (IEEE) |
| Pages | 454-459 |
| Number of pages | 6 |
| ISBN (electronic) | 9781665421355 |
| Publication status | Published - 2022 |
| Peer-reviewed | Yes |
Publication series
| Series | Asia and South Pacific Design Automation Conference (ASP-DAC) |
|---|---|
| Volume | 2022-January |
Conference
| Title | 27th Asia and South Pacific Design Automation Conference |
|---|---|
| Abbreviated title | ASP-DAC 2022 |
| Conference number | 27 |
| Duration | 17 - 20 January 2022 |
| Location | Online |
| City | Taipei |
| Country | Taiwan, Province of China |