RAPID: Approximate Pipelined Soft Multipliers and Dividers for High Throughput and Energy Efficiency

Zahra Ebrahimi; Muhammad Zaid; Mark Wijtvliet; Akash Kumar

doi:10.1109/TCAD.2022.3184928

Research output: Contribution to journal › Research article › Contributed › peer-review

Contributors

Zahra Ebrahimi, Chair of Processor Design (cfaed) (Author)
Muhammad Zaid, TUD Dresden University of Technology (Author)
Mark Wijtvliet, Chair of Processor Design (cfaed) (Author)
Akash Kumar, Chair of Processor Design (cfaed) (Author)

Abstract

The rapid updates in error-resilient applications, along with their quest for high throughput, have motivated the design of fast approximate functional units for Field-Programmable Gate Arrays (FPGAs). Studies have proposed various imprecise functional techniques, which nevertheless suffer from three shortcomings. First, most existing inexact multipliers and dividers are specialized for Application-Specific Integrated Circuit (ASIC) platforms. Because the underlying building blocks of FPGAs and ASICs differ architecturally, ASIC-customized designs have not yielded comparable improvements when directly synthesized and ported to FPGAs. Second, state-of-the-art (SoA) approximate units are mostly substituted into a single kernel of a multi-kernel application, and the end-to-end assessment covers only the Quality of Results (QoR), not the overall performance gain. Finally, existing imprecise components are not designed to support a pipelined approach, which could boost the operating frequency/throughput of, e.g., division-included applications. In this paper, we propose the first pipelined approximate multiplier and divider architectures customized for FPGAs. The proposed units efficiently utilize 6-input Look-up Tables (6-LUTs) and fast carry chains to implement Mitchell’s approximate algorithms. Our novel error-refinement scheme not only has negligible overhead over the baseline Mitchell’s approach, but also boosts its accuracy to 99.4% for multiplication and division of arbitrary size. Experimental results obtained with Xilinx Vivado demonstrate the efficiency of the proposed pipelined and non-pipelined multipliers and dividers over accurate counterparts. In particular, the 4-stage pipelined architecture of the 32-bit multiplier (divider) enables 3.3× (5.1×) higher throughput, 2.3× (6.8×) higher throughput/Watt, and 52% (31%) savings of LUTs over the 4-stage pipelined, accurate IP counterparts.
Moreover, the end-to-end evaluations of the non-pipelined units, deployed in three multi-kernel applications in the domains of bio-signal processing, image processing, and moving object tracking for Unmanned Air Vehicles (UAVs), indicate up to 35%, 33%, and 45% improvements in area, latency, and Area-Delay-Product (ADP), respectively, over accurate kernels, with negligible loss in QoR. To springboard future research in the reconfigurable and approximate computing communities, our implementations will be available and open-sourced at https://cfaed.tu-dresden.de/pd-downloads.
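
For readers unfamiliar with Mitchell’s approximate algorithms mentioned in the abstract, the following is a minimal software sketch of the classic (baseline) logarithmic multiplication and division. It is not the RAPID hardware architecture and does not include the paper's error-refinement scheme or pipelining; the function names (`mitchell_log2`, `mitchell_antilog`, `mitchell_mul`, `mitchell_div`) are hypothetical and chosen only for illustration. The idea is to write each operand as 2^k (1 + x) with 0 <= x < 1, approximate log2 by k + x, add (or subtract) the approximate logarithms, and convert back.

```python
from fractions import Fraction
from typing import Tuple


def mitchell_log2(n: int) -> Tuple[int, Fraction]:
    """Split n into (k, x) with n = 2**k * (1 + x) and 0 <= x < 1.

    Mitchell's approximation then takes log2(n) ~= k + x.
    """
    if n <= 0:
        raise ValueError("operand must be a positive integer")
    k = n.bit_length() - 1                 # position of the leading one
    x = Fraction(n - (1 << k), 1 << k)     # bits below the leading one
    return k, x


def mitchell_antilog(k: int, x: Fraction) -> Fraction:
    """Approximate 2**(k + x) by 2**k * (1 + x), for 0 <= x < 1."""
    return Fraction(2) ** k * (1 + x)


def mitchell_mul(a: int, b: int) -> Fraction:
    """Baseline Mitchell product: add the approximate logarithms."""
    ka, xa = mitchell_log2(a)
    kb, xb = mitchell_log2(b)
    k, x = ka + kb, xa + xb
    if x >= 1:                             # fraction overflow carries into k
        k, x = k + 1, x - 1
    return mitchell_antilog(k, x)


def mitchell_div(a: int, b: int) -> Fraction:
    """Baseline Mitchell quotient: subtract the approximate logarithms."""
    ka, xa = mitchell_log2(a)
    kb, xb = mitchell_log2(b)
    k, x = ka - kb, xa - xb
    if x < 0:                              # fraction underflow borrows from k
        k, x = k - 1, x + 1
    return mitchell_antilog(k, x)


if __name__ == "__main__":
    print(mitchell_mul(5, 6))   # 28 (exact product is 30)
    print(mitchell_div(4, 3))   # 3/2 (exact quotient is 4/3)
```

Exact rational arithmetic is used here only to make the approximation error easy to inspect; a hardware realization would instead use a leading-one detector, shifters, and an adder or subtractor. Powers of two are handled exactly, while operands such as 3 × 3 show the well-known worst-case underestimate of the unrefined algorithm (8 instead of 9).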

Details

Original language	English
Article number	3
Pages (from-to)	712-725
Number of pages	14
Journal	IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Volume	42
Issue number	3
Publication status	Published - 1 Mar 2023
Peer-reviewed	Yes

External IDs

Scopus	85133620975
Mendeley	d8b0aada-e300-3ff4-b109-b86847e9f395

Keywords

Research priority areas of TU Dresden

Information Technology and Microelectronics

Sustainable Development Goals

SDG 7 - Affordable and Clean Energy

ASJC Scopus subject areas

Keywords

Approximate Computing, Approximation algorithms, Bio-signal Processing, Compressors, Computer architecture, Divider, Energy-Efficiency, Field-Programmable Gate Arrays (FPGAs), High-Throughput, Mitchell’s Algorithm, Multiplier, Pipeline, Pipeline processing, Table lookup, Throughput, Unmanned Air Vehicles, unmanned aerial vehicles (UAVs)

Library keywords

620 Engineering and allied operations
