ReLAccS: A Multilevel Approach to Accelerator Design for Reinforcement Learning on FPGA-Based Systems.

Akhil Raj Baranwal; Salim Ullah; Siva Satyendra Sahoo; Akash Kumar

doi:10.1109/TCAD.2020.3028350

Research output: Contribution to journal › Research article › Contributed › peer-review

Contributors

Akhil Raj Baranwal (Author)
Salim Ullah, Chair of Processor Design (cfaed) (Author)
Siva Satyendra Sahoo, Chair of Processor Design (cfaed) (Author)
Akash Kumar, Chair of Processor Design (cfaed) (Author)

Abstract

Reinforcement learning (RL), specifically Q-learning, with human-like abilities to learn from experience without any a priori data, is being increasingly used in embedded systems in the field of control and navigation. However, finding the optimal policy in this approach can be highly compute-intensive, and a software-only implementation may not satisfy the application's timing constraints. To this end, we propose optimization methods at multiple levels of accelerator design for RL. Specifically, at the architecture level, we exploit the instruction-level parallelism and the spatial parallelism in FPGAs to improve the throughput over state-of-the-art designs by up to 34%. Further, we propose lookup table-level optimizations to reduce the resource utilization and power dissipation of the accelerator. Finally, we propose an algorithm-level approximation that can be used to accelerate Q-learning problems with more states and to reduce the peak power dissipation. We report up to a 10× reduction in power dissipation with marginal degradation in quality of results.
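
For readers unfamiliar with the Q-learning that the abstract refers to, the following is a minimal software sketch of the standard tabular update rule, which is the kind of per-step computation an accelerator would need to speed up. It is not the authors' design: the function name q_update, the list-of-lists table layout, and the default values of alpha (learning rate) and gamma (discount factor) are illustrative assumptions.

```python
def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update in place and return the new value.

    q is a list of rows, one per state; each row holds one value per action.
    All names and default parameters here are hypothetical, chosen for
    illustration only.
    """
    # Best value reachable from the next state (the max over its actions).
    best_next = max(q[next_state])
    # Temporal-difference target: immediate reward plus discounted future value.
    td_target = reward + gamma * best_next
    # Move the current estimate a fraction alpha toward the target.
    q[state][action] += alpha * (td_target - q[state][action])
    return q[state][action]


# Usage: two states, two actions, all estimates initially zero.
q_table = [[0.0, 0.0], [0.0, 0.0]]
new_value = q_update(q_table, state=0, action=1, reward=1.0, next_state=1,
                     alpha=0.5, gamma=0.9)
```

Each update reads one row of the table, takes a maximum, and performs a multiply-accumulate. Repeating this step over many states and episodes is what can make a software-only implementation compute-intensive.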

Details

Original language	English
Article number	9211770
Pages (from-to)	1754-1767
Number of pages	14
Journal	IEEE transactions on computer-aided design of integrated circuits and systems : CAD
Volume	40
Issue number	9
Publication status	Published - Sept 2021
Peer-reviewed	Yes

External IDs

Scopus	85109209852

Keywords

Research priority areas of TU Dresden

Information Technology and Microelectronics

Keywords

Cross-layer system design, embedded systems, field-programmable gate array (FPGA), high-level synthesis, reinforcement learning (RL)

Library keywords

620 Engineering and allied operations

Research Portal of the TU Dresden
