ReLAccS: A Multilevel Approach to Accelerator Design for Reinforcement Learning on FPGA-Based Systems.

Akhil Raj Baranwal; Salim Ullah; Siva Satyendra Sahoo; Akash Kumar

doi:10.1109/TCAD.2020.3028350

Publication: Contribution to journal › Research article › Contributed › Peer-reviewed

Contributors

Akhil Raj Baranwal (Author)
Salim Ullah, Chair of Processor Design (cfaed) (Author)
Siva Satyendra Sahoo, Chair of Processor Design (cfaed) (Author)
Akash Kumar, Chair of Processor Design (cfaed) (Author)

Abstract

Reinforcement learning (RL), specifically Q-learning, with its human-like ability to learn from experience without any a priori data, is increasingly used in embedded systems for control and navigation. However, finding the optimal policy with this approach can be highly compute-intensive, and a software-only implementation may not satisfy the application's timing constraints. To this end, we propose optimization methods at multiple levels of accelerator design for RL. Specifically, at the architecture level, we exploit instruction-level parallelism and the spatial parallelism of FPGAs to improve throughput over state-of-the-art designs by up to 34%. Further, we propose lookup-table-level optimizations to reduce the resource utilization and power dissipation of the accelerator. Finally, we propose an algorithm-level approximation that can be used to accelerate Q-learning problems with more states and to reduce the peak power dissipation. We report up to a 10× reduction in power dissipation with marginal degradation in the quality of results.
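For orientation, the computation that such an accelerator targets can be illustrated with the textbook tabular Q-learning update. The sketch below is a plain software version under stated assumptions, not the accelerator design or the approximation proposed in the paper. The function names `q_update` and `train_chain`, the five-state chain environment, and the values of `alpha` and `gamma` are all hypothetical choices made for illustration.

```python
import random


def q_update(q, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """Apply one standard Q-learning update in place and return the new value.

    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    """
    best_next = max(q[next_state])
    q[state][action] += alpha * (reward + gamma * best_next - q[state][action])
    return q[state][action]


def train_chain(num_states=5, episodes=200, alpha=0.5, gamma=0.9, seed=0):
    """Learn a Q-table on a hypothetical 1-D chain with a uniformly random policy.

    The agent starts at state 0 and receives reward 1.0 on entering the last
    state, which ends the episode. Every other transition yields reward 0.0.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(num_states)]  # columns: 0 = left, 1 = right
    goal = num_states - 1
    for _ in range(episodes):
        state = 0
        while state != goal:
            action = rng.randrange(2)
            # Moving left from state 0 leaves the agent in state 0.
            next_state = max(state - 1, 0) if action == 0 else state + 1
            reward = 1.0 if next_state == goal else 0.0
            q_update(q, state, action, reward, next_state, alpha, gamma)
            state = next_state
    return q


if __name__ == "__main__":
    for state, row in enumerate(train_chain()):
        print(state, [round(value, 3) for value in row])
```

Each update reads one table entry, takes a maximum over the next state's actions, and writes one entry back. This read, maximum, and write pattern is the kind of inner loop that a hardware design could pipeline or replicate, though how the paper actually does so is not described in this record.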

Details

Original language	English
Article number	9211770
Pages (from - to)	1754-1767
Number of pages	14
Journal	IEEE transactions on computer-aided design of integrated circuits and systems : CAD
Volume	40
Issue number	9
Publication status	Published - Sept. 2021
Peer-review status	Yes

External IDs

Scopus	85109209852

Keywords

Research priority areas of TU Dresden

Information Technology and Microelectronics

ASJC Scopus subject areas

Keywords

Cross-layer system design, embedded systems, field-programmable gate array (FPGA), high-level synthesis, reinforcement learning (RL)

Library keywords

620 Engineering and allied operations

TU Dresden Research Portal