Learning Accurate Long-term Dynamics for Model-based Reinforcement Learning

Publication: Contribution to book/conference proceedings/anthology/report › Contribution to conference proceedings › Contributed › Peer-reviewed

Contributors

  • Nathan O. Lambert (Author)
  • Albert Wilcox (Author)
  • Howard Zhang (Author)
  • Kristofer S. J. Pister (Author)
  • Roberto Calandra, Meta (Author)

Abstract

Accurately predicting the dynamics of robotic systems is crucial for model-based control and reinforcement learning. The most common way to estimate dynamics is by fitting a one-step-ahead prediction model and using it to recursively propagate the predicted state distribution over long horizons. Unfortunately, this approach is known to compound even small prediction errors, making long-term predictions inaccurate. In this paper, we propose a new parametrization for supervised learning on state-action data that stably predicts over longer horizons, which we call a trajectory-based model. This trajectory-based model takes an initial state, a future time index, and control parameters as inputs, and directly predicts the state at the future time index. Experimental results in simulated and real-world robotic tasks show that trajectory-based models yield significantly more accurate long-term predictions, improved sample efficiency, and the ability to predict task reward. With these improved prediction properties, we conclude with a demonstration of methods for using the trajectory-based model for control.
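To make the parametrization described in the abstract concrete, below is a minimal PyTorch sketch of a trajectory-based dynamics model: a network that maps (initial state, future time index, control parameters) directly to the predicted state at that time, rather than composing a one-step model recursively. The class name, layer sizes, and dimensions are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class TrajectoryBasedModel(nn.Module):
    """Sketch of a trajectory-based dynamics model: predicts the state at
    a future time index in a single forward pass, avoiding the compounding
    error of recursive one-step rollouts. Architecture is assumed."""

    def __init__(self, state_dim: int, control_dim: int, hidden_dim: int = 256):
        super().__init__()
        # Input: initial state + scalar time index + control parameters.
        self.net = nn.Sequential(
            nn.Linear(state_dim + 1 + control_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, state_dim),
        )

    def forward(self, s0: torch.Tensor, t: torch.Tensor,
                theta: torch.Tensor) -> torch.Tensor:
        # s0: (batch, state_dim), t: (batch, 1), theta: (batch, control_dim)
        return self.net(torch.cat([s0, t, theta], dim=-1))


# Usage: predict the state 50 steps ahead in one forward pass, with no
# intermediate one-step predictions to compound errors through.
model = TrajectoryBasedModel(state_dim=4, control_dim=3)
s0 = torch.randn(32, 4)            # batch of initial states
t = torch.full((32, 1), 50.0)      # future time index
theta = torch.randn(32, 3)         # controller parameters
s_t = model(s0, t, theta)          # predicted state at time t
```

Because the time index is an explicit input, a single batched forward pass can evaluate an entire predicted trajectory, and training pairs can be drawn from any (start state, offset) combination within logged trajectories, which is one way the improved sample efficiency noted in the abstract can arise.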

Details

Original language: English
Title: 2021 60th IEEE Conference on Decision and Control (CDC), Austin, TX, USA
Pages: 2880-2887
Number of pages: 8
ISBN (electronic): 9781665436595
Publication status: Published - 2021
Peer-review status: Yes
Published externally: Yes

External IDs

Scopus: 85126061279
ORCID: /0000-0001-9430-8433/work/146646296