Learning Accurate Long-term Dynamics for Model-based Reinforcement Learning.
Research output: Contribution to book/Conference proceedings/Anthology/Report › Conference contribution › Contributed › peer-review
Contributors
Abstract
Accurately predicting the dynamics of robotic systems is crucial for model-based control and reinforcement learning. The most common way to estimate dynamics is to fit a one-step-ahead prediction model and use it to recursively propagate the predicted state distribution over long horizons. Unfortunately, this approach is known to compound even small prediction errors, making long-term predictions inaccurate. In this paper, we propose a new parametrization of supervised learning on state-action data that predicts stably at longer horizons, which we call a trajectory-based model. This trajectory-based model takes an initial state, a future time index, and control parameters as inputs, and directly predicts the state at the future time index. Experimental results in simulated and real-world robotic tasks show that trajectory-based models yield significantly more accurate long-term predictions, improved sample efficiency, and the ability to predict task reward. With these improved prediction properties, we conclude with a demonstration of methods for using the trajectory-based model for control.
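The contrast drawn in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The scalar linear system, the slightly biased one-step model, and the names `rollout_one_step` and `make_trajectory_dataset` are all hypothetical and chosen only for illustration. The first function shows how recursive one-step prediction lets a small model error grow with the horizon. The second shows one possible way to arrange logged data into (initial state, time index, control parameters) inputs with the state at that time index as the target.

```python
import numpy as np


def rollout_one_step(step_fn, s0, actions):
    """Recursively apply a one-step model: s_{t+1} = step_fn(s_t, a_t)."""
    states = [np.asarray(s0, dtype=float)]
    for a in actions:
        states.append(step_fn(states[-1], a))
    return np.stack(states)  # shape (T + 1, state_dim)


def make_trajectory_dataset(trajectories):
    """Arrange logged data as (s_0, t, theta) -> s_t regression pairs.

    `trajectories` is a list of (states, theta) tuples, where `states` has
    shape (T + 1, state_dim) and `theta` holds the control parameters used
    to generate that trajectory (an assumed data layout).
    """
    inputs, targets = [], []
    for states, theta in trajectories:
        s0 = states[0]
        for t in range(1, len(states)):
            inputs.append(np.concatenate([s0, [float(t)], theta]))
            targets.append(states[t])
    return np.asarray(inputs), np.asarray(targets)


# Toy ground-truth dynamics and a one-step model with a small bias
# (both coefficients are assumptions made for this illustration).
def true_step(s, a):
    return 0.99 * s + 0.1 * a


def model_step(s, a):
    return 0.995 * s + 0.1 * a


s0 = np.array([1.0])
actions = np.zeros((50, 1))

true_states = rollout_one_step(true_step, s0, actions)
pred_states = rollout_one_step(model_step, s0, actions)

# Absolute prediction error at each time index of the recursive rollout.
one_step_error = np.abs(pred_states - true_states)[:, 0]

# Supervised dataset for a trajectory-based model: any regressor could be
# fit on X -> Y and then queried directly at a chosen future time index.
theta = np.array([0.5, 2.0])  # hypothetical controller parameters
X, Y = make_trajectory_dataset([(true_states, theta)])

print("error after 1 step:  ", one_step_error[1])
print("error after 50 steps:", one_step_error[50])
print("dataset shapes:", X.shape, Y.shape)
```

In this toy setting the recursive rollout's error grows with the horizon even though the per-step error is small. A model trained on `X` and `Y` would be queried once per future time index instead of being iterated, which is the structural difference the abstract describes.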
Details
| Original language | English |
|---|---|
| Title of host publication | 2021 60th IEEE Conference on Decision and Control (CDC), Austin, TX, USA |
| Pages | 2880-2887 |
| Number of pages | 8 |
| ISBN (electronic) | 9781665436595 |
| Publication status | Published - 2021 |
| Peer-reviewed | Yes |
| Externally published | Yes |
External IDs
| Scopus | 85126061279 |
|---|---|
| ORCID | /0000-0001-9430-8433/work/146646296 |