On-Policy Deep Reinforcement Learning Assisted Koopman Bilinear Model Predictive Control for Unknown Dynamical Systems

Ketong Zheng; Peng Huang; Jonathan Casas; Gerhard Fettweis

doi:10.23919/ICCAS66577.2025.11301354

Research output: Contribution to book/Conference proceedings/Anthology/Report › Conference contribution › Contributed › peer-review

Contributors

Ketong Zheng, Vodafone Chair of Mobile Communications Systems (Author)
Peng Huang, Barkhausen Institut (Author)
Jonathan Casas, Vodafone Chair of Mobile Communications Systems (Author)
Gerhard Fettweis, Clusters of Excellence CeTI: Centre for Tactile Internet, Vodafone Chair of Mobile Communications Systems, Barkhausen Institut (Author)

Abstract

Data-driven Koopman operator approximation has recently gained interest for its ability to embed nonlinear systems into a lifted linear state space using only measurements. When control inputs are included, however, the lifted dynamics take a bilinear form, which complicates controller synthesis for methods such as Model Predictive Control (MPC). This paper proposes an on-policy actor-critic Deep Reinforcement Learning (DRL) framework that simultaneously learns the Koopman bilinear dynamics and an MPC neural cost map. Instead of directly generating control actions, the actor network takes the Koopman-lifted states and produces MPC weight matrices for each prediction step. These state-dependent weight matrices serve as high-level guidance for the control objective, allowing the low-level MPC to run with a very short prediction horizon while maintaining stability and enforcing safety constraints. Simulations carried out with the OpenAI Gym library demonstrate that, without requiring explicit knowledge of the dynamics, the proposed Actor-Critic Koopman MPC (ACKMPC) achieves control accuracy and disturbance robustness on par with a model-based ACMPC, and outperforms a pure DRL policy learned with baseline Proximal Policy Optimization (PPO). It also exceeds standard Koopman MPC (KMPC) in both robustness and computational efficiency.
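
The two ingredients named in the abstract, a bilinear lifted model and per-step MPC weight matrices that depend on the lifted state, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the lifting function `lift`, the matrices `A`, `B`, `N`, and the stand-in `actor_weights` (which replaces the actor network with a fixed formula) are all hypothetical, and the commonly used bilinear form z_next = A z + B u + sum_i u_i N_i z is assumed rather than taken from the paper.

```python
import numpy as np

def lift(x):
    """Hypothetical lifting map: the state followed by its squares."""
    x = np.asarray(x, dtype=float)
    return np.concatenate([x, x ** 2])

def bilinear_step(z, u, A, B, N):
    """One step of an assumed bilinear model: A z + B u + sum_i u_i N_i z."""
    z_next = A @ z + B @ u
    for i, u_i in enumerate(u):
        z_next = z_next + u_i * (N[i] @ z)
    return z_next

def rollout(z0, U, A, B, N):
    """Predict lifted states over the horizon for an input sequence U."""
    Z = [np.asarray(z0, dtype=float)]
    for u in U:
        Z.append(bilinear_step(Z[-1], u, A, B, N))
    return np.stack(Z)  # shape (horizon + 1, n_z)

def actor_weights(z, horizon, n_u):
    """Stand-in for the actor network (assumption): maps the lifted state
    to positive-definite diagonal weights Q_k, R_k for each step."""
    q = 1.0 + np.tanh(z) ** 2  # entries in [1, 2), so each Q_k is positive definite
    Q = [np.diag(q) for _ in range(horizon)]
    R = [0.1 * np.eye(n_u) for _ in range(horizon)]
    return Q, R

def mpc_cost(Z, U, Q, R, z_ref):
    """Quadratic MPC objective with step-dependent weight matrices."""
    cost = 0.0
    for k in range(len(U)):
        e = Z[k + 1] - z_ref
        cost += e @ Q[k] @ e + U[k] @ R[k] @ U[k]
    return float(cost)
```

In a full controller, an optimizer would minimize `mpc_cost` over `U` subject to constraints at every sampling instant, with `Q` and `R` refreshed from the current lifted state; that optimization step is omitted here.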

Details

Original language	English
Title of host publication	2025 25th International Conference on Control, Automation and Systems, ICCAS 2025
Publisher	IEEE Computer Society
Pages	432-437
Number of pages	6
ISBN (electronic)	978-8-9932-1539-7
ISBN (print)	979-8-3503-8070-5
Publication status	Published - Nov 2025
Peer-reviewed	Yes

Publication series

Series	International Conference on Control, Automation and Systems (ICCAS)
ISSN	1598-7833

Conference

Title	25th International Conference on Control, Automation and Systems
Abbreviated title	ICCAS 2025
Conference number	25
Duration	4 - 7 November 2025
Website	https://2025.iccas.org/
Location	Songdo ConvensiA
City	Incheon
Country	Korea, Republic of

Keywords

Data-driven control, Deep reinforcement learning, Koopman operator, Model predictive control
