Constrained traffic signal control under competing public transport priority requests via safe reinforcement learning

Publication: Contribution to journal, Research article, Contributed, Peer-reviewed

Abstract

Agile signal switching under frequent arrivals of transit vehicles, combined with the need to respect multiple operational constraints, presents significant challenges for effective and safe signal control, as well as for the real-world implementation of reinforcement learning-based control algorithms. We introduce a safe reinforcement learning-based, fully adaptive multimodal traffic signal controller for a connected vehicle environment that incorporates a cost estimator during the learning process to account for multiple operational constraints. It uses a Duelling Double Deep Q-network and a multicriteria reward to minimise passenger delay and maximise throughput, subject to lower and upper bounds on green time and a maximum phase-skip constraint. Unsafe situations caused by inappropriate and frequent phase switches are specified as a safety constraint on the learning process. Following the concept of safe reinforcement learning, the Lagrangian method is used to transform the constrained learning problem into an unconstrained one, and the associated Lagrange multiplier is updated via a gradient-based mechanism. The performance of the proposed algorithm is evaluated for an isolated intersection using simulations in SUMO under different traffic demands, fixed public transport schedules and random passenger occupancy levels. The results demonstrate that the proposed algorithm reduces queue length and public transport passenger delays compared to state-of-the-art model-based and model-free signal controllers. The integration of a cost estimator effectively handles both hard and soft constraints during learning. The proposed algorithm resolves public transport priority request conflicts, makes a trade-off between public transport and individual traffic, and ensures traffic safety.
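
The Lagrangian relaxation mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class name, the step size, the cost limit and the initial multiplier value are all hypothetical, and the single scalar cost stands in for whatever the paper's cost estimator produces. The sketch only shows the general pattern of a penalised reward (reward minus multiplier times cost) and a projected gradient step that raises the multiplier while the estimated cost exceeds its limit.

```python
from dataclasses import dataclass


@dataclass
class LagrangeMultiplier:
    """Dual variable for one constraint of the form E[cost] <= cost_limit.

    All default values are illustrative assumptions, not values from the paper.
    """
    value: float = 0.0           # hypothetical initial multiplier
    learning_rate: float = 0.05  # hypothetical step size
    cost_limit: float = 0.1      # hypothetical constraint threshold

    def update(self, estimated_cost: float) -> float:
        """Gradient step on the dual, projected back onto value >= 0."""
        step = self.learning_rate * (estimated_cost - self.cost_limit)
        self.value = max(0.0, self.value + step)
        return self.value


def penalised_reward(reward: float, cost: float, multiplier: float) -> float:
    """Unconstrained surrogate objective: reward minus weighted cost."""
    return reward - multiplier * cost


# The multiplier grows while the estimated cost is above the limit
# and shrinks (never below zero) once the cost falls under it.
lam = LagrangeMultiplier()
for estimated_cost in (0.5, 0.5, 0.0):
    lam.update(estimated_cost)
```

In a Q-learning setting, the penalised reward would replace the raw reward in the temporal-difference target, so a larger multiplier makes costly phase switches less attractive to the agent.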

Details

Original language: English
Article number: 127676
Number of pages: 19
Journal: Expert systems with applications : an international journal
Volume: 284
Publication status: Published - 23 Apr 2025
Peer-review status: Yes

External IDs

ORCID /0000-0001-6555-5558/work/182334861
ORCID /0000-0002-1623-8051/work/182336236
Scopus 105004263453

Keywords

  • Constrained Markov decision process, Dilemma zone, Safe reinforcement learning, Traffic signal control, Transit signal priority