Rate GQN: A Deviations-Reduced Decision-Making Strategy for Connected and Automated Vehicles in Mixed Autonomy

Xin Gao; Xueyuan Li; Qi Liu; Zhaoyang Ma; Tian Luan; Fan Yang; Zirui Li

doi:10.1109/TITS.2023.3312951

Research output: Contribution to journal › Research article › Contributed › peer-review

Contributors

Xin Gao, Beijing Institute of Technology (Author)
Xueyuan Li, Beijing Institute of Technology (Author)
Qi Liu, Beijing Institute of Technology (Author)
Zhaoyang Ma, Beijing Jiaotong University (Author)
Tian Luan, Beijing Institute of Technology (Author)
Fan Yang, Beijing Institute of Technology (Author)
Zirui Li, Beijing Institute of Technology, TUD Dresden University of Technology (Author)

Chair of Traffic Process Automation

Abstract

Connected and automated vehicles (CAVs) have become one of the essential approaches to resolving problems such as traffic safety, road congestion, and energy consumption. However, due to the spatial-temporal interaction of the mixed traffic environment, the driving behaviors of traffic participants are continuously transmitted in time and space. This makes it difficult for the existing decision-making systems of CAVs to make accurate judgments and form effective strategies. In this study, a rate graph convolution Q-learning network (Rate GQN) is proposed to train a discrete strategy that can improve the comprehensive performance of CAVs in scenarios with spatial-temporal interaction. Firstly, the Rate algorithm is proposed to impose a ratio on the estimates of Q-values from the previous learning process, which improves the stability and performance of the algorithm by reducing the variance of the approximation error of the target value. Secondly, the traffic scenario is modeled as a graph structure, and graph convolutional networks are adopted to extract the feature information of the graph, helping the CAVs grasp the dynamic traffic interaction information quickly and accurately. Additionally, an internal dynamic multi-objective reward function is presented to improve the comprehensive performance of CAVs, including safety, efficiency, energy saving, and comfort. Finally, comparison and ablation experiments are conducted in a task-based traffic scenario (station stop and traffic light passing). The simulation results show that the Rate GQN method has a faster training speed, a more stable training process, and better overall performance than the deep Q-learning network (DQN) and the other algorithms in the comparison group.
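
As an illustration of the graph modeling step mentioned in the abstract, the following is a minimal sketch of a single graph convolution over a small vehicle graph. It is not taken from the paper: it assumes the standard symmetric-normalized propagation rule ReLU(D^-1/2 (A + I) D^-1/2 X W), and the three-vehicle scene, the node features (position and speed), and the function name `gcn_layer` are hypothetical choices made only for this example.

```python
import numpy as np


def gcn_layer(adjacency, features, weights):
    """One graph-convolution step: ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    # Add self-loops so each node keeps its own features.
    a_hat = adjacency + np.eye(adjacency.shape[0])
    # Symmetric normalization by node degree.
    degree = a_hat.sum(axis=1)
    d_inv_sqrt = np.diag(1.0 / np.sqrt(degree))
    a_norm = d_inv_sqrt @ a_hat @ d_inv_sqrt
    # Aggregate neighbor features, project them, apply ReLU.
    return np.maximum(a_norm @ features @ weights, 0.0)


# Hypothetical scene: vehicle 0 interacts with 1, and 1 with 2.
adjacency = np.array([[0.0, 1.0, 0.0],
                      [1.0, 0.0, 1.0],
                      [0.0, 1.0, 0.0]])

# Hypothetical node features: [longitudinal position (m), speed (m/s)].
features = np.array([[0.0, 10.0],
                     [15.0, 12.0],
                     [40.0, 8.0]])

# Randomly initialized weights stand in for learned parameters.
rng = np.random.default_rng(0)
weights = rng.normal(size=(2, 4))

embeddings = gcn_layer(adjacency, features, weights)
print(embeddings.shape)  # one 4-dimensional embedding per vehicle
```

In a Q-learning setting, embeddings of this kind would typically be passed to further layers that output one Q-value per discrete action; how Rate GQN does this in detail is not specified in the abstract.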

Details

Original language	English
Pages (from-to)	613-625
Number of pages	13
Journal	IEEE Transactions on Intelligent Transportation Systems
Volume	25
Issue number	1
Publication status	Published - 22 Sept 2023
Peer-reviewed	Yes

External IDs

Scopus	85173064758

Keywords

Sustainable Development Goals

SDG 7 - Affordable and Clean Energy