Data Might be Enough: Bridge Real-World Traffic Signal Control Using Offline Reinforcement Learning
Applying reinforcement learning (RL) to traffic signal control (TSC) has emerged as a promising approach. However, most RL-based methods focus solely on optimization within simulators and give little thought to real-world deployment. Online RL-based methods require interaction with the environment, which is severely restricted in real-world traffic systems, and acquiring an offline dataset for offline RL is likewise challenging in practice. Moreover, most real-world intersections prefer a cyclical phase structure. To address these challenges, we propose: (1) a cyclical offline dataset (COD), designed around common real-world scenarios so that it is easy to collect; (2) an offline RL model called DataLight, capable of learning satisfactory control strategies from the COD; and (3) a method called Arbitrary To Cyclical (ATC), which can transform most RL-based methods into cyclical signal control. Extensive experiments on simulators with real-world datasets demonstrate that: (1) DataLight outperforms most existing methods and achieves results comparable to the best-performing method; (2) introducing ATC into some recent RL-based methods yields satisfactory performance; and (3) COD is reliable, with DataLight remaining robust even with a small amount of data. These results suggest that a cyclical offline dataset might be enough for offline RL for TSC. Our proposed methods make significant contributions to the TSC field and bridge the gap between simulation experiments and real-world applications. Our code is released on GitHub.
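To make the notion of cyclical signal control concrete, the sketch below illustrates the general idea of restricting an agent's action space to a fixed phase cycle: at each decision point the controller may only extend the current phase or advance to the next phase in the cycle. This is a minimal, assumed illustration of the concept, not the paper's ATC implementation; all names and the decision interface are hypothetical.

```python
# Minimal sketch of cyclical phase control (illustrative only, not ATC itself):
# the agent no longer picks an arbitrary phase; it only decides whether to
# extend the current phase or advance to the next phase in a fixed cycle.

from typing import Callable, Sequence


def cyclical_step(
    current_phase: int,
    phases: Sequence[int],
    keep_or_advance: Callable[[int], bool],
) -> int:
    """Return the next phase under a cyclical phase structure.

    keep_or_advance(current_phase) -> True to extend the current phase,
    False to move on to the next phase in the cycle (wrapping around).
    """
    if keep_or_advance(current_phase):
        return current_phase
    return phases[(phases.index(current_phase) + 1) % len(phases)]


# Example: a four-phase cycle where the (hypothetical) policy always advances.
phases = [0, 1, 2, 3]
phase = 3
phase = cyclical_step(phase, phases, keep_or_advance=lambda p: False)
print(phase)  # 0 -- the cycle wraps back to the first phase
```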