DynLight: Realize dynamic phase duration with multi-level traffic signal control
Adopting reinforcement learning (RL) for traffic signal control is increasingly popular. Most RL methods use a fixed action interval (denoted as t_duration) and actuate or maintain a phase every t_duration, which makes the phase duration less dynamic and flexible. In addition, the actuated phase can be arbitrary, which hinders real-world deployment, where a fixed cyclical phase structure is required. To address these challenges, we propose a multi-level traffic signal control framework, DynLight, which uses an optimization method, Max-QueueLength (M-QL), to determine the phase and a deep Q-network to determine the corresponding duration. Based on DynLight, we further propose DynLight-C, which adopts a well-trained deep Q-network of DynLight and replaces M-QL with a fixed cyclical control policy that actuates a set of phases in a fixed order to realize a cyclical phase structure. Comprehensive experiments on multiple real-world datasets demonstrate that DynLight achieves new state-of-the-art performance. Furthermore, the deep Q-network of DynLight learns to determine the phase duration well, and DynLight-C demonstrates high performance for deployment.
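The two-level decision described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes M-QL scores a phase by the total queue length on the lanes it serves, and it stands in for the deep Q-network with a plain list of Q-values. The names `DURATIONS`, `max_queue_length_phase`, `cyclical_phase`, and `choose_duration`, as well as the candidate durations, are hypothetical.

```python
from typing import Dict, List, Sequence

# Hypothetical candidate phase durations in seconds (the abstract does not list them).
DURATIONS = [10, 15, 20, 25, 30]


def max_queue_length_phase(queues: Dict[str, int],
                           phase_lanes: Dict[str, List[str]]) -> str:
    """Upper level (DynLight): pick the phase whose lanes hold the most queued vehicles.

    Assumption: a phase is scored by the sum of queue lengths over its lanes.
    """
    return max(phase_lanes, key=lambda p: sum(queues[lane] for lane in phase_lanes[p]))


def cyclical_phase(current: str, order: Sequence[str]) -> str:
    """Upper level (DynLight-C): step to the next phase in a fixed cyclic order."""
    i = order.index(current)
    return order[(i + 1) % len(order)]


def choose_duration(q_values: Sequence[float]) -> int:
    """Lower level: greedy choice of duration from Q-values.

    In the paper these values come from a deep Q-network; here they are
    passed in directly so the sketch stays self-contained.
    """
    best = max(range(len(DURATIONS)), key=lambda i: q_values[i])
    return DURATIONS[best]


if __name__ == "__main__":
    queues = {"N": 3, "S": 4, "E": 1, "W": 2}
    phase_lanes = {"NS": ["N", "S"], "EW": ["E", "W"]}
    phase = max_queue_length_phase(queues, phase_lanes)
    duration = choose_duration([0.1, 0.9, 0.3, 0.2, 0.0])
    print(phase, duration)
```

The only difference between the two variants in this sketch is the upper-level rule: swapping `max_queue_length_phase` for `cyclical_phase` leaves the duration selection unchanged, which mirrors how DynLight-C reuses the trained Q-network.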