Long-term Joint Scheduling for Urban Traffic

10/27/2019 ∙ by Xianfeng Liang, et al. ∙ 0

Traffic congestion in modern cities has become a growing worry for residents. According to a Baidu traffic report, the commuting stress index has reached 1.973 in Beijing during rush hours, which results in longer trip times and increased vehicular queueing. Previous works have demonstrated that reasonable scheduling, e.g., rebalancing bike-sharing systems and optimizing bus transportation, can significantly improve traffic efficiency with little resource consumption. However, two disadvantages still restrict their performance: (1) they consider only a single round of scheduling over a short time and ignore the layout after the first reposition, and (2) they focus on a single mode of transport, leaving the multi-modal characteristics of urban public transportation largely under-exploited. In this paper, we propose an efficient and economical multi-modal traffic scheduling scheme named JLRLS, based on spatio-temporal prediction, which adopts reinforcement learning to obtain an optimal long-term and joint schedule. In JLRLS, we combine multiple modes of transportation and schedule them according to their own characteristics, which potentially helps the system reach optimal performance. Our implementation of an example in PaddlePaddle is available at https://github.com/bigdata-ustc/Long-term-Joint-Scheduling, with an explanatory video at https://youtu.be/t5M2wVPhTyk.






1. Introduction

The rapid expansion of urban traffic, combined with the slow growth of traffic resources, has led to serious and growing traffic congestion. According to a Baidu traffic report (baidu), the commuting index has risen to 1.973 during rush hours in Beijing. Traffic congestion poses a great threat to traffic safety and also brings losses to the urban economy. Fortunately, previous works have shown that reasonable traffic scheduling can improve traffic efficiency with little resource consumption. In (wang2017data), a data-driven optimization algorithm for bus systems is proposed, which reduces the average waiting time of citizens. A bike-sharing scheduling system (ghosh2016robust) uses an online and robust framework to minimize the loss of customers.

However, two limitations still restrict further efficiency gains. (1) Typical bike scheduling work, e.g., Liu et al. (liu2016rebalancing), proposes a hierarchical optimization model for rebalancing by exploring multi-source data, but it plans only once over a short horizon and does not analyze the situation after the first round of planning. A common example is shown in Figure 1 (a). There are three bike-sharing stations (, and ) without available bikes. We assume that 10 customers ride from to , and 15 customers ride from to during . From to , 10 customers ride from to . Only 10 bikes can be dispatched before . According to the greedy strategy, during , will be assigned 10 bikes. Once the dynamic flow is considered, however, the 10 bikes should be moved to A, where they can eventually serve 20 customers. Therefore, if a second round of planning is taken into account, the local greedy problem can be alleviated. (2) Current works (ghosh2016robust; wang2017data; liu2016rebalancing; li2018dynamic) each handle a single mode of transport and ignore the multi-modal characteristics of urban public transport. For example, in Figure 1, (b) shows the normal operation of the bus, and (c) shows that when the bus is unavailable, the system can automatically move shared bikes to replace it.
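The gap between greedy and long-term planning in the bike example can be checked with a small simulation. This is only a sketch under stated assumptions: the station names B and C, the direction of each flow, and the rule that a bike becomes available at its destination in the next period are all hypothetical. Only station A and the totals of 10 versus 20 served customers come from the text.

```python
from typing import Dict, List, Tuple

Flow = Tuple[str, str, int]  # (origin, destination, number of riders)

# Hypothetical demand per period. Only "A" and the 10/15/10 rider counts
# appear in the text; the origin/destination pairs are assumed.
DEMAND: List[List[Flow]] = [
    [("A", "B", 10), ("C", "A", 15)],  # first period
    [("B", "C", 10)],                  # second period
]


def served_customers(target: str, bikes: int, demand: List[List[Flow]]) -> int:
    """Count served riders when `bikes` are dispatched to `target` up front."""
    stock: Dict[str, int] = {"A": 0, "B": 0, "C": 0}
    stock[target] = bikes
    served = 0
    for period in demand:
        arrivals: Dict[str, int] = {s: 0 for s in stock}
        for origin, dest, riders in period:
            rides = min(riders, stock[origin])
            stock[origin] -= rides
            arrivals[dest] += rides
            served += rides
        # Bikes become available at their destination in the next period.
        for station in stock:
            stock[station] += arrivals[station]
    return served


# Greedy: send the bikes to the station with the largest first-period demand.
greedy_target = max(DEMAND[0], key=lambda flow: flow[2])[0]
greedy_served = served_customers(greedy_target, 10, DEMAND)

# Long-term: evaluate each station over both periods and keep the best one.
best_target = max("ABC", key=lambda s: served_customers(s, 10, DEMAND))
best_served = served_customers(best_target, 10, DEMAND)
```

Under these assumed flows, the greedy choice serves 10 customers while dispatching to A serves 20, because the bikes ridden away from A are reused at their destination in the second period.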

Figure 1. The Case of Traffic System

Joint scheduling over long periods is usually difficult. Human activities are social and uncertain, which may result in an extreme imbalance between traffic supply and demand. This makes traffic scheduling harder in several ways: (1) The prediction of future demand should be as accurate as possible, as it directly affects the subsequent optimization. (2) Traditional scheduling is complex and cannot be applied to large-scale problems. Furthermore, traditional algorithms use mixed-integer programming (MIP) to solve the optimization, which is only suited to obtaining a solution in special environments. (3) Joint scheduling depends on manually set rules, which may produce greedy and short-sighted strategies.

In this paper, we investigate multi-modal transportation carefully and find that different traffic modes can be scheduled complementarily to improve efficiency. Different modes of transport have different characteristics. For example, bike-sharing scheduling is flexible and easily affected by other transports, so it is suitable for scheduling together with others. Other scheduling, such as bus scheduling, is less affected by other transports and thus relatively fixed. To account for the interaction between different traffic systems, we design a global scheduling method that schedules both types of traffic at the same time, so that the flexible traffic can be dispatched in coordination with the fixed traffic to achieve global optimality. Specifically, for the most common transports, bus and bike, we propose a joint traffic scheduling framework named JLRLS based on reinforcement learning. JLRLS takes the bike flow into consideration, which helps to avoid local greed. It also incorporates the observation information of other traffic scheduling systems, so that the reinforcement learning model can learn the strategy of joint scheduling. Compared with traditional traffic scheduling methods, it has the following advantages:

  • We adopt reinforcement learning to learn the scheduling strategy, which is robust to the inaccuracy of demand prediction and adaptable in complex scheduling situations.

  • Compared with other scheduling schemes, it can take into account longer-term traffic demand, avoid local greed and achieve optimal scheduling over a long period of time.

  • JLRLS can realize joint scheduling among different traffic modes. When a certain traffic service is temporarily unavailable or inappropriate, more flexible traffic can be dispatched in time to meet the corresponding demand. The framework scales well and can also be applied to joint dispatch between buses on different routes. In the future, it can be extended to other kinds of joint scheduling.

2. Overview

In this section, we define some concepts and notations used in the paper and give an overview of the framework of our model JLRLS.

Notation Description
The number of stations in the cluster
The number of agents in system
The -th time segment in the future
The number of predicted time segments
The longest time the passengers are willing to wait
The -th station
The feature dimension of the station in other systems
The feature dimension of the current system
The observations for the stations in other systems
The environmental factors of the current system
The predicted flow network of bikes between stations in the -th time segment

The vector dimension after encoding

Table 1. Notations

2.1. Preliminary

Definition 1.

Agent: An agent is a bus in a bus system or a dispatching vehicle in a bike-sharing system.

Definition 2.

Cluster: Two types of cluster are defined for the two situations. For bus systems, a cluster is the set of bus stations on the same route. For bike-sharing systems, a cluster is a set of similar stations that are close to each other after the clustering described in Section 3.1.2.

Definition 3.

Demand: We define two types of demand. The first is the demand for riding bikes and for taking buses from the origin station to the terminal. The second is the demand for returning bikes and for taking buses from the terminal to the origin station.

Definition 4.

Time segment: A time segment is a period of time with fixed length, e.g., 15 minutes.

Definition 5.

Capacity: Capacity is the number of passengers a bus can carry in bus systems, and the number of bikes a dispatching vehicle can carry in bike-sharing systems.

Definition 6.

Episode: Episode is defined as a certain period in a day.

2.2. Framework

We propose a general framework, the Joint Long-term Reinforcement Learning Scheduling system (JLRLS). As shown in Fig. 2, our model includes demand forecasting for stations and joint dispatching based on reinforcement learning. In bike-sharing scheduling, we incorporate the information of the bus stations in the bus scheduling system, so that the reinforcement learning model can learn the strategy of joint traffic scheduling.

Figure 2. Framework of Joint Scheduling System

3. Method

Figure 3. The State of System

3.1. Forecast System

Because bus systems and bike-sharing systems differ substantially in their characteristics, we propose two prediction frameworks: a bus flow forecast system and a bike flow forecast system.

3.1.1. Bus Flow Forecast System

In the bus flow forecast system, there is a relatively stable hierarchical structure: the passenger flows at the individual bus stations sum to the total flow of the bus system. As passenger flow is regular and periodic within a day, the daily total passenger flow of the bus system is also stable. We find that bus passenger flow forecasting is similar to power system consumption forecasting. Inspired by this, we propose a bus flow prediction algorithm based on hierarchical time series.

Since the daily passenger flow time series of a bus station exhibits strong regularity, we reduce the computational complexity by using a linear model to learn the time series of the past period for each station and for the whole bus system, and to predict the passenger flow in the near future. Because the individual forecasts for each station do not guarantee that their sum is consistent with the forecast total flow of the bus system, hierarchical time series forecasting uses a summing matrix to transform the whole problem into a regression problem to be optimized.
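The role of the summing matrix can be illustrated with a small reconciliation sketch. It assumes ordinary least squares as the regression, which is one common choice for hierarchical time series; the text does not state which regression is used. The three-station hierarchy and the forecast values are made up for illustration.

```python
import numpy as np

# Hypothetical hierarchy: one total series on top of three station series.
# Each row of S maps the station-level flows to one series in the hierarchy,
# ordered as [total, station 1, station 2, station 3].
S = np.array([
    [1.0, 1.0, 1.0],  # total = sum of the three stations
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
])

# Independent base forecasts from the per-series linear models (made-up values).
# They are incoherent: 40 + 35 + 30 = 105, but the total forecast is 100.
y_hat = np.array([100.0, 40.0, 35.0, 30.0])

# Assumed OLS reconciliation: find station flows b minimising ||y_hat - S b||^2.
b, *_ = np.linalg.lstsq(S, y_hat, rcond=None)

# Coherent forecasts for every series in the hierarchy.
y_tilde = S @ b
```

After reconciliation, the first entry of `y_tilde` equals the sum of the remaining entries by construction, so the station forecasts and the system forecast agree.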

3.1.2. Bike Flow Forecast System

Modeling bike-sharing flow is a very complex problem, because the time series in this scenario do not exhibit strong regularity. Unlike a bus system, which has fixed routes and stable users, many users use shared bikes only occasionally. Therefore, the bus flow forecasting method cannot be applied directly. For the bike-sharing flow prediction system, our model needs to predict the future traffic flow between stations. Because there are many bike-sharing stations, we apply the method of (li2018dynamic) to group the stations and simplify the problem. The stations in each group are close to each other, and traffic among them is more frequent than with other stations. We consider only the bike movements between stations within each group.

Compared with traditional linear prediction algorithms, a deep learning model such as long short-term memory (LSTM) (hochreiter1997long) is more suitable for modeling the unstable and nonlinear time series of a bike-sharing station. Thus, we use an LSTM to model the bike departures of each station in the near future. Specifically, let $x = (x_1, \dots, x_T)$ denote the time sequence of a bike station. The LSTM model maps the input sequence to outputs via a sequence of hidden states by evaluating its gate equations recursively from $t = 1$ to $T$, where $x_t$, $h_t$ are the input and hidden vectors of the $t$-th time step, $i_t$, $f_t$, $c_t$, $o_t$ are the activation vectors of the input gate, forget gate, memory cell and output gate, $W_{\alpha\beta}$ is the weight matrix between vectors $\alpha$ and $\beta$ (e.g., $W_{xi}$ is the weight matrix from the input $x_t$ to the input gate $i_t$), $b_\alpha$ is the bias term of $\alpha$, $\sigma$ is the sigmoid function, and $\hat{y}$ is the prediction of the time series $x$.
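For reference, a widely used form of the LSTM recursion that matches the gate descriptions above is sketched below. The exact variant used in the model is not stated, so the absence of peephole connections and the linear read-out in the last line are assumptions. Here $x_t$ and $h_t$ are the input and hidden vectors, $i_t$, $f_t$, $c_t$, $o_t$ are the input gate, forget gate, memory cell and output gate, $W$ and $b$ are weights and biases, $\sigma$ is the sigmoid function, and $\odot$ is the element-wise product.

```latex
\begin{aligned}
i_t &= \sigma\left(W_{xi} x_t + W_{hi} h_{t-1} + b_i\right) \\
f_t &= \sigma\left(W_{xf} x_t + W_{hf} h_{t-1} + b_f\right) \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tanh\left(W_{xc} x_t + W_{hc} h_{t-1} + b_c\right) \\
o_t &= \sigma\left(W_{xo} x_t + W_{ho} h_{t-1} + b_o\right) \\
h_t &= o_t \odot \tanh\left(c_t\right) \\
\hat{y} &= W_{hy} h_T + b_y
\end{aligned}
```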

Considering the movement of bikes between stations, when a bike leaves a station, any other station in the same group may become its destination. To handle this, we use observed frequencies as probability estimates: the probability that a bike leaving one station goes to each of the other stations is computed from the trips in the past period. The number of bikes predicted to travel from an origin station to a destination station in a future period is then the product of the predicted total number of bikes leaving the origin and the estimated probability of traveling from that origin to that destination.
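The frequency-based split of predicted departures can be sketched as follows. The trip counts and departure predictions are hypothetical values chosen for illustration, and the function name `od_flow` is not from the original work.

```python
from typing import List


def od_flow(counts: List[List[float]], departures: List[float]) -> List[List[float]]:
    """Split predicted departures over destinations using historical frequencies.

    counts[i][j] is the number of past trips from station i to station j, and
    departures[i] is the predicted number of bikes leaving station i.
    """
    flow: List[List[float]] = []
    for row, leaving in zip(counts, departures):
        total = sum(row)
        if total == 0:
            # No history for this origin: no basis for a split.
            flow.append([0.0] * len(row))
        else:
            # Relative frequency is used as the probability estimate.
            flow.append([leaving * c / total for c in row])
    return flow


# Hypothetical group of three stations.
counts = [[0, 30, 10], [20, 0, 20], [5, 15, 0]]
departures = [8.0, 4.0, 2.0]  # e.g. produced by the departure forecast model
flow = od_flow(counts, departures)
```

Each row of `flow` sums to the predicted departures of its origin station, so the split redistributes the forecast without changing its total.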

3.2. Scheduling System

After forecasting the bus flow and bike flow, we use reinforcement learning to produce the scheduling strategy, adopting the Deep Deterministic Policy Gradient (DDPG) approach (lillicrap2015continuous). The model incorporates the bike flow information to avoid local greed. At the same time, it incorporates the observation information of other traffic scheduling systems, so that the reinforcement learning model can learn the strategy of joint traffic scheduling.
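As background, the generic DDPG updates can be summarized as below. This is the standard formulation of the algorithm and not a description of the specific networks used here; the symbols are assumptions introduced for this summary. $Q(s, a \mid \theta^{Q})$ is the critic, $\mu(s \mid \theta^{\mu})$ is the actor, primed quantities are target networks, $\gamma$ is the discount factor, $\tau$ is the soft-update rate, and $N$ is the minibatch size.

```latex
\begin{aligned}
y_i &= r_i + \gamma \, Q'\!\left(s_{i+1}, \mu'(s_{i+1} \mid \theta^{\mu'}) \mid \theta^{Q'}\right) \\
L(\theta^{Q}) &= \frac{1}{N} \sum_{i} \left(y_i - Q(s_i, a_i \mid \theta^{Q})\right)^2 \\
\nabla_{\theta^{\mu}} J &\approx \frac{1}{N} \sum_{i}
  \nabla_{a} Q(s, a \mid \theta^{Q})\big|_{s = s_i,\, a = \mu(s_i)} \,
  \nabla_{\theta^{\mu}} \mu(s \mid \theta^{\mu})\big|_{s = s_i} \\
\theta^{Q'} &\leftarrow \tau \theta^{Q} + (1 - \tau)\, \theta^{Q'}, \qquad
\theta^{\mu'} \leftarrow \tau \theta^{\mu} + (1 - \tau)\, \theta^{\mu'}
\end{aligned}
```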

3.2.1. The State of Scheduling System

To describe the state clearly, we summarize it in Figure 3. The state is divided into five categories: predicted demand, station information, agent states, scheduling information of other traffic systems, and the state of the system. The predicted demand consists of a matrix and three vectors, where represents the number of bikes from to in the -th time segment. represents the vector after encoding . respectively indicate the first and the second type of demand for each station in the -th time segment. Both and denote station information. In bus systems, they represent the time interval of the most recent bus going forward and backward at a station, respectively. In bike-sharing systems, is the number of available bikes, and is the number of available docks at the station. is a one-hot vector that indicates the station at which the current agent will be located. respectively stand for the occupied capacity and the remaining capacity of the current agent. represents the operation being performed. In bus systems, there are three types of operations, {-1, 0, 1}: -1 stands for driving from to , 0 for halting, and 1 for driving from to . In bike-sharing systems, is the number of bikes loaded or unloaded: > 0 for loading, < 0 for unloading, and = 0 for not moving. respectively stand for the corresponding states of other agents. can be the observation of the bus system by the bike-sharing system, or the observation of the bus system on different routes. represents environmental factors such as the weather, temperature, and distance.

3.2.2. Bus Scheduling System

A State. For bus scheduling system, we define the state as follows:

  • Observation for bus stations,

  • The state of the scheduled bus,

  • The state of other buses on the same route,

  • Observation for the system, ().

  • Observation for stations in other traffic systems, ().

An Action. A bus has three types of action:

(1) driving toward the terminal ; (2) driving toward the origin station ; (3) stopping at or .

A Reward. We set the reward mechanism as follows: (1) each time the bus travels from a to b, the reward is the reduction in waiting time, and the punishment is related to the driving time; (2) when the bus stops driving, there is neither reward nor punishment.

Stopping Condition. An episode stops when a passenger has waited for p time segments or when the episode is completed.
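The bus reward rule can be sketched as a small function. The linear combination and the coefficient `penalty_weight` are assumptions: the text only says that the punishment is related to the driving time. The action encoding follows the {-1, 0, 1} operations of the state description, with 0 meaning the bus stops.

```python
def bus_reward(
    action: int,
    reduced_waiting_time: float,
    driving_time: float,
    penalty_weight: float = 0.1,  # hypothetical trade-off coefficient
) -> float:
    """Reward for a single bus decision.

    action is -1 or 1 when the bus drives in either direction and 0 when it stops.
    """
    if action not in (-1, 0, 1):
        raise ValueError("action must be -1, 0 or 1")
    if action == 0:
        # Stopping earns neither reward nor punishment.
        return 0.0
    # Assumed linear form: waiting time saved minus a driving-time penalty.
    return reduced_waiting_time - penalty_weight * driving_time


stop_reward = bus_reward(0, reduced_waiting_time=50.0, driving_time=20.0)
drive_reward = bus_reward(1, reduced_waiting_time=50.0, driving_time=20.0, penalty_weight=0.5)
```

With these illustrative numbers, a trip that saves 50 units of waiting time at a cost of 20 units of driving time is worth 40 when the penalty weight is 0.5.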

3.2.3. Bike Scheduling System

A State. For a bike-sharing scheduling system, the state consists of the following five parts:

  • Observation for bike-sharing stations, ().

  • State of the current dispatch vehicle,

  • State of other vehicles for dispatch in the same cluster, .

  • Observation for the system, ().

  • Observation for bike-sharing stations in other traffic systems (), such as in the bus system.

Different from (li2018dynamic), we consider more detailed information on the flow of bikes between stations. We use to represent the predicted flow network of bikes between stations in the future. Using the matrix directly would make the state highly complex, so we encode the matrix into a vector , which preserves the bike flow between stations. This comprehensively describes the bike flow network while simplifying the state representation, which helps the policy converge and the agent explore. To enable the bike system to work with the bus system, we incorporate the observation information of bus scheduling systems in reinforcement learning.

An Action. An action is defined as . denotes the station at which the current dispatch vehicle will unload or load bikes, and denotes the number of bikes unloaded or loaded.

Reward. After an episode is completed, we set the reward mechanism as follows: the agent is rewarded with the number of services provided by bikes, and the punishment is related to the cost of scheduling and the number of bikes exceeding total capacity.

Stop Condition. An episode stops when it is completed.
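The episode-level bike reward can be sketched in the same way. The linear form, both weights, and the per-station definition of overflow are assumptions introduced for illustration; the text only names the three ingredients (services provided, scheduling cost, and bikes exceeding capacity).

```python
from typing import Sequence


def bike_reward(
    served_trips: int,
    scheduling_cost: float,
    bikes: Sequence[int],
    capacity: Sequence[int],
    cost_weight: float = 1.0,      # hypothetical weight on the scheduling cost
    overflow_weight: float = 1.0,  # hypothetical weight on bikes over capacity
) -> float:
    """Episode reward: trips served minus penalties for cost and overflow."""
    # Bikes beyond the dock capacity of each station count as overflow.
    overflow = sum(max(0, b - c) for b, c in zip(bikes, capacity))
    return served_trips - cost_weight * scheduling_cost - overflow_weight * overflow


example_reward = bike_reward(30, 5.0, bikes=[12, 8], capacity=[10, 10])
```

In this example, 30 served trips are reduced by a scheduling cost of 5 and by 2 bikes that exceed the capacity of the first station.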

Our model has the following advantages over (li2018dynamic):

  • The representation of the state contains more detailed flow information between stations in the future, which is more conducive to policy convergence.

  • We adopt a Deep Deterministic Policy Gradient (DDPG) approach. First, it is an Actor-Critic method, which combines the advantages of value-based and policy-based methods. Second, by using an LSTM inside the Actor network, it can take the history of the state into account.

  • We consider the interactions between different traffic systems, so as to jointly dispatch different traffic systems and improve traffic efficiency.

4. Conclusions

To provide a better travel experience, we urgently need a system capable of jointly scheduling multiple modes of transportation. We therefore propose the above topic and give our solution. To complete this research successfully, we need more resources, including but not limited to the following:

  1. Complete query records of Baidu map App.

  2. The bicycle histories of Baidu partners.

  3. The routes of buses and the passenger flow at different times.

  4. Enough GPU resources.

Multi-modal scheduling is an indispensable part of a smart city. The successful development of multi-modal transportation scheduling could bring many advantages, such as reducing transport times, balancing traffic flows, reducing traffic congestion, and ultimately improving the efficiency of intelligent transportation systems. Therefore, research on this topic is valuable to smart city projects. We believe that with these resources we can develop a more comprehensive and efficient multi-modal joint scheduling system.