1 Introduction
Transportation Network Companies (TNCs) like Uber and Lyft have fundamentally transformed mobility in many cities, providing on-demand door-to-door transportation through mobile applications. They have also increased traffic in many cities: a recent study by Erhardt et al. (2019) showed that, between 2010 and 2016, weekday vehicle hours of traffic delay increased by 62% in San Francisco, whereas the estimated increase without TNCs would have been 22%. To address this issue, several cities have begun limiting the number of TNC vehicles on the road. Another way to tackle the underlying congestion and pollution issues is to build mobility systems that use ride-sharing systematically. A study by Alonso-Mora et al. [1] showed that systematic ride-sharing may significantly reduce the number of vehicles needed to serve requests. Their results indicate that 98% of the historic demand for taxi services in NYC could be served with a much smaller taxi fleet, while maintaining short wait times. This paper continues this line of research and focuses on how to build a real-time dispatching and routing architecture that serves the needs of large-scale ride-sharing systems. It is envisioned that, in the future, these ride-sharing systems will be deployed using autonomous vehicles, guarantee service to all customers, and leverage advanced AI systems. In the transition period, they can be supported by human drivers, provided that these drivers follow the instructions of the ride-sharing system.
The value of stochastic information in real-time vehicle routing has been demonstrated previously by Bent and Van Hentenryck [scenariopvh]. However, incorporating stochastic information in large-scale ride-sharing systems, where requests may arrive every tenth of a second during peak times, is a challenge. It is thus not surprising that state-of-the-art approaches are purely myopic [1, 9]. These systems batch requests and optimize frequently to account for real-time demand. Other approaches to real-time dispatching (e.g., [4]) use deep reinforcement learning, but they ignore ride-sharing and do not leverage advanced routing algorithms, focusing only on customer assignment. To incorporate stochastic information, this paper proposes a novel end-to-end framework (ARTRS) for real-time ride-sharing systems that tightly integrates state-of-the-art optimization techniques, machine learning, and model predictive control.
The ARTRS architecture is illustrated in Figure 1. Time is divided into epochs and, during each epoch, ARTRS optimizes the routing of the requests that were batched in the previous epoch as well as unserved requests from earlier epochs. Moreover, at a lower frequency and prior to the routing optimization, ARTRS relocates idle vehicles using a Model Predictive Control (MPC) step. For scalability reasons, the MPC step does not operate on individual requests. Rather, it works with longer time periods and at a coarser zone level (e.g., taxi zones in New York City), and relies on a machine-learning model to predict the number of requests between each pair of zones over time. The main contribution of ARTRS is to demonstrate that, in large-scale real-time ride-sharing systems, hybridizing state-of-the-art optimization algorithms for fine-grained routing decisions with model predictive control for idle vehicle relocation at a coarser space and time granularity provides significant operational benefits. Indeed, results on historic taxi trips from the New York City Taxi and Limousine Commission [5] indicate that this tight integration decreases average waiting times by about 30% over all test cases and reaches close to a 55% reduction in average waiting times for high-demand zones in the most challenging instances.
This paper is organized as follows. Section 2 presents the problem. Section 3 presents related work. Section 4 gives an overview of ARTRS. Section 5 describes the dial-a-ride optimization. Sections 6 and 7 present the core conceptual contributions of the paper: the demand forecasting model and the vehicle relocation scheme. Section 8 reports the experimental results and Section 9 concludes the paper.
2 Problem Statement
Operating a real-time ride-sharing system requires solving large-scale dial-a-ride problems, where each request corresponds to a trip for a number of riders from an origin to a destination that must take place after a specified pickup time. Constraints limit the time a rider can spend inside a vehicle (ride-duration constraints) and the number of riders in a vehicle at any one time (vehicle capacity constraints). The goal is to serve all requests and minimize the average waiting time, while satisfying the ride-duration and capacity constraints. Special attention is also devoted to ensuring that no request is left unserved indefinitely. The systems studied in this paper either use a fleet of autonomous vehicles or their own pool of drivers who follow routing instructions exactly. The system can thus relocate the vehicles at will in order to anticipate demand. It is assumed that significant historical data is available and can be used to forecast demand at the zone level.
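The two constraint families above can be made concrete with a small feasibility check. This is a minimal sketch, not the paper's implementation: the tuple layout for stops, the function name `route_is_feasible`, and the per-request `max_ride_time` dictionary are all assumptions introduced here for illustration.

```python
def route_is_feasible(stops, capacity, max_ride_time, onboard=0):
    """Check capacity and ride-duration constraints along a fixed stop sequence.

    stops: list of (request_id, kind, riders, arrival_time) tuples, where
           kind is "pickup" or "dropoff" and arrival_time is in seconds.
    capacity: maximum number of riders in the vehicle at any one time.
    max_ride_time: dict mapping request_id to its allowed in-vehicle time.
    onboard: riders already in the vehicle before the first stop.
    """
    load = onboard
    pickup_time = {}
    for request_id, kind, riders, arrival_time in stops:
        if kind == "pickup":
            load += riders
            if load > capacity:
                return False  # vehicle capacity constraint violated
            pickup_time[request_id] = arrival_time
        else:
            load -= riders
            start = pickup_time.get(request_id)
            # Riders picked up before this route started are not checked here.
            if start is not None and arrival_time - start > max_ride_time[request_id]:
                return False  # ride-duration constraint violated
    return True


example_stops = [
    ("a", "pickup", 2, 0.0),
    ("b", "pickup", 2, 60.0),
    ("a", "dropoff", 2, 300.0),
    ("b", "dropoff", 2, 400.0),
]
```

With a capacity of 4 and limits of 300 and 400 seconds for requests `a` and `b`, the example route is feasible; lowering the capacity to 3 or the limit for `a` to 200 seconds makes it infeasible.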
3 Related Work
A comprehensive review of popular dial-a-ride formulations, which serve as the foundation of ARTRS, can be found in [2]. ARTRS uses a rolling horizon, alternating request batching and optimization, as traditionally used in taxi pooling [8, 7]. Alonso-Mora et al. [1] were the first to demonstrate the value of ride-sharing in NYC: they showed that 98% of the historic demand could be served with a smaller taxi fleet and short wait times. Their anytime algorithm uses cliques to generate vehicle routes and hard time windows to discard requests that cannot be served efficiently. A linear program is employed to move idle vehicles towards discarded requests in order to better serve those areas in the future. The approach of [ZAC2018] improves over [1] by partitioning the region into zones and assigning vehicles to zone paths. Riley et al. [9] propose the first algorithm designed to serve all requests with small waiting times: they use column generation to serve all requests with a smaller number of vehicles and shorter waiting times. Their dial-a-ride algorithm is used as the dispatching engine of ARTRS. To the authors' knowledge, no algorithm other than that of Riley et al. [9] guarantees that all requests are served: the others may decide to ignore arbitrary requests.
Note that these three algorithms are myopic: they do not exploit information about future requests. Iglesias et al. (2017) proposed a model predictive control approach for serving individual requests at the zone level, combining a machine-learning model (based on deep learning) and a mixed-integer program for request assignments and vehicle relocation. They did not consider ride-sharing and their dispatching algorithm is performed at a coarse granularity. This paper leverages and generalizes their model predictive control approach. Ma et al. (2019) integrated dispatching optimization and model predictive control for scheduling requests in a multimodal transit system: they do not batch requests, use a single period for vehicle relocation, and assume Poisson arrivals for each zone. The dispatching of each request uses local search and insertion heuristics. The benefits of demand prediction and vehicle relocation have been demonstrated by Bent and Van Hentenryck [Bent2007, scenariopvh] for various types of vehicle routing problems (using online stochastic optimization) and by [xyu2019] for on-demand ride-pooling, using approximate dynamic programming. Holler et al. [4] used deep learning and bipartite matching for dispatching and vehicle relocation: their approach does not support ride-sharing. Shah et al. (2020) enhance [1] with an approximation of the future reward learned using a deep neural network. They provide improvements over [1] when the ride duration is twice as long as the shortest path. However, the approach does not provide service guarantees and does not minimize waiting times. It also rejects requests even when vehicles are available, which can be problematic to justify in practice. In contrast, this paper serves all requests with an average waiting time of 2.5 minutes with 2,000 vehicles during peak times and a more realistic ride-duration constraint (50% increase). The socioeconomic benefits of ride-sharing systems are explored by Bistaffa et al. (2019). To the authors' knowledge, this paper is the first integration of advanced optimization techniques, machine learning, and model predictive control for the real-time vehicle dispatching and relocation of large-scale ride-sharing systems.
4 Overall Organization
This section gives an overview of the ARTRS architecture. As depicted in Figure 1, ARTRS divides time into epochs of fixed length. During each epoch, ARTRS batches incoming requests and performs an optimization that assigns previously batched requests to vehicles and routes them. The requests considered in this optimization are those batched in the previous epoch, as well as unserved requests from earlier epochs. Periodically, ARTRS performs a relocation optimization, which exploits a forecasting model to direct idle vehicles towards expected demand.
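The batching scheme can be illustrated with a short rolling-horizon loop. This is a sketch under stated assumptions: the function names, the representation of requests by arrival times, and the `optimize` callback (a stand-in for the dial-a-ride optimization that returns the requests it managed to serve) are hypothetical and not taken from the paper.

```python
def batch_by_epoch(request_times, epoch_length):
    """Group request indices by the epoch in which they arrive."""
    batches = {}
    for idx, t in enumerate(request_times):
        batches.setdefault(int(t // epoch_length), []).append(idx)
    return batches


def rolling_horizon(request_times, epoch_length, num_epochs, optimize):
    """During epoch k, optimize the requests batched in epoch k-1 plus leftovers.

    optimize(epoch, pending) is a placeholder for the routing optimization;
    it must return the subset of pending requests that it serves.
    """
    batches = batch_by_epoch(request_times, epoch_length)
    pending, served_in = [], {}
    for epoch in range(1, num_epochs + 1):
        # Unserved requests from earlier epochs come first, then the new batch.
        pending = pending + batches.get(epoch - 1, [])
        served = optimize(epoch, pending)
        for idx in served:
            served_in[idx] = epoch
        pending = [idx for idx in pending if idx not in served]
    return served_in, pending


# Toy stand-in optimizer: it can only serve the two oldest pending requests.
toy_times = [1, 5, 12, 31, 33, 35, 36]
toy_served, toy_left = rolling_horizon(toy_times, 30, 4, lambda epoch, pend: pend[:2])
```

In the toy run, requests that cannot be served in one epoch are carried over and reconsidered in the next one, which mirrors the treatment of unserved requests described above.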
The Optimization Problem
The optimization problem receives as inputs a set of requests, each of which is characterized by its origin and destination, its earliest pickup time, and its number of riders. The optimization has at its disposal a number of vehicles. Each vehicle is characterized by its departure location, its earliest departure time, its capacity, and its set of riders. Each rider is characterized by her dropoff location and the time she has already spent in the vehicle.
The starting location and departure time of a vehicle are given by the current state of the mobility system. If a vehicle is idle in the existing schedule, its starting location is its current position and its departure time is the beginning of the epoch. If a vehicle is serving customers, its starting location is the first location it visits after the start of the epoch and the departure time is specified accordingly. The riders associated with a vehicle are those who have already been picked up and need to be dropped off. Hence, for every epoch, the optimization problem considers all the requests whose riders have not been picked up yet, while also scheduling the drop-offs of existing riders. Note that the optimization problems associated with two successive epochs may schedule a request differently as long as the request's riders have not been picked up. This gives the real-time system considerable flexibility, at the cost of more complex optimization problems.
Given the computational complexity of the dial-a-ride problem that must be solved in real time, the optimization may not be able to serve all requests for some epochs. Hence, following Riley et al. [9], ARTRS associates a penalty with each request to ensure that the request is served in reasonable time. The penalty is increased after each epoch in which the request is not served. The optimization model minimizes a weighted sum of the average waiting time and the penalties associated with unserved requests.
Vehicle Relocation
Once every fixed number of epochs, ARTRS performs a relocation of vehicles at the zonal level (e.g., taxi zones, census tracts, or traffic analysis zones). The goal is to determine the desired number of idle vehicles to move between each pair of zones over the next relocation period, whose length is significantly larger than the epoch length. As a result, the relocation optimization operates at a much coarser granularity both in space and time.
This combination of micro- and macro-decisions for routing and relocation is one of the salient features of ARTRS and is driven by the reality of large-scale real-time ride-sharing systems, where the number of requests in each epoch makes it computationally difficult to exploit forecasting information during the routing decisions.
Forecasting the Demand
To inform the vehicle relocation, ARTRS is assumed to have at its disposal historical data for the number of requests between each pair of zones for every time period of a fixed length. This historical data is used to train a forecasting model of the demand.
5 The Dial-A-Ride Optimization
During each epoch, ARTRS solves the generalized dial-a-ride optimization specified in Section 4. To perform this task, it borrows the algorithm from Riley et al. [9], which is briefly reviewed in this section. The dial-a-ride optimization is based on a column generation that operates at the route level. A vehicle route specifies a sequence of pickups and drop-offs that satisfies the ride-duration constraints and the vehicle capacity. The column generation interleaves the solving of (the linear relaxation of) a Restricted Master Problem (RMP), which selects routes, and pricing subproblems, which generate new routes for each vehicle. The process terminates when no new route can improve the solution of the RMP or when the time limit for the column generation is reached. The last stage of the dial-a-ride optimization is a mixed-integer program that solves the RMP exactly for the generated routes. The pricing subproblems are complex due to their objective of minimizing waiting times. As a result, traditional dynamic programming formulations are not effective and Riley et al. [9] use an anytime exact algorithm that generates routes of increasing lengths.
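The RMP is depicted in Figure 2. In the formulation, $R$ denotes the set of routes, $R_v$ denotes the subset of possible routes for vehicle $v \in V$, $c_r$ represents the wait times incurred by all customers served by route $r$, $p_i$ is the penalty of not scheduling request $i \in P$ for this epoch, and $a_i^r = 1$ iff request $i$ is served by route $r$. The RMP uses the following decision variables: $y_r$ is 1 iff route $r$ is selected and $z_i$ is 1 iff request $i$ is not served by any of the selected routes. The objective function minimizes the waiting times of the served customers and the penalties for the unserved customers. Constraints (1b) ensure that $z_i$ is set to 1 if request $i$ is not served by any of the selected routes and constraints (1c) ensure that only one route is selected per vehicle.
\begin{align}
\min \quad & \sum_{r \in R} c_r \, y_r + \sum_{i \in P} p_i \, z_i \tag{1a} \\
\text{subject to} \quad & \sum_{r \in R} a_i^r \, y_r + z_i = 1 && \forall i \in P \tag{1b} \\
& \sum_{r \in R_v} y_r = 1 && \forall v \in V \tag{1c} \\
& z_i \in \{0, 1\} && \forall i \in P \tag{1d} \\
& y_r \in \{0, 1\} && \forall r \in R \tag{1e}
\end{align}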
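To make the route-selection step concrete, the sketch below solves a tiny instance by plain enumeration: one route is picked per vehicle, no request is served twice, and unserved requests pay their penalty. This is an illustrative stand-in, not the column generation or the mixed-integer program used by ARTRS; the function name and the data layout (a list of `(wait_cost, requests)` pairs per vehicle, with an empty route for staying idle) are assumptions.

```python
from itertools import product


def solve_rmp_by_enumeration(routes_per_vehicle, penalties):
    """Pick exactly one route per vehicle, minimizing waits plus penalties.

    routes_per_vehicle: one list per vehicle of (wait_cost, requests) pairs.
    penalties: dict mapping each request to the penalty paid if it is unserved.
    Returns (best_cost, best_choice), where best_choice[v] is a route index.
    """
    best_cost, best_choice = float("inf"), None
    for choice in product(*[range(len(routes)) for routes in routes_per_vehicle]):
        served, wait = [], 0.0
        for vehicle, route_idx in enumerate(choice):
            cost, requests = routes_per_vehicle[vehicle][route_idx]
            wait += cost
            served.extend(requests)
        if len(served) != len(set(served)):
            continue  # a request would be served by two selected routes
        total = wait + sum(p for req, p in penalties.items() if req not in served)
        if total < best_cost:
            best_cost, best_choice = total, choice
    return best_cost, best_choice


toy_routes = [
    [(0.0, ()), (3.0, ("a",)), (7.0, ("a", "b"))],   # vehicle 0
    [(0.0, ()), (2.0, ("b",)), (4.0, ("c",))],       # vehicle 1
]
toy_penalties = {"a": 10, "b": 10, "c": 10}
toy_cost, toy_choice = solve_rmp_by_enumeration(toy_routes, toy_penalties)
```

Enumeration is only viable for a handful of routes; it is shown here because it exposes the trade-off between waiting times and penalties that the objective encodes.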
Since the dial-a-ride optimization may not schedule all the requests, it is important to update the penalty of unserved requests to ensure that they will not be delayed too long. For the penalty for an unserved request in epoch , Riley et al. [9] use , where is the earliest possible pickup time for request . The parameter was tuned to incentivize the algorithm to schedule each incoming request in its first available epoch.
6 Demand Forecasting
Forecasting the zone-to-zone demand over time may be challenging in some settings, since this demand may be sparse for some zones and historical data may be limited. To address this difficulty, ARTRS proceeds in two steps: it first predicts the number of requests originating in each zone in each time period and then approximates the zone-to-zone demand.
Preprocessing
Let $d_{i,t}$ be the demand for zone $i$ during period $t$. In the case study, the time series $(d_{i,t})_t$ is strongly nonstationary (the mean and variance vary over time). As a result, the forecasting model first stationarizes the time series by differencing it over a one-week period. More precisely, the forecasting model defines $y_{i,t} = d_{i,t} - d_{i,t-7K}$, where $K$ is the number of periods in a day, and predicts the differenced demand $y_{i,t}$ instead of $d_{i,t}$.
Vector Autoregression
To forecast the time series $(y_{i,t})_t$, ARTRS uses Vector Autoregression (VAR), a multivariate generalization of Autoregression (AR). In VAR, the expected value of a multivariate time series at a particular period is assumed to be a linear function of the value of the time series at previous time steps.
The prediction for the differenced demand in zone $i$ in period $t$ uses not only $i$'s demand in prior periods but also the differenced demands of $i$'s adjacent zones. Let $N_i$ denote the zones adjacent to $i$ and $n_i = |N_i|$. Let vector $Y_{i,t}$ denote the weekly-differenced demands of $i$ and its adjacent zones in period $t$. Each element in $Y_{i,t}$ is an element in $\{y_{j,t} : j \in N_i \cup \{i\}\}$. $y_{i,t}$ can then be modeled as
\begin{equation}
y_{i,t} = \sum_{k=1}^{p} A_{i,k} \, Y_{i,t-k} + \epsilon_{i,t} \tag{2}
\end{equation}
where $A_{i,k}$ is a row vector in $\mathbb{R}^{n_i+1}$ and $\epsilon_{i,t}$ is white noise with zero mean. The coefficients $A_{i,k}$ are estimated using least-squares regression and the order $p$ is selected based on the Akaike information criterion (AIC).
Once the parameters have been estimated, the prediction for the differenced demand of $i$ in period $t$ is given by
\[ \hat{y}_{i,t} = \sum_{k=1}^{p} \hat{A}_{i,k} \, Y_{i,t-k}. \]
The demand prediction for zone $i$ at time $t$ is then given by
\[ \hat{d}_{i,t} = \hat{y}_{i,t} + d_{i,t-7K}. \]
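The differencing and undifferencing steps can be sketched in a few lines of Python. To stay dependency-free, this sketch replaces the VAR model with a deliberately simplified univariate AR(1) fit without intercept, so it ignores adjacent zones and order selection; the function names and the toy series are assumptions made for illustration only.

```python
def weekly_difference(demand, periods_per_day):
    """Subtract from each observation the one recorded exactly one week earlier."""
    lag = 7 * periods_per_day
    return [demand[t] - demand[t - lag] for t in range(lag, len(demand))]


def fit_ar1(series):
    """Least-squares coefficient a for the model y[t] = a * y[t-1] + noise."""
    num = sum(series[t] * series[t - 1] for t in range(1, len(series)))
    den = sum(series[t - 1] ** 2 for t in range(1, len(series)))
    return num / den if den else 0.0


def forecast_next(demand, periods_per_day):
    """Forecast the next period: predict the difference, then add back last week."""
    diff = weekly_difference(demand, periods_per_day)
    next_diff = fit_ar1(diff) * diff[-1]
    return next_diff + demand[len(demand) - 7 * periods_per_day]


# Two weeks of daily demand (one period per day); week 2 is week 1 plus 2.
toy_demand = [10, 20, 30, 40, 50, 60, 70, 12, 22, 32, 42, 52, 62, 72]
toy_forecast = forecast_next(toy_demand, periods_per_day=1)
```

On the toy series the weekly differences are constant, so the fitted coefficient is 1 and the forecast for the next day is last week's value plus the same increment.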
Destination Assignment
Given the predicted number of requests $\hat{d}_{i,t}$ for zone $i$ in period $t$, the trip destinations for these requests are assigned using historical distributions. ARTRS uses a historical distribution for each hour during the weekdays and the weekend days. For example, when predicting the demand in zone $i$ during the 7–8am period on a Wednesday, if $q$ percent of the trips originating from $i$ during this period on weekdays have their destination in zone $j$ in historical data, then the number of trips going from $i$ to $j$ will be $\frac{q}{100} \hat{d}_{i,t}$ rounded to the nearest nonnegative integer. This returns the final demand prediction for the requests from zone $i$ to zone $j$ at $t$.
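The destination assignment amounts to scaling the origin-level prediction by historical destination shares and rounding. A minimal sketch follows, assuming the shares are given in percent in a dictionary keyed by destination zone; the function name and the zone labels are hypothetical. Note that Python's built-in `round` resolves exact ties to the nearest even integer, which may differ from the tie-breaking rule used in the paper.

```python
def assign_destinations(predicted_origin_demand, destination_percent):
    """Split a predicted origin demand across destinations by historical share.

    predicted_origin_demand: forecast number of requests leaving one zone.
    destination_percent: dict mapping destination zone to its historical
        percentage of trips from that origin in the matching hour and day type.
    Each count is rounded to the nearest integer and clipped at zero.
    """
    return {
        dest: max(0, round(predicted_origin_demand * percent / 100))
        for dest, percent in destination_percent.items()
    }


toy_split = assign_destinations(40, {"A": 50, "B": 30, "C": 20})
```

Because each destination is rounded separately, the rounded counts need not sum exactly to the origin-level prediction.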
7 Idle Vehicle Relocation
The idle vehicle relocation process is run once every fixed number of epochs and considers a fixed number of periods of equal length. It has at its disposal the zone-to-zone demand forecasts for each period. It proceeds in two steps: (1) it first uses a Model Predictive Control (MPC) approach to find the desired number of vehicles in each zone; (2) it then selects the vehicle relocation assignments to minimize the relocation cost.
Zone Rebalancing
The MPC approach in this section is borrowed from Iglesias et al. (2017) and generalized to real-time ride-sharing systems, where multiple riders can share a vehicle. Its goal is to determine the number of idle vehicles to move between each pair of zones during the next period in order to minimize the average waiting time in the dial-a-ride optimizations. To achieve this goal, the MPC approach solves an assignment optimization at the zone level over multiple time periods. Hence, in contrast to the dial-a-ride optimization, it works at a coarser granularity and looks ahead in the future using the demand forecasting module.
The MPC approach uses a MIP model (MPCMIP) over periods, each of length . Let . For each period , MPCMIP takes as input , the forecasted demand originating in zone with a destination in zone at time , as well as a variety of information about the current state of the system. In particular, is the current sharing ratio for requests from zone to zone in the system (e.g., means that riders are alone in the vehicle, while means an average of passengers per vehicle); is the set of vehicles that will become idle in zone during period (estimated from routes of current vehicles) and is the average number of periods it takes to move from a stop in zone to a stop in zone .
The MPCMIP decision variables are, for each pair of zones and each period: the number of empty vehicles that start moving from the first zone to the second in that period; the number of vehicles with passengers that start moving from the first zone to the second in that period; and the number of requests originating in the first zone and ending in the second that are not served in that period. The objective (3a) minimizes the number of unserved requests while also enforcing a small penalty on moving vehicles. This penalty ensures that vehicles prefer to stay in their current zone if that zone is expected to need them in the future. Constraints (3b) and (3c) are the flow balance constraints for requests. Constraints (3d) are the flow conservation constraints for vehicles.
(3a)  
subject to  
(3b)  
(3c)  
(3d)  
(3e) 
Vehicle Relocation
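MPCMIP returns the number of vehicles that should move from zone $i$ to zone $j$ in each period. Only the relocations in the first period are relevant for ARTRS at this point in time. However, ARTRS must now identify specific vehicles to relocate. This is performed by another MIP model (VRMIP), which receives the following inputs: $r_{ij}$, the number of vehicles to relocate from zone $i$ to zone $j$ in the first period; $V_i$, the set of idle vehicles in zone $i$; and $t_{vj}$, the time to move vehicle $v$ to its closest stop in zone $j$. VRMIP decides whether a vehicle is relocated to a zone: variable $x_{vj}$ is 1 if vehicle $v$ is chosen to relocate to zone $j$. The VRMIP objective (4a) minimizes the sum of the traveling times, constraints (4b) ensure the correct number of relocations from zone $i$ to zone $j$, and constraints (4c) ensure that a vehicle does not relocate to more than one zone. With $Z$ denoting the set of zones:
\begin{align}
\min \quad & \sum_{i \in Z} \sum_{v \in V_i} \sum_{j \in Z} t_{vj} \, x_{vj} \tag{4a} \\
\text{subject to} \quad & \sum_{v \in V_i} x_{vj} = r_{ij} && \forall i, j \in Z \tag{4b} \\
& \sum_{j \in Z} x_{vj} \le 1 && \forall i \in Z, \; v \in V_i \tag{4c} \\
& x_{vj} \in \{0, 1\} && \forall i, j \in Z, \; v \in V_i \tag{4d}
\end{align}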
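The vehicle selection step can be illustrated on a tiny instance by exhaustive search: for each origin zone, the required relocation slots are filled with distinct idle vehicles so that total travel time is minimal. This is a sketch for intuition, not the MIP solved by ARTRS; the function name, the dictionary-based inputs, and the zone and vehicle labels are assumptions.

```python
from itertools import permutations


def choose_relocations(idle_by_zone, moves, travel_time):
    """Select which idle vehicles relocate, minimizing total travel time.

    idle_by_zone: dict zone -> list of idle vehicle ids in that zone.
    moves: dict (origin, destination) -> number of vehicles to relocate.
    travel_time: dict vehicle -> dict destination zone -> travel time.
    Returns a dict vehicle -> destination zone for the relocated vehicles.
    """
    plan = {}
    for origin, vehicles in idle_by_zone.items():
        # One slot per required relocation out of this origin zone.
        slots = [dest for (src, dest), count in sorted(moves.items())
                 if src == origin for _ in range(count)]
        if len(slots) > len(vehicles):
            raise ValueError(f"not enough idle vehicles in zone {origin}")
        # Each vehicle fills at most one slot, so it moves to at most one zone.
        best = min(
            permutations(vehicles, len(slots)),
            key=lambda chosen: sum(travel_time[v][d] for v, d in zip(chosen, slots)),
        )
        plan.update(zip(best, slots))
    return plan


toy_plan = choose_relocations(
    {"Z1": ["v1", "v2", "v3"]},
    {("Z1", "Z2"): 1, ("Z1", "Z3"): 1},
    {"v1": {"Z2": 5, "Z3": 9}, "v2": {"Z2": 4, "Z3": 3}, "v3": {"Z2": 6, "Z3": 8}},
)
```

In the toy instance, `v2` is the closest vehicle to both destinations, yet the cheapest joint plan sends `v1` to `Z2` and `v2` to `Z3`, which is why the selection is solved jointly rather than greedily.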
8 Experimental Results
Case Study
ARTRS is evaluated on the yellow trip data obtained via the New York City Taxi and Limousine Commission (NYCTLC) [5]. The NYCTLC dataset provides the number of passengers for each trip and the start time of each trip, which is used as both the request time and the lower bound on the earliest pickup time. This section reports results on 24 instances, two hours each day for two days per month from July 2015 through June 2016. Rush hours (7–9am) were selected to obtain instances that are computationally challenging. The instances have an average of customers and range from to customers. Individual requests with more customers than the vehicle capacity are split into several trips. Following Riley et al. [9], in order to ease ride-sharing, avoid curb management issues, and reduce the number of stops, Manhattan is represented by a grid with cells of 200 squared meters. Each such cell represents a pickup/drop-off location. The travel time matrix for the network of locations was then precomputed by querying OpenStreetMap [6]. For every trip, the locations of the origin and destination were obtained by selecting the closest locations to their pickup and drop-off points in the NYCTLC dataset.
Runtime Configurations
ARTRS is compared to its myopic version MRTRS, which has no idle vehicle relocation and is essentially the approach proposed by Riley et al. [9]. It is also compared to OARTRS, a version of ARTRS using perfect information on future requests instead of the machine-learning predictions. Unless otherwise specified, all experiments were performed with the following default parameters for ARTRS (and MRTRS when relevant): vehicles of capacity 4, a maximum deviation from the shortest path determined by , where is the shortest possible path from customer ’s origin to their destination, , , , seconds, seconds, seconds, and epochs. Empty vehicles are initially distributed evenly over the locations. The demand was forecasted at the hour level and scaled down uniformly due to the sparse demand in some of the zones. All MIP models are solved using Gurobi 8.1 [3].
Reduction in Waiting Times
Figures 5 and 6 report the distributions of the waiting times incurred by all customers across all instances with a logarithmic y-axis. The results demonstrate that ARTRS reduces waiting times across the vast majority of trips. It reduces the average waiting times from 3.64 to 2.51 minutes (a 30% improvement) while also decreasing the standard deviation from 1.48 to 1.16.
Figures 7, 8, and 9 go into more detail and report results on each instance and by instance size. The results show that the benefits of relocation are more significant for instances with a large number of requests. ARTRS strongly dominates MRTRS for instances with large demands, but is not as effective on instances with relatively low demand. This last result is due to the accuracy of the machine-learning algorithm, as highlighted in Figure 8. With perfect information, OARTRS improves over ARTRS on low-demand instances, but ARTRS and OARTRS behave similarly on high-demand instances.
Number of Requests  MRTRS  ARTRS  OARTRS 

2.33  2.37  2.15  
3.83  2.40  2.41  
3.78  2.56  2.56 
Zonal Information
Figures 10 and 11 describe where the reductions in waiting times occur. Together, they show that relocation benefits zones with significant demand the most, with improvements that can reach almost 55%. Figure 12 is reassuring from a fairness standpoint: zones with low demand keep low average waiting times under relocation.
Passenger Information
Vehicle Information
Figure 15 depicts the histograms of idle time per vehicle. It shows that MRTRS has more vehicles with no idle time and more vehicles with high idle times. ARTRS is more balanced.
Relocation
Figure 16 shows that most vehicles spend less than five minutes in relocation and the vast majority of vehicles spend less than 17% of their operating hours relocating. Figure 17 shows that OARTRS relocates more on the instances with lower demands, explaining why it improves over ARTRS on these instances. Having a better demand predictor is thus an important research direction.
9 Conclusion
This paper proposed ARTRS, an end-to-end framework for real-time optimization of large-scale ride-sharing systems. ARTRS combines demand forecasting, state-of-the-art optimization, and model predictive control to dispatch, route, and relocate vehicles in real time, minimizing average waiting times. The mobility system provides service guarantees (i.e., it serves all requests), enforces a ride-duration constraint (i.e., no passenger travels more than 50% over their shortest path), and minimizes waiting times, using penalties that increase over time to ensure that every request is served within a reasonable time. Experiments using historic taxi trips in New York City indicate that this integration decreases average waiting times by about 30% over all test cases and reaches close to 55% on the largest instances for high-demand zones compared to a baseline without relocation. On the NYC case study, ARTRS serves all requests in reasonable time, with an average waiting time of 2.51 minutes and a standard deviation of 1.16, using a fleet of 2,000 vehicles of capacity 4. The results also demonstrate that, while zones with large demand see the most benefits, zones with low demand maintain low waiting times and that the vast majority of vehicles spend less than 17% of their operating time relocating. In summary, ARTRS demonstrates that, in large-scale real-time ride-sharing systems, hybridizing state-of-the-art optimization algorithms for fine-grained routing decisions with model predictive control for idle vehicle relocation at a coarser space and time granularity provides significant operational benefits. Future research will be devoted to improving the machine-learning algorithm, since more accurate predictions will enable better performance on instances with relatively fewer requests.
References
 [1] (2017) On-demand high-capacity ride-sharing via dynamic trip-vehicle assignment. Proceedings of the National Academy of Sciences 114 (3), pp. 462–467.
 [2] (2007) The dial-a-ride problem: models and algorithms. Annals of Operations Research 153 (1), pp. 29–46.
 [3] (2016) Gurobi optimizer reference manual.
 [4] (2019) Deep reinforcement learning for multi-driver vehicle dispatching and repositioning problem. ArXiv abs/1911.11260.
 [5] (2019) NYC Taxi & Limousine Commission trip record data.
 [6] (2017) Planet dump retrieved from https://planet.osm.org. Note: https://www.openstreetmap.org
 [7] (2015) A scalable approach for data-driven taxi ride-sharing simulation. In 2015 IEEE International Conference on Big Data (Big Data), pp. 888–897.
 [8] (2017) STaRS: simulating taxi ride sharing at scale. IEEE Transactions on Big Data 3 (3), pp. 349–361.
 [9] (2019) Column generation for real-time ride-sharing operations. In Integration of Constraint Programming, Artificial Intelligence, and Operations Research, L. Rousseau and K. Stergiou (Eds.), pp. 472–487.