Ridesharing is a coordination problem at its core. Traditionally it has been
solved in a centralized manner by ridesharing platforms. Yet, to truly allow
for scalable solutions, we need to shift from traditional approaches to
multi-agent systems, ideally run on-device. In this paper, we show that a
recently proposed heuristic (ALMA), which exhibits such properties, offers an
efficient, end-to-end solution for the ridesharing problem. Moreover, by
utilizing simple relocation schemes we significantly improve QoS metrics, by up
to 50%.
To demonstrate the latter, we perform a systematic evaluation of a diverse
set of algorithms for the ridesharing problem, which is, to the best of our
knowledge, one of the largest and most comprehensive to date. Our evaluation
setting is specifically designed to resemble reality as closely as possible. In
particular, we evaluate 12 different algorithms over 12 metrics related to
global efficiency, complexity, passenger, driver, and platform incentives.
1. Introduction
The emergence and widespread use of ridesharing in recent years has had a profound impact on urban transportation in a variety of ways. Amongst others, it has mitigated congestion costs (such as commute times, fuel usage, accident propensity, etc.), it has enabled marketplace optimization for both passengers and drivers, and it has provided great environmental benefits. Ridesharing, however, also results in some passenger disruption, due to compromises in flexibility, increased travel time, and loss of privacy and convenience. Thus, at the core of any ridesharing platform lies the need for an efficient balance between the incentives of the passengers, the drivers, and those of the platform.
Optimizing the usage of transportation resources is not an easy task, especially for cities like New York, with more than 13000 taxis and 270 ride requests per minute. For example, Buchholz (2018) estimates that 45000 customer requests remain unmet each day in New York, despite the fact that approximately 5000 taxis are vacant at any time. In fact, on aggregate, drivers spend about 47% of their time not serving any passengers. Moreover, up to 80% of the taxi rides in Manhattan could be shared by two riders, with an increase in travel time of only a few minutes Alonso-Mora et al. (2017). A more sophisticated matching policy could mitigate these costs by better allocating available supply to demand. As a second example, coordinated vehicle relocation could also be employed to close the gap in the spatial supply/demand imbalance and improve passenger satisfaction and Quality of Service (QoS) metrics. Drivers often relocate to find passengers: 61.3% of trips begin in a different neighborhood than the drop-off location of the last passenger Buchholz (2018), yet currently drivers move without any coordinated search behavior, resulting in spatial search frictions.
Given the importance of the ridesharing problem for transportation and the economy, it is not surprising that the related literature is populated with a plethora of papers, proposing different solutions along different axes, such as efficiency Santi et al. (2014); Alonso-Mora et al. (2017); Agatz et al. (2011); Ashlagi et al. (2017); Huang et al. (2019); Bienkowski et al. (2018); Dickerson et al. (2018), platform revenue Banerjee et al. (2017); Chen et al. (2019), driver incentives Ma et al. (2019); Yuen et al. (2019) or fairness Lesmana et al. (2019); Sühr et al. (2019). Most of the related work can be broadly categorized as either (a) empirical papers, which propose heuristics tailored for the ridesharing problem and evaluate their performance on experimental scenarios, or (b) theoretical papers, which design algorithms for more abstract versions of the problem (e.g., matching under uncertainty, in the presence of deadlines, etc.) and provide worst-case guarantees for their performance. Crucially, all of the above solutions rely on the existence of a centralized platform. Yet, scaling demands that we move to multi-agent solutions that ideally run on-device. A fundamental question here is:
Can we replace centralized solutions with on-device multi-agent systems without compromising performance?
In order to properly answer the above, we strongly believe that there is a need for a thorough empirical evaluation: which algorithm works well in practice, in a realistic ridesharing scenario, for a host of different objectives? Can we leverage historical data for dynamic vehicle relocation to close the gap on the spatial supply/demand imbalance?
1.1. Our Contributions
(1) As a highlight of our results, we identify a scalable, on-device heuristic (the ALMA algorithm of Danassis et al. (2019)) that offers an efficient, end-to-end solution for the Dynamic Ridesharing and Fleet Relocation problem.
(2) We perform a comprehensive and systematic evaluation of a diverse set of algorithms for the ridesharing problem. We have evaluated 12 different algorithms over 12 metrics. We put extra emphasis on designing an evaluation setting which resembles reality as closely as possible, in every aspect of the problem. Our list of metrics includes, amongst others, the total distance saved as a result of ridesharing, the pick-up times and delay incurred by the passengers, the profit and search frictions of drivers, and the platform revenue. To the best of our knowledge, this is the first end-to-end experimental evaluation of this magnitude.
(3) We examine the extent to which relocation of idle taxis can improve QoS objectives, by closing the gap in the spatial supply/demand imbalance. We propose relocation schemes which are based on matching algorithms and make use of the historical data to predict future requests. Our results here are irrefutable: we improve several QoS metrics radically (exceeding 50% in certain cases), suggesting that relocation should be a vital part of any efficient ridesharing algorithm.
We firmly believe that our findings provide a clear-cut recommendation to ridesharing platforms on which solutions they should employ in practice.
1.2. Ridesharing in the Literature
The key algorithmic components of ridesharing are the following. First, it is an online problem, as the decisions made at some point in time clearly affect the possible decisions in the future, and therefore the general approach of the field of online algorithms and competitive analysis is applicable Borodin and El-Yaniv (2005); Manasse et al. (1988). Secondly, it is clearly a matching setting, both for bipartite graphs (for matching passengers with taxis) and for general graphs (for matching passengers into pairs). In fact, most of the algorithms that have been proposed in the literature for the problem are for different variants of online matching Kalyanasundaram and Pruhs (1993); Karp et al. (1990). Finally, ridesharing can be seen as a generalization of the k-taxi problem Coester and Koutsoupias (2018); Fiat et al. (1994); Kosoresow (1997), which, in turn, is a generalization of the well-known k-server problem Koutsoupias and Papadimitriou (1995); Koutsoupias (2009). [Footnote 1: In fact, the latter two problems are quite closely connected, and algorithms for the k-server problem can be used to solve the k-taxi problem. See Coester and Koutsoupias (2018) for more details.] This means that, treating passenger pairs as requests in the k-server problem, algorithms from the classical literature of the k-server problem (paired with an appropriate matching algorithm for pairing passenger requests) are prime candidates for solving the ridesharing problem.
The algorithms that we consider in this paper are appropriate modifications of the most significant ones that have been proposed for the aforementioned settings, as well as heuristic approaches which are based on the same principles, but were designed with the ridesharing application in mind. We emphasize that such modifications are needed, primarily because many of these algorithms were tailored for sub-problems of the ridesharing setting, and end-to-end solutions in the literature are rather scarce.
Finally, we emphasize that, while several papers in the related work provide detailed evaluations on realistic datasets (e.g., see Danassis et al. (2019); Santi et al. (2014); Alonso-Mora et al. (2017); Agatz et al. (2011); Santos and Xavier (2013)), they either (a) only consider parts of the ridesharing problem and therefore do not propose end-to-end solutions, (b) only evaluate a few newly-proposed algorithms against some basic baselines, (c) only consider a limited number of performance metrics, predominantly with regard to the overall efficiency and often without regard to QoS metrics, or (d) perform evaluations on a smaller scale, thus not capturing the real-life complexity of the problem. In contrast, our approach provides a comprehensive evaluation of a large number of proposed algorithms, over multiple different metrics and for end-to-end problems of a real scale.
2. Problem Statement & Modeling
In this section we formally present the Dynamic Ridesharing and Fleet Relocation (DRSFR) problem. In the DRSFR problem there is a (potentially infinite) metric space X representing the topology of the environment, equipped with a distance function δ: X × X → R_{≥0}. Both are known in advance. At any moment, there is a (dynamic) set of available taxi vehicles V_t, ready to service customer requests (i.e., drive to the pick-up, and subsequently to the destination location). Between servicing requests, vehicles can relocate to locations of potentially higher demand, to mitigate spatial search frictions between drivers. Customer requests appear in an online manner at their respective pick-up locations, wait to potentially be matched to a shared ride, and finally are serviced by a taxi to their respective destination. In order for two requests to be able to share a ride, they must satisfy spatial and temporal constraints. The former dictates that requests should be matched only if there is good spatial overlap among their routes. Yet, due to the latter constraint, requests cannot be matched even if they have perfect spatial overlap, if they are not both ‘active’ at the same time. Finally, the DRSFR is an inherently online problem, as we are unaware of the requests that will appear in the future, and need to make decisions before the requests expire, while taking into account the dynamics of the fleet of taxis. The goal is to minimize the cumulative distance driven by the fleet of taxis, while maintaining high QoS, given that we serve all requests. Serving all requests improves passenger satisfaction, and, most importantly, allows us to ground our evaluation in a common scenario, ensuring a fair comparison.
2.1. Performance Metrics
2.1.1. Global Metrics
Distance Driven: Minimize the cumulative distance driven by all vehicles for serving all the requests. We chose this objective as it directly correlates to passenger, driver, company, and environmental objectives (minimize cost, delay, CO2 emissions, maximize number of shared rides, improve QoS, etc.). All of the evaluated algorithms have to serve all the requests, either as shared, or single rides.
Complexity: Real-world time constraints dictate that the employed solution produces results in a reasonable time-frame. [Footnote 2: For example, UberPool has a waiting period of at most 2 minutes until you get a match (https://www.uber.com/au/en/ride/uberpool/), thus any algorithm has to run in under that time.]
2.1.2. Passenger Specific Metrics – Quality of Service
Time to Pair: Expected time to be paired in a shared ride, i.e., E[t_{paired} − t_{open}], where t_{open} and t_{paired} denote the time the request appeared and the time it was paired as a shared ride, respectively. If the request is served as a single ride, then t_{paired} refers to the time the algorithm chose to serve it as such.
Time to Pair with Taxi: Expected time to be paired with a taxi, i.e., E[t_{taxi} − t_{paired}], where t_{taxi} denotes the time the (shared) ride was paired with a taxi.
Time to Pick-up: Expected time to passenger pick-up, i.e., E[t_{pickup} − t_{taxi}], where t_{pickup} denotes the time the request was picked up.
Delay: Additional travel time over the expected direct travel time (when served as a single ride, instead of a shared ride), i.e., E[(t_{dest} − t_{pickup}) − (t′_{dest} − t_{pickup})], where t_{dest} and t′_{dest} denote the time the request reaches its destination and the time it would have reached it as a single ride, respectively.
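The four passenger metrics above can be computed directly from per-request timestamps. The following is a minimal sketch, assuming each served request is logged with the six timestamps named in the definitions; the `ServedRequest` record and the `qos_metrics` function are hypothetical names introduced here for illustration and are not part of the evaluated system.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class ServedRequest:
    """Hypothetical log record; all times in minutes since simulation start."""
    t_open: float         # request appeared
    t_paired: float       # paired into a (shared or single) ride
    t_taxi: float         # ride paired with a taxi
    t_pickup: float       # passenger picked up
    t_dest: float         # actual arrival at destination
    t_dest_single: float  # arrival had it been served as a single ride (t'_dest)


def qos_metrics(requests):
    """Empirical averages standing in for the expectations in the definitions."""
    return {
        "time_to_pair": mean([r.t_paired - r.t_open for r in requests]),
        "time_to_pair_with_taxi": mean([r.t_taxi - r.t_paired for r in requests]),
        "time_to_pickup": mean([r.t_pickup - r.t_taxi for r in requests]),
        # (t_dest - t_pickup) - (t'_dest - t_pickup) simplifies to t_dest - t'_dest
        "delay": mean([r.t_dest - r.t_dest_single for r in requests]),
    }
```

Note that the pick-up time cancels out of the Delay expression, so only the two arrival times are needed for it.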
Research conducted by ridesharing companies shows that passengers’ satisfaction level remains sufficiently high as long as the pick-up time is less than a certain threshold. The latter is corroborated by data on booking cancellation rate against pick-up time Tang et al. (2017). In other words, passengers would rather have a short pick-up time and long detour, than vice-versa Brown ([n. d.]b). This also suggests that an effective relocation scheme can considerably improve passenger satisfaction by reducing the average pick-up time (see Section 5.1).
2.1.3. Driver Specific Metrics
Driver Profit: Total revenue earned minus total travel costs.
Number of Shared Rides: Directly related to the profit. By carrying more than one passenger at a time, drivers can serve more requests in a day, which consequently increases their income Widdows et al. (2017).
Frictions: Waiting time experienced by drivers between serving requests (i.e., time between dropping-off a ride, and getting matched with another). Search frictions occur when drivers are unable to locate rides due to spatial supply and demand imbalance. Even though in our scenario matchings are performed automatically, without any searching involved by the drivers, lower frictions indicate lower regret by the drivers, thus lower temptation to potentially switch to an alternative ridesharing platform.
2.1.4. Platform Specific Metrics
Platform Profit: Usually a commission on the driver’s fee [Footnote 3: E.g., Uber charges partners a 25% fee on all fares (https://www.uber.com/en-GH/drive/resources/payments/).], and/or passenger fees (which, given that we serve all the requests, would be constant across all the employed algorithms).
Quality of Service (QoS): Refers to the aforementioned passenger-specific metrics. Improving the QoS offered to customers correlates with the growth of the company.
Number of Shared Rides: The matching rate is important especially in the nascent stage of a ridesharing platform Dutta and Sholley (2018).
We do not report separate values on the aforementioned metrics, as they directly correlate to their respective passenger, and driver specific ones.
2.2. Modeling
2.2.1. Dataset
We have used the yellow taxi trip records of 2016, provided by the NYC Taxi and Limousine Commission. [Footnote 4: https://www1.nyc.gov/site/tlc/about/tlc-trip-record-data.page] For every request, the dataset provides, amongst others, the pick-up and drop-off times, and geo-location coordinates. Time is discrete, with a granularity of 1 minute (same as the dataset). On average, there are 272 new requests per minute, totaling 391479 requests in the broader NYC area (352455 in Manhattan) on the evaluated day (January 15). Figure 1 depicts the request arrivals per minute on the aforementioned day.
2.2.2. Taxi Vehicles
A unique feature of the NYC Yellow taxis is that they may only be hailed from the street and are not authorized to conduct pre-arranged pick-ups. This provides an ideal setting for a counter-factual analysis since (1) we can assume a realistic position of each taxi at the beginning of the simulation (last drop-off location), and (2) all observed rides are obtained through search, thus (assuming reasonable prices and delays) customers neither have nor are willing to take an alternative means of transportation. This validates our choice that all of the algorithms considered eventually have to serve all the requests. By law, there are 13,587 taxis in NYC. [Footnote 5: https://www1.nyc.gov/site/tlc/businesses/yellow-cab.page] The majority of the results presented in this paper use a much lower number of vehicles (what we call the base number) for three reasons: (1) to reduce complexity, given that most of the employed algorithms cannot handle such a large number of vehicles, (2) to evaluate under resource scarcity, making the problem harder so as to better differentiate between the results, and (3) to investigate the possibility of a more efficient utilization of resources, with minimal cost to the consumers. However, we still present simulations for a wide range of fleet sizes, up to close to the total number. The number, initial location, and speed of the taxi vehicles were calculated as follows:
We calculated the base number of taxis as the minimum number of taxis required to serve all requests as single rides (no ridesharing). If a request appears and all taxis are occupied serving other requests, we increase the required number of taxis by one. This resulted in around 4000−5000 vehicles (depending on the size of the simulation, see Section 5). Simulations were conducted for {×0.5, ×0.75, ×1.0, ×2.0, ×3.0} the base number.
Given a number of taxis, V, the initial position of each taxi is the drop-off location of the last V requests, prior to the starting time of the simulation. To avoid cold start, we compute the drop-off time of each request, and assume the vehicle occupied until then.
The vehicles’ average speed is estimated at 6.2 m/s (22.3 km/h), based on the trip distance and time per trip as reported in the dataset, and corroborated by the related literature (in Santi et al. (2014) the authors estimated the speed to be between 5.5−8.5 m/s depending on the time of day).
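The base-number rule described above (add a taxi whenever a request appears and every taxi is occupied) can be sketched with a min-heap of drop-off times. This is only an illustration under a simplifying assumption that is ours, not the paper's: a taxi is considered free as soon as its previous drop-off time has passed, ignoring the time needed to drive to the next pick-up. The function name `base_number_of_taxis` is hypothetical.

```python
import heapq


def base_number_of_taxis(trips):
    """trips: iterable of (pickup_time, dropoff_time) pairs for single rides.

    Returns the smallest fleet size such that no request ever finds all
    taxis occupied (assumption: a taxi is reusable once its drop-off
    time is not later than the new pick-up time).
    """
    busy_until = []  # min-heap with the drop-off time of every taxi used so far
    fleet_size = 0
    for pickup, dropoff in sorted(trips):
        if busy_until and busy_until[0] <= pickup:
            # The earliest-finishing taxi is free: reuse it for this request.
            heapq.heapreplace(busy_until, dropoff)
        else:
            # All taxis are occupied: increase the required number by one.
            fleet_size += 1
            heapq.heappush(busy_until, dropoff)
    return fleet_size
```

The simulated fleet sizes are then obtained by scaling the returned value by the factors listed above.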
2.2.3. Customer Requests
A request, r, is a tuple ⟨t_r, s_r, d_r, k_r⟩. Request r appears (becomes open) at its respective pick-up time (t_r) and geo-location (s_r). Let d_r denote the destination. Each request admits a willingness to wait (k_r) to find a match (rideshare), i.e., we assume dynamic waiting periods per request. The rationale behind k_r is that requests with longer trips are more willing to wait to find a match than requests with destinations near-by. After k_r time-steps we call request r critical. If a critical request is not matched, it has to be served as a single ride. Recall that in our setting all of the requests must be served. Let R^{open}_t, R^{critical}_t denote the sets of open and critical requests respectively, and let R_t = R^{open}_t ∪ R^{critical}_t.
We calculate k_r as in related literature Danassis et al. (2019). Let w_{min} and w_{max} be the minimum and maximum possible waiting time, i.e., w_{min} ≤ k_r ≤ w_{max}, ∀r. Knowing s_r, d_r, we can compute the expected trip time (E[t_{trip}]). Assuming people are willing to wait proportionally to their trip time, let k_r = q × E[t_{trip}], where q ∈ [0,1]. The values of w_{min}, w_{max}, and q can be set by the ridesharing company, based on customer satisfaction (following Danassis et al. (2019), let w_{min} = 1, w_{max} = 3, and q = 0.1).
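As a small worked example of the rule above, the sketch below computes k_r by scaling the expected trip time with q and clamping the result to [w_{min}, w_{max}]. The function name is hypothetical, times are assumed to be in minutes, and no rounding to whole time-steps is applied since the text does not specify one.

```python
def willingness_to_wait(expected_trip_minutes, q=0.1, w_min=1, w_max=3):
    """k_r = q * E[t_trip], clamped so that w_min <= k_r <= w_max."""
    return min(w_max, max(w_min, q * expected_trip_minutes))
```

With the default parameters, a 5-minute trip waits the minimum of 1 minute, a 20-minute trip waits 2 minutes, and any trip of 30 minutes or more waits the maximum of 3 minutes.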
2.2.4. Rides
A (shared) ride, ρ, is a pair ⟨r_1, r_2⟩, composed of two requests. If a request r is served as a single ride, then r_1 = r_2 = r. Let P_t denote the set of rides waiting to be matched to a taxi at time t. Contrary to some recent literature on high capacity ridesharing (e.g., Alonso-Mora et al. (2017); Lowalekar et al. (2019)), we purposefully restricted ourselves to rides of at most two requests for two reasons: complexity and passenger satisfaction. The complexity of the problem grows rapidly as the number of potential matches increases, while most of the proposed/evaluated approaches already struggle to tackle matchings of size two on the scale of a real-world application. Moreover, even though a fully utilized vehicle would ultimately be a more efficient use of resources, it diminishes passenger satisfaction (a frequent worry being that the ride will become interminable, according to internal research by ridesharing companies) Widdows et al. (2017); Brown ([n. d.]a). Given that a hard constraint is the servicing of all requests, we do not assume a time limit on matching rides with taxis; instead we treat it as a QoS metric.
2.2.5. Distance Function
The optimal choice for a distance function would be the actual driving distance. Yet, our simulations require trillions of distance calculations, which makes this infeasible. Given that the locations are given in latitude and longitude coordinates, it is tempting to use the Haversine formula [Footnote 6: https://en.wikipedia.org/wiki/Haversine_formula] to estimate the Euclidean distance, as in related literature Santos and Xavier (2013); Brown ([n. d.]a). Instead, we have opted to use the Manhattan distance, given that the simulation takes place mostly in Manhattan. To evaluate our choice, we collected more than 12 million actual driving distances using the Open Source Routing Machine (project-osrm.org), which computes shortest paths in road networks. The Manhattan distance’s error, compared to the actual driving distance, was −0.5±2.9 km, while the Euclidean distance’s was −3.2±3.8 km.
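To make the two candidate distance functions concrete, the following sketch computes both from latitude/longitude pairs. It is an illustration only: the text does not specify how the Manhattan distance was derived from geo-coordinates, so the axis-aligned decomposition below (a north-south leg plus an east-west leg measured at the mid latitude, with no rotation to the street grid) and the Earth radius constant are assumptions of this example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumed constant for this sketch)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle ("as the crow flies") distance between two points, in km."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def manhattan_km(lat1, lon1, lat2, lon2):
    """Axis-aligned L1 distance: north-south leg plus east-west leg, in km."""
    north_south = haversine_km(lat1, lon1, lat2, lon1)
    mid_lat = (lat1 + lat2) / 2
    east_west = haversine_km(mid_lat, lon1, mid_lat, lon2)
    return north_south + east_west
```

For two points that differ in both coordinates, the L1 value is larger than the straight-line value, which is in line with the smaller underestimation of driving distances reported above for the Manhattan distance.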
2.2.6. Pricing
Pricing combines a one-time flag drop fee (β = 2.2 $) [Footnote 7: https://www.uber.com/us/en/price-estimate/], a distance fare (π_I = 0.994 $/km for a single ride, π_{II} = 0.8 $/km shared), the fuel price (3.2 $/gal) [Footnote 8: https://www.eia.gov/dnav/pet/pet_pri_gnd_dcus_sny_m.htm], and the vehicle mileage (46.671 km/gal Buchholz (2018)), resulting in a cost per km of c = 3.2/46.671 $/km. Specifically, the revenue M(ρ) earned by a taxi driver from serving ride ρ is given by the following equation Buchholz (2018):
\[
M(\rho) =
\begin{cases}
\beta + \pi_{I}\,\delta(s_r, d_r) - c\,\delta(s_v, s_r, d_r), & \text{if } \rho \text{ single} \\
2\beta + \pi_{II}\,\delta(s_{r_1}, d_{r_1} \mid r_2) + \pi_{II}\,\delta(s_{r_2}, d_{r_2} \mid r_1) - c\,\delta(s_v, s_{r_1}, s_{r_2}, d_{r_1}, d_{r_2}), & \text{if } \rho \text{ shared}
\end{cases}
\tag{1}
\]
where, with some slight abuse of notation, δ(s_v, s_r, d_r) denotes the distance from the current location of the taxi s_v to the pick-up and subsequently the drop-off location of the ride, δ(s_{r_1}, d_{r_1} | r_2) denotes the distance driven from the pick-up to the destination of r_1, given that r_1 will share the ride with r_2 (similarly δ(s_{r_2}, d_{r_2} | r_1) for r_2), and finally, δ(s_v, s_{r_1}, s_{r_2}, d_{r_1}, d_{r_2}) denotes the total driving distance of the taxi for serving the two requests starting from s_v.
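Equation (1) can be evaluated as follows. This is a minimal sketch that takes the already-computed distances (in km) as inputs; the constant and function names are hypothetical and chosen for this example, while the numeric values are the ones stated above.

```python
FLAG_DROP = 2.2             # beta, in $
FARE_SINGLE = 0.994         # pi_I, in $/km
FARE_SHARED = 0.8           # pi_II, in $/km
COST_PER_KM = 3.2 / 46.671  # c: fuel price ($/gal) over mileage (km/gal)


def revenue_single(trip_km, driven_km):
    """First case of Eq. (1).

    trip_km   = delta(s_r, d_r): pick-up to destination
    driven_km = delta(s_v, s_r, d_r): taxi location to pick-up to destination
    """
    return FLAG_DROP + FARE_SINGLE * trip_km - COST_PER_KM * driven_km


def revenue_shared(km_r1, km_r2, driven_km):
    """Second case of Eq. (1).

    km_r1, km_r2 = in-vehicle distance of each passenger in the shared ride
    driven_km    = total distance driven by the taxi for both requests
    """
    return 2 * FLAG_DROP + FARE_SHARED * (km_r1 + km_r2) - COST_PER_KM * driven_km
```

Each passenger pays the flag drop fee, which is why β appears twice in the shared case, while the fuel cost is charged once on the total distance driven.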
2.2.7. Embedding into HSTs
A starting point of many of the employed k-server algorithms is embedding the input metric space X into a distribution μ over σ-hierarchically well-separated trees (HSTs), with separation σ = Θ(log|X| log(k log|X|)), where |X| denotes the number of points. It has been shown that solving the problem on HSTs suffices, as any finite metric space can be embedded into a probability distribution over HSTs with low distortion Fakcharoenphol et al. (2003). The distortion is of order O(σ log_σ |X|), and the resulting HSTs have depth O(log_σ Δ), where Δ is the diameter of X Bansal et al. (2015).
Given the popularity of the aforementioned method, it is worth examining the size of the resulting trees. Given that the geo-coordinate system is a discrete metric space, we could directly embed it into HSTs. Yet, the size of the space is huge, thus for better discretization we have opted to generate the graph of the street network of NYC. To do so, we used data from openstreetmap.org. Similarly to Santi et al. (2014), we filtered the streets, selecting only primary, secondary, tertiary, residential, unclassified, road, and living street classes, using those as undirected edges and street intersections as nodes. The resulting graph for NYC contains 66543 nodes, and 95675 edges (5018, and 8086 for Manhattan). Given that graph, we generate the HSTs Santi et al. (2014).
3. Employed Algorithms & Challenges
Figure 2. The three separate components of the DRSFR problem.
3.1. Problem Decomposition
The DRSFR problem can be decomposed into (Figure 2):
(a) Request-to-request matching to create a (shared) ride,
(b) Ride-to-taxi matching,
(c) Relocation of the idle fleet.
Complexity issues make the simultaneous consideration of all three problems when taking a decision impractical. Instead, a more realistic approach is to tackle each component individually, under minimum consideration of the remaining two. Each of the aforementioned components is a significant problem in its own right. Step (a) refers to the problem of Online Maximum Weight Matching with Delays: given a non-bipartite graph, nodes appear in an online manner, and leave after some time-steps. Nodes can be matched only while being present, and the goal is to maximize the cumulative utility over a finite time horizon Ashlagi et al. (2019); Emek et al. (2016). Step (b) can be viewed either as an Online Maximum Weight Bipartite Matching with Delays Ashlagi et al. (2017) (the difference here being that the graph is bipartite, with rides on one side and taxis on the other) or as a k-Taxi Problem Coester and Koutsoupias (2018) (and by extension as a k-Server Problem Koutsoupias (2009); Koutsoupias and Papadimitriou (1995)). In the latter formulation, k taxis (servers) have to move to serve all the requests while minimizing the total distance traveled, with the difference being that each will end up at the destination of the served request. Finally, step (c) can be viewed either as the k-center problem or the more general k-Facility Location Problem Guha and Khuller (1999), concerned with the optimal placement of facilities (taxis) to minimize transportation costs, or as an Online Maximum Weight Matching problem, performed on the history of requests. Given the high complexity of the former problems (they are both NP-hard, in fact APX-hard Hsu and Nemhauser (1979); Feder and Greene (1988)), we have opted for the latter interpretation.
3.2. Algorithms
We have evaluated a variety of approaches ranging from offline maximum weight matching (MWM) and greedy solutions (run in batches, or just-in-time (JiT)), to online MWM, k-Taxi/Server algorithms, and linear programming. We solve each of the three steps of the DRSFR problem (Figure 2) individually. When possible, we use the same algorithm for both steps (a) and (b). k-Taxi/Server algorithms, though, cannot solve step (a), thus we opted to use the best performing algorithm for step (a) (namely the offline MWM run in batches). Step (c) was treated separately; due to the computational complexity of most of the evaluated approaches, we opted to evaluate only the most promising solutions. In what follows, we start by describing the employed approaches for the ridesharing part, i.e., steps (a) and (b). In Section 4 we describe the employed techniques for dynamic relocation.
Matching Graphs:
At time t, let G_a = (R_t, E^t_a), where E^t_a denotes the weighted edges between requests. With a slight abuse of notation, let δ(s_{r_1}, s_{r_2}, d_{r_1}, d_{r_2}) denote the minimum distance required to serve both r_1 and r_2 (as a shared ride, i.e., excluding the case of first serving one of them and then the other) with a single taxi located either in s_{r_1} or s_{r_2}. The weight w_{r_1,r_2} of an edge (r_1, r_2) ∈ E^t_a is defined as w_{r_1,r_2} = δ(s_{r_1}, d_{r_1}) + δ(s_{r_2}, d_{r_2}) − δ(s_{r_1}, s_{r_2}, d_{r_1}, d_{r_2}) (similarly to Danassis et al. (2019); Alonso-Mora et al. (2017)). If r_1 = r_2, let w_{r_1,r_2} = 0 (single passenger ride). Intuitively, this number represents an approximation of the travel distance saved by matching requests r_1 and r_2 (it is impossible to know in advance the location of the taxi that will serve the ride). [Footnote 9: It also ensures that the shared ride will cost less than the single ride option.]
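The edge weight on G_a can be illustrated with a short sketch. It enumerates the four routes that start at one of the two pick-up locations and collect both passengers before the first drop-off, takes the shortest, and subtracts it from the sum of the two direct trips. Points are assumed to be planar (x, y) tuples and the planar L1 distance is used only to keep the example self-contained; the function names are hypothetical.

```python
def manhattan(a, b):
    """Planar L1 distance between two (x, y) points."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def route_length(points, dist):
    """Total length of a route visiting the points in order."""
    return sum(dist(p, q) for p, q in zip(points, points[1:]))


def shared_distance(s1, d1, s2, d2, dist=manhattan):
    """delta(s_r1, s_r2, d_r1, d_r2): shortest route serving both requests
    as a shared ride, i.e., both pick-ups happen before any drop-off."""
    routes = [
        (s1, s2, d1, d2), (s1, s2, d2, d1),
        (s2, s1, d1, d2), (s2, s1, d2, d1),
    ]
    return min(route_length(route, dist) for route in routes)


def pairing_weight(s1, d1, s2, d2, dist=manhattan):
    """w_{r1,r2}: approximate travel distance saved by sharing the ride."""
    return dist(s1, d1) + dist(s2, d2) - shared_distance(s1, d1, s2, d2, dist)
```

For example, a request from (0, 0) to (10, 0) paired with a nested request from (1, 0) to (9, 0) gives a weight of 8, whereas two requests travelling in opposite directions along the same segment give a weight of 0, i.e., no saving.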
Offline algorithms (e.g., MWM, ALMA, Greedy) can be run either in a just-in-time manner – i.e., when a request becomes critical – or in batches (given that our dataset has granularity of 1 minute, we run in batches of 1, and 2 minutes).
Similarly, at time t, let G_b = (P_t ∪ V_t, E^t_b), where E^t_b denotes the weighted edges between rides and taxis. With a slight abuse of notation, let δ(s_v, s_{r_1}, s_{r_2}, d_{r_1}, d_{r_2}) denote the minimum distance required (out of all the possible pick-up and drop-off combinations) to serve both r_1 and r_2 with a single taxi located at s_v. The weight w_{ρ,v} of an edge (ρ, v) ∈ E^t_b between a ride ρ = ⟨r_1, r_2⟩ and a taxi v is defined as w_{ρ,v} = 1/δ(s_v, s_{r_1}, s_{r_2}, d_{r_1}, d_{r_2}). If r_1 = r_2 (single passenger ride), let δ(s_v, s_{r_1}, s_{r_2}, d_{r_1}, d_{r_2}) = δ(s_v, s_{r_1}, d_{r_1}). We run the offline algorithms every time the set of rides (P_t) is not empty.
3.2.1. Maximum Weight Matching (MWM)
To match requests into shared rides (step (a) of the DRSFR problem), we find the maximum weight matching on G_a. To match rides with taxis (step (b)), we find the maximum weight matching on G_b. In both cases we use the blossom algorithm Edmonds (1965). Not surprisingly, MWM results in high quality allocations, but that comes with an overhead in running time, compared to simpler, ‘local’ solutions (see Section 5). This is because blossom’s running time, on a graph (V,E), is O(|E||V|^2), and we have to run it three times, one for each step of the DRSFR problem. Additionally, the MWM algorithm inherently requires a global view of the whole request set in a time window, and is therefore not a good candidate for the fast, decentralized solutions that are more appealing for real-life applications.
3.2.2. ALtruistic MAtching Heuristic (ALMA) Danassis et al. (2019)
ALMA is a recently proposed lightweight heuristic for weighted matching. A distinctive characteristic of ALMA is that agents (in our context: requests / rides) make decisions locally, based solely on their own utilities. While contesting for a resource (in our context: request / taxi), each agent will back off with a probability that depends on their own utility loss of switching to their respective remaining resources. Agents that do not have good alternatives will be less likely to back off and vice versa. It is inherently decentralized, requires only a 1-bit partial feedback, and, under reasonable assumptions on the preference domain of the agents, has a running time that is constant in the total problem size. Thus, it is an ideal candidate for an on-device solution. Moreover, in Danassis et al. (2019) it was shown to achieve high quality results on a simpler version of step (a) of the DRSFR problem.
3.2.3. Greedy
Greedy is a very simple algorithm, which selects a node i randomly, and matches it with j = argmax_j(w_{i,j}). Greedy approaches are appealing, not only due to their low complexity, but also because real-time constraints dictate short planning windows, which diminish the benefit of batch optimization solutions compared to myopic approaches Widdows et al. (2017). [Footnote 10: Widdows et al. (2017) reports that GrabShare’s scheduling component has used an entirely greedy approach to allocate bookings to drivers. Lyft also started with a greedy system Brown ([n. d.]a).]
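The greedy rule just described can be sketched for step (a) as follows. Nodes are visited in random order and each is matched with the still-unmatched node of highest edge weight. Two details are assumptions of this example rather than statements from the text: candidates with non-positive weight are ignored (sharing would save nothing), and a node without a positive-weight candidate is returned as a single ride ⟨r, r⟩. The names `greedy_matching`, `weight`, and `rng` are hypothetical.

```python
import random


def greedy_matching(nodes, weight, rng=None):
    """Randomized greedy matching on a general (non-bipartite) graph.

    nodes:  iterable of hashable request identifiers
    weight: symmetric function weight(i, j) -> float
    Returns a list of pairs; (i, i) denotes a single ride.
    """
    rng = rng or random.Random()
    order = list(nodes)
    rng.shuffle(order)  # "selects a node i randomly"
    unmatched = set(order)
    matching = []
    for i in order:
        if i not in unmatched:
            continue  # already taken as the partner of an earlier node
        unmatched.remove(i)
        candidates = [j for j in unmatched if weight(i, j) > 0]
        if not candidates:
            matching.append((i, i))  # no beneficial partner: single ride
            continue
        j = max(candidates, key=lambda c: weight(i, c))  # j = argmax_j w_{i,j}
        unmatched.remove(j)
        matching.append((i, j))
    return matching
```

Each node is handled once and scans the remaining unmatched nodes, so the sketch runs in quadratic time in the number of requests in the batch.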
Similar to MWM, the algorithm of Bei and Zhang (2018) is a recently proposed offline algorithm for solving steps (a) and (b) of the DRSFR problem. It takes a two-phase approach: first, it matches requests to shared rides using minimum weight matching, with weights based on the shortest distance to serve any request pair but on the worst pickup choice. Then it matches rides to taxis using again minimum weight matching, taking the weight to be the distance to the closest pick-up location of the two. The authors prove a worst-case approximation guarantee of 2.5 for the algorithm.
3.2.5. Postponed Greedy (PG) Ashlagi et al. (2019)
This is another very recently proposed, 1/4-competitive algorithm for maximum weight online matching with deadlines (step (a) of the DRSFR problem). Contrary to our setting, it was designed for fixed deadlines, i.e., k_r = c, ∀r ∈ R. When a request r appears, the algorithm creates a virtual seller and a virtual buyer for that request. The buyer matches greedily with the best seller so far, and the choice remains fixed. When r becomes critical, its role is randomly finalized as either a seller or a buyer. If r is a seller, and a subsequent buyer was matched with r, the match is finalized. The major difference is that in our setting requests become critical out of order, and a critical request cannot be matched later. Thus, when a request becomes critical, if it is determined to be a seller, the match is finalized (if one has been found); otherwise the request is treated as a single ride.
3.2.6. GD
An online algorithm for solving the Min-cost (Bipartite) Perfect Matching with Delays, i.e., both steps (a) and (b) of the DRSFR problem, based on the popular Primal-Dual technique Goemans and Williamson (1997). The weight (cost) of an edge in this setting includes arrival times as well, specifically $w_{r_1,r_2} = (\delta(s_1,s_2)+\delta(d_1,d_2))/u_{average} + |t_1 - t_2|$, where $u_{average}$ is the average speed (see Section 2.2.2). The algorithm partitions all the requests into active sets, starting with the singleton {r} for a newly arrived request r. At every time-step these active sets grow, until the weights of the edges between different active sets make the dual constraints of the problem tight. Then the active sets merge, and the algorithm matches as many pairs of free requests in these sets as possible. It is an O(|R|)-competitive algorithm that works with infinite metric spaces, potentially making it better suited for applications like the DRSFR problem. Yet, it does not take into account the willingness to wait ($k_r$), and thus misses matches of requests that became critical. Despite it being designed for bipartite matchings as well, we opted not to use it for step (b), since it would require creating a new node every time a taxi drops off a ride and becomes available.
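As a minimal illustration of the edge weight defined above, the following sketch computes $w_{r_1,r_2}$ for a pair of requests. The `Request` tuple, the Manhattan-distance stand-in for δ, and the numeric values in the usage note are assumptions made for the example; they are not taken from the evaluated implementation.

```python
from typing import Callable, NamedTuple, Tuple

Point = Tuple[float, float]


class Request(NamedTuple):
    source: Point        # pick-up location s
    destination: Point   # drop-off location d
    time: float          # arrival time t


def manhattan(a: Point, b: Point) -> float:
    """Hypothetical stand-in for the distance function delta."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def edge_weight(r1: Request, r2: Request, avg_speed: float,
                dist: Callable[[Point, Point], float] = manhattan) -> float:
    """Travel-time term plus the absolute difference in arrival times."""
    travel = (dist(r1.source, r2.source)
              + dist(r1.destination, r2.destination)) / avg_speed
    return travel + abs(r1.time - r2.time)
```

For instance, with sources 7 units apart, destinations 5 units apart, an average speed of 6 units per time-step, and arrivals two time-steps apart, the weight is (7 + 5) / 6 + 2 = 4.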
3.2.7. Balance (Bal)
Balance is a simple, classical k-server algorithm from the competitive analysis literature. A ride is served by the taxi that minimizes the sum of the distance it has traveled so far and its distance to the source of the ride (chosen randomly between the sources of the two requests composing the ride). It is min-max fair, i.e., it greedily minimizes the maximum accumulated distance among the taxis. The competitive ratio of the algorithm is |X|−1 in arbitrary metric spaces with |X| points Manasse et al. (1990).
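The selection rule of Balance can be sketched as follows. This is a simplified illustration under stated assumptions: taxis live in the Euclidean plane, and only the drive to the pick-up point is added to the accumulated distance (the trip itself is ignored). The function name `balance_assign` is hypothetical.

```python
import math
import random
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def balance_assign(positions: List[Point], travelled: List[float],
                   ride_sources: Sequence[Point], rng: random.Random) -> int:
    """Serve one ride with the Balance rule; return the chosen taxi index."""
    # The source is picked at random between the two requests of the ride.
    source = rng.choice(list(ride_sources))
    # Cost of taxi i: distance driven so far plus distance to the source.
    costs = [travelled[i] + math.dist(pos, source)
             for i, pos in enumerate(positions)]
    chosen = min(range(len(costs)), key=costs.__getitem__)
    # The chosen taxi drives to the source (simplification: trip not counted).
    travelled[chosen] = costs[chosen]
    positions[chosen] = source
    return chosen
```

Because the accumulated distance enters the cost, a taxi that has already driven a lot is passed over in favor of a less used one, even if the latter is farther from the pick-up point.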
3.2.8. Harmonic (Har)
Harmonic is another classical randomized algorithm from the k-server literature, which is simple and memoryless. It matches a taxi with a ride with probability inversely proportional to the taxi’s distance from the ride’s source (chosen randomly between the sources of the two requests composing the ride). The trade-off for the simplicity is the high competitive ratio, which is $O(2^{|V|}\log|V|)$ Bartal and Grove (2000).
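The randomized choice made by Harmonic can be illustrated with a short sketch. Euclidean distances and the handling of a taxi that is already at the source (it serves the ride directly, avoiding a division by zero) are assumptions of this example.

```python
import math
import random
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def harmonic_probabilities(distances: Sequence[float]) -> List[float]:
    """Probabilities inversely proportional to strictly positive distances."""
    inverse = [1.0 / d for d in distances]
    total = sum(inverse)
    return [w / total for w in inverse]


def harmonic_pick(positions: Sequence[Point], source: Point,
                  rng: random.Random) -> int:
    """Pick the index of the taxi that serves a ride starting at `source`."""
    distances = [math.dist(p, source) for p in positions]
    for i, d in enumerate(distances):
        if d == 0.0:
            return i  # assumption: a taxi already at the source serves it
    probs = harmonic_probabilities(distances)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]
```

With two taxis at distances 1 and 3 from the source, the closer taxi is chosen with probability 0.75 and the farther one with probability 0.25.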
3.2.9. Double Coverage (DC)
Double-coverage is one of the two most famous k-server algorithms in the literature. The algorithm is designed for HSTs and extends to general finite metric spaces X via HST embeddings. First we perform the embedding Bartal (1996); Fakcharoenphol et al. (2004) and then, to determine which taxi will serve a ride, all unobstructed taxis move towards its source (chosen randomly between the sources of the two requests composing the ride) with equal speed. When, during this process, a taxi becomes obstructed (i.e., its path is blocked by another taxi), it stops, while the rest keep moving. When a taxi reaches the leaf with the ride, the process stops, and each taxi maintains its position on the HST. Given that only leaves correspond to locations on X, we chose to implement the lazy version of the algorithm (which is equivalent to the original definition; see, e.g., Koutsoupias (2009)), i.e., only the taxi serving the request will move on X. This is also in line with the main goal of minimizing the distance driven. The algorithm is k-competitive on all tree metrics Chrobak and Larmore (1991a).
3.2.10. Work Function (WFA) Chrobak and Larmore (1991b); Koutsoupias and Papadimitriou (1995)
WFA is perhaps the most important k-server algorithm, as it provides the best competitive ratio to date, due to the celebrated result of Koutsoupias and Papadimitriou (1995). It is a dynamic programming approach which, intuitively, computes the optimal solution until time t−1, plus a greedy cost for switching taxi locations. An obvious obstacle that makes the algorithm intractable in practice is that the complexity grows from step to step, resulting in computation and/or memory issues. We used an efficient implementation based on network flows, as described in Rudec et al. (2013). Yet, as the authors of Rudec et al. (2013) also state, the only practical way of using the WFA is to switch to its window version, w-WFA, where we only optimize for the last w rides. Even though the complexity of w-WFA does not change between time-steps, it does change with the number of taxis. The resulting network has 2|P|+2|V|+2 nodes, and we have to run the Bellman–Ford algorithm Bellman (1958) at least once to compute the potential of the nodes and make the costs positive (Bellman–Ford runs in O(|P||V|)). We refer the reader to Bertsekas (1998) for more details on network optimization. As before, the source of the ride is chosen randomly between the sources of the two requests composing the ride.
3.2.11. k-Taxi
This is a very recent algorithm for the k-taxi problem, which provides the best possible competitive ratio for the problem. The algorithm operates on HSTs, where the rides and taxis at any time are placed at its leaves. First, it generates a Steiner tree that spans the leaves that have taxis or rides, and then uses this tree to schedule rides, by simulating an electrical circuit. In particular, whenever a ride appears at a leaf, the algorithm interprets the edges of the tree with length R as resistors with resistance R, which determine the fraction of the current flow that will be routed from the node corresponding to the taxi towards the ride. These fractions are then interpreted as probabilities which determine which taxi will be chosen to pick up the ride.
3.2.12. High Capacity (HC) Alonso-Mora et al. (2017)
This is a highly cited approach, and the only one among those we evaluate that addresses vehicle relocation (step (c)). Contrary to our approach, the authors tackle steps (a) and (b) simultaneously, leaving step (c) as a separate sub-problem. Their method consists of five steps: (i) computing a pairwise request-vehicle shareability graph (RV-graph) Santi et al. (2014), (ii) computing a graph of feasible trips and the vehicles that can serve them (RTV-graph), (iii) computing a greedy solution for the RTV-graph, (iv) solving an ILP to compute the best assignment of vehicles to trips, using the previously computed greedy solution as an initial solution, and finally (v, optional) if any requests remain unassigned, solving an ILP to optimally assign them to idle vehicles based on travel times. Given the ILP formulation, this is the most promising approach in terms of solution quality, but it is not scalable, and is effectively impractical to apply in the real world. In the worst case, the number of variables in the ILP is $O(|V||R|^2)$, which results in 27 - 216 million variables, given that at every time-step we have approximately 300 - 600 requests and as many taxis, and the number of constraints is |V|+|R|. This makes it hard to even compute the initial greedy solution in real time. The authors of Alonso-Mora et al. (2017) circumvent this problem by enforcing delay constraints, but in our modeling every algorithm has to serve all requests, resulting in a prohibitively large ILP. We use IBM-CPLEX Bliek et al. (2014) to solve the resulting ILPs.
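The size estimate above can be reproduced with a few lines of arithmetic. The sketch below simply evaluates the worst-case bound $|V||R|^2$ and the constraint count $|V|+|R|$ for the two ends of the quoted range; the function names are hypothetical.

```python
def ilp_variable_bound(num_vehicles: int, num_requests: int) -> int:
    """Worst-case number of ILP variables, |V| * |R|^2."""
    return num_vehicles * num_requests ** 2


def ilp_constraint_count(num_vehicles: int, num_requests: int) -> int:
    """Number of ILP constraints, |V| + |R|."""
    return num_vehicles + num_requests


# Roughly 300 to 600 requests per time-step, and as many taxis.
for n in (300, 600):
    print(n, ilp_variable_bound(n, n), ilp_constraint_count(n, n))
```

For 300 requests and taxis the bound is 27 million variables, and for 600 it is 216 million, matching the range stated in the text.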
3.2.13. Baseline: Single Ride
Uses MWM to schedule the serving of single rides to taxis (there is no ride sharing, i.e., we omit step (a) of the DRSFR problem).
3.2.14. Baseline: Random
Makes random matches, provided that the edge weight is non-negative.
While our evaluation contains many recently proposed algorithms for matching, the observant reader might notice that, with the exception of k-taxi, our k-server algorithms are from the classical literature. We did consider more recent k-server algorithms (e.g., Dehghani et al. (2017); Lee (2018); Bansal et al. (2015)), but their complexity turns out to be prohibitive. This is mainly because they proceed via an ‘online rounding’ of an LP-relaxation of the problem, which maintains a variable for every (time-step, point in the metric space) pair. Even for one hour (3600 time-steps) and only for our discretization of Manhattan (5018 nodes), we need more than 18 million variables (230 million for NYC).
4. Dynamic Vehicle Relocation
The aim of any relocation strategy is to improve the spatial allocation of supply. Serving requests redistributes the taxis, resulting in an inefficient allocation. One can assume a ‘lazy’ approach, relocating vehicles only to serve requests. While this minimizes the cost of serving a request (e.g., distance driven, fuel, etc.), it results in sub-optimal QoS. Improving the QoS (especially the time to pick-up, since it highly correlates with passenger satisfaction, see Section 2.1.2) plays a crucial role in the growth of a company. The goal is to:
Improve the QoS metrics, while minimizing the excess distance driven.
There are two ways to enforce relocation: passive and active. Ridesharing platforms, like Uber and Lyft, have implemented market-driven pricing as a passive form of relocation. Counterfactual analysis performed in Buchholz (2018) shows that implementing pricing rules can result in daily net surplus gains of up to 232000 and 93000 additional daily taxi-passenger matches. While the gains are substantial, the market might be slow to adapt, and drivers and passengers do not always follow equilibrium policies. Contrary to that, our approach is active, in the sense that we directly enforce relocation. Moreover, we adopt a more anthropocentric approach: in our setting, the demand is fixed, thus the goal is not to increase revenue as a result of serving more rides, but rather to improve QoS (note, however, that decreased delays can improve revenue by serving more requests in a fixed time).
There are many ways to approach dynamic relocation (part (c) of the DRSFR problem). High Capacity Alonso-Mora et al. (2017) solves an ILP, which could reach high quality results, but it is neither scalable nor practical. Ideally, we would like a solution that can run on-device. The k-server algorithms perform an implicit relocation, yet they are primarily developed for adversarial scenarios, and do not utilize the plethora of historic data (NYC TLC has been providing data on yellow taxi trips since 2009). In reality, requests follow patterns that emerge due to human habits (e.g., during the first half of the day in Manhattan, there are many more drop-offs in Midtown compared to pickups Buchholz (2018)). Density-based clustering Xu and Tian (2015) is a natural approach, yet, due to the vast number of requests, the only discernible clusters were of large regions (Manhattan, Bronx, Staten Island, Brooklyn, or Queens), which does not allow for fine-grained relocation. Given the high density of the requests, and the low frictions of the taxis, we opted for a simple, fine-grained, matching approach.
Table 1. Employed algorithms for each step of the DRSFR.
4.1. Patterns in Customer Requests
To confirm the existence of transportation patterns, we performed the following analysis: for each request $r$ on January 15, we searched the past three days for requests $r'$ such that $|t_r - t_{r'}| \leq 10$, $\delta(s_r, s_{r'}) \leq 250$, and $\delta(d_r, d_{r'}) \leq 250$. The results are depicted in Figure 3. On average, 13.3% of the trips are repeated across all three previous days, peaking at 43.7% during rush hours (e.g., 6-8 in the morning).
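The analysis above can be sketched as follows. The `Trip` record, the Euclidean stand-in for δ, and the interpretation of the thresholds (same unit as the stored times and coordinates) are assumptions of this example; a trip counts as repeated only if a similar trip exists on each of the previous days.

```python
import math
from typing import NamedTuple, Sequence, Tuple

Point = Tuple[float, float]


class Trip(NamedTuple):
    day: int            # calendar day index
    time: float         # time of day
    source: Point
    destination: Point


def similar(r: Trip, other: Trip, max_dt: float = 10.0,
            max_dist: float = 250.0) -> bool:
    """Close in time of day, pick-up location, and drop-off location."""
    return (abs(r.time - other.time) <= max_dt
            and math.dist(r.source, other.source) <= max_dist
            and math.dist(r.destination, other.destination) <= max_dist)


def repeated_on_all_days(r: Trip, history: Sequence[Trip],
                         num_days: int = 3) -> bool:
    """True if a similar trip exists on each of the previous `num_days` days."""
    return all(
        any(o.day == r.day - k and similar(r, o) for o in history)
        for k in range(1, num_days + 1)
    )


def repeat_rate(trips: Sequence[Trip], history: Sequence[Trip],
                num_days: int = 3) -> float:
    """Fraction of `trips` repeated across all previous `num_days` days."""
    if not trips:
        return 0.0
    hits = sum(repeated_on_all_days(r, history, num_days) for r in trips)
    return hits / len(trips)
```

Scanning the whole history for every trip is quadratic; for a full day of requests, an index by day and time bucket would be needed, which is omitted here for brevity.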
4.2. Proposed Approach
We propose a fine-grained, weighted matching relocation scheme, which is plug-and-play and can be combined with any algorithm for steps (a) and (b). Given the existence of transportation patterns, we use the history to predict a set of expected future requests. Specifically, let $D$ and $T$ be the sampling windows, in days and minutes respectively (we used $D=3$ and $T=2$), and let $t$ denote the current time-step. The set of past requests in our sampling window is $R_{past}=\{r : t_r - t \leq T\}$, provided that $r$ appeared at most $D$ days prior to $t$. The set of expected future requests $R_{future}$ is generated by sampling from $R_{past}$. Relocation is performed in a just-in-time (JiT) manner, every time the set of idle vehicles is not empty. We generate matching graphs similar to those of Section 3.2, and then proceed to match requests to shared rides, and rides to idle taxis. The set of nodes of $G_a$ is now $R_{future} \cup R_t$. Finally, each idle taxi starts moving towards the source of its match (given that these are expected rides, the source is picked at random between the sources of the two requests composing the ride).
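A minimal sketch of how the set of expected future requests could be built from the history is given below. Several details are assumptions of this example and not taken from the text: the window is read as "requests from the previous D days whose time of day falls within the next T minutes", the number of sampled requests is a free parameter (`sample_size`), and the `PastRequest` record is hypothetical.

```python
import random
from typing import List, NamedTuple, Sequence, Tuple

Point = Tuple[float, float]


class PastRequest(NamedTuple):
    day: int        # calendar day index
    minute: int     # minute of the day
    source: Point
    destination: Point


def past_window(history: Sequence[PastRequest], day: int, minute: int,
                D: int = 3, T: int = 2) -> List[PastRequest]:
    """Requests of the previous D days within T minutes after `minute`."""
    return [r for r in history
            if 1 <= day - r.day <= D and 0 <= r.minute - minute <= T]


def expected_future(history: Sequence[PastRequest], day: int, minute: int,
                    sample_size: int, rng: random.Random,
                    D: int = 3, T: int = 2) -> List[PastRequest]:
    """Sample expected future requests from the past window."""
    pool = past_window(history, day, minute, D, T)
    return rng.sample(pool, min(sample_size, len(pool)))
```

The sampled requests would then be added to the current requests before running the matching of steps (a) and (b) for the idle taxis.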
We use the MWM, ALMA, and Greedy algorithms for the weighted matching. It is worth noting that we evaluated different approaches for the edge weights of $G_b$ (ride-taxi matching). The best performing one in our scenario was the inverse of the distance, which makes sense given that we want to improve QoS, while minimizing the extra distance driven. Yet, depending on the objectives, one might use different weights (e.g., the expected profit).
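To make the inverse-distance weighting concrete, the sketch below matches idle taxis to expected ride sources with a standard greedy maximum-weight heuristic (repeatedly take the heaviest remaining edge whose endpoints are both free). Only the Greedy variant is shown; the Euclidean distance and the small `eps` guard against division by zero are assumptions of this example.

```python
import math
from typing import Dict, Sequence, Tuple

Point = Tuple[float, float]


def inverse_distance_weight(taxi: Point, source: Point,
                            eps: float = 1e-9) -> float:
    """Edge weight: inverse of the taxi-to-source distance."""
    return 1.0 / (math.dist(taxi, source) + eps)


def greedy_relocation(idle_taxis: Sequence[Point],
                      ride_sources: Sequence[Point]) -> Dict[int, int]:
    """Map taxi index -> ride index, picking heaviest edges first."""
    edges = sorted(
        ((inverse_distance_weight(t, s), i, j)
         for i, t in enumerate(idle_taxis)
         for j, s in enumerate(ride_sources)),
        reverse=True,
    )
    used_taxis, used_rides = set(), set()
    assignment: Dict[int, int] = {}
    for _, i, j in edges:
        if i not in used_taxis and j not in used_rides:
            assignment[i] = j
            used_taxis.add(i)
            used_rides.add(j)
    return assignment
```

Each matched taxi would then start moving towards the source it was assigned; taxis left unmatched stay where they are.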
Table 1 presents all of the evaluated algorithms, subdivided into the three parts of the DRSFR problem. For example, k-server algorithms cannot solve step (a), thus we use the best performing algorithm for step (a), i.e., MWM. The same holds for PG and GD with respect to step (b).
In this section we present the results of our evaluation. For every metric we report the average value out of 8 runs. We briefly detail the most relevant results. Please refer to Appendix A for the complete results, including larger test-cases on the broader NYC area, as well as omitted metrics, standard deviation values, algorithms (e.g., WFA and HC had to be evaluated in smaller test-cases), etc.
We first present our results for one hour and the base number of taxis (see Section 2.2.2) (Figure 4). Then we show that the results are robust at a larger time-scale (one day, Figure 5), and for a varying number of vehicles (2138 - 12828, Figure 6). Finally, we present results on step (c) of the DRSFR problem: dynamic relocation (Table 2, Figure 7).
Distance Driven: In the small test-case (Figure 4(a)) MWM performs the best, followed by Bal (+7%). ALMA comes next (+19%), and then Greedy (+21%). The high performance of Bal in this metric is because it uses MWM for step (a), which has a more significant impact on the distance driven. Similar results are observed for the whole day (Figure 5(a)), with Bal, ALMA, and Greedy achieving +4%, +18%, and +22% compared to MWM, respectively. Figure 6(a) shows that as we decrease the number of taxis, Bal loses its advantage, Greedy pulls away from ALMA (9% worse than ALMA), while ALMA closes the gap to MWM (+17%).
Complexity: To estimate the complexity, we measured the elapsed time of each algorithm. Greedy is the fastest one (Figure 4(b)), closely followed by Har, Bal, and ALMA. ALMA is inherently decentralized. The red overlay denotes the parallel time for ALMA, which is 2.5 orders of magnitude faster than Greedy.
Time to Pick-up: MWM exhibits an exceptionally low time to pick-up (Figure 4(c)), lower than the single ride baseline. ALMA, Greedy, and Bal have +69%, +76%, and +33% compared to MWM, respectively. As before, Figure 6(b) shows that as we decrease the number of taxis, Bal loses its advantage, and Greedy pulls further away from ALMA. Note that to improve visualization, we removed DC’s pick-up time as it was one order of magnitude larger than Appr.
Delay: PG exhibits the lowest delay (Figure 4(d)), but this is because it makes 26% fewer shared rides than the rest of the high performing algorithms. Among the rest, ALMA has the smallest delay (−13% compared to MWM), with Greedy following at −1%, while Bal has +63% (both compared to MWM). As the number of taxis decreases (Figure 6(c)), ALMA’s gains increase further (−22% compared to MWM).
Figure 6(d) depicts the cumulative delay, which is the sum of all delays described in Section 2.1.2, namely the time to pair, time to pair with taxi, time to pick-up, and delay. An interesting observation is that reducing the fleet size from 12828 (3.0) to just 3207 (0.75) vehicles (a 75% reduction) results in only approximately 2 minutes of additional delay. This goes to show the great potential for efficiency gains such technologies have to offer.
Profit & Frictions:
Contrary to their performance in QoS metrics, GD and Appr achieve the highest driver profit, 12% and 8% higher than MWM, respectively (although the low QoS and increased distance driven suggest low quality matchings, which can explain the higher revenue, yet renders them undesirable). Bal and Har follow with +2-3%. ALMA and Greedy achieve similar profit to MWM. PG exhibits significantly worse results (−13%), due to the lower number of shared rides it matches.
Small differences in driver profit can have a significant impact on the platform’s profit. There are 13587 taxis in NYC, 67-85% of which are on the road at any one time (i.e., 9103 - 11549 taxis). The additional 2% profit of Bal translates to $32.3 additional revenue per driver in a day. Multiplied by the total number of taxis, and assuming that the platform keeps 25% as commission, this results in $73506 - $93258 additional revenue per day for the platform.
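The back-of-the-envelope estimate above can be checked with the following short computation. All input figures are the ones quoted in the text; the function name and the rounding of the number of taxis to the nearest integer are assumptions of this sketch.

```python
from typing import Tuple

TOTAL_TAXIS = 13587           # taxis in NYC
ON_ROAD_SHARE = (0.67, 0.85)  # fraction of taxis on the road
EXTRA_PER_DRIVER = 32.3       # additional daily revenue per driver, in $
COMMISSION = 0.25             # share kept by the platform


def platform_gain(share: float) -> Tuple[int, float]:
    """Return (taxis on the road, additional daily platform revenue)."""
    taxis = round(TOTAL_TAXIS * share)
    return taxis, taxis * EXTRA_PER_DRIVER * COMMISSION


for share in ON_ROAD_SHARE:
    taxis, gain = platform_gain(share)
    print(f"{share:.0%}: {taxis} taxis, ${int(gain)} per day")
```

Truncating the gains to whole dollars gives $73506 for 9103 taxis and $93258 for 11549 taxis, in agreement with the range reported in the text.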
Figure 5(b) also depicts the maximum (red dot) and minimum (green dot) value of a driver’s profit. A maximum value closer to the mean suggests a fairer algorithm for the drivers. Moreover, it is worth noting that the minimum value for all the algorithms is zero, meaning that there are taxis which remain unutilized (in spite of the fact that the number of taxis, in this scenario 5081, is considerably lower than the current fleet size of yellow taxis).
Finally, Figure 5(c) shows the driver frictions. Just like with the profit, k-server algorithms outperform matching algorithms by far. Compared to MWM, Bal and Har achieve a 97% decrease, while ALMA and Greedy achieve a 31% and 23% decrease, respectively. Given that we have a fixed supply, lower frictions indicate a more even distribution of rides amongst taxis.
Time to Pair with Taxi & Number of Shared Rides: Excluding the test-case with the smallest taxi fleet (×0.5 the base number), the time to pair with taxi was zero, or close to zero, for all the evaluated algorithms. This shows the potential for efficiency gains and better utilization of resources using smart technologies. The number of shared rides is approximately the same for all the employed algorithms, with the notable exception of PG, which makes 26% fewer shared rides.
5.0.1. MWM vs. Greedy Approaches:
MWM seems to perform the best in the total distance driven and the QoS metrics, which is reasonable since it makes optimal matches amongst passengers. Yet, MWM is hard to scale and requires a centralized solution. Greedy approaches are appealing, not only due to their low complexity, but also because real-time constraints dictate short planning windows, which would potentially diminish the benefit of batch optimization solutions compared to myopic approaches Widdows et al. (2017). (Widdows et al. (2017) reports that GrabShare’s scheduling component has used an entirely greedy approach to allocate bookings to drivers. Lyft also started with a greedy system Brown ([n. d.]a).)
5.0.2. ALMA vs. Greedy Approaches:
ALMA was inherently developed for multi-agent applications. Agents make decisions locally, using completely uncoupled learning rules, and require only a 1-bit partial feedback Danassis et al. (2019), making it an ideal candidate for an on-device implementation. This is fundamentally different from a decentralized implementation of the Greedy algorithm, for example. Even in decentralized algorithms, the number of communication rounds required grows with the size of the problem. However, in practice the real-time constraints impose a limit on the number of rounds, and thus on the size of the problem that can be solved within them. In fact, ALMA is of a greedy nature as well, albeit it utilizes a more intelligent backing-off scheme; thus, there are scenarios where ALMA significantly outperforms Greedy, as shown by the simulation results. For example, in more challenging scenarios (a smaller taxi fleet, or potentially different types of taxis) the smarter back-off mechanism results in a more pronounced difference.
5.1. Relocation
A crucial trade-off of any relocation scheme is improving the QoS metrics, while minimizing the excess distance driven. Table 2 shows that our proposed scheme successfully balances this trade-off. In particular, ALMA – the best performing overall – radically improves the QoS metrics by more than 50% (e.g., ALMA decreases the pick-up time by 55%, and its standard deviation (SD) by 58%), while increasing the driving distance by only 6%. The cumulative delay is decreased by 43%. Recall that the proposed approach is plug-and-play and can be combined with any algorithm for steps (a) and (b) of the DRSFR problem. Table 2 uses MWM for steps (a) - (b). This provides the evaluated algorithms with a common ground, and allows for fair comparison focused only on the relocation part.
As a final step, we evaluate end-to-end solutions, using MWM, ALMA, and Greedy to solve all three steps of the DRSFR problem. Figure 7 depicts the time to pick-up (error bars denote one SD of uncertainty), a metric highly correlated with the passenger satisfaction level Tang et al. (2017); Brown ([n. d.]b). We compare against the single ride baseline (see Section 3.2.13). Once more, the proposed relocation scheme results in radical improvements, as the time to pick-up drops (compared to the single ride) from +14.09% to −41.76% for MWM, from +74.14% to −9.33% for ALMA, and finally, from +86.10% to −7.97% for Greedy. This shows that utilizing a simple relocation scheme can eliminate the negative effects of ridesharing on the QoS metrics.
| Metric | MWM | ALMA | Greedy |
|---|---|---|---|
| Time to Pick-up | -48.95% | -55.18% | -55.03% |
| Time to Pick-up SD | -52.97% | -58.22% | -58.21% |
| Delay | -15.95% | -17.79% | -17.73% |
| Delay SD | -19.25% | -20.96% | -20.98% |
| Cumulative Delay | -38.37% | -43.23% | -43.11% |
| Total Distance | 5.48% | 6.25% | 6.24% |

Table 2. Relocation Gains.
6. Conclusion
The next technological revolution will be interwoven with the proliferation of intelligent systems. As we bridge the gap between physical and cyber worlds, we will give rise to decentralized, multi-agent based technologies, ideally run on-device. In this paper, we show that a recently proposed heuristic (ALMA), which exhibits such properties, offers an efficient, end-to-end solution for the Dynamic Ridesharing and Fleet Relocation problem.
To gain insight into the problem, it is highly important to evaluate a diverse set of candidate solutions in settings designed to closely resemble reality. To the best of our knowledge, our evaluation setting is one of the largest and most comprehensive to date. Our findings provide a clear-cut recommendation to ridesharing platforms, and validate the capacity for deployment of our proposed approach.
References
Agatz et al. (2011)
Niels Agatz, Alan L Erera, Martin WP Savelsbergh, and Xing Wang. 2011. Dynamic ride-sharing: A simulation study in metro Atlanta. Procedia-Social and Behavioral Sciences 17 (2011), 532–550.
Alonso-Mora et al. (2017)
Javier Alonso-Mora, Samitha Samaranayake, Alex Wallar, Emilio Frazzoli, and Daniela Rus. 2017. On-demand high-capacity ride-sharing via dynamic trip-vehicle assignment. Proceedings of the National Academy of Sciences 114, 3 (2017), 462–467.
Ashlagi et al. (2017)
Itai Ashlagi, Yossi Azar, Moses Charikar, Ashish Chiplunkar, Ofir Geri, Haim Kaplan, Rahul Makhijani, Yuyi Wang, and Roger Wattenhofer. 2017. Min-cost bipartite perfect matching with delays. (2017).
Ashlagi et al. (2019)
Itai Ashlagi, Maximilien Burq, Chinmoy Dutta, Patrick Jaillet, Amin Saberi, and Chris Sholley. 2019. Edge Weighted Online Windowed Matching. In Proceedings of the 2019 ACM Conference on Economics and Computation (EC ’19). ACM, New York, NY, USA, 729–742. https://doi.org/10.1145/3328526.3329573
Banerjee et al. (2017)
Siddhartha Banerjee, Daniel Freund, and Thodoris Lykouris. 2017. Pricing and Optimization in Shared Vehicle Systems: An Approximation Framework. In Proceedings of the 2017 ACM Conference on Economics and Computation. ACM, 517–517.
Bansal et al. (2015)
Nikhil Bansal, Niv Buchbinder, Aleksander Madry, and Joseph Naor. 2015. A polylogarithmic-competitive algorithm for the k-server problem. J. ACM 62, 5 (2015), 1–49.
Bartal (1996)
Yair Bartal. 1996. Probabilistic approximation of metric spaces and its algorithmic applications. In Proceedings of 37th Conference on Foundations of Computer Science. IEEE, 184–193.
Bartal and Grove (2000)
Yair Bartal and Eddie Grove. 2000. The harmonic k-server algorithm is competitive. Journal of the ACM (JACM) 47, 1 (2000), 1–15.
Bei and Zhang (2018)
Xiaohui Bei and Shengyu Zhang. 2018. Algorithms for trip-vehicle assignment in ride-sharing. In Thirty-Second AAAI Conference on Artificial Intelligence.
Bellman (1958)
Richard Bellman. 1958. On a routing problem. Quarterly of applied mathematics 16, 1 (1958), 87–90.
Bertsekas (1998)
Dimitri P Bertsekas. 1998. Network optimization: continuous and discrete models. Athena Scientific Belmont.
Bienkowski et al. (2018)
Marcin Bienkowski, Artur Kraska, Hsiang-Hsuan Liu, and Paweł Schmidt. 2018. A primal-dual online deterministic algorithm for matching with delays. (2018), 51–68.
Bliek et al. (2014)
Christian Bliek, Pierre Bonami, and Andrea Lodi. 2014. Solving mixed-integer quadratic programming problems with IBM-CPLEX: a progress report. In Proceedings of the twenty-sixth RAMP symposium. 16–17.
Borodin and El-Yaniv (2005)
Allan Borodin and Ran El-Yaniv. 2005. Online computation and competitive analysis. Cambridge University Press.
Buchholz (2018)
Nicholas Buchholz. 2018. Spatial equilibrium, search frictions and dynamic efficiency in the taxi industry. Technical Report. mimeo, Princeton University.
Chen et al. (2019)
Mengjing Chen, Weiran Shen, Pingzhong Tang, and Song Zuo. 2019. Dispatching through pricing: modeling ride-sharing and designing dynamic prices. In Proceedings of the 28th International Joint Conference on Artificial Intelligence (IJCAI). 165–171.
Chrobak et al. (1990)
M. Chrobak, H. Karloff, T. Payne, and S. Vishwanathan. 1990. New Results on Server Problems. (1990), 291–300.
Chrobak and Larmore (1991a)
Marek Chrobak and Lawrence L Larmore. 1991a. An optimal on-line algorithm for k servers on trees. SIAM J. Comput. 20, 1 (1991), 144–148.
Chrobak and Larmore (1991b)
Marek Chrobak and Lawrence L Larmore. 1991b. The Server Problem and On-Line Games. On-line algorithms 7 (1991), 11–64.
Danassis et al. (2019)
Panayiotis Danassis, Aris Filos-Ratsikas, and Boi Faltings. 2019. Anytime Heuristic for Weighted Matching Through Altruism-Inspired Behavior. In Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, IJCAI-19. International Joint Conferences on Artificial Intelligence Organization, 215–222. https://doi.org/10.24963/ijcai.2019/31
Dickerson et al. (2018)
John P Dickerson, Karthik A Sankararaman, Aravind Srinivasan, and Pan Xu. 2018. Allocation problems in ride-sharing platforms: Online matching with offline reusable resources. In Thirty-Second AAAI Conference on Artificial Intelligence.
Edmonds (1965)
Jack Edmonds. 1965. Maximum matching and a polyhedron with 0,1-vertices. Journal of research of the National Bureau of Standards B (1965).
Emek et al. (2016)
Yuval Emek, Shay Kutten, and Roger Wattenhofer. 2016. Online matching: haste makes waste! (2016), 333–344.
Fakcharoenphol et al. (2003)
Jittat Fakcharoenphol, Satish Rao, and Kunal Talwar. 2003. A Tight Bound on Approximating Arbitrary Metrics by Tree Metrics. In
Fakcharoenphol et al. (2004)
Jittat Fakcharoenphol, Satish Rao, and Kunal Talwar. 2004. A tight bound on approximating arbitrary metrics by tree metrics. J. Comput. System Sci. 69, 3 (2004), 485–497.
Feder and Greene (1988)
Tomás Feder and Daniel Greene. 1988. Optimal algorithms for approximate clustering. In Proceedings of the twentieth annual ACM symposium on Theory of computing. ACM, 434–444.
Fiat et al. (1994)
Amos Fiat, Yuval Rabani, and Yiftach Ravid. 1994. Competitive k-server algorithms. J. Comput. System Sci. 48, 3 (1994), 410–428.
Goemans and Williamson (1997)
Michel X Goemans and David P Williamson. 1997. The primal-dual method for approximation algorithms and its application to network design problems. Approximation algorithms for NP-hard problems (1997), 144–191.
Guha and Khuller (1999)
Sudipto Guha and Samir Khuller. 1999. Greedy strikes back: Improved facility location algorithms. Journal of algorithms 31, 1 (1999), 228–248.
Hsu and Nemhauser (1979)
Wen-Lian Hsu and George L Nemhauser. 1979. Easy and hard bottleneck location problems. Discrete Applied Mathematics 1, 3 (1979), 209–215.
Huang et al. (2019)
Taoan Huang, Bohui Fang, Xiaohui Bei, and Fei Fang. 2019. Dynamic Trip-Vehicle Dispatch with Scheduled and On-Demand Requests. In The Conference on Uncertainty in Artificial Intelligence (UAI).
Kalyanasundaram and Pruhs (1993)
Bala Kalyanasundaram and Kirk Pruhs. 1993. Online weighted matching. Journal of Algorithms 14, 3 (1993), 478–488.
Karp et al. (1990)
Richard M Karp, Umesh V Vazirani, and Vijay V Vazirani. 1990. An optimal algorithm for on-line bipartite matching. In Proceedings of the twenty-second annual ACM symposium on Theory of computing. ACM, 352–358.
Kosoresow (1997)
Andrew P Kosoresow. 1997. Design and analysis of online algorithms for mobile server applications. (1997).
Koutsoupias and Papadimitriou (1995)
Elias Koutsoupias and Christos H Papadimitriou. 1995. On the k-server conjecture. Journal of the ACM (JACM) 42, 5 (1995), 971–983.
Lee (2018)
James R Lee. 2018. Fusible HSTs and the randomized k-server conjecture. In 2018 IEEE 59th Annual Symposium on Foundations of Computer Science (FOCS). IEEE, 438–449.
Lesmana et al. (2019)
Nixie Lesmana, Xuan Zhang, and Xiaohui Bei. 2019. Balancing Efficiency and Fairness in On-Demand Ridesourcing. In Proceedings of the 33rd Conference on Neural Information Processing Systems (NEURIPS). to appear.
Lowalekar et al. (2019)
Meghna Lowalekar, Pradeep Varakantham, and Patrick Jaillet. 2019. ZAC: A Zone pAth Construction Approach for Effective Real-Time Ridesharing. (2019).
Ma et al. (2019)
Hongyao Ma, Fei Fang, and David C Parkes. 2019. Spatio-Temporal Pricing for Ridesharing Platforms. In Proceedings of the 2019 ACM Conference on Economics and Computation. ACM, 583–583.
Manasse et al. (1988)
Mark Manasse, Lyle McGeoch, and Daniel Sleator. 1988. Competitive algorithms for on-line problems. In Proceedings of the twentieth annual ACM symposium on Theory of computing. ACM, 322–333.
Manasse et al. (1990)
Mark S. Manasse, Lyle A. McGeoch, and Daniel Dominic Sleator. 1990. Competitive algorithms for server problems. J. Algorithms 11, 2 (1990), 208–230.
Raghavan and Snir (1989)
Prabhakar Raghavan and Marc Snir. 1989. Memory versus randomization in on-line algorithms. (1989), 687–703.
Rudec et al. (2013)
Tomislav Rudec, Alfonzo Baumgartner, and Robert Manger. 2013. A fast work function algorithm for solving the k-server problem. Central European Journal of Operations Research 21, 1 (01 Jan 2013), 187–205. https://doi.org/10.1007/s10100-011-0222-7
Santi et al. (2014)
Paolo Santi, Giovanni Resta, Michael Szell, Stanislav Sobolevsky, Steven H Strogatz, and Carlo Ratti. 2014. Quantifying the benefits of vehicle pooling with shareability networks. Proceedings of the National Academy of Sciences 111, 37 (2014), 13290–13294.
Santos and Xavier (2013)
Douglas Oliveira Santos and Eduardo Candido Xavier. 2013. Dynamic taxi and ridesharing: A framework and heuristics for the optimization problem. In Twenty-Third International Joint Conference on Artificial Intelligence.
Sühr et al. (2019)
Tom Sühr, Asia J Biega, Meike Zehlike, Krishna P Gummadi, and Abhijnan Chakraborty. 2019. Two-Sided Fairness for Repeated Matchings in Two-Sided Markets: A Case Study of a Ride-Hailing Platform. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. ACM, 3082–3092.
Tang et al. (2017)
M. Tang, S. Ow, W. Chen, Y. Cao, K. Lye, and Y. Pan. 2017. The Data and Science behind GrabShare Carpooling. In 2017 IEEE International Conference on Data Science and Advanced Analytics (DSAA).
Widdows et al. (2017)
Dominic Widdows, Jacob Lucas, Muchen Tang, and Weilun Wu. 2017. GrabShare: The construction of a realtime ridesharing service. In 2017 2nd IEEE International Conference on Intelligent Transportation Engineering (ICITE). IEEE, 138–143.
Xu and Tian (2015)
Dongkuan Xu and Yingjie Tian. 2015. A Comprehensive Survey of Clustering Algorithms. Annals of Data Science 2, 2 (01 Jun 2015), 165–193. https://doi.org/10.1007/s40745-015-0040-1
Yuen et al. (2019)
Chak Fai Yuen, Abhishek Pratap Singh, Sagar Goyal, Sayan Ranu, and Amitabha Bagchi. 2019. Beyond Shortest Paths: Route Recommendations for Ride-sharing. In The World Wide Web Conference. ACM, 2258–2269.
Appendix A Appendix
In this section we present in detail the results of the evaluation of Section 5 including, but not limited to, larger test-cases (broader NYC area), and the omitted algorithms, graphs, and tables. For every metric we report the average value over 8 runs. The dataset was cleaned to remove requests with a travel time shorter than one minute, or with invalid geo-locations (e.g., outside Manhattan, Bronx, Staten Island, Brooklyn, or Queens).
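The cleaning rule above can be illustrated with a minimal sketch. The record layout (`Request`, `travel_time_s`, `pickup_borough`, `dropoff_borough`) is a hypothetical assumption made for illustration, as is the idea that coordinates have already been mapped to a borough name; it is not the preprocessing code used for the evaluation.

```python
from dataclasses import dataclass
from typing import List, Optional

# Boroughs named in the text as valid locations.
VALID_BOROUGHS = {"Manhattan", "Bronx", "Staten Island", "Brooklyn", "Queens"}


@dataclass
class Request:
    # Hypothetical fields; the real dataset schema may differ.
    travel_time_s: float
    pickup_borough: Optional[str]   # None if the coordinates map to no borough
    dropoff_borough: Optional[str]


def clean(requests: List[Request]) -> List[Request]:
    """Drop requests shorter than one minute or with an invalid location."""
    return [
        r for r in requests
        if r.travel_time_s >= 60
        and r.pickup_borough in VALID_BOROUGHS
        and r.dropoff_borough in VALID_BOROUGHS
    ]


sample = [
    Request(300, "Manhattan", "Brooklyn"),  # kept
    Request(45, "Manhattan", "Manhattan"),  # removed: shorter than one minute
    Request(600, "Queens", None),           # removed: invalid drop-off location
]
cleaned = clean(sample)
```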
Section A.1 08:00 - 09:00 – Manhattan: We begin with our small test-case: one hour (08:00 - 09:00), base number of taxis (i.e., 4276, see Section 2.2.2), limited to Manhattan. Figure 8 and Table 3 depict all the evaluated metrics; the latter also includes the standard deviation of each value. Finally, Table 4 presents the relative difference (percentage of gain or loss) compared to MWM (first line of the table). In what follows, we adhere to the same pattern, i.e., we present two tables for each evaluation: one containing the absolute values, and one presenting the relative difference compared to the algorithm in the first line of the table. We were able to run most of the algorithms in this test-case, except for WFA, which we ran only for {×0.5, ×0.75} the base number of taxis, and HC, which is so computationally heavy that we had to run a separate test-case of only 10 minutes (see Section A.5).
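As a worked illustration of the relative-difference tables, the following sketch computes (algorithm - baseline) / baseline for one cell. The function names are hypothetical. Printing "–" when the baseline is zero is an assumption inferred from the tables, where such cells are left blank. The example uses the rounded Distance Driven values of MWM (1) and ALMA (1) from Table 5, so the result only approximates the entry in Table 6, which was presumably computed from unrounded values.

```python
from typing import Optional


def relative_difference(value: float, baseline: float) -> Optional[float]:
    """Return (value - baseline) / baseline, or None if the baseline is zero."""
    if baseline == 0:
        return None  # undefined; shown as a dash in the tables (assumption)
    return (value - baseline) / baseline


def format_relative(value: float, baseline: float) -> str:
    """Format the relative difference as a percentage with two decimals."""
    diff = relative_difference(value, baseline)
    return "–" if diff is None else f"{diff * 100:.2f}%"


# Distance Driven (m): ALMA (1) = 6.30E+07 vs. MWM (1) = 5.46E+07 (Table 5, rounded).
alma_vs_mwm = format_relative(6.30e7, 5.46e7)
print(alma_vs_mwm)  # about 15.4%, close to the 15.50% reported in Table 6
```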
Offline algorithms (e.g., MWM, ALMA, Greedy) can be run either in a just-in-time (JiT) manner, i.e., when a request becomes critical, or in batches. The following two tables (Tables 5 and 6) evaluate the performance of each algorithm for each option. Given that our dataset has a granularity of one minute, we run in batches of one and two minutes. Moreover, due to the large number of requests, at least one request turns critical in every time-step. Thus, JiT and batches of one minute produced exactly the same results. To allow for the evaluation of every algorithm (except HC), we run the evaluation at a smaller scale, i.e., 2138 taxis ({×0.5} the base number of taxis). These tables also include the results for the WFA algorithm. Every other result presented in this paper assumes the best performing option for each of the algorithms (usually a batch size of two minutes).
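The batch option can be sketched as follows, assuming request arrival times are given in whole minutes (matching the one-minute granularity of the dataset). The function `batch_requests` is a hypothetical helper for illustration only; it groups request indices into consecutive windows and does not model the JiT criticality check or the matching step itself.

```python
from collections import defaultdict
from typing import Dict, List


def batch_requests(arrival_min: List[int], batch_size_min: int) -> List[List[int]]:
    """Group request indices into consecutive windows of batch_size_min minutes."""
    batches: Dict[int, List[int]] = defaultdict(list)
    for idx, t in enumerate(arrival_min):
        # Requests arriving in the same window share a batch key.
        batches[t // batch_size_min].append(idx)
    # Return the batches in chronological order.
    return [batches[key] for key in sorted(batches)]


arrivals = [0, 0, 1, 2, 3, 3]          # arrival minute of each request
one_min = batch_requests(arrivals, 1)  # one batch per time-step
two_min = batch_requests(arrivals, 2)  # two time-steps per batch
```

With a batch size of one minute, every time-step forms its own batch, which is why this option coincides with JiT whenever at least one request turns critical in each time-step.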
Finally, Figure 9 shows that our results are robust to a varying number of vehicles (2138 - 12828).
Section A.2 00:00 - 23:59 (full day) – Manhattan: We continue by showing that the results are robust to a larger time-scale. As before, Figure 10 and Tables 7 and 8 depict all the evaluated metrics.
Sections A.3 08:00 - 09:00, and A.4 00:00 - 23:59 (full day) – Broader NYC Area: In the following two sections, we show that our results are robust to larger geographic areas, specifically the broader NYC Area, including Manhattan, Bronx, Staten Island, Brooklyn, and Queens. Figure 11 and Tables 9 and 10 depict all the evaluated metrics for one hour, and Figure 12 and Tables 11 and 12 depict them for one day.
Section A.5 08:00 - 08:10 – Manhattan: This is a limited test-case aimed at evaluating the HC algorithm, which could not be run on the larger test-cases due to its high computational complexity. Figure 13 and Tables 13 and 14 depict all the evaluated metrics.
Section A.6 Dynamic Vehicle Relocation – 00:00 - 23:59 (full day) – Manhattan: In this section, we present results on step (c) of the DRSFR problem: dynamic relocation. We fix an algorithm for steps (a) and (b), specifically MWM, to allow for a common ground and a fair comparison focused only on the relocation part. Figure 14 and Tables 15 and 16 depict all the evaluated metrics.
Section A.7 End-To-End Solution – 00:00 - 23:59 (full day) – Manhattan: As a final step, we evaluate end-to-end solutions, using MWM, ALMA, and Greedy to solve all three steps of the DRSFR problem. Figure 15 and Tables 17 and 18 present all the evaluated metrics.
Table 4. January 15, 2016 – 08:00 - 09:00 – Manhattan – #Taxis = 4276 (base number). Each column presents the relative difference compared to the first line, i.e., MWM, computed as (algorithm - MWM) / MWM, for each metric.
Table 5. January 15, 2016 – 08:00 - 09:00 – Manhattan – #Taxis = 2138. Offline algorithms are run either in a Just-in-Time (JiT) manner, or in batches (with batch size 1, or 2 min). Because of the density of the dataset, requests become critical every time-step, thus JiT is the same as in batches with batch size 1.
| Algorithm | Distance Driven (m) | SD | Elapsed Time (ns) | SD | Time to Pair (s) | SD | Time to Pair with Taxi (s) | SD | Time to Pick-up (s) | SD | Delay (s) | SD | Cumulative Delay (s) | Driver Profit ($) | SD | Number of Shared Rides | SD | Frictions (s) | SD |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| MWM (1) | 5.46E+07 | 0.00E+00 | 9.91E+10 | 0.00E+00 | 5.17 | 18.08 | 102.26 | 407.94 | 318.23 | 542.59 | 45.92 | 123.45 | 471.57 | 190.44 | 52.46 | 9.72E+03 | 0.00 | 29.68 | 12.38 |
| MWM (2) | 5.27E+07 | 0.00E+00 | 1.45E+11 | 0.00E+00 | 32.05 | 30.90 | 88.64 | 308.08 | 288.94 | 441.85 | 38.34 | 112.20 | 447.97 | 187.50 | 49.91 | 9.54E+03 | 0.00 | 34.32 | 16.06 |
| ALMA (1) | 6.30E+07 | 1.38E+05 | 3.96E+10 | 4.12E+09 | 3.97 | 15.71 | 356.07 | 614.16 | 654.41 | 806.25 | 39.48 | 119.40 | 1053.93 | 188.79 | 36.01 | 9.85E+03 | 8.36 | 29.43 | 9.91 |
| ALMA (2) | 6.18E+07 | 1.02E+05 | 6.02E+10 | 7.57E+09 | 31.91 | 30.92 | 323.09 | 566.04 | 611.59 | 758.26 | 29.84 | 102.38 | 996.43 | 184.66 | 35.59 | 9.62E+03 | 6.88 | 30.00 | 10.28 |
| Greedy (1) | 6.76E+07 | 2.28E+05 | 6.78E+09 | 1.68E+09 | 4.41 | 16.52 | 577.85 | 706.95 | 932.92 | 813.98 | 44.64 | 121.06 | 1559.82 | 189.99 | 36.36 | 9.82E+03 | 6.13 | 29.54 | 9.76 |
| Greedy (2) | 6.66E+07 | 1.01E+05 | 1.31E+10 | 3.75E+09 | 32.14 | 31.04 | 536.36 | 668.99 | 881.67 | 783.43 | 34.77 | 106.58 | 1484.94 | 185.72 | 36.99 | 9.55E+03 | 17.72 | 29.97 | 10.00 |
| Appr (1) | 8.04E+07 | 0.00E+00 | 5.41E+11 | 0.00E+00 | 27.94 | 30.07 | 852.44 | 1048.80 | 1454.91 | 1185.15 | 71.59 | 137.37 | 2406.88 | 198.00 | 36.05 | 1.00E+04 | 0.00 | 45.28 | 18.08 |
| Appr (2) | 8.05E+07 | 0.00E+00 | 5.42E+11 | 0.00E+00 | 30.29 | 30.19 | 804.67 | 995.86 | 1410.40 | 1145.77 | 69.81 | 128.89 | 2315.17 | 197.35 | 35.61 | 1.00E+04 | 0.00 | 29.69 | 10.13 |
| PG | 6.74E+07 | 1.17E+05 | 5.20E+11 | 1.81E+10 | 59.53 | 43.27 | 297.99 | 764.61 | 664.30 | 1053.38 | 19.90 | 107.57 | 1041.73 | 161.90 | 39.97 | 7.12E+03 | 31.39 | 29.56 | 8.29 |
| GD | 6.92E+07 | 0.00E+00 | 7.11E+12 | 3.28E+11 | 52.91 | 32.09 | 358.38 | 801.79 | 626.96 | 945.13 | 153.49 | 333.21 | 1191.73 | 210.00 | 54.47 | 9.01E+03 | 0.00 | 30.21 | 9.70 |
| Bal (1) | 6.22E+07 | 9.71E+04 | 1.35E+10 | 4.56E+09 | 5.17 | 18.08 | 403.38 | 535.51 | 725.18 | 637.49 | 55.53 | 134.36 | 1189.26 | 192.96 | 41.38 | 9.72E+03 | 0.00 | 30.23 | 11.05 |
| Bal (2) | 5.99E+07 | 1.38E+05 | 3.42E+10 | 7.09E+09 | 32.05 | 30.89 | 336.45 | 466.17 | 635.90 | 569.04 | 46.20 | 120.74 | 1050.59 | 189.53 | 41.81 | 9.54E+03 | 0.00 | 31.76 | 12.20 |
| Har (1) | 8.10E+07 | 2.36E+05 | 1.37E+10 | 4.70E+09 | 5.17 | 18.08 | 946.56 | 1058.46 | 1555.42 | 1190.40 | 61.98 | 146.65 | 2569.12 | 194.22 | 49.97 | 9.72E+03 | 0.00 | 29.45 | 9.67 |
| Har (2) | 7.96E+07 | 2.87E+05 | 3.36E+10 | 6.90E+09 | 32.05 | 30.89 | 906.49 | 1024.29 | 1501.74 | 1153.03 | 51.28 | 130.04 | 2491.55 | 190.37 | 51.20 | 9.54E+03 | 0.00 | 29.65 | 9.65 |
| DC (1) | 1.41E+08 | 1.90E+06 | 1.16E+13 | 5.48E+11 | 5.17 | 18.08 | 0.00 | 0.00 | 7660.58 | 10890.62 | 62.70 | 142.71 | 7728.45 | 192.45 | 280.85 | 9.72E+03 | 0.00 | 85.76 | 269.59 |
| DC (2) | 1.38E+08 | 1.71E+06 | 8.99E+12 | 3.74E+11 | 32.05 | 30.89 | 0.00 | 0.00 | 7290.07 | 10196.66 | 51.77 | 124.76 | 7373.89 | 188.37 | 272.89 | 9.54E+03 | 0.00 | 93.37 | 287.47 |
| k-Taxi (1) | 7.10E+07 | 3.00E+05 | 3.41E+12 | 2.12E+11 | 5.17 | 18.08 | 646.48 | 528.54 | 1099.19 | 702.35 | 58.32 | 141.50 | 1809.15 | 193.48 | 46.97 | 9.72E+03 | 0.00 | 29.33 | 9.74 |
| k-Taxi (2) | 6.92E+07 | 2.04E+05 | 4.32E+12 | 4.05E+11 | 32.05 | 30.89 | 586.22 | 491.05 | 1020.87 | 670.76 | 48.51 | 126.93 | 1687.65 | 189.90 | 48.16 | 9.54E+03 | 0.00 | 30.14 | 10.10 |
| WFA (1) | 8.48E+07 | 3.25E+05 | 1.28E+14 | 6.77E+12 | 5.17 | 18.08 | 0.00 | 0.00 | 30966.54 | 40847.98 | 64.04 | 145.43 | 31035.75 | 194.69 | 597.69 | 9.72E+03 | 0.00 | 63.79 | 254.48 |
| WFA (2) | 8.09E+07 | 0.00E+00 | 9.69E+13 | 0.00E+00 | 32.05 | 30.90 | 0.00 | 0.00 | 28964.94 | 39066.93 | 51.85 | 127.05 | 29048.84 | 190.49 | 593.79 | 9.54E+03 | 0.00 | 69.68 | 264.10 |
| Single | 8.63E+07 | 0.00E+00 | 2.02E+12 | 0.00E+00 | 0.00 | 0.00 | 700.50 | 1317.51 | 843.80 | 1444.94 | 0.00 | 0.00 | 1544.30 | 49.72 | 6.90 | 0.00E+00 | 0.00 | 29.41 | 6.11 |
| Random | 1.14E+08 | 2.56E+05 | 6.26E+09 | 1.42E+09 | 3.61 | 15.05 | 1963.08 | 1457.54 | 2903.86 | 1580.37 | 161.07 | 328.92 | 5031.62 | 222.47 | 47.27 | 9.88E+03 | 6.54 | 29.56 | 9.37 |
Table 6. January 15, 2016 – 08:00 - 09:00 – Manhattan – #Taxis = 2138. Offline algorithms are run either in a Just-in-Time (JiT) manner, or in batches (with batch size 1, or 2 min). Because of the density of the dataset, requests become critical every time-step, thus JiT is the same as in batches with batch size 1. Each column presents the relative difference compared to the first line, i.e., MWM with batch size one, computed as (algorithm - MWM(1)) / MWM(1), for each metric.
| Algorithm | Distance Driven (m) | SD | Elapsed Time (ns) | SD | Time to Pair (s) | SD | Time to Pair with Taxi (s) | SD | Time to Pick-up (s) | SD | Delay (s) | SD | Cumulative Delay (s) | Driver Profit ($) | SD | Number of Shared Rides | SD | Frictions (s) | SD |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| MWM (1) | 0.00% | – | 0.00% | – | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | – | 0.00% | 0.00% |
| MWM (2) | -3.44% | – | 46.61% | – | 520.26% | 70.89% | -13.32% | -24.48% | -9.20% | -18.57% | -16.49% | -9.12% | -5.00% | -1.54% | -4.86% | -1.82% | – | 15.65% | 29.68% |
| ALMA (1) | 15.50% | – | -60.03% | – | -23.20% | -13.08% | 248.20% | 50.55% | 105.64% | 48.59% | -14.01% | -3.29% | 123.49% | -0.87% | -31.36% | 1.36% | – | -0.84% | -20.00% |
| ALMA (2) | 13.27% | – | -39.29% | – | 517.65% | 71.03% | 215.95% | 38.76% | 92.19% | 39.75% | -35.02% | -17.07% | 111.30% | -3.04% | -32.16% | -1.03% | – | 1.08% | -16.96% |
| Greedy (1) | 23.91% | – | -93.16% | – | -14.58% | -8.61% | 465.07% | 73.30% | 193.16% | 50.02% | -2.78% | -1.94% | 230.77% | -0.24% | -30.70% | 1.01% | – | -0.47% | -21.15% |
| Greedy (2) | 21.98% | – | -86.81% | – | 522.14% | 71.70% | 424.50% | 63.99% | 177.06% | 44.39% | -24.29% | -13.67% | 214.89% | -2.48% | -29.48% | -1.72% | – | 0.99% | -19.22% |
| Appr (1) | 47.32% | – | 445.49% | – | 440.80% | 66.34% | 733.59% | 157.10% | 357.19% | 118.42% | 55.92% | 11.28% | 410.39% | 3.97% | -31.28% | 2.92% | – | 52.57% | 46.03% |
| Appr (2) | 47.52% | – | 447.16% | – | 486.30% | 66.99% | 686.88% | 144.12% | 343.20% | 111.17% | 52.04% | 4.40% | 390.95% | 3.63% | -32.13% | 2.90% | – | 0.02% | -18.16% |
| PG | 23.56% | – | 424.86% | – | 1052.23% | 139.31% | 191.40% | 87.43% | 108.75% | 94.14% | -56.65% | -12.87% | 120.90% | -14.99% | -23.81% | -26.70% | – | -0.41% | -33.08% |
| GD | 26.84% | – | 7071.20% | – | 924.12% | 77.50% | 250.45% | 96.55% | 97.01% | 74.19% | 234.27% | 169.91% | 152.71% | 10.27% | 3.83% | -7.35% | – | 1.79% | -21.63% |
| Bal (1) | 14.02% | – | -86.42% | – | 0.00% | 0.00% | 294.47% | 31.27% | 127.88% | 17.49% | 20.94% | 8.83% | 152.19% | 1.33% | -21.13% | 0.00% | – | 1.87% | -10.75% |
| Bal (2) | 9.88% | – | -65.49% | – | 520.26% | 70.88% | 229.01% | 14.28% | 99.83% | 4.87% | 0.61% | -2.20% | 122.78% | -0.48% | -20.31% | -1.82% | – | 7.00% | -1.50% |
| Har (1) | 48.42% | – | -86.16% | – | 0.00% | 0.00% | 825.64% | 159.47% | 388.77% | 119.39% | 34.98% | 18.79% | 444.80% | 1.98% | -4.75% | 0.00% | – | -0.76% | -21.90% |
| Har (2) | 45.93% | – | -66.14% | – | 520.26% | 70.88% | 786.45% | 151.09% | 371.90% | 112.50% | 11.68% | 5.34% | 428.35% | -0.04% | -2.41% | -1.82% | – | -0.08% | -22.10% |
| DC (1) | 159.34% | – | 11636.41% | – | 0.00% | 0.00% | -100.00% | -100.00% | 2307.25% | 1907.15% | 36.55% | 15.59% | 1538.87% | 1.05% | 435.35% | 0.00% | – | 188.95% | 2076.91% |
| DC (2) | 152.49% | – | 8969.76% | – | 520.26% | 70.88% | -100.00% | -100.00% | 2190.83% | 1779.25% | 12.75% | 1.06% | 1463.68% | -1.09% | 420.17% | -1.82% | – | 214.60% | 2221.34% |
| k-Taxi (1) | 30.13% | – | 3339.41% | – | 0.00% | 0.00% | 532.19% | 29.56% | 245.41% | 29.44% | 27.00% | 14.61% | 283.64% | 1.60% | -10.47% | 0.00% | – | -1.18% | -21.38% |
| k-Taxi (2) | 26.82% | – | 4258.51% | – | 520.26% | 70.88% | 473.26% | 20.37% | 220.80% | 23.62% | 5.65% | 2.81% | 257.88% | -0.28% | -8.20% | -1.82% | – | 1.54% | -18.41% |
| WFA (1) | 55.42% | – | 129083.20% | – | 0.00% | 0.00% | -100.00% | -100.00% | 9630.90% | 7428.32% | 39.48% | 17.80% | 6481.32% | 2.23% | 1039.29% | 0.00% | – | 114.92% | 1954.90% |
| WFA (2) | 48.26% | – | 97654.76% | – | 520.26% | 70.89% | -100.00% | -100.00% | 9001.91% | 7100.07% | 12.93% | 2.91% | 6059.99% | 0.03% | 1031.86% | -1.82% | – | 134.77% | 2032.61% |
| Single | 58.20% | – | 1935.09% | – | -100.00% | -100.00% | 585.01% | 222.97% | 165.16% | 166.30% | -100.00% | -100.00% | 227.48% | -73.89% | -86.85% | -100.00% | – | -0.91% | -50.64% |
| Random | 108.44% | – | -93.69% | – | -30.10% | -16.76% | 1819.69% | 257.30% | 812.51% | 191.26% | 250.78% | 166.43% | 966.99% | 16.82% | -9.89% | 1.60% | – | -0.40% | -24.38% |
Table 8. January 15, 2016 – 00:00 - 23:59 (full day) – Manhattan – #Taxis = 5081 (base number). Each column presents the relative difference compared to the first line, i.e., MWM, computed as (algorithm - MWM) / MWM, for each metric.
Figure 11. January 15, 2016 – 08:00 - 09:00 – Broader NYC Area – #Taxis = 4972 (base number).
Table 9. January 15, 2016 – 08:00 - 09:00 – Broader NYC Area – #Taxis = 4972 (base number).
| Algorithm | Distance Driven (m) | SD | Elapsed Time (ns) | SD | Time to Pair (s) | SD | Time to Pair with Taxi (s) | SD | Time to Pick-up (s) | SD | Delay (s) | SD | Cumulative Delay (s) | Driver Profit ($) | SD | Number of Shared Rides | SD | Frictions (s) | SD |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| MWM | 6.48E+07 | 0.00E+00 | 1.81E+12 | 0.00E+00 | 32.34 | 31.34 | 0.00 | 0.00 | 164.59 | 401.29 | 33.01 | 104.44 | 229.94 | 104.67 | 81.54 | 1.02E+04 | 0.00 | 219.19 | 415.07 |
| ALMA | 7.86E+07 | 2.48E+05 | 9.61E+10 | 1.02E+10 | 32.09 | 31.32 | 0.00 | 0.00 | 287.93 | 646.99 | 29.55 | 167.88 | 349.58 | 104.13 | 72.74 | 1.03E+04 | 5.88 | 191.41 | 379.85 |
| Greedy | 8.18E+07 | 2.96E+05 | 4.88E+10 | 7.00E+09 | 32.32 | 31.40 | 0.00 | 0.00 | 317.48 | 720.24 | 34.73 | 170.53 | 384.53 | 104.68 | 69.88 | 1.02E+04 | 12.28 | 185.35 | 374.04 |
| Bal | 7.22E+07 | 1.15E+05 | 3.09E+10 | 6.01E+09 | 5.49 | 18.97 | 0.00 | 0.00 | 234.85 | 428.34 | 67.79 | 219.01 | 308.14 | 109.57 | 65.24 | 1.03E+04 | 0.00 | 571.11 | 516.22 |
| Single | 1.20E+08 | 0.00E+00 | 1.03E+12 | 0.00E+00 | 0.00 | 0.00 | 10.44 | 83.19 | 212.61 | 577.74 | 0.00 | 0.00 | 223.06 | 26.37 | 9.21 | 0.00E+00 | 0.00 | 85.07 | 211.52 |
Table 10. January 15, 2016 – 08:00 - 09:00 – Broader NYC Area – #Taxis = 4972 (base number). Each column presents the relative difference compared to the first line, i.e., MWM, computed as (algorithm - MWM) / MWM, for each metric.
| Algorithm | Distance Driven (m) | SD | Elapsed Time (ns) | SD | Time to Pair (s) | SD | Time to Pair with Taxi (s) | SD | Time to Pick-up (s) | SD | Delay (s) | SD | Cumulative Delay (s) | Driver Profit ($) | SD | Number of Shared Rides | SD | Frictions (s) | SD |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| MWM | 0.00% | – | 0.00% | – | 0.00% | 0.00% | – | – | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | – | 0.00% | 0.00% |
| ALMA | 21.37% | – | -94.68% | – | -0.75% | -0.08% | – | – | 74.94% | 61.23% | -10.48% | 60.74% | 52.03% | -0.52% | -10.79% | 1.02% | – | -12.67% | -8.49% |
| Greedy | 26.34% | – | -97.30% | – | -0.04% | 0.20% | – | – | 92.90% | 79.48% | 5.20% | 63.27% | 67.24% | 0.01% | -14.31% | 0.42% | – | -15.44% | -9.88% |
| Bal | 11.44% | – | -98.29% | – | -83.01% | -39.48% | – | – | 42.69% | 6.74% | 105.35% | 109.70% | 34.01% | 4.68% | -20.00% | 1.77% | – | 160.56% | 24.37% |
| Single | 84.67% | – | -42.94% | – | -100.00% | -100.00% | – | – | 29.18% | 43.97% | -100.00% | -100.00% | -2.99% | -74.80% | -88.71% | -100.00% | – | -61.19% | -49.04% |
Table 12. January 15, 2016 – 00:00 - 23:59 (full day) – Broader NYC Area – #Taxis = 6533 (base number). Each column presents the relative difference compared to the first line, i.e., MWM, computed as (algorithm - MWM) / MWM, for each metric.
Table 14. January 15, 2016 – 08:00 - 08:10 – Manhattan – #Taxis = 2779 (base number). Each column presents the relative difference compared to the first line, i.e., MWM, computed as (algorithm - MWM) / MWM, for each metric.