1 Introduction
Millions of drivers provide transportation services for over ten million passengers every day at Didi Chuxing [12], which is a Chinese counterpart of UberPOOL [37]. In peak travel periods, Didi needs to match more than a hundred thousand passengers to drivers every second [43], and rider demand often greatly exceeds rider capacity. Two approaches can be used to mitigate this problem. The first attempts to predict areas with high travel demand using historical data and statistical predictions or a heat map, and strategically deploys taxis in those areas in advance. The alternative is to serve multiple riders with fewer vehicles using a ridesharing service: riders with similar routes and time schedules can share the same vehicle [32, 4]. According to statistical data from the Bureau of Infrastructure, Transport and Regional Economics [6], there are fewer than 1.6 persons per vehicle per kilometer in Australia. If only 10% of vehicles had more than one passenger, annual fuel consumption would fall by 5.4% [19]. Therefore, increasing vehicle occupancy rates would provide many benefits, including the reduction of greenhouse gas emissions. Moreover, it has been reported that a crucial imbalance exists between supply and demand in peak hour scenarios, where the rider demand is double the rider availability according to a statistical analysis of historical data at Didi Chuxing [36]. Alleviating traffic congestion during peak commuter times will ultimately require significant government commitment to increasing the region’s investment in core transportation infrastructure [13]. In this paper, we focus on the dynamic ridesharing problem, specifically during peak hour travel periods.
From an extensive literature study, we make the following observations to motivate this work: (1) Existing ridesharing studies [44, 9] do not fully explore the scenario where the number of riders is much greater than the number of available drivers, and so the scalability of current solutions in this setting remains unclear. (2) Prior studies [30, 25, 44, 18, 10, 40] primarily focus on single objective optimization solutions, such as minimizing the total travel distance from the perspective of drivers [25, 18], or maximizing the served rate of the ridesharing system [40]. In contrast, we aim to optimize two objectives: maximize the served rate and minimize the total additional distance. (3) Previous studies [25, 15, 35, 9, 7] report the processing time mainly based on the rider request matching time, but not the trip schedule update time, which encompasses the updates to a driver’s current trip schedule and to the underlying index structure. However, in a dynamic ridesharing scenario where vehicles are continuously moving, accounting for these additional costs produces a more realistic comparison of the algorithms being studied.
Our goal in this work is to determine a series of trip schedules with the minimum total additional distance capable of accommodating as many rider requests as possible, under a set of spatiotemporal constraints. In order to achieve this goal, we must account for three features: (1) each driver has an initial location and trip schedule, but rider requests are being continuously generated in a streaming manner; (2) before a driver receives a new incoming rider request, the arrival time and location of the rider are unknown; (3) the driver and the rider should be informed at short notice about whether a matching is possible or not. Specifically, the driver should be notified quickly if a new rider is added, and similarly the rider should be informed quickly if her travel request can be fulfilled based on the current preference settings, such as waiting time tolerance to be picked up.
The following challenges arise in addressing this problem:
(1) How can we find eligible driver-rider matching pairs to maximize the served rate and also minimize the additional distance? These two goals conflict: if more rider requests are satisfied, the driver has to travel further to pick up the additional rider(s). For instance, if a driver already has passengers on board but receives a new rider request, the driver might detour to pick up the new rider, which increases both the served rate and the distance traveled.
(2) How can we decide on the best service sequence for a trip schedule? Every rider has their own maximum allowable waiting time and detour time tolerance. When a rider request is served, these constraints should not be violated. For example, suppose a rider request is already in the trip schedule of a driver, and a new rider request is received while the driver is serving the earlier one. The driver needs to determine whether the new rider can be picked up first without violating the earlier rider’s constraints.
(3) How can we efficiently support the ridesharing problem for streaming rider requests? The time used for determining the updated trip schedule as per new rider requests should not exceed the time window size.
In order to address the above challenges, we make the following contributions:

We define a variant of the dynamic ridesharing problem, which aims to optimize two objectives subject to a series of spatiotemporal constraints presented in Section 3. In addition, our work mainly focuses on a common yet important scenario where the number of drivers is insufficient to serve all riders in peak travel periods.

We develop an index structure on top of a partitioned road network and compute the lower bound of the shortest path distance between any two vertices in constant time in Section 4. We then propose a pruning-based rider request insertion algorithm based on several pruning rules to accelerate the matching process in Section 5.

We conduct extensive experiments on a real-world dataset in order to validate the efficiency and effectiveness of our methods under different parameter settings in Section 7.
2 Related Work
Ridesharing has been intensively studied in recent years, in both static and dynamic settings. A typical formulation for the static ridesharing problem consists of designing drivers’ routes and schedules for a set of rider requests with departure and arrival locations known beforehand [24, 23, 42, 38], while the dynamic ridesharing problem is based on the setting that new riders are continuously added in a streamwise manner [17, 31, 8, 27].
Static Ridesharing Problem. Ta et al. [30] defined two kinds of ridesharing models to maximize the shared route ratio, under the assumption that at most one rider can be assigned to a driver in the vehicle. Cheng et al. [10] proposed a utility-aware ridesharing problem, which can be regarded as a variant of the dial-a-ride problem [11, 29]. The aim was to maximize the riders’ satisfaction, which is defined as a linear combination of vehicle-related utility, rider-related utility, and trajectory-related utility, such as rider sharing preferences. Bei and Zhang [5] investigated the assignment problem in ridesharing and designed an approximation algorithm with a worst-case performance guarantee when only two riders can share a vehicle. In contrast, we focus primarily on the dynamic ridesharing problem, and allow the maximum number of riders to be greater than two.
Dynamic Ridesharing Problem. Several existing techniques are well illustrated and outlined by two recent surveys [3, 16]. We describe the literature in chronological order. Agatz et al. [2] explored a rideshare optimization problem in which the rideshare provider aims to minimize the total system-wide vehicle-miles. Xing et al. [40] studied a multi-agent system with the objective of maximizing the number of served riders. Kleiner et al. [21] proposed an auction-based mechanism to facilitate both riders and drivers to bid according to their preferences. Ma et al. [25, 26] proposed a dynamic taxi ridesharing service to serve each request by dispatching a taxi with the minimum additional travel distance incurred. Huang et al. [18] designed a kinetic tree to enumerate all possible valid trip schedules for each driver. Duan et al. [14] studied the personalized ridesharing problem which maximizes a satisfaction value defined as a linear function of distance and time cost. Zheng et al. [44] considered the platform profit as the optimization objective to be maximized by dispatching the orders to vehicles. Tong et al. [35] devised route plans to maximize the unified cost which consists of the total travel distance and a penalty for unserved requests. Chen et al. [9] considered both the pickup time and the price to return multiple options for each rider request in dynamic ridesharing. Xu et al. [41] proposed an efficient rider insertion algorithm that minimizes the maximum flow time of all requests or minimizes the total travel time of the driver.
Our work is different from existing ridesharing studies in two aspects: (1) Our focus is the peak hour scenario where there are too few drivers to satisfy all of the rider requests, which has not been explored in previous work. Later in the experimental study we extend the state-of-the-art [9] to the peak hour scenario and show that our approach outperforms it in both efficiency and effectiveness. (2) Most previous studies aim to optimize a single objective [18, 2, 25, 40, 44, 10, 41] or a customized linear function [14, 35]. This differs from our problem scenario where we solve the dual-objective optimization problem. Chen et al. [9] also considered two criteria (price and pickup time) from the perspective of riders, but not the same criteria (served rate and additional distance) as we use to satisfy the requirements of riders, drivers and the ridesharing system. In our problem, riders can provide their personal sharing preferences by giving constraint values, such as waiting time tolerance. Drivers and the ridesharing system expect to serve more riders with less detour distance, which coincides with our goal, i.e., maximizing the served rate while minimizing the additional distance. Kleiner et al. [21] took both minimizing the total travel distance and maximizing the number of served riders into consideration, where riders and drivers can select each other by adjusting bids. They assumed that each driver can only share the trip with one other rider, which limited the potential of the ridesharing system. Other recent studies on task assignment [33, 34] exploit bilateral matching between a set of workers and tasks to achieve one single objective, minimizing the total distance [33] or maximizing the total utility [34]. However, when the ordering of multiple riders must also be mapped to a single driver, bilateral mapping is not sufficient.
3 Problem Formulation
3.1 Preliminaries
Let be a road network represented as a graph, where is the set of vertices and is the set of edges. We assume drivers and riders travel along the road network. Let be a set of drivers, where each is a tuple . Here, is the current location, is the maximal seat capacity, and is the current trip schedule of the driver . If does not coincide with a vertex, we map the location to the closest vertex for ease of computation. Each trip schedule is a sequence of points, where is the driver’s current location, and () is a source or destination point of a rider. We assume that the rider requests arrive in a streaming fashion. We impose a time-based window model, where we process the set of requests that arrive in the most recent timeslot.
Definition 1.
(Rider Request). Let be a set of rider requests. Each is a tuple , where is the request submission time, is the source location, is the destination location, is the number of riders in that request, is the waiting time threshold (i.e., the maximum period needs to be picked up after submitting a request), and is a detour time threshold (explained later in Definition 2).
Additional distance. As mentioned before, each driver maintains a trip schedule , where the corresponding riders are served sequentially in . If a new rider must be served by , the trip schedule changes. The additional distance is defined as the difference between the travel distance of the updated trip schedule after a new rider request is inserted, and the travel distance of the original trip schedule .
Here, the travel distance of a trip schedule is computed as: , where the distance between two points is computed as the shortest path in the road network . For any two points and () which are not adjacent in , the travel distance from to is defined as follows: .
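The two quantities above can be illustrated with a minimal Python sketch. It is not the paper's implementation: `dist` stands in for any shortest-path oracle on the road network, the names `schedule_distance` and `additional_distance` are hypothetical, and `line_metric` (points on a line) is a toy metric assumed only for illustration.

```python
from typing import Callable, Hashable, Sequence

Point = Hashable
DistFn = Callable[[Point, Point], float]


def schedule_distance(schedule: Sequence[Point], dist: DistFn) -> float:
    """Travel distance of a schedule: the sum of shortest-path distances
    between consecutive points (hypothetical helper)."""
    return sum(dist(a, b) for a, b in zip(schedule, schedule[1:]))


def additional_distance(old: Sequence[Point], new: Sequence[Point],
                        dist: DistFn) -> float:
    """Travel distance of the updated schedule minus that of the original."""
    return schedule_distance(new, dist) - schedule_distance(old, dist)


def line_metric(a: float, b: float) -> float:
    """Toy stand-in for a road-network shortest-path distance."""
    return abs(a - b)
```

With `line_metric`, the schedule `[0, 10]` has travel distance 10. Inserting a pickup at -2 and a drop-off at 5 gives `[0, -2, 5, 10]` with travel distance 2 + 7 + 5 = 14, so the additional distance is 4.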
Served Rate. The served rate is defined as the ratio of the number of served riders (i.e., matched with a driver) over the total number of riders . Now we formally define the dynamic ridesharing problem.
3.2 Problem Definition
Definition 2.
(Dynamic Ridesharing). Given a set of drivers and a set of new incoming riders on road network, the dynamic ridesharing problem finds the optimal driver-rider pairs such that (1) the served rate is maximal; (2) the total additional distance is minimal, subject to the following spatiotemporal constraints:
(a) Capacity constraint. The number of riders served by any driver should not exceed the corresponding maximal seat capacity .
(b) Waiting time constraint. The actual time for the driver to pick up the rider after receiving the request should not be greater than the rider’s waiting time constraint .
(c) Detour time constraint. A driver may detour to pick up other riders, so the actual travel time by any rider in the road network should be bounded by the shortest travel time multiplied by the corresponding rider’s detour threshold , which is .
Note that time and distance are interchangeable using reasonable travel speeds collected from historical data. Therefore, we emphasize the following two points for clarity of exposition: (i) we adopt a uniform travel speed assumption henceforth, although the experiments (Section 7) use different travel speed settings to simulate varying road conditions in peak hours; (ii) in accordance with our index structure (Section 4), which is a distance-based framework, is stored as a distance value computed by multiplying the waiting time by the travel speed. In regard to the detour time constraint, we represent it as an inequality based on distance, i.e., , where is the actual travel distance, and is the shortest travel distance.
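The conversion between time and distance can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' code: the class `Rider`, its field names, and the helper functions are hypothetical, and the detour threshold is modelled as a multiplicative factor on the shortest distance, following the wording of constraint (c). The exact form used in the paper may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rider:
    wait_time: float      # waiting time threshold, in seconds
    detour_factor: float  # assumed multiplier (>= 1) on the shortest distance


def wait_budget(rider: Rider, speed: float) -> float:
    """Waiting time threshold expressed as a distance (time * uniform speed)."""
    return rider.wait_time * speed


def waiting_ok(pickup_distance: float, rider: Rider, speed: float) -> bool:
    """Constraint (b): the distance driven before pickup must fit the budget."""
    return pickup_distance <= wait_budget(rider, speed)


def detour_ok(actual: float, shortest: float, rider: Rider) -> bool:
    """Constraint (c) in distance form: actual trip vs. scaled shortest trip."""
    return actual <= rider.detour_factor * shortest
```

For instance, a rider willing to wait 300 seconds at a uniform speed of 10 m/s has a waiting budget of 3000 m, so a driver whose route to the pickup point is 2500 m long satisfies the waiting constraint, while one 3500 m away does not.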
3.3 Solution Overview
The dynamic ridesharing problem is a classical constraint-based combinatorial optimization problem, which was proven to be NP-hard [44]. Our objective is to match the set of new rider requests in each timeslot with the set of drivers and update the corresponding drivers’ trip schedules, such that all constraints are met, the served rate is maximized, and the total additional distance is minimized.
We first propose an underlying index structure on top of a partitioned road network to compute the lower bound distance between any two vertices in Section 4. Then, one crucial problem is to accommodate a new incoming rider request in an existing trip schedule for a driver efficiently. We present a pruning-based algorithm to efficiently insert the source and destination points of a rider into an existing trip schedule of a driver in Section 5. Note that, when inserting new points into a trip schedule, the constraints of existing riders in that trip schedule cannot be violated. Moreover, multiple drivers may be able to serve a rider, and there can be multiple options to insert a rider’s source and destination points into a trip schedule. Thus, the dynamic insertion of a rider’s request into a trip schedule is a difficult optimization problem.
Next, we propose two different algorithms in Section 6 to find the match between and using the insertion algorithm. The first is the Distance-first algorithm in Section 6.1. Specifically, we process each rider request one by one in a first-come-first-served manner according to the request submission time. For each rider request, we invoke the insertion algorithm (Algo. 1) to find an eligible driver who generates the minimal additional distance. Although Distance-first can match each rider request with a suitable driver efficiently, the served rate of the ridesharing system is neglected in the process. Therefore, we propose the Greedy algorithm in Section 6.2. In this approach, we consider the batch of rider requests within the most recent timeslot (e.g., 10 seconds) together and match them with a set of drivers optimally by trading off two metrics: served rate and additional distance.
4 Index Structure
Symbol  Description 

A driver with a unique , current location , maximal seat capacity , and current trip schedule  
A trip schedule consists of a sequence of points, where is the driver’s current location, and () is a source or destination point of a rider  
A rider request with the submission time , a source point , a destination point , the number of riders , a waiting time threshold , and a detour threshold  
The actual travel distance between and  
The shortest travel distance between and  
The lower bound distance between two subgraphs and  
The lower bound distance between a vertex and a subgraph  
The lower bound distance between any two vertices and  
The incremental distance by inserting between and , then  
,  Two optimization utility functions 
The update time window  
The travel speed 
In this section, we propose a new index structure on top of a partitioned road network. First we present the motivation behind the index design, and then we present the details of the index in Section 4.2. Table I presents the notation used throughout this work.
4.1 Motivation
The distance computation from a driver’s location to a rider’s pickup or dropoff point can be reduced to the shortest path computation between their closest vertices in the road network, which can be easily solved using an efficient hub-based labelling algorithm [1]. Although invoking the shortest path computation once only requires a few microseconds, a huge number of online shortest path computations are required when trying to optimally match new incoming riders with the drivers with constraints, and update the trip schedules accordingly. Such computations lead to a performance bottleneck.
A straightforward way is to precompute the shortest path distance offline for all vertex pairs and store them in memory or on disk. Then the shortest path query problem is simply reduced to a direct lookup operation. Although the query can be processed efficiently, this approach is rarely used in a large road network in practice, especially when many variables may change in a dynamic or streaming scenario. Therefore, it is essential but non-trivial to devise an efficient index over the road network which can be used to estimate the actual shortest path distance between any two locations.
Since road networks are often combined with non-Euclidean distance metrics, a traditional spatial index cannot be directly used. For example, a grid index is widely used in existing ridesharing studies [9, 35, 25, 7] to partition the space. Generally, they divide the whole road network into multiple equal-sized cells and then compute the lower or upper bound distance between any two cells. These distance bounds are further used for pruning. However, as the density distribution of the vertices varies widely between urban and rural regions, most cells are empty and contain no vertex. For example, more than 80% of the cells are empty in the grid index, resulting in very weak pruning power in ridesharing scenarios [9]. Although a quadtree index can divide the road network structure in a density-aware way, it has to maintain a consistent hierarchical representation such that each child node update may lead to a parent node update, which can increase the update costs when available drivers are moving (which is often true in real world scenarios). Therefore, we choose to adopt a density-based road network partitioning approach, which can efficiently estimate shortest path distances.
4.2 Road Network Index
The index is constructed in two steps: (1) Partition the road network into subgraphs such that closely connected vertices are stored in the same subgraph, while the connections among subgraphs are minimized. (2) For each subgraph, store the information necessary to efficiently estimate the shortest path distance between any pair of vertices in the road network.
Density-based Partitioning. We present a density-based partitioning approach which divides the road network into multiple subgraphs. The partition process follows two criteria: (1) Similar number of vertices: each subgraph maintains an approximately equal number of vertices such that the density distributions of the subgraphs are similar. (2) Minimal cut: an edge is considered a cut if its two endpoints belong to two disjoint subgraphs. The number of cuts should be minimal, so that the vertices in a subgraph are closely connected.
Given a road network with a vertex set and an edge set , a partition number , is divided into a set of subgraphs , where each subgraph contains a subset of vertices and a subset of edges such that (1) , ; (2) if , and , then , ; (3) ; (4) the number of edge cuts is minimized.
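The partition requirements can be checked mechanically. The sketch below is only an illustration and does not compute a partition (the paper delegates that to an existing method [20]). It assumes the partition is given as a list of vertex sets, and the function names are hypothetical.

```python
from typing import Hashable, Iterable, List, Sequence, Set, Tuple

Vertex = Hashable
Edge = Tuple[Vertex, Vertex]


def is_valid_partition(vertices: Set[Vertex],
                       parts: Sequence[Set[Vertex]]) -> bool:
    """Parts must be pairwise disjoint and together cover every vertex."""
    seen: Set[Vertex] = set()
    for part in parts:
        if seen & part:          # a vertex appears in two subgraphs
            return False
        seen |= part
    return seen == set(vertices)


def cut_edges(edges: Iterable[Edge],
              parts: Sequence[Set[Vertex]]) -> List[Edge]:
    """Edges whose two endpoints lie in different subgraphs."""
    owner = {v: k for k, part in enumerate(parts) for v in part}
    return [(u, v) for u, v in edges if owner[u] != owner[v]]


def size_spread(parts: Sequence[Set[Vertex]]) -> int:
    """Difference between the largest and smallest subgraph (0 = balanced)."""
    sizes = [len(p) for p in parts]
    return max(sizes) - min(sizes)
```

A good partition under the two criteria keeps `size_spread` small while keeping `len(cut_edges(...))` as low as possible.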
The graph partitioning problem has been well-studied in the literature and is not the primary focus of this paper. Instead, we focus on how to define the distance bounds in order to reduce unnecessary shortest path distance computations. Thus we use a state-of-the-art method [20] to obtain the density-based partitioning of the road network .
Subgraph information. After we obtain a set of disjoint subgraphs from , we build our index by storing the following information for each subgraph .

Bridge Vertex Set. If an edge is a cut, then each of its two endpoints is regarded as a bridge vertex. We store the set of the bridge vertices of each subgraph .

Non-Bridge Vertex Set. Vertices not in of a subgraph are stored as a non-bridge vertex set , i.e., . For each non-bridge vertex , we use to denote the lower bound distance of , which is the shortest path distance to its nearest bridge vertex in the same subgraph .

Subgraph set. For each subgraph , we store a list of the lower bound distances, one entry for each of the other subgraphs. Specifically, as the vertex of a subgraph is reachable from a vertex of another subgraph only through the bridge vertices, the lower bound distance is calculated with the following equation.
(1) 
Dispatched Driver Set. If the current location of a driver is situated in a subgraph , then is stored in the dispatched driver set .
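The per-subgraph information listed above can be assembled offline along the following lines. This is a rough sketch rather than the paper's construction: it assumes an undirected weighted graph given as `(u, v, w)` triples with mutually comparable vertex identifiers, it takes the subgraph-to-subgraph bound to be the smallest shortest-path distance between the bridge vertices of the two subgraphs (our reading of the description, since the equation itself is not reproduced here), and it omits the dispatched driver sets. All names are hypothetical.

```python
import heapq
from collections import defaultdict

INF = float("inf")


def dijkstra(adj, sources, allowed=None):
    """Multi-source Dijkstra; if `allowed` is given, never leave that set."""
    dist = {s: 0.0 for s in sources}
    heap = [(0.0, s) for s in sources]
    heapq.heapify(heap)
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u]:
            if allowed is not None and v not in allowed:
                continue
            if d + w < dist.get(v, INF):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist


def build_index(edges, parts):
    adj = defaultdict(list)
    for u, v, w in edges:
        adj[u].append((v, w))
        adj[v].append((u, w))
    owner = {v: k for k, part in enumerate(parts) for v in part}
    # Bridge vertices: endpoints of edges that cross two subgraphs.
    bridges = [set() for _ in parts]
    for u, v, _ in edges:
        if owner[u] != owner[v]:
            bridges[owner[u]].add(u)
            bridges[owner[v]].add(v)
    # Distance from each vertex to its nearest bridge vertex, searching
    # only inside the vertex's own subgraph (0 for bridge vertices).
    to_bridge = {}
    for k, part in enumerate(parts):
        to_bridge.update(dijkstra(adj, bridges[k], allowed=part))
    # Lower bound between subgraphs: closest pair of bridge vertices.
    sub = {}
    for a in range(len(parts)):
        reach = dijkstra(adj, bridges[a])
        for b in range(len(parts)):
            if a != b:
                sub[a, b] = min((reach[v] for v in bridges[b] if v in reach),
                                default=INF)
    return owner, bridges, to_bridge, sub
```

On a toy path graph 1-2-3-4-5-6 with edge weights 2, 3, 4, 1, 5, split into {1, 2, 3} and {4, 5, 6}, the only cut edge is (3, 4), so vertices 3 and 4 become bridge vertices, vertex 1 is 5 away from its nearest bridge vertex, and the bound between the two subgraphs is 4.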
4.3 Bounding Distance Estimations
Furthermore, we introduce two additional concepts on how to calculate the lower bound distance from a vertex to a subgraph (Definition 3) or another vertex (Definition 4).
Definition 3.
(Lower Bound Distance Between a Vertex and a Subgraph). Given a vertex which belongs to and a subgraph , the lower bound distance from to is defined as follows:
(2) 
Definition 4.
(Lower Bound Distance Between Two Vertices). Given a vertex which belongs to and a vertex which is located at , the lower bound distance is defined as follows:
(3) 
In our implementation, we construct the road network index offline and compute or online. The index structure is memory resident, which includes and information. Therefore, (or ) can be computed using Eq. 2 (or Eq. 3) in constant time.
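Because both bounds only combine values that are already stored in the index, each query reduces to a few table lookups. The sketch below shows one plausible form, stated as an assumption since the equations are not reproduced here: the vertex-to-subgraph bound adds the vertex's distance to its nearest bridge vertex to the stored subgraph-to-subgraph bound, and the vertex-to-vertex bound additionally adds the second vertex's distance to its nearest bridge vertex. Returning 0 for two vertices in the same subgraph is also an assumption. The tables `OWNER`, `TO_BRIDGE`, and `SUB` are hypothetical toy data.

```python
def lb_vertex_to_subgraph(u, target, owner, to_bridge, sub):
    """Lower bound from vertex u to any vertex of subgraph `target`."""
    if owner[u] == target:
        return 0.0
    return to_bridge[u] + sub[owner[u], target]


def lb_vertex_to_vertex(u, v, owner, to_bridge, sub):
    """Lower bound on the shortest-path distance between u and v."""
    if owner[u] == owner[v]:
        return 0.0  # assumption: no useful bound inside one subgraph
    return lb_vertex_to_subgraph(u, owner[v], owner, to_bridge, sub) + to_bridge[v]


# Toy index for a path graph 1-2-3-4-5-6 with edge weights 2, 3, 4, 1, 5,
# split into subgraph 0 = {1, 2, 3} and subgraph 1 = {4, 5, 6}.
OWNER = {1: 0, 2: 0, 3: 0, 4: 1, 5: 1, 6: 1}
TO_BRIDGE = {1: 5.0, 2: 3.0, 3: 0.0, 4: 0.0, 5: 1.0, 6: 6.0}
SUB = {(0, 1): 4.0, (1, 0): 4.0}
```

In this toy example the bound from vertex 2 to vertex 5 is 3 + 4 + 1 = 8, which equals the true distance because the graph is a path. On a general road network the bound can be strictly smaller than the true distance.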
Example 1.
A road network partitioning example is depicted in Fig. 1(a). There are ten vertices and ten edges in the graph, and they are divided into four subgraphs: , , , and . The values in parentheses after each edge denote the distance. The red vertices represent the bridge vertices in each subgraph, while the red edges denote the cut edges connecting two disjoint subgraphs. E.g., is a cut because it connects two disjoint subgraphs and . and are two bridge vertices because they are two endpoints of a cut .
The corresponding road network index structure is illustrated in Fig. 1(b). For a subgraph , includes two bridge vertices: and . There is one remaining non-bridge vertex which belongs to . Then the lower bound distance from to a bridge vertex in the same subgraph is . Three subgraphs are connected with directly or indirectly through cut edges. The distance between and is , and the distance between and is .
The lower bound distance between and is . The lower bound distance between and is .
5 Pruning-based Rider Request Insertion
In this section, we propose a rider insertion algorithm on top of several pruning rules to insert a new incoming rider request into an existing trip schedule of a driver efficiently, such that a customized utility function is minimized, i.e., (Eq. 6) for Distance-first algorithm and (Eq. 8) for Greedy algorithm, respectively.
5.1 Problem Assumption
First, as in prior work [10, 44, 26], a driver who receives a new rider request keeps the order of the existing trip schedule unchanged. In other words, we do not reorder the current trip schedule to ensure a consistent user experience for riders already scheduled. For example, if a driver has been assigned to pick up first and then pick up , then the pickup timestamp of should be no later than the pickup timestamp of . Second, in contrast to restrictions commonly adopted in prior work [44, 5, 21, 14], where the number of rider requests served by each driver is never greater than two, we assume that more than two riders can be served as long as the seat capacity constraint is not violated, which improves the usability and scalability of the ridesharing system.
5.2 Approach Description
Given a set of drivers and a new rider , we aim to find a matching driver and insert and of into a driver’s trip schedule. A straightforward approach can be applied as follows: (1) for each driver candidate , we enumerate all possible insertion positions for and in ’s current trip schedule (its complexity is , where is the number of points in the trip schedule); (2) for each possible insertion position pair , we check whether it violates the waiting and detour time constraints of both and the other riders who have been scheduled for (). Therefore, the total time consumption is . As the insertion process is crucial to the overall efficiency, we now propose several new pruning strategies. Then we present our algorithm for rider insertion derived from these pruning rules.
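The straightforward approach for a single driver can be sketched as a brute-force search. This is a baseline illustration, not Algo. 1: it assumes the first schedule point is the driver's current location and must stay first, it leaves the constraint check to a caller-supplied `feasible` predicate, and every name is hypothetical.

```python
from typing import Callable, Hashable, Iterator, List, Optional, Sequence, Tuple

Point = Hashable


def candidate_schedules(schedule: Sequence[Point], src: Point,
                        dst: Point) -> Iterator[Tuple[int, int, List[Point]]]:
    """Yield (i, j, new_schedule) for every pair of slots with i <= j.

    The existing order is preserved and the source precedes the destination.
    """
    n = len(schedule)
    for i in range(1, n + 1):          # slot for the source point
        for j in range(i, n + 1):      # slot for the destination point
            new = list(schedule)
            new.insert(j, dst)         # later point first, so index i stays valid
            new.insert(i, src)
            yield i, j, new


def best_insertion(schedule: Sequence[Point], src: Point, dst: Point,
                   dist: Callable[[Point, Point], float],
                   feasible: Callable[[List[Point]], bool]
                   ) -> Optional[Tuple[float, List[Point]]]:
    """Feasible candidate with the smallest additional distance, or None."""
    def length(s: Sequence[Point]) -> float:
        return sum(dist(a, b) for a, b in zip(s, s[1:]))

    base = length(schedule)
    best = None
    for _, _, new in candidate_schedules(schedule, src, dst):
        if not feasible(new):
            continue
        extra = length(new) - base
        if best is None or extra < best[0]:
            best = (extra, new)
    return best
```

A schedule with n points yields n(n+1)/2 candidates, and each one is re-measured and re-checked from scratch, which is the cost the pruning rules are meant to avoid.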
Before presenting the pruning strategies, we would like to introduce how the rider request is inserted and a preliminary called the slack distance.
Suppose that and are inserted in the th and th location respectively, where must hold. There are four ways to insert them as shown in Fig. 2. To accelerate constraint violation checking, we borrow the idea of “slack time” [18, 35] and define “slack distance”.
Definition 5.
(Slack Distance). Given a trip schedule where is the driver’s current location, the slack distance (Eq. 4) is defined as the minimal surplus distance to serve new riders inserted before the th position in . The surplus distance w.r.t. a particular point () after the th position is discussed as follows:
(1) If is a source point () of a rider request , our only concern is whether the waiting time constraint will be violated. The actual pickup distance of from the driver’s current location is , following the trip schedule. In the worst case, is picked up just within . Then the surplus distance generated by is .
(2) If is a destination point () of a rider request , the detour time constraint will be examined. Similarly, the actual dropoff distance is following the trip schedule from the source point . The worst case is that is dropped off within the border of detour time constraint, which is from . Then the surplus distance generated by is .
Thus we define the slack distance as Eq. 4.
(4) 
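Slack distances for a whole schedule can be computed in one backward pass. The sketch below follows the verbal description in Definition 5 and is an interpretation, not the paper's formula. Each stop carries a distance `budget` and an `anchor` index: for a source point the budget is the waiting distance measured from the driver's location (index 0), and for a destination point it is the detour bound measured from that rider's source point. The slack at position k is taken to be the minimum surplus over the stops at position k or later. The `Stop` class and all names are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Hashable, List, Sequence

INF = float("inf")


@dataclass(frozen=True)
class Stop:
    point: Hashable
    budget: float  # largest allowed travel distance from `anchor` to this stop
    anchor: int    # schedule index the budget is measured from


def slack_distances(driver_loc: Hashable, stops: Sequence[Stop],
                    dist: Callable[[Hashable, Hashable], float]) -> List[float]:
    """Return slack[k] for k = 0..n, where n is the number of schedule points.

    slack[k] bounds the extra distance that may be inserted before
    position k; slack[n] (appending at the end) delays nobody.
    """
    points = [driver_loc] + [s.point for s in stops]
    # prefix[k] = travel distance from the driver's location to position k
    prefix = [0.0]
    for a, b in zip(points, points[1:]):
        prefix.append(prefix[-1] + dist(a, b))
    n = len(points)
    slack = [INF] * (n + 1)
    for k in range(n - 1, 0, -1):
        stop = stops[k - 1]
        surplus = stop.budget - (prefix[k] - prefix[stop.anchor])
        slack[k] = min(slack[k + 1], surplus)
    slack[0] = slack[1]  # position 0 is the driver's location; kept for indexing
    return slack
```

As a small example on a line, a driver at 0 who must pick up a rider at 4 within a waiting distance of 6 and drop the rider at 10 within a detour bound of 9 has surpluses 6 - 4 = 2 and 9 - 6 = 3, so the slack distances for positions 1 and 2 are 2 and 3.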
The available seat capacity (Eq. 5) in the process of pickup and dropoff along the trip schedule changes dynamically.
(5) 
In addition, we use an auxiliary variable to indicate the incremental distance by inserting between and , then .
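Both bookkeeping quantities are simple to compute. The sketch below assumes that the available capacity starts from the seats not occupied by riders already on board and then changes by the group size at every pickup or drop-off, and that the incremental distance of inserting a point between two consecutive schedule points is the usual detour expression (the two new legs minus the leg they replace). Both forms are our reading of the text rather than a transcription of the paper's equations, and the names are hypothetical.

```python
from typing import Callable, Hashable, List, Sequence


def free_seats(capacity: int, onboard: int, changes: Sequence[int]) -> List[int]:
    """Free seats at the driver's location and after each scheduled stop.

    changes[k] is +n for a pickup of n riders and -n for a drop-off of n.
    """
    seats = [capacity - onboard]
    for change in changes:
        seats.append(seats[-1] - change)
    return seats


def incremental_distance(a: Hashable, p: Hashable, b: Hashable,
                         dist: Callable[[Hashable, Hashable], float]) -> float:
    """Extra distance caused by visiting p between consecutive points a and b."""
    return dist(a, p) + dist(p, b) - dist(a, b)
```

For a four-seat vehicle with one rider on board, the stop sequence "pick up two, drop off one, drop off two" gives free seats 3, 1, 2, 4. A request for two riders therefore fits at every position, while a request for three riders fits before the first stop (three seats free) or after the last stop.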
Example 2.
Given a driver already carrying a rider , and the current trip schedule . On the driver’s way to drop off , receives a new rider request , then there are three possible updated trip schedules , , and . For , we need to check whether the detour time constraint of and the waiting time constraint of are violated. For , we need to check whether the waiting time constraint of is violated. For , we need to check whether the detour time constraint of , the waiting time constraint and detour time constraint of are violated.
5.3 Pruning Rules
5.3.1 Pruning driver candidates
Based on our proposed index in Section 4, we can estimate the lower bound distance from a driver to a rider’s pickup point. According to the waiting time constraint, drivers outside this range can be filtered out.
Lemma 1.
Given a new rider and a subgraph , if , then the drivers who are located in can be safely pruned.
Proof.
The lower bound distance between the rider’s source point and any vertex in is always greater than or equal to the lower bound distance from to . Therefore, if holds, then , also holds. Thus, the drivers located at any vertex in cannot satisfy the waiting time constraint of the rider. ∎
Lemma 2.
Given a new rider and a driver , if , then can be safely pruned.
Proof.
Since the shortest path distance between the new rider’s source point and the driver’s current location is no less than the lower bound distance , we have . ∎
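Lemmas 1 and 2 suggest a two-level filter: discard whole subgraphs first, then individual drivers. The following sketch assumes, based on the proofs, that a subgraph or driver is pruned when its lower bound distance to the rider's source point exceeds the rider's waiting distance. The two bound functions are passed in as callables, and every name is hypothetical.

```python
from typing import Callable, Dict, Hashable, List, Sequence, Tuple

Vertex = Hashable


def candidate_drivers(
    src: Vertex,
    wait_budget: float,
    drivers_by_subgraph: Dict[int, Sequence[Tuple[str, Vertex]]],
    lb_to_subgraph: Callable[[Vertex, int], float],
    lb_to_vertex: Callable[[Vertex, Vertex], float],
) -> List[str]:
    """Drivers that survive both pruning rules for a rider at `src`."""
    survivors = []
    for sub_id, drivers in drivers_by_subgraph.items():
        # Lemma 1: the whole subgraph is too far away.
        if lb_to_subgraph(src, sub_id) > wait_budget:
            continue
        for driver_id, location in drivers:
            # Lemma 2: this particular driver is too far away.
            if lb_to_vertex(src, location) <= wait_budget:
                survivors.append(driver_id)
    return survivors
```

Survivors are only candidates: since the filter uses lower bounds, each one still has to pass the exact constraint checks during insertion.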
5.3.2 Pruning rider insertion positions
It is worth noting that the insertion of a new rider may violate the waiting or detour time constraint of riders already scheduled for a vehicle. We therefore propose the following rules to reduce the number of insertion positions that must be examined.
Lemma 3.
Given a new rider , a trip schedule and a source point insertion position , if , then the rider should be picked up before the th point.
Proof.
Since each vehicle travels following a trip schedule, we have , which is not less than . Then we can get , which violates the waiting time constraint of . ∎
Lemma 4.
Given a new rider , a trip schedule and a source point insertion position (), if , then the rider cannot be picked up at the th point.
Proof.
The incremental distance generated by picking up is , which is no smaller than . Thus, we obtain , which implies that the incremental distance exceeds the maximal waiting or detour tolerance range of point(s) after the th point. ∎
Lemma 5.
Given a new rider whose source point is already inserted at the th position of a trip schedule , and a destination point insertion position , if (i.e., the insertion position as shown in Fig. 2(b)) and , then cannot be inserted at the th point.
Proof.
Similar to Lemma 4, the additional distance as a result of inserting and is , which is not less than . Then the additional distance is greater than , which violates the waiting or detour time constraint of the point(s) after the th point. ∎
Lemma 6.
Given a new rider whose source point is already inserted at the th position of a trip schedule , and a destination point insertion position , if (i.e., the insertion position as shown in Fig. 2(c)) and , then cannot be inserted at the th point.
Proof.
The incremental distance generated by dropping off at is , which is no smaller than . Then , which violates the waiting or detour time constraint of point(s) after the th position. ∎
Lemma 7.
Given a new rider , a trip schedule , and two insertion positions and , if and , then such two insertion positions for and can be pruned.
Proof.
The actual travel distance from to is , which is not less than (. Then we can obtain that , which violates the detour time constraint of . ∎
Lemma 8.
Given a new rider , a trip schedule , and two insertion positions and , (), if , then such two insertion positions for and can be pruned.
Proof.
It can be easily proved by the capacity constraint. ∎
In practice, we use the lower-bound distance to execute the pruning rules. If a possible insertion cannot be pruned, we use the true distance as a further check, which guarantees the correctness of the pruning operation.
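This two-stage check is a standard filter-and-refine pattern. A minimal sketch, with hypothetical names and with the two distances supplied as zero-argument callables so that the exact computation is skipped whenever the cheap bound already decides the outcome:

```python
from typing import Callable


def exceeds_limit(limit: float,
                  lower_bound: Callable[[], float],
                  exact: Callable[[], float]) -> bool:
    """True if the distance is known to exceed `limit`.

    The cheap lower bound is tried first; the exact shortest-path
    distance is evaluated only when the bound cannot prune.
    """
    if lower_bound() > limit:
        return True          # pruned without an exact computation
    return exact() > limit   # refine with the true distance
```

Since a lower bound never exceeds the true distance, the first test can only prune insertions that the exact test would also reject, so no feasible insertion is lost.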
5.4 Algorithm Sketch
The pseudocode for the rider insertion algorithm is shown in Algo. 1. We initialize two local variables: to record the utility value found each time, and the best utility value found so far (line 1), where a lower value denotes better utility. The pruning rules are first executed for the pickup point . We examine whether the vacant vehicle capacity is sufficient to hold the riders, and whether the driver is close enough to provide the ridesharing service (line 3). If the conditions hold for the th position, then we check whether the detour distance will exceed the slack distance of the following points starting from the th position.
If all of the conditions are satisfied, we continue to check the destination point insertion (line 5). Otherwise, the source point is not added at the th position. Similarly, we check whether the current capacity is sufficient (line 6). Since the sequence between and leads to a different detour distance, two cases are possible: (i.e., the source and the destination are added to consecutive positions, shown in Fig. 2(a) and Fig. 2(b)) and (i.e., the positions are not consecutive, as shown in Fig. 2(c) and Fig. 2(d)). If , we compute the incremental distance generated by and , then check whether it is greater than the slack distance of the following points after the th position (line 9). If , besides checking whether the incremental distance generated by and is larger than the slack distance, we also need to check whether the detour time constraint of is violated (line 12). If the current utility value obtained is better than the current optimal value found, we update (line 16).
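The enumeration described above can be sketched in Python. This is a simplified illustration under stated assumptions, not a transcription of Algo. 1: points are planar with Euclidean distances, time constraints are expressed as distance budgets (constant speed), the utility is the additional distance, and all names (`detour`, `best_insertion`, `slack`, `load`, `max_wait`, `max_trip`) are hypothetical.

```python
import math


def detour(route, k, p, q=None):
    """Extra distance from visiting p (and then q, if given) right after route[k]."""
    added = math.dist(route[k], p)
    last = p
    if q is not None:
        added += math.dist(p, q)
        last = q
    if k + 1 < len(route):  # reconnect to the next stop, if there is one
        added += math.dist(last, route[k + 1]) - math.dist(route[k], route[k + 1])
    return added


def best_insertion(route, slack, load, capacity, src, dst, riders, max_wait, max_trip):
    """Return (added_distance, i, j) for the cheapest feasible insertion, or None.

    route[0] is the driver's location; slack[k] is the extra distance stop k
    can absorb; load[k] is the number of passengers on board after stop k.
    src is inserted after route[i] and dst after route[j], with i <= j.
    """
    n = len(route)
    prefix = [0.0] * n  # prefix[k]: driving distance from route[0] to route[k]
    for k in range(1, n):
        prefix[k] = prefix[k - 1] + math.dist(route[k - 1], route[k])
    tail = [math.inf] * (n + 1)  # tail[k]: smallest slack among stops k..n-1
    for k in range(n - 1, 0, -1):
        tail[k] = min(slack[k], tail[k + 1])

    best = None
    for i in range(n):
        # Pickup checks: free seats and waiting budget.
        if load[i] + riders > capacity:
            continue
        if prefix[i] + math.dist(route[i], src) > max_wait:
            continue
        # Case j == i: pickup and drop-off are consecutive.
        inc = detour(route, i, src, dst)
        if inc <= tail[i + 1] and math.dist(src, dst) <= max_trip:
            if best is None or inc < best[0]:
                best = (inc, i, i)
        # Case j > i: existing stops lie between pickup and drop-off.
        inc_src = detour(route, i, src)
        mid = math.inf  # smallest slack among stops i+1..j
        for j in range(i + 1, n):
            mid = min(mid, slack[j])
            if load[j] + riders > capacity or inc_src > mid:
                break  # any later j also keeps the rider on board past stop j
            inc = inc_src + detour(route, j, dst)
            ride = (math.dist(src, route[i + 1]) + prefix[j] - prefix[i + 1]
                    + math.dist(route[j], dst))
            if inc <= tail[j + 1] and ride <= max_trip:
                if best is None or inc < best[0]:
                    best = (inc, i, j)
    return best
```

For example, with `route = [(0, 0), (10, 0)]`, a rider travelling from `(2, 0)` to `(8, 0)` lies directly on the existing path, so the consecutive insertion after the driver's location adds no distance.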
Time complexity analysis. The two nested loops (lines 2 and 5), each iterating over points, take total time. The examination for constraint violations (lines 3, 4, 6, 9, and 12) only takes . Calculating the utility function value (line 14) also takes , which will be explained later according to different utility functions . In our method, we use a fast hub-based labeling algorithm [1] to answer the shortest path distance query. First, a hub label index is constructed in advance by assigning each vertex (considered as a “hub indicator”) a hub label which is a collection of pairs (, (, )) where . Then given a shortest path query (, ), the algorithm considers all common vertices in the hub labels of and . Therefore, the cost of answering the shortest path distance query from to depends on the sizes of the hub label sets. We assume that the time complexity is , where indicates the average label size of all label sets [22]. Therefore, the total time complexity is .
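The hub-label query step can be illustrated with a short sketch. It assumes an undirected graph and that each label is stored as a dictionary mapping a hub vertex to the shortest distance between the labelled vertex and that hub; the name `hub_distance` and the dictionary layout are illustrative choices, not the data structure of [1].

```python
import math


def hub_distance(label_u, label_v):
    """Shortest distance between u and v from their hub labels.

    Each label maps hub -> distance to that hub. The answer is the minimum of
    d(u, h) + d(h, v) over hubs h present in both labels, or infinity if the
    labels share no hub.
    """
    if len(label_u) > len(label_v):
        label_u, label_v = label_v, label_u  # scan the smaller label
    best = math.inf
    for hub, d_u in label_u.items():
        d_v = label_v.get(hub)
        if d_v is not None:
            best = min(best, d_u + d_v)
    return best
```

The loop touches each entry of the smaller label once, which is why the query cost grows with the label size rather than with the size of the road network.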
6 Matching-based Algorithm
In this section, we introduce two heuristic algorithms to bilaterally match a set of drivers and rider requests. Furthermore, we describe the insertion/update process for dynamic ridesharing.
6.1 Distance-first Algorithm
The Distance-first algorithm is executed in the following way: (1) we choose an unserved rider request with the earliest submission time from , and perform the pruning-based insertion algorithm (Algo. 1) to find a driver with minimal additional distance. (2) If a driver can be found to match , then the pair is joined and added to the final matching result.
The Distance-first algorithm has two important aspects. First, a matching for a single rider insertion is made whenever the constraints are not violated, even if the effectiveness is relatively low. Second, if multiple drivers are available to provide a ridesharing service, the Distance-first algorithm always chooses the one with the minimal additional distance. Thus, we define the optimization utility function as the additional distance .
(6) 
Based on the insertion option used (Fig. 2), we calculate in Eq. 7. As the distance from a driver’s location to any existing point in is already calculated and stored, the computation of only takes if the trip schedule and two insertion positions are given. Specifically, is calculated as:
(7) 
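One way to sanity-check a closed-form expression for the additional distance is to compare it with the difference between the schedule lengths after and before the insertion. The sketch below does this under the assumption of planar points and Euclidean distances in place of road-network distances. `route_length` and `added_distance` are hypothetical helper names, and this brute-force version takes time linear in the schedule length rather than the constant time discussed above.

```python
import math


def route_length(route):
    """Total length of a route given as an ordered list of (x, y) points."""
    return sum(math.dist(a, b) for a, b in zip(route, route[1:]))


def added_distance(route, i, j, src, dst):
    """Additional distance when src is visited after route[i] and dst after route[j].

    Requires 0 <= i <= j < len(route); i == j means pickup and drop-off are
    consecutive stops.
    """
    new_route = route[:i + 1] + [src] + route[i + 1:j + 1] + [dst] + route[j + 1:]
    return route_length(new_route) - route_length(route)
```

For instance, on `route = [(0, 0), (10, 0)]` with `src = (2, 0)` and `dst = (8, 0)`, choosing `i = j = 0` adds no distance, whereas `i = 0, j = 1` forces the vehicle to drive back from `(10, 0)` to `(8, 0)` and adds 2.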
The pseudocode of the Distance-first method is illustrated in Algo. 2, which has two main phases: Filter and Refine. We initialize an empty set to store the current valid driver-rider pairs (line 1).
In the Filter phase (lines 2–7), for each rider we prune out ineligible driver candidates who violate the waiting time constraint. Specifically, for each rider , we maintain a driver candidate list to store the drivers who can possibly serve . We first prune the subgraphs from which it is impossible for a driver to serve (line 4). Then we further prune the drivers using a tighter bound, which is the lower bound distance between the driver’s location and the rider’s pickup point (line 6). The remaining drivers are inserted into as candidates.
In the Refine phase, the retained driver candidates are considered in the driver-rider matching process. We select one unserved rider with the earliest submission time (line 9) and find a suitable driver with minimal additional distance. In the for loop (lines 12–15), for each driver in , we examine the feasibility by using Algo. 1 (line 13) and choose the one with the smallest additional distance (line 15). If a driver that can satisfy all the requirements is found, then is added to the corresponding trip schedule (line 17).
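The Refine loop described above can be sketched as follows. This is a simplified illustration, not Algo. 2 itself: the Filter phase is assumed to have already produced a candidate list per rider, and the callbacks `added_distance` (returning the additional distance, or `None` when the insertion is infeasible) and `commit` (updating the chosen driver's schedule) are hypothetical stand-ins for the insertion algorithm.

```python
import math


def distance_first(requests, candidates, added_distance, commit):
    """Match riders to drivers in order of submission time.

    requests: list of (submit_time, rider_id) tuples.
    candidates: dict mapping rider_id to an iterable of driver ids.
    Returns the list of (driver_id, rider_id) pairs that were matched.
    """
    matches = []
    for _, rider in sorted(requests, key=lambda req: req[0]):
        best_driver, best_cost = None, math.inf
        for driver in candidates.get(rider, ()):
            cost = added_distance(driver, rider)
            if cost is not None and cost < best_cost:
                best_driver, best_cost = driver, cost
        if best_driver is not None:
            commit(best_driver, rider)  # later riders see the updated schedule
            matches.append((best_driver, rider))
    return matches
```

Each rider is committed as soon as any feasible driver exists, so an early request can take a driver that a later request would have used more efficiently.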
6.2 Greedy Algorithm
The drawback of the Distance-first algorithm is that the rider request with the earliest submission time is selected in each iteration, which neglects the served rate. Therefore, we propose a Greedy algorithm in Algo. 3 to deal with our dual-objective problem: maximize the served rate and minimize the additional distance.
The Filter phase in Greedy is similar to that in Distance-first. However, in the Refine phase, we adopt a dispatch strategy by allocating the rider to the driver that results in the best utility gain (i.e., minimum utility value) locally each time, until there is no remaining driver-rider pair to refine. Since we want to make more riders happy, where the served riders can have less additional distance, we use a utility function to represent the average additional distance of the served riders. Then if is served by , the utility function is defined as:
(8) 
A rider request attached with more riders () is more favoured according to our utility gain (Eq. 8). For example, if a rider makes a ridesharing request with a friend, they tend to have the same source and destination point, which implies that the detour distance and waiting time can be reduced or even avoided (if not shared with other riders). On the contrary, if two separate requests are served by a driver, it is imperative for us to coordinate the dispatch permutation sequence and guarantee each of them is satisfied. We expect that is minimized as much as possible, i.e., more riders are served with less detour distance.
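The dispatch strategy can be illustrated with a priority queue. The sketch below is a simplification under stated assumptions: the utility of a pair is taken to be the additional distance divided by the number of riders in the request, and stale queue entries are re-evaluated lazily when they are popped instead of being updated eagerly after every insertion. The callbacks `added_distance` and `commit` are hypothetical, as in any standalone sketch of this kind.

```python
import heapq


def greedy_match(riders, drivers, added_distance, commit):
    """Repeatedly commit the driver-rider pair with the smallest utility value.

    riders: dict mapping rider_id to the number of riders in the request.
    added_distance(driver, rider) returns a float, or None if infeasible.
    """
    def utility(driver, rider):
        cost = added_distance(driver, rider)
        return None if cost is None else cost / riders[rider]

    heap = []
    for rider in riders:
        for driver in drivers:
            value = utility(driver, rider)
            if value is not None:
                heapq.heappush(heap, (value, driver, rider))

    served, matches = set(), []
    while heap:
        value, driver, rider = heapq.heappop(heap)
        if rider in served:
            continue  # this request was already matched to another driver
        current = utility(driver, rider)  # the driver's schedule may have changed
        if current is None:
            continue  # no longer feasible
        if current > value:
            heapq.heappush(heap, (current, driver, rider))  # retry with fresh value
            continue
        commit(driver, rider)
        served.add(rider)
        matches.append((driver, rider))
    return matches
```

Lazy re-evaluation keeps the code short, but it can miss a pair whose utility value dropped after another insertion; an eager update of all pairs involving the chosen driver avoids that at a higher cost per iteration.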
The Greedy method is presented in Algo. 3. We initialize a min-priority queue to save the valid driver-rider pair candidates with their utility values, and an empty set to store our final result (line 1). In the Filter phase, we traverse each driver-rider pair to check whether they can be matched (lines 2–9). If the driver can carry the rider, then we calculate its utility value (line 7) and push it into a pool (line 9). In the Refine phase, in each iteration we greedily select the pair with the smallest utility value (line 11), perform an insertion (line 12), and append the pair to our result list (line 13). Meanwhile, we remove all pairs related to from (line 14). For riders where was considered as a candidate, we check whether it is still valid to include them as a pair, since the insertion of a new rider may influence the riders previously considered (lines 15–19). If is still feasible, we update the utility gain value for those riders (line 17). Otherwise, the driver-rider pair is removed from (line