Dynamic Ridesharing in Peak Travel Periods

04/06/2020
by   Hui Luo, et al.
The University of Melbourne
RMIT University

In this paper, we study a variant of the dynamic ridesharing problem with a specific focus on peak hours: Given a set of drivers and rider requests, we aim to match drivers to each rider request by achieving two objectives: maximizing the served rate and minimizing the total additional distance, subject to a series of spatio-temporal constraints. Our problem can be distinguished from existing work in three aspects: (1) Previous work did not fully explore the impact of peak travel periods where the number of rider requests is much greater than the number of available drivers. (2) Existing solutions usually rely on single objective optimization techniques, such as minimizing the total travel cost. (3) When evaluating the overall system performance, the runtime spent on updating drivers' trip schedules as per incoming rider requests should be incorporated, while it is excluded by most existing solutions. We propose an index structure together with a set of pruning rules and an efficient algorithm to include new riders into drivers' existing trip schedule. To answer new rider requests effectively, we propose two algorithms that match drivers with rider requests. Finally, we perform extensive experiments on a large-scale test collection to validate the proposed methods.


1 Introduction

Millions of drivers provide transportation services for over ten million passengers every day at Didi Chuxing [12], which is a Chinese counterpart of UberPOOL [37]. In peak travel periods, Didi needs to match more than a hundred thousand passengers to drivers every second [43], and rider demand often greatly exceeds the available driver capacity. Two approaches can be used to mitigate this problem. The first attempts to predict areas with high travel demand using historical data and statistical predictions or a heat map, so that taxis can be strategically deployed in those areas in advance. The alternative is to serve multiple riders with fewer vehicles using a ridesharing service: riders with similar routes and time schedules can share the same vehicle [32, 4]. According to statistical data from the Bureau of Infrastructure, Transport and Regional Economics [6], there are fewer than 1.6 persons per vehicle per kilometer in Australia. If only 10% of vehicles carried more than one passenger, annual fuel consumption would fall by 5.4% [19]. Therefore, increasing vehicle occupancy rates would provide many benefits, including a reduction in greenhouse gas emissions. Moreover, it has been reported that a crucial imbalance exists between supply and demand in peak hour scenarios, where the rider demand is double the driver availability according to a statistical analysis of historical data at Didi Chuxing [36]. Alleviating traffic congestion during peak commuter times will ultimately require significant government commitment to increasing the region’s investment in core transportation infrastructure [13]. In this paper, we focus on the dynamic ridesharing problem, specifically during peak hour travel periods.

From an extensive literature study, we make the following observations to motivate this work – (1) Existing ridesharing studies [44, 9] do not fully explore the scenario where the number of riders is much greater than the number of available drivers, and so the scalability of current solutions in this setting remains unclear. (2) Prior studies [30, 25, 44, 18, 10, 40] primarily focus on single objective optimization solutions, such as minimizing the total travel distance from the perspective of drivers [25, 18], or maximizing the served rate of ridesharing system [40]. In contrast, we aim to optimize two objectives: maximize the served rate and minimize the total additional distance. (3) Previous studies [25, 15, 35, 9, 7] report the processing time mainly based on the rider request matching time, but not the trip schedule update time, which encompasses a driver’s current trip schedule and the underlying index structure updates. However, in a dynamic ridesharing scenario where vehicles are continuously moving, accounting for these additional costs produces a more realistic comparison of the algorithms being studied.

Our goal in this work is to determine a series of trip schedules with the minimum total additional distance capable of accommodating as many rider requests as possible, under a set of spatio-temporal constraints. In order to achieve this goal, we must account for three features: (1) each driver has an initial location and trip schedule, but rider requests are being continuously generated in a streaming manner; (2) before a driver receives a new incoming rider request, the arrival time and location of the rider are unknown; (3) the driver and the rider should be informed at short notice about whether a matching is possible or not. Specifically, the driver should be notified quickly if a new rider is added, and similarly the rider should be informed quickly if her travel request can be fulfilled based on the current preference settings, such as waiting time tolerance to be picked up.

The following challenges arise in addressing this problem:

(1) How can we find eligible driver-rider matching pairs to maximize the served rate and also minimize the additional distance? Serving more rider requests implies that drivers have to travel further to pick up the additional rider(s). For instance, if a driver already has passengers on board but receives a new rider request $r$, the driver might detour to pick up $r$, which increases both the served rate and the distance traveled.

(2) How can we decide on the best service sequence for a trip schedule? Every rider has their own maximum allowable waiting time and detour time tolerance. When a rider request is served, these constraints should not be violated. For example, suppose a rider request $r_1$ is already in the trip schedule of a driver $d$, and a new rider request $r_2$ is received while the driver is serving $r_1$. The driver needs to determine whether $r_2$ can be picked up first without violating $r_1$’s constraints.

(3) How can we efficiently support the ridesharing problem for streaming rider requests? The time used for determining the updated trip schedule as per new rider requests should not exceed the time window size.

In order to address the above challenges, we make the following contributions:

  • We define a variant of the dynamic ridesharing problem, which aims to optimize two objectives subject to a series of spatio-temporal constraints presented in Section 3. In addition, our work mainly focuses on a common yet important scenario where the number of drivers is insufficient to serve all riders in peak travel periods.

  • We develop an index structure on top of a partitioned road network and compute the lower bound of the shortest path distance between any two vertices in constant time in Section 4. We then propose a pruning-based rider request insertion algorithm based on several pruning rules to accelerate the matching process in Section 5.

  • We further propose two algorithms to find matchable and eligible driver-rider pairs, which aim to maximize the served rate and minimize the total additional distance, in Section 6.1 and Section 6.2 respectively.

  • We conduct extensive experiments on a real-world dataset in order to validate the efficiency and effectiveness of our methods under different parameter settings in Section 7.

In addition, we review the related work in Section 2 and conclude the work in Section 8.

2 Related Work

Ridesharing has been intensively studied in recent years, in both static and dynamic settings. A typical formulation for the static ridesharing problem consists of designing drivers’ routes and schedules for a set of rider requests with departure and arrival locations known beforehand [24, 23, 42, 38], while the dynamic ridesharing problem is based on the setting that new riders are continuously added in a stream-wise manner [17, 31, 8, 27].

Static Ridesharing Problem. Ta et al. [30] defined two kinds of ridesharing models to maximize the shared route ratio, under the assumption that at most one rider can be assigned to a driver in the vehicle. Cheng et al. [10] proposed a utility-aware ridesharing problem, which can be regarded as a variant of the dial-a-ride problem [11, 29]. The aim was to maximize the riders’ satisfaction, which is defined as a linear combination of vehicle-related utility, rider-related utility, and trajectory-related utility, such as rider sharing preferences. Bei and Zhang [5] investigated the assignment problem in ridesharing and designed an approximation algorithm with a worst-case performance guarantee when only two riders can share a vehicle. In contrast, we focus primarily on the dynamic ridesharing problem, and allow the maximum number of riders to be greater than two.

Dynamic Ridesharing Problem. Several existing techniques are well illustrated and outlined by two recent surveys [3, 16]. We describe the literature in chronological order. Agatz et al. [2] explored a ride-share optimization problem in which the ride-share provider aims to minimize the total system-wide vehicle-miles. Xing et al. [40] studied a multi-agent system with the objective of maximizing the number of served riders. Kleiner et al. [21] proposed an auction-based mechanism to facilitate both riders and drivers to bid according to their preferences. Ma et al. [25, 26] proposed a dynamic taxi ridesharing service to serve each request by dispatching a taxi with the minimum additional travel distance incurred. Huang et al. [18] designed a kinetic tree to enumerate all possible valid trip schedules for each driver. Duan et al. [14] studied the personalized ridesharing problem which maximizes a satisfaction value defined as a linear function of distance and time cost. Zheng et al. [44] considered the platform profit as the optimization objective to be maximized by dispatching the orders to vehicles. Tong et al. [35] devised route plans to maximize the unified cost which consists of the total travel distance and a penalty for unserved requests. Chen et al. [9] considered both the pick-up time and the price to return multiple options for each rider request in dynamic ridesharing. Xu et al. [41] proposed an efficient rider insertion algorithm that minimizes the maximum flow time of all requests or minimizes the total travel time of the driver.

Our work is different from existing ridesharing studies in two aspects: (1) Our focus is the peak hours scenario where there are too few drivers to satisfy all of the rider requests, which has not been explored in previous work. Later in the experimental study we extend the state-of-the-art [9] to the peak hour scenario and show that our approach outperforms it in both efficiency and effectiveness. (2) Most previous studies aim to optimize a single objective [18, 2, 25, 40, 44, 10, 41] or a customized linear function [14, 35]. This differs from our problem scenario where we solve the dual-objective optimization problem. Chen et al. [9] also considered two criteria (price and pick-up time) from the perspective of riders, but not the same criteria (served rate and additional distance) as we use to satisfy riders, drivers and ridesharing system requirements. In our problem, riders can provide their personal sharing preferences by giving constraint values, such as waiting time tolerance, drivers and ridesharing system expect to serve more riders with less detour distance, which coincides with our goal, i.e., maximizing the served rate and meanwhile minimizing the additional distance. Kleiner et al. [21] took both minimizing the total travel distance and maximizing the number of served riders into consideration, where riders and drivers can select each other by adjusting bids. They assumed that each driver can only share the trip with one other rider, which limited the potential of ridesharing system. Other recent studies on task assignment [33, 34] exploit bilateral matching between a set of workers and tasks to achieve one single objective, minimizing the total distance [33] or maximizing the total utility [34]. However, when the ordering of multiple riders must also be mapped to a single driver, bilateral mapping is not sufficient.

3 Problem formulation

3.1 Preliminaries

Let $G = \langle V, E \rangle$ be a road network represented as a graph, where $V$ is the set of vertices and $E$ is the set of edges. We assume drivers and riders travel along the road network. Let $D$ be a set of drivers, where each $d \in D$ is a tuple $\langle l, c, S \rangle$. Here, $d.l$ is the current location, $d.c$ is the maximal seat capacity, and $d.S$ is the current trip schedule of the driver $d$. If $d.l$ does not coincide with a vertex, we map the location to the closest vertex for ease of computation. Each trip schedule $S = \langle o_0, o_1, \ldots, o_m \rangle$ is a sequence of points, where $o_0$ is the driver’s current location, and $o_k$ ($1 \le k \le m$) is a source or destination point of a rider. We assume that the rider requests arrive in a streaming fashion. We impose a time-based window model, where we process the set of requests that arrive in the most recent timeslot.

Definition 1.

(Rider Request). Let $R$ be a set of rider requests. Each $r \in R$ is a tuple $\langle t, l^s, l^d, n, w, \epsilon \rangle$, where $r.t$ is the request submission time, $r.l^s$ is the source location, $r.l^d$ is the destination location, $r.n$ is the number of riders in that request, $r.w$ is the waiting time threshold (i.e., the maximum time $r$ is willing to wait to be picked up after submitting a request), and $r.\epsilon$ is a detour time threshold (explained later in Definition 2).

Additional distance.  As mentioned before, each driver $d$ maintains a trip schedule $S$, where the corresponding riders are served sequentially in $S$. If a new rider must be served by $d$, the trip schedule changes. The additional distance is defined as the difference between the travel distance of the updated trip schedule $S'$ after a new rider request is inserted and the travel distance of the original trip schedule $S$, i.e., $dist(S') - dist(S)$.

Here, the travel distance of a trip schedule $S = \langle o_0, o_1, \ldots, o_m \rangle$ is computed as $dist(S) = \sum_{k=0}^{m-1} dis(o_k, o_{k+1})$, where the distance $dis(o_k, o_{k+1})$ between two points is computed as the shortest path in the road network $G$. For any two points $o_i$ and $o_j$ ($i < j$) which are not adjacent in $S$, the travel distance from $o_i$ to $o_j$ is defined as follows: $dist(o_i, o_j) = \sum_{k=i}^{j-1} dis(o_k, o_{k+1})$.
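As a minimal sketch of these two quantities, the following Python snippet sums shortest-path legs along a schedule and takes the difference between an updated and an original schedule. The function names and the one-dimensional `line_dis` metric are illustrative assumptions standing in for a real shortest-path oracle on the road network.

```python
from typing import Callable, Sequence

def schedule_distance(schedule: Sequence, dis: Callable) -> float:
    """Travel distance of a schedule: sum of shortest-path legs between consecutive points."""
    return sum(dis(a, b) for a, b in zip(schedule, schedule[1:]))

def additional_distance(old: Sequence, new: Sequence, dis: Callable) -> float:
    """Extra distance caused by replacing schedule `old` with schedule `new`."""
    return schedule_distance(new, dis) - schedule_distance(old, dis)

def line_dis(a: float, b: float) -> float:
    """Toy stand-in for a shortest-path oracle (assumption): points live on a line."""
    return abs(a - b)

old = [0, 10]          # driver at 0, one pending drop-off at 10
new = [0, 4, 12, 10]   # new rider picked up at 4 and dropped off at 12
print(additional_distance(old, new, line_dis))  # 14 - 10 = 4
```

In a real system `dis` would be backed by a shortest-path index, and the schedule points would be road network vertices rather than numbers.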

Served Rate.  The served rate is defined as the ratio of the number of served riders (i.e., riders matched with a driver) over the total number of riders $|R|$. Now we formally define the dynamic ridesharing problem.

3.2 Problem Definition

Definition 2.

(Dynamic Ridesharing). Given a set of drivers $D$ and a set of new incoming riders $R$ on road network $G$, the dynamic ridesharing problem finds the optimal driver-rider pairs such that (1) the served rate is maximal; (2) the total additional distance is minimal, subject to the following spatio-temporal constraints:

(a) Capacity constraint. The number of riders served by any driver $d$ should not exceed the corresponding maximal seat capacity $d.c$.

(b) Waiting time constraint. The actual time for the driver to pick up the rider $r$ after receiving the request should not be greater than the rider’s waiting time constraint $r.w$.

(c) Detour time constraint. A driver may detour to pick up other riders, so the actual travel time of any rider $r$ in the road network should be bounded by the shortest travel time scaled by the corresponding rider’s detour threshold $r.\epsilon$, i.e., by $(1 + r.\epsilon)$ times the shortest travel time from $r.l^s$ to $r.l^d$.

Note that time and distance are interchangeable using reasonable travel speeds collected from historical data. Therefore, we emphasize the following two points for clarity of exposition: (i) we adopt a uniform travel speed assumption henceforth, while the experiments (Section 7) are conducted under different travel speed settings to simulate varying road conditions in peak hours; (ii) in accordance with our index structure (Section 4), which is a distance-based framework, $r.w$ is stored as a distance value computed by multiplying the waiting time by the travel speed. In regard to the detour time constraint, we represent it as an inequality based on distance, i.e., $dist(r.l^s, r.l^d) \le (1 + r.\epsilon) \cdot dis(r.l^s, r.l^d)$, where $dist(r.l^s, r.l^d)$ is the actual travel distance, and $dis(r.l^s, r.l^d)$ is the shortest travel distance.
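The time-to-distance conversion and the two rider-side checks can be sketched as follows. This is only an illustration under stated assumptions: the `Request` fields, the units (minutes, km, km/h), and the reading of the detour threshold as a ratio added to 1 are choices made for the example, not values taken from the paper.

```python
from dataclasses import dataclass

@dataclass
class Request:
    wait_minutes: float   # waiting time threshold (assumed to be given in minutes)
    detour_ratio: float   # detour threshold, read here as a ratio added to 1 (assumption)
    shortest_km: float    # shortest source-to-destination distance

def wait_budget_km(req: Request, speed_kmh: float) -> float:
    """Waiting time threshold expressed as a distance under a uniform travel speed."""
    return req.wait_minutes / 60.0 * speed_kmh

def satisfies_constraints(req: Request, pickup_km: float, ride_km: float,
                          speed_kmh: float) -> bool:
    """pickup_km: distance the driver travels before pickup; ride_km: rider's actual ride."""
    ok_wait = pickup_km <= wait_budget_km(req, speed_kmh)
    ok_detour = ride_km <= (1.0 + req.detour_ratio) * req.shortest_km
    return ok_wait and ok_detour

req = Request(wait_minutes=5, detour_ratio=0.2, shortest_km=10.0)
print(satisfies_constraints(req, pickup_km=3.5, ride_km=11.5, speed_kmh=48))  # True
print(satisfies_constraints(req, pickup_km=4.5, ride_km=11.5, speed_kmh=48))  # False
```

With a 5 minute tolerance at 48 km/h the pickup budget is about 4 km, so the second call fails on the waiting constraint alone.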

3.3 Solution Overview

The dynamic ridesharing problem is a classical constraint-based combinatorial optimization problem, which was proven to be NP-hard [44]. Our objective is to match the set of new rider requests in each timeslot with the set of drivers and update the corresponding drivers’ trip schedules, such that all constraints are met, the served rate is maximized, and the total additional distance is minimized.

We first propose an underlying index structure on top of a partitioned road network to compute the lower bound distance between any two vertices in Section 4. Then, one crucial problem is to accommodate a new incoming rider request in an existing trip schedule for a driver efficiently. We present a pruning based algorithm to efficiently insert the source and destination points of a rider into an existing trip schedule of a driver in Section 5. Note that, when inserting new points into a trip schedule, the constraints of existing riders in that trip schedule cannot be violated. Moreover, multiple drivers may be able to serve a rider, and there can be multiple options to insert a rider’s source and destination points into a trip schedule. Thus, the dynamic insertion of a rider’s request into a trip schedule is a difficult optimization problem.

Next, we propose two different algorithms in Section 6 to find the match between $R$ and $D$ using the insertion algorithm. The first is the Distance-first algorithm in Section 6.1. Specifically, we process each rider request one by one in a first-come-first-serve manner according to the request submission time. For each rider request, we invoke the insertion algorithm (Algo. 1) to find an eligible driver who generates the minimal additional distance. Although Distance-first can match each rider request with a suitable driver efficiently, the served rate of the ridesharing system is neglected in the process. Therefore, we propose the Greedy algorithm in Section 6.2. In this approach, we consider the batch of rider requests within the most recent timeslot (e.g., 10 seconds) altogether and match them with a set of drivers optimally by trading off two metrics: served rate and additional distance.

4 Index Structure

Symbol Description
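$d$: A driver with a unique ID, current location $d.l$, maximal seat capacity $d.c$, and current trip schedule $d.S$
$S = \langle o_0, o_1, \ldots, o_m \rangle$: A trip schedule consisting of a sequence of points, where $o_0$ is the driver’s current location, and $o_k$ ($1 \le k \le m$) is a source or destination point of a rider
$r$: A rider request with the submission time $r.t$, a source point $r.l^s$, a destination point $r.l^d$, the number of riders $r.n$, a waiting time threshold $r.w$, and a detour threshold $r.\epsilon$
$dist(o_i, o_j)$: The actual travel distance between $o_i$ and $o_j$
$dis(o_i, o_j)$: The shortest travel distance between $o_i$ and $o_j$
$LB(G_i, G_j)$: The lower bound distance between two subgraphs $G_i$ and $G_j$
$LB(v, G_j)$: The lower bound distance between a vertex $v$ and a subgraph $G_j$
$LB(u, v)$: The lower bound distance between any two vertices $u$ and $v$
$\Delta(o_i, o, o_{i+1})$: The incremental distance incurred by inserting $o$ between $o_i$ and $o_{i+1}$, i.e., $dis(o_i, o) + dis(o, o_{i+1}) - dis(o_i, o_{i+1})$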
, Two optimization utility functions
The update time window
The travel speed
TABLE I: Symbol and description

In this section, we propose a new index structure on top of a partitioned road network. First we present the motivation behind the index design, and then we present the details of the index in Section 4.2. Table I presents the notation used throughout this work.

4.1 Motivation

The distance computation from a driver’s location to a rider’s pickup or drop-off point can be reduced to the shortest path computation between their closest vertices in the road network, which can be easily solved using an efficient hub-based labelling algorithm [1]. Although invoking the shortest path computation once only requires a few microseconds, a huge number of online shortest path computations are required when trying to optimally match new incoming riders with the drivers with constraints, and update the trip schedules accordingly. Such computations lead to a performance bottleneck.

A straightforward way is to precompute the shortest path distance offline for all vertex pairs and store them in memory or disk. Then the shortest path query problem is simply reduced to a direct look-up operation. Although the query can be processed efficiently, this approach is rarely used in a large road network in practice, especially when many variables may change in a dynamic or streaming scenario. Therefore, it is essential but non-trivial to devise an efficient index over road network which can be used to estimate the actual shortest path distance between any two locations.

Since road networks are often combined with non-Euclidean distance metrics, a traditional spatial index cannot be directly used. For example, a grid index is widely used in existing ridesharing studies [9, 35, 25, 7] to partition the space. Generally, they divide the whole road network into multiple equal-sized cells and then compute the lower or upper bound distance between any two grids. These distance bounds are further used for pruning. However, as the density distribution of the vertices vary widely in urban and rural regions, most grids are empty, and contain no vertex. For example, more than 80% grids are empty in the grid index, resulting in very weak pruning power in ridesharing scenarios [9]. Although a quadtree index can divide the road network structure in a density-aware way, it has to maintain a consistent hierarchical representation such that each child node update may lead to a parent node update, which can increase the update costs when available drivers are moving (which is often true in real world scenarios). Therefore, we choose to adopt a density-based road network partitioning approach, which can efficiently estimate shortest path distances.

4.2 Road Network Index

The index is constructed in two steps – (1) Partition the road network into subgraphs such that closely connected vertices are stored in the same subgraph, while the connections among subgraphs are minimized. (2) For each subgraph, it stores the information necessary to efficiently estimate the shortest path between any pair of vertices in the road network.

Fig. 1: An illustration example of road network indexing. (a) Road network partition example. The red vertices represent the bridge vertices in each subgraph, while the red edges denote the cut edges connecting two disjoint subgraphs. (b) Road network index example.
Fig. 2: Four different ways the source and the destination points of a new rider can be inserted in an existing trip schedule. (a) The source insertion position $i$ and the destination insertion position $j$ are both after the $m$-th position (after finishing the existing trip schedule). (b) $l^s$ and $l^d$ are consecutive and inserted before the $m$-th position. (c) $l^s$ and $l^d$ are not consecutive and inserted before the $m$-th position. (d) $l^s$ is inserted before the $m$-th position but $l^d$ is inserted after the $m$-th position.

Density-based Partitioning.  We present a density-based partitioning approach which divides the road network into multiple subgraphs. The partition process follows two criteria: (1) Similar number of vertices: each subgraph maintains an approximately equal number of vertices such that the density distributions of the subgraphs are similar. (2) Minimal cut: an edge is considered as a cut if its two endpoints belong to two disjoint subgraphs. The number of cuts is minimal, so that the vertices in a subgraph are closely connected.

Given a road network with a vertex set and an edge set , a partition number , is divided into a set of subgraphs , where each subgraph contains a subset of vertices and a subset of edges such that (1) , ; (2) if , and , then , ; (3) ; (4) the number of edge cuts is minimized.

The graph partitioning problem has been well-studied in the literature and is not the primary focus of this paper. Instead, we focus on how to define the distance bounds in order to reduce unnecessary shortest path distance computations. Thus we use a state-of-the-art method [20] to obtain the density-based partitioning of the road network $G$.

Subgraph information.  After we obtain a set of disjoint subgraphs from $G$, we build our index by storing the following information for each subgraph $G_i$.

  1. Bridge Vertex Set. If an edge is a cut, then each of its two endpoints is regarded as a bridge vertex. We store the set $B_i$ of the bridge vertices of each subgraph $G_i$.

  2. Non-Bridge Vertex Set. Vertices of a subgraph $G_i$ that are not in $B_i$ are stored as a non-bridge vertex set $NB_i$, i.e., $NB_i = V_i \setminus B_i$, where $V_i$ is the vertex set of $G_i$. For each non-bridge vertex $v \in NB_i$, we use $LB(v)$ to denote the lower bound distance of $v$, which is the shortest path distance to its nearest bridge vertex in the same subgraph $G_i$.

  3. Subgraph Set. For each subgraph $G_i$, we store a list of the lower bound distances, one entry for each of the other subgraphs. Specifically, as a vertex of a subgraph is reachable from a vertex of another subgraph only through the bridge vertices, the lower bound distance is calculated with the following equation.

    $LB(G_i, G_j) = \min_{b_i \in B_i,\ b_j \in B_j} dis(b_i, b_j)$    (1)

  4. Dispatched Driver Set. If the current location of a driver $d$ is situated in a subgraph $G_i$, then $d$ is stored in the dispatched driver set $DD_i$.
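The per-subgraph information listed above can be precomputed offline. The sketch below is one possible way to do it, assuming an undirected weighted graph given as edge triples and a precomputed partition map; the function names (`dijkstra`, `build_index`) and the returned dictionaries are illustrative choices, and plain Dijkstra is used in place of the hub-labelling method cited in the text.

```python
import heapq
from collections import defaultdict

def dijkstra(adj, src):
    """Shortest-path distances from src over an adjacency list {u: [(v, w), ...]}."""
    dist = {src: 0.0}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u]:
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def build_index(edges, part):
    """edges: (u, v, length) triples (undirected, assumption); part: vertex -> subgraph id."""
    adj = defaultdict(list)
    for u, v, w in edges:
        adj[u].append((v, w))
        adj[v].append((u, w))
    # Endpoints of cut edges are the bridge vertices of their subgraphs.
    bridges = {g: set() for g in set(part.values())}
    for u, v, _ in edges:
        if part[u] != part[v]:
            bridges[part[u]].add(u)
            bridges[part[v]].add(v)
    vertex_lb = {v: float("inf") for v in part}  # distance to the nearest bridge of own subgraph
    subgraph_lb = {}                             # (gi, gj) -> minimum bridge-to-bridge distance
    for gi, bset in bridges.items():
        for b in bset:
            for v, d in dijkstra(adj, b).items():
                gv = part[v]
                if gv == gi:
                    vertex_lb[v] = min(vertex_lb[v], d)
                elif v in bridges[gv]:
                    key = (gi, gv)
                    subgraph_lb[key] = min(subgraph_lb.get(key, float("inf")), d)
    return bridges, vertex_lb, subgraph_lb

edges = [("a", "b", 2.0), ("b", "c", 3.0), ("c", "d", 4.0)]
part = {"a": 1, "b": 1, "c": 2, "d": 2}
bridges, vertex_lb, subgraph_lb = build_index(edges, part)
print(bridges, vertex_lb, subgraph_lb)
```

In this toy graph the only cut edge is (b, c), so b and c become bridge vertices and the two subgraphs are at lower bound distance 3 from each other. The dispatched driver set is omitted because it changes online as drivers move.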

4.3 Bounding Distance Estimations

Furthermore, we introduce two additional concepts on how to calculate the lower bound distance from a vertex to a subgraph (Definition 3) or another vertex (Definition 4).

Definition 3.

(Lower Bound Distance Between a Vertex and a Subgraph). Given a vertex $u$ which belongs to $G_u$ and a subgraph $G_j$, the lower bound distance from $u$ to $G_j$ is defined as follows (with $LB(u) = 0$ when $u$ is itself a bridge vertex):

$LB(u, G_j) = LB(u) + LB(G_u, G_j)$    (2)

Definition 4.

(Lower Bound Distance Between Two Vertices). Given a vertex $u$ which belongs to $G_u$ and a vertex $v$ which is located at $G_v$, the lower bound distance is defined as follows:

$LB(u, v) = LB(u) + LB(G_u, G_v) + LB(v)$    (3)

In our implementation, we construct the road network index offline and compute $LB(u, G_j)$ or $LB(u, v)$ online. The index structure is memory resident, which includes $LB(u)$ and $LB(G_i, G_j)$ information. Therefore, $LB(u, G_j)$ (or $LB(u, v)$) can be computed using Eq. 2 (or Eq. 3) in $O(1)$ time complexity.
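A constant-time lookup of the two bounds might look like the sketch below. The three dictionaries are hypothetical in-memory tables (vertex to subgraph, vertex to nearest-bridge distance, and subgraph pair to bridge-to-bridge distance); returning 0 when both arguments fall in the same subgraph is an assumption made here to keep the bound valid, not something the text specifies.

```python
def lb_vertex_subgraph(u, gj, part, vertex_lb, subgraph_lb):
    """Lower bound from vertex u to subgraph gj: leave u's subgraph, then cross to gj."""
    gu = part[u]
    if gu == gj:
        return 0.0  # assumption: u already lies inside gj
    return vertex_lb[u] + subgraph_lb[(gu, gj)]

def lb_vertex_vertex(u, v, part, vertex_lb, subgraph_lb):
    """Lower bound between two vertices, built from three precomputed terms."""
    gu, gv = part[u], part[v]
    if gu == gv:
        return 0.0  # assumption: no useful bound inside a single subgraph
    return vertex_lb[u] + subgraph_lb[(gu, gv)] + vertex_lb[v]

# Hypothetical tables for a path graph a -2- b -3- c -4- d split into {a, b} and {c, d}.
part = {"a": 1, "b": 1, "c": 2, "d": 2}
vertex_lb = {"a": 2.0, "b": 0.0, "c": 0.0, "d": 4.0}
subgraph_lb = {(1, 2): 3.0, (2, 1): 3.0}

print(lb_vertex_subgraph("a", 2, part, vertex_lb, subgraph_lb))  # 2 + 3 = 5.0
print(lb_vertex_vertex("a", "d", part, vertex_lb, subgraph_lb))  # 2 + 3 + 4 = 9.0
```

Each call performs a fixed number of dictionary lookups and additions, which is what makes the bound cheap enough to evaluate for every candidate driver.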

Example 1.

A road network partitioning example is depicted in Fig. 1(a). There are ten vertices and ten edges in the graph, and they are divided into four subgraphs: , , , and . The values in parentheses after each edge denote the distance. The red vertices represent the bridge vertices in each subgraph, while the red edges denote the cut edges connecting two disjoint subgraphs. E.g., is a cut because it connects two disjoint subgraphs and . and are two bridge vertices because they are two endpoints of a cut .

The corresponding road network index structure is illustrated in Fig. 1(b). For a subgraph , includes two bridge vertices: and . There is one remaining non-bridge vertex which belongs to . Then the lower bound distance from to a bridge vertex in the same subgraph is . Three subgraphs are connected with directly or indirectly through cut edges. The distance between and is , and the distance between and is .

The lower bound distance between and is . The lower bound distance between and is .

5 Pruning-based Rider Request Insertion

In this section, we propose a rider insertion algorithm built on several pruning rules to efficiently insert a new incoming rider request into an existing trip schedule of a driver, such that a customized utility function is minimized, i.e., the utility function of Eq. 6 for the Distance-first algorithm and that of Eq. 8 for the Greedy algorithm.

5.1 Problem Assumption

First, as in prior work [10, 44, 26], when receiving a new rider request, a driver keeps the order of the existing trip schedule unchanged. In other words, we do not reorder the current trip schedule, which ensures a consistent user experience for riders already scheduled. For example, if a driver has been assigned to pick up $r_1$ first and then pick up $r_2$, then the pickup timestamp of $r_1$ should be no later than the pickup timestamp of $r_2$. Second, in contrast to restrictions commonly adopted in prior work [44, 5, 21, 14], where the number of rider requests served by each driver is never greater than two, we assume that more than two riders can be served as long as the seat capacity constraint is not violated, which improves the usability and scalability of the ridesharing system.

5.2 Approach Description

Given a set of drivers $D$ and a new rider $r$, we aim to find a matching driver and insert $r.l^s$ and $r.l^d$ of $r$ into the driver’s trip schedule. A straightforward approach can be applied as follows: (1) for each driver candidate $d$, we enumerate all possible insertion positions for $r.l^s$ and $r.l^d$ in $d$’s current trip schedule (its complexity is $O(m^2)$, where $m$ is the number of points in the trip schedule); (2) for each possible insertion position pair $(i, j)$, we check whether it violates the waiting and detour time constraints of both $r$ and the other riders who have been scheduled for $d$ ($O(m)$). Therefore, the total time consumption is $O(m^3)$ for each driver candidate. As the insertion process is crucial to the overall efficiency, we now propose several new pruning strategies. Then we present our algorithm for rider insertion derived from these pruning rules.
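The straightforward approach can be sketched as a brute-force enumeration. The code below is illustrative only: `enumerate_insertions`, `best_insertion`, the `feasible` callback (standing in for the waiting, detour, and capacity checks), and the one-dimensional `line_dis` metric are all assumptions introduced for the example.

```python
def enumerate_insertions(schedule, src, dst):
    """Yield every schedule obtained by inserting src, then dst, without reordering stops."""
    n = len(schedule)
    for i in range(1, n + 1):        # src is never placed before the driver's location
        for j in range(i, n + 1):    # dst always comes after src
            cand = list(schedule)
            cand.insert(i, src)
            cand.insert(j + 1, dst)
            yield i, j, cand

def best_insertion(schedule, src, dst, dis, feasible):
    """Return (additional distance, schedule) of the cheapest feasible insertion, or None."""
    def length(s):
        return sum(dis(a, b) for a, b in zip(s, s[1:]))
    base = length(schedule)
    best = None
    for _, _, cand in enumerate_insertions(schedule, src, dst):
        if not feasible(cand):       # placeholder for the per-rider constraint checks
            continue
        extra = length(cand) - base
        if best is None or extra < best[0]:
            best = (extra, cand)
    return best

def line_dis(a, b):
    """Toy shortest-path oracle (assumption): points on a line."""
    return abs(a - b)

print(best_insertion([0, 10], 4, 12, line_dis, lambda s: True))  # (2, [0, 4, 10, 12])
```

For a schedule with one pending stop the generator yields three candidates, and the number of candidates grows quadratically with the schedule length, which is the cost the pruning rules aim to cut down.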

Before presenting the pruning strategies, we would like to introduce how the rider request is inserted and a preliminary called the slack distance.

Suppose that $r.l^s$ and $r.l^d$ are inserted in the $i$-th and $j$-th location respectively, where $i \le j$ must hold. There are four ways to insert them as shown in Fig. 2. To accelerate constraint violation checking, we borrow the idea of “slack time” [18, 35] and define “slack distance”.

Definition 5.

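(Slack Distance). Given a trip schedule $S = \langle o_0, o_1, \ldots, o_m \rangle$ where $o_0$ is the driver’s current location, the slack distance $sd[i]$ (Eq. 4) is defined as the minimal surplus distance to serve new riders inserted before the $i$-th position in $S$. The surplus distance w.r.t. a particular point $o_k$ ($i < k \le m$) after the $i$-th position is discussed as follows:

(1) If $o_k$ is a source point ($r.l^s$) of a rider request $r$, our only concern is whether the waiting time constraint will be violated. The actual pickup distance of $r$ from the driver’s current location is $dist(o_0, o_k)$, following the trip schedule. In the worst case, $r$ is picked up just within $r.w$. Then the surplus distance generated by $o_k$ is $r.w - dist(o_0, o_k)$.

(2) If $o_k$ is a destination point ($r.l^d$) of a rider request $r$, the detour time constraint will be examined. Similarly, the actual drop-off distance is $dist(r.l^s, r.l^d)$ following the trip schedule from the source point $r.l^s$. The worst case is that $r$ is dropped off right at the limit of the detour time constraint, which is $(1 + r.\epsilon) \cdot dis(r.l^s, r.l^d)$ from $r.l^s$. Then the surplus distance generated by $o_k$ is $(1 + r.\epsilon) \cdot dis(r.l^s, r.l^d) - dist(r.l^s, r.l^d)$.

Thus we define the slack distance as Eq. 4, where $r$ denotes the rider request that $o_k$ belongs to.

$sd[i] = \min_{i < k \le m} \begin{cases} r.w - dist(o_0, o_k), & \text{if } o_k = r.l^s \\ (1 + r.\epsilon) \cdot dis(r.l^s, r.l^d) - dist(r.l^s, r.l^d), & \text{if } o_k = r.l^d \end{cases}$    (4)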

The available seat capacity (Eq. 5) in the process of pick-up and drop-off along the trip schedule changes dynamically.

(5)

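In addition, we use an auxiliary variable $\Delta(o_i, o, o_{i+1})$ to indicate the incremental distance incurred by inserting $o$ between $o_i$ and $o_{i+1}$, i.e., $\Delta(o_i, o, o_{i+1}) = dis(o_i, o) + dis(o, o_{i+1}) - dis(o_i, o_{i+1})$.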
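One way to precompute slack distances is a single backward pass over per-stop surpluses. The sketch below makes several assumptions that are not taken from the paper: stops are encoded as `(kind, rider, limit)` tuples with limits already converted to distances, the slack at position i covers the stops strictly after i, and a rider who is already on board is treated as having been picked up at the driver's current location.

```python
from itertools import accumulate

def surpluses(legs, stops):
    """legs[k-1]: shortest distance from stop k-1 to stop k (stop 0 is the driver).
    stops[k-1]: ("pickup", rider, waiting limit) or ("dropoff", rider, ride limit)."""
    prefix = [0.0] + list(accumulate(legs))   # prefix[k] = distance from driver to stop k
    picked_at = {}
    out = []
    for k, (kind, rider, limit) in enumerate(stops, start=1):
        if kind == "pickup":
            picked_at[rider] = prefix[k]
            out.append(limit - prefix[k])
        else:
            # Riders already on board count their ride from the driver's location (assumption).
            ridden = prefix[k] - picked_at.get(rider, 0.0)
            out.append(limit - ridden)
    return out

def slack_distances(surplus):
    """Element i is the minimum surplus over all stops strictly after position i."""
    sd, running = [], float("inf")
    for s in reversed(surplus):
        running = min(running, s)
        sd.append(running)
    return sd[::-1]

legs = [3, 4, 5]
stops = [("pickup", "r1", 5), ("dropoff", "r1", 5), ("dropoff", "r0", 15)]
sur = surpluses(legs, stops)
print(sur, slack_distances(sur))
```

An insertion after position i that adds more distance than the corresponding slack would violate at least one later rider's constraint, so it can be rejected without re-walking the schedule.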

Example 2.

Given a driver already carrying a rider , and the current trip schedule . On the driver’s way to drop off , receives a new rider request , then there are three possible updated trip schedules , , and . For , we need to check whether the detour time constraint of and the waiting time constraint of are violated. For , we need to check whether the waiting time constraint of is violated. For , we need to check whether the detour time constraint of , the waiting time constraint and detour time constraint of are violated.
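The enumeration in Example 2 can be reproduced with a short sketch. The stop labels `"l"`, `"r1.d"`, `"r2.s"`, and `"r2.d"` are hypothetical stand-ins for the driver's location, the drop-off of the on-board rider, and the pickup and drop-off of the new rider. The function only generates candidate schedules; it does not check any constraint.

```python
from typing import List


def enumerate_insertions(schedule: List[str], src: str, dst: str) -> List[List[str]]:
    """List every schedule obtained by inserting src and dst, with src before dst.

    schedule[0] is assumed to be the driver's current location, so nothing
    is inserted in front of it.
    """
    n = len(schedule)
    results = []
    for i in range(1, n + 1):          # slot for the source point
        for j in range(i, n + 1):      # slot for the destination, never before the source
            results.append(schedule[:i] + [src] + schedule[i:j] + [dst] + schedule[j:])
    return results


if __name__ == "__main__":
    for candidate in enumerate_insertions(["l", "r1.d"], "r2.s", "r2.d"):
        print(candidate)
```

For a schedule with n points (including the driver's location) this yields n(n+1)/2 candidates, which is the quadratic number of position pairs that the pruning rules aim to cut down.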

5.3 Pruning Rules

5.3.1 Pruning driver candidates

Based on our proposed index in Section 4, we can estimate the lower bound distance from a driver to a rider’s pickup point. According to the waiting time constraint, drivers outside this range can be filtered out.

Lemma 1.

Given a new rider and a subgraph , if , then the drivers who are located in can be safely pruned.

Proof.

The lower bound distance between the rider’s source point and any vertex in is always greater than or equal to the lower bound distance from to . Therefore, if holds, then , also holds. Thus, the drivers located at any vertex in cannot satisfy the waiting time constraint of the rider. ∎

Lemma 2.

Given a new rider and a driver , if , then can be safely pruned.

Proof.

Since the shortest path distance between the new rider’s source point and the driver’s current location is no less than the lower bound distance , we have . ∎
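Lemmas 1 and 2 suggest a two-level filter: discard whole subgraphs first, then discard individual drivers. The sketch below illustrates this under stated assumptions. The lower-bound functions are passed in as callables because the index from Section 4 is not reproduced here. `max_pickup_dist` is a hypothetical name for the largest distance that still satisfies the rider's waiting time constraint, and the comparison direction (prune when the lower bound exceeds that distance) is inferred from the proofs.

```python
from typing import Callable, Dict, List


def filter_drivers(
    subgraph_drivers: Dict[str, List[str]],
    lb_to_subgraph: Callable[[str], float],
    lb_to_driver: Callable[[str], float],
    max_pickup_dist: float,
) -> List[str]:
    """Return the drivers that survive both lower-bound tests for one rider."""
    candidates = []
    for subgraph, drivers in subgraph_drivers.items():
        # Lemma 1: if even the lower bound to the subgraph is too large,
        # every driver inside it is too far away.
        if lb_to_subgraph(subgraph) > max_pickup_dist:
            continue
        for driver in drivers:
            # Lemma 2: a tighter per-driver lower bound.
            if lb_to_driver(driver) > max_pickup_dist:
                continue
            candidates.append(driver)
    return candidates
```

Because both tests use lower bounds, a surviving driver is only a candidate; the true shortest path distance still has to be checked afterwards.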

5.3.2 Pruning rider insertion positions

It is worth noting that the insertion of a new rider may violate the waiting or detour time constraint of riders already scheduled for a vehicle. We therefore propose the following rules to reduce the number of insertion positions that must be examined.

Lemma 3.

Given a new rider , a trip schedule and a source point insertion position , if , then the rider should be picked up before the -th point.

Proof.

Since each vehicle travels following a trip schedule, we have , which is not less than . Then we can get , which violates the waiting time constraint of . ∎

Lemma 4.

Given a new rider , a trip schedule and a source point insertion position (), if , then the rider cannot be picked up at the -th point.

Proof.

The incremental distance generated by picking up is , which is no smaller than . Thus, we obtain , which implies that the incremental distance exceeds the maximal waiting or detour tolerance range of point(s) after the -th point. ∎

Lemma 5.

Given a new rider whose source point is already inserted at the -th position of a trip schedule , and a destination point insertion position , if (i.e., the insertion position as shown in Fig. 2(b)) and , then cannot be inserted at the -th point.

Proof.

Similar to Lemma 4, the additional distance as a result of inserting and is , which is not less than . Then the additional distance is greater than , which violates the waiting or detour time constraint of the point(s) after the -th point. ∎

Lemma 6.

Given a new rider whose source point is already inserted at the -th position of a trip schedule , and a destination point insertion position , if (i.e., the insertion position as shown in Fig. 2(c)) and , then cannot be inserted at the -th point.

Proof.

The incremental distance generated by dropping off at is , which is no smaller than . Then , which violates the waiting or detour time constraint of point(s) after the -th position. ∎

Lemma 7.

Given a new rider , a trip schedule , and two insertion positions and , if and , then such two insertion positions for and can be pruned.

Proof.

The actual travel distance from to is , which is not less than (. Then we can obtain that , which violates the detour time constraint of . ∎

Lemma 8.

Given a new rider , a trip schedule , and two insertion positions and , (), if , then such two insertion positions for and can be pruned.

Proof.

This follows directly from the capacity constraint. ∎

In practice, we use the lower-bound distance to execute the pruning rules. If a possible insertion cannot be pruned, we use the true distance as a further check, which can guarantee the correctness of the pruning operation.

Input: a driver with a trip schedule , a new rider request , a utility function
Output: return the utility value if can be served by ; otherwise, return .
1:  ,
2:  for  to  do
3:      if  or (,) then // Lemma 3 and 8
4:          if  and (,)+(,)-(,)  then // Lemma 4
5:              for  to  do
6:                  if  then // Lemma 8
7:                      if  then
8:                          if  =  then
9:                              if (,)+(,)+(,) (, ) then // Lemma 5
10:                                 continue
11:                         else
12:                             if (,,)+(,)+(, )-( ,) or (,)+(,)+(,) (+)(,then // Lemma 6 and 7
13:                                 continue
14:                     compute the utility value using
15:                     if   then
16:
17: return
Algorithm 1 RiderInsertion (, , )

5.4 Algorithm Sketch

The pseudocode for the rider insertion algorithm is shown in Algo. 1. We initialize two local variables (line 1): one records the utility value found in each iteration, and the other records the best utility value found so far, where a lower value denotes better utility. The pruning rules are first executed for the pickup point. We examine whether the vacant vehicle capacity is sufficient to hold the riders, and whether the driver is close enough to provide the ridesharing service (line 3). If both conditions hold for the candidate pickup position, we then check whether the detour distance exceeds the slack distance of the points that follow that position (line 4).

If all of the conditions are satisfied, we continue to check the destination point insertion (line 5). Otherwise, the source point is not added at that position. Similarly, we check whether the current capacity is sufficient (line 6). Since the relative order of the source and destination points leads to a different detour distance, two cases are possible: the source and the destination are added to consecutive positions (Fig. 2(a) and Fig. 2(b)), or the positions are not consecutive (Fig. 2(c) and Fig. 2(d)). In the consecutive case, we compute the incremental distance generated by the source and destination points together, and then check whether it is greater than the slack distance of the points after the insertion position (line 9). In the non-consecutive case, besides checking whether the incremental distance generated by each of the two points is larger than the slack distance, we also need to check whether the detour time constraint of the new rider is violated (line 12). If the utility value obtained is better than the best value found so far, we update the best value (line 16).

Time complexity analysis. The two nested loops (lines 2 and 5), each iterating over the points of the trip schedule, together take time quadratic in the number of schedule points. Each examination for constraint violations (lines 3, 4, 6, 9, and 12) takes only constant time. Calculating the utility function value (line 14) also takes constant time, which will be explained later for each utility function. In our method, we use a fast hub-based labeling algorithm [1] to answer shortest path distance queries. First, a hub label index is constructed in advance by assigning each vertex a hub label, which is a collection of pairs, each consisting of a hub vertex and the distance between the labeled vertex and that hub. Then, given a shortest path query between two vertices, the algorithm considers all common hub vertices in the hub labels of the two query vertices. Therefore, the cost of a shortest path distance query depends on the sizes of the hub label sets. We assume that the query time is linear in the average label size over all label sets [22]. Therefore, the total time complexity is quadratic in the number of schedule points and linear in the average label size.
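To make the hub label query concrete, the following minimal sketch answers a distance query from two precomputed labels. It assumes an undirected graph and stores each label as a dictionary from hub vertex to distance. This dictionary layout is an illustrative choice, not necessarily the layout used in [1], and `hub_query` is a hypothetical name.

```python
from typing import Dict

Label = Dict[int, float]  # hub vertex -> distance from the labeled vertex to that hub


def hub_query(label_u: Label, label_v: Label) -> float:
    """Return the minimum of d(u, h) + d(h, v) over all common hubs h.

    Returns infinity when the two labels share no hub.
    """
    # Iterate over the smaller label and probe the larger one.
    if len(label_u) > len(label_v):
        label_u, label_v = label_v, label_u
    best = float("inf")
    for hub, dist_u in label_u.items():
        dist_v = label_v.get(hub)
        if dist_v is not None:
            best = min(best, dist_u + dist_v)
    return best


if __name__ == "__main__":
    # Path graph 0 - 1 - 2 with edge weights 2 and 3; vertex 1 serves as a hub.
    labels = {0: {0: 0.0, 1: 2.0}, 1: {1: 0.0}, 2: {2: 0.0, 1: 3.0}}
    print(hub_query(labels[0], labels[2]))
```

The running time is proportional to the size of the smaller label, which is why the analysis above expresses the query cost in terms of the average label size.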

6 Matching-based Algorithm

In this section, we introduce two heuristic algorithms to bilaterally match a set of drivers and rider requests. Furthermore, we describe the insertion/update process for dynamic ridesharing.

6.1 Distance-first Algorithm

The Distance-first algorithm is executed in the following way: (1) we choose an unserved rider request with the earliest submission time from , and perform the pruning-based insertion algorithm (Algo. 1) to find a driver with minimal additional distance. (2) If a driver can be found to match , then the pair is joined and added to the final matching result.

The Distance-first algorithm has two important aspects. First, a matching for a single rider insertion is made whenever the constraints are not violated, even if the effectiveness is relatively low. Second, if multiple drivers are available to provide a ridesharing service, the Distance-first algorithm always chooses the one with the minimal additional distance. Thus, we define the optimization utility function as the additional distance .

(6)

Based on the insertion option used (Fig. 2), we calculate in Eq. 7. As the distance from a driver’s location to any existing point in is already calculated and stored, the computation of only takes if the trip schedule and two insertion positions are given. Specifically, is calculated as:

(7)

The pseudocode of the Distance-first method is illustrated in Algo. 2, which has two main phases: Filter and Refine. We initialize an empty set to store the current valid driver-rider pairs (line 1).

In the Filter phase (lines 2-7), for each rider we prune out ineligible driver candidates who violate the waiting time constraint. Specifically, for each rider , we maintain a driver candidate list to store the drivers who can possibly serve . We first prune the subgraphs from which it is impossible for a driver to serve (line 4). Then we further prune the drivers using a tighter bound, which is the lower bound distance between the driver’s location and the rider’s pickup point (line 6). The remaining drivers are inserted into as candidates.

In the Refine phase, the retained driver candidates are considered in the driver-rider matching process. We select one unserved rider with the earliest submission time (line 9) and find a suitable driver with minimal additional distance. In the for loop (lines 12-15), for each driver in , we examine the feasibility by using Algo. 1 (line 13) and choose the one with the smallest additional distance (line 15). If a driver that can satisfy all the requirements is found, then is added to the corresponding trip schedule (line 17).

Time complexity analysis. In the Filter phase, the two nested iterations (lines 2 and 5) take . The time complexity of the Refine phase is . Therefore, the total time complexity is .

Input: a driver set , a rider request set
Output: an assigned driver-rider pair list ;
1:  ;
2:  for   do
3:      for   do
4:          if .. then // Lemma 1
5:              for   do
6:                  if (.,.). then // Lemma 2
7:                      )
8:  while  do
9:      choose one rider with the earliest submission time
10:
11:     , ,
12:     for  in  do
13:          RiderInsertion(, , )
14:         if  then
15:             ,
16:     if  then
17:         insert rider to the current trip schedule of
18:
19: return
Algorithm 2 The Distance-first Algorithm
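The Refine phase of the Distance-first algorithm can be sketched as follows. This is an illustration under assumptions, not the authors' code. The Filter phase is assumed to have already produced a candidate list per rider. `rider_insertion` is a hypothetical callback that plays the role of Algo. 1 and returns the additional distance, or infinity if the rider cannot be served. `commit` is a hypothetical callback that updates the chosen driver's trip schedule.

```python
import math
from typing import Callable, Dict, List, Tuple


def distance_first(
    riders: List[Tuple[float, str]],
    candidates: Dict[str, List[str]],
    rider_insertion: Callable[[str, str], float],
    commit: Callable[[str, str], None],
) -> List[Tuple[str, str]]:
    """Serve riders in order of submission time, each with the cheapest feasible driver.

    riders holds (submission_time, rider_id) pairs; candidates maps a rider id
    to the driver ids that survived the Filter phase.
    """
    matches = []
    for _, rider in sorted(riders):  # earliest submission time first
        best_driver, best_cost = None, math.inf
        for driver in candidates.get(rider, []):
            cost = rider_insertion(driver, rider)
            if cost < best_cost:
                best_driver, best_cost = driver, cost
        if best_driver is not None:
            commit(best_driver, rider)  # update the driver's trip schedule
            matches.append((best_driver, rider))
    return matches
```

Because each rider is committed as soon as a feasible driver exists, an early request can occupy a driver that would have suited a later request better, which is the behaviour the Greedy algorithm in Section 6.2 tries to avoid.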

6.2 Greedy Algorithm

The drawback of the Distance-first algorithm is that the rider request with the earliest submission time is selected in each iteration, which neglects the served rate. Therefore, we propose a Greedy algorithm in Algo. 3 to deal with our dual-objective problem: maximize the served rate and minimize the additional distance.

The Filter phase in Greedy is similar to that in Distance-first. However, in the Refine phase, we adopt a dispatch strategy by allocating the rider to the driver that results in the best utility gain (i.e., minimum utility value) locally each time, until there is no remaining driver-rider pair to refine. Since we want to serve more riders while keeping the additional distance per served rider small, we use a utility function that represents the average additional distance of the served riders. Then if is served by , the utility function is defined as:

(8)

A rider request attached with more riders () is more favoured according to our utility gain (Eq. 8). For example, if a rider makes a ridesharing request with a friend, they tend to have the same source and destination point, which implies that the detour distance and waiting time can be reduced or even avoided (if not shared with other riders). On the contrary, if two separate requests are served by a driver, it is imperative for us to coordinate the dispatch permutation sequence and guarantee each of them is satisfied. We expect that is minimized as much as possible, i.e., more riders are served with less detour distance.

Input: a driver set , a rider request set
Output: assigned driver-rider pair list ;
1:   a min-priority queue,
2:  for   do
3:      for   do
4:          if .. then // Lemma 1
5:              for   do
6:                  if (.,.). then // Lemma 2
7:                       RiderInsertion(, , )
8:                      if   then
9:                          ,,)
10: while  do
11:     choose a pair with minimal utility value from
12:     insert rider to the current trip schedule of
13:
14:     remove pairs from
15:     for (, ) (, *)  do
16:         if RiderInsertion(, , )  then
17:             update the utility value of in
18:         else
19:             remove the pair from
20: return
Algorithm 3 The Greedy Algorithm (GR)
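The Refine loop of the Greedy algorithm can be sketched as follows, under simplifying assumptions. A plain dictionary stands in for the min-priority queue to keep the example short, so selecting the minimum is a linear scan rather than a heap operation. The lower-bound pruning of the Filter phase is omitted. `rider_insertion` and `commit` are hypothetical callbacks, where `rider_insertion` returns the utility value of a driver-rider pair or infinity if the pair is infeasible.

```python
import math
from typing import Callable, Dict, List, Tuple

Pair = Tuple[str, str]  # (driver_id, rider_id)


def greedy_match(
    drivers: List[str],
    riders: List[str],
    rider_insertion: Callable[[str, str], float],
    commit: Callable[[str, str], None],
) -> List[Pair]:
    """Repeatedly commit the feasible pair with the smallest utility value."""
    # Filter phase (simplified): keep every feasible pair with its utility value.
    pool: Dict[Pair, float] = {}
    for d in drivers:
        for r in riders:
            u = rider_insertion(d, r)
            if u < math.inf:
                pool[(d, r)] = u

    matches: List[Pair] = []
    while pool:
        # Pick the pair with minimal utility value; ties are broken by pair id.
        d, r = min(pool, key=lambda p: (pool[p], p))
        commit(d, r)
        matches.append((d, r))
        # The rider is now served, so drop every pair that contains it.
        for pair in [p for p in pool if p[1] == r]:
            del pool[pair]
        # The driver's schedule changed, so re-evaluate its remaining pairs.
        for pair in [p for p in pool if p[0] == d]:
            u = rider_insertion(d, pair[1])
            if u < math.inf:
                pool[pair] = u
            else:
                del pool[pair]
    return matches
```

Only the pairs of the driver that was just updated are re-evaluated, since the schedules of all other drivers are unchanged.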

The Greedy method is presented in Algo. 3. We initialize a min-priority queue to save the valid driver-rider pair candidates with their utility values, and an empty set to store our final result (line 1). In the Filter phase, we traverse each driver-rider pair to check whether they can be matched (lines 2-9). If the driver can carry the rider, we calculate the utility value of the pair (line 7) and push the pair into the queue (line 9). In the Refine phase, in each iteration we greedily select the pair with the smallest utility value (line 11), perform an insertion (line 12), and append the pair to our result list (line 13). Meanwhile, we remove all pairs related to from (line 14). For riders where was considered as a candidate, we check whether it is still valid to include them as a pair, since the insertion of a new rider may influence the riders previously considered (lines 15-19). If is still feasible, we update the utility gain value for those riders (line 17). Otherwise, the driver-rider pair is removed from (line