Bus Frequency Optimization: When Waiting Time Matters in User Satisfaction

03/23/2020 ∙ by Songsong Mo, et al. ∙ Singapore Management University, Wuhan University, RMIT University

Reorganizing bus frequency to cater for the actual travel demand can significantly reduce the cost of the public transport system. Most, if not all, existing studies formulate this as a bus frequency optimization problem that tries to minimize passengers' average waiting time. However, many investigations have confirmed that user satisfaction drops faster as the waiting time increases. Consequently, this paper studies the bus frequency optimization problem with user satisfaction in mind. Specifically, and to the best of our knowledge for the first time, we study how to schedule the buses such that the total number of passengers who receive their bus services within the waiting time threshold is maximized. We prove that this problem is NP-hard, and present an index-based algorithm with a (1-1/e) approximation ratio. By exploiting the locality property of routes in a bus network, we propose a partition-based greedy method which achieves a (1-ρ)(1-1/e) approximation ratio. Then we propose a progressive partition-based greedy method to further improve the efficiency while achieving a (1-ρ)(1-1/e-ε) approximation ratio. Experiments on a real city-wide bus dataset in Singapore verify the efficiency, effectiveness, and scalability of our methods.


1 Introduction

Public transport and the services delivered by buses are essential to our daily life. Bus services provide us with the capability to move around, which shapes where we can work and live, where we shop and how we spend our leisure time. In this paper, we focus on bus frequency design, which plays a very important role in urban public transport systems, as reorganizing bus frequencies to meet the actual travel demands is expected to achieve significant savings in cost. Taking New York City as an example, the cost of each bus is around $550,000 and the operating cost of transit agencies reaches $215 per hour (see https://www.liveabout.com/bus-cost-to-purchase-and-operate-2798845). If we reorganize the bus frequencies based on real travel demands and save 10% of bus departures, we can save about $20 in operating costs per hour and $55,000 per vehicle.

In the literature, many studies focus on the problem of bus frequency optimization. Most of them share a common objective, which is to minimize the average travel cost (in terms of waiting time) of passengers [13, 3, 5, 10, 9]. Moreover, their solutions are usually heuristic rather than approximate (with theoretical guarantees). However, most, if not all, existing works ignore an important aspect: user satisfaction. Many studies have confirmed that user satisfaction drops faster as the waiting time increases [1, 8]. Motivated by this finding, we aim to schedule the buses so as to serve more passengers within a given waiting time threshold θ, rather than to minimize the average waiting time. In addition, our algorithms adapt to different settings of θ.

We call this novel problem SatisFAction-BooST Bus Scheduling (FAST). Given a bus database $B$, a bus route database $R$, a passenger database $P$, and a vector $N$ that specifies the expected number of bus departures for each bus route, it chooses buses for each route such that the whole bus system satisfies the most passengers. Our analysis shows that the objective function of FAST is submodular and that FAST is NP-hard.

To solve the FAST problem, we develop a range of approximation algorithms with non-trivial theoretical guarantees. First, we propose an index-based greedy method (Greedy), which provides a (1-1/e) approximation factor for FAST, as the baseline, and two enhanced versions, namely PartGreedy and ProPartGreedy. PartGreedy is inspired by [18] and by the fact that a bus network is designed to cover different parts of the city while avoiding unnecessary overlap among routes [4, 16]. It adopts a partitioning algorithm to divide the bus network into several disjoint partitions and then invokes a local greedy search within each partition, which effectively reduces the computation cost of the original greedy algorithm. ProPartGreedy, on the other hand, adopts a different strategy to address the efficiency issue. Instead of finding the one bus that contributes the most to the objective function in each iteration of the local greedy search, it fetches multiple buses per iteration to cut down the total number of iterations required. ProPartGreedy has a tunable parameter ε that determines roughly how many buses are fetched in each iteration and hence provides a trade-off between efficiency and effectiveness.

In summary, we make the following contributions.

  • We propose and study the FAST problem. To the best of our knowledge, this is the first study on bus frequency optimization that considers user satisfaction. We prove that the objective function of FAST is monotone and submodular, and FAST is NP-hard.

  • We propose an index-based greedy method (Greedy), a partition-based greedy method (PartGreedy) and a progressive partition-based greedy method (ProPartGreedy) to solve the FAST problem efficiently. They achieve approximation ratios of (1-1/e), (1-ρ)(1-1/e), and (1-ρ)(1-1/e-ε) respectively, where ρ and ε are user-defined parameters.

  • We conduct extensive experiments on real-world bus route and bus touch-on/touch-off records in Singapore (396 routes, 28 million trip records of one week) to demonstrate the effectiveness, efficiency and scalability of our methods.

2 Related Work

In this section, we will review existing related work and report the difference between this work and existing ones.

We divide the literature into two categories based on the overall optimization objective. One is called the travel time driven bus frequency optimization problem (Travel-BFO), which aims to minimize the average/total travel time of passengers for either one bus route or a bus route network, based on passenger demands. It treats each ride as a new trip. Another is called the transfer time driven bus frequency optimization problem (Transfer-BFO), which aims to minimize the total transfer time of the transfer passengers.

Travel-BFO. Here, the passenger demands are usually abstracted as origin-destination (OD) pairs. The model proposed in [13] treats the travel time of passengers as an aggregation of the walking time, the waiting time, and the on-board travel time. The problem is usually formulated as a nonconvex objective function with linear or convex constraints. In [3], it is modeled as a nonlinear bilevel problem: the upper level represents the planner who wants to ensure minimal total travel time under fleet size constraints; the lower level represents the users who act by minimizing the travel time. In [5], a multi-objective model is proposed, seeking to minimize the overall travel time of the users and the operational cost of the operators (assumed to be linearly proportional to the frequencies). Martínez et al. [10] study the transit frequency optimization problem to determine the time interval between subsequent buses for a set of bus lines. They propose a mixed-integer linear programming (MILP) formulation for an existing bilevel model [3], and present a metaheuristic method. A new model considering user behavior is proposed in [9]. It divides a user's trip into three stages (pre-trip, on-board and end-trip) and aims to minimize users' total travel costs on the target bus line.

Differences. Although different bus frequency optimization models have been proposed, they share a very similar optimization objective, i.e., minimizing the average/total travel cost of passengers. In contrast to the above literature, we aim to improve the overall passenger satisfaction by scheduling the buses such that they serve more passengers within the given waiting time threshold. Our work is mainly motivated by the following two findings. First, waiting time has a direct impact on user satisfaction, as evidenced by many studies [8, 1]. Second, the waiting time threshold is tunable, hence the bus company can adjust it to cater to various concerns such as budget, government needs, and passengers' tolerance of waiting.

Transfer-BFO. Transfer time driven bus frequency optimization problem is an extension of single bus route timetabling. It determines the departure time of each trip of all lines in the bus network with the consideration of passenger transfer activities at transfer stations [6].

This problem is modeled by mixed integer programming models to maximize the number of synchronized bus arrivals at transfer nodes [2]. Ibarra-Rojas et al. [7] extend the work of Ceder et al. [2] to address a flexible Transfer-BFO problem with almost evenly spaced departures while preventing bus bunching. The model proposed in [14] tries to minimize the total transfer time experienced by passengers. Parbo et al. [12] study a bi-level bus timetabling problem to minimize the weighted transfer waiting time of passengers, and apply a Tabu Search algorithm to solve the bi-level model. Recently, a nonlinear mixed integer programming model was proposed to maximize the number of transferring passengers with small excess transfer time [17].

Differences. The above studies on the Transfer-BFO problem mainly focus on minimizing the total transfer cost for passengers on transfer, which can only improve the satisfaction of the transfer passengers. In contrast, our problem aims to improve overall passenger satisfaction by serving them within a given time threshold.

For all the above work in both categories, despite the difference, all existing approaches only propose heuristic methods without theoretical guarantees, while we propose algorithms with non-trivial theoretical guarantees.

3 Problem Formulation

In a bus route database $R$, a route $r_i$ is a sequence of bus stations $(s_1, s_2, \ldots, s_j, \ldots, s_n)$, where $s_j$ is a bus station represented by (latitude, longitude). In a passenger database $P$, a passenger $p$ is in the form of a tuple $\langle s_b, s_a, t \rangle$, where $s_b$ denotes the boarding station, $s_a$ denotes the alighting station, and $t$ denotes the time when $p$ reaches $s_b$. A bus $b_{ij}$ is in the form of a tuple $\langle r_i, dt_j \rangle$, where $r_i$ and $dt_j$ denote the bus service route and the departure time from $s_1$ respectively.

Definition 1

We define that a bus $b_{ij}$ can serve a passenger $p$ if $r_i$ contains $s_b$ and $s_a$ in order, and $0 \le dt_j + T(s_1, s_b, r_i) - t \le \theta$, where $T(s_1, s_b, r_i)$ denotes the travel time required by bus $b_{ij}$ from $s_1$ to $s_b$ via the bus route $r_i$, and θ is a given waiting time threshold.

There are multiple ways to approximate $T(s_1, s_b, r_i)$. In this paper, we use the historical average travel time from $s_1$ to $s_b$ via the route $r_i$. Based on Definition 1, we formally introduce $S(b_{ij}, p)$ to denote the service of $b_{ij}$ to $p$, as presented in Equation (1).

$$S(b_{ij}, p) = \begin{cases} 1 & \text{if } b_{ij} \text{ can serve } p \\ 0 & \text{otherwise} \end{cases} \qquad (1)$$

Next, we introduce the concept of bus service frequency in Definition 2.

Definition 2

A bus service frequency $f_i$ ($f_i \subseteq B$) for $r_i$ refers to a set of buses $(b_{i1}, b_{i2}, \ldots, b_{in_i})$ that serve the route $r_i$, where $n_i$ denotes the total number of bus departures corresponding to the route $r_i$ within a day.

Let the bus service frequency for $R$ be a set $F$, with each element $f_i$ corresponding to a bus route $r_i$, i.e., $F = \{ f_i \mid r_i \in R \}$. Then, the service of $F$ to a passenger $p$ can be computed by Equation (2). Note that $S(F, p) = 1$ as long as any bus in $F$ can serve $p$; otherwise, $S(F, p) = 0$.

$$S(F, p) = 1 - \prod_{f_i \in F} \prod_{b_{ij} \in f_i} \bigl(1 - S(b_{ij}, p)\bigr) \qquad (2)$$

Next, we formulate our problem in Definition 3 and show its NP-hardness. Note that we ignore the passenger capacity of the bus in our problem definition.

Definition 3 (SatisFAction-BooST Bus Scheduling (FAST))

Given a bus database $B$, a bus route database $R$, a passenger database $P$, a waiting time threshold θ, and a vector $N = (n_1, n_2, \ldots, n_{|R|})$, where $n_i$ denotes the total number of bus departures of bus route $r_i$, we output a bus service frequency $F$ which maximizes $G(F)$, where $G(F) = \sum_{p \in P} S(F, p)$ denotes the total number of passengers served by $F$.
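To make Definitions 1 and 3 concrete, the following is a minimal Python sketch of the "can serve" test and of the objective that counts served passengers. The names `Passenger`, `Bus`, `can_serve`, `served_count`, and the `travel_time` lookup table are illustrative assumptions and do not come from the paper. The sketch also assumes that each route visits a station at most once and that a frequency is given as a flat list of buses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Passenger:
    board: str    # boarding station
    alight: str   # alighting station
    arrive: int   # time (seconds) the passenger reaches the boarding station


@dataclass(frozen=True)
class Bus:
    route: str    # route identifier
    depart: int   # departure time (seconds) from the first station of the route


def can_serve(bus, p, routes, travel_time, theta):
    """Definition 1 (sketch): the route contains both stations in order and the
    passenger waits between 0 and theta seconds for this bus."""
    stops = routes[bus.route]
    if p.board not in stops or p.alight not in stops:
        return False
    if stops.index(p.board) >= stops.index(p.alight):
        return False
    # travel_time[(route, station)]: assumed average time from the first station.
    wait = bus.depart + travel_time[(bus.route, p.board)] - p.arrive
    return 0 <= wait <= theta


def served_count(frequency, passengers, routes, travel_time, theta):
    """Objective of Definition 3 (sketch): passengers served by at least one bus."""
    return sum(
        any(can_serve(b, p, routes, travel_time, theta) for b in frequency)
        for p in passengers
    )
```

A passenger is counted once no matter how many scheduled buses could serve them, which is exactly what makes the objective a coverage-style function.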

Theorem 3.1

The objective function of FAST is monotone and submodular.

Proof

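We skip the proof of the monotonicity of $G$ as it is straightforward. In the following, we prove that $G$ is submodular, viewing a frequency as the set of its buses. Let $X \subseteq Y \subseteq U$, where $U$ denotes the universe of buses, and let $b$ refer to a bus in $U \setminus Y$. According to [11], $G$ is submodular if it satisfies: $G(X \cup \{b\}) - G(X) \ge G(Y \cup \{b\}) - G(Y)$. To facilitate the proof, we define $G_b(X) = G(X \cup \{b\}) - G(X)$ and $S_b(X, p) = S(X \cup \{b\}, p) - S(X, p)$. Then, we have:

$$G_b(X) - G_b(Y) = \sum_{p \in P} \bigl( S_b(X, p) - S_b(Y, p) \bigr) \qquad (3)$$

To show the submodularity of $G$, we first prove Inequality (4).

$$S_b(X, p) - S_b(Y, p) \ge 0 \qquad (4)$$

According to whether $p$ can be served by buses in $X$, by buses in $Y$, or by bus $b$, there are in total four cases corresponding to Inequality (4). Case 1: $p$ can be served by a bus in $X$. Then we have $S_b(X, p) = S_b(Y, p) = 0$, because $S(X, p) = S(X \cup \{b\}, p) = 1$ and $S(Y, p) = S(Y \cup \{b\}, p) = 1$. Thus, $S_b(X, p) - S_b(Y, p) = 0$. Case 2: $p$ cannot be served by any bus in $X$ but it can be served by a bus in $Y \setminus X$. Then we have $S_b(Y, p) = 0$ and $S_b(X, p) \ge 0$. Thus, $S_b(X, p) - S_b(Y, p) \ge 0$. Case 3: $p$ cannot be served by any bus in $Y$ and can be served by the bus $b$. Then we have $S_b(X, p) = 1$ and $S_b(Y, p) = 1$. Thus, $S_b(X, p) - S_b(Y, p) = 0$. Case 4: $p$ cannot be served by any bus in $Y$ or by the bus $b$. Then we have $S_b(X, p) = S_b(Y, p) = 0$. Thus, $S_b(X, p) - S_b(Y, p) = 0$. The above shows the correctness of Inequality (4). Based on Equation (3) and Inequality (4), we have $G_b(X) \ge G_b(Y)$ and hence $G$ is a submodular function.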
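The diminishing-returns property can also be checked by brute force on a toy instance. The sketch below is only an illustration: the `coverage` table (which passengers each bus can serve) is made-up data, and `G` and `is_submodular` are hypothetical helper names.

```python
from itertools import combinations

# Toy data (assumption): bus id -> set of passenger ids that the bus can serve.
coverage = {"b1": {1, 2}, "b2": {2, 3}, "b3": {3, 4, 5}}


def G(buses):
    """Number of distinct passengers served by a set of buses."""
    served = set()
    for b in buses:
        served |= coverage[b]
    return len(served)


def is_submodular(universe):
    """Check G(X + b) - G(X) >= G(Y + b) - G(Y) for every X subset of Y and b not in Y."""
    items = sorted(universe)
    subsets = [set(c) for r in range(len(items) + 1) for c in combinations(items, r)]
    for X in subsets:
        for Y in subsets:
            if not X <= Y:
                continue
            for b in universe - Y:
                if G(X | {b}) - G(X) < G(Y | {b}) - G(Y):
                    return False
    return True
```

The exhaustive check is exponential in the number of buses, so it is only meant for sanity-testing small examples, not for real data.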

Theorem 3.2

The FAST problem is NP-hard.

Proof

Note that the minimum unit of time in daily life is the second. Therefore, the bus database $B$ is a finite set. Based on this, we prove the theorem by reducing the Set Cover problem to the FAST problem. In the Set Cover problem, given a collection of subsets $S_1, S_2, \ldots, S_m$ of a universe of elements $E$, we wish to know whether there exist $k$ of the subsets whose union is equal to $E$. We map each element in $E$ to a passenger in $P$, and map each subset $S_i$ to the set of passengers served by a bus $b$. Consequently, if all passengers in $P$ are served by $F$, the total number of passengers served by $F$ is $|P|$. Subsequently, $\sum_i n_i$ is set to $k$ (selecting $k$ buses). The Set Cover problem is then equivalent to deciding whether there is a $k$-bus set whose number of served passengers reaches the maximum $|P|$ in FAST. As the Set Cover problem is NP-complete, the decision problem of FAST is NP-complete, and the optimization problem is NP-hard.

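Input: a bus database $B$, a bus route database $R$, a passenger database $P$, and a vector $N$
Output: a bus service frequency $F$
Initialize $F \leftarrow \emptyset$, $n \leftarrow \sum_i n_i$
Initialize a $|R|$-dimension vector $c$ with zero
for $k = 1$ to $n$ do
    Select the bus $b_{ij} \in B$ with the largest marginal gain $G(F \cup \{b_{ij}\}) - G(F)$
    if $c_i < n_i$ then
        $F \leftarrow F \cup \{b_{ij}\}$, $c_i \leftarrow c_i + 1$
    if $c_i = n_i$ then
        remove all the buses serving the route $r_i$ from $B$
return $F$
Algorithm 1 Greedy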

4 Basic Greedy Method

To address FAST, we first present a baseline which extends the basic greedy method for the problem of submodular function maximization. To accelerate the marginal gain computation, we propose a mapping structure to index the bus and passenger databases. The basic greedy method is guaranteed to achieve a (1-1/e)-approximation, as proved by Nemhauser et al. [11].

4.1 A Basic Greedy Method

The pseudo-code of the greedy method is listed in Algorithm 1. In each iteration, it selects the bus $b_{ij}$ with the largest marginal gain, i.e., the bus maximizing $G(F \cup \{b_{ij}\}) - G(F)$, and inserts it into the current service frequency $F$. It then checks whether the number of bus departures of route $r_i$, which $b_{ij}$ serves, has reached the total number of bus departures $n_i$ required by this route. If so, it removes all buses serving the route $r_i$ from $B$. This iteration is repeated $n$ times, with $n = \sum_i n_i$ being the total number of bus departures required by all the bus routes. Finally, it returns $F$ as the solution.

Time Complexity. In each iteration, Algorithm 1 scans all the buses in $B$ and computes their marginal gains with respect to the chosen set. Each marginal gain computation traverses $P$ once in the worst case. Thus, adding one bus into $F$ takes $O(|B| \cdot |P|)$ time, and the total complexity is $O(n \cdot |B| \cdot |P|)$, where $n$ is the total number of bus departures.
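The greedy selection with per-route quotas can be sketched as follows. This is an illustrative sketch rather than the authors' implementation: it assumes the "can serve" relation has already been materialized as a `coverage` table (bus id to set of passenger ids), and the names `greedy`, `route_of`, and `quota` are hypothetical. Ties are broken by bus id so the result is deterministic.

```python
def greedy(coverage, route_of, quota):
    """Pick sum(quota) buses, each time taking the largest marginal gain.

    coverage: bus id -> set of passenger ids the bus can serve (assumed precomputed)
    route_of: bus id -> route id
    quota:    route id -> number of departures required on that route
    """
    candidates = {b for b in coverage if quota.get(route_of[b], 0) > 0}
    used = {route: 0 for route in quota}
    chosen, served = [], set()
    for _ in range(sum(quota.values())):
        if not candidates:
            break
        # Marginal gain = passengers newly served by the bus.
        best = max(sorted(candidates), key=lambda b: len(coverage[b] - served))
        chosen.append(best)
        served |= coverage[best]
        candidates.discard(best)
        route = route_of[best]
        used[route] += 1
        if used[route] == quota[route]:
            # The route has all its departures: drop its remaining buses.
            candidates = {b for b in candidates if route_of[b] != route}
    return chosen, served
```

Each iteration recomputes every candidate's gain from scratch, which mirrors the scan that dominates the running time in the complexity analysis above.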

Figure 1: Forward list (each bus mapped to the list of passengers it could serve)
Figure 2: Inverted list (each passenger mapped to the list of optional buses)

4.2 Index for Efficient Marginal Gain Computation

To accelerate the marginal gain computation, which is the main bottleneck of Algorithm 1, we propose two mapping indexes, the forward list and the inverted list, as shown in Fig. 1 and Fig. 2 respectively. The former is for buses $b \in B$: for each bus, it maintains the list of passengers that could be served by $b$. Note that a passenger could be served by multiple buses. To avoid counting the same passenger multiple times when we calculate the marginal gain, we maintain a counter per bus that captures the number of passengers in its list who are still waiting for service. The counter is initialized to the cardinality of the list, and it is decremented every time a passenger in the list is served by another bus. The latter is for passengers $p \in P$: for each passenger, it maintains the list of buses that could serve $p$ (the optional buses). A boolean flag per passenger indicates whether any of the optional buses has been scheduled, with an initial value of false. For example, if a bus whose counter in the forward list equals three is selected, it serves three passengers. Meanwhile, the flags of those three passengers are changed to true, and all the buses that could serve any of them have to update their counters to reflect the fact that some of their potential passengers have already been served.
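A minimal sketch of the two indexes is given below. The class name `ServiceIndex` and its attribute names are assumptions made for illustration; the paper does not specify a concrete data layout beyond the forward list, the inverted list, the per-bus counter, and the per-passenger flag.

```python
from collections import defaultdict


class ServiceIndex:
    """Forward list (bus -> passengers) and inverted list (passenger -> buses)."""

    def __init__(self, coverage):
        # coverage: bus id -> iterable of passenger ids the bus could serve.
        self.forward = {b: set(ps) for b, ps in coverage.items()}
        self.inverted = defaultdict(set)
        for b, ps in self.forward.items():
            for p in ps:
                self.inverted[p].add(b)
        # Counter per bus: passengers in its list still waiting for service.
        self.remaining = {b: len(ps) for b, ps in self.forward.items()}
        # Flag per passenger: has any optional bus been scheduled yet?
        self.served = {p: False for p in self.inverted}

    def gain(self, bus):
        """Marginal gain of scheduling the bus, read directly from the counter."""
        return self.remaining[bus]

    def select(self, bus):
        """Schedule the bus and update flags and counters of affected buses."""
        for p in self.forward[bus]:
            if self.served[p]:
                continue
            self.served[p] = True
            for other in self.inverted[p]:
                self.remaining[other] -= 1
```

With this layout, reading a marginal gain is a dictionary lookup, and the update cost of a selection is proportional to the number of (passenger, optional bus) pairs touched by the newly served passengers.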

5 Partition-based Greedy Method

In practice, a bus network is designed to cover different parts of a city to meet residents’ various travel demands. By design, it tries to avoid unnecessary overlapping among routes [4, 16]. For example, Figure 3 plots three popular bus routes in Singapore. A passenger whose travel demand could be served by route 67 will not consider route 161 or route 147 as these routes have zero overlap. This observation suggests that it might be unnecessary to scan the entire bus network when calculating the marginal gains of certain buses. This motivates us to design a partition-based greedy method. In the following, we first introduce a novel concept namely service overlap ratio to guide the partitioning process, and then present the algorithm.

Figure 3: Visualization of three popular bus routes in Singapore: (a) Bus Route 67, (b) Bus Route 147, (c) Bus Route 161

Our main idea is to partition the bus routes (and buses) into disjoint clusters, and then use a divide-and-conquer strategy to find locally optimal frequencies for the routes in each partition. This approach is expected to reduce the time complexity of the basic greedy method by a factor equal to the number of partitions. The speedup comes from the fact that the greedy algorithm is invoked per cluster, so it only needs to scan the buses and passengers corresponding to the routes in that cluster during the greedy search. Meanwhile, in terms of accuracy, we introduce a novel concept called service overlap ratio to achieve an approximation ratio with a non-trivial theoretical guarantee, as shown later.

Definition 4 (Partition)

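A partition of a set $R$ is denoted as a cluster set $C = \{C_1, C_2, \ldots, C_m\}$, where $m$ denotes the total number of clusters, such that $C_i \subseteq R$, $C_i \neq \emptyset$, $\bigcup_{i=1}^{m} C_i = R$, and $C_i \cap C_j = \emptyset$ with $i \neq j$, $1 \le i, j \le m$.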

To better illustrate the service overlap ratio, we define a function Serve($P'$, $R'$) that takes a passenger set $P'$ and a route set $R'$ as inputs and returns the passengers in $P'$ that could be served by any route in $R'$ without considering the temporal factor. To be more specific, a passenger $p$ will be returned by Serve($P'$, $R'$) if there is a route $r \in R'$ such that $r$ contains $s_b$ and $s_a$ in order, which is different from the "bus serves passenger" relation defined in Definition 1. We name the set of passengers returned by Serve($P'$, $R'$) the passenger pool w.r.t. the bus routes $R'$.
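The passenger pool can be sketched in a few lines of Python. The function names `in_order` and `serve_pool` and the toy routes are assumptions for illustration, and the sketch assumes each route visits a station at most once. The last lines of the usage example only show how the pools of two clusters can share passengers; they do not compute the ratio of Definition 5.

```python
def in_order(stops, board, alight):
    """True if the route visits the boarding station before the alighting one."""
    return board in stops and alight in stops and stops.index(board) < stops.index(alight)


def serve_pool(passengers, routes, cluster):
    """Passengers servable by any route in the cluster, ignoring time.

    passengers: passenger id -> (boarding station, alighting station)
    routes:     route id -> ordered list of stations
    cluster:    iterable of route ids
    """
    return {
        pid
        for pid, (board, alight) in passengers.items()
        if any(in_order(routes[r], board, alight) for r in cluster)
    }


# Usage on toy data (assumed, not from the dataset).
routes = {"67": ["A", "B", "C"], "147": ["C", "D"], "161": ["B", "C", "E"]}
passengers = {"p1": ("A", "C"), "p2": ("B", "C"), "p3": ("C", "D"), "p4": ("C", "A")}
pool_1 = serve_pool(passengers, routes, {"67"})
pool_2 = serve_pool(passengers, routes, {"147", "161"})
shared = pool_1 & pool_2  # passengers appearing in both passenger pools
```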

As stated in Definition 5, the service overlap ratio $\rho_i$ of a bus route cluster $C_i$ measures the number of passengers in the passenger pool w.r.t. $C_i$ that also belong to the passenger pools w.r.t. other clusters. Let $|X|$ denote the cardinality of the set $X$, and let Greedy(·) denote a bus service frequency returned by Greedy. $B_i$, $C_i$, and $P_i$ refer to a cluster of buses, a cluster of routes and a cluster of passengers respectively, and $N_{\min}$ refers to a $|C_i|$-dimensional vector of the form $(n_{\min}, \ldots, n_{\min})$. The parameter $n_{\min}$ is set to the minimum number of buses required by any route. Although there are different ways to quantify the overlaps between bus routes, we define $\rho_i$ in such a way that a partition-based greedy method guided by $\rho_i$ can achieve a theoretical bound, as detailed next.

Definition 5 (Service overlap ratio)

Given a partition of the original bus route database , for a cluster , the ratio of the service overlap between and the rest clusters is .

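Input: a bus database $B$, a bus route database $R$, a passenger database $P$, a vector $N$, and a controlling threshold ρ
Output: a bus service frequency $F$
Initialize $F \leftarrow \emptyset$ and partition the bus routes and buses with BusRoutePartitioning (Algorithm 3)
for each cluster $C_i$ do
    $F_i \leftarrow$ Greedy on the buses, routes and passengers of $C_i$, $F \leftarrow F \cup F_i$
return $F$
Algorithm 2 PartGreedy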

Partitioning of bus routes and buses. Algorithm 3 lists the pseudo-code of a bus route partitioning method guided by the service overlap ratio. It first partitions the routes at the finest granularity by forming a cluster for each bus route. Thereafter, it checks the service overlap ratio of each cluster and picks the one with the largest ratio for expansion. It selects the cluster that shares the largest common passenger pool with the picked cluster and merges the two. Note that when a cluster is expanded, let $F'$ denote the new frequency returned by Greedy. By Definition 5, $G(F')$ is actually required when calculating the service overlap ratio of this expanded cluster. However, to reduce the computation cost and the complexity, we use a cheaper approximation of $G(F')$. According to our merging rules, this approximation is a lower bound of $G(F')$ and it does not affect the accuracy of our partition algorithm. This merge-and-expansion process continues until the service overlap ratios associated with all the clusters fall below the input threshold ρ.

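Input: a bus database $B$, a bus route database $R$, an integer $n_{\min}$, and a controlling threshold ρ
Output: a partition of $R$ and a partition of $B$
for each bus route $r_i \in R$ do
    initialize a cluster containing only $r_i$, together with its buses and its frequency returned by Greedy
for each cluster $C_i$ do
    compute the service overlap ratio $\rho_i$
select the cluster with the largest service overlap ratio
while the largest service overlap ratio exceeds ρ do
    select the cluster sharing the largest common passenger pool with the selected cluster, merge the two clusters, update the service overlap ratio, and reselect the cluster with the largest ratio
return the partition of $R$ and the partition of $B$
Algorithm 3 BusRoutePartitioning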

When the bus routes and buses are partitioned, it invokes the basic greedy method (Section 4) to find the frequency for each cluster, and merges the local frequencies for clusters as the final answer. We name this approach as PartGreedy. Its pseudo-code is shown in Algorithm 2 and its approximation ratio is analyzed in Lemma 1.

Lemma 1

Given a partition $C = \{C_1, C_2, \ldots, C_i, \ldots, C_m\}$ of the bus route database $R$ and the maximum service overlap ratio ρ, PartGreedy achieves a (1-ρ)(1-1/e) approximation ratio to solve the FAST problem.

Proof

Let denote the solution obtained by Greedy for cluster , denote the solution obtained by PartGreedy, denote the optimal solution for cluster , and denote the global optimal solution. In Algorithm 3, it uses the lower bound of the to compute the upper bound of and terminates when the upper bound of for every cluster is no greater than the given threshold . Then we have for any . Recall Section 3, the basic greedy method is proved to achieve -approximation. Therefore, we have . Because of the submodularity and monotonicity of , we have and . Then, by Definition 5 we have:

(5)

In addition, Inequality (6) holds according to Definition 3.

(6)

Based on Inequality (5) and Inequality (6), we have . Using the principle of inclusion-exclusion, we have . Thus, this lemma is proved.

6 Progressive Partition-based Greedy Method

Although PartGreedy improves the efficiency of the basic greedy method by conducting the search within each partition (rather than over the original route/bus database), it still suffers from a high computational cost. To be more specific, in each iteration of the greedy search (either a global search or a local search by Greedy), in order to find the bus with the maximum gain, it has to recalculate the marginal gain for all the buses not yet scheduled.

Motivated by this observation, we propose a progressive partition-based greedy method (ProPartGreedy). It selects multiple buses, rather than just one, in each local greedy search iteration to cut down the total number of iterations required and hence the computation cost. The pseudo-code of ProPartGreedy is the same as Algorithm 2 except that the call of Greedy is replaced with Function 1 (ProGreedy) in line 2 of Algorithm 2. Meanwhile, we will prove that it achieves an approximation ratio of (1-ρ)(1-1/e-ε), where ρ and ε are tunable parameters that provide a trade-off between efficiency and accuracy.

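Input: a bus database $B$, a bus route database $R$, a vector $N$, and a parameter ε
Output: a bus service frequency $F$
Initialize $F \leftarrow \emptyset$, $n \leftarrow \sum_i n_i$
Initialize a $|R|$-dimension vector $c$ with zero
Sort $B$ in descending order of $G(\{b\})$
Initialize the threshold $h \leftarrow \max_{b \in B} G(\{b\})$
while $|F| < n$ do
    for each $b_{ij} \in B$ do
        if $G(\{b_{ij}\}) \ge h$ then
            if $G(F \cup \{b_{ij}\}) - G(F) \ge h$ then
                $F \leftarrow F \cup \{b_{ij}\}$, $c_i \leftarrow c_i + 1$
                if $c_i = n_i$ then
                    remove all buses serving the route $r_i$ from $B$
            if $|F| = n$ then
                break
        else
            break
    lower the threshold $h$ by a factor determined by ε
return $F$
Function 1 ProGreedy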

As presented in Function 1, ProGreedy first sorts $B$ by $G(\{b\})$ and initializes the threshold $h$ to the largest value of $G(\{b\})$. Then, it iteratively fetches all the buses whose marginal gains are not smaller than $h$ into $F$ and lowers the threshold by a factor determined by ε for the next iteration. The iteration continues until there are $n$ buses in $F$. Unlike the basic greedy method, which has to check all the potential buses in $B$ (or in a cluster of $B$) in each iteration, ProGreedy implements an early termination. Since buses are sorted by their $G(\{b\})$ values, if $G(\{b\})$ of the current bus is smaller than $h$, all the buses pending evaluation also have values smaller than $h$; by submodularity their marginal gains cannot exceed those values, so they can be skipped. In the following, we first analyze the approximation ratio of Function 1 in Lemma 2. Based on Lemma 2, we show the approximation ratio of ProPartGreedy in Lemma 3.
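The threshold-lowering idea can be sketched as below. This is a hedged illustration, not the authors' code: the exact lowering rule is not recoverable from the text, so the sketch assumes the common choice of multiplying the threshold by (1 - ε) after each pass, and it clamps the threshold at 1 because gains are integer passenger counts. The names `pro_greedy`, `coverage`, `route_of`, and `quota` are hypothetical.

```python
def pro_greedy(coverage, route_of, quota, eps):
    """Threshold-based greedy: each pass takes every bus whose gain reaches h.

    coverage: bus id -> set of passenger ids the bus can serve (assumed precomputed)
    route_of: bus id -> route id
    quota:    route id -> number of departures required on that route
    eps:      trade-off parameter, assumed to satisfy 0 < eps < 1
    """
    assert 0.0 < eps < 1.0
    total = sum(quota.values())
    # Sort once by standalone gain G({b}); ties broken by bus id.
    order = sorted(coverage, key=lambda b: (-len(coverage[b]), b))
    used = {route: 0 for route in quota}
    chosen, picked, served = [], set(), set()
    if not order or total == 0:
        return chosen, served
    h = float(len(coverage[order[0]]))
    while len(chosen) < total:
        for b in order:
            if len(coverage[b]) < h:
                break  # early termination: later buses have even smaller G({b})
            route = route_of[b]
            if b in picked or used[route] >= quota[route]:
                continue
            if len(coverage[b] - served) >= h:
                chosen.append(b)
                picked.add(b)
                served |= coverage[b]
                used[route] += 1
                if len(chosen) == total:
                    break
        if h <= 1.0:
            break  # a pass at h = 1 already took every bus with positive gain
        h = max(1.0, h * (1.0 - eps))  # assumed lowering rule
    return chosen, served
```

A larger `eps` lowers the threshold faster, so fewer passes are needed but buses with smaller gains may be accepted earlier, which is the efficiency versus accuracy trade-off described above.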

Lemma 2

ProGreedy achieves a (1-1/e-ε) approximation ratio.

Proof

Let be the bus selected at a given threshold and denote the optimal local solution to the problem of selecting that can maximize . Because of the submodularity of , we have:

(7)

where is the current partial solution. Equation (7) implies that for any . Thus, we have . Let denote the partial solution that has been included and be the bus selected at the th step. Then we have .

The solution obtained by Function 4 with . Using the geometric series formula, we have . Hence, the lemma is proved.

Lemma 3

Given a partition $C = \{C_1, C_2, \ldots, C_i, \ldots, C_m\}$ of the bus route database $R$ and the maximum service overlap ratio ρ, ProPartGreedy achieves a (1-ρ)(1-1/e-ε) approximation ratio to solve the FAST problem.

Proof

Based on Lemma 2, this proof is similar to the proof of Lemma 1, so we omit it due to space limit.

Database              Amount   AvgDistance   AvgTravelTime
Bus database          451k     N.A.          N.A.
Bus route database    396      19.91km       5159s
Passenger database    28m      4.2km         1342s
Table 1: Statistics of datasets

7 Experiment

In this section, we first explain the experimental setup; we then conduct sensitivity tests to tune the parameters to their reasonable settings, as our algorithms have several tunable parameters; we finally report the performance, in terms of effectiveness, efficiency, and scalability, of all the algorithms.

Datasets. We crawl the real bus routes from transitlink (https://www.transitlink.com.sg/eservice/eguide/service_idx.php) in Singapore. Each route is represented by the sequence of bus stop IDs it passes, together with the distance between two consecutive bus stops. The travel time from one stop to another via a route is estimated by the ratio of the distance between those two stops along the route to the average bus speed of the route. We use bus touch-on record data (described below) to find the average travel speed of a particular bus line. For the passenger database $P$, since passengers exhibit regular travel patterns [15], we use the real bus touch-on record data from one week of April 2016 in Singapore, which was obtained from the authors of [15] and contains 28 million trip records. Each trip record includes the IDs and timestamps of the boarding and alighting bus stops, the bus route, and the trip distance. We assume each passenger spends some time waiting for the bus, following a random distribution between 1 and 5 minutes. Then, we generate the bus candidate set based on the route and service time range. For each route, we use buses that depart every minute between 5am and 12am as the superset of candidate buses. The statistics of those datasets are shown in Table 1.
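The candidate generation and the travel time estimate described above can be sketched as follows. The function names are hypothetical, and the sketch assumes that "12am" means midnight at the end of the service day and that the last candidate departs at 23:59. Under these assumptions each route has 1,140 candidate departures, and 396 routes give 451,440 candidates, which agrees with the 451k buses reported in Table 1.

```python
def candidate_departures(start_hour=5, end_hour=24, step_min=1):
    """Departure times in seconds since midnight, one every step_min minutes.

    Assumes the service window is [start_hour, end_hour), end exclusive.
    """
    return [m * 60 for m in range(start_hour * 60, end_hour * 60, step_min)]


def travel_time_s(distance_km, avg_speed_kmh):
    """Estimated travel time (seconds) as distance divided by average speed."""
    return distance_km / avg_speed_kmh * 3600


per_route = len(candidate_departures())  # candidates on one route
all_routes = per_route * 396             # candidates over the 396 routes
```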

Parameters. Table 2 lists the parameter settings, with values in bold being the defaults. In all experiments, we vary one parameter and set the rest to their defaults. We assume all bus routes require the same number of bus departures in our study. For brevity, the notation $N = n$ represents the vector $(n, n, \ldots, n)$.

Parameter Values
number of bus departures , , , ,
total passenger number 100k, 200k, 300k, 400k, 500k
waiting time threshold 1min, 2min, 3min, 4min, 5min
tunable parameter used by ProPartGreedy , , ,
controlling threshold used by PartGreedy 0.1, 0.2, 0.3, 0.4
Table 2: Parameter settings

Algorithm. To the best of our knowledge, this is the first work to study the FAST problem, and thus no previous work is available for direct comparison. We therefore compare the following five methods: FixInterval, which fixes the time interval between two bus departures as (service time range) / (bus number) for each line and chooses the bus that departs at 5am as the first bus; Top-k, which picks the top-k buses that could serve the largest number of passengers; and Greedy, PartGreedy, and ProPartGreedy, i.e., Algorithm 1, Algorithm 2, and the progressive partition-based method proposed in this paper.
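As an illustration of the FixInterval baseline, the sketch below spaces the departures of one route evenly over the service window, starting at 5am. The function name and the default window of 5am to midnight are assumptions taken from the dataset description; rounding to whole seconds is also an assumption.

```python
def fix_interval_departures(n_buses, start_s=5 * 3600, end_s=24 * 3600):
    """Evenly spaced departure times (seconds since midnight) for one route.

    interval = (service time range) / (bus number); the first bus leaves at start_s.
    """
    interval = (end_s - start_s) / n_buses
    return [round(start_s + k * interval) for k in range(n_buses)]
```

For example, 19 departures over the 19-hour window yield one bus per hour, regardless of when passengers actually arrive, which is why this baseline ignores demand.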

Performance measurement. We adopt the total running time of each algorithm and the total served passenger number (SPN) of the scheduled buses as the main performance metrics. We randomly choose 5 million passengers from a week of data and pre-process the passenger dataset to build the index, which takes seconds and occupies MB disk space. Each experiment is repeated ten times, and the average result is reported.

Setup. All code is implemented in C++. Experiments are conducted on a server with 24 Intel X5690 CPUs and 140GB memory running CentOS release 6.10. We will release the code publicly once the paper is published.

Parameter Sensitivity Test - θ. The impact of the waiting time threshold θ on the running time and SPN is reported in Figure 4(a) and Figure 4(d), respectively. Parameter θ has almost no impact on the running time. On the other hand, it affects SPN: as θ increases, all the algorithms are able to serve more passengers, which is consistent with our expectations. We set θ to 3min, the mean value.

Parameter Sensitivity Test - ρ. The impact of parameter ρ on the running time and SPN is reported in Figure 4(b) and Figure 4(e), respectively. It has a positive impact on the running time performance but a negative impact on SPN. As ρ increases, PartGreedy and ProPartGreedy both incur shorter running time but serve fewer passengers. We choose as the default setting.

Parameter Sensitivity Test - ε. Parameter ε only affects ProPartGreedy. It controls the trade-off between efficiency and accuracy. As ε increases, ProPartGreedy incurs shorter running time and serves fewer passengers, as reported in Figure 4(c) and Figure 4(f), respectively. We choose as the default setting.

Figure 4: Effect of parameters. Panels: (a) running time vs. θ; (b) running time vs. ρ; (c) running time vs. ε; (d) SPN vs. θ; (e) SPN vs. ρ; (f) SPN vs. ε.
Figure 5: Effectiveness Study. Panels (a) and (b) each plot SPN against one varied parameter.

Effectiveness Study. We report the effectiveness of different algorithms in Figure 5. We observe that (1) FixInterval is the least effective; (2) the three algorithms proposed in this work perform much better than the other two, e.g., ProPartGreedy doubles (or even triples in some cases) the SPN of FixInterval; and (3) Greedy performs the best while PartGreedy and ProPartGreedy achieve comparable performance (only up to 9.4% below that of Greedy).

Efficiency Study. Figure 6 shows the running time of each method w.r.t. varying and . We have two main observations. (1) The time gap among Greedy, PartGreedy and ProPartGreedy becomes more significant with the increase of . This could be the increase of causes an increase in the number of clusters and . On the other hand, PartGreedy and ProPartGreedy only need to scan one cluster when selecting buses. (2) The improvement of PartGreedy and ProPartGreedy over Greedy decreases with the increase of . This is because the overlap between clusters increases with the increase of , which leads to a reduction in the number of clusters and an increase in partition time.

Scalability Study. To evaluate the scalability of our methods, we vary from to , and from 1 million to 5 million. From Figure 7(a), we find that the efficiency of Greedy is more sensitive to , as compared to PartGreedy and ProPartGreedy. It’s worth noting that the results are omitted for Greedy when it cannot terminate within seconds. As shown in Figure 7(b), PartGreedy and ProPartGreedy are about ten times faster than Greedy when is varying.

Figure 6: Efficiency Study: total running time vs. the two varied parameters (panels (a) and (b)).

Figure 7: Scalability Study: running time under the two scalability settings (panels (a) and (b)).

8 Conclusion

In this paper we studied the bus frequency optimization problem considering user satisfaction for the first time. Our target is to schedule the buses in such a way that the total number of passengers who could receive their bus services within the waiting time threshold is maximized. We showed that this problem is NP-hard, and proposed three approximation algorithms with non-trivial theoretical guarantees. Lastly, we conducted experiments on real-world datasets to verify the efficiency, effectiveness, and scalability of our methods.

Acknowledgements. Zhiyong Peng is supported in part by the National Key Research and Development Program of China (Project Number: 2018YFB1003400), Key Project of the National Natural Science Foundation of China (Project Number: U1811263) and the Research Fund from Alibaba Group. Zhifeng Bao is supported in part by ARC DP200102611, DP180102050, NSFC 91646204, and a Google Faculty Award. Baihua Zheng is supported in part by Prime Minister’s Office, Singapore under its International Research Centres in Singapore Funding Initiative.

References

  • [1] G. Antonides, P. C. Verhoef, and M. Van Aalst (2002) Consumer perception and evaluation of waiting time: a field experiment. Journal of consumer psychology 12 (3), pp. 193–202. Cited by: §1, §2.
  • [2] A. Ceder, B. Golany, and O. Tal (2001) Creating bus timetables with maximal synchronization. Transportation Research Part A: Policy and Practice 35 (10), pp. 913–928. Cited by: §2.
  • [3] I. Constantin and M. Florian (1995) Optimizing frequencies in a transit network: a nonlinear bi-level programming approach. International Transactions in Operational Research 2 (2), pp. 149–164. Cited by: §1, §2.
  • [4] M. Fletterman et al. (2009) Designing multimodal public transport networks using metaheuristics. Ph.D. Thesis, University of Pretoria. Cited by: §1, §5.
  • [5] Z. Gao, H. Sun, and L. L. Shan (2004) A continuous equilibrium network design model and algorithm for transit systems. Transportation Research Part B: Methodological 38 (3), pp. 235–250. Cited by: §1, §2.
  • [6] O. J. Ibarra-Rojas, F. Delgado, R. Giesen, and J. C. Muñoz (2015) Planning, operation, and control of bus transport systems: a literature review. Transportation Research Part B: Methodological 77, pp. 38–75. Cited by: §2.
  • [7] O. J. Ibarra-Rojas and Y. A. Rios-Solis (2012) Synchronization of bus timetabling. Transportation Research Part B: Methodological 46 (5), pp. 599–614. Cited by: §2.
  • [8] M. C. Kong, F. T. Camacho, S. R. Feldman, R. T. Anderson, and R. Balkrishnan (2007) Correlates of patient satisfaction with physician visit: differences between elderly and non-elderly survey respondents. Health and Quality of Life Outcomes 5 (1), pp. 62. Cited by: §1, §2.
  • [9] N. Lin, W. Ma, and X. Chen (2019) Bus frequency optimisation considering user behaviour based on mobile bus applications. IET Intelligent Transport Systems 13 (4), pp. 596–604. Cited by: §1, §2.
  • [10] H. Martínez, A. Mauttone, and M. E. Urquhart (2014) Frequency optimization in public transportation systems: formulation and metaheuristic approach. European Journal of Operational Research 236 (1), pp. 27–36. Cited by: §1, §2.
  • [11] G. L. Nemhauser, L. A. Wolsey, and M. L. Fisher (1978) An analysis of approximations for maximizing submodular set functions - I. Math. Program. 14 (1), pp. 265–294. Cited by: §4, §3.
  • [12] J. Parbo, O. A. Nielsen, and C. G. Prato (2014) User perspectives in public transport timetable optimisation. Transportation Research Part C: Emerging Technologies 48, pp. 269–284. Cited by: §2.
  • [13] S. Schéele (1980) A supply model for public transit services. Transportation Research Part B: Methodological 14 (1-2), pp. 133–146. Cited by: §1, §2.
  • [14] Y. Shafahi and A. Khani (2010) A practical model for transfer optimization in a transit network: model formulations and solutions. Transportation Research Part A: Policy and Practice 44 (6), pp. 377–389. Cited by: §2.
  • [15] X. Tian and B. Zheng (2018) Using smart card data to model commuters’ responses upon unexpected train delays. In Big Data, pp. 831–840. Cited by: §7.
  • [16] S. Wang, Z. Bao, J. S. Culpepper, T. Sellis, and G. Cong (2018) Reverse k nearest neighbor search over trajectories. IEEE Trans. Knowl. Data Eng. 30 (4), pp. 757–771. Cited by: §1, §5.
  • [17] Y. Wu (2019) Combining local search into genetic algorithm for bus schedule coordination through small timetable modifications. International Journal of Intelligent Transportation Systems Research 17 (2), pp. 102–113. Cited by: §2.
  • [18] P. Zhang, Z. Bao, Y. Li, G. Li, Y. Zhang, and Z. Peng (2018) Trajectory-driven influential billboard placement. In SIGKDD, pp. 2748–2757. Cited by: §1.