Survey of ETA prediction methods in public transport networks

04/10/2019 ∙ by Thilo Reich, et al. ∙ Bournemouth University

The majority of public transport vehicles are fitted with Automatic Vehicle Location (AVL) systems generating a continuous stream of data. The availability of this data has led to a substantial body of literature addressing the development of algorithms to predict Estimated Times of Arrival (ETA). Here, the research literature reporting the development of ETA prediction systems specific to buses is reviewed to give an overview of the state of the art. Reviews in this area generally categorise publications according to the type of algorithm used, which does not allow an objective comparison. This survey therefore categorises the reviewed publications according to the input data used to develop the algorithm. The review highlighted inconsistencies in the reporting standards of the literature. These were found in the varying measurements of accuracy, which prevent any comparison, and in the frequent omission of a benchmark algorithm. Furthermore, some publications were lacking in overall quality. Due to these issues, any objective comparison of prediction accuracies is impossible. The bus ETA research field therefore requires a universal set of standards to ensure the quality of reported algorithms. This could be achieved by using benchmark datasets or algorithms and by ensuring the publication of any code developed.


1 Introduction

The UK has seen a constant rise in the number of vehicles on its roads since personal vehicles became available, resulting in a seven-fold increase in traffic on British roads between 1950 and 2016 [1]. This has naturally led to an increase in congestion felt by all road users. A recent report estimated that UK travellers spent 10% of their driving time in gridlock [2]. Reducing congestion has become a key priority, as it benefits the environment and the economy and reduces commute times. This has been recognised, for example, in the UK government’s ‘Road to Zero’ strategy, which aims to tackle emissions from road usage. The biggest environmental and societal impact can be achieved if the public is encouraged to use alternative modes of travel instead of private cars [3]. This review focuses on public buses, as 4.44 billion bus journeys are made annually in the UK. Despite this, patronage is declining, and better Estimated Time of Arrival (ETA) predictions could play a role in slowing this trend. Even small changes can have a significant impact on the overall congestion of a city: reducing daily commutes from specific neighbourhoods by only 1% can cut delays for all road users by as much as 18% [4]. Even if the cancelled commutes are randomly selected, delays can still be reduced by as much as 3%. To encourage road users to change their mode of transportation, public transport has to be convenient and reliable. Punctuality and timeliness of the journey have the biggest impact on passenger satisfaction [5]. Unsurprisingly, the improvements most frequently requested by passengers are accurate travel times both pre-trip and during the journey, especially among passengers using public transport to commute [6].
To provide this punctuality, buses should ideally adhere to a timetable that has been carefully designed so that the bus can meet it without so much buffer time that the journey is lengthened unnecessarily. However, this is often difficult, and it is therefore crucial to predict the arrival times of vehicles accurately. Accurate predictions improve passenger satisfaction even if the vehicle is late because passengers, in general, do not mind waiting within reason as long as they know for how long [7]. Furthermore, reliable real-time travel information reduces both the perceived waiting time of bus passengers and the actual waiting time, as passengers can arrive closer to the departure time [8]. It also enables new smart applications that offer personalised journey suggestions to the traveller. Because buses are affected by a large number of external influences such as weather, traffic conditions, passenger loads [9] and other types of disruptions, predicting their arrival is challenging and therefore currently not very accurate [10].

Methods to predict ETA range from simple historical averages to statistical models. Such techniques applied to bus ETA predictions can be expected to drastically improve the current performance. However, due to the complexity of ETA prediction, machine learning methods have become increasingly popular [11]. In recent years, Artificial Neural Networks (NN) have revolutionised a number of other areas of data science such as image and speech recognition, so NNs can be expected to have similar potential when applied to bus ETA prediction. A comprehensive review specifically investigating NN applications in public transport [12] found that only 16% (12) of the studies addressed the ETA of buses, whereas the rest applied the techniques to other modes of transport. This suggests that bus ETA prediction using NNs might be underrepresented in public transport research, which is striking given the impact of NNs elsewhere. Nowadays the majority of buses have onboard Automatic Vehicle Location (AVL) systems, which are equipped with GPS sensors and transmit the location of the bus at frequent intervals, typically between 20 and 60 s. Vehicle locations are the basis for any ETA prediction; they are readily accessible through the AVL systems and do not necessarily require additional investment in static sensors. Published reviews of ETA prediction methods generally categorise either by area of application or technique used, as in [12], or by the applied algorithm [13, 11]. This review will assess the current literature concerning ETA prediction for buses. In doing so it will demonstrate a more informative categorisation than commonly used to review the literature and address shortcomings of the reporting standards.

2 Categorisation of ETA prediction algorithms

Figure 1: Treemap showing the proportions of the input features used in the reviewed publications.

ETA prediction methods are commonly reviewed by categorising the literature according to the type of algorithm used, as in [11, 14]. This categorisation is not necessarily informative for the reader, because algorithms can be developed from different background information: different input features such as the location, speed and passenger load of the vehicle are used, which prevents any meaningful comparison. Approaches developed using only AVL data should therefore in most cases not be compared to methods that additionally account for passenger load or weather conditions, even if they are based on the same algorithm. Such a comparison would involve algorithms relying on entirely different amounts of information, preventing a meaningful interpretation. Typically, AVL data includes vehicle positions together with schedule and route identifiers, but it can include more information depending on the provider. As this review’s focus lies on the prediction of bus ETA, the reviewed studies are categorised based on the nature of the input features used. The most basic input features required to predict ETA are sequences of timestamped GPS coordinates recorded by AVL systems. These features were used by all 40 reviewed publications (also see supplementary 1) and were the sole data source in 18 of them (n=18, Figure 1). The additional feature sources were external data such as information about the traffic or the weather (n=5), passenger information such as load and boarding and disembarking numbers (n=4), and a combination of all three aforementioned sources (n=2). A separate group of studies used AVL information from the bus to be predicted in combination with AVL data of other buses serving the same route to calculate the headway (n=7).

Figure 2: Categories used to review the literature based on feature types.

2.1 AVL as sole data source

A minimum requirement for any ETA prediction is knowledge of the position of a vehicle, hence most reviewed studies used AVL data from onboard devices. The only exception was [15], where the locations were recorded using a modified mobile phone because the buses were not equipped with a GPS system. The reviewed studies used data that included time-stamped positions of the buses, and in some cases additional information such as average speeds or dwell times was explicitly calculated. This central group of features was the most common and thus also includes the widest range of applied techniques. The simplest ETA predictions based solely on AVL data are historical methods, which use the average speed from historical records to predict the arrival time at a destination [16]. Naturally, these cannot account for any fluctuations and thus perform with up to 9.3% lower accuracy than more intricate methods such as Kalman Filters (KF) [17]. Attempts to improve simple historical-mean algorithms, such as accounting for timed stops at which the timetable has deliberate waiting times, reduce the prediction deviation by 0.8 [16]. In another approach, the prediction was made using a historical average updated with exponential smoothing for several short sections of the route, which are then combined to give the total travel time [18]. In the search for an algorithm with better performance and the lowest computational impact, [19] compared a historical average method, Artificial Neural Networks (NN) and Support Vector Machines (SVM). The results suggest that the NN outperformed the historical method by a minuscule margin, although the exact value of the improvement is not reported. The authors conclude that, because the NN and the historical method perform similarly yet the NN requires more intensive training and longer prediction times, the historical method is superior [19]. However, the overall consensus of the literature regarding historical methods is that their performance is low [20, 21, 22].
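
The segment-wise historical average with exponential smoothing described above can be illustrated with a minimal sketch. The segment names, the smoothing factor `alpha` and both function names are assumptions made for illustration; they are not taken from [18].

```python
from typing import Dict, List


def update_segment_means(means: Dict[str, float],
                         observed: Dict[str, float],
                         alpha: float = 0.2) -> Dict[str, float]:
    """Blend newly observed segment travel times (s) into the stored means."""
    updated = dict(means)
    for segment, seconds in observed.items():
        if segment in updated:
            # Exponential smoothing: recent observations get weight alpha.
            updated[segment] = alpha * seconds + (1 - alpha) * updated[segment]
        else:
            # First observation of a segment initialises its mean.
            updated[segment] = seconds
    return updated


def predict_travel_time(means: Dict[str, float], route: List[str]) -> float:
    """Total travel time is the sum of the per-segment estimates."""
    return sum(means[segment] for segment in route)


# Hypothetical two-segment route with stored historical means in seconds.
means = {"A-B": 120.0, "B-C": 90.0}
means = update_segment_means(means, {"A-B": 150.0}, alpha=0.2)
eta = predict_travel_time(means, ["A-B", "B-C"])  # 126 s + 90 s = 216 s
```

A larger `alpha` makes the estimate follow recent conditions more closely at the cost of more noise, which is the trade-off any such smoothing scheme has to tune.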

The Kalman Filter (KF) is a statistical method that has been applied to bus arrival times [23, 24, 25] and was found to be more accurate than historical methods (maximum relative error of 0.543 for the historical approach and 0.087 for the Kalman Filter) [20, 21, 22]. Autoregressive Integrated Moving Average (ARIMA) models exploit the information contained in the time series and were used in one example with acceptable results compared to the ground truth (MAPE=3.88-6.42% depending on direction). Unfortunately, the method was not compared to any other, making it difficult to put it into context objectively [26]. A direct comparison of historical methods to Linear Regression (LR) in [20, 21] showed that LR performed with up to 6.7 times lower error than historical methods. However, KF performed up to 3.95 times better than LR. This study is the only example of a direct comparison of KF and LR.

When compared to regression models, NNs generally perform with higher accuracy when trained on the same dataset [27]. Historical and regression methods do not cope well with fluctuations [14], and variations of travel times are highly likely at peak times in the urban environment. Non-linear methods such as NNs should therefore intuitively perform better on more complex data with higher variation. Pan et al. [28] used an NN to predict the average speed for the remaining distance to the destination, improving the accuracy compared to a historical algorithm by 5.7%. Similarly, in Houston an NN outperformed historical and regression models [29]. Interestingly, this study also found that the improvement, although drastic compared to the historical algorithm, was less pronounced in the suburban areas, presumably due to congestion. [30] likewise found that NNs performed significantly better overall. An exception was heavy congestion, where historical approaches were more accurate than NNs. Further investigations found that the NN overestimated speeds in slow conditions and underestimated travel times at high speeds. Surprisingly, whether a bus was currently in a bus lane did not influence this behaviour. Generally, ETA predictions are made by estimating the absolute number of minutes until arrival or the travel speed. In a unique approach, [31] treated the estimation as a classification problem by predicting the quarter-hour in which the bus will arrive. In their experiments, an NN-based approach performed 8% better than Decision Trees, Random Forests (RF) and Naive Bayes. An ensemble approach was also used to combine several NNs, where parameters such as the number of layers and neurons were generated randomly and the best-performing networks were included in one ensemble [32]. Unfortunately, the authors do not report the exact architecture of the final NNs. As the number of layers could have ranged between 1 and 5, this could be an example of a deep neural network, but this cannot be determined without that information.
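
To make the Kalman Filter comparisons above more concrete, the following is a minimal sketch of a scalar filter that tracks the travel time of one route section. It assumes a random-walk state model, and the variance values are arbitrary choices; the state models used in the cited studies are not described here and may well differ.

```python
from typing import Iterable, List


def kalman_travel_times(observations: Iterable[float],
                        initial_estimate: float,
                        initial_variance: float = 1.0,
                        process_variance: float = 1.0,
                        measurement_variance: float = 4.0) -> List[float]:
    """Return the filtered travel-time estimate after each observation."""
    estimate, variance = initial_estimate, initial_variance
    filtered = []
    for observed in observations:
        # Predict step: a random walk keeps the estimate but grows uncertainty.
        variance += process_variance
        # Update step: weigh the new observation by the Kalman gain.
        gain = variance / (variance + measurement_variance)
        estimate += gain * (observed - estimate)
        variance *= 1 - gain
        filtered.append(estimate)
    return filtered


# Hypothetical section travel times (s) reported by successive buses.
filtered = kalman_travel_times([110.0, 115.0, 108.0], initial_estimate=100.0)
```

Unlike a fixed historical mean, the estimate moves towards each new observation by an amount set by the gain, which is how such a filter can follow the fluctuations that historical methods miss.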

Publication | AVL | External | Trajectories | Passenger | Headway

Amita et al. (2015) [27]
Bai et al. (2015) [33]
Chen (2004a) [34]
Chen et al. (2004b) [35]
Chen (2018) [32]
Chien et al. (2002) [36]
Dailey et al. (2001) [23]
Deng et al. (2013) [37]
Dong et al. (2013) [38]
Gal (2017) [39]
Heghedus (2017) [40]
Hua et al. (2017) [41]
Jeong & Rilett (2004) [29]
Julio et al. (2016) [30]
Junyou et al. (2018) [42]
Kee et al. (2017) [31]
Khosharavi et al. (2011) [43]
Kumar et al. (2017) [17]
Li (2018) [44]
Lin & Zeng (1999) [16]
Lin et al. (2013) [45]
Maiti et al. (2014) [19]
Meng et al. (2017) [18]
Nappiah et al. (2009) [26]
Padmanaban et al. (2009) [25]
Pan et al. (2012) [28]
Shalaby & Farhan (2003) [20]
Shalaby & Farhan (2004) [21]
Sinn et al. (2012) [46]
Treethidtaphat et al. (2017) [14]
Vanajakshi et al. (2009) [22]
Wang et al. (2014) [47]
Xinghao et al. (2013) [9]
Xu (2017) [48]
Yin et al. (2017) [49]
Yu et al. (2010) [50]
Yu et al. (2011) [51]
Yu et al. (2017) [52]
Zaki et al (2013) [53]
Zhang et al. (2015) [15] 1

1 authors used a modified smartphone instead of a commercial AVL system.

Table 1: The input features used by each publication indicated as points.

The relative absence of deep learning approaches is striking in the context of bus ETA prediction. One reason could be the reported observation that NNs with a single hidden layer outperformed NNs with two or three layers, suggesting that shallow NNs might be sufficient or even desirable for predicting bus ETAs [40]. A similar conclusion was reached by [34], who found in a comparison of NN architectures that the more hidden layers a network had, the less likely it was to generalise. However, as ETA prediction is a sequential problem, Recurrent Neural Networks (RNN) and their derivatives can be expected to perform better, because their design is specifically tailored to sequential data, with the depth of the network linked to the length of the sequence [54].

In contrast, [14] used an NN with 4 hidden layers, reporting excellent performance compared to ordinary least squares regression. As this study does not report on NNs of other depths, the results are difficult to interpret. Generally, arrival times are predicted for designated bus stops; however, in some public transport systems buses can be flagged down anywhere on their route. In a study in Bangkok, a 4-layer deep neural network was used to improve the prediction of arrival in comparison to a regression model, resulting in an error reduction of 55% [14]. The dilemma of choosing a suitable NN architecture led [43] to use a genetic algorithm (GA) to select the best-performing architecture. As it is unlikely that any model will perform with the same accuracy under every condition, some authors have tried to overcome this limitation by using hybrid methods. One example is the combination of an SVM and a KF by [50], where the SVM predicted baseline values used for the KF prediction. The SVM-KF hybrid achieved 11.1% higher accuracy than an NN-KF hybrid. Nevertheless, NNs are the most commonly used method in the context of ETA prediction. Considering that the majority of publications use shallow networks and there are few examples of deep learning architectures (3/19), this raises the question of whether such architectures simply do not work in this context. A possible reason is that the studies focused on input features of one bus route, limiting the data complexity compared to an approach that uses the network-wide state of all buses as input.

2.2 Trajectory based methods

Trajectory-based methods use historical trajectories of a bus line, i.e. the distance travelled by a bus over time (see Figure 3 for an example). The estimate is made by comparing the current trajectory of a bus with those of the past and using the most similar trajectory as the prediction. Different algorithms are used to choose an appropriate trajectory.

Figure 3: Example of a bus trajectory illustrating the travelled distance over time.

One such example is the work described by [38], who select the most similar trajectory using a k-Nearest Neighbour (kNN) algorithm. In this study, the kNN algorithm outperformed an NN approach for long-term prediction. Interestingly, this approach did not perform well on short distances below 3 km, and the authors reverted to using the average speed of all buses travelling on the same road segment as a prediction. Similarly, [55] used a kNN classifier to select historical trajectories, which were then fed into a KF to predict the bus travel time. In a modification, [48] grouped the trajectories into categories based on road segments and time of day. The prediction was then made by comparing the progress of the current bus to the historical trajectories corresponding to the time and section the bus is presently travelling on. This approach was used to reduce computational cost and was shown to outperform SVM and NN trajectory matching approaches. In a comparison of different methods applied to trajectories, kernel regression was superior to both LR and kNN methods [46].
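
The trajectory matching idea of this section can be sketched as follows. This is a simplified illustration and not the implementation of any cited study: trajectories are assumed to be stored as elapsed times at successive stops, similarity is assumed to be Euclidean distance, and the function name `knn_eta` is hypothetical.

```python
import math
from typing import Sequence


def knn_eta(history: Sequence[Sequence[float]],
            partial: Sequence[float],
            target_stop: int,
            k: int = 3) -> float:
    """Predict the elapsed time at target_stop from the k most similar trips.

    history[i][j] is the elapsed time (s) of past trip i on reaching stop j;
    partial holds the same values for the stops the current bus has passed.
    """
    m = len(partial)
    if m == 0 or target_stop < m:
        raise ValueError("target_stop must lie beyond the observed stops")
    # Rank past trips by Euclidean distance over the stops observed so far.
    ranked = sorted(history, key=lambda trip: math.dist(trip[:m], partial))
    neighbours = ranked[:k]
    # Average the time the neighbours needed from the last observed stop on.
    remaining = sum(trip[target_stop] - trip[m - 1]
                    for trip in neighbours) / len(neighbours)
    return partial[-1] + remaining


# Four hypothetical past trips over four stops; the current bus is at stop 1.
history = [
    [0, 100, 200, 300],
    [0, 120, 250, 380],
    [0, 90, 180, 270],
    [0, 150, 310, 470],
]
eta = knn_eta(history, partial=[0, 110], target_stop=3, k=2)  # 340.0 s
```

With `k=1` this reduces to using the single most similar past trajectory as the prediction; larger values of `k` average over several similar trips.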

2.3 AVL and headway information

As the progress of a bus naturally depends on traffic flow, information about the state of the traffic ahead should improve the accuracy of any algorithm. Because all the reviewed methods used AVL data, the data from preceding buses can be used as an indication of the traffic ahead. The distance or time to the preceding bus is called headway and was used in 6 out of the 40 reviewed studies. A study specifically looking at bus stops served by multiple routes showed that the best accuracy was achieved if not only the weighted headway to preceding buses of the same route but also those to buses of other lines were included. This was true when the prediction was made using an SVM. Interestingly, excluding the running time of the same line resulted in the best prediction using an NN, although this was still outperformed by the SVM [51].
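
As a small illustration of the headway feature defined above, the sketch below derives time headways from the arrival timestamps of successive buses at one stop. The timestamps and the function name are hypothetical; how the cited studies weight or combine headways is not reproduced here.

```python
from typing import List


def time_headways(arrivals: List[float]) -> List[float]:
    """Time gaps (s) between consecutive bus arrivals at a single stop."""
    ordered = sorted(arrivals)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


# Arrival times in seconds since midnight (08:00, 08:10 and 08:25).
arrivals_at_stop = [28800.0, 29400.0, 30300.0]
headways = time_headways(arrivals_at_stop)  # [600.0, 900.0]
```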

In contrast, when accounting for travel times of the preceding buses on a virtual road, an NN solution was found to perform better than an SVM [41]. However, [41] used different features as well as 2 hidden layers instead of 1 thus making a comparison difficult. A further study found that an SVM had slightly better accuracy than NN models and KFs. The error was nearly halved if KF was used upstream of either model to account for dynamic changes. Also, in this case, the SVM-KF model was slightly superior to the NN-KF approach [33].

[49] found that overall both NNs and SVMs resulted in a prediction error around 10% although with minimal variations over the course of the day and in different city environments. A genetic algorithm was used to determine the best architecture for a NN, which resulted in an NN with 1 hidden layer and 5 hidden units (3-5-1). This is the same structure as [51] whereas [33] used 6 hidden units. The described works are very consistent in the selection of network depth as well as their findings.

An advancement from a simple NN was presented by [45], who used a hierarchical NN. This approach trained sub-NNs for clusters based on the day of the data collection as well as the delay level at the time of collection. These were then combined into a hierarchical NN, which performed better than the conventional NN and KF. Other hierarchical methods are Random Forests, which surpassed SVM, kNN and LR. The error was further reduced by 1.3% if the RF was trained on datasets preselected using a kNN approach, reflecting the intuition that under similar circumstances the travel time will be similar [52]. The methods described in this section use headways as additional inputs to AVL data. However, one method instead used queuing theory. The so-called snapshot method simply uses the travel time of the last bus traversing the same segment as a prediction. To minimise the effect of outliers on this approach, different RF-based methods were used to obtain the final prediction based on the snapshot design [39].
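
The snapshot idea is simple enough to sketch directly. The record layout `(segment, finished_at, seconds)` and the summation over the segments of a route are assumptions made for this illustration; the RF-based refinements of [39] are not shown.

```python
from typing import Dict, Iterable, List, Tuple

# One record per completed traversal: (segment, finished_at, travel_time_s).
Traversal = Tuple[str, float, float]


def latest_travel_times(traversals: Iterable[Traversal],
                        now: float) -> Dict[str, float]:
    """Travel time of the most recent bus to finish each segment before now."""
    latest: Dict[str, Tuple[float, float]] = {}
    for segment, finished_at, seconds in traversals:
        if finished_at > now:
            continue  # this traversal has not been observed yet
        if segment not in latest or finished_at > latest[segment][0]:
            latest[segment] = (finished_at, seconds)
    return {segment: seconds for segment, (_, seconds) in latest.items()}


def snapshot_eta(traversals: Iterable[Traversal],
                 route: List[str], now: float) -> float:
    """Sum the latest observed travel times over the remaining segments."""
    latest = latest_travel_times(traversals, now)
    return sum(latest[segment] for segment in route)


# Hypothetical traversal log; the last record lies in the future of `now`.
log = [("A-B", 1000.0, 120.0), ("A-B", 1600.0, 150.0),
       ("B-C", 1500.0, 95.0), ("B-C", 2500.0, 200.0)]
eta = snapshot_eta(log, ["A-B", "B-C"], now=2000.0)  # 150 s + 95 s = 245 s
```

Because each segment relies on a single observation, one unusually slow or fast bus shifts the whole prediction, which is the outlier sensitivity mentioned above.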

2.4 AVL and external data

As any road user knows, progress in traffic depends on many external influences, such as weather or traffic volume. This is also true for buses and has been addressed in a number of studies. Weather conditions have been taken into account in two studies. One basic example used an SVM to make ETA predictions based on data from the last 30 days. These predictions are then stored and used as predictions for all journeys of the next day. Naturally, this will not account for any sudden changes in external conditions. Regrettably, this study does not compare the method to any other approaches, making it impossible to evaluate it objectively [44]. Similarly, [42] used an SVM to predict ETAs for a fifth day based on the preceding four days. An interesting approach used cameras on overhead bridges to record not only bus traffic but also the speed of taxis, as these can use the same routes as buses, and unsurprisingly found that their speed is the same in heavy traffic. Furthermore, predictions based solely on the information from the static cameras identifying the bus were more accurate than predictions using only GPS recordings. The authors did not combine both sources to investigate whether this would improve the overall performance, although this would have been an insightful addition to their research [9]. Again, these methods were not compared to any alternative approaches. A combination of both weather and traffic state was used in a hybrid method. The reasoning is that NNs are often poor at accounting for disruptions; therefore, the system employed an NN for traffic situations that appear to be ‘normal’ in the sense that the system has encountered similar conditions before. If the condition appears to be unseen, the prediction is made using a KF. Compared to an NN used for all conditions, this reduces the error by 0.2 min over the entire route (37 min) [53]. This highlights the crux that it is unlikely that one method will always perform best, and it can be anticipated that different conditions will affect a model’s performance.

A preliminary report [40] describes attempts to use LSTMs to predict bus ETAs, including both traffic and weather data, but full results have not yet been published.

2.5 AVL and Passenger data

As public transport’s purpose is to convey passengers, the customers themselves affect the progress of any bus. The number of passengers boarding will have an influence on the dwell time as well as on the frequency of stops made by the vehicle.

An interesting sensitivity analysis [34] showed that dwell time had an impact of 45% on the ETA of a bus, whereas the day of the week contributed 25%. In practice, it is difficult to include the exact number of passengers, as this information is generally not collected automatically: tickets do not necessarily carry information about the destination, and passengers do not have to swipe, for example, a smart card when disembarking. However, if this data could be made available, it should give information about future dwell times, as more passengers require longer to disembark.

Therefore, the numbers of boarding and disembarking passengers were included in an NN model that performed significantly better than LR with the same inputs [47]. Because of the difficulty of assessing the number of passengers, one imaginative approach used the microphone of a mobile phone installed on the bus to count the sounds made when passengers swiped a smart card at the terminal. This information was used to record the number of boarding passengers, without any information about the number disembarking [15]. In a comparison, [20] found that a KF performed better if both location and passenger load data were included. This KF outperformed a time-lagged NN as well as LR and a historical model. The same study was republished [21]. This model was later replicated and found to perform with the lowest accuracy when compared to NNs and hierarchical NNs [45]. This illustrates the replication problem found in the current literature, which inhibits any objective comparison of the proposed methods.

2.6 AVL and passenger and external data

To account for as many external influences as possible several studies combined both data from external sources such as weather and traffic and information about the passengers.

An NN-KF hybrid, in which the NN feeds into the KF, was developed using features including weather (more specifically precipitation), passenger loads, boarding and disembarking numbers as well as AVL information. The hybrid performed better than a conventional NN [35]. Generally, two methods of segmenting a route exist: (1) stop-based segmentation, where the travel time between two stops is predicted, and (2) link-based prediction, where the travel time of a link consisting of several stop-to-stop segments is estimated. The travel time can either be predicted using a stop-based approach, where the time needed from one stop to the next is predicted, or a link-based method, where the route between two stops is split into several shorter links and each link is predicted separately. In a comparison of stop-based and link-based ETA predictions using AVL data and traffic flow data as features, the stop-based method performed with up to 2.7 times smaller error [36].

3 Discussion

The feature-based categorisation used in this review allowed a better understanding of the methods applied to predict bus ETAs. The analysis highlighted several flaws in the current research that make the interpretation of the results challenging. A reliable comparison of the methods was not possible because the measures used to report algorithm performance were inconsistent. Furthermore, a number of the reviewed papers presented an algorithm without any comparison to other methods, thus preventing any objective assessment. Lastly, the reporting quality of some papers was inadequate. Each point is discussed individually in the following.

3.1 Comparability

As the accuracy and performance of any prediction model are of crucial importance, they have to be reported in a way that allows the results to be replicated and compared. However, this is not possible in all cases, as some authors report relative errors and no consistency in the reported parameters can be discerned. A precondition that any developed machine learning algorithm should fulfil is verifiability, which a report of the Royal Society highlighted as being of central importance [56]. This has also been recognised in the healthcare sector, where guidelines for the development and reporting of predictive models exist [57]. The difference in standards might be explained by the fact that ETA predictions do not affect the health or safety of a passenger, and a spurious algorithm might at most cause inconvenience rather than physical harm. However, for an operating company this might cause a loss of revenue because patronage might decline. Moreover, society as a whole might be subjected to more congestion, which could be reduced by providing accurate ETA predictions. In addition, replicability is a central tenet of science. The reproducibility crisis is best known from psychological research [58]; however, due to its notoriety it is actively being addressed [59]. It has also been identified as a problem in ‘harder’ sciences such as biomedicine [60] and artificial intelligence [61]. Although results gained from machine learning techniques might be considered hard evidence because the final model is based on mathematical concepts, they suffer from problems similar to those seen in psychology, where the research is often subjective. The similarity between the two fields is that the findings usually cannot be explained, due to the ‘black box’ effect. The field of psychology has now started to apply lessons from problems seen in machine learning research [59]. A suggested way of addressing such problems is meta-science, which could shed light on the true accuracy of findings [62]. However, this relies on comparable measurements of accuracy, which were not found in a large proportion of the reviewed literature. Therefore, comprehensive standards of reporting are urgently needed in the field of predictive bus transportation research.
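
One way to make reported accuracies comparable is to state several common error measures side by side rather than a single relative figure. The sketch below computes MAE, RMSE and MAPE for the same predictions; the travel times are invented for illustration, and this particular selection of measures is a suggestion rather than a standard taken from the reviewed literature.

```python
import math
from typing import Dict, Sequence


def error_report(actual: Sequence[float],
                 predicted: Sequence[float]) -> Dict[str, float]:
    """MAE and RMSE in the unit of the inputs, MAPE in percent."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [p - a for a, p in zip(actual, predicted)]
    n = len(errors)
    return {
        "MAE": sum(abs(e) for e in errors) / n,
        "RMSE": math.sqrt(sum(e * e for e in errors) / n),
        # MAPE is undefined if any actual value is zero.
        "MAPE": 100.0 * sum(abs(e) / a for e, a in zip(errors, actual)) / n,
    }


# Hypothetical actual and predicted travel times in seconds.
report = error_report([600.0, 900.0, 1200.0], [630.0, 870.0, 1260.0])
```

Reporting absolute measures together with the route length or mean travel time lets a reader convert between absolute and relative errors, which a relative figure alone does not allow.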

3.2 Comparison

Leading on from the reproducibility problems is the lack of comparison to other methods found in a large proportion of studies (n=11, 27.5%). This would not be a major issue if the same prediction measurements were described; however, as this is not the case, such reports allow only limited comparison between studies. The findings therefore cannot be compared to other researchers’ work and can only be considered standalone reports of a method applied to a certain problem. Such studies do not even give information about possible improvements relative to other currently employed methods. If the researchers had directly compared their approach to a pre-existing or commonly used algorithm, the value of the findings would increase. Comparison to other methods is the only way of establishing a benchmark against which any improvement can be measured.

3.3 Quality

The third issue relates to the reporting standards. A few studies did not make clear what architectures were used in the final algorithm, or left leeway in the interpretation of their findings by not explaining graphs or figures or because of discrepancies between the values in the description and those in the presented figures.

3.4 Conclusion

This review highlighted some shortfalls of the current literature addressing the ETA prediction of buses. Overall, NNs predominated among the methods (n=12, 30%; Figure 4). Deep learning approaches with more than 2 hidden layers were used in 4 publications. However, in one approach an iterative selection of layer numbers and units was applied, but the final number of layers was not reported.

* methods n <2 such as RF, Bayesian Networks etc.

Figure 4: Proportion of each method used in the reviewed studies.

It was telling that several studies found different algorithms performing better in different settings, suggesting that there will not be one superior algorithm for all cases. Unfortunately, due to the highlighted shortcomings, it is not possible to identify the ‘best’ method for each of the categories. Considering their popularity, NNs appear to be the most widely used method, which is suggestive of them being the best performing and/or most universal.

Interestingly, deep learning approaches are underrepresented, and in some cases 2-layer networks were found to perform better than deeper architectures. This could be because NNs with few layers can already represent nonlinear relationships between variables in data with lower complexity. In general, the input features consisted of data regarding one bus line and several variables directly linked to this line, such as other vehicles travelling on the same route. It would be expected that deep learning approaches will be more successful in generalising more complex datasets, for example if the entire network state is considered, including information about all vehicles.

In conclusion, research into bus ETAs lacks consistency and uniform standards. Ideally, an approach similar to that in image classification and other areas could be adopted, where a standard reference dataset is made available and used as a benchmark performance test. Alternatively, publishing the data alongside the code would help increase comparability. Furthermore, it became clear that an industry-wide standard for reporting prediction accuracy is urgently needed.

References

  • [1] Department of Transport. Road Use Statistics Great Britain 2016. Statistical Release, 3(6):452–6, 2016.
  • [2] Graham Cookson and Bob Pishue. INRIX Global Traffic Scorecard, 2017.
  • [3] Ting Xia, Monika Nitschke, Ying Zhang, Pushan Shah, Shona Crabb, and Alana Hansen. Traffic-related air pollution and health co-benefits of alternative transport in Adelaide, South Australia. Environment International, 74:281–290, 2015.
  • [4] Pu Wang, Timothy Hunter, Alexandre M. Bayen, Katja Schechtner, and Marta C. González. Understanding road usage patterns in urban areas. Scientific Reports, 2(August 2015), 2012.
  • [5] Transport Focus. Bus Passenger Survey Autumn 2017 Report. Technical report, Transport Focus, 2017.
  • [6] Jan Willem Grotenhuis, Bart W. Wiegmans, and Piet Rietveld. The desired quality of integrated multimodal travel information in public transport: Customer needs for time and effort savings. Transport Policy, 14(1):27–38, 2007.
  • [7] Rabi G Mishalani, Mark M McCord, and John Wirtz. Passenger Wait Time Perceptions at Bus Stops: Empirical Results and Impact on Evaluating Real-Time Bus Arrival Information. Journal of Public Tr, 9(2):89–106, 2006.
  • [8] Kari Edison Watkins, Brian Ferris, Alan Borning, G. Scott Rutherford, and David Layton. Where Is My Bus? Impact of mobile real-time information on the perceived and actual wait time of transit riders. Transportation Research Part A: Policy and Practice, 45(8):839–848, 2011.
  • [9] Song Xinghao, Teng Jing, Chen Guojun, and Shu Qichong. Predicting Bus Real-time Travel Time Basing on both GPS and RFID Data. Procedia - Social and Behavioral Sciences, 96(Cictp):2287–2299, 2013.
  • [10] Manuel Martin Salvador, Marcin Budka, and Tom Quay. Automatic Transport Network Matching Using Deep Learning. Transportation Research Procedia, 31(2016):67–73, 2018.
  • [11] Rubina Choudhary, Aditya Khamparia, and Amandeep Kaur Gahier. Real time prediction of bus arrival time: A review. In 2016 2nd International Conference on Next Generation Computing Technologies (NGCT), pages 25–29. IEEE, 10 2016.
  • [12] Engin Pekel and Selin Soner Kara. A COMPREHENSIVE REVIEW FOR ARTIFICAL NEURAL NETWORK APPLICATION TO PUBLIC TRANSPORTATION. Sigma Journal of Engineering and Natural Sciences, 35(1):157–179, 2017.
  • [13] Mehmet Altinkaya and Metin Zontul. Urban Bus Arrival Time Prediction: A Review of Computational Models. International Journal of Recent Technology and Engineering, 2(4):2277–3878, 2013.
  • [14] Wichai Treethidtaphat, Wasan Pattara-Atikom, and Sippakorn Khaimook. Bus arrival time prediction at any distance of bus route using deep neural network model. In 2017 IEEE 20th International Conference on Intelligent Transportation Systems (ITSC), pages 988–992. IEEE, 10 2017.
  • [15] Qiang Zhang, Yanhu Zhang, and Jingyi Li. EasyComeEasyGo: Predicting Bus Arrival Time with Smart Phone. In Proceedings - 2015 9th International Conference on Frontier of Computer Science and Technology, FCST 2015, volume 44, pages 268–273. IEEE, 8 2015.
  • [16] Wei-Hua Lin and Jian Zeng. An Experimental Study on Real Time Bus Arrival Time Prediction With Gps Data. Transportation Research Record: Journal of the Transportation Research Board, 1666(1):101–109, 1999.
  • [17] A Kumar, V Kumar, L Vanajakshi, and Shankar Subramanian. Performance Comparison of Data Driven and Less Data Demanding Techniques for Bus Travel Time Prediction. European Transport, 9(65), 2017.
  • [18] Lei Meng, Peiying Li, Junabo Wang, and Zhiyong Zhou. Research on the Prediction Algorithm of the Arrival Time of Campus Bus. In Advances in Intelligent Systems Research (AISR), volume 142, pages 31–33, 2017.
  • [19] Santa Maiti, Arpan Pal, Arindam Pal, T Chattopadhyay, and Arijit Mukherjee. Historical Data based Real Time Prediction of Vehicle Arrival Time. 17th International IEEE Conference on Intelligent Transportation Systems (ITSC), pages 1837–1842, 10 2014.
  • [20] Amer Shalaby and Ali Farhan. Bus Travel Time Prediction Model for Dynamic Operations Control and Passenger Information Systems. Proceedings of the CD-ROM. Transportation Research Board 82nd Annual Meeting, National Research Council, 2003.
  • [21] Amer Shalaby and Ali Farhan. Prediction model of bus arrival and departure times using AVL and APC data. Journal of Public Transportation, 7:41–61, 2004.
  • [22] L. Vanajakshi, S.C. Subramanian, and R. Sivanandan. Travel time prediction under heterogeneous traffic conditions using global positioning system data from buses. IET Intelligent Transport Systems, 3(1):1, 2009.
  • [23] D. Dailey, S. Maclean, F. Cathey, and Z. Wall. Transit Vehicle Arrival Prediction: Algorithm and Large-Scale Implementation. Transportation Research Record: Journal of the Transportation Research Board, 1771(01):46–51, 1 2001.
  • [24] F. W. Cathey and D. J. Dailey. A prescription for transit arrival/departure prediction using automatic vehicle location data. Transportation Research Part C: Emerging Technologies, 11(3-4):241–264, 2003.
  • [25] R. P.S. S Padmanaban, Lelitha Vanajakshi, and Shankar C. Subramanian. Estimation of bus travel time incorporating dwell time for APTS applications. IEEE Intelligent Vehicles Symposium, Proceedings, pages 955–959, 2009.
  • [26] Madzlan Napiah and Ibrahim Kamaruddin. Arima Models for Bus Travel Time Prediction. Journal-The Institution of Engineers, 71(2):49, 2009.
  • [27] Johar Amita, Jain Sukhvir Singh, and Garg Pradeep Kumar. Prediction of Bus Travel Time Using Artificial Neural Network. International Journal for Traffic and Transport Engineering, 5(4):410–424, 2015.
  • [28] Jian Pan, Xiuting Dai, Xiaoqi Xu, and Yanjun Li. A Self-learning algorithm for predicting bus arrival time based on historical data model. In 2012 IEEE 2nd International Conference on Cloud Computing and Intelligence Systems, pages 1112–1116. IEEE, 10 2012.
  • [29] R. Jeong and R. Rilett. Bus arrival time prediction using artificial neural network model. Proceedings. The 7th International IEEE Conference on Intelligent Transportation Systems (IEEE Cat. No.04TH8749), pages 988–993, 2004.
  • [30] Nikolas Julio, Ricardo Giesen, and Pedro Lizana. Real-time prediction of bus travel speeds using traffic shockwaves and machine learning algorithms. Research in Transportation Economics, 59:250–257, 2016.
  • [31] Chee Yau Kee, Li Pei Wong, Ahamad Tajudin Khader, and Fadratul Hafinaz Hassan. Multi-label classification of estimated time of arrival with ensemble neural networks in bus transportation network. 2017 2nd IEEE International Conference on Intelligent Transportation Engineering, ICITE 2017, pages 150–154, 2017.
  • [32] Chi-hua Chen. An Arrival Time Prediction Method for Bus System. IEEE Internet of Things Journal, PP(c):1–1, 2018.
  • [33] Cong Bai, Zhong Ren Peng, Qing Chang Lu, and Jian Sun. Dynamic bus travel time prediction models on road with multiple bus routes. Computational Intelligence and Neuroscience, 2015, 2015.
  • [34] Mei Chen, Jason Yaw, Steven I. Chien, and Xiaobo Liu. Using automatic passenger counter data in bus arrival time prediction. Journal of Advanced Transportation, 41(3):267–283, 2007.
  • [35] Mei Chen, Xiaobo Liu, Jingxin Xia, and Steven I. Chien. A dynamic bus-arrival time prediction model based on APC data. Computer-Aided Civil and Infrastructure Engineering, 19(5):364–376, 2004.
  • [36] S. I. J. Chien, Y. Ding, and C. Wei. Dynamic Bus Arrival Time Prediction with Artificial Neural Networks. Journal of Transportation Engineering, 128(4):429–438, 2002.
  • [37] Lingli Deng, Zhaocheng He, and Renxin Zhong. The Bus Travel Time Prediction Based on Bayesian Networks. 2013 International Conference on Information Technology and Applications, pages 282–285, 2013.
  • [38] Jian Dong, Lu Zou, and Yan Zhang. Mixed Model For Prediction Of Bus Arrival Times. 2013 IEEE Congress on Evolutionary Computation, pages 2918–2923, 2013.
  • [39] Avigdor Gal, Avishai Mandelbaum, François Schnitzler, Arik Senderovich, and Matthias Weidlich. Traveling time prediction in scheduled transportation with journey segments. Information Systems, 64:266–280, 2017.
  • [40] Cristina Heghedus. PhD Forum: Forecasting Public Transit Using Neural Network Models. 2017 IEEE International Conference on Smart Computing, SMARTCOMP 2017, 2017.
  • [41] Xuedong Hua, Wei Wang, Yinhai Wang, and Min Ren. Bus arrival time prediction using mixed multi-route arrival time data at previous stop. Transport, 33(2):1–12, 2017.
  • [42] Zhang Junyou, Wang Fanyu, and Wang Shufeng. Application of Support Vector Machine in Bus Travel Time Prediction. International Journal of Systems Engineering, 2(1):21–25, 2018.
  • [43] Abbas Khosravi, Ehsan Mazloumi, Saeid Nahavandi, Doug Creighton, and J. W.C. Van Lint. A genetic algorithm-based method for improving quality of travel time prediction intervals. Transportation Research Part C: Emerging Technologies, 19(6):1364–1376, 2011.
  • [44] Yao Li, Chuanlin Huang, and Jingjing Jiang. Research of bus arrival prediction model based on GPS and SVM. Proceedings of the 30th Chinese Control and Decision Conference, CCDC 2018, pages 575–579, 2018.
  • [45] Yongjie Lin, Xianfeng Yang, Nan Zou, and Lei Jia. Real-Time Bus Arrival Time Prediction: Case Study for Jinan, China. Journal of Transportation Engineering, 139(11):1133–1140, 2013.
  • [46] Mathieu Sinn, Ji Won Yoon, Francesco Calabrese, and Eric Bouillet. Predicting arrival times of buses using real-time GPS measurements. IEEE Conference on Intelligent Transportation Systems, Proceedings, ITSC, 2(4):1227–1232, 9 2012.
  • [47] Lei Wang, Zhongyi Zuo, and Junhao Fu. Bus Arrival Time Prediction Using RBF Neural Networks Adjusted by Online Data. Procedia - Social and Behavioral Sciences, 138(0):67–75, 2014.
  • [48] Haitao Xu and Jing Ying. Bus arrival time prediction with real-time and historic data. Cluster Computing, 20(4):3099–3106, 2017.
  • [49] Tingting Yin, Gang Zhong, Jian Zhang, Shanglu He, and Bin Ran. A prediction model of bus arrival time at stops with multi-routes. Transportation Research Procedia, 25:4627–4640, 2017.
  • [50] Bin Yu, Zhong-Zhen Yang, Kang Chen, and Bo Yu. Hybrid model for prediction of bus arrival times at next station. Journal of Advanced Transportation, 44(3):193–204, 7 2010.
  • [51] Bin Yu, William H.K. K Lam, and Mei Lam Tam. Bus arrival time prediction at bus stop with multiple routes. Transportation Research Part C: Emerging Technologies, 19(6):1157–1170, 2011.
  • [52] Bin Yu, Huaizhu Wang, Wenxuan Shan, and Baozhen Yao. Prediction of Bus Travel Time Using Random Forests Based on Near Neighbors. Computer-Aided Civil and Infrastructure Engineering, 33:333–350, 2017.
  • [53] M. Zaki, I. Ashour, M. Zorkany, and B. Hesham. Online Bus Arrival Time Prediction Using Hybrid Neural Network and Kalman filter Techniques. International Journal of Modern Engineering Research (IJMER), 3(4):1–9, 2013.
  • [54] Zachary C. Lipton, John Berkowitz, and Charles Elkan. A Critical Review of Recurrent Neural Networks for Sequence Learning. arXiv, pages 1–38, 2015.
  • [55] B. Anil Kumar, R. Jairam, Shriniwas S. Arkatkar, and Lelitha Vanajakshi. Real time bus travel time prediction using k-NN classifier. Transportation Letters, 7867:1–11, 2017.
  • [56] The Royal Society. Machine learning: the power and promise of computers that learn by example, volume 66. The Royal Society, 2017.
  • [57] Wei Luo, Dinh Phung, Truyen Tran, Sunil Gupta, Santu Rana, Chandan Karmakar, Alistair Shilton, John Yearwood, Nevenka Dimitrova, Tu Bao Ho, Svetha Venkatesh, and Michael Berk. Guidelines for Developing and Reporting Machine Learning Predictive Models in Biomedical Research: A Multidisciplinary View. Journal of medical Internet research, 18(12):e323, 2016.
  • [58] Monya Baker. Over half of psychology studies fail reproducibility test. Nature, 8 2015.
  • [59] Tal Yarkoni and Jacob Westfall. Choosing Prediction Over Explanation in Psychology: Lessons From Machine Learning. Perspectives on Psychological Science, 12(6):1100–1122, 2017.
  • [60] Monya Baker. 1,500 scientists lift the lid on reproducibility. Nature, 533(7604):452–454, 5 2016.
  • [61] Matthew Hutson. Artificial intelligence faces reproducibility crisis: Unpublished code and sensitivity to training conditions make many claims hard to verify, 2 2018.
  • [62] Jonathan W. Schooler. Metascience could rescue the ‘replication crisis’. Nature, 515(7525):9–9, 11 2014.