An intelligent transportation system (ITS) is a large-scale information system whose objective is to provide guidance information to drivers and to optimize traffic by analyzing vehicle traffic over an entire city. To provide accurate information, an ITS needs to collect accurate and comprehensive road traffic data. Owing to the development of information and sensing technologies, we can collect various types of road traffic data, such as density, flow, and speed, from different sensing devices such as optical beacons and probe vehicles. These sensors each have different features. For example, a beacon, which is a fixed type of traffic sensor, can steadily collect the traffic data of the road where it is located in a short time period; however, its detection area is narrow. A probe vehicle, which is a GPS-equipped vehicle, can collect the traffic data of a comprehensive area, but cannot collect the data steadily and needs a long time period to acquire comprehensive traffic data. Therefore, the fusion of various data collected from different sensors for traffic prediction has recently attracted much attention.
Traffic prediction is a major research topic in the machine learning field. In fact, the analysis of freeway traffic has been researched since the 1970s. Travel time prediction, density prediction, and route planning are other active topics. In the machine learning approach, the existence of two databases, real-time (RDB) and historical (HDB), is assumed. An RDB consists of road traffic data collected from sensors at the present time, while an HDB contains road traffic data collected from sensors and traffic surveys in the past. The data in the RDB represent the situation for which we want to make a traffic prediction, while the HDB consists of traffic data of roads in a comprehensive area over long time periods and can be used to help the prediction. That is, we use an HDB for learning and make a traffic prediction based on an RDB.
However, there still remains an important problem related to traffic prediction based on an RDB: the quality of the prediction depends on the quality of the RDB. We cannot acquire the complete traffic data of an entire road network in a short time period since sensors are not installed on all roads. In fact, only 22% of the total length of trunk roads in Nagoya, Japan is covered by beacons. Further, at the present time, there are not enough probe vehicles to allow sufficient data to be acquired. If the number of probe vehicles in Japan is a hundred thousand, we need an hour on average to acquire one or two traffic data of an entire road network. In practice, it is therefore difficult to collect sufficiently comprehensive road traffic data in a short time period to make a traffic prediction, and we need a method to reconstruct the unobserved parts in an RDB to solve a realistic traffic prediction problem. Recently, some researchers have tackled this problem. Kumagai et al. proposed a method to reconstruct the traffic data of unobserved parts in an RDB based on feature space projection, which Kumagai and co-workers then applied to the dynamical traffic prediction problem. In the field of statistical mechanics, Furtlehner et al. modeled road traffic as an Ising model, where the state is determined by whether a road is congested or not, and used belief propagation to address the traffic reconstruction and prediction problem that arises when the observed data are incomplete.
In this paper, we propose a new algorithm to reconstruct the traffic data of the unobserved parts in an RDB. We use a Bayesian approach to express a posterior probability density function of unobserved roads. Our method is based on Markov random field (MRF) modeling of road traffic. After learning our MRF model from the HDB, the traffic data of the unobserved parts in an RDB are reconstructed by solving simple simultaneous equations derived by the mean-field method. Owing to the simplicity of the model, our method can easily address large-scale problems to which we consider it difficult to apply previous methods.
The remainder of this paper is organized as follows. In section 2, we introduce a graph representation of a road network and MRF modeling of road traffic. In section 3, we propose a traffic data reconstruction algorithm based on the MRF modeling of road traffic described in section 2. In section 4, we give a framework for determining the hyperparameters in the posterior probability density function derived in section 3 using the machine learning method. In section 5, we numerically verify the performance of our MRF model by using large-scale simulation data for the road network of Sendai, Japan (the number of roads is ). The performance is evaluated by conducting leave-one-out cross-validation. Finally, in section 6, we present our concluding remarks.
2 MRF modeling of road traffic
In this section, we explain how road traffic is expressed by MRFs. First, we define the undirected graph representation of a real road network. Let us consider a road network consisting of $n$ roads or road segments. A vertex $i$ corresponds to the $i$th road in the road network. The set of all edges includes either edge $(i, j)$ or edge $(j, i)$ if a vehicle on road $i$ can move to road $j$ without passing along the other roads.
We assign a continuous random variable associated with the traffic data of road . For each vertex and edge, we assign a potential function and , respectively. Then, the joint probability density function of is written as a product of potential functions:
The quantity is a partition function defined as
where is taken over all the configurations of the random variables . If we want to use a discrete random variable, the integration over continuous variables in equation (2) becomes a summation over discrete variables.
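One standard way to write such a pairwise MRF is sketched below. The symbols are chosen here for illustration and are assumptions: $V$ and $E$ denote the vertex and edge sets, $\phi_i$ and $\psi_{ij}$ the vertex and edge potential functions, and $Z$ the partition function.

```latex
p(\mathbf{x}) = \frac{1}{Z} \prod_{i \in V} \phi_i(x_i) \prod_{(i,j) \in E} \psi_{ij}(x_i, x_j),
\qquad
Z = \int \prod_{i \in V} \phi_i(x_i) \prod_{(i,j) \in E} \psi_{ij}(x_i, x_j) \, \mathrm{d}\mathbf{x}.
```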
To explain our model, a simple case is shown in figure 1. There are six roads, represented as encircled numbers, and two intersections in figure 1 (a). In this toy road network, vehicles on road 1 can move directly to roads 2, 3, or 4, but cannot move to roads 5 or 6 without passing along road 4. This road network is then translated to its graph representation, shown in figure 1 (b). In this case, the joint probability density function is expressed as
where . It should be noted that we ignore road direction relationships throughout this paper for simplicity, as shown in figure 1; however, extending the model to include road direction relationships is straightforward.
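The translation from a road network to its graph can be sketched in a few lines of Python. This is a minimal sketch under an assumed layout of the toy network: roads 1 to 4 meet at one intersection and roads 4 to 6 at the other, which is consistent with the description above but is not read off figure 1. The helper names `road_graph` and `neighbors_of` are hypothetical.

```python
from itertools import combinations


def road_graph(intersections):
    """Return the sorted undirected edge list of the road graph.

    `intersections` is a list of sets; each set holds the roads that meet at
    one intersection, so any two of them are directly reachable from each
    other and are joined by one undirected edge.
    """
    edges = set()
    for roads in intersections:
        for i, j in combinations(sorted(roads), 2):
            edges.add((i, j))
    return sorted(edges)


def neighbors_of(edges, v):
    """Return the sorted list of vertices adjacent to vertex v."""
    return sorted({j for i, j in edges if i == v} | {i for i, j in edges if j == v})


# Assumed layout: roads 1-4 share one intersection, roads 4-6 the other.
toy_edges = road_graph([{1, 2, 3, 4}, {4, 5, 6}])
print(toy_edges)
print(neighbors_of(toy_edges, 1))
```

Under this assumption road 4 is the only vertex adjacent to both groups of roads, so every path from road 1 to road 5 or 6 passes through it.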
3 Traffic data reconstruction algorithm based on MRF
As mentioned in the introduction, a problem that affects traffic prediction is that we cannot collect complete traffic data of all roads , due to a lack of sensors. Hereafter, is the traffic density of road , which is obtained by dividing the number of vehicles by the length of road . In this section, we propose a method to reconstruct the traffic densities of unobserved roads from observed traffic densities based on MRF modeling and the Bayesian point of view. Suppose that is a set of traffic densities of observed roads collected by sensors at a certain time and that we do not have complete information about all traffic densities. Our goal is to reconstruct the traffic densities of unobserved roads .
From the Bayesian point of view, the traffic densities of unobserved roads are inferred by using the posterior probability density function expressed as
where is the observed results for all roads collected by sensors, and is a conditional density function expressing how is obtained from the true traffic density . It should be noted that, since is a specific value, the denominator in equation (5) is a constant.
To define a concrete joint probability density function of , we assume that the potential functions in equation (1) are expressed as
respectively, where and are hyperparameters that determine features of road traffic. represents how large a value the density of road takes and is associated with the closeness of neighboring roads in graph representation.
Then, the joint probability density function, which is regarded as the prior density function in the Bayesian point of view, of is written as
where and the matrix is defined by
where is a set of vertices neighboring vertex . For positive , the second term in the exponential in equation (8) guarantees the normalization of the joint probability density function. This form of probability density function is known as a Gaussian MRF and has been widely used in various applications.
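As a minimal sketch, suppose that each edge penalizes the squared difference of neighboring densities with strength `alpha` and that each vertex carries a small diagonal term `lam`. Both names and this exact parameterization are assumptions made for illustration and are not taken from equations (6) to (9). Under that assumption the precision matrix of the Gaussian MRF is `lam` times the identity plus `alpha` times the graph Laplacian, and a positive `lam` makes it positive definite, which is what normalizability requires.

```python
import numpy as np


def precision_matrix(n, edges, lam, alpha):
    """Assemble C = lam * I + alpha * L, where L is the graph Laplacian.

    Vertices are numbered 0..n-1 and `edges` is a list of undirected pairs.
    This parameterization is an assumption made for illustration.
    """
    C = lam * np.eye(n)
    for i, j in edges:
        C[i, i] += alpha
        C[j, j] += alpha
        C[i, j] -= alpha
        C[j, i] -= alpha
    return C


# Toy graph with six roads (0-based), assumed to form two cliques sharing road 3.
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (3, 4), (3, 5), (4, 5)]
C = precision_matrix(6, edges, lam=1e-2, alpha=1.0)

# With lam > 0 all eigenvalues are positive, so the Gaussian is normalizable.
print(np.linalg.eigvalsh(C).min() > 0)
```

With `lam = 0` the matrix reduces to the Laplacian, which has a zero eigenvalue along the constant vector, so the density could not be normalized.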
We define a conditional density function as
where is the Dirac delta function. Here, we assume that the unobserved traffic densities can take any real value with equal probability, and the densities of observed roads are not changed at all.
From equation (5), the posterior probability density function is written as . Thus, the marginal posterior probability density function over the traffic densities of unobserved roads is expressed as
where , , and . The matrix and vector are defined as follows:
where . The reconstruction of unobserved traffic densities in the RDB is achieved by finding the values such that
for . Because the marginal posterior probability density function in equation (12) is a multivariate Gaussian distribution, the values are given by its mean vector, and we can calculate them exactly by applying the mean-field approximation. The problem of estimating road traffic densities is thus reduced to solving the following simultaneous equations by an iteration method:
The proposed algorithm for reconstructing the traffic densities of unobserved roads in an RDB is summarized as follows:
Determine the sets and from a graph representation of a road network. Input the values of observed traffic densities .
Calculate matrix according to the definition equation (13).
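The steps above, followed by the iterative solution of the mean-field equations, can be sketched as follows. The sketch assumes that the prior is written as a Gaussian proportional to exp(b·x − x·Cx/2) with a bias vector `b` and a precision matrix `C`; these names, the toy graph, and the numerical values are assumptions for illustration. Each sweep replaces the estimate of one unobserved road by the value that satisfies its mean-field equation given the current values of its neighbors, while observed roads are held fixed.

```python
import numpy as np


def reconstruct(C, b, observed, y_obs, n_sweeps=500, tol=1e-10):
    """Fill in unobserved entries by iterating the mean-field equations.

    C        : (n, n) symmetric positive-definite precision matrix (assumed form)
    b        : (n,) bias vector (assumed name)
    observed : indices of observed roads
    y_obs    : observed densities, in the same order as `observed`
    """
    n = len(b)
    observed_set = set(observed)
    unobserved = [i for i in range(n) if i not in observed_set]
    x = np.zeros(n)
    x[list(observed)] = y_obs  # observed roads are clamped to their data
    for _ in range(n_sweeps):
        max_change = 0.0
        for i in unobserved:
            # Mean-field update: C[i, i] * x_i = b_i - sum_{j != i} C[i, j] * x_j
            new_value = (b[i] - C[i] @ x + C[i, i] * x[i]) / C[i, i]
            max_change = max(max_change, abs(new_value - x[i]))
            x[i] = new_value
        if max_change < tol:
            break
    return x


# Toy example on an assumed six-road graph (0-based vertex numbers).
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (3, 4), (3, 5), (4, 5)]
n = 6
C = 1e-2 * np.eye(n)
for i, j in edges:
    C[i, i] += 1.0
    C[j, j] += 1.0
    C[i, j] -= 1.0
    C[j, i] -= 1.0
b = np.full(n, 0.05)
observed = [0, 3]
y_obs = np.array([1.0, 2.0])
x_hat = reconstruct(C, b, observed, y_obs)
print(x_hat)
```

For a Gaussian model the fixed point of this iteration coincides with the exact conditional mean, so the result can be checked against a direct linear solve on small problems; the iteration is preferable on large networks because it only touches the neighbors of each road.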
4 Determining hyperparameters from HDB
In the previous section, we derived a reconstruction algorithm for the traffic densities of unobserved roads based on the mean-field method. However, we have not yet specified the values of the hyperparameters. The purpose of this section is to show how we determine these parameters from the HDB using a machine learning method. In this section, we assume that a large number of complete traffic data are available. A justification for this assumption is that we do not need real complete data to determine the hyperparameters; artificial data suffice if they express the situations of road traffic well. Once we accept the assumption that daytime road traffic situations are similar on different days, we can create such pseudo-complete traffic data at a certain time by merging the data collected on different days because, unlike the RDB, the HDB consists of many traffic data for long time periods and a comprehensive area. This assumption seems plausible, especially at rush hour in an urban area, where traffic predictions are necessary. The extension to areas where this assumption is violated, and the difficulty involved, is discussed in section 6.
Let us suppose that we have complete road data of traffic densities, , created from the HDB. The empirical distribution of the complete road data is given by
A standard approach to determining hyperparameters is finding the one that maximizes the likelihood function defined as
However, this approach often gives rise to the over-fitting problem, which occurs when the number of hyperparameters is larger than the number of data. In the present model, there exist hyperparameters. Therefore, in the machine learning approach, we sometimes maximize the regularized likelihood function written as
This regularization method is called ridge regression. The parameter is called the regularization parameter; it prevents the magnitudes of hyperparameters from being extremely large to fit the data and is often determined by hand in advance.
where the notation denotes the expectation with respect to , i.e., the sample average over the complete traffic data set. Using the gradient ascent method, we can obtain the values of and that maximize . The gradients of with respect to and are calculated as
It should be noted that, although we need the inverse of matrix in equation (24) and equation (25), it is enough to calculate the inverse matrix once in pre-processing because it depends only on the structure of a given road network.
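The learning step can be illustrated with a small gradient ascent sketch. Everything specific here is an assumption made for illustration: the prior is taken to be proportional to exp(b·x − (alpha/2) x·Cx) with a structure-only matrix `C`, so that its inverse is computed once; the penalty is an L2 term with weight `eta` on both `b` and `alpha`; and the step size, data, and function names are hypothetical.

```python
import numpy as np


def objective(b, alpha, C, Cinv, X, eta):
    """Average log-likelihood of the rows of X minus an L2 penalty (assumed form)."""
    n = len(b)
    xbar = X.mean(axis=0)
    S = np.einsum("ki,ij,kj->", X, C, X) / len(X)  # sample mean of x^T C x
    _, logdet_C = np.linalg.slogdet(C)
    log_Z = (0.5 * n * np.log(2 * np.pi / alpha) - 0.5 * logdet_C
             + b @ Cinv @ b / (2 * alpha))
    return b @ xbar - 0.5 * alpha * S - log_Z - 0.5 * eta * (b @ b + alpha**2)


def gradients(b, alpha, C, Cinv, X, eta):
    """Gradients of `objective` with respect to b and alpha."""
    n = len(b)
    xbar = X.mean(axis=0)
    S = np.einsum("ki,ij,kj->", X, C, X) / len(X)
    grad_b = xbar - Cinv @ b / alpha - eta * b
    grad_alpha = (-0.5 * S + 0.5 * n / alpha
                  + b @ Cinv @ b / (2 * alpha**2) - eta * alpha)
    return grad_b, grad_alpha


def fit(C, X, eta=1e-3, lr=0.05, n_steps=2000):
    """Plain gradient ascent; the inverse of C is computed only once."""
    Cinv = np.linalg.inv(C)
    b, alpha = np.zeros(C.shape[0]), 1.0
    for _ in range(n_steps):
        grad_b, grad_alpha = gradients(b, alpha, C, Cinv, X, eta)
        b = b + lr * grad_b
        alpha = max(alpha + lr * grad_alpha, 1e-6)  # keep alpha positive
    return b, alpha


# Synthetic "complete data" drawn from the assumed model on a toy graph.
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (3, 4), (3, 5), (4, 5)]
n = 6
C = np.eye(n)  # diagonal term set to 1 only to keep the toy problem well conditioned
for i, j in edges:
    C[i, i] += 1.0
    C[j, j] += 1.0
    C[i, j] -= 1.0
    C[j, i] -= 1.0
rng = np.random.default_rng(0)
b_true, alpha_true = np.ones(n), 2.0
X = rng.multivariate_normal(np.linalg.solve(alpha_true * C, b_true),
                            np.linalg.inv(alpha_true * C), size=200)
b_hat, alpha_hat = fit(C, X)
print(b_hat, alpha_hat)
```

Because the assumed model is an exponential family in `b` and `alpha`, the penalized objective is concave, so plain gradient ascent with a small enough step size approaches the unique maximizer.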
5 Numerical experiments
In this section, we describe the numerical verification of the performance of our MRF model. We used the real road network of Sendai, Japan, shown in figure 2, and vehicle traffic data, which constitute a snapshot of its simulated vehicle traffic. These simulation data represent the real vehicle traffic in Sendai, Japan. In the graph representation of the Sendai road network, there are vertices and edges.
To evaluate the performance of our model, we conducted leave-one-out cross-validation, in which only one data item is used to check the performance and the others are used to determine the hyperparameters. The performance of the model is then given by the average over all the choices of test data. That is, for each choice of test data, we regard the other data as complete data created from the HDB, and the test data are used to create the data in the RDB. In the test phase, we randomly selected unobserved roads with equal probability from all roads, and then reconstructed the traffic densities of the unobserved roads using our algorithm. For each test data item, we evaluated the performance of our model by the average of the mean absolute errors (MAE) between the true and reconstructed traffic densities over trials, defined by
where is the number of unobserved roads at the th trial and is the true traffic density of road in the th data. Hence, the results of leave-one-out cross-validation are given by
for each .
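The evaluation protocol can be sketched as below. This is a minimal sketch: the names `fit_fn`, `reconstruct_fn`, `p` (the probability that a road is unobserved), and `n_trials` are assumptions, and the stand-in model used in the example simply fills unobserved roads with training means; it is not the MRF reconstruction itself.

```python
import numpy as np


def mae_unobserved(x_true, x_rec, unobserved):
    """Mean absolute error restricted to the unobserved roads."""
    idx = np.asarray(unobserved)
    return float(np.mean(np.abs(x_true[idx] - x_rec[idx])))


def leave_one_out(X, fit_fn, reconstruct_fn, p, n_trials, seed=0):
    """Average MAE over all held-out data items and random masks.

    X              : (K, n) array, one complete traffic snapshot per row
    fit_fn         : maps the training rows to model parameters
    reconstruct_fn : maps (params, x_test, mask) to a full reconstruction,
                     where mask[i] is True if road i is unobserved
    """
    rng = np.random.default_rng(seed)
    scores = []
    for k in range(len(X)):
        params = fit_fn(np.delete(X, k, axis=0))  # train on all other items
        trial_scores = []
        for _ in range(n_trials):
            mask = rng.random(X.shape[1]) < p     # each road unobserved w.p. p
            if not mask.any():
                continue                          # nothing to reconstruct
            x_rec = reconstruct_fn(params, X[k], mask)
            trial_scores.append(mae_unobserved(X[k], x_rec, mask))
        if trial_scores:
            scores.append(np.mean(trial_scores))
    return float(np.mean(scores))


# Hypothetical stand-in model: fill unobserved roads with the training mean.
def mean_fit(train):
    return train.mean(axis=0)


def mean_reconstruct(params, x_test, mask):
    return np.where(mask, params, x_test)


X_demo = np.tile(np.array([1.0, 2.0, 3.0, 4.0]), (3, 1))
print(leave_one_out(X_demo, mean_fit, mean_reconstruct, p=0.5, n_trials=20))
```

Any reconstruction routine with the same call signature can be substituted for the stand-in without changing the evaluation loop.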
Figure 3 shows the plot of MAE versus when , , and . Here, we set in equation (6) so that the effect of the first term in the exponent is as small as possible, because this term is needed only to guarantee the normalization.
In the region where is sufficiently small, our reconstruction algorithm yields a good performance for all values of , and the MAE asymptotically approaches the values when . When , MAE was , , and for , , and , respectively. Hence, is the best approach for determining the hyperparameters in our model.
Figure 4 shows an example of our numerical experiments when and .
Figure 4 (a) shows the original traffic densities and figure 4 (e) shows the traffic densities reconstructed using our model. In figure 4 (a) and (e), the road colors change from black to blue, green, yellow, and red in order of increasing traffic density by intervals where a black road is one where the traffic density takes a value between and . Figure 4 (c) shows the positions of unobserved roads; we colored the roads red when they were selected as unobserved roads with probability . That is, about of roads are unobserved. The black roads in figure 4 (c) denote the positions of traffic sensors that collect traffic densities of the observed parts in the RDB. The MAE between figure 4 (a) and (e) is . Figure 4 (b), (d), and (f) are enlarged images of the downtown area of Sendai, Japan shown in figure 4 (a), (c), and (e), respectively. Our method yields good reconstruction results for the unobserved parts in the RDB, as shown in figure 4, and these results show that our Gaussian model expresses the real traffic situation well.
6 Concluding remarks
In this paper, we proposed a traffic data reconstruction method based on MRF modeling. The reconstruction of unobserved parts in an RDB is reduced to solving simple simultaneous equations derived by the mean-field method. The hyperparameters in our model are determined using past traffic data in an HDB. We checked the performance of our method by conducting leave-one-out cross-validation, as described in section 5. In the numerical experiments, we used large-scale simulated data for Sendai, Japan. We consider it difficult to apply previous reconstruction methods to such a large-scale road network. It should be noted that, in this study, we reconstructed only the traffic density data, but the extension of our MRF model to other data types, such as speed or flow, and furthermore to combinations of these data types, is straightforward.
In our scheme, we made two assumptions about the HDB and traffic densities for analytical convenience. The first assumption was that we can create a number of complete traffic data from an HDB because it can contain many traffic data for a long time period and a comprehensive area, and the daily conditions of road traffic seem similar, especially in an urban area. This assumption might not hold in an area where the amount of traffic is small, such as a rural area. We can modify our learning framework by using an expectation-maximization algorithm for determining hyperparameters from an incomplete data set in an HDB of such an area. However, we need to calculate the inverse of different matrices in this framework. The matrix is defined similarly to equation (13), but its dimension, which corresponds to the number of unobserved roads in the th data, may be different. Therefore, analytical treatment is difficult and we need to seek some approximate method to calculate in this framework. It should be noted that the reconstruction scheme described in section 3 does not change after this modification. The second assumption was that traffic density can take any real value, and its potential functions have a quadratic form, as in equations (6) and (7). This assumption leads to the Gaussian MRF of traffic densities, which is a unimodal density function. In our definition of MRF modeling of traffic in section 2, we did not need to restrict the form of the potential functions and their arguments. One extension that would result in a more complex MRF is using a non-negative Boltzmann machine, which is a multi-modal density function, for the joint density function of ; however, we need an approximation method because its analytical treatment is difficult. We aim to develop our MRF model further in these directions.
The authors thank Prof. Masao Kuwahara and Dr. Jinyoung Kim of the Graduate School of Information Science, Tohoku University, for providing road network data and traffic simulation data. This work was partly supported by Grants-In-Aid (Nos. 25280089, 24700220 and 257259) from the Ministry of Education, Culture, Sports, Science and Technology of Japan. S.K. was partially supported by a Research Fellowship of the Japan Society for the Promotion of Science for Young Scientists.
-  Faouzi N E E, Leung H and Kurian A 2011 Information Fusion 12 4–10
-  Bishop C M 2006 Pattern Recognition and Machine Learning (Springer)
-  Ahmed M S and Cook A R 1979 Transportation Research Record 722 1–9
-  Ide T and Sugiyama M 2011 Trajectory regression on road networks Proc. 24th Association for the Advancement of Artificial Intelligence (AAAI) Conf. on Artificial Intelligence (AAAI-11) pp 203–208
-  Kriegel H P, Renz M, Schubert M and Zuefle A 2008 Statistical density prediction in traffic networks Proc. Society for Industrial and Applied Mathematics (SIAM) Int. Conf. on Data Mining (SDM_08) pp 692–703
-  Nikolova E and Karger D R 2008 Route planning under uncertainty: the Canadian traveler problem Proc. 23rd Association for the Advancement of Artificial Intelligence (AAAI) Conf. on Artificial Intelligence (AAAI-08) pp 969–974
-  Morikawa T, Yamamoto T, Miwa T and Wan L 2007 Koutsuukougaku 42 65–75 [in Japanese]
-  Fushiki T, Yokota T, Kimita K and Kumagai M 2004 Study on density of probe cars sufficient for both level of area coverage and traffic information update cycle Proc. 11th World Congress on ITS
-  Kumagai M, Fushiki T, Yokota T and Kimita K 2006 Information Processing Society of Japan (IPSJ) Journal 47 2133–2140 [in Japanese]
-  Kumagai M, Hiruta T, Okude M and Yokota T 2012 Information Processing Society of Japan (IPSJ) Journal 53 243–250 [in Japanese]
-  Furtlehner C, Lasgouttes J M and Fortelle A D L 2007 A belief-propagation approach to traffic prediction using probe vehicles Proc. 10th IEEE Conf. of Intelligent Transportation Systems pp 1022–1027
-  Rue H and Held L 2005 Gaussian Markov Random Fields: Theory and Application (Chapman and Hall/CRC)
-  Wainwright M J and Jordan M I 2008 Foundations and Trends® in Machine Learning 1 1–305
-  Hoerl A E and Kennard R W 1970 Technometrics 12 55–67
-  Dempster A P, Laird N M and Rubin D B 1977 Journal of the Royal Statistical Society 39 1–38
-  Downs O B, Mackay D J C and Lee D D 2000 Advances in Neural Information Processing Systems 12 428–434
-  Yasuda M and Tanaka K 2012 Philosophical Magazine 92 192–209