I Introduction
The auto companies have been competing to get their automated vehicles (AVs) ready on road for years, yet there is still none available in the market. Partly, this is due to the challenging task of robustly testing and guaranteeing the safety of an AV before its release. Companies have been trying different methods such as road test [1, 2], computer simulation test [3] and humanvehicle interaction test [4, 5], yet providing safety certificate for an AV system is still open for solving [1]. Assisting the endeavors of solving this problem, the U.S Department of Transportation has released a new AV policy: A Vision for Safety 2.0 [6]. This official document standardizes the required safety features of an autonomous vehicle, providing guidance and clearer pathways for the various stakeholders aiming to certify the safety of their AV systems. However, even with this newly published official guideline, the testing standard remains unclear while the AV target release is quickly approaching. Thus, an effective and efficient testing method for an autonomous vehicle is an urgent need under this background.
Traditional vehicle safety tests are based on crash databases collected from crashes or dangerous scenarios, such as the CSD and GIDAS crash databases [7]. However, the information logged in these databases is limited so that it is difficult to reconstruct and analyze the dangerous scenarios. As a result, the NaturalisticField Operational Test (NFOT) has been actively used instead in recent auto safety evaluation. An NFOT logs naturalistic driving data such as the driver’s maneuver, driving environment, and vehicle dynamics, providing detailed data for reconstructing the desired traffic scenario. Among these NFOT records are the 100Car Naturalistic Driving Study [8], the Safety Pilot Model Deployment (SPMD) [9], and Strategic Highway Research Program (SHRPII) [10]. To evaluate the safety of an AV algorithm, an NFOT based safety evaluation tests the performance of driving policy under a certain sampled environment drawn from the databases. The recent research includes modeling driver failure into safety evaluation [11, 12] and analyzing the contributing factors in dangerous scenarios [13, 14, 15]. In these researches, the safety is evaluated under certain conditions, from which it is then generalized under various naturalistic driving conditions. While sampling from the NFOT data is a safe method to imitate the reality, it turns out that such approach is deemed inefficient. According to National Highway Safety Administration [16], the fatality rate is 1.08 per 100 million vehicle miles, which means it requires approximately 100 million miles data to generate a fatal crash in a simulation under natural conditions. This enormous simulation effort is indeed costly and timeconsuming. Thus, an accelerated method is necessary to attain an effective and efficient auto safety evaluation.
Accelerated Evaluation (AE) procedure is proposed in [17] to serve this notion of accelerated framework, and is further studied in [18, 19, 20]
. This approach utilizes statistical models to represent the traffic environment and estimates the rate of safetycritical events in the test scenarios. Obtaining an unbiased estimator for these rates is accelerated by the use of Importance Sampling (IS), alleviating the needs for extensive simulation replications. The efficiency of this approach depends primarily on the choice of accelerated distribution, which needs to be constructed based on specific characteristics of the problem of interest.
This paper extends the study of AE by proposing a new scheme for constructing accelerated distribution. The scheme merges supervised learning methods with rare event simulation and utilizes the properties of Gaussian distribution in IS theory. The proposed approach provides an alternative method for constructing accelerated distribution for AV test scenarios under Gaussian Mixture Models (GMM).
The structure of the paper is as the following: Section II reviews the AE procedure and emphasizes the concepts relevant to the proposed approach. Section III presents the base knowledge and shows the procedure of the proposed approach. Section IV showcases the performance of the proposed approach based on two problem instances.
Ii Review of Accelerated Evaluation
In this section, the AE procedure is reviewed in detail in the following order. Section IIA gives the highlevel overview of the AE procedure, followed by Section IIB in which the general problem setting for the approach is described. To provide adequate knowledge regarding the statistical methods used, the IS theory, that supports the validity of AE procedure, will be concisely presented in Section IIC.
Iia Procedure For Accelerated Evaluation
In evaluating the performance of an AV system, the AE procedure suggests to examine the AV under various different test scenarios. The safety level of a vehicle is assessed by the rate of safetycritical events in the test scenarios. In this case, the goal of the AE is modeled as a probability estimation problem (described in section
IIB). The AE procedure consists of four major steps [19]:
Model the behaviors of the traffic environment represented by (original distribution) as the major disturbance to the AV using largescale naturalistic driving data.

Skew the disturbance statistics from to modified statistics (accelerated distribution) to generate more frequent and intense interactions between AVs and the traffic environment.

Conduct “accelerated tests” with .

Use the IS technique to “skew back” the results to understand realworld behavior and safety benefits.
Note that the key part in the AE procedure is the use of IS technique, whose theory will be reviewed in Section IIC.
The proposed approach provides an alternative in constructing the accelerated distribution in Step 2. In the following sections, we focus on the construction of accelerated distribution.
IiB Evaluation Problem
The AE evaluates testing vehicles by individually estimating the rate of safetycritical events under various different test scenarios. Here, we show the setting of the problem and the notations that we use in the following sections.
In a test scenario, we assume that the uncertain environment traffic is represented by a vector
and is regarded as a stochastic model. Note that is the domain of . The distribution model for is estimated from a sufficiently large set of naturalistic driving data. The safetycritical events occurs with certain values of . We represent the set as and use an indicator function(1) 
to denote the response of a certain vector . is usually complicated, but can be evaluated by ontrack experiments or computer simulations.
Generally, the safetycritical events are very rare (smaller than ). For this reason, using crude Monte Carlo method to estimate the objective is timeconsuming due to large sample size needed to observe a critical event, and even much larger samples to achieve statistical confidence. Therefore, we need to improve the efficiency of the evaluation.
IiC Importance Sampling
is a technique to reduce the variance in simulation, while providing unbiased estimation. This technique is crucial to the efficiency and unbiasedness of the AE.
Consider a random vector with distribution and a rare event set on sample space . Our goal is to estimate the probability of the rare event
(2) 
where the event indicator function is defined as (1).
Given samples independently and identically generated from , the crude Monte Carlo computes the sample mean of
(3) 
The IS technique is derived from a change of sampling distribution as the following [21]:
(4) 
where is a distribution density function that has the same support with .
Note that
(5) 
where denotes the expectation with regard to . Therefore by taking the sample mean of the expectation,
(6) 
gives an unbiased estimator of the above expectation [21], where ’s are generated from .
By appropriately selecting , the evaluation procedure obtains an estimation with smaller variance. We refer to as the accelerated distribution in our procedure. The construction of a “good” accelerated distribution is discussed in section III.
Iii Constructing Accelerated Distribution via Classification Methods
As discussed in [20], GMM is a powerful distribution model for representing traffic environment in AE. In this section, we propose a new approach to construct accelerated distributions for test scenarios under GMM setting.
In Section IIIA, we review the existing strategies for constructing efficient accelerated distribution for GMM. Section IIIB shows the key idea of our approach. Section IIIC presents the procedure of the proposed approach.
Iiia Known Strategy for Efficient Accelerated Distribution
When the random vector follows Gaussian distribution with mean and covariance matrix and the rare event set satisfies the convexity assumption, there is a simple scheme that obtains an efficient accelerated distribution [23, 24].
For a convex rare event set , we define the dominating point of on to be
(7) 
where is the density function for Gaussian distribution with mean and covariance matrix . By shifting the mean of the Gaussian distribution to , we obtain an accelerated distribution that provides an efficient estimator for [24].
Now we consider the GMM with the density
(8) 
where is the number of mixture components, is the proportion of the th component, and are the mean and covariance for the th component. We use to denote the dominating point with regard to the th component and critical event set . An efficient accelerated distribution is then given by:
(9) 
Fig. 2 illustrates this scheme.
In practice, the critical event sets for automated vehicle evaluation are generally nonconvex. Therefore, it is hard to apply this approach directly.
IiiB Constructing Accelerated Distribution via Kernel Method
We propose a procedure to construct an accelerated distribution for GMM as illustrated in Fig. 3. The procedure utilizes the scheme in section IIIA and uses kernel method to transform the data on a higher dimensional space. The Classification is used to obtain the linear boundaries for the critical event set.
The following intuition motivates our approach: Suppose we have a feature function (see Appendix B) , where is the domain for and
. We assume that there is a subspace on the feature space that is a linear transformation of the original space. (For instance, the feature function for polynomial kernel
satisfies the assumption.) We further assume that on the feature space, the critical event set has a linear boundary and is a Gaussian mixture distribution that well fit the the model on the feature space. Thus, the scheme in section IIIA is directly applicable. Let be the accelerated distribution constructed on the feature space  we note that it is still a GMM. If the feature space contains the a linear transformation of the original space, we can easily obtain an accelerated distribution on the original space by linearly transforming the marginal distribution of . Since the marginal of a Gaussian distribution is still a Gaussian distribution [25], is still a Gaussian mixture distribution.To carry out the procedure described above, we need to find a feature function , the boundary and the model . The selection of depends on the shape of the critical event set, which is problem dependent. To obtain
, we use classification methods, e.g. support vector machine (SVM), to learn for the linear boundaries. The learned critical event set is represented by
.If is a Gaussian distribution, the true distribution on the feature space is generally not Gaussian. In the proposed procedure, we use a GMM to approximate the true distribution on the feature space.
IiiC A Sequential Procedure for the Accelerated Evaluation
To summarize the procedure of the proposed approach in section IIIB, we present the stepbystep description and elaborated discussion how to apply the approach in a sequential procedure.
Suppose we have a GMM for random vector and we are interested in . The procedure is as follows:

Construct a training set with samples, , where . We suggest to select ’s using space filling designs.

Select a feature function and transform the training set as . We suggest to use the feature function for polynomial kernel and use a small parameter ( or ).

Use a classification method with linear boundary on the training set and obtain the linear boundary . We use SVM (see Appendix A) in the experiments.

Generate samples from the model and transform the samples as .

Select as the number of mixture components. Use GMM with mixtures to fit the samples and obtain the approximation model . The discussion about the selection of refers to section IVC

Adjust the marginal distribution of and obtain .

Generate sample ’s from the distribution and use (6) to estimate the objective.
Note that if we use step 9, the procedure sequentially update the training set and the information of the critical event set. When the experiment is very timeconsuming and expensive, we suggest to start with small sample size and add step 9 in the procedure. In the experiments in section IV, we do not use step 9.
Iv Numerical Experiments
In this section, we present simulation results to illustrate the procedure of the proposed approach and to show the applicability of the approach on AV evaluation. In section IVA, we apply the proposed method on a simple problem to provide an intuition of the mechanism. Section IVB shows the performance of the method in an AV test case.
Iva Simple Example Problem
We use a simple example to discuss the implementation of the proposed method. The example is set to be a low dimensional problem for illustration purposes. Despite being simple, the problem setting maintains some similarity with AV test scenarios.
Assume we have a 2dimensional random vector , where . Here we have . Suppose the distribution model of is known and we have to be a multivariate Gaussian distribution with mean and identity covariance matrix . The critical event set is defined as . The indicator function shows whether a vector is in the critical event set. Our objective is to estimate the probability of the critical event, i.e. .
Fig. 4 shows 1,000 samples generated from the distribution model of . We could observe that distribution concentrates on the area with relatively small value on both axes (consider that the upper bound for both coordinates is 5).
Now, we start to explore the domain of the
and to learn the critical event set. We use 1,000 samples randomly generated using a uniform distribution on the domain and label these samples with regard to the critical events. Fig.
5 shows the samples and the return of the indicator function . Let us denote these samples as and the returns as. It is obvious that the two types of events cannot be linearly classified and the critical event set is obviously not convex. Therefore if we want to construct an accelerated distribution for estimating the objective probability, we cannot directly use the scheme mentioned in section
IIIA.We follow the procedure described in section IIIC to construct an accelerated distribution for the problem. Here we use a polynomial kernel which maps vectors in the original space to the feature space through the transformation . We then use linear SVM to classify the transformed samples with regard to the returns . A linear boundary on the feature space is obtained as . In Fig. 6, we plot this boundary on the original space. We should note that the classification captures the property of the critical set.
Next, we want to approximate the distribution model on the feature space. We generate extra 20,000 samples from the original distribution , and transform the new generated samples with . To investigate the effect of the accuracy of the approximation model, we use two cases with different number of mixture for comparison, i.e. , . We fit Gaussian mixture distribution with and components on the transformed samples respectively. For each component in the mixture distribution, we find the dominant point and construct a sampling distribution on the feature space. Finally, we use the marginal distribution on the original space as the accelerated distribution.
The simulation performance is shown in Fig. 7 and 8. In the figures, we use “AE ” to denote the proposed approach with components in the approximation mixture distribution and “Crude MC” to denote the crude Monte Carlo approach.
In Fig. 7, we could observe that the proposed approach attains stability within 2,000 samples. The crude Monte Carlo is obviously oscillating. This shows that the proposed approach provides better estimation using smaller sample size. This argument is further confirmed by Fig. 8.
In Fig. 8, we plot the 95% relative halfwidth, which is given by
(10) 
where
denotes the 0.95 quantile of the normal distribution,
denotes the number of samples used in the estimation,denotes the standard deviation of the sampled objective, and
denotes the estimation of the objective.The 95% relative halfwidth for the crude Monte Carlo is much larger than the proposed approaches. To reach the same level of 95% relative halfwidth as the proposed approaches, the crude Monte Carlo requires roughly 100 times more samples.
From the performance of this simple example, we shall note that the proposed approach provides good accelerated distributions. By comparing the performance with and , we conclude that a more accurate approximation model on the feature space leads to a more efficient accelerated distribution.
IvB Lane Change Scenario
The lane change model for AE is proposed by [17]. The setting is illustrated in Fig. 9, which shows that a leading human driving vehicle is conducting a lane change in front of an automated vehicle. Our objective is to evaluate the rate of safetycritical events in this scenario, where safetycritical events includes conflict, collision, injury, etc. In this example, we are interested in the rate of collision during the lane change procedure. We assume that the initial status of the two vehicles is captured by a vector , which consists of three important parameters: , the velocity of the leading vehicle, , the relative range of the two vehicles, and , the inverse of the range between the two vehicles. We assume that is stochastic and the distribution of , , is a known GMM. Given , the lane change procedure is supposed to be deterministic with regard to the testing vehicle. We use to denote the critical event set and our objective is represented by .
In this example, we use the vehicle control model shown in Fig. 9 as the testing vehicle. The model consists of Adaptive Cruise Control (ACC) and Autonomous Emergency Braking (AEB) [26]. (In the figure, is defined as . is a threshold that triggers the AEB system.)
We apply the proposed approach to obtain accelerated distributions. We construct a training set with samples by using a grid on the domain ( and are bounded; is unbounded, we sample between the minimum and maximum data in a naturalistic driving data). In this case, we use a polynomial kernel with feature function to map the model onto a higher dimensional space. Again, we use different value, and , for the number of mixtures of the approximation model in the feature space.
Fig. 10 presents the estimation results of the objective, . We observe that the estimation converges as we increase the number of samples. In this case, using different value for the number of mixtures for the approximation model does not result in significant difference in the resulting estimation.
To illustrate the efficiency of the proposed method, we will compare the performance of the proposed method with crude Monte Carlo. Since the probability we are estimating is very small, running crude Monte Carlo would require a huge computing time. Here we use the probability we estimated to approximate the number of samples required for the crude Monte Carlo. Using the formula for the standard deviation of the crude Monte Carlo estimation by
(11) 
we found that the crude Monte Carlo method requires roughly times more samples to reach the sample confidence level as the proposed approach.
Standard Deviation  Required Samples  

Proposed AE  
Crude MC 
Table I summarizes the comparison of the proposed AE with crude Monte Carlo. Column 2 shows the approximated standard deviation of the estimators. Column 3 presents the approximated number of the required samples to obtain an estimation with a 95% confidence relative halfwidth less than 0.4. Note that the proposed AE uses an extra of samples to learn the critical event set and construct the accelerated distribution. These samples are much smaller than the scale of the required samples for estimation.
IvC Discussion on Implementation
In the implementation of the proposed approach, there are some open choices in the procedure. We know from the experiment that the efficiency of the approach is largely influenced by these choices. Here, we discuss how they are related to the efficiency of the method.
To obtain an efficient accelerated distribution, we want to achieve two tasks: a) the classification must learn the property of the critical event set, at least a rough bound needs to be found; b) the approximation on the feature space needs to roughly capture the shape of the transformed distribution. We discuss these two tasks separately.
In the classification step, we need to construct a train set and choose a kernel function. Since the collection of train data requires test experiment, the size of the training set need to be balanced. In practice, to select a reasonable number requires prior knowledge of the test scenario. When such knowledge is not available, we suggest to start with a small size of data that includes some critical events and then follow the sequential procedure described in section IIIC. Given the number of data in the training set, we can use a deterministic design (e.g. use a grid) or a random design (e.g. uniformly sample), as long as samples fill the domain well.
The kernel functions to select need to have an explicit feature function. We suggest to use polynomial kernel. A feature space with higher dimension would achieve better accuracy for the critical event learning, however, to approximate the distribution model on the higher dimension is generally harder. For this reason, we want the dimension of the feature space to be as small as possible.
For approximating the distribution model on the feature space, we need to choose the number of mixture components. Although a larger number of components always provides a better fitting, the efficiency of the constructed accelerated distribution might not always improve. We suggest to use smaller number of components for less computing efforts. For high dimensional feature space, regularization should be applied in fitting the approximation model to make the fitting algorithm converge in fewer iterations. The selection of regularization parameter is suggested to choose through cross validation [27].
Appendix
Iva Classification Methods for Critical Event Set Learning
Here we review SVM as an example of classification methods that suits the proposed approach. Please refer to [27] for more details.
SVM is a popular algorithm for classification and regression. Here we briefly introduce the SVM with hard margin for binary classification. Denote the training dataset as , , where is the feature vector and is the label; suppose the data is linear separable. The SVM returns the linear classifier of the form:
where is the coefficient of the hyper plane in the feature space; is the bias; for ; for
. SVM wants to find the hyperplane to separate the points while maximizing the minimum distance between the hyperplane and the nearest points
from either side. This can be formulated with the constraints asA common solution to SVM is to use the Lagrange Multiplier, which can be found in [27].
Note that other classification methods, e.g. logistic regression, also provide a linear boundary and are compatible with kernel method. These methods also fits the proposed approach.
IvB Kernel Method for Critical Event Set Learning
The kernel is a generalized inner product, computing the inner product of two vectors in a higher dimensional space without explicitly define the feature function. Here we introduce the feature function, and more detail on kernel method can be found in [28] .The feature function is defined as , where is the domain for input , , and . A benefit of introducing feature space is nonseparable data can be separable in some feature spaces.
For instance, consider two sets on a 2D plane: and . There is no line can separate from since is surrounded by . But if we map the point on to by a feature function , the mapped sets are linearly separable. Denoting the original sets , in the feature space as and , then and are linearly separable by the plane in .
References
 [1] L. Wardle, “Selfdriving ubers in pittsburgh: What’s changed after 1 year on the roads?,” Sep 2017.
 [2] P. Newman, “Lyft to test autonomous cars in san francisco,” Sep 2017.
 [3] A. C. Madrigal, “Inside waymo’s secret world for training selfdriving cars,” Aug 2017.
 [4] S. Petters, “Ford tests communication signals for cavs.”
 [5] A. Toor and T. Warren, “Domino’s and ford will test selfdriving pizza delivery cars in michigan,” Aug 2017.
 [6] S. James, “U.s. dot releases new automated driving systems guidance,” Sep 2017.
 [7] D. Zhao, Y. Guo, and Y. J. Jia, “Trafficnet: An open naturalistic driving scenario library,” arXiv preprint arXiv:1708.01872, 2017.
 [8] V. L. Neale, T. A. Dingus, S. G. Klauer, J. Sudweeks, and M. Goodman, “An overview of the 100car naturalistic study and findings,” National Highway Traffic Safety Administration, Paper, no. 050400, 2005.
 [9] U. D. of Transportation, “Safety pilot model deployment data.”
 [10] V. Punzo, M. T. Borzacchiello, and B. Ciuffo, “On the assessment of vehicle trajectory data accuracy and application to the next generation simulation (ngsim) program data,” Transportation Research Part C: Emerging Technologies, vol. 19, no. 6, pp. 1243–1262, 2011.
 [11] C. Chai, Y. D. Wong, and X. Wang, “Safety evaluation of driver cognitive failures and driving errors on rightturn filtering movement at signalized road intersections based on fuzzy cellular automata (fca) model,” Accident Analysis and Prevention, vol. 104, pp. 156 – 164, 2017.
 [12] J. Kovaceva, A. Bálint, and H. Fagerlind, “Contributing factors to car crashes related to driver inattention for different age groups,” in International Conference on Driver Distraction and Inattention, 4th, 2015, Sydney, New South Wales, Australia, no. 15257, 2015.
 [13] T. A. Dingus, F. Guo, S. Lee, J. F. Antin, M. Perez, M. BuchananKing, and J. Hankey, “Driver crash risk factors and prevalence evaluation using naturalistic driving data,” Proceedings of the National Academy of Sciences, vol. 113, no. 10, pp. 2636–2641, 2016.
 [14] F. Bella, A. Calvi, and F. D’Amico, “Analysis of driver speeds under night driving conditions using a driving simulator,” Journal of safety research, vol. 49, pp. 45–e1, 2014.
 [15] X. Qu, Y. Kuang, E. Oh, and S. Jin, “Safety evaluation for expressways: a comparative study for macroscopic and microscopic indicators,” Traffic injury prevention, vol. 15, no. 1, pp. 89–93, 2014.
 [16] T. S. Facts, “Data (2014),” National Highway Traffic Safety Administration, 2014.
 [17] D. Zhao, H. Lam, H. Peng, S. Bao, D. J. LeBlanc, K. Nobukawa, and C. S. Pan, “Accelerated evaluation of automated vehicles safety in lanechange scenarios based on importance sampling techniques,” IEEE transactions on intelligent transportation systems, vol. 18, no. 3, pp. 595–607, 2017.
 [18] D. Zhao, X. Huang, H. Peng, H. Lam, and D. J. LeBlanc, “Accelerated evaluation of automated vehicles in carfollowing maneuvers,” IEEE Transactions on Intelligent Transportation Systems, 2017.
 [19] Z. Huang, D. Zhao, H. Lam, and D. J. LeBlanc, “Accelerated evaluation of automated vehicles using piecewise mixture models,” arXiv preprint arXiv:1701.08915, 2017.
 [20] Z. Huang, H. Lam, and D. Zhao, “An accelerated testing approach for automated vehicles with background traffic described by joint distributions,” arXiv preprint arXiv:1707.04896, 2017.
 [21] S. Asmussen and P. W. Glynn, Stochastic simulation: algorithms and analysis, vol. 57. Springer Science & Business Media, 2007.
 [22] S. Ross, Simulation. Academic Press, 2013.
 [23] J. S. Sadowsky et al., “On monte carlo estimation of large deviations probabilities,” The Annals of Applied Probability, vol. 6, no. 2, pp. 399–422, 1996.
 [24] J. S. Sadowsky and J. A. Bucklew, “On large deviations theory and asymptotically efficient monte carlo estimation,” IEEE transactions on Information Theory, vol. 36, no. 3, pp. 579–588, 1990.

[25]
A. Papoulis and S. U. Pillai,
Probability, random variables, and stochastic processes
. Tata McGrawHill Education, 2002.  [26] A. G. Ulsoy, H. Peng, and M. Çakmakci, Automotive control systems. Cambridge University Press, 2012.
 [27] C. M. Bishop, Pattern recognition and machine learning. springer, 2006.
 [28] K. P. Murphy, Machine learning: a probabilistic perspective. MIT press, 2012.