Adaptive Performance Assessment For Drivers Through Behavioral Advantage

04/23/2018 ∙ by Dicong Qiu, et al. ∙ Carnegie Mellon University

The potential positive impact of autonomous driving and driver assistance technologies has been a major impetus over the last decade. On the flip side, analyzing the performance of human drivers or autonomous driving agents quantitatively has remained a challenging problem. In this work, we propose a generic method that compares the performance of drivers or autonomous driving agents even when the environmental conditions differ, by using the driver behavioral advantage instead of absolute metrics, which efficiently removes the environmental factors. A concrete application of the method is also presented, where the performance of more than 100 truck drivers was evaluated and ranked in terms of fuel efficiency, covering more than 90,000 trips spanning an average of 300 miles in a variety of driving conditions and environments.




1 Introduction

Over the last decade, the potential of autonomous driving and driver assistance technologies has been recognized by both industry and academia [1]. The impact of automation in the trucking industry has often been characterized as superseding traditional human driving capabilities, especially in relation to the economic viability of deploying autonomous driving agents at scale [2].

While the future for these technologies seems promising, quantitative evaluation tools for measuring the associated economic impact are needed. A direct qualitative comparison often falls short because it accounts only for the end-to-end history of a given trip. Traditional evaluation strategies in a simulated or controlled environment are inefficient, not only for assessing the overall performance of a driver or driving agent in various environments [3, 4] but also for revealing the individual advantage in different driving conditions, which varies from one driver to another. Evaluating the operational efficiency of a driver or driving agent under the then-prevailing driving conditions (within or across trips), together with the uncertainty in how such evaluations correlate with controlled test results, calls for a fair ranking methodology that quantitatively determines one driver's advantage over another while accounting for the external operating conditions. While it is intractable to exhaustively account for every factor related to the driving conditions, we propose a functional approximation obtained by analyzing the operational outcome of driven trips.

In the example application of the proposed method to truck driver performance assessment, we analyzed more than 90,000 trips spanning an average of 300 miles per trip while ferrying weighted average payloads of approximately 43,000 pounds. In addition, the proposed regression model combines the dynamic impact of the terrain, the driving conditions (including comfort), and the impact of the driver's training and experience on the vehicle performance and the economy of the trip. Thanks to our sponsor's generosity, our non-linear regression model is a collective approximation of more than 200 freight drivers with varying levels of experience and expertise. Nonetheless, the proposed driver evaluation system is agnostic to the driver and instead relies on the vehicle's response to control, operating characteristics and several other external conditions. We support our predictions by comparing them against the economic characteristics of each trip that were not used for training.

While modelling the dynamics involved in a driver's training and experience is highly susceptible to failure and stagnation, we believe that the adaptable function approximation capability of non-linear regression models is an advantage. Furthermore, we contend that the long history of hauling and shipping operations and their end-to-end historical documentation make this study an attractive economic predictor for ground transportation going forward. In view of the generally persistent safety of autonomous and assisted driving technologies, or their expected maturity over the next 5 to 10 years, we analyze freight shipment activities in this work not only to propose a metric for assisting the road freight industry but also as an evaluation for advanced and economically competitive autonomous driving systems.

2 Feature Extraction and Normalization

To better assess the performance of a driver and compare it with the performance of other drivers, it is necessary to analyze multiple trips by each driver. Different levels of information can be extracted from a trip to represent its characteristics. The question is which granularity, or level of abstraction, of informative features is necessary and sufficient to represent the trip. In our approach, we utilize the generally available abstract information or characteristics that collectively describe the economy of a trip. These summaries, which collectively represent the trip, consist of four parts: the identifier information, the environmental (objective) characteristics, the driver behavioral (subjective) characteristics and the driver performance evaluation. More specifically,

  • identifier information includes identifiers that distinguish the trip, the driver and the vehicle driven, along with information that is not used for data analytics;

  • environmental characteristics summarize the driving conditions of the trip that are not controllable by the driver or the driving agent, including average loading, vehicle characteristics, terrain features, etc.;

  • driver behavioral characteristics describe the driver-controllable factors of the trip, such as the number of times the engine runs over an rpm threshold, the accumulated time the vehicle runs over a higher speed limit, etc.;

  • driver performance on a trip is evaluated by quantifiable measurements that matter in the assessment, such as the total fuel consumed on the trip, the accumulated driving time of the trip, miles per gallon over the entire trip, etc.

Data normalization is then applied to avoid the effect of scale and differing units, which facilitates the training of the models in the following steps. The data are centered and brought to the same scale by multiplying each dimension by its inverse standard deviation. It is assumed that each dimension of the data is uncorrelated with the other dimensions. The $j$-th dimension of a data sample after normalization becomes

$$\hat{x}_i^{(j)} = \frac{x_i^{(j)} - \mu^{(j)}}{\sqrt{\Sigma_{jj}}},$$

where $\mu$ and $\Sigma$ represent the mean and covariance of the data set, respectively, $x_i$ is the $i$-th data sample, $X$ represents the stacked data samples of the entire data set with $N$ samples in total, $x_i^{(j)}$ is the $j$-th dimension of a data sample and $\Sigma_{jk}$ is the element at the $j$-th row and $k$-th column in the covariance matrix $\Sigma$.
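The per-dimension normalization described above can be sketched in a few lines. This is a minimal illustration using only the Python standard library; the function name `normalize`, the toy trip values, the choice of the population (1/N) variance, and the handling of constant dimensions are assumptions made for the example, not details taken from the paper.

```python
import math

def normalize(samples):
    """Center each dimension and divide it by its standard deviation."""
    n = len(samples)
    dims = len(samples[0])
    means = [sum(row[j] for row in samples) / n for j in range(dims)]
    # Diagonal entries of the covariance matrix (assumed population form, 1/N).
    variances = [sum((row[j] - means[j]) ** 2 for row in samples) / n
                 for j in range(dims)]
    # Assumption: a constant dimension (zero variance) is only centered.
    stds = [math.sqrt(v) if v > 0 else 1.0 for v in variances]
    return [[(row[j] - means[j]) / stds[j] for j in range(dims)]
            for row in samples]

# Hypothetical trips: (trip length in miles, payload in pounds).
trips = [[300.0, 43000.0], [280.0, 41000.0], [320.0, 45000.0]]
normalized = normalize(trips)
```

After this step every dimension has zero mean and unit variance, so features measured in miles and in pounds contribute on a comparable scale.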

3 General Performance Assessment

The primary challenge in assessing driver or driving agent performance lies in the fact that evaluations under strictly identical environments are hardly accessible, and it is infeasible to evaluate drivers or driving agents in all possible environments in which they may drive in order to assess their general performance across different driving conditions.

3.1 Baseline Model for Environmental Factors

In order to remove the influence of the environmental factors from evaluations, we propose a baseline model for environmental factors, which is a concept borrowed from reinforcement learning and has been used to remove state bias efficiently [5]. In reinforcement learning problems, evaluating the effect of taking an action $a$ at a state $s$ is equivalent to solving for the state-action value function $Q(s, a)$. To properly normalize the effect of taking an action, one is more interested in the relative advantage brought by an action than in the absolute value brought by it, which leads to the action advantage function

$$A(s, a) = Q(s, a) - b(s), \qquad b(s) = V(s),$$

where $b(s)$ is the baseline function, $V(s)$ is the state value function that evaluates the absolute value of being at a state $s$ and is chosen to be the baseline, and the action advantage function $A(s, a)$ measures the relative advantage of taking an action $a$ at state $s$.

In analogy to the aforementioned method [6], which dates back to 1993, our approach resembles a one-step Markov Decision Process (MDP) by considering a driver or driving agent selection problem, where the environmental conditions under which a vehicle is driven are formulated as a state $s$, the behavioral characteristics of the driver or the driving agent in that trip are the action $a$, and the performance $v$, that is, the value brought by the trip, is the value in the MDP context. The state set $S$ contains all possible environmental factors and the action set $A$ contains all possible behavioral characteristics of drivers or driving agents. The difference is that the value becomes a multi-dimensional performance vector, and that the value $Q(s, a)$ incorporating both the environmental factors and the driver behavioral characteristics has been given.

Since there is no indication that the environmental factors are linearly related to the performance evaluation, it is necessary to use a non-linear function approximator to approximate the influence of the environmental factors, playing the role of a conditional averager that characterizes the general performance over drivers or driving agents conditioned on specific environmental factors.

In our approach, the baseline function is approximated by a neural network $b_\theta(s)$ parameterized with $\theta$, which consists of fully connected ReLU layers followed by a linear output layer. The input to the neural network is the environmental characteristics $s$ and the output is the performance evaluation $v$. Unlike in usual applications of neural networks, the caveat of avoiding overfitting does not hold in this context: because the neural network is used as a conditional averager that characterizes the general performance over all drivers or driving agents, overfitting is actually a beneficial property. $b_\theta$ can also be replaced with a more advanced neural network architecture [7] to model noise and uncertainty while averaging the environmental effect.
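To make the "conditional averager" idea concrete: a model that sees only the environment and is fitted to minimize squared error can, at best, predict the mean performance observed for each distinct environment. The sketch below shows this with a lookup table instead of a neural network; it is a deliberately simplified stand-in, and the function name, the discrete environment labels and the mpg values are all hypothetical.

```python
from collections import defaultdict

def fit_conditional_mean(envs, perfs):
    """Return the mean performance observed for each distinct environment.

    For inputs that contain the environment only, this table is the
    minimizer of the mean squared error on the training data.
    """
    totals = defaultdict(lambda: [0.0, 0])
    for env, perf in zip(envs, perfs):
        totals[env][0] += perf
        totals[env][1] += 1
    return {env: total / count for env, (total, count) in totals.items()}

# Hypothetical trips: a coarse environment label and the trip's total_mpg.
envs = ["flat", "flat", "hilly", "hilly", "hilly"]
perfs = [7.0, 8.0, 5.0, 6.0, 5.5]
baseline = fit_conditional_mean(envs, perfs)
```

Here `baseline["flat"]` is 7.5 and `baseline["hilly"]` is 5.5. Fitting the training data as tightly as possible therefore yields the environment-conditioned average over drivers, which is exactly the quantity the baseline is meant to capture.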

3.2 Removing Environment Bias by Baseline

With the baseline model $b_\theta$ summarizing the general performance over all drivers or driving agents conditioned on environmental characteristics, the advantage of a driver in a particular trip can be derived as

$$A(s_i, a_i) = v_i - b_\theta(s_i),$$

where for the $i$-th trip, $s_i$ is the environmental characteristics, $a_i$ is the driver behavioral characteristics, $v_i$ is the performance evaluation, and $A(s_i, a_i)$ is the advantage of the driver or driving agent with behavioral characteristics $a_i$.

A naive version of driver performance assessment can be derived from the performance advantage. Under the assumption that each driver or driving agent has driven a sufficient number of trips that cover most of the driving conditions (environmental characteristics), the expected unbiased performance of the $k$-th driver can be approximated by the empirical average unbiased performance

$$\mathbb{E}_{i \in I_k}\left[A(s_i, a_i)\right] \approx \frac{1}{|I_k|} \sum_{i \in I_k} \left(v_i - b_\theta(s_i)\right),$$

where $I_k$ is the set of all data sample entry indices related to the $k$-th driver, and the absolute value of the trip $v_i$ and the environmental characteristics $s_i$ have been given. Applying the above approximation to all drivers or driving agents leads to an unbiased evaluation of them, which is not affected by the environmental factors. The differences among the expected unbiased performance of the drivers determine their general rankings, which reflect the general performance assessment of these drivers or driving agents.
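The averaging and ranking step can be sketched as follows. This is an illustrative example under stated assumptions: the baseline is passed in as any callable that maps an environment to a predicted performance (a lookup table stands in for the trained network), and the driver identifiers, environment labels and mpg values are hypothetical.

```python
from collections import defaultdict
from statistics import mean

def rank_drivers(trips, baseline):
    """Rank drivers by their average advantage over the baseline.

    trips: iterable of (driver_id, environment, performance) tuples.
    baseline: callable returning the baseline performance for an environment.
    """
    advantages = defaultdict(list)
    for driver_id, env, perf in trips:
        # Advantage of one trip: observed performance minus the baseline.
        advantages[driver_id].append(perf - baseline(env))
    averaged = {driver: mean(values) for driver, values in advantages.items()}
    # Higher average advantage means better general performance.
    return sorted(averaged.items(), key=lambda item: item[1], reverse=True)

# Hypothetical baseline predictions per environment (stand-in for the network).
baseline_table = {"flat": 7.5, "hilly": 5.5}
trips = [
    ("d1", "flat", 8.0), ("d1", "hilly", 6.0),
    ("d2", "flat", 7.0), ("d2", "hilly", 5.0),
    ("d3", "hilly", 5.5),
]
ranking = rank_drivers(trips, baseline_table.get)
```

In this toy data driver `d3` only drove hilly trips with a raw 5.5 mpg, lower than any flat trip of `d2`, yet ranks above `d2` once the environment is accounted for.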

4 Incorporating Driver Behavioral Characteristics

One step beyond assessing the general performance of drivers or driving agents is to address the optimal driver placement problem, which asks which driver or driving agent is most suitable for a trip in a given kind of environment (driving conditions). In order to achieve this goal, it is necessary to incorporate driver behavioral characteristics.

4.1 Behavioral Characteristics Model for Drivers

Similar to the formulation of the baseline model proposed in section 3.1, a behavioral characteristics (behavior) model $Q_\phi(s, a)$ parameterized by $\phi$ is proposed, which also serves to estimate the performance evaluation. But unlike the baseline model, which only takes environmental factors into account, the behavior model considers the effect of both the environmental factors and the behavioral characteristics of the driver or driving agent. The behavior model approximates the state-action value function,

$$Q_\phi(s, a) \approx Q(s, a),$$

where $Q(s, a)$ is the state-action value function and $Q_\phi(s, a)$ is the parameterized model. The purpose of this model is to abstract the effect on the performance evaluation introduced by certain behaviors characterized by their behavioral characteristics $a$, conditioned on specific environments characterized by the environmental characteristics $s$.

In our approach, the behavior model $Q_\phi$ is a parameterized neural network consisting of fully connected ReLU layers followed by a linear output layer. The input to the neural network is the environmental characteristics $s$ and the behavioral characteristics $a$, and the output is the performance evaluation $v$ incorporating both the environmental and driver behavioral factors. As with the baseline model, overfitting is a beneficial property here. A more advanced neural network architecture [7] can also be adopted in place of the neural network mentioned above so as to model noise and uncertainty as well.

4.2 Direct Behavioral Advantage

With the baseline model $b_\theta$ from section 3.1 and the behavioral characteristics model $Q_\phi$ from section 4.1 properly constructed and trained, they can jointly approximate the driver behavioral advantage conditioned on specific environmental characteristics. For the sake of simplicity, the combination of the baseline model and the behavior model is represented by a joint direct behavioral advantage model $A_\psi(s, a) = Q_\phi(s, a) - b_\theta(s)$ with parameter $\psi = (\theta, \phi)$.

The architecture of the direct behavioral advantage model in our approach is a joint neural network consisting of the baseline model network and the behavior model network as shown below in figure 1. The architectural details of the two sub networks follow the design in sections 3.1 and 4.1.

Figure 1: architecture of the direct behavioral advantage network $A_\psi$, where both the baseline model sub network $b_\theta$ and the behavior model sub network $Q_\phi$ consist of fully connected ReLU layers followed by a linear output layer, the environmental characteristics input $s$ goes into both sub networks $b_\theta$ and $Q_\phi$, the driver behavioral characteristics input $a$ goes only into the behavior model sub network $Q_\phi$, and the baseline performance $b_\theta(s)$ is then subtracted from the state-action performance $Q_\phi(s, a)$ to produce the behavioral advantage $A_\psi(s, a)$.
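The data flow of the joint network can be sketched as a forward pass. The example below is a minimal pure-Python illustration with randomly initialized, untrained weights; the layer sizes, the helper names `init_mlp`, `mlp_forward` and `advantage`, and the input values are assumptions chosen for the example and are not taken from the paper.

```python
import random

def init_mlp(sizes, rng):
    """Create [(weights, biases), ...] for consecutive layer sizes."""
    return [([[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
              for _ in range(n_out)], [0.0] * n_out)
            for n_in, n_out in zip(sizes[:-1], sizes[1:])]

def mlp_forward(layers, x):
    """Fully connected ReLU layers followed by a linear output layer."""
    for index, (weights, biases) in enumerate(layers):
        x = [sum(w * xi for w, xi in zip(row, x)) + b
             for row, b in zip(weights, biases)]
        if index < len(layers) - 1:  # no activation on the output layer
            x = [max(0.0, value) for value in x]
    return x

def advantage(baseline_net, behavior_net, s, a):
    """Behavior-model output minus baseline output.

    The environment s feeds both sub networks; the behavior a feeds
    only the behavior model, concatenated with s.
    """
    q = mlp_forward(behavior_net, s + a)
    b = mlp_forward(baseline_net, s)
    return [qi - bi for qi, bi in zip(q, b)]

rng = random.Random(0)
baseline_net = init_mlp([3, 8, 1], rng)      # input: environment only
behavior_net = init_mlp([3 + 2, 8, 1], rng)  # input: environment and behavior
s, a = [0.4, -0.3, 1.0], [0.6, 0.3]
adv = advantage(baseline_net, behavior_net, s, a)
```

Because the two sub networks share no weights in this sketch, they can be trained separately and only combined at evaluation time.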

4.3 Optimal Driver Placement

Beyond assessing the general performance of drivers across all kinds of environments, it is more efficient to assign an appropriate driver or driving agent to the trips whose environmental characteristics fit that driver or driving agent. The problem of optimal driver placement is to find the driver or driving agent that generates the most value or drives most efficiently in the trip. Formally, the optimal placement is the driver behavioral characteristics $a^*$ that yields the highest value in a specific environment characterized by $s$,

$$a^* = \pi^*(s) = \arg\max_a Q(s, a) = \arg\max_a \left[Q(s, a) - V(s)\right] = \arg\max_a A(s, a),$$

where $\pi^*$ is the optimal policy that chooses the most suitable driver behavioral characteristics $a^*$ conditioned on environmental characteristics $s$. The second equality holds because $V(s)$ does not depend on $a$. Note that in this context the outputs of the state-action value function $Q(s, a)$, the state value function $V(s)$ and the advantage function $A(s, a)$ are one-dimensional, restricted to the dimension of the performance evaluation that matters the most. From the above derivation, the optimal driver placement can in practice be formulated as an optimization problem of the driver behavioral characteristics $a$ over the direct behavioral advantage network $A_\psi$.

The search strategy for the optimum can be gradient descent with multiple starts, or a gradient-free method such as CMA-ES [8], which also has the potential for global optimization [9, 10, 11, 12]. Although the parameterized direct behavioral advantage network $A_\psi$ is an approximation to the true advantage function, under the assumption of smooth continuity, the driver behavioral characteristics optimized over $A_\psi$ should approximate $a^*$.
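As an illustration of the first strategy, gradient-based search with multiple starts, the sketch below maximizes a toy function by finite-difference gradient ascent from several random initial points. It is not CMA-ES, and the quadratic `toy_advantage` merely stands in for a trained advantage network with the environment held fixed; the function names, step size, search range and number of starts are assumptions for the example.

```python
import random

def numerical_gradient(f, a, eps=1e-4):
    """Central finite-difference estimate of the gradient of f at a."""
    grad = []
    for j in range(len(a)):
        up = a[:j] + [a[j] + eps] + a[j + 1:]
        down = a[:j] + [a[j] - eps] + a[j + 1:]
        grad.append((f(up) - f(down)) / (2 * eps))
    return grad

def multi_start_ascent(f, dim, starts=5, steps=200, lr=0.1, seed=0):
    """Run gradient ascent from several random starts; keep the best result."""
    rng = random.Random(seed)
    best_a, best_value = None, float("-inf")
    for _ in range(starts):
        a = [rng.uniform(-3.0, 3.0) for _ in range(dim)]
        for _ in range(steps):
            grad = numerical_gradient(f, a)
            a = [ai + lr * gi for ai, gi in zip(a, grad)]
        value = f(a)
        if value > best_value:
            best_a, best_value = a, value
    return best_a, best_value

def toy_advantage(a):
    """Hypothetical advantage surface with its maximum at a = (1, -2)."""
    return -((a[0] - 1.0) ** 2) - (a[1] + 2.0) ** 2

best_a, best_value = multi_start_ascent(toy_advantage, dim=2)
```

On a surface with several local maxima the individual starts may converge to different points, which is why the best value across all starts is kept.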

5 Experiments

The presented methods to assess the general performance of drivers or driving agents and solve the optimal driver placement problem are applied to a truck driving data set of freight trips. A baseline model as proposed in section 3.1 was trained in a supervised manner for epochs with the environmental characteristics data as input and the performance evaluation data as output. Similarly, a behavior model mentioned in section 4.1 was also trained for epochs with both the environmental characteristics data and the driver behavioral characteristics data jointly as input and the performance evaluation data as output. The training processes of the two aforementioned parameterized models are shown in figure 2. After the training, the baseline model and the behavior model reached a mean squared error (MSE) loss of and , respectively.

(a) learning curve of baseline network
(b) learning curve of behavior network
Figure 2: the training processes (learning curves) in training epochs of the baseline model network described in section 3.1 and the behavior model network described in section 4.1, after which they reached an MSE loss of and , respectively.

The general performance of the truck drivers in the freight data set is assessed by removing the environment bias and then averaging the unbiased performance using the method proposed in section 3.2. The metric miles per gallon over the entire trip (total_mpg) is used for the performance evaluation. The results of the truck driver overall performance evaluation and rankings are listed in Appendix A, where the truck driver ID and the averaged unbiased total_mpg, along with the standard deviation in brackets, are presented.

Figure 3: an example illustration of the effect of different dimensions of the driver behavioral characteristics on the performance evaluation, with the environmental characteristics fixed as presented in Appendix B. The joint effect of two dimensions, the time that the engine runs over the speed threshold (overspeedtime) and the maximum speed when the engine runs over the threshold (overspeedmax), is visualized with the other dimensions fixed as presented in Appendix B, and miles per gallon over the entire trip (total_mpg) is selected as the performance evaluation metric.

Given the trained baseline model $b_\theta$ and the behavior model $Q_\phi$, the direct behavioral advantage model (network) $A_\psi$ was constructed as discussed in section 4.2. To illustrate the effect on the outcome value of a trip driven by different drivers or driving agents, characterized by varying driver behavioral characteristics $a$, in a given environment characterized by its environmental characteristics $s$, an example illustration is provided in figure 3. In this example, the given environmental characteristics $s$ are presented in Appendix B, and the values of the driver behavioral characteristics $a$ other than overspeedtime and overspeedmax are predefined for simplification. The metric total_mpg represents the performance evaluation.

In our approach, a gradient-free method, CMA-ES [8], is used to optimize the conditional driver behavioral characteristics $a$, in order to find the optimal driver behavioral characteristics $a^*$ constrained to a given trip environment $s$. The driver whose averaged behavioral characteristics have the minimum Euclidean distance to the optimal driver behavioral characteristics $a^*$ is chosen to drive the trip. Formally, the available driver chosen to drive the trip is the driver with identifier

$$k^* = \arg\min_k \left\lVert \bar{a}_k - a^* \right\rVert_2.$$

The averaged behavioral characteristics of driver $k$ are defined as

$$\bar{a}_k = \frac{1}{|I_k|} \sum_{i \in I_k} a_i,$$

where $I_k$ is the set of all data sample entry indices related to the $k$-th driver.
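The final selection step, picking the driver whose averaged behavior is closest to the optimized behavior, can be sketched as below. The driver identifiers, the two-dimensional behavior vectors and the optimal point are hypothetical values for illustration; `math.dist` requires Python 3.8 or later.

```python
import math

def average_behavior(behaviors):
    """Element-wise mean of a driver's behavior vectors over all trips."""
    n = len(behaviors)
    return [sum(column) / n for column in zip(*behaviors)]

def choose_driver(driver_trips, optimal):
    """Return the driver id whose averaged behavior is nearest to `optimal`.

    driver_trips: dict mapping driver id to a list of behavior vectors.
    optimal: the optimized behavior vector for the trip environment.
    """
    def distance(driver_id):
        return math.dist(average_behavior(driver_trips[driver_id]), optimal)
    return min(driver_trips, key=distance)

# Hypothetical normalized behavior vectors per driver.
driver_trips = {
    "d1": [[0.0, 0.0], [2.0, 2.0]],  # averages to [1.0, 1.0]
    "d2": [[3.0, 3.0]],              # averages to [3.0, 3.0]
}
chosen = choose_driver(driver_trips, optimal=[2.8, 3.1])
```

Here `chosen` is `"d2"`, since its averaged behavior lies closer to the optimal point than that of `d1`.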

6 Conclusion

The primary contributions of this work are a generic method for assessing the general performance of a driver or driving agent, proposed in section 3, and a method that incorporates driver behavioral characteristics, proposed in section 4, to approach the optimal driver placement problem. These two methods are applied to a truck freight data set with more than 90,000 trip records, both to fairly analyze the performance of truck drivers in terms of the overall fuel efficiency of their trips while accounting for the differing environments in which the trip data were collected, and to find the truck driver most suitable for a new trip based on past records.

The source code for this work is available to the public.


This work was made possible through the support of AutonLab of Carnegie Mellon University, who held the Hackauton 2018 Machine Learning Hackathon, and of the sponsor of the problem and the dataset.


  • [1] Brian Paden, Michal Čáp, Sze Zheng Yong, Dmitry Yershov, and Emilio Frazzoli. A survey of motion planning and control techniques for self-driving urban vehicles. IEEE Transactions on Intelligent Vehicles, 1(1):33–55, 2016.
  • [2] Bob Costello. Truck driver shortage analysis. American Trucking Associations, 2017.
  • [3] Myeonggi Jeong, Manabu Tashiro, Laxsmi N Singh, Keiichiro Yamaguchi, Etsuo Horikawa, Masayasu Miyake, Shouichi Watanuki, Ren Iwata, Hiroshi Fukuda, Yasuo Takahashi, et al. Functional brain mapping of actual car-driving using [18F]FDG-PET. Annals of Nuclear Medicine, 20(9):623–628, 2006.
  • [4] Otto Lappi. The racer’s brain–how domain expertise is reflected in the neural substrates of driving. Frontiers in human neuroscience, 9:635, 2015.
  • [5] Ziyu Wang, Tom Schaul, Matteo Hessel, Hado Van Hasselt, Marc Lanctot, and Nando De Freitas. Dueling network architectures for deep reinforcement learning. arXiv preprint arXiv:1511.06581, 2015.
  • [6] Leemon C Baird III. Advantage updating. Technical report, WRIGHT LAB WRIGHT-PATTERSON AFB OH, 1993.
  • [7] Akihiko Yamaguchi and Christopher G Atkeson. Neural networks and differential dynamic programming for reinforcement learning problems. In Robotics and Automation (ICRA), 2016 IEEE International Conference on, pages 5434–5441. IEEE, 2016.
  • [8] Nikolaus Hansen and Andreas Ostermeier. Completely derandomized self-adaptation in evolution strategies. Evolutionary computation, 9(2):159–195, 2001.
  • [9] Nikolaus Hansen and Stefan Kern. Evaluating the CMA evolution strategy on multimodal test functions. In International Conference on Parallel Problem Solving from Nature, pages 282–291. Springer, 2004.
  • [10] Anne Auger and Nikolaus Hansen. A restart CMA evolution strategy with increasing population size. In Evolutionary Computation, 2005. The 2005 IEEE Congress on, volume 2, pages 1769–1776. IEEE, 2005.
  • [11] Anne Auger and Nikolaus Hansen. Performance evaluation of an advanced local search evolutionary algorithm. In Evolutionary Computation, 2005. The 2005 IEEE Congress on, volume 2, pages 1777–1784. IEEE, 2005.
  • [12] Nikolaus Hansen. Benchmarking a bi-population CMA-ES on the BBOB-2009 function testbed. In Proceedings of the 11th Annual Conference Companion on Genetic and Evolutionary Computation Conference: Late Breaking Papers, pages 2389–2396. ACM, 2009.

Appendix A General Performance Ranking Results of the Truck Drivers

Listed below are the ranking results of the general performance of the truck drivers, generated by subtracting the baseline performance from the raw performance evaluation, with miles per gallon over the entire trip as the metric, and averaging the driver behavioral advantage using the method described in section 3.2. The higher the advantage is, the better the driver performed in general. The estimation error is presented in brackets.


Appendix B Test Conditions for Optimal Driver Placement

s = [ 0.43287603 1.16673833 0. 0. 1. 0. -0.28311216 -2.35413651 1.41650827 1.4164645 2.1548913 1.1848586 2.49437459 1.60718365 1.20496714 1.20496755 0.81056784 2.2438689 0.81056832 2.2438548 1.35917538 0.88847887 0.62507642 1.07587502 0.86936209 0.66701179 -0.52474082 3.04497366 0.01570298]

a0 = [ 0.59222113 0.31818032 0.4902029 x -0.32864671 y -0.23821433 -0.31306424 0.28651447 -0.06029843 -0.07660633 -0.61115171 0.44149855 1.83023875 0. 0. 0. 0. 1. 1. 0. 0. 0. 1. 0. 0. 0. 0. 0. 1. 0. 0. 0. 1. 0. 0. 1. 0. 0. 0. 0. 0. 1. 0. 0. 0. 1. 0. 0. 1. 0. 0. -0.5263721 -0.60509878 0.43784125 0.41665665 0.50981417 0.50831901 0.6149238 0.62443187 0.44167023 0.42748259]