Log In Sign Up

Multimodal Driver State Modeling through Unsupervised Learning

Naturalistic driving data (NDD) can help understand drivers' reactions to each driving scenario and provide personalized context to driving behavior. However, NDD requires a high amount of manual labor to label certain driver's state and behavioral patterns. Unsupervised analysis of NDD can be used to automatically detect different patterns from the driver and vehicle data. In this paper, we propose a methodology to understand changes in driver's physiological responses within different driving patterns. Our methodology first decomposes a driving scenario by using a Bayesian Change Point detection model. We then apply the Latent Dirichlet Allocation method on both driver state and behavior data to detect patterns. We present two case studies in which vehicles were equipped to collect exterior, interior, and driver behavioral data. Four patterns of driving behaviors (i.e., harsh brake, normal brake, curved driving, and highway driving), as well as two patterns of driver's heart rate (HR) (i.e., normal vs. abnormal high HR), and gaze entropy (i.e., low versus high), were detected in these two case studies. The findings of these case studies indicated that among our participants, the drivers' HR had a higher fraction of abnormal patterns during harsh brakes, accelerating and curved driving. Additionally, free-flow driving with close to zero accelerations on the highway was accompanied by more fraction of normal HR as well as a lower gaze entropy pattern. With the proposed methodology we can better understand variations in driver's psychophysiological states within different driving scenarios. The findings of this work, has the potential to guide future autonomous vehicles to take actions that are fit to each specific driver.


page 1

page 5

page 10

page 11

page 13


TripMD: Driving patterns investigation via Motif Analysis

Processing driving data and investigating driving behavior has been rece...

Dynamics of Driver's Gaze: Explorations in Behavior Modeling & Maneuver Prediction

The study and modeling of driver's gaze dynamics is important because, i...

The Effect of Whole-Body Haptic Feedback on Driver's Perception in Negotiating a Curve

It remains uncertain regarding the safety of driving in autonomous vehic...

Breaching the privacy of connected vehicles network

Connected Vehicles network is designed to provide a secure and private m...

Statistical Characteristics of Driver Accelerating Behavior and Its Probability Model

The naturalistic driving data are employed to study the accelerating beh...

Sugar and Stops in Drivers with Insulin-Dependent Type 1 Diabetes

Diabetes is a major public health challenge worldwide. Abnormal physiolo...

Driver State and Behavior Detection Through Smart Wearables

Integrating driver, in-cabin, and outside environment's contextual cues ...

I Introduction

People have different preferences in their choice for driving styles [41]. For instance, a recent study mentioned individual variability across participants when choosing between conservative and risky driving styles [41]. Additionally, different studies have shown that personalization of autonomy has a positive effect on the user’s perceived trust in autonomous vehicles (AV) [37, 53, 65, 62]. This implies that AV’s acceptance relies on correctly understanding people’s preferences in driving styles and act accordingly. One method to identify individual responses to different driving behaviors is using historical driving data. For instance, driver-related physiological data (e.g., facial expressions, heart rate (HR), and skin temperature) can show how a driver reacts within different driving styles [63, 23, 38].

Analysis of historical driving data, which is often retrieved through naturalistic driving studies (NDS), also comes with problems such as the difficulty (time and cost) in analyzing the massive amount of collected information [29]. NDS is often conducted in a longitudinal fashion to help detect behaviors while different environmental noise and challenges exist in the data [58], which drastically increases the amount of data that needs to be analyzed for detecting specific driver behaviors, actions, and responses. One method to address this issue is to apply unsupervised learning on both driving behaviors as well as the driver’s state. Through unsupervised learning, different patterns in driver’s state and behaviors can be detected [29]. These patterns, when analyzed together, may reveal important information regarding the driver’s responses in each driving behavior.

Previous NDS data often lack information related to driver’s physiological responses and cognitive metrics [11, 58]. With the current advancements in wearable technology, it is now viable to detect driver’s psychophysiological states through monitoring their heart rate (HR), skin temperature, skin conductance, arm movement, and other physiological metrics. In line with these improvements, recent studies have used driver’s physiological responses such as changes in HR to detect driver’s stress levels correlated with the increase in HR values [58, 64]

. Additionally, with the improvements in computer vision applications, it is now viable to use raw videos for estimating driver’s breathing patterns, HR, gaze directions, and emotional features through facial landmarks

[42]. As a result of these improvements in technology, recent NDS are starting to implement novel sensors and technologies to collect such data [18, 64, 44, 58, 20]

. Utilizing these data streams and by applying unsupervised machine learning techniques, we can understand a driver’s response within each driving behavior.

In this paper, we propose a methodology to classify driver’s reactions to different vehicular maneuvers using unsupervised learning in a fully naturalistic environment. We use a case study to analyze different driving patterns (e.g., harsh braking) coupled with driver state patterns (e.g., high stress level) extracted from a naturalistic driving dataset in an unsupervised fashion. Using the detected unsupervised patterns and by having access to multimodal human sensing data, we then analyze the driver’s reactions under each driving pattern. Specifically, we first decompose naturalistic driving scenario data into smaller segments using Bayesian Change Point (BCP) detection methods. We specifically use BCP for its ability to detect changes in the distribution of multimodal data resulting from vehicle kinematic sensors. Then, a combination of Gaussian Mixture Models (GMM) together with Latent Dirichlet Allocation analysis (LDA) is used to provide patterns of both driving behavior and driver state. We use the patterns in driver state to understand driver reactions with respect to specific driving behaviors. In order to show the applicability of our approach, we use two separate case studies. In the first one, we use high-frequency driving behavior (e.g., acceleration) and driver sensing data (i.e., HR and gaze) from one participant and apply the GMM-LDA method on both driving behavior and driver sensing data. Based on this model, we detect four driving patterns that are aligned with well-known driving behaviors of harsh braking, normal braking, highway free-flow driving, and road curvature driving. We also detect two driver state patterns of abnormal (high) HR versus baseline HR, as well as low versus high gaze entropy. Analyzing the fraction of each state pattern (i.e., HR and gaze pattern) within each driving behavior depicts that the driver of our study is calmer in a more conservative driving style with route selection that can include more segments of highway driving. We then extended our findings to a larger dataset with 12 participants as a second case study with low-resolution data and found similar results as case study I.

Ii Background Study

Previous studies have pointed out the utility of unsupervised learning methods in finding behavioral patterns in naturalistic driving data [29]. However, past studies have not focused on applying the same methods on driver physiological data (e.g., HR). Below we first elaborate on the relationship between driver sensing measures (i.e., HR and gaze patterns) and driver’s state (i.e., stress level, and workload). We then outline past progress in unsupervised modeling techniques for driver state and behavior detection.

Ii-a Driver State Detection

Human bio-signals have been used extensively in multiple fields such as psychology, health sciences, and engineering for retrieving a deep understanding of a human’s psychophysiological states. Psychophysiology refers to psychological states such as emotional responses (e.g., anger, frustration, and happiness), cognitive load, and distraction, which can be measured through changes in human physiological responses (e.g., HR, skin temperature, and skin conductance) [31]. In the driving research, psychophysiological measures such as driver’s HR [52, 55, 57], gaze patterns [3, 58], skin conductance [40], and brain signals [39] were all used for retrieving driver’s stress level, and cognitive load.

One popular definition of stress is the state of a human when the demand of a situation is perceived to be more than the available internal resources [19]. Different bio-signals can then be used to detect stressful instances. Human cardiovascular measures have been extensively used in literature for detecting stress levels. In naturalistic environments and mainly driving research, cardiovascular activity can be retrieved through using either of Electrocardiography (ECG) or photoplethysmogram (PPG) technologies. ECG measures heart electrical activity through the usage of contact electrodes. PPG, on the other hand, records the same activity through measuring blood volume in the vein using infrared technology [31, 58]. The cardiovascular measures can then be used to estimate features of the beat-to-beat signal of the heart such as HR itself, and root mean squared of successive intervals (RMSSD). Studies have demonstrated that higher stress levels are correlated with an increase in HR and a decrease in RMSSD [58, 24].

Driver’s cognitive workload is defined as cognitive resources that are taken from the driver by any activity other than the driving itself [15]. In this definition, the cognitive load consist of mind wandering and the load imposed on the driver by secondary tasks. Secondary tasks are the ones that require attention but are not required for the task of driving, such as working with a phone [15]. Human bio-signals such as driver’s gaze measures, cardiovascular measures, and brain signals have been used for cognitive load estimation in both controlled environments and in naturalistic settings [31]. Stationary and transition gaze entropy are two of the main eye gaze metrics that are commonly used for cognitive load estimation [27, 17, 50, 51]. In information theory, the uncertainty associated with a choice is referred to as the information entropy [51] in which the more the uncertainty, the more the entropy and the more randomness in the system. There are two ways to calculate the entropy in gaze analysis. The first is generally calculated through Shannon’s equation [46]. In gaze analysis, Shannon’s entropy shows the overall predictability of fixation locations in a sequence of gaze patterns, which is a measure of gaze dispersion [46] and is referred to as the Stationary Gaze Entropy (SGE). Specifically, for a set of fixation locations in a sequence of eye movements, fixation locations can be assigned to spatial bins of and calculate the SGE as:


Different studies used SGE to infer a human’s state in different conditions. For instance, SGE has been shown to be correlated with task difficulty, complexity, and experience with the task [51]. Additionally, studies have used SGE in driving research to infer driver’s state such as workload [51], drowsiness [50], and being under the influence of alcohol [49]. For instance, [50]

used driver’s eye SGE to predict lane change events in sleep-deprived drivers, where an increase in SGE was associated with a higher probability of lane change events. While SGE was shown to be correlated with different human states, a recent review suggests that inferences based on changes in SGE can be very task-specific. For instance, if we know a specific task requiers a higher SGE and we observe that the participant is having a lower SGE, this may imply the participant could be disengaged from the task

[51]. This highlights the importance of the second measure of gaze entropy, referred to as conditional entropy, which is task-independent.

Conditional entropy considers the dependency between back-to-back fixation points in a temporal fashion. This results in the Gaze Transition Entropy (GTE). GTE is a measure of predictability of the next fixation location given the current location. For a sequence of transitions between different spatial bins of and , with a probability of , the GTE is calculated as:


GTE was shown to be correlated with multiple aspects of the human’s state in both driving and non-driving research. In general, higher task demand, higher scene complexity, and higher levels of cognitive load were shown to increase the GTE [51, 17]. Additionally, higher levels of GTE when performing the same task with the same scene complexity can be associated with higher stress levels [51].

Ii-B Unsupervised Modeling of Driver’s state and Behavior

Previous studies provided significant insight into analyzing driver’s behaviors in naturalistic conditions through unsupervised methods. Bando et al. [6]

have used multimodal LDA to infer driving topics using a combination of image sequences, annotated tags and driving behavioral data. In their study, authors first used a double articulation analyzer (DAA) for driving data segmentation. Then by applying LDA on the multimodal data, they were able to achieve a dimensionality reduction of 5% on the raw data. This then led to an increase in classification accuracy achieved by a baseline that was trained using a support vector machine. In another study, the authors proposed a modification to DAA to not only perform segmentation but also to predict the duration of each segment


. Based on the assumption that driving data has a two-layer hierarchical structure, the authors proposed a double-layer articulation structure model which uses a hierarchical Dirichlet process hidden semi-Markov model to predict the duration of each segment. Li et al.

[30] used the density-based spatial clustering method to cluster physiological data of drivers into one of the normal, event, and noise clusters. Based on the data collected from three drivers and through an on-road controlled study, authors were able to achieve a recall rate of 75 percent in clustering the three categories.

Bender et al. [8] proposed an unsupervised method to provide high-level clusters for time series data streams from naturalistic driving behavior. In their study, the authors used a Bayesian multivariate linear model to segment the driving data. Then by using simplicial mixture models, they find unsupervised patterns associated with different driving behaviors such as acceleration and braking. This method was applied in an online fashion through an on-road controlled study. The data for the study was collected from a vehicle equipped with multiple telemetry sensors driving a 13 min route. Wang et al. [61]

proposed a method to build a library of human’s basic driving motion primitives that can be used for human-like actions. In their paper, the authors have used a probabilistic inference based on iterative expectation-maximization (EM) algorithm on driving data collected from one driver. Their probabilistic method achieves a more meaningful segmentation when compared with the classical EM-GMM approach by merging back-to-back clusters together, which is closer to the real-world driving situation. Lastly,


proposed a framework to automatically provide a description for driver’s behavioral data retrieved through telemetry sensors inside the vehicle. In their method, they applied Bayesian multivariate linear regression to segment the driving data. Through using three different clustering methods of Gaussian mixture model LDA (GMM-LDA), Gaussian Wishart LDA (GW-LDA), and Multimodal LDA (mLDA), they found out that GMM-LDA provides the most useful description generation for naturalistic driving behavior data.

Although previous studies have provided significant insights into the application of unsupervised learning in driver states and behavior detection, most of them did not couple the two together to understand the driver’s state in each behavioral pattern. This is extremely important as combining driver’s states with behavioral patterns can provide a deeper description for each driving segment. For instance, while the unsupervised category for a driving segment might be “high acceleration”, it is important to know whether the driver was carrying a high work load or not in that segment. For the future AV, such information might help with better prediction of driver’s state in each driving scenario. Additionally, most of the past studies in driver state detection are performed in experimental conditions either in a driving simulator or in an on-road controlled study where the conditions are different from a fully natural environment. Such shortcomings mostly existed due to not having access to naturalistic human sensing data. This is now partly achieved through novel wearable technology devices that are implemented day after day in driving studies (see [58, 64]. In this paper, by performing an unsupervised analysis on both driver state data collected through conventional wearable devices and driver’s behavioral data collected through vehicle’s kinematic data, we find driver’s state (i.e., stress levels and workload) in each driving pattern.

Iii Methodology

In this section the framework for retrieving state and behavior patterns is described. We then apply our framework to a fully naturalistic driving data collected through one of our previous studies [58]. We first perform a data exploration using our framework on a driver’s behaviors and states extracted through a 2 hour and 10 minutes long trip of a vehicle equipped with multiple sensors collecting contextual information including both driver and environmental sensing modules (i.e., case study I). We then apply our method to a larger pool of data collected from 12 participants (i.e., case study II). Below we first outline the methodology followed by the dataset details, the selected parameters.

In our framework, kinematic Sensors are the ones that record movements and acceleration in different directions such as an inertial measurement unit (IMU), which records acceleration and angular velocity in 3 different directions of X, Y, and Z. These information are used as driving behavior data such as vehicle’s lateral and forward acceleration. On the other hand, human sensing modules are sensors specifically used for human related data such as smart watches and in-cabin and outdoor facing cameras. These data are used to detect drivers state such as stress level and workload.

The goal of this framework is to apply unsupervised learning methods on the data from kinematic sensor readings and human sensing modules, to detect driving-related and driver state-related patterns automatically. These patterns are then compared with known driving behaviors (e.g., harsh acceleration) and driver states (e.g., high stress level) to provide descriptions for each detected pattern. We then statistically compare different driving behaviors based on the proportion of each driver state within them.

Our framework consist of six main sections. As a summary, through a formerly proposed NDS framework, multiple kinematic and human sensing data-streams are collected (Fig. 1 - A). Driver’s stress level and work load information are retrieved from human sensing data (Fig. 1 - B). Similarly, driving behavior data is used to retrieve driving segments (Fig. 1 - B). Then through an unsupervised learning method, a driving behavior and a driver state pattern is generated for each driving segment (Fig. 1 - C and D). Lastly, within each detected behavioral pattern we assess the likelihood occurrence of each state patterns (Fig. 1 - E).

Fig. 1: An overview of the framework of analysis consisting of six main sections. At first we retrieve features from driver’s state data that align with stress level and work load (A and B). Additionally, we perform segmentation on behavioral data that provides specific segments in driving (B). We then perform unsupervised analysis through GMM-LDA method to find different patterns in driver’s behavior and state. We first apply GMM to find word-like objects in the continuous inputs (C). We then apply the LDA on the sequence of word like objects in each segment to find different patterns in state and behavior (D). Lastly, we assess the presence of each state pattern within in driving behavioral pattern to find the more suitable driving behavior pattern to the driver.

Iii-a Details of Case Study Datasets

In our study we used two different datasets with different sensors and different resolutions to test the method. For case study I, we use the data collected from a 120 mile driving scenario between cities of Charlottesville and Washington DC. This data is available through an online repository [60]. Fig. 2 shows the map of the trip. Based on the framework suggested in our previous work [58], the vehicle has a dash camera which collects both in-cabin and outside environment as well as vehicle’s speed.

The videos collected from the in-cabin camera is fed into the OpenFace software [5] to retrieve driver gaze patterns at 30 frame per second resolution. OpenFace can perform facial landmark detection and gaze direction analysis on one or multiple people within a frame. OpenFace uses Conditional Neural Fields (CLNF) [4]

for facial landmark detection. As an open source off-the-shelf software, it has achieved the least error in landmark detection, gaze estimation and head pose estimation benchmarks. Specifically, it has achieved an error of 9.96 % on the MPIIGAZE dataset


Additionally, the driver is provided with a smartwatch that collects HR data, hand movement (through IMU sensor), and GPS locations. Also, we used an IMU sensor in the vehicle that collects vehicle’s kinematic data including acceleration, linear acceleration as well as rotational velocity in 3 different directions. The detail of data collection and setup is provided in [58].

In case study II, we use driving data collected from 12 participants through our previous study, HARMONY [58]. From each participant, we randomly chose a continuous highway trip that lasted approximately 90 minutes. In HARMONY, we did not have access to IMU sensors inside the vehicle. Instead, we calculated vehicle’s forward acceleration through applying a gradient on the speed data retrieved from GPS on the camera. Although this GPS information was sampled at a much lower frequency (i.e., 1 HZ), it provides significant insights into differences across participants.

Fig. 2: An overview of the case study dataset. The vehicle is equipped with a camera, smart watch, and an IMU sensor (A, and C). The data is collected from a 120 mile driving scenario between cities of Charlottesville and Washington DC, with a duration of 2 hour and 10 minutes (B).

Iii-B Driver State Feature Extraction

We use driver’s raw HR directly for driver stress inference as previous studies showed the positive correlation between driver’s HR and stress levels (see section II-A, and Fig. 1 - B). Moreover, our previous study in driving environments showed that stressful events on the road might change the distribution of driver’s HR momentarily, and increase the HR for a short period of time [58].

As described in section II-A, driver’s gaze patterns can be used for workload estimation (Fig. 1 - B). For this purpose, from the driver’s gaze data, we retrieve the GTE. For this task, we first construct a 2D space of the range of the gaze angles [26, 59]. The 2D space is divided into equally distanced areas of interest (AOI) which can be a grid. The GTE is then calculated based on a rolling window of size 240s on the driving data [26]. As a summary, for a sequence of AOIs, we find the transition matrix between AOIs by assuming a first-order Markov process for the gaze sequences [26]. To this end, a transition matrix is retrieved with being the probability of switching between AOIs i and j (in S), with a stationary probability of . Then the GTE is calculated based on Shannon’s entropy as:


Iii-C Segmentation

Each driving scenario can be imagined as a combination of multiple segments. These segments are either separated using road objects (e.g., intersections and stop signs) or road users (e.g., a slow lead vehicle). In order to analyze the driving behavior data, it is first required to find different segments of interest [29]. This is mostly because behaviors vary across segments, in which a person might be comfortable with different levels of acceleration/deceleration in a city segment, while the same acceleration/deceleration level may not feel suitable for a highway segment. We propose using Bayesian Change Point detection for driving behavior data segmentation in which different driving segments can be separated using change points in the distribution of vehicle’s kinematic data (e.g., sudden acceleration) (Fig. 1

- B). In this regard we first provide a brief overview on Bayesian Change Point detection method. A vector of multiple kinematic signals is then used as the input to the Bayesian Change Point detector.

Iii-C1 Bayesian Change Point Detection (BCP)

Previous studies have pointed out the utility of BCP in detecting changes in the underlying distribution of data in different fields such as health [32], epidemiology [22], ecology [10], and transportation engineering [58]. More specifically in transportation engineering, [58] used BCP to detect changes in driver’s HR in naturalistic scenarios, which might be indicative of presence of stressors on the road. In another study Maye et al. [34] also used BCP to find different segments in vehicle’s telemetry data. To perform BCP, we use the Bayesian change point model provided in Barry and Hartigan’s book [7]. In summary this model assumes that the mean of the input (e.g., acceleration of the vehicle) within different segments remains constant. Below we formally define BCP for a univariate case. The reader is also referred to [7, 34, 16] for more information and details on this method.

Assuming we have input data point (e.g., vehicle’s acceleration data points) , we will use to refer to the accelerometer data between indices and . Let indicate a segment of the time series inside the non overlapping acceleration segments. A boolean array of change points is then used to to denote the segments. At each time step, if takes a value , we have a new acceleration segment; else we remain in the same segment.

We are interested in the posterior density . By Baye’s theorem, this can be written as


The Prior cohesion density can be retrieved as follows: Let denote the probability of finding a new change point at each data point. We assume this probability to be the same at each data point. If we assume that there are segments, the prior cohesion density can be written as


The joint density of observations and parameters given is a product of densities of different segments in . Let us consider a single segment. If we assume that the data in this block is generated by a Gaussian process with mean

and variance

, let the prior density of be a Gaussian with mean and variance


We can then simplify the above integral to:




and is the mean of the observations in the partition. However still depends on the parameters . Defining and choosing the following priors for the parameters:


After integrating out and , This can be simplified to the indefinite integral below. The full derivation is provided in [7].




Similarly, after integrating out the change probability p, the prior cohesion density thus can be written as


To calculate the posterior distribution over segments, Markov Chain Monte Carlo (MCMC) is used

[21]. A Markov chain is then defined with the following transition rule: with probability , a new change point at the location is introduced. Here and refer to the expressions in (12) with and without the change point in location i.


Finally, this is simplified to a probabilistic model with the following two parameters and .

We use the package bcp [16] in R programming language to implement our change point analysis. For the case study I, we use vehicle’s kinematics as the main segmentation input. To this end, we use the input vector X for segmentation algorithm as combination of forward acceleration (a), lateral acceleration (l), and lateral rotational speed ():

The vector X is then fed into BCP. The important point with respect to BCP is that it does not require to know the number of segments. This helps with longitudinal data where the actual number of driving segments is unknown.

In order to illustrate the results of segmentation we use the vehicle’s speed data. Note that the speed data was not used as the input into the segmentation as it is collected at 1 Hz, which is much lower than the IMU sensor (i.e., 100 Hz) and have not been used as the input to the segmentation. We use the speed data mostly due to the fact that speed can naturally show different segments of driving such as slowing down due to a lead vehicle, stopping for an intersection, and turning (Fig. 3). This helps with a better visual inspection of our results.

Fig. 3: Different segments of driving detected by the BCP applied on the kinematics data shown with dashed vertical lines.

For case study II, as we did not have access to an IMU sensor, we use vehicle’s speed data, as well the acceleration, which is retrieved through applying a derivative on the speed data. The segmentation is performed separately for each participant.

Iii-D Pattern Inference

In order to perform pattern inference we use a combination of Gaussian Mixture Modeling (GMM) and Latent Dirichlet Allocation (LDA), referred to as GMM-LDA [2, 1, 43, 29]. LDA is a text analytics probabilistic hierarchical Bayesian model, which is used to infer topics from a set of words [9]. When applying it to driving data, the goal is to provide a main driving pattern and state pattern for each segment of driving. To do so, we first need to generate 50 word-like objects (i.e., discretize the data) from the continuous raw vehicle telemetry, gaze, and HR data using GMM (Fig. 1 - C). GMM-LDA method then uses LDA on top of GMM objects to produce topics for each segment of driving, which can be considered a driving pattern (Fig. 1 - D). Below we formally outline this method.

Iii-D1 Gmm-Lda

In LDA, each word is modeled as a finite random mixtures over an underlying set of k topics and each topic is represented as a distribution over a set of words [9]. A word (w) is defined as a basic unit of discrete data which is drawn from a set of vocabulary. In this way, a document is then a sequence of N words and a corpus is a set a of M documents [9]. LDA is based on the following generative assumptions [9, 29]:


which is the topic distribution of document m in M and is the distribution of each word w in topic k in K. Note that and are the prior distributions of and respectively. In this way, the topic of the nth word in document m is , which has a Multinomial distribution on . Additionally, the nth word in document m has a Multinomial distribution over .

Historically LDA is designed for inferring topics from a set of documents and is based on text type data, which is by nature a discrete type data. When considering continuous data sources (e.g., vehicle’s acceleration), multiple methods were proposed to overcome this issue. A more common approach is to apply GMM clustering method on the continuous data to generate word-like objects [2, 1, 43, 29]

. GMM is a clustering method that identifies a mixture of multidimensional Gaussian probability distributions that can best describe the input vector

[45]. Formally, a GMM model is a weighted summation of M different components as:


In the equation above, x is the multidimensional vector and

is the identified different Gaussian distributions that sum up to 1, and are described as:


in which is the mean vector and is the covariance matrix. The parameters of different Gaussian distributions can be retrieved through the Expectation Maximization algorithm which is described in detail in [45].

We first apply GMM algorithm to obtain word-like objects (i.e., descretizing data) from the multimodal vehicle kinematic data, which includes forward acceleration (a), lateral acceleration (l), and lateral rotational speed () (Fig. 1 - C). These inputs are the same inputs that were used in segmentation section (III-C). Based on the previous literature we chose 50 as the number of words [29]. Also, note that using Bayesian Information Criterion (BIC) for defining the number of components, we observed very small enhancements ( in the BIC value after increasing the number of words above 50.

Iv Results

Fig. 4: Distribution of different kinematic sensor readings within different driving patterns. Note that differences across patterns in the kinematic sensor distributions, for instance, pattern 3 has a more positive lateral acceleration than the other 3 patterns. Also note that for some patterns the difference is only visible through one sensor such as the comparison between patterns 0 and 3 lateral acceleration.

Based on the generated GMM distribution, we then apply LDA to find the driving pattern that best describes each segment in the driving scenario (Fig. 1 - E). When using LDA, we need to define the number of topics prior to analysis. Driving scenarios often consist of different patterns [29]. For instance, braking behavior includes a limited number of patterns which can be high negative acceleration (i.e., harsh brake), or a low negative acceleration (i.e., normal brake). Thus it is likely that there are more than one driving pattern present in an individual’s driving data [29]. While previous studies suggested the number of patterns to be 4 we also tested different numbers ranging from 2-6 [8, 29]

. We observed that with increasing the number of topics in LDA (patterns in driving) more than 4, some of the topics become very similar from the visual inspection of the distributions so it becomes impractical to have them as separate patterns. For instance, moving from 4 to 5 topics, two of the generated topics will be similar in their mean and standard deviation of acceleration, and speed. Additionally, reducing the topics from 4 to 3 misses the difference in the braking patterns.

For the case study II, we also performed segmentation using vehicle’s speed and acceleration. Then for each segment we followed the GMM-LDA method and applied it to the driving behavior as well as driver state data to find patterns of behaviors and state in each segment. Below we first present the results for case study I, which helps better illustrate how the method is applied. We then discuss the results of applying the same method to a larger pool of participants.

In this section we present the results of applying the GMM-LDA method with Bayesian Change Point segmenting on two separate case studies. As described in section III-A, the first case study is from one participant with high resolution kinematic data, while the second case study is from 12 participants vehicles that had low resolution kinematic data.

V Results - Case Study I

In this section, we showcase how GMM-LDA method can be applied on driver behavior and state data to find patterns in driver’s behavior (e.g., harsh braking) and states (e.g., high stress level). We first discuss the patterns detected in vehicle kinematic data. We then move to driver’s state data (e.g., HR). Fig. 4 shows the distribution of kinematic data across the four different detected driving patterns (refer to section III-D

for details on pattern recognition). As shown on this figure, each behavior pattern has its own distinct distribution. For instance, when comparing patterns 3 and 2, we observe the difference in the location of peak in the forward acceleration as well as lateral acceleration. However, comparing across patterns 0 and 3, we only observe a major difference when comparing the distribution of their lateral acceleration.

Additionally, Table II provides the detailed statistics of each driving pattern. The first step is to confirm that the driving patterns detected by GMM-LDA are as a result of different underlying distributions and are statistically different. The Kolmogorov-Smirnov test between different distributions is performed [33]. The pairwise comparison of all distributions were significant at level and is shown on Table I. Below we elaborate in detail on each driving pattern detected in the kinematic data and how they relate to certain predefined driving behaviors.

Source Comparison

KS Test Statistic

p-value Significant at 0.05?
0&1 0.3369 <0.0001 Y
0&2 0.2467 <0.0001 Y
0&3 0.0622 0.0003 Y
1&2 0.3447 <0.0001 Y
1&3 0.3021 <0.0001 Y
2&3 0.2133 <0.0001 Y
0&1 0.1670 <0.0001 Y
0&2 0.2761 <0.0001 Y
0&3 0.7096 <0.0001 Y
1&2 0.2883 <0.0001 Y
1&3 0.5568 <0.0001 Y
2&3 0.6054 <0.0001 Y
0&1 0.1201 0.0003 Y
0&2 0.4365 <0.0001 Y
0&3 0.2642 <0.0001 Y
1&2 0.4520 <0.0001 Y
1&3 0.1488 <0.0001 Y
2&3 0.5152 <0.0001 Y
TABLE I: Details of statistical tests on different patterns detected through unsupervised modeling

Considering the differences in sensor readings across the driving patterns, we find dominant differences that can be associated to certain well-known driving behaviors. Driving patterns 0 and 1 both exhibit high negative forward acceleration (deceleration) values (-0.1667 and -0.4180) respectively. These two driving patterns are mostly related to braking behavior. The more negative value of mean forward acceleration in driving pattern 1 suggest a harsher braking pattern. Note in Table II the mean speed of the two driving patterns are similar at 76.27 and 76.61 km/h respectively. This provides evidence for the two patterns to be similar in nature, as both are related to braking. We refer to pattern 0 as normal braking behavior, and pattern 1 as harsh braking behavior. Driving pattern 2 has a relatively flat forward acceleration distribution including a broad range of forward accelerations with a mean value 0.032 which is close to zero. Due to exhibiting an average higher rotational velocity (0.027), and a broader lateral acceleration distribution, this driving pattern might be associated to normal road curvatures. We refer to this driving pattern as road curvature driving behavior. Driving pattern 3 has the highest mean speed value (89 km/h), with a positive forward acceleration (0.095). This driving pattern also exhibits one large peak at a positive lateral angular speed at 0.0087 which is very close to zero indicating straight driving. This pattern can be associated to normal free flow driving in the highway, and we refer to it as the highway free flow driving behavior. A sample of such behaviors are also shown on Fig. 5, where snapshots of different behaviors (e.g., free flow highway driving) are shown from the dataset.

Driving Behavior Data Statistical Index
Driving Pattern 0
(Normal Braking)
Driving Pattern 1
(Harsh Braking)
Driving Pattern 2
(Road Curvature
Driving Pattern 3
(Free Flow Driving)
Forward Acceleration Mean -0.16677 -0.41807 0.032059 -0.17298
Standard Deviation 0.343867 0.617844 0.677998 0.397301
25th percentile -0.34826 -0.61339 -0.41371 -0.3527
Median -0.20286 -0.40094 -0.08027 -0.18616
75th percentile -0.02847 -0.13217 0.303769 0.0067
Lateral Acceleration Mean -0.26936 -0.20002 -0.35933 0.095032
Standard Deviation 0.205908 0.263469 0.774748 0.347671
25th percentile -0.38046 -0.36103 -0.66054 -0.0605
Median -0.27365 -0.21968 -0.345 0.077288
75th percentile -0.15625 -0.04575 -0.08588 0.221336
Rotational velocity Mean 0.016956 0.01432 0.027152 0.008777
Standard Deviation 0.012567 0.012698 0.073349 0.018666
25th percentile 0.011648 0.00877 0.012364 0.004099
Median 0.015752 0.014653 0.02795 0.012516
75th percentile 0.021457 0.020216 0.047836 0.01734
Speed Mean 76.27409 76.61246 79.16773 89.84004
Standard Deviation 36.47230 29.24584 27.87107 27.15401
25th percentile 65 53 67 90
Median 97 88 89 98
75th percentile 98 97 99 106
TABLE II: Detailed statistics of forward acceleration, lateral acceleration, angular velocity, and speed of each recognized behavior. Note that different patterns resemble well-known driving behaviors.
Fig. 5: Samples of the detected driving patterns. Patterns 0 and 1 are associated with normal and harsh braking behavior, while pattern 2 is associated to curve driving behavior, and pattern 3 is related to highway free flow driving behavior.

Additionally, the sequence of different detected patterns in kinematic sensors can provide insights into driver’s behavior. We discuss the findings when analyzing the sequence of behaviors through a case study and a transition matrix of behavior sequences. First through a case study, Fig. 6 depicts the sequence of driving behaviors through parts of the driving scenario. As visually observed, in locations that speed varies sporadically, the diver behavior also switches more often (e.g., 15:43 until 15:46 shown with a blue dashed box on Fig. 6). To quantify these sequences, as shown in Table III, the probability of transition between the different driving behaviors as recognized by our method. Each element of the table at location i, and j shows the transition probability between the elements i and j. As shown on the table, when in normal braking behavior, the chances of switching to behaviors of road curvature driving behavior and free flow driving is higher than harsh braking behavior. However, when being at harsh braking behavior, the driver is most likely to switch to normal braking than other two behaviors. Looking at the free flow driving behavior, the driver is most likely to continue with the high speed highway driving (), and the probability of switching to a harsh braking behavior () is equal to the normal braking behavior, while being less than switching to the curve driving behavior. A sample of a probable sequence can be to start from free flow highway driving, switch to road curvature driving and attempt a harsh brake. For a better illustration, such a sample sequence is shown with black dashed box on Fig. 6.

Fig. 6: The change in driving patterns throughout the different segments of driving. Each segment is associated with one of the four driving patterns as described previously. Note that when speed changes more sporadically, the behavior also changes more often (e.g., blue dashed box). Also, the black dashed box shows a sequence of sample behavior starting from free flow highway driving, switching to road curvature driving, and finishing with a harsh brake.
Driving Behavior
Road Curvature
Free Flow
0.28 0.21 0.26 0.26
0.27 0.27 0.21 0.25
Road Curvature
0.25 0.27 0.26 0.21
Free Flow
0.13 0.13 0.19 0.55
TABLE III: Transition matrix between different driving behaviors for case study I dataset.

The GMM-LDA method was then applied to the driver’s gaze and HR data. We considered two patterns for these two modalities relating to normal and abnormal (high) for HR, and low and high for gaze entropy, respectively. Fig. 7 shows the different driver state patterns detected through driver’s gaze entropy and HR data. We first confirmed that the distribution of normal/abnormal HR and low/high gaze are statistically significant. Similar to previous sections this is performed through Kolmogorov-Smirnov test [33] and the results of comparison between different distribution where significant at level. Table IV shows the significance of the tests across the different distributions.

Driver state pattern 0 in HR is related to normal HR values, which has a mean value of 71.23 bpm with an standard deviation of 4.47 bpm. We refer to this driver state pattern as the normal HR or calm state. On the other hand, driver state pattern 1 is related to high HR with a mean of 78 bpm and a standard deviation of 5.13 bpm, which we refer to as abnormal HR depicting a situation of having high stress driver state. As shown on Fig. 7, density of the abnormal HR is lower, indicating the less amount of time spent in this state pattern throughout all observations. This is also in line with our previous study [58] showing that driver’s HR through stressors on the road increases from it’s baseline values for a short period of time and moves back to it’s baseline.

Fig. 7: Distribution of different driver state patterns detected through HR and gaze entropy. Note the differences across the distributions pointing to different states for the driver
Driver State Data Comparison KS Test Statistic p-value Significant at 0.05?
HR Abnormal & Normal HR 0.5999 <0.0001 Y
Gaze Low & High Gaze Entropy 0.6688 <0.0001 Y
TABLE IV: Details of statistical tests across different distributions of HR and gaze data
Driver State Data Statistical Index
State Pattern 0
(Normal HR/
Low Gaze Entropy)
State Pattern 1
(Abnormal HR/
High Gaze Entropy)
HR Mean 71.23 78.13
Standard Deviation 4.47 5.13
25th percentile 68.4 75
Median 71 78
75th percentile 73.7 81.4
Gaze Mean 5.01 5.27
Standard Deviation 0.30 0.31
25th percentile 4.99 5.12
Median 5.08 5.36
75th percentile 5.17 5.47
TABLE V: Detailed statistics of HR, and gaze for each recognized state pattern

Fig. 8 shows the fraction of normal and abnormal HR in the recognized driving patterns such as normal brake, and harsh brake. As shown, the driver state pattern with abnormal HR happens more often than the normal HR in the road curvature driving behavior. A Chi-Square test [35] also shows that the ratio between the counts of each state is significantly different than being equal (i.e., ratio of 1) with a and a

. This figure also shows that the driver has more abnormal HR states during harsh brake compared to normal brakes, in which a Kruskal-Wallis test

[28] confirms the figure with a and a . Considering that normal HR indicates a calmer driving state, this implies the driver is calmer during normal braking behavior compared to harsh braking behaviors. Additionally, the driver had a higher level of normal HR as compared to abnormal HR in free flow driving behavior, in which we confirmed this through a Chi-Square test with a and a .

Fig. 8: The co-occurence of driving clusters with clusters of HR

Driver state pattern 0 in gaze with a mean of 5.01, is related to samples of gaze with lower GTE. On the other hand, driver state pattern 1 has a mean value of 5.27. As mentioned in section II, higher GTE in pattern 1 might be associated to higher task demand, higher scene complexity, and work load (Table V).

Fig. 9 shows the fraction of low and high GTE in the recognized driving patterns such as normal brake, and harsh brake. As shown, in the free flow driving, low GTE pattern happens more often than high GTE which might indicate lower work load for this driver during free flow driving. A Chi-Square test [35] shows the counts of each low and high GTE within free flow driving is significantly different than equal probability (i.e., ratio of 1) with a and a . Also, the ratio of high to low GTE in normal braking is higher than that of harsh braking, which implies that the probability of having a higher workload in normal braking is more than harsh braking. This is confirmed with A Kruskal-Wallis test [28] indicating a , and a . This can be due to the fact that in normal braking the period of time that the driver is in the process of braking is higher, thus driver’s GTE and the associated workload are more likely to be higher.

Fig. 9: The co-occurence of driving clusters with clusters of gaze entropy

Vi Results - Case Study II

Similar to the case study I, we performed segmentation, clustering and pattern inference on the case study II dataset. Note that case study II includes kinematic data from vehicle’s speed and acceleration from 12 participants. Fig. 10 depicts the distribution of speed and accelerometer data across the different detected driving patterns. Similar to case study I, each of the different detected patterns exhibit distinct characteristics. For instance, comparing across patterns 0, and 1, we observe the difference in the location of the peak in the distribution. In order to confirm that the patterns are statistically different, we have applied the Kolmogorov-Smirnov test [33]. The results of the pairwise tests across patterns are depicted in Table VI and are all statistically significant.

Fig. 10:

The distribution of forward acceleration and speed across different patterns retrieved unsupervised from 12 participants. Note that because we retrieved accelerometer data from the low resolution speed, the distributions are less smooth as compared to case study I. Thus, we are visualizing the differences across driving patterns with a histogram instead of a kernel density estimation (KDE).

Source Comparison KS Test Statistic p-value Significant at 0.05?
0&1 0.04 <0.0001 Y
0&2 0.08 <0.0001 Y
0&3 0.08 <0.0001 Y
1&2 0.08 0.0003 Y
1&3 0.09 <0.0001 Y
2&3 0.02 <0.0001 Y
Speed 0&1 0.7 <0.0001 Y
0&2 0.9 <0.0001 Y
0&3 0.9 <0.0001 Y
1&2 0.9 <0.0001 Y
1&3 0.8 <0.0001 Y
2&3 0.5 <0.0001 Y
TABLE VI: Details of statistical tests performed on the detected driving patterns. The tests shows that the distributions of speed and accelerometer data are significantly different across the detected patterns.

Considering the differences in vehicle’s speed and acceleration across the driving patterns, we find distinct differences associated to certain well-known driving behaviors, such as differences in mean and standard deviation as depicted in Table VII. Similar to case study I, driving pattern 0 is related to normal brakes, with a negative acceleration (-0.006). Additionally, driving pattern 1 is related to harsh brakes with the highest negative acceleration and the lowest mean speed as compared to all the other patterns. The high value of the standard deviation of speed (42.01) in the driving pattern 1 confirms that this behavior can happen at different speeds depending on the environment, and it can quickly change the speed from high to lower values. Driving pattern 2 is related to accelerating behavior, which has the highest positive acceleration (0.041). Similar to case study 1, driving pattern 3 exhibits a very low negative acceleration (-0.001) with the highest mean speed (110.368) and lowest standard deviation (18.78). This in fact, depicts the behavior through free flow driving where the driver keeps a constant speed while not accelerating/decelerating. Also, note the difference between driving pattern 0 and 3. Driving pattern 0 exhibits a higher standard deviation in speed relative to pattern 1 (23.4 vs. 18.74) with a lower mean speed value (107.195 vs. 110.368), respectively. These differences align with the difference between braking and highway driving behaviors. Please note that in case study II, we did not have access to lateral kinematic data thus we could not confirm the curve driving behavior across participants.

Behavioral Attribute Statistical Index
Driving Pattern 0
(Normal Braking)
Driving Pattern 1
(Harsh Braking)
Driving Pattern 2
Driving Pattern 3
(Free Flow Driving)
Forward Acceleration
Mean -0.006 -0.045 0.041 -0.001
Standard Deviation 0.805 1.259 1.291 0.523
25th percentile -0.118 -0.297 -0.259 -0.162
Median 0.0001 -0.0006 0.0038 -0.0006
75th percentile 0.108 0.233 0.353 0.126
Mean 107.195 47.872 103.252 110.368
Standard Deviation 23.404 42.014 25.964 18.784
25th percentile 84 6 99 96
Median 119 38 110 107
75th percentile 124 96 117 128
TABLE VII: Detailed statistics of forward acceleration and speed in each recognized driving pattern. Note that each pattern exhibits certain characteristics that resembles a well-known driving behavior.

Similar to case study I, we analyze the sequence of behaviors through a transition matrix. Table VIII shows the probability of transitioning between different driving behaviors. As depicted, when drivers are in free flow driving, there is a much higher probability to switch to a normal brake as compared to a harsh brake (0.16 vs. 0.02). This might imply that drivers are more likely to continue a more gentle driving style if they are in free flow driving which has an acceleration very close to zero. However, once being in an accelerating behavior, the driver is more likely to switch to a harsh brake as compared to a normal brake (0.15 vs. 0.13). Among the 12 participants, this might imply that an accelerating behavior continues with a harsh decelerating behavior, which can be considered a more risky driving style with high negative/positive values of acceleration.

Driving Behavior
Free Flow
0.75 0.1 0.12 0.12
Free Flow
0.02 0.65 0.16 0.16
0.15 0.14 0.60 0.11
Accelerating 0.15 0.13 0.08 0.64
TABLE VIII: Transition Matrix between different driving behaviors detected through the case study II dataset

For state data, we first normalized each participant’s HR and gaze entropy values to have a 0 mean and standard deviation of 1. This is performed due to the fact that the abnormal and normal HR as well as GTE values across people can be quite different. For instance, a person’s normal HR might be 65 bpm while for another person, it can be 75 bpm. For a better illustration, distribution of HR data is also shown on Fig. 11 where different participants’ HR distribution can be very different on the mean and standard deviation, implying different normal HR values across different people.

Fig. 11: The distribution of HR for 12 participants. Note the differences across the distributions pointing to different baselines for different participants.

We then applied the GMM-LDA method to both HR and gaze data separately, for case study II. Fig. 12 depicts different driver state patterns retrieved through both HR and gaze data. Similar to case study I, HR has two patterns of normal versus abnormal. Additionally, gaze entropy has two patterns of low versus high. The patterns recognized are significantly different as tested through the Kolmogorov-Smirnov test [33]. The results of the tests are depicted on Table IX.

Fig. 12: The distribution of gaze entropy and HR for 12 participants. Note the differences across the distributions pointing to different states for the drivers.
Driver State Data Comparison KS Test Statistic p-value Significant at 0.05?
HR 0&1 0.8 <0.0001 Y
Gaze 0&1 0.9 <0.0001 Y
TABLE IX: Detailed of statistical tests across different patterns of HR and gaze

The two detected patterns in drivers’ HR can be translated into a calm and stressful state for the drivers. When considering their interaction with the driving behaviors, we observe that, harsh brakes include a higher amount of abnormal patterns in HR in our dataset as compared to normal brakes (Fig. 13). This is then supported through a Kruskal-Wallis test [28] with a , and a . Additionally, the free flow driving with close to zero acceleration has the higher amount of normal HR, when compared to accelerating behavior. This is confirmed through a Kruskal-Wallis test [28] with a , and a .

Driver State Data Statistical Index
State Pattern 0
(Normal HR/
Low Gaze Entropy)
State Pattern 1
(Abnormal HR/
High Gaze Entropy)
HR Mean 76.78 89.15
Standard Deviation 9.26 12.02
25th percentile 69.45 79.76
Median 76.32 88.92
75th percentile 84 98
Gaze Mean 5.17 5.28
Standard Deviation 0.35 0.28
25th percentile 5.01 5.14
Median 5.21 5.29
75th percentile 5.39 5.44
TABLE X: Detailed statistics of HR and gaze for each recognized driver state pattern. Note that the values are normalized
Fig. 13: The fraction of each HR pattern within in each driving pattern from case study II. Note that free flow driving has the highest fraction of normal HR as compared to other driving behaviors, implying a calmer state for drivers.

With respect to drivers’ gaze patterns, we observe that free flow driving has the lowest amount of high GTE values when compared to the accelerating behavior (, and a ). This may point to the lowest cognitive load in free flow driving with close to zero acceleration as compared to the accelerating behavior with positive high acceleration (Fig. 14). Additionally, we observe that similar to case study I, normal braking behavior has a higher GTE fractions as compared to the harsh braking behavior (, and a ).

Fig. 14: The fraction of each gaze pattern within each driving pattern from case study II. Note that free flow driving has the highest fraction of low gaze entropy depicting lower work load.

Lastly, we aggregate the driving patterns with significantly higher than zero acceleration (both positive and negative) into one group titled as aggressive driving style. Additionally, we consider the free flow driving pattern as a conservative driving style where accelerations are close to zero. We then compare the fraction of abnormal HR and high GTE in each of these two categories of driving styles. Fig. 15 shows the fraction of driving segments with abnormal HR as well as high GTE together with the driving styles. As shown on the graph, almost all of the drivers are less likely to have high HR as well as high GTE in conservative monotone driving as compared to aggressive high acceleration driving. We also observe that the difference between fraction of High HR and GTE between aggressive and conservative driving varies across participants leading to different slopes for the lines connecting them. Additionally, among our participants, we observe that two of them has a higher fraction of abnormal HR and GTE in conservative driving (shown with red and blue color on Fig. 15). The implications of these differences will be discussed later.

Fig. 15: The fraction of each driver states patterns of high GTE and abnormal HR in each of conservative and aggressive driving styles. A conservative driving style tend to keep participants HR at a calmer state while using less cognitive resources. Two of the participants do not follow the general trend shown with red and blue.

Vii Discussion

The fast-paced improvement in the development and testing of AVs has been one of the important advancements of the automobile industry over the recent years. To enhance the human-AV collaboration, realistic human behavior models are needed where a driver’s comfort and behaviors can be modeled with a multidimensional approach. This is because human behavior is dynamic and can be affected by different internal dynamics (e.g., emotion and cognition), as well as external situational factors (e.g., traffic density or roadway conditions). The proposed framework in this paper couples both internal and external factors, namely context, to understand the pattern in driver’s state in each driving behavior through unsupervised modeling techniques. While supervised learning has been an impactful approach over the past decade of driving research, the massive amount of data collected through past NDS points towards more efficient, fast-paced methods that can understand behaviors with no or minimum manual annotation. While in the presented case study I, we have used very high-resolution data from one participant for method illustration, case study II confirms that low-resolution data when coupled with unsupervised modeling techniques, can help detect certain states and behaviors across participants. As current vehicles are equipped with multiple sensors such as GPS, our case study II suggests that coupling a conventional wearable device with sensors that are already in vehicles can provide significant insight into the driver’s state in each driving behavior.

Driver’s state patterns indicated the level to which a driver might be affected by different driving behaviors. More specifically, in our first case study, using high precision data, we observed that the driver had a higher fraction of normal HR in free-flow driving on the highway, implying a calmer state for the driver. This has then been confirmed in our case study II where we analyzed the data from 12 participants. Free flow driving not only had the lowest fraction of high gaze entropy, implying a lower workload but also was accompanied by the least fraction of abnormal HR among our participants in case study II. These findings are in line with previous studies indicating less reported subjective emotional states [12], as well as calmer states [56] in highway driving. Additionally, our results suggest that even the same action can have different effects on the driver when performed with different styles. For instance, we observe that harsh brakes are more likely to be accompanied by abnormal HR values implying higher stress levels as compared to normal brakes.

Our method can be leveraged to understand the effects of different driving styles on drivers’ states. Multiple studies have pointed out the importance of driving style selection for drivers’ trust in AVs [14]. For instance, a recent study points out that a ”defensive” driving style as compared to an aggressive driving style is perceived to be more trustworthy [14]. In case study II, our results suggest that a more conservative driving style that is accompanied by close to zero acceleration values is less probable to be accompanied by lower workload and stress levels. Although our sample size might not generalize to the larger pool of drivers, it shows the importance of choosing driving styles according to the drivers’ psychophysiological states given specific road environments (e.g., highway versus city driving). For instance, our results show that free flow driving is a better fit for the drivers of our study in keeping their HR at a normal pattern. Choosing highways over crowded city streets can help driving with the free flow speed, which then helps to keep drivers calmer with a lower workload. These differences in stress levels and workload have strong implications for humanizing automated services, including both the AV and routing services. Taking our results into account, different driving behaviors can be chosen based on how to they might affect a driver based on the unsupervised modeling of their historical driving data within the safety boundary of each behavior. Additionally, services such as route selection can take these factors into consideration for providing a human-centered service that considers how a driver might feel through driving on each route based on potential driving behaviors that can occur on the route (e.g., amount of highway driving, curve driving, etc.).

When comparing conservative versus aggressive driving style, we observe different levels to which participants are affected by each style, which might be indicative of individual differences when being in the two different driving styles. Additionally, among our participants, we have seen two participants that had higher abnormal HR and GTE during conservative driving style. This was in contrast with all the other 10 participants. After careful examination of the videos and interviewing the participants, we realized that one of the two drivers uses loud music in highways which might be indicative of the high HR values in conservative highway driving scenarios (shown with red on Fig. 15), while, the second driver often uses cruise control in highway driving (shown with blue). Although more data is required to address the effect of loud music in naturalistic driving, part of our future work will be focused on using the data provided in HARMONY [58], to assess the effect of music in naturalistic scenarios. This preliminary finding paves the way for understanding the naturalistic effect of music on driving behaviors and driver’s states.

Additionally, while we did not have access to the cruise control usage data by the vehicle, we hypothesize that the abnormal HR and GTE state for the participant shown with blue on Fig. 15 could be due to interaction with the semi-automated cruise control system. Current cruise control systems are generally used to keep a constant speed during highway free flow driving. This might imply that although automated driving assisting systems (ADAS) can help the drivers for delivering a safe driving task, they might be the cause of more stress and might result in changes in drivers’ state that can only be detected through using psychophysiological metrics. Changes in drivers’ behaviors and states, while engaged in ADAS systems were mentioned previously in other naturalistic driving study researches ([13, 25]). It is possible that when the cruise control system was active, which is often on highways and at higher speeds, the driver had more stress as they had to supervise the automation. Alternatively it could also be the case that participant engaged in other activities or started mind-wandering, which increased the fraction of high HR and GTE. A very recent study focusing on semi-automated vehicles drivers is also pointing towards more eyes-off-the-road during using automated systems [36]. More research with taking psychophysiology into account is required to understand the real effect of shared-autonomy on the driver. This also has strong implications for human sensing modules inside vehicles that rely on computer vision. Such modules often only consider the driver’s facial expressions and gaze patterns while leaving behind the driver’s physiological responses such as HR, skin temperature, etc. While computer vision has shown to be a very strong method for detecting a driver’s state, coupling it with physiological metrics can provide a deeper understanding of the driver’s state.

Although we have not focused specifically on the sequence of behaviors, using Tables III and VIII we observe important insights regarding the transition probabilities across different behaviors. This helps with analyzing driving styles in a deeper fashion. From Table III, we first observe that a harsh braking behavior is often followed by a normal braking behavior, while a normal braking behavior is more likely to continue in normal braking. This might imply that while performing normal braking, the driver is more confident regarding the context such as the surrounding road environment, traffic density, or other vehicles, thus the chances of switching to a harsh brake is lower than other two behavior. We also observe that the braking behavior that succeeds free flow driving is divided between the two types of braking behaviors. This might imply that depending on the context driver chooses between two different behavioral patterns. More information from the context can provide insight into the reasoning behind the choice between the two. Information such as the presence of other vehicles, the speed of other vehicles, and their distance can be predictive of such choice. This will be analyzed in our future work. Additionally, from Table VIII we observe that when being in accelerating driving behavior, the probability of switching to a hard brake is the highest among other behaviors. This indicates that aggressive, less monotone driving through accelerating behavior is more likely to result in a harsh braking/deceleration behavior, which can be inferred as continuing the less monotone aggressive driving. On the other hand, being in a free flow driving behavior is much less likely to switch to a harsh brake (). Such results could indicate the persistence of both aggressive or conservative behavior throughout the time, which can be used for driving style prediction. One part of the future work will be focused on this area to understand and predict the sequence of behaviors in different contexts.

Lastly, the findings of our approach to driver state modeling when applied in real-time, has the potential to guide an autonomous vehicle to better predict human behavior and take actions that are fit to driver’s physiology. Note that a vehicle does not need to know the meaning of each unsupervised pattern. For our study, we translated driving patterns into driving behaviors, which helps with illustrating each pattern and justifying the method. However, a vehicle only requires to know the characteristics of each planned driving pattern, compare it with the detected patterns in the historical data, and the probability of abnormal HR within different planned behaviors. This can then help the vehicle choose among different actions. For instance, when choosing between a harsh and normal brake, the vehicle can estimate the acceleration in each action, find the most similar driving pattern to each action based on acceleration characteristics and model fit, and choose the one that is more suitable to drivers’ HR and gaze patterns. Additionally, using our approach in driver state modeling, we can better predict future driving decisions by building behavioral models of past driver states and incorporating theoretical findings in psychology such as memory based judgment and decision making [47, 48].

Viii Conclusion and Future Work

In this paper, we propose a method to understand driver’s reactions to different driving maneuvers in real-world driving scenarios by using unsupervised methods. Using our approach, different driving behaviors can be ranked based on how they affect a user’s well-being, thus producing actions that are more aligned with human’s state, such as stress levels and cognitive loads. Our method uses a combination of Bayesian Change Point detection together with GMM-LDA methods applied on both driving and human sensing data to mine such associations between driver’s state and behavior.

Our work has five main limitations, which will be addressed in the future work of this paper. First, we will increase the number of participants as well as the duration of collected data to understand individual differences in how various driving styles may affect each driver. This can help us cluster different drivers based on how they are affected by the driving behaviors. Second, we will perform deeper feature extraction on both driver’s HR as well as gaze variation to retrieve a deeper understanding of the driver’s state in each driving pattern. Third, we will perform the same analysis with a higher number of sensors and modalities in both the human sensing and vehicle sensing part of our study. This can include modalities such as driver’s other bio-signals (e.g., skin temperature) and modalities retrieved from the computer vision techniques (e.g., distance to other vehicles). Fourth, we will focus deeper on the sequence of driving behaviors to be able to predict behaviors using time series prediction methods. Lastly, we will consider other unsupervised methods that can help understand behaviors in a more fine-grained manner.

Ix Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

X Acknowledgement

We would like to thank the UVA Link Lab, and Commonwealth Cyber Initiative (CCI) for providing support and resources to enable this project. Also, we are thankful to the UVA Institutional Review Board for their continuous support and feedback.


  • [1] G. Agamennoni, J. R. Ward, S. Worrall, and E. M. Nebot (2014) Bayesian model-based sequence segmentation for inferring primitives in driving-behavioral data. In 17th International Conference on Information Fusion (FUSION), pp. 1–8. Cited by: §III-D1, §III-D.
  • [2] G. Agamennoni, S. Worrall, J. R. Ward, and E. M. Neboty (2014) Automated extraction of driver behaviour primitives using bayesian agglomerative sequence segmentation. In 17th International IEEE Conference on Intelligent Transportation Systems (ITSC), pp. 1449–1455. Cited by: §III-D1, §III-D.
  • [3] S. Baee, E. Pakdamanian, I. Kim, L. Feng, V. Ordonez, and L. Barnes (2019)

    MEDIRL: predicting the visual attention of drivers via maximum entropy deep inverse reinforcement learning

    arXiv preprint arXiv:1912.07773. Cited by: §II-A.
  • [4] T. Baltrusaitis, P. Robinson, and L. Morency (2013) Constrained local neural fields for robust facial landmark detection in the wild. In Proceedings of the IEEE international conference on computer vision workshops, pp. 354–361. Cited by: §III-A.
  • [5] T. Baltrusaitis, A. Zadeh, Y. C. Lim, and L. Morency (2018) Openface 2.0: facial behavior analysis toolkit. In 2018 13th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2018), pp. 59–66. Cited by: §III-A.
  • [6] T. Bando, K. Takenaka, S. Nagasaka, and T. Taniguchi (2013) Automatic drive annotation via multimodal latent topic model. In 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 2744–2749. Cited by: §II-B.
  • [7] D. Barry and J. A. Hartigan (1993) A bayesian analysis for change point problems. Journal of the American Statistical Association 88 (421), pp. 309–319. Cited by: §III-C1, §III-C1.
  • [8] A. Bender, G. Agamennoni, J. R. Ward, S. Worrall, and E. M. Nebot (2015) An unsupervised approach for inferring driver behavior from naturalistic driving data. IEEE transactions on intelligent transportation systems 16 (6), pp. 3325–3336. Cited by: §II-B, §IV.
  • [9] D. M. Blei, A. Y. Ng, and M. I. Jordan (2003) Latent dirichlet allocation. the Journal of machine Learning research 3, pp. 993–1022. Cited by: §III-D1, §III-D.
  • [10] W. J. Calder and B. Shuman (2017) Extensive wildfires, climate change, and an abrupt state change in subalpine ribbon forests, colorado. Ecology 98 (10), pp. 2585–2600. Cited by: §III-C1.
  • [11] O. Carsten, K. Kircher, and S. Jamson (2013) Vehicle-based studies of driving in the real world: the hard truth?. Accident Analysis & Prevention 58, pp. 162–174. Cited by: §I.
  • [12] M. Dittrich (2021) Why drivers feel the way they do: an on-the-road study using self-reports and geo-tagging. In 13th International Conference on Automotive User Interfaces and Interactive Vehicular Applications, pp. 116–125. Cited by: §VII.
  • [13] N. Dunn, T. Dingus, and S. Soccolich (2019) Understanding the impact of technology: do advanced driver assistance and semi-automated vehicle systems lead to improper driving behavior?. Cited by: §VII.
  • [14] F. Ekman, M. Johansson, L. Bligård, M. Karlsson, and H. Strömberg (2019) Exploring automated vehicle driving styles as a source of trust information. Transportation research part F: traffic psychology and behaviour 65, pp. 268–279. Cited by: §VII.
  • [15] J. Engström, G. Markkula, T. Victor, and N. Merat (2017) Effects of cognitive load on driving performance: the cognitive control hypothesis. Human factors 59 (5), pp. 734–764. Cited by: §II-A.
  • [16] C. Erdman and J. W. Emerson (2007) Bcp: an r package for performing a bayesian analysis of change point problems. Journal of Statistical Software 23 (1), pp. 1–13. Cited by: §III-C1, §III-C1.
  • [17] R. A. Fabio, C. Incorpora, A. Errante, N. Mohammadhasni, T. Caprì, C. Carrozza, S. De Santis, and A. Falzone (2015) The influence of cognitive load and amount of stimuli on entropy through eye tracking measures.. In EAPCogSci, Cited by: §II-A, §II-A.
  • [18] H. Farah, H. N. Koutsopoulos, M. Saifuzzaman, R. Kölbl, S. Fuchs, and D. Bankosegger (2012) Evaluation of the effect of cooperative infrastructure-to-vehicle systems on driver behavior. Transportation research part C: emerging technologies 21 (1), pp. 42–56. Cited by: §I.
  • [19] A. L. Francis (2018) The embodied theory of stress: a constructionist perspective on the experience of stress. Review of General Psychology 22 (4), pp. 398–405. Cited by: §II-A.
  • [20] L. Fridman, D. E. Brown, M. Glazer, W. Angell, S. Dodd, B. Jenik, J. Terwilliger, A. Patsekin, J. Kindelsberger, L. Ding, et al. (2019) MIT advanced vehicle technology study: large-scale naturalistic driving study of driver behavior and interaction with automation. IEEE Access 7, pp. 102021–102038. Cited by: §I.
  • [21] W. R. Gilks (2005) Markov chain monte carlo. Encyclopedia of biostatistics 4. Cited by: §III-C1.
  • [22] T. A. Kass-Hout, Z. Xu, P. McMurray, S. Park, D. L. Buckeridge, J. S. Brownstein, L. Finelli, and S. L. Groseclose (2012) Application of change point analysis to daily influenza-like illness emergency department visits. Journal of the American Medical Informatics Association 19 (6), pp. 1075–1081. Cited by: §III-C1.
  • [23] C. Katsis, Y. Goletsis, G. Rigas, and D. Fotiadis (2011) A wearable system for the affective monitoring of car racing drivers during simulated conditions. Transportation research part C: emerging technologies 19 (3), pp. 541–551. Cited by: §I.
  • [24] H. Kim, E. Cheon, D. Bai, Y. H. Lee, and B. Koo (2018) Stress and heart rate variability: a meta-analysis and review of the literature. Psychiatry investigation 15 (3), pp. 235. Cited by: §II-A.
  • [25] H. Kim, M. Song, and Z. Doerzaph (2021) Is driving automation used as intended? real-world use of partially automated driving systems and their safety consequences. Transportation Research Record, pp. 03611981211027150. Cited by: §VII.
  • [26] K. Krejtz, A. Duchowski, T. Szmidt, I. Krejtz, F. González Perilli, A. Pires, A. Vilaro, and N. Villalobos (2015) Gaze transition entropy. ACM Transactions on Applied Perception (TAP) 13 (1), pp. 1–20. Cited by: §III-B.
  • [27] K. Krejtz, A. T. Duchowski, A. Niedzielska, C. Biele, and I. Krejtz (2018) Eye tracking cognitive load using pupil diameter and microsaccades with fixed gaze. PloS one 13 (9), pp. e0203629. Cited by: §II-A.
  • [28] W. H. Kruskal and W. A. Wallis (1952) Use of ranks in one-criterion variance analysis. Journal of the American statistical Association 47 (260), pp. 583–621. Cited by: §V, §V, §VI.
  • [29] G. Li, Y. Chen, D. Cao, X. Qu, B. Cheng, and K. Li (2021) Extraction of descriptive driving patterns from driving data using unsupervised algorithms. Mechanical Systems and Signal Processing 156, pp. 107589. Cited by: §I, §II-B, §II, §III-C, §III-D1, §III-D1, §III-D1, §III-D, §IV.
  • [30] N. Li, T. Misu, and A. Miranda (2016) Driver behavior event detection for manual annotation by clustering of the driver physiological signals. In 2016 IEEE 19th International Conference on Intelligent Transportation Systems (ITSC), pp. 2583–2588. Cited by: §II-B.
  • [31] M. Lohani, B. R. Payne, and D. L. Strayer (2019) A review of psychophysiological measures to assess cognitive states in real-world driving. Frontiers in human neuroscience 13, pp. 57. Cited by: §II-A, §II-A, §II-A.
  • [32] R. Malladi, G. P. Kalamangalam, and B. Aazhang (2013) Online bayesian change point detection algorithms for segmentation of epileptic activity. In 2013 Asilomar Conference on Signals, Systems and Computers, pp. 1833–1837. Cited by: §III-C1.
  • [33] F. J. Massey Jr (1951) The kolmogorov-smirnov test for goodness of fit. Journal of the American statistical Association 46 (253), pp. 68–78. Cited by: §V, §V, §VI, §VI.
  • [34] J. Maye, R. Triebel, L. Spinello, and R. Siegwart (2011) Bayesian on-line learning of driving behaviors. In 2011 IEEE International Conference on Robotics and Automation, pp. 4341–4346. Cited by: §III-C1.
  • [35] M. L. McHugh (2013) The chi-square test of independence. Biochemia medica 23 (2), pp. 143–149. Cited by: §V, §V.
  • [36] A. Morando, P. Gershon, B. Mehler, and B. Reimer (2021) A model for naturalistic glance behavior around tesla autopilot disengagements. Accident Analysis & Prevention 161, pp. 106348. Cited by: §VII.
  • [37] S. Motamedi, P. Wang, T. Zhang, and C. Chan (2020) Acceptance of full driving automation: personally owned and shared-use concepts. Human factors 62 (2), pp. 288–309. Cited by: §I.
  • [38] E. J. C. Nacpil, Z. Wang, and K. Nakano (2021) Application of physiological sensors for personalization in semi-autonomous driving: a review. IEEE Sensors Journal. Cited by: §I.
  • [39] E. Pakdamanian, N. Namaky, S. Sheng, I. Kim, J. A. Coan, and L. Feng (2020) Toward minimum startle after take-over request: a preliminary study of physiological data. In 12th International Conference on Automotive User Interfaces and Interactive Vehicular Applications, pp. 27–29. Cited by: §II-A.
  • [40] E. Pakdamanian, S. Sheng, S. Baee, S. Heo, S. Kraus, and L. Feng (2020) DeepTake: prediction of driver takeover behavior using multimodal data. arXiv preprint arXiv:2012.15441. Cited by: §II-A.
  • [41] S. Y. Park, D. J. Moore, and D. Sirkin (2020) What a driver wants: user preferences in semi-autonomous vehicle decision-making. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems, pp. 1–13. Cited by: §I.
  • [42] O. Perepelkina, M. Artemyev, M. Churikova, and M. Grinenko (2020)

    HeartTrack: convolutional neural network for remote video-based heart rate monitoring

    In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, pp. 288–289. Cited by: §I.
  • [43] K. S. Prabhudesai, B. O. Mainsah, L. M. Collins, and C. S. Throckmorton (2018) Augmented latent dirichlet allocation (lda) topic model with gaussian mixture topics. In 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 2451–2455. Cited by: §III-D1, §III-D.
  • [44] Y. Qiu, T. Misu, and C. Busso (2019) Analysis of the relationship between physiological signals and vehicle maneuvers during a naturalistic driving study. In 2019 IEEE Intelligent Transportation Systems Conference (ITSC), pp. 3230–3235. Cited by: §I.
  • [45] D. A. Reynolds (2009) Gaussian mixture models.. Encyclopedia of biometrics 741, pp. 659–663. Cited by: §III-D1, §III-D1.
  • [46] C. E. Shannon (1948) A mathematical theory of communication. The Bell system technical journal 27 (3), pp. 379–423. Cited by: §II-A.
  • [47] M. A. Sharif and D. M. Oppenheimer (2016) The effect of relative encoding on memory-based judgments. Psychological science 27 (8), pp. 1136–1145. Cited by: §VII.
  • [48] M. A. Sharif and D. M. Oppenheimer (2021) The effect of categories on relative encoding biases in memory-based judgments. Organizational Behavior and Human Decision Processes 162, pp. 1–8. Cited by: §VII.
  • [49] B. A. Shiferaw, D. P. Crewther, and L. A. Downey (2019) Gaze entropy measures detect alcohol-induced driver impairment. Drug and alcohol dependence 204, pp. 107519. Cited by: §II-A.
  • [50] B. A. Shiferaw, L. A. Downey, J. Westlake, B. Stevens, S. M. Rajaratnam, D. J. Berlowitz, P. Swann, and M. E. Howard (2018) Stationary gaze entropy predicts lane departure events in sleep-deprived drivers. Scientific reports 8 (1), pp. 1–10. Cited by: §II-A, §II-A.
  • [51] B. Shiferaw, L. Downey, and D. Crewther (2019) A review of gaze entropy as a measure of visual scanning efficiency. Neuroscience & Biobehavioral Reviews 96, pp. 353–366. Cited by: §II-A, §II-A, §II-A.
  • [52] R. Sugie, T. Arakawa, and K. Kozuka (2016) Detection of fatigue in long-distance driving by heart rate variability. ICIC Express Letters 10 (7), pp. 1553–1559. Cited by: §II-A.
  • [53] X. Sun, J. Li, P. Tang, S. Zhou, X. Peng, H. N. Li, and Q. Wang (2020) Exploring personalised autonomous vehicles to influence user trust. Cognitive Computation 12 (6), pp. 1170–1186. Cited by: §I.
  • [54] T. Taniguchi, S. Nagasaka, K. Hitomi, K. Takenaka, and T. Bando (2014) Unsupervised hierarchical modeling of driving behavior and prediction of contextual changing points. IEEE Transactions on Intelligent Transportation Systems 16 (4), pp. 1746–1760. Cited by: §II-B.
  • [55] A. Tavakoli, V. Balali, and A. Heydarian (2019) A multimodal approach for monitoring driving behavior and emotions. Technical report Cited by: §II-A.
  • [56] A. Tavakoli, M. Boukhechba, and A. Heydarian (2020) Personalized driver state profiles: a naturalistic data-driven study,. In 11th International Conference on Applied Human Factors and Ergonomics, Cited by: §VII.
  • [57] A. Tavakoli, M. Boukhechba, and A. Heydarian (2021) Leveraging ubiquitous computing for empathetic routing: a naturalistic data-driven approach. In Extended Abstracts of the 2021 CHI Conference on Human Factors in Computing Systems, pp. 1–6. Cited by: §II-A.
  • [58] A. Tavakoli, S. Kumar, X. Guo, V. Balali, M. Boukhechba, and A. Heydarian (2021) HARMONY: a human-centered multimodal driving study in the wild. IEEE Access 9, pp. 23956–23978. Cited by: §I, §I, §II-A, §II-A, §II-B, §III-A, §III-A, §III-A, §III-B, §III-C1, §III, §V, §VII.
  • [59] M. Tran, T. Sen, K. Haut, M. R. Ali, and M. E. Hoque (2020) Are you really looking at me? a feature-extraction framework for estimating interpersonal eye gaze from conventional video. IEEE Transactions on Affective Computing. Cited by: §III-B.
  • [60] UVABRAINLAB(Website) External Links: Link Cited by: §III-A.
  • [61] B. Wang, J. Gong, R. Zhang, and H. Chen (2018) Learning to segment and represent motion primitives from driving data for motion planning applications. In 2018 21st International Conference on Intelligent Transportation Systems (ITSC), pp. 1408–1414. Cited by: §II-B.
  • [62] Y. Xing, C. Lv, D. Cao, and P. Hang (2021) Toward human-vehicle collaboration: review and perspectives on human-centered collaborative automated driving. Transportation Research Part C: Emerging Technologies 128, pp. 103199. Cited by: §I.
  • [63] T. Yamakoshi, P. Rolfe, Y. Yamakoshi, and H. Hirose (2009) A novel physiological index for driver’s activation state derived from simulated monotonous driving studies. Transportation research part C: emerging technologies 17 (1), pp. 69–80. Cited by: §I.
  • [64] S. Zepf, M. Dittrich, J. Hernandez, and A. Schmitt (2019) Towards empathetic car interfaces: emotional triggers while driving. In Extended Abstracts of the 2019 CHI Conference on Human Factors in Computing Systems, pp. 1–6. Cited by: §I, §II-B.
  • [65] T. Zhang, W. Zeng, Y. Zhang, D. Tao, G. Li, and X. Qu (2021) What drives people to use automated vehicles? a meta-analytic review. Accident Analysis & Prevention 159, pp. 106270. Cited by: §I.
  • [66] X. Zhang, Y. Sugano, M. Fritz, and A. Bulling (2015) Appearance-based gaze estimation in the wild. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 4511–4520. Cited by: §III-A.