Driver inattention is a dangerous phenomenon that can arise for various reasons: distraction, drowsiness due to fatigue, reduced reaction time due to speeding, and intoxication. The consequences of inattentive driving can severely affect the driver’s safety even under normal road conditions, and can be devastating in terms of loss of life and/or long-lasting injuries. According to NHTSA’s latest figures, in 2019 distracted driving claimed 3,142 lives, drowsy driving claimed 795 lives, speeding caused 9,378 deaths, and drunk driving caused 10,142 deaths, all in the United States alone. Therefore, several types of driver-assist systems have been developed and deployed in modern vehicles to mitigate inattentiveness. However, traditional driver-assist technologies are static and not personalized, and are therefore insufficient to handle the situations that arise in futuristic transportation systems with mostly connected and/or autonomous vehicles. For example, several deadly accidents have been reported in which Tesla’s driving assistants were working normally but the drivers were inattentive [2, 3]. As per SAE standard J3016, state-of-the-art vehicles mostly fall under Levels 2/3, which continue to demand significant driver attention (e.g. Tesla Autopilot), especially in uncertain road and weather conditions. Therefore, there is a strong need to design dynamic, data-driven driver-alert systems that present effective interventions in a strategic manner based on their estimates of the driver’s attention level and physical condition.
However, the design of strategic interventions to mitigate the ill effects of driver inattention is quite challenging for three fundamental reasons. Firstly, the driver may not follow the vehicle’s recommendations (i) if the driver is inattentive, (ii) if the driver does not trust the vehicle’s recommendations, and/or (iii) if the recommendation signal is not accurate enough to steer the driver’s choices (e.g. the driver may not stop the vehicle because of a false alarm). Secondly, the persuasive effectiveness of the vehicle’s recommendations is technically difficult to evaluate due to its complex/unknown relationship with the driver’s (i) attention level, (ii) own judgment/prior of road conditions, and (iii) trust in the vehicle’s recommendation system. In addition, it is difficult to mathematically model and estimate these three terms [9, 10, 11]. Finally, there is strong evidence within the psychology literature that human decisions exhibit several anomalies with respect to traditional decision theory. Examples include deviations from expected utility maximization such as the Allais paradox, the Ellsberg paradox, and violations of transitivity and/or independence between alternatives; and deviations from classical Kolmogorov probability theory such as the conjunction fallacy, the disjunction fallacy, and violation of the sure-thing principle.
There have been a few relevant efforts in the recent literature where the driver and the driver-assist system interact in a game-theoretic setting. These efforts can be broadly classified into two types: (i) the direct method, where the system uses its on-board AI to directly control the vehicle, and (ii) the indirect method, where the system indirectly controls the vehicle by relying on the driver to make decisions. On the one hand, Flad et al. proposed a direct method that models driver steering motion as a sequence of motion primitives so that the aims and steering actions of the driver can be predicted and the optimal torque can then be calculated. Another direct method is that of Na and Cole, who investigated four different paradigms, namely (i) decentralized, (ii) non-cooperative Nash, (iii) non-cooperative Stackelberg, and (iv) cooperative Pareto, to determine the most effective method to model driver reactions in collision avoidance systems. Although direct methods can mimic driver actions, they do not consider the driver’s cognition state (in terms of preferences, biases and attention), and no intervention was designed or implemented to mitigate inattention. On the other hand, indirect methods have bridged this gap by taking the driver’s cognition state into account. Lutes et al. modeled driver-vehicle interaction as a Bayesian Stackelberg game, where the on-board AI in the vehicle (leader) presents binary signals (no-alert/alert) based on which the driver (follower) makes a binary decision (continue/stop) regarding controlling the vehicle on a road. This work and that of Lutes et al. share the same setting of unknown road condition and binary actions of two players, and both introduce a non-negative exponent parameter in the driver’s overall utility to capture his/her level of attention. The difference is that Lutes et al. still follow the traditional game-theoretic framework of maximizing payoffs, whereas this work extends the traditional framework to players who do not necessarily maximize payoffs. Schwarting et al. integrated Social Value Orientation (SVO) into autonomous-vehicle decision making. Their model quantifies the degree of an agent’s selfishness or altruism in order to predict the social behavior of other drivers. They modeled interactions between agents as a best-response game wherein each agent negotiates to maximize its own utility. However, all the human players in the games studied in the above works are still assumed to be rational players who maximize utilities, even though the utilities are modified to capture attention level or social behavior, whether by a non-negative exponent parameter or by SVO. The present work bridges this gap by directly considering the driver as an agent who does not seek to maximize payoff, but instead uses a quantum-cognition based decision process to make decisions.
Note that most of the past literature focused on addressing each of these challenges independently. The main contribution of this paper is that we address all three challenges jointly in our driver-vehicle interaction setting. In Section II, we propose a strategic driver-vehicle interaction framework in which all the aforementioned challenges are simultaneously addressed in a novel game-theoretic setting. We assume that the vehicle constructs recommendations so as to balance a prescribed trade-off between information accuracy and persuasive effectiveness. On the other hand, we model driver decisions using an open quantum cognition model that treats driver attention as a model parameter and incorporates the driver’s prior regarding the road condition into the initial state. In Section III, we present a closed-form expression for the cognition matrix in the driver’s open quantum cognition model. Given that the agent rationalities are fundamentally different from each other (the vehicle being a utility-maximizer, and the driver following an open quantum cognition model), we also propose a novel equilibrium notion, inspired by Nash equilibrium, and compute both pure and mixed equilibrium strategies for the proposed driver-vehicle interaction game in Sections IV and V respectively. Finally, we analyze the impact of driver inattention on the equilibrium of the proposed game.
II Strategic Driver-Vehicle Interaction Model
In this section, we model the strategic interaction between a driver-assist system (car) and an inattentive driver as a one-shot Bayesian game. We assume that the physical conditions of the road are classified into two states, namely, safe (denoted as $S$) and dangerous (denoted as $D$). The vehicle can choose one of two signaling strategies: alert the driver (denoted as $A$), or no-alert (denoted as $N$), based on its belief about the road state. Meanwhile, based on the driver’s belief about the road state and his/her own mental state (which defines the driver’s type), the driver chooses to either continue driving (denoted as $C$), or stop the vehicle (denoted as $S$). Note that although the letter $S$ is used to denote both the road state being safe and the driver decision being stop, the reader can easily decipher the notation’s true meaning from context.
Depending on the true road state, we assume that the vehicle (row player) and the driver (column player) obtain utilities as defined in Table I. When the road is dangerous, we expect the car to alert the driver. If the car does not alert, it gets a low payoff, and we assume that this low payoff depends on the driver’s action. If the driver stops, the payoff is only slightly low because no damage or injury is incurred. If the driver continues to drive, the payoff is very low because damage or injury is incurred. When the road is safe, the correct action for the car is not to alert. If the car does not alert, it gets a high payoff, which again depends on the driver’s action. If the driver stops, the payoff is only slightly high because the stop does not help the driver and an unnecessary stop is a waste of time and energy. If the driver continues to drive, the payoff is very high because everything is fine.
In this paper, we assume that neither the car nor the driver knows the true road state. While the car relies on its observations from on-board sensors and other extrinsic information sources (e.g. nearby vehicles, road-side infrastructure) and its on-board machine learning algorithm for road judgment to construct its belief regarding the road state being safe, we assume that the driver constructs a belief regarding the road state being safe based on what he/she sees and his/her prior experiences. Furthermore, as in the case of a traditional decision-theoretic agent, we assume that the car seeks to maximize its expected payoff. Given the probability with which the driver chooses to continue, the expected payoffs at the car for its two signaling strategies (alert and no-alert) are given by Equations (1) and (2).
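As a minimal sketch of the expected-payoff computation described above, the following Python snippet averages a payoff table over the car's belief about the road state and the driver's probability of continuing. The payoff values and the names `CAR_PAYOFF`, `car_expected_payoff`, and `car_best_response` are hypothetical; the numbers only mirror the qualitative ordering described in the text and are not the entries of Table I.

```python
# Hypothetical car payoffs indexed as [road state][car action][driver action].
# Illustrative values only; they are NOT taken from Table I.
CAR_PAYOFF = {
    "safe": {"N": {"C": 10.0, "S": 6.0}, "A": {"C": 2.0, "S": 1.0}},
    "dangerous": {"N": {"C": -10.0, "S": -2.0}, "A": {"C": -8.0, "S": 8.0}},
}


def car_expected_payoff(car_action, p_safe, p_continue, payoff=CAR_PAYOFF):
    """Average the car's payoff over the road state and the driver's action.

    p_safe     : car's belief that the road is safe
    p_continue : probability that the driver continues
    """
    total = 0.0
    for road, p_road in (("safe", p_safe), ("dangerous", 1.0 - p_safe)):
        for driver_action, p_drv in (("C", p_continue), ("S", 1.0 - p_continue)):
            total += p_road * p_drv * payoff[road][car_action][driver_action]
    return total


def car_best_response(p_safe, p_continue):
    """Return the signal ("A" or "N") with the larger expected payoff."""
    return max(("A", "N"),
               key=lambda a: car_expected_payoff(a, p_safe, p_continue))
```

With these illustrative numbers, a car that is sure the road is safe prefers not to alert a driver who continues, while a car that is sure of danger prefers to alert a driver who would stop.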
The calculation of this probability is complicated by the fact that the driver exhibits bounded rationality. Fortunately, this bounded rationality can be characterized by the open quantum system cognition model, as described below.
II-A Driver’s Open Quantum Cognition Model
In this subsection, we present the basic elements of the open quantum system cognition model and how it is applied to model driver behavior. The cognitive state of the agent is described by a mixed state or density matrix $\rho$, which is a statistical mixture of pure states. Formally, it is a Hermitian, non-negative operator whose trace is equal to one. Under the Markov assumption (given a sequence of time instants, the evolution over the whole interval factorizes into the evolutions over the consecutive sub-intervals), one can find the most general form of this time evolution based on a time-local master equation, with a differential superoperator (one that acts on operators) called the Lindbladian, which is defined as follows.
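Definition 1.
The Lindblad-Kossakowski equation for any open quantum system is defined as
$$\frac{d\rho}{dt} = -i(1-\alpha)\,[H, \rho] + \alpha \sum_{m,n} \gamma_{(m,n)} \left( L_{(m,n)}\, \rho\, L_{(m,n)}^{\dagger} - \frac{1}{2} \left\{ L_{(m,n)}^{\dagger} L_{(m,n)},\, \rho \right\} \right), \quad (3)$$
where
$H$ is the Hamiltonian of the system,
$[H, \rho] = H\rho - \rho H$ is the commutation operation between the Hamiltonian $H$ and the density operator $\rho$,
$\gamma_{(m,n)}$ is the $(m,n)$-th entry of some positive semidefinite matrix (denoted as $C$),
$\{L_{(m,n)}\}$ is a set of linear operators,
$\{X, Y\} = XY + YX$ denotes the anticommutator. The superscript $\dagger$ represents the adjoint (transpose and complex conjugate) operation.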
In this paper, we set
$$L_{(m,n)} = |m\rangle \langle n|$$
as defined in , where, for any index $m$,
$|m\rangle$ is a column vector whose $m$-th entry is 1 and the other entries are 0. Note that $\langle n|$ is obtained by transposing $|n\rangle$ and then taking its complex conjugate. Thus $\langle n|$ is a row vector whose $n$-th entry is 1 and whose other entries are 0.
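To make the structure of the master equation concrete, here is a minimal pure-Python sketch that evaluates its right-hand side for a small system. It assumes the standard Lindblad form with the coherent part weighted by $(1-\alpha)$ and the dissipative part by $\alpha$, and jump operators with a single unit entry at position $(m,n)$. The function names, the 0-based indexing, and the 2-dimensional test system are illustrative assumptions, not part of the paper's 8-dimensional model.

```python
def matmul(a, b):
    """Product of two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def dagger(a):
    """Adjoint: transpose followed by complex conjugation."""
    return [[complex(a[j][i]).conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]


def axpy(a, b, scale=1.0):
    """Entrywise a + scale * b."""
    return [[x + scale * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def jump_operator(m, n, dim):
    """L_(m,n) = |m><n| (0-based): a single 1 at row m, column n."""
    op = [[0j] * dim for _ in range(dim)]
    op[m][n] = 1 + 0j
    return op


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


def lindblad_rhs(rho, hamiltonian, gamma, alpha):
    """Right-hand side of the assumed master equation for a given rho."""
    dim = len(rho)
    # Coherent part: -i (1 - alpha) [H, rho].
    commutator = axpy(matmul(hamiltonian, rho), matmul(rho, hamiltonian), -1.0)
    out = [[-1j * (1.0 - alpha) * x for x in row] for row in commutator]
    # Dissipative part: alpha * sum over (m, n) of the weighted jump terms.
    for m in range(dim):
        for n in range(dim):
            if gamma[m][n] == 0:
                continue
            jump = jump_operator(m, n, dim)
            jump_dag = dagger(jump)
            ldl = matmul(jump_dag, jump)
            gain = matmul(matmul(jump, rho), jump_dag)
            anticommutator = axpy(matmul(ldl, rho), matmul(rho, ldl))
            term = axpy(gain, anticommutator, -0.5)
            out = axpy(out, term, alpha * gamma[m][n])
    return out
```

Under these assumptions the right-hand side is traceless and Hermitian whenever `rho` and `hamiltonian` are Hermitian and `gamma` is real, so the evolution preserves the unit trace of the density matrix.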
The second term on the right side of Equation (3) contains the dissipative term responsible for the irreversibility in the decision-making process, weighted by the coefficient $\alpha$ such that the parameter $\alpha \in [0, 1]$ interpolates between the von Neumann evolution ($\alpha = 0$) and the completely dissipative dynamics ($\alpha = 1$). Furthermore, the term $\gamma_{(m,n)}$ is the $(m,n)$-th entry in the cognitive matrix $C(\lambda, \phi)$. This cognitive matrix is formalized as the linear combination of two matrices $\Pi(\lambda)$ and $B$, which are associated to the profitability comparison between alternatives and the formation of beliefs, respectively:
$$C(\lambda, \phi) = (1 - \phi)\, \Pi^{T}(\lambda) + \phi\, B, \quad (5)$$
where $\phi$ is a parameter assessing the relevance of the formation of beliefs during the decision-making process, $\Pi(\lambda)$ is the transition matrix whose $(i,j)$-th entry is the probability that the decision maker switches from strategy $s_i$ to $s_j$ for a given state of the world $\omega$, and the matrix $B$ allows the driver to introduce a change of belief about the state of the world in the cognitive process by jumping from one connected component associated to a particular state of the world to the connected component associated to another one, while keeping the action fixed. The superscript $T$ denotes the transpose matrix. Finally, the dimension of the square matrix $C(\lambda, \phi)$ can be inferred from the detailed discussion given below.
At the driver, the world state primarily consists of two components: (i) the road condition, and (ii) the car’s action, i.e., the set of world states of the driver is where the first letter represents road condition and the second letter represents car action. The utilities of the driver for choosing a strategy at a world state are as follows:
We choose the basis of the road-car-driver system spanning the space of states to be
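Next we define the transition matrix $\Pi(\lambda)$. If the utility of the decision maker by choosing strategy $s_i$ at the world state $\omega$ is $u(s_i \mid \omega)$, the transition probability that the decision maker would switch to strategy $s_j$ at time step $t+1$ from strategy $s_i$ at time step $t$ is given in the spirit of Luce’s choice axiom [24, 25, 26]:
$$P(s_i \to s_j \mid \omega) = \frac{u(s_j \mid \omega)^{\lambda}}{\sum_{k} u(s_k \mid \omega)^{\lambda}},$$
where the exponent $\lambda \geq 0$ measures the decision maker’s ability to discriminate the profitability among the different options. When $\lambda = 0$, each strategy has the same probability of being chosen, and when $\lambda \to \infty$ only the dominant alternative is chosen. There are two implications in this formulation of $P(s_i \to s_j \mid \omega)$: (1) the utilities must be non-negative, $u(\cdot \mid \omega) \geq 0$, to avoid negative probabilities; (2) the transition probability only depends on the destination $s_j$ and does not depend on the starting point $s_i$.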
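The Luce-style choice rule described above can be sketched as follows. This is an illustrative implementation, assuming strictly positive utilities and a non-negative discrimination exponent (called `lam` here); the function name `luce_probabilities` is hypothetical. Utilities are rescaled by their maximum before exponentiation, which leaves the probabilities unchanged but avoids floating-point overflow for large exponents.

```python
def luce_probabilities(utilities, lam):
    """Choice probabilities proportional to utility ** lam.

    utilities : strictly positive utilities, one per strategy
    lam       : non-negative discrimination exponent
    """
    if lam < 0:
        raise ValueError("the exponent must be non-negative")
    if any(u <= 0 for u in utilities):
        raise ValueError("this sketch assumes strictly positive utilities")
    top = max(utilities)
    # Dividing by the largest utility does not change the ratios.
    weights = [(u / top) ** lam for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]
```

For two strategies with utilities 2 and 6, an exponent of 0 gives equal probabilities, an exponent of 1 gives probabilities proportional to the utilities, and a large exponent concentrates almost all probability on the dominant strategy.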
Below are the probabilities needed for the matrix:
is the probability that driver picks when he/she assumes that road state is and the car chooses ,
is the probability that driver will pick when he/she assumes that road state is and the car chooses ,
is the probability that driver will pick when he/she assumes that road state is and the car chooses ,
is the probability that driver will pick when he/she assumes that road state is and the car chooses .
Equation (6) puts all the terms together in a matrix form and demonstrates the physical meaning of the row and column labels in .
In this paper, we set $\phi = 0$ for the following two reasons: (1) since the world state of the driver is mainly the action of the car, and the action of the car is known when calculating the equilibrium, the driver does not need to form such a belief; (2) we are considering a one-shot game and can assume that the road condition does not change within one game, i.e., we are only considering short-time dynamics. The matrix $B$ is therefore zeroed out and its content is not described here. Thus the cognitive matrix reduces to $\Pi^{T}(\lambda)$, whose entries serve as the coefficients in Equation (3).
II-B Pure and Mixed Strategy Equilibria
For the sake of simplicity, let us denote the car as Agent 1, and the driver as Agent 2 without any loss of generality. Since the car seeks to maximize its expected payoff given that the driver chooses a strategy , it is natural that the car’s final response is its best response that maximizes its expected payoff given in Equations (1) and (2), i.e.,
On the contrary, driver’s decisions are governed by the open quantum system model. If we denote the steady-state solution of Equation (3) as for a given car’s strategy , the final response of the driver is defined as
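Definition 2. A strategy profile is a pure-strategy equilibrium if and only if the car’s strategy is its final response to the driver’s strategy and the driver’s strategy is his/her final response to the car’s strategy.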
In contrast, the concept of mixed-strategy equilibrium is more natural to the open-quantum-system model, since its solution gives the probability of taking each action instead of indicating a particular action; that is, the open quantum system model directly gives a mixed strategy. Let the driver’s mixed strategy be specified by the probability that the driver chooses to continue. Similarly, let the car’s mixed strategy be specified by the probability that the car chooses to alert. A mixed strategy profile is then the pair consisting of these two strategies. In such a mixed strategy setting, the car’s final response is its best mixed-strategy response, i.e.
Similarly, the final response of the driver is obtained from the steady-state solution of Eq. 3, i.e.
Then the mixed-strategy equilibrium of this game is defined as follows.
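Definition 3. A strategy profile is a mixed-strategy equilibrium if and only if the car’s mixed strategy is its final response to the driver’s mixed strategy and the driver’s mixed strategy is his/her final response to the car’s mixed strategy.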
Note that the equilibrium notions presented in Definitions 2 and 3 are novel and different from traditional equilibrium notions in game theory. This is because our game comprises two different types of players: (i) the car, modeled as an expected utility maximizer, and (ii) the driver, modeled using the open quantum cognition equation, as illustrated in Figure 1. However, both equilibrium notions are inspired by the traditional definition of Nash equilibrium, and are defined using the players’ final responses as opposed to best responses in the Nash sense. In this way, traditional equilibrium notions can be extended to any strategic setting where heterogeneous entities interact in a competitive manner.
III Driver’s Final Response
Note that the dependent variable in Equation (3) is a matrix. In order to obtain the analytical solution, we vectorize by stacking its columns one on another to obtain vector . Thus, the vectorized version of Definition 1 is as follows.
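The column-stacking step rests on a standard identity: for conformable matrices, the vectorization of $AXB$ equals $(B^{T} \otimes A)$ applied to the vectorization of $X$. In particular, $H\rho$ maps to $(I \otimes H)\,\mathrm{vec}(\rho)$ and $\rho H$ maps to $(H^{T} \otimes I)\,\mathrm{vec}(\rho)$. The following pure-Python sketch checks the identity on small matrices; all helper names are illustrative.

```python
def matmul(a, b):
    """Product of two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def kron(a, b):
    """Kronecker (tensor) product of two matrices."""
    return [[a_ij * b_kl for a_ij in row_a for b_kl in row_b]
            for row_a in a for row_b in b]


def vec(x):
    """Stack the columns of x on top of one another."""
    rows, cols = len(x), len(x[0])
    return [x[i][j] for j in range(cols) for i in range(rows)]


def matvec(a, v):
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in a]


def vec_of_product(a, x, b):
    """Compute vec(A X B) as (B^T kron A) vec(X), without forming A X B."""
    return matvec(kron(transpose(b), a), vec(x))
```

Applying `vec` to the explicit product `matmul(matmul(a, x), b)` and calling `vec_of_product(a, x, b)` should give the same vector, which is what allows a matrix differential equation to be rewritten as an ordinary linear system.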
The vectorized form for Lindblad-Kossakowski equation is given by
where is the identity matrix,
with the superscript * representing taking the complex conjugate of all entries.
Note that the symbol $\oplus$ denotes the direct sum while the symbol $\otimes$ denotes the tensor product. The following two simple examples show their difference.
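As a small illustration of the difference between the two operations, the sketch below implements both for matrices stored as nested lists. The helper names are hypothetical. The direct sum places its arguments on the diagonal of a larger block matrix, whereas the tensor (Kronecker) product replaces every entry of the first matrix by that entry times the second matrix.

```python
def direct_sum(a, b):
    """Block-diagonal matrix with a in the top-left and b in the bottom-right."""
    cols_a, cols_b = len(a[0]), len(b[0])
    top = [list(row) + [0] * cols_b for row in a]
    bottom = [[0] * cols_a + list(row) for row in b]
    return top + bottom


def kron(a, b):
    """Kronecker (tensor) product: each entry a_ij is replaced by a_ij * b."""
    return [[a_ij * b_kl for a_ij in row_a for b_kl in row_b]
            for row_a in a for row_b in b]


def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]
```

The two operations coincide in one useful special case: the tensor product of an identity matrix with a matrix `a` equals the direct sum of copies of `a`, which is one way block-diagonal structure arises after vectorization.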
If the Hamiltonian of the 8-dimensional Lindblad-Kossakowski equation is defined as
then its vectorized form is given by
By Equation (11), we only need to calculate and . Noting , we have
with 1 the matrix whose elements are all 1.
Subtracting from blockwise then leads to the claimed . ∎
The condition of Lemma 1 simply sets the Hamiltonian of the Lindblad-Kossakowski equation as in Equation (9). is a sparse block diagonal matrix with four blocks, each being . is a sparse matrix consisting of four blocks, where the off-diagonal blocks are identity matrices and the diagonal blocks are again block diagonal matrices. Such a special structure results from stacking the columns of the all-one matrices.
The entry of the matrix with is given by
The entry of the matrix with is given by
is a real matrix, so and .
Since only the entry of is 1 and the others are 0, is a matrix with all entries zero except the entry, which is 1. Note that and range from 1 to 8.
Since the entry of is 1 and the other entries are 0, is a $64 \times 64$ matrix whose entries are all zero except the th to the 8th diagonal entries (which are 1), and is a $64 \times 64$ matrix whose entries are all zero except the entries (which are 1) with , for each fixed . Thus by Equation (14), is a $64 \times 64$ matrix whose entries are all zero except the entries with or , for each fixed . The entries are 1/2 when and 1 when .
By Equation (13), subtracting from leads to the claimed result: When , there is no cancellation of nonzero entries between and . When , only the entry of is nonzero (which is 1). The entry of is also 1. Thus the resulting only has 14 nonzero entries. ∎
Note that does not mean the entry of . is itself a matrix. There are 64 such matrices and they will be weighted by and summed. The entry of then depends on , , , and . is very sparse. The nonzero entries can only take and since the building blocks and only have 1 as a nonzero entry value. Given and , the entries with or are special since either or takes nonzero values at these entries.
Let the coefficients in the 8-dimensional Lindblad-Kossakowski equation be the entries of the matrix (ref. to Equation (6))
where is of the form
Then, the entries of within the vectorized Lindblad-Kossakowski equation (ref. to Def. 4) with are given by
where and , and the entries of with are given by
Interested readers may refer to Appendix A. ∎
depends on and . The expression of must consist of entries of . Theorem 1 just reveals these relations explicitly. The entries of appearing in the expression of are and where or . Such relations arise due to vectorization (stacking columns). Division by 8 and reduction modulo 8 appear because each column to be stacked is 8-dimensional. Despite the summation over all and , at most two entries of appear in since is itself sparse.
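The role of division by 8 and reduction modulo 8 can be seen from the index bookkeeping of column stacking. The sketch below, with hypothetical helper names and 0-based indices, maps between a position in an $8 \times 8$ matrix and the corresponding position in the 64-dimensional stacked vector.

```python
N = 8  # each stacked column is 8-dimensional


def vec_index(row, col, n=N):
    """Position in the stacked vector of matrix entry (row, col), 0-based."""
    return col * n + row


def matrix_index(k, n=N):
    """Inverse map: recover (row, col) from the stacked position k."""
    col, row = divmod(k, n)  # integer division gives the column, remainder the row
    return row, col


def diagonal_positions(n=N):
    """Stacked positions that hold the diagonal entries of the matrix."""
    return [vec_index(i, i, n) for i in range(n)]
```

The diagonal entries of the matrix, which carry the probabilities of interest, sit at stacked positions that are multiples of 9 when the matrix is $8 \times 8$.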
If the coefficients of the 8-dimensional Lindblad-Kossakowski equation are set as the entries of
where is a matrix in the form of
and are $16 \times 16$ matrices. Both have only one nonzero entry. The nonzero entries are taken from the cognition matrix :
The ’s are $4 \times 4$ matrices:
for , , , where
The Lindblad-Kossakowski equation itself is not a cognition model since its coefficients are quite general. The open quantum cognition model is built by setting the as entry of the cognition matrix . The condition in Corollary 1 is just setting in Equation (5) and using the prescribed in Equation (6). This is exactly the scenario of the car-driver game.
The vectorized operator of the vectorized Lindblad-Kossakowski equation is a block diagonal matrix with four blocks. The four blocks have very similar structures. Each block is itself quite sparse, since it is a block matrix with four sub-blocks in total and the two off-diagonal sub-blocks are almost identity matrices (only one entry differs).
IV Pure-strategy equilibrium
The diagonal elements of the steady-state solution of Equation (3) are just . Then we can calculate the probability for the driver to continue as
Let be the probability that the driver judges the road to be safe before knowing the car’s action and be the utility function of the driver. In this paper, we model driver’s pure strategy as the output of the open quantum cognition model parameters and taking the pure strategy of the car as input:
where is the parameter tuple of the open quantum model.
In this paper, we use in two different ways to obtain pure and mixed strategy equilibria. We obtain a pure strategy at the driver by employing a hard threshold on (in our case, Continue if , Stop otherwise). By treating as the driver’s mixed strategy in Section V, we will obtain the mixed-strategy equilibrium.
We set the initial density matrix as , where
when the car action is A and
when the car action is N, with prescribed in Subsection II-A. The calculation of the generalized pure-strategy equilibrium is similar to that of the Nash equilibrium. We simply replace the best response with the final response. We loop over the car strategies. In the loop, the car strategy is the input of the open quantum model and a driver strategy is the output. If the car strategy is the best response with the outputted driver strategy, then the strategy profile is outputted as pure-strategy equilibrium. Algorithm 1 lists the procedures of calculating the pure-strategy equilibrium.
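The loop over car strategies described above can be sketched as follows. This is an illustrative outline rather than the paper's Algorithm 1. The driver model is passed in as a callable that returns the probability of continuing for a given car action, standing in for the steady-state solution of the open quantum model. The 0.5 threshold, the function name `pure_strategy_equilibria`, and the argument names are assumptions made for this example.

```python
def pure_strategy_equilibria(car_payoff, driver_continue_prob,
                             car_actions=("A", "N"), threshold=0.5):
    """Enumerate pure-strategy equilibria as (driver action, car action) pairs.

    car_payoff(car_action, driver_action) -> car's expected payoff
    driver_continue_prob(car_action)      -> probability that the driver continues
    threshold                             -> assumed hard threshold on that probability
    """
    equilibria = []
    for car_action in car_actions:
        # Driver's final response: hard threshold on the continue probability.
        p_continue = driver_continue_prob(car_action)
        driver_action = "C" if p_continue > threshold else "S"
        # Car's check: is its action a best response to the driver's response?
        best_value = max(car_payoff(a, driver_action) for a in car_actions)
        if car_payoff(car_action, driver_action) >= best_value:
            equilibria.append((driver_action, car_action))
    return equilibria
```

Depending on the supplied payoff function and driver model, the loop can return no equilibrium, a single equilibrium, or two equilibria, which matches the kinds of regions reported in the equilibrium plots.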
Furthermore, in our numerical evaluation, we assume the utilities at both the car and the driver as shown in Table II. In addition to the case of a driver-conscient car, we consider a benchmark case where the car does not care about the driver and makes decisions solely based on its prior, i.e., alert if and does not alert if . In this benchmark case, the final response of the car is independent of the driver’s strategy. The equilibrium points of the driver-car games with a driver-agnostic car and with a driver-conscient car (driver making decisions according to open quantum model with ) under various prior beliefs on road condition are shown in Fig. 2. When both the driver and car are sure of safety, the equilibrium is . When both the driver and car are sure of danger, the equilibrium is . When the driver is sure of safety but the car is sure of danger, the equilibrium is . When the driver is sure of danger but the car is sure of safety, the equilibrium is . The division line is not . (S, A) has the largest area. When the car is driver-agnostic, the border between Not Alert and Alert in the equilibrium plot is always regardless of the equilibrium strategy of the driver. When the car is driver-conscient, the border between Not Alert and Alert depends on the equilibrium strategy of the driver (or equivalently, road prior of the driver): the border is located close to when and the border is located close to when .
The equilibrium points of the driver-car game with = 0, 1, 2, 3, 4, 10 and = 0.8 under various prior beliefs on road condition are shown in Fig. 3 (C: Continues, S: Stop, A: Alert, N: Not Alert). When drops from 10 to 4, the border between S and C shifts from left to right. When drops from 4 to 3, the border between (S, A) and (C, A) shifts from left to right and a region with two equilibrium points appears inside the region of (C, N). The two equilibria are (C, N) and (S, A). When drops from 4 to 3 and from 3 to 2, the border between (S, A) and (C, A) shifts from left to right and the region with two equilibrium points enlarges with the border shift. When drops from 2 to 0, the driver can no longer distinguish the utilities. The region is merged into the region and the two-equilibrium region is merged into the region. The border between and shifts from left to right and a new no-equilibrium region appears inside the previous region.
When , the driver cannot distinguish the utilities at all and is completely random, so the concept of final response does not apply. The type of pure-strategy equilibrium strongly aligns with the priors of the driver and the car. The desired equilibria are and , where the driver’s action is in harmony with car’s action.
Since Fig. 2 and Fig. 3 are plotted over the prior-belief axes, we can find out which type of equilibrium is most common. With the prescribed utilities, the most common pure-strategy equilibrium is $(S, A)$. This is the most favorable equilibrium, since following the car’s recommendation on a dangerous road can save lives.
As the driver’s ability to distinguish utilities weakens ( decreases), becomes more likely. This means that the driver follows the car’s advice diligently especially when he/she is incapable of making decisions on a dangerous road.
V Mixed-strategy equilibrium
When calculating the mixed-strategy equilibrium, and appear in the initial state of the open quantum model since the mixed-strategy of the car is completely determined by (ref. to Subsection II-B). Theorem 2 will give a closed-form expression of by solving the vectorized Lindblad-Kossakowski equation (ref. to Definition 4).
Let the initial density matrix be given as where