1 Introduction
The advancement of Information and Communication Technologies (ICTs) and transportation technologies has connected entities, including people and devices, in various ways and improved the quality of life globally. While benefiting from increasing connectivity, we have also experienced several toxic “side effects” that are brought about by hidden spreading processes over the underlying networks. Such processes include the spread of contagious diseases over human contact networks and animal populations, the diffusion of viruses or worms over communication and computer networks, and the propagation of rumors and fake news over social media. The recent COVID-19 pandemic, which has taken a devastating toll on the physical and economic well-being of people across the world, calls for closer attention to these processes. A fundamental understanding of the evolution and control of these processes will contribute to alleviating the threats to the safety, well-being, and security of people and other interconnected systems around the world.
The underlying networks on which these processes spread are usually large-scale complex networks composed of intelligent and strategic individuals with different beliefs, perceptions, and objectives. The scale and complexity of the underlying networks, the unpredictability of individuals’ behavior, and the unavailability of accurate and timely data pose challenges to the fundamental understanding of the evolution and control of these processes.
The inclusion of human decision-makers in the epidemic spreading process creates challenges. It becomes a significant hurdle for researchers to understand human-in-the-loop spreading processes at a fundamental level. The key to clearing this hurdle is to integrate decision models into epidemic models. Mathematical modeling of epidemic spreading dates back more than 200 years to the work of Daniel Bernoulli bernoulli1760essai, at the dawn of the industrial revolution. Many works have been dedicated to the modeling of epidemic spreading since then. But not until very recent years have there been works investigating human decision-making amid epidemics and the effect of human responses on the spreading processes. In this review, we present various solved and open problems in developing, analyzing, and mitigating the epidemic spreading process with human decision-making. We provide a tutorial on epidemic models and discuss the pros and cons of different epidemic models. We explain in detail how decision models are introduced and integrated into epidemic models in the existing literature. For example, we provide concrete examples regarding what interventions can be taken by individuals and the central authority to fight against the epidemic, when interventions are taken, and how interventions are modeled.
Among various decision models, game-theoretic models have become prominent in modeling human responses and behavior amid epidemics in the last decade. The popularity of game-theoretic models for human-in-the-loop epidemics is primarily due to the following reasons. The first is the large-scale nature of the human population. Centralized decision-making becomes intractable in a large-scale network. Game-theoretic models provide a bottom-up decentralized modeling framework that naturally makes the computation and design scalable. The second is that individuals living amid an epidemic may not be willing to comply with the suggested protocols, and individuals are mostly self-interested. The third is that game theory, as a mature and extensive field, offers a set of relevant concepts and analytical techniques that can be leveraged to study human behavior amid epidemics. In this review, we demonstrate that game-theoretic frameworks are powerful in modeling the spreading processes of human-in-the-loop epidemics. We provide a multidimensional taxonomy of the existing literature that has proposed, studied, and analyzed game-theoretic models for human-in-the-loop epidemics. Among the existing literature, we showcase three representative works with unique ways of integrating game-theoretic decision-making into the epidemic models. These three works differ from one another in their models, analytical methods, and results.
Despite a recent surge in the literature on game-theoretic models for studying human-in-the-loop epidemic spreading, a number of open problems and research gaps remain to be addressed. Hence, we use one section to discuss emerging topics. As we continue to witness the devastating toll of the pandemic on human society, this review aims to introduce more researchers, especially game and dynamic game theorists, to the subject of game-theoretic modeling of human-in-the-loop epidemic spreading processes. Their contributions to understanding and mitigating human-in-the-loop epidemic spreading would make a significant societal impact.
1.1 Mathematical Preliminaries
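Graph Theory: A directed graph (network) is a pair $\mathcal{G} = (\mathcal{V}, \mathcal{E})$, where $\mathcal{V}$ is the set of nodes representing the individuals involved and $\mathcal{E} \subseteq \mathcal{V} \times \mathcal{V}$ is the set of edges representing connections between two individuals. The size of the network is $N = |\mathcal{V}|$. Given $i, j \in \mathcal{V}$, an edge from node $i$ to node $j$ is denoted by $(i, j)$. When $(i, j) \in \mathcal{E}$ implies $(j, i) \in \mathcal{E}$ and vice versa, the graph is undirected. For an undirected network, we define the set of neighbors of node $i$ as $\mathcal{N}_i = \{ j \in \mathcal{V} : (i, j) \in \mathcal{E} \}$. For a directed network, we denote the out-neighbor set of individual $i$ by $\mathcal{N}_i^{\mathrm{out}} = \{ j \in \mathcal{V} : (i, j) \in \mathcal{E} \}$. Let $A = [a_{ij}]$ be the adjacency matrix of an unweighted graph, with elements $a_{ij} = 1$ if and only if $(i, j) \in \mathcal{E}$. Otherwise, $a_{ij} = 0$. For a weighted network, we use $W = [w_{ij}]$ as the adjacency matrix, with elements $w_{ij} \geq 0$.
The number of neighbors a node $i$ has is the degree of this node, denoted by $k_i$. Given a graph $\mathcal{G}$, $P(k)$ is the proportion of nodes that have degree $k$. The average degree of a graph is denoted by $\langle k \rangle = \sum_k k P(k)$. Let $\langle k^2 \rangle = \sum_k k^2 P(k)$. If $\mathcal{G}$ is a scale-free graph, its degree distribution follows $P(k) \sim k^{-\gamma}$, with the exponent $\gamma$ typically ranging from $2$ to $3$. If $\mathcal{G}$ is a regular graph, then each node in $\mathcal{G}$ has the same degree.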
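Notation: For a square matrix $M$, $\rho(M)$ is the spectral radius of $M$. Given a vector $x \in \mathbb{R}^N$, $\mathrm{diag}(x)$ is an $N \times N$ diagonal matrix whose element on the $i$th row and the $i$th column is $x_i$. The identity matrix is denoted by $\mathrm{Id}$. We use $\mathbb{E}[\cdot]$ and $\mathbb{P}(\cdot)$ to denote the expected value and the probability of the argument. Given $x \in \mathbb{R}^N$, $x_{-i}$ denotes the vector obtained by removing element $i$. Given a graph $\mathcal{G}$ and a vector $x$ associated with its nodes, $x_{\mathcal{N}_i}$ denotes the vector that includes all the elements associated with node or individual $i$ and its neighbors. Let $\Delta_n$ be the probability simplex of dimension $n$, i.e., $\Delta_n = \{ x \in \mathbb{R}^n : x_l \geq 0 \text{ for all } l, \ \sum_{l} x_l = 1 \}$.
2 Epidemic Models & Decision Models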
To understand the evolution and control of epidemic spreading at a fundamental level, we need to understand both the epidemic model and the decision model, and the art of coupling these two models.
2.1 Epidemic Models
The spreading of infectious diseases has affected human civilization since the age of nomadic hunter-gatherers. Mathematical models of epidemic spreading have been proposed and studied since the beginning of the industrial revolution, with one of the earliest attempts to model infectious disease transmission mathematically made by Daniel Bernoulli bernoulli1760essai. Over the last two centuries, many mathematical models of epidemics have been proposed and analyzed andersson2012stochastic; pare2020modeling; draief2009epidemics; van2008virus; pastor2001epidemic.
Compartment Models:
The basis of many mathematical epidemic models is the well-known family of compartment models kermack1932contributions. In compartment models, every subject, based on its status, belongs to some compartment of the population at any given time. Common compartments include the susceptible (S), the exposed (E), the infected (I), the asymptomatic (A), and the recovered (R). Depending on the mechanisms of infectious diseases, different compartment epidemic models such as SIS (Susceptible-Infected-Susceptible), SIR (Susceptible-Infected-Recovered), and SEIR (Susceptible-Exposed-Infected-Recovered) are studied and analyzed. Some infections, such as the common cold and influenza, do not confer any long-lasting immunity. Such infections can be modeled by the SIS model since individuals can become susceptible again after their recovery. If individuals recover with permanent immunity, the model is an SIR model. The SIR model has been used to study infectious diseases such as measles, mumps, and smallpox kermack1932contributions. Many variants of compartment models have been proposed and studied in the past decades. For example, Rothe et al. rothe2020transmission have looked into an SAIRS (susceptible-asymptomatic-infected-recovered-susceptible) model for the COVID-19 pandemic; Erdem et al. erdem2017mathematical have investigated an SIQR (susceptible-infected-quarantined-recovered) model to study the effect of imperfect quarantine on the spreading of an influenza epidemic; Huang et al. huang2016novel have proposed a model that connects the SIS and the SIR models, in which, once recovered from an infection, individuals become less susceptible to the disease.
2.1.1 Stochastic versus Deterministic Models
To capture the dynamics of the epidemic-spreading process, we need a dynamic model to describe the evolution of the population in each compartment. In general, epidemic models are categorized into two groups, deterministic epidemic models and stochastic epidemic models, depending on the mathematical formulation. Deterministic models, oftentimes represented by a collection of ordinary differential equations (ODEs), have perhaps received more attention in the literature kermack1932contributions; pare2020modeling; pastor2001epidemic; van2008virus. They are popular because they can become more complex yet remain feasible to analyze, at least when numerical results are sufficient. In contrast, stochastic models, usually represented by Markov processes, need to be fairly simple to be mathematically manageable andersson2012stochastic; draief2009epidemics. There are, however, several advantages of stochastic epidemic models over deterministic epidemic models when the analysis is tractable. First, the epidemic spreading processes are stochastic by nature. For example, the disease transmission between individuals is more naturally described by probabilities than by deterministic rules. Second, stochastic modeling has its deterministic counterpart through mean-field analysis. The stochastic modeling provides a microscopic description of the epidemic process, while the deterministic counterpart is useful to describe the spreading of epidemics at a macroscopic level, e.g., the fraction of the infected population at a given time. When the number of individuals is small, or the size of the infection is small in a large community, the mean-field approximation incurs a considerable approximation error, and hence deterministic models may fail to accurately describe the spreading process van2008virus. Third, deterministic models are incapable of capturing higher-order characteristics of the spreading process, such as variances, which are useful for understanding the uncertainties in the estimates.
Overall, deterministic models and stochastic ones are complementary to each other. Deterministic models describe the spreading process at a macroscopic level and are more manageable mathematically, yet subject to assumptions on the spreading processes. Stochastic models approach the spreading processes from a microscopic point of view and offer a detailed description of the spreading process. In the following subsections, we introduce both deterministic and stochastic models using the SIS epidemic models as examples.
2.1.2 Deterministic Models
Over the last century, increasingly sophisticated deterministic epidemic models have been proposed to capture the spreading processes on ever more complex and realistic networks.
In an SIS model, each individual in the system is either infected or susceptible. An infected node can infect its susceptible neighbors with an infection rate $\beta$. An infected individual recovers at recovery rate $\delta$. Once recovered, the individual is again prone to the disease. The simplest deterministic SIS model is introduced by Kermack and McKendrick kermack1932contributions:
(1)  $\dot{S}(t) = -\beta S(t) I(t) + \delta I(t), \qquad \dot{I}(t) = \beta S(t) I(t) - \delta I(t),$
where $S(t)$ is the fraction of the population who are susceptible and $I(t)$ is the fraction of the infected, with $S(t) + I(t) = 1$. In (1), the rate at which the fraction of infected individuals evolves is determined by the rate at which the infected population recovers, i.e., $\delta I(t)$, and the rate at which the fraction of the infected population grows, i.e., $\beta S(t) I(t)$. The latter rate captures the encounter between the fraction of susceptible individuals and the fraction of infected individuals. Simple deterministic models like (1) assume that the individuals in the population are homogeneously mixed; i.e., each individual is equally likely to encounter every other individual. Such models ignore the structure of the underlying network.
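As an illustration, the homogeneous SIS dynamics can be integrated numerically. The sketch below is a minimal example, assuming that model (1) takes the standard form in which the infected fraction grows at rate beta * S * I and shrinks at rate delta * I, with S + I = 1. The function name `simulate_sis`, the symbols `beta` (infection rate) and `delta` (recovery rate), the parameter values, and the forward-Euler step size are illustrative choices, not part of the original model description.

```python
def simulate_sis(beta, delta, i0, t_end, dt=0.01):
    """Forward-Euler integration of the homogeneous SIS model (1).

    Returns the fraction of infected individuals at time t_end.
    """
    infected = i0
    steps = int(round(t_end / dt))
    for _ in range(steps):
        susceptible = 1.0 - infected          # S(t) + I(t) = 1
        d_infected = beta * susceptible * infected - delta * infected
        infected += dt * d_infected
    return infected


# beta > delta: the infection settles at the endemic level 1 - delta/beta.
endemic = simulate_sis(beta=0.5, delta=0.2, i0=0.01, t_end=200.0)
# beta < delta: the infection dies out.
extinct = simulate_sis(beta=0.1, delta=0.2, i0=0.5, t_end=200.0)
```

Under these assumptions, the infected fraction approaches 1 - delta/beta when beta exceeds delta and vanishes otherwise, which is the simplest instance of the threshold behavior that epidemic models exhibit.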
Starting from the 1990s, new deterministic models have been proposed and studied to accommodate epidemic processes over more complex and realistic network structures. Kephart and White kephart1992directed investigated a regular graph with $N$ individuals where each individual has degree $k$. The Kephart and White model is described by the ODE:
(2)  $\dot{I}(t) = \beta k I(t) (1 - I(t)) - \delta I(t),$
where $\beta$ is the infection rate, $\delta$ is the recovery rate, and $I(t)$ is the fraction of infected individuals. The rate of infection is $\beta k I(t) (1 - I(t))$, which is proportional to the fraction of susceptible individuals, i.e., $1 - I(t)$. For each susceptible individual, the rate of infection is the product of the infection rate $\beta$ and the number of infected neighbors $k I(t)$. The Kermack and McKendrick model (1) and the Kephart and White model (2) are referred to as “homogeneous” models since they assume that the underlying network has a homogeneous degree distribution; i.e., each node in the network has the same degree.
With the growing prevalence of complex networks in many social, biological, and communication systems, it is of great interest to investigate the effect of their features on epidemic and disease spreading. Pastor-Satorras et al. pastor2001epidemic; pastor2001epidemic1 studied the spreading of epidemics on scale-free (SF) networks. In SF networks, the probability that an individual has degree $k$ follows a scale-free distribution $P(k) \sim k^{-\gamma}$, with the exponent $\gamma$ typically ranging from $2$ to $3$. It has been shown that many social networks, such as collaboration networks, and computer networks, such as the Internet and the World Wide Web, exhibit such structural properties. The Pastor-Satorras model further divides individuals into sub-compartments based on their degrees, with $I_k(t)$ representing the proportion of infected individuals with a given degree $k$:
(3)  $\dot{I}_k(t) = \beta k (1 - I_k(t)) \Theta(t) - \delta I_k(t),$
where $\beta$ and $\delta$ are the infection and recovery rates, and $\Theta(t) = \sum_{k'} k' P(k') I_{k'}(t) / \langle k \rangle$, with $\langle k \rangle = \sum_{k'} k' P(k')$ the average degree, represents the probability that any given link points to an infected individual. The rate at which the infected population grows is proportional to the infection rate $\beta$, the number of connections $k$, and the probability $\Theta(t)$ of linking to an infected individual. The model of Pastor-Satorras et al. incorporates the degree distribution of the underlying network and approximates the spreading processes on more complex and realistic networks.
To incorporate an arbitrary network characterized by the adjacency matrix $A = [a_{ij}]$, Wang et al. wang2003epidemic have proposed a discrete-time model that generalizes the Kephart and White model (2) and the model of Pastor-Satorras et al. (3). Van Mieghem et al. van2008virus have studied a continuous-time SIS model, called the intertwined deterministic model, that generalizes the results in wang2003epidemic. The intertwined deterministic model describes the spreading processes by
(4)  $\dot{p}_i(t) = \beta (1 - p_i(t)) \sum_{j=1}^{N} a_{ij} p_j(t) - \delta p_i(t), \quad i = 1, \ldots, N,$
where $p_i(t)$ denotes the probability of individual $i$ being infected van2008virus or the proportion of infected individuals in subpopulation $i$ fall2007epidemiological, and $\beta$ and $\delta$ are the infection and recovery rates. The element $a_{ij}$ in the adjacency matrix represents the connectivity between individual $i$ and individual $j$. The intertwined deterministic model provides a microscopic description of the spreading processes by incorporating the adjacency matrix that fully characterizes the underlying network. Other variants of the intertwined deterministic model (4) have been recently studied in khanafer2014stability; pare2017epidemic. To capture the heterogeneity of individuals’ demographic or health situations, one can consider an intertwined model with heterogeneous parameters:
(5)  $\dot{p}_i(t) = \beta_i (1 - p_i(t)) \sum_{j=1}^{N} a_{ij} p_j(t) - \delta_i p_i(t), \quad i = 1, \ldots, N,$
where $\beta_i$ describes the infection rate of individual $i$ and $\delta_i$ represents the recovery rate of individual $i$. Khanafer et al. have considered an intertwined model in a directed network khanafer2014stability. Paré et al. have extended the intertwined deterministic model to accommodate time-varying networks pare2017epidemic.
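The intertwined model can be explored numerically on a small network. The following sketch is a minimal illustration, assuming that each individual's infection probability increases at its own infection rate times the healthy probability times the summed infection probabilities of its neighbors, and decreases at its own recovery rate. The helper names `simulate_intertwined` and `ring_graph`, the six-node ring, and all numerical values are hypothetical choices made only for this example.

```python
def simulate_intertwined(adj, beta, delta, p0, t_end, dt=0.01):
    """Forward-Euler integration of the heterogeneous intertwined model (5).

    adj   : N x N adjacency matrix as nested lists (a_ij = 1 if i and j interact)
    beta  : list of per-individual infection rates
    delta : list of per-individual recovery rates
    p0    : list of initial infection probabilities
    """
    n = len(adj)
    p = list(p0)
    for _ in range(int(round(t_end / dt))):
        nxt = []
        for i in range(n):
            # Infection pressure on i: sum of neighbors' infection probabilities.
            pressure = sum(adj[i][j] * p[j] for j in range(n))
            dp = beta[i] * (1.0 - p[i]) * pressure - delta[i] * p[i]
            # Clip to [0, 1] to guard against Euler overshoot.
            nxt.append(min(1.0, max(0.0, p[i] + dt * dp)))
        p = nxt
    return p


def ring_graph(n):
    """Adjacency matrix of an undirected ring in which every node has degree 2."""
    return [[1 if (i - j) % n in (1, n - 1) else 0 for j in range(n)]
            for i in range(n)]


ring = ring_graph(6)
# Identical rates for all individuals recover model (4) as a special case.
endemic = simulate_intertwined(ring, [0.5] * 6, [0.2] * 6, [0.1] * 6, t_end=200.0)
extinct = simulate_intertwined(ring, [0.05] * 6, [0.2] * 6, [0.1] * 6, t_end=200.0)
```

On the ring, every node has degree 2, so with identical rates the symmetric solution behaves like a regular-graph model: the first run settles at 1 - 0.2 / (0.5 * 2) = 0.8, while the second run, with a much lower infection rate, decays to zero.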
Most studies of deterministic epidemic models focus on analyzing the system equilibria (the limiting behavior of the spreading process as time goes to infinity), characterizing the threshold that determines which equilibrium the system converges to, and evaluating the stability properties of different equilibria van2008virus; pare2020modeling; pastor2001epidemic1; kephart1992directed; wang2003epidemic. In particular, conditions for the existence of and convergence to the “disease-free” equilibrium or the “endemic” equilibrium (endemic state), and the stability properties of these equilibria, have been found. Define the effective spreading rate $\tau = \beta / \delta$, i.e., the ratio of the infection rate $\beta$ to the recovery rate $\delta$. One of the primary goals in most studies of deterministic epidemic models is to characterize the threshold $\tau_c$. If $\tau > \tau_c$, the epidemic persists and a nonzero proportion of the individuals are infected, whereas for $\tau \leq \tau_c$, the epidemic dies out. The Kephart and White model (2) gives a “steady-state” epidemic threshold $\tau_c = 1/k$, which is inversely proportional to the individuals’ degree $k$ in a regular graph kephart1992directed. The model of Pastor-Satorras et al. (3) provides a threshold $\tau_c = \langle k \rangle / \langle k^2 \rangle$, with $\langle k \rangle = \sum_k k P(k)$ and $\langle k^2 \rangle = \sum_k k^2 P(k)$, which exhibits an absence of an epidemic threshold in SF networks for which $\langle k^2 \rangle \to \infty$. The epidemic threshold for the intertwined deterministic model (4) is specified by $\tau_c = 1/\rho(A)$, where $\rho(A)$ is the spectral radius of the adjacency matrix $A$ van2008virus. For the heterogeneous intertwined deterministic model (5), the infectious disease will die out when $\rho(D^{-1} B A) \leq 1$. Otherwise, an outbreak will occur. Here, $B = \mathrm{diag}(\beta_1, \ldots, \beta_N)$, $D = \mathrm{diag}(\delta_1, \ldots, \delta_N)$, and $A$ is the adjacency matrix of the underlying network. For a comprehensive review of deterministic epidemic models, their epidemic thresholds, and stability properties, one can refer to nowzari2016analysis; pare2020modeling.
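The three threshold expressions can be compared on a concrete network. The sketch below is a hedged illustration that assumes the thresholds are one over the common degree for a regular graph, the ratio of the first to the second moment of the degree distribution for the degree-based model, and one over the spectral radius of the adjacency matrix for the intertwined model. The function names, the power-iteration routine (a standard numerical method, not something prescribed by the cited works), and the star network are assumptions made for this example.

```python
def spectral_radius(adj, iters=2000):
    """Estimate the spectral radius of a nonnegative matrix by power iteration.

    The iteration is applied to A + Id, whose dominant eigenvalue is
    rho(A) + 1; the shift avoids oscillation on bipartite graphs.
    Assumes a connected (irreducible) network.
    """
    n = len(adj)
    v = [1.0] * n
    lam = 1.0
    for _ in range(iters):
        w = [v[i] + sum(adj[i][j] * v[j] for j in range(n)) for i in range(n)]
        lam = max(w)             # converges to rho(A) + 1 because max(v) == 1
        v = [x / lam for x in w]
    return lam - 1.0


def threshold_regular(k):
    """Threshold of a regular graph with common degree k."""
    return 1.0 / k


def threshold_degree_based(degrees):
    """Threshold <k> / <k^2> computed from an empirical degree sequence."""
    mean_k = sum(degrees) / len(degrees)
    mean_k2 = sum(d * d for d in degrees) / len(degrees)
    return mean_k / mean_k2


def threshold_spectral(adj):
    """Threshold 1 / rho(A) of the intertwined model."""
    return 1.0 / spectral_radius(adj)


# Star network: one hub connected to four leaves.
star = [[0, 1, 1, 1, 1],
        [1, 0, 0, 0, 0],
        [1, 0, 0, 0, 0],
        [1, 0, 0, 0, 0],
        [1, 0, 0, 0, 0]]
star_degrees = [sum(row) for row in star]
tau_c_degree = threshold_degree_based(star_degrees)   # <k> = 1.6, <k^2> = 4
tau_c_spectral = threshold_spectral(star)             # rho(A) = 2 for this star
```

For this star, the degree-based estimate and the spectral threshold differ, which illustrates that the coarser models only approximate the threshold obtained from the full adjacency matrix.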
2.1.3 Stochastic Models
In this subsection, we present several well-known stochastic epidemic models based on SIS compartment models. The counterparts for the SIR type of epidemics can be formulated similarly. First, we consider stochastic epidemic models without network structures, which we refer to as stochastic population models.
Consider a population of $N$ individuals. Recall that the rate at which an infected individual infects someone else is denoted by $\beta$. Each infected individual recovers and becomes susceptible at rate $\delta$. Assume that individuals encounter each other uniformly at random from the whole population. Let $X(t)$ be the number of infected individuals at time $t$. Then, $X(t)$ is a Markov jump process with state space $\{0, 1, \ldots, N\}$, with transition rates $\beta n (N - n)/N$ from state $n$ to state $n + 1$, and $\delta n$ from state $n$ to state $n - 1$. The transition probabilities over a short interval of length $h$ are hence given as
(6)  $\mathbb{P}(X(t+h) = n+1 \mid X(t) = n) = \beta \frac{n (N - n)}{N} h + o(h), \qquad \mathbb{P}(X(t+h) = n-1 \mid X(t) = n) = \delta n h + o(h).$
The stochastic population model (6) does not consider network structures and assumes that individuals encounter each other uniformly. It is clear that the stochastic population model (6) is a Markov jump process with an absorbing state $X = 0$, i.e., the state where no individual is infected. When the process enters the absorbing state, the infection dies out. Many studies of stochastic population models focus on the conditions under which the infection will die out.
Another common goal in the studies of stochastic epidemic models is to determine the mean-field deterministic model associated with the stochastic model when the population becomes large. Indeed, as $N$ goes to infinity, the trajectory of the infected fraction $X(t)/N$ converges to the solution of the Kermack and McKendrick deterministic model (1), i.e., $X(t)/N \to I(t)$ almost surely. The proof of the convergence is underpinned by the well-known Kurtz’s theorem. One can refer to Section 5.3 of draief2009epidemics for more details.
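The mean-field convergence described above can be checked empirically with a Gillespie-type simulation. The sketch below is a minimal illustration, assuming that with n infected individuals out of N the infection events occur at rate beta * n * (N - n) / N and recovery events at rate delta * n. The function name `gillespie_sis_population`, the population size, the seed, and the rate values are hypothetical choices for the example only.

```python
import random


def gillespie_sis_population(n_total, beta, delta, infected0, t_end, seed=0):
    """Simulate the stochastic population SIS model and return the number
    of infected individuals at time t_end (0 if the epidemic died out)."""
    rng = random.Random(seed)
    t, infected = 0.0, infected0
    while infected > 0:
        up = beta * infected * (n_total - infected) / n_total   # n -> n + 1
        down = delta * infected                                 # n -> n - 1
        total = up + down
        t += rng.expovariate(total)        # exponential waiting time
        if t > t_end:
            break
        if rng.random() < up / total:
            infected += 1
        else:
            infected -= 1
    return infected


N = 2000
final_infected = gillespie_sis_population(N, beta=0.5, delta=0.2,
                                          infected0=20, t_end=60.0)
final_fraction = final_infected / N
mean_field_level = 1.0 - 0.2 / 0.5   # endemic level of the deterministic model
```

For a population of this size, the simulated infected fraction is expected to fluctuate around the deterministic endemic level, with fluctuations that shrink as the population grows.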
We now introduce the stochastic network models that incorporate network structures. Define $X_i(t) \in \{0, 1\}$ as the epidemic state of individual $i$. The individual is infected if $X_i(t) = 1$, and healthy (susceptible) if $X_i(t) = 0$. Infected individuals recover at rate $\delta$, while susceptible individuals become infected at rate $\beta \sum_{j=1}^{N} a_{ij} X_j(t)$, i.e., the product of the infection rate and the number of infected neighbors. Let $X(t) = (X_1(t), \ldots, X_N(t))$ be the epidemic state of the whole population. The SIS stochastic network model can be expressed by the following Markov process:
(7)  $X_i: 0 \to 1 \ \text{at rate } \beta \sum_{j=1}^{N} a_{ij} X_j(t), \qquad X_i: 1 \to 0 \ \text{at rate } \delta,$
for $i = 1, \ldots, N$. The stochastic network model (7) has $2^N$ states, among which there is an absorbing state (i.e., a state where no individual is infected) reachable from any state with nonzero probability. Hence, this model indicates that the epidemic will die out almost surely in finite time. A more meaningful way to describe the spreading process is through probabilistic quantities. Indeed, the probability that the epidemic has not died out at time $t$ is bounded as follows draief2009epidemics
(8)  $\mathbb{P}(X(t) \neq 0) \leq \sqrt{N \| X(0) \|_1} \, \exp\big( (\beta \rho(A) - \delta) t \big),$
where $A$ is the adjacency matrix of the underlying network and $\rho(A)$ is its spectral radius. From (8), how fast the epidemic dies out depends on the effective spreading rate $\tau = \beta / \delta$ and the spectral radius of the underlying network. Let $T$ denote the time to absorption (the time when the epidemic dies out). An application of (8) leads to the following result regarding the expected extinction time of the epidemic draief2009epidemics: given an arbitrary initial condition $X(0)$, if $\tau < 1/\rho(A)$,
(9)  $\mathbb{E}[T] \leq \frac{\log N + 1}{\delta - \beta \rho(A)}.$
If the effective spreading rate is sufficiently large, the expected extinction time increases exponentially as the number of individuals increases, at a rate that depends on the effective spreading rate and the network structure ganesh2005effect. Loosely speaking, the spectral radius quantifies “how tightly the underlying network is connected”. The results (8) and (9) align well with the intuition that it is easier for an infectious disease to grow on a more tightly connected network. Letting the time increment go to zero, the dynamics of $\mathbb{E}[X_i(t)]$ can be written as
(10)  $\frac{d}{dt} \mathbb{E}[X_i(t)] = \beta \sum_{j=1}^{N} a_{ij} \mathbb{E}[X_j(t)] - \beta \sum_{j=1}^{N} a_{ij} \mathbb{E}[X_i(t) X_j(t)] - \delta \mathbb{E}[X_i(t)].$
Alleviating the complication of the term $\mathbb{E}[X_i(t) X_j(t)]$ by assuming $\mathbb{E}[X_i(t) X_j(t)] = \mathbb{E}[X_i(t)] \mathbb{E}[X_j(t)]$ for all $i \neq j$, one can recover from (10) the intertwined deterministic model (4), which we restate below:
$\dot{p}_i(t) = \beta (1 - p_i(t)) \sum_{j=1}^{N} a_{ij} p_j(t) - \delta p_i(t), \quad p_i(t) = \mathbb{E}[X_i(t)].$
Since $\mathbb{E}[X_i(t)] = \mathbb{P}(X_i(t) = 1)$, $p_i(t)$ represents the probability that individual $i$ is infected at time $t$. Since $\mathbb{E}[X_i(t) X_j(t)] = \mathbb{E}[X_i(t)] \mathbb{E}[X_j(t)]$ is not necessarily true, the intertwined deterministic model (4) serves as an approximation of the stochastic network model (7). Indeed, it is shown that the expected values in (4) are upper bounds on the actual probabilities given by (7) van2008virus; cator2012second; nowzari2016analysis; pare2017epidemic. Furthermore, van2015accuracy investigates how accurate the deterministic approximations (4) are in describing the stochastic model (7).
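The extinction results discussed above can be evaluated numerically once the spectral radius of the network is known. The sketch below is a hedged illustration, assuming that the survival-probability bound has the form sqrt(N * ||X(0)||_1) * exp((beta * rho(A) - delta) * t) and that the expected-extinction-time bound has the form (log N + 1) / (delta - beta * rho(A)), which is how such results are commonly stated for SIS processes on undirected networks. The function names and the numerical values are hypothetical.

```python
import math


def survival_probability_bound(n, infected0, beta, delta, rho, t):
    """Upper bound on the probability that the epidemic is still alive at time t.

    n         : number of individuals
    infected0 : number of initially infected individuals
    rho       : spectral radius of the adjacency matrix
    """
    bound = math.sqrt(n * infected0) * math.exp((beta * rho - delta) * t)
    return min(1.0, bound)   # a probability never exceeds 1


def expected_extinction_bound(n, beta, delta, rho):
    """Upper bound on the expected extinction time; needs beta * rho < delta."""
    if beta * rho >= delta:
        raise ValueError("the bound requires beta * rho < delta")
    return (math.log(n) + 1.0) / (delta - beta * rho)


# Ring of 6 individuals: every node has degree 2, so rho(A) = 2.
t_bound = expected_extinction_bound(n=6, beta=0.05, delta=0.2, rho=2.0)
p_alive = survival_probability_bound(n=6, infected0=1, beta=0.05,
                                     delta=0.2, rho=2.0, t=100.0)
```

Both quantities tighten as the gap between delta and beta * rho(A) widens, in line with the intuition that a less tightly connected network or a lower effective spreading rate leads to faster extinction.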
So far, we have introduced the main stochastic and deterministic epidemic models that are well studied in the literature. To offer an overview of these models and their connections, in Fig. 2, we present an illustrative taxonomy of the epidemic models introduced so far. Apart from models that are based on Markov processes and ODEs, there are epidemic models based on stochastic differential equations allen2007modeling; gray2011stochastic, purely data-driven approaches chimmula2020time, or spatial modeling rhee2011levy; valler2011epidemic; huang2016epidemic; possieri2019mathematical. With a basic understanding of the epidemic spreading processes and their stochastic and deterministic modeling, the next section introduces how decisions can be made based on these models and how decision-making can in turn affect the spreading processes.
2.2 Decision Models
The epidemic models introduced in the previous subsection describe the spreading process when there is no human intervention. In practice, humans take immediate and drastic responses depending on the severity of the infection. Human intervention has a non-negligible impact on the spreading processes. Hence, the modeling of infectious diseases needs to take it into consideration to provide a consolidated understanding of the epidemic spreading processes in the human population. We introduce a holistic framework that incorporates epidemic models and decision models, illustrated in Fig. 2.
2.2.1 What Interventions to Take?
Decision models in epidemic spreading mainly include two types of decision-makers: a central planner xu2015competition; yang2021modeling; preciado2013optimal; kohler2020robust; ansumali2020modelling; parino2021Modelling; shen2014differential; pejo2020corona; dashtbali2020optimal; khouzani2011saddle; aurell2020optimal; pezzutto2021smart; hota2020closed; watkins2016optimal; ogura2016efficient; hota2016optimal; li2017minimizing; zhao2018virus; li2019suboptimal; watkins2019robust; zhang2019differential; di2020covid; huang2020differential; nowzari2016analysis; chen2021optimal; wang2020modeling; mai2018distributed; xue2018distributed, which works for social benefits by conducting mechanism design and/or applying enforceable measures directly to the general public, and a collection of selfish and selfless individuals trajanovski2015decentralized; reluga2010game; theodorakopoulos2012selfish; hota2016interdependent; hota2019game; huang2020differential; aurell2020optimal; adiga2016delay; zhang2013braess; dashtbali2020optimal; eksin2019control; hayel2017epidemic; bauch2004vaccination; feng2016epidemic; feng2016epidemicPA; huang2019achieving; breban2007mean; trajanovski2017designing who work toward their own goals. Decision-makers can take non-pharmaceutical interventions (NPIs), pharmaceutical interventions, or both. For a central planner, non-pharmaceutical interventions include but are not limited to requiring mandatory social distancing, enforcing lockdown, quarantining infected individuals, and deploying protective resources such as masks, gloves, gowns, and testing kits. Pharmaceutical interventions are related to the availability of vaccines or antidotes. A central planner’s decision may involve vaccine distribution, antidote allocation, treatment prioritization, etc. For individuals, possible non-pharmaceutical interventions are wearing a mask, practicing social distancing, self-quarantine, etc. Pharmaceutical interventions, such as getting vaccinated, seeking treatment, or securing an antidote, are usually adopted by individuals to protect themselves in the epidemic.
2.2.2 How are Interventions Modeled?
To understand the coupling between the decision models and the epidemic models, we need to figure out how nonpharmaceutical and pharmaceutical interventions can be modeled and incorporated into the epidemic models. Generally, nonpharmaceutical interventions help curb the spreading by either reducing the interaction between individuals (e.g., avoiding crowds, social distancing, lockdown, and quarantine) or utilizing protective resources (e.g., wearing a mask and frequent use of hand sanitizer).
In networked epidemic models such as the intertwined deterministic model (4) and the stochastic network model (7), the interaction between individuals is usually captured by the network topology. In an unweighted network, the adjacency element $a_{ij} = 1$ means that there exist interactions between individual $i$ and individual $j$. Otherwise, $a_{ij} = 0$. Some existing works use the adjacency elements to describe strict measures that completely cut off the interaction between individuals, such as lockdown and quarantine watkins2019robust; xue2018distributed; eksin2017disease. For example, if individual $i$ is quarantined, then $a_{ij} = 0$ for all $j$. To capture the effect of soft measures such as social distancing, several works have considered weighted networks to describe the intensity of interactions pezzutto2021smart; huang2020differential. For example, huang2020differential uses a weight coefficient $w_{ij}$ to describe the intensity of the interaction between individual $i$ and individual $j$. In epidemic models that do not capture the complete topology (e.g., the Kermack and McKendrick model (1), the Kephart and White model (2), the model of Pastor-Satorras et al. (3), and the stochastic population model (6)), the reduced interaction between individuals is modeled by a scaled infection rate $\eta \beta$, where $\eta$ is the scaling factor and $\beta$ is the normal infection rate when there is no intervention pejo2020corona; reluga2010game; kohler2020robust; ansumali2020modelling; parino2021Modelling; dashtbali2020optimal. For example, in kohler2020robust, $\eta$ captures how well people practice social distancing. The better people practice social distancing, the smaller the factor $\eta$ is. An alternative way of modeling interventions such as lockdown and quarantine is to create a new compartment erdem2017mathematical; li2019suboptimal; yang2021modeling. In li2019suboptimal, the central planner decides the number of infected individuals to be quarantined to curb the spreading. The authors introduce a new compartment called ‘Quarantine’ to model infected individuals being selected for quarantine. Once individuals are quarantined, they will not infect susceptible individuals and will reenter the Susceptible compartment after recovery.
Utilizing protective resources such as wearing a mask and using hand sanitizer helps individuals protect themselves from infection without reducing their interaction with other individuals. Such interventions are also captured by a scaled infection rate $\eta \beta$ ogura2016efficient; nowzari2016analysis with $0 \leq \eta \leq 1$, which describes the fact that, when contacting infected individuals while wearing a mask, susceptible individuals are less likely to be infected.
Pharmaceutical interventions can be adopted when vaccines, antidotes, or effective treatment methods are available. Many works have studied the effect of vaccination on the spreading process gubar2015two; preciado2013optimal; andersson2012stochastic; hota2019game; trajanovski2015decentralized; adiga2016delay; bauch2004vaccination; li2017minimizing; mahrouf2020non, which can be modeled in several ways. One way is to reduce the infection rate between an infectious and a vaccinated individual to $\beta_v$, while those who are not vaccinated suffer a higher infection rate $\beta > \beta_v$ andersson2012stochastic; preciado2013optimal; hota2019game; adiga2016delay; bauch2004vaccination; li2017minimizing. How small $\beta_v$ is depends on the efficacy of the vaccine distributed. Another way is to create a new compartment called ‘Vaccinated’, where the rate at which individuals exit this compartment captures the protection duration of the vaccines mahrouf2020non; abouelkheir2019optimal. The use of antidotes and the deployment of mass treatment accelerate the recovery process. As a result, they are usually modeled by a recovery rate higher than the natural recovery rate chen2021optimal.
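The modeling devices described in this subsection amount to simple modifications of the epidemic model's parameters. The sketch below illustrates three of them under stated assumptions: quarantine removes all links of an individual from the adjacency matrix, social distancing or mask wearing scales the infection rate by a factor between 0 and 1, and treatment raises the recovery rate. The function names, the 4-node ring, the rate values, and the use of the homogeneous SIS steady state max(0, 1 - delta/beta) as an outcome measure are illustrative choices, not taken from the cited works.

```python
def quarantine(adj, node):
    """Strict measure: remove every link of `node` from the adjacency matrix."""
    n = len(adj)
    return [[0 if node in (i, j) else adj[i][j] for j in range(n)]
            for i in range(n)]


def scaled_infection_rate(beta, eta):
    """Soft measure: scale the nominal infection rate by a factor eta in [0, 1]."""
    if not 0.0 <= eta <= 1.0:
        raise ValueError("eta must lie in [0, 1]")
    return eta * beta


def steady_state_infection(beta, delta):
    """Endemic level of the homogeneous SIS model: max(0, 1 - delta/beta)."""
    if beta <= delta:
        return 0.0
    return 1.0 - delta / beta


beta, delta = 0.5, 0.2                      # nominal (hypothetical) rates

# Quarantining individual 0 on a 4-node ring isolates it from both neighbors.
ring = [[0, 1, 0, 1], [1, 0, 1, 0], [0, 1, 0, 1], [1, 0, 1, 0]]
ring_q = quarantine(ring, 0)

no_intervention = steady_state_infection(beta, delta)
with_distancing = steady_state_infection(scaled_infection_rate(beta, 0.3), delta)
with_treatment = steady_state_infection(beta, 2.0 * delta)   # faster recovery
```

In this toy setting, strong distancing pushes the scaled infection rate below the recovery rate and eliminates the endemic state, while doubling the recovery rate lowers the endemic level without eliminating it.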
2.2.3 When are Interventions Taken?
Time plays a significant role in the coupling between the epidemic spreading processes and the decision models. Depending on when decisions are made, decision-making can be categorized into pre-epidemic decision-making hossain2020explainable; herrera2016disease; sparks2011optimal; hota2016optimal; nowzari2016analysis; preciado2013optimal; blume2013network; trajanovski2017designing, during-epidemic decision-making mai2018distributed; pejo2020corona; saha2014equilibria; bauch2004vaccination; hota2016interdependent; hota2019game; li2017minimizing; xue2018distributed; chen2021optimal; hota2020impacts; shen2014differential; eksin2019control; akhil2019mean; dashtbali2020optimal; eksin2017disease; reluga2010game; khouzani2011saddle; adiga2016delay; huang2020differential; di2020covid; farhadi2019efficient; hota2020closed; pezzutto2021smart; watkins2019robust; theodorakopoulos2012selfish; nowzari2016analysis; li2019suboptimal; breban2007mean; liu2012impact; zhang2013braess; trajanovski2015decentralized, and post-epidemic decision-making xue2018distributed; huang2020differential; patro2020towards; bieck2020redirecting. Pre-epidemic decision-making refers to the decisions made before an epidemic happens or at the beginning of the pandemic. Most works focus on network design/formation problems in which a virus-resistant network with guaranteed performance is designed/formed hota2016optimal; nowzari2016analysis; blume2013network; trajanovski2017designing. Some works study the optimal design of an epidemic surveillance system on complex networks that helps detect an epidemic at an early stage hossain2020explainable; herrera2016disease; sparks2011optimal. Early epidemic detection allows a central planner to stop the spreading in its infancy. Post-epidemic decision-making happens at the very end of an epidemic or after an epidemic dies out. Post-epidemic decision-making addresses problems such as how to safely lift restrictions and how to reverse the interventions taken during the epidemic season xue2018distributed; huang2020differential; patro2020towards; bieck2020redirecting.
This review focuses on during-epidemic decision-making problems, where decisions are made while an epidemic is present. The dynamics of the epidemic spreading processes and the dynamics of the decision adaptation may evolve at different time scales. Some interventions such as getting vaccinated (if one obtains lifelong protection from the vaccine), taking an antidote, and distributing curing resources are irreversible mai2018distributed; bauch2004vaccination; hota2016interdependent; hota2019game; trajanovski2015decentralized. For such interventions, decisions are made only once. In other works, authors assume that decisions are made once and for all and remain fixed over the spreading process, or that decisions are made at a much lower frequency than the pace of the epidemic spreading process pejo2020corona; saha2014equilibria; chen2021optimal. These assumptions and the irreversibility of some interventions make it possible to formulate a static optimization mai2018distributed; chen2021optimal or game bauch2004vaccination; hota2016interdependent; hota2019game; trajanovski2015decentralized; pejo2020corona; saha2014equilibria problem to study the decision-making during the epidemic spreading. The objective functions of these static optimization or game problems only involve the limiting behavior of the epidemic models (1)-(7), e.g., the infection level at equilibria (when the epidemic model reaches a steady state) trajanovski2015decentralized; chen2021optimal; bauch2004vaccination; saha2014equilibria, or whether the threshold condition is met (i.e., whether the epidemic will eventually die out) nowzari2016analysis.
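As an illustration of an objective that depends only on limiting behavior, the sketch below checks a threshold condition for a networked SIS model. It assumes the classical mean-field result for homogeneous rates, namely that the epidemic dies out when the ratio of infection rate to recovery rate is below the reciprocal of the largest eigenvalue of the adjacency matrix; whether this matches the exact condition of models (1)-(7) is an assumption. The function names are hypothetical.

```python
def spectral_radius(adj, iters=500):
    """Largest eigenvalue of a symmetric 0/1 adjacency matrix.

    Power iteration is run on (A + I) so that bipartite graphs, whose
    spectrum is symmetric around zero, do not make the iteration oscillate.
    """
    n = len(adj)
    v = [1.0] * n
    lam = 1.0
    for _ in range(iters):
        w = [v[i] + sum(adj[i][j] * v[j] for j in range(n)) for i in range(n)]
        lam = max(w)              # max-norm of (A + I) v
        v = [x / lam for x in w]  # renormalize so that max(v) == 1
    return lam - 1.0              # undo the shift by the identity


def dies_out(adj, beta, delta):
    """Assumed threshold test: beta / delta < 1 / lambda_max(A)."""
    return beta / delta < 1.0 / spectral_radius(adj)


if __name__ == "__main__":
    complete4 = [[0 if i == j else 1 for j in range(4)] for i in range(4)]
    print(spectral_radius(complete4))     # complete graph on 4 nodes
    print(dies_out(complete4, 0.1, 0.5))  # weak infection
    print(dies_out(complete4, 0.3, 0.5))  # strong infection
```

A static network design or resource allocation problem can then use `dies_out` as a constraint, without simulating the transient dynamics.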
If the epidemic spreading processes and the decision models evolve at the same time scale, adaptive strategies are employed in which the decisions are adapted at the same pace as the epidemic propagates. For example, a central planner distributes testing kits based on the daily infection data. This creates a real-time feedback loop in the human-in-the-loop epidemic framework shown in Fig. 1. In this case, individuals decide on interventions such as whether to wear a mask, practice social distancing, or stay self-quarantined based on currently perceived information; the central planner adapts his/her interventions according to the observed infection status of the whole population in real time. The strategies of individuals and the central planner are represented by maps from the information they have received so far to actions that describe the interventions being taken. Depending on the choice of epidemic models, researchers employ different tools to study the human-in-the-loop epidemic framework depicted in Fig. 1. When deterministic models are employed to describe the epidemic spreading processes, optimal control dashtbali2020optimal; di2020covid; hota2020closed; watkins2019robust; nowzari2016analysis; li2019suboptimal, differential games huang2020differential; shen2014differential; dashtbali2020optimal; reluga2010game; khouzani2011saddle, or evolutionary game theory theodorakopoulos2012selfish have been applied to study the human-in-the-loop epidemic framework. When stochastic models are used, the human-in-the-loop epidemic framework is often modeled by Markov decision processes pezzutto2021smart; yaesoubi2011generalized; gast2012mean and stochastic games hota2020impacts; eksin2019control; akhil2019mean; eksin2017disease. Some research works consider seasonal epidemics, where decisions are made once in each epidemic season. People adapt their decisions based on the payoff of the prior seasonal epidemic. This type of decision-making problem under seasonal epidemics is solved by analyzing repeated games breban2007mean; liu2012impact or evolutionary games liu2012impact.
2.2.4 Who are the Decision-Makers?
In the human-in-the-loop epidemic framework, the decision-makers usually include a central planner that represents the central authority and individuals that represent the general public huang2020differential. The central planner can be an effective central government fighting against an epidemic, or a network operator whose users are obliged to abide by the company security policy. An individual can be an individual citizen of a society van2008virus, a local community fall2007epidemiological, or a user in a computer network zhao2018virus.
A central planner cares about the welfare of the whole population, such as the number of infected individuals in the entire population and the well-being of the economy xu2015competition; yang2021modeling; preciado2013optimal; kohler2020robust; ansumali2020modelling; parino2021Modelling; shen2014differential; pejo2020corona; dashtbali2020optimal; khouzani2011saddle; aurell2020optimal; pezzutto2021smart; hota2020closed; watkins2016optimal; ogura2016efficient; hota2016optimal; li2017minimizing; zhao2018virus; li2019suboptimal; watkins2019robust; zhang2019differential; di2020covid; huang2020differential; nowzari2016analysis; chen2021optimal; wang2020modeling; mai2018distributed; xue2018distributed. Individuals are concerned with their own interests, which include their own infection risk, the inconvenience of wearing a mask, and the monetary cost of getting an effective treatment trajanovski2015decentralized; reluga2010game; theodorakopoulos2012selfish; hota2016interdependent; hota2019game; huang2020differential; aurell2020optimal; adiga2016delay; zhang2013braess; dashtbali2020optimal; eksin2019control; hayel2017epidemic; bauch2004vaccination; feng2016epidemic; feng2016epidemicPA; huang2019achieving; breban2007mean; trajanovski2017designing. Due to the selfishness of the individuals, the goal of an individual is not well aligned with, and sometimes conflicts with, the goal of a central planner huang2020differential; eksin2017disease. For example, infected individuals, who no longer face any infection risk, might be reluctant to take preemptive measures to avoid spreading the disease eksin2017disease. It is shown that the number of infected individuals increases if individuals optimize for their own benefits instead of complying with the rules applied by the central planner huang2020differential. Indeed, enforcing a strict protocol can be costly and sometimes impossible for the central planner. 
As a result, central authorities should ask themselves whether they can offer the public incentives that are acceptable to the individuals and sufficiently strong to combat the epidemic. A recent example is that, to reach herd immunity, the state of Ohio in the United States offered million-dollar prizes in a COVID-19 vaccine lottery to combat hesitancy about getting a COVID-19 vaccine dareh2021vaccinated. Hence, instead of solving an optimization problem or a game problem directly, some works have looked into the mechanism design problem or the information design problem zhang2021informational, on behalf of the central planner, which incorporates both the global state of the whole population and the individuals’ choices into designing incentives to combat the epidemic aurell2020optimal; huang2020differential; farhadi2019efficient; pejo2020corona; breban2007mean; li2017minimizing; omic2009protecting.
2.2.5 Information Matters
Decision-makers rely on the information they have to make decisions. The information available to decision-makers at the time when the decision is made plays a crucial role in the human-in-the-loop epidemic framework di2020covid. For example, the severity of COVID-19 infection, the perception of government responses, media coverage, and the acceptance of COVID-19-related conspiracy theories all changed people’s attitudes toward wearing masks during COVID-19 rieger2020german. Many works assume that perfect information, including the health status of every individual and complete knowledge of the network topology, is available to decision-makers at all times eksin2019control; shen2014differential; dashtbali2020optimal; farhadi2019efficient; eksin2017disease; hota2019game; adiga2016delay; nowzari2016analysis; li2019suboptimal. However, in epidemics, acquiring perfect, accurate, and timely information regarding the spreading process is arduous, if not impossible bhattacharyya2010game. For example, obtaining an estimate of the number of infected individuals requires testing at scale, which can be challenging to implement in rural areas mercer2021testing. Also, testing results can be delayed due to a high testing demand and a long sample analysis time. Many works study decision-making based on a disease prevalence estimated from available data hota2020impacts; pezzutto2021smart; watkins2019robust and the effect of delayed information on decision-making zhu2019stability.
Individuals can receive information from mass media (global broadcasters) such as TV, radio, newspapers, and official accounts on social media, and/or from local contacts such as friends, family members, and connections on social media. During an epidemic, individuals may receive two levels of information: one is statistical information that describes the overall prevalence of the epidemic, such as the number of positive cases, the number of hospitalized patients, and the death toll; the other is local information, such as whether people with close connections are infected and the risk level in one’s neighborhood lagos2020games; granell2014competing. The works wang2020epidemic; granell2014competing; funk2009spread investigate how word-of-mouth information spreading affects individuals’ behavior and hence alters the spreading processes.
Individuals may suffer from inaccurate information from unreliable sources. An example would be information obtained from social media or by word of mouth in a spatially or culturally isolated community or neighborhood. The perceived information of an individual may not necessarily reflect the actual prevalence of the infectious disease. Such incomplete or biased information about the epidemic, together with strong prior beliefs, may impede individuals from taking rational and reasonable responses to protect themselves and others reluga2010game. Information released by central authorities also plays a significant role in individuals’ decision-making. Responsible central authorities should therefore not only fight against epidemic-related misinformation but also conduct information design to curb the spread of the epidemic and even the spread of panic.
3 Game-Theoretic Decision-Making in Epidemics
Game theory, in a nutshell, is a powerful mathematical tool for modeling how people make strategic decisions within a group fudenberg1991game; bacsar1998dynamic. In the last decade, there has been a surge in research works on game-theoretic decision-making amid an epidemic theodorakopoulos2012selfish; hota2016interdependent; hota2016optimal; li2017minimizing; hota2019game; zhang2019differential; huang2020differential; khouzani2011saddle; aurell2020optimal; adiga2016delay; zhang2013braess; dashtbali2020optimal; eksin2017disease; eksin2019control; pejo2020corona; hayel2017epidemic; bauch2004vaccination; shen2014differential; huang2019achieving; breban2007mean; reluga2010game; trajanovski2015decentralized; xu2015competition; saha2014equilibria; hota2020impacts; farhadi2019efficient; liu2012impact; trajanovski2017designing; omic2009protecting; bhattacharyya2010game; lagos2020games; chang2020game; amaral2021epidemiological; ibuka2014free; reluga2006evolving. The reason behind the surge is threefold.
First, centralized decision-making becomes less practical for large-scale networked systems such as human contact networks and most computer networks. Computing centralized protection strategies faces the challenge of scalability when they are applied to very large networks mai2018distributed; xue2018distributed; hota2016interdependent. Also, it requires a high level of information granularity for a central authority to make satisfactory centralized decisions for most individuals. The central authority has to gather a huge amount of local information, which not only is challenging to implement but also creates privacy issues and management overheads. In contrast, decentralized decision-making is more reliable and practical since local entities decide their own protection strategies while satisfying high-level guidelines provided by the central authority. Second, self-interested individuals in the midst of the epidemic might not be willing to comply with the suggested protocols aurell2020optimal; bauch2004vaccination. This is because, as we have explained in Section 2.2.4, individual interests are misaligned with societal interests: individuals care less about the societal interests that are the major concerns of the central authority. There also exist misaligned interests and interdependencies among the individuals themselves. Each individual has choices, but the payoff for each choice depends on the choices made by others. Third, game theory, as a mature and broad field, provides a plethora of useful solution concepts and analytical techniques that can model and explain human decision-making. For example, a profile of self-interested strategies from which no individual can improve his/her payoff by deviating unilaterally is called a Nash equilibrium in game theory bacsar1998dynamic. Through a Stackelberg game framework, a central authority can design incentives for the public to combat the epidemic aurell2020optimal. 
Many infectious disease models do not incorporate human behaviors that change as the epidemic evolves and as information spreads over the network. Dynamic game theory, which has been applied in many dynamic settings such as management science bagagiolo2014mean, labor economics liu2020stochastic, and cybersecurity huang2020dynamic, delivers a powerful paradigm to capture dynamic human behaviors bauch2004vaccination. We start the introduction of game-theoretic models in epidemics by presenting a taxonomy in the next section.
3.1 A Multi-Dimensional Taxonomy of Game-Theoretic Models in Epidemics
The synthesis of game-theoretic models and epidemic models is rooted in the coupling between decision models and epidemic models, which we introduce in Section 2.2. The choice of game-theoretic models depends on multiple factors such as who the decision-makers are, what interventions decision-makers can take and when, and what information decision-makers know. The existing literature mainly considers the following five types of games: static games hota2018game; hota2019game; hota2016optimal; huang2019achieving; omic2009protecting; pejo2020corona; saha2014equilibria; trajanovski2015decentralized; trajanovski2017designing; xu2015competition, discrete-time stochastic games eksin2019control; eksin2017disease; lagos2020games, differential games reluga2010game; aurell2020optimal; dashtbali2020optimal; huang2019achieving; huang2020differential; khouzani2011saddle; shen2014differential, repeated games adiga2016delay; breban2007mean; li2017minimizing; huang2019game, and evolutionary games hayel2017epidemic; amaral2021epidemiological; zhang2013braess; reluga2006evolving; poletti2009spontaneous.
In static game frameworks, researchers have incorporated the epidemic models by only considering the limiting behavior of these models hota2019game; hota2016optimal; omic2009protecting; saha2014equilibria; trajanovski2015decentralized; trajanovski2017designing; xu2015competition. Here, we use omic2009protecting as an example. In omic2009protecting, Omic et al. capture the risk of infection using the limiting behavior of the heterogeneous intertwined deterministic model (5). Consider the steady state of model (5) for each individual. Setting the time derivatives in (5) to zero, one obtains
(11) 
In omic2009protecting, each individual decides its own recovery rate, e.g., by seeking treatment or taking antidotes, to optimize the trade-off between the overhead invested in recovery and the penalty of infection. This creates a game among the individuals in which each player’s goal is to minimize its own cost. The coupling between individuals’ strategies is captured by the limiting behavior of the epidemic model (11). Static game frameworks consider once-and-for-all interventions that cannot be revoked and are concerned with long-term outcomes such as the infection risk at the steady state or whether the disease will die out eventually.
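To make the structure of such a static game concrete, the following sketch runs best-response dynamics over a grid of recovery rates. It is only an illustration under stated assumptions: the steady state is computed for the classical N-intertwined mean-field SIS model with a common infection rate `beta`, which may differ in form from model (5) and equation (11); the linear cost `c * delta_i + p_i` and all function names are hypothetical and are not taken from omic2009protecting.

```python
def steady_state(adj, beta, deltas, iters=300):
    """Fixed-point iteration for the assumed steady state
    p_i = beta * s_i / (beta * s_i + delta_i), with s_i = sum_j a_ij * p_j.
    Starting from p = 1 targets the endemic (largest) fixed point."""
    n = len(adj)
    p = [1.0] * n
    for _ in range(iters):
        s = [sum(adj[i][j] * p[j] for j in range(n)) for i in range(n)]
        p = [beta * s[i] / (beta * s[i] + deltas[i]) for i in range(n)]
    return p


def cost(i, adj, beta, deltas, c):
    """Hypothetical cost of player i: recovery overhead plus infection risk."""
    return c * deltas[i] + steady_state(adj, beta, deltas)[i]


def best_response_dynamics(adj, beta, grid, c, rounds=20):
    """Players take turns choosing the recovery rate in `grid` (all > 0)
    that minimizes their own cost given the others' current choices.
    Returns the final profile and whether a full round passed unchanged."""
    n = len(adj)
    deltas = [grid[0]] * n
    for _ in range(rounds):
        changed = False
        for i in range(n):
            best = min(
                grid,
                key=lambda d: cost(i, adj, beta, deltas[:i] + [d] + deltas[i + 1:], c),
            )
            if best != deltas[i]:
                deltas[i] = best
                changed = True
        if not changed:
            return deltas, True  # no player wants to deviate on the grid
    return deltas, False


if __name__ == "__main__":
    ring = [[0, 1, 0, 1], [1, 0, 1, 0], [0, 1, 0, 1], [1, 0, 1, 0]]
    print(best_response_dynamics(ring, 0.5, [0.2, 0.4, 0.6, 0.8, 1.0], c=0.5))
```

When the iteration stops with the flag set to true, the returned profile is a pure Nash equilibrium of the grid-restricted game; convergence is not guaranteed in general, which is why the flag is reported.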
Discrete-time stochastic games and differential game frameworks are introduced to capture the transient behavior of the epidemic process and to model adaptive interventions. The difference between discrete-time stochastic game frameworks and differential game frameworks lies in the choice of epidemic models. Discrete-time stochastic game frameworks are built upon stochastic epidemic models such as the stochastic population model (6) and the stochastic network model (7) eksin2019control; eksin2017disease; lagos2020games. Differential game frameworks rely on deterministic epidemic models or stochastic epidemic models that use stochastic differential equations to describe the dynamics of the spreading processes reluga2010game; aurell2020optimal; dashtbali2020optimal; huang2019achieving; huang2020differential; khouzani2011saddle; shen2014differential. Characterizing a Nash equilibrium over the whole horizon is prohibitive for discrete-time stochastic games when the number of individuals increases or the number of stages becomes large. Even structural results are difficult to obtain. Hence, in eksin2019control; eksin2017disease, the authors introduce a concept called myopic Markov perfect equilibrium (MMPE). The solution concept MMPE assumes that individuals maximize their current utility given the state of the disease, ignoring their future risks of infection and/or future costs of taking interventions in their current decision-making. This is a reasonable assumption considering the computational complexity of accounting for future states of the disease during an epidemic. lagos2020games also adopts a similar solution concept where only the current state and the next state of one’s infection are considered. For differential game frameworks, the equilibrium can be calculated using the general methods of Isaacs isaacs1999differential. Using Pontryagin’s maximum principle, reluga2010game studied the differential game of social distancing and the spreading of an epidemic under the equilibrium; Khouzani et al. khouzani2011saddle found the optimal way of disseminating security patches in wireless networks to combat the spread of malware controlled by an adversary; Huang et al. huang2020differential characterized the optimal way of reducing connectivity to strike a balance between mitigating the virus and maintaining the economy.
Repeated game frameworks are used to model seasonal epidemics, which appear periodically breban2007mean; li2017minimizing. Interventions are taken repeatedly, once in each epidemic season. For example, to protect oneself from influenza, one needs to get a flu vaccine each flu season due to the mutation of the virus or the limited protection time of a vaccine. Individuals adapt their behavior based on the cost/payoff incurred in the last season. Unlike differential game and discrete-time stochastic game frameworks, repeated game frameworks do not include the transient behavior of the spreading process adiga2016delay; huang2019game.
One common way for individuals to make decisions is to make ‘best-response’ decisions that give the best immediate or long-term payoff. This way of decision-making is adopted by differential game, stochastic game, and repeated game frameworks. Another is the use of ‘imitation’ dynamics, where individuals copy the behavior that is previously or currently most successful reluga2006evolving. The ‘imitation’ dynamics governing the time evolution of the fractions of strategies in the population are similar to the replicator dynamics of evolutionary game theory poletti2009spontaneous. Evolutionary game theory has been adopted by many previous works to model the human behavior of imitating other individuals’ successful strategies hayel2017epidemic; amaral2021epidemiological; zhang2013braess; reluga2006evolving; poletti2009spontaneous. In the evolutionary game framework for epidemic modeling, the strategy dynamics are coupled with the epidemic dynamics hayel2017epidemic; zhang2013braess; reluga2006evolving. The focus of evolutionary game frameworks is on the analysis of the coupled dynamics and the interpretation of the behavior of these dynamics amaral2021epidemiological. A detailed review of this branch of research can be found in chang2020game.
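The coupling between strategy dynamics and epidemic dynamics can be sketched as follows. The model is a stylized assumption rather than one taken from the cited works: a homogeneous SIS epidemic in which a fraction `x` of the population adopts a protective behavior that scales its infection rate by `eta`, and replicator-type imitation dynamics in which `x` grows when protection pays off. The parameters `c` (cost of protection), `loss` (cost of infection), and `kappa` (imitation speed) are hypothetical.

```python
def coupled_dynamics(beta, gamma, eta, c, loss=1.0, kappa=1.0,
                     x0=0.1, y0=0.01, dt=0.01, steps=20000):
    """Euler-integrate an assumed SIS model coupled with imitation dynamics.

    y: infected fraction, x: fraction adopting protection.
    Payoff advantage of protecting: avoided infection risk minus cost c.
    Returns the final pair (x, y).
    """
    x, y = x0, y0
    for _ in range(steps):
        beta_eff = beta * (1.0 - x * (1.0 - eta))    # population-average rate
        dy = beta_eff * (1.0 - y) * y - gamma * y     # epidemic dynamics
        advantage = (1.0 - eta) * beta * y * loss - c
        dx = kappa * x * (1.0 - x) * advantage        # replicator-type imitation
        x += dt * dx
        y += dt * dy
    return x, y


if __name__ == "__main__":
    print("free protection:  ", coupled_dynamics(0.6, 0.2, 0.2, c=0.0))
    print("costly protection:", coupled_dynamics(0.6, 0.2, 0.2, c=10.0))
```

In this sketch the prevalence `y` drives the adoption of protection, and adoption in turn lowers the effective infection rate, which is the feedback that evolutionary game frameworks analyze.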
Based on the interventions individuals adopt, game-theoretic models in epidemics can be categorized into vaccination games adiga2016delay; breban2007mean; li2017minimizing; saha2014equilibria; zhang2013braess; hota2019game, social distancing games aurell2020optimal; dashtbali2020optimal; huang2019achieving; huang2020differential; lagos2020games; pejo2020corona; reluga2010game, quarantine games hota2020impacts; amaral2021epidemiological, mask-wearing games pejo2020corona, etc. There are also works that study the adoption of preemptive interventions eksin2017disease; eksin2019control; zhang2013braess and interventions that change the recovery rate hota2018game; xu2015competition. Beyond human social networks, many works have studied interventions that can curb the spread of malware or viruses in computer networks or wireless communication networks huang2019game; hayel2017epidemic; huang2019achieving; huang2020differential; khouzani2011saddle; omic2009protecting; shen2014differential; trajanovski2015decentralized; trajanovski2017designing, such as network reforming trajanovski2015decentralized; trajanovski2017designing, installing security patches hayel2017epidemic; shen2014differential, and reducing the communication rate khouzani2011saddle; huang2020differential.
If all individuals practice social distancing, wear masks, and stick to stay-at-home orders, the risk of infection and the infection level of the whole population will be reduced significantly. However, there always exist trade-offs and temptations to defect from the regimen. Handwashing is tedious, wearing a mask is uncomfortable or annoying, and socializing is necessary. When it comes to getting a vaccine, people express concerns about safety and side effects. Most effective interventions share one characteristic: an individual who takes the intervention bears all of its cost or inconvenience, while everyone else in the population more or less benefits from his/her behavior. This characteristic creates the coupling between individuals. For example, one can enjoy empty streets and markets without having a higher risk of infection if most people stay at home. Those who choose not to get a vaccine effectively reap the benefits of reduced virus transmission contributed by the people who do opt for vaccination. This behavior is referred to as ‘free-riding’ behavior ibuka2014free. When a significant number of free riders appear, there will be a collective threat to containing the virus.
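The free-riding effect can be quantified in a stylized vaccination game. The sketch below assumes, purely for illustration and in the spirit of classical vaccination game models, that an unvaccinated individual's lifetime infection probability at coverage `p` is `1 - 1/(r0 * (1 - p))` (and zero above the herd immunity threshold), and that `relative_cost` is the cost of vaccination divided by the cost of infection. These formulas and names are assumptions, not results stated in this review.

```python
def herd_immunity_threshold(r0):
    """Coverage above which the disease cannot persist (assumed model)."""
    return max(0.0, 1.0 - 1.0 / r0)


def infection_probability(p, r0):
    """Assumed infection probability of an unvaccinated individual."""
    if p >= herd_immunity_threshold(r0):
        return 0.0
    return 1.0 - 1.0 / (r0 * (1.0 - p))


def nash_coverage(r0, relative_cost):
    """Coverage at which individuals are indifferent between vaccinating
    and free-riding, i.e., infection_probability(p, r0) == relative_cost."""
    if relative_cost >= infection_probability(0.0, r0):
        return 0.0  # vaccination is never worth its cost
    return 1.0 - 1.0 / (r0 * (1.0 - relative_cost))


if __name__ == "__main__":
    r0 = 4.0
    print("herd immunity threshold:", herd_immunity_threshold(r0))
    for r in (0.0, 0.25, 0.5, 0.8):
        print("relative cost", r, "-> equilibrium coverage", nash_coverage(r0, r))
```

Under these assumptions, any positive relative cost leaves the equilibrium coverage strictly below the herd immunity threshold, which is the gap created by free riders.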
Different interventions induce different costs and provide benefits in different ways. For example, wearing a mask gives immediate protection and is reversible, and hence induces an instantaneous cost and benefit. Vaccination creates longer protection yet induces an immediate cost, such as paying for the vaccine and experiencing side effects. Hence, different models are used for different interventions. Vaccination games are usually modeled by static games or repeated games due to the irreversibility of getting a vaccine adiga2016delay; breban2007mean; hota2019game; saha2014equilibria. Revocable interventions such as quarantine, social distancing, and wearing a mask are usually modeled by differential games or stochastic games aurell2020optimal; dashtbali2020optimal; huang2020differential; lagos2020games; reluga2010game.
Another criterion for classifying the literature is the players of the game. Most game-theoretic models in epidemics investigate the interplay between individuals bauch2004vaccination; bhattacharyya2010game; breban2007mean; chang2020game; dashtbali2020optimal; hayel2017epidemic; hota2020impacts; hota2016interdependent; hota2019game; ibuka2014free; lagos2020games; liu2012impact; reluga2010game; reluga2006evolving; zhang2013braess; eksin2017disease; eksin2019control; chapman2012using; adiga2016delay; huang2020differential; li2017minimizing; saha2014equilibria; trajanovski2015decentralized; trajanovski2017designing. Individuals can be completely selfish, maximizing their own payoff either by direct optimization or by imitating the most successful individuals adiga2016delay; bauch2004vaccination; bhattacharyya2010game; breban2007mean; chang2020game; dashtbali2020optimal; hayel2017epidemic; hota2020impacts; hota2016interdependent; hota2019game; ibuka2014free; lagos2020games; liu2012impact; reluga2010game; reluga2006evolving; zhang2013braess. Some works incorporate the effect of altruism into their game-theoretic models, in which individuals are not completely selfish and care about the well-being of their neighbors eksin2017disease; eksin2019control; chapman2012using. It is shown by Eksin et al. eksin2017disease that a little empathy can significantly decrease the infection level of the whole population. The results in chapman2012using by Chapman et al. show that the central planner should promote vaccination as an act of altruism, thereby boosting vaccine uptake beyond the Nash equilibrium and serving the common good. Many works examine the inefficiency of selfish acts of individuals, which is quantified by the price of anarchy adiga2016delay; huang2020differential; li2017minimizing; saha2014equilibria; trajanovski2015decentralized; trajanovski2017designing. Results from these works demonstrate that individuals’ selfishness is a major hurdle in the fight against infectious diseases. 
Hence, some works introduce the role of central authorities and study how central authorities should create incentives/penalties to achieve the social optimum aurell2020optimal; farhadi2019efficient; pejo2020corona. Another strain of research focuses on the interplay between a central authority and an adversary khouzani2011saddle; shen2014differential; xu2015competition; zhang2019differential. The adversary aims to maximize the overall damage inflicted by the malware, and the central authority tries to find the best countermeasure policy to oppose the spread of the infection. The conflicting goals between the players are usually captured by a zero-sum dynamic game.
3.2 A Fine-Grained Dynamic Game Framework for Human-in-the-Loop Epidemic Modelling
In the existing literature, there is no consensus on which game-theoretic framework to use to study human-in-the-loop epidemics. The integration between game-theoretic models and epidemic models is done on a case-by-case basis depending on the players involved, what interventions people take (see Section 2.2.1), when interventions are taken (see Section 2.2.3), what epidemic models modelers choose (see Section 2.1), and the underlying network structure. Here, we present a fine-grained dynamic game framework to describe the essence of human-in-the-loop epidemic modeling.
We consider the following discrete-time Markov game, whose components are described below.

Players: We consider a finite population of individuals and a central authority.

The individual state space: Each individual has a state from a finite set. The state indicates the health status of the individual and may change over time. Elements of the state set may include the susceptible state, the infected state, the recovered state, and/or the quarantined state, depending on the compartment model being used and how interventions are modeled. For example, if we consider a compartment model where individuals decide whether to get vaccinated, the state space contains three elements, which mean that the individual is susceptible, infected, or vaccinated, respectively.
One can also introduce the concept of type in game theory into the humanintheloop framework to capture different social, political, and other demographic groups.

The population profile: The global detailed description of the population’s health status at a given time is the collection of the states of all individuals at that time. The population profile records, for each health state, the proportion of individuals in that state. A central authority cares about the well-being of the whole population and hence pays attention only to the population profile.

The individual action space: Each individual has a set of possible actions that he/she can take to combat the virus in his/her current state. Depending on which intervention is studied (see Section 2.2.2), the action set can be either finite or a continuum. At every time step, each individual chooses an action from his/her action set, and the actions of all individuals form the action profile of the whole population. The action set can be state-dependent if necessary.

The transition kernel: The epidemic process is a Markov process once the sequence of actions taken by individuals is fixed. The transition kernel maps the current global state and the current action profile to a probability distribution over the next global state. That is, given the population’s detailed health status at one time step and the actions individuals take at that step, the kernel specifies the distribution of the population’s detailed health status at the next time step.
The transition kernel is decided by the epidemic models (see Section 2.1.3) and the interventions the actions represent (see Section 2.2.2). For example, the transition kernel can be constructed using the SIS networked stochastic model (7) or other epidemic models. For example, if each individual chooses between two actions, one representing being self-quarantined and the other representing staying normal, then the transition kernel can be constructed by
(12)
As we can see, if all individuals brace for the epidemic by quarantining over some period of time, the spreading will slow down quickly. However, not every individual has the incentive to do so.

Costs: The costs come from two sources: one is the risk of catching the virus, and the other is the interventions being taken. Most interventions/measures come with either monetary costs or inconvenience. Handwashing is tedious, wearing a mask is uncomfortable or annoying, and socializing is necessary. The instant cost of an individual at a given time hence depends on his/her health status (state), the interventions he/she takes (action), and/or the states of other individuals. Formally, the instant cost is composed of three cost functions, which correspond to three levels of altruism. The first cost function captures the selfishness of the individual. If the second cost function is added, the individual also cares about his/her neighbors (friends and family members). The highest level of altruism is captured by the third cost function, which means the individual cares about every individual in the population. For a completely selfish individual, the second and third cost functions are zero. Modelers can also make the three cost functions time-dependent if necessary.

Information and Strategies: Each individual has an information set at each time. An individual may not know every detail about the whole population. He/she may only know his/her own health status and the population profile broadcast by the central authority. Or the individual may only know information about his/her neighboring individuals. Individuals make decisions based on the information available to them. The rules individuals follow to make decisions are called strategies. The strategy of an individual is a map from the information space and the time space to his/her action space, meaning that the action of the individual at each time is chosen by applying the strategy to the information available at that time.
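The components listed above can be put together in a minimal simulation sketch. Everything specific in it is an assumption made for illustration: a two-state (susceptible/infected) discrete-time networked model with a quarantine/normal action, a per-contact infection probability `beta`, a recovery probability `delta`, a purely selfish instant cost, and a threshold strategy that uses only the broadcast population profile. It is not the kernel (12) of this section, and all names are hypothetical.

```python
import random

S, I = "S", "I"             # individual health states
QUARANTINE, NORMAL = 0, 1   # individual actions


def population_profile(states):
    """Proportion of individuals in each health state."""
    n = len(states)
    return {s: sum(1 for x in states if x == s) / n for s in (S, I)}


def transition(states, actions, adj, beta, delta, rng):
    """Sample the next global state (assumed kernel).
    A contact transmits only if both endpoints take the NORMAL action."""
    n = len(states)
    nxt = []
    for i in range(n):
        if states[i] == I:
            nxt.append(S if rng.random() < delta else I)
            continue
        escape = 1.0  # probability of escaping infection from all neighbors
        for j in range(n):
            if adj[i][j] and states[j] == I:
                escape *= 1.0 - beta * actions[i] * actions[j]
        nxt.append(I if rng.random() < 1.0 - escape else S)
    return nxt


def instant_cost(state, action, infection_cost=1.0, quarantine_cost=0.3):
    """Selfish instant cost: being infected plus the burden of quarantine."""
    return infection_cost * (state == I) + quarantine_cost * (action == QUARANTINE)


def threshold_strategy(own_state, profile, threshold=0.2):
    """Map from information (own state, population profile) to an action."""
    return QUARANTINE if profile[I] > threshold else NORMAL


def simulate(adj, init_states, strategy, beta, delta, horizon, seed=0):
    """Play the game for `horizon` stages; return final states and total costs."""
    rng = random.Random(seed)
    states = list(init_states)
    total = [0.0] * len(states)
    for _ in range(horizon):
        profile = population_profile(states)
        actions = [strategy(s, profile) for s in states]
        for i, (s, a) in enumerate(zip(states, actions)):
            total[i] += instant_cost(s, a)
        states = transition(states, actions, adj, beta, delta, rng)
    return states, total


if __name__ == "__main__":
    ring = [[0, 1, 0, 1], [1, 0, 1, 0], [0, 1, 0, 1], [1, 0, 1, 0]]
    print(simulate(ring, [I, S, S, S], threshold_strategy, 0.4, 0.2, horizon=30))
```

Replacing `threshold_strategy` with a map that also reads neighbors' states would correspond to the local-information case described above.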
Remark 1.
When a central authority is present, the central authority cares about the population profile instead of the health status of a particular individual. For example, the goals of the central authority might be to suppress the proportion of infected individuals, reduce the death toll among the general public, and boost the uptake of vaccines. These metrics can all be reflected in the population profile over time. Hence, the cost function of the central authority can be a function of the population profile. There are two paths the central authority can follow to achieve its goals. The first is designing a penalty/reward function that rewards (punishes) individuals who comply with (violate) the suggested rules such as social distancing, quarantine, or getting vaccinated. The second is through information design such as promoting altruism, raising awareness, health education, etc.
The game between individuals unfolds over a finite or infinite sequence of stages, where the number of stages is called the horizon of the game. Infinite-horizon games model epidemics that persist for decades reluga2010game. Some papers consider finite-horizon games where the terminal time is determined by when vaccines become widely available reluga2006evolving; huang2020differential; chang2020game. The overall objective, for each individual, is to minimize the expected sum of costs he/she incurs during the epidemic.
Solving such a fine-grained stochastic game is difficult, if not prohibitive, under general solution concepts. The difficulties emerge from three facts. The first is that as the number of individuals increases, the size of the state space for the global state increases exponentially. In the human population, the number of individuals in a community ranges from thousands to millions. Hence, analyzing such an enormous number of individuals under the fine-grained dynamic game framework becomes impossible. The second is that individuals do not know the exact information of the whole population for every . For some epidemics, it is difficult for individuals to know even their own state, because some infectious diseases cause no symptoms in some individuals or cause common symptoms that are shared with other diseases. Also, to gather information regarding the population profile for each time and broadcast it to the individuals, the central authority needs to arrange large-scale surveys, polls, and diagnostic tests on a daily or weekly basis. Even so, the population profile can only be estimated using gathered data. This partial information situation creates a partially observable stochastic game. It is intractable to compute Nash or other reasonable strategies for such partially observable stochastic games in most general cases horak2019solving. The third is that if the transition kernel is described by a networked stochastic epidemic model such as (12), the state dynamics of one individual is coupled directly or indirectly with every other individual through the underlying network. This coupling makes it impossible to obtain an appropriate strategy by decoupling eksin2017disease. Hence, we propose this fine-grained stochastic game framework to describe the essence of integrating decision models into the epidemic spreading process, with no intention to solve it.
Existing literature usually proposes less fine-grained game-theoretic frameworks in order to obtain meaningful results that help understand the spreading of epidemics under human responses. These works generalize or simplify the fine-grained dynamic game mainly in three ways. One is to use mean-field techniques by assuming that the transition probability of each individual’s state is coupled only with the population profile, together with some indistinguishability assumptions tembine2009mean; gast2012mean; lee2021controlling; tembine2020covid; reluga2010game. The second is to study a simplified solution concept where each individual considers costs over only a limited number of stages eksin2019control; eksin2017disease; lagos2020games. The third is to consider a continuous intertwined epidemic model (4), rather than a stochastic one, upon which a dynamic game is built huang2020differential, or to apply mean-field techniques in a homogeneous epidemic model such as the Kermack and McKendrick model (1) reluga2010game. In the next subsection, we present three representative works that proposed simplified frameworks to deliver meaningful results by leveraging the abovementioned methods. These works choose totally different epidemic models and have their unique ways of integrating game-theoretic decision-making into these epidemic models.
3.3 Social Distancing Game with Homogeneous SIR Epidemic Model reluga2010game
In reluga2010game, T. Reluga studies the effect of social distancing on the spreading of SIR type of infectious diseases. The author uses a homogeneous deterministic epidemic model: the Kermack and McKendrick model (1). A differential game framework is proposed in which the interplay between individuals is simplified as the interaction between a specific individual and the aggregate behavior of other individuals.
3.3.1 Modelling
How social distancing is modeled: Let be one specific individual’s strategy of daily investment in social distancing. The population strategy is the aggregate daily investment in social distancing by the population. Borrowing the idea from mean-field games, in the limit of infinitely large populations, i.e., , , and are independent strategies because changes in one individual’s behavior will have a negligible effect on the average behavior. The effectiveness of investment in social distancing is captured by , which is the infection rate given an aggregate investment in social distancing practices. Without loss of generality, we set when there is no investment. To model the diminishing returns with increasing investment, the author assumes that is convex and given by
with the maximum efficiency of social distancing .
Epidemic Models: Epidemics usually start with one or a few infected cases, so . The macroscopic behavior of the spreading process under the aggregate social distancing investment can be captured by a normalized SIR version of the Kermack and McKendrick epidemic model:
(13)  
where the infection rate and the recovery rate are normalized in order to focus on the effect of social distancing practices on the spreading process.
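To make the population-level dynamics concrete, the following is a minimal sketch rather than the exact system (13). It assumes the relative infection rate takes the convex form 1/(1 + m c), measures time in units of the mean infectious period, and keeps an explicit basic reproduction number `basic_r0` instead of absorbing it into the state scaling. The names `sigma`, `simulate_sir`, `basic_r0`, and the numerical values are illustrative assumptions, not taken from reluga2010game.

```python
def sigma(c, m=10.0):
    """Assumed relative infection rate: sigma(0) = 1, decreasing and convex in c."""
    return 1.0 / (1.0 + m * c)


def simulate_sir(c_pop, basic_r0=3.0, s0=0.999, i0=0.001, t_end=60.0, dt=0.001):
    """Forward-Euler integration of a normalized SIR model.

    c_pop is a function t -> aggregate daily investment in social distancing.
    Returns a list of (t, S, I) tuples; R can be recovered as 1 - S - I.
    """
    s, i, t = s0, i0, 0.0
    trajectory = [(t, s, i)]
    for _ in range(int(round(t_end / dt))):
        new_infections = basic_r0 * sigma(c_pop(t)) * s * i
        # Recovery rate is normalized to 1, so infected individuals leave at rate I.
        s, i = s - dt * new_infections, i + dt * (new_infections - i)
        t += dt
        trajectory.append((t, s, i))
    return trajectory


def peak(trajectory):
    """Return the (t, S, I) sample at which the infected fraction is largest."""
    return max(trajectory, key=lambda row: row[2])


baseline = simulate_sir(lambda t: 0.0)    # no social distancing
distanced = simulate_sir(lambda t: 0.1)   # constant aggregate investment (not an equilibrium)

if __name__ == "__main__":
    for label, traj in (("baseline", baseline), ("distanced", distanced)):
        t_peak, _, i_peak = peak(traj)
        print(f"{label}: peak infected fraction {i_peak:.3f} at t = {t_peak:.1f}")
```

The constant investment used here is only a stand-in for the population strategy; it is not the equilibrium strategy analyzed later in this subsection.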
The cost function: The total cost of the epidemic to the population, , includes the daily costs of social distancing , the daily costs of infection , and a salvage term:
(14) 
where is a discount term with being the discount rate and the last term is a salvage term representing the cumulative costs associated with individuals who are sick at the time mass vaccination occurs ().
The evolution of individual states: The premise of the game is that at each point in the epidemic, individuals can choose to pay a cost associated with social distancing in exchange for a reduction in their risk of infection. Let be the probabilities that an individual is in the susceptible, infected, or recovered state at time t. The probabilities evolve according to the Markov process
(15) 
where is the individual’s daily investment in social distancing and the transition-rate matrix
The coupling between the population profile and the individual profile is described by the two processes (13) and (15). The individual risk of infection depends on his/her investment in social distancing and the infection level of the whole population .
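The coupling between the two processes can be sketched numerically. The block below is an illustration under assumptions, not a reproduction of (13) and (15): it reuses an assumed infection-rate form 1/(1 + m c), an explicit `basic_r0`, and a unit recovery rate, and it integrates the individual's state probabilities alongside the population fractions. All function and parameter names are hypothetical.

```python
def sigma(c, m=10.0):
    """Assumed relative infection rate given an investment c."""
    return 1.0 / (1.0 + m * c)


def step_population(s, i, c_pop, basic_r0, dt):
    """One Euler step of the population-level SIR fractions."""
    new_infections = basic_r0 * sigma(c_pop) * s * i
    return s - dt * new_infections, i + dt * (new_infections - i)


def step_individual(p, i_pop, c_ind, basic_r0, dt):
    """One Euler step of the individual's state probabilities (p_S, p_I, p_R).

    The individual's infection hazard depends on his/her own investment c_ind
    and on the population infection level i_pop, not on the population strategy.
    """
    p_s, p_i, p_r = p
    hazard = basic_r0 * sigma(c_ind) * i_pop
    return (p_s - dt * hazard * p_s,
            p_i + dt * (hazard * p_s - p_i),
            p_r + dt * p_i)


def run(c_ind, c_pop, basic_r0=3.0, s0=0.999, i0=0.001, t_end=60.0, dt=0.001):
    """Integrate both processes with constant strategies; return (S, I, p) at t_end."""
    s, i = s0, i0
    p = (s0, i0, 0.0)
    for _ in range(int(round(t_end / dt))):
        p = step_individual(p, i, c_ind, basic_r0, dt)   # uses the current population I
        s, i = step_population(s, i, c_pop, basic_r0, dt)
    return s, i, p


if __name__ == "__main__":
    for c_ind in (0.0, 0.1, 0.3):
        s, _, p = run(c_ind=c_ind, c_pop=0.1)
        print(f"c_ind = {c_ind}: P(still susceptible) = {p[0]:.3f}, population S = {s:.3f}")
```

When the individual invests exactly as much as the population, his/her probability of remaining susceptible tracks the population's susceptible fraction; investing less or more moves it below or above that fraction, which is the free-rider tension discussed later in this subsection.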
The values of states: Using the ideas of Isaacs isaacs1999differential, we calculate expected present values of each state at each time, conditional on the investment in social distancing. The expected present value is the average value one expects after accounting for the probabilities of all future events, and discounting future costs relative to immediate costs. Let denote the expected present values with , , and representing the expected present values of being in the susceptible, infected, or removed state at time when using strategy in a population using strategy . The expected present values evolve according to the adjoint equations
where incorporates the individualized cost of (14) into the expected present values . Since the dynamics of is independent of , there is no need to consider recovered individuals further. Further simplifying the dynamics of by taking (no discounting) and (fixed expected value at infected state), one obtains
(16) 
which evolves backward in time with boundary condition . If everyone else invests heavily in social distancing, the individual can become a free rider who earns the benefit (value) without having to invest much in social distancing, because the infection level will remain low. To find a balanced social distancing strategy (Nash strategy), one can simply focus on (13) and (16).
3.3.2 Analysis
To find a balanced strategy is to find the best strategy to play, given that all the other individuals are also attempting to do so. Given such context, a Nash equilibrium solution becomes an appropriate solution concept for the differential game formulated in Section 3.3.1.
Definition 1.
Given the expected value in the susceptible state and the associated population level spreading dynamics (13), is a Nash equilibrium if for any possible strategy , .
A Nash equilibrium is a subgame perfect equilibrium if it is also a Nash equilibrium at every state the system may pass through. Indeed, the Nash equilibrium can be obtained by finding the investment that maximizes the rate of increase in the individual’s expected value.
Lemma 1.
If is a subgame perfect equilibrium, then it satisfies the maximum principle
when everywhere
One can solve for , if behaves well.
Theorem 3.1.
If is differentiable, decreasing, and strictly convex, then is uniquely defined by the relations
(17) 
From Theorem 3.1, one knows that at the Nash equilibrium, whether an individual invests in social distancing depends on the maximum efficiency of social distancing , the current infection level of the population , and the expected value in susceptible state . When an individual does invest in social distancing, he/she tends to invest more if the efficiency of social distancing is higher, or the current infection level is higher, or the value of the susceptible state is greater.
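The threshold structure described above can be illustrated with a small worked sketch. It does not reproduce the relations (17); instead it assumes that, at each instant, a susceptible individual picks the investment c that minimizes c + sigma(c) * I * gap, where sigma(c) = 1/(1 + m c) is an assumed infection-rate form, I is the current infection level, and `value_gap` is a hypothetical name for the loss in expected present value incurred on moving from the susceptible to the infected state. Under these assumptions the first-order condition gives c = (sqrt(m I gap) - 1)/m whenever m I gap > 1, and c = 0 otherwise.

```python
import math


def sigma(c, m):
    """Assumed relative infection rate given an investment c."""
    return 1.0 / (1.0 + m * c)


def instantaneous_investment(m, infection_level, value_gap):
    """Closed-form minimizer of c + sigma(c) * I * gap over c >= 0 (assumed objective)."""
    pressure = m * infection_level * value_gap
    if pressure <= 1.0:
        # Marginal benefit of the first unit of investment does not cover its cost.
        return 0.0
    return (math.sqrt(pressure) - 1.0) / m


def brute_force_investment(m, infection_level, value_gap, c_max=1.0, n=100001):
    """Grid search over [0, c_max], used only as a sanity check on the closed form."""
    best_c, best_cost = 0.0, float("inf")
    for k in range(n):
        c = c_max * k / (n - 1)
        cost = c + sigma(c, m) * infection_level * value_gap
        if cost < best_cost:
            best_c, best_cost = c, cost
    return best_c


if __name__ == "__main__":
    for level in (0.05, 0.1, 0.3):
        c_star = instantaneous_investment(m=10.0, infection_level=level, value_gap=1.0)
        print(f"I = {level}: investment = {c_star:.4f}")
```

In this sketch no investment occurs until the product of efficiency, infection level, and value gap exceeds one, and beyond that threshold the investment grows with the infection level and with the value gap.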
3.3.3 Highlighted Results
We observe the instantaneous behavior given the value of the expected value in susceptible state , the infection level , and the susceptible level . Such results can be computed by solving (17) and the results are shown in Figure 3.
The last few ‘survivors’ tend to social distance more: Figure 3 shows the infection risk in feedback form with implicit coordinates on the left and transformed explicit coordinates on the right. As we expect, the left figure shows that a larger value of the susceptible state induces greater instantaneous social distancing. From the right figure, one can see that as the number of susceptible individuals increases, the investment in social distancing decreases, hence the individual’s infection rate increases with less social distancing. One can also see that the biggest investments in social distancing happen when only a small portion of the population remains susceptible. That means the last few ‘survivors’ tend to social distance to brace for the infection.
Two scenarios are investigated in reluga2010game. The first is the infinite-horizon differential game that gives the equilibrium behavior when there is never a vaccine and the epidemic spreads until its natural end. The second is the finite-horizon problem that studies the individual behavior in equilibrium when there will be a vaccine introduced at time . For the infinite-horizon case, the epidemic spreading dynamics under the social distancing equilibrium and in the absence of social distancing are plotted in Figure 4.
Social distancing starts later and ends sooner than the epidemic itself: As we can see from the top left figure in Figure 4, under equilibrium social distancing, social distancing is never used until partway into the epidemic and ceases before the epidemic fully dies out. That means at the beginning of the epidemic, individuals do not take any interventions until the epidemic becomes widespread. Likewise, social distancing practices are lifted once the situation improves, before the epidemic completely ends.
Social distancing leads to a smaller epidemic but prolongs the epidemic: When comparing the time series data on the top left figure and its counterpart on the bottom left figure of Figure 4, one can observe that social distancing reduces the scale of the epidemic and prolongs the prevalence of the epidemic. Even though social distancing prolongs the epidemic, practicing social distancing is still important since it helps ‘flatten the curve’ and avoid a large number of individuals being infected in a short period of time.
Now let’s shift the focus to the finite-horizon problem where vaccines are universally available at a given time . As is shown in Figure 5, at an equilibrium, social distancing will last until the very time when vaccines are universally available. When the wide availability of vaccines arrives sooner, social distancing begins sooner. When the vaccine becomes available at (see the left plot in Figure 5), individuals save of the cost of infection per capita by practicing social distancing. When the vaccine becomes available earlier, say when , of the cost of infection can be saved per capita.
Social distancing enlarges the window of opportunity during which mass vaccination can reduce the cost of the epidemic: The earlier a vaccine becomes available, the less the whole society suffers. If a vaccine becomes available at the late stage of the epidemic when most individuals are recovered, the vaccine will do little to reduce transmission. There exists a limited window during which large-scale vaccination can effectively cut down the cost of infection at the population level. Numerical results in reluga2010game show that equilibrium social distancing can extend this limited window of opportunity.
3.3.4 Discussions
The modeling in reluga2010game reduces the complexity of the fine-grained dynamic framework in several respects. The first is the use of a homogeneous SIR deterministic epidemic model: the Kermack–McKendrick SIR model, in which the epidemic process is described by two ordinary differential equations ( can be expressed as ). The effect of social distancing on the spreading process is captured by a scalar function which is identical for all individuals. The spreading process enjoys a decreased infection rate as the population invests more in social distancing. The use of such a homogeneous epidemic model is a double-edged sword. On one hand, homogeneous epidemic models make analytical results more attainable; on the other hand, they require the assumption that the population is homogeneous and strongly mixed. However, we know that the contact patterns among individuals are highly structured, with regular temporal, spatial, and social correlations. The second is the decoupling of the direct connection between an individual’s strategy and the aggregate strategy of the population . This allows one’s risk of infection to depend only on the infection level of the population , which implicitly depends on .
Realistically, mass vaccination cannot happen overnight as is assumed in the paper. Vaccination is usually rolled out gradually once a vaccine is approved for use. This effect can be incorporated into the model by considering a time-dependent forcing. In this game, the individual has complete information about the epidemic including the expected value and the infection level of the whole population. However, in reality, incomplete information (biased or inaccurate information) may drive human behavior away from the equilibria obtained in this paper.
3.4 The Power of Empathy: A Markov Game under the Myopic Equilibrium eksin2017disease
In eksin2017disease, Eksin et al. have proposed a Markov game framework using the contact network stochastic epidemic model (newman2010networks, Ch. 17) in which healthy individuals utilize protective measures to avoid contracting a disease and sick individuals utilize preemptive measures out of empathy to avoid spreading a disease. A solution concept, called the myopic Markov perfect equilibrium (MMPE), is introduced to model human behaviors, which also makes theoretical results attainable for such a framework. Eksin et al. have shown that there is a critical level of empathy by the sick individuals above which the infectious disease dies out rapidly. Further, they show that empathy among sick individuals is more effective than risk aversion among healthy individuals.
3.4.1 Modelling
The epidemic model: Eksin et al. consider a networked SIS stochastic epidemic model, which is a variant of the networked stochastic model (7). An individual in the population is either susceptible () or infected and infectious () at any given time . A global detailed description of the population is denoted by .
The transition kernel can be specified by and for and . is the probability that individual is susceptible at time , but gets infected at time and
(18) 
where is the infection rate, is the action of individual , and the action of its neighboring individual is for .
How protective and preemptive measures are modeled: Each term is the probability that the individual is not infected by neighbor . If either individual or his/her neighbor takes protective and preemptive measures such as wearing masks, practicing social distancing, or other measures, the probability that individual is infected by neighbor will decrease. An extreme case is either or under which individual will never be infected by his/her neighbor . The product of all the terms is the probability that the individual is not infected by any interactions. is the probability that individual is infected at time but recovered at time . This probability is equal to the inherent recovery rate of the disease , i.e.,
(19) 
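The transition kernel described above can be sketched in a few lines. The block below is an illustration under assumptions rather than a transcription of (18) and (19): it assumes that the probability of escaping infection from neighbor j is 1 - beta * a_i * a_j * s_j, where actions lie in [0, 1] (1 for normal activity, 0 for full isolation) and s_j is 1 if neighbor j is infected. The names `infection_probability`, `sis_step`, `beta`, and `delta` are hypothetical.

```python
import random


def infection_probability(i, neighbors, infected, actions, beta):
    """Probability that susceptible individual i becomes infected in one step.

    Each factor (1 - beta * a_i * a_j * s_j) is the assumed probability that i
    is not infected by neighbor j; their product is the probability of
    escaping infection from all neighbors.
    """
    escape = 1.0
    for j in neighbors[i]:
        escape *= 1.0 - beta * actions[i] * actions[j] * infected[j]
    return 1.0 - escape


def sis_step(neighbors, infected, actions, beta, delta, rng):
    """Sample next-step health states for every individual.

    Infected individuals recover with probability delta; susceptible
    individuals are infected with the probability computed above.
    """
    next_state = []
    for i, s_i in enumerate(infected):
        if s_i == 1:
            next_state.append(0 if rng.random() < delta else 1)
        else:
            p = infection_probability(i, neighbors, infected, actions, beta)
            next_state.append(1 if rng.random() < p else 0)
    return next_state


# A three-node line graph 0 - 1 - 2 in which the two end nodes are infected.
line_graph = {0: [1], 1: [0, 2], 2: [1]}
health = [1, 0, 1]

if __name__ == "__main__":
    rng = random.Random(0)
    print(infection_probability(1, line_graph, health, [1, 1, 1], beta=0.5))
    print(sis_step(line_graph, health, [1, 1, 1], beta=0.5, delta=0.2, rng=rng))
```

Setting either the individual's own action or a neighbor's action to zero removes that neighbor's factor from the product, which matches the extreme case described in the text.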
The payoff function: Individuals make their decisions based on the tradeoff between the risk of contracting the virus and the costs of taking protective and preemptive measures. Different from other works in which individuals are completely selfish adiga2016delay; bauch2004vaccination; bhattacharyya2010game; breban2007mean; chang2020game; dashtbali2020optimal; hayel2017epidemic; hota2020impacts; hota2016interdependent; hota2019game; ibuka2014free; lagos2020games; liu2012impact; reluga2010game; reluga2006evolving; zhang2013braess, Eksin et al. consider a sense of altruism among individuals. Individuals are concerned not only about getting infected themselves but also about infecting others in their neighborhood. An individual’s payoff at time is a weighted linear combination of these considerations:
(20)  
where are fixed weights. The first term inside the square bracket is the payoff of not taking any measures, including socialization, convenience, and economic benefits. The second term captures the risk aversion of susceptible individuals. The risk comes from contact with infectious neighbors who fail to take serious measures to protect others. The third term is the empathy term that quantifies the risk of infecting others. Hence, the weights , and are referred to as socialization, risk aversion, and empathy constants.
The payoff function is a bilinear function of and for . That means given the actions of individual ’s neighbors and their health status , to maximize his/her payoff, he/she needs to decide whether to resume normal activity () or self-isolate () depending on the sign of the expression inside the square bracket. If the expression is positive, individual prefers to resume normal activity. Otherwise, self-isolation is the best choice.
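The sign rule above lends itself to a short sketch of a one-stage best response and an equilibrium check. The bracketed expression used here is an assumed form consistent with the verbal description, not a transcription of (20): a socialization weight `c0`, minus a risk-aversion weight `c1` times the activity of infected neighbors (for a susceptible individual), minus an empathy weight `c2` times the activity of susceptible neighbors (for an infected individual). Actions are restricted to 1 (normal activity) and 0 (self-isolation), and all names are hypothetical.

```python
def bracket(i, neighbors, infected, actions, c0, c1, c2):
    """Assumed expression inside the square bracket of individual i's payoff."""
    risk = sum(infected[j] * actions[j] for j in neighbors[i])            # active sick neighbors
    exposure = sum((1 - infected[j]) * actions[j] for j in neighbors[i])  # active healthy neighbors
    return c0 - c1 * (1 - infected[i]) * risk - c2 * infected[i] * exposure


def best_response(i, neighbors, infected, actions, c0, c1, c2):
    """Resume normal activity (1) if the bracket is positive, otherwise self-isolate (0)."""
    return 1 if bracket(i, neighbors, infected, actions, c0, c1, c2) > 0 else 0


def is_myopic_equilibrium(neighbors, infected, actions, c0, c1, c2):
    """Check that every individual's action is a one-stage best response to the others."""
    return all(
        actions[i] == best_response(i, neighbors, infected, actions, c0, c1, c2)
        for i in range(len(actions))
    )


# A star network: the center (node 0) is infected, the three leaves are susceptible.
star = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
health = [1, 0, 0, 0]

if __name__ == "__main__":
    weights = dict(c0=1.0, c1=0.5, c2=0.4)
    print(is_myopic_equilibrium(star, health, [1, 1, 1, 1], **weights))
    print(is_myopic_equilibrium(star, health, [0, 1, 1, 1], **weights))
```

In this example, with an empathy weight of 0.4 the infected center prefers to self-isolate once all three leaves are active, so the all-active profile is not a fixed point of the best responses, whereas the profile in which only the center isolates is. This is a naive fixed-point check and not the constructive algorithm of eksin2017disease.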
The payoffs of the neighbors of individual depend on the actions of their own neighbors. This means if the underlying network is connected, the payoff profile of the population couples the actions of all individuals. Hence, individuals need to reason about the interaction levels of their neighbors in their decision-making. Such individual reasoning can be modeled using game theory.
3.4.2 Analysis
Obtaining an analytical solution such as the Nash equilibrium is difficult if one considers accumulative payoffs over a finite period of time or an infinite horizon. Here, Eksin et al. have considered a solution concept called the myopic Markov perfect equilibrium (MMPE).
Definition 2.
The use of an MMPE profile carries two implied assumptions. One is the assumption that individuals’ actions depend only on the payoff-relevant state of the disease. Whether the class of Markovian strategies contains the Nash strategy for all possible strategies is not discussed in eksin2017disease. Another is the assumption that individuals make decisions considering the current instantaneous payoff only. Under the assumption of myopic strategies, individuals do not foresee their future risks of infection or infecting others in their decision-making.
The computation of the MMPE strategy profile involves only one stage of the payoff. So, it is more tractable than computing the Nash strategies that consider the accumulative payoffs with states evolving from time to time. Indeed, Eksin et al. show that there exists at least one such strategy profile for the bilinear game captured by (20). The proof of existence is constructive, which also provides an algorithm that computes an MMPE strategy profile in finite time (readers interested in the proof can refer to eksin2017disease). Unfortunately, even for such a simplified solution concept, no closed-form results expressing the MMPE action as a function of the current state are obtained. In the next subsection, we present several highlighted results obtained from simulations by the authors.
3.4.3 Highlighted Results
A little empathy plays a huge role in bounding the basic reproduction number: An important measure for an epidemic process is the basic reproduction number , which measures the spread of an infectious disease from an initial sick individual in an otherwise susceptible host population. Whether or not is an indicator that the disease is likely to persist when there is a relatively low number of infected individuals. Hence, is an important measure relating the likelihood of disease persistence to network and utility weights. When individuals act according to an MMPE strategy profile, the following bound holds for ,
where is the proportion of individuals who have degree in the network, and . Here, is the largest degree an individual has in the network. When there are no protective and preemptive measures taken, the bound for the network is . If the empathy constant is close to zero such that is smaller than , one recovers the bound for the contact network models with no protective or preemptive measures. When individuals weigh the risk of infecting others and the costs of taking measures equally, i.e., , the basic reproduction number is well bounded, i.e., . In a scalefree network with degree distribution , the bound becomes
(21) 
If the empathy weight is negligible (), meaning individuals do not care about infecting others, then the bound increases logarithmically with the size of the population, i.e., . Indeed, the bound is the exact reproduction number for the contact network SIS model with no individual behavior response to disease prevalence