 # Semi-Explicit Solutions to some Non-Linear Non-Quadratic Mean-Field-Type Games: A Direct Method

This article examines the solvability of mean-field-type game problems by means of a direct method. We provide various solvable examples beyond the classical LQ game problems. These include quadratic-quadratic games and games with power, logarithmic, sine square, and hyperbolic sine square payoffs. Non-linear state dynamics such as control-dependent regime switching, quadratic state, cotangent state and hyperbolic cotangent state are considered. Both equilibrium strategies and equilibrium costs are given in a semi-explicit way. The optimal strategies are shown to be in state-and-conditional mean-field-type feedback form. It is shown that a simple direct method can be used to solve broader classes of non-quadratic mean-field-type games under jump-diffusion-regime switching Gauss-Volterra processes, which include fractional Brownian motion and multi-fractional Brownian motion. We provide semi-explicit solutions to the fully cooperative, noncooperative nonzero-sum, and adversarial game problems.


## 1 Introduction

Mean-field-type game theory studies a class of games in which the payoffs and/or state dynamics depend not only on the state-action pairs but also on their distribution. In mean-field-type games, (i) a single decision-maker may have a strong impact on the mean-field terms, (ii) the expected payoffs are not necessarily linear with respect to the state distribution, (iii) the number of decision-makers (“true decision-makers”) is not necessarily infinite.

Games with non-linearly distribution-dependent quantities-of-interest [1, 2, 3] are very attractive in terms of applications because the non-linear dependence of the payoff functions on the state distribution allows us to capture risk measures that are functionals of the variance, inverse quantile, and/or higher moments. In the past, a significant amount of research on mean-field-type games has been performed [4, 5, 6, 8, 9, 10]. In the time-dependent case, the analysis of mean-field-type games is not without challenges. Previous works have devoted tremendous effort to infinite-dimensional systems of partial integro-differential equations (PIDEs) of Liouville, Boltzmann or McKean-Vlasov type. At the same time, an important set of numerical tools has been developed to address the master equilibrium system. However, the current state-of-the-art numerical schemes are problem-specific and need to be adjusted properly depending on the underlying problem. To date, the question of computation of the master system in the general setting remains open. This work provides explicit solutions of a class of master systems. These explicit solutions can be used to build reference trajectories, against which several numerical schemes developed to solve PIDEs can be tested beyond the linear-quadratic setting.

### 1.1 Direct Method

The direct method consists of five elementary steps. The first step sets the mean-field terms of the problem. The second step identifies a partial guess functional whose coefficient functionals are random and regime-switching dependent. The third step applies the stochastic integration formula. The fourth step uses a completion of terms in a one-shot optimization over both the control actions and the conditional expected value of the control actions of all decision-makers. The fifth and last step uses an algebraic basis of linearly independent processes to identify the coefficients. The identification leads to a (possibly stochastic) differential system of equations, providing a semi-explicit representation of the solution. These five elementary steps of the direct method are displayed in Figure 1.
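As a minimal illustration of these steps, consider a hypothetical scalar problem with a single decision-maker and no mean-field term (an assumption made here for brevity, not an example taken from this article): minimize $\mathbb{E}\big[q_T x(T)^2 + \int_0^T (q x^2 + r u^2)\,dt\big]$ subject to $dx = (b_1 x + b_2 u)\,dt + \sigma x\,dB$, with $r > 0$ and deterministic coefficients. The first step is vacuous in this case, and the remaining four steps can be sketched as:

$$
\begin{aligned}
&\text{Guess:} && f(t,x) = \alpha(t)\,x^2,\\
&\text{Itô:} && f(T,x(T)) - f(0,x_0) = \int_0^T \big[\dot\alpha\,x^2 + 2\alpha x (b_1 x + b_2 u) + \sigma^2 \alpha\,x^2\big]\,dt + \int_0^T 2\alpha\sigma x^2\,dB,\\
&\text{Completion:} && r u^2 + 2\alpha b_2 x u = r\Big(u + \frac{\alpha b_2}{r}\,x\Big)^2 - \frac{\alpha^2 b_2^2}{r}\,x^2,\\
&\text{Identification:} && \dot\alpha + q + (2 b_1 + \sigma^2)\,\alpha - \frac{b_2^2}{r}\,\alpha^2 = 0, \qquad \alpha(T) = q_T.
\end{aligned}
$$

Under these assumptions the minimizer is the state feedback $u^* = -\frac{\alpha b_2}{r}\,x$ and the optimal cost is $\alpha(0)\,\mathbb{E}[x_0^2]$. The non-quadratic examples of Section 2 follow the same pattern with a different guess functional.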

### 1.2 Direct Method for LQ-MFTG

In the current literature, only relatively few examples of explicitly solvable mean-field-type game problems are available. The most notable examples are (i) linear-quadratic mean-field-type games (LQ-MFTG), (ii) linear-exponentiated quadratic mean-field-type games (LEQ-MFTG), and (iii) adversarial linear-quadratic mean-field-type games (minmax LQ-MFTG, minmax LEQ-MFTG). In LQ-MFTG, the base state dynamics has two components: drift and noise.

• the drift is an affine function of the state, the expected value of the state, the control action and the expected value of the control actions of all decision-makers. The coefficients are regime-switching dependent.

• the noise is a combination of diffusion, Gauss-Volterra, jump and regime-switching processes, where the noise coefficients are affine functions of the state, the expected value of the state, the control action and the expected value of the control actions of all decision-makers. The coefficients are regime-switching and jump dependent.

To the state dynamics, one can add a common noise which is a diffusion-Gauss-Volterra-jump-regime-switching process. The cost functions are polynomial of degree two and include the weighted conditional variances, co-variances between state and control actions of all decision-makers. In addition, the cost functional is not measured perfectly. Only a noisy cost is available.

This basic model of LQ mean-field-type games captures several interesting features such as heterogeneity, risk-awareness and empathy of the decision-makers.

To solve LQ-MFTG problems one can use the direct method proposed in Figure 1. This solution approach does not require solving the Bellman-Kolmogorov equations or backward-forward stochastic differential equations of Pontryagin’s type. The proposed direct method can be easily implemented by beginners and engineers who are new to the emerging field of mean-field-type game theory.

For this broader class of LQ-MFTG problems, one can derive a semi-explicit solution under sufficient conditions. The existence of a solution to the master system corresponding to the LQ-MFTG problem can be converted into the existence of a solution to a system of ordinary differential equations driven by common noises. In some particular cases, these systems are stochastic Riccati systems, or extensions of Riccati systems that include some fractional-order terms.

### 1.3 Direct Method beyond LQ-MFTG

The direct method is not limited to the linear-quadratic case. It can be extended to classes of LEQ-MFTG, minmax LQ-MFTG and minmax LEQ-MFTG problems. In this article, we present several examples to illustrate how the direct method addresses non-linear and/or non-quadratic mean-field-type games. The examples below go beyond LQ-MFTG, LEQ-MFTG and minmax LQ problems.

Our contribution can be summarized as follows. We provide semi-explicit solutions for the classes of mean-field-type game problems presented in Table 2. Several noises are examined: Brownian motion, regime switching, jump process, and Gauss-Volterra process. The Gauss-Volterra noise processes are obtained from the integral of a Brownian motion with a suitable kernel function. In addition, several types of common noise are considered.
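The construction of a Gauss-Volterra process as a kernel-weighted integral of Brownian increments can be sketched numerically. The snippet below is only an illustration: the kernel $K(t,s) = (t-s)^{H-1/2}$ (a Riemann-Liouville-type kernel, without normalization constant), the midpoint discretization, and the function name `simulate_gauss_volterra` are assumptions made here and are not taken from the article.

```python
import math
import random

def simulate_gauss_volterra(hurst, horizon=1.0, steps=200, seed=0):
    """Discretize G(t) = int_0^t K(t, s) dB(s) with the assumed kernel
    K(t, s) = (t - s)**(hurst - 0.5). Returns (times, increments, path)."""
    rng = random.Random(seed)
    dt = horizon / steps
    # Brownian increments dB_m ~ N(0, dt)
    increments = [rng.gauss(0.0, math.sqrt(dt)) for _ in range(steps)]
    times = [(n + 1) * dt for n in range(steps)]
    path = []
    for n in range(steps):
        t = times[n]
        total = 0.0
        for m in range(n + 1):
            s_mid = (m + 0.5) * dt  # midpoint keeps t - s strictly positive
            total += (t - s_mid) ** (hurst - 0.5) * increments[m]
        path.append(total)
    return times, increments, path

if __name__ == "__main__":
    times, _, path = simulate_gauss_volterra(hurst=0.3)
    print(times[-1], path[-1])
```

With `hurst = 0.5` the assumed kernel is identically one and the path reduces to a standard discretized Brownian motion; other values of `hurst` give rougher or smoother paths.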

To the best of the authors’ knowledge this is the first work to provide semi-explicit solutions of mean-field-type games beyond LQ and under Gauss-Volterra processes.

### Structure

The rest of the article is structured as follows. Section 2 presents semi-explicit solutions to some non-linear non-quadratic stochastic differential games. In Section 3 we formulate and solve various mean-field-type games with non-quadratic quantities-of-interest and provide semi-explicit solutions using a direct method. Section 4 presents semi-explicit solutions to some non-quadratic mean-field-type games driven by Gauss-Volterra processes. Numerical examples are presented in Section 5. The last section summarizes the work.

## Notations

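We introduce the following notations (see Table 3). Let $T > 0$ be a fixed time horizon and let $(\Omega, \mathcal{F}, \mathbb{F}, \mathbb{P})$ be a given filtered probability space. The filtration $\mathbb{F}$ is the natural filtration of the union of the driving noise processes, augmented by the $\mathbb{P}$-null sets of $\mathcal{F}$. In practice, the Brownian motion $B$ is used to capture smaller disturbances, the jump process $N$ is used for larger jumps of the system, and the Gauss-Volterra process is used for sub- or super-diffusion. We use $L^2$ for sets of square-integrable measurable functions, and $L^2_{\mathbb{F}}([0,T]\times\mathcal{S};\mathbb{R})$ for the set of $\mathbb{F}$-adapted, $\mathbb{R}$-valued processes that are square-integrable.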

The conditional expectation of a random variable with respect to the filtration is itself a random process. Below, by abuse of notation, the same symbols are used for the values taken inside the jump processes or the regime-switching process $s$. The set of decision-makers is denoted by $\mathcal{I}$. An admissible control strategy of decision-maker $i \in \mathcal{I}$ is an $\mathbb{F}$-adapted and square-integrable process with values in a non-empty subset $U_i \subseteq \mathbb{R}$. We denote the set of all admissible controls by $\mathcal{U}_i$:

$$
\mathcal{U}_i = \left\{ u_i(\cdot) \in L^2_{\mathbb{F}}([0,T]\times\mathcal{S};\mathbb{R}) \;:\; u_i(\cdot) \in U_i \ \text{a.e. } t \in [0,T],\ \mathbb{P}\text{-a.s.} \right\}.
$$

Decision-maker $i$ chooses a control strategy $u_i \in \mathcal{U}_i$ to optimize its performance functional. The information structure of the problem is perfect state observation together with common noise observation.

## 2 Some Solvable Mean-Field-Free Games

We start with mean-field-free settings where logarithm, logarithm square, Legendre-Fenchel duality, and power payoffs are presented. The cost functions are not necessarily quadratic and the state dynamics is not necessarily linear.

### 2.1 Logarithmic Scale

Consider a set of decision makers interacting in the following non-linear non-quadratic mean-field-free game:

$$
\begin{cases}
L_i(x,u) = q_i(T,s(T))\ln(x(T)) + \displaystyle\int_0^T \big(q_i \ln(x) + r_i u_i^{2k}\big)\,dt,\\[4pt]
\inf_{u_i} \mathbb{E}[L_i(x,u)],\\[4pt]
\text{subject to}\\[2pt]
dx = \Big(b_1 x\ln(x) + \displaystyle\sum_{j\in\mathcal{I}} b_{2j}\,x\,u_j\Big)dt + x\Big[\sigma\,dB + \displaystyle\int \gamma\,d\tilde N\Big],\\[4pt]
\mathbb{P}\big(s(t+\epsilon)=s' \mid s,u\big) = \displaystyle\int_t^{t+\epsilon}\tilde q_{ss'}\,dt' + o(\epsilon),\quad s'\neq s,
\end{cases} \tag{1}
$$

with a given initial condition $(x_0, s_0)$, where $k \geq 1$ is an integer.

###### Proposition 1

The non-linear non-quadratic mean-field-free Nash equilibrium and corresponding equilibrium cost are given by:

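$$
u_i^* = \left[-\frac{1}{2k}\,\frac{\alpha_i b_{2i}}{r_i}\right]^{\frac{1}{2k-1}}, \qquad
\mathbb{E}[L_i(x,u^*)] = \mathbb{E}\big[\alpha_i(0,s_0)\ln(x_0) + \delta_i(0,s_0)\big],
$$

where $\alpha_i$ and $\delta_i$ solve the following differential equations:

$$
\begin{aligned}
&\dot\alpha_i + q_i + \alpha_i b_1 + \sum_{s'}\big[\alpha_i(t,s') - \alpha_i(t,s)\big]\tilde q_{ss'} = 0,\\
&\dot\delta_i + \alpha_i\Big[-\frac{\sigma^2}{2} + \int_{\theta\in\Theta}\big[\ln(1+\gamma(\theta)) - \gamma(\theta)\big]\nu(d\theta)\Big] + \sum_{s'}\big[\delta_i(t,s') - \delta_i(t,s)\big]\tilde q_{ss'}\\
&\qquad - (2k-1)\,r_i\Big(-\frac{1}{2k r_i}\,b_{2i}\alpha_i\Big)^{\frac{2k}{2k-1}} + \alpha_i\sum_{j\neq i} b_{2j}\Big[-\frac{1}{2k}\,\frac{\alpha_j b_{2j}}{r_j}\Big]^{\frac{1}{2k-1}} = 0,
\end{aligned} \tag{2}
$$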

with terminal conditions $\alpha_i(T,s) = q_i(T,s)$ and $\delta_i(T,s) = 0$.

Proof. Consider the following guess functional:

$$
f_i(t,x,s) = \alpha_i \ln(x) + \delta_i.
$$

By applying Itô’s formula for jump-diffusion-regime switching processes, the gap between the cost and the guess functional can be computed and it is given by

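$$
\begin{aligned}
\mathbb{E}[L_i(x,u) - f_i(0,x_0,s_0)] ={}& \mathbb{E}\Big[\big(q_i(T,s(T)) - \alpha_i(T,s(T))\big)\ln(x(T)) - \delta_i(T,s(T))\Big]\\
&+ \mathbb{E}\int_0^T \Big\{\dot\alpha_i + q_i + \alpha_i b_1 + \sum_{s'}\big[\alpha_i(t,s') - \alpha_i(t,s)\big]\tilde q_{ss'}\Big\}\ln(x)\,dt\\
&+ \mathbb{E}\int_0^T \Big\{\dot\delta_i - \frac{\sigma^2}{2}\alpha_i + \sum_{s'}\big[\delta_i(t,s') - \delta_i(t,s)\big]\tilde q_{ss'}\Big\}\,dt\\
&+ \mathbb{E}\int_0^T \Big\{-(2k-1)\,r_i\Big(-\frac{1}{2k r_i}\,b_{2i}\alpha_i\Big)^{\frac{2k}{2k-1}} + \alpha_i\sum_{j\neq i} b_{2j}\Big[-\frac{1}{2k}\,\frac{\alpha_j b_{2j}}{r_j}\Big]^{\frac{1}{2k-1}}\Big\}\,dt\\
&+ \mathbb{E}\int_0^T \alpha_i\int_{\theta\in\Theta}\big[\ln(1+\gamma(\theta)) - \gamma(\theta)\big]\nu(d\theta)\,dt\\
&+ \mathbb{E}\int_0^T \Big[b_{2i}\alpha_i u_i + r_i u_i^{2k} + (2k-1)\,r_i\Big(-\frac{1}{2k r_i}\,b_{2i}\alpha_i\Big)^{\frac{2k}{2k-1}}\Big]\,dt.
\end{aligned} \tag{3}
$$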

Noting that

$$
b_{2i}\alpha_i u_i + r_i u_i^{2k} + (2k-1)\,r_i\Big(-\frac{1}{2k r_i}\,b_{2i}\alpha_i\Big)^{\frac{2k}{2k-1}} \geq 0
$$

with equality iff $u_i = u_i^*$, the announced result follows.
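The completion-of-terms inequality used at the end of the proof can be checked numerically. The following is a minimal sketch, assuming $r_i > 0$ and constant scalar values for $\alpha_i$ and $b_{2i}$; the helper names `signed_root`, `best_response` and `completed_gap` are hypothetical and introduced only for this illustration.

```python
import math

def signed_root(y, n):
    """Real n-th root for odd n, valid for negative y as well."""
    return math.copysign(abs(y) ** (1.0 / n), y)

def best_response(alpha, b2, r, k):
    """Minimizer of b2*alpha*u + r*u**(2k) over real u, assuming r > 0."""
    return signed_root(-alpha * b2 / (2 * k * r), 2 * k - 1)

def completed_gap(u, alpha, b2, r, k):
    """b2*alpha*u + r*u**(2k) plus the constant that makes its minimum zero.

    The constant equals (2k-1)*r*(u_star)**(2k), which coincides with
    (2k-1)*r*(-b2*alpha/(2k*r))**(2k/(2k-1)) when the real root is used.
    """
    u_star = best_response(alpha, b2, r, k)
    constant = (2 * k - 1) * r * u_star ** (2 * k)
    return b2 * alpha * u + r * u ** (2 * k) + constant

if __name__ == "__main__":
    for k in (1, 2, 3):
        u_star = best_response(0.7, 2.0, 0.5, k)
        print(k, u_star, completed_gap(u_star, 0.7, 2.0, 0.5, k))
```

The real odd root is computed with an explicit sign so that a negative argument does not produce a complex number, which is what a plain fractional power would return in Python.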

###### Remark 1

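For $k = 1$ the system reduces to the following ordinary differential equations:

$$
\begin{aligned}
&u_i^* = -\frac{1}{2}\,\frac{\alpha_i b_{2i}}{r_i}, \qquad \mathbb{E}[L_i(x,u^*)] = \mathbb{E}\big[\alpha_i(0)\ln(x_0) + \delta_i(0)\big],\\
&\dot\alpha_i + q_i + \alpha_i b_1 + \sum_{s'}\big[\alpha_i(t,s') - \alpha_i(t,s)\big]\tilde q_{ss'} = 0,\\
&\dot\delta_i - \frac{\sigma^2}{2}\alpha_i + \sum_{s'}\big[\delta_i(t,s') - \delta_i(t,s)\big]\tilde q_{ss'} - \frac{1}{4 r_i}\,b_{2i}^2\alpha_i^2 - \frac{1}{2}\,\alpha_i\sum_{j\neq i}\frac{b_{2j}^2\alpha_j}{r_j}\\
&\qquad + \alpha_i\int_{\theta\in\Theta}\big[\ln(1+\gamma(\theta)) - \gamma(\theta)\big]\nu(d\theta) = 0.
\end{aligned} \tag{4}
$$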
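The reduced system for $k = 1$ can be integrated backward in time with a few lines of code. The sketch below makes several simplifying assumptions that are not in the article: a single regime (so the switching sums vanish), no jumps ($\gamma = 0$), constant coefficients, terminal values $\alpha_i(T) = q_i(T)$ and $\delta_i(T) = 0$, and an explicit Euler scheme. The function names are hypothetical.

```python
import math

def solve_log_game(q, q_T, r, b1, b2, sigma, horizon=1.0, steps=2000):
    """Integrate the k = 1 system backward from t = T with explicit Euler.

    q, q_T, r, b2 are lists indexed by decision-maker.
    Returns (alpha(0), delta(0)) as two lists.
    """
    n = len(q)
    dt = horizon / steps
    alpha = list(q_T)      # assumed terminal condition alpha_i(T) = q_i(T)
    delta = [0.0] * n      # assumed terminal condition delta_i(T) = 0
    for _ in range(steps):
        d_alpha = [-(q[i] + alpha[i] * b1) for i in range(n)]
        d_delta = []
        for i in range(n):
            others = sum(b2[j] ** 2 * alpha[j] / r[j] for j in range(n) if j != i)
            d_delta.append(0.5 * sigma ** 2 * alpha[i]
                           + b2[i] ** 2 * alpha[i] ** 2 / (4.0 * r[i])
                           + 0.5 * alpha[i] * others)
        # one Euler step from t to t - dt
        alpha = [alpha[i] - dt * d_alpha[i] for i in range(n)]
        delta = [delta[i] - dt * d_delta[i] for i in range(n)]
    return alpha, delta

def alpha_closed_form(q, q_T, b1, horizon):
    """Exact alpha(0) for the linear equation with constant coefficients, b1 != 0."""
    return (q_T + q / b1) * math.exp(b1 * horizon) - q / b1

if __name__ == "__main__":
    alpha0, delta0 = solve_log_game([1.0, 1.0], [2.0, 2.0], [1.0, 1.0], 0.3, [0.5, 0.5], 0.2)
    print(alpha0, delta0, alpha_closed_form(1.0, 2.0, 0.3, 1.0))
```

Given the computed `alpha`, the equilibrium action of decision-maker `i` at the corresponding time is `-alpha[i] * b2[i] / (2 * r[i])`, which does not depend on the state in this logarithmic setting.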

### 2.2 Logarithm square

Consider the following non-linear non-quadratic mean-field-free game:

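$$
\begin{cases}
L_i(x,u) = q_i(T)\ln^2(x(T)) + \displaystyle\int_0^T \big(q_i \ln^2(x) + r_i u_i^2\big)\,dt,\\[4pt]
\inf_{u_i} \mathbb{E}[L_i(x,u)],\\[4pt]
\text{subject to}\\[2pt]
dx = \Big(-\dfrac{1}{2}x + b_1 x\ln(x) + \displaystyle\sum_{j\in\mathcal{I}} b_{2j}\,x\,u_j\Big)dt + x\sqrt{\ln(x)}\,dB,
\end{cases} \tag{5}
$$

with a given initial condition $x(0) = x_0$.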

###### Proposition 2

Assume that $r_i > 0$. The non-linear non-quadratic mean-field-free Nash equilibrium and the corresponding optimal cost are given by:

$$
u_i^* = -\frac{\alpha_i b_{2i}}{r_i}\ln(x), \qquad
\mathbb{E}[L_i(x,u^*)] = \mathbb{E}\big[\alpha_i(0)\ln^2(x_0)\big],
$$

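where $\alpha_i$ solves the following differential equation:

$$
\dot\alpha_i = -q_i + (1 - 2b_1)\,\alpha_i + 2\alpha_i\sum_{j\in\mathcal{I}\setminus\{i\}}\frac{\alpha_j b_{2j}^2}{r_j} + \frac{\alpha_i^2 b_{2i}^2}{r_i},
$$

with terminal condition $\alpha_i(T) = q_i(T)$.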
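The coupled Riccati-type equations of Proposition 2 have no jumps or switching terms, so they are easy to integrate numerically. The following is a minimal sketch, assuming constant coefficients, the terminal condition $\alpha_i(T) = q_i(T)$ suggested by the proof, and an explicit Euler scheme run backward in time; the function names are hypothetical.

```python
def solve_log_square_riccati(q, q_T, r, b1, b2, horizon=1.0, steps=4000):
    """Integrate the alpha_i equations backward from t = T; returns alpha(0)."""
    n = len(q)
    dt = horizon / steps
    alpha = list(q_T)  # assumed terminal condition alpha_i(T) = q_i(T)
    for _ in range(steps):
        rhs = []
        for i in range(n):
            cross = sum(alpha[j] * b2[j] ** 2 / r[j] for j in range(n) if j != i)
            rhs.append(-q[i] + (1.0 - 2.0 * b1) * alpha[i]
                       + 2.0 * alpha[i] * cross
                       + alpha[i] ** 2 * b2[i] ** 2 / r[i])
        # one Euler step from t to t - dt
        alpha = [alpha[i] - dt * rhs[i] for i in range(n)]
    return alpha

def feedback_gain(alpha_i, b2_i, r_i):
    """Coefficient of ln(x) in u_i^* = -(alpha_i * b2_i / r_i) * ln(x)."""
    return -alpha_i * b2_i / r_i

if __name__ == "__main__":
    alpha0 = solve_log_square_riccati([1.0, 1.0], [0.0, 0.0], [2.0, 2.0], 0.25, [1.0, 1.0])
    print(alpha0, feedback_gain(alpha0[0], 1.0, 2.0))
```

A quick sanity check is the single-decision-maker case with $q = (1 - 2b_1)\alpha + \alpha^2 b_2^2 / r$ for some constant $\alpha$: starting from that terminal value, the computed solution should stay constant.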

Proof. Consider the following guess functional:

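$$
f_i(t,x) = \alpha_i \ln^2(x).
$$

Applying Itô’s formula yields

$$
\begin{aligned}
f_i(T,x(T)) - f_i(0,x_0) ={}& \int_0^T \dot\alpha_i \ln^2(x)\,dt
+ \int_0^T 2\alpha_i \ln(x)\Big(-\frac{1}{2} + b_1\ln(x) + \sum_{j\in\mathcal{I}} b_{2j}u_j\Big)dt\\
&+ \int_0^T \alpha_i\big(1 - \ln(x)\big)\ln(x)\,dt + \int_0^T 2\alpha_i \ln(x)\sqrt{\ln(x)}\,dB.
\end{aligned}
$$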

Thus, the gap is given by

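$$
\begin{aligned}
\mathbb{E}[L_i(x,u) - f_i(0,x_0)] ={}& \mathbb{E}\big[(q_i(T) - \alpha_i(T))\ln^2(x(T))\big]
+ \mathbb{E}\int_0^T q_i \ln^2(x)\,dt + \mathbb{E}\int_0^T \dot\alpha_i \ln^2(x)\,dt\\
&+ \mathbb{E}\int_0^T \Big(2\alpha_i b_1 \ln^2(x) + 2\alpha_i \ln(x)\sum_{j\in\mathcal{I}\setminus\{i\}} b_{2j}u_j\Big)dt\\
&- \mathbb{E}\int_0^T \alpha_i \ln^2(x)\,dt + \mathbb{E}\int_0^T r_i\Big(u_i^2 + 2\alpha_i\ln(x)\frac{b_{2i}}{r_i}u_i\Big)dt.
\end{aligned}
$$

By performing square completion one obtains

$$
\Big(u_i + \alpha_i\ln(x)\frac{b_{2i}}{r_i}\Big)^2 - \alpha_i^2\ln^2(x)\frac{b_{2i}^2}{r_i^2} = u_i^2 + 2\alpha_i\ln(x)\frac{b_{2i}}{r_i}u_i,
$$

then, substituting the equilibrium strategies $u_j^*$ of the other decision-makers,

$$
\begin{aligned}
\mathbb{E}[L_i(x,u) - f_i(0,x_0)] ={}& \mathbb{E}\big[(q_i(T) - \alpha_i(T))\ln^2(x(T))\big]
+ \mathbb{E}\int_0^T q_i \ln^2(x)\,dt + \mathbb{E}\int_0^T \dot\alpha_i \ln^2(x)\,dt\\
&+ \mathbb{E}\int_0^T \Big(2\alpha_i b_1 \ln^2(x) - 2\alpha_i \ln^2(x)\sum_{j\in\mathcal{I}\setminus\{i\}} \frac{\alpha_j b_{2j}^2}{r_j}\Big)dt\\
&- \mathbb{E}\int_0^T \alpha_i \ln^2(x)\,dt + \mathbb{E}\int_0^T r_i\Big(u_i + \alpha_i\ln(x)\frac{b_{2i}}{r_i}\Big)^2 dt
- \mathbb{E}\int_0^T \alpha_i^2\ln^2(x)\frac{b_{2i}^2}{r_i}\,dt.
\end{aligned}
$$

Finally, the announced result is obtained by minimizing the completed square in $u_i$ and setting the coefficient of $\ln^2(x)$ and the terminal term to zero.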
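The drift computation behind this proof can be double-checked numerically. The sketch below applies the diffusion generator of (5) to $f(x) = \alpha\ln^2(x)$ using central finite differences and compares it with the closed-form expression $(2b_1 - 1)\alpha\ln^2(x) + 2\alpha\ln(x)\sum_j b_{2j}u_j$ obtained after the cancellation of the two $\alpha\ln(x)$ terms. The function names and the step size `h` are assumptions of this illustration, and $x > 1$ is required so that $\sqrt{\ln(x)}$ is real.

```python
import math

def generator_log_square(alpha, x, b1, b2, u, h=1e-4):
    """Generator of (5) applied to f(x) = alpha * ln(x)^2 via central differences."""
    f = lambda y: alpha * math.log(y) ** 2
    f_x = (f(x + h) - f(x - h)) / (2.0 * h)
    f_xx = (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)
    control = sum(bj * uj for bj, uj in zip(b2, u))
    drift = -0.5 * x + b1 * x * math.log(x) + x * control
    diffusion_sq = x * x * math.log(x)  # (x * sqrt(ln x))^2, needs x > 1
    return f_x * drift + 0.5 * f_xx * diffusion_sq

def closed_form(alpha, x, b1, b2, u):
    """Drift of alpha * ln(x)^2 as it appears in the gap computation."""
    lx = math.log(x)
    control = sum(bj * uj for bj, uj in zip(b2, u))
    return (2.0 * b1 - 1.0) * alpha * lx ** 2 + 2.0 * alpha * lx * control

if __name__ == "__main__":
    args = (0.8, 2.5, 0.3, [1.0, 0.5], [0.2, -0.4])
    print(generator_log_square(*args), closed_form(*args))
```

The agreement of the two values illustrates why the drift term $-\frac{1}{2}x$ is included in (5): it cancels the extra $\alpha\ln(x)$ contribution coming from the second-order Itô term.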

### 2.3 Legendre-Fenchel

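We consider convex running loss functions:

$$
\begin{cases}
L_i = q_{iT}\,l_1(x_T) + \displaystyle\int_0^T \Big(q_i\,l_1(x) + \sum_{j\in\mathcal{I}} r_{ij}\,l_1(u_j)\Big)dt,\\[4pt]
\inf_{u_i}\mathbb{E}[L_i],\\[4pt]
\text{subject to}\\[2pt]
\mathbb{P}\big(s(t+\epsilon)=s' \mid s\big) = \displaystyle\int_t^{t+\epsilon}\tilde q_{ss'}\,dt' + o(\epsilon),\quad s'\neq s,\\[4pt]
dx = \Big[b_1\dfrac{l_1(x)}{l_1'(x)} + \dfrac{h(x)\sum_j b_{2j}u_j}{l_1'(x)}\Big]dt + \sqrt{\dfrac{\sigma_1^2 + \sigma_2^2\,l_1(x)}{l_1''(x)}}\,dB,\\[4pt]
x(0) = x_0,\quad s(0) = s_0,
\end{cases} \tag{6}
$$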

Recall that the Legendre-Fenchel transform $l^*$ of a function $l$ is given by

$$
-l^*(x) = \inf_u \{\, l(u) - x u \,\}.
$$

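The transform recalled above can be approximated on a grid, which is useful for checking a candidate closed form. The following is a minimal sketch using the equivalent form $l^*(x) = \sup_u\{xu - l(u)\}$; the grid bounds, the grid size and the function name `legendre_fenchel` are assumptions of this illustration.

```python
def legendre_fenchel(l, x, u_min=-10.0, u_max=10.0, points=20001):
    """Approximate l*(x) = sup_u { x*u - l(u) } on a uniform grid of u values."""
    step = (u_max - u_min) / (points - 1)
    return max(x * (u_min + n * step) - l(u_min + n * step) for n in range(points))

if __name__ == "__main__":
    quadratic = lambda u: 0.5 * u * u
    quartic = lambda u: 0.25 * u ** 4
    print(legendre_fenchel(quadratic, 1.5))  # close to 1.5**2 / 2
    print(legendre_fenchel(quartic, 2.0))    # close to 0.75 * 2**(4/3)
```

For a differentiable, strictly convex $l$, the derivative $(l^*)'$ is the inverse function of $l'$, which is how the transform enters the equilibrium strategy of the next proposition.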
###### Proposition 3

Assume that are positive. Then, the game problem (6) has a solution:

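$$
u_i^* = (l_2^*)'\Big(-\frac{b_{2i}\alpha_i}{r_{ii}}\,h(x)\Big), \qquad
\mathbb{E}[L_i(x,u)] = \mathbb{E}\big[\alpha_i(0,s_0)\,l(x_0) + \delta_i(0,s_0)\big], \tag{7}
$$

with

$$
\begin{aligned}
&\dot\alpha_i + q_i + \alpha_i\Big(b_1 + \frac{\sigma_2^2}{2}\Big) + \sum_{s'}\big(\alpha_i(t,s') - \alpha_i(t,s)\big)\tilde q_{ss'} - \eta_{ii} + \sum_{j\neq i}\big(\eta_{ij} + \alpha_i b_{2j}\gamma_j\big) = 0,\\
&\dot\delta_i + \frac{\sigma_1^2}{2} + \sum_{s'}\big(\delta_i(t,s') - \delta_i(t,s)\big)\tilde q_{ss'} = 0,
\end{aligned} \tag{8}
$$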

where

 riil