# Three algorithms for solving high-dimensional fully-coupled FBSDEs through deep learning

Recently, the deep learning method has been used for solving forward backward stochastic differential equations (FBSDEs) and parabolic partial differential equations (PDEs). It has good accuracy and performance for high-dimensional problems. In this paper, we mainly solve fully coupled FBSDEs through deep learning and provide three algorithms. Several numerical results show remarkable performance especially for high-dimensional cases.


## 1 Introduction

Bismut first introduced the linear backward stochastic differential equation (BSDE) in 1973 [1]. In 1990, Pardoux and Peng proved the existence and uniqueness of the adapted solution for nonlinear BSDEs [2]. In 1997, El Karoui, Peng and Quenez [3] found important applications of BSDEs in finance. When a BSDE is coupled with a (forward) stochastic differential equation (SDE), the system is usually called a forward backward stochastic differential equation (FBSDE). In recent years, FBSDEs have shown important applications in many fields. For example, FBSDEs can be used to model financial markets in which a large investor influences the stock price [4], and the solution of an FBSDE is related to a second-order quasilinear partial differential equation (PDE) [5].

Generally speaking, it is difficult to obtain the explicit solution of an FBSDE, so approximate solutions must be sought. In this paper, we aim to obtain the numerical solution of the following fully-coupled FBSDE through deep learning:

$$
\begin{cases}
X_t = X_0 + \int_0^t b(s, X_s, Y_s, Z_s)\,ds + \int_0^t \sigma(s, X_s, Y_s, Z_s)\,dW_s, \\[2pt]
Y_t = g(X_T) + \int_t^T f(s, X_s, Y_s, Z_s)\,ds - \int_t^T Z_s\,dW_s.
\end{cases}
\tag{1.1}
$$

There are several ways to find the numerical solution of FBSDE (1.1). Based on the relationship between FBSDEs and PDEs (see [5]), numerical methods for solving PDEs, such as the finite element method, the finite difference method, or the sparse grid method [6], can be applied to solve FBSDEs. In [7, 8], Yong and Ma studied the solvability of coupled FBSDEs and proposed a four-step approach. Moreover, some probabilistic methods, which approximate the conditional expectation with numerical schemes, were developed to solve FBSDEs. For example, [9] proposed a high-accuracy theta-scheme for coupled Markovian FBSDEs, and [10] proposed a numerical scheme for coupled FBSDEs whose forward process does not depend on $Z$. The BCOS method [11] and Fourier methods [12] have also been proposed for solving FBSDEs.

As is known, there is a significant difficulty in solving high-dimensional BSDEs and FBSDEs, namely the "curse of dimensionality" [13]: the computational complexity grows exponentially as the dimension increases, while the accuracy declines sharply. Therefore, most of the aforementioned numerical methods cannot deal with high-dimensional problems.

Recently, deep learning has achieved great success in many application areas [14], such as computer vision [15, 16], gaming [17], etc. It provides a new way to approximate functions and shows promising performance on problems with high-dimensional features. This offers a possible way around the "curse of dimensionality", although the reason why deep learning performs so remarkably has not been completely explained.

Weinan E and his collaborators [18, 19] constructed a neural network to approximate the conditional expectation. This method has shown superior performance and accuracy in solving high-dimensional BSDEs compared with traditional numerical methods. Han and Long [20] extended this method to solve the following coupled FBSDE, where the forward SDE does not depend on $Z$:

$$
\begin{cases}
X_t = X_0 + \int_0^t b(s, X_s, Y_s)\,ds + \int_0^t \sigma(s, X_s, Y_s)\,dW_s, \\[2pt]
Y_t = g(X_T) + \int_t^T f(s, X_s, Y_s, Z_s)\,ds - \int_t^T Z_s\,dW_s.
\end{cases}
$$

They regard $Z$ as a control and assume that

$$Z_t = \phi(X_t, Y_t),$$

where the function $\phi$ is approximated by a neural network.
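To make this feedback structure concrete, the sketch below runs one Euler–Maruyama rollout of such a coupled system with $Z_t = \phi(X_t, Y_t)$. All coefficients, dimensions, and the fixed random one-layer map standing in for a trained network are illustrative assumptions, not the paper's examples:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes and coefficients -- illustrative choices, not the paper's examples.
n, m, d = 2, 1, 2          # dimensions of X, Y and the Brownian motion W
T, N = 1.0, 50             # time horizon and number of Euler steps
dt = T / N

def b(t, x, y):            # forward drift (no Z-dependence, as in [20])
    return -0.1 * x + 0.05 * y

def sigma(t, x, y):        # forward diffusion, shape (n, d)
    return 0.2 * np.eye(n)

def f(t, x, y, z):         # generator of the backward equation
    return -0.5 * y

def g(x):                  # terminal condition
    return np.sum(x ** 2, keepdims=True)

# Stand-in for the trained network phi(X_t, Y_t) -> Z_t: a fixed random
# one-hidden-layer map, just to make the rollout runnable.
W1 = rng.normal(scale=0.1, size=(8, n + m))
W2 = rng.normal(scale=0.1, size=(m * d, 8))
def phi(x, y):
    h = np.tanh(W1 @ np.concatenate([x, y]))
    return (W2 @ h).reshape(m, d)

# One Euler-Maruyama rollout of the coupled system, Z fed back through phi.
x = np.ones(n)             # X_0
y = np.ones(m)             # guess for Y_0 (a trainable parameter in practice)
for i in range(N):
    t = i * dt
    z = phi(x, y)
    dW = rng.normal(scale=np.sqrt(dt), size=d)
    x_next = x + b(t, x, y) * dt + sigma(t, x, y) @ dW
    y = y - f(t, x, y, z) * dt + z @ dW   # dY = -f dt + Z dW
    x = x_next

loss = float(np.sum((g(x) - y) ** 2))     # terminal mismatch |g(X_T) - Y_T|^2
```

In training, the network weights and the guess for $Y_0$ would be updated by stochastic gradient descent on the expectation of this terminal loss over a batch of rollouts.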

In this paper, we propose three algorithms to solve the fully-coupled FBSDE (1.1) through deep learning. The first algorithm (Algorithm 1) is inspired by the idea of the Picard iteration (see [5]). The term $Z$ is regarded as the control and, during the iterations, we assume that $\tilde Z^{k+1}$ depends on the paths of $\tilde X^{k+1}$, $\tilde Y^{k}$ and $\tilde Z^{k}$. In more detail, we set

$$\tilde Z^{k+1}_t = \phi\big(\tilde X^{k+1}_t, \tilde Y^{k}_t, \tilde Z^{k}_t\big),$$

where $k$ denotes the iteration step. It should be noted that this iterative approach is path-to-path. The second algorithm (Algorithm 2) is motivated by Han and Long [20]. We also regard $Z$ as the control and suppose that $Z$ depends on the state of the forward SDE and the state of the BSDE, i.e.

$$\tilde Z^{k+1}_t = \phi\big(\tilde X^{k+1}_t, \tilde Y^{k+1}_t\big).$$

This iterative relationship is adopted in [20], and we extend it to solve our fully-coupled FBSDE (1.1), in which $b$ and $\sigma$ depend on $Z$. In the third algorithm (Algorithm 3), in addition to the control $Z$, we regard $Y$ in the forward SDE as a new control and denote it by $u$. Both $u$ and $Z$ are supposed to depend on the state of the forward SDE:

$$u^{k+1}_t = \phi_1\big(\tilde X^{k+1}_t\big), \qquad \tilde Z^{k+1}_t = \phi_2\big(\tilde X^{k+1}_t\big).$$

The price of doing this is that we need to add a penalty term to the cost function to penalize the difference between the control $u$ and the solution $Y$ of the backward SDE.
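The penalty structure just described can be sketched numerically. Assuming discretized batch trajectories and a hypothetical penalty weight `lam` (both illustrative, not quantities from the paper), the Algorithm 3 cost might be evaluated as:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical discretized batch trajectories; shapes and the penalty weight
# `lam` are illustrative assumptions, not quantities from the paper.
batch, N, m = 64, 50, 1
dt, lam = 1.0 / N, 1.0

u_path = rng.normal(size=(batch, N + 1, m))                 # control u_t
y_path = u_path + 0.1 * rng.normal(size=(batch, N + 1, m))  # backward state Y_t
gxT = rng.normal(size=(batch, m))                           # g(X_T) per path
yT = y_path[:, -1, :]                                       # Y_T per path

# Terminal cost plus the penalty forcing u to track Y (Algorithm 3).
terminal = np.mean(np.sum((gxT - yT) ** 2, axis=-1))
penalty = np.mean(np.sum((u_path - y_path) ** 2, axis=(-2, -1))) * dt
total_cost = terminal + lam * penalty
```

The penalty term is a discretized $\lambda\, E\int_0^T |u_t - Y_t|^2\,dt$; as it is driven to zero, the control $u$ recovers $Y$ along the forward trajectory.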

As a result, all three algorithms can approximate the solution of FBSDE (1.1) and perform well in high-dimensional cases. As shown in the examples, the relative errors of these algorithms are less than 1%. Algorithm 1 takes only a few steps to achieve convergence, but each iteration may take more time. Algorithms 2 and 3 are computationally fast, but they may require more steps to converge.

The remainder of this paper is organized as follows. In Section 2, we introduce the preliminaries on FBSDEs and give the existence and uniqueness conditions for fully-coupled FBSDEs. In Section 3, the relationship between FBSDEs and an optimal control problem is presented, which indicates that FBSDEs can be solved from a control perspective; the theoretical proof is also given. According to different kinds of state feedback, we propose another two optimal control problems for solving FBSDE (1.1). In Section 4, we present our numerical schemes and the corresponding iterative algorithms. Section 5 gives some examples and compares the different algorithms for solving coupled FBSDEs.

## 2 Preliminaries on FBSDEs

In this section, we mainly introduce the form of FBSDEs and the existence and uniqueness conditions of fully-coupled FBSDEs [21].

Let $(\Omega, \mathcal{F}, \{\mathcal{F}_t\}_{0 \le t \le T}, P)$ be a filtered probability space, where $W_t$ is a $d$-dimensional standard Brownian motion on $[0, T]$ and $\{\mathcal{F}_t\}$ is the natural filtration generated by $W$. $X_0 \in \mathcal{F}_0$ is the initial condition for the FBSDE.

Consider the following coupled FBSDE,

$$
\begin{cases}
X_t = X_0 + \int_0^t b(s, X_s, Y_s, Z_s)\,ds + \int_0^t \sigma(s, X_s, Y_s, Z_s)\,dW_s, \\[2pt]
Y_t = g(X_T) + \int_t^T f(s, X_s, Y_s, Z_s)\,ds - \int_t^T Z_s\,dW_s,
\end{cases}
\tag{2.1}
$$

where $(X, Y, Z)$ are $\{\mathcal{F}_t\}$-adapted stochastic processes taking values in $\mathbb{R}^n$, $\mathbb{R}^m$ and $\mathbb{R}^{m \times d}$ respectively. The functions

$$
\begin{aligned}
b &: \Omega \times [0, T] \times \mathbb{R}^n \times \mathbb{R}^m \times \mathbb{R}^{m \times d} \to \mathbb{R}^n, \\
\sigma &: \Omega \times [0, T] \times \mathbb{R}^n \times \mathbb{R}^m \times \mathbb{R}^{m \times d} \to \mathbb{R}^{n \times d}, \\
f &: \Omega \times [0, T] \times \mathbb{R}^n \times \mathbb{R}^m \times \mathbb{R}^{m \times d} \to \mathbb{R}^m, \\
g &: \Omega \times \mathbb{R}^n \to \mathbb{R}^m
\end{aligned}
$$

are deterministic globally continuous functions. $b$ and $\sigma$ are the drift and diffusion coefficients of $X$ respectively, and $f$ is referred to as the generator of the coupled FBSDE. If a triple $(X_t, Y_t, Z_t)$ satisfies the above FBSDE on $[0, T]$, $P$-almost surely, and is square integrable and $\{\mathcal{F}_t\}$-adapted, the triple is called a solution of FBSDE (2.1). When the functions $b$ and $\sigma$ are independent of both $Y$ and $Z$, FBSDE (2.1) is called a decoupled FBSDE.

Given a full-rank $m \times n$ matrix $G$, we define

$$
u = \begin{pmatrix} x \\ y \\ z \end{pmatrix}, \qquad
A(t, u) = \begin{pmatrix} -G^{T} f \\ G b \\ G \sigma \end{pmatrix}(t, u),
$$

where $G\sigma = (G\sigma_1, \dots, G\sigma_d)$.

First, we state two assumptions.

###### Assumption 1.
1. $A(t, u)$ is uniformly Lipschitz with respect to $u$;

2. $A(\cdot, u)$ is in $M^2(0, T)$ for each $u$;

3. $g(x)$ is uniformly Lipschitz with respect to $x$;

4. $g(x)$ is in $L^2(\Omega, \mathcal{F}_T, P)$ for each $x$.

###### Assumption 2.
$$
\begin{aligned}
\langle A(t, u) - A(t, \bar{u}),\, u - \bar{u} \rangle &\le -\beta_1 |G\hat{x}|^2 - \beta_2 \big(|G^{T}\hat{y}|^2 + |G^{T}\hat{z}|^2\big), \\
\langle g(x) - g(\bar{x}),\, G(x - \bar{x}) \rangle &\ge \mu_1 |G\hat{x}|^2, \\
\forall\, u = (x, y, z),\ \bar{u} = (\bar{x}, \bar{y}, \bar{z}),\quad
\hat{x} &= x - \bar{x},\ \hat{y} = y - \bar{y},\ \hat{z} = z - \bar{z},
\end{aligned}
$$

where $\beta_1$, $\beta_2$ and $\mu_1$ are given nonnegative constants with $\beta_1 + \beta_2 > 0$ and $\beta_2 + \mu_1 > 0$.

Then we have the following theorem:

###### Theorem 1.

Let Assumptions 1 and 2 hold. Then there exists a unique adapted solution $(X, Y, Z)$ of FBSDE (2.1).

We omit the proof; see Theorem 2.6 of [21] for details.

###### Remark.

Assumptions 1 and 2 are sufficient but not necessary conditions for the solvability of FBSDE (2.1). Many FBSDEs that do not satisfy Assumptions 1 and 2 also have solutions, such as

$$
X_t = X_0 + \int_0^t Z_s\,dW_s, \qquad
Y_t = X_T - \int_t^T Z_s\,dW_s,
$$

where we set $b = f = 0$, $\sigma(s, x, y, z) = z$ and $g(x) = x$.

For convenience, in this article, we assume that $L$ is the Lipschitz constant in Assumption 1, that is,

$$
\begin{aligned}
|l(t, x, y, z) - l(t, x', y', z')| &\le L\big(|x - x'| + |y - y'| + |z - z'|\big), \\
|g(x) - g(x')| &\le L|x - x'|,
\end{aligned}
$$

where $l$ represents any one of the functions $b$, $\sigma$ and $f$.

## 3 Solving FBSDEs from an optimal control perspective

Essentially, a deep neural network can be regarded as a control system used to approximate the mapping from the input set to the label set: the parameters of the network can be seen as the control, and the cost function can be seen as the optimization objective. Thus, we first transform the FBSDE-solving problem into an optimal control problem in order to apply deep neural networks.

### 3.1 Picard iteration method

Before the transformation, we recall an existing result. As we know, FBSDE (2.1) has a unique solution under the monotonicity conditions, and Pardoux and Tang's results [5] show that this solution can be constructed via the Picard iteration

$$
\begin{cases}
X^{k+1}_t = X_0 + \int_0^t b(s, X^{k+1}_s, Y^{k}_s, Z^{k}_s)\,ds + \int_0^t \sigma(s, X^{k+1}_s, Y^{k}_s, Z^{k}_s)\,dW_s, \\[2pt]
Y^{k+1}_t = g(X^{k+1}_T) + \int_t^T f(s, X^{k+1}_s, Y^{k+1}_s, Z^{k+1}_s)\,ds - \int_t^T Z^{k+1}_s\,dW_s,
\end{cases}
\tag{3.1}
$$

where $(Y^0, Z^0)$ is given and $k$ denotes the iteration step. The solution of the decoupled FBSDE (3.1) converges to the solution of (2.1) as $k$ tends to infinity, i.e.

$$
\lim_{k \to \infty} E\Big[\sup_{0 \le t \le T}\big(|X^{k+1}_t - X_t|^2 + |Y^{k+1}_t - Y_t|^2\big) + \int_0^T |Z^{k+1}_t - Z_t|^2\,dt\Big] = 0.
\tag{3.2}
$$
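To see the Picard mechanism concretely, the sketch below runs the analogous alternating forward/backward scheme on a deterministic caricature ($\sigma = 0$, $Z \equiv 0$, so the FBSDE collapses to a coupled forward-backward ODE). All coefficients are illustrative and chosen weakly coupled so that the iteration map contracts:

```python
import numpy as np

# Deterministic caricature of the Picard scheme (sigma = 0, Z = 0), so the
# FBSDE reduces to the coupled forward-backward ODE
#   x' = b(x, y),  x(0) = x0;    y' = -f(x, y),  y(T) = g(x(T)).
# Coefficients below are illustrative and weakly coupled so the map contracts.
T, N = 1.0, 200
dt = T / N
x0 = 1.0
b = lambda x, y: -x + 0.25 * y
f = lambda x, y: x
g = lambda x: x

y = np.zeros(N + 1)                    # Y^0: initial guess for the backward path
for k in range(40):                    # Picard iterations
    # forward pass: solve x^{k+1} using the previous backward path y^k
    x = np.empty(N + 1)
    x[0] = x0
    for i in range(N):
        x[i + 1] = x[i] + b(x[i], y[i]) * dt
    # backward pass: solve y^{k+1} from the terminal condition g(x^{k+1}(T))
    y_new = np.empty(N + 1)
    y_new[-1] = g(x[-1])
    for i in range(N - 1, -1, -1):
        y_new[i] = y_new[i + 1] + f(x[i + 1], y_new[i + 1]) * dt
    gap = float(np.max(np.abs(y_new - y)))   # sup-norm distance between iterates
    y = y_new
    if gap < 1e-12:
        break
```

Each iteration decouples the system: the forward equation sees only the previous backward path, and the backward equation sees only the freshly computed forward path, exactly the structure of (3.1).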

Regarding $\tilde Y^{k+1}_0$ and $\{\tilde Z^{k+1}_t\}_{0 \le t \le T}$ as controls, we consider the following control problem:

$$
\begin{aligned}
\inf_{\tilde Y^{k+1}_0,\, \{\tilde Z^{k+1}_t\}_{0 \le t \le T}} \quad & E\big[|g(\tilde X^{k+1}_T) - \tilde Y^{k+1}_T|^2\big], \\
\text{s.t.} \quad
\tilde X^{k+1}_t &= X_0 + \int_0^t b(s, \tilde X^{k+1}_s, \tilde Y^{k}_s, \tilde Z^{k}_s)\,ds + \int_0^t \sigma(s, \tilde X^{k+1}_s, \tilde Y^{k}_s, \tilde Z^{k}_s)\,dW_s, \\
\tilde Y^{k+1}_t &= \tilde Y^{k+1}_0 - \int_0^t f(s, \tilde X^{k+1}_s, \tilde Y^{k+1}_s, \tilde Z^{k+1}_s)\,ds + \int_0^t \tilde Z^{k+1}_s\,dW_s.
\end{aligned}
\tag{3.3}
$$

In the following, we will show that control problem (3.3) is equivalent to FBSDE (3.1).

When $(\tilde Y^{k}, \tilde Z^{k})$ are known, we consider the following SDE:

$$
\begin{cases}
\tilde X^{k+1}_t = X_0 + \int_0^t b(s, \tilde X^{k+1}_s, \tilde Y^{k}_s, \tilde Z^{k}_s)\,ds + \int_0^t \sigma(s, \tilde X^{k+1}_s, \tilde Y^{k}_s, \tilde Z^{k}_s)\,dW_s, \\[2pt]
\tilde Y^{k+1}_t = \tilde Y^{k+1}_0 - \int_0^t f(s, \tilde X^{k+1}_s, \tilde Y^{k+1}_s, \tilde Z^{k+1}_s)\,ds + \int_0^t \tilde Z^{k+1}_s\,dW_s.
\end{cases}
\tag{3.4}
$$

Equation (3.4) has an infinite number of solutions because both the initial value $\tilde Y^{k+1}_0$ and the process $\tilde Z^{k+1}$ are undetermined.

Given $\tilde Y^{k+1}_0$ and the process $\tilde Z^{k+1}$, and setting $\tilde U_t = (\tilde X^{k+1}_t, \tilde Y^{k+1}_t)$, which takes values in $\mathbb{R}^{n+m}$ and depends on the initial condition and the process $\tilde Z^{k+1}$, equation (3.4) can be written as

$$
\tilde U_t = \tilde U_0 + \int_0^t \tilde b(s, \tilde U_s)\,ds + \int_0^t \tilde \sigma(s, \tilde U_s)\,dW_s,
\tag{3.5}
$$

where

$$
\begin{aligned}
\tilde b &: \Omega \times [0, T] \times \mathbb{R}^{n+m} \to \mathbb{R}^{n+m}, \\
\tilde \sigma &: \Omega \times [0, T] \times \mathbb{R}^{n+m} \to \mathbb{R}^{(n+m) \times d}
\end{aligned}
$$

are two functions denoted as

$$
\begin{aligned}
\tilde b(s, \tilde U_s) &= \big(b(s, \tilde X^{k+1}_s, \tilde Y^{k}_s, \tilde Z^{k}_s),\; -f(s, \tilde X^{k+1}_s, \tilde Y^{k+1}_s, \tilde Z^{k+1}_s)\big), \\
\tilde \sigma(s, \tilde U_s) &= \big(\sigma(s, \tilde X^{k+1}_s, \tilde Y^{k}_s, \tilde Z^{k}_s),\; \tilde Z^{k+1}_s\big).
\end{aligned}
$$
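As a sketch, the stacking of $(b, -f)$ and $(\sigma, \tilde Z^{k+1})$ into the coefficients of the augmented state $U = (X, Y)$ can be written as follows; the coefficient functions are illustrative placeholders, not the paper's examples:

```python
import numpy as np

# Illustrative stacking of (b, -f) and (sigma, Z^{k+1}) into the drift and
# diffusion of the augmented state U = (X, Y); the coefficient functions are
# placeholders, not the paper's examples.
n, m, d = 2, 1, 2

def b_fwd(s, x, y_prev, z_prev):       # forward drift, fed the iterate-k inputs
    return -x + 0.1 * y_prev

def sigma_fwd(s, x, y_prev, z_prev):   # forward diffusion, shape (n, d)
    return 0.3 * np.eye(n)

def f_bwd(s, x, y, z):                 # generator of the backward equation
    return -0.5 * y

def b_tilde(s, u, y_prev, z_prev, z_new):
    x, y = u[:n], u[n:]
    # drift of U: forward drift stacked over minus the generator
    return np.concatenate([b_fwd(s, x, y_prev, z_prev), -f_bwd(s, x, y, z_new)])

def sigma_tilde(s, u, y_prev, z_prev, z_new):
    x, y = u[:n], u[n:]
    # diffusion of U: forward diffusion stacked over Z^{k+1}, shape (n+m, d)
    return np.vstack([sigma_fwd(s, x, y_prev, z_prev), z_new.reshape(m, d)])
```

With this stacking, one Euler step of (3.5) advances $X$ and $Y$ simultaneously, which is what makes the forward simulation of the controlled system possible.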
###### Lemma 1.

Assume that Assumptions 1 and 2 hold. Then equation (3.5) has a unique solution $\tilde U$ when $\tilde Y^{k+1}_0$ and the process $\tilde Z^{k+1}$ are given, and hence equation (3.4) has a unique solution $(\tilde X^{k+1}, \tilde Y^{k+1})$.

###### Proof.

According to Assumption 1 (i), and since the matrix $G$ is full-rank, the functions $b$, $\sigma$ and $f$ are all uniformly Lipschitz, satisfying

$$
\begin{aligned}
|\Delta b(t)| + |\Delta \sigma(t)| + |\Delta f(t)| &\le 3L\big(|x_1 - x_2| + |y_1 - y_2|\big), \\
\Delta b(t) &= b(t, x_1, y_1, \cdot) - b(t, x_2, y_2, \cdot), \\
\Delta \sigma(t) &= \sigma(t, x_1, y_1, \cdot) - \sigma(t, x_2, y_2, \cdot), \\
\Delta f(t) &= f(t, x_1, y_1, \cdot) - f(t, x_2, y_2, \cdot), \\
&\quad \forall\, x_i \in \mathbb{R}^n,\ y_i \in \mathbb{R}^m,\ t \in [0, T],\ i = 1, 2,
\end{aligned}
$$

where $L$ is the Lipschitz constant. Then we obtain the following inequality:

$$
|\tilde b(t, u_1) - \tilde b(t, u_2)| + |\tilde \sigma(t, u_1) - \tilde \sigma(t, u_2)| \le L'|u_1 - u_2|, \quad \forall\, u_i = (x_i, y_i) \in \mathbb{R}^{n+m},\ t \in [0, T],\ i = 1, 2,
$$

where $L'$ only depends on $L$. There exists a constant $D$ such that

$$
\sup_t\big(|\tilde b(t, 0)| + |\tilde \sigma(t, 0)|\big) \le D < \infty.
$$

According to Assumption 1 (ii), there exists a constant $C$ satisfying

$$
\begin{aligned}
|\tilde b(t, u)| + |\tilde \sigma(t, u)| &\le |\tilde b(t, u) - \tilde b(t, 0)| + |\tilde \sigma(t, u) - \tilde \sigma(t, 0)| + |\tilde b(t, 0)| + |\tilde \sigma(t, 0)| \\
&\le L'|u| + D \le C(1 + |u|),
\end{aligned}
$$

where $C = \max(L', D)$. According to [22], the proof is completed. ∎

Lemma 1 shows that the solution of (3.4) is determined by $\tilde Y^{k+1}_0$ and $\tilde Z^{k+1}$. However, because $\tilde Y^{k+1}_0$ and $\tilde Z^{k+1}$ are free, equation (3.4) has an infinite number of solutions.

Now we define the cost functional $J$ as

$$
J\big(\tilde Y^{k+1}_0, \tilde Z^{k+1}\big) = E\big[|g(\tilde X^{k+1}_T) - \tilde Y^{k+1}_T|^2\big],
$$

and then the control problem (3.3) becomes the problem of minimizing $J$. In the following Theorem 2, we will show that the solution of (3.4) converges to the solution of FBSDE (2.1) when $J(\tilde Y^{k+1}_0, \tilde Z^{k+1})$ goes to zero as $k$ tends to infinity, which means that solving the control problem (3.3) is equivalent to solving FBSDE (2.1).
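As a sanity check on the control formulation, $J$ can be estimated by Monte Carlo on the toy FBSDE from the Remark in Section 2 ($b = f = 0$, $\sigma = z$, $g(x) = x$): there $g(X_T) - Y_T = X_0 - \tilde Y_0$ path by path, so $J$ vanishes exactly when the control $\tilde Y_0$ matches $X_0$. A minimal sketch:

```python
import numpy as np

rng = np.random.default_rng(2)

# Monte Carlo estimate of J for the toy FBSDE from the Remark in Section 2
# (b = f = 0, sigma = z, g(x) = x). For a constant control Z = z_ctrl,
# X_T = X0 + int Z dW and Y_T = Y0 + int Z dW share the same stochastic
# integral, so g(X_T) - Y_T = X0 - Y0 path by path and J = (X0 - Y0)^2.
T, N, paths = 1.0, 100, 5000
dt = T / N
X0 = 1.0

def J(Y0, z_ctrl):
    """Estimate J(Y0, Z) = E|g(X_T) - Y_T|^2 under the constant control Z."""
    dW = rng.normal(scale=np.sqrt(dt), size=(paths, N))
    stoch = z_ctrl * dW.sum(axis=1)   # int_0^T Z dW for each simulated path
    x_T = X0 + stoch
    y_T = Y0 + stoch
    return float(np.mean((x_T - y_T) ** 2))
```

Minimizing this cost over $\tilde Y_0$ recovers the correct initial value $Y_0 = X_0$, which is exactly the mechanism Theorem 2 formalizes for the general fully-coupled case.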

###### Theorem 2.

Suppose Assumptions 1 and 2 hold and define the constants

$$
\begin{aligned}
C_1 &= e^{(4L + 3L^2)T}\,(L + 3L^2), \\
C_2 &= C_1 (L + L^2)\, e^{(2L + 2L^2)T}, \\
C_3 &= C_2 T, \\
C_4 &= 3L^2 C_1 + 9L^2 (C_1 + 2C_3) + 3C_3, \\
C_5 &= (C_4 + C_3)(T + 1).
\end{aligned}
$$

If $(\tilde Y^{k+1}_0, \tilde Z^{k+1})$ satisfies

$$
\lim_{k \to \infty} J\big(\tilde Y^{k+1}_0, \tilde Z^{k+1}\big) = 0,
$$

then the solution of SDE (3.4) satisfies

$$
\lim_{k \to \infty} E\Big[\sup_{0 \le t \le T}|\tilde Y^{k+1}_t - Y_t|^2 + \int_0^T |\tilde Z^{k+1}_t - Z_t|^2\,dt\Big] = 0,
\tag{3.6}
$$

where $(X, Y, Z)$ is the solution of FBSDE (2.1).

###### Proof.

The proof of this theorem is divided into two steps.

• Suppose that the following equation

$$
\begin{cases}
\hat X^{k+1}_t = X_0 + \int_0^t b(s, \hat X^{k+1}_s, \tilde Y^{k}_s, \tilde Z^{k}_s)\,ds + \int_0^t \sigma(s, \hat X^{k+1}_s, \tilde Y^{k}_s, \tilde Z^{k}_s)\,dW_s, \\[2pt]
\hat Y^{k+1}_t = g(\hat X^{k+1}_T) + \int_t^T f(s, \hat X^{k+1}_s, \hat Y^{k+1}_s, \hat Z^{k+1}_s)\,ds - \int_t^T \hat Z^{k+1}_s\,dW_s,
\end{cases}
\tag{3.7}
$$

has a solution $(\hat X^{k+1}, \hat Y^{k+1}, \hat Z^{k+1})$. Let

$$
\begin{aligned}
\delta X^{k+1}_t &= \hat X^{k+1}_t - X^{k+1}_t, \qquad
\delta Y^{k+1}_t = \hat Y^{k+1}_t - Y^{k+1}_t, \qquad
\delta Z^{k+1}_t = \hat Z^{k+1}_t - Z^{k+1}_t, \\
\delta Y^{k+1}_T &= g(\hat X^{k+1}_T) - g(X^{k+1}_T), \\
\delta b_t &= b(t, \hat X^{k+1}_t, \tilde Y^{k}_t, \tilde Z^{k}_t) - b(t, X^{k+1}_t, Y^{k}_t, Z^{k}_t), \\
\delta \sigma_t &= \sigma(t, \hat X^{k+1}_t, \tilde Y^{k}_t, \tilde Z^{k}_t) - \sigma(t, X^{k+1}_t, Y^{k}_t, Z^{k}_t), \\
\delta f_t &= f(t, \hat X^{k+1}_t, \hat Y^{k+1}_t, \hat Z^{k+1}_t) - f(t, X^{k+1}_t, Y^{k+1}_t, Z^{k+1}_t).
\end{aligned}
$$

From (3.1) and (3.7), we get

$$
\begin{aligned}
\delta X^{k+1}_t &= \int_0^t \delta b_s\,ds + \int_0^t \delta \sigma_s\,dW_s, \\
\delta Y^{k+1}_t &= \delta Y^{k+1}_T + \int_t^T \delta f_s\,ds - \int_t^T \delta Z^{k+1}_s\,dW_s,
\end{aligned}
$$

whose differential form is

$$
\begin{aligned}
d\,\delta X^{k+1}_t &= \delta b_t\,dt + \delta \sigma_t\,dW_t, \\
-d\,\delta Y^{k+1}_t &= \delta f_t\,dt - \delta Z^{k+1}_t\,dW_t.
\end{aligned}
$$

Applying Itô's formula to $|\delta X^{k+1}_t|^2$ gives

$$
\begin{aligned}
d|\delta X^{k+1}_t|^2 &= 2\,\delta X^{k+1}_t \cdot d\,\delta X^{k+1}_t + d\,\delta X^{k+1}_t \cdot d\,\delta X^{k+1}_t \\
&= 2\,\delta X^{k+1}_t\big(\delta b_t\,dt + \delta \sigma_t\,dW_t\big) + |\delta \sigma_t|^2\,dt.
\end{aligned}
$$

Integrating from $0$ to $t$,

$$
|\delta X^{k+1}_t|^2 = 2\int_0^t \delta X^{k+1}_s\big(\delta b_s\,ds + \delta \sigma_s\,dW_s\big) + \int_0^t |\delta \sigma_s|^2\,ds,
$$

and taking the expectation,

$$
\begin{aligned}
E\big[|\delta X^{k+1}_t|^2\big] &= E\Big[\int_0^t \big(2\,\delta X^{k+1}_s\,\delta b_s + |\delta \sigma_s|^2\big)\,ds\Big] \\
&\le 2E\Big[\int_0^t |\delta X^{k+1}_s|\,L\big(|\delta X^{k+1}_s| + |\tilde Y^{k}_s - Y^{k}_s| + |\tilde Z^{k}_s - Z^{k}_s|\big)\,ds\Big] \\
&\quad + E\Big[\int_0^t L^2\big(|\delta X^{k+1}_s| + |\tilde Y^{k}_s - Y^{k}_s| + |\tilde Z^{k}_s - Z^{k}_s|\big)^2\,ds\Big] \\
&\le E\Big[\int_0^t \big((2L + L + L)|\delta X^{k+1}_s|^2 + L|\tilde Y^{k}_s - Y^{k}_s|^2 + L|\tilde Z^{k}_s - Z^{k}_s|^2\big)\,ds\Big] \\
&\quad + E\Big[\int_0^t 3L^2\big(|\delta X^{k+1}_s|^2 + |\tilde Y^{k}_s - Y^{k}_s|^2 + |\tilde Z^{k}_s - Z^{k}_s|^2\big)\,ds\Big] \\
&= (4L + 3L^2)\,E\Big[\int_0^t |\delta X^{k+1}_s|^2\,ds\Big] + (L + 3L^2)\,E\Big[\int_0^t \big(|\tilde Y^{k}_s - Y^{k}_s|^2 + |\tilde Z^{k}_s - Z^{k}_s|^2\big)\,ds\Big] \\
&\le (4L + 3L^2)\,E\Big[\int_0^t |\delta X^{k+1}_s|^2\,ds\Big] + (L + 3L^2)\,E\Big[\int_0^T \big(|\tilde Y^{k}_s - Y^{k}_s|^2 + |\tilde Z^{k}_s - Z^{k}_s|^2\big)\,ds\Big].
\end{aligned}
$$

By Gronwall's inequality, we get

$$
\begin{aligned}
E\big[|\delta X^{k+1}_t|^2\big] &\le (L + 3L^2)\,E\Big[\int_0^T \big(|\tilde Y^{k}_s - Y^{k}_s|^2 + |\tilde Z^{k}_s - Z^{k}_s|^2\big)\,ds\Big] \cdot e^{(4L + 3L^2)T} \\
&= C_1\,E\Big[\int_0^T \big(|\tilde Y^{k}_s - Y^{k}_s|^2 + |\tilde Z^{k}_s - Z^{k}_s|^2\big)\,ds\Big].
\end{aligned}
\tag{3.8}
$$

Similarly, we have

$$
\begin{aligned}
-d|\delta Y^{k+1}_t|^2 &= -2\,\delta Y^{k+1}_t \cdot d\,\delta Y^{k+1}_t - d\,\delta Y^{k+1}_t \cdot d\,\delta Y^{k+1}_t \\
&= 2\,\delta Y^{k+1}_t\big(\delta f_t\,dt - \delta Z^{k+1}_t\,dW_t\big) - |\delta Z^{k+1}_t|^2\,dt.
\end{aligned}
$$

Integrating from $t$ to $T$,

$$
|\delta Y^{k+1}_t|^2 + \int_t^T |\delta Z^{k+1}_s|^2\,ds = |\delta Y^{k+1}_T|^2 + 2\int_t^T \delta Y \ldots
$$