Bismut first introduced linear backward stochastic differential equations (BSDEs) in 1973. In 1990, Pardoux and Peng proved the existence and uniqueness of adapted solutions to nonlinear BSDEs. In 1997, El Karoui, Peng and Quenez found important applications of BSDEs in finance. When a BSDE is coupled with a (forward) stochastic differential equation (SDE), the system is usually called a forward-backward stochastic differential equation (FBSDE). In recent years, FBSDEs have found important applications in many fields. For example, FBSDEs can be used to model financial markets in which a large investor influences the stock price. The solution of an FBSDE is related to a second-order quasilinear partial differential equation (PDE).
In general, it is difficult to obtain an explicit solution of an FBSDE, so an approximate solution is needed. In this paper, we aim to obtain the numerical solution of the following fully-coupled FBSDE through deep learning:
There are several ways to find the numerical solution of FBSDE (1.1). Based on the relationship between FBSDEs and PDEs, numerical methods for PDEs, such as the finite element method, the finite difference method, or the sparse grid method, can be applied to solve FBSDEs. In [7, 8], Yong and Ma studied the solvability of coupled FBSDEs and proposed a four-step approach. Moreover, some probabilistic methods, which approximate the conditional expectation with numerical schemes, were developed to solve FBSDEs. For example, a theta-scheme numerical method with high accuracy has been proposed for coupled Markovian FBSDEs, and a numerical scheme has been proposed for coupled FBSDEs in which the forward process does not depend on . The BCOS method and Fourier methods have also been proposed for solving FBSDEs.
As is well known, solving high-dimensional BSDEs and FBSDEs faces a significant difficulty, namely the "curse of dimensionality": the computational complexity grows exponentially as the dimension increases, while the accuracy declines sharply. Therefore, most of the aforementioned numerical methods cannot deal with high-dimensional problems.
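To make the exponential growth concrete, the following minimal sketch counts the nodes of a full tensor-product grid, as a grid-based PDE solver would need. The function name `grid_points` and the choice of 10 points per axis are illustrative assumptions, not taken from the methods cited above.

```python
def grid_points(points_per_axis: int, dimension: int) -> int:
    """Number of nodes in a full tensor-product grid (hypothetical helper)."""
    return points_per_axis ** dimension


# With only 10 points per axis, the node count is 10**d.
for d in (1, 2, 5, 10, 100):
    print(f"dimension {d:>3}: {grid_points(10, d)} grid points")
```

Even at this coarse resolution, a 100-dimensional problem would require 10^100 nodes, which is why grid-based schemes are not practical in high dimensions.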
Recently, deep learning has achieved great success in many application areas, such as computer vision [16] and gaming. It provides a new way to approximate functions and shows promising performance on problems with high-dimensional features. This offers a possible way to overcome the "curse of dimensionality", although the reason why deep learning performs so remarkably well has not yet been fully established.
A neural network has been constructed to approximate the conditional expectation. This method has shown superior performance and accuracy in solving high-dimensional BSDEs compared with traditional numerical methods. Han and Long extended this method to solve the following coupled FBSDE, where the forward SDE does not depend on :
They regard as a control and assume that
where the function is approximated by a neural network.
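The idea of treating the martingale integrand as a feedback control can be illustrated with a minimal sketch. Everything below is an assumption chosen for illustration: a one-dimensional toy problem with forward dynamics dX = dW, zero generator and terminal condition g(x) = x^2, whose exact solution is Y_t = X_t^2 + (T - t) with integrand 2 X_t. A hand-written function `z_feedback(t, x, y)` stands in for the neural network, and `terminal_loss` is a hypothetical name for the Monte Carlo estimate of the mean squared terminal mismatch.

```python
import math
import random


def terminal_loss(z_feedback, y0, x0=1.0, T=1.0, n_steps=50, n_paths=2000, seed=0):
    """Estimate E|Y_T - g(X_T)|^2 when Y is run forward under a feedback control."""
    rng = random.Random(seed)
    dt = T / n_steps
    total = 0.0
    for _ in range(n_paths):
        x, y = x0, y0
        for n in range(n_steps):
            dw = rng.gauss(0.0, math.sqrt(dt))
            z = z_feedback(n * dt, x, y)  # control evaluated at the current state
            y = y + z * dw                # toy generator f = 0
            x = x + dw                    # toy forward SDE: b = 0, sigma = 1
        total += (y - x * x) ** 2         # mismatch with g(x) = x^2
    return total / n_paths


# Exact feedback and exact initial value of the toy problem.
loss_true = terminal_loss(lambda t, x, y: 2.0 * x, y0=1.0 + 1.0)
# A deliberately wrong control (identically zero) with the same initial value.
loss_zero = terminal_loss(lambda t, x, y: 0.0, y0=1.0 + 1.0)
print(loss_true, loss_zero)
```

In this toy setting the exact feedback leaves only a discretization error in the loss, while the wrong control gives a much larger value. A learning procedure would replace the hand-written feedback with a parametrized network and minimize the same loss over its parameters and the initial value.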
In this paper, we propose three algorithms to solve the fully-coupled FBSDE (1.1) through deep learning. The first algorithm (Algorithm 1) is inspired by the Picard iteration. The term is regarded as the control and, in the iterations, we assume that depends on , and along their paths. In more detail, we set
where denotes the iteration step. It should be noted that this iterative approach is path-to-path. The second algorithm (Algorithm 2) is motivated by Han and Long. We also regard as the control and suppose that depends on the state of the forward SDE and the state of the BSDE, i.e.
The above relationship is adopted in , and we extend it to solve our fully-coupled FBSDE (1.1), in which and depend on . In the third algorithm (Algorithm 3), in addition to the control , we regard in the forward SDE as a new control and denote it by . Both and are assumed to depend on the state of the forward SDE:
The price of doing so is that a penalty term must be added to the cost function to penalize the difference between the control and the solution of the backward SDE.
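A minimal sketch of such a penalized cost is given below. The exact form and weighting are assumptions made for illustration: the cost is taken to be the mean squared terminal mismatch plus a weighted, time-discretized squared distance between the backward process and the extra control. The names `penalised_cost`, `y_paths`, `u_paths`, `g_terminal` and `weight` are hypothetical.

```python
def penalised_cost(y_paths, u_paths, g_terminal, dt, weight=1.0):
    """Monte Carlo estimate of E[|Y_T - g(X_T)|^2 + weight * sum_n |Y_n - u_n|^2 * dt].

    y_paths    : list of simulated paths of the backward process (lists of floats)
    u_paths    : list of paths of the control that replaces it in the forward SDE
    g_terminal : list of terminal targets g(X_T), one per path
    """
    total = 0.0
    for y, u, g_T in zip(y_paths, u_paths, g_terminal):
        mismatch = (y[-1] - g_T) ** 2
        penalty = sum((y_n - u_n) ** 2 for y_n, u_n in zip(y, u)) * dt
        total += mismatch + weight * penalty
    return total / len(y_paths)


# One path where control and backward process agree, one where they differ.
print(penalised_cost([[1.0, 2.0]], [[1.0, 2.0]], [2.0], dt=0.5))
print(penalised_cost([[1.0, 2.0]], [[0.0, 2.0]], [3.0], dt=0.5, weight=2.0))
```

The cost vanishes only when the terminal condition is met and the control coincides with the backward process along the path, which is the consistency the penalty is meant to enforce.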
All three algorithms can approximate the solution of FBSDE (1.1) and perform well in high-dimensional cases. As shown in the examples, the relative errors of these algorithms are less than 1%. Algorithm 1 needs only a few steps to converge, but each iteration may take more time. Algorithms 2 and 3 are computationally fast, but they may require more steps to converge.
The remainder of this paper is organized as follows. In Section 2, we first introduce the preliminaries on FBSDEs and give the existence and uniqueness conditions for fully-coupled FBSDEs. In Section 3, the relationship between FBSDEs and an optimal control problem is presented, which indicates that FBSDEs can be solved from a control perspective. The theoretical proof is also given. According to different kinds of state feedback, we propose another two optimal control problems for solving FBSDE (1.1). In Section 4, we present our numerical schemes and the corresponding iterative algorithms. Section 5 gives some examples and compares the different algorithms for solving coupled FBSDEs.
2 Preliminaries on FBSDEs
In this section, we introduce the form of FBSDEs and the existence and uniqueness conditions for fully-coupled FBSDEs.
Let be a filtered probability space, where is a -dimensional standard Brownian motion on , and is the natural filtration generated by the Brownian motion . is the initial condition for the FBSDE.
Consider the following coupled FBSDE,
where are -adapted stochastic processes taking values in , respectively. The functions
are deterministic globally continuous functions. and are the drift coefficient and diffusion coefficient of , respectively, and is referred to as the generator of the coupled FBSDE. If a triple satisfies the above FBSDE on , -almost surely, and is square integrable and -adapted, then the triple is called a solution of FBSDE (2.1). When the functions and are independent of both and , FBSDE (2.1) is called a decoupled FBSDE.
Given a full-rank matrix , we define
First, we state the following two assumptions.
is uniformly Lipschitz with respect to ;
is in for ;
is uniformly Lipschitz with respect to ;
is in for .
where and are given nonnegative constants with
Then, the following theorem is given:
We omit the proof; see Theorem 2.6 of  for details.
For convenience, in this article, we assume that is the Lipschitz constant satisfying Assumption 1, that is, for
where represents one of the functions among and .
3 Solving FBSDEs from an optimal control perspective
Essentially, the deep neural network can be regarded as a control system, which is used to approximate the mapping from the input set to the label set. The parameters in the network can be seen as the control, and the cost function can be seen as the optimization objective. Thus, we first transform the FBSDE solving problem into an optimal control problem in order to apply the deep neural network.
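The analogy can be made concrete with a deliberately tiny example, given here as a sketch under stated assumptions rather than as part of the method: a one-parameter linear model plays the role of the network, its parameter `theta` is the control, the mean squared error is the cost, and plain gradient descent is the optimization procedure. The data, learning rate and function names are all illustrative.

```python
def cost(theta, xs, ys):
    """Mean squared error of the one-parameter model y = theta * x."""
    return sum((theta * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def train(xs, ys, theta=0.0, lr=0.1, n_steps=100):
    """Gradient descent on the cost; theta is the 'control' being optimized."""
    for _ in range(n_steps):
        grad = sum(2.0 * (theta * x - y) * x for x, y in zip(xs, ys)) / len(xs)
        theta -= lr * grad
    return theta


xs = [0.5, 1.0, 1.5, 2.0]          # inputs
ys = [2.0 * x for x in xs]         # labels generated with theta = 2
theta_hat = train(xs, ys)
print(theta_hat, cost(theta_hat, xs, ys))
```

A deep network replaces the single parameter with many, and the gradient is obtained by backpropagation, but the structure of control, cost and iterative minimization is the same.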
3.1 Picard iteration method
Before the transformation, we first recall a known result. As we know, FBSDE (2.1) has a unique solution under monotonicity conditions, and Pardoux and Tang’s results have shown that the solution of (2.1) can be constructed via Picard iteration.
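The alternating structure of such an iteration can be sketched on a deterministic toy problem. The setting below is an assumption chosen purely for illustration and is not FBSDE (2.1): the diffusion is set to zero, so the martingale integrand vanishes, the forward equation is x' = c y with x(0) = x0, the generator f defaults to zero and the terminal condition is y(T) = g(x(T)) with g(x) = x. Each sweep solves the forward equation using the previous iterate of y and then solves the backward equation along the new forward path. The function name `picard_iterate` and all parameters are hypothetical.

```python
def picard_iterate(x0=1.0, c=0.5, T=1.0, n_steps=100, n_iter=60,
                   f=lambda x, y: 0.0, g=lambda x: x):
    """Picard sweeps for a deterministic forward-backward toy system."""
    dt = T / n_steps
    y = [0.0] * (n_steps + 1)              # initial guess: y identically zero
    x = [x0] * (n_steps + 1)
    for _ in range(n_iter):
        # Forward pass: x' = c * y, driven by the previous iterate of y.
        x = [x0]
        for n in range(n_steps):
            x.append(x[n] + c * y[n] * dt)
        # Backward pass: y(T) = g(x(T)), y' = -f(x, y), along the new x.
        y_new = [0.0] * (n_steps + 1)
        y_new[n_steps] = g(x[n_steps])
        for n in range(n_steps - 1, -1, -1):
            y_new[n] = y_new[n + 1] + f(x[n], y_new[n + 1]) * dt
        y = y_new
    return x, y


x_path, y_path = picard_iterate()
print(x_path[-1], y_path[0])
```

With the default choices, each sweep maps the constant value of y to x0 + c T y, which is a contraction when |c T| < 1, so the iterates converge to x0 / (1 - c T). In the stochastic case the backward pass also produces the martingale integrand, and convergence relies on conditions such as those in the result recalled above.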
Regarding and as controls, we consider the following control problem
When are known, we consider the following SDE
Equation (3.4) has infinitely many solutions because both the initial value and the process are undetermined.
Given and the process , assuming , which takes values in and depends on the initial condition and process , then equation (3.4) can be written as
are two functions denoted as
According to Assumption 1 (i), and since the matrix is full-rank, it follows that the functions and are uniformly Lipschitz, satisfying
where is the Lipschitz constant. Then we get the following inequality:
where only depends on . Let , then we get
According to Assumption 1 (ii), there exists a constant satisfying
where . According to , the proof is completed. ∎
Now we denote as
and then the control problem (3.3) becomes the problem of finding the minimum value of . In the following Theorem 2, we will show that the solution of (3.4) converges to the solution of FBSDE (2.1), when goes to zero as tends to infinity, which means that solving the control problem (3.3) is equivalent to solving the FBSDE (2.1).
The proof of this theorem is divided into two steps.