Recent developments of numerical algorithm for solving high dimensional PDEs draw a great amount of attention in various scientific fields. In the seminal paper 
, deep learning technique was first introduced to study the numerical algorithms for high dimensional parabolic PDEs. The deep learning BSDE method is based on the non-linear Feynman-Kac formula, which provides the equivalent relations between parabolic PDEs and Markovian backward stochastic differential equations (BSDEs) (see e.g.). When the non-Markovian property, e.g. path-dependent property, is involved, the BSDE is equivalent to a path-dependent PDE (PPDE), which was first introduced in  for path-dependent option pricing problem. The deep learning BSDE method has been recently extended to design numerical algorithms for PPDEs. The non-Markovian property introduces extra complexity in the numerical scheme, and it returns a high dimensional problem even if the original space variable is low dimensional. In this study, we shall focus on the numerical solutions for the corresponding Markovian and non-Markovian FBSDEs.
For the deep learning BSDE method 
, it shows the efficiency of machine learning in solving high dimensional parabolic PDEs but subject to small Lipschitz constants or equivalently small time duration. The exponential stopping time strategy has been introduced in to extend the time duration. However, both algorithms are still using the deep neural network combined with standard Euler scheme in essence, which makes it sensitive to the time discretization. Namely, the time dimension is still large for long time duration, which may take a long time to train the deep neural network (DNN) model. Furthermore, the deep learning BSDE method is not robust to missing data. If we miss a proportion of our data (e.g. data points in the Euler scheme), the accuracy will be affected. In particular, this is the same type of difficulty when dealing with high frequency data. In this case, one has to down-sample the stream data to a coarser time grid to feed it into the DNN-type algorithm. It may miss the microscopic characteristic of the streamed data and render lower accuracy. On the other hand, the high frequency and path-dependent features show up naturally in option pricing problems and non-linear expectations within various financial contexts, e.g. limit order book [5, 6, 9, 16, 23, 25], nonlinear pricing [29, 37], Asian option pricing , model ambiguity[3, 4, 10, 15], stochastic games and mean field games [13, 31], etc.
Motivated by these problems, we introduce the deep signature transformation into the recurrent neural network (RNN) model to solve BSDEs. The “signature” is defined as an iterated integral of a continuous path with bounded -variation, for , which is a recurring theme in the rough path theory introduced by T. Lyons . The “signature” has recently been used to define kernels [8, 20, 27] for sequentially ordered data in the corresponding reproducing kernel Hilbert space (RKHS). This idea is further developed in  to design “deep signature” by combing the kernel method and DNN. Furthermore, the “deep signature” has been used in RNN to study controlled differential equations in . The signature approach also provides a non-parametric way for extraction of characteristic features from the data, see e.g. . The data are converted into a multi-dimensional path through various embedding algorithms and then processed for computation of individual terms of the signature, which captures certain information contained in the data. The advantage of this signature method is that this method can deal with high frequency data, and is not sensitive to the time discretization. Motivated by this idea, we propose to combine the signature/log-signature transformation and RNN model to solve the FBSDEs, which should have a much coarser time partition, a better downsampling effect, and more robust to the high-frequency data assumptions.
The numerical algorithm for solving PPDE with path-dependent terminal condition (first type PPDE) has been recently studied in [34, 35] by using recurrent neural network. The second type PPDE arises from the Volterra SDE setting, where the non-Markovian property is introduced by the forward process instead of the terminal condition. The numerical algorithms for the option pricing problem in the Volterra SDEs setting has been recently studied in [18, 33] by using deep learning,  by using regularity structures, and  by using cubature formula.
However, none of these works consider the high frequency data features in the algorithm. Neither do they consider the longer time duration in the model. Furthermore, we also provide the convergence analysis of our algorithm after introducing the signature/log-signature transformation layer into the RNN model.
2 Algorithms and convergence analysis
We consider the Markovian and non-Markovian FBSDEs of the following form, for ,
In both the Markovian () and non-Markovian () FBSDEs system above, we denote as -valued Brownian motion. Throughout the paper, unless otherwise stated, the process , and take values in and , respectively. We denote as the state dependent terminal condition and denote as the terminal condition depending on the path of , which corresponds to the the payoff function in the option pricing problem. The pair solves the BSDE in and respectively.
We present signature/ log-signature FBSDE numerical scheme in detail. We first partition the time horizon into time steps with a mesh size , and the time partition is given by . The state process is generated from Euler scheme as
where denotes the increment of the Brownian motion. Next, for some , we partition the time interval into segmentations with step size . The segmentation can be written as . Then we compute the signature/log-signature 111The numerical implementation of the signature/log-signature transformation was borrowed from . of the forward process truncated at order based on the segmentation , which is denoted as . Moreover, we approximate the process Z using a recurrent neural network (RNN) with truncated signature / log-signature at order as the inputs. Then, the backward process would follow,
where , for , denotes the output of the RNN 222In particular, the recurrent network in this paper is the LSTM network . with truncated signature of forward process at order as the inputs, and . Similarly, we denote
as the output of RNN with log signature inputs. Lastly, the objective is to minimize the loss function, and update parameters
by stochastic gradient descent. The full algorithm is presented in Algorithm1. We keep the following standard assumptions on the coefficients for FBSDEs.
Let the following assumptions be in force.
are deterministic taking values in , respectively; and and are bounded.
are smooth enough with respect to all variables and all derivatives are bounded by constant .
We are now ready to present the universality approximation property of deep signature/log-signature Markovian BSDE.
Let Assumption 1 hold and assume kh is small enough, for some constant , and for any , there exists recurrent neural network , such that
Furthermore, we have the following estimate.
Let Assumption 1 be in force. Then one can conclude that,
In this section, we implement our algorithm to a wide range of applications including European call option, lookback option under Black-Scholes model, European call option under Heston model, a high dimensional example, and a nonlinear example. In summary, our /-FBSDE method has the following advantages over other numerical methods in the current literature 333The desktop we used in this study is equipped with an i7-8700 CPU and a RTX 2080Ti GPU. For all the examples in this paper, we generated in total of paths for the forward processes. paths were used to test, and the rest were used to train the neural network.:
Our algorithm is capable to find a more accurate solution to the FBSDE.
Our algorithm is capable to approximate the true solution efficiently in terms of time.
Our algorithm is capable to handle high frequency data in a long time duration. The results are accurate and computation times are efficient.
Our algorithm is capable to handle high dimensional and non-linear scenarios.
3.1 Best Ask Price for GBM European Call Option
We first apply our algorithm to provide lower/upper bound for the limit order book spread. Let the underlying asset follows a geometric Brownian motion,
where is a standard Brownian motion under risk neutral measure . Based on the no good deal theory  [Theorem 3.9] , the best ask price for the European call option at level can be represented as
where denotes no good deal pricing set at level , and is a discount factor. With the specification of the pricing measure 444See details in the Appendix., under no good deal condition assumption (see  Definition 2.16 and Theorem 2.17), the best ask/bid price (3.5) at level are unique solutions to the following BSDEs at ,
where and . In this example, we implement 1-dimensional best ask scenario for (3.6), and we compare results from our signature methods with simple neural network method. We choose the following parameters for the simulation , and batch size 1000.
As we see from Table1, our algorithm combining signature with LSTM neural network (labeled as Sig-LSTM) outperforms the simple neural network method in terms of efficiency, our algorithm runs 20 times faster than simple neural network approach with . This is what we should expect, since for each iteration our algorithm runs 5 steps segmented by signature () instead of 100 steps () in the simple neural network approach with Euler scheme. Also, as we can see in Table 1, the result converges to when increases. More accuracy and time efficiency results comparisons are illustrated in the lookback option example, which is a path dependent option.
3.2 Lookback Option Example
In this example, we consider the lookback option pricing under the the classical Black-Scholes model with constant interest rate , and volatility . Under the risk neutral measure , the stock prices and the lookback option price follow the FBSDEs below,
In this example, we choose the following parameters in simulation, . In Figure 2, we compare the convergence of lookback option prices from different methods, and different time discretization steps. Vanilla-LSTM refers to the algorithm that the inputs to the neural networks are the stock prices. PDGM from  is a numerical scheme based on recurrent neural network, and it is used to solve PPDEs. LogSig-LSTM and Sig-LSTM refer to the two numerical algorithms proposed in this study. Figure 2 and Figure 6 list all computation errors and running times over different methods and time steps respectively.
The first observation is that under the same number of time steps, the numerical solutions from all methods are very similar. Secondly, the key to improve the numerical solutions to be closer to the true solution is the number of the time steps during simulation, which is quite intuitive. The smaller the mesh size during simulation, the result is better. With , our log-signature and signature perform the best, and with , the numerical solution is only approximately 0.6% apart from the true solution. The third observation is that the convergence rate is slower with smaller number of segmentations in log-signature and signature method. In addition, with a larger number of segmentations, the numerical results are generally better. Therefore, one may be encouraged to have become as large as possible. However, this is not feasible in practice due to the running times.
Figure 6 compares the running times over different methods and time steps respectively. The running times are approximately linear with the number of segmentations and time steps. Log-signature and signature methods run 100 times faster with 5 segmentation () than vanilla-LSTM with 500 times steps (). Therefore, summarizing the stock data paths into signature into a few segmentations, and then inputting them into the neural network would save us a great amount of time, and obtain the similar accuracy.
In addition, our method can handle high-frequencey data. It would be impracticable to input a stock paths with into the vanilla-LSTM since it would take too long to train. However, we could first divide the 5000 time steps into 5 or 20 segmentation, and then compute the log-signature and signature of segmentations, which will be finally input into the neural networks. As we can see from this example, our method reaches a higher accuracy in an time efficient manner.
Furthermore, our method could handle high frequency data with a long time duration. Continuing with lookback option example, now we choose the parameters to be . Since the numerical difference between log-signature and signature methods are minimal, we only make a comparison between vanilla-LSTM and Sig-LSTM in Figure 6. Figure 6 plots a closeup of lookback option prices with different time-steps. Comparing to Vanilla-LSTM with , our Sig-LSTM methods with and improves the accuracy by 1.36%, and underestimates the solution only 0.624%. In the meantime, our Sig-LSTM method with and runs 100 times faster than Vanilla-LSTM with , which is quite impressive.
3.3 European Call Option in the Heston Model under Parameter Uncertainty
In this section, we study the European call option pricing problem with stochastic volatility model under parameter uncertainty. The general model setup follows from , where the asset price
and the variance processin the Heston model are given by
where we denote as two independent Brownian motions under risk neutral measure , with correlation . Under the parameter uncertainty situation, we do not know the precise values of , with . However, following the ellipsoidal specification 555See details in the Appendix. of uncertainty, the pricing bound for the Heston call option under model ambiguity is the unique solution of the following BSDE with payoff at maturity ,
where can be computed explicitly. We implement the example in  with the same experiment set up: , and covariance matrix .
From Figure 6, we can see that our method provides a better pricing bound over the recursive MARS degree 2 with variance reduction method (denoted as ”MARS” in the figure) in , by providing a slightly wider bound for the optimally controlled value process. The closeup plots are in Figure 7. As we increase the number of time steps to , the Vanilla-LSTM performs better than MARS method with , and . Lastly, with and , our Sig-LSTM method efficiently improves the bound. This is what we should expect. With a larger number of time discretization , the driver in (3.7) are updated more accurately, which leads to the value process in (3.7) optimised to a higher degree. Lastly, we should mention that traditional numerical methods like ”MC” and ”MARS” are not efficient to be extended to large time discretization scenarios.
3.4 High Dimensional and Non-linear Example
In this section, we consider the following path-dependent BSDE,
and the forward process is given by . Based on the association with PPDE and the nonlinear Feynman-Kac formula, we construct a high dimensional example and a non-linear example below, which we could find the true solution.
High Dimensional Example
For simplicity, let , and , we can find the explicit solution of this problem. We then compare the true solution with the solution approximated by our algorithm. In this example, we use the deep log-signature BSDE algorithm because the input of the network grows exponentially in terms of the dimension. With , after 10000 training iteration, the approximated solution of from our algorithm is 6.60 with an error of 1% to the true solution of 6.66. Again, our algorithm runs only 5 steps () during training, which is quite time efficient. Here we remark that even our algorithm is able to approximate the true solution of the high dimensional example, it is more suitable for high frequency, path dependent and long duration data. This is because when generating signatures / log-signatures from high dimensional paths, the dimension of signatures / log-signatures would increase exponentially in terms of the dimension of the path, we could see this from equation (5.11) in the Appendix.
In this example, we apply our algorithm to approximate the solutions of an non-linear FBSDE (3.8) with , and the generator
In the numerical implementation, we choose the terminal condition to be . This example is inspired by a two person zero sum game from . We choose the following parameters to implement our algorithm: , , , . As illustrated in Figure 8 and Table 3, with an increase of number of segmentation and number of time steps in the Euler scheme will simultaneously improve the accuracy. With only 20 segmentations for , reaches 0.9982 with an error of only 0.18%, where true solution is 1.
This paper aims to develop efficient algorithms to solve non-Markovian FBSDEs or equivalent PPDEs. We combine the signature/log-signature transformation together with RNN model to solve the FBSDE numerically. Our algorithms show advantages in solving path-dependent problems, high-frequency data problems, and long time duration problems, which apply to a wide range of applications in financial markets.
Acknowledgments. We would like to thank Professor Jin Ma and Professor Jianfeng Zhang for all the insightful comments.
-  (2018) Derivatives pricing using signature payoffs. arXiv preprint arXiv:1809.09466. Cited by: Proposition 5.2.
-  (2016) Pricing under rough volatility. Quantitative Finance 16 (6), pp. 887–904. Cited by: §1.
-  (2017) Good deal hedging and valuation under combined uncertainty about drift and volatility. Probability, Uncertainty and Quantitative Risk 2 (1), pp. 13. External Links: Cited by: §1, §5.2.
-  (2017) The robust merton problem of an ambiguity averse investor. Mathematics and Financial Economics 11 (1), pp. 1–24. External Links: Cited by: §1, §5.2.
-  (2013) Dynamic conic finance: pricing and hedging in market models with transaction costs via dynamic coherent acceptability indices. International Journal of Theoretical and Applied Finance 16 (01), pp. 1350002. Cited by: §1, §3.1, §5.2.
-  (2001) Pricing and hedging in incomplete markets. Journal of financial economics 62 (1), pp. 131–167. Cited by: §1, §5.2.
-  (2009) New measures for performance evaluation. The Review of Financial Studies 22 (7), pp. 2571–2606. Cited by: §5.2.
Signature moments to characterize laws of stochastic process. arXiv: 1810.10971v1.. Cited by: §1.
-  (2000) Beyond arbitrage: good-deal asset price bounds in incomplete markets. Journal of political economy 108 (1), pp. 79–119. Cited by: §1, §5.2.
-  (2017) European option pricing with stochastic volatility models under parameter uncertainty. In International symposium on bsdes, pp. 123–167. Cited by: §1, §3.3, §3.3, §5.2.
-  (2019) Functional itô calculus. Quantitative Finance 19 (5), pp. 721–729. Cited by: §1.
-  (2021) Cubature method for volterra sdes and rough volatility model.. preprint. Cited by: §1.
-  (2020) Deep learning methods for mean field control problems with delay. Frontiers in Applied Mathematics and Statistics 6. Cited by: §1.
-  (1993) Approximation of dynamical systems by continuous time recurrent neural networks. Neural networks 6 (6), pp. 801–806. Cited by: §5.3.
-  (2007) Portfolio selection with parameter and model uncertainty: a multi-prior approach. The Review of Financial Studies 20 (1), pp. 41–81. External Links: Cited by: §1, §5.2.
-  (1991) Implications of security market data for models of dynamic economies. Journal of political economy 99 (2), pp. 225–262. Cited by: §1, §5.2.
-  (1997) Long short-term memory. Neural computation 9 (8), pp. 1735–1780. Cited by: footnote 2.
-  (2019) Deep ppdes for rough local stochastic volatility. Available at SSRN 3400035. Cited by: §1.
-  (2019) Deep signature transforms. In Advances in Neural Information Processing Systems, pp. 3105–3115. Cited by: §1, Proposition 5.2.
-  (2019) Kernels for sequentially ordered data. Journal of Machine Learning Research. Cited by: §1.
-  (2013) Learning from the past, predicting the statistics for the future, learning an evolving system. arXiv preprint arXiv:1309.0260. Cited by: §1.
-  (2019) Learning stochastic differential equations using rnn with log signature features. arXiv preprint arXiv:1908.08286. Cited by: §1, §5.1, §5.1, §5.3, §5.3.
-  (2021) On the dynamic frontiers of the limit order books-a principal agent problem view. Cited by: §1, §5.2, Remark 1.
-  (1994) Differential equations driven by rough signals (i): an extension of an inequality of lc young. Mathematical Research Letters 1 (4), pp. 451–464. Cited by: §1, §5.1.
-  (2010) Illiquid markets as a counterparty: an introduction to conic finance. Robert H. Smith School Research Paper No. RHS, pp. 06–115. Cited by: §1, §5.2.
-  (1973) Theory of rational option pricing. The Bell Journal of economics and management science, pp. 141–183. Cited by: §5.2.
-  (2020) Convolutional signature for sequential data. arXiv preprint arXiv:2009.06719. Cited by: §1.
-  (2005) Martingale methods in financial modelling. Springer. Cited by: §1, §5.2.
-  (1992) Backward stochastic differential equations and quasilinear parabolic partial differential equations. In Stochastic partial differential equations and their applications, pp. 200–217. Cited by: §1, §1.
-  (2004) Filtration consistent nonlinear expectations and evaluations of contingent claims. Acta Mathematicae Applicatae Sinica, English Series 20 (2), pp. 191–214. Cited by: §5.2.
-  (2014) Two person zero-sum game in weak formulation and path dependent Bellman-Isaacs equation. SIAM J. Control Optim. 52 (4), pp. 2090–2121. Cited by: §1, §3.4, §5.2.
-  (2020) Algorithm 1004: the iisignature library: efficient calculation of iterated-integral signatures and log signatures. ACM Transactions on Mathematical Software (TOMS). Cited by: footnote 1.
-  (2020) Numerical methods for high-dimensional path-dependent pdes driven by stochastic volterra integral equations. Cited by: §1, §1.
-  (2020) Solving path dependent pdes with lstm networks and path signatures. arXiv preprint arXiv:2011.10630. Cited by: §1.
-  (2020) PDGM: a neural network approach to solve path-dependent partial differential equations. arXiv preprint arXiv:2003.02035. Cited by: §1, §3.2.
-  (2017) Deep learning-based numerical methods for high-dimensional parabolic partial differential equations and backward stochastic differential equations. Communications in Mathematics and Statistics 5 (4), pp. 349–380. Cited by: §1, §1.
-  (2020) Backward deep bsde methods and applications to nonlinear problems. External Links: Cited by: §1.
-  (2017) Backward stochastic differential equations. Springer. Cited by: §5.3.
5.1 Signature and signature transformation
In this section, we introduce the preliminary facts about the signature from the rough path theory  and the signature transformation  we used in the algorithm. In general, for a bounded variation path , for , the signature of (up to order ) is defined as the iterated integrals of . More precisely, for a word with size ,
where we use the convention that . The signature lives in a strict subspace , known as the free Carnot group over of step , where
is the truncated tensor algebra overFurthermore, the exponential map defines the diffeomorphism from the Lie algebra to the Lie group , namely
where is the Lie sub-algebra of generated by the canonical basis of , and the Lie bracket is given by . Thus, the signature lives in the linear space , and we we denote logarithm of the signature of the path as Let be the projection map of the signature and the signature at order . We denote as the truncated log signature of a pah of order . We introduce the following standard treatment when computing the signature of a path together with the time parameter.
Given a path , we define the corresponding time-augmented path by , which is a path in .
We should remark here that a bounded -variation path is essentially determined by its truncated signature at order . This means that essentially no information is lost when applying the signature transform of a path at certain order without using the whole signature process.
Let be a real-valued continuous function on continuous piecewise smooth paths in and let be a compact set of such paths. Then for all and , there exists a linear functional such that for all ,
We introduce the signature and the signature layer in .
Definition 5.3 (Signature and Signature Sequence Layer).
Consider a discrete -dimensional time series over time interval . A signature layer of degree is a mapping from to , which computes as an output for any , where is the truncated signature of over time interval of degree as follows:
where and is the dimension of the truncated signature.
5.2 Backgrounds of numerical examples
Throughout this section, we denote as the filtered probability space and denote as the risk neutral measure.
Best Ask Price for GBM European Call Option.
The limit order book spread has been extensively investigated through No Arbitrage Bound/No Good Deal bound in incomplete markets [6, 7, 9, 16, 25, 26]. Traditionally, under a risk neutral measure , one may assume the underlying asset follows a geometric Brownian motion i.e.,
where is a standard Brownian motion under . By no good deal theory, the best ask price for the European call option at level ( can be thought as the bound for girsanov kernels) can be represented as
where the set is nonempty and called the no good deal pricing set at level , is the discount factor defined by fixed risk-free interest rate . More details can be referred to [5, 7, 23, 25]. In our setting, we define as follow
where , the process denote all possible girsanov kernels and their bound is .
There are several notions to introduce the kernel function . The kernel ambiguity introduced by drift uncertainty with discount factor has been recently studied in . Here we work in a simplified version, where we consider the discount factor fixed, we define , where , and . The main motivation to consider a fixed interest rate instead of is that we could numerically compute the lower and upper bound by using the empirical calibration of from the market data.
With the specification of the pricing measure set in (5.14), we can show that (3.5) is closely linked to the following BSDE. The proof follows from the comparison theorem for BSDEs, and we refer details in .
In this example, we consider the classical Black-Scholes model setting. Under the risk neutral measure , the stock prices follows a geometric Brownian Motion with constant interest rate , and volatility ,
Lookback option is one of the path-dependent financial derivatives. A lookback call option with floating strike is given by the payoff function
It is clear that the option price has the form
Fortunately, has an explicit solution, (e.g. ),
where , and
In the meantime, the option price can also be represented as a solution to the following BSDE,
Therefore, we are able apply our numerical method, and compare solutions with the true solution, and solutions from other numerical schemes.
European call options in the Heston model under parameter uncertainty.
Recall that, under the risk neutral measure , we consider the following Heston model in  for a European call option pricing problem with stochastic volatility model under parameter uncertainty. For , the asset price and forward variance process follows,
and are two Brownian motions under with correlation . Parameters () are assumed to be nonnegative and satisfy the Feller’s condition to guarantee that variance process is bounded below from zero. Moreover, under the parameter uncertainty situation, an elliptical uncertainty set for parameters (, where ) with confidence is given by the quadratic form , where is the perspective deviance towards the true parameters denoted as , is the covariance matrix of the parameters and
is the quantile of the chi-square distribution with three degrees of freedom. We should remark here that ellipsoidal specifications of uncertainty appear naturally in multivariate Gaussian settings for the uncertainty about the drifts of tradeable asset prices, and literature can be referred to[3, 4, 15]. In , the pricing bound for the Heston call option under model ambiguity is derived and proved to be the unique solutions of the following BSDEs with payoff at maturity ,
is the vector of coefficients to the parameter deviances of equation given by
and Also, the perspective deviance towards the true parameter corresponding to (5.16) are . Following the idea in , the forward component of the SDE is generated by standard Euler-Maruyama scheme for the log-price and an implicit Milstein scheme for the variance