1 Introduction
This paper focuses on the derivation and analysis of a Neural Network (NN) based Schwarz Waveform Relaxation (SWR) Domain Decomposition Method (DDM) for solving partial differential equations (PDEs) in parallel. We focus in this paper on a simple diffusion-advection-reaction equation. Still, the proposed strategy applies to any other evolution (in particular wave-like) equations for which the convergence of SWR is proven
Antoine and Lorin (2018, 2017); Antoine et al. (2018); Antoine and Lorin (2019b); Halpern and Szeftel (2010); Gander and Halpern (2007); Gander et al. (1999); Antoine and Lorin (2019a); Gander (2003, 2006); Dolean et al. (2015). We derive a combined SWR-DDM and Physics-Informed Neural Network (PINN) method for solving local and nonlocal diffusion-advection-reaction equations. The latter was developed by Karniadakis and collaborators Raissi et al. (2019); Pang et al. (2019); Yang et al. (2020) and is a general strategy in scientific machine learning for solving PDEs using deep neural networks via the minimization of well-designed loss functions. Notice that in
Jagtap and Karniadakis (2020), a more direct DDM for solving PDEs was also proposed. Interestingly, both methods (the one in Jagtap and Karniadakis (2020) and the one presented here) could actually be combined; this was however not tested in this paper. Let us also mention a recent paper, Heinlein et al. (2021), where a combination of Schwarz DDM with NN-based solvers is proposed for stationary PDEs. Beyond the derivation of the SWR-NN method, this paper's objective is to exhibit some fundamental properties that make this methodology very promising. The general principle is to solve Initial Boundary Value Problems (IBVPs) by constructing local (subdomain-dependent) solutions obtained by minimizing local loss functions. The overall strategy is convergent (provided that the PINN method is convergent) and allows, in particular, to locally decompose the training process in different subdomains within an embarrassingly parallel procedure. The construction of local solutions also allows to locally adapt the depth of the deep neural network, depending on the solution's spectral space and time complexity in each subdomain. In this paper, we primarily focus on the derivation aspects and will not necessarily detail all the computational aspects, particularly regarding the selection of the training points. This will, however, be specified in the Numerical Experiments section. For convenience, we shall recall some basic aspects of PINNs, neural networks and the SWR method for evolution equations, which shall be used later in the paper.
1.1 Basics on PINNs
Let us recall the principle of PINNs for solving, e.g., an evolution PDE posed on a spatial domain $\Omega$ with boundary $\partial\Omega$ and time interval $(0,T)$:
$$\partial_t u + \mathcal{A}(u) = 0 \ \text{in } \Omega\times(0,T), \qquad \mathcal{B}(u) = g \ \text{on } \partial\Omega\times(0,T), \qquad u(\cdot,0) = u_0 \ \text{in } \Omega, \qquad (1)$$
where i) $\mathcal{A}$ is a differential operator in space, and $\mathcal{B}$ a differential or algebraic boundary operator over the domain boundary $\partial\Omega$; ii) $u_0$ and $g$ are imposed functions. The PINN approach, which generalizes the DE solver from Lagaris Lagaris et al. (1998), consists in parameterizing (by, say, $\theta$) a NN $u_\theta$ approximating the solution to (1), by minimizing (a discrete version of) the following loss function
$$\mathcal{L}(\theta) = \varepsilon_1 \big\|\partial_t u_\theta + \mathcal{A}(u_\theta)\big\|^2_{L^2(\Omega\times(0,T))} + \varepsilon_2 \big\|\mathcal{B}(u_\theta) - g\big\|^2_{L^2(\partial\Omega\times(0,T))} + \varepsilon_3 \big\|u_\theta(\cdot,0) - u_0\big\|^2_{L^2(\Omega)},$$
where $\varepsilon_{1,2,3}$ are some free positive parameters, and where $\|\cdot\|_{L^2(\Omega\times(0,T))}$ (resp. $\|\cdot\|_{L^2(\partial\Omega\times(0,T))}$) denotes the $L^2$-norm over $\Omega\times(0,T)$ (resp. $\partial\Omega\times(0,T)$). Practically, the loss function is constructed by estimating the residuals at a very large number of space-time training points; hence the norms are not exactly computed, but only approximated.
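For concreteness, here is a minimal sketch of (a discrete version of) such a loss, written for a 1D advection-diffusion-reaction residual with TensorFlow 2.x; the architecture, the coefficients a, nu, r and the weights eps1, eps2, eps3 are illustrative assumptions, not the paper's settings.

```python
import tensorflow as tf

# Small fully-connected network u_theta(x, t); depth/width are arbitrary here.
u_net = tf.keras.Sequential([
    tf.keras.layers.Dense(20, activation="tanh", input_shape=(2,)),
    tf.keras.layers.Dense(20, activation="tanh"),
    tf.keras.layers.Dense(1),
])

a, nu, r = 1.0, 0.1, 0.5          # assumed advection, diffusion, reaction
eps1, eps2, eps3 = 1.0, 1.0, 1.0  # free positive weights of the loss

def pinn_loss(xt_int, xt_bc, x0, u0):
    """Discrete loss: PDE residual + boundary + initial terms (mean squares)."""
    x, t = xt_int[:, 0:1], xt_int[:, 1:2]
    with tf.GradientTape() as outer:
        outer.watch(x)
        with tf.GradientTape(persistent=True) as inner:
            inner.watch([x, t])
            u = u_net(tf.concat([x, t], axis=1))
        u_x = inner.gradient(u, x)
        u_t = inner.gradient(u, t)
    u_xx = outer.gradient(u_x, x)
    residual = u_t + a * u_x - nu * u_xx + r * u
    bc = u_net(xt_bc)                                        # null Dirichlet data
    ic = u_net(tf.concat([x0, tf.zeros_like(x0)], 1)) - u0   # u(., 0) - u_0
    return (eps1 * tf.reduce_mean(residual ** 2)
            + eps2 * tf.reduce_mean(bc ** 2)
            + eps3 * tf.reduce_mean(ic ** 2))
```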
Karniadakis and collaborators have developed numerous techniques to improve the efficiency and rate of convergence of PINN algorithms for different types of PDEs; we refer for instance to Raissi et al. (2019); Pang et al. (2019); Yang et al. (2020) for details. Ultimately, the goal of the PINN strategy is to provide more efficient solvers than standard methods (finite difference, finite volume, finite elements, spectral methods, pseudospectral methods, etc.) for high-dimensional (stochastic or deterministic) PDEs. As far as we know, this is not yet clearly established, which justifies the active research in this field and the development of new methods.
1.2 Basics of Neural Networks
We here recall the basics of neural networks. We denote the neural network by $N(\cdot\,;\theta)$, where $\theta$ gathers the unknown parameters (weights and biases). Neural networks usually read (for 1 hidden layer, machine learning)
$$N(x;\theta) = \sum_{i=1}^{H} c_i\,\sigma(w_i \cdot x + b_i), \qquad (2)$$
where $\sigma$ is the sigmoid transfer function, $H$ is the number of sigmoid units, $w_i$, $c_i$ are the weights and $b_i$ the biases. When considering several hidden layers (deep learning), we then have to compose functions of the form (2), Després (2021). That is,
$$N(x;\theta) = f_L \circ f_{L-1} \circ \cdots \circ f_1(x),$$
where for $\ell = 1, \dots, L$, $f_\ell$ is defined from $\mathbb{R}^{d_{\ell-1}}$ to $\mathbb{R}^{d_\ell}$ by $f_\ell(y) = \sigma_\ell(W_\ell y + \beta_\ell)$, with $\sigma_\ell$ an activation function and $L$ the number of layers. The layer $\ell = 0$ is the input layer and $\ell = L$ is the output layer; in fine, $N(\cdot\,;\theta)$ maps $\mathbb{R}^{d_0}$ to $\mathbb{R}^{d_L}$.
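As an illustration, such a composed network can be evaluated in a few lines of NumPy; the layer sizes and the tanh activation below are assumptions made for the example.

```python
import numpy as np

def forward(x, weights, biases):
    """Evaluate N(x; theta) = f_L o ... o f_1(x) with tanh hidden layers."""
    y = x
    for l, (W, b) in enumerate(zip(weights, biases)):
        y = W @ y + b
        if l < len(weights) - 1:   # keep the output layer linear
            y = np.tanh(y)
    return y

rng = np.random.default_rng(0)
dims = [2, 20, 20, 1]              # d_0 = 2 inputs (x, t), d_L = 1 output
Ws = [rng.standard_normal((dims[l + 1], dims[l])) for l in range(len(dims) - 1)]
bs = [np.zeros(dims[l + 1]) for l in range(len(dims) - 1)]
print(forward(np.array([0.5, 0.1]), Ws, bs))
```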
1.3 Basics on SWR methods for evolution equations
In this subsection, we recall the principle of SWR-DDM for solving evolution PDEs. Consider a $d$-dimensional, first-order-in-time evolution partial differential equation, $\partial_t u + \mathcal{A}(u) = 0$, in the spatial domain $\Omega$ and time domain $(0,T)$, where $\mathcal{A}$ is a linear differential operator in space. The initial data is denoted by $u_0$, and we impose, say, null Dirichlet boundary conditions on $\partial\Omega$. We present the method for 2 subdomains, although in practice an arbitrary number of subdomains can be employed. We first split $\Omega$ into two open subdomains $\Omega_1$ and $\Omega_2$, with or without overlap, and such that $\Omega = \Omega_1 \cup \Omega_2$. The SWR algorithm consists in iteratively solving IBVPs in $\Omega_i\times(0,T)$, using transmission conditions at the subdomain interfaces. The imposed transmission conditions are established using the preceding Schwarz iteration data in the adjacent subdomain. That is, for $i \in \{1,2\}$, $j = 3-i$, and denoting by $u_i^k$ the solution in $\Omega_i$ at Schwarz iteration $k$, we consider
$$\begin{cases} \partial_t u_i^k + \mathcal{A}(u_i^k) = 0, & \text{in } \Omega_i\times(0,T),\\ u_i^k(\cdot,0) = u_0, & \text{in } \Omega_i,\\ \mathcal{T}(u_i^k) = \mathcal{T}(u_j^{k-1}), & \text{on } \Gamma_i\times(0,T), \end{cases} \qquad (3)$$
with a given initial guess $u_j^0$, where $\mathcal{T}$ denotes a boundary-transmission operator and where $\Gamma_1$, $\Gamma_2$ are the internal boundaries (interfaces). The Classical SWR (CSWR) method consists in taking $\mathcal{T}$ as the identity operator, while the Optimized SWR (OSWR) method consists in taking $\mathcal{T} = \partial_{\mathbf{n}} + \lambda$ for some well-chosen (optimized from the convergence rate point of view) $\lambda$, with $\mathbf{n}$ the outward normal vector to $\Gamma_i$. The OSWR method is then a special case of Robin-SWR methods. In addition to providing faster convergence than CSWR, the OSWR method is often convergent even for non-overlapping DDM; this property is of crucial interest from the computational complexity point of view. We refer to Gander and Halpern (2007); Gander (2003, 2006); Dolean et al. (2015); Gander et al. (1999) for details. The convergence criterion for the Schwarz DDM is typically given, for any Schwarz iteration $k$, by
$$\big\| u_1^{k+1} - u_1^{k} \big\|_{L^2(\Gamma_1\times(0,T))} + \big\| u_2^{k+1} - u_2^{k} \big\|_{L^2(\Gamma_2\times(0,T))} \leqslant \delta, \qquad (4)$$
with $\delta$ small enough. When the convergence of the full iterative algorithm is reached at some Schwarz iteration $k^{\mathrm{cvg}}$, one gets the converged local solutions $u_i^{k^{\mathrm{cvg}}}$ in $\Omega_i\times(0,T)$. The reconstructed global solution $u^{\mathrm{cvg}}$ is finally defined as $u^{\mathrm{cvg}} := u_i^{k^{\mathrm{cvg}}}$ in $\Omega_i\times(0,T)$, $i = 1, 2$.
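Schematically, the iteration (3) together with the stopping criterion (4) can be organized as in the following Python sketch, where solve_local and trace are hypothetical routines standing for a local IBVP solver and an interface-trace extraction.

```python
import numpy as np

def swr(solve_local, trace, g1, g2, tol=1e-6, max_iter=50):
    """Iterate the two local solves until the interface traces stagnate."""
    for k in range(max_iter):
        # Embarrassingly parallel step: the two local solves are independent.
        u1 = solve_local(domain=1, interface_data=g1)
        u2 = solve_local(domain=2, interface_data=g2)
        # Exchange interface data for the next Schwarz iteration.
        g1_new, g2_new = trace(u2, interface=1), trace(u1, interface=2)
        res = np.linalg.norm(g1_new - g1) + np.linalg.norm(g2_new - g2)
        g1, g2 = g1_new, g2_new
        if res < tol:              # criterion (4)
            break
    return u1, u2, k
```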
1.4 Advection-diffusion-reaction equation
Rather than considering a general situation, for which the rapid convergence and efficiency of the SWR method are not necessarily proven, we propose to focus on the advection-diffusion-reaction equation, for which both properties are established in Gander and Halpern (2007) (see also Halpern and Szeftel (2010); Antoine and Lorin (2018, 2017); Antoine et al. (2018); Antoine and Lorin (2019b) for the Schrödinger equation). Let us consider the following initial boundary-value problem: find the real function $u$ solution to the advection-diffusion-reaction equation on $\Omega\times(0,T)$,
$$\partial_t u + a\cdot\nabla u - \nabla\cdot(\nu\nabla u) + r\,u = 0, \quad \text{in } \Omega\times(0,T), \qquad (5)$$
with initial condition $u(\cdot,0) = u_0$, and where the reaction term $r$, the advection vector $a$ and the diffusion $\nu$ are real-valued, space-dependent and smooth.
We recall from Gander and Halpern (2007) that, for constant coefficients in (5) and sufficiently regular data, there exists a unique weak solution; more regular initial data yields a correspondingly more regular unique weak solution.
1.5 Organization of the paper
The rest of the paper is organized as follows. In Section 2, we derive the combined SWR-PINN method, and some of its properties are discussed in Subsections 2.2 and 2.4. Section 3 is devoted to some numerical experiments illustrating the convergence of the overall SWR-PINN method. We make concluding remarks in Section 4.
2 SWR-PINN method
In this section, we propose to combine PINN-based solvers with the SWR-DDM to solve the advection-diffusion-reaction equation on a bounded domain $\Omega$, imposing null Dirichlet boundary conditions at $\partial\Omega$. For the sake of simplicity of the presentation, the derivation is proposed for two subdomains; the extension to an arbitrary number of subdomains is straightforward.
2.1 Derivation of the SWR-PINN method
The standard SWR method for two subdomains consists in solving the IBVP using the following algorithm:
$$\begin{cases} \partial_t u_i^k + a\cdot\nabla u_i^k - \nabla\cdot(\nu\nabla u_i^k) + r\,u_i^k = 0, & \text{in } \Omega_i\times(0,T),\\ u_i^k(\cdot,0) = u_0, & \text{in } \Omega_i,\\ \mathcal{T}(u_i^k) = \mathcal{T}(u_j^{k-1}), & \text{on } \Gamma_i\times(0,T), \ j = 3-i, \end{cases} \qquad (6)$$
where $\mathcal{T}$ is a boundary operator, and where we recall that $\Omega = \Omega_1\cup\Omega_2$. The well-posedness and the convergence of this method, as well as its rate of convergence, were established in Gander and Halpern (2007) for different types of transmission conditions. SWR algorithms can actually be reformulated as fixed point methods (FPM), and their rate of convergence is hence determined by the contraction factor of the FPM. More specifically, it is proven in Halpern and Szeftel (2010) that, under suitable assumptions on the coefficients, the CSWR method is convergent, with a convergence rate (contraction factor) that improves with the size of the overlap; in fact, this can be refined to a superlinear convergence rate. For Robin-SWR methods, with transmission conditions $\mathcal{T} = \partial_{\mathbf{n}} + \lambda$, the convergence rate is further improved. We notice, in particular, that a crucial element for the rate of convergence of SWR methods is the size of the overlapping zone. However, overlapping is not required for Robin-SWR methods to converge.
Rather than a standard approximation of (6) using finite element, finite difference or pseudospectral methods Antoine and Lorin (2019b); Halpern and Szeftel (2010); Antoine et al. (2018), we propose to solve this system using a PINN method. We denote by $u^k_{i,\theta_i}$ the generic local NN to optimize, where $\theta_i$ denotes its unknown parameters. The SWR-NN method hence consists in searching for an approximate solution to the SWR system by applying local PINN algorithms. That is, we now consider
$$\begin{cases} \partial_t u_{i,\theta_i}^k + a\cdot\nabla u_{i,\theta_i}^k - \nabla\cdot(\nu\nabla u_{i,\theta_i}^k) + r\,u_{i,\theta_i}^k \approx 0, & \text{in } \Omega_i\times(0,T),\\ u_{i,\theta_i}^k(\cdot,0) \approx u_0, & \text{in } \Omega_i,\\ \mathcal{T}(u_{i,\theta_i}^k) \approx \mathcal{T}(u_{j,\theta_j}^{k-1}), & \text{on } \Gamma_i\times(0,T), \ j = 3-i. \end{cases} \qquad (7)$$
Remark 2.0
For the CSWR method ($\mathcal{T}$ is the identity operator), it is proven in Gander and Halpern (2007), among many other well-posedness results, that under suitable regularity and compatibility conditions on the data, the algorithm (6) is well-posed.
Let us now denote by $e_i^k$ the error of the exact SWR iterate in $\Omega_i$ at Schwarz iteration $k$, and by $\widetilde{e}_i^k$ the error of its PINN approximation. Theorem 3.3 from Gander and Halpern (2007) states that, for any Schwarz iteration $k$ and for some positive constant, the error $e_i^k$ is controlled and goes to zero as $k$ goes to infinity. Now, if we assume (convergence of the PINN method) that the NN solution to (7) can be made arbitrarily close to the corresponding exact SWR iterate, then $\widetilde{e}_i^k$ is bounded by the sum of the SWR error and of the PINN approximation error, and goes to zero as well. We then trivially deduce the convergence of the overall PINN-CSWR method. Similar conclusions can be reached for the PINN-OSWR algorithms; in particular, in Gander and Halpern (2007), the OSWR method is shown to be convergent.
First, we use the standard technique to include the initial condition Lagaris et al. (1998), by searching for a trial network of the form
$$\widetilde{u}_{i,\theta_i}(x,t) = u_0(x) + t\,N_i(x,t;\theta_i),$$
which satisfies the initial condition exactly.
For the sake of simplicity of notation, we will still denote by $u^k_{i,\theta_i}$ the neural networks, which include the contribution of the initial data. Notice that this step is not essential, but allows to simplify the loss functions. Hence, at each Schwarz iteration $k$, we solve (7) by minimizing the following two local "independent" loss functions, for some positive parameters $\varepsilon_{1,2}$:
$$\mathcal{L}_i^k(\theta_i) = \varepsilon_1 \big\| \partial_t u^k_{i,\theta_i} + a\cdot\nabla u^k_{i,\theta_i} - \nabla\cdot(\nu\nabla u^k_{i,\theta_i}) + r\,u^k_{i,\theta_i} \big\|^2_{L^2(\Omega_i\times(0,T))} + \varepsilon_2 \big\| \mathcal{T}(u^k_{i,\theta_i}) - \mathcal{T}(u^{k-1}_{j,\theta_j}) \big\|^2_{L^2(\Gamma_i\times(0,T))},$$
where the traces $\mathcal{T}(u^{k-1}_{j,\theta_j})$ were computed at the previous Schwarz iteration $k-1$. In particular, we benefit from local (subdomain-dependent) training processes, which allow us to potentially avoid using the stochastic gradient method, or to improve its convergence; typically, the mini-batches would correspond to training points for the local loss functions under consideration. Recall that, practically, the loss functions are numerically evaluated by approximating the norms using training points, typically randomly chosen in $\Omega_i\times(0,T)$. This method allows for a complete spatial decoupling of the problem over 2 (or an arbitrary number of) subdomains. Finally, the reconstructed solution is defined from the local solutions. More specifically,
$$u_\theta^k = \begin{cases} u^k_{1,\theta_1}, & \text{in } (\Omega_1\setminus\Omega_2)\times(0,T),\\ u^k_{2,\theta_2}, & \text{in } (\Omega_2\setminus\Omega_1)\times(0,T),\\ \big(u^k_{1,\theta_1} + u^k_{2,\theta_2}\big)/2, & \text{in } (\Omega_1\cap\Omega_2)\times(0,T), \end{cases} \qquad (8)$$
and we define the approximate solution to the advection-diffusion-reaction equation as $u_\theta := u_\theta^{k^{\mathrm{cvg}}}$ at the converged Schwarz iteration $k^{\mathrm{cvg}}$.
Practically, in order to evaluate the loss functions, it is necessary to evaluate the equation at a very large number of randomly chosen training points in $\Omega_i\times(0,T)$, as the norms are not exactly computed. From the point of view of the computation of the loss function (requiring the evaluation of the PDE residual at the training points), the algorithm is hence trivially embarrassingly parallel. From the optimization point of view, the method now requires minimizing two loss functions; naturally, the computation of the two minimization problems is embarrassingly parallel, as the two IBVPs are totally decoupled. As we are now considering two IBVPs on smaller spatial domains, we can locally adapt the depth of the local networks.
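For concreteness, a local loss of this type, with Dirichlet (CSWR) transmission, can be sketched as follows in the style of the PINN loss of Subsection 1.1; u_net is the local network, neighbor_net a frozen copy of the adjacent subdomain's network from the previous Schwarz iteration, and all coefficients and weights are illustrative assumptions.

```python
import tensorflow as tf

def local_loss(u_net, neighbor_net, xt_int, xt_gamma, eps_pde=1.0, eps_tc=10.0):
    """Local SWR-PINN loss: PDE residual in Omega_i + transmission on Gamma_i."""
    x, t = xt_int[:, 0:1], xt_int[:, 1:2]
    with tf.GradientTape() as outer:
        outer.watch(x)
        with tf.GradientTape(persistent=True) as inner:
            inner.watch([x, t])
            u = u_net(tf.concat([x, t], axis=1))
        u_x = inner.gradient(u, x)
        u_t = inner.gradient(u, t)
    u_xx = outer.gradient(u_x, x)
    residual = u_t + 1.0 * u_x - 0.1 * u_xx + 0.5 * u  # assumed a, nu, r
    # Dirichlet (CSWR) transmission: match the neighbor's trace on Gamma_i,
    # frozen at the previous Schwarz iterate via stop_gradient.
    mismatch = u_net(xt_gamma) - tf.stop_gradient(neighbor_net(xt_gamma))
    return (eps_pde * tf.reduce_mean(residual ** 2)
            + eps_tc * tf.reduce_mean(mismatch ** 2))
```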
It is important to mention that, unlike SWR methods combined with standard numerical (finite difference, finite volume, finite element, pseudospectral) methods, for which convergence can be proven, the combination of SWR and PINN methods will not necessarily ensure convergence to zero of the residual history. This is due to the fact that, from one Schwarz iteration to the next, the reconstructed solutions may slightly differ, as the minima obtained by minimizing the local loss functions will a priori slightly differ. This fact is actually inherent to NN-based methods. However, we expect the residual history to be small from a practical point of view, provided the loss functions are sufficiently small. In addition to this argument, let us mention that the transmission condition is naturally not exactly satisfied if it is included in the loss function; a large enough weight can, for instance, be imposed on the transmission constraint to ensure that it is accurately satisfied.
2.2 About the interest of using SWR-DDM for NN-based algorithms
The estimation of the loss function using the direct PINN method for solving local PDEs is trivially embarrassingly parallel, as the estimation is independently performed at any given training point. However, the associated minimization problem (related to the batch size) is not locally specific, and the Stochastic Gradient Method (SGM) is hence an essential tool. In the proposed approach, the local loss functions which are evaluated have specific meanings, and allow to get accurate approximations of the solution in any given subdomain.
The SWR method is a domain decomposition method in space for solving PDEs. Using standard advection-diffusion-reaction equation solvers, the main algorithmic costs are the loss function estimations and the computation of solutions to linear systems at each time iteration, involved in implicit or semi-implicit stable schemes Gander and Halpern (2007). The latter has a polynomial complexity, with an exponent typically dependent on the structure of the matrix involved in the linear system. Using a PINN approach, there are naturally no more linear systems to solve "to estimate" the solution; instead, an optimization algorithm is necessary to parameterize the NN-based solution. Denote by $N_{\mathrm{T}}$ the (a priori very large) number of space-time training points used to construct the loss function, and by $P$ the total number of network parameters. The computation of the solution using the direct PINN method is decomposed into two parts:

Estimation of the loss function, with a complexity proportional to the number of training points $N_{\mathrm{T}}$. This step is hence embarrassingly parallel for local PDEs (with or without combination with the SWR method).

Minimization of the loss function, with a complexity growing with the number of parameters $P$. Typically, stochastic gradient methods Robbins and Monro (1951); Bottou (2010); Sun et al. (2020) are used to deal with the possibly high dimensionality (for very accurate solutions) of this minimization problem, and allow for a relatively efficient parallelization.
Within the framework of DDM and for two subdomains, the SWR-NN method requires the embarrassingly parallel minimization of two independent loss functions constructed using local training points. The empirical argument which justifies the proposed methodology is as follows. The structure and complexity of the solution is expected to be "divided" between the two (much more, in practice, of course) spatial subdomains. As a consequence, based on the local structure of the solution in $\Omega_i$, the depth of the local neural networks can be adapted/reduced compared to the one-domain PINN approach with one unique neural network. The extreme case in that matter would be a domain decomposition into small finite volumes, where the solution would be approximated by a constant (cell-centered finite volume method), that is, zero-depth NNs, even if the overall solution has a complex spatial structure. Naturally, the larger the subdomain, the deeper the local neural network associated with this subdomain. For two subdomains, the minimization step within the SWR-NN method consists in solving in parallel
$$\min_{\theta_1} \mathcal{L}_1^k(\theta_1) \quad \text{and} \quad \min_{\theta_2} \mathcal{L}_2^k(\theta_2),$$
rather than (for the direct method)
$$\min_{\theta} \mathcal{L}(\theta),$$
where $\theta = (\theta_1, \theta_2)$. That is, it is possible to decompose the minimization problem into several spatial subregions, where the spectral structure of the solution can be very different from one subdomain to the next, requiring locally smaller depths than using a unique global deep NN. Hence, for the SWR-NN method, we have to perform the following tasks.

Estimation of the local loss functions, with a complexity proportional to the number of local training points. This step is hence embarrassingly parallel within the SWR method and allows to deal with local (subdomain-dependent) training points.

Minimization of the local loss functions, with a complexity growing with the numbers of local parameters $P_i$, where in principle $P_i < P$.
The downside of SWR-NN methods is that the computation of the uncoupled systems has to be repeated several times, that is, until convergence of the Schwarz algorithm. Unlike standard SWR-DDM, where the gain lies in the computation of local linear systems of smaller size, the main interest here is that we solve local (and less complex) minimization problems, where we expect the size of the search space to be smaller.
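Since the two local minimizations within one Schwarz iteration are fully decoupled, they can be dispatched to independent workers; the following sketch uses Python's multiprocessing, with train_subdomain a hypothetical wrapper around one local PINN optimization.

```python
from multiprocessing import Pool

def train_subdomain(task):
    """Hypothetical wrapper: run the SGM on the local loss of one subdomain."""
    i, theta_init, interface_trace = task
    # ... optimize L_i(theta_i) using the frozen neighbor trace ...
    return theta_init  # placeholder for the optimized parameters

if __name__ == "__main__":
    tasks = [(1, None, None), (2, None, None)]   # one task per subdomain
    with Pool(processes=2) as pool:              # embarrassingly parallel step
        theta1, theta2 = pool.map(train_subdomain, tasks)
```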
Notice that the SWR-DDM allows for an embarrassingly parallel implementation of the overall PINN PDE solver. Indeed, unlike the standard computation of the (local) minima of the loss function, which requires a nontrivial, non-embarrassingly-parallel implementation, the proposed approach allows for the embarrassingly parallel computation of the minima of local loss functions. Three levels of parallelization are then possible.

Trivial parallelization of the estimation of the local loss functions.

Embarrassingly parallel computation of the minima of the local loss functions.

In addition, the minimization of a local loss function can itself be performed in parallel using the domain decomposition method for PINNs proposed in Jagtap and Karniadakis (2020).
From the computational point of view, the SWR-PINN algorithm allows i) to adapt the depth of (most of) the local NNs compared to using one unique (global) NN, and ii) to estimate the local loss functions using local subdomain-dependent training points, potentially allowing for the use of direct (non-stochastic) gradient methods for a sufficiently large number of subdomains. This step is the analogue of the reduction of the size of the linear systems to be solved (scaling effect) within standard SWR methods when they are applied with real-space solvers Antoine and Lorin (2018, 2017); Antoine et al. (2018).
We here summarize the overall computational complexity of the SWR-PINN and direct PINN methods.

Direct approach: the cost is driven by the global number of training points $N_{\mathrm{T}}$ and the global number of parameters $P$. In this case, we expect $P$ to be large, and the cost depends on the optimization algorithm used.

SWR approach: the cost is driven by the local numbers of training points and of parameters $P_i$, repeated over $k^{\mathrm{cvg}}$ Schwarz iterations. In this case, we expect $P_i < P$, which produces a scaling effect making this approach potentially more efficient; moreover, the prefactor is also thought to be much smaller using SWR methods. Practically, it is required that $k^{\mathrm{cvg}}$ be small enough. As is well known, the choice of the transmission conditions is a crucial element to minimize $k^{\mathrm{cvg}}$. Dirichlet transmission conditions are known to provide very slow convergence. At the opposite end of the spectrum, and for wave-like equations, Dirichlet-to-Neumann-like transmission conditions are known to provide extremely fast convergence, but can be computationally complex to approximate. Another way to accelerate the convergence of the SWR algorithm consists in increasing the subdomain overlap. For the advection-diffusion-reaction equation, the optimized SWR method, based on optimized Robin transmission conditions, is a good compromise between convergence rate and computational complexity Halpern and Szeftel (2010). As specified above, the computation of the loss function is embarrassingly parallel, unlike in the direct approach.
2.3 Nonlocal operator
We have argued above that the use of SWR methods allows for an efficient parallel computation of the overall loss functions, through the efficient estimation (using local training points) of local loss functions. We show below that, whenever nonlocal terms are present in the equation, the efficiency of the SWR-PINN method is not deteriorated by those terms. In the following, we assume that the equation contains a nonlocal operator $\mathcal{V}$, typically defined as a convolution product:
$$\mathcal{V}(u)(x,t) = \big(V * u(\cdot,t)\big)(x) = \int_{\Omega} V(x-y)\,u(y,t)\,dy,$$
where $*$ denotes the spatial convolution product and $V$ is a given function acting as a nonlocal potential.
We consider the equation on a truncated domain $\Omega$ with boundary $\partial\Omega$, as follows:
$$\partial_t u + a\cdot\nabla u - \nabla\cdot(\nu\nabla u) + r\,u + \mathcal{V}(u) = 0, \quad \text{in } \Omega\times(0,T), \qquad (9)$$
where $\mathcal{V}(u)$ is defined as a convolution product in space, as above. Then the SWR-PINN scheme reads
$$\begin{cases} \partial_t u_{i,\theta_i}^k + a\cdot\nabla u_{i,\theta_i}^k - \nabla\cdot(\nu\nabla u_{i,\theta_i}^k) + r\,u_{i,\theta_i}^k + \mathcal{V}\big(\widetilde{u}_\theta^{\,k}\big) \approx 0, & \text{in } \Omega_i\times(0,T),\\ u_{i,\theta_i}^k(\cdot,0) \approx u_0, & \text{in } \Omega_i,\\ \mathcal{T}(u_{i,\theta_i}^k) \approx \mathcal{T}(u_{j,\theta_j}^{k-1}), & \text{on } \Gamma_i\times(0,T), \ j = 3-i, \end{cases} \qquad (10)$$
where $\widetilde{u}_\theta^{\,k}$ denotes the extension of $u_{i,\theta_i}^k$ by the solution in the adjacent subdomain, computed at the previous Schwarz iteration, and with some transmission operator $\mathcal{T}$. Hence, in this case, we still have to minimize local loss functions.
Practically, we can approximate the convolution product as follows. Denoting by $\{x_\ell\}_{1\leqslant\ell\leqslant M_i}$ the local spatial training points, for $x \in \Omega_i$ we approximate
$$\big(V * u(\cdot,t)\big)(x) = \int_{\Omega} V(x-y)\,u(y,t)\,dy$$
by
$$\sum_{\ell=1}^{M_i} \omega_\ell\, V(x - x_\ell)\, u(x_\ell,t),$$
for some quadrature weights $\omega_\ell$.
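A minimal NumPy sketch of this quadrature could read as follows; the Gaussian potential and the uniform weights are placeholder assumptions.

```python
import numpy as np

def nonlocal_term(x_eval, x_train, u_train, weights, V):
    """Approximate (V * u)(x) ~ sum_l w_l V(x - x_l) u(x_l) at each x."""
    diff = x_eval[:, None] - x_train[None, :]        # shape (n_eval, n_train)
    return (V(diff) * (weights * u_train)[None, :]).sum(axis=1)

V = lambda z: np.exp(-z ** 2)                 # assumed nonlocal potential
x_train = np.linspace(-1.0, 1.0, 200)         # local training points in Omega_i
w = np.full_like(x_train, 2.0 / 200)          # uniform quadrature weights
u_vals = np.cos(np.pi * x_train)
print(nonlocal_term(np.array([0.0, 0.5]), x_train, u_vals, w, V))
```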
As discussed above, the interest of using a DDM is to decompose the training and the search for the local solution over smaller sets of parameters. However, whenever the equation is nonlocal, it is a priori necessary to extend the search for the parameters to the global computational domain. More specifically, for local equations, the local NN solution in $\Omega_1$ (resp. $\Omega_2$) only requires parameters associated with $\Omega_1$ (resp. $\Omega_2$). However, if the equation is nonlocal, in order to construct the solution in $\Omega_1$, we have in principle to search for the NN parameters in all of $\Omega$, containing both $\Omega_1$ and $\Omega_2$, as the solution values in $\Omega_1$ depend on the values of the solution in $\Omega_2$. This problem would also occur when constructing the loss function within the direct PINN method, through the nonlocal term.
The SWR-PINN method allows to deal with this issue: at Schwarz iteration $k$, the loss function in $\Omega_i$ is evaluated using the solution in the adjacent subdomain at the previous Schwarz iteration $k-1$, thanks to the previously evaluated parameters $\theta_j$.
2.4 How about a non-iterative domain decomposition in space?
The domain decomposition method derived in this paper is a SWR-in-space method, which is an iterative method allowing for the convergence of the decomposed solution towards the exact solution of the PDE under consideration. The main weakness of this DDM is the fact that the decoupled system has to be solved several times (iterative method). It is hence natural to ask whether a "similar spatial domain decomposition", but non-iterative, is possible.
To this end, we decompose the domain as above, $\Omega = \Omega_1\cup\Omega_2$, with or without overlap, and consider (9). That is, we search for a solution defined subdomain-wise, $u = u_i$ in $\Omega_i\times(0,T)$, such that
$$\begin{cases} \partial_t u_i + a\cdot\nabla u_i - \nabla\cdot(\nu\nabla u_i) + r\,u_i + \mathcal{V}(u) = 0, & \text{in } \Omega_i\times(0,T),\\ u_i(\cdot,0) = u_0, & \text{in } \Omega_i, \end{cases} \qquad (11)$$
where $i = 1, 2$. The PINN method then consists in minimizing, in parallel, the two corresponding local loss functions. Therefore, in this case, we still have to minimize local loss functions. However, there are two main issues:

Even if the overlap is taken equal to zero, the decoupling of the solution in the two subdomains naturally induces a discontinuity at the subdomain interfaces. It is possible to impose additional compatibility conditions (to be included in the loss functions), in the form of continuity and differentiability conditions at the interfaces, but the reconstructed global solution (such that $u = u_i$ in $\Omega_i\times(0,T)$) will obviously not be an approximate solution to the equation under consideration. Moreover, the compatibility conditions will induce a recoupling of the two systems, in the spirit of the following item.

The two systems, in $\Omega_1$ and $\Omega_2$, are actually also coupled through the nonlocal term. This effect is similar to the addition of the compatibility conditions described above. Hence, the computation of the loss functions would no longer be embarrassingly parallel. This is not an issue in the SWR framework: in the latter case, say at Schwarz iteration $k$, the nonlocal term uses the approximate solution at Schwarz iteration $k-1$, which is a known quantity.
Hence, unlike the SWR-PINN method, for which the reconstruction (8) holds, the reconstructed solution of this non-iterative decomposition does not, in general, approximate the solution of the equation under consideration.
3 Numerical Experiments
In this section, we propose basic experiments in order to numerically illustrate the convergence of the overall method. The PINN algorithm was implemented using deep learning and optimization toolboxes from MATLAB, DeepXDE Lu et al. (2020) and TensorFlow Abadi et al. (2015). Although relatively simple, these experiments serve as a proof of concept of the proposed strategy, and are not intended to provide the best possible convergence (which will be the purpose of a future work).
Experiment 1. We consider the standard advection-diffusion-reaction equation
$$\partial_t u + a\,\partial_x u - \nu\,\partial_x^2 u + r\,u = 0, \quad \text{in } \Omega\times(0,T), \qquad (13)$$
on $\Omega$, with Dirichlet boundary conditions at $\partial\Omega$, constant coefficients $a$, $\nu$, $r$, and a given initial condition $u_0$. We decompose the domain into two overlapping subdomains $\Omega_1$ and $\Omega_2$. We here use the classical Schwarz waveform relaxation method, based on Dirichlet transmission conditions, $u^k_{1,\theta_1} = u^{k-1}_{2,\theta_2}$ on $\Gamma_1$ and $u^k_{2,\theta_2} = u^{k-1}_{1,\theta_1}$ on $\Gamma_2$, where $u_{i,\theta_i}$ are the two local NNs defined in $\Omega_i\times(0,T)$. We consider the following data: the two NNs have the same number of layers and of neurons per layer; we select internal collocation points in each subdomain; we use a local SGM with a fixed number of epochs and a fixed mini-batch size; in the gradient method, the learning rate follows a decay schedule. We reconstruct the overall solution using a set of prediction points. We report the reconstructed solution after the first SWR iteration (resp. after convergence of the SWR algorithm) in Fig. 1 (Left) (resp. Fig. 1 (Right)), obtained from the two local solutions at final time, with a nonzero overlapping zone.
The SWR convergence rate is defined as the slope of the logarithm of the residual history as a function of the Schwarz iteration number, where the residual at iteration $k$ is given (for 2 subdomains) by
$$\mathrm{res}^k = \big\| u^k_{1,\theta_1} - u^k_{2,\theta_2} \big\|_{L^2((\Omega_1\cap\Omega_2)\times(0,T))}, \qquad (14)$$
and convergence is declared when $\mathrm{res}^k$ falls below a small parameter $\delta$.
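For completeness, here is a sketch of the residual history (14) and of the corresponding rate estimate, assuming the two local solutions are sampled at shared points of the overlap at each Schwarz iteration:

```python
import numpy as np

def residual_history(iterates_1, iterates_2):
    """Discrete L2 mismatch of the two local solutions in the overlap, per iteration."""
    return np.array([np.sqrt(np.mean((u1 - u2) ** 2))
                     for u1, u2 in zip(iterates_1, iterates_2)])

def convergence_rate(res):
    """Slope of log(residual) vs. Schwarz iteration (least-squares fit)."""
    k = np.arange(len(res))
    return np.polyfit(k, np.log(res), 1)[0]
```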
We report in Fig. 2 (Left) the convergence graph of the stochastic gradient method applied to each local loss function; notice that each "oscillation" corresponds to a new Schwarz iteration. We report in Fig. 2 (Right) the convergence graph of the SWR method, in the form of the residual history in the overlapping zone. For the combination of convergent SWR methods with standard numerical (finite element, finite difference) methods, the residual history converges uniformly to zero as a function of the Schwarz iterations.
Notice that, unlike converged SWR methods combined with standard numerical methods, where the residual goes to zero as $k$ goes to infinity, the residual here does not exactly go to zero. This is due to the fact that, from one Schwarz iteration to the next, the (local) solutions are obtained by constructing "new" local minima: the local loss functions are small but not null, and hence change from one iteration to the next.
Experiment 2. In the following experiment, we implement a Robin-SWR method for solving (13), which is expected to provide better convergence than CSWR Halpern and Szeftel (2010). As discussed in Halpern and Szeftel (2010) and recalled above, the optimized SWR (and more generally Robin-SWR) method is convergent even without overlap, that is, when the overlap is null. We consider the same equation as above, now with two non-overlapping subdomains $\Omega_1$ and $\Omega_2$, and Robin transmission conditions $(\partial_{\mathbf{n}} + \lambda)u^k_{i,\theta_i} = (\partial_{\mathbf{n}} + \lambda)u^{k-1}_{j,\theta_j}$ at the interface, for some fixed $\lambda$, where $u_{i,\theta_i}$ are the two local NNs defined in $\Omega_i\times(0,T)$. We consider the following data: the two NNs have the same number of layers and of neurons per layer; we select internal collocation points; we use a local SGM with a fixed number of epochs and a fixed mini-batch size; in the gradient method, the learning rate follows a decay schedule. We reconstruct the overall solution using a set of prediction points. We report the reconstructed solution after the first SWR iteration (resp. after convergence of the SWR algorithm) in Fig. 3 (Left) (resp. Fig. 3 (Right)), obtained from the two local solutions at final time.
We next report in Fig. 4 (Left) the convergence graph of the stochastic gradient method applied to each local loss function, and in Fig. 4 (Right) the convergence graph of the SWR method, in the form of the residual history.
Importantly, we observe that Robin-SWR-PINN still converges even though the two subdomains do not overlap.
Experiment 2bis. In the following non-overlapping, 2-domain Robin-SWR experiment, we now consider a space-dependent diffusion coefficient; more specifically, $\nu$ takes one value in $\Omega_1$ and a different value in $\Omega_2$. The initial condition is chosen such that the solution has a very different structure in the two subdomains. We want here to illustrate the ability of the derived approach to select different depths for the local neural networks, depending on the structure of the solution: in $\Omega_1$ (resp. $\Omega_2$) the solution is mainly null (resp. oscillatory), except close to the interface. The two subdomains do not overlap. The two local NNs $u_{i,\theta_i}$ have the following structure: the network in $\Omega_1$ possesses fewer layers and neurons than the one in $\Omega_2$. The minimization process in $\Omega_1$ is much more efficiently performed than in $\Omega_2$, with a relatively similar accuracy. As above, we select internal collocation points, and use a local SGM with a fixed number of epochs and a fixed mini-batch size; in the gradient method, the learning rate follows a decay schedule. We reconstruct the overall solution using a set of prediction points. We report the reconstructed solution after the first SWR iteration (resp. after convergence of the SWR algorithm) in Fig. 5 (Top-Left) (resp. Fig. 5 (Top-Right)), obtained from the two local solutions at final time. We also zoom in, in Fig. 5 (Bottom-Left), on the interface region, to better observe the SWR convergence. The local loss functions are represented in Fig. 5 (Bottom-Right). We observe that, roughly, the computation of the solution in $\Omega_1$ was several times faster than in $\Omega_2$.
Experiment 3. In this last experiment, we consider a two-dimensional advection-diffusion equation on a square domain, with constant advection and diffusion coefficients. The two subdomains $\Omega_1$ and $\Omega_2$ are two overlapping rectangles, with interfaces located at two vertical lines. The initial data is a Gaussian function, and we solve up to a fixed final computational time. A classical SWR algorithm is here combined with the PINN method. On the other subdomain boundaries, we impose null Dirichlet boundary conditions. The equation is solved using the library DeepXDE Lu et al. (2020) combined with TensorFlow Abadi et al. (2015). In each subdomain, we use a fully connected neural network; Adam's optimizer is used, along with a standard activation function. In Fig. 6 (Top), we report the initial data. In Fig. 6, we also represent the solution at the end of the first Schwarz iteration (Left) and the fully converged solution (Right) at final time. In future works, we will propose more advanced simulations. The corresponding code is available on GitHub, where the interested reader can find all the relevant information regarding the code.
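To give an idea of the setup, the following sketch solves one such subdomain problem with DeepXDE (a recent version of its API is assumed); the geometry, coefficients, network size and training parameters are illustrative and not the values used in the experiment, and the SWR coupling would additionally impose the neighbor's trace as a boundary condition on the interface.

```python
import numpy as np
import deepxde as dde

# One rectangular subdomain in space, coupled to a time interval.
geom = dde.geometry.Rectangle([-1.0, -1.0], [0.1, 1.0])   # e.g. left strip
geomtime = dde.geometry.GeometryXTime(geom, dde.geometry.TimeDomain(0.0, 1.0))

a1, a2, nu = 1.0, 1.0, 0.1   # assumed advection and diffusion coefficients

def pde(x, u):
    # x = (x1, x2, t); 2D advection-diffusion residual.
    du_t = dde.grad.jacobian(u, x, i=0, j=2)
    du_x = dde.grad.jacobian(u, x, i=0, j=0)
    du_y = dde.grad.jacobian(u, x, i=0, j=1)
    du_xx = dde.grad.hessian(u, x, i=0, j=0)
    du_yy = dde.grad.hessian(u, x, i=1, j=1)
    return du_t + a1 * du_x + a2 * du_y - nu * (du_xx + du_yy)

bc = dde.icbc.DirichletBC(geomtime, lambda x: 0.0,
                          lambda x, on_boundary: on_boundary)
ic = dde.icbc.IC(geomtime,
                 lambda x: np.exp(-10.0 * (x[:, 0:1] ** 2 + x[:, 1:2] ** 2)),
                 lambda x, on_initial: on_initial)

data = dde.data.TimePDE(geomtime, pde, [bc, ic],
                        num_domain=2000, num_boundary=200, num_initial=200)
net = dde.nn.FNN([3] + [20] * 4 + [1], "tanh", "Glorot normal")
model = dde.Model(data, net)
model.compile("adam", lr=1e-3)
model.train(iterations=10000)
```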
4 Conclusion
In this paper, we have derived a Schwarz Waveform Relaxation Physics-Informed Neural Network (SWR-PINN) method for solving advection-diffusion-reaction equations in parallel. Some preliminary illustrative experiments were presented to validate the approach.
4.1 Pros and cons of the SWR-PINN method
We summarize below the pros and cons of the proposed method.
Pros.

Embarrassingly parallel training of the local loss functions.

Parallel construction of local neural networks with adaptive depth and complexity.

For convergent PINN algorithms, the SWR-PINN method is convergent.

Flexible choice of the transmission conditions.
Cons.

As fixed point methods, SWR methods require several iterations.

The transmission conditions must be accurately satisfied, through a penalization term in the loss function, in order to accurately implement the SWR algorithm. Ideally, we should directly include the transmission conditions within the definition of the NN. This is possible, considering the CSWR (Dirichlet-based transmission conditions) method and a trial network built to exactly match the Dirichlet data at the subdomain interfaces, in the spirit of the initial-condition embedding of Subsection 2.1.

Convergence, or high precision, of the overall algorithm can be hard to reach if the PINN algorithm is not used with sufficiently high precision. Unstable numerical behavior can also be observed with the CSWR method.
4.2 Concluding remarks and future investigations
As far as we know, this paper is the first attempt to combine the SWR and PINN methods. Although the theory of SWR-DDM is now well developed, in terms of convergence and convergence rate, for different types of evolution PDEs and their approximation with finite difference and finite element methods, the theory of convergence of PINNs is not yet complete. Consequently, the convergence of the overall SWR-PINN method is still subject to the proof of convergence of the latter, which is largely empirically established. In future works, we plan to focus on "real-life" experiments, where the main benefits of the SWR-PINN method will be exhibited and illustrated.
References
[1] Abadi, M. et al. (2015) TensorFlow: large-scale machine learning on heterogeneous systems. Software available from tensorflow.org.
[2] Antoine, X. and Lorin, E. (2018) Asymptotic estimates of the convergence of classical Schwarz waveform relaxation domain decomposition methods for two-dimensional stationary quantum waves. ESAIM Math. Model. Numer. Anal. 52 (4), pp. 1569–1596.
[3] Antoine, X. and Lorin, E. (2017) An analysis of Schwarz waveform relaxation domain decomposition methods for the imaginary-time linear Schrödinger and Gross-Pitaevskii equations. Numer. Math. 137 (4), pp. 923–958.
[4] Antoine, X. et al. (2018) Multilevel preconditioning technique for Schwarz waveform relaxation domain decomposition method for real- and imaginary-time nonlinear Schrödinger equation. Appl. Math. Comput. 336, pp. 403–417.
[5] Antoine, X. and Lorin, E. (2019) Asymptotic convergence rates of Schwarz waveform relaxation algorithms for Schrödinger equations with an arbitrary number of subdomains. Multiscale Science and Engineering 1 (1), pp. 34–46.
[6] Antoine, X. and Lorin, E. (2019) On the rate of convergence of Schwarz waveform relaxation methods for the time-dependent Schrödinger equation. J. Comput. Appl. Math. 354, pp. 15–30.
[7] Bottou, L. (2010) Large-scale machine learning with stochastic gradient descent. In Proceedings of COMPSTAT'2010, pp. 177–186.
[8] (2014) A review of definitions for fractional derivatives and integral. Mathematical Problems in Engineering 2014.
[9] Després, B. (2021) Analyse numérique et neural networks. Technical report, Université de Paris, https://www.ljll.math.upmc.fr/despres.
[10] Dolean, V., Jolivet, P. and Nataf, F. (2015) An introduction to domain decomposition methods: theory and parallel implementation.
[11] Gander, M.J. and Halpern, L. (2007) Optimized Schwarz waveform relaxation methods for advection reaction diffusion problems. SIAM J. Numer. Anal. 45 (2).
[12] Gander, M.J. et al. (1999) Optimal convergence for overlapping and non-overlapping Schwarz waveform relaxation. In Proceedings of the 11th International Conference on Domain Decomposition, pp. 27–36.
[13] Gander, M.J. (2003) Optimal Schwarz waveform relaxation methods for the one-dimensional wave equation. SIAM J. Numer. Anal. 41, pp. 1643–1681.
[14] Gander, M.J. (2006) Optimized Schwarz methods. SIAM J. Numer. Anal. 44, pp. 699–731.
[15] Halpern, L. and Szeftel, J. (2010) Optimized and quasi-optimal Schwarz waveform relaxation for the one-dimensional Schrödinger equation. Math. Models Methods Appl. Sci. 20 (12), pp. 2167–2199.
[16] Heinlein, A. et al. (2021) Combining machine learning and domain decomposition methods for the solution of partial differential equations—a review. GAMM-Mitteilungen 44 (1).
[17] Jagtap, A.D. and Karniadakis, G.E. (2020) Extended physics-informed neural networks (XPINNs): a generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. Commun. Comput. Phys. 28 (5), pp. 2002–2041.
[18] Lagaris, I.E., Likas, A. and Fotiadis, D.I. (1998) Artificial neural networks for solving ordinary and partial differential equations. IEEE Transactions on Neural Networks 9 (5), pp. 987–1000.
[19] Lischke, A. et al. (2020) What is the fractional Laplacian? A comparative review with new results. J. Comput. Phys. 404.
[20] Lu, L., Meng, X., Mao, Z. and Karniadakis, G.E. (2020) DeepXDE: a deep learning library for solving differential equations. arXiv:1907.04502.
[21] Pang, G., Lu, L. and Karniadakis, G.E. (2019) fPINNs: fractional physics-informed neural networks. SIAM J. Sci. Comput. 41 (4), pp. A2603–A2626.
[22] Raissi, M., Perdikaris, P. and Karniadakis, G.E. (2019) Physics-informed neural networks: a deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics 378, pp. 686–707.
[23] Robbins, H. and Monro, S. (1951) A stochastic approximation method. Ann. Math. Statistics 22, pp. 400–407.
[24] Sun, S. et al. (2020) A survey of optimization methods from a machine learning perspective. IEEE Transactions on Cybernetics 50 (8), pp. 3668–3681.
[25] Yang, L., Zhang, D. and Karniadakis, G.E. (2020) Physics-informed generative adversarial networks for stochastic differential equations. SIAM J. Sci. Comput. 42 (1), pp. A292–A317.