Deep Domain Decomposition Method: Elliptic Problems

04/10/2020
by   Wuyang Li, et al.

This paper proposes a deep-learning-based domain decomposition method (DeepDDM), which leverages deep neural networks (DNN) to discretize the subproblems produced by domain decomposition methods (DDM) for solving partial differential equations (PDE). Using DNN to solve PDE is a physics-informed learning problem whose objective involves two terms, a domain term and a boundary term, which respectively make the desired solution satisfy the PDE and the corresponding boundary conditions. DeepDDM exchanges subproblem information across the interfaces of DDM by adjusting the boundary term when solving each subproblem by DNN. Benefiting from the simple implementation and mesh-free strategy of using DNN for PDE, DeepDDM simplifies the implementation of DDM and makes DDM more flexible for complex PDE, e.g., those with complex interfaces inside the computational domain. This paper first investigates the performance of DeepDDM on elliptic problems, including a model problem and an interface problem. The numerical examples demonstrate that DeepDDM exhibits behaviors consistent with conventional DDM: the number of iterations of DeepDDM is independent of the network architecture and decreases with increasing overlap size. The performance of DeepDDM on elliptic problems encourages us to further investigate its performance for other kinds of PDE and may provide new insights for improving PDE solvers based on deep learning.


1 Introduction

In the realms of engineering and scientific computing, domain decomposition methods (DDM) have attracted a great deal of interest and are well known as an efficient approach for solving partial differential equations (PDE). The main idea of DDM is first to split the computational domain into smaller subdomains and then to solve, in parallel, the subproblems defined on the subdomains, while exchanging solution information between adjacent subdomains. When combined with the discretization of PDE by finite element methods (FEM) or finite difference methods (FDM), DDM can achieve remarkable performance. This paper proposes an approach, named DeepDDM, that investigates the behavior of discretizing PDE by deep neural networks (DNN) within DDM. Compared with DDM-FEM/DDM-FDM, DeepDDM simplifies the implementation of DDM. In particular, DeepDDM is more flexible for complex PDE, e.g., those with complex interfaces inside the computational domain.

Due to the success of deep learning (DL) in engineering, research on using deep learning to solve PDE has attracted interest from numerous researchers, and many relevant papers focusing on different topics have been published recently. [26] presents physics-informed neural networks (PINN), which take initial conditions and boundary conditions as penalties in the optimization objective (loss function). By using the backpropagation algorithm and automatic differentiation, it realizes the calculation of high-order derivatives in the framework of TensorFlow. The network can achieve good accuracy for both forward and inverse problems.

[29] proposes a deep Galerkin method, which is similar to PINN in that initial conditions and boundary conditions are also added as penalty terms to the optimization objective. The difference is that [29] bypasses the high-order differentiation of the neural network through a Monte Carlo method; as is well known, the computational cost of high-order differentiation of neural networks is exorbitant. Another method, the deep Ritz method, is presented by [8], in which different network architectures and loss functions are designed for variational problems, particularly the ones that arise from PDE. A large multi-layer convolutional neural network is used to solve and discover evolutionary PDE in [22]. Recently, a large number of researchers have attempted to use deep learning tools to solve various practical problems [1, 27].

In the field of combining machine learning and domain decomposition, [23] presented a combination of radial basis function network methods and domain decomposition techniques for approximating functions and solving Poisson equations. Recently, while we were preparing this paper, some related works combining domain decomposition methods and deep learning have appeared. [20] presented D3M, a method that combines the deep Ritz method and DDM to solve general PDE in parallel. In contrast, our approach focuses on the performance of combining PINN and DDM for elliptic problems, especially problems with a complex interface. [7] presented a distributed PINN that divides the problem into many subproblems according to a domain splitting and then uses the total loss, i.e., the sum of all subproblem losses and the loss on the interface, to train all the subproblems at the same time. We note that the training objective of each subproblem in the distributed PINN is related to its neighbors at every training step. This means that the distributed PINN is not a DDM-style method, since in DDM each subproblem is first solved independently and the interface information is then exchanged, as in D3M and the proposed DeepDDM. On the application front, [15] applied PINN in different domains of cardiovascular flow modeling, in a spirit similar to the distributed PINN.

We present a deep-learning-based domain decomposition framework, called the deep domain decomposition method (DeepDDM), which combines the spirit of deep learning and domain decomposition. Using DNN to solve PDE is a physics-informed learning problem whose objective involves two terms, a domain term and a boundary term, which respectively make the desired solution satisfy the PDE and the corresponding boundary conditions [26]. By dividing the domain of interest into several subdomains, DeepDDM alternately solves each subproblem by DNN and exchanges interface information until convergence. The decomposition of the domain of interest is based on the properties of the original physical background or on computational convenience. Since DNN turns solving a PDE into a simple learning problem and is effectively a mesh-free strategy, DeepDDM simplifies the implementation of DDM and makes DDM more flexible for complex PDE.

This paper first investigates the performance of DeepDDM on elliptic problems, including a model problem and an interface problem. For the model problem with the Dirichlet boundary condition, we consider the original problem divided into two or four subproblems, in combination with varying overlap sizes and varying discrete function spaces (network architectures). The results demonstrate that DeepDDM has properties similar to DDM with FEM or FDM: the necessary number of iterations is independent of the network architecture for a given number of subproblems; the necessary number of iterations increases with the number of subproblems; the necessary number of iterations decreases as the overlap size increases; and the numerical convergence rate coincides with the analytic convergence rate in various cases. Next, we investigate the performance of DeepDDM on an artificial interface problem, which is divided into two subproblems. Although only the elementary Dirichlet-Neumann interface condition is adopted, DeepDDM also exhibits good performance for the interface problem, just as for the model problem. In addition, DeepDDM reaches the target relative error accuracy within several iterations for varying network architectures, even when the coefficient contrast (the ratio of the maximal to the minimal diffusion coefficient of the interface problem) is large. In conclusion, as an approach with a simple implementation, DeepDDM produces results similar to those of DDM with FEM or FDM and may improve the performance of DDM for interface problems.

This paper is structured as follows. In Section 2 and Section 3, we provide brief reviews of DDM and PINN, respectively. A detailed description of DeepDDM for a general differential operator and boundary condition is presented in Section 4. We implement and test the performance of DeepDDM on two kinds of elliptic problems in Section 5. Section 6 concludes the paper and presents some directions for future work.

Figure 1: The diagram used by Schwarz in 1870.

2 Domain decomposition methods

DDM are parallel, potentially fast, robust algorithms for solving PDE that have been discretized using, e.g., FEM or FDM. As early as the 19th century, Hermann Schwarz considered a Poisson problem set on a union of simple geometries and introduced an alternating iterative method in [28]. Nearly a century later, Pierre-Louis Lions presented parallel Schwarz methods for parallel computing in [21], after parallel computing capability became available. For the discrete system, the multiplicative Schwarz method, sequential in nature, is introduced in [3, 30], and the additive Schwarz method, which is parallel, is studied by Dryja and Widlund in [6]. With the wide use of parallel computers, DDM have developed rapidly and provided many fast iterative algorithms and efficient preconditioners. Balancing domain decomposition by constraints (BDDC) and finite element tearing and interconnecting (FETI) have been studied for a variety of PDE as typical representatives of primal methods and dual methods, respectively [5, 18, 24, 25]. Moreover, a hybrid between the primal and dual approaches, dual-primal finite element tearing and interconnecting (FETI-DP), which enforces equality of the solution at the subdomain interfaces by Lagrange multipliers, except at the subdomain corners, was initially introduced in [9] and has developed into many variants [5, 17].

Here we briefly introduce a classical DDM, the Schwarz alternating method of 1870. Consider the model problem

$$-\Delta u = f \ \text{in } \Omega, \qquad u = g \ \text{on } \partial\Omega, \qquad\qquad (1)$$

where $\Omega = \Omega_1 \cup \Omega_2$, shown in Figure 1, may be divided into a disk $\Omega_1$ and a rectangle $\Omega_2$, with artificial interfaces $\Gamma_1 = \partial\Omega_1 \cap \Omega_2$ and $\Gamma_2 = \partial\Omega_2 \cap \Omega_1$, and $g$ is a function given as the boundary condition. The Schwarz alternating method starts with an initial guess $u_2^0$ along $\Gamma_1$ and then alternately computes $u_1^n$ and $u_2^n$, $n = 1, 2, \ldots$, as follows:

$$\begin{aligned} -\Delta u_1^n &= f \ \text{in } \Omega_1, & u_1^n &= g \ \text{on } \partial\Omega_1 \setminus \Gamma_1, & u_1^n &= u_2^{n-1} \ \text{on } \Gamma_1, \\ -\Delta u_2^n &= f \ \text{in } \Omega_2, & u_2^n &= g \ \text{on } \partial\Omega_2 \setminus \Gamma_2, & u_2^n &= u_1^{n} \ \text{on } \Gamma_2. \end{aligned}$$

Schwarz presented the earliest alternating Schwarz method to prove Dirichlet's principle. Later, with the development of parallel computing and computational mathematics, a multitude of interesting DDM have been developed. Formally, we can extend the Schwarz method to a general differential operator $\mathcal{L}$ with boundary operator $\mathcal{B}$ as follows:

$$\mathcal{L}[u] = f \ \text{in } \Omega, \qquad \mathcal{B}[u] = g \ \text{on } \partial\Omega, \qquad\qquad (2)$$

where $\mathcal{L}$ can be the negative Laplace operator $-\Delta$ or any reasonable operator, $\mathcal{B}$ can be the identity (Dirichlet) operator or any other boundary operator, and $f$ and $g$ are two given functions. We always presume that problem (2) is well posed. Suppose that $\mathcal{I}$ is the operator of an artificial interface transmission condition, e.g., Dirichlet, Neumann or Robin. We then have a general parallel DDM, with the domain setting of Figure 1, listed in Algorithm 1. Different from the Schwarz alternating method, Algorithm 1 can be implemented in parallel.

1:   Give initial guesses $u_2^0$ along $\Gamma_1$ and $u_1^0$ along $\Gamma_2$
2:  for $n = 1, 2, \ldots$ do
3:     Solve for $u_1^n$:
(3)        $\mathcal{L}[u_1^n] = f$ in $\Omega_1$, $\quad \mathcal{B}[u_1^n] = g$ on $\partial\Omega_1 \setminus \Gamma_1$, $\quad \mathcal{I}[u_1^n] = \mathcal{I}[u_2^{n-1}]$ on $\Gamma_1$
4:     Record the interface data $\mathcal{I}[u_1^n]$ on $\Gamma_2$
5:     Solve for $u_2^n$:
(4)        $\mathcal{L}[u_2^n] = f$ in $\Omega_2$, $\quad \mathcal{B}[u_2^n] = g$ on $\partial\Omega_2 \setminus \Gamma_2$, $\quad \mathcal{I}[u_2^n] = \mathcal{I}[u_1^{n-1}]$ on $\Gamma_2$
6:     Record the interface data $\mathcal{I}[u_2^n]$ on $\Gamma_1$
7:     if the relative change of $u_1^n$ on $\Gamma_1$ is less than $\epsilon_{bc}$ and the relative change inside $\Omega_1$ is less than $\epsilon_{in}$ then
8:        STOP
9:     end if
10:     if the relative change of $u_2^n$ on $\Gamma_2$ is less than $\epsilon_{bc}$ and the relative change inside $\Omega_2$ is less than $\epsilon_{in}$ then
11:        STOP
12:     end if
13:  end for
Algorithm 1 A parallel Domain Decomposition Method for the Two Subdomains in Figure 1

Here, in Algorithm 1, the stopping criterion is that the relative error of the current solution with respect to the previous one, on the artificial interface and inside the subdomain, is less than a given tolerance $\epsilon_{bc}$ or $\epsilon_{in}$, respectively.
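To make the structure of Algorithm 1 concrete, the following is a minimal Python sketch of its outer loop. The routines `solve_subproblem` and `trace` are hypothetical placeholders for any discretization of subproblems (3) and (4), e.g., FEM, FDM, or the DNN solver introduced later; the stopping tests follow the relative-error criteria described above.

```python
import numpy as np

def relative_change(new, old):
    """Relative difference between two sampled solution vectors."""
    return np.linalg.norm(new - old) / max(np.linalg.norm(old), 1e-14)

def parallel_ddm(solve_subproblem, trace, g1, g2,
                 eps_bc=1e-3, eps_in=1e-3, max_iter=50):
    """Skeleton of the parallel two-subdomain DDM in Algorithm 1.

    solve_subproblem(s, interface_data): solves (3) or (4) on subdomain s.
    trace(u, where): samples u on 'Gamma1'/'Gamma2' or inside 'Omega1'/'Omega2'.
    g1, g2: initial interface guesses along Gamma1 and Gamma2.
    """
    u1_prev = u2_prev = None
    for n in range(1, max_iter + 1):
        # The two solves are independent and could run on different processors.
        u1 = solve_subproblem(1, g1)
        u2 = solve_subproblem(2, g2)
        # Exchange interface information for the next iteration.
        g1_new, g2_new = trace(u2, "Gamma1"), trace(u1, "Gamma2")
        if u1_prev is not None:
            stop_1 = (relative_change(g1_new, g1) < eps_bc and
                      relative_change(trace(u1, "Omega1"), trace(u1_prev, "Omega1")) < eps_in)
            stop_2 = (relative_change(g2_new, g2) < eps_bc and
                      relative_change(trace(u2, "Omega2"), trace(u2_prev, "Omega2")) < eps_in)
            if stop_1 and stop_2:
                break
        g1, g2 = g1_new, g2_new
        u1_prev, u2_prev = u1, u2
    return u1, u2
```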

3 Physics-informed neural networks

We now introduce physics-informed neural networks, which are deep fully connected feedforward neural networks, for solving PDE [26]. The entire neural network consists of $L+1$ layers, where layer $0$ is the input layer and layer $L$ is the output layer; layers $1, \ldots, L-1$ are the hidden layers. All of the layers have an activation function, excluding the output layer. The activation function can take the form of Sigmoid, Tanh (hyperbolic tangent) or ReLU (rectified linear units).

We denote by $\{n_0, n_1, \ldots, n_L\}$ a list of integers, with $n_0$ and $n_L$ representing the lengths of the input signal and output signal of the neural network. Define the affine maps $T_\ell(x) = W_\ell x + b_\ell$, $\ell = 1, \ldots, L$, where $W_\ell \in \mathbb{R}^{n_\ell \times n_{\ell-1}}$ and $b_\ell \in \mathbb{R}^{n_\ell}$. Thus, we can simply represent a deep fully connected feedforward neural network by the composite function

$$u_{NN}(x; \theta) = T_L \circ \sigma \circ T_{L-1} \circ \cdots \circ \sigma \circ T_1(x),$$

where $\sigma$ is the activation function and $\theta = \{W_\ell, b_\ell\}_{\ell=1}^{L}$ represents the collection of all parameters.

Solving a general PDE such as (2) by DNN is a physics-informed minimization problem with an objective consisting of two terms:

$$\min_{\theta} \ L(\theta) = \frac{1}{N_f}\sum_{i=1}^{N_f}\big|\mathcal{L}[u_{NN}](x_f^i; \theta) - f(x_f^i)\big|^2 + \frac{1}{N_g}\sum_{i=1}^{N_g}\big|\mathcal{B}[u_{NN}](x_g^i; \theta) - g(x_g^i)\big|^2, \qquad\qquad (5)$$

where $\{x_f^i\}_{i=1}^{N_f}$ and $\{x_g^i\}_{i=1}^{N_g}$ are the collocation points in the interior and on the boundary, respectively. The domain term and the boundary term enforce the condition that the optimized neural network satisfies $\mathcal{L}[u_{NN}] = f$ and $\mathcal{B}[u_{NN}] = g$, respectively.
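As an illustration, the following is a minimal PyTorch sketch of objective (5) for the special case $\mathcal{L} = -\Delta$ and $\mathcal{B}$ the identity (the paper works in TensorFlow; the structure is analogous). The network sizes and the callables `f` and `g` are placeholders, and the Laplacian is obtained by automatic differentiation.

```python
import torch
import torch.nn as nn

class MLP(nn.Module):
    """Fully connected feedforward network u_NN(x; theta) with Tanh activations."""
    def __init__(self, sizes=(2, 20, 20, 20, 1)):
        super().__init__()
        layers = []
        for i in range(len(sizes) - 1):
            layers.append(nn.Linear(sizes[i], sizes[i + 1]))
            if i < len(sizes) - 2:              # no activation on the output layer
                layers.append(nn.Tanh())
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x)

def pinn_loss(model, x_f, x_g, f, g):
    """Objective (5) for the illustrative case -Laplace(u) = f in Omega, u = g on the boundary.

    x_f: interior collocation points of shape (N_f, d); x_g: boundary points (N_g, d);
    f, g: callables returning tensors of shape (N, 1).
    """
    x_f = x_f.clone().requires_grad_(True)
    u = model(x_f)
    grad_u = torch.autograd.grad(u.sum(), x_f, create_graph=True)[0]
    lap_u = torch.zeros_like(u)
    for i in range(x_f.shape[1]):               # Laplacian via repeated autodiff
        lap_u = lap_u + torch.autograd.grad(
            grad_u[:, i].sum(), x_f, create_graph=True)[0][:, i:i + 1]
    domain_term = torch.mean((-lap_u - f(x_f)) ** 2)        # PDE residual inside Omega
    boundary_term = torch.mean((model(x_g) - g(x_g)) ** 2)  # Dirichlet mismatch on the boundary
    return domain_term + boundary_term
```

A training step would then simply apply a stochastic optimizer to `pinn_loss(model, x_f, x_g, f, g)`.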

Gradient descent methods can be used to solve this kind of optimization problem; however, from an empirical point of view, the more effective and efficient stochastic gradient descent with minibatches is recommended [19]. With regard to the convergence analysis of stochastic gradient descent, there are many early contributions, such as [2, 16].

We should note that PINN actually provide another kind of discretization scheme for solving PDE. As in the cases of FEM and FDM, a basic and interesting question arises: can $u_{NN}$ approximate the solution of the PDE? A well-known answer is the following: if the solution is bounded and continuous, then $u_{NN}$ can approximate it to any desired accuracy as the number of hidden neurons increases [4, 13, 12].

4 Deep domain decomposition methods

Now, it is natural to mix DDM with DNN. Here, we provide a general formulation. Suppose that we divide problem (2) into $S$ subproblems defined on subdomains $\Omega_s$, $s = 1, \ldots, S$, as follows:

$$\mathcal{L}_s[u_s] = f_s \ \text{in } \Omega_s, \qquad \mathcal{B}_s[u_s] = g_s \ \text{on } \partial\Omega_s \setminus \Gamma_s, \qquad \mathcal{I}[u_s] = \mathcal{I}[\bar{u}_s] \ \text{on } \Gamma_s, \qquad\qquad (6)$$

where $\mathcal{L}_s$, $f_s$, $\mathcal{B}_s$ and $g_s$ are the expressions of $\mathcal{L}$, $f$, $\mathcal{B}$ and $g$ in domain $\Omega_s$, respectively, and $\Gamma_s$ is the artificial interface of domain $\Omega_s$ to the other subdomains. To simplify the discussion, we simply denote by $\bar{u}_s$ the solution of the neighboring subproblems of $\Omega_s$. We also denote by $\mathcal{I}$ the operator of the artificial interface transmission conditions.

We denote by $u_s(x; \theta_s)$, $s = 1, \ldots, S$, the DNN we use for each subproblem (6), i.e., a surrogate of the solution on $\Omega_s$, and denote by $\theta_s$ the network parameters of the $s$-th subproblem. That means different neural networks are used for different subproblems. In contrast with the minimization problem (5), we can define $S$ subminimization problems as follows:

$$\min_{\theta_s} \ L_s(\theta_s) = L_{f,s}(\theta_s) + L_{g,s}(\theta_s) + L_{\Gamma,s}(\theta_s),$$

where

$$L_{f,s} = \frac{1}{N_f^s}\sum_{i=1}^{N_f^s}\big|\mathcal{L}_s[u_s](x_{f,s}^i) - f_s(x_{f,s}^i)\big|^2, \quad L_{g,s} = \frac{1}{N_g^s}\sum_{i=1}^{N_g^s}\big|\mathcal{B}_s[u_s](x_{g,s}^i) - g_s(x_{g,s}^i)\big|^2, \quad L_{\Gamma,s} = \frac{1}{N_\Gamma^s}\sum_{i=1}^{N_\Gamma^s}\big|\mathcal{I}[u_s](x_{\Gamma,s}^i) - \mathcal{I}[\bar{u}_s](x_{\Gamma,s}^i)\big|^2,$$

with $\{x_{f,s}^i\}$, $\{x_{g,s}^i\}$ and $\{x_{\Gamma,s}^i\}$ representing the collocation points in the interior, on the local boundary and on the interface of the $s$-th subproblem, respectively. Denote by $X_s$ the set of all training data of the $s$-th subproblem. When the training data are rearranged into several minibatches at each epoch, we split only the data inside the subdomain. That is, at each epoch, we first randomly rearrange $\{x_{f,s}^i\}$ into $K$ disjoint parts $X_{f,s}^1, \ldots, X_{f,s}^K$, and then let $X_{f,s}^k \cup \{x_{g,s}^i\} \cup \{x_{\Gamma,s}^i\}$ be the $k$-th minibatch.¹ For simplicity, we denote by $\mathcal{I}[\bar{u}_s]$ the information transported to the objective subproblem labelled by $s$ from its neighboring subproblems. Let $\eta$ be the learning rate. The DeepDDM algorithm for subproblem (6) is given in Algorithm 2.

¹ We numerically found that this strategy makes the learning process stable. Here we give a heuristic explanation: since the number of boundary or interface points is much smaller than the number of interior points, directly rearranging the total training data could leave some minibatches with no boundary information, whereas for solving a general PDE the boundary information is essential. A more detailed theoretical or numerical analysis would strengthen this point, but since it is somewhat independent of the key idea of DeepDDM, we omit the related numerical discussion.

1:  Construct the network $u_s(x; \theta_s)$;
2:  Initialize the parameters $\theta_s^{(0)}$ and the interface information $\mathcal{I}[\bar{u}_s]$ along $\Gamma_s$;
3:  for $n = 1, 2, \ldots$ do
4:     Set $\theta_s \leftarrow \theta_s^{(n-1)}$; {Start DDM iteration}
5:     for $e = 1, 2, \ldots$ do
6:        Set $L_s^{old} \leftarrow L_s(\theta_s)$; {Start DL iteration}
7:        Randomly rearrange the training data $\{x_{f,s}^i\}$ into the minibatches $X_{f,s}^k \cup \{x_{g,s}^i\} \cup \{x_{\Gamma,s}^i\}$, $k = 1, \ldots, K$;
8:        for $k = 1, \ldots, K$ do
9:            Update $\theta_s \leftarrow \theta_s - \eta \nabla_{\theta_s} L_s(\theta_s)$; {Update on minibatch $k$}
10:        end for
11:        if $|L_s(\theta_s) - L_s^{old}| / L_s^{old} < \epsilon_{loss}$ then
12:           BREAK;
13:        end if
14:     end for
15:     Set $\theta_s^{(n)} \leftarrow \theta_s$;
16:     Interchange the interface information $\mathcal{I}[\bar{u}_s]$ with the neighboring subproblems;
17:     if the relative change of $u_s(\cdot; \theta_s^{(n)})$ on the interface collocation points is less than $\epsilon_{bc}$ then
18:        STOP;
19:     end if
20:     if the relative change of $u_s(\cdot; \theta_s^{(n)})$ on the interior collocation points is less than $\epsilon_{in}$ then
21:        STOP;
22:     end if
23:  end for
Algorithm 2 DeepDDM for the $s$-th subproblem (6), $s = 1, \ldots, S$.

We give some remarks as follows:

  • The loop in Step 3 implements the iteration among subproblems, i.e., domain decomposition; the loop in Step 5 implements the iteration of the objective function in each subproblem, i.e., deep learning; and the loop in Step 8 implements the iteration on each minibatch;

  • The stopping criterion for the subproblem optimization, shown in Step 11, is the relative error between the current loss and the loss recorded at the previous epoch. Once a subproblem reaches this criterion, it breaks out of the loop started at Step 5. The interface information is interchanged only after the corresponding loops of all subproblems have been broken.

  • The stopping criterion for DDM, shown in Step 17 and Step 20, is the relative error of the solution on the interface collocation points and on the interior collocation points, respectively. The DeepDDM algorithm stops only after all subproblems reach their stopping criteria.

Computational cost: Obviously, benefiting from DDM, DeepDDM can be implemented in parallel. Suppose we fix the total number of training data, denoted by $N$. Once we divide the domain into $S$ subdomains, the number of training data for each subproblem is about $N/S$. For a fixed network architecture, the training time of one epoch is proportional to this number, i.e., $O(N/S)$. Denote the number of training epochs (the DL iteration in Algorithm 2) for each subproblem by $N_{epoch}$, and denote the number of DDM iterations by $N_{DDM}$, which usually depends on the number of subdomains. Then, for a fixed network architecture, the total computational cost of DeepDDM is $O(N_{DDM} \cdot N_{epoch} \cdot N/S)$ per subproblem. As for the dependency of the computational cost on the network architecture, it depends strongly on the computational device (GPU) and on the detailed operators in the network. For example, if the model size lies in a reasonable range and we use an NVIDIA V100, then, for a fixed number of training data, the training time for one epoch varies little.
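To summarize how the pieces fit together, the following Python sketch mirrors the structure of Algorithm 2 for one group of subproblems. All names (`subdomains`, `exchange_interface`, and the per-subdomain `loss`, `minibatches`, `predict` methods) are hypothetical placeholders for the components defined above; the inner loop is ordinary minibatch training of the subnetwork, and the outer loop exchanges interface information.

```python
def deep_ddm(subdomains, exchange_interface, relative_change,
             eps_loss=1e-4, eps_bc=1e-3, eps_in=1e-3,
             max_outer=50, max_epochs=10_000):
    """Schematic of Algorithm 2: an outer DDM loop around inner DL training.

    Each element of `subdomains` is assumed to provide: .optimizer, .minibatches(),
    .loss(batch, interface_data), .predict(points), .interface_points, .interior_points.
    `exchange_interface(subdomains)` returns the neighbors' traces on each interface.
    """
    interface_data = [None] * len(subdomains)    # initial interface guesses
    prev_bc = prev_in = None
    for outer in range(max_outer):                # DDM iteration (Step 3)
        for s, sub in enumerate(subdomains):      # each subproblem; can run in parallel
            prev_loss = None
            for epoch in range(max_epochs):       # DL iteration (Step 5)
                for batch in sub.minibatches():   # minibatch updates (Steps 8-10)
                    sub.optimizer.zero_grad()
                    loss = sub.loss(batch, interface_data[s])
                    loss.backward()
                    sub.optimizer.step()
                current = loss.item()
                if prev_loss is not None and abs(current - prev_loss) / abs(prev_loss) < eps_loss:
                    break                         # inner stopping test (Step 11)
                prev_loss = current
        interface_data = exchange_interface(subdomains)   # Step 16
        bc_vals = [sub.predict(sub.interface_points) for sub in subdomains]
        in_vals = [sub.predict(sub.interior_points) for sub in subdomains]
        if prev_bc is not None and \
           all(relative_change(a, b) < eps_bc for a, b in zip(bc_vals, prev_bc)) and \
           all(relative_change(a, b) < eps_in for a, b in zip(in_vals, prev_in)):
            break                                 # outer stopping tests (Steps 17 and 20)
        prev_bc, prev_in = bc_vals, in_vals
    return subdomains
```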

5 Numerical examples

In this section, we present a set of systematic numerical results for elliptic problems, including a model problem and an interface problem with a curved interface. The results for the model problem with different numbers of subdomains are compared with respect to various network architectures and overlap sizes in Section 5.1. In Section 5.2, the numerical results for the interface problem with discontinuous coefficients are shown, further demonstrating the effectiveness and robustness of our approach. Benefiting from the mesh-free strategy, the implementation of domain decomposition can be handled easily, even in cases of curved interfaces or more complex geometries. DeepDDM not only attains promising accuracy but also preserves some properties of DDM with conventional discretization strategies.

Setting: In our experiments, the network architectures simply have an equal number of units in each layer. In what follows, $N_l$ indicates the number of hidden layers in the network and $N_u$ indicates the number of units per layer. We choose Tanh as the activation function and stochastic gradient descent with Adam [14] as the optimizer. The minibatch size is set to 64 in this paper, unless otherwise specified. For the learning rate, the initial values and decay rates change between situations; specifically, the learning rate starts from a case-dependent initial value and decays by a fixed base every fixed number of steps. The training points are all randomly sampled. The test points used to calculate errors are regularly sampled by rows and columns, 40,000 in all. The reported number of training data refers to the amount in the whole domain; the amount of data used in a subdomain therefore equals this number divided by the number of subdomains in the model problem, or weighted by the subdomain area ratio in the interface problem.
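For concreteness, the optimization setup just described could be realized as follows (a PyTorch sketch; the initial rate `lr0`, decay base `decay_rate`, and interval `decay_steps` are placeholders, since these values are tuned per case).

```python
import torch

def make_optimizer(model, lr0=1e-3, decay_rate=0.9, decay_steps=1000):
    """Adam with a staircase exponential learning-rate decay."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr0)
    # Multiply the learning rate by decay_rate every decay_steps optimizer steps.
    scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=decay_steps, gamma=decay_rate)
    return optimizer, scheduler
```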

5.1 Model problem

The Poisson equation is a simplified form of important equations arising from engineering and physical problems. Effectiveness on the Poisson equation is a necessary prerequisite before applying an algorithm to more complicated equations.

We consider a Poisson equation with Dirichlet boundary conditions:

$$-\Delta u = f \ \text{in } \Omega, \qquad u = g \ \text{on } \partial\Omega. \qquad\qquad (7)$$

To obtain an analytic solution, we take, for example, a smooth closed-form exact solution $u^{\ast}$ and substitute it into (7) to compute $f$ and $g$. We denote the numerical solution by $u_{NN}$ and define the relative error

$$e = \frac{\|u_{NN} - u^{\ast}\|_2}{\|u^{\ast}\|_2},$$

evaluated on the test points.
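Under the assumption that $\|\cdot\|_2$ above is the discrete $\ell_2$ norm over the test points, this error can be computed as follows.

```python
import numpy as np

def relative_l2_error(u_pred, u_exact):
    """Relative l2 error over the grid of test points."""
    u_pred, u_exact = np.asarray(u_pred).ravel(), np.asarray(u_exact).ravel()
    return np.linalg.norm(u_pred - u_exact) / np.linalg.norm(u_exact)
```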

To begin with, we present the results of solving (7) by physics-informed neural networks without domain decomposition. The stopping criterion of the inner training process is that the relative change of the loss is less than $\epsilon_{loss}$, or that the maximum number of allowed iterations, set as 50,000 here, is exceeded. Although there is no theory ensuring that the optimization method converges to the global minimum, our practical experience indicates that good results are obtained if the problem is well posed, the network is large enough, and the amount of data is sufficient.

In Table 1, relative errors between the predicted and the exact solution are presented for different network architectures, while the total numbers of training points inside the domain and on the boundary are kept fixed. As expected, if the number of hidden layers $N_l$ is fixed, more units $N_u$ correspond to smaller relative errors. Moreover, for a fixed number of units $N_u$, the relative errors are also reduced when the number of hidden layers $N_l$ increases. The general trend suggests that the numerical solutions become more accurate as the expressive capacity of the network increases. This observation is in agreement with the results of [26, 31].

$N_l$ \ $N_u$   10       20       30       40       50       100
2               4.1e-3   2.6e-3   1.4e-3   1.5e-3   1.0e-3   1.0e-3
3               1.7e-3   1.2e-3   5.8e-4   3.2e-4   4.3e-4   3.7e-4
4               1.7e-4   1.0e-3   7.1e-4   5.0e-4   5.0e-4   2.1e-4
Table 1: Relative error between the predicted and the exact solution for different numbers of hidden layers $N_l$ and different numbers of units per layer $N_u$. Here, the total numbers of training data inside the domain and on the boundary are kept fixed.
7.3e-3 2.8e-3 3.4e-3 6.3e-3 9.3e-3 2.5e-3
4.5e-3 2.0e-3 1.9e-3 1.4e-3 1.8e-3 1.4e-3
3.9e-3 9.8e-4 7.1e-4 6.2e-4 4.7e-4 1.2e-4
Table 2: Relative error between the predicted and the exact solution for different numbers of training data. The network architecture is kept fixed.

In Table 2, we fix the network architecture and record the change of the relative error for different numbers of training data. As expected, as the amount of training data increases, a more accurate numerical solution is obtained. Furthermore, the output is more sensitive to the number of training points of one type than of the other. This observation is in agreement with the results of [26, 31].

Furthermore, to illustrate the effectiveness of DeepDDM and investigate its properties, we present the results of a two-subdomain case and a multi-subdomain case. In practice, the stopping tolerances are fixed across the cases. The stopping criterion of the inner training process is that the relative change of the loss is less than $\epsilon_{loss}$, or that the maximum number of allowed iterations, set as 10,000 when DeepDDM is used, is exceeded. Let $\delta$ denote the overlap size. For the two-subdomain case, we split $\Omega$ into two overlapping subdomains $\Omega_1$ and $\Omega_2$, and the global numerical solution is assembled from the two subdomain solutions. For the four-subdomain case, we split $\Omega$ into four overlapping subdomains $\Omega_1, \ldots, \Omega_4$, and the global numerical solution is assembled from the four subdomain solutions analogously.

Figure 2: The numerical solutions in the whole domain for the different cases. Here, the network architecture, the total numbers of training data, and the overlap size $\delta$ (for the domain decomposition cases) are kept fixed.

First, for the same setting, i.e., the same network architecture and the same training data, the predicted solutions of the Poisson equation for the three cases are presented in Figure 2. The results obtained from these cases are close to each other.

Figure 3: The error in the whole domain over the outer iterations for two subdomains (left) and four subdomains (right). Here, for both cases, the network architecture, the total numbers of training data, and the overlap size are kept fixed.

In Figure 3, with the network architecture, the numbers of training data, and the overlap $\delta$ fixed, the errors of the solutions in the whole domain are presented at different outer iterations for the two-subdomain and four-subdomain cases. It is clearly visible that the errors of the solutions are relatively large after the first outer iteration. However, the numerical solution quickly approaches the exact solution as the outer iterations proceed. After the third outer iteration, the algorithm reaches the stopping criterion, and the errors of the solutions reach their final order of magnitude. The performance of DeepDDM in the multi-subdomain case is still promising, and the entire convergence process can be clearly observed.

In Table 3, the relative errors and the numbers of outer iterations are shown for different network architectures for the two-subdomain and four-subdomain cases. The accuracy acquired by DeepDDM in Table 3 is almost the same as that of the results in Table 1 across various scales of network architecture. On the other hand, if we focus on the number of outer iterations, the general trend indicates that it remains approximately constant no matter how the network architecture changes. This is consistent with our expectation that the number of outer iterations is independent of the function space, i.e., of the network architecture. Further, all the numbers of outer iterations obtained in the four-subdomain case are larger than those of the two-subdomain case. This observation coincides with conventional DDM: as the number of subdomains increases, the required number of outer iterations increases slightly. The table indicates that DeepDDM not only has the same accuracy as PINN but also can be implemented in parallel. Different training data are employed for the different cases, but the impact on the results is extremely limited.

No. of subdomains   $N_l$ \ $N_u$   10          20          30          40          50          100
                    2               1.9e-3(7)   4.9e-3(6)   2.7e-3(7)   2.6e-3(7)   4.0e-3(6)   2.7e-3(7)
Two                 3               3.9e-3(6)   2.9e-3(7)   1.0e-3(8)   2.4e-3(7)   2.6e-3(7)   2.5e-3(7)
                    4               3.1e-3(7)   2.6e-3(7)   2.5e-3(7)   2.5e-3(7)   4.3e-3(6)   4.1e-3(6)
                    2               5.4e-3(8)   6.1e-3(8)   5.4e-3(8)   5.6e-3(8)   5.9e-3(8)   6.9e-3(8)
Four                3               3.0e-3(9)   5.5e-3(8)   5.8e-3(8)   5.9e-3(8)   4.9e-3(8)   5.7e-3(8)
                    4               5.1e-3(8)   6.1e-3(8)   5.7e-3(8)   6.0e-3(8)   5.7e-3(8)   5.5e-3(8)
Table 3: Relative error and the number of outer iterations for different network architectures under the two-subdomain and four-subdomain cases. The figures in parentheses are the numbers of outer iterations. The total numbers of training data are kept fixed for the two-subdomain case and for the four-subdomain case, and the overlap size is the same for both cases.
Figure 4: The change in relative error with the outer iterations for different network architectures under two subdomains (left) and four subdomains (right). Here, the training data are kept fixed for the two-subdomain case and for the four-subdomain case, and the overlap size is the same for both.

Further results can be obtained by plotting the error convergence process. First, if Fourier analysis is applied to the domain decomposition model of (7), we obtain the analytic domain decomposition convergence factor

$$\rho = e^{-2 k_{\min} \delta}, \qquad\qquad (8)$$

where $k_{\min}$ is the minimum Fourier frequency and $\delta$ is the overlap size [11]; in our case, $k_{\min}$ is determined by the size of the computational domain. The error convergence process and the analytic convergence factor are shown in Figure 4. Only a few cases are selected; however, the other network architectures exhibit similar results. It is clearly observed that the convergence rates of the numerical results are always close to the analytic one, regardless of the network architecture, in both the two-subdomain and four-subdomain cases. This observation coincides with the proposition that the convergence factor depends only on the overlap size and the domain size.
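As a quick illustration of formula (8) as reconstructed above, the snippet below evaluates the factor for a few overlap sizes; the value of `k_min` depends on the computational domain, and $\pi$ is used here purely as a placeholder.

```python
import numpy as np

def schwarz_convergence_factor(k_min, overlap):
    """Analytic factor rho = exp(-2 * k_min * overlap); cf. (8) and [11]."""
    return np.exp(-2.0 * k_min * overlap)

# Larger overlap gives a smaller factor, hence faster outer convergence.
for delta in (0.05, 0.2, 0.8):
    print(delta, schwarz_convergence_factor(np.pi, delta))
```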

A more detailed assessment of the effects of the overlap size is obtained by varying the overlap from 0.05 to 0.8. In particular, we record the relative errors and the numbers of outer iterations for both the two-subdomain and four-subdomain cases. If we fix the overlap and compare the two-subdomain and four-subdomain cases, the latter requires more iterations than the former. As the overlapping domain increases, the number of outer iterations declines monotonically, whether for the two-subdomain or the four-subdomain case. Moreover, the accuracy declines as the overlapping domain diminishes, which occurs on account of the decrease of training data in the overlapping domain and the randomness of the samples. It is noteworthy that these observations coincide with conventional DDM. Additionally, we compare the convergence rates for different overlap sizes under the two-subdomain and four-subdomain cases. Lines of different colors represent cases with different overlap sizes; the dashed and dotted lines represent the two-subdomain and four-subdomain cases, respectively; and the analytic convergence factors are plotted with solid lines. The comparison clearly displays that the numerical convergence rate is closely related to the overlap size rather than to the number of subdomains. From the analytical point of view, as the overlapping domain expands, the convergence factor becomes smaller, and so the algorithm converges faster in the numerical experiments.

The experiments described above demonstrate that DeepDDM effectively inherits the nature of DDM. As in conventional DDM, the necessary number of iterations increases as the number of subdomains increases, and the convergence factor shrinks as the overlapping domain expands. We could perhaps introduce a coarse space correction, as in the conventional setting, to improve the performance of DeepDDM. Moreover, we find that the sampling method affects the computed results to some extent, so an adaptive sampling strategy should be considered. We leave these questions for future work.

5.2 Interface problem with discontinuous coefficients

This example aims to highlight the ability of our approach to handle discontinuous coefficients and a curved interface in the governing PDE. To this end, we consider a steady-state diffusion problem:

$$-\nabla \cdot (\kappa \nabla u) = f \ \text{in } \Omega, \qquad u = g \ \text{on } \partial\Omega, \qquad [u] = 0 \ \text{and} \ \Big[\kappa \frac{\partial u}{\partial n}\Big] = 0 \ \text{on } \Gamma, \qquad\qquad (9)$$

where $\Gamma$ is the curved interface separating the two subdomains $\Omega_1$ and $\Omega_2$, $n$ is the outer normal of the interface, $[\cdot]$ denotes the jump across $\Gamma$, and the scalar coefficient $\kappa$ is the piecewise constant function

$$\kappa(x) = \begin{cases} \kappa_1, & x \in \Omega_1, \\ \kappa_2, & x \in \Omega_2, \end{cases}$$

where $\kappa_1$ and $\kappa_2$ are positive constants. The ratio $r = \kappa_1 / \kappa_2$ symbolizes the coefficient contrast. As $r$ is enlarged, the heterogeneity strengthens, which means that it is more challenging to approximate the exact solution numerically. In the case of strong heterogeneity in the material, corresponding to a large coefficient ratio, domain decomposition is a recommended approach. We choose a smooth closed-form exact solution adapted to the interface, so that the force term and boundary condition in (9) can be easily obtained; it is also easy to verify that the chosen functions exactly satisfy the interface condition in (9). Here, we choose the boundary operator as the identity operator, which means that the Dirichlet boundary condition is employed. Using the physical coupling conditions between the subdomains, the equation can be written (in a weak sense) in the multidomain formulation

$$-\nabla \cdot (\kappa_1 \nabla u_1) = f \ \text{in } \Omega_1, \qquad -\nabla \cdot (\kappa_2 \nabla u_2) = f \ \text{in } \Omega_2, \qquad u_1 = u_2 \ \text{and} \ \kappa_1 \frac{\partial u_1}{\partial n} = \kappa_2 \frac{\partial u_2}{\partial n} \ \text{on } \Gamma,$$

together with $u_1 = g$ on $\partial\Omega_1 \setminus \Gamma$ and $u_2 = g$ on $\partial\Omega_2 \setminus \Gamma$. These equations imply that, when applying Algorithm 2, we use the identity operator (Dirichlet interface condition) in $\Omega_1$ and the normal flux operator (Neumann interface condition) in $\Omega_2$.
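To illustrate how such Dirichlet and Neumann interface terms could enter the subdomain losses, here is a hedged PyTorch sketch; `model1`, `model2`, `kappa1`, `kappa2`, the interface points `x_gamma`, and the unit normals `normals` are hypothetical placeholders, and the normal flux is obtained by automatic differentiation. The neighbor's quantities are detached to mimic treating the previous-iteration trace as fixed interface data, as in Algorithm 2.

```python
import torch

def normal_flux(model, x_gamma, normals, kappa):
    """kappa * du/dn of the network at the interface points, via autodiff."""
    x = x_gamma.clone().requires_grad_(True)
    u = model(x)
    grad_u = torch.autograd.grad(u.sum(), x, create_graph=True)[0]
    return kappa * (grad_u * normals).sum(dim=1, keepdim=True)

def interface_terms(model1, model2, x_gamma, normals, kappa1, kappa2):
    """Dirichlet term for subdomain 1 and Neumann (flux) term for subdomain 2."""
    # Subdomain 1 matches the neighbor's trace on Gamma (Dirichlet condition).
    dirichlet = torch.mean((model1(x_gamma) - model2(x_gamma).detach()) ** 2)
    # Subdomain 2 matches the neighbor's normal flux on Gamma (Neumann condition).
    flux_1 = normal_flux(model1, x_gamma, normals, kappa1).detach()
    flux_2 = normal_flux(model2, x_gamma, normals, kappa2)
    neumann = torch.mean((flux_2 - flux_1) ** 2)
    return dirichlet, neumann
```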

To assess the strength of our approach, focusing on the curved interface and the discontinuous coefficient, we present a set of numerical results under various conditions. The results demonstrate the effectiveness and the rapid convergence of DeepDDM in various situations. Moreover, some properties of conventional DDM are also retained in DeepDDM for interface problems; e.g., the number of iterations is independent of the function space and is related to the ratio of the coefficients (here, it decreases as the ratio grows within a proper range). In practice, the tolerances of the stopping criteria for DeepDDM are set to fixed small values. Additionally, the stopping criterion of the training process in each subproblem is that the relative change of the loss is less than $\epsilon_{loss}$, or that the maximum number of allowed iterations, 10,000, is reached.

Figure 5: The error in the whole domain over the outer iterations for the two coefficient ratios (left and right) in the interface problem. Here, the network architecture and the training data are kept fixed.

In the following example, our setup aims to highlight the robustness and effectiveness of the proposed algorithm, even for a small network architecture. The changes in the error between the numerical and exact solutions in the whole domain for the two coefficient ratios are given in Figure 5. Our training dataset consists of interior, boundary and interface points, all randomly sampled.

We represent the latent function using a 3-layer network with 20 units per layer. As shown in Figure 5, the numerical solution quickly approaches the exact solution as the subdomain information is interchanged on the interface. Measured on the test data in the $L_2$ norm, the relative errors under the two coefficient ratios both reach satisfactory levels. It is noted that the elementary Dirichlet-Neumann interface condition is used for these strongly discontinuous coefficient problems, implying an overlap equal to 0.

The relative errors and numbers of outer iterations for the two coefficient ratios are summarized in Table 4. Here, the total number of training data is kept fixed. The key observation is that as the numbers of hidden layers and units per layer increase, i.e., as the capacity of the network to approximate more complex functions is enlarged, the numerical accuracy improves. The numbers of iterations, in turn, are independent of the network architecture, which is similar to the observation from the previous example. It is remarkable that the errors of the solution for a specific network architecture under the larger coefficient ratio are almost equal to the errors of the corresponding situations under the smaller ratio. Additionally, we also observe that the number of iterations needed for the larger ratio is relatively smaller than that for the smaller ratio. In other words, when $r$ becomes larger within a proper range, the number of outer iterations needed is attenuated. A similar observation has been made in [10].

Cases     $N_l$ \ $N_u$   10          20          30          40          50          100
          2               4.4e-3(4)   3.3e-3(4)   1.8e-3(5)   6.1e-4(4)   2.6e-3(4)   4.4e-3(3)
Ratio 1   3               6.5e-3(4)   2.6e-3(4)   7.7e-3(3)   1.7e-3(5)   4.4e-3(4)   2.0e-3(5)
          4               2.6e-3(5)   1.9e-3(4)   2.1e-3(4)   1.4e-3(5)   4.0e-3(3)   3.5e-3(3)
          2               1.2e-2(4)   1.5e-2(2)   5.9e-3(3)   5.5e-3(3)   3.3e-3(3)   2.2e-3(3)
Ratio 2   3               1.4e-3(3)   2.4e-3(3)   5.1e-3(3)   3.8e-3(3)   5.0e-3(2)   5.3e-3(3)
          4               7.1e-3(3)   3.9e-3(3)   5.4e-3(3)   5.3e-3(2)   1.4e-3(2)   2.7e-3(2)
Table 4: Relative errors and the number of outer iterations for different network architectures under the two coefficient ratios. The figures in parentheses indicate the numbers of outer iterations. Here, the total number of training data is kept fixed.
Figure 6: The change in relative error with the outer iterations for different network architectures and different coefficient ratios. Here, the training data are kept fixed.

Regarding the convergence rate of DeepDDM, Figure 6 illustrates the change in relative error with the outer iterations for various network architectures and the two coefficient ratios. We select some of the network architectures from Table 4 and plot the error as the iterations increase. According to the yellow line, we can conjecture the approximate value of the numerical convergence factor. The theoretical analysis of this interface problem is still an open problem.

6 Conclusions

We have introduced DeepDDM, a novel framework bridging deep learning and domain decomposition, to approximate solutions of PDE from given equation information. The presented approach showcases a series of promising results for a model problem and an interface problem. These numerical results demonstrate that the convergence rate of DeepDDM applied to the Poisson equation is close to the analytic value of DDM. DeepDDM maintains some properties of conventional DDM: for instance, the number of outer iterations depends on the overlap size and the number of subdomains rather than on the function space. Furthermore, DeepDDM can easily handle PDE with curved interfaces and heterogeneous coefficients.

This work is an early step in combining deep learning and domain decomposition, and it presents experiments intended to provide insights for theoretical study. It does not address a number of fundamental questions behind the presented approach, for example: What is the convergence rate of deep learning for solving PDE? What is the standard for designing the best network architecture? How should the training points be sampled for the best result? Some of these questions are open problems in this field; nevertheless, it is still meaningful to explore the properties of DeepDDM. Future work should consider applying DeepDDM with coarse space corrections and higher-order interface conditions.

Acknowledgements

This research was funded by National Natural Science Foundation of China with grant 11831016 and by Beijing Nova Program of Science and Technology under Grant Z191100001119129.

References

  • [1] J. Berg and K. Nyström (2018) A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, pp. 28–41. Cited by: §1.
  • [2] L. Bottou (1998) Online algorithms and stochastic approximations. In Online Learning and Neural Networks, D. Saad (Ed.), Note: revised, oct 2012 External Links: Link Cited by: §3.
  • [3] T. F. Chan and T. P. Mathew (1994) Domain decomposition algorithms. Acta numerica 3, pp. 61–143. Cited by: §2.
  • [4] G. Cybenko (1989) Approximation by superpositions of a sigmoidal function. Mathematics of Control, Signals and Systems 2 (4), pp. 303–314. Cited by: §3.
  • [5] V. Dolean, P. Jolivet, and F. Nataf (2015) An introduction to domain decomposition methods: algorithms, theory, and parallel implementation. Vol. 144, SIAM. Cited by: §2.
  • [6] M. Dryja and O. B. Widlund (1990) Some domain decomposition algorithms for elliptic problems. In Iterative methods for large linear systems, pp. 273–291. Cited by: §2.
  • [7] V. Dwivedi, N. Parashar, and B. Srinivasan (2019) Distributed physics informed neural network for data-efficient solution to partial differential equations. arXiv preprint arXiv:1907.08967. Cited by: §1.
  • [8] W. E, J. Han, and A. Jentzen (2017) Deep learning-based numerical methods for high-dimensional parabolic partial differential equations and backward stochastic differential equations. Communications in Mathematics and Statistics 5 (4), pp. 349–380. Cited by: §1.
  • [9] C. Farhat, M. Lesoinne, P. LeTallec, K. Pierson, and D. Rixen (2001) FETI-DP: a dual-primal unified FETI method, Part I: a faster alternative to the two-level FETI method. International Journal for Numerical Methods in Engineering 50 (7), pp. 1523–1544. Cited by: §2.
  • [10] M. J. Gander and O. Dubois (2015) Optimized Schwarz methods for a diffusion problem with discontinuous coefficient. Numerical Algorithms 69 (1), pp. 109–144. Cited by: §5.2.
  • [11] M. J. Gander (2006) Optimized Schwarz methods. SIAM Journal on Numerical Analysis 44 (2), pp. 699–731. Cited by: §5.1.
  • [12] K. Hornik, M. Stinchcombe, and H. White (1989) Multilayer feedforward networks are universal approximators. Neural networks 2 (5), pp. 359–366. Cited by: §3.
  • [13] K. Hornik (1991) Approximation capabilities of multilayer feedforward networks. Neural networks 4 (2), pp. 251–257. Cited by: §3.
  • [14] D. P. Kingma and J. Ba (2014) Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980. Cited by: §5.
  • [15] G. Kissas, Y. Yang, E. Hwuang, W. R. Witschey, J. A. Detre, and P. Perdikaris (2020) Machine learning in cardiovascular flows modeling: predicting arterial blood pressure from non-invasive 4d flow mri data using physics-informed neural networks. Computer Methods in Applied Mechanics and Engineering 358, pp. 112623. Cited by: §1.
  • [16] K. C. Kiwiel (2001) Convergence and efficiency of subgradient methods for quasiconvex minimization. Mathematical programming 90 (1), pp. 1–25. Cited by: §3.
  • [17] A. Klawonn, M. Lanser, and O. Rheinbach (2014) Nonlinear FETI-DP and BDDC methods. SIAM Journal on Scientific Computing 36 (2), pp. A737–A765. Cited by: §2.
  • [18] M. H. Lanser (2015) Nonlinear FETI-DP and BDDC methods. Ph.D. Thesis, Universität zu Köln. Cited by: §2.
  • [19] Y. A. LeCun, L. Bottou, G. B. Orr, and K. Müller (2012) Efficient backprop. In Neural networks: Tricks of the trade, pp. 9–48. Cited by: §3.
  • [20] K. Li, K. Tang, T. Wu, and Q. Liao (2019) D3M: a deep domain decomposition method for partial differential equations. IEEE Access. Cited by: §1.
  • [21] P. Lions (1988) On the Schwarz alternating method. I. In First International Symposium on Domain Decomposition Methods for Partial Differential Equations, Vol. 1, pp. 42. Cited by: §2.
  • [22] Z. Long, Y. Lu, and B. Dong (2019) PDE-net 2.0: learning pdes from data with a numeric-symbolic hybrid deep network. Journal of Computational Physics 399, pp. 108925. Cited by: §1.
  • [23] N. Mai-Duy and T. Tran-Cong (2002) Mesh-free radial basis function network methods with domain decomposition for approximation of functions and numerical solution of Poisson's equations. Engineering Analysis with Boundary Elements 26 (2), pp. 133–156. Cited by: §1.
  • [24] T. Mathew (2008) Domain decomposition methods for the numerical solution of partial differential equations. Vol. 61, Springer Science & Business Media. Cited by: §2.
  • [25] C. Pechstein and C. R. Dohrmann (2017) A unified framework for adaptive bddc. Electron. Trans. Numer. Anal 46, pp. 273–336. Cited by: §2.
  • [26] M. Raissi, P. Perdikaris, and G. Karniadakis (2019) Physics-informed neural networks: a deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics 378, pp. 686–707. Cited by: §1, §1, §3, §5.1, §5.1.
  • [27] K. Rudd, G. Di Muro, and S. Ferrari (2013) A constrained backpropagation approach for the adaptive solution of partial differential equations. IEEE transactions on neural networks and learning systems 25 (3), pp. 571–584. Cited by: §1.
  • [28] H. A. Schwarz (1870) Ueber einen grenzübergang durch alternirendes verfahren. Zürcher u. Furrer. Cited by: §2.
  • [29] J. Sirignano and K. Spiliopoulos (2018) DGM: a deep learning algorithm for solving partial differential equations. Journal of Computational Physics 375, pp. 1339–1364. Cited by: §1.
  • [30] B. Smith, P. Bjorstad, and W. Gropp (2004) Domain decomposition: parallel multilevel methods for elliptic partial differential equations. Cambridge university press. Cited by: §2.
  • [31] Y. Yang and P. Perdikaris (2019) Adversarial uncertainty quantification in physics-informed neural networks. Journal of Computational Physics 394, pp. 136–152. Cited by: §5.1, §5.1.