Log In Sign Up

Deep unfitted Nitsche method for elliptic interface problems

In this paper, we propose a deep unfitted Nitsche method for computing elliptic interface problems with high contrasts in high dimensions. To capture discontinuities of the solution caused by interfaces, we reformulate the problem as an energy minimization involving two weakly coupled components. This enables us to train two deep neural networks to represent two components of the solution in high-dimensional. The curse of dimensionality is alleviated by using the Monte-Carlo method to discretize the unfitted Nitsche energy function. We present several numerical examples to show the efficiency and accuracy of the proposed method.


page 1

page 2

page 3

page 4


A Mesh-free Method Using Piecewise Deep Neural Network for Elliptic Interface Problems

In this paper, we propose a novel mesh-free numerical method for solving...

Deep neural network surrogates for non-smooth quantities of interest in shape uncertainty quantification

We consider the point evaluation of the solution to interface problems w...

Analytic solutions and numerical method for a coupled thermo-neutronic problem

We consider in this contribution a simplified idealized one-dimensional ...

Computer-assisted analysis of the sign-change structure for elliptic problems

In this paper, a method is proposed for rigorously analyzing the sign-ch...

A Shallow Ritz Method for elliptic problems with Singular Sources

In this paper, a shallow Ritz-type neural network for solving elliptic p...

A Discontinuity Capturing Shallow Neural Network for Elliptic Interface Problems

In this paper, a new Discontinuity Capturing Shallow Neural Network (DCS...

CDNNs: The coupled deep neural networks for coupling of the Stokes and Darcy-Forchheimer problems

In this article, we present an efficient deep learning method called cou...

1 Introduction

In this paper, we continue our previous studies on elliptic interface problems [guo2017gradient, guo2018gradient, GuYa2018], arising in many applications such as fluid dynamics and materials science, where the background consists of rather different materials on the subdomains separated by smooth curves called interfaces. We aim to address the high-dimensional challenge, which is well known as the curse of dimensionality leading to unaffordable computational time in traditional numerical methods (e.g., finite difference and finite element methods).

Deep neural networks have been shown as a powerful tool to overcome the curse of dimensionality [dudley1969speed, fournier2015rate, weed2019sharp, dos2018simulation]

, and have been applied to solve partial differential equations (PDEs),

e.g., the deep BSDE method [HJE2018, EHJ2017], the deep Galerkin method (DGM)[SiSp2018], the physics-informed neural networks (PINNs) [raissi2019physics], the deep Ritz method (DRM) [EYu2018], and the weak adversarial networks (WAN) [ZBYZ2020]. The deep BSDE reformulates the time-dependent equations into stochastic optimization problems. DGM and PINNs train neural networks by minimizing the mean squared error loss of the equation residual, while DRM trains networks by minimizing the energy functional of the variational problem equivalent to the PDEs. WAN uses the weak formulation and trains the primary and adversarial network alternatively using the min-max weak formulation. Moreover, the convergence of DRM was studied by [lu2021priori, duan2021convergence], and the deep Nitsche method was proposed in [LiMi2021], which enhanced the deep Ritz method with natural treatment of essential boundary conditions. In a recent work[ShYa2021], Sheng and Yang trained an additional neural network to impose the Dirichlet boundary conditions. However, these neural networks-based methods, in general, require the smoothness of the solutions to the PDEs, and thus can not be directly used to solve the elliptic interface problems, where the solutions are only piecewise smooth.

In literature, there are some recent works of solving elliptic interface problems using neural networks. For example, [WaZh2020] proposed a network architecture similar to the deep Ritz method [EYu2018], and solved the equivalent variational problem with the boundary conditions approximated by a shallow neural network. [he2020mesh]

used different neural networks to approximate the solutions in disjoint subdomains. They reformulated the interface problem as a least-squares problem and solved it by stochastic gradient descent.

[hu2021discontinuity] proposed the discontinuity capturing shallow neural network (DCSNN) to approximate piecewise continuous functions and solved elliptic interface problems by minimizing the mean squared error loss consistent with the residual of the equation, boundary and interface jump conditions.

In this paper, we propose a deep learning method for interface problems based on the minimization of the unfitted Nitsche’s energy, inspired by our previous studies

[GuYa2018, GYZ2021, GuYZ2021] on the unfitted Nitsche’s method. One of the most significant differences between the unfitted Nitsche’s method[BCHLM2015, HaHa2002, GYZ2021, GuYZ2021] and other methods for interface problems (e.g., the immersed finite element method [LiIt2006]) is that the unfitted Nitsche’s finite element functions can be discontinuous inside elements. This is possible by adopting two different sets of basis functions on the interface elements (i.e., the elements cut by the interface) which are weakly coupled using Nitsche’s methods. Based on the unfitted Nitsche’s formulation, we can define a so-call unfitted Nitsche energy functional (see equation (2.11) ). It turns out that the weak formulation of unfitted Nitsche’s method is just the Euler-Lagrange equation of unfitted Nitsche energy functional. To address the challenges of the curse of dimensionality, we naturally use deep neural networks to present functions in high dimensions. Following the idea of classical unfitted Nitsche’s method [BCHLM2015, HaHa2002, GYZ2021, GuYZ2021], we use two deep neural networks: one for the part inside the interface and the other one for the part outside the interface. These two parts are weakly connected using Nitsche’s method. The deep unfitted Nitsche method trains the two neural network functions independently using the same unfitted Nitsche energy functional.

The rest of the paper is organized as follows. In Section 2, we introduce the model setup of the elliptic interface problems and its unfitted Nitsche’s weak form; In Section 3, we describe the formulation of deep unfitted Nitsche method with details; We present numerical examples in Section 4, and make conclusive remarks in Section 5.

2 Model equation

Denote by a bounded Lipschitz domain in , and assume there is a smooth closed curve separating into two parts: and . In general, can be described as a zero level set of some level set function . Then we have and .

In this paper, we shall consider the following elliptic interface problem equationparentequation


where with

being the unit outward normal vector of

and the jump on is defined as


with being the restriction of on . The diffusion coefficient is a piecewise constant, i.e.,


which has a finite jump of function value at the interface . An illustration of the domain is given in Figure 1.

Figure 1: Illustrative example of a domain with a circular interface in two dimensional.

To prepare the presentation of Nitsche’s weak formulation, we introduce two weights


which satisfies that . Then, we define the weighted averaging of a function on the interface as


and also its dual weighted averaging as


Let be the function space consisting of piecewise Sobolev functions such that and , whose norm is defined as


where is the -norm of a function in . Similar notations are used for piecewise space and its corresponding norm. In addition, let be the subspace of such that . In particular, is the subspace of with homogeneous Dirichlet boundary conditions.

The unfitted Nitsche’s weak formulation [HaHa2002, BCHLM2015, GuYa2018] of the interface problem (2.1a)-(2.1d) is to find such that


where the bilinear form is defined as


and the linear functional is defined as


with the stability parameter .

To adopt the deep neural network, we reformulate the variational problem (2.8) as an energy minimization problem. For such a purpose, we define the unfitted Nitsche’s energy functional as


Then, we can show the variational problem is equivalent to the following energy minimization problem:


In other words, equation (2.1a)-(2.1d) is the Euler-Lagrangian equation of (2.12).


To simplify the presentation of this paper, we only consider inhomogeneous Dirichlet boundary conditions. For Neumann and Robin boundary conditions, they can be absorbed into the variation formulation. Other types of boundary conditions like mixed Dirichlet and Neumann boundary conditions can be handled similarly.


We choose to impose the Dirichlet boundary condition strongly by building it into the solution space. Alternatively, we can impose the Dirichlet boundary condition weakly as in [LiMi2021].

3 Deep unfitted Nitsche method

In this section, we use the universal approximation property of deep neural network and flexibility of unfitted Nitsche weak form to develop the deep unfitted Nitsche method for solving elliptic interface problem (2.1a)-(2.1d). In the first subsection, we briefly introduce the deep neural networks used in this paper. In the second subsection, we describe the details of the deep unfitted Nitsche formulation.

3.1 Deep neural network

One of the key ingredients for using deep learning to solve partial differential equations is to select a deep neural network as the ansatz function for trial functions. The commonly used deep neural networks to approximate the solutions to PDEs include feedforward neural network[LMMK, GoBC2016] and residual neural network (ResNet)[EYu2018, HZRS2016]. Similar to the deep Ritz method in [EYu2018], we take the ansatz function to be the residual neural network.

For each integer and some positive integer , let be the matrix of weights and be vector of biases for . Then, the th block of ResNet with neurons can be written as



is the activation function

[GoBC2016, HiHi2019]. To avoid the problem of vanish gradient, smooth functions like sigmoid and hyperbolic tangent can be adopted.

Similarly, let (or ) be the matrix of weights in the first (or last) layer and (or be the vector of biases in the first (or last) layer. The deep neural network with block can viewed as the composition of ’s


where denote the sets of parameters, i.e.,

In Figure 2, we plot the architecture of the ResNet with 3 block.

Figure 2: Diagram of ResNet with 4 block where FC layer denotes fully connected layer.

3.2 Deep unfitted Nitsche formulation

The main advantage of the unfitted Nitsche’s method lies in the ability to use meshes independent of the location of the interface. This is possible by employing two different ansatz functions on the interface elements (elements cut through by the interface): one is for the interior domain and the other one is for the exterior domain. Those two different ansatz functions are discontinuous across the interface and patched together by Nitsche’s method. In the traditional unfitted Nitsche’s methods, the ansatz functions are piecewise polynomials. Following this line, we adopt two different deep neural network functions as the two ansatz functions to minimize the unfitted Nitsche energy (2.11).

Let be the ansatz function in () and denote . where . Before we proceed, we should make sure satisfies the Dirichlet boundary condition 2.1b. For such purpose, we introduce an additional boundary penalty term as


Ideally, we expect close to zero. We define the extended unfitted Nitsche’s functional as


where is the boundary penalty parameter.

Then, our deep unfitted Nitsche method is equivalent the following optimization problem


In the literature, there are several alternative methods to impose the Dirichlet boundary conditions by building the Dirichlet boundary condition into the loss function

[LiMi2021] and training another deep neural network [ShYa2021]. For the sake of simplicity, we choose to impose the Dirichlet boundary condition for deep neural network functions using the penalty method as in [EYu2018]. Interesting readers are referred to [ChDW2020] for a comparison study of different boundary conditions handling methods.

To solve the optimization problem using stochastic gradient descent type algorithms (e.g., SGD[GoBC2016] or ADMM[KiBa2015adam]), we approximate the integrals by using the Monte-Carlo method, where the number of integral point is independent of the dimensional of the underlying domain. Suppose are the uniformly sampled point in the domain for . Similarly, let and be the randomly sampled point on the interface and the domain boundary , respectively. The loss function is defined as


where is the measure of in () and (or ) is the measure of (or ) in . Then, the discrete counterpart of the optimization problem (3.5) reads as


The discrete optimization problem (3.7) actually involves the training of two deep neural network function and . Those two neural networks can be trained independently using the same loss function . The gradient of deep neural network function can be efficiently calculated using automatic differentiation functionality [pytorchad2019]

in the modern machine learning platform.

In the loss function(3.6), we need to compute the measure of each

. For problems with simple geometric shapes, we can compute them analytically. In general, we can Monte-Carlo simulation like the hit-or-miss method to estimate the measure. Similarly, we can estimate the measure of

and .

4 Numerical Experiments

In this section, we present several numerical examples to illustrate the performance of the proposed deep unfitted Nitsche method. The proposed algorithm is implemented using the machine learning platform Pytorch

[NEURIPS2019_9015]. In the following numerical experiments, we choose and have the same neural network architectures and select the activation function . Both deep neural networks are initialized by Xavier initialization to prevent from exploding or vanishing and are trained independently using ADAM[KiBa2015adam]. In all the following numerical experiments, we choose the learning rate

and generate new mini-batches every 10 epochs. To provide a qualitative description of training results, we use the following relative



The relative -error is computed using Monte-Carlo methods with uniform sampled points. The relative error are computed and recorded in every 100 epochs. Similarly, we record the loss in every 100 epochs.

4.1 Flower shape interface problem in 2D

In our first example, we consider the flower interface problem in the domain [GuYa2018, guo2017gradient]. The flower shape interface is given by the following polar coordinate


The piecewise diffusion coefficient are and . We choose the right hand side function to fit the exact solution

The nonhomogeneous jump conditions (2.1d) and (2.1c) can be calculated from the above exact solution.

Figure 3: Comparison of solutions for flower interface problem: (a) Deep unfitted Nitsche’s solution; (b) Exact solution.
Figure 4: Training process of flower interface problem: (a) Decay of the relative -error; (b) Decay of the loss.

In this case, one can compute and

. To general uniformly distributed random points in

and , we firstly generate uniformly distributed random points in the whole domain . Then, we count the random points inside the interface as the randomly sampled points in and the rest random points are in . The randomly sampled points is produced by generating the uniformed sampled points on the interval and then are mapped onto the interface using (4.2).

In this numerical test, we choose the ResNet with 3 blocks with for each ansatz function, as illustrated in the diagram 2. Each ansatz function has 701 parameters. We uniformly sample 1024 points in the domain . It turns out that there are points inside the interface . In other words, and . In addition, we just choose and . We choose and . The corresponding decay of errors and loss functions during the training process is plotted in Figure 4. We can see the error decays very fast at the initial several thousand epochs and then fluctuate. After 50000 epochs, the relative -error is reduced to . In Figure 3, we present a comparison between the deep unfitted Nitsche method(DUNM) solution and the exact solution. We can see the DUNM solution match well with the exact solution even though there is an inhomogeneous jump in the function values.

Figure 5: Training process of flower interface problem: (a) Decay of the relative -error; (b) Decay of the loss.
Figure 6: Training process of flower interface problem: (a) Decay of the relative -error; (b) Decay of the loss.
Figure 7: Training process of flower interface problem: (a) Decay of the relative -error; (b) Decay of the loss.
Figure 8: Training process of flower interface problem: (a) Decay of the relative -error; (b) Decay of the loss.
Figure 9: Comparison of solutions for circular interface problem with and : (a) Deep unfitted Nitsche’s solution; (b) Exact solution.
Figure 10: Comparison of solutions for circular interface problem with and : (a) Deep unfitted Nitsche’s solution; (b) Exact solution.

4.2 High-contrast interface problem in 2D

In this example, we consider the high contrast interface problem with homogeneous jump conditions in the domain [GuYa2018, guo2018gradient, guo2017gradient]. The interface is the circle of radius centered at the original. The exact solution is

where .

In this example, we consider the radius circle with . We also use the ResNet with 3 blocks with output in each block to represent each function. The uniformly random sampled points are generated by the same method as in previous example. Again, we generate random sampled points in . In that case, points are inside and points are outside . Similarly, and . The penalty parameters and are the exactly the same as previous case.

Error() 2.29 0.62 2.36 0.63
Table 1: Relative errors for high contrast interface problems

In the following numerical tests, we focus on the training of high contrast interface problem by considering the following four typical different jump ratios: (large jump), (large jump), (huge jump), and (huge jump). The training processes are described in Figures 5-7 for each cases, respectively. In these figures, we notice that the errors quickly decay to some specific errors and then oscillate around them. The relative errors after 200000 epochs are summarized in the Table 1, where one can observe that the errors are not influenced by the jump ratios.

In the Figures 9 and 10, we plot the DUNM solution and the exact solution for the case and , respectively. It is clear to see that the DUNE solutions match well with the corresponding exact solutions. We have also compared two other cases and observed the same phenomena.

4.3 High-dimensional interface problems

In this example, we consider the -dimensional interface problem in the unit cube . The unit cube is divided into two parts and by the d-dimensional sphere with radius . The diffusion coefficients are and . The exact solution is

where . Therefore, we have homogeneous jump conditions. The right hand side function and boundary condition can be obtained from .

Figure 11: Decay of training error of high dimensional interface problem : (a) d; (b) ; (c) ; (d) .
Error() 3.90 2.14 4.43 7.04
Table 2: Relative errors for high dimensional interface problems

In this case, the measure of and the measure of . The measure of the interface is . We approximate each component of the solution by ResNet with 3 blocks of 20 neurons. In total, there are 2621 parameters for each deep neural network. To generate uniformly random points on dimensional spheres, we use the sample_hypersphere function in BoTorch [BKJD2020]. To guarantee there are enough random sampled point in . We first generate (about ) uniformly random point inside the -dimensional ball with using the dropped coordinates method [HaLa2010]. In specific, we firstly generate points on sphere with radius using the sample_hypersphere function in BoTorch [BKJD2020] and drop the last two coordinates to get the uniform random sampled points. Then, we generate random points in which may also contain points in . So we have and . We take and . In the following test, we take and .

We consider four different high dimensional cases: . Figure 11 displays the decay of the relative error during the training process. Similar to the previous two examples, we can see that the errors decay quickly at the first few epochs and then fluctuate around some specific levels. In Table 2, we reports the relative error after epochs. We can see that we get almost the same level of relative errors independent of the dimensionality of the space even though we use the same number of points.

5 Conclusion

As a continuous study of our previous works on elliptic interface problems [guo2017gradient, guo2018gradient, GuYa2018], we propose a deep unfitted Nitsche method to address the high-dimensional challenge with high-contrasts. The challenge is also known as the curse of dimensionality. The classical numerical methods like finite difference and finite methods require unaffordable computational time. The proposed method deploys the deep neural network to solve the equivalent high-dimensional optimization problem. To address the high contrasts of the solution, we introduced a so-called unfitted Nitsche energy functional, which utilize different deep neural networks to present different components of the solution in the high dimensional case. Different deep networks are patched together weakly by Nitsche’s method and can be trained independently using the unfitted Nitsche functional. The unfitted Nitsche’s energy function is approximated by the Monte-Carlo methods. An additional penalty term is added to the discrete energy functional to handle the Dirichlet boundary conditions. The proposed method is easy to be implemented and mesh-free, which is illustrated by several numerical examples including high contrasts and high dimensional cases.


H.G. is grateful to Prof. Zuoqiang Shi from Tsinghua University and Dr Guozhi Dong from the Humboldt University of Berlin for stimulating and helpful discussions. H.G. was partially supported by Andrew Sisson Fund of the University of Melbourne, X.Y. was partially supported by the NSF grant DMS-1818592.