1 Introduction
Deep neural networks (DNNs) have found many applications beyond their traditional domains, such as image classification and speech recognition, in the arena of scientific computing [9, 10, 12, 13, 14, 16, 20, 11, 21, 22]. However, to apply commonly-used DNNs to these computational science and engineering problems, we are faced with several challenges. The most prominent issue is that a DNN normally only handles data with low frequency content well, as shown by a Frequency Principle (F-Principle): many DNNs learn the low frequency content of the data quickly with good generalization error, but they become inadequate when high frequency data are involved [24, 19, 23]. The fast convergence behavior at low frequencies has recently been studied rigorously in theory in [18, 26, 2, 6]. In comparison, this behavior of DNNs is just the opposite of that of the popular multigrid methods (MGM) for solving PDEs numerically, such as the Poisson-Boltzmann (PB) equation, where convergence is achieved first in the high frequency spectrum of the solution due to the smoothing operations in the MGM. Due to the potential of DNNs in handling higher dimensional solutions and in approximating functions without the need of a structured mesh as in traditional finite element or finite difference methods, it is of great value to extend the capability of DNNs as meshless PDE solvers. Therefore, it is imperative to improve the convergence of DNNs for fine structures in the solution, as encountered in the electrostatic potentials of complex molecules.
The electrostatic interaction of biomolecules with ionic solvents, governed by the Poisson-Boltzmann (PB) equation within the Debye-Hückel theory [5], plays an important role in many applications, including drug design and the study of disease. However, due to the complex surface structure of biomolecules, usually represented by a bead model, it has been a long-standing challenge to design efficient numerical methods to handle the singular molecular surface, which is either the van der Waals (vdW) surface formed by the union of overlapping vdW spheres or the solvent accessible surface (SAS) generated by rolling a small ball on the vdW surface [17], together with the complex distribution of the electrostatic potential over the molecular surfaces. Traditional finite element [1] and finite difference methods [25] have faced difficulties in the costly mesh generation and the expensive solution of the discretized linear system. Therefore, in this paper, we will propose and investigate multiscale DNNs, termed MscaleDNN, with the goal of approximating both low and high frequency content of a function uniformly and of developing a meshless solver for PDEs such as the PB equations in domains with complex and singular geometries.
The main idea of the MscaleDNN is to find a way to convert the learning or approximation of high frequency data into that of low frequency data. A similar idea was attempted in previous work in the development of a phase shift DNN (PhaseDNN) [3], where the high frequency component of the data was given a phase shift downward to a low frequency spectrum. The learning of the shifted data can be achieved quickly with a small-sized DNN, whose output was then shifted upward to give an approximation to the original high frequency data. The PhaseDNN has been shown to be very effective in handling highly oscillatory data from solutions of high frequency Helmholtz equations and functions in low dimensions. However, due to the number of phase shifts employed along each coordinate direction independently, the PhaseDNN results in many small DNNs and a considerable computational cost even for three dimensional problems. In this paper, we will consider a different approach to achieve the conversion of high frequencies to lower ones: with a radial partition of the Fourier space, a scaling-down operation is used to convert a higher frequency spectrum to a low frequency one before the learning is carried out with a small-sized DNN. As the scaling operation only needs to be done along the radial direction in the Fourier space, this approach is easy to implement and gives an overall small number of DNNs, thus reducing the computational cost. In addition, borrowing the multiresolution concept of wavelet approximation theory, which uses compactly supported mother scaling and wavelet functions, we will modify the traditional global activation functions to ones with compact support. Compactly supported activation functions with sufficient smoothness give a localization in the frequency domain, where the scaling operation effectively produces DNNs that approximate different frequency contents of a PDE solution.
Two types of MscaleDNN architectures are proposed, investigated, and compared for their performance. Through various experiments, we demonstrate that MscaleDNNs solve elliptic PDEs much faster and achieve a much smaller generalization error, compared with normal fully connected networks of similar size. We apply MscaleDNNs to solve variable coefficient elliptic equations, including those with solutions containing a broad range of frequencies and over different types of domains, such as a ring-shaped domain and a cubic domain with multiple holes. Also, to test the potential of the MscaleDNN for finding the Poisson-Boltzmann electrostatic solvation energy in biomolecules, we apply the MscaleDNN to solve elliptic equations with geometric singularities, such as cusps and self-intersecting surfaces in a molecular surface. These extensive experiments clearly demonstrate that the MscaleDNN is an efficient and easy-to-implement meshless PDE solver in complex domains.
The rest of the paper is organized as follows. In Section 2, we will introduce frequency scaling to generate a MscaleDNN representation. Section 3 will present MscaleDNN structures with compactly supported activation functions. Section 4 will present a minimization approach through the Ritz energy for finding the solution of elliptic PDEs and a minimization approach through a least squared error for fitting functions. In Section 5, we use two test problems to show the effectiveness of the proposed MscaleDNN over a normal fully connected DNN of the same size. Next, numerical results for the solution of elliptic PDEs over complex domains by the proposed MscaleDNN will be given in Section 6. Finally, Section 7 gives a conclusion and some discussion of further work.
2 Frequency scaled DNN and compact activation function
In this section, we will first present a naive idea of how to use a frequency scaling in the Fourier wave number space to reduce a high frequency learning problem for a function to a low frequency learning one for the DNN, and we will also point out the difficulties it may encounter as a practical algorithm.
Consider a band-limited function $f(\boldsymbol{x})$, $\boldsymbol{x} \in \mathbb{R}^d$, whose Fourier transform $\hat{f}(\boldsymbol{k})$ has compact support, i.e.,
$$\operatorname{supp} \hat{f}(\boldsymbol{k}) \subset B(K_{\max}) = \{\boldsymbol{k} \in \mathbb{R}^d : |\boldsymbol{k}| \le K_{\max}\}. \qquad (1)$$
We will first partition the domain $B(K_{\max})$ as a union of $M$ concentric annuli with uniform or nonuniform width, e.g.,
$$A_i = \{\boldsymbol{k} : (i-1)K_0 \le |\boldsymbol{k}| \le iK_0\}, \quad K_0 = K_{\max}/M, \quad 1 \le i \le M, \qquad (2)$$
so that
$$B(K_{\max}) = \bigcup_{i=1}^{M} A_i. \qquad (3)$$
Now, we can decompose the function $\hat{f}(\boldsymbol{k})$ as follows,
$$\hat{f}(\boldsymbol{k}) = \sum_{i=1}^{M} \chi_{A_i}(\boldsymbol{k})\hat{f}(\boldsymbol{k}) \triangleq \sum_{i=1}^{M} \hat{f}_i(\boldsymbol{k}), \qquad (4)$$
where
$$\hat{f}_i(\boldsymbol{k}) = \chi_{A_i}(\boldsymbol{k})\hat{f}(\boldsymbol{k}), \qquad (5)$$
and $\chi_{A_i}$ is the indicator function of the annulus $A_i$.
The decomposition in the Fourier space gives a corresponding one in the physical space,
$$f(\boldsymbol{x}) = \sum_{i=1}^{M} f_i(\boldsymbol{x}), \qquad (6)$$
where
$$f_i(\boldsymbol{x}) = \mathcal{F}^{-1}[\hat{f}_i](\boldsymbol{x}) = \big(\chi_{A_i}^{\vee} * f\big)(\boldsymbol{x}), \qquad (7)$$
and the inverse Fourier transform $\chi_{A_i}^{\vee}(\boldsymbol{x})$ of the indicator function $\chi_{A_i}(\boldsymbol{k})$ is called the frequency selection kernel [3]; it can be computed analytically using Bessel functions,
$$\chi_{A_i}^{\vee}(\boldsymbol{x}) = \left(\frac{iK_0}{2\pi|\boldsymbol{x}|}\right)^{d/2} J_{d/2}\big(iK_0|\boldsymbol{x}|\big) - \left(\frac{(i-1)K_0}{2\pi|\boldsymbol{x}|}\right)^{d/2} J_{d/2}\big((i-1)K_0|\boldsymbol{x}|\big). \qquad (8)$$
From (5), we can apply a simple downscaling to convert the high frequency region to a low frequency region. Namely, we define a scaled version of $\hat{f}_i(\boldsymbol{k})$ as
$$\hat{f}_i^{(\mathrm{scale})}(\boldsymbol{k}) = \hat{f}_i(\alpha_i \boldsymbol{k}), \quad \alpha_i > 1, \qquad (9)$$
and, correspondingly in the physical space,
$$f_i^{(\mathrm{scale})}(\boldsymbol{x}) = \alpha_i^{-d} f_i\!\left(\frac{\boldsymbol{x}}{\alpha_i}\right), \qquad (10)$$
or
$$f_i(\boldsymbol{x}) = \alpha_i^{d} f_i^{(\mathrm{scale})}(\alpha_i \boldsymbol{x}), \qquad (11)$$
noting the low frequency spectrum of the scaled function if $\alpha_i$ is chosen large enough, i.e.,
$$\operatorname{supp} \hat{f}_i^{(\mathrm{scale})}(\boldsymbol{k}) \subset \left\{\boldsymbol{k} : |\boldsymbol{k}| \le \frac{iK_0}{\alpha_i}\right\}. \qquad (12)$$
Using the F-Principle of common DNNs [23], with $iK_0/\alpha_i$ being small, we can quickly train a DNN $f_{\boldsymbol{\theta}^{n_i}}$ to learn $f_i^{(\mathrm{scale})}(\boldsymbol{x})$,
$$f_i^{(\mathrm{scale})}(\boldsymbol{x}) \approx f_{\boldsymbol{\theta}^{n_i}}(\boldsymbol{x}), \qquad (13)$$
which gives an approximation to $f_i(\boldsymbol{x})$ immediately,
$$f_i(\boldsymbol{x}) \approx \alpha_i^{d} f_{\boldsymbol{\theta}^{n_i}}(\alpha_i \boldsymbol{x}), \qquad (14)$$
and to $f(\boldsymbol{x})$ as well,
$$f(\boldsymbol{x}) \approx \sum_{i=1}^{M} \alpha_i^{d} f_{\boldsymbol{\theta}^{n_i}}(\alpha_i \boldsymbol{x}). \qquad (15)$$
The difficulty of the above procedure for approximating a function, and even more for finding a PDE solution, is the need to compute the convolution in (7), which is computationally expensive for scattered data in $\mathbb{R}^d$, especially in higher dimensional problems.
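The radial downscaling idea of this section can be illustrated numerically. The following sketch, assuming a 1-D band-limited signal and ignoring the $\alpha_i^{-d}$ amplitude factor in (10), shows that stretching a function in space by a factor $\alpha$ compresses its Fourier spectrum by the same factor, so a high frequency component becomes a low frequency one. All names (`f`, `alpha`, `dominant_frequency`) are illustrative and not from the paper.

```python
import numpy as np

def dominant_frequency(samples, dx):
    """Return the dominant (nonnegative) frequency of a real signal."""
    spectrum = np.abs(np.fft.rfft(samples))
    freqs = np.fft.rfftfreq(len(samples), d=dx)
    return freqs[np.argmax(spectrum)]

N = 1024
x = np.linspace(0.0, 1.0, N, endpoint=False)
dx = x[1] - x[0]

f = lambda t: np.sin(2 * np.pi * 40 * t)  # high-frequency component f_i
alpha = 8.0                               # scale factor alpha_i

high = f(x)          # original signal: dominant frequency 40
low = f(x / alpha)   # scaled signal f_i(x/alpha): dominant frequency 40/alpha = 5

print(dominant_frequency(high, dx))  # 40.0
print(dominant_frequency(low, dx))   # 5.0
```

The scaled signal now lies in the low frequency range where, by the F-Principle, a DNN converges quickly; the trained network is evaluated at $\alpha_i \boldsymbol{x}$ to recover the original component, as in (14).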
3 MscaleDNN structure
3.1 Activation function with compact support
In order to produce the scale separation and identification capability of a MscaleDNN, we borrow the idea of the compactly supported mother scaling function from wavelet theory [8] and consider activation functions with compact support as well. Compared with the commonly used activation function $\mathrm{ReLU}(x) = \max\{x, 0\}$, we found activation functions with compact support to be more effective in MscaleDNNs. Two possible activation functions are defined as follows:
$$\mathrm{sReLU}(x) = \mathrm{ReLU}(x)\,\mathrm{ReLU}(1-x) = (x)_+ (1-x)_+, \qquad (16)$$
and the quadratic B-spline with continuous first derivative,
$$\phi(x) = (x)_+^2 - 3(x-1)_+^2 + 3(x-2)_+^2 - (x-3)_+^2, \qquad (17)$$
where $(x)_+ = \max\{x, 0\}$. All three activation functions are illustrated in the spatial domain in Fig. 1, and the Fourier transforms of both $\mathrm{sReLU}$ and $\phi$ are illustrated in Fig. 2.
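The two compactly supported activations can be sketched as below, assuming the forms in (16)-(17) built from one-sided powers; the exact symbols and normalization used in the paper's figures may differ slightly.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def srelu(x):
    # product of two ReLUs -> compact support on [0, 1]
    return relu(x) * relu(1.0 - x)

def phi(x):
    # quadratic B-spline with continuous first derivative,
    # written with one-sided powers (x - k)_+^2; support is [0, 3]
    return (relu(x) ** 2
            - 3.0 * relu(x - 1.0) ** 2
            + 3.0 * relu(x - 2.0) ** 2
            - relu(x - 3.0) ** 2)

print(srelu(0.5))            # 0.25, the peak of sReLU
print(phi(1.5))              # 1.5, the peak of the B-spline
print(phi(-1.0), phi(4.0))   # 0.0 0.0, zero outside the support
```

The compact support in space, combined with smoothness, gives the frequency localization that the scaled subnetworks rely on.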
3.2 MscaleDNN structure
While the procedure leading to (15) is not practical for numerical approximation in high dimensions, it does suggest a plausible form of function space for finding the solution more quickly with DNN functions. We can use a series of scales $\alpha_i$ ranging from 1 to a large number to produce a MscaleDNN structure that speeds up the convergence for solutions with a wide range of frequencies, with uniform accuracy across frequencies. For this purpose, we propose the following two multiscale structures.
MscaleDNN1
For the first kind, we separate the neurons in the first hidden layer into $N$ parts. The neurons in the $i$-th part receive the scaled input $a_i \boldsymbol{x}$; that is, the output of such a neuron is $\sigma(a_i \boldsymbol{w} \cdot \boldsymbol{x} + b)$, where $\boldsymbol{w}$, $\boldsymbol{x}$, and $b$ are the weight, input, and bias parameters, respectively. A complete MscaleDNN takes the following form:
$$f_{\boldsymbol{\theta}}(\boldsymbol{x}) = \boldsymbol{W}^{[L-1]}\,\sigma\circ\Big(\cdots\big(\boldsymbol{W}^{[1]}\,\sigma\circ(\boldsymbol{K}\odot(\boldsymbol{W}^{[0]}\boldsymbol{x}) + \boldsymbol{b}^{[0]}) + \boldsymbol{b}^{[1]}\big)\cdots\Big) + \boldsymbol{b}^{[L-1]}, \qquad (18)$$
where $\boldsymbol{x} \in \mathbb{R}^d$, $\boldsymbol{W}^{[l]} \in \mathbb{R}^{m_{l+1}\times m_l}$ with $m_l$ the number of neurons in the $l$-th hidden layer ($m_0 = d$), $\boldsymbol{b}^{[l]} \in \mathbb{R}^{m_{l+1}}$, $\sigma$ is a scalar activation function, "$\circ$" means entry-wise operation, "$\odot$" is the Hadamard product, and
$$\boldsymbol{K} = (\underbrace{a_1,\ldots,a_1}_{\text{1st part}},\,\underbrace{a_2,\ldots,a_2}_{\text{2nd part}},\,\ldots,\,\underbrace{a_N,\ldots,a_N}_{N\text{-th part}})^{T}, \qquad (19)$$
where $a_i = i$ or $a_i = 2^{i-1}$.
We refer to this structure as Multiscale DNN1 (MscaleDNN1) of the form in Eq. (18), as depicted in Fig. 3(a).
MscaleDNN2 A second kind of multiscale DNN, given in Fig. 3(b), is a sum of $N$ subnetworks, in which each scaled input $a_i \boldsymbol{x}$ goes through its own subnetwork. In MscaleDNN2, the weight matrices from the first to the last hidden layer are block diagonal. Again, we could select the scale coefficients as $a_i = i$ or $a_i = 2^{i-1}$.
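The MscaleDNN2 forward pass can be sketched in plain numpy as follows: each subnetwork sees the input multiplied by its scale $a_i$, and the subnetwork outputs are summed. The weights here are random stand-ins for trained parameters, and the tiny widths (`1-8-8-1` instead of, e.g., `1-150-150-150-1`) are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)

def srelu(x):
    # compactly supported activation sReLU(x) = ReLU(x) * ReLU(1 - x)
    return np.maximum(x, 0.0) * np.maximum(1.0 - x, 0.0)

class SubNet:
    """A small fully connected net; widths list gives layer sizes."""
    def __init__(self, widths):
        self.W = [rng.standard_normal((m, n))
                  for n, m in zip(widths[:-1], widths[1:])]
        self.b = [rng.standard_normal(m) for m in widths[1:]]

    def forward(self, x):
        h = x
        for W, b in zip(self.W[:-1], self.b[:-1]):
            h = srelu(W @ h + b)            # hidden layers
        return self.W[-1] @ h + self.b[-1]  # linear output layer

scales = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0]   # a_i = 2^{i-1}
subnets = [SubNet([1, 8, 8, 1]) for _ in scales]

def mscale_dnn2(x):
    # sum over subnetworks, each receiving the scaled input a_i * x
    return sum(net.forward(a * x) for a, net in zip(scales, subnets))

y = mscale_dnn2(np.array([0.3]))
print(y.shape)  # (1,)
```

Because the scales multiply only the input of each independent subnetwork, the overall weight structure is block diagonal, matching the description above.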
For comparison studies, we define a "normal" network as a fully connected DNN with no multiscale features. We will perform extensive numerical experiments to examine the effectiveness of different settings and use an efficient one to solve complex problems. All models are trained by Adam [15] with a fixed learning rate.
4 MscaleDNN for function approximation and elliptic PDE solutions
In this section, we will set up two problems, i.e., fitting functions and solving PDEs such as the PB equations, which will be used to show the effectiveness of MscaleDNNs in the following sections.
4.1 Mean squared error training for fitting functions
A DNN, denoted by $f_{\boldsymbol{\theta}}(\boldsymbol{x})$, will be trained with the mean squared error (MSE) loss to fit a target function $f(\boldsymbol{x})$ on a domain $\Omega$. The loss function is defined as
$$L(\boldsymbol{\theta}) = \int_{\Omega} \big|f(\boldsymbol{x}) - f_{\boldsymbol{\theta}}(\boldsymbol{x})\big|^2\,\mathrm{d}\boldsymbol{x}, \qquad (20)$$
where $f_{\boldsymbol{\theta}}$ is a neural network with parameter set $\boldsymbol{\theta}$.
In our training process, the training data $\{\boldsymbol{x}_j\}_{j=1}^{n}$ are sampled from $\Omega$ at each training epoch, and the empirical loss at each epoch is
$$L(\boldsymbol{\theta}) \approx \frac{1}{n}\sum_{j=1}^{n} \big|f(\boldsymbol{x}_j) - f_{\boldsymbol{\theta}}(\boldsymbol{x}_j)\big|^2, \qquad (21)$$
where $n$ is the sample size in $\Omega$.
The above training process requires full information about the target function, which indicates that such a training process is not of much practical use by itself. We conduct this study to examine the ability of a DNN to fit high-frequency functions given sufficient information about the target function.
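The epoch-wise resampled training above can be sketched as follows: at each epoch a fresh batch of points is drawn from the domain and the empirical loss in (21) is the Monte Carlo average over the batch. The target function, the stand-in "model", and the sample count here are all illustrative choices, not the paper's.

```python
import numpy as np

rng = np.random.default_rng(0)

def target(x):
    return np.sin(2 * np.pi * x)

def model(x):
    # stand-in for the DNN f_theta; here the zero function
    return np.zeros_like(x)

def empirical_mse(n):
    # fresh samples drawn from the domain (0, 1) at each epoch
    x = rng.uniform(0.0, 1.0, n)
    return np.mean((target(x) - model(x)) ** 2)

# The Monte Carlo estimate converges to the exact MSE
# \int_0^1 sin^2(2 pi x) dx = 0.5 as n grows.
print(empirical_mse(200000))  # ~0.5
```

In actual training, `model` would be the network and a gradient step on this batch loss would follow each resampling.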
4.2 A Ritz variational method for PoissonBoltzmann equations
Let us consider the following elliptic Poisson-Boltzmann equation,
$$-\nabla\cdot\big(\varepsilon(\boldsymbol{x})\nabla u(\boldsymbol{x})\big) + \kappa(\boldsymbol{x})^2 u(\boldsymbol{x}) = f(\boldsymbol{x}), \quad \boldsymbol{x} \in \Omega, \qquad (22)$$
where $\varepsilon(\boldsymbol{x})$ is the dielectric constant and $\kappa(\boldsymbol{x})$ the inverse Debye-Hückel length of an ionic solvent. For a typical solvation problem of a solute such as a biomolecule in an ionic solvent, the dielectric constant $\varepsilon$ will be a discontinuous function across the solute-solvent interface $\Gamma$, and the following transmission conditions will be included,
$$[u]_{\Gamma} = 0, \qquad (23)$$
$$\left[\varepsilon \frac{\partial u}{\partial n}\right]_{\Gamma} = 0, \qquad (24)$$
where $[\cdot]$ denotes the jump of the quantity inside the square bracket across the interface.
We will apply the deep Ritz method as proposed in [10], which produces a variational solution of equation (22) with transmission conditions (23)-(24) through the following minimization problem,
$$u^{*} = \arg\min_{u \in H} J(u), \qquad (25)$$
where $H$ is an admissible set of trial functions and the energy functional is defined as
$$J(u) = \int_{\Omega}\left(\frac{\varepsilon(\boldsymbol{x})}{2}|\nabla u(\boldsymbol{x})|^{2} + \frac{\kappa(\boldsymbol{x})^{2}}{2}u(\boldsymbol{x})^{2} - f(\boldsymbol{x})u(\boldsymbol{x})\right)\mathrm{d}\boldsymbol{x}. \qquad (26)$$
We use the MscaleDNN to represent the trial functions $u_{\boldsymbol{\theta}}(\boldsymbol{x})$ in the above variational problem, where $\boldsymbol{\theta}$ is the DNN parameter set. Then, the MscaleDNN solution is
$$u_{\boldsymbol{\theta}^{*}}, \quad \boldsymbol{\theta}^{*} = \arg\min_{\boldsymbol{\theta}} J(u_{\boldsymbol{\theta}}). \qquad (27)$$
The minimizer can be found by a stochastic gradient descent (SGD) method,
$$\boldsymbol{\theta}^{(n+1)} = \boldsymbol{\theta}^{(n)} - \eta\,\nabla_{\boldsymbol{\theta}} J\big(u_{\boldsymbol{\theta}^{(n)}}\big), \qquad (28)$$
where $\eta$ is the learning rate.
The integral in Eq. (26) will only be sampled at some random points $S \subset \Omega$ at each training step (see (2.11) in [10]), namely,
$$J(u_{\boldsymbol{\theta}}) \approx \frac{|\Omega|}{|S|}\sum_{\boldsymbol{x} \in S}\left(\frac{\varepsilon(\boldsymbol{x})}{2}|\nabla u_{\boldsymbol{\theta}}(\boldsymbol{x})|^{2} + \frac{\kappa(\boldsymbol{x})^{2}}{2}u_{\boldsymbol{\theta}}(\boldsymbol{x})^{2} - f(\boldsymbol{x})u_{\boldsymbol{\theta}}(\boldsymbol{x})\right). \qquad (29)$$
At convergence of the iteration, we obtain the MscaleDNN solution $u_{\boldsymbol{\theta}^{*}}$.
In our numerical tests, the Ritz loss function is modified to account for boundary conditions,
$$L(\boldsymbol{\theta}) = \frac{1}{|S_1|}\sum_{\boldsymbol{x} \in S_1}\left(\frac{\varepsilon(\boldsymbol{x})}{2}|\nabla u_{\boldsymbol{\theta}}(\boldsymbol{x})|^{2} + \frac{\kappa(\boldsymbol{x})^{2}}{2}u_{\boldsymbol{\theta}}(\boldsymbol{x})^{2} - f(\boldsymbol{x})u_{\boldsymbol{\theta}}(\boldsymbol{x})\right) + \beta\,\frac{1}{|S_2|}\sum_{\boldsymbol{x} \in S_2}\big(u_{\boldsymbol{\theta}}(\boldsymbol{x}) - g(\boldsymbol{x})\big)^{2}, \qquad (30)$$
where $u_{\boldsymbol{\theta}}$ is the DNN output, $S_1$ is the sample set from $\Omega$ with $|S_1|$ its sample size, $S_2$ is the sample set from $\partial\Omega$ with $|S_2|$ its sample size, and $g$ is the given boundary data. The second penalty term enforces the boundary condition; the penalty coefficient $\beta$ is fixed for all experiments.
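A Monte Carlo sketch of the penalized Ritz loss in (30), assuming a 1-D model problem $-u'' = f$ on $(0,1)$ with $u(0)=u(1)=0$ (i.e., $\varepsilon = 1$, $\kappa = 0$), evaluated at the exact solution $u(x) = \sin(\pi x)$, $f(x) = \pi^2 \sin(\pi x)$. The penalty weight and sample counts are assumed values for illustration; the paper's problems are in higher dimensions.

```python
import numpy as np

rng = np.random.default_rng(0)
beta = 1000.0  # boundary penalty weight (assumed value)

u  = lambda x: np.sin(np.pi * x)              # candidate trial function
du = lambda x: np.pi * np.cos(np.pi * x)      # its derivative u'
f  = lambda x: np.pi ** 2 * np.sin(np.pi * x) # source term

def ritz_loss(n_interior, boundary=np.array([0.0, 1.0])):
    # interior sample set S_1, drawn fresh each call (as in SGD training)
    x = rng.uniform(0.0, 1.0, n_interior)
    interior = np.mean(0.5 * du(x) ** 2 - f(x) * u(x))
    # boundary sample set S_2 with zero boundary data g = 0
    penalty = beta * np.mean(u(boundary) ** 2)
    return interior + penalty

# For the exact solution the penalty vanishes and the estimate tends to
# the minimum energy -pi^2/4 (about -2.467).
print(ritz_loss(200000))
```

In actual training, `u` would be the MscaleDNN output, `du` its gradient obtained by automatic differentiation, and each evaluation of this loss would be followed by a gradient step on the network parameters.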
To assess the learning accuracy, we also compute the error between $u_{\boldsymbol{\theta}}$ and the exact (or reference) solution $u$ on $n'$ test data points $\{\boldsymbol{x}_j\}$ inside the domain,
$$\mathrm{error} = \sqrt{\frac{1}{n'}\sum_{j=1}^{n'}\big|u_{\boldsymbol{\theta}}(\boldsymbol{x}_j) - u(\boldsymbol{x}_j)\big|^{2}}. \qquad (31)$$
5 Effectiveness of various MscaleDNN settings
In this section, we will show that MscaleDNNs outperform normal fully-connected DNNs (indicated by "normal" in the numerical results) in various settings; that is, the loss function of a MscaleDNN decays faster to smaller values than that of a normal fully-connected DNN. First, we describe two test problems. Second, we will demonstrate that the compactly supported activation functions $\mathrm{sReLU}$ and $\phi$ are much better than the commonly used $\mathrm{ReLU}$. Third, we will show that MscaleDNN structures are better than a normal fully connected one. Finally, we examine different scale selections.
5.1 Two test problems
To understand the performance of different MscaleDNNs and their parameters, we consider here one- and two-dimensional problems in fitting functions and solving PDEs; problems in 3D complex domains will be considered in the next section.
Test problem 1: Fitting problem
The target function for the fitting problem is
(32) 
where the parameters differ between the one-dimensional and two-dimensional cases. The target functions for $d=1$ and $d=2$ are shown in Fig. 4. Training data at each epoch and test data are randomly sampled from the domain. All DNNs are trained by the Adam optimizer with a fixed learning rate.
Test problem 2: Solving PB equations
We will solve the elliptic equation (22) with a variable coefficient $\varepsilon(\boldsymbol{x})$ and a constant $\kappa$, with the right-hand side chosen such that the PB equation has a known exact solution, and with the corresponding boundary condition given by the exact solution. Different parameters are used for the one-dimensional and two-dimensional cases; the exact solutions for $d=1$ and $d=2$ are shown in Fig. 5. DNNs are trained by the Adam optimizer with a fixed learning rate. Training data at each epoch are randomly sampled from the domain, a fixed penalty coefficient is used for the boundary term, and boundary data are randomly sampled from $\partial\Omega$ at each epoch.
5.2 Different activation functions
We use the following three network structures to examine the effectiveness of different activation functions by solving the one-dimensional fitting and PDE problems described above:

fully-connected DNN with size 1-900-900-900-1 (normal).

MscaleDNN1 with size 1-900-900-900-1 and scale coefficients $\{1, 2, 4, 8, 16, 32\}$ (MscaleDNN1(32)).

MscaleDNN2 with six subnetworks of size 1-150-150-150-1 and scale coefficients $\{1, 2, 4, 8, 16, 32\}$ (MscaleDNN2(32)).
We use the three activation functions, i.e., $\mathrm{ReLU}$, $\mathrm{sReLU}$, and $\phi$, for the above structures. For the normal network structure on the fitting problem, as shown in Fig. 6(a), one activation (blue) performs much better than the other two; however, for the normal network structure on the PDE problem, as shown in Fig. 7(a), the same activation (blue) performs much worse than the other two. This indicates that none of the three activation functions is consistently reliable in the normal fully connected structure. On the other hand, as shown in Figs. 6(b, c) and 7(b, c), for both MscaleDNN structures, the compactly supported activation functions (orange and blue) both perform much better than $\mathrm{ReLU}$ (green) on both test problems.
5.3 Different network structures
In this subsection, we examine the effectiveness of the following different network structures, all with the same compactly supported activation function:

fully-connected DNN with size 1-900-900-900-1 (normal).

MscaleDNN1 with size 1-900-900-900-1 and scale coefficients $\{1, 2, 4, 8, 16, 32\}$ (MscaleDNN1(32)).

MscaleDNN2 with six subnetworks of size 1-150-150-150-1 and scale coefficients $\{1, 2, 4, 8, 16, 32\}$ (MscaleDNN2(32)).
As shown in Figs. 8 and 9, both MscaleDNN structures are better than the normal structure in both problems, and the two MscaleDNN structures have similar performance on both test problems. As MscaleDNN2 performs slightly better than MscaleDNN1 and also has far fewer connections, we use MscaleDNN2 for the further numerical experiments in the following.
5.4 Different scale selections in MscaleDNNs
In this subsection, we will test different scale selections for the MscaleDNN:

fully-connected DNN with size 1-900-900-900-1 (normal).

MscaleDNN2 with six subnetworks of size 1-150-150-150-1 and scale coefficients $\{1, 1, 1, 1, 1, 1\}$ (MscaleDNN2(1)).

MscaleDNN2 with three subnetworks of size 1-300-300-300-1 and scale coefficients $\{1, 2, 3\}$ (MscaleDNN2(3)).

MscaleDNN2 with three subnetworks of size 1-300-300-300-1 and scale coefficients $\{1, 2, 4\}$ (MscaleDNN2(4)).

MscaleDNN2 with six subnetworks of size 1-150-150-150-1 and scale coefficients $\{1, 2, 3, 4, 5, 6\}$ (MscaleDNN2(6)).

MscaleDNN2 with six subnetworks of size 1-150-150-150-1 and scale coefficients $\{1, 2, 4, 8, 16, 32\}$ (MscaleDNN2(32)).
As shown in Fig. 10, MscaleDNNs perform almost consistently better than normal DNNs. Note that with a larger range of scales, the MscaleDNN solves the problem faster. With all scales equal to 1, the performance of the DNN structure (MscaleDNN2(1)) is much worse than those with multiple scales in solving elliptic PDEs. Therefore, with the subnetwork structure using different scales, the MscaleDNN is able to achieve a faster convergence. These experiments show that MscaleDNNs with proper scales are more efficient in solving PDE problems and that the results are not overly sensitive to the selection of scales.
With these numerical experiments, we have demonstrated that the MscaleDNN is much more efficient in solving elliptic PDEs, and the preferred network is MscaleDNN2 with a compactly supported activation function, which will be used for the rest of the paper for solving Poisson and PB equations in complex and/or singular domains.
6 MscaleDNNs for Poisson and PoissonBoltzmann equations in complex domains
In this section, we apply MscaleDNNs with a compactly supported activation function to solve complex elliptic equations, including cases with a broad range of frequencies, variable coefficients, a ring-shaped domain, and a cubic domain with multiple holes. Finally, we apply the MscaleDNN to solve PB equations with geometric singularities, such as cusps and self-intersecting surfaces, which come from a typical bead model of a biomolecule. Through these experiments, we convincingly demonstrate that MscaleDNNs are an efficient and easy-to-implement meshless method for solving complex elliptic PDEs.
6.1 Broad range of frequencies
Consider the Poisson equation in $\Omega \subset \mathbb{R}^d$,
$$-\Delta u(\boldsymbol{x}) = f(\boldsymbol{x}), \quad \boldsymbol{x} \in \Omega; \qquad u(\boldsymbol{x}) = g(\boldsymbol{x}), \quad \boldsymbol{x} \in \partial\Omega, \qquad (33)$$
where the source term $f$ is chosen such that the equation has a known exact solution containing a broad range of frequencies, which also provides the boundary condition $g$ in problem (33).
In each training epoch, we sample points both inside the domain and on the boundary. We examine the following two structures:

a fully-connected DNN with size 1-1000-1000-1000-1 (normal).

a MscaleDNN2 with five subnetworks of size 1-200-200-200-1 and scale coefficients $\{1, 2, 4, 8, 16\}$ (Mscale).
This problem does not have a single fixed frequency but a broad range of frequencies, which a commonly-used fully connected DNN will not be able to resolve. The exact solution of the two-dimensional case of problem (33) is shown in Fig. 11(a) as a highly oscillatory function. The solution obtained by the normal DNN in Fig. 11(b) fails to capture the oscillatory structure, while the solution obtained by the MscaleDNN in Fig. 11(c) captures the oscillations at different scales well. For example, in the area marked by the red circle, the expected oscillation almost disappears in the solution of the normal network, while the MscaleDNN solution resolves the oscillations well. Similar behavior differences occur for the oscillations at the four corners.
The errors of the two-dimensional and three-dimensional problems are shown in Fig. 12(a) and (b), respectively. In both cases, MscaleDNNs solve the problems much faster and reach lower errors.
6.2 PB equations with variable coefficients
Consider the PB equation (22) with variable coefficients $\varepsilon(\boldsymbol{x})$ and $\kappa(\boldsymbol{x})$ chosen such that the equation has a known exact solution. The boundary condition is again given by the exact solution $u(\boldsymbol{x})$.
In each training epoch, we sample points both inside the domain and on the boundary. We compare the following two DNN structures:

a fully-connected DNN with size 1-900-900-900-1 (normal).

a MscaleDNN2 with six subnetworks of size 1-150-150-150-1 and scale coefficients $\{1, 2, 4, 8, 16, 32\}$ (Mscale).
As shown in Fig. 13, during the training process the error of the MscaleDNN decays significantly, while the error of the normal DNN remains almost unchanged. Therefore, the MscaleDNN solves the problem much faster and with much better accuracy.
6.3 A ringshaped domain
Consider the Poisson equation (33) in a ring-shaped domain with given center, inner radius, and outer radius, with a source term
(34)
where $J$ denotes a Bessel function of the first kind. The exact solution is given by
(35)
The boundary condition is again given by the exact solution $u(\boldsymbol{x})$. We solve the equation for two choices of the frequency parameter.
In each training epoch, we sample points both inside the domain and on the boundary. We examine the following two structures:

a fully-connected DNN with size 1-500-500-500-1 (normal).

a MscaleDNN2 with five subnetworks of size 1-100-100-100-1 and scale coefficients $\{1, 2, 4, 8, 16\}$ (Mscale).
The exact solutions and the numerical solutions obtained by the normal and MscaleDNNs are shown in Fig. 14 and Fig. 15 for the two cases. To highlight the superior performance of the MscaleDNNs, consider the areas in the figures marked by the black circle, which show the region of the solution with the largest amplitude: the normal networks completely fail to capture the oscillations there, while the MscaleDNNs faithfully capture them in both cases. Again, as shown in Fig. 16, the MscaleDNN solves both problems with much better accuracy.
6.4 A square domain with a few holes
We consider next the following two square domains with three and four holes, respectively:
Domain one
This domain contains three circular holes with specified centers and radii. In each epoch, we randomly sample points on the outer boundary, on the boundary of each big hole, and on the boundary of the small hole.
Domain two
This domain contains three circular holes with specified centers and radii, together with an elliptic hole. At each epoch, points are sampled on the outer boundary, the boundary of the big circular hole, the boundary of each small circular hole, and the boundary of the elliptic hole.
We solve the Poisson equation (33) with the source term as
(36) 
The exact solution is
(37) 
which also provides the boundary condition.
In each training epoch, we sample points inside the domain. We examine the following two DNN structures:

a fully-connected DNN with size 1-1000-1000-1000-1 (normal).

a MscaleDNN2 with five subnetworks of size 1-200-200-200-1 and scale coefficients $\{1, 2, 4, 8, 16\}$ (Mscale).
Compared with the exact solutions in Fig. 17(a) and Fig. 18(a), the normal DNN fails to resolve the magnitudes of many oscillations, as shown in Fig. 17(b) and Fig. 18(b), while MscaleDNNs capture each oscillation of the true solutions accurately, as shown in Fig. 17(c) and Fig. 18(c).
As shown in Fig. 19, MscaleDNNs solve both problems much faster and reach lower errors.
6.5 A square domain with many holes
To verify the capability of the MscaleDNN for complex domains, we consider a three-dimensional cube with 125 holes removed; the holes are centered at the nodes of a uniform $5\times 5\times 5$ mesh, with radii randomly sampled from a uniform distribution. The training samples at each epoch are split between the outer boundary and the inner holes (the same number of points for each hole). Again, consider the Poisson equation with the Dirichlet boundary condition given by the exact solution for the following three cases:

Example 1: .

Example 2: .

Example 3: .
The difficulty of this problem lies in the complex holes and the oscillatory exact solutions.
To visualize the complexity of the problem, we show the holes of the domain in Fig. 20.
In each training epoch, we sample points inside the domain, and compare the following two structures:

a fully-connected DNN with size 1-1000-1000-1000-1 (normal).

a MscaleDNN2 with five subnetworks of size 1-200-200-200-1 and scale coefficients $\{1, 2, 4, 8, 16\}$ (Mscale).
As shown in Fig. 21 for all three cases, the normal fully-connected structures do not converge at all for such complex problems, while MscaleDNNs can solve the problems with much smaller errors.
6.6 Geometric singularities
In this subsection, we consider the PB equation (22) in a domain with geometric singularities and jump conditions on interior interfaces, which arises from the simulation of solvation of biomolecules. Consider an open bounded domain $\Omega_1$, which divides $\mathbb{R}^3$ into two disjoint open subdomains separated by the surface $\Gamma = \partial\Omega_1$. $\Omega_1$ is identified as the biomolecular domain, and $\Omega_2 = \mathbb{R}^3\setminus\overline{\Omega}_1$ is called the solvent domain. The exact solution is also divided into two parts: $u_1$ defined in $\Omega_1$ and $u_2$ defined in $\Omega_2$. The solution also satisfies the transmission conditions (23)-(24) along the interface $\Gamma$ and a decaying condition at infinity, i.e.,
$$u(\boldsymbol{x}) \to 0 \quad \text{as } |\boldsymbol{x}| \to \infty. \qquad (38)$$
To deal with the unbounded domain, we truncate the solution domain to a large ball or cube, denoted by $\Omega$, satisfying $\Omega_1 \subset\subset \Omega$, and an approximate boundary condition is posed on the boundary of $\Omega$ (Fig. 22). Such a crude boundary condition will surely introduce error into the PDE solution, and higher order boundary conditions have been studied extensively. As we are more interested in the performance of the DNNs near the interior interface, we will not dwell on this issue here.
The domain with geometric singularities is constructed as follows. We choose a big ball with a given center and radius, and randomly select a number of points on its surface as the centers of small balls, whose radii are randomly sampled. $\Omega_1$ is the union of these small balls and the big ball. The shape of $\Omega_1$ is illustrated in Fig. 23. The intersections among the balls cause geometric singularities, such as kinks, which pose major challenges for mesh generation in traditional finite element and boundary element methods and for accurate solution procedures.
The following two examples are considered. In both examples, the coefficients $\varepsilon$ and $\kappa$ are chosen as piecewise constants. We do not consider singular sources for the PB equations, which can arise from the point charges inside biomolecules or from ions in the solvent. These point-charge sources, modeled by Dirac delta functions, create point singularities in the solution, which can easily be removed by subtracting a singular solution [7]; the remaining smooth part can then be solved by the MscaleDNN as follows.
Example 1
The exact solution is
(39) 
with coefficients for the PB equation as
(40) 
The whole domain is truncated by a large ball, with zero boundary condition imposed on the sphere.
Example 2
We choose
(41) 
with coefficients
(42) 
In this case, the computational domain is obtained by truncation with a cube, and the reference solution is calculated by a finite difference method (FDM) with a sufficiently fine mesh to ensure adequate accuracy.
In both examples, at each training epoch we sample points inside the domain $\Omega$ and points on the boundary $\partial\Omega$. We train MscaleDNNs with the Ritz loss function in (30). Note that the continuity condition in (23) is automatically satisfied since we use a single network to fit the whole domain $\Omega$; the natural condition in (24) is also automatically satisfied due to the Ritz loss.
We examine the following two structures:

a fully-connected DNN with size 1-1000-1000-1000-1 (normal).

a MscaleDNN2 with five subnetworks of size 1-200-200-200-1 and scale coefficients $\{1, 2, 4, 8, 16\}$ (Mscale).
Since the value of the exact solution is small, we show the relative error for both cases. As the exact solution is unknown in practice, we also show the training loss for both examples, which can serve as a criterion to terminate the training. For example 1, as shown in Fig. 24, the training loss in Fig. 24(a) and the error in Fig. 24(b) have similar trends; that is, the training loss and the error of the MscaleDNN converge faster to smaller values, compared with the normal DNN. For example 2, shown in Fig. 25, the MscaleDNN shows a similar advantage over the normal DNN. These examples indicate that, by monitoring only the training loss, the MscaleDNN solves the PB equations with non-smooth solutions over singular domains much faster and with better accuracy.
For illustration, we show a cross section of the solution in the second example. The reference solution is obtained by the FDM. Numerical solutions along a line obtained by the FDM, the normal DNN, and the MscaleDNN are shown in Fig. 26. The output of the normal fully connected network gives a wrong solution in the interior of the singular domain, while the MscaleDNN gives a satisfactory approximation to the reference solution.
7 Conclusion and future work
In this paper, we have introduced a new kind of multiscale DNN, the MscaleDNN, using a frequency domain scaling technique and compactly supported activation functions, to generate a multiscale capability for finding the solutions of elliptic PDEs with rich frequency contents. By using a radial scaling in the Fourier domain of the solutions, the MscaleDNN is shown to be an efficient, meshless, and easy-to-implement method for PDEs on complex and singular domains.
For future work, we will also explore the idea of activation functions with mother-wavelet properties, as proposed in our previous version [4], which should give further frequency localization and separation capability to MscaleDNNs. More importantly, an area to be explored is the application of the MscaleDNN to high dimensional PDEs, such as Schrödinger equations for many-body quantum systems, where issues of high dimensional sampling and the low dimensional structure of solutions will be studied.
Acknowledgments
ZX is supported by the National Key R&D Program of China (2019YFA0709503) and the Shanghai Sailing Program.
References
 [1] (2001) Electrostatics of nanosystems: application to microtubules and the ribosome. Proceedings of the National Academy of Sciences 98 (18), pp. 10037–41. Cited by: §1.
 [2] (2019) The convergence rate of neural networks for learned functions of different frequencies. arXiv preprint arXiv:1906.00425. Cited by: §1.
 [3] (2019) A phase shift deep neural network for high frequency approximation and wave problems. Accepted by SISC, arXiv:1909.11759. Cited by: §1, §2.
 [4] (2019) Multiscale deep neural networks for solving high dimensional pdes. Arxiv preprint, arXiv:1910.11710. Cited by: §7.
 [5] (2013) Computational methods for electromagnetic phenomena, electrostatics in solvation, scatterings, and electron transport. Cambridge University Press. Cited by: §1.

 [6] (2020) Towards understanding the spectral bias of deep learning. arXiv:1912.01198 [cs, stat]. Cited by: §1.
 [7] (2003) Accurate evaluation of electrostatics for macromolecules in solution. Meth. Appl. Anal. 10, pp. 309–328. Cited by: §6.6.
 [8] (1992) Ten lectures on wavelets. Vol. 61, Siam. Cited by: §3.1.

 [9] (2017) Deep learning-based numerical methods for high-dimensional parabolic partial differential equations and backward stochastic differential equations. Communications in Mathematics and Statistics 5 (4), pp. 349–380. Cited by: §1.
 [10] (2018) The deep Ritz method: a deep learning-based numerical algorithm for solving variational problems. Communications in Mathematics and Statistics 6 (1), pp. 1–12. Cited by: §1, §4.2, §4.2.
 [11] (2019) DNN approximation of nonlinear finite element equations. Technical report Lawrence Livermore National Lab.(LLNL), Livermore, CA (United States). Cited by: §1.
 [12] (2018) Solving highdimensional partial differential equations using deep learning. Proceedings of the National Academy of Sciences 115 (34), pp. 8505–8510. Cited by: §1.
 [13] (2018) Deep potential: a general representation of a manybody potential energy surface. Communications in Computational Physics 23 (3). Cited by: §1.
 [14] (2018) ReLU deep neural networks and linear finite elements. arXiv preprint arXiv:1807.03973. Cited by: §1.
 [15] (2014) Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980. Cited by: §3.2.
 [16] (2019) Deep nitsche method: deep ritz method with essential boundary conditions. arXiv preprint arXiv:1912.01309. Cited by: §1.
 [17] (1997) Structure and mechanism of carbonic anhydrase. Pharmacol. Therapeut. 74 (), pp. 1–20. Cited by: §1.
 [18] (2019) Theory of the frequency principle for general deep neural networks. arXiv preprint arXiv:1906.09235. Cited by: §1.

 [19] (2019) On the spectral bias of neural networks. In International Conference on Machine Learning, pp. 5301–5310. Cited by: §1.
 [20] (2019) Physics-informed neural networks: a deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics 378, pp. 686–707. Cited by: §1.

 [21] (2019) Data-driven, physics-based feature extraction from fluid flow fields using convolutional neural networks. Communications in Computational Physics 25 (3), pp. 625–650. Cited by: §1.
 [22] (2020) A mesh-free method for interface problems using the deep learning approach. Journal of Computational Physics 400, pp. 108963. Cited by: §1.
 [23] (2019) Frequency principle: fourier analysis sheds light on deep neural networks. Accepted by Communications in Computational Physics, arXiv:1901.06523. Cited by: §1, §2.
 [24] (2019) Training Behavior of Deep Neural Network in Frequency Domain. In Neural Information Processing, Lecture Notes in Computer Science, pp. 264–274. External Links: Document Cited by: §1.
 [25] (2007) Treatment of geometric singularities in implicit solvent models. J. Chem. Phys. 126 (), pp. 244108. Cited by: §1.
 [26] (2019) Explicitizing an implicit bias of the frequency principle in twolayer neural networks. arXiv preprint arXiv:1905.10264. Cited by: §1.