Fast Neural Network based Solving of Partial Differential Equations

05/18/2022
by   Jaroslaw Rzepecki, et al.

We present a novel method for using Neural Networks (NNs) for finding solutions to a class of Partial Differential Equations (PDEs). Our method builds on recent advances in Neural Radiance Field (NeRF) research and allows a NN to converge to a PDE solution much faster than classic Physically Informed Neural Network (PINN) approaches.



1 Introduction

Partial differential equations (PDEs) are found throughout physics and engineering, but solving them is notoriously difficult. Numerical approaches such as Finite Element Methods (FEMs) simplify the problem by breaking it down into smaller pieces (“elements”), and making assumptions about the relationships within and between those pieces. This is effective for small problems, but does not scale well as the number of elements grows.

Recently, Physically Informed Neural Networks (PINNs) have exploited neural networks’ ability to approximate functions to learn solutions to PDEs. They are trained by measuring how far the neural network’s function is from a solution to the PDE, and computing a loss term accordingly. This can naturally be extended to solve multiple systems of PDEs simultaneously, but training a neural network can take a long time. Neural networks can also lead to a more compact and elegant functional representation of the solution to a PDE, instead of the brute force enumeration of positions and values typically used by FEMs.

Our method provides a faster alternative to PINNs by building on recent developments in Neural Radiance Fields (NeRFs). In particular, PDEs in integral form impose constraints on any path or surface in the domain, which allows us to efficiently generate arbitrary amounts of training data by simply generating random paths and/or surfaces and computing their required properties. We have also found that sampling the neural network's output along entire paths, instead of at individual points, improves numerical stability and makes it easier for the network to converge to a result.

2 Physically Informed Neural Networks

Neural Network based PDE solvers (Raissi et al. [raissi2019physics], Lagaris et al. [lagaris1998artificial], Sirignano et al. [sirignano2018dgm], Hennigh et al. [hennigh2021nvidia]), after successful training, allow for nearly instantaneous solving of parametrized models/PDEs, which can benefit both manual (human-in-the-loop) and fully automated optimization schemes. Additionally, at the expense of a more complicated loss function, they make the combination of multiple physical phenomena (i.e. different types of PDEs) relatively straightforward.

Neural Network solvers are based on two main observations: (1) Neural Networks are very good general function approximators, and (2) while it is difficult to find a solution to a PDE, it is easy to check whether a given candidate function solves it.

Following the notation from Hennigh et al. [hennigh2021nvidia], let us consider a general form of a PDE:

$$\mathcal{N}_i[u](\mathbf{x}) = 0, \quad \mathbf{x} \in \mathcal{D}, \qquad \mathcal{C}_j[u](\mathbf{x}) = 0, \quad \mathbf{x} \in \partial\mathcal{D} \tag{1}$$

In the above equation, $\mathcal{N}_i$ are differential operators, $\mathbf{x}$ is the set of independent variables defined over a bounded continuous domain $\mathcal{D}$, and $u(\mathbf{x})$ is the solution to the PDE. $\mathcal{C}_j$ denote the constraint operators that cover the boundary and initial conditions. $\partial\mathcal{D}$ denotes a subset of the domain boundary that is required for defining those constraints. We seek to approximate the solution $u(\mathbf{x})$ by a neural network $u_{net}(\mathbf{x}; \theta)$.

Depending on the actual problem at hand and the type of the PDE, different neural network architectures can be used. Generally, a standard multi-layer fully-connected architecture that transforms the input $\mathbf{x}$ into $u_{net}(\mathbf{x})$ has been found to work well for a large class of problems.

The loss function needed to train the neural network can be constructed based on how well the neural network's approximate solution fulfils the PDE and its boundary conditions. To construct the loss function, we first define the PDE ($r_{\mathcal{N}_i}$) and constraint ($r_{\mathcal{C}_j}$) residuals:

$$r_{\mathcal{N}_i}(\mathbf{x}; \theta) = \mathcal{N}_i[u_{net}](\mathbf{x}), \qquad r_{\mathcal{C}_j}(\mathbf{x}; \theta) = \mathcal{C}_j[u_{net}](\mathbf{x}) \tag{2}$$

where $\theta$ is the set of neural network trainable parameters (weights and biases).

The final loss function, $\mathcal{L}(\theta)$, can now be defined as a weighted sum of all the residuals:

$$\mathcal{L}(\theta) = \sum_i \int_{\mathcal{D}} \lambda_{\mathcal{N}_i}(\mathbf{x}) \left\| r_{\mathcal{N}_i}(\mathbf{x}; \theta) \right\|^2 d\mathbf{x} + \sum_j \int_{\partial\mathcal{D}} \lambda_{\mathcal{C}_j}(\mathbf{x}) \left\| r_{\mathcal{C}_j}(\mathbf{x}; \theta) \right\|^2 d\mathbf{x} \tag{3}$$

In practice, the integrals in Eq. (3) can be approximated by discrete sampling of the input space.
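As a concrete illustration, below is a minimal sketch (in PyTorch, not the authors' implementation) of this sampled loss for a toy 1D Poisson problem $u''(x) = f(x)$ with $u(0) = u(1) = 0$; the architecture, residual weights and source term are illustrative assumptions.

```python
import torch

# Minimal PINN sketch: Monte Carlo approximation of the loss in Eq. (3)
# for u''(x) = f(x) on [0, 1] with u(0) = u(1) = 0. All constants are
# illustrative assumptions, not values from the paper.

net = torch.nn.Sequential(
    torch.nn.Linear(1, 64), torch.nn.Tanh(),
    torch.nn.Linear(64, 64), torch.nn.Tanh(),
    torch.nn.Linear(64, 1),
)

def f(x):                          # assumed source term
    return torch.sin(torch.pi * x)

def pinn_loss(lambda_pde=1.0, lambda_bc=10.0, n_interior=256):
    # Interior residual r_N = u'' - f at random collocation points.
    x = torch.rand(n_interior, 1, requires_grad=True)
    u = net(x)
    du = torch.autograd.grad(u, x, torch.ones_like(u), create_graph=True)[0]
    d2u = torch.autograd.grad(du, x, torch.ones_like(du), create_graph=True)[0]
    r_pde = d2u - f(x)
    # Boundary residual r_C = u - 0 at the domain boundary.
    xb = torch.tensor([[0.0], [1.0]])
    r_bc = net(xb)
    return lambda_pde * r_pde.pow(2).mean() + lambda_bc * r_bc.pow(2).mean()

opt = torch.optim.Adam(net.parameters(), lr=1e-3)
for step in range(2000):
    opt.zero_grad()
    loss = pinn_loss()
    loss.backward()
    opt.step()
```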

A neural network trained using the loss from Eq. (3) will theoretically converge to a function that is a solution to the given set of PDEs and their boundary conditions. In practice it has been shown (Hennigh et al. [hennigh2021nvidia]) to work well for many different problems, including heat transfer, mechanical stress, diffusion and others. However, for this method to work one needs access to the gradients of the candidate function (which are usually provided by a NN training framework), and it requires careful manual tuning of the weight functions ($\lambda_{\mathcal{N}_i}$, $\lambda_{\mathcal{C}_j}$) that control the relative significance of the loss components (i.e. PDE solution vs boundary conditions).

In particular, for electromagnetic problems involving Neumann boundary conditions between materials with different magnetic (relative) permeability ($\mu_1$, $\mu_2$), we found that the boundary-condition part of the loss function was not numerically stable (due to the orders-of-magnitude jump in the required function gradient at the boundary), and it was difficult to choose its weighting in such a way that the network would converge to the full PDE solution.

3 Neural Radiance Fields

Neural Radiance Field methods provide a neural-network-backed method for synthesising new views of complex 3D scenes by optimizing an underlying volumetric scene function using a sparse set of input views/images [Mildenhall et al. [nerf:mildenhall]]. The 3D scene is represented using a neural network (usually a fully-connected deep network). The input to the network is a 5D coordinate – spatial location $(x, y, z)$ and viewing direction $(\theta, \phi)$. The output of the network is the view-dependent emitted radiance $\mathbf{c}$ and the volume density $\sigma$ at the corresponding input location. A 2D view (image) can be synthesised by querying 5D coordinates along camera rays and then using classic volume rendering techniques to project the output colors and densities into an image. This process involves integrating the radiance and density outputs along each camera ray to produce a final pixel in the 2D image.

The training data set contains a set of 2D images of the 3D scene we want to recreate, each image taken with a known camera angle. For a given pixel, the training loss is the difference between the value of that pixel in the training image and the colour of that pixel predicted by the NeRF model. To obtain the pixel colour from the NeRF model we need to integrate the outputs (radiance $\mathbf{c}$ and density $\sigma$) of the neural network along the ray [Mildenhall et al. [nerf:mildenhall]]:

$$C(\mathbf{r}) = \int_{t_n}^{t_f} T(t)\, \sigma(\mathbf{r}(t))\, \mathbf{c}(\mathbf{r}(t), \mathbf{d})\, dt \tag{4}$$

where:

$$T(t) = \exp\left( -\int_{t_n}^{t} \sigma(\mathbf{r}(s))\, ds \right) \tag{5}$$

$T(t)$ is the accumulated transmittance along the integrated ray up until the point $t$. The integrals in Equations (4), (5) are taken along a line (ray) parameterised with $t$, starting at $t_n$ and terminating at $t_f$:

$$\mathbf{r}(t) = \mathbf{o} + t\,\mathbf{d} \tag{6}$$

where $\mathbf{o}$ is the ray origin (determined by the position of the pixel in a given 2D image/view) and $\mathbf{d}$ is the ray direction, which is determined by the camera angle used when the given image/view was generated. In practice, both integrals are approximated by sampling and summation.
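For reference, here is a minimal sketch of that sampling-and-summation quadrature, following the standard discretisation from Mildenhall et al. [nerf:mildenhall]; the `field` callable and the sample count are assumptions:

```python
import torch

# Standard NeRF quadrature for Eqs. (4)-(5): sample points along the ray
# r(t) = o + t*d, query (sigma_i, c_i), then accumulate transmittance.

def render_ray(field, o, d, t_near, t_far, n_samples=64):
    """field(pts, dirs) -> (sigma, rgb) is a hypothetical network wrapper;
    o and d are 3-vectors (ray origin and direction)."""
    t = torch.linspace(t_near, t_far, n_samples)
    delta = t[1:] - t[:-1]                      # spacing between samples
    pts = o + t[:-1, None] * d                  # sample positions on the ray
    sigma, rgb = field(pts, d.expand_as(pts))   # sigma: (n-1,), rgb: (n-1, 3)
    alpha = 1.0 - torch.exp(-sigma * delta)     # per-segment opacity
    # T_i = exp(-sum_{j<i} sigma_j * delta_j): accumulated transmittance
    T = torch.cumprod(torch.cat([torch.ones(1), 1.0 - alpha + 1e-10]), 0)[:-1]
    weights = T * alpha
    return (weights[:, None] * rgb).sum(dim=0)  # predicted pixel colour
```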

The total training loss can then be constructed as the average error of the colour predictions over all training pixels.

Note that in order for the neural network to correctly recreate the radiance and the density of the 3D scene it must be capable of capturing very rapid changes in its output in relation to its inputs. In particular, the density output will have its values change dramatically on the boundaries of solid objects in the scene. If the network were not capable of recreating those rapid density changes, then all the images/views recreated by the NeRF model would be blurry, without proper definition of solid object boundaries.

To sum up – a NeRF model is trained on a loss function that is based on line integrals of the network outputs (Eq. (4)), and it needs to be able to properly capture rapid changes in the output values with respect to the inputs.

4 Input encoding

Transformation of the inputs to the neural network (input encoding) has been noted to play a crucial role in the speed and accuracy of training, both in the case of PINNs and of NeRF. Hennigh et al. [hennigh2021nvidia] implement both Fourier-based encoding and the SIREN encoding introduced by Sitzmann et al. [encoding:siren]. Both of these encodings aim at increasing the ability of the network to learn important high-frequency features.

The work of Müller et al. [encoding:instant_ngp] takes this a step further and focuses on adaptive encoding. This adaptive, multi-resolution hash-based encoding led to several orders of magnitude speed-up in training on a set of graphics-related tasks, including a NeRF model.
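To give a flavour of the idea, here is a deliberately simplified 2D sketch of a multi-resolution hash encoding (nearest-vertex lookup instead of the paper's d-linear interpolation; the table size, level count and growth factor are illustrative assumptions):

```python
import torch

# Simplified multi-resolution hash encoding in the spirit of Mueller et al.
# [encoding:instant_ngp]: each resolution level hashes the enclosing grid
# vertex into a small trainable feature table; per-level features are
# concatenated and fed to the downstream MLP.

class HashEncoding(torch.nn.Module):
    def __init__(self, n_levels=8, table_size=2**14, feat_dim=2,
                 base_res=16, growth=1.5):
        super().__init__()
        self.res = [int(base_res * growth**l) for l in range(n_levels)]
        self.tables = torch.nn.ParameterList(
            [torch.nn.Parameter(1e-4 * torch.randn(table_size, feat_dim))
             for _ in range(n_levels)])
        self.table_size = table_size

    def forward(self, x):                       # x: (batch, 2) in [0, 1]^2
        feats = []
        for res, table in zip(self.res, self.tables):
            ij = (x * res).long()               # nearest grid vertex per point
            # Spatial hash: XOR of coordinates times large primes (as in the paper).
            h = (ij[:, 0] * 1 ^ ij[:, 1] * 2654435761) % self.table_size
            feats.append(table[h])
        return torch.cat(feats, dim=-1)         # concatenated per-level features
```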

5 Fast Integral PINN Solving

In this section we describe our method for fast solving of a class of PDEs. For clarity of argument we will focus on solving magneto-static Maxwell equations, but the methodology can be easily applied to any type of PDE that can be represented in integral form.

A full set of general Maxwell equations:

$$\nabla \times \mathbf{H} = \mathbf{J} + \frac{\partial \mathbf{D}}{\partial t}, \qquad \nabla \times \mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t}, \qquad \nabla \cdot \mathbf{B} = 0, \qquad \nabla \cdot \mathbf{D} = \rho \tag{7}$$

can be rewritten in integral form:

$$\oint_{L} \mathbf{H} \cdot d\mathbf{l} = \iint_{S} \left( \mathbf{J} + \frac{\partial \mathbf{D}}{\partial t} \right) \cdot d\mathbf{S}, \qquad \oint_{L} \mathbf{E} \cdot d\mathbf{l} = -\frac{\partial}{\partial t} \iint_{S} \mathbf{B} \cdot d\mathbf{S}, \qquad \oiint_{S} \mathbf{B} \cdot d\mathbf{S} = 0, \qquad \oiint_{S} \mathbf{D} \cdot d\mathbf{S} = \iiint_{V} \rho \, dV \tag{8}$$

In the above equations, $\mathbf{H}$ is the magnetic field strength, $\mathbf{B}$ is the magnetic flux density, $\mathbf{E}$ is the electric field strength, $\mathbf{D}$ is the electric displacement field, $\rho$ is the electric charge density and $\mathbf{J}$ is the electric current density. The line integrals are taken along any closed loop $L$, and the closed-surface integrals are taken over any closed surface $S$.

For the magneto-static case, the integral equations reduce to:

$$\oint_{L} \mathbf{H} \cdot d\mathbf{l} = \iint_{S} \mathbf{J} \cdot d\mathbf{S}, \qquad \oiint_{S} \mathbf{B} \cdot d\mathbf{S} = 0 \tag{9}$$

with the relation between $\mathbf{B}$ and $\mathbf{H}$ given by the permeability $\mu$ of the material at the given point in space: $\mathbf{B} = \mu \mathbf{H}$. In general, the value of $\mu$ for a given material is a function of $\mathbf{H}$ (the material's $B$–$H$ curve).

In the 2D case that we consider here, we can further rewrite Eq. (9) as:

$$\oint_{L} \mathbf{H} \cdot d\mathbf{l} = I_{enc}, \qquad \oint_{L} \mathbf{B} \cdot d\mathbf{n} = 0 \tag{10}$$

where $I_{enc}$ is the total current enclosed by the closed loop $L$, $d\mathbf{l}$ is the loop integration element parallel to the integration line, and $d\mathbf{n}$ is an element perpendicular to $d\mathbf{l}$.

The integrals in Eq. (10) have to be fulfilled for any closed loop. A single loop integral is not enough to uniquely determine the field; however, if enough random loops are considered in the given domain, they are enough to find the solution. This makes the problem similar to the NeRF problem, where a single ray traced through the scene is not enough to determine the 3D radiance field, but many rays combined are.

This conceptual similarity between NeRF ray integration and the integral form of Maxwell magneto-static equations is the basis of our method, which allows us to quickly train a neural network to find a solution to any PDE as long as it can be expressed in integral form.

5.1 Method

Our goal is to find the field $\mathbf{H}$ (and hence $\mathbf{B}$) for any point in the domain of interest, assuming that we know the material properties ($\mu$) and the current distribution ($\mathbf{J}$). To do so, we introduce a neural network that maps input coordinates $(x, y)$ to the field value. The architecture of the network and any additional encoding of the inputs can be chosen to best suit the problem at hand. We chose to use a fully connected architecture, with the multi-resolution hash input encoding from Müller et al. [encoding:instant_ngp].

Figure 1: Test problem setup. Iron core electromagnet (light blue), surrounded by air. Force is provided by a wire (red square) with current density $J_c$. Black lines show example integration paths.

The training loss function is constructed by calculating the integrals from Equations (10) along many randomly chosen closed paths within the spatial domain of interest. For implementation simplicity we have used integration paths that are either squares or circles of random dimensions, see Fig 1. For a single path L, the loss function is:

$$\mathcal{L}_L(\theta) = \left( \oint_L \mathbf{H}_{net}(\mathbf{x}; \theta) \cdot d\mathbf{l} - I_{enc} \right)^2 + \left( \oint_L \mathbf{B}_{net}(\mathbf{x}; \theta) \cdot d\mathbf{n} \right)^2 \tag{11}$$

where $\theta$ are the trainable parameters of the neural network, and a training batch consists of multiple integration paths.

By training the network with the above loss function on enough different integration paths, the network will learn the solution to Eq. (10) for any point within the domain.
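A minimal sketch of how such a loop loss could be evaluated for circular paths follows; `h_net`, `mu_fn` and `enclosed_current` are hypothetical helpers, and the discretisation is an assumption rather than the authors' implementation:

```python
import torch

def circle_loop(center, radius, n_pts=64):
    """Discretise a circular closed path into points, tangential (dl) and
    perpendicular (dn) segment elements."""
    ang = torch.linspace(0.0, 2 * torch.pi, n_pts + 1)[:-1]
    pts = center + radius * torch.stack([torch.cos(ang), torch.sin(ang)], -1)
    dl = pts.roll(-1, dims=0) - pts              # segment vectors along the loop
    dn = torch.stack([dl[:, 1], -dl[:, 0]], -1)  # elements perpendicular to dl
    return pts, dl, dn

def loop_loss(h_net, mu_fn, enclosed_current, center, radius):
    """Eq. (11) for one loop: (Ampere residual)^2 + (flux residual)^2."""
    pts, dl, dn = circle_loop(center, radius)
    H = h_net(pts)                 # assumed to return shape (n_pts, 2)
    B = mu_fn(pts, H) * H          # B = mu * H; mu_fn assumed to return (n_pts, 1)
    ampere = ((H * dl).sum() - enclosed_current(center, radius)) ** 2
    gauss = (B * dn).sum() ** 2    # net flux through the closed loop
    return ampere + gauss
```

A training batch would sum `loop_loss` over many randomly drawn centres and radii, mirroring how a NeRF batch sums the colour loss over many random rays.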

6 Results

We have tested our method on a simple magneto-static setup of a "horse-shoe" electromagnet. This setup consists of a C-shaped iron core (light blue in Fig. 1), surrounded by air. The force component is provided by a wire with current density $J_c$ (red square in Fig. 1).

To model the variable $\mu$, we use a simple piecewise-linear approximation to the true $B$–$H$ curve. We use two values of $\mu$, one when the iron is saturated with magnetic flux and one when it is not saturated, and we determine the saturation level by sampling the neural network to estimate the local flux density.
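For illustration only, such a two-valued permeability could be implemented as below; the saturation threshold and relative-permeability values are placeholders, not the paper's constants:

```python
import torch

MU0 = 4e-7 * torch.pi   # vacuum permeability

def piecewise_mu(b_mag, mu_r_unsat=4000.0, mu_r_sat=100.0, b_sat=1.5):
    """Two-valued permeability: high relative permeability below the
    saturation flux density b_sat (tesla), low above it. All values
    are illustrative placeholders."""
    mu_r = torch.where(b_mag < b_sat,
                       torch.full_like(b_mag, mu_r_unsat),
                       torch.full_like(b_mag, mu_r_sat))
    return MU0 * mu_r
```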

(a) Magnetic flux density obtained with our method.
(b) Reference magnetic flux density obtained by FEM solution (Elmer).
Figure 2: Comparison of the magnetic flux magnitude from our method against the reference solution from the FEM solver (Elmer). Blue corresponds to no flux, red to maximum flux (intermediate colour mappings have not been aligned).
Figure 3: Corner zoomed comparison between our solution (left) and reference (right). Both solutions show high flux density at the inner corner (red circle) and low flux at the outer corner (black circle).
Figure 4: Airgap zoomed comparison between our solution (left) and reference (right). Both solutions show high flux density at the airgap iron corners (red circles) and lower flux in the middle of the gap (black circle).

Figure 2 shows the magnitude of the magnetic flux B obtained by our method compared to the reference solution from the Elmer FEM solver (https://github.com/ElmerCSC/elmerfem). Please note that the electric current driving the two cases is slightly different – for the FEM there is a current density over a coil surface, whereas for our solution we have used a single wire. The singularity associated with a point wire has been removed from Fig. 2(a).

Thanks to the same input encoding that sped up NeRF training by orders of magnitude (Müller et al. [encoding:instant_ngp]), the solution shown in Fig. 2(a) was obtained after only about 30 seconds of training on a high-end laptop. This is considerably faster than existing PINN methods, but slower than standard FEM approaches. However, we believe that neural approaches will ultimately be more flexible, as they can more naturally be extended to accept design-space parameters as inputs and, once trained, will be able to instantly generate solutions for new designs.

Our solution matches the reference solution very closely, including key critical properties of the magnetic flux density in this setting. Namely:

  1. The flux “cuts corners” – the flux density is high close to the inner corners of the iron core and very low at the outer corners – see Fig 3.

  2. The flux distribution in the airgap has the characteristic v-shape, with higher density at the corners of the iron – see Fig 4.

  3. Generally all the flux is contained within the iron.

7 Conclusions

By transforming the differential form of Maxwell's equations for magneto-statics (Eq. 7) to a closed-loop integral form (Eq. 10), and by applying the input encoding from the latest NeRF breakthrough paper (Müller et al. [encoding:instant_ngp]), we managed to train a neural network to solve the magneto-static problem orders of magnitude faster than with the classical PINN approach (i.e. training with the differential-form loss function of Eq. (3)). Our method can be used whenever a PDE (or other type of equation) can be transformed to a line integral (closed or not). An extension to 3D would require changing the integration lines into surfaces, but the rest of the methodology would remain the same. The integral form of the equations helps deal with numerical instabilities and at the same time allows us to take advantage of the novel input space encoding introduced in NeRF research.

7.1 Future research

An interesting next step would be to train the network using a parametrization of the input geometries (i.e. iron shapes) and current densities. This would lead to a PDE solution that generalizes across a set of geometries/currents and would therefore allow for usage in optimization routines as an extremely fast approximation to a FEM solution.

Acknowledgement

This work has been carried out as part of Monumo Ltd. research and development efforts.