A nonlocal physics-informed deep learning framework using the peridynamic differential operator

05/31/2020
by Ehsan Haghighat, et al.

The Physics-Informed Neural Network (PINN) framework introduced recently incorporates physics into deep learning, and offers a promising avenue for the solution of partial differential equations (PDEs) as well as identification of the equation parameters. The performance of existing PINN approaches, however, may degrade in the presence of sharp gradients, as a result of the inability of the network to capture the solution behavior globally. We posit that this shortcoming may be remedied by introducing long-range (nonlocal) interactions into the network's input, in addition to the short-range (local) space and time variables. Following this ansatz, here we develop a nonlocal PINN approach using the Peridynamic Differential Operator (PDDO)—a numerical method which incorporates long-range interactions and removes spatial derivatives in the governing equations. Because the PDDO functions can be readily incorporated in the neural network architecture, the nonlocality does not degrade the performance of modern deep-learning algorithms. We apply nonlocal PDDO-PINN to the solution and identification of material parameters in solid mechanics and, specifically, to elastoplastic deformation in a domain subjected to indentation by a rigid punch, for which the mixed displacement–traction boundary condition leads to localized deformation and sharp gradients in the solution. We document the superior behavior of nonlocal PINN with respect to local PINN in both solution accuracy and parameter inference, illustrating its potential for simulation and discovery of partial differential equations whose solution develops sharp gradients.



1 Introduction

Deep learning has emerged as a powerful approach to computing-enabled knowledge in many fields Goodfellow et al. (2016), such as image processing and classification Bishop (2006); Krizhevsky et al. (2012); LeCun et al. (2015), search and recommender systems Jannach et al. (2010); Zhang et al. (2019), speech recognition Graves et al. (2013), autonomous driving Bojarski et al. (2016), and healthcare Miotto et al. (2018). The particular needs of each application have, collectively, led to many different neural-network architectures, including deep neural networks (DNN), convolutional neural networks (CNN), and recurrent neural networks (RNN) with variants such as long short-term memory (LSTM) networks. Some of these frameworks have also been employed for data-driven modeling in computational mechanics Brunton and Kutz (2019), including fluid mechanics and turbulent-flow modeling Brenner et al. (2019), solid mechanics and constitutive modeling Ghaboussi and Sidarta (1998); Kirchdoerfer and Ortiz (2016); Haghighat et al. (2020), and earthquake prediction and detection DeVries et al. (2018); Kong et al. (2018). These efforts have resulted in the availability of open-source deep-learning platforms, including Theano Bergstra et al. (2010), TensorFlow Abadi et al. (2016), and PyTorch Paszke et al. (2019). These software packages are highly efficient and ready to use on different platforms, from mobile devices to massively parallel cloud-based clusters, features that can be inherited in the development of tools for physics-informed deep learning Haghighat and Juanes (2020).

Of particular interest to us are recent applications of deep learning in computational science and engineering, concerning the solution and discovery (identification) of partial differential equations describing various physical systems Han et al. (2018); Bar-Sinai et al. (2019); Rudy et al. (2019); Raissi et al. (2019); Champion et al. (2019); Raissi et al. (2020). Among these applications, a specific framework called Physics-Informed Neural Networks (PINN) Raissi et al. (2019) enables the construction of the solution space using feed-forward neural networks with space and time variables as the network's input. The governing equations are enforced in the loss function using automatic differentiation Baydin et al. (2017). It is a framework that permits solving partial differential equations (PDEs) as well as identifying equation parameters (inversion) from data. Multiple variations of this framework exist, such as variational PINNs Kharazmi et al. (2019) and parareal PINNs Meng et al. (2019), which have been used for physics-informed learning of the Burgers equation, the Navier–Stokes equations, and the Schrödinger equation.

Recently, PINN has been applied to inversion and discovery in solid mechanics Haghighat et al. (2020). While the method provides accurate and robust reconstructions and parameter estimates when the solution is smooth, its performance degrades in the presence of sharp gradients in the strain or stress fields. Near-discontinuities in the solution can emerge for several reasons, including shear-band localization, crack propagation, and the presence of "mixed" displacement–traction boundary conditions. In the latter case, the point at which the boundary condition changes type often gives rise to stress concentration or even a stress singularity. In these cases, existing PINN approaches are much less accurate as a result of the inability of the network to capture the solution behavior globally. We posit that this shortcoming may be remedied by introducing long-range (nonlocal) interactions into the network's input, in addition to the short-range (local) space and time variables.

Here, we propose to use the Peridynamic Differential Operator (PDDO) Madenci et al. (2016, 2019) to construct nonlocal neural networks with long-range interactions. Peridynamics, a nonlocal theory, was first introduced as an alternative to the local classical continuum mechanics to incorporate long-range interactions and to remove spatial derivatives in the governing equations Silling (2000); Silling and Askari (2005); Silling et al. (2007); Silling and Lehoucq (2008). It has been shown to be well suited to model crack initiation and propagation Madenci and Oterkus (2014). It has also been shown that the peridynamic governing equations can be derived by replacing the local spatial derivatives in the Navier displacement equilibrium equations with their nonlocal representation using PDDO Madenci et al. (2016, 2017, 2019). PDDO has an analytical form in terms of spatial integrals for a point with a symmetric interaction domain or support region. The PD functions can be easily incorporated into the nonlocal physics-informed deep learning framework: they are generated in discrete form during the preprocessing phase, and therefore they do not interfere with the deep-learning architectures, keeping their performance intact.

The outline of the paper is as follows. In Section 2 we give a brief overview of the established (local) PINN framework, and its application to solid mechanics problems. In Section 3 we propose and describe an extension of the local (short-range) PINN to a nonlocal (long-range) PINN framework using PDDO. In Section 4 we present the application of both local and nonlocal PINN to a representative example of elastoplastic deformation, corresponding to the indentation of a body by a rigid punch—an example that illustrates the effects of sharp gradients as a result of mixed displacement–traction boundary conditions. Finally, in Section 5 we discuss the results and summarize the main conclusions.

2 Physics-Informed Deep Learning in Solid Mechanics

In this section we provide a brief overview of the established (local) PINN framework Raissi et al. (2019), and its application to forward modeling and parameter identification in solid mechanics, as described by elastoplasticity.

2.1 Basics of the PINN framework

In the PINN framework Raissi et al. (2019), the solution space is constructed by a deep neural network with the independent variables (e.g., coordinates $\mathbf{x}$) as the network inputs. In this feed-forward network, each layer outputs data as inputs for the next layer through nested transformations. Corresponding to the vector of input variables $\mathbf{x}$, the output values $\mathbf{y}$ can be mathematically expressed as

$$\mathbf{z}^0 = \mathbf{x}, \qquad \mathbf{z}^l = \text{actf}\!\left(\mathbf{W}^l \mathbf{z}^{l-1} + \mathbf{b}^l\right) \ \text{ for } l = 1, \dots, L-1, \qquad \mathbf{y} = \mathbf{W}^L \mathbf{z}^{L-1} + \mathbf{b}^L, \tag{2.1}$$

where $\mathbf{x}$, $\mathbf{y}$ and $\mathbf{z}^l$ represent the inputs, final outputs and hidden-layer outputs of the network, and $\mathbf{W}^l$ and $\mathbf{b}^l$ represent the weights and biases of layer $l$, respectively. Note that lowercase and capital boldface letters are used to reflect vector and matrix components, while scalars are shown in italic font. The activation function is denoted by actf; it renders the network nonlinear with respect to the inputs.

The 'trained' network can be considered an approximate solution to the governing PDE. It defines a mapping from the inputs to the field variable in the form of a multi-layer deep neural network, i.e., $u \approx \mathcal{N}_u(\mathbf{x}; \mathbf{W}, \mathbf{b})$, with $\mathbf{W}$ and $\mathbf{b}$ representing the set of all network parameters. The network inputs $\mathbf{x}$ can be temporal and spatial variables in reference to a Cartesian coordinate system, i.e., $\mathbf{x} = (x, y)$ in 2D.

In the PINN framework, the physics, described by a partial differential equation $\mathcal{P}u = 0$ with $\mathcal{P}$ as the partial differential operator, is incorporated in the loss or cost function along with the training data as

$$\mathcal{L} = \sum_{\mathbf{x} \in \mathcal{X}} \left| \mathcal{P}u(\mathbf{x}) - \mathcal{P}^{*}(\mathbf{x}) \right|^{2}, \tag{2.2}$$

where $\mathcal{X}$ is the training dataset (which can be inside the domain or on the boundary), and $\mathcal{P}^{*}$ represents the expected (true) value for the differential operation at any given training or sampling point. In all modern implementations of the deep-learning framework, such as Theano Bergstra et al. (2010), TensorFlow Abadi et al. (2016) and MXNet Chen et al. (2015), the partial derivatives in $\mathcal{P}$ can be performed using automatic differentiation (AD) Baydin et al. (2017)—a fundamental aspect of the PINN architecture.
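To make these two ingredients concrete, the following is a minimal sketch (not the authors' implementation) of a local PINN in TensorFlow: a small feed-forward network in the form of Eq. (2.1), and a loss in the spirit of Eq. (2.2) in which a sample operator $\mathcal{P}u = u_{,xx} + u_{,yy}$ is evaluated with AD. The operator and the training arrays are placeholders.

```python
import tensorflow as tf

# A small feed-forward network u(x, y), as in Eq. (2.1): nested affine
# maps with a nonlinear activation ("actf").
model = tf.keras.Sequential(
    [tf.keras.layers.Dense(100, activation="tanh", input_shape=(2,))] +
    [tf.keras.layers.Dense(100, activation="tanh") for _ in range(3)] +
    [tf.keras.layers.Dense(1)]
)

def pde_residual(xy):
    """Evaluate a sample operator P(u) = u_xx + u_yy via automatic differentiation."""
    x, y = xy[:, 0:1], xy[:, 1:2]
    with tf.GradientTape(persistent=True) as t2:
        t2.watch([x, y])
        with tf.GradientTape(persistent=True) as t1:
            t1.watch([x, y])
            u = model(tf.concat([x, y], axis=1))
        u_x = t1.gradient(u, x)
        u_y = t1.gradient(u, y)
    u_xx = t2.gradient(u_x, x)
    u_yy = t2.gradient(u_y, y)
    return u_xx + u_yy

# Loss in the spirit of Eq. (2.2): data misfit plus PDE residual at the
# sampling (collocation) points. xy_data, u_data, xy_coll are hypothetical
# training arrays.
def loss_fn(xy_data, u_data, xy_coll):
    mse = tf.reduce_mean
    return (mse(tf.square(model(xy_data) - u_data)) +
            mse(tf.square(pde_residual(xy_coll))))
```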

Figure 1: Local PINN architecture, defining the mapping from the input coordinates to the output field variables.

Different optimization algorithms exist for training a neural network, including Adagrad Duchi et al. (2011) and Adam Kingma and Ba (2014). Several algorithmic parameters affect the rate of convergence of network training, including batch-size, epochs, shuffle and patience. Batch-size controls the number of samples from a dataset used to evaluate one gradient update. A batch-size of 1 is associated with a full stochastic-gradient-descent optimization. One epoch is one round of training on a dataset. If a dataset is shuffled, then a new round of training (epoch) results in an updated parameter set because the batch gradients are evaluated on different batches. It is common to reshuffle a dataset many times and perform the back-propagation updates.

The optimizer may, however, stop earlier if it finds that new rounds of epochs are not improving the loss function. This situation is described with the keyword patience. It occurs primarily because the loss function is nonconvex; therefore, the training needs to be tested with different starting points and in different directions to build confidence in the parameters evaluated from minimization of the loss function on a given dataset. Patience is the parameter that controls when the optimizer should stop the training. There are three basic strategies to train the network: (1) generate a sufficiently large number of datasets and perform a one-epoch training on each dataset, (2) work on one dataset over many epochs by reshuffling the data, and (3) a combination of these. When dealing with synthetic data, all approaches are feasible. In the original work on PINN Raissi et al. (2019), the first strategy was used to train the model, with datasets being generated at random space locations at each epoch. This strategy, however, is often not applicable in real-world applications, especially when measurements are collected from sensors installed at fixed and limited spatial locations.
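As an illustration, the sketch below shows how batch-size, epochs, shuffling and patience map onto a standard Keras training call, here for a plain data-fitting loss for brevity; the model and the arrays xy_data, u_data are the hypothetical ones from the previous sketch.

```python
from tensorflow import keras

# Stop training when the loss has not improved over `patience`
# consecutive epochs (strategy 2: many epochs on one reshuffled dataset).
early_stop = keras.callbacks.EarlyStopping(monitor="loss", patience=100)

model.compile(optimizer=keras.optimizers.Adam(learning_rate=5e-4), loss="mse")
model.fit(xy_data, u_data,
          batch_size=64,   # samples per gradient update
          epochs=20000,    # rounds of training over the dataset
          shuffle=True,    # reshuffle the dataset before each epoch
          callbacks=[early_stop])
```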

In this paper, we rely on SciANN Haghighat and Juanes (2020), a recent implementation of PINN as a high-level Keras Chollet (2015) wrapper for physics-informed deep learning and scientific computations. Experimenting with all of the previously mentioned network choices can be done easily, with minimal coding, in SciANN Haghighat and Juanes (2020); Haghighat et al. (2020).
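For orientation, a schematic SciANN usage pattern is sketched below, following the examples in Haghighat and Juanes (2020); the exact API may differ across versions, and the operator and the data arrays (x_data, y_data, u_data) are placeholders.

```python
import sciann as sn

# Network inputs and a 4x100 tanh network for the field variable.
x = sn.Variable('x')
y = sn.Variable('y')
u = sn.Functional('u', [x, y], 4 * [100], 'tanh')

# A PDE residual built by symbolic differentiation of the network;
# a Laplace operator stands in for the mechanics equations here.
L = sn.diff(u, x, order=2) + sn.diff(u, y, order=2)

# The SciModel combines data targets and PDE targets into one loss.
m = sn.SciModel([x, y], [u, L])
m.train([x_data, y_data], [u_data, 'zeros'], batch_size=64, epochs=20000)
```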

2.2 Solid mechanics with elastoplastic deformation

In the absence of body forces and neglecting inertia, the balance of linear momentum takes the form

$$\sigma_{ij,j} = 0, \tag{2.3}$$

for $i, j = x, y$, where $\sigma_{ij}$ is the Cauchy stress tensor, the subscript after a comma denotes differentiation, and a repeated subscript implies summation.

The linear elastic response of the material can be described by the stress–strain relations as

$$\sigma_{ij} = s_{ij} + p\,\delta_{ij}, \tag{2.4}$$

where the pressure or volumetric stress is

$$p = \tfrac{1}{3}\sigma_{kk} = \left(\lambda + \tfrac{2}{3}\mu\right)\varepsilon_{kk}, \tag{2.5}$$

and the deviatoric stress tensor is

$$s_{ij} = 2\mu\, e_{ij}, \tag{2.6}$$

in which

$$e_{ij} = \varepsilon_{ij} - \tfrac{1}{3}\varepsilon_{kk}\,\delta_{ij}, \tag{2.7}$$

with the strain tensor defined as

$$\varepsilon_{ij} = \tfrac{1}{2}\left(u_{i,j} + u_{j,i}\right), \tag{2.8}$$

where $u_i$ are the components of the displacement field, $\lambda$ and $\mu$ are the Lamé constants, and $\delta_{ij}$ is the Kronecker delta.

The nonlinear material response follows the classical description of elastoplastic behavior Simo and Hughes (1998). In particular, we adopt the von Mises flow theory with a yield surface defined as

$$f = \sigma_e - \left(\sigma_Y + H\,\bar{\varepsilon}^p\right), \tag{2.9}$$

in which $\sigma_Y$ is the initial yield stress, $H$ is the work-hardening parameter, $\bar{\varepsilon}^p$ is the equivalent plastic strain, and $\sigma_e$ is the effective stress. Plastic deformation occurs in the direction normal to the yield surface, i.e., $\dot{\varepsilon}^p_{ij} \propto \partial f / \partial \sigma_{ij}$. We decompose the deviatoric strain tensor into its elastic and plastic components,

$$e_{ij} = e^e_{ij} + e^p_{ij}. \tag{2.10}$$

To account for plastic deformation, the equations describing the linear elastic material response, Eq. (2.6), can be rewritten as

$$s_{ij} = 2\mu\left(e_{ij} - e^p_{ij}\right), \tag{2.11}$$

and

$$e^p_{ij} = \frac{3}{2}\,\bar{\varepsilon}^p\,\frac{s_{ij}}{\sigma_e}. \tag{2.12}$$

The effective stress $\sigma_e$ is defined as

$$\sigma_e = \sqrt{3 J_2}, \tag{2.13}$$

with

$$J_2 = \tfrac{1}{2}\, s_{ij} s_{ij}. \tag{2.14}$$

The equivalent plastic strain $\bar{\varepsilon}^p$ can be obtained from Eq. (2.9), Eq. (2.11) and Eq. (2.12) as

$$\bar{\varepsilon}^p = \frac{3\mu\,\bar{e} - \sigma_Y}{3\mu + H} \ge 0, \tag{2.15}$$

where

$$\bar{e} = \sqrt{\tfrac{2}{3}\, e_{ij} e_{ij}}. \tag{2.16}$$

For linear elasticity under plane-strain conditions, the transverse component of strain, $\varepsilon_{zz}$, is identically equal to zero, and the transverse normal component of stress is evaluated as $\sigma_{zz} = \nu\left(\sigma_{xx} + \sigma_{yy}\right)$. For elastoplasticity, however, $\sigma_{zz}$ is not predefined, while $\varepsilon_{zz}$ remains identically equal to zero.
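As a concrete rendering of these relations, the NumPy sketch below evaluates the elastic split of Eqs. (2.4)–(2.8) and the return-mapping estimate of the equivalent plastic strain of Eq. (2.15) at a single material point; it is a minimal sketch under the equations as reconstructed above, with placeholder parameter values.

```python
import numpy as np

def stress_update(eps, lam, mu, sig_y, H):
    """Stress and equivalent plastic strain for one 3x3 strain tensor
    under von Mises plasticity with linear hardening (Eqs. 2.4-2.16)."""
    I = np.eye(3)
    eps_v = np.trace(eps)                      # volumetric strain
    e = eps - eps_v / 3.0 * I                  # deviatoric strain, Eq. (2.7)
    p = (lam + 2.0 * mu / 3.0) * eps_v         # volumetric stress, Eq. (2.5)

    s_trial = 2.0 * mu * e                     # elastic trial stress, Eq. (2.6)
    sig_e_trial = np.sqrt(1.5 * np.sum(s_trial * s_trial))  # Eqs. (2.13)-(2.14)

    # Equivalent plastic strain, Eq. (2.15); zero inside the yield surface.
    eps_p_eq = max(0.0, (sig_e_trial - sig_y) / (3.0 * mu + H))

    # Radial return: Eqs. (2.11)-(2.12) scale the trial deviatoric stress.
    scale = 1.0 - 3.0 * mu * eps_p_eq / sig_e_trial if eps_p_eq > 0 else 1.0
    sigma = scale * s_trial + p * I            # total stress, Eq. (2.4)
    return sigma, eps_p_eq

# Example with placeholder parameters (Pa) and a uniaxial-type strain state.
eps = np.diag([2e-3, -0.5e-3, -0.5e-3])
sigma, ep = stress_update(eps, lam=19.4e9, mu=29.2e9, sig_y=243e6, H=1e9)
```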

2.3 Local PINN for elastoplasticity

Here we apply the PINN framework to the solution and inference of two-dimensional quasi-static mechanics. The input variables to the feed-forward neural network are the coordinates $x$ and $y$, and the output variables are the components of the displacement, $u_x$, $u_y$, the strain tensor, $\varepsilon_{xx}$, $\varepsilon_{yy}$, $\varepsilon_{xy}$, and the stress tensor, $\sigma_{xx}$, $\sigma_{yy}$, $\sigma_{xy}$. We define the loss function for linear elasticity as:

$$\begin{aligned} \mathcal{L} &= \sum_{\mathbf{x}\in\mathcal{X}_{u_i}} \left|u_i - u_i^{*}\right|^2 + \sum_{\mathbf{x}\in\mathcal{X}_{\varepsilon_{ij}}} \left|\varepsilon_{ij} - \varepsilon_{ij}^{*}\right|^2 + \sum_{\mathbf{x}\in\mathcal{X}_{\sigma_{ij}}} \left|\sigma_{ij} - \sigma_{ij}^{*}\right|^2 \\ &\quad + \sum_{\mathbf{x}\in\mathcal{X}} \left|\sigma_{ij,j}\right|^2 + \sum_{\mathbf{x}\in\mathcal{X}} \left|\sigma_{ij} - s_{ij} - p\,\delta_{ij}\right|^2, \end{aligned} \tag{2.17}$$

where plain and starred quantities refer to predicted and true values, respectively. The set $\mathcal{X}$ contains all sampling nodes. The set $\mathcal{X}_{v}$ contains all sampling nodes for variable $v$ where actual data exist. The terms in the loss function represent measures of the error in the displacement, strain and stress fields, the equilibrium equations, and the constitutive relations.

Similarly, the loss function for elastoplasticity takes the same form, with the elastic constitutive terms replaced by the elastoplastic relations of Eqs. (2.11)–(2.16):

$$\begin{aligned} \mathcal{L} &= \sum_{\mathbf{x}\in\mathcal{X}_{u_i}} \left|u_i - u_i^{*}\right|^2 + \sum_{\mathbf{x}\in\mathcal{X}_{\varepsilon_{ij}}} \left|\varepsilon_{ij} - \varepsilon_{ij}^{*}\right|^2 + \sum_{\mathbf{x}\in\mathcal{X}_{\sigma_{ij}}} \left|\sigma_{ij} - \sigma_{ij}^{*}\right|^2 \\ &\quad + \sum_{\mathbf{x}\in\mathcal{X}} \left|\sigma_{ij,j}\right|^2 + \sum_{\mathbf{x}\in\mathcal{X}} \left|\sigma_{ij} - 2\mu\left(e_{ij} - e^p_{ij}\right) - p\,\delta_{ij}\right|^2. \end{aligned} \tag{2.18}$$

These loss functions are used for the deep-learning-based solution of the governing PDEs as well as for identification of the model parameters. The constitutive relations and governing equations are tested at all sampling (collocation) points, while data can be selectively imposed. For the solution of the governing PDEs, the material parameters are treated as constant values in the network. During model identification, however, they are treated as network parameters, which change during the training phase (see Fig. 1). TensorFlow Abadi et al. (2016) permits such variables to be defined as Constant (PDE solution) or Variable (parameter identification) objects, respectively.
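This distinction can be sketched directly in TensorFlow; the names and values below are illustrative.

```python
import tensorflow as tf

# Forward problem: the material parameter enters the loss as a constant.
lam = tf.constant(19.4e9)  # placeholder value

# Inverse problem: the same parameter becomes a trainable variable that
# the optimizer updates alongside the network weights, so gradients of
# the constitutive terms of the loss flow into the material parameter.
lam_hat = tf.Variable(10.0e9, trainable=True, name="lambda")
```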

3 Nonlocal PINN Architecture with the Peridynamics Differential Operator

Here we propose and describe an extension of the local (short-range) PINN with a single input $\mathbf{x}$ to a nonlocal neural network that employs input variables in the form of the family members of point $\mathbf{x}$, defined as $H_{\mathbf{x}} = \{\mathbf{x}' : |\mathbf{x}' - \mathbf{x}| \le \delta\}$. Each point has its own unique family in its domain of interaction (an area in two-dimensional analysis). Given the relative position with reference to point $\mathbf{x}$, $\boldsymbol{\xi} = \mathbf{x}' - \mathbf{x}$, the nondimensional weight function $w(|\boldsymbol{\xi}|)$ represents the degree of interaction between the material points in each family. We define it as:

$$w(|\boldsymbol{\xi}|) = e^{-4|\boldsymbol{\xi}|^2/\delta^2}, \tag{3.1}$$

where the parameter $\delta$, referred to as the horizon, defines the extent of the interaction domain (long-range interactions). In discrete form, the family members of point $\mathbf{x}_k$ are denoted as $\mathbf{x}_{k(j)}$, and their relative positions are defined as $\boldsymbol{\xi}_{k(j)} = \mathbf{x}_{k(j)} - \mathbf{x}_k$.
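In discrete form, the family bookkeeping amounts to a neighbor search plus a weight evaluation. The sketch below does this on a uniform grid, using a square horizon (as used later for the PDDO family) for membership; it is a minimal sketch with a quadratic-cost search for clarity only, and the Gaussian weight follows the form assumed in Eq. (3.1).

```python
import numpy as np

def build_families(points, delta):
    """For each point, indices of family members within the horizon and
    the nondimensional weights w(|xi|) of Eq. (3.1)."""
    families, weights = [], []
    for xk in points:
        xi = points - xk                                   # relative positions
        inside = np.max(np.abs(xi), axis=1) <= delta + 1e-12  # square horizon
        members = np.where(inside)[0]
        r = np.linalg.norm(xi[members], axis=1)
        weights.append(np.exp(-4.0 * (r / delta) ** 2))    # Eq. (3.1)
        families.append(members)
    return families, weights

# Example: uniform grid with spacing dx and horizon delta = 3*dx, giving
# a 7x7 = 49-member family for interior points.
dx = 1.0 / 100
xs = np.arange(0.0, 1.0 + dx / 2, dx)
X, Y = np.meshgrid(xs, xs)
pts = np.column_stack([X.ravel(), Y.ravel()])
families, weights = build_families(pts, delta=3 * dx)
```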

Figure 2: Interaction domain for point $\mathbf{x}$, with points $\mathbf{x}'$ in its family.

Silling Silling (2000) and Silling et al. Silling and Askari (2005) introduced the peridynamic (PD) theory for failure initiation and growth in materials under complex loading conditions. Recently, Madenci et al. Madenci et al. (2016, 2019) introduced the Peridynamic Differential Operator (PDDO) to approximate the nonlocal representation of any function, such as a scalar field $f(\mathbf{x})$, and its derivatives at point $\mathbf{x}$, by accounting for the effect of its interactions with the other points $\mathbf{x}'$ in the domain of interaction (Fig. 2).

The derivation of PDDO employs Taylor series expansion (TSE), weighted integration and orthogonality (see A). The major difference between PDDO and other existing local and nonlocal numerical differentiation methods is that PDDO leads to analytical expressions for arbitrary-order derivatives in integral form for a point with a symmetric location in a circle. These analytical expressions, when substituted into the Navier displacement equilibrium equation, allow one to recover the PD equation of motion derived by Silling Silling (2000), which was based on the balance laws of continuum mechanics. The spatial integration can be performed numerically with simple quadrature techniques. As shown by Madenci et al. Madenci et al. (2016, 2019, 2017), PDDO provides accurate evaluation of derivatives in the interior as well as near the boundaries of the domain.

The nonlocal PD representation of a function $f$ and its derivatives can be expressed in continuous and discrete forms as

$$f(\mathbf{x}) = \int_{H_{\mathbf{x}}} f(\mathbf{x}+\boldsymbol{\xi})\, g_2^{00}(\boldsymbol{\xi})\, dA_{\boldsymbol{\xi}}, \tag{3.2}$$

$$\frac{\partial^{p+q} f(\mathbf{x})}{\partial x^{p}\, \partial y^{q}} = \int_{H_{\mathbf{x}}} f(\mathbf{x}+\boldsymbol{\xi})\, g_2^{pq}(\boldsymbol{\xi})\, dA_{\boldsymbol{\xi}}, \tag{3.3}$$

$$\left.\frac{\partial^{p+q} f}{\partial x^{p}\, \partial y^{q}}\right|_{\mathbf{x}_k} \simeq \sum_{j} f\!\left(\mathbf{x}_{k(j)}\right) g_2^{pq}\!\left(\boldsymbol{\xi}_{k(j)}\right) A_{k(j)}, \tag{3.4}$$

where the $g_2^{pq}(\boldsymbol{\xi})$ with $p, q = 0, 1, 2$ and $p + q \le 2$ represent the PD functions obtained by enforcing the orthogonality condition of PDDO Madenci et al. (2016, 2019), and the integration is performed over the interaction domain $H_{\mathbf{x}}$. The subscript $k(j)$ reflects the discrete value at the $j$-th member, $\mathbf{x}_{k(j)}$, of the family of point $\mathbf{x}_k$, and $A_{k(j)}$ is the area associated with that member.

A nonlocal neural network for a point $\mathbf{x}_k$ and its family members can then be expressed as

$$u_{k(j)} = \mathcal{N}_u\!\left(\mathbf{x}_{k(j)}; \mathbf{W}, \mathbf{b}\right). \tag{3.5}$$

This network maps $\mathbf{x}_k$ and its family members $\mathbf{x}_{k(j)}$ to the corresponding values of the field variable, i.e., $\mathbf{u}_k = \left(u_{k(1)}, \dots, u_{k(N)}\right)$. With these output values, the nonlocal value of the field and its derivatives can be readily constructed as

$$\left.\frac{\partial^{p+q} u}{\partial x^{p}\, \partial y^{q}}\right|_{\mathbf{x}_k} = \mathbf{g}_2^{pq} \cdot \mathbf{u}_k, \tag{3.6}$$

where $\mathbf{g}_2^{pq} = \left(g_2^{pq}(\boldsymbol{\xi}_{k(1)})\, A_{k(1)}, \dots, g_2^{pq}(\boldsymbol{\xi}_{k(N)})\, A_{k(N)}\right)$. Here, the summation over discrete family points in Eq. (3.4) is expressed as a dot product. Note that if the horizon $\delta$ in the influence function (3.1) approaches zero, the family of $\mathbf{x}_k$ collapses onto the point itself, the field and its derivatives reduce to their local values, and we recover the local PINN architecture in Fig. 1.
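In code, Eq. (3.6) is a handful of dot products between precomputed PD-function vectors and the network outputs at the family points; a minimal sketch with hypothetical arrays:

```python
import numpy as np

# u_fam: network outputs at the N family points of x_k, shape (N,).
# g00, g10, g01: hypothetical precomputed PD functions g2^{00}, g2^{10},
# g2^{01} at the family points, premultiplied by the member areas A_k(j).
def pd_value_and_gradient(u_fam, g00, g10, g01):
    """Nonlocal field value and first derivatives at x_k, Eq. (3.6)."""
    u = np.dot(g00, u_fam)     # u(x_k)
    u_x = np.dot(g10, u_fam)   # du/dx at x_k
    u_y = np.dot(g01, u_fam)   # du/dy at x_k
    return u, u_x, u_y
```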

Figure 3: A nonlocal PDDO-PINN network architecture for the approximation of a function $u$.

In our applications of PINN for solution and parameter inference, we make the distinction between two cases:

  1. We can use PDDO to approximate only the function, and use automatic differentiation (AD) to evaluate the derivatives. We refer to this case as AD-PDDO-PINN.

  2. We can instead use PDDO to approximate the function as well as its derivatives, instead of evaluating them from the network. We refer to this case as PDDO-PINN.

As we will see, the use of PDDO-PINN enables the use of activation functions (such as ReLU) and network architectures that cannot be used with local PINN—a capability that may lead to increased computational efficiency (per epoch), since it does not rely on the extended graph computations associated with AD.

4 Representative Example: Indentation of an Elastic or Elastoplastic Body

In this section, we apply the different PINN formalisms to the solution and inference of a solid mechanics problem described by plane-strain elastoplastic deformation. The problem is designed to reflect a common scenario in boundary value problems: the presence of mixed boundary conditions, in which part of the boundary is subject to Dirichlet (displacement) boundary conditions, while part of the boundary is subject to Neumann (traction) boundary conditions. The sharp transition in type of boundary condition often leads to stress concentration (and sometimes a stress singularity). The problem we study here—indentation of an elastoplastic body—is subject to this stress-concentration phenomenon, which poses a significant challenge to the application of existing deep-learning techniques.

4.1 Problem description

We simulate the deformation of a square domain under plane-strain conditions, as a result of indentation by a rigid punch (Fig. 4). The body is constrained by roller supports along the lateral boundaries, and subject to zero displacement along the bottom boundary. The dimensions of the domain are  m, with thickness  m. The rigid punch, of width  m, indents the top boundary of the body with a prescribed vertical displacement of  mm. These boundary conditions can be expressed as

$$u_x = 0, \quad \sigma_{xy} = 0 \qquad \text{on the left boundary}, \tag{4.1a,b}$$
$$u_x = 0, \quad \sigma_{xy} = 0 \qquad \text{on the right boundary}, \tag{4.1c,d}$$
$$u_x = 0, \quad u_y = 0 \qquad \text{on the bottom boundary}, \tag{4.1e,f}$$
$$u_y = -\bar{u} \ \text{ under the punch}, \qquad \sigma_{yy} = \sigma_{xy} = 0 \ \text{ on the rest of the top boundary}, \tag{4.1g,h}$$

where $\bar{u}$ denotes the prescribed indentation.

The material exhibits elastic or elastic-plastic deformation with strain hardening. The elastic modulus $E$, Poisson's ratio $\nu$, yield stress $\sigma_Y$ and hardening parameter $H$ of the material are specified as  GPa, ,  GPa and  GPa, respectively. The Lamé elastic constants $\lambda$ and $\mu$, therefore, have values  GPa and  GPa.

Figure 4: A square elastoplastic body under plane-strain conditions, subject to a prescribed displacement on a portion of the top boundary via indentation by a rigid punch.

To generate synthetic (simulated) data to be used in the deep-learning frameworks, we simulate the problem described above with the finite element method using COMSOL COMSOL (2020). The domain is discretized with a uniform mesh of elements with quartic Lagrange polynomials.

The simulated displacement, strain, and stress fields are computed for a purely linear elastic response (Fig. 5) and for elastic-plastic deformation (Fig. 6). It is apparent that the distributions of the strain and stress components for the elastoplastic case are significantly different from those of the elastic case, with more localized deformation underneath the rigid punch. As expected, the plastic-strain components are zero in most of the domain, except in the vicinity of the corners of the punch, where they exhibit sharp gradients—a feature that, as we will see, poses a challenge for the approximation of the solution with a neural network.

Figure 5: FEM reference solution for displacement, strain, and stress components in the case of purely linear elastic deformation.
Figure 6: FEM reference solution for displacement, strain, and stress components in the case of elastic-plastic deformation.

4.2 Local PINN results

We first apply the established (local) PINN framework to the solution and parameter identification of the indentation problem described above Raissi et al. (2019); Haghighat et al. (2020). Training of the neural networks is performed with 10,000 training points (nodal values of the FEM solution). The convergence of the network training is sensitive to the choice of data normalization and network size. After a trial-and-error exploration, we selected the network architectures and parameters that led to the lowest value of the loss function and the highest accuracy of the physical model parameters. The selected feed-forward neural network has 4 hidden layers, each with 100 neuron units, and employs the hyperbolic-tangent activation function between layers. We adopt batch training with a total number of 20,000 epochs and a batch-size of 64. We use the Adam optimizer with a learning rate initialized to 0.0005 and decreased gradually to 0.000001 for the last epoch.
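The gradual learning-rate reduction can be reproduced with a standard Keras callback; the geometric decay below takes the rate from 5e-4 to 1e-6 over 20,000 epochs and is one plausible realization of the schedule described above.

```python
from tensorflow import keras

n_epochs, lr0, lr_end = 20000, 5e-4, 1e-6
decay = (lr_end / lr0) ** (1.0 / n_epochs)   # constant per-epoch factor

# Geometric decay from lr0 at epoch 0 toward lr_end at the last epoch.
lr_schedule = keras.callbacks.LearningRateScheduler(
    lambda epoch: lr0 * decay ** epoch)

# model.fit(..., epochs=n_epochs, batch_size=64, callbacks=[lr_schedule])
```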

The local PINN predictions for elastic deformation do capture the high-gradient regions near the corners of the punch, but they are significantly diffused. The differences between the local PINN predictions and the true data are shown in Fig. 7. The Lamé coefficients identified by the network are  GPa and  GPa—an error of less than 3% (Fig. 8).

Figure 7: Difference between the local PINN predictions and the true data for displacement, strain, and stress components in the case of purely linear elastic deformation.


Figure 8: Local PINN predictions of the material parameters $\lambda$ and $\mu$ in the case of purely linear elastic deformation. White color indicates the true values of the parameters.

In the case of elastic-plastic deformation, $\sigma_{zz}$ depends on the plastic-strain components. Thus, the PINN architecture is defined with networks for the displacement components and for the stress components, including $\sigma_{zz}$. The error between the local PINN predictions and the true data is shown in Fig. 9. In contrast with the local PINN predictions for the elastic case, the predictions for this case show poor quantitative agreement with the exact solution. The material parameters identified by the method are:  GPa,  GPa,  GPa and  GPa. While the elastic Lamé coefficients and the yield stress are identified accurately, the method fails to identify the hardening parameter $H$ (Fig. 10). We speculate that this is due to the localized plastic deformation in narrow regions in the vicinity of the corners of the rigid punch (Fig. 6): very few sampling points contribute to the corresponding terms of the loss function in the local PINN network.

Figure 9: Difference between the local PINN predictions and the true data for displacement, strain, and stress components in the case of elastic-plastic deformation.


Figure 10: Local PINN predictions of the material parameters $\lambda$, $\mu$, $\sigma_Y$ and $H$ in the case of elastic-plastic deformation. White color indicates the true values of the parameters.

4.3 Nonlocal PINN results

Given the relative success of the local PINN framework, but also the challenges faced in capturing the sharp gradients in the solution of the elastoplastic deformation problem, here we investigate the application of nonlocal PINN to this problem.

The selected feed-forward neural networks are identical to those used in local PINN: they have 4 hidden layers, each with 100 neuron units, and hyperbolic-tangent activation functions. In the construction of the PD functions $g_2^{pq}$, the TSE is truncated after the second-order derivatives ($N = 2$, see A). The number of family members for each point depends on the order of approximation in the TSE; it is 7 points in each dimension, resulting in 49 points for a square horizon in 2D Madenci et al. (2019). Therefore, we choose a maximum number of 49 members as the nonlocal input features. Depending on the location of $\mathbf{x}_k$, the influence (degree of interaction) of some of these points (family members) might be zero. However, they are all incorporated in the construction of the nonlocal neural network to simplify the implementation procedure.

In what follows we present the results for both AD-PDDO-PINN and PDDO-PINN architectures to the indentation problem with elastic-plastic deformation.

4.3.1 AD-PDDO-PINN

The nonlocal deep neural network described by Eq. (3.5) is employed to construct approximations for the displacement and stress variables. They are evaluated as

$$f(\mathbf{x}_k) = \mathbf{g}_2^{00} \cdot \mathbf{f}_k, \tag{4.2}$$

where $f$ represents any of the field variables ($u_x$, $u_y$, and the stress components) and $\mathbf{f}_k$ collects the network outputs at the family members of $\mathbf{x}_k$. The derivatives are evaluated using automatic differentiation (AD). Since $f(\mathbf{x}_k)$ is a nonlocal function of $\mathbf{x}_k$ and its family points $\mathbf{x}_{k(j)}$, the differentiation of $f$ is performed with respect to each family member using AD as

$$\frac{\partial f(\mathbf{x}_k)}{\partial \mathbf{x}_{k(j)}}. \tag{4.3}$$

In order to incorporate the effect of the family members on the derivatives, the local AD differentiations are recast as

$$\left.\frac{\partial f}{\partial \mathbf{x}}\right|_{\mathbf{x}_k} = \sum_{j} \frac{\partial f(\mathbf{x}_k)}{\partial \mathbf{x}_{k(j)}}. \tag{4.4}$$
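Under this reading of Eqs. (4.3)–(4.4), the recast derivative can be sketched in TensorFlow as a sum of gradients over the family inputs; the array names and the flattened input layout are assumptions.

```python
import tensorflow as tf

# xk_fam: coordinates of the family members of each point, shape
# (batch, 49, 2); nonlocal_model takes the flattened family as input.
with tf.GradientTape() as tape:
    tape.watch(xk_fam)
    f = nonlocal_model(tf.reshape(xk_fam, (-1, 49 * 2)))  # Eq. (4.2) output

grads = tape.gradient(f, xk_fam)              # df/dx_{k(j)}, Eq. (4.3)
df_dx = tf.reduce_sum(grads[..., 0], axis=1)  # recast sum over family, Eq. (4.4)
df_dy = tf.reduce_sum(grads[..., 1], axis=1)
```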

The differences between the AD-PDDO-PINN predictions and the true solution for the elastoplastic deformation case are shown in Fig. 11. The values of the elastoplastic model parameters estimated by the method are:  GPa,  GPa,  GPa and  GPa (Fig. 12). Both the solution and the model parameters are captured much more accurately than in the local PINN framework. In particular, the method reproduces the regions of high gradients in the solution, and is now able to accurately identify the hardening parameter $H$.

Figure 11: Difference between the nonlocal AD-PDDO-PINN predictions and the true data for displacement, strain, and stress components in the case of elastic-plastic deformation.


Figure 12: Nonlocal AD-PDDO-PINN predictions of the material parameters $\lambda$, $\mu$, $\sigma_Y$ and $H$ in the case of elastic-plastic deformation. White color indicates the true values of the parameters.

4.3.2 PDDO-PINN

We now employ the nonlocal deep neural network described by Eq. (3.5) to construct approximations for the field variables as well as their derivatives, without recourse to automatic differentiation. These derivatives are evaluated as

$$\left.\frac{\partial^{p+q} f}{\partial x^{p}\, \partial y^{q}}\right|_{\mathbf{x}_k} = \mathbf{g}_2^{pq} \cdot \mathbf{f}_k, \tag{4.5}$$

where $f$ again represents any of the field variables.

The errors in the PDDO-PINN solution for the elastoplastic deformation case are shown in Fig. 13, and the estimated elastoplastic model parameters are:  GPa,  GPa,  GPa and  GPa (Fig. 14). The overall performance is better than that of local PINN, but less accurate than that of AD-PDDO-PINN. An advantage of the PDDO-PINN framework, however, is that it does not rely on automatic differentiation; therefore, the evaluation of derivatives through Eq. (4.5) is faster for each epoch of training.

Figure 13: Difference between the nonlocal PDDO-PINN predictions and the true data for displacement, strain, and stress components in the case of elastic-plastic deformation.


Figure 14: Nonlocal PDDO-PINN predictions of the material parameters $\lambda$, $\mu$, $\sigma_Y$ and $H$ in the case of elastic-plastic deformation. White color indicates the true values of the parameters.

5 Discussion and Conclusions

The results of the previous section demonstrate the benefits of the nonlocal PINN framework in the reconstruction of the deformation and in parameter identification for solid-mechanics problems with sharp gradients in the solution, compared with those obtained with the local PINN architecture. This improved performance is also apparent from the evolution of the normalized loss function for the different architectures (Fig. 15), illustrating the faster convergence and lower final loss of the nonlocal PINN approaches.

Figure 15: Convergence behavior of the different PINN frameworks (I: local PINN, II: nonlocal AD-PDDO-PINN, and III: nonlocal PDDO-PINN), showing the evolution of the normalized loss function (left axis) and the learning rate (right axis) as a function of the number of epochs, for both the linear-elastic and nonlinear elastoplastic deformation cases.

In summary, we have introduced a nonlocal approach to Physics-Informed Neural Networks (PINN) using the Peridynamic Differential Operator (PDDO). In the limit when the interaction range (horizon) $\delta$ approaches zero, the method reverts to the local PINN model. We have presented two versions of the proposed approach: one with automatic differentiation through the neural network (AD-PDDO-PINN), and the other with analytical evaluation of the derivatives relying on the PDDO functions (PDDO-PINN). The PD functions can be readily and efficiently incorporated in the neural network architecture and, therefore, the nonlocality does not degrade the performance of modern deep-learning algorithms. We have applied both versions of nonlocal PINN to the solution and identification of material parameters in solid mechanics. Specifically, we focused on the solution and inference of linear-elastic and elastoplastic deformation in a domain subjected to indentation by a rigid punch. The resulting boundary value problem is challenging because of the mixed displacement–traction boundary conditions along the top boundary, which result in localized deformation and sharp gradients in the solution. We have shown that the PDDO framework is able to capture the stress and strain concentrations with global functions and, as a result, leads to the superior behavior of nonlocal PINN in terms of both the accuracy of the solution and the estimated model parameters. While many questions remain with regard to the selection of the network size, the order of the PDDO approximation, and the training optimization algorithms, these results suggest that nonlocal PINN may offer a powerful framework for simulation and discovery of partial differential equations whose solution develops sharp gradients.

Acknowledgments

RJ and EH conducted this work as a part of KFUPM-MIT collaborative agreement ‘Multiscale Reservoir Science’. EM and ACB performed this work as part of the ongoing research at the MURI Center for Material Failure Prediction through Peridynamics at the University of Arizona (AFOSR Grant No. FA9550-14-1-0073).

Appendix A PDDO Derivation

According to the second-order TSE in two-dimensional space, the following expression holds:

$$f(\mathbf{x}+\boldsymbol{\xi}) = f(\mathbf{x}) + \xi_1 \frac{\partial f(\mathbf{x})}{\partial x} + \xi_2 \frac{\partial f(\mathbf{x})}{\partial y} + \frac{1}{2}\,\xi_1^2 \frac{\partial^2 f(\mathbf{x})}{\partial x^2} + \frac{1}{2}\,\xi_2^2 \frac{\partial^2 f(\mathbf{x})}{\partial y^2} + \xi_1 \xi_2 \frac{\partial^2 f(\mathbf{x})}{\partial x\, \partial y} + R, \tag{A.1}$$

where $R$ is the remainder. Multiplying each term by the PD functions $g_2^{pq}(\boldsymbol{\xi})$ and integrating over the domain of interaction (family), $H_{\mathbf{x}}$, results in

$$\int_{H_{\mathbf{x}}} f(\mathbf{x}+\boldsymbol{\xi})\, g_2^{pq}(\boldsymbol{\xi})\, dA_{\boldsymbol{\xi}} = \sum_{n_1 + n_2 \le 2} \frac{1}{n_1!\, n_2!}\, \frac{\partial^{n_1+n_2} f(\mathbf{x})}{\partial x^{n_1}\, \partial y^{n_2}} \int_{H_{\mathbf{x}}} \xi_1^{n_1} \xi_2^{n_2}\, g_2^{pq}(\boldsymbol{\xi})\, dA_{\boldsymbol{\xi}}, \tag{A.2}$$

in which the point $\mathbf{x}$ is not necessarily symmetrically located in the domain of interaction. The initial relative position, $\boldsymbol{\xi}$, between the material points $\mathbf{x}$ and $\mathbf{x}'$ can be expressed as $\boldsymbol{\xi} = \mathbf{x}' - \mathbf{x}$. This permits each point to have its own unique family with an arbitrary position. Therefore, the size and shape of each family can be different, and they significantly influence the degree of nonlocality. The degree of interaction between the material points in each family is specified by a nondimensional weight function, $w(|\boldsymbol{\xi}|)$, which can vary from point to point. The interactions become more local with decreasing family size; thus, the family size and shape are important parameters. In general, the family of a point can be nonsymmetric due to nonuniform spatial discretization. Each point has its own family members in the domain of interaction (family), and each member occupies an infinitesimally small entity such as a volume, an area, or a distance.

The PD functions are constructed such that they are orthogonal to each term in the TSE:

$$\frac{1}{n_1!\, n_2!} \int_{H_{\mathbf{x}}} \xi_1^{n_1} \xi_2^{n_2}\, g_2^{pq}(\boldsymbol{\xi})\, dA_{\boldsymbol{\xi}} = \delta_{n_1 p}\, \delta_{n_2 q}, \tag{A.3}$$

with $n_1, n_2, p, q = 0, 1, 2$ ($n_1 + n_2 \le 2$, $p + q \le 2$), and $\delta_{ij}$ the Kronecker symbol. Enforcing the orthogonality conditions in the TSE leads to the nonlocal PD representation of the function itself and its derivatives as

$$f(\mathbf{x}) = \int_{H_{\mathbf{x}}} f(\mathbf{x}+\boldsymbol{\xi})\, g_2^{00}(\boldsymbol{\xi})\, dA_{\boldsymbol{\xi}}, \tag{A.4a}$$

$$\left\{ \frac{\partial f}{\partial x},\ \frac{\partial f}{\partial y} \right\} = \int_{H_{\mathbf{x}}} f(\mathbf{x}+\boldsymbol{\xi}) \left\{ g_2^{10}(\boldsymbol{\xi}),\ g_2^{01}(\boldsymbol{\xi}) \right\} dA_{\boldsymbol{\xi}}, \tag{A.4b}$$

$$\left\{ \frac{\partial^2 f}{\partial x^2},\ \frac{\partial^2 f}{\partial y^2},\ \frac{\partial^2 f}{\partial x\, \partial y} \right\} = \int_{H_{\mathbf{x}}} f(\mathbf{x}+\boldsymbol{\xi}) \left\{ g_2^{20}(\boldsymbol{\xi}),\ g_2^{02}(\boldsymbol{\xi}),\ g_2^{11}(\boldsymbol{\xi}) \right\} dA_{\boldsymbol{\xi}}. \tag{A.4c}$$

The PD functions can be constructed as a linear combination of polynomial basis functions:

$$g_2^{pq}(\boldsymbol{\xi}) = \sum_{m_1 + m_2 \le 2} a^{pq}_{m_1 m_2}\, w_{m_1 m_2}(|\boldsymbol{\xi}|)\, \xi_1^{m_1} \xi_2^{m_2}, \tag{A.5}$$

where the $a^{pq}_{m_1 m_2}$ are the unknown coefficients, the $w_{m_1 m_2}(|\boldsymbol{\xi}|)$ are the influence functions, and $\xi_1$ and $\xi_2$ are the components of the vector $\boldsymbol{\xi}$. Assuming $w_{m_1 m_2}(|\boldsymbol{\xi}|) = w(|\boldsymbol{\xi}|)$ and incorporating the PD functions into the orthogonality equation lead to a system of algebraic equations for the determination of the coefficients:

$$\mathbf{A}\, \mathbf{a} = \mathbf{b}, \tag{A.6}$$

where

$$A_{(n_1 n_2)(m_1 m_2)} = \int_{H_{\mathbf{x}}} w(|\boldsymbol{\xi}|)\, \xi_1^{n_1 + m_1}\, \xi_2^{n_2 + m_2}\, dA_{\boldsymbol{\xi}}, \tag{A.7a}$$

$$a_{(m_1 m_2)(pq)} = a^{pq}_{m_1 m_2}, \tag{A.7b}$$

$$b_{(n_1 n_2)(pq)} = n_1!\, n_2!\, \delta_{n_1 p}\, \delta_{n_2 q}. \tag{A.7c}$$

After determining the coefficients from Eq. (A.6), the PD functions can be constructed. The detailed derivations and the associated computer programs can be found in Madenci et al. (2019). The PDDO is nonlocal; however, in the limit as the horizon size approaches zero, it recovers local differentiation, as proven by Silling and Lehoucq Silling and Lehoucq (2008).
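On a discrete family, Eqs. (A.5)–(A.7) reduce to assembling a small moment matrix and solving a linear system. The sketch below does this for first-order PD functions ($N = 1$ for brevity; the paper uses $N = 2$), with the Gaussian influence function of Eq. (3.1) assumed. The final print verifies that the function value and both first derivatives of $f(x, y) = x + 2y$ are recovered at the family center.

```python
import numpy as np

def pd_functions(xi, areas, delta):
    """First-order PDDO on a discrete family (Eqs. A.5-A.7 with N = 1).
    xi: (n, 2) relative positions of family members; areas: (n,) member areas.
    Returns g00, g10, g01 evaluated at the family points."""
    w = np.exp(-4.0 * (np.linalg.norm(xi, axis=1) / delta) ** 2)  # assumed influence
    P = np.column_stack([np.ones(len(xi)), xi[:, 0], xi[:, 1]])   # basis: 1, xi1, xi2

    # Moment matrix, Eq. (A.7a): A_nm = sum_j w_j p_n(xi_j) p_m(xi_j) A_j
    A = P.T @ (P * (w * areas)[:, None])
    b = np.eye(3)               # orthogonality targets, Eq. (A.7c); n! = 1 here
    a = np.linalg.solve(A, b)   # coefficients of Eq. (A.6), one column per g^{pq}

    # PD functions, Eq. (A.5): weighted linear combinations of the basis.
    G = (P * w[:, None]) @ a    # columns: g00, g10, g01 at the family points
    return G[:, 0], G[:, 1], G[:, 2]

# Sanity check on a 3x3 family centered at the origin.
dx = 0.1
xi_fam = np.array([[i * dx, j * dx] for i in (-1, 0, 1) for j in (-1, 0, 1)])
areas = np.full(9, dx * dx)
g00, g10, g01 = pd_functions(xi_fam, areas, delta=1.5 * dx)
f = xi_fam[:, 0] + 2.0 * xi_fam[:, 1]   # f = x + 2y sampled on the family
print(f @ (g00 * areas), f @ (g10 * areas), f @ (g01 * areas))
# -> approximately (0.0, 1.0, 2.0): f, df/dx, df/dy at x_k = (0, 0)
```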

References

  • M. Abadi, P. Barham, J. Chen, Z. Chen, A. Davis, J. Dean, M. Devin, S. Ghemawat, G. Irving, M. Isard, M. Kudlur, J. Levenberg, R. Monga, S. Moore, D. G. Murray, B. Steiner, P. Tucker, V. Vasudevan, P. Warden, M. Wicke, Y. Yu, and X. Zheng (2016) TensorFlow: a system for large-scale machine learning. In 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16), Savannah, GA, pp. 265–283.
  • Y. Bar-Sinai, S. Hoyer, J. Hickey, and M. P. Brenner (2019) Learning data-driven discretizations for partial differential equations. Proceedings of the National Academy of Sciences 116 (31), pp. 15344–15349.
  • A. G. Baydin, B. A. Pearlmutter, A. A. Radul, and J. M. Siskind (2017) Automatic differentiation in machine learning: a survey. The Journal of Machine Learning Research 18 (1), pp. 5595–5637.
  • J. Bergstra, O. Breuleux, F. Bastien, P. Lamblin, R. Pascanu, G. Desjardins, J. Turian, D. Warde-Farley, and Y. Bengio (2010) Theano: a CPU and GPU math expression compiler. In Proceedings of the Python for Scientific Computing Conference (SciPy), Vol. 4, pp. 3.
  • C. M. Bishop (2006) Pattern Recognition and Machine Learning. Springer-Verlag, Berlin, Heidelberg.
  • M. Bojarski, D. Del Testa, D. Dworakowski, B. Firner, B. Flepp, P. Goyal, L. D. Jackel, M. Monfort, U. Muller, J. Zhang, et al. (2016) End to end learning for self-driving cars. arXiv preprint arXiv:1604.07316.
  • M. P. Brenner, J. D. Eldredge, and J. B. Freund (2019) Perspective on machine learning for advancing fluid mechanics. Physical Review Fluids 4 (10), pp. 100501.
  • S. L. Brunton and J. N. Kutz (2019) Data-Driven Science and Engineering: Machine Learning, Dynamical Systems, and Control. Cambridge University Press.
  • K. Champion, B. Lusch, J. N. Kutz, and S. L. Brunton (2019) Data-driven discovery of coordinates and governing equations. Proceedings of the National Academy of Sciences 116 (45), pp. 22445–22451.
  • T. Chen, M. Li, Y. Li, M. Lin, N. Wang, M. Wang, T. Xiao, B. Xu, C. Zhang, and Z. Zhang (2015) MXNet: a flexible and efficient machine learning library for heterogeneous distributed systems. arXiv preprint arXiv:1512.01274.
  • F. Chollet (2015) Keras. https://keras.io.
  • COMSOL (2020) COMSOL Multiphysics User's Guide. COMSOL, Stockholm, Sweden.
  • P. M. R. DeVries, F. Viégas, M. Wattenberg, and B. J. Meade (2018) Deep learning of aftershock patterns following large earthquakes. Nature 560 (7720), pp. 632–634.
  • J. Duchi, E. Hazan, and Y. Singer (2011) Adaptive subgradient methods for online learning and stochastic optimization. Journal of Machine Learning Research 12 (Jul), pp. 2121–2159.
  • J. Ghaboussi and D. Sidarta (1998) New nested adaptive neural networks (NANN) for constitutive modeling. Computers and Geotechnics 22 (1), pp. 29–52.
  • I. Goodfellow, Y. Bengio, and A. Courville (2016) Deep Learning. MIT Press.
  • A. Graves, A. Mohamed, and G. Hinton (2013) Speech recognition with deep recurrent neural networks. In IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 6645–6649.
  • E. Haghighat and R. Juanes (2020) SciANN: a Keras wrapper for scientific computations and physics-informed deep learning using artificial neural networks. arXiv preprint arXiv:2005.08803.
  • E. Haghighat, M. Raissi, A. Moure, H. Gomez, and R. Juanes (2020) A deep learning framework for solution and discovery in solid mechanics. arXiv preprint arXiv:2003.02751.
  • J. Han, A. Jentzen, and W. E (2018) Solving high-dimensional partial differential equations using deep learning. Proceedings of the National Academy of Sciences 115 (34), pp. 8505–8510.
  • D. Jannach, M. Zanker, A. Felfernig, and G. Friedrich (2010) Recommender Systems: An Introduction. Cambridge University Press.
  • E. Kharazmi, Z. Zhang, and G. E. Karniadakis (2019) Variational physics-informed neural networks for solving partial differential equations. arXiv preprint arXiv:1912.00873.
  • D. P. Kingma and J. Ba (2014) Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980.
  • T. Kirchdoerfer and M. Ortiz (2016) Data-driven computational mechanics. Computer Methods in Applied Mechanics and Engineering 304, pp. 81–101.
  • Q. Kong, D. T. Trugman, Z. E. Ross, M. J. Bianco, B. J. Meade, and P. Gerstoft (2018) Machine learning in seismology: turning data into insights. Seismological Research Letters 90 (1), pp. 3–14.
  • A. Krizhevsky, I. Sutskever, and G. E. Hinton (2012) ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems (NIPS 2012), F. Pereira, C. J. C. Burges, L. Bottou, and K. Q. Weinberger (Eds.), pp. 1097–1105.
  • Y. LeCun, Y. Bengio, and G. Hinton (2015) Deep learning. Nature 521 (7553), pp. 436–444.
  • E. Madenci, A. Barut, and M. Dorduncu (2019) Peridynamic Differential Operator for Numerical Analysis. Springer.
  • E. Madenci, A. Barut, and M. Futch (2016) Peridynamic differential operator and its applications. Computer Methods in Applied Mechanics and Engineering 304, pp. 408–451.
  • E. Madenci, M. Dorduncu, A. Barut, and M. Futch (2017) Numerical solution of linear and nonlinear partial differential equations using the peridynamic differential operator. Numerical Methods for Partial Differential Equations 33 (5), pp. 1726–1753.
  • E. Madenci and E. Oterkus (2014) Peridynamic Theory and Its Applications. Springer.
  • X. Meng, Z. Li, D. Zhang, and G. E. Karniadakis (2019) PPINN: parareal physics-informed neural network for time-dependent PDEs. arXiv preprint arXiv:1909.10145.
  • R. Miotto, F. Wang, S. Wang, X. Jiang, and J. T. Dudley (2018) Deep learning for healthcare: review, opportunities and challenges. Briefings in Bioinformatics 19 (6), pp. 1236–1246.
  • A. Paszke, S. Gross, F. Massa, A. Lerer, J. Bradbury, G. Chanan, T. Killeen, Z. Lin, N. Gimelshein, L. Antiga, et al. (2019) PyTorch: an imperative style, high-performance deep learning library. In Advances in Neural Information Processing Systems (NeurIPS 2019), pp. 8024–8035.
  • M. Raissi, P. Perdikaris, and G. E. Karniadakis (2019) Physics-informed neural networks: a deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics 378, pp. 686–707.
  • M. Raissi, A. Yazdani, and G. E. Karniadakis (2020) Hidden fluid mechanics: learning velocity and pressure fields from flow visualizations. Science 367 (6481), pp. 1026–1030.
  • S. Rudy, A. Alla, S. L. Brunton, and J. N. Kutz (2019) Data-driven identification of parametric partial differential equations. SIAM Journal on Applied Dynamical Systems 18 (2), pp. 643–660.
  • S. A. Silling and E. Askari (2005) A meshfree method based on the peridynamic model of solid mechanics. Computers & Structures 83 (17–18), pp. 1526–1535.
  • S. A. Silling, M. Epton, O. Weckner, J. Xu, and E. Askari (2007) Peridynamic states and constitutive modeling. Journal of Elasticity 88 (2), pp. 151–184.
  • S. A. Silling and R. B. Lehoucq (2008) Convergence of peridynamics to classical elasticity theory. Journal of Elasticity 93 (1), pp. 13.
  • S. A. Silling (2000) Reformulation of elasticity theory for discontinuities and long-range forces. Journal of the Mechanics and Physics of Solids 48 (1), pp. 175–209.
  • J. C. Simo and T. J. R. Hughes (1998) Computational Inelasticity. Springer.
  • S. Zhang, L. Yao, A. Sun, and Y. Tay (2019) Deep learning based recommender system: a survey and new perspectives. ACM Computing Surveys 52 (1), pp. 1–38.