## 1 Introduction

Coupled flow and mechanics is an important physical process in geotechnical, biomechanical, and reservoir engineering Armero and Simo (1992); Armero (1999); Park (1983); Zienkiewicz et al. (1988); Lewis et al. (1998); Schrefler (2004); White and Borja (2008); Settari and Mourits (1998); Thomas et al. (2003); Khoei and Mohammadnejad (2011). For example, the pore fluid plays a dominant role in the liquefaction of saturated fine-grained geo-structures such as dams and earth slopes, and has caused instability in infrastructure after earthquakes Zienkiewicz et al. (1999). In reservoir engineering, it controls ground subsidence due to water and oil production Fredrich et al. (2000). Significant research and development have been done on this topic to arrive at stable and scalable computational models that can solve complex large-scale engineering problems Jha and Juanes (2007); White and Borja (2008); Kim et al. (2011); Jha and Juanes (2014). Like many other classical computational methods, these models suffer from a major drawback for today's needs: it is difficult to integrate data within these frameworks. Integrating data into the computations often requires an external optimization loop to match the output to sensor observations. This process constitutes a major challenge in real-world applications.

A recent paradigm in deep learning, known as Physics-Informed Neural Networks (PINNs) Raissi et al. (2019), offers a unified approach to solving governing equations and integrating data. In this method, the unknown solution variables are approximated by feed-forward neural networks, and the governing relations, initial and boundary conditions, and data form different terms of the total loss function. The solution, i.e., the optimal parameters of the neural networks, is then found by minimizing the total loss function over random sampling points as well as data points. PINNs have generated excitement about the use of machine learning algorithms for solving physical systems and optimizing their characteristic parameters given data. PINNs have now been applied to a variety of problems, including fluid mechanics Raissi et al. (2019); Jin et al. (2021); Garipov et al. (2018); Wu et al. (2018); Cai et al. (2021a); Rao et al. (2020); Mao et al. (2020); Reyes et al. (2021), solid mechanics Haghighat et al. (2021b); Rao et al. (2021); Haghighat et al. (2021a); Guo and Haghighat (2020), heat transfer Cai et al. (2021b); Niaki et al. (2021); Zobeiry and Humfeld (2021), electro-chemistry Pilania et al. (2018); Liu et al. (2021); Ji et al. (2020), electro-magnetics Fang and Zhan (2019); Chen et al. (2020); Noakoasteen et al. (2020), geophysics bin Waheed et al. (2021); Song et al. (2021); Song and Alkhalifah (2021); Waheed et al. (2021), and flow in porous media Fuks and Tchelepi (2020); Almajid and Abu-Alsaud (2022); Bekele (2021); Shokouhi et al. (2021); Kadeethum et al. (2020); Bekele (2020); Tartakovsky et al. (2020) (for a detailed review, see Karniadakis et al. (2021)). A few libraries have also been developed for solving PDEs using PINNs, including SciANN Haghighat and Juanes (2021), DeepXDE Lu et al. (2021), SimNet Hennigh et al. (2021), and NeuralPDE Zubov et al. (2021).

It has been found, however, that training PINNs is slow and challenging for complex problems, particularly those with multiple coupled governing equations, and requires a significant effort in network design and hyper-parameter optimization, often in a trial-and-error fashion. One major challenge is associated with the multi-objective optimization problem that results from the multiple terms in the total loss function. Several works have tried to address this challenge. Wang et al. (2020a) analyzed the distribution of the gradient of each term in the loss function, similar to Chen et al. (2018), and found that a gradient-scaling approach can be a good choice for specifying the weights of the loss terms. They later proposed another approach, based on the eigenvalues of the neural tangent kernel (NTK) matrix Wang et al. (2020b), to find optimal weights for each term in the loss function. Another challenge was found to be associated with the partial differential equation itself. For instance, Fuks and Tchelepi (2020) reported that solving a nonlinear hyperbolic PDE results in exploding gradients; a regularization was found to remedy the problem. Ji et al. (2020) found that solving stiff reaction equations is problematic with PINNs, and addressed this by recasting the fast-reacting relation at equilibrium. Another source of challenges seems to be the simple feed-forward network architecture itself. Wang et al. (2021) proposed a unique network architecture by enriching the input features using Fourier features and the network outputs by a relation motivated by the Fourier series. Haghighat et al. (2021a) reported challenges when dealing with problems that develop sharp gradients, and proposed a nonlocal remedy using the Peridynamic Differential Operator Madenci et al. (2016). Another remedy has been space or time discretization of the domain using domain decomposition approaches Jagtap et al. (2020b); Jagtap and Karniadakis (2020); Niaki et al. (2021).

At the time of drafting this work, studies on the use of PINNs for solving coupled flow-mechanics problems remain scarce. In a recent work, Bekele (2021) used PINNs to solve Terzaghi's consolidation problem. Terzaghi's problem, however, is effectively decoupled, and therefore the author only needed to solve the transient flow problem. Fuks and Tchelepi (2020) and Fraces et al. (2020) studied the Buckley-Leverett problem, where they also only solved the transport problem. Bekele (2020) tried to calibrate the Barry-Mercer problem and reported poor performance of PINNs for the inverse problem.
Although PINNs have been found to work well for low-dimensional inverse problems, using them to solve forward problems has proven very challenging, particularly for coupled flow and mechanics.

In this work, we study the application of PINNs to coupled flow and mechanics, considering fully saturated (single-phase) and partially saturated (two-phase) laminar flow in porous media. We find that expressing the problem in dimensionless form results in much more stable behavior of the PINN formalism. We study different adaptive weighting strategies and propose one that is most suitable for these problems. We analyze and compare two strategies for training the feed-forward neural networks: one based on simultaneous training of the coupled equations and one based on sequential training. Although slower, we find that sequential training using the fixed-stress split strategy Kim et al. (2011) is the most stable and accurate approach. We then apply the developed framework, in its general form, to solve Mandel's consolidation problem, Barry-Mercer's injection-production problem, as well as a two-phase drainage problem. All these problems are implemented using the SciANN library Haghighat and Juanes (2021). This work is the first comprehensive study on solving the coupled flow-mechanics equations using PINNs.

## 2 Governing Equations

### 2.1 Balance Laws

The classical formulation of flow in porous media, in which all constituents are represented as a continuum, is expressed as follows. Let us denote the domain of the problem as , with its boundary represented as . The first conservation relation is the mass balance equation, expressed as

(1) |

where the accumulation term describes the temporal variation of the mass of fluid phase relative to the motion of the solid skeleton, is the mass flux of fluid phase relative to the solid skeleton, and is the volumetric source term for phase .

The second conservation law, under quasi-static assumptions, is the linear momentum balance of the solid-fluid mixture, expressed as

(2) |

where

is the total Cauchy stress tensor,

is the bulk density, and is the gravity field directed in .

### 2.2 Small Deformation Kinematics

The total small-strain tensor is defined as

(3) |

with

as the displacement vector field of the solid skeleton. The total strain tensor

is often decomposed into its volumetric and deviatoric parts, i.e., and , respectively, expressed as

(4) |

(5) |

### 2.3 Single-Phase Poromechanics

For single phase flow in a porous medium, the fluid mass conservation relation reduces to

(6) |

where is the fluid mass flux, and is the relative seepage velocity, expressed by Darcy’s law as

(7) |

with the porous medium's intrinsic permeability as , the fluid viscosity as , and the fluid pressure as . in Eq. 6 is the mass of the fluid stored in the pores, evaluated as

(8) |

with as the porosity of the medium, expressed as the ratio of void volume to the bulk volume as . The change in the fluid content is governed in the theory of poroelasticity by

(9) |

where and are the Biot bulk modulus and the Biot coefficient, and are related to fluid and rock properties as

(10) |

Here, is the fluid compressibility defined as , and are the bulk moduli of the fluid and solid phases, respectively, and is the drained bulk modulus.

As expressed by Biot (1941), the poroelastic response of the bulk due to changes in the pore pressure is expressed as

(11) |

where is the 4th-order drained elastic stiffness tensor and is the 2nd-order identity tensor. This implies that the effective stress, responsible for the deformation of the soil, can be defined as . The isotropic elasticity stress-strain relation implies that , with as the fourth-order identity tensor and as the shear modulus. For convenience, we can express as a function of the Poisson ratio and the drained bulk modulus as

We can therefore write,

(12) |

The volumetric part of the stress tensor, also known as the hydrostatic invariant, is defined as , which can be expressed as

(13) |

The pressure field is coupled with the stress field through the volumetric strain. Since neural networks have multi-layer compositional mathematical forms, with derivatives that take very complex forms, instead of evaluating the volumetric strain from the displacements, we keep it as a separate solution variable subject to the following kinematic law

(14) |

Therefore, the final form of the relations describing the problem of single-phase flow in a porous medium is derived by expressing the poroelasticity constitutive relations as

(15) | |||

(16) |

We can then write the linear momentum relation (2) in terms of displacements as

(17) |

Note that, for reasons we will discuss later, we do not merge the and terms in this relation. Combining Eqs. 6, 7 and 9, after some algebra we can express the mass balance relation in terms of the volumetric strain as

(18) |

or, equivalently, in terms of the volumetric stress as

(19) |

##### Dimensionless relations

Since deep learning optimization methods work most effectively on dimensionless data, we derive the dimensionless relations here. Let us consider the dimensionless variables, denoted by , as

(20) |

where are dimensionless factors to be defined later. This implies that the partial derivatives and the divergence operator are also expressed as

(21) |

and therefore, we can express the seepage velocity as

(22) |

The Navier displacement and stress-strain relations are expressed in the dimensionless form as

(23) | |||

(24) | |||

(25) |

The dimensionless form of Eqs. 19 and 18 is expressed, in terms of stress, as

(26) |

or, in terms of strain, as

(27) |

where the dimensionless parameters are

(28) |

In sequential iterations, Eq. 26 is known as the *fixed-stress-split* approach, while the use of Eq. 27 is referred to as the *fixed-strain-split* method Kim et al. (2011).

### 2.4 Two-Phase Flow Poromechanics

In the case of multiphase flow poromechanics, let us first define the ‘equivalent pressure’, expressed as

(29) |

with and as the degree of saturation and pressure of the fluid phase , respectively, and as the interfacial energy between phases, defined incrementally as Kim et al. (2013); Coussy (2004). Let us also define here the capillary pressure , denoting the pressure difference between the nonwetting fluid phase and the wetting fluid phase as

(30) |

The poroelastic stress-strain relation of a multiphase system is expressed as

(31) | ||||

(32) |

where . The mass of the fluid phase is expressed as

(33) |

We can now express the fluid content for phase as

(34) |

Substituting (34) into (1), we can write the fluid mass conservation equation in terms of volumetric strain as:

(35) |

where and express components of the inverse Biot modulus matrix for a multiphase system, which can be obtained from the fluid mass equation for a two-phase flow system as:

(36) | |||

(37) | |||

(38) |

with . The seepage velocity of the fluid phase , i.e., , is obtained by extending Darcy's law as

(39) |

where denotes the gravity force vector for the phase , and is the relative permeability of phase . Considering the relation between the volumetric strain and equivalent volumetric stress,

(40) |

Consequently, the fluid mass conservation equation can be rewritten in terms of the equivalent volumetric stress for phase Kim et al. (2013); Jha and Juanes (2014),

(41) |

##### Dimensionless relations

Here, we define dimensionless variables for two-phase poromechanics as:

(42) |

Noting that , and considering the dimensionless parameters

(43) |

the dimensionless *strain-split* fluid mass conservation equation takes the form

(44) |

and the dimensionless *stress-split* formulation takes the form

(45) |

in which . The linear momentum equation for the whole medium and the stress-strain relation in terms of dimensionless variables are expressed as

(46) | |||

(47) |

where is the bulk density.

## 3 Physics-Informed Neural Networks

We approximate the solution variables of the poromechanics problem using deep neural networks. The governing equations and initial/boundary conditions form different terms of the loss function. The total loss (error) function is then defined as a linear composition of these terms Raissi et al. (2019). The optimal weights of this composition are either found by trial and error or through more fundamental approaches such as gradient normalization that will be discussed later Raissi et al. (2019); Wang et al. (2020a). Solving the problem, i.e., identifying the parameters of the neural network, is then done by minimizing the total loss function on random collocation points inside the domain and on the boundary. It has been shown that minimizing such a loss function is a non-trivial optimization task, particularly for the case of coupled differential equations (see Karniadakis et al. (2021)), and special treatment is needed, which we will discuss later in this text.

Let us define the nonlinear mapping function as

(48) |

where and are the input and output of hidden layer , with dimensions and , respectively. For the first layer, i.e., , , , and for the other layers, is the output of the previous layer. represents the total number of transformations of the form of Eq. 48, a.k.a. layers, that should be applied to the inputs to reach the final output in a *forward pass*. The parameters and of this transformation are known as weights and biases, respectively. The activation function introduces nonlinearity into the transformation. For PINNs, is often taken as the *hyperbolic-tangent* function and kept the same for all hidden layers. With this definition as the main building block of a neural network, we can now construct an -layer feed-forward neural network as

(49) |

where represents the composition operator, and is the collection of all parameters of the network to be identified through the optimization, with . The main input features of this transformation are the spatio-temporal coordinates and , with as the spatial dimension, and the ultimate output is the network output . Note that the last layer (output layer) is often kept linear for regression-type problems, i.e., . In the context of PINNs, the outputs can be the desired solution variables, such as the displacement or pressure fields.
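As a concrete illustration, the forward pass of Eqs. 48 and 49 can be sketched in plain NumPy. This is a minimal sketch, not the paper's implementation; the layer sizes, Glorot-style initialization, and helper names (`init_params`, `forward`) are illustrative choices of our own.

```python
import numpy as np

def init_params(layer_dims, seed=0):
    # One (W, b) pair per transformation of the form of Eq. (48).
    rng = np.random.default_rng(seed)
    params = []
    for d_in, d_out in zip(layer_dims[:-1], layer_dims[1:]):
        W = rng.normal(0.0, np.sqrt(2.0 / (d_in + d_out)), (d_in, d_out))
        b = np.zeros(d_out)
        params.append((W, b))
    return params

def forward(params, x):
    # Hidden layers use the hyperbolic tangent; the output layer is kept
    # linear (identity), as is customary for regression-type PINN outputs.
    z = x
    for W, b in params[:-1]:
        z = np.tanh(z @ W + b)
    W, b = params[-1]
    return z @ W + b

# Inputs (x, y, t) mapped to a single scalar output, e.g. pressure.
params = init_params([3, 20, 20, 1])
y = forward(params, np.array([[0.5, 0.5, 0.1]]))
print(y.shape)  # (1, 1)
```

In a PINN framework the derivatives of this composition with respect to the inputs would be obtained by automatic differentiation rather than written by hand.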

Finally, let us consider a time-varying partial differential equation with as the differential operator and as the source term, defined in the domain , with and representing the spatial and temporal domains, respectively. This PDE is subject to a Dirichlet boundary condition on and a Neumann boundary condition on , where and . It is also subject to the initial condition at . Following the PINN approach, the unknown variable is approximated using a multi-layer neural network, i.e., . The PDE residual is evaluated using automatic differentiation. With as the outward normal to the boundary, the boundary flux term is evaluated as . The residual terms for the initial and boundary conditions are constructed accordingly. The total loss function is then expressed as

(50) |

where are the weights (penalties) of the terms constructing the total loss function. The solution to this initial and boundary value problem is found by minimizing the total loss function on a set of collocation points, with as the total number of collocation points. This optimization problem is expressed mathematically as

(51) |

with , as the *optimal* values for the network parameters, minimizing the total loss function above Raissi et al. (2019).
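The composition in Eq. 50 amounts to a weighted sum of mean-squared residuals, each evaluated over its own set of collocation points. A minimal sketch (the helper name `total_loss` is ours; in practice each residual vector would come from automatic differentiation of the network):

```python
import numpy as np

def total_loss(residuals, weights):
    # Eq. (50)-style composition: each entry of `residuals` holds the
    # pointwise residuals of one term (PDE, Dirichlet BC, Neumann BC,
    # initial condition) at its collocation points; each weight is the
    # corresponding penalty. Every term is a mean-squared error.
    return sum(w * np.mean(np.asarray(r) ** 2)
               for w, r in zip(weights, residuals))

# Example: a PDE residual at two points and a boundary residual at one.
loss = total_loss([[1.0, 1.0], [2.0]], [1.0, 0.5])
print(loss)  # 1.0 * 1.0 + 0.5 * 4.0 = 3.0
```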

### 3.1 Optimization method

The most common approach for training neural networks in general, and PINNs in particular, is the Adam optimizer Kingma and Ba (2014), from the first-order stochastic gradient descent family. This method is highly scalable and has therefore been successfully applied to many supervised-learning tasks. For the case of multi-objective optimization, which arises with PINNs, fine-tuning its learning rate and the weights associated with the individual loss terms is challenging.

Another relatively common approach for training PINNs is the second-order BFGS method and its more memory-efficient variant, L-BFGS Liu and Nocedal (1989); this method, however, lacks scalability and becomes very slow for deep neural networks with a large number of parameters. On the other hand, it avoids some drawbacks of first-order methods; in particular, the learning rate is identified automatically through the Hessian matrix Nocedal and Wright (2006). In this study, we use the Adam optimizer with an initial learning rate of to train our networks.

### 3.2 Adaptive weights

Another challenge in training PINNs is associated with the gradient pathology of each loss term in the total loss function Chen et al. (2018). If we look at the gradient vector of each loss term with respect to the network parameters, we find that the terms with higher derivatives tend to have larger gradients, and therefore dominate the total gradient vector used to train the network Wang et al. (2020a). This, however, is not desirable in PINNs, since the solution to the PDEs depends heavily on accurate imposition of the boundary conditions.

To address this, Chen et al. (2018) proposed a gradient-scaling algorithm, known as *GradNorm*, described below. Let us consider a total loss function expressed as , with as the different loss terms, such as the PDE residual or boundary conditions, and as their weights. Let us also define the total gradient vector as the gradient of the total loss with respect to the network parameters , which can be expressed as , with as the gradient vector of the weighted loss term with respect to the network parameters , i.e., , and with as its -norm. According to this approach, the objective is to find s such that the of the gradient vector of the weighted loss term, i.e., , approaches the average norm of all loss terms, factored by a score value, i.e.,

(52) |

with as the value of loss term at the beginning of training, as the relative value for the loss term, and as a measure of how much a term has been trained compared to all other terms, referred to here as the *score* of loss term . To find the weights, the authors propose a gradient-descent update to minimize the *GradNorm* loss function, as

(53) |

Knowing that the minimum of a summation of positive functions is zero and occurs when all the functions are zero, we can algebraically recast the weights such that it zeros the . Then, using an Euler update, as proposed by Wang et al. (2020a), the loss weight at training epoch can be expressed as

(54) |

We therefore use loss-term weights evaluated following this strategy. There are also adaptive spatial and temporal sampling strategies offered in the literature Nabian et al. (2021); Wight and Zhao (2020); however, we do not employ them in this work.
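The scored, damped weight update can be sketched as follows. This is a simplified illustration of the GradNorm-style rule rather than the exact formulas of Chen et al. (2018) or Wang et al. (2020a); the score here is taken as each term's loss ratio normalized by the mean ratio, and the function name and the damping factor `beta` are our own choices.

```python
import numpy as np

def update_loss_weights(weights, grad_norms, loss_now, loss_init, beta=0.9):
    # Score of term i: how far it still is from its initial loss value,
    # relative to the other terms (larger score -> term is lagging).
    ratios = np.asarray(loss_now) / np.asarray(loss_init)
    scores = ratios / ratios.mean()
    # Recast the weights so the norm of each weighted gradient approaches
    # the average gradient norm, scaled by the term's score.
    lam_hat = scores * np.mean(grad_norms) / np.maximum(grad_norms, 1e-12)
    # Damped (moving-average) update between epochs, in the spirit of the
    # Euler update of Eq. (54).
    return beta * np.asarray(weights, dtype=float) + (1.0 - beta) * lam_hat

# A term with a larger gradient norm receives a smaller weight.
w = update_loss_weights([1.0, 1.0], np.array([1.0, 3.0]), [0.5, 0.5], [1.0, 1.0])
print(w)  # first weight grows above 1, second shrinks below 1
```

The damping keeps the weights from oscillating between epochs, which we found important when the gradient norms of the PDE and boundary terms differ by orders of magnitude.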

## 4 PINN-PoroMechanics

Now that we have discussed the method of Physics-Informed Neural Networks (PINNs), we can express the PINN solution strategy for the single-phase and multiphase poroelasticity problems discussed in Section 2. For definiteness, we consider two-dimensional problems.

### 4.1 PINN solution of single-phase poroelasticity

For the case of single-phase poroelasticity, the unknown solution variables are . Considering that is correlated with the volumetric strain , and also considering that the derivatives of a multi-layer neural network take very complicated forms, we also take the volumetric strain as an unknown, so that we can better enforce the coupling between fluid pressure and displacements. This results in an additional conservation PDE on the volumetric strain, expressed as

(55) |

By trial and error, we find this strategy to improve the training. This strategy of explicitly imposing volumetric constraints has a long history in FEM modeling of quasi-incompressible materials Zienkiewicz et al. (1977); Hughes (2012). Therefore, the neural networks for the dimensionless form of these variables are

(56) | ||||

(57) | ||||

(58) | ||||

(59) |

where highlights that these networks have independent parameters, as proposed in our earlier work Haghighat et al. (2021b). The total coupled loss function is then expressed as

(60) |

where implies the boundary or initial values for each term. The many terms in the total loss function, and their complexity, pose enormous challenges to the training of the network. Meaningful results from the Adam optimizer were obtained only with the use of scored adaptive weights, but our experience has been that the procedure lacks robustness. To address the inadequacy of traditional network-training methods for multiphysics problems, we propose a novel strategy, inspired by the success of the fixed-stress operator split for time integration of the forward problem in coupled poromechanics Kim et al. (2011).

We adopt a sequential training of the flow and mechanics sub-problems, expressed in fixed-stress split form:

(61) |

as summarized in Algorithm 1. Although more robust than the simultaneous-solution form, the sequential strategy incurs additional computational cost due to the operator-split iterations.
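At a high level, the sequential strategy can be sketched as the following loop. This is a hedged sketch of the idea, not Algorithm 1 verbatim: `train_flow` and `train_mech` are hypothetical callables standing in for the sub-network training passes, and the stagnation test on the total loss is one possible convergence criterion.

```python
import numpy as np

def train_fixed_stress(train_flow, train_mech, tol=1e-4, max_iter=20):
    # Sequential (staggered) training under a fixed-stress split: train
    # the flow network while the mechanics output (hence the volumetric
    # stress) is held fixed, then train the mechanics network with the
    # updated pressure, and iterate until the total loss stagnates.
    prev_total = np.inf
    for it in range(max_iter):
        loss_flow = train_flow()   # flow pass, volumetric stress frozen
        loss_mech = train_mech()   # mechanics pass, pressure frozen
        total = loss_flow + loss_mech
        if abs(prev_total - total) < tol * max(total, 1.0):
            return it + 1, total   # converged
        prev_total = total
    return max_iter, total
```

In practice, each callable would run a fixed number of Adam epochs on its own sub-loss; the fixed-stress split enters through which variables are frozen during each pass.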

### 4.2 PINN solution of two-phase poroelasticity

For the case of two-phase poroelasticity, the main unknown variables are . For the same reason explained earlier, we also introduce a separate unknown for the . Hence, the neural networks for the dimensionless form of these variables are

(62) | ||||

(63) | ||||

(64) | ||||

(65) | ||||

(66) |

The total loss function can be derived by following the single-phase case. We again follow two strategies to solve the optimization problem. In the first case, we perform the optimization using the scored adaptive weights introduced earlier. The alternative approach is to decouple the displacement relations from the fluid relations, which results in two optimization problems that are solved sequentially following the stress-split strategy. Since the overall framework is very similar to the single-phase case, we avoid re-writing those details here.

## 5 Applications

In this section, we study the PINN solution of a few synthetic poromechanics problems. In particular, we first study Mandel's consolidation problem in detail to arrive at the right set of hyper-parameters for the PINN solution. We also discuss the sensitivity of the PINN solution to the choice of parameters for this problem. Following that, we apply PINNs to solve Barry-Mercer's problem and a synthetic two-phase drainage problem. In all of these cases, we keep the formulation in its two-dimensional form, while considering coupled displacement and pressure effects. All the problems discussed below are coded using the SciANN package Haghighat and Juanes (2021), and the codes will be shared publicly at https://github.com/sciann/sciann-applications.
