## 1 Introduction

The sustainable management of groundwater (GW) resources is quickly emerging as a critical issue as irrigated agriculture continues to grow and rely more and more on GW [Hoekstra Mekonnen (2012)]. Estimates suggest that half the GW extracted for irrigation exceeds aquifer recharge and is hence unsustainable [Rost . (2008), Vorosmarty (2000)]. In particular, areas with persistent water stress and large aquifer systems have increased their reliance on GW and now experience sustained GW depletion [Hanasaki . (2008), Wada . (2010), Scanlon . (2012)], which poses a threat to agricultural systems, food security, and livelihoods [UNESCO (2012)].

Sustainable GW management under changing climatic and societal conditions requires models that can predict the aquifer response to these forcings. The dominant uncertainty in the predictions of any GW model is the insufficient characterization of the aquifer properties [Eaton (2006), Bohling Butler (2010), Oliver Chen (2011)]

. These properties can change by several orders of magnitude and display variability on all scales. In principle, the uncertainty in these parameters could be quantified with Bayesian inference

[Carrera Neuman (19861), Carrera Neuman (19862), Yeh (1986), McLaughlin Townley (1996)], if a sufficiently rich data set is available. However, standard aquifer characterization based on sparse measurements at wells is often not sufficient to significantly reduce the uncertainty in model predictions [Bohling Butler (2010)]. Hence there is an urgent need to improve GW monitoring and integrate new data into GW models to reduce the uncertainty in predictions and to facilitate decision making.Over the past several decades, advances in satellite remote sensing techniques have revolutionized capabilities for observing freshwater resources. In particular, spaceborne Interferometric Synthetic Aperture Radar (InSAR) has been measuring surface deformation history since 1992 with 10–100 meter spatial resolution and up to millimeter-level measurement accuracy [Hanssen (2001), Rosen . (2000)]. The potential use of surface deformations derived from InSAR in aquifer characterization has been recognized since as early as 2000; see galloway2007 for a review. Since then spaceborne InSAR missions such as ERS-1/2, Envisat, ALOS, and Sentinel-1 have generated large amounts of freely available data (1992–present) for monitoring the state of confined groundwater aquifers. Unlike data types that require field work and instrumentation, InSAR data can be obtained and processed at relatively low cost and effort.

However, despite significant work on InSAR based aquifer monitoring [Amelung . (1999), Hoffmann . (2001), Du Olson (2001), Lu Danskin (2001), D. Vasco . (2001), Schmidt Brügmann (2003), D. Galloway Hoffmann (2007), DW. Vasco . (2008), Burbey (2008), Bell (2008), DL. Galloway Burbey (2011), Chaussard . (2014), J. Chen . (2016), J. Chen . (2017), Miller . (2017)], the potential for this data for informing lateral aquifer heterogeneity has not yet been realized. This requires scalable computational approaches that allow the inference of high-dimensional parameter fields describing the aquifer properties. It also requires the use of geomechanical models that link the observed surface deformation to GW flow in the aquifer. Such poroelastic inversion methods have only recently been developed by IglesiasMcLaughlin12 and HesseStadler14 and have not yet been applied to real field data.

Here we build on the work of HesseStadler14 and develop a high-dimensional scalable poroelastic Bayesian inversion framework to characterize lateral permeability variations in groundwater aquifers based on InSAR surface deformation data. The Bayesian framework provides a systematic and rational framework for quantifying uncertainties in the inverse solution, stemming from uncertainties in the data, model, and any prior information on the parameters, along with insensitivity of observables to parameters [Tarantola (2005), Kaipio Somersalo (2005), Stuart (2010)]. We apply this framework to a well-characterized field site in Nevada [Burbey (2006)] and focus on the inference of lateral permeability variations. In particular, we find the maximum a posteriori (MAP) point of the Bayesian posterior distribution. This application is computationally challenging and requires the inference of parameters in a three-dimensional geomechanical model with state variables. To validate our inversion results, we predict the lateral deformations using a poroelasticity model with an InSAR-inferred permeability field and compare them to available GPS data [Burbey (2006)]

. In part II, we build on the Bayesian framework we present here to quantify the uncertainty in the inverse solution and resulting state variable predictions, using a tailored Markov chain Monte Carlo method.

This manuscript is organized as follows. We present the general Bayesian geomechanical inversion framework in section 2. In section 3, we describe the Nevada test case to which we apply our framework and provide details about the available InSAR and GPS data sets. Numerical results for the Nevada test site are presented in section 4 and discussed in section 5. Finally, the broader conclusions of this study are presented in section 6.

## 2 Methods

The solution of realistic aquifer characterization problems, such as the test case in section 3

, requires robust, efficient, and scalable methods for both the forward and inverse problems. The Bayesian inversion framework provides a means of incorporating prior assumptions about the aquifer properties and the data noise probability distribution to infer model parameters that comply with these assumptions and with a forward model. Section

2.1 describes the formulation and a robust discretization of the poroelastic forward problem and section 2.2 outlines the application of the Bayesian inversion framework to this forward problem with InSAR surface deformation data.### 2.1 The Forward Model

Here we use quasi-static linear poroelasticity to model coupled groundwater flow and elastic deformation in a confined aquifer [Biot (1941), Showalter (2000), Wang (2000), Segall . (2010)]. We adopt a three-field formulation and a mixed discretization to conserve mass discretely and reduce numerical oscillations [Phillips (2005), Phillips Wheeler (2009), Ferronato . (2010), Haga . (2012)]. This formulation of linear quasi-static poroelasticity in a space-time domain can be written as:

(1) | ||||

(2) | ||||

(3) |

where denotes time derivative, is the deviation from hydrostatic pressure, is the volumetric fluid flux, is the displacement of the solid skeleton, is a fluid source per unit volume, is the body force per unit volume, is the specific storage, is the medium permeability field, is the pore fluid dynamic viscosity, is the Biot–Willis coupling parameter, and

is the stress tensor. The medium permeability is written in terms of the log-permeability field

to ensure that the inferred permeability field is positive. The initial and boundary conditions for the state variables , , and are given by:(4) |

where is the initial pressure at time , is the value of the fluid flux across the pressure Neumann boundaries , is the prescribed pressure on the pressure Dirichlet boundaries , is the prescribed traction on the displacement Neumann boundaries , and is the prescribed displacement on the displacement Dirichlet boundaries . The stress tensor in terms of displacement is given for isotropic linear elasticity by

(5) |

where is the drained shear modulus and is the Poisson’s ratio.

We discretize the system (1)–(3) with the Mixed Finite Element Method (MFEM) proposed by FerronatoCastellettoGambolati10, which approximates the fluid flux in the lowest-order Raviart–Thomas space, the pressure in the space of piecewise constant functions, and the displacement in the first-order Lagrange polynomial space. We use the implicit Euler method for integration in time. In A, we provide the details of the weak form and discretization. The resulting linear system of equations that needs to be solved at each time step is given by:

(6) |

where and

are the discretized pressure, displacement and fluid flux degrees of freedom (DOFs) vectors, respectively. The subscript

denotes the evaluation of the variable (or the operator) at time and is the -th time step size. The discrete differential operators are denoted and for the divergence of the displacement and the flux, and for the gradient of the pressure in the pressure and the mixed flow spaces, respectively, and for the linear elastic operator. The mass matrices and are weighted by the specific storage and mobility (permeability/viscosity), respectively. Right hand side contributions include the discrete source terms for fluid and body force, and , as well as boundary tractions, , and pressure natural boundary conditions, .To facilitate the derivation of the time-dependent inverse problem in section 2.2.4, we follow HesseStadler14 and define the “global” space-time system that represents the solution for all time steps simultaneously and denote it as

(7) |

where the block lower triangular matrix combines the differential operators of system (6), combines the source terms and the natural boundary conditions, and consists of the pressure, displacement, and fluid flux DOFs at all time steps (see A for the explicit expressions). The bar notation is used to distinguish the space-time discretized operators and vectors from the space-only discretized operators and vectors.

### 2.2 Bayesian Framework for Three-Field Formulation of Fully Coupled 3D Poroelasticity Inverse Problem

The Bayesian inverse problem solution is a probabilistic characterization of unknown physical system parameters, such as material properties, source terms, boundary conditions, and initial conditions. Bayes’ theory provides a framework for updating “prior” statistical knowledge about these parameters by using observational data and a mathematical model of the system via a “likelihood distribution”. This likelihood distribution determines how likely it is that the observed data result from a particular parameter realization. The updated probability distribution is known as the posterior distribution, and is regarded as the solution of the inverse problem. Bayesian inversion, therefore, provides a characterization of the uncertainty in the parameters due to uncertainty in the prior information, data, and model, as opposed to finding a “point estimate”, as is the case with deterministic inversion. In the present article (Part I), we formulate the Bayesian inversion problem, paying special attention to the prior distribution of the model parameters and the observational noise model. We also take the first step in exploring the posterior distribution by introducing fast Hessian-based methods for scalably determining the maximum a posteriori (MAP) point. In a subsequent article (Part II), we will build on the tools developed here to construct a Hessian-driven Markov chain Monte Carlo method (MCMC) to sample the posterior distribution and compute statistics of interest as well as posterior predictives.

In section 2.2.1, we describe the formulation and discretization of the Bayesian inverse problem for the three-field formulation of the linear quasi-static poroelasticity model described in section 2.1. Our methodology follows closely the framework presented by HesseStadler14, with the following differences or extensions: (1) we base the framework on the three-field Biot system formulation rather than the classic two-field, (2) we use a Matérn class prior for the Bayesian inverse problem (section 2.2.2), and (3) we define the likelihood distribution based on a noise model for InSAR surface deformation data (section 2.2.3).

#### 2.2.1 Formulation of the Infinite Dimensional Bayesian Inverse Problem and its Discretization

Given the observational data, , and prior statistical knowledge about a parameter field

, we seek the posterior probability distribution of

. The parameter is related to the observational data through a Gaussian additive noise model,(8) |

The parameter-to-observable map, , maps a realization of the parameter field from the infinite dimensional space to the finite dimensional space of observables . Evaluating involves solution of a system of PDEs in which the parameter is a coefficient, source term, boundary condition, or initial condition of the system, or a combination thereof. In our case, we will invert for the log-permeability field appearing in equation (3). The parameter-to-observable map involves solution of the poroelasticity model (1)–(3), and extraction of surface displacements.

The random variable

is observational noise with known statistical properties. Here,is assumed to follow a Gaussian distribution with mean zero and covariance matrix

, reflecting the estimated noise level and correlation. The likelihood probability measure, which is the probability measure of the difference between the observational data and the simulated observations,, follows the normal distribution of the observational data noise.

A general framework for infinite dimensional Bayesian inverse problems governed by PDEs is presented in Stuart10. In infinite dimensions, Bayes’ formula reads

(9) |

where

is the likelihood probability density function,

is the normalization constant required for the posterior measure, , to be a probability measure, and is the Radon–Nikodym derivative of the posterior with respect to the prior.As a first step to characterising the posterior , we wish to find the point that maximizes the posterior, the so-called MAP point. We follow the discretize-then-optimize approach because it guarantees a consistent discretized gradient [Gunzburger (2003)]. The corresponding Bayes formula in finite dimensions is given by:

(10) |

where the symbol indicates equality up to a constant, denotes the discretized parameters,

(11) |

Here is the normal distribution of the prior with mean and covariance matrix . The function , the negative log of the likelihood, is obtained from equation (8) and the fact that as follows:

(12) |

where is the discretized parameter-to-observable map. The negative log of the likelihood, , is also known as the “data misfit term” in the deterministic inversion setting. A typical choice of is a diagonal matrix with entries

, the noise-level variance of the

observation . This choice of noise is valid for cases in which the noise polluting one measurement is uncorrelated with the noise in other measurements.#### 2.2.2 Prior

As a prior, we choose a Matérn class that can represent several commonly used covariance models used in geostatistics [Goovaerts (1997), Deutsch Journel (1998), Wackernagel (2010)]

. Our representation of the Matérn prior is based on the link between the Matérn class of Gaussian fields and solutions of elliptic stochastic partial differential equations (SPDE)

[Lindgren . (2011)]. We define the parameter to be a Matérn class Gaussian random field in the domain , , , or . LindgrenRueLindstroem11 show that its covariance operator is the square of the solution operator of the SPDE(13) |

For , should satisfy , we choose . The right-hand side of the SPDE (13),

, is a white-noise Gaussian random field with unit marginal variance. The parameters

, , and are chosen to achieve the desired marginal variance and correlation length of the features of the parameter field.The parameter is discretized over the same triangularization defined for the linear poroelasticity PDE problem. We use first-order Lagrange polynomials to approximate . For our choice , the Matérn class precision operator, , is constructed as follows: . The operator is the finite element discretization of the differential operator and is the mass matrix in the finite element space where is approximated. can be efficiently approximated [Villa . (2019)], in which case the cost of sampling from the prior is mainly the cost of solving the elliptic PDE (13) for a realization of the white noise right hand side, which can be carried out with multigrid solvers which have linear complexity. The covariance of the prior, , is given by , the inverse of the precision operator.

For an isotropic Gaussian field of Matérn class, the correlation length (or range) is ; in particular, and , for any , of distance from each other, have approximately a correlation near [Lindgren . (2011)]. In all results presented here, the domain is three-dimensional () and we choose the elliptic operator power , so that reduces to . Choosing , for example, dictates a correlation near 0.1 for points that are about length units apart.

#### 2.2.3 Likelihood and Data Noise of InSAR Data

Interferometric SAR (InSAR) computes the phase difference between two synthetic aperture radar (SAR) images. The resulting interferogram can be used to infer a 2D map of surface deformation between two SAR acquisition times along the radar line-of-sight (LOS) direction. The LOS deformation, , can be described by the formula:

(14) |

where , and are the east, north and vertical displacements, accordingly. The vector is the radar LOS direction unit vector that can be calculated based on the known radar imaging geometry.

We define the pointwise InSAR data misfit function (i.e., the negative log likelihood), , as follows:

(15) |

where is the number of interferograms used in the inversion and is the number of pixels in the interferogram. In this data misfit expression, we assume the covariance operator of the noise is diagonal, where is the noise level of the interferogram. This assumption is valid when InSAR measurement noise mostly stems from phase decorrelation as in the Nevada test case, presented in section 3. In cases where InSAR measurement noise contains a non-negligible spatially coherent component (e.g. noise from atmospheric delays), the covariance operator can be constructed to dictate that noise correlation. The value is the LOS deformation measured from the pixel in the interferogram, and is the simulated LOS deformation of the pixel in the interferogram as computed from the forward poroelasticity model.

#### 2.2.4 Finding the MAP Point Estimate of the Posterior Distribution

The discretized posterior (10) in general is a distribution in high dimensional parameter space—thousands or millions. It is difficult to explore such distributions due to the high dimensionality and the need to solve the 3D poroelasticity system (1)–(3) to evaluate the posterior at each point in parameter space. Here in Part I, we seek to determine the MAP point, which as we will see, amounts to a deterministic inverse problem with specially-chosen weighting matrices. We employ adjoint-based derivatives, the Gauss–Newton method, and early termination of conjugate gradient (CG) iterations at each Gauss–Newton step to ensure the efficiency and scalability to high parameter dimensions for solving this deterministic inverse problem.

The MAP point, , is the point that maximizes the posterior distribution (10). We can define the parameter-to-observable map, , as follows: , where the state variables depend on the permeability parameter vector through the forward problem (7), and is the linear observation operator that extracts the observations from , i.e. is the surface displacements at times and locations of the observed data. Thus we can rewrite the the posterior distribution (10) as:

exp | (16) |

Maximizing the posterior distribution with respect to the parameter is equivalent to minimizing the negative log posterior. Therefore, our goal is to minimize the objective function:

(17) |

where depends on through the solution of the forward problem . It can be seen that acts as a regularization operator in the deterministic problem (17)[Tarantola (2005)]. To derive the gradient using the adjoint method, we construct a Lagrangian by adding to the objective function the inner product of the left-hand side of the state equation with a Lagrange multiplier (also called the adjoint variable) . In the Lagrangian formulation, is considered an independent variable, and the dependence of on the parameters is enforced through the Lagrange multiplier term.

At a minimum of (17), the partial derivatives of the Lagrangian with respect to the state variable, , the parameter, , and the adjoint variable, , vanish. We set and to zero, which gives the state equation and the adjoint equation respectively. We satisfy these two conditions by solving the discretized state and adjoint equations exactly for any value of . Thus we seek the parameter that ensures that the gradient vanishes. We solve the equation using Newton method. The details of forming the Lagrangian and constructing the Newton iteration are left to B.

The Newton system we need to solve at each iteration is:

(18) |

where is the gradient and is the Newton direction. The superscript we use for the matrices, and the forward and the adjoint solutions indicates that these matrices and vectors are evaluated at the iteration MAP point approximation, . The Hessian is given by

(19) |

The operator is the partial derivative (sensitivity) of the residual with respect to the parameter

(20) |

and the other matrices are defined as follows

The direction is used to update the approximation of the MAP point as follows:

(21) |

where is a step length parameter chosen to satisfy the Armijo rule, to guarantee a sufficient reduction in the objective function [Nocedal Wright (2006)]. Backtracking line search is used to reduce the step size until the objective function is sufficiently reduced. The conjugate gradient method with early termination is used to solve the linear system (18) at each Newton iteration. This CG iteration is also terminated when a direction of negative curvature is detected, which ensures that the direction is a descent direction [Nocedal Wright (2006)].

The full Hessian (19) can be approximated by the Gauss–Newton Hessian , in which all matrices depending on the adjoint variable , i.e. , , and , are neglected. The reasoning behind this is that, when the data misfit is zero (i.e. when the model fits the data), the right hand side of the adjoint equation (39) vanishes, and since the equation is linear in the adjoint variable , it vanishes. Thus when the data misfit is small, the Gauss–Newton approximation can be expected to be accurate. The resulting approximation is guaranteed to be positive definite for appropriately chosen regularization, is computationally less expensive, and is expected to converge super linearly near the optimal point, , in the zero-noise case. However, if the noise level is high, we can switch to the full Newton Hessian after taking some initial iterations with . The GN Hessian is given by

(22) |

Forming the GN Hessian (and similarly the full Hessian) explicitly each Newton iteration (18) is computationally intractable. It requires applying the operator to each column of to form the matrices product , each Newton iteration . Applying the operator to a vector amounts to an incremental forward poroelasticity solve, equation (45). Therefore, forming the GN Hessian explicitly costs poroelasticity PDEs solves (note that no additional solves are required for forming the product since it is the transpose of the product ). When using CG, there is no need to form the GN Hessian in each Newton iteration explicitly. What is required, in each CG iteration , is computing the matrix vector product , where is the conjugate gradient direction and is the prior covariance matrix which is commonly a favorable choice as a preconditioner for the Hessian. The main cost of computing this product is the cost of applying the operators and each to a vector wich is the cost of two poroelasticity PDEs solves (one incremental forward (45) and one incremental adjoint (46)) each CG iteration. We emphasize that it is often the case that the number of CG iterations required in each (Gauss–)Newton iteration is much smaller than, and independent of, the number of parameters () in many PDE-based optimization problems including those governed by poroelasticity, as has been proven for some particular problems and observed numerically for a number of others [Bashir . (2008), Flath . (2011), Bui-Thanh Ghattas (20122), Bui-Thanh Ghattas (2013), Bui-Thanh Ghattas (20121), Bui-Thanh . (2012), Bui-Thanh . (2013), P. Chen . (2017), Alexanderian . (2016), Alexanderian . (2017), Alexanderian . (2014), Crestel . (2017), Petra . (2014), Isaac . (2015), Martin . (2012), Bui-Thanh Ghattas (2015), Hesse Stadler (2014)]

. As a consequence of ill-posedness, a typically small and parameter dimension-independent number of CG iterations is required, corresponding to the dominant eigenvalues of the prior preconditioned (Gauss–)Newton Hessian that are distinct from those that cluster around one. For sufficiently large CG tolerance, the eigenvector components of the error corresponding to the remaining eigenvalues clustered around one can be eliminated in

CG iteration [Nocedal Wright (2006)]. Furthermore, in early Newton iterations when the MAP point approximation is away from the true solution, solving the linear system arising in the Newton iteration requires even fewer CG iterations than the number of the dominant eigenvalues due to imposing Eisenstat-Walker tolerance that prevents oversolving the linear system [Eisenstat Walker (1996), Nocedal Wright (2006), Villa . (2019)].## 3 Nevada Test Case

In this section, we first review an aquifer pumping test conducted in 2003 by BurbeyWarnerBlewittEtAl06 near Mesquite, Nevada. We then introduce the InSAR and GPS data used to measure the aquifer deformation induced by this test. We finally describe the aquifer model, and how we set up the Bayesian inversion framework to infer the lateral permeability variations of this aquifer.

### 3.1 The Study Site

The Nevada study site is located southeast of Mesquite, Nevada, in the Mesquite basin of the lower Virgin River valley (Figure 1*a* and 1*b*). The aquifer is meter thick, which is confined by a brittle unsaturated layer at the top and a clay layer at the bottom (Figure 1*c*). Hydraulic head records suggest that the regional GW recharge is due mainly to winter precipitation in the surrounding mountains, and hence is negligible during the duration of the well test [Burbey . (2006), Burbey (2008)].

A controlled aquifer test was performed at a newly installed municipal well WX31 between May 7, 2003 and July 9, 2003 [Burbey (2006)]. The well was screened over a meter interval in the deeper portion of the confined aquifer, and produced an average of cubic meters of water daily. The pumping continued until October, 2003. Note that the aquifer was isolated from poromechanical stresses exerted by pumping activities in adjacent wells. Over the test period and within a radius of km from well WX31, only a single well, located 4 km toward the north-northwest of well WX31, was in production and its radius of influence did not intersect with that of WX31 [Burbey . (2006)].

Surface deformation caused by pumping was measured daily at 10 high-precision GPS stations between April 30, 2003 and July 9, 2003. Using these measurements, Burbey08 estimated the mean hydraulic conductivity, Poisson’s ratio, and shear modulus of the aquifer. They further postulated the existence of a fault 1.1 km northwest of the well based on an EnviSAT interferogram that spans May 4, 2003 and September 5, 2004.

The relatively simple geologic setting along with the availability of the GPS and InSAR deformation data make the Nevada site an attractive test case for Bayesian framework proposed in section 2.2.

### 3.2 InSAR and GPS Data

We generate an interferogram spanning May 4, 2003 and October 26, 2003 to measure the LOS deformation caused by the 2003 aquifer test. The LOS unit look vector is . We observe a subsidence bowl southeast of the well WX31 and an uplift bowl northwest of the well (Figure 1*d*). This asymmetrical pattern suggests the possible existence of a fault as reported in Burbey08. Our interferogram shows similar deformation patterns as the one that spans May 4, 2003 and September 5, 2004 [Burbey (2008)]. This suggests that little deformation occurred after the end of the well test between October 26, 2003 and September 5, 2004, confirming that the regional GW recharge is minimal.

We select a high phase-correlation () block bounded by the gray outline in Figure 1*d* as the input for the inversion. In this 8-by-7 region, there are no visible orbital errors, ionospheric artifacts, or phase unwrapping errors that impact the deformation estimates. DEM errors and tropospheric errors are not substantial, because the study site is relatively flat with a dry desert climate. The dominant error source here is due to the change of surface scattering properties between the two SAR acquisitions. This phase noise (known as phase decorrelation) is not correlated in time or space [Zebker Villasenor (1992)]. We estimate the phase decorrelation noise level ( 3.2 mm) directly from the data and use it as in equation (15).

We use the lateral displacements as recorded at 6 GPS stations during the first 22 days of the 2003 pumping test (Figure 1*e*) as an independent validation of the InSAR-based permeability estimates. These GPS stations are closer to the well, and little deformation was recorded at the other 4 stations that are further away.
The joint inversion of GPS and InSAR data sets as well as their relative merits will be discussed in a future article.

### 3.3 The Aquifer Model

We model the aquifer site as an 885 m deep layered poroelastic medium (as shown in Figure 1*c*) over the region bounded by the red polygon in Figure 1*a*. We assume no GW flux at all boundaries, zero displacement at the lateral boundaries, zero normal displacement at the bottom boundary, and a traction-free top surface. We set the initial deviation from the hydrostatic pressure before the start of the pumping, , to zero. We model the groundwater extraction as a volumetric sink term, in equation (1), located in the segment of the well coinciding with the lower half of the aquifer (the blue segment of the well in Figure 1*c*). Following Burbey08, the aquifer parameters are listed in Table 1.

We set up the Matérn class prior Gaussian field defined in equation (13), which yields permeability samples with the following properties: (1) on average, at any given point in space, the permeability values vary from the prior mean listed in Table 1*b* by a standard deviation (SD) in decimal logarithm; (2) the permeability samples correlate anisotropically in space, with a correlation length km in the lateral direction and an order of magnitude larger correlation length in the vertical direction. The large vertical correlation length suppresses vertical variations that are not informed by the surface deformation data. Therefore, the estimated permeability should be interpreted as the vertical average of the permeability field. This average is a useful up-scaled representation of the aquifer permeability, because the GW flow is predominantly horizontal and along the unresolved fine-scale horizontal layering in the aquifer [Cherry Freeze (1979)]. In section 5.2, we show that the inverse solution is robust irrespective of a wide range of SD and in the horizontal direction.

Volumetric force () | 0 |
---|---|

Pumping rate () | |

Fluid viscosity () | |

Biot-Willis coefficient () | |

Height of the domain () | m |

Water density () | |

Gravitational acceleration () | |

Water compressibility () |

[0.3in] Units Aquifer Confining layer Lower clay Poisson’s ratio () - Drained shear modulus () Pa Specific storage () Prior mean value () (for ) -11.2 -14.2 -14.2

We generate two unstructured tetrahedral meshes of differing resolution for our computational domain using Gmsh [Geuzaine Remacle (2009)]. The number of nodes in these two meshes, and hence the number of DOFs approximating the log permeability , are and (Figure 2*a*). The overall number of DOFs at each time step of the discrete system state variables (, and ) are and , respectively. This system is built using the discretely-mass-conserving FEM described in section 2.1. The -node mesh provides sufficient accuracy for our application and therefore we use this mesh in all results presented in this article except where noted. We construct the mesh so that the elements are refined near the well location and we ensure no two GPS station locations belong to the same element. To avoid excessive mesh refinement, we do not mesh to the exact well radius but rather to a cylindrical volume of radius 7 m that has the same axis as the well. The sink term is distributed over that volume such that it integrates to the pumping rate. We discretize the temporal evolution into 154 variable-length time steps over a period of 175 days. At the start of the simulation, the time step is set to 1.2 hours to accurately capture the rapid changes after the onset of pumping. The time step length increases gradually to up to 5-day time interval toward the end of the simulation.

We implement the forward model, adjoint system, and gradient and Hessian evaluations using the Python-based FEniCS library [Logg . (2012)] for FEM discretization in space. We use the hIPPYlib library for state-of-the-art Bayesian and deterministic PDE-constrained inversion algorithms and prior construction [Villa . (2018)]. Using this implementation we infer the log-permeability MAP point, i.e. the point in the parameter space that maximizes the posterior distribution, from the InSAR data. Based on this inferred permeability field, the forward model can simulate a three dimensional evolution of the pore pressure in response to the pumping test (e.g. Figure 2*b*) and the expected deformation in east, north, and vertical directions at each time interval. We then compare this model-based lateral deformation with the GPS data set for validation of the inverse solution.

## 4 Numerical Results

In section 4.1, we present results of applying the Bayesian inversion framework to infer the permeability field of the Nevada test case described in section 3. Section 4.2 demonstrates the superiority of Newton-type methods over steepest descent for these poroelastic inverse problems.

### 4.1 Characterizing the Aquifer Permeability Using InSAR Data

We solve for the log-permeability MAP point, , using the Envisat interferogram that spans May 4, 2003 and October 26, 2003 (Figure 1*d*). The two-dimensional horizontal slice of the reconstructed permeability, , at mid-aquifer depth (Figure 3*a*) reveals distinct features in the permeability field: A high permeability channel extending from south/southwest of the well (marked by red star) to the west; and a low-permeability flow barrier northwest of the well.

Using this InSAR-inferred permeability field and the forward model, we simulate the expected LOS deformation that occurred between May 2003 to October 2003. This reconstructed LOS displacement accurately captures the subsidence bowl due to the pumping test. In most of the region, the discrepancy between the InSAR data and the simulated LOS deformation (Figure 3*c*) is roughly random noise of magnitude within the estimated noise level. We observe a relatively large discrepancy in the northwest, likely due to the existence of a fault [Burbey (2008)] that is not accounted for in our aquifer model. In section 5.3, we discuss the impact of the model error in our inversion solution.

Note that the simulated LOS deformation is computed from the simulated east, north, and vertical deformations (Figure 3*d*–3*f*) following equation (14). To validate our inversion results, we further compare the simulated lateral deformation with the lateral displacements as recorded at 6 GPS stations (Figure 3*g*). We confirm that the reconstructed lateral displacements capture the GPS-observed nontrivial southeast-trending surface deformation during the first 22 days of the 2003 pumping test. It is worth noting that the northward component of these GPS-observed lateral displacements is well-approximated by our model despite the fact that the northward contribution to the LOS deformation is negligible (). This can be achieved only when a high fidelity physical approximation—the 3D linear poroelasticity assumption—is incorporated into the inversion framework.

### 4.2 Convergence of the (Gauss–)Newton Method

The (Gauss–)Newton conjugate gradient method, described in section 2.2.4, is known to converge rapidly when applied to optimization problems for which the data misfit Hessian effective rank is relatively small and independent of the discretization dimension [Heinkenschloss (1993)]. As has been mentioned in section 2.2.4, this has been demonstrated either analytically or numerically for a wide spectrum of geophysical inverse problems. Although this property of the Hessian has not been shown analytically for poroelastic inverse problems, our numerical results confirm previous work by HesseStadler14 that demonstrated the computational advantage of Newton-type methods, suggesting rapid decay of the eigenvalues of the data misfit Hessian for this problem.

To verify the dimension-independent convergence of (Gauss–)Newton–CG iteration, in the form of equation (18), we solve for the permeability MAP point for both our coarse and fine meshes and compare their convergence in the blue and green curves in Figure 4*a*). In both cases, the (Gauss–)Newton–CG method converges in a small number of iterations (less than 60). Even though the fine mesh has four times the number of parameter DOFs as the coarse mesh, the number of iterations and PDE solves required for both meshes are comparable (Table 2). This demonstrates that the algorithm is dimension independent and hence scalable to larger problems. This is essential if the InSAR-based Bayesian poroelastic aquifer characterization is to be applied to GW management at the basin scale [J. Chen . (2016), J. Chen . (2017)]. Additionally, the inferred permeability fields in both the coarse mesh (Figure 3*a*) and the fine mesh (Figure 4*b*) are consistent. Note that the method was set to use the Gauss–Newton Hessian in the first 50 iterations, then switch to using the full Newton Hessian in the remaining iterations.

We also demonstrate the superiority of the (Gauss–)Newton–CG method’s convergence over the classical steepest descent method on the coarse mesh (red line in Figure 4*a*). As shown in Table 2, the steepest descent method terminated after 18 days of runtime (due to hitting the iteration limit of ) without converging. In contrast, the (Gauss–)Newton–CG method converged in hours, giving an apparent speedup of (but the true speed up would be much larger had we allowed steepest descent to run to convergence).
Comparing the number of PDE solves in each case, we see that steepest descent method required 30 times the number of PDE solves required by the (Gauss–)Newton–CG method and yet did not converge (Table 2).

We point out that part of the computational savings obtained when using the (Gauss–)Newton–CG method are due to the cost of a PDE solve for the steepest descent method being greater than that for the (Gauss–)Newton–CG method in our implementation. On average, a PDE solve requires 20 seconds in the former case, while requiring only 7 seconds for the latter case on the coarse mesh. This is a result of using a direct solver for the linear poroelasticity equations. When using the GN method, the direct solver’s LU factors can be computed once at the GN iteration, stored, and reused in all the CG iterations required at that GN iteration. This is because the same incremental forward and adjoint PDEs are solved at each CG iteration (equations 45 and 46). On the other hand, for the steepest descent method, the parameter changes in each iteration and thus a new factorization of the poroelasticity operator is required at each steepest descent iteration.

Method, | Gradient | PDE solves: | Num. of global | Num. of total | Total |

mesh | reduction | forward (fwd), | iterations | CG it. | time |

factor | adjoint (adj) | ||||

GN, | fwd: , | 50 | 4.5h | ||

coarse | adj: | ||||

GN, | fwd: , | 55 | 22.3h | ||

fine | adj: | ||||

SDM, | fwd: , | - | 18d | ||

coarse | adj: | ||||

The number of forward and adjoint PDE solves are given as: number of forward (or adjoint) poroelasticity solves number of incremental forward (or adjoint) poroelasticity solves. |

## 5 Discussion

In this section, we discuss how the essentially arbitrary pixel spacing of the InSAR data can be treated systematically in the Bayesian setting, without the introduction of artificial weights. We then show that the inferred MAP point of the permeability field is not appreciably affected by the choice of prior. We further show that the discrepancy between the model and the data beyond the noise identifies a region where our model exhibits structural error. We conclude this section with a brief overview of Part II of this study, which focuses on MCMC sampling of the Bayesian posterior, building on tools developed in the present article.

### 5.1 Multi-looking InSAR Data with Adjusted Noise Estimation

Here we study the sensitivity of the inverse solution to the essentially arbitrary pixel spacing of the InSAR data. We downsample the original InSAR data multiple times (Figure 5). Each level of downsampling is performed by averaging adjacent two-by-two pixels into a single pixel. This is known as multi-looking, which reduces the InSAR phase decorrelation noise by a factor of two. We incorporate this noise adjustment in (15). Figure 6 shows the resulting inferred permeability MAP point at mid-aquifer depth using each of the six multi-looking InSAR data. When the number of InSAR pixels is sufficient to resolve the main deformation features (Level 0-3), the inferred permeability solutions are similar. Hence we conclude that the InSAR data redundancy can be addressed in a fully Bayesian setting, i.e., without introducing artificial weights, by treating the data noise appropriately. However, if InSAR data are too coarse to resolve the main deformation features (Level 4-5), a deterioration in the validity of the inferred permeability field is expected.

### 5.2 The Role of the Prior

The limited data available for aquifer characterization has placed emphasis on methods to inform the characterization with prior geostatistical knowledge [Goovaerts (1997), Deutsch Journel (1998), Wackernagel (2010)]. A large number of methods that allow imposition of additional knowledge from small-scale horizontal layering to the connectivity of large-scale fluvial channels [Remy . (2009), Mariethoz Caers (2014)] have been developed. Unfortunately, most of these methods are not compatible with the scalable framework used here that requires a sparse prior precision operator. However, we show below that InSAR data are sufficiently informative that dense geostatistical priors are not required.

Here we limit ourselves to the class of Matérn covariances discussed in section 2.2.2. These priors have the typical parameters of a semi-variogram [Deutsch Journel (1998)], the range, , and the sill (Figure 7*a*). The sill is the square of the pointwise marginal standard deviation . Due to the boundary effects, the exact sill cannot be determined a priori and is hence difficult to control exactly in Matérn priors. Due to the continuous nature of the prior there is no nugget. In addition, the Matérn prior also allows us to specify a mean, which is taken from Burbey08. Two-dimensional horizontal slices of prior samples with SD and increasing range from to , are shown in Figure 7*b* and 7*c*, respectively. We also note that in infinite dimensional inverse problems the prior must act as a regularization. Table 3 notes priors that do not regularize the inverse problem sufficiently so that MAP point estimate does not converge to the required tolerance.

SD | SD | SD | Converged | Gradient | ||||

min. | max. | avg. | (km) | reduction | ||||

1.1 | 3.0 | 1.4 | 2 | 300 | yes | |||

2.3 | 6.1 | 2.8 | 2 | 600 | no | |||

0.59 | 1.5 | 0.7 | 2 | 150 | yes | |||

1.1 | 3.0 | 1.3 | 1 | 300 | no | |||

1.2 | 3.0 | 1.6 | 4 | 300 | yes |

^{2}norm of the gradient by 4 orders of magnitude.

The high spatial resolution of the InSAR data provides good constraints on the lateral permeability variation in the aquifer. This can be seen in Figure 8, which shows the MAP points for each of the five prior choices in Table 3. Despite large ranges in both and SD, the main features of the MAP point remain almost unchanged. This includes both the high permeability channel south of the well and the low-permeability region northwest of the well. Of course, the amplitudes increase with SD and small features are lost as increases, but main pattern is independent of the prior. This demonstrates that the MAP point is dominantly informed by the data and that the effect of the prior on the inferred permeability variation is small.

This highlights the step change in the data availability for the characterization of lateral aquifer heterogeneity that is provided by InSAR. Given upcoming missions with improved accuracy and increased frequency, InSAR-derived surface deformation measurements will provide a powerful and low-cost means of aquifer characterization.

### 5.3 Assessment of the Model Error

In section 4.1, we showed that the discrepancy between the model and the data is roughly a random noise signal of magnitude within the estimated noise level, except in the northwest corner of the domain
(Figure 3*c*). Burbey08 have postulated the existence of a fault in this area, which is not accounted for in our aquifer model, presented in section 3.3. Here we investigate whether the relatively large discrepancy in the northwest corner is due to this structural error in the model.

Introducing the actual fault and the associated mechanics is beyond the scope of this article. Instead, we explore the possibility of model error by studying a synthetic inversion. For this model-error-free scenario we regard the permeability field inferred from the real InSAR data (Figure 3*a*) as the “true” permeability. Using this truth permeability field, a synthesized InSAR LOS displacement map is generated from the model-predicted LOS observations over the same period and area as for the actual InSAR observations. These synthesized observations are then polluted by a normally-distributed random noise of the same magnitude assumed for the real InSAR data, mm (the synthesized data are shown in Figure 9*a*).

We then use the synthesized LOS data to solve the inverse problem of estimating the permeability MAP point. The overall pattern of the resulting permeability MAP point (Figure 9*b*) is very similar to those of the “true” medium (Figure 3*a*). There is, however, a notable decrease in the magnitude of the permeability variation from the prior mean, due to the regularizing action of the prior.

Using the permeability field inferred from the synthetic data (Figure 9*b*), we solve the forward problem and reconstruct the LOS observations, . The error in these reconstructed LOS observations (Figure 9*c*) is within the noise everywhere in the observed area. This demonstrates that we should be able to reconstruct the LOS observations to within data noise in the absence of model error. This suggests that the inability of the poroelatic model to reconstruct the LOS observations in the northwest (Figure 3c) is likely due to a number of possible simplifications in the model, including the presence of a fault.

Again this demonstrates the high information content of the InSAR data. Not only do they provide detailed constraints on the lateral permeability variation, they also clearly highlight regions where the model likely has structural errors. This allows the targeted improvement of the aquifer model.

### 5.4 Exploration of the posterior distribution (Part II)

Our focus in part I of this two-part series of articles is on estimating the permeability maximum A posteriori point. In the preceding discussion, we have paid careful attention to matters arising in the Bayesian formulation of this inverse problem, including treatment of multilooking data sets, the construction of the prior and its influence on the MAP point, and analysis of model error. Our ultimate goal is to explore the posterior distribution and estimate the expected value along with the associated uncertainties of various quantities of interest. Estimating the MAP point is a critical first step because of its intrinsic importance as the most likely permeability realization. Furthermore, the MAP point can be used to create proposals for MCMC sampling algorithms required to explore the posterior distribution [Petra . (2014), Martin . (2012)]. Since this discussion of estimating the MAP point is self-contained and draws important scientific conclusions about applying our method to the Nevada test case and the informativeness of InSAR data, and to avoid an unnecessarily lengthy presentation, we defer the discussion of the uncertainty quantification and MCMC sampling to part II.

## 6 Conclusion

We have shown that InSAR-based surface deformation measurements provide detailed information about lateral variations in aquifer permeability. In the Nevada test case, the surface deformation due to a single well allowed us to identify both high permeability pathways and flow barriers. Due to the high spatial resolution of the InSAR data, the main features of the inferred MAP permeability field are essentially independent of prior assumptions. The inferred permeability field also allows us to reconstruct the lateral deformations recorded at six GPS stations near the well. This provides validation, because the GPS data were not used in the inference of the permeability field.

The above conclusions are derived from a single interferogram that captures the total subsidence due to the well test. Given the increasing availability of InSAR data from Sentinel-1 and upcoming NISAR missions, future aquifer characterization will be able to utilize higher quality and more frequent surface deformation measurements. We therefore believe that geodetic surface deformation measurements will dramatically increase the information for aquifer characterization. Scalable and robust inversion frameworks that integrate InSAR data into poromechanical aquifer models are likely to become critical tools for regional groundwater management.

The Bayesian inference framework was applied to a three-dimensional transient multi-physics problem with real data that requires the inference of up to parameters. This requires dimension invariant methods and was achieved by exploiting the compact nature of the parameter-to-observable map, adjoint-based derivatives, and Gauss–Newton–CG method, the combination of which capture the intrinsic low dimensionality of this inverse problem and permit rapid, dimension-independent convergence. The power of these methods is available through the Python-based Bayesian inverse problems library hIPPYlib. The scalability of this framework provides the basis for Bayesian uncertainty quantification, based on Markov chain Monte Carlo sampling, which we will explore in Part II of this two-part series of articles.

## Appendix A Variational Formulation and Discretization Details of the 3-Field Formulation of the Biot System

For the variational form of the system (1)–(3) to be well-defined, we assume the state variables , and belong to the following infinite dimensional function spaces (note that these are regularity assumptions in space, we additionally require sufficient regularity in time):

(23) | ||||

(24) | ||||

(25) |

respectively, where , , or is the dimension of the physical domain. The space is defined as .

To obtain the weak form of the system (1)–(3), we multiply the equations (1), (2) and (3) by test functions , and , respectively, where the function spaces , , and are defined as follows:

(26) | ||||

(27) | ||||

(28) |

We then integrate the three equations in space over the domain . We discretize in time using backward Euler method. The variational problem is therefore the problem of finding , and that satisfy the following equations:

(29) | |||

(30) | |||

(31) |

for all the choices of test functions , and and for time steps . The functions and are the initial values for the pressure and the displacement, respectively. Requiring that , , , and is necessary for the well-posedness of the weak problem. We follow [Ferronato . (2010)] in our choice of the finite element function spaces in which the pressure is approximated by piecewise constant function in and the velocity is approximated in the lowest order Raviart-Thomas space by the function in . The displacement is approximated by first-order Lagrange polynomials, function in . This choice of elements results in continuity of normal velocity across the elements facets and hence guarantees discrete mass conservation.

The discretized state functions can be written as follows (we adopt some of the notation from [Ferronato . (2010)]):

where is given by:

is a vector of the DOFs of the displacement in , followed by the -displacement DOFs and then by the -displacement DOFs. The integer is the number of nodes. The basis functions are given by the formula:

where is the position vector of the node associated with the basis function and is the position vector of the normal projection of on the plane containing the face opposite to the node in the tetrahedron . The integer is the number of the tetrahedra that share the node and the set of indices is the global indices of those tetrahedra, where is the number of tetrahedra in the mesh. The piecewise constant approximation of the pressure can be written as , is the vector of the basis functions , is the DOFs of the pressure, and the basis functions are given by:

where is the tetrahedron associated to the basis function, . The lowest order RV approximation of the velocity is given by the following expression: , where is the number of faces, is the vector of the basis functions, and is the vector of the velocity DOFs. The vector-valued basis functions are given by:

where are the indices of the tetrahedrons that share the face, and is the position of the node that is not shared by face in the tetrahedron . The conventional choice of the sign is such that the vector points outward the element with smallest index .

We substitute the approximated functions in the weak form (29)–(31) to obtain the discretized system. The resulting linear system of equations that we need to solve each time step is as follows:

(32) | ||||

(33) | ||||

(34) |

where the subscript denote that the vector is evaluated (or to be evaluated) at time . We combine the equations (32)–(34) in one system as follows:

(35) |

where

Comments

There are no comments yet.