Full waveform inversion with unbalanced optimal transport distance

04/10/2020
by   Da Li, et al.
0

Full waveform inversion (FWI) is an important and popular technique in subsurface earth property estimation. However, using the least-squares norm in the misfit function often leads to cycle-skipping artifacts, increased nonlinearity of the optimization problem and local minima. Several methods that apply optimal transport distances to mitigate this problem have been proposed recently. The optimal transport distance is to compare two positive signals with equal mass. To overcome the mass equality limit, we introduce an unbalanced optimal transport (UOT) distance with Kullback-Leibler divergence to balance the mass difference. An entropy regularization and a scaling algorithm is used to compute the distance and its gradient efficiently. Two strategies of normalization methods which transform the seismic signals into non-negative functions are compared. Numerical examples in one and two dimension are solved to demonstrate the efficiency and effectiveness of the new method.

READ FULL TEXT VIEW PDF

Authors

page 13

page 15

page 16

page 17

page 18

10/20/2016

Regularized Optimal Transport and the Rot Mover's Distance

This paper presents a unified framework for smooth convex regularization...
01/31/2020

Optimal Transport Based Seismic Inversion: Beyond Cycle Skipping

Full waveform inversion (FWI) is today a standard process for the invers...
04/06/2019

Automatic Target Recognition Using Discrimination Based on Optimal Transport

The use of distances based on optimal transportation has recently shown ...
06/08/2021

Unbalanced Optimal Transport through Non-negative Penalized Linear Regression

This paper addresses the problem of Unbalanced Optimal Transport (UOT) i...
02/08/2020

A data-driven choice of misfit function for FWI using reinforcement learning

In the workflow of Full-Waveform Inversion (FWI), we often tune the para...
12/20/2016

Quantum Optimal Transport for Tensor Field Processing

This article introduces a new notion of optimal transport (OT) between t...
07/21/2013

Regularized Discrete Optimal Transport

This article introduces a generalization of the discrete optimal transpo...
This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

1 Introduction

Full waveform inversion (FWI) is a high resolution seismic imaging technique that is designed to recover subsurface structure and physical properties from seismic data. It was proposed by Lailly [16] and Tarantola [26] in the early 1980s and has been gaining popularity with the improvement of the computing capacity. From mathematical point of view, FWI is a nonlinear PDE-constrained optimization problem with subsurface physical properties such as velocity and density as the control parameters, and the waveform received by the receivers as the state parameters [14]. Depending on different physical model, the constraint PDE can be wave equation, acoustic wave equation or elastic wave equation with proper boundary conditions or techniques to simulate wave propagating in an unbounded domain [14]. Because of the huge size of the problem, gradient based optimization methods such as gradient descent, nonlinear conjugate gradient method, l-BFGS and Newton’s method are usually implemented. The gradient of misfit function generally can be calculated by the adjoint state method [22].

In conventional methods, the distance is used in the misfit function during optimization to minimize the difference between observed and synthetic data [14, 27]. As a nonlinear optimization problem, FWI algorithm suffers the existence of local minima. One of the reasons causing the local minima is the cycle-skipping artifact, which occurs as the phase difference between two seismic signals is larger then half of wavelength [29]. To mitigate this problem, the optimal transport (OT) distance or so called Wasserstein distance has been introduced to describe the difference between seismic signals [11] and later it has been applied to the seismic imaging [13, 32] and FWI problem in [10]. Although it requires certain prerequisites such as mass conservation and normalization, the OT distance has the convexity property with respect to dilations, and amplitude changes in signals [10] which is one of the main motivations of introducing OT distance to the FWI problem.

The optimal transport distance is designed to describe the difference between two positive finite measures with equal total mass. Despite of the promising properties, the seismic signal is oscillating around and usually the condition of equal mass is not satisfied. Several works have been proposed to overcome those restrictions and integrate the OT distance to FWI problem. In the first strategy, the non-negative and equal mass restrictions are overcome by connecting the 1-Wasserstein distance to KR norm [5] with the dual form of the Kantorovich problem. And then the distance is computed by a proximal splitting strategy called simultaneous descent method of multipliers (SDMM) [19, 18]. In [33], the 1-Wasserstein distance is evaluated through the dynamic formulation [1], and then solved by a primal-dual hybrid gradient method (PDHG) with line search. Another strategy is to normalize the seismic signals into positive functions with equal mass first, then compute the OT distance. In [23, 24, 31, 30], the seismic signals are normalized into positive functions with equal mass through normalization methods such as linear, quadratic and exponential functions. Then the 2-Wasserstein distance between seismic signals can be evaluated either through a trace-by-trace technique or through numerically computation of Monge-Ampère equation for 2D seismogram case.

To overcome the equal mass limitation, the unbalanced optimal transport (UOT) problem is raised in [3] based on a dynamic approach. Later several works have been proposed in both static and dynamic approaches [21, 6, 7]. In this paper we introduce the UOT distance mainly based on the work in [7] to the FWI problem. Linear and exponential normalization functions have been studied to overcome the non-negative restriction. Numerical examples show that the UOT distance leads to a more convex behavior with respect to time shift of signal under proper normalization compared to the distance and 1-Wasserstein distance in [18] and [33]. Compared to the 2-Wasserstein trace by trace strategy in [30], the UOT distance provides more consistency in the adjoint source due to the normalization only working on one restriction. And during the optimization procedure, the misfit function with UOT distance provides more accurate gradient compared to the distance when a poor initial model is used.

The rest of this paper is organized as the follows. In section 2, we first give a general review of optimal transport problem and unbalanced optimal transport problem. Then we introduce a scaling algorithm to compute the UOT distance and its derivative. Section 3 gives the methodology of signal normalization and the application of UOT distance to the FWI problem. Four numerical examples are shown in section 4 on the UOT distance and the application to the FWI problem.

2 Unbalanced optimal transport distance

In this section, we provide a short review of the optimal transport problem in the discrete case. The definition of unbalanced optimal transport problem with Kullback–Leibler divergence is provided. A scaling algorithm is used to compute the unbalanced optimal transport distance and its gradient with respect to the signal.

We first define the indicator function between vectors

and is defined as:

(2.1)

The entropy function of a matrix with is defined as:

(2.2)

and here we use the convention . The Kullback-Leibler divergence between vectors and is given by:

(2.3)

Similar to the vector case, the Kullback-Leibler divergence between matrices with and is:

(2.4)

here we use the convention and , .

2.1 Optimal transport and unbalanced optimal transport problem

The optimal transport problem has a long history and dates back to the 18th century [20]. The modern formulation is given by Kantorovich [15]. Please refer to [28, 25] for a comprehensive review.

Begin with two compact subsets , and the cost function measuring the distance between and

. For two probability measures

and , the Kantorovich formulation of optimal transport problem is defined as

(2.5)

where is the set of joint probability measures on ,

(2.6)

the and are the projection operators to and . When the cost function is defined on the space as , the -Wasserstein distance between and is:

(2.7)

We focus on the discrete setting in this work. Let , , . Consider the cost function be the squared Euclidean distance. The optimal transport problem in the discrete form is:

(2.8)

Here matrix represents the ground cost distance defined by . Similar to the measure case, the -Wasserstein distance between vector is defined as:

(2.9)

Since and are the density functions of probability measure and , the equal mass condition is satisfied intrinsically. The unbalanced optimal transport problem is a generalization of the optimal transport problem to overcome the equal mass restriction between and . The unbalanced optimal transport problem used in this paper is based on the work in [7]. To relax the marginal constraints in (2.8), the unbalanced optimal transport problem is defined as:

(2.10)

here both and are proper convex functions.

For example, when and are the indicator function as (2.1):

(2.11)

it can be easily checked that the unbalanced optimal transport problem (2.10) coincides with the optimal transport problem (2.8). In this work we consider the case when

(2.12)

Here and are the Kullback-Leibler divergence between vectors shown in equation (2.3) which measures the differences between and , and , and the parameter controls the weight of the mass balancing term in (2.10).

Similar to (2.9), the unbalanced optimal transport distance based on ground cost between vector is:

(2.13)

where .

2.2 Numerical method for unbalanced optimal transport distance

We introduce entropy regularization (2.2) to compute the unbalanced optimal transport distance (2.13). The entropy regularization is first introduced to the optimal transport problem in the work [9] to increase the computation efficiency. Then the optimal transport distance with entropy regularization is used in an optimization problem named Wasserstein Barycenter which provides great improvements compared to distance [8, 2]. Instead of computing the exact unbalanced optimal transport distance (2.13), a regularized distance is introduced in this subsection. Then a coordinate ascent algorithm is used to compute the regularized distance based on the dual formulation. The numerical method in this section is a special case of the comprehensive work in [7].

With the mass balance term and as the Kullback-Leibler divergence between vectors (2.3), given the regularization parameter , the regularized problem of (2.10) can be represented as

(2.14)

The above equation can be rewritten as

(2.15)

where is the Kullback-Leibler divergence between matrices (2.4), and . Then we have the definition of regularized unbalanced optimal transport distance used in this work:

Definition 2.1.

Define the ground cost matrix by . With regularization parameter and mass balancing parameter , the regularized unbalanced optimal transport distance between vectors is

(2.16)

Here is the Kullback-Leibler divergence between two matrices or vectors. And .

Equation (2.16) is denoted as the primal problem. To compute the unbalanced optimal transport distance, the dual problem is needed.

Theorem 2.2.

The dual problem of (2.16) is

(2.17)

Strong duality holds. The minimization is attained for a unique for the primal problem (2.16). And , maximize the dual problem (2.17) if and only if:

(2.18)
Proof.

The proof of this theorem is a straight forward application of Theorem 3.2 in [7]. ∎

A coordinate ascent method can be used to compute the unbalanced optimal transport distance.

Proposition 1.

Given matrix , coefficient and in consistent with Theorem 2.2. Suppose , solves the dual problem (2.17), let with and . For :

(2.19)

The above proposition can be easily checked by computing the first order optimality condition of the dual problem (2.17). The following remark provides an algorithm to compute the unbalanced optimal transport distance with entropy regularization as Definition 2.1.

Remark 2.3.

Fix , cost matrix , regularization parameter and mass balancing parameter . Matrix is defined by . Starting with an initial value , the dual problem can be computed through a coordinate ascent algorithm: for the -th iteration,

(2.20)

Suppose the coordinate ascent algorithm converges with , the transport matrix in (2.16) can be computed as

(2.21)

Also, the gradient of regularized unbalanced optimal transport distance can be achieved through following remark.

Remark 2.4.

Suppose , and solve the primal problem (2.16) and dual problem (2.17), the gradient of regularized unbalanced optimal transport distance with respect to is:

(2.22)

The scaling algorithm to compute the regularized unbalanced optimal transport distance and gradient is given in 1.

Input: , ,
Initialization: , ,
while not converged do
       update vector with
       update vector with
end while
Compute transport matrix with
Compute regularization UOT distance:
Compute gradient:
Return: dist, grad
Algorithm 1 Scaling algorithm for regularized UOT distance and gradient

3 Full waveform inversion and gradient computation

Since the optimal transport problem was proposed for positive measures, normalization is needed before introducing the regularized UOT distance for seismic signals. Normalizations with linear and exponential functions are studied in this paper:

(3.1)
(3.2)

The normalization coefficient is chosen such that in equation (3.1). Here the exponential is taken for each entry of vector . As is small enough the exponential normalization (3.2) should be close to linear normalization (3.1) by the Taylor series. Notice that after the linear or exponential normalization, the convexity with respect to the time shift property of Wasserstein distance demonstrated in [10] is no longer preserved. However, we can still expect similar properties as monotone behavior with respect to the time shift with proper normalization in certain cases. This will be discussed in section 4.1.

In this work we consider the full waveform inversion governed by the 2D acoustic equation with constant density in a discrete form. Suppose is the -th discrete point in the spatial domain, and for the discrete point in time as . Suppose there are sources and receivers in the physical model. Let , be the indexes of sources and receivers.

Consider the 2D acoustic equation with constant density:

(3.3)

Here is the acoustic velocity field, is the -th source, is the wavefield generated by . In practice, the absorbing boundary condition (ABC) [12] or perfectly matched layer (PML) [4] is needed to simulate the seismic wave propagating in a unbounded domain. Here we denote the equation (3.3) with suitable initial conditions and boundary conditions (ABC or PML) in a compact form:

(3.4)

where is the finite difference operator with velocity field .

The full waveform inversion is a PDE-constrained optimization problem in both state (wavefield) and control (velocity) space. Since the acoustic equation (3.3) is well posed with suitable initial and boundary condtions, the FWI problem has a reduced form. Given the observed data and seismic source for , the full waveform inversion is characterized as:

(3.5)
subject to (3.6)

Here is the projection operator that maps the wavefield to the -th receiver. And is the distance used in the misfit function . When is the distance, the misfit function is:

(3.7)

When the regularized UOT distance as defined in Definition 2.1 is used,

(3.8)

where is the linear or exponential functions defined in (3.1) (3.2) with normalization coefficient .

The gradient of the misfit function can be achieved through the adjoint state method [22]. In this case the gradient will be represented as an inner product by the derivative of forward modelling wavefield and adjoint wavefield.

(3.9)

The adjoint wavefields are the solutions of the adjoint equations with time reversed,

(3.10)

For the misfit function (3.7) the adjoint sources are:

(3.11)

As the regularized UOT distance is used in the misfit function, the adjoint sources can be computed through Remark 2.4. For the linear normalization (3.1), the adjoint sources are

(3.12)

And for the exponential normalization:

(3.13)

Here the is the gradient of the first term in UOT distance. Once the gradient is calculated, the PDE-constrained optimization problem (3.5) can be solved by gradient based methods or quasi-Newton methods such as l-BFGS.

4 Numerical Example

We provide four numerical examples to show the different behavior of optimization with distance and unbalanced optimal transport distance with linear and exponential normalization. Example 1 and 2 show that compared to the distance, regularized UOT distance can reduce the cycle skipping artifact efficiently. The FWI problems with direct wave and reflection wave are shown in example 3 and 4, which suggest that when a poor initial model is used, the UOT distance has better performance thant the distance.

4.1 Shifted Ricker example

We investigate the behavior with respect to time shift of Ricker wavelets with distance and regularized UOT distance in this example. To compare with different distance, we define a misfit function:

The distance represents distance, regularized UOT distance with linear and exponential normalization. Here and are two Ricker wavelets centered at time s and peak frequency 10 Hz, the amplitude of is times of . The sample frequency is Hz. The case of and is shown in Figure 1.

We fix as the reference signal, then move the center of along time axis from s to s. The normalized of distance, regularized UOT distance with linear and exponential normalization has been shown in Figure 2 (a), (c), (e) respectively. In Figure 2 (a), one global minima and two local minima are observed which is a sign for the cycle-skipping artifact. Compared to distance, the cycle-skipping artifact is slightly reduced by the regularized UOT distance with linear normalization as shown in Figure 2 (c). With smaller normalization coefficient , better performance can be achieved. However, can not be less than the absolute value of minimal value of and . In Figure 2 (e), as , the misfit function is similar to the case of (a) and (c). One global minima is obtained with , the misfit function will increase as the difference between center time increase and the cycle-skipping artifact is avoided. Compared to the case (a), the regularized UOT distance can mitigate the cycle-skipping artifact with proper normalization coefficient . Also compared to the previous work in [18, 33], the UOT distance here provides a more convex behavior compared to the 1-Wasserstein distance with respect to time shift.

Adjoint sources of the case in Figure 1 are given in Figure 2 (b), (d), (f). The adjoint source of distance is shown in (b) which is . The adjoint sources of regularized UOT distance with linear normalization are shown in (d). Different than the case, the appearance of adjoint sources behaves like the difference between the envelope of and . Similar results can be found in the study of [30, 31, 33]. Also, as the normalization coefficient decreases, the amplitude of difference between the envelope is increasing. In Figure 2 (f), adjoint sources of regularized UOT distance with exponential normalization are shown. At , the adjoint source is similar to the case in (d), but great distortion happens as . This is the reason in the normalization (3.2), the interval where plays an predominate role in the amplitude of the adjoint source according to the exponential term in (3.13).

Compared to adjoint source, the UOT adjoint sources are more sensitive to the position of wavelets while providing less information of the detail of the wavelets. This behavior brings smaller wavenumber components in the gradient which is achieved through the adjoint state method (3.9) and (3.10). With comprehensive consideration of the behavior of misfit functions and adjoint sources, smaller is encouraged for UOT distance with linear normalization. In the case of UOT distance with exponential normalization, dedicated needs to be chosen to make the amplitude of close to 1. This effort can largely reduce the cycle-skipping issue and cause less distortion in the adjoint source at the same time. For the regularized UOT distance, the behavior of misfit function and adjoint source of both normalization methods are similar as is small enough. One of the explanation is the linear function in equation (3.1) converges to the exponential function in (3.2) as is decreasing. A discussion of different normalization methods can also be found in the work of [31]. We focus on the exponential normalization in the following numerical examples as it provides different behavior of misfit function and adjoint source as is lager.

Figure 1: Ricker wavelet and .
Figure 2: (a), (b): Misfit function and adjoint source using distance. (c), (d): Misfit functions and adjoint sources using UOT distance with linear normalization. (e), (f): Misfit functions and adjoint sources using UOT distance with exponential normalization.

4.2 Single layer model

Due to the large size and nonlinear behavior of the FWI problem, a single layer model example with 2 coefficients is provided for a detailed insight. We investigate a simple 2D FWI problem in a region with km wide and km deep, discretized into grid points. We use 51 receivers at the top of the region, and a source with 5 Hz Ricker wavelet located at km, km. Consider the velocity model:

here is the Heaviside step function along direction. Background velocity km/s. Define the misfit function:

where is defined in (3.5). The true model of the FWI problem is which is shown in Figure 3. We set with grid size and with discrete grid size . As the acoustic wave propagating in the domain, a single reflection wave caused by the velocity perturbation will be recorded at the position of receivers. For different model, the position of the reflector controls the arriving time of the reflection wave, the velocity difference controls the amplitude of the reflection wave. As and are changing, the reflection waves may start interact with each other which will cause the cycle-skipping artifact. We evaluate for each by using distance, regularized UOT distance with exponential normalization, results are shown in Figure 4.

In Figure 4, the z axis is the normalized misfit function . The distance case is shown in (a), the convergence will be trapped in the local minima due to the wrinkles in the surface of misfit function as bad initial values are provided. The regularized UOT distance cases with exponential normalization coefficient are shown in figure (b), (c) and (d) separately. Compared to (a), the surface of misfit functions in (b), (c) and (d) have less wrinkles in the surface of objectives. With larger normalization coefficient, the surface of misfit function is flatter. Notice that the flat areas such as goes smaller and goes larger in all four figures may indicate the existence of local minima and may affect the convergence speed. However, in this example, the regularized UOT distance with linear or exponential normalization still provides a larger region which leads to convergence to global minima. And this is an explanation that the cycle-skipping artifact can be mitigated by the regularized UOT distance.

Figure 3: The velocity model of . The black spot shows the position of seismic source.
Figure 4: (a) Misfit function by using distance. (b), (c), (d): Misfit functions by regularized UOT distance with exponential normalization coefficient .

4.3 2D crosshole model

In this part we perform the full waveform inversion in a 2D crosshole model to investigate the behavior of the full waveform inversion problem with direct wave. The model width and depth are km with grid size km. In the true model, background velocity is km/s, a single circle anomaly is located at the center of the model, with radius km and velocity km/s as shown in Figure 6 (a). There are 11 sources are equally spaced on the left side and 101 receivers on the right side. Synthetic data is generated with the 10 Hz Ricker wavelet and an initial model with homogeneous km/s is used in FWI problem. The homogeneous initial model is shown in Figure 6 (b). We compare the FWI problem by distance and regularized UOT distance with the exponential normalization coefficient in this example.

Figure 5 shows the adjoint sources of distance and UOT distance respectively. The adjoint source generated by the UOT distance provides smooth transitions on the positions of seismic wavefront which will leads the gradients with less large wavenumber components due to the adjoint state method (3.9) and (3.10). And the UOT adjoint source here is more regular than the trace by trace strategy used in [30]. Figures 6 (c), (d) display the inverse results of distance and UOT distance respectively. The l-BFGS method with memory parameter as is used during the optimization process and we proceed iterations to show the directions of velocity model updates. Both results describe the presence of the velocity anomaly. However, the result contains abnormal disturbances at left and right parts of the center which will provides wrong velocity update during future updates. Compared to the result, the UOT result provides more regular shape of the velocity anomaly. This experiment shows the UOT distance can reduce the risk of wrong velocity updates which may cause the iterations to be trapped in the local minima.

Figure 5: (a), (b): The adjoint sources generated by the sixth source in the model with distance, regularized UOT distance with exponential normalization respectively.
Figure 6: (a): The true velocity model. (b), (c), (d): Inverse results of gradient descent after 5 iterations with distance, UOT distance with linear and exponential normalization respectively.

4.4 2D reflection model

In this subsection we compare the difference between the FWI problem based on distance and regularized UOT distance with exponential normalization through a 2D reflection model. As shown in Figure 7 (a), the velocity model is a part of the Marmousi 2 model [17]. The size of the velocity model is with the spatial step size m. There are 11 equally spaced sources and 101 equally spaced receivers located on the surface of the velocity model. The synthetic data is generated by the Ricker wavelet with central frequency Hz as the source function. The sampling frequency is Hz and recording time is s. Perfectly matched layer technique is performed to simulate the seismic wave propagating in a free domain. The initial model is shown in Figure 7 (b) which is a strongly smoothened velocity model of (a). Normalization coefficient of regularized UOT distance is chosen as .

Figure 7: (a): True model. (b): Initial model used in the FWI problem.

The nonlinear conjugate gradient method is performed to minimize the misfit function with both distance and UOT distance. Figure 8 shows the adjoint sources of distance and UOT distance at the first iteration. Similar to the example 1 and example 3, the adjoint source generated by UOT distance concentrate on the location of wavelets and provides smoothed detail on the shape of reflected wavelets. The gradients of first iteration of distance and UOT distance are shown in the Figure 9. Compared to the gradient, the UOT gradient provides more large scale structure during the optimization which will increase the stability during the iterations. The inverse results are shown in Figure 10. The FWI results with distance and UOT distance after 20 iterations are shown in Figure 10 (a) and (b), the results after 40 iterations are shown in Figure 10 (c) and (d). It is clear that the distance case failed to recover the structure of the true velocity model. To increase the resolution of inverse result, we continue the FWI with distance 80 iterations with the UOT result in Figure 10 (d) as an initial model to increase the image resolution. The final result is shown in Figure 10 (e). Figure 10 (f) is the difference between Figure 10 (d) and Figure 10 (e).

This numerical example suggests that as the initial model is inaccurate and deficient in low frequency components, the UOT distance provides more stability during the optimization. Also after the large scale structure of the image is fixed, the UOT distance can be replaced by the regular distance to increase the image resolution.

Figure 8: The adjoint sources of distance and UOT distance at the first iteration
Figure 9: The gradients of first iteration.
Figure 10: (a), (b): FWI results with distance and regularized UOT distance after 20 iterations. (c), (d): FWI results with distance and regularized UOT distance after 40 iterations. (e): FWI result with distance with result in (d) as an initial model after 80 iterations. (f): Difference between (d) and (e).

5 Conclusion

In this work we have studied the methodology of integrating the unbalanced optimal transport distance to full waveform inversion problem. A regularized version of UOT distance and its gradient can be efficiently achieved by a scaling algorithm. Also linear and exponential normalization methods of the signals have been discussed. We implement the trace by trace strategy to compute the regularized UOT distance between two seismograms. The adjoint state method including the derivation of adjoint sources of regularized UOT distance is also provided.

The 1D numerical example indicates that the regularized UOT distance used in this paper has a more convex behavior comparing to the distance with respect to time shift. The single layer model shows that, as the difference between seismograms focus on arriving time and amplitude of the signal, the regularized UOT distance can mitigate the cycle-skipping artifact efficiently comparing to the conventional distance in FWI problem. With a poor initial model, the regularized UOT distance can provide more accurate gradient of the misfit function during the iterations to increase the stability of the optimization algorithm. This is described by the 2D crosshole and reflection model experiments. Also the last example shows that for a practical FWI usage, the regularized UOT distance can be used in the first few steps to generate a good initial model for further iterations. As the previous numerical study indicates, the behavior of UOT distance is heavily dependant on the normalization methods and coefficient. More theoretical results on the choice of normalization method and coefficient is expected in the future work.

6 Acknowledgements

The authors would like to thank the industrial sponsors of the CREWES consortium, and their support through NSERC grant CRDPJ 461179-13. The first author thanks the China Scholarship Council (CSC) for supporting the research. The work of the second author is also supported by NSERC Discovery grant RGPIN-2015-06038. The work of the third author is supported by NSERC Discovery grant RGPIN-2019-04830. We deeply appreciate the constructive discussions with Dr. Peng Yong.

References

  • [1] J. Benamou, Y. Brenier, and K. Guittet (2002) The Monge–Kantorovitch mass transfer and its computational fluid mechanics formulation. International Journal for Numerical methods in fluids 40 (1-2), pp. 21–30. Cited by: §1.
  • [2] J. Benamou, G. Carlier, M. Cuturi, L. Nenna, and G. Peyré (2015) Iterative Bregman projections for regularized transportation problems. SIAM Journal on Scientific Computing 37 (2), pp. A1111–A1138. Cited by: §2.2.
  • [3] J. Benamou (2003) Numerical resolution of an “unbalanced” mass transport problem. ESAIM: Mathematical Modelling and Numerical Analysis 37 (5), pp. 851–868. Cited by: §1.
  • [4] J. Berenger et al. (1994) A perfectly matched layer for the absorption of electromagnetic waves. Journal of computational physics 114 (2), pp. 185–200. Cited by: §3.
  • [5] V. I. Bogachev (2007) Measure theory. Vol. 1, Springer Science & Business Media. Cited by: §1.
  • [6] L. Chizat, G. Peyré, B. Schmitzer, and F. Vialard (2015) Unbalanced Optimal Transport: Dynamic and Kantorovich Formulation. arXiv preprint arXiv:1508.05216. Cited by: §1.
  • [7] L. Chizat, G. Peyré, B. Schmitzer, and F. Vialard (2018) Scaling algorithms for unbalanced optimal transport problems. Mathematics of Computation 87 (314), pp. 2563–2609. Cited by: §1, §2.1, §2.2, §2.2.
  • [8] M. Cuturi and A. Doucet (2014) Fast computation of Wasserstein barycenters. In

    International Conference on Machine Learning

    ,
    pp. 685–693. Cited by: §2.2.
  • [9] M. Cuturi (2013) Sinkhorn distances: Lightspeed computation of optimal transport. In Advances in neural information processing systems, pp. 2292–2300. Cited by: §2.2.
  • [10] B. Engquist, B. D. Froese, and Y. Yang (2016) Optimal transport for seismic full waveform inversion. arXiv preprint arXiv:1602.01540. Cited by: §1, §3.
  • [11] B. Engquist and B. D. Froese (2013) Application of the Wasserstein metric to seismic signals. arXiv preprint arXiv:1311.4581. Cited by: §1.
  • [12] B. Engquist and A. Majda (1977) Absorbing boundary conditions for numerical simulation of waves. Proceedings of the National Academy of Sciences 74 (5), pp. 1765–1766. Cited by: §3.
  • [13] B. Engquist and Y. Yang (2018) Seismic imaging and optimal transport. arXiv preprint arXiv:1808.04801. Cited by: §1.
  • [14] A. Fichtner (2010) Full seismic waveform modelling and inversion. Springer Science & Business Media. Cited by: §1, §1.
  • [15] L. V. Kantorovich (2006) On the translocation of masses. Journal of Mathematical Sciences 133 (4), pp. 1381–1382. Cited by: §2.1.
  • [16] P. Lailly and J. Bednar (1983) The seismic inverse problem as a sequence of before stack migrations. In Conference on inverse scattering: theory and application, pp. 206–220. Cited by: §1.
  • [17] G. S. Martin, R. Wiley, and K. J. Marfurt (2006) Marmousi2: an elastic upgrade for marmousi. The leading edge 25 (2), pp. 156–166. Cited by: §4.4.
  • [18] L. Métivier, R. Brossier, Q. Merigot, E. Oudet, and J. Virieux (2016) An optimal transport approach for seismic tomography: Application to 3D full waveform inversion. Inverse Problems 32 (11), pp. 115008. Cited by: §1, §1, §4.1.
  • [19] L. Métivier, R. Brossier, Q. Mérigot, E. Oudet, and J. Virieux (2016) Measuring the misfit between seismograms using an optimal transport distance: Application to full waveform inversion. Geophysical Supplements to the Monthly Notices of the Royal Astronomical Society 205 (1), pp. 345–377. Cited by: §1.
  • [20] G. Monge (1781) Mémoire sur la théorie des déblais et des remblais. Histoire de l’Académie Royale des Sciences de Paris. Cited by: §2.1.
  • [21] B. Piccoli and F. Rossi (2014) Generalized Wasserstein distance and its application to transport equations with source. Archive for Rational Mechanics and Analysis 211 (1), pp. 335–358. Cited by: §1.
  • [22] R. Plessix (2006) A review of the adjoint-state method for computing the gradient of a functional with geophysical applications. Geophysical Journal International 167 (2), pp. 495–503. Cited by: §1, §3.
  • [23] L. Qiu, J. Ramos-Martínez, A. Valenciano, J. Kirkebø, and N. Chemingui (2017) Mitigating the cycle-skipping of full-waveform inversion: an optimal transport approach with exponential encoding. In SEG 2017 Workshop: Full-waveform Inversion and Beyond, Beijing, China, 20-22 November 2017, pp. 1–4. Cited by: §1.
  • [24] L. Qiu, J. Ramos-Martínez, A. Valenciano, Y. Yang, and B. Engquist (2017) Full-waveform inversion with an exponentially encoded optimal-transport norm. In SEG Technical Program Expanded Abstracts 2017, pp. 1286–1290. Cited by: §1.
  • [25] F. Santambrogio (2015) Optimal transport for applied mathematicians. Birkäuser, NY 55, pp. 58–63. Cited by: §2.1.
  • [26] A. Tarantola (1984) Inversion of seismic reflection data in the acoustic approximation. Geophysics 49 (8), pp. 1259–1266. Cited by: §1.
  • [27] A. Tarantola (2005) Inverse problem theory and methods for model parameter estimation. Vol. 89, siam. Cited by: §1.
  • [28] C. Villani (2008) Optimal transport: old and new. Vol. 338, Springer Science & Business Media. Cited by: §2.1.
  • [29] J. Virieux and S. Operto (2009) An overview of full-waveform inversion in exploration geophysics. Geophysics 74 (6), pp. WCC1–WCC26. Cited by: §1.
  • [30] Y. Yang, B. Engquist, J. Sun, and B. F. Hamfeldt (2018) Application of optimal transport and the quadratic Wasserstein metric to full-waveform inversion. Geophysics 83 (1), pp. R43–R62. Cited by: §1, §1, §4.1, §4.3.
  • [31] Y. Yang and B. Engquist (2017) Analysis of optimal transport and related misfit functions in full-waveform inversion. Geophysics 83 (1), pp. A7–A12. Cited by: §1, §4.1, §4.1.
  • [32] P. Yong, J. Huang, Z. Li, W. Liao, and L. Qu (2019) Least-squares reverse time migration via linearized waveform inversion using a Wasserstein metric. Geophysics 84 (5), pp. S411–S423. Cited by: §1.
  • [33] P. Yong, W. Liao, J. Huang, Z. Li, and Y. Lin (2019) Misfit function for full waveform inversion based on the Wasserstein metric with dynamic formulation. Journal of Computational Physics 399, pp. 108911. Cited by: §1, §1, §4.1, §4.1.