Log In Sign Up

Structured regularization based local earthquake tomography for the adaptation to velocity discontinuities

Here we propose a local earthquake tomography method that applies a structured regularization technique to determine sharp changes in the Earth's seismic velocity structure with travel time data of direct waves. Our approach focuses on the ability to better image two common features that are observed the Earth's seismic velocity structure: velocity jumps that correspond to material boundaries, such as the Conrad and Moho discontinuities, and gradual velocity changes that are associated with the pressure and temperature distributions in the crust and mantle. We employ different penalty terms in the vertical and horizontal directions to refine the imaging process. We utilize a vertical-direction (depth) penalty term that takes the form of the l1-sum of the l2-norm of the second-order differences of the horizontal units in the vertical direction. This penalty is intended to represent sharp velocity jumps due to discontinuities by creating a piecewise linear depth profile of the average velocity structure. We set a horizontal-direction penalty term on the basis of the l2-norm to express gradual velocity tendencies in the horizontal direction. We use a synthetic dataset to demonstrate that our method provides significant improvements over the estimated velocity structures from conventional methods by obtaining stable estimates of both the velocity jumps and gradual velocity changes. We also demonstrate that our proposed method is relatively robust against variations in the amplitude of the velocity jump, initial velocity model, and the number of observed travel times. Furthermore, we present a considerable potential for detecting a velocity discontinuity using the observed travel times from only a small number of direct-wave observations.


page 5

page 7

page 8

page 10

page 14


A sharp, structure preserving two-velocity model for two-phase flow

The numerical modelling of convection dominated high density ratio two-p...

A Doubly Adaptive Penalty Method for the Navier Stokes Equations

We develop, analyze and test adaptive penalty parameter methods. We prov...

A local velocity grid conservative semi-Lagrangian schemes for BGK model

Most numerical schemes proposed for solving BGK models for rarefied gas ...

Analysis of lowest-order characteristics-mixed FEMs for incompressible miscible flow in porous media

The time discrete scheme of characteristics type is especially effective...

An eXtended HDG method for Darcy-Stokes-Brinkman interface problems

This paper proposes an interface/boundary-unfitted eXtended hybridizable...

A New Method for Atlanta World Frame Estimation

In this paper, we propose a new Atlanta frame estimation method by consi...

1 Introduction

An earthquake tomography method is a tool to visualize seismic velocity structure in the Earth. The crust is an approximately 10–50-km-thick layer that covers the Earth’s surface ([Bassin 2000]), and hosts intense shallow seismicity. Local earthquake tomography (LET) is often used to capture the high-resolution, three-dimensional (3-D) crustal structure of a given region (e.g., [Aki & Lee 1976], [Thurber 1993]) and relocate earthquake hypocenters (e.g., [Thurber 1983]). Therefore, LET and its associated approaches, such as double-difference tomography ([Zhang & Thurber 2003]), provide fundamental information for understanding earthquake generation mechanisms in and around the crust (e.g., [Alesssandrini et al. 2001], [Nugraha & Mori 2006]).

Seismic tomography works best when a large number of different ray paths are produced by well-distributed source and receiver arrays. However, seismic sources are often localized around fault zones, and most seismic stations are deployed near the Earth’s surface, resulting in an inhomogeneous distribution of seismic ray paths due to these uneven source and receiver distributions. Therefore, LET often suffers from the unstable estimation of structural parameters, or overfitting. Laplacian regularization has commonly been used in LET to mitigate overfitting and stabilize crustal seismic velocity structure estimations (e.g., [Lees & Crosson 1989], [Zhang et al. 1998], [Moran et al. 1999]). The penalty to dissimilar velocity at neighborhoods by the Laplacian regularization yields smooth fluctuations of seismic velocity. The smoothed estimates are reasonably acceptable in seismology because spatial pressure and temperature variations, which are key factors affecting the seismic velocity structure, are generally gradual. However, Laplacian regularization ignores an important component of crustal structures; i.e., a velocity discontinuity which is due to a rapid change in seismic velocity and often represents either a geological boundary or a solid–liquid contact. The Conrad and Mohorovičić (Moho) discontinuities are well-known, and have been incorporated into one-dimensional (1-D) velocity models, such as the PREM ([Dziewonski & Anderson 1981]) and IASP91 ([Kennett & Engdahl 1991]) models. Furthermore, there may also be local discontinuities that are difficult to image in the crust, such as the boundary between a sedimentary basin and basement rocks. It is therefore desirable to obtain a stable estimate of both a sharp velocity jump and gradual change in seismic velocity. One way to overcome the pitfall of Laplacian regularization is to place grid points along a discontinuity ([Zhao et al. 1992], [Moran et al. 1999]), but this approach requires a prior knowledge of the discontinuity.

Here we develop a new regularization method for 3-D LET that can handle both drastic and gradual changes in the seismic wave velocity structure of the crust. Specifically, we utilize a combination of the following two penalty terms in a geophysical inverse problem: (i) an -type penalty on the -norm of the second-order differences between the horizontal units in the vertical direction; and (ii) an -type penalty on the first-order differences of the horizontal components. Penalty (i) is a regularization term that reflects our seismological knowledge of discontinuities, and is constructed by applying trend filtering ([Kim et al. 2009]), which is a sparse estimation technique. Sparse estimations with -type penalties yield estimates with zero values, and work well in balancing the tradeoffs of mitigating overfitting and obtaining estimation accuracy when the estimand has sparse representation (e.g., [Tibshirani 1996], [Schmidt et al. 2007]). Sparse estimations have been utilized in a variety of research areas, including seismology (e.g., [Loris et al. 2007], [Yao et al. 2011], [Evans & Meade 2012], [Charlety et al. 2013], [Nakata et al. 2016], [Nakata et al. 2017], [Hirose et al. 2020]

). In our case, penalty (i) makes the distribution of resulting velocities averaged on the horizontal unit piecewise linear along the vertical direction, which enhances the depths at which drastic velocity changes occur. Penalty (ii) produces velocity distributions with smooth changes in the horizontal directions which fit the common understanding of velocity structures in the Earth’s interior, since horizontal velocity variations are generally mild compared with vertical (depth) velocity variations. These penalty terms allow the proposed method to elucidate both gradual velocity changes and velocity discontinuities. Our penalty term in the depth direction can be applied to automatically image unknown velocity discontinuities in the crust since it does not require a prior knowledge of any existing discontinuities. We determine all values of hyperparameters (regularization parameters) via cross-validation.

This paper is organized as follows. We first outline the basis of LET and introduce the proposed approach in Section 2. We then conduct a synthetic test in Section 3 to demonstrate that our proposed method works better in estimating velocity structures with drastic changes than existing methods. We discuss the characteristics of conventional methods and our proposed method via several additional experiments in Section 4, and apply the proposed method to real seismic data in Section 5. We conclude by providing arguments for our proposed method in Section 6.

2 Mathematical formulation

Here we provide the LET mathematical formulation, including estimations of the 3-D velocity structures and hypocentral parameters (earthquake origin times and locations). We only focus on cases using compressional-wave (P-wave) arrivals since the description does not depend on a specific seismic phase.

2.1 LET fundamental framework

We first design 3-D grid points to model the subsurface velocity structures. Let be a seismic velocity parameter at grid point . In this paper, the axis indicates the depth, and the -

plane indicates the horizontal location. We assume that the grid points are arranged at regular intervals in the horizontal and vertical directions. We then calculate the velocity at an arbitrary point by interpolating the values of the velocity parameters at the nearest eight grid points, where

denotes the velocity structure at coordinate (not necessarily at a grid point).

An arrival time contains information on the following factors: the origin time of earthquake , the hypocenter location of earthquake , the velocity parameter at grid point , and the ray path from to seismic station . These factors are combined using ray theory to provide the predicted travel time from earthquake to seismic station :

where denotes the element of the path length. We hereafter call the set of velocity parameters and the set of hypocentral parameters.

The ray paths and other factors are calculated in an alternate update for a given set of observed arrival times as follows. The ray path for each earthquake–station pair is calculated for given hypocentral and velocity parameter estimates. The hypocentral and velocity parameter estimates are then updated for a given ray path. This alternate update is conducted until the desired accuracy of the tomography is achieved. Here we focus on the velocity parameter estimations since the alternating update segregates the parameter estimation and ray-path calculation.

The velocity (and hypocentral) parameter estimations are usually conducted by minimizing the sum of the squared residuals between the calculated arrival times and observed arrival times for all of the available earthquake–station pairs:


where and are the observed earthquakes and available observation stations, respectively.

2.2 Structured regularization for 3-D LET

Here we propose a structured regularization approach to accurately image two different velocity changes: drastic velocity changes in the vertical direction at discontinuities and gradual velocity changes in the horizontal directions. Our objective function, which is minimized to estimate the optimal velocity structure, incorporates three components (the loss function

in Eq. (1), vertical regularization term on the horizontal units in the vertical direction, and horizontal regularization term on the components in the horizontal direction) as follows:


where and are non-negative regularization parameters. We multiply by for the convenience of computation. We obtain estimates by applying iterative calculations based on quadratic approximations and the alternating direction method of multipliers (ADMM; [Glowinski & Marroco 1975], [Gabay & Mercier 1976]) to this nonlinear and nonconvex problem (see the Appendix). The vertical regularization term takes the form:


Our vertical-direction penalty is the -sum of , that is, the -norm of the second-order differences in the horizontal units. The value of becomes to be zero when gradient of the average velocities on among , , and th layers is constant, and thus the penalty forces the minimizer of Eq. (2) piecewise linearly in the vertical direction. The horizontal regularization term is given by:


where the relation means that the two grid points are located on the same layer and are adjacent to each other. The horizontal-direction penalty builds upon the first-order velocity differences among adjacent grids on the same layer and the -norm, which allows the resultant velocity parameters to vary smoothly. Note that, the penalty terms in Eqs. (3) and (5) need to be divided by the corresponding grid intervals if the grid points are not arranged at regular intervals.

Figure 1 illustrates how the penalty terms work in the vertical and horizontal directions. The vertical-direction penalty term (Eq. (3)) is a version of trend filtering ([Kim et al. 2009]), which has been applied in various research fields (e.g., [Tibshirani 2014], [Selvin et al. 2016], [Guntuboyina et al. 2020]). This method is known to be suitable for capturing underlying piecewise linear trends. Trend filtering can reduce the penalized elements to zero, whereas Laplacian-based regularization does not (e.g., [Wang et al. 2016]). Our proposed vertical-direction penalty takes the form of the -sum of , which reduces the change in the average-velocity gradient in the vertical direction. Our proposed method can adapt to drastic velocity changes due to geological discontinuities at depth, and it does not require a prior information on the location of the discontinuity.

Figure 1: Penalty terms in the proposed method. (a) Vertical-direction penalty. The sum of the second-order velocity differences among adjacent layers is penalized. (b) Horizontal-direction penalty. The first-order velocity differences among adjacent grid points on the same layer are penalized. The central grid point in this figure has four adjacent grid points.

3 Numerical experiments

We evaluate the performances of our proposed regularization method and two existing methods, SIMUL and Laplacian regularization, to determine the effectiveness of our proposed method in reproducing the seismic velocity structure of a given region. SIMUL is a LET-based software package that simultaneously performs 3-D ray tracing, and both velocity and hypocentral parameter estimations ([Thurber 1993]); we hereafter refer to the method without any structured regularization as “SIMUL” for simplicity. Laplacian regularization has often been used in LET, as mentioned in Section 1. The objective function that is minimized during the velocity and hypocentral parameter estimations takes the following form:




and and are regularization parameters. Both Laplacian regularization and our proposed method impose the same penalty in the horizontal direction because we mainly focus on the detection performance of velocity discontinuities at depth. We hereafter refer to this method as “Laplacian” for notational simplicity. A key difference between Laplacian and our proposed method is the employed norm: the former uses the -sum of the -norm, and the latter uses the -sum of the -norm (Eqs. (3) and (7)). We applied the algorithm in SIMUL to perform the 3-D ray-tracing calculations through the resultant velocity model for each method.

We determined the Laplacian regularization parameters (Eq. (6)) and those for our proposed method (Eq. (2)) via cross-validation. We first split the dataset into training and validation datasets. We then estimated the velocity parameters using given regularization parameter values that were in a set of prepared values. We finally selected the regularization parameter values within the set that minimized the root mean square error of the predicted arrival times for the validation dataset.

3.1 Synthetic data

We conducted synthetic tests using the Japan Meteorological Agency (JMA) unified earthquake catalog to investigate the performances of the three different approaches. Here we focus on the velocity model estimation by fixing the location and origin time of each earthquake to the JMA catalog values for simplicity. The location of the study area is shown in Figure 2. The dataset consists of 199 earthquakes that occurred in Central Japan. We obtained 3,954 P-wave arrival times from 68 seismic stations in the target area. The arrival times were divided into estimation and validation datasets, with 2,965 arrival times used to estimate the velocity parameters and 989 arrival times used to validate the method accuracy.

Figure 2: (a) Map of the study area (bold rectangle) in Japan. (b) Epicenter and observation station distributions that are used in our experiment. Circles represent the earthquake epicenters, and gray inverted triangles represent the observation stations. The earthquakes are color-coded by depth.

We constructed a 26-layer model that extended from 0.0 km (surface) to 25.0 km depth at 1.0-km intervals. We denote the surface layer as “Layer 0” and the layers with grid points at km depth as “Layer ”. We then placed 36 (66) horizontal-directed grids at an 8.0-km horizontal interval on each layer, such that the center of the grids was at  N,  E. We set constant velocities for the outer points, which surrounded the main target region, because some of the hypocenters and stations were located outside the target region. We arranged the outer points as those that were 220 km from the end grid points of each layer in the horizontal direction. We set the “outer layer” at 200 km depth in the vertical direction.

We calculated the synthetic P-wave arrival times as follows. We first defined the baseline velocity of each layer, as shown in Figure 3(a). We assumed a 1-D velocity model with a drastic increase in its P-wave velocity at around Layer 12. We then generated “true” velocities at the grid points by adding % anomalies to the baseline velocities to produce a checkerboard pattern for each layer (Fig. 3

(b)). We finally calculated synthetic arrival times for the available earthquake–station pairs using this assumed 3-D velocity structure, and generated additional time by adding Gaussian noise with a standard deviation of 0.1 s.

Figure 3: (a) Baseline velocity–depth profile for the 26-layer model: 4.0 km/s in Layers 0–11, 4.5 km/s in Layer 12, and 5.0 km/s in Layers 13–25. (b) True 3-D velocity structure. Dots represent the grid point locations. A checkerboard pattern (% anomalies), which is based on the baseline velocity structure in (a), is created.

3.2 Results

Figure 4: (a) Comparison of the estimated velocity structures. The SIMUL (green line), Laplacian (blue line), and our proposed method (red line) estimates are shown. Black and gray lines illustrate the true and initial velocity structures, respectively. (b–d) values (colored circles), which are computed from the estimated average velocity structure for each method (gray lines).

Figure 4 shows the averaged P-wave velocity–depth profiles for the three methods. The initial value of velocity parameter in each grid point is assumed to be 4.0 km/s in this synthetic test. The following regularization parameter values were determined via cross-validation: in Laplacian (Eq. (6)) and in the proposed method (Eq. (2)). The averaged velocities obtained via the three methods all capture the drastic velocity increases around Layer 12, as shown in Figure 4(a). However, the SIMUL output shows obvious fluctuations in its estimated velocity structure. This may be due to the 1 km grid interval in the vertical direction being finer than the grid interval that LET studies have generally employed when using data from the nationwide seismic network in Japan (e.g., [Matsubara et al. 2017]). These unstable SIMUL estimations indicate that information on the spatial arrangement of the grid points is needed to avoid ill-posed estimations of the velocity structure in our case. Laplacian outperformed SIMUL due to the employed regularization, which reduced the fluctuations in the averaged velocities. However, the Laplacian estimates at the grid points in the layers near and below the velocity jump (Layers 13–25) were unable to obtain a constant velocity. Conversely, our proposed method relatively recovered the true average velocities, including the drastic increase in velocity around Layer 12 accurately. We quantitatively compare the estimation accuracies of the three methods by calculating the mean absolute error (MAE):

where is the number of grid points, and and are the estimated and true velocity parameters at each grid point, respectively. The MAE of SIMUL, Laplacian, and our proposed methods were , , and , respectively. We confirm that the proposed penalty can replicate regions where the average velocities are constant at depth based on the behavior in Figure 4(d) and the MAE values.

The norm of the penalty in the vertical direction is different in the Laplacian and proposed model, as shown in Eqs (3) and (7). Laplacian employs the -sum of the -norm, whereas the proposed method employs the -sum of the -norm as the vertical-direction penalty. The values (Eq. (4)), which were evaluated using the obtained velocity structure for each method, are illustrated in Figure 4(b). The values should be zero at most of the layers, except for those around Layer 12, where there is a sudden velocity change, because the true velocity structure was generated from only three baseline velocities (Fig. 3(a)): 4.0 km/s in Layers 0–11, 4.5 km/s in Layer 12, and 5.0 km/s in Layers 13–25. Most of the computed values for the SIMUL-estimated velocity structure were far from zero, as shown in Figure 4(b). Although the Laplacian-estimated values were relatively small compared with the SIMUL-estimated values, the Laplacian regularization terms did not reduce to zero. In contrast, most of the values estimated by our proposed method were almost exactly zero, as the penalty terms in the proposed method produce a piecewise linear velocity structure.

We then focus on the spatial variations in the estimated velocities in the horizontal units. The imposed checkerboard anomalies on the true velocity structure and the estimated velocity perturbations, both of which are estimated from the baseline velocities at each grid point in Layers 1, 12, 20, and 25, are shown in Figure 5. SIMUL tends to estimate amplitude anomalies that are more than 5% smaller/larger than the assumed true structure in this experiment. Both the Laplacian and proposed method reproduced the checkerboard pattern in the shallower layers (Layers 0–10). However, Laplacian failed to reproduce the assumed checkerboard pattern in the deeper layers (Layers 18–25), whereas the proposed method successfully restored the true structure in most areas (Fig. 5(b)).

Figure 5: Checkerboard test results. (a) Introduced velocity anomalies of the true velocity structure from the baseline velocity (Fig. 3(a)) in each layer. Blue and red indicate that the estimated velocities are faster and slower than the baseline velocity, respectively. (b) Anomalies of the estimated velocities from the baseline velocities in several layers for the SIMUL (left), Laplacian (center), and our proposed (right) methods.

4 Discussion

4.1 Size of the velocity jump at a discontinuity

We examined the sensitivity of the three methods to the amplitude of the velocity jump in the vertical direction. It is expected that estimation accuracy of the velocity parameters will deteriorate as the size of the velocity jump becomes larger. The results of this sensitivity test are shown in Figures 6(a)–(c) and 4 (the case when the size of the velocity jump is km/s). Figure 6 shows the averages of the true and estimated velocities at each layer, and the MAEs for each tested velocity jump. The three methods yield comparable estimation accuracies when there is no velocity jump (constant velocity with depth; Fig. 6(a)). We find that the SIMUL performance degraded gradually as the size of the velocity jump increased. Laplacian performed better than SIMUL based on the MAEs, but it failed to reproduce the linear trend in Layers 13–25 (Figs 6(b) and (c)). This occurs because the penalty term of the Laplacian does not strictly hold the average velocity gradient constant since it is composed of the -sum of the -norm (; Eq. (4)), as discussed in Subsection 3.2. In contrast, the decrease in the precision of the proposed method is relatively suppressed, especially in the layers below the velocity jump (Layers 13–25), by expressing a piecewise linear trend of the true velocity structure via the penalty term, which consists of the -sum of the -norm. We confirm that the proposed method estimated velocity parameters more stably for each of the tested velocity jumps compared with the conventional methods (Fig. 6(d)). These results suggest that the proposed method can recover a range of velocity jumps (small to large amplitudes) that may be associated with a discontinuity.

Figure 6: Recoveries of the estimated average velocities by changing the size of the velocity jump: (a) 0.0 km/s; (b) 0.2 km/s; and (c) 0.5 km/s. (d) MAE variations for each of the tested velocity jumps and methods (SIMUL, Laplacian, and our proposed methods). Colored lines in (a–c) are the same as those in Figure 4.

4.2 Initial model dependence

We conducted additional experiments to verify the influence of the initial model on the estimated velocity structure for each of the three methods, which may be due to the nonlinearity of the objective function. The main experiment adopted an initial velocity structure of 4.0 km/s at all of the grid points (Fig. 4); our additional experiments tested initial velocities of 4.5 and 5.0 km/s. We configured all of the other settings to be the same as those in the main experiment. The averages of the true and estimated velocities in each layer, and the associated MAEs for different initial velocities are shown in Figure 7. Note that the proposed method yields the smallest MAEs among the three methods for each of the three initial velocities (Fig. 7(c)), indicating that our structured regularizations provide stable estimations of the velocity structure, regardless of the initial velocity model. It is generally more difficult to estimate the velocity parameters in the deeper layers than those in the shallow layers because of the sparsity of the seismic ray paths at depth. The estimated average velocity better reproduced the true velocity in all three cases when the initial velocity was set to 5.0 km/s, which is close to the true velocity of the deep layers.

Figure 7: Recoveries of the estimated average velocities for different initial velocities: (a) 4.5 km/s and (b) 5.0 km/s. (c) MAE variations due to the initial velocities for the SIMUL, Laplacian, and our proposed methods. Colored lines in (a–c) are the same as those in Figure 4.

4.3 The relationship between method accuracy and sample size

We verified the accuracy of each method for different sample sizes (the number of travel time data). The sample size was controlled by either decreasing or increasing the number of available seismic stations that were analyzed to extract the P-wave travel times. Figure 8 shows the averages of the true and estimated velocities in each layer, and the MAEs for each sample size. We confirmed that the proposed method performed the best in each of the tested settings based on its MAE values (Fig. 8(d)). The number of velocity parameters was as large as 936 () in our experiments, inevitably making it difficult to estimate the velocity structure without regularization when the sample size was small (Fig. 8(a)), as discussed in Subsection 3.1. Although all three methods performed well for a large sample size (Figs 8(b) and (c)), the methods with regularizations (Laplacian and proposed methods) yielded better accuracies than that without regularization (SIMUL). The Laplacian and proposed methods yielded relatively stable accuracies, even when the number of travel time data was small, since regularization methods are generally capable of avoiding overfitting and performing well when there are a lot of parameters to estimate (e.g., [Negahban et al. 2012], [Hastie et al. 2015]). Furthermore, the proposed method reproduced the sharp change in the average velocity structure the best among the three methods and outperformed the SIMUL and Laplacian methods in the layers below the velocity jump (Layers 13–25) in every setting, with this superior performance attributed to the -penalty.

Figure 8: Recoveries of the estimated average velocities by changing the number of travel time data used in the analysis: (a) 1,250; (b) 4,393; and (c) 6,414, which are about one-half, twice, and three times the number used in the main results, respectively (Section 3). (d) MAE variations in the sample sizes for the SIMUL, Laplacian, and our proposed methods. The colored lines in (a–c) are the same as those in Figure 4.

5 Application to real seismic data

We applied the proposed method to real seismic data, using the seismic waveforms from 211 earthquakes that were observed by the high-sensitivity seismograph network in Japan (Hi-net; [NIED 2019]) during the 2005–2014 period. We picked 2,042 P-wave travel times from the waveforms, and divided the travel times into 1,701 training data and 341 validation data for the cross-validation. The target region of this experiment is shown in Figure 2. We employed the same grid points as those in the synthetic test. We set a velocity of 6.0 km/s at all of grid points in the study area for the initial velocity model, and fixed JMA2001 1-D velocity model ([Ueno et al. 2002]) values, which have been commonly employed for routine hypocenter determinations throughout Japan, at the outer points (outside the target region).

Figure 9: (a) Estimated average P-wave velocity structure for the target area. The results of the proposed method, which are derived from real observational data, are shown by the red line. The gray line indicates the initial velocity model. Green and blues lines are the estimated P-wave velocity structures determined via the SIMUL and Laplacian methods, respectively. (b, c) Vertical variations in the P-wave velocities. Arrows indicate 16 km depth. Cross-section locations are indicated in the inset map.

The resultant P-wave velocity–depth profiles for the three methods are shown in Figure 9(a). Our proposed method estimated a notable velocity change in the target area (red line in Fig. 9(a)), with monotonically increasing velocities to approximately 16 km depth (Layer 16) and a nearly constant velocity at greater depths. The Conrad discontinuity in the target region has been imaged at approximately 15–20 km depth (e.g., [Iidaka et al. 2003], [Katsumata 2010]). Therefore, the depth at which the obtained average velocity gradient suddenly changes can be interpreted as the Conrad discontinuity. The obtained average velocities, which span the 5.0–6.5 km/s range above 16 km depth and are approximately 6.7 km/s at greater depths, are comparable with those retrieved by a reflection and wide-angle refraction survey ([Iidaka et al. 2003]). Therefore, the applicability of the proposed method in elucidating sharp velocity discontinuities is validated by its success in detecting the Conrad discontinuity, which is defined by this sudden change in the velocity gradient within the target area.

There were large fluctuations in the SIMUL average velocities (green line in Fig. 9(a)), whereas the Laplacian vertical fluctuations were smoothed (blue line in Fig. 9(a)). The average velocity gradient of the Laplacian estimates changes in the Layer 13–16 range, which seems to be associated with the Conrad discontinuity. However, the velocities at greater depths are clearly smaller than those expected beneath the Conrad discontinuity in the target region ([Iidaka et al. 2003]). The small number of travel time data used here can cause underestimations in the Laplacian average velocities, as discussed in the numerical experiments (Subsection 4.3).

Furthermore, we illustrate the vertical variations in P-wave velocities along the two E–W profiles obtained by the proposed method in Figures 9(b) and (c). The P-wave velocities above 5 km depth decrease gradually toward the east, which may correspond to a local geological structure. This lateral velocity variation resembles one that has been imaged in previous studies (e.g., [Matsubara et al. 2019]). The regularization parameters for the proposed and Laplacian methods are and , respectively.

These results, which are obtained from real seismic data, suggest that the proposed method, with -type regularization, can stably detect the true depth of the velocity discontinuity, even when the number of available observational data is small. Later reflected and/or converted waves have conventionally been used to investigate the depths of various velocity discontinuities, such as the Conrad and Moho discontinuities, and the subducting plate interface (e.g., [Matsuzawa et al. 1986], [Zhao et al. 1997]); however, such later waves are only identified in a limited number of earthquake–station pairs. Unlike reflected and converted waves, direct P and S waves are commonly and widely observed from numerous earthquakes in the Earth. Therefore, the proposed method will improve the detection of velocity discontinuities considerably and refine imaging in areas where later waves are not widely observed.

6 Conclusion

We introduced a nonlinear inversion method with structured regularization to image the crustal structure of the Earth. Our proposed LET method simultaneously estimated the smooth trends and drastic changes in the crustal velocity structure, both of which are expected in the Earth’s interior, by employing two types of penalty terms that are added to the vertical and horizontal directions of the model space, respectively. We employed a vertical-direction penalty term that consisted of the second-order differences in the depth-dependent velocity parameters to detect a velocity discontinuity, thereby highlighting the ability to image drastic velocity changes in the vertical direction. This vertical-direction penalty term works on the depth-averaged velocity values, and takes the form of the -sum of the -norm, which allows it to reproduce piecewise linear trends in the velocity changes at depth. We used a horizontal-direction penalty term that consisted of first-order differences of the velocities that were based on the -norms. This horizontal-direction penalty smooths velocity fluctuations.

We compared the imaging capability of the proposed method with a conventional LET approach (SIMUL) and Laplacian regularization via synthetic data experiments to verify the performance of the proposed method. The proposed method can adequately reproduce drastic velocity changes in the vertical direction and gradual velocity changes in the horizontal direction. Our results highlighted the importance of regularization to better estimate the subsurface velocity structure, and also demonstrated that the proposed method is stable against variations in the amplitude of velocity jump, initial velocity structure, and sample size. Furthermore, we applied the proposed method to real seismic data from Central Japan, and successfully imaged a distinct velocity gradient at approximately 15 km depth, which corresponded to the Conrad discontinuity. Therefore, the proposed method can improve the detectability of a velocity discontinuity using P- and/or S-wave travel time data.

Our proposed method automatically detects the existence (or nonexistence) of discontinuities because it does not require a prior information for the velocity discontinuity. We focused on an estimation of the velocity parameters, whereas LET usually includes updated hypocentral locations. Therefore, the accuracy of the velocity estimations and hypocenter determinations via LET will be further improved by replacing the LET penalty terms with our proposed terms from the conventional framework.

Data availability

The data underlying this article are available in Japan Meteorological Agency (JMA) at


GMT software package [Wessel & Smith 1998] and R [R Core Team 2020] were used for creating figures. This work is supported by JST CREST Grant Number JPMJCR1763, Japan, and JSPS KAKENHI Grant Number JP20K19753.


  • [Aki & Lee 1976] Aki, K. & Lee, W. H. K., 1976. Determination of three-dimensional velocity anomalies under a seismic array using first P arrival times from local earthquakes: 1. A homogeneous initial model, Journal of Geophysical research, 81, 4381–4399.
  • [Alesssandrini et al. 2001] Alesssandrini, B., Filippi, L. & Borgia, A., 2001. Upper-crust tomographic structure of the Central Apennines, Italy, from local earthquakes, Tectonophysics, 339, 479–494.
  • [Bassin 2000] Bassin, C., 2000. The current limits of resolution for surface wave tomography in North America, Eos, Transactions American Geophysical Union, 81, S12A-03.
  • [Boyd et al. 2011] Boyd, S., Parikh, N., Chu, E., Peleato, B. & Eckstein, J., 2011. Distributed optimization and statistical learning via the alternating direction method of multipliers,

    Foundations and Trends ® in Machine Learning

    , 3, 1–122.
  • [Charlety et al. 2013] Charlety, J., Voronin, S., Nolet, G., Loris, I., Simons, F. J., Sigloch, K. & Daubechies, I. C., 2013. Global seismic tomography with sparsity constraints: Comparison with smoothing and damping regularization, Journal of Geophysical Research: Solid Earth, 118, 4887–4899.
  • [Dziewonski & Anderson 1981] Dziewonski, A. M. & Anderson, D. L., 1981. Preliminary reference Earth model, Physics of the earth and planetary interiors, 25, 297–356.
  • [Evans & Meade 2012] Evans, E. L. & Meade, B. J., 2012. Geodetic imaging of coseismic slip and postseismic afterslip: Sparsity promoting methods applied to the great Tohoku earthquake, Geophysical Research Letters, 39, L11314.
  • [Gabay & Mercier 1976] Gabay, D. & Mercier, B., 1976. A dual algorithm for the solution of nonlinear variational problems via finite element approximation, Computers & mathematics with applications, 2, 17–40.
  • [Gavin 2013] Gavin, H., 2013. The Levenberg-Marquardt method for nonlinear least squares curve-fitting problems, Technical Report, Department of Civil & Environmental Engineering, Duke University.
  • [Glowinski & Marroco 1975] Glowinski, R. & Marroco, A., 1975. Sur l’approximation, par éléments finis d’ordre un, et la résolution, par pénalisation-dualité d’une classe de problèmes de Dirichlet non linéaires, ESAIM: Mathematical Modelling and Numerical Analysis - Modélisation Mathématique et Analyse Numérique, 9, 41–76.
  • [Guntuboyina et al. 2020] Guntuboyina, A., Lieu, D., Chatterjee, S. & Sen, B., 2020. Adaptive risk bounds in univariate total variation denoising and trend filtering, The Annals of Statistics, 48, 205–229.
  • [Hastie et al. 2015] Hastie, T., Tibshirani, R. & Wainwright, M., 2015. Statistical learning with sparsity: The lasso and generalizations, CRC press.
  • [Hirose et al. 2020] Hirose, T., Nakahara, H., Nishimura, T. & Campillo, M., 2020. Locating spatial changes of seismic scattering property by sparse modeling of seismic ambient noise cross-correlation functions: Application to the 2008 Iwate-Miyagi Nairiku (Mw 6.9), Japan, earthquake, Journal of Geophysical Research: Solid Earth, 125, e2019JB019307.
  • [Iidaka et al. 2003] Iidaka, T., Iwasaki, T., Takeda, T., Moriya, T., Kumakawa, I., Kurashimo, E., Kawamura, T., Yamazaki, F., Koike, K. & Aoki, G., 2003. Configuration of subducting Philippine Sea plate and crustal structure in the central Japan region, Geophysical research letters, 30, 1219.
  • [Katsumata 2010] Katsumata, A., 2010. Depth of the Moho discontinuity beneath the Japanese islands estimated by traveltime analysis, Journal of Geophysical Research: Solid Earth, 115, B04303.
  • [Kennett & Engdahl 1991] Kennett, B. L. N. & Engdahl, E. R., 1991. Traveltimes for global earthquake location and phase identification, Geophysical Journal International, 105, 429–465.
  • [Kim et al. 2009] Kim, S.-J., Koh, K. & Boyd, S. and Gorinevsky, D., 2009. L1 trend filtering, SIAM review, 51, 339–360.
  • [Lees & Crosson 1989] Lees, J. M. & Crosson, R. S., 1989. Tomographic inversion for three-dimensional velocity structure at Mount St. Helens using earthquake data, Journal of Geophysical Research: Solid Earth, 94, 5716–5728.
  • [Levenberg 1944] Levenberg, K., 1944. A method for the solution of certain non-linear problems in least squares, Quarterly of applied mathematics, 2, 164–168.
  • [Li & Harris 2018] Li, D. & Harris, J. M., 2018. Full waveform inversion with nonlocal similarity and model-derivative domain adaptive sparsity-promoting regularization, Geophysical Journal International, 215, 1841–1864.
  • [Loris et al. 2007] Loris, I., Nolet, G., Daubechies, I. & Dahlen, F. A., 2007. Tomographic inversion using l1-norm regularization of wavelet coefficients, Geophysical Journal International, 170, 359–370.
  • [Marquardt 1963] Marquardt, D. W., 1963. An algorithm for least-squares estimation of nonlinear parameters, Journal of the society for Industrial and Applied Mathematics, 11, 431–441.
  • [Matsubara et al. 2017] Matsubara, M., Sato, H., Uehira, K., Mochizuki, M. & Kanazawa, T., 2017. Three-dimensional seismic velocity structure beneath Japanese Islands and surroundings based on NIED seismic networks using both inland and offshore events, Journal of Disaster Research, 12, 844–857.
  • [Matsubara et al. 2019] Matsubara, M., Sato, H., Uehira, K., Mochizuki, M., Kanazawa, T., Takahashi, N., Suzuki, K. & Kamiya, S., 2019. Seismic velocity structure in and around the Japanese Island Arc derived from seismic tomography including NIED MOWLAS Hi-net and S-net data, in Seismic Waves - Probing Earth System, IntechOpen.
  • [Matsuzawa et al. 1986] Matsuzawa, T., Umino, N., Hasegawa, A. & Takagi, A., 1986. Upper mantle velocity structure estimated from PS-converted wave beneath the north-eastern Japan Arc, Geophysical Journal International, 86, 767–787.
  • [Moran et al. 1999] Moran, S. C., Lees, J. M. & Malone, S. D., 1999. P wave crustal velocity structure in the greater Mount Rainier area from local earthquake tomography, Journal of Geophysical Research: Solid Earth, 104, 10775–10786.
  • [Nakata et al. 2017] Nakata, R., Hino, H., Kuwatani, T., Yoshioka, S., Okada, M. & Hori, T., 2017. Discontinuous boundaries of slow slip events beneath the Bungo Channel, southwest Japan, Scientific reports, 7, 6129.
  • [Nakata et al. 2016] Nakata, R., Kuwatani, T., Okada, M. & Hori, T., 2016. Geodetic inversion for spatial distribution of slip under smoothness, discontinuity, and sparsity constraints, Earth, Planets and Space, 68, 20.
  • [NIED 2019] National Research Institute for Earth Science and Disaster Resilience, 2019. NIED Hi-net, National Research Institute for Earth Science and Disaster Resilience.
  • [Negahban et al. 2012] Negahban, S. N., Ravikumar, P., Wainwright, M. J. & Yu, B., 2012. A unified framework for high-dimensional analysis of M-estimators with decomposable regularizers, Statistical science, 27, 538–557.
  • [Nugraha & Mori 2006] Nugraha, A. D. & Mori, J., 2006. Three-dimensional velocity structure in the Bungo Channel and Shikoku area, Japan, and its relationship to low-frequency earthquakes, Geophysical research letters, 33, L24307.
  • [R Core Team 2020] R Core Team, 2020. R: A language and environment for statistical computing, available from:
  • [Schmidt et al. 2007] Schmidt, M., Niculescu-Mizil, A. & Murphy, K., 2007. Learning graphical model structure using L1-regularization paths, in AAAI, volume 7, 1278–1283.
  • [Selvin et al. 2016] Selvin, S., Ajay, S. G., Gowri, B. G., Sowmya, V. & Somon, K. P., 2016. L1 trend filter for image denoising, Procedia Computer Science, 93, 495–502.
  • [Thurber 1983] Thurber, C. H., 1983. Earthquake locations and three-dimensional crustal structure in the Coyote Lake area, central California, Journal of Geophysical Research: Solid Earth, 88, 8226–8236.
  • [Thurber 1993] Thurber, C. H., 1993. Seismic tomography: Theory and practice, in Local earthquake tomography: velocities and Vp/Vs-theory, ed. Iyer, H. M. & Hirahara, K., Chapman and Hall.
  • [Tibshirani 1996] Tibshirani, R., 1996. Regression shrinkage and selection via the lasso, Journal of the Royal Statistical Society: Series B (Methodological), 58, 267–288.
  • [Tibshirani 2014] Tibshirani, R. J., 2014. Adaptive piecewise polynomial estimation via trend filtering, The Annals of Statistics, 42, 285–323.
  • [Ueno et al. 2002] Ueno, H., Hatakeyama, S., Aketagawa, T., Funasaki, J. & Hamada, N., 2002. Improvement of hypocenter determination procedures in the Japan Meteorological Agency, Quarterly Journal of Seismology, 65, 123–134.
  • [Wang et al. 2016] Wang, Y.-X., Sharpnack, J., Smola, A. J. & Tibshirani, R. J., 2016. Trend filtering on graphs, The Journal of Machine Learning Research, 17, 3651–3691.
  • [Wessel & Smith 1998] Wessel, P. & Smith, W. H. F., 1998. New, improved version of Generic Mapping Tools released, Eos, Transactions American Geophysical Union, 79, 579.
  • [Yao et al. 2011] Yao, H., Gerstoft, P., Shearer, P. M. & Mecklenbräuker, C., 2011. Compressive sensing of the Tohoku-Oki Mw 9.0 earthquake: Frequency-dependent rupture modes, Geophysical Research Letters, 38, L20310.
  • [Zhang & Thurber 2003] Zhang, H. & Thurber, C. H., 2003. Double-difference tomography: The method and its application to the Hayward fault, California, Bulletin of the Seismological Society of America, 93, 1875–1889.
  • [Zhang et al. 1998] Zhang, J., ten Brink, U. S. & Toksöz, M. N., 1998. Nonlinear refraction and reflection travel time tomography, Journal of Geophysical Research: Solid Earth, 103, 29743–29757.
  • [Zhao et al. 1992] Zhao, D., Hasegawa, A. & Horiuchi, S., 1992. Tomographic imaging of P and S wave velocity structure beneath northeastern Japan, Journal of Geophysical Research: Solid Earth, 97, 19909–19928.
  • [Zhao et al. 1997] Zhao, D., Matsuzawa, T. & Hasegawa, A., 1997. Morphology of the subducting slab boundary in the northeastern Japan arc, Physics of the Earth and Planetary Interiors, 102, 89–104.

Appendix: An optimization with ADMM

The objective function of our method, as outlined in Subsection 2.2, is:

Here we consider an optimization of the quadratic form (hereafter denoted by ), instead of the sum of the squared residuals . We then separate

into an additive form of two terms that depend on only the velocity parameters and hypocentral parameters via QR decomposition:

. We obtain the following optimization problem via the duality principle:

where represents the -norm,

is a matrix satisfying (), and , , and are the numbers of grid points in -, -, and -directions, respectively.

We then apply the Lagrange multiplier method, and consider the following function:

We obtain estimates by applying the ADMM ([Glowinski & Marroco 1975], [Gabay & Mercier 1976]), a widely used algorithm that is well suited for solving distributed convex optimization problems (e.g., [Boyd et al. 2011], [Li & Harris 2018]):

where “” is the proximal operator. We note that the minimization function for is nonlinear and nonconvex, and that it takes an additive form of the sums of squares via iterative quadratic approximation. Therefore, we obtain the optimal value using the Levenberg–Marquardt method ([Levenberg 1944], [Marquardt 1963], [Gavin 2013]).