## 1 Introduction

An abundance of real-world inverse problems, for instance in biomedical imaging, non-destructive testing, geological exploration, and sensing of seismic events, is concerned with the spatial and/or temporal support localization of *sources* generating wave fields in acoustic, electromagnetic, or elastic media (see, e.g., [8, 10, 20, 32, 37, 41, 45, 47, 50] and references therein). Numerous application-specific algorithms have been proposed in the recent past to procure solutions of diverse *inverse source problems* from time-series or time-harmonic measurements of the generated waves (see, e.g., [2, 6, 7, 19, 36, 51, 54, 55, 58, 64, 65]).
The *inverse elastic source problems* are of particular interest in this paper due to their relevance in *elastography* [3, 10, 20, 50]. Another potential application is the localization of the background noise source distribution of the Earth, which contains significant information about the regional geology, time-dependent crustal changes, and earthquakes [21, 32, 34, 37].

Most of the conventional algorithms are suited to continuous measurements, in other words, to experimental setups that allow measuring wave fields at each point inside a region of interest or on a substantial part of its boundary. In practice, this requires mechanical systems that furnish discrete data sampled on a very fine grid conforming to the Nyquist sampling rate. Unfortunately, this is not practically feasible due to mechanical, computational, and financial constraints. In this article, we are therefore interested in the problem of elastic source imaging with very sparse data, both in space and time, for which the image resolution furnished by the conventional algorithms degenerates. In order to explain the idea of the proposed framework, we restrict ourselves to the *time-reversal technique* for elastic source localization presented by Ammari et al. [5] as the base conventional algorithm, owing to its robustness and efficiency. We emphasize that any other contemporary algorithm can be adopted accordingly. The interested reader is referred to the articles [4, 5, 8, 19, 34, 55] and references cited therein for further details on time-reversal techniques for inverse source problems and their mathematical analysis.

One potential remedy to overcome the limitations of the conventional algorithms is to incorporate a smoothness penalty, such as total variation (TV), or other sparsity-inducing penalties together with a data fidelity term. These approaches are, however, computationally expensive due to the repeated applications of the forward solvers and reconstruction steps during iterative updates. Direct image-domain processing using these penalties could bypass the iterative applications of the forward and inverse steps, but the performance improvements are not remarkable.

Since the deep convolutional neural network (CNN) known as AlexNet [35] pushed the state of the art by about 10%, achieving a top-5 test error rate of 15.3% in the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) 2012 [49] compared to 26.2% for the second-best entry, the performance of CNNs has continuously improved and eventually surpassed human-level performance (5.1%, [49]) in the image classification task. Recently, deep learning approaches have achieved tremendous success not only for classification tasks, but also in various inverse problems of the computer vision area such as segmentation [48], image captioning [31], denoising [66], and super-resolution [11]. Along with those developments, many studies applying deep learning techniques in the medical imaging area have also shown good performance in various applications [1, 14, 15, 16, 23, 26, 28, 29, 30, 56, 60, 61]. For example, Kang et al. [29] first successfully demonstrated a wavelet-domain deep convolutional neural network (DCNN) for low-dose computed tomography (CT), winning second place in the 2016 American Association of Physicists in Medicine (AAPM) X-ray CT Low-dose Grand Challenge [44]. Jin et al. [26] and Han et al. [23] independently showed that the global streaking artifacts from sparse-view CT can be removed efficiently with a deep network. In MRI, Wang et al. [57] applied deep learning to provide a soft initialization for compressed sensing MRI (CS-MRI). In photo-acoustic tomography, Antholzer et al. [9] proposed a U-Net architecture [48] to effectively remove streaking artifacts from inverse spherical Radon transform based reconstructed images. The power of machine learning for inverse problems has also been demonstrated in material discovery and design, in which the goal is to find the material compositions and structures that satisfy the design goals under assorted design constraints [42, 43, 52].

In spite of such intriguing performance improvements by deep learning approaches, the origin of their success for inverse problems has been poorly understood. To address this, we recently proposed the so-called *deep convolutional framelets* as a powerful mathematical framework to understand deep learning approaches for inverse problems [62]. The novelty of the deep convolutional framelets was the discovery that an encoder-decoder network structure emerges as the signal space manifestation of a Hankel matrix decomposition in a higher dimensional space [62]. In addition, by controlling the number of filter channels, a neural network is trained to learn the optimal *local bases* so that it gives the best low-rank shrinkage [62]. This discovery demonstrates an important link between deep learning and compressed sensing approaches [17] through a Hankel structured matrix decomposition [25, 27, 63].

Thus, the aim of this paper is to provide a deep learning reconstruction formula for elastic source imaging from sparse measurements. Specifically, a generic framework is provided that incorporates a low-dimensional manifold regularization in the conventional reconstruction frameworks. As will be explained later, the resulting algorithm can be extended to the deep convolutional framelet expansion in order to achieve an image resolution comparable to that furnished by continuous/dense measurements [62].

The paper is organized as follows. The inverse elastic source problem, both in discrete and continuous settings, is introduced in Section 2 and a brief review of the time-reversal algorithm is also provided. The mathematical foundations of the proposed deep learning approach are furnished in Section 3. Section 4 is dedicated to the design and training of the deep neural network. A few numerical examples are furnished in Section 5. The article ends with a brief summary in Section 6.

## 2 Problem formulation

Let us first mathematically formulate the inverse elastic source problems with continuous and discrete measurements. Then, we will briefly review the time-reversal technique for elastic source imaging (with continuous data) as discussed in [5] in order to make the paper self-contained.

### 2.1 Inverse elastic source problem with continuous measurements

Let be a compactly supported function. Then, the wave propagation in a linear isotropic elastic medium loaded in () is governed by the Lamé system,

where is the elastic wave field generated by the source , operator is the linear isotropic elasticity operator with Lamé parameters of the medium , and superscript indicates the transpose operation. Here, it is assumed for simplicity that the volume density of the medium is unity, i.e., , , and are density normalized. Moreover, the source is impulsive in time, i.e., , where denotes the Dirac mass at and its derivative is defined in the sense of distributions.

Let be an open bounded smooth imaging domain with boundary , compactly containing the spatial support of , denoted by , i.e., there exists a compact set strictly contained in such that . Then, the inverse elastic source problem with continuous measurement data is to recover given the measurements

where is the final control time such that and for all .

We note that and can be decomposed in terms of irrotational components (or pressure components polarizing along the direction of propagation) and solenoidal components (or shear components polarizing orthogonal to the direction of propagation). In particular, in a two-dimensional (2D) frame of reference wherein the - and -axes are aligned with and orthogonal to the direction of propagation, respectively, the respective components of are its pressure and shear components (see, e.g., Figure 1 for the imaging setup and source configuration).

### 2.2 Inverse elastic source problem with discrete measurements

Most of the conventional algorithms require the measurement domain to be sampled at the Nyquist rate so that a numerical reconstruction of the spatial support is achieved at a high resolution. Specifically, the distance between consecutive receivers is taken to be less than half of the wavelength corresponding to the smallest frequency in the bandwidth and the temporal scanning is done at a fine rate so that the relative difference between consecutive scanning times is very small.
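As a back-of-the-envelope illustration of this criterion, the following sketch counts the receivers needed on a circular measurement boundary; all numerical values (wave speed, bandwidth, radius) are hypothetical, normalized units chosen for illustration only.

```python
import math

# Hypothetical acquisition parameters (normalized units, for illustration):
c_s = 1.0      # shear wave speed
f_max = 4.0    # highest frequency in the source bandwidth
R = 1.0        # radius of the circular measurement boundary

wavelength = c_s / f_max      # smallest wavelength present in the data
spacing = wavelength / 2.0    # Nyquist criterion: receiver spacing <= lambda / 2
n_receivers = math.ceil(2 * math.pi * R / spacing)

print(n_receivers)  # receivers required on the boundary at the Nyquist rate
```

Even for these modest parameters, dozens of receivers are required, which illustrates why practical arrays with only a handful of detectors are sub-Nyquist.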

In practice, it is not feasible to place a large number of receivers at the boundary of the imaging domain, and most often the measurements are available only at a few detectors (relative to the number required at the Nyquist sampling rate). As a result, one cannot expect well-resolved reconstructed images from the conventional algorithms requiring continuous or dense measurements.

In the rest of this subsection, the mathematical formulation of the discrete inverse elastic source problem is provided. Towards this end, some notation is fixed upfront. For any sufficiently smooth function , its temporal Fourier transform is defined by

where is the temporal frequency. Similarly, the spatial Fourier transform of an arbitrary smooth function is defined by

with spatial frequency . Let the function be the Kupradze matrix of fundamental solutions associated to the time-harmonic elastic wave equation, i.e.,

(1)

where is the identity matrix. For later use, we decompose into its shear and pressure parts as follows, where

Here, denotes the first-kind Hankel function of order zero, and and are the pressure and shear wave speeds, respectively.
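The first-kind Hankel function of order zero is available in `scipy.special`, which makes the 2D fundamental solution straightforward to evaluate numerically. The sketch below computes a scalar 2D Helmholtz fundamental solution; the `(i/4)` normalization is one common convention and the function name is ours, so treat this as an illustration rather than the paper's exact Kupradze matrix.

```python
import numpy as np
from scipy.special import hankel1, jv, yv

def helmholtz_green_2d(k, r):
    """Scalar 2D Helmholtz fundamental solution (i/4) H_0^(1)(k r).

    Sign/normalization conventions vary across references; this is one
    common choice, shown for illustration only.
    """
    return 0.25j * hankel1(0, k * r)

# Consistency check: H_0^(1)(z) = J_0(z) + i Y_0(z)
z = 1.7
print(np.allclose(hankel1(0, z), jv(0, z) + 1j * yv(0, z)))  # True
```

The pressure and shear parts of the Kupradze matrix are then evaluated with the respective wavenumbers (frequency divided by the pressure or shear speed).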

If then, by invoking the Green’s theorem,

(2)

for all . Here, denotes the source-to-measurement operator.

Let be the locations of point receivers measuring the time-series of the outgoing elastic wave at instances for some . Then, the inverse elastic source problem with discrete data is to recover given the discrete measurement set

In this article, we are interested in the discrete inverse source problem with sparse data, i.e., when and are small relative to the Nyquist sampling rate.

In order to facilitate the ensuing discussion, let us introduce the discrete measurement vector

by

(3)

Here and throughout this investigation, notation indicates the -th component of a vector and indicates the -th component of a matrix. Thus, , for , denotes the vector formed by the -th components of the waves recorded at points at a fixed time instance .

Let us also introduce the forward operator, , in the discrete measurement case by

Then, the inverse elastic source problem with discrete data is to recover from the relationship

(4)

### 2.3 Time-reversal for elastic source imaging: A review

The idea of the time-reversal algorithm is based on the simple observation that the wave operator in loss-less (non-attenuating) media is self-adjoint and that the corresponding Green’s function possesses the reciprocity property [19]. In other words, the wave operator is invariant under time reversal, and the positions of the sources and receivers can be swapped. Therefore, it is theoretically possible to *revert* a wave from the recording positions and different control times back to the source locations and the initial time in chronology, thereby converging to the source density. Practically, this is done by *back-propagating* the measured data, after the transformation , through the adjoint waves (for each time instance ) and adding the contributions for all after evaluating them at the final time . Precisely, the adjoint wave , for each , is constructed as the solution to

where is the surface Dirac mass on . Then, the time-reversal imaging function is defined by

(5)

By the definition of the adjoint field and the Green’s theorem,

Therefore, the time-reversal function can be explicitly expressed as

The time-reversal function in (5) is usually adopted to reconstruct the source distribution in an elastic medium. However, it does not provide a good reconstruction due to a non-linear coupling between the shear and pressure parts of the elastic field at the boundary, especially when the sources are extended [5, 8, 34]. In fact, these components propagate at different wave speeds and polarization directions, and cannot be separated at the surface of the imaging domain. If we simply back-propagate the measured data, then the time-reversal operation mixes the components of the recovered support of the density . Specifically, it has been established in [5] that, by time reversing and back-propagating the elastic wave field signals as in (5), only a blurry image can be reconstructed together with an additive term introducing the coupling artifacts.
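The refocusing principle behind time reversal can be illustrated with a toy one-dimensional scalar-wave analogue (all parameters below are hypothetical and unrelated to the elastic setting): a pulse emitted at the source reaches a receiver after its travel time, and evaluating the recorded traces at the back-propagation travel times refocuses the energy at the source position.

```python
import numpy as np

c = 1.0                                    # wave speed (normalized)
x_src = 0.3                                # true (unknown) source position
receivers = [0.0, 1.0]                     # two boundary receivers
pulse = lambda t: np.exp(-200.0 * t**2)    # short Gaussian pulse

x_grid = np.linspace(0.0, 1.0, 1001)
image = np.zeros_like(x_grid)
for x_r in receivers:
    t_arrival = abs(x_r - x_src) / c       # recorded trace is pulse(t - t_arrival)
    # back-propagation: evaluate the trace at the travel time |x - x_r| / c
    image += pulse(np.abs(x_grid - x_r) / c - t_arrival)

x_hat = x_grid[np.argmax(image)]
print(x_hat)  # the image peaks at the source position, 0.3
```

In the elastic case the same idea applies, except that the pressure and shear contributions travel at different speeds, which is precisely what causes the coupling artifacts discussed above.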

As a simple remedy for the coupling artifacts, a surgical procedure is proposed in [5], leveraging a *Helmholtz decomposition* of (regarded as an *initial guess*). A weighted time-reversal imaging function (denoted by hereinafter) is constructed by separating the shear and pressure components of as

and then taking their weighted sum wherein the weights are respective wave speeds and the functions and are obtained by solving a weak Neumann problem. Precisely, is defined by

(6)

In fact, thanks to Parseval’s theorem and the fact that is compactly supported inside , it can be established that

for a large final control time with

After tedious manipulations, using the *elastic Helmholtz-Kirchhoff identities* (see, e.g., [5, Proposition 2.5]), and assuming to be a ball with radius , one finds that

Since

which comes from the integration of the time-dependent version of Eq. (1) between and , the following result holds (see, e.g., [5, Theorem 2.6]).

###### Theorem 1

Let be a ball in with large radius R. Let be sufficiently far from the boundary with respect to the wavelength and be defined by (6). Then,

We conclude this section with the following remarks. Let be the source-to-measurement operator, defined in (2). Then, it is easy to infer from Theorem 1 that its inverse (or the measurement-to-source) operator is given by

when the imaging domain is a ball with large radius . However, there are a few technical limitations. Firstly, if is not *sufficiently large* compared to the characteristic size of the support of , which in turn should be sufficiently localized at the *center* of the imaging domain (i.e., located far away from the boundary ), one can only get an approximation of which may not be very well resolved. Moreover, may not be able to effectively rectify the coupling artifacts in that case, as has been observed for extended sources in [5]. Secondly, like most of the contemporary conventional techniques, the time-reversal algorithm requires continuous measurements (or dense measurements at the Nyquist sampling rate). Therefore, as will be highlighted in the subsequent sections, very strong streaking artifacts appear when the time-reversal algorithm is applied to sparse measurements. In order to overcome these issues, a deep learning approach is discussed in the next section.

## 3 Deep learning approach for inverse elastic source problem

Let us consider the inverse elastic source problem with sparse measurements. Our aim is to recover from the relationship (4). Unfortunately, (4) is not uniquely solvable due to sub-sampling. In fact, the null space, , of the forward operator is non-trivial, i.e., there exist non-zero functions, , such that

Moreover, the existence of the non-radiating parts of the source also makes the solution non-unique. This suggests that there are infinitely many feasible solutions to the discrete problem (4). Hence, the application of the time-reversal algorithm, which requires the availability of continuous or dense measurements, results in strong imaging artifacts severely affecting the resolution of the reconstruction.

A typical way to avoid the non-uniqueness of the solution from sparse measurements is the use of regularization. Accordingly, many regularization techniques have been proposed over the past few decades. Among various penalties for regularization, here our discussion begins with a low-dimensional manifold constraint using a structured low-rank penalty [63], which is closely related to the deep learning approach proposed in this investigation.

### 3.1 Generic inversion formula under structured low-rank constraint

Let , for some integer , be a finite collection of sampling points of the region of interest conforming to the Nyquist sampling rate. In this section, a (discrete) approximation of the density is sought using a piece-wise constant or spline ansatz

where is the basis function for the -th coordinate, associated with . Accordingly, the discretized source density to be sought is introduced by

Let us define the row-vector by

where superposed and indicate the dependence on -th time instance for and -th boundary point for , respectively. The subscripts indicate that the -th component of the Kupradze matrix is invoked and the index indicates that the basis function associated with the internal mesh point for the -th coordinate is used. Accordingly, the sensing matrix is defined by

(7)

Then, the discrete version of the relationship (4) is given by

In order to facilitate the ensuing discussion, we define the wrap-around structured Hankel matrix associated to , for , by

where is the so-called matrix-pencil size. As shown in [25, 27, 62, 63] and reproduced in the Appendix to make this article self-contained, if the coordinate function corresponds to a smoothly varying perturbation, or it has either edges or patterns, then the corresponding Fourier spectrum is mostly concentrated in a small number of coefficients. Thus, if is a discretization of at the Nyquist sampling rate, then according to the sampling theory of signals with a finite rate of innovation (FRI) [53], there exists an annihilating filter whose convolution with the image vanishes. Furthermore, the annihilating filter size is determined by the sparsity level in the Fourier domain, so the associated Hankel structured matrix in the image domain is low-rank if the matrix-pencil size is chosen larger than the annihilating filter size. The interested reader is referred to the Appendix or the references [25, 27, 62, 63] for further details.
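The low-rank property of the wrap-around Hankel lifting can be checked numerically. The sketch below (signal length, pencil size, and mode indices are arbitrary choices of ours) builds the lifting of a signal whose spectrum has only two active Fourier modes and verifies that the matrix rank equals the number of modes.

```python
import numpy as np

N, d = 64, 16                              # signal length and matrix-pencil size
n = np.arange(N)
f = (np.exp(2j * np.pi * 3 * n / N)        # two active Fourier modes
     + 0.5 * np.exp(2j * np.pi * 7 * n / N))

# Wrap-around Hankel lifting: H[i, j] = f[(i + j) mod N]
H = np.empty((N, d), dtype=complex)
for i in range(N):
    for j in range(d):
        H[i, j] = f[(i + j) % N]

print(np.linalg.matrix_rank(H))  # 2: rank equals the number of active modes
```

Each Fourier mode contributes a rank-one term to the lifted matrix, so the rank is bounded by the spectral sparsity level as long as the pencil size exceeds it.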

In the same way, it is expected that the block Hankel structured matrix of the discrete source vector , constructed as

is low-rank, where . Let and where denotes the rank of a matrix. Then, a generic form of the low-rank Hankel structured constrained inverse problem can be formulated as

(8)

subject to

It is clear that, for a feasible solution of the regularization problem (8), the Hankel structured matrix , for , admits the singular value decomposition . Here, and denote the left and the right singular vector basis matrices, respectively, and refers to the diagonal matrix with singular values as elements. If there exist two pairs of matrices and , , for each and , satisfying the conditions

(9)

then

(10)

with the transformation given by

(11)

which is often called the *convolutional framelet coefficient* [62].
In Eq. (10),

(12)

where and denote the -th and the -th columns of and , respectively.
This implies that the Hankel matrix can be decomposed using the basis matrices . Here, the first condition in (9) is the so-called *frame condition*, denotes the range space of , and represents a projection onto [62].
In addition, the pair () is *non-local* in the sense that these matrices interact with all the components of the vector . On the other hand, the pair (, ) is *local* since these matrices interact with only components of .
Precisely, (10) is equivalent to the paired encoder-decoder convolution structure when it is un-lifted to the original signal space [62]

(13)

which is illustrated in Figure 2. The convolutions in (13) correspond to the multi-channel convolutions (as used in standard CNN) with the associated filters,

Here, the superposed prime over , for fixed , and , indicates its flipped version, i.e., the indices of are reversed [63].
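As a numerical sanity check of the perfect-reconstruction identity (13) in the simplest setting (trivial non-local basis, i.e., no pooling), the following sketch uses an illustrative two-channel Haar-like filter pair that satisfies the frame condition, implementing the circular convolutions via FFT. The filters are our choice, not the learned bases of the paper.

```python
import numpy as np

N = 32
rng = np.random.default_rng(0)
x = rng.standard_normal(N)

# Two-channel local basis; for these filters |F_lo|^2 + |F_hi|^2 = 1,
# i.e., the frame condition holds (tight frame).
filters = [np.array([0.5, 0.5]), np.array([0.5, -0.5])]

X = np.fft.fft(x)
# Encoder: circular correlation of x with each filter (multi-channel conv)
coeffs = [np.fft.ifft(np.conj(np.fft.fft(h, N)) * X) for h in filters]
# Decoder: circular convolution with the (flipped) filters, summed over channels
x_rec = sum(np.fft.ifft(np.fft.fft(h, N) * np.fft.fft(c))
            for h, c in zip(filters, coeffs)).real

print(np.allclose(x_rec, x))  # True: perfect reconstruction
```

In the frequency domain the encoder-decoder pair multiplies the spectrum by the sum of the squared filter magnitudes, so reconstruction is exact precisely when that sum is constant, which is the signal-space counterpart of the frame condition.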

Let us introduce the block matrices

Then, thanks to conditions in (9), the pairs (, ) and (, ) satisfy the conditions

Consequently,

with the matrix transformation given by

Let , for , refer to the space of signals admissible in the form (13), i.e.,

Then, the problem (8) can be converted to

(14)

or equivalently,

(15)

where the sub-matrices and are defined in Eqs. (7) and (3), respectively.

### 3.2 Extension to Deep Neural Network

One of the most important discoveries in [62] is that an encoder-decoder network architecture in the convolutional neural network (CNN) emerges from Eqs. (10), (11), and (13). In particular, the non-local basis matrices and play the role of user-specified *pooling* and *unpooling* operations, respectively (see Section 3.3), whereas the local bases and correspond to the *encoder and decoder layer convolutional filters* that have to be learned from the data [62].

Specifically, our goal is to learn () in a *data-driven* fashion so that the optimization problem (14) (or equivalently (15)) can be simplified. Toward this end, we first define (resp. ) as a right pseudo-inverse of (resp. ), i.e., (resp. ) for all , so that the cost in (14) (resp. (15)) is automatically minimized with the right pseudo-inverse solution. However, this solution leads to

where denotes the true solution and . Therefore, one looks for the matrices () such that

where the operator is defined in terms of the mapping = as

(16)

for all . In fact, the operator in (16) can be engineered so that its output belongs to . This can be achieved by selecting the filters ’s that *annihilate* the null-space components ’s, i.e.,

so that

In other words, the block filter should span the orthogonal complement of . Therefore, the local bases learning problem becomes

(17)

where

denotes the training data-set composed of the input and ground-truth pairs. This is equivalent to saying that the proposed neural network learns the local bases from the training data, assuming that the Hankel matrices associated with the discrete source densities are of rank [62].

Still, the convolutional framelet expansion is linear, so we restrict the space so that the framelet coefficient matrices have positive elements only, i.e., the signal lives in the conic hull of the bases to enable a part-by-part representation similar to non-negative matrix factorization (NMF) [38, 39, 40]:

for .
This positivity constraint can be implemented using the *rectified linear unit* (ReLU) [46] during training. Accordingly, the local basis learning problem (17) can equivalently be expressed as follows. Here, the operator is defined analogously to (16), but in terms of the mapping given by

where denotes ReLU, i.e., for arbitrary matrix , we have , for all and .

The geometric implication of this representation is illustrated in Figure 3. Specifically, the original image is first *lifted* to a higher dimensional space via the Hankel matrix , which is then decomposed into a positive (conic) combination using the matrix bases in (12). During this procedure, the outlier signals (black color) are placed outside of the conic hull of the bases, so that they can be removed during the decomposition. When this high-dimensional conic decomposition procedure is observed in the original signal space, it becomes a one-level encoder-decoder neural network with ReLU. Therefore, an encoder-decoder network can be understood as the signal space manifestation of the conic decomposition of the signal lifted to a higher-dimensional space.

The idea can be further extended to a multi-layer deep neural network. Specifically, suppose that the encoder and decoder convolution filters and can be represented as a cascaded convolution of small-length filters:

then the signal space is recursively defined as

where, for all ,

(18)

Here, the -th layer encoder and decoder filters, and , are given by

where , , and are the filter lengths, the number of input channels, and the number of output channels, respectively. This is equivalent to recursively applying the high-dimensional conic decomposition procedure to the next-level convolutional framelet coefficients, as illustrated in Figure 4(a). The resulting signal space manifestation is the deep neural network shown in Figure 4(b).
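The factorization into cascaded short filters relies on the associativity of convolution, which can be verified numerically (the filter lengths and signal length below are arbitrary):

```python
import numpy as np

# Cascading short convolution filters is equivalent to a single longer
# filter: conv(h2, conv(h1, x)) == conv(conv(h2, h1), x).
rng = np.random.default_rng(1)
x = rng.standard_normal(50)
h1 = rng.standard_normal(3)   # two small (length-3) filters
h2 = rng.standard_normal(3)

cascaded = np.convolve(h2, np.convolve(h1, x))
combined = np.convolve(np.convolve(h2, h1), x)   # one length-5 filter

print(np.allclose(cascaded, combined))  # True
```

This is why stacking layers of small filters enlarges the effective filter length (and hence the receptive field) without changing the nature of the operation.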

### 3.3 Dual-Frame U-Net

As discussed before,
the non-local bases and correspond to the generalized pooling and unpooling operations, which can be designed by the users for specific inverse problems. Here, the key requisite is the frame condition in (9), i.e., . As the artifacts of the time-reversal recovery from sparse measurements are distributed globally, a network architecture with large receptive fields is needed. Thus, in order to learn the optimal local bases from the minimization problem (17), we adopt the commonly used CNN architecture known as *U-Net* [48] and its deep convolutional framelets based variant, coined *Dual-Frame U-Net* [22] (see Figure 5). These networks have pooling layers with down-sampling, resulting in exponentially large receptive fields.

As shown in [22], one of the main limitations of the standard U-Net is that it does not satisfy the frame condition in (9). Specifically, by considering both the skipped connection and the pooling in Figure 5(a), the non-local basis for the standard U-Net is given by

(19)

where denotes an average pooling operator given by

Moreover, the unpooling layer in the standard U-Net is given by . Therefore,

Consequently, the frame condition in (9) is not satisfied. As shown in [62], this results in the duplication of the low-frequency components, making the final results blurry.
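This duplication of low-frequency content can be reproduced in a simplified one-level model, where the pooling is an orthonormal-row average-pooling matrix `P` (our notation) and the skip connection contributes the identity; the exact matrices in the network differ, so this is only an illustration.

```python
import numpy as np

# Average pooling with orthonormal rows: P[i, 2i] = P[i, 2i+1] = 1/sqrt(2)
n = 8
P = np.zeros((n // 2, n))
for i in range(n // 2):
    P[i, 2 * i] = P[i, 2 * i + 1] = 1.0 / np.sqrt(2.0)

assert np.allclose(P @ P.T, np.eye(n // 2))  # orthonormal rows

# Standard U-Net: identity skip plus P^T-unpooling gives I + P^T P,
# which is not the identity -- locally constant content is doubled.
composite = np.eye(n) + P.T @ P
print(composite @ np.ones(n))  # equals 2 * ones: low frequencies duplicated
```

Applying the composite operator to a constant vector returns twice that vector, which is the matrix-level counterpart of the low-frequency duplication (and resulting blur) described above.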

In order to address the aforementioned problem, in the Dual-Frame U-Net [22], the dual frame is directly implemented. More specifically, the dual frame for the specific frame operator (19) is given by

where the matrix inversion lemma and the orthogonality are invoked to arrive at the last equality. It was shown in [22] that the corresponding generalized unpooling operation is given by

(20)

where denotes the skipped component.
Equation (20) suggests a network structure for the Dual-Frame U-Net.
More specifically, unlike the U-Net, the *residual signal* at the low resolution should be upsampled through the unpooling layer and subtracted from the by-pass signal to eliminate the duplicate contribution of the low-frequency components.
This can be easily implemented using an additional bypass connection for the low-resolution signal, as shown in Figure 5(b).
This simple fix allows the proposed network to satisfy the frame condition (9).
The interested reader is referred to [22] for further details.
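A one-level numerical sketch (assuming identity processing of the low-resolution branch, which is a simplification of the trained network, and the same illustrative average-pooling matrix `P` as before) contrasts the standard and dual-frame unpooling:

```python
import numpy as np

n = 8
P = np.zeros((n // 2, n))
for i in range(n // 2):
    P[i, 2 * i] = P[i, 2 * i + 1] = 1.0 / np.sqrt(2.0)

rng = np.random.default_rng(2)
x = rng.standard_normal(n)
b, c = x, P @ x                      # bypass signal and pooled signal

unet_rec = b + P.T @ c               # standard U-Net: x + P^T P x != x
dual_rec = b + P.T @ (c - P @ b)     # dual-frame residual unpooling as in (20)

print(np.allclose(dual_rec, x), np.allclose(unet_rec, x))  # True False
```

Subtracting the unpooled residual removes the duplicate low-frequency contribution, so the one-level dual-frame structure reproduces its input exactly, consistent with the frame condition being satisfied.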

## 4 Network design and training

Let us now design the U-Net and Dual-Frame U-Net neural networks for elastic source imaging based on the analysis performed in the previous section. For simplicity, consider a 2D case (i.e., ) for the recovery of the -component (i.e., ) of the unknown source. The -component (i.e., ) of the source can be obtained in exactly the same fashion using the network architectures discussed above.

### 4.1 Description of the forward solver and time-reversal algorithm

For numerical illustrations and the generation of training data, the region of interest is considered to be a unit disk centered at the origin. Each solution of the elastic wave equation is computed over the box so that , i.e., with