Image Reconstruction: From Sparsity to Data-adaptive Methods and Machine Learning

The field of image reconstruction has undergone four waves of methods. The first wave was analytical methods, such as filtered back-projection (FBP) for X-ray computed tomography (CT) and the inverse Fourier transform for magnetic resonance imaging (MRI), based on simple mathematical models for the imaging systems. These methods are typically fast, but have suboptimal properties such as poor resolution-noise trade-off for CT. The second wave was iterative reconstruction methods based on more complete models for the imaging system physics and, where appropriate, models for the sensor statistics. These iterative methods improved image quality by reducing noise and artifacts. The FDA-approved methods among these have been based on relatively simple regularization models. The third wave of methods has been designed to accommodate modified data acquisition methods, such as reduced sampling in MRI and CT to reduce scan time or radiation dose. These methods typically involve mathematical image models involving assumptions such as sparsity or low-rank. The fourth wave of methods replaces mathematically designed models of signals and processes with data-driven or adaptive models inspired by the field of machine learning. This paper reviews the progress in image reconstruction methods with focus on the two most recent trends: methods based on sparsity or low-rank models, and data-driven methods based on machine learning techniques.

I Introduction

Various imaging modalities are popular in clinical practice such as magnetic resonance imaging (MRI), X-ray computed tomography (CT), positron-emission tomography (PET), single-photon emission computed tomography (SPECT), etc. These modalities help image various biological and anatomical structures and physiological functions, and aid in medical diagnosis and treatment. Ensuring high quality and resolution of images reconstructed from limited or corrupted (e.g., noisy) measurements such as subsampled data in MRI (which reduce acquisition time) or low-dose or sparse-view data in CT (which reduce patient radiation exposure) has been a popular area of research and holds high value in clinical practice in improving clinical throughput and patient experience. This paper reviews some of the major recent advances in the field of image reconstruction, focusing on methods that use sparsity, low-rankness, and machine learning.

I-A Timeline of Image Reconstruction

Image reconstruction methods for various modalities have undergone significant advances over the past few decades. These advances can be broadly grouped in four phases. The first or early phase comprised analytical and algebraic methods. These methods include the classical filtered back-projection (FBP) methods for X-ray CT (e.g., the Feldkamp-Davis-Kress or FDK method [1]) and the inverse fast Fourier transform (FFT), along with extensions such as the nonuniform fast Fourier transform (NUFFT) [2, 3, 4], for MRI and CT. These methods are based on relatively simple mathematical models of the imaging systems, and although they have efficient, fast implementations, they suffer from suboptimal properties such as poor resolution-noise trade-off for CT.

The second phase of reconstruction methods involved iterative reconstruction algorithms that were based on more sophisticated models for the imaging system’s physics and models for sensor and noise statistics. Often referred to as model-based image reconstruction (MBIR) methods or statistical image reconstruction (SIR) methods, these schemes iteratively estimate the unknown image based on the system (physical or forward) model, measurement statistical model, and assumed prior information about the underlying object [5, 6, 7, 8]. For example, minimizing penalized weighted-least squares (PWLS) cost functions has been popular in many modalities including PET and X-ray CT, and these costs include a statistically weighted quadratic data-fidelity term (capturing the imaging forward model and noise model) and a penalty term called regularizer that models the prior information about the object [9]. These iterative reconstruction methods improved image quality by reducing noise and artifacts. In MRI, parallel data acquisition methods (P-MRI) exploit the diversity of multiple receiver coils to acquire fewer Fourier or k-space samples [10]. Today, P-MRI acquisition is used widely in commercial systems, and MBIR-type methods in this case include those based on coil sensitivity encoding (SENSE) [10], etc. The iterative medical image reconstruction methods approved by the U.S. Food and Drug Administration (FDA) for SPECT, PET, and X-ray CT have been based on relatively simple regularization models.

The third phase of reconstruction methods were developed to accommodate modified data acquisition methods such as reduced sampling in MRI and CT to significantly reduce scan time and/or radiation dose. Compressed sensing (CS) techniques [11, 12, 13, 14, 15] have been particularly popular among this class of methods (cf. the IEEE TMI special issue on CS [16]), and have been so beneficial for MRI [17, 18] that they recently got FDA approval [19, 20, 21]. CS theory predicts the recovery of images from far fewer measurements than the number of unknowns provided that the image is sparse in a transform domain or dictionary, and the acquisition or sampling procedure is appropriately incoherent with the transform. Since MR acquisition in Fourier or k-space occurs sequentially over time, making it a relatively slow modality, CS for MRI can enable quicker acquisition by collecting fewer k-space samples. However, the reduced sampling time comes at the cost of slower, nonlinear, iterative reconstruction. The methods for reconstruction from limited data typically exploit mathematical image models based on sparsity or low-rank, etc. In particular, CS-based MRI methods use variable density random sampling techniques to acquire the data and use sparsifying transforms such as wavelets, finite difference operators (via total variation (TV) penalty), contourlets, etc., for reconstruction [17, 22]. Research in the third phase also focused on developing new theory and guarantees for sampling and reconstruction from limited data [23], new optimization algorithms for reconstruction with good convergence rates [24, 25], etc.

The fourth wave of image reconstruction methods replaced mathematically designed models of images and processes with data-driven or adaptive models inspired by the field of machine learning. Such models (e.g., synthesis dictionaries [26], sparsifying transforms [27], tensor models, etc.) can be learned in various ways such as by using training datasets [28, 29], or even learned jointly with the reconstruction [30, 28, 31, 32], a setting called model-blind reconstruction or blind compressed sensing (BCS) [33]. While most of these methods perform offline reconstruction (where the reconstruction is performed once all the measurements are collected), recent works show that the models can also be learned in a time-sequential or online manner from streaming measurements to reconstruct dynamic objects [34, 35]. The learning can be done in an unsupervised manner employing model-based and surrogate cost functions, or the reconstruction algorithms (such as deep convolutional neural networks (CNNs)) can be trained in a supervised manner to minimize the error in reconstructing training datasets that typically consist of pairs of ground truth and undersampled data [36, 37, 38]. These fourth-generation learning-based reconstruction methods form a very active field of research with numerous conference special sessions and special journal issues devoted to the topic [39].

I-B Focus and Outline of This Paper

This paper reviews the progress in medical image reconstruction, focusing on the two most recent trends: methods based on sparsity using analytical models, and low-rank models and extensions that combine sparsity and low-rank, etc.; and data-driven models and approaches exploiting machine learning. Some of the mathematical underpinnings and connections between different models and their pros and cons are also discussed.

The paper is organized as follows. Section II describes early image reconstruction approaches, especially those used in current clinical systems. Sections III and IV describe sparsity and low-rank based approaches for image reconstruction. Then, Section V surveys the advances in data-driven image models and related machine learning approaches for image reconstruction. Among the learning-based methods, techniques that learn image models using model-based cost functions from training data, or on-the-fly from measurements are discussed, followed by recent methods relying on supervised learning of models for reconstruction, typically from datasets of high quality images and their corrupted versions. Section VI reviews the very recent works using learned convolutional neural networks (a.k.a. deep learning) for image reconstruction. Section VII discusses some of the current challenges and open questions in image reconstruction and outlines future directions for the field. Section VIII concludes this review paper.

II Early Approaches

This section focuses on the iterative MBIR methods that are in routine clinical use currently, and relates the models used in those systems to the sparsity models used in the contemporary literature. As mentioned in the introduction, MBIR methods have been used routinely for many years in commercial SPECT, PET and CT systems. Early publications on MBIR methods tended to focus on “Bayesian” methods, which is slightly ironic because these early methods were based on mathematical models, not actual prior data sets. In contrast, recent data-driven methods that are based on empirical distributions from training data, as discussed later in the paper, are arguably more properly Bayesian. The dominant Bayesian approach for reconstructing an image $x$ from data $y$ was the maximum a posteriori (MAP) approach of finding the maximizer of the posterior $p(x \mid y)$. By Bayes rule, the MAP approach is equivalent to

$$\hat{x} = \arg\max_{x} \, p(x \mid y) = \arg\min_{x} \, \left\{ -\log p(y \mid x) - \log p(x) \right\}, \tag{1}$$

where $-\log p(y \mid x)$ denotes the negative log-likelihood that describes the imaging system physics and noise statistics. The benefits of modeling the system noise and physics properties were the primary driver for the early work on MBIR methods for PET and SPECT, compared to classical reconstruction methods like FBP that use quite simple geometric models and lack statistical modeling. The function $p(x)$ in (1) denotes a Bayesian prior that captures assumptions about the image $x$. Markov random field (MRF) models were particularly popular in early work; these methods typically prescribe higher prior probabilities for images $x$ where neighboring pixels tend to have similar values [40]. Although the term sparsity is uncommon in papers about MRF models, the “older” assumption that neighboring pixels tend to have similar values is quite closely related to the “newer” assumption that the differences between neighboring pixel values tend to be sparse.

The form of (1) is equivalent to the following regularized optimization problem:

$$\hat{x} = \arg\min_{x} \, f(x) + \beta R(x), \tag{2}$$

where $f(x)$ denotes a data-fidelity term and $R(x)$ denotes a regularizer that encourages the image to have some assumed properties such as piece-wise smoothness. The positive regularization parameter $\beta$ controls the trade-off between over-fitting the (noisy) data and over-smoothing the image. More recent MBIR papers, and the commercial methods, tend to adopt this regularization perspective rather than using Bayesian terminology. Early commercial PET and SPECT reconstruction methods used unregularized algorithms [41], but more recent methods use edge-preserving regularization involving differences between neighboring pixels [42], essentially implicitly assuming that the image gradients are sparse. In 1D, a typical regularizer would be

$$R(x) = \sum_{j=2}^{N} \psi(x_j - x_{j-1}), \tag{3}$$

where $N$ is the number of pixels, and $\psi$ denotes a “potential function” (in Bayesian parlance) such as the hyperbola $\psi(z) = \sqrt{z^2 + \delta^2}$. (A few modifications of the regularizer are needed to make it work well in practice [43, 44].) MBIR methods for clinical CT systems also use edge-preserving regularization [45]. These clinically used regularizers are relatives of the total variation (TV) regularizer that is studied widely in the academic literature. However, TV imposes a strong assumption of gradient sparsity because it uses the nonsmooth absolute value potential $\psi(z) = |z|$ that is well-suited to images that are piece-wise constant but less suitable for images that are piece-wise smooth. In particular, the TV regularizer leads to CT images with undesirable patchy textures; so the commercial systems use an edge-preserving regularizer that does not enforce sparsity as strictly [45]. In summary, the current clinical methods for PET, SPECT and CT use optimization formulations of the form (2) with regularizers akin to (3), thereby moderately encouraging gradient sparsity.
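As an illustration of the 1D edge-preserving regularizer described above, the following minimal sketch evaluates a sum of potential functions applied to neighboring-pixel differences. The function names, the shift of the hyperbola by a constant so that the potential is zero at zero, and the test signals are assumptions made for this example; they are not taken from any clinical implementation.

```python
import numpy as np

def hyperbola(z, delta):
    """Hyperbola potential sqrt(z^2 + delta^2) - delta.
    The '- delta' shift is an assumption of this sketch; it only removes a constant."""
    return np.sqrt(z**2 + delta**2) - delta

def regularizer_1d(x, delta):
    """R(x) = sum over j of psi(x_j - x_{j-1}) for a 1D signal x."""
    return np.sum(hyperbola(np.diff(x), delta))

def regularizer_1d_grad(x, delta):
    """Gradient of R(x), using psi'(z) = z / sqrt(z^2 + delta^2)."""
    d = np.diff(x)
    w = d / np.sqrt(d**2 + delta**2)
    g = np.zeros(len(x))
    g[1:] += w    # d/dx_j of psi(x_j - x_{j-1})
    g[:-1] -= w   # d/dx_{j-1} of psi(x_j - x_{j-1})
    return g

# Hypothetical test signals: a sharp edge and a smooth ramp with the same total rise.
x_edge = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])
x_ramp = np.linspace(0.0, 1.0, 6)

for delta in (1.0, 0.1, 0.001):
    print(delta, regularizer_1d(x_edge, delta), regularizer_1d(x_ramp, delta))

# A quadratic potential, for comparison, penalizes the edge far more than the ramp.
print("quadratic:", np.sum(np.diff(x_edge)**2), np.sum(np.diff(x_ramp)**2))
```

For large `delta` the potential behaves like a quadratic and favors the ramp, whereas for small `delta` it approaches the absolute value used by TV and treats the edge and the ramp almost equally. This is one way to see why such potentials are called edge-preserving.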

III Sparsity Using Mathematical Models

This section discusses image reconstruction methods that are based on models for the image that involve some form of sparsity. Such methods are now being used clinically to accelerate MRI scans, making such scans shorter, reducing the effects of patient motion and improving patient comfort.

The regularizer based on finite differences in (3) is equivalent to assuming the image gradients are sparse. This model is a special case of the more general assumption that $Tx$ is sparse for some spatial transform $T$, i.e., the image is “transform sparse.” This is called “analysis regularization” and a typical image reconstruction optimization formulation for such models is

$$\hat{x} = \arg\min_{x} \, f(x) + \beta \|Tx\|_1. \tag{4}$$

There are many transforms that have been used for image reconstruction; the two most popular ones are finite-differences, corresponding to TV, and various wavelet transforms. Wavelets are the transform model used in the JPEG 2000 image compression standard, because they are effective at sparsifying natural images. The combination of both wavelets and TV is particularly common in MRI [17], and, although the details are proprietary, it is likely that such combinations are used in the commercial MRI systems, e.g., [46].

An alternative to the analysis regularization model (4) is to assume that the image can be represented as a sparse linear combination of atoms from a dictionary, i.e., $x = Dz$, where $D$ is a dictionary and $z$ is a coefficient vector. One way to express this assumption as an optimization problem is

$$\hat{x} = D\hat{z}, \qquad \hat{z} = \arg\min_{z} \, \tfrac{1}{2}\|ADz - y\|_2^2 + \beta \|z\|_1, \tag{5}$$

where $A$ denotes the imaging system model. A drawback of this synthesis sparsity formulation is that it relies heavily on the assumption that $x = Dz$, whereas an approximate form $x \approx Dz$ may be more reasonable in practice, particularly when the dictionary comes from a mathematical model that might not perfectly represent natural medical images. An alternative synthesis formulation that allows an approximate sparsity model is:

$$\hat{x} = \arg\min_{x} \, \tfrac{1}{2}\|Ax - y\|_2^2 + \beta \min_{z} \left( \tfrac{1}{2}\|x - Dz\|_2^2 + \alpha \|z\|_1 \right).$$

A drawback of this approach is that it requires one to select two regularization parameters ($\beta$ and $\alpha$).
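To make the exact synthesis formulation concrete, the sketch below minimizes 0.5*||A D z - y||_2^2 + beta*||z||_1 over the coefficient vector z with the iterative soft-thresholding algorithm (ISTA) and then forms the image as D z. ISTA is chosen here only because it is short; the random real-valued system matrix, the orthonormal dictionary, and the parameter values are assumptions of this example and do not model any particular scanner.

```python
import numpy as np

def soft_threshold(v, t):
    """Elementwise soft thresholding, the proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def synthesis_cost(A, D, y, beta, z):
    """Cost 0.5 * ||A D z - y||_2^2 + beta * ||z||_1."""
    return 0.5 * np.sum((A @ (D @ z) - y) ** 2) + beta * np.sum(np.abs(z))

def ista_synthesis(A, D, y, beta, n_iter=500):
    """ISTA for the synthesis sparsity problem; returns the image D z and the code z."""
    B = A @ D
    lipschitz = np.linalg.norm(B, 2) ** 2   # largest eigenvalue of B^T B
    z = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = B.T @ (B @ z - y)            # gradient of the data-fidelity term
        z = soft_threshold(z - grad / lipschitz, beta / lipschitz)
    return D @ z, z

# Hypothetical toy problem: 20 measurements of a 40-sample signal that is 3-sparse in D.
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 40)) / np.sqrt(20)
D, _ = np.linalg.qr(rng.standard_normal((40, 40)))   # orthonormal dictionary
z_true = np.zeros(40)
z_true[[3, 17, 29]] = [1.0, -2.0, 1.5]
y = A @ (D @ z_true)

x_hat, z_hat = ista_synthesis(A, D, y, beta=0.01)
print("cost at zero:", synthesis_cost(A, D, y, 0.01, np.zeros(40)))
print("cost at ISTA output:", synthesis_cost(A, D, y, 0.01, z_hat))
print("nonzeros in z_hat:", np.count_nonzero(z_hat))
```

The approximate synthesis formulation could be handled similarly by alternating between an update of the image and an update of the code, at the price of tuning the second regularization parameter.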

The drawback of all of the models discussed in this section is that, traditionally, $T$ and $D$ are designed mathematically rather than being informed by the data. Nevertheless, they are useful, as evidenced by their adoption in clinical MRI systems. The methods in subsequent sections extend these methods to more data-driven approaches.

IV Low-rank Models

While sparsity models have been popular in image reconstruction, particularly in CS, various alternative models exploiting properties such as the inherent low-rankness of the data have also shown promise in imaging applications. This section reviews some of the low-rank models and their extensions such as when combined with sparsity, followed by recent structured low-rank matrix approaches [47, 48, 49, 50, 51, 52].

IV-A Low-Rank Models and Extensions

Low-rank models have been exploited in many imaging applications such as dynamic MRI [53], functional MRI [54], and MR fingerprinting (MRF) [55]. Low-rank assumptions are especially useful when processing dynamic or time-series data, and have been popular in dynamic MRI, where the underlying image sequence tends to be quite correlated over time. In dynamic MRI, the measurements are inherently undersampled because the object changes as the samples are collected. Reconstruction methods therefore typically pool the k-t space data in time to make sets of k-space data that appear to have sufficient samples. (The underlying dynamic object is written in the form of a Casorati matrix [53], whose rows represent voxels and whose columns denote temporal frames, and the sets of k-space data denote measurements of such frames.) However, these methods can have poor temporal resolution and artifacts due to pooling. Careful model-based (CS-type) techniques can help achieve improved temporal or spatial resolution in such undersampled settings.

Several works have exploited low-rankness of the underlying Casorati (space-time) matrix for dynamic MRI reconstruction [53, 56, 57, 58]. Low-rank modeling of local space-time image patches has also been investigated in [59]. Later works combined low-rank (L) and sparsity (S) models for improved reconstruction. Some of these works model the dynamic image sequence as both low-rank and sparse (L & S) [60, 61]. There has also been growing interest in models that decompose the dynamic image sequence into the sum of a low-rank and sparse component (a.k.a. robust principal component analysis (RPCA)) [62, 63]. In this L+S model, the low-rank component can capture the background or slowly changing parts of the dynamic object, whereas the sparse component can capture the dynamics in the foreground such as local motion or contrast changes, etc.

Recent works have applied the L+S model to dynamic MRI reconstruction [64, 65], with the S component modeled as sparse by itself or in a known transform domain. Accurate reconstructions can be obtained [64] when the underlying L and S components are incoherent (distinguishable) and the k-t space acquisition is appropriately incoherent with these components. The L+S reconstruction problem can be formulated as follows:

$$\min_{x_L, x_S} \, \tfrac{1}{2}\|A(x_L + x_S) - y\|_2^2 + \lambda_L \|R_1(x_L)\|_* + \lambda_S \|T x_S\|_1. \tag{6}$$

Here, the underlying vectorized object satisfies the L+S decomposition $x = x_L + x_S$. The sensing operator $A$ acting on it can take various forms. For example, in parallel imaging of a dynamic object, $A$ performs frame-by-frame multiplication by coil sensitivities (in the SENSE approach) followed by undersampled Fourier encoding. The low-rank regularization penalizes the nuclear norm of $R_1(x_L)$, where $R_1(\cdot)$ reshapes its input into a space-time matrix. The nuclear norm serves as a convex surrogate or envelope for the nonconvex matrix rank. The sparsity penalty on $x_S$, applied directly or after a known sparsifying transform $T$, has a similar form as in CS approaches, and $\lambda_L$ and $\lambda_S$ are non-negative weights above. Problem (6) is convex and can be solved using various iterative techniques. Otazo et al. [64] used the proximal gradient method, wherein the updates involved simple singular value thresholding (SVT) for the L component and soft thresholding for the S component. A data-driven version of the L+S model is mentioned later in Section V-B. While the above works used low-rank models of matrices (e.g., obtained by reshaping the underlying multi-dimensional dynamic object into a space-time matrix), some recent works also used low-rank tensor models of the underlying object (a tensor) in reconstruction [66, 67].
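The proximal gradient iteration for an L+S problem can be sketched in a few lines. The example below is a simplified stand-in rather than the method of [64]: it assumes the sensing operator is an entrywise sampling mask applied to a real-valued space-time matrix (instead of coil sensitivities and undersampled Fourier encoding), takes the sparsifying transform to be the identity, and uses hypothetical function names and parameter values.

```python
import numpy as np

def svt(X, tau):
    """Singular value thresholding: proximal operator of tau * nuclear norm."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def soft(X, tau):
    """Entrywise soft thresholding: proximal operator of tau * l1 norm."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def lps_cost(L, S, Y, mask, lam_L, lam_S):
    """0.5*||mask*(L+S) - Y||_F^2 + lam_L*||L||_* + lam_S*||S||_1."""
    resid = mask * (L + S) - Y
    nuc = np.linalg.svd(L, compute_uv=False).sum()
    return 0.5 * np.sum(resid**2) + lam_L * nuc + lam_S * np.abs(S).sum()

def l_plus_s(Y, mask, lam_L, lam_S, n_iter=100):
    """Proximal gradient for the masked L+S problem; returns L, S and the cost history."""
    L = np.zeros_like(Y)
    S = np.zeros_like(Y)
    step = 0.5   # the stacked operator [A, A] has squared norm at most 2 for a 0/1 mask
    costs = []
    for _ in range(n_iter):
        G = mask * (mask * (L + S) - Y)        # gradient w.r.t. both L and S
        L = svt(L - step * G, step * lam_L)    # low-rank update
        S = soft(S - step * G, step * lam_S)   # sparse update
        costs.append(lps_cost(L, S, Y, mask, lam_L, lam_S))
    return L, S, costs

# Hypothetical data: a static rank-1 background plus a few sparse dynamic entries.
rng = np.random.default_rng(0)
L_true = np.outer(rng.standard_normal(30), np.ones(20))
S_true = np.zeros((30, 20))
S_true[rng.integers(0, 30, 15), rng.integers(0, 20, 15)] = 5.0
mask = (rng.random((30, 20)) < 0.7).astype(float)
Y = mask * (L_true + S_true)

L_hat, S_hat, costs = l_plus_s(Y, mask, lam_L=1.0, lam_S=0.3)
print("first and last cost:", costs[0], costs[-1])
print("singular values of L_hat above 1e-6:",
      int(np.sum(np.linalg.svd(L_hat, compute_uv=False) > 1e-6)))
```

Both updates use the same gradient because the data-fidelity term depends on the two components only through their sum, while the two penalties are separable and therefore have independent proximal operators.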

IV-B Low-Rank Structured Matrix Models

The low-rank Hankel structure matrix approaches [48, 49, 50, 68, 51, 52, 69, 70, 47] have been studied extensively for various imaging problems and are based on the fundamental duality between spatial domain sparsity and the spectral domain Hankel matrix rank. To explain this concept, we first briefly review the literature on the sampling theory of signals having finite rate of innovations (FRI) [71, 72, 73].

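Consider the superposition of $r$ Dirac impulses:

$$x(t) = \sum_{j=0}^{r-1} a_j \, \delta(t - t_j), \qquad t_j \in [0, 1). \tag{7}$$

The associated Fourier series coefficients are given by

$$\hat{x}[k] = \sum_{j=0}^{r-1} a_j \, e^{-i 2\pi k t_j}. \tag{8}$$

The sampling theory for FRI signals [71, 72] showed that there exists an annihilating filter $\hat{h}[k]$ in the Fourier domain, of length $r+1$, such that

$$(\hat{h} * \hat{x})[k] = \sum_{l=0}^{r} \hat{h}[l] \, \hat{x}[k-l] = 0 \quad \forall k, \tag{9}$$

whose $z$-transform representation is given by

$$\hat{h}(z) = \sum_{l=0}^{r} \hat{h}[l] \, z^{-l} = \prod_{j=0}^{r-1} \left( 1 - e^{-i 2\pi t_j} z^{-1} \right). \tag{10}$$

In [69], it was shown that (9) implies that the following Hankel structured matrix, built from $n$ consecutive Fourier coefficients, is rank-deficient:

$$\mathcal{H}(\hat{x}) = \begin{bmatrix} \hat{x}[0] & \hat{x}[1] & \cdots & \hat{x}[d-1] \\ \hat{x}[1] & \hat{x}[2] & \cdots & \hat{x}[d] \\ \vdots & \vdots & \ddots & \vdots \\ \hat{x}[n-d] & \hat{x}[n-d+1] & \cdots & \hat{x}[n-1] \end{bmatrix},$$

where $d > r$ denotes the number of columns (and the matrix has at least $r$ rows). More specifically, it was shown in [69] that if the minimum annihilating filter length is $r+1$, then

$$\operatorname{rank} \mathcal{H}(\hat{x}) = r.$$

Thus, given sparsely sampled spectral measurements on the index set $\Omega$, the missing spectrum estimation problem can be formulated as

$$\min_{m \in \mathbb{C}^{n}} \; \operatorname{rank} \mathcal{H}(m) \tag{11}$$
$$\text{subject to} \quad P_{\Omega}(m) = P_{\Omega}(\hat{x}), \tag{12}$$

where $P_{\Omega}$ denotes the projection on the measured k-space samples on the index set $\Omega$. Although the above discussion is for Dirac impulses, the same principle holds for general FRI signals that can be converted to Diracs or differentiated Diracs after a whitening operator, since the corresponding Fourier spectrum is a simple element-wise multiplication of the spectra of the operator and the unknown signal, and the weighted spectrum has a low-rank Hankel structure [48, 69].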
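The link between spatial sparsity and Hankel matrix rank is easy to check numerically. The sketch below builds the Fourier coefficients of a small stream of Diracs, forms a Hankel matrix from them, and inspects its rank and a null vector (which plays the role of an annihilating filter, up to index reversal). The amplitudes, locations, and sizes are hypothetical values chosen for this example, and a plain (non-wrap-around) Hankel matrix is assumed.

```python
import numpy as np

def fourier_coeffs(amplitudes, locations, n):
    """x_hat[k] = sum_j a_j * exp(-i 2 pi k t_j) for k = 0, ..., n-1."""
    k = np.arange(n)
    phases = np.exp(-2j * np.pi * np.outer(k, locations))   # shape (n, r)
    return phases @ amplitudes

def hankel_matrix(x_hat, d):
    """Hankel matrix with d columns whose (i, l) entry is x_hat[i + l]."""
    rows = len(x_hat) - d + 1
    idx = np.arange(rows)[:, None] + np.arange(d)[None, :]
    return x_hat[idx]

# Hypothetical signal: r = 3 Diracs in [0, 1).
a = np.array([1.0, -0.5, 2.0])
t = np.array([0.10, 0.35, 0.80])
n, d = 32, 8

H = hankel_matrix(fourier_coeffs(a, t, n), d)
rank = np.linalg.matrix_rank(H, tol=1e-8)
print("Hankel matrix shape:", H.shape, "numerical rank:", rank)

# A right singular vector for a (numerically) zero singular value annihilates every row.
_, s, Vh = np.linalg.svd(H)
v = Vh[-1].conj()
print("smallest singular value:", s[-1], "residual ||H v||:", np.linalg.norm(H @ v))
```

Low-rank Hankel completion methods exploit exactly this property: when some Fourier samples are missing, they are filled in so that the resulting Hankel matrix has the lowest possible rank.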

In contrast to standard compressed sensing approaches, the optimization problem in (11) is purely in the measurement domain. After estimating the fully sampled Fourier data, the final reconstruction can be obtained by a simple inverse Fourier transform. This property leads to remarkable flexibility in real-world applications, which classical approaches have difficulty exploiting. For example, this formulation has been successfully applied to compressed sensing MRI with state-of-the-art performance for single coil imaging [47, 48, 49, 69, 50]. Another flexibility is the recovery of images from multichannel measurements with unknown sensitivities [74, 48]. These schemes rely on the low-rank structure of a structured matrix, obtained by concatenating block Hankel matrices formed from each channel’s data. Similar to the L+S decomposition in [64], the L+S model for the Hankel structured matrix was also used to remove k-space outliers in MR imaging problems [52]. Beyond the medical imaging applications, such approaches have been successfully used for super-resolution microscopy [75], image inpainting problems [76], image impulse noise removal [77], etc.

The common thread between sparsity models and low-rank models is that both approaches strive to capture signal redundancies to make up for missing or noisy data.

V Data-Driven and Learning-Based Models

The most recent class of methods, constituting the fourth wave in image reconstruction, exploits data-driven and learning-based models. This section and the next review these varied models and methods. The pros and cons of various methods, as well as their connections, are also discussed.

V-A Partially Data-Adaptive Sparsity-Based Methods

While early reconstruction methods such as in CS MRI used sparsity in known transform domains such as wavelets [17], total variation domain, contourlets [22], etc., later works proposed partially data-adaptive sparsity models by incorporating directional information of patches or block matching, etc., during reconstruction.

Qu et al. [78] proposed a patch-based directional wavelets (PBDW) scheme for MRI, wherein the regularizer was based on analysis sparsity and was the sum of the $\ell_1$ norms of each optimally (adaptively) rearranged and transformed (by fixed 1D Haar wavelets) image patch. The patch rearrangement or permutation involved rearranging pixels parallel to a certain geometric direction, approximating patch rotation. The best permutation for each patch from among a set of pre-defined permutations was pre-computed based on an initial reconstruction to minimize the residual between the transformed permuted patch and its thresholded version (which retains only its largest coefficients). While one could presumably alternate in this framework between estimating the optimal rearrangements of patches and the reconstruction with the updated sparsity regularization, the image quality improvements were observed to be negligible after only two iterations [78].

Ning et al. [79] proposed an improved PBDWS method, where the optimal permutations were computed for patches extracted from the subbands in the 2D wavelet domain (a shift-invariant discrete wavelet transform (SIDWT) is used) of the image. Recently, Zhan et al. [80] proposed a different modification of the PBDW scheme, wherein a unitary matrix is adapted to sparsify the patches grouped with a common (optimal) permutation, and is used in the aforementioned analysis sparsity penalty during reconstruction. In this case, the $\ell_1$ norms of the patches transformed by the adapted unitary matrices (one per group of patches) are used in the regularizer instead of the $\ell_1$ norm of the optimally rearranged and 1D Haar wavelet transformed patches.

A different partially adaptive reconstruction method was proposed in [81], wherein for each patch, a small group of patches most similar to it was pre-estimated (called block matching), and the regularizer during reconstruction penalized the sparsity of the groups of (2D) patches in a known (3D) transform domain. All these aforementioned methods are related to the recent transform learning-based methods described in Section V-C, where the sparsifying operators are fully adapted in an optimization framework.

V-B Synthesis Dictionary Learning-Based Approaches for Reconstruction

Among the learning-based approaches that have shown promise for medical image reconstruction, one popular class of methods exploits synthesis dictionary learning.

V-B1 Synthesis Dictionary Model

As briefly discussed in Section III, the synthesis model suggests that a signal can be approximated by a sparse linear combination of atoms or columns of a dictionary, i.e., the signal lives approximately in a subspace spanned by a few dictionary atoms. Because different signals may be approximated with different subsets of dictionary columns, the model is viewed as a union of subspaces model [82].

In imaging, the synthesis model is often applied to image patches (see Fig. 1) or image blocks as $P_j x \approx D z_j$, with $P_j$ denoting the operator that extracts a vectorized patch (with $n$ pixels) of $x$, $D \in \mathbb{C}^{n \times K}$ denoting a synthesis dictionary (in general complex-valued), and $z_j \in \mathbb{C}^{K}$ being the sparse representation or code for the patch with many zeros. While dictionaries based on the discrete cosine transform (DCT), etc., can be used to model image patches, much better representations can be obtained by adapting the dictionaries to data. The learning of synthesis dictionaries has been explored in many works [83, 84, 85] and shown to be promising in inverse problem settings [86, 87, 30].
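The patch extraction operators used in patch-based models can be sketched as follows. This example assumes square patches, a stride of one pixel, and periodic (wrap-around) boundaries, which are choices made for illustration; the function names are hypothetical. Under those assumptions every pixel belongs to the same number of patches, so putting all patches back and summing simply scales the image.

```python
import numpy as np

def extract_patches(x, p):
    """Return all overlapping p-by-p patches of a 2D image as columns.
    Assumes stride 1 and periodic wrap-around, with p no larger than either image side."""
    H, W = x.shape
    cols = []
    for i in range(H):
        for j in range(W):
            r_idx = (i + np.arange(p)) % H
            c_idx = (j + np.arange(p)) % W
            cols.append(x[np.ix_(r_idx, c_idx)].ravel())
    return np.stack(cols, axis=1)            # shape (p*p, H*W)

def aggregate_patches(P, shape, p):
    """Adjoint of extraction: add every patch back at the location it came from."""
    H, W = shape
    out = np.zeros(shape, dtype=P.dtype)
    col = 0
    for i in range(H):
        for j in range(W):
            r_idx = (i + np.arange(p)) % H
            c_idx = (j + np.arange(p)) % W
            out[np.ix_(r_idx, c_idx)] += P[:, col].reshape(p, p)
            col += 1
    return out

# Hypothetical 6 x 5 image with 3 x 3 patches.
rng = np.random.default_rng(0)
x = rng.standard_normal((6, 5))
P = extract_patches(x, 3)
print("patch matrix shape:", P.shape)

# With wrap-around and stride 1, each pixel appears in exactly p*p patches.
back = aggregate_patches(P, x.shape, 3)
print("aggregate equals 9 * x:", np.allclose(back, 9 * x))
```

This scaling property is convenient in reconstruction algorithms because the sum of the patch operators' Gram matrices then reduces to a multiple of the identity.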

V-B2 Dictionary Learning for MRI

Ravishankar and Bresler [30] proposed a dictionary learning-based method for MRI (DL-MRI), where the image and the dictionary for its patches are simultaneously estimated from limited measurements. The approach, also known as blind compressed sensing (BCS) [33], does not require training data and learns a dictionary that is highly adaptive to the underlying image content. However, the optimization problem is highly nonconvex, and is formulated as follows:

$$\min_{x, D, Z} \, \tfrac{1}{2}\|Ax - y\|_2^2 + \beta \sum_{j=1}^{N} \|P_j x - D z_j\|_2^2 \quad \text{s.t.} \;\; \|z_j\|_0 \le s, \;\; \|d_i\|_2 = 1 \;\; \forall \, i, j. \tag{13}$$

This corresponds to using a dictionary learning regularizer (weighted by $\beta$) of the following form:

$$R(x) = \min_{D, Z} \, \sum_{j=1}^{N} \|P_j x - D z_j\|_2^2 \quad \text{s.t.} \;\; \|z_j\|_0 \le s, \;\; \|d_i\|_2 = 1 \;\; \forall \, i, j, \tag{14}$$

where $N$ is the number of patches, $Z$ is a matrix whose columns are the sparse codes $z_j$ that each have at most $s$ non-zeros, and the $\ell_0$ “norm” counts the total number of nonzeros in a vector or matrix. The columns $d_i$ of $D$ are constrained to have unit norm as otherwise $d_i$ can be scaled arbitrarily along with corresponding inverse scaling of the $i$th row of $Z$, and the objective is invariant to this scaling ambiguity.

Fig. 1: The synthesis dictionary model for image patches: overlapping patches $P_j x$ of the image $x$ are assumed approximated by sparse linear combinations of the columns of the dictionary $D$, i.e., $P_j x \approx D z_j$, where $z_j$ has several zeros (denoted with white blocks above).

Problem (13) was optimized in [30] by alternating between solving for the image $x$ (image update step) and optimizing the dictionary and sparse coefficients (dictionary learning step). The image update step can be solved by standard least squares optimization techniques (e.g., conjugate gradients (CG)), or in specific cases such as in single coil Cartesian MRI (where $A = F_u$, the undersampled DFT), in closed-form using FFTs. However, the dictionary learning step (solving (13) with respect to $(D, Z)$) involves a nonconvex and NP-hard dictionary learning optimization problem [88]. Various dictionary learning algorithms exist for this problem and its variants [84, 89, 85] that typically alternate many times between updating the sparse coefficients (synthesis sparse coding step that involves an NP-hard problem) and the dictionary (dictionary update step). The well-known K-SVD dictionary learning algorithm [84] updates the sparse codes (with fixed $D$) for each patch in a greedy manner using the orthogonal matching pursuit (OMP) method [90], and then updates the dictionary atoms together with the nonzero coefficients (with their locations or support fixed) in $Z$ in a sequential (atom by atom) manner in the dictionary update step.

The DL-MRI method for (13) used K-SVD in the dictionary learning step and showed significant image quality improvements over previous CS MRI methods that used nonadaptive wavelets and total variation [17]. However, it was slow due to expensive and repeated sparse coding steps, and lacked convergence guarantees. In practice, variable rather than common sparsity levels across patches can be allowed in DL-MRI by using an error threshold based stopping criterion when sparse coding with OMP.
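The greedy sparse coding step can be sketched with a basic orthogonal matching pursuit routine. This is a plain textbook variant written for a single real-valued patch and a dictionary with unit-norm columns; it is not the implementation used in DL-MRI or K-SVD, and the function name, arguments, and toy dictionary are assumptions of this example. It supports both stopping rules mentioned above: a fixed sparsity level and an error threshold.

```python
import numpy as np

def omp(D, y, s=None, tol=0.0):
    """Greedy sparse coding of y in dictionary D (unit-norm columns assumed).
    Stops after s atoms (if s is given) or once the residual norm is at most tol."""
    K = D.shape[1]
    max_atoms = K if s is None else s
    z = np.zeros(K)
    support = []
    coeffs = np.zeros(0)
    residual = y.astype(float)
    while len(support) < max_atoms and np.linalg.norm(residual) > tol:
        k = int(np.argmax(np.abs(D.T @ residual)))   # atom most correlated with residual
        if k in support:                             # no new atom helps; stop early
            break
        support.append(k)
        # Least-squares fit of y on the selected atoms (the "orthogonal" step).
        coeffs, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coeffs
    if support:
        z[support] = coeffs
    return z

# Hypothetical overcomplete dictionary with 64 atoms for 16-pixel (4 x 4) patches.
rng = np.random.default_rng(0)
D = rng.standard_normal((16, 64))
D /= np.linalg.norm(D, axis=0)
patch = 2.0 * D[:, 5] - 1.0 * D[:, 40] + 0.01 * rng.standard_normal(16)

z_fixed = omp(D, patch, s=2)          # common sparsity level
z_error = omp(D, patch, tol=0.1)      # error-threshold stopping, variable sparsity
print("nonzeros (fixed s):", np.count_nonzero(z_fixed))
print("nonzeros (error threshold):", np.count_nonzero(z_error))
print("residual norms:", np.linalg.norm(patch - D @ z_fixed), np.linalg.norm(patch - D @ z_error))
```

Because a least-squares problem is solved after every atom selection, the cost grows quickly with the sparsity level and the number of patches, which is consistent with the slowness of repeated sparse coding noted above.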

V-B3 Other Applications and Variations

Later works applied dictionary learning to dynamic MRI [31, 91, 92], parallel MRI [93], and PET reconstruction [94]. An alternative Bayesian nonparametric dictionary learning approach was used for MRI reconstruction in [95]. Xu et al. [28] applied dictionary learning to CT image reconstruction and compared the BCS approach to pre-learning the dictionary from a dataset and fixing it during reconstruction. The former was found to be more promising when sufficient views (in sparse-view CT) were measured, whereas with very few views (or with very little measured information), pre-learning performed better. Tensor-structured (patch-based) dictionary learning has also been exploited recently for dynamic CT [96] and spectral CT [97] reconstructions.

V-B4 Recent Efficient Dictionary Learning-Based Methods

Recent work proposed efficient dictionary learning-based reconstruction algorithms, dubbed SOUP-DIL MRI [98], that used the following regularizer:

$$R(x) = \min_{D, Z} \, \sum_{j=1}^{N} \|P_j x - D z_j\|_2^2 + \lambda^2 \|Z\|_0 \quad \text{s.t.} \;\; \|d_i\|_2 = 1 \;\; \forall \, i. \tag{15}$$

Here, the aggregate sparsity penalty $\|Z\|_0$ with weight $\lambda^2$ automatically enables variable sparsity levels across patches. The dictionary learning step of SOUP-DIL MRI optimized (15) using an exact block coordinate descent scheme by decomposing $DZ$ as a sum of outer products (SOUP) of dictionary columns and rows of $Z$, and solving for $d_i$ using sparse matrix-vector multiplications and normalization, then solving for the $i$th row of $Z$ by hard-thresholding, and cycling over all such pairs (i.e., $1 \le i \le K$). The hard-thresholding update becomes soft-thresholding [99] when using an $\ell_1$ norm penalty. The $\ell_0$ and $\ell_1$ methods were dubbed SOUP-DILLO MRI and SOUP-DILLI MRI, respectively, with the former showing more effectiveness in practice [98].
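A sum-of-outer-products block coordinate descent sweep can be sketched as below for real-valued patches. This is a simplified illustration under stated assumptions, not the published SOUP-DIL implementation: it minimizes ||Y - D Z||_F^2 + lam^2 * ||Z||_0 with unit-norm atoms, where Y holds the patches as columns, it omits any additional constraints or the image update, and the function names, the order of the two updates within a pair, and the fallback atom used when a row of Z is all zeros are choices made for this example.

```python
import numpy as np

def hard_threshold(v, lam):
    """Zero out entries with magnitude below lam."""
    out = v.copy()
    out[np.abs(out) < lam] = 0.0
    return out

def soup_objective(Y, D, Z, lam):
    """||Y - D Z||_F^2 + lam^2 * (number of nonzeros in Z)."""
    return np.sum((Y - D @ Z) ** 2) + lam**2 * np.count_nonzero(Z)

def soup_sweep(Y, D, Z, lam):
    """One pass over all (atom d_i, i-th row of Z) pairs, each updated exactly."""
    D, Z = D.copy(), Z.copy()
    R = Y - D @ Z                                # full residual
    for i in range(D.shape[1]):
        E = R + np.outer(D[:, i], Z[i])          # residual with pair i removed
        h = E @ Z[i]                             # atom update: normalize E z_i^T
        norm_h = np.linalg.norm(h)
        if norm_h > 0:
            D[:, i] = h / norm_h
        else:                                    # unused atom: any unit vector is optimal
            D[:, i] = 0.0
            D[0, i] = 1.0
        Z[i] = hard_threshold(D[:, i] @ E, lam)  # row update by hard thresholding
        R = E - np.outer(D[:, i], Z[i])
    return D, Z

# Hypothetical data: 200 patches of 16 pixels, dictionary with 12 atoms.
rng = np.random.default_rng(0)
Y = rng.standard_normal((16, 200))
D = rng.standard_normal((16, 12))
D /= np.linalg.norm(D, axis=0)
Z = hard_threshold(D.T @ Y, 1.0)

lam = 1.0
for sweep in range(5):
    print(sweep, soup_objective(Y, D, Z, lam))
    D, Z = soup_sweep(Y, D, Z, lam)
print("final:", soup_objective(Y, D, Z, lam))
```

Each atom update and each row update solves its subproblem exactly, so the printed objective cannot increase from one sweep to the next. Replacing `hard_threshold` with soft thresholding would give the corresponding update for an l1 penalty.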

Fig. 2: Dictionary Learning for MRI (images from [98]): (a) SOUP-DILLO MRI [98] reconstruction of the water phantom [79]; (b) sampling mask in k-space with 2.5x undersampling; and (c) real and (d) imaginary parts of the dictionary learned during reconstruction, with atoms shown as patches.

While the earlier DL-MRI used inexact (greedy) and expensive sparse code updates and lacked convergence analysis, SOUP-DIL MRI used efficient, exact updates and was proved to converge to the critical points (generalized stationary points) of the underlying problems and improved image quality over several schemes [98]. Fig. 2 shows an example reconstruction with this BCS method along with the learned dictionaries. Another recent work [100] extended the L+S model for dynamic MRI reconstruction in (6) to a low-rank and adaptive sparse signal (LASSI) model that incorporated a dictionary learning regularizer similar to (15) for the component.

V-B5 Alternative Convolutional Dictionary Model

One can replace the patch-based dictionary model with a convolutional model that directly represents the image as a sum of (possibly circular) convolutions of dictionary filters and sparse coefficient maps [101, 102]. The convolutional synthesis dictionary model is distinct from the patch-based model. However, its main drawback is the inability to represent low-frequency content in images, necessitating pre-processing of images to remove low-frequency content prior to convolutional dictionary learning. The utility of convolutional synthesis dictionary learning for biomedical image reconstruction is an open and interesting area for future research; see [103] for a denoising formulation that could be extended to inverse problems.

V-C Sparsifying Transform Learning-Based Methods

Several recent works have studied the learning of the efficient sparsifying transform model for biomedical image reconstruction [104, 32, 29]. This subsection reviews these advances (see [105] for an MRI focused review).

V-C1 Transform Model

The sparsifying transform model is a generalization [27] of the analysis dictionary model. The latter assumes that applying an operator to a signal produces several zeros in the output, i.e., the signal lies in the null space of a subset of rows of the operator. The sparsifying transform model allows for a sparse approximation as , where has several zeros and is a small transform domain modeling error. Natural images are well-known to be approximately sparse in transform domains such as the DCT and wavelets, a property that has been exploited for image compression [106], denoising, and inverse problems. A key advantage of the sparsifying transform model compared to the synthesis dictionary model is that the transform domain sparse approximation can be computed exactly and cheaply by thresholding  [27].
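To make the last point concrete, the sketch below computes the transform-domain sparse approximation by keeping the largest-magnitude transform coefficients. It is a minimal illustration; the orthonormal DCT used as the transform, the test signal, and the names `dct_matrix` and `transform_sparse_code` are assumptions for this example.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix, used here as an example analytical transform."""
    k = np.arange(n)[:, None]
    m = np.arange(n)[None, :]
    W = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * m + 1) * k / (2 * n))
    W[0, :] = np.sqrt(1.0 / n)
    return W

def transform_sparse_code(W, x, s):
    """Exact minimizer of ||W x - z||_2^2 subject to z having at most s nonzeros."""
    a = W @ x
    z = np.zeros_like(a)
    keep = np.argsort(np.abs(a))[a.size - s:]  # indices of the s largest magnitudes
    z[keep] = a[keep]
    return z

W = dct_matrix(8)
x = np.linspace(0.0, 1.0, 8) ** 2              # a smooth example signal
z = transform_sparse_code(W, x, 3)
sparsification_error = np.sum((W @ x - z) ** 2)
```

No iterative or greedy search is needed: one matrix-vector product and a partial sort give the exact solution, in contrast to synthesis sparse coding.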

V-C2 Early Efficient Transform Learning-Based Methods

Recent works [104, 32] proposed transform learning (TL) based BCS methods that involved computationally cheap, closed-form updates in the iterative algorithms. In the rest of this section, we use TL-MRI as an umbrella term encompassing TL based reconstruction methods applied to MRI, some of which are discussed next. The following square transform learning (STL) [27] regularizer was used for reconstruction in [104]:

(16)

where is a square matrix and the transform learning regularizer with weight prevents trivial solutions in learning such as the zero matrix or matrices with repeated rows. Moreover, it also helps control the condition number of the transform [27], a property useful for reconstruction. The term denotes the transform domain modeling error or sparsification error, which is minimized to learn a good sparsifying transform. The constraint in (16) on the “norm” of the matrix controls the net or aggregate sparsity of all patches’ sparse coefficients.

The image reconstruction problem with regularizer (16) was solved in [104] using a highly efficient block coordinate descent (BCD) approach that alternates between updating (transform sparse coding step), (transform update step), and (image update step). Importantly, the transform sparse coding step has a closed-form solution, where the matrix , whose columns are , is thresholded to its largest magnitude elements, with other entries set to zero. Similar thresholding-based solutions hold when the sparsity constraint is replaced with alternative sparsity promoting functions such as an or sparsity penalty. For example, when employing (with ) as a penalty in (16), the sparse coding solution is obtained by hard-thresholding with threshold . The minimization with respect to has a simple, exact solution involving the singular value decomposition (SVD) of a small matrix [104]. Finally, the image update step involves a simple least squares problem, which in the case of single coil Cartesian MRI is solved in closed-form using FFTs. This efficient BCD scheme was proven to converge in general to the critical points of the nonconvex reconstruction problem [104].
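The SVD-based transform update can be sketched as follows. The exact weighting of the regularizer in (16) is not reproduced here; the sketch assumes the cost `||W X - Z||_F^2 + lam * (0.5 * ||W||_F^2 - log|det W|)`, with real-valued data, where the columns of `X` are patches and the columns of `Z` are their sparse codes. The function name is hypothetical.

```python
import numpy as np

def stl_transform_update(X, Z, lam):
    """Closed-form square transform update under the assumed cost above."""
    n = X.shape[0]
    # Cholesky factor: L @ L.T = X X^T + 0.5 * lam * I.
    L = np.linalg.cholesky(X @ X.T + 0.5 * lam * np.eye(n))
    L_inv = np.linalg.inv(L)
    # SVD of a small n x n matrix: Q @ diag(sig) @ Rt.
    Q, sig, Rt = np.linalg.svd(L_inv @ X @ Z.T)
    gamma = 0.5 * (sig + np.sqrt(sig ** 2 + 2.0 * lam))
    return Rt.T @ np.diag(gamma) @ Q.T @ L_inv

def stl_cost(W, X, Z, lam):
    fit = np.linalg.norm(W @ X - Z) ** 2
    reg = 0.5 * np.linalg.norm(W) ** 2 - np.linalg.slogdet(W)[1]
    return fit + lam * reg

rng = np.random.default_rng(0)
X = rng.standard_normal((8, 200))
W0 = np.linalg.qr(rng.standard_normal((8, 8)))[0]   # initial transform
A0 = W0 @ X
Z = np.where(np.abs(A0) >= 1.0, A0, 0.0)            # sparse codes by thresholding
lam = 5.0
W = stl_transform_update(X, Z, lam)
# Gradient of the assumed cost; it should vanish at the closed-form solution.
grad = 2 * (W @ X - Z) @ X.T + lam * W - lam * np.linalg.inv(W).T
```

The only decompositions involved act on matrices whose size is the patch dimension, which is why this step is cheap compared with the sparse coding over all patches.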

In practice, the sparsity controlling parameter can be varied over algorithm iterations (a continuation strategy), allowing for faster artifact removal initially and then reduced bias over the iterations [32]. The STL-MRI scheme was shown [104] to be much faster than the previous DL-MRI scheme. Later, Tanc and Eksioglu [107] further combined STL with global sparsity regularization in known transform domains for CS MRI.

Pfister and Bresler applied STL to CT reconstruction [108]. Another recent work used STL for low-dose CT image reconstruction [109] with a shifted-Poisson likelihood penalty for the data-fidelity term in the cost (instead of the conventional weighted least squares penalty), but pre-learned the transform from a dataset and fixed it during reconstruction to save computation.

Other works have explored alternative formulations for transform learning that could be potentially used for image reconstruction. For example, a recent work [110] learned efficient double sparse transforms for image denoising, wherein , with being a sparse matrix and being an analytical transform (e.g., DCT) with a fast implementation. Another work [111] learned an overcomplete or tall operator by controlling the conditioning of along with penalizing the coherence between the rows of .

V-C3 Learning Rich Unions of Transforms for Reconstruction

Since images typically contain a diversity of textures, features, and edge information, recent works [112, 32, 29] learned a union of transforms (a rich model) for image reconstruction. In this setting, a collection of transforms are learned and the image patches are grouped or clustered into classes, with each class of (similar) patches best matched to and using a particular transform. The UNITE-MRI formulation in [32] uses the following regularizer for reconstruction:

(17)

Here, is a set containing the indices of all patches matched to the transform , and denotes the set of all partitions of into disjoint subsets, where is the total number of overlapping patches. Note that when indexed variables are enclosed in braces (in (17) and later equations), we mean the set of all variables over the range of the indices.

The UNITE-MRI reconstruction formulation jointly learns a collection of transforms, clusters and sparse codes patches, and reconstructs the image from measurements. An efficient BCD algorithm with convergence guarantees was proposed for optimizing the problem in [32] that alternates between solving for (transform update), (sparse coding and clustering), and (image update), with efficient solutions in each of the steps. In particular, the transform update step computes the optimal transform in each cluster via an SVD, and the optimal clusters and sparse coefficients are solved jointly in the sparse coding and clustering step, wherein each patch is grouped to the transform that gives the smallest cost in (17) with , where performs hard-thresholding at threshold . The transforms in (17) are constrained to be unitary, simplifying the solution of the image update step [32]. UNITE-MRI achieved improved image quality over STL-MRI when reconstructing from undersampled k-space measurements [32].
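The sparse coding and clustering step and the SVD-based transform update can be sketched as below. This is an illustrative real-valued version, assuming a per-patch cost of squared sparsification error plus `eta**2` times the number of nonzeros; the function names and the handling of empty clusters are choices made for this example rather than details taken from [32].

```python
import numpy as np

def cluster_and_code(X, transforms, eta):
    """Assign each patch (column of X) to the transform with the smallest cost."""
    costs, codes = [], []
    for W in transforms:
        A = W @ X
        Zk = np.where(np.abs(A) >= eta, A, 0.0)      # hard-thresholding at eta
        costs.append(np.sum((A - Zk) ** 2, axis=0)
                     + eta ** 2 * np.count_nonzero(Zk, axis=0))
        codes.append(Zk)
    labels = np.argmin(np.stack(costs), axis=0)
    Z = np.take_along_axis(np.stack(codes), labels[None, None, :], axis=0)[0]
    return labels, Z

def update_transforms(X, Z, labels, transforms):
    """Unitary update per cluster: minimize ||W X_k - Z_k||_F over unitary W."""
    new = []
    for k, W in enumerate(transforms):
        idx = labels == k
        if not np.any(idx):
            new.append(W)                             # keep transform of an empty cluster
            continue
        U, _, Vt = np.linalg.svd(X[:, idx] @ Z[:, idx].T)
        new.append(Vt.T @ U.T)
    return new

def total_cost(X, transforms, labels, Z, eta):
    cost = eta ** 2 * np.count_nonzero(Z)
    for k, W in enumerate(transforms):
        idx = labels == k
        cost += np.sum((W @ X[:, idx] - Z[:, idx]) ** 2)
    return cost

rng = np.random.default_rng(1)
X = rng.standard_normal((9, 300))
transforms = [np.linalg.qr(rng.standard_normal((9, 9)))[0] for _ in range(3)]
eta = 1.0
labels, Z = cluster_and_code(X, transforms, eta)
cost_before = total_cost(X, transforms, labels, Z, eta)
transforms = update_transforms(X, Z, labels, transforms)
cost_mid = total_cost(X, transforms, labels, Z, eta)
labels, Z = cluster_and_code(X, transforms, eta)
cost_after = total_cost(X, transforms, labels, Z, eta)
```

Each step is an exact minimization over its block of variables, so the assumed cost does not increase across the alternation.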

Recent works applied learned unions of transforms to other applications. For example, the union of transforms model was pre-learned (from a dataset) and used in a clustering-based low-dose 3D CT reconstruction scheme [29]. Fig. 3 shows an example of high quality reconstructions obtained with this PWLS-ULTRA scheme. While the work used a PWLS-type reconstruction cost, a more recent method called SPULTRA [109] replaced the weighted least squares data-fidelity term with a shifted-Poisson likelihood penalty, which further improved image quality and reduced bias in the reconstruction in ultra low-dose settings. The image update step in these CT reconstruction algorithms is much more intensive than for MRI, for example exploiting the relaxed linearized augmented Lagrangian method with ordered-subsets (relaxed OS-LALM) [24] and quadratic surrogates (in [109]). Another recent work combined a learned union of transforms model with a mixed material model and applied it to image-domain material decomposition in dual-energy CT with high quality results [113].

V-C4 Learning Structured Transform Models

It is often useful to incorporate various structures and invariances in learning to better model natural data, and to prevent learning spurious features in the presence of noise and corruptions. Flipping and rotation invariant sparsifying transform (FRIST) learning was recently proposed and applied to image reconstruction in [114]. The regularization is similar to (17), but using with a common parent transform and denoting a set of known flipping and rotation (FR) operators that apply to each (row) atom of and approximate FR by permutations (similar to [78], but which used fixed 1D Haar wavelets as the parent). This enables learning a much more structured but flexible (depending on the number of FR operators) model than in (17), with clustering done more based on similar directional properties. Images with more directional features are better modeled by learned FRISTs [114].

V-C5 Learning Complementary Models – Low-rank and Transform Sparsity

Wen et al. [115] proposed STROLLR-MRI that combines two complementary regularizers: one exploits (non-local) self-similarity between regions, and another exploits transform learning via STL that is based on local patch sparsity. Non-local similarity and block matching models are well-known to have excellent performance in image processing tasks such as image denoising (with BM3D [116]). STROLLR-MRI was shown to achieve better CS MRI image quality over several methods including the supervised (deep) learning based ADMM-Net [117]. Its regularizer has the form , where the low-rank regularizer is as follows:

(18)

and the transform learning regularizer is

(19)

Here, the operator is a block matching operator that extracts the th patch and the patches most similar to it and forms a matrix, whose columns are the th patch and its matched siblings, ordered by degree of match. This matrix is approximated by a low-rank matrix in (18), with . The vector is a vectorization of the submatrix that is the first columns of . Thus the regularizer in (19) learns a higher-dimensional (unitary) transform (e.g., 3D transform for 2D patches), and jointly sparsifies non-local but similar patches. Fig. 4 shows example MRI reconstructions and comparisons.

Similar to UNITE-MRI or FRIST-MRI, there is an underlying grouping of patches, but STROLLR-MRI exploits block matching and sparsity to implicitly perform grouping. The STROLLR-MRI reconstruction algorithm [115] is also similar to the previous schemes, except that the sparse coding and clustering step such as in UNITE-MRI is replaced with a block matching, low-rank approximation, and sparse coding step.


Fig. 3: Cone-beam CT reconstructions (images from [29]) of the XCAT phantom [118] using the FDK, PWLS-EP [119] (with edge-preserving regularizer), and PWLS-ULTRA [29] () methods at dose incident photons per ray, shown along with the ground truth (top left). The central axial, sagittal, and coronal planes of the 3D reconstruction are shown. The learning-based PWLS-ULTRA removes noise and preserves edges much better than the other schemes.

V-D Online Learning for Reconstruction

Recent works have proposed online learning of sophisticated models for reconstruction particularly of dynamic data from time-series measurements [120, 121, 35, 122]. In this setting, the reconstructions are produced in a time-sequential manner from the incoming measurement sequence, with the models also adapted simultaneously and sequentially over time to track the underlying object’s dynamics and aid reconstruction. Such methods allow greater adaptivity to temporal dynamics and can enable dynamic reconstruction with less latency, memory use, and computation than conventional methods. Potential applications include real-time medical imaging, interventional imaging, etc., or they could be used even for more efficient and (temporally) adaptive offline reconstruction of large-scale (big) data. Recently, Wen et al. [120] proposed online sparsifying transform learning-based denoising of videos that also incorporated block matching techniques. The method, named VIDOSAT, achieved state-of-the-art video denoising performance.

Fig. 4: MRI reconstructions (images from [105]) with pseudo-radial sampling and 5x undersampling using Sparse MRI [17] (PSNR = dB), ADMM-Net [117] (PSNR = dB), and STROLLR-MRI [115] (PSNR = dB), along with the original image from [117]. Panels: Ground Truth, Sparse MRI, ADMM-Net, STROLLR-MRI. STROLLR-MRI clearly outperforms the nonadaptive Sparse MRI, while ADMM-Net also produces undesirable artifacts.

A recent work also efficiently adapted low-rank tensor models in an online manner for dynamic MRI [122]. Online learning for dynamic MRI image reconstruction was shown to be promising in [121, 35], which adapted synthesis dictionaries to spatio-temporal patches. In this setup [35] (dubbed OnAIR), measurements corresponding to a group (called mini-batch) of frames are processed at a time using a sliding window strategy. The objective function for reconstruction is a weighted time average of instantaneous cost functions, each corresponding to a group of processed frames. An exponential weighting (forgetting) factor for the instantaneous cost functions controls the past memory in the objective. The instantaneous cost functions include both a data-fidelity and a regularizer (corresponding to patches in the group of frames) term. The objective function thus changes over time and is optimized at each time point with respect to the most recent mini-batch of frames and corresponding sparse coefficients (with older frames and coefficients fixed), but the dictionary is itself adapted therein to all the data. Each frame can be reconstructed from multiple overlapping temporal windows and a weighted average of those used as the final estimate.

The online learning algorithms in [35] achieved memory and computational efficiency by using warm start initializations (that improve over time) for variables and frames based on estimates in previous windows, and thus running only a few iterations of optimization for each new window. They stored past information in small (cumulatively updated) matrices for the dictionary update. The OnAIR methods were significantly more efficient and more effective than batch learning-based techniques for dynamic MRI that iteratively learn and reconstruct from all k-t space measurements. Given the potential of online learning methods to transform dynamic and large-scale imaging, we expect to see growing interest and research in this domain.

V-E Connections between Transform Learning Approaches and Convolutional Network Models

The sparsifying transform models in Section V-C have close connections with convolutional filterbanks. This subsection and the next review some of these connections and implications for reconstruction.

V-E1 Connections to Filterbanks

Transform learning and its application to regularly spaced image patches [27, 110] can be equivalently performed using convolutional operations. For example, applying an atom of the transform to all the overlapping patches of a (2D) image via inner products is equivalent to convolving the image with a transform filter that is the (2D) flipped version of the atom. Thus, sparse coding in the transform model can be viewed as convolving the image with a set of transform filters (obtained from the transform atoms) and thresholding the resulting filter coefficient maps, and transform learning can be viewed as equivalently learning convolutional sparsifying filters [123, 124]. When using only a regularly spaced subset of patches, the above interpretation of transform sparse coding modifies to convolving the image with the transform filters, downsampling the results, and then thresholding [125]. Transform models based on clustering [32] add non-trivial complexities to this process.
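The equivalence between patch inner products and convolution with the flipped atom can be checked numerically. The sketch below assumes real-valued data, patches that wrap around the image boundary, and circular convolution; the sizes and variable names are arbitrary choices for this example.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

rng = np.random.default_rng(0)
H, Wd, p = 12, 10, 3
image = rng.standard_normal((H, Wd))
atom = rng.standard_normal((p, p))       # one transform row reshaped to a patch

# Patch-based: inner product of the atom with every overlapping (wrap-around) patch.
padded = np.pad(image, ((0, p - 1), (0, p - 1)), mode="wrap")
patches = sliding_window_view(padded, (p, p))          # shape (H, Wd, p, p)
coeff_patch = np.einsum("ijab,ab->ij", patches, atom)

# Convolutional: circular convolution with the flipped atom, h[-a, -b] = atom[a, b].
filt = np.zeros((H, Wd))
for a in range(p):
    for b in range(p):
        filt[-a % H, -b % Wd] = atom[a, b]
coeff_conv = np.real(np.fft.ifft2(np.fft.fft2(image) * np.fft.fft2(filt)))
```

Thresholding `coeff_conv` then gives the same coefficient map as thresholding the patch-wise transform coefficients for this atom.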

Applying the matrix to the sparse codes of all overlapping patches and spatially aggregating the results, an operation used in iterative transform-based reconstruction algorithms [32], is equivalent to filtering the thresholded filter coefficient maps with corresponding matched filters (complex conjugate of transform atoms) and summing the results over the channels. These equivalences between patch-based and convolutional operations for the transform model contrast with the case for the synthesis dictionary model in Section V-B, where the patch-based and convolutional versions of the model are not equivalent in general. When a disparate set of (e.g., randomly chosen) image patches or operations such as block matching [115], etc., are used with the transform model, the underlying operations do not correspond to convolutions (thus, the transform learning frameworks can be viewed as more general). Typically the convolutional implementation of transforms is more computationally efficient than the patch-based version for large filter sizes [125]. For smaller or conventional (e.g., ) filter sizes, the patch-based version can be equally or more efficient.

Recent works have exploited the filterbank interpretation of the transform model [125, 126, 127]. Pfister and Bresler [128] learned filterbanks for MRI reconstruction. In  [125], they studied alternative properties and regularizers for transform learning.

V-E2 Multi-layer Transform Learning

Ravishankar and Wohlberg [127] recently proposed learning multi-layer extensions of the transform model (dubbed deep residual transforms (DeepResT)) that closely mimic convolutional neural networks (CNNs) by incorporating components such as filtering, nonlinearities, pooling, skip connections, and stacking; however, the learning was done using unsupervised model-based transform learning-type cost functions.

Fig. 5: The reconstruction model (see [126]) derived from the image update step of UT-MRI [32]. The model here has layers corresponding to iterations. Each layer first has a decorruption step that computes the second term in (20) using filtering and thresholding operations, assuming a transform model with filters. This is followed by a system model block that adds the fixed bias term to the output of the decorruption step and performs a least-squares type image update (e.g., using CG) to enforce the imaging forward model.

In the conventional transform model, the image is passed through a set of transform filters and thresholded (the non-linearity) to generate the sparse coefficient maps, whereas in the DeepResT model, the residual (difference) between the filter maps and their thresholded versions is computed and the residual maps for different filters are stacked together to form a residual volume that is further jointly sparsified in the next layer. However, to prevent dimensionality explosion, each filtering of the residual volume in the second and subsequent layers produces a 2D output (for a 2D initial image). The multi-layer model consists of successive joint sparsification of residual maps several times, with the filters and sparsity maps in all layers of the (encoder) network jointly learned in [127] to provide the smallest thresholding (or sparsification) residuals in the final (output) layer, a transform learning-type cost. The learned model and multi-layer sparse coefficient maps can then be backpropagated in a linear fashion (decoder) to generate an image approximation. The DeepResT model also downsampled (pooled) the residual maps in each encoder layer before further filtering them, providing robustness to noise and data corruptions. The learned models were shown [127] to provide promising performance for denoising images when learning directly from noisy data, and moreover learning stacked multi-layer encoder-decoder modules was shown to improve performance, especially at high noise levels. Application of such deep transform models to medical image reconstruction is an area of ongoing research.

V-F Physics-Driven Deep Training of Transform-Based Reconstruction Models

There has been growing recent interest in supervised learning approaches for image reconstruction [36]. These methods learn the parameters of reconstruction algorithms from training datasets (typically consisting of pairs of ground truth images and initial reconstructions from measurements) to minimize the error in reconstructing the training images from their typically limited or corrupted measurements. For example, the reconstruction model can be a deep CNN (typically consisting of encoder and decoder parts) that can be trained (as a denoiser) to produce a reconstruction from an initial corrupted version [37]. Section VI discusses such approaches in more detail. These methods can often require large training sets to learn billions of parameters (e.g., filters, etc.). Moreover, learned CNNs (deep learning) may not typically or rigorously incorporate the imaging measurement model or information about the physics of the imaging process, which are a key part of solving inverse problems. Hence, there has been recent interest in learning the parameters of iterative algorithms that solve regularized inverse problems [117, 38] (cf. Section VI for more such methods). These methods can also typically have fewer free parameters to train.

Recent works have interpreted early transform-based BCS algorithms as deep physics-driven convolutional networks learned on-the-fly, i.e., in a blind manner, from measurements [38, 126]. For example, the image update step in the transform BCS algorithm UT-MRI that learns a unitary transform [32] involves a least squares-type optimization with the following normal equation:

(20)

where (for in Section V-B or V-C) and denotes the iteration number in the block coordinate descent UT-MRI scheme. Matrix is a (matched) synthesis operator, and is a fixed matrix. The hard-thresholding in (20) corresponds to the solution of the sparse coding step in UT-MRI. The solution in (20) may be computed cheaply using FFTs [32] in some cases, or alternatively using conjugate gradients (CG).

Fig. 5 shows an unrolling of iterations (layers) of (20), with fresh filters in each iteration. Each layer has a system model block that solves (20) (e.g., with FFTs or CG), whose inputs are the two terms on the right hand side of (20): the first term is a fixed bias term; and the second term (a decorruption step) is computed via convolutions by first applying the transform filters (denoted by , in Fig. 5) followed by thresholding (the non-linearity) and then matched synthesis filters (denoted by , ), and summing the outputs over the filters. This is clear from writing the second term in (20) as , with and denoting the th columns of and , respectively. Each of the terms forming the outer summation here corresponds to the output of an arm (of transform filtering, thresholding, and synthesis filtering) in the decorruption module of Fig. 5. Since UT-MRI does not use training data, it can be interpreted as learning the model in Fig. 5 on-the-fly from measurements.
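One layer of such an unrolled model can be sketched as follows for single-coil Cartesian MRI. This is a simplified illustration, not the formulation in (20): it assumes circular convolutions, a binary k-space sampling `mask`, an orthonormal FFT as the forward model, and a system model block that solves `min_x ||mask * F x - y||^2 + nu * ||x - z||^2`, where `z` is the decorruption output. The weight `nu` and all function names are hypothetical.

```python
import numpy as np

def circ_conv2(x, h):
    """Circular 2D convolution via FFTs (h is zero-padded to the image size)."""
    return np.fft.ifft2(np.fft.fft2(x) * np.fft.fft2(h, s=x.shape))

def hard_threshold(v, t):
    return np.where(np.abs(v) >= t, v, 0.0)

def decorrupt(x, analysis, synthesis, thresholds):
    """Sum over arms of: transform filtering -> thresholding -> synthesis filtering."""
    out = np.zeros(x.shape, dtype=complex)
    for c, d, t in zip(analysis, synthesis, thresholds):
        out += circ_conv2(hard_threshold(circ_conv2(x, c), t), d)
    return out

def system_model(z, y, mask, nu):
    """Closed-form least-squares update in k-space (assumed single-coil Cartesian)."""
    Fz = np.fft.fft2(z, norm="ortho")
    k = (mask * y + nu * Fz) / (mask + nu)
    return np.fft.ifft2(k, norm="ortho")

def layer(x, y, mask, analysis, synthesis, thresholds, nu):
    return system_model(decorrupt(x, analysis, synthesis, thresholds), y, mask, nu)

rng = np.random.default_rng(0)
x_true = rng.standard_normal((16, 16))
mask = (rng.random((16, 16)) < 0.4).astype(float)
y = mask * np.fft.fft2(x_true, norm="ortho")
delta = [np.array([[1.0]])]              # trivial one-filter model for the demo
x_in = rng.standard_normal((16, 16))
x_out = layer(x_in, y, mask, delta, delta, [0.0], nu=0.5)
x_fixed = layer(x_true, y, mask, delta, delta, [0.0], nu=0.5)
res_in = np.linalg.norm(mask * np.fft.fft2(x_in, norm="ortho") - y)
res_out = np.linalg.norm(mask * np.fft.fft2(x_out, norm="ortho") - y)
```

In the demo the decorruption step is the identity (a single delta filter with zero threshold), so the layer only pulls the input toward the measured k-space samples; with learned filters and nonzero thresholds the same structure gives one layer of the unrolled model.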

Recent works [38, 126] learned the filters in this multi-layer model (a block coordinate descent or BCD Net [129]) with soft-thresholding ($\ell_1$ norm-based) nonlinearities and trainable thresholds using a greedy scheme to minimize the error in reconstructing a training set from limited measurements. These and similar approaches (including the transform-based ADMM-Net [117]) involving unrolling of typical MRI inversion algorithms are physics-driven deep training methods due to the systematic inclusion of the imaging forward model in the convolutional network. Once learned, the reconstruction model can be efficiently applied to test data using convolutions, thresholding, and least squares-type updates. While [38, 126] did not enforce the corresponding synthesis and transform filters (in each arm of the decorruption module) to be matched in each layer, recent work [129] learned matched filters, improving image quality. The learning of such physics-driven transform-based networks is an active area of research, with interesting possibilities for innovation in the convolutional models in the architecture motivated by more recent transform and dictionary learning based reconstruction methods. In all such methods, the thresholding operation is the key to exploiting sparsity.

VI Deep Learning Methods

One of the most important recent developments in the field of image reconstruction is the introduction of deep learning approaches [130]. Motivated by the tremendous success of deep learning for image classification [131, 132] and for low-level computer vision problems such as segmentation [133], denoising [134], and super-resolution [135], many groups have recently successfully applied deep learning approaches to various image reconstruction problems such as X-ray CT [136, 137, 138, 139, 140, 141, 142], MRI [143, 144, 145, 146, 141, 36, 147], PET [148, 149], ultrasound [150, 151, 152], and optics [153, 154, 155].

The sharp increase in deep learning approaches to image reconstruction problems, which is part of the fourth wave in image reconstruction, may be due to a “perfect storm” of several factors arriving at the same time: availability of large public data, well-established GPU infrastructure in the image reconstruction community, easy-to-access deep learning toolboxes, industrial push, and open publications using arXiv, etc.

For example, one important public data set that has significantly contributed to this wave is the 2016 American Association of Physicists in Medicine (AAPM) Low-Dose X-ray CT Grand Challenge data set [156]. The training data sets consist of normal-dose and quarter-dose abdominal CT data from ten patients. Another emerging important data set is the fastMRI data set by NYU Langone Health and Facebook [157]. This data set comprises raw k-space data from more than 1,500 fully sampled knee MRIs and DICOM images from 10,000 clinical knee MRIs obtained at 3 Tesla or 1.5 Tesla.

In addition, GPU methods have been extensively implemented to accelerate iterative methods in the field of image reconstruction. As a result, open deep learning toolboxes such as TensorFlow, PyTorch, MatConvNet, etc., based on GPU programming, are easily accessible to researchers in the field of image reconstruction. Moreover, in contrast to the previous waves of image reconstruction, the industry has been engaging in a big push in this development from the early phase, since deep learning based image reconstruction methods are well-suited to their business models. This is because the training can be done by the vendors with large databases and the users could enjoy high quality reconstruction results at near real-time reconstruction speed.

Given the relatively long publication cycle for regular journals in the field of image reconstruction, most new developments are found in arXiv preprints, well before they are formally accepted by the journals. This new trend of open publication facilitates the significant progress in this area within a short time period. The following subsection reviews the recent developments based on the peer-reviewed publications and arXiv preprints.

VI-A Interpretation of Deep Models

One of the major hurdles of the deep learning approaches for image reconstruction is the black-box nature of neural networks. This is especially problematic for medical imaging applications, since many doctors are concerned about whether the performance improvement is real or cosmetic. Currently, there are three main approaches to explain the origin of the performance improvement: 1) unrolled sparse recovery, 2) generative models, and 3) representation learning. This section explains them in more detail.

VI-A1 Unrolled sparse recovery

Similar to (5), an $\ell_1$ sparse recovery problem can be formulated as

(21)

The iterative soft-thresholding algorithm (ISTA) [158] for (21) iterates the following recursion:

(22)

where

(23)
(24)

and refers to element-wise soft-thresholding with parameter (threshold) , and is the Lipschitz constant of the gradient of the quadratic in (21) (i.e., spectral norm of ). The Learned ISTA (LISTA) [159] then uses a time unfolded version of the ISTA block diagram, truncated to a fixed number of iterations (see Fig. 6). Specifically, the matrices and for each block are learned so as to minimize the approximation error to the optimal sparse codes on a given dataset. Furthermore, the sparsifying soft-thresholding operator is interpreted as the nonlinearity in deep neural networks (as also done in later works discussed in Section V-F).

Fig. 6: (a) Block diagram of the ISTA algorithm for the optimization problem in (21). (b) Learned ISTA (LISTA) replaces each block of ISTA with weights learned from training data.
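The relation between ISTA and its unfolded form can be sketched as below. The sketch assumes the real-valued cost `0.5 * ||y - A x||_2^2 + lam * ||x||_1` (the scaling in (21) may differ), and the names `We`, `S`, and `theta` for the per-layer matrices and threshold are hypothetical. No training is performed here; LISTA would learn these quantities, whereas the demo simply sets them to the values that ISTA implies.

```python
import numpy as np

def soft_threshold(v, theta):
    """Element-wise soft-thresholding with threshold theta."""
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)

def ista(A, y, lam, n_iter):
    """ISTA for the assumed cost 0.5 * ||y - A x||_2^2 + lam * ||x||_1."""
    L = np.linalg.norm(A, 2) ** 2          # spectral norm of A^T A
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        x = soft_threshold(x + A.T @ (y - A @ x) / L, lam / L)
    return x

def lista_forward(y, We, S, theta, n_layers):
    """Unfolded network x <- soft(We y + S x) with a fixed number of layers."""
    x = np.zeros(S.shape[0])
    b = We @ y
    for _ in range(n_layers):
        x = soft_threshold(b + S @ x, theta)
    return x

def objective(A, y, x, lam):
    return 0.5 * np.linalg.norm(y - A @ x) ** 2 + lam * np.sum(np.abs(x))

rng = np.random.default_rng(0)
A = rng.standard_normal((30, 60)) / np.sqrt(30)
x_true = np.zeros(60)
x_true[[5, 17, 42]] = [1.5, -2.0, 1.0]
y = A @ x_true
lam = 0.05
x_ista = ista(A, y, lam, 200)
L = np.linalg.norm(A, 2) ** 2
x_unfolded = lista_forward(y, A.T / L, np.eye(60) - A.T @ A / L, lam / L, 200)
```

With these particular weights the unfolded network reproduces ISTA exactly; learning the weights instead aims to reach a comparable approximation with far fewer layers.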

FBPConvNet [141] extended the LISTA interpretation for one specific class of inverse problems: those where the normal operator associated with the forward model () is a convolution. For this class of inverse problems, a CNN then emerges. This class of normal operators includes MRI, parallel-beam X-ray CT, and diffraction tomography (DT).

This interpretation of deep neural networks is widely accepted in the field of image reconstruction, and many extensions have been proposed. For example, in ADMM-Net [117], the unrolled steps of the alternating direction method of multipliers (ADMM) based reconstruction algorithm are mapped to each layer of a deep neural network. Recently, Adler et al. [139] extended this idea to the primal-dual algorithm to obtain a CNN-based learned primal-dual approach.

However, a limitation of the unrolled sparse recovery interpretation in LISTA is the difficulty of explaining the number of filter channels, since the normal operator in (24) only explains a single channel convolution operator.

Fig. 7: An example of encoder-decoder CNN.

In variational networks [144], the authors explained the channels by decomposing the regularization terms. Specifically, the variational network is based on unfolding the following optimization problem:

(25)

where the regularization term is represented as a sum of multichannel operations:

(26)

where denotes the th linear operator represented by the th channel convolution (like the transform filters in Section V), and denotes the associated activation functions. The corresponding Landweber iteration is

In a variational network [144], the convolution-based linear operator , the gradient of the activation function , and the regularization parameter are learned for each unfolded step. However, it is still unclear why the number of channels should vary for each layer.

VI-A2 Generative model

Another interesting interpretation of deep learning for image reconstruction comes from the generative model interpretation. For example, the deep image prior approach [160] formulates the image reconstruction problem as

(27)
subject to (28)

where is a deep neural network parameterized by . In the original deep prior model [160], the input for the neural network was a noise vector, from which the neural network parameter is estimated by minimizing the data fidelity term. Recently, the authors in [149] adapted this model to PET image reconstruction, where the conventional image reconstruction using ordered subset expectation maximization (OSEM) is used as the input to the neural network.

However, the neural network architecture is still a black-box, and the generative model interpretation does not give design insight for the deep models.

VI-A3 Representation learning

The recent theory of deep convolutional framelets claims that a deep neural network can be interpreted as a framelet representation, whose frame basis is learned from the training data [161].

To understand this claim, consider a symmetric encoder-decoder CNN (E-D CNN) in Fig. 7, which is used for image reconstruction problems [141, 142]. Specifically, the encoder network maps a given input signal to a feature space , whereas the decoder takes this feature map as an input, processes it, and produces an output . At the th layer, , , and denote the dimension of the signal, the number of filter channels, and the total feature vector dimension, respectively. We consider a symmetric configuration where both the encoder and decoder have the same number of layers, say ; and the encoder layer and the decoder layer are symmetric.

The th channel output from the th layer encoder can be represented by a multi-channel convolution operation [162]:

(29)

where denotes the th input channel signal, denotes the -tap convolutional kernel that is convolved with the th input channel to contribute to the th channel output, and is the pooling operator. Here, is the flipped version of the vector such that with the periodic boundary condition, and is the circular convolution. (Using periodic boundary conditions simplifies the mathematical treatments.) Similarly, the th channel decoder layer convolution output is given by [162]:

(30)

where denotes the unpooling operator.
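The encoder-side operation in (29) can be sketched for 1-D periodic signals as follows. This is an illustration under stated assumptions rather than the construction in [162]: average pooling over non-overlapping pairs stands in for the pooling operator, the ReLU is applied after pooling, and the helper names (`flip`, `circ_conv`, `encoder_channel`) are hypothetical.

```python
def flip(v, n):
    """Flipped filter on a length-n periodic grid: vbar[i] = v[-i mod n]."""
    out = [0.0] * n
    for j, c in enumerate(v):
        out[(-j) % n] = c
    return out

def circ_conv(x, h):
    """Circular convolution of two length-n sequences."""
    n = len(x)
    return [sum(x[(i - m) % n] * h[m] for m in range(n)) for i in range(n)]

def encoder_channel(inputs, kernels, pool=2):
    """One output channel: sum over input channels of x_k conv flipped(psi_k),
    followed by average pooling (assumed) and an element-wise ReLU."""
    n = len(inputs[0])
    acc = [0.0] * n
    for x, psi in zip(inputs, kernels):
        c = circ_conv(x, flip(psi, n))
        acc = [a + b for a, b in zip(acc, c)]
    pooled = [sum(acc[i:i + pool]) / pool for i in range(0, n, pool)]
    return [max(0.0, t) for t in pooled]

print(encoder_channel([[1.0, 2.0, 3.0, 4.0]], [[0.0, 1.0]]))
```

With the kernel `[0, 1]`, convolving with the flipped filter shifts the signal by one sample, which makes the role of the flip under periodic boundary conditions easy to check by hand.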

By concatenating the multi-channel signal in column direction as

the encoder and decoder convolution in (29) and (30) can be represented using matrix notation:

(31)

where

denotes the element-wise rectified linear unit (ReLU) and

(32)
(33)

and

Then, the output of the E-D CNN can be represented by a nonlinear basis representation [162]:

(34)

where and denote the th columns of the following frame basis and its dual:

(35)
(36)

and and denote diagonal matrices with 0 and 1 values that are determined by the ReLU output in the previous convolution steps. Note that (34), (35), and (36) show explicitly the dependence on the input due to their dependence on the ReLU.
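The input dependence introduced by the ReLU can be made concrete with a small sketch: applying a ReLU after a linear map is the same as multiplying by a diagonal 0/1 matrix whose entries depend on the input. The matrix `W` and the two inputs below are arbitrary illustrative values, not taken from the paper.

```python
def matvec(W, x):
    """Plain matrix-vector product for a list-of-rows matrix."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def relu_mask(W, x):
    """Diagonal of the 0/1 matrix Sigma(x) satisfying relu(W x) = Sigma(x) W x."""
    return [1.0 if t > 0 else 0.0 for t in matvec(W, x)]

W = [[1.0, -1.0], [0.5, 0.5], [-1.0, 0.0]]
for x in ([2.0, 1.0], [-1.0, 3.0]):
    pre = matvec(W, x)
    relu = [max(0.0, t) for t in pre]
    masked = [s * t for s, t in zip(relu_mask(W, x), pre)]
    print(x, relu_mask(W, x), relu == masked)
```

Two different inputs select two different masks, so the effective linear map, and hence the effective basis, changes with the input even though `W` is fixed.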

An important contribution of the recent theory of deep convolutional framelets [161] is that encoder-decoder CNNs have an interesting link to multi-scale convolutional framelet expansion. Specifically, suppose that the activation function is linear, i.e., , and the following frame conditions are satisfied for all layers:

(37)

for some positive , where denotes the filter length and the filter matrices and for the encoder and decoder, respectively, are constructed as follows:

Then the network output in (34) equals the network input, i.e., , satisfying the perfect reconstruction condition [162].
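A minimal numerical check of perfect reconstruction in the linear case is sketched below for a single layer without pooling. The choice of two-tap Haar filters, the use of the same filters for encoder and decoder, and the 1/d scaling of the decoder (with d the filter length) are assumptions made for this illustration; they give one simple filter set for which the input is recovered exactly.

```python
import math

def analysis(x, filters):
    """Linear encoder: channel k is c_k[i] = sum_j psi_k[j] * x[i + j] (periodic)."""
    n = len(x)
    return [[sum(p[j] * x[(i + j) % n] for j in range(len(p))) for i in range(n)]
            for p in filters]

def synthesis(coeffs, dual_filters):
    """Linear decoder: x_hat[i] = (1/d) * sum_k sum_j dual_k[j] * c_k[i - j]."""
    n = len(coeffs[0])
    d = len(dual_filters[0])
    out = [0.0] * n
    for c, q in zip(coeffs, dual_filters):
        for i in range(n):
            out[i] += sum(q[j] * c[(i - j) % n] for j in range(d)) / d
    return out

s = 1 / math.sqrt(2)
haar = [[s, s], [s, -s]]          # orthonormal two-tap filter pair
x = [3.0, -1.0, 4.0, 1.0, -5.0, 9.0]
x_hat = synthesis(analysis(x, haar), haar)
print(max(abs(a - b) for a, b in zip(x, x_hat)))
```

Dropping one of the two filters breaks the condition, and the output then no longer matches the input.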

Moreover, [161] showed that E-D CNN is closely related to a Hankel matrix decomposition. To see this, concatenate the multi-channel signal side by side:

and define an extended Hankel matrix by cascading a Hankel matrix for each vector side by side. Then, under the frame condition (37), the following matrix identity holds:

(38)
(39)

where

(40)

Using the property of Hankel matrix [161], this can be equivalently represented by

where is the encoder layer multi-channel convolution filters obtained by rearranging . This is in fact equal to the encoder layer convolution in (31) so that . Similarly, we have

where is the decoder layer multi-channel convolution filters obtained by rearranging [161]. Therefore, [161] concluded that E-D CNN emerges from the Hankel matrix decomposition and the network performance depends on the proper decomposition.
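The link between the Hankel structure and convolution can be checked numerically. The sketch below assumes a wrap-around (periodic) Hankel matrix with d columns for a single-channel 1-D signal, and verifies that multiplying it by a filter gives the circular convolution of the signal with the flipped filter. The function names are hypothetical.

```python
def hankel(x, d):
    """Wrap-around Hankel matrix: row i is (x[i], x[i+1], ..., x[i+d-1]), indices mod n."""
    n = len(x)
    return [[x[(i + j) % n] for j in range(d)] for i in range(n)]

def flip(v, n):
    """Flipped filter on a length-n periodic grid: vbar[i] = v[-i mod n]."""
    out = [0.0] * n
    for j, c in enumerate(v):
        out[(-j) % n] = c
    return out

def circ_conv(x, h):
    """Circular convolution of two length-n sequences."""
    n = len(x)
    return [sum(x[(i - m) % n] * h[m] for m in range(n)) for i in range(n)]

x = [1.0, 2.0, 3.0, 4.0, 5.0]
psi = [0.5, -1.0, 2.0]
via_matrix = [sum(h * p for h, p in zip(row, psi)) for row in hankel(x, len(psi))]
via_conv = circ_conv(x, flip(psi, len(x)))
print(via_matrix, via_conv)
```

Stacking several filters as columns of a matrix turns the single matrix-vector product into the multi-channel case, one output channel per column.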

However, in neural networks the input and output should differ, so the perfect reconstruction condition is not of practical interest in itself. Furthermore, the signal representation in (34) should generalize well to various inputs rather than only to the specific inputs seen during training. CNN generalizability comes from the nonlinear nature of the expansion in (34), which is due to the combinatorial selection behavior of the ReLU [162]. Therefore, even with the same filter set, the expressivity of (34) increases exponentially with the network depth, width, and skip connections, which is the main origin of the superior performance of deep neural networks.

VI-B Categories of the existing approaches

This section reviews network architectures used for image reconstruction problems. Fig. 8 illustrates various architectures at a high level.

Fig. 8: Various realizations of deep learning for image reconstruction: (a) image-domain learning, (b) hybrid-domain learning, (c) AUTOMAP, (d) sensor-domain learning.

VI-B1 Image-domain learning

In image domain approaches [144, 163, 136, 164, 141, 146, 142, 137, 140], artifact-corrupted images are first generated from the measurement data using some analytic methods (e.g., FBP, Fourier transform, etc.), from which neural networks are trained to learn the artifacts (see Fig. 8(a)). For example, the low-dose and sparse CT neural networks [136, 164, 141, 142, 137, 140] belong to this class, where the noise corrupted images are first generated from the noisy or sparse view sinogram data using FBP, after which the artifacts are learned by comparing with the noiseless label images. In MR applications, the early U-Net architectures for compressed sensing MRI [165, 146] were also designed to remove the aliasing artifacts after obtaining the Fourier inversion image from the downsampled k-space data.

While image domain approaches have been widely used in image reconstruction problems, the interpretation of these methods differs among authors. For example, in FBPConvNet [141], the U-Net is interpreted as a multi-scale unrolling of a sparse recovery algorithm, whereas in [138, 142], the U-Net is interpreted as a deep convolutional framelet expansion.

A current trend in image domain learning is to use more sophisticated loss functions to overcome the observed smoothing artifacts. For example, in [166], the authors used the perceptual loss and Wasserstein distance loss to improve the resolution.

VI-B2 Hybrid-domain learning

In this class of approaches [144, 117, 143, 36, 167, 139, 145, 138, 168, 169, 170, 171, 172], a data consistency term is imposed during neural network training and inference to improve performance, as shown in Fig. 8(b). For example, the variational neural network for compressed sensing MRI by Hammernik et al. [144] is an unrolled neural network that uses a data consistency term in each layer. The physics-driven deep training methods in Section V-F included the full data-fidelity based image update in each layer. Related approaches are taken in dynamic cardiac MRI by Schlemper et al. [36].
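A data consistency step of this kind can be sketched for a 1-D Cartesian k-space model. The sketch assumes that the measured samples are stored in a full-length array with zeros at unsampled locations, and that sampled entries are either replaced outright (noiseless case) or blended with weight `lam`. The function and parameter names are hypothetical, and a plain DFT is used only to keep the example self-contained.

```python
import cmath

def dft(x):
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(X):
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def data_consistency(x_net, y, mask, lam=None):
    """Enforce agreement with measured k-space samples y at sampled locations.

    lam=None replaces sampled entries outright (noiseless case); otherwise the
    network output and the measurement are blended as (X + lam*y) / (1 + lam).
    """
    X = dft(x_net)
    out = []
    for Xk, yk, m in zip(X, y, mask):
        if not m:
            out.append(Xk)                      # unsampled: keep the network estimate
        elif lam is None:
            out.append(yk)                      # sampled: trust the measurement
        else:
            out.append((Xk + lam * yk) / (1 + lam))
    return idft(out)

truth = [1.0, 2.0, 3.0, 4.0, 0.0, 0.0, 1.0, 2.0]
mask = [1, 1, 0, 0, 0, 0, 0, 1]
y = [Yk if m else 0 for Yk, m in zip(dft(truth), mask)]
corrected = data_consistency([0.0] * 8, y, mask)
```

Placed after each network block, such a step keeps the intermediate images consistent with the acquired samples while leaving the unsampled frequencies to the network.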

Another class of popular hybrid domain approaches is based on the CNN penalty and plug-and-play model. Specifically, in CNN penalty approaches [143], a neural network is used as a prior model within an MBIR framework. Rather than using a CNN penalty explicitly, in the plug-and-play approach [172, 171], the denoising step of an iteration like ADMM is replaced with a neural network denoiser. Recently, Gupta et al. [167] proposed a projected gradient method, where the neural network is trained as a projector on a desirable function space. In the learned primal-dual approach [139], two neural networks are learned for the primal step and dual step.
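The plug-and-play idea can be illustrated with a short sketch. Two substitutions are made here for brevity and should be read as assumptions: a proximal-gradient iteration is used instead of ADMM, and a three-tap moving average stands in for the trained neural network denoiser. The forward model is a 0/1 sampling mask on a 1-D periodic signal.

```python
def smooth(x):
    """Stand-in denoiser: 3-tap circular moving average (a trained CNN would go here)."""
    n = len(x)
    return [(x[i - 1] + x[i] + x[(i + 1) % n]) / 3 for i in range(n)]

def pnp_proximal_gradient(y, mask, denoiser, step=1.0, iters=50):
    """Alternate a gradient step on 0.5*||M x - y||^2 with a denoising step."""
    x = list(y)
    for _ in range(iters):
        # Gradient step on the data-fidelity term, with M a 0/1 mask.
        z = [xi - step * m * (m * xi - yi) for xi, yi, m in zip(x, y, mask)]
        # The proximal (denoising) step is replaced by the plugged-in denoiser.
        x = denoiser(z)
    return x

mask = [1, 0, 1, 1, 0, 1]
y = [2.0, 0.0, 2.0, 2.0, 0.0, 2.0]   # constant signal observed with two missing samples
x_rec = pnp_proximal_gradient(y, mask, smooth)
print(x_rec)
```

Any function with the same signature can be passed as `denoiser`, which is the sense in which the prior is plugged in rather than written as an explicit penalty.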

VI-B3 AUTOMAP

The Automated Transform by Manifold Approximation (AUTOMAP) approach [147] (see Fig. 8(c)) learns a direct mapping from the measurement domain to the image domain using a neural network. This approach requires a fully connected layer followed by convolution layers. Storing that fully connected layer leads to high memory requirements, which currently limits AUTOMAP applications to small reconstruction problems.
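A rough count shows why the fully connected layer dominates the memory cost. The sketch below assumes a single dense layer that maps the real and imaginary parts of an n x n measurement array to an n x n image, stored as 4-byte floats; the actual layer sizes used in [147] are not given here, so these figures are only indicative.

```python
def dense_layer_bytes(n, bytes_per_weight=4):
    """Memory for one dense layer mapping 2*n*n real inputs to n*n outputs."""
    inputs = 2 * n * n       # real and imaginary parts of n x n measurements
    outputs = n * n          # one output per image pixel
    return inputs * outputs * bytes_per_weight

for n in (64, 128, 256):
    print(n, dense_layer_bytes(n) / 2**30, "GiB")
```

The weight count grows with the fourth power of n, so doubling the image side length multiplies the storage by 16.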

VI-B4 Sensor-domain learning

Sensor-domain learning approaches try to learn the sensor domain interpolation and denoising using a neural network as shown in Fig. 8(d). For low-dose CT, [173] designed a neural network in the projection domain, yet the neural network is trained in an end-to-end manner from the sinogram to the image domain. Accordingly, the final output of the neural network is a data-driven ramp filter designed by minimizing the image domain loss. In k-space deep learning for accelerated MRI [174, 175, 176], the authors took similar approaches. Specifically, neural networks were designed to learn k-space interpolation kernels in an end-to-end manner from k-space to the image domain using an image domain loss. In contrast to AUTOMAP, a fully connected layer is not necessary in [173, 174, 175, 176], since an analytic transform is used in the backpropagation learning.

VI-B5 Some variations

In [136, 138], the neural networks were designed to learn the relationship between contourlet transform coefficients of the low-dose input and high-dose label data. These approaches are an early form of the transform domain approaches, where neural networks are designed in a specific transform domain that facilitates learning. Here, the choice of an appropriate domain is based on domain expertise. For example, in a recent deep neural network architecture for interior tomography problems, [177] observed that the neural network is more robust with respect to different ROI sizes, detector pitch, short scan and sparse view artifacts, if the neural network is designed in the differentiated backprojection (DBP) domain. The DBP domain is neither the measurement domain nor the image domain, so it is not well known beyond the X-ray CT community. However, from works on cone-beam analytic reconstruction, the DBP is known for its robustness to short scan artifacts, interior tomography, etc. Therefore, this work [177] shows an interesting research direction, in which the domain knowledge from imaging physics is exploited to design a better neural network architecture.

VI-C Semi-supervised and Unsupervised Learning

Most deep learning approaches for image reconstruction have been based on the supervised learning framework. For example, in low-dose CT reconstruction problems, the neural network is trained to learn the mapping between the noisy image and the noiseless (or high-dose) label images. Similar approaches are taken in accelerated MRI, where the relationship between the highly accelerated, artifact-corrupted input and the fully sampled label data is learned using training data.

Unfortunately, in many imaging scenarios the noiseless label images are difficult or even impossible to acquire. For example, in low-dose CT problems, an institutional review board (IRB) rarely approves experiments that would require two exposures at low and high dose levels due to the potential risks to patients. This is why in the AAPM X-ray CT Low-Dose Grand Challenge, the matched low-dose images were generated by adding synthetic noise to the full dose sinogram data. Even in accelerated MRI, high-resolution fully sampled k-space data is very difficult to acquire due to the long scan time, and impossible to collect for dynamic MRI data sets that are all inherently under-sampled. Therefore, neural network training without reference data, or with only a small number of reference pairs, is very important in the field of image reconstruction.

One of the earliest works in this regard is the low-dose CT denoising network by Wolterink et al. [140]. Instead of using matched high-dose data, the authors employ the GAN loss to match the probability distribution. One limitation of this work is that the network is very sensitive to training: without careful training, spurious artifacts are often generated due to the generative nature of the GAN. To address this problem, Kang et al. [178] proposed a cycleGAN architecture that employs a cyclic loss and an identity loss for multiphase cardiac CT problems. Thanks to the identity loss, which works as a fixed-point constraint, the authors demonstrated that no spurious artifacts appeared in their results even without reference data. Besides, several of the learning approaches in Section V also do not require reference data to provide high quality reconstructions.
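The two extra loss terms can be written down compactly. The sketch below follows the usual cycleGAN convention and is not taken from [178]: `G` maps low-dose images to the high-dose domain, `F` maps in the opposite direction, images are flat lists of pixel values, and an L1 distance is assumed for both terms. The adversarial terms are omitted.

```python
def l1(a, b):
    """Mean absolute difference between two equally sized images."""
    return sum(abs(p - q) for p, q in zip(a, b)) / len(a)

def cycle_loss(G, F, low, high):
    """Mapping to the other domain and back should return the original image."""
    return l1(F(G(low)), low) + l1(G(F(high)), high)

def identity_loss(G, F, low, high):
    """An image already in a generator's target domain should pass through unchanged."""
    return l1(G(high), high) + l1(F(low), low)

# Toy generators (hypothetical): pure intensity scalings that invert each other.
G = lambda img: [0.5 * v for v in img]
F = lambda img: [2.0 * v for v in img]
low, high = [2.0, 4.0], [1.0, 3.0]
print(cycle_loss(G, F, low, high), identity_loss(G, F, low, high))
```

The toy generators are exactly cycle-consistent yet alter images that are already in the target domain, so the cyclic term vanishes while the identity term does not. This is the kind of behavior the fixed-point constraint penalizes.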

Given the limited amount of labeling in medical imaging problems, unsupervised learning is still a growing field that requires many ideas and breakthroughs.

VII Open Questions and Future Directions

There are various challenges, open questions, and directions for image reconstruction that require further research. This section discusses some of the important directions.

First, as discussed in preceding sections, several learning-driven iterative algorithms have been proposed for image reconstruction particularly from limited or corrupted data, and have shown promise in imaging applications. Some of these methods have proven convergence guarantees. For example, recent works [104, 32] show the convergence of block coordinate descent transform learning-based blind compressed sensing algorithms to the critical points of the underlying highly nonconvex problems. However, analysis of theoretical conditions on the learned models, cost functions, algorithm initializations, and (e.g., k-space) sampling guaranteeing accurate and stable image recovery in learning-based setups requires further research. Such results would shed light on appropriate model properties and constraints for different modalities and also aid the development of better behaved iterative algorithms. Theoretical results on desirable properties and invariances for filters and non-linearities and provable ways to incorporate physics in the algorithm architecture would also benefit CNN based reconstruction methods [37] and physics-driven deep training-based reconstruction approaches [38, 129].

Second, in online learning based reconstruction, adapting relatively simple models may speed up the algorithm (particularly when real-time reconstruction is needed), but at the cost of image quality and vice-versa. Developing online learning based approaches that achieve optimal trade-offs between complexity or richness of the learned model, runtime per minibatch, and convergence (over time) is an important area of future research.

Third, a rigorous understanding of the pros and cons of different learning-based approaches and the regimes (signal-to-noise ratios, dose levels, or undersampling) where they work well is lacking. For example, some methods learn models such as dictionaries or sparsifying transforms using model-based cost functions from training data. These methods require fairly modest training data (e.g., several images or patches), and can effectively learn the general properties of images that generalize fairly well to new data (e.g., unseen anomalies may contain similar directional features). Blind compressed sensing methods on the other hand learn models on-the-fly from measurements without requiring training data, mimicking multi-layer (iterative) networks but learned in a completely unsupervised and highly adaptive manner. Supervised learning approaches learn the parameters of reconstruction models often from large datasets of input-output pairs, but may be less likely to generalize to unseen data or could produce spurious reconstructions of unseen features and anomalies (which are much less likely to occur in training sets). Moreover, supervised learning-based methods typically do not incorporate instance-adaptive components such as optimizing clustering for each test case within a network. A rigorous analysis of the different learning methodologies and their efficacy and drawbacks in different (training and testing) data and noise regimes would enable better use of such methods as well as aid the development of better models and improved learning-based reconstruction. Effectively and efficiently combining the benefits of both supervised and unsupervised or model-based learning methods is an interesting line of future research.

Fourth, there is increasing interest in learning-driven sampling of data, particularly limited measurements, in medical imaging. Some recent works [179, 180] proposed learning the undersampling pattern for CS MRI to minimize error in reconstructing a set of training images (e.g., pre-scans). The underlying optimization problems for learning the sampling were combinatorial and moreover, the reconstruction error that is optimized would depend on the chosen reconstruction algorithm and could be a highly nonconvex function as well. These works [179, 180] proposed adapting the sampling to both the training data and the reconstruction algorithm including learning-based reconstruction schemes, and showed improved image quality compared to conventional sampling strategies such as variable density random sampling for MRI. However, the learning could be computationally very expensive (e.g., in [180]) and the convergence behavior of these sampling adaptation algorithms is unknown. Development of efficient sampling (acquisition) learning algorithms with guarantees would be a promising direction of future research.
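To make the combinatorial nature of the problem concrete, the sketch below selects a sampling pattern greedily for 1-D signals. It is a simplified stand-in rather than the algorithms of [179, 180]: reconstruction is plain zero-filled inverse DFT, the error is the summed squared error over the training signals, and one frequency is added at a time. All names are hypothetical.

```python
import cmath
import math

def dft(x):
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(X):
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def zero_filled_error(x, sampled):
    """Squared error of the zero-filled reconstruction from the sampled frequencies."""
    kept = [Xk if k in sampled else 0 for k, Xk in enumerate(dft(x))]
    return sum(abs(r - t) ** 2 for r, t in zip(idft(kept), x))

def greedy_pattern(training, budget):
    """Add, one at a time, the frequency that most reduces the training error."""
    n = len(training[0])
    sampled = set()
    for _ in range(budget):
        candidates = [k for k in range(n) if k not in sampled]
        best = min(candidates,
                   key=lambda k: sum(zero_filled_error(x, sampled | {k}) for x in training))
        sampled.add(best)
    return sorted(sampled)

training = [[math.cos(2 * math.pi * t / 8 + shift) for t in range(8)]
            for shift in (0.0, 1.0)]
print(greedy_pattern(training, budget=2))
```

Replacing `zero_filled_error` with the error of another reconstruction method adapts the pattern to that method as well as to the data, at the price of running the reconstruction for every candidate at every step.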

Finally, given the recent trends and breakthroughs in learning for biomedical imaging, we expect that the next generation imaging systems would leverage learning in all aspects of the imaging system. Such smart imaging systems may learn from big datasets (available locally in hospitals or in the cloud) as well as from real-time patient inputs and optimize the sampling for rapid (e.g., with limited measurements) or low-dose imaging, and also optimize the underlying models for efficient and effective reconstruction and analytics (e.g., classification, segmentation, disease feature detection, etc.). Such adaptation of the data acquisition, reconstruction, and analytics components could be done jointly in an end-to-end manner to maximize performance in specific clinical tasks, while allowing for both radiologist and patient inputs in the learning process. The development of these next generation learning-driven systems would involve research thrusts in both modeling and algorithmic directions coupled with innovations in physics, hardware, pulse sequence design, etc. Importantly, we expect models, algorithms, and computation to play a key role in the development of medical imaging in the near future.

VIII Conclusions

This paper surveyed various advances in the field of medical image reconstruction, beginning with early analytical approaches and simple model-based iterative reconstruction methods based on better models of the imaging system physics and sensor statistics and simple image regularization. The paper then focused on later techniques exploiting improved image models and properties such as sparsity and low-rankness that enable reconstructions from limited or corrupted data, and discussed more recent works on sophisticated data-driven or adaptive models and machine learning techniques for reconstruction. Examples and discussions were used to provide insight into the behavior and limitations of various classes of surveyed methods. We discussed the different regimes of adaptivity and learning and some of the connections between different learning-based models and methods. As the field of learning-driven imaging grows, along with the concurrent interest in smart imaging systems (with learning-driven acquisition, reconstruction, analytics, and diagnostics), we also discussed some of the ongoing challenges, open questions, and future directions for the field.

References

  • [1] L. A. Feldkamp, L. C. Davis, and J. W. Kress, “Practical cone beam algorithm,” J. Opt. Soc. Am. A, vol. 1, no. 6, pp. 612–9, Jun. 1984.
  • [2] J. A. Fessler and B. P. Sutton, “Nonuniform fast Fourier transforms using min-max interpolation,” IEEE Trans. Sig. Proc., vol. 51, no. 2, pp. 560–74, Feb. 2003.
  • [3] S. De Francesco and A. M. Ferreira da Silva, “Efficient NUFFT-based direct Fourier algorithm for fan beam CT reconstruction,” in Proc. SPIE 5370 Medical Imaging: Image Proc., 2004, pp. 666–77.
  • [4] J. A. Fessler, “On NUFFT-based gridding for non-Cartesian MRI,” J. Mag. Res., vol. 188, no. 2, pp. 191–5, Oct. 2007.
  • [5] ——, “Statistical image reconstruction methods for transmission tomography,” in Handbook of Medical Imaging, Volume 2. Medical Image Processing and Analysis, M. Sonka and J. M. Fitzpatrick, Eds.   Bellingham: SPIE, 2000, pp. 1–70.
  • [6] I. A. Elbakri and J. A. Fessler, “Statistical image reconstruction for polyenergetic X-ray computed tomography,” IEEE Trans. Med. Imag., vol. 21, no. 2, pp. 89–99, Feb. 2002.
  • [7] K. Sauer and C. Bouman, “A local update strategy for iterative reconstruction from projections,” IEEE Trans. Sig. Proc., vol. 41, no. 2, pp. 534–48, Feb. 1993.
  • [8] J.-B. Thibault, C. A. Bouman, K. D. Sauer, and J. Hsieh, “A recursive filter for noise reduction in statistical iterative tomographic imaging,” in Proc. SPIE 6065 Computational Imaging IV, 2006, p. 60650X.
  • [9] J. A. Fessler, “Penalized weighted least-squares image reconstruction for positron emission tomography,” IEEE Trans. Med. Imag., vol. 13, no. 2, pp. 290–300, Jun. 1994.
  • [10] K. P. Pruessmann, “Encoding and reconstruction in parallel MRI,” NMR in Biomedicine, vol. 19, no. 3, pp. 288–299, 2006.
  • [11] P. Feng and Y. Bresler, “Spectrum-blind minimum-rate sampling and reconstruction of multiband signals,” in ICASSP, vol. 3, May 1996, pp. 1689–1692.
  • [12] Y. Bresler and P. Feng, “Spectrum-blind minimum-rate sampling and reconstruction of 2-D multiband signals,” in Proc. 3rd IEEE Int. Conf. on Image Processing, ICIP’96, 1996, pp. 701–704.
  • [13] D. L. Donoho, “Compressed sensing,” IEEE Trans. Information Theory, vol. 52, no. 4, pp. 1289–1306, 2006.
  • [14] E. J. Candès, J. Romberg, and T. Tao, “Robust uncertainty principles: Exact signal reconstruction from highly incomplete frequency information,” IEEE Trans. Information Theory, vol. 52, pp. 489–509, 2006.
  • [15] J. C. Ye, Y. Bresler, and P. Moulin, “A self-referencing level-set method for image reconstruction from sparse fourier samples,” International Journal of Computer Vision, vol. 50, no. 3, pp. 253–270, 2002.
  • [16] G. Wang, Y. Bresler, and V. Ntziachristos, “Guest editorial compressive sensing for biomedical imaging,” IEEE Transactions on Medical Imaging, vol. 30, no. 5, pp. 1013–1016, 2011.
  • [17] M. Lustig, D. Donoho, and J. M. Pauly, “Sparse MRI: The application of compressed sensing for rapid MR imaging,” Magnetic resonance in medicine, vol. 58, no. 6, pp. 1182–1195, 2007.
  • [18] M. Lustig, D. L. Donoho, J. M. Santos, and J. M. Pauly, “Compressed sensing mri,” IEEE Signal Processing Magazine, vol. 25, no. 2, pp. 72–82, 2008.
  • [19] FDA, “510k premarket notification of HyperSense (GE Medical Systems),” 2017.
  • [20] ——, “510k premarket notification of Compressed Sensing Cardiac Cine (Siemens),” 2017.
  • [21] ——, “510k premarket notification of Compressed SENSE,” 2018.
  • [22] X. Qu, W. Zhang, D. Guo, C. Cai, S. Cai, and Z. Chen, “Iterative thresholding compressed sensing MRI based on contourlet transform,” Inverse Problems in Science and Engineering, vol. 18, no. 6, pp. 737–758, 2010.
  • [23] B. Adcock, A. C. Hansen, C. Poon, and B. Roman, “Breaking the coherence barrier: A new theory for compressed sensing,” arXiv preprint arXiv:1302.0561, 2013.
  • [24] H. Nien and J. A. Fessler, “Relaxed linearized algorithms for faster X-ray CT image reconstruction,” IEEE Trans. Med. Imag., vol. 35, no. 4, pp. 1090–8, Apr. 2016.
  • [25] D. Kim, S. Ramani, and J. A. Fessler, “Combining ordered subsets and momentum for accelerated X-ray CT image reconstruction,” IEEE Trans. Med. Imag., vol. 34, no. 1, pp. 167–78, Jan. 2015.
  • [26] M. Aharon, M. Elad, and A. Bruckstein, “K-SVD : An algorithm for designing overcomplete dictionaries for sparse representation,” IEEE Transactions on Signal Processing, vol. 54, no. 11, pp. 4311–4322, 2006.
  • [27] S. Ravishankar and Y. Bresler, “Learning sparsifying transforms,” IEEE Transactions on Signal Processing, vol. 61, no. 5, pp. 1072–1086, 2013.
  • [28] Q. Xu, H. Yu, X. Mou, L. Zhang, J. Hsieh, and G. Wang, “Low-dose X-ray CT reconstruction via dictionary learning,” IEEE Trans. Med. Imag., vol. 31, no. 9, pp. 1682–1697, 2012.
  • [29] X. Zheng, S. Ravishankar, Y. Long, and J. A. Fessler, “PWLS-ULTRA: An efficient clustering and learning-based approach for low-dose 3D CT image reconstruction,” IEEE Trans. Med. Imag., vol. 37, no. 6, pp. 1498–510, Jun. 2018.
  • [30] S. Ravishankar and Y. Bresler, “MR image reconstruction from highly undersampled k-space data by dictionary learning,” IEEE transactions on medical imaging, vol. 30, no. 5, pp. 1028–1041, 2011.
  • [31] S. G. Lingala and M. Jacob, “Blind compressive sensing dynamic MRI,” IEEE Transactions on Medical Imaging, vol. 32, no. 6, pp. 1132–1145, 2013.
  • [32] S. Ravishankar and Y. Bresler, “Data-driven learning of a union of sparsifying transforms model for blind compressed sensing,” IEEE Transactions on Computational Imaging, vol. 2, no. 3, pp. 294–309, 2016.
  • [33] S. Gleichman and Y. C. Eldar, “Blind compressed sensing,” IEEE Transactions on Information Theory, vol. 57, no. 10, pp. 6958–6975, 2011.
  • [34] M. Mardani, G. Mateos, and G. B. Giannakis, “Subspace learning and imputation for streaming big data matrices and tensors,” IEEE Transactions on Signal Processing, vol. 63, no. 10, pp. 2663–2677, 2015.
  • [35] B. E. Moore, S. Ravishankar, R. R. Nadakuditi, and J. A. Fessler, “Online adaptive image reconstruction (OnAIR) using dictionary models,” IEEE Transactions on Computational Imaging, 2018, arXiv preprint, arXiv:1809.01817.
  • [36] J. Schlemper, J. Caballero, J. V. Hajnal, A. N. Price, and D. Rueckert, “A deep cascade of convolutional neural networks for dynamic MR image reconstruction,” IEEE Transactions on Medical Imaging, vol. 37, no. 2, pp. 491–503, 2018.
  • [37] D. Lee, J. Yoo, S. Tak, and J. C. Ye, “Deep residual learning for accelerated MRI using magnitude and phase networks,” IEEE Transactions on Biomedical Engineering, vol. 65, no. 9, pp. 1985–1995, 2018.
  • [38] S. Ravishankar, I. Y. Chun, and J. A. Fessler, “Physics-driven deep training of dictionary-based algorithms for mr image reconstruction,” in 2017 51st Asilomar Conference on Signals, Systems, and Computers, 2017, pp. 1859–1863.
  • [39] G. Wang, J. C. Ye, K. Mueller, and J. A. Fessler, “Image reconstruction is a new frontier of machine learning,” IEEE Trans. Med. Imag., vol. 37, no. 6, pp. 1289–96, Jun. 2018.
  • [40] J. Besag, “On the statistical analysis of dirty pictures,” J. Royal Stat. Soc. Ser. B, vol. 48, no. 3, pp. 259–302, 1986.
  • [41] H. M. Hudson and R. S. Larkin, “Accelerated image reconstruction using ordered subsets of projection data,” IEEE Trans. Med. Imag., vol. 13, no. 4, pp. 601–9, Dec. 1994.
  • [42] S. Ahn, S. G. Ross, E. Asma, J. Miao, X. Jin, L. Cheng, S. D. Wollenweber, and R. M. Manjeshwar, “Quantitative comparison of OSEM and penalized likelihood image reconstruction using relative difference penalties for clinical PET,” Phys. Med. Biol., vol. 60, no. 15, pp. 5733–52, Aug. 2015.
  • [43] J. A. Fessler and W. L. Rogers, “Spatial resolution properties of penalized-likelihood image reconstruction methods: Space-invariant tomographs,” IEEE Trans. Im. Proc., vol. 5, no. 9, pp. 1346–58, Sep. 1996.
  • [44] J. Nuyts, D. Beque, P. Dupont, and L. Mortelmans, “A concave prior penalizing relative differences for maximum-a-posteriori reconstruction in emission tomography,” IEEE Trans. Nuc. Sci., vol. 49, no. 1-1, pp. 56–60, Feb. 2002.
  • [45] J.-B. Thibault, K. Sauer, C. Bouman, and J. Hsieh, “A three-dimensional statistical approach to improved image quality for multi-slice helical CT,” Med. Phys., vol. 34, no. 11, pp. 4526–44, Nov. 2007.
  • [46] L. Geerts-Ossevoort, E. de Weerdt, A. Duijndam, G. van IJperen, H. Peeters, M. Doneva, M. Nijenhuis, and A. Huang, “Compressed SENSE,” 2018, Philips white paper 4522 991 31821, Nov. 2018.
  • [47] J. P. Haldar, “Low-rank modeling of local-space neighborhoods (LORAKS) for constrained MRI,” IEEE Transactions on Medical Imaging, vol. 33, no. 3, pp. 668–681, 2014.
  • [48] K. H. Jin, D. Lee, and J. C. Ye, “A general framework for compressed sensing and parallel MRI using annihilating filter based low-rank Hankel matrix,” IEEE Transactions on Computational Imaging, vol. 2, no. 4, pp. 480–495, 2016.
  • [49] D. Lee, K. H. Jin, E. Y. Kim, S.-H. Park, and J. C. Ye, “Acceleration of MR parameter mapping using annihilating filter-based low rank Hankel matrix (ALOHA),” Magnetic resonance in medicine, vol. 76, no. 6, pp. 1848–1864, 2016.
  • [50] G. Ongie and M. Jacob, “Off-the-grid recovery of piecewise constant images from few fourier samples,” SIAM Journal on Imaging Sciences, vol. 9, no. 3, pp. 1004–1041, 2016.
  • [51] ——, “A fast algorithm for convolutional structured low-rank matrix recovery,” IEEE transactions on computational imaging, vol. 3, no. 4, pp. 535–550, 2017.
  • [52] K. H. Jin, J.-Y. Um, D. Lee, J. Lee, S.-H. Park, and J. C. Ye, “MRI artifact correction using sparse+ low-rank decomposition of annihilating filter-based Hankel matrix,” Magnetic resonance in medicine, vol. 78, no. 1, pp. 327–340, 2017.
  • [53] Z. P. Liang, “Spatiotemporal imaging with partially separable functions,” in IEEE International Symposium on Biomedical Imaging: From Nano to Macro, 2007, pp. 988–991.
  • [54] V. Singh, A. H. Tewfik, and D. B. Ress, “Under-sampled functional mri using low-rank plus sparse matrix decomposition,” in 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2015, pp. 897–901.
  • [55] G. Mazor, L. Weizman, A. Tal, and Y. C. Eldar, “Low-rank magnetic resonance fingerprinting,” Medical Physics, vol. 45, no. 9, pp. 4066–4084, 2018.
  • [56] J. P. Haldar and Z. P. Liang, “Spatiotemporal imaging with partially separable functions: A matrix recovery approach,” in IEEE International Symposium on Biomedical Imaging: From Nano to Macro, 2010, pp. 716–719.
  • [57] B. Zhao, J. P. Haldar, C. Brinegar, and Z. P. Liang, “Low rank matrix recovery for real-time cardiac MRI,” in IEEE International Symposium on Biomedical Imaging: From Nano to Macro, 2010, pp. 996–999.
  • [58] H. Pedersen, S. Kozerke, S. Ringgaard, K. Nehrke, and W. Y. Kim, “k-t pca: Temporally constrained k-t blast reconstruction using principal component analysis,” Magnetic Resonance in Medicine, vol. 62, no. 3, pp. 706–716, 2009.
  • [59] J. Trzasko and A. Manduca, “Local versus global low-rank promotion in dynamic MRI series reconstruction,” in Proc. ISMRM, 2011, p. 4371.
  • [60] S. G. Lingala, Y. Hu, E. DiBella, and M. Jacob, “Accelerated dynamic MRI exploiting sparsity and low-rank structure: k-t SLR,” IEEE Transactions on Medical Imaging, vol. 30, no. 5, pp. 1042–1054, 2011.
  • [61] B. Zhao, J. P. Haldar, A. G. Christodoulou, and Z. P. Liang, “Image reconstruction from highly undersampled (k, t) -space data with joint partial separability and sparsity constraints,” IEEE Transactions on Medical Imaging, vol. 31, no. 9, pp. 1809–1820, 2012.
  • [62] E. J. Candès, X. Li, Y. Ma, and J. Wright, “Robust principal component analysis?” J. ACM, vol. 58, no. 3, pp. 11:1–11:37, 2011.
  • [63] H. Guo, C. Qiu, and N. Vaswani, “An online algorithm for separating sparse and low-dimensional signal sequences from their sum,” IEEE Transactions on Signal Processing, vol. 62, no. 16, pp. 4284–4297, 2014.
  • [64] R. Otazo, E. Candès, and D. K. Sodickson, “Low-rank plus sparse matrix decomposition for accelerated dynamic MRI with separation of background and dynamic components,” Magnetic Resonance in Medicine, vol. 73, no. 3, pp. 1125–1136, 2015.
  • [65] B. Trémoulhéac, N. Dikaios, D. Atkinson, and S. R. Arridge, “Dynamic MR image reconstruction–separation from undersampled (k,t)-space via low-rank plus sparse prior,” IEEE Transactions on Medical Imaging, vol. 33, no. 8, pp. 1689–1701, 2014.
  • [66] D. Banco, S. Aeron, and W. S. Hoge, “Sampling and recovery of MRI data using low rank tensor models,” in 2016 38th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), 2016, pp. 448–452.
  • [67] B. Yaman, S. Weingärtner, N. Kargas, N. D. Sidiropoulos, and M. Akcakaya, “Locally low-rank tensor regularization for high-resolution quantitative dynamic MRI,” in 2017 IEEE 7th International Workshop on Computational Advances in Multi-Sensor Adaptive Processing (CAMSAP), 2017, pp. 1–5.
  • [68] J. Lee, K. H. Jin, and J. C. Ye, “Reference-free single-pass EPI Nyquist ghost correction using annihilating filter-based low rank Hankel matrix (ALOHA),” Magnetic resonance in medicine, vol. 76, no. 6, pp. 1775–1789, 2016.
  • [69] J. C. Ye, J. M. Kim, K. H. Jin, and K. Lee, “Compressive sampling using annihilating filter-based low-rank interpolation,” IEEE Transactions on Information Theory, vol. 63, no. 2, pp. 777–801, Feb. 2017.
  • [70] G. Ongie, S. Biswas, and M. Jacob, “Convex recovery of continuous domain piecewise constant images from nonuniform Fourier samples,” IEEE Transactions on Signal Processing, vol. 66, no. 1, pp. 236–250, 2017.
  • [71] M. Vetterli, P. Marziliano, and T. Blu, “Sampling signals with finite rate of innovation,” IEEE Trans. on Signal Processing, vol. 50, no. 6, pp. 1417–1428, 2002.
  • [72] I. Maravic and M. Vetterli, “Sampling and reconstruction of signals with finite rate of innovation in the presence of noise,” IEEE Trans. on Signal Processing, vol. 53, no. 8, pp. 2788–2805, 2005.
  • [73] E. M. Haacke, Z.-P. Liang, and S. H. Izen, “Superresolution reconstruction through object modeling and parameter estimation,” IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 37, no. 4, pp. 592–595, 1989.
  • [74] P. J. Shin, P. E. Larson, M. A. Ohliger, M. Elad, J. M. Pauly, D. B. Vigneron, and M. Lustig, “Calibrationless parallel imaging reconstruction based on structured low-rank matrix completion,” Magnetic resonance in medicine, vol. 72, no. 4, pp. 959–970, 2014.
  • [75] J. Min, K. H. Jin, M. Unser, and J. C. Ye, “Grid-free localization algorithm using low-rank Hankel matrix for super-resolution microscopy,” IEEE Transactions on Image Processing, vol. 27, no. 10, pp. 4771–4786, 2018.
  • [76] K. H. Jin and J. C. Ye, “Annihilating filter-based low-rank Hankel matrix approach for image inpainting,” IEEE Transactions on Image Processing, vol. 24, no. 11, pp. 3498–3511, 2015.
  • [77] ——, “Sparse and low-rank decomposition of a Hankel structured matrix for impulse noise removal,” IEEE Transactions on Image Processing, vol. 27, no. 3, pp. 1448–1461, 2018.
  • [78] X. Qu, D. Guo, B. Ning, Y. Hou, Y. Lin, S. Cai, and Z. Chen, “Undersampled MRI reconstruction with patch-based directional wavelets,” Magnetic resonance imaging, vol. 30, no. 7, pp. 964–977, 2012.
  • [79] B. Ning, X. Qu, D. Guo, C. Hu, and Z. Chen, “Magnetic resonance image reconstruction using trained geometric directions in 2D redundant wavelets domain and non-convex optimization,” Magnetic resonance imaging, vol. 31, no. 9, pp. 1611–1622, 2013.
  • [80] Z. Zhan, J.-F. Cai, D. Guo, Y. Liu, Z. Chen, and X. Qu, “Fast multiclass dictionaries learning with geometrical directions in MRI reconstruction,” IEEE Transactions on Biomedical Engineering, vol. 63, no. 9, pp. 1850–1861, 2016.
  • [81] X. Qu, Y. Hou, F. Lam, D. Guo, J. Zhong, and Z. Chen, “Magnetic resonance image reconstruction from undersampled measurements using a patch-based nonlocal operator,” Medical image analysis, vol. 18, no. 6, pp. 843–856, 2014.
  • [82] R. Vidal, “Subspace clustering,” IEEE Signal Processing Magazine, vol. 28, no. 2, pp. 52–68, 2011.
  • [83] B. A. Olshausen and D. J. Field, “Emergence of simple-cell receptive field properties by learning a sparse code for natural images,” Nature, vol. 381, no. 6583, pp. 607–609, 1996.
  • [84] M. Aharon, M. Elad, and A. Bruckstein, “K-SVD: An algorithm for designing overcomplete dictionaries for sparse representation,” IEEE Transactions on signal processing, vol. 54, no. 11, pp. 4311–4322, 2006.
  • [85] J. Mairal, F. Bach, J. Ponce, and G. Sapiro, “Online learning for matrix factorization and sparse coding,” J. Mach. Learn. Res., vol. 11, pp. 19–60, 2010.
  • [86] M. Elad and M. Aharon, “Image denoising via sparse and redundant representations over learned dictionaries,” IEEE Trans. Image Process., vol. 15, no. 12, pp. 3736–3745, 2006.
  • [87] J. Mairal, M. Elad, and G. Sapiro, “Sparse representation for color image restoration,” IEEE Trans. on Image Processing, vol. 17, no. 1, pp. 53–69, 2008.
  • [88] A. M. Bruckstein, D. L. Donoho, and M. Elad, “From sparse solutions of systems of equations to sparse modeling of signals and images,” SIAM Review, vol. 51, no. 1, pp. 34–81, 2009.
  • [89] R. Rubinstein, M. Zibulevsky, and M. Elad, “Double sparsity: Learning sparse dictionaries for sparse signal approximation,” IEEE Transactions on Signal Processing, vol. 58, no. 3, pp. 1553–1564, 2010.
  • [90] Y. Pati, R. Rezaiifar, and P. Krishnaprasad, “Orthogonal matching pursuit: recursive function approximation with applications to wavelet decomposition,” in Asilomar Conf. on Signals, Systems and Comput., 1993, pp. 40–44 vol.1.
  • [91] Y. Wang and L. Ying, “Compressed sensing dynamic cardiac cine MRI using learned spatiotemporal dictionary,” IEEE Transactions on Biomedical Engineering, vol. 61, no. 4, pp. 1109–1120, 2014.
  • [92] J. Caballero, A. N. Price, D. Rueckert, and J. V. Hajnal, “Dictionary learning and time sparsity for dynamic MR data reconstruction,” IEEE Transactions on Medical Imaging, vol. 33, no. 4, pp. 979–994, 2014.
  • [93] D. Weller, “Reconstruction with dictionary learning for accelerated parallel magnetic resonance imaging,” in 2016 IEEE Southwest Symposium on Image Analysis and Interpretation (SSIAI), 2016, pp. 105–108.
  • [94] S. Chen, H. Liu, P. Shi, and Y. Chen, “Sparse representation and dictionary learning penalized image reconstruction for positron emission tomography,” Physics in Medicine and Biology, vol. 60, no. 2, pp. 807–823, 2015.
  • [95] Y. Huang, J. Paisley, Q. Lin, X. Ding, X. Fu, and X. P. Zhang, “Bayesian nonparametric dictionary learning for compressed sensing MRI,” IEEE Trans. Image Process., vol. 23, no. 12, pp. 5007–5019, 2014.
  • [96] S. Tan, Y. Zhang, G. Wang, X. Mou, G. Cao, Z. Wu, and H. Yu, “Tensor-based dictionary learning for dynamic tomographic reconstruction,” Physics in Medicine and Biology, vol. 60, no. 7, pp. 2803–2818, 2015.
  • [97] Y. Zhang, X. Mou, G. Wang, and H. Yu, “Tensor-based dictionary learning for spectral CT reconstruction,” IEEE Transactions on Medical Imaging, vol. 36, no. 1, pp. 142–154, 2017.
  • [98] S. Ravishankar, R. R. Nadakuditi, and J. A. Fessler, “Efficient sum of outer products dictionary learning (SOUP-DIL) and its application to inverse problems,” IEEE Transactions on Computational Imaging, vol. 3, no. 4, pp. 694–709, 2017.
  • [99] M. Sadeghi, M. Babaie-Zadeh, and C. Jutten, “Learning overcomplete dictionaries based on atom-by-atom updating,” IEEE Transactions on Signal Processing, vol. 62, no. 4, pp. 883–891, 2014.
  • [100] S. Ravishankar, B. E. Moore, R. R. Nadakuditi, and J. A. Fessler, “Low-rank and adaptive sparse signal (LASSI) models for highly accelerated dynamic imaging,” IEEE Transactions on Medical Imaging, vol. 36, no. 5, pp. 1116–1128, 2017.
  • [101] C. Garcia-Cardona and B. Wohlberg, “Convolutional dictionary learning: A comparative review and new algorithms,” IEEE Transactions on Computational Imaging, vol. 4, no. 3, pp. 366–381, Sept 2018.
  • [102] B. Wohlberg, “Efficient algorithms for convolutional sparse representations,” IEEE Transactions on Image Processing, vol. 25, no. 1, pp. 301–315, Jan 2016.
  • [103] I. Y. Chun and J. A. Fessler, “Convolutional dictionary learning: acceleration and convergence,” IEEE Trans. Im. Proc., vol. 27, no. 4, pp. 1697–712, Apr. 2018.
  • [104] S. Ravishankar and Y. Bresler, “Efficient blind compressed sensing using sparsifying transforms with convergence guarantees and application to magnetic resonance imaging,” SIAM Journal on Imaging Sciences, vol. 8, no. 4, pp. 2519–2557, 2015.
  • [105] B. Wen, S. Ravishankar, L. Pfister, and Y. Bresler, “Transform learning for magnetic resonance image reconstruction: From model-based learning to building neural networks,” 2019, arXiv preprint, arXiv:1903.11431.
  • [106] M. W. Marcellin, M. J. Gormish, A. Bilgin, and M. P. Boliek, “An overview of JPEG-2000,” in Proc. Data Compression Conf., 2000, pp. 523–541.
  • [107] A. K. Tanc and E. M. Eksioglu, “MRI reconstruction with joint global regularization and transform learning,” Computerized Medical Imaging and Graphics, vol. 53, pp. 1–8, 2016.
  • [108] L. Pfister and Y. Bresler, “Model-based iterative tomographic reconstruction with adaptive sparsifying transforms,” in SPIE International Symposium on Electronic Imaging: Computational Imaging XII, vol. 9020, 2014, pp. 90200H-1–90200H-11.
  • [109] S. Ye, S. Ravishankar, Y. Long, and J. A. Fessler, “Adaptive sparse modeling and shifted-poisson likelihood based approach for low-dose CT image reconstruction,” in 2017 IEEE 27th International Workshop on Machine Learning for Signal Processing (MLSP), 2017, pp. 1–6.
  • [110] S. Ravishankar and Y. Bresler, “Learning doubly sparse transforms for images,” IEEE Transactions on Image Processing, vol. 22, no. 12, pp. 4598–4612, 2013.
  • [111] S. Ravishankar and Y. Bresler, “Learning overcomplete sparsifying transforms for signal processing,” in 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, 2013, pp. 3088–3092.
  • [112] B. Wen, S. Ravishankar, and Y. Bresler, “Structured overcomplete sparsifying transform learning with convergence guarantees and applications,” International Journal of Computer Vision, vol. 114, no. 2-3, pp. 137–167, 2015.
  • [113] Z. Li, S. Ravishankar, Y. Long, and J. A. Fessler, “DECT-MULTRA: dual-energy CT image decomposition with learned mixed material models and efficient clustering,” arXiv preprint arXiv:1901.00106, 2019.
  • [114] B. Wen, S. Ravishankar, and Y. Bresler, “FRIST- flipping and rotation invariant sparsifying transform learning and applications,” Inverse Problems, vol. 33, no. 7, p. 074007, 2017.
  • [115] B. Wen, Y. Li, and Y. Bresler, “The power of complementary regularizers: Image recovery via transform learning and low-rank modeling,” arXiv preprint arXiv:1808.01316, 2018.
  • [116] K. Dabov, A. Foi, V. Katkovnik, and K. Egiazarian, “Image denoising by sparse 3D transform-domain collaborative filtering,” IEEE Trans. on Image Processing, vol. 16, no. 8, pp. 2080–2095, 2007.
  • [117] Y. Yang, J. Sun, H. Li, and Z. Xu, “Deep ADMM-Net for compressive sensing MRI,” in Advances in Neural Information Processing Systems, 2016, pp. 10–18.
  • [118] W. P. Segars, M. Mahesh, T. J. Beck, E. C. Frey, and B. M. W. Tsui, “Realistic CT simulation using the 4D XCAT phantom,” Med. Phys., vol. 35, no. 8, pp. 3800–3808, Aug. 2008.
  • [119] J. H. Cho and J. A. Fessler, “Regularization designs for uniform spatial resolution and noise properties in statistical image reconstruction for 3D X-ray CT,” IEEE Trans. Med. Imag., vol. 34, no. 2, pp. 678–689, Feb. 2015.
  • [120] B. Wen, S. Ravishankar, and Y. Bresler, “VIDOSAT: High-dimensional sparsifying transform learning for online video denoising,” IEEE Transactions on Image Processing, vol. 28, no. 4, pp. 1691–1704, 2019.
  • [121] S. Ravishankar, B. E. Moore, R. R. Nadakuditi, and J. A. Fessler, “Efficient online dictionary adaptation and image reconstruction for dynamic MRI,” in 2017 51st Asilomar Conference on Signals, Systems, and Computers, 2017, pp. 835–839.
  • [122] M. Mardani, G. B. Giannakis, and K. Ugurbil, “Tracking tensor subspaces with informative random sampling for real-time MR imaging,” arXiv preprint, arXiv:1609.04104, 2016.
  • [123] I. Y. Chun and J. A. Fessler, “Convolutional analysis operator learning: Application to sparse-view CT,” in Proc., IEEE Asilomar Conf. on Signals, Systems, and Comp., 2018, pp. 1631–5, invited.
  • [124] ——, “Convolutional analysis operator learning: acceleration and convergence,” 2018, arXiv preprint, arXiv:1802.05584.
  • [125] L. Pfister and Y. Bresler, “Learning filter bank sparsifying transforms,” IEEE Transactions on Signal Processing, vol. 67, no. 2, pp. 504–519, 2019.
  • [126] S. Ravishankar, A. Lahiri, C. Blocker, and J. A. Fessler, “Deep dictionary-transform learning for image reconstruction,” in 2018 IEEE 15th International Symposium on Biomedical Imaging (ISBI 2018), 2018, pp. 1208–1212.
  • [127] S. Ravishankar and B. Wohlberg, “Learning multi-layer transform models,” in 2018 56th Annual Allerton Conference on Communication, Control, and Computing (Allerton), 2018, pp. 160–165.
  • [128] L. Pfister and Y. Bresler, “Learning sparsifying filter banks,” in Proc. SPIE 9597, Wavelets and Sparsity XVI, 2015, pp. 959703-1–959703-10.
  • [129] Y. Chun and J. A. Fessler, “Deep BCD-Net using identical encoding-decoding CNN structures for iterative image recovery,” in 2018 IEEE 13th Image, Video, and Multidimensional Signal Processing Workshop (IVMSP), 2018, pp. 1–5.
  • [130] G. Wang, J. C. Ye, K. Mueller, and J. A. Fessler, “Image reconstruction is a new frontier of machine learning,” IEEE transactions on medical imaging, vol. 37, no. 6, pp. 1289–1296, 2018.
  • [131] A. Krizhevsky, I. Sutskever, and G. E. Hinton, “ImageNet classification with deep convolutional neural networks,” in Advances in neural information processing systems, 2012, pp. 1097–1105.
  • [132] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in Proceedings of the IEEE conference on computer vision and pattern recognition, 2016, pp. 770–778.
  • [133] O. Ronneberger, P. Fischer, and T. Brox, “U-net: Convolutional networks for biomedical image segmentation,” in 18th International Conference on Medical image computing and computer-assisted intervention, Munich, Germany, 2015.
  • [134] K. Zhang, W. Zuo, Y. Chen, D. Meng, and L. Zhang, “Beyond a Gaussian denoiser: Residual learning of deep CNN for image denoising,” IEEE Transactions on Image Processing, vol. 26, no. 7, pp. 3142–3155, 2017.
  • [135] J. Kim, J. Kwon Lee, and K. Mu Lee, “Accurate image super-resolution using very deep convolutional networks,” in Proceedings of the IEEE conference on computer vision and pattern recognition, 2016, pp. 1646–1654.
  • [136] E. Kang, J. Min, and J. C. Ye, “A deep convolutional neural network using directional wavelets for low-dose X-ray CT reconstruction,” Medical physics, vol. 44, no. 10, 2017.
  • [137] H. Chen, Y. Zhang, M. K. Kalra, F. Lin, Y. Chen, P. Liao, J. Zhou, and G. Wang, “Low-dose CT with a residual encoder-decoder convolutional neural network,” IEEE transactions on medical imaging, vol. 36, no. 12, pp. 2524–2535, 2017.
  • [138] E. Kang, W. Chang, J. Yoo, and J. C. Ye, “Deep convolutional framelet denosing for low-dose CT via wavelet residual network,” IEEE transactions on medical imaging, vol. 37, no. 6, pp. 1358–1369, 2018.
  • [139] J. Adler and O. Öktem, “Learned primal-dual reconstruction,” IEEE transactions on medical imaging, vol. 37, no. 6, pp. 1322–1332, 2018.
  • [140] J. M. Wolterink, T. Leiner, M. A. Viergever, and I. Išgum, “Generative adversarial networks for noise reduction in low-dose CT,” IEEE transactions on medical imaging, vol. 36, no. 12, pp. 2536–2545, 2017.
  • [141] K. H. Jin, M. T. McCann, E. Froustey, and M. Unser, “Deep convolutional neural network for inverse problems in imaging,” IEEE Transactions on Image Processing, vol. 26, no. 9, pp. 4509–4522, 2017.
  • [142] Y. Han and J. C. Ye, “Framing U-Net via deep convolutional framelets: Application to sparse-view CT,” IEEE Transactions on Medical Imaging, vol. 37, no. 6, pp. 1418–1429, 2018.
  • [143] S. Wang, Z. Su, L. Ying, X. Peng, S. Zhu, F. Liang, D. Feng, and D. Liang, “Accelerating magnetic resonance imaging via deep learning,” in Biomedical Imaging (ISBI), 2016 IEEE 13th International Symposium on. IEEE, 2016, pp. 514–517.
  • [144] K. Hammernik, T. Klatzer, E. Kobler, M. P. Recht, D. K. Sodickson, T. Pock, and F. Knoll, “Learning a variational network for reconstruction of accelerated MRI data,” Magnetic resonance in medicine, vol. 79, no. 6, pp. 3055–3071, 2018.
  • [145] D. Lee, J. Yoo, S. Tak, and J. C. Ye, “Deep residual learning for accelerated MRI using magnitude and phase networks,” IEEE Transactions on Biomedical Engineering, vol. 65, no. 9, pp. 1985–1995, 2018.
  • [146] Y. Han, J. Yoo, H. H. Kim, H. J. Shin, K. Sung, and J. C. Ye, “Deep learning with domain adaptation for accelerated projection-reconstruction MR,” Magnetic resonance in medicine, vol. 80, no. 3, pp. 1189–1205, 2018.
  • [147] B. Zhu, J. Z. Liu, S. F. Cauley, B. R. Rosen, and M. S. Rosen, “Image reconstruction by domain-transform manifold learning,” Nature, vol. 555, no. 7697, p. 487, 2018.
  • [148] K. Gong, J. Guan, K. Kim, X. Zhang, J. Yang, Y. Seo, G. El Fakhri, J. Qi, and Q. Li, “Iterative PET image reconstruction using convolutional neural network representation,” IEEE transactions on medical imaging, 2018.
  • [149] K. Gong, C. Catana, J. Qi, and Q. Li, “PET image reconstruction using deep image prior,” IEEE transactions on medical imaging, 2018.
  • [150] A. C. Luchies and B. C. Byram, “Deep neural networks for ultrasound beamforming,” IEEE transactions on medical imaging, vol. 37, no. 9, pp. 2010–2021, 2018.
  • [151] Y. H. Yoon, S. Khan, J. Huh, and J. C. Ye, “Efficient B-mode ultrasound image reconstruction from sub-sampled RF data using deep learning,” IEEE transactions on medical imaging, vol. 38, no. 2, pp. 325–336, 2019.
  • [152] S. Khan, J. Huh, and J. C. Ye, “Universal deep beamformer for variable rate ultrasound imaging,” arXiv preprint arXiv:1901.01706, 2019.
  • [153] Y. Rivenson, Z. Göröcs, H. Günaydin, Y. Zhang, H. Wang, and A. Ozcan, “Deep learning microscopy,” Optica, vol. 4, no. 11, pp. 1437–1443, 2017.
  • [154] E. Nehme, L. E. Weiss, T. Michaeli, and Y. Shechtman, “Deep-STORM: super-resolution single-molecule microscopy by deep learning,” Optica, vol. 5, no. 4, pp. 458–464, 2018.
  • [155] A. Sinha, J. Lee, S. Li, and G. Barbastathis, “Lensless computational imaging through deep learning,” Optica, vol. 4, no. 9, pp. 1117–1125, 2017.
  • [156] C. H. McCollough, A. C. Bartley, R. E. Carter, B. Chen, T. A. Drees, P. Edwards, D. R. Holmes III, A. E. Huang, F. Khan, S. Leng et al., “Low-dose CT for the detection and classification of metastatic liver lesions: Results of the 2016 low dose CT grand challenge,” Medical physics, vol. 44, no. 10, pp. e339–e352, 2017.
  • [157] J. Zbontar, F. Knoll, A. Sriram, M. J. Muckley, M. Bruno, A. Defazio, M. Parente, K. J. Geras, J. Katsnelson, H. Chandarana et al., “fastMRI: An open dataset and benchmarks for accelerated MRI,” arXiv preprint arXiv:1811.08839, 2018.
  • [158] A. Beck and M. Teboulle, “A fast iterative shrinkage-thresholding algorithm for linear inverse problems,” SIAM journal on imaging sciences, vol. 2, no. 1, pp. 183–202, 2009.
  • [159] K. Gregor and Y. LeCun, “Learning fast approximations of sparse coding,” in Proceedings of the 27th International Conference on International Conference on Machine Learning. Omnipress, 2010, pp. 399–406.
  • [160] D. Ulyanov, A. Vedaldi, and V. Lempitsky, “Deep image prior,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 9446–9454.
  • [161] J. C. Ye, Y. Han, and E. Cha, “Deep convolutional framelets: A general deep learning framework for inverse problems,” SIAM J. Imaging Sci., vol. 11, no. 2, pp. 991–1048, Jan. 2018.
  • [162] J. C. Ye and W. K. Sung, “Understanding geometry of encoder-decoder CNNs,” arXiv preprint arXiv:1901.07647, 2019.
  • [163] K. Kwon, D. Kim, and H. Park, “A parallel MR imaging method using multilayer perceptron,” Medical physics, vol. 44, no. 12, pp. 6209–6224, 2017.
  • [164] H. Chen, Y. Zhang, W. Zhang, P. Liao, K. Li, J. Zhou, and G. Wang, “Low-dose CT via convolutional neural network,” Biomedical Optics Express, vol. 8, no. 2, pp. 679–694, 2017.
  • [165] D. Lee, J. Yoo, and J. C. Ye, “Deep residual learning for compressed sensing MRI,” in Biomedical Imaging (ISBI 2017), 2017 IEEE 14th International Symposium on. IEEE, 2017, pp. 15–18.
  • [166] Q. Yang, P. Yan, Y. Zhang, H. Yu, Y. Shi, X. Mou, M. K. Kalra, Y. Zhang, L. Sun, and G. Wang, “Low-dose CT image denoising using a generative adversarial network with Wasserstein distance and perceptual loss,” IEEE transactions on medical imaging, vol. 37, no. 6, pp. 1348–1357, 2018.
  • [167] H. Gupta, K. H. Jin, H. Q. Nguyen, M. T. McCann, and M. Unser, “CNN-based projected gradient descent for consistent image reconstruction,” arXiv preprint arXiv:1709.01809, 2017.
  • [168] T. M. Quan, T. Nguyen-Duc, and W.-K. Jeong, “Compressed sensing MRI reconstruction using a generative adversarial network with a cyclic loss,” IEEE Transactions on Medical Imaging, 2018.
  • [169] H. Chen, Y. Zhang, Y. Chen, J. Zhang, W. Zhang, H. Sun, Y. Lv, P. Liao, J. Zhou, and G. Wang, “LEARN: Learned experts’ assessment-based reconstruction network for sparse-data CT,” IEEE Transactions on Medical Imaging, 2018.
  • [170] H. K. Aggarwal, M. P. Mani, and M. Jacob, “MoDL: Model based deep learning architecture for inverse problems,” arXiv preprint arXiv:1712.02862, 2017.
  • [171] D. Wu, K. Kim, G. El Fakhri, and Q. Li, “Iterative low-dose CT reconstruction with priors trained by artificial neural network,” IEEE transactions on medical imaging, vol. 36, no. 12, pp. 2479–2486, 2017.
  • [172] K. Gong, J. Guan, K. Kim, X. Zhang, G. E. Fakhri, J. Qi, and Q. Li, “Iterative PET image reconstruction using convolutional neural network representation,” arXiv preprint arXiv:1710.03344, 2017.
  • [173] T. Würfl, F. C. Ghesu, V. Christlein, and A. Maier, “Deep learning computed tomography,” in International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, 2016, pp. 432–440.
  • [174] Y. Han and J. Ye, “k-space deep learning for accelerated MRI,” arXiv preprint arXiv:1805.03779, 2018.
  • [175] E. Cha, E. Y. Kim, and J. C. Ye, “k-space deep learning for parallel MRI: Application to time-resolved MR angiography,” arXiv preprint arXiv:1806.00806, 2018.
  • [176] J. Lee, Y. Han, and J. C. Ye, “k-space deep learning for reference-free EPI ghost correction,” arXiv preprint arXiv:1806.00153, 2018.
  • [177] Y. Han and J. C. Ye, “One network to solve all ROIs: Deep learning CT for any ROI using differentiated backprojection,” arXiv preprint arXiv:1810.00500, 2018.
  • [178] E. Kang, H. J. Koo, D. H. Yang, J. B. Seo, and J. C. Ye, “Cycle-consistent adversarial denoising network for multiphase coronary CT angiography,” Medical physics, vol. 46, no. 2, pp. 550–562, 2019.
  • [179] S. Ravishankar and Y. Bresler, “Adaptive sampling design for compressed sensing MRI,” in 2011 Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2011, pp. 3751–3755.
  • [180] B. Gözcü, R. K. Mahabadi, Y. Li, E. Ilıcak, T. Çukur, J. Scarlett, and V. Cevher, “Learning-based compressive MRI,” IEEE Transactions on Medical Imaging, vol. 37, no. 6, pp. 1394–1406, 2018.