
DIMENSION: Dynamic MR Imaging with Both K-space and Spatial Prior Knowledge Obtained via Multi-Supervised Network Training

09/30/2018
by   Shanshan Wang, et al.

Dynamic MR image reconstruction from incomplete k-space data has generated great research interest due to its capability to reduce scan time. Nevertheless, the reconstruction problem remains challenging due to its ill-posed nature. Most existing methods either suffer from long iterative reconstruction times or explore limited prior knowledge. This paper proposes a dynamic MR imaging method with both k-space and spatial prior knowledge integrated via multi-supervised network training, dubbed DIMENSION. Specifically, DIMENSION consists of a Fourier prior network for k-space completion and a spatial prior network for capturing image structures and details. Furthermore, a multi-supervised network training technique is developed to constrain the frequency domain information and the reconstruction results at different levels. Comparisons with k-t FOCUSS, k-t SLR, L+S and a state-of-the-art CNN method on in vivo datasets show that our method achieves improved reconstruction results in a shorter time.


I Introduction

Dynamic MR imaging is a non-invasive imaging technique that can provide both spatial and temporal information for the underlying anatomy. Nevertheless, both physiological and hardware constraints make it suffer from slow imaging speed and long imaging times, which may lead to patient discomfort or sometimes cause severe motion artifacts. Therefore, it is of great necessity to accelerate MR imaging.

To accelerate dynamic MR scans, there have been three main directions of effort: physics-based fast imaging sequences [1], hardware-based parallel imaging techniques [2], and signal-processing-based MR image reconstruction from incomplete k-space data. Our specific focus here is undersampled MR image reconstruction, which requires prior information to resolve the aliasing artifacts caused by the violation of the Nyquist sampling theorem. Specifically, the reconstruction task is normally formulated as an optimization problem with two terms, i.e. data fidelity and prior regularization. Popular prior information includes sparsity, which encourages the image to be sparsely represented in a certain transform domain while being reconstructed from incoherently undersampled k-space data. These techniques are well known as compressed sensing MRI (CS-MRI) [3, 4]. For example, k-t FOCUSS [5] is asymptotically optimal from the perspective of compressed sensing theory, using a FOCUSS algorithm with a random k-t sampling pattern; it encompasses the celebrated k-t BLAST and k-t SENSE [6] as special cases. Liang et al. [7] propose a k-t iterative support detection (k-t ISD) method to improve CS reconstruction for dynamic MR. Low-rankness is another popular prior regularization, which uses low-rank and incoherence conditions to complete missing or corrupted entries of a matrix. A typical low-rank example is L+S [8], which takes each temporal frame as a column of the space-time matrix for dynamic MRI, and k-t SLR [9] exploits the correlations in the dynamic imaging dataset by modeling the data to have a compact representation in the Karhunen-Loève transform (KLT) domain. Dictionary learning (DL) has also been investigated: DLTG [10] learns redundancy in the data via an auxiliary constraint on temporal gradient (TG) sparsity, and Wang et al. [11] employ a patch-based 3D spatiotemporal dictionary for sparse representations of dynamic image sequences.
These methods have made great progress in dynamic imaging and achieved improved results. Nevertheless, they only draw prior knowledge from limited samples, and the reconstruction is iterative and sometimes time-consuming.

On the other hand, deep learning has shown great potential in accelerating MR imaging. Quite a few methods have been proposed recently, which can be roughly categorized into two types: model-based unrolling methods [12, 13, 14] and end-to-end learning methods [15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25]. Model-based unrolling methods formulate the iterative procedure of a traditional optimization algorithm as network learning; they adaptively learn all the parameters of the regularization terms and transforms in the model by network training. For example, in VN-Net [13], generalized compressed sensing reconstruction formulated as a variational model is embedded in an unrolled gradient descent scheme. ADMM-Net [12] is defined over a data flow graph, which is derived from the iterative procedure of the Alternating Direction Method of Multipliers (ADMM) algorithm for optimizing a CS-based MRI model. The other type utilizes big data to learn a network that maps between undersampled and fully sampled data pairs. Wang et al. [15] train a deep convolutional neural network (CNN) to learn the mapping between undersampled and fully sampled brain MR images. AUTOMAP [18] learns a mapping between the sensor and image domains from appropriate training data.

Despite these successes, only two works specifically apply to dynamic MR imaging [22, 23]. Both use a cascade of neural networks to learn the mapping between undersampled and fully sampled cardiac MR images; for example, a deep cascade of convolutional neural networks (DC-CNN) [22] is designed for dynamic MR reconstruction. Both works make great contributions to dynamic MR imaging. Nevertheless, the reconstruction results can still be improved based on two observations. Firstly, they adopt a single supervised loss function which only considers the fidelity between the final output and the ground truth; the intermediate results are not utilized for supervision. Furthermore, only a data-consistency layer is considered for direct k-space correction. Previous research has shown that the combination of k-space domain networks and image domain networks is superior to a single-domain CNN [19]. There is still more valuable prior knowledge regarding k-space and the different levels of reconstruction to be utilized for accurate MR image reconstruction.

In this work, we propose a DynamIc MR imaging method with both k-spacE aNd Spatial prior knowledge integrated via multI-supervised netwOrk traiNing, dubbed DIMENSION. The improvements are mainly reflected in the cross-domain network structure and the multi-supervised loss function strategy. Our contributions can be summarized as follows:

  1. In the present study, a dynamic MR imaging method with both k-space and spatial prior knowledge integrated via multi-supervised network training is proposed, which combines frequency domain and spatial domain information effectively.

  2. We propose a multi-supervised loss function strategy, which constrains the frequency domain information and the reconstruction results at different levels. This loss function strategy is designed to obtain a better network-predicted k-space via the frequency domain learning, and to push the reconstruction results at different levels in the spatial domain closer to the fully sampled MR images.

  3. Experimental results show that the proposed method is superior to conventional CS-based methods such as k-t FOCUSS, k-t SLR and L+S, as well as the state-of-the-art CNN-based method DC-CNN. These results demonstrate the effectiveness of cross-domain learning and the multi-supervised loss function strategy in cardiac MR imaging.

II Methodology

II-A CS-MRI and CNN-MRI

According to compressed sensing (CS) [3, 4], MR images with a sparse representation in some transform domain can be reconstructed from randomly undersampled k-space data. Let S denote a complex-valued dynamic MR image sequence. The acquisition can be described by the following formula:

y = F_u S + n    (1)

where y denotes the undersampled measurements in k-space, with the unsampled points filled with zeros, F_u is an undersampled Fourier encoding matrix, and n is the acquisition noise. We want to reconstruct S by solving the inverse problem of Eq. 1. However, this inverse problem is ill-posed, so the reconstruction is not unique. In order to reconstruct S, we constrain the inverse problem with prior knowledge and solve the following optimization problem:

min_S ||F_u S - y||_2^2 + λ R(S)    (2)

The first term is the data fidelity, which ensures that the k-space of the reconstruction is consistent with the actual measurements in k-space. The second term R(S) is often referred to as the prior regularization, weighted by λ. In CS methods, R(S) is usually a sparse prior of S in some transform domain, e.g. finite differences, the wavelet transform or the discrete cosine transform.

In CNN-based methods, R(S) is a CNN prior of S, which forces S to match the output of the network:

R(S) = ||S - f_cnn(S_u; θ)||_2^2    (3)

where S_u is the undersampled (zero-filled) image and f_cnn(S_u; θ) is the output of the network under the parameters θ. The training process of the network is to find the optimal parameters θ. Once the network is trained, the network's output is the desired reconstruction.
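To make the formulation concrete, the acquisition model of Eq. 1 and the zero-filled baseline reconstruction can be sketched in NumPy. This is an illustrative toy (the image, mask and array sizes are arbitrary assumptions), not the paper's implementation.

```python
import numpy as np

def undersample(image, mask):
    """Simulate Eq. 1 (noise omitted): y = F_u S, with unsampled
    k-space points filled with zeros, as described in the text."""
    kspace = np.fft.fft2(image)   # full Fourier encoding
    return kspace * mask          # keep only the sampled lines

def zero_filled_recon(y):
    """Naive inverse: IFFT of the zero-filled k-space (the aliased baseline S_u)."""
    return np.fft.ifft2(y)

# Toy example: a 16x16 complex "image" and a mask sampling every other row.
rng = np.random.default_rng(0)
image = rng.standard_normal((16, 16)) + 1j * rng.standard_normal((16, 16))
mask = np.zeros((16, 16)); mask[::2, :] = 1   # 2-fold undersampling

y = undersample(image, mask)
recon = zero_filled_recon(y)
```

Note that the zero-filled reconstruction is exactly data-consistent at the sampled locations but aliased elsewhere, which is the gap the prior R(S) must fill.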

II-B The Proposed Method

II-B1 The Proposed DIMENSION Network

In this work, we propose a convolutional neural network termed DIMENSION for cardiac MR image reconstruction, shown in Fig. 1. The DIMENSION network consists of two main parts: a frequency domain network (FDN) for updating the k-space with its network prediction, and a spatial domain network (SDN), which is used to extract high-level features of images. The FDN and the SDN are connected by a Fourier inversion (see Inverse Fast Fourier Transform (IFFT) in Fig. 1).

Fig. 1: The proposed DIMENSION network architecture for cardiac MR reconstruction.

Specifically, the FDN consists of N_f frequency domain blocks. Each block contains L 3D convolutional layers and a k-space domain data consistency (KDC) layer. The forward pass starts from the undersampled k-space y. The forward-pass equations of the first block in the FDN can be described as:

h_1^1 = σ(w_1^1 * y + b_1^1)
h_l^1 = σ(w_l^1 * h_{l-1}^1 + b_l^1),  l = 2, ..., L-1
ŷ^1 = w_L^1 * h_{L-1}^1 + b_L^1
ȳ^1 = KDC(ŷ^1)    (4)

The forward-pass equations of the later blocks (n = 2, ..., N_f) in the FDN can be described as:

h_1^n = σ(w_1^n * ȳ^{n-1} + b_1^n)
h_l^n = σ(w_l^n * h_{l-1}^n + b_l^n),  l = 2, ..., L-1
ŷ^n = w_L^n * h_{L-1}^n + b_L^n
ȳ^n = KDC(ŷ^n)    (5)

where the KDC is defined as the following equation:

ȳ^n(k) = ŷ^n(k),                      k ∉ Ω
ȳ^n(k) = (ŷ^n(k) + v y(k)) / (1 + v),  k ∈ Ω    (6)

Here w_l^n and b_l^n are the l-th convolution filters and biases in the n-th block of the FDN, with l = 1, ..., L, and h_l^n denotes the output of the l-th convolutional layer in the n-th block of the FDN. Each convolutional layer is followed by an activation function σ for nonlinearity, except for the last layer, which projects the extracted features back to the k-space domain. After the convolution operations, k-space domain data consistency is applied to correct the network-predicted k-space ŷ^n with the actual sampled k-space y, as shown in Eq. 6, where ȳ^n denotes the correction of ŷ^n. The set of sampled k-space indices is denoted Ω. If a k-space index k is in Ω, ŷ^n(k) is corrected with the actual sampled value y(k). The parameter v controls the degree of data consistency; if v → ∞, the predicted values at the sampled points are replaced directly with the actual measurements.
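The KDC operation of Eq. 6 reduces to a simple pointwise blend. The following NumPy sketch (function name and arguments are our own, not the paper's code) shows both the finite-v blend and the v → ∞ replacement case.

```python
import numpy as np

def kdc(pred_kspace, sampled_kspace, mask, v=np.inf):
    """k-space data consistency (sketch of Eq. 6).

    At sampled locations (mask == 1) the network prediction is blended with
    the actual measurement; v controls the degree of data consistency, and
    v -> inf takes the sampled points directly from the measurements.
    """
    if np.isinf(v):
        return pred_kspace * (1 - mask) + sampled_kspace * mask
    blended = (pred_kspace + v * sampled_kspace) / (1 + v)
    return pred_kspace * (1 - mask) + blended * mask
```

For example, with a prediction of 2.0 everywhere, a measurement of 6.0 on the sampled lines and v = 1, the sampled locations become (2 + 6)/2 = 4, while unsampled locations keep the prediction.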

The final output of the FDN is ȳ^{N_f}. The inverse Fourier transform of ȳ^{N_f} is then performed to obtain the MR image, which is also the input of the SDN:

s^0 = F^{-1}(ȳ^{N_f})    (7)

The SDN consists of N_s image-domain blocks, each of which contains L 3D convolutional layers, a residual connection [26] and an image-domain data consistency (IDC) layer. The forward-pass equations of the first block in the SDN are:

g_1^1 = σ(u_1^1 * s^0 + c_1^1)
g_l^1 = σ(u_l^1 * g_{l-1}^1 + c_l^1),  l = 2, ..., L-1
r^1 = u_L^1 * g_{L-1}^1 + c_L^1 + s^0
s^1 = IDC(r^1)    (8)

The forward-pass equations of the later blocks (n = 2, ..., N_s) in the SDN can be described as:

g_1^n = σ(u_1^n * s^{n-1} + c_1^n)
g_l^n = σ(u_l^n * g_{l-1}^n + c_l^n),  l = 2, ..., L-1
r^n = u_L^n * g_{L-1}^n + c_L^n + s^{n-1}
s^n = IDC(r^n)    (9)

where the IDC is defined as the following three steps:

k^n = F(r^n),  k̄^n = KDC(k^n),  s^n = F^{-1}(k̄^n)    (10)

Here u_l^n and c_l^n are the l-th convolution filters and biases in the n-th block of the SDN, with l = 1, ..., L, and g_l^n denotes the output of the l-th convolutional layer in the n-th block. As in the FDN, each convolutional layer is followed by the activation function σ except for the last convolutional layer, which projects the extracted spatial features back to the image domain. After the convolution operations, a residual connection sums the output of each block with its input; r^n is the result of this residual learning. Then image-domain data consistency is performed on r^n to obtain the corrected s^n, as shown in Eq. 10, whose three steps respectively perform the fast Fourier transform (FFT), the k-space correction, and the inverse fast Fourier transform (IFFT). Note that no such domain transformations are needed in the KDC layer, because the FDN output is already a prediction of the k-space. Through the above processes, we obtain reconstruction results s^1, ..., s^{N_s} from different network depths, i.e. reconstructions at different levels. For example, s^1 comes from the shallowest block, so its reconstruction level is the lowest, while s^{N_s} is the final output of the SDN, which is at the highest reconstruction level.
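The three IDC steps of Eq. 10 can be sketched as follows. This is an illustrative NumPy version (using the v → ∞ correction for simplicity), not the paper's TensorFlow layer.

```python
import numpy as np

def idc(image_pred, sampled_kspace, mask):
    """Image-domain data consistency (sketch of Eq. 10), three steps:
    FFT to k-space, correction at the sampled locations, IFFT back."""
    k = np.fft.fft2(image_pred)                   # step 1: FFT
    k = k * (1 - mask) + sampled_kspace * mask    # step 2: k-space correction
    return np.fft.ifft2(k)                        # step 3: IFFT
```

After IDC, the output image is exactly consistent with the measurements at the sampled k-space locations, whatever the block predicted.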

In [22], a DC-CNN model was proposed to reconstruct undersampled cardiac MR images. The DC-CNN model consists of image-domain blocks, each of which contains convolutional layers, a residual connection and an image-domain data consistency (IDC) layer. Although favorable reconstruction quality has been achieved, there is still valuable k-space prior knowledge that can be learned and utilized to improve MR image reconstruction. The DIMENSION model introduces k-space learning into the network, so that the network can not only extract features in the spatial domain, but also make better use of k-space prior knowledge.

II-B2 The Proposed Multi-Supervised Loss

In this work, we introduce a multi-supervised loss function strategy. The multi-supervised loss is composed of a primary loss, a k-space loss and a spatial loss, as shown in Fig. 2. In the usual CNN-MRI models, only the primary loss exists, which measures the distance between the reconstruction and the ground truth. In this work, the k-space loss and the spatial loss are proposed to constrain the frequency domain information and the reconstruction results at different levels. Let Y be the fully sampled k-space. The k-space loss can be expressed as:

L_kspace = ||ȳ^{N_f} - Y||_2^2    (11)

And the spatial loss can be expressed as:

L_spatial = Σ_{i=1}^{N_s-1} λ_i ||s^i - S||_2^2    (12)

where the λ_i are the respective nonnegative weights of the intermediate reconstructions. The primary loss is the mean square error (MSE) between the final reconstruction s^{N_s} and the corresponding fully sampled image S:

L_primary = ||s^{N_s} - S||_2^2    (13)

Finally, the total training loss consists of these three terms:

L = L_primary + τ L_kspace + L_spatial    (14)

where τ is the nonnegative weight of the k-space loss. The weights in Eq. 14 will be discussed in a later section. To facilitate a better understanding of this model, we provide an intuitive explanation of the k-space loss and the spatial loss.
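The three terms of Eq. 14 can be combined as in the following NumPy sketch; the function and argument names are illustrative assumptions, not the paper's training code.

```python
import numpy as np

def multi_supervised_loss(pred_kspace, full_kspace, block_outputs, full_image,
                          tau, lams):
    """Sketch of the total loss in Eq. 14: primary MSE on the final output,
    plus a weighted k-space MSE (Eq. 11) and weighted spatial MSEs on the
    intermediate block outputs (Eq. 12)."""
    mse = lambda a, b: float(np.mean(np.abs(a - b) ** 2))
    primary = mse(block_outputs[-1], full_image)          # Eq. 13
    kspace = mse(pred_kspace, full_kspace)                # Eq. 11
    spatial = sum(l * mse(s, full_image)                  # Eq. 12
                  for l, s in zip(lams, block_outputs[:-1]))
    return primary + tau * kspace + spatial               # Eq. 14
```

With perfect predictions every term vanishes; any intermediate block that drifts from the ground truth adds its own weighted penalty, which is the point of the multi-supervision.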

Fig. 2: The multi-supervised loss in DIMENSION networks.

The k-space loss: The k-space points can be divided into two subsets: if a point k has already been sampled, k is a member of the set Ω, otherwise k ∉ Ω. The purpose of the FDN is to update the k-space with its network prediction. The quality of this prediction directly affects the image-domain feature extraction in the SDN and the final reconstruction. Therefore, it is necessary to enhance the fidelity between the network-predicted k-space and the fully sampled k-space. The combination of the KDC and the k-space loss improves the quality of the predicted k-space: if k ∈ Ω, the KDC layer ensures that the predicted k-space is consistent with the actual sampled k-space; if k ∉ Ω, the k-space loss drives the unsampled k-space points as close as possible to the fully sampled k-space.

The spatial loss: The outputs of the spatial blocks at different network depths can be regarded as reconstructions at different levels, and the final output of the entire network can be seen as the final-level reconstruction. Previous studies, however, only constrain the final-level reconstruction and use it as the supervision of the entire network; those approaches do not take full advantage of the reconstruction information at the other levels. Here, we put forward a spatial loss, which constrains the reconstructions at different levels to be closer to the ground truth. As can be seen in Eq. 14, the reconstruction at each level contributes to the final result.

III Experimental Results

III-A Setup

III-A1 Data acquisition

We collected 101 fully sampled cardiac MR datasets on a 3T scanner (SIEMENS MAGNETOM Trio) with a T1-weighted FLASH sequence. Written informed consent was obtained from all human subjects. Each scan contains a single-slice FLASH acquisition with 25 temporal frames. The following parameters were used for the FLASH scans: FOV mm, acquisition matrix, slice thickness = 6 mm, TR = 3 ms, TE = 50 ms and 24 receiving coils. The raw multi-coil data of each frame were combined by an adaptive coil combination method [27] to produce a single-channel complex-valued image. We randomly selected 90% of the entire dataset for training and 10% for testing. Since deep learning has a high demand for data volume [28], some data augmentation strategies were applied: we shear the original images along the x, y and temporal directions, with a sheared size of 117×120×6 (matching the network input size in Tables I and II) and strides of 7, 7 and 5 along the three directions, respectively. Finally, we obtained 17500 3D complex-valued cardiac MR data of size 117×120×6.
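The shearing augmentation described above amounts to sliding-window patch extraction. The following sketch (the helper name is our own) uses the stated patch size and strides as defaults.

```python
import numpy as np

def shear_patches(volume, patch_size=(117, 120, 6), strides=(7, 7, 5)):
    """Sliding-window 'shearing' along x, y and t, as described in the text.
    Defaults follow the paper's stated patch size and strides."""
    X, Y, T = volume.shape
    px, py, pt = patch_size
    sx, sy, st = strides
    patches = []
    for x in range(0, X - px + 1, sx):
        for y in range(0, Y - py + 1, sy):
            for t in range(0, T - pt + 1, st):
                patches.append(volume[x:x+px, y:y+py, t:t+pt])
    return patches
```

Overlapping patches both enlarge the training set and expose the network to shifted copies of the same anatomy.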

For each frame, the original k-space was retrospectively undersampled with 6 ACS lines. Specifically, we fully sample the frequency encodes (along kx) and randomly undersample the phase encodes (along ky) according to a zero-mean Gaussian variable density function [5].
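A Cartesian variable-density pattern of this kind can be sketched as below; the ACS placement at the k-space center, the Gaussian width sigma_frac and the helper name are illustrative assumptions, not the paper's sampling code.

```python
import numpy as np

def cartesian_vd_mask(ny, n_lines, n_acs=6, sigma_frac=0.25, seed=0):
    """1-D Cartesian undersampling sketch: fully sample n_acs central (ACS)
    phase-encode lines, then draw the remaining lines according to a
    zero-mean Gaussian density over ky."""
    rng = np.random.default_rng(seed)
    center = ny // 2
    mask = np.zeros(ny, dtype=bool)
    mask[center - n_acs // 2 : center - n_acs // 2 + n_acs] = True  # ACS lines
    # Gaussian weights over ky, centred on the k-space centre
    ky = np.arange(ny) - center
    w = np.exp(-(ky ** 2) / (2 * (sigma_frac * ny) ** 2))
    w[mask] = 0                      # already sampled
    w = w / w.sum()
    extra = rng.choice(ny, size=n_lines - n_acs, replace=False, p=w)
    mask[extra] = True
    return mask
```

With ny = 120 and n_lines = 30 this corresponds to 4-fold acceleration along the phase-encode direction.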

III-A2 Network training

For network training, we split each datum into two channels storing the real and imaginary parts, so the inputs of the network are undersampled k-spaces and the outputs are reconstructed images. In this work, we focus on a D5C5 model, which works well for the DC-CNN model: the D5C5 model consists of five blocks (C5) and each block has five convolutional layers (D5). In order to match the number of parameters and make a fair comparison with the D5C5 model, the FDN contains one frequency domain block and the SDN consists of four image-domain blocks, each with five convolutional layers; therefore, both the proposed model and the D5C5 model have 25 convolutional layers in total. The details of the FDN and the SDN are shown in Tables I and II respectively. He initialization [29] was used to initialize the network weights, and Rectified Linear Units (ReLU) [30] were selected as the nonlinear activation functions. The mini-batch size was 20. An exponentially decaying learning rate [31] was used in all CNN-based experiments, with an initial learning rate of 0.0001 and a decay of 0.95. All models were trained with the Adam optimizer [32].
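The exponential-decay schedule with the stated initial rate and decay factor can be written as a one-liner; the decay_steps granularity is an assumption, since the text does not state how often the decay is applied.

```python
def exp_decay_lr(step, initial_lr=1e-4, decay_rate=0.95, decay_steps=1000):
    """Exponentially decayed learning rate (sketch): lr = lr0 * rate^(step/steps).
    decay_steps is an assumed granularity, not stated in the text."""
    return initial_lr * decay_rate ** (step / decay_steps)
```

For example, after one decay period the rate drops from 1e-4 to 9.5e-5, and it keeps shrinking geometrically thereafter.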

Layer of K5 Input size Number of filter Filter size Stride Activation Output
Complex conv1 117*120*6*2 64 3*3*3 [1, 1, 1] ReLU 117*120*6*64
Complex conv2 117*120*6*64 64 3*3*3 [1, 1, 1] ReLU 117*120*6*64
Complex conv3 117*120*6*64 64 3*3*3 [1, 1, 1] ReLU 117*120*6*64
Complex conv4 117*120*6*64 64 3*3*3 [1, 1, 1] ReLU 117*120*6*64
Complex conv5 117*120*6*64 2 3*3*3 [1, 1, 1] None 117*120*6*2
Kspace data consistency 117*120*6*2 / / / / 117*120*6*2
TABLE I: The parameter settings of each block in the FDN.
Layer of D5C4 Input size Number of filter Filter size Stride Activation Output
Complex conv1 117*120*6*2 64 3*3*3 [1, 1, 1] ReLU 117*120*6*64
Complex conv2 117*120*6*64 64 3*3*3 [1, 1, 1] ReLU 117*120*6*64
Complex conv3 117*120*6*64 64 3*3*3 [1, 1, 1] ReLU 117*120*6*64
Complex conv4 117*120*6*64 64 3*3*3 [1, 1, 1] ReLU 117*120*6*64
Complex conv5 117*120*6*64 2 3*3*3 [1, 1, 1] None 117*120*6*2
Residual 117*120*6*2 / / / / 117*120*6*2
Kspace data consistency 117*120*6*2 / / / / 117*120*6*2
TABLE II: The parameter settings of each block in the SDN.

The models were implemented on an Ubuntu 16.04 LTS (64-bit) operating system equipped with an Intel Xeon E5-2640 Central Processing Unit (CPU) and a Tesla TITAN Xp Graphics Processing Unit (GPU, 12GB memory) in the open framework TensorFlow [33] with CUDA and cuDNN support.

III-A3 Performance evaluation

For quantitative evaluation, the mean square error (MSE), peak signal-to-noise ratio (PSNR) and structural similarity index (SSIM) [34] were measured as follows:

MSE = (1/N) Σ_i |Ŝ(i) - S(i)|^2    (15)

PSNR = 20 log10( max(|S|) / sqrt(MSE) )    (16)

SSIM(Ŝ, S) = (2 μ_Ŝ μ_S + c_1)(2 σ_ŜS + c_2) / ((μ_Ŝ^2 + μ_S^2 + c_1)(σ_Ŝ^2 + σ_S^2 + c_2))    (17)

where Ŝ is the reconstructed image, S denotes the reference image and N is the total number of image pixels. The SSIM index is a multiplicative combination of a luminance term, a contrast term and a structural term, with μ, σ the local means, standard deviations and cross-covariance and c_1, c_2 stabilizing constants (details shown in [34]).
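The MSE and PSNR of Eqs. 15-16 are a few lines of NumPy (SSIM is best taken from a standard implementation such as the one accompanying [34]); the sketch below assumes the peak value is taken from the reference image.

```python
import numpy as np

def mse(ref, rec):
    """Mean squared error over all pixels (Eq. 15)."""
    return np.mean(np.abs(rec - ref) ** 2)

def psnr(ref, rec):
    """Peak signal-to-noise ratio in dB (Eq. 16); the peak is the
    maximum magnitude of the reference image (an assumption)."""
    peak = np.max(np.abs(ref))
    return 20 * np.log10(peak / np.sqrt(mse(ref, rec)))
```

For instance, a reconstruction that uniformly underestimates a unit-magnitude reference by 10% has MSE 0.01 and PSNR 20 dB.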

III-B Does the Frequency Domain Network Work?

To demonstrate the efficacy of the FDN, we compare the DIMENSION model with the state-of-the-art CNN method D5C5 [22], setting the weights of the k-space loss and the spatial loss to zero. The total loss function in this section is therefore just the primary loss:

L = L_primary = ||s^{N_s} - S||_2^2    (18)

The two models have the same number of network parameters, and for a fair comparison the networks' hyperparameters are also set to be the same. The reconstruction results of the D5C5 and DIMENSION models on the test datasets are shown in Fig. 3.

Fig. 3: The reconstruction results of the D5C5 model and the DIMENSION model. (a) ground truth, (b) zero-filling, (c) mask and its k-t extraction, (d) the D5C5 reconstruction, (g) the DIMENSION reconstruction, (e) and (h) their corresponding error maps with display ranges [0, 0.07]. (f) Extractions of slice along y and temporal dimensions (y-t), from top (f1) to bottom (f7): (f1) ground truth, (f2) zero-filling image, (f3) error map of zero-filling, (f4) the D5C5 reconstruction, (f5) error map of the D5C5, (f6) the DIMENSION reconstruction, (f7) error map of the DIMENSION.

The display ranges for the error maps are [0, 0.07]. At 4-fold acceleration, one can see that the DIMENSION reconstruction outperforms the D5C5 reconstruction in terms of artifact removal and detail preservation (see the arrows in Fig. 3 (e) and (h)). To explore the reconstruction in the temporal direction, we show slices along the y and temporal dimensions (y-t). From these images (Fig. 3 (f)), we can see that the DIMENSION reconstruction has a smaller error map, which is consistent with the above conclusion. We also show the quantitative evaluations of the DIMENSION and D5C5 models in Table III.

Models MSE PSNR SSIM
Zero-filling
D5C5
DIMENSION 0.000176 37.5857 0.9846
  • Bold indicates the best result.

TABLE III: The MSE, PSNR and SSIM of zero-filling, D5C5 and DIMENSION.

One can see that the DIMENSION model achieves better quantitative results than the D5C5 model (0.000128 lower in MSE, 2.4163 dB higher in PSNR and 0.071 higher in SSIM). So the DIMENSION model is superior to the D5C5 model in both visual results and quantitative indicators.

This indicates the DIMENSION model could effectively learn cross-domain information and improve MR reconstruction by using both k-space and spatial prior knowledge. Subsequent experiments are all constructed based on the DIMENSION model.

III-C Does the K-space Loss Work?

This section explores whether the k-space loss can improve the reconstruction results. The purpose of introducing the k-space loss is to further constrain the FDN to obtain a better network-predicted k-space. The k-space loss is shown in Eq. 11: we select the MSE between the predicted k-space and the fully sampled k-space as the k-space loss, and the total loss function in this section is:

L = L_primary + τ L_kspace    (19)

where τ is a hyperparameter (its selection is discussed in Section IV-A). We refer to the DIMENSION model that introduces the k-space loss as DIMENSION-KLoss. The comparison results of the DIMENSION model and the DIMENSION-KLoss model are shown in Fig. 4.

Fig. 4: The reconstructions of the DIMENSION model and the DIMENSION-KLoss model. (a) ground truth, (b) zero-filling image, (c) mask and its k-t extraction, (d) DIMENSION reconstruction, (g) DIMENSION-KLoss reconstruction, (e) and (h) their corresponding error maps with display ranges [0, 0.07]. (f) Extractions of slice along y and temporal dimensions (y-t), from top (f1) to bottom (f7): (f1) ground truth, (f2) zero-filling image, (f3) error map of zero-filling, (f4) DIMENSION reconstruction, (f5) error map of DIMENSION, (f6) DIMENSION-KLoss reconstruction, (f7) error map of DIMENSION-KLoss.

The DIMENSION-KLoss model clearly achieves better results, especially in removing artifacts; the same conclusion can be drawn from the y-t images (Fig. 4 (f)). The quantitative measurements can be found in Table IV.

Models MSE PSNR SSIM
Zero-filling
DIMENSION
DIMENSION-KLoss 0.000061 41.8101 0.9939
  • Bold indicates the best result.

TABLE IV: The MSE, PSNR and SSIM of zero-filling, DIMENSION and DIMENSION-KLoss.

Introducing the k-space loss improves the MSE, PSNR and SSIM (0.000044 lower in MSE, 2.3390 dB higher in PSNR and 0.0029 higher in SSIM). Therefore, the k-space loss can effectively improve the cardiac MR reconstruction: through the joint action of the KDC layer and the k-space loss, a more accurate predicted k-space can be obtained.

III-D Does the Spatial Loss Work?

In this section, we examine whether the spatial loss can improve the MR reconstruction. The spatial loss (Eq. 12) constrains the reconstruction results of the different blocks with individual weights: the outputs of the shallow blocks can be regarded as preliminary reconstructions, and the output of the last block as the final reconstruction. To demonstrate the effectiveness of the spatial loss, we use the combination of the primary loss and the spatial loss as the total loss in this section:

L = L_primary + Σ_{i=1}^{3} λ_i ||s^i - S||_2^2    (20)

Here, the weights λ_i are chosen as discussed in Section IV-A, and we refer to the DIMENSION model with the spatial loss as DIMENSION-SLoss. Fig. 5 shows the reconstruction results of the DIMENSION model and the DIMENSION-SLoss model.

Fig. 5: The reconstructions of the DIMENSION model and the DIMENSION-SLoss model. (a) ground truth, (b) zero-filling image, (c) mask and its k-t extraction, (d) DIMENSION reconstruction, (g) DIMENSION-SLoss reconstruction, (e) and (h) their corresponding error maps with display ranges [0, 0.07]. (f) Extractions of slice along y and temporal dimensions (y-t), from top (f1) to bottom (f7): (f1) ground truth, (f2) zero-filling image, (f3) error map of zero-filling, (f4) DIMENSION reconstruction, (f5) error map of DIMENSION, (f6) DIMENSION -SLoss reconstruction, (f7) error map of DIMENSION-SLoss.

We can clearly see from the error maps that the DIMENSION-SLoss model has fewer artifacts and retains details better (as shown by the red and yellow arrows in Fig. 5 (e), (h), (f5) and (f7)).

Models MSE PSNR SSIM
Zero-filling
DIMENSION
DIMENSION-SLoss 0.000214 36.6892 0.9764
  • Bold indicates the best result.

TABLE V: The MSE, PSNR and SSIM of zero-filling, DIMENSION and DIMENSION-SLoss.

The quantitative measurements can be found in Table V, from which we can see that the spatial loss can effectively improve the MSE, PSNR and SSIM (0.000047 lower in MSE, 0.8588dB higher in PSNR and 0.0026 higher in SSIM). Therefore, we can see that the spatial loss can further improve the cardiac MR reconstruction.

III-E Comparison to the State-of-the-art Methods

Based on the above sections, the network we use is the DIMENSION model (as shown in Fig. 1) and the multi-supervised loss we select is:

L = L_primary + τ L_kspace + Σ_{i=1}^{3} λ_i ||s^i - S||_2^2    (21)

In this section, we demonstrate the effectiveness of the proposed method by comparing it with traditional CS methods (k-t FOCUSS [5], k-t SLR [9], L+S [8]) and the state-of-the-art CNN method (D5C5 [22]). All the CS-based methods are used in their single-channel versions. For fair comparisons, we tune the parameters of the CS-MRI methods to their best performance. The reconstruction results at 4-fold acceleration are shown in Fig. 6.

Fig. 6: The comparison of cardiac MR reconstructions from different methods (k-t FOCUSS, k-t SLR, L+S, D5C5, the proposed method). (a) ground truth, (b) zero-filling image, (c) mask and its k-t extraction (d) k-t FOCUSS reconstruction, (e) k-t SLR reconstruction, (f) L+S reconstruction, (g) D5C5 reconstruction, (h) the proposed method reconstruction; (i), (j), (k), (l) and (m) their corresponding error maps with display ranges [0, 0.07].

k-t SLR removes artifacts better than k-t FOCUSS and L+S. However, these three CS-based methods lose more structural details than the CNN-based methods. Compared with the D5C5 method, the proposed method not only retains more details, but also removes more artifacts.

We show the evaluation indexes in Table VI.

Models MSE PSNR SSIM
Zero-filling
k-t FOCUSS
k-t SLR
L+S
D5C5
The proposed 0.000086 40.6256 0.9655
  • Bold indicates the best result.

TABLE VI: The MSE, PSNR and SSIM of zero-filling, k-t FOCUSS, k-t SLR, L+S, D5C5 and the proposed method.

Note that the CNN-based methods outperform the CS-based methods in all three performance indexes, and the proposed DIMENSION model achieves the best MSE, PSNR and SSIM among all the methods. The reconstruction times of the different methods are shown in Table VII. For fair comparison, all methods were run on the same Intel Xeon E5-2640 CPU. It can be seen that the reconstruction time of the CNN-based methods is significantly shorter than that of the CS-based methods.

Methods k-t SLR L+S D5C5 The proposed
Running time/s 4.8845
  • Bold indicates the best result.

TABLE VII: The reconstruction times of k-t SLR, L+S, D5C5 and the proposed method.

At 8-fold acceleration, the reconstruction results of the different methods are shown in Fig. 7. At high acceleration factors, the single-channel CS-based methods have difficulty reconstructing high-quality images, while the CNN-based methods can still obtain good reconstructions. Compared with the D5C5 method, the proposed method not only retains the details but also removes the artifacts better; the improvements of the CNN-based methods are more obvious at high acceleration factors. The evaluation indexes are shown in Table VIII, where the proposed method again achieves the best MSE, PSNR and SSIM among all the methods.

Fig. 7: The comparison of cardiac MR reconstructions from different methods (k-t FOCUSS, k-t SLR, D5C5, the proposed method) at 8-fold acceleration. (a) ground truth, (b) zero-filling image, (c) mask and its k-t extraction, (d) k-t FOCUSS reconstruction, (e) k-t SLR reconstruction, (f) D5C5 reconstruction, (g) the proposed method reconstruction; (h), (i), (j), (k) their corresponding error maps with display ranges [0, 0.07].
Models MSE PSNR SSIM
Zero-filling
k-t FOCUSS
k-t SLR
D5C5
DIMENSION 0.000381 32.7019 0.9690
  • Bold indicates the best result.

TABLE VIII: The MSE, PSNR and SSIM of zero-filling, k-t FOCUSS, k-t SLR, D5C5 and DIMENSION at 8-fold acceleration.

IV Discussion

IV-A Hyper-parameter Selection

The choice of the weights for the k-space loss and the spatial loss influences the reconstruction [28]. In this section, we discuss how these weights are selected.

IV-A1 The Selection of the K-space Loss Weight

In order to find an appropriate k-space loss weight, we trained a series of DIMENSION models with different weight values and then tested them on the same test set. The average MSE, PSNR and SSIM on the test set are shown in Fig. 8.

Fig. 8: The histograms of the average MSE, PSNR and SSIM on the test set for different k-space loss weights.

We can observe that one particular weight value yields the lowest MSE and the highest PSNR and SSIM, so we adopt it as the k-space loss weight.
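The sweep described above amounts to a simple grid search over candidate weights. A minimal sketch, where `train_and_evaluate` is a placeholder for the paper's full training/evaluation pipeline and the candidate values and toy score below are purely illustrative:

```python
import numpy as np

def select_kspace_weight(candidates, train_and_evaluate):
    """Train one model per candidate weight; keep the best validation PSNR.

    `train_and_evaluate` is a hypothetical callable mapping a weight to a
    scalar score (higher is better, e.g. validation PSNR).
    """
    results = {w: train_and_evaluate(w) for w in candidates}
    best = max(results, key=results.get)
    return best, results

# Toy stand-in score: peaks at w = 1e-2 (values are illustrative only).
toy_score = lambda w: -abs(np.log10(w) + 2.0)
best, _ = select_kspace_weight([1e-4, 1e-3, 1e-2, 1e-1, 1.0], toy_score)
```

In practice each call to `train_and_evaluate` is a full training run, so the candidate grid is kept coarse (e.g. logarithmically spaced).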

IV-A2 The Selection of the Spatial Loss Weights

There are two ways to select the three spatial loss weights: one is to assign the same value to all three; the other is to assign increasing or decreasing values.

First, we consider the case where all three weights take the same value. We trained a series of DIMENSION models with different shared weight values and tested them all on the same test set. The quantitative results are shown in Fig. 9.

Fig. 9: The histograms of the average MSE, PSNR and SSIM on the test set for different shared spatial loss weights.

We can see that one shared weight value gives the best performance on the test set, with the lowest MSE and the highest PSNR and SSIM. We refer to the DIMENSION model with this setting as DIMENSION-SLoss1.

Second, we consider the case where the three weights take increasing or decreasing values. The weight settings are listed in Table IX.

Case 1 Case 2 Case 3 Case 4 Case 5 Case 6 Case 7 Case 8 Case 9 Case 10
TABLE IX: Ten cases of different spatial loss weight values.

We considered ten cases in total, with increasing weights in the first five and decreasing weights in the last five. The quantitative results are shown in Fig. 10.

Fig. 10: The histograms of the average MSE, PSNR and SSIM on the test set for the weight settings listed in Table IX.

The ninth case clearly gives the best quantitative results, with the lowest MSE and the highest PSNR and SSIM, so we choose its weights. We refer to the DIMENSION model with this setting as DIMENSION-SLoss2.
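The multi-supervised spatial loss discussed above can be viewed as a weighted sum of per-level MSE terms, one per intermediate reconstruction. A minimal numpy illustration, with `multi_supervised_spatial_loss` a hypothetical name and the actual training framework (TensorFlow in the paper) abstracted away:

```python
import numpy as np

def multi_supervised_spatial_loss(intermediates, target, weights):
    """Weighted sum of MSE terms between each intermediate reconstruction
    and the ground truth.

    Equal weights correspond to the SLoss1 scheme; increasing or decreasing
    weight schedules correspond to the SLoss2-style schemes.
    """
    assert len(intermediates) == len(weights)
    target = np.asarray(target, dtype=np.float64)
    return sum(
        w * np.mean((np.asarray(x, dtype=np.float64) - target) ** 2)
        for x, w in zip(intermediates, weights)
    )
```

A deeper level that is weighted more heavily receives stronger supervision, which is how increasing schedules emphasize the final reconstruction.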

Then it is natural to ask which of the two selection strategies works better. The quantitative and reconstruction results of the DIMENSION-SLoss1 and DIMENSION-SLoss2 models are shown in Fig. 11 and Table X, respectively.

Fig. 11: The reconstructions of the DIMENSION-SLoss1 and DIMENSION-SLoss2 models. (a) ground truth, (b) zero-filling image, (c) mask and its k-t extraction, (d) DIMENSION-SLoss1 reconstruction, (g) DIMENSION-SLoss2 reconstruction, (e) and (h) their corresponding error maps with display range [0, 0.07]. (f) Extractions of a slice along the y and temporal dimensions (y-t), from top (f1) to bottom (f7): (f1) ground truth, (f2) zero-filling image, (f3) error map of zero-filling, (f4) DIMENSION-SLoss1 reconstruction, (f5) error map of DIMENSION-SLoss1, (f6) DIMENSION-SLoss2 reconstruction, (f7) error map of DIMENSION-SLoss2.

Models MSE PSNR SSIM
Zero-filling
DIMENSION-SLoss1 0.000083 40.8160 0.9918
DIMENSION-SLoss2
  • Bold indicates the best result.

TABLE X: The MSE, PSNR and SSIM of zero-filling, DIMENSION-SLoss1 and DIMENSION-SLoss2.

The two models produce similar reconstructions, but the quantitative results of DIMENSION-SLoss1 are slightly better. We therefore choose DIMENSION-SLoss1 as the final DIMENSION-SLoss model.

IV-B The Limitations of the Proposed Work

Although the proposed method achieves improved reconstruction results for dynamic MR imaging compared with other methods, a certain degree of smoothing remains in the reconstructed images at high acceleration factors. Part of the reason may be the loss functions used in this work: the MSE loss only measures the pixel-wise mean squared error between the reconstructed image and the ground truth and cannot perceive image structure. DAGAN [35] couples an adversarial loss with an innovative content loss for CS-MRI reconstruction, which preserves perceptual image details. This inspires us to explore loss functions related to structural information in future work. Furthermore, other network structures could improve the reconstruction and will be explored; for example, dense networks [36] may be utilized to make full use of hierarchical features.

V Conclusion and Outlook

In this work, we propose a dynamic MR imaging method with both k-space and spatial prior knowledge integrated via multi-supervised network training, dubbed DIMENSION. Our contributions lie mainly in the cross-domain network structure and the multi-supervised loss strategy. The cross-domain structure makes better use of both frequency-domain and spatial-domain prior knowledge. The multi-supervised loss strategy avoids single-supervised training and provides different levels of supervision for the entire network. Specifically, we introduce a k-space loss, which forces the k-space predicted by the frequency-domain network to be as close as possible to the fully sampled k-space, and a spatial loss, which constrains the reconstruction results at different levels of the spatial-domain network. We compared the proposed approach with k-t FOCUSS, k-t SLR, L+S and a state-of-the-art CNN-based method. Experimental results show that the proposed network structure and loss strategy improve dynamic MR reconstruction accuracy in shorter time. In future work, perceptual loss functions and more advanced network structures may be studied.

References

  • [1] W. Kaiser and E. Zeitler, “MR imaging of the breast: fast imaging sequences with and without Gd-DTPA. preliminary observations.” Radiology, vol. 170, no. 3, pp. 681–686, 1989.
  • [2] D. K. Sodickson and W. J. Manning, “Simultaneous acquisition of spatial harmonics (SMASH): fast imaging with radiofrequency coil arrays,” Magnetic Resonance in Medicine, vol. 38, no. 4, pp. 591–603, 1997.
  • [3] D. L. Donoho, “Compressed sensing,” IEEE Transactions on Information Theory, vol. 52, no. 4, pp. 1289–1306, 2006.
  • [4] M. Lustig, D. Donoho, and J. M. Pauly, “Sparse MRI: The application of compressed sensing for rapid MR imaging,” Magnetic Resonance in Medicine, vol. 58, no. 6, pp. 1182–1195, 2007.
  • [5] H. Jung, J. C. Ye, and E. Y. Kim, “Improved k–t BLAST and k–t SENSE using FOCUSS,” Physics in Medicine & Biology, vol. 52, no. 11, p. 3201, 2007.
  • [6] J. Tsao, P. Boesiger, and K. P. Pruessmann, “k-t BLAST and k-t SENSE: dynamic MRI with high frame rate exploiting spatiotemporal correlations,” Magnetic Resonance in Medicine, vol. 50, no. 5, pp. 1031–1042, 2003.
  • [7] D. Liang, E. V. DiBella, R.-R. Chen, and L. Ying, “k-t ISD: dynamic cardiac MR imaging using compressed sensing with iterative support detection,” Magnetic Resonance in Medicine, vol. 68, no. 1, pp. 41–53, 2012.
  • [8] R. Otazo, E. Candès, and D. K. Sodickson, “Low-rank plus sparse matrix decomposition for accelerated dynamic MRI with separation of background and dynamic components,” Magnetic Resonance in Medicine, vol. 73, no. 3, pp. 1125–1136, 2015.
  • [9] S. G. Lingala, Y. Hu, E. DiBella, and M. Jacob, “Accelerated dynamic MRI exploiting sparsity and low-rank structure: k-t SLR,” IEEE Transactions on Medical Imaging, vol. 30, no. 5, pp. 1042–1054, 2011.
  • [10] J. Caballero, A. N. Price, D. Rueckert, and J. V. Hajnal, “Dictionary learning and time sparsity for dynamic MR data reconstruction,” IEEE Transactions on Medical Imaging, vol. 33, no. 4, pp. 979–994, 2014.
  • [11] Y. Wang and L. Ying, “Compressed sensing dynamic cardiac cine MRI using learned spatiotemporal dictionary.” IEEE Trans. Biomed. Engineering, vol. 61, no. 4, pp. 1109–1120, 2014.
  • [12] J. Sun, H. Li, Z. Xu et al., “Deep ADMM-Net for compressive sensing MRI,” in Advances in Neural Information Processing Systems, 2016, pp. 10–18.
  • [13] K. Hammernik, T. Klatzer, E. Kobler, M. P. Recht, D. K. Sodickson, T. Pock, and F. Knoll, “Learning a variational network for reconstruction of accelerated MRI data,” Magnetic Resonance in Medicine, vol. 79, no. 6, pp. 3055–3071, 2018.
  • [14] F. Knoll, K. Hammernik, E. Kobler, T. Pock, M. P. Recht, and D. K. Sodickson, “Assessment of the generalization of learned image reconstruction and the potential for transfer learning,” Magnetic Resonance in Medicine, 2018.
  • [15] S. Wang, Z. Su, L. Ying, X. Peng, S. Zhu, F. Liang, D. Feng, and D. Liang, “Accelerating magnetic resonance imaging via deep learning,” in Biomedical Imaging (ISBI), 2016 IEEE 13th International Symposium on.   IEEE, 2016, pp. 514–517.
  • [16] K. Kwon, D. Kim, and H. Park, “A parallel MR imaging method using multilayer perceptron,” Medical Physics, vol. 44, no. 12, pp. 6209–6224, 2017.
  • [17] Y. Han, J. Yoo, H. H. Kim, H. J. Shin, K. Sung, and J. C. Ye, “Deep learning with domain adaptation for accelerated projection-reconstruction MR,” Magnetic Resonance in Medicine, vol. 80, no. 3, pp. 1189–1205, 2018.
  • [18] B. Zhu, J. Z. Liu, S. F. Cauley, B. R. Rosen, and M. S. Rosen, “Image reconstruction by domain-transform manifold learning,” Nature, vol. 555, no. 7697, p. 487, 2018.
  • [19] T. Eo, Y. Jun, T. Kim, J. Jang, H.-J. Lee, and D. Hwang, “KIKI-net: cross-domain convolutional neural networks for reconstructing undersampled magnetic resonance images,” Magnetic Resonance in Medicine, 2018.
  • [20] L. Sun, Z. Fan, Y. Huang, X. Ding, and J. Paisley, “Compressed sensing MRI using a recursive dilated network,” in AAAI Conference on Artificial Intelligence, 2018.
  • [21] T. M. Quan, T. Nguyen-Duc, and W.-K. Jeong, “Compressed sensing MRI reconstruction using a generative adversarial network with a cyclic loss,” IEEE Transactions on Medical Imaging, vol. 37, no. 6, pp. 1488–1497, 2018.
  • [22] J. Schlemper, J. Caballero, J. V. Hajnal, A. N. Price, and D. Rueckert, “A deep cascade of convolutional neural networks for dynamic MR image reconstruction,” IEEE Transactions on Medical Imaging, vol. 37, no. 2, pp. 491–503, 2018.
  • [23] C. Qin, J. V. Hajnal, D. Rueckert, J. Schlemper, J. Caballero, and A. N. Price, “Convolutional recurrent neural networks for dynamic MR image reconstruction,” IEEE Transactions on Medical Imaging, 2018.
  • [24] G. Wang, J. C. Ye, K. Mueller, and J. A. Fessler, “Image reconstruction is a new frontier of machine learning,” IEEE Transactions on Medical Imaging, vol. 37, no. 6, pp. 1289–1296, 2018.
  • [25] H. K. Aggarwal, M. P. Mani, and M. Jacob, “MoDL: Model based deep learning architecture for inverse problems,” IEEE Transactions on Medical Imaging, 2018.
  • [26] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 770–778.
  • [27] D. O. Walsh, A. F. Gmitro, and M. W. Marcellin, “Adaptive reconstruction of phased array MR imagery,” Magnetic Resonance in Medicine: An Official Journal of the International Society for Magnetic Resonance in Medicine, vol. 43, no. 5, pp. 682–690, 2000.
  • [28] Y. LeCun, Y. Bengio, and G. Hinton, “Deep learning,” Nature, vol. 521, no. 7553, p. 436, 2015.
  • [29] K. He, X. Zhang, S. Ren, and J. Sun, “Delving deep into rectifiers: Surpassing human-level performance on imagenet classification,” in Proceedings of the IEEE International Conference on Computer Vision, 2015, pp. 1026–1034.
  • [30] X. Glorot, A. Bordes, and Y. Bengio, “Deep sparse rectifier neural networks,” in Proceedings of The Fourteenth International Conference on Artificial Intelligence and Statistics, 2011, pp. 315–323.
  • [31] M. D. Zeiler, “ADADELTA: an adaptive learning rate method,” arXiv preprint arXiv:1212.5701, 2012.
  • [32] D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,” arXiv preprint arXiv:1412.6980, 2014.
  • [33] M. Abadi, P. Barham, J. Chen, Z. Chen, A. Davis, J. Dean, M. Devin, S. Ghemawat, G. Irving, M. Isard et al., “Tensorflow: a system for large-scale machine learning.” in OSDI, vol. 16, 2016, pp. 265–283.
  • [34] Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, “Image quality assessment: from error visibility to structural similarity,” IEEE Transactions on Image Processing, vol. 13, no. 4, pp. 600–612, 2004.
  • [35] G. Yang, S. Yu, H. Dong, G. Slabaugh, P. L. Dragotti, X. Ye, F. Liu, S. Arridge, J. Keegan, Y. Guo et al., “DAGAN: Deep de-aliasing generative adversarial networks for fast compressed sensing MRI reconstruction,” IEEE Transactions on Medical Imaging, vol. 37, no. 6, pp. 1310–1321, 2018.
  • [36] Y. Zhang, Y. Tian, Y. Kong, B. Zhong, and Y. Fu, “Residual dense network for image super-resolution,” in The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018.