RinQ Fingerprinting: Recurrence-informed Quantile Networks for Magnetic Resonance Fingerprinting

07/09/2019
by   Elisabeth Hoppe, et al.
FAU

Recently, Magnetic Resonance Fingerprinting (MRF) was proposed as a quantitative imaging technique for the simultaneous acquisition of tissue parameters such as the relaxation times T_1 and T_2. Although the acquisition is highly accelerated, the state-of-the-art reconstruction suffers from long computation times: template matching methods are used to find the most similar signal to the measured one by comparing it to pre-simulated signals of possible parameter combinations in a discretized dictionary. Deep learning approaches can overcome this limitation by providing a direct mapping from the measured signal to the underlying parameters with one forward pass through a network. In this work, we propose a Recurrent Neural Network (RNN) architecture in combination with a novel quantile layer. RNNs are well suited for the processing of time-dependent signals, and the quantile layer helps to overcome noisy outliers by considering the spatial neighbors of the signal. We evaluate our approach using in-vivo data from multiple brain slices and several volunteers, running various experiments. We show that the RNN approach with small patches of complex-valued input signals in combination with a quantile layer outperforms other architectures, e.g. previously proposed CNNs for MRF reconstruction, reducing the error in T_1 and T_2 by more than 80 %.



1 Introduction

One disadvantage of most currently used Magnetic Resonance Imaging (MRI) techniques is the qualitative nature of the images: in most cases no absolute values of the underlying physical tissue parameters, e.g. the T_1 and T_2 relaxations, are obtained. Magnetic Resonance Fingerprinting (MRF) was recently proposed to overcome this limitation: it provides an accelerated acquisition of time signals which differ between the various tissue types, using randomly modified parameters during the acquisition (e.g. Flip Angle (FA) or Repetition Time (TR)) and strong undersampling with spiral readouts. These signals are compared to simulated signals of possible parameter combinations of T_1 and T_2, and quantitative maps are reconstructed [7, 8]. However, this state-of-the-art approach suffers from high computational effort: every measured signal is compared to every simulated signal using template matching algorithms. Due to storage and computational limitations, this dictionary can only contain a finite number of possibilities, and thus the maps are limited to these parameter combinations and can be erroneous [13]. The more combinations the dictionary contains, the more expensive the reconstruction becomes in terms of time and storage. In order to provide continuous predictions, to accelerate this process and to eliminate the burden of high storage requirements during the reconstruction, deep learning (DL) can be used: reconstruction is then performed by forward passing the signal (or signals) through a (regression) network, which predicts the T_1 and T_2 relaxation times for the input. Proposed approaches range from Fully Connected Neural Networks (FCNs) [1] and Convolutional Neural Networks (CNNs) [2, 5, 6] to other architectures, e.g. incorporating a U-Net [3]. However, state-of-the-art DL solutions also have their drawbacks: while FCNs are known to tend to overfit because of their huge number of parameters, CNNs are not optimally suited for time-resolved tasks.
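The template matching baseline can be sketched as follows; this is a simplified illustration with random stand-in signals and a hypothetical dictionary size, not the authors' implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dictionary: one simulated signal per (T1, T2) combination.
n_entries, sig_len = 1000, 300
dictionary = rng.standard_normal((n_entries, sig_len))
t1_t2 = rng.uniform([10, 2], [4500, 3000], size=(n_entries, 2))  # ms

# Normalize entries so the inner product acts as a correlation measure.
dict_norm = dictionary / np.linalg.norm(dictionary, axis=1, keepdims=True)

def match(measured):
    """Return the (T1, T2) of the most similar dictionary entry."""
    m = measured / np.linalg.norm(measured)
    best = np.argmax(np.abs(dict_norm @ m))
    return t1_t2[best]

# A measured signal equal to a dictionary entry recovers its parameters.
assert np.allclose(match(dictionary[42]), t1_t2[42])
```

Every measured pixel requires a full sweep over the dictionary, which is what makes the reconstruction expensive for finely resolved dictionaries.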
To overcome these limitations, we propose Recurrent Neural Networks (RNNs) for this reconstruction task, due to their capability to capture the time dependency in the signal better than e.g. CNNs. We evaluate our approach using in-vivo data from multiple brain slices and several volunteers and investigate the following aspects in an extensive evaluation: (1) the superior performance of RNNs over CNNs, (2) complex-valued input signal data instead of magnitude data as in some previous approaches (e.g. [1, 5]), and (3) spatially connected signal patches instead of one signal for the input layer, in combination with a novel quantile filtering layer prior to the output layer. We expect small, spatially connected patches to contain the same type of tissue and therefore the same quantitative parameters. Knowledge of spatial neighbors was shown to improve reconstruction accuracy, e.g. in [3], but the authors used the whole image as input; to be able to train their network, all signals have to be compressed, and possibly important information may be lost. Our approach uses smaller, uncompressed patches of spatially connected signals (cf. Fig. 1). To the best of our knowledge, RNNs for MRF were only investigated using signals from a synthetic dataset and without the consideration of spatially neighboring signals [10].

Figure 1: Overview of the MRF reconstruction process using deep learning. We learn the reconstruction mapping using a Recurrent Neural Network with complex-valued input signals in combination with a quantile layer. LSTM: Long Short-Term Memory layer, FC: Fully Connected layer.

2 Methods

2.1 Recurrent Neural Networks

General architectures:

We devise a regression RNN to solve the MRF reconstruction task: from the input (one or more time signals), the network predicts the quantitative relaxation parameters for this signal. For the development of the networks, we use the well-known Long Short-Term Memory (LSTM) layers [4]. In order to keep the sequence at a moderate size, we reshape the signals of 3,000 data points into 30 evenly sized parts. Thus, every sequence element consists of 100 complex-valued (flattened to 200 values from the real and imaginary parts, respectively) or magnitude data points; this reshaping is applied in front of the LSTM layer as the first layer of our RNNs and reduces the risk of vanishing or exploding gradients during training [11]. One LSTM layer is followed by a Rectified Linear Unit (ReLU) activation and a batch normalization (BN). Afterwards, we use 4 fully connected layers, each followed by a ReLU activation and a BN layer (each operating either on the magnitude or on the real and imaginary data points separately), to perform the regression.
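The input reshaping can be sketched in numpy. This is a minimal illustration of the shapes only; the interleaved real/imaginary layout shown here is one possible choice, not necessarily the authors' exact ordering:

```python
import numpy as np

# A single complex-valued fingerprint of 3,000 time points.
signal = np.random.randn(3000) + 1j * np.random.randn(3000)

# Split real and imaginary parts into two channels: shape (3000, 2).
channels = np.stack([signal.real, signal.imag], axis=-1)

# Reshape into 30 sequence elements; each covers 100 time points,
# flattened to 200 values (real/imaginary pairs, interleaved here).
sequence = channels.reshape(30, 200)

assert sequence.shape == (30, 200)
```

Feeding 30 elements of length 200 to the LSTM, instead of 3,000 single samples, is what keeps the unrolled sequence short enough to avoid vanishing or exploding gradients.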

Quantile layer:

To cope with signal outliers due to undersampling or noise during the acquisition, we propose to combine the RNN architecture with a quantile layer as the last layer prior to the output. Inspired by the work of Schirrmacher et al. [12], we use small patches of locally connected signals for the input layer. Thus, the input for one regression is increased by a factor of 9 compared to networks with one signal as input. For the output, we compute the quantile of all predictions from this neighborhood. The quantile operation can be reformulated as q(x) = Qx, where Q denotes a sparse matrix which stores the position of the quantile. In the backward pass, the gradient w.r.t. the input is simply given by the transposed matrix Q^T. We expect the signals from small patches to belong to similar or the same parameters, as they originate from the same or a similar tissue type. The quantile layer enables a pooling operation that is more robust to noise than common pooling operations such as maximum or average pooling. To the best of our knowledge, we are the first to incorporate this operation as a network layer.
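The forward and backward pass of such a quantile layer can be sketched in plain numpy, following the selection-matrix formulation above (a minimal stand-alone illustration, not the network layer itself):

```python
import numpy as np

def quantile_forward(preds, q=0.5):
    """Select the q-quantile of the patch predictions.

    Returns the selected value and the sparse selection matrix Q
    such that output = Q @ preds.
    """
    preds = np.asarray(preds, dtype=float)
    n = preds.size
    # Index of the element realizing the quantile (lower interpolation,
    # so the output is an actual input element and Q stays a 0/1 matrix).
    k = int(np.floor(q * (n - 1)))
    idx = np.argsort(preds)[k]
    Q = np.zeros((1, n))
    Q[0, idx] = 1.0
    return (Q @ preds)[0], Q

def quantile_backward(Q, grad_out):
    """Gradient w.r.t. the inputs is Q^T applied to the output gradient."""
    return Q.T @ np.atleast_1d(grad_out)

# A gross outlier in a 3x3 patch does not affect the 0.5-quantile output.
patch = np.array([800., 810., 805., 795., 900000., 802., 798., 808., 803.])
out, Q = quantile_forward(patch, q=0.5)
grad_in = quantile_backward(Q, 1.0)
assert grad_in.sum() == 1.0  # gradient routed to exactly one input
```

In contrast, average pooling over the same patch would be pulled far away by the outlier, which is why the quantile acts as a robust, edge-preserving alternative.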

2.2 Training and Evaluation

All our models are trained with the mean squared error (MSE) loss and optimized using ADAM. We evaluate all models by measuring the difference between the predicted and the ground-truth T_1 and T_2 relaxation times, computed as the relative mean error and the corresponding standard deviation. Data is split into disjoint training, validation and test sets. The validation set is used to select the best model from all training epochs, the test set for evaluating this model on unseen data afterwards.
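The evaluation metric can be sketched as follows; this is a minimal illustration, and the exact normalization and aggregation used in the paper may differ:

```python
import numpy as np

def relative_mean_error(pred, gt):
    """Per-pixel relative error in percent; returns mean and std. dev."""
    pred, gt = np.asarray(pred, float), np.asarray(gt, float)
    rel = np.abs(pred - gt) / gt * 100.0
    return rel.mean(), rel.std()

# Example: predicted vs. ground-truth T1 values in ms.
mean_err, std_err = relative_mean_error([950., 1050.], [1000., 1000.])
assert mean_err == 5.0
```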

3 Experiments and Results

3.1 Data sets

Data acquisition:

All data sets for our experiments were measured as axial brain slices in 8 volunteers (4 male, 4 female, 4315 years) on a MAGNETOM Skyra 3T MR scanner (Siemens Healthcare, Erlangen, Germany) using a prototype sequence based on Fast Imaging with Steady State Precession with spiral readouts [7] and the following sequence parameters: Field-of-View: 300 mm, resolution: mm, variable TR (12-15 ms), FA (5-74°), number of repetitions: 3,000, undersampling factor: 48. From 2 volunteers, 2 different slices were available; from 6 volunteers, 4 slices each. All slices were measured at different positions and points in time to reduce possible correlations between slices from one volunteer.

Ground-truth data:

In order to create accurate ground-truth data for our DL experiments, we used a finely resolved dictionary containing overall 691,497 possible parameter combinations, with T_1 in the range of 10 to 4,500 ms and T_2 in the range of 2 to 3,000 ms, respectively. To be able to reconstruct the relaxation maps in a reasonable time and to reduce the memory requirements, the dictionary and measured signals were compressed to 50 main components in the time domain using SVD prior to the template matching.
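The SVD compression step can be sketched as follows; the dictionary here is random stand-in data and its size is illustrative, only the 50 retained components follow the text:

```python
import numpy as np

rng = np.random.default_rng(1)

# Stand-in dictionary: entries x time points (the real one has 691,497 entries).
dictionary = rng.standard_normal((2000, 300))

# SVD over the time dimension; keep the 50 leading right-singular vectors.
_, _, vt = np.linalg.svd(dictionary, full_matrices=False)
basis = vt[:50]                      # (50, 300) compression basis

dict_c = dictionary @ basis.T        # compressed dictionary: (2000, 50)
signal = rng.standard_normal(300)
signal_c = basis @ signal            # compressed measured signal: (50,)

# Matching now uses 50-dimensional inner products instead of 300.
scores = dict_c @ signal_c
assert dict_c.shape == (2000, 50) and signal_c.shape == (50,)
```

Since both the dictionary and the measured signals are projected onto the same basis, inner products in the compressed space approximate those in the full time domain at a fraction of the cost.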

3.2 Experiments for finding architectural settings

Experimental setup:

We ran three specific types of experiments to investigate the following issues:

  1. Performance of networks using magnitude vs. complex-valued input signals. For this, we compared the CNN (architectural details in Section 3.3) and RNN models trained with magnitude and with complex-valued inputs.

  2. Performance of networks using CNN vs. RNN models (both with a comparable number of learnable parameters). For this, we compared the CNN and RNN models with complex-valued input signals.

  3. Performance of networks using single input signals vs. patches of input signals in combination with a quantile layer prior to the output. For this, we compared RNN models with and without a quantile layer.

Data splitting:

As only a limited amount of data sets (overall 12 slices from 4 volunteers) was available for our extensive experiments, we first used all slices from these 4 volunteers, randomly separated into training, validation and test sets (8, 2 and 2 slices, respectively). We then used an additional 16 slices from another 4 volunteers (again randomly separated) for experiments with our best-fitting model (19 slices for training, 7 for validation, 2 for testing).

3.3 Comparison with another DL architecture

We used the CNN model from [5], with a total of 4 convolutional and 4 fully connected layers with ReLU activations and average pooling, to compare our approach with another DL based MRF reconstruction framework. We extended this baseline model with BN layers after each convolutional and fully connected layer.

3.4 Results

Results can be found in Table 1 (validation loss from the best epoch) and in Figure 2 (parameter maps on the same test set from all models).

Validation loss [ms]
Input signals    CNN      RNN      RNN (quantile)   RNN (quantile, large)
magnitude        636.96   424.96   -                -
complex-valued   470.26   269.20   221.52           195.34
Table 1: Validation losses across the different experiments. The best result is marked in bold. The validation loss is measured as the MSE over T_1 and T_2 values. CNN: CNN model with single input signals, RNN: RNN model with single input signals, RNN (quantile): RNN model with a signal patch as input and a quantile layer, RNN (quantile, large): the same as RNN (quantile), trained with the larger data set. For detailed information about the models see Sections 2.1 and 3.3.
Figure 2: Predicted maps of one test data set from models using the small data set (rows 1-5), or the large data set (row 6). First column: T_1 maps. Second column: relative mean errors to the ground-truth. Third column: T_2 maps. Fourth column: relative mean errors to the ground-truth. For better visibility, all relative error maps were clipped at 100 %, the background of all T_1 and T_2 maps was set to -200, and they were windowed equally for a fair comparison (0 - 4,000 ms and 0 - 600 ms, respectively). RME: Relative mean error, std.dev.: standard deviation.

4 Discussion

In summary, the main observation from our results is the clear improvement of the performance using our proposed RNN model in combination with complex-valued input signals and the quantile layer in comparison to all other tested models.

Magnitude vs. complex-valued signal inputs:

We first compare our models trained with magnitude and with complex-valued inputs. The utilization of both components of the complex-valued signals, instead of only computing the magnitudes for the input layers of the networks, is an essential factor for the performance. A clear reduction of the errors is achieved using complex-valued inputs for both approaches (CNN: more than 62 %, RNN: more than 50 %). Comparing the visual results of e.g. the same RNN model using magnitude and complex-valued inputs (cf. rows 3, 4 in Fig. 2), the complex version clearly yields reduced relative mean errors and improved parameter maps, without being corrupted by the heavy ringing artifacts which appear with the magnitude inputs.

CNN vs. RNN:

A clear improvement is also achieved using an RNN instead of a CNN model, with a reduction of the errors of up to 53 %. Independent of the input signal type, the CNN model is not able to reconstruct meaningful parameter maps showing soft tissue contrast. In comparison, the RNN model is capable of reconstructing highly detailed parameter maps, demonstrating the better capability of the RNN for processing time-dependent signals. Nevertheless, this holds only for the RNN using complex-valued inputs, since the RNN using magnitude inputs is still corrupted by the ringing artifacts.

Quantile layer:

Our results additionally show that a quantile layer further improves the performance (cf. rows 4, 5 in Fig. 2), reducing the errors by 57 % and 43 % for T_1 and T_2, respectively, in comparison to an RNN without a quantile layer. The influence of the quantile layer is particularly evident at transitions between different tissue types in the parameter map. With the help of the quantile layer, the errors at the edges can be enormously reduced, as the 0.5 quantile layer acts as an edge-preserving denoising filter (cf. the relative error maps in rows 4, 5 in Fig. 2).

Challenges and limitations:

Our experiments show a step-by-step improvement in performance: (1) from magnitude to complex-valued input signals, (2) from a CNN to an RNN model, and (3) from an RNN without a quantile layer to an RNN with a quantile layer. Even though we use a limited amount of data, our results are a strong indication that our model is able to generalize. Using our best RNN model and training it with slightly more data already decreased the error (cf. Table 1), which supports this assumption. One further step, however, is the evaluation of our proposed approach using data splits with completely unseen volunteer data sets in the validation or test data when more data is available (preliminary experiments in this direction are attached in the Supplementary Material). Moreover, we used a very finely resolved dictionary for the ground-truth data. While this is crucial for accurate ground-truth data, it further increases the amount of training data that is necessary to fully imprint the complex mapping into the network. In comparison to other MRF DL approaches (e.g. the MRF-EPI sequence in [1]), we used signals from a very strongly undersampled acquisition (undersampling factor: 48), which leads to very noisy and corrupted signals compared to simulated dictionary signals. As shown by Hoppe et al. in [5, 6], fully sampled dictionary signals can be easily learned by simple CNN models. However, undersampled in-vivo data are more challenging to reconstruct with the MRF DL method, thus a more complex model is required.

5 Conclusion

We proposed a regression RNN for MRF reconstruction. Our architecture combines a model designed to deal with time-dependent complex-valued input signals, incorporated as an LSTM layer, with a novel quantile layer to deal with signal outliers, which are very common due to the strong undersampling during the acquisition. We evaluated our approach in a proof-of-concept study with various experiments and showed that our model outperforms other DL models such as CNNs or RNNs without the additional quantile layer, reducing the errors by more than 80 %. One limitation of our study is the restricted amount of training data, which will be addressed in future work. Furthermore, another future step will be a deeper comparison of the different architectures and their features, which can help to improve the interpretability of the networks. In addition, the incorporation of known operators based on the imaging physics within the networks, as described in [9], can help to reduce the complexity and improve the performance at the same time. This will also be investigated for our application.

References

  • [1] Cohen, O., Zhu, B., Rosen, M.S.: MR fingerprinting deep reconstruction network (DRONE). Magnetic Resonance in Medicine 80(3), 885–894 (2018)
  • [2] Fang, Z., Chen, Y., Lin, W., Shen, D.: Quantification of relaxation times in MR fingerprinting using deep learning. In: Proceedings of the International Society for Magnetic Resonance in Medicine. Scientific Meeting and Exhibition. vol. 25. NIH Public Access (2017)
  • [3] Fang, Z., Chen, Y., Liu, M., Zhan, Y., Lin, W., Shen, D.: Deep learning for fast and spatially-constrained tissue quantification from highly-undersampled data in magnetic resonance fingerprinting (MRF). In: International Workshop on Machine Learning in Medical Imaging. pp. 398–405. Springer (2018)

  • [4] Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural computation 9(8), 1735–1780 (1997)
  • [5] Hoppe, E., Körzdörfer, G., Nittka, M., Würfl, T., Wetzl, J., Lugauer, F., Schneider, M.: Deep learning for magnetic resonance fingerprinting: Accelerating the reconstruction of quantitative relaxation maps. In: Proceedings of the Joint Annual Meeting ISMRM-ESMRMB (26th Annual Meeting and Exhibition), Paris, France. p. 2791 (2018)
  • [6] Hoppe, E., Körzdörfer, G., Würfl, T., Wetzl, J., Lugauer, F., Pfeuffer, J., Maier, A.K.: Deep learning for magnetic resonance fingerprinting: A new approach for predicting quantitative parameter values from time series. In: Studies in health technology and informatics. vol. 243, pp. 202–206 (2017)
  • [7] Jiang, Y., Ma, D., Seiberlich, N., Gulani, V., Griswold, M.A.: MR fingerprinting using fast imaging with steady state precession (FISP) with spiral readout. Magnetic Resonance in Medicine 74(6), 1621–1631 (2015)
  • [8] Ma, D., Gulani, V., Seiberlich, N., Liu, K., Sunshine, J.L., Duerk, J.L., Griswold, M.A.: Magnetic resonance fingerprinting. Nature 495(7440), 187–192 (2013)
  • [9] Maier, A.K., Syben, C., Stimpel, B., Würfl, T., Hoffmann, M., Schebesch, F., Fu, W., Mill, L., Kling, L., Christiansen, S.: Learning with known operators reduces maximum training error bounds. Nature Machine Intelligence. Accepted for publication. (2019)
  • [10] Oksuz, I., Cruz, G., Clough, J., Bustin, A., Fuin, N., Botnar, R.M., Prieto, C., King, A.P., Schnabel, J.A.: Magnetic resonance fingerprinting using recurrent neural networks. In: 2019 IEEE 16th International Symposium on Biomedical Imaging (ISBI 2019). pp. 1537–1540. IEEE (2019)
  • [11] Pascanu, R., Mikolov, T., Bengio, Y.: On the difficulty of training recurrent neural networks. In: International conference on machine learning. pp. 1310–1318 (2013)
  • [12] Schirrmacher, F., Köhler, T., Endres, J., Lindenberger, T., Husvogt, L., Fujimoto, J.G., Hornegger, J., Dörfler, A., Hoelter, P., Maier, A.K.: Temporal and volumetric denoising via quantile sparse image prior. Medical image analysis 48, 131–146 (2018)
  • [13] Wang, Z., Zhang, Q., Yuan, J., Wang, X.: MRF denoising with compressed sensing and adaptive filtering. In: 2014 IEEE 11th International Symposium on Biomedical Imaging (ISBI 2014). pp. 870–873. IEEE (2014)

Appendix 0.A Dictionary parameters for ground-truth data

T_1 start:step:stop [ms]    T_2 start:step:stop [ms]
10:10:90                    2:2:98
100:20:1000                 100:5:150
1040:40:2000                160:10:300
2050:100:4500               350:50:800
                            900:100:1600
                            1800:200:3000
Table 2: Parameter steps used for the dictionary for generating ground-truth parameter maps.

Appendix 0.B Deep learning architectures

0.B.1 Overview

Figure 3: Overview over the used models and input signal types in our work (not all layers within the networks are displayed). We used models with magnitude (upper model) and complex-valued input signals (middle and lower models). Furthermore, we investigated Convolutional Neural Networks (CNNs, upper model) and different Recurrent Neural Networks (RNNs, with and without a quantile layer prior to the output layer, the middle and the lower model, respectively). Conv: Convolutional layer, ReLU: Rectified Linear Unit, BN: Batch normalization, FC: Fully Connected layer, LSTM: Long Short-Term Memory layer.

0.B.2 Recurrent Neural Networks

Layer Output shape
One channel signals Two channels signals
Input (3,000, 1) (3,000, 2)
Reshape (30, 100) (30, 200)
LSTM (30, 1,000) (30, 1,000)
FC (30, 500) (30, 500)
FC (30, 250) (30, 250)
Flatten (7500) (7500)
FC (360) (360)
FC (2) (2)
Table 3: Details of the different layers used for our RNN model processing signals from one pixel with one channel (magnitudes) or two channels (real and imaginary parts of the complex numbers) as input. After every layer (except for the Input, Reshape and Flatten layers) a Rectified Linear Unit activation is applied. LSTM: Long Short-Term Memory layer, FC: Fully Connected layer. Number of parameters [millions]: 7.7 (RNN with one-channel signals), 8.1 (RNN with two-channel signals).
Layer Output shape
Input (3, 3, 3,000, 2)
Reshape (30, 1800)
LSTM (30, 500)
FC (30, 500)
FC (30, 250)
Flatten (7500)
FC (360)
Reshape (3, 3, 40)
FC (3, 3, 2)
Quantile (2)
Table 4: Details of the different layers used for our RNN model processing signals from 3×3 patches with two channels (real and imaginary parts of the complex numbers) as input. After every layer (except for the Input, Reshape and Flatten layers) a Rectified Linear Unit activation is applied. LSTM: Long Short-Term Memory layer, FC: Fully Connected layer. Number of parameters [millions]: 7.7.

0.B.3 Convolutional Neural Networks

Layer Kernel sizes Output shape
One channel signals Two channels signals
Input (3,000, 1) (3,000, 2)
Conv + BN K: 151, S: 5 (598, 30) (598, 30)
Conv + BN K: 101, S: 3 (197, 60) (197, 60)
Conv + BN K: 51, S: 2 (97, 120) (97, 120)
Conv + BN K: 31, S: 2 (48, 240) (48, 240)
AvgPool K: 3, S: 2 (23, 240) (23, 240)
Flatten (5520) (5520)
FC + BN (1,000) (1,000)
FC + BN (500) (500)
FC + BN (300) (300)
FC (2) (2)
Table 5:

Details of different layers used for our CNN model processing signals from one pixel as input. After every layer (except for the Input, AvgPool and Flatten) a Rectified Linear Unit activation is applied. Conv: Convolutional layer, BN: Batch normalization, AvgPool: Average Pooling, FC: Fully Connected layer, K: Kernel size, S: Stride size. Number of parameters [millions]: 6.3.

0.B.4 Training parameters

Model Learning rate Batch size
CNN 128
RNN
CNN 128
RNN
RNN 32
RNN
Table 6: Training parameters used for all our models and experiments. Other training parameters for our optimizer ADAM are: , . CNN: CNN model with single input signals, RNN: RNN model with single input signals, RNN (quantile): RNN model with a signal patch as input and a quantile layer, RNN (quantile, large): the same as RNN (quantile), trained with the larger data set.

Appendix 0.C Leave-one-out splits: Preliminary results

Figure 4: Predicted maps of one test data set from models using leave-one-out data separation with the small data set (overall 12 slices from 4 volunteers, row 1), or with the extended data set (overall 28 slices from 8 volunteers, row 2). First column: T_1 maps. Second column: relative mean errors to the ground-truth. Third column: T_2 maps. Fourth column: relative mean errors to the ground-truth. For better visibility, all relative error maps were clipped at 100 %, the background of all T_1 and T_2 maps was set to -200, and they were windowed equally for a fair comparison (0 - 4,000 ms and 0 - 600 ms, respectively). Every data set was separated using slices from one previously unseen volunteer for the validation and the test processes, respectively (resulting in 2 (small data set) or 6 (extended data set) volunteers for training, 1 volunteer for validation and 1 volunteer for testing). The homogeneous areas in the reconstructed parameter maps from the training with only 2 volunteer data sets (row 1) show that these data sets are not sufficient for the model to generalize. The same experiment with the training data extended to 6 volunteers already shows an enormous increase in performance, as tissue details can be recognized in the reconstructed parameter maps (row 2). With this example, we would like to emphasize that even 6 volunteer data sets are not nearly enough training data for the present case, but the improvement from 2 to 6 volunteer data sets is tremendous. The results of the leave-one-out experiments can be seen as a lower limit for future results with more volunteer training data.