1 Introduction
Magnetic resonance fingerprinting (MRF) is an emerging technology that enables simultaneous quantification of multiple physical tissue properties in short, clinically feasible scan times [21]. Iterative reconstruction methods based on Compressed Sensing (CS) have proven effective in helping MRF compute accurate quantitative images from the undersampled k-space measurements taken in aggressively short scan times [10, 3, 31, 11]. However, these methods require dictionary matching (DM), which is non-scalable and can create enormous storage and computational overhead. Further, such approaches often do not fully account for the joint spatiotemporal structures of the MRF data, which can lead to poor reconstructions [14].
Deep learning methodologies have emerged to address DM’s computational bottleneck [9, 15, 28, 25], and in some cases to perform joint spatiotemporal MRF processing using convolutional layers [12, 17, 8, 4, 19, 27, 13]. These models are trained in an end-to-end fashion without an explicit account of the known physical acquisition model (i.e. the forward operator) and without a mechanism for explicitly enforcing measurement consistency according to this sampling model, which can be crucial in safety-first medical applications. Further, ignoring the structure of the forward model could lead to unnecessarily large inference models and possibly overfitted predictions, especially given the extremely scarce labelled anatomical quantitative MRI datasets that are available for training.
Our contributions: we propose PGDNet, a deep convolutional model that is able to learn and perform robust spatiotemporal MRF processing, and to work with limited access to ground-truth (i.e. labelled) quantitative maps. Inspired by iterative proximal gradient descent (PGD) methods for CS reconstruction [23], we adopt learnable, compact and shared convolutional layers within a data-driven proximal step, while explicitly incorporating the acquisition model as a non-trainable gradient step in all iterations. The proximal operator is an autoencoder network whose decoder embeds the Bloch magnetic responses and whose convolutional encoder embeds a de-aliasing projector onto the tissue maps’ quantitative properties. Our work is inspired by recent general CS methodologies [23, 26, 1, 2, 7] that replace traditional hand-crafted image priors with deep data-driven models. To the best of our knowledge, this is the first work to adopt and investigate the feasibility of such an approach for solving the MRF inverse problem.
2 Methodology
MRF adopts a linear spatiotemporal compressive acquisition model:
$$y = H(\bar{x}) + \xi, \tag{1}$$
where $y$ are the k-space measurements collected across the temporal frames and corrupted by some noise $\xi$, and $\bar{x}$ is the Time-Series of Magnetisation Images (TSMI), i.e. the image voxels across the timeframes. The forward operator $H$ models Fourier transformations subsampled according to a set of temporally-varying k-space locations in each timeframe. Accelerated MRF acquisition implies working with heavily undersampled data (far fewer k-space samples than voxels in each timeframe), which makes the inversion of $H$ ill-posed.
Bloch response model: Per-voxel TSMI temporal signal evolution is related to the quantitative NMR parameters/properties, such as the T1 and T2 relaxation times, through the solutions of the Bloch differential equations, scaled by the proton density (PD) in each voxel [21, 20].
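The subspace dimension-reducing model: In many MRF applications (including ours) a low-dimensional subspace embeds the Bloch solutions. This subspace can be computed through PCA decomposition of the MRF dictionary [24], and enables rewriting (1) in a compact form that is beneficial to the storage, runtime and accuracy of the reconstruction [3, 15]:
$$y \approx H(x) \quad \text{s.t.} \quad x_v \approx \rho_v\, \mathcal{B}(\mathrm{T1}_v, \mathrm{T2}_v) \ \text{ for every voxel } v, \tag{2}$$
where, with $H$ now acting on the dimension-reduced coefficients, $x$ is the dimension-reduced TSMI, $\rho_v$ is the proton density in voxel $v$, and $\mathcal{B}$ denotes the subspace-compressed Bloch solutions (for more details see [14]).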
Tissue quantification: Given the compressed measurements $y$, the goal of MRF is to solve the inverse problem (2) and to compute the underlying multi-parametric maps $m$ (and $x$ as a by-product). Such problems are typically cast as an optimisation problem of the form:
$$\arg\min_{x,\,m}\ \tfrac{1}{2}\,\|y - H(x)\|_2^2 + \phi(x, m), \tag{3}$$
and solved iteratively by proximal gradient descent (PGD):
$$\begin{aligned} g^{(t+1)} &= x^{(t)} + \alpha^{(t)} H^{H}\big(y - H(x^{(t)})\big) \\ \{x^{(t+1)}, m^{(t+1)}\} &= \mathrm{Prox}_{\phi}\big(g^{(t+1)}\big) \end{aligned} \tag{4}$$
where the gradient updates, with step sizes $\alpha^{(t)}$, encourage k-space fidelity (the first term of (3)), and the proximal operator $\mathrm{Prox}_{\phi}$ enforces image structure priors through a regularisation term $\phi$ that makes the inverse problem well-posed. The Bloch dynamics in (2) place an important temporal constraint (prior) on the per-voxel trajectories of $x$. Projecting onto this model (i.e. a temporal Prox model) has been suggested via iterative dictionary search schemes [10, 3]. This approach boosts MRF reconstruction accuracy compared to the non-iterative DM [21]; however, DM is non-scalable and can create enormous storage and computational overhead. Further, such an approach processes data independently per voxel and neglects important spatial-domain regularities in the TSMIs and quantitative maps.
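As a minimal illustration of the two-step structure of (4), the following sketch runs PGD on a toy problem. Everything in it is an assumption made for illustration: the forward operator is a plain sampling mask rather than a subsampled Fourier transform, the prior is sparsity, and its proximal operator is soft-thresholding rather than a projection onto the Bloch model.

```python
def forward(x, mask):
    """Toy forward operator H: keep only the entries listed in `mask`."""
    return [x[i] for i in mask]


def adjoint(r, mask, n):
    """Adjoint H^H: put the measurements back on the grid, zeros elsewhere."""
    out = [0.0] * n
    for value, i in zip(r, mask):
        out[i] += value
    return out


def soft_threshold(g, tau):
    """Proximal operator of tau * ||.||_1 (a stand-in for the MRF prior)."""
    return [max(abs(v) - tau, 0.0) * (1.0 if v > 0 else -1.0) for v in g]


def pgd(y, mask, n, step=1.0, tau=0.05, iterations=20):
    """Alternate a gradient step on the data term with a proximal step."""
    x = [0.0] * n
    for _ in range(iterations):
        residual = [yi - hi for yi, hi in zip(y, forward(x, mask))]
        grad_step = adjoint(residual, mask, n)
        g = [xi + step * gi for xi, gi in zip(x, grad_step)]  # first line of (4)
        x = soft_threshold(g, tau)                            # second line of (4)
    return x


x_true = [0.0, 2.0, 0.0, 0.0, -1.0, 0.0]
mask = [1, 2, 4, 5]                      # 4 of the 6 entries are observed
y = forward(x_true, mask)
x_hat = pgd(y, mask, n=len(x_true))
```

In this toy setting the observed non-zero entries are recovered up to the shrinkage `tau`, while the unobserved entries stay at zero because only the prior constrains them; this is the role that a stronger, learned prior is meant to play in the MRF problem.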
3 PGDNet for MRF quantification
We propose to learn a data-driven proximal operator within the PGD mechanism for solving the MRF problem. Implemented by compact networks with convolutional layers, the neural Prox reduces the storage overhead and the sluggish runtime of the DM-based PGD by orders of magnitude. Further, trained on quantitative MR images, the neural Prox network learns to simultaneously enforce spatial- and temporal-domain data structures within PGD iterations.
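Prox autoencoder: We implement the proximal operator through a deep convolutional autoencoder network:
$$\mathrm{Prox} := \mathrm{Bloch} \circ G, \tag{5}$$
consisting of an encoder subnetwork $G$, which maps the gradient-updated TSMIs $g$ to the property maps $m$, and a decoder subnetwork Bloch, which maps $m$ back to the TSMIs $x$. The information bottleneck in the (neural) Prox autoencoder corresponds to projecting multichannel TSMIs to the low-dimensional manifold of the tissues’ intrinsic (quantitative) property maps [14].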
Decoder network: The Bloch decoder creates a differentiable model for generating the Bloch magnetic responses. This network uses filters that process image time-series in a voxel-wise manner. Given the quantitative properties of a voxel, the decoder approximates the (dimension-reduced) Bloch response in that voxel, i.e. the Bloch solution for the voxel’s relaxation times scaled by its proton density. This network is trained separately from the encoder. Training uses physical (Bloch) simulations for many combinations of the T1, T2 and PD values, which can flexibly produce a rich training dataset [14].
Encoder network: The encoder $G$ projects $g$, the gradient-updated TSMIs in each iteration (i.e. the first line of (4)), to the quantitative property maps $m$. Thus, $G$ must simultaneously (i) learn to incorporate spatial-domain regularities to de-alias TSMIs from the undersampling artefacts, and (ii) resolve the temporal-domain inverse mapping from the (noisy) TSMIs to the quantitative property maps. For this, and unlike Bloch, which applies pixel-wise temporal-only processing, $G$ uses multichannel convolution filters with wider receptive fields to enable spatiotemporal processing of the TSMIs.
PGDNet: Fig. 1 shows the recurrent architecture of the proposed learned PGD algorithm, coined PGDNet. The trainable parameters within the PGDNet are those of the encoder network and the step sizes. Other operators, such as the acquisition operator $H$ and Bloch (pretrained separately), are kept frozen during training. Further, the encoder’s parameters are shared across all iterations. In practice, a truncated number of recurrent iterations is used for training. Supervised training requires the MRF measurements, TSMIs and ground-truth property maps to form the training inputs $y$ and the target samples.
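The recurrent structure described above can be sketched as follows. This is an illustrative outline only, not the authors' implementation: `pgd_net`, `encoder`, `decoder`, `forward_op` and `adjoint_op` are hypothetical names, and the toy callables in the usage lines stand in for the trained convolutional encoder, the pretrained Bloch decoder and the k-space sampling operator.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def pgd_net(
    y: Vector,
    forward_op: Callable[[Vector], Vector],   # frozen acquisition model H
    adjoint_op: Callable[[Vector], Vector],   # frozen adjoint H^H
    encoder: Callable[[Vector], Vector],      # trainable, shared across iterations
    decoder: Callable[[Vector], Vector],      # pretrained Bloch model, frozen
    step_sizes: Sequence[float],              # trainable, one per iteration
) -> Tuple[List[Vector], List[Vector]]:
    """Unroll len(step_sizes) iterations and return every iterate (x, m)."""
    x = adjoint_op([0.0] * len(y))  # zero initial estimate with the right size
    xs, ms = [], []
    for alpha in step_sizes:
        residual = [yi - hi for yi, hi in zip(y, forward_op(x))]
        g = [xi + alpha * gi for xi, gi in zip(x, adjoint_op(residual))]
        m = encoder(g)   # project gradient-updated images to property maps
        x = decoder(m)   # regenerate images consistent with the signal model
        xs.append(x)
        ms.append(m)
    return xs, ms


# Toy usage: identity sampling, a one-parameter signal model x = [m, 2m],
# and its least-squares inverse as the "encoder".
identity = lambda v: list(v)
toy_decoder = lambda m: [m[0], 2.0 * m[0]]
toy_encoder = lambda g: [(g[0] + 2.0 * g[1]) / 5.0]
xs, ms = pgd_net([1.0, 2.0], identity, identity, toy_encoder, toy_decoder, [1.0, 1.0])
```

Returning every iterate, rather than only the last one, keeps the intermediate reconstructions available to a training loss that supervises them.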
Note that determining the optimal network architecture involves many engineering choices, including different ways to encode temporal [18] or spatio-temporal information [5]; these aspects are somewhat orthogonal to the model consistency question. Indeed, such mechanisms could also be incorporated in PGDNet.
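Training loss: Given a training set $\{(y_i, x_i, m_i)\}_i$ and $T$ recurrent iterations of the PGDNet (i.e. iterations used in PGD), the loss is defined as
$$\mathcal{L} = \beta_x \sum_i \ell\big(x_i, x_i^{(T)}\big) + \beta_m \sum_i \ell\big(m_i, m_i^{(T)}\big) + \beta_y \sum_{t=1}^{T} \sum_i \ell\big(y_i, H(x_i^{(t)})\big), \tag{6}$$
where $\ell$ is the MSE loss defined with appropriate weights $\beta_x$, $\beta_m$ and $\beta_y$ on the reconstructed TSMIs $x$ (which measures the Bloch dynamic consistency) and tissue property maps $m$, as well as on $y$ to maximise k-space data consistency with respect to the (physical) forward acquisition model. In this paper, the relative scaling of the weights $\beta_x$, $\beta_m$ and $\beta_y$ was initialised based on the physics (see Section 4.3).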
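A minimal sketch of how such a multi-term loss could be assembled is given below. The function and argument names are hypothetical, the choice to penalise the final iterate for x and m while accumulating the k-space term over all iterations is an assumption of this sketch, and the weights in the usage lines are placeholders rather than the values used in the paper.

```python
def mse(a, b):
    """Mean squared error between two equally long sequences."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b)) / len(a)


def multi_term_loss(x_true, m_true, y, xs, ms, forward_op, w_x, w_m, w_y):
    """Weighted sum of image, property-map and k-space consistency terms.

    xs and ms hold the iterates produced by the unrolled network; the last
    entries are the final reconstructions.
    """
    image_term = mse(x_true, xs[-1])
    map_term = mse(m_true, ms[-1])
    kspace_term = sum(mse(y, forward_op(x_t)) for x_t in xs)
    return w_x * image_term + w_m * map_term + w_y * kspace_term


# Toy check: two iterates, identity forward operator, placeholder weights.
loss = multi_term_loss(
    x_true=[1.0, 2.0], m_true=[1.0], y=[1.0, 2.0],
    xs=[[0.0, 2.0], [1.0, 2.0]], ms=[[0.0], [1.0]],
    forward_op=lambda x: x, w_x=1.0, w_m=1.0, w_y=0.5,
)
```

Here only the first iterate disagrees with the measurements, so the loss reduces to the weighted k-space term of that iterate.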
4 Numerical experiments
4.1 Anatomical dataset
We construct a dataset of brain scans acquired using a 1.5T GE HDxT scanner with an 8-channel receive-only head RF coil. To set ground-truth (GT) values for the T1, T2 and PD parameters, gold-standard anatomical maps were acquired using the MAGIC quantification protocol [22]. Ground-truth quantitative maps were acquired from 8 healthy volunteers (16 axial brain slices each, at the spatial resolution of pixels). From these parametric maps, we then construct the TSMIs and MRF measurements using the MRF acquisition protocol described below to form the training/testing tuples. Data from 7 subjects were used for training our models, and one subject was held out for performance testing. We augmented the training data to a total of 224 samples using random rotations (with uniformly distributed angles) and left-right flipping of the GT maps. Training batches at each learning epoch were corrupted by i.i.d. Gaussian noise of 30 dB SNR added to $y$ (we similarly add noise to the k-space test data).
4.2 MRF acquisition
Our experiments use an excitation sequence which jointly encodes T1 and T2 values using an inversion pulse followed by a linearly ramped-up flip angle schedule, i.e. a truncated version of the sequence used in [16, 14]. Following [16], we set the acquisition parameters Tinv = 18 msec (inversion time), fixed TR = 10 msec (repetition time), and TE = 0.46 msec (echo time). Spiral readouts subsample the k-space frequencies (the Cartesian FFT grid) across 200 repetition times. The sampled spatial frequencies are quantised to the nearest FFT grid points in each timeframe. In every repetition, similar to [21], the spiral pattern rotates in order to subsample new k-space frequencies. Given the anatomical T1, T2 and PD maps, we simulate magnetic responses using the Extended Phase Graph (EPG) formalism [30], construct the TSMI and k-space measurement datasets, and use them for training and retrospective validation.
4.3 Reconstruction algorithms
Two DM baselines, namely the non-iterative Fast Group Matching (FGM) [6] and the model-based iterative algorithm BLIP empowered by FGM’s fast searches, were used for comparison. For this, an MRF dictionary of 113’781 fingerprints was simulated over a dense grid of (T1, T2) = [100:10:4000] × [20:2:600] msec values. We implemented FGM searches on GPU using 100 groups for clustering this dictionary. The BLIP algorithm uses a backtracking step-size search and runs for a maximum of 20 iterations unless it converges earlier. Further, we compared against the related deep learning MRF baselines MRFCNN [8] and SCQ [12]. In particular, MRFCNN is a fully convolutional network and SCQ mainly uses three U-Nets to separately infer the T1, T2 and PD maps. The input to these networks is the dimension-reduced back-projected TSMIs, and their training losses only consider quantitative maps consistency, i.e. the second term in (6).
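The dictionary size follows directly from the grid. The short check below counts the (T1, T2) pairs, reading the MATLAB-style ranges as inclusive of both end points (an assumption about the notation that the stated total supports).

```python
t1_values = range(100, 4000 + 1, 10)   # 100, 110, ..., 4000 msec
t2_values = range(20, 600 + 1, 2)      # 20, 22, ..., 600 msec

n_fingerprints = len(t1_values) * len(t2_values)
print(len(t1_values), len(t2_values), n_fingerprints)  # 391 291 113781
```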
We trained PGDNet for two settings of the number of recurrent iterations (one of which is 5) to learn an appropriate proximal encoder and the step sizes. The architectures of the encoder and Bloch networks are illustrated in Fig. 1. Similar to [14], the MRF dictionary was used for pretraining the Bloch decoder that embeds a differentiable model for generating Bloch magnetic responses. A compact shallow network with one hidden layer and filters (for pixel-wise processing) implements our Bloch model [14]. On the other hand, our encoder has two residual blocks with filters (for de-aliasing) followed by three convolutional layers with filters for quantitative inference. The inputs were normalised to a fixed PD range; smaller loss weights were used for x and y since they have higher energy than PD, and the relative weights of x and y were set to account for x’s norm being larger than y’s; the quantitative parameters typically exhibit different ranges, justifying their relative weightings in the loss to balance these terms. The final hyperparameters were selected via a multiscale grid search to minimise the error w.r.t. the ground truth. We used the ADAM optimiser with 2000 epochs and mini-batch size 4. We pretrained our encoder using back-projected TSMIs to initialise the recurrent training, and also to compare the encoder-alone predictions to the PGDNet. All algorithms use a low-dimensional MRF subspace representation for temporal-domain dimensionality reduction. The input and output channels are respectively 10 and 3 for MRFCNN, SCQ and our encoder. All networks were implemented in PyTorch, and trained and tested on NVIDIA 2080Ti GPUs.
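Table 1: Reconstruction accuracy (NRMSE, SSIM, MAE), runtime and memory for the MRF baselines and the proposed PGDNet.
Method | NRMSE T1 | NRMSE T2 | NRMSE PD | SSIM T1 | SSIM T2 | SSIM PD | MAE T1 (msec) | MAE T2 (msec) | time (sec) | memory (MB)
FGM | 0.475 | 0.354 | 1.12 | 0.614 | 0.652 | 0.687 | 350.0 | 14.6 | 1.29 | 8.81
BLIP+FGM | 0.230 | 0.545 | 0.073 | 0.886 | 0.880 | 0.984 | 91.7 | 8.0 | 79.28 | 8.81
MRFCNN | 0.155 | 0.158 | 0.063 | 0.943 | 0.972 | 0.987 | 80.3 | 5.4 | 0.083 | 4.72
SCQ | 0.172 | 0.177 | 0.064 | 0.929 | 0.967 | 0.984 | 91.7 | 6.1 | 0.132 | 464.51
Encoder alone | 0.142 | 0.155 | 0.065 | 0.948 | 0.973 | 0.987 | 77.1 | 5.6 | 0.067 | 0.55
PGDNet () | 0.104 | 0.138 | 0.050 | 0.973 | 0.979 | 0.991 | 59.9 | 5.0 | 0.078 | 0.57
PGDNet () | 0.100 | 0.132 | 0.045 | 0.975 | 0.981 | 0.992 | 50.8 | 4.6 | 0.103 | 0.57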
4.4 Results and discussions
Table 1 and Figure 2 compare the performance of the MRF baselines against our proposed PGDNet for two settings of the number of recurrent iterations (one of which is 5). We also include inference results using the proposed encoder alone, without proximal iterations. Reconstruction performance was measured by the Normalised RMSE, the MAE, the Structural Similarity Index Metric (SSIM) [29], the storage required for the MRF dictionary (in DM methods) or for the networks, and the algorithm runtimes averaged over the test image slices.
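For concreteness, the sketch below shows one common way to compute two of these metrics for a single map. The exact normalisation used for the reported NRMSE is not restated here, so the definition below (error norm divided by the norm of the ground truth) is an assumption, and the function names and toy values are illustrative.

```python
import math


def nrmse(reference, estimate):
    """Root of the squared error energy relative to the reference energy."""
    error = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    energy = sum(r ** 2 for r in reference)
    return math.sqrt(error / energy)


def mae(reference, estimate):
    """Mean absolute error, in the units of the map (e.g. msec for T1, T2)."""
    return sum(abs(r - e) for r, e in zip(reference, estimate)) / len(reference)


t1_true = [800.0, 1200.0, 4000.0]   # toy T1 values in msec
t1_pred = [780.0, 1230.0, 4000.0]
print(nrmse(t1_true, t1_pred), mae(t1_true, t1_pred))
```

The NRMSE is dimensionless and dominated by large-valued voxels, whereas the MAE is reported in msec and weights all voxels equally, which is why the two can rank methods differently.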
The non-iterative FGM results in incorrect maps due to the severe undersampling artefacts. The model-based BLIP iterations improve on this; however, lacking spatial regularisation, BLIP has limited accuracy and cannot fully remove aliasing artefacts (e.g. see the T2 maps in Figure 2) despite 20 iterations and a very long runtime. In contrast, all deep learning methods outperform BLIP not only in accuracy but also in having 2 to 3 orders of magnitude faster reconstruction times, an important advantage of the learning-based methods. The proposed PGDNet consistently outperforms all baselines, including DM and learning-based methods, over all defined accuracy metrics. This is achieved by learning an effective spatiotemporal model (only) for the proximal operator, i.e. the encoder and Bloch networks, by directly incorporating the physical acquisition model H into the recurrent iterations to avoid over-parameterisation of the overall inference model, and by enforcing reconstructions to be consistent with the Bloch dynamics and the k-space data through the multi-term training loss (6). MRFCNN and SCQ over-parametrise the inference with model sizes that are 1 and 3 orders of magnitude larger, respectively (SCQ requires more memory than DM), and are unable to achieve PGDNet’s accuracy, e.g. see the corresponding over-smoothed T2 maps in Fig. 2. Finally, we observe that despite having roughly the same model size (storage), the encoder-alone predictions are not as accurate as the results of the PGDNet’s recurrent iterations. Similar to proximal GD, PGDNet is expected to converge to a fixed point. By increasing the number of iterations we observe that the PGDNet’s accuracy consistently improves at the cost of an acceptably longer inference time. However, the accuracy gains from further iterations are marginal, suggesting the method’s fast convergence.
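The orders-of-magnitude statements above can be checked against the runtime and memory columns of Table 1; the snippet below simply takes base-10 logarithms of the relevant ratios (the variable names are illustrative).

```python
import math

# Values copied from Table 1 (memory in MB, time in sec).
memory_mb = {"MRFCNN": 4.72, "SCQ": 464.51, "PGDNet": 0.57}
time_sec = {"BLIP+FGM": 79.28, "MRFCNN": 0.083, "SCQ": 0.132}

for name in ("MRFCNN", "SCQ"):
    ratio = memory_mb[name] / memory_mb["PGDNet"]
    print(f"{name}: {ratio:.0f}x larger model, 10^{math.log10(ratio):.1f}")

for name in ("MRFCNN", "SCQ"):
    speedup = time_sec["BLIP+FGM"] / time_sec[name]
    print(f"{name}: {speedup:.0f}x faster than BLIP, 10^{math.log10(speedup):.1f}")
```

The model-size ratios come out at roughly 8x and 815x, i.e. close to one and three orders of magnitude, and the two speed-ups over BLIP at roughly 955x and 600x, i.e. just under three orders of magnitude.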
5 Conclusions
In this work we showed that the consistency of the computed quantitative maps with respect to the physical forward acquisition model and the Bloch dynamics is important for reliably solving the MRF inverse problem using compact deep neural networks. For this, we proposed PGDNet, a learned model-based iterative reconstruction framework that directly incorporates the forward acquisition and Bloch dynamic models within a recurrent learning mechanism with a multi-term training loss. The PGDNet adopts a data-driven neural proximal model for spatiotemporal processing of the MRF data, TSMI de-aliasing and quantitative inference. A chief advantage of this model is its compactness (a small number of weights/biases to tune), which may make it particularly suitable for supervised training using scarce quantitative MRI datasets. Through our numerical validations we showed that the proposed PGDNet achieves superior quantitative inference accuracy, a much smaller storage requirement, and a runtime comparable to the recent deep learning MRF baselines, while being much faster than the MRF fast dictionary matching schemes. In future work, we plan to evaluate the method on non-simulated scanner datasets with greater diversity and possible pathologies to further validate its potential for clinical usage.
Acknowledgements
The authors would like to thank Pedro Gomez, Carolin Prikl and Marion Menzel from GE Healthcare in Munich for useful discussions and for the quantitative anatomical maps dataset. DC and MD are supported by the ERC C-SENSE project (ERC-ADG-2015-694888).
References
[1] (2017) Solving ill-posed inverse problems using iterative deep neural networks. Inverse Problems 33 (12), pp. 124007.
[2] (2018) MoDL: model-based deep learning architecture for inverse problems. IEEE transactions on medical imaging 38 (2), pp. 394–405.
[3] (2018) Low rank alternating direction method of multipliers reconstruction for mr fingerprinting. Magnetic resonance in medicine 79 (1), pp. 83–96.
[4] (2018) Magnetic resonance fingerprinting reconstruction via spatiotemporal convolutional neural networks. In International Workshop on Machine Learning for Medical Image Reconstruction, pp. 39–46.
[5] (2019) On the spatial and temporal influence for the reconstruction of magnetic resonance fingerprinting. In International Conference on Medical Imaging with Deep Learning, pp. 27–38.
[6] (2015) Fast group matching for mr fingerprinting reconstruction. Magnetic resonance in medicine 74 (2), pp. 523–528.
[7] (2019) Deep decomposition learning for inverse imaging problems. arXiv preprint arXiv:1911.11028.
[8] (2019) Deep fully convolutional network for mr fingerprinting. In International Conference on Medical Imaging with Deep Learning (MIDL), London, United Kingdom, 8–10 Jul.
[9] (2018) MR fingerprinting deep reconstruction network (drone). Magnetic resonance in medicine 80 (3), pp. 885–894.
[10] (2014) A compressed sensing framework for magnetic resonance fingerprinting. SIAM Journal on Imaging Sciences 7 (4), pp. 2623–2656.
[11] (2017) Matrix completion-based reconstruction for undersampled magnetic resonance fingerprinting data. Magnetic resonance imaging 41, pp. 41–52.
[12] (2019) Deep learning for fast and spatially-constrained tissue quantification from highly-accelerated data in magnetic resonance fingerprinting. IEEE transactions on medical imaging.
[13] (2019) RCA-U-Net: residual channel attention U-Net for fast tissue quantification in magnetic resonance fingerprinting. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 101–109.
[14] (2020) Compressive mri quantification using convex spatiotemporal priors and deep autoencoders. arXiv preprint arXiv:2001.08746.
[15] (2019) Geometry of deep learning for magnetic resonance fingerprinting. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 7825–7829.
[16] (2020) Rapid three-dimensional multiparametric mri with quantitative transient-state imaging. arXiv preprint arXiv:2001.07173.
[17] (2017) Deep learning for magnetic resonance fingerprinting: a new approach for predicting quantitative parameter values from time series. Studies in health technology and informatics 243, pp. 202.
[18] (2017) Deep learning for magnetic resonance fingerprinting: a new approach for predicting quantitative parameter values from time series. In GMDS, pp. 202–206.
[19] (2019) RinQ fingerprinting: recurrence-informed quantile networks for magnetic resonance fingerprinting. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 92–100.
[20] (2015) MR fingerprinting using fast imaging with steady state precession (fisp) with spiral readout. Magnetic resonance in medicine 74 (6), pp. 1621–1631.
[21] (2013) Magnetic resonance fingerprinting. Nature 495 (7440), pp. 187.
[22] (2015) New technology allows multiple image contrasts in a single scan. GESIGNAPULSE.COM/MR SPRING, pp. 6–10.
[23] (2018) Neural proximal gradient descent for compressive imaging. In Advances in Neural Information Processing Systems, pp. 9573–9583.
[24] (2014) SVD compression for magnetic resonance fingerprinting in the time domain. IEEE transactions on medical imaging 33 (12), pp. 2311–2322.
[25] (2019) Magnetic resonance fingerprinting using recurrent neural networks. In IEEE Intl. Symposium on Biomedical Imaging (ISBI), pp. 1537–1540.
[26] (2017) One network to solve them all–solving linear inverse problems using deep projection models. In Proceedings of the IEEE International Conference on Computer Vision, pp. 5888–5897.
[27] (2019) HYDRA: hybrid deep magnetic resonance fingerprinting. Medical physics 46 (11), pp. 4951–4969.
[28] (2017) Better than real: complex-valued neural nets for MRI fingerprinting. arXiv preprint arXiv:1707.00070.
[29] (2004) Image quality assessment: from error visibility to structural similarity. IEEE transactions on image processing 13 (4), pp. 600–612.
[30] (2015) Extended phase graphs: dephasing, rf pulses, and echoes - pure and simple. Journal of Magnetic Resonance Imaging 41 (2), pp. 266–295.
[31] (2018) Improved magnetic resonance fingerprinting reconstruction with low-rank and subspace modeling. Magnetic resonance in medicine 79 (2), pp. 933–942.