Rethinking Medical Image Reconstruction via Shape Prior, Going Deeper and Faster: Deep Joint Indirect Registration and Reconstruction

12/16/2019 ∙ by Jiulong Liu, et al.

Indirect image registration is a promising technique to improve image reconstruction quality by providing a shape prior for the reconstruction task. In this paper, we propose a novel hybrid method that seeks to reconstruct high quality images from few measurements whilst requiring low computational cost. With this purpose, our framework intertwines the indirect registration and reconstruction tasks in a single functional. It is based on two major novelties. Firstly, we introduce a model based on deep nets to solve the indirect registration problem, in which the inversion and registration mappings are recurrently connected through a fixed-point iteration based sparse optimisation. Secondly, we introduce specific inversion blocks, which use the explicit physical forward operator, to map the acquired measurements to the image reconstruction. We also introduce registration blocks based on deep nets to predict the registration parameters and warp transformation accurately and efficiently. We demonstrate, through extensive numerical and visual experiments, that our framework significantly outperforms classic reconstruction schemes and another bi-task method in terms of both image quality and computational time. Finally, we show the generalisation capabilities of our approach by demonstrating its performance on fast Magnetic Resonance Imaging (MRI), sparse-view computed tomography (CT) and low-dose CT with measurements much below the Nyquist limit.




1 Introduction

Image reconstruction and registration are two fundamental tasks in medical imaging. They are necessary to gain better insights in different applications, including diagnosis, surgery planning and radiotherapy (e.g. Alp et al. (1998); Wein et al. (2008); Crum et al. (2004); Smit et al. (2016)), to mention just a few. For several medical imaging modalities, for example Magnetic Resonance Imaging (MRI), it is highly desirable to reduce the number of acquired measurements to avoid image degradation such as geometric distortions and blurring effects Sachs et al. (1995); Zaitsev et al. (2015). The purpose is to deal with the central problem in MRI: the long acquisition time. However, performing these tasks from undersampled and highly corrupted measurements becomes an even more challenging problem, yet one of great interest from both the theoretical and practical points of view.

There have been different attempts in the community to perform image reconstruction and registration, in which the two tasks are performed either separately or, most recently, jointly. For image reconstruction, the majority of algorithmic approaches follow the notion of Compressed Sensing (CS), e.g. Lustig et al. (2007); Liang (2007); Lingala et al. (2011); Otazo et al. (2015); Zhang et al. (2015). Most recently, there has been a growing interest in exploiting the similarity of image structures of to-be-registered images as a shape prior, e.g. Liu et al. (2015), and in deep learning based reconstruction approaches, e.g. Sun et al. (2016); Hyun et al. (2018); Hammernik et al. (2018). For a detailed survey of image reconstruction, we refer the reader to Ravishankar et al. (2019).

For image registration, which seeks to find a mapping that aligns two or more images, the body of literature has reported promising results. Existing methods can be roughly divided into rigid and deformable algorithmic approaches. Whilst rigid registration, e.g. Adluru et al. (2006); Wong et al. (2008); Johansson et al. (2018), has shown promising results, it is not robust enough to describe complex physiological motions. Deformable registration offers greater opportunities to describe complex motion, for example Beg et al. (2005); Cao et al. (2005); Vercauteren et al. (2009). We refer the reader to Sotiras et al. (2013) for an extensive review of deformable registration. More recently, deformable image registration has also benefited from the potential of deep learning, e.g. Yang et al. (2017); Shen et al. (2019); Balakrishnan et al. (2019); Haskins et al. (2019). However, these approaches assume that the given images are already reconstructed.

A commonality of the aforementioned approaches is that they perform the reconstruction and registration tasks separately. Very recent developments in the area, e.g. Aviles-Rivero et al. (2018); Corona et al. (2019), have shown that performing those tasks jointly can reduce error propagation, improving accuracy whilst achieving better generalisation capabilities Caruana (1997). However, a major bottleneck of such joint models is their computational complexity, as they often seek to solve highly non-convex optimisation problems. Motivated by these drawbacks in the literature, we address the problem of how to obtain higher quality reconstructed and registered images from noisy and undersampled MRI measurements at low computational cost.

In this work, we address the previous question by proposing a new framework for simultaneous reconstruction and registration from corrupted and undersampled MRI data. Our approach is framed as a deep joint model, in which these two tasks are intertwined in a single optimisation model. It benefits from the theoretical guarantees of large deformation diffeomorphic metric mapping (LDDMM) and the powerful performance of deep learning. Our modelling hypothesis is that by providing a shape prior (i.e. the registration task) to the reconstruction task, one can boost the overall performance of the final reconstruction. More precisely, our framework seeks to learn a network parametrised mapping , where is the image to be reconstructed and are the template and target images to be registered.

We remark that, unlike the works of Yang et al. (2017); Shen et al. (2019); Balakrishnan et al. (2019); Haskins et al. (2019), our approach follows a different philosophy, which is based on three major differences. Firstly, we address the problem of indirect registration, in which the target image is unknown but encoded in the indirect corrupted measurements (i.e. raw data). Secondly, our ultimate goal is to improve the final image reconstruction through a shape prior (i.e. the registration task) instead of evaluating the tasks separately. Thirdly, unlike the work of Lang et al. (2018), we gain further computational efficiency and reconstruction quality through our registration blocks based on deep nets.

We highlight that computing image reconstruction and indirect registration simultaneously is even more challenging than performing the reconstruction and registration separately. This is because is not explicitly given and is encoded in a corrupted measurement, and the general physical forward operators (e.g. Fourier and Radon transforms) are not trivial to learn Zhu et al. (2018). Therefore, building an end-to-end parameterised mapping for inverse problems is not straightforward with standard deep nets. Motivated by the existing shortcomings in the body of literature, in this work we propose a novel framework that, to the best of our knowledge, is the first hybrid method (i.e. a combination of model-based and deep-learning based approaches) that intertwines reconstruction and indirect registration. Although we emphasise the application to fast MRI, we also show generalisation capabilities using Computerised Tomography (CT) data. Whilst this is a relevant part of our approach, our contributions are:

  • We propose a novel mathematically well-motivated and computationally tractable framework for simultaneous reconstruction and indirect registration, in which we highlight:

    • A framework based on deep nets for solving indirect registration efficiently, in which the inversion and registration mappings are recurrently connected through a fixed-point iteration based sparse optimisation.

    • We introduce two types of blocks for the efficient numerical solution of our bi-task framework. The first are specific inversion blocks that use the explicit physical forward operator to map the acquired measurements to the image reconstruction. The second are registration blocks based on deep nets that predict the registration parameters and warping transformation.

  • We exhaustively evaluate our framework with a range of numerical results and for several applications including fast MRI, sparse view computerised tomography (CT) and low dose CT.

  • We show that the carefully selected components in our framework mitigate major drawbacks of traditional reconstruction algorithms, resulting in a significant increase in image quality whilst substantially decreasing the computational cost.

2 When Reconstruction Meets LDDMM: A Joint Model

In this section, we first introduce the tasks of image reconstruction and registration separately, and then, we describe how these two tasks can be cast in a unified framework.

Mathematically, the task of reconstructing a medical image modality, , from a set of measurements reads:


where is the forward operator associated with the acquired measurement ; and is the inherent noise. To deal with the ill-posedness of (1), one can cast it as a variational problem: , where is the data fidelity term, d is a regularisation term to restrict the space of solutions, and is a positive parameter balancing the influence of both terms. The task of registering a template image, , to a target one, , can in turn be cast as an optimisation problem whose functional can be expressed as:


where denotes a deformation map and regularises the deformation map. In general, the registration problem is ill-posed, and a regulariser, , is necessary to obtain a reliable solution. There are several methods proposed in the literature to regularise the deformation mapping Sotiras et al. (2013). One well-established algorithmic approach, due to its desirable mathematical properties, is Large Deformation Diffeomorphic Metric Mapping (LDDMM) Cao et al. (2005).

In the LDDMM setting, the deformation map is assumed to be invertible (to make the deformation physically meaningful), and both and should be sufficiently smooth, i.e. , which is defined as:


The forms a group with the identity mapping as the neutral element. When small perturbations of the identity mapping are applied to , at a particular time point , the deformation at the next time point becomes , which can be described by the following difference equation:


and leads to a continuous-time flow equation, which reads:


LDDMM is a PDE constrained optimisation problem, which can be formulated as:


where is a self-adjoint differential operator, whose numerical solution can be given via Euler-Lagrange equations Beg et al. (2005). Let the momentum be the dual of velocity, i.e. , and the inverse of then (6) can be expressed as a function of the momentum as:


From an optimisation point of view, instead of solving (6) over all possible velocities , one can apply the shooting formulation Vialard et al. (2012) and account only for those with least norm for a given . Now, deriving the Euler-Lagrange equation for the regularisation term , one obtains the Euler-Poincaré equation Holm et al. (1998):


where the adjoint action and the conjoint actions is defined via . Therefore, (7) can be efficiently optimised over via Geodesic shooting. It can now be expressed as:


We are interested in performing reconstruction and registration simultaneously, so we now turn to describe how these two tasks can be intertwined in a unified framework. Consider the target image to be encoded in a set of measurements ; then one can join these two tasks, i.e. (1) and (2), as a single optimisation problem, which reads:


One can naturally rewrite (10) using LDDMM via geodesic shooting (9). This results in the following expression:


where is the inverse of . However, a potential shortcoming of (11) is that its solution via the Euler-Lagrange method is computationally expensive. In the next section, we describe how (11) can be efficiently solved using deep learning, in particular using a deep nets parametrised Douglas-Rachford iteration Lions and Mercier (1979).
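The idea of indirect registration, where the target image is only seen through its measurements, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the model of this paper: the deformation family is reduced to integer circular shifts, the forward operator `forward` simply keeps every fourth sample, and the quadratic penalty `lam * s * s` stands in for the deformation regulariser.

```python
import math

def template(n):
    """Smooth 1D bump used as the template image (toy choice)."""
    return [math.exp(-((i - n / 4) ** 2) / 18.0) for i in range(n)]

def shift(signal, s):
    """Toy deformation: circular shift by s samples."""
    n = len(signal)
    return [signal[(i - s) % n] for i in range(n)]

def forward(signal, step=4):
    """Hypothetical forward operator: keep every `step`-th sample."""
    return signal[::step]

def indirect_register(i0, y, lam=1e-4, max_shift=30):
    """Grid search over shifts, comparing in measurement space only."""
    best_s, best_cost = 0, float("inf")
    for s in range(-max_shift, max_shift + 1):
        pred = forward(shift(i0, s))
        misfit = sum((p - q) ** 2 for p, q in zip(pred, y))
        cost = misfit + lam * s * s  # data fidelity + deformation penalty
        if cost < best_cost:
            best_s, best_cost = s, cost
    return best_s

n = 64
i0 = template(n)
true_shift = 10
y = forward(shift(i0, true_shift))  # the target is only observed through y
s_hat = indirect_register(i0, y)
reconstruction = shift(i0, s_hat)   # warped template acts as the reconstruction
```

In this toy setting the warped template doubles as the reconstructed image, which is the sense in which registration supplies a shape prior. The grid search is only viable because the deformation has a single parameter; the cost of searching over rich deformation spaces is what motivates the learned approach described next.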

Figure 1: Workflow of our proposed framework, in which the simultaneous reconstruction and registration is achieved using deep nets parametrised Douglas-Rachford iteration with stages () where the is initialized by which can be reconstructed by a conventional method such as total variation regularised reconstruction.

3 Deep Nets Parametrised Douglas-Rachford Fixed-point Iteration of Sparsity Optimization (SOFPI-DR-Net) for Simultaneous Reconstruction and Registration

In this section, we describe in detail our novel framework that joins two tasks in a unified optimisation problem. We then demonstrate that it can be solved efficiently by splitting our optimisation model into more tractable sub-problems. We also define our inversion and registration blocks based on deep nets. Fig. 1 displays an overview of our proposed framework.

We remind the reader that we seek to solve (11) in a computationally tractable manner. The model (11) is equivalent to:


An efficient manner to solve (12) is via the Alternating Direction Method of Multipliers (ADMM) / Douglas-Rachford splitting, in which one can break (12) into more computationally tractable sub-problems. Therefore, we solve (12) via alternating minimisation, which yields the following sub-problems:


We now give more details on the solution of each sub-problem. The first sub-problem [] can be solved by a general inversion method such as the conjugate gradient method as:


However, solving the second sub-problem [] is similar to LDDMM, and therefore, solving it is still computationally expensive. The solution is denoted as:


The problem (13) can be also rewritten as a fixed-point iteration as:


and then one can obtain:




Based on the update of along with (16), (17) and (18), a fixed-point iteration for (13) reads:


This fixed-point iteration is also called the Douglas-Rachford iteration Lions and Mercier (1979). We parameterise the inversion mapping and the registration mapping of the Douglas-Rachford iteration (19). For , a learnable inversion - with the parameter in optimisation model (13) considered to be either learnable or manually tunable - is used in the fixed-point iteration (19). Whilst for the registration mapping, , a parameterised is substituted into the fixed-point iteration (19). To use the LDDMM framework to regularise the registration parameters, we use consisting of a momentum prediction neural net instead of searching for the momentum by (11). Moreover, a shooting-warping neural net , which mimics the shooting and warping in (11), is used. Finally, our framework for parameterising the algorithm (11) with stages is obtained by computing:


for . We now give more details on the Deep Nets used for , and in each stage.
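To make the structure of a Douglas-Rachford fixed-point iteration concrete, the following minimal sketch applies it to a generic sparse problem, min over x of 0.5·||x − b||² + lam·||x||₁. This is a textbook instance chosen for illustration and not the iteration (19) of this paper: `prox_data` merely stands in for an inversion step, `soft_threshold` for the second mapping, and the step size `t` is an assumption.

```python
def soft_threshold(u, tau):
    """Proximal map of tau * |.|, applied elementwise."""
    return [max(abs(ui) - tau, 0.0) * (1 if ui > 0 else -1) for ui in u]

def prox_data(z, b, t):
    """Proximal map of t * 0.5 * ||x - b||^2 (stand-in for an inversion step)."""
    return [(zi + t * bi) / (1.0 + t) for zi, bi in zip(z, b)]

def douglas_rachford(b, lam, t=1.0, iters=200):
    """Douglas-Rachford iteration for min_x 0.5*||x - b||^2 + lam*||x||_1."""
    z = [0.0] * len(b)
    for _ in range(iters):
        x = prox_data(z, b, t)                           # first mapping
        reflected = [2 * xi - zi for xi, zi in zip(x, z)]
        v = soft_threshold(reflected, t * lam)           # second mapping
        z = [zi + vi - xi for zi, vi, xi in zip(z, v, x)]  # fixed-point update
    return prox_data(z, b, t)

b = [3.0, 0.5, -2.0, 0.0]
x_star = douglas_rachford(b, lam=1.0)  # converges to the soft-thresholding of b
```

Each pass alternates two mappings and a correction of the auxiliary variable `z`; unrolling a fixed number of such passes, with the two mappings replaced by learnable blocks, gives the stage-wise structure described above.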

3.1 The Inversion Operator and its Backward Gradients

We remark that we continue using the physical forward operator for inversion (instead of a neural net parameterised forward operator), and therefore, the analytic inversion can be obtained by solving the first sub-problem of (13), which reads:


One can numerically solve (21) by conjugate gradient. With this purpose, the derivatives for can be obtained by differentiating the following expression:


we then get:


Then the derivatives of are given by:


To give the backward gradients for the backpropagation algorithm, let

- then the derivatives of with respect to , and can be correspondingly computed by:


For the inversion , one can compute the derivatives of with respect to , and by applying conjugate gradient.
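As a minimal sketch of how such an inversion and its backward gradients can both be obtained with conjugate gradient, assume the sub-problem has the common quadratic form min over x of ||A x − y||² + rho·||x − z||², so that x solves (AᵀA + rho·I) x = Aᵀy + rho·z. This form, the toy matrix `A`, the value of `rho` and all function names are assumptions for illustration; the operators of this paper are not reproduced.

```python
def matvec(M, v):
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in M]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def conjugate_gradient(apply_H, b, iters=50, tol=1e-24):
    """Solve H x = b for a symmetric positive definite operator H."""
    x = [0.0] * len(b)
    r = list(b)
    p = list(r)
    rs = dot(r, r)
    for _ in range(iters):
        if rs < tol:
            break
        Hp = apply_H(p)
        alpha = rs / dot(p, Hp)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * hi for ri, hi in zip(r, Hp)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x

A = [[1.0, 2.0, 0.0], [0.0, 1.0, -1.0]]  # toy forward operator (assumption)
At = [list(col) for col in zip(*A)]
rho = 0.5

def apply_H(v):
    """H v = A^T A v + rho v, using only products with A and A^T."""
    AtAv = matvec(At, matvec(A, v))
    return [a + rho * vi for a, vi in zip(AtAv, v)]

def invert(y, z):
    """Forward pass: x = H^{-1} (A^T y + rho z)."""
    rhs = [a + rho * zi for a, zi in zip(matvec(At, y), z)]
    return conjugate_gradient(apply_H, rhs)

def backward(grad_x):
    """Backward pass: H is symmetric, so one extra solve w = H^{-1} dL/dx
    gives dL/dy = A w and dL/dz = rho * w."""
    w = conjugate_gradient(apply_H, grad_x)
    return matvec(A, w), [rho * wi for wi in w]

y, z = [1.0, -1.0], [0.2, 0.0, 0.4]
x = invert(y, z)
grad_y, grad_z = backward([1.0, 0.0, 0.0])  # gradient of the loss L(x) = x[0]
```

The point of the sketch is that the backward pass never forms a matrix inverse: it reuses the same solver with the incoming gradient as right-hand side, so the physical forward operator only needs to be available through products with `A` and its adjoint.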

3.2 A Deep Registration Net for Image Shape Prior

In this subsection, we establish a neural-network-parameterised registration mapping, which serves as the image shape prior for the inversion block. Our motivation comes from recent developments on vector momentum-parameterised deep networks proposed, for example, in Yang et al. (2017); Shen et al. (2019), in which the authors showed promising accuracy and significant speedup in obtaining the initial momentum prediction. With this motivation in mind, in this work, we split the deep registration net into two nets: a momentum prediction net and a shooting-warping net . These nets are applied at each stage . The momentum net is expressed as:


whilst the warp Net reads:


That is, it can be expressed as:


In this work, for the momentum prediction, we use the vector momentum-parameterised stationary velocity field (vSVF) model of Shen et al. (2019). This is displayed in Fig. 2. For the shooting-warping Net , we propose an extension of the momentum Net to a symmetrical-like Net, whose detailed structure can be seen in Fig. 3.

Figure 2: Detailed architecture for the momentum prediction net .
Figure 3: Detailed architecture used for the shooting-warping net .
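The operation that a shooting-warping step is meant to mimic can be pictured with a minimal 1D sketch: a stationary velocity field is integrated by small Euler steps, and the template is warped by composing it with the inverse map. The grid, the constant field `velocity` and the linear interpolation are assumptions for illustration and are unrelated to the network architectures in Figs. 2 and 3.

```python
import math

def integrate_flow(v, x0, steps=100):
    """Euler integration of d(phi)/dt = v(phi) over t in [0, 1], from x0."""
    x, dt = x0, 1.0 / steps
    for _ in range(steps):
        x += dt * v(x)  # one small perturbation of the current position
    return x

def interp(samples, x):
    """Linear interpolation on the grid 0..n-1, clamped at the ends."""
    x = min(max(x, 0.0), len(samples) - 1.0)
    i = min(int(x), len(samples) - 2)
    w = x - i
    return (1 - w) * samples[i] + w * samples[i + 1]

def warp(template, v):
    """Warped image: template composed with the inverse map.
    For a stationary field the inverse map is the flow of -v."""
    return [interp(template, integrate_flow(lambda x: -v(x), float(i)))
            for i in range(len(template))]

def velocity(x):
    """Constant field: the resulting deformation is a translation by 5."""
    return 5.0

n = 50
template = [math.exp(-((i - 20) ** 2) / 20.0) for i in range(n)]
warped = warp(template, velocity)
peak = max(range(n), key=lambda i: warped[i])  # bump moves from 20 to 25
```

Replacing this explicit integration by a network trades the repeated small steps for a single forward pass, which is where the speedup of a learned shooting-warping block would come from.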

3.3 Loss function with momentum regularised via LDDMM

We denote the input template images and acquired measurements as with corresponding ground truth target images denoted as . Moreover, let be the collection of the weights of all registration Nets . We then use the following loss function:




and , which seeks to regularise the registration parameters and guarantees physical meaning of all blocks, is denoted as:


We remark that, in this work, are obtained from target-template pairs by (9). Therefore, all momentum and reconstructed (warped) images can be obtained simultaneously, in which we seek that they approximate the ground-truth gradually stage by stage. That is,




After we obtain the learned network parameter set , the learned network


for , is ready to be used for mapping a given measurement-template data pair to a predicted momentum by the output of the last momentum net, that is:


For estimating , one has two options. As a first option, can be obtained from the output of the last shooting-warping net as:


Alternatively, the predicted momentum can be used to obtain via the shooting equations:


and finally, as a second option, we can get the estimated ground truth image by:


In the experimental results, we include an ablation study to show the benefits of computing using (36) and (38).

4 Experimental Results

In this section, we describe in detail the experiments conducted to validate our proposed framework.

4.1 Data Description

We remark that our approach can be applied to different medical modalities. In this work, we showcase it for MRI, sparse-view CT and low-dose CT.

  • Dataset A [MRI Dataset]: Cardiac cine MRI data coming from realistic simulations generated using the MRXCAT phantom framework Wissmann et al. (2014). The heart beat and respiration parameters were set to 1s and 5s respectively. Moreover, the Matrix size is , heart phases= 24 and coils=12.

  • Dataset B [Sparse-view CT Dataset]: We use the Thoracic 4D Computed Tomography (4DCT) dataset Castillo et al. (2009). The measurements are generated by: with 18 views over , where is the X-ray transform and is normalised to .

  • Dataset C [Low Dose CT Dataset]: As in Dataset B, we use the Thoracic 4D Computed Tomography (4DCT) dataset Castillo et al. (2009). However, the measurements are generated by: with 181 views over and obey an i.i.d. normal distribution,


We remark that the MRI measurements are generated by a partial Fourier transform as: . Here is the noise level, obey an i.i.d. normal distribution, is the ground truth image, is the undersampling operator, and is the Fourier transform. In this work, we retrospectively undersampled the measurements using: radial sampling, 2D random variable-density sampling with fully sampled center radius , and 1D variable-density sampling with a fully sampled center. To show the generalisation capabilities of our proposed approach, we ran it using different sampling rates = {1/5, 1/4, 1/3}.
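The measurement model above (a Fourier transform, followed by undersampling, plus Gaussian noise) can be sketched in 1D as follows. This is a simplified illustration under stated assumptions: the signal is 1D, the transform is a naive DFT, the mask is a random 1D pattern with a fully sampled centre of hypothetical width `center`, and the noise level `sigma` is arbitrary; none of these values are taken from the experiments.

```python
import cmath
import random

def dft(x):
    """Naive discrete Fourier transform (O(n^2)), enough for a small example."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * cmath.pi * j * k / n) for k in range(n))
            for j in range(n)]

def sampling_mask(n, rate, center=4, seed=0):
    """Random 1D mask keeping about `rate` of the frequencies,
    always including the `center` lowest frequencies on each side."""
    rng = random.Random(seed)
    keep = set(range(center)) | set(range(n - center, n))
    others = [j for j in range(n) if j not in keep]
    rng.shuffle(others)
    extra = max(round(rate * n) - len(keep), 0)
    keep |= set(others[:extra])
    return [j in keep for j in range(n)]

def measure(x, mask, sigma, seed=1):
    """Undersampled Fourier coefficients with complex Gaussian noise."""
    rng = random.Random(seed)
    return [c + complex(rng.gauss(0, sigma), rng.gauss(0, sigma))
            for c, m in zip(dft(x), mask) if m]

n = 60
x = [1.0 if 20 <= i < 40 else 0.0 for i in range(n)]  # toy ground truth
mask = sampling_mask(n, rate=1 / 5)
y = measure(x, mask, sigma=0.01)  # 12 of the 60 coefficients are kept
```

Retrospective undersampling of this kind starts from a fully sampled image, so the ground truth needed for the quality metrics is always available.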

4.2 Parameter Selection and Setting Details

In this part, we give further details on the choice of the parameters along with further specifics of how we ran our experimental results.

For the and Nets, we set the number of stages for all our applications: for fast MRI , sparse-view CT, and low-dose CT. Our approach is a GPU-based implementation in PyTorch. The in are set to be learnable, and we also restrict by adding a layer as: , where is a Sigmoid function, is learnable, and to prevent from becoming too big.
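A minimal sketch of such a bounding layer, assuming it takes the common form of an upper bound multiplied by a sigmoid of an unconstrained learnable value. The names `theta` and `upper` and the bound of 2.0 are hypothetical and are not taken from the experiments.

```python
import math

def bounded_parameter(theta, upper):
    """Map an unconstrained value theta to the open interval (0, upper)."""
    return upper / (1.0 + math.exp(-theta))

# However large or small theta becomes during training, the result stays bounded.
values = [bounded_parameter(t, upper=2.0) for t in (-10.0, 0.0, 10.0)]
```

With this kind of reparameterisation the optimiser updates `theta` freely, while the quantity that enters the model can never exceed `upper`.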

We use the Adam algorithm for training with the following parameters: learning rate 1e-4 and 500 epochs. Moreover, for the learned MRI ; sparse-view CT and low-dose CT

Setting for the MRI Case. The temporal cine cardiac data (Dataset A) is used to generate 376 2D image pairs as target-template image pairs, and then the momenta associated with the target-template image pairs are obtained via LDDMM (9) for regularising the momentum prediction Nets in our approach (20). In this work, is normalised to and the noise level is set to . We use an undersampling rate of . In each experiment, 360 measurement-template pairs with 360 target images and 360 momenta are used to train our proposed approach (20), and 16 measurement-template pairs are used for testing by (34). To speed up training, we pretrain the model stage by stage for 500 epochs, and finally train the whole network for 500 epochs.

Setting for the Sparse-view and Low-dose CT Case. We generate 528 2D image pairs as target-template pairs, and then the momentum is obtained via LDDMM (9) for regularising the momentum prediction Net. For the Radon transform, we use the CUDA version of Gao (2012). For training the network (20), 480 measurement-template pairs with 480 target images and 480 momenta are used, whilst for testing (34), 48 measurement-template pairs are used.

4.3 Evaluation Methodology

We evaluate our proposed framework based on the following scheme.

Comparison against other MRI reconstruction schemes. For the first part of our evaluation, we compare our framework against the well-established compressed sensing (CS) reconstruction scheme. We solve the CS scheme with TV, followed by LDDMM, computed sequentially. Furthermore, we ran experiments using three different sampling patterns: radial, 2D random and 1D random (Cartesian). To show generalisation capabilities, we use different sampling rates = {1/5, 1/4, 1/3}.

We report the results of these comparisons based on both qualitative and quantitative results. The former is based on visual assessment of the reconstruction, and the latter on the computation of two well-established metrics: the structural similarity (SSIM) index and the Peak Signal-to-Noise Ratio (PSNR); along with the computational cost given in seconds.
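For reference, PSNR can be computed as in the following minimal sketch, which assumes images are given as flat lists of intensities normalised to [0, 1] (so `peak=1.0`); the toy values are hypothetical. SSIM involves local means, variances and covariances and is omitted here.

```python
import math

def psnr(reference, estimate, peak=1.0):
    """Peak Signal-to-Noise Ratio in dB for two equally sized images."""
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)

reference = [0.0, 0.5, 1.0, 0.5]
estimate = [0.1, 0.5, 0.9, 0.5]
score = psnr(reference, estimate)  # mean squared error 0.005, about 23.01 dB
```

Because PSNR is logarithmic in the mean squared error, a gain of 10 dB corresponds to a tenfold reduction of that error.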

Generalisation capabilities using CT data. To assess generalisation capabilities, we evaluate our framework using data coming from sparse-view CT and low-dose CT. We compare our framework against the classic TV-reconstruction scheme + LDDMM computed sequentially, and against another indirect registration approach, that of Chen and Oktem (2018). We report the comparison using qualitative and quantitative results: visual comparison of the reconstructions along with the error maps, reconstruction quality in terms of PSNR and SSIM, and computational cost.

4.4 Results and Discussion

In this subsection, we demonstrate the capabilities of our framework following the evaluation scheme of subsection 4.3.

Is Our Framework better than a classic MRI Reconstruction Scheme? We begin by evaluating our approach against the classic TV+LDDMM reconstruction scheme. We remark that the classic scheme performs the reconstruction and registration sequentially, whilst our approach computes the MRI reconstruction and indirect image registration simultaneously.

We report both qualitative and quantitative results in Table 1 and Figs. 4, 5 and 6. In Fig. 4, we show nine reconstructed output examples with three different sampling patterns. Visual assessment agrees with the theory of our model: our reconstructions are of higher quality and preserve relevant anatomical parts whilst enhancing fine details and contrast. On closer inspection of these reconstructions, one can see that our framework (using either (36) or (38)) leads to reconstructions with sharper edges and better preservation of fine details than the classic MRI reconstruction scheme. This is further supported by the reported reconstruction errors, in which our approach reported the lowest error values for all reconstructed samples.

Figure 4: MRI Reconstruction outputs and reconstruction errors using Dataset A with sampling rate = . Comparison of our approach vs classic scheme (TV + LDDMM). Our approach reconstructs higher quality images with sharp edges, and preserves fine details and contrast.
Figure 5: MRI Reconstruction outputs and reconstruction errors using Dataset A with sampling rate = and with different sampling patterns. Results from classic scheme (TV+LDDMM) vs our approach. One can see that our reconstructions have higher quality, which is reflected in the reconstruction error plots.
Figure 6: MRI Reconstruction outputs and reconstruction errors using Dataset A with sampling rate = and with different sampling patterns. Reconstructions show that our approach reconstructs higher quality images than the classic scheme TV+LDDMM. This is further supported by the reconstruction error plots, in which our reconstructions reported the lowest error.
| Sampling rate | Pattern | Quantity | TV+LDDMM | Ours (36) | Ours (38) |
|---|---|---|---|---|---|
| 1/5 | Radial | (PSNR, SSIM) | (25.84, 77.36) | (37.90, 93.59) | (35.11, 88.25) |
| 1/5 | Radial | Time Cost (s) | 1.54 | 0.52 | 0.61 |
| 1/5 | 2D Random | (PSNR, SSIM) | (25.06, 77.61) | (36.08, 93.34) | (34.32, 88.38) |
| 1/5 | 2D Random | Time Cost (s) | 1.66 | 0.56 | 0.67 |
| 1/5 | 1D Random | (PSNR, SSIM) | (20.61, 61.31) | (36.10, 93.31) | (34.99, 88.42) |
| 1/5 | 1D Random | Time Cost (s) | 1.51 | 0.51 | 0.63 |
| 1/4 | Radial | (PSNR, SSIM) | (26.52, 78.89) | (38.77, 94.43) | (35.74, 90.18) |
| 1/4 | Radial | Time Cost (s) | 1.60 | 0.57 | 0.63 |
| 1/4 | 2D Random | (PSNR, SSIM) | (25.94, 78.19) | (38.12, 94.42) | (35.70, 90.44) |
| 1/4 | 2D Random | Time Cost (s) | 1.63 | 0.53 | 0.71 |
| 1/4 | 1D Random | (PSNR, SSIM) | (22.02, 65.67) | (37.44, 94.33) | (35.82, 90.18) |
| 1/4 | 1D Random | Time Cost (s) | 1.58 | 0.56 | 0.66 |
| 1/3 | Radial | (PSNR, SSIM) | (26.82, 79.63) | (39.01, 94.63) | (35.77, 90.36) |
| 1/3 | Radial | Time Cost (s) | 1.57 | 0.56 | 0.64 |
| 1/3 | 2D Random | (PSNR, SSIM) | (26.18, 78.77) | (38.79, 94.75) | (35.78, 90.65) |
| 1/3 | 2D Random | Time Cost (s) | 1.47 | 0.49 | 0.63 |
| 1/3 | 1D Random | (PSNR, SSIM) | (22.60, 66.83) | (38.45, 94.42) | (35.84, 90.21) |
| 1/3 | 1D Random | Time Cost (s) | 1.64 | 0.56 | 0.59 |

Table 1: Numerical comparison of our approach vs. other reconstruction schemes using Dataset A, with different sampling patterns and acceleration factors. Results are reported on the testing set. SSIM is given in %. In every setting, Ours (36) attains the best image quality scores and the lowest computational cost.

To show further generalisation capabilities, we ran a range of experiments using different sampling factors = {1/5, 1/4, 1/3}. Reconstruction outputs can be seen in Figs. 4, 5 and 6. One can see that the benefits of our approach described above hold for all sampling factors. That is, our approach preserves small structures, for example the papillary muscles of the heart. Moreover, in a visual comparison between these figures, we notice that our method generalises very well even as the acceleration factor increases, contrary to the classic scheme, which exhibits loss of contrast and blurring. Overall, we show that providing a shape prior, through a registration task, yields higher quality images whilst decreasing the number of measurements needed to form an MR image.

| Dataset | Quantity | TV+LDDMM | Chen and Oktem (2018) | Ours (36) | Ours (38) |
|---|---|---|---|---|---|
| B | (PSNR, SSIM) | (26.71, 0.72) | (30.11, 0.96) | (36.34, 0.97) | (34.48, 0.95) |
| B | Time Cost (s) | 1.82 | 81.37 | 0.76 | 0.87 |
| C | (PSNR, SSIM) | (30.66, 0.86) | (31.41, 0.95) | (39.18, 0.97) | (35.78, 0.96) |
| C | Time Cost (s) | 1.73 | 112.35 | 0.84 | 1.08 |

Table 2: Numerical comparison for sparse-view and low-dose CT datasets (B&C). The displayed results are the averaged accuracy and efficiency on the testing dataset. For both datasets, Ours (36) attains the best image quality scores and the lowest computational cost.

Is a Two-task Model better than a Sequential Model - Does It Pay Off? To further support the aforementioned benefits of our model and for a more detailed quantitative analysis, we report the overall results on Dataset A in Table 1. The results are the average of the image metrics, (PSNR, SSIM), over the testing set of Dataset A with different sampling patterns and sampling rates. We observe that our approach reported a significant improvement in both metrics with respect to the classic TV+LDDMM reconstructions for all accelerations. These results further validate our hypothesis that providing a shape prior substantially improves the reconstruction image quality.

After demonstrating the benefits of our approach quality-wise, we now pose a question: how does our approach perform from a computational point of view? The computational time is displayed in Table 1. One can observe that another major advantage of our model is the computational time: relative to the classic reconstruction scheme, we decrease the computational cost by about 65% on average whilst achieving a substantial improvement in both image quality metrics. Overall, the benefits of our approach are preserved for all datasets and all sampling rates.
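The average saving can be checked directly from the time costs listed in Table 1. The sketch below compares mean run times over the nine settings; taking the ratio of means (rather than the mean of per-setting ratios) is an assumption about how the average is formed.

```python
# Time costs in seconds, read row by row from Table 1 (nine settings).
tv_lddmm = [1.54, 1.66, 1.51, 1.60, 1.63, 1.58, 1.57, 1.47, 1.64]
ours_36 = [0.52, 0.56, 0.51, 0.57, 0.53, 0.56, 0.56, 0.49, 0.56]
ours_38 = [0.61, 0.67, 0.63, 0.63, 0.71, 0.66, 0.64, 0.63, 0.59]

def mean_reduction(baseline, ours):
    """Relative reduction of the mean run time, in percent."""
    return 100.0 * (1.0 - sum(ours) / sum(baseline))

reduction_36 = mean_reduction(tv_lddmm, ours_36)  # about 65.8
reduction_38 = mean_reduction(tv_lddmm, ours_38)  # about 59.4
```

Under this reading, the variant using (36) saves roughly 66% of the run time and the variant using (38) roughly 59%.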

Figure 7: CT reconstruction outputs and reconstruction errors using Datasets B and C. A comparison is displayed between classic reconstruction scheme and our approach. In a closer inspection, one can see that our reconstructions have higher image quality than the compared schemes. This is further supported by the reconstruction error plots, in which our reconstructions display the lowest errors.

Can Our Approach be Applied to other Modalities? Generalisation Capabilities. To demonstrate the generalisation capabilities of our model, we run experiments on both sparse-view and low-dose CT datasets (i.e. Datasets B and C). We remark that, to the best of our knowledge, this is the first reported approach that performs the two tasks as a hybrid model, that is, an approach that combines model-based and deep learning-based models to improve image reconstruction. However, there is a model-based approach that follows a similar philosophy to ours, namely that of Chen and Oktem (2018), which is applied to the CT case. Therefore, we ran our approach and compared it against both the classic CT reconstruction scheme with TV + LDDMM and that of Chen and Oktem (2018).

Figure 8: Visualisation of the predicted momentum. (From left to right) ground truth and predicted ones using Cartesian and radial sampling patterns, and Datasets A, B and C.

We begin by visually evaluating our approach against the compared schemes; the results are displayed in Fig. 7. In that figure, we display two sample outputs using Datasets B and C respectively. On closer inspection of the reconstructions, one can see that the classic TV + LDDMM reconstructions fail to preserve fine details and introduce strong blurring artefacts (see first column). Similarly, the algorithmic approach of Chen and Oktem (2018) shows reconstructions with blurry artefacts and a loss of contrast, texture and fine details. These negative effects are reflected in the reconstruction error plots, in which our reconstructions (last two columns) reported the lowest errors. From these plots, one can see that our approach is able to reconstruct sharp edges whilst keeping fine details and texture.

To further support our approach, we perform further quantitative experiments, which are reported in Table 2. Similarity-wise, we reported the highest values for both the PSNR and SSIM metrics. In particular, we would like to highlight two major strengths of our approach. Firstly, our approach offers a substantial improvement in terms of both image quality metrics. For the PSNR metric, the improvement is substantial compared to both compared approaches. In terms of SSIM, it outperforms the classic TV scheme and readily competes against Chen and Oktem (2018). Secondly, the computational cost is significantly lower than that of the approach of Chen and Oktem (2018) and of the classic reconstruction scheme. Finally, for further visualisation support, we display the predicted momentum in Fig. 8.

5 Conclusion

In this paper, we proposed, for the first time, a hybrid approach for simultaneous reconstruction and indirect registration. We demonstrated that indirect image registration, in combination with deep learning, is a promising technique for providing a shape prior that substantially improves image reconstruction. We also showed that our framework can significantly decrease the computational cost via deep nets.

In particular, we highlighted the potential of combining physics-driven regularisation methods with the powerful performance of deep learning in a unified framework. We showed that our approach improves over existing regularisation methods, yielding higher quality images that preserve relevant anatomical parts whilst avoiding geometric distortions and loss of fine details and contrast. Moreover, we showed that our framework can decrease the computational time by more than 66% whilst reporting the highest image quality metrics. These benefits are consistent over different settings such as acceleration factors, sampling patterns and medical imaging modalities.


Acknowledgements

AIAI gratefully acknowledges the financial support of the CMIH, University of Cambridge, and thanks Noemie Debroux for very helpful discussions.

CBS acknowledges: inspiring and fruitful discussions with Ozan Öktem on the topic of indirect image reconstruction and learned image registration, support from the Leverhulme Trust project “Breaking the nonconvexity barrier”, the Philip Leverhulme Prize, the EPSRC EP/M00483X/1 and EP/S026045/1, the EPSRC Centre EP/N014588/1, the European Union Horizon 2020 research and innovation programmes under the Marie Skłodowska-Curie grant agreement No. 777826 NoMADS and No. 691070 CHiPS, the Cantab Capital Institute for the Mathematics of Information and the Alan Turing Institute.


References

  • G. Adluru, E. V. DiBella, and M. C. Schabel (2006) Model-based registration for dynamic cardiac perfusion MRI. Journal of Magnetic Resonance Imaging. Cited by: §1.
  • S. Alp, M. Dujovny, M. Misra, F. Charbel, and J. Ausman (1998) Head registration techniques for image-guided surgery. Neurological research 20 (1), pp. 31–37. Cited by: §1.
  • A. I. Aviles-Rivero, G. Williams, M. J. Graves, and C. Schonlieb (2018) Compressed sensing plus motion (CS+M): a new perspective for improving undersampled MR image reconstruction. arXiv preprint arXiv:1810.10828. Cited by: §1.
  • G. Balakrishnan, A. Zhao, M. R. Sabuncu, J. Guttag, and A. V. Dalca (2019) VoxelMorph: a learning framework for deformable medical image registration. IEEE transactions on medical imaging. Cited by: §1, §1.
  • M. F. Beg, M. I. Miller, A. Trouvé, and L. Younes (2005) Computing large deformation metric mappings via geodesic flows of diffeomorphisms. International journal of computer vision 61 (2), pp. 139–157. Cited by: §1, §2.
  • Y. Cao, M. I. Miller, R. L. Winslow, and L. Younes (2005) Large deformation diffeomorphic metric mapping of vector fields. IEEE Transactions on Medical Imaging, pp. 1216–1230. Cited by: §1, §2.
  • R. Caruana (1997) Multitask learning. Machine learning 28 (1), pp. 41–75. Cited by: §1.
  • E. Castillo, R. Castillo, J. Martinez, M. Shenoy, and T. Guerrero (2009) Four-dimensional deformable image registration using trajectory modeling. Physics in Medicine & Biology 55 (1), pp. 305. Cited by: item 2, item 3.
  • C. Chen and O. Oktem (2018) Indirect image registration with large diffeomorphic deformations. SIAM Journal on Imaging Sciences 11 (1), pp. 575–617. Cited by: §4.3, §4.4, §4.4, §4.4, Table 2.
  • V. Corona, A. I. Aviles-Rivero, N. Debroux, C. Le Guyader, and C. Schönlieb (2019) Variational multi-task MRI reconstruction: joint reconstruction, registration and super-resolution. arXiv preprint arXiv:1908.05911. Cited by: §1.
  • W. R. Crum, T. Hartkens, and D. Hill (2004) Non-rigid image registration: theory and practice. The British journal of radiology 77 (suppl_2), pp. S140–S153. Cited by: §1.
  • H. Gao (2012) Fast parallel algorithms for the x-ray transform and its adjoint. Medical physics 39 (11), pp. 7110–7120. Cited by: §4.2.
  • K. Hammernik, T. Klatzer, E. Kobler, M. P. Recht, D. K. Sodickson, T. Pock, and F. Knoll (2018) Learning a variational network for reconstruction of accelerated MRI data. Magnetic Resonance in Medicine. Cited by: §1.
  • G. Haskins, U. Kruger, and P. Yan (2019) Deep learning in medical image registration: a survey. arXiv preprint arXiv:1903.02026. Cited by: §1, §1.
  • D. D. Holm, J. E. Marsden, and T. Ratiu (1998) The Euler–Poincaré equations and semidirect products with applications to continuum theories. Adv. Math. Cited by: §2.
  • C. M. Hyun, H. P. Kim, S. M. Lee, S. Lee, and J. K. Seo (2018) Deep learning for undersampled MRI reconstruction. Physics in Medicine & Biology 63 (13), pp. 135007. Cited by: §1.
  • A. Johansson, J. Balter, and Y. Cao (2018) Rigid-body motion correction of the liver in image reconstruction for golden-angle stack-of-stars DCE MRI. Magnetic Resonance in Medicine. Cited by: §1.
  • L. F. Lang, S. Neumayer, O. Öktem, and C. Schönlieb (2018) Template-based image reconstruction from sparse tomographic data. Applied Mathematics & Optimization. Cited by: §1.
  • Z. Liang (2007) Spatiotemporal imaging with partially separable functions. In IEEE International Symposium on Biomedical Imaging: From Nano to Macro, pp. 988–991. Cited by: §1.
  • S. G. Lingala, Y. Hu, E. DiBella, and M. Jacob (2011) Accelerated dynamic MRI exploiting sparsity and low-rank structure: k-t SLR. IEEE Trans Med Imaging 30 (5), pp. 1042–1054. Cited by: §1.
  • P. Lions and B. Mercier (1979) Splitting algorithms for the sum of two nonlinear operators. SIAM Journal on Numerical Analysis 16 (6), pp. 964–979. Cited by: §2, §3.
  • J. Liu, X. Zhang, X. Zhang, H. Zhao, Y. Gao, D. Thomas, D. A. Low, and H. Gao (2015) 5D respiratory motion model based image reconstruction algorithm for 4D cone-beam computed tomography. Inverse Problems. Cited by: §1.
  • M. Lustig, D. Donoho, and J. M. Pauly (2007) Sparse MRI: the application of compressed sensing for rapid MR imaging. Magnetic Resonance in Medicine: An Official Journal of the International Society for Magnetic Resonance in Medicine 58 (6), pp. 1182–1195. Cited by: §1.
  • R. Otazo, E. Candès, and D. K. Sodickson (2015) Low-rank plus sparse matrix decomposition for accelerated dynamic MRI with separation of background and dynamic components. Magn Reson Med, pp. 1125–1136. Cited by: §1.
  • S. Ravishankar, J. C. Ye, and J. A. Fessler (2019) Image reconstruction: from sparsity to data-adaptive methods and machine learning. arXiv preprint arXiv:1904.02816. Cited by: §1.
  • T. S. Sachs, C. H. Meyer, P. Irarrazabal, B. S. Hu, D. G. Nishimura, and A. Macovski (1995) The diminishing variance algorithm for real-time reduction of motion artifacts in MRI. Magnetic Resonance in Medicine 34 (3), pp. 412–422. Cited by: §1.
  • Z. Shen, X. Han, Z. Xu, and M. Niethammer (2019) Networks for joint affine and non-parametric image registration. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4224–4233. Cited by: §1, §1, §3.2, §3.2.
  • N. Smit, K. Lawonn, A. Kraima, M. DeRuiter, H. Sokooti, S. Bruckner, E. Eisemann, and A. Vilanova (2016) Pelvis: atlas-based surgical planning for oncological pelvic surgery. IEEE Transactions on Visualization and Computer Graphics 23 (1), pp. 741–750. Cited by: §1.
  • A. Sotiras, C. Davatzikos, and N. Paragios (2013) Deformable medical image registration: a survey. IEEE Transactions on Medical Imaging. Cited by: §1, §2.
  • J. Sun, H. Li, Z. Xu, et al. (2016) Deep ADMM-Net for compressive sensing MRI. In Advances in neural information processing systems, pp. 10–18. Cited by: §1.
  • T. Vercauteren, X. Pennec, A. Perchant, and N. Ayache (2009) Diffeomorphic demons: efficient non-parametric image registration. NeuroImage. Cited by: §1.
  • F. Vialard, L. Risser, D. Rueckert, and C. J. Cotter (2012) Diffeomorphic 3D image registration via geodesic shooting using an efficient adjoint calculation. International Journal of Computer Vision 97 (2), pp. 229–241. Cited by: §2.
  • W. Wein, S. Brunke, A. Khamene, M. R. Callstrom, and N. Navab (2008) Automatic CT-ultrasound registration for diagnostic imaging and image-guided intervention. Medical image analysis 12 (5), pp. 577–585. Cited by: §1.
  • L. Wissmann, C. Santelli, W. P. Segars, and S. Kozerke (2014) MRXCAT: realistic numerical phantoms for cardiovascular magnetic resonance. Journal of Cardiovascular Magnetic Resonance 16 (1), pp. 63. Cited by: item 1.
  • K. K. Wong, E. S. Yang, E. X. Wu, H. Tse, and S. T. Wong (2008) First-pass myocardial perfusion image registration by maximization of normalized mutual information. Journal of Magnetic Resonance Imaging. Cited by: §1.
  • X. Yang, R. Kwitt, M. Styner, and M. Niethammer (2017) Quicksilver: fast predictive image registration–a deep learning approach. NeuroImage 158, pp. 378–396. Cited by: §1, §1, §3.2.
  • M. Zaitsev, J. Maclaren, and M. Herbst (2015) Motion artifacts in MRI: a complex problem with many partial solutions. Journal of Magnetic Resonance Imaging 42 (4), pp. 887–901. Cited by: §1.
  • T. Zhang, J. M. Pauly, and I. R. Levesque (2015) Accelerating parameter mapping with a locally low rank constraint. Magn Reson Med 73 (2), pp. 655–661. Cited by: §1.
  • B. Zhu, J. Z. Liu, S. F. Cauley, B. R. Rosen, and M. S. Rosen (2018) Image reconstruction by domain-transform manifold learning. Nature 555 (7697), pp. 487. Cited by: §1.