X-Ray Computed Tomography (CT) is a technology that has revolutionized many fields, including medical imaging, by making it possible to investigate the inner structure of bodies non-intrusively while overcoming the main limit of radiography, in which all sense of depth is lost.
An X-Ray CT scan relies on rotating a source and its corresponding detector element around the body until projections have been taken at every angle in the rotation. These projections are then assembled into a sinogram (see figure 1), which we call the measurement data.
One of the most important technologies for X-Ray CT is image reconstruction. To see depth within inner structures in three dimensions non-intrusively, some form of reconstruction function is required to retrieve the original images from the measurement data. In ideal situations, where many samples of the same slice are available and the samples are free of noise and artefacts, existing methods reproduce 2D images of each slice well. Unfortunately, prolonged exposure of the human body to an X-ray source is harmful and can induce cancer. For this reason, we often end up with undersampled data, the effects of which are clear when using classical methods, since noise in the measurement data is greatly amplified in the reconstruction (as can be seen in figure 1).
Deep learning is already having a transformative impact on many fields, from playing games better than grandmasters, to computer vision in self-driving cars, to face and speech recognition. Over the past decade, a trend has emerged of applying deep learning to inverse problems in optical tomography.
In this paper, we give a background on the inverse problem, followed by a brief history of classical approaches to solving it. We then present the current state of the art by investigating the main approaches taken to apply deep learning to image reconstruction and evaluating the papers that have advanced the field for each approach. Next, we take a closer look at the architecture design of a few key landmark papers before finally summarizing the major challenges faced and offering some insight into the future trajectory of the field.
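The acquisition described above can be illustrated with a toy parallel-beam projector. This is a minimal sketch and not the geometry of any real scanner: the function names `project` and `sinogram`, the nearest-neighbour sampling and the 3×3 test image are all assumptions made for illustration.

```python
import math

def project(image, angle_deg):
    """Sum a square image along parallel rays at one rotation angle."""
    n = len(image)
    c = (n - 1) / 2.0                      # centre of rotation
    t = math.radians(angle_deg)
    cos_t, sin_t = math.cos(t), math.sin(t)
    profile = []
    for s in range(n):                     # detector bin
        total = 0.0
        for r in range(n):                 # position along the ray
            u, v = s - c, r - c
            # Rotate (detector, ray) coordinates into image coordinates.
            col = int(round(cos_t * u - sin_t * v + c))
            row = int(round(sin_t * u + cos_t * v + c))
            if 0 <= row < n and 0 <= col < n:
                total += image[row][col]
        profile.append(total)
    return profile

def sinogram(image, angles_deg):
    """Stack one projection per angle; each row is one view."""
    return [project(image, a) for a in angles_deg]

# Hypothetical 3x3 "body" used only for illustration.
image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
sino = sinogram(image, [0, 90])
print(sino)
```

At 0 degrees each detector bin holds a column sum of the image, and at 90 degrees a row sum; a real scan simply collects many more angles in between.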
II-A What is the Inverse Problem?
Formally, the forward problem is a mapping $H$ from a true image $x$ to raw scan data $y$, where $y = H\{x\}$. Although this is a generalized form, with X-ray CT, $x$ represents an image of X-ray attenuations, $H$ represents the physics of the scanner described in section I and $y$ represents the resulting sinogram.
The inverse problem is then the recovery of the original image $x$ from the raw scan data $y$ using a reconstruction function $R$, as one can see in fig. 1.
The obvious solution is to find the inverse mapping from the scan data $y$ back to the image $x$ to produce the true image. The main issue is that, even for a modest $n \times n$ image, the inversion is that of an $n^2 \times n^2$ matrix, which has $n^4$ elements, a size which, for multiple slices, would be challenging even to store on modern hardware, let alone invert.
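To make the storage argument concrete, the following sketch counts the entries of a dense operator acting on a flattened square image. The image sizes, the assumption of a square operator and the choice of 4 bytes per entry (single precision) are illustrative assumptions, not figures taken from the text.

```python
def dense_operator_elements(n):
    """Entries in a dense (n*n) x (n*n) operator for an n x n image."""
    return (n * n) ** 2

def dense_operator_gib(n, bytes_per_entry=4):
    """Memory in GiB needed to store that operator densely."""
    return dense_operator_elements(n) * bytes_per_entry / 2**30

# Illustrative image sizes only.
for n in (256, 512):
    print(n, dense_operator_elements(n), dense_operator_gib(n))
```

Doubling the image side multiplies the number of entries by sixteen, which is why direct inversion quickly becomes impractical.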
II-B Classical Solutions to the Inverse Problem
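To achieve this reconstruction, we can model the forward problem $H$ and produce an estimate $\hat{x}$ of the true image by minimizing a loss function $f$ between the measured sinogram $y$ and the sinogram estimated from $\hat{x}$ using our modelled $H$:
$$\hat{x} = \arg\min_{x} f(H\{x\}, y) \tag{1}$$
This is called the objective function approach. By computing the inverse operator $H^{-1}$, we trivially have $\hat{x} = H^{-1}\{y\}$.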
For X-Ray CT, the inverse of the forward operator, $H^{-1}$, is known as the filtered back projection algorithm. An even more basic algorithm is standard back projection, which simply assembles the raw data back into the form of an image but, as one can see in figure 1, does not give a good representation of the true image. Even the direct inverse, $H^{-1}$, amplifies noise when the scan data are undersampled and noisy, because the problem is ill-posed: multiple solutions may exist, and small perturbations of the data can cause large changes in the reconstruction. This is often the case, since we want to minimize patients' exposure to harmful X-ray radiation, resulting in a limited number of samples.
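The weakness of standard back projection can be seen on a toy example. The sketch below smears two projections of a single bright pixel back across a 3×3 grid; the function name `back_project`, the nearest-neighbour geometry and the hand-written sinogram are assumptions for illustration, and no filtering step is included.

```python
import math

def back_project(sino, angles_deg, n):
    """Unfiltered back projection: smear each view back along its rays."""
    c = (n - 1) / 2.0
    image = [[0.0] * n for _ in range(n)]
    for profile, angle in zip(sino, angles_deg):
        t = math.radians(angle)
        cos_t, sin_t = math.cos(t), math.sin(t)
        for row in range(n):
            for col in range(n):
                # Detector bin that the ray through this pixel lands on.
                s = int(round(cos_t * (col - c) + sin_t * (row - c) + c))
                if 0 <= s < n:
                    image[row][col] += profile[s]
    return image

# Projections of a single central bright pixel at 0 and 90 degrees
# (written by hand for this toy case).
point_sinogram = [[0, 1, 0],
                  [0, 1, 0]]
smeared = back_project(point_sinogram, [0, 90], 3)
for line in smeared:
    print(line)
```

The result is a cross centred on the true pixel rather than a single point, which is the kind of blur that the filtering step in filtered back projection is designed to remove.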
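When the inverse problem is ill-posed, adding a regularizer is often a better approach. Here we add a regularization functional of the image, $g$, to the minimization, giving
$$\hat{x} = \arg\min_{x} f(H\{x\}, y) + g(x) \tag{2}$$
where $f$ is the loss between the measured sinogram $y$ and the sinogram $H\{x\}$ predicted by the forward model, and $g$ is the regularization functional.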
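The effect of adding a regularizer can be sketched on a two-pixel toy problem. The example assumes a squared 2-norm data term and a Tikhonov penalty $\lambda \|x\|^2$, for which the minimizer solves $(H^T H + \lambda I)x = H^T y$; the operator, noise level and value of `lam` are hypothetical choices made only to show the behaviour.

```python
import math

def tikhonov_2x2(h, y, lam):
    """Solve (H^T H + lam*I) x = H^T y for a 2x2 H with Cramer's rule."""
    a11 = h[0][0] ** 2 + h[1][0] ** 2 + lam
    a12 = h[0][0] * h[0][1] + h[1][0] * h[1][1]
    a22 = h[0][1] ** 2 + h[1][1] ** 2 + lam
    b1 = h[0][0] * y[0] + h[1][0] * y[1]
    b2 = h[0][1] * y[0] + h[1][1] * y[1]
    det = a11 * a22 - a12 * a12
    return [(b1 * a22 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det]

def error(x, x_true):
    """Euclidean distance between an estimate and the true image."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, x_true)))

# Hypothetical ill-conditioned forward operator and noisy measurement.
h = [[1.0, 0.0],
     [0.0, 0.01]]
x_true = [1.0, 1.0]
y_noisy = [1.0, 0.01 + 0.05]   # clean data plus noise on the weak channel

x_direct = tikhonov_2x2(h, y_noisy, lam=0.0)    # plain inversion
x_reg = tikhonov_2x2(h, y_noisy, lam=0.01)      # regularized estimate
print(error(x_direct, x_true), error(x_reg, x_true))
```

With `lam=0.0` the small noise term is divided by the weak singular value and dominates the estimate. The regularized estimate is biased towards zero but far more stable.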
The conditional mean approach acts as a regularizer and so is well suited to ill-posed inverse problems but, like Maximum A Posteriori estimation, its operator involves integration, which limits its scalability. The Bayesian estimator also acts as a regularizer, but it involves minimizing over an integral, making it unfit even for small tasks.
Variational regularizers take the idea one step further by minimizing an additional regularization functional such as a Tikhonov regularizer or, more commonly, total variation (TV) or total generalized variation. Up until recently, using TV as the regularization functional $g$ was the preferred approach for producing denoised reconstructions of the true image.
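As an illustration of why total variation favours denoised images, the sketch below computes an anisotropic discrete TV (the sum of absolute differences between neighbouring pixels). The function name and the test images are assumptions; other discretizations, such as isotropic TV, are also common.

```python
def total_variation(image):
    """Anisotropic TV: sum of absolute vertical and horizontal differences."""
    rows, cols = len(image), len(image[0])
    tv = 0.0
    for i in range(rows):
        for j in range(cols):
            if i + 1 < rows:
                tv += abs(image[i + 1][j] - image[i][j])
            if j + 1 < cols:
                tv += abs(image[i][j + 1] - image[i][j])
    return tv

flat = [[1, 1], [1, 1]]      # constant image
edge = [[0, 1], [0, 1]]      # one clean vertical edge
noisy = [[0, 1], [1, 0]]     # pixel-level oscillation

print(total_variation(flat), total_variation(edge), total_variation(noisy))
```

A constant image has zero TV, a clean edge has a small TV and oscillating noise has a large one, so penalizing TV suppresses noise while still permitting sharp edges.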
Apart from the objective function approach, we also have the learning approach, where we create a training set of true images and corresponding raw sinograms, $\{(x_i, y_i)\}_{i=1}^{N}$. From this, a learned reconstruction algorithm $R_{\theta}$ is found by minimizing over all possible parameters $\theta \in \Theta$:
$$\hat{\theta} = \arg\min_{\theta \in \Theta} \sum_{i=1}^{N} f(x_i, R_{\theta}\{y_i\}) + g(\theta) \tag{3}$$
with $f$ being an error function like the 2-norm once again and $g$ being a regularizer of the parameters to avoid overfitting. Once the optimal parameter $\hat{\theta}$ is selected, $R_{\hat{\theta}}$ is ready to be used for reconstruction of new images.
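The learning approach can be sketched with the simplest possible reconstruction family: a single scalar parameter `theta` that rescales each measurement. This is an assumption made purely for illustration. With a squared error and a penalty `lam * theta**2`, the optimal parameter has the closed form used below; the training pairs are hypothetical.

```python
def fit_scale(x_train, y_train, lam=0.0):
    """Minimize sum_i (x_i - theta*y_i)^2 + lam*theta^2 over theta."""
    numerator = sum(x * y for x, y in zip(x_train, y_train))
    denominator = sum(y * y for y in y_train) + lam
    return numerator / denominator

def reconstruct(theta, y):
    """Apply the learned reconstruction to a new measurement."""
    return theta * y

# Hypothetical training pairs: each measurement is twice the true value.
x_train = [1.0, 2.0, 3.0]
y_train = [2.0, 4.0, 6.0]

theta_hat = fit_scale(x_train, y_train)
print(theta_hat, reconstruct(theta_hat, 10.0))
```

A real learned reconstruction has far more parameters and is trained numerically rather than in closed form, but the structure is the same: a data-fitting sum over training pairs plus a penalty on the parameters.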
These two methods have been the state of the art up until recently, but both have their limitations. The learning approach needs a prepared training set and can still suffer heavily from overfitting, whereas the objective function approach requires a model of the forward problem, a cost function and a regularizer to be chosen.
III Reconstruction with Deep Learning
By using a Convolutional Neural Network, we can perform the optimization in (3) by setting the set of candidate reconstruction functions $R_{\theta}$ to be a combination of filters, parameterized by the filter weights $\theta$, over which the optimization occurs.
Following the surge of interest in ill-posed inverse problems in the early 2000s, many new approaches emerged and, with the rise of deep learning, three in particular brought significant improvements over the classical approaches. We compare these three here and explore their use within various papers.
III-A Learned Denoisers
Following the emergence of deep learning, its first applications to optical tomography inverse problems were learned post-processors, or learned denoisers, as suggested by Wang. In this approach, filtered back projection is first performed on the sinogram to produce a noisy image, and a neural network is then used to remove the noise induced by the pseudo-inverse step. By limiting the learning stage to just the transformation of the pseudo-inverted sinogram reconstruction, we greatly reduce the complexity of the task, and authors such as Zhao et al. were among the first to show a significant reduction in noise.
Chen et al. took a comparable approach, building a convolution-deconvolution network with patch-based training to achieve similar results.
Given the relatively simple nature of this approach, many groups use more general image denoising architectures: Jin et al. use U-net, while Kang et al. apply a U-net to directional wavelets. Kang et al. then developed this further to recover more texture in the image by using framelet-based denoising with a wavelet residual network.
Ye et al. then brought in further classical signal processing methods with their Deep Convolutional Framelets approach, improving performance over the previous attempts.
With the learned denoiser approach proven, interest grew in involving deep learning more fully in the inverse problem. The learned iterative scheme seeks to learn the entire transformation from the measurement domain to the reconstruction domain. The issue with this, as explained in II-A, is the great computational expense of performing the reconstruction process in one step.
To overcome this, Yang et al. proposed learned iterative schemes that resemble classical optimization techniques but use machine learning to perform the optimization, making updates based on the result of applying the forward operator to the previous iterate. In this case, the scheme was applied to MRI reconstruction.
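The structure of such a scheme can be sketched as an unrolled gradient descent in which each iteration applies the forward operator to the current iterate and then makes an update. In the sketch below, the per-iteration step sizes stand in for the learned component; they are fixed by hand here and are hypothetical values, as are the toy operator and data. The published methods use learned networks for the update instead.

```python
import math

def matvec(a, v):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def residual_norm(h, x, y):
    """Euclidean norm of the data residual H x - y."""
    return math.sqrt(sum((hx - yi) ** 2 for hx, yi in zip(matvec(h, x), y)))

def unrolled_reconstruction(h, y, step_sizes):
    """One update per entry of step_sizes, each using the forward operator."""
    ht = transpose(h)
    x = [0.0] * len(h[0])
    for alpha in step_sizes:
        residual = [hx - yi for hx, yi in zip(matvec(h, x), y)]
        gradient = matvec(ht, residual)        # gradient of 0.5*||Hx - y||^2
        x = [xi - alpha * gi for xi, gi in zip(x, gradient)]
    return x

# Toy forward operator and measurement (hypothetical).
h = [[1.0, 1.0],
     [0.0, 1.0]]
y = [5.0, 3.0]
step_sizes = [0.5, 0.4, 0.3, 0.3, 0.3]   # would be learned in practice

x_hat = unrolled_reconstruction(h, y, step_sizes)
print(x_hat, residual_norm(h, x_hat, y))
```

Because the number of iterations is fixed and small, the whole loop can be treated as one network and trained end to end, which is what distinguishes a learned iterative scheme from running a classical solver to convergence.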
The most recent advancement in the field, introduced for MRI by Schlemper et al. and Hammernik et al. and later pioneered for X-Ray CT by Zhu et al. with their AUTOMAP, is the end-to-end approach. Unlike the iterative approach, AUTOMAP is a fully learned algorithm that produces a reconstruction of the true image directly from the measurement.
As is visible in figure 2, a neural network is trained to map raw measurement data to clean images in the training stage. This network can then be applied to new measurement data to produce clean images.
Although, as explained before, this is a more computationally expensive method, AUTOMAP successfully learns the entire image reconstruction process for low-resolution images and outperforms traditional methods.
The deep neural networks used in some of the papers are sensitive to the architecture they employ. In this section, we look at the architectures used in some landmark papers.
Jin et al.  use a modified U-net convolutional network in their deep learned denoiser.
By using a dyadic scale decomposition, the effective filter sizes in the layers at the top and bottom of the network are smaller than those in the middle layers. Because of the nature of the forward problem, fixed filter sizes throughout all the layers may not have been sufficient to invert the function effectively. As in most convolutional neural networks, U-net uses multichannel filters, meaning many feature maps exist at each layer of the network. Finally, by including a skip connection from input to output, the network only has to learn the difference between the input and output images.
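The role of the skip connection can be sketched with a single 3×3 filter standing in for the whole network. This is a simplification and not the U-net of Jin et al.: the names `conv2d_same` and `residual_block` and the kernels are assumptions for illustration. The filter output is added to the input, so the filter only has to produce a correction.

```python
def conv2d_same(image, kernel):
    """Cross-correlate with a square kernel, zero padding, same output size."""
    rows, cols = len(image), len(image[0])
    k = len(kernel) // 2
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = 0.0
            for di in range(-k, k + 1):
                for dj in range(-k, k + 1):
                    ii, jj = i + di, j + dj
                    if 0 <= ii < rows and 0 <= jj < cols:
                        acc += kernel[di + k][dj + k] * image[ii][jj]
            out[i][j] = acc
    return out

def residual_block(image, kernel):
    """Skip connection: output = input + filtered correction."""
    correction = conv2d_same(image, kernel)
    return [[image[i][j] + correction[i][j] for j in range(len(image[0]))]
            for i in range(len(image))]

image = [[1.0, 2.0], [3.0, 4.0]]
zero_kernel = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
print(residual_block(image, zero_kernel))
```

With all filter weights at zero the block returns its input unchanged, so training starts from the filtered back projection image and only needs to learn how to remove its artefacts.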
In contrast, for their iterative approach, Adler and Öktem use a vastly different network architecture, because the network must map from one domain to another.
In their architecture diagram, the dual iterates are given by the blue boxes and the primal iterates by the red boxes, with both blue and red boxes having the same architecture. This differs from the classical primal dual hybrid gradient algorithm, in which the primal and dual iterates are given by proximal operators with over-relaxation parameters rather than by convolutional neural networks.
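For reference, the classical primal dual hybrid gradient iteration can be sketched on a toy least-squares problem, minimizing $\frac{1}{2}\|Hx - y\|^2$ with no regularizer. This is a minimal sketch under stated assumptions: the operator, the data and the parameters `tau`, `sigma` and `theta` are hypothetical, chosen so that `tau * sigma * ||H||^2 < 1`. In the learned variant, the two hand-derived update lines are what the convolutional networks replace.

```python
def matvec(a, v):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def pdhg_least_squares(h, y, tau=0.5, sigma=0.5, theta=1.0, iterations=200):
    """Classical PDHG for min_x 0.5*||Hx - y||^2 (toy version)."""
    ht = transpose(h)
    x = [0.0] * len(h[0])        # primal iterate (image domain)
    x_bar = x[:]                 # over-relaxed primal iterate
    p = [0.0] * len(y)           # dual iterate (measurement domain)
    for _ in range(iterations):
        # Dual update: proximal step for the conjugate of the data term.
        hx = matvec(h, x_bar)
        p = [(pi + sigma * (hxi - yi)) / (1.0 + sigma)
             for pi, hxi, yi in zip(p, hx, y)]
        # Primal update: the proximal is the identity with no regularizer.
        x_new = [xi - tau * gi for xi, gi in zip(x, matvec(ht, p))]
        # Over-relaxation step.
        x_bar = [xn + theta * (xn - xo) for xn, xo in zip(x_new, x)]
        x = x_new
    return x

h = [[1.0, 1.0],
     [0.0, 1.0]]
y = [5.0, 3.0]
x_hat = pdhg_least_squares(h, y)
print(x_hat)
```

The dual variable lives in the measurement domain and the primal variable in the image domain, which is why an architecture built on this iteration naturally maps between the two.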
Finally, we compare the above two architectures to that of AUTOMAP from Zhu et al.
With a neural network of fully connected layers, the AUTOMAP architecture is interesting because, although the transformation from one domain to another sounds more complicated, fewer convolutional layers are used than in the U-Net utilized by Jin et al. (Of note is that the paper by Zhu et al. is rather challenging to understand, so the analysis of the architecture for this paper is limited.)
V-A Major Challenges
The major challenges faced in this field stem from the harmful nature of the X-Ray CT scan. Because the X-ray source exposes the patient to ionizing radiation, the number of samples that can be taken is heavily limited. With under-sampled data, the inverse problem becomes ill-posed, making classical approaches either useless or incapable of producing faithful reproductions of the true image.
To overcome this, three approaches have stood out. Firstly, learned denoisers use filtered back projection to produce an image in the domain of the true image, then apply general architectures such as U-Net to learn a denoising function that, along with filtered back projection, can be applied to new scans. Secondly, the iterative approach learns the transformation from measurement to reconstruction by making updates iteratively. Finally, the latest advancement is the end-to-end approach, where Zhu et al. were able to create a fully learned method that reconstructs the true image directly from measurement data.
V-B Possible Next Steps
With the end-to-end approach having the greatest potential for improvement in the future, we would like to see this approach developed further. Its greatest drawback at the moment is performance. AUTOMAP struggles heavily with higher-resolution images due to the computational cost involved. Potentially, methods used in the iterative approach could be combined with those used in AUTOMAP to create a more effective solution for end-to-end image reconstruction.
Further Applications: With many methods now proven for MRI and X-Ray CT scans, it would be interesting to see the same theory applied to other imaging modalities such as ultrasound, microscopy and PET. This advancement could help improve what is possible across the whole field of medical imaging.
A special thanks to Simon Arridge for kindly offering his assistance in performing the research for this paper.
- (2017) Solving ill-posed inverse problems using iterative deep neural networks. Inverse Problems 33 (12).
- (2018) Learned primal-dual reconstruction. IEEE Transactions on Medical Imaging 37 (6), pp. 1322–1332.
- (2019) Solving inverse problems using data-driven models. Acta Numerica 28, pp. 1–174.
- (2017) Low-dose CT with a residual encoder-decoder convolutional neural network. IEEE Transactions on Medical Imaging 36 (12), pp. 2524–2535.
- (2017) Image restoration: Wavelet frame shrinkage, nonlinear evolution PDEs, and beyond. Multiscale Modeling and Simulation 15 (1), pp. 606–660.
- (2018) IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures. 35th International Conference on Machine Learning, ICML 2018 4, pp. 2263–2284.
- (2018) Learning a variational network for reconstruction of accelerated MRI data. Magnetic Resonance in Medicine 79 (6), pp. 3055–3071.
- (2017) Deep convolutional neural network for inverse problems in imaging. IEEE Transactions on Image Processing 26 (9), pp. 4509–4522.
- (2011) Adaptive discretizations for the choice of a Tikhonov regularization parameter in nonlinear inverse problems. Inverse Problems 27 (12).
- (2018) Deep Convolutional Framelet Denosing for Low-Dose CT via Wavelet Residual Network. IEEE Transactions on Medical Imaging 37 (6), pp. 1358–1369.
- (2017) A deep convolutional neural network using directional wavelets for low-dose X-ray CT reconstruction. Medical Physics 44 (10), pp. e360–e375.
- (2015) Deep learning. Nature 521 (7553), pp. 436–444.
- (2018) Adversarial regularizers in inverse problems. Advances in Neural Information Processing Systems (NeurIPS), pp. 8507–8516.
- (2017) Convolutional neural networks for inverse problems in imaging: A review. IEEE Signal Processing Magazine 34 (6), pp. 85–95.
- (2017) Recurrent Inference Machines for Solving Inverse Problems. (NIPS).
- (2017) A deep cascade of convolutional neural networks for MR image reconstruction. In International Conference on Information Processing in Medical Imaging, pp. 647–658.
- (2016) A perspective on deep imaging. IEEE Access 4, pp. 8914–8924.
- (2016) Deep ADMM-Net for compressive sensing MRI. Advances in Neural Information Processing Systems (NIPS), pp. 10–18.
- (2018) Deep convolutional framelets: A general deep learning framework for inverse problems. SIAM Journal on Imaging Sciences 11 (2), pp. 991–1048.
- (2017) Few-view CT reconstruction method based on deep learning. In 2016 IEEE Nuclear Science Symposium, Medical Imaging Conference and Room-Temperature Semiconductor Detector Workshop, NSS/MIC/RTSD 2016, pp. 1–4.
- (2018) Image reconstruction by domain-transform manifold learning. Nature 555 (7697), pp. 487–492.