1 Introduction
X-ray based computed tomography (CT) is a medical imaging procedure that reconstructs tomographic images by taking X-ray measurements from different angles. To obtain high-quality reconstructions with early reconstruction algorithms such as FBP Katsevich (2002) and ART Gordon et al. (1970), measurements at a large number of different angles are needed. However, since X-rays are ionizing radiation, the total X-ray dose needs to be restricted in the scanning process, and thus we need to either decrease the X-ray intensity at each chosen angle or reduce the total number of angles taken. Decreasing the X-ray intensity at each angle results in noisier measurements, while using fewer angles reduces the information available for a high-quality reconstruction. This poses a great challenge in designing efficient and effective reconstruction algorithms.
Compressed sensing Donoho (2006) resolves the issue to a certain extent. According to the theory of compressed sensing, if an image is sparse after certain transformations (e.g., wavelet transform), then it can be robustly reconstructed from a reduced number of random measurements by solving an $\ell_1$ minimization problem when the measurements and the transformation satisfy the D-RIP condition Candes et al. (2010). We can use the alternating direction method of multipliers (ADMM) Goldstein and Osher (2009a); Boyd et al. (2011); Cai et al. (2009) or the primal-dual hybrid gradient method (PDHG) Chambolle and Pock (2011); Zhu and Chan (2008); Esser et al. (2010) to solve this minimization problem and obtain a reconstructed image.
In the literature of CT image reconstruction, or image restoration in general, the focus has been on designing effective regularizations, which include total variation (TV) Rudin et al. (1992), nonlocal means Buades et al. (2005), BM3D Dabov et al. (2007), WNNM Gu et al. (2014), wavelets and wavelet frame models Daubechies (1992); Mallat (1999); Dong et al. (2010), K-SVD Elad and Aharon (2006), data-driven (tight) frames Cai et al. (2014); Tai and Weinan (2016), the low dimensional manifold model (LDMM) Osher et al. (2017), etc. More recently, the rapid development of machine learning, especially deep learning, has led to a paradigm shift of modeling and algorithmic design in computer vision and medical imaging Wang (2016); Wang et al. (2017); McCann et al. (2017); Wang et al. (2018); Zhang and Dong (2020). Deep learning based models are able to leverage large image datasets to learn better image representations and produce better image reconstruction results than traditional methods Chen et al. (2017); Jin et al. (2017); Kang et al. (2017); Zhang et al. (2018c); Yang et al. (2018).
In most CT reconstruction models, the emphasis is on designing effective image representations, while much less emphasis is placed on improving the scanning strategy. In compressed sensing, the scanning strategy is entirely random Candes et al. (2006, 2010), i.e., the measurement angles are selected randomly and the dose is allocated uniformly across the angles. In general, such a random scanning strategy, i.e., random angle selection and uniform dose allocation, performs well in practice. However, for each individual subject, this random scanning strategy may not be ideal. It is more desirable to design a personalized scanning strategy for each subject to achieve better reconstruction results. Our key observation is that the measurements collected in the early stage of the scanning process can be used to guide the later scanning.
Despite the potential improvement of a personalized scanning strategy for each individual subject, it is very difficult for a human expert to handcraft such a strategy. This is where machine learning can help. The personalized scanning strategy can be learned using either active learning Settles (2009) or Reinforcement Learning (RL) Li (2017). In this paper, we propose to use reinforcement learning to learn such a personalized scanning strategy for each subject. The reason we choose RL over active learning is that RL is non-greedy and naturally optimizes the long-term reconstruction quality. We formulate the CT scanning process as a Markov Decision Process (MDP), where the state includes the currently collected measurements, the action determines the next measurement angle and the dose usage, and the reward depends on the reconstruction quality. We further use modern Deep RL algorithms to solve it. We show in the experiments that the personalized scanning policy learned by RL significantly outperforms the random scanning strategy in terms of the reconstruction quality, and that it generalizes to different reconstruction algorithms. To the best of our knowledge, we are the first to use Deep RL to learn a personalized CT scanning strategy.
1.1 Related works
For compressed sensing, there have been two primary categories of scanning strategies: static and dynamic. A static scanning strategy collects measurements in a fixed order. Low-discrepancy sampling Ohbuchi and Aono (1996) and uniformly spaced sparse sampling methods Mohan et al. (2015) are two examples of static scanning strategies. Non-uniform static scanning strategies based on a model of the subject to be scanned are proposed in Mueller (2011); Wang and Arce (2010). However, because the order of measurements is predetermined, a static scanning strategy cannot adapt to different subjects and may lead to poor results for some of them.
Dynamic scanning strategy refers to methods which collect measurements adaptively based on information obtained from previous measurements. One traditional approach, such as BCS Ji et al. (2008); Seeger and Nickisch (2008), tries to find the measurements that minimize the entropy so as to decrease the uncertainty of the images. Similarly, other methods Batenburg et al. (2013); Dabravolski et al. (2014) use the information gain at each additional scan to guide the selection of the next measurement. However, these methods are typically greedy in nature, have many hyperparameters that must be properly tuned, and are slow during inference, as they need either to invert large matrices or to run the reconstruction algorithm many times when determining the best next angle. More recently, deep neural networks have been used to estimate the expected reduction in distortion (ERD) in the reconstructed image when an additional measurement is selected Godaliyadda et al. (2016, 2017); Zhang et al. (2018b, a); Halimi et al. (2019); Monier et al. (2020). However, for the estimate of ERD to be accurate, a large number of measurements is required in training.
All the above methods are not specific to CT scanning. They are greedy in nature and do not provide a strategy for dose allocation. In contrast, RL is able to generate a non-greedy policy that aims at maximizing long-term rewards. Furthermore, the setting of RL is flexible enough to handle both angle selection and dose allocation, and even more decision options during scanning. Therefore, in this paper we use RL to design a scanning policy that acts optimally on each individual subject. In Scanning Transmission Electron Microscopy (STEM), a recent work by Ede proposes to use RL to guide the movement of the detector and uses a generator to produce reconstructed images. However, since the image modality is drastically different from CT, the proposed MDP (especially the state, the action and the architecture of the policy network) is vastly different from what is proposed in this paper.
2 Preliminaries
2.1 CT Reconstruction
Let $x \in \mathbb{R}^N$ denote the discretised image of a subject, with $N$ being the number of pixels. The scanning process is a linear operation which can be described as a matrix $A \in \mathbb{R}^{M \times N}$. Different scanning angles lead to different forms of the linear operator $A$. The measurements $y \in \mathbb{R}^M$ can be expressed as
$$y = Ax + n, \tag{1}$$
where $n$ is an additive noise. The reconstruction process is to reconstruct the CT image $x$ from the measurements $y$. As equation (1) is a linear equation, it can be directly solved by the ART or SART algorithm Gordon et al. (1970); Mueller et al. (1999). However, as $M$ can be far smaller than $N$, equation (1) has far fewer equations than unknowns. In order to obtain high-quality solutions, regularization-based models are often used, which typically take the following form:
$$\min_{x} \; \frac{1}{2}\|Ax - y\|_2^2 + \lambda R(x), \tag{2}$$
where $R(x)$ is a regularization term and $\lambda > 0$ balances data fidelity against regularization.
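As a rough illustration of how the linear system in equation (1) can be solved row by row, the following minimal sketch implements the classical cyclic Kaczmarz iteration, which is the basic idea behind ART. The function name `kaczmarz`, the toy matrix and the problem sizes are assumptions made for illustration only; this is not the SART implementation used in this paper.

```python
import numpy as np

def kaczmarz(A, y, n_sweeps=50, relax=1.0):
    """Cyclic Kaczmarz (ART-style) iteration for the linear system A x = y."""
    x = np.zeros(A.shape[1])
    row_norms = np.sum(A * A, axis=1)          # squared norm of every row
    for _ in range(n_sweeps):
        for i in range(A.shape[0]):
            if row_norms[i] == 0.0:
                continue                        # an empty row carries no information
            residual = y[i] - A[i] @ x          # violation of the i-th equation
            x += relax * residual / row_norms[i] * A[i]   # project onto its hyperplane
    return x

# Toy, noise-free, overdetermined system (hypothetical sizes).
rng = np.random.default_rng(0)
A = rng.standard_normal((30, 10))
x_true = rng.standard_normal(10)
y = A @ x_true

x_hat = kaczmarz(A, y, n_sweeps=200)
error = np.linalg.norm(x_hat - x_true)
print(error)
```

When the system has fewer equations than unknowns, as in sparse-view CT, such an iteration started from zero can only recover one of the many consistent solutions, which is why a regularization term is usually added.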
2.2 Relationship between Measurement Noise and Dose
Noise intensity on measurements heavily relies on the X-ray dose. It is common to assume that the measurement noise follows a Gaussian distribution Yu et al. (2012), whose noise level is specified by equation (3) in terms of three quantities: the X-ray dose used in a measurement, the average intensity of the measurement, and the maximum number of photons the source can generate. It follows that the more dose we use, the smaller the noise level becomes.
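To make the dose-noise trade-off concrete, here is a minimal sketch that assumes, purely for illustration, that the noise variance is inversely proportional to the dose. This placeholder model is not the expression in equation (3); the names `noise_std`, `measure` and `base_var` and the toy measurement are hypothetical.

```python
import numpy as np

def noise_std(dose, base_var=1.0):
    """Assumed model: variance = base_var / dose (NOT the paper's equation (3))."""
    if dose <= 0:
        raise ValueError("dose must be positive")
    return np.sqrt(base_var / dose)

def measure(clean, dose, rng):
    """Add Gaussian noise whose level depends on the dose, under the assumed model."""
    return clean + rng.normal(0.0, noise_std(dose), size=clean.shape)

rng = np.random.default_rng(0)
clean = np.ones(10000)                      # toy noise-free measurement
empirical = []
for dose in (0.01, 0.1, 1.0):
    noisy = measure(clean, dose, rng)
    empirical.append(np.std(noisy - clean))  # observed noise level at this dose
print(empirical)
```

Under this assumed model, the observed noise level shrinks as the dose grows, which is the qualitative behaviour described above.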
2.3 Some Further Discussions
As equation (1) shows, the measurements we obtain from a CT scan depend both on the angle (which determines the linear operator) and on the X-ray dose (which determines the noise level). Due to the limitation on X-ray dose usage, we can only select a limited number of angles and assign each of them a limited amount of dose. Traditional methods simply select the angles randomly and distribute the allowed dose equally among them. Our goal is to use RL to learn a personalized policy that selects the angles and the dose at each chosen angle for each individual subject.
3 Method
Our goal is to learn a policy that can decide the next measurement angle and its corresponding X-ray dose based on the measurements that we have already obtained in the scanning process. We now present how the scanning process can be formulated as a Markov Decision Process (MDP) and solved by reinforcement learning algorithms. We choose the Proximal Policy Optimization (PPO) method as our RL algorithm Schulman et al. (2017).
3.1 A Brief Review on MDP and PPO
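An MDP is a tuple $(\mathcal{S}, \mathcal{A}, \gamma, P, r)$ that consists of the state space $\mathcal{S}$, the action space $\mathcal{A}$, the discount factor $\gamma \in [0, 1]$, the transition probability of the environment $P(s_{t+1} \mid s_t, a_t)$ and the reward $r(s_t, a_t)$. A policy $\pi$ in RL is a probability distribution on the action $a \in \mathcal{A}$ given the state $s \in \mathcal{S}$: $\pi(a \mid s)$. Given an MDP, our goal is to find a policy that maximizes the discounted accumulated rewards in this MDP:
$$\max_{\pi} \; \mathbb{E}_{\pi}\Big[\sum_{t \ge 0} \gamma^{t} r(s_t, a_t)\Big]. \tag{4}$$
Many effective RL algorithms have been developed to find the optimal policy $\pi^{*}$. In this paper, we use the PPO algorithm. We now briefly review how it works. Given a parameterized policy $\pi_\theta$, its value function, Q function and advantage function are defined as $V^{\pi_\theta}(s) = \mathbb{E}_{\pi_\theta}\big[\sum_{t \ge 0} \gamma^{t} r(s_t, a_t) \mid s_0 = s\big]$, $Q^{\pi_\theta}(s, a) = \mathbb{E}_{\pi_\theta}\big[\sum_{t \ge 0} \gamma^{t} r(s_t, a_t) \mid s_0 = s, a_0 = a\big]$, and $A^{\pi_\theta}(s, a) = Q^{\pi_\theta}(s, a) - V^{\pi_\theta}(s)$, respectively. Given an old policy $\pi_{\theta_{\mathrm{old}}}$, let $\rho_t(\theta) = \pi_\theta(a_t \mid s_t) / \pi_{\theta_{\mathrm{old}}}(a_t \mid s_t)$. PPO optimizes $\theta$ w.r.t. the following surrogate objective using gradient ascent:
$$L(\theta) = \mathbb{E}_t\Big[\min\big(\rho_t(\theta)\, A^{\pi_{\theta_{\mathrm{old}}}}(s_t, a_t),\; \mathrm{clip}(\rho_t(\theta), 1 - \epsilon, 1 + \epsilon)\, A^{\pi_{\theta_{\mathrm{old}}}}(s_t, a_t)\big)\Big], \tag{5}$$
where $\mathrm{clip}(\rho, 1 - \epsilon, 1 + \epsilon) = \min(\max(\rho, 1 - \epsilon), 1 + \epsilon)$ and $\epsilon > 0$ is the clipping parameter.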
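The clipped surrogate objective (5) can be sketched in a few lines of NumPy. This is a minimal illustration rather than the training code of this paper: the function name `ppo_clip_objective`, the choice `eps=0.2` and the toy numbers are assumptions, and a real implementation would compute the log-probabilities and advantages from the policy and value networks.

```python
import numpy as np

def ppo_clip_objective(logp_new, logp_old, advantages, eps=0.2):
    """Clipped PPO surrogate, averaged over a batch of (state, action) samples."""
    ratio = np.exp(logp_new - logp_old)                  # pi_new(a|s) / pi_old(a|s)
    unclipped = ratio * advantages
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps) * advantages
    return np.mean(np.minimum(unclipped, clipped))       # pessimistic bound, to be maximized

# Two toy samples: the new policy raises the probability of the first action
# (positive advantage) and lowers that of the second (negative advantage).
logp_old = np.log(np.array([0.5, 0.5]))
logp_new = np.log(np.array([0.75, 0.25]))
advantages = np.array([1.0, -1.0])

value = ppo_clip_objective(logp_new, logp_old, advantages)
print(value)
```

In this toy batch the first ratio (1.5) is clipped to 1.2, so pushing that action's probability even higher brings no further gain in the objective; this is how the update is kept close to the old policy.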
3.2 MDP Formulation of Personalized Scanning
The CT scanning process is naturally a sequential decision process, where at each time step we need to decide on the measurement angle and the corresponding X-ray dose. Given an image $x$ and the number of all possible angles $N_a$ (e.g., $N_a = 180$ if we can choose all the integer angles from $0$° to $179$°), we now elaborate how the CT scanning process on $x$ can be formulated as an MDP:

The state $s_t$ is a sequence $s_t = (o_1, o_2, \ldots, o_t)$, where $o_i = (y_i, D_i, d_i^{\mathrm{rest}})$. $y_i$ is the collected measurement at time step $i$. $D_i$ records the used dose distribution up to time step $i$. It is an $N_a$-dimensional vector, and the value at each entry represents the used X-ray dose at the corresponding angle. $d_i^{\mathrm{rest}}$ is a scalar that represents the amount of the remaining dose that we can use. Because the reconstructed image at time step $t$ relies on all previously collected measurements, we include all of them in the state.

The action is $a_t = (a_t^{\mathrm{angle}}, a_t^{\mathrm{dose}})$. $a_t^{\mathrm{angle}}$ is the angle we choose at time step $t$, and it is a one-hot vector of dimension $N_a$. $a_t^{\mathrm{dose}}$ is the fraction of dose that we apply at the corresponding angle. If at a certain time $t$ the total used dose exceeds the total allowed dose, we clip the exceeding dose and terminate the MDP.

The reward is computed as $r(s_t, a_t) = \mathrm{PSNR}(\hat{x}_{t+1}, x) - \mathrm{PSNR}(\hat{x}_t, x)$, where $x$ is the ground-truth image, and $\mathrm{PSNR}(\hat{x}_t, x)$ represents the Peak Signal to Noise Ratio (PSNR) value of the image $\hat{x}_t$ reconstructed from the measurements in $s_t$. We use the increment of PSNR to evaluate how much benefit the newly chosen angle/dose brings. The reconstructed image can be obtained from any reconstruction algorithm, such as SART, TV-based models, wavelet models, etc. In this paper, we choose SART as the image reconstruction algorithm for fast computation of the reward during training. One may use more refined image reconstruction algorithms. However, this will also significantly increase the training time, and may make it harder to find a better scanning policy.

The transition model represents the scanning process of CT. At time step $t$, given the state $s_t$ and action $a_t$, the next state $s_{t+1}$ is simply the concatenation of $s_t$ and $o_{t+1}$. We now show how each of the three elements in $o_{t+1}$ can be computed.

The new measurement is obtained as $y_{t+1} = \bar{y}_{t+1} + n_{t+1}$, where $\bar{y}_{t+1}$ is the clean projected value obtained using the chosen angle $a_t^{\mathrm{angle}}$, and $n_{t+1}$ is the measurement noise. The noise depends on the chosen dose $a_t^{\mathrm{dose}}$ as described in section 2.2, where the average intensity is computed from the measurement at the chosen angle.

The new dose distribution is obtained by adding the new decision: $D_{t+1} = D_t + a_t^{\mathrm{dose}} \, a_t^{\mathrm{angle}}$, where $a_t^{\mathrm{angle}}$ is the one-hot vector of the chosen angle.

The remaining amount of dose is computed by subtracting the used dose: $d_{t+1}^{\mathrm{rest}} = d_t^{\mathrm{rest}} - a_t^{\mathrm{dose}}$.
The MDP terminates once the dose is used up.
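The bookkeeping of this MDP (dose distribution, remaining dose, clipping and termination) can be sketched as a small environment class. This is only a toy under stated assumptions: the class `ToyScanEnv`, its `_quality` proxy and the stand-in measurement are hypothetical, whereas the formulation above would use the projection at the chosen angle and the PSNR of a SART reconstruction.

```python
import numpy as np

class ToyScanEnv:
    """Bookkeeping of the scanning MDP; measurement and reward are stand-ins."""

    def __init__(self, n_angles=180, total_dose=1.0, seed=0):
        self.n_angles = n_angles
        self.total_dose = total_dose
        self.rng = np.random.default_rng(seed)
        self.reset()

    def reset(self):
        self.dose_dist = np.zeros(self.n_angles)   # used dose at every angle
        self.rest = self.total_dose                # remaining dose
        self.history = []                          # state: all past observations
        self.quality = 0.0                         # stand-in for the current PSNR
        return self.history

    def _quality(self):
        # Hypothetical proxy for reconstruction quality (not a real PSNR).
        return float(np.sum(np.sqrt(self.dose_dist)))

    def step(self, angle, dose_fraction):
        dose = min(dose_fraction * self.total_dose, self.rest)   # clip exceeding dose
        # Stand-in measurement: the noise shrinks as the dose grows (assumed model).
        measurement = self.rng.normal(0.0, 1.0 / np.sqrt(max(dose, 1e-12)))
        self.dose_dist[angle] += dose
        self.rest -= dose
        self.history.append((measurement, self.dose_dist.copy(), self.rest))
        new_quality = self._quality()
        reward = new_quality - self.quality        # increment of the quality score
        self.quality = new_quality
        done = self.rest <= 1e-12                  # terminate once the dose is used up
        return self.history, reward, done

env = ToyScanEnv(n_angles=4, total_dose=1.0)
_, r1, done1 = env.step(0, 0.6)
_, r2, done2 = env.step(2, 0.6)    # only 0.4 is left, so this request is clipped
print(r1, r2, done1, done2, env.dose_dist)
```

The second call asks for more dose than remains, so the request is clipped and the episode ends, mirroring the termination rule described above.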

As we choose the increment in PSNR as the reward, the total sum of rewards (when there is no discounting, as in our experiments) is the PSNR value of the final reconstructed image. Therefore, the optimal policy of this MDP also yields the best reconstruction result for the image.
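The argument rests on a telescoping sum. As a short worked step, write $p_t$ for the PSNR of the reconstruction after $t$ measurements and $T$ for the number of measurements taken (both symbols are introduced here only for illustration), and assume no discounting:

```latex
\sum_{t=0}^{T-1} r_t \;=\; \sum_{t=0}^{T-1} \left( p_{t+1} - p_t \right) \;=\; p_T - p_0 .
```

The return therefore equals the final PSNR $p_T$ up to the term $p_0$ of the initial reconstruction, which does not depend on the policy, so maximizing the return and maximizing the final PSNR select the same policy.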
3.3 Policy Network Architecture
Because we include all the previous measurements in the state, the dimension of the state vector increases as we take more measurements. To handle the varying dimensionality of the state vector, we represent the policy network as a Recurrent Neural Network (RNN), so that all the information from the past measurements can be encoded in the hidden state of the RNN. Specifically, we use the Gated Recurrent Unit (GRU). In addition, the policy network needs to output two different actions: the discrete action for choosing the angle, and the continuous action for choosing the dose. To handle this, we design a special architecture for the policy network, as shown in Figure 1. We use separate Multi-Layer Perceptrons (MLPs) after the RNN hidden states for these two actions.
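The two output heads can be sketched as follows: a masked softmax over the candidate angles, followed by a dose drawn from a Gaussian whose parameters depend on the one-hot vector of the chosen angle. This is a minimal NumPy illustration under assumptions: `masked_softmax`, `sample_action` and the constant `dose_head` are hypothetical stand-ins, and in the actual network both heads are MLPs that also receive the GRU hidden state.

```python
import numpy as np

def masked_softmax(logits, mask):
    """Softmax over angles; entries where mask is False get probability 0."""
    masked = np.where(mask, logits, -np.inf)
    masked = masked - np.max(masked)          # numerical stability
    exp = np.exp(masked)
    return exp / exp.sum()

def sample_action(angle_logits, used, dose_head, rng):
    """Sample an unused angle first, then a dose conditioned on that angle."""
    probs = masked_softmax(angle_logits, ~used)
    angle = int(rng.choice(len(probs), p=probs))
    one_hot = np.eye(len(probs))[angle]
    mean, std = dose_head(one_hot)            # dose parameters depend on the chosen angle
    dose = rng.normal(mean, std)
    return angle, dose, probs

rng = np.random.default_rng(0)
angle_logits = np.zeros(5)                                # toy logits for 5 candidate angles
used = np.array([True, False, False, True, False])       # angles 0 and 3 were already chosen
dose_head = lambda one_hot: (0.02, 0.005)                 # hypothetical stand-in for the dose MLP

angle, dose, probs = sample_action(angle_logits, used, dose_head, rng)
print(angle, dose, probs)
```

Masking before the softmax gives previously chosen angles exactly zero probability, so the remaining probability mass is renormalized over the unused angles.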
The angle output is a probability vector whose length equals the number of candidate angles, where the value at each entry represents the probability of choosing that angle. We use Softmax after the final linear layer to obtain the probability vector. We also introduce a mask to remove the previously chosen angles. For the dose usage, we assume a Gaussian distribution, with the mean and std both learned by an MLP. It is natural to determine the amount of dose after the angle is chosen, i.e., the dose distribution is conditioned on the chosen angle, so we concatenate the one-hot vector of the chosen angle as part of the input for the dose MLP.
4 Experiments
4.1 Experiment Setup
We train the RL policy on 250 CT images of size from the AAPM dataset of the "2016 NIH-AAPM-Mayo Clinic Low Dose CT Grand Challenge". During training, we use SART as the reconstruction algorithm for computing the reward. The possible angles are all integers in [0°,180°). We use Adam Kingma and Ba (2014) to optimize both the policy network and the value network, with a learning rate of , and . More detailed hyperparameters for PPO and the network architecture can be found in the code, which will be released upon acceptance of this paper.
After training, we test the learned RL policy on another 350 CT images from the AAPM dataset. We compare the following three scanning strategies: (1) RDED, which selects angles randomly and distributes the dose equally among them; (2) DSED, which selects angles by the dynamic sampling strategy of Dabravolski et al. (2014) and distributes the dose equally among them; (3) RLAD, which uses the learned personalized policy for both angle selection and dose allocation at each chosen angle. To shorten the inference time for DSED, we first use uniform samples with 10° spacing and then use information gain to decide the rest of the angles. During testing, we use four different reconstruction algorithms: SART, TV regularization (TV) Rudin et al. (1992); Goldstein and Osher (2009b), wavelet frame (WF) regularization Ron and Shen (1997); Cai et al. (2009), and the recently proposed deep learning method PDnet Adler and Oktem (2018). The evaluation metrics are the PSNR and the structural similarity index (SSIM) of the reconstructed images.
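For reference, PSNR can be computed as in the minimal sketch below. The function name `psnr` and the convention of taking the data range from the reference image are assumptions; the normalization used in the experiments is not specified here.

```python
import numpy as np

def psnr(reference, estimate, data_range=None):
    """Peak signal-to-noise ratio in dB between a reference and an estimate."""
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    if data_range is None:
        data_range = reference.max() - reference.min()   # assumed convention
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return float("inf")                               # identical images
    return 10.0 * np.log10(data_range ** 2 / mse)

reference = np.array([0.0, 1.0])
estimate = np.array([0.1, 0.9])
value = psnr(reference, estimate)     # mean squared error 0.01 with data range 1
print(value)
```

With a data range of 1 and a mean squared error of 0.01, the toy example evaluates to about 20 dB.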
A difficulty in conducting a fair comparison of RDED with DSED and RLAD is that the number of angles selected by DSED and RLAD can differ from subject to subject (see Figure 3). In our experiments below, we choose the number of measurement angles for RDED to be 53, which is the mean number of measurement angles selected by RLAD over all the 350 test images. Thus, the dose on each measurement angle of RDED is $1/53$ of the total allowed dose. We also note that the deep reconstruction model PDnet is trained from scratch on the 250 images in the training set using 53 angles and parallel-beam geometry.
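The RDED baseline with 53 angles can be sketched as follows. The function name `rded_plan`, sampling without replacement, and normalizing the total dose to 1 are assumptions made for this illustration.

```python
import numpy as np

def rded_plan(n_candidates=180, n_selected=53, total_dose=1.0, seed=0):
    """Random angles with equal dose: a sketch of the RDED baseline."""
    rng = np.random.default_rng(seed)
    # Integer angles in [0, 180) degrees, drawn without replacement (assumption).
    angles = np.sort(rng.choice(n_candidates, size=n_selected, replace=False))
    doses = np.full(n_selected, total_dose / n_selected)   # equal share per angle
    return angles, doses

angles, doses = rded_plan()
print(angles[:5], doses[0], doses.sum())
```

Each selected angle receives the same share of the total dose, in contrast to the learned policy, which chooses both the angles and the dose at each angle.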
4.2 Results
| Model | Inference (s) | SART PSNR | SART SSIM | WF PSNR | WF SSIM | TV PSNR | TV SSIM | PDnet PSNR | PDnet SSIM |
|---|---|---|---|---|---|---|---|---|---|
| RLAD | 2.50 (0.05) | 29.22 (0.43) | 0.683 (0.026) | 29.77 (0.47) | 0.723 (0.030) | 29.52 (0.45) | 0.702 (0.029) | 31.12 (0.56) | 0.800 (0.036) |
| DSED | 1223.27 (279) | 28.07 (0.45) | 0.657 (0.027) | 28.69 (0.48) | 0.706 (0.033) | 28.35 (0.46) | 0.678 (0.030) | 30.10 (0.60) | 0.794 (0.027) |
| RDED | 2.26 (0.04) | 27.86 (0.47) | 0.651 (0.026) | 28.44 (0.48) | 0.698 (0.031) | 28.17 (0.46) | 0.673 (0.030) | 29.99 (0.69) | 0.789 (0.028) |
Table 1 presents the mean and standard deviation (in parentheses) of the PSNR and SSIM values of the reconstructed images for all compared scanning strategies and reconstruction algorithms. One can see that the proposed scanning strategy RLAD significantly outperforms dynamic sampling (DSED) and random scanning (RDED), while DSED outperforms RDED. Furthermore, we present the inference time for each of the compared scanning strategies. The inference time includes the time to compute measurements. Since DSED needs to frequently reconstruct the CT image when deciding on measurement angles, it is significantly slower than RLAD and RDED.
We also note that the RL policy is trained using only SART for computing the reward function, yet the learned policy generalizes well to three other reconstruction algorithms, i.e., the TV regularization, the wavelet frame regularization and the deep learning model PDnet, where it still brings a notable improvement in reconstruction quality over the dynamic sampling and random scanning baselines.
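The size of the improvement can be read off the mean PSNR values in Table 1. The short sketch below only redoes that arithmetic; the dictionary layout and variable names are ours.

```python
# Mean PSNR values (dB) copied from Table 1.
mean_psnr = {
    "RLAD": {"SART": 29.22, "WF": 29.77, "TV": 29.52, "PDnet": 31.12},
    "DSED": {"SART": 28.07, "WF": 28.69, "TV": 28.35, "PDnet": 30.10},
    "RDED": {"SART": 27.86, "WF": 28.44, "TV": 28.17, "PDnet": 29.99},
}

gain_over_rded = {}
gain_over_dsed = {}
for algo, rlad_value in mean_psnr["RLAD"].items():
    gain_over_rded[algo] = round(rlad_value - mean_psnr["RDED"][algo], 2)
    gain_over_dsed[algo] = round(rlad_value - mean_psnr["DSED"][algo], 2)

print(gain_over_rded)
print(gain_over_dsed)
```

The mean PSNR gain of RLAD over RDED is between 1.13 dB (PDnet) and 1.36 dB (SART), and the gain over DSED is between 1.02 dB and 1.17 dB, so the advantage carries over to reconstruction algorithms that were not used during training.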
In Figure 2, we further show two examples of images reconstructed with the deep learning model PDnet from measurements obtained by the random scanning strategy (RDED), the dynamic sampling strategy (DSED) and the learned personalized policy (RLAD). We can see that the reconstructions using the RL policy are of higher quality than those using the random and dynamic sampling strategies, especially in the zoomed-in views of the figures.
We plot the distribution of the number of measurements taken by the learned personalized policy (RLAD) in Figure 3 (a). The result demonstrates that, for different subjects, the learned RL policy selects different numbers of angles and different dose allocations. In Figure 3 (b), (c) and (d), we take 8 images on which the learned RL policy selects 45, 54 and 64 angles respectively and plot the distributions of the dose usage of these images. It can be seen that images using the same number of measurement angles have very similar dose allocations, and images that have more measurement angles use less dose at each angle. In Figure 4, we show 3 example images where the RL policy selects 45, 54 and 63 measurement angles respectively. We can see that images for which the RL policy selects more measurement angles contain more structures, and thus more information/measurements need to be collected to obtain a high-quality reconstruction. In Figure 5, we present the selected angles and part of the dose allocation on the subjects shown in Figure 4.
5 Conclusion
In this paper, we proposed to use reinforcement learning to learn a personalized CT scanning strategy for measurement angle selection and dose allocation. We formulated the CT scanning process as a Markov Decision Process, and used the PPO algorithm to solve it. After training on 250 real 2D CT images, we validated the learned personalized scanning policy on another 350 CT images. Our validation showed that the personalized scanning policy led to better overall reconstruction results in terms of PSNR values, and generalized well to different reconstruction algorithms. We also demonstrated that the personalized policy can indeed adapt its angle selection and dose allocation to different subjects. One drawback of the proposed method is the long training time (approximately 24 hours) even for 2D images, because RL algorithms usually need many simulation samples to converge, and computing the reward in our MDP requires running a reconstruction algorithm at each time step. This might prohibit the application of our method to 3D cases.
6 Broader Impact
As shown by our experiments, the learned personalized scanning strategy significantly improves the reconstruction quality. We hope our work can draw more attention to the design of more efficient and effective CT scanning strategies using the latest tools developed in machine learning. Our method may also be generalized to other imaging modalities such as Magnetic Resonance Imaging (MRI), where a smart scanning strategy may significantly reduce the acquisition time, which has been one of the major challenges for MRI. There are also some potential issues with our proposed method: 1) The training algorithm of the proposed framework can be difficult to tune; 2) The design of the MDP greatly affects the final performance and is currently largely underexplored; 3) In our training, we adopt the idealized assumption that the linear operator and the noise model are a close approximation to the real physics of the imaging system, which may cause problems when deploying the trained RL policy to real imaging systems.
Bin Dong is supported in part by National Natural Science Foundation of China (NSFC) grant No. 11831002, Beijing Natural Science Foundation (No. 180001) and Beijing Academy of Artificial Intelligence (BAAI).
References
[1] (2018) Learned primal-dual reconstruction. IEEE Trans. Med. Imaging 37 (6), pp. 1322–1332. Cited by: §4.1.
[2] (2013) Dynamic angle selection in binary tomography. Computer Vision and Image Understanding 117 (4), pp. 306–318. Cited by: §1.1.
[3] (2011) Distributed optimization and statistical learning via the alternating direction method of multipliers. Found. Trends Mach. Learn. 3 (1), pp. 1–122. Cited by: §1.
[4] (2005) A non-local algorithm for image denoising. In 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’05), Vol. 2, pp. 60–65. Cited by: §1.
[5] (2014) Data-driven tight frame construction and image denoising. Applied and Computational Harmonic Analysis 37 (1), pp. 89–105. Cited by: §1.
[6] (2009) Split Bregman methods and frame based image restoration. Multiscale Model. Simul. 8 (2), pp. 337–369. Cited by: §1, §4.1.
[7] (2006) Robust uncertainty principles: exact signal reconstruction from highly incomplete Fourier information. IEEE Trans. Info. Theory 52 (2). Cited by: §1.
[8] (2010) Compressed sensing with coherent and redundant dictionaries. Cited by: §1.
[9] (2011) A first-order primal-dual algorithm for convex problems with applications to imaging. J. Math. Imaging Vis. 40 (1), pp. 120–145. Cited by: §1.

[10] (2017) Low-dose CT with a residual encoder-decoder convolutional neural network. IEEE Transactions on Medical Imaging 36 (12), pp. 2524–2535. Cited by: §1.
[11] (2007) Image denoising by sparse 3D transform-domain collaborative filtering. IEEE Transactions on Image Processing 16 (8), pp. 2080–2095. Cited by: §1.
[12] (2014) Dynamic angle selection in X-ray computed tomography. Nuclear Instruments and Methods in Physics Research Section B: Beam Interactions with Materials and Atoms 324, pp. 17–24. Cited by: §1.1, §4.1.
[13] (2014) Dynamic angle selection in X-ray computed tomography. Nuclear Instruments and Methods in Physics Research Section B: Beam Interactions with Materials and Atoms 324, pp. 17–24. Cited by: §1.1.
[14] (1992) Ten lectures on wavelets. Vol. 61, SIAM. Cited by: §1.
[15] (2013) X-ray CT image reconstruction via wavelet frame based regularization and Radon domain inpainting. Journal of Scientific Computing 54 (2–3), pp. 333–349. Cited by: §2.1.
[16] (2010) MRA based wavelet frames and applications. IAS Lecture Notes Series, Summer Program on “The Mathematics of Image Processing”, Park City Mathematics Institute 19. Cited by: §1.
[17] (2006) Compressed sensing. IEEE Transactions on Information Theory 52 (4), pp. 1289–1306. Cited by: §1.
[18] Adaptive partial scanning transmission electron microscopy with reinforcement learning. arXiv:2004.02786. Cited by: §1.1.
[19] (2006) Image denoising via sparse and redundant representations over learned dictionaries. IEEE Transactions on Image Processing 15 (12), pp. 3736–3745. Cited by: §1.
[20] (2010) A general framework for a class of first order primal-dual algorithms for convex optimization in imaging science. SIAM J. Imaging Sci. 3 (4), pp. 1015–1046. Cited by: §1.

[21] (2016) A supervised learning approach for dynamic sampling. S&T Imaging. International Society for Optics and Photonics. Cited by: §1.1.
[22] (2017) A framework for dynamic image sampling based on supervised learning (SLADS). arXiv:1703.04653. Cited by: §1.1.
[23] (2009) The split Bregman method for L1-regularized problems. SIAM J. Imaging Sci. 2 (2), pp. 323–343. Cited by: §1.
[24] (2009) The split Bregman method for L1-regularized problems. SIAM J. Imaging Sci. 2 (2), pp. 323–343. Cited by: §4.1.
[25] (1970) Algebraic reconstruction techniques (ART) for three-dimensional electron microscopy and X-ray photography. Journal of Theoretical Biology. Cited by: §1, §2.1.
[26] (2014) Weighted nuclear norm minimization with application to image denoising. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2862–2869. Cited by: §1.
[27] (2019) Fast adaptive scene sampling for single-photon 3D lidar images. IEEE CAMSAP 2019 - International Workshop on Computational Advances in Multi-Sensor Adaptive Processing. Cited by: §1.1.
[28] (2008) Bayesian compressive sensing. IEEE Transactions on Signal Processing 56 (6), pp. 2346–2356. Cited by: §1.1.
[29] (2017) Deep convolutional neural network for inverse problems in imaging. IEEE Transactions on Image Processing 26 (9), pp. 4509–4522. Cited by: §1.
[30] (2015) TIMBIR: a method for time-space reconstruction from interlaced views. IEEE Transactions on Computational Imaging, pp. 96–111. Cited by: §1.1.
[31] (2017) A deep convolutional neural network using directional wavelets for low-dose X-ray CT reconstruction. Medical Physics 44 (10), pp. e360–e375. Cited by: §1.
[32] (2002) Theoretically exact filtered backprojection-type inversion algorithm for spiral CT. SIAM Journal on Applied Mathematics 62 (6), pp. 2012–2026. Cited by: §1.
[33] (2014) Adam: a method for stochastic optimization. arXiv:1412.6980. Cited by: §4.1.
[34] (2017) Deep reinforcement learning: an overview. arXiv:1701.07274. Cited by: §1.
[35] (1999) A wavelet tour of signal processing. Elsevier. Cited by: §1.
[36] (2017) Convolutional neural networks for inverse problems in imaging: a review. IEEE Signal Processing Magazine 34 (6), pp. 85–95. Cited by: §1.
[37] (2020) Fast reconstruction of atomic-scale STEM-EELS images from sparse sampling. Ultramicroscopy. Cited by: §1.1.
[38] (1999) Anti-aliased three-dimensional cone-beam reconstruction of low-contrast objects with algebraic methods. IEEE Transactions on Medical Imaging 18 (6), pp. 519–537. Cited by: §2.1.
[39] (2011) Selection of optimal views for computed tomography reconstruction. Patent WO. Cited by: §1.1.
[40] (1996) Quasi-Monte Carlo rendering with adaptive sampling. Cited by: §1.1.
[41] (2017) Low dimensional manifold model for image processing. SIAM Journal on Imaging Sciences 10 (4), pp. 1669–1690. Cited by: §1.
[42] (1997) Affine systems in $L_2(\mathbb{R}^d)$: the analysis of the analysis operator. J. Funct. Anal. 148 (2), pp. 408–447. Cited by: §4.1.
[43] (1992) Nonlinear total variation based noise removal algorithms. Physica D: Nonlinear Phenomena 60 (1–4), pp. 259–268. Cited by: §1, §4.1.
[44] (2017) Proximal policy optimization algorithms. arXiv:1707.06347v2. Cited by: §3.
[45] (2008) Compressed sensing and Bayesian experimental design. In Proceedings of the 25th International Conference on Machine Learning, pp. 912–919. Cited by: §1.1.
[46] (2009) Active learning literature survey. Technical report, University of Wisconsin-Madison, Department of Computer Sciences. Cited by: §1.
[47] (2008) Image reconstruction in circular cone-beam computed tomography by constrained, total-variation minimization. Physics in Medicine and Biology 53, pp. 4777. Cited by: §2.1.
[48] (2016) Multiscale adaptive representation of signals: I. the basic framework. The Journal of Machine Learning Research 17 (1), pp. 4875–4912. Cited by: §1.
[49] (2017) Machine learning will transform radiology significantly within the next 5 years. Medical Physics 44 (6), pp. 2041–2044. Cited by: §1.
[50] (2018) Image reconstruction is a new frontier of machine learning. IEEE Transactions on Medical Imaging 37 (6), pp. 1289–1296. Cited by: §1.
[51] (2016) A perspective on deep imaging. IEEE Access 4, pp. 8914–8924. Cited by: §1.
[52] (2018) Low-dose CT image denoising using a generative adversarial network with Wasserstein distance and perceptual loss. IEEE Transactions on Medical Imaging 37 (6), pp. 1348–1357. Cited by: §1.
[53] (2012) Development and validation of a practical lower-dose-simulation tool for optimizing computed tomography scan protocols. Journal of Computer Assisted Tomography 36 (4), pp. 477–487. Cited by: §2.2.
[54] (2010) Variable density compressed image sampling. IEEE Transactions on Image Processing 19 (1), pp. 264–270. Cited by: §1.1.
[55] (2020) A review on deep learning in medical image reconstruction. Journal of the Operations Research Society of China, pp. 1–30. Cited by: §1.
[56] (2018) Dynamic sparse sampling for confocal Raman microscopy. Analytical Chemistry 90 (7), pp. 4461–4469. Cited by: §1.1.
[57] (2018) SLADS-Net: supervised learning approach for dynamic sampling using deep neural networks. Electronic Imaging, Computational Imaging XVI. Cited by: §1.1.
[58] (2018) A sparse-view CT reconstruction method based on combination of DenseNet and deconvolution. IEEE Transactions on Medical Imaging 37 (6), pp. 1407–1417. Cited by: §1.
[59] (2008) An efficient primal-dual hybrid gradient algorithm for total variation image restoration. UCLA CAM Report. Cited by: §1.