I Introduction
Over the past few decades, computed tomography (CT) has been used with great success for clinical diagnosis and intervention. However, X-rays are partially absorbed by the human body and may cause genetic damage or cancer [1]. As a result, the well-known ALARA (as low as reasonably achievable) principle has been widely adopted to avoid excessive radiation dose in clinical practice. Reducing the X-ray flux toward each detector element and decreasing the number of sampling views are two effective ways to reduce the radiation dose, resulting in noisy measurements or insufficient projection data, respectively. However, it is challenging to generate high-quality images from degraded projection data using traditional methods such as filtered back projection (FBP). Thus, reconstructing high-quality images from contaminated and/or undersampled projection data has attracted increasing attention in the field of CT imaging. In this work, we focus on the sparse-view problem.
To recover from undersampling, iterative reconstruction methods have been developed, which can effectively improve image quality when the projection views are incomplete. Classical iterative reconstruction algorithms, such as the algebraic reconstruction technique (ART) [2], the simultaneous algebraic reconstruction technique (SART) [3] and expectation maximization (EM) [4], can mitigate this problem to a certain degree. However, it is difficult to achieve satisfactory results when the projection views are highly sparse without extra priors. Introducing reasonable priors can significantly improve the imaging quality of traditional iterative methods. In recent years, compressed sensing (CS) theory has been developed and proven to be a powerful tool; it indicates that a signal can be exactly reconstructed with very high probability if the signal can be represented sparsely under a certain sparsifying transform
[5, 6]. Inspired by CS theory, Sidky et al. [7, 8] coupled total variation (TV) with projection onto convex sets (POCS) to reconstruct from incomplete projection data and achieved good performance. However, TV assumes that the signal is piecewise smooth, resulting in undesired patchy effects in the reconstructed images. Following this path, many variants of TV have been proposed, such as adaptive-weighted TV [9], fractional-order TV [10, 11], and nonlocal TV [12, 13]. Furthermore, Niu et al. [14] combined penalized weighted least-squares (PWLS) [15] with total generalized variation (TGV) [16] to reduce the patchy effects. Chen and Lang showed that a high-quality image can be used as a prior to accurately reconstruct dynamic CT image sequences from highly undersampled projection data [17]. Recently, inspired by the rapid development of deep learning, several pioneering studies combining the ideas of deep learning and CS were presented
[18, 19]. Typically, Chen et al. proposed to unfold an iterative reconstruction into a residual convolutional neural network whose parameters and regularization terms can be learned from an external dataset [20]. Adler and Öktem employed the primal-dual hybrid gradient (PDHG) algorithm for nonsmooth convex optimization and unrolled the optimization process into a neural network [21]. Zhu et al. proposed a unified image reconstruction framework, automated transform by manifold approximation (AUTOMAP), which directly learns a mapping function from the sensor domain to the image domain [22]. In [23], a parameterized plug-and-play alternating direction method of multipliers (3pADMM) was developed for the PWLS model, and the prior terms and related parameters were then optimized with a neural network on a large quantity of training data. Beyond the reconstruction problem, deep learning has also been used for low-dose CT (LDCT) denoising. As a pioneering work, Chen et al. used a three-layer convolutional neural network (CNN) to handle the noise in LDCT images [24]. Kang et al. introduced a contourlet-based multiscale U-Net to improve LDCT image quality [25]. Motivated by the idea of autoencoders, Chen et al. proposed a residual encoder-decoder CNN (RED-CNN) for LDCT [26]. These methods obtained promising results, but the requirement of a large number of training samples may limit their application scenarios.
As another direction of CS, dictionary learning (DL)-based methods have in recent years proven useful in various image restoration problems. In 2006, Elad and Aharon adopted a dictionary-based method to address the image denoising problem [27]. Mairal et al. explored how to apply this method to color image restoration [28]. In the field of medical imaging, the DL method was first introduced into magnetic resonance imaging (MRI) by Chen et al. [29]. Ravishankar and Bresler utilized an adaptive dictionary-based method to reconstruct MR images from highly undersampled k-space data [30]. Later, Xu et al. first introduced DL into LDCT reconstruction [31]. Two types of dictionaries were learned as bases: (1) a global dictionary learned from an external training set and fixed during the reconstruction procedure, and (2) an adaptive dictionary learned from the intermediate result and continually updated during the iterations. Inspired by work on super-resolution, which uses two dictionaries to link low-resolution images and the corresponding high-resolution training images [32], Lu et al. proposed using two dictionaries to predict fully sampled CT images [33], and Zhao et al. extended this method to spectral CT [34]. Chen et al. further improved DL-based artifact reduction with tissue feature atoms and artifact atoms to build discriminative dictionaries [35]. Liu et al. extended this idea to 3D sinogram restoration for LDCT [36]. Zheng et al. proposed to learn a union of sparsifying transforms instead of dictionary atoms for LDCT image reconstruction [37].
However, most dictionary learning methods are patch-based, and the learned features often contain shifted versions of the same features. The learned dictionary may be over-redundant, and the results suffer from patch aggregation artifacts at the patch boundaries. To address these problems, convolutional sparse coding (CSC) methods have been proposed, in which shift invariance is directly modeled in the objective function. CSC can be traced back to [38] and was first proposed by Zeiler et al. [39]. The effectiveness of CSC has been demonstrated in several computer vision tasks. Gu et al. introduced CSC into super-resolution and achieved promising results [40]. Wohlberg addressed impulse noise via CSC aided by gradient constraints [41]. Liu et al. applied CSC to image fusion [42]. More recently, Zhang and Patel combined CSC and low-rank decomposition to address rain streak removal and image decomposition problems [43, 44]. Thanh et al. proposed adopting a shared 3D multiscale convolutional dictionary to recover the high-frequency information of MRI images [45].
Inspired by these pioneering studies of CSC, in this work, two models based on CSC are proposed to address the sparse-view CT reconstruction problem. In the first model, called PWLS-CSC, a CSC regularization is combined with the PWLS model. The filters are trained on an external fully sampled CT dataset. Although PWLS-CSC works well for sparse-view CT, ringing artifacts may appear in some reconstructed images due to inaccurate filters. On the other hand, a gradient constraint is an effective way to suppress these types of artifacts. Therefore, in the second model, an improved version of PWLS-CSC, which imposes gradient regularization on the feature maps, is proposed. For simplicity, we call this method PWLS-CSC-GR.
The remainder of this paper is organized as follows. In Section II, we introduce the background on PWLS and CSC. In Section III, the proposed reconstruction schemes are elaborated. In Section IV, the experimental designs and representative results are given. Finally, we discuss relevant issues and conclude the paper in Section V.
II Background
II-A Penalized Weighted Least-Squares for CT Image Reconstruction
The calibrated and log-transformed projection data approximately follow a Gaussian distribution, whose variance can be described by the following analytic formula [46]:

$$\sigma_i^2 = f_i \exp\left(\bar{y}_i / \eta\right), \qquad (1)$$

where $\bar{y}_i$ and $\sigma_i^2$ are the mean and variance of the measured projection data at the $i$th bin, respectively, $f_i$ is a parameter adaptive to different bins and $\eta$ is a scaling parameter. Based on these properties of the projection data, the PWLS-based CT reconstruction problem can be modeled as follows:

$$\min_{x}\; (y - Ax)^{T}\Sigma^{-1}(y - Ax) + \lambda R(x), \qquad (2)$$

where $y$ denotes the projection data, $x$ is the vectorized attenuation coefficients to be reconstructed and $(\cdot)^{T}$ denotes the transpose operation. $A$ denotes the system matrix with a size of $M \times N$ ($M$ is the total number of projection measurements and $N$ is the total number of image pixels). $\Sigma$ is a diagonal matrix whose $i$th element is $\sigma_i^2$ as calculated by (1). $R(x)$ represents the regularization term, and $\lambda$ is a balancing parameter that controls the tradeoff between the data-fidelity and regularization terms.

II-B Convolutional Sparse Coding
In contrast to traditional patch-based DL methods, CSC assumes that an image can be represented as a summation of convolutions between a set of filters and corresponding feature maps:

$$\min_{\{z_k\}}\; \frac{1}{2}\left\| x - \sum_{k=1}^{K} d_k \ast z_k \right\|_2^2 + \lambda \sum_{k=1}^{K} \left\| z_k \right\|_1, \qquad (3)$$

where $x$ denotes the image, $\ast$ denotes the convolution operator, $\{d_k\}_{k=1}^{K}$ is a set of filters, $\{z_k\}_{k=1}^{K}$ are the feature maps corresponding to the filters $\{d_k\}$, and $\lambda$ is the regularization parameter. In the CSC model, each feature map $z_k$ has the same size as the image $x$. Since the reconstructed image is obtained by the summation of convolutions $\sum_k d_k \ast z_k$, the patch aggregation artifacts of traditional patch-based sparse coding methods are avoided. Meanwhile, whereas the features learned by patch-based sparse coding often contain shifted versions of the same features, the filters learned by CSC capture shift-invariant features, which makes the representation more efficient.
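The synthesis side of (3) is easy to demonstrate numerically. The sketch below (random toy filters and sparse feature maps, not trained ones) forms an image as the sum of convolutions, with circular convolutions computed in the Fourier domain — the same trick used by the numerical scheme later in the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
H = W = 32                                  # image size
K = 4                                       # number of filters
filters = rng.standard_normal((K, 5, 5))    # toy 5x5 filters d_k
zmaps = np.zeros((K, H, W))                 # sparse feature maps z_k
zmaps[:, ::8, ::8] = rng.standard_normal((K, 4, 4))

# x = sum_k d_k * z_k, evaluated as a product in the Fourier domain.
x = np.zeros((H, W))
for d, z in zip(filters, zmaps):
    d_hat = np.fft.fft2(d, s=(H, W))        # zero-pad the filter to image size
    x += np.real(np.fft.ifft2(d_hat * np.fft.fft2(z)))

# Each feature map has the same size as the synthesized image.
assert zmaps.shape[1:] == x.shape == (H, W)
```

Note that each feature map carries only a few nonzero coefficients, yet the summed convolutions cover the whole image without any patch boundaries.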
III Method
III-A PWLS-CSC Model
Current TV- or DL-based methods suffer from blocky effects or patch aggregation artifacts. To circumvent these problems, we propose introducing CSC as the regularization term in (2). Combining CSC with (2), we aim to solve the following optimization problem:

$$\min_{x,\{d_k\},\{z_k\}}\; (y - Ax)^{T}\Sigma^{-1}(y - Ax) + \frac{\alpha}{2}\left\| x - \sum_{k} d_k \ast z_k \right\|_2^2 + \lambda \sum_{k} \left\| z_k \right\|_1, \qquad (4)$$

where $\alpha$ balances the data fidelity against the CSC representation. In this model, we use an external fully sampled CT dataset to train the filters $\{d_k\}$ by the method proposed in [47], so (4) becomes the following optimization problem with fixed filters:

$$\min_{x,\{z_k\}}\; (y - Ax)^{T}\Sigma^{-1}(y - Ax) + \frac{\alpha}{2}\left\| x - \sum_{k} d_k \ast z_k \right\|_2^2 + \lambda \sum_{k} \left\| z_k \right\|_1. \qquad (5)$$
An alternating minimization scheme can be used to solve (5). First, an intermediate reconstructed image can be obtained with a fixed set of feature maps $\{z_k\}$. The objective function (5) then becomes:

$$\min_{x}\; (y - Ax)^{T}\Sigma^{-1}(y - Ax) + \frac{\alpha}{2}\left\| x - \sum_{k} d_k \ast z_k \right\|_2^2. \qquad (6)$$

With the separable paraboloid surrogate method [48], the solution of (6) can be obtained by:

$$x_j^{t+1} = x_j^{t} - \frac{\sum_{i=1}^{M} \frac{a_{ij}}{\sigma_i^2}\left([Ax^{t}]_i - y_i\right) + \alpha\left(x_j^{t} - \left[\sum_k d_k \ast z_k\right]_j\right)}{\sum_{i=1}^{M} \frac{a_{ij}}{\sigma_i^2}\sum_{j'=1}^{N} a_{ij'} + \alpha}, \qquad (7)$$
where $t$ represents the iteration index and $a_{ij}$ is the element of the system matrix $A$. The second step is to represent the intermediate result obtained in the last step with the fixed filters $\{d_k\}$, which means calculating $\{z_k\}$. Since CSC does not provide a good representation of the low-frequency component of the image [41], only the high-frequency component of $x$ is processed by CSC, and the result is obtained by simply summing the recovered high-frequency component and the low-frequency component. The high-frequency component of the image is precomputed by a high-pass filtering operation. Note that the symbol $x$ in the remainder of this section denotes the high-frequency component. Then, we have the following optimization problem:
$$\min_{\{z_k\}}\; \frac{1}{2}\left\| \sum_{k} d_k \ast z_k - x \right\|_2^2 + \lambda \sum_{k} \left\| z_k \right\|_1. \qquad (8)$$

Let $D = [D_1, \dots, D_K]$, where $D_k$ is the linear operator such that $D_k z_k = d_k \ast z_k$, and let $z = [z_1^{T}, \dots, z_K^{T}]^{T}$; then (8) can be rewritten as:

$$\min_{z}\; \frac{1}{2}\left\| Dz - x \right\|_2^2 + \lambda \left\| z \right\|_1. \qquad (9)$$

The alternating direction method of multipliers (ADMM) [49] is adopted to solve (9) by introducing an auxiliary variable $u$ that is constrained to be equal to $z$, leading to the following problem:

$$\min_{z,u}\; \frac{1}{2}\left\| Dz - x \right\|_2^2 + \lambda \left\| u \right\|_1 \quad \text{s.t.} \quad z = u. \qquad (10)$$

$z$ and $u$ are optimized alternately as follows:

$$z^{j+1} = \arg\min_{z}\; \frac{1}{2}\left\| Dz - x \right\|_2^2 + \frac{\rho}{2}\left\| z - u^{j} + v^{j} \right\|_2^2, \qquad (11)$$

$$u^{j+1} = \arg\min_{u}\; \lambda \left\| u \right\|_1 + \frac{\rho}{2}\left\| z^{j+1} - u + v^{j} \right\|_2^2, \qquad (12)$$

$$v^{j+1} = v^{j} + z^{j+1} - u^{j+1}, \qquad (13)$$

where $v$ is the scaled dual variable and $\rho$ is the penalty parameter.
(11) can be solved by transforming it into the Fourier domain [50, 51], which gives us:

$$\left( \hat{D}^{H}\hat{D} + \rho I \right) \hat{z} = \hat{D}^{H}\hat{x} + \rho\left( \hat{u} - \hat{v} \right), \qquad (14)$$

where $\hat{D}$, $\hat{z}$, $\hat{x}$, $\hat{u}$ and $\hat{v}$ denote the corresponding expressions in the Fourier domain and $(\cdot)^{H}$ denotes the conjugate transpose. (14) has the following closed-form solution:

$$\hat{z} = \left( \hat{D}^{H}\hat{D} + \rho I \right)^{-1} \left( \hat{D}^{H}\hat{x} + \rho\left( \hat{u} - \hat{v} \right) \right), \qquad (15)$$

where $I$ is the identity matrix. (15) can be computed efficiently with the Sherman-Morrison formula [52], since $\hat{D}^{H}\hat{D}$ is a rank-one matrix at each frequency. The closed-form solution of (12) can be obtained by:

$$u^{j+1} = S_{\lambda/\rho}\left( z^{j+1} + v^{j} \right), \qquad (16)$$

where $S_{\gamma}(\cdot)$ denotes the soft-thresholding function [53], which can be calculated as follows:

$$S_{\gamma}(t) = \operatorname{sign}(t)\max\left( |t| - \gamma,\; 0 \right). \qquad (17)$$
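To make the sparse-coding step concrete, the following sketch implements the loop (11)-(13) with the Fourier-domain solve (15) and the soft threshold (16)-(17), applying the Sherman-Morrison identity per frequency. The filters, image and parameter values are toy placeholders, not those of the paper:

```python
import numpy as np

def soft(t, gamma):
    # Eq. (17): S_gamma(t) = sign(t) * max(|t| - gamma, 0)
    return np.sign(t) * np.maximum(np.abs(t) - gamma, 0.0)

def csc_admm(x, filters, lam=0.01, rho=1.0, iters=100):
    """ADMM for Eq. (9): min_z 0.5*||Dz - x||^2 + lam*||z||_1 (circular convs)."""
    H, W = x.shape
    Dh = np.stack([np.fft.fft2(f, s=(H, W)) for f in filters])  # (K, H, W)
    xh = np.fft.fft2(x)
    u = np.zeros((len(filters), H, W))
    v = np.zeros_like(u)
    for _ in range(iters):
        # z-update, Eqs. (11)/(14): (D^H D + rho I) z_hat = D^H x_hat + rho (u - v)
        b = np.conj(Dh) * xh + rho * np.fft.fft2(u - v, axes=(1, 2))
        # Sherman-Morrison per frequency, with a = conj(d_hat) (D^H D is rank one):
        # (rho I + a a^H)^{-1} b = b/rho - a (a^H b) / (rho (rho + a^H a))
        a = np.conj(Dh)
        aHb = np.sum(np.conj(a) * b, axis=0)
        aHa = np.sum(np.abs(a) ** 2, axis=0)
        zh = b / rho - a * (aHb / (rho * (rho + aHa)))
        z = np.real(np.fft.ifft2(zh, axes=(1, 2)))
        u = soft(z + v, lam / rho)          # Eq. (16)
        v = v + z - u                       # Eq. (13)
    recon = np.real(np.fft.ifft2(np.sum(Dh * np.fft.fft2(u, axes=(1, 2)), axis=0)))
    return u, recon

# Toy self-consistency check: synthesize x from known sparse maps, then recover it.
rng = np.random.default_rng(1)
filters = rng.standard_normal((2, 3, 3))
H = W = 16
z_true = np.zeros((2, H, W))
z_true[:, ::4, ::4] = rng.standard_normal((2, 4, 4))
Dh = np.stack([np.fft.fft2(f, s=(H, W)) for f in filters])
x = np.real(np.fft.ifft2(np.sum(Dh * np.fft.fft2(z_true, axes=(1, 2)), axis=0)))

_, recon = csc_admm(x, filters)
assert np.linalg.norm(recon - x) < 0.5 * np.linalg.norm(x)
```

With a small sparsity weight, the recovered sum of convolutions fits the input closely; increasing `lam` trades fit for sparser feature maps.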
The overall PWLS-CSC algorithm for sparse-view CT reconstruction is summarized in Algorithm 1.
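The image-update step (7) can likewise be sketched for a tiny dense system matrix. The denominator below is the standard separable-surrogate curvature; the matrix and values are toy placeholders, and for the test we set the coupling weight to zero so the step reduces to plain weighted least squares:

```python
import numpy as np

def sps_step(x, y, A, sigma2, u, alpha):
    """One separable paraboloid surrogate update (Eq. (7)) for the PWLS term
    plus the quadratic coupling alpha/2 * ||x - u||^2, where u = sum_k d_k * z_k."""
    r = (A @ x - y) / sigma2                               # weighted residual
    num = A.T @ r + alpha * (x - u)                        # surrogate gradient
    den = (A / sigma2[:, None]).T @ A.sum(axis=1) + alpha  # surrogate curvature
    return x - num / den

rng = np.random.default_rng(2)
A = np.abs(rng.standard_normal((8, 4)))    # toy nonnegative "system matrix"
x_true = np.abs(rng.standard_normal(4))
y = A @ x_true                             # noiseless toy projections
sigma2 = np.ones(8)

x = np.zeros(4)
for _ in range(300):
    x = sps_step(x, y, A, sigma2, u=x, alpha=0.0)

# SPS monotonically decreases the WLS objective; the fit improves over x = 0.
assert np.linalg.norm(A @ x - y) < 0.5 * np.linalg.norm(y)
```

The nonnegativity of the system matrix is what guarantees the separable surrogate majorizes the weighted least-squares objective, so each update is monotone.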
III-B PWLS-CSC-GR
As claimed in [31], for DL-based methods, structure loss or new artifacts may appear due to inaccurate dictionary atoms. Similarly, CSC-based methods suffer from this problem if inaccurate filters exist. A constraint with gradient regularization (isotropic total variation), which is usually used to suppress outliers, is an effective way to overcome this problem [41, 54]. Furthermore, as suggested by [54], imposing the gradient constraint on the feature maps is superior to applying the constraint in the image domain. As a result, to further improve the proposed PWLS-CSC, in this subsection, we propose to utilize gradient regularization on the feature maps (PWLS-CSC-GR) as an additional constraint in PWLS-CSC, which yields the following optimization problem:

$$\min_{x,\{z_k\}}\; (y - Ax)^{T}\Sigma^{-1}(y - Ax) + \frac{\alpha}{2}\left\| x - \sum_{k} d_k \ast z_k \right\|_2^2 + \lambda \sum_{k} \left\| z_k \right\|_1 + \frac{\mu}{2}\sum_{k}\left( \left\| g_x \ast z_k \right\|_2^2 + \left\| g_y \ast z_k \right\|_2^2 \right), \qquad (18)$$

where $g_x$ and $g_y$ are the filters that compute the gradient along the x- and y-axes, respectively, and $\mu$ is a hyperparameter. Similar to (5), given the predetermined filters $\{d_k\}$, (18) can be decomposed into two optimization problems. One problem is exactly the same as (6), and the other can be given as:

$$\min_{\{z_k\}}\; \frac{1}{2}\left\| \sum_{k} d_k \ast z_k - x \right\|_2^2 + \lambda \sum_{k} \left\| z_k \right\|_1 + \frac{\mu}{2}\sum_{k}\left( \left\| g_x \ast z_k \right\|_2^2 + \left\| g_y \ast z_k \right\|_2^2 \right). \qquad (19)$$
The linear operators $G_x$ and $G_y$ are defined such that $G_x z_k = g_x \ast z_k$ and $G_y z_k = g_y \ast z_k$, so the gradient term in (19) can be rewritten as:

$$\frac{\mu}{2}\sum_{k}\left( \left\| G_x z_k \right\|_2^2 + \left\| G_y z_k \right\|_2^2 \right) = \frac{\mu}{2}\left\| \Gamma_0 z \right\|_2^2 + \frac{\mu}{2}\left\| \Gamma_1 z \right\|_2^2. \qquad (20)$$

By introducing block matrix notation, (19) can be rewritten as follows:

$$\min_{z}\; \frac{1}{2}\left\| Dz - x \right\|_2^2 + \lambda \left\| z \right\|_1 + \frac{\mu}{2}\left\| \Gamma_0 z \right\|_2^2 + \frac{\mu}{2}\left\| \Gamma_1 z \right\|_2^2, \qquad (21)$$

where

$$\Gamma_0 = \begin{pmatrix} G_x & & \\ & \ddots & \\ & & G_x \end{pmatrix}, \qquad \Gamma_1 = \begin{pmatrix} G_y & & \\ & \ddots & \\ & & G_y \end{pmatrix}. \qquad (22)$$
To apply ADMM to (21), we introduce an auxiliary variable $u$ and obtain the following optimization problem:

$$\min_{z,u}\; \frac{1}{2}\left\| Dz - x \right\|_2^2 + \frac{\mu}{2}\left\| \Gamma_0 z \right\|_2^2 + \frac{\mu}{2}\left\| \Gamma_1 z \right\|_2^2 + \lambda \left\| u \right\|_1 \quad \text{s.t.} \quad z = u. \qquad (23)$$

Then we have an iterative scheme for (23) as follows:

$$z^{j+1} = \arg\min_{z}\; \frac{1}{2}\left\| Dz - x \right\|_2^2 + \frac{\mu}{2}\left\| \Gamma_0 z \right\|_2^2 + \frac{\mu}{2}\left\| \Gamma_1 z \right\|_2^2 + \frac{\rho}{2}\left\| z - u^{j} + v^{j} \right\|_2^2, \qquad (24)$$

$$u^{j+1} = \arg\min_{u}\; \lambda \left\| u \right\|_1 + \frac{\rho}{2}\left\| z^{j+1} - u + v^{j} \right\|_2^2, \qquad (25)$$

$$v^{j+1} = v^{j} + z^{j+1} - u^{j+1}. \qquad (26)$$

The only difference between solving (23) and (10) lies in the $z$-update. The solution for (24) in the Fourier domain is given by:

$$\left( \hat{D}^{H}\hat{D} + \mu \hat{\Gamma}_0^{H}\hat{\Gamma}_0 + \mu \hat{\Gamma}_1^{H}\hat{\Gamma}_1 + \rho I \right) \hat{z} = \hat{D}^{H}\hat{x} + \rho\left( \hat{u} - \hat{v} \right), \qquad (27)$$
which can also be solved by the Sherman-Morrison formula [52]. The main steps of the proposed PWLS-CSC-GR algorithm are summarized in Algorithm 2.
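Since the gradient filters act on each feature map independently, their contribution to the system in (27) is a per-frequency diagonal term, so the Sherman-Morrison trick still applies with that diagonal in place of the plain penalty term. A hedged sketch (toy sizes and forward-difference gradient filters, no claim to the paper's exact implementation):

```python
import numpy as np

def cscgr_z_update(xh, Dh, uvh, gxh, gyh, mu, rho):
    """Solve Eq. (27) per frequency:
    (D^H D + mu (Gx^H Gx + Gy^H Gy) + rho I) z_hat = D^H x_hat + rho (u_hat - v_hat).
    xh, gxh, gyh: (H, W); Dh, uvh: (K, H, W)."""
    diag = rho + mu * (np.abs(gxh) ** 2 + np.abs(gyh) ** 2)  # (H, W), > 0
    b = np.conj(Dh) * xh + rho * uvh
    a = np.conj(Dh)                        # rank-one factor of D^H D per frequency
    binv, ainv = b / diag, a / diag
    aHbinv = np.sum(np.conj(a) * binv, axis=0)
    aHainv = np.sum(np.conj(a) * ainv, axis=0)
    # Sherman-Morrison: (diag I + a a^H)^{-1} b
    return binv - ainv * (aHbinv / (1.0 + aHainv))

# Toy check that the returned z_hat satisfies the linear system of Eq. (27).
rng = np.random.default_rng(3)
H = W = 8
Dh = np.fft.fft2(rng.standard_normal((3, 3, 3)), s=(H, W), axes=(1, 2))
xh = np.fft.fft2(rng.standard_normal((H, W)))
gxh = np.fft.fft2(np.array([[1.0, -1.0]]), s=(H, W))   # forward difference g_x
gyh = np.fft.fft2(np.array([[1.0], [-1.0]]), s=(H, W))
uvh = np.zeros((3, H, W), dtype=complex)
mu, rho = 0.06, 1.0

zh = cscgr_z_update(xh, Dh, uvh, gxh, gyh, mu, rho)
diag = rho + mu * (np.abs(gxh) ** 2 + np.abs(gyh) ** 2)
lhs = diag * zh + np.conj(Dh) * np.sum(Dh * zh, axis=0)
assert np.allclose(lhs, np.conj(Dh) * xh + rho * uvh)
```

Setting `mu = 0` recovers the PWLS-CSC system of (14), which is why only the $z$-update differs between the two algorithms.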
IV Experimental Results
In this section, extensive experiments were conducted to demonstrate the performance of the proposed methods. Two types of data, simulated and real clinical data, were used in our experiments. All experiments were performed in MATLAB 2017a on a PC equipped with an AMD Ryzen 5 1600 CPU at 3.2 GHz and 16 GB RAM. To accelerate the algorithms, we performed the CSC and CSC-GR phases on a GTX 1080 Ti graphics processing unit (GPU). Unless otherwise specified, all filters and regularization parameters in this section were the same: 32 filters with sizes of 10 × 10, α = 0.005, λ = 0.005, ρ = 100λ + 1 and μ = 0.06. Notably, in our numerical scheme, the computation of convolution was transformed into the frequency domain, where it can be treated as a dot product operation, so there was no need to keep the filter size odd. All the parameters were constants except ρ, which was self-adaptive according to [55]. The peak signal-to-noise ratio (PSNR), root-mean-square error (RMSE) and structural similarity (SSIM) [56] were utilized to quantitatively evaluate the performance of the methods.

IV-A Simulated Results
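For reference, the PSNR and RMSE metrics used below have straightforward definitions (SSIM requires the windowed statistics of [56] and is omitted here); the peak value and toy arrays are illustrative:

```python
import numpy as np

def rmse(x, ref):
    """Root-mean-square error between an image and its reference."""
    return np.sqrt(np.mean((x - ref) ** 2))

def psnr(x, ref, peak=1.0):
    # PSNR = 10 * log10(peak^2 / MSE), in dB
    return 10.0 * np.log10(peak ** 2 / np.mean((x - ref) ** 2))

ref = np.zeros((4, 4))
noisy = ref + 0.01               # uniform error of 0.01 per pixel
assert abs(rmse(noisy, ref) - 0.01) < 1e-12
assert abs(psnr(noisy, ref) - 40.0) < 1e-9   # 10*log10(1/1e-4) = 40 dB
```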
To demonstrate the performance of the proposed methods, a dataset of clinical CT images, authorized by Mayo Clinic and downloaded from "the NIH-AAPM-Mayo Clinic Low Dose CT Grand Challenge," was used [57]. This dataset contains 5,936 1 mm thick full-dose CT slices collected from 10 patients. The high-frequency components of ten images randomly selected from this dataset were used to train the predetermined filters $\{d_k\}$. Fig. 1 shows the images in the training set. The projection data in fan-beam geometry were simulated by Siddon's ray-driven algorithm [58]. A total of 64 projection views were evenly distributed over 360°, and a flat detector with a length of 41.3 cm had 512 bins. The image arrays covered 20 × 20 cm. The distance from the source to the center of rotation was 40 cm, and the distance from the center of the detector to the center of rotation was 40 cm. The outer iteration number was set to 2000. The inner iteration numbers were set to 20 and 100 for PWLS and CSC-GR, respectively. The iteration numbers for PWLS-CSC were the same as those for PWLS-CSC-GR.
Several state-of-the-art methods were compared with our proposed algorithms, including FBP, PWLS-TGV [14], PWLS-DL [31], PWLS-ULTRA [37] and LEARN [20].
IV-A1 Abdominal Case
The original abdominal image and the results reconstructed by different methods are shown in Fig. 2. The result of FBP contains severe streak artifacts and is almost clinically useless, while all other algorithms efficiently remove the artifacts. However, in Fig. 2(c), blocky effects are noticeable to a certain degree. PWLS-DL effectively avoids the blocky effects in Fig. 2(d), but the result appears oversmoothed in some organs. This is because the DL method is patch-based and usually averages all the overlapping patches to produce the final result. This procedure may efficiently suppress noise, but some small details may also be smoothed out. PWLS-ULTRA suffers from the same problem as PWLS-DL. In Fig. 2(f), the recent deep learning-based method LEARN also fails to recover small details. In Fig. 2(g) and (h), the proposed methods maintain more details while the artifacts are removed, especially some contrast-enhanced blood vessels in the liver. The differences between PWLS-CSC and PWLS-CSC-GR are almost invisible to visual inspection, but the absolute difference images, illustrated in Fig. 3, show that PWLS-CSC-GR exhibits better performance.
To better visualize the details, Fig. 4 shows a magnified region of interest (ROI), indicated by the red box in Fig. 2(a). The red arrows indicate several contrast-enhanced blood vessels in the liver, which can only be well identified by our methods. In particular, in Fig. 4(d)-(f), the liver region is smoothed, and the contrast-enhanced blood vessels are almost lost. In Fig. 4(g) and (h), the proposed methods preserve most details and are visually the most consistent with the reference image, even for the mottle-like structures in the liver.
IV-A2 Thoracic Case
Fig. 5 presents the thoracic images reconstructed by different methods. It can easily be observed that the FBP result is covered by artifacts, and it is difficult to clearly discriminate between the artifacts and the blood vessels in the lung. In Fig. 5(c)-(f), most artifacts are suppressed, but the results are smoothed to different degrees in some organs, and some important details are lost. In Fig. 5(g) and (h), more details are preserved in some regions while the artifacts are effectively removed.
In particular, two small regions, indicated by the blue and red boxes in Fig. 5(a), are enlarged in Fig. 6 and Fig. 7. In Fig. 6(b), the heart cannot be well recognized. PWLS-TGV distorts the boundaries and the structures in the heart in Fig. 6(c). As indicated by the red arrow in Figs. 6(d)-(f), the tiny vessels are blurred, and only PWLS-CSC and PWLS-CSC-GR satisfactorily recover this structural information. In Fig. 6(g), the result of PWLS-CSC introduces ringing artifacts, which are probably caused by inaccurate filters, while PWLS-CSC-GR effectively suppresses these artifacts with the aid of gradient regularization. In Fig. 7, the curve-like structures indicated by the red arrow are clearly visible in the results of PWLS-CSC and PWLS-CSC-GR, whereas none of the other methods adequately maintain these structures.
IV-A3 Quantitative Evaluation
To demonstrate the robustness of the proposed methods at various sampling rates, experiments with 48, 64 and 80 projection views were performed. Table I tabulates the quantitative results for the abdominal image shown in Fig. 2(a), reconstructed from 48, 64 and 80 projection views with different methods. When the number of projection views is 48, PWLS-DL and PWLS-ULTRA perform better than PWLS-CSC, but PWLS-CSC outperforms them when the number of projection views is 64 or 80, and PWLS-CSC-GR achieves the best performance in all situations. Table II displays the quantitative evaluations for the thoracic image shown in Fig. 5(a), reconstructed from 48, 64 and 80 projection views with different methods. A similar trend is shown in Table II, in which PWLS-CSC-GR obtains the highest scores on all three metrics and PWLS-CSC also outperforms PWLS-TGV and PWLS-DL under these conditions. A plausible reason is that the thoracic image has more small details, which can be better represented by CSC. In summary, the proposed PWLS-CSC and PWLS-CSC-GR methods are more robust to different sampling rates than the other methods.
Num. of Views  48  64  80  

PSNR  RMSE  SSIM  PSNR  RMSE  SSIM  PSNR  RMSE  SSIM  
FBP  24.15  0.06203  0.41375  25.35  0.05417  0.45858  26.40  0.04785  0.54287 
PWLS-TGV  41.06  0.00885  0.96068  44.05  0.00627  0.97697  46.91  0.00451  0.98736 
PWLS-DL  41.75  0.00817  0.96863  44.84  0.00573  0.97975  47.65  0.00414  0.98915 
PWLS-ULTRA  41.69  0.00824  0.96672  44.46  0.00599  0.97791  46.02  0.00500  0.98355 
LEARN  40.43  0.00953  0.95421  43.28  0.00692  0.97254  44.74  0.00580  0.97945 
PWLS-CSC  41.36  0.00855  0.96375  44.92  0.00568  0.98064  47.75  0.00410  0.98911 
PWLS-CSC-GR  42.52  0.00748  0.96919  45.73  0.00517  0.98329  48.38  0.00381  0.99034 
Num. of Views  48  64  80  

PSNR  RMSE  SSIM  PSNR  RMSE  SSIM  PSNR  RMSE  SSIM  
FBP  21.85  0.08079  0.36232  22.99  0.07087  0.42335  23.90  0.06380  0.47914 
PWLS-TGV  40.74  0.00919  0.96781  44.43  0.00600  0.98310  48.44  0.00379  0.99231 
PWLS-DL  39.80  0.01023  0.96203  43.68  0.00655  0.98145  47.21  0.00436  0.99069 
PWLS-ULTRA  42.55  0.00745  0.97535  45.13  0.00554  0.98399  47.04  0.00445  0.98884 
LEARN  39.43  0.01068  0.95866  42.52  0.00743  0.97716  44.29  0.00613  0.98340 
PWLS-CSC  41.55  0.00837  0.97177  45.07  0.00558  0.98517  48.69  0.00368  0.99269 
PWLS-CSC-GR  42.78  0.00726  0.97687  46.07  0.00497  0.98753  49.54  0.00333  0.99372 
IV-B Real Data
To evaluate the proposed methods in a real clinical configuration, projection data acquired on a UEG Dental CT scanner using a normal-dose protocol of 110 kVp and 13 mA with 10 ms per projection were used. Five hundred sampling angles were evenly distributed over 360 degrees. In each projection view, 950 detector bins covered a field of view (FOV) of 18 × 18 × 25 cm. The reconstructed image is a matrix of 256 × 256 pixels. The outer iteration number was set to 40, and the inner iteration numbers for PWLS and CSC-GR were set to 2 and 100, respectively. The iteration numbers for PWLS-CSC were the same as those for PWLS-CSC-GR.
Fig. 8 shows the reference image reconstructed by ART with fully sampled (500 views) projection data and the results reconstructed by various methods with downsampled projection data of 125 views. Due to hardware limitations, noticeable noise and artifacts still exist in the fully sampled result in Fig. 8(a). In Fig. 8(b), FBP suffers from severe artifacts that fill every area of the mouth. In Fig. 8(c), PWLS-TGV removes most of the artifacts, but the blocky effects at the tissue edges, indicated by the blue and yellow arrows, are obvious. PWLS-DL efficiently suppresses the artifacts present in the reference image. However, some artifacts caused by patch aggregation can be observed in Fig. 8(d), indicated by the red arrow, and the details of the teeth are blurred. In Fig. 8(e) and (f), PWLS-ULTRA and LEARN cannot remove all the artifacts, and the details of the teeth are blurred. In Fig. 8(g) and (h), PWLS-CSC and PWLS-CSC-GR efficiently eliminate the noise and artifacts and avoid the side effects caused by patch aggregation in PWLS-DL, and PWLS-CSC-GR clearly preserves the details of the teeth.
IV-C Effect of Filter Parameters
In this subsection, we evaluate the impact of the filter parameters on PWLS-CSC-GR, including the number of filters, the filter size, the number of training samples and the type of training set.
IV-C1 Number of Filters
To evaluate the impact of the number of filters, experiments were performed with K = 4, 8, 16, 24, 32, 64 and 150 on the images selected in subsection IV-A. The quantitative results are summarized in Table III. The metrics for both images rise at first and then stabilize as K continues to increase. This phenomenon may reflect the fact that when K is small, the performance can be improved significantly by adding more filters due to the increasing precision of the representation, but when K is large enough, the benefit decays rapidly. Based on this observation, to balance performance and computational time, the number of filters was set to 32 in this paper.
Image  Abdominal  Thoracic  

Num. of Filters  PSNR  RMSE  SSIM  Time  PSNR  RMSE  SSIM  Time 
4  44.77  0.00577  0.97990  722 s  45.00  0.00562  0.98450  788 s 
8  45.40  0.00537  0.98205  776 s  45.74  0.00517  0.98654  695 s 
16  45.54  0.00528  0.98262  745 s  46.01  0.00500  0.98737  763 s 
24  45.61  0.00524  0.98288  787 s  46.07  0.00497  0.98753  917 s 
32  45.73  0.00517  0.98329  970 s  46.07  0.00497  0.98753  985 s 
64  45.70  0.00519  0.98334  1069 s  46.18  0.00491  0.98779  1371 s 
150  45.65  0.00522  0.98308  2087 s  46.03  0.00500  0.98752  2418 s 
IV-C2 Filter Sizes
To assess the impact of the filter size, different sizes, including 6 × 6, 8 × 8, 10 × 10, 12 × 12 and 14 × 14, were tested. The results are given in Fig. 9. They show that the performance of our method is not very sensitive to the filter size and that there are no distinct differences among the sizes. As a result, we simply set the filter size to 10 × 10, which yields approximately the maximum performance in Fig. 9.
IV-C3 Number of Training Samples
In this subsection, different numbers of training samples are used to assess the impact of the size of the training set. All images used were randomly selected from the Mayo dataset. The quantitative results are given in Table IV. Notably, as the number of training samples increases, the improvement in performance is small, which can be treated as evidence that, for the current number of filters, the performance is robust to the size of the training set.
Image  Abdominal  Thoracic  

Num. of Samples  PSNR  RMSE  SSIM  PSNR  RMSE  SSIM 
1  45.43  0.00535  0.98236  45.48  0.00532  0.98595 
3  45.69  0.00519  0.98313  45.94  0.00504  0.98717 
5  45.68  0.00520  0.98317  45.94  0.00504  0.98727 
10  45.73  0.00517  0.98329  46.07  0.00497  0.98753 
50  45.68  0.00520  0.98313  46.12  0.00494  0.98764 
IV-C4 Filters Trained with Different Training Sets
In this subsection, we consider different types of images to form the training set. Three datasets, comprising undersampled CT images, natural images and the intermediate images in the reconstruction procedure, were tested. Fig. 10 shows the undersampled CT images reconstructed by FBP. Fig. 11 shows the natural image dataset. The undersampled CT images and natural images were used to train predetermined filters. The adaptive filters were trained on the intermediate images of the reconstruction procedure and updated in each iteration, similar to the adaptive dictionary learning in [31]. Thirty-two filters with a size of 10 × 10 were trained with the different training sets, and the filters are shown in Fig. 12 (the adaptive filters were trained with the intermediate result of the abdominal image). It can be observed that the learned adaptive filters become sharper and cleaner in Fig. 12(c)-(e), and the two pretrained filter sets in Fig. 12(a) and (b) also capture the main structures, although the filters look noisy. Moreover, in Fig. 13, the reconstruction results with filters trained on the various image sets appear visually similar, and the quantitative results shown in Table V are also close. Notably, although the results reconstructed with the adaptive filters achieve high scores, updating the filters in each iteration is time consuming. In summary, our algorithm is robust to the choice of filters (either predetermined or adaptively learned).
Image  Abdominal  Thoracic  

Training Set  PSNR  RMSE  SSIM  PSNR  RMSE  SSIM 
Undersampled CT Images  44.12  0.00622  0.97759  43.73  0.00651  0.98006 
Natural Images  44.71  0.00581  0.97974  44.65  0.00586  0.98329 
Intermediate Images  45.48  0.00532  0.98237  45.65  0.00522  0.98655 
Full-Sampled CT Images  45.73  0.00517  0.98329  46.07  0.00497  0.98753 
IV-D Convergence Analysis and the Effect of Regularization Parameters
IV-D1 Convergence Analysis
Fig. 14 shows the PSNR and RMSE versus the number of iterations for PWLS-CSC-GR. With an increasing iteration number, both the PSNR and RMSE curves converge to stable values. These observations demonstrate that the proposed numerical scheme can efficiently optimize the energy functions to a satisfactory solution.
IV-D2 Effect of α
To investigate the sensitivity to α, experiments were performed for various values. The results are presented in Fig. 15. The PSNR and SSIM decrease with increasing α. However, according to our experiments, the smaller α is, the more slowly the algorithm converges. When α is less than or equal to 0.0025, the computational cost is at least twice that when α is equal to 0.005. Therefore, we set α to 0.005 in our algorithms.
IV-D3 Effect of λ
To investigate the sensitivity of our method to λ, experiments were performed for various values. The results are shown in Fig. 16. The PSNR and SSIM for both images demonstrate similar trends: they rise rapidly with increasing λ, reach a peak when λ equals 0.005, and then begin to decrease slowly when λ is larger than 0.005. Thus, λ was set to 0.005 in this paper.
IV-D4 Effect of μ
To explore the impact of μ, experiments were performed for various values. The results are given in Fig. 17. The PSNR and SSIM change slowly with μ, which means that PWLS-CSC-GR is not sensitive to μ. According to the experimental results, μ was set to 0.06 in our experiments.
V Discussion and Conclusion
With its development in recent years, CSC has proven useful in many imaging problems, including super-resolution, image fusion and image decomposition. Instead of dividing an image into overlapping patches, CSC works directly on the whole image, which maintains more details and avoids artifacts caused by patch aggregation. In this paper, we propose two methods based on CSC. The basic version introduces CSC into the PWLS reconstruction framework. To further improve the performance and preserve more structural information, gradient regularization on the feature maps is imposed on the basic version. Qualitative and quantitative results demonstrate the merits of our methods.
In the experiments of Sections IV-A and IV-B, the filters and parameters were kept the same, showing the generalizability of the proposed methods: there is no need to adjust the filters or parameters patient by patient. We also examined the impact of the filters on our method. The experimental results show that PWLS-CSC-GR can work well even with only four filters. PWLS-CSC-GR is also robust to the training set, or even works without a training set, and can thus be treated as an unsupervised learning method.
Another important issue is the computational time. The main cost of our methods comprises two parts: training the filters and the reconstruction. Training 32 filters with 10 images costs 85 s on the GPU. Because this operation is offline and there is no need for a large training set, this part is not the main problem. On the other hand, the reconstruction is time-consuming. Although our methods have a computational burden similar to that of PWLS-DL, several techniques, including parallel computing and advanced optimization methods, can be applied for acceleration.
One of the most important deep learning models is the CNN, which is also based on the convolution operator. In CSC, a signal is represented by a summation of convolutions between a set of filters and the corresponding feature maps, and the key point is to calculate the feature maps with certain (predetermined or adaptive) filters; a CNN instead trains cascaded filters to convolve with the inputs. Furthermore, current CNN-based methods still lack theoretical guarantees. Most deep learning methods are data-driven, and the results cannot be guaranteed without sufficient training data. In contrast, CSC, as an unsupervised learning method, has a rigorous mathematical foundation. It is robust to the number of training samples (as shown in Secs. IV-C3 and IV-C4) and can even work without training data. On the other hand, the relationship between CSC and CNN methods was analyzed in [59, 60], which found that, assuming the signals originate from a multilayer CSC model, the layered-thresholding pursuit algorithm for decomposing a given measurement vector exactly equals the forward propagation in CNNs. This interesting finding provides a new way to explore the interpretability of deep learning.
In conclusion, inspired by the successful applications of CSC in signal processing, we explored the potential of this method within a PWLS image reconstruction framework, resulting in two novel algorithms referred to as PWLS-CSC and PWLS-CSCGR. We evaluated the proposed algorithms on simulated and real data, and the experimental results show that our methods are competitive with several state-of-the-art methods. The robustness of our methods was also investigated through extensive analysis across experimental configurations. In future work, we will extend our methods to other CT imaging topics, such as metal artifact reduction and low-dose CT. The combination with deep learning-based methods is also an interesting direction.
References
 [1] D. J. Brenner and E. J. Hall, “Computed tomography–an increasing source of radiation exposure,” New Engl. J. Med., vol. 357, no. 22, pp. 2277–2284, 2007.
 [2] R. Gordon, R. Bender, and G. T. Herman, “Algebraic reconstruction techniques (ART) for three-dimensional electron microscopy and X-ray photography,” J. Theor. Biol., vol. 29, no. 3, pp. 471–481, 1970.
 [3] A. H. Andersen and A. C. Kak, “Simultaneous algebraic reconstruction technique (SART): a superior implementation of the ART algorithm,” Ultrasonic Imaging, vol. 6, no. 1, pp. 81–94, 1984.
 [4] A. P. Dempster, N. M. Laird, and D. B. Rubin, “Maximum likelihood from incomplete data via the EM algorithm,” J. Roy. Stat. Soc. B Met., vol. 39, no. 1, pp. 1–38, 1977.
 [5] E. J. Candès, J. Romberg, and T. Tao, “Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information,” IEEE Trans. Inform. Theory, vol. 52, no. 2, pp. 489–509, 2006.
 [6] D. L. Donoho, “Compressed sensing,” IEEE Trans. Inform. Theory, vol. 52, no. 4, pp. 1289–1306, 2006.
 [7] E. Y. Sidky, C. Kao, and X. Pan, “Accurate image reconstruction from few-views and limited-angle data in divergent-beam CT,” J. Xray Sci. Technol., vol. 14, no. 2, pp. 119–139, 2006.
 [8] E. Y. Sidky and X. Pan, “Image reconstruction in circular cone-beam computed tomography by constrained, total-variation minimization,” Phys. Med. Biol., vol. 53, no. 17, pp. 4777–4807, 2008.
 [9] Y. Liu, J. Ma, Y. Fan, and Z. Liang, “Adaptive-weighted total variation minimization for sparse data toward low-dose X-ray computed tomography image reconstruction,” Phys. Med. Biol., vol. 57, no. 23, pp. 7923–7956, 2012.
 [10] Y. Zhang, W.-H. Zhang, H. Chen, M.-L. Yang, T.-Y. Li, and J.-L. Zhou, “Few-view image reconstruction combining total variation and a high-order norm,” Int. J. Imag. Syst. Tech., vol. 23, no. 3, pp. 249–255, 2013.
 [11] Y. Zhang, W. Zhang, Y. Lei, and J. Zhou, “Few-view image reconstruction with fractional-order total variation,” J. Opt. Soc. Am. A, vol. 31, no. 5, pp. 981–995, 2014.
 [12] H. Kim, J. Chen, A. Wang, C. Chuang, M. Held, and J. Pouliot, “Non-local total-variation (NLTV) minimization combined with reweighted l1-norm for compressed sensing CT reconstruction,” Phys. Med. Biol., vol. 61, no. 18, pp. 6878–6891, 2016.
 [13] Y. Zhang, Y. Xi, Q. Yang, W. Cong, J. Zhou, and G. Wang, “Spectral CT reconstruction with image sparsity and spectral mean,” IEEE Trans. Comput. Imag., vol. 2, no. 4, pp. 510–523, 2016.
 [14] S. Niu, Y. Gao, Z. Bian, J. Huang, W. Chen, G. Yu, Z. Liang, and J. Ma, “Sparse-view X-ray CT reconstruction via total generalized variation regularization,” Phys. Med. Biol., vol. 59, no. 12, pp. 2997–3017, 2014.
 [15] J. A. Fessler, “Penalized weighted least-squares image reconstruction for positron emission tomography,” IEEE Trans. Med. Imaging, vol. 13, no. 2, pp. 290–300, 1994.
 [16] K. Bredies, K. Kunisch, and T. Pock, “Total generalized variation,” SIAM J. Imaging Sci., vol. 3, no. 3, pp. 492–526, 2010.
 [17] G. Chen, J. Tang, and S. Leng, “Prior image constrained compressed sensing (PICCS): a method to accurately reconstruct dynamic CT images from highly undersampled projection data sets,” Med. Phys., vol. 35, no. 2, pp. 660–663, 2008.
 [18] G. Wang, “A perspective on deep imaging,” IEEE Access, vol. 4, pp. 8914–8924, 2016.
 [19] G. Wang, J. C. Ye, K. Mueller, and J. A. Fessler, “Image reconstruction is a new frontier of machine learning,” IEEE Trans. Med. Imaging, vol. 37, no. 6, pp. 1289–1296, 2018.
 [20] H. Chen, Y. Zhang, Y. Chen, J. Zhang, W. Zhang, H. Sun, Y. Lv, P. Liao, J. Zhou, and G. Wang, “LEARN: Learned experts’ assessment-based reconstruction network for sparse-data CT,” IEEE Trans. Med. Imaging, vol. 37, no. 6, pp. 1333–1347, 2018.
 [21] J. Adler and O. Öktem, “Learned primal-dual reconstruction,” IEEE Trans. Med. Imaging, vol. 37, no. 6, pp. 1322–1332, 2018.
 [22] B. Zhu, J. Z. Liu, S. F. Cauley, B. R. Rosen, and M. S. Rosen, “Image reconstruction by domain-transform manifold learning,” Nature, vol. 555, pp. 487–492, 2018.
 [23] J. He, Y. Yang, Y. Wang, D. Zeng, Z. Bian, H. Zhang, J. Sun, Z. Xu, and J. Ma, “Optimizing a parameterized plug-and-play ADMM for iterative low-dose CT reconstruction,” IEEE Trans. Med. Imaging, vol. 38, no. 2, pp. 371–382, 2019.
 [24] H. Chen, Y. Zhang, W. Zhang, P. Liao, K. Li, J. Zhou, and G. Wang, “Low-dose CT via convolutional neural network,” Biomed. Opt. Exp., vol. 8, no. 2, pp. 679–694, 2017.
 [25] E. Kang, J. Min, and J. C. Ye, “A deep convolutional neural network using directional wavelets for low-dose X-ray CT reconstruction,” Med. Phys., vol. 44, no. 10, 2017.
 [26] H. Chen, Y. Zhang, M. K. Kalra, F. Lin, Y. Chen, P. Liao, J. Zhou, and G. Wang, “Low-dose CT with a residual encoder-decoder convolutional neural network,” IEEE Trans. Med. Imaging, vol. 36, no. 12, pp. 2524–2535, 2017.
 [27] M. Elad and M. Aharon, “Image denoising via sparse and redundant representations over learned dictionaries,” IEEE Trans. Image Process., vol. 15, no. 12, pp. 3736–3745, 2006.
 [28] J. Mairal, M. Elad, and G. Sapiro, “Sparse representation for color image restoration,” IEEE Trans. Image Process., vol. 17, no. 1, pp. 53–69, 2008.
 [29] Y. Chen, X. Ye, and F. Huang, “A novel method and fast algorithm for MR image reconstruction with significantly undersampled data,” Inverse Probl. Imag., vol. 4, no. 2, pp. 223–240, 2010.
 [30] S. Ravishankar and Y. Bresler, “MR image reconstruction from highly undersampled k-space data by dictionary learning,” IEEE Trans. Med. Imaging, vol. 30, no. 5, pp. 1028–1041, 2011.
 [31] Q. Xu, H. Yu, X. Mou, L. Zhang, J. Hsieh, and G. Wang, “Low-dose X-ray CT reconstruction via dictionary learning,” IEEE Trans. Med. Imaging, vol. 31, no. 9, pp. 1682–1697, 2012.
 [32] J. Yang, J. Wright, T. S. Huang, and Y. Ma, “Image super-resolution via sparse representation,” IEEE Trans. Image Process., vol. 19, no. 11, pp. 2861–2873, 2010.
 [33] Y. Lu, J. Zhao, and G. Wang, “Few-view image reconstruction with dual dictionaries,” Phys. Med. Biol., vol. 57, no. 1, pp. 173–189, 2012.
 [34] B. Zhao, H. Ding, Y. Lu, G. Wang, J. Zhao, and S. Molloi, “Dual-dictionary learning-based iterative image reconstruction for spectral computed tomography application,” Phys. Med. Biol., vol. 57, no. 24, pp. 8217–8229, 2012.
 [35] Y. Chen, L. Shi, Q. Feng, J. Yang, H. Shu, L. Luo, J. Coatrieux, and W. Chen, “Artifact suppressed dictionary learning for low-dose CT image processing,” IEEE Trans. Med. Imaging, vol. 33, no. 12, pp. 2271–2292, 2014.
 [36] J. Liu, J. Ma, Y. Zhang, Y. Chen, J. Yang, H. Shu, L. Luo, G. Coatrieux, W. Yang, Q. Feng, and W. Chen, “Discriminative feature representation to improve projection data inconsistency for low-dose CT imaging,” IEEE Trans. Med. Imaging, vol. 36, no. 12, pp. 2499–2509, 2017.
 [37] X. Zheng, S. Ravishankar, Y. Long, and J. A. Fessler, “PWLS-ULTRA: An efficient clustering and learning-based approach for low-dose 3D CT image reconstruction,” IEEE Trans. Med. Imaging, vol. 37, no. 6, pp. 1498–1510, 2018.
 [38] M. S. Lewicki and T. J. Sejnowski, “Coding time-varying signals using sparse, shift-invariant representations,” in Advances in Neural Information Processing Systems, pp. 730–736, 1999.
 [39] M. D. Zeiler, D. Krishnan, G. W. Taylor, and R. Fergus, “Deconvolutional networks,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., pp. 2528–2535, 2010.
 [40] S. Gu, W. Zuo, Q. Xie, D. Meng, X. Feng, and L. Zhang, “Convolutional sparse coding for image super-resolution,” in Proc. IEEE Int. Conf. Computer Vision, pp. 1823–1831, 2015.
 [41] B. Wohlberg, “Convolutional sparse representations as an image model for impulse noise restoration,” in Image, Video, and Multidimensional Signal Process. Workshop, pp. 1–5, 2016.
 [42] Y. Liu, X. Chen, R. K. Ward, and Z. J. Wang, “Image fusion with convolutional sparse representation,” IEEE Signal Proc. Lett., vol. 23, no. 12, pp. 1882–1886, 2016.
 [43] H. Zhang and V. M. Patel, “Convolutional sparse and low-rank coding-based rain streak removal,” in IEEE Winter Conf. App. Computer Vision (WACV), pp. 1259–1267, IEEE, 2017.
 [44] H. Zhang and V. M. Patel, “Convolutional sparse and low-rank coding-based image decomposition,” IEEE Trans. Image Process., vol. 27, no. 5, pp. 2121–2133, 2018.
 [45] T. Nguyen-Duc, T. M. Quan, and W.-K. Jeong, “Frequency-splitting dynamic MRI reconstruction using multi-scale 3D convolutional sparse coding and automatic parameter selection,” Med. Image Anal., vol. 53, pp. 179–196, 2019.
 [46] T. Li, X. Li, J. Wang, J. Wen, H. Lu, J. Hsieh, and Z. Liang, “Nonlinear sinogram smoothing for low-dose X-ray CT,” IEEE Trans. Nucl. Sci., vol. 51, no. 5, pp. 2505–2513, 2004.
 [47] B. Wohlberg, “Efficient algorithms for convolutional sparse representations,” IEEE Trans. Image Process., vol. 25, no. 1, pp. 301–315, 2016.
 [48] I. A. Elbakri and J. A. Fessler, “Statistical image reconstruction for polyenergetic X-ray computed tomography,” IEEE Trans. Med. Imaging, vol. 21, no. 2, pp. 89–99, 2002.
 [49] S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein, “Distributed optimization and statistical learning via the alternating direction method of multipliers,” Foundations and Trends in Machine learning, vol. 3, no. 1, pp. 1–122, 2011.
 [50] H. Bristow, A. Eriksson, and S. Lucey, “Fast convolutional sparse coding,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., pp. 391–398, 2013.
 [51] C. Rusu, B. Dumitrescu, and S. A. Tsaftaris, “Explicit shift-invariant dictionary learning,” IEEE Signal Process. Lett., vol. 21, no. 1, pp. 6–9, 2014.
 [52] B. Wohlberg, “Efficient convolutional sparse coding,” in IEEE Int. Conf. Acoust. Speech Signal Process. (ICASSP), pp. 7173–7177, IEEE, 2014.
 [53] J. M. Bioucas-Dias and M. A. Figueiredo, “A new TwIST: two-step iterative shrinkage/thresholding algorithms for image restoration,” IEEE Trans. Image Process., vol. 16, no. 12, pp. 2992–3004, 2007.
 [54] B. Wohlberg, “Convolutional sparse representations with gradient penalties,” in IEEE Int. Conf. Acoust., Speech Signal Process. (ICASSP), pp. 6528–6532, IEEE, 2018.
 [55] B. He, H. Yang, and S. Wang, “Alternating direction method with self-adaptive penalty parameters for monotone variational inequalities,” J. Optimiz. Theory App., vol. 106, no. 2, pp. 337–356, 2000.
 [56] Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, “Image quality assessment: from error visibility to structural similarity,” IEEE Trans. Image Process., vol. 13, no. 4, pp. 600–612, 2004.
 [57] The 2016 NIH-AAPM-Mayo Clinic Low Dose CT Grand Challenge. [Online]. Available: http://www.aapm.org/GrandChallenge/LowDoseCT/.
 [58] R. L. Siddon, “Fast calculation of the exact radiological path for a three-dimensional CT array,” Med. Phys., vol. 12, no. 2, pp. 252–255, 1985.
 [59] V. Papyan, Y. Romano, and M. Elad, “Convolutional neural networks analyzed via convolutional sparse coding,” J. Mach. Learn. Res., vol. 18, no. 1, pp. 2887–2938, 2017.
 [60] V. Papyan, Y. Romano, J. Sulam, and M. Elad, “Theoretical foundations of deep learning via sparse representations: A multilayer sparse model and its connection to convolutional neural networks,” IEEE Signal Proc. Mag., vol. 35, no. 4, pp. 72–89, 2018.