A fast method for simultaneous reconstruction and segmentation in X-ray CT application

01/30/2021
by Yiqiu Dong, et al.

In this paper, we propose a fast method for simultaneous reconstruction and segmentation (SRS) in X-ray computed tomography (CT). Our work is based on the SRS model in which Bayes' rule and the maximum a posteriori (MAP) estimate are applied to a hidden Markov measure field model (HMMFM). The original model contains a logarithmic-summation (log-sum) term, which is non-separable with respect to the classification index, and the minimization problem in the model was solved by a constrained gradient descent method, the Frank-Wolfe algorithm, which is very time-consuming especially when dealing with large-scale CT problems. The starting point of this paper is the commutativity of the log-sum operations, by which the log-sum problem can be transformed into a sum-log problem through an auxiliary variable. The resulting sum-log problem for the SRS model is separable. After applying the alternating minimization method, the problem splits into several easy-to-solve convex sub-problems. We also study an improved model with an additional Tikhonov regularization term, and give some convergence results. Experimental results demonstrate that the proposed algorithms produce results comparable to the original SRS method with much less CPU time.


1 Introduction

X-ray computed tomography (CT) is an important application of inverse problems, which reconstructs the attenuation coefficients of an object from the damping of X-rays. Since different materials have different attenuation coefficients, X-ray CT allows us to see the interior of an object. As X-ray CT is widely applied in many fields, a large number of reconstruction methods have been proposed and used in industry, such as the filtered back projection algorithm [kuchment2014radon] and algebraic reconstruction techniques [kak2001principles, hansen2012air].

After reconstruction, image segmentation techniques are often applied to distinguish regions or objects of interest. A main drawback of separating reconstruction and segmentation is error propagation, i.e., errors in the reconstruction carry over into the segmentation. To overcome this drawback, several simultaneous reconstruction and segmentation (SRS) methods have been proposed in recent years. As far as we know, the first SRS method in CT was proposed in [ramlau2007mumford] based on the Mumford-Shah level-set approach [3]. In [van2008simultaneous], another SRS method based on the hidden Markov measure field model (HMMFM) [marroquin2003hidden] was proposed, where the segmentation result is obtained through a probability map. For CT with shadowed data, an SRS method based on the Potts model [potts1952some] was proposed in [10.1007/978-3-319-58771-4_25]. In order to segment objects that have different patterns during CT reconstruction, the dictionary learning technique was introduced into the SRS method in [8417955].

Following the idea from [marroquin2003hidden], in [doi:10.1080/17415977.2015.1124428] the means and variances of the segmentation classes were used as priors and a new SRS method was proposed. Numerical results show that this SRS method can significantly improve the accuracy of both reconstruction and segmentation. This method was later extended by removing the prior information on the variances, see [10.1007/978-3-319-58771-4_21]. In [doi:10.1080/17415977.2015.1124428], by using Bayes' rule and the maximum a posteriori (MAP) estimate with proper prior information on the means and variances of the classes, the authors proposed the following variational model:

\[
\min_{x,\delta}\ \frac{1}{2}\|Ax-b\|_2^2+\alpha R(\delta)-\beta\sum_{j=1}^{n}\log\Big(\sum_{k=1}^{K}\delta_{jk}\,\frac{1}{\sqrt{2\pi}\,\sigma_k}\exp\Big(-\frac{(x_j-\mu_k)^2}{2\sigma_k^2}\Big)\Big)\quad\text{s.t.}\ \sum_{k=1}^{K}\delta_{jk}=1,\ \delta_{jk}\ge 0,\tag{1}
\]

where $A$ is the system matrix of the CT scanner, $x$ is the image of attenuation coefficients, and $b$ contains all measurements. Here, we assume that the image consists of $K$ classes, and define $\delta_{jk}$ as the probability of pixel $j$ belonging to the $k$th class. Then we have the constraints $\sum_{k=1}^{K}\delta_{jk}=1$ and $\delta_{jk}\ge 0$ for all $j$ and $k$. Furthermore, $\mu_k$ and $\sigma_k$ are the mean and standard deviation of the $k$th class, respectively. In addition, $\alpha$ and $\beta$ are the regularization parameters, and $R(\delta)$ denotes the regularization according to the prior information; in [doi:10.1080/17415977.2015.1124428], Tikhonov regularization [tikhonov1943stability] and total variation (TV) regularization [4] were tested. The last term in (1) is called the log-sum term. Because the logarithmic operator and the summation operator are non-separable, the constrained minimization problem in (1) is very difficult to solve, and when solving it in [doi:10.1080/17415977.2015.1124428] the authors had to use an approximation of the log-sum term. Moreover, their algorithm is time-consuming, which limits its application to large-scale CT problems.

In this paper, we propose a new method to solve the model (1). Inspired by the work in [5, 2], the log-sum term can be transformed into a sum-log term by introducing an auxiliary variable, and the resulting sub-problems become much easier to solve. Numerical results show that our method provides results comparable to [doi:10.1080/17415977.2015.1124428] with much less CPU time.

The rest of the paper is organized as follows. In Section 2, the method in [doi:10.1080/17415977.2015.1124428] is briefly reviewed. In Section 3, by transforming the log-sum term, we introduce a new method that can solve the minimization problem in the model (1) without simplification. Furthermore, an improved model is proposed and some convergence results are given. In Section 4, numerical results are presented. Finally, conclusions are drawn in Section 5.

2 The method in [doi:10.1080/17415977.2015.1124428]

In [doi:10.1080/17415977.2015.1124428], the authors use the alternating minimization method [csiszar1984information] to solve the model (1) as follows:

\[
x^{t+1}=\arg\min_{x}\ E(x,\delta^{t}),\tag{2}
\]
\[
\delta^{t+1}=\arg\min_{\delta\in\Delta}\ E(x^{t+1},\delta),\tag{3}
\]
where $E(x,\delta)$ denotes the objective function in (1) and $\Delta$ the set defined by the constraints on $\delta$.

The last term in (1) is a log-sum term, which is non-separable with respect to the classification index $k$; thus the sub-problem (2) with respect to $x$ is very difficult to minimize. In order to solve it, the authors applied a two-step approximation to simplify the log-sum term. Denote

\[
\pi_j(x_j)=\sum_{k=1}^{K}\delta_{jk}\,\frac{1}{\sqrt{2\pi}\,\sigma_k}\exp\Big(-\frac{(x_j-\mu_k)^2}{2\sigma_k^2}\Big).\tag{4}
\]

In the first step, $\pi_j$ is approximated by a single "flat" Gaussian distribution,
\[
\pi_j(x_j)\approx\frac{1}{\sqrt{2\pi}\,\bar\sigma_j}\exp\Big(-\frac{(x_j-\bar\mu_j)^2}{2\bar\sigma_j^2}\Big),\tag{5}
\]
where the mean $\bar\mu_j$ and standard deviation $\bar\sigma_j$ are computed from $\delta_j$ and the class parameters $\{\mu_k,\sigma_k\}$; see [doi:10.1080/17415977.2015.1124428] for the explicit formulas.

After a few iterations on (2) and (3), once the segmentation is good enough, a "sharp" Gaussian distribution is used to approximate $\pi_j$ in order to further improve the reconstruction result, i.e.,
\[
\pi_j(x_j)\approx\frac{1}{\sqrt{2\pi}\,\sigma_{k_j}}\exp\Big(-\frac{(x_j-\mu_{k_j})^2}{2\sigma_{k_j}^2}\Big),\tag{6}
\]
where $k_j$ denotes the most probable class for pixel $j$ according to $\delta$.

Clearly, after using a single Gaussian distribution to approximate the sum of several Gaussian distributions in $\pi_j$, the summation operator inside the log-sum term in (1) is eliminated, and the reconstruction is obtained by solving a simple least-squares problem.

The sub-problem (3) consists of minimizing (1) with respect to $\delta$ under the constraints $\sum_{k}\delta_{jk}=1$ and $\delta_{jk}\ge 0$ for all $j$ and $k$. Because of the constraints and the non-differentiable TV term, a general constrained gradient descent method such as the Frank-Wolfe algorithm [bertsekas1999nonlinear] can be applied here. However, due to the singularity of the TV term, it is well known that gradient descent algorithms are very inefficient [6].

Obviously, solving the minimization problem in (1) is challenging due to the log-sum term. In this paper, we take advantage of the commutativity of the log-sum operations studied in [5, 2] and transform the model (1) into several easy-to-solve convex sub-problems. In addition, we provide some convergence results that were missing in [doi:10.1080/17415977.2015.1124428].

3 Our method and convergence results

In this section, we propose a new method that solves the model (1) and provide some convergence results of the new method.

3.1 Proposed method

Due to the log-sum term in (1), the sub-problem with respect to $x$ is non-convex and very difficult to solve. Inspired by the transformation of the log-sum operations introduced in [5, 2], we can transform the log-sum term into a much simpler form. To do so, we need the following proposition.

Proposition 3.1.

(commutativity of the log-sum operations [5, 2]). Given $v_k>0$ for $k=1,\dots,K$, we have
\[
-\log\Big(\sum_{k=1}^{K}v_k\Big)=\min_{\omega\in\Delta_K}\ \sum_{k=1}^{K}\omega_k\big(\log\omega_k-\log v_k\big),
\]
where $\Delta_K=\{\omega\in\mathbb{R}^{K}:\sum_{k=1}^{K}\omega_k=1,\ \omega_k\ge 0\}$, and the minimizer is $\omega_k^{*}=v_k/\sum_{l=1}^{K}v_l$.
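As a quick sanity check of this identity, write $S=\sum_{k}v_k$ and substitute the minimizer $\omega_k^{*}=v_k/S$:
\[
\sum_{k=1}^{K}\frac{v_k}{S}\Big(\log\frac{v_k}{S}-\log v_k\Big)=\sum_{k=1}^{K}\frac{v_k}{S}\log\frac{1}{S}=-\log S,
\]
and by Gibbs' inequality the objective equals $\mathrm{KL}(\omega\,\|\,v/S)-\log S\ge-\log S$ for every $\omega\in\Delta_K$, so no other feasible $\omega$ does better.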

According to the log-sum term in (1), we define

\[
v_{jk}=\delta_{jk}\,\frac{1}{\sqrt{2\pi}\,\sigma_k}\exp\Big(-\frac{(x_j-\mu_k)^2}{2\sigma_k^2}\Big).\tag{7}
\]

Then, applying Proposition 3.1 to (1), we transform the minimization problem (1) into a new optimization problem with respect to the variables $(x,\delta,\omega)$:

\[
\min_{x,\delta,\omega}\ \frac{1}{2}\|Ax-b\|_2^2+\alpha\sum_{k=1}^{K}\mathrm{TV}(\delta_{\cdot k})+\beta\sum_{j=1}^{n}\sum_{k=1}^{K}\omega_{jk}\big(\log\omega_{jk}-\log v_{jk}\big),\tag{8}
\]

where $\delta$ and $\omega$ both satisfy the simplex constraints $\sum_{k}\delta_{jk}=\sum_{k}\omega_{jk}=1$ with $\delta_{jk},\omega_{jk}\ge 0$. Note that here we let $\delta_{jk}\ge\varepsilon$ for a small $\varepsilon>0$ to avoid $v_{jk}$ being 0; in the following we denote by $\Delta$ the simplex constraint set for $\omega$ and by $\Delta_\varepsilon$ the corresponding set for $\delta$ with this floor. Furthermore, in our method we use TV regularization on each column of $\delta$, i.e.,
\[
\mathrm{TV}(\delta_{\cdot k})=\sum_{j}\sqrt{(\delta_{j_h k}-\delta_{jk})^2+(\delta_{j_v k}-\delta_{jk})^2},
\]
where $j_h$ and $j_v$ represent the neighbor pixels of pixel $j$ in the horizontal and vertical directions, respectively. At the boundary of the image domain, the value of the nearest pixel is repeated.

In order to solve the minimization problem (8), we apply the alternating minimization method [csiszar1984information], i.e., we solve the sub-problems with respect to $x$, $\delta$, and $\omega$ alternately. The detailed algorithm is given in Algorithm 1.

  1. Set $t=0$, choose a tolerance $\epsilon_{\mathrm{tol}}>0$, and initialize $x^{0}$, $\delta^{0}$, and $\omega^{0}$ with $\delta^{0}_{jk}\ge\varepsilon$ for all $j$ and $k$.
  2. Update $x^{t+1}$ by
\[
x^{t+1}=\arg\min_{x}\ \frac{1}{2}\|Ax-b\|_2^2+\beta\sum_{j,k}\omega^{t}_{jk}\frac{(x_j-\mu_k)^2}{2\sigma_k^2}.\tag{9}
\]
  3. Update $\delta^{t+1}$ by
\[
\delta^{t+1}=\arg\min_{\delta\in\Delta_\varepsilon}\ \alpha\sum_{k}\mathrm{TV}(\delta_{\cdot k})-\beta\sum_{j,k}\omega^{t}_{jk}\log\delta_{jk}.\tag{10}
\]
  4. Update $\omega^{t+1}$ by
\[
\omega^{t+1}=\arg\min_{\omega\in\Delta}\ \sum_{j,k}\omega_{jk}\big(\log\omega_{jk}-\log v_{jk}(x^{t+1},\delta^{t+1})\big).\tag{11}
\]
  5. If the relative change of the iterates is below $\epsilon_{\mathrm{tol}}$, then stop. Otherwise, let $t\leftarrow t+1$ and go to 2.
Algorithm 1 Algorithm for solving the minimization problem in (8)

In Algorithm 1, the objective function of the sub-problem (9) with respect to $x$ is quadratic, so it can be solved efficiently by the CGLS method [bjorck1996numerical]. Since the sub-problem (11) is separable, we can solve it element-wise. From the first-order optimality condition together with the Lagrange multiplier technique for the constraint $\sum_{k}\omega_{jk}=1$, we easily obtain the closed-form solution:

\[
\omega^{t+1}_{jk}=\frac{v_{jk}(x^{t+1},\delta^{t+1})}{\sum_{l=1}^{K}v_{jl}(x^{t+1},\delta^{t+1})}.\tag{12}
\]
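In practice, (7) and (12) can be evaluated together in the log-domain to avoid numerical underflow of the Gaussian factors. The following MATLAB sketch illustrates this with dummy data; all variable names, sizes, and values are illustrative assumptions, not part of the original implementation:

    % Closed-form omega-update (12) via (7), computed in the log-domain.
    n = 64^2; K = 8;                                 % dummy problem sizes
    x = rand(n,1); delta = ones(n,K)/K;              % dummy x and delta
    mu = linspace(0,1,K); sigma = 0.2*ones(1,K);     % dummy class priors
    logv = log(delta) - log(sqrt(2*pi)*sigma) ...
         - (x - mu).^2 ./ (2*sigma.^2);              % log v_{jk} from (7)
    logv = logv - max(logv, [], 2);                  % shift rows before exp
    omega = exp(logv);
    omega = omega ./ sum(omega, 2);                  % (12): rows sum to one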

In the following subsection, we will focus on solving the sub-problem (10).

3.1.1 The sub-problem with respect to $\delta$

The difficulties in solving the minimization problem (10) come mainly from the non-differentiable term $\mathrm{TV}(\delta_{\cdot k})$, the highly nonlinear and non-quadratic term $-\beta\sum_{j,k}\omega_{jk}\log\delta_{jk}$, and the simplex constraint. In order to split them apart, we introduce two auxiliary variables $u$ and $w$ and let $u=\delta$ and $w=\delta$, so that $u$, $w$, and $\delta$ each carry one of the three difficulties, respectively. Now (10) becomes

\[
\min_{\delta\in\Delta_\varepsilon,\,u,\,w}\ \alpha\sum_{k}\mathrm{TV}(u_{\cdot k})-\beta\sum_{j,k}\omega_{jk}\log w_{jk}\quad\text{s.t.}\ u=\delta,\ w=\delta.\tag{13}
\]

Then, we apply the alternating direction method of multipliers (ADMM) [boyd2011distributed] to solve it.

By introducing two Lagrangian multipliers $\lambda_1$ and $\lambda_2$ for the linear constraints $u=\delta$ and $w=\delta$, respectively, we obtain the augmented Lagrangian [7]:
\[
\mathcal{L}(\delta,u,w;\lambda_1,\lambda_2)=\alpha\sum_{k}\mathrm{TV}(u_{\cdot k})-\beta\sum_{j,k}\omega_{jk}\log w_{jk}+\langle\lambda_1,u-\delta\rangle+\frac{\rho_1}{2}\|u-\delta\|_2^2+\langle\lambda_2,w-\delta\rangle+\frac{\rho_2}{2}\|w-\delta\|_2^2,
\]
where $\rho_1$ and $\rho_2$ are positive penalty parameters. According to ADMM, we need to solve the corresponding saddle-point problem of $\mathcal{L}$ in order to obtain $\delta^{t+1}$ in Algorithm 1, and the iterates in ADMM are generated as follows:

\[
u^{l+1}=\arg\min_{u}\ \alpha\sum_{k}\mathrm{TV}(u_{\cdot k})+\frac{\rho_1}{2}\big\|u-\delta^{l}+\lambda_1^{l}/\rho_1\big\|_2^2,\tag{14}
\]
\[
w^{l+1}=\arg\min_{w}\ -\beta\sum_{j,k}\omega_{jk}\log w_{jk}+\frac{\rho_2}{2}\big\|w-\delta^{l}+\lambda_2^{l}/\rho_2\big\|_2^2,\tag{15}
\]
\[
\delta^{l+1}=\arg\min_{\delta\in\Delta_\varepsilon}\ \frac{\rho_1}{2}\big\|u^{l+1}-\delta+\lambda_1^{l}/\rho_1\big\|_2^2+\frac{\rho_2}{2}\big\|w^{l+1}-\delta+\lambda_2^{l}/\rho_2\big\|_2^2,\tag{16}
\]
followed by the multiplier updates $\lambda_1^{l+1}=\lambda_1^{l}+\rho_1(u^{l+1}-\delta^{l+1})$ and $\lambda_2^{l+1}=\lambda_2^{l}+\rho_2(w^{l+1}-\delta^{l+1})$.

Sub-problem (14) can be solved column-by-column in $u$, and each column can be handled by the split Bregman method [1] with Neumann boundary conditions. In sub-problem (15), each element of $w$ can be solved separately. Based on the first-order optimality condition, the minimizer should satisfy the equation
\[
\rho_2\,w_{jk}^2-\rho_2\,d_{jk}\,w_{jk}-\beta\,\omega_{jk}=0,\qquad d_{jk}=\delta^{l}_{jk}-\lambda^{l}_{2,jk}/\rho_2.
\]
Due to the logarithmic operation, $w_{jk}$ needs to be positive, and the solution has the closed form
\[
w^{l+1}_{jk}=\frac{d_{jk}+\sqrt{d_{jk}^2+4\beta\,\omega_{jk}/\rho_2}}{2}.\tag{17}
\]
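For illustration, the element-wise update (17) can be coded in one vectorized line; the MATLAB sketch below uses dummy data and assumes the splitting introduced above:

    % Element-wise w-update (17): positive root of the quadratic equation.
    rho2 = 1; beta = 1;                              % illustrative values
    omega = rand(5,3); d = randn(5,3);               % dummy omega and d
    w = (d + sqrt(d.^2 + 4*beta*omega/rho2)) / 2;    % always positive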

Sub-problem (16) is a least-squares problem with a constraint on a convex set, and one approximate solution can be obtained by
(18)
where $\varepsilon$ is a pre-defined small positive number.
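One simple way to realize such an approximate solution, sketched here in MATLAB under the assumption that (18) amounts to clipping to the $\varepsilon$-floor and renormalizing each pixel row, is:

    % Approximate projection in the spirit of (18): clip and renormalize.
    eps0 = 1e-4;                       % the pre-defined small positive number
    delta = rand(5,3);                 % dummy data: 5 pixels, 3 classes
    delta = max(delta, eps0);          % enforce the eps-floor
    delta = delta ./ sum(delta, 2);    % each row sums to one again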

3.2 An improved model based on (8)

Looking back at the model (8), we can see that the only regularization term in the model acts on $\delta$. In [doi:10.1080/17415977.2015.1124428], due to the log-sum term in (1), the segmentation information represented in $\delta$ is utilized to regularize the reconstruction, and there is no additional smoothing requirement on $x$. In the proposed model, however, $x$ is updated in (9) using the unregularized $\omega$, which brings a risk: isolated points in the unregularized $\omega$ might lead to isolated points in the reconstruction result $x$. This phenomenon is shown in Figure 3 in our numerical results. In order to further improve the reconstruction, we add Tikhonov regularization on $x$ and modify the model (8) as follows:

\[
\min_{x,\delta,\omega}\ \frac{1}{2}\|Ax-b\|_2^2+\frac{\tau}{2}\|\nabla x\|_2^2+\alpha\sum_{k=1}^{K}\mathrm{TV}(\delta_{\cdot k})+\beta\sum_{j=1}^{n}\sum_{k=1}^{K}\omega_{jk}\big(\log\omega_{jk}-\log v_{jk}\big),\tag{19}
\]

where $\nabla$ denotes the discrete gradient operator using the forward finite difference scheme with Neumann boundary conditions, and $\tau$ is a regularization parameter.

  1. Set $t=0$, choose a tolerance $\epsilon_{\mathrm{tol}}>0$, and initialize $x^{0}$, $\delta^{0}$, and $\omega^{0}$ with $\delta^{0}_{jk}\ge\varepsilon$ for all $j$ and $k$.
  2. Update $x^{t+1}$ by
\[
x^{t+1}=\arg\min_{x}\ \frac{1}{2}\|Ax-b\|_2^2+\frac{\tau}{2}\|\nabla x\|_2^2+\beta\sum_{j,k}\omega^{t}_{jk}\frac{(x_j-\mu_k)^2}{2\sigma_k^2}.\tag{20}
\]
  3. Update $\delta^{t+1}$ by
\[
\delta^{t+1}=\arg\min_{\delta\in\Delta_\varepsilon}\ \alpha\sum_{k}\mathrm{TV}(\delta_{\cdot k})-\beta\sum_{j,k}\omega^{t}_{jk}\log\delta_{jk}.\tag{21}
\]
  4. Update $\omega^{t+1}$ by
\[
\omega^{t+1}=\arg\min_{\omega\in\Delta}\ \sum_{j,k}\omega_{jk}\big(\log\omega_{jk}-\log v_{jk}(x^{t+1},\delta^{t+1})\big).\tag{22}
\]
  5. If the relative change of the iterates is below $\epsilon_{\mathrm{tol}}$, stop. Otherwise, let $t\leftarrow t+1$ and go to 2.
Algorithm 2 Algorithm for solving the minimization problem in (19)

We list the detailed algorithm in Algorithm 2. The only difference between Algorithm 1 and Algorithm 2 is the added Tikhonov regularization term in the sub-problem (20) with respect to $x$. With Tikhonov regularization the objective function is still quadratic, so we can apply the CGLS method [bjorck1996numerical] to solve (20) efficiently.
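As an illustration of this step, the quadratic sub-problem (20) can be written as one stacked least-squares problem and handed to a CGLS-type Krylov solver. The MATLAB sketch below uses lsqr as a stand-in for CGLS and random stand-ins for the CT data; for brevity it keeps only the Tikhonov block and omits the $\omega$-weighted class term, which would add one more diagonal block:

    % Solve min_x 0.5*||A*x-b||^2 + 0.5*tau*||grad(x)||^2 as one LS problem.
    N = 64; n = N^2; tau = 0.1;                        % illustrative sizes
    A = sprandn(30*91, n, 0.01); b = randn(30*91, 1);  % stand-in CT data
    e = ones(N,1);
    D = spdiags([-e e], [0 1], N, N); D(N,N) = 0;      % Neumann boundary
    D1 = kron(speye(N), D);      % differences along the 1st image dimension
    D2 = kron(D, speye(N));      % differences along the 2nd image dimension
    M = [A; sqrt(tau)*D1; sqrt(tau)*D2];               % stacked operator
    rhs = [b; zeros(2*n, 1)];
    x = lsqr(M, rhs, 1e-6, 100);                       % CGLS-like solve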

3.3 Convergence results

The model (8) can be seen as a special case of (19) if we choose $\tau=0$. Thus, we only give the convergence results for Algorithm 2 solving the minimization problem in (19).

Proposition 3.2.

For the sequence generated by Algorithm 2, every cluster point is a coordinatewise minimum point of the objective function in (19).

Proof.

We use Theorem 5.1 in [Tseng2001] to obtain our convergence result. To keep the paper self-contained, we list Theorem 5.1 in the appendix; for more details, please see [Tseng2001]. Here we verify that our algorithm satisfies the assumptions of the theorem.

According to (19), we write the objective in the form (25), with $f_0$ the smooth coupling term and the remaining terms, including the indicator functions of the constraint sets, as the separable functions $f_i$.

We have the following statements:

  • The essentially cyclic rule (Definition 6.2) is satisfied, according to Algorithm 2.

  • Assumption B1 (in Definition 6.3) is satisfied, because $f_0$ is continuous on its domain.

  • Assumption B2 (in Definition 6.3) is satisfied, because the objective functions of (20), (21), and (22) are strictly convex in the corresponding block, by the following facts:

    • (20) is quadratic in $x$ and its second term is strictly convex.

    • The TV norm in (21) is convex.

    • The second term in (21) and the second term in (22) are separable, and each of their components is strictly convex.

    • The first term in (22) is linear.

    • The feasible sets of $\delta$ and $\omega$ are convex.

  • Assumption B3 (in Definition 6.3) is satisfied, because $f_0$ is continuous on its domain and $f_1,\dots,f_N$ are lsc.

  • Assumption C2 (in Definition 6.3) is satisfied, because the domain of $f_0$ is the whole space, which is trivially a Cartesian product.

By applying Theorem 6.4 in the appendix, i.e., Theorem 5.1 in [Tseng2001], we conclude that for the sequence generated by Algorithm 2, every cluster point is a coordinatewise minimum point of the objective function in (19).

4 Numerical experiments

In this section, we present numerical results to demonstrate the performance of our methods. We compare them with the method introduced in [doi:10.1080/17415977.2015.1124428], where a simplified model is solved instead of (1). All numerical tests are run on a Linux server equipped with a 2.30 GHz CPU using MATLAB R2018a. In the tests, when we use the CGLS method to solve the sub-problem with respect to $x$ in (8) or (19), we set the maximum number of iterations to 100 and stop when the relative change of the inner iterates falls below a tolerance, where $s$ denotes the inner iteration index. When we solve the sub-problem with respect to $\delta$, we stop ADMM after 50 iterations or when an analogous relative-change condition is satisfied. The same kind of stopping rule is used inside the split Bregman method for solving (14), where $l$ is the iteration index inside the split Bregman method. The parameter $\varepsilon$ is set to 0.0001 for all experiments.

4.1 Experimental settings

In our numerical experiments, the phantoms are generated by the AIR Tools package [hansen2012air] with the command phantomgallery, and the system matrix of the CT scanner is obtained by calling paralleltomo. Unless otherwise mentioned, the size of the phantoms is 64-by-64, the number of pixels in the detector is 91, and the projection angles are equally spaced. The underdetermined rate of our inverse problem, i.e., the ratio between the number of measurements and the number of unknowns, is then 0.667. In our tests, we assume that the measurements are corrupted by additive white Gaussian noise with mean 0 and a standard deviation proportional to the noise level and to the magnitude of the noise-free data $A\bar{x}$, where $\bar{x}$ denotes the true attenuation coefficients.
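For concreteness, the setup can be reproduced along the following lines in MATLAB with AIR Tools. The angular grid 0:6:174 is our assumption, chosen so that (30 angles x 91 rays)/64^2 ≈ 0.667 matches the stated underdetermined rate; the phantom name 'grains' and the noise scaling are likewise illustrative:

    % Sketch of the test setup with AIR Tools [hansen2012air].
    N = 64; p = 91; theta = 0:6:174;        % 30 equally spaced angles
    x_true = phantomgallery('grains', N);   % phantom choice is illustrative
    A = paralleltomo(N, theta, p);          % CT system matrix
    b_exact = A * x_true(:);
    eta = 0.05;                             % noise level
    e = randn(size(b_exact));
    b = b_exact + eta * norm(b_exact) * e / norm(e);   % assumed scaling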

In order to illustrate the performance of the methods, we define the reconstruction error and segmentation error as

\[
e_{\mathrm{rec}}=\frac{\|\hat{x}-\bar{x}\|_2}{\|\bar{x}\|_2},\qquad
e_{\mathrm{seg}}=\frac{1}{n}\sum_{j=1}^{n}\big(1-\delta_D(c_j,\bar{c}_j)\big),\tag{23}
\]
where $\hat{x}$ is the reconstruction result, $\delta_D$ is the Dirac delta function, and $c_j$ is the label of pixel $j$ in the segmentation result, which is given by
\[
c_j=\arg\max_{k\in\{1,\dots,K\}}\ \delta_{jk},\tag{24}
\]
and $\bar{c}_j$ denotes the true label of pixel $j$. The regularization parameters in our methods as well as in the method proposed in [doi:10.1080/17415977.2015.1124428] are chosen as the ones giving the smallest error.
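The error measures (23)-(24) are straightforward to compute; the MATLAB sketch below uses dummy data, and the normalizations follow our reading of the definitions above:

    % Reconstruction error (23) and segmentation error via labels (24).
    n = 64^2; K = 8;                                   % dummy problem sizes
    x_true = rand(n,1); x_rec = x_true + 0.05*randn(n,1);
    true_labels = randi(K, n, 1);
    delta = rand(n,K); delta = delta ./ sum(delta,2);  % dummy probabilities
    rec_err = norm(x_rec - x_true) / norm(x_true);     % relative error
    [~, labels] = max(delta, [], 2);                   % (24): argmax class
    seg_err = mean(labels ~= true_labels);             % mislabeled fraction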

4.2 Comparison on a piecewise constant phantom

We first compare our methods, solving (8) and (19), with the method proposed in [doi:10.1080/17415977.2015.1124428] on an 8-class piecewise constant phantom. The noise level is set to 0.05, and the mean and standard deviation of each class are given as prior information. In our methods, the remaining parameters are kept fixed across all runs.

In Table 1, we list the reconstruction error and segmentation error defined in (23) together with the CPU times in seconds for all three methods. All results shown in Table 1 are averages over 50 different noise realizations. It can be seen that the methods solving (8) and (19) achieve almost the same segmentation results, and both are better than the one from [doi:10.1080/17415977.2015.1124428]. Comparing the reconstruction errors, the method solving (8) gives the largest error, due to the lack of regularization on the reconstruction. By adding Tikhonov regularization in the model (19), we obtain results comparable to the simplified model in [doi:10.1080/17415977.2015.1124428]. In addition, both our methods cost similar CPU time, around a factor of 5 less than the method in [doi:10.1080/17415977.2015.1124428].

Method                                          Rec. error   Seg. error   CPU time (s)
Method in [doi:10.1080/17415977.2015.1124428]   0.086        0.033        469.7
Method by solving (8)                           0.106        0.027        93.2
Method by solving (19)                          0.088        0.026        93.9
Table 1: Comparison on a piecewise constant phantom

In order to compare the results visually, in Figures 1 and 2 we show the reconstruction and segmentation results for the best and worst cases among these 50 tests, i.e., the results with the smallest and largest error values, respectively. It is clear that our methods provide more accurate segmentation results, which can be seen from the dark gray class in the middle red region; see the third row for zoomed images. For the reconstruction results, the method in [doi:10.1080/17415977.2015.1124428] gives much smoother results than both proposed methods.

Figure 1: Comparison of reconstruction and segmentation results with respect to the best case on a piecewise constant phantom. Row 1: reconstruction results; row 2: segmentation results; row 3: zoomed segmentation results in the red square.
Figure 2: Comparison of reconstruction and segmentation results with respect to the worst case on a piecewise constant phantom. Row 1: reconstruction results; row 2: segmentation results.

In Figure 3, we use one example from the 50 tests to show the difference between $\omega$ in (8) and (19) and $\delta$ in [doi:10.1080/17415977.2015.1124428]. We treat $\omega$ as a segmentation using (24) and show it in the third row of Figure 3; note that in the method proposed in [doi:10.1080/17415977.2015.1124428] there is no variable $\omega$. As can be seen, in the method solving (8) the isolated points in $\omega$, marked by red squares, lead to isolated points in the reconstruction. After adding regularization on $x$ in (19), the number of isolated points is clearly reduced, and we can clearly see the smoothness in the reconstruction result from method (19). This indicates that the added regularization term is helpful for improving the reconstruction. Regarding the reconstruction from [doi:10.1080/17415977.2015.1124428], although there is no regularization on $x$, the result is still rather smooth without isolated points. This means that the simplification of the model in [doi:10.1080/17415977.2015.1124428] implicitly imposes smoothing regularization on the reconstruction through the segmentation $\delta$.

Figure 3: Comparison of $\omega$ in (8) and (19) with $\delta$ in [doi:10.1080/17415977.2015.1124428]. Row 1: the reconstructed $x$; row 2: the segmentation from $\delta$; row 3: the segmentation from $\omega$.

4.3 Comparison on a smooth phantom

In this experiment, we compare the methods on a smooth phantom with 3 different classes, which is much more difficult to segment than a piecewise constant phantom. The means and standard deviations of the three classes are given as prior information, and the remaining parameters are set accordingly.

In Table 2, we list the averages of the reconstruction errors, segmentation errors, and CPU times over 50 tests with different noise realizations. Our methods solving (8) and (19) achieve similar segmentation results, while the method proposed in [doi:10.1080/17415977.2015.1124428] provides the smallest segmentation error. Comparing the reconstruction results, our method with the model (19) gives the best reconstruction, followed by [doi:10.1080/17415977.2015.1124428]. It is obvious that the Tikhonov regularization in the model (19) plays an important role and reduces the reconstruction error significantly. In addition, our methods need less than a tenth of the CPU time of the method with the simplified model.

Method                                          Rec. error   Seg. error   CPU time (s)
Method in [doi:10.1080/17415977.2015.1124428]   0.203        0.166        374.9
Method by solving (8)                           0.215        0.177        28.8
Method by solving (19)                          0.195        0.172        33.7
Table 2: Comparison on a smooth phantom
Figure 4: Comparison of reconstruction and segmentation results with respect to the best case on a smooth phantom. Row 1: reconstruction results; row 2: segmentation results.
Figure 5: Comparison of reconstruction and segmentation results with respect to the worst case on a smooth phantom. Row 1: reconstruction results; row 2: segmentation results.

In Figures 4 and 5 we show the reconstruction and segmentation results of the three methods for the best and worst cases, respectively. The three methods generate comparable results, with no large visual differences between them.

4.4 Comparison of CPU times on different resolutions

In this section, we compare our methods with the method in [doi:10.1080/17415977.2015.1124428] with respect to CPU time. To do so, we generate the piecewise constant phantom at different resolutions and adjust the number of projection angles such that the underdetermined rate is kept at 0.667. The reported results are averages over 3 different noise realizations.

Phantom resolution                              64x64   128x128   256x256   512x512
Method in [doi:10.1080/17415977.2015.1124428]   469.7   1981.4    9447.3    -
Method with (8)                                 93.2    161.2     783.5     3224.2
Method with (19)                                93.9    138.0     560.4     2928.7
Table 3: Comparison of CPU time (in seconds) on a piecewise constant phantom

In Table 3, we list the CPU times in seconds for all three methods. Note that for the 512-by-512 phantom we could not apply the method in [doi:10.1080/17415977.2015.1124428] due to its heavy computational cost and limited memory. It is obvious that both our methods need much less CPU time than the one in [doi:10.1080/17415977.2015.1124428]. As the resolution increases, the method with Tikhonov regularization, i.e., the one solving the model (19), uses less and less computing time relative to the one in [doi:10.1080/17415977.2015.1124428], with the speed-up factor growing from about 5 to about 17, and it also needs less time than the method solving (8). The latter is because the sub-problem with respect to $x$ is better conditioned, so fewer iterations are needed to reach the stopping rule.

All the above comparisons indicate that the proposed algorithms are faster than the method in [doi:10.1080/17415977.2015.1124428]. There are three likely reasons: (a) the objective function handled by the Frank-Wolfe algorithm is singular due to the TV norm; (b) the dimension of the variables is large; (c) although there are more variables to solve for in the proposed method, the problem is separable and many sub-problems have closed-form solutions.

5 Conclusion

In this paper, we propose two new methods for simultaneous reconstruction and segmentation in X-ray CT applications. By using the commutativity of the log-sum operations, the original model proposed in [doi:10.1080/17415977.2015.1124428] can be solved much more efficiently. Although one more variable is introduced, the energy function becomes separable, and each sub-problem is convex and easy to solve. We also show some convergence results, and numerically uncover the role of the simplification steps in [doi:10.1080/17415977.2015.1124428]. Numerical results show that the proposed methods provide comparable reconstruction and segmentation results with much less CPU time.

6 Acknowledgments

We thank Hans Martin Kjer from the Technical University of Denmark for providing us with the code for the method introduced in [doi:10.1080/17415977.2015.1124428]. Y. Dong acknowledges the support of the National Natural Science Foundation of China under Grant 11701388 and the Villum Foundation under Grant 25893. C. Wu was supported by the National Natural Science Foundation of China under Grants 11871035 and 11531013, and by the Recruitment Program of Global Young Experts.

References

  • [1] T. Goldstein and S. Osher (2009) The split Bregman method for L1-regularized problems. SIAM Journal on Imaging Sciences 2 (2), pp. 323–343. Cited by: §3.1.1.
  • [2] J. Liu, X.C. Tai, H.Y. Huang, and Z.D. Huan (2013) A weighted dictionary learning model for denoising images corrupted by mixed noise. IEEE Transactions on Image Processing 22 (3), pp. 1108–1120. Cited by: §1, §2, §3.1, Proposition 3.1.
  • [3] D. Mumford and J. Shah (1989) Optimal approximations by piecewise smooth functions and associated variational problems. Communications on Pure and Applied Mathematics 42 (5), pp. 577–685. Cited by: §1.
  • [4] L.I. Rudin, S. Osher, and E. Fatemi (1992) Nonlinear total variation based noise removal algorithms. Physica D: Nonlinear Phenomena 60 (1), pp. 259–268. Cited by: §1.
  • [5] M. Teboulle (2007) A unified continuous optimization framework for center-based clustering methods. Journal of Machine Learning Research 8 (Jan), pp. 65–102. Cited by: §1, §2, §3.1, Proposition 3.1.
  • [6] C.R. Vogel and M.E. Oman (1996) Iterative methods for total variation denoising. SIAM Journal on Scientific Computing 17 (1), pp. 227–238. Cited by: §2.
  • [7] C. Wu and X.C. Tai (2010) Augmented Lagrangian method, dual methods, and split Bregman iteration for ROF, vectorial TV, and high order models. SIAM Journal on Imaging Sciences 3 (3), pp. 300–339. Cited by: §3.1.1.

Appendix

All the following definitions and theorem come from [Tseng2001].

Let $f$ have the following form:
\[
f(x_1,\dots,x_N)=f_0(x_1,\dots,x_N)+\sum_{i=1}^{N}f_i(x_i),\tag{25}
\]
where $x_i\in\mathbb{R}^{n_i}$ for $i=1,\dots,N$, and $f$ is proper.

Definition 6.1.

Block coordinate descent (BCD) method:

  • Initialization. Choose any $x^{0}\in\operatorname{dom}f$.

  • Iteration $t$, with $t=1,2,\dots$. Given $x^{t-1}$, choose an index $i\in\{1,\dots,N\}$ and compute a new iterate $x^{t}$ satisfying
\[
x^{t}_{i}\in\arg\min_{\xi}\ f(x^{t-1}_{1},\dots,x^{t-1}_{i-1},\xi,x^{t-1}_{i+1},\dots,x^{t-1}_{N}),\qquad x^{t}_{l}=x^{t-1}_{l}\ \ \text{for }l\neq i.
\]

Definition 6.2.

[Essentially cyclic rule] There exists a constant $T\ge N$ such that every index $i\in\{1,\dots,N\}$ is chosen at least once between the $t$th iteration and the $(t+T-1)$th iteration, for all $t$.

Definition 6.3.

More assumptions about $f$:

  • (B1) $f_0$ is continuous on $\operatorname{dom}f_0$.

  • (B2) For each $i$ and each fixed $(x_l)_{l\neq i}$, the function $\xi\mapsto f(x_1,\dots,x_{i-1},\xi,x_{i+1},\dots,x_N)$ is quasiconvex and hemivariate.

  • (B3) $f_0,f_1,\dots,f_N$ are lsc.

  • (C1) $\operatorname{dom}f_0$ is open, and $f_0$ tends to $\infty$ at every boundary point of $\operatorname{dom}f_0$.

  • (C2) $\operatorname{dom}f_0=Y_1\times\cdots\times Y_N$ for some $Y_i\subseteq\mathbb{R}^{n_i}$.

  • A function $h$ is quasiconvex if $h(\theta x+(1-\theta)y)\le\max\{h(x),h(y)\}$ for all $x,y$ and all $\theta\in[0,1]$.

  • A function $h$ is hemivariate if it is not constant on any line segment belonging to its domain.

  • A point $\bar{x}$ is a coordinatewise minimum point of $f$ if $\bar{x}\in\operatorname{dom}f$ and $f(\bar{x}+(0,\dots,d_i,\dots,0))\ge f(\bar{x})$ for all $d_i\in\mathbb{R}^{n_i}$ and all $i$.

  • If $h$ is strictly convex, then $h$ is quasiconvex and hemivariate.

Theorem 6.4 (Theorem 5.1 in [Tseng2001]).

Suppose that $f_0,f_1,\dots,f_N$ satisfy Assumptions B1, B2, and B3, and that $f_0$ satisfies either Assumption C1 or C2. Assume also that the sequence $\{x^{t}\}$ generated by the BCD method using the essentially cyclic rule is defined. Then either $\{f(x^{t})\}$ tends to $-\infty$, or else every cluster point of $\{x^{t}\}$ is a coordinatewise minimum point of $f$.