Multi-Slice Fusion for Sparse-View and Limited-Angle 4D CT Reconstruction

08/01/2020 ∙ by Soumendu Majee, et al. ∙ Purdue University Eli Lilly and Company 0

Inverse problems spanning four or more dimensions such as space, time and other independent parameters have become increasingly important. State-of-the-art 4D reconstruction methods use model based iterative reconstruction (MBIR), but depend critically on the quality of the prior modeling. Recently, plug-and-play (PnP) methods have been shown to be an effective way to incorporate advanced prior models using state-of-the-art denoising algorithms. However, state-of-the-art denoisers such as BM4D and deep convolutional neural networks (CNNs) are primarily available for 2D or 3D images and extending them to higher dimensions is difficult due to algorithmic complexity and the increased difficulty of effective training. In this paper, we present multi-slice fusion, a novel algorithm for 4D reconstruction, based on the fusion of multiple low-dimensional denoisers. Our approach uses multi-agent consensus equilibrium (MACE), an extension of plug-and-play, as a framework for integrating the multiple lower-dimensional models. We apply our method to 4D cone-beam X-ray CT reconstruction for non destructive evaluation (NDE) of samples that are dynamically moving during acquisition. We implement multi-slice fusion on distributed, heterogeneous clusters in order to reconstruct large 4D volumes in reasonable time and demonstrate the inherent parallelizable nature of the algorithm. We present simulated and real experimental results on sparse-view and limited-angle CT data to demonstrate that multi-slice fusion can substantially improve the quality of reconstructions relative to traditional methods, while also being practical to implement and train.

READ FULL TEXT VIEW PDF
POST COMMENT

Comments

There are no comments yet.

Authors

page 1

page 5

page 7

page 9

page 10

page 11

page 12

page 13

This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

I Introduction

Improvements in imaging sensors and computing power have made it possible to solve increasingly difficult reconstruction problems. In particular, the dimensionality of reconstruction problems has increased from the traditional 2D and 3D problems representing space to more difficult 4D or even 5D problems representing space-time and, for example, heart or respiratory phase [1, 2, 3, 4, 5, 6].

These higher-dimensional reconstruction problems pose surprisingly difficult challenges computationally and perhaps more importantly, in terms of algorithmic design and training due to the curse of dimensionality

[7]. However, the high dimensionality of the reconstruction also presents important opportunities to improve reconstruction quality by exploiting the regularity in the high-dimensional space. In particular, for time-resolved imaging, we can exploit the regularity of the image to reconstruct each frame with fewer measurements and thereby increase temporal resolution. In the case of 4D CT, the contributions of [2, 8, 9] have increased the temporal resolution by an order of magnitude by exploiting the space-time regularity of objects being imaged. These approaches use model-based iterative reconstruction (MBIR) [10, 11]

to enforce regularity in 4D using simple space-time prior models. More recently, deep learning based post-processing for 4D reconstruction has been proposed as a method to improve reconstructed image quality 

[12].

Fig. 1: Illustration of our multi-slice fusion approach. Each CNN denoiser operates along the time direction and two spatial directions. We fuse the CNN denoisers with the measurement model to produce a 4D regularized reconstruction.

Recently, it has been demonstrated that plug-and-play (PnP) priors [13, 14, 15, 16] can dramatically improve reconstruction quality by enabling the use of state-of-the-art denoisers as prior models in MBIR. So PnP has great potential to improve reconstruction quality in 4D CT imaging problems. However, state-of-the-art denoisers such as deep convolutional neural networks (CNN) and BM4D are primarily available for 2D and sometimes 3D images, and it is difficult to extend them to higher dimensions [17, 18, 7]

. In particular, extending CNNs to 4D requires very computationally and memory intensive 4D convolution applied to 5D feature tensor structures. This problem is further compounded by the lack of GPU accelerated routines for 4D convolution from major Deep-Learning frameworks such as Tensorflow, Keras, PyTorch 

111Currently only 1D, 2D, and 3D convolutions are supported with GPU acceleration. Furthermore, 4D CNNs require 4D ground truth data to train the PnP denoisers, which might be difficult or impossible to obtain.

In this paper, we present a novel 4D X-ray CT reconstruction algorithm that combines multiple low-dimensional CNN denoisers to implement a highly effective 4D prior model. Our approach, multi-slice fusion, integrates the multiple low-dimensional priors using multi-agent consensus equilibrium (MACE) [19]. MACE is an extension of the PnP framework that formulates the inversion problem using an equilibrium equation—as opposed to an optimization—and allows for the use of multiple prior models and agents.

Figure 1

illustrates the basic concept of our approach. Multi-slice fusion integrates together three distinct CNN denoisers each of which is trained to remove additive white Gaussian noise along lower dimensional slices (hyperplanes) of the 4D object. When MACE fuses the denoisers it

simultaneously

enforces the constraints of each denoising agent, so that the reconstructions are constrained to be smooth in all four dimensions. Consequently, multi-slice fusion results in high-quality reconstructions that are practical to train and compute even when the dimensionality of the reconstruction is high. In our implementation, one MACE agent estimates the cone-beam tomographic inversion. The remaining 3 agents are CNN denoisers trained to remove additive white Gaussian noise along two spatial directions and the time direction. The CNN agents work along complimentary spatial directions and are designed to take as input a stack of five 2D slices from five neighboring time-points. We refer to this as 2.5D denoising 

[7, 20]. Further details are given in Section IV.

The MACE solution can be computed using a variety of algorithms, including variants of the plug-and-play algorithm based on ADMM or other approaches [21, 22, 14, 13]. We implement multi-slice fusion on distributed heterogeneous clusters in which different agent updates are distributed onto different cluster nodes. In particular, the cone-beam inversion computations are distributed onto multiple CPU nodes and concurrently, the CNN denoising computations are distributed onto multiple GPU nodes.

Both simulated and real experiments on sparse-view and limited-angle 4D CT data indicate that the multi-slice fusion reconstruction using CNN based 4D priors can substantially improve reconstructed image quality compared to MBIR reconstruction using traditional 4D MRF priors. Furthermore, the separation of the CNN prior models from the measurement model in MACE allows us to generalize well for unknown CT geometries. In particular, for limited-angle CT, the advanced 4D prior in multi-slice fusion is able to suppress the temporal fluctuations due to limited-angle measurements without training the CNNs using limited-angle data.

The rest of the paper is organized as follows. In section II, we introduce the problem of 4D CT reconstruction. In section III, we introduce the theory behind MACE model fusion. In section IV, we use the MACE framework to introduce multi-slice fusion. In section V, we describe our training pipeline for training the CNN denoisers. In section VI, we describe our distributed implementation of multi-slice fusion on heterogeneous clusters. Finally, in section VII, we present results on sparse-view and limited-angle 4D CT using both simulated and real data.

Ii Problem Formulation

In 4D X-ray CT imaging, a dynamic object is rotated and several 2D projections (radiographs) of the object are measured for different angles as illustrated in Figure 2. The problem is then to reconstruct the 4D array of X-ray attenuation coefficients from these measurements, where three dimensions correspond to the spatial dimensions and the fourth dimension corresponds to time.

Fig. 2: Illustration of 4D cone-beam X-ray CT imaging. The dynamic object is rotated and several 2D projections (radiographs) of the object are measured for different angles. The projections are divided into disjoint subsets for each of the time-points.

Let be the number of time-points, and be the number of voxels at each time-point of the 4D volume. For each time-point , define

to be the vector of sinogram measurements at time

, and to be the vectorized 3D volume of X-ray attenuation coefficients for that time-point. Let us stack all the measurements to form a measurement vector where is the total number of measurements. Similarly, let us stack the 3D volumes at each time-point to form a vectorized 4D volume , where is the total number voxels in the 4D volume. The 4D reconstruction problem then becomes the task of recovering the 4D volume of attenuation coefficients, , from the series of sinogram measurements, .

In the traditional maximum a posteriori (MAP) approach, the reconstruction is given by

(1)

where is the data-fidelity or log-likelihood term, is the 4D regularizer or prior model, and the unitless parameter controls the level of regularization in the reconstruction. The data-fidelity term, , can be written in a separable fashion as

(2)

where is the system matrix, and is the weight matrix [3] for time-point . The values in the diagonal weight matrix,

, account for the non-uniform noise variance for each measurement.

If the prior model, , can be expressed analytically like a 4D Markov random field (MRF) as in [2, 4], then the expression in equation (1) can be minimized iteratively to reconstruct the image. However, in practice, it can be difficult to represent an advanced prior model in the form of a tractable cost function that can be minimized. Consequently, PnP algorithms have been created as a method for representing prior models as denoising operations[13, 14]. More recently, PnP methods have been generalized to the multi-agent consensus equilibrium (MACE) framework as a way to integrate multiple models in a principled manner [19, 4, 23].

Iii MACE Model Fusion

In this section, we use the multi-agent consensus equilibrium (MACE) framework to fuse the data-fidelity term and multiple denoisers; these multiple denoisers form a single prior model for reconstruction. This allows us to construct a 4D prior model using low-dimensional CNN denoisers (described in Section IV).

To introduce the concept of consensus equilibrium, let us first consider a variation of the optimization problem in equation (1) with regularizers , . The modified optimization problem can thus be written as

(3)

where the normalization by is done to make the regularization strength independent of the number of regularizers.

Now we transform the optimization problem of equation (3) to an equivalent consensus equilibrium formulation. However, in order to do this, we must introduce additional notation. First, we define the proximal maps of each term in equation (3). We define to be the proximal map of as

(4)

for some . Similarly, we define to be the the proximal map of each , as

(5)

Each of these proximal maps serve as agents in the MACE framework. We stack the agents together to form a stacked operator as

(6)

where is stacked representative variable. The consensus equilibrium is the vector that satisfies

(7)

where is an averaging operator given as

(8)

and the weighted average is defined as

(9)

Notice the weighting scheme is chosen to balance the forward and prior models. The unitless parameter is used to tune the weights given to the prior model and thus the regularization of the reconstruction. Equal weighing of the forward and prior models can be achieved using .

If satisfies the consensus equilibrium condition of equation (7), then it can be shown [19] that is the solution to the optimization problem in equation (3). Thus if the agents in MACE are true proximal maps then the consensus equilibrium solves an equivalent optimization problem.

Fig. 3: Illustration of consensus equilibrium as analogous to a force balance equation: each agent pulls the solution toward its manifold and at equilibrium the forces balance each other.

However, if the MACE agents are not true proximal maps, then there is no inherent optimization problem to be solved, but the MACE solution still exists. In this case, the MACE solution can be interpreted as the balance point between the forces of each agent as illustrated in Figure 3. Each agent pulls the solution toward its manifold and the consensus equilibrium solution represents a balance point between the forces of each agent. Thus MACE provides a way to incorporate non-optimization based models such as deep neural networks for solving inverse problems.

To see how we can incorporate deep neural network based prior models, first notice that equation (5) can be interpreted as the MAP estimate for a Gaussian denoising problem with prior model

and noise standard deviation

. Thus we can replace each MACE operator, , for each in equation (5) with a deep neural network trained to remove additive white Gaussian noise of standard deviation .

It is interesting to note that when is implemented with a deep neural network denoiser, then the agent is not, in general, a proximal map and there is no corresponding cost function . We know this because for to be a proximal map, it must satisfy the condition that (see [24, 13]), which is equivalent to being a conservative vector function (see for example [25, Theorem 2.6, p. 527]). For a CNN, is a function of the trained weights, and in the general case, the condition will not be met unless the CNN architecture is specifically designed to enforce such a condition.

The consensus equilibrium equation 7 states the condition that the equilibrium solution must satisfy. However, the question remains of how to compute this equilibrium solution. Our approach to solving the consensus equilibrium equations is to first find an operator that has the equilibrium solution as a fixed point, and then use standard fixed point solvers. To do this, we first notice that the averaging operator has the property that . Intuitively, this is true because applying averaging twice is the same as applying it once. Using this fact, we see that

(10)

where is the identity mapping. We then rewrite equation (7) as

So from this we see that the following fixed point relationship must hold for the consensus equilibrium solution.

(11)

and the consensus equilibrium solution is a fixed point of the mapping .

We can apply a variety of iterative fixed point algorithms to equation (11) to compute the equilibrium solution. These algorithms have varying convergence guarantees and convergence speeds [19]. One such algorithm is Mann iteration [19, 23, 26]. Mann iteration performs the following pseudo-code steps until convergence where indicates assignment of a psuedo-code variable.

(12)

where weighing parameter is used to control the speed of convergence. In particular, when , the Mann-iteration solver is equivalent to the consensus-ADMM algorithm [19, 27]. It can be shown that the Mann iteration converges to a fixed point of if is a non-expansive mapping [19].

Note that each Mann iteration update in equation (12) involves performing the minimization in equation (4). This nested iteration is computationally expensive and leads to slow convergence. Instead of minimizing equation (4) till convergence, we initialize with the result of the previous Mann iteration and perform only three iterations of iterative coordinate descent (ICD). We denote this partial update operator as where is the initial condition to the iterative update. The corresponding new operator approximation is then given by

(13)

Algorithm 1 shows a simplified Mann iteration using partial updates. We perform algebraic manipulation of the traditional Mann iterations[23, 26] in order to obtain the simplified but equivalent Algorithm 1. It can be shown that partial update Mann iteration also converges [23, 26] to the fixed point in equation (11). We used a zero initialization, , in all our experiments and continue the partial update Mann iteration until the differences between state vectors become smaller than a fixed threshold.

Input: Initial Reconstruction:
Output: Final Reconstruction:
1 while not converged do
2     
Algorithm 1 Partial update Mann iteration for computing the MACE solution

Iv Multi-Slice Fusion using MACE

Fig. 4:

Architecture of our 2.5D CNN denoiser. Different sizes of input and output necessitate a selection operator for the residual connection. Each green rectangle denotes a tensor, and each ellipse denotes an operation. Blue ellipses specify the shape of the convolution kernel.

We use four MACE agents to implement multi-slice fusion. We set and use the names , , to denote the denoising agents , , in equation (6). The agent enforces fidelity to the measurement while each of the denoisers , , enforces regularity of the image in orthogonal image planes. MACE imposes a consensus between the operators , , , to achieve a balanced reconstruction that lies at the intersection of the solution space of the measurement model and each of the prior models. The MACE stacked operator encompassing all four agents can be written as

(14)

Here the representative variable is formed by stacking four vectorized 4D volumes.

The three denoisers , , and share the same architecture and trained model but are applied along different planes of the 4D space. The CNN architecture is shown in Figure 4. We have modified a typical CNN architecture [28] to input information from a third dimension. The channel dimension of a convolution layer is typically used to input multiple color channels for denoising 2D color images using CNNs. We re-purpose the channel dimension to input five adjacent 2D slices of the noisy image to the network and output the denoised center slice. The other slices are being denoised by shifting the 5-slice moving window. We call this 2.5D since the receptive field along the convolution dimensions is large but in the channel dimension is small. It has been shown that this type of 2.5D processing is a computationally efficient way of performing effective 3D denoising with CNNs [7, 20]. We use the notation to denote a CNN space-time denoiser that performs convolution in the xy-plane and uses the convolution channels to input slices from neighboring time-points. The denoisers and are analogous to but are applied along the yz and zx-plane, respectively. This orientation of the three denoisers ensures that

  1. The spatial dimensions x, y, z are treated equivalently. This ensures the regularization to be uniform across all spatial dimensions;

  2. The regularization along spatial dimensions and the time dimension are separate. This is important for reconstructing images with varying amount of motion per time-point;

  3. Each dimension in x, y, z, and t is considered at least once. This ensures that model fusion using MACE incorporates information along all four dimensions.

Since the three denoising operators , , and process the 4D volume “slice by slice”, they can be implemented in parallel on large scale parallel computers. Details on distributed implementation are described in section VI.

V Training of CNN Denoisers

All three prior model agents , , and in multi-slice fusion share the same 2.5D model shown in Figure 4 but are applied along different planes. Thus we only need to train a single 2.5D CNN and need 3D data to train it.

Fig. 5: Illustration of our training pipeline. We extract 3D patches from a typical CT volume and add additive white Gaussian noise (AWGN) to generate training pairs. This makes the training process self-supervised.

Figure 5 outlines our training pipeline. We start with a low-noise 3D CT volume that is representative of the objects to be reconstructed. We normalize the voxel values to be within . Next we extract 3D patches from this volume, with the interpretation that 2 dimensions are spatial and the remaining dimension corresponds to time; i.e., virtual motion of a 2D object. We perform data augmentation by applying random rotation, mirroring and intensity shift to the patches. Finally we add pseudo-random additive white Gaussian noise (AWGN) to the patches to generate input data for training. The middle slice of the corresponding clean patch serves as the target data for training. This makes the training process self-supervised and thus we do not need labeled training data to train our model. The use of AWGN in this context is justified in section III.

Vi Distributed Reconstruction

The computational structure of multi-slice fusion is well-suited to a highly distributed implementation. The main computational bottleneck in Algorithm 1 is the operator. Fortunately, is a parallel operator and thus its individual components , , , and can be executed in parallel. The operators , , , and can themselves be parallelized internally as well. The distributed implementation of multi-slice fusion is illustrated in Figure 6.

Fig. 6: Illustration of distributed computation of multi-slice fusion. We perform distributed computation of the operator which is the main computational bottleneck in Algorithm 1. Each operator within , namely , , , and can be executed in parallel. Furthermore, operators , , , and are 3D operators that can process the 4D volume “slice by slice” leading to a large number of concurrent operations that can be distributed among multiple compute nodes.

The CNN denoisers , , and are 2.5D denoisers that denoise the 4D volume by processing it slice by slice and thus can be trivially parallelized leading to a large number of concurrent operations. The concurrent operations for all three denoisers are distributed among multiple GPUs due to the availability of optimized GPU routines in Tensorflow. In our experiments we used a GPU cluster with three Nvidia Tesla P100 GPUs to compute the CNN denoising operators.

The cone-beam inversion operator, , can also be computed slice by slice due to the separable structure in equations (4) and (2). This leads to a large number of concurrent operations which are distributed among multiple CPU nodes. Additional multi-threaded parallelism is also implemented similar to [3].

Vii Experimental Results

(a) Phantom
(b) FBP (3D)
(c) MBIR+4D-MRF

(d) Multi-slice fusion (proposed)
(e) MBIR+
(f) MBIR+
(g) MBIR+
Fig. 7: Comparison of different methods for simulated data . Each image is a slice through the reconstructed object for one time-point along the spatial xy-plane. Both FBP and MBIR+4D-MRF suffer from high noise and jagged edges and fail to recover the small hole in the bottom of the image. MBIR+ and MBIR+ suffer from horizontal and vertical streaks respectively since the denoisers were applied in those planes. MBIR+ cannot reconstruct the small hole in the bottom of the image since the xy-plane does not contain sufficient information.

We present experimental results on two simulated and two real 4D X-ray CT data for NDE applications to demonstrate the improved reconstruction quality of our method. The four experimental cases are outlined below

  1. Simulated Data : Sparse-view results on simulated data with a set of sparse views ranging over at each reconstructed time-point;

  2. Simulated Data : Sparse-view limited-angle results on simulated data with a set of sparse views ranging over at each reconstructed time-point;

  3. Real Data : Sparse-view results on real data with a set of sparse views ranging over at each reconstructed time-point;

  4. Real Data : Sparse-view limited-angle results on real data with a set of sparse views ranging over at each reconstructed time-point.

The selection of the rotation range per time-point is arbitrary and can be chosen after the measurements have been taken. For example, a full rotation with 400 views can be used as a single time-point or as four time-points with 100 views each. The four time-points per rotation can provide extra temporal resolution, however, they require a more difficult reconstruction with incomplete information.

We compare multi-slice fusion with several other methods outlined below

  • FBP: Conventional 3D filtered back projection reconstruction.

  • MBIR+4D-MRF: MBIR reconstruction using 4D Markov random field prior [2] with 26 spatial and 2 temporal neighbors;

  • MBIR+: MBIR using the CNN as a PnP prior;

  • MBIR+: MBIR using the CNN as a PnP prior;

  • MBIR+: MBIR using the CNN as a PnP prior.

We used two CPU cluster nodes, each with 20 Kaby Lake CPU cores to compute the cone-beam inversion and three GPU nodes each with a Nvidia Tesla P100 GPU to compute the CNN denoisers. The 2.5D CNN denoiser model was trained using a low-noise 3D CT reconstruction of a bottle and screw cap made from different plastics. The object is representative of a variety of Non-Destructive Evaluation (NDE) problems in which objects to be imaged are constructed from a relatively small number of distinct materials. The standard deviation of the additive white Gaussian noise added during training was .

(a) Phantom
(b) FBP (3D)
(c) MIBIR+4D-MRF
(d) Multi-slice fusion
(proposed)
(e) Plot of cross-section
Fig. 8: Plot of cross-section through the phantom and reconstructions from simulated data . Multi-slice fusion results in the most accurate reconstruction of the gap between materials.
Time-point 1 Time-point 2 Time-point 3 Time-point 4

Phantom

FBP (3D)

MBIR+4D-MRF

Multi-slice fusion

(proposed)


Fig. 9: Comparison of different methods for simulated data with rotation of object per time-point. Limited angular information at each frame leads to severe artifacts in reconstructions with FBP. Simple 4D prior in MBIR+4D-MRF is unable to properly merge information from consecutive time-points leading to a overtly blurred reconstruction. The CNN based 4D prior in multi-slice fusion is able to enforce a spatio-temporal smoothness on the image and thus the reconstructions do not suffer from artifacts due to limited angular information per time-point.
(a) FBP (3D)
(b) MBIR+4D-MRF

(c) Multi-slice fusion (proposed)
(d) MBIR+ (Missing Feature)
(e) MBIR+ (Horizontal Streaks)

(f) MBIR+ (Vertical Streaks)
Fig. 10: Comparison of different methods for Real Data : vial. Each image is a slice through the reconstructed vial for one time-point along the spatial xy-plane. Both FBP and MBIR+4D-MRF suffer from obvious windmill artifacts, higher noise and blurred edges. In contrast to that, the multi-slice fusion reconstruction has smooth and uniform textures while preserving edge definition. MBIR+ and MBIR+ suffer from horizontal and vertical streaks. MBIR+ cannot reconstruct the outer ring since the slice displayed is at the edge of the aluminum seal and the xy-plane does not contain sufficient information. Multi-slice fusion can resolve the edges of the rings better than either of MBIR+, MBIR+, and MBIR+ since it has information from all the spatial coordinates.
(a) FBP (3D)
(b) MBIR+4D-MRF
(c) Multi-slice fusion
(proposed)
(d) Plot of cross-section
Fig. 11: Plot of cross-section through the vial at a time when the aluminum and glass have physically separated. Multi-slice fusion is able to resolve the junction between materials better while simultaneously producing a smoother reconstruction within materials compared to MBIR+4D-MRF and FBP.
Fig. 12: Illustration of temporal resolution for real data : vial. We plot a cross-section through the vial with time for each method: multi-slice fusion, MBIR+4D-MRF, FBP. Multi-slice fusion results in improved space-time resolution of the separation of aluminum and glass.
Time-point 1 Time-point 2 Time-point 3 Time-point 4

FBP (3D)

MBIR+4D-MRF

Multi-slice fusion

(proposed)

Fig. 13: Volume rendering of the reconstructed spring for four time-points. A limited set of views is used to reconstruct each time-point. Limited angular information at each frame leads to severe artifacts in reconstructions with FBP. The simple prior model in MBIR+4D-MRF is unable to merge the information from multi time-points properly and some artifacts remain. The CNN based 4D prior in multi-slice fusion is able to enforce a spatio-temporal smoothness on the image and thus the reconstructions do not suffer from artifacts due to limited angular information per time-point.

Vii-a Simulated Data

Magnification 5.57
Number of Views per Time-point 75
Rotation per Time-point
Cropped Detector Array ,
Voxel Size
Reconstruction Size (x,y,z,t)

TABLE I: Experimental specifications for Simulated Data

In this section we present results on simulated data to evaluate our method in a sparse-view setting. Each time-point is reconstructed from a sparse set of views spanning . We take a low-noise CT reconstruction of a bottle and screw cap and denoise it further using BM4D [18] to generate a clean 3D volume to be used as a 3D phantom. We then vertically translate the 3D phantom by one pixel per time-point to generate a 4D phantom. We forward project the phantom using the system matrix to simulate noisy sinogram data and use that to reconstruct the object. The experimental specifications are summarized in Table I.

Figure 7 compares reconstructions using multi-slice fusion with several other methods. Each image is a slice through the reconstructed object for one time-point along the spatial xy-plane. Both FBP and MBIR+4D-MRF suffer from high noise and jagged edges and fail to recover the small hole in the bottom of the image. MBIR+ and MBIR+ suffer from horizontal and vertical streaks, respectively, since the denoisers were applied in those planes. MBIR+ does not suffer from streaks in the figure since we are viewing a slice along the xy-plane, but it suffers from other artifacts. MBIR+ cannot reconstruct the small hole in the bottom of the image since the xy-plane does not contain sufficient information.

Next we plot a cross-section through the object for multi-slice fusion, MBIR+4D-MRF, FBP, and the phantom in Figure 8. Multi-slice fusion results in the most accurate reconstruction of the gap between materials.

Finally we report the peak signal to noise ratio (PSNR) and the structural similarity index measure (SSIM) 

[29] with respect to the phantom for each method in Table II to objectively measure image quality. We define the PSNR for a given 4D reconstruction with a phantom as

(15)

where range is computed from the and percentiles of the phantom. As can be seen from Table II, multi-slice fusion results in the highest PSNR and SSIM scores.

Method PSNR(dB) SSIM
FBP 19.69 0.609
MBIR+4D-MRF 25.84 0.787
Multi-slice fusion 29.07 0.943
MBIR+ 29.03 0.922
MBIR+ 28.04 0.932
MBIR+ 28.31 0.926

TABLE II: Quantitative Evaluation for simulated data . Multi-slice fusion has the highest PSNR and SSIM metric among all the methods.

Vii-B Simulated Data

Magnification 5.57
Number of Views per Time-point 36
Rotation per Time-point
Cropped Detector Array ,
Voxel Size
Reconstruction Size (x,y,z,t)

TABLE III: Experimental specifications for Simulated Data

In this section we present results on simulated data to evaluate our method in a sparse-view and limited-angle setting. Each time-point is reconstructed from a sparse set of views spanning . We take a low-noise CT reconstruction of a bottle and screw cap and denoise it further using BM4D [18] to generate a clean 3D volume to be used as a phantom. We then vertically translate the 3D phantom by one pixel per time-point to generate a 4D phantom. We forward project the phantom to generate sinogram data and use that to reconstruct the object. The experimental specifications are summarized in Table III.

Figure 9 shows a comparison of different methods for simulated data with rotation of object per time-point. Limited angular information at each frame leads to severe artifacts in reconstructions with FBP. The simple 4D prior in MBIR+4D-MRF is unable to properly merge information from consecutive time-points leading to a blurred reconstruction. The CNN based 4D prior in multi-slice fusion is able to enforce a spatio-temporal smoothness on the image and thus the reconstructions do not suffer from artifacts due to limited angular information per time-point.

Table IV shows peak signal to noise ratio (PSNR) and structural similarity index measure (SSIM) with respect to the phantom for each method. Multi-slice fusion results in the highest PSNR and SSIM scores.

Method PSNR(dB) SSIM
FBP 10.86 0.467
MBIR+4D-MRF 14.25 0.742
Multi-slice fusion 19.44 0.875

TABLE IV: Quantitative Evaluation for simulated data . Multi-slice fusion has the highest PSNR and SSIM metric among all the methods.

Vii-C Real Data : Vial Compression

Scanner Model North Star Imaging X50
Voltage
Current
Exposure
Source-Detector Distance
Magnification
Number of Views per Time-point
Rotation per Time-point
Cropped Detector Array ,
Voxel Size
Reconstruction Size (x,y,z,t)

TABLE V: Experimental specifications for Real Data : Vial Compression
Fig. 14: Experimental setup for Real Data : Vial Compression. The vial is undergoing dynamic compression during the scan, to capture the mechanical response of the components. The glass vial (center) and the actuator (top) are held together by a frame constructed of tubes and plates. The tubes were placed outside the field of view of the CT scanner, thus causing artifacts in the reconstruction. We describe a correction for this in Appendix A.

In this section we present results on real data to evaluate our method in a sparse-view setting. The data is from a dynamic cone-beam X-ray scan of a glass vial, with elastomeric stopper and aluminum crimp-seal, using a North Star Imaging X50 X-ray CT system. The experimental specifications are summarized in Table V.

The vial is undergoing dynamic compression during the scan, to capture the mechanical response of the components as shown in Figure 14

. Of particular interest is the moment when the aluminum seal is no longer in contact with the underside of the glass neck finish. This indicates the moment when the force applied exceeds that exerted by the rubber on the glass; this is known as the “residual seal force”

[30].

During the scan, the vial was held in place by fixtures that were placed out of the field of view as shown in Figure 14. As the object rotated, the fixtures periodically intercepted the path of the X-rays resulting in corrupted measurements and consequently artifacts in the reconstruction. To mitigate this problem, we incorporate additional corrections that are described in Appendix A.

Figure 10 compares multi-slice fusion with several other methods. Each image is a slice through the reconstructed vial for one time-point along the spatial xy-plane. Both FBP and MBIR+4D-MRF suffer from obvious artifacts, higher noise and blurred edges. In contrast to that, the multi-slice fusion reconstruction has smooth and uniform textures while preserving edge definition. Figure 10 also illustrates the effect of model fusion by comparing multi-slice fusion with MBIR+, MBIR+, and MBIR+. MBIR+ and MBIR+ suffer from horizontal and vertical streaks respectively since the denoisers were applied in those planes. MBIR+ does not suffer from streaks in the figure since we are viewing a slice along the xy-plane, but it suffers from other artifacts. MBIR+ cannot reconstruct the outer ring since the slice displayed is at the edge of the aluminum seal and the xy-plane does not contain sufficient information. In contrast, multi-slice fusion can resolve the edges of the rings better than either of MBIR+, MBIR+, and MBIR+ since it uses information from all the spatial coordinates.

Next, we plot a cross-section through the object for multi-slice fusion, MBIR+4D-MRF and FBP in Figure 11. For this, we choose a time-point where we know the aluminum and glass have separated spatially. The multi-slice fusion reconstruction has a steeper and more defined slope in the junction of aluminum and glass. This supports that multi-slice fusion is able to preserve fine details in spite of producing a smooth regularized image.

Finally in Figure 12 we plot a cross-section through the object with respect to time to show the improved space-time resolution of our method. We do this for FBP, MBIR+4D-MRF and multi-slice fusion. Multi-slice fusion results in improved space-time resolution of the separation of aluminum and glass.

Vii-D Real Data : Injector Pen

Scanner Model North Star Imaging X50
Voltage
Current
Exposure
Source-Detector Distance
Magnification
Number of Views per Time-point
Rotation per Time-point
Cropped Detector Array ,
Voxel Size
Reconstruction Size (x,y,z,t)

TABLE VI: Experimental specifications for Real Data : Injector Pen

In this section we present results on real data to evaluate our method in a sparse-view and limited-angle setting. The data is from a dynamic cone-beam X-ray scan of an injector pen using a North Star Imaging X50 X-ray CT system. The experimental specifications are summarized in Table VI.

The injection device is initiated before the dynamic scan starts and completes a full injection during the duration of the scan. We are interested in observing the motion of a particular spring within the injector pen in order to determine whether it is working as expected. The spring exhibits a fast motion and as a result we need a high temporal resolution to observe the motion of the spring. To have sufficient temporal resolution we reconstruct one frame for every rotation of the object instead of the conventional rotation.

Fig. 15: Pipeline of the blind fixture correction in Algorithm 2. The vertical stripes in the yz-plane of the reconstruction and the ring at the edge of the field of view in the xy-plane of the reconstruction have been rectified after performing the correction.

In Figure 13 we show a volume rendering of the reconstructed spring for four time-points and reconstruction methods FBP, MBIR+4D-MRF, and multi-slice fusion. Limited angular information at each frame leads to severe artifacts in reconstructions with FBP. The simple prior model in MBIR+4D-MRF is unable to merge the information from multiple time-points properly and some artifacts remain. The CNN based 4D prior in multi-slice fusion is able to enforce a spatio-temporal smoothness on the image and thus the reconstructions do not suffer from artifacts due to limited angular information per time-point.

Viii Conclusion

In this paper, we proposed a novel 4D X-ray CT reconstruction algorithm, multi-slice fusion, that combines multiple low-dimensional denoisers to form a 4D prior. Our method allows the formation of an advanced 4D prior using state-of-the-art CNN denoisers without needing to train on 4D data. Furthermore, it allows for multiple levels of parallelism, thus enabling reconstruction of large volumes in a reasonable time. Although we focused on 4D X-ray CT reconstruction for NDE applications, our method can be used for any reconstruction problem involving multiple dimensions.

Acknowledgment

The authors would like to acknowledge support from Eli Lilly and Company under research project funding agreement 17099289. Charles A. Bouman and Gregery T. Buzzard were supported in part by NSF grant CCF-1763896. We also thank M. Cory Victor and Dr. Coralie Richard from Eli Lilly and Company for their assistance and guidance in setting up the residual seal force test experiment.

Appendix A Correction for fixtures outside the field of view

Inputs : Original Sinogram: ,
System Matrix: ,
Output: Corrected Sinogram:
Algorithm 2 Blind fixture correction

Here we describe our correction for fixtures placed out of the field of view of the scanner. As shown in Figure 14, the setup is held together by a fixture constructed of tubes and plates. The tubes were placed outside the field of view of the CT scanner, thus causing artifacts in the reconstruction. Our method performs a blind source separation of the projection of the object from that of the tubes. Our blind separation relies on the fact that the projection of the tubes is spatially smooth. This is true since the tubes themselves do not have sharp features and there is motion blur due to the large distance of the tubes from the rotation axis.

Algorithm 2 shows our correction algorithm for the fixtures. Figure 15 illustrates the algorithm pictorially. The initial reconstruction suffers from artifacts within the image and at the edge of the field of view. We mask so that the majority of the artifacts at the edge of the field of view are masked but the object remains unchanged in . Consequently the error sinogram primarily contains the projection of the tubes with some residual projection of the object. The blurring of filters out the residual object projection but preserves the spatially smooth projection of the tubes. The corrected measurements are found after performing a least squares fit. The correction can be repeated in order to get an improved reconstruction and consequently an improved correction .

Figure 15 shows the sinogram and reconstruction both before and after performing the blind correction. Not only does the reconstruction after fixture correction remove the artifacts in the air region, but it also improves the image quality inside the object. It can be seen that the vertical stripes in the object in the yz view of the reconstruction have been eliminated after performing the correction.

References

  • [1] C. Huang, J. L. Ackerman, Y. Petibon, T. J. Brady, G. El Fakhri, and J. Ouyang, “MR-based motion correction for PET imaging using wired active MR microcoils in simultaneous PET-MR: Phantom study,” Medical physics, vol. 41, no. 4, 2014.
  • [2] K. A. Mohan, S. Venkatakrishnan, J. W. Gibbs, E. B. Gulsoy, X. Xiao, M. De Graef, P. W. Voorhees, and C. A. Bouman, “TIMBIR: A method for time-space reconstruction from interlaced views.” IEEE Trans. Computational Imaging, vol. 1, no. 2, pp. 96–111, 2015.
  • [3] T. Balke, S. Majee, G. T. Buzzard, S. Poveromo, P. Howard, M. A. Groeber, J. McClure, and C. A. Bouman, “Separable models for cone-beam MBIR reconstruction,” Electronic Imaging, vol. 2018, no. 15, pp. 181–1, 2018.
  • [4] S. Majee, T. Balke, C. A. Kemp, G. T. Buzzard, and C. A. Bouman, “4D X-ray CT reconstruction using Multi-Slice Fusion,” in 2019 IEEE International Conference on Computational Photography (ICCP).   IEEE, 2019, pp. 1–8.
  • [5] Z. Nadir, M. S. Brown, M. L. Comer, and C. A. Bouman, “A model-based iterative reconstruction approach to tunable diode laser absorption tomography,” IEEE Transactions on Computational Imaging, vol. 3, no. 4, pp. 876–890, 2017.
  • [6]

    S. Majee, D. H. Ye, G. T. Buzzard, and C. A. Bouman, “A model based neuron detection approach using sparse location priors,”

    Electronic Imaging, vol. 2017, no. 17, pp. 10–17, 2017.
  • [7] A. Ziabari, D. H. Ye, K. D. Sauer, J. Thibault, and C. A. Bouman, “2.5D deep learning for CT image reconstruction using a multi-GPU implementation,” in Signals, Systems, and Computers, 2018 52nd Asilomar Conference on.   IEEE, 2018.
  • [8] J. Gibbs, K. A. Mohan, E. Gulsoy, A. Shahani, X. Xiao, C. Bouman, M. De Graef, and P. Voorhees, “The three-dimensional morphology of growing dendrites,” Scientific reports, vol. 5, p. 11824, 2015.
  • [9] G. Zang, R. Idoughi, R. Tao, G. Lubineau, P. Wonka, and W. Heidrich, “Space-time tomography for continuously deforming objects,” ACM Transactions on Graphics (TOG), vol. 37, no. 4, pp. 1–14, 2018.
  • [10] S. J. Kisner, E. Haneda, C. A. Bouman, S. Skatter, M. Kourinny, and S. Bedford, “Model-based CT reconstruction from sparse views,” in Second International Conference on Image Formation in X-Ray Computed Tomography, 2012, pp. 444–447.
  • [11] K. Sauer and C. Bouman, “A local update strategy for iterative reconstruction from projections,” IEEE Transactions on Signal Processing, vol. 41, no. 2, pp. 534–548, 1993.
  • [12] D. Clark and C. Badea, “Convolutional regularization methods for 4D, x-ray CT reconstruction,” in Medical Imaging 2019: Physics of Medical Imaging, vol. 10948.   International Society for Optics and Photonics, 2019, p. 109482A.
  • [13]

    S. Sreehari, S. V. Venkatakrishnan, B. Wohlberg, G. T. Buzzard, L. F. Drummy, J. P. Simmons, and C. A. Bouman, “Plug-and-play priors for bright field electron tomography and sparse interpolation,”

    IEEE Transactions on Computational Imaging, vol. 2, no. 4, pp. 408–423, 2016.
  • [14] S. V. Venkatakrishnan, C. A. Bouman, and B. Wohlberg, “Plug-and-play priors for model based reconstruction,” in Global Conference on Signal and Information Processing (GlobalSIP), 2013 IEEE.   IEEE, 2013, pp. 945–948.
  • [15] Y. Sun, B. Wohlberg, and U. S. Kamilov, “An online plug-and-play algorithm for regularized image reconstruction,” arXiv preprint arXiv:1809.04693, 2018.
  • [16] U. S. Kamilov, H. Mansour, and B. Wohlberg, “A plug-and-play priors approach for solving nonlinear imaging inverse problems,” IEEE Signal Processing Letters, vol. 24, no. 12, pp. 1872–1876, 2017.
  • [17] K. Dabov, A. Foi, V. Katkovnik, and K. Egiazarian, “Image denoising by sparse 3D transform-domain collaborative filtering,” IEEE Transactions on image processing, vol. 16, no. 8, pp. 2080–2095, 2007.
  • [18] M. Maggioni, G. Boracchi, A. Foi, and K. Egiazarian, “Video denoising using separable 4D nonlocal spatiotemporal transforms,” in Image Processing: Algorithms and Systems IX, vol. 7870.   International Society for Optics and Photonics, 2011, p. 787003.
  • [19] G. T. Buzzard, S. H. Chan, S. Sreehari, and C. A. Bouman, “Plug-and-play unplugged: Optimization-free reconstruction using consensus equilibrium,” SIAM Journal on Imaging Sciences, vol. 11, no. 3, pp. 2001–2020, 2018.
  • [20] D. Jiang, W. Dou, L. Vosters, X. Xu, Y. Sun, and T. Tan, “Denoising of 3D magnetic resonance images with multi-channel residual learning of convolutional neural network,” Japanese journal of radiology, vol. 36, no. 9, pp. 566–574, 2018.
  • [21] Y. Sun, B. Wohlberg, and U. S. Kamilov, “Plug-in stochastic gradient method,” arXiv preprint arXiv:1811.03659, 2018.
  • [22] Y. Sun, S. Xu, Y. Li, L. Tian, B. Wohlberg, and U. S. Kamilov, “Regularized fourier ptychography using an online plug-and-play algorithm,” arXiv preprint arXiv:1811.00120, 2018.
  • [23] V. Sridhar, G. T. Buzzard, and C. A. Bouman, “Distributed framework for fast iterative CT reconstruction from view-subsets,” Electronic Imaging, vol. 2018, no. 15, pp. 102–1, 2018.
  • [24] J.-J. Moreau, “Proximité et dualité dans un espace hilbertien,” Bulletin de la Société mathématique de France, vol. 93, pp. 273–299, 1965.
  • [25] R. E. Williamson, R. H. Crowell, and H. F. Trotter, Calculus of vector functions.   Prentice-Hall, 1972.
  • [26] V. Sridhar, X. Wang, G. T. Buzzard, and C. A. Bouman, “Distributed memory framework for fast iterative CT reconstruction from view-subsets using multi-agent consensus equilibrium,” Manuscript in preparation for IEEE Transactions on Computational Imaging, 2018.
  • [27] S. Boyd, N. Parikh, E. Chu, B. Peleato, J. Eckstein et al., “Distributed optimization and statistical learning via the alternating direction method of multipliers,”

    Foundations and Trends® in Machine learning

    , vol. 3, no. 1, pp. 1–122, 2011.
  • [28] K. Zhang, W. Zuo, Y. Chen, D. Meng, and L. Zhang, “Beyond a gaussian denoiser: Residual learning of deep CNN for image denoising,” IEEE Transactions on Image Processing, vol. 26, no. 7, pp. 3142–3155, 2017.
  • [29] Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, “Image quality assessment: from error visibility to structural similarity,” IEEE transactions on image processing, vol. 13, no. 4, pp. 600–612, 2004.
  • [30] R. Mathaes, H.-C. Mahler, J.-P. Buettiker, H. Roehl, P. Lam, H. Brown, J. Luemkemann, M. Adler, J. Huwyler, A. Streubel et al., “The pharmaceutical vial capping process: Container closure systems, capping equipment, regulatory framework, and seal quality tests,” European Journal of Pharmaceutics and Biopharmaceutics, vol. 99, pp. 54–64, 2016.