Deep Learning Can Reverse Photon Migration for Diffuse Optical Tomography

12/04/2017 · Jaejun Yoo et al.

Can artificial intelligence (AI) learn complicated non-linear physics? Here we propose a novel deep learning approach that learns non-linear photon scattering physics and obtains accurate 3D distribution of optical anomalies. In contrast to the traditional black-box deep learning approaches to inverse problems, our deep network learns to invert the Lippmann-Schwinger integral equation which describes the essential physics of photon migration of diffuse near-infrared (NIR) photons in turbid media. As an example for clinical relevance, we applied the method to our prototype diffuse optical tomography (DOT). We show that our deep neural network, trained with only simulation data, can accurately recover the location of anomalies within biomimetic phantoms and live animals without the use of an exogenous contrast agent.


Theory

Lippmann-Schwinger Integral Equation for Photon Migration

Diffuse optical tomography (DOT) uses NIR light to investigate anomalies within tissue that have different optical absorption and scattering parameters. Since these perturbations of the optical parameters usually arise from hemoglobin concentration, DOT is sensitive to biological functions and diseases associated with hemoglobin concentration changes. Accordingly, it has been investigated as a promising non-invasive imaging tool, for example, for functional brain imaging and breast cancer detection (12, 13, 14, 15).

The inverse problem in DOT is to find the locations and geometric features of anomalies, together with their optical parameters (such as the absorption parameter $\mu$ and/or the diffusion parameter $D$), using a finite number of pairs of optical measurements on a part of the boundary $\partial\Omega$. In particular, our DOT system is mainly interested in the absorption parameter changes due to hemoglobin concentration changes, represented by

$$\mu(x) = \mu_0 + \sum_{l=1}^{L} \delta\mu_l\,\chi_{\Omega_l}(x), \qquad (1)$$

whereas the diffusion parameters of the anomalies do not differ from the background, i.e., $D(x) = D_0$ for all $x \in \Omega$. Here, $\chi_{\Omega_l}$ represents the characteristic function of a domain $\Omega_l$, and the anomalies $\Omega_1, \dots, \Omega_L$ are compactly supported inside $\Omega$ and have boundaries $\partial\Omega_1, \dots, \partial\Omega_L$, respectively, such that $\Omega \setminus \bigcup_{l}\overline{\Omega}_l$ is a simply connected domain.

If the medium is characterized by scattering interactions strongly dominating over absorption, as is the case for media such as cloudy milk or biological tissue, then the light propagation can be modeled by the diffusion equation (12, 18):

$$\begin{cases} \nabla^2 u_i(x) - k^2(x)\, u_i(x) = 0, & x \in \Omega, \\ u_i(x) + \ell\, \dfrac{\partial u_i}{\partial \nu}(x) = g_i(x), & x \in \partial\Omega, \end{cases} \qquad (2)$$

where $k$ denotes the diffuse wave number, $u_i$, for $i = 1, \dots, S$, is the total photon density corresponding to the photon flux $g_i$ at the boundary from the $i$-th source distribution, $\ell$ is the extrapolation distance, and $\partial/\partial\nu$ denotes the normal derivative in the direction of the outward unit normal $\nu$ at $\partial\Omega$.

Let $u_{0,i}$ be the background photon density for the homogeneous medium (in the absence of any perturbation) and $u_{s,i} := u_i - u_{0,i}$ be the scattered photon flux. Then, the $i$-th scattered photon flux measurement at the boundary for the absorption parameter perturbation $\delta\mu$ is given by

$$u_{s,i}(x) = [\mathcal{M}_i\,\delta\mu](x), \qquad x \in \partial\Omega, \qquad (3)$$

where the measurement operator $\mathcal{M}_i$ is given by

$$[\mathcal{M}_i\, f](x) = -\int_\Omega G(x, x')\, f(x')\, u_i(x')\, dx', \qquad (4)$$

with $G$ denoting the Green's function of the homogeneous problem. To facilitate further discussion, we define the forward mapping $\mathcal{F}$ by collecting measurements with respect to all source positions, i.e.,

$$\mathbb{Y} := \mathcal{F}(\delta\mu) = \begin{bmatrix} u_{s,1}(x_1) & \cdots & u_{s,S}(x_1) \\ \vdots & \ddots & \vdots \\ u_{s,1}(x_D) & \cdots & u_{s,S}(x_D) \end{bmatrix}, \qquad (5)$$

where $x_1, \dots, x_D$ denote the detector locations at the boundary; see Supplementary Material for further details.

Note that the multi-static data matrix in (5) has dimension (no. of detectors) × (no. of sources). (3) is the so-called Lippmann-Schwinger integral equation (derived in the Supplementary Material), which relates the scattered field to the perturbation in the absorption parameter. This construction of the multi-static data matrix was an essential part of the joint sparse recovery formulations (23, 24, 25, 26, 27). However, the measurement operator relating the perturbation in the optical parameter involves the total optical photon density, which, in turn, also depends on the unknown perturbation. This makes the inverse problem highly non-linear. Furthermore, due to the dissipative nature of the diffusive wave and the smaller number of measurements compared to the number of unknowns, resolving the image in DOT is a severely ill-posed problem (35). The basic idea of the joint sparse recovery formulations (23, 24, 25, 26, 27) is then to decouple the non-linear inverse problem into two consecutive steps, taking advantage of the fact that the optical perturbation does not change position during multiple illuminations. In this paper, we extend this idea further so that the optical anomalies can be directly recovered from the measured multi-static data matrix.
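To make the ill-posedness concrete, the following toy numpy sketch assembles a Born-linearized version of the forward map for an invented slab geometry and examines the singular value decay of the resulting sensitivity matrix. The Green's function form, the optical constants, and all grid sizes below are illustrative assumptions, not our system's actual configuration:

```python
import numpy as np

# Toy Born-linearized sensitivity matrix for a diffusive slab.
# All geometry and optical values (D, k, grid sizes) are illustrative
# assumptions, not our system's actual configuration.
D, k = 0.3, 0.5  # diffusion coefficient and diffuse wave number (arbitrary units)

def gmat(P, Q):
    """Pairwise homogeneous-medium Green's function exp(-k r) / (4 pi D r)."""
    R = np.linalg.norm(P[:, None, :] - Q[None, :, :], axis=-1)
    return np.exp(-k * R) / (4 * np.pi * D * R)

# sources on the top face and detectors on the bottom face of a 40-unit slab
srcs = np.array([[x, y, 40.0] for x in range(0, 41, 10) for y in range(0, 41, 10)], dtype=float)
dets = np.array([[x, y, 0.0] for x in range(5, 41, 10) for y in range(5, 41, 10)], dtype=float)
vox = np.array([[x, y, z] for x in range(5, 40, 5)
                          for y in range(5, 40, 5)
                          for z in range(5, 40, 5)], dtype=float)

# Born approximation: u_s(detector; source) ~ -G(det, voxel) * dmu(voxel) * G(voxel, src)
J = np.einsum('sv,dv->sdv', gmat(srcs, vox), gmat(dets, vox)).reshape(-1, len(vox))

s_vals = np.linalg.svd(J, compute_uv=False)
print(J.shape)  # (400, 343): 25 sources x 16 detectors vs. 343 voxel unknowns
```

The singular values of this matrix decay by many orders of magnitude, which is the numerical signature of the severe ill-posedness noted above: small measurement noise is amplified enormously by any naive inversion.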

Figure 1: (a) Schematic illustration of photon behavior within a diffusive medium. When light illuminates the tissue, photons can be reflected, scattered, or absorbed. The goal of DOT is to reconstruct the distribution of optical anomalies within the tissue from scattered photon measurements. (b) A generic form of a neural network for the inversion of the Lippmann-Schwinger equation. The measured multi-static data matrix is first mapped to the voxel domain by a fully connected layer representing the inversion operator, which then generates the framelet coefficients through the encoder-layer filters. The framelet coefficients are further filtered using an additional convolution filter. Finally, the decoder layer reconstructs the 3D distribution of the optical anomalies from the filtered framelet coefficients.

Neural network for inverting the Lippmann-Schwinger equation

In this section, we derive a novel neural network that directly inverts the forward operator (5). Specifically, let $f$ be the 3D distribution of the perturbation $\delta\mu$ in the region of interest. If $f$ comes from a smoothly varying perturbation, its Fourier spectrum $\hat f$ is mostly concentrated in a small number of coefficients, so there exists a spectral-domain function $\hat h$ such that

$$\hat h(\omega)\,\hat f(\omega) = 0,$$

which can be equivalently represented in the discrete spatial domain by

$$(h \circledast f)[n] = 0, \qquad (6)$$

where $h$ is called the annihilating filter (36). (6) implies the existence of a rank-deficient block Hankel matrix, denoted by $\mathbb{H}(f)$, constructed from $f$, whose rank depends on the Fourier-domain sparsity level, as rigorously shown in (37).
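The annihilating-filter/low-rank Hankel relation can be checked numerically in a 1-D toy setting (the sizes and frequencies below are our own choices, not the 3D block-Hankel construction of (37)): a signal with s active Fourier modes yields a wrap-around Hankel matrix of rank exactly s.

```python
import numpy as np

# 1-D toy check: a signal with s active Fourier modes yields a wrap-around
# Hankel matrix of rank s. Sizes (n, d) and frequencies are arbitrary choices.
n, d = 128, 16
freqs = [2, 5, 11]  # three active Fourier modes -> expected rank 3
x = sum(np.exp(2j * np.pi * f * np.arange(n) / n) for f in freqs)

# wrap-around Hankel matrix H[i, j] = x[(i + j) mod n]
H = np.array([[x[(i + j) % n] for j in range(d)] for i in range(n)])

rank = np.linalg.matrix_rank(H, tol=1e-8)
print(rank)  # 3
```

Each active frequency contributes one rank-one outer product to the Hankel matrix, so the numerical rank equals the Fourier-domain sparsity level.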

Let the block Hankel matrix $\mathbb{H}(f)$ admit the singular value decomposition

$$\mathbb{H}(f) = U \Sigma V^\top,$$

where $U$ and $V$ denote the left and right singular vector basis matrices, respectively, $\Sigma$ is the diagonal matrix whose diagonal components contain the singular values, and $r$ denotes its rank. If there exist two matrix pairs $(\Phi, \tilde\Phi)$ and $(\Psi, \tilde\Psi)$ satisfying the conditions

$$\Phi \tilde\Phi^\top = I, \qquad \Psi \tilde\Psi^\top = P_{R(V)}, \qquad (7)$$

where $R(V)$ denotes the range space of $V$ and $P_{R(V)}$ represents the projection onto $R(V)$, then

$$f = (\tilde\Phi C) \circledast \nu(\tilde\Psi), \qquad (8)$$

with the coefficient matrix $C$ given by

$$C = \Phi^\top (f \circledast \overline{\Psi}). \qquad (9)$$

One of the most important discoveries in (32) is that an encoder-decoder convolution structure emerges from (9) and (8). Precisely, (8) is equivalent to the following paired encoder-decoder convolution structure:

$$C = \Phi^\top (f \circledast \overline{\Psi}), \qquad (10)$$
$$f = (\tilde\Phi C) \circledast \nu(\tilde\Psi), \qquad (11)$$

where the convolutions in (10) and (11) correspond to the single-input multi-output (SIMO) convolution and the multi-input single-output (MISO) convolution, respectively:

$$f \circledast \overline{\Psi} = \begin{bmatrix} f \circledast \overline{\psi}_1 & \cdots & f \circledast \overline{\psi}_q \end{bmatrix}, \qquad (\tilde\Phi C) \circledast \nu(\tilde\Psi) = \sum_{j=1}^{q} (\tilde\Phi c_j) \circledast \tilde\psi_j, \qquad (12)$$

where $\overline{\psi}$ denotes the flipped version of the vector $\psi$, i.e., its indices are reversed (37).
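The SIMO/MISO pairing can be illustrated with a minimal perfect-reconstruction filter pair. The averaging/differencing filters below are our own toy choice (an undecimated Haar-like pair), not the learned local bases of the network:

```python
import numpy as np

# Minimal SIMO encoder / MISO decoder pair with perfect reconstruction.
# The averaging/differencing filters below are an illustrative (undecimated
# Haar-like) choice, not the learned local bases.
rng = np.random.default_rng(1)
f = rng.standard_normal(64)

# Encoder (SIMO): one input -> two coefficient channels
# (circular convolution with psi_1 = [1, 1]/2 and psi_2 = [1, -1]/2)
c1 = 0.5 * (f + np.roll(f, 1))  # local average channel
c2 = 0.5 * (f - np.roll(f, 1))  # local difference channel

# Decoder (MISO): two channels -> one output. For this pair the dual
# filters are simply deltas, since c1[n] + c2[n] = f[n] exactly.
f_rec = c1 + c2

print(np.allclose(f_rec, f))  # True
```

The encoder splits one signal into several filtered channels; the decoder sums the filtered channels back into one signal, exactly as in (12).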

This encoder-decoder representation of the signal is ideally suited for inverting the Lippmann-Schwinger integral equation in our DOT imaging problem. First, we can simply choose the identity for the non-local basis. Then, by defining an inversion operator $\mathcal{T}$ for the forward operator $\mathcal{F}$ in (5) and substituting $f = \mathcal{T}(\mathbb{Y})$ in (10), the encoder-decoder structure of the neural network is rewritten as

$$f = \big(\tilde\Phi \hat C\big) \circledast \nu(\tilde\Psi), \qquad C = \big(\mathcal{T}(\mathbb{Y})\big) \circledast \overline{\Psi}, \qquad (13)$$

where $\hat C$ can be a further processed coefficient obtained from $C$ to remove noise, i.e.,

$$\hat C = C \circledast g$$

for some filter $g$. Finally, by choosing an appropriate number of filter channels $q$, we can control the dimension of the reconstruction image manifold.

The corresponding three-layer network structure is illustrated in Fig. 1(b). The network consists of a single fully connected layer that represents the inversion operator, two paired 3D convolutional layers with encoder and decoder filters, and an intermediate 3D convolutional layer for additional filtering. To enable learning from the training data, we used the hyperbolic tangent (tanh) as the activation function for the fully connected layer and the first two convolutional layers (C1 and C2), whereas the last convolutional layer (C3) was combined with a rectified linear unit (ReLU) to ensure positive values for the optical property distribution. The network is therefore designed so that the inversion of the Lippmann-Schwinger equation constrains the resulting optical parameters to a low-dimensional manifold structure.
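For concreteness, a tiny numpy forward-pass sketch of this three-layer structure is given below. The layer sizes, channel counts, and random weights are illustrative assumptions; only the layer types and activations (tanh for the fully connected layer, C1, and C2; ReLU for C3) follow the description above:

```python
import numpy as np

# Tiny numpy forward-pass sketch of the three-layer structure in Fig. 1(b):
# fully connected (tanh) -> 3D conv C1 (tanh) -> 3D conv C2 (tanh) -> 3D conv C3 (ReLU).
# All sizes and the random weights are illustrative, not the trained network.
rng = np.random.default_rng(0)
n_meas, vol, ksz, ch = 30, (8, 8, 4), 3, 4  # measurements, voxel grid, kernel, channels

def conv3d(x, w):
    """'Same' zero-padded 3D convolution (correlation form); x: (D,H,W), w: (k,k,k)."""
    kk = w.shape[0]
    xp = np.pad(x, kk // 2)
    out = np.zeros_like(x)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            for l in range(x.shape[2]):
                out[i, j, l] = np.sum(xp[i:i + kk, j:j + kk, l:l + kk] * w)
    return out

y = rng.standard_normal(n_meas)                          # flattened multi-static data
W_fc = 0.1 * rng.standard_normal((np.prod(vol), n_meas))
x = np.tanh(W_fc @ y).reshape(vol)                       # fully connected inversion layer

W1 = 0.1 * rng.standard_normal((ch, ksz, ksz, ksz))      # C1: SIMO encoder filters
h1 = np.tanh(np.stack([conv3d(x, w) for w in W1]))

W2 = 0.1 * rng.standard_normal((ch, ch, ksz, ksz, ksz))  # C2: intermediate filtering
h2 = np.tanh(np.stack([sum(conv3d(h1[c], W2[o, c]) for c in range(ch))
                       for o in range(ch)]))

W3 = 0.1 * rng.standard_normal((ch, ksz, ksz, ksz))      # C3: MISO decoder + ReLU
out = np.maximum(0.0, sum(conv3d(h2[c], W3[c]) for c in range(ch)))

print(out.shape)  # (8, 8, 4): non-negative 3D absorption map
```

The final ReLU guarantees a non-negative output, matching the physical constraint that the reconstructed absorption distribution is positive.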

Compared to conventional model-based approaches, the proposed deep network has many advantages. First, the inversion of the Lippmann-Schwinger equation is fully data-driven, so we do not need any explicit modeling of the acquisition system or boundary conditions. Second, unlike the explicit regularization used in model-based approaches, the low-dimensional manifold structure of the desired optical distribution is embedded in convolutional layers that are also learned from the data. This is why the proposed neural network is well suited to provide robust reconstructions. This will be substantiated through extensive experiments.

Methods

DOT Hardware System

Figure 2: Multi-channel DOT system used in our experiments (38, 39). The light source has three fiber-pigtailed laser diode modules at 785 nm, 808 nm, and 850 nm. A 70 MHz RF signal is applied simultaneously to these light sources using a bias-T, an RF splitter, and an RF amplifier. Two optical switches deliver light to 64 specific positions in the source probe. During the optical switching time, the single-tone modulated photons pass through the optical phantom, reach 40 detection fiber ends, and are detected simultaneously by 40 avalanche photodiodes (APDs) installed on the home-made signal processing card.

Fig. 2 shows the schematics of the frequency-domain diffuse optical tomography (DOT) system used in this study. The DOT system was developed at the Korea Electrotechnology Research Institute (KERI) to improve the diagnostic accuracy of digital breast tomosynthesis (DBT) by combining the DOT system with a DBT system for joint breast cancer diagnosis (38, 39). After its development, the system was installed at Asan Medical Center (AMC), where it has been under clinical testing (40). The DOT system consists of four parts: the light source, the optical detectors, the optical probe, and the data acquisition and control electronics; the source and detection paths are detailed in the caption of Fig. 2. The system uses an in-phase (I) and quadrature (Q) demodulator on the signal processing card to obtain the amplitude and phase of the signal, and the 40 IQ signal pairs are acquired simultaneously using data acquisition boards.

Biomimic phantoms

Next, to analyze the performance of the proposed approach in controlled real experiments, two biomimic phantoms with known inhomogeneity locations were created (see Fig. 4). The first phantom is made of polypropylene and contains a vertically oriented cylindrical cavity of 20 mm diameter and 15 mm height. The cavity is filled with an acetyl inclusion whose optical properties differ from those of the background. For the second phantom, we used a custom-made open-top acrylic chamber (175 mm x 120 mm x 40 mm) and three knots of different sizes (5 mm, 10 mm, and 20 mm diameter) to mimic a tumor-like vascular structure. The knots were made from thin polymer tube (0.40 mm I.D., 0.8 mm O.D.) and were filled with rodent blood drawn from the abdominal aorta of a Sprague-Dawley rat under 1-2% isoflurane inhalation anesthesia. The chamber was filled with completely melted pig lard, and the medium was left to coagulate at room temperature before the imaging scan.

Animal imaging

The mouse colon cancer cell line MC38 was obtained from the Scripps Korea Antibody Institute (Chuncheon, Korea) and cultivated in Dulbecco's modified Eagle's medium (DMEM, GIBCO, NY, US) supplemented with 10% fetal bovine serum (FBS, GIBCO) and 1x Antibiotic-Antimycotic (GIBCO). For the tumor-bearing mice, 5x10^6 cells were injected subcutaneously into the right flank region of C57BL/6 mice aged 7-9 weeks (Orient Bio, Seongnam, Korea). The animals' hair was removed by trimming and waxing. Anesthesia was maintained during the imaging scan with an intramuscular injection of Zoletil and Rompun (4:1 ratio) in normal saline solution. Mice were placed inside a custom-made 80 mm x 80 mm x 30 mm open-top acrylic chamber that had a semicircular hole on one side to allow relaxed breathing. The gap between the semicircular structure and the head was sealed with clay. The chamber was filled with a water/milk mixture at a 1000:50 ratio. All experiments associated with this study were approved by the Institutional Animal Care and Use Committee of Asan Medical Center (IACUC no. 2017-12-198).

Data preprocessing

To determine the maximally usable source-detector distance, we measured the signal magnitude as a function of source-detector distance. We observed that the signals were above the noise floor when the separation between source and detector was less than 51 mm. Therefore, instead of using measurements from all source-detector pairs, we only used pairs whose source and detector were less than 51 mm apart. This step not only enhanced the signal-to-noise ratio (SNR) of the data but also greatly reduced the number of parameters to be learned in the fully connected layer. For example, in the source-detector configuration of the biomimic phantom, the number of source-detector pairs dropped from 2560 to 466, which decreased the number of parameters to train from 137,625,600 to 25,105,920, a more than five-fold reduction. To match the scale and bias of the signal amplitude between the simulation and the real data, we multiplied the real measurements by an appropriate calibration factor so that their signal envelope matched that of the simulation data. We centered the data cloud at the origin with a maximum width of one by subtracting the mean from each individual data sample and dividing by its maximum value. To deal with the unbalanced distribution of non-zero values in the 3D label image and to prevent the proposed network from learning a trivial mapping (rendering all-zero values), we weighted the non-zero values by a constant proportional to the ratio of the total number of voxels to the number of non-zero voxels.
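The normalization and label-weighting steps above can be sketched as follows (toy array sizes; the exact calibration factor is data-dependent and omitted, and dividing by the absolute maximum is our assumption):

```python
import numpy as np

# Sketch of the normalization and label-weighting steps described above.
# Array sizes are toy values; the calibration factor is data-dependent and
# omitted, and dividing by the absolute maximum is our assumption.
rng = np.random.default_rng(0)

# 1) center each measurement vector and scale it to a maximum width of one
meas = 3.0 * rng.standard_normal((10, 466)) + 1.5  # 10 samples, 466 usable pairs
meas = meas - meas.mean(axis=1, keepdims=True)
meas = meas / np.abs(meas).max(axis=1, keepdims=True)

# 2) weight the sparse non-zero voxels of a 3D label so that the loss is not
#    dominated by the zero background
label = np.zeros((32, 32, 8))
label[10:13, 10:13, 2:4] = 1.0                     # small anomaly (18 voxels)
w = label.size / np.count_nonzero(label)           # total voxels / non-zero voxels
weighted = np.where(label > 0, w * label, label)

print(np.abs(meas).max(), round(w, 1))  # 1.0 455.1
```

Without the second step, a network that outputs all zeros would already achieve a very low mean squared error on such sparse labels.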

Neural network training

To test the robustness of the deep network on real experiments and to obtain a large database in an efficient manner, the training data were generated by solving (2) with the finite element method (FEM) based solver NIRFAST (see, e.g., (41, 42)). The finite element meshes were constructed according to the specifications of the phantom used in each experiment (see Table 1). We generated 1,500 data samples by randomly adding up to three spherical heterogeneities of different sizes (with radii between 2 and 13 mm) and optical properties to the homogeneous background. The optical parameters of the heterogeneities were constrained to lie in a biologically relevant range, i.e., two to five times larger than the background values. The source-detector configuration of the data was set to match that of the real experimental data listed in Table 1. To put the labels in matrix form, the FEM mesh was converted to a matrix of voxels using triangulation-based nearest-neighbor interpolation with a built-in MATLAB function. The number of voxels per dimension used for each experiment can be found in Table 1. To train the network for different phantom sizes and source-detector configurations, we generated separate sets of training data and changed the input and output sizes of the network accordingly. The specifications of the network architecture are provided in Table 2. Note that the overall structure of the network remained the same except for these specific parameters.
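The random-anomaly label generation can be sketched as below. The voxel grid size is a toy choice and the radii are in voxel units standing in for the physical radii; the background value 0.003 is taken from Table 1 (polypropylene phantom row):

```python
import numpy as np

# Sketch of the random-anomaly label generation described above: up to three
# spheres with random centers, radii, and contrasts in a homogeneous volume.
# The voxel grid size is a toy choice and radii are in voxel units; the
# background value 0.003 is taken from Table 1 (polypropylene phantom).
rng = np.random.default_rng(42)

def random_label(shape=(32, 32, 16), mu_bg=0.003):
    vol = np.full(shape, mu_bg)
    zz, yy, xx = np.meshgrid(*[np.arange(s) for s in shape], indexing="ij")
    for _ in range(rng.integers(1, 4)):            # one to three anomalies
        cz, cy, cx = (rng.uniform(0, s) for s in shape)
        radius = rng.uniform(2, 13)
        contrast = rng.uniform(2, 5)               # 2-5x the background value
        mask = (zz - cz) ** 2 + (yy - cy) ** 2 + (xx - cx) ** 2 <= radius ** 2
        vol[mask] = contrast * mu_bg
    return vol

labels = np.stack([random_label() for _ in range(4)])
print(labels.shape)  # (4, 32, 32, 16)
```

In the actual pipeline, each such label would be paired with the corresponding measurement vector computed by the FEM forward solver.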

 

                        FEM mesh (nodes / elements)    Background optical parameters
Polypropylene phantom   20,609 / 86,284                0.003     0.5
Biomimic phantom        53,760 / 291,870               0.002     1
Mouse (normal)          12,288 / 63,426                0.0041    0.4503
Mouse (with tumor)      12,288 / 63,426                0.0045    0.3452

Table 1: Specification of the FEM mesh and background optical parameters used for test data generation.
Layer                        Polypropylene    Biomimic    Animal
                             depth            depth       depth
Network input                -                -           -
Fully connected              -                -           -
Dropout                      -                -           -
3D convolution (stride 1)    16               64          128
3D convolution (stride 1)    16               64          128
3D convolution (stride 1)    1                1           1

Table 2: Network architecture specifications. All convolutional layers use stride 1 with zero padding that preserves the spatial size, and the network input size is the number of filtered measurement pairs for each configuration (see Data preprocessing).

The input of the neural network is the multi-static data matrix of pre-processed measurements. To perform the convolutions and to match its dimension with the final 3D output image, the output of the fully connected layer is set to the size of the discretized volume of each phantom. All convolutional layers were preceded by appropriate zero padding to preserve the size. We used tanh as the non-linear activation function for every layer except the last convolutional layer, where ReLU is used. In the network structure for the polypropylene phantom, for example, the first convolutional layer convolves 16 filters with stride 1, followed by tanh, and the last convolutional layer convolves a single filter with stride 1, followed by ReLU (Table 2).

We used the mean squared error (MSE) as the loss function, and the network was implemented using the Keras library (43). The weights of all convolutional layers were initialized with Xavier initialization. We divided the generated data into 1,000 training and 500 validation samples. For training, we used mini-batches and the Adam optimizer (44) with the default parameters from the original paper, i.e., a learning rate of 0.001, β1 = 0.9, and β2 = 0.999. Training ran for up to 120 epochs, with early stopping if the validation loss had not improved in the last 10 epochs. To prevent overfitting, we added zero-centered Gaussian noise to the input and applied dropout on the fully connected layer. We used a GTX 1080 graphics processor and an i7-6700 CPU (3.40 GHz); training took about 380 seconds. Since our network used only generated simulation data for training, it was vulnerable to noise that cannot be observed in synthetic data and to overfitting due to the limited complexity of the simulations. However, by carefully matching the signal envelopes of the simulation data to those of the real data and by tuning module parameters such as the dropout probability and the standard deviation of the Gaussian noise, we arrived at the current network architecture, which performs well in various experimental situations. We did not apply any other augmentation, such as shifting or tilting, since our input data are in the measurement domain rather than the image domain, where such techniques are not meaningful. All 3D visualizations of the results were produced with ParaView (45).
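The stopping rule described above (at most 120 epochs, patience of 10 on the validation loss) can be sketched in isolation; the loss curve here is synthetic, standing in for the validation MSE:

```python
# Sketch of the stopping rule described above: train for at most 120 epochs
# and stop when the validation loss has not improved for 10 epochs. The loss
# curve here is synthetic, standing in for the validation MSE.
def train_with_early_stopping(val_losses, max_epochs=120, patience=10):
    best, best_epoch = float("inf"), -1
    for epoch in range(min(max_epochs, len(val_losses))):
        if val_losses[epoch] < best:
            best, best_epoch = val_losses[epoch], epoch
        elif epoch - best_epoch >= patience:
            return epoch + 1, best          # epochs actually run, best loss seen
    return min(max_epochs, len(val_losses)), best

# synthetic curve: improves for 30 epochs, then plateaus above the best value
losses = [1.0 / (e + 1) for e in range(30)] + [0.05] * 90
ran, best = train_with_early_stopping(losses)
print(ran)  # 40: stopped 10 epochs after the best epoch
```

In the actual implementation this is handled by the Keras `EarlyStopping` callback rather than a hand-written loop.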

Baseline algorithm for comparison

The baseline iterative algorithm was developed using a Rytov iterative method similar to (21). Specifically, after estimating the bulk optical properties as an initial guess, the Lippmann-Schwinger equation is linearized using the Rytov approximation, and the associated penalized least-squares optimization problem is solved. The optical properties are updated until the algorithm converges, and the background Green's functions are updated accordingly, so that the penalized least-squares solutions are obtained with the newly updated Green's functions at each iteration. The algorithm is considered converged when the reconstructed optical parameters have not improved over the last two iterations. Unless the initial guess is poor, the algorithm generally converged in six to ten iterations, and each iteration took approximately 40 seconds, for a total runtime of less than 10 minutes.
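The linearize-and-solve loop of the baseline can be sketched with a toy problem. Everything here is an illustrative stand-in: a made-up mildly non-linear forward map replaces the Rytov-linearized Lippmann-Schwinger model, and a Tikhonov term replaces the baseline's penalty; only the loop structure (linearize, solve a penalized least-squares problem, re-linearize, stop after two non-improving iterations) mirrors the description above:

```python
import numpy as np

# Toy sketch of the linearize-and-solve loop described above. Everything is a
# stand-in: a made-up mildly non-linear forward map replaces the Rytov-
# linearized Lippmann-Schwinger model, and a Tikhonov term replaces the
# baseline's penalty. Only the loop structure mirrors the description.
rng = np.random.default_rng(0)
A = rng.standard_normal((40, 20)) / np.sqrt(20)

def forward(x):
    z = A @ x
    return z + 0.05 * z ** 2                    # mild model non-linearity

def jacobian(x):
    return (1.0 + 0.1 * (A @ x))[:, None] * A   # derivative of forward at x

x_true = rng.uniform(0, 1, 20)
y = forward(x_true)

x, lam, residuals = np.zeros(20), 1e-6, []
for it in range(20):
    J = jacobian(x)                             # linearize around current estimate
    r = y - forward(x)
    residuals.append(np.linalg.norm(r))
    # penalized least-squares update (Tikhonov-regularized Gauss-Newton step)
    x = x + np.linalg.solve(J.T @ J + lam * np.eye(20), J.T @ r)
    # stop when the fit has not improved over the last two iterations
    if it >= 2 and residuals[-3] - residuals[-1] < 1e-12:
        break

print(len(residuals), residuals[-1] < 1e-6 * residuals[0])
```

Each pass re-solves a regularized linear system, which is why the baseline's per-iteration cost (about 40 seconds on the real problem) dominates its runtime.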

Results

Figure 3: Reconstruction results for the numerical phantom, using the network trained on the biomimic phantom geometry (see Table 2). The ground truth images are shown with binary values to clearly indicate the locations of the virtual anomalies. The proposed network successfully resolved the anomaly locations.

To validate our deep learning approach quantitatively, reconstruction experiments on a numerical phantom were performed first; representative results are shown in Fig. 3. Here, we used the network trained on the biomimic phantom geometry (see Table 2), and the ground truth images are visualized with binary values to clearly show the locations of the virtual anomalies. To demonstrate the robustness of the algorithm under various conditions, we randomly chose data with different anomaly sizes and locations, including distinct z-locations. Our proposed network successfully resolved the anomaly locations.

Next, to analyze the performance of the proposed approach in controlled real experiments, we used the two biomimic phantoms with known inhomogeneity locations described in Methods; their schematic illustrations and specifications are shown in Fig. 4. We then obtained measurement data using our multi-channel system and compared the 3D images reconstructed by the conventional iterative method and by our proposed network. For the phantom with a single inclusion, both the conventional iterative method and our network accurately reconstructed the location of the optical anomaly (Fig. 4, top row), and the contrast is also clearly seen in the DBT image. The second phantom is made of lard to mimic a condition similar to the human breast: we inserted tubes inside the phantom, made three knots of different sizes, and, after the lard solidified, filled the tubes with blood to mimic a tumor-like vascular structure. In the DBT image, the locations of the knotted tubes are barely visible, as indicated by the black arrows. For this phantom, the image reconstructed by the iterative method exhibited globally distributed artifacts and remained contaminated even after thresholding out values smaller than 60% of the maximum (Fig. 4, bottom row). Since the artifacts had higher values that dominated the signal from the inclusions, we manually stopped the algorithm at iteration three to prevent the smoothing regularization from erasing the real signal. In contrast, the image reconstructed by the proposed network reveals the three inclusions without these artifacts and shows better contrast than the DBT image. Furthermore, it is remarkable that our reconstruction has sufficient contrast that the knots are easily seen, whereas it was hard to distinguish the virtual tumors from the background in the DBT image.

Figure 4: Reconstructed images of the phantoms: a polypropylene phantom with a single cylindrical acetyl inclusion (top row) and a biomimic phantom with three blood inclusions (middle and bottom rows). For the first phantom, both the conventional iterative method and our network accurately reconstructed the location of the optical anomaly, and the contrast is also clearly seen in the DBT image. For the second, lard-based phantom with blood-filled knots mimicking a tumor-like vascular structure, the knotted tubes are barely visible in the DBT image (black arrows), and the image reconstructed by the iterative method suffers from artifacts at the bottom plane near the detector array. In contrast, the image reconstructed by the proposed network reveals the three inclusions without these artifacts and with better contrast than the DBT image.
Figure 5: Reconstructed images from the in vivo animal experiments: a mouse without any tumor. To obtain the scattered photon density measurements, we recorded data with and without the mouse placed in the tank, which was filled with a water/milk mixture (1000:50). Both the conventional and the proposed methods recovered high absorption values in the chest area of the mouse. However, the iterative reconstruction places a large chunk of high values around the left thigh, which is unlikely for a normal mouse, whereas our proposed network shows high values along the spine, where the artery and organs are located.
Figure 6: Reconstructed images from the in vivo animal experiments: a mouse with a tumor on the right thigh, compared with the tumor-free mouse in 3D visualizations. In the 3D images, the values are thresholded above 60% of the maximum for clear visualization. Our network finds high values around the right thigh, where the tumor is located.

Finally, we performed in vivo animal experiments using mice (Fig. 5 and Fig. 6). To obtain the scattered photon density measurements, we recorded data with and without the mouse placed in the chamber, which was filled with a water/milk mixture (1000:50). Optical scattering data from the animal experiments were collected using the single-channel system. Fig. 5 and Fig. 6 show the reconstructed images of the mice without and with a tumor, respectively. Both the conventional and the proposed methods recovered high absorption values in the chest area of the mouse. However, the iterative reconstruction places a large chunk of high values around the left thigh, which is unlikely for a normal mouse (Fig. 5). In contrast, our proposed network shows high values along the spine, where the artery and organs are located. Furthermore, for the mouse with a tumor, our network finds high values around the right thigh, where the tumor is located (Fig. 6). The lateral views of our reconstructed images also match the actual position of the mouse, whose head and body were held slightly above the bottom plane due to the experimental setup.

Compared to the results of the conventional iterative method, our network showed robust performance across the various examples. While the iterative reconstruction algorithm often placed high intensities at spurious locations, such as the bottom plane of the phantom (Fig. 4, bottom row) and the large clutter at the left thigh of the normal mouse (Fig. 5, top row), our network found accurate positions, with high values only at locations where inclusions are likely to exist. Unlike the conventional method, which requires parameter tuning for every individual case, our network infers the reconstruction from the measured data without additional pre- or post-processing, which is desirable in practical applications.

Note that our network had not seen any real data during training or validation. Nevertheless, it successfully learned the inversion from simulation data alone. Furthermore, even though we trained the network only on sparse examples, it successfully recovers both sparse targets (phantoms, Fig. 4) and extended targets (mice, Fig. 5 and Fig. 6) without the help of any regularizer.

To further show that the proposed architecture is near-optimal, we performed ablation studies by changing or removing components of the proposed architecture. Since our output is a 3D distribution, the network needs to find a set of 3D encoder and decoder filters. We observed that the network with 3D convolutions showed better axial resolution than the one using 2D convolutions (Fig. 7), which is consistent with our theoretical prediction. One may suspect that the performance of the network originates solely from the first layer, since over 98% of the trainable parameters are in the fully connected layer. To address this concern, we tested the network with and without convolutional layers after the fully connected layer. We observed that the performance of the network deteriorated severely, and it failed to train without the subsequent convolution layers. At least a single convolution layer with a single filter was needed to recover the accurate location of the optical anomalies (Fig. 7). However, the paired encoder-decoder filters in the proposed network perform better than a single convolution layer, which coincides with our theoretical prediction.

Figure 7: Network ablation study results. The images reconstructed by the networks with 2D and 3D convolutions are compared (2nd and 3rd columns). To show the necessity of the convolutional layers, the image reconstructed by the network with a single 3D convolution layer of a single filter is shown (last column). The network with only a fully connected layer failed to train (results not shown).

Finally, we observed that the network training became inaccurate when the activation functions did not match physical intuition (Fig. 8). Since the first fully connected layer should learn the non-linear inversion of the Lippmann-Schwinger equation, and the subsequent 3D convolution layers are supposed to learn the local bases and their adjoints, it appeared unreasonable to impose non-negativity through a ReLU activation at each layer. Therefore, we used tanh as the activation function for every layer except the last convolutional layer, which outputs the 3D distribution of absorption coefficients, a positive quantity; this gave the most stable and accurate training. On the other hand, if we change the activation functions of the fully connected layer and the intermediate layers from tanh to ReLU, the network barely trains or its performance degrades, which runs counter to the common wisdom in computer vision that ReLU gives robust training. Interestingly, if only the 3D convolutional layers use ReLU, the training is very unstable (Fig. 8). Only when both the fully connected layer and the following 3D convolution layer have tanh activations does the network start to learn and train stably (Fig. 8).

Figure 8: The role of the activation functions. The learning curves of the proposed network, which uses tanh as the activation function for every layer except the last ReLU layer, are shown in the 1st column, together with the learning curves of networks with altered activation functions. If the activation functions of the fully connected layer and intermediate layers are changed from tanh to ReLU, the network barely trains or its performance degrades (2nd column). If only the 3D convolutional layers use ReLU, the training is very unstable (3rd column).

Conclusion

In this paper, we proposed a deep learning approach to solve the inverse scattering problem of diffuse optical tomography (DOT). Unlike conventional deep learning approaches, which try to denoise or remove artifacts in an image-to-image fashion using a black-box neural network, our network was designed based on the Lippmann-Schwinger equation to learn the complicated non-linear physics of the inverse scattering problem. Even though our network was trained only on generated numerical data, we showed that the learned mapping generalizes to real experimental data. The simulation and real experimental results showed that the proposed method outperforms the conventional method and accurately reconstructs the anomalies without iterative procedures or linear approximations. With our deep learning framework, the non-linear inverse problem of DOT can be solved in an end-to-end fashion, and new data can be processed in a few hundred milliseconds, which would be useful for dynamic imaging applications. Moreover, our design principle based on deep convolutional framelets is a general framework for incorporating physics into network design, so we believe that the proposed network can be used for other inverse scattering problems with complicated physics.

References

  • (1) Krizhevsky A, Sutskever I, Hinton GE (2012) ImageNet classification with deep convolutional neural networks in Advances in Neural Information Processing Systems. pp. 1097–1105.
  • (2) Ronneberger O, Fischer P, Brox T (2015) U-net: Convolutional networks for biomedical image segmentation in International Conference on Medical Image Computing and Computer-Assisted Intervention. (Springer), pp. 234–241.
  • (3) He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp. 770–778.
  • (4) Nair V, Hinton GE (2010) Rectified linear units improve restricted Boltzmann machines in Proceedings of the 27th International Conference on Machine Learning (ICML-10). pp. 807–814.
  • (5) Russakovsky O, et al. (2015) ImageNet Large Scale Visual Recognition Challenge. International Journal of Computer Vision (IJCV) 115(3):211–252.
  • (6) Kang E, Min J, Ye JC (2017) A deep convolutional neural network using directional wavelets for low-dose X-ray CT reconstruction. Medical Physics 44(10).
  • (7) McCollough CH, et al. (2017) Low-dose CT for the detection and classification of metastatic liver lesions: Results of the 2016 Low Dose CT Grand Challenge. Medical Physics 44(10).
  • (8) Han Y, Yoo J, Ye JC (2016) Deep residual learning for compressed sensing ct reconstruction via persistent homology analysis. arXiv preprint arXiv:1611.06391.
  • (9) Jin KH, McCann MT, Froustey E, Unser M (2017) Deep convolutional neural network for inverse problems in imaging. IEEE Transactions on Image Processing 26(9):4509–4522.
  • (10) Antholzer S, Haltmeier M, Schwab J (2017) Deep Learning for Photoacoustic Tomography from Sparse Data. arXiv preprint arXiv:1704.04587.
  • (11) Yoon YH, Ye JC (2017) Deep learning for accelerated ultrasound imaging. arXiv preprint arXiv:1710.10006.
  • (12) Yodh A, Chance B (1995) Spectroscopy and imaging with diffusing light. Physics Today 48:34–40.
  • (13) Ntziachristos V, Yodh A, Schnall M, Chance B (2000) Concurrent MRI and diffuse optical tomography of breast after indocyanine green enhancement. Proceedings of the National Academy of Sciences 97(6):2767–2772.
  • (14) Strangman G, Boas DA, Sutton JP (2002) Non-invasive neuroimaging using near-infrared light. Biological psychiatry 52(7):679–693.
  • (15) Boas DA, et al. (2001) Imaging the body with diffuse optical tomography. IEEE Signal Processing Magazine 18(6):57–75.
  • (16) Colton D, Kress R (2013) Integral equation methods in scattering theory. (SIAM).
  • (17) Ntziachristos V, Weissleder R (2001) Experimental three-dimensional fluorescence reconstruction of diffuse media by use of a normalized Born approximation. Optics Letters 26(12):893–895.
  • (18) Markel VA, Schotland JC (2001) Inverse problem in optical diffusion tomography. I. Fourier–Laplace inversion formulas. JOSA A 18(6):1336–1347.
  • (19) Ammari H, et al. (2015) Mathematical methods in elasticity imaging. (Princeton University Press).
  • (20) Jiang H, Pogue BW, Patterson MS, Paulsen KD, Osterberg UL (1996) Optical image reconstruction using frequency-domain data: simulations and experiments. JOSA A 13(2):253–266.
  • (21) Yao Y, Wang Y, Pei Y, Zhu W, Barbour RL (1997) Frequency-domain optical imaging of absorption and scattering distributions by a Born iterative method. JOSA A 14(1):325–342.
  • (22) Ye JC, Webb KJ, Bouman CA, Millane RP (1999) Optical diffusion tomography by iterative-coordinate-descent optimization in a Bayesian framework. JOSA A 16(10):2400–2412.
  • (23) Lee O, Kim JM, Bresler Y, Ye JC (2011) Compressive diffuse optical tomography: noniterative exact reconstruction using joint sparsity. IEEE Transactions on Medical Imaging 30(5):1129–1142.
  • (24) Lee O, Ye JC (2013) Joint sparsity-driven non-iterative simultaneous reconstruction of absorption and scattering in diffuse optical tomography. Optics Express 21(22):26589–26604.
  • (25) Lee OK, Kang H, Ye JC, Lim M (2015) A non-iterative method for the electrical impedance tomography based on joint sparse recovery. Inverse Problems 31(7):075002.
  • (26) Yoo J, Jung Y, Lim M, Ye JC, Wahab A (2017) A joint sparse recovery framework for accurate reconstruction of inclusions in elastic media. SIAM Journal on Imaging Sciences 10(3):1104–1138.
  • (27) Lim J, et al. (2017) Beyond Born-Rytov limit for super-resolution optical diffraction tomography. Optics Express (in press).
  • (28) Ye JC, Lee SY, Bresler Y (2008) Exact reconstruction formula for diffuse optical tomography using simultaneous sparse representation in Biomedical Imaging: From Nano to Macro, 2008. ISBI 2008. 5th IEEE International Symposium on. (IEEE), pp. 1621–1624.
  • (29) Ye JC, Lee SY (2008) Non-iterative exact inverse scattering using simultaneous orthogonal matching pursuit (S-OMP) in Acoustics, Speech and Signal Processing, 2008. ICASSP 2008. IEEE International Conference on. (IEEE), pp. 2457–2460.
  • (30) Kamilov US, et al. (2015) Learning approach to optical tomography. Optica 2(6):517–522.
  • (31) Van den Broek W, Koch CT (2012) Method for retrieval of the three-dimensional object potential by inversion of dynamical electron scattering. Physical review letters 109(24):245502.
  • (32) Ye JC, Han YS (2017) Deep convolutional framelets: A general deep learning for inverse problems. arXiv preprint arXiv:1707.00372.
  • (33) Yin R, Gao T, Lu YM, Daubechies I (2017) A tale of two bases: Local-nonlocal regularization on image patches with convolution framelets. SIAM Journal on Imaging Sciences 10(2):711–750.
  • (34) Poplack SP, Tosteson TD, Kogel CA, Nagy HM (2007) Digital breast tomosynthesis: initial experience in 98 women with abnormal digital screening mammography. American Journal of Roentgenology 189(3):616–623.
  • (35) Arridge SR (1999) Optical tomography in medical imaging. Inverse Problems 15(2):R41.
  • (36) Vetterli M, Marziliano P, Blu T (2002) Sampling signals with finite rate of innovation. IEEE Transactions on Signal Processing 50(6):1417–1428.
  • (37) Ye JC, Kim JM, Jin KH, Lee K (2017) Compressive sampling using annihilating filter-based low-rank interpolation. IEEE Transactions on Information Theory 63(2):777–801.
  • (38) Heo D, et al. (2017) Dual modality breast imaging system: Combination digital breast tomosynthesis and diffuse optical tomography. SPIE Photonics West pp. 10059–67.
  • (39) Heo D, et al. (2017) Dual modality breast imaging system: Combination digital breast tomosynthesis and diffuse optical tomography. Korean Physical Society Spring Meeting.
  • (40) Choi YW, Park Hs, Kim Ys, Kim HL, Choi JG (2012) Effect of Acquisition Parameters on Digital Breast Tomosynthesis: Total Angular Range and Number of Projection Views. Korean Physical Society 61(11):1877–1883.
  • (41) Dehghani H, et al. (2009) Near infrared optical tomography using NIRFAST: Algorithm for numerical model and image reconstruction. International Journal for Numerical Methods in Biomedical Engineering 25(6):711–732.
  • (42) Paulsen KD, Jiang H (1995) Spatially varying optical property reconstruction using a finite element diffusion equation approximation. Medical Physics 22(6):691–701.
  • (43) Chollet F (2015) Keras (https://github.com/fchollet/keras).
  • (44) Kingma D, Ba J (2014) Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.
  • (45) Ahrens J, Geveci B, Law C (2005) ParaView: An end-user tool for large data visualization. The Visualization Handbook 717.
  • (46) Pogue BW, Paulsen KD, Abele C, Kaufman H (2000) Calibration of near-infrared frequency-domain tissue spectroscopy for absolute absorption coefficient quantitation in neonatal head-simulating phantoms. Journal of biomedical optics 5(2):185–193.

Supplementary Material

Appendix A Derivation of the Lippmann-Schwinger Equation for DOT

In this appendix, we derive the general form of the Lippmann-Schwinger equation when both absorption and scattering changes are present. Although derivations of the Lippmann-Schwinger equation are already available in the literature (20, 21, 22), they are mainly based on physical intuition rather than mathematical rigor. Therefore, we provide here a more rigorous mathematical derivation. Let the optical parameter distribution within the domain of interest be described by

(14)

where represents the characteristic function of a domain . We define the perturbations in optical parameters by

(15)

Then, the total photon density satisfies the transmission problem

(16)

where for any function and .
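For reference, the transmission problem in (16) corresponds to a standard frequency-domain diffusion model for photon migration. In commonly used notation (an assumption here, since the original symbols did not survive extraction), it reads:

```latex
% Standard frequency-domain diffusion model (notation assumed):
%   u : total photon density,      q : source term
%   D : diffusion coefficient,    \mu_a : absorption coefficient
%   \omega : modulation frequency, c : speed of light in the medium
-\nabla \cdot \big( D(\mathbf{r}) \nabla u(\mathbf{r}) \big)
  + \Big( \mu_a(\mathbf{r}) + \frac{i\omega}{c} \Big) u(\mathbf{r})
  = q(\mathbf{r}) \quad \text{in } \Omega,
\qquad
u + 2 D \frac{\partial u}{\partial \nu} = 0 \quad \text{on } \partial\Omega .
```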

As for the mathematical formalism needed to derive the Lippmann-Schwinger equation for the scattered photon density, the Robin Green function and the background photon density for the homogeneous medium (in the absence of any anomaly) are required. These functions are, respectively, the solutions to the boundary value problems

(17)

and

(18)

Notice that, by an appropriate use of Green’s theorem and the boundary condition in (16),

(19)

where is the infinitesimal differential element on . Moreover, thanks to the boundary condition in (17),

(20)

Therefore, by combining (19) and (20), it can be easily seen that

From an application of Green’s theorem, one arrives at

Thanks to (16), one can see that

Using Green’s identity on the last term and by (16), one arrives at

After fairly easy manipulations, it can be seen that

(21)

Note that the first term on the RHS of (21) is nothing other than thanks to (17). Moreover, one last application of Green’s theorem to the second term furnishes

(22)

where

(22) is the Lippmann-Schwinger integral equation relating the scattered field to the perturbations in the optical properties. In particular, when only absorption perturbations are present, the scattered photon flux measured at the boundary is given by

where
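For reference, the absorption-only relation described above is commonly written as follows (the notation here is an assumption, since the original symbols did not survive extraction):

```latex
% Lippmann-Schwinger integral equation, absorption-only case
% (assumed notation): the scattered photon density u_s at a boundary
% detector position r_d is related to the absorption perturbation
% \delta\mu_a through the Robin Green function G and the total photon
% density u inside the domain \Omega.
u_s(\mathbf{r}_d) \;=\; u(\mathbf{r}_d) - u_0(\mathbf{r}_d)
  \;=\; -\int_{\Omega} G(\mathbf{r}_d, \mathbf{r}')\,
        \delta\mu_a(\mathbf{r}')\, u(\mathbf{r}')\, \mathrm{d}\mathbf{r}' .
```

Because the unknown total field appears under the integral, the relation is non-linear in the perturbation; this is exactly the non-linearity the proposed network learns to invert.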

Appendix B Bulk optical property estimation

Figure 9: Data preprocessing for bulk optical property estimation. Measurements with a source-to-detector distance greater than 51 were selected, and the bulk optical properties were estimated from these data using the Newton-Raphson scheme.

The uniform bulk optical properties were found by fitting the experimental data to model-based data (from the diffusion equation in this case) using the iterative Newton-Raphson scheme (46). Under this scheme, two parameters were estimated through linear regression from the amplitude and phase of the data (Fig. 9).

(23)

The minimization problem in (23) is a root-finding problem that can be efficiently solved by the Newton-Raphson method, which finds the roots of a real-valued function such that

(24)

The method alternates between the two functions in (24) and updates the optical properties using (25) at each iteration,

(25)

where the derivative is computed numerically.
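As a sketch of the scheme above, a Newton-Raphson root finder with a numerically estimated Jacobian (matching "the derivative is computed numerically") can be written as follows. The forward model here is a deliberately simple stand-in; the paper fits a diffusion-equation model to amplitude and phase data, so the parameter values, step size, and toy model below are all assumptions.

```python
import numpy as np

def newton_raphson(f, x0, h=1e-6, tol=1e-10, max_iter=50):
    """Find a root of f: R^n -> R^n using a forward-difference Jacobian."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        fx = f(x)
        if np.linalg.norm(fx) < tol:
            break
        # Jacobian estimated numerically by forward differences
        J = np.empty((x.size, x.size))
        for j in range(x.size):
            xp = x.copy()
            xp[j] += h
            J[:, j] = (f(xp) - fx) / h
        x = x - np.linalg.solve(J, fx)  # Newton update step
    return x

# Toy residual standing in for the mismatch between the diffusion-model
# prediction and the measured amplitude/phase data (values are assumptions).
true_params = np.array([0.01, 1.0])                  # e.g. (mu_a, mu_s')
model = lambda p: np.array([p[0] * p[1], p[0] + p[1]])
measured = model(true_params)
residual = lambda p: model(p) - measured

est = newton_raphson(residual, x0=[0.02, 0.5])
```

Starting from a reasonable initial guess, the iteration drives the residual between modeled and measured data to zero, yielding the bulk optical properties.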