Simulating Patho-realistic Ultrasound Images using Deep Generative Networks with Adversarial Learning

12/21/2017 ∙ by Francis Tom, et al.

Ultrasound imaging makes use of backscattering of waves during their interaction with scatterers present in biological tissues. Simulation of synthetic ultrasound images is a challenging problem on account of the inability to completely model the various contributing factors, which include intra-/inter-scanline interference, transducer-to-surface coupling, artifacts on transducer elements, inhomogeneous shadowing and nonlinear attenuation. While various approaches to ultrasound simulation have been developed, approaches that produce patho-realistic images typically solve wave space equations, making them computationally expensive and slow to operate. We propose a generative adversarial network (GAN) inspired approach for fast simulation of patho-realistic ultrasound images, and apply the framework to intravascular ultrasound (IVUS) simulation. A Stage 0 simulation is performed from the echogenicity map of the tissue, obtained from the ground truth label of the ultrasound image, using an off-the-shelf pseudo B-mode ultrasound image simulator. The images obtained are adversarially refined using a stacked GAN: the Stage I GAN generates low resolution images from the images produced by the initial simulation, and the Stage II GAN further refines the output of the Stage I GAN to generate high resolution images which are patho-realistic. We demonstrate that the network generates realistic-appearing images, evaluated with a visual Turing test indicating equivocal confusion in discriminating simulated from real. We also quantify the shift in tissue specific intensity distributions of the real and simulated images to establish their similarity.




1 Introduction

Figure 1: Overview of the proposed framework for ultrasound simulation using stacked generative adversarial networks (GAN).

Ultrasound images are formed by the reflection of ultrasonic acoustic waves from body structures. This makes the task of simulating realistic-appearing medical ultrasound images computationally complex [1]. Such simulators could be used as a learning aid for doctors, allowing them to simulate and visualize co-morbid pathologies which occur rarely in practice. They can also generate images to augment datasets for supervised learning algorithms. Fig. 1 summarizes the contribution of this paper, where we present a generative adversarial network (GAN) based framework for fast and realistic simulation of ultrasound images. The GAN framework [2] consists of two neural networks: a generator and a discriminator. In an adversarial learning strategy, the generator learns while attempting to generate realistic ultrasound images; the discriminator simultaneously learns while attempting to discriminate between real images and those simulated by the generator. This learning rule manifests as a two-player mini-max game. Our formulation is elucidated in Sec. 3.
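The two-player mini-max game can be illustrated with a deliberately tiny numerical sketch (not the paper's model): a one-parameter "generator" that shifts noise toward the real data distribution, and a logistic-regression "discriminator", updated alternately with hand-derived gradients. All names and values here are our own illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_toy_gan(steps=2000, lr=0.05, batch=64, seed=0):
    """Toy alternating GAN updates: real samples follow N(4, 1); the
    generator produces N(0, 1) noise shifted by a learnable offset
    theta; the discriminator is d(x) = sigmoid(w*x + b)."""
    rng = np.random.default_rng(seed)
    theta = 0.0          # generator parameter (shift)
    w, b = 0.0, 0.0      # discriminator parameters

    for _ in range(steps):
        real = rng.normal(4.0, 1.0, batch)
        fake = rng.normal(0.0, 1.0, batch) + theta

        # Discriminator step: maximize log d(real) + log(1 - d(fake)).
        d_real, d_fake = sigmoid(w * real + b), sigmoid(w * fake + b)
        w -= lr * (np.mean((d_real - 1.0) * real) + np.mean(d_fake * fake))
        b -= lr * (np.mean(d_real - 1.0) + np.mean(d_fake))

        # Generator step: minimize -log d(fake), i.e. fool the discriminator.
        d_fake = sigmoid(w * fake + b)
        theta -= lr * np.mean((d_fake - 1.0) * w)

    return theta
```

After training, the generator's shift has moved from 0 toward the real mean, the one-dimensional analogue of the generator learning to produce realistic samples.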

The rest of the paper is organized as follows: Sec. 2 describes the prior art in ultrasound image simulation; Sec. 4 presents the results of our experiments, along with a discussion, on the publicly available border detection in IVUS challenge dataset; Sec. 5 presents the conclusions of the work.

2 Prior Art

The different approaches for simulating ultrasound images can broadly be grouped as employing (i) numerical modeling of ultrasonic wave propagation or (ii) image convolution operations. An early approach employed convolutional operators, leveraging the linearity, separability and space invariance properties of the point spread function (PSF) of the imaging system to simulate pseudo B-mode images [3]. Numerical modeling based approaches have employed a ray-based model to simulate ultrasound from volumetric CT scans [4] and, with the Born approximation [5], have also simulated images using histopathology based digital phantoms employing wave equations [6, 7, 8]. Further, a Rayleigh scattering model [9] and physical reflection, transmission and absorption have also been included in simulation [10]. Commercially available solutions like the Field II simulator have used a finite element model for the purpose. These simulators, which model the tissue as a collection of point scatterers, fail to model the appearance of artifacts, and their high computational complexity challenges their deployment. Recent approaches employ spatially conditioned GANs [11], yet are limited in resolution and by the appearance of unrealistic artefacts in the generated images.

3 Methodology

Our framework employs (1) simulation of ultrasound images from tissue echogenicity maps using a physics based simulator, (2) generation of low resolution images from these initial simulations by a learned Stage I GAN, and (3) generation of high resolution patho-realistic images from the low resolution output of the earlier stage by a Stage II GAN. The Stage I GAN performs the structural and intensity mapping of the speckle maps generated by the physics based simulator. The Stage II GAN models a patho-realistic speckle PSF and also corrects any defects in the images simulated by the Stage I GAN. Direct generation of high resolution images by GANs is difficult due to training instability [12]; hence the complex task of ultrasound simulation is decomposed into these sub-problems.

3.1 Stage 0 simulation from tissue echogenicity map

The ground truth label of the lumen and external elastic lamina boundary contours available in the dataset is used to generate the tissue echogenicity map (Fig. 1(b)). The initial simulation is performed by a pseudo B-mode ultrasound image simulator [3, 13] that assumes a linear and space-invariant point spread function (PSF). The image is formed under the assumption of the wave propagating vertically along the echogenicity map. The images are generated in the polar domain (Fig. 1(c)).
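A convolution-based pseudo B-mode simulation of this kind can be sketched in a few lines. The sketch below follows the Bamber-Dickinson idea [3]: scatter the echogenicity map with sub-resolution random scatterers, convolve with a separable linear space-invariant PSF, then envelope-detect and log-compress. The PSF parameters and helper name `pseudo_bmode` are our illustrative assumptions, not the simulator actually used in the paper.

```python
import numpy as np
from scipy.signal import fftconvolve, hilbert

def pseudo_bmode(echogenicity, f0=20e6, fs=100e6,
                 sigma_ax=3.0, sigma_lat=2.0, seed=0):
    """Pseudo B-mode sketch: `echogenicity` is a 2-D map in [0, 1];
    axis 0 is depth (the vertical propagation direction in the
    polar domain). All parameter values are assumed for illustration."""
    rng = np.random.default_rng(seed)
    # Sub-resolution scatterers whose strength follows the echogenicity map.
    scatterers = echogenicity * rng.standard_normal(echogenicity.shape)

    # Separable PSF: Gaussian-windowed cosine axially, Gaussian laterally.
    t = np.arange(-16, 17)
    axial = np.exp(-t**2 / (2 * sigma_ax**2)) * np.cos(2 * np.pi * f0 / fs * t)
    lateral = np.exp(-t**2 / (2 * sigma_lat**2))
    psf = np.outer(axial, lateral)

    rf = fftconvolve(scatterers, psf, mode="same")  # simulated RF image
    env = np.abs(hilbert(rf, axis=0))               # envelope detection
    env /= env.max() + 1e-12
    return 20 * np.log10(env + 1e-3)                # log compression (dB)
```

Linearity and space invariance are what make this a single convolution per image, which is why such simulators are fast but cannot reproduce spatially varying artifacts.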

3.2 Stage I GAN

Refinement of the output of the pseudo B-mode simulator is performed by a two-stage stacked GAN [14]. The Stage I GAN is trained to generate a low resolution image of 64×64 px. in the polar domain using the simulated + unsupervised learning approach [15]. This involves learning a refiner network to map the speckle map generated by the pseudo B-mode simulator to a refined image (Fig. 1(d)). A discriminator network is simultaneously trained to classify whether images are real or refined; the set of real images (Fig. 1(a)) is also used for learning the parameters. The loss function consists of a realism loss term and an adversarial loss term. The realism loss term ensures that the ground truth label remains intact in the low resolution refined image, which is used to condition the Stage II GAN. During training of the Stage I GAN, the discriminator loss and the generator loss are minimized alternately.


Self-regularization here minimizes the per-pixel difference between the refined image and the synthetic image.
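Following [15], on whose simulated + unsupervised formulation this stage is based, the alternating losses can be sketched as below. The notation is that of [15] (refiner $R_\theta$, discriminator $D_\phi$, synthetic images $x_i$, refined images $\tilde{x}_i = R_\theta(x_i)$, real images $y_j$) and is assumed rather than reproduced from the paper:

```latex
\mathcal{L}_D(\phi) = -\sum_i \log\big(D_\phi(\tilde{x}_i)\big)
                      -\sum_j \log\big(1 - D_\phi(y_j)\big),
\qquad
\mathcal{L}_R(\theta) = \sum_i \Big[ -\log\big(1 - D_\phi(\tilde{x}_i)\big)
                      + \lambda \,\big\lVert \tilde{x}_i - x_i \big\rVert_1 \Big]
```

Here $D_\phi(\cdot)$ is the predicted probability that its input is a refined image, and the $\ell_1$ term is the per-pixel self-regularization described above, weighted by $\lambda$.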


A buffer of refined images generated over the previous steps was used to improve the realism of the artifacts in the refined images as in [15].

Model architecture: The Stage I GAN generator is a residual network [16] composed of residual blocks. The simulated image from the pseudo B-mode simulator is first passed through a convolution layer producing feature maps, which are processed by the residual blocks, and the refined image is generated by a final convolution layer. The discriminator comprises convolution layers and max pooling layers, as in [15].

(a) Real image
(b) Tissue map
(c) Pseudo B-Mode
(d) Stage I GAN
(e) Stage II GAN
(f) Real image
(g) Tissue map
(h) Pseudo B-Mode
(i) Stage I GAN
(j) Stage II GAN
Figure 2: Comparison of images obtained at different stages of the proposed framework. The first row displays images in the Polar domain and the second row displays the corresponding figures in the Cartesian domain.

3.3 Stage II GAN

The generator of the Stage II GAN accepts the low resolution images from the Stage I GAN generator and generates high resolution images, adding realistic artifacts. While training the Stage II GAN, the discriminator loss and the generator loss are minimized alternately.


The high resolution images generated are in the polar domain (Fig. 1(e)) and are scan converted to the Cartesian coordinate domain (Fig. 1(j)) for visual inspection.
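Scan conversion of a polar-domain IVUS image (rows = radius, columns = angle) to Cartesian coordinates amounts to resampling on a circular grid centred on the catheter. A minimal sketch, assuming bilinear interpolation and a hypothetical `scan_convert` helper:

```python
import numpy as np
from scipy.ndimage import map_coordinates

def scan_convert(polar_img, out_size=256):
    """Resample a polar image (axis 0 = radius, axis 1 = angle) onto a
    Cartesian grid for display; pixels outside the imaged disc are
    left as background."""
    n_r, n_theta = polar_img.shape
    # Cartesian grid in [-1, 1] x [-1, 1], transducer at the centre.
    y, x = np.mgrid[-1:1:out_size * 1j, -1:1:out_size * 1j]
    r = np.sqrt(x**2 + y**2)
    theta = np.mod(np.arctan2(y, x), 2 * np.pi)
    # Fractional pixel indices into the polar image.
    r_idx = r * (n_r - 1)                        # r = 1 -> deepest row
    t_idx = (theta / (2 * np.pi) * n_theta) % n_theta
    cart = map_coordinates(polar_img, [r_idx, t_idx],
                           order=1, mode="grid-wrap")  # wrap over the seam
    cart[r > 1] = 0.0                            # outside the imaged disc
    return cart
```

The periodic `grid-wrap` boundary handles the 0/2π seam of the angle axis so the converted image has no visible discontinuity there.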

Model architecture: The Stage II GAN generator consists of a downsampling block comprising two max pooling layers that downsample the image by a factor of four. The feature maps are then fed to residual blocks, followed by upsampling back to the high output resolution. The discriminator consists of downsampling blocks followed by a convolution layer and a fully connected layer to generate the prediction.

4 Experiments, Results and Discussion

In this section, experimental validation of our proposed framework is presented. Images from the dataset released as part of the border detection in IVUS challenge were used in our work. The images were obtained from clinical recordings using a 20 MHz probe, and a subset of the available images had been manually annotated. We used the real images of Patient Ids. 1-9 for training, and the images of Patient Id. 10 were held out for testing. The ground truth labeled images were augmented by rotating each image in steps, and each such image was then warped by translating it up and down in the polar domain, yielding an augmented dataset of tissue maps to be processed with the pseudo B-mode simulator.
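In the polar domain both augmentations are cheap index shifts: a rotation of the vessel is a circular shift along the angle axis, and the up/down warp is a shift along the depth axis. A sketch, where `n_rot` and `shifts` are assumed values standing in for the paper's actual step sizes, and the wrap-around along depth is a simplification of a true translation:

```python
import numpy as np

def augment_polar(tissue_map, n_rot=8, shifts=(-4, 4)):
    """Augment one polar-domain tissue map (axis 0 = depth,
    axis 1 = angle) by rotations and depth shifts."""
    n_r, n_theta = tissue_map.shape
    out = []
    for k in range(n_rot):
        # Rotation = circular shift along the angle axis.
        rotated = np.roll(tissue_map, k * n_theta // n_rot, axis=1)
        for s in shifts:
            # Depth translation (circular shift used for simplicity;
            # edge rows wrap, unlike a true translation).
            out.append(np.roll(rotated, s, axis=0))
    return out
```

Each input map yields `n_rot * len(shifts)` augmented tissue maps to feed the pseudo B-mode simulator.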

The Stage I GAN was trained with the Adam optimizer at a fixed learning rate and batch size. While training the Stage II GAN, the weights of the Stage I GAN were held fixed; training was again done with the Adam optimizer, using an initial learning rate with periodic decay.

Images obtained at different stages of the pipeline are shown in Fig. 2. The Stage I GAN generates low resolution (64 x 64) images which are blurry with many defects. As shown, the Stage II GAN refines the image generated by the Stage I GAN, adding patho-realistic details at higher resolution.

Qualitative evaluation of the refined images: In order to evaluate the visual quality of the adversarially refined images, we conducted a Visual Turing Test (VTT). Real images were randomly selected from the test set, and each real image was paired with a simulated image and presented to evaluators with experience in clinical or research ultrasonography. Each evaluator was presented with independently drawn pairs and had to identify the real image in each pair. Cumulatively, the rate of correctly identifying the real IVUS image was close to chance, indicating equivocal confusion between real and simulated images.

Quantitative evaluation of the refined images: Histograms of the real and adversarially refined images in the three regions, i.e., lumen, media and externa, were obtained for randomly chosen annotated images, and probability mass functions were computed from the histograms. A random walk based segmentation algorithm [17] was used for the annotation of the refined images. The Jensen-Shannon (JS) divergence between the probability mass functions of the three regions in real vs. simulated images was computed and is summarized in Table 1. It is observed that the results of the Stage II GAN are closer to real IVUS than the Stage 0 results. The pairwise JS-divergences between lumen and media, media and externa, and lumen and externa were computed and are summarized in Table 2. The divergences between pairs of tissue classes in the Stage II GAN output are also closer to those of real IVUS.
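The JS divergence used in Tables 1 and 2 can be computed directly from region-wise intensity histograms. A minimal sketch, where the function name and the bin count are our assumptions:

```python
import numpy as np

def js_divergence(region_a, region_b, bins=64):
    """Jensen-Shannon divergence between the intensity distributions of
    two image regions (e.g. the lumen of a real vs. a simulated IVUS
    frame). Inputs are 1-D arrays of pixel intensities."""
    lo = min(region_a.min(), region_b.min())
    hi = max(region_a.max(), region_b.max())
    p, _ = np.histogram(region_a, bins=bins, range=(lo, hi))
    q, _ = np.histogram(region_b, bins=bins, range=(lo, hi))
    p = p / p.sum()                  # probability mass functions
    q = q / q.sum()
    m = 0.5 * (p + q)

    def kl(a, b):
        mask = a > 0                 # convention: 0 * log(0) = 0
        return np.sum(a[mask] * np.log(a[mask] / b[mask]))

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)
```

The JS divergence is symmetric and bounded by log 2 (in nats), so identical distributions score 0 and completely disjoint ones approach 0.693, which makes the values in Tables 1 and 2 directly comparable.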

              lumen    media    externa
Stage II GAN  0.0458   0.1159   0.3343
Stage 0       0.0957   0.5245   0.6685
Table 1: JS-divergence between speckle distributions of different regions in simulated vs. real IVUS.
              lumen-media  media-ext.  lumen-ext.
Real          0.2843       0.1460      0.2601
Stage II GAN  0.1471       0.1413      0.3394
Table 2: JS-divergence between pairs of tissue types in real and simulated IVUS.

5 Conclusion

Here we have proposed a stacked GAN based framework for the fast simulation of patho-realistic ultrasound images, using a two stage GAN architecture to refine images synthesized by an initial simulation performed with a pseudo B-mode ultrasound image generator. Once training is completed, images are generated simply by providing a tissue map. The quality of the simulated images was evaluated through a visual Turing test, which evoked an equivocal visual response across experienced raters when identifying the real from the simulated IVUS. The similarity of the real and simulated images was quantified by computing the JS divergence between tissue specific speckle intensity distributions of the real and simulated images. Conclusively, these evaluations substantiate the ability of our approach to simulate patho-realistic IVUS images, converging closer to real appearance as compared to prior art, while simulating an image in milliseconds during deployment.


  • [1] M. Alessandrini, M. De Craene, O. Bernard, S. Giffard-Roisin, P. Allain, I. Waechter-Stehle, J. Weese, E. Saloux, H. Delingette, M. Sermesant, et al., “A pipeline for the generation of realistic 3d synthetic echocardiographic sequences: methodology and open-access database,” IEEE Trans. Med. Imaging, vol. 34, no. 7, pp. 1436–1451, 2015.
  • [2] I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio, “Generative adversarial nets,” in Proc. Adv., Neur. Inf. Process. Sys., 2014, pp. 2672–2680.
  • [3] J. C. Bamber and R. J. Dickinson, “Ultrasonic b-scanning: a computer simulation,” Phys., Med., Biol., vol. 25, no. 3, pp. 463, 1980.
  • [4] O. Kutter, R. Shams, and N. Navab, “Visualization and gpu-accelerated simulation of medical ultrasound from ct images,” Comp. Methods, Prog., Biomed., vol. 94, no. 3, pp. 250–266, 2009.
  • [5] M. D. R. Ramírez, P. R. Ivanova, J. Mauri, and O. Pujol, “Simulation model of intravascular ultrasound images,” in Int. Conf. Med. Image Comput., Comp.-Assist. Interv., 2004, pp. 200–207.
  • [6] S. C. Groot, R. Hamers, F. H. Post, C. P. Botha, and N. Bruining, “Ivus simulation based on histopathology,” in Proc. Int. Conf. Comp. Cardiol., 2006, pp. 681–684.
  • [7] S. Kraft, A. Karamalis, D. Sheet, E. Drecoll, E. J. Rummeny, N. Navab, P. B. Noël, and A. Katouzian, “Introducing nuclei scatterer patterns into histology based intravascular ultrasound simulation framework,” in Proc. SPIE Med. Imaging, 2013, vol. 8675, pp. 86750Y–6.
  • [8] S. Kraft, S. Conjeti, P. B. Noël, S. Carlier, N. Navab, and A. Katouzian, “Full-wave intravascular ultrasound simulation from histology,” in Int. Conf. Med. Image Comput., Comp.-Assist. Interv., 2014, pp. 627–634.
  • [9] C. Abkai, N. Becherer, J. Hesser, and R. Männer, “Real-time simulator for intravascular ultrasound (ivus),” in Proc. SPIE Med. Imaging, 2007, vol. 6513, pp. 65131E–1–6.
  • [10] F. M. Cardoso, M. C. Moraes, and S. S. Furuie, “Realistic ivus image generation in different intraluminal pressures,” Ultrasound, Med., Biol., vol. 38, no. 12, pp. 2104–2119, 2012.
  • [11] Y. Hu, E. Gibson, L.-L. Lee, W. Xie, D. C. Barratt, T. Vercauteren, and J. A. Noble, “Freehand ultrasound image simulation with spatially-conditioned generative adversarial networks,” in Molecular Imaging, Reconstruction and Analysis of Moving Body Organs, and Stroke Imaging and Treatment, pp. 105–115. Springer, 2017.
  • [12] M. Arjovsky and L. Bottou, “Towards principled methods for training generative adversarial networks,” arXiv preprint arXiv:1701.04862, 2017.
  • [13] Y. Yu and S. T. Acton, “Speckle reducing anisotropic diffusion,” IEEE Trans. Image Process., vol. 11, no. 11, pp. 1260–1270, 2002.
  • [14] H. Zhang, T. Xu, H. Li, S. Zhang, X. Huang, X. Wang, and D. Metaxas, “Stackgan: Text to photo-realistic image synthesis with stacked generative adversarial networks,” arXiv preprint arXiv:1612.03242, 2016.
  • [15] A. Shrivastava, T. Pfister, O. Tuzel, J. Susskind, W. Wang, and R. Webb, “Learning from simulated and unsupervised images through adversarial training,” in Proc. IEEE/CVF Conf. Comp. Vis., Patt. Recog., 2017, pp. 2107–2116.
  • [16] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in Proc. IEEE/CVF Conf. Comp. Vis., Patt. Recog., 2016, pp. 770–778.
  • [17] L. Grady, “Random walks for image segmentation,” IEEE Trans. Patt. Anal., Mac. Intell., vol. 28, no. 11, pp. 1768–1783, 2006.