
UltraCompression: Framework for High Density Compression of Ultrasound Volumes using Physics Modeling Deep Neural Networks

by Debarghya China, et al.

Ultrasound image compression that preserves speckle-based key information is a challenging task. In this paper, we introduce an ultrasound image compression framework with the ability to retain realism of speckle appearance despite achieving very high-density compression factors. The compressor employs a tissue segmentation method, transmitting segments along with transducer frequency, number of samples and image size as essential information required for decompression. The decompressor is based on a convolutional network trained to generate patho-realistic ultrasound images which convey essential information pertinent to tissue pathology visible in the images. We demonstrate generalizability of the building blocks using two variants to build the compressor. We have evaluated the quality of decompressed images using distortion losses as well as perception loss and compared it with other off-the-shelf solutions. The proposed method achieves a compression ratio of 725:1 while preserving the statistical distribution of speckles. This enables image segmentation on decompressed images to achieve a Dice score of 0.89 ± 0.11, which is not achievable with comparable accuracy when images are compressed with current standards like JPEG, JPEG 2000, WebP and BPG. We envision this framework to serve as a roadmap for speckle image compression standards.



1 Introduction

Ultrasound (US) has been in use for medical imaging for more than three decades, on account of its salient advantages, which include the relatively low cost of ownership and operation, non-ionizing nature of radiation, real-time imaging capability, high resolution, and ability to serve for both inside-out and outside-in imaging [1]. In recent years there have been significant advancements beyond conventional 2D US imaging techniques with the inclusion of 2D+T, 3D and 3D+T modes, which have increased their deployment in mainline medical imaging [2]. This has resulted in an increase in the quantity of data needed to store these images, thereby creating difficulties in storage and transmission. Even though image compression techniques have been able to solve this problem in general for consumer grade camera images and some medical images, ultrasound image compression is challenging due to the presence of speckle patterns, which are hard to preserve when compressed with existing standards like JPEG and JPEG2000 [2]. In this paper, we propose a framework for ultrasound image compression wherein the compressor and the decompressor are designed taking into account the physics of the ultrasound image formation process, as illustrated in Fig. 1(a), to obtain decompressed images (Fig. 1(c)) which are perceivably similar to the original image (Fig. 1(b)), as is also evident in Fig. 1(d) representing the perception-distortion trade-off [3].

2 Prior Art

The significant growth in digital imaging over the last four decades has resulted in an increase in complexity associated with transmission and storage of images. The International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC) jointly introduced the first standard for image compression, popularly referred to as JPEG after the Joint Photographic Experts Group [4]. These developments were later followed by new standards such as JPEG2000 [5], JPEG-LS [6], JPEG-XR [7] and, for video compression, HEVC/H.265 [8].

In line with the growth of digitally acquired and stored medical images, the requirement of standards for digital imaging and communication in medicine (DICOM) also grew. Subsequently, JPEG, JPEG 2000, JPEG-LS, H.264, and HEVC were included within DICOM and used for radiological images, and information on clinically acceptable compression factors was laid out for each of the specific imaging modalities by different societies [2].

However, use of compression in US images has been limited to JPEG and JPEG2000 [2] and restricted for use in anatomical regions like the breast, musculoskeletal region and in pediatric imaging. Recently, methods employing deep neural networks have been introduced which allow high-density compression of medical images while retaining diagnostically relevant features [9]. The motivation of this paper is to address this challenge by introducing a framework with the primary ability to preserve key information rendered through speckles to enable its clinically relevant use.

3 Methodology

The image compression framework we propose here consists of a compressor and a decompressor block, with the compressed file constituting the commonly exchanged information between them. The key idea is to segment an image into anatomically relevant regions and transmit them to the decompressor, which is essentially an image generator [10] that generates the image from the compressed file.

3.1 Compressor

The compressor consists of a segmentation engine which splits the ultrasound image into anatomical and pathological segments and transmits them. We demonstrate generalizability by implementing the segmentation using two recently published methods: (i) a classical machine learning based approach with cross frame belief propagating iterative random walks [1], and (ii) a convolutional neural network based approach implemented with U-Net [11].

Classical machine learning based approach for segmentation: This approach [1] is based on the statistical mechanical understanding of ultrasound-tissue interaction. A parametric model of ultrasonic speckle statistics is estimated and used to learn a random forest classifier for pixelwise classification of tissues. This model is used for contour initialization. Subsequently, iterative random walks are used for correcting the contours. Gradient vector flow based belief propagation is then applied to subsequent neighboring frames for initializing the random walks, and this process is performed iteratively for volume segmentation.

Convolutional neural network approach for segmentation: A semantic segmentation approach based on U-Net [11] is also used, where the encoder is initialized with VGG11 [12] weights and the decoder weights are randomly initialized. The ReLU activation function follows the convolution layers throughout the network. In the final layer of the decoder, the sigmoid activation function is used to generate a pixel-wise segmentation response map. The weighted cross-entropy (WCE) loss function is used, giving higher weights to the pixels closer to the boundary using the morphological distance transform.
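The boundary weighting in the WCE loss can be illustrated with a minimal sketch. It assumes a city-block distance to the nearest label boundary and an exponential weight decay; the names `boundary_distance`, `boundary_weights`, `weighted_bce` and the parameters `w0` and `sigma` are hypothetical and are not taken from [11] or from our implementation.

```python
import math
from collections import deque

NEIGHBOURS = ((1, 0), (-1, 0), (0, 1), (0, -1))

def boundary_distance(mask):
    """City-block distance of every pixel to the nearest boundary pixel.

    A boundary pixel has a 4-neighbour with a different label.
    Entries stay None when the mask contains no boundary at all.
    """
    h, w = len(mask), len(mask[0])
    dist = [[None] * w for _ in range(h)]
    queue = deque()
    for y in range(h):
        for x in range(w):
            for dy, dx in NEIGHBOURS:
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] != mask[y][x]:
                    dist[y][x] = 0
                    queue.append((y, x))
                    break
    # Breadth-first propagation of distances away from the boundary.
    while queue:
        y, x = queue.popleft()
        for dy, dx in NEIGHBOURS:
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w and dist[ny][nx] is None:
                dist[ny][nx] = dist[y][x] + 1
                queue.append((ny, nx))
    return dist

def boundary_weights(mask, w0=4.0, sigma=2.0):
    """Assumed weighting: 1 + w0 * exp(-d / sigma), largest at the boundary."""
    dist = boundary_distance(mask)
    return [[1.0 + (w0 * math.exp(-d / sigma) if d is not None else 0.0)
             for d in row] for row in dist]

def weighted_bce(prob, mask, weights, eps=1e-7):
    """Weighted binary cross-entropy, normalised by the sum of the weights."""
    total = norm = 0.0
    for p_row, m_row, w_row in zip(prob, mask, weights):
        for p, m, w in zip(p_row, m_row, w_row):
            p = min(max(p, eps), 1.0 - eps)
            total -= w * (m * math.log(p) + (1 - m) * math.log(1.0 - p))
            norm += w
    return total / norm
```

With this weighting, an error on a pixel adjacent to a contour costs more than the same error deep inside a tissue region, which is the behaviour described above.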

File format: The file format for transmission of compressed data contains four necessary pieces of information. Chain codes of the contours for the different classes of tissue are stored in the probe geometry specific polar coordinate format. The number of scan lines that make up the polar image is included for reconstruction of the segmented contours in the decompressor. The frame size in the Cartesian coordinate system is also included. Finally, the acquisition frequency is included for the reconstruction of the images using a generative model.
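As an illustration of the contour storage, the sketch below encodes and decodes a contour with an 8-connected Freeman chain code and packs it with the three other fields. The exact chain code variant and field layout are not specified in the text, so the direction table, the dictionary keys and all numeric values here are assumptions made for the example.

```python
# 8-connected Freeman directions: code -> (row step, column step).
DIRECTIONS = [(0, 1), (-1, 1), (-1, 0), (-1, -1),
              (0, -1), (1, -1), (1, 0), (1, 1)]

def encode_chain(points):
    """Return (start point, list of direction codes) for adjacent contour points."""
    codes = []
    for (r0, c0), (r1, c1) in zip(points, points[1:]):
        # Raises ValueError if two consecutive points are not 8-neighbours.
        codes.append(DIRECTIONS.index((r1 - r0, c1 - c0)))
    return points[0], codes

def decode_chain(start, codes):
    """Rebuild the contour points from the start point and the codes."""
    r, c = start
    points = [(r, c)]
    for code in codes:
        dr, dc = DIRECTIONS[code]
        r, c = r + dr, c + dc
        points.append((r, c))
    return points

# Hypothetical compressed record with the four pieces of information;
# the numeric values are placeholders, not those of the dataset.
contour = [(5, 0), (5, 1), (4, 2), (4, 3), (5, 4)]
compressed = {
    "contours": {"lumen": encode_chain(contour)},
    "num_scan_lines": 256,
    "frame_size": (384, 384),
    "frequency_mhz": 20.0,
}
```

Only the first point needs full coordinates; every later point costs one direction code, which is what makes the representation compact.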

3.2 Decompressor

In this section, an adversarially trained generative convolutional network is employed to simulate patho-realistic ultrasound images from tissue echogenicity maps recreated from the received file. This generator involves two stages and is based on the framework proposed in [10].

Stage 0: Physics based simulation: This first stage simulation is performed using a pseudo B-mode physics based simulator which works on the principle of linear and space invariant nature of point spread function (PSF) of speckle in the image [13].
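The linear, space-invariant PSF assumption means the stage 0 output can be modelled as a 2D convolution of a random scatterer field with a fixed PSF. The sketch below is a simplified illustration of that idea and not the simulator of [13]: the Gaussian-windowed cosine PSF, its parameters, and the function names are assumptions, and envelope detection and log compression are omitted.

```python
import math
import random

def make_psf(half_axial=4, half_lateral=2, sigma_ax=1.5, sigma_lat=1.0,
             cycles_per_sample=0.25):
    """Assumed PSF: Gaussian envelope modulated by a cosine along the axial axis."""
    psf = []
    for a in range(-half_axial, half_axial + 1):
        row = []
        for l in range(-half_lateral, half_lateral + 1):
            env = math.exp(-0.5 * ((a / sigma_ax) ** 2 + (l / sigma_lat) ** 2))
            row.append(env * math.cos(2.0 * math.pi * cycles_per_sample * a))
        psf.append(row)
    return psf

def convolve2d_same(image, kernel):
    """Direct 2D convolution with zero padding and same-size output."""
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    oy, ox = kh // 2, kw // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for i in range(kh):
                for j in range(kw):
                    yy, xx = y + oy - i, x + ox - j
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += image[yy][xx] * kernel[i][j]
            out[y][x] = acc
    return out

def simulate_rf(echogenicity, psf, seed=0):
    """Scale Gaussian random scatterers by tissue echogenicity, then blur with the PSF."""
    rng = random.Random(seed)
    scatterers = [[e * rng.gauss(0.0, 1.0) for e in row] for row in echogenicity]
    return convolve2d_same(scatterers, psf)
```

Because the PSF does not vary with position, a single point scatterer reproduces the PSF itself, which is a convenient sanity check for any implementation of this stage.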

Stage I: Speckle intensity and point spread function learning: In this stage, we modify the architecture in [10] to enable direct learning of the mapping from stage 0 simulated results to patho-realistic ultrasound images, using a single generator. This allows for better preservation of the ground truth annotation, which is an important diagnostically relevant feature. The proposed single stage model has the added advantage of being more computationally efficient, reducing the inference time by half.

The generator has an encoder-decoder architecture that enables it to learn the intensity mapping and the point spread function. The feature maps are fed to residual blocks, followed by nearest-neighbour upsampling and convolutions. The discriminator has downsampling blocks that reduce the spatial dimension, followed by a convolution layer and a fully connected layer for prediction. The LeakyReLU activation function is used in the downsampling blocks. Similar to the network in [10], a self-regularization term is also included in the GAN loss for preserving the ground truth information; its weight was set to 0.01 via experimentation.
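The role of the self-regularization term can be sketched as follows. The exact form of the loss is not given in the text, so this example assumes a non-saturating adversarial term and an L1 penalty between the generated image and the stage 0 input; only the weight of 0.01 comes from the description above, and the function name is hypothetical.

```python
import math

def generator_loss(disc_scores_fake, generated, stage0, reg_weight=0.01):
    """Adversarial term plus weighted self-regularization (assumed L1 form).

    disc_scores_fake: discriminator probabilities in (0, 1] for generated images.
    generated, stage0: 2D lists of equal size.
    """
    eps = 1e-7
    # Non-saturating generator loss: -log D(G(x)), averaged over the batch.
    adv = -sum(math.log(min(max(s, eps), 1.0))
               for s in disc_scores_fake) / len(disc_scores_fake)
    # Mean absolute deviation from the stage 0 image keeps the tissue layout.
    diffs = [abs(g - s) for g_row, s_row in zip(generated, stage0)
             for g, s in zip(g_row, s_row)]
    self_reg = sum(diffs) / len(diffs)
    return adv + reg_weight * self_reg
```

A small weight lets the adversarial term drive speckle realism while still discouraging the generator from moving tissue boundaries away from the transmitted contours.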

Figure 2: Visualization of decompressed images. (a) Original image, (b) UltraCompression, (c) JPEG, (d) JPEG2000, (e) BPG, (f) WebP.

4 Experiments, Results and Discussion

In this experiment, we have used the IVUS pullback data from the border detection in IVUS Challenge. The dataset consists of pullbacks with one pullback per patient, acquired at MHz [14]. One pullback was used for reporting performance and the remaining nine were used for training. Augmentation was performed by axially rotating the pullbacks by 30° over 12 steps. The implementation was done on Ubuntu LTS, Python 3.6, PyTorch 0.5, CUDA and cuDNN for acceleration, with an Nvidia Tesla K40c GPU with 12 GB of DDR4 RAM on a PC with an Intel Core i5-8600K processor and 32 GB of system RAM.
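For frames stored in polar form, an axial rotation amounts to a circular shift of the scan lines. The sketch below illustrates the 30° by 12 steps augmentation under that assumption; it also assumes that the rows of a frame are scan lines covering a full 360° and that the unrotated frame counts as the first step. The function names are hypothetical.

```python
def rotate_polar(frame, degrees):
    """Circularly shift the scan lines (rows) of a polar frame by an angle."""
    n = len(frame)  # number of scan lines over 360 degrees
    shift = round(n * degrees / 360.0) % n
    if shift == 0:
        return list(frame)
    return frame[-shift:] + frame[:-shift]

def augment(frame, step_deg=30, steps=12):
    """Return the frame rotated by 0, 30, ..., 330 degrees."""
    return [rotate_polar(frame, k * step_deg) for k in range(steps)]
```

The same shift has to be applied to the corresponding contour annotations so that image and label stay aligned.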

The compressor with classical machine learning approach was implemented following [1]. The UNet [11] based compressor was trained with learning rate and batch size 14 using the Adam optimizer. The decompressor was trained with an initial learning rate of and a weight decay of per epochs over epochs with a batch size of , following  [10].

Lumen vs. Media | Media vs. Ext. | Lumen vs. Ext.
Table 1: Inter-tissue Jensen-Shannon (JS) divergence. Higher value indicates better contrast between tissue pairs. Also refer to Fig. 1(d). Ext. denotes external elastic luminae. Numbers in parentheses correspond to standard deviation.

Method | Lumen | Media | Ext.
Table 2: Intra-tissue JS divergence. Lower value indicates better preservation of speckle statistics.

Method | Lumen | Media | Ext.
Table 3: Intra-tissue JS divergence in attenuation map. Lower value indicates similarity in estimated signal attenuation.

The original and the decompressed images are compared using the inter-tissue Jensen-Shannon (JS) divergence of their speckle appearance and the results are reported in Table 1. We compared the proposed method with the prior art using intra-tissue JS divergence assessed between speckle statistics in Table 2 and divergence in attenuation in Table 3. Subsequently, we compared the impact on image segmentation using two strategies for segmentation. In the first case, we used a CNN within the compressor and a classical machine learning based approach for segmentation during validation, whose results are reported in Table 4. In the second case, the compression and segmentation approaches were reversed and the results are reported in Table 5.
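The JS divergence between two tissue regions can be computed from their normalized intensity histograms. The sketch below uses the base-2 logarithm so that the value lies between 0 and 1; the bin count and intensity range are assumptions, since the histogram settings are not stated.

```python
import math

def histogram(values, bins=32, lo=0.0, hi=255.0):
    """Normalized histogram of intensities clipped to [lo, hi]."""
    counts = [0] * bins
    for v in values:
        idx = int((v - lo) / (hi - lo) * bins)
        counts[max(0, min(idx, bins - 1))] += 1
    total = sum(counts)
    return [c / total for c in counts]

def js_divergence(p, q):
    """Jensen-Shannon divergence of two discrete distributions (base 2)."""
    def kl(a, b):
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)
    m = [(x + y) / 2.0 for x, y in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)
```

For the inter-tissue case, `p` and `q` would be histograms of two different tissues in the same image; for the intra-tissue case, they would be histograms of the same tissue in the original and the decompressed image.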

Method | SE | SP | Dice | PPV
Table 4: Evaluating segmentation performance on decompressed images using [1]. SE denotes sensitivity, SP denotes specificity, PPV denotes positive predictive value.

Method | SE | SP | Dice | PPV
Table 5: Evaluating segmentation performance on decompressed images using [11].
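The four measures reported in Tables 4 and 5 follow from the pixel-wise confusion counts between a predicted and a reference binary mask. A minimal sketch, with a hypothetical function name:

```python
def segmentation_metrics(pred, truth):
    """Sensitivity, specificity, Dice and PPV for two binary masks (2D lists)."""
    tp = fp = fn = tn = 0
    for p_row, t_row in zip(pred, truth):
        for p, t in zip(p_row, t_row):
            if p and t:
                tp += 1
            elif p and not t:
                fp += 1
            elif t:
                fn += 1
            else:
                tn += 1

    def ratio(num, den):
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "SE": ratio(tp, tp + fn),
        "SP": ratio(tn, tn + fp),
        "Dice": ratio(2 * tp, 2 * tp + fp + fn),
        "PPV": ratio(tp, tp + fp),
    }
```

For example, a prediction `[[1, 1, 0, 0]]` against a reference `[[1, 0, 0, 0]]` gives SE = 1, SP = 2/3, Dice = 2/3 and PPV = 1/2.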

Here, an uncompressed polar image occupies bits. For the chain code used, the starting point occupies bits and the remaining points occupy bits. Two contours for lumen and media are stored and thus the compression factor becomes 725:1. In spite of achieving a high compression ratio, the decompressed images have very low intra-tissue JS divergence values, indicating a high degree of similarity to the original images. From Table 1, it is observed that the tissue-specific layer mapping is better than in the existing methods because the inter-tissue JS divergence is higher and close to that of the original images. Additionally, inter-tissue statistical mechanics has been mapped properly in the decompressed image, since the classical method based on the statistical mechanical understanding of ultrasound-tissue interaction has been able to segment the lumen and external elastic luminae precisely. Through a paired visual Turing test, it was observed that there was a 50% chance of identifying the real from the decompressed images [10]. Also, the compressor and decompressor blocks in this framework can be modified with other segmentation and generation methods to further improve performance.
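The bit accounting behind the compression factor can be written as a small helper. The bit sizes are left as parameters because the specific values are not reproduced here; the function name, the optional `header_bits` argument and the numbers in the usage note are purely illustrative and do not reproduce the reported ratio.

```python
def compression_ratio(image_bits, start_bits, step_bits, points_per_contour,
                      num_contours=2, header_bits=0):
    """Ratio of uncompressed image size to chain-coded contour size.

    Each contour stores one full start point plus one code per remaining point.
    header_bits (assumed) covers scan-line count, frame size and frequency.
    """
    contour_bits = start_bits + step_bits * (points_per_contour - 1)
    return image_bits / (num_contours * contour_bits + header_bits)
```

As a hypothetical example, a 256 × 256 polar image at 8 bits per pixel (524288 bits) with two contours of 256 points each, a 16-bit start point and 3-bit direction codes gives `compression_ratio(524288, 16, 3, 256)`, which is roughly 336:1.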

5 Conclusion

In this paper, we have proposed a framework for high-density compression of ultrasound volumes. This framework involves two parts: a compressor with a segmentation block and a decompressor with a generation block. Both the segmentation and generation blocks can be customized with different algorithms. Here we have used classical machine learning and a convolutional neural network alternatively in the compressor block, and a generative adversarial network in the decompressor block. The quality of image compression has been evaluated using inter- and intra-tissue JS divergence between the original and the decompressed images.


  • [1] D. China, A. Illanes, P. Poudel, M. Friebe, P. Mitra, and D. Sheet, “Anatomical structure segmentation in ultrasound volumes using cross frame belief propagating iterative random walks,” IEEE J. Biomed. Health Inform., 2018.
  • [2] F. Liu, M. Hernandez-Cabronero, V. Sanchez, M. W. Marcellin, and A. Bilgin, “The current role of image compression standards in medical imaging,” Information, vol. 8, no. 4, pp. 131, 2017.
  • [3] Y. Blau and T. Michaeli, “The perception-distortion tradeoff,” in Proc. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, Utah, USA, 2018, pp. 6228–6237.
  • [4] W. B. Pennebaker and J. L. Mitchell, JPEG: Still image data compression standard, 1992.
  • [5] M. Boliek, “JPEG 2000 image coding system: Core coding system,” ISO/IEC, 2002.
  • [6] M. J. Weinberger, G. Seroussi, and G. Sapiro, “The LOCO-I lossless image compression algorithm: Principles and standardization into JPEG-LS,” IEEE Trans. Image Process., vol. 9, no. 8, pp. 1309–1324, 2000.
  • [7] F. Dufaux, G. J. Sullivan, and T. Ebrahimi, “The JPEG XR image coding standard [standards in a nutshell],” IEEE Signal Process. Magazine, vol. 26, no. 6, 2009.
  • [8] J-R Ohm, Gary J Sullivan, Heiko Schwarz, Thiow Keng Tan, and Thomas Wiegand, “Comparison of the coding efficiency of video coding standards—including high efficiency video coding (hevc),” IEEE Trans Circuits Syst. Video Technol., vol. 22, no. 12, pp. 1669–1684, 2012.
  • [9] A. Kar, S. P. K. Karri, N. Ghosh, R. Sethuraman, and D. Sheet, “Fully convolutional model for variable bit length and lossy high density compression of mammograms,” Proc. Conf. Comp. Vis. Patt. Recog., 2018.
  • [10] F. Tom and D. Sheet, “Simulating patho-realistic ultrasound images using deep generative networks with adversarial learning,” in Proc. Int. Symp. Biomed. Imaging, 2018, pp. 1174–1177.
  • [11] V. Iglovikov and A. Shvets, “Ternausnet: U-net with vgg11 encoder pre-trained on imagenet for image segmentation,” arXiv preprint arXiv:1801.05746, 2018.
  • [12] K. Simonyan and A. Zisserman, “Very deep convolutional networks for large-scale image recognition,” arXiv preprint arXiv:1409.1556, 2014.
  • [13] J. C. Bamber and R. J. Dickinson, “Ultrasonic b-scanning: a computer simulation,” Phys. Med. Biol., vol. 25, no. 3, pp. 463, 1980.
  • [14] S. Balocco, C. Gatta, F. Ciompi, A. Wahle, P. Radeva, S. Carlier, G. Unal, E. Sanidas, J. Mauri, X. Carillo, et al., “Standardized evaluation methodology and reference database for evaluating ivus image segmentation,” Comput. Med. Imaging Graph., vol. 38, no. 2, pp. 70–90, 2014.