Ultrasound (US) has been in use for medical imaging for more than three decades, on account of salient advantages including its relatively low cost of ownership and operation, non-ionizing radiation, real-time imaging capability, high resolution, and suitability for both inside-out and outside-in imaging. In recent years there have been significant advancements beyond conventional 2D US imaging, with 2D+T, 3D, and 3D+T modes maturing and seeing wider deployment in mainstream medical imaging. This has increased the quantity of data needed to store these images, creating difficulties in storage and transmission. Although image compression has largely solved this problem for consumer-grade camera images and some medical images, ultrasound image compression remains challenging because of speckle patterns, which are hard to preserve when compressed with existing standards like JPEG and JPEG2000. In this paper, we propose a framework for ultrasound image compression in which the compressor and the decompressor are designed taking into account the physics of the ultrasound image formation process, as illustrated in Fig. 1(a), to obtain decompressed images (Fig. 1(c)) that are perceptually similar to the original image (Fig. 1(b)), as is also evident in Fig. 1(d) representing the perception-distortion trade-off.
2 Prior Art
The significant growth in digital imaging over the last four decades has resulted in an increase in the complexity associated with transmission and storage of images. The International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC) jointly introduced the first standard for image compression, popularly referred to as the Joint Photographic Experts Group (JPEG) standard. These developments were later followed by the genesis of new standards such as JPEG2000, JPEG-LS, JPEG-XR, and HEVC/H.265 for video compression.
In line with the growth of digitally acquired and stored medical images, the requirement of standards for digital imaging and communication in medicine (DICOM) (ftp://medical.nema.org/medical/dicom/1992-1995/) also grew. Subsequently, JPEG, JPEG 2000, JPEG-LS, H.264, and HEVC (ftp://medical.nema.org/medical/dicom/supps/PC/) were included within DICOM and used for radiological images, and information on clinically acceptable compression factors was laid out for each of the specific imaging modalities by different societies.
However, the use of compression for US images has been limited to JPEG and JPEG2000, and restricted to anatomical regions such as the breast, musculoskeletal structures, and pediatric imaging. Recently, methods employing deep neural networks have been introduced which allow high-density compression of medical images while retaining diagnostically relevant features. The motivation of this paper is to address this challenge by introducing a framework whose primary ability is to preserve the key information rendered through speckles, enabling its clinically relevant use.
The image compression framework we propose here consists of a compressor and a decompressor block, with the compressed file constituting the commonly exchanged information between them. The key idea is to segment an image into anatomically relevant regions and transmit these to the decompressor, which is essentially an image generator that generates the image from the compressed file.
The compressor consists of a segmentation engine which splits the ultrasound image into anatomical and pathological segments and transmits them. We demonstrate generalizability by implementing the segmentation using two recently published methods: (i) a classical machine learning based approach with cross-frame belief propagating iterative random walks, and (ii) a convolutional neural network based approach implemented with U-Net.
Classical machine learning based approach for segmentation: This approach is based on a statistical mechanical understanding of ultrasound-tissue interaction. A parametric model of ultrasonic speckle statistics is estimated and used to learn a random forest classifier for pixelwise classification of tissues. This model is used for contour initialization. Subsequently, iterative random walks are used for correcting the contours. Gradient vector flow based belief propagation is then applied to subsequent neighboring frames for initializing the random walks, and this process is performed iteratively for volume segmentation.
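As a rough illustration of the pixelwise classification step, the sketch below trains a random forest on simple local speckle statistics (mean and variance of the envelope intensity) rather than the full parametric speckle model of the paper; the feature choice and window size are illustrative assumptions.

```python
# Sketch: pixelwise tissue classification from local speckle statistics.
# A stand-in for the paper's parametric speckle model; features and
# window size are illustrative assumptions.
import numpy as np
from scipy.ndimage import uniform_filter
from sklearn.ensemble import RandomForestClassifier

def speckle_features(img, win=9):
    """Local mean and variance of envelope intensity as simple speckle statistics."""
    mean = uniform_filter(img.astype(float), win)
    sq_mean = uniform_filter(img.astype(float) ** 2, win)
    var = np.clip(sq_mean - mean ** 2, 0, None)
    return np.stack([mean.ravel(), var.ravel()], axis=1)

# Toy data: two synthetic "tissues" with different Rayleigh speckle energy.
rng = np.random.default_rng(0)
img = np.concatenate([rng.rayleigh(1.0, (32, 64)),
                      rng.rayleigh(3.0, (32, 64))], axis=0)
labels = np.concatenate([np.zeros(32 * 64, int), np.ones(32 * 64, int)])

clf = RandomForestClassifier(n_estimators=50, random_state=0)
clf.fit(speckle_features(img), labels)
pred = clf.predict(speckle_features(img)).reshape(img.shape)
```

In the actual pipeline, the resulting class map only initializes contours, which the iterative random walks then refine.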
Convolutional neural network based approach for segmentation: A U-Net with a VGG11 encoder pre-trained on ImageNet is used to initialize the weights in the encoder unit. The decoder weights are randomly initialized. The ReLU activation function is used after the convolution layers throughout the network. In the final layer of the decoder, the sigmoid activation function is used to generate a pixel-wise segmentation response map. The weighted cross-entropy (WCE) loss function is used, giving higher weights to the pixels closer to the boundary using the morphological distance transform.
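A minimal sketch of the boundary-weighted cross-entropy described above: pixels near the segmentation boundary receive larger weights derived from the morphological distance transform. The exact weighting formula (a Gaussian of the boundary distance) is an assumption for illustration.

```python
# Sketch: cross-entropy weighted by distance to the mask boundary.
# The Gaussian weighting formula and its parameters are assumptions.
import numpy as np
from scipy.ndimage import distance_transform_edt

def weighted_bce(pred, target, w0=10.0, sigma=5.0, eps=1e-7):
    """Binary cross-entropy with extra weight near the mask boundary."""
    # Distance of every pixel to the nearest boundary pixel.
    d_in = distance_transform_edt(target)
    d_out = distance_transform_edt(1 - target)
    d = np.minimum(d_in, d_out)
    weights = 1.0 + w0 * np.exp(-(d ** 2) / (2 * sigma ** 2))
    bce = -(target * np.log(pred + eps) + (1 - target) * np.log(1 - pred + eps))
    return float((weights * bce).mean())

mask = np.zeros((16, 16)); mask[4:12, 4:12] = 1
good = np.clip(mask, 0.05, 0.95)      # prediction close to the mask
bad = np.clip(1 - mask, 0.05, 0.95)   # inverted prediction
```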
File format: The file format for transmission of compressed data contains four necessary pieces of information. Chain codes of the contour for the different classes of tissue are stored in the probe geometry specific polar coordinate format. The number of scan lines that make up the polar image is included for reconstruction of the segmented contours in the decompressor. The frame size in the Cartesian coordinate system is also included. The acquisition frequency is included as well, for the reconstruction of the images using a generative model.
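The paper does not specify a binary layout for these four fields, so the sketch below serializes them with one hypothetical choice (little-endian header, one byte per chain-code move even though 3 bits would suffice); every size and field order here is an assumption.

```python
# Hypothetical serialization of the four fields of the compressed file.
# The struct layout is an assumption, not the paper's actual format.
import struct

def pack_compressed_frame(chain_codes, n_scan_lines, frame_h, frame_w, freq_mhz):
    """chain_codes: dict mapping tissue-class id -> (start_point, list of moves)."""
    out = struct.pack("<HHHf", n_scan_lines, frame_h, frame_w, freq_mhz)
    out += struct.pack("<B", len(chain_codes))
    for cls, (start, moves) in chain_codes.items():
        out += struct.pack("<BHHI", cls, start[0], start[1], len(moves))
        # One byte per chain-code move for simplicity (3 bits would suffice).
        out += bytes(moves)
    return out

blob = pack_compressed_frame({0: ((10, 20), [0, 1, 2, 3])}, 256, 384, 384, 20.0)
```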
For decompression, an adversarially trained generative convolutional network is employed to simulate patho-realistic ultrasound images from tissue echogenicity maps recreated from the received file. This generator involves two stages and is based on the framework proposed in .
Stage 0: Physics based simulation: This first stage simulation is performed using a pseudo B-mode physics based simulator which works on the principle of linear and space invariant nature of point spread function (PSF) of speckle in the image .
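Since the stage 0 simulator assumes a linear, space-invariant point spread function, it can be sketched as an echogenicity-weighted random scatterer map convolved with a fixed PSF (here a Gaussian envelope modulated by a cosine at the transducer frequency); all PSF parameters below are illustrative assumptions.

```python
# Sketch: linear, space-invariant pseudo B-mode simulation.
# PSF parameters (f0, fs, sigmas, kernel size) are assumptions.
import numpy as np
from scipy.signal import fftconvolve

def pseudo_bmode(echogenicity, f0=5.0, fs=50.0, sigma_ax=1.5, sigma_lat=3.0):
    rng = np.random.default_rng(0)
    # Random scatterers weighted by tissue echogenicity.
    scatterers = echogenicity * rng.normal(size=echogenicity.shape)
    ax = np.arange(-8, 9); lat = np.arange(-8, 9)
    A, L = np.meshgrid(ax, lat, indexing="ij")
    # Gaussian envelope modulated axially at the transducer frequency f0.
    psf = np.exp(-A**2 / (2 * sigma_ax**2) - L**2 / (2 * sigma_lat**2)) \
          * np.cos(2 * np.pi * f0 / fs * A)
    rf = fftconvolve(scatterers, psf, mode="same")
    env = np.abs(rf)                    # crude envelope detection
    return env / env.max()

tissue = np.ones((64, 64)); tissue[20:40, 20:40] = 2.0   # brighter inclusion
img = pseudo_bmode(tissue)
```

The convolution produces the characteristic fully developed speckle; the brighter inclusion stays visible through its higher mean envelope.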
Stage I: Speckle intensity and point spread function learning: In this stage, we modify the architecture in  to enable direct learning of the mapping from Stage 0 simulated results to patho-realistic ultrasound images using a single generator. This allows better preservation of the ground truth annotation, which is an important diagnostically relevant feature. The proposed single-stage model has the added advantage of being more computationally efficient, reducing the inference time by half.
The generator has an encoder-decoder architecture that enables it to learn the intensity mapping and the point spread function. The feature maps are fed to residual blocks, followed by nearest neighbour upsampling and convolutions. The discriminator has downsampling blocks that bring down the dimension, followed by a convolution layer and a fully connected layer for prediction. The LeakyReLU activation function is used in the downsampling blocks. Similar to the network in , a self-regularization term is also included in the GAN loss to preserve the ground truth information; its weight was chosen as 0.01 via experimentation.
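The generator objective with the self-regularization term can be sketched as below, in NumPy for self-containedness: the adversarial term plus a penalty (weight 0.01) tying the output to the stage 0 input so that ground truth structure is preserved. The L1 form of the regularizer is an assumption.

```python
# Sketch: generator loss = adversarial term + 0.01 * self-regularization.
# The L1 form of the self-regularizer is an assumption.
import numpy as np

def generator_loss(d_fake_probs, fake, stage0_input, lam=0.01, eps=1e-7):
    # Adversarial term: the generator wants the discriminator to say "real".
    adv = -np.mean(np.log(d_fake_probs + eps))
    # Self-regularization: distance from the stage-0 echogenicity input.
    self_reg = np.mean(np.abs(fake - stage0_input))
    return adv + lam * self_reg
```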
4 Experiments, Results and Discussion
In this experiment, we have used the IVUS pullback data from the border detection in IVUS challenge (http://www.cvc.uab.es/IVUSchallenge2011/dataset.html). The dataset consists of ten pullbacks, one per patient, acquired at MHz. One pullback was used for reporting performance and the remaining nine were used for training. Augmentation was performed by axially rotating the pullbacks by 30° over 12 steps. The implementation uses Ubuntu LTS, Python 3.6, PyTorch 0.5, and CUDA and cuDNN for acceleration, with a Nvidia Tesla K40c GPU with 12 GB of RAM on a PC with an Intel Core i5-8600K processor and 32 GB of system RAM.
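The axial-rotation augmentation above is cheap in polar coordinates: rotating an IVUS frame by 30° is a circular shift along the angular (scan-line) axis. The sketch below assumes 360 scan lines per frame, which is an illustrative choice.

```python
# Sketch: axial rotation of a polar IVUS pullback as a circular shift
# along the scan-line axis. 360 scan lines per frame is an assumption.
import numpy as np

def rotate_pullback(frames, step_deg=30, n_lines=360):
    """frames: (n_frames, n_lines, depth) polar pullback -> 12 rotated copies."""
    shift = int(n_lines * step_deg / 360)
    return [np.roll(frames, k * shift, axis=1) for k in range(360 // step_deg)]

pullback = np.arange(2 * 360 * 8).reshape(2, 360, 8)
augmented = rotate_pullback(pullback)
```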
The compressor with the classical machine learning approach was implemented following . The U-Net based compressor was trained with a learning rate of and batch size of 14 using the Adam optimizer. The decompressor was trained with an initial learning rate of and a weight decay of per epochs over epochs with a batch size of , following .
Table 1: Inter-tissue JS divergence (Lumen vs. Media, Media vs. Ext., Lumen vs. Ext.) for the original and decompressed images. Ext. denotes external elastic laminae. Numbers in parentheses correspond to standard deviation.
The original and the decompressed images are compared using the inter-tissue Jensen-Shannon (JS) divergence of their speckle appearance and the results are reported in Table 1. We compared the proposed method with the prior art using intra-tissue JS divergence assessed between speckle statistics in Table 2 and divergence in attenuation in Table 3. Subsequently, we compared the impact on image segmentation using two strategies for segmentation. In the first case, we used a CNN within the compressor and a classical machine learning based approach for segmentation during validation, whose results are reported in Table 4. In the second case, the compression and segmentation approaches were reversed and the results are reported in Table 5.
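The evaluation metric above can be sketched as the Jensen-Shannon divergence between intensity histograms of corresponding tissue regions in the original and decompressed images; the histogram binning here is an illustrative choice.

```python
# Sketch: JS divergence between intensity histograms of two tissue regions.
# The 64-bin histogram over [0, 1] is an illustrative choice.
import numpy as np
from scipy.spatial.distance import jensenshannon

def js_divergence(region_a, region_b, bins=64):
    p, _ = np.histogram(region_a, bins=bins, range=(0, 1), density=True)
    q, _ = np.histogram(region_b, bins=bins, range=(0, 1), density=True)
    # scipy returns the JS distance (sqrt of the divergence), so square it.
    return jensenshannon(p, q) ** 2

rng = np.random.default_rng(0)
same = js_divergence(rng.beta(2, 5, 4096), rng.beta(2, 5, 4096))
diff = js_divergence(rng.beta(2, 5, 4096), rng.beta(5, 2, 4096))
```

Low intra-tissue divergence between original and decompressed regions indicates preserved speckle statistics, while high inter-tissue divergence indicates preserved tissue contrast.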
Here, an uncompressed polar image occupies bits. For the chain code used, the starting point occupies bits and the remaining points occupy bits each. Two contours, for the lumen and the media, are stored, and thus the compression factor becomes . In spite of achieving a high compression ratio, the decompressed images have very low intra-tissue JS divergence values, indicating a high degree of similarity to the original images. From Table 1, it is observed that the tissue-specific layer mapping is better than in the existing methods, because the inter-tissue JS divergence is higher and close to that of the original images. Additionally, the inter-tissue statistical mechanics has been mapped properly in the decompressed image, since the classical method based on the statistical mechanical understanding of ultrasound-tissue interaction was able to segment the lumen and the external elastic laminae precisely. In a paired visual Turing test, observers had only a 50% chance of distinguishing the real from the decompressed images, i.e., chance level. Also, the compressor and decompressor blocks in this framework can be replaced with other segmentation and generation methods to further improve performance.
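The compression-factor arithmetic above can be made concrete as raw bits divided by chain-code bits. Since the paper's numeric values are omitted here, every concrete size in this sketch (frame dimensions, bits per sample, contour length, bit widths) is an assumption for illustration.

```python
# Back-of-the-envelope compression factor for the chain-code format.
# All concrete sizes below are assumptions; the paper's values are omitted.
def compression_factor(n_lines=256, depth=1024, bits_per_sample=8,
                       contours=2, points_per_contour=256,
                       start_bits=20, move_bits=3):
    raw = n_lines * depth * bits_per_sample             # uncompressed polar frame
    compressed = contours * (start_bits + (points_per_contour - 1) * move_bits)
    return raw / compressed

factor = compression_factor()
```

Because only contour geometry is transmitted and the speckle texture is regenerated, the factor is orders of magnitude beyond transform-coding standards.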
In this paper, we have proposed a framework for high-density compression of ultrasound volumes. This framework involves two parts: a compressor with a segmentation block and a decompressor with a generation block. Both the segmentation and generation blocks can be customized with different algorithms. Here we have used classical machine learning and a convolutional neural network alternately in the compressor block, and a generative adversarial network in the decompressor block. The quality of image compression has been evaluated using inter- and intra-tissue JS divergence between the original and the decompressed images.
-  D. China, A. Illanes, P. Poudel, M. Friebe, P. Mitra, and D. Sheet, “Anatomical structure segmentation in ultrasound volumes using cross frame belief propagating iterative random walks,” IEEE J. Biomed. Health Inform., 2018.
-  F. Liu, M. Hernandez-Cabronero, V. Sanchez, M. W. Marcellin, and A. Bilgin, “The current role of image compression standards in medical imaging,” Information, vol. 8, no. 4, p. 131, 2017.
-  Y. Blau and T. Michaeli, “The perception-distortion tradeoff,” in , 2018, pp. 6228–6237.
-  W. B. Pennebaker and J. L. Mitchell, JPEG: Still Image Data Compression Standard, 1992.
-  M. Boliek, “JPEG 2000 image coding system: Core coding system,” ISO/IEC, 2002.
-  M. J. Weinberger, G. Seroussi, and G. Sapiro, “The LOCO-I lossless image compression algorithm: Principles and standardization into JPEG-LS,” IEEE Trans. Image Process., vol. 9, no. 8, pp. 1309–1324, 2000.
-  F. Dufaux, G. J. Sullivan, and T. Ebrahimi, “The JPEG XR image coding standard [Standards in a Nutshell],” IEEE Signal Process. Mag., vol. 26, no. 6, 2009.
-  J.-R. Ohm, G. J. Sullivan, H. Schwarz, T. K. Tan, and T. Wiegand, “Comparison of the coding efficiency of video coding standards—including High Efficiency Video Coding (HEVC),” IEEE Trans. Circuits Syst. Video Technol., vol. 22, no. 12, pp. 1669–1684, 2012.
-  A. Kar, S. P. K. Karri, N. Ghosh, R. Sethuraman, and D. Sheet, “Fully convolutional model for variable bit length and lossy high density compression of mammograms,” Proc. Conf. Comp. Vis. Patt. Recog., 2018.
-  F. Tom and D. Sheet, “Simulating patho-realistic ultrasound images using deep generative networks with adversarial learning,” in Proc. Int. Symp. Biomed. Imaging, 2018, pp. 1174–1177.
-  V. Iglovikov and A. Shvets, “Ternausnet: U-net with vgg11 encoder pre-trained on imagenet for image segmentation,” arXiv preprint arXiv:1801.05746, 2018.
-  K. Simonyan and A. Zisserman, “Very deep convolutional networks for large-scale image recognition,” arXiv preprint arXiv:1409.1556, 2014.
-  J. C. Bamber and R. J. Dickinson, “Ultrasonic b-scanning: a computer simulation,” Phys. Med. Biol., vol. 25, no. 3, pp. 463, 1980.
-  S. Balocco, C. Gatta, F. Ciompi, A. Wahle, P. Radeva, S. Carlier, G. Unal, E. Sanidas, J. Mauri, X. Carillo, et al., “Standardized evaluation methodology and reference database for evaluating ivus image segmentation,” Comput. Med. Imaging Graph., vol. 38, no. 2, pp. 70–90, 2014.