Image segmentation, the task of specifying the class of each pixel in an image, is one of the most active research areas in the medical imaging domain. In particular, image segmentation for biomedical imaging enables the identification of different tissues, biomedical structures, and organs from images, helping medical doctors diagnose diseases. However, manual image segmentation is a laborious task. Deep learning methods have been used to automate the process and alleviate the burden of segmenting images manually.
The rise of deep learning has enabled patients to have direct access to personal health analysis. Health monitoring apps on smartphones are now capable of monitoring medical risk factors, and medical health centers and hospitals are equipped with pre-trained models used in medical CAD systems to analyse MRI images. However, developing a high precision model often comes with various costs, such as a higher computational burden and a large model size. The latter requires many parameters to be stored in floating point precision, which demands substantial hardware resources to store and process images at test time. In medical domains, images typically have high resolution and can also be volumetric (the data has a depth in addition to width and height). Quantizing a neural network can reduce the feedforward computation time and, most importantly, the memory burden at inference. After quantization, a high precision (floating point) model is approximated with a lower bit resolution model. The goal is to leverage the advantages of quantization techniques while maintaining the accuracy of the full precision floating point model. Quantized models can then be deployed on devices with limited memory such as cell phones, or facilitate processing higher resolution images or bigger volumes of 3D data with the same memory budget. Such methods can reduce the memory required to store model parameters by up to 32x. In addition, the amount of hardware resources (the number of logic gates) required to perform low precision computing is much smaller than for a full precision model.
In this paper, we propose a fixed point quantization of U-Net , a popular segmentation architecture in the medical imaging domain. We provide comprehensive quantization results on the Spinal Cord Gray Matter Segmentation Challenge , the ISBI challenge for segmentation of neuronal structures in electron microscopic stacks , and the public National Institutes of Health (NIH) dataset for pancreas segmentation in abdominal CT scans . In summary, this work makes the following contributions:
We report the first fixed point quantization results on the U-Net architecture for the medical image segmentation task and show that the quantization methods previously proposed for U-Net cannot be implemented efficiently on commonly available hardware.
We quantify the impact of quantizing the weights and activations on the performance of the U-Net model on three different medical imaging datasets.
We report results comparable to a full precision segmentation model by using only 6 bits for activations and 4 bits for weights, effectively reducing the weight size by a factor of 8 and the activation size by a factor of about 5.3.
2 Related Works
2.1 Image Segmentation
Image segmentation is one of the central problems in medical imaging , commonly used to detect regions of interest such as tumors. Deep learning approaches have obtained state-of-the-art results in medical image segmentation [12, 21]. One of the most popular architectures for image segmentation is U-Net , along with the equivalent architectures proposed around the same time: ReCombinator Networks , SegNet , and DeconvNet , all designed to maintain pixel level information that is usually lost due to pooling layers. These models use an encoder-decoder architecture with skip connections, where the information in the encoder path is reintroduced by skip connections in the decoder path. This architecture has proved quite successful for many applications that require full image reconstruction while changing the modality of the data, as in image-to-image translation , semantic segmentation [19, 1, 15], and landmark localization [8, 14]. While the aforementioned models all propose essentially the same architecture, for simplicity we refer to them as U-Net models. U-Net type models have been very popular in the medical imaging domain and have also been applied to the 3 dimensional (3D) segmentation task . One problem with U-Net is its high memory usage due to full image reconstruction: all encoded features must be kept in memory and then used while reconstructing the final output. This can be quite demanding, especially for high resolution or 3D images. Quantization of weights and activations can reduce the memory required by this model, allowing images with a higher resolution or a bigger 3D volume to be processed at test time.
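To make the memory argument concrete, here is a back-of-the-envelope Python sketch of the memory needed just to hold the encoder feature maps that the skip connections keep alive, using the layer shapes from our 200x200 configuration (S.2). Which feature maps an implementation actually retains, and for how long, depends on the framework, so the numbers are illustrative only:

```python
def feature_mb(shape, bits):
    """Memory (MB) to hold one feature map of shape (C, H, W) at `bits` per value."""
    c, h, w = shape
    return c * h * w * bits / 8 / 1e6

# Encoder outputs retained for skip connections (shapes from the 200x200 U-Net in S.2).
skips = [(64, 200, 200), (128, 100, 100), (256, 50, 50)]

fp32 = sum(feature_mb(s, 32) for s in skips)  # full precision activations
q6 = sum(feature_mb(s, 6) for s in skips)     # 6-bit quantized activations

print(f"skip features: {fp32:.2f} MB at fp32 vs {q6:.2f} MB at 6 bits")  # 17.92 vs 3.36
```

The same arithmetic explains why quantized activations allow larger images or 3D volumes within a fixed memory budget.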
2.2 Quantization for Medical Imaging Segmentation
There are two approaches to quantizing a neural network, namely deterministic quantization and stochastic quantization . Although DNN quantization has been thoroughly studied [9, 25, 4], little effort has been devoted to developing quantization methods for medical image segmentation. In the following, we review recent works in this field.
Quantization in Fully Convolutional Networks: Quantization has been applied to Fully Convolutional Networks (FCN) for biomedical image segmentation . First, a quantization module is added to suggestive annotation in FCN; in suggestive annotation, a representative training dataset is used instead of the original dataset, which in turn increases accuracy. Next, the FCN segmentations are quantized using Incremental Quantization (INQ). The authors report that suggestive annotation with 7-bit INQ yields accuracy close to or better than that of a full precision model. In FCN, features of different resolutions are upsampled back to the image resolution and merged together right before the final output predictions. This approach is sub-optimal compared to U-Net, which upsamples features only to the next higher resolution, allowing the model to process them before they are passed to higher resolution layers. This gradual resolution increase during reconstruction acts as a form of conditional computation, where the higher resolution features are computed from the lower resolution ones. As reported in , this conditional computation results in faster convergence and higher accuracy for U-Net type architectures compared to FCN type architectures. Considering the aforementioned advantages of U-Net, in this paper we pursue the quantization of this model.
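As a rough illustration of the INQ idea (a simplified sketch, not the exact algorithm of the cited work, which also retrains the remaining weights and repeats with a growing quantized fraction), a single partition step snaps the largest-magnitude weights to signed powers of two while the rest stay in full precision. The function name and the 50% partition fraction used below are illustrative choices:

```python
import math

def inq_step(weights, frac):
    """One (simplified) INQ partition step: quantize the largest `frac` of the
    weights to signed powers of two; leave the rest full precision for retraining."""
    n = len(weights)
    # Indices of the largest-magnitude weights, quantized first as in INQ.
    idx = sorted(range(n), key=lambda i: -abs(weights[i]))[: int(n * frac)]
    out = list(weights)
    for i in idx:
        w = out[i]
        if w != 0.0:
            # Snap |w| to the nearest power of two, keep the sign.
            out[i] = math.copysign(2.0 ** round(math.log2(abs(w))), w)
    return out

print(inq_step([0.9, 0.3, -0.6, 0.05], 0.5))  # [1.0, 0.3, -0.5, 0.05]
```

Powers of two make the multiplications implementable as shifts, which is what gives INQ its hardware appeal.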
U-Net Quantization: TernaryNet  proposes the first quantization of U-Net. It introduces 1) a parameterized ternary hyperbolic tangent used as the activation function, and 2) a ternary convolution method that computes matrix multiplications very efficiently in Hamming space. The authors report a 15-fold decrease in memory requirements as well as a 10x speed-up at inference compared to the full precision model. Although this method shows a significant performance boost, in Section 4 we demonstrate that it is not an efficient method for currently available CPUs and GPUs.
We propose a fixed point quantization for U-Net. We start with a full precision (32 bit floating point) model as our baseline, and then use the following fixed point quantization function to quantize the parameters (weights and activations) in the inference path:
$$\mathrm{quantize}(x, n) = \big(\mathrm{round}(x \ll n)\big) \gg n \quad (1)$$

where the function $\mathrm{round}$ projects its input to the nearest integer, and $\ll$ and $\gg$ are shift left and right operators, respectively. In our simulation, shift left and right are implemented by multiplication and division in powers of 2. The $\mathrm{clamp}$ function is defined as:

$$\mathrm{clamp}(x, \mathit{min}, \mathit{max}) = \begin{cases} \mathit{min} & \text{if } x \le \mathit{min} \\ x & \text{if } \mathit{min} < x < \mathit{max} \\ \mathit{max} & \text{if } x \ge \mathit{max} \end{cases} \quad (2)$$
Equation (1) quantizes an input $x$ to the closest value that can be represented with $n$ fractional bits. To map any given number $x$ to its fixed point value, we first split the number into its integer and fractional parts using:

$$x_{\mathrm{int}} = \lfloor x \rfloor, \qquad x_{\mathrm{frac}} = x - \lfloor x \rfloor$$
and then use the following equation to convert $x$ to its fixed point representation, using the specified number of bits for the integer ($\mathit{IL}$) and fractional ($\mathit{FL}$) parts:

$$\mathrm{fix}(x, \mathit{IL}, \mathit{FL}) = \mathrm{clamp}\!\left(\mathrm{quantize}(x, \mathit{FL}),\; -2^{\mathit{IL}-1},\; 2^{\mathit{IL}-1} - 2^{-\mathit{FL}}\right) \quad (3)$$

where $\mathrm{quantize}$ rounds the fractional part $x_{\mathrm{frac}}$ to $\mathit{FL}$ bits and $\mathrm{clamp}$ saturates the integer part $x_{\mathrm{int}}$ to the range representable with $\mathit{IL}$ bits.
Equation (3) is a fixed point quantization function that maps a floating point number to the closest fixed point value with $\mathit{IL}$ integer and $\mathit{FL}$ fractional bits. Throughout this paper, we use the notation $Q_{\mathit{IL}.\mathit{FL}}$ to denote a fixed point quantization of a parameter using $\mathit{IL}$ bits to represent the integer part and $\mathit{FL}$ bits to represent the fractional part. Based on our experiments, we did not benefit from incremental quantization (INQ) as explained in . Although this method could work for higher precision models, for instance when using $Q_{8.8}$ fixed point (quantizing weights with 8 bits for the integer and 8 bits for the fractional part), for extreme quantization, learning from scratch gave us the best accuracy with the shortest learning time. As shown in Figure S1, in the full precision case the weights of all U-Net layers lie in $[-1, 1]$, hence no integer bits are required for weight quantization.
For numerical stability and to verify that gradients can propagate during training, we show that our quantization is differentiable almost everywhere. Starting from Equation (2), the derivative is:

$$\frac{\partial}{\partial x}\,\mathrm{clamp}(x, \mathit{min}, \mathit{max}) = \begin{cases} 1 & \text{if } \mathit{min} < x < \mathit{max} \\ 0 & \text{otherwise} \end{cases}$$

so gradients pass unchanged through the non-saturated range and are zero where the input saturates.
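The quantizer described by Equations (1)–(3) can be sketched in a few lines of Python. The saturation bounds used in `fix` below, a signed range of $[-2^{IL-1},\, 2^{IL-1} - 2^{-FL}]$, are our reading of the formulation rather than code taken from the implementation:

```python
def quantize(x, n):
    """Eq. (1): shift left by n bits, round to nearest integer, shift back.
    Shifts are simulated by multiplying/dividing by powers of 2."""
    return round(x * (1 << n)) / (1 << n)

def clamp(x, lo, hi):
    """Eq. (2): saturate x to the representable range [lo, hi]."""
    return max(lo, min(x, hi))

def fix(x, il, fl):
    """Eq. (3) (as reconstructed here): round to `fl` fractional bits, then
    saturate to the signed range spanned by `il` integer bits."""
    hi = 2.0 ** (il - 1) - 2.0 ** (-fl)
    lo = -(2.0 ** (il - 1))
    return clamp(quantize(x, fl), lo, hi)

print(fix(0.3, 2, 4))  # 0.3125 -- nearest multiple of 2^-4
print(fix(5.0, 2, 4))  # 1.9375 -- saturated to 2^1 - 2^-4
```

Note that Python's built-in `round` uses round-half-to-even; a hardware implementation might round half away from zero instead, changing results only at exact midpoints.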
3.2 Observations on U-Net Quantization
3.2.1 Dropout
Dropout  is a regularization technique that prevents over-fitting in DNNs. Although it is used in the original implementation of U-Net, we found that when it is applied along with quantization, accuracy drops significantly. Hence, in our implementation we removed dropout from all layers. This is because quantization itself acts as a strong regularizer, as reported in , so further regularization with dropout is not required. As shown in Figure S2, dropout reduces the accuracy at every quantized precision, with the gap growing even larger for lower precision quantizations.
3.2.2 Full Precision Layers
3.2.3 Batch Normalization
Batch normalization is a technique that improves the training speed and accuracy of DNNs. We used the PyTorch implementation of batch normalization. During training, we place the quantization block after the batchnorm block in each layer (S.2 lists all the layers in our U-Net implementation), so that batchnorm is first applied using floating point calculations and the quantized value is then passed to the next layer (i.e., the batchnorm block itself is not quantized during training). At inference, however, PyTorch folds the batchnorm parameters into the weights, effectively including the batchnorm parameters in the quantized model as part of the quantized weights.
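The folding performed at inference follows from simple algebra: a batchnorm applied after a linear operation can be absorbed into that operation's weight and bias. A per-channel sketch using scalars, with made-up parameter values for the consistency check:

```python
import math

def fold_bn(w, b, gamma, beta, mean, var, eps=1e-5):
    """Fold inference-time batchnorm y = gamma*(x - mean)/sqrt(var + eps) + beta
    into the preceding layer's weight/bias (one output channel, scalar form)."""
    scale = gamma / math.sqrt(var + eps)
    return w * scale, beta + (b - mean) * scale

# Check: conv output followed by batchnorm == folded conv, for one input x.
w, b = 0.5, 0.1
gamma, beta, mean, var = 1.2, -0.3, 0.05, 0.8
x = 2.0
y_ref = gamma * ((w * x + b) - mean) / math.sqrt(var + 1e-5) + beta
wf, bf = fold_bn(w, b, gamma, beta, mean, var)
assert abs((wf * x + bf) - y_ref) < 1e-9
```

After folding, only `wf` and `bf` need to be quantized and stored, which is why the batchnorm parameters end up inside the quantized weights.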
4 Results and Discussion
We implemented the U-Net model and our fixed point quantizer in PyTorch. We trained our models for 200 epochs with a batch size of 4. We applied our fixed point quantization along with TernaryNet  and binary  quantization on three different datasets: GM , EM , and NIH . For the GM and EM datasets we used an initial learning rate of , and for NIH we used an initial learning rate of . For all datasets we used Glorot initialization for the weights and a cosine annealing scheduler to reduce the learning rate during training. Please check our repository for the model and training details.
The NIH pancreas  dataset is composed of 82 3D abdominal CT scans and their corresponding pancreas segmentation images. We did not have access to the pre-processed dataset described in ; instead, we extracted 512x512 2-D slices from the original dataset and applied region of interest cropping, yielding 7059 2-D images of size 176x112, which we split into training and testing sets (80% and 20% respectively). For the GM and EM datasets, we used the provided data as described in  and  respectively. For both, we did not use any region of interest cropping and worked with images of size 200x200.
The image segmentation task for the GM and NIH pancreas datasets is imbalanced. As suggested in , instead of weighted cross-entropy we used a surrogate loss for the dice similarity coefficient. This loss is referred to as the dice loss and is formulated as $L_{\mathrm{dice}} = 1 - \frac{2\sum_i p_i g_i + \epsilon}{\sum_i p_i + \sum_i g_i + \epsilon}$, where $p_i$ and $g_i$ are prediction and ground truth pixels respectively (with 0 indicating not belonging and 1 indicating belonging to the class of interest) and $\epsilon$ is a small value added for numerical stability. For the EM dataset, a weighted sum of cross-entropy and dice loss produced the best results.
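A plain-Python sketch of one common formulation of this loss follows; the exact placement of the stability constant `eps` is an assumption (implementations vary), and during training each $p_i$ would be a soft prediction in $[0, 1]$ rather than a hard label:

```python
def dice_loss(pred, target, eps=1e-6):
    """Soft dice loss over flattened pixels: 1 - 2|P∩G| / (|P| + |G|).
    `pred` holds predictions in [0, 1], `target` holds 0/1 ground truth;
    `eps` guards against division by zero on empty masks."""
    inter = sum(p * g for p, g in zip(pred, target))
    total = sum(pred) + sum(target)
    return 1.0 - (2.0 * inter + eps) / (total + eps)

print(dice_loss([1, 1, 0, 0], [1, 1, 0, 0]))  # 0.0 -- perfect overlap
```

Unlike per-pixel cross-entropy, the loss is driven by overlap with the (small) foreground region, which is what makes it robust to class imbalance.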
Figure 1 along with Table 1 show different quantization methods on the aforementioned datasets. For the NIH dataset, Figure 1 (top) and Table 1 show that despite using only 1 and 2 bits to represent network parameters, binary and TernaryNet quantizations produce results that are close to the full precision model. However, for the other datasets, our fixed point quantization using 6 bits for activations and 4 bits for weights surpasses binary and TernaryNet quantization. The other important factor is how efficiently these quantization techniques can be implemented on current CPU and GPU hardware. At the time of writing, there is no commercially available CPU or GPU that can efficiently store and load sub-8-bit parameters of a neural network, which leaves custom bit-manipulation functions as the only way to make sub-8-bit quantization efficient. Moreover, in the case of TernaryNet, applying the floating point scaling factor after the ternary convolution requires floating point operations. Our fixed point quantization uses only integer operations, which require a smaller hardware footprint and less power than floating point operations. Finally, TernaryNet uses Tanh instead of ReLU for the activations. Using the hyperbolic tangent as an activation function increases training time and execution time at inference. To verify this, we evaluated the performance of ReLU and Tanh in a simple neural network with 3 fully connected layers, using Intel's OpenVINO  inference engine together with high performance gemm_blas and avx2 instructions. Table 2 shows that using ReLU instead of Tanh at training and inference can improve performance by up to 8 times. These results extend to U-Net, since activation inference time is only a function of the input size. To compensate for this computation time, TernaryNet implements an efficient ternary convolution that gains up to 8 times in performance; at inference, an efficient Tanh can be implemented that uses only two comparators to evaluate Tanh for ternary values. Considering accuracy, when Tanh is used as the activation function, the full precision accuracy is lower compared to ReLU . We observe similar behavior in the results reported in Table 1.
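The two-comparator Tanh for ternary values mentioned above can be sketched as follows; the threshold value is a hypothetical choice for illustration, not taken from the TernaryNet implementation:

```python
def ternary_tanh(x, threshold=0.5):
    """Tanh restricted to ternary outputs {-1, 0, +1}: since tanh is monotonic
    and odd, only two comparisons are needed -- no transcendental evaluation."""
    if x > threshold:
        return 1
    if x < -threshold:
        return -1
    return 0

print([ternary_tanh(x) for x in (-0.9, 0.1, 0.9)])  # [-1, 0, 1]
```

This is why the Tanh cost can be hidden at inference for ternary activations, even though a full precision Tanh remains much more expensive than ReLU.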
Our fixed point quantizer provides a flexible trade-off between accuracy and memory, which makes it a practical solution for current CPUs and GPUs: it does not require floating point operations and leverages the more efficient ReLU function. As opposed to the BNN and TernaryNet quantizations, Table 1 shows that our approach to quantizing U-Net provides consistent results across all three datasets.
| Quantization | Model Size | EM Dataset (ReLU / Tanh) | GM Dataset (ReLU / Tanh) | NIH Pancreas |
|---|---|---|---|---|
| Full Precision | 18.48 MBytes | 94.05 / 93.02 | 56.32 / 56.26 | 75.69 |
| BNN  | 0.56 MBytes | 78.53 / - | 31.44 / - | 72.56 |
| TernaryNet  | 1.15 MBytes | - / 82.66 | - / 43.02 | 73.9 |
In this work, we proposed a fixed point quantization method for the U-Net architecture and evaluated it on the medical image segmentation task. We reported quantization results on three different semantic segmentation datasets and showed that our fixed point quantization produces more accurate and more consistent results across these datasets than other quantization techniques. We also demonstrated that Tanh, as the activation function, reduces the baseline accuracy and adds computational complexity in both training and inference. Our proposed fixed point quantization technique provides a trade-off between accuracy and the required memory, does not require floating point computation, and is better suited to currently available CPU and GPU hardware.
-  (2017) SegNet: a deep convolutional encoder-decoder architecture for image segmentation. TPAMI 39 (12), pp. 2481–2495.
-  (2010) An integrated micro- and macroarchitectural analysis of the Drosophila brain by computer-assisted serial section electron microscopy. PLoS Biology 8 (10), pp. e1000502.
-  (2016) 3D U-Net: learning dense volumetric segmentation from sparse annotation. In MICCAI, pp. 424–432.
-  (2015) BinaryConnect: training deep neural networks with binary weights during propagations. In NeurIPS, pp. 3123–3131.
-  (2019) Release notes for Intel® Distribution of OpenVINO™ toolkit 2019. Accessed June 13th, 2019.
-  (2018) TernaryNet: faster deep model inference without GPUs for medical 3D segmentation using sparse and binary convolutions. CoRR abs/1801.09449.
-  Neural networks for machine learning, video lectures. Coursera.
-  (2016) Recombinator Networks: learning coarse-to-fine feature aggregation. In CVPR, pp. 5743–5752.
-  (2018) Quantized neural networks: training neural networks with low precision weights and activations. JMLR 18 (187), pp. 1–30.
-  Image-to-image translation with conditional adversarial networks. In CVPR.
-  (2012) ImageNet classification with deep convolutional neural networks. In NeurIPS 25, pp. 1097–1105.
-  (2017) A survey on deep learning in medical image analysis. Medical Image Analysis 42, pp. 60–88.
-  (2017) Deep learning for healthcare: review, opportunities and challenges. Briefings in Bioinformatics 19 (6), pp. 1236–1246.
-  (2016) Stacked hourglass networks for human pose estimation. In ECCV, pp. 483–499.
-  (2015) Learning deconvolution network for semantic segmentation. In ICCV, pp. 1520–1528.
-  (2019) The role of deep learning in improving healthcare. Data Science for Healthcare.
-  (2000) Current methods in medical image segmentation. Annual Review of Biomedical Engineering 2 (1), pp. 315–337.
-  (2017) Spinal cord grey matter segmentation challenge. NeuroImage.
-  (2015) U-Net: convolutional networks for biomedical image segmentation. In MICCAI, pp. 234–241.
-  (2015) DeepOrgan: multi-level deep convolutional networks for automated pancreas segmentation. MICCAI.
-  (2017) Deep learning in medical image analysis. Annual Review of Biomedical Engineering 19, pp. 221–248.
-  (2014) Dropout: a simple way to prevent neural networks from overfitting. JMLR 15 (1), pp. 1929–1958.
-  (2017) How to train a compact binary neural network with high accuracy? In AAAI.
-  (2018) Quantization of fully convolutional networks for accurate biomedical image segmentation. In CVPR, pp. 8300–8308.
-  (2017) Incremental network quantization: towards lossless CNNs with low-precision weights. CoRR abs/1702.03044.
S.1 Weight Visualization of Full-Precision U-Net
S.2 Model Architecture
----------------------------------------------------------------
Layer (type)          Output Shape         Param #
================================================================
Conv2d-1              [64, 200, 200]       640
BatchNorm2d-2         [64, 200, 200]       128
QuantLayer-3          [64, 200, 200]       0
Conv2d-4              [64, 200, 200]       36,928
BatchNorm2d-5         [64, 200, 200]       128
QuantLayer-6          [64, 200, 200]       0
DownConv-7            [64, 200, 200]       0
MaxPool2d-8           [64, 100, 100]       0
Conv2dQuant-9         [128, 100, 100]      73,856
BatchNorm2d-10        [128, 100, 100]      256
QuantLayer-11         [128, 100, 100]      0
Conv2dQuant-12        [128, 100, 100]      147,584
BatchNorm2d-13        [128, 100, 100]      256
QuantLayer-14         [128, 100, 100]      0
DownConv-15           [128, 100, 100]      0
MaxPool2d-16          [128, 50, 50]        0
Conv2dQuant-17        [256, 50, 50]        295,168
BatchNorm2d-18        [256, 50, 50]        512
QuantLayer-19         [256, 50, 50]        0
Conv2dQuant-20        [256, 50, 50]        590,080
BatchNorm2d-21        [256, 50, 50]        512
QuantLayer-22         [256, 50, 50]        0
DownConv-23           [256, 50, 50]        0
MaxPool2d-24          [256, 25, 25]        0
Conv2dQuant-25        [256, 25, 25]        590,080
BatchNorm2d-26        [256, 25, 25]        512
QuantLayer-27         [256, 25, 25]        0
Conv2dQuant-28        [256, 25, 25]        590,080
BatchNorm2d-29        [256, 25, 25]        512
QuantLayer-30         [256, 25, 25]        0
DownConv-31           [256, 25, 25]        0
Upsample-32           [256, 50, 50]        0
Conv2dQuant-33        [256, 50, 50]        1,179,904
BatchNorm2d-34        [256, 50, 50]        512
QuantLayer-35         [256, 50, 50]        0
Conv2dQuant-36        [256, 50, 50]        590,080
BatchNorm2d-37        [256, 50, 50]        512
QuantLayer-38         [256, 50, 50]        0
DownConv-39           [256, 50, 50]        0
UpConv-40             [256, 50, 50]        0
Upsample-41           [256, 100, 100]      0
Conv2dQuant-42        [128, 100, 100]      442,496
BatchNorm2d-43        [128, 100, 100]      256
QuantLayer-44         [128, 100, 100]      0
Conv2dQuant-45        [128, 100, 100]      147,584
BatchNorm2d-46        [128, 100, 100]      256
QuantLayer-47         [128, 100, 100]      0
DownConv-48           [128, 100, 100]      0
UpConv-49             [128, 100, 100]      0
Upsample-50           [128, 200, 200]      0
Conv2dQuant-51        [64, 200, 200]       110,656
BatchNorm2d-52        [64, 200, 200]       128
QuantLayer-53         [64, 200, 200]       0
Conv2dQuant-54        [64, 200, 200]       36,928
BatchNorm2d-55        [64, 200, 200]       128
QuantLayer-56         [64, 200, 200]       0
DownConv-57           [64, 200, 200]       0
UpConv-58             [64, 200, 200]       0
Conv2d-59             [1, 200, 200]        577
================================================================
Total params: 4,837,249
Trainable params: 4,837,249
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.15
Forward/backward pass size (MB): 593.57
Params size (MB): 18.45
Estimated Total Size (MB): 612.17
----------------------------------------------------------------