U-Net Fixed-Point Quantization for Medical Image Segmentation

Model quantization is leveraged to reduce the memory consumption and the computation time of deep neural networks. This is achieved by representing weights and activations with a lower bit resolution when compared to their high precision floating point counterparts. The suitable level of quantization is directly related to the model performance. Lowering the quantization precision (e.g. 2 bits), reduces the amount of memory required to store model parameters and the amount of logic required to implement computational blocks, which contributes to reducing the power consumption of the entire system. These benefits typically come at the cost of reduced accuracy. The main challenge is to quantize a network as much as possible, while maintaining the performance accuracy. In this work, we present a quantization method for the U-Net architecture, a popular model in medical image segmentation. We then apply our quantization algorithm to three datasets: (1) the Spinal Cord Gray Matter Segmentation (GM), (2) the ISBI challenge for segmentation of neuronal structures in Electron Microscopic (EM), and (3) the public National Institute of Health (NIH) dataset for pancreas segmentation in abdominal CT scans. The reported results demonstrate that with only 4 bits for weights and 6 bits for activations, we obtain 8 fold reduction in memory requirements while loosing only 2.21 respectively. Our fixed point quantization provides a flexible trade off between accuracy and memory requirement which is not provided by previous quantization methods for U-Net such as TernaryNet.

READ FULL TEXT

page 6

page 13

page 14

page 15

research
06/21/2018

Quantizing deep convolutional networks for efficient inference: A whitepaper

We present an overview of techniques for quantizing convolutional neural...
research
02/10/2022

F8Net: Fixed-Point 8-bit Only Multiplication for Network Quantization

Neural network quantization is a promising compression technique to redu...
research
08/12/2020

FATNN: Fast and Accurate Ternary Neural Networks

Ternary Neural Networks (TNNs) have received much attention due to being...
research
11/19/2019

IFQ-Net: Integrated Fixed-point Quantization Networks for Embedded Vision

Deploying deep models on embedded devices has been a challenging problem...
research
01/30/2023

Self-Compressing Neural Networks

This work focuses on reducing neural network size, which is a major driv...
research
03/13/2018

Quantization of Fully Convolutional Networks for Accurate Biomedical Image Segmentation

With pervasive applications of medical imaging in health-care, biomedica...
research
03/04/2023

Fixed-point quantization aware training for on-device keyword-spotting

Fixed-point (FXP) inference has proven suitable for embedded devices wit...

Please sign up or login with your details

Forgot password? Click here to reset