TernaryNet: Faster Deep Model Inference without GPUs for Medical 3D Segmentation using Sparse and Binary Convolutions

01/29/2018
by Mattias P. Heinrich, et al.

Deep convolutional neural networks (DCNN) are currently ubiquitous in medical imaging. While their versatility and the quality of their results for common image analysis tasks, including segmentation, localisation and prediction, are astonishing, this large representational power comes at the cost of highly demanding computation. This limits their practical use for image-guided interventions and diagnostic (point-of-care) support on mobile devices without graphics processing units (GPUs). We propose a new scheme that approximates both the trainable weights and the neural activations of a deep network by ternary values, and tackles the open question of backpropagation through non-differentiable functions. Our solution removes the expensive floating-point matrix multiplications throughout any convolutional neural network and replaces them with energy- and time-efficient binary operators and population counts. Our approach, demonstrated on a fully-convolutional network (FCN) for CT pancreas segmentation, reduces memory requirements more than 10-fold, and we provide a concept for sub-second inference without GPUs. Our ternary approximation obtains high accuracy (without any post-processing), with a Dice overlap of 71.0% that is statistically equivalent to that of networks with high-precision weights and activations. We further demonstrate significant improvements over binary quantisation and over omitting our proposed ternary hyperbolic-tangent continuation. We present a key enabling technique for highly efficient DCNN inference without GPUs that will help bring the advances of deep learning to practical clinical applications, and that also holds great promise for improving accuracy in large-scale medical data retrieval.
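To illustrate the core idea of replacing floating-point multiply-accumulates with binary operators and population counts, here is a minimal sketch (not the authors' implementation; the packing scheme and function names are illustrative assumptions) of a dot product between two ternary vectors with values in {-1, 0, +1}:

```python
# Illustrative sketch only: one way to compute a ternary dot product with
# bitwise operations and population counts, as the abstract describes.
# Each ternary vector is packed into two bitmasks:
#   mask: bit i is set iff value i is non-zero
#   sign: bit i is set iff value i is +1
def pack_ternary(values):
    mask = sign = 0
    for i, v in enumerate(values):
        if v != 0:
            mask |= 1 << i
            if v > 0:
                sign |= 1 << i
    return mask, sign

def ternary_dot(a, b):
    (ma, sa), (mb, sb) = a, b
    both = ma & mb             # positions where both operands are non-zero
    agree = both & ~(sa ^ sb)  # equal signs there -> product is +1
    # dot = (#agreements) - (#disagreements) = 2*(#agreements) - (#non-zero pairs)
    return 2 * bin(agree).count("1") - bin(both).count("1")

x = pack_ternary([1, -1, 0, 1])
w = pack_ternary([1, 1, -1, 1])
# 1*1 + (-1)*1 + 0*(-1) + 1*1 = 1
print(ternary_dot(x, w))
```

In a real convolution kernel the masks would be packed into machine words and the popcounts done in hardware (e.g. SSE/NEON `popcnt`), which is what makes GPU-free inference fast; the zero state also yields sparsity, since positions where either mask bit is clear contribute nothing.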

