DeepAI AI Chat
Log In Sign Up

Binarized Convolutional Neural Networks for Efficient Inference on GPUs

08/01/2018
by   Mir Khan, et al.
Tampereen teknillinen yliopisto
0

Convolutional neural networks have recently achieved significant breakthroughs in various image classification tasks. However, they are computationally expensive,which can make their feasible mplementation on embedded and low-power devices difficult. In this paper convolutional neural network binarization is implemented on GPU-based platforms for real-time inference on resource constrained devices. In binarized networks, all weights and intermediate computations between layers are quantized to +1 and -1, allowing multiplications and additions to be replaced with bit-wise operations between 32-bit words. This representation completely eliminates the need for floating point multiplications and additions and decreases both the computational load and the memory footprint compared to a full-precision network implemented in floating point, making it well-suited for resource-constrained environments. We compare the performance of our implementation with an equivalent floating point implementation on one desktop and two embedded GPU platforms. Our implementation achieves a maximum speed up of 7. 4X with only 4.4 implementation.

READ FULL TEXT
09/24/2018

No Multiplication? No Floating Point? No Problem! Training Networks for Efficient Inference

For successful deployment of deep neural networks on highly--resource-co...
07/28/2020

Optimization of XNOR Convolution for Binary Convolutional Neural Networks on GPU

Binary convolutional networks have lower computational load and lower me...
08/22/2018

An Overview of Datatype Quantization Techniques for Convolutional Neural Networks

Convolutional Neural Networks (CNNs) are becoming increasingly popular d...
02/23/2020

PoET-BiN: Power Efficient Tiny Binary Neurons

The success of neural networks in image classification has inspired vari...
09/29/2022

Tuning of Mixture-of-Experts Mixed-Precision Neural Networks

Deep learning has become a useful data analysis method, however mainstre...
05/28/2018

Convolutional neural network compression for natural language processing

Convolutional neural networks are modern models that are very efficient ...
09/06/2017

Embedded Binarized Neural Networks

We study embedded Binarized Neural Networks (eBNNs) with the aim of allo...