DeepAI AI Chat
Log In Sign Up

Binarized Convolutional Neural Networks for Efficient Inference on GPUs

by   Mir Khan, et al.
Tampereen teknillinen yliopisto

Convolutional neural networks have recently achieved significant breakthroughs in various image classification tasks. However, they are computationally expensive,which can make their feasible mplementation on embedded and low-power devices difficult. In this paper convolutional neural network binarization is implemented on GPU-based platforms for real-time inference on resource constrained devices. In binarized networks, all weights and intermediate computations between layers are quantized to +1 and -1, allowing multiplications and additions to be replaced with bit-wise operations between 32-bit words. This representation completely eliminates the need for floating point multiplications and additions and decreases both the computational load and the memory footprint compared to a full-precision network implemented in floating point, making it well-suited for resource-constrained environments. We compare the performance of our implementation with an equivalent floating point implementation on one desktop and two embedded GPU platforms. Our implementation achieves a maximum speed up of 7. 4X with only 4.4 implementation.


No Multiplication? No Floating Point? No Problem! Training Networks for Efficient Inference

For successful deployment of deep neural networks on highly--resource-co...

Optimization of XNOR Convolution for Binary Convolutional Neural Networks on GPU

Binary convolutional networks have lower computational load and lower me...

An Overview of Datatype Quantization Techniques for Convolutional Neural Networks

Convolutional Neural Networks (CNNs) are becoming increasingly popular d...

PoET-BiN: Power Efficient Tiny Binary Neurons

The success of neural networks in image classification has inspired vari...

Tuning of Mixture-of-Experts Mixed-Precision Neural Networks

Deep learning has become a useful data analysis method, however mainstre...

Convolutional neural network compression for natural language processing

Convolutional neural networks are modern models that are very efficient ...

Embedded Binarized Neural Networks

We study embedded Binarized Neural Networks (eBNNs) with the aim of allo...