PhoneBit: Efficient GPU-Accelerated Binary Neural Network Inference Engine for Mobile Phones

12/05/2019
by Gang Chen, et al.

In recent years, deep neural networks (DNNs) have achieved great success in computer vision and other fields. However, their high computational complexity makes it challenging to deploy DNNs on mobile devices, which are subject to tight performance and power constraints. Binary neural networks (BNNs) have been demonstrated to be a promising solution, replacing most arithmetic operations with bit-wise operations. However, existing GPU-accelerated implementations of BNNs are tailored to desktop platforms; because of architectural differences, merely porting such implementations to mobile devices yields suboptimal performance or is impossible in some cases. In this paper, we propose PhoneBit, a GPU-accelerated BNN inference engine for Android-based mobile devices that fully exploits the computing power of BNNs on mobile GPUs. PhoneBit provides a set of operator-level optimizations, including a locality-friendly data layout, bit packing with vectorization, and layer integration for efficient binary convolution. We also present a detailed implementation and parallelization scheme that lets PhoneBit optimally utilize the memory bandwidth and computing power of mobile GPUs. We evaluate PhoneBit with binarized versions of AlexNet, YOLOv2 Tiny and VGG16. Our experimental results show that PhoneBit achieves significant speedups and energy savings compared with state-of-the-art frameworks for mobile devices.
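As background for the bit packing and bit-wise arithmetic the abstract refers to, here is a minimal C++ sketch of the XNOR/popcount trick that underlies binary convolution in BNNs generally (illustrative code under our own naming, not PhoneBit's actual GPU kernels): weights and activations constrained to {-1, +1} are packed one bit per value into machine words, and a dot product over n values reduces to n - 2 * popcount(a XOR b).

#include <bit>       // std::popcount (C++20)
#include <cstddef>
#include <cstdint>
#include <vector>

// Pack a {-1, +1} vector into 32-bit words: +1 -> bit set, -1 -> bit clear.
// Padding bits in the last word stay 0 for both operands, so they cancel
// in the XOR below and do not affect the result.
std::vector<uint32_t> bit_pack(const std::vector<int8_t>& v) {
    std::vector<uint32_t> packed((v.size() + 31) / 32, 0u);
    for (std::size_t i = 0; i < v.size(); ++i)
        if (v[i] > 0) packed[i / 32] |= 1u << (i % 32);
    return packed;
}

// Dot product of n {-1, +1} values from their packed forms: each matching
// bit contributes +1 and each mismatching bit -1, so
// dot = (n - mismatches) - mismatches = n - 2 * popcount(a XOR b).
int binary_dot(const std::vector<uint32_t>& a,
               const std::vector<uint32_t>& b, int n) {
    int mismatches = 0;
    for (std::size_t i = 0; i < a.size(); ++i)
        mismatches += std::popcount(a[i] ^ b[i]);
    return n - 2 * mismatches;
}

Packing two length-n {-1, +1} vectors with bit_pack and calling binary_dot reproduces their integer dot product exactly, while replacing n multiply-accumulates with a few bit-wise instructions per 32-element word; a GPU implementation applies the same idea per work-item, with vectorized loads of the packed words.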

Related research

01/03/2019
HG-Caffe: Mobile and Embedded Neural Network GPU (OpenCL) Inference Engine with FP16 Supporting
Breakthroughs in the fields of deep learning and mobile system-on-chips ...

01/03/2020
High Performance Depthwise and Pointwise Convolutions on Mobile Devices
Lightweight convolutional neural networks (e.g., MobileNets) are specifi...

08/16/2019
daBNN: A Super Fast Inference Framework for Binary Neural Networks on ARM devices
It is always well believed that Binary Neural Networks (BNNs) could dras...

10/06/2021
FADNet++: Real-Time and Accurate Disparity Estimation with Configurable Networks
Deep neural networks (DNNs) have achieved great success in the area of c...

05/15/2019
Accelerating Deterministic and Stochastic Binarized Neural Networks on FPGAs Using OpenCL
Recent technological advances have proliferated the available computing ...

11/18/2020
Larq Compute Engine: Design, Benchmark, and Deploy State-of-the-Art Binarized Neural Networks
We introduce Larq Compute Engine, the world's fastest Binarized Neural N...

01/29/2018
TernaryNet: Faster Deep Model Inference without GPUs for Medical 3D Segmentation using Sparse and Binary Convolutions
Deep convolutional neural networks (DCNN) are currently ubiquitous in me...
