Larq Compute Engine: Design, Benchmark, and Deploy State-of-the-Art Binarized Neural Networks

11/18/2020
by Tom Bannink, et al.

We introduce Larq Compute Engine (LCE), the world's fastest Binarized Neural Network (BNN) inference engine, and use this framework to investigate several important questions about the efficiency of BNNs and to design a new state-of-the-art BNN architecture. LCE provides highly optimized implementations of binary operations and accelerates binary convolutions by 8.5x to 18.5x compared to their full-precision counterparts on Pixel 1 phones. LCE's integration with Larq and a sophisticated MLIR-based converter allows users to move smoothly from training to deployment. By extending TensorFlow and TensorFlow Lite, LCE supports models that combine binary and full-precision layers, and can be easily integrated into existing applications. Using LCE, we analyze the performance of existing BNN computer vision architectures and develop QuickNet, a simple, easy-to-reproduce BNN that outperforms existing binary networks in terms of latency and accuracy on ImageNet. Furthermore, we investigate the impact of full-precision shortcuts and the relationship between the number of MACs and model latency. We are convinced that empirical performance should drive BNN architecture design, and we hope this work will help others design, benchmark, and deploy binary models.
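As background for the "binary operations" mentioned above, the following is a minimal sketch of the widely used XNOR-and-popcount formulation of a dot product between two vectors with entries in {-1, +1}. It is a general illustration of why binarized arithmetic can be cheap, not LCE's actual kernel implementation; the function names `pack_signs`, `binary_dot`, and `reference_dot` are hypothetical and chosen only for this example.

```python
def pack_signs(values):
    """Pack a sequence of +1/-1 values into one int (bit i set means +1)."""
    bits = 0
    for i, v in enumerate(values):
        if v not in (1, -1):
            raise ValueError("values must be +1 or -1")
        if v == 1:
            bits |= 1 << i
    return bits


def binary_dot(a_bits, b_bits, n):
    """Dot product of two packed {-1, +1} vectors of length n.

    Positions where the bits agree contribute +1 and positions where
    they differ contribute -1, so the result is 2 * matches - n.
    """
    mask = (1 << n) - 1
    agree = ~(a_bits ^ b_bits) & mask  # XNOR restricted to n bits
    matches = bin(agree).count("1")    # popcount
    return 2 * matches - n


def reference_dot(a, b):
    """Plain multiply-accumulate dot product, used only for comparison."""
    return sum(x * y for x, y in zip(a, b))


a = [1, -1, 1, 1, -1, -1, 1, -1]
b = [1, 1, -1, 1, -1, 1, 1, -1]
result = binary_dot(pack_signs(a), pack_signs(b), len(a))
print(result, reference_dot(a, b))
```

In this sketch the eight multiply-accumulate steps of the reference version collapse into one XOR, one complement, and one bit count over a packed word. Optimized engines apply the same idea with hardware instructions over much wider words.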


