Switchable Precision Neural Networks

02/07/2020
by Luis Guerra, et al.

Instantaneous, on-demand accuracy-efficiency trade-offs have recently been explored in the context of neural network slimming. In this paper, we propose a flexible quantization strategy, termed Switchable Precision neural Networks (SP-Nets), to train a shared network capable of operating at multiple quantization levels. At runtime, the network can adjust its precision on the fly according to instantaneous memory, latency, power-consumption and accuracy demands. For example, by constraining the network weights to 1 bit with switchable-precision activations, our shared network spans from BinaryConnect to Binarized Neural Networks, allowing dot products to be computed using only summations or bit operations. In addition, we propose a self-distillation scheme to improve the performance of the quantized switches. We test our approach with three different quantizers and compare the classification accuracy of SP-Nets against independently trained quantized models on the Tiny ImageNet and ImageNet datasets using the ResNet-18 and MobileNet architectures.
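To make the idea concrete, here is a minimal sketch of switchable-precision inference. It is an illustration of the general technique, not the authors' implementation: a single set of shared full-precision weights is quantized on the fly at whatever bit-width the runtime "switch" requests, with the 1-bit switch reducing to sign binarization as in BinaryConnect. The function names and the uniform symmetric quantizer are assumptions for this sketch.

```python
def quantize(x, bits):
    """Uniform symmetric quantization of x in [-1, 1] to `bits` bits.
    With bits == 1 this reduces to binarization (sign of the weight)."""
    if bits == 1:
        return 1.0 if x >= 0 else -1.0
    levels = 2 ** (bits - 1) - 1          # e.g. 127 levels per side for 8 bits
    return round(x * levels) / levels

def dot(weights, activations, w_bits):
    """Dot product with the shared weights quantized at the requested precision.
    Note the weights themselves are never modified; only their quantized
    view changes with the switch."""
    return sum(quantize(w, w_bits) * a for w, a in zip(weights, activations))

weights = [0.8, -0.3, 0.05, -0.9]         # shared full-precision weights
acts = [1.0, 2.0, 3.0, 4.0]               # input activations

lo = dot(weights, acts, w_bits=1)         # 1-bit switch: only signs, so the
                                          # dot product is pure summation
hi = dot(weights, acts, w_bits=8)         # 8-bit switch: near full precision
```

At the low-precision switch the multiplications degenerate into signed additions, which is the source of the memory and power savings the abstract refers to; the high-precision switch reuses the exact same stored weights and recovers accuracy close to the full-precision model.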


