Any-Precision Deep Neural Networks

11/17/2019
by Haichao Yu, et al.

We present Any-Precision Deep Neural Networks (Any-Precision DNNs), trained with a new method that allows a learned DNN to run at arbitrary numerical precision during inference. At runtime, the same model can be directly set to different bit-widths by truncating the least significant bits, supporting a dynamic trade-off between speed and accuracy. When all layers are set to low bit-widths, we show that the model achieves accuracy comparable to dedicated models trained at the same precision. This property facilitates flexible deployment of deep learning models in real-world applications, where trade-offs between model accuracy and runtime efficiency must often be made. Previous work trains a separate model for each fixed efficiency/accuracy trade-off point, but how to produce a single model whose precision is flexible at runtime remains largely unexplored. When the desired efficiency/accuracy trade-off varies over time, or even changes dynamically at runtime, re-training a model for each setting is infeasible, and the storage budget may forbid keeping multiple models. Our proposed framework achieves this flexibility without performance degradation. More importantly, we demonstrate that this result is agnostic to model architecture: we experimentally validated our method with different deep network backbones (AlexNet-small, ResNet-20, ResNet-50) on different datasets (SVHN, CIFAR-10, ImageNet) and observed consistent results. Code and models will be available at https://github.com/haichaoyu.
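The bit-truncation mechanism described in the abstract can be illustrated with a short sketch. This is not the authors' released code: the function name truncate_to_bits, the use of PyTorch, and the choice of 8 bits as the stored precision are illustrative assumptions; it only shows how dropping least significant bits of an integer-quantized tensor yields a lower-precision view of the same weights.

```python
import torch

def truncate_to_bits(q_weights: torch.Tensor, stored_bits: int, target_bits: int) -> torch.Tensor:
    """Drop the least significant bits of an integer-quantized tensor,
    reducing its effective precision from `stored_bits` to `target_bits`.
    (Illustrative sketch, not the paper's implementation.)"""
    assert 1 <= target_bits <= stored_bits
    shift = stored_bits - target_bits
    # Arithmetic right shift discards the low-order bits; shifting back left
    # keeps the values on the original integer grid, so one set of
    # quantization parameters can serve every supported precision.
    return (q_weights >> shift) << shift

# Example: weights stored at 8 bits, evaluated at 4-bit precision.
w8 = torch.randint(-128, 128, (3, 3), dtype=torch.int32)
w4 = truncate_to_bits(w8, stored_bits=8, target_bits=4)
```

Under this scheme, only the highest-precision weights need to be stored; every lower bit-width is obtained on the fly by truncation, which is what allows a single stored model to serve multiple speed/accuracy operating points.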

