Bit-Mixer: Mixed-precision networks with runtime bit-width selection

03/31/2021
by Adrian Bulat, et al.

Mixed-precision networks allow a different quantization bit-width for every layer of the network. A major limitation of existing work is that the bit-width of each layer must be fixed at training time, which leaves little flexibility if the characteristics of the deployment device change at runtime. In this work, we propose Bit-Mixer, the very first method to train a meta-quantized network in which, at test time, any layer can change its bit-width without affecting the network's overall ability to perform highly accurate inference. To this end, we make two key contributions: (a) Transitional Batch-Norms, and (b) a 3-stage optimization process that is shown to be capable of training such a network. We show that our method yields mixed-precision networks that exhibit the flexibility desirable for on-device deployment without compromising accuracy. Code will be made available.
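
To make the idea of per-layer, runtime-selectable bit-widths concrete, the following is a minimal sketch in plain Python. It is not the authors' implementation: the symmetric uniform quantizer, the names `quantize_uniform` and `RuntimeQuantLinear`, and the `set_bits` method are all assumptions introduced for illustration, and the sketch omits both Transitional Batch-Norms and the 3-stage optimization process described in the abstract.

```python
from typing import List


def quantize_uniform(values: List[float], bits: int) -> List[float]:
    """Symmetric uniform "fake" quantization to `bits` bits (illustrative assumption)."""
    if bits < 2:
        raise ValueError("bits must be >= 2 for this symmetric scheme")
    max_abs = max((abs(v) for v in values), default=0.0)
    if max_abs == 0.0:
        return [0.0 for _ in values]
    levels = 2 ** (bits - 1) - 1  # e.g. 7 positive levels for 4 bits
    scale = max_abs / levels
    # Round each value to its nearest integer level, then map back to a float.
    return [round(v / scale) * scale for v in values]


class RuntimeQuantLinear:
    """Hypothetical linear layer whose weight bit-width can be switched at runtime."""

    def __init__(self, weights: List[List[float]], bits: int = 4) -> None:
        self.weights = weights  # full-precision weights, one row per output
        self.bits = bits

    def set_bits(self, bits: int) -> None:
        # Change the bit-width without touching the stored full-precision weights.
        self.bits = bits

    def forward(self, x: List[float]) -> List[float]:
        out = []
        for row in self.weights:
            q_row = quantize_uniform(row, self.bits)  # quantize at the current width
            out.append(sum(w * xi for w, xi in zip(q_row, x)))
        return out


layer = RuntimeQuantLinear([[0.5, -1.0, 0.25]], bits=2)
low_precision = layer.forward([1.0, 1.0, 1.0])
layer.set_bits(8)  # e.g. the device now has more compute budget
high_precision = layer.forward([1.0, 1.0, 1.0])
```

In this toy setting, each layer could be given its own `bits` value and changed independently between forward passes. A network trained at a single fixed bit-width would generally lose accuracy under such switching, which is the problem the abstract says Bit-Mixer addresses.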


research
02/10/2023

A Practical Mixed Precision Algorithm for Post-Training Quantization

Neural network quantization is frequently used to optimize model size, l...
research
02/09/2023

Data Quality-aware Mixed-precision Quantization via Hybrid Reinforcement Learning

Mixed-precision quantization mostly predetermines the model bit-width se...
research
10/12/2018

Training Deep Neural Network in Limited Precision

Energy and resource efficient training of DNNs will greatly extend the a...
research
04/21/2022

Arbitrary Bit-width Network: A Joint Layer-Wise Quantization and Adaptive Inference Approach

Conventional model quantization methods use a fixed quantization scheme ...
research
05/14/2020

Bayesian Bits: Unifying Quantization and Pruning

We introduce Bayesian Bits, a practical method for joint mixed precision...
research
06/17/2022

Channel-wise Mixed-precision Assignment for DNN Inference on Constrained Edge Nodes

Quantization is widely employed in both cloud and edge systems to reduce...
research
12/10/2022

Vertical Layering of Quantized Neural Networks for Heterogeneous Inference

Although considerable progress has been obtained in neural network quant...
