FleXOR: Trainable Fractional Quantization

by Dongsoo Lee, et al.

Quantization based on binary codes is gaining attention because each quantized bit can be used directly in computation, without dequantization through look-up tables. Previous attempts, however, allow only integer numbers of quantization bits, which restricts the search space for compression ratio and accuracy. In this paper, we propose an encryption algorithm/architecture that compresses quantized weights so as to achieve fractional numbers of bits per weight. Decryption during inference is implemented by digital XOR-gate networks added to the neural network model, while the XOR gates are described using tanh(x) for backward propagation to enable gradient calculations. We perform experiments on MNIST, CIFAR-10, and ImageNet to show that the inserted XOR gates learn quantization/encrypted bit decisions through training and obtain high accuracy even for fractional sub-1-bit weights. As a result, our proposed method yields smaller model size and higher model accuracy than binary neural networks.
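The XOR-with-tanh idea can be illustrated with a minimal Python sketch. The abstract states only that XOR gates decrypt the weights and that tanh(x) is used for backward propagation; everything else here is an assumption made for illustration: the ±1 bit encoding (bit 0 as -1, bit 1 as +1), the sign convention (-1)^(n-1), the `scale` parameter, and the names `xor_forward`, `xor_surrogate`, `xor_backward`, `decrypt`, and `taps`. It is not the authors' implementation.

```python
import math


def sign(x):
    """Map a real value to +1.0 or -1.0 (zero is treated as +1.0)."""
    return 1.0 if x >= 0 else -1.0


def xor_forward(xs):
    """XOR of n inputs in an assumed +/-1 encoding (bit 0 -> -1, bit 1 -> +1).

    With this encoding, XOR equals (-1)^(n-1) times the product of the signs.
    """
    y = (-1.0) ** (len(xs) - 1)
    for x in xs:
        y *= sign(x)
    return y


def xor_surrogate(xs, scale=1.0):
    """Smooth stand-in for xor_forward: sign(x) is replaced by tanh(scale * x)."""
    y = (-1.0) ** (len(xs) - 1)
    for x in xs:
        y *= math.tanh(scale * x)
    return y


def xor_backward(xs, scale=1.0):
    """Gradient of xor_surrogate with respect to each input."""
    n = len(xs)
    t = [math.tanh(scale * x) for x in xs]
    grads = []
    for i in range(n):
        # d/dx tanh(scale * x) = scale * (1 - tanh(scale * x)^2)
        g = (-1.0) ** (n - 1) * scale * (1.0 - t[i] ** 2)
        for j in range(n):
            if j != i:
                g *= t[j]
        grads.append(g)
    return grads


def decrypt(encrypted, taps):
    """Produce one quantized +/-1 weight bit per tap by XOR-ing selected inputs.

    `taps` is a hypothetical wiring: each entry lists which encrypted values
    feed one XOR gate.
    """
    return [xor_forward([encrypted[i] for i in tap]) for tap in taps]


# Four stored (encrypted) values expand into five weight bits,
# so storage costs 4 / 5 = 0.8 bits per weight in this toy wiring.
encrypted = [0.4, -0.2, 0.9, -0.5]
taps = [(0, 1), (1, 2), (2, 3), (0, 3), (0, 2)]
weights = decrypt(encrypted, taps)
bits_per_weight = len(encrypted) / len(taps)
```

In this sketch the forward pass uses hard signs, so inference needs only XOR logic, while training would propagate gradients through the tanh surrogate. Storing fewer encrypted values than the number of decrypted weight bits is what gives a fractional bit count per weight.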







Retraining-Based Iterative Weight Quantization for Deep Neural Networks

Model compression has gained a lot of attention due to its ability to re...

BSQ: Exploring Bit-Level Sparsity for Mixed-Precision Neural Network Quantization

Mixed-precision quantization can potentially achieve the optimal tradeof...

Effective Quantization Methods for Recurrent Neural Networks

Reducing bit-widths of weights, activations, and gradients of a Neural N...

Bayesian Bits: Unifying Quantization and Pruning

We introduce Bayesian Bits, a practical method for joint mixed precision...

ABS: Automatic Bit Sharing for Model Compression

We present Automatic Bit Sharing (ABS) to automatically search for optim...

Equal Bits: Enforcing Equally Distributed Binary Network Weights

Binary networks are extremely efficient as they use only two symbols to ...

Deep neural networks are robust to weight binarization and other non-linear distortions

Recent results show that deep neural networks achieve excellent performa...