1 Introduction
Deep neural networks have achieved significant improvements in various real-world applications. However, their large memory cost, computational burden, and energy consumption prohibit the massive deployment of deep neural networks on resource-limited devices. A number of methods have been proposed to compress and accelerate deep neural networks, including pruning [11], tensor decomposition [30], and quantization [24]. Among these methods, low-bit network quantization is particularly helpful for network acceleration and size reduction. Binary neural networks [5] have attracted a lot of attention. However, binary networks usually suffer from a large drop in accuracy due to their limited expressiveness. To enhance model capacity, various multi-bit quantization methods have been proposed [31], which significantly improve the performance of quantized models but enjoy less size reduction and speed acceleration.
As a compromise between binary networks and N-bit networks, ternary neural networks convert full-precision parameters into merely three values, saving a large amount of memory with acceptable accuracy degradation. Although ternary networks have been widely investigated [17, 32] in recent years, three major issues are mostly overlooked: 1) The squashing behavior of the forward quantization function. Most existing activation quantization methods [3, 24] squash full-precision activation values into a narrow and fixed range, which can limit the expressiveness of the quantized network. 2) The saturating behavior of the backward quantization function.
The clipped Straight-Through Estimator (STE) [2] is widely adopted for training a quantized network. Nevertheless, the gradient becomes zero when entering the saturating zone of the STE estimator. Moreover, as the network depth increases, training can suffer from severe gradient vanishing. 3) Hardware customization for ternary neural networks. For networks with ternary weights and activation values, the computation on most modern hardware can only be performed when the ternary values are 2-bit aligned. Compared with 2-bit quantization, it is yet less explored how to exploit the nice properties of ternary values to design a more efficient computation pattern and save more energy.

In this paper, we propose a reparameterized ternary network (RTN) to resolve these three issues. Specifically, in RTNs both weights and activation values are ternarized and then reparameterized with scale and offset parameters. The reparameterization easily alleviates the first two issues mentioned above. To avoid the squashing behavior of the quantization function during the forward pass, the learnable scale and offset parameters enable dynamic adjustment of the quantization range and thereby enhance the capacity of the ternary network. To tackle the saturating behavior of the clipped STE, the chain rule lets us decompose the gradient of the activation after reparameterization into the gradient of the activation before reparameterization together with the gradients of the scale and offset parameters. Consequently, even when the gradient of an activation before reparameterization saturates, optimization can still proceed by learning the reparameterization parameters.
Finally, to address the third issue, we build a customized hardware prototype on FPGA for the reparameterized ternary network. We design an efficient encoding and computation pattern for the dot product between two ternary vectors, saving energy compared with existing implementations of 2-bit networks.
Experimental results on large-scale tasks such as ImageNet indicate that our proposed method significantly improves the capacity of the ternary network and achieves significant accuracy improvements on ResNet-18 over state-of-the-art binary and low-bit networks. Moreover, our hardware prototype on FPGA achieves considerable savings in power and area compared with traditional implementations of 2-bit networks.
2 Related Work
Recent work on network compression shows that full-precision computation is not necessary for the training and inference of DNNs [10]. To achieve higher compression and acceleration ratios, extremely low-bit representations such as binary weights [24] have been studied. [17, 32] further improve performance by ternarizing weights to achieve higher representation ability. TWN [17] minimizes the Euclidean distance between the ternary weights and the full-precision weights. Instead of symmetric ternarization, TTQ [32] uses asymmetric ternarization to achieve higher performance at the cost of hardware convenience.
Substantial speed-up requires further quantization of activations, which is generally more challenging than weight quantization [3]. [6] uses +1 and -1 to represent both weights and activations, and XNOR-Net [24] further adds scaling factors for binary weights to improve accuracy. Higher-Order Residual Quantization [19] uses two 1-bit tensors to approximate the full-precision activation, but the computation speed is reduced by half. To take advantage of ReLU [22] and introduce sparsity in the quantized activation, [3] uses Half-wave Gaussian Quantization to approximate ReLU. The quantized activation function has the form of a stepwise function, which has zero gradient with respect to its input almost everywhere. To circumvent this problem, the Straight-Through Estimator (STE) [2] is adopted. STE approximates the backward function of an arbitrary function with the (clipped) identity function, and several studies [21, 31] attempt to reduce this mismatch between the forward and backward passes to improve performance. [4, 1] propose to learn the clipping parameters and achieve better results. [9] leverages a differentiable soft function to approximate the gradient of quantization; however, there is still a large accuracy gap between extremely low-bit and full-precision models.

3 Methodology
For a convolution layer, a weight filter is denoted by $\mathbf{W} \in \mathbb{R}^{c \times k \times k}$, where $c$ and $k$ are the number of input channels and the kernel size, respectively. Suppose one instance is fed to the network, and the corresponding feature map is denoted by $\mathbf{A}$. Then the output of one unit in the next layer can be computed by the dot product (for convolutional layers, this can be done by the im2col operation) as $y = \sigma(\langle \mathbf{W}, \mathbf{A} \rangle)$, where $\sigma$ is the Rectified Linear Unit [22].

Our proposed reparameterized ternary network (RTN) consists of linear transformations on both the weights and activation values of the network. The reparameterization allows dynamic adjustment of the quantization range and avoids gradient vanishing during quantized training. Besides, we also customize hardware implementations for RTN by leveraging the nice properties of ternary networks. The overall workflow of RTN is shown in Figure 1.

3.1 Reparameterized Ternarization
Activation Ternarization
Previous work [28, 7] on ternary networks argues that the degradation of a quantized network mainly comes from the limited number of quantization levels. However, it is rarely noticed that the quantization functions they adopt usually squash the input into fixed ranges and therefore harm the network expressiveness significantly. In a ternary neural network, the quantization function is applied to both weights and activations, which highly restricts the capacity of the quantized model. Therefore, in this paper, we propose a reparameterized quantizer to enhance the model expressiveness. First, the ternarization function is given by:

\[
\Phi(x) = \begin{cases} +1, & x > \Delta, \\ 0, & -\Delta \le x \le \Delta, \\ -1, & x < -\Delta, \end{cases} \tag{1}
\]

where $\Delta$ is the quantization threshold. Since the activation function in RTN is ReLU, its output is always non-negative, which means activations could never be quantized to $-1$. We therefore apply Batch Normalization (BN) after ReLU to recreate negative activations, so that all three quantization values can be fully utilized.
After normalizing the inputs of each layer, BN applies an affine transformation to increase the model capacity. Here we use

\[
\hat{x} = \gamma x + \beta \tag{2}
\]

to denote this transformation. With the BN transformation, the quantization function can be formulated as follows:

\[
a^t = \Phi(\gamma x + \beta) \tag{3}
\]
where the learnable BN parameters $\gamma$ and $\beta$ can adaptively adjust the effective quantization threshold of Equation 1 (in terms of the raw input $x$, the thresholds become $(\pm\Delta - \beta)/\gamma$). Although the quantization threshold is learnable, the ternary activation still only contains fixed ternary values (i.e., $\{-1, 0, +1\}$). We consider a ternary activation $a^t$ and further reparameterize it by

\[
\tilde{a}^t = \alpha\, a^t + \delta \tag{4}
\]

where $\alpha$ is the magnitude scale factor and $\delta$ is the offset. With $\alpha$ and $\delta$, the reparameterized ternary activation can dynamically change the squashing range, improving the network capacity with little increase in model size and computation. Here, we refer to $a^t$ as the fixed ternary activation, because its ternary values are fixed and it only controls the direction of the activation vector.
Our method reduces to a number of previous methods for particular choices of $\alpha$ and $\delta$. For example, for one fixed choice of $\alpha$ and $\delta$, the activation is squashed into a fixed range, which is equivalent to HWGQ [3]. When $\alpha$ and $\delta$ are chosen to minimize the Euclidean distance from $\tilde{a}^t$ to the full-precision activation, our approach resembles XNOR-Net [24]. Note that the scale factor in XNOR-Net is different from ours: their scale factor has to be computed from the full-precision activations at each forward pass as a running variable, which is not practical. (As a result, they abandon this scale factor for quantized activations in their officially released implementation.) A similar idea of decoupling the vector magnitude from its direction for full-precision weights can also be found in [26]. Note that these factors are designed in a layer-wise pattern, so value ranges may vary across layers depending on $\alpha$ and $\delta$.
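To make the forward pass concrete, here is a minimal NumPy sketch of Equations (1)-(4). The threshold value 0.5 and all names (`DELTA`, `fixed_ternary_activation`, and so on) are illustrative assumptions, not the paper's notation:

```python
import numpy as np

DELTA = 0.5  # assumed ternarization threshold; the paper keeps it symbolic

def fixed_ternary_activation(x):
    # Eq. (1): squash values into the fixed set {-1, 0, +1}.
    return np.where(x > DELTA, 1.0, np.where(x < -DELTA, -1.0, 0.0))

def reparameterized_ternary_activation(x, gamma, beta, alpha, offset):
    # Eqs. (2)-(4): BN affine transform, ternarize, then rescale and shift.
    a_t = fixed_ternary_activation(gamma * x + beta)  # fixed ternary a^t
    return alpha * a_t + offset                       # learnable range

x = np.array([-2.0, -0.3, 0.1, 0.8, 2.5])
# With scale 1 and offset 0 the activation reduces to the fixed ternary one.
assert np.allclose(reparameterized_ternary_activation(x, 1.0, 0.0, 1.0, 0.0),
                   fixed_ternary_activation(x))
# With learned scale and offset the three levels move to {offset - alpha,
# offset, offset + alpha}, so the squashing range is no longer fixed.
levels = np.unique(reparameterized_ternary_activation(x, 1.0, 0.0, 1.3, -0.4))
```

With scale 1.3 and offset -0.4 the three levels become -1.7, -0.4, and 0.9, illustrating how the reparameterization moves the range away from $\{-1, 0, +1\}$.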
Weight Ternarization
In a similar spirit to activation ternarization, we first apply a linear transformation to the network weights, resembling the BN layer for activations, to obtain learnable quantization thresholds. For each weight filter $\mathbf{W}$, the weight transformer is defined as follows:

\[
\hat{\mathbf{W}} = \gamma_w \mathbf{W} + \beta_w \tag{5}
\]

where $\gamma_w$ and $\beta_w$ are learnable parameters. The transformed weights are then quantized by the same function as in Equation 1, i.e., $\mathbf{W}^t = \Phi(\hat{\mathbf{W}})$. As a consequence, the weights can adjust their quantization threshold. To obtain flexible quantized values, we reparameterize $\mathbf{W}^t$ in a similar way by $\tilde{\mathbf{W}}^t = \phi \mathbf{W}^t$, where $\phi$ is the scale factor. Note that an offset is not included for the weights, in consideration of the additional computation overhead it would incur.
3.2 Backward Update in Reparameterized Ternarization
A typical approach to propagating gradients through the quantized activation is the clipped Straight-Through Estimator (STE), $\partial a^t / \partial \hat{x} \approx \mathbf{1}_{|\hat{x}| \le 1}$, which is exactly the gradient of hard tanh. Despite being successfully used in previous methods [24], hard tanh suffers from a saturating problem: when $|\hat{x}| > 1$, the gradient becomes zero, entering the saturating zone shown in the red part of Figure 2. The saturating behavior of STE can cause gradient vanishing for the weights as the depth of the network increases, which slows down and can even hurt the convergence of the model. Furthermore, once activations fall into the saturating zone, they get stuck and can barely find a way out, because both the activation and its gradient remain unchanged.
Fortunately, our reparameterized ternary activation alleviates this problem easily. Let $\mathcal{L}$ be the loss function. By the chain rule, the derivatives with respect to the fixed ternary activation and the reparameterization parameters can be written as

\[
\frac{\partial \mathcal{L}}{\partial a^t} = \alpha \frac{\partial \mathcal{L}}{\partial \tilde{a}^t}, \qquad \frac{\partial \mathcal{L}}{\partial \alpha} = \sum_i \frac{\partial \mathcal{L}}{\partial \tilde{a}^t_i}\, a^t_i, \qquad \frac{\partial \mathcal{L}}{\partial \delta} = \sum_i \frac{\partial \mathcal{L}}{\partial \tilde{a}^t_i}. \tag{6}
\]
It can be observed that, since we decouple the scale and offset from the fixed ternary activation $a^t$, the reparameterized ternary activation can still be optimized through $\alpha$ and $\delta$ even when the STE gradient of $a^t$ saturates to zero. Consequently, the entire network converges faster and reaches a better optimum in the loss landscape.
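The decomposition in Equation 6 can be checked with a small NumPy sketch. The clipped-STE window $|x| \le 1$ and all names below are assumptions for illustration:

```python
import numpy as np

def rta_backward(x, a_t, alpha, grad_out):
    """Backward pass of a~ = alpha * a^t + delta, following Eq. (6).

    grad_out is dL/da~; the ternarizer's gradient uses the clipped STE
    (hard-tanh surrogate), which is zero wherever |x| > 1.
    """
    ste = (np.abs(x) <= 1.0).astype(float)
    grad_x = grad_out * alpha * ste      # dL/dx: vanishes when the STE saturates
    grad_alpha = np.sum(grad_out * a_t)  # dL/dalpha: independent of the STE
    grad_delta = np.sum(grad_out)        # dL/ddelta: independent of the STE
    return grad_x, grad_alpha, grad_delta

# Every input sits in the saturating zone, yet alpha and delta still learn.
x = np.array([1.8, -2.2, 3.0])
a_t = np.sign(x)  # ternary outputs for these saturated inputs
grad_x, g_a, g_d = rta_backward(x, a_t, alpha=1.3,
                                grad_out=np.array([0.2, -0.5, 0.1]))
```

Here `grad_x` is all zeros (the STE is saturated everywhere), while the gradients for the scale and offset are non-zero, so the reparameterized activation keeps receiving learning signal.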
Furthermore, our reparameterized ternarization has another benefit: it can dynamically adjust the effective learning rate of the network parameters. Consider the gradient with respect to the full-precision input $\hat{x}$ of the ternarizer,

\[
\frac{\partial \mathcal{L}}{\partial \hat{x}} = \frac{\partial \mathcal{L}}{\partial \tilde{a}^t} \frac{\partial \tilde{a}^t}{\partial a^t} \frac{\partial a^t}{\partial \hat{x}} \approx \alpha\, \frac{\partial \mathcal{L}}{\partial \tilde{a}^t}\, \mathbf{1}_{|\hat{x}| \le 1}, \tag{7}
\]

where $\alpha$ can be absorbed into the learning rate as a multiplier, making training robust to the value of the learning rate. A learnable scale factor has also been studied in [26, 32], where a similar effect is claimed.
3.3 The XOR-XNOR Toy Problem
To demonstrate how the reparameterized ternarization improves the capacity of the quantized model, we give a toy example on a two-layer neural network, as shown in Figure 3.
The network is designed to learn two logical functions, XOR and XNOR. Four kinds of activation function are compared: the fixed ternary activation (fta), the reparameterized ternary activation (rta), the hyperbolic tangent activation (tanh), and the reparameterized hyperbolic tangent activation (rtanh). Inputs are sampled from a Bernoulli distribution plus uniform noise. Outputs are either 0 or 1. The network has a hidden layer consisting of 3 neurons without bias terms. To better observe the behavior of the quantized activations, we keep the weights as full-precision numbers. We report the mean squared error (MSE) during training. More implementation details are in the Appendix.
The training curves are shown in Figure 3. Compared with the fixed ternary activation (fta) and reparameterized ternary activation (rta), the hyperbolic tangent (tanh) is a full-precision function with the fixed squashing range $[-1, 1]$, and is therefore supposed to have better representation capability than the ternary activation functions. However, our rta achieves a lower MSE than tanh, because the scale and offset factors alleviate the squashing issue. Similarly, rtanh achieves a lower MSE than both tanh and rta; its learned scale and offset factors substantially change the squashing range away from $[-1, 1]$. From this empirical result we can see that the range of activation values is at least as important as the number of quantization levels.
Table 1: 2-bit representation of ternary and quaternary values.

1st bit  2nd bit  Our Ternary True Value  2-bit Network True Value
0  0  0  0
0  1  0  1
1  0  -1  2
1  1  +1  3
3.4 Efficient Computation Pattern
How to Compute Dot Product between Two Ternary Vectors
To support our ternary network (ternary weights + ternary activations), in this section we propose an efficient way to compute the dot product between the ternary weight and activation vectors, which is the core operation of both convolutional and linear layers. A special bit-encoding scheme is adopted for the ternary weights and activations. We use two bits to represent each ternary weight and activation: the first bit indicates whether the number is zero or not, and the second bit indicates its sign. Table 1 shows the detailed encoding scheme for all ternary values $-1$, $0$, and $+1$. Under this encoding scheme, zero can be represented by either 00 or 01.
Now, given the ternary vectors $\mathbf{w}^t$ and $\mathbf{a}^t$, we encode them into 2-bit vector representations. Suppose $\mathbf{w}_1$ is the vector containing the first bit of every entry of the ternary weights and $\mathbf{w}_2$ contains the second bits; we define $\mathbf{a}_1$ and $\mathbf{a}_2$ similarly for the activations. The dot product can then be computed using bitwise operations:

\[
\mathbf{w}^t \cdot \mathbf{a}^t = \mathrm{popcount}(\mathbf{w}_1 \wedge \mathbf{a}_1) - 2\, \mathrm{popcount}\big((\mathbf{w}_2 \oplus \mathbf{a}_2) \wedge \mathbf{w}_1 \wedge \mathbf{a}_1\big) \tag{8}
\]

where $\wedge$ and $\oplus$ are the bitwise AND and XOR operations, respectively, and $\mathrm{popcount}(\cdot)$ returns the number of 1s (logic high) in a vector. As indicated by Equation 8, the convolution can be computed efficiently via simple Boolean operations.
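As a sanity check, the Table 1 encoding and the popcount formula of Equation 8 can be prototyped in plain Python, with bit vectors as lists of 0/1 rather than packed machine words; `encode` and `ternary_dot` are illustrative names:

```python
def encode(v):
    # Table 1: first bit flags a non-zero value, second bit its sign (1 = +1).
    nonzero = [1 if t != 0 else 0 for t in v]
    sign = [1 if t > 0 else 0 for t in v]
    return nonzero, sign

def ternary_dot(w, a):
    # Eq. (8): popcount(w1 & a1) - 2 * popcount((w2 ^ a2) & w1 & a1).
    w1, w2 = encode(w)
    a1, a2 = encode(a)
    both = [x & y for x, y in zip(w1, a1)]                # non-zero products
    neg = [(s ^ t) & b for s, t, b in zip(w2, a2, both)]  # negative products
    return sum(both) - 2 * sum(neg)

w = [1, -1, 0, 1, -1]
a = [-1, -1, 1, 1, 0]
assert ternary_dot(w, a) == sum(x * y for x, y in zip(w, a))
```

The identity holds because the number of positive products equals the non-zero count minus the negative count, giving (non-zero) minus twice (negative); in hardware, the factor of two is a one-bit left shift.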
Figure 4(a) shows the hardware design for the vector multiplication in Equation 8. Given the two input vectors, the circuit computes the product between each pair of elements; the two partial popcount results are saved in two 32-bit counters. The multiplication by two in Equation 8 is easily achieved by shifting the partial result to the left by 1 bit. A subtractor then performs the subtraction in Equation 8.
For comparison, we compute the dot product of 2-bit quaternary weights and activations, since a quaternary model has the same size as ours. We use the computation pattern introduced in DoReFa-Net [31]. The quaternary vector multiplication is computed by ANDing each bit of the inputs. Figure 4(b) shows the hardware design for the vector multiplication in [31]. The circuit takes the two input vectors and calculates the product between each pair of elements. The multiplications by powers of two are implemented with bitwise shifts, and four adders are employed to sum the partial results. We evaluate the two designs of Figure 4 in terms of power consumption, computation latency, and area in the next section.
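The 2-bit baseline can be sketched the same way. This follows the general bit-plane AND-popcount scheme of DoReFa-Net for unsigned 2-bit values; `bit_planes` and `quaternary_dot` are illustrative names:

```python
def bit_planes(v, bits=2):
    # Decompose unsigned quaternary values {0, 1, 2, 3} into per-bit planes.
    return [[(t >> m) & 1 for t in v] for m in range(bits)]

def quaternary_dot(x, y, bits=2):
    # AND-popcount every pair of bit planes, weighted by 2^(m+k); for 2-bit
    # inputs this yields the four partial sums that Figure 4(b) accumulates.
    xp, yp = bit_planes(x, bits), bit_planes(y, bits)
    total = 0
    for m in range(bits):
        for k in range(bits):
            total += (1 << (m + k)) * sum(a & b for a, b in zip(xp[m], yp[k]))
    return total

x = [3, 1, 2, 0]
y = [2, 3, 1, 3]
assert quaternary_dot(x, y) == sum(a * b for a, b in zip(x, y))
```

The nested loop makes the cost difference visible: 2-bit inputs need four AND-popcount passes plus weighted additions, whereas the ternary pattern of Equation 8 needs only two popcounts, one shift, and one subtraction.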
How to Deal with the Scale and Offset
Our reparameterized ternary activation has two extra parameters beyond the fixed ternary activation: the scale $\alpha$ and the offset $\delta$. We demonstrate that they introduce only negligible computational complexity. With the quantized ternary weights $\tilde{\mathbf{W}}^t = \phi \mathbf{W}^t$ and the reparameterized ternary activation $\tilde{\mathbf{A}}^t = \alpha \mathbf{A}^t + \delta$, the input of the next layer can be computed by

\[
\mathbf{A}_{l+1} = \sigma\big(\alpha\phi \langle \mathbf{W}^t, \mathbf{A}^t \rangle + \delta\phi \langle \mathbf{W}^t, \mathbf{J} \rangle\big) \tag{9}
\]

where $\sigma$ is the ReLU function, $\langle \cdot, \cdot \rangle$ is the dot product between ternary vectors, and $\mathbf{J}$ denotes the matrix with all elements equal to 1. The second term in Equation 9, $\delta\phi \langle \mathbf{W}^t, \mathbf{J} \rangle$, is a constant at inference time, which can be pre-stored in the cache. As shown in Figure 1, when performing the convolution, we first calculate the ternary-valued convolution efficiently with Boolean operations, and then only need one extra multiply-accumulate (MAC) operation to obtain the final result.
Reparameterized Ternary Activation Can Adjust Sparsity Automatically
Interestingly, we can rewrite Equation 9 and fold the second term into the ReLU to make it more hardware-friendly:

\[
\mathbf{A}_{l+1} = \alpha\phi\, \sigma_{\theta}\big(\langle \mathbf{W}^t, \mathbf{A}^t \rangle\big), \qquad \sigma_{\theta}(x) = \max(x - \theta, 0), \qquad \theta = -\frac{\delta}{\alpha} \langle \mathbf{W}^t, \mathbf{J} \rangle \tag{10}
\]

where $\sigma_{\theta}$ is the ReLU parameterized by the sparsity threshold $\theta$ (assuming $\alpha\phi > 0$).
Apparently, $\theta$ controls the sparsity of $\mathbf{A}_{l+1}$. This reveals another effect of our reparameterized ternary activation: it can control the sparsity of the activations. Activation sparsity has been studied in [29], which finds that it has a profound impact on accuracy. However, [29] manually sets the sparsity threshold to increase sparsity, based on the beliefs that this reduces quantization error and that larger activations are more important under the attention mechanism. In our method, the sparsity threshold is determined by the offset factor and can be dynamically tuned during training for every layer. We report the sparsity in Section 4.2 to show that our method concurs with [29].
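The fold from Equation 9 into Equation 10 can be verified numerically. The snippet below is a NumPy sketch under the assumption that the product of the scale factors is positive, with made-up parameter values:

```python
import numpy as np

relu = lambda v: np.maximum(v, 0.0)

rng = np.random.default_rng(0)
w_t = rng.integers(-1, 2, size=(8, 64)).astype(float)  # ternary weight rows
a_t = rng.integers(-1, 2, size=64).astype(float)       # ternary activations
phi, alpha, delta = 0.8, 1.3, -0.4                     # reparameterizers

# Eq. (9): the second term depends only on the weights, so it is a
# per-output-unit constant that can be precomputed and cached.
const = delta * phi * w_t.sum(axis=1)
out9 = relu(alpha * phi * (w_t @ a_t) + const)

# Eq. (10): fold the constant into a shifted ReLU with sparsity threshold
# theta = -const / (alpha * phi); this assumes alpha * phi > 0.
theta = -const / (alpha * phi)
out10 = alpha * phi * relu(w_t @ a_t - theta)

assert np.allclose(out9, out10)
# Outputs are zeroed exactly where the ternary dot product falls below theta,
# which is how the offset ends up controlling activation sparsity.
assert np.array_equal(out10 == 0.0, (w_t @ a_t) <= theta)
```

The equivalence is just positive homogeneity of ReLU: for $c > 0$, $\sigma(cx + b) = c\,\sigma(x + b/c)$, so the cached constant turns into a per-unit threshold inside the ReLU.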
4 Experiments
In this section, we first present empirical evaluations of the reparameterized ternary network (RTN) on two real-world datasets, ImageNet-ILSVRC2012 [25] and CIFAR-10 [14]; we then evaluate the performance of the hardware implementation of RTN in terms of power consumption and area.
We adopt a number of popular neural architectures for evaluation: ResNet [12], AlexNet [15], MobileNet [13], and Network-In-Network (NIN) [20]. Two sets of strong baselines are chosen for comparison: 1) quantizing weights only: BWN [24], TWN [17], and TTQ [32]; 2) quantizing both weights and activations: XNOR-Net [24], Bi-Real [21], TBN [28], HWGQ [3], DoReFa-Net [31], PACT [4], and HORQ [18].
We denote our method with (resp. without) reparameterization on weights and activations as RTNR (resp. RTNF). We also evaluate our method when only weights are quantized.
We highlight a substantial accuracy improvement (up to 13% absolute improvement over XNOR-Net) of our RTN for ResNet-18 on ImageNet. This improvement mainly comes from: 1) zero being introduced into the quantized activation to obtain the fixed ternary activation; 2) dynamic adjustment of the quantization range of weights and activations through Equations (2) and (5); and 3) the learnable scale and offset applied to the fixed ternary activation to obtain the reparameterized ternary activation, which has much better representation capability with negligible computation overhead.
Compared with several 2-bit models, RTN has the lowest degradation from its full-precision counterpart and achieves comparable accuracy. In Section 4.4, we implement our ternary multiplication circuit as well as the 2-bit multiplication circuit used in [31, 4, 18], and show that the ternary-value multiplication circuit significantly outperforms the 2-bit one in terms of power and area.
4.1 Implementation
We follow the implementation settings of other extremely low-bit quantized networks [24] and do not quantize the weights and activations of the first and last layers. See the Appendix for more implementation details.
Initialization can be vitally important for quantized neural networks. We first train a full-precision model from scratch and initialize the RTN by minimizing the Euclidean distance between the quantized and full-precision weights, as in TWN [17]; the initial offset is set to 0.
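A sketch of this initialization, assuming TWN's closed-form approximation (threshold roughly 0.7 times the mean magnitude, scale equal to the mean magnitude above the threshold); `twn_init` is an illustrative name:

```python
import numpy as np

def twn_init(w):
    # TWN's approximate minimizer of || w - scale * ternarize(w) ||^2:
    # threshold ~= 0.7 * E|w|; scale = mean magnitude above the threshold.
    delta = 0.7 * np.abs(w).mean()
    mask = np.abs(w) > delta
    scale = np.abs(w[mask]).mean() if mask.any() else 1.0
    return delta, scale

w = np.array([1.0, -1.0, 0.1, -0.1])  # toy full-precision filter
delta, scale = twn_init(w)
```

For this toy filter the threshold is 0.385 and the scale is 1.0: only the two large-magnitude weights survive ternarization, and the scale matches their mean magnitude.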
4.2 Results on ImageNet
The validation error on ImageNet is plotted in Figure 5. We can see that RTNR has a lower error rate than RTNF. In particular, the AlexNet plot in Figure 5(b) shows that RTNR has a relatively smooth curve and better convergence speed, which may be the result of the automatic adjustment of the learning rate via the scale factor in Equation 7.
The overall results on ImageNet are shown in Table 2, together with several strong extremely low-bit models. Note that we swap the order of BN and ReLU, so we report the full-precision models' accuracy as a reference and compare the degradation from the full-precision models (the last column of the table). We first compare our RTN with models that quantize only the weights, such as BWN, TWN, and TTQ. Our RTN not only achieves state-of-the-art accuracy but also has the smallest gap to the full-precision models. In addition, compared with TTQ's asymmetric quantization, our RTN uses symmetric quantization, which is naturally harder to train but more friendly to hardware implementation.
ResNet18 (ImageNet)  

Methods  # bits(W/A)  FP ref.  Accuracy  Degrad. 
BWN  1 / 32  69.3  60.8  8.5 
2 / 32  69.3  61.8  7.5  
2 / 32  69.3  65.3  4.0  
2 / 32  69.6  66.6  3.0  
2 / 32  69.2  68.5  0.7  
XNOR  1 / 1  69.3  51.2  18.1 
BiReal  1 / 1  68.0  56.4  11.6 
1 / 2  69.3  55.6  13.7  
DoReFa  1 / 2  70.2  53.4  16.8 
HWGQ  1 / 2  69.6  56.1  13.5 
HORQ  2 / 2  69.3  55.9  13.4 
DoReFa  2 / 2  70.2  62.6  7.6 
PACT  2 / 2  70.2  64.4  5.8 
2 / 2  69.2  62.4  6.8  
2 / 2  69.2  64.5  4.7  
AlexNet (ImageNet)  
XNOR  1 / 1  56.6  44.2  12.4 
1 / 2  57.2  49.7  7.5  
DoReFa  1 / 2  55.9  49.8  6.1 
HWGQ  1 / 2  55.7  50.5  5.2 
PACT  2 / 2  57.2  55.0  2.2 
2 / 2  58.7  52.6  6.1  
2 / 2  58.7  53.9  4.8  
MobileNet (ImageNet)  
PACT  2 / 2  69.9  56.1  13.8 
2 / 2  69.9  56.9  13.0  
NIN (CIFAR10)  
XNOR  1 / 1  89.8  86.4  3.4 
2 / 2  89.8  88.2  1.6  
2 / 2  89.8  88.5  1.3  
2 / 2  89.8  89.1  0.7  
2 / 2  89.8  89.6  0.2 
Quantizing the activations is more challenging than quantizing the weights [3], and there is still a large margin between full-precision models and extremely low-bit models. We compare several models that quantize both weights and activations with our proposed model (denoted RTNR). For the ablation study, we also report the performance of our ternary network with the fixed ternary activation (denoted RTNF) to show the effectiveness of the scale and offset.
According to Table 2, we can conclude that: 1) RTNR outperforms almost every model, so although there is a trade-off between the number of bits and accuracy, the ternary network finds a better balance between them. 2) Although PACT has comparable performance, especially on AlexNet, our RTNR also shows a small gap to the full-precision models, and RTN is furthermore better suited for hardware implementation on mobile and embedded devices. 3) With learnable scale factors and offsets, RTNR achieves higher accuracy than RTNF, which validates the improvement in representation ability brought by our reparameterization design.
Sparsity Comparison
According to our analysis in Section 3.4, the offset can automatically adjust the sparsity of the activations. In general, the observed change in sparsity concurs with [29], which argues that the optimal sparsity is slightly higher than 50% based on the attention mechanism. Figure 5(d) shows the sparsity comparison between RTNF, RTNR, and the full-precision models. Our reparameterized ternary activation adjusts the sparsity automatically, and the sparsity of RTNR is close to that of the full-precision model. Compared with [29], our RTN adjusts sparsity automatically without any manual setting.
Analysis of Reparameterization
We report the scale and offset values for the activations and the mean scale value for the weights of each layer of ResNet-18 (see Appendix). The activation distribution changes considerably across layers, which means that each layer learns its own optimal range and magnitude, thus increasing the representational ability. Interestingly, we find that the activations and weights in the downsample residual layers change only slightly. This may result from the special filters in these layers.
4.3 Results on CIFAR10
For CIFAR-10, we mainly compare our method with XNOR-Net on NIN. We use the PyTorch implementation of XNOR-Net [24] (https://github.com/jiecaoyu/XNOR-Net-PyTorch). See the Appendix for more implementation details. Results for NIN on CIFAR-10 can be found in Table 2. Our RTN almost recovers the full-precision accuracy (only a 0.2% absolute gap) without bells and whistles. This performance may result from the scale and offset significantly changing the range of the ternary activations and weights.
Evolution of the Scale and Offset Factors
In Figure 6, we show the evolution of the parameters in the reparameterized ternary activation. Almost all scale factors increase at the beginning of training and are scaled down when the learning rate is decreased. The offsets are more volatile. Nonetheless, these parameters are directly optimized by the training objective, and their evolution is not easy to predict heuristically.
Ablation Study
There are two learnable parameters in the reparameterized ternary activation: the scale factor and the offset factor. We evaluate the effect of these two parameters by applying only one of them in the RTN. We denote by RTNS the activation with the scale factor only and by RTNO the activation with the offset only. The implementation on CIFAR-10 is kept the same as before.
The results are shown in Table 2. Apparently, when we only add the scale factor, the improvement is trivial. This is because ReLU is unaffected by a positive rescaling of the activation (i.e., $\sigma(cx) = c\,\sigma(x)$ for $c > 0$), and BN can eliminate the effect of the scale factor. We refer to this effect as the scale invariance of the activation. However, according to Equation 10, the offset factor changes the sparsity threshold of the ReLU and thus greatly affects the activation. Therefore, RTNO achieves higher performance than RTNS. Note that in our RTNR there is no scale invariance when the scale and offset factors are applied together, as they jointly change the distribution of the activation.
4.4 Hardware Implementation
We compare the hardware performance of the two circuits for the vector multiplication operation shown in Figure 4. We additionally implement the circuit for 32-bit floating-point vector multiplication. We synthesize our design with the Xilinx Vivado Design Suite [27] and use a Xilinx VC707 FPGA evaluation board for power measurement. For the comparison of circuit area and computation latency, we use the Synopsys Design Compiler [8] with the 45nm NanGate Open Cell Library [23].
As shown in Table 3, the circuit for ternary values (Figure 4(a)) outperforms those for 2-bit (quaternary) values (Figure 4(b)) and floating-point values in terms of both power (3.43x and 46.46x savings, respectively) and area. These differences result from the additional adders and bitwise shifters used by the quaternary multiplication circuit. From Table 3, we also note that the quaternary multiplication circuit is larger than the ternary one. That is, for a fixed circuit area and a settled clock frequency, the ternary multiplication circuit has lower latency, since we can make four ternary-value multipliers work in parallel. Moreover, our circuit can easily be deployed as a building block of any large-scale parallel computing framework, such as a systolic array [16], for efficient matrix multiplication.
5 Conclusion
In this paper, we propose the reparameterized ternary network with ternary weights and activations. The learnable reparameterizers are demonstrated to considerably increase the expressiveness of the fixed ternary values. According to our analysis and empirical results, the scale and offset are able to adjust the range of the quantized values, influence the sparsity of the activations, and accelerate training. To support efficient computing in RTN, a novel computation pattern is proposed.
Acknowledgements
This work is supported by the National Research Foundation, Prime Minister's Office, Singapore under its National Cybersecurity R&D Programme (No. NRF2016NCRNCR002020), and an FY2017 SUG Grant.
References
 [1] (2018) Nice: noise injection and clamping estimation for neural network quantization. arXiv preprint arXiv:1810.00162. Cited by: §2.
 [2] (2013) Estimating or propagating gradients through stochastic neurons for conditional computation. In arXiv:1308.3432, Cited by: §1, §2.
 [3] (2017) Deep learning with low precision by halfwave gaussian quantization. In CVPR, Cited by: §1, §2, §3.1, §4.2, §4.
 [4] (2018) Pact: parameterized clipping activation for quantized neural networks. In arXiv:1805.06085, Cited by: §2, §4, §4.
 [5] (2015) Binaryconnect: training deep neural networks with binary weights during propagations. In NeurIPS, Cited by: §1.
 [6] (2016) Binarized neural networks: training deep neural networks with weights and activations constrained to +1 or -1. In arXiv:1602.02830, Cited by: §2.
 [7] (2018) GXNOR-Net: training deep neural networks with ternary weights and activations without full-precision memory under a unified discretization framework. In Neural Networks, Cited by: §3.1.
 [8] Design compiler: rtl synthesis. Note: https://www.synopsys.com/support/training/rtlsynthesis/designcompilerrtlsynthesis.html Cited by: §4.4.

 [9] (2019) Differentiable soft quantization: bridging full-precision and low-bit neural networks. In Proceedings of the IEEE International Conference on Computer Vision, pp. 4852–4861. Cited by: §2.
 [10] (2015) Deep learning with limited numerical precision. In ICML, Cited by: §2.
 [11] (2015) Deep compression: compressing deep neural networks with pruning, trained quantization and huffman coding. In arXiv:1510.00149, Cited by: §1.
 [12] (2016) Deep residual learning for image recognition. In CVPR, Cited by: §4.

 [13] (2017) MobileNets: efficient convolutional neural networks for mobile vision applications. arXiv preprint arXiv:1704.04861. Cited by: §4.
 [14] (2009) Learning multiple layers of features from tiny images. Technical report, Citeseer. Cited by: §4.
 [15] (2012) Imagenet classification with deep convolutional neural networks. In NeurIPS, Cited by: §4.
 [16] (1982) Why systolic architectures?. In IEEE Computer, Cited by: §4.4.
 [17] (2016) Ternary weight networks. In arXiv:1605.04711, Cited by: §1, §2, §4.1, §4.
 [18] (2017) Performance guaranteed network acceleration via high-order residual quantization. In Proceedings of the IEEE International Conference on Computer Vision, pp. 2584–2592. Cited by: §4, §4.
 [19] (2017) Performance guaranteed network acceleration via high-order residual quantization. In ICCV, Cited by: §2.
 [20] (2013) Network in network. In arXiv:1312.4400, Cited by: §4.
 [21] (2018) Bi-Real Net: enhancing the performance of 1-bit CNNs with improved representational capability and advanced training algorithm. In ECCV, Cited by: §2, §4.

 [22] (2010) Rectified linear units improve restricted Boltzmann machines. In ICML, Cited by: §2, §3.
 [23] NanGate FreePDK45 open cell library. Note: http://www.nangate.com/?page_id=2325 Cited by: §4.4.
 [24] (2016) XNOR-Net: ImageNet classification using binary convolutional neural networks. In ECCV, Cited by: §1, §1, §2, §2, §3.1, §3.2, §4.1, §4.3, §4.
 [25] (2015) Imagenet large scale visual recognition challenge. In IJCV, Cited by: §4.
 [26] (2016) Weight normalization: a simple reparameterization to accelerate training of deep neural networks. In NeurIPS, Cited by: §3.1, §3.2.
 [27] Vivado design suite  hlx editions productivity. multiplied. Note: https://www.xilinx.com/products/designtools/vivado.html Cited by: §4.4.
 [28] (2018) TBN: convolutional neural network with ternary inputs and binary weights. In ECCV, Cited by: §3.1, §4.
 [29] (2018) Two-step quantization for low-bit neural networks. In CVPR, Cited by: §3.4, §4.2.
 [30] (2015) Accelerating very deep convolutional networks for classification and detection. In PAMI, Cited by: §1.
 [31] (2016) DoReFa-Net: training low bitwidth convolutional neural networks with low bitwidth gradients. In arXiv:1606.06160, Cited by: §1, §2, §3.4, §3.4, §4, §4.
 [32] (2016) Trained ternary quantization. In arXiv:1612.01064, Cited by: §1, §2, §3.2, §4.
Appendix A Experimental Details
A.1 Layer Order
In RTN, we swap the order of BN and ReLU; the details are shown in Figure 7.
A.2 Implementation Details of the XOR-XNOR Toy Problem
The two-layer network is designed to learn two logical functions, XOR and XNOR. Inputs are sampled from a Bernoulli distribution plus uniform noise. Outputs are either 0 or 1. The network has a hidden layer consisting of 3 neurons without bias terms. To better observe the behavior of the quantized activations, we keep the weights as full-precision numbers.
We use four kinds of activation function for comparison: the fixed ternary activation (fta), the reparameterized ternary activation (rta), the hyperbolic tangent activation (tanh), and the reparameterized hyperbolic tangent activation (i.e., tanh with scale and offset factors; rtanh). Except for the reparameterized activations, the others squash their inputs into the fixed range $[-1, 1]$. Both fta and rta have a limited number (only 3) of quantization levels. The hyperbolic tangent is full-precision, and is therefore supposed to have better representation ability than the ternary activations. We use the MSE loss and stochastic gradient descent to train the network. The learning rate is 0.03 and we train the toy model for 15000 epochs.
A.3 Implementation Details of the Main Experiments
For the ImageNet dataset, training images are resized to 256 on the smaller dimension, and then a random crop of 224x224 is selected for training. Training images are randomly flipped horizontally. Test images are centrally cropped to 224x224 (227x227 for training and test images in AlexNet). We use Stochastic Gradient Descent (SGD) as the optimizer. Weight decay is set to 0.0001 for ResNet-18 and AlexNet. Each network is trained for up to 100 epochs with a batch size of 1024. The learning rate starts from 0.1 and is decayed by a factor of 10 at epochs 30, 60, and 85. For the quantization parameters (e.g., the scale and offset for activations and the scale for weights), we set a lower learning rate, because their gradients are summations over all elements of the weights/activations, which increases their magnitude. In practice, we find that 0.001 is appropriate for the weight quantization parameters and 0.1 for the activation quantization parameters.
For NIN on CIFAR-10, we use Adam as the optimizer and train the network for 320 epochs. The weight decay is set to 0.00001, and the initial learning rate is 0.01 with a decay factor of 0.1. Note that we also do not quantize the first and last layers.
Layer  

layer1.0.conv1  1.0426  0.0308  1.8160 
layer1.0.conv2  0.9729  0.2344  1.0974 
layer1.1.conv1  1.0223  0.1699  2.0325 
layer1.1.conv2  0.7962  0.0956  1.6872 
layer2.0.conv1  1.3083  0.5152  3.0458 
layer2.0.conv2  0.8191  0.6840  1.5639 
layer2.0.downsample.0  1.0000  0.0024  0.8739 
layer2.1.conv1  1.4091  0.3080  1.4284 
layer2.1.conv2  0.7678  0.4921  2.3644 
layer3.0.conv1  1.3986  0.8014  2.7552 
layer3.0.conv2  0.8916  0.7033  1.6015 
layer3.0.downsample.0  0.9996  0.0000  0.9435 
layer3.1.conv1  1.6719  0.4738  2.9345 
layer3.1.conv2  1.0112  0.4731  2.1110 
layer4.0.conv1  2.0472  1.4202  3.0216 
layer4.0.conv2  1.1033  0.9717  1.7116 
layer4.0.downsample.0  1.0037  0.0000  0.8537 
layer4.1.conv1  2.4687  1.4774  1.8244 
layer4.1.conv2  0.8959  0.7186  2.3379 