A Learning Framework for n-bit Quantized Neural Networks toward FPGAs

04/06/2020
by Jun Chen, et al.

The quantized neural network (QNN) is an efficient approach to network compression and is well suited to FPGA implementations. This paper proposes a novel learning framework for n-bit QNNs whose weights are constrained to powers of two. To solve the gradient vanishing problem, we propose a reconstructed gradient function for QNNs in the back-propagation algorithm that directly computes the real gradient rather than estimating an approximation of the expected loss gradient. We also propose a novel QNN structure named n-BQ-NN, which replaces multiplications with shift operations and is more suitable for inference on FPGAs. Furthermore, we design a shift vector processing element (SVPE) array that replaces all 16-bit multiplications in the convolution operation with SHIFT operations on FPGAs. We also carry out comparative experiments to evaluate our framework. The experimental results show that the quantized ResNet, DenseNet and AlexNet models trained through our learning framework achieve almost the same accuracies as the original full-precision models. Moreover, when our learning framework is used to train our n-BQ-NN from scratch, it achieves state-of-the-art results compared with typical low-precision QNNs. Experiments on the Xilinx ZCU102 platform show that our n-BQ-NN with the SVPE array executes 2.9 times faster in inference than with the vector processing element (VPE) array. Since the SHIFT operation in our SVPE array does not consume Digital Signal Processing (DSP) resources on FPGAs, the experiments also show that using the SVPE array reduces average energy consumption to 68.7% of that of the VPE array.
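The abstract does not spell out the quantization rule or the form of the reconstructed gradient, but a minimal numpy sketch of the power-of-two weight constraint it describes, and of the shift/multiply equivalence that the SVPE array exploits, might look like the following. The function name, exponent range and rounding rule below are illustrative assumptions, not the paper's implementation.

import numpy as np

def quantize_pow2(w, n_bits=4):
    """Illustrative power-of-two quantizer (assumed form, not the paper's exact rule):
    each weight is mapped to sign(w) * 2**e with a rounded, clipped exponent."""
    sign = np.sign(w)
    mag = np.maximum(np.abs(w), 1e-12)           # avoid log2(0)
    e = np.rint(np.log2(mag)).astype(np.int32)   # nearest power-of-two exponent
    # Hypothetical exponent range for an n-bit code word (assumption).
    e = np.clip(e, -(2 ** (n_bits - 1)), 2 ** (n_bits - 1) - 1)
    return sign * np.exp2(e.astype(np.float64)), e, sign

# Why powers of two help on FPGAs: multiplying an integer activation by 2**e
# is just a left shift, so a SHIFT-based processing element can stand in for
# a 16-bit multiplier in the convolution's multiply-accumulate.
a, e = 13, 3
assert a * (2 ** e) == (a << e)   # 13 * 8 == 13 << 3 == 104

The sketch only covers the forward-pass constraint; the paper's reconstructed gradient function, which the abstract says replaces the usual approximate gradient in back-propagation, is not reproduced here because its closed form is not given in this excerpt.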



Related research

09/22/2016
Quantized Neural Networks: Training Neural Networks with Low Precision Weights and Activations
We introduce a method to train Quantized Neural Networks (QNNs) --- neur...

12/04/2019
RTN: Reparameterized Ternary Network
To deploy deep neural networks on resource-limited devices, quantization...

01/04/2019
Dataflow-based Joint Quantization of Weights and Activations for Deep Neural Networks
This paper addresses a challenging problem - how to reduce energy consum...

10/01/2018
ProxQuant: Quantized Neural Networks via Proximal Operators
To make deep neural networks feasible in resource-constrained environmen...

02/19/2019
Towards Hardware Implementation of Neural Network-based Communication Algorithms
There is a recent interest in neural network (NN)-based communication al...

09/10/2020
QuantNet: Learning to Quantize by Learning within Fully Differentiable Framework
Despite the achievements of recent binarization methods on reducing the ...

04/02/2021
Network Quantization with Element-wise Gradient Scaling
Network quantization aims at reducing bit-widths of weights and/or activ...
