LANCE: efficient low-precision quantized Winograd convolution for neural networks based on graphics processing units

03/19/2020
by   Guangli Li, et al.
0

Accelerating deep convolutional neural networks has become an active topic and sparked an interest in academia and industry. In this paper, we propose an efficient low-precision quantized Winograd convolution algorithm, called LANCE, which combines the advantages of fast convolution and quantization techniques. By embedding linear quantization operations into the Winograd-domain, the fast convolution can be performed efficiently under low-precision computation on graphics processing units. We test neural network models with LANCE on representative image classification datasets, including SVHN, CIFAR, and ImageNet. The experimental results show that our 8-bit quantized Winograd convolution improves the performance by up to 2.40x over the full-precision convolution with trivial accuracy loss.

READ FULL TEXT
research
08/17/2018

Joint Training of Low-Precision Neural Network with Quantization Interval Parameters

Optimization for low-precision neural network is an important technique ...
research
02/07/2020

Switchable Precision Neural Networks

Instantaneous and on demand accuracy-efficiency trade-off has been recen...
research
12/28/2021

HiKonv: High Throughput Quantized Convolution With Novel Bit-wise Management and Computation

Quantization for Convolutional Neural Network (CNN) has shown significan...
research
04/23/2020

Quantaized Winograd/Toom-Cook Convolution for DNNs: Beyond Canonical Polynomials Base

The problem how to speed up the convolution computations in Deep Neural ...
research
10/01/2019

NGEMM: Optimizing GEMM for Deep Learning via Compiler-based Techniques

Quantization has emerged to be an effective way to significantly boost t...
research
07/22/2022

HiKonv: Maximizing the Throughput of Quantized Convolution With Novel Bit-wise Management and Computation

Quantization for CNN has shown significant progress with the intention o...
research
01/07/2019

DSConv: Efficient Convolution Operator

We introduce a variation of the convolutional layer called DSConv (Distr...

Please sign up or login with your details

Forgot password? Click here to reset