Pyramid Vector Quantization and Bit Level Sparsity in Weights for Efficient Neural Networks Inference

11/24/2019
by Vincenzo Liguori, et al.

This paper discusses three basic blocks for the inference of convolutional neural networks (CNNs). Pyramid Vector Quantization (PVQ) is discussed as an effective quantizer for CNN weights, resulting in highly sparse and compressible networks. Properties of PVQ are exploited to eliminate multipliers during inference while maintaining high performance. The result is then extended to any other quantized weights. The Tiny Yolo v3 CNN is used to compare these basic blocks.
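To make the PVQ idea concrete, below is a minimal sketch of a gain-shape PVQ encoder for a block of weights, assuming the standard pyramid codebook of integer vectors whose absolute values sum to a pulse budget K. The function names, the greedy pulse-distribution step, and the per-block gain are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

def pvq_quantize(w, K):
    """Greedy PVQ encoding sketch: map a real vector w to an integer vector y
    with sum(|y|) == K (the pyramid codebook). Illustrative only."""
    w = np.asarray(w, dtype=np.float64)
    l1 = np.sum(np.abs(w))
    if l1 == 0.0:
        y = np.zeros(w.shape, dtype=np.int64)
        y[0] = K                                # degenerate all-zero input
        return y
    scaled = w * (K / l1)                       # project onto the pyramid surface
    y = np.trunc(scaled).astype(np.int64)       # round toward zero: sum(|y|) <= K
    frac = np.abs(scaled) - np.abs(y)           # fractional pulses left over
    remaining = K - int(np.sum(np.abs(y)))
    # Distribute the remaining unit pulses to the largest fractional parts.
    for i in np.argsort(-frac)[:remaining]:
        y[i] += 1 if scaled[i] >= 0 else -1
    return y

def pvq_dequantize(y, gain):
    """Reconstruct approximate weights by rescaling the integer codeword
    with a per-block gain (assumed stored alongside the codeword)."""
    y = np.asarray(y, dtype=np.float64)
    return gain * y / max(np.sum(np.abs(y)), 1.0)
```

In this gain-shape view, only the integer codeword and a single gain per block need to be stored; for modest K relative to the block size, most entries of the codeword are zero, which is the source of the sparsity and compressibility the abstract refers to.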

