HadaNets: Flexible Quantization Strategies for Neural Networks

05/26/2019
by Yash Akhauri, et al.

On-board processing elements on UAVs are currently inadequate for the training and inference of deep neural networks, largely due to the energy cost of memory accesses in such networks. HadaNets introduce a flexible train-from-scratch tensor quantization scheme that pairs a full-precision tensor with a binary tensor via a Hadamard (element-wise) product. Unlike wider reduced-precision neural network models, we preserve the train-time parameter count, thus outperforming XNOR-Nets without a train-time memory penalty. Such training routines could see great utility in semi-supervised online learning tasks. Our method also offers advantages in model compression: we reduce the model size of ResNet-18 by 7.43x relative to a full-precision model without utilizing any other compression techniques. We also demonstrate a 'Hadamard Binary Matrix Multiply' kernel, which delivers a 10-fold speedup over full-precision matrix multiplication with a similarly optimized kernel.
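The abstract does not spell out how the two tensors are paired, so the following is a minimal NumPy sketch of one plausible reading: the binary tensor is the sign of the weights, and the full-precision tensor carries one scale per block of `b` weights, broadcast elementwise. The block size and the mean-of-absolute-values scaling are assumptions here, not details taken from the paper.

```python
import numpy as np

def hadamard_quantize(W, b=8):
    """Approximate W by H * B (a Hadamard, i.e. element-wise, product):
    B = sign(W) is binary, and H holds one full-precision scale per
    block of b weights.  Block size and mean-of-|w| scaling are
    illustrative assumptions, not details from the paper."""
    flat = W.reshape(-1, b)                  # group weights into blocks of b
    B = np.sign(flat)
    B[B == 0] = 1.0                          # keep the binary tensor in {-1, +1}
    scale = np.abs(flat).mean(axis=1, keepdims=True)  # one fp32 scale per block
    H = np.broadcast_to(scale, flat.shape)   # full-precision tensor, same shape as B
    return (H * B).reshape(W.shape)

W = np.random.randn(64, 64).astype(np.float32)
W_q = hadamard_quantize(W, b=8)
print("mean abs quantization error:", np.abs(W - W_q).mean())
```

With b = 1 this degenerates to a full-precision model and with one scale per output channel it approaches the XNOR-Net scheme, which is one way to read the "flexible" in the title.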
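As a back-of-envelope check on the 7.43x figure, assume the storage model from the sketch above: 1 bit per weight for the binary tensor plus one 32-bit scale per block of b weights (again an assumption, not the paper's stated layout).

```latex
\text{bits per weight} \approx \underbrace{1}_{B} + \underbrace{\tfrac{32}{b}}_{H},
\qquad
\text{compression ratio} = \frac{32}{1 + 32/b}
```

Setting the ratio to 7.43 gives about 32/7.43 ≈ 4.31 bits per weight, roughly consistent with a block size near 10, or with a larger block combined with some layers kept at full precision.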
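The claimed kernel speedup rests on the standard observation that a dot product of ±1 vectors, once packed into machine words, reduces to XNOR and popcount. Below is a sketch of that core identity in NumPy; the packing layout is illustrative, and this is not the paper's actual 'Hadamard Binary Matrix Multiply' kernel.

```python
import numpy as np

def pack_pm1(x):
    """Pack a +/-1 vector into bytes, one bit per element (+1 -> 1, -1 -> 0)."""
    return np.packbits((x > 0).astype(np.uint8))

def binary_dot(xp, yp, n):
    """Dot product of two +/-1 vectors of length n from their packed forms:
    matches = popcount(XNOR(x, y)); dot = matches - mismatches = 2*matches - n."""
    xnor = np.bitwise_not(np.bitwise_xor(xp, yp))
    matches = int(np.unpackbits(xnor)[:n].sum())   # popcount over the n valid bits
    return 2 * matches - n

n = 1024
x = np.sign(np.random.randn(n)); x[x == 0] = 1
y = np.sign(np.random.randn(n)); y[y == 0] = 1
assert binary_dot(pack_pm1(x), pack_pm1(y), n) == int(x @ y)
```

On real hardware the popcount is a single instruction over 64-bit words, which is where a roughly 10-fold speedup over fp32 multiplication can come from; the full-precision Hadamard factors would then scale the integer accumulators afterwards.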

