Minimizing Area and Energy of Deep Learning Hardware Design Using Collective Low Precision and Structured Compression

04/19/2018
by   Shihui Yin, et al.

Deep learning algorithms have shown tremendous success in many recognition tasks; however, these algorithms typically include a deep neural network (DNN) structure and a large number of parameters, which makes it challenging to implement them on power/area-constrained embedded platforms. To reduce the network size, several studies have investigated compression by introducing element-wise or row-/column-/block-wise sparsity via pruning and regularization. In addition, many recent works have focused on reducing the precision of activations and weights, in some cases down to a single bit. However, combining various sparsity structures with binarized or very-low-precision (2-3 bit) neural networks has not been comprehensively explored. In this work, we present design techniques for minimum-area/-energy DNN hardware with minimal degradation in accuracy. During training, both binarization/low-precision and structured sparsity are applied as constraints to find the smallest memory footprint for a given deep learning algorithm. The DNN model for the CIFAR-10 dataset with a 50X weight memory reduction exhibits accuracy comparable to that of the floating-point counterpart. Area, performance, and energy results of DNN hardware in 40nm CMOS are reported for the MNIST dataset. The optimized DNN that combines 8X structured compression and 3-bit weight precision showed 98.4% accuracy.
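To illustrate how low-precision weights and structured sparsity can be applied together as training-time constraints, the sketch below shows a PyTorch-style fully connected layer that quantizes weights to 3 bits with a straight-through estimator, plus a row-wise group-lasso penalty that drives entire weight rows toward zero so they can be pruned as whole units. This is a minimal sketch under assumed layer sizes, initialization, and hyperparameters (784/256/10, lam=1e-4), not the authors' implementation or training setup.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class QuantizedLinear(nn.Module):
    """Fully connected layer whose weights are quantized to w_bits
    in the forward pass (straight-through estimator for gradients)."""
    def __init__(self, in_features, out_features, w_bits=3):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.05)
        self.bias = nn.Parameter(torch.zeros(out_features))
        self.w_bits = w_bits

    def quantize(self, w):
        # Symmetric uniform quantization onto a (2^w_bits - 1)-level grid in [-1, 1].
        levels = 2 ** (self.w_bits - 1) - 1
        w_t = torch.tanh(w)
        w_c = w_t / w_t.abs().max()                 # squash weights into [-1, 1]
        w_q = torch.round(w_c * levels) / levels    # snap to the low-bit grid
        return w_c + (w_q - w_c).detach()           # forward: w_q, backward: gradient of w_c

    def forward(self, x):
        return F.linear(x, self.quantize(self.weight), self.bias)

def row_group_lasso(model, lam=1e-4):
    """Group-lasso penalty over weight rows: rows pushed to zero can be
    removed entirely, yielding row-wise structured sparsity."""
    penalty = 0.0
    for m in model.modules():
        if isinstance(m, QuantizedLinear):
            penalty = penalty + m.weight.norm(p=2, dim=1).sum()
    return lam * penalty

# Single training step on random data (hypothetical MNIST-like shapes: 784 in, 10 classes).
model = nn.Sequential(QuantizedLinear(784, 256), nn.ReLU(), QuantizedLinear(256, 10))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x, y = torch.randn(32, 784), torch.randint(0, 10, (32,))
loss = F.cross_entropy(model(x), y) + row_group_lasso(model)
loss.backward()
opt.step()
```

After training, rows whose L2 norm falls below a small threshold could be pruned, and the remaining 3-bit weights stored compactly; the combination of the two constraints is what yields the collective memory-footprint reduction described in the abstract.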
