Laconic Deep Learning Computing

05/10/2018
by Sayeh Sharify et al.

We motivate a method for transparently identifying ineffectual computations in unmodified Deep Learning models, without affecting accuracy. Specifically, we show that if multiplications are decomposed down to the bit level, the amount of work performed during inference for image classification models can be consistently reduced by two orders of magnitude. In the best case studied, a sparse variant of AlexNet, this approach can ideally reduce computation work by more than 500x. We present Laconic, a hardware accelerator that implements this approach to improve execution time and energy efficiency for inference with Deep Learning Networks. Laconic judiciously gives up some of the work-reduction potential to yield a low-cost, simple, and energy-efficient design that outperforms other state-of-the-art accelerators. For example, a Laconic configuration that uses a weight memory interface with just 128 wires outperforms a conventional accelerator with a 2K-wire weight memory interface by 2.3x on average while being 2.13x more energy efficient on average. A Laconic configuration that uses a 1K-wire weight memory interface outperforms the 2K-wire conventional accelerator by 15.4x and is 1.95x more energy efficient. Laconic does not require, but rewards, advances in model design such as a reduction in precision, the use of alternate numeric representations that reduce the number of bits that are "1", or an increase in weight or activation sparsity.
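To make the bit-level decomposition concrete, here is a minimal Python sketch. It is not the Laconic hardware design, and names such as effectual_work and multiply_by_terms are illustrative: it computes a multiplication term by term, one shifted addend per "1" bit of the multiplier, and compares the resulting bit-level work against a conventional 16x16-bit multiplier that processes every bit pair regardless of value.

```python
# Minimal sketch (not the Laconic hardware) of bit-level work counting.
# A conventional bits x bits multiplier always performs bits*bits single-bit
# products; decomposing both operands into their "1" bits shows that only
# ones(a) * ones(w) of those products can affect the result.

def ones(x: int) -> int:
    """Number of '1' bits (effectual terms) in a non-negative operand."""
    return bin(x).count("1")

def multiply_by_terms(a: int, w: int) -> int:
    """Compute a * w as a sum of shifted copies of a, one per '1' bit of w."""
    product = 0
    for i in range(w.bit_length()):
        if (w >> i) & 1:          # zero bits contribute nothing: skip them
            product += a << i
    return product

def conventional_work(bits: int = 16) -> int:
    """Single-bit products a conventional bits x bits multiplier performs."""
    return bits * bits

def effectual_work(a: int, w: int) -> int:
    """Single-bit products that can actually affect the result."""
    return ones(a) * ones(w)

a, w = 0b0000000000001010, 0b0000000000000101   # sparse 16-bit operands
assert multiply_by_terms(a, w) == a * w
print(f"conventional: {conventional_work()} bit products")   # 256
print(f"effectual:    {effectual_work(a, w)} bit products")  # 4 -> 64x less
```

When operands are mostly zero bits, the effectual count collapses, which is the work-reduction potential the abstract quantifies; Laconic trades away some of this potential for a simpler, lower-cost, more energy-efficient design.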

Related research

07/27/2017
Tartan: Accelerating Fully-Connected and Convolutional Layers in Deep Learning Networks by Exploiting Numerical Precision Variability
Tartan (TRT), a hardware accelerator for inference with Deep Neural Netw...

09/02/2019
SPRING: A Sparsity-Aware Reduced-Precision Monolithic 3D CNN Accelerator Architecture for Training and Inference
CNNs outperform traditional machine learning algorithms across a wide ra...

09/09/2021
SONIC: A Sparse Neural Network Inference Accelerator with Silicon Photonics for Energy-Efficient Deep Learning
Sparse neural networks can greatly facilitate the deployment of neural n...

06/28/2022
LiteCON: An All-Photonic Neuromorphic Accelerator for Energy-efficient Deep Learning (Preprint)
Deep learning is highly pervasive in today's data-intensive era. In part...

04/17/2018
DPRed: Making Typical Activation Values Matter In Deep Learning Computing
We show that selecting a fixed precision for all activations in Convolut...

11/14/2018
Tetris: Re-architecting Convolutional Neural Network Computation for Machine Learning Accelerators
Inference efficiency is the predominant consideration in designing deep ...

05/09/2018
A Memristor based Unsupervised Neuromorphic System Towards Fast and Energy-Efficient GAN
Deep Learning has gained immense success in pushing today's artificial i...
