Loom: Exploiting Weight and Activation Precisions to Accelerate Convolutional Neural Networks

06/23/2017
by   Sayeh Sharify, et al.
Loom (LM), a hardware inference accelerator for Convolutional Neural Networks (CNNs), is presented. In LM, every bit of data precision that can be saved translates into a proportional performance gain. Specifically, for convolutional layers LM's execution time scales inversely with the precisions of both weights and activations, while for fully-connected layers it scales inversely with the precision of the weights alone. The LM accelerator targets area-constrained System-on-a-Chip designs, such as those found in mobile devices, that cannot afford the multi-megabyte buffers needed to store each layer on-chip during processing. Experiments on image classification CNNs show that, on average across all networks studied and assuming weights are supplied via a High Bandwidth Memory v2 (HBM2) interface, a configuration of LM outperforms a state-of-the-art bit-parallel accelerator [1] by 2.34x with no loss in accuracy while being 2.23x more energy efficient. Moreover, LM can trade off accuracy for further improvements in execution performance and energy efficiency.
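
To make the scaling claim concrete, the following is a minimal Python sketch of the ideal speedup model the abstract describes. The 16-bit width of the bit-parallel baseline, the function names, and the example precisions are illustrative assumptions, not values taken from the paper.

# Ideal speedup implied by the abstract's scaling model, relative to a
# bit-parallel baseline. BASELINE_BITS = 16 is an assumption for
# illustration; the paper excerpt does not state the baseline width.

BASELINE_BITS = 16  # assumed precision of the bit-parallel baseline


def ideal_speedup_conv(weight_bits: int, activation_bits: int) -> float:
    """Convolutional layers: execution time scales inversely with the
    product of weight and activation precisions."""
    return (BASELINE_BITS * BASELINE_BITS) / (weight_bits * activation_bits)


def ideal_speedup_fc(weight_bits: int) -> float:
    """Fully-connected layers: execution time scales inversely with the
    weight precision only."""
    return BASELINE_BITS / weight_bits


if __name__ == "__main__":
    # Hypothetical example: 8-bit weights and 10-bit activations.
    print(f"conv layer, 8w/10a: {ideal_speedup_conv(8, 10):.2f}x")  # 3.20x
    print(f"fc layer,   8w    : {ideal_speedup_fc(8):.2f}x")        # 2.00x

This is only the first-order model suggested by the abstract; measured gains (such as the reported 2.34x) also depend on memory bandwidth and other implementation factors.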


