Accelerating Deep Convolutional Networks using low-precision and sparsity

10/02/2016
by Ganesh Venkatesh, et al.

We explore techniques to significantly improve the compute efficiency and performance of Deep Convolutional Networks without impacting their accuracy. To improve compute efficiency, we focus on achieving high accuracy with extremely low-precision (2-bit) weight networks, and to accelerate execution time, we aggressively skip operations on zero values. We achieve the highest reported accuracy of 76.6% on the ImageNet classification challenge with a low-precision network [github release of the source code coming soon], while reducing the compute requirement by 3x compared to a full-precision network that achieves similar accuracy. Furthermore, to fully exploit the benefits of our low-precision networks, we build a deep learning accelerator core, dLAC, that can achieve up to 1 TFLOP/mm^2 equivalent for single-precision floating-point operations (2 TFLOP/mm^2 for half-precision).
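The two ideas in the abstract, 2-bit weights and skipping operations on zero values, can be illustrated with a minimal sketch. This is not the paper's exact quantization scheme; the threshold rule and per-tensor scale below are assumptions, in the style of common ternary-weight approaches, and `quantize_2bit` / `sparse_dot` are hypothetical helper names.

```python
import numpy as np

def quantize_2bit(w):
    """Quantize weights to the ternary set {-1, 0, +1} times one scale.
    Sketch only: threshold at half the mean absolute weight (an assumed
    rule, not necessarily the paper's), which also induces sparsity."""
    t = 0.5 * np.abs(w).mean()            # sparsifying threshold (assumption)
    q = np.zeros_like(w)
    q[w > t] = 1.0
    q[w < -t] = -1.0
    mask = q != 0
    scale = np.abs(w[mask]).mean() if mask.any() else 1.0
    return q, scale

def sparse_dot(q, scale, x):
    """Multiply-accumulate that skips zero-valued quantized weights,
    mirroring the zero-skipping idea: work is proportional to the
    number of non-zero weights, not the full vector length."""
    nz = np.nonzero(q)[0]                 # indices of surviving weights
    return scale * np.sum(q[nz] * x[nz])

w = np.array([0.9, -0.05, 0.4, -0.7, 0.02])
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
q, s = quantize_2bit(w)
print(q, s, sparse_dot(q, s, x))
```

Because the small weights are snapped to zero, the inner loop touches only the non-zero entries; an accelerator like dLAC exploits the same structure in hardware.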


