Mixture-of-Rookies: Saving DNN Computations by Predicting ReLU Outputs

02/10/2022
by Dennis Pinto, et al.

Deep Neural Networks (DNNs) are widely used in many application domains. However, they require a vast number of computations and memory accesses to deliver outstanding accuracy. In this paper, we propose a scheme to predict whether the output of each ReLU-activated neuron will be zero or a positive number, in order to skip the computation of those neurons that will likely output zero. Our predictor, named Mixture-of-Rookies, combines two inexpensive components. The first exploits the high linear correlation between binarized (1-bit) and full-precision (8-bit) dot products, whereas the second clusters together neurons that tend to output zero at the same time. We propose a novel clustering scheme based on the analysis of angles, as the sign of the dot product of two vectors depends on the cosine of the angle between them. We implement our hybrid zero-output predictor on top of a state-of-the-art DNN accelerator. Experimental results show that our scheme introduces a small area overhead of 5.3% while reducing energy consumption by 16.5%.
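The two ideas in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes sign-based binarization of both weights and inputs, random signed 8-bit vectors, and a fixed decision threshold of zero. The helper names (`binarize`, `predict_zero`, `agreement_rate`) are hypothetical and chosen only for illustration.

```python
import math
import random


def binarize(vec):
    """Map each element to +1 or -1 by sign (assumption: zero maps to +1)."""
    return [1 if v >= 0 else -1 for v in vec]


def dot(a, b):
    """Plain integer dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def predict_zero(weights, inputs, threshold=0):
    """Guess that ReLU(w . x) is zero when the 1-bit dot product is <= threshold."""
    return dot(binarize(weights), binarize(inputs)) <= threshold


def cosine(a, b):
    """Cosine of the angle between a and b; it has the same sign as dot(a, b)."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


def agreement_rate(trials=2000, dim=64, seed=0):
    """Fraction of random 8-bit vector pairs where the cheap guess matches ReLU."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        w = [rng.randint(-128, 127) for _ in range(dim)]
        x = [rng.randint(-128, 127) for _ in range(dim)]
        actual_zero = max(0, dot(w, x)) == 0  # ReLU output is zero
        hits += predict_zero(w, x) == actual_zero
    return hits / trials


if __name__ == "__main__":
    print("agreement with full-precision ReLU:", agreement_rate())
```

On random data the 1-bit guess agrees with the full-precision result well above chance but is far from perfect, which is consistent with the abstract pairing it with a second, clustering-based component. The `cosine` helper only demonstrates the sign relationship that motivates the angle-based clustering; it does not implement that clustering.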


research
10/20/2016

Bit-pragmatic Deep Neural Network Computing

We quantify a source of ineffectual computations when processing the mul...
research
06/28/2023

ReDy: A Novel ReRAM-centric Dynamic Quantization Approach for Energy-efficient CNN Inference

The primary operation in DNNs is the dot product of quantized input acti...
research
11/23/2019

Training Modern Deep Neural Networks for Memory-Fault Robustness

Because deep neural networks (DNNs) rely on a large number of parameters...
research
03/15/2022

Energy-efficient Dense DNN Acceleration with Signed Bit-slice Architecture

As the number of deep neural networks (DNNs) to be executed on a mobile ...
research
06/15/2022

Linearity Grafting: Relaxed Neuron Pruning Helps Certifiable Robustness

Certifiable robustness is a highly desirable property for adopting deep ...
research
02/14/2022

Saving RNN Computations with a Neuron-Level Fuzzy Memoization Scheme

Recurrent Neural Networks (RNNs) are a key technology for applications s...
research
12/11/2019

DGEMM performance is data-dependent

The DGEMM function is a widely used implementation of the matrix product...
