Low Complexity Multiply Accumulate Unit for Weight-Sharing Convolutional Neural Networks

08/30/2016
by James Garland, et al.

Convolutional Neural Networks (CNNs) are one of the most successful deep machine learning technologies for processing image, voice and video data. CNNs require large amounts of processing capacity and memory, which can exceed the resources of low-power mobile and embedded systems. Several hardware accelerator designs have been proposed for CNNs, and these typically contain large numbers of Multiply Accumulate (MAC) units. One approach to reducing data sizes and memory traffic in CNN accelerators is "weight sharing", where the full range of weight values in a trained CNN is grouped into bins and the bin index is stored instead of the original weight value. In this paper we propose a novel MAC circuit that exploits binning in weight-sharing CNNs. Rather than computing the MAC directly, we first count the frequency of each weight and place it in a bin, and then compute the accumulated value in a subsequent multiply phase. This allows hardware multipliers in the MAC circuit to be replaced with adders and selection logic. Experiments show that for the same clock speed our approach results in fewer gates, smaller logic, and reduced power.
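
The two-phase idea in the abstract can be illustrated in software. The sketch below is only one reading of it, not the authors' circuit: it assumes each weight is stored as an index into a small shared codebook, that inputs sharing a bin index are summed into a per-bin accumulator, and that a single multiply per bin happens afterwards. The names `direct_mac`, `binned_mac`, `codebook` and `bin_idx` are hypothetical, and the integer values are made up for illustration.

```python
def direct_mac(inputs, bin_idx, codebook):
    """Conventional MAC: one multiply per input."""
    total = 0
    for x, b in zip(inputs, bin_idx):
        total += x * codebook[b]
    return total


def binned_mac(inputs, bin_idx, codebook):
    """Two-phase MAC (assumed interpretation of the abstract).

    Phase 1 uses only additions and selection by bin index.
    Phase 2 performs one multiply per bin, not one per input.
    """
    bins = [0] * len(codebook)
    # Phase 1: accumulate each input into the bin selected by its weight index.
    for x, b in zip(inputs, bin_idx):
        bins[b] += x
    # Phase 2: multiply each bin total by its shared weight and sum.
    return sum(acc * w for acc, w in zip(bins, codebook))


# Illustrative data: 4 shared weights, 6 inputs.
codebook = [-3, 1, 4, 7]
inputs = [2, 5, 1, 3, 2, 6]
bin_idx = [0, 2, 2, 1, 0, 3]

print(direct_mac(inputs, bin_idx, codebook))  # 57
print(binned_mac(inputs, bin_idx, codebook))  # 57
```

Both functions return the same value because multiplication distributes over addition. In this sketch the number of multiplies drops from the number of inputs to the number of bins, which is the property the abstract credits for replacing multipliers with adders and selection logic.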

research
01/30/2018

Low Complexity Multiply-Accumulate Units for Convolutional Neural Networks with Weight-Sharing

Convolutional neural networks (CNNs) are one of the most successful mach...
research
06/05/2017

NullHop: A Flexible Convolutional Neural Network Accelerator Based on Sparse Representations of Feature Maps

Convolutional neural networks (CNNs) have become the dominant neural net...
research
02/02/2021

Fast Exploration of Weight Sharing Opportunities for CNN Compression

The computational workload involved in Convolutional Neural Networks (CN...
research
03/05/2018

Hyperdrive: A Multi-Chip Systolically Scalable Binary-Weight CNN Inference Engine

Deep neural networks have achieved impressive results in computer vision...
research
11/20/2017

Tactics to Directly Map CNN graphs on Embedded FPGAs

Deep Convolutional Neural Networks (CNNs) are the state-of-the-art in im...
research
06/18/2020

Caffe Barista: Brewing Caffe with FPGAs in the Training Loop

As the complexity of deep learning (DL) models increases, their compute ...
research
09/12/2019

A Camera That CNNs: Towards Embedded Neural Networks on Pixel Processor Arrays

We present a convolutional neural network implementation for pixel proce...
