Pre-Defined Sparse Neural Networks with Hardware Acceleration

12/04/2018
by Sourya Dey, et al.

Neural networks have proven to be extremely powerful tools for modern artificial intelligence applications, but computational and storage complexity remain limiting factors. This paper presents two compatible contributions towards reducing the time, energy, computational, and storage complexities associated with multilayer perceptrons. The first, pre-defined sparsity, reduces complexity during both training and inference, regardless of the implementation platform. Our results show that storage and computational complexity can be reduced by factors greater than 5X without significant performance loss. The second contribution is an architecture for hardware acceleration that is compatible with pre-defined sparsity. This architecture supports both training and inference modes and is flexible in the sense that it is not tied to a specific number of neurons; for example, neural networks of various sizes can be supported on field programmable gate arrays (FPGAs) of various sizes.
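
The core idea of pre-defined sparsity is that each layer's connectivity pattern is fixed before training begins and never changes, so both the forward and backward passes touch only the retained connections. The following is a minimal sketch of that idea, assuming a PyTorch-style implementation with a fixed binary mask per layer; the class name PreDefinedSparseLinear, the density parameter, and the random mask generation are illustrative assumptions, not the specific connectivity patterns or FPGA datapath described in the paper.

# Minimal sketch (assumption: PyTorch; not the authors' implementation).
# Each layer draws a fixed binary connectivity mask once, before training,
# and applies it to the weight matrix on every forward pass. Masked-out
# weights contribute nothing and receive zero gradient, so the
# connectivity pattern stays fixed through training and inference.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PreDefinedSparseLinear(nn.Module):  # illustrative name
    def __init__(self, in_features, out_features, density=0.2, seed=0):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features)
        g = torch.Generator().manual_seed(seed)
        # Keep roughly `density` of the connections; the pattern is fixed.
        mask = torch.rand(out_features, in_features, generator=g) < density
        self.register_buffer("mask", mask.float())  # buffer, not trainable

    def forward(self, x):
        return F.linear(x, self.linear.weight * self.mask, self.linear.bias)

# Example: an MLP for 784-dimensional inputs keeping ~20% of connections,
# in line with the >5X reduction in storage mentioned in the abstract.
model = nn.Sequential(
    PreDefinedSparseLinear(784, 256, density=0.2),
    nn.ReLU(),
    PreDefinedSparseLinear(256, 10, density=0.2),
)
out = model(torch.randn(32, 784))
print(out.shape)  # torch.Size([32, 10])

Because the connectivity is known in advance, a hardware implementation can allocate memory and arithmetic resources only for the retained connections, which is what makes this form of sparsity compatible with the accelerator architecture described in the paper.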

research
09/09/2019

Unrolling Ternary Neural Networks

The computational complexity of neural networks for large scale or real-...
research
05/31/2018

A Highly Parallel FPGA Implementation of Sparse Neural Network Training

We demonstrate an FPGA implementation of a parallel and reconfigurable a...
research
11/06/2017

A General Neural Network Hardware Architecture on FPGA

Field Programmable Gate Arrays (FPGAs) play an increasingly important r...
research
03/25/2020

Overview of the IBM Neural Computer Architecture

The IBM Neural Computer (INC) is a highly flexible, re-configurable para...
research
02/17/2020

STANNIS: Low-Power Acceleration of Deep Neural Network Training Using Computational Storage

This paper proposes a framework for distributed, in-storage training of ...
research
11/03/2020

CUTIE: Beyond PetaOp/s/W Ternary DNN Inference Acceleration with Better-than-Binary Energy Efficiency

We present a 3.1 POp/s/W fully digital hardware accelerator for ternary ...
