Accelerating Training of Deep Neural Networks via Sparse Edge Processing

11/03/2017
by Sourya Dey, et al.

We propose a reconfigurable hardware architecture for deep neural networks (DNNs) capable of online training and inference, which uses algorithmically pre-determined, structured sparsity to significantly lower memory and computational requirements. This novel architecture introduces the notion of edge-processing to provide flexibility and combines junction pipelining and operational parallelization to speed up training. The overall effect is to reduce network complexity by factors up to 30x and training time by up to 35x relative to GPUs, while maintaining high fidelity of inference results. This has the potential to enable extensive parameter searches and development of the largely unexplored theoretical foundation of DNNs. The architecture automatically adapts itself to different network sizes given available hardware resources. As proof of concept, we show results obtained for different bit widths.
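The algorithmically pre-determined, structured sparsity described above can be sketched as a layer in which every output neuron is wired to a small, fixed number of inputs chosen by a deterministic rule, so the multiply count and weight storage shrink by the fan-in reduction factor. The interleaved connectivity pattern, sizes, and variable names below are illustrative assumptions, not the paper's exact scheme.

```python
import numpy as np

rng = np.random.default_rng(0)

n_in, n_out, fanin = 64, 32, 8   # each output neuron keeps only 8 of 64 inputs

# Pre-determined connectivity: fix, per output neuron, which inputs it sees.
# (Hypothetical interleaved pattern; the paper's actual rule may differ.)
conn = np.stack([(np.arange(fanin) * (n_in // fanin) + j) % n_in
                 for j in range(n_out)])          # shape (n_out, fanin)

W = rng.standard_normal((n_out, fanin))           # only fanin weights per neuron
x = rng.standard_normal(n_in)

# Sparse forward pass: n_out * fanin multiplies instead of n_out * n_in.
y = np.einsum('of,of->o', W, x[conn])

# Dense equivalent, to check the sparse pass computes the same thing.
W_dense = np.zeros((n_out, n_in))
W_dense[np.arange(n_out)[:, None], conn] = W
assert np.allclose(y, W_dense @ x)

print(f"density: {fanin / n_in:.3f}  ({n_in // fanin}x fewer weights and multiplies)")
```

Because the pattern is fixed before training, the hardware never needs to store or traverse index lists at run time, which is what makes the memory and compute savings predictable.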


Related research

05/31/2018 - A Highly Parallel FPGA Implementation of Sparse Neural Network Training
We demonstrate an FPGA implementation of a parallel and reconfigurable a...

02/10/2021 - Hybrid In-memory Computing Architecture for the Training of Deep Neural Networks
The cost involved in training deep neural networks (DNNs) on von-Neumann...

08/20/2019 - Efficient Deep Neural Networks
The success of deep neural networks (DNNs) is attributable to three fact...

03/25/2021 - Enabling Incremental Training with Forward Pass for Edge Devices
Deep Neural Networks (DNNs) are commonly deployed on end devices that ex...

07/14/2020 - Analyzing and Mitigating Data Stalls in DNN Training
Training Deep Neural Networks (DNNs) is resource-intensive and time-cons...

06/29/2021 - NEUKONFIG: Reducing Edge Service Downtime When Repartitioning DNNs
Deep Neural Networks (DNNs) may be partitioned across the edge and the c...

06/01/2017 - CATERPILLAR: Coarse Grain Reconfigurable Architecture for Accelerating the Training of Deep Neural Networks
Accelerating the inference of a trained DNN is a well studied subject. I...
