
Accelerating Training of Deep Neural Networks via Sparse Edge Processing

by Sourya Dey et al.

We propose a reconfigurable hardware architecture for deep neural networks (DNNs) capable of online training and inference, which uses algorithmically pre-determined, structured sparsity to significantly lower memory and computational requirements. This novel architecture introduces the notion of edge-processing to provide flexibility and combines junction pipelining and operational parallelization to speed up training. The overall effect is to reduce network complexity by factors up to 30x and training time by up to 35x relative to GPUs, while maintaining high fidelity of inference results. This has the potential to enable extensive parameter searches and development of the largely unexplored theoretical foundation of DNNs. The architecture automatically adapts itself to different network sizes given available hardware resources. As proof of concept, we show results obtained for different bit widths.
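To make the idea of algorithmically pre-determined, structured sparsity concrete, here is a minimal NumPy sketch of a sparse layer in which every output neuron is wired to a fixed, pre-chosen number of inputs (fixed fan-in). The function names, the random fixed-fan-in wiring, and the specific sizes are illustrative assumptions, not the paper's actual connection scheme or hardware mapping; the point is only that the connectivity is decided before training, so memory and compute scale with the fan-in rather than with the full dense weight matrix.

```python
import numpy as np

def make_structured_sparse_layer(n_in, n_out, fan_in, rng):
    # Pre-determine connectivity: each output neuron connects to
    # exactly `fan_in` distinct inputs, chosen once before training.
    # (Illustrative random wiring; the paper uses its own structured pattern.)
    idx = np.stack([rng.choice(n_in, size=fan_in, replace=False)
                    for _ in range(n_out)])              # shape (n_out, fan_in)
    # Only fan_in weights per neuron are stored, not n_in.
    w = rng.standard_normal((n_out, fan_in)) / np.sqrt(fan_in)
    return idx, w

def forward(x, idx, w):
    # Sparse matrix-vector product: gather the connected inputs
    # and combine them with the stored weights only.
    return (w * x[idx]).sum(axis=1)

rng = np.random.default_rng(0)
n_in, n_out, fan_in = 1024, 256, 32   # density 32/1024 = ~3%
idx, w = make_structured_sparse_layer(n_in, n_out, fan_in, rng)
x = rng.standard_normal(n_in)
y = forward(x, idx, w)

# Storage ratio vs. a dense layer: (n_in * n_out) / (n_out * fan_in) = 32x
```

Because the pattern is fixed and regular (same fan-in everywhere), a hardware implementation can stream the index and weight arrays without the irregular lookups that unstructured sparsity would require.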


A Highly Parallel FPGA Implementation of Sparse Neural Network Training

We demonstrate an FPGA implementation of a parallel and reconfigurable a...

Hybrid In-memory Computing Architecture for the Training of Deep Neural Networks

The cost involved in training deep neural networks (DNNs) on von-Neumann...

Efficient Deep Neural Networks

The success of deep neural networks (DNNs) is attributable to three fact...

Enabling Incremental Training with Forward Pass for Edge Devices

Deep Neural Networks (DNNs) are commonly deployed on end devices that ex...

Analyzing and Mitigating Data Stalls in DNN Training

Training Deep Neural Networks (DNNs) is resource-intensive and time-cons...

NEUKONFIG: Reducing Edge Service Downtime When Repartitioning DNNs

Deep Neural Networks (DNNs) may be partitioned across the edge and the c...

CATERPILLAR: Coarse Grain Reconfigurable Architecture for Accelerating the Training of Deep Neural Networks

Accelerating the inference of a trained DNN is a well studied subject. I...