BP-Im2col: Implicit Im2col Supporting AI Backpropagation on Systolic Arrays

09/20/2022
by Jianchao Yang, et al.

State-of-the-art systolic array-based accelerators adopt the traditional im2col algorithm to accelerate the inference of convolutional layers. However, traditional im2col cannot efficiently support AI backpropagation. Backpropagation in convolutional layers involves transposed convolution and dilated convolution, which usually introduce large zero-spaces into the feature map or kernel. Reorganizing this zero-space data interferes with the continuity of training and incurs additional, non-negligible overhead in off-chip and on-chip storage, memory access, and performance. Since countermeasures for backpropagation are rarely proposed, we propose BP-im2col, a novel im2col algorithm for AI backpropagation, and implement it in RTL on a TPU-like accelerator. Experiments on a TPU-like accelerator indicate that BP-im2col reduces the backpropagation runtime by 34.9% [...] buffers by at least 22.7% [...] adopting the traditional im2col. It further reduces the additional storage overhead in the backpropagation process by at least 74.78%.
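
To make the zero-space problem concrete, the following is a minimal pure-Python sketch of the explicit data reorganization that the abstract describes, not the BP-im2col algorithm itself. It assumes a hypothetical layer (3x3 kernel, stride 2, no forward padding, 2x2 loss gradient); the helper names `zero_insert`, `zero_pad`, and `zero_fraction` are illustrative and do not come from the paper. For a strided convolution, the input gradient is commonly obtained by inserting zeros between the elements of the loss gradient, padding the result, and convolving it with the rotated kernel; the zero-inserted gradient also plays the role of a dilated kernel when computing the weight gradient.

```python
def zero_insert(grad, stride):
    """Insert (stride - 1) zeros between neighbouring elements of a 2-D map."""
    h, w = len(grad), len(grad[0])
    out_h = (h - 1) * stride + 1
    out_w = (w - 1) * stride + 1
    out = [[0.0] * out_w for _ in range(out_h)]
    for i in range(h):
        for j in range(w):
            out[i * stride][j * stride] = grad[i][j]
    return out


def zero_pad(fmap, pad):
    """Surround a 2-D map with `pad` rows and columns of zeros."""
    width = len(fmap[0]) + 2 * pad
    rows = [[0.0] * pad + list(r) + [0.0] * pad for r in fmap]
    top = [[0.0] * width for _ in range(pad)]
    bottom = [[0.0] * width for _ in range(pad)]
    return top + rows + bottom


def zero_fraction(fmap):
    """Fraction of elements in a 2-D map that are exactly zero."""
    total = sum(len(r) for r in fmap)
    zeros = sum(1 for r in fmap for v in r if v == 0.0)
    return zeros / total


# Hypothetical layer parameters (assumptions for illustration only).
stride, kernel = 2, 3
loss_grad = [[1.0, 2.0],
             [3.0, 4.0]]

# Zero-inserted gradient: used as a dilated kernel for the weight gradient.
dilated = zero_insert(loss_grad, stride)

# Zero-inserted and padded gradient: the input of the transposed convolution
# that produces the gradient with respect to the layer input.
expanded = zero_pad(dilated, kernel - 1)

print(f"dilated:  {len(dilated)}x{len(dilated[0])}, "
      f"zeros = {zero_fraction(dilated):.0%}")
print(f"expanded: {len(expanded)}x{len(expanded[0])}, "
      f"zeros = {zero_fraction(expanded):.0%}")
```

In this toy case the reorganized maps consist mostly of zeros, which is the kind of storage and bandwidth the abstract counts as overhead when the maps are materialized explicitly.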


research
02/19/2021

BPLight-CNN: A Photonics-based Backpropagation Accelerator for Deep Learning

Training deep learning networks involves continuous weight updates acros...
research
10/05/2018

Interpretable Convolutional Neural Networks via Feedforward Design

The model parameters of convolutional neural networks (CNNs) are determi...
research
12/29/2019

Pipelined Training with Stale Weights of Deep Convolutional Neural Networks

The growth in the complexity of Convolutional Neural Networks (CNNs) is ...
research
09/03/2023

FedFwd: Federated Learning without Backpropagation

In federated learning (FL), clients with limited resources can disrupt t...
research
12/14/2022

Directional Direct Feedback Alignment: Estimating Backpropagation Paths for Efficient Learning on Neural Processors

The error Backpropagation algorithm (BP) is a key method for training de...
research
06/13/2021

Low-memory stochastic backpropagation with multi-channel randomized trace estimation

Thanks to the combination of state-of-the-art accelerators and highly op...
research
09/14/2018

Non-iterative recomputation of dense layers for performance improvement of DCNN

An iterative method of learning has become a paradigm for training deep ...
