Backprop with Approximate Activations for Memory-efficient Network Training

01/23/2019
by   Ayan Chakrabarti, et al.

Larger and deeper neural network architectures deliver improved accuracy on a variety of tasks, but also require a large amount of memory during training to store intermediate activations for back-propagation. We introduce an approximation strategy that significantly reduces this memory footprint, with minimal effect on training performance and negligible computational cost. Our method replaces each layer's intermediate activations with lower-precision approximations to free up memory, but only after the full-precision versions have been used for computation in subsequent layers of the forward pass. Only these approximate activations are retained for use in the backward pass. Compared to naive low-precision computation, our approach limits the accumulation of errors across layers and allows the use of much lower-precision approximations without affecting training accuracy. Experiments on CIFAR and ImageNet show that our method yields performance comparable to full-precision training while storing activations at a fraction of the memory cost, with 8-bit and even 4-bit fixed-point precision.
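To make the idea concrete, here is a minimal PyTorch sketch, not the authors' implementation: a custom autograd function whose forward pass returns the full-precision ReLU output for use by the next layer, but saves only a fixed-point code of that activation for the backward pass. The class name QuantizedSaveReLU, the per-tensor scaling, the default 8-bit width, and the choice of ReLU as the example layer are all illustrative assumptions.

```python
import torch

class QuantizedSaveReLU(torch.autograd.Function):
    """ReLU that saves only a low-precision code of its activation for backward.

    Illustrative sketch of the approximation strategy described above; the
    quantization scheme and bit width are assumptions, not the paper's exact method.
    """

    @staticmethod
    def forward(ctx, x, bits=8):
        y = x.clamp(min=0)                       # full-precision output, passed on to the next layer
        scale = y.max().clamp(min=1e-8)          # per-tensor scale (an assumed, simple choice)
        levels = 2 ** bits - 1
        q = torch.round(y / scale * levels)      # fixed-point code in [0, levels]
        # Only the low-precision code is retained; for 8 bits this is a uint8
        # tensor, one quarter the size of a float32 activation.
        ctx.save_for_backward(q.to(torch.uint8) if bits <= 8 else q)
        ctx.scale = scale
        ctx.levels = levels
        return y

    @staticmethod
    def backward(ctx, grad_out):
        (q,) = ctx.saved_tensors
        # Reconstruct an approximate activation from the saved code.
        y_approx = q.to(grad_out.dtype) / ctx.levels * ctx.scale
        # ReLU gradient computed from the approximate activation.
        grad_in = grad_out * (y_approx > 0).to(grad_out.dtype)
        return grad_in, None


if __name__ == "__main__":
    x = torch.randn(4, 16, requires_grad=True)
    y = QuantizedSaveReLU.apply(x, 8)
    y.sum().backward()
    print(x.grad.shape)
```

With 8-bit storage the saved tensor occupies a quarter of the memory of a float32 activation; the key point of the approach is that the quantization happens only after the exact activation has already been consumed by the forward pass, so approximation error enters only through the backward computation.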

Related research

11/22/2021  Mesa: A Memory-saving Training Framework for Transformers
12/08/2022  TinyKG: Memory-Efficient Training Framework for Knowledge Graph Neural Recommender Systems
02/08/2021  Enabling Binary Neural Network Training on the Edge
05/24/2019  Fully Hyperbolic Convolutional Neural Networks
05/24/2019  Magnetoresistive RAM for error resilient XNOR-Nets
09/04/2017  WRPN: Wide Reduced-Precision Networks
11/08/2016  A backward pass through a CNN using a generative model of its activations
