ActNN: Reducing Training Memory Footprint via 2-Bit Activation Compressed Training

04/29/2021
by   Jianfei Chen, et al.

The increasing size of neural network models has been critical for improvements in their accuracy, but device memory is not growing at the same rate. This creates fundamental challenges for training neural networks in memory-limited environments. In this work, we propose ActNN, a memory-efficient training framework that stores randomly quantized activations for backpropagation. We prove the convergence of ActNN for general network architectures, and we characterize the impact of quantization on convergence via an exact expression for the gradient variance. Using our theory, we propose novel mixed-precision quantization strategies that exploit the activation's heterogeneity across feature dimensions, samples, and layers. These techniques can be readily applied to existing dynamic graph frameworks, such as PyTorch, simply by substituting the layers. We evaluate ActNN on mainstream computer vision models for classification, detection, and segmentation tasks. On all these tasks, ActNN compresses the activation to 2 bits on average, with negligible accuracy loss. ActNN reduces the memory footprint of the activation by 12x, and it enables training with a 6.6x to 14x larger batch size.
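Since ActNN is described as a drop-in replacement for PyTorch layers, the sketch below illustrates the general idea behind activation compressed training: the forward pass stores a stochastically quantized 2-bit copy of the activation, and the backward pass dequantizes it before computing the gradient. This is a minimal illustration rather than the authors' implementation: the per-sample min/max scaling, the `CompressedReLU` example layer, and the unpacked `uint8` storage (a real implementation would pack four 2-bit values per byte) are simplifying assumptions, and ActNN's per-group, mixed-precision strategies are considerably more elaborate.

```python
# Minimal sketch of 2-bit activation compressed training (an assumption-laden
# illustration, not the ActNN codebase): quantize the saved activation in the
# forward pass, dequantize it in the backward pass.
import torch


def quantize_2bit(x):
    """Stochastically quantize x to 2 bits (4 levels) per element,
    with one (min, scale) pair per sample in the batch."""
    flat = x.reshape(x.shape[0], -1)                      # [N, D]
    mn = flat.min(dim=1, keepdim=True).values
    mx = flat.max(dim=1, keepdim=True).values
    scale = (mx - mn).clamp_min(1e-8) / 3.0               # 4 levels: 0..3
    normalized = (flat - mn) / scale
    # Unbiased stochastic rounding: round up with probability equal to the
    # fractional part, so the dequantized value equals x in expectation.
    noise = torch.rand_like(normalized)
    q = torch.clamp(torch.floor(normalized + noise), 0, 3).to(torch.uint8)
    return q, mn, scale, x.shape


def dequantize_2bit(q, mn, scale, shape):
    return (q.float() * scale + mn).reshape(shape)


class CompressedReLU(torch.autograd.Function):
    """Example layer that keeps only a 2-bit compressed activation for backward."""

    @staticmethod
    def forward(ctx, x):
        y = torch.relu(x)
        ctx.packed = quantize_2bit(y)                     # compressed context
        return y

    @staticmethod
    def backward(ctx, grad_out):
        y_hat = dequantize_2bit(*ctx.packed)              # approximate activation
        return grad_out * (y_hat > 0).float()             # ReLU gradient mask


if __name__ == "__main__":
    x = torch.randn(8, 16, requires_grad=True)
    CompressedReLU.apply(x).sum().backward()
    print(x.grad.shape)                                   # torch.Size([8, 16])
```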

Related research

GACT: Activation Compressed Training for General Architectures (06/22/2022)
Training large neural network (NN) models requires extensive memory reso...

Few-Bit Backward: Quantized Gradients of Activation Functions for Memory Footprint Reduction (02/01/2022)
Memory footprint is one of the main limiting factors for large neural ne...

TinyKG: Memory-Efficient Training Framework for Knowledge Graph Neural Recommender Systems (12/08/2022)
There has been an explosion of interest in designing various Knowledge G...

Activation Compression of Graph Neural Networks using Block-wise Quantization with Improved Variance Minimization (09/21/2023)
Efficient training of large-scale graph neural networks (GNNs) has been ...

Sparse Weight Activation Training (01/07/2020)
Training convolutional neural networks (CNNs) is time-consuming. Prior w...

Accelerating Convolutional Neural Networks via Activation Map Compression (12/10/2018)
The deep learning revolution brought us an extensive array of neural net...

And the Bit Goes Down: Revisiting the Quantization of Neural Networks (07/12/2019)
In this paper, we address the problem of reducing the memory footprint o...
