Memory Efficient Optimizers with 4-bit States

09/04/2023
by Bingrui Li, et al.

Optimizer states are a major source of memory consumption when training neural networks, limiting the maximum trainable model size within a given memory budget. Compressing optimizer states from 32-bit floating point to lower bitwidths is a promising way to reduce the training memory footprint, but the lowest achievable bitwidth to date has been 8-bit. In this work, we push the optimizer-state bitwidth down to 4 bits through a detailed empirical analysis of the first and second moments. Specifically, we find that the moments exhibit complicated outlier patterns that current block-wise quantization cannot accurately approximate. We use a smaller block size and propose utilizing both row-wise and column-wise information for better quantization. We further identify a zero-point problem when quantizing the second moment, and solve it with a linear quantizer that excludes the zero point. Our 4-bit optimizers are evaluated on a wide variety of benchmarks, including natural language understanding, machine translation, image classification, and instruction tuning. On all tasks, our optimizers achieve accuracy comparable to their full-precision counterparts while enjoying better memory efficiency.
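The abstract names two concrete mechanisms: block-wise quantization with a small block size that also draws on row-wise and column-wise statistics, and a linear 4-bit quantizer whose codebook skips zero, so the strictly positive second moment can never dequantize to exactly zero and destabilize the update. Below is a minimal NumPy sketch of both ideas under stated assumptions; the block size, the max-based scaling, the min-of-row/column combination, and the function names are illustrative choices, not details taken from the paper.

    # Minimal sketch, NOT the paper's implementation. Assumptions: a block size
    # of 128 (the abstract only says "smaller"), per-block max scaling, and a
    # uniform 4-bit codebook on (0, 1] that excludes the zero point.
    import numpy as np

    BLOCK = 128  # assumed small block size

    # Linear 4-bit codebook that skips 0: sixteen even levels 1/16 .. 16/16.
    # Excluding zero means a positive second-moment entry never dequantizes
    # to 0, which would otherwise make an update like m / (sqrt(v) + eps)
    # unstable.
    CODEBOOK = (np.arange(1, 17) / 16.0).astype(np.float32)

    def quantize_blockwise(x):
        """Quantize a flat, positive float array to 4-bit indices plus
        per-block scales. Assumes len(x) is a multiple of BLOCK; real code
        would pad."""
        blocks = x.reshape(-1, BLOCK)
        scales = blocks.max(axis=1, keepdims=True)    # per-block scale
        normed = blocks / np.maximum(scales, 1e-12)   # values now in (0, 1]
        # Nearest codebook entry; indices fit in 4 bits (two per byte when
        # packed for storage).
        idx = np.abs(normed[..., None] - CODEBOOK).argmin(axis=-1)
        return idx.astype(np.uint8), scales

    def dequantize_blockwise(idx, scales):
        return (CODEBOOK[idx] * scales).reshape(-1)

    def row_col_scales(m):
        """Hedged sketch of 'row-wise and column-wise information': normalize
        each entry of a 2-D moment by the smaller of its row max and column
        max, so a single outlier row or column does not flatten the rest of
        the matrix."""
        row = np.abs(m).max(axis=1, keepdims=True)   # shape (rows, 1)
        col = np.abs(m).max(axis=0, keepdims=True)   # shape (1, cols)
        return np.minimum(row, col)                  # broadcast (rows, cols)

    # Round-trip the (always positive) second-moment estimate.
    v = np.random.rand(4 * BLOCK).astype(np.float32) ** 2 + 1e-8
    idx, scales = quantize_blockwise(v)
    v_hat = dequantize_blockwise(idx, scales)
    print("mean abs error:", np.mean(np.abs(v_hat - v)))

    # Per-entry normalizers for a toy first-moment matrix.
    m = np.random.randn(8, BLOCK).astype(np.float32)
    print("row/col scale shape:", row_col_scales(m).shape)

A full 4-bit optimizer would additionally pack two 4-bit indices per byte and fold the de/quantization into the optimizer's update step; the sketch above only demonstrates the quantize/dequantize round-trip.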

