Unit Scaling: Out-of-the-Box Low-Precision Training

03/20/2023
by Charlie Blake, et al.

We present unit scaling, a paradigm for designing deep learning models that simplifies the use of low-precision number formats. Training in FP16 or the recently proposed FP8 formats offers substantial efficiency gains, but can lack sufficient range for out-of-the-box training. Unit scaling addresses this by introducing a principled approach to model numerics: seeking unit variance of all weights, activations and gradients at initialisation. Unlike alternative methods, this approach neither requires multiple training runs to find a suitable scale nor has significant computational overhead. We demonstrate the efficacy of unit scaling across a range of models and optimisers. We further show that existing models can be adapted to be unit-scaled, training BERT-Large in FP16 and then FP8 with no degradation in accuracy.
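
The core idea of seeking unit variance at initialisation can be illustrated with a single matrix multiplication. The following is a minimal sketch, not the authors' implementation: it assumes unit-variance Gaussian inputs and weights, and the helper name `scaled_matmul` and the tensor shapes are hypothetical choices made for illustration. It covers the forward pass only and says nothing about how gradients are scaled.

```python
import numpy as np


def scaled_matmul(x, w):
    """Matmul whose output is rescaled by 1/sqrt(fan_in).

    Hypothetical helper for illustration: with unit-variance inputs and
    weights, each output element is a sum of fan_in unit-variance products,
    so dividing by sqrt(fan_in) brings the output back to roughly unit variance.
    """
    fan_in = w.shape[0]
    return (x @ w) / np.sqrt(fan_in)


rng = np.random.default_rng(0)
batch, fan_in, fan_out = 1024, 512, 256  # illustrative sizes, not from the paper

x = rng.standard_normal((batch, fan_in))    # unit-variance activations
w = rng.standard_normal((fan_in, fan_out))  # unit-variance weights

y_unscaled = x @ w
y_scaled = scaled_matmul(x, w)

# The unscaled output has a standard deviation near sqrt(fan_in);
# the scaled output stays near 1.
print("unscaled std:", np.std(y_unscaled))
print("scaled std:  ", np.std(y_scaled))
```

Keeping values near unit scale matters because narrow formats such as FP16 and FP8 can represent only a limited range, so tensors whose scale grows or shrinks with layer width are more likely to overflow or underflow.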

research
02/08/2021

VS-Quant: Per-vector Scaled Quantization for Accurate Low-Precision Neural Network Inference

Quantization enables efficient acceleration of deep neural networks by r...
research
06/06/2022

8-bit Numerical Formats for Deep Neural Networks

Given the current trend of increasing size and complexity of machine lea...
research
11/01/2017

Attacking Binarized Neural Networks

Neural networks with low-precision weights and activations offer compell...
research
05/19/2018

GEN Model: An Alternative Approach to Deep Neural Network Models

In this paper, we introduce an alternative approach, namely GEN (Genetic...
research
10/10/2017

Mixed Precision Training

Deep neural networks have enabled progress in a wide variety of applicat...
research
07/13/2018

CascadeCNN: Pushing the Performance Limits of Quantisation in Convolutional Neural Networks

This work presents CascadeCNN, an automated toolflow that pushes the qua...
research
02/28/2019

Scaling Matters in Deep Structured-Prediction Models

Deep structured-prediction energy-based models combine the expressive po...
