Training Deep Neural Networks with 8-bit Floating Point Numbers

12/19/2018
by Naigang Wang, et al.

The state-of-the-art hardware platforms for training Deep Neural Networks (DNNs) are moving from traditional single precision (32-bit) computations towards 16 bits of precision -- in large part due to the high energy efficiency and smaller bit storage associated with using reduced-precision representations. However, unlike inference, training with numbers represented with less than 16 bits has been challenging due to the need to maintain fidelity of the gradient computations during back-propagation. Here we demonstrate, for the first time, the successful training of DNNs using 8-bit floating point numbers while fully maintaining the accuracy on a spectrum of Deep Learning models and datasets. In addition to reducing the data and computation precision to 8 bits, we also successfully reduce the arithmetic precision for additions (used in partial product accumulation and weight updates) from 32 bits to 16 bits through the introduction of a number of key ideas including chunk-based accumulation and floating point stochastic rounding. The use of these novel techniques lays the foundation for a new generation of hardware training platforms with the potential for 2-4x improved throughput over today's systems.
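The two accumulation techniques named in the abstract can be sketched in a few lines of NumPy. The sketch below is illustrative only, not the authors' implementation: the function names (chunk_accumulate, stochastic_round_to_fp16), the default chunk size of 64, and the use of IEEE float16 as a stand-in for the paper's 16-bit accumulation format are assumptions made for the example.

import numpy as np


def chunk_accumulate(values, chunk_size=64):
    """Chunk-based accumulation (illustrative): sum `values` in fixed-size
    chunks with a reduced-precision accumulator, then sum the per-chunk
    partials. Bounding each inner sum to `chunk_size` terms limits the
    magnitude gap between accumulator and addend, reducing the "swamping"
    error of one long reduced-precision sum. IEEE float16 stands in here
    for a generic 16-bit accumulation format."""
    partials = []
    for start in range(0, len(values), chunk_size):
        acc = np.float16(0.0)                       # reduced-precision chunk accumulator
        for v in values[start:start + chunk_size]:
            acc = np.float16(acc + np.float16(v))   # 16-bit add
        partials.append(acc)
    total = np.float16(0.0)
    for p in partials:                              # second, much shorter accumulation
        total = np.float16(total + p)
    return total


def stochastic_round_to_fp16(x, rng):
    """Floating-point stochastic rounding (illustrative): round scalar x to
    one of its two bracketing float16 neighbors, picking the farther one
    with probability proportional to x's distance from the nearer one, so
    that the rounding is unbiased in expectation (E[round(x)] == x)."""
    lo = np.float16(x)                              # nearest float16 neighbor
    err = float(x) - float(lo)
    if err == 0.0:
        return lo
    # the second candidate lies on the opposite side of lo, toward x
    hi = np.nextafter(lo, np.float16(np.inf) if err > 0 else np.float16(-np.inf))
    p_hi = abs(err) / abs(float(hi) - float(lo))
    return hi if rng.random() < p_hi else lo


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    grads = np.abs(rng.normal(size=4096)).astype(np.float32) * 1e-3

    exact = grads.astype(np.float64).sum()
    naive_fp16 = np.float16(0.0)
    for g in grads:                                 # single long fp16 accumulation
        naive_fp16 = np.float16(naive_fp16 + np.float16(g))
    chunked_fp16 = chunk_accumulate(grads, chunk_size=64)

    print("float64 reference :", exact)
    print("naive fp16 sum    :", float(naive_fp16))
    print("chunked fp16 sum  :", float(chunked_fp16))
    print("stochastic round  :", float(stochastic_round_to_fp16(1e-3 + 3e-7, rng)))

Running the demo compares a single long float16 accumulation against the chunked variant and a float64 reference; the chunked sum tracks the reference more closely because each partial sum stays small relative to its addends, which is the intuition behind using chunked accumulation to keep partial-product accumulation at 16 bits.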
