A Study of BFLOAT16 for Deep Learning Training

05/29/2019
by Dhiraj Kalamkar, et al.

This paper presents the first comprehensive empirical study demonstrating the efficacy of the Brain Floating Point (BFLOAT16) half-precision format for Deep Learning training across image classification, speech recognition, language modeling, generative networks, and industrial recommendation systems. BFLOAT16 is attractive for Deep Learning training for two reasons: the range of values it can represent is the same as that of the IEEE 754 single-precision floating-point format (FP32), and conversion to/from FP32 is simple. Maintaining the same range as FP32 is important to ensure that no hyper-parameter tuning is required for convergence; e.g., IEEE 754-compliant half-precision floating point (FP16) requires hyper-parameter tuning. In this paper, we discuss the flow of tensors and various key operations in mixed-precision training and delve into details of operations, such as the rounding modes for converting FP32 tensors to BFLOAT16. We have implemented a method to emulate BFLOAT16 operations in TensorFlow, Caffe2, IntelCaffe, and Neon for our experiments. Our results show that deep learning training using BFLOAT16 tensors achieves the same state-of-the-art (SOTA) results across domains as FP32 tensors, in the same number of iterations and with no changes to hyper-parameters.
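The FP32-to-BFLOAT16 rounding step mentioned in the abstract can be made concrete with a small emulation helper. The Python/NumPy sketch below is an illustrative assumption, not the authors' emulation code: it keeps the sign, the 8 exponent bits, and the top 7 mantissa bits, applying round-to-nearest-even to the discarded lower 16 bits, and returns the result as FP32 values that are exactly representable in BFLOAT16 (NaN payload handling is omitted for brevity).

```python
import numpy as np

def fp32_to_bf16_rne(x: np.ndarray) -> np.ndarray:
    """Emulate FP32 -> BFLOAT16 conversion with round-to-nearest-even.

    Illustrative sketch only: the output is FP32 whose lower 16 mantissa
    bits are zero, i.e. values exactly representable in BFLOAT16.
    NaN payloads are not treated specially here.
    """
    bits = x.astype(np.float32).view(np.uint32).astype(np.uint64)
    lsb = (bits >> 16) & 1                        # LSB of the surviving mantissa
    rounded = (bits + 0x7FFF + lsb) & 0xFFFF0000  # round-to-nearest-even, then truncate
    return rounded.astype(np.uint32).view(np.float32)

# Example: the exponent range matches FP32, so magnitudes are preserved;
# only the mantissa is coarsened to 8 significant bits.
x = np.array([1.0, 3.14159265, 1e-30, 65504.0], dtype=np.float32)
print(fp32_to_bf16_rne(x))
```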

