AdaptivFloat: A Floating-point based Data Type for Resilient Deep Learning Inference

09/29/2019
by   Thierry Tambe, et al.

Conventional hardware-friendly quantization methods, such as fixed-point and integer formats, tend to perform poorly at very low word sizes, as their shrinking dynamic ranges cannot adequately capture the wide data distributions commonly seen in sequence transduction models. We present AdaptivFloat, a floating-point-inspired number representation format for deep learning that dynamically maximizes and optimally clips its available dynamic range, at a layer granularity, in order to create faithful encodings of neural network parameters. AdaptivFloat consistently produces higher inference accuracy than block floating-point, uniform, IEEE-like float, or posit encodings at very low precision (≤ 8-bit) across a diverse set of state-of-the-art neural network topologies. Notably, AdaptivFloat even surpasses baseline FP32 performance by up to +0.3 in BLEU score and -0.75 in word error rate at weight bit widths of 8 bits or fewer. Experimental results on a deep neural network (DNN) hardware accelerator that exploits AdaptivFloat logic in its computational datapath demonstrate per-operation energy and area of 0.9× and 1.14×, respectively, relative to equivalent-bit-width integer-based accelerator variants.
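To make the per-layer adaptation described above concrete, below is a minimal sketch of an AdaptivFloat-style quantizer in NumPy. It assumes a word split into 1 sign bit, `exp_bits` exponent bits, and the remaining mantissa bits, anchors the exponent range to the tensor's largest magnitude, and rounds each value's mantissa at its own exponent. The function name, the default bit split, and the handling of values below the smallest normal magnitude are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

def adaptivfloat_quantize(w, n_bits=8, exp_bits=3):
    """Illustrative AdaptivFloat-style quantizer (sketch, not the paper's exact rules).

    Assumed word layout: 1 sign bit, `exp_bits` exponent bits,
    and (n_bits - 1 - exp_bits) mantissa bits.
    """
    w = np.asarray(w, dtype=np.float64)
    man_bits = n_bits - 1 - exp_bits
    exp_range = 2 ** exp_bits - 1              # number of exponent steps available
    max_mantissa = 2.0 - 2.0 ** (-man_bits)    # largest normalized mantissa value

    w_absmax = float(np.max(np.abs(w)))
    if w_absmax == 0.0:
        return np.zeros_like(w)

    # Per-layer adaptation: anchor the exponent range to the tensor's largest
    # magnitude, so the format spends its dynamic range where the data lives.
    exp_max = int(np.floor(np.log2(w_absmax)))
    exp_min = exp_max - exp_range

    max_val = max_mantissa * 2.0 ** exp_max    # largest representable magnitude
    min_val = 2.0 ** exp_min                   # smallest normal magnitude

    sign = np.sign(w)
    mag = np.clip(np.abs(w), 0.0, max_val)     # clip outliers into range

    # Round each value's mantissa to `man_bits` fractional bits at its own exponent.
    exp = np.clip(np.floor(np.log2(np.maximum(mag, min_val))), exp_min, exp_max)
    step = 2.0 ** (exp - man_bits)
    q = np.minimum(np.round(mag / step) * step, max_val)

    # Magnitudes below the smallest normal value round to zero or to min_val.
    q = np.where(mag < min_val, np.where(mag < min_val / 2, 0.0, min_val), q)
    return sign * q
```

In this sketch, `exp_bits` trades dynamic range against mantissa precision: for an 8-bit word, `exp_bits=3` leaves 4 mantissa bits, and the per-tensor choice of `exp_max` plays the role of the per-layer exponent bias that the format adapts.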
