With Shared Microexponents, A Little Shifting Goes a Long Way

02/16/2023
by Bita Rouhani, et al.

This paper introduces Block Data Representations (BDR), a framework for exploring and evaluating a wide spectrum of narrow-precision formats for deep learning. It enables comparison of popular quantization standards, and through BDR, new formats based on shared microexponents (MX) are identified, which outperform other state-of-the-art quantization approaches, including narrow-precision floating-point and block floating-point. MX utilizes multiple levels of quantization scaling with ultra-fine scaling factors based on shared microexponents in the hardware. The effectiveness of MX is demonstrated on real-world models including large-scale generative pretraining and inferencing, and production-scale recommendation systems.
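To make the idea of multiple levels of scaling concrete, here is a minimal sketch of two-level block quantization in Python. It is an illustration under stated assumptions, not the paper's reference implementation: the function name `quantize_two_level`, the sub-block size of 2, the 4-bit mantissa magnitude, and the 1-bit microexponent are all hypothetical choices made for this example. The whole block shares one power-of-two exponent, and each small sub-block may shift that exponent down by a small offset so that small values keep more resolution.

```python
import math


def _floor_log2(x):
    """Return floor(log2(x)) for x > 0, using frexp to avoid float log error."""
    return math.frexp(x)[1] - 1


def quantize_two_level(values, sub_size=2, mant_bits=4, micro_bits=1):
    """Quantize one block with a shared exponent plus per-sub-block offsets.

    All parameters are illustrative assumptions, not values from the paper.
    Returns the dequantized (reconstructed) values as floats.
    """
    max_abs = max(abs(v) for v in values)
    if max_abs == 0.0:
        return [0.0] * len(values)

    shared_exp = _floor_log2(max_abs)      # level 1: one exponent per block
    max_shift = (1 << micro_bits) - 1      # largest offset a microexponent can encode
    q_max = (1 << mant_bits) - 1           # largest mantissa magnitude

    out = []
    for start in range(0, len(values), sub_size):
        sub = values[start:start + sub_size]
        sub_max = max(abs(v) for v in sub)
        # Level 2: shift the scale down when the whole sub-block is small.
        if sub_max == 0.0:
            shift = 0
        else:
            shift = min(shared_exp - _floor_log2(sub_max), max_shift)
        step = 2.0 ** (shared_exp - shift + 1 - mant_bits)
        for v in sub:
            q = max(-q_max, min(q_max, round(v / step)))  # round, then clamp
            out.append(q * step)
    return out


block = [1.0, 0.9, 0.3, 0.2]
coarse = quantize_two_level(block, micro_bits=0)  # single shared scale only
fine = quantize_two_level(block, micro_bits=1)    # with 1-bit sub-block offsets
err_coarse = sum(abs(a - b) for a, b in zip(block, coarse))
err_fine = sum(abs(a - b) for a, b in zip(block, fine))
print(err_coarse, err_fine)
```

In this toy block, the second sub-block holds only small values, so allowing it to shift its scale by one bit reduces the total reconstruction error relative to using the block-wide exponent alone. Real formats would also fix how the exponent, offsets, and mantissas are packed into bits, which this sketch leaves out.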


