FAST: DNN Training Under Variable Precision Block Floating Point with Stochastic Rounding

10/28/2021
by Sai Qian Zhang, et al.

Block Floating Point (BFP) can efficiently support quantization for Deep Neural Network (DNN) training by providing a wide dynamic range via a shared exponent across a group of values. In this paper, we propose a Fast First, Accurate Second Training (FAST) system for DNNs, where the weights, activations, and gradients are represented in BFP. FAST supports matrix multiplication with variable-precision BFP input operands, enabling incremental increases in DNN precision throughout training. By increasing the BFP precision across both training iterations and DNN layers, FAST can greatly shorten the training time while reducing overall hardware resource usage. Our FAST Multiplier-Accumulator (fMAC) supports dot product computations under multiple BFP precisions. We validate our FAST system on multiple DNNs with different datasets, demonstrating a 2-6× speedup in training on a single-chip platform over prior work based on mixed-precision or block floating point number systems while achieving similar validation accuracy.
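The paper's exact FAST format and fMAC hardware are not reproduced here, but a minimal NumPy sketch of the underlying idea, quantizing each block of values to a shared power-of-two exponent with a low-bit mantissa and stochastic rounding, may help illustrate why varying the mantissa width trades accuracy against cost. The function name `bfp_quantize`, the default block size, and the exponent and rounding conventions below are illustrative assumptions, not the paper's specification.

```python
import numpy as np


def bfp_quantize(values, mantissa_bits=4, block_size=16, rng=None):
    """Quantize a 1-D array to Block Floating Point with stochastic rounding.

    Illustrative sketch (not the paper's exact scheme): each block of
    `block_size` values shares one power-of-two exponent derived from the
    largest magnitude in the block, and every value keeps only a signed
    `mantissa_bits`-bit mantissa. Mantissas are rounded up with probability
    equal to their fractional part, so the rounding error is unbiased.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(values, dtype=np.float64)
    out = np.empty_like(x)
    mant_min = -(2 ** (mantissa_bits - 1))
    mant_max = 2 ** (mantissa_bits - 1) - 1

    for start in range(0, x.size, block_size):
        block = x[start:start + block_size]
        max_abs = np.max(np.abs(block))
        if max_abs == 0.0:
            out[start:start + block_size] = 0.0
            continue

        # Shared exponent: smallest power of two covering the block maximum.
        shared_exp = int(np.ceil(np.log2(max_abs)))
        # One mantissa LSB is then worth 2^(shared_exp - (mantissa_bits - 1)).
        lsb = 2.0 ** (shared_exp - (mantissa_bits - 1))

        scaled = block / lsb
        floor_vals = np.floor(scaled)
        # Stochastic rounding: round up with probability equal to the fraction.
        mant = floor_vals + (rng.random(block.shape) < (scaled - floor_vals))
        mant = np.clip(mant, mant_min, mant_max)

        out[start:start + block_size] = mant * lsb

    return out


# Wider mantissas give a closer approximation, at higher hardware cost.
rng = np.random.default_rng(0)
w = rng.standard_normal(64)
for bits in (2, 4, 8):
    err = np.mean(np.abs(w - bfp_quantize(w, mantissa_bits=bits)))
    print(f"{bits}-bit mantissa, mean abs error: {err:.5f}")
```

Scheduling the mantissa width across layers and training iterations, as FAST does, amounts to calling such a quantizer with progressively larger `mantissa_bits`; the block size and rounding details here are assumptions for illustration only.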


Related research

08/06/2019  Cheetah: Mixed Low-Precision Hardware Software Co-Design Framework for DNNs on the Edge
Low-precision DNNs have been extensively explored in order to reduce the...

05/29/2023  Reversible Deep Neural Network Watermarking: Matching the Floating-point Weights
Static deep neural network (DNN) watermarking embeds watermarks into the...

05/12/2022  Adaptive Block Floating-Point for Analog Deep Learning Hardware
Analog mixed-signal (AMS) devices promise faster, more energy-efficient ...

02/16/2023  With Shared Microexponents, A Little Shifting Goes a Long Way
This paper introduces Block Data Representations (BDR), a framework for ...

11/18/2017  MorphNet: Fast & Simple Resource-Constrained Structure Learning of Deep Networks
We present MorphNet, an approach to automate the design of neural networ...

07/07/2023  INT-FP-QSim: Mixed Precision and Formats For Large Language Models and Vision Transformers
The recent rise of large language models (LLMs) has resulted in increase...

03/17/2022  Convert, compress, correct: Three steps toward communication-efficient DNN training
In this paper, we introduce a novel algorithm, 𝖢𝖮_3, for communication-e...
