Training DNNs with Hybrid Block Floating Point

04/04/2018
by Mario Drumond et al.

The wide adoption of DNNs has given birth to unrelenting computing requirements, forcing datacenter operators to adopt domain-specific accelerators to train them. These accelerators typically employ densely packed, full-precision floating-point arithmetic to maximize performance per area. Ongoing research efforts seek to further increase that performance density by replacing floating-point with fixed-point arithmetic. However, a significant roadblock for these attempts has been fixed point's narrow dynamic range, which is insufficient for DNN training convergence. We identify block floating point (BFP) as a promising alternative representation, since it exhibits wide dynamic range and enables the majority of DNN operations to be performed with fixed-point logic. Unfortunately, BFP alone introduces several limitations that preclude its direct applicability. In this work, we introduce HBFP, a hybrid BFP-FP approach, which performs all dot products in BFP and other operations in floating point. HBFP delivers the best of both worlds: the high accuracy of floating point at the superior hardware density of fixed point. For a wide variety of models, we show that HBFP matches floating point's accuracy while enabling hardware implementations that deliver up to 8.5x higher throughput.
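To make the arithmetic concrete, below is a minimal NumPy sketch of the BFP dot product at the core of HBFP. It is not the paper's implementation: the 8-bit mantissa width and the helper names to_bfp and bfp_dot are illustrative assumptions. The key idea it demonstrates is that once a block of values shares a single exponent and keeps only narrow integer mantissas, the dot product reduces to integer multiply-accumulates followed by a single exponent addition.

    # Minimal sketch of block floating point (BFP), assuming 8-bit mantissas.
    # A block shares one exponent; each element keeps a narrow integer
    # mantissa, so dot products run as integer multiply-accumulates.
    import numpy as np

    def to_bfp(x, mantissa_bits=8):
        """Quantize a 1-D block to BFP: (integer mantissas, shared exponent)."""
        max_abs = np.max(np.abs(x))
        if max_abs == 0.0:
            return np.zeros_like(x, dtype=np.int32), 0
        # Shared exponent chosen so the largest element fits in mantissa_bits.
        exp = int(np.floor(np.log2(max_abs))) - (mantissa_bits - 2)
        mant = np.clip(np.round(x / 2.0 ** exp),
                       -(2 ** (mantissa_bits - 1)),
                       2 ** (mantissa_bits - 1) - 1).astype(np.int32)
        return mant, exp

    def bfp_dot(a, b, mantissa_bits=8):
        """Dot product done entirely in fixed point, rescaled once at the end."""
        ma, ea = to_bfp(a, mantissa_bits)
        mb, eb = to_bfp(b, mantissa_bits)
        acc = np.dot(ma.astype(np.int64), mb.astype(np.int64))  # integer MACs
        return float(acc) * 2.0 ** (ea + eb)  # one exponent addition

    rng = np.random.default_rng(0)
    a, b = rng.standard_normal(256), rng.standard_normal(256)
    print(np.dot(a, b), bfp_dot(a, b))  # BFP result closely tracks FP32

In a full HBFP layer, only the matrix multiplications (the dot products) would run through a kernel like bfp_dot, while activations, normalization, and weight updates would stay in regular floating point; per the abstract, that hybrid split is what preserves training convergence while keeping the dense fixed-point datapath.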


