FlexBlock: A Flexible DNN Training Accelerator with Multi-Mode Block Floating Point Support

03/13/2022
by Seock-Hwan Noh, et al.

Training deep neural networks (DNNs) is a computationally expensive job, which can take weeks or months even with high-performance GPUs. As a remedy for this challenge, the research community has started exploring more efficient data representations for the training process, e.g., block floating point (BFP). However, prior work on BFP-based DNN accelerators relies on a specific BFP representation, which makes them less versatile. This paper builds upon an algorithmic observation that we can accelerate training by leveraging multiple BFP precisions without compromising the final accuracy. Backed by this algorithmic opportunity, we develop a flexible DNN training accelerator, dubbed FlexBlock, which supports three different BFP precision modes that can differ among the activation, weight, and gradient tensors. While several prior works have proposed multi-precision support for DNN accelerators, they focus only on inference, and their core utilization is suboptimal at fixed precisions and for specific layer types when training is considered. In contrast, FlexBlock is designed so that high core utilization is achievable for i) various layer types and ii) all three BFP precisions by mapping data to its compute units in a hierarchical manner. We evaluate the effectiveness of the FlexBlock architecture using well-known DNNs on the CIFAR, ImageNet, and WMT14 datasets. As a result, training on FlexBlock improves training speed by 1.5-5.3x and energy efficiency by 2.4-7.0x on average compared to other training accelerators, while incurring only marginal accuracy loss relative to full-precision training.
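For readers unfamiliar with BFP, the minimal NumPy sketch below illustrates the core idea the abstract relies on: a block of values shares a single power-of-two exponent while each element keeps only a narrow integer mantissa, and sweeping the mantissa width mimics switching between BFP precision modes. The function name, block size of 16, and the mantissa widths swept in the demo are illustrative assumptions for exposition; they are not FlexBlock's actual formats, tensor mapping, or hardware dataflow.

```python
import numpy as np

def bfp_quantize(x, mantissa_bits=8, block_size=16):
    """Round a 1-D float tensor to block floating point (BFP) and back.

    All values in a block of `block_size` elements share one power-of-two
    exponent, chosen from the block's largest magnitude; each element keeps
    only a signed `mantissa_bits`-bit integer mantissa (sign bit included).
    Returning the de-quantized floats makes the BFP rounding error easy to
    inspect.
    """
    x = np.asarray(x, dtype=np.float32)
    pad = (-x.size) % block_size
    blocks = np.pad(x, (0, pad)).reshape(-1, block_size)

    # Shared exponent = exponent of the largest-magnitude element in the block.
    max_mag = np.max(np.abs(blocks), axis=1, keepdims=True)
    shared_exp = np.floor(np.log2(np.where(max_mag == 0, 1.0, max_mag)))

    # Power-of-two scale chosen so the largest element just fits the signed range.
    scale = 2.0 ** (shared_exp - mantissa_bits + 2)
    qmax = 2 ** (mantissa_bits - 1) - 1
    mantissa = np.clip(np.round(blocks / scale), -qmax, qmax)

    return (mantissa * scale).reshape(-1)[:x.size]


if __name__ == "__main__":
    w = np.random.randn(256).astype(np.float32)
    # Illustrative mantissa widths, not necessarily FlexBlock's three modes.
    for bits in (4, 8, 16):
        err = np.max(np.abs(w - bfp_quantize(w, mantissa_bits=bits)))
        print(f"mantissa bits = {bits:2d}  ->  max abs error = {err:.6f}")
```

Running the demo shows the rounding error shrinking as the mantissa widens, which is the trade-off a multi-mode BFP accelerator exploits: wider mantissas where a tensor is sensitive, narrower ones where it is not.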


