Distributed Low Precision Training Without Mixed Precision

11/18/2019
by Zehua Cheng, et al.

Low precision training is one of the most popular strategies for deploying deep models on limited hardware resources. Fixed-point implementations of deep convolutional networks (DCNs) have the potential to reduce complexity and facilitate deployment on embedded hardware. However, most low precision training solutions rely on a mixed precision strategy. In this paper, we present an ablation study of different low precision training strategies and propose a solution that uses the IEEE FP16 format throughout the entire training process. We evaluated ResNet-50 on a 128-GPU cluster with the full ImageNet dataset. Our results show that the FP32 format is not essential for training deep models, and that communication cost reduction, model compression, and large-scale distributed training are three coupled problems.
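
The abstract does not spell out the training recipe, but as a rough illustration, the following is a minimal PyTorch sketch (with hypothetical layer sizes and a hypothetical fixed loss scale, not values from the paper) of what FP16-throughout training looks like: weights, gradients, and optimizer state all stay in IEEE FP16, in contrast to mixed precision, which keeps an FP32 master copy of the weights.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# FP16 kernels are primarily supported on GPUs; CPU support varies by
# PyTorch version.
device = "cuda" if torch.cuda.is_available() else "cpu"

# Model weights stored directly in FP16; no FP32 master copy is kept.
model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10))
model = model.to(device=device, dtype=torch.float16)

# SGD momentum buffers mirror the parameter dtype, so the optimizer
# state is FP16 as well.
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

loss_scale = 1024.0  # hypothetical fixed loss scale, not the paper's choice

x = torch.randn(32, 128, device=device, dtype=torch.float16)
y = torch.randint(0, 10, (32,), device=device)

for step in range(10):
    optimizer.zero_grad()
    logits = model(x)
    # Cross-entropy evaluated in FP32 for numerical stability (a common
    # concession); the backward pass still yields FP16 gradients for the
    # FP16 parameters.
    loss = F.cross_entropy(logits.float(), y) * loss_scale
    loss.backward()
    for p in model.parameters():
        p.grad.div_(loss_scale)  # unscale gradients before the weight update
    optimizer.step()
```

The fixed loss scale keeps small FP16 gradients from flushing to zero during the backward pass; production mixed precision systems typically adjust this scale dynamically, which is omitted here for brevity.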


Related Research

06/16/2020
Multi-Precision Policy Enforced Training (MuPPET): A precision-switching strategy for quantised fixed-point training of CNNs
Large-scale convolutional neural networks (CNNs) suffer from very long t...

06/13/2022
Modern Distributed Data-Parallel Large-Scale Pre-training Strategies For NLP models
Distributed deep learning is becoming increasingly popular due to the ex...

06/04/2020
Towards Lower Bit Multiplication for Convolutional Neural Network Training
Convolutional Neural Networks (CNNs) have been widely used in many field...

03/29/2021
Representation range needs for 16-bit neural network training
Deep learning has grown rapidly thanks to its state-of-the-art performan...

02/03/2018
Mixed Precision Training of Convolutional Neural Networks using Integer Operations
The state-of-the-art (SOTA) for mixed precision training is dominated by...

10/25/2021
Mixed precision in Graphics Processing Unit
Modern graphics computing units (GPUs) are designed and optimized to per...

07/20/2022
Quantized Training of Gradient Boosting Decision Trees
Recent years have witnessed significant success in Gradient Boosting Dec...
