Multi-Precision Policy Enforced Training (MuPPET): A precision-switching strategy for quantised fixed-point training of CNNs

06/16/2020
by   Aditya Rajagopal, et al.

Large-scale convolutional neural networks (CNNs) suffer from very long training times, ranging from hours to weeks, which limits the productivity and experimentation of deep learning practitioners. As networks grow in size and complexity, training time can be reduced through low-precision data representations and computations. However, in doing so, the final accuracy can suffer due to the problem of vanishing gradients. Existing state-of-the-art methods combat this issue with a mixed-precision approach that uses two precision levels, FP32 (32-bit floating-point) and FP16/FP8 (16-/8-bit floating-point), leveraging the hardware support of recent GPU architectures for FP16 operations to obtain performance gains. This work pushes the boundary of quantised training by employing a multilevel optimisation approach that utilises multiple precisions, including low-precision fixed-point representations. The novel training strategy, MuPPET, combines multiple number-representation regimes with a precision-switching mechanism that decides at run time when to transition between regimes. Overall, the proposed strategy tailors the training process to the capabilities of the target hardware architecture and yields improvements in training time and energy efficiency compared to state-of-the-art approaches. When applied to the training of AlexNet, ResNet18 and GoogLeNet on ImageNet (ILSVRC12), targeting an NVIDIA Turing GPU, MuPPET matches the accuracy of standard full-precision training with a training-time speedup of up to 1.84× and an average speedup of 1.58× across the networks.
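To make the precision-switching idea concrete, below is a minimal PyTorch-style sketch of a quantised training loop with a run-time switching policy. It is not the authors' implementation: the fixed-point arithmetic is only simulated by fake-quantising an FP32 master copy, and the precision ladder `(8, 12, 14, 32)`, the gradient-diversity-style switching criterion, the `rel_change` threshold and the helper names (`fake_quantise`, `train_muppet_style`) are illustrative assumptions.

```python
import torch
import torch.nn as nn


def fake_quantise(t: torch.Tensor, bits: int) -> torch.Tensor:
    """Simulate fixed-point quantisation by rounding onto a grid with
    2^(bits-1)-1 positive levels scaled to the tensor's dynamic range.
    At 32 bits the tensor is returned untouched (FP32 regime)."""
    if bits >= 32:
        return t
    scale = t.abs().max().clamp(min=1e-8) / (2 ** (bits - 1) - 1)
    return torch.round(t / scale) * scale


def train_muppet_style(model, loader, epochs=90,
                       schedule=(8, 12, 14, 32),   # illustrative precision ladder
                       patience=2, rel_change=0.05):
    """Train with fake-quantised weights and move to the next precision regime
    when an inter-epoch gradient-diversity metric stagnates (assumed criterion)."""
    opt = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
    loss_fn = nn.CrossEntropyLoss()
    level, diversities, stale = 0, [], 0

    for epoch in range(epochs):
        sq_norm_sum, grad_sum = 0.0, None
        for x, y in loader:
            # Quantise weights to the current regime for the forward/backward pass;
            # the optimiser keeps updating the FP32 master copy.
            master = [p.detach().clone() for p in model.parameters()]
            with torch.no_grad():
                for p in model.parameters():
                    p.copy_(fake_quantise(p, schedule[level]))

            opt.zero_grad()
            loss_fn(model(x), y).backward()

            # Restore the FP32 master weights, then apply the computed gradients.
            with torch.no_grad():
                for p, m in zip(model.parameters(), master):
                    p.copy_(m)
            opt.step()

            # Accumulate statistics for gradient diversity: sum ||g_i||^2 and sum g_i.
            g = torch.cat([p.grad.detach().flatten()
                           for p in model.parameters() if p.grad is not None])
            sq_norm_sum += float(g.dot(g))
            grad_sum = g.clone() if grad_sum is None else grad_sum + g

        diversity = sq_norm_sum / (float(grad_sum.dot(grad_sum)) + 1e-12)
        diversities.append(diversity)

        # Switch to the next precision once diversity stops changing appreciably.
        if len(diversities) > 1 and level < len(schedule) - 1:
            change = abs(diversities[-1] - diversities[-2]) / (diversities[-2] + 1e-12)
            stale = stale + 1 if change < rel_change else 0
            if stale >= patience:
                level, stale = level + 1, 0
                print(f"epoch {epoch}: switching to {schedule[level]}-bit regime")
```

Keeping an FP32 master copy and quantising only around the forward and backward passes mirrors the usual quantised-training recipe; on real hardware the reported speedups come from executing those passes in genuine low-precision fixed-point arithmetic rather than simulating it as above.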

Related research

04/14/2018 · Low-Precision Floating-Point Schemes for Neural Network Training
11/18/2019 · Distributed Low Precision Training Without Mixed Precision
06/06/2022 · 8-bit Numerical Formats for Deep Neural Networks
11/17/2015 · Reduced-Precision Strategies for Bounded Memory in Deep Neural Nets
08/07/2018 · Rethinking Numerical Representations for Deep Neural Networks
01/21/2021 · GPU-Accelerated Optimizer-Aware Evaluation of Submodular Exemplar Clustering
02/24/2020 · Combining Learning and Optimization for Transprecision Computing
