Training Neural Networks in Single vs Double Precision

09/15/2022
by   Tomas Hrycej, et al.

The commitment to single-precision floating-point arithmetic is widespread in the deep learning community. To evaluate whether this commitment is justified, the influence of computing precision (single and double precision) on the optimization performance of the Conjugate Gradient (CG) method (a second-order optimization algorithm) and of RMSprop (a first-order algorithm) has been investigated. Neural networks with one to five fully connected hidden layers, moderate or strong nonlinearity, and up to 4 million network parameters have been trained to minimize the Mean Square Error (MSE). The training tasks have been set up so that their MSE minimum is known to be zero. The computing experiments disclosed that single precision can keep up with double precision (with superlinear convergence) as long as the line search finds an improvement. First-order methods such as RMSprop do not benefit from double precision. However, for moderately nonlinear tasks, CG is clearly superior. For strongly nonlinear tasks, both algorithm classes find only solutions that are fairly poor in terms of MSE relative to the output variance. CG with double floating-point precision is superior whenever the solutions have the potential to be useful for the application goal.
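To make the experimental setup concrete, the following is a minimal, hypothetical sketch (not the authors' code) of such a precision comparison in NumPy: a fixed "teacher" network generates the targets, so the true MSE minimum is exactly zero as in the paper's tasks, and the same small "student" network is then trained with RMSprop once in float32 and once in float64. The architecture, hyperparameters, and all names here are illustrative assumptions.

```python
# Sketch: train an identical small network in float32 vs. float64
# and compare the final MSE. Teacher targets guarantee that the
# theoretical MSE minimum is exactly zero.
import numpy as np

def train(dtype, seed=0, steps=2000, lr=1e-2):
    rng = np.random.default_rng(seed)
    # Teacher network: defines a task whose MSE minimum is zero.
    X = rng.standard_normal((256, 8)).astype(dtype)
    W_t = rng.standard_normal((8, 1)).astype(dtype)
    y = np.tanh(X @ W_t)

    # Student: one fully connected hidden layer, trained by RMSprop.
    W1 = (0.1 * rng.standard_normal((8, 16))).astype(dtype)
    W2 = (0.1 * rng.standard_normal((16, 1))).astype(dtype)
    s1, s2 = np.zeros_like(W1), np.zeros_like(W2)
    eps, beta = dtype(1e-8), dtype(0.9)

    for _ in range(steps):
        H = np.tanh(X @ W1)               # forward pass
        err = H @ W2 - y                  # residual; MSE = mean(err**2)
        g2 = H.T @ err / len(X)           # backprop gradients
        gH = err @ W2.T * (1 - H**2)
        g1 = X.T @ gH / len(X)
        s1 = beta * s1 + (1 - beta) * g1**2   # RMSprop accumulators
        s2 = beta * s2 + (1 - beta) * g2**2
        W1 -= lr * g1 / (np.sqrt(s1) + eps)
        W2 -= lr * g2 / (np.sqrt(s2) + eps)
    return float(np.mean((np.tanh(X @ W1) @ W2 - y) ** 2))

for dt in (np.float32, np.float64):
    print(f"{dt.__name__}: final MSE = {train(dt):.3e}")
```

With a first-order method like this, the two precisions typically reach similar MSE levels, which is consistent with the finding above that RMSprop does not benefit from double precision; the exact values depend on the assumed hyperparameters.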
