Non-Determinism in TensorFlow ResNets

01/30/2020
by Miguel Morin, et al.

We show that the stochasticity in training ResNets for image classification on GPUs in TensorFlow is dominated by non-determinism from the GPU, rather than by the initialisation of the network's weights and biases or by the order in which minibatches are presented. The standard deviation of test set accuracy is 0.02 with all seeds fixed, compared to 0.027 with different seeds, so nearly 74% of the run-to-run standard deviation of a ResNet model comes from GPU non-determinism alone. For test set loss, the corresponding ratio of standard deviations exceeds 80%. These results call for more robust evaluation strategies for deep learning models, since a significant share of the variation across runs can arise simply from GPU randomness.
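The "fixed seeds" condition requires pinning every controllable source of randomness before training. Below is a minimal sketch of how this is commonly done in TensorFlow; the seed value is arbitrary, and the determinism switches are assumptions about the TensorFlow version in use, not the paper's exact configuration.

```python
import os
import random

import numpy as np
import tensorflow as tf

SEED = 42  # hypothetical value; any fixed integer works

# Pin the sources of randomness the paper controls for: Python, NumPy,
# and TensorFlow, which drive weight/bias initialisation and minibatch
# shuffling.
random.seed(SEED)
np.random.seed(SEED)
tf.random.set_seed(SEED)

# Even with every seed fixed, some GPU kernels (e.g. cuDNN convolutions
# and atomics-based reductions) remain non-deterministic. Request
# deterministic kernels instead; the switch depends on the TF version.
if hasattr(tf.config.experimental, "enable_op_determinism"):
    tf.config.experimental.enable_op_determinism()  # TF 2.8+
else:
    os.environ["TF_DETERMINISTIC_OPS"] = "1"        # TF 1.14 - 2.7
```

Note that the abstract's point is precisely that, without the deterministic-kernel switch at the end, fixing the seeds alone still leaves most of the run-to-run spread in place.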
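The 74% figure follows directly from the two reported standard deviations; a quick sanity check using only the numbers quoted above:

```python
# Standard deviations of test set accuracy quoted in the abstract.
std_fixed_seeds = 0.020   # all seeds fixed: spread is GPU non-determinism alone
std_varied_seeds = 0.027  # seeds varied: spread also includes initialisation
                          # and minibatch order

print(f"{std_fixed_seeds / std_varied_seeds:.1%}")  # 74.1% -> "nearly 74%"
```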


