
Non-Determinism in TensorFlow ResNets

by Miguel Morin, et al.

We show that the stochasticity in training ResNets for image classification on GPUs in TensorFlow is dominated by GPU non-determinism, rather than by the initialisation of the network's weights and biases or by the sequence of minibatches. The standard deviation of test set accuracy is 0.02 with fixed seeds, compared to 0.027 with different seeds, so nearly 74% of the standard deviation of a ResNet model's accuracy is non-deterministic (0.02 / 0.027 ≈ 0.74). For test set loss, the ratio of standard deviations is more than 80%. These results call for more robust evaluation strategies for deep learning models, since a significant share of the variation in results across runs can arise simply from GPU randomness.
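The 74% figure is a ratio of the two standard deviations quoted in the abstract. The sketch below reproduces that arithmetic. The function name `fixed_seed_share` is hypothetical and chosen for illustration only; it is not taken from the paper. Only the inputs 0.02 and 0.027 come from the abstract.

```python
def fixed_seed_share(std_fixed_seeds: float, std_varied_seeds: float) -> float:
    """Ratio of the fixed-seed standard deviation to the varied-seed one.

    Hypothetical helper: the spread that remains with fixed seeds is
    attributed to non-determinism, so this ratio is the share of the
    overall standard deviation that seeding does not remove.
    """
    if std_varied_seeds <= 0:
        raise ValueError("std_varied_seeds must be positive")
    return std_fixed_seeds / std_varied_seeds


# Standard deviations of test set accuracy quoted in the abstract.
accuracy_share = fixed_seed_share(0.02, 0.027)
print(f"{accuracy_share:.1%}")  # prints 74.1%
```

The same ratio applied to test set loss would give the "more than 80%" figure, but the abstract does not state the underlying loss standard deviations, so they are not reproduced here.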



