Calibrated Chaos: Variance Between Runs of Neural Network Training is Harmless and Inevitable

04/04/2023
by Keller Jordan, et al.

Typical neural network trainings have substantial variance in test-set performance between repeated runs, impeding hyperparameter comparison and training reproducibility. We present the following results towards understanding this variation. (1) We demonstrate that standard CIFAR-10 and ImageNet trainings, despite having significant variance on their test sets, have very little variance in their performance on the test distributions from which those test sets are sampled, suggesting that variance is less of a practical issue than previously thought. (2) We present a simplifying statistical assumption which closely approximates the structure of the test-set accuracy distribution. (3) We argue that test-set variance is inevitable in two senses. First, we show that variance is largely caused by high sensitivity of the training process to initial conditions rather than by specific sources of randomness like the data order and augmentations. Second, we prove that variance is unavoidable given the observation that ensembles of trained networks are well-calibrated. (4) We conduct preliminary studies of distribution shift, fine-tuning, data augmentation, and learning rate through the lens of variance between runs.
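To make the calibration argument in points (2) and (3) concrete, here is a minimal simulation sketch (not the paper's code): if an ensemble of trained networks is well-calibrated, each test example i is classified correctly by some fraction p_i of runs, and treating a single run's per-example outcomes as independent Bernoulli(p_i) draws implies a floor on between-run variance in test-set accuracy. The Beta(8, 1) distribution over p_i and the sizes below are illustrative assumptions, not values from the paper.

import numpy as np

rng = np.random.default_rng(0)

n_examples = 10_000  # size of a CIFAR-10-style test set
n_runs = 1_000       # number of simulated training runs

# Hypothetical per-example correctness rates: most examples are easy
# (p_i near 1), a minority are ambiguous. Beta(8, 1) is an illustrative
# choice, not taken from the paper.
p = rng.beta(8, 1, size=n_examples)

# Each simulated "run" classifies example i correctly with probability p_i,
# independently across examples (the simplifying statistical assumption).
correct = rng.random((n_runs, n_examples)) < p
accuracy = correct.mean(axis=1)

print(f"mean test-set accuracy : {accuracy.mean():.4f}")
print(f"std between runs       : {accuracy.std():.4f}")

# Analytic variance floor implied by the Bernoulli model:
# Var(acc) = (1/n^2) * sum_i p_i * (1 - p_i)
analytic_std = np.sqrt(np.sum(p * (1 - p)) / n_examples**2)
print(f"analytic std (floor)   : {analytic_std:.4f}")

The simulated between-run standard deviation matches the analytic value, illustrating why, under this reading of the paper's argument, a calibrated ensemble forces nonzero test-set variance between runs even when distribution-level performance is essentially fixed.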


