Synthesizing Irreproducibility in Deep Networks

02/21/2021
by Robert R. Snapp, et al.

The success and superior performance of deep networks are spreading their popularity and use to an increasing number of applications. Very recent works, however, demonstrate that modern-day deep networks suffer from irreproducibility (also referred to as nondeterminism or underspecification). Two or more models that are identical in architecture, structure, training hyper-parameters, and parameters, and that are trained on exactly the same training data, can yield different predictions on individual previously unseen examples. Thus, a model that performs well on controlled test data may behave in unexpected ways when deployed in the real world, even when the real-world data is expected to resemble the test data. We study simple synthetic models and data to understand the origins of these problems. We show that irreproducibility occurs even with a single nonlinearity and for very simple data and models. Our study demonstrates the effects of randomness in initialization, training data shuffling window size, and activation functions on prediction irreproducibility, even under very controlled synthetic data. While, as one would expect, randomness in initialization and in shuffling the training examples exacerbate the phenomenon, we show that model complexity and the choice of nonlinearity also play significant roles in making deep models irreproducible.
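The page does not include code, but the effect described above is easy to reproduce. Below is a minimal sketch, assuming PyTorch, a toy one-hidden-layer ReLU regressor, and synthetic sinusoidal data (all illustrative choices, not taken from the paper): two trainings that are identical except for the random seed controlling initialization and example shuffling are compared by their disagreement on unseen examples, in the spirit of the prediction-difference measures used in reproducibility studies.

```python
# Minimal sketch (not the paper's code): train two identically configured
# one-hidden-layer ReLU regressors on the same synthetic data, varying only
# the random seed used for weight initialization and example shuffling,
# then measure how much their predictions disagree on held-out points.
import torch
import torch.nn as nn

def make_data(n=1024, seed=0):
    g = torch.Generator().manual_seed(seed)
    x = torch.rand(n, 1, generator=g) * 2 - 1           # inputs in [-1, 1]
    y = torch.sin(3 * x) + 0.1 * torch.randn(n, 1, generator=g)
    return x, y

def train_model(x, y, seed, hidden=64, epochs=200, lr=0.05, batch=64):
    torch.manual_seed(seed)                              # controls init and shuffling
    model = nn.Sequential(nn.Linear(1, hidden), nn.ReLU(), nn.Linear(hidden, 1))
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    n = x.shape[0]
    for _ in range(epochs):
        perm = torch.randperm(n)                         # seed-dependent shuffle order
        for i in range(0, n, batch):
            idx = perm[i:i + batch]
            opt.zero_grad()
            loss = loss_fn(model(x[idx]), y[idx])
            loss.backward()
            opt.step()
    return model

x_train, y_train = make_data(seed=0)
x_test, _ = make_data(n=256, seed=1)                     # previously unseen examples

m1 = train_model(x_train, y_train, seed=11)
m2 = train_model(x_train, y_train, seed=22)

with torch.no_grad():
    p1, p2 = m1(x_test), m2(x_test)
    # Mean absolute disagreement between the two models on unseen points;
    # a nonzero value illustrates irreproducibility despite identical data,
    # architecture, and hyper-parameters.
    print("mean |p1 - p2| on test points:", (p1 - p2).abs().mean().item())
```

Swapping the ReLU for a smoother activation, changing the hidden width, or fixing the shuffle order are the kinds of controlled variations the paper studies; this sketch only exposes the seed-dependent baseline.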


Related research

On the Expected Complexity of Maxout Networks (07/01/2021)
Learning with neural networks relies on the complexity of the representa...

An analysis of training and generalization errors in shallow and deep networks (02/17/2018)
An open problem around deep networks is the apparent absence of over-fit...

Anti-Distillation: Improving reproducibility of deep networks (10/19/2020)
Deep networks have been revolutionary in improving performance of machin...

Deep ReLU Networks Have Surprisingly Few Activation Patterns (06/03/2019)
The success of deep networks has been attributed in part to their expres...

More data or more parameters? Investigating the effect of data structure on generalization (03/09/2021)
One of the central features of deep learning is the generalization abili...

The smooth output assumption, and why deep networks are better than wide ones (11/25/2022)
When several models have similar training scores, classical model select...

On Predicting Generalization using GANs (11/28/2021)
Research on generalization bounds for deep networks seeks to give ways t...
