Probing the Effect of Selection Bias on NN Generalization with a Thought Experiment

by John K. Tsotsos, et al.

Learned networks in the domain of visual recognition and cognition impress in part because, even though they are trained with datasets many orders of magnitude smaller than the full population of possible images, they exhibit sufficient generalization to be applicable to new and previously unseen data. Although many have examined generalization from several perspectives, we wondered: if a network is trained with a biased dataset that misses particular samples corresponding to some defining domain attribute, can it generalize to the full domain from which that training dataset was extracted? It is certainly true that in vision, no current training set fully captures all visual information, and this may lead to selection bias. Here, we try a novel approach in the tradition of the thought experiment: we run it on a real domain of visual objects that we can fully characterize, and we examine specific gaps in training data and their impact on performance requirements. Our thought experiment points to three conclusions: first, that generalization behavior depends on how sufficiently the particular dimensions of the domain are represented during training; second, that the utility of any generalization is completely dependent on the acceptable system error; and third, that specific visual features of objects, such as pose orientations out of the imaging plane or colours, may not be recoverable if not represented sufficiently in a training set. Any generalization currently observed in modern deep learning networks may be more the result of coincidental alignments, and its utility needs to be confirmed against a system's performance specification. Our Thought Experiment Probe approach, coupled with the resulting Bias Breakdown, can be very informative for understanding the impact of biases.
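The core claim — that a model cannot recover domain attributes entirely absent from its training data — can be illustrated with a minimal sketch. The toy domain, feature names, and the use of a 1-nearest-neighbour classifier are all hypothetical illustrations, not the paper's actual experiment: objects are described by two features, the class is defined by one of them, and one class is deliberately excluded from the training sample to mimic selection bias.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy domain: objects described by (hue, size); class = hue band.
def sample(n, classes):
    labels = rng.choice(classes, size=n)
    hue = labels + rng.normal(0.0, 0.1, size=n)  # hue clusters around the class index
    size = rng.uniform(0.0, 1.0, size=n)         # irrelevant nuisance dimension
    return np.column_stack([hue, size]), labels

all_classes = [0, 1, 2]
biased_classes = [0, 1]  # selection bias: class 2 is never sampled for training

X_train, y_train = sample(200, biased_classes)
X_test, y_test = sample(100, all_classes)

# 1-nearest-neighbour prediction: labels can only come from the training set,
# so a class absent from training can never be predicted.
def predict(X):
    d = np.linalg.norm(X[:, None, :] - X_train[None, :, :], axis=2)
    return y_train[np.argmin(d, axis=1)]

pred = predict(X_test)
mask = y_test == 2
acc_seen = (pred[~mask] == y_test[~mask]).mean()
acc_unseen = (pred[mask] == y_test[mask]).mean()  # 0.0: the missing class is unrecoverable
print("accuracy on seen classes:", acc_seen)
print("accuracy on unseen class:", acc_unseen)
```

The point of the sketch is the asymmetry: performance looks acceptable on the represented portion of the domain while being exactly zero on the excluded attribute, so whether the observed generalization is "useful" depends entirely on the system's acceptable error over the full domain.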




