Circuit-Based Intrinsic Methods to Detect Overfitting

07/03/2019
by Sat Chatterjee, et al.

The focus of this paper is on intrinsic methods to detect overfitting. These rely only on the model and the training data, as opposed to traditional extrinsic methods that rely on performance on a test set or on bounds from model complexity. We propose a family of intrinsic methods called Counterfactual Simulation (CFS) which analyze the flow of training examples through the model by identifying and perturbing rare patterns. By applying CFS to logic circuits we get a method that has no hyper-parameters and works uniformly across different types of models such as neural networks, random forests and lookup tables. Experimentally, CFS can separate models with different levels of overfit using only their logic circuit representations without any access to the high level structure. By comparing lookup tables, neural networks, and random forests using CFS, we get insight into why neural networks generalize. In particular, we find that stochastic gradient descent in neural nets does not lead to "brute force" memorization, but finds common patterns (whether we train with actual or randomized labels), and neural networks are not unlike forests in this regard. Finally, we identify a limitation with our proposal that makes it unsuitable in an adversarial setting, but points the way to future work on robust intrinsic methods.
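The abstract does not spell out the procedure, so the following is only a minimal sketch of one possible reading of "identifying and perturbing rare patterns" in a logic circuit. Every name in it (`Gate`, `pattern_counts`, `simulate`, `cfs_accuracy`) and the rarity threshold `k` are assumptions made for illustration, not the paper's definitions. The sketch counts how often each gate sees each input pattern on the training data, then re-simulates the training set while flipping the output of any gate whose current input pattern was seen at most `k` times. A model that leans on rare patterns should lose training accuracy faster as `k` grows.

```python
from collections import Counter
from operator import and_
from typing import Callable, List, Sequence, Tuple

# Hypothetical circuit encoding: each gate reads two earlier wires (by index)
# and applies a 2-input Boolean function. The last gate is the circuit output.
Gate = Tuple[int, int, Callable[[int, int], int]]


def pattern_counts(gates: List[Gate], examples: Sequence[Sequence[int]]) -> Counter:
    """Count how many training examples present each input pattern to each gate."""
    counts: Counter = Counter()
    for x in examples:
        wires = list(x)
        for g, (a, b, fn) in enumerate(gates):
            pattern = (wires[a], wires[b])
            counts[(g, pattern)] += 1
            wires.append(fn(*pattern))
    return counts


def simulate(gates: List[Gate], x: Sequence[int], counts: Counter, k: int) -> int:
    """Evaluate the circuit, flipping a gate's output when its pattern is rare.

    A pattern is treated as rare (an assumption of this sketch) when it was
    seen at most k times in training; unseen patterns have count 0.
    """
    wires = list(x)
    for g, (a, b, fn) in enumerate(gates):
        pattern = (wires[a], wires[b])
        out = fn(*pattern)
        if counts[(g, pattern)] <= k:
            out ^= 1  # perturb: invert the gate output
        wires.append(out)
    return wires[-1]


def cfs_accuracy(gates, examples, labels, k: int) -> float:
    """Training accuracy under the perturbed (counterfactual) simulation."""
    counts = pattern_counts(gates, examples)
    hits = sum(simulate(gates, x, counts, k) == y for x, y in zip(examples, labels))
    return hits / len(examples)


# Toy data: a single AND gate, with the pattern (1, 0) appearing only once.
gates = [(0, 1, and_)]
examples = [(1, 1)] * 3 + [(0, 0)] * 3 + [(1, 0)]
labels = [a & b for a, b in examples]

for k in (0, 1, 3):
    print(k, cfs_accuracy(gates, examples, labels, k))
```

With `k = 0` nothing seen in training is perturbed, so the result equals the ordinary training accuracy. Raising `k` first breaks the one example that depends on the rare pattern, then the rest.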


