Truth or Backpropaganda? An Empirical Investigation of Deep Learning Theory

10/01/2019
by   Micah Goldblum, et al.

We empirically evaluate common assumptions about neural networks that are widely held by practitioners and theorists alike. We study the prevalence of local minima in loss landscapes, whether small-norm parameter vectors generalize better (and whether this explains the advantages of weight decay), whether wide-network theories (like the neural tangent kernel) describe the behaviors of classifiers, and whether the rank of weight matrices can be linked to generalization and robustness in real-world networks.
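To make two of the quantities mentioned above concrete, here is a minimal PyTorch sketch (not the paper's code) showing the L2 norm of a model's parameter vector, which weight decay penalizes, and the numerical rank of its weight matrices. The small architecture, optimizer settings, and rank tolerance are arbitrary illustrative choices.

```python
# Illustrative sketch only: parameter-vector norm (the quantity weight decay
# shrinks) and numerical rank of weight matrices. Architecture, learning rate,
# weight-decay coefficient, and rank tolerance are hypothetical choices.

import torch
import torch.nn as nn

torch.manual_seed(0)

# A small fully connected classifier; layer sizes are arbitrary.
model = nn.Sequential(
    nn.Linear(784, 256),
    nn.ReLU(),
    nn.Linear(256, 10),
)

# L2 norm of the full parameter vector theta.
param_norm = torch.sqrt(sum(p.pow(2).sum() for p in model.parameters()))
print(f"||theta||_2 = {param_norm.item():.3f}")

# Weight decay with coefficient lambda is equivalent to adding
# (lambda / 2) * ||theta||^2 to the training loss; in PyTorch it is
# typically passed directly to the optimizer.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, weight_decay=5e-4)

# Numerical rank of each weight matrix: count singular values above a small
# tolerance relative to the largest singular value.
for name, p in model.named_parameters():
    if p.ndim == 2:  # weight matrices only, skip bias vectors
        s = torch.linalg.svdvals(p.detach())
        rank = int((s > 1e-3 * s[0]).sum())
        print(f"{name}: shape {tuple(p.shape)}, numerical rank {rank}")
```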


Related research

10/02/2020 · No Spurious Local Minima: on the Optimization Landscapes of Wide and Deep Neural Networks
Empirical studies suggest that wide neural networks are comparably easy ...

05/29/2017 · Feature Incay for Representation Regularization
Softmax loss is widely used in deep neural networks for multi-class clas...

05/30/2019 · Deterministic PAC-Bayesian generalization bounds for deep networks via generalizing noise-resilience
The ability of overparameterized deep networks to generalize well has be...

03/03/2021 · Formalizing Generalization and Robustness of Neural Networks to Weight Perturbations
Studying the sensitivity of weight perturbation in neural networks and i...

09/30/2022 · Adaptive Weight Decay: On The Fly Weight Decay Tuning for Improving Robustness
We introduce adaptive weight decay, which automatically tunes the hyper-...

02/12/2023 · Koopman-Based Bound for Generalization: New Aspect of Neural Networks Regarding Nonlinear Noise Filtering
We propose a new bound for generalization of neural networks using Koopm...
