Deep Learning Generalization, Extrapolation, and Over-parameterization

03/19/2022
by   Roozbeh Yousefzadeh, et al.

We study the generalization of over-parameterized deep networks (for image classification) in relation to the convex hull of their training sets. Despite their great success, the generalization of deep networks is considered a mystery. These models have orders of magnitude more parameters than training samples, and they can achieve perfect accuracy on their training sets even when the training images are randomly labeled or their contents are replaced with random noise. The training loss function of these models has an infinite number of near-zero minimizers, only a small subset of which generalize well. Overall, it is not clear why models need to be over-parameterized, why we should use a very specific training regime to train them, and why their classifications are so susceptible to imperceptible adversarial perturbations (a phenomenon known as adversarial vulnerability) <cit.>. Some recent studies have made advances in answering these questions; however, they only consider interpolation. We show that interpolation is not adequate to understand the generalization of deep networks and that we should broaden our perspective.
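The central geometric question the abstract raises — whether a test sample falls inside the convex hull of the training set (interpolation) or outside it (extrapolation) — can be checked directly. The sketch below is an illustrative assumption, not the paper's own procedure: it tests hull membership by solving a linear-programming feasibility problem with SciPy, asking whether the point can be written as a convex combination of the training points.

```python
import numpy as np
from scipy.optimize import linprog

def in_convex_hull(x, points):
    """Return True if x is in the convex hull of the rows of `points`.

    Feasibility LP: find lambda >= 0 such that
        points.T @ lambda = x   and   sum(lambda) = 1.
    If such weights exist, x is a convex combination of the
    training points, i.e. classifying x is interpolation.
    """
    n, d = points.shape
    # Stack the d coordinate constraints and the sum-to-one constraint.
    A_eq = np.vstack([points.T, np.ones((1, n))])      # shape (d+1, n)
    b_eq = np.concatenate([x, [1.0]])
    # Zero objective: we only care about feasibility.
    res = linprog(c=np.zeros(n), A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0, None)] * n, method="highs")
    return res.success

# Toy example: the unit square in 2-D as a "training set".
square = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
print(in_convex_hull(np.array([0.5, 0.5]), square))  # inside  -> True
print(in_convex_hull(np.array([2.0, 2.0]), square))  # outside -> False
```

In high-dimensional pixel space the same test still works, but with far fewer samples than dimensions the hull is a thin low-dimensional polytope, so most test images land outside it — which is why the abstract argues that interpolation-based analyses alone cannot explain generalization.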


