The Deep Bootstrap: Good Online Learners are Good Offline Generalizers

10/16/2020
by Preetum Nakkiran, et al.

We propose a new framework for reasoning about generalization in deep learning. The core idea is to couple the Real World, where optimizers take stochastic gradient steps on the empirical loss, to an Ideal World, where optimizers take steps on the population loss. This leads to an alternative decomposition of test error into: (1) the Ideal World test error plus (2) the gap between the two worlds. If the gap (2) is universally small, this reduces the problem of generalization in offline learning to the problem of optimization in online learning. We then give empirical evidence that this gap between worlds can be small in realistic deep learning settings, in particular in supervised image classification. For example, CNNs generalize better than MLPs on image distributions in the Real World, but this is "because" they optimize faster on the population loss in the Ideal World. This suggests our framework is a useful tool for understanding generalization in deep learning, and lays a foundation for future research in the area.


