1 Introduction
While deep learning can provide remarkable empirical performance in many machine learning tasks, a theoretical account of this success remains elusive. One of the main metrics of performance in machine learning is generalization, or the ability to predict correctly on new unseen data. Statistical learning theory offers a theoretical framework for understanding generalization, with the main tool being the derivation of upper bounds on the generalization error. However, it is now widely acknowledged that “classical” approaches in learning theory, based on worst-case analyses, are not sufficient to explain the generalization of deep learning models, see e.g.
bartlett1998sample; zhang2016understanding. This observation has spurred a large amount of work on applying and developing other learning theory approaches to deep learning, some of which are showing promising results. However, current approaches are still lacking in many ways: generalization error bounds are often vacuous, or offer little explanatory power by showing trends opposite to the true error as one varies important quantities. For example, nagarajan2019uniform show that the predictions of some common spectral-norm-based bounds increase with increasing training set size, while the measured error decreases. Moreover, many bounds are published without being sufficiently tested, so that it can be hard to assess their strengths and weaknesses. There is therefore a need for a more systematic approach to studying generalization bounds. Two important recent studies have taken up this challenge. Firstly, jiang2019fantastic performed extensive empirical tests of the predictive power of a large number of different generalization measures for a range of commonly used hyperparameters. They mainly studied flatness- and norm-based measures, and compared them by reporting an average of the performance. In their comparisons, they focused mostly on training hyperparameters. Architectures were only compared by changing the width and depth of an architecture resembling the Network-in-Network
(gao2011robustness), and they did not systematically study variation in the data (dataset complexity or training set size). Secondly, in a follow-up study, dziugaite2020search argue that average performance is not a sufficient metric of generalization performance. They empirically demonstrate that generalization measures can exhibit correct correlations for certain parameter changes and seeds, but badly fail for others. In addition to some of the hyperparameter changes studied by jiang2019fantastic, they also consider the effect of changing dataset complexity and training set size.
In this paper we also perform a systematic analysis of generalization bounds. We first describe, in section 2, a set of seven qualitatively different desiderata which, we argue, should be used to guide the kind of experiments (here, ‘experiment’ can be interpreted in the formal sense which dziugaite2020search introduce) needed to fully explore the quality of the predictions of a theory of generalization. These include four desiderata for making correct predictions when varying architecture, dataset complexity, dataset size, or optimizer choice, as well as three further desiderata, which describe the importance of quantitative agreement, computational efficiency and theoretical rigour. Our set of desiderata is broader than those used in previous tests of generalization theories, and will, we hope, lead to a better understanding of the strengths and weaknesses of competing approaches.
Rather than empirically testing the many bounds that can be found in the literature, we take a more general theoretical approach here. We first, in section 3, present an overall framework for classifying the many different approaches to deriving frequentist bounds on the generalization error. This framework is then used to systematically organize the discussion of the different approaches in the literature. For each approach we review existing results in the literature and compare performance against the seven desiderata we propose. Such an analysis allows us to draw some more general conclusions about the kinds of strategies for deriving generalization bounds that may be most successful. In this context we also note that for some proposed bounds, there is not enough empirical evidence in the literature to determine whether or not they satisfy our desiderata. This lacuna highlights the importance of the proposals put forward by
jiang2019fantastic and dziugaite2020search of large-scale empirical studies, which we also advocate for in this paper. Inspired by our analysis of existing bounds, we present in section 5 a high-probability version of the realizable PAC-Bayes bound introduced by mcallester1998some and applied to deep learning in valle2018deep. Because our bound, which is derived from a function-based picture, is directly proportional to the Bayesian evidence or marginal likelihood of the model, we call it the marginal-likelihood PAC-Bayes bound. While our bound has frequentist origins, the connection between marginal likelihood and generalization is an important theme in Bayesian analyses, and can be traced back to the work of MacKay and Neal (mackay1992practical; neal1994priors; mackay2003information), as well as the early work on PAC-Bayes (shawe1997pac; mcallester1998some). See also an important recent discussion of this connection in wilson2020bayesian. Recent large-scale empirical work has shown that learning curves for deep learning have an extensive power-law regime (hestness2017deep; spigler2019asymptotic; kaplan2020scaling; henighan2020scaling). The exponents in the power law depend mainly on the data rather than on the architecture, with smaller exponents for more complex datasets. Under the assumption of power-law asymptotic behaviour with training set size, we are able to prove that our marginal-likelihood PAC-Bayes bound is asymptotically optimal up to a constant in the limit of large training sets.
In order to test our marginal-likelihood PAC-Bayes bound against the full set of desiderata, we perform over 1,000 experiments comparing 19 different architectures, ranging from simple fully connected networks (FCNs) and convolutional neural networks (CNNs) to more sophisticated architectures such as ResNet50, DenseNet121 and MobileNetv2. We thus include variations in architecture hyperparameters such as pooling type, the number of layers, skip connections, etc. These networks are tested on 5 different computer vision datasets of varying complexity. For each of these 95 architecture/dataset combinations, we train on a range of training set sizes to study learning curves. We additionally study the effect of label corruption on two of the datasets, MNIST and CIFAR10. Our bound provides non-vacuous generalization bounds, which are also tight enough to provide a reasonably good quantitative guide to generalization. It also does remarkably well at predicting qualitative trends in the generalization error as a function of data complexity, architecture (including, for example, differences between average and max pooling), and training set size. In particular, we find that we can estimate the value of the learning curve power-law exponent for different datasets.
In our concluding section, we argue that the use of a function-based perspective is key to understanding the relatively good performance of our marginal-likelihood bound across so many desiderata. This approach contrasts with many other PAC-Bayesian bounds in the deep learning theory literature, which use distributions in parameter space, mostly as a way to measure flatness. We also discuss potential weaknesses of our approach, such as its inability to capture the effect of DNN width or of optimizer choice (but see mingard2020sgd). Finally, we argue that capturing trends in generalization when architecture and data are varied is not only theoretically interesting, but also important for many applications, including architecture search and data augmentation.
2 Desiderata for predictive theories of generalization error
The fundamental question of what constitutes a “good” theory of generalization in deep learning is a profound and contested one. jiang2019fantastic and dziugaite2020search both consider this question and suggest that a good theory should capture the causal mechanisms behind generalization. jiang2019fantastic use a measure of conditional independence to estimate the causal relation between different complexity measures and generalization. dziugaite2020search consider a stronger notion that tries to capture whether a theory predicts the generalization error well over all possible experimental interventions. To formalize this notion they look at distributional robustness, which quantifies the worst-case predictive performance of a theory.
Here we take a slightly different approach to this question by defining seven key desiderata that a “good” theory of generalization should satisfy. This set is broader than those considered by jiang2019fantastic and dziugaite2020search in that it considers desiderata beyond predictive performance alone.
Furthermore, rather than focusing only on the average-case or worst-case performance of a predictive theory, we argue that a precise formulation of the performance of a generalization measure necessarily depends on the application. Therefore, in our discussions in section 4, and in the extensive experiments in section 7, we aim to paint a fine-grained picture of how well the theory predicts generalization for different experimental settings, and how it fares on the other important desiderata.
We focus our attention on generalization error upper bounds, but the same ideas could apply to other types of theories that aim to predict generalization error (for example, those based on Bayesian assumptions on the data distribution).
We propose that a good predictive theory of generalization for deep learning should aim to satisfy the seven desiderata listed below:
 D.1

data complexity: The predicted error should correctly scale with data complexity. In other words, the predicted error should correlate well with the true error when the dataset is changed. For example, a fixed DNN, for a fixed training set size, will typically have higher error for CIFAR10 than for MNIST, and an even higher error for a label-corrupted dataset. The bound should capture such differences.
 D.2

training set size: The predicted error should correctly scale with training set size. That is, the predicted error should follow the same trend as the true error as the number of training examples increases. For example, it has been found empirically that the generalization error often follows a power law decay with training set size with an exponent which depends on the dataset, but not strongly on the architecture (hestness2017deep; novak2019neural; rosenfeld2019constructive; kaplan2020scaling).
 D.3

architectures: The predicted error should capture differences in generalization between architectures. Different architectures can display significant variation in generalization performance. For example, CNNs with pooling tend to generalize better than CNNs without pooling on image classification tasks. The predicted error should aim to predict these differences. Furthermore, one of the more puzzling properties of DNNs is that their performance depends only weakly on the number of parameters used for an architecture, provided the system is large enough (belkin2019reconciling; nakkiran2019deep). Since the question of why DNNs generalize so well in the overparameterized regime is one of the central questions in the theory of DNNs, it is particularly important that a predicted error reproduces this relative insensitivity to the number of parameters.
 D.4

optimization algorithms: The predicted error should capture differences in generalization between different optimization algorithms. Different optimization algorithms used in training DNNs, as well as different choices of training hyperparameters such as SGD batch size and learning rates, or regularization techniques, can lead to differences in generalization, which the theory should aim to predict.
 D.5

non-vacuous: The predicted error should be quantitatively close to the true error. For generalization error upper bounds, it is commonly required that the predicted error is less than 1, as any upper bound on a misclassification probability higher than that is satisfied by definition, and thus can be considered “vacuous”. Beyond this requirement, the bound should aim to be as close as possible to the true error (a property often referred to as being a tight bound).
 D.6

efficiently computable: The predicted error should be efficiently computable. This requirement is particularly important if the prediction or bound is to be useful in practical applications (for example, in architecture search). It is of little use having a bound that cannot in practice be calculated.
 D.7

rigorous: The prediction should be rigorous. For the case of upper bounds, this means that the bound comes with a theorem that guarantees its validity under a well-specified set of assumptions, which should be met in the domain where it is applied. In practice, theoretical results are often applied to domains where the assumptions are not known to hold (or are known not to hold), but rigorous guarantees offer the highest level of confidence in one’s predictions, and should be aimed for whenever possible.
A theory of generalization error should aim to satisfy as many of these desiderata as possible. However, the trade-offs that one is willing to make depend on the domain of application of the theory. For example, a rigorous proof may be more valuable for theoretical insight than for a practical application, where computational efficiency may be more important. As another example, for an application to neural architecture search one may care more about the scaling with architecture, while for an application to decision making for data collection one may care more about correct scaling with training set size.
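Desideratum D.2 refers to the empirically observed power-law decay of learning curves. As a minimal illustration (not part of this paper's method), the exponent of an assumed power law $\epsilon(m) \approx a\, m^{-\alpha}$ can be recovered by least squares in log-log space, since $\log \epsilon = \log a - \alpha \log m$; the learning-curve values below are synthetic:

```python
import math

def fit_power_law_exponent(train_sizes, errors):
    """Estimate alpha in eps(m) ~ a * m**(-alpha) by least squares on the
    log-log transformed data: log eps = log a - alpha * log m."""
    xs = [math.log(m) for m in train_sizes]
    ys = [math.log(e) for e in errors]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) \
        / sum((x - mean_x) ** 2 for x in xs)
    return -slope  # the slope of the log-log fit is -alpha

# Synthetic learning curve with true exponent alpha = 0.5:
sizes = [100, 1_000, 10_000, 100_000]
errs = [0.30 * m ** -0.5 for m in sizes]
alpha = fit_power_law_exponent(sizes, errs)
```

On real learning curves the fit would only be meaningful over the power-law regime, i.e. after discarding the small-$m$ transient.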
3 Classifying generalization bounds for deep learning
In this section, we provide an overview of existing approaches to predict generalization error. After some preliminary sections describing notation and definitions we present a general taxonomy which classifies generalization error upper bounds based on the key assumptions they make.
3.1 Notation for the supervised learning problem
Supervised learning deals with the problem of predicting outputs from inputs, given a set of example input-output pairs. The inputs live in an input domain $\mathcal{X}$, and the outputs belong to an output space $\mathcal{Y}$. We will mostly focus on the binary classification setting where $\mathcal{Y}=\{0,1\}$ (see appendix C for a brief comment on extensions to the multi-class setting for the generalization bounds we study here). We assume that there is a data distribution $\mathcal{D}$ on the set of input-output pairs $\mathcal{X}\times\mathcal{Y}$. The training set $S$ is a sample of $m$ input-output pairs sampled i.i.d. from $\mathcal{D}$, $S=\{(x_i,y_i)\}_{i=1}^{m}$, where $x_i\in\mathcal{X}$ and $y_i\in\mathcal{Y}$. We will refer to the data distribution as noiseless if it can be factored as $\mathcal{D}(x,y)=\mathcal{D}(x)\mathcal{D}(y|x)$ where $\mathcal{D}(y|x)$ is deterministic, that is, for every $x$ it has support on a single value in $\mathcal{Y}$. In this case the function mapping each $x$ to that value is called the target function.
We define a loss function $\ell:\mathcal{Y}\times\mathcal{Y}\to\mathbb{R}$, which measures how “well” a prediction $\hat{y}$ matches an observed output $y$, by assigning to it a score which is high when they don’t match. In supervised learning, we define a hypothesis (sometimes also called a predictor, or classifier, in the context of supervised learning) as a function $h:\mathcal{X}\to\mathcal{Y}$ from inputs to outputs, and the risk of a hypothesis as the expected value of the loss of the predicted outputs on new samples, $R(h)=\mathbb{E}_{(x,y)\sim\mathcal{D}}[\ell(h(x),y)]$. For the case of classification, where $\mathcal{Y}$ is a discrete set, we focus on the 0-1 loss function (also called classification loss) $\ell(\hat{y},y)=\mathbb{1}[\hat{y}\neq y]$, where $\mathbb{1}$ is the indicator function. We define the generalization error $\epsilon(h)$ as the expected risk using this loss, which equals the probability of misclassification.
$$\epsilon(h)=\mathbb{E}_{(x,y)\sim\mathcal{D}}\big[\mathbb{1}[h(x)\neq y]\big]=\mathbb{P}_{(x,y)\sim\mathcal{D}}\big[h(x)\neq y\big] \qquad (1)$$
This is the central quantity which we study here. We also define the training error $\hat{\epsilon}(h)=\frac{1}{m}\sum_{i=1}^{m}\mathbb{1}[h(x_i)\neq y_i]$. For a more general loss function, we also define the empirical risk $\hat{R}(h)=\frac{1}{m}\sum_{i=1}^{m}\ell(h(x_i),y_i)$, the empirical average of the loss function over the training set. To simplify notation, we often simply write $\epsilon$ and $\hat{\epsilon}$ for $\epsilon(h)$ and $\hat{\epsilon}(h)$, respectively.
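These definitions translate directly into code. The sketch below (an illustration on a hypothetical one-dimensional threshold task, not an experiment from this paper) computes the training error as the empirical average of the 0-1 loss, and Monte Carlo estimates the generalization error by drawing fresh samples from the data distribution:

```python
import random

def zero_one_loss(y_hat, y):
    """0-1 loss: 1 if the prediction and label disagree, else 0."""
    return 1 if y_hat != y else 0

def empirical_risk(h, sample):
    """Average 0-1 loss of hypothesis h over a sample of (x, y) pairs;
    on the training set S this is the training error."""
    return sum(zero_one_loss(h(x), y) for x, y in sample) / len(sample)

def estimate_generalization_error(h, draw_pair, n=100_000, seed=0):
    """Monte Carlo estimate of eps(h) = P[h(x) != y] using fresh samples."""
    rng = random.Random(seed)
    fresh = [draw_pair(rng) for _ in range(n)]
    return empirical_risk(h, fresh)

# Toy noiseless distribution: x uniform on [0,1], target y = 1[x > 0.5].
def draw_pair(rng):
    x = rng.random()
    return (x, int(x > 0.5))

def h(x):
    return int(x > 0.6)  # hypothesis with a slightly misplaced threshold

# True error is P[0.5 < x <= 0.6] = 0.1; the estimate should be close.
err = estimate_generalization_error(h, draw_pair)
```

The Monte Carlo estimate is itself an empirical risk, just measured on held-out samples rather than on $S$.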
Finally, we define a learning algorithm $\mathcal{A}$ to be a mapping $S\mapsto h$ from training sets of any size to hypothesis functions. For simplicity, we mainly describe deterministic hypotheses and learning algorithms, but most results should be easily generalizable to stochastic versions. A stochastic learning algorithm maps training sets to probability distributions over hypotheses, while a stochastic hypothesis maps inputs in $\mathcal{X}$ to probability distributions over outputs in $\mathcal{Y}$. For PAC-Bayes we will consider stochastic learning algorithms. The supervised learning problem consists of finding a learning algorithm that produces low generalization error (in expectation, or with high probability), under certain well-defined assumptions (including none) on the data distribution $\mathcal{D}$.
3.2 General PAC learning framework
The modern theory of statistical learning originated with Leslie Valiant’s probably approximately correct (PAC) framework (valiant1984theory), and Vapnik and Chervonenkis’ uniform convergence analysis (vapnik1968uniform; vapnik1974theory; vapnik1995nature). Here we describe a general frequentist formulation of generalization error bounds of the type studied in supervised learning theory. Although the original PAC framework included the condition of computational efficiency of the learning algorithm in its definition, the term is used more widely now (guedj2019primer). In the most general case, the PAC learning framework we present here proves confidence bounds that establish a relationship between observed quantities (derived from the training set $S$) and the unobserved generalization error $\epsilon$, which is the real quantity of interest. Specifically, we consider bounds on $\Delta(\epsilon,\hat{\epsilon})$, a function measuring the difference between generalization and training error (which we refer to as the generalization gap), which state that, under some (or no) assumptions on the data distribution $\mathcal{D}$ and algorithm $\mathcal{A}$, the following bound holds:
$$\Delta(\epsilon,\hat{\epsilon})\leq\left(\frac{C(\mathcal{A},S,\delta)}{m}\right)^{\gamma} \qquad (2)$$
with probability at least $1-\delta$. Here the probability is over the sampling of the training set $S$ from $\mathcal{D}^m$. $C$ is a function which, following common practice in the literature, we will call capacity. The capacity measures some notion of the “complexity” of the algorithm, the data, or both. $\delta$ is called the confidence parameter and measures the probability with which the bound may fail because of getting an “unusual” training set. Finally, $\gamma$ is an exponent, typically $1/2$ or $1$. The most common measure of generalization gap is the absolute difference between the generalization and training error, $\Delta(\epsilon,\hat{\epsilon})=|\epsilon-\hat{\epsilon}|$, but in some general PAC-Bayesian analyses $\Delta$ may be any convex function (rivasplata2020pac). The capacity can also take many forms. Some common examples include the VC dimension (section 4.1.1) and the KL divergence between a posterior and prior for PAC-Bayes bounds (section 4.2.1.1). Sometimes the dependence on the training set size is fully absorbed into $C$, so that the explicit factor of $m$ is omitted.
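As a concrete instance of the template in eq. (2), consider the textbook Hoeffding bound for a single fixed hypothesis (a standard result, included here only to illustrate the roles of $C$, $\gamma$ and $\delta$, not a bound analyzed in this paper): with probability at least $1-\delta$, $|\epsilon-\hat{\epsilon}|\leq\sqrt{\ln(2/\delta)/(2m)}$, i.e. eq. (2) with $\gamma=1/2$ and capacity $C=\ln(2/\delta)/2$, independent of the algorithm and the data:

```python
import math

def hoeffding_gap_bound(m, delta):
    """For a single fixed hypothesis, |eps - eps_hat| <= sqrt(ln(2/delta)/(2m))
    with probability >= 1 - delta over the training sample. This matches the
    general template with gamma = 1/2 and capacity C = ln(2/delta)/2."""
    return math.sqrt(math.log(2 / delta) / (2 * m))

# The gap shrinks as m^{-1/2}: quadrupling m halves the bound.
b1 = hoeffding_gap_bound(1_000, 0.05)
b2 = hoeffding_gap_bound(4_000, 0.05)
```

Note that the $m^{-1/2}$ rate is slower than the $1/m$ rate of the realizable bounds discussed below, which is exactly the role played by the exponent $\gamma$.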
We also distinguish two types of bounds:

generalization gap bounds are bounds on $\Delta(\epsilon,\hat{\epsilon})$ or some other measure of discrepancy between the generalization and training error;

generalization error bounds are bounds on a function of $\epsilon$ alone.

Note that a generalization gap bound immediately implies a generalization error bound, but not necessarily vice versa. If $\hat{\epsilon}=0$ (realizability assumption), the distinction between these bounds disappears.
Finally, in order to simplify notation, we typically omit the dependence of the capacity $C$ on the confidence parameter $\delta$ throughout the paper.
3.3 A classification of generalization bounds
In this section we classify the main types of bounds which are possible under the general PAC framework described above, according to the different assumptions they make on the data distribution or algorithm, and on the different quantities the capacity may depend on.
3.3.1 According to assumptions on the data
The PAC framework is characterized by bounds which make as few assumptions about the data distribution as possible. There are two main approaches: agnostic and realizable bounds.

Agnostic bounds. For agnostic or distribution-free bounds, no assumption is made on the data distribution. In this case, the No Free Lunch theorem (wolpert1994relationship) implies that we cannot guarantee a small generalization error, but may still be able to guarantee a small difference between the training and generalization error (generalization gap), or a small generalization error for some training sets (for data-dependent bounds).

Realizable bounds. For realizable bounds, the data distribution $\mathcal{D}$ and the algorithm $\mathcal{A}$ are assumed to be such that $\hat{\epsilon}=0$ for any training set $S$ which has nonzero probability. This says that the algorithm is always able to fit the data perfectly. Note that this is a combined assumption about the algorithm (for instance the expressivity of its hypothesis class) and the data distribution. If the algorithm is fully expressive (able to express any function), and minimizes the training error (it belongs to the class of empirical risk minimization (ERM) algorithms), then this condition doesn’t put any constraint on $\mathcal{D}$ beyond being noiseless.
For the realizable case, bounds usually take the form
$$\epsilon\leq\frac{C(\mathcal{A},S,\delta)}{m} \qquad (3)$$
with probability at least $1-\delta$, where we have omitted (without loss of generality) the exponent $\gamma$, because realizable bounds have $\gamma=1$ in most cases. (The reason for this exponent comes fundamentally from the non-Gaussian behaviour of the tail of the binomial distribution when the mean is close to zero (langford2005tutorial).)
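A standard concrete instance of eq. (3) is the classic realizable bound for a finite hypothesis class (a textbook result, shown here only to illustrate the $1/m$ rate, not a bound from this paper): with probability at least $1-\delta$, every $h$ in a finite class $\mathcal{H}$ with zero training error satisfies $\epsilon(h)\leq(\ln|\mathcal{H}|+\ln(1/\delta))/m$:

```python
import math

def realizable_finite_class_bound(class_size, m, delta):
    """Classic realizable bound for a finite hypothesis class: with
    probability >= 1 - delta, any h in H with zero training error has
    eps(h) <= (ln|H| + ln(1/delta)) / m, i.e. eq. (3) with gamma = 1
    and capacity C = ln|H| + ln(1/delta)."""
    return (math.log(class_size) + math.log(1 / delta)) / m

# Doubling m halves the bound (gamma = 1), in contrast to the m^{-1/2}
# rate of agnostic Hoeffding-type bounds.
b1 = realizable_finite_class_bound(2 ** 20, 10_000, 0.05)
b2 = realizable_finite_class_bound(2 ** 20, 20_000, 0.05)
```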
Statistical learning theory has focused on these two classes of assumptions, because they are considered to be minimal, and cover most cases of interest. However, other assumptions on are sometimes used in more advanced analyses, typically involving datadependent and nonuniform bounds (see below).
In this context it is interesting to compare a frequentist with a Bayesian approach to assumptions on the data. In the former case, one only assumes that $\mathcal{D}$ belongs to a restricted set of possible data distributions. This approach naturally leads to studying the worst-case generalization over that set. In a Bayesian approach, one assumes a prior over data distributions, and then studies the typical or average-case generalization. Most of the bounds we describe here use the frequentist approach, but there have been some interesting results using a Bayesian prior over data distributions; see section 4.5.
3.3.2 According to assumptions on the algorithm
As is the case for $\mathcal{D}$, supervised learning theory often makes very minimal assumptions on the learning algorithm $\mathcal{A}$, though there is also a rich set of algorithm-dependent analyses. The minimal assumption made on $\mathcal{A}$ is that it outputs hypotheses within a set of hypotheses called the hypothesis class $\mathcal{H}$. As this is the minimal assumption often made, we will refer to it as “algorithm-independent” to distinguish it from approaches which make stronger assumptions. We thus classify bounds into the following two classes.

Algorithm-independent bounds. These bounds only assume that $\mathcal{A}(S)\in\mathcal{H}$ for all $S$ (or for all $S$ within the support of a restricted set of data distributions, as in nagarajan2019uniform). Furthermore, the capacity can only depend on $\mathcal{H}$ and $h$ (and possibly $S$), and not in a more general way on $\mathcal{A}$. (We include this constraint because otherwise there wouldn’t be any useful distinction from algorithm-dependent bounds: we could just take a set of algorithm-dependent bounds that covers all algorithms with outputs in $\mathcal{H}$, and call it an “algorithm-independent bound”. We choose the definition to avoid this possibility.) This definition means that bounds in this class must bound the generalization gap or the generalization error for the worst-case hypothesis from the hypothesis class $\mathcal{H}$. (Under these definitions, realizable bounds should technically be algorithm-dependent, because they assume something beyond the hypothesis class of the algorithm: they assume that the algorithm is ERM. However, we will still refer to bounds that only add this extra assumption as algorithm-independent, as the ERM assumption may be considered relatively weak. This distinction becomes important when discussing some subtleties, like those in section 3.3.4.) We can therefore write bounds in this class in the following way
$$\forall h\in\mathcal{H}:\ \Delta(\epsilon,\hat{\epsilon})\leq\left(\frac{C(\mathcal{H},h,S,\delta)}{m}\right)^{\gamma} \qquad (4)$$
which holds with probability at least $1-\delta$ over $S\sim\mathcal{D}^m$. Algorithm-independent bounds are commonly classified in two classes, according to the dependence of the capacity on $h$:

Uniform convergence bounds (or uniform bounds for short) are algorithm-independent bounds where the capacity is independent of $h$. This includes VC dimension bounds (section 4.1.1) and Rademacher complexity bounds (section 4.1.2). Here the nomenclature “uniform” means the value of the bound is independent of the hypothesis.

Non-uniform convergence bounds (or non-uniform bounds for short) are algorithm-independent bounds where the capacity depends on $h$. Common examples include bounds for structural risk minimization (section 4.2.1.1).


Algorithm-dependent bounds. This type of bound generalizes the above class by considering stronger, or more general, assumptions on $\mathcal{A}$, as well as more general dependence of the capacity on $\mathcal{A}$. We can express this general class of bounds as
$$\forall\mathcal{A}\in\mathbb{A}:\ \Delta(\epsilon,\hat{\epsilon})\leq\left(\frac{C(\mathcal{A},S,\delta)}{m}\right)^{\gamma} \qquad (5)$$
which holds with probability at least $1-\delta$ over $S\sim\mathcal{D}^m$, where $\mathbb{A}$ is a set of algorithms that represents our assumptions on $\mathcal{A}$. An example of this class of bounds are stability-based bounds (section 4.3.1), which rely on the assumption that the algorithm’s output doesn’t depend strongly on any individual training example. The case where $\mathbb{A}$ contains a single algorithm corresponds to the analysis of a particular algorithm. We will refer to this special case as algorithm-specific bounds. However, in almost all cases, analyses of particular algorithms rely only on a subset of the properties of the algorithm, so that the results often apply more generally and are not restricted to a specific algorithm.
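For illustration (a textbook example, not a bound analyzed in this paper), the simplest uniform convergence bound of the form of eq. (4) comes from a union bound over a finite hypothesis class combined with Hoeffding's inequality; the capacity $\ln|\mathcal{H}|$ depends on $\mathcal{H}$ but not on $h$, which is what makes the bound uniform:

```python
import math

def uniform_finite_class_gap(class_size, m, delta):
    """Uniform convergence over a finite class H via a union bound over
    Hoeffding's inequality: with probability >= 1 - delta, EVERY h in H
    satisfies |eps(h) - eps_hat(h)| <= sqrt((ln|H| + ln(2/delta)) / (2m)).
    The capacity ln|H| is independent of h: a uniform bound."""
    return math.sqrt(
        (math.log(class_size) + math.log(2 / delta)) / (2 * m)
    )

# With |H| = 2^30 and m = 100,000 the worst-case gap is already ~1%.
gap = uniform_finite_class_gap(2 ** 30, 100_000, 0.05)
```

Because the guarantee holds simultaneously for all $h\in\mathcal{H}$, it holds in particular for whatever $h$ the algorithm returns, which is how such bounds are applied without any assumption on $\mathcal{A}$ beyond $\mathcal{A}(S)\in\mathcal{H}$.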
3.3.3 According to dependence of capacity on training set
Another way to classify bounds is according to the dependence of the capacity on the training dataset $S$. For any of the classes of bounds discussed above, we further distinguish the following two classes of bounds.

Data-independent bounds are bounds in which the capacity does not depend on $S$. A common example is the VC dimension bound.

Data-dependent bounds are bounds in which the capacity depends on $S$. Common examples are Rademacher complexity bounds. The PAC-Bayes bound for Bayesian algorithms (such as the one we present in theorem 5.1) is another example.
Note that in this paper we do not explicitly consider data-distribution-dependent bounds, because we only consider bounds that depend on quantities derivable from $S$. However, we will consider the behaviour of bounds under different assumptions on the data distribution $\mathcal{D}$. In particular, we could look at the expected value of the bound, or its whole distribution, for different $\mathcal{D}$. This is the sense in which we will discuss “data distribution dependence” in this paper. Note that only data-dependent bounds, which depend on $S$, can depend on the data distribution in this sense.
3.3.4 Further comments on non-uniform bounds and algorithm-dependent bounds
A common way to derive algorithm-dependent bounds is to start with non-uniform convergence bounds, and then make further assumptions on the algorithm which restrict the set of hypotheses $h$ that may be returned for a given $S$. We can then use the maximum value of the bound among those $h$ as an algorithm-dependent bound, valid for algorithms which satisfy the assumptions. Note that this makes the bound automatically data-dependent. This can be expressed by bounds of the form
$$\forall h\in\tilde{\mathcal{H}}(S):\ \Delta(\epsilon,\hat{\epsilon})\leq\left(\frac{C(\mathcal{H},h,S,\delta)}{m}\right)^{\gamma} \qquad (6)$$
which holds with probability at least $1-\delta$ over $S\sim\mathcal{D}^m$, where $\tilde{\mathcal{H}}(S)$ is a set which includes $\mathcal{A}(S)$ for the class of algorithms which we are considering. This is what is done, for example, for margin bounds (section 4.2.1.2) for max-margin classifiers, like SVMs, where we assume the algorithm will output an $h$ which maximizes the margin, and then plug that condition into a non-uniform margin bound (by setting $\tilde{\mathcal{H}}(S)$ to be the set of max-margin classifiers for $S$).
Non-uniform bounds are often designed with particular assumptions on the algorithm and dataset in mind. This is because the value of non-uniform bounds depends on both $h$ and $S$. This in turn means that the notion of optimality for non-uniform bounds (and algorithm- and data-dependent bounds in general) should also depend on the algorithm and the data distribution, as argued in section 6.
In the applications to deep learning theory we study, non-uniform convergence bounds are only used as a way to obtain algorithm-dependent bounds, though we still present the fundamental non-uniform bounds in section 4.2.1.1 as useful background for the algorithm-dependent bounds based on them.
In a recent influential paper, nagarajan2019uniform showed that for SGD-trained networks, the tightest double-sided bounds based on uniform convergence give loose bounds for certain families of data distributions. (“Tightest” in their analysis refers to the fact that their bound is specific to the particular algorithm they study, SGD-trained DNNs, and the particular dataset, which was synthetically constructed.) In appendix G.4 they extend this result to include all algorithm-dependent bounds for which the capacity is only allowed to depend on the output of the algorithm and has no other data-dependence beyond this, while in appendix J they show that these limitations also apply to standard deterministic PAC-Bayes bounds based on the general KL-divergence-based PAC-Bayes bound (eq. 13). The intuition behind these results is that this kind of bound can encode little information about the algorithm beyond the hypothesis class, as it cannot explicitly capture the dependence of $h$ on $S$.
Most of the algorithm-dependent bounds derived from non-uniform bounds that we will study are based on data-independent non-uniform bounds. This automatically gives the resulting algorithm-dependent bounds a capacity that depends only on $h$, and so they suffer from the limitations pointed out in nagarajan2019uniform. Inspired by this result, we chose to classify algorithm-dependent bounds into those that are based on non-uniform convergence and those that are not.
It is also worth noting that there are several ways to “get around” the limitations in nagarajan2019uniform. 1) The analysis of nagarajan2019uniform finds a family of distributions where the class of bounds we discuss above fails. However, it doesn’t rule out the possibility that these bounds may give tight predictions for other data distributions. 2) We can allow an algorithm-dependent bound to depend on the data in other ways than via $h$ (e.g. by having explicit dependence on $S$, as in shawe1998structural; shawe1997pac). 3) negrea2019defense showed that one can still make use of uniform convergence for the distributions that nagarajan2019uniform study, by bounding the risk difference between $\mathcal{A}(S)$ and a surrogate hypothesis in a class for which uniform convergence gives tight bounds. 4) We can consider non-double-sided bounds. Bounds derived from an analysis assuming realizability can satisfy 2) and 4). They can satisfy 2) because they only guarantee convergence on a hypothesis class which depends on the data (the set of $h$ with zero training error). Bounds considered by nagarajan2019uniform, even if they can depend on $S$, are worst-case over training sets. However, an analysis which assumes realizability can bound the generalization gap only for those training sets where $\hat{\epsilon}=0$. Realizable bounds can satisfy 4) because they can use the one-sided version of the Chernoff bound for the mean (langford2005tutorial), as can be seen, for example, in Corollary 2.3 in shalev2014understanding. Note that realizable bounds derived from agnostic bounds (by setting $\hat{\epsilon}=0$) will still suffer from the limitations that nagarajan2019uniform point out, because agnostic bounds themselves do not satisfy the conditions above. Therefore, only bounds which take full advantage of the realizability assumption may avoid the limitations in nagarajan2019uniform.
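The gain from one-sidedness under realizability can be made concrete with a textbook calculation (an illustration in the spirit of langford2005tutorial, not a bound from this paper): for a hypothesis with zero training error, the exact one-sided binomial tail inversion gives the smallest $\epsilon^\ast$ such that observing $m$ correct predictions is still plausible at confidence $\delta$, and it is always at least as tight as the standard relaxation $\ln(1/\delta)/m$:

```python
import math

def realizable_one_sided_bound(m, delta):
    """Exact one-sided inversion for a zero-training-error hypothesis:
    if eps(h) > eps*, then P[h is correct on all m i.i.d. samples]
    = (1 - eps(h))**m < delta. Solving (1 - eps*)**m = delta gives
    eps* = 1 - delta**(1/m)."""
    return 1 - delta ** (1 / m)

m, delta = 10_000, 0.05
exact = realizable_one_sided_bound(m, delta)
relaxed = math.log(1 / delta) / m  # standard relaxation: 1 - e^{-x} <= x
```

Both quantities decay as $1/m$, recovering the $\gamma=1$ rate of realizable bounds, whereas a double-sided Gaussian-tail analysis would only give the $m^{-1/2}$ rate.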
We note that the marginal-likelihood PAC-Bayes bound we present in section 5 is based on a realizable analysis using one-sided bounds, and thus avoids the limitations of double-sided bounds discussed above. In fact, our bound holds with high probability over the Bayesian posterior, rather than universally over a whole family of hypothesis-dependent posteriors, as is usual for deterministic PAC-Bayes bounds (nagarajan2019uniform).
3.3.5 Overview of bounds
| | Algorithm-independent (section 4.1): based on uniform convergence | Algorithm-dependent (section 4.2): based on non-uniform convergence | Algorithm-dependent: other |
|---|---|---|---|
| Data-independent | VC dimension bound^{*} (section 4.1.1) | SRM-based bounds^{†} (section 4.2.1.1) | uniform stability bounds^{‡} and compression bounds^{§} (section 4.3.1) |
| Data-dependent | Rademacher complexity bound^{¶} (section 4.1.2) | data-dependent SRM-based bounds^{**} (section 4.2.1.1), margin bounds^{††} (section 4.2.1.2), NTK-based bounds^{§§} (section 4.2.1.3), sensitivity-based bounds^{‡‡} (section 4.2.1.4), other PAC-Bayes bounds^{¶¶} (section 4.2.2) | non-uniform stability bounds^{***} (section 4.3.1), marginal-likelihood PAC-Bayes bound^{†††} (section 5) |

^{*}vapnik1974theory; blumer1989learnability; bartlett2017nearly
^{†}vapnik1995nature; mcallester1998some
^{‡}bousquet2002stability; pmlrv48hardt16; mou2018generalization
^{§}littlestone1986relating; brutzkus2018sgd
^{¶}bartlett2002rademacher
^{**}shawe1998structural; shawe1997pac
^{††}bartlett1997valid; bartlett1998sample; bartlett2017spectrally; neyshabur2018a; golowich2017size; neyshabur2018towards; barron2019complexity
^{‡‡}neyshabur2017exploring; dziugaite2017computing; arora2018stronger; banerjee2020randomized
^{§§}arora2019fine; cao2019generalization
^{¶¶}zhou2018non; dziugaite2018data
^{***}kuzborskij2017data
^{†††}valle2018deep
In the next sections of this paper, we will describe the major families of generalization error bounds that have been applied to DNNs. While we don’t claim that the list is exhaustive, we have tried to cover all the major approaches to generalization bounds.
In table 1 we present a highlevel overview of where different general classes of bounds found in the literature fit within the classification introduced above. It also lists which bounds we treat explicitly in the rest of the paper, and where they sit in our taxonomy. Thus the table helps illustrate what kinds of general assumptions go into the different bounds.
Given this hierarchy of the main types of bounds, we next turn to a comparison of their performance. As expected, the overall empirical performance of the bounds improves as more assumptions are added.
4 Comparing existing bounds against desiderata
In this section we use the taxonomy from section 3 (illustrated in table 1) to organise a discussion of how different bounds fare against the desiderata proposed in section 2. We use a ✗ when there is strong evidence that bounds in a family fail to satisfy the most important aspects of a desideratum, ✓ when there is strong evidence that bounds in the family satisfy the most important aspects of a desideratum, and ⚫ otherwise. We are aware that these are not formally defined notions, and the marks should just be taken as an aid for the reader.
4.1 Algorithm-independent bounds
4.1.1 Data-independent uniform convergence bounds: VC dimension
One of the iconic results in the theory of generalization is the notion of uniform convergence, introduced by Vapnik and Chervonenkis (vapnik1974theory). Expressed in the language of PAC learning (blumer1989learnability), it considers data-independent uniform convergence bounds, where the capacity doesn’t depend on the data, but only on the hypothesis class $\mathcal{H}$. The main result of this theory is that the optimal bound of this form (up to a fixed multiplicative constant) for the generalization gap, in the case of binary classification, is

$$\epsilon(h) - \hat{\epsilon}(h) \leq c\sqrt{\frac{d_{\mathrm{VC}}(\mathcal{H}) + \ln(1/\delta)}{m}} \quad \text{for all } h \in \mathcal{H}, \qquad (7)$$

for some constant $c$, where $d_{\mathrm{VC}}(\mathcal{H})$ is a combinatorial quantity called the Vapnik-Chervonenkis dimension (shalev2014understanding), which depends on the hypothesis class alone. In the realizable case, they also proved that the optimal realizable data-independent uniform bound is

$$\epsilon(h) \leq c'\,\frac{d_{\mathrm{VC}}(\mathcal{H})\ln\left(m/d_{\mathrm{VC}}(\mathcal{H})\right) + \ln(1/\delta)}{m} \quad \text{for all } h \in \mathcal{H}_S^0, \qquad (8)$$

for some constant $c'$, where $\mathcal{H}_S^0$ is the set of all $h \in \mathcal{H}$ with zero training error on $S$. The particular realizability assumption here is that the data distribution should be such that, for all training sets $S$, $\mathcal{H}_S^0$ is non-empty.
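To see concretely why the VC bound becomes vacuous at modern scales (desideratum D.5 below), the following back-of-the-envelope sketch evaluates a bound of the form of eq. 7. The constants are assumed ($c = 1$), and the scaling $W L \log_2 W$ for a network with $W$ parameters and $L$ layers is only a rough order-of-magnitude stand-in for the nearly-tight VC-dimension bounds for piecewise-linear networks (bartlett2017nearly):

```python
import math

def vc_bound(d_vc, m, c=1.0, delta=0.05):
    # Agnostic VC generalization-gap bound: c * sqrt((d_vc + ln(1/delta)) / m).
    return c * math.sqrt((d_vc + math.log(1 / delta)) / m)

# Rough VC-dimension scaling d_vc ~ W * L * log2(W) for a ReLU network
# (up to constants; a crude stand-in for the nearly-tight bounds).
W, L, m = 25_000_000, 50, 1_200_000  # ResNet-50-scale parameters, ImageNet-scale m
d_vc = W * L * math.log2(W)
print(vc_bound(d_vc, m))  # far above 1, hence vacuous
```

With these (assumed) numbers the bound exceeds 1 by two orders of magnitude, while the same formula gives sensible values only when $d_{\mathrm{VC}} \ll m$.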
How does this bound do at the desiderata?

D.1 ✗ The bound is dataindependent by construction, and therefore its value is the same for any data distribution or training set.

D.2 ✗ The bound decreases with training set size, but at a rate which is independent of the dataset, unlike what is observed in practice (learning curves typically follow a power law $m^{-\alpha}$ for a range of exponents $\alpha$, often significantly smaller than 1). Recently, bousquet2020theory pointed out that a fundamental reason why data-independent uniform convergence bounds don’t capture the behaviour of learning curves is that the worst-case distribution can depend on $m$, so that the VC bound bounds an ‘envelope’ of the actual learning curves for individual distributions, and this envelope may have a markedly different form as a function of $m$ than the individual learning curves.

D.3 ✗ The VC dimension can capture differences in architectures. However, it doesn’t appear to capture the correct trends. For example, the VC dimension grows with the number of parameters (baum1989size; bartlett2017nearly), while for neural networks the generalization error tends to decrease (or at least not increase) with increased overparametrization (neyshabur2018the).
D.4 ✗ The bound is only dependent on the algorithm via the hypothesis class. Therefore it won’t capture any algorithm-dependent behaviour except for regularization techniques that restrict the hypothesis class.

D.5 ✗ The VC dimension of neural networks used in modern deep learning is typically much larger than the number of training examples (zhang2016understanding), thus leading to vacuous VC-dimension bounds.

D.6 ✓ Although computing the exact VC dimension of neural networks is intractable, there are good approximations and bounds which are easily computable (bartlett2017nearly).

D.7 ✓ The VC dimension offers a rigorous bound with minimal assumptions. Therefore, its guarantees are rigorously applicable to many cases.
A common way to interpret the VC dimension bound is in terms of the bias-variance tradeoff (neal2018modern; neal2019bias), a simple heuristic that is widely used in machine learning. The bias-variance tradeoff captures the intuition that there is a tradeoff between a model being too simple, when it cannot properly represent the data (large bias), and a model being too complex (large capacity), when it will tend to overfit, leading to large variance on unseen data. For the bound eq. 7 we can identify the training error $\hat{\epsilon}(h)$ as measuring the bias, and the term involving the VC dimension as indicative of the variance. Intuitively, increasing the VC dimension can make the training error smaller by increasing the capacity, at the expense of higher variance (the second term), so that one may expect to see a U-shaped curve of generalization error versus model complexity. However, many empirical works have shown that, once in the overparametrized regime, DNNs in fact typically show a monotonic decrease in generalization error as overparametrization increases, unlike what the VC dimension bound would suggest (lawrence1998size; neyshabur2018the; belkin2019reconciling). As the VC dimension bound is optimal among data-independent algorithm-independent bounds, these results tell us that this class of bounds is fundamentally unable to explain the generalization of overparametrized neural networks.
Intuitively, it is not surprising that the VC dimension bound cannot capture this behaviour. As overparametrization increases, the model is able to express more functions, and so the worst-case generalization in the hypothesis class can only get worse. What the VC dimension measure (or, for that matter, naive applications of the bias-variance tradeoff heuristic in terms of overparametrization) is not capturing is: 1) the strong inductive bias within the hypothesis class which DNNs have; 2) that the effective expressivity of the model can depend on the data. That is, we need to look for bounds that are algorithm-dependent and/or data-dependent. In the following section, we will look at algorithm-independent data-dependent bounds, and we will see that data-dependence alone is not enough, so that a bound which takes the inductive bias into account is necessary to explain the generalization of overparametrized DNNs.
4.1.2 Data-dependent uniform convergence bounds: Rademacher complexity
As a first step towards including data dependence, we consider a classic data-dependent uniform convergence bound of the form of eq. 4 for algorithm-independent uniform bounds, where the function in eq. 4 is the absolute value, and the capacity is independent of the hypothesis. It is given by:

$$\epsilon(h) \leq \hat{\epsilon}(h) + 2\,\mathcal{R}_S(\mathcal{H}) + c\sqrt{\frac{\ln(1/\delta)}{m}}, \qquad (9)$$

where $c$ is a constant, and $\mathcal{R}_S(\mathcal{H})$ is the Rademacher complexity of the set of vectors $\{(h(x_1), \dots, h(x_m)) : h \in \mathcal{H}\}$, where $x_1, \dots, x_m$ are the input points in $S$ (bartlett2002rademacher; shalev2014understanding). There is also a lower bound which matches it up to a constant and up to an additive $O(1/\sqrt{m})$ term (bartlett2002rademacher; koltchinskii2011oracle). Therefore, whether this bound is the optimal data-dependent uniform generalization gap bound up to a constant depends on the rate at which the Rademacher complexity decays with $m$. Bounds on the Rademacher complexity tend to be $O(1/\sqrt{m})$, but they often dominate the second term, so that the lower and upper bounds match in their first-order behaviour, suggesting that the bound in eq. 9 may often be close to optimal within this class of bounds. Rademacher complexity bounds for neural networks typically rely on a bound on the norm of the weights (bartlett2002rademacher), which typically grows with overparametrization. The lesson we learn is that although Rademacher bounds are data-dependent, they are still worst-case over algorithms with hypothesis class $\mathcal{H}$. Given that DNNs can express functions that generalize badly (zhang2016understanding), it is thus not surprising that Rademacher complexity bounds are vacuous, and suffer from similar problems as VC dimension bounds.
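The definition of the empirical Rademacher complexity, an expected supremum of correlations with random sign vectors, can be estimated directly by Monte Carlo for a small finite class. The toy sketch below (all names and classes are our own illustrative constructions) shows that a class rich enough to fit arbitrary sign patterns has much higher complexity than a small class, mirroring why expressive DNN hypothesis classes yield loose worst-case bounds:

```python
import random

def empirical_rademacher(outputs_per_hypothesis, n_trials=2000, seed=0):
    """Monte-Carlo estimate of the empirical Rademacher complexity of a finite
    hypothesis class, given each hypothesis's output vector on the m inputs:
    E_sigma[ sup_h (1/m) * sum_i sigma_i * h(x_i) ]."""
    rng = random.Random(seed)
    m = len(outputs_per_hypothesis[0])
    total = 0.0
    for _ in range(n_trials):
        sigma = [rng.choice((-1, 1)) for _ in range(m)]
        total += max(sum(s * o for s, o in zip(sigma, outs)) / m
                     for outs in outputs_per_hypothesis)
    return total / n_trials

# "rich": all 16 binary output patterns on m=4 points (can fit any signs);
# "poor": just the constant-0 and constant-1 hypotheses.
rich = [[float(b) for b in f"{k:04b}"] for k in range(16)]
poor = [[0.0] * 4, [1.0] * 4]
print(empirical_rademacher(rich), empirical_rademacher(poor))
```

For the rich class the supremum always picks the hypothesis matching the positive signs, giving a complexity near 0.5, while the two-element class comes out far lower.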
We note that Rademacher complexity has also been used as a tool in analyses which take into account more properties of the algorithm, for example for bounds based on non-uniform convergence. In this paper we will treat these other bounds separately, mainly in section 4.2.1.2, when looking at margin bounds. We will nevertheless briefly comment on these margin bounds in the desiderata below, as they have been studied more thoroughly, and may give insights into Rademacher complexity more generally.
The performance of Rademacher-complexity-based bounds on our desiderata can be found below:

D.1 ✗ The bound is datadependent, and could capture some dependence in the dataset. However, it only depends on the distribution over inputs, and therefore it can’t depend on the complexity of the target function, when the input distribution is fixed. This is unlike real neural networks which generalize worse when the labels are corrupted (zhang2016understanding).

D.2 ✗ Bounds on the Rademacher complexity are typically $O(1/\sqrt{m})$, thus not capturing the behaviour of learning curves. Furthermore, margin-based bounds (which rely on norm-based bounds on the Rademacher complexity) often increase with $m$. See section 4.2.1.2.

D.3 ✗ As the Rademacher complexity captures a notion of expressivity, similarly to the VC dimension, it often grows with overparametrization and the number of layers (see, for example, the norm-based bounds in section 4.2.1.2). It seems unlikely that it could capture other architectural differences.

D.4 ✗ Like the VC dimension bound, it is only dependent on the algorithm via the hypothesis class. The most studied quantity for comparing hypothesis classes in this context is the norm of the weights of the DNN. As we comment in section 4.2.1.2, these bounds appear to anticorrelate with the true error when changing several common optimization hyperparameters.

D.5 ✗ Like the VC dimension bound, these bounds are typically vacuous in the overparametrized regime. In fact, it has recently been shown that there are data distributions for which the tightest double-sided distribution-dependent uniform convergence bounds for several SGD-trained models are provably vacuous (nagarajan2019uniform). This implies that the distribution-dependent data-independent version of the Rademacher complexity bound (shalev2014understanding), and thus the bound in eq. 9, is also vacuous for those data distributions. Although for other data distributions the bounds may not be vacuous, this work suggests that uniform convergence bounds have some fundamental limitations (see section 3.3.4 for further discussion of this issue).

D.6 ✓ Same as for VC dimension bounds.

D.7 ✓ Same as for VC dimension bounds.
Although some fundamental limitations of the VC dimension are overcome by Rademacher bounds, the bounds for neural networks currently based on Rademacher complexity have very similar problems to those based on VC dimension. Although their capacity measure is data-dependent, for the problems to which DNNs are applied these bounds still give vacuous predictions and grow with the number of parameters. This observation, combined with the fact that the bounds may be optimal among uniform generalization gap bounds, suggests that to overcome the limitations highlighted here we will need to consider data-dependent non-uniform and algorithm-dependent bounds.
Note that, although one can prove that the VC dimension bound (and perhaps the Rademacher complexity bound) is optimal within its class of bounds, for data-dependent bounds this will not in general be possible. As we discuss in section 6, whether one bound is tighter than another depends on what prior assumptions are made about the data distribution, so that a unique notion of optimality may not exist.
4.2 Algorithm-dependent bounds
In this section we consider the major classes of algorithmdependent bounds and how they fare at satisfying the desiderata. We focus on realizable bounds (which assume zero training error), because modern deep learning often works in this regime (zhang2016understanding). For many of the recent bounds for deep learning, the lack of experiments means that we can’t conclusively answer whether they satisfy several of the desiderata. We hope that future work can fill these gaps.
4.2.1 Algorithm-dependent bounds based on non-uniform convergence
We start by looking at algorithm-dependent bounds derived from non-uniform convergence bounds (see also section 3.3.4). We begin by presenting the basic types of non-uniform bounds in the next section, before covering the two main applications to deep learning in the two sections after that: margin bounds and sensitivity-based bounds. We will also briefly comment on some other proposed approaches for applying PAC-Bayes to deep learning.
4.2.1.1 Basic non-uniform convergence bounds and structural risk minimization
The simplest and most fundamental idea for making non-uniform bounds is related to a learning technique called structural risk minimization (SRM), developed by Vapnik and Chervonenkis (vapnik1995nature). The derivation of this bound is very similar to the classic textbook PAC bound (see e.g. corollary 2.3 in shalev2014understanding), but rather than using a uniform union bound, it uses a non-uniform union bound over the hypothesis class to prove that, for any countable hypothesis class $\mathcal{H}$ and any distribution $P$ over hypotheses (mcallester1998some; shalev2014understanding):

$$\epsilon(h) \leq \hat{\epsilon}(h) + \sqrt{\frac{\ln(1/P(h)) + \ln(2/\delta)}{2m}} \quad \text{for all } h \in \mathcal{H}. \qquad (10)$$

First, as expected, this bound reduces to the standard uniform finite-class PAC bound (valiant1984theory) when $P$ is the uniform distribution over a finite $\mathcal{H}$, in which case $\ln(1/P(h)) = \ln|\mathcal{H}|$. What eq. 10 tells us is that if we have a learning algorithm and a problem for which we have prior knowledge that some functions are more likely to be learned than others, then we can obtain tighter bounds by choosing a $P$ that assigns higher probability to these functions. One way to understand intuitively how this knowledge affects the bound is to consider the limit where $P$ is highly concentrated on a subset of $\mathcal{H}$, and approximately uniform within that subset. Then eq. 10 approaches the finite-class uniform PAC bound for a reduced hypothesis class, and the capacity $\ln(1/P(h))$ can be interpreted as measuring an “effective size” of the hypothesis class. Another way to think about this is that choosing a $P$ amounts to “betting” that some hypotheses are more likely to be output than others. If we are right, then our bound will be better in practice than the standard finite-class PAC bound, while if we are wrong and the algorithm outputs an $h$ with low $P(h)$, the bound will perform worse. In other words, unlike the uniform finite-class PAC bound, eq. 10 depends on the $h$ we obtain, so that in order to evaluate its performance we need to take into account the data distribution and the algorithm (which together determine the probability distribution with which the learning algorithm outputs hypotheses), and to work with an expected value of the generalization error bound.
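The “betting” interpretation of eq. 10 can be made concrete with a toy computation (illustrative numbers only; the class size and prior values are our own assumptions). If the algorithm outputs a hypothesis our prior $P$ favoured, the non-uniform capacity $\ln(1/P(h))$ beats the uniform $\ln|\mathcal{H}|$; if it outputs a rare hypothesis, the bet loses:

```python
import math

def uniform_bound(H_size, m, delta=0.05):
    # Finite-class uniform bound: the same capacity ln|H| for every hypothesis.
    return math.sqrt((math.log(H_size) + math.log(2 / delta)) / (2 * m))

def nonuniform_bound(p_h, m, delta=0.05):
    # SRM-style bound of eq. 10: capacity ln(1/P(h)) set by the prior weight of h.
    return math.sqrt((math.log(1 / p_h) + math.log(2 / delta)) / (2 * m))

H_size, m = 10**9, 10_000
p_favoured, p_rare = 0.01, 1e-12  # a "bet": a few hypotheses get most of the mass
print(uniform_bound(H_size, m))         # same value for every h
print(nonuniform_bound(p_favoured, m))  # tighter when the bet pays off
print(nonuniform_bound(p_rare, m))      # looser when it does not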
To define an expected value of the bound, we assume a distribution over data distributions $\mathcal{D}$, which we call the prior $\mathcal{P}$. For example, this may be fully supported on a single $\mathcal{D}$ if we know the distribution fully (perhaps we are looking only at the images in CIFAR10). In more real-world settings, we will have uncertainty over what the true distribution is, but we may believe that certain distributions (e.g. simpler ones) are more likely. We consider a stochastic learning algorithm which, for training set $S$, outputs hypothesis $h$ with probability $Q_S(h)$, called the posterior (which need not be the Bayesian posterior). Under these two assumptions, it is not hard to see that, for a given distribution $P$ over hypotheses, the expected value of the capacity in the bound eq. 10 is given by

$$\mathbb{E}_{\mathcal{D} \sim \mathcal{P}}\,\mathbb{E}_{S \sim \mathcal{D}^m}\,\mathbb{E}_{h \sim Q_S}\left[\ln\frac{1}{P(h)}\right] = \sum_h \langle Q \rangle(h) \ln\frac{1}{P(h)} = \mathrm{CE}(\langle Q \rangle, P), \qquad (11)$$

where $\mathrm{CE}$ is the cross-entropy, and $\langle Q \rangle$ is the posterior averaged over training sets and data distributions. The second equality follows from the definition of cross-entropy. We can immediately see that the optimal bound of this form, for a given prior $\mathcal{P}$ and algorithm with posterior $Q_S$, is obtained by the choice $P = \langle Q \rangle$, in which case we obtain $\mathrm{CE}(\langle Q \rangle, \langle Q \rangle) = H(\langle Q \rangle)$, where $H$ is the entropy. This calculation formalizes the intuition that we should choose $P$ to be as close as possible to the probabilities with which the algorithm outputs different hypotheses for the task at hand. It also strengthens the intuition that this bound is capturing a notion of effective size, as $H(\langle Q \rangle)$ is often interpreted as the logarithm of the effective number of elements on which $\langle Q \rangle$ is concentrated.

Furthermore, we can consider the case where $Q_S$ is given by the Bayesian posterior. In this case $\langle Q \rangle = P(h)$, where $P(h)$ is the prior distribution over hypotheses, obtained by marginalizing over input distributions^{8}^{8}8Note that, unless we restrict to the noiseless case, these would be stochastic hypotheses as defined in section 3.1.. The average capacity in this case is $H(P)$. This is conceptually similar to the No Free Lunch theorem, in that it tells us that, for the optimal algorithm, the bound only guarantees good generalization if we make enough assumptions about the data distribution (which corresponds to a low-entropy $P$). We note, however, that the Bayesian posterior may not give the optimal value of the bound, as can be seen from the fact that the optimal value $H(\langle Q \rangle)$ is always lowered by making $Q_S$ more deterministic, while the Bayesian posterior is not deterministic in general.
The problem with the basic non-uniform SRM bound eq. 10 is that it does not capture the idea that some functions may be more similar to others, which quantities such as the VC dimension and Rademacher complexity do capture^{9}^{9}9For example, the VC dimension of a set of functions which are very similar to each other will typically be lower than that of a set of very dissimilar functions. The generalized version of the SRM bound which we present below has the advantage of being non-uniform while also capturing some notion of similarity among the functions within each subclass.
A more commonly-used extension of the basic SRM bound considers dividing the (now potentially uncountable) hypothesis class $\mathcal{H}$ into a countable set of (usually nested) subclasses $\mathcal{H}_k$, $k \in \mathbb{N}$, such that $\mathcal{H} = \bigcup_k \mathcal{H}_k$. The result is that, for any distribution $P$ over $\mathbb{N}$ (shalev2014understanding), we have:

$$\epsilon(h) \leq \hat{\epsilon}(h) + \min_{k\,:\,h \in \mathcal{H}_k}\left[C(\mathcal{H}_k, S) + \sqrt{\frac{\ln(1/P(k)) + \ln(2/\delta)}{2m}}\,\right], \qquad (12)$$

where $C(\mathcal{H}_k, S)$ is any (potentially data-dependent) capacity for class $\mathcal{H}_k$ which guarantees uniform convergence within $\mathcal{H}_k$ (for example, a bound on its VC dimension or Rademacher complexity). Results of this form are proven in shawe1998structural and shalev2014understanding.
We can also compute the expected value of the bound eq. 12, analogously to eq. 10. For the capacity term (ignoring the confidence term), we obtain $\mathrm{CE}(\langle Q_k \rangle, P)$, where $\langle Q_k \rangle$ is the averaged posterior over subclass indices and $k(h)$ represents the index of the subclass to which $h$ belongs. Analogously to before, the optimal choice is $P = \langle Q_k \rangle$, and the Bayesian posterior will in general not result in the optimal average value of the bound.
One shortcoming of eq. 12 is that the decomposition of $\mathcal{H}$ into the subclasses $\mathcal{H}_k$ has to be defined a priori; that is, it cannot depend on the data. shawe1998structural proposed an extension to the SRM framework which addressed this shortcoming, defining a potentially infinite hierarchy of subclasses which could depend on the data $S$. This framework includes as a special case the margin bounds we will see in section 4.2.1.2.
shawe1997pac applied the data-dependent SRM framework to obtain bounds for a parametrized model, where the capacity was related to the volume in parameter space of a sphere contained within the set of parameters producing zero training error. This work inspired the development of the first PAC-Bayes bounds in mcallester1998some^{10}^{10}10Although shawe1997pac is often cited as a precursor to PAC-Bayes (shawe2019primer), it offers a distinct analysis (for example, it gives deterministic bounds rather than bounds on expected error), which as far as the authors know hasn’t been shown to necessarily give stronger or weaker bounds than PAC-Bayes, and hasn’t been applied to neural networks.. These bounds apply to stochastic learning algorithms, and bound the expected value of the generalization error under the posterior $Q$, uniformly over posteriors. The standard form of the general PAC-Bayes bound was proven by maurer2004note and states that, for any distribution $P$ over $\mathcal{H}$,

$$\mathrm{kl}\left(\hat{\epsilon}(Q)\,\middle\|\,\epsilon(Q)\right) \leq \frac{\mathrm{KL}(Q \| P) + \ln\frac{2\sqrt{m}}{\delta}}{m}, \qquad (13)$$

where $\mathrm{KL}(Q \| P)$ is the KL divergence between $Q$ and $P$. On the left-hand side we use the standard abuse of notation, writing $\mathrm{kl}(a \| b)$ for the KL divergence between Bernoulli distributions with means $a$ and $b$, and defining $\hat{\epsilon}(Q) = \mathbb{E}_{h \sim Q}[\hat{\epsilon}(h)]$ and $\epsilon(Q) = \mathbb{E}_{h \sim Q}[\epsilon(h)]$.
This bound can be seen as generalizing the SRM with data-dependent hierarchies of shawe1998structural, where instead of “hard” subdivisions of $\mathcal{H}$ into subclasses, we consider all possible distributions $Q$ on $\mathcal{H}$. The term $\mathrm{KL}(Q \| P)$ is analogous to $\ln(1/P(k))$ in eq. 12, in that it very roughly measures how much of the total probability mass of $P$ lies in the high-probability region of $Q$. The term which penalizes “classes” of hypotheses which are too diverse, analogously to the capacity term in eq. 12, is $\hat{\epsilon}(Q)$, the average training error, because this is only small if the functions agree on $S$, which will only happen with high probability if they are sufficiently similar. These analogies between the data-dependent SRM bounds and PAC-Bayes are only intuitive, but we conjecture that a more formal connection might be possible.
The PAC-Bayes bound in eq. 13 is one of the most general non-uniform data-dependent bounds, and its different applications give rise to sensitivity-based bounds (section 4.2.1.4), among others (section 4.2.2). In fact, margin bounds can also be derived from PAC-Bayes (langford2003pac).
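In practice, a bound of the kl form of eq. 13 is turned into an explicit upper bound on the expected error by inverting the binary kl divergence numerically. The sketch below does this by bisection, for illustrative (made-up) values of the training error, $\mathrm{KL}(Q\|P)$, and $m$; it is a generic recipe, not the computation from any specific cited paper:

```python
import math

def kl_bernoulli(q, p):
    """KL divergence between Bernoulli(q) and Bernoulli(p)."""
    eps = 1e-12
    q = min(max(q, eps), 1 - eps)
    p = min(max(p, eps), 1 - eps)
    return q * math.log(q / p) + (1 - q) * math.log((1 - q) / (1 - p))

def pac_bayes_bound(train_err, kl_qp, m, delta=0.05):
    """Invert a PAC-Bayes-kl bound: return (approximately) the largest p with
    kl(train_err || p) <= (KL(Q||P) + ln(2*sqrt(m)/delta)) / m, via bisection."""
    rhs = (kl_qp + math.log(2 * math.sqrt(m) / delta)) / m
    lo, hi = train_err, 1.0 - 1e-9
    for _ in range(100):
        mid = (lo + hi) / 2
        if kl_bernoulli(train_err, mid) <= rhs:
            lo = mid
        else:
            hi = mid
    return lo

# Illustrative numbers: a small KL(Q||P) relative to m yields a non-vacuous bound.
print(pac_bayes_bound(train_err=0.02, kl_qp=500.0, m=50_000))
```

The kl inversion is what makes the bound tight in the realizable regime: when the training error is near zero, the resulting error bound decays like $1/m$ rather than $1/\sqrt{m}$.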
We don’t comment on the desiderata for the SRM and PAC-Bayes bounds described above, because they provide a general framework for the non-uniform data-dependent bounds we explore next. We will comment on the desiderata for these bounds individually.
4.2.1.2 Norm-based and margin bounds
A popular method for obtaining data-dependent non-uniform bounds is the method of margin-based bounds. These usually start with norm-based bounds, which bound the Rademacher complexity of subclasses of a hypothesis class parametrized by a parameter vector (weights) $w$, corresponding to balls in which $w$ has bounded norm. One can then either express the bound under an assumption on the weight norms, or have the bound depend on the weight norms. The latter is done by applying an SRM-like bound to the family of hypothesis subclasses corresponding to different weight norms. For margin bounds, the bound is applied to a margin loss, which upper-bounds the 0-1 loss. The result is a bound on the generalization error which depends on a new data-dependent property called the margin, which measures how confidently the classifier is classifying examples. The bound, in the agnostic case, has the general form, for any margin $\gamma > 0$ (shalev2014understanding),
$$\epsilon(w) \leq \hat{\epsilon}_\gamma(w) + a\,\frac{C(w)}{\gamma\sqrt{m}} + b\sqrt{\frac{\ln(1/\delta)}{m}}, \qquad (14)$$

where $a$ and $b$ are constants (that may depend on the algorithm but not on $S$ or $w$), $C(w)$ is a capacity measure which usually measures the norm of the parameters $w$, and the margin error $\hat{\epsilon}_\gamma(h)$ is defined as the fraction of training examples with $y_i h(x_i) < \gamma$, for a hypothesis $h$. Note that we usually apply this to real-valued hypotheses $h$, for which the classification error is defined as $\epsilon(h) = \mathbb{P}[\mathrm{sign}(h(x)) \neq y]$, where $y \in \{-1, +1\}$ for binary classification. We abuse notation by writing $\epsilon(w)$ and $\hat{\epsilon}_\gamma(w)$ for the quantities evaluated at the hypothesis corresponding to parameters $w$. One can also make a bound that holds (non-uniformly) for all values of $\gamma$ by applying a weighted union bound over discretized values of $\gamma$ (shalev2014understanding).

The margin loss counts the misclassification errors plus those examples which were classified correctly but with low confidence (measured by $y_i h(x_i)$ being smaller than $\gamma$). In the case of linear models, where $h_w(x) = w \cdot x$ and $C(w) = \|w\|_2$, the ratio $\gamma / \|w\|_2$ measures a geometric margin, and the margin loss measures the number of examples that are not on the right side of the classification boundary by a distance greater than $\gamma / \|w\|_2$. Support vector machines are a famous example of an algorithm trying to maximize this geometric margin (cortes1995support).

For neural networks, margin bounds were originally developed based on an analysis of their fat-shattering dimension (bartlett1997valid; bartlett1998sample). These bounds depend on the norm of the weights, and thus typically grow with overparametrization. The bounds also grow exponentially with depth. More recently, more complex norms of the weights have been studied, as well as margin bounds using the Lipschitz constant of the network as the capacity measure (bartlett2017spectrally; neyshabur2018a; golowich2017size; neyshabur2018towards; barron2019complexity). These bounds have shown some correlation with the complexity of the data, but suffer from their (implicit or explicit) dependence on width and depth, and are vacuous (neyshabur2017exploring; neyshabur2015norm; arora2018stronger). They also show negative correlation with the generalization error when changing certain training hyperparameters (jiang2019fantastic).
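The margin quantities entering eq. 14 are straightforward to compute for a linear model. The toy sketch below (our own illustrative data and function names) computes normalized margins $y_i (w \cdot x_i + b) / \|w\|$ and the margin loss at different thresholds $\gamma$, showing how the margin loss interpolates between the 0-1 error and a stricter, confidence-aware error:

```python
import math

def margins(w, b, points, labels):
    """Normalized margins y * (w.x + b) / ||w|| for a linear classifier."""
    norm_w = math.sqrt(sum(wi * wi for wi in w))
    return [y * (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm_w
            for x, y in zip(points, labels)]

def margin_loss(margin_values, gamma):
    """Fraction of points misclassified, or correct with margin below gamma."""
    return sum(1 for g in margin_values if g < gamma) / len(margin_values)

w, b = [3.0, 4.0], -1.0  # ||w|| = 5
X = [[1.0, 1.0], [0.0, 0.0], [2.0, -1.0], [-1.0, 0.5]]
y = [+1, -1, +1, -1]
g = margins(w, b, X, y)
print(g)                    # geometric margins per example
print(margin_loss(g, 0.0))  # plain 0-1 training error
print(margin_loss(g, 1.0))  # stricter: also counts low-confidence correct points
```

Here all four points are classified correctly (0-1 error 0), but three of them have margin below 1, so the margin loss at $\gamma = 1$ is 0.75: the margin term is what lets the bound reward confident classification rather than mere correctness.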
We now summarize the strengths and weaknesses of marginbased bounds on our desiderata. Note that most of our conclusions are based on empirical results on existing marginbased bounds, and may not be fundamental to the marginbased approach itself.

D.1 ✓ Margin-based bounds have been shown to correlate with the true error when comparing CIFAR versus MNIST, and when comparing uncorrupted versus corrupted data (bartlett2017spectrally; neyshabur2017exploring). The correlation should be explored over a wider range of datasets and quantified more precisely.

D.2 ⚫ The dependence of margin-based bounds on the training set size depends on how the capacity measure changes with $m$. In nagarajan2019uniform, it has been shown that several types of margin-based bounds proposed for deep neural networks actually increase with training set size! However, in dziugaite2020search, other norm- and margin-based generalization measures are found to correlate well with the error when changing training set size, suggesting that it may be possible to prove better bounds based on these measures.

D.3 ⚫ While most marginbased bounds increase with layer width, some (path norm bound in neyshabur2017exploring and the bound in neyshabur2018towards) actually decrease with layer width. Regarding variations in depth, all the proposed bounds increase with number of layers, and most of them do so exponentially (neyshabur2015norm; arora2018stronger; barron2019complexity). However, the empirical results in jiang2019fantastic; maddox2020rethinking; dziugaite2020search show that certain measures (e.g. pathnorm) positively correlate with the error when varying depth. Furthermore, according to maddox2020rethinking, the log pathnorm correlates with both width and depth significantly better than pathnorm. While no bounds have been derived that scale like these measures, it may be promising to consider work in this direction.

D.4 ⚫ jiang2019fantastic show that marginbased bounds that have been proposed to date appear to often anticorrelate with the generalization error when changing common optimization hyperparameters (dropout, learning rate, type of optimizer, etc.). On the other hand, in dziugaite2020search, it was demonstrated that some measures such as path norm, often predict certain properties well when changing learning rate (but they didn’t vary the other hyperparameters studied in jiang2019fantastic).

D.5 ✗ As far as the authors are aware, all of the margin-based bounds published to date for DNNs are vacuous (neyshabur2017exploring; neyshabur2015norm; arora2018stronger).

D.6 ✓ Margin-based bounds are often based on a notion of the norm of the weights that is relatively efficiently computable.

D.7 ✓ Proposed bounds are based on rigorous theorems with typically weak assumptions.
4.2.1.3 Bounds based on the neural tangent kernel
It was recently demonstrated that infinite-width DNNs, when trained by SGD with infinitesimal learning rate, evolve in function space as linear models with a kernel known as the neural tangent kernel (NTK) (jacot2018neural; lee2019wide). Several works, whether inspired by the NTK or not, have relied on the linearization of the dynamics of wide neural networks. The resulting bounds are similar to the norm-based bounds in the previous section 4.2.1.2, in that they tend to bound the Rademacher complexity of hypothesis subclasses characterized by a norm, usually the norm of the deviation of the weights from initialization. Analyses based on the NTK are often also able to guarantee convergence of the optimizer, so that we can also bound the empirical risk for a sufficiently large number of optimizer iterations. For instance, arora2019fine proved that, for a sufficiently wide two-layer fully connected neural network and sufficiently many (full-batch) gradient descent steps, we have
$$\epsilon(h) \leq \sqrt{\frac{2\,y^\top (H^\infty)^{-1} y}{m}} + O\!\left(\sqrt{\frac{\ln\left(m/(\lambda_0 \delta)\right)}{m}}\right), \qquad (15)$$

where $h$ is the hypothesis learnt by the DNN, $y$ is the vector of training labels, $H^\infty$ is the NTK Gram matrix for the inputs in the training set $S$, and $\lambda_0$ is a lower bound on the eigenvalues of $H^\infty$. See arora2019fine (Theorem 5.1) for the full statement of the result. The connection to the norm-based bounds comes from the NTK analysis they carried out, which showed that the Frobenius norm of the deviation of the weights from initialization is bounded by $\sqrt{y^\top (H^\infty)^{-1} y}$ plus higher-order terms. They then used this bound to obtain a data-dependent bound on the Rademacher complexity. NTK analyses could provide tighter bounds on the Rademacher complexity than those we saw in section 4.2.1.2, because the NTK likely captures the relation between the parameters and the function the network implements more precisely than analyses based on bounding the Lipschitz constant or similar quantities. More recently, cao2019generalization sharpened arora2019fine’s result with a similar bound which applies to networks of any depth trained with SGD. In their bound, $H^\infty$ is replaced by the NTK matrix of the deep network, which gives a tighter bound.
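The dominant term of eq. 15 is cheap to evaluate once the Gram matrix is known. The toy sketch below (a hypothetical 2×2 Gram matrix of our own choosing, not a real NTK) shows how the capacity $\sqrt{2\,y^\top H^{-1} y / m}$ is small when the labels align with the top eigenvector of $H$, and large when they align with a small eigenvalue, matching the label-corruption behaviour discussed in the desiderata below:

```python
import math

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]  # augmented matrix
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * v for a, v in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

def ntk_capacity(H, y):
    """Dominant term of eq. 15: sqrt(2 * y^T H^{-1} y / m)."""
    m = len(y)
    x = solve(H, y)                              # H^{-1} y
    quad = sum(yi * xi for yi, xi in zip(y, x))  # y^T H^{-1} y
    return math.sqrt(2 * quad / m)

# Toy Gram matrix with eigenvalues 1.9 (eigvec [1,1]) and 0.1 (eigvec [1,-1]).
H = [[1.0, 0.9], [0.9, 1.0]]
print(ntk_capacity(H, [1.0, 1.0]))   # labels aligned with top eigenvector: small
print(ntk_capacity(H, [1.0, -1.0]))  # "noisy" labels: much larger
```

Corrupted labels tend to have more mass on small-eigenvalue directions of the kernel, which is why this capacity tracks label noise.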

D.1 ✓ The NTK-based capacity in arora2019fine was shown to increase with the amount of label corruption on MNIST and CIFAR, as was the capacity measure in cao2019generalization on MNIST.

D.2 ⚫ The authors are not aware of any work studying the dependence of NTK-based bounds on $m$. The explicit $1/\sqrt{m}$ factor is not necessarily indicative, because the numerator in the bound likely has non-trivial dependence on $m$.

D.3 ✗ Like norm-based margin bounds, current NTK-based bounds grow with depth (cao2019generalization). On the other hand, they show very little dependence on network width, for large enough width. This is a property of NTK analyses which matches empirical observations well.

D.4 ⚫ The authors are not aware of any work studying the dependence of NTK-based bounds on the optimization algorithm. Most analyses focus on vanilla versions of SGD with specific hyperparameter choices which help the theoretical analysis. Other optimization algorithms have been shown to have NTK limits (e.g. momentum in lee2019wide), but we are not aware of generalization bounds for these.

D.5 ⚫ The bounds in cao2019generalization are non-vacuous (at least up to the dominant term), but they are not very tight.

D.6 ✓ The NTK of fully connected networks has an analytical form which is efficiently computable (lee2019wide). However, for more complex architectures, it may be necessary to estimate the NTK limit (when it exists, see yang2019scaling) via Monte Carlo (novak2019neural).

D.7 ✓ Proposed bounds are based on rigorous theorems, though the assumptions on the algorithms and the width are sometimes hard to match in practice (e.g. widths larger, or learning rates smaller, than those used in practice).
4.2.1.4 Sensitivity-based bounds
Many generalization error bounds recently developed and applied to deep learning are based on the idea that neural networks whose outputs (or loss function values) are robust to perturbations in the weights may generalize better. This is linked to the observed phenomenon that flatter minima empirically generalize better than sharper minima (hochreiter1997flat; Hinton:1993:KNN:168304.168306; zhang2018energy; keskar2016large). At an informal level, it has been argued that the reason for this correlation is that flatter minima may correspond to simpler functions (hochreiter1997flat; wu2017towards). In particular, hochreiter1997flat link flatness to generalization via the idea of minimum description length (MDL) (rissanen1978modeling). MDL generalization bounds are formally equivalent to the simple SRM bound in eq. 10, where the complexity penalty is often interpreted as the length of the string representing the hypothesis under some prefix-free code (shalev2014understanding). A more sophisticated argument linking flatness to generalization is found in the data-dependent SRM analysis of shawe1997pac. As we mentioned in section 4.2.1.1, they proved generalization bounds where the capacity was mainly controlled by the volume of a region (which they took to be a ball) in weight space in which the training error was zero. A larger volume corresponds to a flatter minimum. Note that one difference with previous work is that shawe1997pac define flatness in terms of the classification error, rather than the loss function (for which one typically uses the Hessian as a measure of flatness).
Recent theoretical works studying the link between sensitivity, flatness, and generalization focus on PAC-Bayes analyses (neyshabur2017exploring; jiang2019fantastic). In the PAC-Bayes bound eq. 13, the KL term is typically smaller when the posterior $Q$ has a larger variance, so that it can "overlap" with the prior more. On the other hand, to control the average training error, we need $Q$ to put most of its weight on regions of low error. If we combine these two considerations, the bound typically predicts best generalization for large regions of weight space with low error (flat minima). However, as shown in neyshabur2017exploring, a more careful look at the PAC-Bayesian analysis suggests that flatness alone is not sufficient to control capacity, and should be complemented with some other measure such as the norm of the weights. In particular, if $Q$ is taken to be a Gaussian around the weights found after training, and $P$ is taken to be a Gaussian around the origin, then the KL term also grows with the norm of the weights. This can be seen in the bound proposed in neyshabur2017exploring, which states that, for any $\delta > 0$, with probability at least $1-\delta$ over the training set $S$, we have
(16) $\mathbb{E}_u[L(f_{w+u})] \leq \hat{L}(f_w) + \underbrace{\mathbb{E}_u[\hat{L}(f_{w+u})] - \hat{L}(f_w)}_{\text{expected sharpness}} + 4\sqrt{\frac{KL(w+u\,\|\,P) + \ln\frac{2m}{\delta}}{m-1}}$
where $L$ is the risk; $f_w$ and $w$ are the function and weights, respectively, produced by the network after training; and $f_{w+u}$ is the function obtained by perturbing the weights of the network, $w$, by the noise $u$. The distribution of the perturbation $u$ is a hyperparameter of the bound. The expected-sharpness term after the inequality, $\mathbb{E}_u[\hat{L}(f_{w+u})] - \hat{L}(f_w)$, measures how much the loss increases on average under the perturbation $u$ of the weights. neyshabur2017exploring perform experiments showing that this bound correlates well with the true error when varying data complexity and number of training examples, but not when varying the amount of overparametrization. By optimizing the posterior of a PAC-Bayes bound conceptually similar to eq. 16, dziugaite2017computing also obtained bounds on the expected error under weight perturbations. Their results were noteworthy because the bounds were non-vacuous.
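The expected-sharpness term can be estimated by simple Monte Carlo sampling of Gaussian weight perturbations. A minimal sketch, on two hypothetical toy loss surfaces (a flat and a sharp quadratic minimum; all names and values are illustrative, not from neyshabur2017exploring):

```python
import numpy as np

def expected_sharpness(loss_fn, w, sigma, n_samples=500, rng=None):
    """Monte Carlo estimate of E_u[L(w + u)] - L(w) for Gaussian
    perturbations u ~ N(0, sigma^2 I) -- the 'expected sharpness'
    term appearing in PAC-Bayes bounds of the neyshabur2017exploring type."""
    rng = rng or np.random.default_rng(0)
    base = loss_fn(w)
    perturbed = [loss_fn(w + sigma * rng.standard_normal(w.shape))
                 for _ in range(n_samples)]
    return float(np.mean(perturbed)) - base

# Toy loss surfaces: a flat and a sharp quadratic minimum at w = 0.
flat  = lambda w: 0.1 * (w ** 2).sum()
sharp = lambda w: 10.0 * (w ** 2).sum()

w0 = np.zeros(20)
print(expected_sharpness(flat, w0, sigma=0.1),
      expected_sharpness(sharp, w0, sigma=0.1))
```

As expected, the sharper minimum yields a much larger expected sharpness at the same perturbation scale, which is the quantity the bound penalizes.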
The bounds in neyshabur2017exploring; dziugaite2017computing bound the expected value of the generalization error under perturbation of the weights, rather than the generalization error of the original network. Obtaining bounds on the latter is addressed by recent work on deterministic PAC-Bayes bounds (nagarajan2018deterministic). However, their bounds are vacuous, and follow the wrong trends when varying depth and width.
In jiang2019fantastic, it was found empirically that a worst-case measure of sharpness, which measures the loss change along the worst weight perturbation of a certain magnitude (very similar to the one proposed in keskar2016large), gives the best correlation (and best results in their causal analysis) among the many measures they tested.
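A crude sketch of such a worst-case sharpness measure is below. keskar2016large solve the inner maximization with an inexact optimizer; here random search over box-constrained perturbations is used purely as an illustrative, gradient-free stand-in, and the toy loss surfaces are our own:

```python
import numpy as np

def worst_case_sharpness(loss_fn, w, alpha, n_trials=2000, rng=None):
    """Random-search estimate of a keskar2016large-style sharpness:
    the largest loss increase over perturbations u with |u_i| <= alpha.
    (The original work uses an inexact optimizer for this maximization;
    random search is only an illustrative stand-in.)"""
    rng = rng or np.random.default_rng(0)
    base = loss_fn(w)
    best = 0.0
    for _ in range(n_trials):
        u = rng.uniform(-alpha, alpha, size=w.shape)
        best = max(best, loss_fn(w + u) - base)
    return best

flat  = lambda w: 0.1 * (w ** 2).sum()
sharp = lambda w: 10.0 * (w ** 2).sum()
w0 = np.zeros(10)
print(worst_case_sharpness(flat, w0, alpha=0.1),
      worst_case_sharpness(sharp, w0, alpha=0.1))
```

Unlike the expected sharpness, this measure depends on the single worst direction within the perturbation box, which is why it can behave differently when width is varied.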
arora2018stronger developed another approach to prove generalization bounds based on the robustness of neural networks to perturbations of the weights. They showed that if the effect of perturbing the weights does not grow too much as it propagates through the layers (a condition which they formalized through a series of measurable quantities), then the network can be compressed to a network with fewer parameters, for which a generalization error bound can be given that is tighter than other proposed bounds, although still vacuous. They found that their bound correlates with the true error as it decreases during training. However, this bound has the disadvantage that it applies to the compressed network only, and not to the original network.
Recently, banerjee2020randomized have developed novel deterministic bounds based on a de-randomization of a PAC-Bayes bound. Their bound is also based on the flatness of the minimum found after training (measured by Hessian eigenvalues), and also takes into account the distance moved in parameter space. They provided evidence that their bound correlates well with the test error when varying the training set size and label corruption. However, they did not study the tightness of their bound (which depends on certain smoothness constants of a linearized version of the nonlinear DNN).

D.1 ✓ neyshabur2017exploring; banerjee2020randomized provided some evidence that their PAC-Bayesian bounds correlate with the true error when varying the data complexity. The authors are not aware of similar results for the other measures.

D.2 ✓ neyshabur2017exploring; banerjee2020randomized; dziugaite2020search have shown evidence that different sensitivity-based PAC-Bayesian bounds correlate with the true error when varying the training set size. However, a more quantitative comparison could be done, over more datasets and different architectures.

D.3 ⚫ Both the worst-case and expected sharpness measures appear to correlate well with the true error when varying depth. However, only the worst-case sharpness appears to correlate with the error when varying width (neyshabur2017exploring; jiang2019fantastic). Furthermore, dziugaite2020search show that although the average correlation with depth and width is good for some PAC-Bayes and sharpness measures, they are not robust, and all of them fail for a significant number of experiments. The bound in arora2018stronger depends on quantities whose dependence on the architecture is hard to predict; however, the bound's explicit dependence on depth suggests that it may grow linearly with depth, unlike the empirical observations. Recently, maddox2020rethinking showed that a measure of flatness known as effective dimensionality correlates better with the error than PAC-Bayes measures when varying width and depth, suggesting that it may be a better measure than PAC-Bayes-based flatness measures for understanding generalization.

D.4 ✓ The worst-case sharpness appears to correlate well with the true error when varying several algorithm hyperparameters, while other sharpness measures correlate somewhat worse (jiang2019fantastic). In dziugaite2020search, it was shown that some flatness measures indeed correlate well with the error over most experiments.

D.5 ⚫ Although the sharpness bounds in neyshabur2017exploring; jiang2019fantastic are likely vacuous, dziugaite2017computing showed that by optimizing the PAC-Bayesian posterior over a large family of Gaussians, non-vacuous bounds could be obtained.

D.6 ⚫ Some sharpness bounds studied in jiang2019fantastic are efficiently computable. However, the more advanced ones, such as those in dziugaite2017computing, require significant computational expense.

D.7 ⚫ The bounds in neyshabur2017exploring; dziugaite2017computing; arora2018stronger are based on rigorous theorems. However, they only apply either to the expected error under random perturbations of the weights, or to the compressed network. The deterministic PAC-Bayes bounds in nagarajan2018deterministic; banerjee2020randomized apply to the deterministic error of the original network, but may be vacuous or not very tight. The worst-case sharpness measure which appears to correlate best with the generalization error (jiang2019fantastic) lacks a rigorous theorem that explains this correlation.
4.2.2 Other PAC-Bayes bounds
There are many recent works applying PAC-Bayesian ideas to obtain generalization error bounds in novel ways. zhou2018non proved non-vacuous generalization error bounds on compressed networks trained on large datasets. However, their bounds are still very loose, and their correlation with the true error has not been studied yet. dziugaite2018data extended the PAC-Bayesian analysis to include data-dependent priors under the assumption that they are close to differentially private priors. Their bounds are non-vacuous and apply to the expected value of the generalization error after training with SGLD (welling2011bayesian), but they are very computationally expensive, and have only been tested on a small synthetic dataset.
4.3 Other algorithm-dependent bounds
We now consider other types of algorithm-dependent bounds, which are not based on non-uniform convergence. The main class of such bounds are stability-based bounds, under which we include compression and algorithmic stability bounds. These include both data-independent and data-dependent bounds. The data-independent bounds suffer from many of the same pitfalls as VC dimension bounds, as DNNs show generalization for some datasets but not others, while the data-dependent bounds are more promising.
4.3.1 Stability-based bounds
Stability-based bounds offer an alternative way to obtain algorithm-dependent bounds, different from the non-uniform convergence SRM-like bounds. In fact, they even allow one to obtain data-independent algorithm-dependent bounds. Stability analyses show that if the output of a learning algorithm depends only weakly on the training set, then it can be shown to generalize. One approach of this kind was developed by littlestone1986relating, who derived compression bounds. They obtain data-independent bounds on the generalization error for learning algorithms whose output can be computed via a fixed function of only $k$ out of the $n$ training examples (which $k$ examples they are can depend on the training sample). For the realizable case, the bound is
(17) $\epsilon(h) = O\!\left(\dfrac{k \log(n/\delta)}{n}\right)$
See shalev2014understanding for the formal statement and proof. These bounds are based on the general concept of 'stability' described above: if the output of the algorithm only depends on a small subset of the training examples, then the output is insensitive to changes in most of the training examples. A compression bound has recently been developed for two-layer neural networks trained with SGD on linearly separable data, based on a proof that in this case SGD converges in a bounded number of non-zero weight updates, which therefore gives a bound on $k$ (brutzkus2018sgd).
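The scaling of the compression bound is easy to sanity-check numerically. The sketch below evaluates the schematic $O(k\log(n/\delta)/n)$ form only, ignoring constants (which depend on the exact theorem used); the value of $k$ is a hypothetical illustration of the brutzkus2018sgd scenario, where the number of non-zero SGD updates is bounded independently of $n$:

```python
import math

def compression_bound(n, k, delta=0.05):
    """Schematic realizable compression bound, of order k*log(n/delta)/n.
    k = number of training examples the output hypothesis depends on,
    n = training set size.  Constants are omitted (O(.) form only)."""
    assert n > k, "bound is only meaningful when n >> k"
    return k * math.log(n / delta) / n

# brutzkus2018sgd-style scenario: k is bounded independently of n,
# so the bound shrinks (roughly like log(n)/n) as the training set grows.
for n in (1_000, 10_000, 100_000):
    print(n, compression_bound(n, k=50))
```

The point of the sketch is the qualitative behaviour: the bound decreases as $n$ grows with $k$ fixed, and grows as the compression size $k$ grows.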
The most common notion of stability, called algorithmic stability, was related to generalization by bousquet2002stability, and considers how sensitive the output of the learning algorithm is to the removal of a single example from the training sample. Most work on algorithmic stability has focused on the data-independent notion of uniform stability, which has been used to obtain data-independent bounds for SGD (pmlrv48hardt16; mou2018generalization).
Because compression bounds and uniform stability are both data-independent, they cannot capture the crucial data-dependence of generalization in deep learning which was pointed out, for example, by zhang2016understanding. To this end, some recent extensions have looked at data-dependent notions of stability. kuzborskij2017data applied this idea to obtain generalization error bounds for SGD-trained models. They obtain bounds of the form
(18) $\mathbb{E}\big[R(w_T) - \hat{R}(w_T)\big] \lesssim \dfrac{c\,\sigma}{n}\sqrt{T\,\big(R(w_1) - R^*\big)}$
where the expectation is also taken over the randomness in the algorithm (as SGD is a stochastic algorithm), $c$ is a constant related to the step size of SGD, $w_1$ is the initial weight vector of the neural network, $R^*$ is the minimum risk achievable by the hypothesis class of the algorithm, $T$ is the training time, and $\sigma$ is a bound on the variance of the SGD gradients.
The bound in eq. 18 has several limitations, which we discuss now. First of all, it is a bound on the expected value of the generalization error, which does not immediately imply a bound that holds both with high probability and with logarithmic dependence on the confidence parameter $\delta$, as the other bounds studied here do (shalev2010learnability; feldman2019high). (Note that the Markov inequality implies a high-probability bound, but one with a polynomial dependence on the confidence parameter, which is expected to be far from optimal.) The bound applies to smooth convex losses, but the authors also provide a more complex bound for smooth non-convex losses. The smoothness is an important limitation, as most neural networks in practice use ReLU activations, making the loss surface non-smooth. However, in their experiments they use a CNN with max pooling, which gives a non-smooth loss surface, and their bounds still work well empirically. The bound also requires an estimate of $R(w_1)$. They estimate this with a validation set, which makes the bound dependent on more than the training set $S$ alone. However, in practice the initialization is usually random and independent of the training data, which means this quantity could be estimated from the empirical loss on $S$. The bound cannot be directly applied to the classification error, which is not Lipschitz, but it can be applied to the cross-entropy loss, which in turn implies an upper bound on the classification error. Perhaps the most serious limitation of this stability bound is that it assumes a single pass over the data, which is not the usual case in practice, as training and generalization error often decrease by training over several passes of the data. Some recent works have extended the data-dependent stability analysis of SGD-trained neural networks in different directions. london2017pac; li2019generalization combined stability bounds with the PAC-Bayesian approach discussed above. zhou2019understanding proved data-dependent stability bounds that apply to SGD with multiple passes over the data. However, their bound increases with training time (although logarithmically, rather than polynomially as in pmlrv48hardt16; mou2018generalization), contradicting the empirical result that generalization error appears to plateau with training time (hoffer2017train). However, their results have not yet been empirically tested, so it is hard to evaluate these bounds on the desiderata.

D.1 ✓ The data-dependent stability bound in zhou2019understanding correlated with the true error when varying the amount of label corruption on three different datasets. li2019generalization also showed that their Bayes-stability analysis gives bounds that are larger for randomly labelled CIFAR10 than for uncorrupted CIFAR10.

D.2 ⚫ The explicit dependence of many stability bounds on the training set size $n$ is given by the classical $O(1/n)$ or $O(1/\sqrt{n})$ rates. However, data-dependent bounds contain quantities which may change with $n$ in complicated ways. Furthermore, the bounds for non-convex losses in kuzborskij2017data have a more complicated data-dependence, with an explicit power-law dependence on $n$. kuzborskij2017data show a good correlation between their bound and the empirical generalization gap. However, this is only done for one-pass (online) SGD. To the best of our knowledge, no study has compared the dependence of the other bounds with the true error for the usual case where SGD is run over multiple passes to reach low training error.

D.3 ⚫ The data-dependent stability bounds depend on empirical quantities of the loss surface and the behaviour of SGD, both of which can in principle be affected by the choice of architecture. However, as far as the authors know, this dependence has not been explored.

D.4 ✗ Most stability bounds grow with training time. However, empirically the opposite correlation is found, with longer training times leading to better generalization (hoffer2017train; jiang2019fantastic). charles2018stability showed situations in which gradient descent (GD) is not uniformly stable but SGD is. However, whether SGD really generalizes better than GD is still a controversial topic (hoffer2017train).

D.5 ⚫ Many stability bounds (pmlrv48hardt16; mou2018generalization; zhou2019understanding) grow with training time, and thus become vacuous for sufficiently long training. kuzborskij2017data have shown remarkably tight expected generalization gap bounds, which however only apply to one-pass SGD. These results are very promising, but further empirical analysis, and work on tight bounds for multi-pass SGD, is still needed.

D.6 ✓ Data-independent stability bounds are typically easy to compute. Data-dependent ones like the one in kuzborskij2017data are harder (depending on empirical quantities like the Hessian and gradient sizes), but still applicable to reasonably sized problems.

D.7 ⚫ Proposed bounds are based on rigorous theorems. However, these often rest on assumptions which do not hold in common practice (e.g. a single pass over the data, smoothness of the loss function, linear separability).
4.4 Other bounds and generalization measures
jiang2018predicting have recently empirically studied a measure of generalization based on features of the distribution of margins at different hidden layers. Their measure shows significantly better correlation with the error, as data complexity and architecture are varied, than the margin-based measures in bartlett2017spectrally. However, they provide limited results regarding the predictive power when changing individual features of the data or architecture, and instead provide an aggregate correlation score when they are all changed simultaneously. The success of the measure explored in jiang2018predicting may be related to the correlation observed in valle2018deep; mingard2020sgd between the prior probability of functions for Bayesian DNNs and the critical sample ratio (CSR), which was proposed as a complexity measure in arpit2017closer. The critical sample ratio is an aggregate measure of the distances between input points and the decision boundary, like the distribution of input margins in jiang2018predicting. Furthermore, the correlation between the prior probability and generalization is established in valle2018deep; mingard2020sgd, as well as by our PAC-Bayes bound in section 5, which helps explain why CSR may correlate with generalization. wei2019data; wei2019improved offered theoretical bounds based on a similar idea to jiang2018predicting: they considered extending the notion of margins to all the layers of the DNN. However, no empirical evaluation was presented.
Recently, and in response to the work of nagarajan2019uniform, negrea2019defense proposed a method to apply uniform convergence to cases where it previously produced vacuous generalization gap bounds. The idea is to show that an algorithm produces a hypothesis with generalization and empirical errors close to those of some hypothesis in a class with a uniform convergence property. This work has yet to be applied to deep learning, but could offer an interesting direction.
Finally, several measures investigated by jiang2019fantastic; maddox2020rethinking; dziugaite2020search, some of which we have discussed in the sections on margin- and sensitivity-based bounds, do not yet correspond to rigorous analyses of generalization, but some predict generalization better than existing bounds, and so are promising directions for future, more rigorous analyses of generalization.
4.5 Generalization error predictions for specific data distributions
There is recent work on predicting the generalization error of DNNs by making assumptions on the data distribution $D$, rather than relying on frequentist PAC bounds, which typically make no assumption on $D$ (beyond realizability). spigler2019asymptotic; bordelon2020spectrum study kernel regression with misspecified priors, based on sollich2002gaussian (a similar analysis from a physics-inspired perspective has been presented in DBLP:journals/corr/abs190605301). bordelon2020spectrum further apply this idea to the neural tangent kernel (NTK) of a fully connected network, which approximates the behaviour of SGD-trained DNNs in the infinite-width and infinitesimal-learning-rate limit (jacot2018neural), and which seems to work well for finite-width but wide DNNs (lee2019wide). They apply this to MNIST by estimating the NTK eigenspectrum on a sample of MNIST, and then training the DNN on smaller samples. Their predicted generalization error closely follows the observed error of the SGD-trained DNN. As far as the authors are aware, this is one of the most accurate predictions of the generalization error of DNNs based on well-established theory.
One of the limitations of the analysis in bordelon2020spectrum is that it relies on knowing the data distribution, and in particular the eigenvalues of the NTK, and the eigenspectrum of the target function (with respect to the eigenbasis of the NTK). These can be estimated using a sufficiently large sample of the data, but it is not discussed in bordelon2020spectrum how big the sample needs to be for the estimate to be accurate. They use a sample larger than the training set, which therefore makes this predictor fall outside the requirements of the kinds of predictions we have been considering (which only depend on the training set $S$). However, the approach offers an analytical theory of generalization which can help with interpretability and with understanding which properties of a DNN architecture lead to generalization for a particular dataset. The other limitation of the work in bordelon2020spectrum is that the analysis only applies to MSE loss, which is not commonly used for classification (though training with the two losses often results in DNNs with similar learned functions (mingard2020sgd)).
5 Marginal-likelihood PAC-Bayesian generalization error bound
In the previous section, we saw that algorithm-independent or data-independent bounds are clearly insufficient to explain the generalization performance of DNNs, because the hypothesis class of DNNs is too expressive and the generalization strongly depends on the dataset, respectively. Furthermore, the main approaches for algorithm-dependent bounds are based on non-uniform convergence, which has been shown to have fundamental limitations in its ability to predict generalization in SGD-trained DNNs for some datasets (nagarajan2019uniform). Although there are ways around this limitation (see the discussion in section 3.3.4), it suggests that looking at other approaches to obtain generalization bounds may be promising. Non-uniform stability bounds offer an interesting alternative to non-uniform convergence, but their empirical success so far is still limited.
Here we present a new deterministic realizable PAC-Bayes bound which applies to a DNN trained using Bayesian inference, with high probability over the posterior. We work in the same setup as valle2018deep and mcallester1998some. We consider binary classification, and a space of functions or hypotheses with codomain $\{0,1\}$. We consider a "prior" $P$ over the hypothesis space, and an algorithm which samples hypotheses according to the Bayesian posterior with a 0-1 likelihood. To recall, we define the generalization error as the probability of misclassification upon a new sample, $\epsilon(h) = \mathbb{P}_{x \sim D}[h(x) \neq t(x)]$, where $t$ is the target function. In section A.1, we prove the following theorem:

Theorem 5.1.
(marginal-likelihood PAC-Bayes bound)
For any distribution $P$ on any hypothesis space $\mathcal{H}$ and any realizable distribution $D$ on a space of instances, we have, for $\delta, \gamma \in (0,1]$, that with probability at least $1-\delta$ over the choice of sample $S$ of $m$ instances, that with probability at least $1-\gamma$ over the choice of $h$:

$\epsilon(h) \leq \dfrac{\ln\frac{1}{P(U)} + \ln\frac{1}{\delta} + \ln\frac{1}{\gamma}}{m}$

where $h$ is chosen according to the posterior distribution $Q(S)$, $U$ is the set of hypotheses in $\mathcal{H}$ consistent with the sample $S$, and where $P(U) = \sum_{h \in U} P(h)$.
The proof is presented in section A.1. It closely follows that of the original PAC-Bayesian theorem by McAllester, with the main technical step relying on the quantifier reversal lemma of mcallester1998some. Note that the bound is essentially the same as that of langford2001bounds, except for the fact that it holds in probability and adds an extra term dependent on the confidence parameter $\gamma$, which is usually negligible, but may be important when considering the effect of optimizer choice. The quantity $P(U)$ corresponds to the marginal likelihood, or Bayesian evidence, of the data $S$, and we will also denote it by $P(S)$ to simplify notation.
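To make the role of the marginal likelihood concrete, the following toy sketch evaluates the schematic leading form of the bound, $(\ln\frac{1}{P(U)} + \ln\frac{1}{\delta} + \ln\frac{1}{\gamma})/m$ (the exact lower-order terms of theorem 5.1 are in section A.1), for a hypothetical class of threshold functions on twelve points with a uniform prior. All names and numerical values here are our own illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(1)
X_all = np.arange(12)                    # instance space {0, ..., 11}
target = (X_all < 6).astype(int)         # target function t(x) = 1[x < 6]
thresholds = np.arange(13)               # hypotheses h_t(x) = 1[x < t]

def consistent(t, xs):
    """Is h_t consistent with the target's labels on the sample xs?"""
    return np.all((xs < t).astype(int) == (xs < 6).astype(int))

prior = np.full(13, 1 / 13)              # uniform "prior" P over hypotheses

m = 8
S = rng.choice(12, size=m, replace=True)           # training sample
U = [t for t in thresholds if consistent(t, S)]    # consistent hypotheses
P_U = sum(prior[t] for t in U)                     # marginal likelihood P(U)

delta = gamma = 0.05
bound = (np.log(1 / P_U) + np.log(1 / delta) + np.log(1 / gamma)) / m

# True generalization error of each posterior sample (posterior is
# uniform on U here, since the prior is uniform and the likelihood 0-1).
errors = [np.mean((X_all < t).astype(int) != target) for t in U]
print(P_U, float(np.mean(errors)), float(bound))
```

A larger marginal likelihood $P(U)$ (more prior mass on hypotheses consistent with $S$) gives a smaller bound, which is the mechanism behind the claim that priors biased towards functions like the target yield good generalization.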
In valle2018deep, the authors interpreted the Bayesian posterior probability as approximating the probability with which the stochastic algorithm (e.g. SGD) outputs hypothesis $h$ after training. The preceding bound relaxes this assumption, because it shows that, in some sense, the bound holds for "almost all" of the zero-error region of parameter space. More precisely, it holds with high probability over the posterior. This suggests that SGD may not need to approximate Bayesian inference as closely for this bound to be useful. Nevertheless, mingard2020sgd gave empirical results showing that, for DNNs, the distribution over functions that SGD samples from approximates the Bayesian posterior rather closely. A fully rigorous generalization error bound for DNNs would need further analysis of SGD dynamics, but we believe these theoretical and empirical results strongly suggest that the PAC-Bayes bound should be applicable to SGD-trained DNNs.
Because it applies to the Bayesian posterior only, the bound in theorem 5.1 does not apply universally over a large family of posteriors, as standard deterministic PAC-Bayes bounds do, which can be shown to sometimes give loose bounds (nagarajan2019uniform). Furthermore, as we will show in section 6.2, the bound is in a certain sense asymptotically optimal in the limit of large training set size.
We expect our bound to give significantly tighter results than previous PAC-Bayes bounds applied to DNNs because, rather than working with parameters, our bound works directly with posteriors and priors in function space. Since the parameter-function map (valle2018deep) of DNNs is many-to-one, with a lot of parameter redundancy, it is not hard to construct situations where the KL divergence between a parameter-space posterior and prior is high, but that between the induced posterior and prior in function space is low. In fact, in section A.3, we show that the following inequality holds

(19) $KL(Q_f \,\|\, P_f) \leq KL(Q_w \,\|\, P_w)$

which implies that it is always better (or at least not worse) to consider PAC-Bayes bounds in function space for parametrized models, if possible. Furthermore, in section 7, we will empirically verify that our bound gives good predictions for SGD-trained DNNs, and satisfies most of our desiderata for a generalization error bound. Thus our empirical results corroborate the expectation of better agreement above.
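This inequality (an instance of the data-processing inequality for KL divergence) can be checked numerically on a toy many-to-one parameter-function map; the map and the distributions below are arbitrary illustrative choices:

```python
import numpy as np

def kl(q, p):
    """KL divergence between discrete distributions q and p."""
    q, p = np.asarray(q, float), np.asarray(p, float)
    mask = q > 0
    return float(np.sum(q[mask] * np.log(q[mask] / p[mask])))

# Many-to-one parameter-function map: 6 "parameter settings" realize
# only 2 distinct functions (phi[w] is the function index of parameter w).
phi = np.array([0, 0, 0, 0, 1, 1])

P_w = np.array([1, 1, 1, 1, 1, 1]) / 6   # parameter-space prior
Q_w = np.array([4, 1, 1, 0, 0, 0]) / 6   # parameter-space posterior

def push_forward(dist):
    """Induced distribution over functions under the map phi."""
    out = np.zeros(2)
    for w, f in enumerate(phi):
        out[f] += dist[w]
    return out

P_f, Q_f = push_forward(P_w), push_forward(Q_w)

# KL in function space never exceeds KL in parameter space.
print(kl(Q_w, P_w), kl(Q_f, P_f))
```

Here the parameter-space KL is inflated by how the posterior distributes mass among parameters realizing the *same* function, while the function-space KL only sees the total mass per function; this is the parameter redundancy the text refers to.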
6 Optimality of data-dependent bounds
6.1 General definitions for optimality
In section 4.1.1 we saw that VC dimension bounds are provably optimal (up to a constant) among the set of data-independent, algorithm-independent uniform bounds. The question of optimality is more difficult for the other types of bounds. As will be shown below, the optimal data-dependent bound may depend on a chosen prior over data distributions, as well as on the algorithm. We will not consider notions of optimality for data-independent non-uniform, or algorithm-dependent bounds (e.g. uniform stability bounds). Instead, the notion of optimality we treat here is for data-dependent, algorithm-dependent bounds.
To explore this notion of optimality, we make the following two definitions, using the simplified notation $B(m, S, \delta)$ for the right-hand side of the inequality in eq. 2, which we refer to as the value of the bound:
Definition 6.1.
A PAC bound of the form in eq. 2 with value $B(m, S, \delta)$ is called distribution-admissible or distribution-Pareto optimal for algorithm $A$ if there does not exist another bound for algorithm $A$ with value $B'(m, S, \delta)$ such that $B'(m, S, \delta) \leq B(m, S, \delta)$ for all $m$, for all $S$, and for all $\delta$, and $B'(m, S, \delta) < B(m, S, \delta)$ for some $m$, $S$, $\delta$.
Definition 6.2.
A PAC bound of the form in eq. 2 with value $B(m, S, \delta)$ is called optimal with respect to a prior over data distributions, for algorithm $A$, for a training set size $m$ and confidence level $\delta$, if it minimizes the expected value of $B(m, S, \delta)$ (averaged over data distributions drawn from the prior, and training sets $S$ drawn from each distribution) over valid PAC bounds for algorithm $A$.
We also say that the bound is optimal with respect to a prior over data distributions, for algorithm $A$, if it is optimal in the above sense for all $m$ and $\delta$.
Any of these definitions can be extended to require optimality over a family of algorithms rather than a single one. Analogous definitions can also be made for data-dependent uniform bounds like eq. 9. For hypothesis classes where the Rademacher complexity dominates the remaining terms (in expectation over $S$, for any data distribution), so that the lower bound closely matches the upper bound, not only is eq. 9 (approximately) distribution-Pareto optimal, it is the unique distribution-Pareto optimal uniform generalization gap bound, and therefore the optimal uniform generalization gap bound for any prior (up to lower-order terms).
For non-uniform data-dependent bounds, the authors are not aware of any result showing optimality. However, as we saw in section 4.2.1.1, non-uniform bounds are typically based on choosing a "prior", and the expected value of the bound for different data distributions depends heavily on the choice of this "prior". This suggests that perhaps there is not a unique notion of optimality (i.e. the optimal bound could depend on the assumed true prior over data distributions), but it does not prove this negative result, because these are only upper bounds. We will see in the next section that for some pairs of algorithms and priors over data distributions, the marginal-likelihood PAC-Bayes bound asymptotically matches (up to a constant) the expected generalization error (which is a lower bound on the expected value of any generalization error upper bound) and is thus asymptotically optimal (up to a constant) according to our definition. This result, combined with the empirical result that the average value of the bound depends on the data distribution, also implies that the optimal PAC bound, for a fixed algorithm, depends on the prior over data distributions.
6.2 Bayesian optimality of the marginal-likelihood PAC-Bayesian bound
In this section, we present a theorem showing that, under certain conditions on the algorithm and distribution, the marginal-likelihood PAC-Bayes bound is tight. In particular, we will show that if the generalization error decreases as a power law with training set size $m$, the bound is asymptotically optimal up to a constant.
We again consider binary classification. Let $S_m$ be a training set of size $m$. Consider sampling one more instance to obtain $S_{m+1}$. We think of $S_m$ and $S_{m+1}$ as random variables with distributions determined by $D$. We also define $\rho(S_m, x)$ to be the probability that a Bayesian posterior conditioned on a training set $S_m$ and a test point $x$ predicts the incorrect label. We will denote $\rho_m = \rho(S_m, x)$, where the dependence on $x$ is left implicit. Note that averaging $\rho_m$ over $x$ gives the average generalization error of a learning algorithm that returns a function sampled from the Bayesian posterior conditioned on the training set $S_m$ (averaged both over posterior samples and the test point). We will denote this average generalization error by $\epsilon(m)$, where we omit the dependence on the algorithm and on $S_m$ for brevity. In the following we denote with angle brackets $\langle \cdot \rangle$ an average over $S_{m+1}$ (which includes an average over $S_m$). Note that $\langle \rho_m \rangle = \langle \epsilon(m) \rangle$, and this is also the same as $\epsilon(m)$ averaged over $S_m$, because $\epsilon(m)$ already includes an average over one test point sampled from $D$.
Lemma 6.3.
Assume $\langle\epsilon_m\rangle \to 0$ as $m \to \infty$. Then, assuming that for some $c > 0$, for all $m$, $\epsilon_m < 1 - c$, there exists a constant $C > 0$, such that, as $m \to \infty$,
(20)  $\langle -\log\left(1 - \epsilon_m\right)\rangle \leq C\,\langle\epsilon_m\rangle$
Furthermore, if $\operatorname{Var}(\epsilon_m) \in o(\langle\epsilon_m\rangle)$, then $C = 1$.
The proof is found in section A.2.
Theorem 6.4.
Assume $\langle\epsilon_m\rangle \sim c\, m^{-\alpha}$ as $m \to \infty$, with $0 < \alpha < 1$. Then, assuming that for some $c' > 0$, for all $m$, $\epsilon_m < 1 - c'$, there exists a constant $C > 0$, such that, as $m \to \infty$,
(21)  $\dfrac{\langle -\log P(S')\rangle}{m+1} \sim \dfrac{C}{1-\alpha}\,\langle\epsilon_m\rangle$
Furthermore, if $\operatorname{Var}(\epsilon_m) \in o(\langle\epsilon_m\rangle)$, then $C = 1$.
Proof.
We can write $\langle -\log P(S')\rangle$ using a telescoping sum over nested training sets $S_0 \subset S_1 \subset \dots \subset S_{m+1} = S'$, where $S_k$ has size $k$:
(22)  $\langle -\log P(S')\rangle = \left\langle -\sum_{k=0}^{m} \log \frac{P(S_{k+1})}{P(S_k)} \right\rangle$
(23)  $= \sum_{k=0}^{m} \langle -\log\left(1 - \epsilon_k\right)\rangle$
(24)  $\sim C \sum_{k=0}^{m} \langle\epsilon_k\rangle \sim \frac{C\,c}{1-\alpha}\,(m+1)^{1-\alpha}$
In eq. 23 we used that $P(S_{k+1})/P(S_k)$ is the Bayesian posterior probability of the correct label of the $(k+1)$-th point, which averages to $1 - \epsilon_k$; in eq. 24 we applied lemma 6.3 and the power-law assumption. Dividing by $m+1$ gives eq. 21. ∎
In theorem 6.4, $\langle\epsilon_m\rangle$ is the actual expected error averaged over training sets of size $m$. This is a lower bound for the training-set-averaged value of any upper bound on the expected error, such as the PAC-Bayes bound (mcallester1998some). It is also a lower bound on the average value of high-probability bounds like theorem 5.1. On the other hand, the PAC-Bayes bound has an expected value with a leading-order behaviour given by
(25)  $\dfrac{\langle -\log P(S)\rangle}{m} \sim \dfrac{C}{1-\alpha}\,\langle\epsilon_m\rangle$
which, up to a constant, matches the asymptotic behaviour of the lower bound given by $\langle\epsilon_m\rangle$ itself. Therefore, we can conclude that the PAC-Bayes bound of section 5 is asymptotically optimal (up to a constant, which may be computable a priori) for pairs of priors and learning algorithms that satisfy the theorem's assumptions.
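This asymptotic matching can be illustrated numerically. The sketch below assumes, as a toy model, a deterministic per-point error $\epsilon_k = c\,(k+1)^{-\alpha}$ (so the variance condition holds and $C = 1$); the function name and constants are ours, not the paper's:

```python
import math

def bound_to_error_ratio(c, alpha, m):
    """Ratio of -log P(S') / (m+1), telescoped over per-point errors
    eps_k = c * (k+1)**(-alpha), to the theorem's prediction
    eps_m / (1 - alpha). With zero variance (C = 1) the ratio should
    approach 1 as m grows."""
    eps = [c * (k + 1) ** (-alpha) for k in range(m + 1)]
    lhs = sum(-math.log(1.0 - e) for e in eps) / (m + 1)
    rhs = eps[m] / (1.0 - alpha)
    return lhs / rhs

print(bound_to_error_ratio(0.5, 0.5, 10**5))  # close to 1
```

The small residual deviation comes from the higher-order terms of $-\log(1-\epsilon)$ and from approximating the sum by an integral, both of which vanish relative to the leading term as $m$ grows.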
The main assumption of the theorem is that $\langle\epsilon_m\rangle$ follows a power-law behaviour asymptotically in $m$. As we mentioned in section 2, this has been empirically found to be the case for sufficiently expressive deep learning models. spigler2019asymptotic proved that the learning curve follows a power law for stationary Gaussian processes in a misspecified teacher-student scenario, with input instances distributed on a lattice; they also computed the power-law exponents analytically. More recently, bordelon2020spectrum developed a theory of average generalization error learning curves (see section 4.5) and proposed an explanation for the power-law behaviour based on some assumptions on the data distribution.
The second assumption, which is only necessary to obtain a smaller value of the constant $C$, is that $\operatorname{Var}(\epsilon_m) \in o(\langle\epsilon_m\rangle)$. This seems plausible in many situations. For large $m$, most of the variation in $\epsilon_m$ should typically come from the choice of test point (on which $\epsilon_m$ implicitly depends) rather than the choice of $S$. We can consider two extreme cases. If $\epsilon_m$ is $1$ for some test points and $0$ for all others, then $\epsilon_m$ is a Bernoulli variable, and the variance is of the same order as the mean. At the other extreme, if $\epsilon_m = \langle\epsilon_m\rangle$ for all choices of $S$ and test point, then the variance is $0$. It seems plausible that in many situations we find something in between.
In section 7, we will show extensive empirical evidence that the PAC-Bayes bound follows a power-law behaviour with an exponent closely matching that of the empirically measured test error. We sometimes observe deviations, but these likely come from the use of the expectation-propagation (EP) approximation, which empirically appears to introduce systematic errors that themselves scale as a power law (mingard2020sgd). Furthermore, we observe a positive correlation between the exponent and the proportionality constant relating the bound and the error, as predicted by theorem 6.4.
7 Experimental results for the marginal-likelihood PAC-Bayes bound
In section 2 we presented a series of seven desiderata for a generalization theory of deep learning, and in section 4 we used this framework to compare a wide range of different bounds from the literature. In this section we perform the extensive experiments needed to test the marginal-likelihood PAC-Bayes bound against the desiderata, especially D.1–D.3 (performance when varying dataset, architecture, and training set size). This empirical work provides an example of how to test a bound in detail against the desiderata. In section 8, we will discuss the comparison of our bound against all the desiderata.
The key quantity needed to compute the bound of theorem 5.1 for DNNs is the marginal likelihood, which measures the capacity term in the bound. We follow the same approach as valle2018deep and calculate the marginal likelihood using the Gaussian process (GP) approximation to DNNs (also called the neural network GP, or NNGP), proposed in several recent works (lee2017deep; matthews2018gaussian; novak2018bayesian; garriga2018convnets; yang2019tensor), which can be used to approximate Bayesian inference in DNNs. This approximation requires computing the NNGP kernel for the inputs in the training set $S$, which in the infinite-width limit equals the covariance matrix of the outputs of the DNN with random weights, for inputs in $S$. This can be computed analytically for FCNs, but for more complex architectures this is unfeasible, so we rely on the Monte Carlo approximation used in novak2018bayesian: we approximate the kernel by an estimator of the empirical covariance matrix of the vector of outputs of the DNN, computed from random initializations of the DNN. The marginal likelihood is then approximated using the expectation-propagation (EP) approximation, as was done in previous works (valle2018deep; mingard2020sgd). See section B.1 for more details. We note that for some experiments, running the network to convergence or computing the EP approximation turned out to be too computationally expensive to achieve within our budget, which is why several architectures are absent from some of the experiments at larger training set sizes (e.g. those in section 7.3). Code for the experiments can be found at https://github.com/guillefix/nnpacbayes
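The Monte Carlo kernel estimate can be sketched as follows, with a toy one-hidden-layer ReLU network standing in for a real DNN (all names, widths, and sample counts below are illustrative, not the paper's setup):

```python
import numpy as np

def mc_kernel(X, n_init=2000, width=64, seed=0):
    """Monte Carlo estimate of the NNGP kernel: the empirical covariance
    of the network outputs on the inputs X, over n_init independent draws
    of the random weights. Toy one-hidden-layer ReLU network."""
    rng = np.random.default_rng(seed)
    m, d = X.shape
    outs = np.empty((n_init, m))
    for i in range(n_init):
        W1 = rng.normal(0.0, 1.0 / np.sqrt(d), (d, width))
        b1 = rng.normal(0.0, 1.0, width)
        W2 = rng.normal(0.0, 1.0 / np.sqrt(width), (width, 1))
        h = np.maximum(X @ W1 + b1, 0.0)   # hidden activations
        outs[i] = (h @ W2).ravel()         # scalar output per input
    return outs.T @ outs / n_init          # (m, m) empirical covariance

X = np.random.default_rng(1).normal(size=(10, 5))
K = mc_kernel(X)
```

Widening the last layer yields several output samples per initialization, which is one way the cost of this estimator can be reduced in practice.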
In the following experiments, the datasets are binarized, and all DNNs are trained using the Adam optimizer to 0 training error. The test error is measured on the full test set for the different datasets used, while the training set is sampled from the full training set provided by these datasets. For full experimental details see appendix B.
7.1 Error versus label corruption (Desideratum D.1)
Desideratum D.1 requires that the bound correlates with the error as we change the dataset complexity. We test this correlation in two ways: by directly corrupting a fraction of the labels of a standard dataset, increasing its complexity, and by comparing different standard datasets differing in complexity.
In fig. 1, we show the true test error and the PAC-Bayes bound for a CNN and an FCN, as we increase the label corruption for three different datasets (CIFAR10, MNIST, and Fashion MNIST), which have been binarized. We find that the bound is not only relatively tight, but also qualitatively follows the behaviour of the true error, increasing with complexity and preserving the order among the three datasets. In fig. 3 and fig. 4, we present more datasets, and observe that for sufficiently large training set size, the bound can typically correctly predict on which datasets the networks will generalize better.
7.2 Error versus training set size (Desideratum D.2)
Desideratum D.2 requires that the bound predicts the change in error as we increase the training set size. As mentioned previously, several works (hestness2017deep; novak2019neural; rosenfeld2019constructive; kaplan2020scaling) have found that learning curves for DNNs tend to show a power-law behaviour with an exponent that depends on the dataset, but not significantly on the architecture. However, we note that in some work on learning curves for Gaussian processes (sollich2002gaussian; spigler2019asymptotic; bordelon2020spectrum), more complex types of learning curves have also been observed (sometimes involving several regions of non-monotonicity), suggesting that DNNs may potentially show more complex learning curves, different from power laws, in some regimes. As a simple example, in hestness2017deep it was pointed out that in the low-data regime learning curves do not show the asymptotic power-law behaviour. Another example is the double-descent behaviour with respect to data size observed in nakkiran2019deep.
We empirically computed the learning curves for many combinations of architectures and datasets, as well as the corresponding PAC-Bayes bounds. In fig. 2 and fig. 3, we show the learning curves for three representative architectures on different datasets. The exponent of the learning curve clearly depends on the dataset, but less strongly on the architecture. The PAC-Bayes bound approximately matches the power-law exponent of the empirical learning curves, as predicted by theorem 6.4. In fig. 2, the bound even predicts the quick drop in generalization error for EMNIST between two successive training set sizes for the FCN. However, we found that this fine-grained agreement doesn't typically hold for other architectures, for which the bound only matches the overall power-law behaviour of the learning curve.
In fig. 4, we show the learning curves for several representative architectures on five different datasets. For each architecture, the bound and the SGD results have a very similar learning curve exponent, though there can be a different vertical offset that depends on the architecture. The relative ordering in generalization performance between different architectures is typically also predicted by the bound, especially for large training set sizes. We will explore this trend in more detail in section 7.3.
The learning curves we observe in the figures above agree with previous empirical observations of power-law behaviour in learning curves for DNNs, with only a few exceptions, where we observe a deviation from power-law behaviour. In particular, the learning curve for CIFAR10 with batch size 32 appears to deviate from a power law over this range of training set sizes. However, with batch size 256 it shows cleaner power-law behaviour (see fig. 12, fig. 13, fig. 14 in appendix E) that agrees better with the PAC-Bayes bound exponent.
In fig. 8 and fig. 9, shown in appendix D, we present the learning curves for several variants of ResNets and DenseNets, respectively. Within each family of similar architectures, the learning curves are even more similar to each other. The PAC-Bayes bound matches the behaviour of the true error rather closely over the entire range of architectures and datasets used. In particular, the power-law exponent of the PAC-Bayes bound is close to that of the true learning curves for these 14 different architectures, just as was found in fig. 3 for three representative ones, showing that our generalization error theory is robust and widely applicable.
[Fig. 5 caption, partial: The second method of estimating the exponent makes assumptions about how the bounds scale. The error bars are estimated standard errors from the linear fits; for the ratio estimate, the errors due to fluctuations in the dataset are negligible. Note that the exponents cluster according to dataset. The outliers for MNIST and KMNIST are both the FCN. The DNNs were trained using Adam with batch size 32 to 0 training error. For some datasets (e.g. CIFAR10) the power-law behaviour is less clear, but we still chose to include the estimated exponent for completeness.]
In fig. 5, we compare the power-law exponent from the empirical learning curves (calculated with Adam and batch size 32) to the exponent from the PAC-Bayes bounds. For the empirical learning curves, the exponent is estimated from a linear fit of the log of the error vs the log of the training set size. We note that the exponents cluster according to the dataset, as observed in previous work (hestness2017deep). However, we also observe some smaller, but statistically significant, variation in exponents within a dataset, indicating that the architecture may nevertheless play a role in the learning curve behaviour (though a less significant one than the dataset). One exception is the FCN, which shows a significantly different exponent than the other architectures – a deviation which is also predicted by the PAC-Bayes bound, but for which we do not yet have an explanation.
To estimate the learning curve exponent for the PAC-Bayes curves, we used two methods. In fig. 4(a), we used a linear fit to the log of the PAC-Bayes bound vs the log of the training set size, as was done for the empirical exponents. In fig. 4(b), we estimate it as $\alpha = 1 - 1/\rho$, where $\rho$ is the ratio of the PAC-Bayes bound and the error, which is obtained from the expression in theorem 6.4. This way of deriving the exponent assumes that the condition on the variance of the error holds (although one may still expect $\rho$ and the exponent to correlate even if the condition doesn't hold exactly). For both ways of estimating the exponent, there is a good correlation between the estimate and the empirical exponent. The absolute value of the estimated exponent in fig. 4(a) does show deviation from the true value, which is probably due to systematic errors in the EP approximation used to compute the marginal likelihood (this was discussed and empirically investigated in mingard2020sgd). It should be kept in mind that power-law exponents can be sensitive to the protocol used to measure them (clauset2009power; stumpf2012critical), so the exact values we find in fig. 5 may not be as meaningful as the correlation between the empirical values and those from the bound.
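The two estimators can be sketched on a synthetic power-law learning curve (all names and values below are illustrative, not the paper's data; the ratio method is written assuming the variance condition holds, i.e. $C = 1$):

```python
import numpy as np

# Synthetic learning curve err(m) = c * m**(-alpha); recover alpha two ways.
c_true, alpha_true = 2.0, 0.4
ms = np.array([1000, 2000, 5000, 10000, 20000], dtype=float)
err = c_true * ms ** (-alpha_true)

# (a) linear fit of log(error) vs log(m); the slope is -alpha
slope, _ = np.polyfit(np.log(ms), np.log(err), 1)
alpha_fit = -slope

# (b) ratio method: if bound/error -> rho = 1/(1 - alpha), then alpha = 1 - 1/rho
bound = err / (1.0 - alpha_true)   # idealized bound per theorem 6.4 with C = 1
rho = bound[-1] / err[-1]
alpha_ratio = 1.0 - 1.0 / rho

print(alpha_fit, alpha_ratio)
```

On exact power-law data both estimators recover the true exponent; on real curves they differ because the fit averages over the whole range of $m$ while the ratio uses the asymptotic relation at a single $m$.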
7.3 Error versus architecture (Desideratum D.3)
Desideratum D.3 requires that the bound correlates with the error when changing the architecture. We explore this in two ways: by varying certain common architecture hyperparameters (pooling type and depth), and by comparing several state-of-the-art (SOTA) architectures to each other. In fig. 5(a), we vary the pooling type, and find that the bound correctly predicts that the error is higher for max pooling than for average pooling, and that both are lower than with no pooling, on this particular dataset. In fig. 5(b), we vary the number of hidden layers of a CNN trained on MNIST, and find that the bound closely tracks the change in generalization error with the number of layers.
To explore more complex changes to the architecture, we plot in fig. 6(a) the bound and error against each other for five datasets, for a set of state-of-the-art architectures, including several ResNet and DenseNet variants (see section B.3 for architecture details), at a fixed training set size of 15K. The results display a clear correlation, showing that our PAC-Bayes bound can help explain why some architectures generalize better than others.
Nevertheless, the empirical differences between architectures are rather small. For that reason, we can’t disentangle whether the deviations from the bound predictions are due to deviations from the bound assumptions (for example SGD not behaving as a Bayesian sampler), or from the different approximations used in computing the bound (for example, the EP approximation used in computing the marginal likelihood, see section B.1).
8 Evaluating the marginallikelihood bound against the seven desiderata
We now evaluate the marginal-likelihood PAC-Bayes bound against the seven desiderata, in the same manner as we did for the other families of bounds that we studied.

D.1 ✓ In section 7.1 and section 7.2, we saw that the bound correctly predicts the relative performance between different datasets for all the architectures we tried, at least for sufficiently large training sets. Thus the bound captures trends with data complexity.

D.2 ✓ In section 7.2 we showed that the bound correctly predicts the overall behaviour of the learning curve for all the architectures and datasets we studied. In particular, it captures the relative ordering of the power-law exponents for different dataset/architecture pairs (fig. 5). Thus the bound captures trends with training set size.

D.3 ✓ In section 7.3 we showed that the bound correlates well with the generalization error when varying the architecture, for all the datasets. We still don't know whether the deviations are fundamental to our approach or due to the various approximations used in computing the bound. We do know that the NNGP approximation can't capture the dependence of generalization on layer width. However, theorem 5.1 may still capture these effects if the marginal likelihood could be computed for finite-width DNNs, perhaps using finite-width corrections to NNGPs (antognini2019finite; yaida2019non). Overall, the bound does a good job of tracking the effect of architecture changes.

D.4 ✗ Our approach currently cannot capture effects on generalization caused by different DNN optimization algorithms, because the marginal likelihood, as we calculate it, assumes a Bayesian posterior and does not depend on the fine details of the SGD-based optimization algorithms normally employed for DNNs. In this context it is useful to consider the distinction made in (mingard2020sgd) between questions of type 1), which ask why DNNs generalize at all in the overparameterised regime, and questions of type 2), which ask how to further fine-tune generalisation, when the algorithm already performs well, by for example changing optimiser hyperparameters such as batch size or learning rate. Effects of type 2), which are sensitive to the optimiser, are hard to capture with the current version of our bound. We are effectively relying on the observation that, to first order, different SGD variants seem to perform similarly, and all appear to approximate Bayesian inference (mingard2020sgd). Capturing the second-order effects, e.g. deviations from approximate Bayesian inference due to optimiser hyperparameter tuning, would require an extension to our approach. We note that one of the main hypotheses used to explain differences in generalization among different optimization algorithms is that some algorithms are more biased towards flat solutions in parameter space than others (hochreiter1997flat; keskar2016large; jastrzebski2018finding; wu2017towards; zhang2018energy; wei2019noise). Therefore, one possibility could be to combine our PAC-Bayes approach (based on probabilities of functions) with the more standard approaches based on flatness, to capture this effect.

D.5 ✓ The results in section 7 clearly show that the bound is non-vacuous. In fact, the logarithm on the left-hand side of theorem 5.1 ensures the bound is less than one. (Note that PAC-Bayes bounds of the form derived in eq. 13 are non-vacuous when the KL divergence on the left-hand side is directly inverted, rather than bounded using Pinsker's inequality; in the realizable case, the KL divergence reduces to a logarithmic form that can be easily inverted.) More importantly, we find that the bound can be relatively tight.
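The realizable-case inversion is simple enough to state explicitly. A minimal sketch (the function name is ours): with zero empirical error, $\mathrm{kl}(0\,\|\,\epsilon) = -\log(1-\epsilon) \leq B$ inverts exactly to $\epsilon \leq 1 - e^{-B}$, which is always below one, whereas Pinsker's inequality only gives $\epsilon \leq \sqrt{B/2}$, which becomes vacuous once $B > 2$:

```python
import math

def invert_realizable_kl(B):
    """Exact inversion of kl(0 || eps) = -log(1 - eps) <= B,
    giving eps <= 1 - exp(-B), which is always < 1."""
    return 1.0 - math.exp(-B)

for B in (0.1, 1.0, 5.0):
    print(B, invert_realizable_kl(B), math.sqrt(B / 2))  # exact vs Pinsker
```
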

D.6 ⚫ Our bound lies on the higher end of computation cost among the bounds we compare here. The most expensive steps are computing the kernel via Monte Carlo sampling and computing the marginal likelihood. The former has a complexity proportional to the number of Monte Carlo samples times the cost of running the model, but this number can be reduced by making the last layer wider, and the computation can be heavily parallelized. On the other hand, computing the marginal likelihood using the NNGP approach has a complexity of $O(m^3)$, because it requires inverting an $m \times m$ matrix, and this may only be improved by making assumptions on the kernel matrix (such as it being low rank). Therefore, the computational complexity of the bound scales similarly to inference in Gaussian processes. This means we could compute the bound for the training set sizes used here, but larger training set sizes would be difficult. In concurrent work, park2020towards showed that for small enough training set sizes, computations based on the NNGP are competitive relative to training the corresponding DNN.
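The $O(m^3)$ cost comes from factorizing the $m \times m$ kernel matrix. A minimal sketch of this step (using a Gaussian likelihood as a stand-in for the EP-approximated Bernoulli likelihood the paper actually uses; the function is ours):

```python
import numpy as np

def gp_log_marginal_likelihood(K, y, noise=1e-6):
    """Log marginal likelihood of a GP with kernel matrix K at targets y,
    under a Gaussian likelihood. The Cholesky factorization below is the
    O(m^3) step that dominates the cost; EP for classification has the
    same cubic scaling."""
    m = len(y)
    L = np.linalg.cholesky(K + noise * np.eye(m))          # O(m^3)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))    # (K + noise I)^-1 y
    return (-0.5 * y @ alpha
            - np.log(np.diag(L)).sum()                     # -0.5 * log det
            - 0.5 * m * np.log(2 * np.pi))
```

Using a Cholesky factorization rather than an explicit inverse is the standard design choice here: it is numerically stabler and gives the log-determinant for free from the factor's diagonal.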

D.7 ⚫ Theorem 5.1 is fully rigorous for DNNs trained with exact Bayesian inference. In section 5 we argue that the theorem is probably applicable to DNNs trained with SGD (and several of its variants), based on empirical evidence and arguments from valle2018deep and mingard2020sgd, as well as the new evidence in the current paper. Furthermore, unlike the bound used in valle2018deep, theorem 5.1 applies with high probability over the posterior (with only logarithmic dependence on the confidence parameter). This property could aid the analysis of algorithms that sample parameter space with a distribution sufficiently similar to the Bayesian posterior. However, further work is needed to justify the application of the bound to other optimizers. In addition, the NNGP and EP approximations which we used to evaluate the bound may introduce errors for which we don't yet have rigorous guarantees.
9 Discussion
In this paper we provide a general framework for comparing generalization bounds for deep learning, which complements two other recent large-scale studies (jiang2019fantastic; dziugaite2020search). Our framework has two parts. Firstly, we introduce seven desiderata in section 2 against which generalization theories (including the bounds we focus on here) should be compared. Secondly, we classify, in