Information-Theoretic Bayes Risk Lower Bounds for Realizable Models

11/08/2021
by Matthew Nokleby, et al.

We derive information-theoretic lower bounds on the Bayes risk and generalization error of realizable machine learning models. In particular, we employ an analysis in which the rate-distortion function of the model parameters bounds the mutual information between the training samples and the model parameters that is required to learn a model up to a Bayes risk constraint. For realizable models, we show that both the rate-distortion function and the mutual information admit expressions that are convenient for analysis. For models that are (roughly) lower Lipschitz in their parameters, we bound the rate-distortion function from below, whereas for VC classes the mutual information is bounded above by d_vc log(n). When these conditions match, the Bayes risk with respect to the zero-one loss decays no faster than Ω(d_vc/n), which matches known outer bounds and minimax lower bounds up to logarithmic factors. We also consider the impact of label noise, providing lower bounds when training and/or test samples are corrupted.
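The bounding chain described in the abstract can be sketched schematically as follows (a hedged sketch with constants and regularity conditions suppressed; here W denotes the model parameters, Z^n the n training samples, ε the target Bayes risk, and R(ε) the rate-distortion function of W at distortion level ε):

```latex
\begin{align}
R(\epsilon) &\;\le\; I(W; Z^n)
  && \text{(learning $W$ to Bayes risk $\epsilon$ requires at least $R(\epsilon)$ bits)} \\
I(W; Z^n) &\;\le\; d_{\mathrm{vc}} \log n
  && \text{(upper bound on the mutual information for VC classes)} \\
\intertext{Combining the two and inverting the lower bound on $R(\epsilon)$
obtained from the lower-Lipschitz condition yields, up to logarithmic factors,}
\epsilon &\;=\; \Omega\!\left(\frac{d_{\mathrm{vc}}}{n}\right).
\end{align}
```

The first inequality is the rate-distortion step; the second is the VC-class step. The final rate follows only when the two bounds match, as the abstract notes.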


research
05/10/2021

Rate-Distortion Analysis of Minimum Excess Risk in Bayesian Learning

Minimum Excess Risk (MER) in Bayesian learning is defined as the differe...
research
05/08/2016

Rate-Distortion Bounds on Bayes Risk in Supervised Learning

We present an information-theoretic framework for bounding the number of...
research
02/08/2021

Mutual Information of Neural Network Initialisations: Mean Field Approximations

The ability to train randomly initialised deep neural networks is known ...
research
10/28/2017

Lower Bounds for Two-Sample Structural Change Detection in Ising and Gaussian Models

The change detection problem is to determine if the Markov network struc...
research
07/08/2016

Lower Bounds on Active Learning for Graphical Model Selection

We consider the problem of estimating the underlying graph associated wi...
research
05/15/2023

Chain rules for one-shot entropic quantities via operational methods

We introduce a new operational technique for deriving chain rules for ge...
research
12/12/2019

General Information Bottleneck Objectives and their Applications to Machine Learning

We view the Information Bottleneck Principle (IBP: Tishby et al., 1999; ...
