Chromatic PAC-Bayes Bounds for Non-IID Data: Applications to Ranking and Stationary β-Mixing Processes

09/10/2009
by Liva Ralaivola, et al.

PAC-Bayes bounds are among the most accurate generalization bounds for classifiers learned from independently and identically distributed (IID) data, and this is particularly so for margin classifiers: recent contributions have shown how practical these bounds can be, either to perform model selection (Ambroladze et al., 2007) or even to directly guide the learning of linear classifiers (Germain et al., 2009). However, there are many practical situations where the training data exhibit dependencies and the traditional IID assumption does not hold. Stating generalization bounds for such frameworks is therefore of the utmost interest, from both theoretical and practical standpoints. In this work, we propose the first (to the best of our knowledge) PAC-Bayes generalization bounds for classifiers trained on data exhibiting interdependencies. Our approach relies on decomposing a so-called dependency graph, which encodes the dependencies within the data, into sets of independent data by means of graph fractional covers. Our bounds are very general: an upper bound on the fractional chromatic number of the dependency graph is sufficient to obtain new PAC-Bayes bounds for specific settings. We show how our results can be used to derive bounds for ranking statistics (such as AUC) and for classifiers trained on data distributed according to a stationary β-mixing process. Along the way, we show how our approach seamlessly allows us to deal with U-processes. As a side note, we also provide a PAC-Bayes generalization bound for classifiers learned on data from stationary φ-mixing distributions.
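
To make the decomposition idea concrete, here is a minimal Python sketch for the bipartite ranking (AUC) setting, under assumptions that are ours rather than the paper's: the dependent "data" are all (positive, negative) index pairs, and two pairs are taken to be dependent whenever they share an example. The function names `color_pairs` and `is_independent` are hypothetical. The sketch builds a proper coloring, which is a special case of a fractional cover with all weights equal to 1, so the number of color classes upper-bounds the fractional chromatic number of this dependency graph.

```python
from itertools import product


def color_pairs(n_pos, n_neg):
    """Partition all (positive, negative) index pairs into color classes.

    Assumption (illustrative): two pairs are dependent iff they share a
    positive or a negative example. The coloring (i + j) mod m, with
    m = max(n_pos, n_neg), never puts two such pairs in the same class.
    """
    m = max(n_pos, n_neg)
    classes = [[] for _ in range(m)]
    for i, j in product(range(n_pos), range(n_neg)):
        classes[(i + j) % m].append((i, j))
    return classes


def is_independent(pairs):
    """Check that no two pairs in the class share an example."""
    pos = [i for i, _ in pairs]
    neg = [j for _, j in pairs]
    return len(set(pos)) == len(pos) and len(set(neg)) == len(neg)


# Example: 3 positive and 5 negative examples give 15 pairs in 5 classes.
classes = color_pairs(3, 5)
for k, cls in enumerate(classes):
    print(k, cls, is_independent(cls))
```

Within each class the pairs involve distinct examples, so the pairs in a class can be treated as independent when the underlying examples are; a bound for dependent data can then be assembled from per-class arguments, at a cost governed by the number (or total weight) of classes.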


