A PAC-Bayesian Analysis of Distance-Based Classifiers: Why Nearest-Neighbour works!

09/28/2021
by Thore Graepel, et al.

Abstract: We present PAC-Bayesian bounds for the generalisation error of the K-nearest-neighbour (K-NN) classifier. This is achieved by casting the K-NN classifier into a kernel space framework in the limit of vanishing kernel bandwidth. We establish a relation between prior measures over the coefficients in the kernel expansion and the induced measure on the weight vectors in kernel space. Defining a sparse prior over the coefficients allows the application of a PAC-Bayesian folk theorem, which leads to a generalisation bound that is a function of the number of redundant training examples: those that can be left out without changing the solution. The bound requires quantifying a prior belief in the sparseness of the solution and is evaluated after learning, when the actual redundancy level is known. Even for small sample sizes (m ≈ 100), the bound gives non-trivial results when both the expected sparseness and the actual redundancy are high.
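The vanishing-bandwidth idea in the abstract can be illustrated numerically. The sketch below is not the paper's construction, which the abstract does not spell out; it assumes K = 1, labels in {-1, +1}, a Gaussian kernel, and a kernel expansion whose coefficients are simply the training labels. The function names and the synthetic data are hypothetical. It checks how often the kernel vote agrees with the 1-NN prediction as the bandwidth `sigma` shrinks.

```python
import numpy as np


def squared_distances(X_test, X_train):
    # Pairwise squared Euclidean distances, shape (n_test, n_train).
    diff = X_test[:, None, :] - X_train[None, :, :]
    return (diff ** 2).sum(axis=2)


def one_nn_predict(X_train, y_train, X_test):
    # Label of the single closest training point (K = 1).
    d2 = squared_distances(X_test, X_train)
    return y_train[d2.argmin(axis=1)]


def kernel_vote_predict(X_train, y_train, X_test, sigma):
    # sign(sum_i y_i * exp(-||x - x_i||^2 / (2 sigma^2))).
    # Assumption: coefficients equal the labels; the paper places a
    # prior over the coefficients instead.
    d2 = squared_distances(X_test, X_train)
    # Subtracting each row's minimum rescales that row by a positive
    # factor, so the sign is unchanged while underflow is avoided.
    shifted = d2 - d2.min(axis=1, keepdims=True)
    weights = np.exp(-shifted / (2.0 * sigma ** 2))
    score = weights @ y_train
    return np.where(score >= 0, 1, -1)


# Hypothetical synthetic data with m = 100 training points.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(100, 2))
noise = 0.3 * rng.normal(size=100)
y_train = np.where(X_train[:, 0] + noise > 0, 1, -1)
X_test = rng.normal(size=(200, 2))

nn_labels = one_nn_predict(X_train, y_train, X_test)
agreement = {}
for sigma in (1.0, 0.1, 1e-3):
    kernel_labels = kernel_vote_predict(X_train, y_train, X_test, sigma)
    agreement[sigma] = float(np.mean(kernel_labels == nn_labels))

for sigma, frac in agreement.items():
    print(f"sigma={sigma:g}: agreement with 1-NN = {frac:.3f}")
```

As `sigma` tends to zero, the weight of the nearest training point dominates every other term in the sum, so the sign of the kernel vote is determined by the nearest neighbour's label.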
