Theoretical characterization of uncertainty in high-dimensional linear classification

02/07/2022
by Lucas Clarté, et al.

Being able to reliably assess not only the accuracy but also the uncertainty of a model's predictions is an important endeavour in modern machine learning. Even if the model generating the data and labels is known, computing the intrinsic uncertainty after learning the model from a limited number of samples amounts to sampling the corresponding posterior probability measure. Such sampling is computationally challenging in high-dimensional problems, and theoretical results on heuristic uncertainty estimators in high dimensions are thus scarce. In this manuscript, we characterise uncertainty for learning from a limited number of samples of high-dimensional Gaussian input data with labels generated by the probit model. We prove that the Bayesian uncertainty (i.e. the posterior marginals) can be asymptotically obtained by the approximate message passing algorithm, bypassing the canonical but costly Monte Carlo sampling of the posterior. We then provide a closed-form formula for the joint statistics between the logistic classifier, the uncertainty of the statistically optimal Bayesian classifier, and the ground-truth probit uncertainty. The formula allows us to investigate the calibration of a logistic classifier learning from a limited number of samples. We discuss how over-confidence can be mitigated by appropriate regularisation, and show that cross-validating with respect to the loss leads to better calibration than cross-validating with respect to the 0/1 error.
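The setting in the abstract can be sketched numerically. The snippet below is a minimal illustration, not the paper's method: the dimensions, regularisation strength, and plain gradient-descent fit are all choices made here for concreteness. It draws Gaussian inputs with probit labels, fits an L2-regularised logistic classifier, and compares the classifier's mean predicted confidence with its accuracy on fresh data from the same model.

```python
import numpy as np
from math import erf

rng = np.random.default_rng(0)
n, d = 200, 50        # illustrative sample size and dimension (not from the paper)
lam = 1e-2            # assumed L2 regularisation strength

# Probit data model: Gaussian inputs x, labels y in {-1, +1} with
# P(y = +1 | x) = Phi(x . w*), where Phi is the standard normal CDF.
phi = np.vectorize(lambda t: 0.5 * (1.0 + erf(t / np.sqrt(2.0))))
w_star = rng.standard_normal(d) / np.sqrt(d)
X = rng.standard_normal((n, d))
y = np.where(rng.random(n) < phi(X @ w_star), 1.0, -1.0)

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

# Ridge-regularised logistic regression, fitted by plain gradient descent.
w = np.zeros(d)
lr = 0.5
for _ in range(2000):
    margins = y * (X @ w)
    grad = -(X * (y * sigmoid(-margins))[:, None]).mean(axis=0) + lam * w
    w -= lr * grad

# Crude calibration check: mean predicted confidence vs empirical accuracy
# on fresh probit data drawn from the same model.
X_test = rng.standard_normal((n, d))
y_test = np.where(rng.random(n) < phi(X_test @ w_star), 1.0, -1.0)
p = sigmoid(X_test @ w)
confidence = np.maximum(p, 1.0 - p)
accuracy = (np.sign(p - 0.5) == y_test).mean()
print(f"mean confidence {confidence.mean():.3f} vs accuracy {accuracy:.3f}")
```

If mean confidence exceeds accuracy, the classifier is over-confident; the paper's closed-form analysis characterises exactly this gap in the high-dimensional limit, and its finding that regularisation mitigates over-confidence can be probed here by varying `lam`.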


Related research

- Accelerating Monte Carlo Bayesian Inference via Approximating Predictive Uncertainty over Simplex (05/29/2019)
- Randomised maximum likelihood based posterior sampling (01/10/2021)
- DBCal: Density Based Calibration of classifier predictions for uncertainty quantification (04/01/2022)
- Generalization error in high-dimensional perceptrons: Approaching Bayes error with convex optimization (06/11/2020)
- Calibration of a bumble bee foraging model using Approximate Bayesian Computation (04/07/2022)
- On Focal Loss for Class-Posterior Probability Estimation: A Theoretical Perspective (11/18/2020)
- Uniform versus uncertainty sampling: When being active is less efficient than staying passive (12/01/2022)
