Comparing the quality of neural network uncertainty estimates for classification problems

by Daniel Ries, et al.

Traditional deep learning (DL) models are powerful classifiers, but many approaches do not provide uncertainties for their estimates. Uncertainty quantification (UQ) methods for DL models have received increased attention in the literature due to their usefulness in decision making, particularly for high-consequence decisions. However, there has been little research on how to evaluate the quality of such methods. We use the statistical measures of frequentist interval coverage and interval width to evaluate the quality of credible intervals, and expected calibration error to evaluate the predicted confidence of classifications. These metrics are evaluated on Bayesian neural networks (BNN) fit using Markov Chain Monte Carlo (MCMC) and variational inference (VI), bootstrapped neural networks (NN), Deep Ensembles (DE), and Monte Carlo (MC) dropout. We apply these UQ methods for DL to a hyperspectral image target detection problem and show the inconsistency of the different methods' results and the necessity of a UQ quality metric. To reconcile these differences and choose a UQ method that appropriately quantifies the uncertainty, we create a simulated data set with a fully parameterized probability distribution for a two-class classification problem. The gold-standard MCMC performs the best overall, and the bootstrapped NN is a close second, requiring the same computational expense as DE. Through this comparison, we demonstrate that, for a given data set, different models can produce uncertainty estimates of markedly different quality. This in turn points to a great need for principled assessment methods of UQ quality in DL applications.
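The three metrics named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the equal-width binning with 10 bins, and the choice of the known class probability as the "truth" for coverage are all assumptions made for illustration. The coverage check only makes sense where the true probability is available, as in a simulated data set.

```python
import numpy as np


def expected_calibration_error(confidences, correct, n_bins=10):
    """Equal-width-bin ECE (an assumed, common variant).

    confidences: predicted confidence of the chosen class, in [0, 1].
    correct: 1 if the prediction was right, 0 otherwise.
    """
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Right-closed bins: index b holds confidences in (edges[b], edges[b + 1]].
    idx = np.digitize(confidences, edges[1:-1], right=True)
    ece = 0.0
    for b in range(n_bins):
        mask = idx == b
        if mask.any():
            gap = abs(correct[mask].mean() - confidences[mask].mean())
            ece += mask.mean() * gap  # weight by the fraction of samples in the bin
    return ece


def coverage_and_width(lower, upper, truth):
    """Fraction of intervals containing the truth, and mean interval width."""
    lower, upper, truth = (np.asarray(a, dtype=float) for a in (lower, upper, truth))
    covered = (lower <= truth) & (truth <= upper)
    return covered.mean(), (upper - lower).mean()


# Hypothetical toy inputs: two-class predicted probabilities and labels.
p_class1 = np.array([0.9, 0.2, 0.7, 0.4])
labels = np.array([1, 0, 0, 0])
predicted = (p_class1 >= 0.5).astype(int)
confidence = np.maximum(p_class1, 1.0 - p_class1)
print(expected_calibration_error(confidence, predicted == labels))

# Hypothetical credible intervals for the class-1 probability, with known truth.
print(coverage_and_width([0.8, 0.1], [1.0, 0.3], [0.85, 0.35]))
```

With nominal 95% intervals, empirical coverage close to 0.95 together with a small mean width is the desirable outcome; coverage well below the nominal level suggests the method understates its uncertainty.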




