Evaluating Robustness of Predictive Uncertainty Estimation: Are Dirichlet-based Models Reliable?

by Anna-Kathrin Kopetzki, et al.

Robustness to adversarial perturbations and accurate uncertainty estimation are crucial for the reliable application of deep learning in real-world settings. Dirichlet-based uncertainty (DBU) models are a family of models that predict the parameters of a Dirichlet distribution (instead of a categorical one) and thus promise to signal when their predictions should not be trusted: on unknown or ambiguous samples, the models are expected to mark their predictions with high uncertainty. In this work, we show that DBU models with standard training are not robust with respect to three important tasks in the field of uncertainty estimation. In particular, we evaluate how useful the uncertainty estimates are to (1) indicate correctly classified samples and (2) detect adversarial examples that try to fool classification. We further evaluate the reliability of DBU models on the task of (3) distinguishing between in-distribution (ID) and out-of-distribution (OOD) data. To this end, we present the first study of certifiable robustness for DBU models. Furthermore, we propose novel uncertainty attacks that fool models into assigning high confidence to OOD data and low confidence to ID data, respectively. Based on our results, we explore the first approaches to making DBU models more robust: we use adversarial training procedures based on label attacks, uncertainty attacks, or random noise, and demonstrate how they affect the robustness of DBU models on ID and OOD data.
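To make the notion of an "uncertainty attack" concrete, here is a minimal sketch, not taken from the paper: a hypothetical toy "DBU model" (a linear map followed by `exp` producing Dirichlet concentration parameters) is perturbed with PGD-style signed gradient steps so that the total concentration alpha_0 (the "precision", whose low values signal high distributional uncertainty) drops on an in-distribution input. All names and the model itself are illustrative assumptions.

```python
import numpy as np

# Hypothetical toy stand-in for a DBU model: a linear map followed by
# exp() yields Dirichlet concentration parameters alpha > 0 for C = 3 classes.
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 4))

def alphas(x):
    # Predicted Dirichlet concentration parameters for input x.
    return np.exp(W @ x)

def precision(x):
    # alpha_0 = sum_c alpha_c; in DBU models, low precision indicates
    # high distributional uncertainty (e.g. on OOD inputs).
    return alphas(x).sum()

def uncertainty_attack(x, eps=0.5, steps=25, lr=0.05):
    """PGD-style sketch of an uncertainty attack: search within an
    L-inf ball of radius eps for a perturbed input on which the model
    reports LOW confidence (low precision), i.e. an ID sample that the
    model mistakes for uncertain/OOD."""
    x_adv, best = x.copy(), x.copy()
    for _ in range(steps):
        # Analytic gradient of alpha_0 = sum(exp(W x)) with respect to x.
        grad = W.T @ alphas(x_adv)
        x_adv = x_adv - lr * np.sign(grad)          # descend alpha_0
        x_adv = x + np.clip(x_adv - x, -eps, eps)   # project into eps-ball
        if precision(x_adv) < precision(best):
            best = x_adv.copy()
    return best

x = rng.normal(size=4)
x_adv = uncertainty_attack(x)
# The perturbed input stays close to x but lowers reported confidence.
assert precision(x_adv) < precision(x)
```

The attack in the paper that pushes confidence in the opposite direction (high confidence on OOD data) would simply ascend alpha_0 instead of descending it; against a real DBU network the gradient would come from autodiff rather than this closed form.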



