Useful Confidence Measures: Beyond the Max Score

10/25/2022
by Gal Yona, et al.

An important component in deploying machine learning (ML) in safety-critical applications is having a reliable measure of confidence in the ML model's predictions. For a classifier f producing a probability vector f(x) over the candidate classes, the confidence is typically taken to be max_i f(x)_i. This approach is potentially limited, as it disregards the rest of the probability vector. In this work, we derive several confidence measures that depend on information beyond the maximum score, such as margin-based and entropy-based measures, and empirically evaluate their usefulness, focusing on NLP tasks with distribution shifts and Transformer-based models. We show that when models are evaluated on out-of-distribution data “out of the box”, using only the maximum score to inform the confidence measure is highly suboptimal. In the post-processing regime (where the scores of f can be improved using additional in-distribution held-out data), this remains true, albeit to a lesser degree. Overall, our results suggest that entropy-based confidence is a surprisingly useful measure.
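
To make the three families of measures named in the abstract concrete, here is a minimal Python sketch. The abstract does not give the paper's exact definitions, so the formulas below are assumptions based on common formulations: the margin is taken as the gap between the two largest scores, and the entropy-based confidence as one minus the Shannon entropy normalized by log(k) for k classes. The function names are hypothetical and are not taken from the paper.

```python
import math


def max_score_confidence(probs):
    """Baseline from the abstract: max_i f(x)_i."""
    return max(probs)


def margin_confidence(probs):
    """Assumed margin form: largest score minus second-largest score."""
    top, runner_up = sorted(probs, reverse=True)[:2]
    return top - runner_up


def entropy_confidence(probs):
    """Assumed entropy form: 1 - H(p) / log(k), which lies in [0, 1].

    Zero-probability classes contribute nothing to the entropy,
    so they are skipped to avoid log(0). Requires at least 2 classes.
    """
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    return 1.0 - entropy / math.log(len(probs))


# Two probability vectors with the same maximum score but a
# different distribution over the remaining classes.
a = [0.5, 0.5, 0.0]
b = [0.5, 0.25, 0.25]

for name, measure in [
    ("max score", max_score_confidence),
    ("margin", margin_confidence),
    ("entropy", entropy_confidence),
]:
    print(f"{name:10s} a={measure(a):.3f} b={measure(b):.3f}")
```

The two example vectors receive the same max-score confidence, while the margin and entropy variants tell them apart. This illustrates the kind of information that the maximum score alone disregards.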


