To Trust Or Not To Trust A Classifier

05/30/2018
by Heinrich Jiang, et al.

Knowing when a classifier's prediction can be trusted is useful in many applications and critical for safely using AI. While the bulk of the effort in machine learning research has gone toward improving classifier performance, understanding when a classifier's predictions should and should not be trusted has received far less attention. The standard approach is to use the classifier's discriminant or confidence score; however, we show there exists a considerably more effective alternative. We propose a new score, called the trust score, which measures the agreement between the classifier and a modified nearest-neighbor classifier on the test example. We show empirically that high (low) trust scores produce surprisingly high precision at identifying correctly (incorrectly) classified examples, consistently outperforming the classifier's confidence score as well as many other baselines. Further, under some mild distributional assumptions, we show that if the trust score for an example is high (low), the classifier will likely agree (disagree) with the Bayes-optimal classifier. Our guarantees consist of non-asymptotic rates of statistical consistency under various nonparametric settings and build on recent developments in topological data analysis.
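
To make the idea of agreement with a nearest-neighbor classifier concrete, here is a minimal sketch in Python. It rests on an assumption the abstract does not spell out: the score is taken as the ratio of the distance from the test example to the nearest training point of any class other than the predicted one, over the distance to the nearest training point of the predicted class. The function name `trust_score`, its parameters, and the toy data are hypothetical, and any "modification" of the nearest-neighbor classifier (for example, filtering out low-density training points) is omitted.

```python
import math
from collections import defaultdict


def trust_score(train_x, train_y, x, predicted_label):
    """Assumed form of the score: distance to the nearest other class
    divided by distance to the predicted class (larger = more agreement)."""
    # Distance from x to the closest training point of each class.
    nearest = defaultdict(lambda: math.inf)
    for point, label in zip(train_x, train_y):
        nearest[label] = min(nearest[label], math.dist(point, x))

    d_pred = nearest[predicted_label]
    d_other = min(
        (d for label, d in nearest.items() if label != predicted_label),
        default=math.inf,
    )
    # Small epsilon avoids division by zero when x coincides with a training point.
    return d_other / (d_pred + 1e-12)


# Toy data (hypothetical): class 0 sits near the origin, class 1 near (5, 5).
train_x = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 5.0)]
train_y = [0, 0, 1, 1]
x = (0.5, 0.5)

score_if_pred_0 = trust_score(train_x, train_y, x, predicted_label=0)
score_if_pred_1 = trust_score(train_x, train_y, x, predicted_label=1)
```

In this sketch a score above 1 means the predicted class is also the nearest class, so the nearest-neighbor view agrees with the classifier; a score below 1 means some other class is closer, which would flag the prediction as less trustworthy.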
