Investigating the Failure Modes of the AUC metric and Exploring Alternatives for Evaluating Systems in Safety Critical Applications

10/10/2022
by Swaroop Mishra, et al.
As safety requirements for black-box models grow in importance, evaluating a model's selective answering capability has become critical. Area under the curve (AUC) is commonly used as the metric for this purpose. We identify limitations of AUC; for example, a model with a higher AUC is not always better at selective answering. We propose three alternative metrics that address the identified limitations. In experiments with ten models, results under the new metrics show that newer and larger pre-trained models do not necessarily perform better at selective answering. We hope our insights will help in developing better models tailored to safety-critical applications.
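As an illustration of the kind of AUC computation the abstract refers to, the following is a minimal sketch, not the authors' implementation. It assumes the curve in question is accuracy plotted against coverage, where coverage is the fraction of questions the model chooses to answer, most confident first; the abstract itself does not specify the curve. The function names and the toy confidence values are hypothetical.

```python
from typing import List, Sequence, Tuple


def accuracy_coverage_curve(
    confidences: Sequence[float], correct: Sequence[bool]
) -> List[Tuple[float, float]]:
    """Return (coverage, accuracy) points as the model answers its
    k most confident questions, for k = 1..n.

    Assumption: selective answering is done by ranking on confidence.
    """
    # Indices sorted from most to least confident (stable for ties).
    order = sorted(range(len(confidences)), key=lambda i: confidences[i], reverse=True)
    points = []
    n_correct = 0
    for k, idx in enumerate(order, start=1):
        n_correct += int(correct[idx])
        points.append((k / len(order), n_correct / k))
    return points


def area_under_curve(points: Sequence[Tuple[float, float]]) -> float:
    """Trapezoidal area over the coverage range spanned by the points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Toy data (hypothetical): four questions with model confidence and correctness.
confidences = [0.9, 0.8, 0.6, 0.4]
correct = [True, True, False, True]

points = accuracy_coverage_curve(confidences, correct)
auc = area_under_curve(points)
print(points)
print(auc)
```

Because this area summarizes the whole curve in one number, two models with different accuracy at low coverage can end up with similar values, which is one reason a single AUC figure may hide behavior that matters in a safety-critical setting.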
