Calibration Meets Explanation: A Simple and Effective Approach for Model Confidence Estimates

11/06/2022
by   Dongfang Li, et al.

Calibration strengthens the trustworthiness of black-box models by producing more accurate confidence estimates on given examples. However, little is known about whether model explanations can help confidence calibration. Intuitively, humans look at important feature attributions to decide whether a model is trustworthy. Similarly, explanations can tell us when the model may or may not know. Inspired by this, we propose a method named CME that leverages model explanations to make the model less confident on examples with non-inductive attributions. The idea is that when the model is not highly confident, it is difficult to identify strong indications of any class, so the tokens accordingly do not have high attribution scores for any class, and vice versa. We conduct extensive experiments on six datasets with two popular pre-trained language models in both in-domain and out-of-domain settings. The results show that CME improves calibration performance in all settings. The expected calibration errors are further reduced when CME is combined with temperature scaling. Our findings highlight that model explanations can help calibrate posterior estimates.
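The core intuition above (weak attributions should mean lower confidence) can be illustrated with a toy sketch. This is not the paper's actual CME formulation: the choice of the maximum absolute attribution as the "evidence" signal, the temperature schedule, and the `alpha` hyperparameter are all illustrative assumptions.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; higher temperature -> flatter distribution."""
    z = logits / temperature
    z = z - z.max()  # numerical stability
    e = np.exp(z)
    return e / e.sum()

def attribution_scaled_confidence(logits, attributions, alpha=1.0):
    """Toy illustration of the CME intuition, not the paper's method.

    `attributions` is a (num_tokens, num_classes) matrix of token
    attribution scores. When no token carries a strong attribution for
    any class, we raise the temperature to damp the model's confidence.
    """
    evidence = np.abs(attributions).max()           # strongest single cue (our assumption)
    temperature = 1.0 + alpha / (evidence + 1e-8)   # weak cues -> high temperature
    return softmax(logits, temperature)

logits = np.array([2.0, 0.5, 0.1])
strong_attr = np.array([[10.0, 0.2], [0.1, 0.3]])  # a clear indicative token
weak_attr = np.array([[0.1, 0.05], [0.08, 0.02]])  # no strong cue for any class

p_strong = attribution_scaled_confidence(logits, strong_attr)
p_weak = attribution_scaled_confidence(logits, weak_attr)
```

With strong attributions the distribution stays peaked; with weak attributions the same logits yield a flatter, less confident distribution, which is the calibration effect the abstract describes.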


research
03/17/2020

Calibration of Pre-trained Transformers

Pre-trained Transformers are now ubiquitous in natural language processi...
research
12/22/2022

On Calibrating Semantic Segmentation Models: Analysis and An Algorithm

We study the problem of semantic segmentation calibration. For image cla...
research
02/28/2022

An Empirical Study on Explanations in Out-of-Domain Settings

Recent work in Natural Language Processing has focused on developing app...
research
06/22/2022

Explanation-based Counterfactual Retraining (XCR): A Calibration Method for Black-box Models

With the rapid development of eXplainable Artificial Intelligence (XAI),...
research
03/14/2022

On the Calibration of Pre-trained Language Models using Mixup Guided by Area Under the Margin and Saliency

A well-calibrated neural model produces confidence (probability outputs)...
research
10/31/2022

A Close Look into the Calibration of Pre-trained Language Models

Pre-trained language models (PLMs) achieve remarkable performance on man...
research
01/02/2021

Uncertainty-sensitive Activity Recognition: a Reliability Benchmark and the CARING Models

Beyond assigning the correct class, an activity recognition model should...
