Can Explanations Be Useful for Calibrating Black Box Models?

10/14/2021
by Xi Ye, et al.

One often wants to take an existing, trained NLP model and use it on data from a new domain. While fine-tuning or few-shot learning can be used to adapt the base model, there is no single recipe for making these methods work; moreover, one may not have access to the original model weights at all if the model is deployed as a black box. To this end, we study how to improve a black box model's performance on a new domain, given examples from that domain, by leveraging explanations of the model's behavior. Our approach first extracts a set of features that combine human intuition about the task with model attributions generated by black box interpretation techniques, and then uses a simple model to calibrate or rerank the model's predictions based on these features. We experiment with our method on two tasks, extractive question answering and natural language inference, covering adaptation across several domain pairs. The experimental results across all domain pairs show that explanations are useful for calibrating these models. We also show that the calibration features transfer to some extent between tasks and shed light on how to use them effectively.
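The abstract's pipeline (explanation-derived features feeding a simple calibrator) can be sketched in a few lines. The sketch below is purely illustrative and not the authors' code: the feature set (black-box confidence plus the fraction of attribution mass landing on task-relevant tokens) and all names (`attribution_overlap`, `train_calibrator`, the example dictionaries) are assumptions, and the "simple model" is stood in for by a hand-rolled logistic regression.

```python
import math

def attribution_overlap(attributions, keywords):
    """Fraction of attribution mass on task-relevant tokens.

    `attributions` maps token -> attribution score (e.g. from a black-box
    interpretation method); `keywords` encodes human intuition about which
    tokens should matter for the task.
    """
    total = sum(abs(a) for a in attributions.values()) or 1.0
    return sum(abs(a) for tok, a in attributions.items() if tok in keywords) / total

def features(ex, keywords):
    # Bias term, raw model confidence, and one explanation-based feature.
    return [1.0, ex["confidence"], attribution_overlap(ex["attributions"], keywords)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_calibrator(examples, labels, keywords, lr=0.5, epochs=500):
    """Fit a logistic-regression calibrator by plain gradient ascent.

    `labels` are 1 when the black-box prediction was correct, 0 otherwise.
    """
    w = [0.0, 0.0, 0.0]
    for _ in range(epochs):
        for ex, y in zip(examples, labels):
            x = features(ex, keywords)
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
            for i, xi in enumerate(x):
                w[i] += lr * (y - p) * xi  # log-likelihood gradient step
    return w

def calibrated_confidence(w, ex, keywords):
    """Recalibrated probability that the black-box prediction is correct."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, features(ex, keywords))))
```

As a usage illustration, two predictions with identical raw confidence can receive different calibrated scores once attributions are taken into account: a prediction whose attribution mass sits on task-relevant tokens is scored higher than one attending mostly to stopwords, which is the reranking behavior the abstract describes.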


Related research

11/15/2019 · "How do I fool you?": Manipulating User Trust via Misleading Black Box Explanations
As machine learning black boxes are increasingly being deployed in criti...

12/31/2020 · FiD-Ex: Improving Sequence-to-Sequence Models for Extractive Rationale Generation
Natural language (NL) explanations of model predictions are gaining popu...

06/24/2021 · Towards Exploiting Geometry and Time for Fast Off-Distribution Adaptation in Multi-Task Robot Learning
We explore possible methods for multi-task transfer learning which seek ...

06/17/2019 · Why and How zk-SNARK Works
Despite the existence of multiple great resources on zk-SNARK constructi...

12/24/2020 · Sentence-Based Model Agnostic NLP Interpretability
Today, interpretability of Black-Box Natural Language Processing (NLP) m...

12/16/2020 · Learning from the Best: Rationalizing Prediction by Adversarial Information Calibration
Explaining the predictions of AI models is paramount in safety-critical ...

01/15/2023 · Rationalizing Predictions by Adversarial Information Calibration
Explaining the predictions of AI models is paramount in safety-critical ...
