The Unreliability of Explanations in Few-Shot In-Context Learning

05/06/2022
by Xi Ye, et al.

Does prompting a large language model like GPT-3 with explanations improve in-context learning? We focus specifically on two NLP tasks that involve reasoning over text, namely question answering and natural language inference. Including explanations in the prompt and having the model generate them does not consistently improve performance in the settings we study, contrary to recent results on symbolic reasoning tasks (Nye et al., 2021; Wei et al., 2022). Despite careful prompting, explanations generated by GPT-3 may not even be factually grounded in the input, even on simple tasks with straightforward extractive explanations. However, these flawed explanations can still be useful as a way to verify GPT-3's predictions post hoc. Through analysis in three settings, we show that explanations judged as good by humans, namely those that are logically consistent with the input and the prediction, usually indicate more accurate predictions. Following these observations, we present a framework for calibrating model predictions based on the reliability of the explanations. Our framework trains calibrators using automatically extracted scores that approximately assess the reliability of explanations, which helps improve performance across three different datasets.
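
The calibration framework is only summarized above; the sketch below illustrates one way such a calibrator could be wired up. It is a minimal illustration, not the authors' implementation: the lexical-overlap feature, the toy examples, and the logistic-regression calibrator are assumptions standing in for the automatically extracted reliability scores described in the abstract.

```python
# Hedged sketch of an explanation-based calibrator; not the paper's exact method.
# Assumption: explanation reliability is approximated by lexical overlap between
# the explanation and the input passage, and a logistic-regression calibrator
# combines that score with the model's own confidence.

import numpy as np
from sklearn.linear_model import LogisticRegression


def overlap_score(explanation: str, passage: str) -> float:
    """Fraction of explanation tokens that also appear in the passage,
    a rough proxy for whether the explanation is grounded in the input."""
    exp_tokens = set(explanation.lower().split())
    passage_tokens = set(passage.lower().split())
    if not exp_tokens:
        return 0.0
    return len(exp_tokens & passage_tokens) / len(exp_tokens)


def build_features(examples):
    """Each example: (model_confidence, explanation, passage)."""
    feats = []
    for confidence, explanation, passage in examples:
        feats.append([confidence, overlap_score(explanation, passage)])
    return np.array(feats)


# Toy calibration data (hypothetical): features plus whether the prediction was correct.
train_examples = [
    (0.92, "The answer is Paris because the text says the capital is Paris.",
     "France's capital is Paris, a city on the Seine."),
    (0.88, "The answer is Lyon because the text says Lyon is the capital.",
     "France's capital is Paris, a city on the Seine."),
]
labels = np.array([1, 0])  # first prediction correct, second incorrect

calibrator = LogisticRegression().fit(build_features(train_examples), labels)

# At test time, the calibrated probability can be used to accept or abstain.
test_features = build_features([
    (0.90, "The answer is Paris, stated directly in the passage.",
     "France's capital is Paris, a city on the Seine."),
])
print(calibrator.predict_proba(test_features)[:, 1])
```

In practice the confidence and explanation features would come from the actual GPT-3 outputs, and the calibrator would be fit on held-out predictions with known correctness labels.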

