Are Hard Examples also Harder to Explain? A Study with Human and Model-Generated Explanations

by Swarnadeep Saha, et al.

Recent work on explainable NLP has shown that few-shot prompting can enable large pretrained language models (LLMs) to generate grammatical and factual natural language explanations for data labels. In this work, we study the connection between explainability and sample hardness by investigating the following research question: "Are LLMs and humans equally good at explaining data labels for both easy and hard samples?" We answer this question by first collecting human-written explanations in the form of generalizable commonsense rules for the Winograd Schema Challenge task (Winogrande dataset). We compare these explanations with those generated by GPT-3 while varying the hardness of the test samples as well as the in-context samples. We observe that (1) GPT-3 explanations are as grammatical as human explanations regardless of the hardness of the test samples, (2) for easy examples, GPT-3 generates highly supportive explanations but human explanations are more generalizable, and (3) for hard examples, human explanations are significantly better than GPT-3 explanations in terms of both label-supportiveness and generalizability judgements. We also find that the hardness of the in-context examples impacts the quality of GPT-3 explanations. Finally, we show that the supportiveness and generalizability of human explanations are also affected by sample hardness, although by a much smaller margin than those of model-generated explanations. Supporting code and data are available at




