Are Human Explanations Always Helpful? Towards Objective Evaluation of Human Natural Language Explanations

05/04/2023
by Bingsheng Yao, et al.

Human-annotated labels and explanations are critical for training explainable NLP models. However, unlike human-annotated labels, whose quality is easier to calibrate (e.g., with a majority vote), human-crafted free-form explanations can be quite subjective, as recent works have discussed. Before blindly using such explanations as ground truth to train ML models, a vital question needs to be asked: how do we evaluate a human-annotated explanation's quality? In this paper, we build on the view that the quality of a human-annotated explanation can be measured by how much it helps (or impairs) ML models' performance on the NLP task for which the annotations were collected. In comparison to the commonly used Simulatability score, we define a new metric that accounts for an explanation's helpfulness to model performance at both the fine-tuning and inference stages. With the help of a unified dataset format, we evaluate the proposed metric on five datasets (e.g., e-SNLI) with two model architectures (T5 and BART), and the results show that our proposed metric can objectively evaluate the quality of human-annotated explanations, while Simulatability falls short.
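To make the contrast concrete, below is a minimal, hypothetical Python sketch. It encodes the standard view of Simulatability as an inference-time accuracy gain and contrasts it with an illustrative two-stage helpfulness score that separates the gain from fine-tuning on explanations from the gain from seeing them at inference. The condition names, the dictionary layout, and the equal-weight averaging are assumptions for illustration only, not the paper's actual definition or code.

```python
# Hypothetical sketch: Simulatability vs. a two-stage helpfulness score.
# All names and the averaging scheme are illustrative assumptions.

def simulatability(acc_with_expl: float, acc_without_expl: float) -> float:
    """Standard Simulatability: accuracy gain from showing the
    explanation at inference time only."""
    return acc_with_expl - acc_without_expl

def helpfulness(acc: dict) -> float:
    """Illustrative two-stage score: average the accuracy gain from
    adding explanations at fine-tuning with the gain from adding them
    at inference. Keys are (fine-tune condition, inference condition),
    e.g. acc[("expl", "plain")] is a model fine-tuned WITH explanations
    and evaluated WITHOUT them."""
    finetune_gain = acc[("expl", "plain")] - acc[("plain", "plain")]
    inference_gain = acc[("expl", "expl")] - acc[("expl", "plain")]
    return 0.5 * (finetune_gain + inference_gain)

# Toy numbers: explanations help at fine-tuning but slightly hurt at inference.
acc = {
    ("plain", "plain"): 0.80,
    ("expl", "plain"): 0.85,
    ("expl", "expl"): 0.84,
}
print(round(simulatability(acc[("expl", "expl")], acc[("plain", "plain")]), 3))  # 0.04
print(round(helpfulness(acc), 3))                                               # 0.02
```

In this toy setting, a single inference-time comparison would credit the explanations with the full 0.04 gain, whereas the two-stage view reveals that the benefit comes from fine-tuning and that showing the explanations at inference actually costs accuracy.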


