FiD-Ex: Improving Sequence-to-Sequence Models for Extractive Rationale Generation

12/31/2020
by Kushal Lakhotia, et al.

Natural language (NL) explanations of model predictions are gaining popularity as a means to understand and verify the decisions of large black-box pre-trained models on NLP tasks such as Question Answering (QA) and Fact Verification. Recently, pre-trained sequence-to-sequence (seq2seq) models have proven very effective at jointly making predictions and generating NL explanations. However, these models have several shortcomings: they can fabricate explanations even for incorrect predictions, they are difficult to adapt to long input documents, and their training requires a large amount of labeled data. In this paper, we develop FiD-Ex, which addresses these shortcomings for seq2seq models by 1) introducing sentence markers to eliminate explanation fabrication by encouraging extractive generation, 2) using the fusion-in-decoder architecture to handle long input contexts, and 3) intermediate fine-tuning on re-structured open-domain QA datasets to improve few-shot performance. FiD-Ex significantly improves over prior work on both explanation metrics and task accuracy across multiple tasks from the ERASER explainability benchmark, in both the fully supervised and few-shot settings.
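To make the sentence-marker idea concrete, here is a minimal sketch in Python of how extractive rationale generation can work: each input sentence is prefixed with an index marker, the seq2seq model is trained to emit marker tokens alongside its prediction, and the emitted markers are mapped back to the original sentences, so the rationale is extractive by construction rather than free-form text. The marker format ("sent0", "sent1", ...), the helper names, and the naive sentence splitter are illustrative assumptions, not the paper's released implementation; in the fusion-in-decoder setup, each marked passage would additionally be paired with the question and encoded independently before the decoder attends over all encodings.

```python
import re
from typing import List, Tuple


def add_sentence_markers(passage: str) -> Tuple[str, List[str]]:
    """Split a passage into sentences and prefix each with an index marker."""
    # Naive sentence splitter for illustration; a real pipeline would use
    # a proper tokenizer (e.g. spaCy or nltk).
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", passage) if s.strip()]
    marked = " ".join(f"sent{i} {s}" for i, s in enumerate(sentences))
    return marked, sentences


def extract_rationale(generated: str, sentences: List[str]) -> List[str]:
    """Map generated marker tokens back to the original sentences.

    Because the rationale is recovered by index lookup, the model cannot
    fabricate explanation text that does not appear in the input.
    """
    ids = [int(m) for m in re.findall(r"sent(\d+)", generated)]
    return [sentences[i] for i in ids if i < len(sentences)]


passage = (
    "The Eiffel Tower is in Paris. It was completed in 1889. "
    "It is 330 metres tall."
)
marked, sents = add_sentence_markers(passage)
# The seq2seq input might look like: "question: ... context: sent0 The Eiffel ..."
print(marked)

# Suppose the model generates its prediction followed by marker tokens:
generated = "answer: Paris sent0"
print(extract_rationale(generated, sents))
# -> ['The Eiffel Tower is in Paris.']
```

A useful property of this scheme is that evaluation against gold rationales reduces to comparing sentence index sets, which is why it pairs naturally with extractive benchmarks such as ERASER.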


Related research

10/14/2021 · Can Explanations Be Useful for Calibrating Black Box Models?
One often wants to take an existing, trained NLP model and use it on dat...

05/22/2023 · SPARSEFIT: Few-shot Prompting with Sparse Fine-tuning for Jointly Generating Predictions and Natural Language Explanations
Explaining the decisions of neural models is crucial for ensuring their ...

03/21/2022 · DQ-BART: Efficient Sequence-to-Sequence Model via Joint Distillation and Quantization
Large-scale pre-trained sequence-to-sequence models like BART and T5 ach...

03/27/2021 · You Can Do Better! If You Elaborate the Reason When Making Prediction
Neural predictive models have achieved groundbreaking performance improv...

09/14/2022 · Prompt Combines Paraphrase: Teaching Pre-trained Models to Understand Rare Biomedical Words
Prompt-based fine-tuning for pre-trained models has proven effective for...

04/16/2021 · What to Pre-Train on? Efficient Intermediate Task Selection
Intermediate task fine-tuning has been shown to culminate in large trans...

09/20/2022 · Automatic Label Sequence Generation for Prompting Sequence-to-sequence Models
Prompting, which casts downstream applications as language modeling task...
