SPARSEFIT: Few-shot Prompting with Sparse Fine-tuning for Jointly Generating Predictions and Natural Language Explanations

05/22/2023
by Jesus Solano et al.

Explaining the decisions of neural models is crucial for ensuring their trustworthiness at deployment time. Using Natural Language Explanations (NLEs) to justify a model's predictions has recently gained increasing interest. However, this approach usually demands large datasets of human-written NLEs for the ground-truth answers, which are expensive and potentially infeasible for some applications. Fine-tuning Pre-trained Language Models (PLMs) in conjunction with prompt-based learning has recently emerged as a way to generate high-quality NLEs when only a few NLEs are available. However, PLMs typically have billions of parameters, making full fine-tuning expensive. We propose SparseFit, a sparse few-shot fine-tuning strategy that leverages discrete prompts to jointly generate predictions and NLEs. We experiment with SparseFit on the T5 model and four datasets and compare it against state-of-the-art parameter-efficient fine-tuning techniques. We perform automatic and human evaluations to assess the quality of the model-generated NLEs, finding that fine-tuning only 6.8% of the model parameters yields competitive results for both the task performance and the quality of the NLEs.
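To illustrate the core idea of sparse fine-tuning — freezing most of a model's weights and updating only a small, named subset — here is a minimal, self-contained sketch. The parameter-group names and sizes are hypothetical toy values, not T5's real architecture, and selecting layer-norm weights is just one example of a sparse subset.

```python
# Minimal sketch of the sparse fine-tuning idea: freeze most weights and
# update only a small, named subset. Parameter groups/sizes are hypothetical.

def select_trainable(param_sizes, patterns):
    """Return names of parameter groups whose name contains any pattern."""
    return {name for name in param_sizes
            if any(p in name for p in patterns)}

def trainable_fraction(param_sizes, trainable):
    """Fraction of all parameters that will receive gradient updates."""
    total = sum(param_sizes.values())
    tuned = sum(size for name, size in param_sizes.items()
                if name in trainable)
    return tuned / total

# Toy model: large attention matrices plus small layer-norm groups.
param_sizes = {
    "encoder.block0.attention.weight": 1_000_000,
    "encoder.block0.layer_norm.weight": 1_000,
    "decoder.block0.attention.weight": 1_000_000,
    "decoder.block0.layer_norm.weight": 1_000,
    "lm_head.weight": 500_000,
}

# Train only the layer-norm groups; everything else stays frozen.
trainable = select_trainable(param_sizes, patterns=("layer_norm",))
frac = trainable_fraction(param_sizes, trainable)
print(f"Tuning {frac:.2%} of parameters")
```

In a real setting the same selection step would mark the chosen parameter tensors as trainable (and all others as frozen) before running the usual fine-tuning loop, which is what keeps the update cost far below that of full fine-tuning.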


