Few Shot Rationale Generation using Self-Training with Dual Teachers

Self-rationalizing models that also generate a free-text explanation for their predicted labels are an important tool for building trustworthy AI applications. Since generating explanations for annotated labels is a laborious and costly process, recent models rely on large pretrained language models (PLMs) as their backbone and few-shot learning. In this work, we explore a self-training approach that leverages both labeled and unlabeled data to further improve few-shot models, under the assumption that neither human-written rationales nor annotated task labels are available at scale. We introduce a novel dual-teacher learning framework, which learns two specialized teacher models for task prediction and rationalization using self-training, and distills their knowledge into a multi-tasking student model that can jointly generate the task label and rationale. Furthermore, we formulate a new loss function, Masked Label Regularization (MLR), which promotes explanations that are strongly conditioned on predicted labels. Evaluation on three public datasets demonstrates that the proposed methods are effective in modeling task labels and generating faithful rationales.
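
The abstract describes the framework only at a high level. As an illustration, the minimal PyTorch sketch below walks through the two stages it outlines: self-training a label teacher on the few labeled examples plus confident pseudo-labels, then distilling both teachers into one multi-task student. All module sizes, the confidence threshold, the temperature, and the use of tiny linear models in place of PLMs are assumptions made for the sketch, not the paper's implementation.

```python
# Toy sketch of dual-teacher self-training + distillation (assumed setup:
# tiny linear modules stand in for PLM-based teachers/student).
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
DIM, N_LABELS, VOCAB, RAT_LEN = 32, 3, 100, 8

# Two specialized teachers: one predicts task labels, one generates rationales.
label_teacher = nn.Linear(DIM, N_LABELS)
rationale_teacher = nn.Linear(DIM, RAT_LEN * VOCAB)  # stand-in for a seq2seq PLM

class Student(nn.Module):
    """Multi-task student: shared encoder, one head per task."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(DIM, DIM), nn.ReLU())
        self.label_head = nn.Linear(DIM, N_LABELS)
        self.rationale_head = nn.Linear(DIM, RAT_LEN * VOCAB)

    def forward(self, x):
        h = self.encoder(x)
        return self.label_head(h), self.rationale_head(h).view(-1, RAT_LEN, VOCAB)

student = Student()

# A few labeled examples and a larger unlabeled pool (random placeholders).
x_lab, y_lab = torch.randn(8, DIM), torch.randint(0, N_LABELS, (8,))
x_unlab = torch.randn(64, DIM)

# Stage 1 -- self-training: fit the label teacher on the few-shot set, then
# keep only confident pseudo-labels from the unlabeled pool.
opt_t = torch.optim.Adam(label_teacher.parameters(), lr=1e-2)
for _ in range(200):
    opt_t.zero_grad()
    F.cross_entropy(label_teacher(x_lab), y_lab).backward()
    opt_t.step()

with torch.no_grad():
    conf, pseudo_y = label_teacher(x_unlab).softmax(-1).max(-1)
keep = conf > 0.5  # confidence threshold (assumed value)
x_pseudo, y_pseudo = x_unlab[keep], pseudo_y[keep]

# Stage 2 -- distillation: the student matches both teachers' soft outputs
# on the pseudo-labeled pool, plus a hard pseudo-label loss.
opt_s = torch.optim.Adam(student.parameters(), lr=1e-2)
T = 2.0  # distillation temperature (assumed value)
for _ in range(200):
    opt_s.zero_grad()
    s_label, s_rat = student(x_pseudo)
    with torch.no_grad():
        t_label = label_teacher(x_pseudo)
        t_rat = rationale_teacher(x_pseudo).view(-1, RAT_LEN, VOCAB)
    kd_label = F.kl_div(F.log_softmax(s_label / T, -1),
                        F.softmax(t_label / T, -1), reduction="batchmean") * T * T
    kd_rat = F.kl_div(F.log_softmax(s_rat / T, -1),
                      F.softmax(t_rat / T, -1), reduction="batchmean") * T * T
    loss = kd_label + kd_rat + F.cross_entropy(s_label, y_pseudo)
    loss.backward()
    opt_s.step()
```

Keeping the teachers separate lets each specialize in its task before distillation, which is the point of the dual-teacher setup; the student then serves both tasks from a shared encoder.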

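The exact form of Masked Label Regularization is likewise not given in the abstract. One plausible reading, sketched below under that assumption, is a hinge-style penalty that forces the rationale distribution to shift when the predicted label is masked out of the generator's input; the function name, margin, and KL formulation here are all illustrative, not the paper's definition.

```python
# One plausible instantiation of a Masked-Label-Regularization-style loss
# (assumed form, reconstructed from the abstract -- not the paper's exact MLR).
import torch
import torch.nn.functional as F

def mlr_loss(logits_with_label: torch.Tensor,
             logits_label_masked: torch.Tensor,
             margin: float = 1.0) -> torch.Tensor:
    """Hinge penalty: the rationale distribution should shift when the
    predicted label is replaced by a mask token in the generator's input.

    Both inputs are (batch, seq_len, vocab) logits from the same generator,
    decoded once with the label visible and once with it masked.
    """
    log_p_full = F.log_softmax(logits_with_label, dim=-1)
    p_masked = F.softmax(logits_label_masked, dim=-1)
    # Per-token KL(p_masked || p_full), averaged over the sequence.
    kl = F.kl_div(log_p_full, p_masked, reduction="none").sum(-1).mean(-1)
    # Penalize examples whose rationale barely changes when the label is hidden.
    return F.relu(margin - kl).mean()

# Usage on dummy logits: near-identical distributions incur ~the full margin.
full = torch.randn(4, 8, 100)
masked = full + 0.01 * torch.randn_like(full)
print(mlr_loss(full, masked))
```

In a full pipeline, this term would be added with a small weight to the student's rationale-generation loss, so that explanations that ignore the predicted label incur a penalty of up to the margin.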
