SpanDrop: Simple and Effective Counterfactual Learning for Long Sequences

08/03/2022
by   Peng Qi, et al.

Distilling supervision signals from a long sequence to make predictions is a challenging task in machine learning, especially when not all elements in the input sequence contribute equally to the desired output. In this paper, we propose SpanDrop, a simple and effective data augmentation technique that helps models identify the true supervision signal in a long sequence with very few examples. By directly manipulating the input sequence, SpanDrop randomly ablates parts of the sequence at a time and asks the model to perform the same task, emulating counterfactual learning and achieving input attribution. Based on a theoretical analysis of its properties, we also propose a variant of SpanDrop based on the beta-Bernoulli distribution, which yields diverse augmented sequences while providing a learning objective that is more consistent with the original dataset. We demonstrate the effectiveness of SpanDrop on a set of carefully designed toy tasks, as well as on various natural language processing tasks that require reasoning over long sequences to arrive at the correct answer, and show that it improves model performance both when data is scarce and when it is abundant.
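The core augmentation described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the per-span drop probability `p`, and the `Beta(alpha, beta)` parameters are all illustrative choices, and the input is assumed to be pre-segmented into spans (e.g. sentences or fixed-size token chunks).

```python
import random


def span_drop(spans, p=0.1, rng=None):
    """Bernoulli SpanDrop sketch: drop each span independently with
    probability p, asking the model to solve the same task on the
    ablated sequence. Guarantees at least one span survives.

    Illustrative only; `spans` is assumed pre-segmented.
    """
    rng = rng or random.Random()
    kept = [s for s in spans if rng.random() >= p]
    # Never return an empty sequence: fall back to one random span.
    return kept if kept else [rng.choice(spans)]


def beta_span_drop(spans, alpha=1.0, beta=9.0, rng=None):
    """Beta-Bernoulli variant sketch: sample a per-sequence drop rate
    p ~ Beta(alpha, beta) (mean alpha / (alpha + beta)), then drop spans
    Bernoulli(p). Varying p across examples yields more diverse
    augmented sequences than a single fixed drop rate.
    """
    rng = rng or random.Random()
    p = rng.betavariate(alpha, beta)
    return span_drop(spans, p=p, rng=rng)
```

In training, each augmented sequence would be paired with the original target, so the model learns to rely on the spans that actually carry the supervision signal rather than on any particular surface position.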

