Natural Perturbation for Robust Question Answering

04/09/2020
by Daniel Khashabi, et al.

While recent models have achieved human-level scores on many NLP datasets, we observe that they are considerably sensitive to small changes in input. As an alternative to the standard approach of addressing this issue by constructing training sets of completely new examples, we propose doing so via minimal perturbation of examples. Specifically, our approach involves first collecting a set of seed examples and then applying human-driven natural perturbations (as opposed to rule-based machine perturbations), which often change the gold label as well. Local perturbations have the advantage of being easier (and hence cheaper) to create than completely new examples. To evaluate the impact of this phenomenon, we consider a recent question-answering dataset (BoolQ) and study the benefit of our approach as a function of the perturbation cost ratio, i.e., the relative cost of perturbing an existing question vs. creating a new one from scratch. We find that when natural perturbations are moderately cheaper to create, it is more effective to train models using them: such models exhibit higher robustness and better generalization, while retaining performance on the original BoolQ dataset.
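The role of the perturbation cost ratio can be illustrated with a small budget calculation. The sketch below is not taken from the paper: the function name, the budget unit (the cost of writing one new question), and the assumption that every seed question receives the same number of perturbations are all hypothetical choices made for illustration.

```python
import math


def dataset_sizes(budget, cost_ratio, perturbations_per_seed):
    """Compare how many training examples a fixed budget buys.

    All parameters are illustrative assumptions, not values from the paper:
      budget                 -- total budget, in units of "one new question"
      cost_ratio             -- cost of one perturbation / cost of one new question
      perturbations_per_seed -- perturbations written for each seed question
    """
    # Option A: spend the whole budget on questions written from scratch.
    new_only = math.floor(budget)

    # Option B: write seed questions, then perturb each one.
    # One seed plus its perturbations costs 1 + k * cost_ratio.
    cost_per_seed_group = 1 + perturbations_per_seed * cost_ratio
    seeds = math.floor(budget / cost_per_seed_group)
    perturbed_total = seeds * (1 + perturbations_per_seed)

    return {"new_only": new_only, "seeds": seeds, "seeds_plus_perturbations": perturbed_total}


if __name__ == "__main__":
    for ratio in (0.5, 1.0):
        print(ratio, dataset_sizes(budget=1000, cost_ratio=ratio, perturbations_per_seed=2))
```

Under these assumptions, a ratio below 1 yields more total examples from perturbation than from writing new questions, while a ratio of 1 yields no size advantage. The abstract's claim concerns model robustness and generalization, which this arithmetic does not measure.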


research
11/26/2020

Learning from Lexical Perturbations for Consistent Visual Question Answering

Existing Visual Question Answering (VQA) models are often fragile and se...
research
02/28/2022

Improving Lexical Embeddings for Robust Question Answering

Recent techniques in Question Answering (QA) have gained remarkable perf...
research
09/07/2018

Trick Me If You Can: Adversarial Writing of Trivia Challenge Questions

Modern natural language processing systems have been touted as approachi...
research
10/08/2021

A Few More Examples May Be Worth Billions of Parameters

We investigate the dynamics of increasing the number of model parameters...
research
07/29/2021

Break, Perturb, Build: Automatic Perturbation of Reasoning Paths through Question Decomposition

Recent efforts to create challenge benchmarks that test the abilities of...
research
03/27/2021

Improving Model Robustness by Adaptively Correcting Perturbation Levels with Active Queries

In addition to high accuracy, robustness is becoming increasingly import...
research
11/29/2022

Penalizing Confident Predictions on Largely Perturbed Inputs Does Not Improve Out-of-Distribution Generalization in Question Answering

Question answering (QA) models are shown to be insensitive to large pert...
