HypoNLI: Exploring the Artificial Patterns of Hypothesis-only Bias in Natural Language Inference

03/05/2020
by   Tianyu Liu, et al.

Many recent studies have shown that models trained on natural language inference (NLI) datasets can make correct predictions by looking only at the hypothesis while completely ignoring the premise. In this work, we derive adversarial examples that target this hypothesis-only bias and explore ways to mitigate it. Specifically, we extract various phrases from the hypotheses in the training sets (artificial patterns) and show that they are strong indicators of specific labels. We then identify 'hard' and 'easy' instances in the original test sets, whose labels are respectively opposite to or consistent with those indications. We also set up baselines including both pretrained models (BERT, RoBERTa, XLNet) and competitive non-pretrained models (InferSent, DAM, ESIM). Apart from the benchmark and baselines, we also investigate two debiasing approaches that exploit artificial pattern modeling to mitigate the hypothesis-only bias: down-sampling and adversarial training. We believe these methods can be treated as competitive baselines in NLI debiasing tasks.
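To make the idea of pattern-based 'hard' and 'easy' instances concrete, here is a minimal sketch. It is not the authors' extraction procedure: it assumes single lowercase tokens stand in for the extracted phrases, and the function names, the `min_count` and `threshold` parameters, and the toy sentences are all hypothetical choices made for illustration.

```python
from collections import Counter, defaultdict


def pattern_label_stats(train, min_count=2):
    """Estimate the dominant label and its share for each hypothesis token.

    Assumption: unigrams act as 'artificial patterns'; the paper extracts
    phrases, which this sketch does not attempt to reproduce.
    """
    token_label = defaultdict(Counter)
    for hypothesis, label in train:
        # Count each token once per hypothesis.
        for tok in set(hypothesis.lower().split()):
            token_label[tok][label] += 1
    stats = {}
    for tok, counts in token_label.items():
        total = sum(counts.values())
        if total >= min_count:
            label, n = counts.most_common(1)[0]
            stats[tok] = (label, n / total)
    return stats


def split_hard_easy(test_set, stats, threshold=0.9):
    """Split test instances by agreement with the pattern indications.

    'easy': every strong pattern in the hypothesis points to the gold label.
    'hard': no strong pattern points to the gold label.
    Instances with no strong pattern, or with mixed indications, are skipped.
    """
    easy, hard = [], []
    for hypothesis, gold in test_set:
        indicated = {
            stats[tok][0]
            for tok in set(hypothesis.lower().split())
            if tok in stats and stats[tok][1] >= threshold
        }
        if not indicated:
            continue
        if indicated == {gold}:
            easy.append((hypothesis, gold))
        elif gold not in indicated:
            hard.append((hypothesis, gold))
    return easy, hard


# Toy (hypothesis, label) pairs; premises are omitted on purpose, since the
# bias under study depends on the hypothesis alone.
train = [
    ("Nobody is sleeping", "contradiction"),
    ("Nobody is eating", "contradiction"),
    ("A person is outdoors", "entailment"),
    ("A human is outdoors", "entailment"),
]
test_set = [
    ("Nobody is dancing", "contradiction"),
    ("Nobody is moving", "entailment"),
    ("The dog runs", "neutral"),
]

stats = pattern_label_stats(train)
easy, hard = split_hard_easy(test_set, stats)
```

In this toy setup the token "nobody" only ever co-occurs with the contradiction label, so a test hypothesis containing it lands in `easy` when its gold label is contradiction and in `hard` otherwise. A down-sampling variant could, under the same assumptions, drop training instances whose labels agree with strong patterns until the label distribution per pattern is closer to uniform.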


