Automatically Identifying Semantic Bias in Crowdsourced Natural Language Inference Datasets

12/16/2021
by Michael Saxon, et al.

Natural language inference (NLI) is an important task for producing useful models of human language. Unfortunately, large-scale NLI dataset production relies on crowdworkers, who are prone to introducing biases into the sentences they write. In particular, without quality control they produce hypotheses from which the relational label can be predicted better than chance without access to the premise. We introduce a model-driven, unsupervised technique for finding "bias clusters" in a learned embedding space of the hypotheses in NLI datasets. These clusters can then guide interventions and additional rounds of labeling to ameliorate the semantic bias of a dataset's hypothesis distribution.
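To make the idea of a "bias cluster" concrete, here is a minimal sketch of one way a label-skewed cluster could be flagged. It is not the authors' method. It assumes hypothesis embeddings have already been clustered by some means, so each hypothesis arrives with a cluster ID. The function name `find_bias_clusters`, the `min_size` and `purity_threshold` parameters, and the toy inputs are all hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict


def find_bias_clusters(cluster_ids, labels, min_size=3, purity_threshold=0.8):
    """Flag clusters whose hypotheses mostly share one relational label.

    Illustrative only: `min_size` and `purity_threshold` are assumed
    parameters, not values taken from the paper.
    """
    # Count how often each label occurs inside each cluster.
    label_counts = defaultdict(Counter)
    for cluster_id, label in zip(cluster_ids, labels):
        label_counts[cluster_id][label] += 1

    flagged = {}
    for cluster_id, counts in label_counts.items():
        size = sum(counts.values())
        if size < min_size:
            continue  # too few hypotheses to judge skew
        top_label, top_count = counts.most_common(1)[0]
        purity = top_count / size  # share of the majority label
        if purity >= purity_threshold:
            flagged[cluster_id] = (top_label, purity, size)
    return flagged


# Toy, made-up inputs: cluster assignments and gold labels per hypothesis.
cluster_ids = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2]
labels = (
    ["contradiction"] * 4
    + ["entailment", "neutral", "contradiction", "entailment"]
    + ["neutral", "neutral"]
)
flagged = find_bias_clusters(cluster_ids, labels)
print(flagged)
```

In this toy input, only cluster 0 is flagged: every hypothesis in it carries the same label, so a model could guess that label from the hypothesis alone. Cluster 1 is mixed and cluster 2 is below the assumed minimum size.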

research
03/06/2018

Annotation Artifacts in Natural Language Inference Data

Large-scale datasets for natural language inference are created by prese...
research
07/09/2019

On Adversarial Removal of Hypothesis-only Bias in Natural Language Inference

Popular Natural Language Inference (NLI) datasets have been shown to be ...
research
09/13/2017

Natural Language Inference over Interaction Space

Natural Language Inference (NLI) task requires an agent to determine the...
research
08/28/2019

Unlearn Dataset Bias in Natural Language Inference by Fitting the Residual

Statistical natural language inference (NLI) models are susceptible to l...
research
02/07/2022

Diversify and Disambiguate: Learning From Underspecified Data

Many datasets are underspecified, which means there are several equally ...
research
06/02/2021

MedNLI Is Not Immune: Natural Language Inference Artifacts in the Clinical Domain

Crowdworker-constructed natural language inference (NLI) datasets have b...
research
05/16/2022

Assessing the Limits of the Distributional Hypothesis in Semantic Spaces: Trait-based Relational Knowledge and the Impact of Co-occurrences

The increase in performance in NLP due to the prevalence of distribution...
