MedNLI Is Not Immune: Natural Language Inference Artifacts in the Clinical Domain

06/02/2021
by Christine Herlihy, et al.

Crowdworker-constructed natural language inference (NLI) datasets have been found to contain statistical artifacts associated with the annotation process that allow hypothesis-only classifiers to achieve better-than-random performance (Poliak et al., 2018; Gururangan et al., 2018; Tsuchiya, 2018). We investigate whether MedNLI (Romanov and Shivade, 2018), a physician-annotated dataset with premises extracted from clinical notes, contains such artifacts. We find that entailed hypotheses contain generic versions of specific concepts in the premise, as well as modifiers related to responsiveness, duration, and probability. Neutral hypotheses feature conditions and behaviors that co-occur with, or cause, the condition(s) in the premise. Contradiction hypotheses feature explicit negation of the premise and implicit negation via assertion of good health. Adversarial filtering shows that model performance degrades when evaluated on the difficult subset. We provide partition information and recommendations for alternative dataset construction strategies for knowledge-intensive domains.
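To make the notion of a hypothesis-only classifier concrete, the following is a minimal sketch, assuming a bag-of-words multinomial naive Bayes model written with the Python standard library. It is not the classifier used in the paper, which the abstract does not specify. The function names and the toy sentences are hypothetical; the sentences are invented for illustration and are not drawn from MedNLI. The point is only that the model never sees a premise, so any above-chance accuracy it reaches must come from regularities in the hypotheses themselves.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Lowercased whitespace tokenization; an assumption made for simplicity.
    return text.lower().split()


def train_hypothesis_only(examples, alpha=1.0):
    """Fit multinomial naive Bayes on (hypothesis, label) pairs.

    The premise is deliberately absent from the input.
    """
    label_counts = Counter()
    token_counts = defaultdict(Counter)
    vocab = set()
    for hypothesis, label in examples:
        label_counts[label] += 1
        for tok in tokenize(hypothesis):
            token_counts[label][tok] += 1
            vocab.add(tok)
    return {"labels": label_counts, "tokens": token_counts,
            "vocab": vocab, "alpha": alpha}


def predict(model, hypothesis):
    """Return the label with the highest log-posterior for one hypothesis."""
    total = sum(model["labels"].values())
    vocab_size = len(model["vocab"])
    best_label, best_score = None, -math.inf
    for label in sorted(model["labels"]):
        score = math.log(model["labels"][label] / total)
        denom = sum(model["tokens"][label].values()) + model["alpha"] * vocab_size
        for tok in tokenize(hypothesis):
            if tok in model["vocab"]:  # tokens unseen in training are skipped
                count = model["tokens"][label][tok]
                score += math.log((count + model["alpha"]) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Invented toy hypotheses (NOT from MedNLI) that mimic the cue types
# described above: negation, generic restatement, co-occurring conditions.
TOY_TRAIN = [
    ("the patient has no chest pain", "contradiction"),
    ("the patient is healthy", "contradiction"),
    ("the patient has a chronic condition", "entailment"),
    ("the patient likely has an infection", "entailment"),
    ("the patient has a history of smoking", "neutral"),
    ("the patient has diabetes", "neutral"),
]

model = train_hypothesis_only(TOY_TRAIN)
print(predict(model, "the patient has no fever"))
print(predict(model, "the patient has a history of asthma"))
```

On real data, one would compare the accuracy of such a model on a held-out split against the majority-class baseline; a clear gap suggests that the hypotheses carry label information independent of the premises.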


