SWAG: A Large-Scale Adversarial Dataset for Grounded Commonsense Inference

08/16/2018
by Rowan Zellers, et al.

Given a partial description like "she opened the hood of the car," humans can reason about the situation and anticipate what might come next ("then, she examined the engine"). In this paper, we introduce the task of grounded commonsense inference, unifying natural language inference and commonsense reasoning. We present SWAG, a new dataset with 113k multiple choice questions about a rich spectrum of grounded situations. To address the recurring challenges of the annotation artifacts and human biases found in many existing datasets, we propose Adversarial Filtering (AF), a novel procedure that constructs a de-biased dataset by iteratively training an ensemble of stylistic classifiers, and using them to filter the data. To account for the aggressive adversarial filtering, we use state-of-the-art language models to massively oversample a diverse set of potential counterfactuals. Empirical results demonstrate that while humans can solve the resulting inference problems with high accuracy (88%), various competitive models struggle on our task. We provide comprehensive analysis that indicates significant opportunities for future research.
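The filtering loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract only states that an ensemble of stylistic classifiers is trained iteratively and used to filter the data, so everything else here is an assumption made for illustration: the names `train_style_scorer` and `adversarial_filter`, the single unigram log-odds scorer standing in for the ensemble, the random half/half split, and the rule of keeping the candidates the scorer finds most human-like.

```python
import math
import random
from collections import Counter


def train_style_scorer(real_endings, fake_endings):
    """Fit a toy unigram log-odds scorer (a hypothetical stand-in for the
    stylistic classifiers). A higher score means "looks more human-written"."""
    real_counts = Counter(w for e in real_endings for w in e.split())
    fake_counts = Counter(w for e in fake_endings for w in e.split())
    vocab_size = len(set(real_counts) | set(fake_counts))
    # Add-one smoothing so unseen words do not produce log(0).
    real_total = sum(real_counts.values()) + vocab_size + 1
    fake_total = sum(fake_counts.values()) + vocab_size + 1

    def score(ending):
        return sum(
            math.log((real_counts[w] + 1) / real_total)
            - math.log((fake_counts[w] + 1) / fake_total)
            for w in ending.split()
        )

    return score


def adversarial_filter(examples, num_negatives=3, iterations=10, seed=0):
    """Pick hard-to-detect wrong endings for each example.

    examples: list of dicts with 'real' (the true ending) and 'pool'
    (oversampled machine-generated endings, at least num_negatives long).
    Returns one list of chosen negatives per example.
    """
    rng = random.Random(seed)
    # Start from a random assignment of negatives.
    assigned = [rng.sample(ex["pool"], num_negatives) for ex in examples]
    for _ in range(iterations):
        indices = list(range(len(examples)))
        rng.shuffle(indices)
        half = len(indices) // 2
        train, held_out = indices[:half], indices[half:]
        # Train the scorer on one half: real endings vs. current negatives.
        score = train_style_scorer(
            [examples[i]["real"] for i in train],
            [neg for i in train for neg in assigned[i]],
        )
        # On the other half, keep the candidates the scorer is fooled by most.
        for i in held_out:
            ranked = sorted(examples[i]["pool"], key=score, reverse=True)
            assigned[i] = ranked[:num_negatives]
    return assigned
```

Calling `adversarial_filter(examples)` on a list of such dictionaries returns the selected distractors. The large candidate pool is what makes aggressive filtering possible, which matches the abstract's motivation for oversampling counterfactuals with language models.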


research
10/22/2022

DiscoSense: Commonsense Reasoning with Discourse Connectives

We present DiscoSense, a benchmark for commonsense reasoning via underst...
research
05/19/2019

HellaSwag: Can a Machine Really Finish Your Sentence?

Recent work by Zellers et al. (2018) introduced a new task of commonsens...
research
08/15/2019

Abductive Commonsense Reasoning

Abductive reasoning is inference to the most plausible explanation. For ...
research
04/22/2019

SocialIQA: Commonsense Reasoning about Social Interactions

We introduce SocialIQa, the first large-scale benchmark for commonsense ...
research
11/27/2018

From Recognition to Cognition: Visual Commonsense Reasoning

Visual understanding goes well beyond object recognition. With one glanc...
research
07/24/2019

WINOGRANDE: An Adversarial Winograd Schema Challenge at Scale

The Winograd Schema Challenge (WSC), proposed by Levesque et al. (2011) ...
research
05/23/2023

MathDial: A Dialogue Tutoring Dataset with Rich Pedagogical Properties Grounded in Math Reasoning Problems

Although automatic dialogue tutors hold great potential in making educat...
