On the Evaluation of Common-Sense Reasoning in Natural Language Understanding

11/05/2018
by Paul Trichelair et al.

The NLP and ML communities have long been interested in developing models capable of common-sense reasoning, and recent work has significantly improved the state of the art on benchmarks such as the Winograd Schema Challenge (WSC). Despite these advances, the complexity of tasks designed to test common-sense reasoning remains under-analyzed. In this paper, we present a case study of the Winograd Schema Challenge and, based on two new measures of instance-level complexity, design a protocol that both clarifies and qualifies the results of previous work. Our protocol accounts for the WSC's limited size and variable instance difficulty, properties common to other common-sense benchmarks. Accounting for these properties when assessing model results may prevent unjustified conclusions.
