In this study, we create a CConS (Counter-commonsense Contextual Size co...
Natural language understanding (NLU) studies often exaggerate or underes...
To explain the predicted answers and evaluate the reasoning abilities of...
Image captioning models require the high-level generalization ability to...
Question answering (QA) models for reading comprehension tend to learn s...
Question answering (QA) models are shown to be insensitive to large pert...
Debiasing language models from unwanted behaviors in Natural Language Un...
Extractive question answering (QA) models tend to exploit spurious corre...
Several multi-hop reading comprehension datasets have been proposed to r...
The possible consequences for the same context may vary depending on the...
The issue of shortcut learning is widely known in NLP and has been an im...
For a natural language understanding benchmark to be useful in research,...
Question answering (QA) models for reading comprehension have been demon...
Natural Language Inference (NLI) datasets contain examples with highly a...
Crowdsourcing is widely used to create data for common natural language ...
A multi-hop question answering (QA) dataset aims to test reasoning and i...
Machine reading comprehension (MRC) has received considerable attention ...
Existing analysis work in machine reading comprehension (MRC) is largely...
A challenge in creating a dataset for machine reading comprehension (MRC...