Large-scale Cloze Test Dataset Designed by Teachers

11/09/2017
by Qizhe Xie, et al.

Cloze tests are widely adopted in language exams to evaluate students' language proficiency. In this paper, we propose CLOTH, the first large-scale human-designed cloze test dataset, in which the questions were used in middle-school and high-school language exams. With the missing blanks carefully created by teachers and candidate choices purposely designed to be confusing, CLOTH requires a deeper language understanding and a wider attention span than previous automatically generated cloze datasets. We show that humans outperform specifically designed baseline models by a significant margin, even when the model is trained on sufficiently large external data. We investigate the source of the performance gap, trace model deficiencies to some distinct properties of CLOTH, and identify the limited ability to comprehend long-term context as the key bottleneck.

