
Qualitative and Quantitative Analysis of Diversity in Cross-document Coreference Resolution Datasets

by Anastasia Zhukova, et al.

Cross-document coreference resolution (CDCR) datasets, such as ECB+, contain manually annotated event-centric mentions of events and entities that form coreference chains with identity relations. ECB+ is a state-of-the-art CDCR dataset that focuses on the resolution of events and their descriptive attributes, i.e., actors, location, and date-time. NewsWCL50 is a dataset that annotates coreference chains of both events and entities with a strong variance in word choice and more loosely related coreference anaphora, e.g., bridging or near-identity relations. In this paper, we qualitatively and quantitatively compare the annotation schemes of ECB+ and NewsWCL50 using multiple criteria. We propose a phrasing diversity metric (PD) that compares lexical diversity within coreference chains at a more detailed level than previously proposed metrics, e.g., the number of unique lemmas. We discuss the different tasks that the two CDCR datasets create, i.e., lexical disambiguation and lexical diversity challenges, and propose a direction for further CDCR evaluation.
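
To make the baseline mentioned above concrete, the following is a minimal sketch of counting unique lemmas within a coreference chain. It does not implement the PD metric, whose definition is not given in this abstract. The function names, the example chains, and the `naive_lemma` stand-in (lowercasing plus punctuation stripping instead of a real lemmatizer) are all assumptions made for illustration.

```python
from typing import Callable, Iterable


def naive_lemma(token: str) -> str:
    # Assumption: a crude stand-in for a real lemmatizer.
    # It only lowercases the token and strips surrounding punctuation.
    return token.lower().strip(".,;:!?\"'()")


def unique_lemma_count(
    chain: Iterable[str],
    lemmatize: Callable[[str], str] = naive_lemma,
) -> int:
    """Count distinct lemmas over all mentions of one coreference chain."""
    lemmas = set()
    for mention in chain:
        for token in mention.split():
            lemma = lemmatize(token)
            if lemma:  # skip tokens that were pure punctuation
                lemmas.add(lemma)
    return len(lemmas)


# Hypothetical chains, invented for illustration (not taken from either dataset).
chain_a = ["the earthquake", "The earthquake", "earthquake"]
chain_b = ["the caravan", "migrants", "asylum seekers", "the group of people"]

print(unique_lemma_count(chain_a))
print(unique_lemma_count(chain_b))
```

A chain whose mentions repeat the same wording yields a low count, while a chain with varied word choice yields a higher one. A count like this ignores how the lemmas are distributed across mentions, which is one reason a more detailed measure could be useful.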



XCoref: Cross-document Coreference Resolution in the Wild

Datasets and methods for cross-document coreference resolution (CDCR) fo...

Cross-document Event Identity via Dense Annotation

In this paper, we study the identity of textual events from different do...

NEREL: A Russian Dataset with Nested Named Entities, Relations and Events

In this paper, we present NEREL, a Russian dataset for named entity reco...

Identity and Granularity of Events in Text

In this paper we describe a method to detect event descriptions in dif...

Event Coreference Resolution by Iteratively Unfolding Inter-dependencies among Events

We introduce a novel iterative approach for event coreference resolution...

Realistic Evaluation Principles for Cross-document Coreference Resolution

We point out that common evaluation practices for cross-document corefer...