Focus on what matters: Applying Discourse Coherence Theory to Cross Document Coreference

10/11/2021
by William Held, et al.

Performing event and entity coreference resolution across documents vastly increases the number of candidate mentions, making it intractable to perform all n^2 pairwise comparisons. Existing approaches simplify the problem by considering coreference only within document clusters, but this fails to handle inter-cluster coreference, which is common in many applications. As a result, cross-document coreference algorithms are rarely applied to downstream tasks. We draw on an insight from discourse coherence theory: potential coreferences are constrained by the reader's discourse focus. We model the entities/events in a reader's focus as a neighborhood within a learned latent embedding space which minimizes the distance between mentions and the centroids of their gold coreference clusters. We then use these neighborhoods to sample only hard negatives to train a fine-grained classifier on mention pairs and their local discourse features. Our approach achieves state-of-the-art results for both events and entities on the ECB+, Gun Violence, Football Coreference, and Cross-Domain Cross-Document Coreference corpora. Furthermore, training on multiple corpora improves average performance across all datasets by 17.2 F1 points, leading to a robust coreference resolution model for use in downstream tasks where link distribution is unknown.
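To make the neighborhood idea concrete, here is a minimal sketch of how candidate pairs could be restricted to each mention's nearest neighbors in an embedding space, and how the non-coreferent pairs inside those neighborhoods could serve as hard negatives. Everything in it is an illustrative assumption rather than the authors' implementation: the 2-D toy embeddings, the cluster labels, the neighborhood size `k`, and the function names `neighborhood_pairs` and `split_pairs` are all hypothetical, and the brute-force neighbor search is used only for brevity.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def neighborhood_pairs(embeddings, k):
    """Pair every mention with its k nearest neighbors (brute force)."""
    pairs = set()
    n = len(embeddings)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        others.sort(key=lambda j: euclidean(embeddings[i], embeddings[j]))
        for j in others[:k]:
            # Store each pair once, as (smaller index, larger index).
            pairs.add((min(i, j), max(i, j)))
    return pairs

def split_pairs(pairs, cluster_ids):
    """Split candidate pairs into coreferent pairs and hard negatives."""
    positives = sorted(p for p in pairs if cluster_ids[p[0]] == cluster_ids[p[1]])
    hard_negatives = sorted(p for p in pairs if cluster_ids[p[0]] != cluster_ids[p[1]])
    return positives, hard_negatives

# Hypothetical toy data: 7 mentions with made-up 2-D embeddings and gold clusters.
embeddings = [
    (0.0, 0.0), (0.1, 0.0),              # cluster "A"
    (0.3, 0.0), (0.2, 0.1),              # cluster "B", close to "A"
    (5.0, 5.0), (5.1, 5.0), (5.0, 5.2),  # cluster "C", far from both
]
cluster_ids = ["A", "A", "B", "B", "C", "C", "C"]

pairs = neighborhood_pairs(embeddings, k=2)
positives, hard_negatives = split_pairs(pairs, cluster_ids)

n = len(embeddings)
all_pairs = n * (n - 1) // 2
print(f"candidate pairs: {len(pairs)} of {all_pairs} possible")
print("positives:", positives)
print("hard negatives:", hard_negatives)
```

In this toy setup, the only negatives that survive are pairs between the two nearby clusters "A" and "B". Pairs involving the distant cluster "C" never reach the pairwise classifier, which is the intuition behind training on hard negatives only.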

