DocRED: A Large-Scale Document-Level Relation Extraction Dataset

06/14/2019
by   Yuan Yao, et al.

Multiple entities in a document generally exhibit complex inter-sentence relations, which existing relation extraction (RE) methods handle poorly because they typically focus on extracting intra-sentence relations for single entity pairs. To accelerate research on document-level RE, we introduce DocRED, a new dataset constructed from Wikipedia and Wikidata with three features: (1) DocRED annotates both named entities and relations, and is the largest human-annotated dataset for document-level RE from plain text; (2) DocRED requires reading multiple sentences in a document to extract entities and infer their relations by synthesizing all information in the document; (3) along with the human-annotated data, we also offer large-scale distantly supervised data, which enables DocRED to be adopted for both supervised and weakly supervised scenarios. To verify the challenges of document-level RE, we implement recent state-of-the-art RE methods and conduct a thorough evaluation of them on DocRED. Empirical results show that DocRED is challenging for existing RE methods, which indicates that document-level RE remains an open problem and requires further effort. Based on a detailed analysis of the experiments, we discuss multiple promising directions for future research.
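
To make the distinction between intra-sentence and inter-sentence relations concrete, here is a minimal Python sketch. The `Mention` and `Relation` classes, the field names, the relation labels, and the toy document are all assumptions invented for illustration; they are not DocRED's actual schema or data. The sketch treats a relation as inter-sentence when no single sentence mentions both its head and its tail entity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mention:
    name: str
    sent_id: int  # index of the sentence that contains this mention


@dataclass(frozen=True)
class Relation:
    head: int   # index of the head entity in the entity list
    tail: int   # index of the tail entity in the entity list
    label: str  # hypothetical relation name


def is_inter_sentence(entities, relation):
    """Return True if no sentence mentions both the head and the tail entity."""
    head_sents = {m.sent_id for m in entities[relation.head]}
    tail_sents = {m.sent_id for m in entities[relation.tail]}
    return head_sents.isdisjoint(tail_sents)


# Toy document, invented for this example.
sentences = [
    "Lena Berg is a composer.",
    "She was born in Oslo.",
    "Oslo is the capital of Norway.",
]

# Each entity is a list of its mentions across the document.
entities = [
    [Mention("Lena Berg", 0)],
    [Mention("Oslo", 1), Mention("Oslo", 2)],
    [Mention("Norway", 2)],
]

relations = [
    Relation(head=0, tail=1, label="place_of_birth"),
    Relation(head=1, tail=2, label="country"),
]

# Labels of the relations that cannot be read off a single sentence.
inter = [r.label for r in relations if is_inter_sentence(entities, r)]
print(inter)
```

In this toy document the birthplace relation spans two sentences, while the country relation is recoverable from the third sentence alone. A sentence-level extractor that considers one sentence at a time would never see the first pair together.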

research
03/07/2023

Document-level Relation Extraction with Cross-sentence Reasoning Graph

Relation extraction (RE) has recently moved from the sentence-level to d...
research
10/24/2018

FewRel: A Large-Scale Supervised Few-Shot Relation Classification Dataset with State-of-the-Art Evaluation

We present a Few-Shot Relation Classification Dataset (FewRel), consisti...
research
03/15/2021

Mention-centered Graph Neural Network for Document-level Relation Extraction

Document-level relation extraction aims to discover relations between en...
research
05/13/2020

Reasoning with Latent Structure Refinement for Document-Level Relation Extraction

Document-level relation extraction requires integrating information with...
research
02/28/2018

Simultaneously Self-Attending to All Mentions for Full-Abstract Biological Relation Extraction

Most work in relation extraction forms a prediction by looking at a shor...
research
04/06/2020

At Which Level Should We Extract? An Empirical Study on Extractive Document Summarization

Extractive methods have proven to be very effective in automatic documen...
research
10/31/2022

Semantic Novelty Detection and Characterization in Factual Text Involving Named Entities

Much of the existing work on text novelty detection has been studied at ...
