Predicting Document Coverage for Relation Extraction

11/26/2021
by Sneha Singhania, et al.

This paper presents a new task of predicting the coverage of a text document for relation extraction (RE): does the document contain many relational tuples for a given entity? Coverage predictions are useful in selecting the best documents for knowledge base construction with large input corpora. To study this problem, we present a dataset of 31,366 diverse documents for 520 entities. We analyze the correlation of document coverage with features like length, entity mention frequency, Alexa rank, language complexity, and information retrieval scores. Each of these features has only moderate predictive power. We employ methods combining features with statistical models like TF-IDF and language models like BERT. The model combining features and BERT, HERB, achieves an F1 score of up to 46%. We demonstrate the utility of coverage predictions on two use cases: KB construction and claim refutation.
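
The feature analysis described in the abstract can be illustrated with a minimal sketch. It assumes each document is a record holding a few numeric features and a coverage score in [0, 1], and it uses the Pearson coefficient as the correlation measure; the abstract does not say which measure the authors use. The toy records, the field names, and the helper names `pearson` and `feature_correlations` are hypothetical and are not taken from the paper or its dataset.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Hypothetical toy records: made-up values, NOT data from the paper.
documents = [
    {"length": 1200, "entity_mentions": 14, "coverage": 0.30},
    {"length": 300, "entity_mentions": 2, "coverage": 0.05},
    {"length": 4500, "entity_mentions": 31, "coverage": 0.40},
    {"length": 800, "entity_mentions": 20, "coverage": 0.35},
    {"length": 2500, "entity_mentions": 5, "coverage": 0.10},
]

# Assumed feature names, loosely following the features listed in the abstract.
FEATURES = ("length", "entity_mentions")


def feature_correlations(docs, features=FEATURES, target="coverage"):
    """Correlate each feature column with the target column."""
    target_values = [d[target] for d in docs]
    return {
        name: pearson([d[name] for d in docs], target_values)
        for name in features
    }


if __name__ == "__main__":
    for name, r in feature_correlations(documents).items():
        print(f"{name}: r = {r:+.2f}")
```

A rank-based measure such as Spearman's coefficient may suit skewed features like document length better; Pearson is used here only to keep the sketch short.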


