Enriching Knowledge Bases with Counting Quantifiers

07/10/2018
by Paramita Mirza, et al.

Information extraction traditionally focuses on extracting relations between identifiable entities, such as <Monterey, locatedIn, California>. Yet texts often also contain counting information, stating that a subject is in a specific relation with a number of objects without mentioning the objects themselves, for example, "California is divided into 58 counties". Such counting quantifiers can help in a variety of tasks, such as query answering or knowledge base curation, but have been neglected by prior work. This paper develops the first full-fledged system for extracting counting information from text, called CINEX. We employ distant supervision using fact counts from a knowledge base as training seeds, and develop novel techniques for dealing with several challenges: (i) non-maximal training seeds due to the incompleteness of knowledge bases, (ii) sparse and skewed observations in text sources, and (iii) high diversity of linguistic patterns. Experiments with five human-evaluated relations show that CINEX can achieve 60% average precision for extracting counting information. In a large-scale experiment, we demonstrate the potential for knowledge base enrichment by applying CINEX to 2,474 frequent relations in Wikidata. CINEX can assert the existence of 2.5M facts for 110 distinct relations, which is 28% more than the existing Wikidata facts for these relations.
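
The idea of using knowledge base fact counts as distant-supervision seeds can be illustrated with a minimal sketch. This is not the CINEX implementation, which the abstract does not detail; the toy triples, the relation name `containsCounty`, and the functions `fact_counts` and `label_number_mentions` are all hypothetical. The sketch only shows why incomplete knowledge bases yield non-maximal seeds: a number in the text can exceed the count of facts the knowledge base happens to contain.

```python
import re
from collections import Counter

# Hypothetical toy knowledge base of (subject, relation, object) triples.
# It is deliberately incomplete: only 3 of California's counties are listed.
KB_TRIPLES = [
    ("California", "containsCounty", "Monterey County"),
    ("California", "containsCounty", "Alameda County"),
    ("California", "containsCounty", "Kern County"),
]


def fact_counts(triples):
    """Count distinct objects the KB lists for each (subject, relation) pair."""
    return Counter((s, r) for s, r, _ in set(triples))


def label_number_mentions(sentence, subject, relation, counts):
    """Compare each integer in the sentence with the KB fact count.

    'match': equals the KB count, so it could serve as a positive seed.
    'above': exceeds the KB count, which may mean the KB is incomplete.
    'below': smaller than the KB count.
    Assumption: plain digit runs only; "2,474" or "fifty-eight" are not handled.
    """
    kb_count = counts[(subject, relation)]
    labels = []
    for token in re.findall(r"\d+", sentence):
        value = int(token)
        if value == kb_count:
            labels.append((value, "match"))
        elif value > kb_count:
            labels.append((value, "above"))
        else:
            labels.append((value, "below"))
    return labels


counts = fact_counts(KB_TRIPLES)
labels = label_number_mentions(
    "California is divided into 58 counties", "California", "containsCounty", counts
)
print(labels)
```

In this toy setup, the sentence from the abstract yields a number larger than the knowledge base count, so a strict exact-match rule would wrongly discard it as a training seed. That is one simple way to see the non-maximal-seed challenge the abstract names as (i).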
