GenericsKB: A Knowledge Base of Generic Statements

05/02/2020
by Sumithra Bhakthavatsalam, et al.

We present a new resource for the NLP community, namely a large (3.5M+ sentence) knowledge base of *generic statements*, e.g., "Trees remove carbon dioxide from the atmosphere", collected from multiple corpora. This is the first large resource to contain *naturally occurring* generic sentences, as opposed to extracted or crowdsourced triples, and it is thus rich in high-quality, general, semantically complete statements. All GenericsKB sentences are annotated with their topical term, surrounding context (sentences), and a (learned) confidence. We also release GenericsKB-Best (1M+ sentences), containing the best-quality generics in GenericsKB augmented with selected, synthesized generics from WordNet and ConceptNet. In tests on two existing datasets requiring multihop reasoning (OBQA and QASC), we find that using GenericsKB can result in higher scores and better explanations than using a much larger corpus. This demonstrates that GenericsKB can be a useful resource for NLP applications and can also provide data for linguistic studies of generics and their semantics. GenericsKB is available at https://allenai.org/data/genericskb.
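To make the annotation scheme concrete, here is a minimal sketch of how one annotated generic might be represented and filtered by confidence. The field names, the score values, the 0.5 threshold, and the second and third example sentences are illustrative assumptions; none of them reflect the released file format or actual GenericsKB scores.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Generic:
    """One annotated generic sentence (hypothetical field names)."""
    sentence: str
    term: str            # topical term the sentence is about
    confidence: float    # learned score, assumed here to lie in [0, 1]
    context: List[str] = field(default_factory=list)  # surrounding sentences


def best_for_term(records, term, min_confidence=0.5):
    """Return records about `term` at or above the threshold, best first."""
    hits = [
        r for r in records
        if r.term.lower() == term.lower() and r.confidence >= min_confidence
    ]
    return sorted(hits, key=lambda r: r.confidence, reverse=True)


# Toy records; the scores are made up for illustration.
records = [
    Generic("Trees remove carbon dioxide from the atmosphere.", "tree", 0.9),
    Generic("Trees are in the park.", "tree", 0.2),
    Generic("Dogs have whiskers.", "dog", 0.8),
]

for r in best_for_term(records, "tree"):
    print(f"{r.confidence:.2f}\t{r.sentence}")
```

Thresholding on the confidence in this way is in the spirit of how a best-quality subset such as GenericsKB-Best might be selected, though the actual selection procedure is not described in this abstract.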


