TiFi: Taxonomy Induction for Fictional Domains [Extended version]

01/29/2019
by Cuong Xuan Chu, et al.

Taxonomies are important building blocks of structured knowledge bases, and their construction from text sources and Wikipedia has received much attention. In this paper we focus on the construction of taxonomies for fictional domains, using noisy category systems from fan wikis or text extraction as input. Such fictional domains are archetypes of entity universes that are poorly covered by Wikipedia, as are enterprise-specific knowledge bases and highly specialized verticals. Our fiction-targeted approach, called TiFi, consists of three phases: (i) category cleaning, which identifies candidate categories that truly represent classes in the domain of interest, (ii) edge cleaning, which selects subcategory relationships that correspond to class subsumption, and (iii) top-level construction, which maps classes onto a subset of high-level WordNet categories. A comprehensive evaluation shows that TiFi constructs taxonomies for a diverse range of fictional domains, such as Lord of the Rings, The Simpsons, and Greek Mythology, with very high precision, and that it outperforms state-of-the-art baselines for taxonomy induction by a substantial margin.
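The three phases can be pictured as a small pipeline over a noisy category graph. The following is a minimal sketch under stated assumptions, not the method evaluated in the paper: the plural-word and head-word heuristics, the `TOP_LEVEL` lookup table, and all function names are hypothetical stand-ins chosen only to make the data flow concrete.

```python
from typing import Dict, List, Set, Tuple

Edge = Tuple[str, str]  # (subcategory, supercategory)

# Hypothetical stand-in for a small subset of high-level WordNet categories.
TOP_LEVEL: Dict[str, str] = {
    "wizards": "living_thing",
    "swords": "object",
}


def head(category: str) -> str:
    """Toy head-word guess (assumption): the lowercased last word."""
    return category.split()[-1].lower()


def clean_categories(categories: List[str]) -> Set[str]:
    """Phase (i): keep categories that look like classes.

    Toy heuristic (assumption): a plural last word suggests a class.
    """
    return {c for c in categories if head(c).endswith("s")}


def clean_edges(edges: List[Edge], classes: Set[str]) -> List[Edge]:
    """Phase (ii): keep edges that look like class subsumption.

    Toy heuristic (assumption): both ends are classes and share a head word.
    """
    return [
        (child, parent)
        for child, parent in edges
        if child in classes and parent in classes and head(child) == head(parent)
    ]


def add_top_level(classes: Set[str], edges: List[Edge]) -> List[Edge]:
    """Phase (iii): attach parentless classes to a top-level category."""
    has_parent = {child for child, _ in edges}
    roots = sorted(classes - has_parent)
    top_edges = [(r, TOP_LEVEL[head(r)]) for r in roots if head(r) in TOP_LEVEL]
    return sorted(edges + top_edges)


def build_taxonomy(categories: List[str], edges: List[Edge]) -> List[Edge]:
    classes = clean_categories(categories)
    return add_top_level(classes, clean_edges(edges, classes))


# Invented example input resembling a noisy fan-wiki category system.
categories = ["Wizards", "Istari wizards", "Swords", "Elven swords",
              "Articles needing cleanup", "Gandalf"]
edges = [("Istari wizards", "Wizards"), ("Elven swords", "Swords"),
         ("Swords", "Articles needing cleanup"), ("Elven swords", "Wizards")]

taxonomy = build_taxonomy(categories, edges)
for child, parent in taxonomy:
    print(f"{child} -> {parent}")
```

In this toy input, the meta category "Articles needing cleanup" and the instance-like category "Gandalf" are dropped in phase (i), the edge from "Elven swords" to "Wizards" is dropped in phase (ii), and the remaining roots are attached to the hypothetical top-level categories in phase (iii). A realistic system would replace each heuristic with a learned or better-grounded component.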

