TaxoGen: Unsupervised Topic Taxonomy Construction by Adaptive Term Embedding and Clustering

12/22/2018
by Chao Zhang, et al.

Taxonomy construction is not only a fundamental task for semantic analysis of text corpora, but also an important step for applications such as information filtering, recommendation, and Web search. Existing pattern-based methods extract hypernym-hyponym term pairs and then organize these pairs into a taxonomy. However, by treating each term as an independent concept node, they overlook the topical proximity and semantic correlations among terms. In this paper, we propose a method for constructing topic taxonomies, wherein every node represents a conceptual topic and is defined as a cluster of semantically coherent concept terms. Our method, TaxoGen, uses term embeddings and hierarchical clustering to construct a topic taxonomy recursively. To ensure the quality of the recursive process, TaxoGen consists of two modules: (1) an adaptive spherical clustering module that allocates terms to the proper levels when splitting a coarse topic into fine-grained ones; and (2) a local embedding module that learns term embeddings which maintain strong discriminative power at different levels of the taxonomy. Our experiments on two real datasets demonstrate the effectiveness of TaxoGen compared with baseline methods.

research
10/13/2020

CoRel: Seed-Guided Topical Taxonomy Construction by Concept Learning and Relation Transferring

Taxonomy is not only a fundamental form of knowledge representation, but...
research
10/17/2019

HiExpan: Task-Guided Taxonomy Construction by Hierarchical Tree Expansion

Taxonomies are of great value to many knowledge-rich applications. As th...
research
08/10/2023

Adaptive Taxonomy Learning and Historical Patterns Modelling for Patent Classification

Patent classification aims to assign multiple International Patent Class...
research
04/01/2022

Automatic Biomedical Term Clustering by Learning Fine-grained Term Representations

Term clustering is important in biomedical knowledge graph construction....
research
10/11/2019

Finding Interpretable Concept Spaces in Node Embeddings using Knowledge Bases

In this paper we propose and study the novel problem of explaining node ...
research
05/18/2023

Taxonomy Completion with Probabilistic Scorer via Box Embedding

Taxonomy completion, a task aimed at automatically enriching an existing...
research
06/05/2021

Enhancing Taxonomy Completion with Concept Generation via Fusing Relational Representations

Automatic construction of a taxonomy supports many applications in e-com...
