SemRe-Rank: Incorporating Semantic Relatedness to Improve Automatic Term Extraction Using Personalized PageRank

by Ziqi Zhang, et al.

Automatic Term Extraction (ATE) deals with the extraction of terminology from a domain-specific corpus, and has long been an established research area in data and knowledge acquisition. ATE remains a challenging task, as it is known that no existing method consistently outperforms the others in all domains. This work adopts a different strategy towards this problem: we propose to 'enhance' existing ATE methods instead of 'replacing' them. We introduce SemRe-Rank, a generic method based on the concept of incorporating semantic relatedness - an often overlooked avenue - into an existing ATE method to further improve its performance. SemRe-Rank applies a personalized PageRank process to a semantic relatedness graph of words to compute their 'semantic importance' scores, which are then used to revise the scores of term candidates computed by a base ATE algorithm. Extensively evaluated with 13 state-of-the-art ATE methods on four datasets of diverse nature, SemRe-Rank is shown to achieve widespread improvement over all methods and across all datasets. The best performing variants of SemRe-Rank have achieved, on some datasets, an improvement of 0.15 (on a scale of 0 to 1.0) in terms of the precision in the top-ranked K term candidates, and an improvement of 0.28 in terms of overall F1.
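The pipeline the abstract describes (a personalized PageRank over a word graph, followed by rescoring of term candidates) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `personalized_pagerank` and `rescore`, the unweighted undirected graph, the choice of seed words, the damping value, and in particular the rule that multiplies a base score by one plus the mean word importance are all assumptions made here for illustration, since the abstract does not specify them.

```python
from collections import defaultdict


def personalized_pagerank(edges, seeds, damping=0.85, iterations=50):
    """Power-iteration personalized PageRank on an undirected, unweighted graph.

    edges: iterable of (word, word) pairs judged semantically related (assumed input).
    seeds: words that receive all of the teleport probability (assumed input).
    """
    neighbours = defaultdict(set)
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)
    nodes = sorted(neighbours)

    seeds_in_graph = [s for s in seeds if s in neighbours]
    if not seeds_in_graph:
        raise ValueError("at least one seed word must appear in the graph")

    # Personalization: teleport mass is spread over the seed words only.
    teleport = {n: 0.0 for n in nodes}
    for s in seeds_in_graph:
        teleport[s] = 1.0 / len(seeds_in_graph)

    rank = dict(teleport)
    for _ in range(iterations):
        new_rank = {}
        for n in nodes:
            # Each neighbour passes on an equal share of its current rank.
            incoming = sum(rank[m] / len(neighbours[m]) for m in neighbours[n])
            new_rank[n] = (1.0 - damping) * teleport[n] + damping * incoming
        rank = new_rank
    return rank


def rescore(base_scores, word_importance):
    """Revise base ATE scores with word-level importance.

    Hypothetical combination rule (not taken from the paper):
    revised = base * (1 + mean importance of the term's words).
    """
    revised = {}
    for term, score in base_scores.items():
        words = term.split()
        mean_importance = sum(word_importance.get(w, 0.0) for w in words) / max(len(words), 1)
        revised[term] = score * (1.0 + mean_importance)
    return revised


if __name__ == "__main__":
    # Toy relatedness graph and base scores, invented for this example.
    edges = [("neural", "network"), ("network", "layer"), ("layer", "cake")]
    importance = personalized_pagerank(edges, seeds=["neural"])
    print(rescore({"neural network": 1.0, "layer cake": 1.0}, importance))
```

In this toy setup, candidates whose words lie close to the seed word in the graph gain more than candidates whose words lie far from it, which is the general effect the abstract attributes to the 'semantic importance' scores.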



Unsupervised Technical Domain Terms Extraction using Term Extractor

Terminology extraction, also known as term extraction, is a subtask of i...

Automatic Extraction of Agriculture Terms from Domain Text: A Survey of Tools and Techniques

Agriculture is a key component in any country's development. Domain-spec...

ATR4S: Toolkit with State-of-the-art Automatic Terms Recognition Methods in Scala

Automatically recognized terminology is widely used for various domain-s...

A Distributed Automatic Domain-Specific Multi-Word Term Recognition Architecture using Spark Ecosystem

Automatic Term Recognition is used to extract domain-specific terms that...

Modeling the dynamics of domain specific terminology in diachronic corpora

In terminology work, natural language processing, and digital humanities...

Topical Keyphrase Extraction with Hierarchical Semantic Networks

Topical keyphrase extraction is used to summarize large collections of t...

Interactive Concept Mining on Personal Data -- Bootstrapping Semantic Services

Semantic services (e.g. Semantic Desktops) are still afflicted by a cold...
