SemRe-Rank: Incorporating Semantic Relatedness to Improve Automatic Term Extraction Using Personalized PageRank

11/09/2017
by   Ziqi Zhang, et al.

Automatic Term Extraction (ATE) deals with the extraction of terminology from a domain-specific corpus, and has long been an established research area in data and knowledge acquisition. ATE remains a challenging task, as no existing method is known to consistently outperform the others in all domains. This work adopts a different strategy towards this problem: we propose to 'enhance' existing ATE methods instead of 'replacing' them. We introduce SemRe-Rank, a generic method based on the concept of incorporating semantic relatedness - an often overlooked avenue - into an existing ATE method to further improve its performance. SemRe-Rank applies a personalized PageRank process to a semantic relatedness graph of words to compute their 'semantic importance' scores, which are then used to revise the scores of term candidates computed by a base ATE algorithm. Extensively evaluated with 13 state-of-the-art ATE methods on four datasets of diverse nature, it is shown to have achieved widespread improvement over all methods and across all datasets. The best performing variants of SemRe-Rank have achieved, on some datasets, an improvement of 0.15 (on a scale of 0 to 1.0) in terms of the precision in the top ranked K term candidates, and an improvement of 0.28 in terms of overall F1.
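The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the relatedness graph is built, how seed words are chosen, or how the 'semantic importance' scores are combined with the base ATE scores. The function names `personalized_pagerank` and `revise_scores`, the toy edge weights, the seed set, and the multiplicative combination rule are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def personalized_pagerank(edges, seeds, damping=0.85, iterations=50):
    """Power iteration on an undirected, positively weighted word graph.

    `edges` maps (word_a, word_b) pairs to a relatedness weight (assumed input).
    The random walk teleports only to the words in `seeds` (assumed input).
    """
    neighbors = defaultdict(dict)
    for (u, v), weight in edges.items():
        neighbors[u][v] = weight
        neighbors[v][u] = weight
    nodes = sorted(neighbors)
    # Fall back to uniform teleportation if no seed word is in the graph.
    seed_nodes = [n for n in nodes if n in seeds] or nodes
    teleport = {n: (1.0 / len(seed_nodes) if n in seed_nodes else 0.0) for n in nodes}
    out_weight = {n: sum(neighbors[n].values()) for n in nodes}
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        new_rank = {}
        for n in nodes:
            # Each neighbor passes on its rank in proportion to the edge weight.
            incoming = sum(rank[m] * w / out_weight[m] for m, w in neighbors[n].items())
            new_rank[n] = (1.0 - damping) * teleport[n] + damping * incoming
        rank = new_rank
    return rank


def revise_scores(base_scores, word_importance):
    """Hypothetical combination rule: boost each candidate's base score by the
    average importance of its words, normalized by the largest importance."""
    top = max(word_importance.values())
    revised = {}
    for term, score in base_scores.items():
        words = term.split()
        average = sum(word_importance.get(w, 0.0) for w in words) / len(words)
        revised[term] = score * (1.0 + average / top)
    return revised


# Toy data, invented for illustration only.
edges = {
    ("cell", "membrane"): 0.8,
    ("cell", "protein"): 0.6,
    ("protein", "binding"): 0.7,
    ("page", "number"): 0.9,
    ("number", "binding"): 0.1,
}
importance = personalized_pagerank(edges, seeds={"cell"})
base_scores = {"cell membrane": 1.0, "page number": 1.0}
revised = revise_scores(base_scores, importance)
```

In this toy setup, two candidates with identical base scores are separated after revision: words close to the seed word in the relatedness graph accumulate more importance, so "cell membrane" ends up ranked above "page number". A library implementation such as NetworkX's `pagerank` with its `personalization` argument could replace the hand-written power iteration.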


research
01/22/2021

Unsupervised Technical Domain Terms Extraction using Term Extractor

Terminology extraction, also known as term extraction, is a subtask of i...
research
09/24/2020

Automatic Extraction of Agriculture Terms from Domain Text: A Survey of Tools and Techniques

Agriculture is a key component in any country's development. Domain-spec...
research
11/23/2016

ATR4S: Toolkit with State-of-the-art Automatic Terms Recognition Methods in Scala

Automatically recognized terminology is widely used for various domain-s...
research
05/24/2023

A Distributed Automatic Domain-Specific Multi-Word Term Recognition Architecture using Spark Ecosystem

Automatic Term Recognition is used to extract domain-specific terms that...
research
07/11/2017

Modeling the dynamics of domain specific terminology in diachronic corpora

In terminology work, natural language processing, and digital humanities...
research
10/17/2019

Topical Keyphrase Extraction with Hierarchical Semantic Networks

Topical keyphrase extraction is used to summarize large collections of t...
research
03/14/2019

Interactive Concept Mining on Personal Data -- Bootstrapping Semantic Services

Semantic services (e.g. Semantic Desktops) are still afflicted by a cold...
