Unsupervised Key-phrase Extraction and Clustering for Classification Scheme in Scientific Publications

01/25/2021
by Xiajing Li, et al.

Several methods have been explored for automating parts of Systematic Mapping (SM) and Systematic Review (SR) methodologies. Challenges typically revolve around gaps in the semantic understanding of text, as well as a lack of the domain and background knowledge necessary to bridge those gaps. In this paper we investigate possible ways of automating parts of the SM/SR process, namely extracting keywords and key-phrases from scientific documents using unsupervised methods, which are then used as a basis for constructing the corresponding Classification Scheme using semantic key-phrase clustering techniques. Specifically, we explore the effect of an ensemble score measure on key-phrase extraction, the use of semantic-network-based word embeddings to represent phrase semantics, and how clustering can be used to group related key-phrases. The evaluation is conducted on a dataset of publications pertaining to the domain of "Explainable AI", which we constructed using standard publicly available digital libraries and sets of indexing terms (keywords). Results show that the ensemble ranking score does improve key-phrase extraction performance. Semantic-network-based word embeddings built on the ConceptNet Semantic Network perform similarly to contextualized word embeddings, but the former are computationally more efficient. Finally, semantic key-phrase clustering at the term level can group similar terms together in a way that can be suitable for a classification scheme.
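
The abstract does not say which scores are combined or which clustering algorithm is used, so the following is only a toy sketch of the general idea, not the authors' method. It assumes two hypothetical per-phrase scorers whose min-max-normalized scores are averaged into an ensemble ranking, and made-up 2-dimensional phrase vectors standing in for real embeddings, grouped by a simple greedy cosine-similarity threshold. All function names, scores, vectors, and the threshold are illustrative assumptions.

```python
from math import sqrt


def min_max(scores):
    """Rescale a {phrase: score} dict to the range [0, 1]."""
    lo, hi = min(scores.values()), max(scores.values())
    span = hi - lo
    return {p: (s - lo) / span if span else 0.0 for p, s in scores.items()}


def ensemble_rank(*scorers):
    """Average normalized scores across scorers; a missing phrase counts as 0."""
    normalized = [min_max(s) for s in scorers]
    phrases = set().union(*normalized)
    combined = {
        p: sum(n.get(p, 0.0) for n in normalized) / len(normalized)
        for p in phrases
    }
    # Highest combined score first; ties broken alphabetically.
    return sorted(combined, key=lambda p: (-combined[p], p))


def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def cluster(vectors, threshold=0.8):
    """Greedy clustering: join the first cluster whose seed phrase is similar enough."""
    clusters = []
    for phrase, vec in vectors.items():
        for members in clusters:
            if cosine(vectors[members[0]], vec) >= threshold:
                members.append(phrase)
                break
        else:
            clusters.append([phrase])
    return clusters


# Hypothetical scores from two different unsupervised scorers.
statistical = {"explainable ai": 0.9, "interpretability": 0.7, "neural network": 0.5}
graph_based = {"explainable ai": 0.6, "interpretability": 0.8, "neural network": 0.2}
ranking = ensemble_rank(statistical, graph_based)

# Made-up 2-d vectors standing in for phrase embeddings.
vectors = {
    "explainable ai": (1.0, 0.1),
    "interpretability": (0.9, 0.2),
    "neural network": (0.1, 1.0),
}
groups = cluster(vectors)
```

With these toy inputs, `ranking` places "explainable ai" first, and `groups` puts "explainable ai" and "interpretability" together while "neural network" forms its own cluster. A real pipeline would replace the toy vectors with actual phrase embeddings and would likely use a more robust clustering algorithm.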


