Calculating Semantic Similarity between Academic Articles using Topic Event and Ontology

by Ming Liu, et al.

Determining semantic similarity between academic documents is crucial to many tasks, such as plagiarism detection, automatic technical survey and semantic search. Current studies mostly focus on semantic similarity between concepts, sentences and short text fragments. Document-level semantic matching, however, still relies on surface-level statistical information and neglects article structure and global semantic meaning, which can distort document understanding. In this paper, we address document-level semantic similarity for academic literature with a novel method. We represent academic articles as topic events that combine multiple information profiles, such as research purposes, methodologies and domains, to describe the research work as a whole, and we calculate the similarity between topic events based on a domain ontology to obtain the semantic similarity between articles. Experiments show that our approach achieves significant performance gains compared with state-of-the-art methods.
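The idea of comparing articles through topic events over an ontology can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the authors' implementation: the toy `PARENT` ontology, the three-slot `TopicEvent` class, the equal slot weights, and the use of the Wu-Palmer measure as the concept-level similarity (the abstract does not say which ontology-based measure is used).

```python
from dataclasses import dataclass

# Hypothetical toy domain ontology, stored as child -> parent links.
PARENT = {
    "entity": None,
    "method": "entity",
    "task": "entity",
    "machine learning": "method",
    "clustering": "machine learning",
    "classification": "machine learning",
    "text analysis": "task",
    "plagiarism detection": "text analysis",
    "semantic search": "text analysis",
}


def path_to_root(concept):
    """Return the chain of concepts from `concept` up to the root."""
    path = []
    while concept is not None:
        path.append(concept)
        concept = PARENT[concept]
    return path


def depth(concept):
    """Depth in the ontology, counting the root as depth 1."""
    return len(path_to_root(concept))


def wu_palmer(a, b):
    """Wu-Palmer similarity: 2 * depth(lcs) / (depth(a) + depth(b))."""
    ancestors_a = set(path_to_root(a))
    # The first ancestor of b that is also an ancestor of a is the
    # deepest common subsumer (lcs); the root is always shared.
    lcs = next(c for c in path_to_root(b) if c in ancestors_a)
    return 2 * depth(lcs) / (depth(a) + depth(b))


@dataclass
class TopicEvent:
    """Assumed simplification: one ontology concept per information profile."""
    purpose: str
    methodology: str
    domain: str


# Assumed equal weights; a real system would likely tune these.
WEIGHTS = {"purpose": 1.0, "methodology": 1.0, "domain": 1.0}


def event_similarity(x, y):
    """Weighted average of slot-wise ontology similarities."""
    total = sum(
        w * wu_palmer(getattr(x, slot), getattr(y, slot))
        for slot, w in WEIGHTS.items()
    )
    return total / sum(WEIGHTS.values())


article_a = TopicEvent("plagiarism detection", "clustering", "text analysis")
article_b = TopicEvent("semantic search", "classification", "text analysis")
print(round(event_similarity(article_a, article_b), 3))
```

In this sketch the two articles share no surface terms in their purpose or methodology slots, yet they still score as fairly similar because the concepts sit close together in the ontology. This is the kind of effect that surface-level statistics would miss.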




Understanding and representing the semantics of large structured documents

Understanding large, structured documents like scholarly articles, reque...

SimDoc: Topic Sequence Alignment based Document Similarity Framework

Document similarity is the problem of estimating the degree to which a g...

Clustering articles based on semantic similarity

Document clustering is generally the first step for topic identification...

Sequence-Based Extractive Summarisation for Scientific Articles

This paper presents the results of research on supervised extractive tex...

Semantic Similarity Computing for Scientific Academic Conferences fused with domain features

Aiming at the problem that the current general-purpose semantic text sim...

The Struggle with Academic Plagiarism: Approaches based on Semantic Similarity

Academic plagiarism is a serious problem nowadays. Due to the existence ...

Web Robot Detection in Academic Publishing

Recent industry reports assure the rise of web robots which comprise mor...