Improving Academic Plagiarism Detection for STEM Documents by Analyzing Mathematical Content and Citations

06/27/2019
by Norman Meuschke, et al.

Identifying academic plagiarism is a pressing task for educational and research institutions, publishers, and funding agencies. Current plagiarism detection systems reliably find instances of copied and moderately reworded text. However, reliably detecting concealed plagiarism, such as strong paraphrases, translations, and the reuse of nontextual content and ideas, remains an open research problem. In this paper, we extend our prior research on analyzing mathematical content and academic citations. Both are promising approaches for improving the detection of concealed academic plagiarism, primarily in the Science, Technology, Engineering, and Mathematics (STEM) disciplines. We make the following contributions: i) We present a two-stage detection process that combines similarity assessments of mathematical content, academic citations, and text. ii) We introduce new similarity measures that consider the order of mathematical features and outperform the measures in our prior research. iii) We compare the effectiveness of the math-based, citation-based, and text-based detection approaches using confirmed cases of academic plagiarism. iv) We demonstrate that the combined analysis of math-based and citation-based content features allows identifying potentially suspicious cases in a collection of 102K STEM documents. Overall, we show that analyzing the similarity of mathematical content and academic citations is a valuable supplement to conventional text-based detection approaches for academic literature in the STEM disciplines.
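
To make contributions i) and ii) more concrete, the following is a minimal sketch of how a two-stage process with an order-aware similarity measure might look. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the measures, so the Jaccard filter for candidate retrieval, the longest-common-subsequence score for the detailed comparison, the normalization by the longer sequence, and all names (`jaccard`, `order_similarity`, `detect`, `candidate_threshold`, `top_k`) are hypothetical. Each document is assumed to be represented as a sequence of feature tokens, such as mathematical identifiers or citation keys in order of appearance.

```python
from typing import Dict, List, Sequence, Tuple


def jaccard(a: Sequence[str], b: Sequence[str]) -> float:
    """Order-agnostic set overlap (assumed stage-1 candidate filter)."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def lcs_length(a: Sequence[str], b: Sequence[str]) -> int:
    """Length of the longest common subsequence, via row-wise dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def order_similarity(a: Sequence[str], b: Sequence[str]) -> float:
    """Order-aware score in [0, 1]; normalizing by the longer sequence is an assumption."""
    if not a or not b:
        return 0.0
    return lcs_length(a, b) / max(len(a), len(b))


def detect(
    query: Sequence[str],
    collection: Dict[str, List[str]],
    candidate_threshold: float = 0.3,  # hypothetical cutoff, not from the paper
    top_k: int = 10,                   # hypothetical result limit
) -> List[Tuple[str, float]]:
    """Stage 1: cheap set-based filtering. Stage 2: order-aware ranking of candidates."""
    candidates = [
        doc_id
        for doc_id, features in collection.items()
        if jaccard(query, features) >= candidate_threshold
    ]
    scored = [(doc_id, order_similarity(query, collection[doc_id])) for doc_id in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]
```

The split reflects a common design choice: the inexpensive set-based filter keeps the more costly order-aware comparison from running against every document in a large collection. In practice, separate feature sequences for mathematical content, citations, and text could each be scored this way and then combined.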


