Hamtajoo: A Persian Plagiarism Checker for Academic Manuscripts

by Vahid Zarrabi, et al.

In recent years, the wide availability of electronic documents on the Web has made plagiarism a serious challenge, especially among scholars. Various plagiarism detection systems have been developed to prevent text re-use and to confront plagiarism. Although detecting duplicated text in academic manuscripts is relatively easy, finding patterns of text re-use that have been semantically altered remains an important problem. Another important issue is dealing with less-resourced languages, for which little text is available for training and existing NLP tools perform poorly. In this paper, we introduce Hamtajoo, a Persian plagiarism detection system for academic manuscripts. We describe the overall structure of the system along with the algorithms used in each stage. To evaluate the performance of the proposed system, we used a plagiarism detection corpus that complies with the PAN standards.


Uzbek text summarization based on TF-IDF

The volume of information is increasing at an incredible rate with the r...

Extracting Body Text from Academic PDF Documents for Text Mining

Accurate extraction of body text from PDF-formatted academic documents i...

Improving Academic Plagiarism Detection for STEM Documents by Analyzing Mathematical Content and Citations

Identifying academic plagiarism is a pressing task for educational and r...

hyperdoc2vec: Distributed Representations of Hypertext Documents

Hypertext documents, such as web pages and academic papers, are of great...

Plagiarism Detection on Electronic Text based Assignments using Vector Space Model (ICIAfS14)

Plagiarism is known as illegal use of others' part of work or whole work...

Machine Identification of High Impact Research through Text and Image Analysis

The volume of academic paper submissions and publications is growing at ...

Taxonomy of academic plagiarism methods

The article gives an overview of the plagiarism domain, with focus on ac...