Plagiarism Detection on Electronic Text based Assignments using Vector Space Model (ICIAfS14)

12/25/2014
by MAC Jiffriya, et al.

Plagiarism is the illegal use of part or all of another person's work as one's own, in any field such as art, poetry, literature, cinema, research and other creative forms of study. It is an important issue in academic and research fields and is receiving growing attention in academic systems. The situation is made worse by the availability of ample resources on the web. This paper focuses on an effective plagiarism detection tool, identifying a suitable intra-corpal plagiarism detection technique for text based assignments by comparing unigram, bigram and trigram vector space models with the cosine similarity measure. A manually evaluated, labelled dataset was tested using unigram, bigram and trigram vectors. Even though the trigram vector consumes comparatively more time, it shows better results with the labelled data. In addition, the selected trigram vector space model with the cosine similarity measure is compared with a trigram sequence matching technique using the Jaccard measure. In the results, the cosine similarity score shows slightly higher values than the Jaccard score, because it gives more weight to terms that occur infrequently in the dataset; the cosine similarity measure with the trigram technique is therefore preferable. We present our new tool, which could be used to evaluate text based electronic assignments effectively and to minimize plagiarism among students.
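
The comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' tool: the tokenizer, the smoothed IDF formula, and the function names (`word_ngrams`, `tfidf_vectors`, `cosine`, `jaccard`) are assumptions made for illustration, and the sample sentences are invented. It builds word n-gram vectors weighted so that rarer n-grams count more, scores document pairs with cosine similarity, and computes a Jaccard score over trigram sets for comparison.

```python
import math
import re
from collections import Counter


def word_ngrams(text, n):
    """Lowercase the text, split it into word tokens, and return its word n-grams."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tfidf_vectors(docs, n):
    """Build one sparse TF-IDF vector (a dict) per document over word n-grams."""
    counts = [Counter(word_ngrams(d, n)) for d in docs]
    # Document frequency: in how many documents each n-gram appears.
    doc_freq = Counter(g for c in counts for g in c)
    total = len(docs)
    # Smoothed IDF (an assumed variant): rarer n-grams get larger weights.
    idf = {g: math.log((1 + total) / (1 + df)) + 1 for g, df in doc_freq.items()}
    return [{g: tf * idf[g] for g, tf in c.items()} for c in counts]


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v[g] for g, w in u.items() if g in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def jaccard(text_a, text_b, n=3):
    """Jaccard similarity between the sets of word n-grams of two texts."""
    a, b = set(word_ngrams(text_a, n)), set(word_ngrams(text_b, n))
    return len(a & b) / len(a | b) if a | b else 0.0


if __name__ == "__main__":
    # Invented sample "assignments", for illustration only.
    docs = [
        "the quick brown fox jumps over the lazy dog",
        "the quick brown fox leaps over the lazy dog",
        "a completely different answer about vector space models",
    ]
    for n in (1, 2, 3):  # unigram, bigram, trigram
        vecs = tfidf_vectors(docs, n)
        print(n, round(cosine(vecs[0], vecs[1]), 3), round(cosine(vecs[0], vecs[2]), 3))
    print("jaccard (trigram):", jaccard(docs[0], docs[1]))
```

Scores from such a sketch depend heavily on the tokenizer and weighting chosen, so they are not comparable with the values reported in the paper.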

research
03/30/2016

Unsupervised Measure of Word Similarity: How to Outperform Co-occurrence and Vector Cosine in VSMs

In this paper, we claim that vector cosine, which is generally considere...
research
03/20/2019

Distributed Vector Representations of Folksong Motifs

This article presents a distributed vector representation model for lear...
research
08/27/2016

Testing APSyn against Vector Cosine on Similarity Estimation

In Distributional Semantic Models (DSMs), Vector Cosine is widely used t...
research
03/29/2016

What a Nerd! Beating Students and Vector Cosine in the ESL and TOEFL Datasets

In this paper, we claim that Vector Cosine, which is generally considere...
research
03/01/2021

An open-source framework for ExpFinder integrating N-gram Vector Space Model and μCO-HITS

Finding experts drives successful collaborations and high-quality produc...
research
12/27/2021

Hamtajoo: A Persian Plagiarism Checker for Academic Manuscripts

In recent years, due to the high availability of electronic documents th...
research
08/28/2018

Implementation Notes for the Soft Cosine Measure

The standard bag-of-words vector space model (VSM) is efficient, and ubi...
