Evaluating the robustness of source code plagiarism detection tools to pervasive plagiarism-hiding modifications

02/08/2021
by Hayden Cheers, et al.

Source code plagiarism is a common occurrence in undergraduate computer science education. To identify such cases, many source code plagiarism detection tools have been proposed. A source code plagiarism detection tool evaluates pairs of assignment submissions to detect indications of plagiarism. However, a plagiarising student will commonly apply plagiarism-hiding modifications to source code in an attempt to evade detection. Prior work has suggested that currently available source code plagiarism detection tools are not robust to the application of pervasive plagiarism-hiding modifications. In this article, 11 source code plagiarism detection tools are evaluated for robustness against plagiarism-hiding modifications. The tools are evaluated with data sets of simulated undergraduate plagiarism, constructed with source code modifications representative of those applied by undergraduate students. The results of the evaluations indicate that currently available source code plagiarism detection tools are not robust against modifications that apply fine-grained transformations to the source code structure. Of the evaluated tools, JPlag and Plaggie demonstrate the greatest robustness to different types of plagiarism-hiding modifications. However, the results also indicate that graph-based tools (specifically those that compare programs as program dependence graphs) show potentially greater robustness to pervasive plagiarism-hiding modifications.
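
To make the idea of pairwise comparison and robustness concrete, the following is a minimal sketch of a token-based similarity check. It is not the method of any tool evaluated in the article. The function names (`normalized_tokens`, `ngrams`, `similarity`, `pairwise_scores`), the n-gram size, and the use of Jaccard similarity are all assumptions chosen for illustration, and Python is used only for convenience.

```python
import io
import keyword
import tokenize
from itertools import combinations


def normalized_tokens(source):
    """Tokenize Python source, replacing identifiers and literals with placeholders.

    Hypothetical helper: collapsing names to "ID" is what makes the comparison
    insensitive to simple renaming.
    """
    skipped = (tokenize.COMMENT, tokenize.NL, tokenize.NEWLINE,
               tokenize.INDENT, tokenize.DEDENT, tokenize.ENDMARKER)
    out = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in skipped:
            continue
        if tok.type == tokenize.NAME and not keyword.iskeyword(tok.string):
            out.append("ID")
        elif tok.type == tokenize.NUMBER:
            out.append("NUM")
        elif tok.type == tokenize.STRING:
            out.append("STR")
        else:
            out.append(tok.string)
    return out


def ngrams(tokens, n=4):
    """Return the set of contiguous token n-grams."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def similarity(a, b, n=4):
    """Jaccard similarity of normalized token n-grams (an assumed metric)."""
    ga, gb = ngrams(normalized_tokens(a), n), ngrams(normalized_tokens(b), n)
    if not ga and not gb:
        return 1.0
    return len(ga & gb) / len(ga | gb)


def pairwise_scores(submissions, n=4):
    """Score every unordered pair of submissions, keyed by sorted name pair."""
    return {(x, y): similarity(submissions[x], submissions[y], n)
            for x, y in combinations(sorted(submissions), 2)}


original = "def total(xs):\n    s = 0\n    for x in xs:\n        s += x\n    return s\n"
renamed = "def add_all(values):\n    acc = 0\n    for v in values:\n        acc += v\n    return acc\n"
restructured = ("def total(xs):\n    s = 0\n    i = 0\n    while i < len(xs):\n"
                "        s = s + xs[i]\n        i = i + 1\n    return s\n")

scores = pairwise_scores({"original": original, "renamed": renamed,
                         "restructured": restructured})
```

Under this sketch, the renamed copy is scored as identical to the original because identifiers are normalized away, while the copy that swaps the `for` loop for a `while` loop scores lower. This mirrors, in a simplified way, the distinction the abstract draws between modifications in general and those that apply fine-grained transformations to the source code structure.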

