Detecting Plagiarism based on the Creation Process
All methodologies for detecting plagiarism to date have focused on the final digital "outcome", such as a document or source code. Our novel approach instead takes the creation process into account, using events logged by special software or by the macro recorders found in most office applications. We examine the logs of an author's interaction with the software used to create the work. Detection relies on comparing histograms of command use across multiple logs. A work is classified as plagiarism if its log deviates too much from the logs of "honestly created" works or if its log is too similar to another log. The technique supports plagiarism detection both for digital outcomes that stem from unique tasks, such as theses, and for those that stem from equal tasks, such as assignments in which multiple students solve the same problem sets. Focusing on the latter case, we evaluate the approach using logs collected by an integrated development environment (IDE) from more than sixty students who completed three programming assignments.
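The two classification rules described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state which distance measure or thresholds are used, so the Euclidean distance over relative command frequencies, the median-distance test for deviation, the threshold values, and all names (`command_histogram`, `flag_logs`, the command labels) are assumptions chosen for illustration.

```python
from collections import Counter
from math import sqrt
from statistics import median


def command_histogram(log):
    """Relative frequency of each command in one interaction log."""
    counts = Counter(log)
    total = sum(counts.values())
    return {cmd: n / total for cmd, n in counts.items()} if total else {}


def distance(h1, h2):
    """Euclidean distance between two histograms (an assumed metric)."""
    commands = set(h1) | set(h2)
    return sqrt(sum((h1.get(c, 0.0) - h2.get(c, 0.0)) ** 2 for c in commands))


def flag_logs(logs, too_similar=0.05, too_different=0.5):
    """Flag logs that nearly duplicate another log or deviate from the rest.

    Both thresholds are hypothetical values, not taken from the paper.
    """
    hists = {name: command_histogram(log) for name, log in logs.items()}
    flagged = {}
    for name, hist in hists.items():
        others = [distance(hist, h) for other, h in hists.items() if other != name]
        if not others:
            continue
        reasons = []
        if min(others) < too_similar:
            # Rule 2: the log is too similar to some other log.
            reasons.append("too similar")
        if median(others) > too_different:
            # Rule 1: the log deviates from the bulk of the other logs,
            # used here as a stand-in for "honestly created" works.
            reasons.append("deviates")
        if reasons:
            flagged[name] = reasons
    return flagged


# Hypothetical logs: each entry is the sequence of commands one student issued.
logs = {
    "alice": ["type"] * 60 + ["delete"] * 20 + ["run"] * 15 + ["paste"] * 5,
    "bob": ["type"] * 55 + ["delete"] * 25 + ["run"] * 15 + ["paste"] * 5,
    "carol": ["type"] * 60 + ["delete"] * 20 + ["run"] * 15 + ["paste"] * 5,
    "dave": ["paste"] * 80 + ["run"] * 15 + ["type"] * 5,
}
print(flag_logs(logs))
```

With this invented data, the identical logs of "alice" and "carol" are flagged as too similar, and the paste-heavy log of "dave" is flagged as deviating. In practice the thresholds would have to be calibrated on logs known to be honestly created.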