Clone Detection on Large Scala Codebases

04/08/2022
by Wahidur Rahman, et al.

Code clones are identical or similar code segments. Widespread code clones can increase maintenance costs and jeopardise software quality. The research community has developed many techniques to detect code clones; however, there is little evidence of how these techniques perform in industrial use cases. In this paper, we aim to uncover the differences that arise when such techniques are applied in industrial use cases. We conducted a large-scale experimental study of the performance of two state-of-the-art code clone detection techniques, SourcererCC and AutoenCODE, on both open source projects and an industrial project written in the Scala language. Our results reveal that both algorithms perform differently on the industrial project than on the open source projects, with the largest drop in precision being 30.7% and the largest increase in recall being 32.4%. By having the developers of the industrial project manually label samples of it, we discovered that it contains substantially fewer Type-3 clones than the open source projects.
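To make the reported metrics concrete, the following is a minimal sketch of how precision and recall could be computed for a clone detector against manually labelled clone pairs. It is not the authors' evaluation code: the function name `precision_recall`, the file-and-line identifiers, and the sample data are all hypothetical and chosen only for illustration.

```python
def precision_recall(reported, true_clones):
    """Return (precision, recall) for reported clone pairs.

    Each pair is treated as unordered, so (a, b) and (b, a) match.
    """
    reported = {frozenset(pair) for pair in reported}
    true_clones = {frozenset(pair) for pair in true_clones}
    true_positives = len(reported & true_clones)
    # Guard against empty sets to avoid division by zero.
    precision = true_positives / len(reported) if reported else 0.0
    recall = true_positives / len(true_clones) if true_clones else 0.0
    return precision, recall


# Hypothetical detector output: pairs of code-fragment identifiers.
reported = [
    ("a.scala:10", "b.scala:42"),
    ("a.scala:10", "c.scala:7"),
    ("d.scala:3", "e.scala:88"),
    ("f.scala:1", "g.scala:5"),
]

# Hypothetical manual labels (the pairs developers confirmed as clones).
labelled = [
    ("b.scala:42", "a.scala:10"),
    ("d.scala:3", "e.scala:88"),
    ("h.scala:9", "i.scala:2"),
]

precision, recall = precision_recall(reported, labelled)
print(f"precision={precision:.3f} recall={recall:.3f}")
```

In this toy data, two of the four reported pairs are confirmed clones and two of the three labelled clones are found, which is the kind of trade-off the precision and recall figures above summarise.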


