Augmenting Machine Learning with Information Retrieval to Recommend Real Cloned Code Methods for Code Completion

10/02/2020
by   Muhammad Hammad, et al.

Software developers frequently reuse source code from repositories because it saves development time and effort. Code clones accumulated in these repositories therefore represent frequently repeated functionality and are candidates for reuse in exploratory or rapid development. In previous work, we introduced DeepClone, a deep neural network model trained by fine-tuning the GPT-2 model on the BigCloneBench dataset to predict code clone methods. The probabilistic nature of DeepClone's output generation can lead to syntax and logic errors that require manual editing of the output before final reuse. In this paper, we propose a novel approach that applies an information retrieval (IR) technique on top of DeepClone's output to recommend real clone methods closely matching the predicted output. We have quantitatively evaluated our strategy, showing that the proposed approach significantly improves the quality of recommendation.
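
The retrieval step described in the abstract can be illustrated with a minimal sketch. The abstract does not say which IR technique the paper uses, so the sketch assumes TF-IDF weighting with cosine similarity. The names `tokenize`, `tfidf_vectors`, `cosine`, `recommend`, `CORPUS`, and `PREDICTED` are hypothetical, and the three Java methods are toy stand-ins for a corpus of real clone methods, not BigCloneBench data. The predicted method is deliberately broken (missing semicolons, undeclared variables, unclosed braces) to mimic the kind of error a generative model might produce.

```python
import math
import re
from collections import Counter

# Toy stand-in for a corpus of real clone methods (illustrative, not from BigCloneBench).
CORPUS = [
    "public static void copyFile(File src, File dst) throws IOException { "
    "InputStream in = new FileInputStream(src); OutputStream out = new FileOutputStream(dst); "
    "byte[] buf = new byte[1024]; int len; "
    "while ((len = in.read(buf)) > 0) { out.write(buf, 0, len); } in.close(); out.close(); }",
    "public static String md5(String s) throws Exception { "
    "MessageDigest md = MessageDigest.getInstance(\"MD5\"); "
    "byte[] d = md.digest(s.getBytes()); return new BigInteger(1, d).toString(16); }",
    "public static void bubbleSort(int[] a) { for (int i = 0; i < a.length; i++) "
    "for (int j = 0; j + 1 < a.length - i; j++) "
    "if (a[j] > a[j + 1]) { int t = a[j]; a[j] = a[j + 1]; a[j + 1] = t; } }",
]

# Hypothetical model output with syntax errors that would need manual repair.
PREDICTED = (
    "public static void copyFile(File src, File dst) { "
    "InputStream in = new FileInputStream(src) OutputStream out = new FileOutputStream(dst); "
    "while ((len = in.read(buf)) > 0) { out.write(buf, 0, len) in.close();"
)


def tokenize(code):
    """Split source text into identifiers, numbers, and single punctuation marks."""
    return re.findall(r"[A-Za-z_]\w*|\d+|[^\s\w]", code)


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict of token -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))
    # Smoothed inverse document frequency, so every token gets a positive weight.
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in doc_freq.items()}
    return [{t: tf * idf[t] for t, tf in Counter(tokens).items()} for tokens in tokenized]


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(predicted, corpus, top_k=1):
    """Rank corpus methods by similarity to the predicted method; return (method, score) pairs."""
    # Assumption: the query is included when computing IDF statistics.
    vectors = tfidf_vectors(corpus + [predicted])
    query = vectors[-1]
    scored = sorted(((cosine(query, v), i) for i, v in enumerate(vectors[:-1])), reverse=True)
    return [(corpus[i], score) for score, i in scored[:top_k]]


if __name__ == "__main__":
    method, score = recommend(PREDICTED, CORPUS)[0]
    print(f"score={score:.3f}")
    print(method)
```

In this sketch the flawed prediction serves only as a query: what the developer receives is an existing method from the corpus, which is why the recommendation is free of the generation errors in the query itself.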

research
07/22/2020

DeepClone: Modeling Clones to Generate Code Predictions

During software development, programmers often tend to reuse the code fo...
research
02/23/2023

On Code Reuse from StackOverflow: An Exploratory Study on Jupyter Notebook

Jupyter Notebook is a popular tool among data analysts and scientists fo...
research
10/23/2019

Retrieve and Refine: Exemplar-based Neural Comment Generation

Code comment generation is a crucial task in the field of automatic soft...
research
08/15/2022

On the Adoption and Effects of Source Code Reuse on Defect Proneness and Maintenance Effort

Context. Software reusability mechanisms, like inheritance and delegatio...
research
06/27/2022

BashExplainer: Retrieval-Augmented Bash Code Comment Generation based on Fine-tuned CodeBERT

Developers use shell commands for many tasks, such as file system manage...
research
11/05/2021

DeSkew-LSH based Code-to-Code Recommendation Engine

Machine learning on source code (MLOnCode) is a popular research field t...
research
10/05/2022

IRJIT – An Information Retrieval Technique for Just-in-time Defect Identification

Defect identification at commit check-in time prevents the introduction ...
