Unveiling the potential of large language models in generating semantic and cross-language clones

09/12/2023
by Palash R. Roy, et al.

Semantic and cross-language code clone generation may be useful for code reuse, code comprehension, refactoring, and benchmarking. OpenAI's GPT model has potential for such clone generation, since GPT is designed for text generation. When developers copy and paste code from Stack Overflow (SO) or within a system, inconsistent changes may follow and lead to unexpected behaviours. Similarly, if someone has a code snippet in one programming language but needs equivalent functionality in another, a semantic cross-language code clone generation approach could provide valuable assistance. In this study, using SemanticCloneBench as a vehicle, we evaluated how well the GPT-3 model could help generate semantic and cross-language clone variants for a given fragment. We compiled a diverse set of code fragments and assessed GPT-3's performance in generating code variants. Through extensive experimentation and analysis, in which 9 judges spent 158 hours on validation, we investigated the model's ability to produce accurate and semantically correct variants. Our findings shed light on GPT-3's strengths in code generation, offering insights into the potential applications and challenges of using advanced language models in software development. In our quantitative analysis, GPT-3 attains an accuracy of 62.14% in generating semantic clones through prompt engineering. The model performs even better across languages, reaching 91.25% accuracy in generating cross-language clones.

