AugmentedCode: Examining the Effects of Natural Language Resources in Code Retrieval Models

10/16/2021
by   Mehdi Bahrami, et al.

Code retrieval allows software engineers to search for code with a natural language query, and it relies on both natural language processing and software engineering techniques. There have been several attempts at code retrieval, ranging from searching code snippets to searching whole functions. In this paper, we introduce Augmented Code (AugmentedCode) retrieval, which takes advantage of existing information within the code and constructs an augmented programming language to improve the performance of code retrieval models. We curated a large corpus of Python and showcase the framework and the results of the augmented programming language, which improves on CodeSearchNet and CodeBERT with a Mean Reciprocal Rank (MRR) of 0.73 and 0.96, respectively. The best-performing fine-tuned augmented code retrieval model is published on HuggingFace at https://huggingface.co/Fujitsu/AugCode and a demonstration video is available at https://youtu.be/mnZrUTANjGs .
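
For readers unfamiliar with the metric, the sketch below shows how a Mean Reciprocal Rank is conventionally computed for a retrieval task. It is not the authors' evaluation code: the function names `rank_of_correct` and `mean_reciprocal_rank` and the toy similarity scores are assumptions made purely for illustration.

```python
from typing import Sequence


def rank_of_correct(scores: Sequence[float], correct_index: int) -> int:
    """Return the 1-based rank of the correct candidate for one query.

    `scores` holds one similarity score per candidate code snippet
    (higher means more relevant); ties are resolved in favor of the
    correct candidate.
    """
    target = scores[correct_index]
    return 1 + sum(1 for s in scores if s > target)


def mean_reciprocal_rank(ranks: Sequence[int]) -> float:
    """Average of 1/rank over all queries.

    A rank of 0 is used here (by assumption) to mean the correct
    snippet was not retrieved, contributing 0 to the average.
    """
    if not ranks:
        raise ValueError("at least one query is required")
    return sum(1.0 / r if r > 0 else 0.0 for r in ranks) / len(ranks)


# Toy example: three queries, each scored against four candidates.
toy_scores = [
    ([0.9, 0.1, 0.3, 0.2], 0),  # correct snippet ranked 1st
    ([0.4, 0.8, 0.6, 0.1], 2),  # correct snippet ranked 2nd
    ([0.7, 0.6, 0.5, 0.2], 3),  # correct snippet ranked 4th
]
toy_ranks = [rank_of_correct(scores, idx) for scores, idx in toy_scores]
toy_mrr = mean_reciprocal_rank(toy_ranks)  # (1 + 1/2 + 1/4) / 3
```

An MRR of 1.0 means the correct code is always ranked first, so values such as 0.73 and 0.96 indicate how close to the top the correct result lands on average.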


research
10/24/2020

Deep Graph Matching and Searching for Semantic Code Retrieval

Code retrieval is to find the code snippet from a large corpus of source...
research
10/20/2021

JavaBERT: Training a transformer-based model for the Java programming language

Code quality is and will be a crucial factor while developing new softwa...
research
12/06/2020

NaturalCC: A Toolkit to Naturalize the Source Code Corpus

We present NaturalCC, an efficient and extensible toolkit to bridge the ...
research
12/20/2022

Generation-Augmented Query Expansion For Code Retrieval

Pre-trained language models have achieved promising success in code retr...
research
04/16/2021

BERT2Code: Can Pretrained Language Models be Leveraged for Code Search?

Millions of repetitive code snippets are submitted to code repositories ...
research
08/11/2021

Natural Language-Guided Programming

In today's software world with its cornucopia of reusable software libra...
research
03/08/2021

Langar: An Approach to Evaluate Reo Programming Language

Reo is a formal coordination language. In order to assess and evaluate i...
