Function completion in the time of massive data: A code embedding perspective

08/09/2020
by M. Weyssow, et al.

Code completion is an important feature of integrated development environments (IDEs). It allows developers to produce code faster, especially novices who are not fully familiar with APIs and others' code. Previous work on code completion has mainly exploited the static type systems of programming languages, or the code history of the project under development or of other projects that use common APIs. In this work, we present a novel approach for improving current function-call completion tools by learning from independent code repositories, using well-known natural language processing models that can learn vector representations of source code (code embeddings). Our models are not trained on the historical data of specific projects. Instead, our approach learns high-level concepts and their relationships as they appear across thousands of projects. As a consequence, the resulting system is able to provide general suggestions that are not specific to particular projects or APIs. Additionally, by taking into account the context of the call to complete, our approach suggests function calls relevant to that context. We evaluated our approach on a set of open-source projects unseen during training. The results show that using the trained model together with a code suggestion plug-in based on static type analysis significantly improves the correctness of the completion suggestions.
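To make the idea of combining code embeddings with type-based suggestions concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the authors' implementation: the embedding table, its values, and the helper names (`cosine`, `context_vector`, `rerank`) are all hypothetical, and the abstract does not say which model or ranking rule the paper actually uses. The sketch assumes a static type analysis has already produced a list of valid candidate calls, and it reorders them by cosine similarity to the average embedding of the calls already present in the context.

```python
import math

# Hypothetical toy embeddings. A real system would learn these vectors
# from a large corpus of independent repositories.
EMBEDDINGS = {
    "open": [0.9, 0.1, 0.0],
    "read": [0.8, 0.2, 0.1],
    "close": [0.7, 0.3, 0.0],
    "sort": [0.0, 0.9, 0.2],
    "append": [0.1, 0.8, 0.3],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def context_vector(tokens):
    """Average the embeddings of the known tokens in the context."""
    vectors = [EMBEDDINGS[t] for t in tokens if t in EMBEDDINGS]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def rerank(candidates, context_tokens):
    """Reorder type-valid candidates by similarity to the context (assumed rule)."""
    ctx = context_vector(context_tokens)
    if ctx is None:
        # No usable context: keep the order given by the type-based tool.
        return list(candidates)

    def score(name):
        vec = EMBEDDINGS.get(name)
        # Candidates without an embedding are ranked last.
        return cosine(ctx, vec) if vec is not None else -1.0

    return sorted(candidates, key=score, reverse=True)


# Candidates as a type-based plug-in might list them, in arbitrary order.
suggestions = rerank(["sort", "close", "append"], ["open", "read"])
print(suggestions)
```

With these toy vectors, a context containing `open` and `read` moves `close` to the top of the list, which is the kind of context-aware reordering the abstract describes.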

05/22/2020

DevReplay: Automatic Repair with Editable Fix Pattern

Static analysis tools, or linters, detect violation of source code conve...
04/28/2020

Fast and Memory-Efficient Neural Code Completion

Code completion is one of the most widely used features of modern integr...
03/08/2021

Siri, Write the Next Method

Code completion is one of the killer features of Integrated Development ...
06/18/2020

Learning to Format Coq Code Using Language Models

Should the final right bracket in a record declaration be on a separate ...
09/30/2017

Automated Program Analysis for Novice Programmers

This paper describes how to adapt a static code analyzer to help novice ...
05/28/2020

Using Source Code Density to Improve the Accuracy of Automatic Commit Classification into Maintenance Activities

Source code is changed for a reason, e.g., to adapt, correct, or adapt i...
01/07/2021

Decision Support System for an Intelligent Operator of Utility Tunnel Boring Machines

In tunnel construction projects, delays induce high costs. Thus, tunnel ...