Improving Code Autocompletion with Transfer Learning

05/12/2021
by Wen Zhou, et al.

Software language models have achieved promising results predicting code completion usages, and several industry studies have described successful IDE integrations. Recently, accuracy in autocompletion prediction improved 12.8% from training on a real-world dataset collected from programmers' IDE activity. But what if limited examples of IDE autocompletion in the target programming language are available for model training? In this paper, we investigate the efficacy of pretraining autocompletion models on non-IDE, non-autocompletion, and different-language example code sequences. We find that these unsupervised pretrainings improve model accuracy by over 50% on very small fine-tuning datasets and over 10% on 50k labeled examples. We confirm the real-world impact of these pretrainings in an online setting through A/B testing on thousands of IDE autocompletion users, finding that pretraining is responsible for increases of up to 6.63% autocompletion usage.
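The pretrain-then-fine-tune idea in the abstract can be illustrated with a deliberately tiny sketch. It is a count-based bigram stand-in, not the neural models the paper studies. The token pairs, the function names, and the fine-tuning weight of 3.0 are all hypothetical, chosen only to show how a small target dataset can build on statistics learned elsewhere.

```python
from collections import Counter, defaultdict


def train(pairs, counts=None, weight=1.0):
    """Add (previous_token, next_token) observations to a bigram count table.

    Passing an existing table continues training on top of it, which is the
    toy analogue of fine-tuning a pretrained model.
    """
    counts = counts if counts is not None else defaultdict(Counter)
    for prev, nxt in pairs:
        counts[prev][nxt] += weight
    return counts


def predict(counts, prev):
    """Return the highest-count next token after `prev`, or None if unseen."""
    if prev not in counts or not counts[prev]:
        return None
    return counts[prev].most_common(1)[0][0]


def accuracy(counts, pairs):
    """Top-1 accuracy of the table over held-out (prev, next) pairs."""
    hits = sum(predict(counts, prev) == nxt for prev, nxt in pairs)
    return hits / len(pairs)


# Hypothetical data: a larger generic corpus and a very small target corpus.
pretrain_pairs = [
    ("import", "os"), ("import", "os"), ("import", "sys"),
    ("def", "main"), ("return", "None"), ("for", "i"),
]
finetune_pairs = [("import", "numpy"), ("def", "run")]
test_pairs = [
    ("import", "numpy"), ("def", "run"), ("return", "None"), ("for", "i"),
]

# Baseline: train on the small target corpus only.
scratch = train(finetune_pairs)

# Transfer: pretrain on the generic corpus, then fine-tune with a larger
# weight (an assumption) so target-specific usage can override the prior.
transferred = train(finetune_pairs, counts=train(pretrain_pairs), weight=3.0)

print("from scratch:", accuracy(scratch, test_pairs))
print("pretrained + fine-tuned:", accuracy(transferred, test_pairs))
```

In this toy setup the fine-tuned table keeps the contexts seen only during pretraining while still preferring the target corpus where the two disagree. The numbers it prints say nothing about the magnitudes reported in the paper.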


11/09/2020

Learning Autocompletion from Real-World Datasets

Code completion is a popular software development tool integrated into a...
02/05/2020

Aligning the Pretraining and Finetuning Objectives of Language Models

We demonstrate that explicitly aligning the pretraining objectives to th...
11/11/2021

Improving Large-scale Language Models and Resources for Filipino

In this paper, we improve on existing language resources for the low-res...
09/06/2018

Code-switched Language Models Using Dual RNNs and Same-Source Pretraining

This work focuses on building language models (LMs) for code-switched te...
10/09/2021

The Inductive Bias of In-Context Learning: Rethinking Pretraining Example Design

Pretraining Neural Language Models (NLMs) over a large corpus involves c...
09/10/2019

An Evalutation of Programming Language Models' performance on Software Defect Detection

This dissertation presents an evaluation of several language models on s...
09/09/2018

How clever is the FiLM model, and how clever can it be?

The FiLM model achieves close-to-perfect performance on the diagnostic C...