Generating Code with the Help of Retrieved Template Functions and Stack Overflow Answers

04/12/2021
by Dawn Drain, et al.

We approach the important challenge of code autocompletion as an open-domain task, in which a sequence-to-sequence code generator model is enhanced with the ability to attend to reference code snippets supplied by a semantic code search engine. In this work, we present a novel framework to precisely retrieve template functions as well as intent-snippet pairs and effectively train such a retrieval-guided code generator. To demonstrate the effectiveness of our model designs, we perform extensive experiments with CodeSearchNet, which contains template functions, and CoNaLa, which contains Stack Overflow intent-snippet pairs. We also investigate different retrieval models, including Elasticsearch, DPR, and our fusion representation search model, which currently holds the number one spot on the CodeSearchNet leaderboard. We observe improvements from leveraging multiple database elements, and a further gain from retrieving diverse data points using Maximal Marginal Relevance. Overall, we see a 4% improvement to cross-entropy loss, a 15% [...] 44% [...] subtler improvements of 2% [...] Stack Overflow intent-snippet pairs. We also create a novel Stack Overflow-Function Alignment dataset, which consists of 150K tuples of functions and Stack Overflow intent-snippet pairs that are of help in writing the associated function, of which 1.7K are manually curated.
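
The abstract credits part of the gain to Maximal Marginal Relevance (MMR), which selects retrieved items that are relevant to the query but not redundant with each other. The following is a minimal sketch of the general MMR idea, not the authors' implementation: the cosine similarity, the trade-off weight `lam`, and the names `cosine` and `mmr_select` are assumptions made for illustration, and the vectors stand in for whatever embeddings a retriever would produce.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mmr_select(query_vec, doc_vecs, k, lam=0.5):
    """Greedily pick up to k document indices by Maximal Marginal Relevance.

    lam = 1.0 ranks purely by relevance to the query; lower values
    penalize candidates that resemble documents already selected.
    (lam and the use of cosine similarity are illustrative assumptions.)
    """
    relevance = [cosine(query_vec, d) for d in doc_vecs]
    candidates = list(range(len(doc_vecs)))
    selected = []

    def score(i):
        # Redundancy is the highest similarity to anything already chosen.
        redundancy = max(
            (cosine(doc_vecs[i], doc_vecs[j]) for j in selected), default=0.0
        )
        return lam * relevance[i] - (1.0 - lam) * redundancy

    while candidates and len(selected) < k:
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Toy embeddings: documents 0 and 1 are near-duplicates, document 2 differs.
query = [0.8, 0.6]
docs = [[1.0, 0.0], [0.98, 0.2], [0.0, 1.0]]

print(mmr_select(query, docs, k=2, lam=1.0))  # relevance only
print(mmr_select(query, docs, k=2, lam=0.5))  # relevance balanced against diversity
```

In this toy setup, pure relevance ranking returns the two near-duplicate documents, while the balanced setting swaps the second one for the dissimilar document. That is the kind of diversity the abstract describes as helpful when several retrieved snippets are passed to the generator.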

research
03/26/2018

StaQC: A Systematically Mined Question-Code Dataset from Stack Overflow

Stack Overflow (SO) has been a great source of natural language question...
research
08/27/2020

Neural Code Search Revisited: Enhancing Code Snippet Retrieval through Natural Language Intent

In this work, we propose and study annotated code search: the retrieval ...
research
07/19/2020

Code2Que: A Tool for Improving Question Titles from Mined Code Snippets in Stack Overflow

Stack Overflow is one of the most popular technical Q&A sites used by ...
research
06/06/2023

Generate-then-Retrieve: Intent-Aware FAQ Retrieval in Product Search

Customers interacting with product search engines are increasingly formu...
research
09/07/2023

All Labels Together: Low-shot Intent Detection with an Efficient Label Semantic Encoding Paradigm

In intent detection tasks, leveraging meaningful semantic information fr...
research
08/05/2021

Improved Retrieval of Programming Solutions With Code Examples Using a Multi-featured Score

Developers often depend on code search engines to obtain solutions for t...
research
07/13/2022

DocCoder: Generating Code by Retrieving and Reading Docs

Natural-language-to-code models learn to generate a code snippet given a...
