Adversarial Training for Code Retrieval with Question-Description Relevance Regularization

10/19/2020
by   Jie Zhao, et al.

Code retrieval is a key task aiming to match natural and programming languages. In this work, we propose adversarial learning for code retrieval that is regularized by question-description relevance. First, we adapt a simple adversarial learning technique to generate difficult code snippets for a given input question, which helps a code retrieval model learn despite bi-modal and data-scarce challenges. Second, we leverage question-description relevance to regularize adversarial learning: a generated code snippet should contribute more to the code retrieval training loss only if its paired natural language description is predicted to be less relevant to the user-given question. Experiments on large-scale code retrieval datasets in two programming languages show that our adversarial learning method improves the performance of state-of-the-art models. Moreover, using an additional duplicate question prediction model to regularize adversarial learning further improves performance, and is more effective than using the duplicate questions in strong multi-task learning baselines.
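
The regularization idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not give the loss, so the pairwise hinge form, the `1 - relevance` weighting, the `margin` default, and both function names are assumptions chosen only to show how a relevance score could scale the contribution of a generated code snippet.

```python
def hinge_ranking_loss(pos_score, neg_score, margin=0.05):
    """Pairwise hinge loss (assumed form): positive when the negative
    snippet scores within `margin` of the ground-truth snippet."""
    return max(0.0, margin - pos_score + neg_score)


def regularized_adversarial_loss(pos_score, adv_score, relevance, margin=0.05):
    """Hypothetical relevance-weighted loss for one generated snippet.

    pos_score: retrieval score of the ground-truth code for the question.
    adv_score: retrieval score of the adversarially generated code.
    relevance: predicted relevance, in [0, 1], between the question and the
               description paired with the generated code.

    Assumption: the weight is 1 - relevance, so a snippet whose description
    is predicted to be less relevant contributes more to the loss.
    """
    if not 0.0 <= relevance <= 1.0:
        raise ValueError("relevance must lie in [0, 1]")
    weight = 1.0 - relevance
    return weight * hinge_ranking_loss(pos_score, adv_score, margin)


if __name__ == "__main__":
    # Same scores, different predicted relevance of the paired description.
    for relevance in (0.0, 0.5, 1.0):
        loss = regularized_adversarial_loss(0.6, 0.7, relevance)
        print(f"relevance={relevance:.1f} loss={loss:.3f}")
```

Under this assumed weighting, a generated snippet whose description nearly duplicates the question (relevance close to 1) is treated as a likely false negative and contributes almost nothing, while one with an unrelated description is penalized in full.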

research
02/24/2020

Leveraging Code Generation to Improve Code Retrieval and Summarization via Dual Learning

Code summarization generates brief natural language description given a ...
research
03/13/2019

CoaCor: Code Annotation for Code Retrieval with Reinforcement Learning

To accelerate software development, much research has been performed to ...
research
07/06/2023

DisAsymNet: Disentanglement of Asymmetrical Abnormality on Bilateral Mammograms using Self-adversarial Learning

Asymmetry is a crucial characteristic of bilateral mammograms (Bi-MG) wh...
research
09/20/2019

CodeSearchNet Challenge: Evaluating the State of Semantic Code Search

Semantic code search is the task of retrieving relevant code given a nat...
research
02/07/2016

The IMP game: Learnability, approximability and adversarial learning beyond Σ^0_1

We introduce a problem set-up we call the Iterated Matching Pennies (IMP...
research
03/26/2018

StaQC: A Systematically Mined Question-Code Dataset from Stack Overflow

Stack Overflow (SO) has been a great source of natural language question...
research
07/02/2017

Variance Regularizing Adversarial Learning

We introduce a novel approach for training adversarial models by replaci...
