How Important are Good Method Names in Neural Code Generation? A Model Robustness Perspective

11/29/2022
by Guang Yang, et al.

Pre-trained code generation models (PCGMs) have been widely applied in neural code generation, which produces executable code from functional descriptions in natural language, possibly together with method signatures. Despite the substantial performance improvements of PCGMs, the role of method names in neural code generation has not been thoroughly investigated. In this paper, we study and demonstrate the potential of good method names to enhance the performance of PCGMs, from a model robustness perspective. Specifically, we propose a novel approach named RADAR (neuRAl coDe generAtor Robustifier). RADAR consists of two components: RADAR-Attack and RADAR-Defense. The former attacks a PCGM by generating adversarial method names as part of the input, which are semantically and visually similar to the original input but may trick the PCGM into generating completely unrelated code snippets. As a countermeasure to such attacks, RADAR-Defense synthesizes a new method name from the functional description and supplies it to the PCGM. Evaluation results show that RADAR-Attack can, for example, reduce the CodeBLEU of generated code by 19.72% to 38.74%. Moreover, RADAR-Defense is able to reinstate the performance of PCGMs with synthesized method names. These results highlight the importance of good method names in neural code generation and point to the benefits of studying model robustness in software engineering.
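To make the attack idea concrete, below is a minimal, hypothetical sketch of how a visually similar adversarial method name might be produced by character-level substitution. This is not RADAR-Attack's actual search procedure, and the substitution table is an assumption for illustration only.

```python
# Illustrative sketch only: a naive way to produce a "visually similar" adversarial
# method name via look-alike character substitution. RADAR-Attack uses a more
# sophisticated attack procedure; this is NOT the paper's algorithm.

HOMOGLYPHS = {"l": "1", "o": "0", "i": "1", "e": "3"}  # hypothetical substitution table


def perturb_method_name(name: str, max_edits: int = 1) -> str:
    """Return a method name that looks similar to `name` but differs in a few characters."""
    edited = []
    edits = 0
    for ch in name:
        if edits < max_edits and ch in HOMOGLYPHS:
            edited.append(HOMOGLYPHS[ch])
            edits += 1
        else:
            edited.append(ch)
    return "".join(edited)


# Example: the perturbed name still "reads" the same to a human,
# but is a different token sequence for the code generation model.
print(perturb_method_name("sort_list_by_length"))  # -> "s0rt_list_by_length"
```

A perturbed name like this remains readable to a human, yet changes the token sequence the PCGM conditions on, which is why a degraded method name can derail generation and why RADAR-Defense re-synthesizes a clean name from the functional description.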

