CodeAttack: Code-based Adversarial Attacks for Pre-Trained Programming Language Models

05/31/2022
by Akshita Jha, et al.

Pre-trained programming language (PL) models (e.g., CodeT5, CodeBERT, GraphCodeBERT) have the potential to automate software engineering tasks involving code understanding and code generation. However, these models are not robust to changes in the input and are thus potentially susceptible to adversarial attacks. We propose CodeAttack, a simple yet effective black-box attack model that uses code structure to generate imperceptible, effective, and minimally perturbed adversarial code samples. We demonstrate the vulnerability of state-of-the-art PL models to code-specific adversarial attacks, and evaluate the transferability of CodeAttack on several code-code (translation and repair) and code-NL (summarization) tasks across different programming languages. CodeAttack outperforms state-of-the-art adversarial NLP attack models, achieving the best overall performance while being more efficient and imperceptible.
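The abstract does not detail CodeAttack's perturbation operators, but the general family it describes (structure-aware, semantics-preserving edits to source code) can be illustrated with a toy sketch. The example below, which is hypothetical and not the paper's actual method, renames a single identifier using Python's ast module; the program's behavior is unchanged, yet a token-sensitive model sees different input.

```python
import ast

def rename_identifier(source: str, old: str, new: str) -> str:
    """Rename one variable identifier throughout a Python snippet.

    Because only the name changes, program semantics are preserved --
    a minimal example of a structure-aware code perturbation.
    (Hypothetical helper; not part of the CodeAttack implementation.)
    """
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, ast.Name) and node.id == old:
            node.id = old and new  # replace variable uses
        elif isinstance(node, ast.arg) and node.arg == old:
            node.arg = new         # replace function parameters
    return ast.unparse(tree)       # requires Python 3.9+

original = "def add(a, b):\n    total = a + b\n    return total"
perturbed = rename_identifier(original, "total", "tota1")
print(perturbed)  # 'total' is now the visually similar 'tota1'
```

A real attack would search over many such candidate edits (e.g., with a black-box scoring loop against the victim model) and keep the minimal perturbation that flips the model's output.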


Related research:
- 08/22/2023: Adversarial Attacks on Code Models with Discriminative Graph Patterns
- 09/12/2022: Semantic-Preserving Adversarial Code Comprehension
- 07/27/2023: Universal and Transferable Adversarial Attacks on Aligned Language Models
- 02/08/2023: Systematically Finding Security Vulnerabilities in Black-Box Code Generation Models
- 11/29/2022: How Important are Good Method Names in Neural Code Generation? A Model Robustness Perspective
- 09/15/2023: Adversarial Attacks on Tables with Entity Swap
- 10/30/2021: AdvCodeMix: Adversarial Attack on Code-Mixed Data
