CodeAttack: Code-based Adversarial Attacks for Pre-Trained Programming Language Models

05/31/2022
by Akshita Jha, et al.

Pre-trained programming language (PL) models such as CodeT5, CodeBERT, and GraphCodeBERT have the potential to automate software engineering tasks involving code understanding and code generation. However, these models are not robust to changes in the input and are thus potentially susceptible to adversarial attacks. We propose CodeAttack, a simple yet effective black-box attack model that uses code structure to generate imperceptible, effective, and minimally perturbed adversarial code samples. We demonstrate the vulnerabilities of state-of-the-art PL models to code-specific adversarial attacks. We evaluate the transferability of CodeAttack on several code-code (translation and repair) and code-NL (summarization) tasks across different programming languages. CodeAttack outperforms state-of-the-art adversarial NLP attack models, achieving the best overall performance while being more efficient and imperceptible.
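
The abstract does not spell out CodeAttack's algorithm, so the following is only a generic illustration of what a black-box, structure-aware, minimally perturbing attack can look like, not the authors' method. It is a minimal Python sketch under several assumptions: the model is reachable only through a hypothetical `score` callable that returns its confidence in the original prediction, "code structure" is approximated by using the `ast` module to find renamable local names, and the perturbation budget is a cap on how many identifiers may be renamed. The names `renamable_names`, `rename`, `greedy_attack`, and `toy_score` are all invented for this example.

```python
import ast
import re
from typing import Callable, List

# Hypothetical black-box interface: takes source code, returns the model's
# confidence in its original prediction. No gradients or internals are used.
ScoreFn = Callable[[str], float]


def renamable_names(code: str) -> List[str]:
    """Collect function arguments and assigned variables via the AST."""
    names: List[str] = []
    for node in ast.walk(ast.parse(code)):
        if isinstance(node, ast.arg):
            name = node.arg
        elif isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store):
            name = node.id
        else:
            continue
        if name not in names:
            names.append(name)
    return names


def rename(code: str, old: str, new: str) -> str:
    """Whole-word textual rename (a simplification: it also touches
    strings and comments that happen to contain the same word)."""
    return re.sub(rf"\b{re.escape(old)}\b", new, code)


def greedy_attack(code: str, score: ScoreFn,
                  candidates: List[str], budget: int = 1) -> str:
    """Rename at most `budget` identifiers, keeping each rename only if it
    lowers the black-box score."""
    adv, changed = code, 0
    for name in renamable_names(code):
        if changed >= budget:
            break
        best, best_score = adv, score(adv)
        for cand in candidates:
            if re.search(rf"\b{re.escape(cand)}\b", adv):
                continue  # avoid colliding with an existing name
            trial = rename(adv, name, cand)
            trial_score = score(trial)
            if trial_score < best_score:
                best, best_score = trial, trial_score
        if best != adv:
            adv, changed = best, changed + 1
    return adv


def toy_score(code: str) -> float:
    """Toy stand-in for a model that over-relies on one identifier name."""
    return 0.9 if re.search(r"\btotal\b", code) else 0.2


src = "def add(total, x):\n    return total + x\n"
adv = greedy_attack(src, toy_score, candidates=["tmp", "val"], budget=1)
print(adv)
```

Renaming a local variable leaves program behavior unchanged, which is one simple way to keep a perturbation small and hard to notice. A real attack would replace `toy_score` with queries to an actual PL model and would need a more careful rename than a regular expression.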

05/26/2021

TreeBERT: A Tree-Based Pre-Trained Model for Programming Language

Source code can be parsed into the abstract syntax tree (AST) based on d...

09/12/2022

Semantic-Preserving Adversarial Code Comprehension

Based on the tremendous success of pre-trained language models (PrLMs) f...

01/06/2023

Adversarial Attacks on Neural Models of Code via Code Difference Reduction

Deep learning has been widely used to solve various code-based tasks by ...

02/08/2023

Systematically Finding Security Vulnerabilities in Black-Box Code Generation Models

Recently, large language models for code generation have achieved breakt...

11/29/2022

How Important are Good Method Names in Neural Code Generation? A Model Robustness Perspective

Pre-trained code generation models (PCGMs) have been widely applied in n...

06/13/2021

Target Model Agnostic Adversarial Attacks with Query Budgets on Language Understanding Models

Despite significant improvements in natural language understanding model...

10/30/2021

AdvCodeMix: Adversarial Attack on Code-Mixed Data

Research on adversarial attacks is becoming widely popular in the recen...