Systematically Finding Security Vulnerabilities in Black-Box Code Generation Models

by   Hossein Hajipour, et al.

Recently, large language models for code generation have achieved breakthroughs in several programming language tasks. Their advances in competition-level programming problems have made them an emerging pillar in AI-assisted pair programming. Tools such as GitHub Copilot are already part of the daily programming workflow and are used by more than a million developers. The training data for these models is usually collected from open-source repositories (e.g., GitHub) that contain software faults and security vulnerabilities. This unsanitized training data can lead language models to learn these vulnerabilities and propagate them in the code generation procedure. Given the wide use of these models in the daily workflow of developers, it is crucial to study the security aspects of these models systematically. In this work, we propose the first approach to automatically finding security vulnerabilities in black-box code generation models. To achieve this, we propose a novel black-box inversion approach based on few-shot prompting. We evaluate the effectiveness of our approach by examining code generation models in the generation of high-risk security weaknesses. We show that our approach automatically and systematically finds 1000s of security vulnerabilities in various code generation models, including the commercial black-box model GitHub Copilot.


page 1

page 8


CodeAttack: Code-based Adversarial Attacks for Pre-Trained Programming Language Models

Pre-trained programming language (PL) models (such as CodeT5, CodeBERT, ...

Do Not Give Away My Secrets: Uncovering the Privacy Issue of Neural Code Completion Tools

Neural Code Completion Tools (NCCTs) have reshaped the field of software...

Systematic Meets Unintended: Prior Knowledge Adaptive 5G Vulnerability Detection via Multi-Fuzzing

The virtualization and softwarization of 5G and NextG are critical enabl...

An Empirical Cybersecurity Evaluation of GitHub Copilot's Code Contributions

There is burgeoning interest in designing AI-based systems to assist hum...

Is GitHub's Copilot as Bad As Humans at Introducing Vulnerabilities in Code?

Several advances in deep learning have been successfully applied to the ...

Can OpenAI Codex and Other Large Language Models Help Us Fix Security Bugs?

Human developers can produce code with cybersecurity weaknesses. Can eme...

Open Sesame! Universal Black Box Jailbreaking of Large Language Models

Large language models (LLMs), designed to provide helpful and safe respo...

Please sign up or login with your details

Forgot password? Click here to reset