Vulnerability Mimicking Mutants

03/07/2023
by   Aayush Garg, et al.
0

With the increasing release of powerful language models trained on large code corpus (e.g. CodeBERT was trained on 6.4 million programs), a new family of mutation testing tools has arisen with the promise to generate more "natural" mutants in the sense that the mutated code aims at following the implicit rules and coding conventions typically produced by programmers. In this paper, we study to what extent the mutants produced by language models can semantically mimic the observable behavior of security-related vulnerabilities (a.k.a. Vulnerability-mimicking Mutants), so that designing test cases that are failed by these mutants will help in tackling mimicked vulnerabilities. Since analyzing and running mutants is computationally expensive, it is important to prioritize those mutants that are more likely to be vulnerability mimicking prior to any analysis or test execution. Taking this into account, we introduce VMMS, a machine learning based approach that automatically extracts the features from mutants and predicts the ones that mimic vulnerabilities. We conducted our experiments on a dataset of 45 vulnerabilities and found that 16.6 respective vulnerabilities. More precisely, 3.9 mutant set are vulnerability-mimicking mutants that mimic 55.6 vulnerabilities. Despite the scarcity, VMMS predicts vulnerability-mimicking mutants with 0.63 MCC, 0.80 Precision, and 0.51 Recall, demonstrating that the features of vulnerability-mimicking mutants can be automatically learned by machine learning models to statically predict these without the need of investing effort in defining such features.

READ FULL TEXT
research
01/05/2018

VulDeePecker: A Deep Learning-Based System for Vulnerability Detection

The automatic detection of software vulnerabilities is an important rese...
research
09/11/2023

FuzzLLM: A Novel and Universal Fuzzing Framework for Proactively Discovering Jailbreak Vulnerabilities in Large Language Models

Jailbreak vulnerabilities in Large Language Models (LLMs), which exploit...
research
04/10/2022

Is GitHub's Copilot as Bad As Humans at Introducing Vulnerabilities in Code?

Several advances in deep learning have been successfully applied to the ...
research
08/20/2023

Can Large Language Models Find And Fix Vulnerable Software?

In this study, we evaluated the capability of Large Language Models (LLM...
research
03/17/2018

Cost-aware Vulnerability Prediction: the HARMLESS Approach

Society needs more secure software. But predicting vulnerabilities is di...
research
04/16/2021

Neural Transfer Learning for Repairing Security Vulnerabilities in C Code

In this paper, we address the problem of automatic repair of software vu...
research
12/21/2020

Learning To Predict Vulnerabilities From Vulnerability-Fixes: A Machine Translation Approach

Vulnerability prediction refers to the problem of identifying the system...

Please sign up or login with your details

Forgot password? Click here to reset