DeepDFA: Dataflow Analysis-Guided Efficient Graph Learning for Vulnerability Detection

12/15/2022
by Benjamin Steenhoek, et al.

Deep learning-based vulnerability detection models have recently been shown to be effective and, in some cases, to outperform static analysis tools. However, the highest-performing approaches use token-based transformer models, which do not leverage domain knowledge. Classical program analysis techniques such as dataflow analysis can detect many types of bugs and are the most commonly used methods in practice. Motivated by the causal relationship between bugs and dataflow analysis, we present DeepDFA, a dataflow analysis-guided graph learning framework and embedding that uses program semantic features for vulnerability detection. We show that DeepDFA is performant and efficient. Among all the state-of-the-art models we experimented with, DeepDFA ranked first in recall, first in generalizing over unseen projects, and second in F1. It is also the smallest model in terms of the number of parameters, and it was trained in 9 minutes, 69x faster than the highest-performing baseline. DeepDFA can also be combined with other models: by integrating LineVul and DeepDFA, we achieved the best vulnerability detection performance, with an F1 score of 96.4, a precision of 98.69, and a recall of 94.22.
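As a quick consistency check on the combined LineVul and DeepDFA figures reported above, the sketch below recomputes F1 from the stated precision and recall. It assumes the standard definition of F1 as the harmonic mean of precision and recall; the function name `f1_score` is a hypothetical helper for illustration, not part of the DeepDFA code.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F1 definition).

    Works for values given as percentages or as fractions, as long as
    both arguments use the same scale.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract (percentages).
precision = 98.69
recall = 94.22

combined_f1 = f1_score(precision, recall)
print(f"F1 = {combined_f1:.2f}")  # prints "F1 = 96.40"
```

The recomputed value agrees with the reported 96.4 F1 score to the precision given in the abstract.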
