ReGVD: Revisiting Graph Neural Networks for Vulnerability Detection

by Van-Anh Nguyen, et al.

Identifying vulnerabilities in source code is essential to protecting software systems from cyberattacks. It is also a challenging step that requires specialized expertise in security and code representation. Inspired by the successful applications of pre-trained programming language (PL) models such as CodeBERT and of graph neural networks (GNNs), we propose ReGVD, a general and novel GNN-based model for vulnerability detection. In particular, ReGVD views a given source code as a flat sequence of tokens and examines two methods of constructing a single input graph from it, one based on unique tokens and one based on token indexes, wherein node features are initialized only by the embedding layer of a pre-trained PL model. Next, ReGVD leverages residual connections among GNN layers and explores a mixture of graph-level sum and max pooling to return a graph embedding for the given source code. Experimental results demonstrate that ReGVD outperforms the existing state-of-the-art models and obtains the highest accuracy on the real-world benchmark dataset from CodeXGLUE for vulnerability detection.
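The two graph constructions and the mixed readout described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not say how edges are formed or how the sum and max poolings are combined, so the sliding-window co-occurrence rule, the `window` parameter, the `mix` options, and all function names below are assumptions made for illustration. In the model itself, node vectors would come from the embedding layer of a pre-trained PL model and pass through GNN layers before the readout; here they are plain lists of floats.

```python
from itertools import combinations


def build_unique_token_graph(tokens, window=3):
    """Nodes are distinct tokens; repeated tokens share one node.

    Assumption: two tokens are linked if they co-occur inside a
    sliding window of `window` consecutive tokens.
    """
    nodes = sorted(set(tokens))
    index = {tok: i for i, tok in enumerate(nodes)}
    edges = set()
    for start in range(max(1, len(tokens) - window + 1)):
        span = tokens[start:start + window]
        for a, b in combinations(span, 2):
            if a != b:  # skip self-loops from repeated tokens
                i, j = index[a], index[b]
                edges.add((min(i, j), max(i, j)))
    return nodes, edges


def build_index_graph(tokens, window=3):
    """Nodes are token positions; repeated tokens stay separate nodes.

    Assumption: positions i < j are linked if j - i < window.
    """
    n = len(tokens)
    edges = set()
    for i in range(n):
        for j in range(i + 1, min(i + window, n)):
            edges.add((i, j))
    return list(range(n)), edges


def mixed_readout(node_vectors, mix="mul"):
    """Combine graph-level sum pooling and max pooling into one vector.

    Assumption: the combination is either an element-wise product
    ("mul") or an element-wise sum ("sum") of the two pooled vectors.
    """
    dims = range(len(node_vectors[0]))
    summed = [sum(v[d] for v in node_vectors) for d in dims]
    maxed = [max(v[d] for v in node_vectors) for d in dims]
    if mix == "mul":
        return [s * m for s, m in zip(summed, maxed)]
    if mix == "sum":
        return [s + m for s, m in zip(summed, maxed)]
    raise ValueError(f"unknown mix: {mix!r}")


if __name__ == "__main__":
    tokens = ["int", "x", "=", "x", "+", "1"]
    print(build_unique_token_graph(tokens, window=2))
    print(build_index_graph(tokens, window=2))
    print(mixed_readout([[1.0, 2.0], [3.0, -1.0]]))
```

With the same token sequence, the unique-token graph merges both occurrences of `x` into one node, while the index graph keeps one node per position, which is the main difference between the two constructions.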



Devign: Effective Vulnerability Identification by Learning Comprehensive Program Semantics via Graph Neural Networks

Vulnerability identification is crucial to protect the software systems ...

Towards Tracing Code Provenance with Code Watermarking

Recent advances in large language models have raised wide concern in gen...

Sequential Graph Neural Networks for Source Code Vulnerability Identification

Vulnerability identification constitutes a task of high importance for c...

Enhancing Security Patch Identification by Capturing Structures in Commits

With the rapid increasing number of open source software (OSS), the majo...

Roman Numeral Analysis with Graph Neural Networks: Onset-wise Predictions from Note-wise Features

Roman Numeral analysis is the important task of identifying chords and t...

Heterogeneous Graph Neural Networks for Software Effort Estimation

Software effort can be measured by story point [35]. Current approaches ...
