VUDENC: Vulnerability Detection with Deep Learning on a Natural Codebase for Python

01/20/2022
by Laura Wartschinski, et al.

Context: Identifying potentially vulnerable code is important for improving the security of our software systems. However, manual detection of software vulnerabilities requires expert knowledge and is time-consuming, so it must be supported by automated techniques.

Objective: Such automated vulnerability detection techniques should achieve high accuracy, point developers directly to the vulnerable code fragments, scale to real-world software, generalize across the boundaries of a specific software project, and require no or only moderate setup or configuration effort.

Method: In this article, we present VUDENC (Vulnerability Detection with Deep Learning on a Natural Codebase), a deep learning-based vulnerability detection tool that automatically learns features of vulnerable code from a large, real-world Python codebase. VUDENC applies a word2vec model to identify semantically similar code tokens and to provide a vector representation. A network of long short-term memory (LSTM) cells is then used to classify vulnerable code token sequences at a fine-grained level, highlight the specific areas in the source code that are likely to contain vulnerabilities, and provide confidence levels for its predictions.

Results: To evaluate VUDENC, we used 1,009 vulnerability-fixing commits from different GitHub repositories for training. These commits cover seven types of vulnerabilities (SQL injection, XSS, command injection, XSRF, remote code execution, path disclosure, open redirect). In the experimental evaluation, VUDENC achieves a recall of 78[…] F1 score of 80[…] the Python corpus for the word2vec model are available for reproduction.

Conclusions: Our experimental results suggest...
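The fine-grained classification and highlighting described in the abstract can be illustrated with a small sketch. The abstract does not say how token sequences are cut or scored, so everything below is an assumption made for illustration: the sliding-window scheme, the window size and step, and the helper names `code_tokens`, `windows`, `toy_score`, and `flag_lines` are all hypothetical. In particular, `toy_score` is a hand-written stand-in for the confidence a trained word2vec + LSTM model would output; it is not VUDENC's classifier.

```python
import io
import tokenize

# Layout and comment tokens carry no code semantics for this sketch.
_SKIP = {tokenize.COMMENT, tokenize.NL, tokenize.NEWLINE,
         tokenize.INDENT, tokenize.DEDENT, tokenize.ENDMARKER}


def code_tokens(source):
    """Return (token_text, line_number) pairs for the code tokens in source."""
    reader = io.StringIO(source).readline
    return [(tok.string, tok.start[0])
            for tok in tokenize.generate_tokens(reader)
            if tok.type not in _SKIP]


def windows(tokens, size=10, step=5):
    """Yield overlapping windows of tokens (assumed scheme, not from the paper)."""
    last = max(len(tokens) - size, 0)
    starts = list(range(0, last + 1, step))
    if starts[-1] != last:
        starts.append(last)  # make sure the final tokens are covered
    for start in starts:
        yield tokens[start:start + size]


def toy_score(window_tokens):
    """Hypothetical stand-in for a trained model's confidence in [0, 1].

    Flags windows that both build a string with + or % and call execute,
    a crude proxy for SQL injection.
    """
    builds_string = "+" in window_tokens or "%" in window_tokens
    return 0.9 if builds_string and "execute" in window_tokens else 0.1


def flag_lines(source, threshold=0.5, size=10, step=5):
    """Map each flagged line number to the highest score of any window touching it."""
    line_scores = {}
    for window in windows(code_tokens(source), size, step):
        score = toy_score([text for text, _ in window])
        for _, line in window:
            line_scores[line] = max(line_scores.get(line, 0.0), score)
    return {line: score for line, score in sorted(line_scores.items())
            if score >= threshold}


SAMPLE = '''def find_user(cursor, name):
    query = "SELECT * FROM users WHERE name = '" + name + "'"
    cursor.execute(query)
    return cursor.fetchone()
'''

if __name__ == "__main__":
    for line, score in flag_lines(SAMPLE).items():
        print(f"line {line}: confidence {score:.1f}")
```

Replacing `toy_score` with a model that embeds each token and feeds the sequence to a recurrent network would leave the surrounding flow unchanged: windows are scored independently, and each source line inherits the highest confidence of any window that overlaps it.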

research
01/08/2020

VulDeeLocator: A Deep Learning-based Fine-grained Vulnerability Detector

Automatically detecting software vulnerabilities is an important problem...
research
03/13/2023

Automated Vulnerability Detection in Source Code Using Quantum Natural Language Processing

One of the most important challenges in the field of software code audit...
research
08/04/2021

A Comparison of Different Source Code Representation Methods for Vulnerability Prediction in Python

In the age of big data and machine learning, at a time when the techniqu...
research
02/23/2023

Detecting software vulnerabilities using Language Models

Recently, deep learning techniques have garnered substantial attention f...
research
06/26/2023

Can An Old Fashioned Feature Extraction and A Light-weight Model Improve Vulnerability Type Identification Performance?

Recent advances in automated vulnerability detection have achieved poten...
research
06/28/2023

Limits of Machine Learning for Automatic Vulnerability Detection

Recent results of machine learning for automatic vulnerability detection...
research
05/27/2023

Backdooring Neural Code Search

Reusing off-the-shelf code snippets from online repositories is a common...
