Feature Engineering-Based Detection of Buffer Overflow Vulnerability in Source Code Using Neural Networks

06/01/2023
by Mst Shapna Akter, et al.

One of the most significant challenges in software code auditing is the presence of vulnerabilities in software source code. Every year, more software flaws are discovered, either internally in proprietary code or through public disclosure. These flaws are highly likely to be exploited and can lead to system compromise, data leakage, or denial of service. To create a large-scale machine learning system for function-level vulnerability identification, we used a sizable dataset of open-source C and C++ code containing millions of functions with potential buffer overflow exploits. We have developed an efficient and scalable vulnerability detection method based on neural network models that learn features extracted from the source code. The source code is first converted into an intermediate representation to remove unnecessary components and shorten dependencies. We maintain the semantic and syntactic information using state-of-the-art word embedding algorithms such as GloVe and fastText. The embedded vectors are subsequently fed into neural networks such as LSTM, BiLSTM, LSTM Autoencoder, word2vec, BERT, and GPT2 to classify the possible vulnerabilities. Furthermore, we have proposed a neural network model that can overcome issues associated with traditional neural networks. We measure performance using F1 score, precision, recall, accuracy, and total execution time. Finally, we compare the results obtained from features containing a minimal text representation with those obtained from features that carry semantic and syntactic information.
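The abstract does not spell out how the intermediate representation is built. As a rough illustration only, the sketch below assumes one plausible form of it: comments are stripped and user-defined identifiers are replaced with generic placeholders, leaving a short token sequence that an embedding model could consume. The names `normalize` and `KEEP`, the keyword list, and the `VARn` naming scheme are all hypothetical and not taken from the paper.

```python
import re

# Illustrative (not exhaustive) set of C keywords and library calls to keep verbatim.
KEEP = {
    "int", "char", "void", "return", "if", "else", "for", "while", "sizeof",
    "strcpy", "strncpy", "memcpy", "gets", "sprintf",
}

# Identifiers, integer literals, or any single non-space character.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|\S")
# Line and block comments (naive: does not handle comment markers inside strings).
COMMENT_RE = re.compile(r"//[^\n]*|/\*.*?\*/", re.DOTALL)


def normalize(source):
    """Strip comments and map user-defined identifiers to VAR1, VAR2, ..."""
    source = COMMENT_RE.sub(" ", source)
    names = {}   # original identifier -> placeholder
    tokens = []
    for tok in TOKEN_RE.findall(source):
        if re.match(r"[A-Za-z_]", tok) and tok not in KEEP:
            tok = names.setdefault(tok, "VAR%d" % (len(names) + 1))
        tokens.append(tok)
    return tokens


example = """
void copy(char *src) {
    char buf[8];      // fixed-size buffer
    strcpy(buf, src); /* no bounds check */
}
"""

tokens = normalize(example)
print(" ".join(tokens))
```

Renaming identifiers consistently keeps the data flow visible (the same buffer is still the same placeholder at its declaration and at the `strcpy` call) while shrinking the vocabulary the embedding has to learn. A real pipeline would need a proper C/C++ lexer to handle string literals, preprocessor directives, and multi-character operators.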

research
03/13/2023

Automated Vulnerability Detection in Source Code Using Quantum Natural Language Processing

One of the most important challenges in the field of software code audit...
research
04/29/2021

A comparative study of neural network techniques for automatic software vulnerability detection

Software vulnerabilities are usually caused by design flaws or implement...
research
09/05/2023

Using a Nearest-Neighbour, BERT-Based Approach for Scalable Clone Detection

Code clones can detrimentally impact software maintenance and manually d...
research
08/04/2021

A Comparison of Different Source Code Representation Methods for Vulnerability Prediction in Python

In the age of big data and machine learning, at a time when the techniqu...
research
12/20/2021

Vulnerability Analysis of the Android Kernel

We describe a workflow used to analyze the source code of the Android OS...
research
02/14/2018

Automated software vulnerability detection with machine learning

Thousands of security vulnerabilities are discovered in production softw...
research
05/07/2021

Code2Image: Intelligent Code Analysis by Computer Vision Techniques and Application to Vulnerability Prediction

Intelligent code analysis has received increasing attention in parallel ...
