Software Vulnerability Prediction Knowledge Transferring Between Programming Languages

03/10/2023
by   Khadija Hanifi, et al.

Automated software vulnerability detection models have received considerable attention from both the research and development communities. One of the biggest challenges in this area is the lack of labeled code samples across programming languages. In this study, we address this issue by proposing a transfer learning technique that leverages available datasets to build a model that detects common vulnerabilities in different programming languages. We train a Convolutional Neural Network (CNN) model on C source code samples, then use Java source code samples to adapt and evaluate the learned model. The code samples come from two benchmark datasets: the NIST Software Assurance Reference Dataset (SARD) and the Draper VDISC dataset. The results show that the proposed model detects vulnerabilities in both C and Java code with an average recall of 72%. Additionally, we employ explainable AI to investigate how much each feature contributes to the knowledge transfer between C and Java in the proposed model.
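For readers unfamiliar with the reported metric, the following is a minimal sketch of how recall on the vulnerable class could be computed per language and then averaged. It is an illustration only: the function names, the label convention (1 = vulnerable, 0 = not vulnerable), the unweighted averaging across languages, and the toy labels are all assumptions, not details taken from the paper, and the toy numbers are unrelated to the reported 72%.

```python
def recall(y_true, y_pred):
    """Recall = TP / (TP + FN) for the positive class (assumed label 1 = vulnerable)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp + fn == 0:
        raise ValueError("no positive samples; recall is undefined")
    return tp / (tp + fn)


def average_recall(per_language):
    """Unweighted mean of per-language recall (an assumed averaging scheme)."""
    scores = [recall(y_true, y_pred) for y_true, y_pred in per_language.values()]
    return sum(scores) / len(scores)


# Hypothetical toy labels, for illustration only: (ground truth, predictions).
toy_results = {
    "C": ([1, 1, 1, 1, 0, 0], [1, 1, 1, 0, 0, 1]),
    "Java": ([1, 1, 0, 0], [1, 0, 0, 0]),
}

for language, (y_true, y_pred) in toy_results.items():
    print(language, recall(y_true, y_pred))
print("average", average_recall(toy_results))
```

Recall is a natural metric for vulnerability detection because it measures the share of truly vulnerable samples that the model flags, so a missed vulnerability lowers it directly while a false alarm does not.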

research
05/10/2022

Cross-Language Source Code Clone Detection Using Deep Learning with InferCode

Software clones are beneficial to detect security gaps and software main...
research
04/05/2019

On the Feasibility of Transfer-learning Code Smells using Deep Learning

Context: A substantial amount of work has been done to detect smells in ...
research
05/19/2023

CCT-Code: Cross-Consistency Training for Multilingual Clone Detection and Code Search

We consider the clone detection and information retrieval problems for s...
research
04/15/2019

Semantic Source Code Models Using Identifier Embeddings

The emergence of online open source repositories in the recent years has...
research
10/21/2020

SeqTrans: Automatic Vulnerability Fix via Sequence to Sequence Learning

Software vulnerabilities are now reported at an unprecedented speed due ...
research
11/25/2020

Probing Model Signal-Awareness via Prediction-Preserving Input Minimization

This work explores the signal awareness of AI models for source code und...
