Distinguishing Look-Alike Innocent and Vulnerable Code by Subtle Semantic Representation Learning and Explanation

08/22/2023
by   Chao Ni, et al.
0

Though many deep learning (DL)-based vulnerability detection approaches have been proposed and indeed achieved remarkable performance, they still have limitations in the generalization as well as the practical usage. More precisely, existing DL-based approaches (1) perform negatively on prediction tasks among functions that are lexically similar but have contrary semantics; (2) provide no intuitive developer-oriented explanations to the detected results. In this paper, we propose a novel approach named SVulD, a function-level Subtle semantic embedding for Vulnerability Detection along with intuitive explanations, to alleviate the above limitations. Specifically, SVulD firstly trains a model to learn distinguishing semantic representations of functions regardless of their lexical similarity. Then, for the detected vulnerable functions, SVulD provides natural language explanations (e.g., root cause) of results to help developers intuitively understand the vulnerabilities. To evaluate the effectiveness of SVulD, we conduct large-scale experiments on a widely used practical vulnerability dataset and compare it with four state-of-the-art (SOTA) approaches by considering five performance measures. The experimental results indicate that SVulD outperforms all SOTAs with a substantial improvement (i.e., 23.5 15.9 we conduct a user-case study to evaluate the usefulness of SVulD for developers on understanding the vulnerable code and the participants' feedback demonstrates that SVulD is helpful for development practice.

READ FULL TEXT
research
05/26/2023

Learning to Quantize Vulnerability Patterns and Match to Locate Statement-Level Vulnerabilities

Deep learning (DL) models have become increasingly popular in identifyin...
research
03/22/2021

Shallow or Deep? An Empirical Study on Detecting Vulnerabilities using Deep Learning

Deep learning (DL) techniques are on the rise in the software engineerin...
research
09/03/2020

Deep Learning based Vulnerability Detection: Are We There Yet?

Automated detection of software vulnerabilities is a fundamental problem...
research
06/19/2021

Vulnerability Detection with Fine-grained Interpretations

Despite the successes of machine learning (ML) and deep learning (DL) ba...
research
08/13/2021

Asteria: Deep Learning-based AST-Encoding for Cross-platform Binary Code Similarity Detection

Binary code similarity detection is a fundamental technique for many sec...
research
12/20/2021

VELVET: a noVel Ensemble Learning approach to automatically locate VulnErable sTatements

Automatically locating vulnerable statements in source code is crucial t...
research
02/11/2021

Why Don't Developers Detect Improper Input Validation?'; DROP TABLE Papers; –

Improper Input Validation (IIV) is a software vulnerability that occurs ...

Please sign up or login with your details

Forgot password? Click here to reset