Exploiting Token and Path-based Representations of Code for Identifying Security-Relevant Commits

11/15/2019
by Achyudh Ram, et al.

Public vulnerability databases such as CVE and NVD account for only 60% of the security vulnerabilities present in open-source projects, and are known to suffer from inconsistent quality. Over the last two years, the number of known vulnerabilities has grown considerably across projects available in repositories such as NPM and Maven Central. This growing risk calls for a mechanism to infer the presence of security threats in a timely manner. We propose novel hierarchical deep learning models for identifying security-relevant commits from either the commit diff or the source code of the Java classes. By comparing the performance of our model against code2vec, a state-of-the-art model that learns from path-based representations of code, and against a logistic regression baseline, we show that deep learning models yield promising results in identifying security-related commits. We also conduct a comparative analysis of how various deep learning models learn across different input representations, and of the effect of regularization on the generalization of our models.


