CVEfixes: Automated Collection of Vulnerabilities and Their Fixes from Open-Source Software

07/19/2021
by Guru Prasad Bhandari, et al.

Data-driven research on the automated discovery and repair of security vulnerabilities in source code requires comprehensive datasets of real-life vulnerable code and its fixes. To assist in such research, we propose a method to automatically collect and curate a comprehensive vulnerability dataset from Common Vulnerabilities and Exposures (CVE) records in the public National Vulnerability Database (NVD). We implement our approach in a fully automated dataset collection tool and share an initial release of the resulting vulnerability dataset, named CVEfixes. The CVEfixes collection tool automatically fetches all available CVE records from the NVD, gathers the vulnerable code and corresponding fixes from the associated open-source repositories, and organizes the collected information in a relational database. Moreover, the dataset is enriched with metadata such as programming language, and with detailed code and security metrics at five levels of abstraction. The collection can easily be repeated to keep up to date with newly discovered or patched vulnerabilities. The initial release of CVEfixes spans all published CVEs up to 9 June 2021, covering 5365 CVE records for 1754 open-source projects that were addressed in a total of 5495 vulnerability-fixing commits. CVEfixes supports various types of data-driven software security research, such as vulnerability prediction, vulnerability classification, vulnerability severity prediction, analysis of vulnerability-related code changes, and automated vulnerability repair.
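To make the idea of organizing CVE records, repositories, and fixing commits in a relational database more concrete, the following is a minimal sketch using Python's built-in `sqlite3` module. The table names (`cve`, `repository`, `fix_commit`), the column names, the helper functions, and all sample rows are hypothetical placeholders chosen for illustration; they are not the actual CVEfixes schema or data.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a simplified, hypothetical schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE cve (
            cve_id      TEXT PRIMARY KEY,
            description TEXT
        );
        CREATE TABLE repository (
            repo_url TEXT PRIMARY KEY,
            language TEXT
        );
        -- One row per (CVE, repository, commit) link.
        CREATE TABLE fix_commit (
            cve_id      TEXT REFERENCES cve(cve_id),
            repo_url    TEXT REFERENCES repository(repo_url),
            commit_hash TEXT,
            PRIMARY KEY (cve_id, repo_url, commit_hash)
        );
        """
    )
    # Placeholder rows only; none of these identifiers refer to real records.
    conn.executemany(
        "INSERT INTO cve VALUES (?, ?)",
        [("CVE-0000-0001", "placeholder A"), ("CVE-0000-0002", "placeholder B")],
    )
    conn.executemany(
        "INSERT INTO repository VALUES (?, ?)",
        [("https://example.org/repo-a", "C"), ("https://example.org/repo-b", "Python")],
    )
    conn.executemany(
        "INSERT INTO fix_commit VALUES (?, ?, ?)",
        [
            ("CVE-0000-0001", "https://example.org/repo-a", "aaa111"),
            ("CVE-0000-0001", "https://example.org/repo-a", "bbb222"),
            ("CVE-0000-0002", "https://example.org/repo-b", "ccc333"),
        ],
    )
    conn.commit()
    return conn


def fixes_per_language(conn: sqlite3.Connection) -> dict:
    """Count fixing commits per programming language by joining two tables."""
    rows = conn.execute(
        """
        SELECT r.language, COUNT(*)
        FROM fix_commit AS f
        JOIN repository AS r ON r.repo_url = f.repo_url
        GROUP BY r.language
        ORDER BY r.language
        """
    ).fetchall()
    return dict(rows)


if __name__ == "__main__":
    print(fixes_per_language(build_demo_db()))
```

Keeping the CVE-to-commit links in their own table lets one CVE map to several fixing commits, and one commit map to several CVEs, which is consistent with the abstract reporting different counts for CVE records and fixing commits.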

research
12/20/2022

AutoMESC: Automatic Framework for Mining and Classifying Ethereum Smart Contract Vulnerabilities and Their Fixes

Due to the risks associated with vulnerabilities in smart contracts, the...
research
02/07/2019

A Manually-Curated Dataset of Fixes to Vulnerabilities of Open-Source Software

Advancing our understanding of software vulnerabilities, automating thei...
research
09/15/2023

REEF: A Framework for Collecting Real-World Vulnerabilities and Fixes

Software plays a crucial role in our daily lives, and therefore the qual...
research
05/26/2023

AIBugHunter: A Practical Tool for Predicting, Classifying and Repairing Software Vulnerabilities

Many ML-based approaches have been proposed to automatically detect, loc...
research
07/21/2023

Exploring Security Commits in Python

Python has become the most popular programming language as it is friendl...
research
04/19/2021

Multi-context Attention Fusion Neural Network for Software Vulnerability Identification

Security issues in shipped code can lead to unforeseen device malfunctio...
research
12/21/2020

Learning To Predict Vulnerabilities From Vulnerability-Fixes: A Machine Translation Approach

Vulnerability prediction refers to the problem of identifying the system...