VulLibGen: Identifying Vulnerable Third-Party Libraries via Generative Pre-Trained Model

08/09/2023
by Tianyu Chen, et al.

To avoid potential risks posed by vulnerabilities in third-party libraries, security researchers maintain vulnerability databases (e.g., NVD) containing vulnerability reports, each of which records the description of a vulnerability and the name list of libraries affected by the vulnerability (a.k.a. vulnerable libraries). However, recent studies on about 200,000 vulnerability reports in NVD show that 53.3% of these reports do not include the name list of vulnerable libraries, and 59.82% of the included name lists of vulnerable libraries are incomplete or incorrect. To address this issue, we propose VulLibGen, the first generative approach that uses Large Language Models (LLMs) to generate the name list of vulnerable libraries (out of all the existing libraries) for a given vulnerability with high accuracy. VulLibGen takes only the description of a vulnerability as input and achieves high identification accuracy based on LLMs' prior knowledge of all the existing libraries. VulLibGen also includes an input augmentation technique to help identify zero-shot vulnerable libraries (those not occurring during training) and a post-processing technique to help address VulLibGen's hallucinations. We evaluate VulLibGen against three state-of-the-art/practice approaches (LightXML, Chronos, and VulLibMiner) that identify vulnerable libraries on an open-source dataset (VulLib). Our evaluation results show that VulLibGen can accurately identify vulnerable libraries with an average F1 score of 0.626, while the state-of-the-art/practice approaches achieve only 0.561. The post-processing technique helps VulLibGen achieve an average improvement of F1@1 by 9.3%. The input augmentation technique helps VulLibGen achieve an average improvement of F1@1 by 39% in identifying zero-shot libraries.
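
The abstract does not say how the post-processing step works. As a rough illustration only, one plausible way to handle a hallucinated name is to snap the generated string to the closest name in a catalogue of existing libraries. The sketch below assumes that approach; the catalogue contents, the function name `snap_to_known_library`, and the similarity cutoff are all hypothetical and are not taken from the paper.

```python
from difflib import get_close_matches

# Hypothetical catalogue of existing library names (illustrative only).
KNOWN_LIBRARIES = [
    "org.apache.logging.log4j:log4j-core",
    "com.fasterxml.jackson.core:jackson-databind",
    "org.springframework:spring-web",
]


def snap_to_known_library(generated_name, known=KNOWN_LIBRARIES, cutoff=0.6):
    """Map a generated library name to the closest known name, or None.

    Assumption: string similarity is a reasonable proxy for the library
    the model intended. The cutoff value is arbitrary.
    """
    if generated_name in known:
        # The generated name already exists; nothing to repair.
        return generated_name
    # get_close_matches ranks candidates by SequenceMatcher ratio and
    # drops any candidate scoring below the cutoff.
    matches = get_close_matches(generated_name, known, n=1, cutoff=cutoff)
    return matches[0] if matches else None


# A near-miss name is repaired; an unrelated string is rejected.
repaired = snap_to_known_library("org.apache.logging.log4j:log4j-cor")
rejected = snap_to_known_library("zzzz")
```

Returning `None` for low-similarity output, rather than forcing a match, keeps an unrelated hallucination from being silently mapped to a real library.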


