Towards security defect prediction with AI

08/29/2018
by Carson D. Sestili, et al.

In this study, we investigate the limits of the current state-of-the-art AI system for detecting buffer overflows and compare it with current static analysis tools. To do so, we developed a code generator, s-bAbI, capable of producing an arbitrarily large number of code samples of controlled complexity. We found that the static analysis engines we examined have good precision but poor recall on this dataset, with the exception of a sound static analyzer, which has both good precision and good recall. We found that the state-of-the-art AI system, a memory network modeled after Choi et al. [1], can achieve performance similar to that of the static analysis engines, but requires an exhaustive amount of training data to do so. Our work points towards future approaches that may solve these problems, namely using representations of code that can capture appropriate scope information and using deep learning methods that are able to perform arithmetic operations.
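
The comparison above rests on precision and recall, so a small illustration of how the two can diverge may help. The following is a minimal sketch, not the study's evaluation code: the function name `precision_recall` and the label lists are hypothetical, and the numbers are invented for illustration only. Here `True` means "this sample contains a buffer overflow".

```python
def precision_recall(predicted, actual):
    """Return (precision, recall) for boolean defect labels.

    predicted[i] is the tool's verdict for sample i,
    actual[i] is the ground-truth label for the same sample.
    """
    pairs = list(zip(predicted, actual))
    true_pos = sum(1 for p, a in pairs if p and a)
    false_pos = sum(1 for p, a in pairs if p and not a)
    false_neg = sum(1 for p, a in pairs if not p and a)

    # Guard against division by zero when nothing is flagged
    # or when the dataset has no defective samples.
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    return precision, recall


# Invented labels: four defective samples, two safe ones.
actual = [True, True, True, True, False, False]
# A cautious tool that flags only one sample, and gets it right.
predicted = [True, False, False, False, False, False]

precision, recall = precision_recall(predicted, actual)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this made-up case the tool never raises a false alarm (precision 1.0) yet misses three of the four defects (recall 0.25), which is the "good precision, poor recall" pattern the abstract describes.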

