SPARK: Static Program Analysis Reasoning and Retrieving Knowledge

11/03/2017
by Wasuwee Sodsong, et al.

Program analysis is a technique for reasoning about programs without executing them, with applications in compilers, integrated development environments, and security. In this work, we present a machine learning pipeline that induces a security analyzer for programs by example. The security analyzer classifies a program as secure or insecure based on symbolic rules deduced by the pipeline. The pipeline has two stages: a Recurrent Neural Network (RNN) and an Extractor that converts the RNN into symbolic rules. To evaluate the quality of the learned symbolic rules, we propose a sampling-based similarity measurement between two infinite regular languages. We conduct a case study using real-world data, and we discuss the limitations of existing techniques and possible future improvements. The results show that, with sufficient training data and a fair distribution of program paths, it is feasible to deduce symbolic security rules for the OpenJDK library, which has millions of lines of code.
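The abstract does not spell out how the sampling-based similarity is computed, so the following is only a minimal sketch of the general idea, not the authors' measure. It assumes both regular languages are given as deterministic finite automata, that words are sampled with a uniformly chosen length up to a bound, and that similarity is the fraction of sampled words on which the two automata agree. The `DFA` class, `sampled_agreement`, and all parameters are hypothetical names introduced for illustration.

```python
import random


class DFA:
    """Hypothetical minimal DFA; missing transitions lead to rejection."""

    def __init__(self, start, accepting, transitions):
        self.start = start
        self.accepting = set(accepting)
        self.transitions = transitions  # maps (state, symbol) -> next state

    def accepts(self, word):
        state = self.start
        for symbol in word:
            state = self.transitions.get((state, symbol))
            if state is None:  # no transition defined: reject
                return False
        return state in self.accepting


def sampled_agreement(dfa_a, dfa_b, alphabet, n_samples=10000, max_len=12, seed=0):
    """Estimate similarity as the share of sampled words both DFAs classify alike.

    Assumption: word length is uniform in [0, max_len] and each symbol is
    uniform over the alphabet. Other sampling schemes give other estimates.
    """
    rng = random.Random(seed)
    agree = 0
    for _ in range(n_samples):
        length = rng.randint(0, max_len)
        word = [rng.choice(alphabet) for _ in range(length)]
        if dfa_a.accepts(word) == dfa_b.accepts(word):
            agree += 1
    return agree / n_samples


alphabet = ["a", "b"]

# Accepts words with an even number of "a".
even_a = DFA(0, {0}, {(0, "a"): 1, (0, "b"): 0, (1, "a"): 0, (1, "b"): 1})

# Same language, written with different state names.
even_a_renamed = DFA("e", {"e"}, {("e", "a"): "o", ("e", "b"): "e",
                                  ("o", "a"): "e", ("o", "b"): "o"})

# Accepts every word over the alphabet.
all_words = DFA(0, {0}, {(0, "a"): 0, (0, "b"): 0})

same = sampled_agreement(even_a, even_a_renamed, alphabet)
different = sampled_agreement(even_a, all_words, alphabet)
```

Here `same` compares two automata for one language, while `different` compares a language with a strict superset, so the second estimate falls below the first. Because both languages are infinite, any such estimate depends on the length bound and sampling distribution, which is why those choices are flagged as assumptions.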

research
08/15/2022

A Library for Representing Python Programs as Graphs for Machine Learning

Graph representations of programs are commonly a central element of mach...
research
08/04/2022

Information Flow Control-by-Construction for an Object-Oriented Language Using Type Modifiers

In security-critical software applications, confidential information mus...
research
08/24/2022

Deep Symbolic Learning: Discovering Symbols and Rules from Perceptions

Neuro-Symbolic (NeSy) integration combines symbolic reasoning with Neura...
research
09/21/2023

Turaco: Complexity-Guided Data Sampling for Training Neural Surrogates of Programs

Programmers and researchers are increasingly developing surrogates of pr...
research
02/24/2020

Superoptimization of WebAssembly Bytecode

Motivated by the fast adoption of WebAssembly, we propose the first func...
research
09/27/2021

Cyber-Physical Taint Analysis in Multi-stage Manufacturing Systems (MMS): A Case Study

Information flows are intrinsic properties of a multi-stage manufacturi...
research
07/24/2023

ChatGPT for Software Security: Exploring the Strengths and Limitations of ChatGPT in the Security Applications

ChatGPT, as a versatile large language model, has demonstrated remarkabl...
