Representing LLVM-IR in a Code Property Graph

11/09/2022
by Alexander Küchler, et al.

In recent years, a number of static application security testing tools have been proposed that make use of so-called code property graphs (CPGs), a graph model that keeps rich information about the source code while enabling its users to write language-agnostic analyses. However, these tools suffer from several shortcomings. They work mostly on source code and cannot analyze third-party dependencies that are only available as compiled binaries. Furthermore, their analysis is limited to the programming languages they explicitly support. While support for well-established languages such as C/C++ or Java is often included, languages that are still heavily evolving, such as Rust, are not considered because of the constant changes in the language design. To overcome these limitations, we extend an open-source implementation of a code property graph to support LLVM-IR, which many compilers and binary lifters can produce as output. In this paper, we discuss how we address the challenges that arise when mapping concepts of an intermediate representation to a CPG. At the same time, we optimize the resulting graph to be minimal and close to the representation of equivalent source code. Our evaluation indicates that existing analyses can be reused without modifications and that the performance requirements are comparable to operating on source code. This makes the approach suitable for the analysis of large-scale projects.
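
To make the idea of mapping an intermediate representation to source-like graph nodes more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a toy node type (`Node`), hypothetical node kinds (`Declaration`, `BinaryOperator`, `Reference`, `Literal`), and a hypothetical helper `lift` that handles only simple three-operand arithmetic instructions such as `%3 = add i32 %1, %2`. None of these names are taken from the paper or from any CPG library.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Node:
    """Toy graph node; 'kind' and 'code' are illustrative, not a real CPG schema."""
    kind: str
    code: str
    children: list = field(default_factory=list)


# Hypothetical table translating LLVM-IR opcodes to source-level operators.
OPERATORS = {"add": "+", "sub": "-", "mul": "*"}

# Matches only the simple form '<target> = <opcode> <type> <lhs>, <rhs>'.
# Instructions with flags (e.g. 'add nsw') are deliberately not handled here.
PATTERN = re.compile(
    r"^\s*(%\w+)\s*=\s*(\w+)\s+(\w+)\s+(%?\w+),\s*(%?\w+)\s*$"
)


def operand(text: str) -> Node:
    """Operands starting with '%' are references, everything else a literal."""
    return Node("Reference" if text.startswith("%") else "Literal", text)


def lift(line: str) -> Node:
    """Turn one arithmetic instruction into a source-like declaration subtree."""
    match = PATTERN.match(line)
    if match is None or match.group(2) not in OPERATORS:
        raise ValueError(f"unsupported instruction: {line!r}")
    target, opcode, _type, lhs, rhs = match.groups()
    expression = Node("BinaryOperator", OPERATORS[opcode],
                      [operand(lhs), operand(rhs)])
    # The SSA target becomes a declaration initialized by the expression,
    # which resembles what equivalent source code ('c = a + b') would yield.
    return Node("Declaration", target, [expression])


if __name__ == "__main__":
    tree = lift("%3 = add i32 %1, %2")
    print(tree.kind, tree.code, tree.children[0].code)
```

Because the instruction is represented as a declaration with a binary-operator initializer, an analysis written against source-level nodes could, under these assumptions, traverse the lifted subtree without knowing that it originated from LLVM-IR.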

research
03/16/2022

A Language-Independent Analysis Platform for Source Code

In this paper, we present the CPG analysis platform, which enables the t...
research
01/19/2022

Cross-Language Binary-Source Code Matching with Intermediate Representations

Binary-source code matching plays an important role in many security and...
research
06/09/2022

ESBMC-Jimple: Verifying Kotlin Programs via Jimple Intermediate Representation

In this work, we describe and evaluate the first model checker for verif...
research
08/11/2023

A Uniform Representation of Classical and Quantum Source Code for Static Code Analysis

The emergence of quantum computing raises the question of how to identif...
research
05/12/2019

Static Analyzers and Potential Future Research Directions for Scala: An Overview

Static analyzers are tool sets which are proving to be indispensable to ...
research
06/07/2019

Datalog Disassembly

Disassembly is fundamental to binary analysis and rewriting. We present ...
research
06/06/2020

Replacements and Replaceables: Making the Case for Code Variants

There are often multiple ways to implement the same requirement in sourc...
