DFI: An Interprocedural Value-Flow Analysis Framework that Scales to Large Codebases

by Min-Yih Hsu, et al.

Context- and flow-sensitive value-flow information is an important building block for many static analysis tools. Unfortunately, current approaches to computing value-flows do not scale to large codebases because of their high memory and runtime requirements. This paper proposes a new scalable approach that computes value-flows via graph reachability. To this end, we develop a new graph structure as an extension of LLVM IR that contains two additional operations that significantly simplify the modeling of pointer aliasing. Further, by processing nodes in the opposite direction of SSA def-use chains, we are able to minimize the tree width of the resulting graph. This allows us to employ efficient tree traversal algorithms to resolve graph reachability. We present a value-flow analysis framework, DFI, implementing our approach. We compare DFI against two state-of-the-art value-flow analysis frameworks, Phasar and SVF, by extracting value-flows from four real-world software projects. Given 32GB of memory, Phasar and SVF are unable to complete the analysis of larger projects such as OpenSSL or FFmpeg, while DFI is able to complete all evaluations. For the subset of benchmarks that Phasar and SVF do handle, DFI requires significantly less memory (1.5 average) and runs significantly faster (23x speedup over Phasar, 57x compared to SVF). Our analysis shows that, in contrast to previous approaches, DFI's memory and runtime requirements scale almost linearly with the number of analyzed instructions.
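To make the idea of value-flow as graph reachability concrete, the following is a minimal sketch, not DFI's graph structure or algorithm. It assumes a hypothetical five-instruction SSA-like program, ignores pointer aliasing and calling contexts entirely, and uses a plain breadth-first search rather than the tree traversal the abstract mentions. The names DEF_USE, invert, and flows_into are illustrative only. The one point it shares with the text is the direction of the walk: from a use back toward its definitions.

```python
from collections import deque

# Hypothetical def-use edges (definition -> users) for a tiny SSA-like program.
# This program is an illustration, not an example taken from the paper:
#   %a = source()
#   %b = add %a, 1
#   %c = const 7
#   %d = phi %b, %c
#   sink(%d)
DEF_USE = {
    "a": ["b"],
    "b": ["d"],
    "c": ["d"],
    "d": ["sink"],
}


def invert(edges):
    """Turn def -> uses edges into use -> defs edges."""
    use_def = {}
    for definition, uses in edges.items():
        for use in uses:
            use_def.setdefault(use, []).append(definition)
    return use_def


def flows_into(target, edges):
    """Return every value that can reach `target`, walking from use to def."""
    use_def = invert(edges)
    seen = set()
    queue = deque([target])
    while queue:
        node = queue.popleft()
        for definition in use_def.get(node, []):
            if definition not in seen:
                seen.add(definition)
                queue.append(definition)
    return seen


print(sorted(flows_into("sink", DEF_USE)))  # ['a', 'b', 'c', 'd']
```

Starting at the sink and walking backward visits only the values that can actually reach it, which is why a query for "b" returns just "a" and never touches "c".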




