Optimal Dyck Reachability for Data-Dependence and Alias Analysis

by Krishnendu Chatterjee, et al.

A fundamental algorithmic problem at the heart of static analysis is Dyck reachability. The input is a graph whose edges are labeled with different types of opening and closing parentheses, and reachability information is computed via paths whose parentheses are properly matched. We present new results for Dyck reachability problems with applications to alias analysis and data-dependence analysis. Our main contributions, which include improved upper bounds as well as lower bounds that establish optimality guarantees, are as follows. First, we consider Dyck reachability on bidirected graphs, which is the standard way of performing field-sensitive points-to analysis. Given a bidirected graph with n nodes and m edges, we present: (i) an algorithm with worst-case running time O(m + n · α(n)), where α(n) is the inverse Ackermann function, improving the previously known O(n^2) time bound; (ii) a matching lower bound showing that our algorithm is optimal with respect to worst-case complexity; and (iii) an optimal average-case upper bound of O(m) time, improving the previously known O(m · log n) bound. Second, we consider the problem of context-sensitive data-dependence analysis, where the task is to obtain analysis summaries of library code in the presence of callbacks. Our algorithm preprocesses libraries in almost linear time, after which the contribution of the library to the complexity of the client analysis is only linear, and only with respect to the number of call sites. Third, we prove that combinatorial algorithms for Dyck reachability on general graphs with truly sub-cubic bounds cannot be obtained without obtaining sub-cubic combinatorial algorithms for Boolean Matrix Multiplication, which is a long-standing open problem. We also show that the same hardness holds for graphs of constant treewidth.
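To make the bidirected setting concrete, here is a minimal Python sketch of one simple way to compute Dyck-reachability classes with a union-find structure. It is an illustration only, not the paper's algorithm, and it is not claimed to achieve the O(m + n · α(n)) bound. It assumes that every edge is given once as an opening edge `(u, v, k)`, with the matching closing edge from `v` back to `u` implied, and it relies on the observation that two nodes with opening edges of the same label into the same class reach each other via a matched pair. The names `bidirected_dyck_components` and `open_edges` are hypothetical.

```python
def bidirected_dyck_components(n, open_edges):
    """Return a list comp where comp[x] == comp[y] iff x and y are merged.

    n          -- number of nodes, labeled 0..n-1
    open_edges -- iterable of (u, v, k): u --open_k--> v, with the
                  closing edge v --close_k--> u implied (bidirected graph)
    """
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # incoming[c][k] holds ONE source node with an open_k edge into class c.
    # One witness suffices, since all such sources end up in the same class.
    incoming = [dict() for _ in range(n)]
    pending = []  # pairs of nodes that must be merged

    for u, v, k in open_edges:
        if k in incoming[v]:
            pending.append((incoming[v][k], u))
        else:
            incoming[v][k] = u

    while pending:
        a, b = pending.pop()
        a, b = find(a), find(b)
        if a == b:
            continue
        # Fold the smaller label map into the larger one.
        if len(incoming[a]) < len(incoming[b]):
            a, b = b, a
        parent[b] = a
        for k, src in incoming[b].items():
            if k in incoming[a]:
                # Two sources with the same label now enter one class.
                pending.append((incoming[a][k], src))
            else:
                incoming[a][k] = src
        incoming[b] = {}

    return [find(x) for x in range(n)]


if __name__ == "__main__":
    edges = [(0, 2, "a"), (1, 2, "a"), (3, 0, "b"), (4, 1, "b")]
    print(bidirected_dyck_components(5, edges))
```

In the usage example, nodes 0 and 1 both have an `a`-labeled opening edge into node 2, so they are merged; that merge in turn makes nodes 3 and 4 share a `b`-labeled target class, so they are merged as well, while node 2 stays on its own.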


Conditional Lower Bound for Inclusion-Based Points-to Analysis

Inclusion-based (i.e., Andersen-style) points-to analysis is a fundament...

The Fine-Grained and Parallel Complexity of Andersen's Pointer Analysis

Pointer analysis is one of the fundamental problems in static program an...

Subcubic Certificates for CFL Reachability

Many problems in interprocedural program analysis can be modeled as the ...

FlowCFL: A Framework for Type-based Reachability Analysis in the Presence of Mutable Data

Reachability analysis is a fundamental program analysis with a wide vari...

The Decidability and Complexity of Interleaved Bidirected Dyck Reachability

Dyck reachability is the standard formulation of a large domain of stati...

The Fine-Grained Complexity of Andersen's Pointer Analysis

Pointer analysis is one of the fundamental problems in static program an...

Indexing Context-Sensitive Reachability

Many context-sensitive data flow analyses can be formulated as a variant...