Conquering the Extensional Scalability Problem for Value-Flow Analysis Frameworks

12/14/2019 · Qingkai Shi et al. · Xiamen University and The Hong Kong University of Science and Technology

With an increasing number of value-flow properties to check, existing static program analyses still tend to have scalability issues when high precision is required. We observe that the key design flaw behind the scalability problem is that the core static analysis engine is oblivious to the mutual synergies among different properties being checked and, thus, inevitably loses many optimization opportunities. Our approach is inter-property-aware and able to capture possible overlaps and inconsistencies among different properties. Thus, before analyzing a program, we can make optimization plans which decide how to reuse the specific analysis results of a property to speed up checking other properties. Such a synergistic interaction among the properties significantly improves the analysis performance. We have evaluated our approach by checking twenty value-flow properties in standard benchmark programs and ten real-world software systems. The results demonstrate that our approach is more than 8x faster than existing ones but consumes only 1/7 of the memory. Such a substantial improvement in analysis efficiency is not achieved by sacrificing effectiveness: at the time of writing, 39 bugs found by our approach have been fixed by developers and four of them have been assigned CVE IDs due to their security impact.

1. Introduction

Value flows (Shi et al., 2018; Cherem et al., 2007; Sui and Xue, 2016; Livshits and Lam, 2003), which track how values are loaded and stored in the program, underpin the analysis for a broad category of software properties, such as memory safety (e.g., null dereference, double free, etc.), resource usage (e.g., memory leak, file usage, etc.), and security properties (e.g., the use of tainted data). In addition, there are a large and growing number of domain-specific value-flow properties. For instance, mobile software requires that personal information cannot be passed to untrusted code (Arzt et al., 2014), and, in web applications, tainted database queries are not allowed to be executed (Tripp et al., 2013). Fortify (Fortify Static Analyzer: https://www.microfocus.com/en-us/products/static-code-analysis-sast/), a commercial static code analyzer, checks nearly ten thousand value-flow properties from hundreds of unique categories. Value-flow problems exhibit a very high degree of versatility, which poses great challenges to the effectiveness of general-purpose program analysis tools.

Faced with such a massive number of properties and the need for extension, existing approaches (e.g., Fortify, CSA (Clang Static Analyzer: https://clang-analyzer.llvm.org/), and Infer (Infer Static Analyzer: http://fbinfer.com/)) provide a customizable framework together with a set of property interfaces that enable the quick customization for new properties. For instance, CSA uses a symbolic-execution engine such that, at every statement, it invokes the callback functions registered for the properties to check. The callback functions are written by the framework users in order to collect the symbolic-execution results, such as the symbolic memory and the path condition, so that the presence of any property violation at the statement can be judged. Despite the existence of many CSA-like frameworks, when high precision like path-sensitivity is required, existing static analyzers still cannot scale well with respect to a large number of properties to check, which we refer to as the extensional scalability issue. For example, our evaluation shows that CSA cannot path-sensitively check twenty properties for many programs in ten hours, and Pinpoint (Shi et al., 2018) runs out of 256GB of memory when checking only eight properties.

We observe that, behind the extensional scalability issue, the key design flaw in conventional extension mechanisms (like that in CSA) is that the core static analysis engine is oblivious to the properties being checked. Although the property obliviousness gives the maximum flexibility and extensibility to the framework, it also prevents the core engine from utilizing the property-specific analysis results for optimization. This scalability issue is slightly alleviated by a class of approaches that are property-aware and demand-driven (Fan et al., 2019; Ball and Rajamani, 2002; Le and Soffa, 2008). These techniques are scalable with respect to a small number of properties because the core engine can skip certain program statements by understanding what program states are relevant to the properties. However, in these approaches, the semantics of properties are also opaque to each other. As a result, when the number of properties grows very large, the performance of the demand-driven approaches will quickly deteriorate, as in the case of Pinpoint. To the best of our knowledge, the literature specifically addressing the extensional scalability issue is very limited. Readers can refer to Section 7 for a detailed discussion.

In this work, we advocate an inter-property-aware design to relax the property-property and the property-engine obliviousness so that the core static analysis engine can exploit the mutual synergies among different properties for optimization. In our analysis, such exploitation of mutual synergies is enabled by enforcing a simple value-flow-based property model, which specifies the source and sink values of a property as well as the predicate over these values for the satisfaction of the property. For instance, for a null-dereference property, our property model only requires the users of our framework to indicate where a null pointer may be created, where the null dereference may happen, as well as a simple predicate that enforces the propagation of the null pointer. Surprisingly, given a set of properties specified in our property model, our static analyzer can automatically understand the overlaps and inconsistencies of the properties to check. Based on this understanding, before analyzing a program, we can make dedicated analysis plans so that, at runtime, the analyzer can transmit the analysis results on path-reachability and path-feasibility across different properties for optimization. The optimization allows us to significantly reduce redundant graph traversals and unnecessary invocations of SMT solvers, two critical performance bottlenecks of conventional approaches. Section 2 provides examples to illustrate our approach.

We have implemented our approach, named Catapult, which is a new demand-driven and compositional static analyzer with the precision of path-sensitivity. Like a conventional compositional analysis (Xie and Aiken, 2005b), our implementation allows us to concurrently analyze functions that do not have calling relations. In Catapult, we have included all C/C++ value-flow properties that CSA checks by default. In the evaluation, we compare Catapult with three state-of-the-art bug-finding tools, Pinpoint, CSA, and Infer, using a standard benchmark and ten popular industrial-sized software systems. The experimental results demonstrate that Catapult is more than 8x faster than Pinpoint but consumes only 1/7 of the memory. It is as efficient as CSA and Infer in terms of both time and memory cost but is much more precise. Such promising scalability of Catapult is not achieved by sacrificing the capability of bug finding. In our experiments, although the benchmark software systems have been checked by numerous free and commercial tools, Catapult is still able to detect many previously-unknown bugs, of which thirty-nine have been fixed by the developers and four have been assigned CVE IDs due to their security impact.

In summary, our main contributions are listed as follows:

  • An inter-property-aware design for checking value-flow properties, which mitigates the extensional scalability issue.

  • A series of cross-property optimization rules that can be applied in general value-flow analysis frameworks.

  • A detailed implementation and a systematic evaluation that demonstrate the high scalability, precision, and recall of our approach.

2. Overview

The key factor that allows us to conquer the extensional scalability problem is the exploitation of the mutual synergies among different properties. In this section, we first use two simple examples to illustrate the mutual synergies and then provide a running example used in the whole paper.

2.1. Mutual Synergies

We observe that the mutual synergies among different properties originate from their overlaps and inconsistencies.

In Figure 1a, to check memory-leak bugs, we need to track value flows from the newly-created heap pointer to check if the pointer will be freed (in this paper, saying a pointer is "freed" means it is used in a function call free(p)). We will detail how to use the value-flow information to check bugs later. To check free-global-pointer bugs, we track value flows from the global variable to check if it will be freed (freeing a pointer to memory not on the heap, e.g., memory allocated for global variables, is buggy; see https://cwe.mitre.org/data/definitions/590.html). As illustrated, the value-flow paths to search overlap from c = φ(a,b) to *c=1. Being aware of such overlaps, when traversing the graph from a=malloc() for memory-leak bugs, we can record that c = φ(a,b) cannot reach any free operation. Then, when checking free-global-pointer bugs, we can use this recorded information to immediately stop the graph traversal at the vertex c = φ(a,b), thereby saving computation resources.
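As a concrete illustration, the situation in Figure 1a roughly corresponds to a code fragment of the following shape; the fragment is hypothetical and only mirrors the vertices named in the text:

```cpp
#include <cstdlib>

// Hypothetical fragment matching the shape of Figure 1a. The memory-leak check
// tracks the value flow starting at a = malloc(...); the free-global-pointer
// check tracks the flow starting at the global g. The two flows merge at
// c = phi(a, b) and continue to *c = 1, and no free() is reachable from c.
char *g;                             // global pointer: source of free-glob-ptr

void example(int cond) {
  char *a = (char *)std::malloc(8);  // heap pointer: source of mem-leak
  char *b = g;
  char *c = cond ? a : b;            // c = phi(a, b): the overlapping segment starts here
  *c = 1;                            // the overlapping segment ends at this store
}
```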

In Figure 1b, to check memory-leak bugs, we track value flows from the newly-created pointer to where it is freed. To check null-dereference bugs, considering that malloc() may return a null pointer when memory allocation fails, we track value flows from the same pointer to where it is dereferenced. The two properties have an inconsistent constraint: the former requires a ≠ 0 so that a is a valid heap pointer while the latter requires a = 0 so that a is a null pointer. Being aware of the inconsistency, when traversing the graph for checking null dereferences, we can record whether pc (the path condition from a=malloc() to b=a) and pc ∧ a = 0 can be satisfied. If pc can be satisfied but pc ∧ a = 0 cannot, we can confirm that pc ∧ a ≠ 0 must be satisfiable without an expensive constraint-solving procedure, thus speeding up the process of checking memory leaks.

Figure 1. Possible overlaps and inconsistencies among properties. Each edge represents a value flow.

2.2. A Running Example

Figure 2. The workflow of our approach.
Figure 3. An example to illustrate our method.

Let us describe a running example using the value-flow graph in Figure 3, where we check null-dereference and free-global-pointer bugs following the workflow in Figure 2. Given the program, we first follow previous works to build the value-flow graph (Sui et al., 2014; Cherem et al., 2007; Shi et al., 2018). With the graph in hand, we check the two properties with the precision of path-sensitivity. Here, path-sensitivity means that, when searching paths on the value-flow graph, we invoke an SMT solver to solve path conditions and other property-specific constraints, so that infeasible paths are pruned.

The Property Specifications. As is common practice, users of our framework need to provide the property specifications. The users are responsible for the correctness of the specifications.

In this paper, we focus on value-flow properties, which are checked by examining a series of value-flow paths from certain source values to some sink values. As an overview, the specifications of the two properties are described as two quadruples, roughly:

null-deref: ⟨ the return value of malloc(); the dereferenced value at *c=1; the value is null; never ⟩
free-glob-ptr: ⟨ the global pointer empty_str; the freed values at free(b) and free(d); true; never ⟩

As illustrated above, the specification of a value-flow property consists of four parts, which are separated by semicolons. The first and second parts are the source and sink values. The values are specified by pattern expressions, which represent the values at certain statements. The uninterested values are written as "_". In the example, the source values of null-deref and free-glob-ptr are the return pointer of malloc() and the global pointer empty_str, respectively. The sink value of null-deref is the dereferenced value at the statement *c=1. The sink values of free-glob-ptr are the freed values at free(b) and free(d).

The third part is a property-specific constraint, which is the precondition on which the bug can happen. The constraint of null-deref requires the value on a value-flow path to be a null pointer, i.e., the value equals 0. The property free-glob-ptr does not have any specific constraint and, thus, puts true in the quadruple.

The predicate “never” means that value-flow paths between the specified sources and sinks should never be feasible. Otherwise, a bug exists.

The Core Static Analysis Engine. Before the analysis, our core engine automatically makes analysis plans based on the specifications. The analysis plans include the graph traversal plan and the optimization plan. In the example, we make the following optimization plans: (1) checking free-glob-ptr before null-deref; (2) when traversing the graph for checking free-glob-ptr, we record the vertices that cannot reach any sink vertex of null-deref. The graph traversal plan in the example is trivial, which is to traverse the graph from each source vertex of each property.

In Figure 3, when traversing the graph from empty_str to check free-glob-ptr, the core engine will visit all vertices except p (the source vertex of null-deref), looking for free operations. According to the optimization plan, during the graph traversal, the core engine records which of the visited vertices cannot reach any dereference operation.

For null-deref, we traverse the graph from p. When visiting the vertices recorded above, since the previously-recorded information tells us that they cannot reach any sink vertex, we prune the subsequent paths from them and only need to continue the graph traversal along the remaining edges.

It is noteworthy that if we check null-deref before free-glob-ptr, we can only prune one path for free-glob-ptr based on the results of null-deref (see Section 4.2.1). We will further explain the rationale of our analysis plans in the following sections.

3. Value-Flow Properties

This section provides a specification model for value-flow properties with the following two motivations. On the one hand, we observe that many property-specific constraints play a significant role in performance optimization. The specific constraints of a property can not only be used to optimize the checking of the property itself, but can also benefit other properties being checked together. Existing value-flow analyses either ignore property-specific constraints or do not utilize them well, which exacerbates the extensional scalability issue.

On the other hand, despite many studies on value-flow analysis (Livshits and Lam, 2003; Shi et al., 2018; Sui et al., 2014; Sui and Xue, 2016; Cherem et al., 2007), we still lack a general and extensible specification model that can widen the opportunities for sharing analysis results across different properties. Some of the existing studies only focus on checking a specific property (e.g., memory leak (Sui et al., 2014)). Some adopt different specifications to check the same value-flow property (e.g., double free (Shi et al., 2018; Cherem et al., 2007)).

Preliminaries. As in existing works (Li et al., 2011; Sui et al., 2014; Shi et al., 2018), we assume that the code in a program is in static single assignment (SSA) form, where every variable has only one definition (Cytron et al., 1991). Also, we say the value of a variable a flows to a variable b (or b is data-dependent on a) if a is assigned to b directly (via assignments, such as b=a) or indirectly (via pointer dereferences, such as *p=a; q=p; b=*q). Thus, a value-flow graph can be defined as a directed graph where the vertices are values in the program and the edges represent the value-flow relations. A path is called a value-flow path if it is a path on the value-flow graph.
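As a rough illustration of this definition, a value-flow graph could be represented by a data structure of the following shape; the field names are illustrative and not the representation used by Pinpoint or Catapult:

```cpp
#include <string>
#include <vector>

struct VFGNode;  // a vertex: a value defined at some statement

// A directed edge: the source value flows to the target value, guarded by the
// condition under which the flow happens (later encoded as an SMT formula).
struct VFGEdge {
  VFGNode *target;
  std::string condition;
};

struct VFGNode {
  std::string value;          // e.g., "b" defined by "b = a"
  std::string definingStmt;   // the statement that defines the value
  std::vector<VFGEdge> out;   // outgoing value-flow edges
};
```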

Property Specification. As defined below, we model a value-flow property as an aggregation of value-flow paths.

Definition 3.1 (Value-Flow Property).

A value-flow property, P, is a quadruple P = ⟨src, sink, psc, agg⟩, where

  • src and sink are two pattern expressions (Table 1) that specify the source and sink values of the value-flow paths to track.

  • psc is the property-specific constraint that every value on the value-flow path needs to satisfy.

  • agg is an extensible predicate that determines how to aggregate value-flow paths to check the specified property.

A pattern expression selects values at certain kinds of statements: calls (where the callee is named by a character string sig), loads, stores, assignments, and globals (glob). A pattern list enumerates the values at such a statement, a symbol names a value of interest, and "_" stands for an uninterested value.
Examples: the return values of any statement calling malloc; the 2nd argument of any statement calling send; the dereferenced values at every load statement.
Table 1. Pattern expressions used in the specification

In practice, we can use the quadruple to specify a wide range of value-flow properties. As discussed below, we put the properties into three categories, which are checked by aggregating a single, two, or more value-flow paths, respectively.

Null-Dereference-Like Bugs. Many program properties can be checked using a single value-flow path, such as null-deref and free-glob-ptr defined in Section 2.2, as well as a broad range of taint issues that propagate a tainted object to a program point consuming the object (Denning, 1976).

Double-Free-Like Bugs. A wide range of bugs happen in a program execution because two program statements (e.g., two statements calling the function free) consecutively operate on the same value (e.g., a heap pointer). Typical examples include use-after-free, which is a general form of double free, as well as bugs that operate on expired resources, such as using a closed file descriptor. As an example, double-free can be specified roughly as the quadruple ⟨ the pointer returned by malloc(); the freed value at free(); the pointer is not null; never-sim ⟩.

In the specification, the property-specific constraint requires the initial value (or equivalently, all values) on the value-flow path to be a valid heap pointer, i.e., not null. This is because a null return value means malloc() fails to allocate memory and returns a null pointer; in this case, the free operation is harmless. The aggregate predicate "never-sim" means that the value-flow paths from the same pointer should never occur simultaneously. In other words, there is no control-flow path that goes through two different free operations on the same heap pointer. Otherwise, a double-free bug exists.

In Figure 3, for the two value-flow paths from the malloc'd pointer to the two free operations, we can check the satisfiability of the conjunction of their path conditions, i.e., pc1 ∧ pc2, to detect double-free bugs. Here, pc1 and pc2 are the path conditions of the two paths, respectively.

Memory-Leak-Like Bugs. Many bugs happen because a value (e.g., a heap pointer) must be properly handled (e.g., freed by calling the function free) in any program execution but, unfortunately, is not. Typical examples include all kinds of resource leaks, such as file descriptor leaks, internet socket leaks, etc. As an example, the specification for checking memory leaks is roughly the quadruple ⟨ the pointer returned by malloc(); the freed value at free(); the pointer is not null; must ⟩.

Compared to double-free, the only difference is the aggregate predicate. The aggregate predicate “must” means that the value-flow path from a heap pointer must be able to reach a free operation. Otherwise, a memory leak exists in the program.

In Figure 3, for the two value-flow paths from the malloc'd pointer to the two free operations, we can check the disjunction of their path conditions against the condition on which the heap pointer is created, i.e., check whether pc0 ∧ ¬(pc1 ∨ pc2) is satisfiable, to determine if a memory leak exists. Here, pc1 and pc2 are the path conditions of the two paths, respectively. The additional pc0 is the condition on which the heap pointer is created.
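Summing up this section, the quadruple of Definition 3.1 could be represented in code roughly as follows; the types, the string-based patterns, and the example instantiation of double-free are illustrative sketches rather than our framework's actual interfaces:

```cpp
#include <string>

// The three aggregate predicates named in this section.
enum class Agg { Never, NeverSim, Must };

struct ValueFlowProperty {
  std::string src;    // pattern expression selecting source values (Table 1)
  std::string sink;   // pattern expression selecting sink values
  std::string psc;    // property-specific constraint on values along the path
  Agg agg;            // how value-flow paths are aggregated to decide a violation
};

// A rough rendering of the double-free specification described above.
ValueFlowProperty doubleFree{"v = malloc(_)", "free(v)", "v != 0", Agg::NeverSim};
```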

4. Inter-property-aware analysis

Given multiple value-flow properties specified as quadruples ⟨src, sink, psc, agg⟩, our inter-property-aware static analyzer then starts to check them by searching value-flow paths and finally checking bugs based on the agg predicate. Since the path-aggregation step is easy to run in parallel by independently checking all possible path groups, it is not the performance bottleneck. In this paper, we concentrate on how to exploit mutual synergies among different properties to improve the efficiency of searching value-flow paths.

4.1. A Naïve Static Analyzer

Input: the value-flow graph of a program to check
Input: a set of value-flow properties to check
Output: paths between sources and sinks for each property
foreach property in the input property set do
       foreach source src in its source set do
             while visiting a vertex v in the depth-first search from src do
                   if psc cannot be satisfied with regard to the current path condition then
                         stop the search from v;
                   end if
             end while
       end foreach
end foreach
Algorithm 1. The naïve static analyzer.

For multiple value-flow properties, a naïve static analyzer checks them independently in a demand-driven manner. As illustrated in Algorithm 1, for each value-flow property, the static analyzer traverses the value-flow graph from each of the source vertices. At each step of the graph traversal, we check if psc can be satisfied with regard to the current path condition. If not, we can stop the graph traversal along the current path to save computing resources. This path-pruning process is illustrated in the shaded part of Algorithm 1 and is a critical factor in improving the analysis performance.
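For concreteness, the following self-contained C++ sketch mirrors Algorithm 1; the graph types and the satisfiability callback are placeholders rather than Pinpoint's real API:

```cpp
#include <functional>
#include <set>
#include <string>
#include <vector>

struct Node;
struct Edge { Node *to; std::string cond; };                    // edge condition on the value-flow graph
struct Node { std::vector<Edge> out; std::set<int> sinkFor; };  // properties this vertex is a sink of

// SMT query deciding whether psc can hold under the current path condition.
using SatCheck = std::function<bool(const std::vector<std::string> &pc, const std::string &psc)>;

// One DFS per (property, source) pair, as in Algorithm 1.
void dfs(Node *v, int prop, const std::string &psc, std::vector<std::string> &pc,
         const SatCheck &sat, std::vector<std::vector<std::string>> &paths) {
  if (!sat(pc, psc)) return;                         // the shaded pruning step of Algorithm 1
  if (v->sinkFor.count(prop)) paths.push_back(pc);   // a source-to-sink value-flow path is found
  for (const Edge &e : v->out) {
    pc.push_back(e.cond);                            // extend the current path condition
    dfs(e.to, prop, psc, pc, sat, paths);
    pc.pop_back();
  }
}
```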

We observe that the properties to check usually have overlaps and inconsistencies and, thus, do not need to be checked independently as in the naïve approach. Instead, we can exploit the overlaps and inconsistencies to facilitate the path-pruning process in Algorithm 1, thus improving the analysis efficiency. In what follows, we detail how the mutual synergies are utilized.

Optimization Plans (for a pair of properties a and b, a ≠ b)
ID | Rule Name | Precondition | Plan | Benefit
1 | property ordering | (none) | check b before a, where b has more sink vertices | more chances to prune paths
2 | result recording | check b before a | record vertices that cannot reach the sinks of a | prune paths at a vertex
3 | result recording | check b before a, same property-specific constraint | record unsat cores (as edge sets) that conflict with the constraint | prune paths going through the recorded edges
4 | result recording | check b before a | record interpolants that conflict with the property-specific constraint, together with the corresponding edge sets | prune paths going through a recorded set of edges
Graph Traversal Plans (for a pair of properties a and b, a ≠ b)
ID | Rule Name | Precondition | Plan | Benefit
5 | traversal merging | shared source vertices | search from the shared sources for both properties | sharing path conditions
6 | psc-check ordering | psc_a implies psc_b | check psc_a first | if satisfiable, so is psc_b
7 | psc-check ordering | psc_a ∧ psc_b satisfiable | check psc_a ∧ psc_b first | if satisfiable, both psc_a and psc_b can be satisfied
8 | psc-check ordering | psc_a ∧ psc_b unsatisfiable | check either one, e.g., psc_a, first | if unsatisfiable (and the path is feasible), psc_b can be satisfied
Table 2. Rules of Making Analysis Plans for a Pair of Properties

4.2. Optimized Intra-procedural Analysis

Based on the input property specifications, the core static analysis engine makes two plans for traversing the value-flow graph. The first is the optimization plan, which aims to prune more paths than the naïve approach. The second is the graph traversal plan, which concerns how to share paths among properties rather than prune paths. As a whole, all the plans are summarized in Table 2. Each row of the table is a rule describing what plan we can make on certain preconditions and what benefits we can obtain from the plan. To be clear, in this section, we detail the plans in the context of scanning a single-procedure program. In the next subsection, we introduce the inter-procedural analysis.

4.2.1. Optimization Plan

Based on the property specifications, we adopt several strategies to facilitate the path pruning (Rules 1 – 4 in Table 2).

Ordering the Properties (Rule 1). Given a set of properties with different source values, we need to determine the order in which they are checked. Generally, there is no perfect order that can guarantee the best optimization results. However, we observe that the checking order can significantly affect how many paths we can prune in practice.

Let us consider the example in Figure 3 again. In Section 2.2, we have explained that if free-glob-ptr is checked before null-deref, we can prune two paths when checking null-deref. However, if we change the checking order, i.e., check null-deref before free-glob-ptr, we can only prune one path. In detail, when checking null-deref, the core engine records only one vertex that cannot reach any sink of free-glob-ptr. In this case, we can prune only the path from that vertex when checking free-glob-ptr.

Intuitively, what makes the number of pruned paths different is that there are more free operations than dereference operations in the value-flow graph. That is, the more sink vertices a property has in the value-flow graph, the fewer paths we can prune for that property. Inspired by this intuition and the example, the order of property checking is arranged according to the number of sink vertices: the more sink vertices a property has in the value-flow graph, the earlier we check it.

Recording Sink-Reachability (Rule 2). Given two properties a and b, where b is checked before a, the basic idea is that, when checking b by traversing the value-flow graph, the core engine records whether each visited vertex may reach a sink vertex of a. With the recorded information, when checking a and visiting a vertex that cannot reach any of its sinks, the path from that vertex can be pruned. Section 2.2 illustrates the method.
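A minimal sketch of how such sink-reachability facts could be recorded during the traversal for b is shown below; the types, the fixed number of properties, and the post-order formulation are assumptions made for illustration:

```cpp
#include <bitset>
#include <cstddef>
#include <map>
#include <vector>

constexpr std::size_t kNumProps = 3;
struct Node { std::vector<Node *> succ; std::bitset<kNumProps> isSinkOf; };

// A vertex can reach a sink of property i iff it is such a sink itself or one
// of its successors can. The recorded bits are later queried when checking a:
// if the bit for a is false at a vertex, the DFS stops there (Rule 2).
std::bitset<kNumProps> recordSinkReachability(Node *v,
                                              std::map<Node *, std::bitset<kNumProps>> &canReach) {
  auto it = canReach.find(v);
  if (it != canReach.end()) return it->second;
  std::bitset<kNumProps> r = v->isSinkOf;
  canReach[v] = r;                  // pre-insert; cycles are unrolled in the real value-flow graph
  for (Node *s : v->succ) r |= recordSinkReachability(s, canReach);
  return canReach[v] = r;
}
```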

Recording the psc-Check Results (Rules 3 & 4). Given two properties a and b, where b is checked before a, the basic idea is that, when checking b by traversing the value-flow graph, the core engine records which path segments (i.e., sets of edges) conflict with the property-specific constraint being checked. With the recorded information, when checking a and visiting a path segment that conflicts with its specific constraint, the paths containing this segment can be pruned.

Let us consider the running example in Figure 3 again. When traversing the graph from empty_str to check free-glob-ptr, the core engine can record that a certain edge, whose condition implies the tracked pointer is not null, conflicts with the property-specific constraint of null-deref (i.e., the pointer being null). With this information, when checking null-deref by traversing the graph from p, we can also prune the paths going through that edge.

In practice, although the property-specific constraints are usually simple, the path constraints are usually very sophisticated. Fortunately, thanks to the advances in the area of clause learning (Beame et al., 2003), we are able to efficiently compute some reusable facts when using SMT solvers to check path conditions and property-specific constraints. Specifically, we compute two kinds of reusable facts when a property-specific constraint conflicts with the current path condition pc.

When pc ∧ psc is unsatisfiable, we can record the unsatisfiable core (Dershowitz et al., 2006), which is a set of Boolean predicates {q1, q2, ...} from pc such that q1 ∧ q2 ∧ ... ∧ psc is still unsatisfiable. Since pc is the conjunction of the edge constraints on the value-flow path, each qi corresponds to the condition of an edge on the value-flow graph. Thus, we can record an edge set E, which conflicts with psc. When checking another property with the same property-specific constraint, if a value-flow path goes through these recorded edges, we can prune the remaining paths.

In addition to the unsatisfiable cores, we can also record interpolation constraints (Cimatti et al., 2010), which are reusable even for properties with a different property-specific constraint. During the constraint solving, an SMT solver can refute the satisfiability of pc ∧ psc by finding an interpolant I such that the conflicting predicates from pc imply I but I ∧ psc is unsatisfiable. The interpolant provides a more detailed explanation of why the recorded edge set conflicts with psc and, in addition, indicates that the edge set also conflicts with any other constraint that is inconsistent with I. Thus, given a property whose specific constraint conflicts with the interpolation constraint, it is sufficient to conclude that any value-flow path passing through the recorded edge set can be pruned.
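As an illustration of the unsat-core part (Rule 3), the following sketch uses Z3's C++ API to attach one Boolean selector per value-flow edge and read back which edge conditions conflict with a property-specific constraint; the concrete conditions are made up for the example:

```cpp
#include <z3++.h>
#include <iostream>

int main() {
  z3::context ctx;
  z3::solver solver(ctx);

  z3::expr v = ctx.bv_const("v", 32);     // the tracked value (a 32-bit pointer/integer)
  z3::expr e1 = ctx.bool_const("edge1");  // selector literal for edge 1
  z3::expr e2 = ctx.bool_const("edge2");  // selector literal for edge 2

  // Hypothetical edge conditions: taking edge1 implies v != 0, edge2 implies v > 8 (unsigned).
  solver.add(z3::implies(e1, v != ctx.bv_val(0, 32)));
  solver.add(z3::implies(e2, z3::ugt(v, ctx.bv_val(8, 32))));

  // Property-specific constraint of null-deref: the tracked value is null.
  solver.add(v == ctx.bv_val(0, 32));

  // Check under the edge selectors as assumptions; the unsat core names the
  // edges whose conditions conflict with the constraint and can be recorded.
  z3::expr_vector assumptions(ctx);
  assumptions.push_back(e1);
  assumptions.push_back(e2);
  if (solver.check(assumptions) == z3::unsat) {
    z3::expr_vector core = solver.unsat_core();
    for (unsigned i = 0; i < core.size(); ++i)
      std::cout << "conflicting edge: " << core[i] << "\n";
  }
  return 0;
}
```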

4.2.2. Graph Traversal Plan

Different from the optimization plan that aims to prune paths, the graph traversal plan is to provide strategies to share paths among different properties.

Merging the Graph Traversal (Rule 5). We observe that many properties actually share all or part of their source vertices and even the same sink vertices. If the core engine checks each property one by one, it will inevitably repeat traversing the graph from the same source vertex for different properties. To avoid such repetitive graph traversal from the same source, we propose the graph traversal plan to merge the path-searching processes for different properties.

Figure 4. Merging the graph traversal.

As an example, in Figure 3, since the pointer p returned by malloc() may be a valid heap pointer or null, checking both null-deref and mem-leak needs to traverse the graph from p. Figure 4 illustrates how the merged traversal is performed. That is, we maintain a property set during the graph traversal to record what properties the current path contributes to. Whenever visiting a vertex, we check if a property needs to be removed from the property set. For instance, at a vertex, we may be able to remove null-deref from the property set if we can determine that the vertex cannot reach any dereference operation. When the property set becomes empty at a vertex, the graph traversal stops immediately.
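The merged traversal can be sketched as follows, where a bit set tracks the properties the current path still matters for; the types and the filtering callback (which implements, e.g., the Rule-2 sink-reachability check or a failed psc check) are illustrative assumptions:

```cpp
#include <bitset>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

constexpr std::size_t kNumProps = 3;
struct Node;
struct Edge { Node *to; std::string cond; };
struct Node { std::vector<Edge> out; };

// Decides, at a vertex, which of the currently active properties remain relevant.
using Filter = std::function<std::bitset<kNumProps>(Node *, std::bitset<kNumProps>)>;

void mergedDfs(Node *v, std::bitset<kNumProps> active, std::vector<std::string> &pc,
               const Filter &stillRelevant) {
  active = stillRelevant(v, active);   // e.g., drop null-deref if v cannot reach a dereference
  if (active.none()) return;           // no property cares about this path any more: stop
  for (const Edge &e : v->out) {
    pc.push_back(e.cond);              // the path condition is shared by all active properties
    mergedDfs(e.to, active, pc, stillRelevant);
    pc.pop_back();
  }
}
```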

Ordering the psc-Checks (Rules 6 – 8). Since the graph traversals are merged for different properties, at a vertex in Figure 4 we have to check multiple property-specific constraints, e.g., the one for mem-leak and the one for null-deref. Thus, a problem we need to address is to determine the order in which the property-specific constraints are checked. Since checking such constraints often needs expensive SMT solving procedures, the order of such constraint solving affects the analysis performance.

Given two property-specific constraints psc_a and psc_b as well as the current path condition pc, we consider three cases, i.e., psc_a implies psc_b, psc_a ∧ psc_b is satisfiable, and psc_a ∧ psc_b is unsatisfiable, as listed in Table 2. Since property-specific constraints are usually simple, the above relations between psc_a and psc_b are easy to compute.

First, if psc_a implies psc_b, any solution of pc ∧ psc_a also satisfies pc ∧ psc_b. Thus, we check pc ∧ psc_a first. If it is satisfiable, we can confirm that pc ∧ psc_b must be satisfiable without an expensive SMT solving procedure.

Second, if psc_a ∧ psc_b is satisfiable, there may exist a solution satisfying both pc ∧ psc_a and pc ∧ psc_b. In this case, we check pc ∧ psc_a ∧ psc_b first. If it is satisfiable, we can confirm that both pc ∧ psc_a and pc ∧ psc_b can be satisfied without additional SMT solving procedures. In our experience, this strategy saves a lot of resources.

Third, if psc_a ∧ psc_b is unsatisfiable, there does not exist any solution that satisfies both psc_a and psc_b. In this case, we check either one, e.g., pc ∧ psc_a, first. If the current path is feasible (pc is satisfiable) but pc ∧ psc_a is not satisfiable, we can confirm that pc ∧ psc_b can be satisfied without invoking SMT solvers again. This case was illustrated in Figure 1b.
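The three cases can be folded into a small checking routine, sketched below with Z3; note that the last branch mirrors Rule 8 as stated above, which presumes the two constraints are complementary on the path (as in Figure 1b):

```cpp
#include <z3++.h>

// Relation between two property-specific constraints, assumed to be computed
// once per property pair since psc formulas are simple.
enum class Relation { Implies, Overlaps, Conflicts };

struct PscResult { bool aSat; bool bSat; };

PscResult checkPscs(z3::solver &s, const z3::expr &pc,
                    const z3::expr &pscA, const z3::expr &pscB, Relation rel) {
  auto sat = [&](const z3::expr &e) {
    s.push(); s.add(e); bool r = (s.check() == z3::sat); s.pop(); return r;
  };
  switch (rel) {
  case Relation::Implies:    // Rule 6: pscA => pscB, so one query may answer both
    if (sat(pc && pscA)) return {true, true};
    return {false, sat(pc && pscB)};
  case Relation::Overlaps:   // Rule 7: check the conjunction first
    if (sat(pc && pscA && pscB)) return {true, true};
    return {sat(pc && pscA), sat(pc && pscB)};
  case Relation::Conflicts:  // Rule 8: a feasible path failing pscA satisfies pscB
    if (sat(pc && pscA)) return {true, sat(pc && pscB)};
    return {false, sat(pc)}; // assumes pscA and pscB are complementary on this path
  }
  return {false, false};
}
```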

4.3. Modular Inter-procedural Analysis

Scalable program analyses work by exploiting the modular structure of programs. Almost every inter-procedural analysis builds summaries for functions and reuses the function summary at its calling contexts, in order to scale to large programs (Cousot and Cousot, 2002; Xie and Aiken, 2005b). In Catapult, we can seamlessly extend our optimized intra-procedural analysis to modular inter-procedural analysis by exploring the local value-flow graph of each function and then stitching the local paths together to generate complete value-flow paths. In the following, we explain our design of the function summaries.

In our analysis, for each function, we build three kinds of value-flow paths as the function summaries. They are defined below and, in Appendices A and B, we formally prove that generating these function summaries is sufficient. Intuitively, these summaries describe how function boundaries (i.e., formal parameters and return values) partition a complete value-flow path. Using the property double-free as an example, a complete value-flow path from the pointer created in xmalloc() to free(b) in Figure 5 is partitioned into a sub-path from that pointer to ret p by the boundary of xmalloc(). This sub-path is an output summary of xmalloc() as defined below.

Definition 4.1 (Transfer Summary).

Given a function f, a transfer summary of f is a value-flow path from one of its formal parameters to one of its return values.

Definition 4.2 (Input Summary).

Given a function f, an input summary of f is a value-flow path from one of its formal parameters to a sink value in f or in the callees of f.

Definition 4.3 (Output Summary).

Given a function f, an output summary of f is a value-flow path from a source value to one of f’s return values. The source value is in f or in the callees of f.

After generating the function summaries, to avoid separately storing them for different properties, each function summary is labeled with a bit vector to record what properties it is built for. Assume that we need to check null-deref, double-free, and mem-leak in Figure 5. The three properties are assigned the three bit vectors (1,0,0), (0,1,0), and (0,0,1) as their identities, respectively. As explained before, all three properties regard the pointer created by malloc() in xmalloc() as the source vertex. The sink vertices for checking double-free and mem-leak are free(b) and free(u). There are no sink vertices for null-deref. According to Definitions 4.1–4.3, we generate the following function summaries:

Function | Summary Path | Label | Type
xmalloc | (p, ret p), where p is the pointer created by malloc() | (1,1,1) | output
xfree | (u, ret u) | (1,1,1) | transfer
xfree | (u, free(u)) | (0,1,1) | input

The summary (p, ret p) is labeled with (1,1,1) because all three properties regard p as the source. The summary (u, ret u) is also labeled with (1,1,1) because the path does not contain any property-specific vertices and, thus, may be used for all three properties. The summary (u, free(u)) is only labeled with (0,1,1) because we do not regard free() as a sink for null-deref.

Figure 5. An example to show the inter-procedural analysis.

When analyzing the main function, we concatenate its intra-procedural paths with summaries from the callee functions so as to generate complete paths. For example, concatenating the output summary of xmalloc() with the input summary of xfree() through the paths in the main function yields a complete path whose label is the bitwise conjunction of the labels of its pieces, i.e., (0,1,1), meaning that the resulting path only works for double-free and mem-leak.
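A minimal sketch of labeled summaries and their concatenation is shown below; the fixed property order in the bit vector and the field names are illustrative assumptions:

```cpp
#include <bitset>
#include <cstddef>
#include <string>

// Property order assumed for the bits: null-deref, double-free, mem-leak.
constexpr std::size_t kNumProps = 3;
using Label = std::bitset<kNumProps>;   // bit i set => the summary is usable for property i

enum class SummaryKind { Transfer, Input, Output };

struct Summary {
  std::string from, to;   // end points of the value-flow sub-path, e.g., "u" and "free(u)"
  SummaryKind kind;
  Label label;
};

// A concatenated path is usable for a property only if every piece is, so the
// labels are combined by bitwise conjunction, e.g., 111 & 011 = 011.
Label concat(const Summary &a, const Summary &b) {
  return a.label & b.label;
}
```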

We observe that using value-flow paths as function summaries has a significant advantage for checking multiple properties. That is, since value flow is a fundamental program relation, it can be reused across different properties. This is different from existing approaches that utilize state machines to model properties and generate state-specific function summaries (Fan et al., 2019; Das et al., 2002). Since different properties usually have different states, compared to our value-flow-based function summaries, such state-specific function summaries have fewer opportunities to be reused across properties.

5. Implementation

ID Property Name Brief Description
1 core.CallAndMessage Check for uninitialized arguments and null function pointers
2 core.DivideByZero Check for division by zero
3 core.NonNullParamChecker Check for null passed to function parameters marked with nonnull
4 core.NullDereference Check for null pointer dereference
5 core.StackAddressEscape Check that addresses of stack memory do not escape the function
6 core.UndefinedBinaryOperatorResult Check for the undefined results of binary operations
7 core.VLASize (Variable-Length Array) Check for declaration of VLA of undefined or zero size
8 core.uninitialized.ArraySubscript Check for uninitialized values used as array subscripts
9 core.uninitialized.Assign Check for assigning uninitialized values
10 core.uninitialized.Branch Check for uninitialized values used as branch conditions
11 core.uninitialized.CapturedBlockVariable Check for blocks that capture uninitialized values
12 core.uninitialized.UndefReturn Check for uninitialized values being returned to callers
13 cplusplus.NewDelete Check for C++ use-after-free
14 cplusplus.NewDeleteLeaks Check for C++ memory leaks
15 unix.Malloc Check for C memory leaks, double-free, and use-after-free
16 unix.MismatchedDeallocator Check for mismatched deallocators, e.g., new and free()
17 unix.cstring.NullArg Check for null pointers being passed to C string functions like strlen
18 alpha.core.CallAndMessageUnInitRefArg Check for uninitialized function arguments
19 alpha.unix.SimpleStream Check for misuses of C stream APIs, e.g., an opened file is not closed
20 alpha.unix.Stream Check stream handling functions, e.g., using a null file handle in fseek
Table 3. Properties to Check in Catapult

In this section, we present the implementation details as well as the properties to check in our framework.

Path-sensitivity. We have implemented our approach as a prototype tool called Catapult on top of Pinpoint (Shi et al., 2018). Given the source code of a program, we first compile it to LLVM bitcode (LLVM: https://llvm.org/), on which our analysis is performed. To achieve path-sensitivity, we build a path-sensitive value-flow graph and compute path conditions following the same method as Pinpoint. The path conditions in our analysis are first-order logic formulas over bit vectors. A program variable is modeled as a bit vector, of which the length is the bit width (e.g., 32) of the variable's type (e.g., int). The path conditions are solved by Z3 (De Moura and Bjørner, 2008), a state-of-the-art SMT solver, to determine path feasibility.
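For illustration, the following sketch models a 32-bit integer as a bit vector and checks the feasibility of a simple path condition with Z3; the concrete condition is made up:

```cpp
#include <z3++.h>
#include <iostream>

int main() {
  z3::context ctx;
  z3::expr x = ctx.bv_const("x", 32);   // a 32-bit int variable modeled as a bit vector
  // Path condition for taking both branches of: if (x > 0) { if (x < 10) { ... } }
  z3::expr pc = (x > ctx.bv_val(0, 32)) && (x < ctx.bv_val(10, 32));  // signed comparisons
  z3::solver s(ctx);
  s.add(pc);
  std::cout << (s.check() == z3::sat ? "path feasible" : "path infeasible") << "\n";
  return 0;
}
```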

Properties to check. Catapult currently supports checking twenty C/C++ properties defined in CSA, which are briefly introduced in Table 3 (more details of the properties can be found at https://clang-analyzer.llvm.org/). The twenty properties include all of CSA's default C/C++ value-flow properties. The other default C/C++ properties in CSA but not in Catapult are simple ones that do not require a path-sensitive analysis. For example, the property security.insecureAPI.bcopy requires CSA to report a warning whenever a program statement calling the function bcopy() is found.

Parallelization. Our analysis is performed in a bottom-up manner, in which a callee function is always analyzed before its callers. Bottom-up compositional analysis is easy to run in parallel (Xie and Aiken, 2005b). Our special design for checking multiple properties does not prevent our analysis from parallelization. As is common practice, in Catapult, functions that do not have calling relations are analyzed in parallel.

Soundness. We implement Catapult in a soundy manner (Livshits et al., 2015). This means that the implementation soundly handles most language features and, meanwhile, includes some well-known unsound design decisions as in previous works (Xie and Aiken, 2005b; Cherem et al., 2007; Babic and Hu, 2008; Sui et al., 2014; Shi et al., 2018). For example, in our implementation, virtual functions are resolved by classic class hierarchy analysis (Dean et al., 1995). However, we do not handle C-style function pointers, inline assembly, and library functions. We also follow the common practice of assuming that distinct function parameters do not alias each other (Livshits and Lam, 2003) and of unrolling each cycle twice on the call graph and the control flow graph. These unsound choices significantly improve the scalability but have limited negative impacts on the bug-finding capability.

6. Evaluation

This section presents the systematic evaluation that demonstrates the high scalability, precision, and recall of our approach.

ID Program Size (KLoC) ID Program Size (KLoC)
1 mcf 2 13 shadowsocks 32
2 bzip2 3 14 webassembly 75
3 gzip 6 15 transmission 88
4 parser 8 16 redis 101
5 vpr 11 17 imagemagick 358
6 crafty 13 18 python 434
7 twolf 18 19 glusterfs 481
8 eon 22 20 icu 537
9 gap 36 21 openssl 791
10 vortex 49 22 mysql 2,030
11 perlbmk 73
12 gcc 135 Total 5,303
Table 4. Subjects for Evaluation

6.1. Experimental Setup

To demonstrate the scalability of our approach, we compared the time and memory cost of Catapult with a series of existing industrial-strength static analyzers. We also investigated their capability of finding real bugs, which confirms that our promising scalability is not achieved by sacrificing the bug-finding capability.

Baseline approaches. First of all, we compared Catapult with Pinpoint (Shi et al., 2018), an open-source version of the most recent static analyzer of the same type. Both techniques are demand-driven, compositional, and sparse static analyses with the precision of path-sensitivity. The difference is that Catapult exploits mutual synergies among different properties to speed up the analysis while Pinpoint does not. In addition, we also conducted comparison experiments on the tools using abductive inference (Infer) and symbolic execution (CSA), both of which are open source and widely used in industry. This comparison aims to show that Catapult is competitive, as it consumes similar time and memory cost to CSA and Infer, but is much more precise. In the experiments, all tools were run with fifteen threads to take advantage of parallelization.

We also tried to compare with other static bug detection tools such as Saturn (Xie and Aiken, 2005b), Calysto (Babic and Hu, 2008), Semmle (Avgustinov et al., 2016), Fortify, and Klocwork (https://www.roguewave.com/products-services/klocwork/). However, they are either unavailable or not runnable in the experimental environment we were able to set up. The open-source static analyzer FindBugs (http://findbugs.sourceforge.net/) is not included in our experiments because it only works for Java while we focus on the analysis of C/C++ programs. We do not compare with Tricorder (Sadowski et al., 2015), the static analysis platform from Google, because the only C/C++ analyzer in it is CSA, which has been included in our experiments.

Subjects for evaluation. To avoid possible biases on the benchmark programs, we include the standard and widely-used benchmark SPEC CINT 2000 (https://www.spec.org/cpu2000/) (ID = 1–12 in Table 4) in our evaluation. At the same time, in order to demonstrate the efficiency and effectiveness of Catapult on real-world projects, we also include ten industrial-sized open-source C/C++ projects (ID = 13–22 in Table 4), of which the size ranges from tens of thousands to about two million lines of code.

Environment. All experiments were performed on a server with two Intel Xeon E5-2698 v4 CPUs @ 2.20GHz (each with 20 cores) and 256GB RAM, running Ubuntu 16.04.

Figure 6. (a) Comparing time and memory cost with Pinpoint. (b) The growth curves of the time and the memory overhead when comparing to Pinpoint. (c) Comparing time and memory cost with CSA and Infer.
Program   Catapult   Pinpoint
  # Rep # FP   # Rep # FP
shadowsocks   9 0   9 0
webassembly   10 2   10 2
transmission   24 2   24 2
redis   39 5   39 5
imagemagick   26 8   - -
python   48 7   48 7
glusterfs   59 22   59 22
icu   161 31   - -
openssl   48 15   - -
mysql   245 88   - -
% FP   26.9%   20.1%
 
Program   Catapult   CSA (Z3)   CSA (Default)   Infer
  # Rep # FP   # Rep # FP   # Rep # FP   # Rep # FP
shadowsocks   8 2   24 22   25 23   15 13
webassembly   4 0   1 0   6 2   12 12
transmission   31 10   17 12   26 21   167* 82
redis   19 6   15 7   32 20   16 7
imagemagick   24 7   34 21   78 61   34 18
python   37 7   62 40   149* 77   82 63
glusterfs   28 5   0 0   268* 82   - -
icu   55 11   94 67   206* 69   248* 71
openssl   39 19   44 26   44 26   211* 85
mysql   59 20   271* 59   1001* 79   258* 80
% FP   28.6%   64.9%   75.7%   78.6%
 * We inspected one hundred randomly-sampled bug reports.
We failed to run Infer on glusterfs.
Table 5. Effectiveness (Catapult vs. Pinpoint, CSA, and Infer)

6.2. Comparing with Static Analyzer of the Same Type

We first compared Catapult with Pinpoint, the most recent static analyzer of the same type. To demonstrate the power of the graph traversal plan and the optimization plan separately, we also evaluated a configuration of our approach with the optimization plan disabled.

In this experiment, we performed whole-program analysis. That is, we linked all compilation units in a project into a single file so that the static analyzers could perform cross-file analysis. Before the analysis, both Pinpoint and Catapult need to build the value-flow graph as the program intermediate representation. Since Catapult is built on top of Pinpoint, the pre-processing time and the size of the value-flow graph are the same for both tools, which are almost linear in the size of a program (Shi et al., 2018). Typically, for MySQL, a program with about two million lines of code, it takes twenty minutes to build a value-flow graph with seventy million nodes and ninety million edges. We omit the details of these data because they are not the contribution of this paper.

Efficiency. The time and memory cost of checking each benchmark program is shown in Figure 6a. Owing to the inter-property-awareness, Catapult is about 8x faster than Pinpoint and takes only 1/7 of the memory on average. Typically, Catapult can finish checking MySQL in 5 hours, which is aligned with the industrial requirement of finishing an analysis in 5 to 10 hours (Bessey et al., 2010; McPeak et al., 2013).

When the optimization plan is disabled, Catapult is about 3.5x faster than Pinpoint and takes 1/5 of the memory on average. Compared to the result of the full configuration, this implies that the graph traversal plan and the optimization plan contribute 40% and 60% of the time cost reduction, respectively. Meanwhile, they contribute 70% and 30% of the memory cost reduction, respectively. In summary, the two plans contribute similarly to the time cost reduction, and the graph traversal plan is more important for the memory cost reduction because it allows us to extensively share analysis results across different properties and avoid duplicate data storage.

Using the largest subject, MySQL, as an example, Figure 6b illustrates the growth curves of the time and the memory overhead when the properties in Table 3 are added into the core engine one by one. As illustrated, in terms of both time and memory overhead, Catapult grows much slower than Pinpoint and, thus, scales up quite gracefully.

It is noteworthy that, except for the feature of inter-property-awareness, Catapult follows the same method as Pinpoint to build the value-flow graph and perform path-sensitive analysis. Thus, they have similar performance when checking a single property. Catapult performs better than Pinpoint only when multiple properties are checked together.

Effectiveness. Since both Catapult and Pinpoint are inter-procedurally path-sensitive, as shown in Table 5-Left, they produce a similar number of bug reports (# Rep) and false positives (# FP) for all the real-world programs except for the programs that Pinpoint fails to analyze due to the out-of-memory exception.

6.3. Comparing with Other Static Analyzers

To better understand the performance of Catapult in comparison to other types of property-unaware static analyzers, we also ran Catapult against two prominent and mature static analyzers, CSA (based on symbolic execution) and Infer (based on abductive inference). Note that Infer does not classify the properties to check as in Table 3 but targets a similar range of properties, such as null dereference, memory leak, etc.

In the evaluation, CSA was run with two different configurations: one is its default configuration, where a fast but imprecise range-based solver is employed to solve path constraints, and the other uses Z3 (De Moura and Bjørner, 2008), a full-featured SMT solver, to solve path constraints. To ease the explanation, we denote CSA in the two configurations as CSA (Default) and CSA (Z3), respectively. Since CSA separately analyzes each source file and Infer only has a limited capability of detecting cross-file bugs, for a fair comparison, all tools in the experiments were configured to check source files separately, and the time limit for analyzing each file was set to 60 minutes. Since a single source file is usually small, we did not encounter memory issues in the experiment but missed a lot of cross-file bugs, as discussed later. Also, since we build value-flow graphs separately for each file and do not need to track cross-file value flows, the time cost of building value-flow graphs is almost negligible. Typically, for MySQL, it takes about five minutes to build value-flow graphs for all files. This time cost is included in the results discussed below.

Note that we did not change other default configurations of CSA and Infer. This is because the default configuration is usually the best in practice. Modifying their default configuration may introduce more biases.

Efficiency (Catapult vs. CSA (Z3)). When both Catapult and CSA employ Z3 to solve path constraints, they have similar precision (i.e., full path-sensitivity) in theory. However, as illustrated in Figure 6c, Catapult is much faster than CSA and consumes similar memory for all the subjects. For example, for MySQL, it takes about 36 hours for CSA to finish the analysis while Catapult takes only half an hour and consumes similar memory space. On average, Catapult is 68x faster than CSA at the cost of only 2x more memory to generate and store summaries. In spite of the 2x more memory, both of them can finish the analysis within 12GB of memory, which is affordable on a common personal computer.

Efficiency (Catapult vs. CSA (Default) and Infer). As illustrated in Figure 6c, compared to Infer and the default version of CSA, Catapult takes similar (sometimes slightly higher) time and memory cost to check the subject programs. For instance, for MySQL, the largest subject program, all three tools finish the analysis in 40 minutes and consume about 10GB of memory. With similar efficiency, Catapult, as a fully path-sensitive analysis, is much more precise than the other two. The lower precision of CSA and Infer leads to many false positives, as discussed below.

Effectiveness. In addition to the efficiency, we also investigate the bug-finding capability of the tools. Table 5-Right presents the results. Since we only perform file-level analysis in this experiment, the bugs reported by Catapult are much fewer than those in Table 5-Left. Since validating each report may take tens of minutes, one day, or even longer, we could not afford the time to manually inspect all of them. Thus, we randomly sampled one hundred reports for the projects that have more than one hundred reports. We can observe from the results that, on average, the false positive rate of Catapult is much lower than that of CSA and Infer. In terms of recall, Catapult reports more true positives, which cover all those reported by CSA and Infer. CSA and Infer miss many bugs because they make some trade-offs in exchange for efficiency. For example, CSA often stops its analysis on a path after it finds the first possible bug.

Together with the results on efficiency, we can conclude that Catapult is much more scalable than CSA and Infer because they have similar time and memory overhead but Catapult is much more precise and able to detect more bugs.

6.4. Detected Real Bugs

We note that the real-world software used in our evaluation is frequently scanned by commercial tools such as Coverity SAVE (Coverity Scan: https://scan.coverity.com/projects/) and, thus, is expected to have very high quality. Nevertheless, Catapult still can detect many deeply-hidden software bugs that existing static analyzers, such as Pinpoint, CSA, and Infer, cannot detect.

At the time of writing, thirty-nine previously-unknown bugs have been confirmed and fixed by the software developers, including seventeen null pointer dereferences, ten use-after-free or double-free bugs, eleven resource leaks, and one stack-address-escape bug. Four of them have even been assigned CVE IDs due to their security impact. We have made an online list of all bugs assigned CVE IDs or fixed by their original developers (https://qingkaishi.github.io/catapult.html).

As an example, Figure 7 presents a null-deference bug detected by Catapult in ImageMagick, which is a software suite for processing images. This bug is of high complexity, as it occurs in a function of more than 1,000 lines of code and the control flow involved in the bug spans across 56 functions over 9 files.

Since both CSA and Infer make many unsound trade-offs to achieve scalability, neither of them detects this bug. Pinpoint also cannot detect the bug because it is not memory-efficient and has to give up its analysis after the memory is exhausted.

Figure 7. A null-dereference bug in ImageMagick.

7. Related Work

To the best of our knowledge, a very limited number of existing static analyses have studied how to statically check multiple program properties at once, although the problem is very important in an industrial setting. Goldberg et al. (2018) make unsound assumptions and intentionally stop the analysis on a path after finding the first bug. Apparently, the approach will miss many bugs, which violates our design goal. Different from our approach that reduces unnecessary program exploration via cross-property optimization, Mordan and Mutilin (2016) studied how to distribute computing resources, so that the resources are not exhausted by a few properties. Cabodi and Nocco (2011) studied the problem of checking multiple properties in the context of hardware model checking. Their method has a similar spirit to our approach as it also tries to exploit mutual synergies among different properties. However, it works in a different manner specially designed for hardware. In order to avoid the state-space explosion caused by large sets of properties, some other approaches studied how to decompose a set of properties into small groups (Camurati et al., 2014; Apel et al., 2016). Owing to the decomposition, the analysis results cannot be shared across different groups. There are also some static analyzers such as Semmle (Avgustinov et al., 2016) and DOOP (Bravenboer and Smaragdakis, 2009) that take advantage of datalog engines for multi-query optimization. However, they are usually not path-sensitive and their optimization methods are closely related to the sophisticated datalog specifications. In this paper, we focus on value-flow queries that can be simply specified as a quadruple and, thus, cannot benefit from the datalog engines.

CSA and Infer currently are two of the most famous open-source static analyzers with industrial strength. CSA is a symbolic-execution-based, exhaustive, and whole-program static analyzer. As a symbolic execution, it suffers from the path-explosion problem (King, 1976). To be scalable, it has to make unsound assumptions as in the aforementioned related work (Goldberg et al., 2018), limit its capability of detecting cross-file bugs, and give up full path-sensitivity by default. Infer is an abstract-interpretation-based, exhaustive, and compositional static analyzer. To be scalable, it also makes many trade-offs: giving up path-sensitivity and discarding sophisticated pointer analysis in most cases. Similarly, Tricorder, the analyzer in Google, only works intra-procedurally in order to analyze a large code base (Sadowski et al., 2015, 2018).

In the past decades, researchers have proposed many general techniques that can check different program properties but do not consider how to efficiently check them together (Reps et al., 1995; Ball and Rajamani, 2002; Henzinger et al., 2002; Clarke et al., 2003; Chaki et al., 2004; Xie and Aiken, 2005b; Dillig et al., 2008; Babic and Hu, 2008; Dillig et al., 2011; Cho et al., 2013; Sui and Xue, 2016; Shi et al., 2018). Thus, we study different problems. In addition, there are also many techniques tailored only for a special program property, including null dereference (Livshits and Lam, 2003), use after free (Yan et al., 2018), memory leak (Xie and Aiken, 2005a; Cherem et al., 2007; Sui et al., 2014; Fan et al., 2019), buffer overflow (Le and Soffa, 2008), etc. Since we focus on the extensional scalability issue for multiple properties, our approach is different from them.

Value-flow properties checked in our static analyzer are also related to the well-known type-state properties (Strom, 1983; Strom and Yemini, 1986). Generally, we can regard a value-flow property as a type-state property with at most two states. Nevertheless, value-flow properties cover a wide range of program issues. Thus, a scalable value-flow analyzer is necessary and useful in practice. Modeling a program issue as a value-flow property has many advantages. For instance, Cherem et al. (2007) pointed out that we can utilize the sparseness of the value-flow graph to avoid tracking unnecessary value propagation in a control flow graph, thereby achieving better performance and outputting more concise issue reports. In this paper, we also demonstrate that using the value-flow-based model enables us to mitigate the extensional scalability issue.

8. Conclusion

We have presented Catapult, a scalable approach to checking multiple value-flow properties together. The critical factor that makes our technique fast is that it exploits the mutual synergies among the properties being checked. Since the number of program properties to check keeps growing, we believe that scaling static program analysis to check multiple properties simultaneously will be an important research direction.

Acknowledgements.
The authors would like to thank the anonymous reviewers and Dr. Yepang Liu for their insightful comments. This work is partially funded by Hong Kong GRF16214515, GRF16230716, GRF16206517, and ITS/215/16FP grants. Rongxin Wu is the corresponding author.

References

  • S. Apel, D. Beyer, V. Mordan, V. Mutilin, and A. Stahlbauer (2016) On-the-fly decomposition of specifications in software model checking. In Proceedings of the 2016 24th ACM SIGSOFT International Symposium on Foundations of Software Engineering, pp. 349–361. Cited by: §7.
  • S. Arzt, S. Rasthofer, C. Fritz, E. Bodden, A. Bartel, J. Klein, Y. Le Traon, D. Octeau, and P. McDaniel (2014) Flowdroid: precise context, flow, field, object-sensitive and lifecycle-aware taint analysis for android apps. In Proceedings of the 35th ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI ’14, pp. 259–269. Cited by: §1.
  • P. Avgustinov, O. de Moor, M. P. Jones, and M. Schäfer (2016) QL: object-oriented queries on relational data. In 30th European Conference on Object-Oriented Programming, Cited by: §6.1, §7.
  • D. Babic and A. J. Hu (2008) Calysto: scalable and precise extended static checking. In Proceedings of the 30th International Conference on Software Engineering, ICSE ’08, pp. 211–220. Cited by: §5, §6.1, §7.
  • T. Ball and S. K. Rajamani (2002) The slam project: debugging system software via static analysis. In Proceedings of the 29th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL ’02, pp. 1–3. Cited by: §1, §7.
  • P. Beame, H. Kautz, and A. Sabharwal (2003) Understanding the power of clause learning. In Proceedings of the 18th International Joint Conference on Artificial Intelligence, IJCAI ’03, pp. 1194–1201. Cited by: §4.2.1.
  • A. Bessey, K. Block, B. Chelf, A. Chou, B. Fulton, S. Hallem, C. Henri-Gros, A. Kamsky, S. McPeak, and D. Engler (2010) A few billion lines of code later: using static analysis to find bugs in the real world. Communications of the ACM 53 (2), pp. 66–75. Cited by: §6.2.
  • M. Bravenboer and Y. Smaragdakis (2009) Strictly declarative specification of sophisticated points-to analyses. In Proceedings of the 24th ACM SIGPLAN Conference on Object Oriented Programming Systems Languages and Applications, OOPSLA ’09, pp. 243–262. Cited by: §7.
  • G. Cabodi and S. Nocco (2011) Optimized model checking of multiple properties. In Design, Automation & Test in Europe Conference & Exhibition (DATE), 2011, pp. 1–4. Cited by: §7.
  • P. Camurati, C. Loiacono, P. Pasini, D. Patti, and S. Quer (2014) To split or to group: from divide-and-conquer to sub-task sharing in verifying multiple properties. In International Workshop on Design and Implementation of Formal Tools and Systems (DIFTS), Lausanne, Switzerland, pp. 313–325. Cited by: §7.
  • S. Chaki, E. M. Clarke, A. Groce, S. Jha, and H. Veith (2004) Modular verification of software components in c. IEEE Transactions on Software Engineering 30 (6), pp. 388–402. Cited by: §7.
  • S. Cherem, L. Princehouse, and R. Rugina (2007) Practical memory leak detection using guarded value-flow analysis. In Proceedings of the 28th ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI ’07, pp. 480–491. Cited by: §1, §2.2, §3, §5, §7, §7.
  • C. Y. Cho, V. D’Silva, and D. Song (2013) BLITZ: compositional bounded model checking for real-world programs. In Proceedings of the 28th IEEE/ACM International Conference on Automated Software Engineering, ASE ’13, pp. 136–146. Cited by: §7.
  • A. Cimatti, A. Griggio, and R. Sebastiani (2010) Efficient generation of craig interpolants in satisfiability modulo theories. ACM Transactions on Computational Logic (TOCL) 12 (1), pp. 7. Cited by: §4.2.1.
  • E. Clarke, D. Kroening, and K. Yorav (2003) Behavioral consistency of c and verilog programs using bounded model checking. In Proceedings of the 40th annual Design Automation Conference, pp. 368–371. Cited by: §7.
  • P. Cousot and R. Cousot (2002) Modular static program analysis. In International Conference on Compiler Construction, CC ’02, pp. 159–179. Cited by: §4.3.
  • R. Cytron, J. Ferrante, B. K. Rosen, M. N. Wegman, and F. K. Zadeck (1989) An efficient method of computing static single assignment form. In Proceedings of the 16th ACM SIGPLAN-SIGACT symposium on Principles of programming languages, pp. 25–35. Cited by: Appendix A.
  • R. Cytron, J. Ferrante, B. K. Rosen, M. N. Wegman, and F. K. Zadeck (1991) Efficiently computing static single assignment form and the control dependence graph. ACM Transactions on Programming Languages and Systems (TOPLAS) 13 (4), pp. 451–490. Cited by: §3.
  • M. Das, S. Lerner, and M. Seigle (2002) ESP: path-sensitive program verification in polynomial time. In Proceedings of the ACM SIGPLAN 2002 Conference on Programming Language Design and Implementation, PLDI ’02, pp. 57–68. Cited by: §4.3.
  • L. De Moura and N. Bjørner (2008) Z3: an efficient smt solver. In International conference on Tools and Algorithms for the Construction and Analysis of Systems, pp. 337–340. Cited by: §5, §6.3.
  • J. Dean, D. Grove, and C. Chambers (1995) Optimization of object-oriented programs using static class hierarchy analysis. In European Conference on Object-Oriented Programming, pp. 77–101. Cited by: §5.
  • D. E. Denning (1976) A lattice model of secure information flow. Communications of the Acm 19 (5), pp. 236–243. Cited by: §3.
  • N. Dershowitz, Z. Hanna, and A. Nadel (2006) A scalable algorithm for minimal unsatisfiable core extraction. In Theory and Applications of Satisfiability Testing, SAT ’06, pp. 36–41. Cited by: §4.2.1.
  • I. Dillig, T. Dillig, A. Aiken, and M. Sagiv (2011) Precise and compact modular procedure summaries for heap manipulating programs. In Proceedings of the 32nd ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI ’11, pp. 567–577. Cited by: §7.
  • I. Dillig, T. Dillig, and A. Aiken (2008) Sound, complete and scalable path-sensitive analysis. In Proceedings of the 29th ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI ’08, pp. 270–280. Cited by: §7.
  • G. Fan, R. Wu, Q. Shi, X. Xiao, J. Zhou, and C. Zhang (2019) Smoke: scalable path-sensitive memory leak detection for millions of lines of code. In Proceedings of the 41st International Conference on Software Engineering, ICSE ’19, pp. 72–82. Cited by: §1, §4.3, §7.
  • E. Goldberg, M. Güdemann, D. Kroening, and R. Mukherjee (2018) Efficient verification of multi-property designs (the benefit of wrong assumptions). In 2018 Design, Automation Test in Europe Conference Exhibition (DATE), pp. 43–48. Cited by: §7, §7.
  • T. A. Henzinger, R. Jhala, R. Majumdar, and G. Sutre (2002) Lazy abstraction. In Proceedings of the 29th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL ’02, pp. 58–70. Cited by: §7.
  • J. E. Hopcroft, R. Motwani, and J. D. Ullman (2007) Introduction to automata theory, languages, and computation. 3rd edition, Pearson Addison Wesley. Cited by: Appendix A.
  • J. C. King (1976) Symbolic execution and program testing. Communications of the ACM 19 (7), pp. 385–394. Cited by: §7.
  • W. Le and M. L. Soffa (2008) Marple: a demand-driven path-sensitive buffer overflow detector. In Proceedings of the 16th ACM SIGSOFT International Symposium on Foundations of software engineering, pp. 272–282. Cited by: §1, §7.
  • L. Li, C. Cifuentes, and N. Keynes (2011) Boosting the performance of flow-sensitive points-to analysis using value flow. In Proceedings of the 19th ACM SIGSOFT Symposium and the 13th European Conference on Foundations of Software Engineering, ESEC/FSE ’11, pp. 343–353. Cited by: Appendix A, §3.
  • B. Livshits and M. S. Lam (2003) Tracking pointers with path and context sensitivity for bug detection in c programs. In Proceedings of the 9th European Software Engineering Conference Held Jointly with 11th ACM SIGSOFT International Symposium on Foundations of Software Engineering, ESEC/FSE ’11, pp. 317–326. Cited by: §1, §3, §5, §7.
  • B. Livshits, M. Sridharan, Y. Smaragdakis, O. Lhoták, J. N. Amaral, B. E. Chang, S. Z. Guyer, U. P. Khedker, A. Møller, and D. Vardoulakis (2015) In defense of soundiness: a manifesto. Communications of the ACM 58 (2), pp. 44–46. Cited by: §5.
  • S. McPeak, C. Gros, and M. K. Ramanathan (2013) Scalable and incremental software bug detection. In Proceedings of the 2013 9th Joint Meeting on Foundations of Software Engineering, ESEC/FSE ’13, pp. 554–564. Cited by: §6.2.
  • V. O. Mordan and V. S. Mutilin (2016) Checking several requirements at once by cegar. Programming and Computer Software 42 (4), pp. 225–238. Cited by: §7.
  • T. Reps, S. Horwitz, and M. Sagiv (1995) Precise interprocedural dataflow analysis via graph reachability. In Proceedings of the 22nd ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL ’95, pp. 49–61. Cited by: §7.
  • C. Sadowski, E. Aftandilian, A. Eagle, L. Miller-Cushon, and C. Jaspan (2018) Lessons from building static analysis tools at google. Commun. ACM 61 (4), pp. 58–66. Cited by: §7.
  • C. Sadowski, J. Van Gogh, C. Jaspan, E. Söderberg, and C. Winter (2015) Tricorder: building a program analysis ecosystem. In Proceedings of the 37th International Conference on Software Engineering, ICSE ’15, pp. 598–608. Cited by: §6.1, §7.
  • Q. Shi, X. Xiao, R. Wu, J. Zhou, G. Fan, and C. Zhang (2018) Pinpoint: fast and precise sparse value flow analysis for million lines of code. In Proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI ’18, pp. 693–706. Cited by: Appendix A, §1, §1, §2.2, §3, §3, §5, §5, §6.1, §6.2, §7.
  • R. E. Strom and S. Yemini (1986) Typestate: a programming language concept for enhancing software reliability. IEEE Transactions on Software Engineering (1), pp. 157–171. Cited by: §7.
  • R. E. Strom (1983) Mechanisms for compile-time enforcement of security. In Proceedings of the 10th ACM SIGACT-SIGPLAN Symposium on Principles of Programming Languages, POPL ’83, pp. 276–284. Cited by: §7.
  • Y. Sui and J. Xue (2016) SVF: interprocedural static value-flow analysis in llvm. In International Conference on Compiler Construction, CC ’16, pp. 265–266. Cited by: §1, §3, §7.
  • Y. Sui, D. Ye, and J. Xue (2014) Detecting memory leaks statically with full-sparse value-flow analysis. IEEE Transactions on Software Engineering 40 (2), pp. 107–122. Cited by: Appendix A, §2.2, §3, §3, §5, §7.
  • O. Tripp, M. Pistoia, P. Cousot, R. Cousot, and S. Guarnieri (2013) Andromeda: accurate and scalable security analysis of web applications. In International Conference on Fundamental Approaches to Software Engineering, pp. 210–225. Cited by: §1.
  • Y. Xie and A. Aiken (2005a) Context- and path-sensitive memory leak detection. In Proceedings of the 10th European Software Engineering Conference Held Jointly with 13th ACM SIGSOFT International Symposium on Foundations of Software Engineering, ESEC/FSE ’05, pp. 115–125. Cited by: §7.
  • Y. Xie and A. Aiken (2005b) Scalable error detection using boolean satisfiability. In Proceedings of the 32nd ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL ’05, pp. 351–363. Cited by: §1, §4.3, §5, §5, §6.1, §7.
  • H. Yan, Y. Sui, S. Chen, and J. Xue (2018) Spatio-temporal context reduction: a pointer-analysis-based static approach for detecting use-after-free vulnerabilities. In 2018 IEEE/ACM 40th International Conference on Software Engineering (ICSE), pp. 327–337. Cited by: §7.

Appendix A A Context-Free Grammar Model

Without loss of generality, we assume the code in each function is in SSA form, where every variable has only one definition (Cytron et al., 1989). Following existing work (Li et al., 2011; Sui et al., 2014; Shi et al., 2018), we say the value of a variable a flows to a variable b (or b is data-dependent on a) if a is assigned to b directly (via an assignment such as b=a) or indirectly (via pointer dereferences such as *p=a; q=p; b=*q). The value-flow graph of a program is then defined as below.
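
To make the two cases concrete, the fragment below simply spells out the assignments mentioned above; the comments show the value-flow edges such a definition would record.

    // Spelling out the direct and indirect cases above; comments show the
    // resulting value-flow (data-dependence) edges. Illustration only.
    void example(int a) {
      int b, c;
      int *p = &c, *q;
      b = a;      // direct flow:    a --> b
      *p = a;     // indirect flow:  a --> *p (the object c)
      q = p;      //                 p --> q
      b = *q;     // *q is c, so the value of a reaches b again via the store/load
      (void)b;    // silence unused-variable warnings
    }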

Definition A.1 (Value-Flow Graph).

A value-flow graph is a directed graph G = (V, E), where V and E are defined as follows:

  • V is a set of vertices, each of which is denoted by v@s, meaning that the variable v is defined or used in the statement s.

  • E is a set of edges, each of which represents a data-dependence relation, i.e., a value flow. An edge v1@s1 → v2@s2 ∈ E means that the value of v1@s1 flows to v2@s2.

We say p = (v1@s1, v2@s2, …, vn@sn) is a value-flow path if and only if the sequence represents a path on the value-flow graph. We use p[i] to represent vi@si on the path p if 1 ≤ i ≤ n. Specifically, we use p[end] to represent the last element of p. A value-flow path p1 can be concatenated with another one p2, denoted p1 ∘ p2, if and only if p1[end] = p2[1]. Given two vertex sets A, B ⊆ V, we use paths(A, B) to represent the set of value-flow paths from a vertex in A to a vertex in B. The concatenation of two value-flow paths can then be extended to two path sets: P1 ∘ P2 = { p1 ∘ p2 : p1 ∈ P1, p2 ∈ P2, p1[end] = p2[1] }.
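
For clarity, the following is a minimal sketch of the concatenation operator just described, assuming vertices are represented as plain strings; the names Vertex, Path, concat, and concatSets are ours and only serve the illustration.

    // Minimal sketch of path concatenation: p1 can be concatenated with p2 only
    // when the last vertex of p1 equals the first vertex of p2; the shared
    // vertex is kept once. Vertex is a stand-in type, not the real representation.
    #include <optional>
    #include <string>
    #include <vector>

    using Vertex = std::string;               // e.g., "v@s"
    using Path = std::vector<Vertex>;

    std::optional<Path> concat(const Path& p1, const Path& p2) {
      if (p1.empty() || p2.empty() || p1.back() != p2.front())
        return std::nullopt;                  // concatenation undefined
      Path r = p1;
      r.insert(r.end(), p2.begin() + 1, p2.end());
      return r;
    }

    // Lifting to sets: concatenate every compatible pair from two path sets.
    std::vector<Path> concatSets(const std::vector<Path>& P1,
                                 const std::vector<Path>& P2) {
      std::vector<Path> out;
      for (const auto& p1 : P1)
        for (const auto& p2 : P2)
          if (auto r = concat(p1, p2)) out.push_back(*r);
      return out;
    }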

In the following definitions, we use four special vertex subsets: the sets of formal parameters, actual parameters, formal return values, and actual return values. We refer to the return value at a return statement as the formal return value (e.g., v@return v) and the return value at a call statement as the actual return value (e.g., v@v=func()). All proofs in this subsection are given in Appendix B.
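
For concreteness, the small assumed fragment below marks one vertex of each of the four kinds.

    // Assumed fragment marking the four kinds of vertices described above.
    int callee(int fp) {          // fp is a formal parameter of callee
      int v = fp + 1;
      return v;                   // v@return v: a formal return value
    }
    void caller() {
      int ap = 42;
      int r = callee(ap);         // ap here is an actual parameter;
                                  // r@r=callee(ap): an actual return value
      (void)r;                    // silence unused-variable warnings
    }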

Definition A.2 (Intra-Procedural Value-Flow Paths).

Given a value-flow graph G = (V, E), a value-flow path p = (v1@s1, v2@s2, …, vn@sn) is an intra-procedural value-flow path iff, for every 1 ≤ i ≤ n, vi@si and v1@s1 are in the same function. For simplicity, when we say two vertices are in the same function, we mean that they are in the same function under the same calling context.

As defined below, a same-level value-flow path starts and ends in the same function, but may go through some callee functions.

Definition A.3 (Same-Level Value-Flow Paths).

Given a value-flow graph G = (V, E), a value-flow path p is a same-level value-flow path iff its head p[1] and its tail p[end] are in the same function.

Example A.4 (Same-Level Value-Flow Paths).

The value-flow path in Figure 8 is a same-level value-flow path because the head of the path is in the same function as its tail.

Lemma A.5 (Same-Level Value-Flow Paths).

The set of same-level value-flow paths can be generated using the following productions:

(1)
(2)

An output value-flow path, which is defined below, indicates that a checker-specific source escapes to its caller functions or upper-level caller functions.

Definition A.6 (Output Value-Flow Paths).

Given a value-flow graph G = (V, E), a value-flow path p is an output value-flow path iff its head p[1] is a checker-specific source and its tail p[end] is a formal return value that is in the same function as p[1] or in the (upper-level) callers of p[1]'s function.

Example A.7 (Output Value-Flow Paths).

In Figure 8, the illustrated path is an output value-flow path because its source vertex flows to a formal return value in the same function as the source.
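
The same pattern can be illustrated with a tiny assumed fragment (not the code of Figure 8): a checker-specific source escapes to its caller through a formal return value.

    // Assumed example: the malloc'ed value escapes to the caller through the
    // return value, so the path from the source vertex to the formal return
    // value is an output value-flow path.
    #include <cstdlib>

    void *alloc_buf() {
      void *p = std::malloc(8);   // source vertex: p@p=malloc(8)
      return p;                   // formal return value: p@return p
    }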

Lemma A.8 (Output Value-Flow Paths).

The set of output value-flow paths can be generated using the following productions:

(3)
(4)

An input value-flow path, as defined below, indicates that a formal parameter of a function f may flow to a sink vertex in f or in f's callees. A source vertex in f's caller functions may propagate to the sink through the formal parameter.

Definition A.9 (Input Value-Flow Paths).

Given a value-flow graph G = (V, E), a value-flow path p is an input value-flow path iff its head p[1] is a formal parameter and its tail p[end] is a checker-specific sink vertex that is in the same function as p[1] or in the (lower-level) callees of p[1]'s function.

Example A.10 (Input Value-Flow Paths).

In Figure 8, the illustrated path is an input value-flow path because it starts at a formal parameter and ends at a sink vertex in the same function.
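
Again, the pattern can be illustrated with a tiny assumed fragment (not the code of Figure 8): a formal parameter reaches a checker-specific sink in the same function.

    // Assumed example: the formal parameter q reaches the checker-specific sink
    // free(q) inside the same function, so the path from q to the sink vertex
    // is an input value-flow path.
    #include <cstdlib>

    void release(void *q) {       // formal parameter: q
      std::free(q);               // sink vertex: q@free(q)
    }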

Lemma A.11 (Input Value-Flow Paths).

The set of input value-flow paths can be generated using the following productions:

(5)
(6)
Lemma A.12 (Target Value-Flow Paths).

Given a checker, the set of target value-flow paths can be generated using the following productions:

(7)
(8)
(9)
(10)
Figure 8. Code and its value-flow graph for explaining Examples A.4 - A.10

The context-free grammar (Productions (1) - (10)) implies that there is a Turing machine (or an algorithm) that can generate the target set of value-flow paths by concatenating various value-flow paths (Hopcroft, 2007), which sets the foundation for our compositional analysis. According to the grammar, we can prove that, in the compositional analysis, it is sufficient to generate three kinds of function summaries, i.e., the same-level, output, and input value-flow paths. The sufficiency is stated in the following theorem and proved in the appendices.

Theorem A.13 (Summary Sufficiency).

Any target value-flow path can be written as the concatenation of (1) a function's intra-procedural value-flow paths and (2) the same-level, output, and input value-flow paths from its callees.
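
To illustrate how the theorem could be exploited, the sketch below, under the same plain-string path representation as the earlier sketch, stitches a callee's same-level summary into a caller-side path through the call and return edges of one call site. The types and helper names are illustrative assumptions rather than our implementation.

    // Hedged sketch of summary stitching: a callee's same-level summary (a path
    // from a formal parameter to a formal return value) is spliced into a
    // caller-side path through the call and return edges of a call site.
    #include <string>
    #include <vector>

    using Vertex = std::string;                     // e.g., "v@s"
    using Path = std::vector<Vertex>;

    struct CallSite {
      Vertex actualParam, formalParam;              // call edge:   actualParam -> formalParam
      Vertex formalReturn, actualReturn;            // return edge: formalReturn -> actualReturn
    };

    // Extend a caller-side path that ends at the call's actual parameter with a
    // callee same-level summary and the corresponding actual return value.
    Path stitch(const Path& callerPrefix, const CallSite& cs,
                const Path& calleeSameLevel) {
      Path r = callerPrefix;                        // ... -> actualParam
      r.push_back(cs.formalParam);                  // step into the callee
      // assume calleeSameLevel starts at cs.formalParam and ends at cs.formalReturn
      if (calleeSameLevel.size() > 1)
        r.insert(r.end(), calleeSameLevel.begin() + 1, calleeSameLevel.end());
      r.push_back(cs.actualReturn);                 // step back into the caller
      return r;
    }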

Appendix B Proofs

Given a global value-flow graph G = (V, E), we now explain the proofs of the lemmas and theorems in the paper. To ease the explanation, we refer to vertices that are formal parameters, actual parameters, formal return values, and actual return values by these names directly, adding indices when several of them occur on a path. In the proofs, when we say two elements of V are in the same function, we mean that they are in the same function as well as under the same calling context.


Proof of Lemma A.5.

Proof.

(1) Prove: every path generated by the productions is a same-level value-flow path.

First, according to Definitions A.2 and A.3, it is straightforward to conclude that every intra-procedural value-flow path is also a same-level value-flow path.

Second, any path p generated by Production (2) can be written as p1 ∘ p2 ∘ p3, where p1 ends at an actual parameter, p2 starts at a formal parameter and ends at a formal return value, and p3 starts at an actual return value. Since the tail of p1 is an actual parameter and the head of p2 is a formal parameter, the concatenation of p1 and p2 means that we enter into a callee function, say foo. Since the formal parameter and the formal return value are in the same function, the concatenation of p2 and p3 means that we exit from the callee foo. Thus, the actual parameter and the actual return value belong to the same call site. Since each of p1, p2, and p3 is a same-level path, its head and tail are in the same function. Hence, the head and the tail of p are in the same function, meaning that p is a same-level value-flow path.


(2) Prove: every same-level value-flow path can be generated by the productions.

Any same-level value-flow path p can be intra-procedural or inter-procedural. If it is an intra-procedural path, then it can be generated directly by Production (1).

For an inter-procedural path p, since its head and tail are in the same function, the value of the head must flow to another function, say foo, and then flow back. The function foo must be a callee function, because if a value is returned to the caller function, it cannot flow back to the same function under the same calling context. Therefore, an inter-procedural same-level path must have the form p1 ∘ p2 ∘ p3, where the value flow from the tail of p1 to the head of p2 is a function call (from an actual parameter to a formal parameter), and the value flow from the tail of p2 to the head of p3 is a function return (from a formal return value to an actual return value).

Thus, if p is inter-procedural, it can be generated by Production (2). ∎

Proof of Lemma A.8.

Proof.

(1) Prove: every path generated by the productions is an output value-flow path.

First, according to Definitions A.3 and A.6, it is straightforward to conclude