Towards Efficient Data-flow Test Data Generation Using KLEE

03/17/2018 ∙ by Chengyu Zhang, et al. ∙ Microsoft, Nanyang Technological University, East China Normal University

Data-flow coverage, one of the white-box testing criteria, focuses on the relations between variable definitions and their uses. Several empirical studies have shown that data-flow testing is more effective than control-flow testing. However, data-flow testing has still not been adopted in practice, owing to the lack of effective tool support. To this end, we propose a guided symbolic execution approach that efficiently searches for program paths satisfying data-flow coverage criteria. We implemented this approach on KLEE and evaluated it on 30 program subjects drawn from previous data-flow testing literature, SIR, and the SV-COMP benchmarks. Moreover, we plan to integrate the data-flow testing technique into the newly proposed symbolic execution engine SmartUnit, a cloud-based unit testing service for industrial software that supports coverage-based testing. SmartUnit has helped several well-known corporations and institutions in China adopt coverage-based testing in practice, and has tested more than one million lines of real industrial code in total.




1. Introduction

Data-flow testing is a group of testing strategies that aim to find paths exercising the interactions between definitions and uses of variables. Faults can be found by observing whether all corresponding uses produce the desired results. The idea of data-flow testing was first proposed in 1976 by Herman (Herman, 1976), who claimed that data-flow testing could test a program more thoroughly and reveal more software bugs. Several empirical studies have revealed that data-flow coverage criteria are more effective than control-flow coverage criteria (Hutchins et al., 1994; Frankl and Iakounenko, 1998; Khannur, 2011). However, data-flow coverage has still not been adopted in practice, owing to the lack of effective tool support.

In this presentation, we propose an efficient guided testing approach for achieving data-flow coverage criteria. To our knowledge, we are the first to adapt KLEE, an efficient and robust symbolic execution engine, for data-flow testing. To evaluate its efficiency, we build a data-flow testing benchmark, which consists of the subjects used in previous data-flow testing work, the Software-artifact Infrastructure Repository (SIR), and the International Competition on Software Verification (SV-COMP) benchmarks.

2. Approach and Implementation

2.1. Approach

According to Herman’s definition (Herman, 1976), a def-use pair exists when there is at least one program path from the definition of a variable x to a use of x with no redefinition of x between the definition and the use. An input t satisfies a def-use pair if it induces an execution path that passes through the definition and then the use, with no intermediate redefinition of x. The requirement to cover every def-use pair at least once is called the def-use coverage criterion.
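As a minimal sketch of this definition, the following checks whether one execution trace covers a given def-use pair. The trace format (a list of `(line, kind, variable)` events) and the name `covers_pair` are assumptions made for illustration; they are not part of KLEE or of our implementation.

```python
from typing import List, Tuple

# A trace is the ordered list of events executed for one input.
# Each event is (line_number, kind, variable), where kind is "def" or "use".
# This representation is hypothetical and chosen only for illustration.
Event = Tuple[int, str, str]


def covers_pair(trace: List[Event], var: str, def_line: int, use_line: int) -> bool:
    """Return True if the trace passes through the definition at def_line and
    later reaches the use at use_line with no redefinition of var in between."""
    live = False  # True while the value written at def_line has not been overwritten
    for line, kind, name in trace:
        if name != var:
            continue
        if kind == "def":
            # Any other definition of var kills the value from def_line.
            live = (line == def_line)
        elif kind == "use" and line == use_line and live:
            return True
    return False


# Hypothetical program: x is defined at line 1, possibly redefined at line 3,
# and used at line 5.
clean_trace = [(1, "def", "x"), (2, "use", "y"), (5, "use", "x")]
killed_trace = [(1, "def", "x"), (3, "def", "x"), (5, "use", "x")]

print(covers_pair(clean_trace, "x", 1, 5))   # True
print(covers_pair(killed_trace, "x", 1, 5))  # False
```

The second trace reaches both the definition and the use, but the redefinition at line 3 intervenes, so the pair is not covered.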

Figure 1. The basic process of data-flow testing.

Figure 1 shows the basic process of data-flow testing. For an input program, we perform static analysis to obtain a set of def-use pairs. At each step, the test generator selects one pair as the target and tries to find a test input that covers it. If the generator succeeds, the test input is added to the test suite; otherwise, the pair is regarded as unsatisfiable within the given testing time.
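The process can be sketched as a simple loop over the target pairs. The names `generate_suite`, `find_input`, and `budget_s` are hypothetical, and the toy generator below merely looks inputs up in a table; in practice the per-pair search would be performed by the symbolic execution engine.

```python
import time
from typing import Callable, Dict, Hashable, Iterable, List, Optional, Tuple

Pair = Hashable  # e.g. (variable, def_line, use_line); an illustrative encoding


def generate_suite(
    pairs: Iterable[Pair],
    find_input: Callable[[Pair, float], Optional[object]],
    budget_s: float,
) -> Tuple[Dict[Pair, object], List[Pair]]:
    """Try to cover each def-use pair in turn within a per-pair time budget."""
    suite: Dict[Pair, object] = {}
    unsatisfied: List[Pair] = []
    for pair in pairs:
        deadline = time.monotonic() + budget_s
        test_input = find_input(pair, deadline)  # returns None on failure or timeout
        if test_input is not None:
            suite[pair] = test_input
        else:
            # Regarded as unsatisfiable within the given testing time.
            unsatisfied.append(pair)
    return suite, unsatisfied


# Toy stand-in for the real test generator: a fixed table of known inputs.
known_inputs = {("x", 1, 5): {"a": 0}, ("y", 2, 7): {"a": 3}}


def toy_find_input(pair: Pair, deadline: float) -> Optional[object]:
    if time.monotonic() > deadline:
        return None
    return known_inputs.get(pair)


pairs = [("x", 1, 5), ("y", 2, 7), ("z", 4, 9)]
suite, unsatisfied = generate_suite(pairs, toy_find_input, budget_s=1.0)
print(len(suite), unsatisfied)  # 2 [('z', 4, 9)]
```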

To achieve efficient data-flow testing, we designed a cut-points guided search algorithm to enhance symbolic execution (Su et al., 2015). The key idea of the strategy is to reduce unnecessary path exploration and to provide more guidance during execution. Three key elements are used to optimize the strategy: Cut Points, Instruction Distance, and Redefinition Pruning. (1) Cut Points are a sequence of control points that must be traversed by any path that could cover the target pair. These cut points are used as intermediate goals during the search to narrow down the exploration space of symbolic execution. (2) Instruction Distance is the distance in the control-flow graph between the currently executed instruction and the target instruction. An execution state with a shorter instruction distance to the goal can reach it more easily. (3) Redefinition Pruning gives lower priority to execution states whose paths contain a redefinition. By the definition of covering a def-use pair, a path with a redefinition between the definition and the use is invalid. Pruning such potentially invalid paths avoids useless exploration.
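One way to combine the three elements is to rank execution states with a lexicographic key, as in the sketch below. The `State` fields, the toy control-flow graph, and the particular ordering (redefinition first, then cut points passed, then distance) are assumptions for illustration; the actual weighting used by the searcher is not specified in this text.

```python
from collections import deque
from dataclasses import dataclass
from typing import Dict, List


def distances_to(cfg: Dict[str, List[str]], target: str) -> Dict[str, int]:
    """Shortest number of CFG edges from every node to target (reverse BFS)."""
    preds: Dict[str, List[str]] = {}
    for src, succs in cfg.items():
        for dst in succs:
            preds.setdefault(dst, []).append(src)
    dist = {target: 0}
    queue = deque([target])
    while queue:
        node = queue.popleft()
        for p in preds.get(node, []):
            if p not in dist:
                dist[p] = dist[node] + 1
                queue.append(p)
    return dist


@dataclass
class State:
    name: str
    location: str            # CFG node the state is currently at
    cut_points_passed: int   # how many cut points the path has traversed so far
    redefined: bool          # True if the path redefined the variable after the def


def rank_states(states: List[State], cfg: Dict[str, List[str]],
                cut_points: List[str]) -> List[State]:
    """Order states from most to least promising."""
    def key(s: State):
        # The next intermediate goal is the first cut point not yet passed.
        goal = cut_points[min(s.cut_points_passed, len(cut_points) - 1)]
        d = distances_to(cfg, goal).get(s.location, float("inf"))
        # Redefinition pruning, then cut-point progress, then instruction distance.
        return (s.redefined, -s.cut_points_passed, d)
    return sorted(states, key=key)


# Toy CFG: the def-use pair is (def, use); "redef" overwrites the variable.
cfg = {"entry": ["a", "b"], "a": ["def"], "b": ["c"], "c": ["def"],
       "def": ["redef", "use"], "redef": ["use"], "use": []}
cut_points = ["def", "use"]
states = [
    State("s1", "a", 0, False),
    State("s2", "b", 0, False),
    State("s3", "redef", 1, True),
    State("s4", "def", 1, False),
]
print([s.name for s in rank_states(states, cfg, cut_points)])
# ['s4', 's1', 's2', 's3']
```

Here `s4` has already passed the first cut point without a redefinition and is selected first, while `s3` is explored last because its path redefines the variable.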

2.2. Implementation

Our implementation follows the basic data-flow testing process. In the data-flow analysis phase, the input program is analyzed by CIL (Necula et al., 2002), an infrastructure for C program analysis and transformation, to identify the def-use pairs, cut points, and other static program information. We use CIL instead of LLVM because we intend to analyze def-use pairs at the source level rather than at the bytecode level (LLVM IR). The test data generation and coverage computation phases are implemented on KLEE.

In detail, we implemented a searcher class in KLEE to apply our cut-points guided search strategy. Furthermore, we constructed a data-flow information table in KLEE to supply extra data-flow information, such as redefinitions and cut points, during execution. A coverage checker and a redefinition checker are also implemented in KLEE based on the process tree. If an execution state reaches the use through the definition without any redefinition on the path, the coverage checker reports the pair as covered. A test case satisfying the pair is then generated from that execution state.

3. Evaluation and Discussion

Although data-flow testing has been investigated for a long time, there are still no standard benchmark programs for evaluating data-flow testing techniques. To set up a fair basis for evaluation, we constructed our benchmark as follows: (1) We surveyed the previous data-flow testing literature (Su et al., 2017). After excluding subjects that are not available, not written in C, or simple laboratory programs, we obtained 7 subjects. (2) We included the 7 Siemens subjects from SIR, which are widely used in experiments on program analysis and software testing. (3) We further enriched the repository with 16 distinct subjects from SV-COMP, the benchmark used in the competition on software verification. We chose two groups of subjects from the SV-COMP benchmark: the ntdriver group (6 subjects) and the ssh group (10 subjects). The code size and number of pairs for each category are shown in Table 1.

Subjects | #Sub | #LOC | #DU pair | Average Coverage (KLEE) | Average Coverage (CPAchecker) | Median Time, s/pair (KLEE) | Median Time, s/pair (CPAchecker)
Previous Literature | 7 | 449 | 346 | 60% | 72% | 0.1 | 4.3
SIR | 7 | 2,687 | 1,409 | 57% | 60% | 0.7 | 10.1
SV-COMP (ntdriver) | 6 | 7,266 | 2,691 | 75% | 51% | 1.5 | 5
SV-COMP (ssh) | 10 | 5,249 | 18,347 | 29% | 31% | 18 | 5.7
Table 1. Evaluation Statistics of Data-flow Testing via KLEE

Table 1 gives an overview of our evaluation. It shows the data-flow coverage (computed as #Covered/#Total) and the median test time of the KLEE and CPAchecker approaches (Su et al., 2015) on our benchmark. From Table 1, we can see that KLEE achieves nearly 60% data-flow coverage in less than 1 second per pair on the subjects from previous literature and the SIR subjects. Compared with the model-checking approach implemented on CPAchecker, it spends less time and achieves similar coverage on most of the subjects. We found that the longer median time on the SV-COMP ssh benchmark is caused by its complexity: these subjects contain complicated loops. The existence of a large number of infeasible pairs is also an obstacle to applying symbolic execution in practice, since much time is wasted on useless exploration. To address this problem, we use model checking to filter out the infeasible pairs, keeping symbolic execution away from useless exploration.

4. Application

The idea in this presentation comes from our previous work (Su et al., 2015). According to our survey on data-flow testing (Su et al., 2017), there are a variety of approaches to generating data-flow test data, such as random testing, collateral coverage-based testing, search-based testing, and model checking-based testing. However, to our knowledge, we are the first to adapt symbolic execution for data-flow testing. Symbolic execution is more efficient and precise, because it can directly find the paths that satisfy the data-flow coverage criteria and readily generate the corresponding test cases.

Furthermore, we plan to integrate the data-flow testing technique into SmartUnit (Zhang et al., 2018), a cloud-based industrial framework for automated unit test generation that builds on our previous work on symbolic execution (Su et al., 2014, 2016). SmartUnit has helped several corporations and institutions adopt coverage-based testing in practice, including the China Academy of Space Technology, the main spacecraft development and production agency in China (comparable to NASA in the United States), and CASCO Signal Ltd., a leading railway signal technology corporation in China. SmartUnit has tested more than one million lines of code in total since its release in 2016.

5. Conclusion

In this presentation, we propose an efficient data-flow test data generation algorithm implemented on KLEE and evaluated on a diverse set of program subjects. It enables efficient and effective data-flow testing and helps several corporations and institutions to adopt data-flow testing in practice.


  • Frankl and Iakounenko (1998) Phyllis G Frankl and Oleg Iakounenko. 1998. Further empirical studies of test effectiveness. ACM SIGSOFT Software Engineering Notes 23, 6 (1998), 153–162.
  • Herman (1976) PM Herman. 1976. A data flow analysis approach to program testing. Australian Computer Journal 8, 3 (1976), 92–96.
  • Hutchins et al. (1994) Monica Hutchins, Herb Foster, Tarak Goradia, and Thomas Ostrand. 1994. Experiments on the effectiveness of dataflow-and control-flow-based test adequacy criteria. In Software Engineering, 1994. Proceedings. ICSE-16., 16th International Conference on. IEEE, 191–200.
  • Khannur (2011) Arunkumar Khannur. 2011. Software Testing: Techniques and Applications. Pearson Education India.
  • Necula et al. (2002) George Necula, Scott McPeak, Shree Rahul, and Westley Weimer. 2002. CIL: Intermediate language and tools for analysis and transformation of C programs. In Compiler Construction. Springer, 209–265.
  • Su et al. (2015) Ting Su, Zhoulai Fu, Geguang Pu, Jifeng He, and Zhendong Su. 2015. Combining symbolic execution and model checking for data flow testing. In Proceedings of the 37th International Conference on Software Engineering-Volume 1. 654–665.
  • Su et al. (2014) Ting Su, Geguang Pu, Bin Fang, Jifeng He, Jun Yan, Siyuan Jiang, and Jianjun Zhao. 2014. Automated coverage-driven test data generation using dynamic symbolic execution. In Software Security and Reliability, 2014 Eighth International Conference on. IEEE, 98–107.
  • Su et al. (2016) Ting Su, Geguang Pu, Weikai Miao, Jifeng He, and Zhendong Su. 2016. Automated coverage-driven testing: combining symbolic execution and model checking. SCIENCE CHINA Information Sciences 59, 9 (2016), 98101.
  • Su et al. (2017) Ting Su, Ke Wu, Weikai Miao, Geguang Pu, Jifeng He, Yuting Chen, and Zhendong Su. 2017. A Survey on Data-Flow Testing. Comput. Surveys 50, 1 (2017), 5.
  • Zhang et al. (2018) Chengyu Zhang, Yichen Yan, Hanru Zhou, Yinbo Yao, Ke Wu, Ting Su, Weikai Miao, and Geguang Pu. 2018. SmartUnit: Empirical Evaluations for Automated Unit Testing of Embedded Software in Industry. In Proceedings of 40th International Conference on Software Engineering: Software Engineering in Practice Track (ICSE-SEIP’18). 10 pages.