Extracting Formal Specifications to Strengthen Type Behaviour Testing

by Dimitri Racordon, et al.
University of Geneva

Testing has become an indispensable activity of software development, yet writing good and relevant tests remains a challenging task. One well-known problem is that it is often impossible or unrealistic to test for every outcome, as the input and/or output of a program component can range over incredibly large, if not infinite, domains. A common approach to tackle this issue is to test only classes of cases, and to assume that those classes cover all (or at least most) of the cases a component is likely to be exposed to. Unfortunately, these kinds of assumptions can prove wrong in many situations, causing an otherwise well-tested program to fail upon a particular input. In this short paper, we propose to leverage formal verification, in particular model checking techniques, as a way to better identify cases for which the aforementioned assumptions do not hold, and ultimately strengthen the confidence one can have in a test suite. The idea is to extract a formal specification of the data types of a program, in the form of a term rewriting system, and to check that specification against a set of properties specified by the programmer. Cases for which those properties do not hold can then be identified using model checking, and selected as test cases.




I Introduction

Although indispensable, testing is a time-consuming activity that remains extremely challenging, despite tremendous advances in techniques and supporting tools. Because testing cannot be exhaustive in most cases, the set of possible test cases must be reduced to a set that tests for distinct classes of inputs, under the assumption that if a property holds for this finite number of classes, it also holds for the entire input domain. Unfortunately, identifying these assumptions and ultimately building appropriate test suites is a formidable challenge. Systematic selection techniques have been proposed [1], but require formal specifications, which are a luxury some non-critical industrial developments cannot afford.

Formal verification [2] is often seen as an alternative. Programs are modeled in some formal language, which makes it possible to prove that given properties hold. The advantage of this approach over testing is that properties are verified against the exhaustive set of behaviours a program may exhibit. Unfortunately, despite the outstanding results formal verification has yielded over the last decades, it has seen relatively sparse adoption in industrial software development. State space explosion [3] often appears to be the main limitation, but the cost of understanding and/or integrating formal verification into industrial processes is another reason behind this unfortunate observation. Interestingly, the tools that have met the most success in industry are those that avoid purely mathematical notations, either in favour of visual representations (e.g. MATLAB/Simulink [4]), or in favour of representations close to programming (e.g. Spin/Promela [5]). There thus appears to be a need to bridge the gap between software development and formal verification, in order to benefit as much as possible from both worlds.

In this short paper, we propose to extract formal specifications from actual code, so as to enable the use of formal verification techniques, namely model checking, to identify cases for which a test may fail. Our extraction process relies on the assumption that in most programming languages, the programmer is provided with a collection of basic types that she may combine with some mechanism to form more complex data types. By providing a formal representation for those types out of the box, in the form of Algebraic Data Types, and translating the semantics of the actual code into the form of a Term Rewriting System (TRS) [6], we are able to automatically build a formal specification of the program, so as to check whether or not it satisfies a set of requirements.

II Our approach

As mentioned above, most programming languages provide the programmer with a small collection of basic types (e.g. numeric types, collections, etc.), as well as a mechanism to combine them to create more complex data types (e.g. composition, inheritance, etc.). If these basic types are given a formal representation, in our case by means of an algebraic signature and a term rewriting system, it becomes possible to extract the formal specification of the types and operations from actual code. Consider for instance the Swift (https://swift.org) code given in Listing 1. A type Buffer is defined, with two properties capacity and storage of type Int and Array<Int> respectively. Assuming we already have an algebraic specification for those two types, it is easy to create one for the type Buffer, as a simple composition. The signatures of the write and consume operations are almost identical to those of their Swift counterparts.
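As a hedged illustration, assuming a product notation for multiple arguments and results, and writing $\mathit{Int}^{?}$ for Swift's `Int?` (both notational choices are assumptions, not taken from the original), these signatures could be written as:

```latex
\begin{align*}
\mathit{write}   &: \mathit{Buffer} \times \mathit{Int} \to \mathit{Buffer} \\
\mathit{consume} &: \mathit{Buffer} \to \mathit{Buffer} \times \mathit{Int}^{?}
\end{align*}
```

The only difference with the Swift declarations is the explicit Buffer argument and result, which stand for the implicit `self` of a mutating method.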

Note that a Buffer term appears as part of the domain and codomain of both operations. The one in the domain is required so that the operation can access the properties of the buffer it is manipulating, and the one in the codomain is required so that we can represent the possible mutation of the input buffer, which is in fact the result of transforming imperative code into functional code. The semantics is also easy to extract in this particular example. The write function first tests whether the buffer has reached its maximum capacity, then raises an exception if it has, or inserts the new data if it has not. This behaviour can be represented as a conditional operation in a term rewriting system.
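A minimal sketch of such rules follows. It assumes a constructor $\mathit{buffer}(c, s)$ pairing a capacity with a storage term, the operations $\mathit{count}$ and $\mathit{append}$ from the array specification, and a distinguished $\mathit{overflow}$ term standing for the raised exception. All of these names are hypothetical and chosen for illustration only.

```latex
\begin{align*}
\mathit{write}(\mathit{buffer}(c, s), d) &\to \mathit{buffer}(c, \mathit{append}(s, d))
  && \text{if } \mathit{count}(s) < c \\
\mathit{write}(\mathit{buffer}(c, s), d) &\to \mathit{overflow}
  && \text{if } \mathit{count}(s) \geq c
\end{align*}
```

The two conditions are mutually exclusive and cover every case, so exactly one rule applies to any given write term.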

The semantics of the consume operation is identical to that of the popLast method, from the Array<Int> built-in type, and hence assumed to already be provided.

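enum BufferError: Error {
  case Overflow
}

struct Buffer {
  var capacity: Int   = 3
  var storage : [Int] = []

  // Appends `data`, or throws if the buffer is already full.
  mutating func write(data: Int) throws {
    guard storage.count < capacity else {
      throw BufferError.Overflow
    }
    storage.append(data)
  }

  // Removes and returns the most recently written element, if any.
  mutating func consume() -> Int? {
    return storage.popLast()
  }
}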
Listing 1: Swift implementation of a buffer
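The translation from imperative to functional code described above can also be sketched in Swift itself. The block below is only an illustration under stated assumptions: the names `BufferTerm`, `write(_:_:)` and `consume(_:)` are hypothetical, and the buffer is passed and returned explicitly instead of being mutated in place.

```swift
// Hypothetical functional counterpart of the buffer: every operation takes
// the buffer as an explicit argument and returns the (possibly updated) one.
enum BufferError: Error {
  case overflow
}

struct BufferTerm: Equatable {
  var capacity: Int = 3
  var storage: [Int] = []
}

// write : Buffer x Int -> Buffer (or an overflow error)
func write(_ buffer: BufferTerm, _ data: Int) throws -> BufferTerm {
  guard buffer.storage.count < buffer.capacity else {
    throw BufferError.overflow
  }
  var next = buffer            // value semantics: `buffer` itself is untouched
  next.storage.append(data)
  return next
}

// consume : Buffer -> Buffer x Int?
func consume(_ buffer: BufferTerm) -> (BufferTerm, Int?) {
  var next = buffer
  let value = next.storage.popLast()
  return (next, value)
}
```

Because `BufferTerm` is a value type, the input buffer is never modified, which mirrors the way a rewriting rule produces a new term rather than updating an existing one.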

A formal specification is not useful by itself, but can be used to formally check requirements. In our particular example, we propose to extend Swift to express pre/post-conditions and invariants on data types, as depicted in Listing 2. Equipped with both a formal specification and a set of requirements, we can now use model checking to find cases for which our implementation does not satisfy its requirements. This will not only reveal bugs, but may also provide us with relevant test cases if we are able to keep a trace of the transitions that lead to a particular counterexample.

protocol Buffer {
  when storage.count == capacity
    => write(data: _) throws BufferError.Overflow
  when storage.count == 0
    => consume() == nil
  after write(data: i)
    => consume() == i
}
Listing 2: Specification of semantic requirements
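To make the idea of trace-producing verification more concrete, the sketch below checks the three requirements above with a naive, bounded, explicit-state search. It is not the term-rewriting-based model checker envisioned here. The names `Op`, `violatedRequirement` and `findCounterexample` are hypothetical, and the `strictGuard` flag is an assumption added only to model an off-by-one bug that the search can detect.

```swift
enum BufferError: Error { case overflow }

struct Buffer {
  var capacity = 3
  var storage: [Int] = []
  // Hypothetical switch: `false` replaces `<` with `<=` in the guard (a bug).
  var strictGuard = true

  mutating func write(data: Int) throws {
    let hasRoom = strictGuard ? storage.count < capacity
                              : storage.count <= capacity
    guard hasRoom else { throw BufferError.overflow }
    storage.append(data)
  }

  mutating func consume() -> Int? {
    return storage.popLast()
  }
}

enum Op: Equatable { case write(Int), consume }

// Returns a description of the first requirement violated in `state`, if any.
func violatedRequirement(in state: Buffer) -> String? {
  // when storage.count == capacity => write(data: _) throws
  if state.storage.count == state.capacity {
    var full = state
    if (try? full.write(data: 0)) != nil { return "full buffer must reject write" }
  }
  // when storage.count == 0 => consume() == nil
  if state.storage.isEmpty {
    var empty = state
    if empty.consume() != nil { return "empty buffer must yield nil" }
  }
  // after write(data: i) => consume() == i (only when the write succeeds)
  var probe = state
  if (try? probe.write(data: 7)) != nil, probe.consume() != 7 {
    return "consume must return the last written value"
  }
  return nil
}

// Breadth-first exploration of every operation sequence of length <= depth.
// Returns the shortest trace leading to a violating state, or nil if none.
func findCounterexample(from initial: Buffer, depth: Int)
    -> (trace: [Op], reason: String)? {
  var frontier: [(state: Buffer, trace: [Op])] = [(initial, [])]
  for _ in 0...depth {
    var next: [(state: Buffer, trace: [Op])] = []
    for (state, trace) in frontier {
      if let reason = violatedRequirement(in: state) { return (trace, reason) }
      for op in [Op.write(1), Op.consume] {
        var successor = state
        switch op {
        case .write(let value): _ = try? successor.write(data: value)
        case .consume: _ = successor.consume()
        }
        next.append((successor, trace + [op]))
      }
    }
    frontier = next
  }
  return nil
}
```

Calling `findCounterexample(from: Buffer(), depth: 4)` explores the correct implementation, while `Buffer(strictGuard: false)` explores the faulty variant. In the latter case, the returned trace is a sequence of operations that can be replayed as a concrete test case. The frontier doubles at each step, so the sketch only scales to small depths.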

One nice advantage of our approach over traditional ones is that the requirements are expressed with a syntax extremely close to that of the programming language. In our particular example, we extended Swift with some new constructs, but a less invasive alternative would be to use comments or annotations, as is customary in some other languages. This means the programmer does not need to learn a new language or tool to be able to leverage formal verification.

III Related work

Our work is closely related to Meyer’s design by contract approach [7]. The programmer is provided with a way to specify contracts between a supplier (i.e. a type or an interface) and a client (i.e. a caller) that specify pre/post-conditions and/or invariants on the data that is exchanged between the two. Contracts are traditionally checked dynamically, as the code is running. Our approach differs in that we focus on static analysis, with the advantage that once deemed correct, a program does not need to carry any additional information during its execution.
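For contrast, dynamically checked contracts can be approximated in plain Swift with the standard `precondition` and `assert` functions. The sketch below is only an illustration; the type name `CheckedBuffer` and the choice of checks are assumptions, not part of the approach proposed here.

```swift
// Hypothetical buffer whose contract is checked at run time on every call.
struct CheckedBuffer {
  let capacity = 3
  var storage: [Int] = []

  mutating func write(data: Int) {
    // Precondition: the caller must not write to a full buffer.
    precondition(storage.count < capacity, "write requires a non-full buffer")
    storage.append(data)
    // Postcondition: the written value is the next one to be consumed.
    assert(storage.last == data, "write must store its argument last")
    // Invariant: the buffer never exceeds its capacity.
    assert(storage.count <= capacity, "capacity exceeded")
  }

  mutating func consume() -> Int? {
    let countBefore = storage.count
    let value = storage.popLast()
    // Postcondition: at most one element has been removed.
    assert(storage.count == max(countBefore - 1, 0), "consume removed too much")
    return value
  }
}
```

Every call pays for these checks while the program runs, and a violation is only discovered if an execution happens to trigger it. This is the cost that a static analysis aims to avoid.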

Our work is also related to Abstract Testing [8]. This technique proposes to replace traditional testing with abstract cases. A test case is no longer described as a concrete set of inputs that should yield a concrete output, but rather as a set of input constraints that should yield an answer that satisfies other constraints. Then, model checking can be used to prove the correctness of the system under test. In fact, abstract testing is very close to our approach, and only differs in that it does not produce a formal specification, using the system under test as some kind of black box. The advantage is that while extracting the semantics of arbitrary code might be intractable in some cases, it is easier to call existing code and observe its behaviour. On the other hand, a complete formal semantics will yield stronger proofs, as it need not depend on implementation-specific properties.


  • [1] G. Bernot, M. C. Gaudel, and B. Marre, “Software testing based on formal specifications: a theory and a tool,” Software Engineering Journal, vol. 6, no. 6, pp. 387–405, Nov 1991.
  • [2] O. Hasan and S. Tahar, “Formal verification methods,” in Encyclopedia of Information Science and Technology, Third Edition. IGI Global, 2015, pp. 7162–7170.
  • [3] E. M. Clarke, W. Klieber, M. Nováček, and P. Zuliani, Model Checking and the State Explosion Problem. Berlin, Heidelberg: Springer Berlin Heidelberg, 2012, pp. 1–30.
  • [4] J. B. Dabney and T. L. Harman, Mastering Simulink. Pearson/Prentice Hall, 2004.
  • [5] G. Holzmann, The Spin Model Checker: Primer and Reference Manual, 1st ed. Addison-Wesley Professional, 2003.
  • [6] A. Dick and P. Watson, “Order-sorted term rewriting,” The Computer Journal, vol. 34, no. 1, pp. 16–19, 1991.
  • [7] B. Meyer, Design by Contract. Prentice Hall, 2002.
  • [8] F. Merz, C. Sinz, H. Post, T. Gorges, and T. Kropf, “Bridging the gap between test cases and requirements by abstract testing,” Innovations in Systems and Software Engineering, vol. 11, no. 4, pp. 233–242, Dec 2015.