Smart Contract Design Meets State Machine Synthesis: Case Studies

06/07/2019 ∙ by Dmitrii Suvorov, et al. ∙ ITMO University

Modern blockchain systems support the creation of smart contracts: stateful programs hosted and executed on a blockchain. Smart contracts hold and transfer significant amounts of digital currency, which makes them an attractive target for security attacks. It has been shown that many contracts deployed to public ledgers contain security vulnerabilities. Moreover, the design of blockchain systems does not allow the code of a smart contract to be changed after it has been deployed. Therefore, it is important to guarantee the correctness of smart contracts prior to their deployment. Formal verification is widely used to check smart contracts for correctness with respect to a given specification. In this work we consider program synthesis techniques, in which the specification is used to generate correct-by-construction programs. We focus on a special case of program synthesis where programs are modeled with finite state machines (FSMs). We show how FSM synthesis can be applied to the problem of automatic smart contract generation. Several case studies of smart contracts are outlined: a crowdfunding platform, a blinded auction and a license contract. For each case study we specify the corresponding smart contract with a set of formulas in linear temporal logic (LTL) and use this specification together with test scenarios to synthesize an FSM model for that contract. These models are later used to generate executable Solidity code which can be directly used in a blockchain system.

1 Introduction

Since the invention of the blockchain data structure in 2008, various cryptocurrencies have emerged, evolved and gained popularity. This popularity is explained by the fact that blockchain systems are fully operable without a trusted entity. Recent cryptocurrencies support the creation of smart contracts, i.e., stateful programs executed on a blockchain that encode the rules governing transactions. The execution of smart contracts is enforced by the consensus algorithm of the underlying blockchain system.

Smart contracts are a powerful tool for encoding arbitrary contractual agreements in a machine-readable form, but as with any program they are error-prone and hard to reason about. Furthermore, the design principles of blockchain systems make it impossible to modify a contract’s code after it has been deployed to the blockchain. Smart contracts also hold and transfer significant amounts of digital currency, which makes them an attractive target for various attacks and drastically increases the cost of an error in smart contract code. It has been shown that many contracts deployed to public ledgers contain security vulnerabilities, and these vulnerabilities have led to the theft of millions of US dollars in cryptocurrency equivalent. Thus, it is of paramount importance to ensure that smart contracts are correct. Various methods based on formal verification have been proposed to achieve this goal [1, 2, 3].

Another method to build correct programs is program synthesis, which has a lot in common with formal verification. The problem of program synthesis is formulated as follows: given a specification in formal logic, construct a program conforming to that specification. This problem is known to be undecidable in general; however, various methods have been proposed for some special classes of programs. Unlike formal verification, automated synthesis of smart contracts has received very little attention, although applying program synthesis to smart contracts looks promising given that they are relatively small (fewer than 100 SLOC on average [4]).

Program synthesis is a very broad topic, and in this work we focus only on FSM synthesis, where programs are defined in terms of FSMs. The rationale is that smart contract logic can often be expressed with an FSM. Moreover, modeling contracts as FSMs is a recommended design pattern [6] for Solidity, the language of Ethereum [5] contracts. In fact, the VeriSolid tool [7] facilitates creating FSM models of smart contracts and generating Solidity code from these models. A user can specify temporal properties and verify that the generated contracts conform to these properties.

Fig. 1: Data-flow diagram of the proposed approach.

In this work we employ the techniques and tools (EFSM-tools, https://github.com/ulyantsev/EFSM-tools/) outlined in [8]. We do not intend to compare different FSM synthesis tools, since for the case studies in Section 3 synthesis terminates within seconds. In our approach, input events of an FSM correspond to methods of a contract, and output actions correspond to the implementations of those methods. We synthesize smart contract FSM models from a formal specification in temporal logic and from test scenarios. Afterwards, given the contract’s state declaration and the implementation of its methods in a corresponding programming language, we can generate the contract’s code, which is guaranteed to be correct with respect to the specification. Code generation is straightforward and similar to that in the VeriSolid tool; however, VeriSolid does not support multiple transitions labeled by the same event, so we implemented our own tool, fsmc (https://github.com/d-suvorov/fsmc/). The high-level data-flow diagram for the proposed approach is shown in Figure 1.

Contributions. To the best of our knowledge, this work is the first attempt to employ specification-based program synthesis techniques to automatically generate smart contract source code. Specifically, we provide case studies to show that LTL synthesis can be successfully applied for some types of contracts to generate their FSM models and use these models to obtain source code that meets some formal properties. We also briefly discuss the applicability of program synthesis for automated smart contract generation.

2 Background and related work

In this section we introduce some basic concepts of blockchain systems and smart contracts necessary to understand the rest of the paper. Then we define the FSMs that are used to model smart contracts and introduce the formalisms that are used to specify them. Finally, we state the problem of specification-based FSM synthesis and provide some references to its efficient solutions.

2.1 Smart contracts

A blockchain is a list of records, called blocks, containing some data. A blockchain can be used as a ledger that is maintained in a distributed network; for instance, in cryptocurrencies this ledger stores a transaction list. Effectively, a ledger in cryptocurrencies stores the mapping from accounts to their balances in digital currency. We refer to this digital currency as coins. The nodes of the network, called miners, execute a consensus algorithm and decide which blocks will be added. It is assumed that the majority of nodes are honest, as they are incentivized to add new valid blocks, and the integrity of the system is based on that assumption.

Modern blockchain systems support smart contracts – executable programs stored on a blockchain. A contract is executed by miners which agree on the outcome of the execution and update the blockchain accordingly. Hence arbitrary contractual agreements can be expressed in program code and enforced without relying on a trusted party. Most popular smart contract systems share the same concepts but for the rest of the paper we consider Ethereum-like smart contracts.

In Ethereum [5], smart contracts are a type of account associated with executable code and a storage file. Smart contracts can be created by sending a transaction of a special kind to the blockchain. The code of a contract consists of methods, i.e., entry points which are called when transactions are sent to the address of that contract. Essentially, transactions act as method invocations. Contracts can receive coins with these transactions and send coins to other accounts via send instructions. Each instruction of a method consumes some amount of gas during execution. The user who sends a transaction must pay for the gas used by its execution. If a transaction runs out of gas during its execution, control returns to the sender. An example of a smart contract is shown in the next section.

The problem of creating correct smart contracts has been actively studied over the past years. One of the first works, by Delmolino et al., outlines common pitfalls specific to smart contract development. Since then, a variety of techniques have been used to verify smart contracts. Different tools based on symbolic execution were created: Oyente [2], Mythril [9], Manticore [10], Maian [11]. An early work of Bhargavan et al. uses the F* programming language [1]. More recently, modern theorem provers have been employed to formalize different aspects of smart contracts in blockchain systems [3, 12, 13] and to mechanize reasoning about those aspects. Sergey et al. [14] design a new functional language, implement its embedding into Coq and mechanize proofs of safety and liveness properties of smart contracts. Flint [15] is a programming language that was designed specifically for writing robust smart contracts. Flint employs linear type theory to prevent unintentional loss of coins. Interesting examples of contract-oriented languages are Bamboo [16] and Obsidian [17], as they model smart contracts as state machines and make state transitions explicit. Model checking can also be used to verify smart contracts. Nehai et al. [18] use the NuSMV model checker to create a blockchain application model (including a model of the blockchain itself) and check its temporal properties.

Idelberger et al. [19] propose to use defeasible deontic logic to create smart contracts, which is somewhat similar to our approach. However, the execution of such smart contracts relies on an external logic engine. This setup negatively affects the performance. We generate FSM models which can be encoded in some programming language and executed directly.

2.2 Specification-based FSM synthesis

Fig. 2: An example of an FSM (a) and its Kripke structure (b).

Following [8], we define a finite state machine (FSM) as a tuple $(S, s_0, E, A, \delta, \lambda)$, where

  • $S$ is a finite set of states,

  • $s_0 \in S$ is the initial state,

  • $E$ is a finite set of input events,

  • $A$ is a finite set of output actions,

  • $\delta : S \times E \to S$ is the transition function,

  • $\lambda : S \times E \to A^*$ is the output function (by $A^*$ we denote the set of strings over $A$).

An FSM reads a sequence of input events one by one and transforms it into a sequence of output actions. With each input event it generates new output actions according to $\lambda$ and changes its active state according to $\delta$.
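The definition above can be illustrated with a minimal Python sketch. The field names mirror the usual symbols for the six components of the tuple, and the two-state machine with events `toggle` and `ping` is a hypothetical example, not one of the contracts studied in this paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FSM:
    states: frozenset   # finite set of states
    initial: str        # initial state
    events: frozenset   # finite set of input events
    actions: frozenset  # finite set of output actions
    delta: dict         # transition function: (state, event) -> next state
    lam: dict           # output function: (state, event) -> tuple of actions

    def run(self, inputs):
        """Feed input events one by one; return (final state, emitted actions)."""
        state, out = self.initial, []
        for event in inputs:
            out.extend(self.lam[(state, event)])   # output for this step
            state = self.delta[(state, event)]     # then change the active state
        return state, out


# Hypothetical machine: "toggle" flips the state, "ping" keeps it.
toy = FSM(
    states=frozenset({"s0", "s1"}),
    initial="s0",
    events=frozenset({"toggle", "ping"}),
    actions=frozenset({"on", "off", "pong"}),
    delta={("s0", "toggle"): "s1", ("s1", "toggle"): "s0",
           ("s0", "ping"): "s0", ("s1", "ping"): "s1"},
    lam={("s0", "toggle"): ("on",), ("s1", "toggle"): ("off",),
         ("s0", "ping"): ("pong",), ("s1", "ping"): ("pong",)},
)

final_state, outputs = toy.run(["toggle", "ping", "toggle"])
```

Here the output function returns a tuple so that a single event may emit several actions, matching the fact that outputs are strings over the action set.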

Model checking. Model checking is a technique for automatically verifying finite-state systems with respect to a given specification [20]. It is common to formalize the specification as a formula in temporal logic. In linear temporal logic (LTL), formulas express properties of execution paths [21]. To proceed with its definition we first define a Kripke structure. By $AP$ we denote a set of atomic propositions, which characterize execution states. Formally, a Kripke structure is a tuple $M = (S, S_0, R, L)$, where

  • $S$ is a set of states,

  • $S_0 \subseteq S$ is a set of initial states,

  • $R \subseteq S \times S$ is a transition relation, which must be left-total (that is, from each state there is a transition to at least one state),

  • $L : S \to 2^{AP}$ is a labeling function.

An example of an FSM and a corresponding Kripke structure is shown in Figure 2. To label transitions we use this notation: input event / output action.

LTL formulas are defined over infinite paths in Kripke structures. The formulas are built up from temporal operators, atomic propositions and connectives familiar from propositional logic ($\neg$, $\wedge$, $\vee$, $\rightarrow$). If $f$ is a Boolean formula, then it simply states with which atomic propositions the first state of the path is marked. If $f$ is an LTL formula, then saying that $f$ holds for a state of an infinite path means that it holds for the infinite suffix of the path starting from this state. The following temporal operators can be used.

  • The next operator $\mathbf{X}$: $\mathbf{X} f$ means that $f$ has to hold at the next state of the path.

  • The globally operator $\mathbf{G}$: $\mathbf{G} f$ means that $f$ has to hold on the entire suffix of the path.

  • The future operator $\mathbf{F}$: $\mathbf{F} f$ means that $f$ eventually has to hold (somewhere on the suffix of the path starting from this state).

  • The until operator $\mathbf{U}$: $f\ \mathbf{U}\ g$ means that $f$ has to hold at least until $g$ becomes true, which must hold at this or some future state.

  • The release operator $\mathbf{R}$: $f\ \mathbf{R}\ g$ means that $g$ has to hold until and including the point where $f$ first becomes true; if $f$ never becomes true, $g$ must remain true forever.

Saying that a formula $f$ is true for a Kripke model $M$ means that it is satisfied on all infinite paths of $M$.
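As a brief aid to reading the specifications that follow, the operators described above are related by a few standard LTL equivalences. Writing $f$ and $g$ for arbitrary LTL formulas and $\mathbf{X}$, $\mathbf{G}$, $\mathbf{F}$, $\mathbf{U}$, $\mathbf{R}$ for the next, globally, future, until and release operators (this notation is an assumption of the sketch), they read:

```latex
\begin{align*}
\mathbf{F}\, f &\equiv \mathit{true}\ \mathbf{U}\ f, \\
\mathbf{G}\, f &\equiv \neg\, \mathbf{F}\, \neg f, \\
f\ \mathbf{R}\ g &\equiv \neg(\neg f\ \mathbf{U}\ \neg g), \\
\mathbf{G}\, f &\equiv f \wedge \mathbf{X}\, \mathbf{G}\, f, \\
\mathbf{F}\, f &\equiv f \vee \mathbf{X}\, \mathbf{F}\, f.
\end{align*}
```

The first three show that $\mathbf{X}$ and $\mathbf{U}$ suffice to express the remaining operators; the last two are the usual one-step unfoldings.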

FSM synthesis. The problem of synthesizing an FSM from a specification is well known. Depending on the problem statement, the specification may be given as a temporal formula, a set of test scenarios, or a combination of the two. A test scenario for an FSM is a sequence of pairs $(e_1, a_1), \ldots, (e_n, a_n)$, where each $e_i \in E$ and $a_i \in A^*$; an FSM conforms to it if and only if it produces the sequence of actions $a_1 \cdot a_2 \cdot \ldots \cdot a_n$ (by $\cdot$ we denote sequence concatenation) given the sequence of events $e_1, \ldots, e_n$ as its input. Exact synthesis methods are mostly based on translation to SAT [8, 22]. In [8], different approaches based on translation to SAT and QSAT were examined. In the most efficient approach, scenarios are encoded in SAT and LTL formulas are incorporated through iterative counterexample prohibition. The BoSy tool [23, 24] uses a QSAT encoding instead of SAT and generates a transition system from a set of LTL formulas alone. The generated transition system is guaranteed to be minimal in the number of states.
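Scenario conformance as described above can be checked directly. The following Python sketch assumes a deterministic FSM given by two dictionaries, `delta` and `lam`, keyed by (state, event); these names and the toy machine are illustrative assumptions and are not part of EFSM-tools.

```python
def conforms(delta, lam, initial, scenario):
    """Check a scenario [(event, expected_actions), ...] against an FSM.

    The FSM conforms if, fed the scenario's events in order, it emits exactly
    the concatenation of the scenario's expected action sequences.
    """
    state, produced, expected = initial, [], []
    for event, actions in scenario:
        if (state, event) not in delta:
            return False  # no transition: the machine cannot follow the scenario
        produced.extend(lam[(state, event)])
        expected.extend(actions)
        state = delta[(state, event)]
    return produced == expected


# Hypothetical two-state machine used only to exercise the check.
delta = {("s0", "a"): "s1", ("s1", "b"): "s0"}
lam = {("s0", "a"): ("x",), ("s1", "b"): ("y", "z")}

ok = conforms(delta, lam, "s0", [("a", ("x",)), ("b", ("y", "z"))])
bad = conforms(delta, lam, "s0", [("a", ("y",))])
```

A synthesis procedure searches for `delta` and `lam` such that this check succeeds for every scenario while the temporal properties also hold; the check itself is only the verification half of that problem.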

3 Case studies

Fig. 3: Blinded auction. FSM generated for 4 states.
1  contract BlindedAuction {
2    enum State { ST_0, ST_1, ST_2, ST_3 }
3    State private state = State.ST_0;
4    struct Bid {
5      bytes32 blindedBid;
6      uint deposit;
7    }
8    mapping(address => Bid[]) private bids;
9    mapping(address => uint) private pendingReturns;
10   address private highestBidder;
11   uint private highestBid;
12   bool private closed = false;
13
14   function biddingOver() private returns (bool) {
15     return now > creationTime + 5 days;
16   }
17   function revealOver() private returns (bool) {
18     return now >= creationTime + 10 days;
19   }
20
21   function cancel() public {
22     require(state == State.ST_0
23             || state == State.ST_2);
24     if (state == State.ST_0) {
25       _cancel_action(); state = State.ST_3;
26     }
27     if (state == State.ST_2) {
28       _cancel_action(); state = State.ST_3;
29     }
30   }
31
32   function      bid() public {…}
33   function    close() public {…}
34   function withdraw() public {…}
35   function   reveal() public {…}
36   function   finish() public {…}
37   function    unbid() public {…}
38
39   function _bid_action() public {
40     bids[msg.sender].push(Bid({
41       blindedBid: blindedBid,
42       deposit: msg.value
43     }));
44     pendingReturns[msg.sender] += msg.value;
45   }
46
47   function    _close_action() public {…}
48   function   _reveal_action() public {…}
49   function   _finish_action() public {…}
50   function _withdraw_action() public {…}
51   function   _cancel_action() public {…}
52   function    _unbid_action() public {…}
53 }
Fig. 4: Blinded auction. Generated Solidity code.

This section contains case studies that progress from a simple example, given for illustration purposes only, to more realistic examples taken from the literature. For each case study we provide a formal LTL specification and the resulting model generated from this specification and test scenarios.

In this section we extend FSMs with guard conditions. A guard condition is a Boolean expression that labels an FSM transition. Guard conditions affect the semantics of FSMs in the following way: a transition can be executed only if its guard condition is satisfied. We use this notation to label transitions: input event [guard condition] / output action. A test scenario is now a sequence of triples $(e_i, g_i, a_i)$, where $e_i$ is an input event, $g_i$ is a guard condition and $a_i$ is a sequence of output actions. If the sequence of output actions is omitted (as in “input event [guard condition] /” or “input event [guard condition]”), it is implicitly assumed to consist of one action with the same name as the corresponding input event.
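The effect of guard conditions on execution can be sketched in a few lines of Python. The state names `Bidding` and `Revealing`, the environment key `biddingOver` and the two transitions are hypothetical and serve only to show the mechanism; they are not the synthesized model.

```python
from typing import Callable, NamedTuple


class Transition(NamedTuple):
    source: str
    event: str
    guard: Callable[[dict], bool]  # Boolean expression over the environment
    actions: tuple                 # output actions emitted when taken
    target: str


def step(transitions, state, event, env):
    """Take the first transition for (state, event) whose guard holds in env."""
    for t in transitions:
        if t.source == state and t.event == event and t.guard(env):
            return t.target, t.actions
    raise ValueError(f"event {event!r} is not enabled in state {state!r}")


# Hypothetical fragment: bidding is allowed until the period is over,
# closing is allowed only afterwards. Omitted outputs default to the event name.
TRANSITIONS = [
    Transition("Bidding", "bid", lambda env: not env["biddingOver"], ("bid",), "Bidding"),
    Transition("Bidding", "close", lambda env: env["biddingOver"], ("close",), "Revealing"),
]

after_bid = step(TRANSITIONS, "Bidding", "bid", {"biddingOver": False})
after_close = step(TRANSITIONS, "Bidding", "close", {"biddingOver": True})
```

Raising an error when no guard holds plays the role that a failed `require` plays on-chain: the event is rejected and the state is left unchanged.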

In our approach, input events of an FSM correspond to methods of a contract, and output actions correspond to the implementations of those methods. We specify smart contracts with a set of LTL formulas in terms of these events and actions. The formal specification is then combined with test scenarios and provided as input to EFSM-tools to generate an FSM that complies with the given specification and test scenarios. Figure 3 shows an example of a generated FSM. Given the implementation of the FSM output actions and the definitions of the predicates used, the FSM can be translated to executable code. Figure 4 shows an example of generated Solidity code, where lines 2–12 correspond to the contract’s state definition, lines 14–19 to the predicate definitions and lines 40–44 to an action definition (other action definitions are omitted for brevity).
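The translation from FSM transitions to a contract method can be sketched as a small text generator. This is a simplified illustration and not the algorithm implemented in fsmc; the function name `generate_method` and the tuple layout are assumptions. It emits one method per event, with a `require` over the admissible source states and one branch per transition.

```python
def generate_method(event, transitions):
    """Emit Solidity-like source for one contract method.

    transitions: non-empty list of (source_state, guard_or_None, target_state)
    tuples, all labeled by `event`.
    """
    sources = sorted({src for src, _, _ in transitions})
    allowed = " || ".join(f"state == State.{s}" for s in sources)
    lines = [f"function {event}() public {{", f"  require({allowed});"]
    for i, (src, guard, dst) in enumerate(transitions):
        cond = f"state == State.{src}"
        if guard:
            cond += f" && {guard}"          # guard is pasted in as source text
        keyword = "if" if i == 0 else "} else if"
        lines.append(f"  {keyword} ({cond}) {{")
        lines.append(f"    _{event}_action(); state = State.{dst};")
    lines += ["  } else {", "    revert();", "  }", "}"]
    return "\n".join(lines)


# The two cancel transitions visible in the generated contract of Fig. 4.
code = generate_method("cancel", [("ST_0", None, "ST_3"), ("ST_2", None, "ST_3")])
print(code)
```

The sketch chains the branches with `else if` so that at most one transition fires per call, and reverts when no guard holds; whether a production generator should do the same is a design choice outside the scope of this example.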

3.1 Crowdfunding

For illustrative purposes, let us first consider a simplistic example of a crowdfunding platform. In such a platform users can donate coins (denoted with event ) during a predefined period of time which ends when variable becomes true. When the donation period is over, the owner of the campaign can request the collected coins (event ). After that, donors can claim their donations back (event ) and possibly get them back if not enough coins were collected during the campaign (). Intuitively, one can model the logic of this contract using an FSM with two states. More formally, the logic of the described contract is formulated as follows:

  1. cannot happen more than once;

  2. cannot happen after has happened;

  3. cannot happen before ;

  4. can happen only if ;

  5. can happen only if .

With a straightforward translation we can formalize these properties in LTL as follows.

(1)
(2)
(3)
(4)
(5)
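For readers unfamiliar with such translations, informal properties of the kinds listed above are commonly written in LTL using the following patterns. Here $a$ and $b$ stand for arbitrary events and $p$ for an arbitrary predicate; these placeholder symbols and the exact pattern shapes are assumptions of this sketch and need not coincide with the formulas of the specification.

```latex
\begin{align*}
\text{$a$ cannot happen more than once:}\quad
  & \mathbf{G}\,(a \rightarrow \mathbf{X}\,\mathbf{G}\,\neg a) \\
\text{$a$ cannot happen after $b$ has happened:}\quad
  & \mathbf{G}\,(b \rightarrow \mathbf{X}\,\mathbf{G}\,\neg a) \\
\text{$a$ cannot happen before $b$:}\quad
  & b\ \mathbf{R}\ \neg a \\
\text{$a$ can happen only if $p$:}\quad
  & \mathbf{G}\,(a \rightarrow p)
\end{align*}
```

The third pattern uses release rather than until, so it does not additionally require that $b$ eventually happens.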

Figure 5 shows an FSM generated with EFSM-tools from this specification and the set of scenarios , where

Another advantage of using formal logic is that we can now reason about the system. For instance, from properties 1 and 3 we can derive that cannot happen after .

Fig. 5: Crowdfunding. FSM generated for states.

3.2 Blinded auction

Now we consider a more realistic example of a blinded auction taken from [7]. During a predefined period of time after contract creation, users can make hidden bids (denoted with event bid). When this period is over (biddingOver) the auction can be closed (event close), after which follows the next period, when users are allowed to reveal their bids. When the second period is over (revealOver) the auction can be finished (event finish). At any time before finishing, the auction can be canceled (cancel), after which users can claim back their bids (event unbid).

The logic of this blinded auction can be formulated as follows (we provide corresponding LTL formulas alongside).

  1. , and cannot happen more than once:

  2. cannot happen after has happened:

  3. and cannot happen after has happened:

  4. , , and cannot happen after has happened:

  5. and cannot happen before :

  6. cannot happen before :

  7. cannot happen before :

  8. can happen only if :

  9. can happen only if :

This formal LTL specification and a set of test scenarios were used to generate the FSM depicted in Figure 3. Test scenarios for this and the next case study can be found online (https://github.com/d-suvorov/sc-gen). Given the Solidity code associated with the output actions labeling FSM transitions, the full smart contract code can be generated. The code generated from the synthesized FSM is shown in Figure 4.

3.3 License server

This is an example taken from [19]. Here we consider a smart contract that could be used to monitor the execution of the agreement between two parties, namely Licensor and Licensee. We assume that these parties perform as client agents connected to some blockchain network. The contract consists of the following clauses which were copied from [19] and annotated with event names that we are going to use in the rest of this section to model contract execution.

  1. The Licensor grants the Licensee a license to evaluate the Product ().

  2. Licensee must not publish the results of the evaluation () of the Product without the approval () of the Licensor; the approval must be obtained before the publication. If the Licensee publishes results of the evaluation of the Product without approval from the Licensor, the Licensee has 24 hours to remove the material.

  3. The Licensee must not publish comments () on the evaluation of the Product, unless the Licensee is permitted to publish the results of the evaluation.

  4. If the Licensee is commissioned () to perform an independent evaluation of the Product, then the Licensee has the obligation to publish the evaluation results.

  5. This license will automatically terminate if Licensee breaches this Agreement.

There is no timer in a blockchain system that can be used to trigger a transition, which is why we introduce special events and . happens if the results of the evaluation of the Product were not removed in due time. happens if the Licensee was commissioned to perform an independent evaluation and did not publish the results. Thus the LTL specification is less abstract and, at first sight, less intuitive than the informal description above. For instance, to state that it is permitted to use and publish results after getting an approval we introduce the property : “ cannot happen after has happened”. Given that property, and the fact that can only happen after , we can derive that cannot happen after . The latter property can be formalized and verified, which is an advantage of using a formal logic system for specifying a contract.

We introduce LTL specification in two stages. The following formulas encode general principles of the system (e.g., “ cannot happen if nothing has been published”).

(1)
(2)
(3)
(4)
(5)
(6)
(7)
(8)
(9)
(10)
(11)
(12)
(13)
(14)
(15)

The following formulas encode contractual clauses.

(1)
(2)
(3)
(4)
(5)
(6)
(7)

An FSM generated from this specification and test scenarios is shown in Figure 6. Test scenarios are available online (the link was provided in the previous section).

Fig. 6: License. FSM generated for states.

The original contract is formulated in terms of deontic modalities, i.e., in terms of permissions, obligations and related concepts. A shortcoming of specifying this smart contract in temporal logic (and using the specification to synthesize an FSM model) is that these modalities are not tracked: given a sequence of actions, there is no easy way to figure out the permissions and obligations of the contractual parties. On the other hand, the resulting representation is efficient, which is important in the case of on-chain deployment, and could be used to determine whether or not a given sequence of actions leads to contract termination.

4 Conclusion and discussion

We argue that automated program synthesis could find more applications for smart contract generation as they often have simpler structure than general purpose programs and it is an open research question whether a Turing-complete language is necessary for smart contract programming. We would like to draw attention of the community to the problem of automatic synthesis of smart contracts. We provided several case studies to show that LTL synthesis can be applied to generate FSM models for smart contracts of some types. In these models input events correspond to smart contract methods and output actions correspond to these methods’ implementation. Generated FSM models can further be used to obtain programs that are correct with respect to some formal temporal properties.

Our approach can be straightforwardly extended to systems of interacting smart contracts and used to specify, synthesize and verify them. Another interesting way to extend the supported class of verified properties is to incorporate source-level formal verification techniques. Output actions in synthesized FSMs correspond to smart contract methods; hence we can use other verification frameworks to prove source-level properties about these methods and combine them with the temporal properties of the FSM models themselves.

In practice smart contracts receive, hold and send coins and transaction execution costs some amount of gas. It is important to be able to use these concepts to formulate properties of interest about smart contracts. Hence, other possibilities for future work include using SAT or SMT to encode such concepts and extending specification language to incorporate these.

A drawback of using LTL synthesis is that LTL is not expressive enough to formulate properties of the kind “$e_1$ can happen infinitely often (while $e_2$ has not happened)”. For example, the CTL formula $\mathbf{EG}\,f$ can be used to state that there is a path on which $f$ always holds. However, this problem can be easily mitigated by specifying test scenarios in which $e_1$ repeats $n$ times, where $n$ is greater than the number of transitions of the FSM to be generated.
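The reason long repetitions help is a pigeonhole argument: a finite machine that accepts the same event many times in a row must revisit a state, and the resulting cycle lets the event repeat forever. The Python sketch below illustrates this for a deterministic machine without guards; the function name, the dictionary encoding and the three-state example are assumptions made for illustration.

```python
def first_repeated_state(delta, start, event, n):
    """Apply `event` n times from `start`; return the first state visited twice.

    Returns None if no state repeats within n steps. Visiting n + 1 states in a
    machine with at most n states forces a repeat, i.e. a cycle on `event`.
    """
    seen = {start}
    state = start
    for _ in range(n):
        state = delta[(state, event)]
        if state in seen:
            return state
        seen.add(state)
    return None


# Hypothetical machine with three states in which "e" is always enabled.
delta = {("s0", "e"): "s1", ("s1", "e"): "s2", ("s2", "e"): "s1"}

cycle_entry = first_repeated_state(delta, "s0", "e", 3)
too_short = first_repeated_state(delta, "s0", "e", 2)
```

With only two repetitions the scenario does not yet force a cycle in this example, which is why the number of repetitions has to exceed a bound tied to the size of the machine being synthesized.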

Despite the simple remedy for the above problem we believe that a more suitable formal system to specify smart contracts is yet to be identified. It is an interesting research question: how to strike a balance between simplicity and expressivity of this system to allow effective synthesis of practical smart contracts.

Acknowledgments

The authors would like to thank Igor Buzhinsky and Daniil Chivilikhin for their feedback. This work was supported by the Government of Russia (Grant 08-08).

References

  • [1] K. Bhargavan, A. Delignat-Lavaud, C. Fournet, A. Gollamudi, G. Gonthier, N. Kobeissi, N. Kulatova, A. Rastogi, T. Sibut-Pinote, N. Swamy et al., “Formal verification of smart contracts: Short paper,” in Proceedings of the 2016 ACM Workshop on Programming Languages and Analysis for Security. ACM, 2016, pp. 91–96.
  • [2] L. Luu, D.-H. Chu, H. Olickel, P. Saxena, and A. Hobor, “Making smart contracts smarter,” in Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security. ACM, 2016, pp. 254–269.
  • [3] Y. Hirai, “Defining the Ethereum Virtual Machine for interactive theorem provers,” in International Conference on Financial Cryptography and Data Security. Springer, 2017, pp. 520–535.
  • [4] P. Hegedus, “Towards analyzing the complexity landscape of Solidity based Ethereum smart contracts,” Technologies, vol. 7, no. 1, p. 6, 2019.
  • [5] G. Wood et al., “Ethereum: A secure decentralised generalised transaction ledger,” Ethereum project yellow paper, vol. 151, pp. 1–32, 2014.
  • [6] “Solidity documentation: Common patterns,” https://solidity.readthedocs.io/en/develop/common-patterns.html#state-machine.
  • [7] A. Mavridou and A. Laszka, “Designing secure Ethereum smart contracts: A finite state machine based approach,” arXiv preprint arXiv:1711.09327, 2017.
  • [8] V. Ulyantsev, I. Buzhinsky, and A. Shalyto, “Exact finite-state machine identification from scenarios and temporal properties,” International Journal on Software Tools for Technology Transfer, vol. 20, no. 1, pp. 35–55, 2018.
  • [9] “Mythril classic: Security analysis tool for Ethereum smart contracts,” https://github.com/ConsenSys/mythril-classic.
  • [10] “Manticore,” https://github.com/trailofbits/manticore/.
  • [11] I. Nikolić, A. Kolluri, I. Sergey, P. Saxena, and A. Hobor, “Finding the greedy, prodigal, and suicidal contracts at scale,” in Proceedings of the 34th Annual Computer Security Applications Conference. ACM, 2018, pp. 653–663.
  • [12] S. Amani, M. Bégel, M. Bortin, and M. Staples, “Towards verifying Ethereum smart contract bytecode in Isabelle/HOL,” in Proceedings of the 7th ACM SIGPLAN International Conference on Certified Programs and Proofs. ACM, 2018, pp. 66–77.
  • [13] I. Grishchenko, M. Maffei, and C. Schneidewind, “A semantic framework for the security analysis of Ethereum smart contracts,” in International Conference on Principles of Security and Trust. Springer, 2018, pp. 243–269.
  • [14] I. Sergey, A. Kumar, and A. Hobor, “Scilla: a smart contract intermediate-level language,” arXiv preprint arXiv:1801.00687, 2018.
  • [15] F. Schrans, S. Eisenbach, and S. Drossopoulou, “Writing safe smart contracts in Flint,” in Conference Companion of the 2nd International Conference on Art, Science, and Engineering of Programming. ACM, 2018, pp. 218–219.
  • [16] “Bamboo language repository,” https://github.com/CornellBlockchain/bamboo/.
  • [17] M. Coblenz, “Obsidian: a safer blockchain programming language,” in Proceedings of the 39th International Conference on Software Engineering Companion. IEEE Press, 2017, pp. 97–99.
  • [18] Z. Nehai, P.-Y. Piriou, and F. Daumas, “Model-checking of smart contracts,” in IEEE International Conference on Blockchain, Halifax, Canada, 2018.
  • [19] F. Idelberger, G. Governatori, R. Riveret, and G. Sartor, “Evaluation of logic-based smart contracts for blockchain systems,” in International Symposium on Rules and Rule Markup Languages for the Semantic Web. Springer, 2016, pp. 167–183.
  • [20] O. Grumberg, E. Clarke, and D. Peled, Model checking. The MIT Press Cambridge, 1999.
  • [21] A. Pnueli, “The temporal logic of programs,” in 18th Annual Symposium on Foundations of Computer Science (sfcs 1977). IEEE, 1977, pp. 46–57.
  • [22] N. Walkinshaw, R. Taylor, and J. Derrick, “Inferring extended finite state machine models from software executions,” Empirical Software Engineering, vol. 21, no. 3, pp. 811–853, 2016.
  • [23] P. Faymonville, B. Finkbeiner, M. N. Rabe, and L. Tentrup, “Encodings of bounded synthesis,” in Tools and Algorithms for the Construction and Analysis of Systems, 2017, pp. 354–370.
  • [24] P. Faymonville, B. Finkbeiner, and L. Tentrup, “BoSy: An experimentation framework for bounded synthesis,” in Computer Aided Verification. Cham: Springer, 2017, pp. 325–332.