Quantitative Analysis of Smart Contracts

01/10/2018 ∙ by Krishnendu Chatterjee, et al. ∙ Hebrew University of Jerusalem ∙ Institute of Science and Technology Austria

Smart contracts are computer programs that are executed by a network of mutually distrusting agents, without the need for an external trusted authority. Smart contracts handle and transfer assets of considerable value (in the form of crypto-currency like Bitcoin). Hence, it is crucial that their implementation is bug-free. We identify the utility (or expected payoff) of interacting with such smart contracts as the basic and canonical quantitative property for such contracts. We present a framework for such quantitative analysis of smart contracts. Such a formal framework poses novel research challenges in programming languages, as it requires modeling of game-theoretic aspects to analyze incentives for deviation from honest behavior, and modeling of utilities which are not specified as standard temporal properties such as safety and termination. While game-theoretic incentives have been analyzed in the security community, their analysis has been restricted to the very special case of stateless games. However, to analyze smart contracts, stateful analysis is required as it must account for the different program states of the protocol. Our main contributions are as follows: we present (i) a simplified programming language for smart contracts; (ii) an automatic translation of the programs to state-based games; (iii) an abstraction-refinement approach to solve such games; and (iv) experimental results on real-world-inspired smart contracts.


1 Introduction

In this work we present a quantitative stateful game-theoretic framework for formal analysis of smart contracts.

Smart contracts. Hundreds of crypto-currencies are in use today, and investments in them are increasing steadily [22]. These currencies are not controlled by any central authority like governments or banks; instead, they are governed by the blockchain protocol, which dictates the rules and determines the outcomes, e.g., the validity of money transactions and account balances. Blockchain was initially used for peer-to-peer Bitcoin payments [43], but more recently it has also been used for running programs (called smart contracts). A smart contract is a program that runs on the blockchain, which enforces its correct execution (i.e., that it is running as originally programmed). This is done by encoding semantics in crypto-currency transactions. For example, Bitcoin transaction scripts allow users to specify conditions, or contracts, which the transactions must satisfy prior to acceptance. Transaction scripts can encode many useful functions, such as validating that a payer owns a coin she is spending or enforcing rules for multi-party transactions. The Ethereum crypto-currency [14] allows arbitrary stateful Turing-complete conditions over the transactions, which gives rise to smart contracts that can implement a wide range of applications, such as financial instruments (e.g., financial derivatives or wills) or autonomous governance applications (e.g., voting systems). The protocols are globally specified and their implementation is decentralized. Therefore, there is no central authority and they are immutable. Hence, the economic consequences of critical bugs in a smart contract cannot be reverted.

Types of Bugs. There are two types of bugs with monetary consequences:

  1. Coding errors. Similar to standard programs, bugs could arise from coding mistakes. In one reported case [31], mistakenly replacing the += operation with =+ enabled the loss of tokens that were backed by $800,000 of investment.

  2. Dishonest interaction incentives. Smart contracts do not fully dictate the behavior of participants. They only specify the outcome (e.g., penalty or rewards) of the behaviors. Hence, a second source of bugs is the high-level interaction aspects that could give a participant an unfair advantage and an incentive for dishonest behavior. For example, a naive design of a rock-paper-scissors game [27] allows playing sequentially, rather than concurrently, and gives an advantage to the second player, who can see the opponent’s move.

DAO attack: interaction of two types of bugs. Interestingly, a coding bug can incentivize dishonest behavior, as in the famous DAO attack [48]. The Decentralized Autonomous Organization (DAO) [37] is an Ethereum smart contract [51]. The contract implements an investor-directed venture capital fund. On June 17, 2016 an attacker exploited a bug in the contract to extract $80 million [48]. Intuitively, the root cause was that the contract allowed users to first get hold of their funds, and only then updated their balance records, while a semantic detail allowed the attacker to withdraw multiple times before the update.

Necessity of formal framework. Since bugs in smart contracts have direct economic consequences and are irreversible, they have the same status as safety-critical errors for programs and reactive systems and must be detected before deployment. Moreover, smart contracts are deployed rapidly. There are over a million smart contracts in Ethereum, holding over 15 billion dollars at the time of writing [29]. It is impossible for security researchers to analyze all of them, and lack of automated tools for programmers makes them error prone. Hence, a formal analysis framework for smart contract bugs is of great importance.

Utility analysis. In verification of programs, specifying objectives is non-trivial and a key goal is to consider specification-less verification, where basic properties are considered canonical. For example, termination is a basic property in program analysis; and data-race freedom or serializability are basic properties in concurrency. Given these properties, models are verified with respect to them without considering any other specification. For smart contracts, describing the correct specification that prevents dishonest behavior is more challenging due to the presence of game-like interactions. We propose to consider the expected user utility (or payoff) that is guaranteed even in presence of adversarial behavior of other agents as a canonical property. Considering malicious adversaries is standard in game theory. For example, the expected utility of a fair lottery is $0$. An analysis reporting a different utility signifies a bug.
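As a worked instance of the fair-lottery claim, consider a hypothetical lottery (an illustrative assumption, not an example taken from the paper) in which $n$ participants each pay a ticket price $c$ and one of them, chosen uniformly at random, receives the whole pot $nc$. The expected utility of a single participant is then:

```latex
\mathbb{E}[\text{utility}]
  = \frac{1}{n}\,(nc - c) + \frac{n-1}{n}\,(-c)
  = \frac{(n-1)\,c}{n} - \frac{(n-1)\,c}{n}
  = 0
```

Under these assumptions, an analysis that reports any other value for a participant indicates that the implementation deviates from the intended lottery.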

New research challenges. Coding bugs are detected by classic verification, program analysis, and model checking tools [21, 38]. However, a formal framework for incentivization bugs presents a new research challenge for the programming language community. Their analysis must overcome two obstacles: (a) the framework will have to handle game-theoretic aspects to model interactions and incentives for dishonest behavior; and (b) it will have to handle properties that cannot be deduced from standard temporal properties such as safety or termination, but require analysis of monetary gains (i.e., quantitative properties).

While game-theoretic incentives are widely analyzed by the security community (e.g., see [11]), their analysis is typically restricted to the very special case of one-shot games that do not consider different states of the program, and thus the consequences of decisions on the next state of the program are ignored. In addition, their analysis is typically ad hoc and stems from brainstorming and special techniques. This could work when very few protocols existed (e.g., when Bitcoin first emerged) and deep thought was put into making them elegant and analyzable. However, the fast deployment of smart contracts makes it crucial to automate the process and make it accessible to programmers.

Our contribution. In this work we present a formal framework for quantitative analysis of utilities in smart contracts. Our contributions are as follows:

  1. We present a simplified (loop-free) programming language that allows game-theoretic interactions. We show that many classical smart contracts can be easily described in our language, and conversely, a smart contract programmed in our language can be easily translated to Solidity [28], which is the most popular Ethereum smart contract language.

  2. The underlying mathematical model for our language is stateful concurrent games. We automatically translate programs in our language to such games.

  3. The key challenge to analyze such game models automatically is to tackle the state-space explosion. While several abstraction techniques have been considered for programs [45, 34, 12], they do not work for game-theoretic models with quantitative objectives. We present an approach based on interval-abstraction for reducing the states, establish soundness of our abstraction, and present a refinement process. This is our core technical contribution.

  4. We present experimental results on several classic real-world smart contracts. We show that our approach can handle contracts that otherwise give rise to games with up to states. While special cases of concurrent games (namely, turn-based games) have been studied in verification and reactive synthesis, there are no practical methods to solve general concurrent quantitative games. To the best of our knowledge, there are no tools to solve quantitative concurrent games other than academic examples with few states, and we present the first practical method to solve quantitative concurrent games that scales to real-world smart contract analysis.

In summary, our contributions range from (i) modeling of smart contracts as state-based games, to (ii) an abstraction-refinement approach to solve such games, to (iii) experimental results on real-world smart contracts.

Organization. We start with an overview of smart contracts in Section 2. Our programming language is introduced in Section 3 along with implementations of real-world contracts in this language. We then present state-based concurrent games and translation of contracts to games in Section 4. The abstraction-refinement methodology for games is presented in Section 5 followed by experimental results in Section 6. Section 7 presents a comparison with related work and Section 8 concludes the paper with suggestions for future research.

2 Background on Ethereum smart contracts

2.1 Programmable smart contracts

Ethereum [14] is a decentralized virtual machine, which runs programs called contracts. Contracts are written in a Turing-complete bytecode language, called Ethereum Virtual Machine (EVM) bytecode [53]. A contract is invoked by calling one of its functions, where each function is defined by a sequence of instructions. The contract maintains a persistent internal state and can receive (transfer) currency from (to) users and other contracts. Users send transactions to the Ethereum network to invoke functions. Each transaction may contain input parameters for the contract and an associated monetary amount, possibly $0$, which is transferred from the user to the contract.

Upon receiving a transaction, the contract collects the money sent to it, executes a function according to input parameters, and updates its internal state. All transactions are recorded on a decentralized ledger, called the blockchain. A sequence of transactions that begins from the creation of the network uniquely determines the state of each contract and the balances of users and contracts. The blockchain does not rely on a trusted central authority; rather, each transaction is processed by a large network of mutually untrusted peers called miners. Users constantly broadcast transactions to the network. Miners add transactions to the blockchain via a proof-of-work consensus protocol [43].

We illustrate contracts by an example (Figure 1) which implements a contract that rewards users who solve a satisfiability problem. Rather than programming it directly as EVM bytecode, we use Solidity, a widely-used programming language which compiles into EVM bytecode [28]. This contract has one variable, balance. Users can send money by calling deposit, and the balance is updated according to the sent amount, which is specified by the msg.value keyword. Users submit a solution by calling submitSolution and giving values to input parameters a, b, c and d. If the input is a valid solution, the user who called the function, denoted by the keyword msg.sender, is rewarded by balance. We note that when several users submit valid solutions, only the first user will be paid. Formally, if two users $U_1$ and $U_2$ submit transactions $T_1$ and $T_2$ to the contract that invoke submitSolution and have valid input parameters, then user $U_1$ is paid if and only if $T_1$ appears in the blockchain before $T_2$.

contract SAT {
    uint balance;
    // Anyone can add to the reward pool.
    function deposit() payable {
        balance += msg.value;
    }
    // Pays the whole pool to the first caller who supplies a satisfying assignment.
    function submitSolution( bool a, bool b, bool c, bool d ) {
        if( (a || b || !c) && (!a || c || !d) ) {
            msg.sender.send(balance);
            balance = 0;
        }
    }
}
Figure 1: Smart contract that rewards users satisfying $(a \lor b \lor \lnot c) \land (\lnot a \lor c \lor \lnot d)$.
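To make the reward condition concrete, the following minimal Python sketch enumerates every input that submitSolution in Figure 1 would accept. The helper name `is_valid_solution` is hypothetical and only mirrors the Boolean condition of the contract; it does not model payments or transaction ordering.

```python
from itertools import product


def is_valid_solution(a: bool, b: bool, c: bool, d: bool) -> bool:
    # Mirrors the condition checked by submitSolution in Figure 1.
    return (a or b or not c) and (not a or c or not d)


# Enumerate all 16 assignments and keep those the contract would reward.
solutions = [bits for bits in product([False, True], repeat=4)
             if is_valid_solution(*bits)]

for a, b, c, d in solutions:
    print(f"a={a}, b={b}, c={c}, d={d}")
```

Because many assignments satisfy the formula, the interesting question for this contract is not who can find a solution but whose transaction is recorded first.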

Subtleties. In this work, for simplicity, we ignore some details in the underlying protocol of Ethereum smart contract. We briefly describe these details below:

  • Transaction fees. In exchange for including her transactions in the blockchain, a user pays transaction fees to the miners, proportionally to the execution time of her transaction. This fact could slightly affect the monetary analysis of the user gain, but could also introduce bugs in a program, as there is a bound on execution time that cannot be exceeded. Hence, it is possible that some functions could never be called, or even worse, a user could actively give input parameters that would prevent other users from invoking a certain function.

  • Recursive invocation of contracts. A contract function could invoke a function in another contract, which in turn can have a call to the original contract. The underlying Ethereum semantics of recursive invocation were the root cause of the notorious DAO hack [25].

  • Behavior of the miners. Previous works have suggested that smart contracts could be implemented to encourage miners to deviate from their honest behavior [50]. This could in theory introduce bugs into a contract, e.g., a contract might give unfair advantage for a user who is a big miner.

2.2 Tokens and user utility

A user’s utility is determined by the Ether she spends and receives, but could also be affected by the state of the contract. Most notably, smart contracts are used to issue tokens, which can be viewed as a stake in a company or an organization, in return for an Ether (or tokens) investment (see an example in Figure 2). These tokens are transferable among users and are traded in exchanges in return for Ether, Bitcoin and fiat money. At the time of writing, smart contracts instantiate tokens worth billions of dollars [30]. Hence, gaining or losing tokens has clear utility for the user. At a larger scope, user utility could also be affected by more abstract storage changes. Some users would be willing to pay to have a contract declare them as Kings of Ether [40], while others could gain from registering their domain name in a smart contract storage [39]. In the examples provided in this work we mainly focus on utility that arises from Ether, tokens and the like. However, our approach is general and can model any form of utility by introducing auxiliary utility variables and definitions.

contract Token {
    mapping(address=>uint) balances;
    // Tokens are issued one-for-one against the Ether sent with the call.
    function buy() payable {
        balances[msg.sender] += msg.value;
    }
    // Moves tokens between holders if the sender's balance covers the amount.
    function transfer( address to, uint amount ) {
        if(balances[msg.sender]>=amount) {
            balances[msg.sender] -= amount;
            balances[to] += amount;
        }
    }
}
Figure 2: Token contract example.
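The bookkeeping in Figure 2 can be mimicked by a small in-memory model. The Python sketch below is only an illustration under simplifying assumptions: addresses are plain strings, there is no gas or blockchain, and the class name `Token` and the boolean return value of `transfer` are choices made here rather than part of the Solidity contract.

```python
from collections import defaultdict


class Token:
    """Toy in-memory model of the Token contract in Figure 2 (not EVM-accurate)."""

    def __init__(self) -> None:
        # mapping(address => uint): every address starts with balance 0.
        self.balances = defaultdict(int)

    def buy(self, sender: str, value: int) -> None:
        # `payable`: the Ether sent with the call is credited as tokens.
        self.balances[sender] += value

    def transfer(self, sender: str, to: str, amount: int) -> bool:
        # Same guard as Figure 2; `amount` is a uint there, so it cannot be negative.
        if 0 <= amount <= self.balances[sender]:
            self.balances[sender] -= amount
            self.balances[to] += amount
            return True
        return False  # the Solidity version simply does nothing in this case


token = Token()
token.buy("alice", 10)
token.transfer("alice", "bob", 4)    # succeeds
token.transfer("alice", "bob", 100)  # ignored: alice holds only 6 tokens
print(dict(token.balances))
```

In this model the total number of tokens changes only through `buy`, which is the kind of state-dependent quantity that a utility definition for a token holder would refer to.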

3 Programming Language for Smart Contracts

In this section we present our programming language for smart contracts that supports concurrent interactions between parties. A party denotes an agent that decides to interact with the contract. A contract is a tuple $C = (N, I, M, R, X_0, F, T)$ where $X := N \cup I \cup M$ is a set of variables, $R$ describes the range of values that can be stored in each variable, $X_0$ is the initial values stored in variables, $F$ is a list of functions and $T$ describes, for each function, the time segment in which it can be invoked. We now formalize these concepts.

Variables. There are three distinct and disjoint types of variables in the variable set $X$:

  • $N$ contains “numeric” variables that can store a single integer.

  • $I$ contains “identification” (“id”) variables capable of pointing to a party in the contract by her address or storing Null. The notion of ids is quite flexible in our approach: the only dependence on ids is that they should be distinct and an id should not act on behalf of another id. We simply use different integers to denote distinct ids and assume that a “faking of identity” does not happen. In Ethereum this is achieved by digital signatures.

  • $M$ is the set of “mapping” variables. Each $m \in M$ maps parties to integers.

Bounds and Initial values. The tuple $R(x) = (\underline{R}(x), \overline{R}(x))$ represents lower and upper bounds for integer values that can be stored in a variable $x$. For example, if $n \in N$, then $n$ can only store integers between $\underline{R}(n)$ and $\overline{R}(n)$. Similarly, if $m \in M$ is a mapping and $i \in I$ stores an address to a party in the contract, then $m[i]$ can save integers between $\underline{R}(m)$ and $\overline{R}(m)$. The function $X_0$ assigns an initial value to every variable. The assigned value is an integer in case of numeric and mapping variables, i.e., a mapping variable maps everything to its initial value by default. Id variables can either be initialized by Null or an id used by one of the parties.

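Functions and Timing. The sequence $F = \langle f_1, f_2, \ldots, f_n \rangle$ is a list of functions and $T$ assigns a time interval to each of them, where $T(f_i) = (\underline{T}(f_i), \overline{T}(f_i))$. The function $f_i$ can only be invoked in time-frame $[\underline{T}(f_i), \overline{T}(f_i)]$. The contract uses a global clock, for example the current block number in the blockchain, to keep track of time.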

Note that we consider a single contract, and interaction between multiple contracts is a subject of future work.

3.1 Syntax

We provide a simple overview of our contract programming language. Our language is syntactically similar to Solidity and a translation mechanism for different aspects is discussed in Section 3.5. An example contract, modeling a game of rock-paper-scissors, is given in Figure 3. Here, a party, called issuer has issued the contract and taken the role of Alice. Any other party can join the contract by registering as Bob and then playing rock-paper-scissors. To demonstrate our language, we use a bidding mechanism. A more exact treatment of the syntax using a formal grammar can be found in Appendix 0.A.1.

(0) contract RPS {
      map Bids[0, 100] = 0;
      id Alice = issuer;
      id Bob = null;
      numeric played[0,1] = 0;
      numeric AliceWon[0,1] = 0;
      numeric BobWon[0,1] = 0;
      numeric bid[0, 100] = 0;
      numeric AlicesMove[0,3] = 0;
      numeric BobsMove[0,3] = 0;
      //0 denotes no choice,
      //1 rock, 2 paper,
      //3 scissors

(1)   function registerBob[1,10]
          (payable bid : caller) {
(2)     if(Bob==null) {
(3)       Bob = caller;
(4)       Bids[Bob]=bid;
        }
        else{
(5)       payout(caller, bid);
        }
(6)   }

(7)   function play[11, 15]
          (AlicesMove:Alice = 0,
          BobsMove:Bob = 0,
          payable Bids[Alice]: Alice){
(8)     if(played==1)
(9)       return;
        else
(10)      played = 1;
(11)    if(BobsMove==0 and AlicesMove!=0)
(12)      AliceWon = 1;
(13)    else if(AlicesMove==0 and BobsMove!=0)
(14)      BobWon = 1;
(15)    else if(AlicesMove==0 and BobsMove==0)
        {
(16)      AliceWon = 0;
(17)      BobWon = 0;
        }
(18)    else if(AlicesMove==BobsMove+1 or
            AlicesMove==BobsMove-2)
(19)      AliceWon = 1;
        else
(20)      BobWon = 1;
(21)  }

(22)  function getReward[16,20]() {
(23)    if(caller==Alice and AliceWon==1
            or caller==Bob and BobWon==1)
        {
(24)      payout(caller, Bids[Alice] + Bids[Bob]);
(25)      Bids[Alice] = 0;
(26)      Bids[Bob] = 0;
        }
(27)  }
    }
Figure 3: A rock-paper-scissors contract.

Declaration of Variables. The program begins by declaring variables, their type, name, range and initial value. For example, Bids is a map variable that assigns a value between 0 and 100 to every id. This value is initially 0. Line numbers (labels) are defined in Section 3.2 below and are not part of the syntax. (Footnote 1: For simplicity, we demonstrate our method with global variables only. However, the method is applicable to general variables as long as their ranges are well-defined at each point of the program.)

Declaration of Functions. After the variables, the functions are defined one-by-one. Each function begins with the keyword function followed by its name and the time interval in which it can be called by parties. Then comes a list of input parameters. Each parameter is of the form variable : party, which means that the designated party can choose a value for that variable. The chosen value is required to be in the range specified for that variable. The keyword caller denotes the party that has invoked this function and payable signifies that the party should not only decide a value, but must also pay the amount she decides. For example, registerBob can be called at any time between 1 and 10 by any of the parties. At each such invocation the party that has called this function must pay some amount, which will be saved in the variable bid. After the decisions and payments are done, the contract proceeds with executing the function.

Types of Functions. There are essentially two types of functions, depending on their parameters. One-party functions, such as registerBob and getReward, require parameters from caller only, while multi-party functions, such as play, ask several, potentially different, parties for input. In this case all parties provide their input decisions and payments concurrently and without being aware of the choices made by other parties. A default value is also specified for every decision in case a relevant party does not take part.

Summary. Putting everything together, in the contract specified in Figure 3, any party can claim the role of Bob between time 1 and time 10 by paying a bid to the contract, if the role is not already occupied. Then at time 11 one of the parties calls play and both parties have until time 15 to decide which choice (rock, paper, scissors or none) they want to make. Then the winner can call getReward and collect her prize.
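The branching at labels (11) to (20) of Figure 3 can be traced with a short Python sketch. It is a literal transcription of the listing's conditions, written here for illustration; the function name `play_outcome` is hypothetical, and bids, timing and the `played` flag are left out.

```python
def play_outcome(alices_move: int, bobs_move: int) -> tuple[int, int]:
    """Return (AliceWon, BobWon) following labels (11)-(20) of Figure 3.

    Moves: 0 = no choice, 1 = rock, 2 = paper, 3 = scissors.
    """
    alice_won, bob_won = 0, 0
    if bobs_move == 0 and alices_move != 0:
        alice_won = 1          # Bob made no choice
    elif alices_move == 0 and bobs_move != 0:
        bob_won = 1            # Alice made no choice
    elif alices_move == 0 and bobs_move == 0:
        pass                   # nobody chose, nobody wins
    elif alices_move == bobs_move + 1 or alices_move == bobs_move - 2:
        alice_won = 1          # paper>rock, scissors>paper, rock>scissors
    else:
        bob_won = 1            # every remaining combination
    return alice_won, bob_won


for a in range(4):
    for b in range(4):
        print(a, b, play_outcome(a, b))
```

Read literally, the final else branch also covers the case where both parties pick the same non-zero move, so in this transcription such a round is awarded to Bob.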

3.2 Semantics

In this section we present the details of the semantics. In our programming language there are several key aspects which are non-standard in programming languages, such as the notion of time progress, concurrency, and interactions of several parties. Hence we present a detailed description of the semantics. We start with the requirements.

Requirements. In order for a contract to be considered valid, other than following the syntax rules, a few more requirements must be met, which are as follows:

  • We assume that no division by zero or similar undefined behavior happens.

  • To have a well-defined message passing, we also assume that no multi-party function has an associated time interval intersecting that of another function.

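  • Finally, for each non-id variable $x$ with bounds $(\underline{R}(x), \overline{R}(x))$ and initial value $X_0(x)$, it must hold that $\underline{R}(x) \le X_0(x) \le \overline{R}(x)$ and similarly, for every function $f_i$ with time-frame $[\underline{T}(f_i), \overline{T}(f_i)]$, we must have $\underline{T}(f_i) < \overline{T}(f_i)$.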

Overview of time progress. Initially, the time is $0$. Let $F_t$ be the set of functions executable at time $t$, i.e., $F_t = \{ f_i \in F \mid \underline{T}(f_i) \le t \le \overline{T}(f_i) \}$, where $[\underline{T}(f_i), \overline{T}(f_i)]$ is the time-frame of $f_i$. Then $F_t$ is either empty, or contains one or more one-party functions, or consists of a single multi-party function. We consider the following cases:

  • $F_t$ empty. If $F_t$ is empty, then nothing can happen until the clock ticks.

  • Execution of one-party functions. If $F_t$ contains one or more one-party functions, then each of the parties can call any subset of these functions at time $t$. If there are several calls at the same time, the contract might run them in any order. While a function call is being executed, all parties are able to see the full state of the contract, and can issue new calls. When there are no more requests for function calls, the clock ticks and the time is increased to $t+1$. When a call is being executed and is at the beginning part of the function, its caller can send messages or payments to the contract. Values of these messages and payments will then be saved in designated variables and the execution continues. If the caller fails to make a payment or specify a value for a decision variable, or if her specified values/payments are not in the range of their corresponding variables, i.e., they are too small or too big, the call gets canceled and the contract reverts any changes to variables due to the call and continues as if this call had never happened.

  • Execution of multi-party functions. If $F_t$ contains a single multi-party function $f_i$ and $t < \overline{T}(f_i)$, then any party can send messages and payments to the contract to specify values for variables that are designated to be paid or decided by her. These choices are hidden and cannot be observed by other participants. She can also change her decisions as many times as she sees fit. The clock ticks when there are no more valid requests for setting a value for a variable or making a payment. This continues until we reach time $\overline{T}(f_i)$. At this time parties can no longer change their choices and the choices become visible to everyone. The contract proceeds with execution of the function. If a party fails to make a payment/decision or if Null is asked to make a payment or a decision, default behavior will be enforced. The default value for payments is $0$ and default behavior for other variables is defined as part of the syntax. For example, in function play of Figure 3, if a party does not choose, a default value of 0 is enforced and, given the rest of this function, this will lead to a definite loss.
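The case distinction above depends on which functions are executable at a given time and on the requirement that a multi-party function never shares a time step with another function. A minimal Python sketch of both checks follows, using the intervals of Figure 3; the names `Function`, `executable_at` and `intervals_are_valid` are hypothetical helpers introduced only for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Function:
    name: str
    start: int         # lower end of the time-frame
    end: int           # upper end of the time-frame
    multi_party: bool


# Intervals taken from Figure 3.
FUNCTIONS = [
    Function("registerBob", 1, 10, multi_party=False),
    Function("play", 11, 15, multi_party=True),
    Function("getReward", 16, 20, multi_party=False),
]


def executable_at(t: int, functions=FUNCTIONS) -> list[Function]:
    """Functions whose time-frame contains t."""
    return [f for f in functions if f.start <= t <= f.end]


def intervals_are_valid(functions=FUNCTIONS) -> bool:
    """No multi-party function may share a time step with any other function."""
    for f in functions:
        for g in functions:
            if f is not g and (f.multi_party or g.multi_party):
                if f.start <= g.end and g.start <= f.end:
                    return False
    return True


print([f.name for f in executable_at(12)])
print(intervals_are_valid())
```

With these intervals, every time step falls into exactly one of the three cases: no function, one-party functions only, or the single multi-party function play.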

Given the notion of time progress we proceed to formalize the notion of “runs” of the contract. This requires the notion of labels, control-flow graphs, valuations, and states, which we describe below.

Labels. Starting from $0$, we give the contract, beginning and end points of every function, and every command a label. The labels are given in order of appearance. As an example, see the labels in parentheses in Figure 3.

Entry and Exit Labels. We denote the first (beginning point) label in a function $f_i$ by $\square_i$ and its last (end point) label by $\blacksquare_i$.

Control Flow Graphs (CFGs). We define the control flow graph $CFG_i$ of the function $f_i$ in the standard manner, i.e., $CFG_i = (V, E)$, where there is a vertex corresponding to every labeled entity inside $f_i$. We do not distinguish an entity, its label and its corresponding vertex. Each edge $e \in E$ has a condition $\mathrm{cond}(e)$ which is a boolean expression that must be true when traversing that edge. For example, Figure 4 is an illustration of the control flow graph of function play in our example contract. For a more formal treatment see Appendix 0.A.2.

Figure 4: Control Flow Graph of play() (left) and its edge conditions (right). Edge conditions: played==1; BobsMove==0 and AlicesMove!=0; AlicesMove==0 and BobsMove!=0; AlicesMove==0 and BobsMove==0; AlicesMove==BobsMove+1 or AlicesMove==BobsMove-2.

Valuations. A valuation is a function val, assigning a value to every variable. Values for numeric variables must be integers in their range, values for identity variables can be party ids or Null, and a value assigned to a map variable $m$ must be a function $\mathrm{val}(m)$ such that for each identity $i$, we have $\underline{R}(m) \le \mathrm{val}(m)(i) \le \overline{R}(m)$, where $\underline{R}(m)$ and $\overline{R}(m)$ are the lower and upper bounds of $m$. Given a valuation, we extend it to expressions containing mathematical operations in the straightforward manner.

States. A state of the contract is a tuple $s = (t, b, l, \mathrm{val}, c)$, where $t$ is a time stamp, $b$ is the current balance of the contract, i.e., the total amount of payment to the contract minus the total amount of payouts, $l$ is a label (that is being executed), val assigns values to variables and $c$ is the caller of the current function. The case $c = \bot$ corresponds to the case where the caller is undefined, e.g., when no function is being executed. We use $S$ to denote the set of all states that can appear in a run of the contract as defined below.

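Runs. A run $\rho$ of the contract is a finite sequence of states, starting from the initial state $(0, 0, 0, X_0, \bot)$ (time $0$, balance $0$, label $0$, the initial valuation $X_0$ and no caller), that follows all rules of the contract and ends in a state with time-stamp $t > \max_{f_i} \overline{T}(f_i)$, i.e., after the time-frame of every function has passed. These rules must be followed when switching to a new state in a run: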

  • The clock can only tick when there are no valid pending requests for running a one-party function or deciding or paying in multi-party functions.

  • Transitions that happen when the contract is executing a function must follow its control flow graph and update the valuation correctly.

  • No variable can contain an out-of-bounds value. If an overflow or underflow happens, the closest possible value will be saved. This rule also ensures that the contract will not create new money, given that paying more than the current balance of the contract results in an underflow.

  • Each party can call any set of the functions at any time.

This definition is formalized in Appendix 0.A.2.3.
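The out-of-bounds rule amounts to saturating arithmetic. The Python sketch below illustrates it under one stated assumption: a payout larger than the contract balance is capped at the balance, which is one way to read the remark that underflow prevents the contract from creating money. The helper names `saturate` and `payout` are hypothetical.

```python
def saturate(value: int, lower: int, upper: int) -> int:
    """Store the closest value inside [lower, upper]."""
    return max(lower, min(upper, value))


def payout(balance: int, amount: int) -> tuple[int, int]:
    """Pay `amount` out of a contract balance that cannot drop below 0.

    Returns (new_balance, amount_actually_paid). Capping the paid amount is an
    assumption of this sketch, not a rule quoted from the paper.
    """
    new_balance = saturate(balance - amount, 0, balance)
    return new_balance, balance - new_balance


print(saturate(105, 0, 100))   # a bid above the range [0, 100]
print(payout(10, 25))          # asking for more than the contract holds
```

In this reading the sum of the new balance and the amount actually paid always equals the old balance, so no money appears or disappears.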

Remark 1

Note that in our semantics each function body completes its execution in a single tick of the clock. However, ticks might contain more than one function call and execution.

Run prefixes. We use $H$ to mean the set of all prefixes of runs and denote the last state in $\eta \in H$ by $\mathrm{end}(\eta)$. A run prefix $\eta'$ is an extension of $\eta$ if it can be obtained by adding one state to the end of $\eta$.

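Probability Distributions. Given a finite set $A$, a probability distribution on $A$ is a function $\delta : A \to [0, 1]$ such that $\sum_{a \in A} \delta(a) = 1$. Given such a distribution, its support, $\mathrm{Supp}(\delta)$, is the set of all $a \in A$ such that $\delta(a) > 0$. We denote the set of all probability distributions on $A$ by $\Delta(A)$.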

Typically for programs it suffices to define runs for the semantics. However, given that there are several parties in contracts, their semantics depends on the possible choices of the parties. Hence we need to define policies for parties, and such policies will define probability distribution over runs, which constitute the semantics for contracts. To define policies we first define moves.

Moves. We use $\mathcal{M}$ for the set of all moves. The moves that can be taken by parties in a contract can be summarized as follows:

  • Calling a function $f_i$; we denote this by $\mathrm{call}(f_i)$.

  • Making a payment whose amount, $y$, is saved in $x$; we denote this by $\mathrm{pay}(x, y)$.

  • Deciding the value of $x$ to be $y$; we denote this by $\mathrm{decide}(x, y)$.

  • Doing none of the above; we denote this by $\boxtimes$.

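Permitted Moves. We define $P_i : S \to 2^{\mathcal{M}}$, where $S$ is the set of states and $\mathcal{M}$ the set of moves, so that $P_i(s)$ is the set of permitted moves for the party with identity $i$ if the contract is in state $s = (t, b, l, \mathrm{val}, c)$. It is formally defined as follows:

  • If $f_j$ is a function that can be called at state $s$, then $\mathrm{call}(f_j) \in P_i(s)$.

  • If $l$ is the first label of a function $f_j$ and $x$ is a variable that can be decided by $i$ at the beginning of the function $f_j$, then $\mathrm{decide}(x, y) \in P_i(s)$ for all permissible values of $y$. Similarly, if $x$ can be paid by $i$, then $\mathrm{pay}(x, y) \in P_i(s)$.

  • $\boxtimes \in P_i(s)$, i.e., doing nothing is always permitted.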

Policies and Randomized Policies. Let $H$ be the set of run prefixes, $\mathcal{M}$ the set of moves and $P_i(s)$ the set of moves permitted to party $i$ in state $s$. A policy $\pi_i$ for party $i$ is a function $\pi_i : H \to \mathcal{M}$, such that for every $\eta \in H$, $\pi_i(\eta) \in P_i(\mathrm{end}(\eta))$. Intuitively, a policy is a way of deciding what move to use next, given the current run prefix. A policy profile $\pi = (\pi_i)$ is a sequence assigning one policy to each party $i$. The policy profile $\pi$ defines a unique run $\rho^{\pi}$ of the contract which is obtained when parties choose their moves according to $\pi$. A randomized policy $\xi_i$ for party $i$ is a function $\xi_i : H \to \Delta(\mathcal{M})$, such that $\mathrm{Supp}(\xi_i(\eta)) \subseteq P_i(\mathrm{end}(\eta))$. A randomized policy assigns a probability distribution over all possible moves for party $i$ given the current run prefix of the contract; the party can then follow it by choosing a move randomly according to the distribution. We use $\Xi$ to denote the set of all randomized policy profiles, $\Xi_i$ for randomized policies of $i$ and $\Xi_{-i}$ to denote the set of randomized policy profiles for all parties except $i$. A randomized policy profile $\xi$ is a sequence $(\xi_i)$ assigning one randomized policy to each party. Each such randomized policy profile induces a unique probability measure on the set of runs, which is denoted as $\mathrm{Prob}^{\xi}[\cdot]$. We denote the expectation measure associated to $\mathrm{Prob}^{\xi}[\cdot]$ by $\mathbb{E}^{\xi}[\cdot]$.

3.3 Objective function and values of contracts

As mentioned in the introduction we identify expected payoff as the canonical property for contracts. The previous section defines expectation measure given randomized policies as the basic semantics. Given the expected payoff, we define values of contracts as the worst-case guaranteed payoff for a given party. We formalize the notion of objective function (the payoff function).

Objective Function. An objective for a party is in one of the following forms:

  • , where is the total money received by party from the contract (by “payout” statements) and is the total money paid by to the contract (as “payable” parameters).

  • An expression containing mathematical and logical operations (addition, multiplication, subtraction, integer division, and, or, not) and variables chosen from the set . Here is the set of numeric variables, ’s are the values that can be saved inside maps. (Footnote 2: We are also assuming, as in many programming languages, that True = 1 and False = 0.)

  • A sum of the previous two cases.

Informally, is trying to choose her moves so as to maximize .

Run Outcomes. Given a run of the program and an objective for party , the outcome is the value of computed using the valuation at for all variables and accounting for payments in to compute and .

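Contract Values. Since we consider worst-case guaranteed payoff, we consider that there is an objective $o$ for a single party $p$ which she tries to maximize and all other parties are adversaries who aim to minimize $o$. Formally, given a contract $C$ and an objective $o$ for party $p$, we define the value of contract as:

$$V(C, o, p) := \sup_{\xi_p \in \Xi_p} \; \inf_{\xi_{-p} \in \Xi_{-p}} \; \mathbb{E}^{(\xi_p, \xi_{-p})} \left[ \kappa(\rho, o, p) \right]$$

Here $\xi_p$ ranges over the randomized policies of $p$, $\xi_{-p}$ ranges over the randomized policy profiles of all parties except $p$, and $\kappa(\rho, o, p)$ is the outcome of the run $\rho$. This corresponds to $p$ trying to maximize the expected value of $o$ and all other parties maliciously colluding to minimize it. In other words, it provides the worst-case guarantee for party $p$, irrespective of the behavior of the other parties, which in the worst case is adversarial to party $p$.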

3.4 Examples

One contribution of our work is to present the simplified programming language, and to show that this simple language can express several classical smart contracts. To demonstrate its applicability, we present several examples of classical smart contracts in this section. In each example, we present a contract and a “buggy” implementation of the same contract that has a different value. In Section 6 we show that our automated approach to analyzing the contracts can compute contract values with enough precision to differentiate between the correct and the buggy implementation. All of our examples are motivated by well-known bugs that have occurred in real-life Ethereum contracts.

3.4.1 Rock-Paper-Scissors.

Let our contract be the one specified in Figure 3 and assume that we want to analyze it from the point of view of the issuer . Also, let the objective function be Intuitively, this means that winning the rock-paper-scissors game is considered to have an additional value of , other than the spending and earnings. The idea behind this is similar to the case with chess tournaments, in which players not only win a prize, but can also use their wins to achieve better “ratings”, so winning has extra utility.

A common bug in writing rock-paper-scissors is allowing the parties to move sequentially, rather than concurrently [27]. If parties can move sequentially and the issuer moves after Bob, then she can ensure a utility of , i.e. her worst-case expected reward is . However, in the correct implementation as in Figure 3, the best strategy for both players is to bid and then Alice can win the game with probability by choosing each of the three options with equal probability. Hence, her worst-case expected reward is .
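The gap between the sequential and the concurrent variant can be checked by brute force. The sketch below is an illustration only: it looks at Alice's probability of winning and ignores bids and the extra utility of a win, and the names `win_probability`, `pure` and `UNIFORM` are hypothetical helpers, not part of the contract language.

```python
from fractions import Fraction

MOVES = ("rock", "paper", "scissors")
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}


def alice_wins(alice, bob):
    """True if Alice's move beats Bob's move."""
    return BEATS[alice] == bob


def win_probability(alice_mix, bob_mix):
    """Probability that Alice wins when both sides play mixed strategies."""
    return sum(alice_mix[a] * bob_mix[b]
               for a in MOVES for b in MOVES if alice_wins(a, b))


def pure(move):
    """A mixed strategy that puts all probability on one move."""
    return {m: Fraction(int(m == move)) for m in MOVES}


UNIFORM = {m: Fraction(1, 3) for m in MOVES}

# Concurrent moves: uniform play by Alice guarantees 1/3 against every reply,
concurrent_lower = min(win_probability(UNIFORM, pure(b)) for b in MOVES)
# and uniform play by Bob holds every move of Alice to 1/3.
concurrent_upper = max(win_probability(pure(a), UNIFORM) for a in MOVES)

# Sequential moves with Alice second: she best-responds to the move she sees.
sequential = min(max(int(alice_wins(a, b)) for a in MOVES) for b in MOVES)

print(concurrent_lower, concurrent_upper, sequential)
```

Since the lower and upper bounds coincide, Alice's worst-case winning probability under concurrent moves is exactly 1/3, whereas moving second lets her win with certainty.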

3.4.2 Auction.

Consider an open auction, in which during a fixed time interval everyone is allowed to bid for the good being sold and everyone can see others’ bids. When the bidding period ends a winner emerges and every other participant can get their money back. Let the variable HighestBid store the value of the highest bid made at the auction. Then for a party , one can define the objective as:

This is of course assuming that the good being sold is worth precisely as much as the highest bid. A correctly written auction should return a value of to every participant, because those who lose the auction must get their money back and the party that wins pays precisely the highest bid. The contract in Figure 5 (left) is an implementation of such an auction. However, it has a slight problem. The function bid allows the winner to reduce her bid. This bug is fixed in the contract on the right.

    contract BuggyAuction {
      map Bids[0,1000] = 0;
      numeric HighestBid[0,1000] = 0;
      id Winner = null;
      numeric bid[0,1000] = 0;

      function bid[1,10]
      (payable bid : caller) {
        payout(caller, Bids[caller]);
        Bids[caller]=bid;
        if(bid>HighestBid)
        {
          HighestBid = bid;
          Winner = caller;
        }
      }

      function withdraw[11,20]()
      {
        if(caller!=Winner)
        {
          payout(caller, Bids[caller]);
          Bids[caller]=0;
        }
      }
    }

    contract Auction {
      map Bids[0,1000] = 0;
      numeric HighestBid[0,1000] = 0;
      id Winner = null;
      numeric bid[0,1000] = 0;

      function bid[1,10]
      (payable bid : caller) {
        if(bid<Bids[caller])
          return;
        payout(caller, Bids[caller]);
        Bids[caller]=bid;
        if(bid>HighestBid)
        {
          HighestBid = bid;
          Winner = caller;
        }
      }

      function withdraw[11,20]()
      {
        if(caller!=Winner)
        {
          payout(caller, Bids[caller]);
          Bids[caller]=0;
        }
      }
    }
Figure 5: A buggy auction contract (left) and its fixed version (right).
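The effect of the bug can be reproduced with a small simulation of the `bid` function. This is a sketch under stated assumptions, not the authors' tool: the class `AuctionModel` is hypothetical, the objective is assumed to be net cash flow plus the value of the good (taken to be `HighestBid`) for the winner, and a payable amount is assumed to be transferred even when the function returns early.

```python
class AuctionModel:
    """Toy model of the bid function of the two auction contracts."""

    def __init__(self, buggy):
        self.buggy = buggy
        self.bids = {}        # Bids map, default 0
        self.highest_bid = 0  # HighestBid
        self.winner = None    # Winner
        self.paid = {}        # total sent to the contract, per party
        self.received = {}    # total paid out by the contract, per party

    def payout(self, party, amount):
        self.received[party] = self.received.get(party, 0) + amount

    def bid(self, caller, amount):
        # ASSUMPTION: the payable parameter is transferred when the call is made.
        self.paid[caller] = self.paid.get(caller, 0) + amount
        if not self.buggy and amount < self.bids.get(caller, 0):
            return  # fixed contract: a bid may not be lowered
        self.payout(caller, self.bids.get(caller, 0))
        self.bids[caller] = amount
        if amount > self.highest_bid:
            self.highest_bid = amount
            self.winner = caller

    def objective(self, party):
        # ASSUMED objective: net cash flow, plus HighestBid if the party won.
        net = self.received.get(party, 0) - self.paid.get(party, 0)
        return net + (self.highest_bid if self.winner == party else 0)


def run(buggy, bids):
    """Let a single party place the given bids and return her objective."""
    auction = AuctionModel(buggy)
    for amount in bids:
        auction.bid("alice", amount)
    return auction.objective("alice")


exploit_on_buggy = run(True, [100, 1])   # bid high, then lower the bid
exploit_on_fixed = run(False, [100, 1])  # same moves on the fixed contract
honest_on_fixed = run(False, [100])

print(exploit_on_buggy, exploit_on_fixed, honest_on_fixed)
```

In this model, lowering the bid on the buggy contract refunds the first bid while the party remains the winner, so her objective is strictly positive. On the fixed contract the same sequence of moves gains nothing.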

3.4.3 Three-Way Lottery.

Consider a three-party lottery contract issued by $p$ as in Figure 6 (left). Note that division is considered to be integer division, discarding the remainder. The other two players can sign up by buying tickets worth 1 unit each. Then each of the players is supposed to choose a nonce uniformly at random. A combination of these nonces produces the winner with equal probability for all three parties. If a person does not make a choice or pay the fees, she will certainly lose the lottery. The rules are such that if the other two parties choose the same nonce, which is supposed to happen with probability $1/3$, then the issuer wins. Otherwise the winner is chosen according to the parity of the sum of the nonces. This gives everyone a winning probability of $1/3$ if all sides play uniformly at random. However, even if one of the sides refuses to play uniformly at random, the resulting probabilities of winning stay the same because each side’s probability of winning is independent of her own choice assuming that others are playing randomly.

    contract BuggyLottery {
      id issuer = p;
      id Alice = null;
      id Bob = null;
      id Winner = null;
      numeric deposit[0,1] = 0;
      numeric AlicesChoice[0,3] = 0;
      numeric BobsChoice[0,3] = 0;
      numeric IssuersChoice[0,3] = 0;
      numeric sum[0,9]=0;

      function buyTicket[1,10]
      (payable deposit:caller)
      {
        if(deposit!=1) return;
        if(Alice==null) Alice=caller;
        else if(Bob==null) Bob=caller;
        else payout(caller, deposit);
      }

      function play[11,20]
      (AlicesChoice:Alice = 0,
       BobsChoice:Bob = 0,
       IssuersChoice:issuer = 0,
       payable deposit:issuer)
      {
        if(AlicesChoice==0 or BobsChoice==0)
          Winner = issuer;
        else if(IssuersChoice==0 or deposit==0)
          Winner = Alice;
        else if(AlicesChoice==BobsChoice)
          Winner = issuer;
        else
        {
          sum = AlicesChoice+BobsChoice+IssuersChoice;
          if(sum/2*2==sum)
            Winner = Alice;
          else
            Winner = Bob;
        }
      }

      function withdraw[21,30]()
      {
        if(caller==Winner)
          payout(caller, 3);
      }
    }

    contract Lottery {
      id issuer = p;
      id Alice = null;
      id Bob = null;
      id Winner = null;
      numeric deposit[0,1] = 0;
      numeric AlicesChoice[0,3] = 0;
      numeric BobsChoice[0,3] = 0;
      numeric IssuersChoice[0,3] = 0;
      numeric sum[0,9]=0;

      function buyTicket[1,10]
      (payable deposit:caller)
      {
        if(deposit!=1) return;
        if(Alice==null) Alice=caller;
        else if(Bob==null) Bob=caller;
        else payout(caller, deposit);
      }

      function play[11,20]
      (AlicesChoice:Alice = 0,
       BobsChoice:Bob = 0,
       IssuersChoice:issuer = 0,
       payable deposit:issuer)
      {
        if(AlicesChoice==0 or BobsChoice==0)
          Winner = issuer;
        else if(IssuersChoice==0 or deposit==0)
          Winner = Alice;
        else
        {
          sum = AlicesChoice+BobsChoice+IssuersChoice;
          if(sum/3*3==sum)
            Winner = Alice;
          else if(sum/3*3==sum-1)
            Winner = Bob;
          else
            Winner = issuer;
        }
      }

      function withdraw[21,30]()
      {
        if(caller==Winner)
          payout(caller, 3);
      }
    }
Figure 6: A buggy lottery contract (left) and its fixed version (right).

We assume that the issuer has objective . This is because the winner can take other players’ money. In a bug-free contract we will expect the value of this objective to be , given that winning has a probability of . However, the bug here is due to the fact that other parties can collude. For example, the same person might register as both Alice and Bob and then opt for different nonces. This will ensure that the issuer loses. The bug can be solved as in the contract in Figure 6 (right). In that contract, one’s probability of winning is if she honestly plays uniformly at random, no matter what other parties do.
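The collusion argument can be verified by enumerating nonces. The following sketch mirrors only the final branches of the two `play` functions, assuming that every party has paid and chosen a nonce in 1..3 (so the branches for a choice of 0 never fire). The function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

NONCES = (1, 2, 3)  # 0 means "no choice" in the contracts and is excluded here


def buggy_winner(alice, bob, issuer):
    """Winner rule of BuggyLottery once everyone has paid and chosen."""
    if alice == bob:
        return "issuer"
    total = alice + bob + issuer
    return "Alice" if total % 2 == 0 else "Bob"  # sum/2*2==sum tests parity


def fixed_winner(alice, bob, issuer):
    """Winner rule of Lottery once everyone has paid and chosen."""
    total = alice + bob + issuer
    if total % 3 == 0:   # sum/3*3==sum
        return "Alice"
    if total % 3 == 1:   # sum/3*3==sum-1
        return "Bob"
    return "issuer"


def issuer_win_probability(winner, alice, bob):
    """Issuer plays uniformly at random; Alice and Bob play fixed nonces."""
    wins = sum(winner(alice, bob, i) == "issuer" for i in NONCES)
    return Fraction(wins, len(NONCES))


pairs = list(product(NONCES, repeat=2))
buggy_worst = min(issuer_win_probability(buggy_winner, a, b) for a, b in pairs)
fixed_worst = min(issuer_win_probability(fixed_winner, a, b) for a, b in pairs)
fixed_best = max(issuer_win_probability(fixed_winner, a, b) for a, b in pairs)

print(buggy_worst, fixed_worst, fixed_best)
```

In the buggy rule, colluding players who pick different nonces drive the issuer's winning probability to zero. In the fixed rule the issuer's uniform choice makes the sum uniform modulo 3, so her winning probability does not depend on the other two nonces.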

3.4.4 Token Sale.

Consider a contract that sells tokens modeling some aspect of the real world, e.g. shares in a company. At first anyone can buy tokens at a fixed price of unit per token. However, there are a limited number of tokens available and at most of them are meant to be sold. The tokens can then be transferred between parties, which is the subject of our next example. For now, Figure 7 (left) is an implementation of the selling phase. However, there is a big problem here. The problem is that one can buy any number of tokens as long as there is at least one token remaining. For example, one might first buy tokens and then buy another . If we analyze the contract from the point of view of a solo party with objective , then it must be capped by in a bug-free contract, while the process described above leads to a value of . The fixed contract is in Figure 7 (right). This bug is inspired by a very similar real-world bug described in [52].
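A minimal simulation of the two `buy` functions illustrates the overselling. The purchase amounts (999, then 1000) are chosen here for illustration. The class `SaleModel` is hypothetical, and out-of-range assignments are assumed to saturate at the declared bounds [0,2000], which is an assumption of this sketch rather than a statement about the language semantics.

```python
def clamp(value, low=0, high=2000):
    """ASSUMPTION: bounded variables saturate at their declared bounds."""
    return max(low, min(high, value))


class SaleModel:
    """Toy model of the buy function of BuggySale and Sale."""

    def __init__(self, buggy):
        self.buggy = buggy
        self.balance = {}      # balance map, default 0
        self.remaining = 1000  # tokens still for sale

    def buy(self, caller, payment):
        if self.buggy:
            refuse = self.remaining <= 0           # only checks "sold out"
        else:
            refuse = self.remaining - payment < 0  # checks the requested amount
        if refuse:
            return False  # the contract pays the money back
        self.balance[caller] = clamp(self.balance.get(caller, 0) + payment)
        self.remaining = clamp(self.remaining - payment)
        return True


def tokens_after(buggy, purchases):
    """Tokens held by a single buyer after a sequence of purchases."""
    sale = SaleModel(buggy)
    for payment in purchases:
        sale.buy("solo", payment)
    return sale.balance.get("solo", 0)


buggy_tokens = tokens_after(True, [999, 1000])
fixed_tokens = tokens_after(False, [999, 1000])

print(buggy_tokens, fixed_tokens)
```

With these amounts the buggy contract hands out 1999 tokens although only 1000 were meant to be sold, while the fixed contract refuses the second purchase.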

3.4.5 Token Transfer.

Consider the same bug-free token sale as in the previous example; we now add a function for transferring tokens. An owner can choose a recipient and an amount less than or equal to her balance and transfer that many tokens to the recipient. Figure 8 (left) is an implementation of this concept. Taking the same approach and objective as above, we expect a similar result. However, there is again an important bug in this code. What happens if a party transfers tokens to herself? She gets free extra tokens! This has been fixed in the contract on the right. This example models a real-world bug as in [42].

    contract BuggySale {
      map balance[0,2000] = 0;
      numeric remaining[0,2000] = 1000;
      numeric payment[0,2000] = 0;

      function buy[1,10]
      (payable payment:caller)
      {
        if(remaining<=0){
          payout(caller, payment);
          return;
        }
        balance[caller] += payment;
        remaining -= payment;
      }
    }

    contract Sale {
      map balance[0,2000] = 0;
      numeric remaining[0,2000] = 1000;
      numeric payment[0,2000] = 0;

      function buy[1,10]
      (payable payment:caller)
      {
        if(remaining-payment<0){
          payout(caller, payment);
          return;
        }
        balance[caller] += payment;
        remaining -= payment;
      }
    }
Figure 7: A buggy token sale (left) and its fixed version (right).
    contract BuggyTransfer {
      map balance[0,2000] = 0;
      numeric remaining[0,2000] = 1000;
      numeric payment[0,2000] = 0;
      numeric amount[0,2000] = 0;
      numeric fromBalance[0,2000] = 0;
      numeric toBalance[0,2000] = 0;
      id recipient = null;

      function buy[1,10]...

      function transfer[1,10](
        recipient : caller
        amount : caller) {
        fromBalance = balance[caller];
        toBalance = balance[recipient];
        if(fromBalance<amount)
          return;
        fromBalance -= amount;
        toBalance += amount;
        balance[caller] = fromBalance;
        balance[recipient] = toBalance;
      }
    }

    contract Transfer {
      map balance[0,2000] = 0;
      numeric remaining[0,2000] = 1000;
      numeric payment[0,2000] = 0;
      numeric amount[0,2000] = 0;
      id recipient = null;

      function buy[1,10]...

      function transfer[1,10](
        recipient : caller
        amount : caller) {
        if(balance[caller]<amount)
          return;
        balance[caller] -= amount;
        balance[recipient] += amount;
      }
    }
Figure 8: A buggy transfer function (left) and its fixed version (right).
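The self-transfer bug comes from reading both balances into temporaries before writing either back. A direct transcription of the two `transfer` bodies makes this visible. This is an illustrative sketch; the dictionary-based `balance` and the function names are hypothetical stand-ins for the contract state.

```python
def buggy_transfer(balance, caller, recipient, amount):
    """Mirrors BuggyTransfer: both balances are cached before any update."""
    from_balance = balance.get(caller, 0)
    to_balance = balance.get(recipient, 0)
    if from_balance < amount:
        return
    from_balance -= amount
    to_balance += amount
    balance[caller] = from_balance
    balance[recipient] = to_balance  # overwrites the debit if recipient == caller


def fixed_transfer(balance, caller, recipient, amount):
    """Mirrors Transfer: the map is updated in place, one step at a time."""
    if balance.get(caller, 0) < amount:
        return
    balance[caller] = balance.get(caller, 0) - amount
    balance[recipient] = balance.get(recipient, 0) + amount


buggy_state = {"alice": 10}
buggy_transfer(buggy_state, "alice", "alice", 4)  # transfer to herself

fixed_state = {"alice": 10}
fixed_transfer(fixed_state, "alice", "alice", 4)

print(buggy_state["alice"], fixed_state["alice"])
```

For transfers between two distinct parties both versions behave identically; only the case of a caller transferring to herself differs.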

3.5 Translation to Solidity

In this section we discuss the problem of translating contracts from our programming language to Solidity, which is a widely-used language for programming contracts in Ethereum. There are two aspects in our language that are not automatically present in Solidity: (i) the global clock, and (ii) concurrent choices and payments by participants. We describe the two aspects below:

  • Translation of Timing and the Clock. The global clock can be modeled by the number of blocks in the blockchain. Solidity code is able to reference the blockchain. Given that a new block arrives roughly every to seconds, the number of blocks that have been added to the blockchain since the inception of the contract, or a constant multiple of it, can quantify the passage of time.

  • Translation of Concurrent Interactions. Concurrent choices and payments can be implemented in Solidity using commitment schemes and digital signatures, which are standard tools in cryptography and cryptocurrencies. All parties first commit to their choice and then when they can no longer change it, unmask it. Commitment schemes can be extended to payments by requiring everyone to pay a fixed amount which is more than the value they are committing to and then returning the excess amount after unmasking.

Hence contracts in our language can be automatically translated to Solidity.
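The commit-then-unmask idea can be sketched off-chain with a salted hash. The code below is an illustration in Python rather than Solidity, and it is only one common way to build a commitment scheme. The helper names `commit` and `reveal_is_valid` are hypothetical, and SHA-256 with a random 32-byte salt is an assumption of this sketch.

```python
import hashlib
import secrets


def commit(choice: int):
    """Return (commitment, salt). Only the commitment is published at first."""
    salt = secrets.token_bytes(32)  # hides the choice until it is unmasked
    digest = hashlib.sha256(salt + choice.to_bytes(32, "big")).hexdigest()
    return digest, salt


def reveal_is_valid(commitment: str, choice: int, salt: bytes) -> bool:
    """Check an unmasked (choice, salt) pair against the earlier commitment."""
    digest = hashlib.sha256(salt + choice.to_bytes(32, "big")).hexdigest()
    return digest == commitment


# Phase 1: each party publishes a commitment, so nobody can react to the others.
alice_commitment, alice_salt = commit(2)
bob_commitment, bob_salt = commit(3)

# Phase 2: once all commitments are fixed, the parties unmask their choices.
alice_ok = reveal_is_valid(alice_commitment, 2, alice_salt)
alice_cheats = reveal_is_valid(alice_commitment, 1, alice_salt)

print(alice_ok, alice_cheats)
```

Because a party cannot find a second choice matching her published commitment, the choices behave as if they had been made simultaneously.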

4 Bounded Analysis and Games

Since smart contracts can be easily described in our programming language, and programs in our programming language can be translated to Solidity, the main aim is to automatically compute values of contracts (i.e., compute guaranteed payoff for parties). In this section, we introduce the bounded analysis problem for our programming language framework, and present concurrent games, which are the underlying mathematical framework for the bounded analysis problem.

4.1 Bounded analysis

As is standard in verification, we consider the bounded analysis problem, where the number of parties and the number of function calls are bounded. In standard program analysis, bugs are often detected with a small number of processes, or a small number of context switches between concurrent threads. In the context of smart contracts, we analogously assume that the number of parties and function calls are bounded.

Contracts with bounded number of parties and function calls. Formally, a contract with bounded number of parties and function calls is as follows:

  • Let be a contract and , we define as an equivalent contract that can have at most parties. This is achieved by letting be the set of all possible ids in the contract. The set must contain all ids that are in the program source, therefore is at least the number of such ids. Note that this does not restrict that ids are controlled by unique users, and a real-life user can have several different ids. We only restrict the analysis to bounded number of parties interacting with the smart contract.

  • To ensure runs are finite, the number of function calls by each party is also bounded. Specifically, each party can call each function at most once during each time frame, i.e. between two consecutive ticks of the clock. This closely resembles real-life contracts in which one’s ability to call many functions is limited by the capacity of a block in the blockchain, given that the block must save all messages. For a more rigorous treatment see Appendix 0.A.2.

4.2 Concurrent Games

The programming language framework we consider has interacting agents that act simultaneously, and we have the program state. We present the mathematical framework of concurrent games, which are games played on finite state spaces with concurrent interaction between the players.

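Concurrent Game Structures. A concurrent two-player game structure is a tuple $G = (S, s_0, A, \Gamma_1, \Gamma_2, \delta)$, where $S$ is a finite set of states, $s_0 \in S$ is the start state, $A$ is a finite set of actions, $\Gamma_1, \Gamma_2 : S \to 2^A \setminus \{\emptyset\}$ are such that $\Gamma_i$ assigns to each state $s$ a non-empty set $\Gamma_i(s) \subseteq A$ of actions available to player $i$ at $s$, and finally $\delta : S \times A \times A \to S$ is a transition function that assigns to every state $s \in S$ and action pair $a_1 \in \Gamma_1(s), a_2 \in \Gamma_2(s)$ a successor state $\delta(s, a_1, a_2) \in S$.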

Plays and Histories. The game starts at state . At each state , player 1 chooses an action and player 2 chooses an action . The choices are made simultaneously and independently. The game subsequently transitions to the new state and the same process continues. This leads to an infinite sequence of tuples which is called a play of the game. We denote the set of all plays by . Every finite prefix of a play is called a history and the set of all histories is denoted by . If is a history, we denote the last state appearing according to , i.e. , by . We also define as the empty history.

Strategies and Mixed strategies. A strategy is a recipe that describes for a player the action to play given the current game history. Formally, a strategy for player is a function , such that . A pair of strategies for the two players is called a strategy profile. Each such induces a unique play. A mixed strategy for player assigns a probability distribution over actions, given the history of the game. Intuitively, such a strategy suggests a distribution of actions to player at each step and then she plays one of them randomly according to that distribution. Of course it must be the case that . A pair of mixed strategies for the two players is called a mixed strategy profile. Note that mixed strategies generalize strategies with randomization. Every mixed strategy profile induces a unique probability measure on the set of plays, which is denoted as , and the associated expectation measure is denoted by .

State and History Utilities. In a game structure , a state utility function for player 1 is of the form . Intuitively, this means that when the game enters state , player 1 receives a reward of . State utilities can be extended to history utilities. We define the utility of a history to be the sum of utilities of all the states included in that history. Formally, if , then . Given a play , we denote the utility of its prefix of length by .
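As an illustration of these definitions, a concurrent game structure with state utilities can be represented directly as a few dictionaries. This is a minimal sketch with hypothetical names (`ConcurrentGame`, `play_prefix`, `history_utility`). For convenience, a pure strategy here receives the current state in addition to the history, and a history of length n is taken to consist of n (state, action, action) tuples starting at the start state.

```python
from dataclasses import dataclass


@dataclass
class ConcurrentGame:
    start: str      # start state
    actions1: dict  # state -> actions available to player 1
    actions2: dict  # state -> actions available to player 2
    delta: dict     # (state, a1, a2) -> successor state
    utility: dict   # state -> reward of player 1 on entering the state


def play_prefix(game, strategy1, strategy2, length):
    """History of the given length induced by two pure strategies."""
    history = []
    state = game.start
    for _ in range(length):
        a1 = strategy1(tuple(history), state)  # both choices are made
        a2 = strategy2(tuple(history), state)  # without seeing the other's
        if a1 not in game.actions1[state] or a2 not in game.actions2[state]:
            raise ValueError("strategy chose an unavailable action")
        history.append((state, a1, a2))
        state = game.delta[(state, a1, a2)]
    return history


def history_utility(game, history):
    """Sum of the utilities of all states included in the history."""
    return sum(game.utility[state] for state, _, _ in history)


IDLE = "-"
single = {"start": ("h", "t"), "win": (IDLE,), "lose": (IDLE,), "end": (IDLE,)}
game = ConcurrentGame(
    start="start",
    actions1=single,
    actions2=single,
    delta={
        ("start", "h", "h"): "win", ("start", "t", "t"): "win",
        ("start", "h", "t"): "lose", ("start", "t", "h"): "lose",
        ("win", IDLE, IDLE): "end", ("lose", IDLE, IDLE): "end",
        ("end", IDLE, IDLE): "end",
    },
    utility={"start": 0, "win": 1, "lose": 0, "end": 0},
)


def always_heads(history, state):
    return "h" if state == "start" else IDLE


def always_tails(history, state):
    return "t" if state == "start" else IDLE


matched = play_prefix(game, always_heads, always_heads, 3)
mismatched = play_prefix(game, always_heads, always_tails, 3)
print(history_utility(game, matched), history_utility(game, mismatched))
```

The example game is a one-shot matching-pennies round in which player 1 earns a reward of 1 exactly when the two actions match.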

Games. A game is a pair (, ) where is a game structure and is a utility function for player 1. We assume that player 1 is trying to maximize , while player 2’s goal is to minimize it.

Values. The $k$-step finite-horizon value of a game $(G, u)$ is defined as

$$v_k(G) := \sup_{\sigma_1} \; \inf_{\sigma_2} \; \mathbb{E}^{(\sigma_1, \sigma_2)} \left[ u_k(\pi) \right] \qquad (1)$$

where $\sigma_i$ iterates over all possible mixed strategies of player $i$ and $u_k(\pi)$ is the utility of the prefix of length $k$ of the play $\pi$. This models the fact that player 1 is trying to maximize the utility in the first $k$ steps of the run, while player 2 is minimizing it. The values of games can be computed using the standard value-iteration (dynamic programming) algorithm. A formal treatment of the standard algorithms for games is presented in Appendix 0.B.1.
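A value-iteration pass can be sketched as follows. This is not the algorithm of Appendix 0.B.1 but a simplified illustration under stated assumptions: the per-state matrix game is solved exactly only when player 1 has at most two actions (a general solver would use a linear program), a prefix of length k is assumed to collect the utilities of its k states starting from the start state, and all function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def matrix_game_value(rows):
    """Max-min value for the row player; exact for one or two rows only."""
    if len(rows) == 1:
        return min(rows[0])
    if len(rows) != 2:
        raise NotImplementedError("sketch handles at most two row actions")
    top, bottom = rows
    columns = range(len(top))

    def guaranteed(x):
        # Worst-case payoff when the top row is played with probability x.
        return min(x * top[j] + (1 - x) * bottom[j] for j in columns)

    # The optimum lies at an endpoint or where two column lines intersect.
    candidates = [Fraction(0), Fraction(1)]
    for j, k in combinations(columns, 2):
        denom = (top[j] - bottom[j]) - (top[k] - bottom[k])
        if denom != 0:
            x = Fraction(bottom[k] - bottom[j], denom)
            if 0 < x < 1:
                candidates.append(x)
    return max(guaranteed(x) for x in candidates)


def finite_horizon_values(states, actions1, actions2, delta, utility, k):
    """k-step values for every state, computed by backward induction."""
    values = {s: Fraction(0) for s in states}  # zero steps collect nothing
    for _ in range(k):
        updated = {}
        for s in states:
            rows = [[values[delta[(s, a1, a2)]] for a2 in actions2[s]]
                    for a1 in actions1[s]]
            updated[s] = utility[s] + matrix_game_value(rows)
        values = updated
    return values


IDLE = "-"
states = ("start", "win", "lose", "end")
acts = {"start": ("h", "t"), "win": (IDLE,), "lose": (IDLE,), "end": (IDLE,)}
delta = {
    ("start", "h", "h"): "win", ("start", "t", "t"): "win",
    ("start", "h", "t"): "lose", ("start", "t", "h"): "lose",
    ("win", IDLE, IDLE): "end", ("lose", IDLE, IDLE): "end",
    ("end", IDLE, IDLE): "end",
}
utility = {"start": 0, "win": 1, "lose": 0, "end": 0}

two_step = finite_horizon_values(states, acts, acts, delta, utility, 2)["start"]
print(two_step)
```

In the matching-pennies example, player 1 collects the reward only if the actions match, and her best mixed strategy guarantees this with probability 1/2.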

Remark 2

Note that in (1), limiting player 2 to pure strategies does not change the value of the game. Hence, we can assume that player 2 is an arbitrarily powerful nondeterministic adversary and get the exact same results.

4.3 Translating contracts to games

The translation from bounded smart contracts to games is straightforward: the states of the concurrent game encode the states of the contract. Correspondences between objects in the contract and game are as follows: (a) moves in contracts with actions in games; (b) run prefixes in contracts with histories in games; (c) runs in contracts with plays in games; and (d) policies (resp., randomized policies) in contracts with strategies (resp., mixed strategies) in games. Note that since all runs of the bounded contract are finite and have a limited length, we can apply $k$-step finite-horizon analysis to the resulting game, where $k$ is the maximal length of a run in the contract. This gives us the following theorem:

Theorem 4.1 (Correspondence)

Given a bounded contract for a party with objective , a concurrent game can be constructed such that value of this game, , is equal to the value of the bounded contract, .

Details of the translation of smart contracts to games and the proof of the theorem above are relegated to Appendix 0.B.2.

Remark 3

Note that in standard programming languages, where there is no interaction, the underlying mathematical models are graphs. In contrast, the smart-contract programming languages we consider involve game-theoretic interaction, and hence concurrent games on graphs are the underlying mathematical model.

5 Abstraction for Quantitative Concurrent Games

Abstraction is a key technique to handle large-scale systems. In the previous section we described that smart contracts can be translated to games, but due to state-space explosion (since we allow integer variables), the resulting state space of the game is huge. Hence, we need techniques for abstraction, as well as refinement of abstraction, for concurrent games with quantitative utilities. In this section we present such abstraction refinement for quantitative concurrent games, which is our main technical contribution in this paper. We prove soundness of our approach and its completeness in the limit. Then, we introduce a specific method of abstraction, called interval abstraction, which we apply to the games obtained from contracts and show that soundness and refinement are inherited from the general case. We also provide a heuristic for faster refining of interval abstractions for games obtained from contracts.

5.1 Abstraction for quantitative concurrent games

Abstraction considers a partition of the state space, and reduces the number of states by taking each partition set as a state. In case of transition systems (or graphs) the standard technique is to consider existential (or universal) abstraction to define transitions between the partition sets. However, for game-theoretic interactions such abstraction ideas are not enough. We now describe the key intuition for abstraction in concurrent games with quantitative objectives and formalize it. We also provide a simple example for illustration.

Abstraction idea and key intuition. In an abstraction the state space of the game is partitioned into several abstract states, where an abstract state represents a set of states of the original game. Intuitively, an abstract state represents a set of similar states of the original game. Given an abstraction our goal is to define two games that can provide lower and upper bound on the value of the original game. This leads to the concepts of lower and upper abstraction.

  • Lower abstraction. The lower abstraction represents a lower bound on the value. Intuitively, the utility is assigned as minimal utility among states in the partition, and when an action profile can lead to different abstract states, then the adversary, i.e. player 2, chooses the transition.

  • Upper abstraction. The upper abstraction represents an upper bound on the value. Intuitively, the utility is assigned as maximal utility among states in the partition, and when an action profile can lead to different abstract states, then player 1 chooses between the possible states.

Informally, the lower abstraction gives more power to the adversary, player 2, whereas the upper abstraction is favorable to player 1.

General abstraction for concurrent games. Given a game consisting of a game structure and a utility function , and a partition of , the lower and upper abstractions, and , of with respect to are defined as:

  • , where is a set of dummy states for giving more power to one of the players. Members of are called abstracted states.

  • The start state of is in the start state of and , i.e. .

  • . Each action in abstracted games either corresponds to an action in the original game or to a choice of the next state.

  • If two states , are in the same abstracted state , then they must have the same set of available actions for both players, i.e.  and . Moreover, inherits these action sets. Formally, and .

  • For all and and , we have . Similarly for and , . This means that all transitions from abstract states in go to the corresponding dummy abstract state in D.

  • If is a dummy abstract state, then let be the set of all partition sets that can be reached from by in . Then in , is a singleton, i.e., player 1 has no choice, and , i.e., player 2 can choose which abstract state is the next. Conversely, in , is a singleton and player 2 has no choice, while and player 1 chooses the next abstract state.

  • In line with the previous point, and for all and available actions and .

  • We have and . The utility of a non-dummy abstracted state in , resp. , is the minimal, resp. maximal, utility among the normal states included in it. Also, for each dummy state , we have .

Given a partition of , either (i) there is no lower or upper abstraction corresponding to it because it puts states with different sets of available actions together; or (ii) there is a unique lower and upper abstraction pair. Hence we will refer to the unique abstracted pair of games by specifying only.

Remark 4

Dummy states are introduced for conceptual clarity in explaining the ideas because in lower abstraction all choices are assigned to player 2 and upper abstraction to player 1. However, in practice, there is no need to create them, as the choices can be allowed to the respective players in the predecessor state.
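The construction without dummy states can be sketched as a function that takes a partition and returns the ingredients of both abstractions. This is an illustrative sketch with hypothetical names: `lower_u` and `upper_u` are the minimal and maximal utilities of each partition set, and `reach` records, for each partition set and action pair, the partition sets among which player 2 (lower abstraction) or player 1 (upper abstraction) chooses.

```python
def abstract(partition, actions1, actions2, delta, utility):
    """Ingredients of the lower and upper abstractions for a partition.

    partition is a list of sets of states; the index of a set is its
    abstract state. Raises ValueError if a set mixes different action sets.
    """
    block_of = {s: i for i, block in enumerate(partition) for s in block}
    lower_u, upper_u, reach = {}, {}, {}
    for i, block in enumerate(partition):
        members = sorted(block)
        rep = members[0]
        for s in members:
            if actions1[s] != actions1[rep] or actions2[s] != actions2[rep]:
                raise ValueError("states in one set must share action sets")
        lower_u[i] = min(utility[s] for s in members)  # pessimistic reward
        upper_u[i] = max(utility[s] for s in members)  # optimistic reward
        for a1 in actions1[rep]:
            for a2 in actions2[rep]:
                # All partition sets some member can move to under (a1, a2).
                reach[(i, a1, a2)] = {block_of[delta[(s, a1, a2)]]
                                      for s in members}
    return lower_u, upper_u, reach


# A tiny example with a single action per player in every state.
acts = {s: ("go",) for s in "abcd"}
delta = {("a", "go", "go"): "b", ("b", "go", "go"): "d",
         ("c", "go", "go"): "a", ("d", "go", "go"): "d"}
utility = {"a": 0, "b": 2, "c": 5, "d": 1}
partition = [{"a"}, {"b", "c"}, {"d"}]

lower_u, upper_u, reach = abstract(partition, acts, acts, delta, utility)
print(lower_u[1], upper_u[1], sorted(reach[(1, "go", "go")]))
```

In the example, merging states b and c yields an abstract state whose utility is 2 in the lower abstraction and 5 in the upper one, and from which either the set containing a or the set containing d can be reached.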

Example. Figure 9 (left) shows a concurrent game with with states. The utilities are denoted in red. The edges correspond to transitions in and each edge is labeled with its corresponding action pair. Here , and . Given that action sets for and are equal, we can create abstracted games using the partition where and other sets are singletons. The resulting game structure is depicted in Figure 9 (center). Dummy states are shown by circles and whenever a play reaches a dummy state in , player 2 chooses which red edge should be taken. Conversely, in player 1 makes this choice. Also, and . The final abstracted of the example above, without dummy states, is given in Figure 9 (right).

Figure 9: An example concurrent game (left), abstraction process (center) and the corresponding without dummy states (right).

5.2 Abstraction: soundness, refinement, and completeness in limit

For an abstraction we need to prove three key properties: (a) soundness, (b) refinement of the abstraction, and (c) completeness in the limit. The intuitive description is as follows: (a) soundness requires that the value of the game lies between the values of the lower and upper abstractions; (b) refinement requires that if the partition is refined, then the values of the lower and upper abstractions become closer; and (c) completeness requires that if the partitions are refined enough, then the value of the original game can be approximated. We present and prove each of these results below.

5.2.1 Soundness.

Soundness means that when we apply abstraction, value of the original game must lie between values of the lower and upper abstractions. Intuitively, this means abstractions must provide us with some interval containing the value of the game. We expect the value of to be less than or equal to the value of the original game because in , the utilities are less than in and player 2 has more power, given that she can choose which transition to take. Conversely, we expect to have a higher value than .

Formal requirement for Soundness. An abstraction of a game $G$ leading to the lower and upper abstractions $G^{\downarrow}$ and $G^{\uparrow}$ is sound if for every $k$,

$$v_{2k}(G^{\downarrow}) \le v_k(G) \le v_{2k}(G^{\uparrow}).$$

The factor 2 in the inequalities above is due to the fact that each transition in the original game is modeled by two transitions in abstracted games, one to a dummy state and a second one out of that dummy state.

We now formally prove our soundness result. The main intuition in this proof is that letting player 1 get the minimal reward in each partition set when she reaches any state of the set, and allowing player 2 to choose the resulting state among all possibilities cannot possibly be in player 1’s favor and increase her utility. Similarly, doing the opposite thing by letting her get the maximal reward and choose the transition cannot possibly decrease her utility.

Theorem 5.1 (Soundness)

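Given a game $G$ and a partition $\Pi$ of its state space, if $G^{\downarrow}$ and $G^{\uparrow}$ exist, then the abstraction is sound, i.e. for all $k$, it is the case that $v_{2k}(G^{\downarrow}) \le v_k(G) \le v_{2k}(G^{\uparrow})$.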

Proof

We prove the first inequality; the second one can be proved similarly.

For a mixed strategy for player 1 in , let be the guaranteed value of the game if player 1 plays . We say that is a best response to if playing leads to a total utility of . We define and best responses in analogously.

Let be a strategy for player 1 in , such that . Such a strategy exists because the set of all strategies for player 1 is compact and is continuous. Let be a strategy for player 1 in that follows , i.e. looks at the histories of as histories of and assigns to each history of the action that assigns to the corresponding history of . Then, let be a best response to in and a strategy in that follows against , i.e. chooses actions in non-dummy states in accordance with actions chosen by and actions in dummy states in accordance with transitions of the play of . Intuitively, is player 2’s best strategy if she does not use her additional ability of choosing the next state in . It is evident by construction that , because paths in according to these strategies go through a sequence of partition sets that correspond exactly to the sequence of states that are visited in and the utility of each such partition set is defined to be less than or equal to the utility of each of its states. Therefore , given that was a best response. This means that .

5.2.2 Refinement.

We say that a partition is a refinement of a partition , and write , if every is a union of several ’s in , i.e.  and for all , . Intuitively, this means that is obtained by further subdividing the partition sets in . It is easy to check that is a partial order over partitions. We expect that if , then the abstracted games resulting from give a better approximation of the value of the original game in comparison with abstracted games resulting from . This is called the refinement property.

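Formal requirement for the Refinement Property. Two abstractions of a game $G$ using two partitions $\Pi_1, \Pi_2$ such that $\Pi_2$ is a refinement of $\Pi_1$ ($\Pi_2 \sqsubseteq \Pi_1$), and leading to abstracted games $G_i^{\downarrow}, G_i^{\uparrow}$ corresponding to each $\Pi_i$, satisfy the refinement property if for every $k$,

$$v_{2k}(G_1^{\downarrow}) \le v_{2k}(G_2^{\downarrow}) \le v_{2k}(G_2^{\uparrow}) \le v_{2k}(G_1^{\uparrow}).$$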

We now prove that any two abstractions with satisfy this property.

Theorem 5.2 (Refinement Property)

Let be two partitions of the state space of a game , then the abstractions corresponding to satisfy the refinement property.

Proof

Note that , so can itself be considered as a partition of and one can then define an abstracted pair of games and on with respect to . Strategies and paths in are in natural bijection with strategies and paths in and the bijection preserves utility. Therefore, using the Soundness theorem above, we have