Formal Specification and Verification of Smart Contracts for Azure Blockchain

In this paper, we describe the formal verification of Smart Contracts offered as part of the Azure Blockchain Content and Samples on GitHub. We describe two sources of formal verification problems: (i) semantic conformance checking of smart contracts against a state-machine and access-control based Azure Blockchain Workbench application configuration, and (ii) safety verification for smart contracts implementing the authority governance in Ethereum Proof-of-Authority (PoA) on Azure. We describe a new program verifier VeriSol for Solidity based on a translation to Boogie and leveraging the Boogie verification toolchain. We describe our experience applying VeriSol to Workbench sample contracts and Proof of Authority governance contracts in Azure, and finding previously unknown bugs in well-tested smart contracts. We provide push-button unbounded verification for the semantic conformance checking for all the sample contracts shipped in Workbench, once the bugs are fixed.

1. Introduction

Blockchain (a decentralized and distributed consensus protocol to maintain and secure a shared ledger) is seen as a disruptive technology with far-reaching impact on diverse areas of society such as cryptocurrencies, banking, escrow and governance. According to Microsoft’s Coco Framework white paper (Microsoft, 2017):

Blockchain technology is poised to become the next transformational computing paradigm. It promises to disrupt existing business processes, to reduce the friction of doing business, and to unlock new business models, especially shared processes across organizations. According to Gartner, the business value-add of blockchain will grow to slightly more than $176 billion by 2025, and then it will exceed $3.1 trillion by 2030. Given such benefits, in our rapidly evolving digital economy, it won’t be long before blockchain technology is a key foundation for distributed enterprise and consumer applications.

Smart contracts are applications that run on blockchains such as Ethereum, and are an essential ingredient for democratizing the use of blockchain technology beyond cryptocurrencies (e.g. Bitcoin). Smart contracts often encode expressive workflows in a Turing-complete programming language. For example, the Ethereum blockchain provides a low-level stack-based bytecode language that executes on top of the Ethereum Virtual Machine (EVM). High-level languages such as Solidity and Serpent have been developed to enable traditional application developers to author smart contracts.

There are at least two compelling reasons to apply formal specification and verification to smart contracts:

  • Smart contract vulnerabilities. Unlike traditional programs written in high-level programming languages, smart contracts have unique security and integrity characteristics. First, smart contracts manage, hold, and transfer digital assets such as Ether, which makes them susceptible to theft. Second, smart contracts are mostly immutable after deployment, and hence ensuring their safety and security in an open and adversarial context is of paramount importance. Vulnerabilities in smart contracts have resulted in several high-profile exploits that undermine the trust in the underlying blockchain technology. For example, the infamous TheDAO exploit (the, 2016) resulted in the loss of almost 60 million USD worth of Ether and in an undesirable hard fork in Ethereum. Several other smart contract vulnerabilities have resulted in the loss of substantial value of Ether (Atzei et al., 2017), including the Parity Wallet bug that caused 169 million USD worth of Ether to be locked forever (par, 2017).

  • High-level specifications. Smart contracts are often low-level implementations of a high-level workflow that comprises a state machine with different actions predicated by suitable access control to determine who has the permission to execute a given action. These high-level workflows are often designed by domain experts, who may not be proficient programmers with knowledge of subtle semantics of programming constructs. Thus, there is a strong need for a high-level specification language for expressing the intent of the workflow, which can be implemented in a smart contract. Specifying the high-level workflow abstractly also allows targeting different languages (e.g. Solidity) or ledgers (Ethereum vs. Hyperledger Fabric (fab, 2015)) with relative ease.

In this work, we explore the use of formal specification and verification for Solidity smart contracts that constitute the Azure Blockchain content and samples on GitHub (azu, 2018). (Footnote 1: The content of the paper is based purely on information available in open source on the GitHub page and Azure documentation pages. Moreover, given the static nature of the verification, we only need access to the JSON configuration and Solidity smart contract source files for conducting this study.) Azure Blockchain consists of a set of components and services that allow businesses to rapidly prototype and deploy blockchain applications on the Azure cloud (azu, 2017). Among other services, it currently consists of two main products: (a) Azure Blockchain Workbench (or simply Workbench henceforth), and (b) Ethereum on Azure, both of which are interesting from the perspective of smart contract analysis. Several aspects of these smart contracts make them interesting targets for verification:

  1. First, many of the sample contracts constitute proof of concepts for real-world enterprise scenarios.

  2. Second, a Workbench application consists of a JSON file that expresses the state machine with access control that a smart contract has to implement. Once formalized, these can serve as implicit specifications that can be checked statically or at runtime.

  3. Finally, the smart contracts that constitute the Ethereum on Azure (namely, the PoA governance contracts) have been deployed several thousand times by Azure Blockchain customers. Thus, the safety and security issues in such contracts have serious real-world consequences.

Although several static analysis approaches have been proposed recently to scan for known vulnerabilities in smart contracts (Luu et al., 2016; Tsankov et al., 2018), they do not offer the users the ability to specify and verify formal specifications (see Related Work in Section 8 for further explanation). For the purpose of this paper, we distinguish a formal verifier from static analysis in that a violation of a formal specification is always considered a bug, not just a bad coding practice.

Contributions.

The paper makes the following contributions:

  1. We provide a formalization of the JSON-based Workbench application configuration that allows formal tools to interpret and enforce it.

  2. We define a semantic conformance checking problem between a JSON-based Workbench application configuration and a smart contract, and provide an automatic program instrumentation to enforce the specification in a Solidity smart contract.

  3. We describe a new prototype formal verifier VeriSol for Smart Contracts written in Solidity. The verifier encodes the semantics of Solidity programs into Boogie and leverages the well-engineered Boogie verification pipeline (Barnett et al., 2005). The verifier is generic and not tied to the Azure Blockchain examples.

  4. We use VeriSol to discover previously unknown semantic conformance bugs in Workbench samples using transaction-bounded verification; we then report full verification of the property on the fixed examples using invariant inference.

  5. Finally, we report a detailed case study of VeriSol on a PoA governance contract in Azure Blockchain and find previously unknown bugs.

Organization.

The paper is organized as follows. In Section 2, we provide an overview of the Azure Blockchain components and the smart contracts available as part of Azure Blockchain Content and Samples; we introduce a simple running example and informally describe the Workbench JSON application configuration language for specifying access control and state transitions. In Section 3, we provide formal semantics for the Workbench JSON application configuration (Section 3.1). In Section 4, we describe the problem of semantic conformance for a smart contract implementing a Workbench JSON configuration. In Section 5, we provide an encoding of a subset of the Solidity language in the Boogie intermediate verification language. In Section 6, we describe a formal verifier VeriSol for Solidity that leverages the Boogie translation and uses various Boogie-based (bounded and unbounded) verification tools. In Section 7, we report our experience of running VeriSol on the smart contracts that constitute the Azure Blockchain. We discuss related work in Section 8 and finally conclude.

2. Overview of Azure Blockchain Content and Samples

Figure 1. Workflow diagram for HelloBlockchain application.

Azure Blockchain consists of a set of components and services that allow businesses to rapidly prototype and deploy blockchain applications on the Azure cloud. Among other services, it currently consists of two somewhat independent products: (a) Azure Blockchain Workbench, and (b) Ethereum on Azure, both of which are interesting from the perspective of smart contract analysis. Azure Blockchain Workbench is primarily focused on the application level, whereas Ethereum on Azure is a product offering at the ledger level. This section gives an overview of the two.

2.1. Azure Blockchain Workbench

Workbench consists of services that allow users to deploy blockchain applications on the Azure cloud. An enterprise smart contract application requires not only a bare ledger, but also services for user authentication, identity mapping, messaging, REST APIs, web UI, source code control, etc. Azure Blockchain Workbench (abbr. Workbench) is a product that allows such an application scaffold to be created on Azure very easily, so that a user can focus on building the smart contract application. In Workbench, the smart contract application consists of two components: (i) a JSON file describing the application configuration or interface (Footnote 2: We use the terms configuration and interface interchangeably.), and (ii) a smart contract that implements the application business logic. Once an application is uploaded into Workbench, users can add more members, and members can drive the application to different states by taking suitable actions. We informally describe the configuration language (formally described in Section 3.1) and an associated Solidity smart contract in the next few paragraphs.

2.1.1. Workbench Application Configuration

Workbench requires a JSON-based configuration file that is used to populate the application information, which can be queried by users through REST APIs to interact with a Workbench application. The JSON interface of an application consists of several attributes such as application name and description, a set of roles, and a set of workflows. Figure 1 provides an informal pictorial representation of the JSON for a simple application called HelloBlockchain. The actual JSON and the example-related details can be found on the associated web page (Footnote 3: https://github.com/Azure-Samples/blockchain/tree/master/blockchain-workbench/application-and-smart-contract-samples/hello-blockchain). The application consists of two roles (listed under “APPLICATION ROLES”), namely Requestor and Responder. Informally, each role represents a set of user addresses; roles are used to provide access control or permissions for various actions exposed by an application.

A workflow informally consists of a name and description along with a set of states, data members, functions (or actions), and state transitions. The simple HelloBlockchain application consists of a single workflow with the same name as the application. As seen from Figure 1, the workflow consists of two states: Request and Respond. The data members (or fields) consist of Requestor and Responder, which range over addresses, and the strings RequestMessage and ResponseMessage, which store the last message (not shown in the figure). The workflow consists of two actions or functions in addition to the constructor function: SendRequest and SendResponse, both of which take a string as argument.

Finally, the state transitions specify the initial state (Request for this example) and the transitions between the states. A transition consists of a start state, an action or function, an access control list, and a set of successor states. Figure 1 describes two transitions, one from each of the two states. For example, the application can transition from Request to Respond if a user from the Responder role (categorized as “Allowed Role” (AR) under the “Legend” box) invokes the action SendResponse. An “Application Instance Role” (AIR) refers to a data member of the workflow that stores a member of a global role (also called Requestor for the example); a transition that uses an AIR, such as the one from Respond to Request, checks if the user address matches the value stored in the instance data variable.

pragma solidity ^0.4.20;

contract HelloBlockchain {
    // Set of states
    enum StateType {Request, Respond}

    // List of properties
    StateType public State;
    address public Requestor;
    address public Responder;
    string public RequestMessage;
    string public ResponseMessage;

    // Constructor function
    function HelloBlockchain(string message) public {
        Requestor = msg.sender;
        RequestMessage = message;
        State = StateType.Request;
    }

    // Call this function to send a request
    function SendRequest(string requestMessage) public {
        RequestMessage = requestMessage;
        State = StateType.Request;
    }

    // Call this function to send a response
    function SendResponse(string responseMessage) public {
        Responder = msg.sender;
        ResponseMessage = responseMessage;
        State = StateType.Respond;
    }
}
Figure 2. Solidity contract for HelloBlockchain application.

2.1.2. Workbench Application Smart Contract

After specifying an application configuration in JSON, a user provides a smart contract for the appropriate blockchain ledger to implement the workflow. Currently, Workbench supports the popular language Solidity for targeting applications on Ethereum. Figure 2 describes a Solidity smart contract that implements the HelloBlockchain workflow in the HelloBlockchain application. For the purpose of this section, we ignore the portions of the code that relate to conformance checking; we will refer to them when describing the conformance checking in Section 4.2. The contract declares the data members present in the JSON configuration as state variables with suitable types. Each contract implementing a workflow defines an additional state variable State to track the current state of a workflow. The contract consists of the constructor function along with the two functions defined in the JSON configuration, with matching signatures. The functions set the data members and update the variable State appropriately to reflect the state transitions.

The Workbench service allows a user to upload the JSON and the Solidity code, optionally add users, and perform various actions permitted by the configuration. To ensure the correct functioning and security of the application, it is crucial to verify that the Solidity program semantically conforms to the intended meaning of the JSON configuration.

2.2. Proof of Authority Governance Contracts

Separate from the application-level Workbench offering, Azure Blockchain also offers ledger-level services. One of them is Ethereum on Azure. The Ethereum blockchain comes with a choice of consensus protocols for a decentralized system: (i) the conventional Proof of Work (PoW) and (ii) Proof of Authority (PoA). The consensus algorithm decides which node wins the privilege to append the latest block to the blockchain.

PoW is the widely used consensus algorithm for traditional public (permissionless) blockchains, e.g., the Bitcoin network. It guards against Sybil attacks in the presence of anonymous nodes, where a malicious party could otherwise easily create many nodes to become disproportionately powerful. However, PoW consensus is computationally expensive as it relies on miners solving a difficult cryptographic puzzle, and therefore it limits the throughput (the number of transactions that can be mined per unit of time) of the blockchain network.

Figure 3. PoA Ethereum on Azure and its governance contracts.

PoA is proposed as an alternative to PoW for permissioned consortium networks where the identities cannot be forged, as they are linked to off-chain identities. It differs from a public blockchain in that the consortium is formed by running an election to accept new members, each having an identity. The members share the responsibility and authority of validating transactions and appending them to the ledger. PoA allows for higher throughput in the case of consortium blockchain applications.

Enterprise customers and other systems such as Azure Blockchain Workbench can deploy a PoA network on Azure based on the Parity implementation of Ethereum. The network consists of a set of nodes running the PoA protocol and validating transactions. Every node is assigned a distinct Ethereum address, which is called the validator address. The validator set is a contract managing a set of validator addresses (shown as "validators" for brevity). Adding or removing a validator address results in the corresponding node being added to or removed from the PoA network.

The Parity Ethereum implementation exposes a ValidatorSet contract interface that is implemented by the Ethereum on Azure deployment. Governing a PoA network requires a set of contracts shown in Figure 3. Validators belong to different organizations, such as companies A, B and C in the figure. Each organization is represented by an admin contract. Initially, the PoA network is created by one admin, who naturally becomes the only elected admin of the network. Later, for another admin to become an elected admin, it needs to win a majority vote among existing elected admins. An elected admin is allowed to bring in a number of validators. The admins can also vote against each other. If more than half of the existing elected admins vote against one of them, that admin is evicted and its validators are consequently removed. The above protocol for admin voting and validator set management is implemented as a set of contracts that implement Parity Ethereum’s ValidatorSet contract interface and are available as part of Azure Blockchain content and samples (poa, 2018b, a). It consists of the following smart contract implementations totaling around 600 lines of Solidity code.

  • SimpleValidatorSet: A simple implementation of the ValidatorSet that handles adding validators.

  • AdminValidatorSet: Inherits from SimpleValidatorSet, and introduces the concept of Admins for different organizations and voting. It allows an admin to control their validator set and vote for other admins.

  • Admin: A contract representing a Consortium member; it tracks votes for or against a given member.

  • AdminSet: Contract used for performing constant time operations on a set of Admins.

The smart contracts use several features that make them a challenging benchmark for Solidity smart contract reasoning. We outline some of them here:

  • The contracts use multiple levels of inheritance since the top-level contract AdminValidatorSet derives from the contract SimpleValidatorSet, which in turn derives from the ValidatorSet interface.

  • They use sophisticated access control, implemented with Solidity modifiers, to restrict which users and contracts can alter the states of different contracts.

  • The contracts maintain deeply nested mappings and arrays to store the set of validators for different admins.

  • The contracts make use of nested loops and procedures to iterate over the arrays, and make use of arithmetic operations to reason about majority voting.
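The voting rule described in this section can be illustrated with a small Python toy model. This is only a sketch under stated assumptions: the class `Consortium` and its method names are hypothetical, validators are not modeled, and it does not reproduce the AdminValidatorSet code. It shows the "more than half" test written with integer arithmetic only, which is the kind of arithmetic reasoning a verifier has to handle.

```python
from dataclasses import dataclass, field


@dataclass
class Consortium:
    """Toy model of elected admins voting on each other (hypothetical names)."""
    elected: set = field(default_factory=set)
    votes_for: dict = field(default_factory=dict)      # candidate -> voters
    votes_against: dict = field(default_factory=dict)  # admin -> voters

    def has_majority(self, voters: set) -> bool:
        # Strict majority with integers only: 2 * votes > number of admins.
        # Votes cast by admins that were evicted meanwhile are ignored.
        valid = voters & self.elected
        return 2 * len(valid) > len(self.elected)

    def vote_for(self, voter: str, candidate: str) -> None:
        if voter not in self.elected or candidate in self.elected:
            return  # only elected admins vote, and only on non-members
        self.votes_for.setdefault(candidate, set()).add(voter)
        if self.has_majority(self.votes_for[candidate]):
            self.elected.add(candidate)
            del self.votes_for[candidate]

    def vote_against(self, voter: str, target: str) -> None:
        if voter not in self.elected or target not in self.elected:
            return
        self.votes_against.setdefault(target, set()).add(voter)
        if self.has_majority(self.votes_against[target]):
            self.elected.discard(target)
            del self.votes_against[target]
```

With a single founding admin, one vote suffices to elect a second admin; with two admins, a single vote is a tie and elects nobody.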

3. Formalizing Workbench Application Configuration

In the next couple of sections we describe the problem of ensuring that a smart contract correctly implements the Workbench Application Configuration provided in the JSON file. We first formalize the Workbench Application Configuration (WBAC) that we informally introduced in Section 2. The description can be seen as a mathematical representation of the official schema documentation of WBAC as described by the Azure Blockchain (Footnote 4: https://docs.microsoft.com/en-us/azure/blockchain/workbench/configuration).

3.1. Workbench Application Interface

The Workbench Application Interface (WBAC) is described in a JSON file. The WBAC for an application allows the user to describe the data members of an application, role-based access control for various functions or actions, and finally a high-level state-machine based view of the application. The role-based access control provides security for deploying smart contracts in an open and adversarial setting; the high-level state machine naturally captures the essence of a workflow that progresses between a set of states based on some actions from the user.

We assume that each function is associated with a $\mathit{sender}$, which is the address of a user or another workflow that invokes the function. In Solidity, this is denoted by the msg.sender parameter within a function. The invocation of a function can be restricted to users that belong to certain roles; we refer to this as the access control.

Formally, a Workbench Application Configuration $\mathcal{C}$ consists of the following:

  • A set of global roles $\mathcal{R}$, common to all workflows, that is used for access control for functions.

  • A set of types $\mathcal{T}$, where a type is either (i) an integer, (ii) a string, (iii) an address of a contract or user, (iv) a role $r \in \mathcal{R}$, or (v) a workflow $w \in \mathcal{W}$ (as defined next).

  • A set of workflows $\mathcal{W}$, where a workflow $w \in \mathcal{W}$ is a tuple $\langle S_w, s^0_w, P_w, RP_w, F_w, f^0_w, AC_w, ac^0_w, \gamma_w \rangle$:

    • $S_w$, a bounded set of states,

    • $s^0_w \in S_w$, an initial state,

    • $P_w$, a set of properties where a property $p \in P_w$ has a type $T_p \in \mathcal{T}$ and a string identifier $id_p$.

    • A subset of properties $RP_w \subseteq P_w$ are instance roles, where the types are roles from $\mathcal{R}$. That is, for any $p \in RP_w$, $T_p$ is a member of $\mathcal{R}$. The intuition is that a specific instance of a workflow may designate only a subset of members from a given role to execute a certain action. The instance role variables capture such instance-specific users who are members of a given role $r \in \mathcal{R}$.

    • A set of function types $F_w$, where each function type $f \in F_w$ consists of

      • The function name $id_f$,

      • Empty list of return parameters,

      • An arity $n_f$ of the input parameters,

      • A type $T_{f,i} \in \mathcal{T}$ and an identifier $id_{f,i}$ for the $i$-th parameter ($0 \le i < n_f$), counting from zero.

    • A constructor type $f^0_w$ where the function name equals the workflow name.

    • The access control set $AC_w \doteq RP_w \cup \mathcal{R}$ is the union of the instance role properties and the set of global roles in $\mathcal{R}$. One can restrict the invocation of a function within a transition by insisting that the $\mathit{sender}$ belongs to a subset of $AC_w$.

    • The initiator access control $ac^0_w$ for restricting users who can create an instance of the contract by calling the constructor.

    • Finally, a set of transitions $\gamma_w \subseteq S_w \times F_w \times 2^{AC_w} \times 2^{S_w}$. Intuitively, a transition $(s, f, ac, S') \in \gamma_w$ indicates that the system can transition from state $s$ to one of the states in $S'$ (non-deterministically) by invoking the function $f$, provided the “sender” of $f$ is a member of the access control set $ac$.

3.2. Example

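Consider the WBAC for the HelloBlockchain application described in Section 2:

  • Set of roles $\mathcal{R} = \{\mathit{Requestor}, \mathit{Responder}\}$

  • Set of types $\mathcal{T} = \{\mathit{integer}, \mathit{string}, \mathit{address}\} \cup \mathcal{R} \cup \mathcal{W}$

  • A single workflow $w = \mathit{HelloBlockchain}$ (we drop the $w$ suffix) with:

    • Set of workflow states $S = \{\mathit{Request}, \mathit{Respond}\}$,

    • The start state $s^0 = \mathit{Request}$,

    • The set of properties (or fields) $\mathit{Requestor} : \mathit{Requestor}$, $\mathit{Responder} : \mathit{Responder}$, $\mathit{RequestMessage} : \mathit{string}$, $\mathit{ResponseMessage} : \mathit{string}$.

    • The set of instance role properties is $RP = \{\mathit{Requestor}, \mathit{Responder}\}$.

    • The set of 3 functions:

      • The constructor $\mathit{HelloBlockchain}(\mathit{string}\ \mathit{message})$,

      • Two functions $\mathit{SendRequest}(\mathit{string}\ \mathit{requestMessage})$ and $\mathit{SendResponse}(\mathit{string}\ \mathit{responseMessage})$.

    • The access control set $AC = RP \cup \mathcal{R}$, with the initiator access control $ac^0 = \{\mathit{Requestor}\}$ (the global role).

    • Finally, there are two state-transitions in $\gamma$ as depicted in Figure 1.

      • $(\mathit{Request}, \mathit{SendResponse}, \{\mathit{Responder}\}, \{\mathit{Respond}\})$, where Responder is the global role, and

      • $(\mathit{Respond}, \mathit{SendRequest}, \{\mathit{Requestor}\}, \{\mathit{Request}\})$, where Requestor is the instance role property.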
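To make the shape of a configuration concrete, the following Python sketch encodes the HelloBlockchain workflow as plain data and checks a few side conditions of the definition. It is an illustration under assumptions, not the Workbench schema: the class and field names (`Workflow`, `Transition`, `allowed_roles`, `allowed_instance_roles`, `well_formed`) are hypothetical, the property types and the choice of instance roles are read off the informal description in Section 2.1.1, and the requirement that initiators are global roles is assumed. Global roles and instance role properties are kept in separate fields because both may carry the same name (Requestor).

```python
from dataclasses import dataclass, replace
from typing import FrozenSet, Tuple


@dataclass(frozen=True)
class Transition:
    start: str                              # start state
    function: str                           # function name
    allowed_roles: FrozenSet[str]           # global roles in the access control set
    allowed_instance_roles: FrozenSet[str]  # instance role properties in it
    successors: FrozenSet[str]              # successor states


@dataclass(frozen=True)
class Workflow:
    name: str
    states: FrozenSet[str]
    initial: str
    properties: Tuple[Tuple[str, str], ...]  # (identifier, type) pairs
    instance_roles: FrozenSet[str]
    functions: FrozenSet[str]
    initiators: FrozenSet[str]
    transitions: Tuple[Transition, ...]


def well_formed(roles: FrozenSet[str], w: Workflow) -> bool:
    """Check the side conditions listed in the workflow definition."""
    types = dict(w.properties)
    return (
        w.initial in w.states
        # Instance roles are properties whose type is a global role.
        and all(types.get(p) in roles for p in w.instance_roles)
        # Assumption: only global roles may create an instance.
        and w.initiators <= roles
        and all(
            t.start in w.states
            and t.function in w.functions
            and t.allowed_roles <= roles
            and t.allowed_instance_roles <= w.instance_roles
            and t.successors <= w.states
            for t in w.transitions
        )
    )


ROLES = frozenset({"Requestor", "Responder"})

HELLO = Workflow(
    name="HelloBlockchain",
    states=frozenset({"Request", "Respond"}),
    initial="Request",
    properties=(
        ("Requestor", "Requestor"),
        ("Responder", "Responder"),
        ("RequestMessage", "string"),
        ("ResponseMessage", "string"),
    ),
    instance_roles=frozenset({"Requestor", "Responder"}),
    functions=frozenset({"SendRequest", "SendResponse"}),
    initiators=frozenset({"Requestor"}),
    transitions=(
        Transition("Request", "SendResponse", frozenset({"Responder"}),
                   frozenset(), frozenset({"Respond"})),
        Transition("Respond", "SendRequest", frozenset(),
                   frozenset({"Requestor"}), frozenset({"Request"})),
    ),
)

# A deliberately broken variant: the initial state is not a declared state.
BROKEN = replace(HELLO, initial="Done")
```

Here `well_formed(ROLES, HELLO)` holds, while `well_formed(ROLES, BROKEN)` does not.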

4. Semantic Conformance Checking for Workbench

Given a WBAC $\mathcal{C}$ in the form of a JSON file and a smart contract $\mathcal{P}$ (say a Solidity file), we would like to ensure that the smart contract correctly implements the interface and the state transitions described in $\mathcal{C}$. This is crucial to ensure that the high-level workflow specified in the application configuration file by the designer is correctly implemented in the smart contract.

We can divide this task into (a) structural and (b) semantic conformance checking. The structural conformance checking ensures that the data members, states and types specified in $\mathcal{C}$ match those in $\mathcal{P}$. The semantic conformance checking ensures that the smart contract $\mathcal{P}$ (that is structurally conformant with $\mathcal{C}$) correctly implements the access control and state transitions in $\mathcal{C}$. Whereas the structural checking problem can be stated purely in terms of the abstract syntax tree of $\mathcal{P}$, the semantic checks require reasoning over the dynamic runtime states of $\mathcal{P}$. In Section 4.1, we formalize the problem of semantic conformance by providing an axiomatic semantics over an abstract smart contract language. Next, we describe a program instrumentation technique for runtime enforcement of these checks, and provide a concrete implementation for Solidity-based smart contracts.

We first state an abstract version of structural conformance checking in this section; the concrete details rely on the choice of language in which the smart contract is expressed (e.g. Solidity). Given a WBAC $\mathcal{C}$ and a smart contract $\mathcal{P}$, the structural conformance checker enforces the following for each workflow $w$ in $\mathcal{C}$: (i) there exists a contract or class named $w$ in $\mathcal{P}$, and (ii) the matched class for $w$ contains (a) a member variable named State whose range is the set of states in $w$, (b) a matching member variable or field for each property of $w$ with compatible types, and (c) a matching constructor function and public functions that match the constructor and the functions in the workflow. Each function needs to have a matching name as well as a matching parameter name list, with compatible types. The Workbench system already provides one such structural conformance checker for Solidity.
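The structural check lends itself to a short sketch. The Python below compares a workflow description with a summary of a contract, as a parser over the Solidity abstract syntax tree might produce it. Everything here is an assumption made for illustration: the dictionary layout, the helper `solidity_type`, and the mapping of role-typed properties to `address` are hypothetical and are not the checker that Workbench ships.

```python
def solidity_type(wbac_type, roles):
    """Hypothetical mapping from WBAC types to Solidity types."""
    if wbac_type in roles or wbac_type == "address":
        return "address"  # role-typed values are stored as addresses
    return {"integer": "int", "string": "string"}.get(wbac_type, wbac_type)


def structural_errors(roles, workflow, contract):
    """List mismatches between a workflow and a summary of a contract."""
    errors = []
    if contract["name"] != workflow["name"]:
        errors.append("contract name differs from workflow name")
    # (a) A State member whose range is exactly the workflow states.
    state_enum = contract["members"].get("State")
    if set(contract["enums"].get(state_enum, ())) != set(workflow["states"]):
        errors.append("State does not range over the workflow states")
    # (b) One member per property, with a compatible type.
    for name, wbac_type in workflow["properties"].items():
        if contract["members"].get(name) != solidity_type(wbac_type, roles):
            errors.append("property mismatch: " + name)
    # (c) Constructor and functions with matching parameter lists.
    for name, params in workflow["functions"].items():
        expected = [(p, solidity_type(t, roles)) for p, t in params]
        if contract["functions"].get(name) != expected:
            errors.append("function mismatch: " + name)
    return errors


ROLES = {"Requestor", "Responder"}

WORKFLOW = {
    "name": "HelloBlockchain",
    "states": ["Request", "Respond"],
    "properties": {"Requestor": "Requestor", "Responder": "Responder",
                   "RequestMessage": "string", "ResponseMessage": "string"},
    "functions": {
        "HelloBlockchain": [("message", "string")],
        "SendRequest": [("requestMessage", "string")],
        "SendResponse": [("responseMessage", "string")],
    },
}

# Summary of the Figure 2 contract, written by hand for this sketch.
CONTRACT = {
    "name": "HelloBlockchain",
    "enums": {"StateType": ["Request", "Respond"]},
    "members": {"State": "StateType", "Requestor": "address",
                "Responder": "address", "RequestMessage": "string",
                "ResponseMessage": "string"},
    "functions": {
        "HelloBlockchain": [("message", "string")],
        "SendRequest": [("requestMessage", "string")],
        "SendResponse": [("responseMessage", "string")],
    },
}
```

For the matching pair above, `structural_errors(ROLES, WORKFLOW, CONTRACT)` returns an empty list; dropping a function from the contract summary yields a single mismatch entry.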

4.1. Semantic Conformance

In this section, we show how to ensure that a structurally conformant smart contract implements the access control and state transitions according to the WBAC specification. We formalize these checks using an axiomatic semantics for a generic smart contract that supports constructors and function invocation. We use the Floyd-Hoare triple notation for partial correctness

$$\{\varphi\}\ \mathit{stmt}\ \{\psi\}$$

to denote that any execution of the statement $\mathit{stmt}$ from a state satisfying the predicate $\varphi$ does not fail any assertions and, upon termination, ends up in a state satisfying the predicate $\psi$. We do not require the execution to terminate, however.

Let $c$ be an instance of workflow $w$. At a high level, the idea is simple: we insist that when a function (respectively, the constructor) is executed along a transition (respectively, during contract creation), the resulting state transition should be in accordance with the Workbench application specification.

  1. Constructor. We ensure that a successful termination of the constructor $f^0_w$ (of arity $n$) during the creation of an instance $c$ of the workflow by a user ($\mathit{sender}$) with the appropriate access control results in establishing the initial state $s^0_w$.

    $$\{\mathit{isUser}(\mathit{sender}) \Rightarrow \mathit{sender} \in ac^0_w\}\ \ c := \mathbf{new}\ f^0_w(x_0, \ldots, x_{n-1})\ \ \{c.\mathit{State} = s^0_w\}$$

    We use a predicate $\mathit{isUser}$ that checks if an address on the blockchain refers to a user address or the address of a contract; the exact implementation would depend on the particular choice of language in which a smart contract is expressed. We weaken the access control check with the predicate $\mathit{isUser}$ to indicate that the access control applies only to user addresses (and not to the addresses of contracts). There are sample contracts in Workbench (e.g. BazaarItem (wor, 2018)) that expect a contract instance to be created only by another contract, to enforce some encapsulation, and not directly by a user. This is achieved by specifying the initiator access control $ac^0_w$ to be the empty set.

  2. Transition Functions. For a transition $(s, f, ac, S')$ we ensure that if $f$ (with arity $n$) is invoked from the start state $s$ of the transition by a user in $ac$, then the state of the smart contract instance $c$ transitions to one of the successor states in $S'$.

    $$\{\mathit{sender} \in ac \wedge c.\mathit{State} = s\}\ \ c.f(x_0, \ldots, x_{n-1})\ \ \{c.\mathit{State} \in S'\}$$

    The precondition checks two facts: (i) the $\mathit{sender}$ satisfies the access control and thus is a user address, and (ii) the start state is $s$.
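The transition check can be exercised on a toy model. The Python sketch below transliterates the Figure 2 contract, passes msg.sender explicitly, and evaluates the check for a single call: if the access control and the start state hold before the call, the state afterwards must be one of the successor states. This tests individual executions and is not verification. The names `conforms`, `ForgetfulHello` and `in_global_role` are hypothetical, the buggy variant is invented for illustration, and the two transitions follow the informal description in Section 2.1.1.

```python
class HelloBlockchain:
    """Python transliteration of Figure 2; msg.sender is passed explicitly."""

    def __init__(self, sender, message):
        self.Requestor = sender
        self.Responder = None
        self.RequestMessage = message
        self.ResponseMessage = ""
        self.State = "Request"

    def SendRequest(self, sender, request_message):
        self.RequestMessage = request_message
        self.State = "Request"

    def SendResponse(self, sender, response_message):
        self.Responder = sender
        self.ResponseMessage = response_message
        self.State = "Respond"


class ForgetfulHello(HelloBlockchain):
    """Hypothetical buggy variant: SendResponse forgets to update State."""

    def SendResponse(self, sender, response_message):
        self.Responder = sender
        self.ResponseMessage = response_message


def conforms(contract, transition, sender, in_global_role, *args):
    """Run one call and report whether the transition check holds for it."""
    start, func, roles, instance_roles, successors = transition
    # Precondition, evaluated on the pre-state: access control and start state.
    authorized = any(in_global_role(r, sender) for r in roles) or any(
        getattr(contract, p) == sender for p in instance_roles
    )
    pre = authorized and contract.State == start
    getattr(contract, func)(sender, *args)
    # The postcondition is only required when the precondition held.
    return (not pre) or contract.State in successors


# (start state, function, global roles, instance roles, successor states)
RESPOND = ("Request", "SendResponse", {"Responder"}, set(), {"Respond"})
REQUEST = ("Respond", "SendRequest", set(), {"Requestor"}, {"Request"})
```

Passing `in_global_role` as a parameter keeps the off-chain role database outside the model, so a caller can try both membership outcomes.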

Type invariant.

In addition to these two checks, one can also postulate a type invariant for any instance role variable $p$ of role type $r$. One can check that whenever such an instance role variable holds a non-zero value (i.e. other than 0x0), then it is an address that is a member of the global role $r$; recall that a global role (such as Requestor in HelloBlockchain) denotes a dynamic list of addresses that can be added or removed by a Workbench application owner. However, such a check ($p \neq \texttt{0x0} \Rightarrow p \in r$) is really a precondition that is asserted at (i) entry to any public method, and (ii) after any assignment to $p$ in the method body. We think of this as a precondition since the value of this predicate depends on the contents of $r$, which can be seen as an input to a function. Since it does not add any runtime assertions to verify statically, we omit this check in our conformance checking.

4.2. Semantic Conformance for Solidity

In this section, we provide a concrete implementation of the semantic conformance checking for smart contracts written in the popular Solidity language for the Ethereum blockchain. We first describe a challenge in enforcing the checks related to global roles. We then provide program instrumentation to add the conformance checks for ensuring correct initial state and state transitions.

4.2.1. Global Roles

Although a Workbench Application configuration declares a set of global roles, Workbench currently does not maintain this information on the blockchain. Workbench maintains the membership of different roles in databases outside of the blockchain. As a result, Solidity smart contracts for a Workbench application cannot refer to global role information in the body of any function. Validation of the access control for a transition is performed by the Workbench system directly at the time a function is invoked. This poses an interesting dilemma for statically verifying the semantic conformance: we can either model the state of the role database in addition to the smart contract code, or perform a conservative verification where the content of each global role is completely non-deterministic. We adopt the latter for two reasons. (i) Since the smart contracts cannot refer to global roles in current Workbench, any modification to the set of global roles (e.g. adding a user to the Requestor role in the HelloBlockchain application) will never be interleaved with a transaction that executes a function in a workflow (given the deterministic execution semantics of the EVM). (ii) Given that the global roles can change arbitrarily before invoking a function, one has to assume a completely non-deterministic value of the global roles for soundness. Therefore, we do not introduce any spurious behaviors by assuming the global roles to be completely arbitrary sets.

4.2.2. Program Instrumentation

To instrument the checks we formalized in the prior section, we use the modifier construct from Solidity. A modifier has syntax very similar to a function definition in Solidity, with a name, a list of parameters, and a body that can refer to parameters and globals in scope. The general structure of a modifier definition without any parameters (we refer readers to the Solidity documentation on modifiers (sol, 2016)) is:

modifier Foo() {
   pre-statements;
   _; // placeholder
   post-statements;
}

where pre-statements and post-statements are Solidity statements. When this modifier is applied to a function Bar,

function Bar(int x) Foo(){
   Bar-statements;
}

the Solidity compiler transforms the body of Bar to execute pre-statements (respectively, post-statements) before (respectively, after) Bar-statements. This provides a convenient way to inject code at multiple return sites from a procedure and also inject code before the execution of the constructor code (since a constructor may invoke other base class constructors implicitly).
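The wrapping behaviour described above can be illustrated outside Solidity. The following Python sketch is only an analogy, assuming that a modifier behaves like a decorator that runs its pre-statements, then the function body at the placeholder, then its post-statements. The names `foo`, `bar`, and `trace` are hypothetical stand-ins for Foo, Bar, and an execution log; they are not part of the paper.

```python
import functools

trace = []  # records the order in which the modelled statements run


def foo(func):
    """Model of `modifier Foo()`: pre-statements, placeholder, post-statements."""
    @functools.wraps(func)
    def wrapped(*args, **kwargs):
        trace.append("pre-statements")
        result = func(*args, **kwargs)   # plays the role of the `_;` placeholder
        trace.append("post-statements")  # reached whichever return site was taken
        return result
    return wrapped


@foo
def bar(x):
    """Model of `function Bar(int x) Foo()` with two return sites."""
    trace.append("Bar-statements")
    if x < 0:
        return -x   # first return site
    return x        # second return site


bar(-3)
print(trace)
```

The post-statements are recorded even though `bar` returns early, which is the property that makes modifiers convenient for injecting checks at every return site.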

We now define a couple of helper predicates before describing the actual checks. Let us first define a Solidity predicate to encode a to be a member of an access-control set :

Here NonDetFunc is a side-effect free Solidity function that returns a non-deterministic Boolean value at each invocation. For the sake of static verification as we describe in this paper, one can declare a function without any definition. This allows us to model the membership check conservatively in the absence of global roles on the blockchain. However, this solution is not suitable for installing actual runtime checks since a truly non-deterministic function such as NonDetFunc cannot be realized; we discuss the actual runtime checks in the Appendix A.

Next, we define a predicate for membership of a contract state in a set of states using as follows:

We use these predicates to define the source code transformations below:


  • Constructor. For a workflow , we add the following modifier to constructor.

    modifier constructor_checker() {
       require ();
       _;
       assert ();
    }

    It is not hard to see that the assertion ensures that the constructor sets up the correct initial state. The precondition consists of two parts. The first disjunct checks that the constructor is invoked by another contract and not a user (the global variable tx.origin tracks the user address that initiates a transaction); this is an implementation of the predicate abstractly stated in the earlier section. The second disjunct checks the access control for the msg.sender for the case when it is a user address.

  • Transition Function. For a function , let there be multiple transitions where is invoked. Let be the arity of .

    modifier g_checker() {
       // copy old State
       StateType oldState = State;
       // copy old instance role vars
       
       _;
       assert ;
    }

    First, we copy the variable and all of the variables in into corresponding “old” copies. Next, the assertion checks that if the function is executed in a transition , then state (denoted by ) transitions to one of the successor states in . The notation replaces any occurrences of a state variable (such as ) with the “old” copy that holds the value at entry to the function; this is required since the value of the state variables can change during the execution of the procedure. Finally, since conjunction distributes over assertions, we can replace the single assertion with an assertion for each transition in the implementation.
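To make the shape of the generated assertion concrete, the following Python sketch builds the assertion text for one transition. It is an illustration under assumptions, not the paper's instrumentation tool: the helper names (`member`, `in_states`, `transition_assert`) and the `(role, kind)` encoding are hypothetical, and the predicate shapes are inferred from the modifiers in Figure 4 (an instance role becomes a comparison with `msg.sender`, a global role becomes a call to `NonDetFunc()`). The `==>` operator is the paper's notation for implication rather than Solidity syntax.

```python
def member(callers, use_old=False):
    """Text of 'msg.sender is in the access-control set'.

    `callers` is a list of (role_name, kind) pairs, where kind is
    "instance" (an instance role variable) or "global" (a global role).
    """
    terms = []
    for role, kind in callers:
        if kind == "instance":
            name = "old" + role if use_old else role
            terms.append(f"msg.sender == {name}")
        else:
            # Global roles are not on the blockchain: model them conservatively.
            terms.append("NonDetFunc()")
    return " || ".join(terms) if terms else "false"


def in_states(var, states):
    """Text of 'the state held in `var` is one of `states`'."""
    terms = [f"{var} == StateType.{s}" for s in states]
    return " || ".join(terms) if terms else "false"


def transition_assert(callers, sources, targets):
    """Assertion placed after the `_;` placeholder for one transition."""
    pre = f"({member(callers, use_old=True)}) && ({in_states('oldState', sources)})"
    post = in_states("State", targets)
    return f"assert (({pre}) ==> ({post}));"


send_request = transition_assert([("Requestor", "instance")], ["Respond"], ["Request"])
send_response = transition_assert([("Responder", "global")], ["Request"], ["Respond"])
print(send_request)
print(send_response)
```

Emitting one such assertion per transition corresponds to the per-transition form mentioned at the end of the paragraph above.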

4.2.3. Instrumented Running Example

Figure 4 shows the modifier definitions for our running example HelloBlockchain described in Section 2. The modifiers are applied to the user-written smart contract in Figure 2, and shown by the underlined statements. We add a comment for the lines where a reference to a global role in the specification is replaced by a call to the non-deterministic function NonDetFunc.

   function NonDetFunc() returns (bool); //no definition
    // Checker modifiers
    modifier constructor_checker()
    {
      require (msg.sender != tx.origin ||
               NonDetFunc()); // global role REQUESTOR
       _;
      assert (State == StateType.Request);
    }
    modifier SendRequest_checker()
    {
      StateType oldState = State;
      address oldRequestor = Requestor;
       _;
      assert ((msg.sender == oldRequestor &&
               oldState == StateType.Respond)
              ==> State == StateType.Request);
    }
    modifier SendResponse_checker()
    {
      StateType oldState = State;
       _;
      assert ((NonDetFunc() && // global role RESPONDER
               oldState == StateType.Request)
              ==> State == StateType.Respond);
    }
Figure 4. Modifier definitions for instrumented HelloBlockchain application.

5. Encoding Solidity

In the next two sections, we describe the design of a formal verifier for Solidity smart contracts. In this section, we first describe a translation of a Solidity program to a program in the Boogie intermediate verification language (Barnett et al., 2005). Boogie has a small language with formalized semantics, such that verification tasks can be encoded into a formula in Satisfiability Modulo Theories (SMT) and discharged by SMT solvers such as Z3 (De Moura and Bjørner, 2008). The ability to go from a Boogie program to an SMT formula allows us to leverage various bounded and unbounded verification techniques for Boogie.

5.1. Boogie

We first describe a small subset of the Boogie language that we use to formalize our translation from Solidity.

Boogie types are integers (int), references (Ref) or arrays; arrays can be nested in that each index of an array can store an array. Booleans are syntactic sugar over integers.

Figure 5. Simple subset of Boogie language.

Figure 5 describes the expressions and statements in the language. Expressions (Exprs) consist of constants, variables, operations over expressions and array lookups, and quantified expressions. Expressions can have one of the Boogie types described above, except expressions that have type Boolean. Standard statements (Stmts) in Boogie consist of skip, variable and array assignment, sequential composition, conditional statements, and loops. The havoc x statement assigns an arbitrary value of appropriate type to a variable x. A procedure call can return a vector of values that can be stored in local variables. The assert and assume statements behave as skip when the Boolean-valued argument evaluates to true in the state. When the argument evaluates to false, assert fails the execution and assume blocks the execution.
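The difference between a failing and a blocked execution can be illustrated with a small Python model. This is a sketch under assumptions, not Boogie itself: the exception classes and the `run` helper are hypothetical, and the arbitrary value chosen by havoc is approximated by enumerating a small finite range of inputs.

```python
class AssertionFailure(Exception):
    """Raised when an assert is reached with a false argument."""


class Blocked(Exception):
    """Raised when an assume is reached with a false argument."""


def boogie_assert(cond):
    if not cond:
        raise AssertionFailure()


def boogie_assume(cond):
    if not cond:
        raise Blocked()


def run(program, state):
    """Classify one execution as 'ok', 'fails', or 'blocked'."""
    try:
        program(state)
        return "ok"
    except AssertionFailure:
        return "fails"
    except Blocked:
        return "blocked"


def good(state):
    boogie_assume(state["x"] > 0)   # executions with x <= 0 are discarded
    state["y"] = state["x"] + 1
    boogie_assert(state["y"] > 1)   # holds on every remaining execution


def bad(state):
    state["y"] = state["x"] + 1
    boogie_assert(state["y"] > 1)   # no assume, so x <= 0 reaches the assert


# Approximate `havoc x` by trying every value in a small range.
outcomes_good = {x: run(good, {"x": x}) for x in range(-2, 3)}
outcomes_bad = {x: run(bad, {"x": x}) for x in range(-2, 3)}
print(outcomes_good)
print(outcomes_bad)
```

In this model a program counts as correct when no execution is classified as "fails"; blocked executions are simply not considered.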

A state is a valuation of variables in scope. Evaluation of an expression , is denoted by , and defined inductively. We can define the standard operational (big-step) semantics that denotes that executing a statement in a state can transition to a state . We skip the details for the sake of brevity in this document.

5.2. Solidity

We now define a subset of the Solidity language that is sufficiently expressive and yet concise enough to demonstrate the translation.

5.2.1. Types

We start with the set of Solidity types. Solidity types can be one of integer, string, address, a contract name, mappings or arrays over them. We unify a mapping type mapping(t1 => t2) and an array type t2[t1] in Solidity as t1 t2, where t2 could be a nested array. In general, we use the Solidity type to stand for (for ) and (for ).

We define a mapping that translates a Solidity type to a type in Boogie as follows:

As described later, we represent a Solidity string as an uninterpreted integer in Boogie. Solidity by default treats a string as an uninterpreted value that can only be compared for equality.

5.2.2. Variables

To model the semantics of an object-oriented language such as Solidity, each generated Boogie program consists of a set of variables in global scope:

  • An array DType: [Ref]ContractName that maps an address to a contract name corresponding to its dynamic type.

  • An array Alloc: [Ref]bool that maps an address to its allocation status.

  • An array Length: [Ref]int that maps the address (of an array) to the size of the array.

  • For each scalar state variable F of type in a contract in Solidity, we introduce a map .

  • For an array or mapping state variable F, we introduce a map .

  • For any array type (either as state variable or local variable) (where for ), we add a set of maps:

    • , and

    • a set of at most distinct maps:

      Here bk (respectively b) is the string representation of (respectively ); for example, if is , then bk is “int”.

// Solidity code with nested mappings
// inheritance, constructor and require
pragma solidity ^0.4.24;
contract A {
    mapping (int => int[]) n;
    constructor() {
        n[0].push(22);
    }
    function F() returns (bool) {
        return false;
    }
}
contract B is A {
    mapping (int => int) m;
    constructor() {
        require (n[0].length == 1);
        m[0] = 11;
        m[1] = 21;
        //m[0] does not alias m[1]
        assert (m[0] == 11);
        //n[0][0] does not alias m[*]
        assert (n[0][0] == 22);
    }
    function F() returns (bool) {
        return true;
    }
}
contract C {
    A a;
    constructor() {
        a = new B();
        assert(a.F());
    }
}
Figure 6. Solidity source code
// global declarations
type Ref;
type ContractName;
var DType: [Ref]ContractName;
var Alloc: [Ref]bool;
var M_int_int: [Ref][int]int;
var M_int_Ref: [Ref][int]Ref;
var Length: [Ref]int;
//Allocates a new address
procedure Fresh(): (newRef: Ref){
  havoc newRef; assume (!Alloc[newRef]);
  Alloc[newRef] := true;
}
//Allocates an unbounded set of new addresses
procedure AllocUnboundedAddresses() {
   var oldAlloc: [Ref]bool;
   oldAlloc := Alloc; havoc Alloc;
   // ensure old allocated addresses remain
   // allocated
   assume (forall i:Ref :: oldAlloc[i] ==> Alloc[i]);
}
Figure 7. Boogie prelude
// Boogie code
const unique A, B, C: ContractName;

var n_A, m_B, a_C: [Ref]Ref;

// A’s constructor
proc A_Ctor(this: Ref, msg_sender: Ref) {
   // start of initialization
   // Make array/mapping vars distinct for n
   call tmp := Fresh(); assume (Length[tmp] == 0);

   // nested array initialization
   assume (forall i:int :: (Length[M_int_Ref[tmp][i]] == 0));
   assume (forall i:int :: !(Alloc[M_int_Ref[tmp][i]]));
   call AllocUnboundedAddresses();
   assume (forall i:int :: Alloc[M_int_Ref[tmp][i]]);
   assume (forall i, j:int :: (i == j ||
           M_int_Ref[tmp][i] != M_int_Ref[tmp][j]));
   assume (forall i:int, j:int ::
           M_int_int[M_int_Ref[tmp][i]][j] == 0);
   n_A[this] := tmp;
   // end of initialization

   var l := Length[M_int_Ref[n_A[this]][0]];
   M_int_int[M_int_Ref[n_A[this]][0]][l] := 22;
   Length[M_int_Ref[n_A[this]][0]] := l + 1;
}
proc F_A(this:Ref, msg_sender: Ref) returns (r:bool) {
   r := false;
}
// B’s constructor
proc B_Ctor(this: Ref, msg_sender: Ref) {
   call A_Ctor(this, msg_sender);

   // start of initialization
   // Make array/mapping vars distinct for m
   call tmp := Fresh(); assume (Length[tmp] == 0);
   // Initialize Integer mapping m
   assume (forall i:int :: M_int_int[tmp][i] == 0);
   m_B[this] := tmp;
   // end of initialization

   assume (Length[M_int_Ref[n_A[this]][0]] == 1);
   M_int_int[m_B[this]][0] := 11;
   M_int_int[m_B[this]][1] := 21;
   assert (M_int_int[m_B[this]][0] == 11);
   assert (M_int_int[M_int_Ref[n_A[this]][0]][0] == 22);
}
proc F_B(this:Ref, msg_sender: Ref) returns (r:bool) {
   r := true;
}
// C’s constructor
proc C_Ctor(this: Ref, msg_sender: Ref) {
   call x := Fresh();
   assume (DType[x] == B);
   call B_Ctor(x, this);
   a_C[this] := x;
   if (DType[a_C[this]] == B) { /* dyn dispatch */
      call y := F_B(a_C[this], this); /* msg.sender update */
   } else if (DType[a_C[this]] == A) {
      call y := F_A(a_C[this], this);
   }
   assert (y);
}
Figure 8. Translated Boogie code.
Figure 9. Example Solidity code and translated Boogie code that includes the program-independent prelude and program-specific translation.

Consider the example contracts in Figure 9 to illustrate the modeling.


  • First, for the state variable a in contract C of scalar type A, we introduce a map a_C of type [Ref]Ref; an access to a inside a contract instance is treated as a_C[this]. Here this is a parameter to the enclosing method that refers to the address of the current object or contract instance (it corresponds to Solidity this).

  • Second, since there is an array type in the program as the type of n in contract A, we introduce the maps M_int_Ref: [Ref][int]Ref and M_int_int:[Ref][int]int.

  • An array access in Solidity such as n[0][0] is translated as

    M_int_int[M_int_Ref[n_A[this]][0]][0]

    where n_A[this] looks up the value of n for this instance, M_int_Ref[n_A[this]][0] looks up the map at index 0, which is finally used to index into the M_int_int map at index 0 to obtain the scalar value. We define the formal translation of expressions later in this section.
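The memory model above can be exercised with a concrete Python analogy of the contracts A and B from Figure 9. This is a sketch under assumptions rather than the translator's output: the helpers `fresh` and `nested` are hypothetical, references are modelled as integers, and lazily allocating a distinct inner array per index stands in for the quantified assume statements that initialize nested arrays in the Boogie code. Python assert is used for both require and assert.

```python
from collections import defaultdict
import itertools

_next_ref = itertools.count(1)
Alloc = set()                                       # allocated references
Length = defaultdict(int)                           # [Ref]int, default 0
M_int_int = defaultdict(lambda: defaultdict(int))   # [Ref][int]int, default 0
M_int_Ref = defaultdict(dict)                       # [Ref][int]Ref
n_A, m_B = {}, {}                                   # state variables, indexed by `this`


def fresh():
    """Return a reference that has not been allocated before."""
    ref = next(_next_ref)
    Alloc.add(ref)
    return ref


def nested(outer, i):
    """Inner array stored at index i of `outer`, allocated on first use."""
    if i not in M_int_Ref[outer]:
        M_int_Ref[outer][i] = fresh()
    return M_int_Ref[outer][i]


def a_ctor(this):
    n_A[this] = fresh()
    inner = nested(n_A[this], 0)
    l = Length[inner]
    M_int_int[inner][l] = 22            # n[0].push(22)
    Length[inner] = l + 1


def b_ctor(this):
    a_ctor(this)                        # base constructor runs first
    m_B[this] = fresh()
    assert Length[nested(n_A[this], 0)] == 1            # require (n[0].length == 1)
    M_int_int[m_B[this]][0] = 11
    M_int_int[m_B[this]][1] = 21
    assert M_int_int[m_B[this]][0] == 11                # m[0] does not alias m[1]
    assert M_int_int[nested(n_A[this], 0)][0] == 22     # n[0][0] does not alias m[*]


b = fresh()
b_ctor(b)
print(M_int_int[M_int_Ref[n_A[b]][0]][0])
```

Because every call to `fresh` yields a new reference, m and n[0] occupy different rows of `M_int_int`, which is why the writes to m cannot disturb n[0][0].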

In addition to the set of globals, the set of variables in local scope consists of

  • A formal parameter x of type for each Solidity parameter x of type .

  • A formal parameter msg_sender of type Ref, and a formal parameter this of type Ref. These variables model the implicit Solidity parameters msg.sender and this respectively.

  • A local variable x of type for each Solidity local variable x of type .

5.2.3. Expressions

Figure 10. Simple subset of Solidity language.
Figure 11. Translation of Solidity expressions to Boogie.

Figure 10 describes the set of Solidity expressions (SolExprs). Most expressions are standard; the expression x.length refers to the length of a static or dynamically allocated array x. Figure 11 describes the translation of a Solidity expression (denoted by the function ) to an expression in Boogie. We highlight the non-trivial cases here. For a string literal, we use an uninterpreted function to map it to an integer; one can think of this as a hash of the string. For a state variable x in contract , the translation indexes into with the this parameter. The translation of array access indexes into a global array after recursively translating x and se. The choice of array depends on the Solidity type of . If the Solidity type is an array then the resulting expression has type in Boogie to represent the address of such an array; we index into the array M_t_Ref. Otherwise, we index into the array M_t_t2 where t2 is the translated type of .
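The recursive structure of the array-access case can be sketched in Python. This is a simplified illustration under assumptions, not the rules of Figure 11: the AST classes (`MapType`, `StateVar`, `IntLit`, `Index`) and the helpers are hypothetical, only state variables, integer literals and array accesses are handled, and the type mapping covers just the cases needed for the running example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MapType:
    """Unified mapping(key => value) / value[key] type."""
    key: object
    value: object


@dataclass(frozen=True)
class StateVar:
    name: str
    contract: str
    type: object


@dataclass(frozen=True)
class IntLit:
    value: int


@dataclass(frozen=True)
class Index:
    base: object
    index: object


def boogie_type(t):
    """Assumed type mapping: arrays and other references become Ref."""
    if isinstance(t, MapType):
        return "Ref"
    return "int" if t in ("int", "string") else "Ref"


def sol_type(e):
    """Solidity type of an expression in this small subset."""
    if isinstance(e, IntLit):
        return "int"
    if isinstance(e, StateVar):
        return e.type
    return sol_type(e.base).value   # Index: element type of the base


def translate(e):
    """Boogie text for a Solidity expression in this small subset."""
    if isinstance(e, IntLit):
        return str(e.value)
    if isinstance(e, StateVar):
        return f"{e.name}_{e.contract}[this]"
    base_type = sol_type(e.base)
    # Pick M_<key>_Ref for nested arrays, M_<key>_<scalar> otherwise.
    memory = f"M_{boogie_type(base_type.key)}_{boogie_type(base_type.value)}"
    return f"{memory}[{translate(e.base)}][{translate(e.index)}]"


n = StateVar("n", "A", MapType("int", MapType("int", "int")))
print(translate(Index(Index(n, IntLit(0)), IntLit(0))))
```

For the access n[0][0] in contract A, the sketch produces the same nested lookup as the translation shown earlier for Figure 9.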

5.2.4. Statements

Figure 12. Translation of Solidity statements to Boogie. We do not translate statements that involve a deep copy.

Figure 10 also describes the set of statements in Solidity that we translate to Boogie. The statement require(se) aborts execution when executed in a state where se is false, and can be used for adding preconditions to functions. In contrast, assert(se) is used to terminate execution when an internal invariant se does not hold. We distinguish between two types of procedure calls: (a) an internal call, which invokes a method Foo within the same contract, and (b) an external call, which invokes a method on a contract instance pointed to by y (which may include this). For an array variable x, x.push(se) adds an element with value se to the end of the array and increments x.length. We support two types of dynamic allocation: (i) creation of a new contract instance of type through and (ii) allocating an array of type of size se through . Finally, we support the implicit allocation of a nested mapping of type using .
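As an illustration of how some of these statements can be lowered, the following Python sketch emits Boogie text for require, assert and push. It is a sketch under assumptions: the helper names are hypothetical, the arguments are taken to be Boogie expressions that have already been translated, and the output shapes are copied from the hand-written example in Figure 8 (require becomes assume, and push becomes a read of Length, a write to the element map, and an increment).

```python
def translate_require(cond):
    """require(se): executions where se is false are blocked."""
    return [f"assume ({cond});"]


def translate_assert(cond):
    """assert(se): executions where se is false fail."""
    return [f"assert ({cond});"]


def translate_push(array_ref, value, element_map="M_int_int", tmp="l"):
    """x.push(se) for an array whose address is the Boogie expression `array_ref`."""
    return [
        f"{tmp} := Length[{array_ref}];",
        f"{element_map}[{array_ref}][{tmp}] := {value};",
        f"Length[{array_ref}] := {tmp} + 1;",
    ]


# n[0].push(22) inside contract A, followed by the require in B's constructor.
inner = "M_int_Ref[n_A[this]][0]"
lines = translate_push(inner, "22") + translate_require(f"Length[{inner}] == 1")
print("\n".join(lines))
```

The element map and the temporary name are parameters because they depend on the element type of the array and on the names already in scope.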

Figure 13. Semantics of Solidity array assignment LHS = RHS.
Assignments.

Figure 12 provides the translation of Solidity statements to Boogie statements using the function . We translate require (respectively assert) using assume (respectively, assert), given that these statements terminate execution when the argument is false and we do not allow exceptions to be caught in our subset of Solidity. An assignment in Solidity can either be a simple top-level assignment (of a value or address) or a deep copy (in the case of certain arrays). Figure 13 shows the cases where an array assignment in Solidity is treated as assigning the reference (Ref) or performing a deep copy (Deep) (sol, 2018). Our translator currently does not handle programs that contain a deep copy; this is denoted by the when we encounter the need for a deep copy. Translation of an array assignment