SMT-Friendly Formalization of the Solidity Memory Model

01/09/2020 ∙ by Ákos Hajdu, et al. ∙ Budapest University of Technology and Economics ∙ SRI International

Solidity is the dominant programming language for Ethereum smart contracts. This paper presents a high-level formalization of the Solidity language with a focus on the memory model. The presented formalization covers all features of the language related to managing state and memory. In addition, the formalization we provide is effective: all but a few features can be encoded in the quantifier-free fragment of standard SMT theories. This enables precise and efficient reasoning about the state of smart contracts written in Solidity. The formalization is implemented in the solc-verify verifier and we provide an extensive set of tests that covers the breadth of the required semantics. We also provide an evaluation on the test set that validates the semantics and shows the novelty of the approach compared to other Solidity-level contract analysis tools.

1 Introduction

Ethereum [30] is a public blockchain platform that provides a novel computing paradigm for developing decentralized applications. Ethereum allows the deployment of arbitrary programs (termed smart contracts [29]) that operate over the blockchain state, and allows the public to interact with the contracts. It is currently the most popular public blockchain with smart contract functionality. While the nodes participating in the Ethereum network operate a low-level, stack-based virtual machine (EVM) that executes the compiled smart contracts, the contracts themselves are mostly written in a high-level, contract-oriented programming language called Solidity [28].

Even though smart contracts are generally short, they are no less prone to errors than software in general. In the Ethereum context, any flaws in the contract code come with potentially devastating financial consequences (such as the infamous DAO exploit [18]). This has inspired a great interest in applying formal verification techniques to Ethereum smart contracts (see e.g., [4] or [14] for surveys). In order to apply formal verification of any kind, be it static analysis or model checking, the first step is to formalize the semantics of the programming language that the smart contracts are written in. Such semantics should not remain a mere exercise in formalization; preferably, they should be developed so that they result in precise and automated verification tools.

Early approaches to verification of Ethereum smart contracts focused mostly on formalizing the low-level virtual machine precisely (see, e.g., [11, 20, 22, 23, 2]). However, the unnecessary details of the EVM execution model make it difficult to reason about high-level functional properties of contracts (as they were written by developers) in an effective and automated way. For Solidity-level properties of smart contracts, Solidity-level semantics are preferred. While some aspects of Solidity have been studied and formalized [24, 10, 15, 31], the semantics of the Solidity memory model still lacks a detailed and precise formalization that also enables automation.

The memory model of Solidity has various unusual and non-trivial behaviors, providing a fertile ground for potential bugs. Smart contracts have access to two classes of data storage: a permanent storage that is a part of the global blockchain state, and a transient local memory used when executing transactions. While the local memory uses a standard heap of entities with references, the permanent storage has pure value semantics (although pointers to storage can be declared locally). This memory model that combines both value and reference semantics, with all interactions between the two, poses some interesting challenges but also offers great opportunities for automation. For example, the value semantics of storage ensures non-aliasing of storage data. This can, if supported by an appropriate encoding of the semantics, potentially improve both the precision and effectiveness of reasoning about contract storage.

This paper provides a formalization of the Solidity semantics that covers all Solidity features related to managing contract storage and memory. A major contribution of our formalization is that all but a few of its elements can be encoded in the quantifier-free fragment of standard SMT theories. Additionally, our formalization captures the value semantics of storage with implicit non-aliasing information of storage entities. This allows precise and effective verification of Solidity smart contracts using modern SMT solvers. The formalization is implemented in the open-source solc-verify tool [21], which is a modular verifier for Solidity based on SMT solvers. We validate the formalization and demonstrate its effectiveness by evaluating it on a comprehensive set of tests that exercise the memory model. We show that our formalization significantly improves the precision and soundness compared to existing Solidity-level verifiers, while remarkably outperforming low-level EVM-based tools in terms of efficiency.

2 Background

2.1 Ethereum

Ethereum [30, 3] is a generic blockchain-based distributed computing platform. The Ethereum ledger is a storage layer for a database of accounts (identified by addresses) and the data associated with the accounts. Every account has an associated balance in Ether (the native cryptocurrency of Ethereum). In addition, an account can also be associated with the executable bytecode of a contract and the contract state.

Although Ethereum contracts are deployed to the blockchain in the form of the bytecode of the Ethereum Virtual Machine (EVM) [30], they are generally written in a high-level programming language called Solidity [28] and then compiled to EVM bytecode. After deployment, the contract is publicly accessible and its code cannot be modified. An external user, or another contract, can interact with a contract through its API by invoking its public functions. This can be done by issuing a transaction that encodes the function to be called with its arguments, and contains the contract’s address as the recipient. The Ethereum network then executes the transaction by running the contract code in the context of the contract instance.

A contract instance has access to two different kinds of memory during its lifetime: contract storage and memory. (There is an additional data location named calldata that behaves the same as memory, but is used to store parameters of external functions. For simplicity, we omit it in this paper.) Contract storage is a dedicated data store for a contract to store its persistent state. At the level of the EVM, it is an array of 256-bit storage slots stored on the blockchain. Contract data that fits into a slot, or can be sliced into a fixed number of slots, is usually allocated starting from slot 0. More complex data types that do not fit into a fixed number of slots, such as mappings, or dynamic arrays, are not supported directly by the EVM. Instead, they are implemented by the Solidity compiler using storage as a hash table where the structured data is distributed in a deterministic collision-free manner. Contract memory is used during the execution of a transaction on the contract, and is deleted after the transaction finishes. This is where function parameters, return values and temporary data can be allocated and stored.

2.2 Solidity

Solidity [28] is the high-level programming language supporting the development of Ethereum smart contracts. It is a full-fledged object-oriented programming language with many features focusing on enabling rapid development of Ethereum smart contracts. The focus of this paper is the semantics of the Solidity memory model: the Solidity view of contract storage and memory, and the operations that can modify it. Thus, we restrict the presentation to a generous fragment of Solidity that is relevant for discussing and formalizing the Solidity memory model. We omit parts of Solidity that are not relevant to the memory model (e.g., inheritance, loops, blockchain-specific members). We also omit low-level, unsafe features of Solidity that can break the Solidity memory model abstractions (e.g., assembly and delegatecall). An example contract that illustrates relevant features is shown in Figure 1, and the abstract syntax of the targeted fragment is presented in Figure 2.

contract DataStorage {
  struct Record {
    bool set;
    int[] data;
  }
  mapping(address=>Record) private records;
  function append(address at, int d) public {
    Record storage r = records[at];
    r.set = true;
    r.data.push(d);
  }
  function isset(Record storage r) internal view returns (bool s) {
    s = r.set;
  }
  function get(address at) public view returns (int[] memory ret) {
    require(isset(records[at]));
    ret = records[at].data;
  }
}
Figure 1: An example contract illustrating commonly used features of the Solidity memory model. The contract keeps an association between addresses and data and allows users to query and append to data.
TypeName  ::=                                        Type
              address                                Address
            | int | uint                             Signed/unsigned integer
            | bool                                   Boolean
            | mapping(TypeName => TypeName)          Mapping
            | TypeName[]                             Dynamically-sized array
            | TypeName[n]                            Fixed-size array
            | StructName                             Struct name
DataLoc   ::= storage | memory                       Data location
expr      ::=                                        Expression
              id                                     Identifier
            | expr.id                                Member access
            | expr[expr]                             Index access
            | expr ? expr : expr                     Conditional
            | new TypeName[](expr)                   New memory array
            | StructName(expr*)                      New memory struct
stmt      ::=                                        Statement
              TypeName DataLoc? id [= expr];         Local variable declaration
            | expr* = expr*;                         Assignment (tuples)
            | expr.push(expr);                       Push
            | expr.pop();                            Pop
            | delete expr;                           Delete
StructMem ::= TypeName id;                           Struct member
StructDef ::= struct StructName { StructMem* }       Struct definition
StateVar  ::= TypeName id;                           State variable definition
FunPar    ::= TypeName DataLoc? id                   Function parameter
Fun       ::= function id(FunPar*)                   Function definition
              [returns (FunPar*)] { stmt* }
Constr    ::= constructor(FunPar*) { stmt* }         Constructor definition
Contract  ::= contract id                            Contract definition
              { StructDef* StateVar* Constr? Fun* }
Figure 2: Syntax of the targeted Solidity fragment.

Contracts.

Solidity contracts are similar to classes in object-oriented programming. A contract can define any additional types needed, followed by the declaration of the state variables and contract functions, including an optional single constructor function. The contract’s state variables define the only persistent data that the contract instance stores on the blockchain. The constructor function is only used once, when a new contract instance is deployed to the blockchain. Other public contract functions can be invoked arbitrarily by external users through an Ethereum transaction that encodes the function call data and designates the contract instance as the recipient of the transaction.

Example 1

The contract DataStorage in Figure 1 defines a struct type Record. Then it defines the contract storage as a single state variable records. Finally, three contract functions are defined: append(), isset(), and get(). Note that a constructor is not defined and, in this case, a default constructor is provided to initialize the contract state to default values.

Solidity supports further concepts from object-oriented programming, such as inheritance, function visibility, and overloading. However, as these are not relevant for the formalization of the memory model we omit them to simplify our presentation.

Types.

Solidity is statically typed and provides two classes of types: value types and reference types. Value types include elementary types such as addresses, integers, and Booleans that are always passed by value. Reference types, on the other hand, are passed by reference and include structs, arrays and mappings. A struct consists of a fixed number of members. An array is either fixed-size or dynamically-sized and besides the elements of the base type, it also includes a length field holding the number of elements. A mapping is an associative array mapping keys to values. The important caveat is that the table does not actually store the keys so it is not possible to check if a key is defined in the map.

Example 2

The contract in Figure 1 uses the following types. The records variable is a mapping from addresses to Record structures which, in turn, consist of a Boolean value and a dynamically-sized integer array. It is a common practice to define a struct with a Boolean member (set) to indicate that a mapping value has been set. This is because Solidity mappings do not store keys: any key can be queried, returning a default value (false for Booleans) if no value was associated previously.
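The behavior described in Example 2 can be mimicked with a small Python model. This is only a sketch: the class name SolidityMapping and its dict-based representation are assumptions made for illustration, and they say nothing about how the EVM actually lays out mappings.

```python
class SolidityMapping:
    """Toy model of a Solidity mapping with value-type entries (hypothetical)."""

    def __init__(self, default):
        self._default = default   # default value of the value type (0, False, ...)
        self._slots = {}          # holds only entries that were explicitly written

    def __getitem__(self, key):
        # Every key can be read; keys never written yield the default value.
        return self._slots.get(key, self._default)

    def __setitem__(self, key, value):
        self._slots[key] = value


balances = SolidityMapping(default=0)
balances["alice"] = 0             # an explicitly stored zero
# "bob" was never written, yet it reads the same value as "alice".
same = balances["alice"] == balances["bob"]

is_set = SolidityMapping(default=False)
is_set["alice"] = True            # a separate flag distinguishes written entries
```

Since a read cannot distinguish a stored zero from a missing entry, a separate Boolean flag (like the set member of Record) is the usual way to record that a value was written.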

Data locations for reference types.

Data of reference types resides in a data location that is either storage or memory. Storage is the persistent store used for state variables of the contract. In contrast, memory is used during execution of a transaction to store function parameters, return values and local variables, and it is deleted after the transaction finishes.

Semantics of reference types differ fundamentally depending on the data location that they are stored in. Layout of data in the memory data location resembles the memory model common in Java-like programming languages: there is a memory heap where reference types are allocated and any entity in the heap can contain values of value types, and references to other memory entities. In contrast, the storage data location treats and stores all entities, including those of reference types, as values with no references involved. Mixing storage and memory is not possible: the data location of a reference type is propagated to its elements and members. This means that storage entities cannot have references to memory entities, and memory entities cannot have reference types as values. Storage of a contract can be viewed as a single value with no aliasing possible.

contract C {
  struct T {
    int z;
  }
  struct S {
    int x;
    T[] ta;
  }
  T t;
  S s;
  S[] sa;
}
(a)

[Diagram: abstract representation of the state variables t, s, and sa as nested values]
(b)
function f(S memory sm1) public {
  T memory tm = sm1.ta[1];
  S memory sm2 = S(0, sm1.ta);
}

[Diagram: a possible memory layout in which sm1, tm, and sm2 reference S and T entities on the heap]
(c)
Figure 3: An example illustrating reference types (structs and arrays) and the difference in their layouts in storage and memory: (a) shows a contract defining types and state variables; (b) shows an abstract representation of the contract storage as values; and, in contrast, (c) shows a function using the memory data location and a possible layout of the data in memory.
Example 3

Consider the contract C defined in Figure 3(a). The contract defines two reference struct types S and T, and declares state variables s, t, and sa. These variables are maintained in storage during the contract lifetime and they are represented as values with no references within. A potential value of these variables is shown in Figure 3(b). On the other hand, the top of Figure 3(c) shows a function with three variables in the memory data location, one as the argument to the function, and two defined within the function. Because they are in memory, these variables are references to heap locations. Any data of reference types, stored within the structures and arrays, is also a reference and can be reallocated or assigned to point to an existing heap location. This means that the layout of the data can contain arbitrary graphs with arbitrary aliasing. A potential layout of these variables is shown at the bottom of Figure 3(c).
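The contrast in Example 3 can be illustrated with a rough Python analogy. Python objects are references, which resembles the memory data location, while an explicit deep copy stands in for the value semantics of storage. The names memory_example and storage_assign are hypothetical helpers, not part of the formalization.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class T:
    z: int = 0


@dataclass
class S:
    x: int = 0
    ta: list = field(default_factory=list)


def memory_example():
    """Mirrors function f: memory variables are references and may alias."""
    sm1 = S(1, [T(10), T(20)])
    tm = sm1.ta[1]          # tm refers to the same T entity as sm1.ta[1]
    sm2 = S(0, sm1.ta)      # sm2 shares the array referenced by sm1.ta
    tm.z = 99               # visible through sm1 and sm2 as well
    return sm1, tm, sm2


def storage_assign(source):
    """Assumed model of storage-to-storage assignment: a full deep copy."""
    return copy.deepcopy(source)
```

After memory_example(), the update through tm is visible via both sm1 and sm2, whereas a value produced by storage_assign shares nothing with its source.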

Functions.

Functions are the Solidity equivalent of methods in classes. They receive data as arguments, can perform computations, manipulate state variables and interact with other Ethereum accounts. Besides accessing the storage of the contract through its state variables, functions can also define local variables, including function arguments and return values. Variables of value type are stored as values on a stack. Variables of reference types must be explicitly declared with a data location, and are always pointers that point to an entity in that data location (storage or memory). A pointer to storage is called a local storage pointer. As the storage is not memory in the usual sense, but a value instead, one can see the storage pointer as encoding a path to one reference type entity in the storage.

Example 4

Consider the example in Figure 1. The local variable r in function append() points to the struct residing at index at in the state variable records. In contrast, the return value ret of function get() is a pointer to an integer array in memory.

Statements and expressions.

Solidity includes usual programming statements and control structures. To keep the presentation simple, we focus only on the statements that are related to the formalization of the memory model: local variable declarations, assignments, array manipulation, and the delete statement. Solidity expressions relevant for the memory model are identifiers, member and array accesses, conditionals and allocation of new arrays and structs in memory.

If a value is not provided, local variable declarations automatically initialize the variable to a default value. For reference types in memory, this allocates new entities on the heap and performs recursive initialization of its members. For reference types in storage, the local storage pointers must always be explicitly initialized to point to a storage member. This ensures that no pointer is ever “null”. Value types are initialized to their simple default value (0, false). Behavior of assignment in Solidity is complex and depends on the data location of its arguments (e.g., deep copy or pointer assignment). Dynamically-sized storage arrays can be extended by pushing an element to their end, or can be shrunk by popping. The delete statement assigns the default value (recursively for reference types) to a given entity based on its type.

Example 5

The assignment r.set = true in the append() function is a simple value assignment. On the other hand, ret = records[at].data in the get() function allocates a new array on the heap and performs a deep copy of data from storage to memory.
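A loose Python model of the DataStorage contract makes the two kinds of assignment in Example 5 concrete. It is a sketch under simplifying assumptions: the storage mapping is a defaultdict, require is modeled by raising ValueError, and the storage-to-memory copy is an explicit deepcopy.

```python
import copy
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Record:
    is_set: bool = False
    data: list = field(default_factory=list)


class DataStorage:
    def __init__(self):
        # Every address maps to a default-initialized Record.
        self.records = defaultdict(Record)

    def append(self, at, d):
        r = self.records[at]   # plays the role of the local storage pointer r
        r.is_set = True        # simple value assignment
        r.data.append(d)       # push to the storage array

    def get(self, at):
        if not self.records[at].is_set:
            raise ValueError("require(isset(records[at])) failed")
        # Storage-to-memory assignment: the caller receives a fresh deep copy.
        return copy.deepcopy(self.records[at].data)
```

Modifying the list returned by get() leaves the stored data untouched, which is the observable effect of the deep copy.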

2.3 SMT-Based Programs

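TypeName    ::=                                      Type
                int                                  SMT integer
              | bool                                 Boolean
              | [TypeName]TypeName                   SMT array
              | DataTypeName                         SMT datatype
DataTypeDef ::= DataTypeName(id : TypeName, ...)     Datatype definition
expr        ::=                                      Expression
                id                                   Identifier
              | expr[expr]                           Array read
              | expr[expr ← expr]                    Array write
              | DataTypeName(expr, ...)              Datatype constructor
              | expr.id                              Member selector
              | ite(expr, expr, expr)                Conditional
              | expr + expr | expr - expr | ...      Arithmetic expression
VarDecl     ::= id : TypeName                        Variable declaration
stmt        ::=                                      Statement
                id := expr                           Assignment
              | if expr then stmt* else stmt*        If-then-else
              | assume(expr)                         Assumption
Program     ::= DataTypeDef* VarDecl* stmt*          Program definition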
Figure 4: Syntax of SMT-based programs.

We formalize the semantics of the Solidity fragment by translating to a simple programming language that uses SMT semantics [9, 12] for the types and data. The syntax of this language is shown in Figure 4. The syntax is purposefully minimal and generic, so that it can be expressed in any modern SMT-based verification tool (e.g., Boogie [5] and Why3 [19]). (Our current implementation is based on Boogie, but we have plans to incorporate Why3 as an alternate backend.)

The types of SMT-based programs are the SMT types: simple value types such as Booleans and mathematical integers, and structured types such as arrays [26, 17] and inductive datatypes [8]. The expressions of the language are standard expressions from SMT such as identifiers, array reads and writes, datatype constructors, member selectors, conditionals and basic arithmetic [7]. All variables are declared at the beginning of a program. The statements of the language are limited to assignments, the if-then-else statement, and the assumption statement.

SMT-based programs are a good intermediate verification representation for modeling semantics. For one, they have clearly defined mathematical semantics with no ambiguities. Furthermore, any property of the SMT program can be checked easily with SMT solvers: the program can be translated directly to an SMT formula by transforming the program into static single assignment (SSA) form.

Note that the syntax requires the left hand side of an assignment to be an identifier. However, to make our presentation simpler, we will allow array read, member access and conditional expressions (and their combination) as LHS. Such constructs can be eliminated iteratively in the following way until only identifiers appear as LHS in assignments.

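  • $a[i] := v$ is equivalent to $a := a[i \leftarrow v]$.

  • $d.m_i := v$ is equivalent to $d := D(d.m_1, \ldots, d.m_{i-1}, v, d.m_{i+1}, \ldots, d.m_n)$, where $D$ is the constructor of a datatype with members $m_1, \ldots, m_n$.

  • $\mathit{ite}(c, t, f) := v$ is equivalent to $t := \mathit{ite}(c, v, t)$ and $f := \mathit{ite}(c, f, v)$.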
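The three elimination rules can be sketched in Python with functional (non-mutating) updates. The helpers store, ite, the two-member datatype D, and the assign_* functions are hypothetical names chosen for this illustration; SMT arrays are approximated by dictionaries.

```python
from typing import NamedTuple


def store(arr, idx, val):
    """Functional array write: returns a new array and leaves arr unchanged."""
    updated = dict(arr)
    updated[idx] = val
    return updated


def ite(cond, then_val, else_val):
    return then_val if cond else else_val


class D(NamedTuple):
    """A datatype with two members, m1 and m2."""
    m1: int
    m2: int


def assign_index(a, i, v):
    # a[i] := v   is rewritten to   a := a[i <- v]
    return store(a, i, v)


def assign_member_m2(d, v):
    # d.m2 := v   is rewritten to   d := D(d.m1, v)
    return D(d.m1, v)


def assign_conditional(c, t, f, v):
    # ite(c, t, f) := v   is rewritten to   t := ite(c, v, t); f := ite(c, f, v)
    return ite(c, v, t), ite(c, f, v)
```

Each helper returns the new value of the identifier (or pair of identifiers) that ends up on the left-hand side after elimination.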

3 Formalization

In this section we present our formalization of the Solidity semantics through a translation that maps Solidity elements to constructs in the SMT-based language. The formalization is described in separate subsections for types, contracts, state variables, functions, statements, and expressions.

3.1 Types

We use $\mathcal{T}(\cdot)$ to denote the function that maps a Solidity type to an SMT type. This function is used in the translation of contract elements and can, as a side effect, introduce datatype definitions and variable declarations. This is denoted with [decl] in the result of the function. To simplify the presentation, we assume that such side effects are automatically added to the preamble of the SMT program. Furthermore, we assume that declarations with the same name are only added once. We use $\mathit{type}(\mathit{expr})$ to denote the original (Solidity) type of an expression (to be used later in the formalization). The definition of $\mathcal{T}(\cdot)$ is shown in Figure 5.

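$\mathcal{T}(\texttt{bool}) = \mathit{bool}$
$\mathcal{T}(\texttt{address}) = \mathcal{T}(\texttt{int}) = \mathcal{T}(\texttt{uint}) = \mathit{int}$
$\mathcal{T}(\texttt{mapping}(K \Rightarrow V)\ \texttt{storage}) = [\mathcal{T}(K)]\,\mathcal{T}(V)$
$\mathcal{T}(\texttt{mapping}(K \Rightarrow V)\ \texttt{storptr}) = [\mathit{int}]\,\mathit{int}$
$\mathcal{T}(T[n]\ \texttt{storage}) = \mathcal{T}(T[\,]\ \texttt{storage}) = \mathit{StorArr}_T$ with $[\mathit{StorArr}_T(\mathit{arr}: [\mathit{int}]\,\mathcal{T}(T),\ \mathit{length}: \mathit{int})]$
$\mathcal{T}(T[n]\ \texttt{storptr}) = \mathcal{T}(T[\,]\ \texttt{storptr}) = [\mathit{int}]\,\mathit{int}$
$\mathcal{T}(T[n]\ \texttt{memory}) = \mathcal{T}(T[\,]\ \texttt{memory}) = \mathit{int}$ with $[\mathit{MemArr}_T(\mathit{arr}: [\mathit{int}]\,\mathcal{T}(T),\ \mathit{length}: \mathit{int})]$ and $[\mathit{arrheap}_T: [\mathit{int}]\,\mathit{MemArr}_T]$
$\mathcal{T}(\texttt{struct}\ S\ \texttt{storage}) = \mathit{StorStruct}_S$ with $[\mathit{StorStruct}_S(\ldots, m_i: \mathcal{T}(S_i), \ldots)]$
$\mathcal{T}(\texttt{struct}\ S\ \texttt{storptr}) = [\mathit{int}]\,\mathit{int}$
$\mathcal{T}(\texttt{struct}\ S\ \texttt{memory}) = \mathit{int}$ with $[\mathit{MemStruct}_S(\ldots, m_i: \mathcal{T}(S_i), \ldots)]$ and $[\mathit{structheap}_S: [\mathit{int}]\,\mathit{MemStruct}_S]$
Figure 5: Formalization of Solidity types. Members of struct $S$ are denoted as $m_i$ with types $S_i$.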

Value types.

Booleans are mapped to SMT Booleans while other value types are mapped to SMT integers. Addresses are also mapped to SMT integers so that arithmetic comparison and conversions between integers and addresses are supported. For simplicity of presentation we map all integers (signed or unsigned) to SMT integers. Note that this does not capture the precise machine integer semantics with overflows, but this is not relevant from the perspective of the memory model. Precise computation can be provided by relying on SMT bitvectors or by encoding computations with modular arithmetic (see, e.g., [21]).
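As a rough illustration of the modular-arithmetic option mentioned above, the following Python sketch wraps the result of an unbounded integer computation back into the range of a machine integer. The function names wrap_uint and wrap_int and the default width of 256 bits are assumptions for this example; the exact encoding used by a verifier may differ.

```python
def wrap_uint(value, bits=256):
    """Interpret an unbounded integer as an unsigned machine integer."""
    return value % (1 << bits)


def wrap_int(value, bits=256):
    """Interpret an unbounded integer as a two's complement signed integer."""
    modulus = 1 << bits
    wrapped = value % modulus
    # Values in the upper half of the range represent negative numbers.
    if wrapped >= (1 << (bits - 1)):
        wrapped -= modulus
    return wrapped


def add_uint8(a, b):
    """Example: 8-bit unsigned addition with wrap-around on overflow."""
    return wrap_uint(a + b, bits=8)
```

For instance, add_uint8(255, 1) wraps around to 0, while the same sum over mathematical integers is 256.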

Reference types.

The Solidity syntax does not always require the data location for variable and parameter declarations. However, for reference types it is always required (enforced by the compiler), except for state variables that are always implicitly storage. Hence, in our formalization, we assume that the data location of reference types is a part of the type. As discussed before, memory entities are always accessed through pointers. However, for storage we distinguish whether it is the storage reference itself (e.g., state variable) or a storage pointer (e.g., local variable, function parameter). We denote the former with storage and the latter with storptr in the type name. Our modeling of reference types relies on the generalized theory of arrays [17] and the theory of inductive data-types [8], both of which are supported by modern SMT solvers (e.g., cvc4 [6] and z3 [16]).

Mappings.

Solidity mappings are only allowed in the storage data location. Mappings are implemented as a hash table that uses the contract storage as a large table for elements, and uses a collision resistant cryptographic hash function to map keys to storage slots. However, reasoning about such implementation details becomes infeasible in practice. We abstract away the implementation details, with the assumption that no hash collisions occur, and formalize Solidity mappings simply as SMT arrays.

Arrays.

Similarly to mappings, we abstract away the implementation details of Solidity arrays and model them with the SMT theory of arrays and inductive datatypes. Both fixed- and dynamically-sized arrays are translated using the same SMT type and we only treat them differently in the context of statements and expressions. To ensure that array size is properly modeled we keep track of it (length) in the datatype along with the actual elements (arr).

For storage array types with base type $T$, we introduce an SMT datatype $\mathit{StorArr}_T$ with a constructor that takes two arguments: an inner SMT array (arr) associating integer indexes and the recursively translated base type ($\mathcal{T}(T)$), and an integer length. The advantage of this encoding is that the value semantics of storage data is provided by construction: each array element is a separate entity (no aliasing) and assigning storage arrays in SMT makes a deep copy. This encoding also generalizes if the base type is a reference type.

For memory array types with base type $T$, we introduce a separate datatype $\mathit{MemArr}_T$ (side effect). However, memory arrays are stored with pointer values. Therefore the memory array type is mapped to integers, and a heap ($\mathit{arrheap}_T$) is introduced to associate integers (pointers) with the actual memory array datatypes. Note that mixing data locations within a reference type is not possible: the element type of the array has the same data location as the array itself. Therefore, it is enough to introduce two datatypes per element type $T$: one for storage and one for memory. In the former case the element type will have value semantics whereas in the latter case elements will be stored as pointers.
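The difference between the two array encodings can be sketched in Python. Storage arrays are modeled as immutable values, and memory arrays as integer pointers into a heap. The class and function names (StorArr, MemArr, Heap, stor_write) and the use of tuples for the inner array are assumptions of this sketch, not the encoding itself.

```python
from typing import NamedTuple


class StorArr(NamedTuple):
    """Storage array as a value: inner array plus length."""
    arr: tuple
    length: int


def stor_write(a, index, value):
    """Functional update: returns a new storage array value."""
    return StorArr(a.arr[:index] + (value,) + a.arr[index + 1:], a.length)


class MemArr(NamedTuple):
    """Memory array datatype; it is reached only through a heap pointer."""
    arr: tuple
    length: int


class Heap:
    """Associates integer pointers with memory array datatypes."""

    def __init__(self):
        self.cells = {}
        self.next_ptr = 1

    def alloc(self, length):
        ptr = self.next_ptr
        self.next_ptr += 1
        self.cells[ptr] = MemArr((0,) * length, length)  # default-initialized
        return ptr

    def write(self, ptr, index, value):
        old = self.cells[ptr]
        self.cells[ptr] = MemArr(old.arr[:index] + (value,) + old.arr[index + 1:], old.length)

    def read(self, ptr, index):
        return self.cells[ptr].arr[index]
```

Two memory variables holding the same pointer observe each other's writes through the heap, while updating a copy of a storage array value never affects the original.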

Structs.

For each storage struct type $S$ the translation introduces an inductive datatype $\mathit{StorStruct}_S$. This datatype includes a constructor for each struct member with types mapped recursively. Similarly to arrays, this ensures the value semantics of storage such as non-aliasing and deep copy assignments. For each memory struct we also introduce a datatype $\mathit{MemStruct}_S$ and a constructor for each member. (Mappings in Solidity cannot reside in memory. If a struct defines a mapping member and it is stored in memory, the mapping is simply inaccessible. Such members could be omitted from the constructor.) However, the memory struct type itself is mapped to integers (pointer) and a heap ($\mathit{structheap}_S$) is introduced to associate the pointers with the actual memory struct datatypes. Note that if a memory struct has members with reference types, they are also pointers, which is ensured recursively by our encoding.

3.2 Local Storage Pointers

An interesting aspect of the storage data location is that, although the stored data has value semantics, it is still possible to define pointers to an entity in storage within a local context, e.g., with function parameters or local variables. These pointers are called local storage pointers.

Example 6

In the append() function of Figure 1 the variable r is defined to be a convenience pointer into the storage map records[at]. Similarly, the isset() function takes a storage pointer to a Record entity in storage as an argument.

Since our formalization uses SMT datatypes to encode the contract data in storage, it is not possible to encode these pointers directly. A partial solution would be to substitute each occurrence of the local pointer with the expression that is assigned to it when it was defined. However, this approach is too simplistic and has several limitations. Local storage pointers can be reassigned, or assigned conditionally. Moreover, at a given occurrence, it might not be known at compile time which definition should be used. Local storage pointers can also be passed in as function arguments: within the function they can point to different storage entities for different calls.

contract C {
  struct T {
    int z;
  }
  struct S {
    int x;
    T   t;
    T[] ts;
  }
  T   t1;
  S   s1;
  S[] ss;
}
(a)

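C
  t1 (0): T
  s1 (1): S
    t (0): T
    ts (1): T[]
      (i): T
  ss (2): S[]
    (i): S
      t (0): T
      ts (1): T[]
        (i): T
(b)
ite(ptr[0] = 0,
  t1,
  ite(ptr[0] = 1,
    ite(ptr[1] = 0,
      s1.t,
      s1.ts[ptr[2]]),
    ite(ptr[2] = 0,
      ss[ptr[1]].t,
      ss[ptr[1]].ts[ptr[3]])))
(c)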
Figure 6: An example of packing and unpacking: (a) contract with struct definitions and state variables; (b) the storage tree of the contract for type T; and (c) the unpacking expression for storage pointers of type T.

We propose an approach to encode local storage pointers while overcoming these limitations. Our encoding relies on the fact that storage data of a contract can be viewed as a finite depth tree of values. As such, each element of the stored data can be uniquely identified by a finite path leading to it. (Solidity does support a limited form of recursive data types. Such types could make the storage a tree of potentially arbitrary depth. We chose not to support such types as recursion is non-existent in Solidity types used in practice. In addition, we expect that such types will be disallowed in future versions of Solidity.)

Example 7

Consider the contract C in Figure 6(a). The contract defines structs T and S, and state variables of these types. If we are interested in all storage entities of type T, we can consider the sub-tree of the contract storage tree that has leaves of type T, such as the one in Figure 6(b). The root of the tree is the contract itself, with indexed sub-nodes for state variables, in order. For nodes of struct type there are indexed sub-nodes leading to its members, in order. For each node of array type there is a sub-node for the base type. Every pointer to a storage T entity can be identified by a path in this tree: by fixing the index to each state variable, member, and array index, as seen in brackets in Figure 6(b), such paths can be encoded as an array of integers. For example, the state variable t1 can be represented as array $[0]$, the member s1.t as array $[1, 0]$, and ss[8].ts[5] as $[2, 8, 1, 5]$.

This idea allows us to encode storage pointer types (pointing to arrays, structs or mappings) simply as SMT arrays ($[\mathit{int}]\,\mathit{int}$). The novelty of our approach is that storage pointers can be encoded and passed around, while maintaining the value semantics of storage data, without the need for quantifiers to describe non-aliasing. To encode storage pointers, we need to address initialization of storage pointers, and dereference of storage pointers, while assignment is simply an assignment of array values. When a storage pointer is initialized to a concrete expression, we pack the indexed path to the storage entity (that the expression references) into an array value. When a storage pointer is dereferenced (e.g., by indexing into or accessing a member), the array is unpacked into a conditional expression that will evaluate to a storage entity by decoding paths in the tree.

Storage tree.

The storage tree for a given target type $T$ can be obtained easily by filtering the AST nodes of the contract definition to only include state variable declarations and to, further, only include nodes that lead to a sub-node of type $T$. We denote the storage tree for type $T$ as $\mathit{tree}(T)$. (In our implementation we do not explicitly compute the storage tree but instead traverse directly the AST supplied by the Solidity compiler.)

Packing.

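def pack(expr):
       root := tree(type(expr));
       result := constarr(0);
       d := 0 // Depth of result;
       foreach baseexpr in expr starting from the innermost do
             if baseexpr = id or baseexpr = e.id then
                   find i-th edge root --id--> child;
                   result := result[d ← i];
                   root := child;
             if baseexpr = e[idx] then
                   find edge root --> child;
                   result := result[d ← E(idx)];
                   root := child;
             d := d + 1;
       return result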
Figure 7: Formalization of pack. It returns a symbolic array expression that, when evaluated, uniquely identifies the path to the storage entity that the expression references.

Given an expression (such as ss[8].ts[5]), pack() uses the storage tree for the type of the expression and encodes it to an array (e.g., $[2, 8, 1, 5]$) by fitting the expression into the tree. Pseudocode for pack() is shown in Figure 7. The expression is processed iteratively starting from the innermost base expression and a constant array of zeros. The base expression of an identifier id is id itself and for an array index $e[i]$ or a member access $e.m_i$ it is the expression itself and the recursive base expressions of $e$. If the subexpression is an identifier (state variable or member access), the algorithm finds the outgoing edge annotated with the identifier (from the current root). Then, it writes the index of the edge in the array and sets the target node of the edge as new root. If the subexpression is an index access, the algorithm maps and writes the index expression (symbolically) in the array and proceeds on the single outgoing edge. The expression mapping function $\mathcal{E}(\cdot)$ is introduced later in this section.

Example 8

For the contract in Figure 5(a), pack(ss[8].ts[5]) produces [2, 8, 1, 5]: the base expressions are ss, ss[8], ss[8].ts and ss[8].ts[5], in this order. First, 2 is added as ss is the state variable with index 2. Then, ss[8] is an index access, so 8 is mapped to an SMT expression and added to the result. Next, ss[8].ts is a member access with ts having the index 1. Finally, ss[8].ts[5] is an index access, so 5 is mapped to an SMT expression and added.
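The packing step can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the solc-verify implementation: the tuple-based tree encoding, the step format, and the tree itself (state variables t1, s1, ss and struct members t, ts, with the edge indices used in Examples 8 and 9) are hypothetical reconstructions, and a Python list stands in for the written prefix of the constant-zero SMT array.

```python
# Storage tree for target type T (assumed shape, reconstructed from Examples 8 and 9).
# A node is ("leaf",), ("array", child), or ("contract"/"struct", [(name, child), ...]).
T_LEAF = ("leaf",)
S_NODE = ("struct", [("t", T_LEAF), ("ts", ("array", T_LEAF))])
TREE = ("contract", [("t1", T_LEAF), ("s1", S_NODE), ("ss", ("array", S_NODE))])


def pack(steps, tree=TREE):
    """Encode a path as a list of integers.

    steps lists the base expressions from the innermost one outwards, each as
    ("id", name) for a state variable or member, or ("index", value) for an
    index access.
    """
    result = []
    root = tree
    for kind, value in steps:
        if kind == "id":
            if root[0] not in ("contract", "struct"):
                raise ValueError("identifier step on a non-contract, non-struct node")
            names = [name for name, _ in root[1]]
            i = names.index(value)      # index of the edge annotated with the identifier
            result.append(i)
            root = root[1][i][1]        # the edge target becomes the new root
        elif kind == "index":
            if root[0] != "array":
                raise ValueError("index step on a non-array node")
            result.append(value)        # the index expression itself is written
            root = root[1]              # proceed along the single outgoing edge
        else:
            raise ValueError(kind)
    return result


# ss[8].ts[5] has base expressions ss, ss[8], ss[8].ts, ss[8].ts[5].
print(pack([("id", "ss"), ("index", 8), ("id", "ts"), ("index", 5)]))
```

In an actual SMT encoding the index values would be symbolic expressions rather than Python integers; the structure of the traversal stays the same.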

Unpacking.

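def unpack(ptr):
       return unpack(ptr, contract node of tree(T), empty, 0);

def unpack(ptr, root, subexpr, d):
       result := empty;
       if root has no outgoing edges then result := subexpr;
       if root is contract then
             foreach i-th edge (annotated with id, leading to child) in reverse order do
                   result := if ptr[d] = i then unpack(ptr, child, id, d + 1) else result;
       if root is struct then
             foreach i-th edge (annotated with id, leading to child) in reverse order do
                   result := if ptr[d] = i then unpack(ptr, child, subexpr.id, d + 1) else result;
       if root is array/mapping with single edge leading to child then
             result := unpack(ptr, child, subexpr[ptr[d]], d + 1);
       return result;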
Figure 8: Formalization of unpacking a local storage pointer to a conditional expression.

The opposite of pack is unpack. This function takes a storage pointer (an SMT array) and produces a conditional expression that decodes any given path into one of the leaves of the storage tree. Figure 8 formalizes the function. The function recursively traverses the tree starting from the contract node and accumulates the expressions leading to the leaves. The function creates conditionals when branching, and when a leaf is reached the accumulated expression is simply returned. For contracts we process edges corresponding to each state variable by setting the subexpression to be the state variable itself. For structs we process edges corresponding to each member by wrapping the subexpression into a member access. For both contracts and structs, the subexpressions are collected into a conditional as separate cases. For arrays and mappings we process the single outgoing edge by wrapping the subexpression into an index access using the current element (at index d) of the pointer.

Example 9

For example, the conditional expression corresponding to the tree in Figure 5(b) can be seen in Figure 5(c). Given a pointer ptr, if ptr[0] = 0 then the conditional evaluates to t1. Otherwise, if ptr[0] = 1 then s1 has to be taken, where two leaves are possible: if ptr[1] = 0 then the result is s1.t, otherwise it is s1.ts[ptr[2]], and so on. If ptr is [2, 8, 1, 5] then the conditional evaluates exactly to ss[8].ts[5], from which ptr was packed.
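The unpacking step can be sketched in the same spirit. The Python code below is an illustration only: the tree shape is an assumption reconstructed from Examples 8 and 9, the ("ite", d, i, then, else) tuples are a hypothetical stand-in for SMT conditionals, and evaluate is a helper added here to resolve the conditional for a concrete pointer.

```python
import re

# Storage tree for target type T (assumed shape, reconstructed from Examples 8 and 9).
T_LEAF = ("leaf",)
S_NODE = ("struct", [("t", T_LEAF), ("ts", ("array", T_LEAF))])
TREE = ("contract", [("t1", T_LEAF), ("s1", S_NODE), ("ss", ("array", S_NODE))])


def unpack(root, subexpr="", d=0):
    """Build a nested conditional that decodes a pointer into a storage expression."""
    kind = root[0]
    if kind == "leaf":
        return subexpr                                  # accumulated expression
    if kind == "array":
        # Wrap into an index access using the pointer element at depth d.
        return unpack(root[1], f"{subexpr}[ptr[{d}]]", d + 1)
    result = None
    # Contract or struct: one case per edge, built in reverse order.
    for i, (name, child) in reversed(list(enumerate(root[1]))):
        access = name if kind == "contract" else f"{subexpr}.{name}"
        branch = unpack(child, access, d + 1)
        # The last edge serves as the final "else" case.
        result = branch if result is None else ("ite", d, i, branch, result)
    return result


def evaluate(expr, ptr):
    """Resolve the conditional for a concrete pointer and substitute ptr[k] values."""
    while isinstance(expr, tuple):
        _, d, i, then_branch, else_branch = expr
        expr = then_branch if ptr[d] == i else else_branch
    return re.sub(r"ptr\[(\d+)\]", lambda m: str(ptr[int(m.group(1))]), expr)


CONDITIONAL = unpack(TREE)
print(evaluate(CONDITIONAL, [2, 8, 1, 5]))
```

Because the conditional is built once per target type and only reads elements of ptr, dereferencing needs no quantifiers: every reachable storage entity of the type appears as an explicit leaf.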

Note that with inheritance and libraries [28] it is possible that a contract defines a type T but has no nodes in its storage tree. The contract can still define functions with storage pointers to T, which can be called by derived contracts that define state variables of type T. In such cases we declare an array with elements of type T, called the default context, and unpack storage pointers to T as if the default context were a state variable. This allows us to reason about abstract contracts and libraries, modeling that their storage pointers can point to arbitrary entities not yet declared.

3.3 Contracts, State Variables, Functions

The focus of our discussion is the Solidity memory model and, for presentation purposes, we assume a minimalistic setting where the important aspects of storage and memory can be presented: we assume a single contract and a single function to translate. Interactions between multiple functions are handled differently depending on the verification approach. For example, in modular verification functions are checked individually against specifications (pre- and postconditions) and function calls are replaced by their specification [21].

State variables.

Each state variable of a contract is mapped to a variable declaration in the SMT program. (Footnote 6: Generalizing this to multiple contracts can be done directly by using a separate one-dimensional heap for each state variable, indexed by a receiver parameter identifying the current contract instance (see, e.g., [21]).) The data location of state variables is always storage. As discussed previously, reference types are mapped using SMT datatypes and arrays, which ensures non-aliasing by construction. While Solidity optionally allows inline initializer expressions for state variables, without loss of generality we can assume that they are initialized in the constructor using regular assignments.

Function calls.

From the perspective of the memory model, the only important aspect of function calls is the way parameters are passed in and how function return values are treated. Our formalization is general in that it allows us to treat both of the above as plain assignments (explained in later subsections). For each parameter and return value of a function, we add a corresponding variable declaration in the SMT program. Note that for reference types appearing as parameters or return values of the function, their types are either memory or storage pointers.

Memory allocation.

In order to model the allocation of new memory entities while keeping some non-aliasing information, we introduce an allocation counter variable in the preamble of the SMT program. This counter is incremented for each allocation of a memory entity and used as the address of the new entity. For each parameter with memory data location we include an assumption bounding it by the allocation counter: parameters can be arbitrary pointers, but should not alias with new allocations within the function. Note that if a parameter of memory pointer type is a reference type containing other references, such non-aliasing constraints need to be assumed recursively. This can be done for structs by enumerating members, but for dynamic arrays it requires quantification that is nevertheless still decidable (array property fragment [13]).
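The role of the allocation counter can be illustrated with a small Python sketch. The class name Memory, the increment-then-use order, and the concrete bound p <= refcnt on parameters are assumptions chosen for illustration; the formalization only requires that parameters cannot alias with entities allocated later in the function.

```python
class Memory:
    """Toy model of memory: a heap indexed by integer addresses plus a counter."""

    def __init__(self, refcnt=0):
        self.refcnt = refcnt    # allocation counter at function entry
        self.heap = {}          # address -> memory entity

    def allocate(self, value):
        # Assumed order: increment first, then use the new value as the address.
        self.refcnt += 1
        self.heap[self.refcnt] = value
        return self.refcnt


def admissible_param(mem, p):
    """Assumed entry condition for a memory pointer parameter p."""
    return p <= mem.refcnt


mem = Memory(refcnt=3)
params = [p for p in range(10) if admissible_param(mem, p)]
fresh = [mem.allocate({"x": 0}), mem.allocate({"x": 0})]
# No admissible parameter value can coincide with a freshly allocated address.
print(set(params).isdisjoint(fresh))
```

Since the counter only grows, addresses handed out inside the function are strictly above any value the parameters were allowed to take at entry, which gives non-aliasing without quantifiers for non-nested pointers.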

Initialization and default values.

Figure 9: Formalization of default values. We denote struct members as m_i with types T_i.

If we are translating the constructor function, each state variable is first initialized to its default value with an assignment statement. For regular functions, we set each return value to its default value in the same way. We use defval(T), as defined in Figure 9, to denote the function that maps a Solidity type T to its default value as an SMT expression. Note that, as a side effect, this function can perform allocations for memory entities, introducing extra declarations and statements, denoted by [decl] and {stmt}. As expected, the default value is false for Booleans and 0 for other primitives that map to integers. For mappings from K to V, the default value is an SMT constant array returning the default value of the value type V for each key (see, e.g., [17]). The default value of storage arrays is the corresponding datatype value constructed with a constant array of the default value for the base type T, and a length of n or 0 for fixed- or dynamically-sized arrays, respectively. For storage structs, the default value is the corresponding datatype value constructed with the default values of each member.

The default value of uninitialized memory pointers is unusual. Since Solidity doesn’t support “null” pointers, a new entity is automatically allocated in memory and initialized to default values (which might include additional recursive initialization). Note that for fixed-size arrays Solidity enforces that the array size must be an integer literal or a compile-time constant, so setting each element to its default value is possible without loops or quantifiers. Similarly for structs, each member is recursively initialized, which is again possible by explicitly enumerating each member.
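The default value rules, including the recursive allocation for memory entities, can be sketched in Python. The tuple encoding of types, the dictionary layout with arr and length fields, and the order in which nested entities are allocated are assumptions made for this illustration; a defaultdict stands in for an SMT constant array.

```python
from collections import defaultdict


class Memory:
    def __init__(self):
        self.refcnt = 0
        self.heap = {}

    def allocate(self, value):
        self.refcnt += 1
        self.heap[self.refcnt] = value
        return self.refcnt


def defval(ty, loc, mem):
    """Default value of a type; loc is "storage" or "memory"."""
    kind = ty[0]
    if kind == "bool":
        return False
    if kind == "int":
        return 0
    if kind == "mapping":                 # ("mapping", value_type), storage only
        return defaultdict(lambda: defval(ty[1], "storage", mem))
    if kind == "array":                   # ("array", base_type, n), n=None if dynamic
        base, n = ty[1], ty[2]
        length = n if n is not None else 0
        if loc == "storage":
            elems = defaultdict(lambda: defval(base, "storage", mem))
            return {"arr": elems, "length": length}
        # Memory: elements of reference type are themselves fresh pointers.
        elems = [defval(base, "memory", mem) for _ in range(length)]
        return mem.allocate({"arr": elems, "length": length})
    if kind == "struct":                  # ("struct", {member_name: member_type})
        members = {m: defval(t, loc, mem) for m, t in ty[1].items()}
        return members if loc == "storage" else mem.allocate(members)
    raise ValueError(kind)


mem = Memory()
S = ("struct", {"x": ("int",), "b": ("bool",)})
ptr = defval(("array", S, 2), "memory", mem)    # S[2] memory
print(mem.heap[ptr]["length"], mem.refcnt)
```

For the fixed-size memory array of two structs, three entities are allocated (two structs and the array), each at a distinct address, and no loop is needed in the encoding because the size is known statically.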

3.4 Statements

We define a statement translation function that translates Solidity statements to a list of statements in the SMT program. It relies on the type mapping function (presented previously in Section 3.1) and on the expression mapping function (to be introduced in Section 3.6). Furthermore, we define a helper function dedicated to modeling Solidity assignments (to be discussed in Section 3.5).

The definition of the statement translation is shown in Figure 10. As a side effect, extra declarations can be introduced to the preamble of the SMT program (denoted by [decl]). The Solidity documentation [28] does not precisely state the order of evaluating subexpressions in statements. It only specifies that subnodes are processed before the parent node. This problem is independent from the discussion of the memory model, so we assume that side effects of subexpressions are added in the same order as implemented in the compiler. Furthermore, if a subexpression is mapped multiple times, we assume that the side effects are only added once. This makes our presentation simpler as we have to introduce fewer temporary variables.

Figure 10: Formalization of statements.

Local variable declarations introduce a variable declaration with the same identifier in the SMT program by mapping the type. (Footnote 7: Without loss of generality we assume that identifiers in Solidity are unique. The compiler handles scoping and assigns a unique identifier to each declaration.) If an initialization expression is given, it is mapped to an SMT expression and assigned to the variable. Otherwise, the default value is used as defined in Figure 9. Delete assigns the default value for a type, which is simply mapped to an assignment in our formalization.

Solidity supports multiple assignments as one statement with a tuple-like syntax. The documentation [28] does not specify the behavior precisely, but the compiler evaluates the RHS tuple from left to right, while assignment is performed from right to left.

contract C {
  struct S { int x; }
  S s1, s2, s3;
  function primitiveAssign() {
    s1.x = 1; s2.x = 2; s3.x = 3;
    (s1.x, s3.x, s2.x) = (s3.x, s2.x, s1.x);
    // s1.x == 3, s2.x == 1, s3.x == 2
  }
  function storageAssign() {
    s1.x = 1; s2.x = 2; s3.x = 3;
    (s1, s3, s2) = (s3, s2, s1);
    // s1.x, s2.x, s3.x are all equal to 1
  }
}
Figure 11: Example illustrating the right-to-left assignment order and the treatment of storage in tuple assignment.
contract C {
  struct S { int x; }
  S[] a;
  constructor() {
    a.push(S(1));
    S storage s = a[0];
    a.pop();
    assert(s.x == 1); // Ok
    // Following is error
    // assert(a[0].x == 1);
  }
}
Figure 12: Example illustrating a dangling storage pointer.
Example 10

Consider the tuple assignment in function primitiveAssign() in Figure 11. From right to left, s2.x is assigned first with the value of s1.x, which is 1. Afterwards, when s3.x is assigned with s2.x, the already evaluated (old) value of 2 is used instead of the new value 1. Finally, s1.x gets the old value of s3.x, i.e., 3. Note however, that storage expressions on the RHS evaluate to storage pointers. Consider, for example, the function storageAssign() in Figure 11. From right to left, s2 is assigned first, with a pointer to s1, making s2.x become 1. However, as opposed to primitive types, when s3 is assigned next, s2 on the RHS is a storage pointer and thus the new value in the storage of s2 is assigned to s3, making s3.x become 1. Similarly, s1.x also becomes 1 as the new value behind s3 is used.
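The two behaviors in Figure 11 can be replayed with a short Python simulation. The function names and the dictionary model of storage are hypothetical; the sketch only mirrors the evaluation order described above (RHS from left to right, assignment from right to left) and the fact that storage expressions on the RHS evaluate to pointers rather than values.

```python
import copy


def tuple_assign_values(storage, lhs, rhs):
    """Value-type members: the RHS is evaluated to values before any assignment."""
    values = [storage[name]["x"] for name in rhs]        # left to right
    for name, value in reversed(list(zip(lhs, values))):  # right to left
        storage[name]["x"] = value


def tuple_assign_structs(storage, lhs, rhs):
    """Struct operands: the RHS evaluates to storage pointers (here: names)."""
    pointers = list(rhs)
    for target, source in reversed(list(zip(lhs, pointers))):
        # Deep copy of whatever is behind the pointer at assignment time.
        storage[target] = copy.deepcopy(storage[source])


def fresh_storage():
    return {"s1": {"x": 1}, "s2": {"x": 2}, "s3": {"x": 3}}


a = fresh_storage()
tuple_assign_values(a, ["s1", "s3", "s2"], ["s3", "s2", "s1"])
b = fresh_storage()
tuple_assign_structs(b, ["s1", "s3", "s2"], ["s3", "s2", "s1"])
print([a[n]["x"] for n in ("s1", "s2", "s3")])
print([b[n]["x"] for n in ("s1", "s2", "s3")])
```

The first call reproduces the permutation from primitiveAssign(), while the second collapses all three structs to the value of s1, as in storageAssign().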

Array push increases the length and assigns the given expression as the last element. Array pop decreases the length and sets the removed element to its default value. While the removed element can no longer be accessed via indexing into an array (a runtime error occurs), it can still be accessed via local storage pointers (see Figure 12). (Footnote 8: The current version (0.5.x) of Solidity supports resizing arrays by assigning to the length member. However, this behavior is dangerous and has been removed in version 0.6.0 (see https://solidity.readthedocs.io/en/v0.6.0/060-breaking-changes.html). Therefore, we do not support this in our encoding.)

3.5 Assignments

Assignments between reference types in Solidity can be either pointer assignments or value assignments, involving deep copying and possible new allocations in the latter case. We define an assignment function that assigns a rhs SMT expression to a lhs SMT expression based on their original types and data locations. Its definition is shown in Figure 13. Value type assignments are simply mapped to an SMT assignment. To make our presentation clearer, we subdivide the other cases into separate functions for array, struct and mapping operands.

Figure 13: Formalization of assignment based on different type categories (value type, mapping, struct and array operands) and data locations for the LHS and RHS. We use s, sp and m after the arguments to denote storage, storage pointer and memory types respectively.

Mappings.

As discussed previously, Solidity prohibits direct assignment of mappings. However, it is possible to declare a storage pointer to a mapping, in which case the rhs expression is packed. It is also possible to assign two storage pointers, which simply assigns pointers. Other cases are a no-op. (Footnote 9: This is a consequence of the fact that keys are not stored in mappings and so the assignment is impossible to perform.)

lhs \ rhs    Storage          Memory           Stor.ptr.
Storage      Deep copy        Deep copy        Deep copy
Memory       Deep copy        Pointer assign   Deep copy
Stor.ptr.    Pointer assign   Error            Pointer assign
Figure 14: Semantics of assignment between array and struct operands based on their data location.

Structs and arrays.

For structs and arrays the semantics of assignment is summarized in Figure 14. However, there are some notable details in various cases that we expand on below.

Assigning anything to a storage lhs always causes a deep copy. If the rhs is storage, this is simply mapped to a datatype assignment in our encoding (with an additional unpacking if the rhs is a storage pointer). (Footnote 10: This also causes mapping values to be copied, which contradicts the current Solidity semantics. However, we chose to keep the deep copy as we expect that assignment of mappings will be disallowed in future versions of Solidity.) If the rhs is memory, deep copy for structs can be done member-wise by accessing the heap with the rhs pointer and performing the assignment recursively (as members can be reference types themselves). For arrays, we access the datatype corresponding to the array via the heap and do an assignment, which does a deep copy in SMT. Note however, that this only works if the base type of the array is a value type. For reference types, memory array elements are pointers and would need to be dereferenced during assignment to storage. As opposed to struct members, the number of array elements is not known at compile time, so loops or quantifiers have to be used (as in traditional software analysis). However, this is a special case, which can be encoded in the decidable array property fragment [13]. Assigning storage (or a storage pointer) to memory is also a deep copy, but in the other direction. However, instead of overwriting the existing memory entity, a new one is allocated (recursively for reference-typed elements or members). We model this by incrementing the allocation counter, storing it in the lhs and then accessing the heap for deep copy using the new pointer.
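The semantics summarized in Figure 14 can be written down as a lookup table, and the two deep copy directions between storage and memory can be sketched on a toy state. In the Python sketch below, the State class, the function names and the dictionary layout are hypothetical, and only entities with value-typed elements are handled (nested references would need the recursive treatment described above).

```python
import copy

# Figure 14 as a table: (lhs location, rhs location) -> assignment semantics.
SEMANTICS = {
    ("storage", "storage"): "deep copy",
    ("storage", "memory"): "deep copy",
    ("storage", "storptr"): "deep copy",
    ("memory", "storage"): "deep copy",
    ("memory", "memory"): "pointer assign",
    ("memory", "storptr"): "deep copy",
    ("storptr", "storage"): "pointer assign",
    ("storptr", "memory"): "error",
    ("storptr", "storptr"): "pointer assign",
}


class State:
    def __init__(self):
        self.storage = {}   # state variable name -> value (value semantics)
        self.heap = {}      # memory address -> entity
        self.refcnt = 0     # allocation counter


def storage_to_memory(state, var):
    """lhs memory, rhs storage: allocate a new entity and deep copy into it."""
    state.refcnt += 1
    state.heap[state.refcnt] = copy.deepcopy(state.storage[var])
    return state.refcnt     # the new pointer is what gets stored in the lhs


def memory_to_storage(state, var, ptr):
    """lhs storage, rhs memory: deep copy the entity behind the pointer."""
    state.storage[var] = copy.deepcopy(state.heap[ptr])


state = State()
state.storage["a"] = {"arr": [1, 2], "length": 2}
p = storage_to_memory(state, "a")     # deep copy into a fresh memory entity
q = p                                 # memory to memory: pointer assignment
state.heap[q]["arr"][0] = 9           # visible through p as well (aliasing)
memory_to_storage(state, "b", p)      # deep copy back into storage
state.heap[p]["arr"][1] = 7           # does not affect storage variable b
print(state.storage["a"]["arr"], state.storage["b"]["arr"], state.heap[p]["arr"])
```

The example shows the asymmetry the table encodes: memory pointers alias each other, whereas anything that crosses the storage boundary is copied.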

3.6 Expressions

We define an expression mapping function that translates a Solidity expression to an SMT expression. As a side effect, declarations and statements might be introduced (denoted by [decl] and {stmt} respectively). Its definition is shown in Figure 15. As discussed in Section 3.4, we assume that side effects are added from subexpressions in the proper order and only once.

Figure 15: Formalization of expressions, with separate cases for structs, arrays and mappings in storage, storage pointer and memory data locations. We denote struct members as m_i with types T_i.

Member access is mapped to an SMT member access by mapping the base expression and the member name. There is an extra unpacking step for storage pointers and a heap access for memory. Note that the only valid member for arrays is length. Index access is mapped to an SMT array read by mapping the base expression and the index, and adding an extra member access for arrays to get the inner array arr of elements from the datatype. Furthermore, similarly to member accesses, an extra unpacking step is needed for storage pointers and a heap access for memory.

Conditionals in Solidity can be mapped to an SMT conditional expression in general. However, data locations can be different for the true and false branches, causing possible side effects. Therefore, we first introduce fresh variables for the true and false branch with the common type (of the whole conditional), then make assignments using the assignment function (Section 3.5), and finally use the new variables in the conditional. The Solidity documentation [28] does not specify the common type, but the compiler returns memory if any of the branches is memory, and storage pointer otherwise.

Allocating a new array in memory increments the allocation counter, sets the length and the default values for each element (recursively). Note that in general the length might not be a compile-time constant, in which case setting default values could be encoded with the array property fragment (similarly to deep copy in assignments) [13]. Allocating a new memory struct also increments the allocation counter and sets each value by translating the provided arguments.

4 Evaluation

The formalization described in this paper forms the basis of our Solidity verification tool solc-verify [21]. (Footnote 11: solc-verify is open source, available at https://github.com/SRI-CSL/solidity.) In this section we provide an evaluation of the presented formalization and our implementation by validating it on a set of relevant test cases. For illustrative purposes we also compare our tool with other available Solidity analysis tools. (Footnote 12: All tests, with a Truffle test harness, a docker container with all the tools, and all individual results are available at https://github.com/dddejan/solidity-semantics-tests.)

Tests.

We have manually developed a set of tests that try to capture the most interesting behaviors and corner cases of the Solidity memory semantics. The set is structured so that every target behavior is represented by a test case that checks the correctness of the behavior with assertions. The correctness of the tests themselves is established by running them through the EVM with no assertion failures. Test cases are expanded to use all reference types and combinations of reference types. This includes structures, mappings, dynamic and fixed-size arrays, both single- and multi-dimensional. The tests are organized into the following classes.

  • assignment: checks whether the assignment statement is properly modeled. This includes assignments in the same data location, but also assignments across data locations that need deep copying, and assignments and re-assignments of memory and storage pointers.
  • delete: checks whether the delete statement is properly modeled.
  • init: checks whether variable and data initialization is properly modeled. For variables in storage, we check if they are properly initialized to default values in the contract constructor. Similarly, we check whether memory variables are properly initialized to provided values, or to default values when no initializer is provided.
  • storage: checks whether storage itself is properly modeled for various reference types, including for example non-aliasing.
  • storageptr: checks whether storage pointers are properly modeled. This includes storage pointers to various reference types, including nested types. In addition, the tests check that storage pointers can be properly passed to functions and ensure non-aliasing for distinct parts of storage.

Tools.

For illustrative purposes we include a comparison with the following available Solidity analysis tools: mythril v0.21.17 [27], verisol v0.1.1-alpha [25], and smt-checker v0.5.12 [1]. mythril is a Solidity symbolic execution tool with a symbolic execution engine that runs at the level of the EVM bytecode. verisol is similar to solc-verify in that it uses Boogie to model the Solidity contracts, but takes the traditional approach to modeling memory and storage with pointers and quantifiers. smt-checker is an SMT-based analysis module built into the Solidity compiler itself. There are other tools that can be found in the literature, but they are either basic prototypes that cannot handle realistic features we are considering, or are not available for direct comparison.

assignment (102)
tool          correct  incorrect  unsupported  timeout  time (s)
mythril            94          0            0        8   1655.14
verisol            10         61           31        0    175.27
smt-checker         6          9           87        0     15.25
solc-verify        78          8           16        0     62.81

delete (14)
tool          correct  incorrect  unsupported  timeout  time (s)
mythril            13          1            0        0     47.51
verisol             3          8            3        0     24.66
smt-checker         0          0           14        0      0.30
solc-verify         7          1            6        0      9.02

init (18)
tool          correct  incorrect  unsupported  timeout  time (s)
mythril            15          3            0        0     59.67
verisol             7          8            3        0     28.82
smt-checker         0          0           18        0      0.41
solc-verify        13          5            0        0     11.88

storage (27)
tool          correct  incorrect  unsupported  timeout  time (s)
mythril            27          0            0        0    310.40
verisol            12         15            0        0     43.45
smt-checker         2          0           25        0      1.32
solc-verify        27          0            0        0     17.61

storageptr (164)
tool          correct  incorrect  unsupported  timeout  time (s)
mythril           164          0            0        0   1520.29
verisol           128         19           17        0    203.93
smt-checker         4         18          142        0     21.93
solc-verify       164          0            0        0     96.92

Table 1: Results of evaluating 4 tools (mythril, verisol, smt-checker, solc-verify) on 5 test suites (assignment, delete, init, storage, storageptr) exercising different aspects of the memory model.

Results.

We ran all the tools on our test set with a timeout of 60s; the results are shown in Table 1. As expected, mythril has the most consistent results on our test set. This is because mythril models contract semantics at the EVM level and does not need to model complex Solidity semantics. Nevertheless, the results also indicate that the performance penalty for this precision is significant (8 timeouts). verisol, as the closest to our approach, still doesn’t support many features and has a significant number of false reports for features that it does support. Many of the false reports are due to its model of storage, which is based on pointers and ensures storage consistency with the use of quantifiers. smt-checker is still not a robust tool and doesn’t support the majority of the Solidity features that our tests target.

solc-verify performs well on our test set, matching the precision of mythril at very low computational cost. The few false alarms we have are either due to Solidity features that we chose to not implement (e.g., proper treatment of mapping assignments), or parts of the semantics that we only implemented partially (e.g., parts that require the limited use of quantifiers).
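For a per-tool overview, the rows of Table 1 can be aggregated across the five test classes. The Python snippet below is only a convenience tally of the numbers reported in the table (the variable names are ours); it introduces no new measurements.

```python
# (correct, time in seconds) per test class, copied from Table 1 in the order
# assignment, delete, init, storage, storageptr.
TABLE1 = {
    "mythril": [(94, 1655.14), (13, 47.51), (15, 59.67), (27, 310.40), (164, 1520.29)],
    "verisol": [(10, 175.27), (3, 24.66), (7, 28.82), (12, 43.45), (128, 203.93)],
    "smt-checker": [(6, 15.25), (0, 0.30), (0, 0.41), (2, 1.32), (4, 21.93)],
    "solc-verify": [(78, 62.81), (7, 9.02), (13, 11.88), (27, 17.61), (164, 96.92)],
}
TOTAL_TESTS = 102 + 14 + 18 + 27 + 164

summary = {}
for tool, rows in TABLE1.items():
    correct = sum(c for c, _ in rows)
    seconds = round(sum(t for _, t in rows), 2)
    summary[tool] = (correct, seconds)
    print(f"{tool:12s} {correct:3d}/{TOTAL_TESTS} correct, {seconds:8.2f} s total")
```

Summing the columns this way makes the trade-off discussed above directly visible: the overall number of correct results per tool next to the total time each tool spent on the whole test set.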

5 Related Work

There is a strong push in the Ethereum community to apply formal methods to smart contract verification. This includes many attempts to formalize the semantics of smart contracts, both at the level of EVM and Solidity.

EVM-level semantics.

Bhargavan et al. [11] decompile a fragment of EVM bytecode to F*, modeling EVM as a stack-based machine with word and byte arrays for storage and memory. Grishchenko et al. [20] extend this work by providing a complete small-step semantics for EVM. KEVM [22] provides an executable formal semantics of EVM in the K framework. Hirai [23] formalizes EVM in Lem, a language used by some interactive theorem provers. Amani et al. [2] extend this work by defining a program logic to reason about EVM bytecode.

Solidity-level semantics.

Jiao et al. [24] formalize the executable operational semantics of Solidity in the K framework. Their formalization focuses on the details of bit-precise sizes of types, alignment and padding in storage. They encode storage slots, arrays and mappings with the full encoding of hashing. However, the formalization does not describe assignments (e.g., deep copy) in detail, apart from simple cases. Furthermore, user-defined structs are also not mentioned. In contrast, our semantics is high-level and abstracts away some details (e.g., hashes, alignments) to enable efficient automated verification. Additionally, we also focus on proper modeling of different cases for assignments between storage and memory. Bartoletti et al. [10] propose TinySol, a minimal core calculus for a subset of Solidity, required to model some basic features such as asset transfer and reentrancy. Contract data is simply modeled as a key-value store, without explicitly mentioning differences in storage and memory, or in value and reference types. Crafa et al. [15] introduce Featherweight Solidity, a calculus formalizing some core features of the language. Data locations and reference types are not mentioned explicitly; they focus on primitive types and mention mappings only briefly. The main focus is on the type system and type checking. They propose an improved type system that can statically detect unsafe casts and callbacks. The most closely related to our work is the work of Zakrzewski [31], a Coq formalization focusing on functions, modifiers and the memory model. The memory model is treated similarly: storage is a mapping from names to storage objects (values), memory is a mapping from references to memory objects (containing references recursively) and storage pointers define a path in storage. Their formalization is also high-level, without considering alignment, padding or hashing. The formalization is provided as big-step functional semantics in Coq. While the paper presents some example rules, the formalization does not cover all cases, for example the details of assignments (e.g., memory to storage), push/pop for arrays, the treatment of memory aliasing, and new expressions. Furthermore, our approach focuses on SMT and modular verification, which enables automated reasoning.

6 Conclusion

We presented a high-level SMT-based formalization of the Solidity memory model semantics. Our formalization covers all aspects of the language related to managing both the persistent contract storage and the transient local memory. The novel encoding of storage pointers as arrays allows us to precisely model non-aliasing and deep copy assignments between storage entities without the need for quantifiers. The memory model forms the basis of our Solidity-level modular verification tool solc-verify. We developed a suite of test cases exercising all aspects of memory management with different combinations of reference types. Results indicate that our memory model outperforms existing Solidity-level tools in terms of soundness and precision, and is on par with low-level EVM-based implementations, while having a significantly lower computational cost for discharging verification conditions.

References

  • [1] Alt, L., Reitwiessner, C.: SMT-based verification of Solidity smart contracts. In: Leveraging Applications of Formal Methods, Verification and Validation. Industrial Practice, Lecture Notes in Computer Science, vol. 11247, pp. 376–388. Springer (2018)
  • [2] Amani, S., Bégel, M., Bortin, M., Staples, M.: Towards verifying Ethereum smart contract bytecode in Isabelle/HOL. In: Proceedings of the 7th ACM SIGPLAN International Conference on Certified Programs and Proofs. pp. 66–77. ACM (2018)
  • [3] Antonopoulos, A., Wood, G.: Mastering Ethereum: Building Smart Contracts and Dapps. O’Reilly Media, Inc. (2018)
  • [4] Atzei, N., Bartoletti, M., Cimoli, T.: A survey of attacks on Ethereum smart contracts. In: Principles of Security and Trust, LNCS, vol. 10204, pp. 164–186. Springer (2017)
  • [5] Barnett, M., Chang, B.Y.E., DeLine, R., Jacobs, B., Leino, K.R.M.: Boogie: A modular reusable verifier for object-oriented programs. In: Formal Methods for Components and Objects, LNCS, vol. 4111, pp. 364–387. Springer (2006)
  • [6] Barrett, C., Conway, C.L., Deters, M., Hadarean, L., Jovanović, D., King, T., Reynolds, A., Tinelli, C.: CVC4. In: International Conference on Computer Aided Verification. pp. 171–177. Springer (2011)
  • [7] Barrett, C., Fontaine, P., Tinelli, C.: The Satisfiability Modulo Theories Library (SMT-LIB). www.SMT-LIB.org (2016)
  • [8] Barrett, C., Shikanian, I., Tinelli, C.: An abstract decision procedure for satisfiability in the theory of recursive data types. Journal on Satisfiability, Boolean Modeling and Computation 3, 21–46 (2007)
  • [9] Barrett, C., Tinelli, C.: Satisfiability modulo theories. In: Handbook of Model Checking, pp. 305–343. Springer (2018)
  • [10] Bartoletti, M., Galletta, L., Murgia, M.: A minimal core calculus for Solidity contracts. In: Data Privacy Management, Cryptocurrencies and Blockchain Technology, Lecture Notes in Computer Science, vol. 11737, pp. 233–243. Springer (2019)
  • [11] Bhargavan, K., Delignat-Lavaud, A., Fournet, C., Gollamudi, A., Gonthier, G., Kobeissi, N., Kulatova, N., Rastogi, A., Sibut-Pinote, T., Swamy, N., Zanella-Béguelin, S.: Formal verification of smart contracts: Short paper. In: ACM Workshop on Programming Languages and Analysis for Security. pp. 91–96. ACM (2016)
  • [12] Biere, A., Heule, M., van Maaren, H.: Handbook of Satisfiability. IOS Press (2009)
  • [13] Bradley, A.R., Manna, Z., Sipma, H.B.: What’s decidable about arrays? In: Verification, Model Checking, and Abstract Interpretation, Lecture Notes in Computer Science, vol. 3855, pp. 427–442. Springer (2006)
  • [14] Chen, H., Pendleton, M., Njilla, L., Xu, S.: A survey on Ethereum systems security: Vulnerabilities, attacks and defenses (2019), https://arxiv.org/abs/1908.04507
  • [15] Crafa, S., Pirro, M., Zucca, E.: Is Solidity solid enough? In: Financial Cryptography Workshops (2019)
  • [16] De Moura, L., Bjørner, N.: Z3: An efficient SMT solver. In: International Conference on Tools and Algorithms for the Construction and Analysis of Systems. pp. 337–340. Springer (2008)
  • [17] De Moura, L., Bjørner, N.: Generalized, efficient array decision procedures. In: Formal Methods in Computer-Aided Design. pp. 45–52. IEEE (2009)
  • [18] Dhillon, V., Metcalf, D., Hooper, M.: The DAO hacked. In: Blockchain Enabled Applications, pp. 67–78. Apress (2017)
  • [19] Filliâtre, J.C., Paskevich, A.: Why3 — where programs meet provers. In: Proceedings of the 22nd European Symposium on Programming, Lecture Notes in Computer Science, vol. 7792, pp. 125–128. Springer (2013)
  • [20] Grishchenko, I., Maffei, M., Schneidewind, C.: A semantic framework for the security analysis of Ethereum smart contracts. In: Principles of Security and Trust, LNCS, vol. 10804, pp. 243–269. Springer (2018)
  • [21] Hajdu, Á., Jovanović, D.: solc-verify: A modular verifier for Solidity smart contracts. In: Verified Software. Theories, Tools, and Experiments. Lecture Notes in Computer Science, Springer (2019), (In press)
  • [22] Hildenbrandt, E., Saxena, M., Zhu, X., Rodrigues, N., Daian, P., Guth, D., Rosu, G.: KEVM: A complete semantics of the Ethereum virtual machine. Tech. rep., IDEALS (2017)
  • [23] Hirai, Y.: Defining the Ethereum virtual machine for interactive theorem provers. In: Financial Cryptography and Data Security, LNCS, vol. 10323, pp. 520–535. Springer (2017)
  • [24] Jiao, J., Kan, S., Lin, S., Sanán, D., Liu, Y., Sun, J.: Executable operational semantics of Solidity (2018), http://arxiv.org/abs/1804.01295
  • [25] Lahiri, S.K., Chen, S., Wang, Y., Dillig, I.: Formal specification and verification of smart contracts for Azure Blockchain. arXiv preprint arXiv:1812.08829 (2018)
  • [26] McCarthy, J.: Towards a mathematical science of computation. In: IFIP Congress. pp. 21–28 (1962)
  • [27] Mueller, B.: Smashing Ethereum smart contracts for fun and real profit. In: 9th Annual HITB Security Conference (HITBSecConf) (2018)
  • [28] Solidity documentation (2019), https://solidity.readthedocs.io/
  • [29] Szabo, N.: Smart contracts (1994)
  • [30] Wood, G.: Ethereum: A secure decentralised generalised transaction ledger (2017), https://ethereum.github.io/yellowpaper/paper.pdf
  • [31] Zakrzewski, J.: Towards verification of Ethereum smart contracts: A formalization of core of Solidity. In: Verified Software. Theories, Tools, and Experiments, Lecture Notes in Computer Science, vol. 11294, pp. 229–247. Springer (2018)