Hunting for Re-Entrancy Attacks in Ethereum Smart Contracts via Static Analysis

07/02/2020 ∙ by Yuichiro Chinen, et al. ∙ 0

Ethereum smart contracts are programs that are deployed and executed in a consensus-based blockchain managed by a peer-to-peer network. Several re-entrancy attacks that aim to steal Ether, the cryptocurrency used in Ethereum, stored in deployed smart contracts have been found in the recent years. A countermeasure to such attacks is based on dynamic analysis that executes the smart contracts themselves, but it requires the spending of Ether and knowledge of attack patterns for analysis in advance. In this paper, we present a static analysis tool named RA (Re-entrancy Analyzer), a combination of symbolic execution and equivalence checking by a satisfiability modulo theories solver to analyze smart contract vulnerabilities to re-entrancy attacks. In contrast to existing tools, RA supports analysis of inter-contract behaviors by using only the Etherum Virtual Machine bytecodes of target smart contracts, i.e., even without prior knowledge of attack patterns and without spending Ether. Furthermore, RA can verify existence of vulnerabilities to re-entrancy attacks without execution of smart contracts and it does not provide false positives and false negatives. We also present an implementation of RA to evaluate its performance in analyzing the vulnerability of deployed smart contracts to re-entrancy attacks and show that RA can precisely determine which smart contracts are vulnerable.

READ FULL TEXT VIEW PDF
POST COMMENT

Comments

There are no comments yet.

Authors

page 1

page 2

page 3

page 4

This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

1 Introduction

Ethereum.

Ethereum, which is often described as “the world computer,” is a global, open-source platform for decentralized applications and execution of programs called

smart contracts, which are software programs recorded on the Ethereum blockchain111Hereafter, “contract” and “smart contract” are used interchangeably but have the same meaning. Likewise, “the blockchain” will be used to refer to the Ethereum blockchain, unless otherwise specified. and executed by the Ethereum Virtual Machine (EVM). The EVM is a virtual machine that runs codes called EVM bytecodes and is the runtime environment for smart contracts in Ethereum. Ethereum enables developers to build smart contracts with built-in functions and gain the benefits of cryptocurrency and blockchain technologies.

Research Motivation. The blockchain is decentralized and transparent by nature, and thus anyone can read the bytecodes of deployed contracts. Moreover, smart contracts typically contain financially valuable data and therefore create a criminogenic environment for adversaries. The attack on “The DAO” on June 2016 is the most infamous case of an attack on smart contracts. In the attack, a vulnerability called re-entrancy, where the main contract calls an external contract which again calls into the calling contract within a single transaction, was utilized to steal more than 60 million US Dollars worth of Ether. With the transparency of the Ethereum blockchain, the vulnerabilities of deployed contracts can be utilized permanently as a springboard to attacks. Indeed, many attacks [1] and honeypots [2] have been found on Ethereum in the past years.

Security Analysis of Smart Contracts. Based on these backgrounds, program analysis for the security of Ethereum smart contracts is an urgent and significant research theme. In this paper, we focus on the analysis of EVM bytecodes of smart contracts instead of their corresponding source codes. Analysis of bytecodes brings a number of advantages: (i) the analysis is independent of a high-level language that is periodically updated, and (ii) bytecodes can be obtained directly from the blockchain even if the corresponding source codes are unpublished. Using static analysis, analysts can easily judge whether a deployed contract has benign or malicious codes, e.g., vulnerable or honeypots.

In early literature on analysis of EVM bytecodes, formal verification [3, 4, 5, 6, 7] is a leading approach for verifying a specification of bytecodes and symbolic execution [8, 9, 10, 11, 12] is used for exploring bytecodes in a depth-first search fashion by extracting control flow graphs (CFGs). To the best of our knowledge, formal verification hovers at a level of just an abstraction of Ethereum bytecodes despite providing precise verification in general. In other words, using formal verification to analyze vulnerabilities in their entirety is difficult. Moreover, according to Weiss et al. [13], rigid formal verification is often dismissed as it puts too high demands on analysts who are not experts in formal verification. On the other hand, symbolic execution can potentially support analysis of vulnerabilities via extraction of CFGs. Several prior works [8, 9, 10]

succeeded in analyzing some vulnerabilities, such as to re-entrancy attacks. However, symbolic execution only outputs CFGs from programs to be analyzed and does not support detection of the vulnerabilities themselves. Accordingly, analysis becomes often heuristic if an infeasible path exists in the programs. In particular, analysis results may produce many false positives and false negatives because of an infeasible path 

[14].

A recent work [15] has found re-entrancy attacks that undermine existing analysis tools [5, 6, 8] by creating a new contract or calling a different function via an external contract. Although a monitoring tool was also proposed in the same paper as a countermeasure against those attacks, the tool performs monitoring based on dynamic analysis and is therefore unable to detect vulnerabilities unless the attacks occur during program execution. Accordingly, analysts need to implement and execute attack patterns by themselves in advance to check for vulnerabilities. Although generic attack patterns [16] have been presented formally, dynamic analysis remains impractical for analysis of smart contracts because it often requires analysts to know how an attack is launched as well as to spend Ether for the execution of contracts. Ideally, contracts should be analyzed based on only their bytecodes, i.e., with the use of static analysis.

Research Goal. This paper aims to design an inter-contract static analysis tool that (1) uses only EVM bytecodes as input, (2) eliminates false negatives and false positives, and (3) does not require analysts to have a priori knowledge of the attacks on contracts. For this goal, we focus on analysis of re-entrancy attacks [15]. As evident in the attack on “The DAO”, re-entrancy attacks have a significant impact and many contracts have been found to be vulnerable to these attacks [6, 8]. Considering findings on new attacks [15] in the recent years, our research goal can prove to be important and useful. Moreover, the design of a static analysis tool that is effective against new attacks is an open challenge [15].

Contributions. In this paper, we present a new static analysis tool named Re-entrancy Analyzer (RA) for analyzing EVM bytecodes of smart contracts. RA can analyze re-entrancy vulnerabilities via inter-contract flows, for instance, by creating a new contract and calling a different function in the main contract via an external contract. RA does not require analysts to have prior knowledge of the attack patterns and pay Ether for analysis, and it does not provide false positives and false negatives. These advantages are achieved via integration of symbolic execution and equivalence checking with a satisfiability modulo theories (SMT) solver. Furthermore, we provide an implementation of RA and evaluate it by utilizing publicly available reference implementations of re-entrancy vulnerabilities [15, 17, 18]. Our results confirm that, unlike Oyente [8], RA can analyze precisely the vulnerabilities of deployed contracts against state-of-the-art re-entrancy attacks [15]. We have released the source codes of RA via GitHub (https://github.com/wanidon/RA) for reproducibility and as reference for future works.

Our main contribution is the creation of a new symbolic execution method named symbolic re-entrancy emulation, which emulates re-entrancy attacks by connecting different contracts with each other. As will be described in detail in Section 3, Oyente and its extensions [5, 6] do not support inter-contract analysis, for example, CFGs become fragments. On the other hand, we developed a module that localizes stored data on the blockchain in each execution path and a module that stacks a return address of an account information for each path. Using these modules, RA can support inter-contract analysis and emulate the behavior of re-entrancy attacks in a symbolic fashion via an internal implementation of a dummy contract, i.e., a contract that executes other contracts.

As another important contribution, we developed a new method named vulnerability verification, which verifies vulnerabilities by utilizing the Z3 SMT solver in the CFGs obtained from the symbolic re-entrancy emulation. In particular, on path conditions of the obtained CFGs, our method verifies whether program behavior on paths for executions with re-entrancy attacks is identical to that without the attacks, i.e., behavior on normal executions. Using the methods above, RA can completely eliminate false positives and false negatives (See Section 4 for details).

2 Technical Background

Ethereum Smart Contracts and EVM. In Ethereum, there are two kinds of accounts, namely, an externally owned account (EOA) and a contract account. EOAs have a private key that can be used to access the corresponding Ether or contracts. A contract account has smart contract code, which an EOA cannot have, and it does not have a private key. Instead, it is owned and controlled by the logic of its smart contract code. In Ethereum, a smart contract refers to an immutable computer program that is deployed on the blockchain and runs deterministically in the context of the EVM. The immutability property indicates that, similar to any data published on a general blockchain, smart contract codes can be considered as trustworthy, i.e., once deployed, they cannot be changed or deleted. The deterministic property indicates that the execution of the coded functions of smart contracts will produce the same result for anyone who runs them. Once deployed on the blockchain, a contract is self-enforcing and managed by the peers in the network, i.e., its functions are executed when the conditions in the contract are met. A smart contract is given an identity in terms of a contract address. Using this address, it can receive Ether and its functions can be executed. A contract is invoked when its contract address is the destination of a transaction, which is a signed message originating from an EOA, transmitted by the network, and recorded on the blockchain. Such transaction causes a contract to run in the EVM using the transaction (and transaction’s data) as input. The data indicate which specific function in the contract to run and what parameters to pass to that function. To incentivize peers to execute contract functions, Ethereum relies on gas, which is paid in Ether, to “fuel computations”. The amount of gas needed to execute a transaction is relative to the complexity of the computations, thus also preventing infinite loops.

Smart contracts are typically written in a high-level language such as Solidity [17]. The source code is then compiled to low-level bytecode that runs in the EVM. The EVM is a simple stack-based architecture. Its instruction set is kept minimal to avoid incorrect implementations that could cause consensus problems. The EVM is a global singleton, i.e., it operates like a global, single-instance computer that runs in all peers in the network. Each peer runs a local copy of the EVM to validate the execution of contract functions, and the processed transactions and smart contracts are recorded on the blockchain.

Static Analysis of Programs. This paper focuses on the use of static analysis composed of CFGs, SMT solver, and symbolic execution. A CFG represents feasible paths of a program as a graph and is utilized for optimization by a compiler and static analysis of programs. Paths in a manner of sequential execution without branches are called basic blocks and are identical to nodes of a CFG. Likewise, paths which are feasible via branches or jumps are represented by edges to connect with nodes.

An SMT solver is a tool used for SMT problems. In contrast to satisfiability problems represented by proportional logic, SMT problems are represented by the first-order predicate logic which is more representative. By describing a specification to be verified in some logic formally, an SMT solver verifies whether a program satisfies the given specification.

Symbolic execution is a method that pseudo-executes a program by replacing information unspecified from the program itself with symbolic values to represent any value. Symbolic execution is composed of CFGs and an SMT solver, and is suitable for analysis of smart contracts given that smart contracts utilize information on blockchains which are outside of program codes. Specifically, a condition to execute a path is called a path condition. Path conditions at the beginning of program execution are valid, and a restriction is newly added to the path conditions when a branch occurs. In a case where a path condition contains a symbolic value, executing either one or both paths according to a condition, i.e., the condition is satisfied or not, is decided by checking the satisfiability of the condition. Path conditions are often represented by first-order predicate logic, and their satisfiability is decided by an SMT solver described above.

3 Motivating Example and Technical Difficulties

In this section, we recall the fallback function and the re-entrancy attacks shown by Rodler et al. [15] as our motivating example and then discuss the technical difficulties in the analysis of the attacks.

3.1 Re-entrancy Problem

Fallback Function. Functions in Solidity are similar to classes in object-oriented languages. There are four types of Solidity functions, namely, external, internal, public, and private. A contract can have exactly one unnamed function called a fallback function, which cannot have any arguments, cannot return anything, and should have external visibility. The fallback function of a contract is executed whenever the contract receives Ether without any data included. To receive Ether and add it to the total balance of the contract, the fallback function must be marked payable. If the contract does not have a fallback function, then it cannot receive Ether through regular transactions and throws an exception. In other words, if a contract is intended to not receive Ether, then the payable in the fallback function can simply be removed. The fallback function is also triggered if someone tries to call a function that does not exist in the contract, and is often utilized for re-entrancy attacks.

Create-Based Re-Entrancy Attack. CREATE is an instruction that creates a new contract during execution of a contract (we call this the original contract for convenience). The new contract consists of initialization codes and the codes of its functions, and these codes are allocated by following a STOP instruction of the original contract. Once the initialization codes are executed, data is initialized and then bytecodes of the new contract are returned. These transitions via the initialization codes can be viewed as function calls by CREATE. Whenever a new contract is created, its constructor will be executed immediately. In the attacks by Rodler et al. [15], a newly generated contract by CREATE can issue further calls in its constructor to other contracts, including malicious contracts, via CALL in the initialization codes. Here, the victim contract first creates a new contract and then updates its internal state. The newly created contract then calls a contract owned by an adversary. Consequently, the adversary maliciously withdraws from the victim contract via the re-entrancy.

Cross-Function Re-Entrancy Attack. Whenever function calls such that Ether is sent to an external contract account are executed, transactions are created by the CALL instruction. In doing so, some data are sent together with the transaction. The most significant four bytes are a function ID of a caller function, which is obtained from a function name and a type name of variables. The callee contract obtains the function ID of the caller contract from the received transaction and then decides a function to be executed by checking if the ID is identical to that owned by the callee contract. If the callee contract does not own a function ID specified by a transaction, a fallback function is executed as described above. Cross-function re-entrancy attacks proposed by Rodler et al. [15] are launched over multiple functions of the victim contract. Specifically, compared to classical re-entrancy attacks that are launched by re-entering the same contract via the same function, cross-function re-entrancy attacks are launched by re-entering the same contract via a different function.

3.2 Technical Difficulties

We present some technical issues that need to be solved for the analysis of the attacks described in the previous section. First, existing tools [5, 6, 8] do not provide support for generating running states of a callee contract via opcodes such as CREATE and CALL, which utilize an external contract. Consequently, in these tools, a caller contract is unidentified uniquely from the standpoint of a callee contract and vice versa. More concretely, in the reference implementation of re-entrancy attacks [15], an initialization code for the create-based re-entrancy attack is executed in the aforementioned manner. In doing so, the initialization code is on an infeasible path for existing tools. Moreover, because the main body of the created code is determined by a return value from the initialization code, the contract by CREATE is infeasible unless the initialization code is analyzed. Thus, the behavior of the created contract including CALL is unknown during offline analysis. Likewise, when an external contract is called by CALL for the cross-function re-entrancy attacks [15], analysis should be executed independently for each contract because running states of a callee contract are not generated. Consequently, utilizing path conditions for a caller contract is no longer meaningful to analyze behavior of a callee contract. Furthermore, there are many candidates of function combinations, and thus analyzing vulnerability to cross-function re-entrancy attacks becomes difficult due to the potential state explosion [15].

Second, a method that evaluates vulnerabilities should be considered as well. Several analysis tools [8, 9, 14] report feasible paths but do not provide evaluation and detection of vulnerabilities. Accordingly, analysts often need to determine if a contract is vulnerable in a heuristic fashion, and hence many false positives and false negatives are produced. To eliminate false positives and false negatives, analysis tools should include a method that can evaluate vulnerabilities without requiring analysts to have prior knowledge of the attacks.

4 Design of RA

In this section, we present Re-entrancy Analyzer (RA), a new static analysis tool for re-entrancy attacks on Ethereum smart contracts. We first describe our design concept and then describe symbolic re-entrancy emulation and vulnerability verification as main processes, including their implementation.

4.1 Design Concept

The main idea of RA is to combine symbolic execution and an SMT solver in a dual way, i.e., symbolic re-entrancy emulation and vulnerability verification.

First, modules that emulate re-entrancy attacks based on symbolic execution are developed. Loosely speaking, by using an SMT solver to verify if conditions for calling to an external contract and generating basic blocks are satisfied, RA can identify which function is called. Consequently, RA can find path conditions via symbolic execution, which transits executions to each basic block recursively. Second, by utilizing the path conditions obtained from the emulation process above, RA verifies if a resultant running state of a function call based on a fallback function is equivalent with the original behavior, i.e., without the fallback function, on the CFG obtained from the emulation. The verification is done by the SMT solver. While symbolic executions in literature [5, 6, 8] report only program behavior via CFGs obtained within a single contract, RA emulates re-entrancy attacks including inter-contract behavior and then it verifies the vulnerability of these contracts to re-entrancy attacks. Consequently, in contrast to the early literature, while the coverage of the analysis for re-entrancy attacks is improved drastically, RA can analyze the vulnerabilities precisely, i.e., without false positives and false negatives and without requiring analysts to have prior knowledge of attack patterns, even for state-of-the-art re-entrancy attacks [15].

4.2 Tool Overview

The overview of RA is as follows. First, for the symbolic re-entrancy emulation, RA generates CFGs from bytecodes whereby a newly created/called contract is represented within a flow of the main contract via symbolic execution. Then, for vulnerability verification, RA verifies whether the bytecodes are vulnerable to re-entrancy attacks by utilizing the Z3 SMT solver in accordance with path conditions obtained from the symbolic re-entrancy emulation. To do this, the following notions are defined for RA:

  • Local-world state is owned by a basic block in local and stores global information, such as balance or storage.

  • Call stack is a stack that stores a return address to a basic block.

  • Contract queue is a queue that stores bytecodes of a contract to be analyzed.

An overview of RA including the notions is presented in Fig. 1. RA mainly consists of three modules, namely, CFManager, VM, and Verifier. The symbolic re-entrancy emulation process is mainly conducted by CFManager and VM. On the other hand, the vulnerability verification process is executed by Verifier assisted by VM. We describe the role of each module below.

CFManager handles feasible paths by directly operating data structures in a basic block for a CFG. A basic block contains the local-world state, the call stack, as well as mnemonic of instructions, running states, and path conditions. These give us information retrieved from the blockchain and return values, i.e., information obtained as a result of each caller/callee contract. Consequently, symbolic execution of each block can be covered even through an external contract. Then, VM receives a contract to be analyzed from the contract queue and then executes instructions in accordance with basic blocks via CFManager and checking conditions by Verifier. Finally, Verifier verifies existence of re-entrancy vulnerability by utilizing the Z3 SMT solver with the information obtained by VM. Based on these modules, the symbolic emulation for inter-contract analysis is provided and then the verification of vulnerabilities is executed. The details of each process are presented below.

Figure 1: Overview of RA

4.3 Symbolic Re-entrancy Emulation

The goal of this process is to generate a CFG completely via emulation of re-entrancy attacks. In particular, the CFManager generates a CFG to represent inter-contract behavior by recording basic blocks with transitions as edges. We call such a CFG an extended CFG (ECFG) for convenience. In ECFGs, CREATE and CALL instructions are utilized as separators of basic blocks in addition to JUMP and STOP instructions. For convenience, let contracts to be executed by CREATE and CALL be callers, and let contracts to be created or called be callees. When CREATE appears, the symbolic execution is transited to its callee contract, i.e., the initialization code. Similarly, when CALL appears, the execution is transited to a function of its callee contract. When STOP appears within the callee contracts described above, the execution is returned to the caller contract. These transitions of contracts are managed by VM.

We now describe the emulation process by RA in detail. For instance, for analysis of the create-based re-entrancy attacks, VM extracts an initialization code of a contract from variables of CREATE by specifying basic blocks via the Z3 SMT solver, and CFManager transits the execution to those blocks in accordance with branches decided by the Z3 SMT solver. By obtaining each block and connecting them, CALL in the initialization code can be identified. Moreover, VM can register the callee contract obtained by the initialization code in the contract queue, and thus the whole contract can be symbolically emulated in a recursive manner. On the other hand, for analysis of the cross-function re-entrancy attacks, a function ID is symbolically executed as a symbolic value, and then Verifier extracts the function ID to be executed by the Z3 SMT solver. By giving the ID at the beginning execution and re-execution, combination of any functions is representative.

4.4 Vulnerability Verification

The vulnerability verification is done by Verifier with path conditions obtained from the emulation described in the previous section. Let functions to be verified be , where contains a function call during the execution. We also denote by a set of path conditions where is executed by taking over a result in the execution of , and by a set of path conditions where calls a fallback function in a manner that the fallback function calls and then is executed again with the result of . The Verifier module can receive and from the VM module. Then, the Z3 SMT solver verifies whether a program is vulnerable or not as follows:

(1)

The procedure of the vulnerability verification is shown in Fig. 2. The withdraw process is not executed when is executed after executing in a normal way, i.e., executions in . On the other hand, the withdraw process is executed due to the re-entrancy on whereby a fallback function calls , i.e., executions in . RA decides that a contract is vulnerable if program behaviors are equivalent in these executions. In particular, RA checks if there is no case where the behaviors are equivalent. Intuitively, the main difference between these cases is whether instructions following CALL in are executed before . Specifically, contains a branch that determines whether the withdraw process is executed in accordance with changing states on the blockchain, where path conditions at the end of executions for both cases are different from each other.

Figure 2: Process of Vulnerability Verification

4.5 Implementation

The main techniques for implementing the symbolic re-entrancy emulation of RA are presented as the technical parts of the implementation below. In particular, we describe how basic blocks are controlled and function IDs are extracted to reduce potential state explosion, which is one of the problems for static analysis tools according to Rodler et al. [15]. Several techniques for improving the performance are presented as well. We also plan to release the source codes of the whole implementation publicly via GitHub222The repository is publicly available. (https://github.com/wanidon/RA).

Environment. RA is implemented on Python 3.7 and three modules, i.e., z3, pysha3, and Graphviz. The z3 module provides APIs for the use of the Z3 SMT solver on Python. The pysha3 module enables the use of the Keccak256 hash function as a SHA-3 module. Finally, Graphviz provides support for drawing CFGs. We also note that RA is independent of any specific compiler because it uses bytecodes as input.

Control of Basic Blocks for CFManager. The method that represents a branch with a conditional jump operation JUMPI is as follows. First, RA adopts the depth-first search for finding feasible paths, and the CFManager owns a data structure named dfs_stack that stores candidates of basic blocks to be searched. Then, the CFManager pops two variables for JUMPI from the dfs_stack, i.e., a jump address and a condition for the jump where is determined by true or false. When is a symbolic value, the z3 module decides whether is satisfiable. If so, there exists a path that does not contain jump. Then, a new block is generated by copying the current basic block and incrementing a program counter, and the new block is newly recognized as the basic block. Next, the solver decides whether is satisfiable. If so, there exists a path that contains jump. Then, the current basic block is copied as a jump block and the program counter is set as the jump address. In the case that there exists a path with jump but a path without jump does not exist, the current basic block is set as a jump block. In contrast, if both a path with jump and that without jump exist, then the jump block itself is stored in the dfs_stack.

Extraction of Function ID for Verifier. Decision of a function ID for Verifier is implemented as follows. First, a symbolic value is assigned with the most significant four bytes for any transaction. The value is loaded onto a contract to be analyzed, and hence paths are branched for each function by comparing them with actual function IDs for the contract. Then, the function ID can be obtained because a solution satisfying is obtained by deciding the satisfiability of path conditions for all paths with the z3 module. If a CALL instruction appears during the execution, CALLABLE is recorded as a state of a basic block. The state of the basic block is inherited by all the descendant paths. In extracting an actual function ID, ID of a function with function call is recorded if the end state of a basic block is CALLABLE.

Speed-up Techniques. The entire performance of RA was improved by using four optimization techniques. First, RA can verify combinations of function calls described in Equation (1) in parallel. Second, a constraint is given on push to a stack. A CALL instruction normally requires push to a stack in accordance with success of a function call, i.e., 0 or 1. In contrast, a re-entrancy attack is always executed after the success of a function call. Thus, RA always pushes 1, i.e., the success of the function call, to a stack as assumed in the success of CALL instruction. This enables RA to reduce the number of path conditions as a speed-up technique. Third, because CALL is required to always succeed, a constraint, i.e., a balance for any contract account is a positive value, is given on path conditions only at the end states. The computational complexity can be reduced drastically by giving the constraint at the end states instead of at each change of the balance. Surprisingly, for analysis of several contracts, the computational performance becomes ten times faster by using this constraint although we omit the details due to space limitation. Finally, RA stops to analyze paths for some stop instruction without the rollback process. In particular, a stop instruction called REVERT contains the rollback about the blockchain related to a given transaction. However, to the best of our knowledge, re-entrancy attacks never occur in this case, thus RA no longer analyzes the remaining parts of such paths.

5 Evaluation

In this section, we show that RA can analyze inter-contract control flows precisely in comparison with execution of Oyente [8]. Comparison with other tools is theoretically discussed in Section 6. The computational performance of RA for analysis is also shown.

5.1 Case Studies

As case studies, we test RA with reference implementations of the re-entrancy attacks by Rodler et al. [15] and known re-entrancy vulnerabilities [17, 18]. The goal of these studies is to clarify the ability of RA to perform inter-contract analysis. The codes of the reference implementations of the re-entrancy attacks [15] are publicly available333https://github.com/uni-due-syssec/eth-reentrancy-attack-patterns.

Creation of Contracts. On a CFG output by RA for a contract created via CREATE, the execution is transited to a callee contract on the basic block (denoted by the red box), and the execution is returned on the basic block (denoted by the green box). In contrast, Oyente cannot transit execution into the initialization code, i.e., a created contract, in a symbolic manner. The remaining parts of the CFG are identical to a CFG output from Oyente. Consequently, RA can handle a function call by extracting the initialization codes on CREATE instructions.

On a CFG output by RA on analysis of the reference implementation of the create-based re-entrancy attack, the execution is transited/returned to/from a callee contract on the basic block. This confirms that, unlike Oyente, RA provides analysis of create-based re-entrancy attacks in an inter-contract fashion.

Call of Contracts. Suppose that an adversary utilizes a contract with only CALL instructions. In particular, because there are two basic blocks, two function calls and two transitions to the caller are identified according to the analysis results of RA. These results confirm that RA provides analysis of cross-function re-entrancy attacks in an inter-contract fashion.

5.2 Vulnerability Analysis

The performance of RA is evaluated by using it to analyze the vulnerability of a number of deployed contracts to re-entrancy attacks. In particular, the target contracts included in the evaluation are the Fund contract in the official document of Solidity [17], a contract named KnownReentrancy obtained from the code of “Reentrancy on a Single Function” published on the webpage of “Ethereum Smart Contract Best Practices” [18], the Bank contract based on create-based re-entrancy attack [15], the Token contract based on cross-function re-entrancy attack [15], and a contract named KnownCrossFunction obtained from the code of “Cross-function Reentrancy” published on “Ethereum Smart Contract Best Practices”. Both Fund and KnownReentrancy contain a single contract. Bank contains three contracts, two of which are benign, i.e., not vulnerable. Token contains twelve contracts, eleven of which are benign and only one contract is vulnerable. Finally, KnownCrossFunction contain one vulnerable contract and one benign contract because the cross-function re-entrancy attack needs “cross” calls between different contracts. Several contracts include multiple functions and hence analysis of re-entrancy attacks becomes complicated due to combinations of function calls as described in Section 3.2. In doing so, the goal of this evaluation is to determine whether RA can precisely identify both vulnerable contracts and benign contracts.

The results are shown in Table 1. According to the table, RA can precisely verify all the existing vulnerabilities without false positives and false negatives. In other words, in addition to being able to analyze contracts that are vulnerable to re-entrancy attacks, RA can also analyze contracts that are not vulnerable, as confirmed by the true negative rate. This precise evaluation for verification of vulnerabilities was obtained by the use of the Z3 SMT solver with path conditions obtained from symbolic execution.

Contract Code Length Benign Functions Vulnerable Functions TPR [% ] FPR [% ] TNR [% ] FNR [% ]
Fund [17] 356 0 1 100 0 0 0
Known Reentrancy [18] 365 0 1 100 0 0 0
Bank [15] 1694 2 1 100 0 100 0
Token [15] 2516 11 1 100 0 0 0
KnownCross Function [18] 680 1 1 100 0 100 0
Table 1: Results of Analysis of Contracts Vulnerable to Re-Entrancy Attacks: The Benign Functions and Vulnerable Functions columns represent the number of benign functions and vulnerable functions in the contract, respectively. The true positive rate is denoted by TPR, false positive rate by FPR, true negative rate by TNR, and false negative rate by FNR.

5.3 Computational Performance

We now present the computational performance of RA for analysis of re-entrancy attacks. The performance was measured by utilizing the time.perf_counter function as an average of ten executions. The environment for measurement is as follows: iMac 21.5-inch, 2017 with 3.6GHz Intel Core i7 processor, 32GB memory, and Radeon Pro 560 4GB as GPU. The measurement was done by implementing parallel-processing. As can be seen in Table 2, the computational time for analysis becomes larger in proportion to the number of function calls. Surprisingly, analysis of Bank is fast even with its code length because Bank is based on create-based re-entrancy attack, i.e., the majority of its codes to create a contract and the function call are restricted to call the created contract. In contrast, Token is based on cross-function re-entrancy attack and its code length and variation of function calls are large. Nevertheless, RA was still able to analyze Token within reasonable time.

Contract Code Length Combinations of Functions Time [sec]
Fund [17] 356 1 20.750
KnownReentrancy [18] 365 1 23.003
Bank [15] 1694 3 47.579
Token [15] 2516 12 519.676
KnownCrossFunction [18] 680 2 37.705
Table 2: Computational Time of RA for Analysis: The Combinations of Functions column represents the number of combinations of function calls in the contract.

6 Discussion

In this section, several considerations such as comparison to other existing tools and the current limitations of RA are discussed.

Comparison to Other Analysis Tools. Table 3 shows a comparison of different tools in terms of their ability to detect re-entrancy attacks discovered by Rodler et al. [15]. Oyente [8] and Securify [5] cannot detect such re-entrancy attacks. In particular, Oyente fails to detect the vulnerabilities, while Securify produces false alerts due to its conservative policy. Sereum and ÆGIS can potentially detect the vulnerabilities, but they are based on dynamic analysis. Finally, Annotary is a static analysis tool that can deal with CREATE and CALL instructions. Thus, it can potentially detect the re-entrancy attacks, but its paper did not discuss re-entrancy attacks. Moreover, Annotary’s analysis target are Solidity codes and thus it may experience difficulties in analyzing deployed contracts.

Tool Static Analysis Cross-Function Create-Based Analysis Target
Oyente[8] Static EVM
Securify [5] Static EVM
Annotary [13] Static Solidity
Sereum [15] Dynamic EVM
ÆGIS [16] Dyanmic EVM
RA Static EVM
Table 3: Comparison of RA and Other Analysis Tools: indicates the tool can detect the attack, indicates the tool cannot detect the attack, and indicates the tool can potentially detect the attack but did not provide a discussion in its paper.

Computational Complexity for Vulnerability Verification. Let the number of functions with function call be , the number of those paths be , the number of any function be , and the number of those paths be . Here, the number of functions to be executed in the first step is and its resulting number of the end state for each execution is . Then, in the second step, functions are executed in proportion to the number of the end states, and the number of the end states of is . Consequently, the order complexity to obtain path conditions at the end of process is . By parallelizing the process, the complexity is because all combinations of function calls can be parallelized. To compute path conditions using Equation (1), RA decides whether two sets, i.e., and , of the path conditions obtained from combinations of one function have an identity relation between and . In doing so, the order complexity is , which is smaller than the complexity to obtain path conditions at the end states. The execution time for analysis is polynomial time. RA is expected to be utilized for analysis of contracts developed by users in realistic time as shown in Section 6.

Extension to Analysis of Delegated Re-Entrancy Attacks. Rodler et al. [15] presented delegated re-entrancy attacks where a contract invokes another contract as a library within instructions that utilize contracts as an external library, e.g., DELEGATECALL or CALLCODE. These instructions are currently not implemented in RA, but RA can be extended to analyze delegated re-entrancy attacks by introducing these instructions. The main technical difficulty in analysis of delegated re-entrancy attacks is that it is unknown which library contract will be used [15]. This difficulty is also a problem in analyses of create-based and cross-function re-entrancy attacks, which have been overcome already in RA.

Limitations. The current implementation of RA has three limitations. First, analysis of gas is not considered. Thus, contracts that restrict gas consumption may not be analyzed precisely. Second, a case where multiple contracts are tightly coupled with each other, i.e., more than two contracts are strongly interdependent, is out of the scope of RA. Finally, bytecodes of contracts cannot include symbolic values. Accordingly, a case where created contracts are different for each execution is not considered. Improving the points described above is our ongoing work.

7 Related Works

In this section, we recall early literature on security analysis of Ethereum smart contracts in terms of symbolic execution and formal methods as static analysis. Then, we describe several multidisciplinary approaches proposed in the past few years as additional related works. Interested readers are advised to read the survey paper [19] for details on EVM analysis.

Symbolic Execution. Symbolic execution of Ethereum smart contracts was originally started by Oyente [8]. Although there are many subsequent works [9, 10, 11, 12], these works do not perform inter-contract analysis. Furthermore, analysis of re-entrancy attacks is often heuristic and thus Oyente often produces many false positives. Extensions of Oyente that support readability of outputs from symbolic execution have been proposed [20, 21], and the usability of RA can be potentially be improved in a similar way.

The closest work to RA is Annotary [13], which can analyze inter-contract behavior via both symbolic executions of EVM bytecodes and the Z3 SMT solver. The major difference of Annotary from RA is that Annotary mainly targets analysis of Solidity codes. In other words, RA mainly checks if deployed contracts are secure or not, while Annotary supports developers in implementing secure codes. Although the authors of Annotary did not consider the new attacks by Rodler et al. [15], we consider their idea and work to be elegant nonetheless.

ETHBMC [22] and VerX [23] are recent state-of-the-art works that verify properties of Ethereum. VerX is similar to Annotary in terms of taking Solidity codes as input and dealing with external contracts. Similar to Annotary, the new attacks by Rodler et al. [15] were not considered in VerX. On the other hand, ETHBMC takes EVM for symbolic executions and its motivation is rather close to that of RA. However, ETHBMC mainly focuses on parity vulnerability.

Formal Methods. Formal verification of EVM was motivated by Bhargavan et al. [3], and EVM was correctly formalized as KEVM [4]. However, verification of the security is challenging in general, and the existing works [5, 6] do not provide support for inter-contract analysis, which is the main target of our work. As more theoretical approach, Grishchenko et al. [7] formalized several attacks, including re-entrancy attacks, as well as formalization of EVM. Their formalization inspired the vulnerability verification of RA.

Multidisciplinary Approaches. NeuCheck [24] can rapidly analyze Solidity source codes by extracting a syntax tree on a cross-platform environment. TokenScope [25] can detect vulnerabilities by identifying tokens that have a different specification from the ERC20 token. SMARTSHIELD [26] is a bytecode rectification system that fixes security-related bugs automatically. Qian et al. [27]

presented an automated re-entrancy detection framework based on machine learning. The techniques in these multidisciplinary works can be used to potentially improve RA. Finally, ILF 

[28] and ContractWard [29] use a combination of machine learning and symbolic execution to improve testing coverage. These works are expected to have improved performance if they deploy RA as a building block.

8 Conclusion

In this paper, we introduced RA, a static analysis tool that provides inter-contract analysis of the EVM bytecodes of Ethereum smart contracts to detect vulnerabilities to state-of-the-art re-entrancy attacks [15]. Using RA, analysts do not need to have prior knowledge of re-entrancy attacks to detect them. To create RA, we designed modules that represent inter-contract CFGs by the symbolic re-entrancy emulation and the vulnerability verification with the Z3 SMT solver to verify the re-entrancy vulnerability of deployed contracts. We also conducted experiments on deployed contracts and confirmed the performance of RA by precisely identifying combinations of contracts with and without vulnerabilities. The aforementioned performance could be obtained by virtue of high-level combination of the symbolic execution and the Z3 SMT solver.

As future work, we plan to extend RA for a case in which a fraction of contracts to be verified and external contracts becomes many-to-many. If successful, RA will be able to deal with more complicated attacks that can be proposed in the future. Moreover, we plan to design capabilities that check for vulnerabilities aside from re-entrancy, such as time-dependent vulnerability which utilizes a timestamp for each block. We believe that the inter-contract analysis capability of RA can potentially identify such a vulnerability. Finally, we will also try to improve the computational performance of RA by reusing parts of the computations for vulnerability verification.

8.0.1 Code Availability

The source code of RA is available in GitHub:
https://github.com/wanidon/RA.

8.0.2 Acknowledgement

This work was supported in part by the Innovation Platform for Society 5.0 at MEXT, and by Secom Science and Technology Foundation.

References

  • [1] Nicola Atzei, Massimo Bartoletti, and Tiziana Cimoli. A survey of attacks on ethereum smart contracts (sok). In Proc. of POST 2017, volume 10204 of LNCS, pages 164–186. Springer, 2017.
  • [2] Christof Ferreira Torres and Mathis Steichen. The art of the scam: Demystifying honeypots in ethereum smart contracts. In Proc. of Usenix Security 2019, pages 1591–1607. Usenix Association, 2019.
  • [3] Karthikeyan Bhargavan, Antoine Delignat-Lavaud, Cédric Fournet, Anitha Gollamudi, Georges Gonthier, Nadim Kobeissi, Natalia Kulatova, Aseem Rastogi, Thomas Sibut-Pinote, Nikhil Swamy, et al. Formal verification of smart contracts: Short paper. In Proc. of PLAS 2016, pages 91–96. ACM, 2016.
  • [4] Everett Hildenbrandt, Manasvi Saxena, Nishant Rodrigues, Xiaoran Zhu, Philip Daian, Dwight Guth, Brandon Moore, Daejun Park, Yi Zhang, Andrei Stefanescu, et al. Kevm: A complete formal semantics of the ethereum virtual machine. In Proc. of CSF 2018, pages 204–217. IEEE, 2018.
  • [5] Petar Tsankov, Andrei Dan, Dana Drachsler-Cohen, Arthur Gervais, Florian Buenzli, and Martin Vechev. Securify: Practical security analysis of smart contracts. In Proc. of CCS 2018, pages 67–82. ACM, 2018.
  • [6] Sukrit Kalra, Seep Goel, Mohan Dhawan, and Subodh Sharma. Zeus: Analyzing safety of smart contracts. In Proc. of NDSS 2018. Internet Society, 2018.
  • [7] Ilya Grishchenko, Matteo Maffei, and Clara Schneidewind. A semantic framework for the security analysis of ethereum smart contracts. In Proc. of POST 2018, volume 10804 of LNCS, pages 243–269. Springer, 2018.
  • [8] Loi Luu, Duc-Hiep Chu, Hrishi Olickel, Prateek Saxena, and Aquinas Hobor. Making smart contracts smarter. In Proc. of CCS 2016, pages 254–269. ACM, 2016.
  • [9] Ivica Nikolić, Aashish Kolluri, Ilya Sergey, Prateek Saxena, and Aquinas Hobor. Finding the greedy, prodigal, and suicidal contracts at scale. In Proc. of ACSAC 2018, pages 653–663. ACM, 2018.
  • [10] Christof Ferreira Torres, Julian Schütte, et al. Osiris: Hunting for integer bugs in ethereum smart contracts. In Proc. of ACSAC 2018, pages 664–676. ACM, 2018.
  • [11] Han Liu, Chao Liu, Wenqi Zhao, Yu Jiang, and Jiaguang Sun. S-gram: towards semantic-aware security auditing for ethereum smart contracts. In Proc. of ASE 2018, pages 814–819. ACM, 2018.
  • [12] Ting Chen, Xiaoqi Li, Xiapu Luo, and Xiaosong Zhang. Under-optimized smart contracts devour your money. In Proc. of SANER 2017, pages 442–446. IEEE, 2017.
  • [13] Konrad Weiss and Julian Schütte. Annotary: A Concolic Execution System for Developing Secure Smart Contracts. In Proc. of ESORICS 2019, volume 11735 of LNCS, pages 747–766. Springer, 2019.
  • [14] Jialiang Chang, Bo Gao, Hao Xiao, Jun Sun, Yan Cai, and Zijiang Yang. scompile: Critical path identification and analysis for smart contracts. In Proc. of ICFEM, volume 11852 of LNCS, pages 286–304. Springer, 2019.
  • [15] Michael Rodler, Wenting Li, Ghassan O Karame, and Lucas Davi. Sereum: Protecting existing smart contracts against re-entrancy attacks. In Proc. of NDSS 2019. Internet Society, 2019.
  • [16] Christof Ferreira Torres, Mathis Steichen, Robert Norvill, Beltran Fiz Pontiveros, and Hugo Jonker. Ægis: Shielding vulnerable smart contracts against attacks. In Proc. of AsiaCCS 2020. ACM, 2020.
  • [17] Security Considerations Solidity 0.5.11 documentation. https://solidity.readthedocs.io/en/v0.5.11/security-considerations.html.
  • [18] Known Attacks. https://consensys.github.io/smart-contract-best-practices/known_attacks/.
  • [19] Monika Di Angelo and Gernot Salzer. A survey of tools for analyzing ethereum smart contracts. In Proc. of DAPPCON 2019, pages 69–78. IEEE, 2019.
  • [20] Mark Mossberg, Felipe Manzano, Eric Hennenfent, Alex Groce, Gustavo Grieco, Josselin Feist, Trent Brunson, and Artem Dinaburg. Manticore: A user-friendly symbolic execution framework for binaries and smart contracts. arXiv preprint arXiv:1907.03890, 2019.
  • [21] B Mueller. Smashing smart contracts. In 9th HITB Security Conference, 2018.
  • [22] Joel Frank, Cornelius Aschermann, and Thorsten Holz. ETHBMC: A bounded model checker for smart contracts. In Proc. of Usenix Security 2020. USENIX Association, 2020.
  • [23] Anton Permenev, Dimitar Dimitrov, Petar Tsankov, Dana Drachsler-Cohen, and Martin Vechev. Verx: Safety verification of smart contracts. In Proc. of IEEE S&P 2020, pages 414–430. IEEE, 2020.
  • [24] Ning Lu, Bin Wang, Yongxin Zhang, Wenbo Shi, and Christian Esposito. Neucheck: A more practical ethereum smart contract security analysis tool. Software: Practice and Experience, 2019:1–20, 2019.
  • [25] Ting Chen, Yufei Zhang, Zihao Li, Xiapu Luo, Ting Wang, Rong Cao, Xiuzhuo Xiao, and Xiaosong Zhang. Tokenscope: Automatically detecting inconsistent behaviors of cryptocurrency tokens in ethereum. In Proc. of CCS2019, pages 1503–1520. ACM, 2019.
  • [26] Yuyao Zhang, Siqi Ma, Juanru Li, Kailai Li, Surya Nepal, and Dawu Gu. Smartshield: Automatic smart contract protection made easy. In Proc. of SANER 2020, pages 23–34. IEEE, 2020.
  • [27] Peng Qian, Zhenguang Liu, Qinming He, Roger Zimmermann, and Xun Wang. Towards automated reentrancy detection for smart contracts based on sequential models. IEEE Access, 8:19685–19695, 2020.
  • [28] Jingxuan He, Mislav Balunoviundefined, Nodar Ambroladze, Petar Tsankov, and Martin Vechev. Learning to fuzz from symbolic execution with application to smart contracts. In Proc. of CCS2019, page 531–548. ACM, 2019.
  • [29] Wei Wang, Jingjing Song, Guangquan Xu, Yidong Li, Hao Wang, and Chunhua Su. Contractward: Automated vulnerability detection models for ethereum smart contracts. IEEE Transactions on Network Science and Engineering, pages 1–1 (Early Access), 2020.