
IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, VOL. , NO. , 1

DEFECTCHECKER: Automated Smart Contract Defect Detection by Analyzing EVM Bytecode

Jiachi Chen, Xin Xia, David Lo, John Grundy, Xiapu Luo and Ting Chen

Abstract—Smart contracts are Turing-complete programs running on the blockchain. They are immutable and cannot be modified, even when bugs are detected. Therefore, ensuring smart contracts are bug-free and well-designed before deploying them to the blockchain is extremely important. A contract defect is an error, flaw or fault in a smart contract that causes it to produce an incorrect or unexpected result, or to behave in unintended ways. Detecting and removing contract defects can avoid potential bugs and make programs more robust. Our previous work defined 20 contract defects for smart contracts and divided them into five impact levels. According to our classification, contract defects with seriousness level between 1-3 can lead to unwanted behaviors, e.g., a contract being controlled by attackers. In this paper, we propose DefectChecker, a symbolic execution-based approach and tool to detect eight contract defects that can cause unwanted behaviors of smart contracts on the Ethereum blockchain platform. DefectChecker can detect contract defects from smart contracts’ bytecode. We verify the performance of DefectChecker by applying it to an open-source dataset. Our evaluation results show that DefectChecker obtains a high F-score (88.8% in the whole dataset) and only requires 0.15s to analyze one smart contract on average. We also applied DefectChecker to 165,621 distinct smart contracts on the Ethereum platform. We found that 25,815 of these smart contracts contain at least one of the contract defects that belongs to impact level 1-3, including some real-world attacks.

Index Terms—Smart Contracts, Ethereum, Contract Defects Detection, Bytecode Analyze, Symbolic Execution


1 INTRODUCTION

In recent years, decentralized cryptocurrencies have attracted considerable interest. To ensure these systems are scalable and secure without the governance of a centralized organization, decentralized cryptocurrencies adopt the blockchain concept as their underlying technology. Bitcoin [1] was the first digital currency, and it allows users to encode scripts for processing transactions automatically. However, scripts in Bitcoin are not Turing-complete, which restricts their application to currencies, such as money transfer or payment. To address this limitation, Ethereum [2] leverages a technology named Smart Contracts, which are Turing-complete programs that run on the blockchain. By utilizing this technology, practitioners can develop decentralized applications (DApps) [3] and apply blockchain techniques to different fields such as gaming [4] and finance [5].

Smart contracts are usually developed using a high-level programming language, such as Solidity [6]. When developers deploy a smart contract to Ethereum, the contract will first be compiled into Ethereum Virtual Machine (EVM) bytecode. Then, each node on the Ethereum system will receive the smart contract bytecode and have a copy in their ledger. Anyone, even attackers, can invoke the smart contract by sending transactions to the corresponding contract address.

• Jiachi Chen, Xin Xia and John Grundy are with the Faculty of Information Technology, Monash University, Melbourne, Australia. E-mail: {Jiachi.Chen, Xin.Xia, John.Grundy}@monash.edu

• David Lo is with the School of Information Systems, Singapore Management University, Singapore. E-mail: [email protected]

• Xiapu Luo is with the Department of Computing, The Hong Kong Polytechnic University, Hong Kong. E-mail: [email protected]

• Ting Chen is with the School of Computer Science and Engineering, University of Electronic Science and Technology of China, China. E-mail: [email protected]

• Xin Xia is the corresponding author.

Manuscript received ; revised

Key features of smart contracts make them attractive targets for hackers [7]. On the one hand, many smart contracts hold valuable Ethers, and they cannot hide their balance, which gives hackers a financial motivation to attack [8], [9]. On the other hand, smart contracts run in a permission-less network, which means hackers can freely inspect all transactions and bytecode and try to find bugs in the contracts. Even worse, smart contracts cannot be modified, even when bugs are detected. Therefore, ensuring smart contracts are bug-free and well-designed before deploying them to Ethereum is extremely important.

A contract defect [10], [11] is an error, flaw, or fault in a smart contract that causes it to produce an incorrect or unexpected result, or to behave in unintended ways [12]. Detecting and removing contract defects is a way to avoid potential bugs and improve the design of existing code. In our previous work [11], we first defined 20 contract defects by analyzing StackExchange [13] posts. It is also the first work that used an online survey to validate whether smart contract developers consider these contract defects harmful, which makes the definitions more persuasive. The work divided the 20 contract defects into five impact levels and showed that smart contracts containing defects with impact levels 1 to 3 can exhibit unwanted behaviors, e.g., contracts being controlled by attackers.

However, our previous work did not propose a suitable tool that could detect these contract defects. To address this limitation, in this paper, we propose DefectChecker to detect eight contract defects defined in our previous work that belong to impact level 1 (high) to level 3 (medium), using only the bytecode of smart contracts. DefectChecker symbolically executes a smart contract's bytecode and does not need its source code. During the symbolic execution, DefectChecker generates the control flow graph (CFG) of the smart contract, as well as the “stack event”, and identifies three features, i.e., “Money Call”, “Loop Block”, and “Payable Function”. By using the CFG, stack event, and the three features, we design eight rules, one for each contract defect.

arXiv:2009.02663v2 [cs.SE] 23 Mar 2021

We verify the performance of DefectChecker by applying it to an open-source dataset developed in our previous work [11]. We also compare its results with those of three state-of-the-art tools, i.e., Oyente, Mythril and Securify. Our evaluation results show that DefectChecker obtains the highest F-score (88.8% on the whole dataset) and requires the least time (0.15s per contract) among these tools. We also crawled the bytecode of all smart contracts deployed on Ethereum by Jan. 2019 and applied DefectChecker to these 165,621 distinct smart contracts. Using DefectChecker, we found that 15.9% of smart contracts on Ethereum contain at least one contract defect of impact level 1 to 3.

The main contributions of this work are:

• To the best of our knowledge, DefectChecker is the most accurate and the fastest symbolic execution-based model for smart contract defect detection.

• We systematically evaluated our tool using an open-source dataset to test its performance. In addition, we crawled all of the bytecode (165,621 contracts) on the Ethereum platform at the time of writing and identified 25,815 smart contracts that contain at least one contract defect. Using these results, we find some real-world attacks and give examples to show the importance of detecting contract defects.

• Our datasets, tool and analysis results have been released to the community at https://github.com/Jiachi-Chen/DefectChecker/.

The organization of the rest of this paper is as follows. In Section 2, we provide background knowledge of smart contracts and introduce eight contract defects with code examples. Then, we introduce the architecture of DefectChecker in Section 3 and present its evaluation in Section 4. We conduct a large-scale evaluation based on Ethereum smart contracts in Section 5 and give two real-world attacks as case studies. In Section 6, we introduce the related work. Finally, we conclude the study and discuss possible future work in Section 7.

2 BACKGROUND AND MOTIVATION

In this section, we briefly introduce key background information about smart contracts and their contract defects.

2.1 Smart Contracts

Contracts. Leveraging blockchain techniques, smart contracts are autonomous protocols stored on the blockchain. Once started, a contract runs automatically, according to the program logic defined beforehand [14]. When developers deploy a smart contract to Ethereum, the contract is compiled to EVM bytecode and identified by a unique 160-bit hexadecimal contract address. The execution of a smart contract depends only on its code, and even the creator cannot affect its running or state. For example, if a contract does not contain functions for Ether transfer, even the creator cannot withdraw the Ethers. Smart contracts run on a permission-less network. Anyone can invoke the methods of smart contracts through the ABI (Application Binary Interface) [6]. The contract bytecode, transactions, and invocation parameters are visible to everyone.

Gas System. To ensure the security of smart contracts, each transaction of a smart contract is run by all miners. Ethereum uses the gas system [15] to measure the computational effort of a transaction, and the developers who send transactions to invoke smart contracts need to pay an execution fee. The execution fee is computed as gas cost × gas price. The gas cost depends on the computational resources consumed by the execution, and the gas price is offered by the transaction creator. To limit the gas cost, when developers send transactions to invoke contracts, they set the Gas Limit, which determines the maximum gas cost. If the gas cost of a transaction exceeds its Gas Limit, the execution fails and throws an out-of-gas error [2]. Some special operations restrict the Gas Limit to a specific value. For example, address.transfer() and address.send() are two methods provided by Ethereum that are used to send Ethers. If a smart contract uses these methods to send Ethers to another smart contract, the Gas Limit is restricted to 2300 gas units [6]. 2300 gas units are not enough to write to storage, call functions or send Ethers, which can lead to the failure of transactions. Therefore, address.transfer() and address.send() can only be used to send Ethers to externally owned accounts (EOAs).
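The fee rule and the Gas Limit check described in the Gas System paragraph can be sketched in Python as follows. This is an illustrative model rather than Ethereum client code: the function names and any example numbers are hypothetical, and only the formula gas cost × gas price, the out-of-gas failure, and the 2300-gas restriction are taken from the text.

```python
STIPEND = 2300  # Gas Limit imposed by address.transfer() / address.send() (from the text)


class OutOfGas(Exception):
    """Raised when the gas cost of an execution exceeds its Gas Limit."""


def execution_fee(gas_cost: int, gas_price: int) -> int:
    """Fee paid by the transaction sender: gas cost x gas price."""
    return gas_cost * gas_price


def execute(gas_cost: int, gas_limit: int) -> int:
    """Return the gas used, or fail if the cost exceeds the Gas Limit."""
    if gas_cost > gas_limit:
        raise OutOfGas(f"needs {gas_cost} gas, limit is {gas_limit}")
    return gas_cost


def fits_in_stipend(recipient_gas_cost: int) -> bool:
    """True if the recipient's code can run within the 2300-gas restriction."""
    return recipient_gas_cost <= STIPEND
```

Under this model, a recipient contract whose code needs more than 2300 gas makes `execute(cost, STIPEND)` fail, which is the situation the paragraph describes for transfers to contract accounts.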
(There are two types of accounts on Ethereum: externally owned accounts, which are controlled by private keys, and contract accounts, which are controlled by their contract code [2].)

Ethereum Virtual Machine (EVM). To deploy a smart contract to Ethereum, its source code needs to be compiled to bytecode and stored on the blockchain. EVM is a stack-based machine; when a transaction needs to be executed, EVM first splits the bytecode into bytes, and each instruction, called an opcode, is encoded as one byte. As of April 2019, there are 140 unique opcodes [2], and each opcode is represented by a hexadecimal number [2]. EVM uses these opcodes to execute the task. For example, consider the bytecode 0x6070604001. EVM first splits this bytecode into bytes (0x60, 0x70, 0x60, 0x40, 0x01) and executes the first byte 0x60, which refers to the opcode PUSH1. PUSH1 pushes one byte of data onto the EVM stack. Therefore, 0x70 is pushed onto the stack. Then, EVM reads the next 0x60 and pushes 0x40 onto the stack. Finally, EVM executes 0x01, which refers to the opcode ADD. ADD pops the top two values from the stack, i.e., 0x70 and 0x40, and pushes their sum, 0xB0, onto the stack.

EVM Bytecode vs. JVM Bytecode in Control Flow Analysis. Control flow analysis methods have been widely used in other stack-based machines, e.g., JVM [16]. However, there are many differences between analyzing the control flow of Java bytecode and that of EVM bytecode. These differences present some new challenges in analyzing EVM bytecode. We highlight the key differences between the EVM bytecode analysis method we use in this paper and JVM bytecode analysis. These include:

(1) JVM bytecode has a fixed stack depth under different control-flow paths. The execution of JVM cannot reach the same program point with different stack sizes [17]. There are no such constraints for EVM bytecode, which greatly increases the difficulty of identifying the control-flow constructs in EVM bytecode. For example, consider the simple recursive function “function f(int a){f(a);}”. The code is compiled to EVM bytecode as:

    Block 1:
        JUMPDEST
        PUSH Block1' ID
        DUP2
        PUSH Block2' ID
        JUMP
    Block 2:
        JUMPDEST

There are two blocks; two block identifiers are pushed in the same block (block 1) and will be read by the same instruction (JUMP). The difference between the JVM and EVM is that the JVM creates a new frame [18] with a new operand stack for each method call, whereas the EVM just has one global operand stack. (A frame is used to store data and partial results, as well as to perform dynamic linking, return values for methods, and dispatch exceptions.)

(2) JVM bytecode has a clearly defined set of targets for each jump [19]. In contrast, the jump target in EVM bytecode is read from the EVM stack. When a conditional jump is used, the actual successor also depends on the second stack item. For example, in Figure 2, the jump target of JUMPI (ID 140) is read from the previous PUSH instruction, and the jump is controlled by the second stack item, i.e., ISZERO(GT(10, num)) (see Section 3.3 for details). If the second item evaluates to true, the jump target is 148; otherwise, the target is 141. The unconditional jump target is also read from the top of the EVM stack. For example, the jump target of JUMP (ID 147) in Figure 2 is also read from the previous PUSH instruction. Therefore, we need to symbolically execute the EVM bytecode to construct the control-flow edges.

(3) JVM bytecode has well-defined method invocation and return instructions [17]. In contrast, EVM bytecode uses jumps to perform its intra-contract function calls. In this case, to resolve an intra-contract function call, we need to inspect the top stack element to determine the jump target. For example, suppose there are two functions A and B. Function A contains three blocks, A1, A2, and A3; function B contains two blocks, B1 and B2. The code in block A2 calls function B. In EVM bytecode, there are no dedicated method invocation and return instructions. Instead, the code pushes the return address onto the stack; the arguments and the jump target (the block identifier of B1) need to be identified from the bytecode. To return, the code pops the caller’s block identifier (A3) and jumps to that block. Thus, the execution sequence is A1, A2, B1, B2, A3. The identifiers of B1 and A3 must be obtained from the bytecode through symbolic execution.

The Fallback Function. The fallback function is a unique feature of smart contracts compared to traditional programs. An example can be found at Line 13 of Listing 4; it is the only unnamed function in smart contract programming [6]. The fallback function does not have any arguments or return values. It is executed automatically on a call to the contract if none of the functions match the given function identifier [6]. For example, if a transaction calls function ‘A’ of the contract, and there is no function named ‘A’, then the fallback function is executed automatically to handle the erroneous function invocation. If the fallback function is marked as payable [6], it is also executed automatically when the contract receives Ethers.

The Call Instruction and Ether Transfer. Ether transfer is an important feature on Ethereum. In Solidity programming, there are three methods to transfer Ethers, i.e., address.call.value(), address.transfer(), and address.send(). Among these three methods, only address.call.value() allows users to send Ethers to a contract address, as the other two methods are limited to 2300 gas units, which are not enough to send Ethers. address.send() returns a boolean value, while address.transfer() throws an exception when errors happen and returns nothing. All three methods generate a CALL instruction in the contract bytecode. Other behaviors, e.g., function calls, can also generate CALL instructions. A CALL instruction reads seven values from the top of the EVM stack. They represent the gas limit, recipient address, transfer amount, input data start position, size of the input data, output data start position, and size of the output data, respectively.
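The stack behaviour described in this section (the PUSH1/ADD walk-through of 0x6070604001 and the seven operands read by CALL) can be sketched with a toy interpreter. This is a minimal sketch that supports only three opcodes; the function name `run`, the operand field names, and the choice to push 1 as the CALL result are assumptions made for illustration, and no gas accounting or real message call is performed.

```python
PUSH1, ADD, CALL = 0x60, 0x01, 0xF1

# Order in which CALL reads its operands, starting from the top of the stack.
CALL_OPERANDS = ("gas", "to", "value", "in_offset", "in_size", "out_offset", "out_size")


def run(bytecode: bytes):
    """Execute a tiny subset of EVM opcodes; return (stack, recorded CALLs)."""
    stack, calls, pc = [], [], 0
    while pc < len(bytecode):
        op = bytecode[pc]
        if op == PUSH1:
            stack.append(bytecode[pc + 1])  # the next byte is data, not an opcode
            pc += 2
        elif op == ADD:
            a, b = stack.pop(), stack.pop()
            stack.append((a + b) % 2**256)  # EVM words are 256 bits wide
            pc += 1
        elif op == CALL:
            # Pop the seven operands in the order listed in CALL_OPERANDS.
            calls.append({name: stack.pop() for name in CALL_OPERANDS})
            stack.append(1)  # assumption: the call succeeded
            pc += 1
        else:
            raise ValueError(f"unsupported opcode 0x{op:02x}")
    return stack, calls
```

For instance, `run(bytes.fromhex("6070604001"))` leaves the single value 0xB0 on the stack, matching the walk-through in the EVM paragraph. A program that pushes the seven CALL operands in reverse order and then executes 0xF1 yields one recorded call whose `gas` field is the last value pushed.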

2.2 Contract Defects in Smart Contracts

Our previous work [11] defined 20 contract defects for smart contracts. We divided these contract defects into five “impact” levels; among these contract defects, 11 belong to impact level one (most serious) to three (low seriousness) and might lead to unwanted behaviors. The definitions of these 11 contract defects are given in Table 1. In this paper, we propose DefectChecker, a symbolic execution tool to detect eight of these impact level one to three contract defects. DefectChecker does not detect contract defects belonging to levels 4 and 5, as these contract defects do not affect the normal running of smart contracts according to their definitions. For example, Unspecified Compiler Version is one of the level 5 contract defects. Removing it requires the developer of the contract to specify a fixed compiler version, such as 0.4.25. This contract defect does not affect the normal running of the contract and only poses a threat to future code reuse. This kind of contract defect is also difficult to detect at the bytecode level, as much semantic information is lost after compilation.

However, please note that in this work we do not consider three of the contract defects that belong to impact levels 1 to 3, namely Unmatched Type Assignment, Hard Code Address and Misleading Data Location, as they are not easy to detect at the bytecode level. Our analysis shows that they appear 22, 84, and 1 times among 587 smart contracts, respectively. Compilation to bytecode removes or adds some information, which may obscure these defects in the contract source code. For Hard Code Address, the bytecode we obtain from the blockchain does not contain the constructor, while we found that most Hard Code Address errors appear in constructors. To detect Unmatched Type Assignment, we need to know the maximum number of loop iterations, which is usually read from storage, and this value is not easy to obtain through static analysis. For example, for a loop “for(uint8 i = 0; i < num; i++)”, the data range of uint8 is from 0 to 255. Thus, if num is larger than 255, the loop counter will overflow. However, num is usually a storage variable or depends on an external input. Thus, it is difficult to detect this defect through bytecode analysis. Misleading Data Location is also not easy to detect from bytecode. In Solidity, storage is not dynamically allocated, and variables of type struct, array or mapping are kept in storage. Thus, variables of these three types created inside a function can point to storage slot 0 by default, which can lead to potential bugs. However, we cannot know whether a reference to slot 0 is intended or a mistake.
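The uint8 loop example can be made concrete with a short simulation. The sketch below is a hypothetical helper, not part of DefectChecker; it assumes wrapping 8-bit arithmetic, which is how the unchecked arithmetic of the Solidity 0.4.x versions mentioned above behaves (Solidity 0.8 and later revert on overflow by default).

```python
def uint8_loop_terminates(num: int, max_steps: int = 10_000) -> bool:
    """Simulate `for (uint8 i = 0; i < num; i++)` with wrapping 8-bit arithmetic."""
    i = 0
    for _ in range(max_steps):
        if not i < num:
            return True       # loop condition failed, so the loop exits
        i = (i + 1) & 0xFF    # uint8 increment wraps from 255 back to 0
    return False              # still looping after max_steps iterations
```

With `num` up to 255 the loop exits normally, while any larger `num` keeps the condition `i < num` true forever because `i` never exceeds 255. Whether a deployed contract is affected therefore depends on the runtime value of `num`, which is the information a static bytecode analysis usually lacks.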

2.2.1 Definition of Impact Levels

Below we introduce the definitions of impact levels one to three according to our previous work, and then give representative concrete examples of each of the eight smart contract defects.

• Impact 1 (IP1): Smart contracts containing these contract defects can lead to critical unwanted behaviors. Unwanted behaviors can be triggered by attackers, and they can make profits by utilizing the defects.

• Impact 2 (IP2): Smart contracts containing these contract defects can lead to critical unwanted behaviors. Unwanted behaviors can be triggered by attackers, but they cannot make profits by utilizing the defects.

• Impact 3 (IP3): There are two types of IP3. Type A: Smart contracts containing these contract defects can lead to critical unwanted behaviors, but unwanted behaviors cannot be triggered by attackers. Type B: Smart contracts containing these contract defects can lead to major unwanted behaviors. The unwanted behaviors can be triggered by attackers, but they cannot make profits by utilizing the defects.

Critical refers to contract defects that can cause a contract to crash, to be controlled by attackers, or to lose all of its Ethers. Major refers to contract defects that can lead to the loss of a part of the Ethers [11].
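The three definitions above amount to a small decision table over three criteria. The following sketch encodes it as a function; the function name, parameter names, and string labels are hypothetical, and combinations that the definitions do not mention are deliberately mapped to `None` rather than guessed.

```python
from typing import Optional


def impact_level(severity: str, attacker_triggerable: bool, attacker_profits: bool) -> Optional[str]:
    """Map the criteria of Section 2.2.1 to an impact level.

    severity is "critical" or "major". Returns None for combinations
    that the definitions above do not cover.
    """
    if severity == "critical":
        if attacker_triggerable:
            return "IP1" if attacker_profits else "IP2"
        return "IP3 (Type A)"  # critical, but attackers cannot trigger it
    if severity == "major" and attacker_triggerable and not attacker_profits:
        return "IP3 (Type B)"
    return None
```

For example, a defect with critical consequences that attackers can trigger for profit is classified as IP1, while the same defect without a profit motive is IP2.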

2.2.2 Examples of Smart Contract Defects

    1  contract Victim { ...
    2      address owner = owner_address;
    3      function sendMoney(address addr) {
    4          require(tx.origin == owner);
    5          addr.transfer(1 ether);
    6      }
    7  }
    8  contract Attacker { ...
    9      function attack(address vim_addr, address myAddr) {
    10         Victim vic = Victim(vim_addr);
    11         vic.sendMoney(myAddr);
    12     }
    13 }

Listing 1: Transaction State Dependency

(1). Transaction State Dependency (TSD): Contracts need to check whether the caller has the right permission for some permission-sensitive functions. A failed permission check can have serious consequences. tx.origin returns the original sender address of the transaction, but this method is not reliable, as the address it returns depends on the transaction state. Therefore, tx.origin should not be used to check whether the caller has permission to execute functions.

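Example: In Listing 1, the Attacker contract can defeat the permission check through its attack function (Line 9). If the owner of the Victim contract is led to send a transaction that invokes attack, tx.origin still equals owner when sendMoney (Line 3) runs, so the check on Line 4 passes even though the direct caller is the Attacker contract. In this way, the Ethers in the contract can be withdrawn to an address chosen by the attacker.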

Possible Solution: Solidity provides msg.sender to obtain the sender address, which can be used to check permissions instead of using tx.origin.
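The difference between the two checks can be illustrated with a small Python model of the call chain in Listing 1. This is a simplified sketch, not Solidity semantics in full: the `Ctx` class, the method names, and the string addresses are hypothetical, and the model only tracks which address each check sees.

```python
from dataclasses import dataclass


@dataclass
class Ctx:
    origin: str  # tx.origin: the account that started the transaction
    sender: str  # msg.sender: the immediate caller of the current function


class Victim:
    def __init__(self, owner: str):
        self.owner = owner

    def send_money_tx_origin(self, ctx: Ctx) -> bool:
        """Permission check as in Listing 1 (line 4)."""
        return ctx.origin == self.owner

    def send_money_msg_sender(self, ctx: Ctx) -> bool:
        """Permission check following the Possible Solution."""
        return ctx.sender == self.owner


class Attacker:
    address = "attacker"

    def attack(self, ctx: Ctx, victim: Victim, use_tx_origin: bool) -> bool:
        # The nested call keeps tx.origin, but msg.sender becomes the Attacker contract.
        inner = Ctx(origin=ctx.origin, sender=self.address)
        if use_tx_origin:
            return victim.send_money_tx_origin(inner)
        return victim.send_money_msg_sender(inner)
```

When the owner's transaction reaches `Victim` through `Attacker.attack`, the tx.origin check passes while the msg.sender check fails, because only the latter sees the Attacker contract as the caller.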

(2). DoS under External Influence (DuEI): Smart contracts will roll back a transaction if exceptions are detected during their running. If the error that leads to the exception cannot be fixed, the function will suffer a permanent denial of service (DoS).

Example: Listing 2 shows such an example. Here, members is an array which stores many addresses. However, one of the addresses is an attacker contract, and the transfer function can trigger an out-of-gas exception due to the 2300 gas limitation [2]. Then, the contract state will roll back. Since the code cannot be modified, the contract cannot remove the attacker's address from the members list, which means that as long as the attacker does not stop attacking, the function can no longer work.

Possible Solution: Developers can use a boolean value check instead of throwing exceptions in the loop. For example, using “if(members[i].send(0.1 ether) == false) break;” instead of line 3 in Listing 2.

    1  for (uint i = 0; i < members.length; i++) {
    2      if (this.balance > 0.1 ether)
    3          members[i].transfer(0.1 ether);
    4  }

Listing 2: DoS under External Influence
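The contrast between the transfer-based loop in Listing 2 and the send-based variant from the Possible Solution can be modeled as follows. This is an illustrative sketch: the function names and the `accepts` mapping (whether a recipient's code succeeds within the 2300-gas restriction) are assumptions, and the rollback is modeled simply by the exception discarding the list of payments.

```python
class TransferFailed(Exception):
    """Models the exception thrown by address.transfer()."""


def pay_with_transfer(members, accepts):
    """Listing 2 style: any failing transfer reverts the whole transaction."""
    paid = []
    for m in members:
        if not accepts[m]:
            raise TransferFailed(m)  # all payments made so far are rolled back
        paid.append(m)
    return paid


def pay_with_send(members, accepts):
    """Possible Solution style: check the boolean result and stop on failure."""
    paid = []
    for m in members:
        if not accepts[m]:  # send() returned false
            break
        paid.append(m)
    return paid
```

With one refusing address in the list, `pay_with_transfer` fails on every invocation, whereas `pay_with_send` completes and keeps the payments made before the failing address. Note that with `break`, members listed after the failing address are still skipped in this model.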

(3). Strict Balance Equality (SBE): Attackers can forcibly send Ethers to any contract by utilizing selfdestruct() [6]. This method does not trigger the fallback function, which means the victim contract cannot reject the Ethers. Therefore, smart contract logic may fail to work due to the unexpected Ethers sent by attackers.

Example: The doingSomething() function in Listing 3 can only be triggered when the balance is strictly equal to 1 ETH. However, the attacker can send 1 Wei (1 ETH = 1e18 Wei) to the contract to make the balance never equal to 1 ETH.

Possible Solution: The contract can use “≥” to replace “==”, as attackers can only add to the amount of a balance. In this case, it is difficult for the attackers to affect the logic of the program.

    if (this.balance == 1 ether) doingSomething();

Listing 3: Strict Balance Equality
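The effect of a forced 1 Wei deposit on the two comparisons can be checked with plain integer arithmetic. The sketch below uses hypothetical function names and keeps balances in Wei, using the conversion 1 ETH = 1e18 Wei given in the text.

```python
WEI_PER_ETHER = 10**18  # 1 ETH = 1e18 Wei (from the text)


def strict_check(balance_wei: int) -> bool:
    """Listing 3: this.balance == 1 ether."""
    return balance_wei == 1 * WEI_PER_ETHER


def relaxed_check(balance_wei: int) -> bool:
    """Possible Solution: this.balance >= 1 ether."""
    return balance_wei >= 1 * WEI_PER_ETHER
```

A balance of exactly 1 ETH satisfies both checks, but after an attacker adds a single Wei only the relaxed check still holds.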

(4). Reentrancy (RE): In Ethereum, a function can be executed several times in one execution by using the Call method. When a contract calls another, the execution waits for the call to finish [20]. Thus, it can lead to multiple invocations and money transfer in some situations.

TABLE 1: The definitions of contract defects with impact level 1-3. The first eight contract defects can be detected by DefectChecker.

| Contract Defect | Definition | Impact Level |
| --- | --- | --- |
| Transaction State Dependency (TSD) | Using tx.origin to check the permission. | IP1 |
| DoS Under External Influence (DuEI) | Throwing exceptions inside a loop which can be influenced by external users | IP2 |
| Strict Balance Equality (SBE) | Using strict balance equality to determine the execute logic. | IP2 |
| Reentrancy (RE) | The re-entrancy bugs. | IP1 |
| Nested Call (NC) | Executing CALL instruction inside an unlimited-length loop. | IP2 |
| Greedy Contract (GC) | A contract can receive Ethers but can not withdraw Ethers. | IP3 |
| Unchecked External Calls (UEC) | Do not check the return value of external call functions. | IP3 |
| Block Info Dependency (BID) | Using block information related functions to determine the execute logic. | IP3 |
| Unmatched Type Assignment | Assigning unmatched type to a value, which can lead to integer overflow | IP2 |
| Misleading Data Location | The reference types of local variables with struct, array or mapping do not clarify | IP2 |
| Hard Code Address | Using hard code address inside smart contracts. | IP3 |

Example: Listing 4 shows an example of a reentrancy defect. There are two smart contracts, i.e., the Victim contract and the Attacker contract. The Attacker contract is used to transfer Ethers from the Victim contract, and the Victim contract can be regarded as a bank, which stores the Ethers of users. Users can withdraw their Ethers by invoking the withdraw() function, which contains the Reentrancy defect.

First, the Attacker contract uses the reentrancy() function (L16) to invoke the Victim contract’s withdraw() function in line 3. The addr in line 16 is the address of the Victim contract. Normally, the Victim contract sends Ethers to the caller in line 6, and resets the caller’s balance to 0 in line 7. However, the Victim contract sends Ethers to the Attacker contract before resetting the balance to 0. When the Victim contract sends Ethers to the Attacker contract (L6), the fallback function (L13) of the Attacker contract is invoked automatically, and it invokes the withdraw() function (L14) again. The invoking sequence in this example is: L16-17 → L3-6 → L13-14 → L3-6 → L13-14 · · · , until the Ethers run out.

 1 contract Victim { ...
 2   mapping (address => uint) public userBalance;
 3   function withdraw() {
 4     uint amount = userBalance[msg.sender];
 5     if (amount > 0) {
 6       msg.sender.call.value(amount)();
 7       userBalance[msg.sender] = 0;
 8     }
 9   }
10   ...
11 }
12 contract Attacker { ...
13   function () payable {
14     Victim(msg.sender).withdraw();
15   }
16   function reentrancy(address addr) {
17     Victim(addr).withdraw();
18   }
19   ...
20 }

Listing 4: Reentrancy
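The invoking sequence of Listing 4 can be replayed outside the EVM. The following Python sketch is only a toy model: the class names mirror the listing, but the `funds` field, the `receive` method standing in for the fallback function, and the rule that a send fails once the contract has too few Ethers are all assumptions made for illustration.

```python
class Victim:
    """Toy model of the Victim contract in Listing 4 (not EVM semantics)."""

    def __init__(self, funds, balances):
        self.funds = funds                   # Ethers held by the contract (assumed field)
        self.user_balance = dict(balances)   # models the userBalance mapping

    def withdraw(self, sender):
        amount = self.user_balance.get(sender.name, 0)
        # Assumption: the send simply does not happen once funds run out.
        if amount > 0 and self.funds >= amount:
            self.funds -= amount
            sender.receive(self, amount)         # L6: Ethers are sent first
            self.user_balance[sender.name] = 0   # L7: balance is reset afterwards


class Attacker:
    name = "attacker"

    def __init__(self):
        self.received = 0

    def receive(self, victim, amount):
        """Stands in for the fallback function (L13-14)."""
        self.received += amount
        victim.withdraw(self)                    # re-enter before the balance is reset


victim = Victim(funds=10, balances={"attacker": 2})
attacker = Attacker()
victim.withdraw(attacker)                        # corresponds to L16-17
print(attacker.received, victim.funds)
```

With a recorded balance of 2 and 10 Ethers in the contract, the attacker in this model keeps re-entering until the contract is empty, which matches the sequence L3-6 → L13-14 → L3-6 described above.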

Possible Solution: There are 3 kinds of Call methods that can be used to send Ethers in Ethereum, i.e., address.send(), address.transfer(), and address.call.value(). address.send() and address.transfer() limit the gas forwarded to the recipient to 2300 gas units if the recipient is a contract account. 2300 gas units are not enough for the recipient to transfer Ethers or to call back into the sender, which means address.send() and address.transfer() cannot lead to Reentrancy. Therefore, using address.send() or address.transfer() instead of address.call.value() can avoid Reentrancy.

(5). Nested Call (NC): The CALL instruction is very expensive (9000 gas paid for a non-zero value transfer as part of the CALL operation) [2]. If a loop contains the CALL instruction but does not limit the loop iterations, the total gas cost is at high risk of exceeding the gas limit.

Example: In Listing 5, if we do not limit the loop iterations, attackers can maliciously increase the loop size to cause an out-of-gas error. Once the out-of-gas error happens, this function cannot work anymore, as there is no way to reduce the loop iterations.

Possible Solution: Developers should estimate the maximum loop iterations and limit the loop iterations.

for (uint i = 0; i < member.length; i++) {
    member[i].send(1 wei);
}

Listing 5: Nested Call
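To make the gas argument concrete, the sketch below estimates how many loop iterations fit into a gas budget when every iteration performs one value-transferring CALL. The 9000 gas figure is the one quoted above; the function name, the `per_iteration_overhead` parameter, and the 8,000,000 gas budget in the usage line are hypothetical values chosen for illustration.

```python
CALL_VALUE_TRANSFER_GAS = 9000  # gas for a non-zero value transfer, as quoted in the text


def max_iterations(gas_budget, per_iteration_overhead=0):
    """Upper bound on loop iterations that fit into gas_budget.

    per_iteration_overhead is a hypothetical extra cost per iteration
    (loop bookkeeping, storage reads); it defaults to 0 for a best case.
    """
    return gas_budget // (CALL_VALUE_TRANSFER_GAS + per_iteration_overhead)


# Hypothetical budget of 8,000,000 gas: at most 888 transfers fit.
print(max_iterations(8_000_000))
```

Under these assumptions, once `member.length` grows past this bound the loop in Listing 5 can no longer complete.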

(6). Greedy Contract (GC): Ethers on smart contracts can only be withdrawn by sending Ethers to other accounts or using the selfdestruct function. Otherwise, even the creators of the smart contracts cannot withdraw the Ethers, and the Ethers will be locked forever. We define a contract as a greedy contract if the contract can receive Ethers (contains payable functions) but there is no way to withdraw the Ethers.

Example: Listing 6 is a greedy contract. The contract is able to receive Ethers as it contains a payable fallback function in line 2. However, the contract does not contain any methods to transfer money to others. Therefore, the Ethers on the contract will be locked forever.

Possible Solution: Add a function to withdraw Ethers if the contract can receive Ethers.

1 contract Greedy {
2   function () payable {
3     process(msg.sender);
4   }
5   function process(address addr) { ... }
6 }

Listing 6: Greedy Contract

(7). Unchecked External Call (UEC): Solidity provides many functions (address.send(), address.call()) to transfer Ethers or call functions between contracts. However, these call-related methods can fail, e.g., have a network error or run out of gas. When errors happen, these functions will return a boolean value but never throw an exception. If the callers do not check the return values of the external calls, they cannot ensure whether the logic of the following code snippets is correct.


Example: Listing 7 shows such an example. Line 1 does not check the return value of address.send(). As the Ether transfer can sometimes fail, line 1 cannot ensure whether the logic of the following code is correct.

Possible Solution: Always check the return value of address.send() and address.call().

1 address.send(ethers); doingSomething(); //bad
2 if (address.send(ethers)) doingSomething(); //good

Listing 7: Unchecked External Call

(8). Block Info Dependency (BID): Developers can utilize a series of block related functions to obtain block information. For example, block.blockhash is used to obtain the hash of a recent block. Many smart contracts rely on these functions to decide a program’s execution, e.g., generating random numbers. However, miners can influence block information, e.g., miners can vary the block timestamp by roughly 900 seconds [20]. In this case, the block info dependency operation can be controlled by miners to some extent.

Example: The contract in Listing 8 is a code snippet of a roulette contract. The contract utilizes the block hash to select a winner, and sends the winner one Ether as a bonus. However, the miner can control the result. So, the miner can always be the winner.

Possible Solution: The precondition of a safe random number is that the random number cannot be controlled by a single person, e.g., a miner. The completely random information we can use in Ethereum includes users’ addresses, users’ input numbers and so on. Also, it is important to hide the values used by the contract from other players to avoid attacks. Since we cannot hide the addresses of users and their submitted values on Ethereum, a possible solution to generate random numbers without using block related functions is using a hash number. The algorithm has three rounds:

Round 1: Users obtain a random number and generate a hash value on their local machine. The hash value can be obtained by the keccak256 function, which is a function provided by Ethereum. After generating the hash value, users submit the hash number.

Round 2: After all the users submit the hash number, users are required to submit the original random number. The contract checks whether the original number can generate the same hash number by using the same keccak256 function.

Round 3: If all users submit correct original numbers, the contract can use the original numbers to generate a random number.

address[] participators;
// The block number passed to blockhash is an assumption; the original snippet omits the argument.
uint winnerID = uint(block.blockhash(block.number - 1)) % participators.length;
participators[winnerID].transfer(1 ether);

Listing 8: Block Info Dependency
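The three rounds of the possible solution above can be sketched off-chain as follows. This is a minimal illustration under stated assumptions: Python's standard library has no keccak256, so `hashlib.sha3_256` is used as a stand-in (it is a different function from Ethereum's keccak256), and combining the revealed numbers with XOR is our own choice, since the text does not say how the final number is derived. The function names are hypothetical.

```python
import hashlib


def commit(number: int) -> str:
    """Round 1: hash a locally chosen number.

    sha3_256 is a stand-in for keccak256 (assumption, not the same function).
    """
    return hashlib.sha3_256(number.to_bytes(32, "big")).hexdigest()


def reveal_and_combine(commitments: dict, reveals: dict) -> int:
    """Rounds 2 and 3: check every revealed number, then combine them."""
    if set(commitments) != set(reveals):
        raise ValueError("not every user revealed a number")
    for user, number in reveals.items():
        if commit(number) != commitments[user]:
            raise ValueError(f"{user} revealed a number that does not match")
    combined = 0
    for number in reveals.values():
        combined ^= number          # XOR as the combining step (assumption)
    return combined


secrets = {"alice": 7, "bob": 12}
commitments = {user: commit(n) for user, n in secrets.items()}
print(reveal_and_combine(commitments, secrets))
```

Because every participant is bound to a number before any number is revealed, no single participant can choose the combined value after seeing the others' inputs, as long as all participants reveal.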

3 THE DefectChecker APPROACH

3.1 Design Overview

Figure 1 depicts an overview architecture of the DefectChecker approach. There are four components of DefectChecker, i.e., Inputter, CFG Builder, Feature Detector, and Defect Identifier.

[Figure 1: diagram showing the Inputter (Bytecode, disasm, Opcode), the CFG Builder (BasicBlock1 ... BasicBlockN, Symbolic Execution, CFG, Stack Event), the Feature Detector (Money Call, Loop Block, Payable Function), and the Defect Identifier (Unchecked External Calls, Greedy Contract, Reentrancy, ...)]

Fig. 1: Overview architecture of DefectChecker

The left part of the figure is the Inputter, and users can feed bytecode as input. Solidity source code is also allowed, but it needs to be compiled into bytecode. Bytecode is then disassembled into opcodes by utilizing the API provided by Geth [21]. Then, DefectChecker splits the opcodes into several basic blocks and symbolically executes the instructions in each block. After that, DefectChecker generates the CFG (control flow graph) of a smart contract and records all stack events. During symbolic execution, Feature Detector detects three features (i.e., Money Call, Loop Block and Payable Function), all concepts introduced below. Based on this information, Defect Identifier uses eight different rules to identify the contract defects in smart contracts.

Detecting contract defects from bytecode is very important for smart contracts on Ethereum. The bytecode of all smart contracts is stored on the blockchain, but fewer than 1% of smart contracts have opened their source code [22]. Smart contracts usually call other contracts, but the callee contracts may not open their source code for inspection. In such a case, the callers can only assess whether the callee contract is secure through its bytecode.

3.2 Basic Block Builder

A basic block is a straight-line code sequence with no branches in except to the entry and no branches out except at the exit [23]. We first split the opcodes into several blocks and assign a type to each block according to its exit type. The exit type is determined by the last instruction of a block. If the last instruction is JUMP or JUMPI, the block type is unconditional or conditional, respectively. If the last instruction is a terminal instruction (STOP, REVERT and RETURN), the block type is terminal. Some blocks belong to none of these three types; we call their block type fall. In summary, we consider four types of blocks: unconditional, conditional, fall, and terminal.
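The splitting and typing rules above can be sketched in a few lines of Python. This is an illustrative sketch rather than DefectChecker's implementation: the function names are hypothetical, and starting a new block at every JUMPDEST is an assumption on our part (it is consistent with the blocks drawn in Fig. 2, but the text does not state it).

```python
TERMINAL = {"STOP", "REVERT", "RETURN"}


def split_blocks(instructions):
    """Split a list of (index, opcode) pairs into basic blocks."""
    blocks, current = [], []
    for index, opcode in instructions:
        # Assumption: a JUMPDEST is a possible jump target, so it opens a new block.
        if opcode == "JUMPDEST" and current:
            blocks.append(current)
            current = []
        current.append((index, opcode))
        # A jump or a terminal instruction closes the current block.
        if opcode in ("JUMP", "JUMPI") or opcode in TERMINAL:
            blocks.append(current)
            current = []
    if current:
        blocks.append(current)
    return blocks


def block_type(block):
    """Classify a block by its last instruction."""
    last = block[-1][1]
    if last == "JUMP":
        return "unconditional"
    if last == "JUMPI":
        return "conditional"
    if last in TERMINAL:
        return "terminal"
    return "fall"


# Opcodes of the example in Fig. 2 (operands of PUSH omitted).
code = [(130, "JUMPDEST"), (131, "PUSH1"), (133, "PUSH1"), (135, "DUP3"),
        (136, "GT"), (137, "ISZERO"), (138, "PUSH1"), (140, "JUMPI"),
        (141, "PUSH1"), (143, "SWAP1"), (144, "POP"), (145, "PUSH1"),
        (147, "JUMP"), (148, "JUMPDEST"), (149, "PUSH1"), (151, "SWAP1"),
        (152, "POP"), (153, "JUMPDEST"), (158, "STOP")]
print([block_type(b) for b in split_blocks(code)])
```

Applied to the instructions of Fig. 2, the sketch yields four blocks in address order, typed conditional, unconditional, fall, and terminal.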

3.3 Symbolic Execution

Unlike other stack-based machines, e.g., the JVM, where Java bytecode has a clearly-defined set of targets for every jump, the jump position of EVM bytecode needs to be calculated during symbolic execution. Thus, DefectChecker needs to symbolically execute each single EVM instruction one at a time to obtain the CFG for smart contracts. EVM is a stack-based machine: when executing an instruction, it reads several symbolic states from the top of the EVM stack


Block 1 (source: if(num > 10))
  ID   Instruction   EVM Stack
  130  JUMPDEST      * num
  131  PUSH1 0       * num, 0
  133  PUSH1 10      * num, 0, 10
  135  DUP3          * num, 0, 10, num
  136  GT            * num, 0, GT(10, num)
  137  ISZERO        * num, 0, ISZERO(GT(10, num))
  138  PUSH1 148     * num, 0, ISZERO(GT(10, num)), 148
  140  JUMPI         * num, 0

Block 2 (source: return 0)
  ID   Instruction   EVM Stack
  148  JUMPDEST      * num, 0
  149  PUSH1 0       * num, 0, 0
  151  SWAP1         * num, 0, 0
  152  POP           * num, 0

Block 3 (source: return 1)
  ID   Instruction   EVM Stack
  141  PUSH1 1       * num, 0, 1
  143  SWAP1         * num, 1, 0
  144  POP           * num, 1
  145  PUSH1 153     * num, 1, 153
  147  JUMP          * num, 1

Block 4
  ID   Instruction   EVM Stack
  153  JUMPDEST      * num, 1 or 0
  158  STOP

Edges:
  Block 1 → Block 2: Conditional, if (ISZERO(GT(10, num)) == 1)
  Block 1 → Block 3: Fall, if (ISZERO(GT(10, num)) == 0)
  Block 2 → Block 4: Fall
  Block 3 → Block 4: Unconditional

Fig. 2: Example of Symbolic Execution

and puts the symbolic result back onto the EVM stack. During the symbolic execution, we can obtain the jump relations between blocks. There are three types of jump relations according to the jump behaviors, i.e., conditional jump, unconditional jump and fall execution. Stack Event records all symbolic states on the EVM stack after the execution of each instruction.

1 function example(uint num) returns (uint) {
2   if (num > 10)
3     return 1;
4   else {
5     return 0;
6   }
7 }

Listing 9: Code of Figure 2

Figure 2 is an example of the symbolic execution of the code in Listing 9. There are 4 blocks in this figure, and each block contains several instructions. The instructions in block 1 represent the code if(num > 10). Block 2 and block 3 push the values 0 and 1 onto the EVM stack, respectively. The instructions in block 4 are used to return the value (0 or 1) to the environment. The left-most number in each line indicates the instruction’s index ID, and the center part is the instruction that needs to be executed. All the instructions execute sequentially according to their index ID. If the instruction is ‘PUSH’, the right-most part holds the value that is pushed onto the EVM stack. A Program Counter (PC) records the ID of the instruction that is being executed at the current time. When a contract is executed, the PC normally starts from ID 0, and EVM executes this instruction first.

The example shown in Figure 2 is a part of the code of a contract, so the PC starts from index ID 130 in block 1. Before EVM executes the instruction JUMPDEST, there is a symbol num in the EVM stack. The symbol num represents the input value of the function (L1 of Listing 9). JUMPDEST marks a valid destination for jumps; it does not read or push any values. So the PC points to ID 131, and EVM pushes the value 0 onto the EVM stack. Then, ‘10’ is pushed onto the EVM stack and the PC points to 135. DUP3 duplicates the 3rd stack item. Therefore, the symbol num is pushed onto the EVM stack. GT reads two values from the EVM stack. If the first value (at the top of the stack) is greater than the second value, then EVM pushes 1 onto the stack; otherwise, 0 is pushed. We use a symbol GT(10, num) to represent the result and push the result onto the EVM stack. Then, ISZERO reads one value from the top of the EVM stack. If the value is equal to zero, then we push 1 onto the stack; otherwise, we push 0. We use a symbol ISZERO(GT(10, num)) to represent the result and push the result onto the EVM stack. JUMPI (ID 140) reads two values from the stack; the first value represents the jump position ‘148’, and the second value is a conditional expression. If the result of the conditional expression is “1” (true), the PC jumps to the index ID 148, which indicates the start position of block 2. Otherwise, if the result is “0” (false), the EVM falls through to execute the following index ID 141 (the start position of block 3).

Since the result of ISZERO(GT(10, num)) can be “0” or “1”, this symbolic execution can generate two paths, i.e., Block 1 → Block 2 and Block 1 → Block 3.

We first assume the result of ISZERO(GT(10, num)) is “1” and the path is Block 1 → Block 2. In this case, the PC points to the ID 148. The jump type of this path is conditional jump. After executing the instructions on ID 148-152, the EVM falls through to execute block 4. The jump type from block 2 to block 4 is fall. When executing the first instruction of block 4, the EVM stack holds two values, i.e., num and 0. Block 4 then returns the value 0 to the environment and uses the instruction STOP to finish the execution.

We then assume the result of ISZERO(GT(10, num)) is “0” and the path is Block 1 → Block 3. In this case, the PC points to the ID 141. The jump type of this path is fall execution. JUMP refers to an unconditional jump; it reads one value from the top of the stack. The value read by JUMP in ID 147 is ‘153’. After executing the instructions on ID 141-147, the EVM then jumps to execute block 4. The jump type from block 3 to block 4 is an unconditional jump. When executing the first instruction of block 4, the EVM stack holds two values, i.e., num and 1. Block 4 then returns the value 1 to the environment and uses the instruction STOP to finish the execution.
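The stack manipulation in this walkthrough can be reproduced with a very small symbolic interpreter. The sketch below is only an illustration that covers the handful of instructions used in Fig. 2; the function name `run_block`, the tuple encoding of instructions, and the representation of symbolic values as strings are assumptions, not DefectChecker's data structures.

```python
def run_block(instructions, stack):
    """Symbolically execute one block; return the final stack and the jump read."""
    stack = list(stack)              # bottom of the stack is at index 0
    jump = None
    for op, arg in instructions:
        if op.startswith("PUSH"):
            stack.append(arg)
        elif op.startswith("DUP"):
            stack.append(stack[-int(op[3:])])          # DUP3 copies the 3rd item from the top
        elif op.startswith("SWAP"):
            n = int(op[4:])
            stack[-1], stack[-1 - n] = stack[-1 - n], stack[-1]
        elif op == "POP":
            stack.pop()
        elif op == "GT":
            top, second = stack.pop(), stack.pop()
            stack.append(f"GT({second}, {top})")       # same notation as Fig. 2
        elif op == "ISZERO":
            stack.append(f"ISZERO({stack.pop()})")
        elif op == "JUMPI":
            target, condition = stack.pop(), stack.pop()
            jump = (target, condition)
        elif op == "JUMP":
            jump = (stack.pop(), None)
        # JUMPDEST reads and pushes nothing, so it needs no branch here
    return stack, jump


block1 = [("JUMPDEST", None), ("PUSH1", 0), ("PUSH1", 10), ("DUP3", None),
          ("GT", None), ("ISZERO", None), ("PUSH1", 148), ("JUMPI", None)]
block3 = [("PUSH1", 1), ("SWAP1", None), ("POP", None), ("PUSH1", 153), ("JUMP", None)]

stack, jump = run_block(block1, ["num"])
print(stack, jump)
print(run_block(block3, stack))
```

Running block 1 on the initial stack `["num"]` leaves `num, 0` on the stack and reports a jump to 148 guarded by `ISZERO(GT(10, num))`; running block 3 afterwards leaves `num, 1` and reports an unconditional jump to 153, matching the stack columns of Fig. 2.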

When executing a conditional jump, we should determine the satisfiability of the conditional expression, which is typically realized by invoking an SMT (satisfiability modulo theories) solver [24], e.g., Z3 [25]. If the SMT solver cannot find a solution, we consider the corresponding program path as infeasible. Therefore, symbolic execution can be used to discover dead code. However, there may be little dead code in EVM bytecode, because the compiler can eliminate dead code during the compilation of smart contracts. To accelerate our analysis, we consider a conditional expression that is equal to “0” as unsatisfiable and all other conditional expressions as satisfiable, without checking their satisfiability.


TABLE 2: The Information Required to Detect Each Contract Defect

| Contract Defect | Control Flow Information | Symbolic State |
|---|---|---|
| Transaction State Dependency (TSD) | | X |
| DoS Under External Influence (DuEI) | X | X |
| Strict Balance Equality (SBE) | X | X |
| Reentrancy (RE) | X | X |
| Nested Call (NC) | X | X |
| Greedy Contract (GC) | X | X |
| Unchecked External Calls (UEC) | | X |
| Block Info Dependency (BID) | X | |

3.4 Feature Detector

To detect contract defects at the bytecode level, we need to identify some specific behaviors from their opcodes. In this part, we introduce three features that we use when detecting contract defects.

3.4.1 Money Call

To detect Reentrancy, we need to identify whether a smart contract can transfer Ethers to other contracts. Ethereum provides three methods to transfer Ethers, i.e., address.send(), address.transfer(), and address.call.value(). All three methods generate a CALL instruction. However, only detecting the CALL instruction is not enough, as many other behaviors can also generate a CALL instruction, e.g., calling functions on other contracts or libraries. In this paper, if a CALL instruction is generated by functions which are used to transfer Ethers, we call this CALL instruction a Money-CALL. Otherwise, the CALL instruction is a No-Money-CALL. CALL reads seven values from the EVM stack. The first three values represent the gas limit, the recipient address, and the transfer amount, respectively. If the transfer amount is larger than 0, the CALL instruction is a Money-CALL.

However, only detecting Money-CALL is still not enough, as address.send() and address.transfer() limit the maximum gas consumption to 2300, which is not enough for the recipient to send Ethers or call back into the sender. Therefore, these two methods cannot cause Reentrancy. If the CALL instruction is generated by address.send() or address.transfer(), a specific number “2300” will be pushed onto the EVM stack, which represents the maximum gas consumption. So, if a CALL instruction reads the specific number “2300” from the EVM stack, the CALL instruction is generated by address.send() or address.transfer(). We call this CALL instruction a Gas-Limited-Money-CALL. Otherwise, if the first value read by the CALL instruction does not contain the specific value “2300”, we assume that the CALL instruction is generated by address.call.value(). We call this CALL instruction a Gas-Unlimited-Money-CALL.
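The classification described in this subsection can be summarized in a short sketch. It is a simplified illustration, assuming that symbolic stack values are represented as strings, that any transfer amount other than the constant 0 counts as a possible transfer, and that a substring test for "2300" is an acceptable approximation of "contains the specific value 2300"; the function name and the example expressions are hypothetical.

```python
def classify_call(gas, value):
    """Classify a CALL from the gas and transfer-amount values it reads.

    gas and value may be concrete ints or symbolic expressions given as strings.
    """
    if value == 0:
        return "No-Money-CALL"
    # Crude containment check (assumption): it would also match e.g. 12300.
    if "2300" in str(gas):
        return "Gas-Limited-Money-CALL"
    return "Gas-Unlimited-Money-CALL"


print(classify_call("GAS", "amount"))                    # e.g. address.call.value(amount)()
print(classify_call("MUL(2300, ISZERO(amount))", "amount"))  # hypothetical send/transfer pattern
print(classify_call("GAS", 0))                           # plain function call
```

Only the first kind, Gas-Unlimited-Money-CALL, is relevant for the Reentrancy rule, while all Money-CALLs matter for the loop-related defects.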

3.4.2 Loop Block

After constructing the CFG, we need to detect which block is the start of a loop and which blocks make up the body of the loop. To detect this information, we first traverse the paths of the CFG by utilizing DFS (depth-first search) [26] and flag all blocks we visit. If the traversal reaches a block that has already been visited, this block is the start of a loop, and the other blocks in this cycle are the loop body. Since some smart contracts are very complicated, they may contain a large number of paths. To reduce the computational effort, we use the strategy of pruning. For example, suppose block A is the destination of many other blocks, and we find that the paths starting from block A do not contain any cycles. Then we do not need to visit these paths again when other paths encounter block A.
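A possible reading of this procedure is a depth-first search that reports a loop start whenever it reaches a block that is still on the current path, and that skips blocks it has already explored completely. The sketch below follows that reading; the interpretation of "visited" as "on the current path", the simplified pruning rule (every fully explored block is skipped), and all names are assumptions.

```python
def find_loop_starts(cfg, entry):
    """Return the blocks that start a loop in cfg (dict: block -> successors)."""
    on_path, explored, loop_starts = set(), set(), set()

    def dfs(block):
        on_path.add(block)
        for successor in cfg.get(block, []):
            if successor in on_path:
                loop_starts.add(successor)     # edge back into the current path
            elif successor not in explored:    # pruning: do not re-explore finished blocks
                dfs(successor)
        on_path.discard(block)
        explored.add(block)

    dfs(entry)
    return loop_starts


# A -> B -> C -> B forms a loop that starts at B; D is the loop exit.
looping = {"A": ["B"], "B": ["C", "D"], "C": ["B"], "D": []}
# A diamond has two paths into D but no cycle.
diamond = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
print(find_loop_starts(looping, "A"), find_loop_starts(diamond, "A"))
```

The diamond example illustrates why a block reached twice along different paths should not, by itself, be reported as a loop. The recursive form is for brevity; a large CFG would call for an explicit stack.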

3.4.3 Payable Function

A smart contract can receive Ethers only if it contains payable functions [2]. To detect whether a function is payable or not, we can inspect the first block of each function. The CALLVALUE instruction is used to get the received Ether amount. If a smart contract receives Ethers, the CALLVALUE instruction will get a non-zero value. This value can be checked by the ISZERO instruction to know whether a transaction contains Ethers. If the function is not payable, when receiving Ethers, it will throw an exception and terminate the execution.

To find the first block, we first rank all instructions by their index ID. The conditional jumps positioned before the first JUMPDEST instruction lead to the start position of each function. EVM uses a hash value to identify functions; when EVM receives a function call, it first compares the received value to each function’s hash value. If a function’s hash value is equal to the received hash value, it will jump to the destination, which indicates the function’s start position. Otherwise, it will fall through to the fallback function, whose start position is the first JUMPDEST instruction.

3.5 DefectChecker

Table 2 describes the information required to detect each kind of contract defect. To detect TSD and UEC, DefectChecker only needs symbolic states computed by symbolic execution, as we only need to check whether ORIGIN and CALL instructions are read by EQ and ISZERO instructions, respectively. DefectChecker only needs control flow information to detect BID, as we only need to check whether the conditional expression contains block related instructions, e.g., “BLOCKHASH”.

To detect the other 5 contract defects, DefectChecker needs both control flow information and symbolic states. In the previous subsection, we introduced three features detected by the feature detector, i.e., Money Call, Loop Block, and Payable Function. Money Call needs symbolic states, so to detect it, DefectChecker needs to check the values on the EVM stack. Loop Block and Payable Function require control flow information, as they both need CFGs to locate the loop and the start of the function, respectively. NC, DuEI, GC, and Reentrancy all need to detect Money Call. DuEI and NC also need to detect Loop Block; GC needs to detect Payable Function. To detect Reentrancy, DefectChecker needs to traverse all the paths that contain the Gas-Unlimited-Money-Call, which needs the help of the CFG. To detect SBE, DefectChecker needs to check whether the BALANCE instruction is read by the EQ instruction in the conditional expressions, which needs both control flow information and symbolic state.

Below we describe the detailed patterns that we use to determine whether a smart contract contains one or more of the contract defects.

3.5.1 Transaction State Dependency

tx.origin generates an ORIGIN instruction. We first locate all ORIGIN instructions. We then check whether there is


an ORIGIN that is read by an EQ instruction. The EQ instruction reads two values from the EVM stack and verifies whether these two values are equal. If the contract contains this kind of contract defect, the result of the ORIGIN instruction will be compared to an address value. Ethereum uses a 160-bit value (40 hexadecimal digits) to indicate an address, and all addresses conform to the EIP55 standard [27].

3.5.2 DoS Under External Influence

If a smart contract contains this contract defect, there will be a part of the instructions that checks the return value of the Money CALL and then terminates the loop. To detect this contract defect, we first find loop-related blocks. Then, we check whether there is a block that contains a Money CALL and whose type is conditional, as it needs to check the return value. Then, we check whether this block jumps to a block whose type is terminal.

3.5.3 Strict Balance Equality

This kind of contract defect can make a part of the code never be executed. We need to check whether there is a conditional expression that contains the related pattern. The BALANCE instruction is used to get the balance of a contract. If a BALANCE instruction is read by EQ, it means there is a strict balance equality check. If this check happens at a conditional jump expression, it means this contract contains this contract defect.

3.5.4 Reentrancy

The SLOAD instruction is used to get a value from storage [2]. It reads a value (named Slot ID) from the EVM stack and puts the result that it reads from storage back onto the EVM stack. Using Listing 4 as an example, the Victim contract does not set the balance of the Attacker contract to zero (L7) before sending Ethers (L6), which allows the Attacker contract to withdraw Ethers again. To detect this contract defect, we first need to obtain paths that contain a Gas-Unlimited-Money-Call, because only this kind of CALL can cause Reentrancy. We then need to obtain all conditional expressions on these paths. The amount that is sent by the victim contract is usually checked before sending it to attacker contracts, and this amount is loaded from storage. In this case, we need to check if the conditional expression contains SLOAD instructions and get its Slot ID. If this value still holds and is not updated when executing the CALL instruction, it means the CALL instruction can be executed again and cause Reentrancy. To check whether the storage value is updated, we need to detect whether the same Slot ID that is read by SLOAD is written by an SSTORE instruction. (The SSTORE instruction is used to save data to storage. It reads two values from the EVM stack, i.e., the slot id and the value that is written to storage.)
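The rule above can be phrased as a check over the events met along one path. The following sketch is an illustration only: the event encoding (`"COND"` with the Slot IDs loaded by SLOAD in a conditional expression, `"SSTORE"` with the written Slot ID, `"CALL"` with the call kind) and the function name are hypothetical and not taken from the tool.

```python
def path_has_reentrancy(path):
    """Check one path, given as a list of events in execution order."""
    guarded_slots, written_slots = set(), set()
    for event in path:
        kind = event[0]
        if kind == "COND":
            guarded_slots |= set(event[1])      # slots read by SLOAD in a condition
        elif kind == "SSTORE":
            written_slots.add(event[1])         # slot updated before any later CALL
        elif kind == "CALL" and event[1] == "Gas-Unlimited-Money-CALL":
            # A guarding slot that is still unchanged lets the CALL run again.
            if guarded_slots - written_slots:
                return True
    return False


slot = "userBalance[msg.sender]"
victim_path = [("COND", {slot}), ("CALL", "Gas-Unlimited-Money-CALL"), ("SSTORE", slot)]
updated_first = [("COND", {slot}), ("SSTORE", slot), ("CALL", "Gas-Unlimited-Money-CALL")]
print(path_has_reentrancy(victim_path), path_has_reentrancy(updated_first))
```

The first path corresponds to the order of L5, L6, L7 in Listing 4 and is flagged; the second path writes the slot before the CALL and is not.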

3.5.5 Nested Call

Using Listing 5 as an example, the array member is a storage variable; all of its values, including its length, are kept in storage. To get its length, the SLOAD instruction reads its Slot ID δ from the EVM stack, and this value is the position that stores the value of member.length. To detect this contract defect, the first step is to find the start block of a loop and get the Slot ID. Then, we need to check whether this loop limits its size. If the loop limits its size, the same Slot ID δ will be read in the loop body again, and this value will be compared with another value. If a smart contract contains a loop that does not limit its size but contains a Money-Call, Nested Call is detected in this contract.

TABLE 3: Some Features of Dataset

| Features | Min | Max | Mean | SD |
|---|---|---|---|---|
| Lines of Code | 5 | 2,239 | 393.6 | 356.8 |
| # of Functions | 1 | 174 | 30.1 | 621.6 |
| # of Instructions | 7 | 15,355 | 3,597.3 | 2,523.7 |
| CC | 1 | 132 | 30.3 | 22.4 |
| Ethers | 0 | 1,500,000 | 7,844.9 | 1,704,552.7 |

3.5.6 Greedy Contract

A smart contract can transfer money through a Money CALL or the selfdestruct function. The selfdestruct function generates a SELFDESTRUCT instruction. If a smart contract contains payable functions but does not have either a Money CALL or a SELFDESTRUCT instruction, the contract is a Greedy Contract.
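Expressed as code, the Greedy Contract rule is a single boolean condition over the detected features. The sketch below assumes the features are already available in a simple form (a flag for payable functions, a list of CALL kinds, and the set of opcodes); these inputs and the function name are hypothetical.

```python
def is_greedy(has_payable_function, call_kinds, opcodes):
    """Greedy Contract: can receive Ethers but has no way to send them out."""
    has_money_call = any(kind != "No-Money-CALL" for kind in call_kinds)
    return (has_payable_function
            and not has_money_call
            and "SELFDESTRUCT" not in opcodes)


# Listing 6: payable fallback function, no Money CALL, no SELFDESTRUCT.
print(is_greedy(True, [], {"CALLVALUE", "ISZERO", "JUMPI"}))
```

A contract that contains any Money CALL, whether gas-limited or not, or a SELFDESTRUCT instruction is not reported by this rule.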

3.5.7 Unchecked External Calls

The external call returns a boolean value. If the result is checked by the contract, it will generate an ISZERO instruction. To detect this contract defect, we first locate CALL instructions. Then, we check whether each CALL instruction is read by ISZERO. If there is a CALL that is not checked by ISZERO, this contract defect is detected.

3.5.8 Block Info Dependency

Detecting this contract defect is similar to Strict Balance Equality. This contract defect can allow miners to control the contract, as miners can change the value of some block information, which affects the result of the conditional expression. If the conditional expression contains block related instructions, i.e., “BLOCKHASH”, “COINBASE”, “NUMBER”, “DIFFICULTY”, “GASLIMIT”, it means the contract contains this contract defect.

4 EVALUATION

To measure the efficacy of DefectChecker, we present results based on applying it to an open-sourced dataset and present our experimental results analysis in this section.

4.1 Experimental Setup

All experiments were performed on a PC running Mac OS 10.14.4 and equipped with an Intel i7 6-core CPU and 16 GB of memory. We use Solidity 0.4.25 as the compiler to compile source code into bytecode, and use EVM 1.8.14 to disassemble the bytecode to its opcodes.

4.2 Dataset

The dataset we used to evaluate DefectChecker was released in our previous work [11]. We first crawled all 17,013 open sourced smart contracts from Etherscan. Then, we randomly selected 600 smart contracts from these contracts. We found that 13 smart contracts did not contain any content. Thus,


we removed them from our dataset. Finally, we obtained 587 smart contracts from Etherscan. These contracts have 231,098 lines of code and more than 4 million Ethers in their balance.

Table 3 shows some key features of the dataset, i.e., lines of code, number of functions in the contracts, number of instructions in the contracts, cyclomatic complexity [28] and Ethers held by the contracts. Cyclomatic complexity is a software metric that indicates the complexity of a program, and it is computed by analyzing the control flow graph. The formula to compute it is: E - N + 2P. E is the number of edges in the CFG; N is the number of nodes in the CFG and P is the number of connected components in the CFG. Since the CFG is a connected graph, P is always equal to 1, and the formula can be simplified as: E - N + 2.
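As a small worked example of the formula E - N + 2P, the sketch below computes the cyclomatic complexity of the four-block CFG from Fig. 2. The dictionary encoding of the CFG and the function name are assumptions made for illustration.

```python
def cyclomatic_complexity(cfg, components=1):
    """Compute E - N + 2P for a CFG given as dict: block -> successor blocks."""
    nodes = set(cfg)
    for successors in cfg.values():
        nodes.update(successors)
    edges = sum(len(successors) for successors in cfg.values())
    return edges - len(nodes) + 2 * components


# CFG of Fig. 2: Block 1 branches to Blocks 2 and 3, which both lead to Block 4.
fig2_cfg = {"Block 1": ["Block 2", "Block 3"],
            "Block 2": ["Block 4"],
            "Block 3": ["Block 4"],
            "Block 4": []}
print(cyclomatic_complexity(fig2_cfg))
```

With E = 4, N = 4 and P = 1 the result is 2, which reflects the single if/else decision in Listing 9.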

The simplest contract in our dataset only contains one constructor function with 7 instructions and a cyclomatic complexity of 1. The contract with the highest cyclomatic complexity has 11,696 instructions and 2,004 lines of code. The richest contract in our dataset holds 1.5 million Ethers, while the poorest contract has no Ethers in its balance.

Two authors of our previous work manually labeled the dataset. They both have three years of experience working on smart-contract-based development and research, and took part in the process of defining contract defects. Thus, they have a very good understanding of smart contract programming and the contract defects introduced in this paper. They first manually labeled the dataset independently. Then, they discussed the disagreements after completing the labeling process and gave the final results. Their overall Kappa value [29] was 0.71, which shows a substantial agreement between them.

In this work, we developed a tool named DefectChecker to detect eight contract defects with severity impact levels 1-3. The numbers of each type of contract defect in our dataset are shown in Table 4. This shows that Block Info Dependency is the most frequent contract defect in our dataset, while Transaction State Dependency and Strict Balance Equality are the least frequent. Their numbers are 42, 5, and 5, respectively. DefectChecker targets Solidity version 0.4.0+, which is the most widely used version at the time of writing this paper [30]. However, some smart contracts are designed for Solidity version 0.2.0+ and 0.3.0+. Thus, we removed eight smart contracts and used the remaining 579 smart contracts as our ground truth.

Among the six tools we introduced in Table 5, only Zeus open sourced their dataset. However, Zeus still has four kinds of defects which are not included in their dataset. Also, the Zeus authors did not provide the details of how they built their dataset. Their paper only mentioned that “they manually validated each result” without providing any details, e.g., the number of people who labeled the dataset, and whether they are professional smart contract developers or not. Thus, we did not use these datasets.

4.3 Evaluation Methods and Metrics

TABLE 4: Experimental results for DefectChecker.

| Defects | #Defects | #TP | #TN | #FP | #FN | P(%) | R(%) | F(%) |
|---|---|---|---|---|---|---|---|---|
| TSD | 5 | 5 | 474 | 0 | 0 | 100.0 | 100.0 | 100.0 |
| DuEI | 6 | 6 | 466 | 7 | 0 | 46.2 | 100.0 | 63.2 |
| SBE | 5 | 4 | 474 | 0 | 1 | 100.0 | 80.0 | 88.9 |
| RE | 12 | 10 | 461 | 6 | 2 | 62.5 | 83.3 | 71.4 |
| NC | 13 | 9 | 464 | 2 | 4 | 81.8 | 69.2 | 75.0 |
| GC | 6 | 6 | 473 | 0 | 0 | 100.0 | 100.0 | 100.0 |
| UEC | 22 | 20 | 454 | 3 | 2 | 87.0 | 90.9 | 88.9 |
| BID | 42 | 41 | 437 | 0 | 1 | 100.0 | 97.6 | 98.8 |

There are seven measurements obtained from our experiments: True Positive (TP), True Negative (TN), False Positive (FP), False Negative (FN), Precision (P), Recall (R) and F-Measure (F). TP indicates the results which correctly predict a contract defect in a smart contract. TN indicates the results which correctly predict that a smart contract does not have a defect. FP and FN indicate the results which incorrectly predict that a smart contract contains and does not contain a contract defect, respectively. Precision, Recall, and F-Measure can be calculated as:

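$$\text{Precision} = \frac{\#TP}{\#TP + \#FP} \times 100\% \qquad (1)$$

$$\text{Recall} = \frac{\#TP}{\#TP + \#FN} \times 100\% \qquad (2)$$

$$F\text{-Measure} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \times 100\% \qquad (3)$$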
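As a check on how these metrics relate to the counts in Table 4, the following sketch recomputes them from #TP, #FP and #FN. It is a minimal illustration with a hypothetical function name; it computes Precision and Recall as fractions and converts to percent once at the end.

```python
def metrics(tp, fp, fn):
    """Return (precision, recall, f_measure) in percent, rounded to one decimal."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_measure = 2 * precision * recall / (precision + recall)
    return tuple(round(100 * x, 1) for x in (precision, recall, f_measure))


# Counts taken from the DuEI, RE and NC rows of Table 4.
print(metrics(tp=6, fp=7, fn=0))
print(metrics(tp=10, fp=6, fn=2))
print(metrics(tp=9, fp=2, fn=4))
```

The three calls reproduce the P, R and F values reported for DuEI, RE and NC in Table 4.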

4.4 Experimental Results and Analysis

Table 4 summarizes the results of applying DefectChecker to our previous work’s dataset. The first column is the contract defects that need to be detected. The second column is the number of contract defects in our dataset (ground truth). The remaining seven columns are used to measure the performance of DefectChecker. Below, we discuss the analysis of each contract defect.

(1). Transaction State Dependency. DefectChecker detects 5 smart contracts containing this contract defect among 579 smart contracts with 0 false positives and 0 false negatives.

(2). DoS Under External Influence. DefectChecker detects 13 smart contracts that have this contract defect among 579 smart contracts with 7 false positives and 0 false negatives. The 7 errors are due to the erroneous identification of a loop.

In our detection method, we first split the bytecode into several blocks. Then, symbolic execution is used to find the edges between blocks. We traverse the paths of the CFG by using DFS. If there is a block that has been visited, we regard this block as the start of the loop (see Section 3.4.2). Since we regard all paths as reachable, we only flag whether two blocks have an edge. This mechanism leads to false positives in detecting loops.

In Listing 10, L9, L10, and L11 each hold a single block, and function sub() holds several blocks. The EVM first executes the block of line 9, then executes the blocks of function sub() defined in line 2. After the blocks of line 10 and line 11 have been executed, the blocks of function sub() are executed again. Therefore, when traversing the CFG by using DFS, we find a cycle (fun sub()→L10→L11→fun sub()). Since we regard all paths as reachable, we cannot tell that, after the block of L11 has been executed, the blocks of function sub() cannot jump back to the block of L10.

This kind of false positive can be addressed if we execute the loop continuously. Using the loop "for(int i = 0; i < 100; i++)" as an example, we need to record the state of variable i and check whether the expression (i < 100) is satisfied or not. If we can prove that the loop executes continuously, we can confirm that it is a real loop and not the error shown in Listing 10. However, we need the assistance of an SMT solver to execute the loop, and executing the loop continuously is also time consuming. Thus, we believe the advantages of removing the SMT solver from our approach outweigh the disadvantages.

 1 library SafeMath {
 2   function sub(uint256 a, uint256 b) internal returns (uint256) {
 3     assert(b <= a);
 4     return a - b; }}
 5 contract Mainsale {
 6   using SafeMath for uint256;
 7   uint256 public total;
 8   function() payable {
 9     uint amount = total.sub(100);
10     msg.sender.transfer(amount);
11     uint contri = msg.value.sub(amount); }}

Listing 10: Error Loop Example
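The false cycle described for Listing 10 can be reproduced with a small sketch. It assumes that "visited" means "still on the current DFS path" and that the return edges of both call sites of sub() are merged into one context-insensitive CFG; the block names and the helper find_loop_heads are hypothetical and only illustrate the idea, not the actual implementation of DefectChecker.

```python
def find_loop_heads(cfg, entry):
    """Return every block that is reached again while still on the DFS path."""
    loop_heads = set()
    on_path = set()   # blocks on the current DFS path
    finished = set()  # blocks whose successors were fully explored

    def dfs(block):
        on_path.add(block)
        for succ in cfg.get(block, []):
            if succ in on_path:
                loop_heads.add(succ)  # back edge: treated as a loop start
            elif succ not in finished:
                dfs(succ)
        on_path.remove(block)
        finished.add(block)

    dfs(entry)
    return loop_heads


# Context-insensitive CFG of Listing 10: both calls share the blocks of sub(),
# so sub() has an edge to L10 (first return) and to the exit (second return).
merged_cfg = {
    "L9": ["sub"],
    "sub": ["L10", "exit"],
    "L10": ["L11"],
    "L11": ["sub"],
    "exit": [],
}

# Hypothetical context-sensitive variant: one copy of sub() per call site.
split_cfg = {
    "L9": ["sub@1"],
    "sub@1": ["L10"],
    "L10": ["L11"],
    "L11": ["sub@2"],
    "sub@2": ["exit"],
    "exit": [],
}
```

On merged_cfg the traversal reports sub() as a loop start although the source contains no loop, while split_cfg yields no loop at all. Telling the two return edges apart is exactly the information that is lost when every edge is treated as reachable.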

(3). Strict Balance Equality. DefectChecker detects 4 smart contracts that contain Strict Balance Equality, with 0 false positives and 1 false negative. The cause of the error is that the contract defect is spread over several functions. For example, the contract in Listing 11 uses a global variable balance to represent the contract's balance. Callers first call function getBalance to obtain the balance. The balance is then checked in Line 5. To detect this contract defect, we need to know that the global variable balance represents the contract balance. Therefore, the contract defect can only be detected when we know that users will first invoke getBalance() and then call DefectFunction(). However, it is not easy to detect this contract defect at the bytecode level, as the two operations (i.e., balance == 1 eth and balance = this.balance) are in two independent functions, and we do not know the calling sequence.

1 contract Demo {
2   uint balance = 0;
3   function getBalance() { balance = this.balance; }
4   function DefectFunction() {
5     if (balance == 1 eth)
6       doSomthing; } }

Listing 11: Strict Balance Equality - False Negative Example

(4). Reentrancy. DefectChecker detects 16 smart contracts that contain Reentrancy, with 6 false positives and 2 false negatives. The false positives are due to erroneous money-call detection. A smart contract that contains Reentrancy must have a Gas-Unlimited-Money-Call. To detect it, we first need to check whether the gas limit is larger than 2,300 gas and the transfer amount is larger than 0. However, in some examples, these two values are represented by complicated symbolic expressions. Some expressions also contain values that are read from storage (by SLOAD). Thus, their specific values cannot be determined by static analysis, and DefectChecker fails to resolve them. When DefectChecker encounters such complicated symbolic expressions, it assumes by default that the gas limit is larger than 2,300 gas and the transfer amount is larger than 0, which leads to false positives. When detecting this contract defect, we also need to check whether the Slot ID read by the SLOAD instruction still holds when executing the CALL instruction. Some Slot IDs are also represented by complicated symbolic expressions. DefectChecker fails to determine whether they are equal, which leads to false negatives.

When detecting Money-Call, we use Gas-Unlimited-Money-Call as the default if we cannot figure out the exact value of the gas limit symbolically. We also conducted another experiment, which uses Gas-Limited-Money-Call as the default. However, in that setting DefectChecker failed to detect any Reentrancy defect. The reason is that a Gas-Limited-Money-Call is usually easy to detect, as address.transfer() and address.send() put the specific value "2300" on the EVM stack. Thus, we just need to detect this specific value. However, the gas limit of a Gas-Unlimited-Call is not easy to detect, as it usually uses a complicated expression to represent the gas. Since address.call.value() will not change the gas cost, in most situations this method will not lead to an out-of-gas error. This is the reason why we use Gas-Unlimited-Call as our default.
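The effect of the default can be illustrated with a minimal sketch. It assumes that a stack operand is either resolved to a concrete integer or left unresolved (modeled here as None); the function name and the default_unlimited flag are hypothetical and are not taken from DefectChecker.

```python
GAS_STIPEND = 2300  # gas forwarded by address.transfer() and address.send()


def is_gas_unlimited_money_call(gas, value, default_unlimited=True):
    """Classify a CALL from its gas and value operands.

    gas / value: an int if the operand was resolved to a concrete number,
    or None if it is a symbolic expression that could not be resolved.
    """
    # Unresolved operands fall back to the configured default.
    gas_is_large = default_unlimited if gas is None else gas > GAS_STIPEND
    sends_money = default_unlimited if value is None else value > 0
    return gas_is_large and sends_money


# transfer()/send(): the concrete stipend is found on the stack.
limited = is_gas_unlimited_money_call(2300, 10)
# call.value(): gas and value are complicated symbolic expressions.
symbolic = is_gas_unlimited_money_call(None, None)
# Same call under the alternative default used in the second experiment.
symbolic_alt = is_gas_unlimited_money_call(None, None, default_unlimited=False)
```

With the unlimited default, an unresolved call is kept as a Reentrancy candidate, which can produce false positives. With the limited default, the same call is discarded, which is consistent with the observation that no Reentrancy defect was found in that experiment.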

(5). Nested Call. DefectChecker detects that 11 smart contracts contain a Nested Call defect. Among these 11 smart contracts, we have 2 false positives and 4 false negatives. The cause of the false positives is again the erroneous identification of a loop, which is the same as for DoS Under External Influence. The false negatives are due to complicated data structures. When detecting this contract defect, the first step is to know whether the number of loop iterations is related to the array's length. We use the SLOAD-related pattern to obtain the loop iterations, as described in Section 3.5.5. However, as shown in Listing 12, self is a structure and its length is obtained through an external function. Since external functions can be designed in different ways, it is challenging to design a pattern to detect it.

for (uint i; i < self.keys.length; i++) {
  self.data[self.keys[i]].transfer(1 Ether); }

Listing 12: Nested Call - False Negative Example

(6). Greedy Contract. DefectChecker detects 6 Greedy Contracts, with 0 false positives and 0 false negatives.

function Example(Address addr) returns (bool) {
  return addr.send(); }

Listing 13: Unchecked External Call - False Positive Example

(7). Unchecked External Call. DefectChecker reports that 23 contracts have this kind of contract defect, with 3 false positives and 2 false negatives. We analyzed the false positive examples and found that these contracts use the return value of send() as the function's return value and check it in other functions. For example, the result of addr.send() in Listing 13 is the return value of function Example, and the value is checked by the callers of Example. The false negatives occur because the defect is located in a constructor function, while the bytecode of the constructor function is not contained in the runtime bytecode. Therefore, we missed it. However, contract defects in the constructor function will not harm the deployed contracts, as the constructor function is executed only once, when the contract is deployed to the blockchain.


(8). Block Info Dependency. DefectChecker detects 41 smart contracts that contain this contract defect, with 0 false positives and 1 false negative. The cause of the false negative is similar to the one for Strict Balance Equality. The defective contract uses a global variable to represent block information and uses this global variable in other functions, which causes the contract defect to be missed.

4.5 Comparison with state-of-the-art tools

In our previous work, we investigated whether there are existing tools that can detect some of the contract defects we have defined. We first collected all the papers from top Security and SE conferences/journals, i.e., CCS, S&P, USENIX Security, NDSS, ACSAC, ASE, FSE, ICSE, TSE, TIFS, and TOSEM, from 2016 to 2019. Then, we only retained the papers whose titles contain the key words "smart contract", "Ethereum", or "blockchain". After that, we manually read the abstracts to verify their relevance. Finally, we found only four papers that are related to smart contract defects, i.e., Oyente [20], Maian [31], Zeus [35], and ContractFuzzer [34].

To enlarge our set of baseline methods, we use the method proposed by Kitchenham et al. [36]. We first read the references of these 4 relevant papers and tried to find whether there are existing tools that can detect the defined contract defects. If there was a relevant paper, we read its references in turn, until no new paper could be found. In this way, we found two other tools, i.e., Securify [32] and Mythril [33].

Table 5 shows the input of these tools and the contract defects they can detect. The last column shows the number of defects, other than the 8 contract defects mentioned, that each tool can detect. The bytecode of smart contracts on Ethereum is visible to everyone, but less than 1% of the smart contracts open up their source code [22]. Therefore, detecting contract defects at the bytecode level is very important. To make the comparison fair, we select Oyente, Maian, Securify, and Mythril as our baseline tools, since they can detect contract defects at the bytecode level, the same as DefectChecker. However, we found that Maian has not been updated to support the latest Ethereum environment, so we could not run Maian on our dataset. For example, Maian uses methods provided by web3 [37] to obtain contracts' information on Ethereum. However, the methods it uses have been removed and do not support the version of Ethereum that we used. In addition, DefectChecker already achieves a 100% F-Measure when detecting Greedy Contract. Therefore, we do not compare with Maian, and choose Oyente, Securify, and Mythril as our baseline tools.

Oyente detects three kinds of security-related vulnerabilities for smart contracts. These three kinds of vulnerabilities are the same as our Unchecked External Calls, Block Info Dependency, and Reentrancy. Mythril [33] is a tool developed by ConsenSys, which is a leading global blockchain technology company. They find security problems from online posts or news, which is similar to our previous work [11]. Our previous work analyzed posts from StackExchange and defined 20 contract defects. Mythril can detect 6 contract defects, as shown in Table 7. Securify is a smart contract security analyzer that takes EVM bytecode as input. It first decompiles the EVM bytecode and analyzes the semantic facts of the decompiled code. Securify then uses several security patterns to detect related vulnerabilities. Securify can detect Reentrancy and Unchecked External Call, which can also be detected by DefectChecker.

Table 6 shows the results of running Oyente on our previous dataset [11]. The F-scores of Oyente in detecting RE, UEC, and BID are 3.7%, 68.1%, and 37.3%, respectively, while the numbers for DefectChecker are 71.4%, 88.9%, and 98.8%, respectively. We found that Oyente only considers BLOCKHASH instructions when detecting Block Info Dependency, while there are many other instructions, e.g., NUMBER (the NUMBER instruction is used to get the block's number), that can lead to this contract defect. Besides, Oyente also has many false positives when detecting Reentrancy. The reason is that it does not distinguish between the send(), transfer(), and call() functions at the bytecode level, while send() and transfer() limit gas to 2,300 units, which cannot cause Reentrancy. In addition, the most important reason for these errors is code coverage. Code coverage means the percentage of instructions executed. The average code coverage for Oyente is 18.9%, while the number for DefectChecker is 77.1%. Low code coverage means only a small part of the code can be analyzed for contract defect occurrence, which can lead to a large number of false positives and negatives. There are three reasons that lead to the low coverage of Oyente compared to DefectChecker. First, Oyente checks whether a path can be reached, while DefectChecker assumes that all paths are reachable. Second, Oyente is only optimized for Solidity version 0.4.19, but there is a wide range of versions in our dataset. Finally, the jump positions of some unconditional jumps might not be easy to find. To be specific, the jump position might be the result of a complicated expression. Thus, both Oyente and DefectChecker can fail to resolve these unconditional jumps, which is the reason why DefectChecker misses some blocks.
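To make the point about BLOCKHASH versus NUMBER concrete, the sketch below scans runtime bytecode for block-information opcodes. The opcode values are those of the EVM instruction set; which of them a given tool treats as block information is an assumption here, and the function name is hypothetical. A plain scan like this only locates candidate instructions, it does not show that the value influences control flow.

```python
# EVM opcodes that read information about the current or a previous block.
BLOCK_INFO_OPCODES = {
    0x40: "BLOCKHASH",
    0x41: "COINBASE",
    0x42: "TIMESTAMP",
    0x43: "NUMBER",
    0x44: "DIFFICULTY",
    0x45: "GASLIMIT",
}


def block_info_instructions(bytecode_hex):
    """Return (offset, mnemonic) for each block-information opcode found."""
    if bytecode_hex.startswith("0x"):
        bytecode_hex = bytecode_hex[2:]
    code = bytes.fromhex(bytecode_hex)
    found = []
    pc = 0
    while pc < len(code):
        op = code[pc]
        if op in BLOCK_INFO_OPCODES:
            found.append((pc, BLOCK_INFO_OPCODES[op]))
        if 0x60 <= op <= 0x7F:
            # PUSH1..PUSH32 carry 1..32 immediate bytes that are data, not code.
            pc += op - 0x5F
        pc += 1
    return found


# PUSH1 0x40, NUMBER, STOP, BLOCKHASH: the 0x40 after PUSH1 is only data.
sample = block_info_instructions("0x6040430040")
```

A detector that matches only BLOCKHASH would report one instruction for this sample, while the scan above also finds NUMBER.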

Table 7 shows the results of Mythril. Mythril fails to detect Transaction State Dependency and Strict Balance Equality in our dataset. In addition, its results contain many false positives, especially in detecting Reentrancy and DoS Under External Influence. We found that Mythril is similar to Oyente in that it fails to distinguish call() from transfer() and send(), which will not lead to Reentrancy. Besides, Mythril fails to distinguish loop-related patterns, which leads to errors when detecting loop-related defects, e.g., DoS Under External Influence or Nested Call.

Table 8 presents the results of Securify. Securify can detect two defects in common with DefectChecker, i.e., Reentrancy and Unchecked External Call. DefectChecker, Oyente, Mythril, and Securify can all detect these two defects. The F-score of Securify in detecting Reentrancy (4.9%) is better than that of Oyente (3.7%) and equal to that of Mythril (4.9%), but much worse than that of DefectChecker (71.4%). In terms of detecting Unchecked External Call, the F-score of Securify (62.5%) is a little worse than that of Oyente (68.1%) and much better than that of Mythril. DefectChecker still gets the best F-score, at 88.9%, in detecting Unchecked External Call.

To compare the results between all four tools, we add a comparison of F-measure in Table 9, which shows that DefectChecker obtains the best F-measure of all four tools.

TABLE 5: Input and Defects Detected of Each Tool

Tools                Input           Defects detected (of the 8)            # of Other Defects
DefectChecker        Bytecode        TSD, DuEI, SBE, RE, NC, GC, UEC, BID   0
Oyente [20]          Bytecode        RE, UEC, BID                           1
Maian [31]           Bytecode        GC                                     2
Securify [32]        Bytecode        RE, UEC                                7
Mythril [33]         Bytecode        TSD, DuEI, SBE, RE, NC, UEC            28
ContractFuzzer [34]  Bytecode + ABI  3 of the 8                             3
Zeus [35]            Source Code     4 of the 8                             3

TABLE 6: Experiment result of Oyente.

Defects  #Defects  #TP  #TN  #FP  #FN  P(%)  R(%)  F(%)
RE       12        2    373  94   10   2.1   16.7  3.7
UEC      22        16   448  9    6    64.0  72.7  68.1
BID      42        11   431  6    31   64.7  26.2  37.3

TABLE 7: Experiment result of Mythril.

Defects  #Defects  #TP  #TN  #FP  #FN  P(%)  R(%)  F(%)
TSD      5         0    474  0    5    0     0     0
DuEI     6         1    245  228  5    0.4   16.7  0.8
SBE      5         0    474  0    5    0     0     0
RE       12        5    280  187  7    2.6   41.7  4.9
NC       13        2    414  52   11   3.7   15.4  6.0
UEC      22        11   436  21   11   34.4  50.0  40.8

We also calculate the overall precision, recall, and F-measure of all four tools on the whole experimental dataset. Using overall precision as the example, the overall result is calculated by

$$\frac{\sum_{i=1}^{n} p_{c_i} \times |c_i|}{\sum_{i=1}^{n} |c_i|},$$

in which $p_{c_i}$ is the precision of contract defect $i$ and $|c_i|$ is the number of instances of contract defect $i$ in the whole dataset. The results are given in Table 10, which clearly shows that DefectChecker obtains the best results in detecting contract defects.

Time Consumption. We calculate the time to analyze one smart contract to evaluate each tool. To make the evaluation accurate, we kill all the background processes on our machine when testing a tool to ensure the environment is clean. We run each tool 10 times and record the average time to test one smart contract in our dataset.
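Returning briefly to the overall scores defined above, the weighted average can be re-derived from the per-defect results. The sketch below does this for the overall precision, using the P(%) and #Defects columns of Table 4 for DefectChecker and of Table 6 for Oyente; the function and variable names are illustrative only.

```python
def weighted_overall(per_defect_scores, defect_counts):
    """Average the per-defect scores, weighted by the number of defects |c_i|."""
    total = sum(defect_counts)
    weighted = sum(s * c for s, c in zip(per_defect_scores, defect_counts))
    return weighted / total


# DefectChecker, Table 4 (order: TSD, DuEI, SBE, RE, NC, GC, UEC, BID)
dc_counts = [5, 6, 5, 12, 13, 6, 22, 42]
dc_precision = [100.0, 46.2, 100.0, 62.5, 81.8, 100.0, 87.0, 100.0]
dc_overall = round(weighted_overall(dc_precision, dc_counts), 1)

# Oyente, Table 6 (order: RE, UEC, BID)
oy_counts = [12, 22, 42]
oy_precision = [2.1, 64.0, 64.7]
oy_overall = round(weighted_overall(oy_precision, oy_counts), 1)
```

Both values agree with the overall precision column reported for these two tools in Table 10.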

Table 11 shows the time consumption results of each tool. The second column of the table gives the average time each tool needs to test a smart contract. DefectChecker is the fastest of these four tools. It only needs 0.15s to analyze one smart contract. Oyente and Securify have similar running times. Oyente needs 18.48s to analyze one smart contract, and the time for Securify is 21.55s. Mythril is the slowest tool; it needs 103.55s to analyze one smart contract. The maximum time DefectChecker needs to analyze a smart contract is 2.42s, while the times for Oyente, Securify, and Mythril are 1096.32s, 1203.99s, and 2480.26s, respectively. The simplest smart contract in our dataset only contains 7 lines with a single constructor function. DefectChecker needs 0.04s to analyze it, while the times for Oyente, Securify, and Mythril are 0.28s, 0.37s, and 1.58s, respectively. DefectChecker also has the smallest standard deviation among these four tools, which shows that DefectChecker has the most stable speed in analyzing a smart contract.

In conclusion, the efficiency of these four tools is in the order: DefectChecker > Oyente > Securify > Mythril.

TABLE 8: Experiment result of Securify.

Defects  #Defects  #TP  #TN  #FP  #FN  P(%)   R(%)  F(%)
RE       12        1    439  28   11   3.5    8.3   4.9
UEC      22        10   457  0    12   100.0  45.5  62.5

4.6 Threats to Validity

Internal Validity. We used a dataset released in our previous work [11] as the ground truth to evaluate DefectChecker. Since the people who developed DefectChecker are the same as the people who labeled the dataset, it is likely that their familiarity with the dataset might lead to potential optimization or omissions when developing DefectChecker. We tried to use the datasets of the baseline tools to evaluate DefectChecker. However, we failed to find these datasets. Luu et al. ran Oyente on 19,366 contracts. They only manually checked the correctness of some examples, instead of using a complete dataset to evaluate Oyente. We could only find some false positive and true positive values in their paper. Securify uses a complete dataset which consists of 100 smart contracts. However, its authors have not released the dataset to the public. Mythril is a tool from industry, and its technical paper does not even have an evaluation section. Thus, we had to build our own dataset. To reduce the influence of our dataset, we first wrote a few demo smart contracts when developing DefectChecker and used these to conduct small-scale testing of our proposed tool. Then, we conducted large-scale testing by using real-world bytecode we crawled from the Ethereum blockchain. The dataset is the same as the one we introduce in Section 5. During this large-scale testing, we randomly chose a set of smart contracts whose source code is available. We used these smart contracts to improve the performance and the patterns that are used to detect contract defects. We admit that the familiarity with the ground truth dataset might lead to a bias, but the methods we used to develop DefectChecker can reduce this influence.

External Validity. The dataset we used to evaluate DefectChecker is based on manual analysis, which may contain false positives and negatives. To address this problem, we double-checked the results and updated the dataset when we found mistakes. Another threat is that Solidity is a fast-growing programming language. Nine versions were released in 2018, each of which may add or modify features of the previous version. DefectChecker is designed based on Solidity version 0.4.0+, which is the most popular version at the time of writing the paper [30]. In the future, more smart contracts may use higher versions, which may make our tool unable to work.

TABLE 9: Result Comparison (F-Measure) between Four Tools

Tools          TSD     DuEI   SBE    RE     NC     GC      UEC    BID
DefectChecker  100.0%  63.2%  88.9%  71.4%  75.0%  100.0%  88.9%  98.8%
Oyente         /       /      /      3.7%   /      /       68.1%  37.3%
Securify       /       /      /      4.9%   /      /       62.5%  /
Mythril        0%      0.8%   0%     4.9%   6.0%   /       40.8%  /

TABLE 10: Overall Precision, Recall, and F-Measure of Each Tool

Tools          O. P. (%)  O. R. (%)  O. F. (%)
DefectChecker  88.3       90.9       88.8
Oyente         54.6       38.2       40.9
Securify       65.9       32.4       42.2
Mythril        13.3       30.2       16.5

TABLE 11: Time Consumption of Each Tool

Tools          Avg.     Max        Min    S.D.
DefectChecker  0.15s    2.42s      0.04s  5.43
Oyente         18.48s   1,096.32s  0.28s  2,877.64
Securify       21.55s   1,203.99s  0.37s  3,384.39
Mythril        103.55s  2,480.26s  1.58s  13,063.80

5 A LARGE SCALE EVALUATION

In the previous section, we showed that DefectChecker has an excellent performance when applied to a small-scale dataset. In this section, to validate that DefectChecker is still usable for finding contract defects in real-world smart contracts, we ran DefectChecker on a large-scale dataset that we crawled from the Ethereum blockchain, and show the contract defects found by DefectChecker. We give two real-world attacks as case studies to show how harmful these contract defects are.

5.1 Dataset

To identify whether contract defects are actually prevalent in a large-scale, real-world dataset, we crawled bytecode from the Ethereum blockchain up to January 2019 and obtained 183,706 distinct bytecode. Since some smart contract versions are not supported by DefectChecker, we removed them from our experimental dataset. Finally, we ran DefectChecker on 165,621 distinct smart contract bytecode. All these bytecode are runtime bytecode. Runtime bytecode does not contain information on the constructor function. It is the default bytecode stored on Ethereum.

5.2 Contract Defects on Ethereum

We ran DefectChecker on 165,621 smart contract bytecode. The detailed results are given in Table 12, which aims to show the frequency of each defect on Ethereum. Since DefectChecker only identifies whether a contract contains a defect or not, if the same kind of defect appears multiple times in a smart contract, we only count it once in Table 12. The second column of the table shows how many contracts contain the related defect, and the last column gives the percentage of contracts that contain the defect. If a contract contains multiple defects, all of the defects are counted.

Unchecked External Calls is the most frequent contract defect on Ethereum, and about 7.5% of real-world smart contracts contain this defect. About 3.1% of smart contracts contain Block Info Dependency, which is the second most common contract defect on the blockchain. Strict Balance Equality is the rarest of our contract defects. DefectChecker only detects 390 smart contracts that have this contract defect. The percentage of Nested Call is also less than 1%, with 1,043 (0.6%) smart contracts having this kind of contract defect. The percentages of Transaction State Dependency and DoS Under External Influence are similar on Ethereum, at about 1.0% and 1.3%, respectively. There are 3,139 greedy contracts on Ethereum, and 3,892 smart contracts contain the Reentrancy problem, which can lead to serious security problems.

TABLE 12: Contract Defects in Ethereum

Contract Defects              # Defects  # Percentage
Transaction State Dependency  1,669      1.0%
DoS Under External Influence  2,116      1.3%
Strict Balance Equality       390        0.2%
Reentrancy                    3,892      2.4%
Nested Call                   1,043      0.6%
Greedy Contract               3,139      1.9%
Unchecked External Calls      12,439     7.5%
Block Info Dependency         5,201      3.1%

We found that there are 16 smart contracts that contain 4 kinds of contract defects, which are thus the most defective contracts. The number of smart contracts that contain 3 kinds of contract defects is 539, and 3,520 smart contracts contain 2 kinds of contract defects. In total, 25,815 smart contracts contain at least one kind of defect, which means that about 15.6% of the 165,621 analyzed smart contracts on Ethereum contain some kind of defect, as reported by DefectChecker.

We utilized cyclomatic complexity [28] and the number of instructions to conduct a further analysis. We computed the cyclomatic complexity and the number of instructions for the contracts in our dataset. We found that the average cyclomatic complexity of smart contracts on Ethereum is 21.3, and the average number of instructions is 2,342.6. Figure 3 shows the relationship between the number of contract defects contained in smart contracts and the number of instructions & cyclomatic complexity. The x-axis is the number of contract defects in a smart contract. The left y-axis is the number of instructions, and the right y-axis is the cyclomatic complexity. The two lines have a similar trend.

The number of instructions is proportional to the length of a contract's code, which shows the contract's complexity at the code level. The cyclomatic complexity indicates the complexity of a program. We performed a generalized linear regression with the Poisson error distribution model provided by R [38] to analyze the relationship between the number of defects and the number of instructions, and between the number of defects and cyclomatic complexity. In our model, we use the number of instructions and the cyclomatic complexity, respectively, to predict the number of defects. Since both coefficients are positive (0.001 with std. error = 0.0009 and 0.023 with std. error = 0.0179, respectively), the more complex a contract is, the higher its probability of containing defects. We calculated the correlation level between these two complexity measures using the Pearson correlation method [39] at a 5% significance level.

[Figure 3: chart. x-axis: number of contract defects per contract, labeled with contract counts: 0 Defect (139807), 1 Defect (25815), 2 Defects (3520), 3 Defects (538), 4 Defects (16). Left y-axis: # of Instructions (0 to 3500). Right y-axis: Cyclomatic Complexity (0 to 70). Data labels in x-axis order: Average Instructions 2,356.0, 1,825.0, 2,801.9, 2,946.2, 2,956.0; Average Cyclomatic Complexity 21.0, 18.3, 28.1, 36.7, 62.9.]

Fig. 3: The relationship between the number of contract defects and number of Instructions & Cyclomatic Complexity

The statistical test shows that the correlation coefficient is 0.702 with p-value < 0.05. These correlation results imply that the number of instructions and the cyclomatic complexity are correlated, and we can use only one of them as a predictor.

5.3 Case Study

DefectChecker found some real-world attacks and financial losses in our large-scale testing on the full Ethereum dataset. In this subsection, we give two examples to show the importance of detecting such contract defects.

Case Study 1: The first example is shown in Listing 14. There are 2,335.8 Ethers in the contract balance, worth $552,720 as of Mar. 2020. Unfortunately, all the Ethers are locked because of a contract defect, i.e., Nested Call. The buggy function in Listing 14 is named sendReward(). We highlight two lines of the code (Line 2 and Line 14), which are related to two contract defects, i.e., Nested Call and DoS Under External Influence.

There is a loop in the function sendReward(), and the number of loop iterations grows with the length of investors[]. However, the contract does not limit its loop iterations. Sending Ethers is expensive, as it consumes a large amount of gas, and the contract sends Ethers to the contract users in Line 14. So, the gas consumption of executing sendReward() increases with the length of investors[]. When we check the transactions of the contract, we find that the contract works normally at first, as the total gas consumption of sendReward() does not exceed the maximum gas limit at that time. However, as the length of investors[] increases, the total gas cost increases rapidly. The gas cost eventually exceeds the gas limit and leads to an out-of-gas error. Even worse, since the length of investors[] cannot be reduced, once the error happens, sendReward() cannot be called anymore, which means all the Ethers in the balance are locked forever. Figure 4 shows the details of a failed transaction. It is clear that when a user calls sendReward(), the out-of-gas error happens.
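The lock-up mechanism can be illustrated with a back-of-the-envelope model. All three constants below are hypothetical round numbers chosen for illustration; they are not measured from the contract or taken from the paper, and real gas costs depend on the executed branch and on the Ethereum version.

```python
GAS_LIMIT = 8_000_000      # hypothetical gas limit available to one transaction
BASE_GAS = 21_000          # hypothetical fixed overhead of the transaction
GAS_PER_INVESTOR = 30_000  # hypothetical cost of one loop iteration with a transfer


def gas_needed(num_investors):
    """Gas consumed by one sendReward() call under the linear cost model."""
    return BASE_GAS + num_investors * GAS_PER_INVESTOR


def send_reward_succeeds(num_investors):
    """True while the whole loop still fits into the gas limit."""
    return gas_needed(num_investors) <= GAS_LIMIT


def max_payable_investors():
    """Largest array length for which sendReward() can still complete."""
    return (GAS_LIMIT - BASE_GAS) // GAS_PER_INVESTOR


threshold = max_payable_investors()
```

Under these assumptions the call succeeds up to `threshold` investors and fails for every larger array. Because the array never shrinks, the failure is permanent once the threshold has been passed.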

Fig. 4: Transaction Detail of Case Study 1

 1 function sendReward() public isOwner {
 2   for (uint i = 0; i < investors.length; i++) {
 3     address add = investors[i];
 4     User memory user = addressToUser[add];
 5     if (user.gameOver) {
 6       autoReInvest(add);
 7       user.rebirth = now - (oneLoop / 2);
 8       addressToUser[add] = user;
 9     } else {
10       if (SafeMath.sub(now, user.rebirth) >= oneLoop) {
11         address payable needPay = address(uint160(add));
12         uint staticAmount = getStatic(add);
13         if (staticAmount > 0) {
14           needPay.transfer(staticAmount);
15         }
16         ...
17       }
18     }

Listing 14: Case Study 1 - Contract with Nested Call. Code from Contract: 0x41AeB72624f739281b12aDE663791254F32DB669.

It should be noted that although the financial loss in this real-world example is caused by Nested Call, the contract shown in Listing 14 also has another contract defect, namely DoS Under External Influence. This contract defect can also lock Ethers. Specifically, if needPay (Line 14) is a contract address, the maximum gas limit is restricted to 2,300 gas units, which is not enough to transfer Ethers. Thus, an out-of-gas error will happen in Line 14, and the Ether transfer cannot succeed.

Case Study 2: The second example is a bank contract, which is shown in Listing 15. Users can send Ethers to the Deposit() function and withdraw their Ethers by calling the CashOut() function. The contract first sends Ethers on Line 11 and then reduces the caller's balance on Line 12. However, this can lead to Reentrancy if the caller is an attacking contract. When the victim contract sends Ethers to the attacking contract, the fallback function of the attacking contract can call the CashOut() function again and steal Ethers from the victim contract. In this way, all of the balance in the contract was stolen by the attackers.

Figure 5 shows an attacking transaction which was launched by an attacking contract. The address of the attacking contract starts with 0xdefbe, and the address of the victim contract starts with 0xbabfe. The attack happens three times, on blocks 4919015, 4919567, and 4919662, respectively. First, the attacking contract sent 1 Ether to the victim contract. Then, the victim contract returned Ethers to the attacking contract. From these 3 attacks, the attacking contract stole about 5 Ethers from the victim contract, which were worth about $1,200 at the time of writing the paper. We only show one example in Figure 5. Actually, the victim contract was attacked by multiple attacking contracts, so the financial loss was far more than 5 Ethers.

Fig. 5: Transaction Lists of Case Study 2

 1 function Deposit() public payable {
 2   if (msg.value >= MinDeposit) {
 3     balances[msg.sender] += msg.value;
 4     TransferLog.AddMessage(msg.sender, msg.value, "Deposit");
 5   }
 6 }
 7
 8 function CashOut(uint am)
 9 {
10   if (am <= balances[msg.sender]) {
11     if (msg.sender.call.value(am)()) {
12       balances[msg.sender] -= am;
13       TransferLog.AddMessage(msg.sender, am, "CashOut");
14     }
15   }
16 }

Listing 15: Case Study 2 - Contract with Reentrancy. Code from Contract: 0xbABfE0AE175b847543724c386700065137d30e3B.
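The control flow behind this attack can be mimicked in a few lines. The toy model below is not EVM code: the class Bank, the function attack, and the amounts are hypothetical, and the fallback function of the attacking contract is modeled as a Python callback. It only illustrates the ordering problem of Lines 11 and 12, where the Ethers are sent before the ledger is updated.

```python
class Bank:
    """Toy model of the CashOut() pattern: pay first, update the ledger later."""

    def __init__(self):
        self.balances = {}  # per-user ledger, like balances[msg.sender]
        self.funds = 0      # Ether actually held by the contract

    def deposit(self, who, amount):
        self.balances[who] = self.balances.get(who, 0) + amount
        self.funds += amount

    def cash_out(self, who, amount, on_receive):
        if amount <= self.balances.get(who, 0) and amount <= self.funds:
            self.funds -= amount          # Ether leaves the contract (Line 11)
            on_receive(amount)            # receiver's fallback function runs here
            self.balances[who] -= amount  # ledger is updated too late (Line 12)


def attack(bank, attacker="attacker", stake=1, max_depth=10):
    """Deposit a small stake, then re-enter cash_out() from the fallback."""
    bank.deposit(attacker, stake)
    received = []

    def fallback(amount):
        received.append(amount)
        # Re-enter while the ledger still shows the full stake.
        if bank.funds >= stake and len(received) < max_depth:
            bank.cash_out(attacker, stake, fallback)

    bank.cash_out(attacker, stake, fallback)
    return sum(received)


bank = Bank()
bank.deposit("victim", 4)  # other users' money held by the bank
stolen = attack(bank)      # the attacker only deposits 1
```

In this run the attacker deposits 1 unit and withdraws all 5 units held by the bank, because every re-entrant call passes the balance check before any deduction has been recorded. Moving the ledger update before the external call makes the second check fail.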

5.4 Threats to Validity

Internal Validity. The dataset we used was crawled from Ethereum and contains contracts compiled with different Solidity versions. DefectChecker only supports version 0.4.0 and above, and about 20,000 contracts had to be removed from our dataset, which may influence the overall results. However, the bytecode we removed is from many years ago, since the first version of 0.4.0+ was released in Sept. 2016. Even though there may be many contract defects in the removed bytecode, these do not represent current smart contract usage.

Another key threat is that we used DefectChecker to get the results, but DefectChecker also reports false positives and negatives, as shown in the previous section. However, DefectChecker is the most accurate and efficient tool that detects contract defects at the bytecode level, as we also demonstrated in the previous section. Therefore, we believe the results and our conclusions from them are reasonable.

External Validity. There are more than 1,000 smart contracts being deployed to Ethereum every day [40]. Many guidelines and security detection tools [31], [41] have been released to the public, which can help to improve the quality of smart contracts. As a result, the number of contract defects in smart contracts may decrease, which may lead to results that differ from what we found and reported in this section.

6 RELATED WORK

Contract Defects on Smart Contracts. Our previous work [11] is the first to define 20 smart contract defects on Ethereum by analyzing posts on StackExchange [13]. We first crawled all 17,128 StackExchange posts available at the time of writing that paper and used keywords to filter Solidity-related posts. Two authors of the paper then used Open Card Sorting to find 20 contract defects and divided them into five categories, i.e., security, availability, performance, maintainability, and reusability defects. Although previous works defined several security defects, they did not consider the practitioners’ perspective. Therefore, we designed an online survey to collect feedback from developers and validate whether they regard the contract defects as harmful. This feedback showed that all the defined contract defects are harmful to smart contracts. We assigned five impact levels to the 20 defined contract defects according to our survey results and the symptoms of the defects. According to our definition, contract defects with impact level 1-3 can lead to unwanted behaviors of a contract, e.g., a contract being controlled by attackers.

Smart Contract Security Problems and Detection Tools. Luu et al. [20] introduced four security issues in their work, i.e., mishandled exception, transaction-ordering dependence, timestamp dependence, and reentrancy attack. They proposed a tool named Oyente, which is the first symbolic-execution-based bug detection tool for smart contracts. Oyente first splits the bytecode into several blocks and builds a skeletal control flow graph for the contract under analysis. Then, it utilizes Z3 [25] as its SMT solver and symbolically executes each instruction to obtain the full control flow graph. Finally, it applies different patterns to detect whether the input contract contains the defined security problems. Luu et al. ran Oyente on 19,366 existing Ethereum contracts and found that 8,519 of them contain the defined security problems.
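The first step described above, splitting bytecode into blocks, can be sketched as follows. This is a minimal illustration in Python under stated assumptions, not the implementation used by Oyente or DefectChecker. The function name `split_basic_blocks` is hypothetical, and the sketch relies only on standard EVM opcode values: a block starts at JUMPDEST, ends after a jump or halting instruction, and the immediate bytes of PUSH1 to PUSH32 are skipped so that data is not mistaken for an opcode.

```python
from typing import List, Tuple

JUMPDEST = 0x5B
# STOP, JUMP, JUMPI, RETURN, REVERT, INVALID, SELFDESTRUCT end a block.
TERMINATORS = {0x00, 0x56, 0x57, 0xF3, 0xFD, 0xFE, 0xFF}


def split_basic_blocks(code: bytes) -> List[Tuple[int, int]]:
    """Return (start, end) byte offsets of each basic block, end exclusive."""
    blocks = []
    start = 0
    pc = 0
    while pc < len(code):
        op = code[pc]
        # A JUMPDEST in the middle of a run starts a new block (fall-through).
        if op == JUMPDEST and pc > start:
            blocks.append((start, pc))
            start = pc
        # PUSH1..PUSH32 (0x60..0x7F) carry 1..32 immediate bytes.
        width = op - 0x5F if 0x60 <= op <= 0x7F else 0
        pc += 1 + width
        if op in TERMINATORS:
            blocks.append((start, min(pc, len(code))))
            start = pc
    if start < len(code):
        blocks.append((start, len(code)))  # trailing block without a terminator
    return blocks


if __name__ == "__main__":
    # PUSH1 0x03, JUMP, JUMPDEST, STOP
    print(split_basic_blocks(bytes.fromhex("6003565b00")))
```

A skeletal control flow graph then connects these blocks using fall-through edges and jump targets that are visible as constants. Targets computed at run time are what the symbolic execution step has to resolve.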

Kalra et al. [35] developed a tool named Zeus. The tool takes source code as input and translates it to LLVM bitcode. Zeus can detect seven kinds of security problems: four of them are the same as Oyente’s, and the other three are unchecked send, failed send, and integer overflow/underflow. They also compared their results to Oyente and found that Oyente produces many false positives and false negatives. To evaluate Zeus, they crawled 1,524 distinct smart contracts from the Etherscan [30], Etherchain [42] and EtherCamp [43] explorers. The results show that about 94.6% of the contracts contain at least one security problem. However, the need for source code limits the tool’s usage.

Jiang et al. [34] proposed a tool named ContractFuzzer to test seven security issues. ContractFuzzer is the first tool that utilizes fuzzing technology to detect security problems on smart contracts. They tested 6,991 smart contracts and found that 459 of them have issues. However, fewer than 0.5% of smart contracts publish their ABI for inspection on Ethereum [30], while ContractFuzzer needs the smart contract ABI or source code to generate test cases, which limits its usage. In addition, our dataset consists of 579 smart contracts in bytecode form, which are not supported by ContractFuzzer.

Nikolic et al. [31] developed a tool named MAIAN, which contains two major parts: symbolic analysis and concrete validation. Similar to Oyente, MAIAN utilizes symbolic execution and defines several execution rules to detect security issues. The tool accepts either bytecode or source code as input. MAIAN has a different focus from our tool: it targets security issues that leave a contract unable to release Ethers, allow Ethers to be transferred to arbitrary addresses, or allow the contract to be killed by anybody. Their results were obtained from 970,898 smart contracts, of which a total of 34,200 (2,365 distinct) contain at least one of these three security issues.

ConsenSys is a leading blockchain technology company. They built a website named SWC Registry [44] (Smart Contract Weakness Classification and Test Cases) to collect smart contract security problems from online posts and news through crowdsourcing. Mythril [33] is a tool to detect the security problems listed in the SWC Registry; its first version was released in May 2018. The method used by Mythril is similar to Oyente’s. It first builds a CFG and utilizes Z3 [25] as an SMT solver. Then, it applies several rules to detect related problems. Mythril is developed by industry, and its instruction manual does not contain any evaluation of the tool.

Securify [32] is a tool released by Tsankov et al. Securify is the first tool that utilizes semantic information to detect security problems on smart contracts. It first decompiles EVM bytecode and analyzes the semantic facts, including data flow and control flow dependencies. Then, it checks several security patterns that are written in a specialized domain-specific language to detect related security problems. Securify focuses on two kinds of security problems, i.e., Stealing Ether and Frozen Funds, and can detect 9 security issues in total. Tsankov et al. evaluated their tool on two datasets. The first is a large-scale evaluation based on 24,594 smart contracts, whose results show that more than 70% of smart contracts contain at least one of the security problems. The second is a small-scale evaluation based on 100 smart contracts, used to evaluate the tool’s effectiveness. To simplify manual inspection, all of these 100 smart contracts are up to 200 lines of code. According to their paper, Securify can find more security violations than Oyente and Mythril.

In this paper, we propose a tool named DefectChecker, which, in our evaluation, is the most accurate and fastest symbolic-execution-based tool for smart contract defect detection. DefectChecker can detect contract defects by analyzing bytecode, while Zeus and ContractFuzzer need source code and contract ABI, respectively. The bytecode of smart contracts is visible to everyone, while only 1% of smart contracts open up their source code and ABI to the public [22], which restricts the usage of these two tools. MAIAN uses a dynamic analysis method to detect security problems, which is different from our static analysis method. Moreover, we find that MAIAN cannot support the current version of Ethereum that we used. Oyente, Mythril, and Securify use symbolic execution to detect security problems, which is similar to DefectChecker, but DefectChecker uses Stack Event and Feature Detector in place of an SMT solver, which makes DefectChecker require less runtime while being more accurate than these tools.

Oyente, Mythril, and Securify can detect other contract defects that are not supported by DefectChecker. Mythril in particular can detect 34 kinds of contract defects. We acknowledge that some tools can detect more contract defects than DefectChecker, but covering more defects is not the main motivation of this paper. Previous works, e.g., Oyente and Securify, only proposed several security defects of smart contracts without validating whether they are really harmful, which is not beneficial for the development of the smart contract ecosystem. In our previous work, we used an online survey to validate whether smart contract developers consider the contract defects we found from StackExchange posts to be harmful. In this paper, we propose DefectChecker, which aims to automatically detect the validated contract defects. We use Oyente, Mythril, and Securify as baselines to show that our method is more accurate and efficient than these state-of-the-art tools.

Our DefectChecker is extensible. As shown in Figure 1, DefectChecker has three components, i.e., CFG Builder, Feature Detector, and Defect Identifier. Defect Identifier uses eight different rules to identify the contract defects, while the other two components can also be used to detect other defects. To detect other defects, we can define new rules that use the data provided by our Feature Detector, CFG, and Stack Event components. Many tools are built on top of Oyente. For example, our previous work GasChecker [45] is a tool to detect gas-inefficient smart contracts, and it uses the CFG generated by Oyente to detect gas-inefficiency issues. DefectChecker generates the CFG more efficiently than Oyente, and GasChecker can also use the CFG generated by DefectChecker. Thus, DefectChecker can be extended to detect other kinds of issues.

7 CONCLUSION AND FUTURE WORK

In this paper, we proposed DefectChecker, which utilizes symbolic execution to detect smart contract defects by analyzing the contracts’ bytecode. DefectChecker uses different rules to detect 8 contract defects and achieves a very good result when running on our previous work’s dataset. The scores for our tool are much higher than those of state-of-the-art tools (e.g., Oyente, Mythril, and Securify). We also crawled 165,621 distinct smart contracts in bytecode form from Ethereum and ran DefectChecker on them. Our results show that about 15.89% of smart contracts on Ethereum contain at least one instance of our 8 identified kinds of contract defects.

Two groups can benefit from this work. Smart contract developers can utilize DefectChecker to check their smart contracts and make them more robust. As DefectChecker can detect contract defects from bytecode without the need for source code, developers can also use it to check whether the smart contracts they call are secure, even if the callee contracts are not open source. This can make their own contracts safer. For software engineering researchers, DefectChecker provides a good framework to help them solve other smart-contract-related research problems, as the CFG generated by DefectChecker can be used for other purposes.

DefectChecker has some false positives and false negatives when detecting defects, e.g., NC and DuEI. As we described in Section 4.4, adding an SMT solver can reduce some error cases, but it also increases the time needed to analyze a contract. Future work could explore how to combine the method used by DefectChecker with an SMT solver to balance efficiency and accuracy. Specifically, researchers could identify which kinds of code patterns lead to the errors made by DefectChecker. For example, DefectChecker regards all paths as reachable, while some conditional expressions always evaluate to false, which can lead to false positives in detecting loops. Developers can use an SMT solver to check the conditional expressions in the loop-related blocks. This method can increase the accuracy of detecting loop-related blocks.
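The loop example above can be illustrated with a small sketch. It is written in Python under stated assumptions and is not DefectChecker's code. The function name `has_loop` and the CFG encoding are hypothetical, and the constant branch conditions are hard-coded here, whereas in the approach suggested above they would come from an SMT solver. The search is a plain depth-first search for a back edge, run once with every edge treated as reachable and once with always-false edges pruned.

```python
from typing import Dict, List, Optional, Tuple

# An edge is (target_block, condition). The condition is None when nothing is
# known about the branch, or a bool when the guard is a known constant.
Edge = Tuple[str, Optional[bool]]


def has_loop(cfg: Dict[str, List[Edge]], entry: str, prune_infeasible: bool) -> bool:
    """Depth-first search for a back edge reachable from `entry`."""
    on_stack = set()
    done = set()

    def visit(node: str) -> bool:
        on_stack.add(node)
        for target, cond in cfg.get(node, []):
            if prune_infeasible and cond is False:
                continue  # this edge can never be taken
            if target in on_stack:
                return True  # back edge: a cycle exists
            if target not in done and visit(target):
                return True
        on_stack.remove(node)
        done.add(node)
        return False

    return visit(entry)


# The loop body is guarded by a condition that is always false.
EXAMPLE_CFG: Dict[str, List[Edge]] = {
    "entry": [("head", None)],
    "head": [("body", False), ("exit", True)],
    "body": [("head", None)],
    "exit": [],
}

if __name__ == "__main__":
    print("all paths reachable:", has_loop(EXAMPLE_CFG, "entry", prune_infeasible=False))
    print("infeasible edges pruned:", has_loop(EXAMPLE_CFG, "entry", prune_infeasible=True))
```

With every path treated as reachable, the search reports a loop. Once the always-false edge is pruned, it does not, which is the kind of false positive the combined approach is meant to remove.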

ACKNOWLEDGEMENTS

This research was partially supported by the Australian Research Council’s Discovery Early Career Researcher Award (DECRA) funding scheme (DE200100021), ARC Laureate Fellowship funding scheme (FL190100035), ARC Discovery grant (DP200100020), National Natural Science Foundation of China (61872057), National Key R&D Program of China (2018YFB0804100), Hong Kong RGC Project (No. 152193/19E), and the National Research Foundation, Singapore under its Industry Alignment Fund – Pre-positioning (IAF-PP) Funding Initiative. Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not reflect the views of National Research Foundation, Singapore.

REFERENCES

[1] S. Nakamoto, “Bitcoin: A peer-to-peer electronic cash system,” Working Paper, 2008.

[2] G. Wood, “Ethereum: A secure decentralised generalised transaction ledger,” Project Yellow Paper, 2014.

[3] (Apr., 2019) Decentralized Application. [Online]. Available: https://en.wikipedia.org/wiki/Decentralized_application

[4] (Feb., 2019) Cryptokitties. [Online]. Available: https://www.cryptokitties.co/

[5] (Apr., 2019) ICO Ethereum. [Online]. Available: https://etherscan.io/directory/ICOs

[6] (Mar., 2018) Solidity Document. [Online]. Available: http://solidity.readthedocs.io

[7] T. Chen, X. Li, Y. Wang, J. Chen, Z. Li, X. Luo, M. H. Au, and X. Zhang, “An Adaptive Gas Cost Mechanism for Ethereum to Defend Against Under-Priced DoS Attacks,” in International Conference on Information Security Practice and Experience. Springer, 2017, pp. 3–24.

[8] T. Chen, Y. Zhu, Z. Li, J. Chen, X. Li, X. Luo, X. Lin, and X. Zhang, “Understanding Ethereum via Graph Analysis,” in IEEE INFOCOM 2018-IEEE Conference on Computer Communications. IEEE, 2018, pp. 1484–1492.

[9] T. Chen, Z. Li, Y. Zhu, J. Chen, X. Luo, J. C.-S. Lui, X. Lin, and X. Zhang, “Understanding Ethereum via Graph Analysis,” ACM Transactions on Internet Technology (TOIT), vol. 20, no. 2, pp. 1–32, 2020.

[10] ISO, “ISO/IEC/IEEE International Standard - Systems and software engineering–Vocabulary,” ISO/IEC/IEEE 24765: 2017 (E), Tech. Rep., 2017.

[11] J. Chen, X. Xia, D. Lo, J. Grundy, X. Luo, and T. Chen, “Defining Smart Contract Defects on Ethereum,” IEEE Transactions on Software Engineering, 2020.

[12] R. Chillarege et al., “Orthogonal defect classification,” Handbook of Software Reliability Engineering, pp. 359–399, 1996.

[13] (Jan., 2018) StackExchange. [Online]. Available: https://ethereum.stackexchange.com/

[14] T. Chen, Z. Li, H. Zhou, J. Chen, X. Luo, X. Li, and X. Zhang, “Towards saving money in using smart contracts,” in 2018 IEEE/ACM 40th International Conference on Software Engineering: New Ideas and Emerging Technologies Results (ICSE-NIER). IEEE, 2018, pp. 81–84.

[15] Ethereum Foundation, “Ethereum’s white paper.” https://github.com/ethereum/wiki/wiki/White-Paper, 2014.

[16] R. Vallee-Rai, P. Co, E. Gagnon, L. Hendren, P. Lam, and V. Sundaresan, “Soot: A Java bytecode optimization framework,” in CASCON First Decade High Impact Papers, 2010, pp. 214–224.

[17] N. Grech, L. Brent, B. Scholz, and Y. Smaragdakis, “Gigahorse: thorough, declarative decompilation of smart contracts,” in 2019 IEEE/ACM 41st International Conference on Software Engineering (ICSE). IEEE, 2019, pp. 1176–1186.

[18] (Aug., 2020) The Java Virtual Machine Specification. [Online]. Available: https://docs.oracle.com/javase/specs/jvms/se8/html/index.html

[19] T. Chen, Z. Li, Y. Zhang, X. Luo, T. Wang, T. Hu, X. Xiao, D. Wang, J. Huang, and X. Zhang, “A large-scale empirical study on control flow identification of smart contracts,” in 2019 ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM). IEEE, 2019, pp. 1–11.

[20] L. Luu, D.-H. Chu, H. Olickel, P. Saxena, and A. Hobor, “Making smart contracts smarter,” in Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security. ACM, 2016, pp. 254–269.

[21] (Mar., 2018) Geth. [Online]. Available: https://github.com/ethereum/go-ethereum

[22] T. Chen, Y. Zhang, Z. Li, X. Luo, T. Wang, R. Cao, X. Xiao, and X. Zhang, “TokenScope: Automatically Detecting Inconsistent Behaviors of currency Tokens in Ethereum,” in Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security, 2019, pp. 1503–1520.

[23] T. Chen, Y. Feng, Z. Li, H. Zhou, X. Luo, X. Li, X. Xiao, J. Chen, and X. Zhang, “Gaschecker: Scalable analysis for discovering gas-inefficient smart contracts,” IEEE Transactions on Emerging Topics in Computing, 2020.

[24] C. Barrett and C. Tinelli, “Satisfiability modulo theories,” in Handbook of Model Checking. Springer, 2018, pp. 305–343.

[25] L. De Moura and N. Bjørner, “Z3: An efficient SMT solver,” in International conference on Tools and Algorithms for the Construction and Analysis of Systems. Springer, 2008, pp. 337–340.

[26] R. Tarjan, “Depth-first search and linear graph algorithms,” SIAM journal on computing, vol. 1, no. 2, pp. 146–160, 1972.

[27] (Jan., 2016) EIP-55. [Online]. Available: https://github.com/ethereum/EIPs/blob/master/EIPS/eip-55.md

[28] T. J. McCabe, “A complexity measure,” IEEE Transactions on software Engineering, no. 4, pp. 308–320, 1976.

[29] J. Cohen, “A coefficient of agreement for nominal scales,” Educational and psychological measurement, vol. 20, no. 1, pp. 37–46, 1960.

[30] (Mar., 2018) Etherscan. [Online]. Available: https://etherscan.io/

[31] I. Nikolic, A. Kolluri, I. Sergey, P. Saxena, and A. Hobor, “Finding the greedy, prodigal, and suicidal contracts at scale,” in Proceedings of the 34th Annual Computer Security Applications Conference. ACM, 2018, pp. 653–663.

[32] P. Tsankov, A. Dan, D. Drachsler-Cohen, A. Gervais, F. Buenzli, and M. Vechev, “Securify: Practical security analysis of smart contracts,” in Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security. ACM, 2018, pp. 67–82.

[33] (Aug., 2019) Mythril: Security analysis tool for EVM bytecode. [Online]. Available: https://github.com/ConsenSys/mythril

[34] B. Jiang, Y. Liu, and W. Chan, “Contractfuzzer: Fuzzing smart contracts for vulnerability detection,” in Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering. ACM, 2018, pp. 259–269.

[35] S. Kalra, S. Goel, M. Dhawan, and S. Sharma, “Zeus: Analyzing safety of smart contracts,” in 25th Annual Network and Distributed System Security Symposium (NDSS’18), 2018.

[36] B. Kitchenham, “Procedures for performing systematic reviews,” Keele, UK, Keele University, vol. 33, no. 2004, pp. 1–26, 2004.

[37] (Apr., 2019) Web3.py. [Online]. Available: https://web3py.readthedocs.io/en/stable/

[38] (Jan., 2021) Generalized Linear Models in R, Part 6: Poisson Regression for Count Variables. [Online]. Available: https://www.theanalysisfactor.com/generalized-linear-models-in-r-part-6-poisson-regression-count-variables/

[39] J. Benesty, J. Chen, Y. Huang, and I. Cohen, “Pearson correlation coefficient,” in Noise reduction in speech processing. Springer, 2009, pp. 1–4.

[40] T. Chen, Z. Li, Y. Zhang, X. Luo, A. Chen, K. Yang, B. Hu, T. Zhu, S. Deng, T. Hu et al., “Dataether: Data Exploration Framework for Ethereum,” in 2019 IEEE 39th International Conference on Distributed Computing Systems (ICDCS). IEEE, 2019, pp. 1369–1380.

[41] (Mar., 2018) Oyente: An Analysis Tool for Smart Contracts. [Online]. Available: https://github.com/melonproject/oyente

[42] (Mar., 2018) Etherchain. [Online]. Available: https://www.etherchain.org/contracts/

[43] (Mar., 2018) Ethercamp. [Online]. Available: https://live.ether.camp/

[44] (Jul., 2019) SWC Registry: Smart Contract Weakness Classification and Test Cases. [Online]. Available: https://smartcontractsecurity.github.io/SWC-registry/

[45] T. Chen, Y. Feng, Z. Li, H. Zhou, X. Luo, X. Li, X. Xiao, J. Chen, and X. Zhang, “GasChecker: Scalable Analysis for Discovering Gas-Inefficient Smart Contracts,” IEEE Transactions on Emerging Topics in Computing, 2020.

REFERENCES

[1] S. Nakamoto, “Bitcoin: A peer-to-peer electronic cash system,”Working Paper, 2008.

[2] G. Wood, “Ethereum: A secure decentralised generalised transac-tion ledger,” Project Yellow Paper, 2014.

[3] (Apr., 2019) Decentralized Application. [Online]. Available:https://en.wikipedia.org/wiki/Decentralized application

[4] (Feb., 2019) Cryptokitties. [Online]. Available: https://www.cryptokitties.co/

[5] (Apr., 2019) ICO Ethereum. [Online]. Available: https://etherscan.io/directory/ICOs

[6] (Mar., 2018) Solidity Document. [Online]. Available: http://solidity.readthedocs.io

[7] T. Chen, X. Li, Y. Wang, J. Chen, Z. Li, X. Luo, M. H. Au,and X. Zhang, “An Adaptive Gas Cost Mechanism for Ethereumto Defend Against Under-Priced DoS Attacks,” in InternationalConference on Information Security Practice and Experience. Springer,2017, pp. 3–24.

[8] T. Chen, Y. Zhu, Z. Li, J. Chen, X. Li, X. Luo, X. Lin, and X. Zhange,“Understanding Ethereum via Graph Analysis,” in IEEE INFO-COM 2018-IEEE Conference on Computer Communications. IEEE,2018, pp. 1484–1492.

[9] T. Chen, Z. Li, Y. Zhu, J. Chen, X. Luo, J. C.-S. Lui, X. Lin, andX. Zhang, “Understanding Ethereum via Graph Analysis,” ACMTransactions on Internet Technology (TOIT), vol. 20, no. 2, pp. 1–32,2020.

[10] ISO, “ISO/IEC/IEEE International Standard - Systems and soft-ware engineering–Vocabulary,” ISO/IEC/IEEE 24765: 2017 (E),Tech. Rep., 2017.

[11] J. Chen, X. Xia, D. Lo, J. Grundy, X. Luo, and T. Chen, “DefiningSmart Contract Defects on Ethereum,” IEEE Transactions on Soft-ware Engineering, 2020.

[12] R. Chillarege et al., “Orthogonal defect classification,” Handbook ofSoftware Reliability Engineering, pp. 359–399, 1996.

[13] (Jan., 2018) StackExchange. [Online]. Available: https://ethereum.stackexchange.com/

[14] T. Chen, Z. Li, H. Zhou, J. Chen, X. Luo, X. Li, and X. Zhang, “To-wards saving money in using smart contracts,” in 2018 IEEE/ACM40th International Conference on Software Engineering: New Ideas andEmerging Technologies Results (ICSE-NIER). IEEE, 2018, pp. 81–84.

[15] Ethereum Foundation, “Ethereum’s white paper.” https://github.com/ethereum/wiki/wiki/White-Paper, 2014.

[16] R. Vallee-Rai, P. Co, E. Gagnon, L. Hendren, P. Lam, and V. Sun-daresan, “Soot: A Java bytecode optimization framework,” inCASCON First Decade High Impact Papers, 2010, pp. 214–224.

[17] N. Grech, L. Brent, B. Scholz, and Y. Smaragdakis, “Gigahorse:thorough, declarative decompilation of smart contracts,” in 2019IEEE/ACM 41st International Conference on Software Engineering(ICSE). IEEE, 2019, pp. 1176–1186.

[18] (Aug., 2020) The Java Virtual Machine Specification.[Online]. Available: https://docs.oracle.com/javase/specs/jvms/se8/html/index.html

[19] T. Chen, Z. Li, Y. Zhang, X. Luo, T. Wang, T. Hu, X. Xiao, D. Wang,J. Huang, and X. Zhang, “A large-scale empirical study on controlflow identification of smart contracts,” in 2019 ACM/IEEE Interna-tional Symposium on Empirical Software Engineering and Measurement(ESEM). IEEE, 2019, pp. 1–11.

[20] L. Luu, D.-H. Chu, H. Olickel, P. Saxena, and A. Hobor, “Makingsmart contracts smarter,” in Proceedings of the 2016 ACM SIGSACConference on Computer and Communications Security. ACM, 2016,pp. 254–269.

[21] (Mar., 2018) Geth. [Online]. Available: https://github.com/ethereum/go-ethereum

[22] T. Chen, Y. Zhang, Z. Li, X. Luo, T. Wang, R. Cao, X. Xiao,and X. Zhang, “TokenScope: Automatically Detecting InconsistentBehaviors of currency Tokens in Ethereum,” in Proceedings of the2019 ACM SIGSAC Conference on Computer and CommunicationsSecurity, 2019, pp. 1503–1520.

[23] T. Chen, Y. Feng, Z. Li, H. Zhou, X. Luo, X. Li, X. Xiao, J. Chen,and X. Zhang, “Gaschecker: Scalable analysis for discovering gas-inefficient smart contracts,” IEEE Transactions on Emerging Topics inComputing, 2020.

[24] C. Barrett and C. Tinelli, “Satisfiability modulo theories,” in Hand-book of Model Checking. Springer, 2018, pp. 305–343.

[25] L. De Moura and N. Bjørner, “Z3: An efficient SMT solver,” inInternational conference on Tools and Algorithms for the Constructionand Analysis of Systems. Springer, 2008, pp. 337–340.

[26] R. Tarjan, “Depth-first search and linear graph algorithms,” SIAMjournal on computing, vol. 1, no. 2, pp. 146–160, 1972.

[27] (Jan., 2016) EIP-55. [Online]. Available: https://github.com/ethereum/EIPs/blob/master/EIPS/eip-55.md

[28] T. J. McCabe, “A complexity measure,” IEEE Transactions on soft-ware Engineering, no. 4, pp. 308–320, 1976.

[29] J. Cohen, “A coefficient of agreement for nominal scales,” Educa-tional and psychological measurement, vol. 20, no. 1, pp. 37–46, 1960.

[30] (Mar., 2018) Etherscan. [Online]. Available: https://etherscan.io/[31] I. Nikolic, A. Kolluri, I. Sergey, P. Saxena, and A. Hobor, “Finding

the greedy, prodigal, and suicidal contracts at scale,” in Proceedingsof the 34th Annual Computer Security Applications Conference. ACM,2018, pp. 653–663.

[32] P. Tsankov, A. Dan, D. Drachsler-Cohen, A. Gervais, F. Buenzli,and M. Vechev, “Securify: Practical security analysis of smartcontracts,” in Proceedings of the 2018 ACM SIGSAC Conference onComputer and Communications Security. ACM, 2018, pp. 67–82.

[33] (Aug., 2019) Mythril: Security analysis tool for EVM bytecode. .[Online]. Available: https://github.com/ConsenSys/mythril

[34] B. Jiang, Y. Liu, and W. Chan, “Contractfuzzer: Fuzzing smartcontracts for vulnerability detection,” in Proceedings of the 33rdACM/IEEE International Conference on Automated Software Engineer-ing. ACM, 2018, pp. 259–269.

[35] S. Kalra, S. Goel, M. Dhawan, and S. Sharma, “Zeus: Analyzingsafety of smart contracts,” in 25th Annual Network and DistributedSystem Security Symposium (NDSS’18), 2018.

[36] B. Kitchenham, “Procedures for performing systematic reviews,”Keele, UK, Keele University, vol. 33, no. 2004, pp. 1–26, 2004.

[37] (April., 2019) Web3.py. [Online]. Available: https://web3py.readthedocs.io/en/stable/

[38] (Jan., 2021) Generalized Linear Models in R,Part 6: Poisson Regression for Count Variables.[Online]. Available: https://www.theanalysisfactor.com/generalized-linear-models-in-r-part-6-poisson-regression-count-variables/

[39] J. Benesty, J. Chen, Y. Huang, and I. Cohen, “Pearson correlationcoefficient,” in Noise reduction in speech processing. Springer, 2009,pp. 1–4.

[40] T. Chen, Z. Li, Y. Zhang, X. Luo, A. Chen, K. Yang, B. Hu, T. Zhu,S. Deng, T. Hu et al., “Dataether: Data Exploration Framework forEthereum,” in 2019 IEEE 39th International Conference on DistributedComputing Systems (ICDCS). IEEE, 2019, pp. 1369–1380.

[41] (Mar., 2018) Oyente: An Analysis Tool for Smart Contracts.[Online]. Available: https://github.com/melonproject/oyente

[42] (Mar., 2018) Etherchain. [Online]. Available: https://www.etherchain.org/contracts/

[43] (Mar., 2018) Ethercamp. [Online]. Available: https://live.ether.camp/

[44] (July., 2019) SWC Registry: Smart Contract Weakness Classificationand Test Cases. [Online]. Available: https://smartcontractsecurity.github.io/SWC-registry/

Page 20: IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, VOL. , NO. , …

IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, VOL. , NO. , 20

[45] T. Chen, Y. Feng, Z. Li, H. Zhou, X. Luo, X. Li, X. Xiao, J. Chen, andX. Zhang, “GasChecker: Scalable Analysis for Discovering Gas-

Inefficient Smart Contracts,” IEEE Transactions on Emerging Topicsin Computing, 2020.


Recommended