Bitcoin and Blockchain: Security and Privacy

04/25/2019 ∙ by Ehab Zaghloul, et al. ∙ Michigan State University

A cryptocurrency is a decentralized digital currency that is designed for secure and private asset transfer and storage. As a currency, it should be difficult to counterfeit and double-spend. In this paper, we review and analyze the major security and privacy issues of Bitcoin. In particular, we focus on its underlying foundation, blockchain technology. First, we present a comprehensive background of Bitcoin and the preliminary on security. Second, the major security threats and countermeasures of Bitcoin are investigated. We analyze the risk of double-spending attacks, evaluate the probability of success in performing the attacks and derive the profitability for the attacker to perform such attacks. Third, we analyze the underlying Bitcoin peer-to-peer network security risks and Bitcoin storage security. We compare three types of Bitcoin wallets in terms of security, type of services and their trade-offs. Finally, we discuss the security and privacy features of alternative cryptocurrencies and present an overview of emerging technologies today. Our results can help Bitcoin users to determine a trade-off between the risk of double-spending attempts and the transaction time delay or confidence before accepting transactions. These results can also assist miners to develop suitable strategies to get involved in the mining process and maximize their profits.


I Introduction

A cryptocurrency is a decentralized online currency that was developed as an alternate means to transfer money in an unprecedented way. Existing financial systems require a centralized trusted financial institution to securely process transactions between two parties. This institution charges costly service fees that are unavoidable for banking customers. In addition to such cost burdens, delayed processing time and security issues have affected the modern-day financial industry. Certain transactions, such as funds transfer, may take days or weeks to be cleared, causing issues in cases of urgency. The modern-day financial system is also plagued with security and privacy vulnerabilities. Financial institutions employ the most advanced security techniques to protect customers. However, the sensitive information of the customer is always exposed to the financial institutions making it vulnerable to information leakage. To mitigate these security concerns, privacy risks, and inconveniences, new cryptographic protocols have been developed to allow secure and convenient asset transfer, without involving a centralized third-party.

In 2008, Satoshi Nakamoto published a white paper in which he proposed Bitcoin [1]. Bitcoin is an online Peer-to-Peer (P2P) digital cash system that does not require a trusted third-party. In Bitcoin, users possess ownership rights to virtual cryptocoins that are denoted as Bitcoins (BTC). Users generate transactions to transfer BTC, and these transactions are stored in the public ledger, the blockchain. The smallest transferable value today is known as a Satoshi, which is equivalent to one hundred-millionth of a BTC (i.e. 0.00000001 BTC).

Bitcoin transactions utilize cryptographic protocols to provide a secure process while striving to preserve the privacy of both the buyer and seller. The transactions are stored in a blockchain [2, 3, 4, 5] to limit inherent issues of digital media such as double-spending [6]. A blockchain is a distributed database acting as a public ledger that holds all processed transactions. It is based on a distributed consensus that allows any past and present online transaction to be verified [7].

Bitcoin transactions are released into the network and validated by the nodes as they propagate through the entire network. The validating nodes, referred to as miners, compete to mine groups of transactions into blocks and earn BTC as a reward. Mining is the process of solving a hard cryptopuzzle, referred to as the Proof-of-Work (PoW), that requires extensive computational power. The first miner to find a solution to the problem broadcasts his/her block to the network and earns the reward. The reward consists of a specified amount of newly released BTC and all the transaction fees associated with the transactions included in the block. All the other miners then accept the solution of the winning miner and append the winning block to the blockchain.

The first Bitcoin software was implemented by Satoshi Nakamoto and is known as the Bitcoin Core. This implementation is sometimes referred to as the Satoshi client and is run by most of the network nodes in Bitcoin. It is an open source project with a large developer community contributing to it. The developers follow a Bitcoin Improvement Proposals (BIP) [8] document and introduce the standards of the system. The document also contains new features and proposals for the developer community to test thoroughly before making final modifications to the software.

Following Bitcoin, many cryptocurrency systems appeared and continue to do so today. The blockchain technology is a common characteristic shared by many newly emerging cryptocurrency systems [9]. The majority of these systems are mainly clones of Bitcoin. These systems introduced only minor adjustments such as currency supply or block size within the Blockchain. Alternatively, a few systems introduce innovative concepts that offer substantial features. Examples of these features include novel consensus mechanisms or enhanced decentralized computing platforms that can provide additional functions and higher flexibility to the system.

All cryptocurrencies are traded in the online cryptocurrency marketplace. The cryptocurrency market is similar to other exchange markets such as the stock market, with various trading platforms. However, the cryptocurrency market is not regulated by a government or agency and trading occurs virtually 24/7 across the world. The nature of cryptocurrency allows transactions to occur at speeds that cannot be accomplished with fiat currency, such as the United States dollar. This results in a much more volatile market than traditional trading markets. Coin prices are continuously rising and dropping, and new cryptocurrencies consistently enter and leave the marketplace. Many coins continue to rise in value based on value demonstrated to investors. However, increased speculation in the marketplace has led to the overvaluation of many cryptocurrencies.

As of November 2017, the total market cap of the cryptocurrency market hit $246 billion [10]. This amount comes from the total valuation of almost one thousand cryptocurrencies on the market today. Compared with the $17.6 billion total market cap in 2016, the market has increased by 1,298%. This rapid growth in the new market has led to an effort to examine the role of cryptocurrency in the future.

I-A Contributions

Cryptocurrencies, particularly Bitcoin, have attracted massive and diverse attention. They are in continuous development and evolution, pushing researchers to analyze them thoroughly and constantly. Notable studies have been presented that discuss the blockchain technology and outline open issues. Their main purpose is to explore the future stability of Bitcoin from different perspectives. The study presented in [11] is one of the first studies to present an exposition of Bitcoin and some altcoins. This work focused on discussing stability properties and comparing them to those in Bitcoin, to measure its degree of stability as a system. It also briefly investigated security and privacy concerns, in addition to some alternative consensus protocols. However, the study lacks recently investigated attacks and a deep analysis behind them. It is also limited in its discussion of the alternative protocols. The survey presented in [12] is a technical analysis of Bitcoin aimed at consolidating key algorithmic features of the system. This work expanded on the study in [11] by providing an in-depth analysis in terms of security and privacy. However, the authors only briefly discuss Bitcoin storage wallets and do not delve into the underlying infrastructure that secures these wallets. The study presented in [13] is another comprehensive survey that explored various security and privacy aspects of Bitcoin, accompanied by possible countermeasures. The survey presents comparisons between various security attacks and privacy protocols. Nevertheless, as in [12], Bitcoin storage wallet types are only briefly compared, without exploring the underlying foundation that exposes their points of weakness. Another privacy-focused study, presented in [14], aims at expanding the previous works in terms of anonymity and privacy. In this study, the authors show that analyses of anonymity and privacy in Bitcoin may be classified into various classes, where each class may result in different privacy leakage outcomes. They also classify the on-going efforts to improve anonymity and privacy while discussing their potential corresponding outcomes.

While the main purpose of this paper is similar to that of the previous studies, namely to examine the potential stability of Bitcoin, we approach this goal by considering topics that were missing or only briefly covered in previous work. We delve deeper into analyzing the double-spending attacks by modeling the probability of success in multiple ways. In particular, utilizing our analysis, we present our own profitability analysis of the double-spending attacks. We reveal a break-point in time when attackers should give up on the attack since it is unlikely that they will turn a profit beyond this point (i.e. the time when the cost is greater than the revenue). We present a trade-off between the waiting time before accepting a transaction and the profits/losses of the attackers. This may help maximize the confidence of the users before accepting transactions. In addition to this, we thoroughly analyze the infrastructure of Bitcoin storage wallets. Our discussion presents the key algorithmic features introduced by each wallet type in order to counter the different potential threats. The main purpose is to inform users of the trade-offs between different types of wallets from a cryptographic perspective.

The major contributions of this paper can be summarized as follows:

  1. We provide a comprehensive explanation of the primary components of Bitcoin, presented in a sequential and logical order. The main purpose is to equip the readers with the necessary background on Bitcoin to consolidate their understanding of the system. We aim at providing sufficient background for the readers to build a solid understanding that can be utilized when exploring similar systems. This background will also help the readers follow the concerns and issues discussed later in this paper.

  2. We delve thoroughly into the analysis of double-spending attacks. We first show that the probability of success of performing double-spending attacks can be modeled using two distinct probabilistic models. We show that both models result in a similar outcome. Next, using these probabilistic models, we present a profitability analysis on performing double-spending attacks. The main purpose of this analysis is to reflect the trade-off between the waiting time before accepting a transaction versus the profits/losses of the attackers. We also aim at reflecting that attackers with 51% computational power or more will continue to profit indefinitely.

  3. We present fundamental network security and privacy concerns. The purpose of this analysis is to expand the knowledge of readers on major security and privacy concerns that threaten the stability of systems running the blockchain technology. Our target is to help the reader realize the major threats to such systems from a security and privacy perspective.

  4. We dive deeply into the exploration of Bitcoin storage wallets. We first classify wallets based on their underlying infrastructure and methods of PKI pair generation. We aim at presenting the cryptographic primitives related to each type of wallet. Next, we classify wallets based on installation environments and then further classify them based on functionality. We strive to help the readers understand the different classes of wallets, their corresponding security risks, and the best practices to secure their cryptocoins.

The interest in blockchain continues to grow aggressively. It has already attracted a wide range of audiences such as governments, enterprises, health-care, and many more. We realize that in order for blockchain to sustain its success and for these interested entities to adopt it, we must educate a wider range of audience, which could include: (i) researchers at the beginning of the line that wish to expand on research in this area and (ii) skeptical entities and individuals that wish to adopt the technology and wish to learn more about it. We aim at putting together a comprehensive study that explores blockchain technology from multiple angles and filling in the gaps of previous studies.

I-B Organization

The rest of the paper is organized as follows. In Section II, we briefly review previous digital cash systems and blockchain infrastructures. In Section III, we provide a comprehensive background review on Bitcoin outlining its building blocks and protocols. Next, in Section IV, we evaluate double-spending attacks and present our profitability analysis. Following that, in Section V, we assess the major network security attacks of the Bitcoin network. In Section VI, we analyze the security issues in the storage wallets used by Bitcoin today. In Section VII, we investigate the privacy protocols of Bitcoin that aim to limit the linkage problem. In Section VIII, we review protocols and alternative consensus algorithms implemented in emerging cryptocurrencies, outlining the security and privacy advantages and limitations. Finally, in Section IX, we conclude our study, summarize the lessons learned, and outline future research directions.

II Related Work

In this section, we discuss the history of digital cash systems. We also introduce the evolution of the blockchain technology.

II-A Digital Cash Systems

Research in digital cash dates back to the early 1980s [15]. In 1990, DigiCash Inc., an electronic cash corporation, made an initial attempt to provide a cryptocurrency system [16]. DigiCash transactions involved cryptographic protocols and aimed at providing its users with anonymity. However, despite being attractive initially, it failed in 2000 as the Internet bubble popped. David Chaum, its founder, believes the failure of DigiCash was tied to its technology, which preceded the maturation of e-commerce on the Internet. Another reason for its failure was that it required the cooperation of banks to process a transaction, which made DigiCash a centralized system.

In 1998, a decentralized digital cash system, b-money [17], was introduced by Wei Dai. B-money is an anonymous and distributed digital cash system that aimed at providing untraceable transactions. One major advantage of this system is that it eliminates the need for a central authority. However, b-money was just an initial and incomplete idea. It did not properly tackle some of the key issues including double-spending attacks.

In early 2000, Digital Gold Currency (DGC), a currency backed by gold, gained some popularity. DGC is considered to be a second-generation digital currency. It is issued by some companies that enable users to pay each other in units similar to those of gold bullion. Examples include iGolder, gbullion, and e-Gold. Although DGC seemed to have a bright future, it lost popularity due to its centralized structure. Politics may have also played a role in its declining popularity. Companies that provided DGC were forced to shut down by the federal government due to their inability to comply with the government regulations [18].

In 2003, Second Life [19], an online virtual world, introduced a digital currency referred to as the Linden dollar. The Linden dollar is exchangeable for fiat currencies. Second Life users are able to use this currency for direct transactions. However, similar to its preceding digital currencies, the Linden dollar is a centralized digital currency that is controlled by its creator, Linden Labs [20]. Moreover, its price is volatile and unstable making it a risky currency to own.

II-B Blockchain History

In 1991, the first secure blockchain was proposed by Stuart Haber and W. Scott Stornetta [2]. Their blockchain aimed at certifying the creation or modification of a digital record by digitally time-stamping the record being processed. However, the blockchain was not efficient since each record was independently time-stamped. To improve the efficiency, Merkle trees [21] were incorporated into blockchains in 1992 [3]. They improved the efficiency by grouping multiple digital records into one block. Finally, Satoshi Nakamoto implemented the first real blockchain and used it as the core technology for the Bitcoin cryptocurrency system.

III Understanding Bitcoin

In this section, we will present the major building blocks and protocols of Bitcoin. We first present the Bitcoin network, Bitcoin transaction, and Bitcoin transaction standards. Next, we explain how Merkle trees are utilized to group transactions into blocks and stored in the blockchain. Following that, we discuss the Bitcoin mining process, mining pools, payment methods, and methods of developing alternative cryptocurrencies.

III-A The Bitcoin Network

Bitcoin runs over a P2P network. The main advantage of using a P2P network is the agile movement of data for all nodes to achieve consensus. In contrast to the typical P2P network used to share data files between interested peers, Bitcoin utilizes the network to rapidly broadcast data among all the connected nodes. This process is known as flooding and continues until all nodes within the network receive the broadcast data.

It is important to differentiate between the terms node and peer of a P2P network. A node is a network entity that is connected to one or multiple other similar nodes. The nodes it is directly connected to are referred to as its peers. A node propagates data to indirectly connected nodes by forwarding it to its peers, which forward it in the same manner until the data reaches every connected node.
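The flooding process described above can be sketched as a breadth-first traversal of the peer graph. This is a minimal illustration rather than the Bitcoin Core relay logic: the `flood` function, the `peers` dictionary, and the five-node topology are all hypothetical names and data chosen for the example.

```python
from collections import deque

def flood(peers, origin):
    """Return the hop count at which each node first receives data sent by origin.

    peers maps a node name to the list of nodes it is directly connected to.
    """
    hops = {origin: 0}
    queue = deque([origin])
    while queue:
        node = queue.popleft()
        for peer in peers[node]:
            if peer not in hops:              # a node relays new data only once
                hops[peer] = hops[node] + 1
                queue.append(peer)
    return hops

# Hypothetical topology; each link is listed in both directions.
peers = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}
hops = flood(peers, "A")
```

Node E is not a peer of A, yet it still receives the data after three hops because B, C, and D each forward it to their own peers.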

In the Bitcoin network, data being flooded includes IP addresses of the nodes, newly generated transactions, and blocks of verified transactions that extend the blockchain. Peers share IP addresses of other nodes that they are connected to or have discovered from their peer nodes. The aim behind sharing IP addresses is to allow peers in the network to discover and connect to more nodes resulting in a random network topology. Newly generated transactions are broadcast through the network to rapidly publicize their occurrence to all connected nodes. Miners compete to mine these transactions into blocks. The winning miner broadcasts the block to all the connected nodes to extend and update their version of the blockchain.

Nodes in the Bitcoin P2P network are defined based on their roles. The main duties are summarized as transaction generation, block/transaction routing, block/transaction verification, and transaction mining. Block/transaction routing is performed by all nodes.

A node that can perform all functions is referred to as a full node. It consistently keeps a copy of the full blockchain allowing it to verify any transaction without needing assistance of other connected nodes. It also possesses a BTC wallet that can generate transactions and compute the possessed value of BTC by the node. Moreover, a full node possesses computational resources to compete in the mining competition. Nodes that do not store a full copy of the blockchain are referred to as Simplified Payment Verification (SPV) nodes or lightweight nodes. These nodes require assistance from full nodes when verifying a transaction. Full nodes feed the SPV nodes with the required information from the blockchain necessary to complete the transaction verification.

Some nodes may only perform one particular function. Ones that are engaged in the mining process are referred to as mining nodes while others that generate transactions are referred to as wallets.

In most Bitcoin software implementations, all nodes are treated equally and can be uniquely identified by their IP addresses. Using these addresses, peers establish Transmission Control Protocol (TCP) connections with one another. Each node can choose whether to connect to the network using a public or private IP. A node that utilizes a public IP is accessible over the Internet by any connected node while one with a private IP is only accessible by nodes within its private network. By default, a node with a public IP address is granted 8 outbound connections and 117 inbound connections, resulting in a total of 125 connections. On the other hand, a node with a private IP address is granted only 8 outbound connections. An outbound connection is initiated by the node itself when it requests connecting to a discoverable node while an inbound connection is initiated by other nodes in the network that desire connecting to the node.

For explanation purposes, we define the node that initiates a connection as the client and the node that waits for an incoming connection as the server. Both nodes engage in a TCP handshake by exchanging network packets defined as SYN and ACK. The client initiates a connection request by sending a SYN packet addressed to the IP address of the server. By default, the server listens on port 8333 for incoming packets. If the server accepts the SYN packet, it responds with an ACK packet and its own SYN packet, both addressed to the IP address of the client. Finally, the client responds by sending an ACK packet addressed to the IP address of the server and the connection is established. The connection enables symmetric communication allowing the client and server to exchange data bidirectionally. The connection is lost if peers do not communicate for a specified idle time. To reconnect, peers engage in a new TCP handshake.

As discussed previously, a node shares with its peers a list of IP addresses that it has learned as a result of being connected to the network. Each node stores its list in two separate tables: a tried table and a new table. The tried table of a node stores IP addresses that the node has established connections with, while its new table stores IP addresses that it has discovered but has not yet attempted to connect to. When a node wishes to share IP addresses with its peers, it randomly selects IP addresses from both tables and sends them in addr messages. An addr message can contain up to 23% of the total IP addresses stored in both tables, with a maximum of 1000 IP addresses. To initiate sharing, a node sends a getaddr message to its peers requesting them to share their lists of IP addresses. The peers then respond with an addr message. In some cases, sharing IP addresses is unsolicited: a node voluntarily sends addr messages to its peers without receiving a getaddr message.
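As a rough sketch of the address-sharing rule above, the snippet below selects the addresses for one reply to a request. It assumes that the 23% share and the 1000-address cap apply together, so the smaller of the two is used. The function name `build_addr_payload` and the sample address ranges are hypothetical and are not taken from any Bitcoin implementation.

```python
import random

MAX_ADDR_PER_MESSAGE = 1000   # cap stated in the text
MAX_SHARE_PERCENT = 23        # share of known addresses stated in the text

def build_addr_payload(tried, new, rng=random):
    """Pick the IP addresses a node would place in one addr message."""
    known = list(tried) + list(new)
    # Assumption: the message is limited by whichever bound is smaller.
    limit = min(MAX_ADDR_PER_MESSAGE, len(known) * MAX_SHARE_PERCENT // 100)
    return rng.sample(known, limit)

# Hypothetical tables holding 100 addresses each.
tried = [f"10.0.0.{i}" for i in range(1, 101)]
new = [f"10.0.1.{i}" for i in range(1, 101)]
payload = build_addr_payload(tried, new, random.Random(0))
```

With 200 known addresses the reply carries 46 of them. The 1000-address cap only takes effect once a node knows more than roughly 4,350 addresses.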

A node that wishes to connect to the Bitcoin network for the first time cannot obtain IP addresses by this method. Bootstrapping is mainly achieved by communicating with a Domain Name Server (DNS) seeder. The node sends a DNS query requesting a list of active IP addresses. If the DNS fails to respond with an appropriate list of active IP addresses, the node can still connect to the network by using a hard-coded list of IP addresses, referred to as seeds. Once connected to any of these IP addresses, the node can then request more IP addresses from its peers by sending getaddr messages.

Nodes also relay verified transactions and blocks to their peers to reach consensus. A node begins by broadcasting an inventory (inv) message to all its peers informing them of the new transactions or blocks it has received and verified. The peer nodes check whether they are already aware of these new transactions and blocks, then respond to the node with a getdata message. The getdata message lists all the announced transactions and blocks that the peer node is not aware of. The node then responds with a transaction (tx) or block message that includes the complete transactions/blocks the peer requested. Once received, the peer validates the transactions or blocks and continues to relay them to its own peers in a similar manner. If a received transaction or block cannot be validated, it is immediately dropped and its propagation is discontinued.
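The announce, request, deliver exchange described above can be illustrated with a toy model. The `Node` class, its method names, and the `relay` helper are hypothetical. The sketch ignores networking and message serialization and only tracks which identifiers each side knows.

```python
class Node:
    """Toy node that stores verified items (transactions or blocks) by identifier."""

    def __init__(self, name):
        self.name = name
        self.items = {}                      # identifier -> payload

    def make_inv(self):
        """Announce the identifiers of everything this node holds."""
        return list(self.items)

    def make_getdata(self, inv):
        """Request only the announced identifiers that are unknown here."""
        return [item_id for item_id in inv if item_id not in self.items]

    def serve(self, getdata):
        """Return the full payload for each requested identifier."""
        return {item_id: self.items[item_id] for item_id in getdata}

    def accept(self, delivered, is_valid):
        """Store items that pass validation; invalid items are dropped."""
        for item_id, payload in delivered.items():
            if is_valid(payload):
                self.items[item_id] = payload

def relay(sender, receiver, is_valid):
    """Run one announce/request/deliver round and return the request that was sent."""
    request = receiver.make_getdata(sender.make_inv())
    receiver.accept(sender.serve(request), is_valid)
    return request

def toy_validator(payload):
    # Placeholder check; a real node would execute the scripts of the item.
    return payload != "malformed"

alice, bob = Node("alice"), Node("bob")
alice.items = {"tx1": "pay 1 BTC", "tx2": "pay 2 BTC", "tx3": "malformed"}
bob.items = {"tx1": "pay 1 BTC"}
requested = relay(alice, bob, toy_validator)
```

Bob requests only `tx2` and `tx3` because he already holds `tx1`. He stores `tx2` and drops `tx3`, which fails validation, so `tx3` would not propagate further.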

III-B Bitcoin Transactions

We define a Bitcoin transaction ($TX$) as the transfer of an amount of BTC ownership rights from the wallet of the buyer to the wallet of the seller, in exchange for a product or service. BTC wallets utilize elliptic curve digital signatures to handle the transfer of ownership rights and ensure that unauthorized spending of the cryptocurrency is infeasible. Each wallet randomly generates a private key ($sk$), which is used to derive its corresponding public key ($pk$) that is shared among all users. The $pk$ is used to generate the address of the wallet needed to make payments to it, while the $sk$ is used to generate a digital signature corresponding to the $pk$ in order to claim payments made to the wallet and use them in later transactions. A $sk$ is first generated from a Cryptographically Secure Pseudo-Random Number Generator ($CSPRNG$) and its corresponding $pk$ is then calculated using the Elliptic Curve Digital Signature Algorithm (ECDSA). Calculations are performed based on the field and curve parameters defined by secp256k1 with the curve order $n$ [22] as follows

$$sk \leftarrow CSPRNG(\cdot), \quad 1 \le sk < n \tag{1}$$
$$pk = sk * G \tag{2}$$

where $G$ is a generator of the elliptic curve and $*$ represents elliptic curve multiplication.
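To make Equations (1) and (2) concrete, the following is a minimal pure-Python sketch of key-pair generation. The constants are the published secp256k1 domain parameters. The function names (`point_add`, `scalar_mult`, `generate_key_pair`, `on_curve`) are hypothetical. The code assumes Python 3.8 or later for modular inverses via `pow(x, -1, P)` and is not constant-time, so it is suitable for illustration only and not for handling real keys.

```python
import secrets

# secp256k1 domain parameters (curve y^2 = x^3 + 7 over the prime field F_P).
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (
    0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
    0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8,
)

def point_add(a, b):
    """Add two curve points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None                                   # a + (-a) = infinity
    if a == b:
        slope = 3 * x1 * x1 * pow(2 * y1 % P, -1, P) % P      # tangent line
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P     # chord line
    x3 = (slope * slope - x1 - x2) % P
    y3 = (slope * (x1 - x3) - y1) % P
    return (x3, y3)

def scalar_mult(k, point):
    """Compute k * point with the double-and-add method."""
    result, addend = None, point
    while k:
        if k & 1:
            result = point_add(result, addend)
        addend = point_add(addend, addend)
        k >>= 1
    return result

def generate_key_pair():
    sk = secrets.randbelow(N - 1) + 1    # Equation (1): random integer in [1, N-1]
    pk = scalar_mult(sk, G)              # Equation (2): pk = sk * G
    return sk, pk

def on_curve(point):
    x, y = point
    return (y * y - x * x * x - 7) % P == 0

sk, pk = generate_key_pair()
```

Deriving `pk` from `sk` takes a few hundred point operations, whereas recovering `sk` from `pk` requires solving the elliptic curve discrete logarithm problem. This asymmetry is why the public key can be shared freely.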

The BTC wallet of the buyer assembles a transaction using the Unspent Transaction Outputs (UTXOs) of the buyer stored in the blockchain. A UTXO specifies an amount of BTC claimed earlier by the buyer as a result of a previously processed transaction. A simple BTC transaction is shown in Fig. 1.

In the figure, we show that a transaction can consist of multiple inputs and outputs. One output represents the transfer of ownership rights of a certain amount of BTC from the wallet of the buyer to the wallet of the seller. A second output represents redirecting ownership rights of the BTC change amount back to the wallet of the buyer. A distinct locking script is attached to each of these outputs which specifies conditions that must be met in order to grant ownership rights. For example, the locking script attached to the output of the seller must include the public key $pk$ of the seller needed to generate his/her wallet address. This ensures that the payment is made to the wallet of the seller and only he/she is granted access to it with his/her corresponding private key $sk$. Using $sk$, the seller can generate a digital signature that corresponds to the $pk$ associated with the locking script, and hence claim the output.

The inputs represent unspent transaction outputs claimed by the buyer from previous transactions. When a buyer decides to use a specific output from a previous transaction as an input to a new transaction, the buyer must provide proof that he/she still possesses ownership rights and did not previously spend them in another transaction. This is done by attaching an unlocking script to each input. The unlocking script solves the locking script that was associated with the output from the previous transaction. Specifically, the unlocking script is a digital signature produced by the private key $sk$ of the buyer that corresponds to the public key $pk$ associated with the locking script of a UTXO. A valid unlocking script is legitimate proof of continuous possession of ownership rights to the BTC being used as input. As a result, BTC can be viewed as a chain of digitally signed transactions where ownership rights are transferred from one owner to the other by digitally signing them.

Fig. 1: A single transaction with multiple inputs and outputs.

A transaction must include at least one input and may include multiple outputs to simultaneously pay different sellers from the total value associated with the inputs. The locking script of each output specifies the conditions that its claimer must meet. However, the total BTC value of the inputs must always be equal to or greater than the total value of the outputs. In the event that the total value of the inputs is greater than the total value of the outputs, the difference, known as the transaction fee, is rewarded to the miner that adds the transaction into a block attached to the blockchain. To improve the chance of timely processing, most available wallets today derive the transaction fee in relation to the size of the transaction. In other words, the transaction fee increases with the size of the transaction.
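The relationship between inputs, outputs, and the transaction fee can be written out as a short calculation. The sketch below uses integer Satoshi amounts to avoid floating-point rounding. The function name `transaction_fee` and the sample values are hypothetical.

```python
SATOSHI_PER_BTC = 100_000_000

def transaction_fee(input_values, output_values):
    """Return the implied fee in Satoshi, or raise if the transaction is malformed."""
    if not input_values:
        raise ValueError("a transaction needs at least one input")
    fee = sum(input_values) - sum(output_values)
    if fee < 0:
        raise ValueError("total outputs exceed total inputs")
    return fee

# Hypothetical example: two inputs fund a 1 BTC payment plus change to the buyer.
inputs = [70_000_000, 50_000_000]        # 0.7 BTC + 0.5 BTC
outputs = [100_000_000, 19_990_000]      # 1 BTC to the seller, 0.1999 BTC change
fee = transaction_fee(inputs, outputs)
```

Here the inputs total 1.2 BTC and the outputs total 1.1999 BTC, so the miner collects the remaining 10,000 Satoshi (0.0001 BTC). If the wallet omitted the change output, the entire 0.2 BTC difference would go to the miner as the fee.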

The wallet of the user combines all the transaction inputs/outputs and their corresponding scripts into one digital message $M$. It then applies the Secure Hash Algorithm (SHA-256) to $M$ twice to increase security before releasing it into the network. The 32-byte digest representing the identity of the transaction ($TXID$) is generated as follows

$$TXID = SHA256(SHA256(M)) \tag{3}$$
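The double hashing in Equation (3) can be reproduced with the Python standard library. In this minimal sketch, the `message` bytes are a placeholder and not a valid serialized Bitcoin transaction, and the helper name `double_sha256` is hypothetical.

```python
import hashlib

def double_sha256(message: bytes) -> bytes:
    """Apply SHA-256 twice and return the 32-byte digest."""
    return hashlib.sha256(hashlib.sha256(message).digest()).digest()

# Placeholder bytes standing in for the serialized transaction message.
message = b"placeholder serialized transaction"
txid = double_sha256(message)
print(txid.hex())
```

Block explorers and most Bitcoin software display this digest with its bytes in reverse order, so `txid[::-1].hex()` is closer to the conventional presentation.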

A newly generated transaction assembled by the BTC wallet of a buyer is released into the Bitcoin network to be validated and stored in the blockchain. The generating node transfers the transaction to its peers, which flood it to the rest of the network nodes. Each node that receives it audits the inputs by executing the scripts associated with them. This audit involves checking whether the execution of the unlocking script integrated by a buyer within each input matches its corresponding locking script defined in the previous transaction. If a match exists, the node relays the transaction to its peers and temporarily places it in its transaction pool until it is chosen to be mined. Otherwise, the transaction is dropped.

In some cases, transactions are not flooded into the network in the same order they are generated. As a result, during the audit, a node might not be aware of some inputs of a transaction (child transaction) referring to the outputs of other transactions (parent transactions). Instead of immediately rejecting the transaction and considering its inputs as invalid, the node can temporarily place it into an orphan transaction pool. If the parent transaction shows up, the inputs of the child transaction become valid and it can be transferred to the transaction pool.
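The interplay between the transaction pool and the orphan pool can be sketched as follows. This is a simplified model under stated assumptions: transactions are plain dictionaries with hypothetical fields (`txid`, `inputs`, `n_outputs`), script execution and double-spend checks are omitted, and the class name `TransactionPools` is invented for the example.

```python
class TransactionPools:
    """Toy model of a node's transaction pool and orphan transaction pool."""

    def __init__(self, confirmed_outputs):
        self.known_outputs = set(confirmed_outputs)   # spendable (txid, index) pairs
        self.pool = {}                                # txid -> tx, ready to be mined
        self.orphans = {}                             # txid -> tx, waiting for a parent

    def _inputs_known(self, tx):
        return all(ref in self.known_outputs for ref in tx["inputs"])

    def receive(self, tx):
        if self._inputs_known(tx):
            self._admit(tx)
        else:
            self.orphans[tx["txid"]] = tx             # parent has not arrived yet

    def _admit(self, tx):
        self.pool[tx["txid"]] = tx
        for index in range(tx["n_outputs"]):
            self.known_outputs.add((tx["txid"], index))
        # A newly admitted parent may make waiting children valid.
        for orphan in list(self.orphans.values()):
            if orphan["txid"] in self.orphans and self._inputs_known(orphan):
                del self.orphans[orphan["txid"]]
                self._admit(orphan)

pools = TransactionPools(confirmed_outputs=[("confirmed", 0)])
parent = {"txid": "parent", "inputs": [("confirmed", 0)], "n_outputs": 2}
child = {"txid": "child", "inputs": [("parent", 1)], "n_outputs": 1}

pools.receive(child)                                  # arrives before its parent
orphans_after_child = set(pools.orphans)
pools.receive(parent)                                 # parent arrives; child is promoted
```

After the child arrives it sits alone in the orphan pool. Once the parent is received, both transactions move into the transaction pool and the orphan pool is empty.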

III-C Bitcoin Transaction Standards

Currently, there are five Bitcoin transaction standards and a few non-standard transactions. All transaction types are generated with a stack-based scripting language that is processed from left to right. A script consists of a list of instructions that must be executed in the correct order to grant an individual the right to spend the BTC within a transaction. The list of standards is described below.

Pay to Public Key Hash (P2PKH):  This standard transaction is the most used type. The locking script within each output of a transaction holds the public key hash (serving as a Bitcoin address) of the seller that will claim the BTC amount included. In other words, the locking script defines a condition that the seller must possess the specific private key $sk$ corresponding to the public key hash to claim the output. Once claimed by the seller, the output becomes a UTXO owned by the seller. In order for the seller to use this specific UTXO as an input to a future transaction, the seller must attach a valid unlocking script to it. The unlocking script includes the public key $pk$ of the seller and a digital signature generated by his/her $sk$ that corresponds to the public key hash associated with the locking script of the previous transaction output.
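The left-to-right, stack-based evaluation of a P2PKH script pair can be illustrated with a toy interpreter. The opcode names follow the conventional P2PKH template (`OP_DUP OP_HASH160 <hash> OP_EQUALVERIFY OP_CHECKSIG`), but everything else is an assumption made for the sketch. The hash is a stand-in (SHA-256 truncated to 20 bytes, not the RIPEMD160(SHA256(x)) that Bitcoin uses), and signature verification is delegated to a caller-supplied function (`toy_check_sig` here) instead of real ECDSA.

```python
import hashlib

def hash160_stand_in(data: bytes) -> bytes:
    """Stand-in 20-byte hash; Bitcoin itself uses RIPEMD160(SHA256(data))."""
    return hashlib.sha256(data).digest()[:20]

def run_script(script, check_sig):
    """Execute script items left to right; bytes are pushed, strings are opcodes."""
    stack = []
    for item in script:
        if isinstance(item, bytes):
            stack.append(item)
        elif item == "OP_DUP":
            stack.append(stack[-1])
        elif item == "OP_HASH160":
            stack.append(hash160_stand_in(stack.pop()))
        elif item == "OP_EQUALVERIFY":
            if stack.pop() != stack.pop():
                return False                      # public key does not match the hash
        elif item == "OP_CHECKSIG":
            pub_key = stack.pop()
            signature = stack.pop()
            stack.append(b"\x01" if check_sig(signature, pub_key) else b"")
        else:
            raise ValueError(f"unsupported opcode: {item}")
    return bool(stack) and stack[-1] == b"\x01"

# Hypothetical key material; real scripts carry an ECDSA signature and public key.
PUB_KEY = b"seller-public-key"
SIGNATURE = b"seller-signature"

def toy_check_sig(signature, pub_key):
    # Placeholder for ECDSA verification over the spending transaction.
    return signature == SIGNATURE and pub_key == PUB_KEY

locking = ["OP_DUP", "OP_HASH160", hash160_stand_in(PUB_KEY),
           "OP_EQUALVERIFY", "OP_CHECKSIG"]
unlocking = [SIGNATURE, PUB_KEY]

valid = run_script(unlocking + locking, toy_check_sig)
wrong_key = run_script([SIGNATURE, b"someone-else"] + locking, toy_check_sig)
```

The unlocking script runs first and leaves the signature and public key on the stack. The locking script then checks that the public key hashes to the committed value before the signature is verified. With the matching key `valid` is `True`, while `wrong_key` is `False` because the hash comparison fails.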

Pay to Public Key:  The intent behind this standard transaction is to simplify the P2PKH standard. Rather than associating the public key hash within the locking script of the output, the public key itself is used. As a result, the validation process is simpler. The digital signature of the seller, generated with a private key, can be verified directly against the associated public key.

Multi-signature (MultiSig):  In this standard transaction, a combination of keys is required to authorize an output claim. The locking script of a transaction output is associated with a number ($n$) of public keys. In order for an individual to claim the output, the individual must possess $m$-of-$n$ private keys that correspond to the public keys. This type of transaction can increase security and can be used in scenarios which require more than one user to be present in order to claim and spend BTC. However, as the number of public keys associated with the transaction output increases, the size of the transaction also increases. As a result, these transactions occupy more space in the transaction pool and therefore require more memory. As discussed previously, larger transactions also require larger transaction fees.

Pay to Script Hash (P2SH):  This standard transaction was introduced to resolve the complex issues caused by MultiSig transactions. The transaction is as simple as a P2PKH transaction. Rather than associating the entire locking script, which includes multiple public keys, with a transaction output, a double hash computation is applied to the entire script, specifically $\mathrm{RIPEMD160}(\mathrm{SHA256}(\text{script}))$. The result is a 20-byte digest that is attached to the locking script instead of the entire original script. In order to use the output from this transaction as an input to another transaction, the spender creates an unlocking script that holds signatures from $m$-of-$n$ private keys and the original script that was cryptographically hashed earlier. In that way, sufficient information is available in the locking and unlocking scripts to validate the UTXO for spending. In addition, the buyer no longer has to worry about generating large transactions which might require hefty transaction fees to process. Instead, only the seller is required to provide the unlocking script when he/she wishes to spend the output in a new transaction.

Data Output:  This standard transaction is intended to store arbitrary data on the blockchain rather than transfer BTC from a buyer to a seller. In the Bitcoin community, many members believe that such transactions are abusive to the system since they place a burden on the network nodes to process transactions that do not carry BTC. However, such transactions exist and allow 40 bytes of data to be stored per transaction. These transactions are un-spendable and are therefore not stored in the UTXO set.

Non-Standard:  A very small percentage of transactions are processed under non-standard transactions. Non-standard transactions use more sophisticated scripts that strive to provide higher complexity and security. In some cases, these transactions might even be the result of bugs or mistakes resulting in loss of BTC.

III-D Merkle Trees

Validated transactions are grouped into blocks which are then mined and stored in the blockchain. A single block can contain multiple transactions up to the block size limit. Merkle trees, sometimes referred to as hash trees, are utilized to cluster multiple transactions in one block.

A Merkle tree is a tree data structure generated in a bottom-up approach that can efficiently summarize and verify the integrity of the transactions being combined. Starting from the leaf nodes which are hashes of the original data, each non-leaf node is generated as a computation of its respective children nodes. For a single non-leaf node, all its children nodes are concatenated then hashed to produce a single digest that represents the node in the tree. This approach continues until a single node is generated which is defined as the root node.

Bitcoin utilizes a binary Merkle tree in which each non-leaf node has exactly two children. It applies a double hash computation when generating nodes. The leaf nodes used to construct the tree are the identities generated for each transaction as discussed in equation (3).

In a binary Merkle tree, each row within the tree consists of an even number of nodes, except the root node. In the case where a row consists of an odd number of nodes, a replica of the last node is reproduced to even out the number of nodes in that row. To better comprehend the construction of the binary Merkle tree, consider a block that consists of five transactions, $Tx_1, Tx_2, \ldots, Tx_5$. Each one of these transactions has already been validated by the nodes and an identity for each transaction has been generated as discussed in equation (3). We denote the corresponding identities as $ID_1, ID_2, \ldots, ID_5$, where each identity represents a leaf node in the tree. In this example, the number of nodes at the leaf node level is odd, therefore a replica of the fifth identity is generated, $ID_6 = ID_5$. Next, the double hash computation is applied to the concatenation of each two identities to generate the parent non-leaf nodes of the Merkle tree as follows

$$H_{12} = \mathrm{SHA256}(\mathrm{SHA256}(ID_1 \,\|\, ID_2)) \tag{4}$$
$$H_{34} = \mathrm{SHA256}(\mathrm{SHA256}(ID_3 \,\|\, ID_4)) \tag{5}$$
$$H_{56} = \mathrm{SHA256}(\mathrm{SHA256}(ID_5 \,\|\, ID_6)) \tag{6}$$

where $\|$ is the concatenation of two identities.

As shown in the previous equations, an odd number of non-leaf nodes is generated at that level. To even it out, we replicate $H_{56}$ to produce $H_{78}$ as

$$H_{78} = H_{56} \tag{7}$$

Using the resulting digests we can generate the following level of non-leaf nodes as

$$H_{1234} = \mathrm{SHA256}(\mathrm{SHA256}(H_{12} \,\|\, H_{34})) \tag{8}$$
$$H_{5678} = \mathrm{SHA256}(\mathrm{SHA256}(H_{56} \,\|\, H_{78})) \tag{9}$$

Finally, the 32-byte root node is derived as

$$R = \mathrm{SHA256}(\mathrm{SHA256}(H_{1234} \,\|\, H_{5678})) \tag{10}$$

Fig. 2 represents the complete construction of the Merkle tree for this example. The dotted nodes represent the replicated nodes that are added to even out the odd rows. The root node, R, representing the summary of all transactions is placed into the block header of a block to be mined and chained to the blockchain.

Fig. 2: A Merkle tree within a block.
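The bottom-up construction described above, including the replication of the last node in odd rows, can be sketched in a few lines. This is an illustrative sketch: the function names are hypothetical, leaves are taken to be 32-byte transaction identities, and byte-order conventions used by real Bitcoin software are ignored.

```python
import hashlib


def double_sha256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root(leaves):
    """Compute the root of a binary Merkle tree from transaction identities."""
    if not leaves:
        raise ValueError("a block needs at least one transaction")
    level = list(leaves)
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # replicate the last node of an odd row
        # Each parent is the double hash of its two concatenated children.
        level = [double_sha256(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]
```

For the five-transaction example, the loop runs three times (rows of 6, 4 and 2 nodes after padding) before a single 32-byte root remains.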

The use of Merkle trees is more common in SPV nodes since they do not store a copy of the full blockchain. When an SPV node needs proof of the existence of a transaction within a block, it turns to a full node for assistance. The full node will generate a Merkle path whose length grows as $\log_2(N)$, where $N$ represents the total number of transactions in the tree. Using the Merkle path as an authentication path, the SPV node can prove the existence of a transaction within the tree. This proof of existence method is considered to be efficient since it only requires on the order of $\log_2(N)$ hash computations.
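A sketch of how a full node might build a Merkle path and how an SPV node might check it is given below. The function names and the `(sibling, sibling_is_right)` encoding of the path are assumptions made for illustration; real clients use a different wire format.

```python
import hashlib


def double_sha256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_path(leaves, index):
    """Full-node side: return (path, root) for the leaf at `index`.

    The path is a list of (sibling_hash, sibling_is_right) pairs, one per
    tree level, ordered from the leaf row upwards.
    """
    path = []
    level = list(leaves)
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])        # replicate the last node of an odd row
        sibling = index ^ 1                # the neighbour within the same pair
        path.append((level[sibling], sibling > index))
        level = [double_sha256(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
        index //= 2
    return path, level[0]


def verify_path(leaf, path, root):
    """SPV side: recompute the root using one hash per path element."""
    digest = leaf
    for sibling, sibling_is_right in path:
        if sibling_is_right:
            digest = double_sha256(digest + sibling)
        else:
            digest = double_sha256(sibling + digest)
    return digest == root
```

For a block of five transactions the path holds three hashes, so the SPV node performs three double-hash computations instead of rebuilding the whole tree.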

III-E Blockchain

The blockchain is a public ledger that stores all previous transactions since the creation of Bitcoin. It provides its users with transaction confirmations to track ownership rights of BTC. As new transactions are processed, the blockchain is extended. It consists of blocks $B_0, B_1, \ldots, B_n$, each carrying a set of validated transactions, where $B_0$ represents the first block and $B_n$ represents the most recent block attached to the blockchain. Blocks are linked back-to-back, with each one referencing its previous block to form the complete blockchain. To reference a block, a unique 32-byte identity is generated for each block $B_i$ by applying $\mathrm{SHA256}(\mathrm{SHA256}(\cdot))$ to the block header. An identity is referred to as the block hash.

The head of the blockchain is denoted as $B_0$ and is defined as the genesis block. $B_0$ differs from all the other blocks as it does not reference any previous block. At the launching stage of the system, $B_0$ was a stand-alone block waiting for the system to initiate a new mined block to be chained to it.

Each block consists of two parts: a header and a body. Each header incorporates the block hash of its predecessor block in the chain. The header also consists of a difficulty target, nonce, and a time-stamp which are discussed in more detail in the following subsection. The body carries all the leaf nodes and non-leaf nodes of the Merkle tree, excluding the Merkle root, which is incorporated in the header. This design makes it infeasible to retroactively alter records within any block of the blockchain. Any modification to one block will require adjusting all the subsequent chained blocks.
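The reason a modification to one block forces changes to every later block can be seen in a small sketch. The `Header` layout below is a simplified, hypothetical one (it omits fields such as the difficulty target and uses fixed 4-byte encodings); only the chaining through the previous block hash matters here.

```python
import hashlib
from dataclasses import dataclass


def double_sha256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


@dataclass
class Header:
    prev_hash: bytes    # block hash of the predecessor
    merkle_root: bytes  # summary of the transactions in the body
    timestamp: int
    nonce: int

    def block_hash(self) -> bytes:
        payload = (self.prev_hash + self.merkle_root
                   + self.timestamp.to_bytes(4, "little")
                   + self.nonce.to_bytes(4, "little"))
        return double_sha256(payload)


def chain_is_consistent(headers) -> bool:
    # Every header must reference the hash of the header before it.
    return all(headers[i].prev_hash == headers[i - 1].block_hash()
               for i in range(1, len(headers)))
```

Changing any transaction alters the Merkle root, which alters that block's hash, which in turn breaks the `prev_hash` reference stored in the next header.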

III-F Bitcoin Mining

Bitcoin mining is the final stage to secure validated transactions and add them to the blockchain. Once a transaction is added to the blockchain, it becomes completely verified and public to all users. The transaction claimer(s) can use the embedded UTXO(s) as the input to other transactions whenever desired.

Miners begin by selecting transactions from their transaction pools that will be placed into a block where a block cannot exceed 1MB in size. A small portion of that space is specified to carry high priority transactions. Priority is based on the size and age of the transaction inputs. The rest of the block is filled with other transactions which have greater transaction fees to maximize the profit that a miner can turn if successful in mining the block first. A transaction with low or no fees will probably remain in the transaction pool of the miner until it ages and becomes a high priority transaction.

Next, the miner assembles a special transaction, known as the coinbase transaction. This transaction is a reward paying transaction to the miner in the event of winning a mining competition. It does not have any inputs and consists of a single output addressed to the wallet of the miner. The amount incorporated in the output is the reward mining fee (12.5 BTC at the time of writing) plus the sum of all transaction fees included in each transaction.
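The fee-driven part of block assembly and the resulting coinbase amount can be sketched as below. This is a simplified greedy strategy under stated assumptions: it ignores the space set aside for high priority transactions, measures fees in satoshis (1 BTC = 100,000,000 satoshis), and uses hypothetical names throughout.

```python
from typing import NamedTuple


class PoolTx(NamedTuple):
    txid: str
    size: int  # bytes
    fee: int   # satoshis


def select_transactions(pool, max_block_size=1_000_000):
    """Greedily pick transactions with the highest fee per byte that still fit."""
    chosen, used = [], 0
    for tx in sorted(pool, key=lambda t: t.fee / t.size, reverse=True):
        if used + tx.size <= max_block_size:
            chosen.append(tx)
            used += tx.size
    return chosen


def coinbase_value(chosen, subsidy=1_250_000_000):
    """Mining reward (12.5 BTC by default) plus all included transaction fees."""
    return subsidy + sum(tx.fee for tx in chosen)
```

Sorting by fee per byte rather than by absolute fee reflects that block space, not transaction count, is the scarce resource.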

All the selected transactions along with the coinbase transaction are then combined into a Merkle tree as discussed previously. At this point, the miner has all the components needed to construct the block header of the new block except the nonce. The nonce is a value that, when combined with the rest of the block header and double hashed, produces a digest with a specific prefix of zeros in its binary representation. Searching for this value is performed in a brute-force manner and is directly correlated with the computational power available. The more computational power available, the faster a miner is able to find the correct nonce. A successful miner will then broadcast his/her proof-of-work to prove that he/she consumed computational resources in order to find the correct nonce.

The primary advantage of the proof-of-work is to make it computationally infeasible to perform Sybil attacks [23]. This process is intentionally designed to be resource-intensive to perform while efficient to verify that the work has been done. It is required that a certain number of zeros appear in the prefix of the digest as a result of applying the double SHA256 computation. The prefix determines the difficulty of finding the correct nonce. The more zeros required in the prefix of the digest, the harder it is to find the correct nonce and vice versa. The difficulty is dynamically adjusted every 2016 blocks (approximately two weeks) so that the average time it takes a miner to find a correct solution is approximately ten minutes. As the number of miners increases, the difficulty increases, and vice versa.
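The brute-force search for a nonce, and the asymmetry between finding and verifying it, can be illustrated with a short sketch. It follows the leading-zeros description used in this section; deployed Bitcoin software compares the digest against a numeric target instead, and the header bytes and function names here are placeholders.

```python
import hashlib


def double_sha256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def has_leading_zero_bits(digest: bytes, zero_bits: int) -> bool:
    # Interpret the digest as a big integer and check its top `zero_bits` bits.
    return int.from_bytes(digest, "big") >> (len(digest) * 8 - zero_bits) == 0


def mine(header: bytes, zero_bits: int, max_nonce: int = 2**32):
    """Try nonces in order until the double hash has the required zero prefix."""
    for nonce in range(max_nonce):
        digest = double_sha256(header + nonce.to_bytes(4, "little"))
        if has_leading_zero_bits(digest, zero_bits):
            return nonce, digest
    return None  # nonce space exhausted without a solution
```

Each additional required zero bit doubles the expected number of attempts in `mine`, while verification always costs a single double hash.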

The first miner to find the correct nonce to a block of transactions is rewarded a mining reward as compensation for the computational power spent. The mining reward is halved precisely every 210,000 blocks that are added to the blockchain. It is estimated to continue until the year 2140 when nearly 21 million BTC will have been released into the system. The reason for having a fixed supply of BTC is to prevent price inflation in the future.
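The halving schedule and the resulting supply cap can be checked with integer arithmetic. The sketch below assumes the initial mining reward of 50 BTC and works in satoshis (1 BTC = 100,000,000 satoshis) so that halving is an exact right shift; the function names are illustrative.

```python
HALVING_INTERVAL = 210_000
INITIAL_SUBSIDY = 50 * 100_000_000  # 50 BTC expressed in satoshis


def block_subsidy(height: int) -> int:
    """Mining reward, in satoshis, for the block at the given height."""
    halvings = height // HALVING_INTERVAL
    if halvings >= 64:
        return 0
    return INITIAL_SUBSIDY >> halvings  # integer halving, rounding down


def total_supply() -> int:
    """Sum the reward over every halving era until it reaches zero."""
    total, era = 0, 0
    while block_subsidy(era * HALVING_INTERVAL) > 0:
        total += HALVING_INTERVAL * block_subsidy(era * HALVING_INTERVAL)
        era += 1
    return total
```

Because each halving rounds down to a whole satoshi, the total comes out slightly below 21 million BTC, which is why the text speaks of nearly 21 million.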

Another incentive that encourages miners to spend their computational power to perform mining is the transaction fee. The winner is not only rewarded the mining reward but is also given all the transaction fees incorporated with all the transactions in the block. With time, the mining reward will decrease due to halving, which will demand higher transaction fees in the future to compensate for the reduced mining reward.

After a block is successfully mined, all the miners check their transaction pools to eliminate the transactions that have been included in the mined block and immediately construct a new block of transactions. The end of the mining race marks the beginning of a new one. Miners instantly begin to search for the nonce of the next block of transactions.

Simultaneously, the mined block is flooded through the network so that all the nodes can update their blockchains. The winning miner transmits the block to its peer nodes to validate it before propagating it further through the network. The peer nodes check whether the block is correctly assembled in terms of syntax and variables. The proof-of-work provided by the miner must be correct and the coinbase transaction must include the correct amount to pay the miner. If any information is invalid, the block is immediately dropped.

Quite regularly, as blocks are mined to extend the blockchain, a temporary incident, known as a fork, might occur. A fork occurs when two miners mine two different blocks at nearly the same time. As a result, both newly mined blocks are accepted to extend the blockchain. The blocks are flooded into the network and the miners update their version of the blockchain based on the block they receive first. This results in two valid versions of the blockchain, with two different paths, in the possession of the miners. However, the miners continue to extend their version of the blockchain regardless of which path they possess. Eventually, one path will grow longer than the other as mining continues. The path that grows longer is the winner and all nodes immediately discard the other path and update their blockchain to the longer one. In the literature, the blocks that are dropped are known as orphan blocks, that is, valid blocks that were part of the blockchain at some point.

III-G Bitcoin Mining Pools and Payment Methods

Although solo miners can compete in the mining process, the likelihood of a successful return is very low. This is even the case for solo miners with the most powerful computing machines. As a solution to this problem, solo miners collaborate in the mining process by joining computational power into mining pools. Together, they form a large organization with significant computational power that can compete with the other large entities. The members of the mining pool work together to find the correct nonce for a candidate block and report the result as one miner, increasing their chances of winning the competition. In the event of success, the rewards are split among the participating miners based on the contribution provided by each.

The concept of a mining pool can be compared to the lottery. Assuming individuals with the same financial capabilities, if a large group buys tickets together, the individuals within the group have a better chance of winning than a single individual buying tickets alone. If any ticket owned by the group wins the lottery, the participating individuals split the reward proportional to the amount invested by each. In a mining pool, the computational power provided by each solo miner is analogous to the amount invested by each ticket buyer.

A mining pool is managed by a pool operator who handles the entire pool server and receives a percentage of the rewards as compensation. The role of the operator is to coordinate the mining performed by all the participating miners. The operator keeps a continuously updated copy of the entire blockchain to ease the job of the participating miners. Using the updated blockchain, the operator verifies any transaction that appears in the network and places it in a candidate block for mining. By that, miners only need to worry about finding the correct nonce of the candidate block. If the mining pool wins the competition, the operator divides the rewards among the participating miners.

Reward splitting can be performed in multiple forms and varies from one mining pool to the other. As described in [24], these methods can be categorized into simple reward, score-based reward, or risk-free pay-per share reward.

Simple reward systems consist of either proportional systems or Pay-Per-Share (PPS) systems. In the proportional systems, a reward is split among the participating miners at the end of each round, where a round is the time between two consecutive successful blocks generated by the pool. The operator keeps a percentage of the reward and divides the remainder among the miners based on the shares they submit. Shares are defined as the number of hashes performed by each miner in an attempt to find the correct proof-of-work. A miner that submits $n$ shares from a total of $N$ shares submitted by all the miners in the pool receives a reward of $(1-f)\frac{n}{N}B$ BTC on average, where $B$ is the block reward and $f$ is the fraction kept by the operator. Conversely, the PPS system is a deterministic one where the miner knows in advance how much reward can be earned. The operator immediately pays each miner based on the submitted shares regardless of the mining result. In other words, a miner that submits shares receives $(1-f)pB$ BTC/share, where $p$ represents the probability of one share being the correct proof-of-work. In this system, the operator takes on the mining risk alone since the miners receive ensured payments whether or not the pool generates a block.
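The two simple reward rules can be compared numerically with the sketch below. The symbols are assumptions introduced for illustration: `n` is the number of shares a miner submitted, `N` the total shares in the round, `B` the block reward in BTC, `f` the fraction kept by the operator, and `p` the probability that a single share is a valid proof-of-work.

```python
def proportional_reward(n: int, N: int, B: float, f: float) -> float:
    """Miner's cut of one block under the proportional system."""
    if N <= 0 or not 0 <= n <= N:
        raise ValueError("need 0 <= n <= N and N > 0")
    return (1 - f) * (n / N) * B


def pps_reward_per_share(p: float, B: float, f: float) -> float:
    """Fixed payment per submitted share under the PPS system."""
    return (1 - f) * p * B


def pps_reward(n: int, p: float, B: float, f: float) -> float:
    """Total PPS payment for n shares, independent of the pool's luck."""
    return n * pps_reward_per_share(p, B, f)
```

With a 2% fee and a 12.5 BTC reward, a miner contributing 10 of 100 shares in a round earns 1.225 BTC under the proportional rule, but only if the pool actually finds the block; the PPS payment is owed either way.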

Score-based reward systems come in many forms and strive to prevent miners from pool-hopping. Pool-hopping is the practice of mining in a pool only during its good times (successfully generating blocks) and leaving it during its bad times. A pool-hopper can maximize his/her rewards at the expense of miners that remain loyal to the pool at all times. The method introduced by Slush [25] is one of the first implemented score-based systems that extends the proportional method. Rather than paying the miner an amount based on the submitted shares after each round, the miner is given a score that is proportional to his/her contribution and increases as more time elapses from the start of the round. The score is used to calculate the reward share given to the miner at the end of the round. However, this method is still susceptible to hopping since the score does not consider factors such as the mining difficulty or the hashrate of the pool. Also, in this method mining at the beginning of a round is more profitable since there are fewer shares at that time. As a result, the geometric method was introduced to address these weaknesses. This method introduced a fixed fee, a constant amount taken from the reward of each block, and a variable fee, a score granted at the beginning of each round to the operator. As time passes, the variable fee declines, making mining equally profitable throughout the entire round. Shorter rounds result in larger variable fees and vice versa. By that, there is no advantage to mining early in the round.

Another score-based method is Pay-Per-Last-N-Shares (PPLNS) that exists in different forms. In this method, the concept of rewarding miners after each round is replaced with rewarding miners that have been participating in earlier rounds, regardless of the mining result. In other words, the operator pays miners based on their contributions from previous efforts. Later on, more advanced payment systems evolved such as the Double Geometric Method (DGM). This system is a hybrid between the PPLNS and geometric system that combines advantages of both methods.

Some mining pools employ a risk-free pay-per share system. One of the first implemented systems is known as the Maximum Pay-Per Share (MPPS). It combines both the PPS and proportional systems, where each participating miner has a balance of each. If the miner submits a share, the PPS balance is incremented and when the pool successfully generates a block, the proportional balance is incremented. At pay time, the miner receives the minimum of both balances. This method protects the pool from taking the risk alone. However, this method is inconsiderate to the miners, since they will always make less whether the pool is successful or not. In addition to this, the system suffers from pool-hopping. A solution was later proposed to solve this problem in the Shared Maximum Pay-Per-Share (SMPPS) system. The miners have a PPS balance which continues to accumulate as the miners are participating. If a block is found by the pool and there are sufficient funds, the miners are paid based on their PPS balance. However, if there are no sufficient funds, miners are paid proportional to the available funds and given credit to be paid later for whatever balance that is owed.

Today, a broad range of mining pools exist that give miners a variety of options when joining pools. The question most miners would ask is which mining pool is the best to join. The answer here lies in the preferences of the miners. For example, some miners are not willing to take the risk of not getting paid in the event of being unsuccessful in generating a block and would prefer a PPS mining pool. Others might be willing to take the risk and choose a score-based system for instance, in return for larger profit.

III-H Alternative Cryptocurrencies

In literature, alternative cryptocurrencies are known as altcoins, most of which are inspired by Bitcoin. Altcoins strive to offer innovative features and/or enhanced security/privacy countermeasures in an effort to compete with Bitcoin. Their development process is based on the level of innovation and security/privacy countermeasures they present.

The simplest method to develop an altcoin is by forking the open source code of Bitcoin [26] while adding/modifying any features to it. In software development, a fork is a completely independent project that exploits a copy of the original source code. A Bitcoin fork generates an entirely new blockchain and is completely independent of Bitcoin. Namecoin [27] is the first developed Bitcoin fork that adopted all of the characteristics of Bitcoin. It also introduced an additional feature allowing users to store data within its transactions. Various Bitcoin forks have evolved latterly with more features and handled security/privacy issues. Many of these forks implemented privacy protocols to increase the anonymity of cryptocurrencies. In Section VII, we discuss notable privacy protocols that have impacted some of these altcoins.

In exceptional occasions, an altcoin can also be the result of a hardfork. A hardfork occurs when modifications are made to the original software of Bitcoin making its new transactions/blocks incompatible with those previously generated prior to the modifications. These modifications can be as simple as altering certain parameters, such as the block size, or as complex as changing major protocols, such as the consensus algorithm. In order to enforce these modifications, the majority of users/miners must upgrade their client nodes to the latest version which accommodates these changes. The users/miners that do not accept the upgrade will view the new transactions/blocks as invalid and will not accept them. As a result, the blockchain will inevitably split into two paths, one storing transactions of the original cryptocurrency and one storing transactions generated due to the modifications made, hence creating a new altcoin. Users in possession of the original cryptocurrency will automatically be granted an equivalent amount of the new altcoin to what they hold.

Bitcoin Cash is a notable example of a Bitcoin hardfork which occurred on August 1, 2017. It was the result of enforcing the proposal in [28], which called for activating Segregated Witness (SegWit) [29]. SegWit increases the transaction speed of Bitcoin by splitting the transaction into segments and removing the unlocking signatures, which are attached separately at the end. The majority of the miners accepted this proposal resulting in Bitcoin Cash. Users who possessed BTC were immediately granted an amount of BCC (the currency of Bitcoin Cash) equivalent to the BTC they possessed.

While only borrowing the concept of storing transactions in a blockchain, some altcoins have been implemented from scratch with a completely different design and purpose. These altcoins strive to provide services and security/privacy countermeasures beyond the capabilities of Bitcoin or any of its forks. They present substantial differences such as integrating enhanced consensus algorithms or utilizing private (permissioned) blockchains. In contrast to the public (permissionless) blockchain of Bitcoin, where all participating nodes are allowed to execute the consensus protocol and maintain the blockchain, a private blockchain is limited to only specific nodes. As a result, the cryptocurrency market has witnessed a considerable number of altcoins with substantial innovative features.

III-I Major Security and Privacy Issues

Cryptocurrencies are regarded as robust transacting systems designed to avoid payment fraud and provide superior user privacy. However, major issues in cryptocurrencies have been theorized that can put security and privacy in jeopardy.

First, the attacker may potentially deceive the system by spending the same coin more than once. This is known as the double spending attack. Second, cryptocurrencies may be exposed to further weaknesses through major network and storage vulnerabilities. Third, while cryptocurrencies strive to provide their users with anonymity, the current solutions could be vulnerable to linkage problems putting the privacy of the users in jeopardy.

In the following sections, we aim at highlighting major security and privacy issues that exist in Bitcoin and similar cryptocurrencies today.

IV Double-Spending Attacks

Double-spending is an attack that could be performed by malicious users attempting to deceive the system by spending the same BTC more than once. The attacker generates duplicates of the same UTXO and uses them as inputs in more than one transaction. Differentiating between the duplicated (fraudulent) copies and the original becomes an issue in a decentralized system. There is no trusted entity that verifies the legitimacy of the UTXO used as input in a transaction. The inputs of a transaction may consist of unidentifiable fraudulent BTC that have possibly been spent earlier.

The system defends against such attacks by relying on its users (miners) to validate the legitimacy of the BTC used as an input to transactions. Using the information stored in the blockchain from the previous transactions, the miners validate the inputs of any new transaction to ensure that it does not contain previously spent inputs. Once verified, the transaction is mined into a block which is attached to the blockchain. Any user that refers to the blockchain becomes aware that specific UTXO(s) have been spent earlier, making fraudulent input transactions detectable.
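The check that miners perform against previously spent inputs can be sketched with a set of unspent outputs. The class and method names below are hypothetical, scripts and fees are left out, and outputs are identified by a (transaction identity, output index) pair.

```python
class UtxoSet:
    def __init__(self):
        self.unspent = {}  # (txid, output_index) -> amount

    def add_output(self, txid, index, amount):
        self.unspent[(txid, index)] = amount

    def apply_transaction(self, txid, inputs, output_amounts):
        """Accept the transaction only if every input is still unspent."""
        if len(set(inputs)) != len(inputs):
            return False  # the same output is referenced twice in one transaction
        if any(ref not in self.unspent for ref in inputs):
            return False  # unknown or already spent input
        if sum(output_amounts) > sum(self.unspent[ref] for ref in inputs):
            return False  # outputs may not exceed the value of the inputs
        for ref in inputs:
            del self.unspent[ref]  # spent outputs leave the set
        for index, amount in enumerate(output_amounts):
            self.unspent[(txid, index)] = amount
        return True
```

Once the first of two conflicting transactions is applied, the shared input is no longer in the set, so the second one is rejected.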

To ensure that attackers cannot manipulate the blockchain in their favor, the mining process is designed to be an expensive and resource-intensive operation. To mine a block of transactions in the blockchain, the miners must provide a valid proof-of-work. An attacker that wishes to double-spend BTC must reverse a transaction that has been stored in the blockchain to reuse its inputs in another transaction. Reversing an already stored transaction in the blockchain is an extremely difficult task since it requires a significant share of the total computational power of the system.

In the rest of this section, we will analyze the double-spending attacks. We first discuss conventional methods to perform double-spending. Next, we analyze the probability and profitability of the double-spending attack and present a trade-off between the waiting time before accepting a transaction versus the profitability of the attack.

IV-A Types of Attacks

A double-spending attack comes in many forms. We discuss various techniques that can be performed.

IV-A1 Race Attack

A race attack refers to the case where a merchant accepts an unconfirmed transaction (a transaction in a transaction pool waiting to be mined and stored in the blockchain) and immediately provides the payer with a product/service before waiting for confirmation. An attacker with the intention of deceiving the merchant creates two transactions: (i) a transaction that pays the merchant an amount of BTC in return for a product/service and (ii) a fraudulent transaction that pays the same amount to the wallet of the attacker. Both transactions use the same inputs (duplicated BTC) and try to spend the same BTC. The attacker concurrently releases both transactions into the Bitcoin network. The miners consider both transactions as being valid until one of them gets stored in the blockchain. The transaction that gets stored in the blockchain is referred to as a confirmed transaction. At that point, the inputs of the stored transaction cannot be used as inputs to other transactions. Therefore, the fraudulent transaction has a chance of being verified first and added to the blockchain making the merchant-paying transaction invalid. The invalid transaction is rejected by the system and dropped from the transaction pools of miners.

To avoid a race attack, merchants must wait for the mining to be completed and the transaction to appear in the blockchain before providing the payer with the product/service. It is recommended that the merchant should wait for at least six subsequent blocks as confirmation before making the trade. In this case, the chances for an attacker to reverse a transaction are negligible, assuming that the attacker can control no more than 10% of the total computational power used in mining.
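The six-confirmation recommendation can be checked numerically. The sketch below evaluates the attacker success probability given in Nakamoto's original Bitcoin whitepaper, which may differ from the analysis developed later in this paper. Here `q` is the attacker's share of the computational power and `z` is the number of confirmations the merchant waits for; both names follow the whitepaper rather than this text.

```python
import math


def attacker_success_probability(q: float, z: int) -> float:
    """Probability that an attacker with power share q overtakes z confirmations."""
    p = 1.0 - q                     # honest share of computational power
    lam = z * (q / p)               # expected attacker progress (Poisson mean)
    total = 1.0
    for k in range(z + 1):
        poisson = math.exp(-lam)    # Poisson probability of exactly k attacker blocks
        for i in range(1, k + 1):
            poisson *= lam / i
        # Subtract the cases in which the attacker fails to catch up.
        total -= poisson * (1 - (q / p) ** (z - k))
    return total
```

For `q = 0.1` the probability drops from about 0.2 after one confirmation to roughly 0.0002 after six, which is consistent with treating six confirmations as safe against an attacker holding no more than 10% of the computational power.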

IV-A2 Finney Attack

The Finney attack was first suggested in a Bitcoin forum [30]. Similar to the race attack, the attacker performing this attack will only succeed if the merchant accepts an unconfirmed transaction. The attacker creates two transactions similar to those in the race attack and holds on to both of them. The attacker then begins mining the block containing the fraudulent transaction. If the attacker is successful in mining the block, the attacker then uses the other transaction to pay a merchant immediately in exchange for a product/service. Once the merchant makes the trade, the attacker releases the mined block which contains the fraudulent transaction into the network. Given that the block is already mined, it will be added to the blockchain immediately. As a result, the merchant-paying transaction will become invalid. In addition to this, the attacker is rewarded the mining reward for the mined block carrying the fraudulent transaction. However, the ability to independently mine a block is improbable given the resources necessary to perform the task.

IV-A3 Vector76 Attack

In contrast to the race and Finney attacks, the Vector76 attack can succeed even when the merchant waits for a single block to be mined and added to the blockchain as a confirmation. To reverse the transaction, the attacker needs to create a fork in the blockchain. Initially, the attacker creates a merchant-paying transaction and does not broadcast it to the network. Next, the attacker tries to independently and secretly mine this transaction into a block. If successful, the attacker holds onto the block until the honest miners discover another block. The attacker then releases the block into the network at the same time as the honest miners release their block, which results in a fork. Before the fork is resolved, the attacker creates a fraudulent transaction that double-spends the same BTC used in the merchant-paying transaction. The attacker then relays the fraudulent transaction to the honest miners that do not have the path of the blockchain that carries the merchant-paying transaction. These miners see the fraudulent transaction as valid and begin mining it into a block. As a result, each path of the blockchain stores one of the transactions. If the path that holds the fraudulent transaction grows longer than the other path, the double-spending attempt is successful.

IV-A4 51% Attack

The 51% attack is the largest threat to the Bitcoin system. In this attack, also referred to as the majority attack, the attacker (usually a pool of miners) controls more than half of the total computational power of the system. By controlling the majority of the power, the attacker is capable of interfering with the process of mining blocks and reversing any block of transactions. During a 51% attack, the system loses integrity since the other miners no longer have an incentive to compete in the mining process.

To better comprehend this attack, consider the case where the attacker generates a merchant-paying transaction and releases it into the network. The merchant waits for an appropriate number of confirmations before accepting the payment and making the trade. Simultaneously, the attacker secretly begins to mine a block that contains a fraudulent transaction, followed by more blocks to extend it. Since the computational power of the attacker exceeds the combined computational power of all the other miners, the attacker can mine blocks in less time. Once the merchant accepts the transaction, the attacker releases the secretly mined blocks to create a fork in the blockchain. If the fraudulent fork created by the attacker is longer than the original chain, it becomes dominant and all miners begin to extend it. As a result, the merchant-paying transaction no longer exists in the blockchain.

This attack represents the biggest threat to Bitcoin because its feasibility depends directly on the resources an attacker can provide. Resources are measured in terms of financial and computational power. Large entities such as governments or intelligence agencies have the means to control a large share of the total computational power. They would be able to destroy the system or steer it in their favor. It is important to note that even with a computational power that is slightly less than 50%, an attacker may still be able to severely manipulate the system. In the next subsection, we analyze the chances of success of the attackers based on the share of computational power they control.

IV-B Probability of Success

Despite the continuously increasing popularity of Bitcoin, the number of merchants that accept it as a method of payment today is still relatively small. Many merchants have concerns about its security, while others consider it a slow method of payment. Those that accept it should take every precaution before accepting a transaction to prevent double-spending attacks.

One important precaution is deciding when to accept a transaction before making the trade. Merchants prefer to obtain a certain degree of confidence as assurance that the payer will not be able to reverse the transaction. Those that can afford to wait a long period of time before accepting a transaction (for example, online platforms) require a minimum of six confirmations before accepting a transaction and considering it irreversible. However, others that cannot afford to wait this long (such as vending machines) rush into accepting transactions at the risk of losing the payment to a double-spending attack.

Similar to the analysis in [1], we model the race between the honest miners and the attacker to generate blocks as a binomial random walk. The race is denoted as $z$, which represents the number of blocks generated by the honest miners with computational power $p$ minus the number of blocks generated by the attacker with computational power $q$. If a block is generated by the honest miners, we increment $z$ by 1. Conversely, if a block is generated by the attacker, we decrement $z$ by 1. The race between the honest chain and the chain generated by the attacker can be derived as

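$$z_n = z_0 + \sum_{i=1}^{n} \Delta_i \qquad (11)$$

where $\Delta_i \in \{+1, -1\}$ represents an individual block race and $z_0$ is the initial lead of the honest chain. If $q > p$ and the attacker has unlimited resources, the attacker will eventually reach $z_n < 0$. At that point, the attacker can replace the blocks generated by the honest miners and succeed in performing the attack.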

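The probability of the attackers to catch up and surpass the blocks generated by the honest miners can be compared to the Gambler’s Ruin Problem. Similar to the description in [31], we assume a gambler (attacker) begins with an initial fortune $i$, $0 < i < N$, and either wins \$1 with probability $q$ or loses \$1 with probability $p$, in each successive gamble. The game represents a random walk which terminates at $0$ (fail) or at $N$ (success). The probability of success when starting from fortune $i$ is denoted as $P_i$ and can be calculated as:

$$P_i = q\,P_{i+1} + p\,P_{i-1} \qquad (12)$$

Since $p + q = 1$, we can rewrite equation (12) as:

$$P_{i+1} - P_i = \frac{p}{q}\left(P_i - P_{i-1}\right) \qquad (13)$$

At $i = 0$, the attacker has a probability of success $P_0 = 0$. By rearranging and generalizing equation (13), we have

$$P_{i+1} = P_1 \sum_{k=0}^{i} \left(\frac{p}{q}\right)^{k} \qquad (14)$$

Let $i = N - 1$, meaning that $P_{i+1} = P_N = 1$; we can rewrite equation (14) as

$$1 = P_1 \sum_{k=0}^{N-1} \left(\frac{p}{q}\right)^{k} \qquad (15)$$

Solve $P_1$ from equation (15) and substitute the result into equation (14) to obtain

$$P_i = \begin{cases} \dfrac{1-(p/q)^{i}}{1-(p/q)^{N}}, & p \neq q \\[2ex] \dfrac{i}{N}, & p = q \end{cases} \qquad (16)$$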
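The closed form of the Gambler's Ruin probability can be checked numerically. The following is a minimal Python sketch, assuming the gambler (attacker) wins each step with probability `q` and loses with probability `p = 1 - q`; the function names are illustrative and do not come from the paper. It evaluates the standard closed form and measures how well it satisfies the one-step recurrence with boundary values 0 at fortune 0 and 1 at fortune `N`.

```python
def success_probability(i, N, q):
    """Probability of reaching fortune N before 0, starting from fortune i.

    Assumption: each gamble is won with probability q and lost with p = 1 - q.
    """
    p = 1.0 - q
    if abs(p - q) < 1e-12:
        # Fair game: the closed form degenerates to i / N.
        return i / N
    r = p / q
    return (1.0 - r**i) / (1.0 - r**N)


def max_recurrence_residual(N, q):
    """Largest violation of P_i = q*P_{i+1} + p*P_{i-1} over 0 < i < N."""
    p = 1.0 - q
    P = [success_probability(i, N, q) for i in range(N + 1)]
    return max(abs(P[i] - (q * P[i + 1] + p * P[i - 1])) for i in range(1, N))
```

A residual close to machine precision indicates that the closed form and the recurrence agree for the chosen `N` and `q`.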

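Following the analysis in [32], we assume that the attacker begins with an initial fortune $i$ and can afford to lose up to $i$ dollars before giving up. The gambler wins on reaching $N = i + z$ dollars, that is, after a net gain of $z$. This assumption modifies the game to account for the probability $q_z$ of the attacker to surpass the blocks generated by the honest miners as

$$q_z = \frac{1-(p/q)^{i}}{1-(p/q)^{i+z}} \qquad (17)$$

Consider an attacker that possesses an unlimited amount of resources and is willing to use as much of it as needed to perform the attack, i.e. $i \to \infty$. If $q > p$, then

$$\lim_{i \to \infty} \frac{1-(p/q)^{i}}{1-(p/q)^{i+z}} = 1 \qquad (18)$$

For $q < p$, we first divide the numerator and denominator by $(p/q)^{i}$ then calculate the limit as

$$\lim_{i \to \infty} \frac{(q/p)^{i}-1}{(q/p)^{i}-(p/q)^{z}} = \left(\frac{q}{p}\right)^{z} \qquad (19)$$

Finally, we can summarize the probability of the attacker to surpass the blocks generated by the honest miners as

$$q_z = \begin{cases} 1, & q \geq p \\ (q/p)^{z}, & q < p \end{cases} \qquad (20)$$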
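The limiting argument can be illustrated numerically. The sketch below is a hedged Python example, assuming an attacker who wins each block race with probability `q`, can afford to fall `i` further steps behind, and needs a net gain of `z` blocks; the function names are hypothetical. It compares the finite-budget Gambler's Ruin probability with the unlimited-budget result (1 if `q >= p`, otherwise `(q/p)**z`).

```python
def catch_up_finite(i, z, q):
    """Catch-up probability with a finite budget of i losses and a target gain of z."""
    p = 1.0 - q
    if abs(p - q) < 1e-12:
        return i / (i + z)
    r = p / q
    if r > 1.0:
        # Divide numerator and denominator by r**i so that large i cannot overflow.
        s = (q / p) ** i
        return (s - 1.0) / (s - r**z)
    return (1.0 - r**i) / (1.0 - r ** (i + z))


def catch_up_unlimited(z, q):
    """Limit of catch_up_finite as the budget i grows without bound."""
    p = 1.0 - q
    return 1.0 if q >= p else (q / p) ** z
```

For example, `catch_up_finite(2000, 6, 0.3)` and `catch_up_unlimited(6, 0.3)` agree to within floating-point error, while any `q` of at least 0.5 yields a limit of 1.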

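The merchant has no way of figuring out the number of blocks that the attacker has been able to secretly mine. Therefore, one way to model the overall probability of the attacker to surpass the honest chain is by using the Poisson distribution. The expected number of blocks an attacker can generate while the honest miners generate $z$ blocks is $\lambda = z\,q/p$. The overall probability $P$ of the attacker to surpass the honest chain can be computed by multiplying the Poisson density and the probability of surpassing the honest remaining blocks as discussed in equation (20):

$$P = \sum_{k=0}^{\infty} \frac{\lambda^{k} e^{-\lambda}}{k!}\, q_{z-k} \qquad (21)$$

where $q_{z-k}$ is the probability from equation (20) that the attacker makes up a deficit of $z - k$ blocks, with $q_{z-k} = 1$ for $k > z$. For equation (21), if $q \geq p$, we will always have $P = 1$, meaning that the attacker will win. When $q < p$, the probability for the attacker to succeed is

$$P = 1 - \sum_{k=0}^{z} \frac{\lambda^{k} e^{-\lambda}}{k!} \left(1 - \left(\frac{q}{p}\right)^{z-k}\right) \qquad (22)$$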
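The Poisson-based probability can be evaluated directly. The following Python sketch follows the calculation given in the Bitcoin white paper [1], using an expected attacker progress of `z * q / p`; the function name is illustrative. It is a sketch for exploring the trade-off between confirmations and risk, not a reproduction of the code used to generate the figures in this paper.

```python
import math


def attacker_success_probability(q, z):
    """Probability that an attacker with hashrate share q overtakes z confirmations."""
    p = 1.0 - q
    if q >= p:
        # A majority (or equal-power) attacker eventually catches up with certainty.
        return 1.0
    lam = z * (q / p)  # expected number of attacker blocks while honest miners find z
    total = 1.0
    for k in range(z + 1):
        poisson = math.exp(-lam) * lam**k / math.factorial(k)
        total -= poisson * (1.0 - (q / p) ** (z - k))
    return total
```

For instance, with `q = 0.1` the probability is about 0.2046 after one confirmation and about 0.0009 after five, which illustrates the exponential decline as the merchant waits longer.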

Another way to model this probability is by using the negative binomial distribution, assuming the attacker can pre-mine one block before broadcasting the merchant-paying transaction to the network [33]. The merchant waits for $z$ blocks to be generated by the honest miners with computational power $p$ before accepting the transaction. During that time, the attacker can secretly generate $k$ blocks with computational power $q$, where $k \geq 0$. By definition, we can model this as the number of blocks that the attacker generates (successes) before the honest miners generate $z$ blocks (failures). Therefore, the probability that the attacker has generated exactly $k$ blocks at that point can be calculated as

$$P(k) = \binom{k+z-1}{k}\, p^{z} q^{k} \qquad (23)$$

Overall, the probability for an attacker to successfully surpass the number of blocks generated by the honest miners can be computed as

$$P = \sum_{k=0}^{\infty} \binom{k+z-1}{k}\, p^{z} q^{k}\, q_{z-k} \qquad (24)$$

where $q_{z-k}$ is the probability from equation (20) of making up a deficit of $z - k$ blocks, with $q_{z-k} = 1$ for $k \geq z$. Similar to the previous analysis, equation (24) confirms that when $q \geq p$, the attacker will always succeed since $P = 1$. When $q < p$, the probability of success can be defined as

$$P = 1 - \sum_{k=0}^{z} \binom{k+z-1}{k} \left(p^{z} q^{k} - p^{k} q^{z}\right) \qquad (25)$$
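The negative binomial model can also be evaluated numerically. The sketch below uses a well-known closed form for this model in which the attacker has pre-mined one block; it is assumed here, not confirmed, that this matches the expression in equation (25). The function name and the handling of zero confirmations are choices made for illustration.

```python
from math import comb


def double_spend_probability(q, z):
    """Success probability under the negative binomial model with one pre-mined block."""
    p = 1.0 - q
    if q >= p or z == 0:
        # Majority attacker, or a merchant accepting an unconfirmed transaction.
        return 1.0
    total = 1.0
    for k in range(z + 1):
        # Probability mass of k attacker blocks, weighted by the chance of failing to catch up.
        total -= comb(k + z - 1, k) * (p**z * q**k - p**k * q**z)
    return total
```

With `q = 0.1`, this gives 0.2 for one confirmation and 0.056 for two, so each extra confirmation reduces the risk substantially for a minority attacker.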

Fig. 3 shows the results of $P$ as $z$ changes based on equation (IV-B). From this figure, the merchant can obtain the desired level of confidence before accepting a transaction. For any $q$ at $z = 0$, the outcome is certain: the attackers have a 100% chance of success. As the number of blocks $z$ increases, the chances of a successful double-spending attack decline. Conversely, as $q$ increases, the chances of a successful attack increase. The figure also shows that if $q \geq 0.5$ then we will always get $P = 1$. This is known as the majority attack. In fact, even if the values of $q$ are slightly less than 0.5, the chances of a successful double-spending attack could still be high. However, the probability declines exponentially as the value of $z$ increases.

Fig. 3: Probability of successful double-spending attacks vs. number of confirmations waited by the merchant.

IV-C Attack Profitability

A successful double-spending attack is only profitable if the revenue is higher than the cost of performing the attack. Suppose an attacker tries to double-spend BTC paid to a merchant in exchange for a product/service. The attacker releases a transaction into the network that pays BTC to the wallet possessed by the merchant. Immediately after releasing the transaction, the attacker secretly begins to mine blocks of transactions. One of these blocks contains a fraudulent transaction that pays the same BTC to the wallet possessed by the attacker. The merchant accepts the transaction after observing that blocks have been extended to the blockchain. If the attacker is able to secretly mine blocks and replace the blocks in the blockchain generated by the honest miners, then the attacker is successful in gaining a product/service without paying for it. Assume that the attack returns a value of , one of which is the actual BTC as a result of reversing the merchant-paying transaction and the other as the product/service. In addition to this, the attacker gains the mining reward for each block mined and the transaction fees included in each transaction. Then the revenue gained by the attacker can be formulated based on his/her corresponding as follows

(26)

where is the block reward and the transaction fee per block.

Multiple factors can impact the cost such as the price and depreciation value of machinery used, the cost of electricity, and the amount of BTC being spent in the transaction. However, formulating the cost with all the possible factors is infeasible. To simplify it, we focus our analysis on the cost factors that could change significantly as the attack is performed. These factors include the BTC an attacker spends in the merchant-paying transaction, the cost of mining blocks, and the depreciation cost of the computing device used in BTC at time . We derive the cost as follows

(27)

where is the estimated mining electrical cost in BTC/block of a miner with a share of the total computational power of the system. We assume remains constant during the total time the attack is performed.

We also assume that the average lifespan of the mining equipment is approximately two years. Using straight-line depreciation, is a negligible value for an attack over a short period of time. Therefore, we can reduce the cost equation as follows

(28)

The time to mine a single block either by the honest miners or the attackers is approximately ten minutes. Therefore, we can rewrite equations (IV-B), (26), and (28) as

(29)
(30)
(31)

The profit/loss can be formulated from equations (30) and (31) as

(32)

Nowadays, to stand a chance in mining Bitcoin, miners merge their computational power into mining pools as discussed in Section III-G. The mining pools combine the computational power provided by the computing machine of each participating miner. Machines are categorized into one of four groups: Application-Specific Integrated Circuits (ASIC), Field-Programmable Gate Array (FPGA), Graphics Processing Unit (GPU), and Central Processing Unit (CPU). Each group can provide up to a certain computational power. Comparisons of most of those machines are presented in [34] and [35].

Each computing machine consumes electricity differently based on its specifications. Even machines with similar specifications might vary in cost to operate. As a result, formulating the cost of electricity spent by a miner in the mining process becomes challenging.

ASICs have monopolized the mining process due to their incomparable computational power with those of the CPUs, GPUs, and FPGAs. Miners using any computing machines other than ASICs have a negligible chance of competing. Many mining pools do not permit miners with these machines to join their pools. A miner that joins a mining pool with one of these machines would hardly earn BTC in the event that the pool successfully mines a block. This is because rewards are usually divided among the miners based on their contributions as discussed in Section III-G.

Our goal now is to formulate the estimated electrical cost of a mining pool. First we estimate the total number of miners based on the total hashrate of the system at a certain time as

(33)

where is the average hashrate of a single mining machine involved in mining at time .

The cost of electricity is measured in cents/kWh and varies based on the end-use sector and time . End-use sectors include, residential, commercial, industrial, and transportation. We denote the average cost of electricity of all sectors at time as . Using the computing wattage of the machine, the average running cost of a machine at time is

(34)

Using equations (33) and (34), the total cost for all miners at time is

(35)

We know that it takes approximately 10 minutes to generate one block, i.e. in 1 hour, miners can generate 6 blocks. Therefore, the total cost for all miners to generate one block at $t = 10$ minutes can be estimated as

(36)

However, as discussed previously, miners merge their computational power to increase their chances of winning in the mining competition. The largest mining pool that exists today is Antpool [36] controlling approximately 25% of the total computational power. Other mining pools also exist such as BTCC Pool [37], Bixin [38], BTC.com [39], and BTC.TOP [40] that control approximately 7%-11% of the computational power.

We estimate the average electricity cost of a mining pool based on its computational power as

(37)

For our simulations, we assume that the total cost of mining blocks by all miners and computational power of the mining pool are fixed during the total mining time . We also assume a mining environment consists of miners using only ASICs such as Antminer S9 since it is one of the most efficient computing machines on the market today. The specifications of this machine are TH/s and kWh.

Consider an attacker trying to perform a double-spending attack during the period of August 2017. During that period, 1 BTC was equal to approximately $4500. The total hashrate power was approximately TH/s and the average cost of electricity for all sectors in the U.S. was approximately 10.98 cents/kWh, based on the data collected by the U.S Energy Information Administration [41]. Under these circumstances, in Fig 4 we present the expected profit/loss of double-spending attacks for various computational powers . For this analysis, we assume the attackers try to double-spend BTC.
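The cost estimate described above can be sketched as a short calculation. The following Python example is a minimal sketch under stated assumptions: the function and parameter names are hypothetical, and the hashrate and machine specifications in the usage note are placeholders rather than values taken from this paper. Only the electricity price (10.98 cents/kWh) and the exchange rate (\$4500 per BTC) come from the text.

```python
def pool_cost_per_block_btc(total_hashrate_ths, machine_hashrate_ths, machine_kw,
                            price_cents_per_kwh, pool_share, usd_per_btc,
                            blocks_per_hour=6):
    """Estimated electricity cost, in BTC, for a pool to mine for one block interval."""
    # Number of machines needed to produce the total network hashrate.
    n_machines = total_hashrate_ths / machine_hashrate_ths
    # Running cost of one machine for one hour, in USD.
    usd_per_machine_hour = machine_kw * price_cents_per_kwh / 100.0
    # Cost for the whole network per hour, then per block (about 6 blocks per hour).
    usd_all_per_hour = n_machines * usd_per_machine_hour
    usd_all_per_block = usd_all_per_hour / blocks_per_hour
    # The pool pays in proportion to its share of the total computational power.
    return pool_share * usd_all_per_block / usd_per_btc
```

As a usage example, `pool_cost_per_block_btc(6.0e6, 14.0, 1.4, 10.98, 0.3, 4500.0)` evaluates the cost for a pool with a 30% share, where the total hashrate of 6.0e6 TH/s and the 14 TH/s, 1.4 kW machine are illustrative placeholders.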

Fig. 4: Profit/loss of attackers with varying computational power trying to double-spend BTC.

In Fig. 4, a point above the zero-profit line represents a profit while one below it represents a loss. The point of intersection of a curve with this line represents the break-even point of an attack. The amount of BTC spent to perform the attack at this time is equal to the revenue returned. By analyzing the figure, we attain the following findings:

  2. For any value of $q$ at $z = 0$, the attacker turns a profit equal to the full double-spent amount. Recall from Fig. 3 that for any value of $q$ at $z = 0$ (or $t = 0$), $P = 1$. The merchant accepts an unconfirmed transaction, giving the attacker a theoretically perfect chance to succeed. In this example, the attacker is trying to double-spend 5 BTC, resulting in a profit of 5 BTC for all values of $q$ at $z = 0$.

  3. When the merchant waits for $z$ confirmations before accepting a transaction, the attacker is forced to mine blocks in order to create a fork in the blockchain and succeed in the attack. As discussed for Fig. 3, the probability of success is based on the computational power $q$ of the attackers. Larger values of $q$ correspond to higher probabilities of success $P$, hence larger profits/losses. We also know that $P$ declines as $z$ (or $t$) increases for all values $q < 0.5$. As a result, the profits eventually turn into a loss as time progresses. An attacker with a smaller value of $q$ begins losing at an earlier time during the attack, while one with a larger value of $q$ can withstand longer periods before losing. However, as reflected by the figure, the losses of attackers with smaller values of $q$ are smaller and grow more slowly than those of attackers with larger values of $q$. This is because the cost of electricity for attackers with larger values of $q$ is higher than for those with smaller values of $q$.

  4. In Fig 4, an attacker with represents the scenario that begins at the maximum possible profit, then continues to decline till the break-even point. That time is enough for only three blocks to be added to the blockchain. In other words, three blocks of confirmation for a transaction worth of 5 BTC should give the merchant enough confidence that an attacker with will not be able to reverse the transaction. The attacker will most likely fail to turn a profit if unable to beat the honest miners in the mining process before they add three blocks to the blockchain. If the attacker continues to perform the attack beyond this point, the cost will continue to increase while the revenue declines leading to a loss. As a result, the attacker would most likely surrender at the break-even point to minimize any losses.

  5. For to , Fig 4 shows that the profit continues to grow as increases until it reaches a maximum point due to the accumulation of the mining rewards. Once the chance of success starts to decline with , the profit also begins to decrease until it reaches the break-even point and later turns into a loss. However, for , the attacker always succeeds. The profit is represented as a straight line with a positive slope where the slope represents the rate of turning a profit.

In summary, an attacker with a computational power $q < 0.5$ will eventually lose at some point as $z$ increases. On the other hand, an attacker with computational power $q \geq 0.5$ will always succeed with a profit. However, it is important to note that this analysis does not include the luck factor. Consider two miners with computational powers $q_1$ and $q_2$ respectively, where $q_1 > q_2$. The miner with computational power $q_1$ has more resources to solve the proof-of-work and can therefore mine faster than the miner with computational power $q_2$. However, the miner with computational power $q_2$ could still find the solution to the proof-of-work first due to the randomness of the exhaustive search performed. From a probabilistic standpoint, the chances are low.

V Bitcoin Network Security

Bitcoin is designed to operate over a P2P network. It is therefore vulnerable to attacks on decentralized networks, which can aggravate other issues. In this section, we discuss major network attacks that can compromise Bitcoin and present network-related issues. We also suggest possible countermeasures.

V-A Denial of Service Attacks

Denial of Service (DoS) attacks flood the network with bogus traffic in order to disrupt legitimate services and participating components connected to the Bitcoin network. As an example, DoS attacks on a mining pool can result in eliminating the pool from the mining competition, hence giving an advantage to other miners. They could also facilitate double-spending attacks by preventing certain miners from observing the actual transaction flow [42, 43].

Some nodes prefer to privately connect to the Bitcoin network in order to limit the possibilities of becoming victims of DoS attacks. However, this limits the nodes to at most 8 outgoing connections. As the number of private nodes increases in the network, the random topology connection weakens. With fewer connections between the nodes, information is flooded at slower rates. There is also no guarantee of the legitimacy of the 8 outgoing connections each private node connects to. This means that even a private node can still be vulnerable to a DoS attack if it unluckily connects to malicious nodes.

Bitcoin developers are continuously updating the Bitcoin implementation in an effort to minimize the chances of DoS occurrences. The newer versions analyze the network connections more closely to try to eliminate suspicious nodes from connecting. Developers also strive to limit certain transactions/blocks from being flooded throughout the network. New transactions/blocks are given priority over less important ones such as orphan transactions/blocks. Certain parameters such as block size are also continuously being altered to adjust the network based on its needs. However, the nature of the P2P network makes Bitcoin vulnerable to these attacks.

V-B Sybil Attacks

Peer-to-peer networks are also vulnerable to Sybil attacks [23]. In a Sybil attack, the attacker sets up multiple pseudonymous identities from a single node. In this way, the attacker can acquire an unfair share of the network's IP addresses. The honest nodes in the network can easily be deceived into believing that the IP addresses belong to different nodes. With a large number of IP addresses, the attacker can monopolize the connections of other nodes and control the data propagating to them.

A countermeasure to this attack was proposed in the original Bitcoin white paper [1]. This countermeasure also presented a solution to the majority decision-making problem. It is more convenient to have a one-to-one relationship between a computing machine (node) and a vote instead of having one between an IP address and a vote. An attacker reproducing multiple IP addresses from a single node can no longer make use of them. Every node must engage in a proof-of-work procedure to prove its legitimacy as discussed in Section III-F.

Other countermeasures have been taken by the Bitcoin developers to limit Sybil attacks. Each outbound connection is limited to a single IP address per subnet mask 255.255.0.0 (i.e. x.y.0.0/16). In other words, although a malicious node can theoretically generate 65536 IP addresses within a single 16-bit network prefix, only one of them can be used in a requested outbound connection. Today, controlling machines across many different 16-bit network prefixes is impractical. Malicious users with IP addresses belonging to different network prefixes would need to collude in order to pull off such an attack.
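The one-address-per-/16 rule can be illustrated with a short sketch. The Python example below is a simplified illustration, not the Bitcoin reference implementation; the function names and the candidate list are hypothetical. It groups IPv4 addresses by their 16-bit network prefix using the standard `ipaddress` module and selects at most one outbound peer per group.

```python
import ipaddress


def group16(addr):
    """Return the /16 network (x.y.0.0/16) that contains an IPv4 address."""
    return ipaddress.ip_network(f"{addr}/16", strict=False)


def pick_outbound(candidates, max_outbound=8):
    """Choose outbound peers in order, allowing at most one address per /16 group."""
    chosen, used_groups = [], set()
    for addr in candidates:
        group = group16(addr)
        if group in used_groups:
            continue  # another peer from this /16 was already selected
        used_groups.add(group)
        chosen.append(addr)
        if len(chosen) == max_outbound:
            break
    return chosen
```

For example, `pick_outbound(["10.1.2.3", "10.1.9.9", "10.2.0.1"])` skips the second address because it shares the 10.1.0.0/16 prefix with the first, and `group16("10.1.2.3").num_addresses` equals 65536.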

Developers could increase security further by allowing only one outbound connection per larger address block, i.e. a shorter network prefix (for example x.0.0.0/8). However, this would restrict the set of peers available for outbound connections, which runs counter to the open nature of the P2P network. To optimize security, the prefix length should be adjusted dynamically based on the available network prefixes of the nodes connected to the network. This optimization is very challenging since there is no fixed pattern to how or when nodes connect to the network. In general, this practice is a weak countermeasure that only slightly increases security, even if optimized.

Users should also realize that the majority of node connections are inbound connections (up to 117, compared with 8 outbound). Even if we assume that all 8 outbound connections of a node are legitimate, there is no guarantee that the inbound connections are genuine. A private node relies only on its outbound connections to limit its network connections and the data it receives.

V-C Eclipse Attacks

The eclipse attack on Bitcoin was proposed in [44]. The primary purpose of an eclipse attack, as defined originally in [45, 46, 47], is to monopolize all the outbound and inbound connections of a node within a P2P network. As a result, the victim node becomes isolated from the rest of the network and only receives data fed to it by the attacker. By monopolizing the connections of a node, the attacker can control the blockchain view of this node. The eclipse attack targets discoverable nodes, i.e. nodes with public IP addresses. It strives to populate the tried and new tables of nodes with bogus IP addresses by frequently sending the victim nodes unsolicited messages. When the tables of a node are full, the node begins evicting random IP addresses to replace them with the newer ones.

The attack requires the victim node to restart all of its connections. Examples that may cause connection restarts include Internet Service Provider outages, power failures or system/software updates. When the node tries reconnecting to its 8 permitted outbound connections, it will choose the compromised addresses in either the new or tried tables with a bias towards the newest stored IP addresses. The optimum time to perform the attack is after populating the tables of the victim node with a decent number of controlled IP addresses. The chances of a successful attack are based on the percentage of the controlled IP addresses and the time an attacker spends performing the attack.

To limit an eclipse attack, some countermeasures have been proposed [44]. When replacing IP addresses as newer ones arrive, a deterministic eviction method could be used instead of the random eviction technique. In this way, each IP address is mapped to exactly one slot in the tables rather than multiple slots, requiring the attacker to possess a large number of addresses. Also, selecting IP addresses at random rather than choosing the most recent ones when initiating an outbound connection removes the bias towards the bogus addresses of the attacker. Other measures include checking an IP address before evicting it in favor of a new one. If the address still connects successfully, there is no reason to evict and replace it with another one. Feeler and anchor connections are also good methods that can disrupt an attacker. Other measures such as increasing the size of the tables, allowing more outgoing connections, or banning unsolicited ADDR messages can also greatly limit eclipse attacks.
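Two of these countermeasures, deterministic slot assignment and testing an address before evicting it, can be sketched together. The Python example below is a hypothetical illustration only: the class name, the node-local secret, and the `is_reachable` callback are assumptions made for this sketch and do not reflect the actual address manager of any Bitcoin client.

```python
import hashlib


class AddressTable:
    """Toy address table in which every IP address maps to exactly one slot."""

    def __init__(self, size, secret=b"node-local-secret"):
        self.size = size
        self.secret = secret  # hypothetical per-node secret that hides the mapping
        self.slots = [None] * size

    def slot_for(self, ip):
        """Deterministically map an IP address to a single slot index."""
        digest = hashlib.sha256(self.secret + ip.encode()).digest()
        return int.from_bytes(digest[:8], "big") % self.size

    def insert(self, ip, is_reachable=lambda old: False):
        """Insert ip; keep the current occupant if it still answers a connection test."""
        idx = self.slot_for(ip)
        old = self.slots[idx]
        if old is not None and old != ip and is_reachable(old):
            return False  # test-before-evict: the working address stays
        self.slots[idx] = ip
        return True
```

Because the slot depends only on the address and the node's secret, re-sending the same address many times can never occupy more than one slot, so an attacker needs many distinct addresses to fill the table.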

V-D Routing Attacks

The main purpose of a routing attack is to intercept the network transmitted messages and tamper with them. The work presented in [48] proposed a routing attack on Bitcoin via the Internet infrastructure. The Border Gateway Protocol (BGP) [49] is the most widely used protocol when transmitting data between Autonomous Systems (ASs). An AS manages a set of nodes with similar IP address prefixes and is responsible for routing data between its nodes and other ASs.

The proposed attack intercepts traffic between ASs by performing two independent attacks: partitioning attack and delay attack. The attack takes advantage of the fact that ASs do not validate the newly announced BGP routes which could result in possible BGP hijacks. A malicious AS can announce forged IP address prefixes to deceive other ASs into believing false routing information. As a result, a successful attacker will be able to intercept all the traffic for nodes with a certain IP address prefix before it reaches its original destination.

The partitioning attack strives to partition the Bitcoin network into two disjoint groups. One group represents the set of isolated nodes while the other group represents the remaining network. The attacker, usually an AS, must hijack the BGP routes of other ASs. Once they are hijacked, the attacker can intercept all inbound and outbound traffic of all the victim ASs. However, the attacker cannot intercept the traffic of stealth connections. Such connections include intra-AS connections (between nodes within the same AS), intra-pool connections (between gateways belonging to the same mining pool), and pool-to-pool connections (private connections established between pools). Stealth connections can leak data to the isolated group of nodes and result in an attack failure. Therefore, the attacker must detect such nodes and remove them from the list of nodes to be isolated.

Once the network is divided into two groups, the attacker can perform the delay attack. The main goal of the delay attack is to tamper with data propagating to its destination and cause a stall. The success of the attack relies on the fact that the exchanged messages (INV, GETDATA, and BLOCK) are not encrypted. If the attacker intercepts the flow of traffic between ASs, it is possible to tamper with these messages without any node learning about it. For example, when a node within an AS requests data from its peer within another AS, the attacker will intercept the request message (GETDATA) and modify the request. As a result, the sending node will send undesired data and cause the receiving node to resend a request message. As long as the attack occurs within a 20-minute time frame, the nodes will not lose their connection and will not be aware that their messages are being tampered with.

The authors in [48] suggest some countermeasures to limit the routing attacks. Simple measures include increasing and diversifying AS connections. Also, monitoring network information such as round-trip time can help identify potential threats. More complex measures include encrypting messages, using different channels and ports, and simultaneously requesting data from more than one peer. However, implementing these more complex measures could introduce additional cost and delay.

VI Bitcoin Storage Security

Bitcoin wallets behave differently from physical wallets that are used to hold cash and banking cards. A Bitcoin wallet does not store actual BTC. Instead, it stores the private and public key pairs that can be used to prove ownership of certain BTC recorded on the blockchain. As discussed in Section III-B, keys are generated using pseudo-random number generators and elliptic curve cryptography. In this section, we discuss the variations in Bitcoin wallets and outline the security issues in each.

Similar to [50], we first discuss Bitcoin wallet security based on the key generation and infrastructure of the wallet. Three types of wallets have been defined for Bitcoin. We summarize the comparison between all three types in Table I.

The simplest wallets are categorized as nondeterministic wallets, sometimes referred to as Type-0 wallets. In these wallets, when a new pair of keys is requested, the wallet generates a random private key as shown in equation (1). Next, the wallet derives its corresponding public key as described in equation (2). The generated key pair is completely random and uncorrelated with the previously generated keys. However, these wallets require sophisticated management and could perform poorly once the number of stored keys grows beyond the storage capacity of the wallet. A consistent backup of the generated keys is also essential to ensure that users can still access their BTC in the event of a wallet being unavailable. However, backups are liable to theft, which can expose all the keys belonging to a wallet.

Deterministic wallets are another type of Bitcoin wallet. They are also referred to as Type-1 wallets and address the drawbacks of Type-0 wallets. In this type of wallet, all the generated keys are based on a common and randomly chosen seed $S$. Using $S$, all the keys are derived in a deterministic manner. First, a private key with an index $i$ is generated as

(38)

Using equation (38), the corresponding public key is then generated as discussed in equation (2). In contrast to nondeterministic wallets, deterministic wallets need only to keep a backup of the seed $S$ to regenerate all of the previously derived keys.
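The idea of regenerating every key from a single seed can be sketched as follows. This Python example assumes, purely for illustration, that a private key is the SHA-256 hash of the seed concatenated with a 4-byte big-endian index; the exact construction in equation (38) may differ, and the function names are hypothetical. The validity check uses the order of the secp256k1 curve, within which a Bitcoin private key must fall.

```python
import hashlib

# Order of the secp256k1 group; a valid private key lies in [1, SECP256K1_N - 1].
SECP256K1_N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141


def type1_private_key(seed: bytes, index: int) -> bytes:
    """Derive the index-th private key from the seed (assumed construction)."""
    return hashlib.sha256(seed + index.to_bytes(4, "big")).digest()


def is_valid_secp256k1_key(key: bytes) -> bool:
    """Check that the 32-byte value is a usable secp256k1 private key."""
    return 0 < int.from_bytes(key, "big") < SECP256K1_N
```

Backing up only `seed` is enough to recompute `type1_private_key(seed, i)` for every index `i` used so far, which is the main advantage over a Type-0 wallet.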

Hierarchical Deterministic (HD) wallets, referred to as Type-2 wallets, were later introduced based on the standard [51]. In HD wallets, keys are generated in a tree structure as shown in Fig. 5. The key of a node is generated using its corresponding parent node key.

Fig. 5: The structure of a hierarchical deterministic wallet.

For each node, a key consists of three components: a private key, a public key, and a chain code. The chain code is a third component introduced to prevent the derivation of the key of a child node from only the private and public keys of the parent node. In this way, the extended key is an extension of both the private and public key. The extended private key is a combination of the private key and chain code, and is used to derive the private key of a child node. Using the derived private key of the child node, it is possible to derive its corresponding public key as explained in equation (2). On the other hand, the extended public key is a combination of the public key and chain code, and is used to derive the public key of a child node. It is important to realize that the public key of a child node can be derived using either the extended private or extended public keys.

Key generation begins at depth 0 which derives the root node (master) key components using a randomly chosen seed ($s$). In many wallets, $s$ is in the form of a mnemonic word sequence as described in standard [52]. A mnemonic word sequence is a sequence of English words that represents a random number used to derive $s$. Using $s$, the master private key $k_{pr}^{m}$ and chain code $c^{m}$ are derived as

$$k_{pr}^{m} = \mathrm{Left}_{256}\big(\mathrm{HMAC\text{-}SHA512}(s)\big) \tag{39}$$

$$c^{m} = \mathrm{Right}_{256}\big(\mathrm{HMAC\text{-}SHA512}(s)\big) \tag{40}$$

where $\mathrm{HMAC\text{-}SHA512}$ is a one-way hash-based message authentication code that outputs a 512-bit digest, and the functions $\mathrm{Left}_{256}$ and $\mathrm{Right}_{256}$ extract the left and right 256 bits of the digest respectively. Using the result in equation (39), the master public key is generated as described in equation (2).
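The master key derivation can be illustrated with a short sketch. It assumes the BIP32 convention of keying the HMAC with the constant string "Bitcoin seed"; the function name `master_key_from_seed` and the example seed are hypothetical.

```python
import hashlib
import hmac
from typing import Tuple


def master_key_from_seed(seed: bytes) -> Tuple[bytes, bytes]:
    """Split a 512-bit HMAC digest of the seed into a private key and a chain code."""
    # Assumption: the HMAC key is the BIP32 constant b"Bitcoin seed".
    digest = hmac.new(b"Bitcoin seed", seed, hashlib.sha512).digest()
    master_private_key = digest[:32]  # left 256 bits
    master_chain_code = digest[32:]   # right 256 bits
    return master_private_key, master_chain_code


seed = bytes.fromhex("000102030405060708090a0b0c0d0e0f")  # example seed only
master_private_key, master_chain_code = master_key_from_seed(seed)
```

A production wallet would additionally reject a master private key that is zero or not below the curve order; that check is omitted here for brevity.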

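| Infrastructure | Management | Seed | Backup        | Structure    |
|----------------|------------|------|---------------|--------------|
| Type 0         | Complex    | No   | All keys      | Random       |
| Type 1         | Moderate   | Yes  | Only the seed | Sequential   |
| Type 2         | Simple     | Yes  | Only the seed | Hierarchical |

TABLE I: Wallet Infrastructure Comparison.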

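The next step is to generate keys for the children nodes at depth 1 in the tree. Keys can be generated differently depending on the security of the environment in which the wallet is being used. For example, when used in a secure environment, the wallet uses the extended private key to generate all the components of a child node key. This includes the private key which would allow the user to spend BTC from the wallet. Using the private key of the parent, we can generate the corresponding public key and derive both the private key and chain code of the child using a Child Key Derivation (CKD) function as

$$k_{pr}^{child} = \Big(\mathrm{Left}_{256}\big(\mathrm{HMAC\text{-}SHA512}(c^{par},\ k_{pub}^{par} \,\|\, i)\big) + k_{pr}^{par}\Big) \bmod q \tag{41}$$

$$c^{child} = \mathrm{Right}_{256}\big(\mathrm{HMAC\text{-}SHA512}(c^{par},\ k_{pub}^{par} \,\|\, i)\big) \tag{42}$$

where $k_{pub}^{par}$, $k_{pr}^{par}$, $c^{par}$ are the public key, private key, and chain code of the parent node respectively and $i$ is the index of the child node. Here the HMAC is keyed with the parent chain code, $\mathrm{Left}_{256}$ and $\mathrm{Right}_{256}$ extract the left and right 256 bits of the digest, and $q$ is the order of the elliptic curve group. Using the result of equation (41), we can derive the public key of the child node as explained in equation (2).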

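On the other hand, when used in an insecure environment, the wallet uses the extended public key to derive only the public key and chain code of the child node instead of the private key. This protects the private key from being exposed to potential attackers. It also allows payments to be made to the wallet while preventing them from being spent. The public key and chain code of the child node are derived using the CKD function as

$$k_{pub}^{child} = \mathrm{Left}_{256}\big(\mathrm{HMAC\text{-}SHA512}(c^{par},\ k_{pub}^{par} \,\|\, i)\big) \cdot G + k_{pub}^{par} \tag{43}$$

$$c^{child} = \mathrm{Right}_{256}\big(\mathrm{HMAC\text{-}SHA512}(c^{par},\ k_{pub}^{par} \,\|\, i)\big) \tag{44}$$

where $k_{pub}^{par}$ and $c^{par}$ are the public key and chain code of the parent node, $i$ is the index of the child node, $\mathrm{Left}_{256}$ and $\mathrm{Right}_{256}$ extract the left and right 256 bits of the digest, $G$ is the generator point of the elliptic curve, and $+$ denotes point addition.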
Although using the extended public key is more secure as it does not expose the private key, it may still put the wallet at risk. The extended public key exposes the chain code which is an essential component in key derivation. Using an exposed chain code and public key, an attacker can perform a brute-force attack on all the chain codes derived from it as shown in equation (44). In other cases, if the private key of a node is compromised in any way, the attacker can use it along with its corresponding exposed chain code to derive the extended private keys of all the descendant child nodes as shown in equations (41) and (42). We also consider the worst-case scenario where an attacker is capable of reversing a derived child private key $k_{pr}^{child}$ as shown in equation (41). If successful, using the corresponding parent extended public key, the attacker can derive the parent private key $k_{pr}^{par}$.
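The worst-case scenario can be made concrete with a small sketch. It assumes the BIP32-style additive form of non-hardened derivation, in which the child private key is the left HMAC half plus the parent private key modulo the curve order. The toy key pair (private key 1, whose public key is the compressed secp256k1 generator) and the function name `recover_parent_private_key` are chosen purely for illustration.

```python
import hashlib
import hmac

# Order of the secp256k1 group.
CURVE_ORDER = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141


def left_half(chain_code: bytes, parent_pub_bytes: bytes, index: int) -> int:
    """Left 256 bits of HMAC-SHA512(chain code, parent public key || index)."""
    data = parent_pub_bytes + index.to_bytes(4, "big")
    return int.from_bytes(hmac.new(chain_code, data, hashlib.sha512).digest()[:32], "big")


def recover_parent_private_key(child_priv, parent_pub_bytes, chain_code, index):
    """Attacker side: needs one leaked child key plus the parent extended public key."""
    return (child_priv - left_half(chain_code, parent_pub_bytes, index)) % CURVE_ORDER


# Toy parent key pair: private key 1 and the compressed generator point.
parent_priv = 1
parent_pub_bytes = bytes.fromhex(
    "0279be667ef9dcbbac55a06295ce870b07029bfcdb2dce28d959f2815b16f81798")
chain_code = hashlib.sha256(b"example chain code").digest()
index = 7

# Wallet side: non-hardened child private key.
child_priv = (left_half(chain_code, parent_pub_bytes, index) + parent_priv) % CURVE_ORDER

recovered = recover_parent_private_key(child_priv, parent_pub_bytes, chain_code, index)
```

Everything the attacker feeds into the HMAC is public once the extended public key is exposed, so a single leaked non-hardened child key is enough to subtract the offset and obtain the parent key.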

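To counter these issues, HD wallets also implement an enhanced derivation function known as the hardened CKD. This derivation strives to secure the exposed chain code within an extended public key. It prevents the public key of a child node from being derived from the extended public key. Therefore, only the extended private key of the parent node can be used to derive a hardened private key and chain code of the child node as

$$k_{pr}^{child} = \Big(\mathrm{Left}_{256}\big(\mathrm{HMAC\text{-}SHA512}(c^{par},\ k_{pr}^{par} \,\|\, i)\big) + k_{pr}^{par}\Big) \bmod q \tag{45}$$

$$c^{child} = \mathrm{Right}_{256}\big(\mathrm{HMAC\text{-}SHA512}(c^{par},\ k_{pr}^{par} \,\|\, i)\big) \tag{46}$$

where $k_{pr}^{par}$ and $c^{par}$ are the private key and chain code of the parent node, $i$ is the index of the child node, $\mathrm{Left}_{256}$ and $\mathrm{Right}_{256}$ extract the left and right 256 bits of the digest, and $q$ is the order of the elliptic curve group. The only change from the non-hardened derivation is that the HMAC input contains the parent private key instead of the parent public key.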

Using the result in equation (45), the corresponding hardened public key of the child node can be derived as explained in equation (2). In practice, it is suggested to derive the children keys of the master node using the hardened CKD to keep the master key as secure as possible.
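A hardened derivation step might look as follows. The sketch assumes the BIP32 conventions (a leading zero byte before the 32-byte parent private key and child indices offset by 2^31); the function name `ckd_hardened` and the example inputs are hypothetical.

```python
import hashlib
import hmac

# Order of the secp256k1 group.
CURVE_ORDER = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
HARDENED_OFFSET = 2**31  # assumption: BIP32 reserves indices >= 2^31 for hardened children


def ckd_hardened(parent_priv: int, chain_code: bytes, index: int):
    """Derive a hardened child private key and chain code from the parent private key."""
    data = (b"\x00" + parent_priv.to_bytes(32, "big")
            + (HARDENED_OFFSET + index).to_bytes(4, "big"))
    digest = hmac.new(chain_code, data, hashlib.sha512).digest()
    child_priv = (int.from_bytes(digest[:32], "big") + parent_priv) % CURVE_ORDER
    return child_priv, digest[32:]


# Example values only.
parent_priv = int.from_bytes(hashlib.sha256(b"example parent key").digest(), "big") % CURVE_ORDER
chain_code = hashlib.sha256(b"example chain code").digest()
child_priv, child_chain = ckd_hardened(parent_priv, chain_code, 0)
```

Since the HMAC input now contains a secret, a holder of the extended public key cannot recompute the offset, so neither hardened child public keys nor the parent key can be derived from public material alone.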

Bitcoin wallets can also take other measures to increase the security of storing keys. Practices such as P2SH [53] and Multi-Sig transactions increase the security of the BTC stored in the wallet, as discussed in Section III-C. Such techniques are referred to as threshold techniques as they require $m$-of-$n$ private keys to enable BTC spending. Other wallets enhance the security by encrypting the stored private keys along with a pass-phrase chosen by the owner of the wallet as defined in standard [54]. That is

$$k_{pr}^{enc} = \mathrm{AES}_{k_{e}}\big(k_{pr},\ \text{pass-phrase}\big) \tag{47}$$

where $\mathrm{AES}$ is the Advanced Encryption Standard [55] and $k_{e}$ is the encryption key. If a user wishes to spend BTC, the user must first decrypt the corresponding encrypted private key using $k_{e}$ and the pass-phrase previously used in encryption. Although encryption provides higher levels of security, the user must keep the pass-phrase and encryption keys stored securely.

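| Type                   | Function                                                                                                                        | Examples                                 |
|------------------------|---------------------------------------------------------------------------------------------------------------------------------|------------------------------------------|
| Full Service (online)  | Generate private key, derive public key, distribute public key, monitor output TX, create/sign unsigned TX, broadcast TX         | coinbase.com [56], blockchain.info [57]  |
| Signing-only (offline) | Generate parent private key, derive parent public key, sign TX                                                                   | Ledger Nano [58], TREZOR [59]            |
| Distributed (offline)  | Derive CKD, distribute public key                                                                                                | Customized pre-populated database        |

TABLE II: Wallet Function Comparison.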

The wallets that exist today come in different forms and account for different security measures. Based on the different installation environments, wallets can be categorized into three types: online (web) wallets, desktop (software) wallets, and mobile wallets. As in [60], we can further categorize each type of these wallets into: full-service wallets, signing-only wallets and distributing wallets, based on the functions that they can perform. Table II summarizes these different functions.

A full-service wallet is one that can perform all the functions required to spend and receive BTC. These functions include generating private keys needed to spend BTC, signing transactions with the private keys, deriving public keys needed to receive payments of BTC, broadcasting the derived public keys to the network, and monitoring the BTC spending and receiving of a wallet. Full-service wallets must be able to connect to the Bitcoin network. Examples of online full-service wallets include the wallets provided by coinbase.com [56] and blockchain.info [57]. Armory, Electrum and Bitcoin Core are the most popular desktop full-service wallets today. For mobile wallets, an example that runs on both Android and iOS is the Airbitz wallet.

The second type of wallets are the signing-only wallets. The main purpose behind these wallets is to enhance security by generating private keys in secure offline environments. A signing-only wallet works in conjunction with a networked wallet, which interacts with the Bitcoin network on its behalf. The signing-only wallet deterministically generates pairs of private and public keys as needed and transfers the public keys to the networked wallet. The role of the networked wallet is to distribute the public key to allow payments to be made to the wallet. In the case of an HD wallet, the networked wallet can also generate child node keys as desired. Once the networked wallet detects a transaction addressed to one of the public keys that it has distributed, it creates an unsigned transaction based on the corresponding unspent transaction output (UTXO) and transfers it to the signing-only wallet. The signing-only wallet then signs the transaction with its private key, which in the case of an HD wallet could be derived from an extended private key, and returns it to the networked wallet. Finally, the networked wallet broadcasts the signed transaction to the Bitcoin network to claim the BTC.

Signing-only wallets can either be offline wallets or hardware wallets. Offline wallets are designed to reduce network vulnerabilities. Their tasks include private key derivation and transaction signing. The signed transactions are transferred via removable media to the online wallets. Offline wallets provide higher levels of security than full-service wallets; however, they require a continuously isolated device. Hardware wallets, on the other hand, are less of a hassle than offline wallets. They are connected directly to the networked device, which eliminates the dependency on removable media when communicating between the signing-only wallet and the networked wallet. However, a hardware wallet is inconvenient for owners who make frequent payments, since the owner must constantly carry the hardware wallet to be able to make a payment at any time. As a result, many people use hardware wallets for long-term storage rather than day-to-day transactions. Utilizing this type of wallet, one can store large amounts of BTC in the most secure environments. Popular examples of hardware wallets today include the Ledger Nano [58] and TREZOR [59].

The final type of wallets are the distributing-only wallets. These wallets also strive to reduce the security issues caused by the full-service wallets. They take the form of networked wallets that distribute public keys in a pre-populated manner, where the public keys are derived in advance and distributed as needed by the network. Other distributing-only wallets are capable of generating the public keys themselves, as is the case in HD wallets.

Exchange platforms store large portions of cryptocoins in online wallets to provide their users the advantage of reduced transaction time due to the immediate availability of their private keys. This is analogous to storing cash in a centralized entity such as a bank. It is important to point out that storing cryptocoins in an online wallet provided by an exchange platform is the least secure method, since the platform stores the corresponding private keys that can spend those cryptocoins. The users must completely trust the exchange platform to safely store the private keys and not act maliciously. Even if the exchange platform can be trusted, cryptocoin owners are still at risk of losing their cryptocoins in the event that the online wallets of the exchange platform are hacked and the private keys are leaked. A hacker who gets hold of the private keys can immediately use them to send the cryptocoins to his/her personal address. Once the transaction is processed and stored on the blockchain, it can no longer be deleted or modified and most likely will not be reversed unless the blockchain is hard forked.

Throughout the history of cryptocurrencies, multiple attacks on exchanges have resulted in massive losses and severe price panics for certain cryptocurrencies. In 2011, the online wallets of Mt. Gox, one of the most notable Japan-based exchanges, were hacked, leaking all the private keys stored in its wallet.dat file. Mt. Gox was able to recover from that heist. However, in 2014 it filed for bankruptcy and was shut down after losing approximately 850,000 BTC valued at more than $450 million. At the time, it was responsible for around 70% of Bitcoin trading volume. The hackers were even able to steal BTC stored in the exchange’s hardware wallets. There is no conclusive evidence of how the attack occurred. In March 2014, Mt. Gox reported on its website that it had found 200,000 of the stolen BTC in old-format digital wallets. The other 650,000 BTC are believed to have been laundered through another exchange platform known as BTC-e.

The problem is that such heists could occur again. Exchange platforms remain extremely attractive targets for hackers since they hold large amounts of funds in the least secure manner. Users are advised to keep only limited amounts on exchanges and to store the majority of their funds in hardware wallets.

Another issue is whether or not it is possible to track the movement of stolen cryptocoins and hence catch the hacker. Based on our analysis, it is theoretically possible. However, there have been cases where hackers were able to launder large portions of stolen cryptocoins, such as the example discussed previously. Another famous example occurred in January 2018 when about $534 million worth of a cryptocoin known as XEM was stolen from a Japan-based exchange known as Coincheck. Today, this heist represents the largest theft in the history of cryptocurrencies. The exchange also announced that the cryptocoins were stolen from its online wallets through multiple unauthorized transactions. In response, the developer team announced that it would develop an automated tagging system to tag stolen XEM cryptocoins. However, the tracking system was ineffective. Once more, the stolen cryptocoins were laundered and completely lost.

In conclusion, we stress that there remains a trade-off between the security of a wallet and its ease of use. The most frequently used wallets today are full-service wallets. They are free, user-friendly and can perform all functions needed by a BTC owner. However, these wallets could be vulnerable to theft since they are connected to the network.

VII Bitcoin Privacy

Bitcoin suffers from inherent privacy issues in that attackers could link certain identities to their pseudonyms (such as Bitcoin addresses) and identify their history of transactions. This is known as the linking problem. Many users publish their real identities and Bitcoin addresses online so that others can make payments to them. This practice is common among blogs and websites that request BTC as donations or those selling a product or service. These actions could jeopardize their anonymity. Another common example is when users trade BTC for other altcoins over exchange platforms. Most exchange platforms require users to validate their identities by uploading a copy of official identification, which exposes the users to the exchange applications. In such cases, no attack is required to learn the full transaction history of those users. Simply by tracing the Bitcoin addresses over the blockchain, the transactions could be revealed. In fact, even cautious users who do not publicly use their identities may be at risk as well.

Bitcoin utilizes Bitcoin addresses as its defense mechanism to preserve the privacy of users. When generated for the users, Bitcoin addresses do not leak any information about the identities of the users. However, attackers strive to search for links between Bitcoin addresses and user identities using auxiliary information available over the network. If a link is found, it is possible to discover all the other Bitcoin addresses belonging to that user and reveal the complete history of BTC transactions of the user. Today, powerful analysis tools and search engines can be utilized to discover the Bitcoin address and determine this information. Even the strongly encouraged practice of using a new Bitcoin address for every new transaction cannot completely prevent this information from being revealed once a Bitcoin address is linked to an identity of the user.

The auxiliary information can be obtained by multiple methods. Different techniques exist today that can infer links between Bitcoin addresses and user identities. The study in [61] shows that using information about how nodes are connected within a network can help identify users. In [62], it was shown that patterns of co-occurrences may reveal useful information and expose such links. The study in [63] showed that just by monitoring the communication channel, users are likely to lose their anonymity. In [64], an analysis is presented that shows how compromised network nodes can leak significant user information and link users to certain transactions. For further reading, we direct the reader to [65, 66, 67, 68], which present similar studies.

Users can run their nodes over Tor [69] in an effort to hide their information from the rest of the network. Tor is software that provides an additional layer of anonymity. It utilizes multi-layer encryption and random relaying nodes to transfer data between a sender and receiver. The sender begins by sending the multi-layer encrypted message to a random node that decrypts a single layer and transmits it to the next relaying node. This process continues until the message is completely decrypted and arrives at the receiver [70]. However, multiple studies, such as [71, 72, 73, 74], have shown that even a low-resource attacker could be capable of gaining information flowing between users running their Bitcoin nodes over Tor. This information can include the data sent between nodes or even the location of the nodes within the network topology.

Other efforts have also been made to improve the anonymity of Bitcoin. We classify these efforts into two main classes: mixing services and joint transactions.

VII-A Bitcoin Mixing Services

BTC mixing is an approach that mixes identifiable BTC in an effort to make them unrecognizable by public observers. The first generation mixing was centralized and performed by tumblers. Tumblers are third party mixers that receive BTC from different users, randomly mix them up, and then return to the users their updated BTC amounts. An attacker would no longer be able to trace the BTC of a certain user since the user no longer possesses the same BTC that he/she previously owned. However, a tumbler being a centralized entity presents many threats to the users. It must be fully trusted not to steal the BTC it mixes or even leak any information about the mixing process. Even when completely trusted, being centralized makes it prone to being compromised. In addition to this, tumblers charge users mixing fees in return for their services.

In an effort to mitigate these risks, a new generation of peer-to-peer tumblers was introduced to decentralize the process. Instead of sending BTC to a tumbler that performs mixing, the users themselves are involved in the process. This eliminates the need to completely trust a third party and minimizes the risk of privacy leakage. An example of such a protocol is CoinSwap which is presented in [75].

VII-B Bitcoin Joint Transactions

A joint transaction allows different users to combine the inputs and outputs of their transactions into a single transaction to be processed as a whole. All participating users must provide their own signatures to the transaction to unlock their input portion. Once all participating users correctly sign their inputs, the transaction can be processed as a regular transaction and added to the blockchain. An attacker can no longer trace the BTC movement of a user since there is no direct relationship between the inputs and the outputs of a transaction. The level of privacy provided by a joint transaction increases with the number of participating users. A larger number of participants also lowers the transaction fee paid by each user, as the fee is divided among more users. In 2013, Gregory Maxwell introduced this concept as CoinJoin [76], which is widely used in practice today. CoinJoin has since evolved into multiple flavors. Notable examples that introduced new concepts are described below.

SharedCoin:  SharedCoin, provided by Blockchain.info, is one of the initial implementations of the CoinJoin protocol and ran over a centralized server. The centralized server was the meeting room for the participating users to meet and combine their transactions together. Since users meet in one place, the server is capable of keeping logs of the transactions processed over it. This requires users to completely trust the server not to misuse these logs, and it puts their information at risk if the server is compromised. Shortly afterwards, Kristov Atlas created CoinJoin Sudoku, software that is capable of analyzing the mixing process performed by SharedCoin. The software aims at discovering the relationships between transactions and their owners. It clusters matching inputs and outputs of transactions in an attempt to identify a common owner. Due to these privacy limitations, SharedCoin is completely suspended today.

Dark Wallet:  In 2013, Cody Wilson and Amir Taaki introduced Dark Wallet [77]. It provides anonymity using stealth addresses and the CoinJoin protocol. A stealth address is a public seed address combined with some metadata used to derive an actual address for a payee to receive transactions. The metadata is shared only between the payer and the payee, and cannot be accessed by public observers. To generate an actual address, the payee generates a private key and its corresponding public key. Next, the payer uses the public key of the payee and some metadata to generate a transaction with a new address. Once the payee learns the metadata, it can claim the amount attached to the transaction by deriving the appropriate key from the stealth address. Outside observers are unable to trace a transaction that was received with a stealth address. However, Dark Wallet cannot provide complete anonymity against linking users to certain BTC transactions since the payer can still trace the transaction.

CoinShuffle:  CoinShuffle was introduced in 2014 [78]. It is a combination of the CoinJoin protocol and the accountable anonymous group communication protocol Dissent [79]. Its main purpose is to eliminate the involvement of third parties while achieving anonymity and protection against DoS attacks. The protocol consists of three main phases: announcement, shuffling and transaction verification. In the announcement phase, the participants generate a new pair of private and public keys then broadcast their corresponding public key to the other participants. In the shuffling phase, each participant generates a new Bitcoin address to be used as their output address in the mixing transaction. Following that, the participants obliviously shuffle these generated Bitcoin addresses. In the transaction verification phase, every participant checks whether their Bitcoin address is contained in the output list. If present, each participant creates a mixing transaction that spends the inputs to the shuffled list of outputs, signs the transaction, and broadcasts the signature. Once each participant receives the signatures of the others, every participant can generate a fully signed version of the mixing transaction. Dishonest behavior can be detected by the presence of one honest participant who would not broadcast his/her signature and report the dishonesty to all other participants.

However, CoinShuffle suffers from an anonymity vulnerability if not used cautiously since it allows users to assign change back to themselves in the mixing transaction. Once the change is assigned to the Bitcoin address of the user, anonymity could easily be lost. The best solution to this problem is to use amounts that do not require any change. However, the user does not necessarily get to choose what amount to use since the user must use UTXO(s) from previous transactions. In addition to this, CoinShuffle reveals the identities of the participants to each other during the process.

JoinMarket:  JoinMarket [80] is a decentralized CoinJoin implementation. It aims to improve on the privacy of the previous implementations. JoinMarket introduced two types of participating users: market makers and market takers. Market makers are users who are willing to mix their BTC at any given time in return for a fee. On the other hand, market takers are users who demand immediate mixing service and are willing to pay a fee as compensation to the market makers. Market makers and takers negotiate the service over an Internet Relay Chat (IRC) channel. Once terms are agreed, a mixing contract is generated which enables each participating user to operate from their own personal machine. The fact that the system is decentralized frees users from the need to trust a centralized entity. Furthermore, the fee paid by the takers to the makers incentivizes the makers to continue to join.

Various protocols continue to evolve in an effort to increase the anonymity of Bitcoin. However, the linking problem remains within Bitcoin and could jeopardize the anonymity of its users.
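The core idea of a joint transaction in this section can be sketched in a few lines. This is a simplified model, not the CoinJoin wire format: the `Participant` record, the equal-denomination rule, and the requirement that each input exactly covers the denomination plus a fee share (so that no change output is created) are all assumptions made for the illustration, and signing is omitted.

```python
import random
from dataclasses import dataclass


@dataclass
class Participant:
    input_id: str        # hypothetical identifier of the UTXO being spent
    input_amount: int    # value of that UTXO in satoshis
    output_address: str  # fresh address that receives the mixed coins


def build_joint_transaction(participants, denomination, total_fee, rng=None):
    """Merge all inputs and equal-valued outputs into one unsigned transaction."""
    rng = rng or random.SystemRandom()
    fee_share = -(-total_fee // len(participants))  # ceiling division
    for p in participants:
        if p.input_amount != denomination + fee_share:
            raise ValueError("input must equal denomination plus fee share")
    inputs = [(p.input_id, p.input_amount) for p in participants]
    outputs = [(p.output_address, denomination) for p in participants]
    # Independent shuffles hide which input funds which output.
    rng.shuffle(inputs)
    rng.shuffle(outputs)
    return {"inputs": inputs, "outputs": outputs, "fee_per_user": fee_share}


users = [Participant(f"utxo-{i}", 1_000_250, f"addr-{i}") for i in range(4)]
tx = build_joint_transaction(users, denomination=1_000_000, total_fee=1_000)
```

With four participants the fee of 1,000 satoshis is split into 250 satoshis each, and because all outputs carry the same value an observer cannot match an output to a specific input from the amounts alone.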

VIII Security and Privacy of Altcoins

The continuous emergence of altcoins presents enhanced features to the cryptocurrency enthusiasts. Some of these altcoins have proven to provide enhanced security and privacy over Bitcoin. However, Bitcoin continues to remain at the top of the list of cryptocurrencies with the largest market cap. This contradiction raises questions around its continuous dominance.

In this section, we unfold the major security and privacy advantages of altcoins. We first investigate distinct consensus algorithms implemented by different altcoins in an effort to keep their network secure. We strive to elucidate the security advantages of these algorithms over the proof-of-work implemented by Bitcoin. Next, we discuss major altcoin privacy protocols and privacy improvement over Bitcoin.

VIII-A Altcoin Security

The Proof-of-Work (PoW) implemented in Bitcoin utilizes SHA-256, a CPU-bound function. The time needed to run SHA-256 is determined by the speed of the machine. Powerful machines such as ASICs can run it millions of times faster than CPUs, GPUs, and FPGAs. This created an unfair mining competition since not all miners use the same computing machine. In fact, it eliminated miners using CPUs, GPUs, and FPGAs since their chances of success are negligible when compared to those using ASICs.
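To make the CPU-bound nature of this search concrete, the following sketch brute-forces a nonce until a double SHA-256 digest falls below a target. It is a simplification under stated assumptions: the header is an arbitrary byte string rather than the 80-byte Bitcoin block header, the difficulty is expressed as a number of leading zero bits rather than the compact target encoding, and the function names are hypothetical.

```python
import hashlib


def double_sha256(data: bytes) -> bytes:
    """Apply SHA-256 twice, as Bitcoin does for block hashes."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def mine(header: bytes, difficulty_bits: int, max_nonce: int = 2**32):
    """Search for a nonce whose digest, read as an integer, is below the target."""
    target = 1 << (256 - difficulty_bits)
    for nonce in range(max_nonce):
        digest = double_sha256(header + nonce.to_bytes(4, "little"))
        if int.from_bytes(digest, "big") < target:
            return nonce, digest
    raise RuntimeError("no valid nonce found in the search range")


# A toy difficulty of 12 bits needs about 2**12 attempts on average.
nonce, digest = mine(b"example block header", difficulty_bits=12)
```

Each extra difficulty bit doubles the expected number of hash evaluations, which is why raw hashing speed, and therefore specialized hardware, dominates the competition.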

This PoW has also been greatly criticized for being an energy-wasting technique. Mining is performed using powerful computing machines that require substantial energy to run. Most of the energy used by all these miners ends up being wasted since the output of only one miner is used to extend the blockchain. As a result, running this PoW to achieve consensus is extremely costly.

In addition to this, it is expected that Bitcoin will suffer a mining tragedy of the commons [81]. The mining reward will converge to zero since it halves approximately every four years (precisely every 210,000 blocks). Eventually, the miners will no longer have an incentive to take part in the consensus procedure. This will force the transacting users to increase their transaction fees as an alternative incentive to the miners. As a result, both the users and miners will be driven away from the system.
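The halving schedule can be written down directly. The sketch below assumes the initial reward of 50 BTC and integer satoshi arithmetic with a right shift per halving; the function name `block_subsidy` is chosen for the example.

```python
SATOSHIS_PER_BTC = 100_000_000
INITIAL_SUBSIDY = 50 * SATOSHIS_PER_BTC  # reward of the first blocks, in satoshis
HALVING_INTERVAL = 210_000               # blocks between two halvings


def block_subsidy(height: int) -> int:
    """Mining reward in satoshis for the block at the given height."""
    halvings = height // HALVING_INTERVAL
    if halvings >= 64:
        return 0
    # Integer halving: each interval shifts the reward right by one bit.
    return INITIAL_SUBSIDY >> halvings


rewards = [block_subsidy(h * HALVING_INTERVAL) for h in range(4)]
```

Because the reward is an integer number of satoshis, it does not merely approach zero but reaches it exactly after 33 halvings, at block height 6,930,000. From that point on, transaction fees are the only remaining incentive for miners.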

In an effort to mitigate these issues, some altcoins replaced SHA-256 with memory-bound hash functions in their PoW. In comparison to the CPU-bound function used by Bitcoin, the time needed to run memory-bound functions is determined by the amount of memory available to hold the processed data. Developing ASICs for memory-bound functions was expected to be far less advantageous since ASICs primarily optimize CPU-bound functions. Notable examples of such functions include scrypt [82] and CryptoNight [83]. Combinations of hashing algorithms have also been used, such as X11 [84] and X12-X17 [85]. In these algorithms, 11 to 17 different hashing algorithms are chained: the result of each sub-algorithm is fed as input to the next sub-algorithm. Popular altcoins such as Litecoin, DASH [86], and Monero implement such examples. However, it was not long before optimized memory-bound ASICs started monopolizing the mining process once again.

Attempts to develop an ASIC-resistant PoW have therefore not succeeded. Altcoin developers began to divert their efforts to alternative consensus algorithms that strive to mitigate ASIC centralization and prevent critical issues such as double-spending attacks. Similar to PoW, many of these alternative consensus protocols are chain-based. A chain-based protocol pseudo-randomly selects a single validator to generate the next block of the blockchain. Some widely implemented chain-based consensus protocols are described below.

Proof-of-Stake (PoS):  PoS is an alternative consensus algorithm that was initially suggested in [87]. In contrast to PoW, PoS is dependent on economic stakes of users (i.e. holdings in cryptocurrency) rather than their computational resources. The algorithm deterministically selects a user with significant holdings to validate the next block. In return, the selected validator is rewarded a certain value of the cryptocurrency similar to the mining reward in PoW and all the transaction fees included in the block. Conceptually, a user holding $x\%$ of the total available cryptocurrency will be chosen $x\%$ of the time as the validator in generating the next block. Once the block is generated, the validator relays it to the other validators to confirm it and extend the blockchain.
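The stake-proportional selection rule can be sketched as follows. This is an illustrative model rather than the algorithm of any specific cryptocurrency: the use of the previous block hash as the source of pseudo-randomness, the `stakes` dictionary, and the function name `select_validator` are assumptions.

```python
import hashlib


def select_validator(stakes: dict, previous_block_hash: bytes) -> str:
    """Pick a validator with probability proportional to the stake held."""
    total = sum(stakes.values())
    # Deterministic "lottery ticket" in [0, total) derived from the previous block.
    ticket = int.from_bytes(hashlib.sha256(previous_block_hash).digest(), "big") % total
    for user in sorted(stakes):
        ticket -= stakes[user]
        if ticket < 0:
            return user
    raise RuntimeError("unreachable: ticket is always below the total stake")


stakes = {"alice": 60, "bob": 30, "carol": 10}  # hypothetical holdings
validator = select_validator(stakes, b"hash of the previous block")
```

Every node that knows the stake distribution and the previous block computes the same validator, and over many blocks a user holding 60% of the stake is selected in roughly 60% of the rounds.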

PoS has multiple benefits in comparison to PoW. Users are no longer required to consume substantial quantities of electricity since they no longer engage in a mining process. In fact, they are motivated to take part in the validation process as it requires nothing more than presenting their wealth in return for a reward if chosen to be the validator. In contrast to PoW, PoS significantly speeds up the consensus process. From a security perspective, PoS tackles the 51% attack by making it more expensive than performing it in a PoW environment. An attacker would need to possess 51% of the total cryptocurrency available to perform the majority attack. Assuming a single user possesses 51% of the total cryptocurrency and performs the attack, the value of the cryptocurrency will drop and the attacker, being the majority stakeholder, would suffer the most. In PoW, by comparison, the majority attack requires 51% of the total mining power, which is theoretically achievable through mining pools. Such a situation previously arose in the mining environment of Bitcoin when a mining pool (Ghash.IO) exceeded the 51% threshold.

Although PoS could handle some issues caused by PoW, it also introduced some major challenges. The largest stakeholders will be able to monopolize the consensus procedure as they will be selected most often and earn the reward. This will create a centralized consensus environment. In addition to this, an attacker with a 51% stake can also completely destroy the cryptocurrency, assuming the intentions of the attacker are to eradicate the system at any price. PoS also suffers from a major flaw known as Nothing at Stake (NoS). This issue can occur if two stakeholders are coincidentally chosen to validate the next block. This may result in two valid blocks that can extend the blockchain. As a result, the blockchain may fork as the validators accept both blocks. To resolve the fork, the validators vote on both branches. Voting is done at no cost, which may be an incentive for a malicious validator to vote for a specific branch of the blockchain and facilitate a double-spending attack.

These issues led PoS to appear in multiple flavors. Its first implementation appeared in a Bitcoin fork, namely Peercoin, which incorporates a hybrid of an energy-efficient PoS [88] and the original PoW that runs SHA-256. PoW was used initially as a method of coin generation and distribution to get the system running. As time progressed, PoW was slowly replaced by PoS to validate transactions, mint new coins and maintain consensus. The validators are chosen based on the number of coins in their possession and their corresponding age (i.e. a timestamp indicating how old the coins are). Once they are granted a reward in return for their service, the age of their coins goes back to zero to give other validators a chance to generate the next block. In this way, no single validator can monopolize the validation process.

Later, modified versions of PoS were implemented into some cryptocurrencies. In [89], the age of the coins was removed as it was argued to be abusive to the system. It can help gain significant network weight and facilitate a double-spending attack. In some cases, it may also discourage honest users from staking persistently as they would hold back until their coins are oldest in age to maximize their chances.

In [90], a delegated proof-of-stake (DPoS) was proposed where the users vote for validators (referred to as witnesses). Each vote has a different strength based on the stake of the user. However, this requires users to completely trust the validators they vote for.

Proof-of-Activity (PoA):  PoA is a consensus algorithm that combines PoW and PoS into one protocol [91]. Its purpose is to reward only the online participants, thus motivating more miners to remain online in an effort to secure the network. The protocol is analogous to a lottery where the chances of an individual winning are based on the number of tickets the individual holds.

In PoA, miners first utilize their computational power to compete in generating an empty block header; one that does not reference any transactions. A successful miner then immediately broadcasts the resulting hash to the network. This hash value is used to deterministically derive pseudo-random stakeholders who are potential miners if found to be online. The derivation of these stakeholders is performed by hashing a concatenation of the broadcast hash value, the hash of the previous block, and fixed suffix values. The protocol then invokes a subroutine known as follow-the-satoshi once for each derived value. The subroutine finds the block storing a satoshi with the same index as the result. Next, it inspects the block in which the satoshi was minted and traces its movement up until its last owner. If online, this owner participates in the next block generation process that extends the blockchain. Similar to PoS, the more satoshis an individual owns, the more likely that the individual will be selected randomly in this process.
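The stakeholder derivation above can be sketched as follows. This is a simplified illustration under stated assumptions: SHA-256 is used as the hash, the suffixes are the integers 0, 1, 2, ..., and ownership is looked up in a table of current balances instead of tracing each satoshi from the block in which it was minted. The names `derive_index`, `follow_the_satoshi` and `select_stakeholders` are hypothetical.

```python
import hashlib
from bisect import bisect_right
from itertools import accumulate


def derive_index(header_hash: bytes, prev_hash: bytes, suffix: int, supply: int) -> int:
    """Hash the broadcast hash, the previous block hash and a fixed suffix
    into a pseudo-random satoshi index in the range [0, supply)."""
    data = header_hash + prev_hash + suffix.to_bytes(4, "big")
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % supply


def follow_the_satoshi(index: int, balances: dict) -> str:
    """Return the current owner of the satoshi with the given index.

    Assumption: satoshis are numbered consecutively in the insertion order
    of `balances`, which stands in for tracing the coin through the chain.
    """
    owners = list(balances)
    bounds = list(accumulate(balances[o] for o in owners))
    return owners[bisect_right(bounds, index)]


def select_stakeholders(header_hash, prev_hash, balances, count, online):
    """Derive `count` stakeholders and keep only those that are online."""
    supply = sum(balances.values())
    chosen = []
    for suffix in range(count):
        owner = follow_the_satoshi(
            derive_index(header_hash, prev_hash, suffix, supply), balances)
        if owner in online:
            chosen.append(owner)
    return chosen


balances = {"alice": 3, "bob": 2}
picked = select_stakeholders(b"header", b"prev", balances, 3, {"alice", "bob"})
```

Since every satoshi index is equally likely, an owner holding more satoshis covers a larger share of the index range and is selected proportionally more often.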

Every stakeholder then checks the validity of the empty block header that was initially broadcast. Using this value, they also check whether they were one of the selected validators. All but the last of the selected stakeholders sign the hash of the empty block header with the private key that controls the satoshi derived by the follow-the-satoshi subroutine, and then broadcast their signatures to the network. The last selected stakeholder then generates a wrapped block that extends the empty block header by including the desired transactions to be verified, the collected signatures, and his/her own signature for this block. The wrapped block is finally broadcast to the network to extend the blockchain. The transaction fees that this stakeholder collects from the included transactions are shared between the miner and the participating stakeholders.

From a security perspective, PoA makes the 51% attack more difficult than PoW and PoS since a large computational power and a significant stake are both required in PoA.

Proof-of-Burn (PoB):  PoB is an algorithm that achieves consensus by burning a portion of a cryptocurrency. Burning a portion of a cryptocurrency means generating a transaction that sends this portion to an address that is inaccessible to all users. The concept of burning is analogous to buying expensive computational hardware in PoW.

In general, a miner burns portions of his/her holdings and waits a certain period of time. This time ensures that it is impractical for an attacker to undo the transaction. After waiting, the transaction is permanently stored in the blockchain and becomes visible to all observers. This is proof that the potential miner has invested a portion of his/her holdings and is worthy of being a miner. Honest miners will burn portions of their holdings that are less than or equivalent to what they can return in the mining process if successful. In other words, if miners burn more than what they are expected to return in a successful mining process, they will spend more than what they earned, hence a loss.

The potential miners then create candidate blocks in an effort to extend the blockchain. By referencing their burn transactions in the blockchain, they can prove that they have burnt some of their holdings earlier, and thus become accepted by the community as miners. The winning block that extends the blockchain is chosen by selecting the miner who has burnt the most after a certain period of time.
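A minimal sketch of this selection rule is given below. It assumes that each burn is recorded as a (miner, amount, block height) tuple, that a burn only counts once it is at least `maturity` blocks old, and that ties are broken alphabetically; all names and the maturity parameter are hypothetical and serve only to illustrate the idea.

```python
from collections import defaultdict


def matured_burns(burns, current_height, maturity):
    """Sum, per miner, the burnt amounts that are old enough to count."""
    totals = defaultdict(int)
    for miner, amount, height in burns:
        if current_height - height >= maturity:
            totals[miner] += amount
    return dict(totals)


def pick_winner(burns, current_height, maturity):
    """Return the miner with the largest matured burn, or None if none exist."""
    totals = matured_burns(burns, current_height, maturity)
    if not totals:
        return None
    # Iterating in sorted order makes ties resolve alphabetically.
    return max(sorted(totals), key=totals.get)


burns = [("alice", 5, 100), ("bob", 8, 100), ("carol", 50, 199)]
winner = pick_winner(burns, current_height=200, maturity=10)
```

Here the largest burn is too recent to count, so the block is awarded to the miner with the largest burn that has already matured.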

From a security perspective, this algorithm can achieve the same security as its predecessor algorithms. It requires a miner to perform an expensive task (burning) that is easily verified by all other participants observing the blockchain. Similar to PoS, it saves the miners the hassle of buying hardware to physically perform mining.

PoB is also known for its use in bootstrapping new cryptocurrencies. A new cryptocurrency can mint its coins by utilizing PoB. Rather than releasing newly minted coins during the mining process as in PoW, an existing cryptocurrency can be burnt to mint coins of the new cryptocurrency. For example, a new cryptocurrency can mint coins by burning BTC.

Whether it is used to maintain consensus of a network or to bootstrap an emerging cryptocurrency, PoB has been criticized for its permanent coin destruction. This is a more critical issue for cryptocurrencies with a limited supply. The more PoB is utilized, the smaller the quantity of the cryptocurrency in circulation. Such a shrinking supply can significantly inflate the value of the remaining coins, which can ultimately result in the destruction of the cryptocurrency.

In contrast to the chain-based algorithms discussed previously, some alternatives are based on Byzantine Fault Tolerance (BFT) algorithms. In these algorithms, consensus on a block is independent of the chain. The algorithms utilize a multi-round process where every validator sends a vote for some specific block during each round. At the end of this process, the validators reach an agreement on whether to permanently accept a given block. These protocols can be somewhat more centralized since the validators work together to maintain consensus by handling each block individually. For further reading, readers are referred to examples such as the Ripple consensus protocol [92] and the Stellar consensus protocol [93].

VIII-B Altcoin Privacy

Privacy is one of the most important issues of cryptocurrencies. Some notable altcoin privacy protocols are described below.

Zerocoin:  The Zerocoin protocol [94] generates anonymous coins that can be exchanged for other cryptocurrencies (e.g. BTC) when mixing is desired. Cryptocurrencies that embed the protocol run in parallel with Bitcoin, utilizing its blockchain. A user who wishes to mix BTC purchases this cryptocurrency through a mint transaction. A mint transaction trades the specified amount of the cryptocurrency for its corresponding value in anonymous coins. Once bought, the user can convert his/her anonymous coins back into BTC through a spend transaction. The spend transaction trades the anonymous coins back to the original cryptocurrency. The mint transaction and the spend transaction are designed to be uncorrelated. After a spend transaction, the user ends up with a different set of BTC than the ones used in the mint transaction. In comparison to the mixing techniques of Bitcoin discussed previously, Zerocoin eliminates the need for tumblers. It relies on a combination of digital commitments, one-way accumulators, zero-knowledge proofs and the existing Bitcoin platform.

In a mint transaction, the user first generates a random serial number and cryptographically commits (i.e. encrypts) it into a coin using a randomly chosen key. The purpose of cryptographically committing the serial number is to hide its value from all the other users while binding it to its owner. The user then generates a Bitcoin transaction with the appropriate amount to pay for the coin (i.e. the BTC to be mixed) and releases both of them into the network. The miners place the coin into a one-way accumulator and mine the Bitcoin transaction into the blockchain. The transaction is not addressed to anyone and its value remains locked in the blockchain until it is redeemed by another user in a spend transaction. At this point, the user possesses anonymous coins equivalent to the amount of BTC that is to be mixed.
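The commitment step can be illustrated with a Pedersen-style commitment, in which the coin is computed from the serial number and the random key. The sketch below is only a toy: the group parameters `P`, `Q`, `G` and `H` are tiny values chosen for readability and offer no security, the function names are hypothetical, and a real deployment would use a large group in which the discrete logarithm of `H` with respect to `G` is unknown. The accumulator and the zero-knowledge proof are not modeled.

```python
import secrets

# Toy parameters (assumption, insecure): P = 2*Q + 1, and G, H both have order Q.
P, Q, G, H = 23, 11, 4, 9


def commit(serial: int, key: int) -> int:
    """Pedersen-style commitment: G^serial * H^key mod P."""
    return (pow(G, serial, P) * pow(H, key, P)) % P


def mint():
    """Create a coin by committing to a random serial number."""
    serial = secrets.randbelow(Q)
    key = secrets.randbelow(Q)
    return commit(serial, key), serial, key


def opens_to(coin: int, serial: int, key: int) -> bool:
    """Check that (serial, key) is a valid opening of the coin."""
    return coin == commit(serial, key)


coin, serial, key = mint()
valid = opens_to(coin, serial, key)
```

Only the coin value would be published in the mint transaction; the serial number and the key stay with the user until the coin is spent.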

To convert the anonymous coins back to BTC, the user exchanges his/her anonymous coins for BTC that other users have locked on the blockchain. The user first provides a zero-knowledge proof that he/she knows a coin in the one-way accumulator. Next, the user must prove that his/her key and serial number correspond to that coin. The miners then verify the proof and check that the serial number has not been previously spent. Once verified, the spend transaction is mined into the blockchain, granting the user an amount of BTC equivalent to that spent in the corresponding mint transaction. The user ends up with fresh BTC that he/she has never possessed.

However, the Zerocoin protocol has a few limitations. From the perspective of the system, Bitcoin must be soft-forked to account for the changes of the protocol. From the perspective of the protocol, the zero-knowledge proofs are large and would eventually bloat the blockchain. The process is also time-consuming, so transactions take longer to be accepted by the system. Most importantly, it requires a trusted party to initialize the one-way accumulator.

Shortly after the release of Zerocoin, Zerocash [95], known today as Zcash, was introduced in an effort to reduce the cost of the zero-knowledge proof. Unlike the Zerocoin protocol, this project did not require a soft fork of Bitcoin. In fact, it was a standalone technology that implemented its own cryptocurrency. It utilizes a smaller zero-knowledge proof, known as a zk-SNARK [96], that takes less time to compute.

PrivateSend:  PrivateSend is an altcoin joint transaction protocol [97] that combines identical inputs from various users into one transaction with multiple outputs. A user initially contacts a random master node and requests that a certain amount of coins be mixed in specific denominations. The master node then announces that it is willing to accept other coins of identical quantities and denominations to be mixed into a transaction. Once the master node receives enough requests, the involved users specify the full list of inputs and outputs they wish to be mixed. The inputs specify the coins to be mixed, while the outputs specify the addresses where the users wish to receive the mixed coins. The master node then puts all inputs and outputs into a joint transaction and sends it to the involved users. The users validate the transaction, sign their inputs and return it to the master node. The master node finally broadcasts the transaction to the network, where it is treated as any other transaction. However, PrivateSend is not a completely decentralized protocol and can jeopardize the anonymity of the user since it involves a centralized node.
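The flow above can be sketched as a small simulation. It assumes that every input and output carries the same denomination and replaces signing with a simple approval check; the names `MixRequest`, `build_joint_transaction` and `user_approves` are hypothetical and do not correspond to the actual PrivateSend implementation.

```python
import random
from dataclasses import dataclass


@dataclass
class MixRequest:
    user: str
    inputs: list   # identifiers of the coins to be mixed
    outputs: list  # fresh addresses that should receive the mixed coins


def build_joint_transaction(requests, denomination, rng):
    """Master node step: merge all requests and shuffle inputs and outputs."""
    inputs = [(coin, denomination) for r in requests for coin in r.inputs]
    outputs = [(addr, denomination) for r in requests for addr in r.outputs]
    rng.shuffle(inputs)
    rng.shuffle(outputs)
    return {"inputs": inputs, "outputs": outputs}


def user_approves(request, tx, denomination):
    """User step: approve only if own coins and addresses appear as requested."""
    paid = {addr for addr, value in tx["outputs"] if value == denomination}
    spent = {coin for coin, _ in tx["inputs"]}
    return (set(request.outputs) <= paid
            and set(request.inputs) <= spent
            and len(request.inputs) == len(request.outputs))


requests = [
    MixRequest("u1", ["coin1"], ["addr1"]),
    MixRequest("u2", ["coin2"], ["addr2"]),
    MixRequest("u3", ["coin3"], ["addr3"]),
]
tx = build_joint_transaction(requests, denomination=10, rng=random.Random(7))
approvals = [r.user for r in requests if user_approves(r, tx, 10)]
ready_to_broadcast = len(approvals) == len(requests)
```

Because all outputs have the same value and their order is shuffled, an outside observer of the joint transaction cannot match an output to a specific input. The master node, however, sees every request, which is the centralization concern noted above.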

CryptoNote:  CryptoNote is a privacy-preserving protocol embedded in some cryptocurrency implementations that strives to hide the connection between a sender and receiver from the rest of the network [83]. The protocol protects the identity of the sender by utilizing ring signatures [98] when signing transactions. The public key of the sender is shuffled with the public keys of other senders, giving all keys equal probability of being linked to a transaction. In this way, an attacker has no way of identifying the private key used during transaction signing, and hence cannot identify the sender. The protocol also generates a unique public key for the receiver with each new incoming transaction. Using random data generated by the sender and the public key of the receiver, a one-time unique pair of private and public keys is generated via Diffie-Hellman key exchange [99]. These keys are used by the receiver to claim the transaction output.

Unlike Bitcoin, the blockchains used by the cryptocurrencies running CryptoNote do not reveal the information of transactions, hence improving anonymity. As a result, verifying transactions becomes a challenge. To handle this issue, a modified version of the original traceable ring signature [100] is utilized. The original scheme makes it possible to trace transactions sent by the same sender if they contain the same tag and are signed by the same private key. The modified scheme, referred to as a one-time ring signature, replaces the tag with a key image. The key image is deterministically derived by applying a cryptographic hash function to the private key, allowing each sender to generate only one valid signature using his/her private key. If the sender tries to generate two different signatures with the same private key, a link will be detected. Therefore, this scheme counters any double-spending attempts since the blockchain will only store one signature while invalidating all others.
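The double-spending check based on key images can be sketched as follows. The example follows the simplified description above and derives the key image by hashing the private key with SHA-256; the actual CryptoNote construction works over elliptic-curve points, and the ring signature verification is omitted entirely. The names `key_image` and `Ledger` are hypothetical.

```python
import hashlib


def key_image(private_key: bytes) -> str:
    """Deterministic tag derived from the one-time private key (simplified)."""
    return hashlib.sha256(b"key-image:" + private_key).hexdigest()


class Ledger:
    """Keeps the set of key images already seen on the blockchain."""

    def __init__(self):
        self.seen = set()

    def accept(self, image: str) -> bool:
        """Accept a signature only if its key image has not appeared before."""
        if image in self.seen:
            return False  # same private key used twice: double-spend attempt
        self.seen.add(image)
        return True


ledger = Ledger()
first = ledger.accept(key_image(b"one-time-key-A"))
second = ledger.accept(key_image(b"one-time-key-A"))  # reuse of the same key
other = ledger.accept(key_image(b"one-time-key-B"))
```

The verifier never learns which ring member signed; it only learns that two signatures share a key image, which is enough to reject the second one.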

IX Conclusions and Future Research Directions

After presenting our extensive survey, we recap the lessons learned and the future research directions that can be derived from this study.

In this survey, we strived to address major technical concerns regarding the future stability of Bitcoin. We first introduced the background of Bitcoin and explicated its major building blocks and protocols. The main purpose of our extensive background was to educate our readers about the blockchain technology using Bitcoin as a use case.

Next, we delved into crucial security concerns. We began by discussing the double-spending attack and analyzed its probability of success. We showed that this probability can be modeled using two different probabilistic models that yield similar outcomes. Using this analysis, we further evaluated the profitability of the attack. We showed that attackers with less than half of the total computational power of the system will eventually lose while performing the attack. The main lesson learned is that there will always be a trade-off between the waiting time before accepting a transaction and the possibility of the transaction being reversed. Users should recognize this trade-off and use the current systems with these risks in mind.

Following that, we explored the major network-related security issues of the underlying peer-to-peer network. Our discussion showed that these network attacks are inevitable since it is impossible to restrict malicious nodes from connecting to the network. We further explored storage security by investigating the wallet infrastructures and the different modes of storage. Our analysis shows that there is also a trade-off between storage security and practicality: the more user-friendly a wallet is, the greater the risk of losing the coins it holds. Therefore, users must take all precautions in order to protect their funds.

Beyond the security issues that Bitcoin suffers from, we investigated some privacy limitations inherent to the system. We debunked the misconception of Bitcoin anonymity and reviewed major methods for privacy protection. Currently, systems similar to Bitcoin continue to suffer from privacy issues that put the privacy of their users at risk.

Finally, in the last section, we introduced the readers to some emerging protocols that have been implemented in some altcoins. The main purpose of these protocols is to enhance security and privacy. However, our discussion showed that even these emerging protocols do not yet provide systems good enough to completely replace the current centralized systems.

While blockchain offers many intriguing features, most importantly decentralization, it has also introduced new research challenges. Future research must find ways to address these concerns in order for blockchain to become stable. Based on our survey, we encourage future research to expand on our mining profitability analysis in order to help users make better decisions before utilizing such systems. We believe that in order for systems such as Bitcoin to attract mass adoption, users must clearly understand the risks and gain a certain degree of confidence.

We also believe that extensive research is required to enhance security and privacy protocols. Typical peer-to-peer network attacks and privacy concerns such as the ones discussed in this paper can be used to disrupt the stability of blockchain systems. Currently, the evolving solutions enhance security and/or privacy only slightly, and usually at a cost that keeps users skeptical about using such systems. However, as research advances in these fields, cryptocurrencies may help revolutionize the payment system as we know it today.

References

  • [1] S. Nakamoto, “Bitcoin: A peer-to-peer electronic cash system.” https://bitcoin.org/bitcoin.pdf, 2008.
  • [2] S. Haber and W. S. Stornetta, “How to time-stamp a digital document,” in Conference on the Theory and Application of Cryptography, pp. 437–455, Springer, 1990.
  • [3] D. Bayer, S. Haber, and W. S. Stornetta, “Improving the efficiency and reliability of digital time-stamping,” in Sequences II, pp. 329–334, Springer, 1993.
  • [4] S. Haber and W. S. Stornetta, “Secure names for bit-strings,” in Proceedings of the 4th ACM Conference on Computer and Communications Security, pp. 28–35, ACM, 1997.
  • [5] H. Massias, X. S. Avila, and J.-J. Quisquater, “Design of a secure timestamping service with minimal trust requirement,” in the 20th Symposium on Information Theory in the Benelux, Citeseer, 1999.
  • [6] G. Karame, E. Androulaki, and S. Capkun, “Two bitcoins at the price of one? double-spending attacks on fast payments in bitcoin.,” IACR Cryptology ePrint Archive, vol. 2012, no. 248, 2012.
  • [7] M. Crosby, P. Pattanayak, S. Verma, and V. Kalyanaraman, “Blockchain technology: Beyond bitcoin,” Applied Innovation, vol. 2, pp. 6–10, 2016.
  • [8] L. Dashjr, “BIP process, revised.” https://github.com/bitcoin/bips/wiki/Comments:BIP-0002, 2016. [Online; accessed 13-November-2017].
  • [9] G. Hileman and M. Rauchs, “Global cryptocurrency benchmarking study,” Cambridge Centre for Alternative Finance, 2017.
  • [10] “Coin Market Capital.” https://coinmarketcap.com/, ©2018 CoinMarketCap.
  • [11] J. Bonneau, A. Miller, J. Clark, A. Narayanan, J. A. Kroll, and E. W. Felten, “Sok: Research perspectives and challenges for bitcoin and cryptocurrencies,” in Security and Privacy (SP), 2015 IEEE Symposium on, pp. 104–121, IEEE, 2015.
  • [12] F. Tschorsch and B. Scheuermann, “Bitcoin and beyond: A technical survey on decentralized digital currencies,” IEEE Communications Surveys & Tutorials, vol. 18, no. 3, pp. 2084–2123, 2016.
  • [13] M. Conti, C. Lal, S. Ruj, et al., “A survey on security and privacy issues of bitcoin,” arXiv preprint arXiv:1706.00916, 2017.
  • [14] M. C. K. Khalilov and A. Levi, “A survey on anonymity and privacy in bitcoin-like digital cash systems,” IEEE Communications Surveys & Tutorials, 2018.
  • [15] D. Chaum, “Blind signatures for untraceable payments,” in Advances in cryptology, pp. 199–203, Springer, 1983.
  • [16] D. Chaum, A. Fiat, and M. Naor, “Untraceable electronic cash,” in Proceedings on Advances in cryptology, pp. 319–327, Springer-Verlag New York, Inc., 1990.
  • [17] W. Dai, “b-money.” http://www.weidai.com/bmoney.txt, 1998. [Online; accessed 13-November-2017].
  • [18] M.-C. Frunza, “Solving modern crime in financial markets: Analytics and case studies,” ch. Cryptocurrencies: A New Monetary Vehicle, pp. 39–75, Academic Press, 2016.
  • [19] “Second life - virtual worlds, virtual reality, vr, avatars, free 3d chat.” http://secondlife.com/, 2017. [Online; accessed 13-November-2017].
  • [20] “Create virtual experiences — linden lab.” https://www.lindenlab.com/, 2017. [Online; accessed 13-November-2017].
  • [21] R. C. Merkle, “A digital signature based on a conventional encryption function,” in Conference on the Theory and Application of Cryptographic Techniques, pp. 369–378, Springer, 1987.
  • [22] D. R. L. Brown, “SEC 2: Recommended elliptic curve domain parameters,” tech. rep., Certicom Research, 2010.
  • [23] J. R. Douceur, “The sybil attack,” in International Workshop on Peer-to-Peer Systems, pp. 251–260, Springer, 2002.
  • [24] M. Rosenfeld, “Analysis of bitcoin pooled mining reward systems,” arXiv preprint arXiv:1112.4980, 2011.
  • [25] S. Pool, “Reward system.” https://slushpool.com/help/manual/rewards, 2017. [Online; accessed 12-December-2017].
  • [26] T. B. C. developers, “bitcoin/bitcoin.” https://github.com/bitcoin/bitcoin, 2017. [Online; accessed 26-March-2017].
  • [27] A. Loibl, “Namecoin,” Network Architectures and Services, vol. 107, 2014.
  • [28] J. Hilliard, “Reduced threshold segwit masf.” https://github.com/bitcoin/bips/wiki/Comments:BIP-0091, 2017. [Online; accessed 13-November-2017].
  • [29] P. Wuille, “Segregated witness and its impact on scalability,” in SF Bitcoin Devs Seminar, 2015.
  • [30] H. Finney, “Best practice for fast transaction acceptance - how high is the risk?” https://bitcointalk.org/index.php?topic=3441.msg48384#msg48384, Feb. 2011.
  • [31] K. Sigman, “Gambler’s Ruin Problem.” http://www.columbia.edu/~ks20/FE-Notes/4700-07-Notes-GR.pdf.
  • [32] A. P. Ozisik and B. N. Levine, “An explanation of nakamoto’s analysis of double-spend attacks,” arXiv preprint arXiv:1701.03977, 2017.
  • [33] M. Rosenfeld, “Analysis of hashrate-based double spending,” arXiv preprint arXiv:1402.2009, 2014.
  • [34] “Mining hardware comparison.” https://en.bitcoin.it/wiki/Mining_hardware_comparison, 2017. [Online; accessed 10-October-2017].
  • [35] “Non-specialized hardware comparison.” https://en.bitcoin.it/wiki/Non-specialized_hardware_comparison, 2017. [Online; accessed 10-October-2017].
  • [36] Bitmain, “BIP process, revised.” https://www.antpool.com/, 2017. [Online; accessed 9-October-2017].
  • [37] B. China, “Bttc.” https://www.btcc.com/, 2011-2017. [Online; accessed 9-October-2017].
  • [38] L. HK Bixin Network Technology Co., “Bixin.” https://bixin.com/. [Online; accessed 9-October-2017].
  • [39] Bitmain, “Btc.com.” https://btc.com/, 2017. [Online; accessed 9-October-2017].
  • [40] Btctop, “btc.top.” http://btc.top/, 2017. [Online; accessed 9-October-2017].
  • [41] “U.S. energy information administration.” https://www.eia.gov/electricity/monthly/epm_table_grapher.php?t=epmt_5_6_a. [Online; accessed 9-October-2017].
  • [42] B. Johnson, A. Laszka, J. Grossklags, M. Vasek, and T. Moore, “Game-theoretic analysis of ddos attacks against bitcoin mining pools,” in International Conference on Financial Cryptography and Data Security, pp. 72–86, Springer, 2014.
  • [43] M. Vasek, M. Thornton, and T. Moore, “Empirical analysis of denial-of-service attacks in the bitcoin ecosystem,” in International Conference on Financial Cryptography and Data Security, pp. 57–71, Springer, 2014.
  • [44] E. Heilman, A. Kendler, A. Zohar, and S. Goldberg, “Eclipse attacks on bitcoin’s peer-to-peer network.,” in USENIX Security Symposium, pp. 129–144, 2015.
  • [45] M. Castro, P. Druschel, A. Ganesh, A. Rowstron, and D. S. Wallach, “Secure routing for structured peer-to-peer overlay networks,” ACM SIGOPS Operating Systems Review, vol. 36, no. SI, pp. 299–314, 2002.
  • [46] A. Singh, T. wan Ngan, P. Druschel, and D. S. Wallach, “Eclipse attacks on overlay networks: Threats and defenses,” in IEEE INFOCOM, Citeseer, 2006.
  • [47] E. Sit and R. Morris, “Security considerations for peer-to-peer distributed hash tables,” Peer-to-Peer Systems, pp. 261–269, 2002.
  • [48] M. Apostolaki, A. Zohar, and L. Vanbever, “Hijacking bitcoin: Routing attacks on cryptocurrencies,” arXiv preprint arXiv:1605.07524, 2016.
  • [49] Y. Rekhter, T. Li, and S. Hares, “A border gateway protocol 4 (bgp-4),” tech. rep., 2005.
  • [50] A. M. Antonopoulos, Mastering Bitcoin: unlocking digital cryptocurrencies. O’Reilly Media, Inc., 2014.
  • [51] P. Wuille, “Hierarchical deterministic wallets.” https://github.com/bitcoin/bips/wiki/Comments:BIP-0032, 2012. [Online; accessed 13-November-2017].
  • [52] M. Palatinus, P. Rusnak, A. Voisine, and S. Bowe, “Mnemonic code for generating deterministic keys.” https://github.com/bitcoin/bips/wiki/Comments:BIP-0039, 2013. [Online; accessed 13-November-2017].
  • [53] G. Andresen, “Pay to script hash.” https://github.com/bitcoin/bips/wiki/Comments:BIP-0016, 2012. [Online; accessed 13-November-2017].
  • [54] M. Caldwell and A. Voisine, “Passphrase-protected private key.” https://github.com/bitcoin/bips/wiki/Comments:BIP-0038, 2012. [Online; accessed 13-November-2017].
  • [55] J. Daemen and V. Rijmen, The design of Rijndael: AES-the advanced encryption standard. Springer Science & Business Media, 2013.
  • [56] Coinbase, “Coinbase - buy/sell digital currency.” https://www.coinbase.com, 2018.
  • [57] Blockchain, “Bitcoin block explorer - blockchain.” https://blockchain.info, 2018.
  • [58] Ledger, “Ledger wallet - hardware wallets - smartcard security for your bitcoins.” https://www.ledgerwallet.com, 2018.
  • [59] SatoshiLabs, “Trezor bitcoin wallet — the original and most secure hardware wallet.” https://trezor.io, 2018.
  • [60] “Bitcoin Developer Guide.” https://bitcoin.org/en/developer-guide#wallets, 2017. [Online; accessed 13-November-2017].
  • [61] A. Narayanan and V. Shmatikov, “De-anonymizing social networks,” in Security and Privacy, 2009 30th IEEE Symposium on, pp. 173–187, IEEE, 2009.
  • [62] D. J. Crandall, L. Backstrom, D. Cosley, S. Suri, D. Huttenlocher, and J. Kleinberg, “Inferring social ties from geographic coincidences,” Proceedings of the National Academy of Sciences, vol. 107, no. 52, pp. 22436–22441, 2010.
  • [63] R. Puzis, D. Yagil, Y. Elovici, and D. Braha, “Collaborative attack on internet users’ anonymity,” Internet Research, vol. 19, no. 1, pp. 60–77, 2009.
  • [64] A. Korolova, R. Motwani, S. U. Nabar, and Y. Xu, “Link privacy in social networks,” in Proceedings of the 17th ACM conference on Information and knowledge management, pp. 289–298, ACM, 2008.
  • [65] Y. Altshuler, N. Aharony, Y. Elovici, A. Pentland, and M. Cebrian, “Stealing reality: when criminals become data scientists (or vice versa),” in Security and Privacy in Social Networks, pp. 133–151, Springer, 2013.
  • [66] F. Reid and M. Harrigan, “An analysis of anonymity in the bitcoin system,” in Security and privacy in social networks, pp. 197–223, Springer, 2013.
  • [67] S. Meiklejohn, M. Pomarole, G. Jordan, K. Levchenko, D. McCoy, G. M. Voelker, and S. Savage, “A fistful of bitcoins: characterizing payments among men with no names,” in Proceedings of the 2013 conference on Internet measurement conference, pp. 127–140, ACM, 2013.
  • [68] E. Androulaki, G. O. Karame, M. Roeschlin, T. Scherer, and S. Capkun, “Evaluating user privacy in bitcoin,” in International Conference on Financial Cryptography and Data Security, pp. 34–51, Springer, 2013.
  • [69] R. Dingledine, N. Mathewson, and P. Syverson, “Tor: The second-generation onion router,” tech. rep., Naval Research Lab Washington DC, 2004.
  • [70] J. Ren and J. Wu, “Survey on anonymous communications in computer networks,” Computer Communications, vol. 33, pp. 420–431, March 2010.
  • [71] A. Biryukov, D. Khovratovich, and I. Pustogarov, “Deanonymisation of clients in bitcoin p2p network,” in Proceedings of the 2014 ACM SIGSAC Conference on Computer and Communications Security, pp. 15–29, ACM, 2014.
  • [72] A. Biryukov and I. Pustogarov, “Bitcoin over Tor isn’t a good idea,” in Security and Privacy (SP), 2015 IEEE Symposium on, pp. 122–134, IEEE, 2015.
  • [73] L. Overlier and P. Syverson, “Locating hidden servers,” in Security and Privacy, 2006 IEEE Symposium on, pp. 15–pp, IEEE, 2006.
  • [74] R. Dingledine, N. Hopper, G. Kadianakis, and N. Mathewson, “One fast guard for life (or 9 months),” in 7th Workshop on Hot Topics in Privacy Enhancing Technologies (HotPETs 2014), 2014.
  • [75] G. Maxwell, “Coinswap: Transaction graph disjoint trustless trading.” https://bitcointalk.org/index.php?topic=321228.0, 2013. [Online; accessed 26-March-2017].
  • [76] G. Maxwell, “Coinjoin: Bitcoin privacy for the real world,” in Post on Bitcoin Forum, 2013.
  • [77] C. Willson and A. Taaki, “Dark wallet.” https://www.darkwallet.is/, 2017. [Online; accessed 9-October-2017].
  • [78] T. Ruffing, P. Moreno-Sanchez, and A. Kate, “Coinshuffle: Practical decentralized coin mixing for bitcoin,” in European Symposium on Research in Computer Security, pp. 345–364, Springer, 2014.
  • [79] H. Corrigan-Gibbs and B. Ford, “Dissent: accountable anonymous group messaging,” in Proceedings of the 17th ACM conference on Computer and communications security, pp. 340–350, ACM, 2010.
  • [80] A. Gibson and C. Belcher, “Joinmarket.” https://github.com/JoinMarket-Org/JoinMarket-Docs/blob/master/High-level-design.md, 2017. [Online; accessed 9-October-2017].
  • [81] G. Hardin, “The tragedy of the commons,” Journal of Natural Resources Policy Research, vol. 1, no. 3, pp. 243–253, 2009.
  • [82] C. Percival, “Stronger key derivation via sequential memory-hard functions,” Self-published, pp. 1–16, 2009.
  • [83] N. v. Saberhagen, “Crypto note v 2.0.” https://cryptonote.org/whitepaper.pdf, 2013.
  • [84] B. Kiraly, “X11.” https://dashpay.atlassian.net/wiki/spaces/DOC/pages/1146918/X11, 2017. [Online; accessed 26-November-2017].
  • [85] PiMP, “Blog: What are all these x11, x13, x15 algorithms made of?.” https://getpimp.org/what-are-all-these-x11-x13-x15-algorithms-made-of/, 2017. [Online; accessed 26-November-2017].
  • [86] E. Duffield and D. Diaz, “Dash: A privacy-centric crypto-currency,” 2014.
  • [87] Q. Mechanic, “Proof of stake instead of proof of work.” https://bitcointalk.org/index.php?topic=27787.0, 2011. [Online; accessed 15-November-2017].
  • [88] S. King and S. Nadal, “Ppcoin: Peer-to-peer crypto-currency with proof-of-stake,” self-published paper, August, vol. 19, 2012.
  • [89] P. Vasin, “Blackcoin’s proof-of-stake protocol v2,” 2014.
  • [90] D. Larimer, “Delegated proof-of-stake (dpos),” Bitshare whitepaper, 2014.
  • [91] I. Bentov, C. Lee, A. Mizrahi, and M. Rosenfeld, “Proof of activity: Extending bitcoin’s proof of work via proof of stake [extended abstract],” ACM SIGMETRICS Performance Evaluation Review, vol. 42, no. 3, pp. 34–37, 2014.
  • [92] D. Schwartz, N. Youngs, and A. Britto, “The ripple protocol consensus algorithm,” Ripple Labs Inc White Paper, vol. 5, 2014.
  • [93] D. Mazieres, “The stellar consensus protocol: A federated model for internet-level consensus,” Stellar Development Foundation, 2015.
  • [94] I. Miers, C. Garman, M. Green, and A. D. Rubin, “Zerocoin: Anonymous distributed e-cash from bitcoin,” in Security and Privacy (SP), 2013 IEEE Symposium on, pp. 397–411, IEEE, 2013.
  • [95] E. B. Sasson, A. Chiesa, C. Garman, M. Green, I. Miers, E. Tromer, and M. Virza, “Zerocash: Decentralized anonymous payments from bitcoin,” in Security and Privacy (SP), 2014 IEEE Symposium on, pp. 459–474, IEEE, 2014.
  • [96] E. Ben-Sasson, A. Chiesa, E. Tromer, and M. Virza, “Succinct non-interactive zero knowledge for a von neumann architecture.,” in USENIX Security Symposium, pp. 781–796, 2014.
  • [97] B. Kiraly, “Privatesend.” https://dashpay.atlassian.net/wiki/spaces/DOC/pages/1146924/PrivateSend, 2017. [Online; accessed 21-November-2017].
  • [98] R. Rivest, A. Shamir, and Y. Tauman, “How to leak a secret,” Advances in Cryptology—ASIACRYPT 2001, pp. 552–565, 2001.
  • [99] W. Diffie and M. Hellman, “New directions in cryptography,” IEEE transactions on Information Theory, vol. 22, no. 6, pp. 644–654, 1976.
  • [100] E. Fujisaki and K. Suzuki, “Traceable ring signature,” in Public Key Cryptography, vol. 4450, pp. 181–200, Springer, 2007.