T-EDGE: Temporal WEighted MultiDiGraph Embedding for Ethereum Transaction Network Analysis

05/13/2019
by   Jiajing Wu, et al.
SUN YAT-SEN UNIVERSITY

Recently, graph embedding techniques have been widely used in the analysis of various networks, but most existing embedding methods omit the temporal and weighted information of edges, which may be informative in financial transaction networks. The open nature of Ethereum, a blockchain-based platform, gives us an unprecedented opportunity for data mining in this area. By taking the realistic rules and features of transaction networks into consideration, we propose to model the Ethereum transaction network as a Temporal Weighted Multidigraph (TWMDG), where each node is a unique Ethereum account and each edge represents a transaction weighted by its amount and assigned a timestamp. On a TWMDG, we define the problem of Temporal Weighted Multidigraph Embedding (T-EDGE), which incorporates both the temporal and the weighted information of edges in order to capture more comprehensive properties of dynamic transaction networks. To evaluate the effectiveness of the proposed embedding method, we conduct experiments on predictive tasks, including temporal link prediction and node classification, using real-world transaction data collected from Ethereum. Experimental results demonstrate that T-EDGE outperforms baseline embedding methods, indicating that time-dependent walks and the multiplicity of edges are informative and essential for time-sensitive transaction networks.


1 Introduction

The past decade has witnessed explosive growth of graph data, and the analysis of large-scale networks has attracted increasing attention from both academia and industry [Volpp2006]. However, although financial transaction networks exist widely in the real world, there are relatively few analytical studies of them, because transaction data are usually kept private for reasons of security and commercial interest. Fortunately, the recent emergence of blockchain technology makes transaction data mining more feasible and reliable. Generally speaking, a blockchain is an open, distributed ledger managed by a peer-to-peer network through a consensus mechanism, and all transaction records on a public blockchain are publicly accessible [Swan2015]. The open nature of blockchain data provides researchers with unprecedented opportunities for data mining in this area [Tasca et al.2018, Feder et al.2018, Atzei et al.2017, Möser et al.2013].

As the largest public blockchain-based platform that supports smart contracts, Ethereum [Wood2014] has attracted wide attention, and its market capitalization has reached 20 billion USD [Chen et al.2018]. To facilitate the implementation of smart contracts, Ethereum introduces the concept of an account, which is formally an address but adds storage space for recording account balances, transactions, code, etc. (An Ethereum address is composed of the prefix "0x", a common identifier for hexadecimal, concatenated with the rightmost 20 bytes of the Keccak-256 hash of the public key; one example is "0x00b2ed34791c97206943314ee9cbd9530762a320".) The corresponding cryptocurrency on Ethereum, known as Ether, can be transferred between accounts and used to compensate participating mining nodes. Since its debut in 2014, Ethereum has accumulated a large number of user transaction records. Utilizing these records, [Chen et al.2018] conducted the first systematic study to characterize Ethereum and obtained new observations via traditional network analysis. Unlike other large-scale complex networks, the Ethereum transaction network, in which each edge represents a particular Ether transaction, contains unique information such as the directions, amounts and timestamps of the transactions. It is essential to incorporate such information for accurate modeling, characterization, and understanding of transaction network data. In addition, multiple transactions between two users are expected, so it is more faithful to model a transaction network as a multidigraph rather than a simple graph. (In graph theory, a multigraph, in contrast to a simple graph, is a graph that is permitted to have self-loops and multiple edges, also called parallel edges; a multidigraph is a directed multigraph.) Therefore, in this work, we model the Ethereum transaction network as a Temporal Weighted Multidigraph, where a node is a unique address and an edge represents a transaction weighted by its amount and assigned a timestamp.

In recent years, researchers have extensively investigated a variety of machine learning applications on large-scale complex networks, and the performance of these machine learning tasks depends heavily on the choice of data representation. Graph embedding is an effective method to represent node features in a low dimensional space for network analysis and downstream machine learning tasks [Cai et al.2018]. Among various graph embedding methods, a series of random walk based approaches have been proposed to learn a mapping function from an original graph to a low dimensional vector space by maximizing the likelihood of co-occurrence of neighbor nodes [Perozzi et al.2014, Grover and Leskovec2016]. Inspired by the word2vec algorithm [Mikolov et al.2013a] proposed for natural language processing, these random walk based embedding methods are especially useful when the network is too large to be measured entirely [Goyal and Ferrara2018]. Recently, to better extract the temporal information from dynamic networks, [Nguyen et al.2018] proposed a general framework called Continuous-Time Dynamic Network Embeddings (CTDNE) to incorporate temporal dependencies into existing random walk based network embedding models.

Figure 1: An illustration of the Ethereum transaction network. Nodes are labeled with account addresses. Each edge carries a timestamp $t$ and an amount value $w$, and edges are indexed in increasing order of $t$.

Considering the realistic rules and features of transaction networks such as Ethereum, the challenges of transaction network embedding are as follows: (1) Transaction networks evolve continuously over time as links are added, which most existing graph embedding algorithms overlook; (2) A connection between accounts is not a one-off established relationship but a time-dependent event, so multiple edges need to be considered in transaction network embedding; (3) Unlike in social networks, random walks on the Ethereum transaction network have a concrete meaning: they represent money transfer flows in the real world; (4) The amount of a transaction reflects, to some extent, the similarity between two accounts: in most cases, the larger the amount, the closer the relationship between the two accounts. Figure 1 is a microcosm of transaction activities on Ethereum.

To this end, we propose a novel framework named Temporal WEighted MultiDiGraph Embedding (T-EDGE), which aims to capture the non-negligible temporal properties and important money-transfer tendencies of time-sensitive transaction networks. For the transaction networks discussed here, existing methods that ignore temporal information may sample a large number of invalid transaction sequences to derive node embeddings. For example, in Figure 1, a traditional random walk may traverse one transaction and then continue along a transaction that carries an earlier timestamp. Such a sequence is not feasible in a temporal graph, because the second transaction happened before the first. CTDNE [Nguyen et al.2018] does consider temporal information, but it neglects the existence of multiple edges between a pair of nodes: it represents a temporal walk only as a sequence of nodes. However, when two accounts are connected by parallel transactions, such as paths ① and ③ in Figure 1, whether a given next step is feasible depends on which of these transactions was sampled by the previous step.

In this work, we represent an $l$-length temporal walk as a sequence of nodes together with a sequence of edges traversed in non-decreasing timestamp order. This kind of temporal walk represents an actually feasible path for money flow in the transaction network. Therefore, the proposed method is expected to learn more meaningful and accurate time-dependent node embeddings that capture more comprehensive properties of dynamic transaction networks.

The main contributions of our paper are as follows:

  • To the best of our knowledge, this is the first work to understand Ethereum transaction records via graph embedding. In particular, we consider two important and practical machine learning tasks, namely link prediction and node classification.

  • We refine the definition of a temporal walk for transaction networks by considering temporal dependencies and the multiplicity of edges. Random walk sequences of this kind carry the practical meaning of money flow in transaction networks.

  • We propose a novel graph embedding method called Temporal Weighted Multidigraph Embedding (T-EDGE) which incorporates transaction information from both time and amount domains, and experiments on realistic Ethereum data demonstrate its superiority over existing methods.

2 Framework

Figure 2 demonstrates the four main steps of the proposed framework for Ethereum transaction network analysis, including data collection, network construction, graph embedding and downstream applications. The parts of network construction and graph embedding are described in the rest of this section, and the parts of data collection and applications will be explained later in Section 3.

Figure 2: The architecture of the proposed framework for network analysis of Ethereum.

2.1 Network Construction

Ether transfer is one of the major activities happening on Ethereum. Here we abstract an Ether transfer transaction as a four-tuple (src, dst, w, t), which means the sender src transfers w Ether to the recipient dst at time t. To investigate the Ether transfer on Ethereum, we abstract the Ethereum transaction network as a Temporal Weighted Multidigraph:

Definition 1 (Temporal Weighted Multidigraph (TWMDG)).

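Given a graph $G = (V, E)$, let $V$ be the set of nodes and $E$ be the set of edges. Each edge $e \in E$ is unique and is represented as $e = (src, dst, w, t)$, where $src$ is the source node, $dst$ is the target node, $w$ is the weight value and $t$ is the timestamp. For the sake of simplicity, we define mapping functions $\mathrm{Src}(e)$, $\mathrm{Dst}(e)$, $W(e)$, $T(e)$ for $e \in E$, which return the source node, target node, weight and timestamp of an edge, respectively.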

Based on collected four-tuples from Ethereum transaction records, we can build a Temporal Weighted Multidigraph, where each node represents a unique account and each edge represents a unique Ether transfer transaction.
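As a rough illustration of this construction, the following Python sketch stores a TWMDG as an adjacency list keyed by the source account. The names `Edge` and `build_twmdg` and the toy transactions are hypothetical and are not taken from the original work; the only assumption is that every record is a four-tuple (src, dst, w, t).

```python
from collections import defaultdict
from typing import NamedTuple


class Edge(NamedTuple):
    """One Ether transfer: sender, recipient, amount and timestamp."""
    src: str
    dst: str
    w: float
    t: int


def build_twmdg(transactions):
    """Build a temporal weighted multidigraph from (src, dst, w, t) tuples.

    Parallel edges between the same pair of accounts are kept as
    separate entries, so the multiplicity of edges is preserved.
    """
    out_edges = defaultdict(list)
    nodes = set()
    for src, dst, w, t in transactions:
        out_edges[src].append(Edge(src, dst, w, t))
        nodes.update((src, dst))
    return nodes, dict(out_edges)


# Toy data (made up): two parallel transfers from "a" to "b".
txs = [("a", "b", 1.5, 10), ("a", "b", 0.2, 30), ("b", "c", 2.0, 20)]
nodes, out_edges = build_twmdg(txs)
```

Keeping one list entry per transaction, rather than one aggregated entry per account pair, is what distinguishes the multidigraph from a simple weighted digraph.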

2.2 Temporal Weighted Multidigraph Embedding

We now define the problem of Temporal WEighted MultiDiGraph Embedding (T-EDGE) as follows: Given a temporal weighted multidigraph $G = (V, E)$, our principal goal is to learn an embedding function $\Phi: V \to \mathbb{R}^{d}$ ($d \ll |V|$) that preserves the original network information, including node similarity as well as the temporal and weighting properties specific to financial transaction networks, thus enhancing predictive performance on downstream machine learning tasks. The proposed method aims to learn more appropriate and meaningful dynamic node representations using a general embedding framework consisting of two main parts. The first part is a random walk generator, which samples a set of walks under the temporal constraint with flexible biased strategies; the second part is an update procedure based on SkipGram [Mikolov et al.2013a, Mikolov et al.2013b], which learns node embeddings by solving a maximum likelihood optimization problem.

The random walk mechanism has been widely shown to be an effective technique for measuring local similarity in networks from a variety of domains [Spitzer2013]. For the temporal weighted multidigraph discussed here, we define the concept of a Temporal Walk as follows:

Definition 2 (Temporal Walk).

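In a TWMDG, a temporal walk from node $v_1$ to $v_{l+1}$ is an $l$-length path traversed in non-decreasing timestamps. Such a temporal walk is represented as a sequence of nodes $(v_1, v_2, \dots, v_{l+1})$ together with a sequence of edges $(e_1, e_2, \dots, e_l)$, where $\mathrm{Src}(e_i) = v_i$, $\mathrm{Dst}(e_i) = v_{i+1}$, and $T(e_i) \le T(e_{i+1})$; that is, edge $e_i$ goes from $v_i$ to $v_{i+1}$ and the timestamps never decrease along the walk. We define that nodes $u$ and $v$ are temporally connected if there exists a temporal path from $u$ to $v$.

Figure 3: Illustration of an $l$-length temporal walk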

In order to sample valid random walks which obey the temporal constraint, we introduce a new concept called Temporal Successive Edges in TWMDG.

Definition 3 (Temporal Successive Edges).

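Given a temporal weighted multidigraph $G = (V, E)$, the temporal successive edges of a node $v$ at time $t$ are defined as follows:

$$N_t(v) = \{\, e \in E \mid \mathrm{Src}(e) = v,\ T(e) \ge t \,\}$$

That is, $N_t(v)$ contains the outgoing edges of $v$ whose timestamps are not earlier than $t$. The set of temporal successive edges serves as the candidate set from which a walker selects its possible successors.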
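The definition can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `temporal_successive_edges`, the plain-tuple edge layout (src, dst, w, t) and the toy graph are assumptions made for the example.

```python
def temporal_successive_edges(out_edges, v, t):
    """Return the outgoing edges of v whose timestamp is not earlier than t.

    out_edges maps a node to a list of (src, dst, w, t) tuples.
    """
    return [e for e in out_edges.get(v, []) if e[3] >= t]


# Toy graph (made up). A walker arrives at "b" at time 10.
out_edges = {
    "a": [("a", "b", 1.5, 10), ("a", "c", 0.2, 30)],
    "b": [("b", "c", 2.0, 20), ("b", "a", 0.7, 5)],
}
candidates = temporal_successive_edges(out_edges, "b", 10)
```

In this toy graph the edge from "b" back to "a" at time 5 is excluded, because following it would move backwards in time.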

Apart from the temporal constraint, we further develop biased searching strategies by considering more detailed transaction information. For the Ethereum transaction network discussed here, we abstract the transaction time and amount as the temporal and weighted information of a TWMDG. Consider a random walk that has just traversed edge $e_{i-1}$ and is now stopping at node $v_i$ at time $t = T(e_{i-1})$. The next node of the random walk is decided by selecting a temporally valid edge $e \in N_t(v_i)$, where $N_t(v_i)$ is the set of temporal successive edges of $v_i$ at time $t$. We describe different sampling biases by formulating the selection probability $P(e)$ for each temporal successive edge $e \in N_t(v_i)$.

From the perspective of temporal domain, we consider both unbiased and biased sampling strategies as follows.

  • Temporal Unbiased Sampling (TUS). This is the default setting in the time domain, which assumes that each temporal successive edge of node $v$ at time $t$ has the same probability of being selected:

    $$P_{TUS}(e) = \frac{1}{|N_t(v)|}$$ (1)

    where $N_t(v)$ denotes the set of temporal successive edges of $v$ at time $t$.
  • Temporal Biased Sampling (TBS). For financial transaction networks, the similarity between accounts is time-dependent and dynamic.

    On the one hand, accounts with frequent interactions are supposed to have a stronger relationship. Therefore, we let $\eta_-$ be a function that maps the timestamps of edges to a descending ranking. In this case, each edge $e \in N_t(v)$ is assigned a selection probability:

    $$P_{TBS}(e) = \frac{\eta_-(T(e))}{\sum_{e' \in N_t(v)} \eta_-(T(e'))}$$ (2)

    where $T(e)$ denotes the timestamp of the edge $e$. This sampling method biases the selection towards edges that are closer in time to the previous edge.

    On the other hand, sampling interactions between accounts across a large time interval may also be important in some network domains, in order to preserve global similarity in the time domain. For such scenarios, we propose another strategy that favors edges appearing long after the previous timestamp. Let $\eta_+$ be a function that maps the timestamps of edges to an ascending ranking. The probability of selecting each edge $e \in N_t(v)$ is then given by:

    $$P_{TBS}(e) = \frac{\eta_+(T(e))}{\sum_{e' \in N_t(v)} \eta_+(T(e'))}$$ (3)

Apart from the transaction time, the amount values of the edges (edge weights) also play an essential role in financial transaction networks. In the following, we present unbiased and biased strategies in the weighted domain.

  • Weighted Unbiased Sampling (WUS). Similar to TUS, this is the default setting in the amount domain, and each edge has the same probability of being sampled:

    $$P_{WUS}(e) = \frac{1}{|N_t(v)|}$$ (4)
  • Weighted Biased Sampling (WBS). As discussed in the Introduction, the weight of each transaction indicates the significance of the interaction between the two accounts involved. In most instances, a higher transaction amount implies a larger similarity between the two accounts. Thus each edge $e \in N_t(v)$ can be assigned the selection probability:

    $$P_{WBS}(e) = \frac{W(e)}{\sum_{e' \in N_t(v)} W(e')}$$ (5)

    where $W(e)$ is the weight (amount) of edge $e$. To prevent the extreme situation where edges with small weights would never be sampled, we consider a linear mapping function $f$ that weakens the effect of edge weights. Thus we have

    $$P_{WBS}(e) = \frac{f(W(e))}{\sum_{e' \in N_t(v)} f(W(e'))}$$ (6)
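The sampling strategies above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: edges are (src, dst, w, t) tuples, the ranking functions assign ranks 1, 2, ... to the distinct timestamps of the candidate edges, and `linear_map` (a min-max rescaling into the interval [lo, hi]) is only one hypothetical choice of linear mapping, since the source does not specify it. All function names are illustrative.

```python
def normalize(scores):
    """Turn non-negative scores into probabilities that sum to 1."""
    total = float(sum(scores))
    return [s / total for s in scores]


def rank(values, descending):
    """Rank distinct values starting from 1.

    descending=True gives the largest value rank 1, so the smallest
    value (the earliest timestamp) receives the largest rank.
    """
    order = sorted(set(values), reverse=descending)
    position = {val: i + 1 for i, val in enumerate(order)}
    return [position[v] for v in values]


def p_unbiased(edges):
    """TUS / WUS: every candidate edge is equally likely."""
    return [1.0 / len(edges)] * len(edges)


def p_tbs(edges, favor_close=True):
    """TBS: descending ranking favors edges close in time to the previous
    edge; ascending ranking favors edges that occur much later."""
    return normalize(rank([e[3] for e in edges], descending=favor_close))


def linear_map(weights, lo=1.0, hi=2.0):
    """Hypothetical linear mapping: rescale weights into [lo, hi]."""
    w_min, w_max = min(weights), max(weights)
    if w_max == w_min:
        return [lo] * len(weights)
    return [lo + (hi - lo) * (w - w_min) / (w_max - w_min) for w in weights]


def p_wbs(edges, mapping=None):
    """WBS: probability proportional to the (optionally mapped) weight."""
    weights = [e[2] for e in edges]
    return normalize(mapping(weights) if mapping else weights)


# Candidate edges of some node (toy data, made up).
edges = [("v", "x", 1.0, 10), ("v", "y", 3.0, 20), ("v", "z", 6.0, 40)]
close = p_tbs(edges, favor_close=True)
far = p_tbs(edges, favor_close=False)
raw = p_wbs(edges)
soft = p_wbs(edges, linear_map)
```

With this toy data the raw weighted strategy gives the smallest edge a probability of 0.1, while the rescaled version raises it, which is the softening effect the linear mapping is meant to provide.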

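Algorithm 1: Temporal Weighted Multidigraph Embedding

Input: Temporal Weighted Multidigraph $G = (V, E)$, dimensions $d$, walks per node $r$, walk length $l$, window size $k$
Output: node embeddings $\Phi(v)$ for $v \in V$

  Initialize set of temporal walks $walks$ to $\emptyset$
  for $i = 1$ to $r$ do
     for all nodes $u \in V$ do
         $walk$ = TemporalWalk($G$, $u$, $l$)
         Append $walk$ to $walks$
     end for
  end for
  $\Phi$ = StochasticGradientDescent($k$, $d$, $walks$)
  return $\Phi$

Algorithm 2: Temporal Walk

Input: Temporal Weighted Multidigraph $G = (V, E)$, start node $u$, walk length $l$
Output: temporal walk $walk = (walk_n, walk_e)$

  Initialize the edge sequence $walk_e$ to [] and the node sequence $walk_n$ to [$u$]
  Randomly sample a first edge $e$ leaving $u$, append $e$ to $walk_e$
  Let $v$ be the target node of $e$, append $v$ to $walk_n$
  for $i = 2$ to $l$ do
     $e$ = $walk_e$[-1], $v$ = $walk_n$[-1]
     $e'$ = GetNextEdgeWithStrategies($e$, $v$)
     Append $e'$ to $walk_e$
     Let $v'$ be the target node of $e'$, append $v'$ to $walk_n$
  end for
  return $walk$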

Furthermore, we combine the aforementioned sampling probabilities from the temporal and weighted domains, denoted $P_{TS}(e)$ and $P_{WS}(e)$ respectively, by $P(e) = P_{TS}(e)^{\alpha} \, P_{WS}(e)^{1-\alpha}$ for $0 \le \alpha \le 1$. Here $\alpha = 0.5$ is the default value for balancing between the time domain and the amount domain. Note that T-EDGE with the default settings TUS and WUS can be regarded as a specific version of DeepWalk for temporal and directed multigraphs such as transaction networks. In other words, under the temporal constraint, all candidate edges (temporal successive edges) are equally likely to be selected by T-EDGE, while T-EDGE (TBS), T-EDGE (WBS) and T-EDGE (TBS+WBS) select edges with temporal and/or weighted biases.
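A possible way to combine and sample from the two domains is sketched below. It assumes, as a labeled assumption rather than a confirmed detail, that the temporal and weighted probabilities are merged as a weighted geometric mean controlled by a balance parameter `alpha` and then renormalized; the function names `combine` and `sample_edge` are hypothetical.

```python
import random


def combine(p_time, p_amount, alpha=0.5):
    """Merge temporal and weighted probabilities edge by edge.

    Assumed form: p_time ** alpha * p_amount ** (1 - alpha), renormalized
    so that the result sums to 1 over the candidate edges.
    """
    raw = [pt ** alpha * pw ** (1.0 - alpha) for pt, pw in zip(p_time, p_amount)]
    total = sum(raw)
    return [r / total for r in raw]


def sample_edge(edges, probs, rng):
    """Draw one candidate edge according to the given probabilities."""
    return rng.choices(edges, weights=probs, k=1)[0]


# Two candidate edges: equal in the time domain, unequal in the amount domain.
p_time = [0.5, 0.5]
p_amount = [0.1, 0.9]
p = combine(p_time, p_amount)
chosen = sample_edge(["e1", "e2"], p, random.Random(0))
```

Setting `alpha` to 1 reduces the combination to the temporal strategy alone, and setting it to 0 reduces it to the weighted strategy alone.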

Given the sampling results of temporal random walks, we formulate the task of learning time and weight dependent graph embedding in a TWMDG as an optimization problem. This optimization aims to maximize the log-probability of observing a node's neighborhood conditioned on its embedding vector:

$$\max_{\Phi} \ \log \Pr\big(\{v_{i-k}, \dots, v_{i+k}\} \setminus v_i \mid \Phi(v_i)\big)$$ (7)

where $k$ is the window size, which restricts the size of the random walk context. According to the conditional independence assumption in SkipGram, Eq. 7 can be transformed to

$$\max_{\Phi} \sum_{\substack{i-k \le j \le i+k \\ j \ne i}} \log \Pr\big(v_j \mid \Phi(v_i)\big)$$ (8)

The pseudocode for T-EDGE and temporal walk is given in Algorithms 1 and 2 respectively.
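The walk generation part of the two algorithms can be sketched in plain Python as follows. This sketch covers only the unbiased strategy and omits the SkipGram update. It assumes edges are (src, dst, w, t) tuples and that a walk simply stops early when no temporally valid successor exists, a detail the pseudocode does not spell out. The names `temporal_walk` and `generate_walks` are illustrative.

```python
import random


def temporal_walk(out_edges, start, length, rng):
    """Sample one temporal walk of at most `length` edges from `start`.

    Returns (walk_n, walk_e): the visited nodes and the traversed edges.
    """
    first_candidates = out_edges.get(start, [])
    if not first_candidates:
        return [start], []
    edge = rng.choice(first_candidates)
    walk_n, walk_e = [start, edge[1]], [edge]
    while len(walk_e) < length:
        cur_edge, cur_node = walk_e[-1], walk_n[-1]
        # Temporal successive edges: leave cur_node, not earlier than cur_edge.
        candidates = [e for e in out_edges.get(cur_node, []) if e[3] >= cur_edge[3]]
        if not candidates:
            break  # assumption: dead ends terminate the walk early
        nxt = rng.choice(candidates)
        walk_e.append(nxt)
        walk_n.append(nxt[1])
    return walk_n, walk_e


def generate_walks(out_edges, nodes, walks_per_node, walk_length, seed=0):
    """Start `walks_per_node` temporal walks from every node."""
    rng = random.Random(seed)
    walks = []
    for _ in range(walks_per_node):
        for u in sorted(nodes):
            walks.append(temporal_walk(out_edges, u, walk_length, rng))
    return walks


# Toy graph (made up) with two parallel edges from "a" to "b".
out_edges = {
    "a": [("a", "b", 1.0, 10), ("a", "b", 2.0, 30)],
    "b": [("b", "c", 1.0, 20)],
    "c": [("c", "a", 1.0, 25)],
}
walks = generate_walks(out_edges, {"a", "b", "c"}, walks_per_node=5, walk_length=4)
```

The node sequences of the sampled walks can then be passed to any SkipGram implementation as "sentences" to obtain the node embeddings.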

3 Experiments on Ethereum

3.1 Data Collection

On Ethereum, accounts can be divided into two categories: externally owned accounts (EOAs), which are similar to general bank accounts [Weili and Zibin2018], and smart contract accounts, which are controlled by their contract code. In this work, we focus on the transactions among EOAs because the Ether transfer records between them are publicly available in the blockchain. Besides, we only include successful transactions among EOAs with a non-zero amount in our dataset.

Since it is extremely time-consuming to process the whole Ethereum transaction network with more than two million EOAs [Chen et al.2018], here we select a number of objective accounts and then obtain their transaction data through the APIs of Etherscan (https://etherscan.io/). Centered on each objective account, we obtain a directed $K$-order subgraph (see an example in Figure 4). $K$-in and $K$-out are two parameters that control the depth of sampling inward and outward from the center, respectively.

Figure 4: Schematic illustration of a directed $K$-order subgraph.

On Ethereum, the information related to Ether transactions is stored as data packages. In detail, the TxHash field is the unique identifier of a transaction, the Value field refers to the amount of money transferred, and the Timestamp field indicates when the transaction happened. Besides, the From and To fields denote the sender and recipient of the transaction. With the collected four-tuples (src, dst, w, t), we can easily construct a temporal weighted multidigraph.
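One plausible reading of the directed subgraph sampling described above is a pair of breadth-first searches from the center, one following outgoing edges and one following incoming edges. The sketch below works on an in-memory list of (src, dst, w, t) tuples instead of calling any external API; the function name `k_order_subgraph` and the exact expansion rule are assumptions, since the source does not give the procedure in detail.

```python
from collections import deque


def k_order_subgraph(edges, center, k_in, k_out):
    """Collect edges within k_out hops outward and k_in hops inward of center."""
    out_adj, in_adj = {}, {}
    for idx, (src, dst, _w, _t) in enumerate(edges):
        out_adj.setdefault(src, []).append(idx)
        in_adj.setdefault(dst, []).append(idx)

    def expand(adj, endpoint, max_depth):
        # endpoint = 1 walks to edge targets, endpoint = 0 walks to edge sources.
        kept, seen, frontier = set(), {center}, deque([(center, 0)])
        while frontier:
            node, depth = frontier.popleft()
            if depth == max_depth:
                continue
            for idx in adj.get(node, []):
                kept.add(idx)
                neighbor = edges[idx][endpoint]
                if neighbor not in seen:
                    seen.add(neighbor)
                    frontier.append((neighbor, depth + 1))
        return kept

    kept = expand(out_adj, 1, k_out) | expand(in_adj, 0, k_in)
    return [edges[i] for i in sorted(kept)]


# Toy transaction list (made up), centered on account "c".
edges = [
    ("x", "c", 1.0, 1),
    ("c", "y", 1.0, 2),
    ("y", "z", 1.0, 3),
    ("z", "q", 1.0, 4),
    ("w", "x", 1.0, 0),
]
sub = k_order_subgraph(edges, "c", k_in=1, k_out=2)
```

Edges are tracked by their position in the input list, so parallel transactions with identical fields are not merged by accident.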

3.2 Link Prediction

The link prediction problem is to predict the occurrence of links in a given graph on the basis of observed information. In this work, we first evaluate the performance of the proposed T-EDGE method on a temporal directed link prediction task formulated as binary classification.

First of all, we sort all the collected edges according to their timestamps and take the earlier edges (those with smaller timestamps) as the known links $E'$, where $V'$ denotes the nodes involved in $E'$. Node set $V'$ and edge set $E'$ constitute the current network $G' = (V', E')$. Then we can learn node representations $\Phi(v)$ for $v \in V'$ via graph embedding methods. Secondly, for the binary classifier, node pairs $(u, v)$ existing in $E'$ act as positive samples of the training set. Then we randomly sample an equal number of node pairs with no link as negative samples. We obtain the features of a directed link from node $u$ to node $v$ by concatenating their node embeddings, i.e., $[\Phi(u), \Phi(v)]$. Because concatenation is order-sensitive, the links from $u$ to $v$ and from $v$ to $u$ generally receive different features. Finally, we train a support vector classifier to classify the links in the test set, where the remaining links (those with larger timestamps) are treated as the positive samples.
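The data preparation for this task can be sketched as follows. The helper names, the 50% split ratio and the toy embeddings are all hypothetical, and the negative sampler simply assumes that enough unlinked ordered pairs exist; the classifier itself is left out.

```python
import random


def temporal_split(edges, train_ratio):
    """Sort (src, dst, w, t) edges by time and split into known and future links."""
    ordered = sorted(edges, key=lambda e: e[3])
    cut = int(len(ordered) * train_ratio)
    return ordered[:cut], ordered[cut:]


def link_feature(embedding, u, v):
    """Feature of the directed link u -> v: the two embeddings concatenated."""
    return list(embedding[u]) + list(embedding[v])


def sample_negative_pairs(nodes, positive_pairs, count, rng):
    """Sample ordered node pairs that are not linked in the known network.

    Assumes at least `count` such pairs exist; otherwise this would not stop.
    """
    candidates = sorted(nodes)
    positives = set(positive_pairs)
    negatives = set()
    while len(negatives) < count:
        u, v = rng.choice(candidates), rng.choice(candidates)
        if u != v and (u, v) not in positives:
            negatives.add((u, v))
    return sorted(negatives)


# Toy data (made up).
edges = [("a", "b", 1.0, 10), ("b", "c", 1.0, 20), ("a", "c", 1.0, 30), ("c", "d", 1.0, 40)]
known, future = temporal_split(edges, train_ratio=0.5)
positives = sorted({(e[0], e[1]) for e in known})
embedding = {"a": [0.1, 0.2], "b": [0.3, 0.4], "c": [0.5, 0.6]}
negatives = sample_negative_pairs({"a", "b", "c"}, positives, len(positives), random.Random(0))
features = [link_feature(embedding, u, v) for u, v in positives + negatives]
labels = [1] * len(positives) + [0] * len(negatives)
```

The resulting `features` and `labels` lists can be handed to any binary classifier, such as a support vector classifier.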

Dataset       Current network         Node pairs split for classification
              #nodes     #edges       #train     #test      test/train
EthereumG1     3,832     208,927      13,658      1,140       8.35%
EthereumG2    10,628     208,533      26,958      7,510      27.86%
EthereumG3    26,175     677,785      66,102     11,502      17.40%
Table 1: Statistics of datasets used in the link prediction problem.
Metrics (%)         EthereumG1        EthereumG2        EthereumG3
                    AUC      AP       AUC      AP       AUC      AP
DeepWalk            82.71    76.69    85.91    82.13    79.92    77.72
node2vec            83.03    76.94    86.30    82.47    82.20    79.99
T-EDGE              87.73    83.73    92.85    90.29    93.00    90.78
T-EDGE (TBS+WBS)    89.55    85.58    93.36    90.94    93.83    91.89
Table 2: Performance of different methods for link prediction.

Dataset

In this work, we collect three subgraphs of different sizes from Ethereum for the experiments. EthereumG1 is centered on account "0x51faeda318982f439e80012fb45d2b017ddccdbe" with $K$-in = $K$-out = 3; EthereumG2 is centered on account "0x5e247060f48eeb64367250ed03ff5091bba47fd1" with $K$-in = $K$-out = 4; EthereumG3 is centered on the same account as EthereumG1 with $K$-in = $K$-out = 4. A summary of the datasets is given in Table 1.

Settings

In the experiments, we compare the proposed T-EDGE with two baseline random walk based graph embedding methods, DeepWalk [Perozzi et al.2014] and node2vec [Grover and Leskovec2016]. To ensure a fair comparison, we implement the directed versions of DeepWalk and node2vec using OpenNE [THUNLP2017], an open source toolkit for graph embedding. These random walk based embedding methods share several hyperparameters: the node embedding dimension $d$, the window size $k$, the walk length $l$, and the number of walks per node $r$. Some of these hyperparameters are set to the same values for all datasets, while the others are set separately for EthereumG1, EthereumG2 and EthereumG3. For node2vec, we grid search over its return parameter $p$ and in-out parameter $q$ following [Grover and Leskovec2016]. For DeepWalk, we set $p = q = 1$, as it is a special case of node2vec.

Figure 5: Performance in terms of Area Under Curve (AUC) under varying hyperparameters. Each panel fixes three of the four hyperparameters and varies the remaining one: (a) from 2 to 8; (b) from 4 to 10; (c) from 8 to 20; (d) from 8 to 256.

Discussion of results

Table 2 compares the performance of various methods on temporal directed link prediction in terms of Area Under Curve (AUC) and Average Precision (AP). For a clearer illustration, we only show the two extreme sampling strategies of the proposed algorithm: T-EDGE, which does not apply any bias, and T-EDGE (TBS+WBS), which combines biases from both the time domain and the amount domain with the default balance between the two. As discussed in Section 2.2, we have two kinds of TBS, defined in Eqs. 2 and 3, as well as two kinds of WBS, defined in Eqs. 5 and 6. Here we implement all four possible combinations for T-EDGE (TBS+WBS), and report the best result in Table 2.

According to Table 2, we make the following observations: (1) T-EDGE without any bias outperforms DeepWalk and node2vec by a clear margin on all three datasets, which indicates that the temporal information and the multiplicity of edges in a TWMDG are important and meaningful for the analysis and understanding of financial transaction networks; (2) With biases from both the time and amount domains, T-EDGE (TBS+WBS) attains better performance than unbiased T-EDGE, demonstrating that the rich information from the time and amount domains helps to obtain a more comprehensive representation for predictive tasks.

To further illustrate the superiority of T-EDGE methods, we compare the performance of the embedding methods on EthereumG1 under varying values of the node embedding dimension $d$, walk length $l$, walks per node $r$ and window size $k$. The results in Figure 5 show that: (1) T-EDGE, with or without additional biases, consistently outperforms DeepWalk and node2vec across the tested hyperparameter settings; (2) DeepWalk and node2vec are more sensitive to two hyperparameters, the walk length $l$ and walks per node $r$, while T-EDGE methods achieve promising results over a wide range of both $l$ and $r$; (3) Interestingly, as $d$ increases, the performance of T-EDGE methods improves monotonically, whereas the performance of DeepWalk and node2vec degrades when $d$ is larger than 64, which suggests that T-EDGE methods embed richer useful information and thus require a larger value of $d$ for data representation.

To further investigate the effects of different sampling strategies on T-EDGE methods, we provide results for all possible combinations of the three time domain strategies defined in Eqs. 1, 2 and 3 and the three amount domain strategies described in Eqs. 4, 5 and 6. Figure 6 shows that, on average, the biased methods {T-EDGE (TBS), T-EDGE (WBS), T-EDGE (TBS+WBS)} outperform the unbiased method T-EDGE, and that the methods adding bias in both the time and amount domains, T-EDGE (TBS+WBS), surpass the methods adding only one bias, {T-EDGE (TBS), T-EDGE (WBS)}.

Figure 6: Heat map of Area Under Curve (%) for link prediction using proposed T-EDGE with different combinations of strategies.

3.3 Node Classification

Phishing scams are a type of cybercrime that arose along with the emergence of online business [Liu and Ye2001]. They are reported to account for more than 50% of all cybercrimes on Ethereum since 2017 [Konradt et al.2016]. To further evaluate the performance of the proposed T-EDGE strategies, we also conduct node classification experiments on Ethereum to classify labeled phishing nodes and unlabeled nodes (treated as non-phishing nodes). In this part, we consider 445 phishing nodes labeled by Etherscan and the same number of randomly selected unlabeled nodes as our objective nodes; a detailed list of these nodes is given in [Authors2019]. We assume that, for a typical Ether transfer flow centered on a phishing node, the node preceding the phishing node may be a victim, and the next one to three nodes may be bridge nodes with money laundering behaviors. Therefore, we collect subgraphs with $K$-in = 1, $K$-out = 3 for each of the 890 objective nodes and then splice them into a large-scale network with 86,623 nodes.

Training Ratio      60%               70%               80%
Metrics (%)         Mi-F1    Ma-F1    Mi-F1    Ma-F1    Mi-F1    Ma-F1
DeepWalk            79.33    79.17    80.30    80.19    80.79    80.67
node2vec            79.72    79.56    80.15    80.05    80.56    80.36
T-EDGE              81.97    81.95    82.17    82.15    82.81    82.78
T-EDGE (TBS+WBS)    81.97    81.94    83.37    83.37    85.06    85.05
Table 3: Node classification results with different training ratios.

For all embedding methods, we use the same setting of the four hyperparameters (embedding dimension, window size, walk length and walks per node), and the specific settings for node2vec are the same as those in the link prediction experiments. To make a comprehensive evaluation, we randomly select {60%, 70%, 80%} of the objective nodes as the training set and the remaining objective nodes as the test set, respectively. We use five-fold cross validation to train the classifier and evaluate it on the test set. The results in terms of micro-F1 (Mi-F1) and macro-F1 (Ma-F1) are shown in Table 3. These results further support our assumption and motivation in Section 1 that, by considering temporal properties and money-transfer information, we can obtain a more meaningful representation of transaction networks, which can effectively boost predictive performance.
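For reference, the two metrics reported in Table 3 can be computed as in the following sketch, which uses the standard definitions of micro- and macro-averaged F1. The function name `f1_scores` and the toy labels are made up for illustration and are unrelated to the reported results.

```python
def f1_scores(y_true, y_pred):
    """Return (micro-F1, macro-F1) for single-label classification."""
    labels = sorted(set(y_true) | set(y_pred))
    tp_sum = fp_sum = fn_sum = 0
    per_class = []
    for c in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        # F1 of one class: 2TP / (2TP + FP + FN), defined as 0 when TP is 0.
        per_class.append(2.0 * tp / (2 * tp + fp + fn) if tp else 0.0)
        tp_sum, fp_sum, fn_sum = tp_sum + tp, fp_sum + fp, fn_sum + fn
    # Micro-F1 pools the counts of all classes before computing F1.
    micro = 2.0 * tp_sum / (2 * tp_sum + fp_sum + fn_sum) if tp_sum else 0.0
    # Macro-F1 is the unweighted mean of the per-class F1 scores.
    macro = sum(per_class) / len(per_class)
    return micro, macro


# Toy labels: 1 = phishing, 0 = non-phishing (made up).
y_true = [1, 1, 0, 0]
y_pred = [1, 0, 0, 0]
micro_f1, macro_f1 = f1_scores(y_true, y_pred)
```

Because the two classes are balanced in this experiment, micro-F1 and macro-F1 are expected to stay close to each other, which matches the pattern in Table 3.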

4 Conclusion

In this work, we proposed a novel framework for Ethereum analysis via network embedding. In particular, we constructed a temporal weighted multidigraph to retain as much information as possible, and presented a graph embedding method called T-EDGE which incorporates the temporal and weighted information of financial transaction networks into node embeddings. We implemented the proposed method and two baseline embedding methods on a realistic Ethereum network for two predictive tasks with practical relevance, namely temporal link prediction and phishing/non-phishing node classification. Experimental results demonstrated the effectiveness of the proposed T-EDGE embedding method, and indicated that a temporal weighted multidigraph can more comprehensively represent the temporal and financial properties of dynamic transaction networks. For future work, we can use the proposed embedding method to investigate more applications on Ethereum, or extend the current framework to analyze other large-scale temporal or domain-dependent networks.

References

  • [Atzei et al.2017] Nicola Atzei, Massimo Bartoletti, and Tiziana Cimoli. A survey of attacks on ethereum smart contracts (sok). In Principles of Security and Trust, pages 164–186, Berlin, Heidelberg, March 2017. Springer Berlin Heidelberg.
  • [Authors2019] Anonymous Authors. Objective accounts in node classification. https://anonfiles.com/3cl8X9ufb4/nodeClassification_xlsx, 2019.
  • [Cai et al.2018] Hongyun Cai, Vincent W Zheng, and Kevin Chen-Chuan Chang. A comprehensive survey of graph embedding: problems, techniques and applications. IEEE Transactions on Knowledge and Data Engineering, 30(9):1616–1637, 2018.
  • [Chen et al.2018] Ting Chen, Yuxiao Zhu, Zihao Li, Jiachi Chen, Xiaoqi Li, Xiapu Luo, Xiaodong Lin, and Xiaosong Zhang. Understanding ethereum via graph analysis. In IEEE INFOCOM 2018-IEEE Conference on Computer Communications, pages 1484–1492, Honolulu, HI, USA, April 2018. IEEE.
  • [Feder et al.2018] Amir Feder, Neil Gandal, JT Hamrick, and Tyler Moore. The impact of ddos and other security shocks on bitcoin currency exchanges: Evidence from mt. gox. Journal of Cybersecurity, 3(2):137–144, 2018.
  • [Goyal and Ferrara2018] Palash Goyal and Emilio Ferrara. Graph embedding techniques, applications, and performance: A survey. Knowledge-Based Systems, 151:78–94, 2018.
  • [Grover and Leskovec2016] Aditya Grover and Jure Leskovec. Node2vec: Scalable feature learning for networks. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 855–864, New York, NY, USA, August 2016. ACM.
  • [Konradt et al.2016] Christian Konradt, Andreas Schilling, and Brigitte Werners. Phishing: An economic analysis of cybercrime perpetrators. Computers & Security, 58:39–46, 2016.
  • [Liu and Ye2001] Jiming Liu and Yiming Ye. Introduction to E-Commerce Agents: Marketplace Solutions, Security Issues, and Supply and Demand. Springer Berlin Heidelberg, Berlin, Heidelberg, 2001.
  • [Mikolov et al.2013a] Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781, 2013.
  • [Mikolov et al.2013b] Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg S Corrado, and Jeff Dean. Distributed representations of words and phrases and their compositionality. In Advances in Neural Information Processing Systems 26, pages 3111–3119, Lake Tahoe, Nevada, USA, December 2013. Curran Associates, Inc.
  • [Möser et al.2013] Malte Möser, Rainer Böhme, and Dominic Breuker. An inquiry into money laundering tools in the bitcoin ecosystem. In 2013 APWG eCrime Researchers Summit, pages 1–14, San Francisco, CA, USA, September 2013. IEEE.
  • [Nguyen et al.2018] Giang Hoang Nguyen, John Boaz Lee, Ryan A. Rossi, Nesreen K. Ahmed, Eunyee Koh, and Sungchul Kim. Continuous-time dynamic network embeddings. In Companion Proceedings of the The Web Conference 2018, pages 969–976, Republic and Canton of Geneva, Switzerland, April 2018. International World Wide Web Conferences Steering Committee.
  • [Perozzi et al.2014] Bryan Perozzi, Rami Al-Rfou, and Steven Skiena. Deepwalk: Online learning of social representations. In Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 701–710, New York, NY, USA, August 2014. ACM.
  • [Spitzer2013] Frank Spitzer. Principles of Random Walk. Springer Science & Business Media, 2013.
  • [Swan2015] Melanie Swan. Blockchain: Blueprint for a new economy. O’Reilly Media, Inc., Cambridge, Massachusetts, 2015.
  • [Tasca et al.2018] Paolo Tasca, Adam Hayes, and Shaowen Liu. The evolution of the bitcoin economy: Extracting and analyzing the network of payment relationships. The Journal of Risk Finance, 19(19):94–126, 2018.
  • [THUNLP2017] THUNLP. Openne: An open source toolkit for network embedding. https://github.com/thunlp/openne, 2017.
  • [Volpp2006] Leti Volpp. Complex networks: structure and dynamics. Physics Reports, 424(4):175–308, 2006.
  • [Weili and Zibin2018] Chen Weili and Zheng Zibin. Blockchain data analysis: A review of status, trends and challenges. Journal of Computer Research and Development, 55(9):1853–1870, 2018.
  • [Wood2014] Gavin Wood. Ethereum: A secure decentralised generalised transaction ledger. Ethereum project yellow paper, 151:1–32, 2014.