Linking Bank Clients using Graph Neural Networks Powered by Rich Transactional Data

by   Valentina Shumovskaia, et al.

Financial institutions obtain enormous amounts of data about user transactions and money transfers, which can be considered as a large graph that changes dynamically in time. In this work, we focus on predicting new interactions in the network of bank clients and treat it as a link prediction problem. We propose a new graph neural network model that uses not only the topological structure of the network but also the rich time-series data available for the graph nodes and edges. We evaluate the developed method on data provided by a large European bank over several years. The proposed model outperforms the existing approaches, including other neural network models, by a significant gap in ROC AUC score on the link prediction problem, and also improves the quality of credit scoring.



1 Introduction

It is important for financial institutions to know their clients well in order to mitigate credit risks [17], deal with fraud [16] and recommend relevant services [4]. One of the defining properties of a particular bank client is his or her social and financial interactions with other people. This motivates viewing bank clients as a network of interconnected agents [18, 4, 21]. Thus, graph-based approaches can help to leverage this kind of data and solve the above-mentioned problems more efficiently.

Importantly, information about clients, and especially about their neighborhood, is never complete: the market is competitive, and we cannot expect all people to use the same bank. Thus, some financial interactions are effectively hidden from the bank. This leads to the necessity of uncovering hidden connections between clients from a limited amount of information, which can be done using link prediction approaches [20].

On the other hand, financial networks have two notable features. The first is their size: the number of clients can be on the order of millions, and the number of transactions is estimated in billions. The second important feature is the dynamic structure of the considered networks: the neighborhood of each client is ever-evolving. Classical link prediction algorithms are only capable of working with graphs of a much smaller size, while the temporal component is usually not considered [20]. Recently, several studies have addressed large-scale graphs [23] as well as temporal networks [14]. However, only a few works consider financial networks, see, for example, [4] and [18].

We base our research on the well-developed paradigms of graph mining with neural networks, including graph convolutional networks [11, 9], graph attention networks [19] and the SEAL framework for link prediction [25]. These approaches consistently show state-of-the-art results in many applications but, to the best of our knowledge, have not yet been used for financial networks. Our key contributions can be formulated as follows:

  • We build a scalable approach to link prediction in temporal graphs with a focus on the extensive usage of Recurrent Neural Networks (RNNs), both as feature generators for graph nodes and as a trainable attention mechanism for the graph edges.

  • We propose several modifications to graph pooling procedures, including pooling over the two target-node embeddings instead of sortpooling [25] and neighborhood prioritization by Weisfeiler-Lehman labeling [24].

  • We validate the proposed approaches on the link prediction and credit scoring problems for a real-world financial network with millions of nodes and billions of edges. Our experiments show that our improved models perform significantly better than the standard ones and efficiently exploit the rich transactional data available for the edges and nodes, while remaining able to process large-scale graphs.

2 Problem and Data

From the perspective of network science and data analysis, the considered problem of linking bank clients is a link prediction problem in graphs with two notable peculiarities. The first is that the considered graph of clients and transactions between them is very large, with on the order of millions of nodes and billions of edges. The second is that both nodes and edges have rather complex attributes represented by time series of bank transactions of different types. We want to note that this kind of problem is not limited to banking, as graphs with similar structure appear in social networks, telecom companies, and other scenarios where we consider some objects as nodes and a certain type of communication between them as edges. Thus, the algorithms developed in our work might be applicable beyond banking to any link prediction problem with time-series attributes.

In what follows, we first discuss the dataset studied in our work and then explain some peculiarities of the problem statement.

2.1 Dataset

The considered dataset is obtained from one of the large European banks. The data consists of user transactions and money transfers between users during five years. All the data is depersonalized, with each transaction described by a timestamp, amount and currency. Thus, we observe a graph $G = (V, E)$ with a set of vertices $V$ and a set of edges $E$. Here, an edge $(u, v) \in E$ means that there was at least one transfer between the pair of clients $u$ and $v$ over the observed time period. Each node $u$ is represented by a time series of transactions of client $u$, while each edge $(u, v)$ is represented by a time series of transfers between clients $u$ and $v$. Finally, we obtain a huge graph with 86 million nodes and about 4 billion edges.

Such a graph size makes the analysis difficult to approach, since the majority of graph processing methods aimed at node classification, graph classification or link prediction are suitable only for graphs of a much smaller size [8]. The time complexity of such methods usually grows at least superlinearly in the number of nodes, limiting the possible graph sizes to several thousands of nodes and up to a hundred thousand edges.

As a result, when we work with a particular node or a particular edge, we are forced to consider a certain subgraph around the target node or the target pair of nodes (see, for example, [24]). In this work, we follow this approach and consider the subgraph around the target nodes, extracting their 1-hop or 2-hop neighbors.
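The neighborhood extraction described above can be sketched as a breadth-first search started from both target nodes. This is a minimal illustration with an adjacency-set representation; the function name and signature are our own, not from the paper's code.

```python
from collections import deque

def enclosing_subgraph(adj, x, y, num_hops=1):
    """Collect the nodes within `num_hops` of either target node.

    `adj` maps each node to the set of its neighbours; `x` and `y`
    are the target pair. Returns the node set of the enclosing subgraph.
    """
    visited = {x, y}
    frontier = deque([(x, 0), (y, 0)])
    while frontier:
        node, dist = frontier.popleft()
        if dist == num_hops:
            continue                     # do not expand beyond the hop limit
        for nbr in adj.get(node, ()):
            if nbr not in visited:
                visited.add(nbr)
                frontier.append((nbr, dist + 1))
    return visited
```

For a path graph 0-1-2-3-4, `enclosing_subgraph(adj, 0, 4, 1)` keeps only the targets and their immediate neighbors, which is the hop-1 setting used in the paper.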

2.2 Problem Statement and Validation

Our goal is to determine how stable the relationship between nodes is. We start by describing out-of-time validation (see a similar approach in [12]). More specifically, we consider a time interval $[t_0, t_1]$ and use all the information available for it (i.e., all the transactions and transfers) as the information encoded in a graph. Given the information available for the period $[t_0, t_1]$, we aim to predict the structure of the graph for the time interval $[t_1, t_2]$ with $t_2 > t_1$. In what follows, we say that there is an edge between two nodes in a graph for a certain time period if there was at least one transaction between these nodes during that period. Thus, we end up with a link prediction problem where a pair of nodes is described by the graph structure and attributes during the period $[t_0, t_1]$, and the target label corresponds to the existence of a transaction between the pair of nodes during the period $[t_1, t_2]$. In all the experiments below, we take $t_1 - t_0$ equal to one year and $t_2 - t_1$ equal to 3 months.

We note that link prediction models are usually validated in a different way, e.g., by edge sampling [12]. In this approach, the whole edge set $E$ is considered as positive samples, while negative samples are constructed by taking $k|E|$ node pairs (where $k$ is a hyperparameter and $|\cdot|$ denotes the set size) that do not intersect with $E$. Then, the subgraph is passed to the link prediction algorithm with the target link, if it exists, hidden. In order to build training, validation and test parts, one divides the positive and negative edge sets into three corresponding non-intersecting sets.
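The negative sampling step of this validation scheme can be sketched as follows. The helper below is illustrative (name and signature are ours): it draws non-edge pairs uniformly until the required multiple of the positive set size is reached.

```python
import random

def sample_negative_pairs(nodes, edges, k, seed=0):
    """Sample k*|E| node pairs that are not present in the edge set.

    `nodes` is a list of node ids, `edges` an iterable of (u, v) pairs,
    `k` the negative-to-positive ratio hyperparameter.
    """
    rng = random.Random(seed)
    edge_set = {frozenset(e) for e in edges}
    negatives = set()
    target = k * len(edge_set)
    while len(negatives) < target:
        u, v = rng.sample(nodes, 2)        # two distinct random nodes
        pair = frozenset((u, v))
        if pair not in edge_set:           # keep only non-edges
            negatives.add(pair)
    return [tuple(sorted(p)) for p in negatives]
```

In practice one would also exclude pairs already drawn for another split, so that the training, validation and test sets stay non-intersecting.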

However, we think that for the time-evolving graphs in general and banking data in particular the out-of-time validation is more sensible. Thus, in this work, we focus on the out-of-time validation, while still providing a part of the experiments for both settings.

3 Neural Network Model for Link Prediction with Transactional Data

In this section, we describe the proposed neural network for solving the link prediction task powered by rich transactional data. The most challenging part is working with the transactional data itself, which is essentially a multidimensional time series.

As the base graph neural network, we take the SEAL framework [25]. Its input parameters are the adjacency matrix $A$ of a graph and a node feature matrix $X$ with each row containing the feature vector of the corresponding node. SEAL then considers the neighborhood subgraph of the target pair of nodes and performs several graph convolutions followed by a sortpooling operation and fully connected layers, see Figure 1. However, the considered network of bank transactions has neither an explicit adjacency matrix nor node feature vectors, as both clients and interactions between them are represented by time series. In the following, we adapt the SEAL framework to work with time-series data by processing them with an RNN. Moreover, we make a number of specific improvements to the structure of the SEAL model, making it more efficient.

Figure 1: SEAL architecture. The input graph is passed to a series of Graph Convolution (GC) layers. The obtained node features are sorted and pooled by the SortPooling layer, then passed to a 1-D Convolution layer (1D Conv) and a Fully Connected (FC) layer.

3.1 Recursive Neural Network Powers Graph Neural Network

3.1.1 RNN as Feature Generator

A powerful way of working with time-series data is to build a Recurrent Neural Network (RNN, [6]). The main question is which objective function the RNN should target. We suggest pretraining the RNN model on the credit scoring problem similar to [2]; see also additional details in Section 5.5. The model takes a time series of user transactions and aims to predict a credit default. For that purpose, we take a quite simple recurrent neural network, which consists of a GRU cell [5] followed by a series of fully connected layers. Importantly, such an RNN model learns, in its intermediate layers, a meaningful vector representation of the client's transactions. In the following, we call these vectors embedded transactions and use them as node feature vectors in all the considered graph neural network models.
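The GRU cell at the core of this feature generator can be written out explicitly. Below is a minimal numpy sketch of a single GRU cell [5] run over a transaction sequence, with the final hidden state standing in for the "embedded transactions" vector; all weight names and sizes are illustrative, and the pretraining head on the default label is omitted.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_encode(xs, Wz, Uz, Wr, Ur, Wh, Uh):
    """Run a minimal GRU cell over a transaction sequence `xs`
    (one feature vector per transaction) and return the final
    hidden state, used here as the client's embedded transactions."""
    h = np.zeros(Uz.shape[0])
    for x in xs:
        z = sigmoid(Wz @ x + Uz @ h)            # update gate
        r = sigmoid(Wr @ x + Ur @ h)            # reset gate
        h_tilde = np.tanh(Wh @ x + Uh @ (r * h))  # candidate state
        h = (1 - z) * h + z * h_tilde           # gated interpolation
    return h
```

In the paper's setting the analogous embedding comes from an intermediate layer of a GRU network trained end-to-end in PyTorch; the sketch only shows the recurrence itself.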

3.1.2 RNN as Attention Mechanism

The question of processing the time series corresponding to graph edges is even more challenging than for nodes. The simplest way is to ignore the time series entirely and consider a binary adjacency matrix with edges present for pairs of nodes with at least one transfer between them. However, in this case we lose a significant amount of important information, as the properties of transfers between clients are apparently directly linked with our link prediction objective.

Figure 2: RNN for link prediction architecture.

In order to make full use of the data, we first note that one can consider an RNN model predicting the link between two nodes using solely the time series of transfers between them, see Figure 2. However, such an RNN model does not allow us to detect new possible connections, since in this case there is no data about the interaction between the users. To overcome this drawback, a model based on the transactional graph can be used.

We first note that standard graph convolutional architectures (like GCN [11] or SEAL [25]) perform the convolution operation by simple averaging over the neighborhood:
$$h'_i = \sigma\Big(\frac{1}{|N_i|} \sum_{j \in N_i} W h_j\Big), \quad i = 1, \dots, n,$$
where $h_j$ are the node embedding vectors before the convolution operation, $h'_i$ are their counterparts after it, $W$ is a matrix of learnable weights, $N_i$ is the set of immediate neighbors of node $i$ and, finally, $\sigma$ is an activation function. The averaging operation implies that all the neighbors have an equal influence on the considered node, which is apparently very unnatural in the majority of applications.

Graph Attention Networks [19] mitigate this problem by introducing weights $\alpha_{ij}$ and considering the weighted sum:
$$h'_i = \sigma\Big(\sum_{j \in N_i} \alpha_{ij} W h_j\Big).$$
However, in [19] the coefficients $\alpha_{ij}$ are computed solely based on the node features $h_i$ and $h_j$. Instead, in order to use the full information about the graph, we propose to use the probabilities of the links between nodes output by the RNN model as weights in the adjacency matrix, which is then passed to the graph neural network.

The resulting model is called SEAL-RNN; see the architecture in Figure 3. After extracting an enclosing subgraph around the target link, all time series corresponding to edges are processed by the RNN, and the output probabilities are used to form a weighted adjacency matrix, which, together with the generated node features, is passed into the SEAL model.
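The construction of the weighted adjacency matrix can be sketched as follows. Here `rnn_link_prob` is a placeholder for the trained edge RNN: any callable mapping a transfer time series to a probability in [0, 1].

```python
import numpy as np

def rnn_weighted_adjacency(edges, edge_series, n, rnn_link_prob):
    """Build the SEAL-RNN adjacency: each observed edge gets the RNN's
    link probability as its weight instead of a binary 1.

    edges: list of (i, j) node pairs in the enclosing subgraph;
    edge_series: the transfer time series for each pair, in the same order;
    n: number of nodes; rnn_link_prob: series -> probability in [0, 1].
    """
    A = np.zeros((n, n))
    for (i, j), series in zip(edges, edge_series):
        p = rnn_link_prob(series)
        A[i, j] = A[j, i] = p           # undirected, symmetric weights
    return A
```

The resulting matrix then replaces the binary adjacency in the graph convolutions, so the RNN effectively acts as an edge-attention mechanism.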

Figure 3: SEAL-RNN model architecture. After extracting an enclosing subgraph around the target link, all pairs in that subgraph are scored by the RNN, yielding a weighted adjacency matrix. The weighted adjacency matrix and the generated node features are passed into the SEAL model.

3.2 Graph Neural Network (2-SEAL)

3.2.1 Pooling

We propose another pooling operation to replace sortpooling in the SEAL model. The sortpooling layer keeps the $k$ (a hyperparameter) most valuable node embeddings in the sense of a descending sort while filtering out the other embeddings. In contrast, we suggest taking only the embeddings of the two nodes between which we aim to predict the link. The idea is natural, since we want to predict the link between exactly these two nodes, while their embeddings still contain information about the neighboring nodes. Most importantly, it reduces the number of learned parameters in the neural network, and we need neither a sorting operation nor a 1-D convolution after pooling (the purpose of the 1-D convolution in the SEAL framework is to reduce the size of the obtained output, which is $k \times d$, where $d$ is the sum of the node feature dimension and the dimensions of the graph convolution outputs). We name the proposed model 2-SEAL; see the schematic representation in Figure 4.
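The proposed pooling itself is a one-line operation; a minimal sketch (our own naming) showing how it replaces sortpooling over the whole subgraph:

```python
import numpy as np

def two_node_pool(H, x, y):
    """2-SEAL pooling: keep only the embeddings of the two target nodes,
    concatenated, instead of sort-pooling all subgraph embeddings.

    H: (n, d) node embeddings after the graph convolutions;
    x, y: indices of the target pair. Returns a (2d,) vector for the
    fully connected classifier head.
    """
    return np.concatenate([H[x], H[y]])
```

Note that the output size is fixed at $2d$ regardless of the subgraph size, which is what removes the need for sorting and the subsequent 1-D convolution.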

Figure 4: 2-SEAL architecture. First, the input graph is passed to a series of Graph Convolution (GC) layers. Then the obtained node features of the two target nodes (between which the link is predicted) are passed to a Fully Connected (FC) layer.

3.2.2 Modified Structural Labels

Working with out-of-time validation, we decided to change the structural labels proposed in the SEAL framework. In SEAL, each node receives a structural label generated by the Double-Radius Node Labelling (DRNL) procedure, which satisfies the following conditions:

  1. the two target nodes $x$ and $y$ have label '1';

  2. nodes with different distances to both $x$ and $y$ have different labels.

The aim of the labels is to encode some of the topological information about the graph structure. These structural labels are concatenated with the initial node features (if they exist) and passed to the neural network as node features. The labelling function ($i$ is a node index) is the following:
$$f(i) = 1 + \min(d_x, d_y) + (d/2)\big[(d/2) + (d\%2) - 1\big],$$
where $d_x = d(i, x)$, $d_y = d(i, y)$, $d = d_x + d_y$, and $(d/2)$ and $(d\%2)$ are the integer quotient and remainder of the division of $d$ by 2, respectively, while $d(\cdot, \cdot)$ is the distance between nodes. The authors of the original paper suggest excluding node $y$ from the subgraph when computing the distance $d_x$, and similarly excluding $x$ for $d_y$.
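The labelling function is easy to implement directly from the distances; a minimal sketch:

```python
def drnl_label(dx, dy):
    """Double-Radius Node Labelling from the distances of a node to the
    two target nodes. Target nodes themselves (distance 0) get label 1."""
    if dx == 0 or dy == 0:
        return 1
    d = dx + dy
    # integer quotient d // 2 and remainder d % 2 of the division by 2
    return 1 + min(dx, dy) + (d // 2) * ((d // 2) + (d % 2) - 1)
```

For example, the immediate common neighbors of the target pair ($d_x = d_y = 1$) all receive label 2, while nodes with $(d_x, d_y) = (1, 2)$ receive label 3, so nodes at different double radii are kept distinguishable.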

We suggest not hiding nodes $x$ and $y$ when computing the distances $d_x$ and $d_y$. This better suits out-of-time validation by keeping, in the data, patterns for all combinations of link existence in the observed graph and link existence in the future.

4 Related Work

The idea of considering bank clients as a large network of interconnected agents has been raised in the past several years [18, 4, 21]. The number of bank clients is counted in millions, so we solve the link prediction problem for graphs with millions of nodes, which requires scalable methods. The literature mentions only a few ways to handle graphs of this size, mostly simple heuristics that compute statistics over the immediate neighborhoods of the target nodes, for example, Common Neighbors [13], Adamic-Adar [1] and others [20]. However, these models are not trainable and do not use the information about node features, which limits their performance in real-world applications.

The main challenge in constructing machine learning models for link prediction is handling the variation in graph size. One approach, presented in WLNM [24], is to use Weisfeiler-Lehman structural labels [22] to prioritize nodes and keep only the important ones from the immediate neighborhood of the evaluated nodes. After that, regular densely connected neural networks can be used.

Graph convolutional networks [11] have shown good performance on graph datasets. The original GCN is supposed to process the whole graph, which is prohibitive for graphs on the scale of millions of nodes. In [25], the SEAL framework was proposed: it extracts enclosing subgraphs around the target link and includes a pooling layer in the neural network architecture that keeps a fixed number of nodes for every subgraph. This allows using the model on arbitrarily sized graphs.

The graph attention model GAT [19] allows assigning different weights to different nodes in a neighborhood. This approach can be used to leverage sequence information on the edges by adding attention coefficients to the graph convolutions.

5 Experiments

5.1 Dataset Preprocessing

First, we divide the whole time interval and the set of user IDs into three non-intersecting parts: the first three years, the fourth year, and the fifth year, corresponding to the training, validation, and test time and user segments, see Figure 5. Taking a point in one of the time intervals, we define the base and the target segments. The base segment corresponds to the time before the point, while the target segment corresponds to the time after it. For edge sampling validation, we observe the graph state restricted to the base segment, and the target is whether there was at least one transfer between the users during this segment. For the out-of-time validation setting, the target is whether there is at least one transfer between the users during the target segment. We use ROC AUC as the quality metric for the link prediction task.

Figure 5: Data split for train, validation and test.

5.2 Baselines

Due to the need for scalability, we consider only simple similarity-based approaches, such as Common Neighbors, Adamic-Adar Index, Resource Allocation, Jaccard Index and Preferential Attachment, as baselines for our task (see [20] for a description of these methods). We also take the SEAL model [25] as a baseline (with embedded transactions concatenated with structural labels as node features). The results can be found in Table 1. As we can see, the simple heuristic methods are beaten by the neural network solution. There is also a gap in the ROC AUC score between the validation settings, which can be explained by the fact that predicting into the future is harder than finding hidden links in the current graph state.

5.3 Implementation details

We use PyTorch [15] and PyTorch Geometric [7] to implement the models. Each model was trained with the Adam optimizer [10] using a learning rate scheduler and hyperparameter optimization [3] over the number of layers, the size of the layers and the initial learning rate. In all the experiments, we used a server with a single GPU (NVIDIA Tesla P100), 32 Intel i7 CPU cores and 512 GB of RAM.

5.4 Link Prediction Results

Method Edge sampling Out of time
Common Neighbors 0.398 0.629
Adamic-Adar 0.391 0.646
Resource Allocation 0.35 0.639
Jaccard Index 0.284 0.62
Preferential Attachment 0.746 0.497
SEAL 0.85 0.77
Table 1: Heuristics approaches and SEAL (with embedded transactions and structural labels as node features) results for banking data (ROC AUC).
Method Edge sampling Out of time
SEAL 0.85 0.77
WL-SEAL 0.87 0.75
2-SEAL 0.89 0.78
Table 2: SEAL (with embedded transactions and structural labels as node features) pooling modifications results for banking data (ROC AUC).

The first improvement of the initial SEAL model is the new pooling operation. The SEAL and 2-SEAL models are described in the previous sections (see Sections 3 and 4). We additionally consider the WL-SEAL pooling operation, which is based on the idea of the Weisfeiler-Lehman graph isomorphism test. Quite similarly to the idea described in [24], we propose to color the nodes of the enclosing subgraphs by the Palette-WL algorithm (Algorithm 3 in [24]), thereby obtaining a node ordering. After that, we take only the $k$ (a hyperparameter) most significant nodes of the subgraph as the input of the neural network. Thus, all subgraphs have the same size, and there is no need for a pooling operation after the convolution layers. We expect such pooling to be more intuitively meaningful, but the drawback of this model is the computational expense of the coloring algorithm. The results can be found in Table 2. We observe that both WL-SEAL and 2-SEAL are superior to SEAL. However, 2-SEAL shows the best results while being the less computationally expensive model, which motivates us to focus further studies on it.
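The coloring idea behind WL-SEAL can be illustrated with plain one-dimensional Weisfeiler-Lehman color refinement: each node's color is iteratively replaced by a hash of its own color and the multiset of its neighbors' colors. This is a sketch of the underlying principle only, not the exact Palette-WL algorithm of [24], which additionally preserves the initial color order.

```python
def wl_colors(adj, init, rounds=2):
    """1-dimensional Weisfeiler-Lehman colour refinement.

    adj: node -> set of neighbours; init: node -> initial colour.
    Returns the refined colours, which induce a node ordering.
    """
    colors = dict(init)
    for _ in range(rounds):
        # signature = own colour plus sorted multiset of neighbour colours
        signatures = {
            v: (colors[v], tuple(sorted(colors[u] for u in adj[v])))
            for v in adj
        }
        # relabel signatures with compact integer colours
        palette = {s: i for i, s in enumerate(sorted(set(signatures.values())))}
        colors = {v: palette[signatures[v]] for v in adj}
    return colors
```

On a path graph the endpoints end up sharing a color distinct from the middle node, showing how the refinement separates structurally different nodes and yields the ordering used to truncate each subgraph to its $k$ most significant nodes.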

Another set of experiments is devoted to the exploration of features. In the previous set of experiments on neural networks, we used a concatenation of embedded transactions (the output of an intermediate layer of the RNN that solves the credit scoring task) and structural labels as node features. We run experiments with different node feature settings: embedded transactions, embedded transactions concatenated with structural labels, structural labels alone, and modified structural labels (the structural labels and the modified structural labels are described in Section 3.2.2). Surprisingly, the usage of embedded transactions plays a negative role in the link prediction task. We explain this by the fact that similar purchases do not play a significant role in finding new connections in the network, while the network structure and people's connections are far more important. Also, the modified structural labels (without hiding the link) gave better performance.

The final set of experiments works with the data corresponding to edges (see Table 3), where we consider different RNN-based models; see details in Section 3.1. We see that in every setting (except embedded transactions + structural labels for the 2-SEAL model), the proposed models give a large increase in the ROC AUC score (almost 0.1 in some cases). We conclude that the 2-SEAL model with RNN attention is the best link prediction model for the considered banking dataset.

A summary of the results can be found in Table 4. We observe a significant improvement in the ROC AUC score for the proposed 2-SEAL-RNN model compared to the best heuristic approach and SEAL.

Method ET ET+SL SL Modified SL
SEAL 0.62 0.747 0.74 0.76
SEAL-RNN 0.61 0.787 0.78 0.794
2-SEAL 0.7 0.739 0.77 0.787
2-SEAL-RNN 0.727 0.804 0.83 0.858
Table 3: SEAL pooling modifications results for banking data with embedded transactions (ET) and structural labels (SL) as node features (ROC AUC).
Method Result, ROC AUC
Best heuristic 0.646
SEAL 0.74
2-SEAL 0.79
2-SEAL-RNN 0.858
Table 4: Final results on banking data in out-of-time validation setting.

5.5 Credit Scoring Results

In this section, we demonstrate the applicability of the developed link prediction models to other problems relevant to banking. One of the most important problems for a bank is controlling the risks related to working with clients, especially in the process of issuing a loan. This problem is called credit scoring [17], and it is usually solved with an ensemble of predictive models, which in particular are based on user transactional data. For example, an RNN model run on the time series of transactions has been shown to be very efficient in credit scoring [2].

Using the information available in the network of clients may further improve the prediction quality. We consider a credit scoring dataset of approximately one hundred thousand clients, which is a part of our initial dataset. Our experiments show that a standard Graph Convolutional Network (GCN) [11] trained on these data improves over the baseline RNN model in terms of the Gini index, see Table 5. However, the GCN model is known to treat all neighboring nodes equally, without any prioritization (see the discussion in Section 3.1), which is apparently not correct for bank clients, some of whom have much more influence on a particular client than others. This issue was addressed in the literature by introducing a graph attention mechanism based on the available node features [19].

In our work, we propose to use the developed link prediction model (2-SEAL-RNN) as an attention mechanism by reweighing the neighboring nodes with coefficients proportional to the connection probabilities output by the link prediction model. Unlike standard Graph Attention Networks [19], our attention mechanism considers not only the node features but also the topology of the graph, while still allowing the final credit scoring model to be trained end-to-end. In Table 5, we compare the performance of a GCN that uses a binary adjacency matrix with one that uses an adjacency matrix weighted by the link prediction model. We note that we use the embeddings obtained by the RNN as node features in both models. The results show that using the link prediction model as an attention mechanism in the GCN almost doubles the effect of considering the graph structure in the credit scoring problem. We believe that further study of link prediction based attention in graph neural networks may lead to even better credit scoring models.

Method Result, in Gini index
Standard GCN + 0.8%
GCN with LP-based attention + 1.4%
Table 5: Gini index scores for GNNs models applied to credit scoring task in comparison with results obtained by RNN run on transactional data for each user.
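The reweighting idea behind the LP-based attention can be sketched as a single GCN layer whose edge weights come from link-prediction scores. Here `P` stands in for the matrix of probabilities produced by a trained 2-SEAL-RNN (a placeholder argument, not the paper's model), and the layer is a plain mean-aggregation convolution rather than the exact credit scoring architecture.

```python
import numpy as np

def lp_weighted_gcn_layer(A_bin, P, H, W):
    """One GCN layer where each edge is weighted by its link-prediction
    probability instead of being treated uniformly.

    A_bin: (n, n) binary adjacency; P: (n, n) link probabilities;
    H: (n, d) node features (e.g. RNN embeddings); W: (d', d) weights.
    """
    A = A_bin * P                            # LP-based attention weights
    norm = A.sum(axis=1, keepdims=True)
    norm[norm == 0] = 1.0                    # guard isolated nodes
    return np.tanh((A @ (H @ W.T)) / norm)   # weighted neighbourhood average
```

With `P` set to all ones this reduces to the standard GCN averaging, which makes the comparison in Table 5 a direct measurement of the attention weights' contribution.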

6 Conclusion

In this work, we developed a graph convolutional neural network that can efficiently solve the link prediction problem in the large-scale temporal graphs appearing in banking data. Our study shows that to fully benefit from the rich transactional data, one needs to represent such data efficiently and carefully design the structure of the neural network. Importantly, we show the effectiveness of Recurrent Neural Networks as building blocks of a temporal graph neural network, including a non-standard approach to the construction of an attention mechanism based on RNNs. We also modify the existing GNN pooling procedures to simplify and robustify them. The developed models significantly improve over the baselines and provide high-quality predictions of the existence of stable links between clients, which gives the bank a powerful instrument for the analysis of the clients' network. In particular, we show that using the obtained link prediction model as an attention module in a graph convolutional neural network improves the quality of credit scoring.


  • [1] L. A. Adamic and E. Adar (2001) Friends and neighbors on the web. Social Networks 25, pp. 211–230. Cited by: §4.
  • [2] D. Babaev, M. Savchenko, A. Tuzhilin, and D. Umerenkov (2019) E.T.-RNN: applying deep learning to credit loan applications. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 2183–2190. Cited by: §3.1.1, §5.5.
  • [3] J. Bergstra, D. Yamins, and D. D. Cox (2013) Making a science of model search: hyperparameter optimization in hundreds of dimensions for vision architectures. In Proceedings of the 30th International Conference on International Conference on Machine Learning - Volume 28, ICML’13, pp. I–115–I–123. Cited by: §5.3.
  • [4] C. B. Bruss, A. Khazane, J. Rider, R. Serpe, A. Gogoglou, and K. E. Hines (2019) DeepTrax: embedding graphs of financial transactions. CoRR abs/1907.07225. External Links: Link, 1907.07225 Cited by: §1, §1, §4.
  • [5] K. Cho, B. van Merrienboer, C. Gulcehre, D. Bahdanau, F. Bougares, H. Schwenk, and Y. Bengio (2014) Learning phrase representations using RNN encoder–decoder for statistical machine translation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 1724–1734. Cited by: §3.1.1.
  • [6] J. T. Connor, R. D. Martin, and L. E. Atlas (1994) Recurrent neural networks and robust time series prediction. IEEE transactions on neural networks 5 (2), pp. 240–254. Cited by: §3.1.1.
  • [7] M. Fey and J. E. Lenssen (2019) Fast graph representation learning with PyTorch Geometric. In ICLR Workshop on Representation Learning on Graphs and Manifolds, Cited by: §5.3.
  • [8] W. L. Hamilton, R. Ying, and J. Leskovec (2017) Representation learning on graphs: methods and applications. IEEE Data Engineering Bulletin. Cited by: §2.1.
  • [9] W. L. Hamilton, Z. Ying, and J. Leskovec (2017) Inductive representation learning on large graphs. In NIPS, Cited by: §1.
  • [10] D. P. Kingma and J. Ba (2014) Adam: a method for stochastic optimization. In 3rd International Conference on Learning Representations, ICLR, Cited by: §5.3.
  • [11] T. N. Kipf and M. Welling (2017) Semi-supervised classification with graph convolutional networks. In 5th International Conference on Learning Representations, ICLR, Cited by: §1, §3.1.2, §4, §5.5.
  • [12] D. Liben-Nowell and J. Kleinberg (2007) The link-prediction problem for social networks. J. AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY. Cited by: §2.2, §2.2.
  • [13] M. E. J. Newman (2001) Clustering and preferential attachment in growing networks.. Physical review. E, Statistical, nonlinear, and soft matter physics 64 2 Pt 2, pp. 025102. Cited by: §4.
  • [14] A. Pareja, G. Domeniconi, J. Chen, T. Ma, T. Suzumura, H. Kanezashi, T. Kaler, and C. E. Leisersen (2020) Evolvegcn: evolving graph convolutional networks for dynamic graphs. In AAAI, Cited by: §1.
  • [15] A. Paszke et al. (2019) PyTorch: an imperative style, high-performance deep learning library. In Advances in Neural Information Processing Systems 32, pp. 8024–8035. External Links: Link Cited by: §5.3.
  • [16] C. Phua, V. Lee, K. Smith, and R. Gayler (2010) A comprehensive survey of data mining-based fraud detection research. arXiv preprint arXiv:1009.6119. Cited by: §1.
  • [17] N. Siddiqi (2012) Credit risk scorecards: developing and implementing intelligent credit scoring. Vol. 3, John Wiley & Sons. Cited by: §1, §5.5.
  • [18] L. Tran, T. Tran, L. Tran, and A. Mai (2019) Solve fraud detection problem by using graph based learning methods. arXiv preprint arXiv:1908.11708. Cited by: §1, §1, §4.
  • [19] P. Velickovic, G. Cucurull, A. Casanova, A. Romero, P. Liò, and Y. Bengio (2018) Graph attention networks. In 6th International Conference on Learning Representations, ICLR, Cited by: §1, §3.1.2, §4, §5.5, §5.5.
  • [20] P. Wang, B. Xu, Y. Wu, and X. Zhou (2015) Link prediction in social networks: the state-of-the-art. Science China Information Sciences 58 (1), pp. 1–38. Cited by: §1, §1, §4, §5.2.
  • [21] M. Weber, J. Chen, T. Suzumura, A. Pareja, T. Ma, H. Kanezashi, T. Kaler, C. E. Leiserson, and T. B. Schardl (2018) Scalable graph learning for anti-money laundering: a first look. arXiv preprint arXiv:1812.00076. Cited by: §1, §4.
  • [22] B. Yu. Weisfeiler and A. A. Leman (1968) Reduction of a graph to a canonical form and an algebra arising during this reduction. Cited by: §4.
  • [23] R. Ying, R. He, K. Chen, P. Eksombatchai, W. L. Hamilton, and J. Leskovec (2018) Graph convolutional neural networks for web-scale recommender systems. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 974–983. Cited by: §1.
  • [24] M. Zhang and Y. Chen (2017) Weisfeiler-lehman neural machine for link prediction. In KDD, pp. 575–583. Cited by: 2nd item, §2.1, §4, §5.4.
  • [25] M. Zhang and Y. Chen (2018) Link prediction based on graph neural networks. In Advances in Neural Information Processing Systems, pp. 5165–5175. Cited by: 2nd item, §1, §3.1.2, §3, §4, §5.2.