Analysis of 5G academic Network based on graph representation learning method

by   Xiaoming Li, et al.
Deakin University
Tianjin University

With the rapid development of 5th Generation Mobile Communication Technology (5G), the diverse forms of collaboration and the large volume of data in academic social networks built from 5G papers make the management and analysis of these networks increasingly challenging. Although representation learning has been successful in analyzing academic and social networks, most existing representation learning models focus on preserving the first-order and second-order similarity of nodes, and rarely preserve the structural similarity of nodes that are spatially independent in the network. This paper proposes a Low-order Network representation Learning Model (LNLM) based on Non-negative Matrix Factorization (NMF) to address these problems. The model uses random walks to extract low-order features of nodes and maps multiple components into a low-dimensional space, effectively maintaining the internal correlation between components. The model is evaluated in comparative experiments on four test datasets and four real network datasets through downstream tasks such as multi-label classification, clustering, and link prediction. Comparison with eight mainstream network representation learning models shows that the proposed model improves performance on these tasks and effectively extracts local and low-order features of the network.



1 Introduction

With the rapid development of 5G technology, the analysis of 5G academic social networks is of great academic significance and helps to guide future scientific development. In recent years, researchers have studied 89 million papers published between 1900 and 2015 and found that the collaboration rate between authors has increased 25-fold in the last 116 years. In addition, as measured by the top of cited papers, more than of the world’s leading innovations were done by the organization in the early 21st century, which is almost four times as many as in the early 20th century Hoang et al. (2017). This suggests that finding valuable collaborators has become very important Sun et al. (2011), and further research has shown that scientists or teams with tightly connected collaborative networks tend to generate more research Li et al. (2014).

Research collaboration is a crucial mechanism through which knowledge and capacity, new ideas and research approaches are linked. Specifically, collaboration connects different sets of talent to generate new research. An academic collaboration network can be constructed by taking authors or institutions as the nodes of the network and the cooperative relationships between them as its links. Based on the constructed network, measures such as node degree, closeness, betweenness, and PageRank can be analyzed. However, the deeper characteristics of the cooperation network cannot be analyzed in this way. In addition, the number of papers or patents produced through academic cooperation is large, collaboration among authors follows particular tendencies, and the partners of each author are relatively few and fixed, which results in a very sparse link matrix for the author cooperation network. Recent advances in complex network representation learning Fu et al. (2020); Routh et al. (2021) allow us to mitigate the challenges of large scale and sparsity in academic collaboration networks. The benefits and effectiveness of complex network representation learning have been demonstrated in many tasks, such as node classification, link prediction, and community detection Battaglia et al. (2018); Chen et al. (2020); Zhang et al. (2018a). In recent years, a number of tasks related to academic collaboration networks have emerged, such as recommending reviewers or future cooperative research institutions, predicting scholars’ research interests Makarov et al. (2019), mining scholars’ collaboration patterns, and combining representation learning models with the scholar attributes of the academic collaboration network Yu et al. (2019); Thelwall and Kousha (2014); Zhang and Wu (2021). These works argue that relationships between scholars should be represented by co-authorship and by the research expertise described in their academic papers. Therefore, scholars should be represented by various attributes that reflect their existing collaborative structures and theoretical knowledge. However, these representation learning models ignore spatially independent structural similarity characteristics in academic cooperation networks.

Therefore, in this paper, representation learning for complex networks is applied to the academic cooperation network. While retaining the structural features of the network as much as possible, low-dimensional representations of the nodes are trained to capture both local structural features and spatially independent structural similarity features. Finally, the embedding results are applied to complex network tasks such as classification and clustering to uncover the patterns and trends of cooperation between schools and enterprises based on the underlying structure of the network.

In summary, the contributions of this paper are as follows: (1) We propose a low-order network representation learning model, namely LNLM, to map multiple components into a low-dimensional space while maintaining internal dependencies between components. (2) Compared with other representation learning models, our proposed model preserves spatially independent structural similarity characteristics in the network. (3) Evaluation on multiple datasets, including the 5G academic social network, shows that the proposed method outperforms existing higher-order dynamic network methods.

The paper is organized as follows: related work is discussed in Section 2; the proposed method is introduced in Section 3; experimental results are presented in Section 4; Section 5 concludes this paper.

2 Related work

2.1 Academic Collaborative Research

Academic network analysis is an effective way to deal with the explosive growth of academic information and the diverse associations in the academic big data environment. As a specific type of social network, academic collaboration networks play an increasingly important role in disseminating academic data and expanding academic collaboration. There are many ways to build an academic network, including co-citation, cooperation, similar research content and shared academic activities. In the analysis of authors and collaborators, Wang et al. (2019) and Evans et al. (2011) found apparent homophily in academic cooperation, as scientists tend to cooperate with those most similar to themselves. However, collaborating with people from different fields can solve complex problems, such as patents across scientific fields. Furthermore, when this diversity helps to make more dispersed networks more connected, it produces better scientific outcomes, such as high-quality papers in journals or conferences with greater impact and higher citation rates. Wang et al. (2010) proposed a time-constrained probabilistic graph model to identify advisor-advisee relationships.

Building on social network analysis, Newman studied the macro and micro characteristics of large-scale academic cooperation networks, such as node degree, node centrality and the network clustering coefficient Newman (2001), and academic cooperation research attracted renewed interest. Following up on Newman’s work in 2001, Barabási et al. investigated the dynamics and evolution of author collaboration networks Barabási et al. (2002). Since then, author collaboration networks have been studied extensively in various ways in the natural and social sciences. Kempe et al. also used author collaboration networks to identify the most influential authors Kempe et al. (2005). Liu et al. attempted to assess the status of authors in a particular field by looking at their collaborative networks, thereby strengthening ties with the community by identifying the most influential researchers Huang et al. (2013); Liu et al. (2015).

In addition to identifying the most influential authors in academic networks, academic relationship mining based on author networks also includes modelling cooperation patterns Petersen (2015); Wang et al. (2017a), recognition of academic relationships Wang et al. (2010), community detection Yu et al. (2019), and collaborator recommendation Kong et al. (2017); Qu and Xiong (2012); Wang et al. (2019); Xia et al. (2014).

2.2 Research on representation learning in complex networks

The 5G IoT social data boom is not only revolutionizing our daily lives but also generating a considerable amount of 5G-related paper data. In addition, the feature matrix of the academic cooperation network formed by 5G papers is mostly non-negative. Therefore, 5G academic social network analysis based on non-negative matrix factorization has various application scenarios. Matrix factorization-based graph embedding models usually express a graph as a matrix, such as its adjacency matrix or a proximity matrix, and factorize that matrix to obtain graph embeddings or node embeddings that preserve the structural features of the graph Goyal and Ferrara (2018). There are two types of graph embedding based on matrix factorization: one factorizes the Laplacian eigenmaps of the graph, and the other factorizes the node proximity matrix directly. The key to factorizing the Laplacian eigenmaps is to make the final embedding explain the similarity of paired nodes. Therefore, the greater the embedding distance between two nodes with higher similarity, the greater the penalty, so that the similarity between nodes in the network structure is preserved. In the initial study, MDS Hofmann and Buhmann (1995), the distance between two feature vectors was used as the strength of similarity of two nodes. However, this approach has the disadvantage that it does not take into account the neighbours around a node, i.e. every node in the graph is considered to be connected to all of the remaining nodes. To overcome this shortcoming, LPP He and Niyogi (2004) first constructs a nearest-neighbour graph from the data, in which each node only computes the Euclidean distance to its nearest nodes, i.e. the proximity of the node to its neighbouring nodes, and then uses the proximity of different nodes to form additional penalty constraints. More advanced models have been designed in recent years. For example, AGLPP Jiang et al. (2016) greatly improved the efficiency of LPP by introducing an anchor graph.

Incorporating deep learning ideas into non-negative matrix factorization has become a popular area of research. NMF can be used to learn data representations by decomposing multivariate data into the product of a basis matrix, whose columns are combined linearly, and an auxiliary matrix. Extensions of NMF add further constraints and penalty terms to improve its performance, for example to induce sparsity Hoyer (2002) or to control smoothness, as in NSNMF Pascual-Montano et al. (2006). However, these methods have two inherent problems: first, they assume that the input data can be reconstructed linearly from the basis; second, they all learn a single-layer structure, so only basic low-level features can be obtained from the original data.

There has been some exploration of deep matrix factorization to extract high-level non-linear features from network structures Guo and Zhang (2019); Song et al. (2015); Trigeorgis et al. (2014); Yu et al. (2018). The general idea is to decompose a single matrix layer into multiple layers in the hope of obtaining a hierarchical mapping. Trigeorgis et al. proposed a multi-layer semi-NMF model with a complete deep architecture that automatically learns attribute hierarchies to facilitate clustering tasks Trigeorgis et al. (2014). Song et al. proposed a multi-layer NMF structure for classification tasks, in which a non-smooth NMF replaces the standard NMF in each layer Song et al. (2015). A sparse deep NMF model was then proposed, with Nesterov’s accelerated gradient descent algorithm applied to its optimization Guo and Zhang (2019). Recently, Yu et al. proposed a deep non-smooth NMF architecture to learn part-based and hierarchical attributes Yu et al. (2018). However, all of these models consist only of decoder components. The LNLM model proposed in this paper is autoencoder-like, with an objective function that combines an encoder and a decoder. The encoder transforms the original network into the low-dimensional feature matrices hidden in the intermediate layers, and each layer of the encoder explains the similarities between nodes at a different level of granularity. The decoder components learn from the hierarchical mapping of the encoder components to effectively fuse and reconstruct the multi-feature matrix of the original network.

Symbol Definition
graph with node set and edge set
Adjacency matrix of graph
Low-order eigenmatrix of nodes
Representation matrix of the -dimension nodes
Local eigenmatrix of nodes
Capacity of Graph
Sliding Window Size
Table 1: Notations

3 Low-order network representation learning model

3.1 Model and its solution

Given an undirected network G = (V, E), V represents the set of nodes and E represents the set of edges between nodes. G can be represented by an adjacency matrix A, whose i-th row describes the connections between node i and the other nodes. For a weighted network, A_ij > 0 if there is an edge between node i and node j, and A_ij = 0 otherwise. Since the network is undirected, A is a symmetric matrix, i.e. A = A^T. Throughout this paper, matrices are written in bold capital characters. A low-order feature matrix preserves the low-order structural features of the nodes in the network, with one row per node describing the structural characteristics of that node. The purpose is to learn a low-dimensional representation of each node. Table 1 shows the notation used in this paper.
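As a small illustration of the adjacency representation described above, the following sketch builds a symmetric weighted adjacency matrix for a toy undirected network. The function name `adjacency_matrix`, the edge-triple format and the four-node example are assumptions made for illustration; they are not taken from the paper.

```python
import numpy as np

def adjacency_matrix(num_nodes, edges):
    """Build a symmetric adjacency matrix from (i, j, weight) triples."""
    A = np.zeros((num_nodes, num_nodes))
    for i, j, w in edges:
        A[i, j] = w  # edge from node i to node j
        A[j, i] = w  # mirrored entry, because the network is undirected
    return A

# Hypothetical toy co-authorship network with four authors.
toy_edges = [(0, 1, 1.0), (1, 2, 2.0), (2, 3, 1.0)]
A = adjacency_matrix(4, toy_edges)
```

Entries stay at zero for pairs of authors who never collaborated, which is why such matrices are very sparse for real collaboration networks.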

In the design of the LNLM model, this paper mainly considers two essential parts: a local structural feature encoder and a lower order feature encoder. The adjacency matrix retains most of the network topology characteristics and directly represents the first-order similarity. The model first captures the coarse-grained structure feature matrix from the original network by decomposing the adjacency matrix.

3.2 Local structural feature extraction

In the real world, information networks often lose a lot of information, and many nodes that are not directly connected in the adjacency matrix are nevertheless highly similar in nature. The adjacency matrix retains the directly connected edges, which represent the first-order similarity of the nodes, and thereby preserves the topology of the network. Therefore, the adjacency matrix is decomposed to capture this topology. In this paper, the NMF method is adopted to maintain local structure in a low-dimensional space by minimizing the following objective function:


Among them is the local eigenmatrix, it’s the spatial dimension.

Because embedding the matrix loses a lot of crucial information, lower-order features are integrated to enhance the learning of the representation. At the same time, the local eigenmatrix is decomposed to obtain low-dimensional node representations that retain local and community structures, by minimizing the following objective function:


Among them it’s an embedded matrix, is the auxiliary matrix, is the embedded dimension.

3.3 Lower order structural feature extraction

In this paper, lower-order features of nodes are extracted based on random walks. Qiu et al. proved that models based on random walks and skip-gram can be regarded as matrix factorization in closed form, and verified their effectiveness on conventional network mining tasks Qiu et al. (2018). The matrix that is implicitly approximated and factorized is as follows:


Here, the capacity (volume) of the graph is the sum of all entries of its adjacency matrix, the degree matrix is the diagonal matrix of node degrees, T is the sliding window size, and the remaining parameter is the number of negative samples in the skip-gram model. Truncated Singular Value Decomposition (SVD) is then performed on the constructed matrix to capture the eigenmatrix of the nodes in the low-dimensional space. As the window size increases, lower order structures can be obtained. Inspired by this, lower order structural features are extracted from this matrix through the following objective function:


is an auxiliary matrix.
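The closed-form matrix of Qiu et al. (2018) and the truncated SVD step can be sketched as follows for a small dense graph. The names `netmf_matrix` and `low_order_features`, the negative-sample symbol `b`, and the default values are assumptions for illustration; the sketch follows the published NetMF formulation rather than the exact low-order matrix used by LNLM.

```python
import numpy as np

def netmf_matrix(A, T=10, b=1.0):
    """Closed-form DeepWalk matrix of Qiu et al. (2018), dense small-graph version."""
    degrees = A.sum(axis=1)
    vol = degrees.sum()                   # volume of the graph
    D_inv = np.diag(1.0 / degrees)        # assumes no isolated nodes
    P = D_inv @ A                         # random-walk transition matrix
    S = np.zeros_like(A, dtype=float)
    P_r = np.eye(A.shape[0])
    for _ in range(T):
        P_r = P_r @ P                     # P^r for r = 1..T
        S += P_r
    M = (vol / (b * T)) * S @ D_inv
    return np.log(np.maximum(M, 1.0))     # truncated logarithm keeps entries >= 0

def low_order_features(A, dim, T=10, b=1.0):
    """Rank-`dim` truncated SVD embedding: left singular vectors scaled by sqrt of singular values."""
    U, s, _ = np.linalg.svd(netmf_matrix(A, T, b))
    return U[:, :dim] * np.sqrt(s[:dim])

# Hypothetical toy graph: a cycle of four nodes.
ring = np.array([[0, 1, 0, 1],
                 [1, 0, 1, 0],
                 [0, 1, 0, 1],
                 [1, 0, 1, 0]], dtype=float)
X = low_order_features(ring, dim=2, T=3)
```

The truncated logarithm yields a non-negative matrix, which is what makes the result compatible with a non-negative factorization framework. For large sparse networks a dense SVD is impractical, so this version is only suitable for illustration.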

4 Model Building

Because the network is very sparse, many node features are lost if only the first-order similarity of nodes is used. Therefore, a random walk sequence is generated for each node, and features of different orders are learned by controlling the window size. The lower order features learned from the random walks are then integrated into the NMF framework together with the captured local structures, so that the learned node representations retain both lower order feature information and local structural features. The overall architecture of this model is shown in Figure 1.

Figure 1: Framework of LNLM model

After integrating the objective functions (1), (2) and (4), the definition of the final loss function of this model is as follows:


Where , , are positive parameters used to adjust the contribution of the corresponding item.

4.1 Model optimization

Since the loss function in formula (5) is not convex, the optimal solution cannot be computed directly from its derivative. In this paper, the loss function is therefore divided into four subproblems, and the four parameter matrices are optimized separately.

A locally optimal solution to each subproblem is then obtained using the Majorization-Minimization framework Hunter and Lange (2004). The update strategy adopted is alternating optimization, that is, when one matrix is updated, the other three matrices are fixed. Algorithm 1 shows the pseudocode for the optimization process. The specific formulas are as follows:

The -related loss function is as follows:


Here, is a non-negative matrix, so the Lagrange multiplier matrix is introduced to obtain the following equivalent function:


Set the value of formula (7) to 0, that is :


Following the Karush-Kuhn-Tucker (KKT) condition on the non-negative property of , the following equation is obtained:


Initialization and updates based on are as follows


Where represents multiplying matrices
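The pattern used in this derivation (Lagrange multiplier, zero derivative, KKT condition, multiplicative update) can be illustrated on a single generic factorization term. The symbols W, H and Θ below are assumptions chosen for illustration; the derivation shows the standard steps for one term and is not a reconstruction of formulas (6) to (10).

```latex
\begin{aligned}
\mathcal{L}(W) &= \lVert A - W H^{\top} \rVert_F^2 + \operatorname{tr}\!\left(\Theta W^{\top}\right), \\
\frac{\partial \mathcal{L}}{\partial W} &= -2 A H + 2 W H^{\top} H + \Theta = 0, \\
\Theta_{ij} W_{ij} = 0 \;&\Rightarrow\; \left(A H - W H^{\top} H\right)_{ij} W_{ij} = 0, \\
W_{ij} &\leftarrow W_{ij}\, \frac{(A H)_{ij}}{(W H^{\top} H)_{ij}} .
\end{aligned}
```

Because the update multiplies each entry by a ratio of non-negative quantities, a non-negative initialization stays non-negative throughout the iterations.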

For the V-subproblem, V is updated while the other three matrices are held fixed, which gives the following objective function:


As in the previous subproblem, the update rule for V is as follows:


For the H-subproblem, H is updated while the other three matrices are held fixed, which gives the following objective function:


Similarly, the update rule for H is as follows:


For the U-subproblem, U is updated while the other three matrices are held fixed, which gives the following objective function:


Following the same optimization steps, the update rule for U is as follows:


Algorithm 1 describes the optimization process of the method. The input data include the network, the lower order eigenmatrix, the embedding dimension, the convergence coefficient and the balance parameters. First, the four parameter matrices are initialized with uniformly distributed random values. Then, they are updated iteratively until convergence. The output is the embedding matrix for all nodes in the network. The node representations learned by the model capture both lower order features and local structures.

Input: network, lower order eigenmatrix, embedded dimension, convergence coefficient, balance parameters
Output: the learned parameter matrices
Initialize the four parameter matrices
while not converged do
      if the convergence criterion is not yet met then
            update the first parameter matrix with formula (10)
            update V with formula (12)
            update H with formula (14)
            update U with formula (16)
            calculate the loss function with formula (5)
      else
            mark as converged
      end if
end while
return the four parameter matrices
Algorithm 1 LNLM model optimization
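The alternating scheme of Algorithm 1 can be sketched on a simplified two-factor problem. This is a minimal sketch, assuming a single term of the form "A approximated by W times H transposed" with the classical multiplicative updates; the names `nmf_alternating`, `W`, `H`, `eps` and `max_iter` are illustrative, and the four-matrix LNLM updates of formulas (10) to (16) are not reproduced here.

```python
import numpy as np

def nmf_alternating(A, dim, eps=1e-6, max_iter=500, seed=0):
    """Alternating multiplicative updates for A ~ W @ H.T with W, H >= 0."""
    rng = np.random.default_rng(seed)
    n, m = A.shape
    W = rng.uniform(size=(n, dim))    # uniformly distributed random initialization
    H = rng.uniform(size=(m, dim))
    tiny = 1e-12                      # guards against division by zero
    prev_loss = np.inf
    losses = []
    for _ in range(max_iter):
        W *= (A @ H) / (W @ (H.T @ H) + tiny)      # update W with H fixed
        H *= (A.T @ W) / (H @ (W.T @ W) + tiny)    # update H with W fixed
        loss = np.linalg.norm(A - W @ H.T) ** 2
        losses.append(loss)
        if prev_loss - loss < eps:    # improvement below the convergence coefficient
            break
        prev_loss = loss
    return W, H, losses

# Hypothetical toy matrix with two clearly separated blocks.
toy = np.array([[1, 1, 0, 0],
                [1, 1, 0, 0],
                [0, 0, 1, 1],
                [0, 0, 1, 1]], dtype=float)
W, H, losses = nmf_alternating(toy, dim=2)
```

Each factor is updated while the other is held fixed, mirroring the "update one matrix, fix the others" strategy described in the text; the loop stops when the loss no longer improves by more than `eps`.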

4.2 Model complexity

The entire computational complexity of the model depends on the matrix multiplication in the update rule, so given two matrices and ,Where and , computational complexity is . Based on this, the computational complexity of the four update formulas in the algorithm 1 is , , , . Since , , can be considered as input constants, and , , , the computational complexity is . In fact, most networks are very sparse, so only non-zero values are evaluated in matrix multiplication. Based on this, the calculation is simplified to , where is the number of edges in the network. , , , is the parameter matrix, so the space complexity is . Because , , and are less than , the spatial calculation is simplified to . It can be found that for most NMF-based algorithms, the complexity of the model is on the same order of magnitude.
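To make the effect of sparsity concrete, the following sketch counts multiply-add operations for one product of the adjacency matrix with a factor matrix, once treating the adjacency matrix as dense and once visiting only its non-zero entries. The cost model and the function names are assumptions for illustration; the node and edge counts are those listed for OAG in Table 3, and the dimension 128 is an illustrative choice.

```python
def dense_product_cost(num_nodes, dim):
    """Multiply-adds for a dense (n x n) @ (n x dim) product."""
    return num_nodes * num_nodes * dim

def sparse_product_cost(num_edges, dim):
    """Multiply-adds when only non-zero adjacency entries are visited.

    Each undirected edge contributes two non-zero entries.
    """
    return 2 * num_edges * dim

# OAG statistics from Table 3; dim = 128 is an illustrative choice.
n, e, k = 13890, 86784, 128
ratio = dense_product_cost(n, k) / sparse_product_cost(e, k)
```

Under this cost model the ratio equals n squared divided by twice the number of edges, so the saving depends only on how sparse the network is and not on the embedding dimension.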

To illustrate the time complexity of the model, low-dimensional node embeddings were learned for several datasets of different sizes, and the computation times are shown in Table 2. The results show that the computing time grows as the size of the network increases.

Dataset OAG Wikipedia Polblog Hep-ph
Number of nodes 13890 4777 10312 80513
Time 17.113s 0.683s 12.152s 77.330s
Table 2: Computation time on different datasets

5 Experimental Verification

In this section, extensive experiments are conducted on multi-label node classification, clustering, link prediction and visualization tasks to evaluate the model’s effectiveness. First, the datasets and experimental settings used in this paper are described. To demonstrate the validity of the model, it is then compared with the latest methods. Finally, the sensitivity to the parameters is analyzed. All experiments were run on a computer with a Windows 8 64-bit operating system, a 3.10 GHz CPU and 256 GB RAM. The datasets and the compared algorithms are detailed below.

5.1 Datasets

This section uses four widely used network datasets for multi-label node classification and four real networks, including 5G academic social networks, with ground-truth labels for clustering and link prediction. The statistical characteristics of these datasets are shown in Table 3, and each dataset is described in detail below.

  • Wikipedia: This is a word co-occurrence network built from Wikipedia. Class labels are part-of-speech (POS) tags inferred by the Stanford POS Tagger Toutanova et al. (2003).

  • Polblog Adamic and Glance (2005): PolBlog is a social network whose nodes represent the blogs of US politicians, with an edge between two nodes if their blogs are connected by web links. The label indicates the type of politician.

  • Livejournal Yang and Leskovec (2015): LiveJournal is an online social network dataset whose nodes represent bloggers, with an edge between two nodes if they are friends. The bloggers are divided into groups based on their friendships, and these groups are used as labels.

  • Orkut Yang and Leskovec (2015): Orkut is an online social network that uses users as nodes and builds links between nodes based on their friendships. The network has several clearly labelled communities, including student communities, activities, interest-based groups, and school teams.

  • GRQC, Hep-th, Hep-ph Leskovec et al. (2007): These are three author collaboration networks extracted from arXiv, covering general relativity and quantum cosmology, high energy physics theory, and high energy physics phenomenology. In these networks, vertices represent authors, and edges connect authors who have co-authored a scientific paper in arXiv.

  • Open Academic Graph (OAG) Chaudhuri et al. (2012): This is an undirected author collaboration network constructed from a publicly available academic graph indexed by Microsoft Academic and AMiner. The network contains 67,768,244 authors and 895,368,962 collaboration edges. Vertex labels are defined as each author’s top research area, such as computer science, physics, psychology, etc. There are 19 different fields (labels) in total, and an author can publish in multiple fields, so the corresponding vertices can have multiple labels.

  • Academic Social Network (ASN) Wan et al. (2019): The data include 2,092,356 papers, with 8,024,869 citations, 1,712,433 authors and 4,258,615 co-authors. Among them, the number of nodes related to 5G is 2407, and the number of edges is 1836. The papers most relevant to 5G in each of the past 10 years total 6143; they were cited 100572 times and involve 2635 authors and 6316 co-authors. Labels are assigned according to whether an item is a domestic paper, a foreign paper or a related patent, and according to the type of research in the author cooperation network.

Dataset Node Number Edge Number Category Number Multiple Tags
Wikipedia 4777 18412 40 yes
Polblog 1490 16627 2 no
Livejournal 11118 396461 26 no
Orkut 998 23050 6 no
GRQC 4158 13422 42 no
Hep-th 8638 24806 5 yes
Hep-ph 11204 57619 38 yes
OAG 13890 86784 19 yes
ASN 10407 14156 9 yes
Table 3: Dataset Statistics

5.2 Baseline methods

This section compares the proposed model with three methods based on matrix factorization and five other state-of-the-art network embedding methods. The details are as follows:

  • M-NMF Wang et al. (2017b): M-NMF unifies the community structure and the 2-hop neighbourhood relations of nodes in an NMF framework to learn node embeddings. The model assumes that node representations are consistent with the community structure of the network. It uses an auxiliary community representation matrix to connect the local features (first-order similarity) with the community structure features and optimizes them jointly. The embedding dimension in the experiment is 128, and the other parameters are set according to the original paper.

  • NetMF Qiu et al. (2018): NetMF proved that models using negative sampling, such as DeepWalk, PTE and LINE, can be expressed as closed-form matrix factorization, and confirmed that it is superior to DeepWalk and LINE in conventional network analysis and mining tasks.

  • AROPE Zhang et al. (2018b): AROPE is an SVD-based framework that shifts embedding vectors across arbitrary orders and reveals the intrinsic relationships between them, so as to preserve arbitrary high-order proximity of nodes.

  • DeepWalk Perozzi et al. (2014): DeepWalk generates random paths for each node and treats each path as a sentence in a language model. It then uses the skip-gram model to learn embedding vectors. In the experiment, the parameters are consistent with those in the original paper.

  • Node2vec Grover and Leskovec (2016): Node2vec extends DeepWalk by using biased random walks. It introduces two bias parameters, p and q, to control the random walk. All parameters use the default settings.

  • LINE Tang et al. (2015): LINE learns node embeddings by defining two loss functions that preserve the first-order and second-order proximity, respectively. This paper uses the default parameter settings, except that the negative sampling ratio is 5.

  • GAE Kipf and Welling (2016): GAE is based on a variational autoencoder and uses the same graph convolution architecture as GCN. This method is highly competitive in link prediction tasks on citation networks.

  • SDNE Wang et al. (2016): SDNE uses a deep autoencoder with a semi-supervised architecture to optimize the first-order and second-order similarity of nodes simultaneously, and uses explicit objective functions to clarify how the network structure is preserved. The parameters in the experiment are consistent with those set in the original paper.
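Several of the baselines above (DeepWalk, Node2vec) start from truncated random walks whose windows feed a skip-gram model. The following is a minimal sketch of that first stage, assuming uniform (unbiased) walks on an adjacency-list dictionary; the names `random_walks` and `window_pairs` and the toy graph are illustrative, and the skip-gram training itself is omitted.

```python
import random

def random_walks(adj, walk_length, walks_per_node, seed=0):
    """Generate truncated uniform random walks starting from every node.

    adj maps each node to the list of its neighbours.
    """
    rng = random.Random(seed)
    walks = []
    for _ in range(walks_per_node):
        nodes = list(adj)
        rng.shuffle(nodes)
        for start in nodes:
            walk = [start]
            while len(walk) < walk_length:
                neighbours = adj[walk[-1]]
                if not neighbours:      # dead end: stop this walk early
                    break
                walk.append(rng.choice(neighbours))
            walks.append(walk)
    return walks

def window_pairs(walk, window):
    """(centre, context) pairs within `window` steps, as consumed by skip-gram."""
    pairs = []
    for i, centre in enumerate(walk):
        lo, hi = max(0, i - window), min(len(walk), i + window + 1)
        pairs.extend((centre, walk[j]) for j in range(lo, hi) if j != i)
    return pairs

# Hypothetical toy path graph 0 - 1 - 2.
toy_adj = {0: [1], 1: [0, 2], 2: [1]}
walks = random_walks(toy_adj, walk_length=5, walks_per_node=2)
```

A larger window pairs each node with contexts further along the walk, which is how the window size controls the order of proximity that is captured.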

5.3 Parameter sensitivity analysis

In this section, we analyze the effects of LNLM parameters , , and on the clustering performance on real networks, where is the dimension of the local structure embedding space. The experimental results show that the clustering performance presents a similar trend in different data sets with the change of parameters. For simplicity, the effects of different parameters are discussed with an OAG dataset as an example.

The details of the parameter analysis are shown in Figure 2. The influence of the two parameters on the experimental results is analyzed below. In general, the values of , and are set based on the data in the following experiment. Explore the effect of each parameter by changing two parameters while controlling the others. For example, change , and fix and to see the effect of , . And so on. To be specific, change from 100, 200, 300, 400 and 500. Figures 2(a)-(c) and Table 7 show the performance of NMI as these parameters vary, respectively. In the figure 2(a), the clustering performance is the worst when both and are less than 10. Within certain limits, NMI values tend to be stable as and increase. The clustering performance is best when is greater than 50 and is less than 30.

(a) Relationship between NMI and cluster evaluation index
(b) Relationship between NMI and cluster evaluation index
(c) Relationship between NMI and cluster evaluation index
Figure 2: Parameter sensitivity in NMI and cluster evaluation index

As shown in Figure 2(b), along the horizontal axis NMI changes little for parameter values in [1,20], indicating that the clustering performance is relatively stable over this range. In Figure 2(c), within a specific range, NMI tends to be stable when the two parameters are linearly correlated, and it reaches its maximum when both are in the range [20,101].
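For reference, normalized mutual information (NMI) between a ground-truth labelling and a clustering can be computed as in the sketch below. The normalization by the geometric mean of the two entropies is one common convention and is an assumption here, since the text does not state which variant was used; the function name `nmi` is illustrative.

```python
import math
from collections import Counter

def nmi(labels_true, labels_pred):
    """NMI = I(U;V) / sqrt(H(U) * H(V)), one common normalization."""
    n = len(labels_true)
    count_true = Counter(labels_true)
    count_pred = Counter(labels_pred)
    count_joint = Counter(zip(labels_true, labels_pred))

    def entropy(counts):
        return -sum((c / n) * math.log(c / n) for c in counts.values())

    # Mutual information from the joint and marginal label counts.
    mutual_info = sum(
        (c / n) * math.log(c * n / (count_true[t] * count_pred[p]))
        for (t, p), c in count_joint.items()
    )
    h_true, h_pred = entropy(count_true), entropy(count_pred)
    if h_true == 0.0 or h_pred == 0.0:
        # Degenerate case: at least one labelling has a single cluster.
        return 1.0 if h_true == h_pred else 0.0
    return mutual_info / math.sqrt(h_true * h_pred)
```

The score is invariant to renaming the clusters, so a clustering that matches the ground truth up to a permutation of labels receives the maximum value.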

The parameters of the LNLM model include three hyperparameters

, and , the local structure size and the embedded size . In the experiment, set , , or , to find the optimal parameters of the model. And varies according to the number of tags. , and perform better. In the process of extracting the lower order structure matrix , the dimension is set to 128. When is greater than 200, the clustering performance gradually decreases, so the dimension of is set to 200.

5.4 Multi-classification experiment

In this section, the performance of multi-label node classification is evaluated with the Micro-F1 and Macro-F1 metrics. To reduce the effect of randomness on the experimental results, the classification process was repeated 10 times, and the average value was taken as the result. Table 4 shows the node classification performance of our model and the baselines on the four datasets. All datasets in the same table share the same window size. In these tables, bold numbers indicate the best results.
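The two metrics differ in how they aggregate over labels: Micro-F1 pools the counts of all labels before computing F1, while Macro-F1 averages the per-label F1 scores. A minimal sketch for the multi-label case follows; the function names and the three-node example are illustrative assumptions.

```python
def f1(tp, fp, fn):
    """F1 score from true positives, false positives and false negatives."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def micro_macro_f1(y_true, y_pred, num_labels):
    """y_true and y_pred are lists of label sets, one set per node."""
    tp = [0] * num_labels
    fp = [0] * num_labels
    fn = [0] * num_labels
    for true_set, pred_set in zip(y_true, y_pred):
        for label in range(num_labels):
            in_true, in_pred = label in true_set, label in pred_set
            tp[label] += in_true and in_pred
            fp[label] += in_pred and not in_true
            fn[label] += in_true and not in_pred
    micro = f1(sum(tp), sum(fp), sum(fn))            # pool counts over labels
    macro = sum(f1(tp[k], fp[k], fn[k])              # average per-label F1
                for k in range(num_labels)) / num_labels
    return micro, macro

# Hypothetical example: three nodes, two labels, label 1 is never predicted.
micro, macro = micro_macro_f1([{0}, {0, 1}, {1}], [{0}, {0}, {0}], num_labels=2)
```

Because Macro-F1 weights every label equally, rare labels that are never predicted pull it down sharply, which is consistent with the Macro scores in Table 4 being much lower than the Micro scores.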

Model     BlogCatalog     OAG             Wikipedia       ASN
          Micro   Macro   Micro   Macro   Micro   Macro   Micro   Macro
NetMF     31.54   14.86   16.01   12.10   49.9    9.25    23.86   4.26
M-NMF     21.81   6.53    18.34   11.13   48.13   7.91    25.17   8.66
LINE      23.74   13.32   11.94   9.54    41.74   9.73    26.06   7.79
DeepWalk  29.32   17.38   12.05   10.09   35.08   9.383   24.96   11.87
AROPE     33.87   14.51   19.61   12.78   52.83   10.69   16.23   10.76
GAE       27.11   25.58   16.67   11.85   50.48   10.75   26.37   11.02
LNLM      34.66   16.19   19.74   13.71   51.02   9.51    26.18   11.69
Table 4: Multi-label classification performance evaluation based on Micro/Macro-F1
Window    Hep-th          OAG             Wikipedia       ASN
          Micro   Macro   Micro   Macro   Micro   Macro   Micro   Macro
T = 2     41.10   25.15   25.47   21.21   53.57   11.98   26.32   13.50
T = 3     41.99   25.98   25.45   21.17   52.75   10.94   26.08   13.58
T = 4     42.01   25.98   25.47   21.22   52.31   10.72   27.24   13.67
T = 5     42.07   26.06   25.88   21.55   51.68   10.47   27.56   13.54
T = 6     42.11   25.89   25.66   21.41   51.25   10.31   27.37   13.72
T = 7     42.13   25.97   25.56   21.38   51.21   10.24   27.26   13.53
T = 8     42.11   25.76   25.60   21.18   50.88   10.07   27.22   13.52
T = 9     42.05   25.81   25.62   21.30   50.64   9.94    27.19   13.51
T = 10    43.15   28.48   25.92   21.59   51.83   10.96   27.13   13.50
Table 5: Effects of different T on performance evaluation of multi-label classification based on Micro/Macro-F1

According to the Micro-F1 and Macro-F1 metrics, LNLM performs better than the other models on the citation network datasets ASN, OAG and Hep-PH, which supports the effectiveness of the proposed network embedding model for the analysis of academic networks. On Wikipedia, the AROPE model performs better than the LNLM approach in both Micro-F1 and Macro-F1. This suggests that a relatively low order is sufficient to characterize the network structure of Wikipedia. The reason is that Wikipedia is a dense word co-occurrence network with an average degree of about 85, so two words that appear together within a window of size 2 are already connected by an edge. The results also show that methods based on matrix factorization alone do not perform well in the classification task, which supports combining low-order features when learning node representations on academic networks.

As mentioned earlier, the window size T determines the order of the structures that are captured. We therefore also explore the effect of the window size on multi-label classification performance, setting the window size from 1 to 10. Table 5 shows the results and trends. Because Table 4 shows the results for T = 1, Table 5 starts at T = 2. On the Hep-th, OAG, and ASN datasets, the classification performance of LNLM, measured by Micro-F1 and Macro-F1, improves as the window size T increases. However, once T reaches 4, the performance tends to be stable. On Wikipedia, classification performance gradually degrades when T is greater than 3. This indicates that a larger window is not always better, and that the window should be set according to the network sparsity to reduce the amount of computation. The LNLM model can adjust the window size according to network sparsity to learn a better node representation.
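To make the role of the window size concrete, the sketch below shows one common way a window T enters random-walk-based embedding methods: averaging the first T powers of the random-walk transition matrix. This is an assumption made for illustration and is not necessarily the low-order matrix that LNLM constructs; the names `transition_matrix` and `windowed_proximity` are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def transition_matrix(adj: Matrix) -> Matrix:
    """Row-normalise an adjacency matrix into random-walk probabilities."""
    result = []
    for row in adj:
        degree = sum(row)
        result.append([a / degree if degree else 0.0 for a in row])
    return result


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain dense matrix product, adequate for small illustrative graphs."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def windowed_proximity(adj: Matrix, window: int) -> Matrix:
    """Average of P^1 ... P^T, where P is the transition matrix and T = window."""
    p = transition_matrix(adj)
    n = len(p)
    power = [row[:] for row in p]           # current power P^r, starting at r = 1
    total = [[0.0] * n for _ in range(n)]
    for _ in range(window):
        for i in range(n):
            for j in range(n):
                total[i][j] += power[i][j]
        power = matmul(power, p)            # advance to P^(r+1)
    return [[v / window for v in row] for row in total]


# Hypothetical path graph 0 - 1 - 2: nodes 0 and 2 are only two-hop neighbours.
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
one_hop = windowed_proximity(adj, 1)
two_hop = windowed_proximity(adj, 2)
```

With T = 1 the two end nodes of the path have zero proximity; with T = 2 they become related through the middle node. On a dense network most pairs are already related at small T, which is consistent with the observation that a larger window does not always help.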

5.5 Node clustering experiment

In this section, the performance of node clustering is evaluated using normalized mutual information (NMI), a standard metric. This paper uses real-world datasets (Polblog, LiveJournal, and Orkut) to evaluate the clustering performance. NMI varies between 0 and 1, and the larger the value, the better the clustering performance. In the experiment, the standard K-means algorithm is used to obtain the clustering results of each network embedding method. Since the initial value significantly influences the clustering results, the clustering is repeated 10 times and the average value is reported as the result.
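The evaluation protocol above can be sketched as follows. The paper does not state which NMI normalisation it uses, so the geometric-mean form below is an assumption, and the repeated cluster assignments are hypothetical stand-ins for the output of ten K-means runs.

```python
import math
from collections import Counter
from typing import List, Sequence


def nmi(labels_true: Sequence[int], labels_pred: Sequence[int]) -> float:
    """Normalised mutual information, I(U;V) / sqrt(H(U) * H(V))."""
    n = len(labels_true)
    count_true = Counter(labels_true)
    count_pred = Counter(labels_pred)
    count_joint = Counter(zip(labels_true, labels_pred))

    def entropy(counts: Counter) -> float:
        return -sum((c / n) * math.log(c / n) for c in counts.values())

    h_true, h_pred = entropy(count_true), entropy(count_pred)
    if h_true == 0.0 and h_pred == 0.0:
        return 1.0  # both partitions are a single cluster and agree trivially
    mutual_info = sum(
        (c / n) * math.log(c * n / (count_true[a] * count_pred[b]))
        for (a, b), c in count_joint.items()
    )
    denom = math.sqrt(h_true * h_pred)
    return mutual_info / denom if denom > 0.0 else 0.0


def mean_nmi(labels_true: Sequence[int], runs: List[Sequence[int]]) -> float:
    """Average NMI over repeated clustering runs (e.g. different K-means seeds)."""
    return sum(nmi(labels_true, run) for run in runs) / len(runs)


# Hypothetical ground truth and two clustering runs.
truth = [0, 0, 1, 1]
perfect_run = [1, 1, 0, 0]    # same partition with cluster ids swapped
unrelated_run = [0, 1, 0, 1]  # carries no information about the truth
score = mean_nmi(truth, [perfect_run, unrelated_run])
```

NMI is invariant to how cluster ids are numbered, which is why it is suitable for comparing K-means output against ground-truth communities.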

Table 6 shows the node clustering performance in terms of NMI. Again, bold numbers indicate the best results in the table. The results show that LNLM achieves the best NMI on all the network data sets. In particular, the LNLM approach shows an improvement in NMI over the second-best approach on the Polblog dataset. This is because our method integrates low-order structure features and local structure features, and so captures diverse and comprehensive structure features of the network. SDNE and LINE only preserve the proximity between network nodes and cannot effectively maintain the community structure. DeepWalk and Node2vec, which are based on random walks, can capture second-order and even higher-order similarity, but they ignore community structure. AROPE can capture similarities between different nodes; although more global structure information is captured as the length increases, AROPE still ignores module information. M-NMF introduces a modularity term to learn node embeddings that preserve community structure. However, for sparse networks and networks with no prominent community structure, the modularity constraint in M-NMF makes the representations of nodes similar, so its performance is relatively low. The results show that methods based on matrix factorization alone perform poorly in the clustering task. The above results demonstrate the benefit of fusing low-order features into the embedding while preserving local structures.

Model     Polblog   Orkut     Livejournal
NetMF     0.324     0.557     0.688
M-NMF     0.215     0.310     0.681
DeepWalk  0.475     0.120     0.103
Node2vec  0.453     0.331     0.117
LINE      0.226     0.211     0.565
SDNE      0.077     0.213     0.743
AROPE     0.241     0.306     0.165
GAE       0.369     0.768     0.787
LNLM      0.718     0.778     0.806
Table 6: Evaluation of node clustering performance based on NMI

5.6 Link prediction experiment

In this experiment, to predict which node pairs are likely to form an edge, we hide a fraction of the edges as test data for evaluation while ensuring that the rest of the network remains connected. The remaining edges are used to train the node embedding vectors. The area under the curve (AUC) score is used to evaluate the performance of LNLM and the other benchmark methods.
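A minimal sketch of the AUC evaluation is given below. It assumes that a candidate edge is scored by the inner product of the two node embeddings and that non-edges are sampled as negatives; the paper does not specify either choice, and the names `link_prediction_auc` and `toy_embedding` are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Edge = Tuple[int, int]


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    return sum(a * b for a, b in zip(u, v))


def link_prediction_auc(
    embedding: Dict[int, Sequence[float]],
    hidden_edges: List[Edge],
    non_edges: List[Edge],
) -> float:
    """AUC = probability that a hidden true edge scores above a non-edge.

    Ties count as half a win, which matches the rank-based definition of AUC.
    """
    pos_scores = [dot(embedding[u], embedding[v]) for u, v in hidden_edges]
    neg_scores = [dot(embedding[u], embedding[v]) for u, v in non_edges]
    wins = 0.0
    for p in pos_scores:
        for q in neg_scores:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical 2-dimensional embeddings: nodes {0, 1} and {2, 3} form two groups.
toy_embedding = {0: [1.0, 0.0], 1: [0.9, 0.1], 2: [0.0, 1.0], 3: [0.1, 0.9]}
hidden = [(0, 1), (2, 3)]     # edges removed from the training graph
negatives = [(0, 2), (1, 3)]  # node pairs that are not connected
auc = link_prediction_auc(toy_embedding, hidden, negatives)
```

An AUC of 0.5 corresponds to random scoring, so several baseline values close to 0.5 in a results table indicate embeddings that carry little information about the hidden edges.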

First, we remove a fixed fraction of the edges on all network datasets to verify the performance of LNLM. As shown in Table 7, the LNLM model achieves improvements on Polblog, Orkut and LiveJournal. We note that M-NMF, which preserves the network community structure, is second only to the LNLM model on Orkut, while GAE is the closest competitor on the other datasets in Table 7.

Model     Polblog   Orkut     Livejournal   GRQC
NetMF     0.525     0.650     0.806         0.795
M-NMF     0.672     0.835     0.878         0.843
DeepWalk  0.499     0.487     0.469         0.849
Node2vec  0.495     0.516     0.498         0.530
LINE      0.471     0.470     0.515         0.508
SDNE      0.460     0.521     0.529         0.513
AROPE     0.694     0.646     0.775         0.734
GAE       0.859     0.792     0.963         0.937
LNLM      0.860     0.899     0.972         0.941
Table 7: Experimental results of link prediction on AUC

Specifically, the results on LiveJournal and Orkut are used as examples to explore the effect of the training data ratio. The results in Figure 3 show that the LNLM model performs better than all baselines on both datasets for different fractions of removed edges. Because the network structures differ, some data sets achieve the optimal prediction accuracy when of the edges are retained, while others achieve it when of the edges are retained. Overall, the results show that the LNLM model achieves excellent link prediction, indicating the effectiveness of retaining high-order characteristics and local structure information in network embedding.

Figure 3: Relation between NMI and M changes

5.7 5G academic social network analysis and prediction

From 2011 to 2014, 4G had just been successfully developed and was gradually being popularized in China. During this period, research on 5G networks was limited to the design and functional applications of 5G networks, such as the functional architecture of mobile networks, the future architectural design of 5G mobile networks, and remote patient monitoring over 5G infrastructure. Papers remained largely theoretical, and patent applications were relatively few. After 2018, however, with the discussion of construction strategies for transmission networks in the 5G era and growing attention to the challenges, methods and directions of 5G networks, papers and patents on topics such as low-complexity universal filtered multi-carrier schemes for 5G wireless systems increased rapidly.

To examine whether this research hotspot can develop further now that 5G networks are formally deployed, this paper uses the popular ARMA model to predict the number of 5G-related papers and patents. Figure 4(a) shows that the data set is a stationary non-white-noise sequence, so the series can be modeled with an ARMA model. First, we calculate the sample autocorrelation coefficient (ACF) and partial autocorrelation coefficient (PACF) of the observed series, shown in Figure 4(b) and 4(c). Then an ARMA(P, Q) model of appropriate order is selected to fit the behavior of the sample autocorrelation and partial autocorrelation coefficients. See Figure 4:

(a) Sequence Diagram
(b) ACF analysis of 5G data
(c) PACF analysis of 5G data
Figure 4: ARMA model parameters are determined
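The quantities plotted in the ACF and PACF panels can be computed as in the following minimal sketch. It uses the standard biased sample autocorrelation and the Durbin-Levinson recursion for the partial autocorrelation; the function names and the approximate 2/sqrt(n) significance band are assumptions made for illustration, not the authors' implementation.

```python
import math
from typing import List, Sequence


def sample_acf(series: Sequence[float], max_lag: int) -> List[float]:
    """Sample autocorrelation for lags 0..max_lag (biased estimator)."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    c0 = sum(d * d for d in centered) / n
    acf = [1.0]
    for k in range(1, max_lag + 1):
        ck = sum(centered[t] * centered[t + k] for t in range(n - k)) / n
        acf.append(ck / c0)
    return acf


def sample_pacf(acf: Sequence[float]) -> List[float]:
    """Partial autocorrelation for lags 0..len(acf)-1 via Durbin-Levinson."""
    pacf = [1.0]
    phi_prev: List[float] = []  # coefficients phi_{k-1,1} .. phi_{k-1,k-1}
    for k in range(1, len(acf)):
        if k == 1:
            phi_kk = acf[1]
            phi_curr = [phi_kk]
        else:
            num = acf[k] - sum(phi_prev[j] * acf[k - 1 - j] for j in range(k - 1))
            den = 1.0 - sum(phi_prev[j] * acf[j + 1] for j in range(k - 1))
            phi_kk = num / den
            phi_curr = [
                phi_prev[j] - phi_kk * phi_prev[k - 2 - j] for j in range(k - 1)
            ] + [phi_kk]
        pacf.append(phi_kk)
        phi_prev = phi_curr
    return pacf


def two_sigma_band(n: int) -> float:
    """Approximate two-standard-deviation band for a white-noise series of length n."""
    return 2.0 / math.sqrt(n)


# Hypothetical check: an ACF that decays as 0.5^k has a PACF that cuts off after lag 1.
theoretical_acf = [1.0, 0.5, 0.25, 0.125]
pacf_values = sample_pacf(theoretical_acf)
acf_values = sample_acf([1.0, 2.0, 3.0, 4.0, 5.0], max_lag=1)
```

An ACF that tails off gradually while the PACF cuts off after lag p points to an AR(p) component; the opposite pattern points to an MA(q) component, and both tailing off suggests a mixed ARMA model.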

As shown in Figure 4(b), the autocorrelation (ACF) diagram shows that the autocorrelation coefficients of all orders fluctuate within the standard deviation band, except for the order-1 coefficient, which lies within the range of 2 standard deviations. The graph has a sinusoidal fluctuation trajectory, which indicates that the decay of the autocorrelation coefficient to zero is not sudden but continuous and gradual. Based on these characteristics of the autocorrelation coefficient, we can judge that the sequence has a short-term correlation and further confirm that the sequence is stationary.

Figure 4(c) shows the partial autocorrelation coefficient decaying to zero. Notably, the first-order partial autocorrelation coefficient is within the range of 2 standard deviations, and the 13th-order partial autocorrelation coefficient is within the range of 5 standard deviations. Based on this tailing-off behavior of the autocorrelation, we ran the prediction; the results are shown in Figure 5:

Figure 5: ARMA prediction based on 5G papers and patents

As shown in Figure 5, the grey line is the 120 data points used for training, the black line is the prediction of future values, and the red lines are the upper and lower limits of the confidence interval. The actual future values are therefore expected to fall within this range at the corresponding confidence level.
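To illustrate how a point forecast and its confidence limits arise, the sketch below uses an AR(1) process as a simplified stand-in for the fitted ARMA model. The parameter values are hypothetical and are not the ones estimated in this paper; the function name `ar1_forecast` is likewise an assumption.

```python
import math
from typing import List, Tuple


def ar1_forecast(
    last_value: float,
    mean: float,
    phi: float,
    sigma: float,
    steps: int,
    z: float = 1.96,
) -> List[Tuple[float, float, float]]:
    """Return (lower, point, upper) for horizons 1..steps of an AR(1) process.

    Model: x_t - mean = phi * (x_{t-1} - mean) + noise, noise ~ N(0, sigma^2).
    z = 1.96 gives an approximate 95% interval under Gaussian noise.
    """
    forecasts = []
    variance = 0.0
    for h in range(1, steps + 1):
        point = mean + (phi ** h) * (last_value - mean)
        # The h-step forecast error variance accumulates sigma^2 * phi^(2j), j < h.
        variance += sigma ** 2 * phi ** (2 * (h - 1))
        half_width = z * math.sqrt(variance)
        forecasts.append((point - half_width, point, point + half_width))
    return forecasts


# Hypothetical parameters, chosen only to make the arithmetic easy to follow.
forecasts = ar1_forecast(last_value=10.0, mean=0.0, phi=0.5, sigma=1.0, steps=2)
```

The interval widens with the horizon because forecast errors accumulate, which is why the confidence limits in a forecast plot fan out around the predicted line.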

6 Industrial applications

This section applies the LNLM model to the 5G academic social network dataset. After embedding the network with the model, we carry out community detection and community evolution experiments to identify the core research teams, the team leaders and the development directions of different technical fields, as well as the academic relationships among cited authors and their development trends. The resulting author collaboration topology is shown in Figure 6:

Figure 6: Author collaboration topology

Because many nodes are generated, this paper displays labels only for the top 6 nodes by node degree. The larger the label of a node, the greater its degree. As shown in the figure, Mohsen Guizani has the largest degree, so he is the most central node in this network. This is because Mohsen Guizani is a well-known expert in the field of 5G: an IEEE Fellow, editor-in-chief of IEEE Network and other top international journals, and a professor in the Department of Electrical and Computer Engineering at the University of Idaho, whose research covers wireless communication and mobile computing, computer networks, and mobile cloud computing. Mohsen Guizani has therefore published many high-level papers. The other authors are also experts in the field of 5G, so they occupy relatively central positions in the collaboration network.
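Selecting the nodes to label can be sketched as follows. The edge list and node names are hypothetical placeholders rather than data from the paper, and the sketch assumes an undirected co-authorship graph in which each pair of authors appears at most once.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def top_k_by_degree(edges: Iterable[Tuple[str, str]], k: int) -> List[Tuple[str, int]]:
    """Return the k nodes with the highest degree in an undirected edge list."""
    degree: Counter = Counter()
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        degree[u] += 1
        degree[v] += 1
    # Sort by descending degree, breaking ties alphabetically for determinism.
    return sorted(degree.items(), key=lambda item: (-item[1], item[0]))[:k]


# Hypothetical co-authorship edges between placeholder authors.
toy_edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C")]
top_two = top_k_by_degree(toy_edges, k=2)
```

Degree is only one notion of centrality; it rewards authors with many distinct collaborators, which matches the way the labels are chosen in the topology figure.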

According to Figure 7, the author collaboration network can be divided into 26 categories in total, including Network Intelligence, 5G Microcell Base Station, 5G Transport Network, 5G Edge Service and Mobile Network Architecture Evolution. Each category represents a research direction of the authors.

Figure 7: Author Cluster Graph Based on 5G Academic Social Networks

To find the hot research directions of the 5G academic social network, this paper selects the top 7 categories by clustering coefficient, including Population Health, Dual-Polarized Antenna, Mobile Network Architecture Evolution, Modulation Scheme and Network Intelligence. Network Intelligence is the largest category, which means that many authors have carried out research in this direction, including many well-known professors from Stanford University, Massachusetts Institute of Technology, Tsinghua University and other institutions. They have collaborated on many papers, so researchers in this direction have significant influence.

Figure 8: Collaborative topology of cited scholars based on 5G academic social networks

Figure 8 shows the collaboration topology of cited scholars, with a total of 26,000 nodes and 125,645 edges. It can be seen from the figure that Anonymous, Rappaport TS, and Andrews JG are highly cited scholars. These three scholars are professors from Harvard University, Stanford University and Oxford University, respectively. Their research direction is 5G intelligence, and they have published many top papers. Therefore, many scholars cite their papers in their own work.

Figure 9: Cited scholar extension based on 5G academic social network

To better display the cluster diagram of cited 5G scholars, we built a collaboration network of cited scholars, which is grouped into 22 categories, including 5G Application, Channel Model, 5G Network, 5G Broadcast and 5G Fronthaul. These represent the research directions of the cited scholars. We found that they are all related to 5G research, with 5G at the core. According to their different structures, they are divided into different directions, namely categories, which represent the current hot research directions. The result is shown in Figure 9:

As shown in Figure 9, 5G Application is the category with the most nodes in the clustering, which also indicates that 5G application is a frontier direction of the 5G research field. Many scholars have carried out related research in this direction. It can be seen from the figure that Balans CA, Huang H and Pozah DM have the highest node degree in this category, which means that these three scholars are leaders in the research direction of 5G application. On checking the relevant information, we find that these three scholars are all academicians of the US National Academy of Sciences and technical engineers with America Mobile.

7 Conclusions and future work

With the rapid development of 5G technology, the analysis of 5G academic social networks is of great academic significance and helps guide future scientific development. The diverse forms of collaboration and the large scale of data in academic social networks constructed from 5G papers make the management and analysis of these networks increasingly challenging. Therefore, this paper builds a low-order feature matrix based on random walks, and the combined NMF framework allows users to control the weighting among different structural features. An efficient and scalable network embedding algorithm is proposed. This algorithm captures the local network structure in the 5G academic social network and integrates the low-order features of nodes into the framework of non-negative matrix factorization to further discover the key personnel and collaborative communities in the 5G academic social network. The robustness of the proposed algorithm is verified by multi-label classification, clustering and link prediction experiments on four widely used network datasets and three real network datasets, in comparison with eight mainstream network representation learning models.

It would be valuable to verify LNLM on a multi-layer graph; however, we cannot do this in this article due to accessibility issues. Another problem is that although our model proposes a new approach to building low-order feature matrices based on random walks, it is still challenging to deal with multi-layer sparse data sets. In future work, we will try to solve this problem through local methods.

