1 Introduction
With the rapid development of 5G technology, the analysis of 5G academic social networks is of great academic significance and helps to guide future scientific development. In recent years, researchers have studied 89 million papers published between 1900 and 2015 and found that the collaboration rate between authors has increased 25-fold over those 116 years. In addition, as measured by top-cited papers, a far larger share of the world's leading innovations were produced by organizations in the early 21st century, almost four times as many as in the early 20th century (Hoang et al., 2017). This suggests that finding valuable collaborators has become very important (Sun et al., 2011), and researchers have further demonstrated that scientists and teams with tightly connected collaborative networks tend to produce more research (Li et al., 2014).
Research collaboration is a crucial mechanism through which knowledge and capacity, new ideas, and research approaches are linked. Specifically, collaboration connects different talent sets to generate new research. An academic collaboration network can be constructed by taking authors or institutions as the nodes of the network and the cooperative relationships between them as the links. Based on the constructed network, measures such as node degree, node closeness, betweenness, and PageRank can be analyzed. However, such measures cannot capture the deeper characteristics of the cooperation network. In addition, the number of papers and patents produced by academic cooperation is large, collaboration among authors has particular tendencies, and each author's partners are relatively few and fixed, so the link matrix representing the author cooperation network is very sparse. Recent advances in complex network representation learning (Fu et al., 2020; Routh et al., 2021) allow us to mitigate the challenges of large scale and sparsity in academic collaboration networks. The benefits and effectiveness of complex network representation learning have been demonstrated in many tasks, such as node classification, link prediction, and community detection (Battaglia et al., 2018; Chen et al., 2020; Zhang et al., 2018a). In recent years, a number of tasks related to academic collaboration networks have emerged, such as recommending reviewers or future cooperative research institutions, predicting scholars' research interests (Makarov et al., 2019), mining scholars' collaboration patterns, and combining representation learning models with scholar attributes in academic collaboration networks (Yu et al., 2019; Thelwall and Kousha, 2014; Zhang and Wu, 2021).
These studies argue that scholar relationships should be represented by coauthorship and the research expertise described in academic papers. Therefore, scholars should be represented by various attributes that reflect their existing collaborative structures and domain knowledge. However, these models ignore spatially independent structural similarity characteristics in academic cooperation networks.
Therefore, in this paper, the representation learning method for complex networks is applied to the academic cooperation network. While retaining the structural features of the network as much as possible, low-dimensional representations of the nodes are trained to capture both the local structural elements of the network and the spatially independent structural similarity features. Finally, the embedding results are applied to complex network tasks such as classification and clustering to uncover cooperation laws, patterns, and trends of cooperation between universities and enterprises based on the characteristics of the network's underlying structure.
In summary, the contributions of this paper can be described as follows: (1) We propose a low-order network representation learning model, namely LNLM, to map multiple components into a low-dimensional space while maintaining internal dependencies between components. (2) Compared with other representation learning models, our proposed model preserves spatially independent structural similarity characteristics in the network. (3) By evaluating the effects on multiple datasets, including a 5G academic social network, the method proposed in this paper is shown to outperform existing higher-order dynamic network methods.
The paper is organized as follows: related work is discussed in Section 2; the proposed method is introduced in Sections 3 and 4; experimental results are presented in Section 5; the final section concludes this paper.
2 Related work
2.1 Academic Collaborative Research
Academic network analysis effectively deals with explosive educational information and diverse associations in the academic big data environment. As a specific type of social network, academic collaboration networks play an increasingly important role in disseminating academic data and expanding academic collaboration. There are many ways to build an academic network, including co-citation, cooperation, similar research content, and everyday academic activities. In the analysis of authors and collaborators, Wang et al. (2019) and Evans et al. (2011) found an apparent homogeneity in academic cooperation, as scientists tend to cooperate with others they like the most. However, collaborating with people from different fields can solve complex problems, such as patents spanning scientific fields. Furthermore, when this diversity helps make more dispersed networks more connected, it produces better scientific outcomes, such as high-quality papers in journals or conferences with greater impact and higher citation rates. Wang et al. (2010) proposed a graph model based on time-constrained probability to identify advisor-advisee relationships. Turning to analysis based on social networks, Newman (2001) used social network analysis to study the macro and micro characteristics of large-scale academic cooperation networks, such as node degree, node centrality, and network clustering coefficient, which revived interest in academic cooperation research. Following up on Newman's work in 2001, Barabâsi et al. (2002) investigated the dynamics and evolution of author collaboration networks. Since then, author collaboration networks have been extensively studied in various ways in the natural and social sciences. Kempe et al. (2005) used author collaboration networks to identify the most influential authors. Liu et al. attempted to assess the identity of authors in a particular field by examining their collaborative networks, thereby strengthening ties with the community by identifying the most influential researchers (Huang et al., 2013; Liu et al., 2015).
In addition to identifying the most influential authors in academic networks, educational relationship mining based on academic author networks also includes modelling cooperation patterns (Petersen, 2015; Wang et al., 2017a), recognition of academic relationships (Wang et al., 2010), community detection (Yu et al., 2019), and collaborator recommendation (Kong et al., 2017; Qu and Xiong, 2012; Wang et al., 2019; Xia et al., 2014).
2.2 Research on representation learning in complex networks
The 5G IoT social data boom is not only revolutionizing our daily lives, but also generating a considerable amount of 5G-related paper data. In addition, the feature matrix of the academic cooperation network formed by 5G papers is mostly non-negative. Therefore, 5G academic social network analysis based on non-negative matrix factorization (NMF) has various application scenarios. Matrix-decomposition-based graph embedding models usually express graphs as matrices, such as the adjacency matrix or an approximation matrix of a graph, and factorize these matrices to obtain graph or node embeddings that preserve the structural features of the graph (Goyal and Ferrara, 2018). There are two types of graph embedding based on matrix factorization: one factorizes the Laplacian eigenmaps of the graph, and the other factorizes an approximate node matrix directly. The key to factorizing the Laplacian eigenmaps is to make the final embedded features explain the similarity of paired nodes. Therefore, the greater the embedding distance between two nodes with higher similarity, the greater the penalty, so that the similarity characteristics between nodes in the network structure are preserved. In the initial study MDS (Hofmann and Buhmann, 1995), the distance between two feature vectors was taken as the strength of similarity of two nodes. However, this approach does not take into account the neighbours around a node; that is, every node is considered to be connected to every remaining node. To overcome this complexity, LPP (He and Niyogi, 2004) first builds a nearest-neighbour graph from the structure of the data, where each node only computes the Euclidean distance to its nearest nodes, i.e. the proximity of the node to its neighbouring nodes, and then uses the proximities of different nodes to form additional penalty constraints. More advanced models have been designed in recent years; for example, AGLPP (Jiang et al., 2016) greatly improved the efficiency of LPP by introducing an anchor graph. The incorporation of deep learning ideas into non-negative matrix factorization has also become a popular area of research. NMF can be used to learn data representations by decomposing multivariate data into the product of a linear combination basis and an auxiliary matrix. By adding additional constraints and penalty terms, such as the sparseness of Hoyer (2002) and the smoothness of NSNMF (Pascual-Montano et al., 2006), extensions of NMF are formed to improve its performance. However, these methods have two inherent problems: first, they assume that the input data can be reconstructed linearly from the basis; second, they are all single-layer structural learning, so only basic low-level features can be obtained from the original data. There has been some exploration of deep matrix decomposition to extract high-level non-linear features in network structures (Guo and Zhang, 2019; Song et al., 2015; Trigeorgis et al., 2014; Yu et al., 2018). The general idea is to decompose a matrix into multiple layers, hoping to obtain a hierarchical mapping. Trigeorgis et al. (2014) proposed a multi-layer semi-NMF model with a complete deep architecture for automatically learning attribute hierarchies to facilitate clustering tasks. Song et al. (2015) proposed a multi-layer NMF structure for classification tasks, in which a non-smooth NMF was used to solve the typical NMF in each layer. A sparse deep NMF model was then proposed, in which the Nesterov accelerated gradient descent algorithm (Guo and Zhang, 2019) was successfully applied to learn sparse structures of data objects. Recently, Yu et al. (2018) proposed a deep non-smooth NMF architecture to learn partial and hierarchical attributes. However, all of these models consist only of decoder components. The LNLM model proposed in this paper is autoencoder-like, with an objective function that combines an encoder and a decoder. The original network is transformed by the low-dimensional feature matrices hidden in the intermediate layers; each layer of the encoder explains the similarities between nodes at a different level of granularity. The decoder component seeks to learn from the hierarchical mapping of the encoder component to effectively fuse and reconstruct the multi-feature matrix of the original network.
Symbol  Definition
G = (V, E)  Graph with node set V and edge set E
A  Adjacency matrix of graph G
M  Low-order eigenmatrix of nodes
H  Representation matrix of the d-dimension nodes
V  Local eigenmatrix of nodes
vol(G)  Capacity (volume) of graph G
T  Sliding window size
3 Low-order network representation learning model
3.1 Model and its solution
Given an undirected network G = (V, E), where V represents the set of nodes and E represents the set of edges between nodes, G can be represented by an adjacency matrix A, whose row A_i represents the connections between node v_i and the other nodes. For a weighted network, A_ij > 0 if there is an edge between node v_i and node v_j, otherwise A_ij = 0. Since the network is undirected, A is a symmetric matrix, i.e. A_ij = A_ji. Throughout this paper, matrices are shown using bold capital characters. The matrix M preserves the low-order structural features of the nodes in the network, where each row M_i, of dimension k, describes the structural characteristics of node v_i. The purpose is to learn a representation H of the nodes, where d is the dimension of the representation. Table 1 shows the notation used in this paper.
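As a concrete illustration of the adjacency matrix just defined, a symmetric matrix for a small undirected weighted network can be built as follows (the toy edge list and the helper name are hypothetical, not from the paper):

```python
import numpy as np

def build_adjacency(edges, n):
    """Build the symmetric adjacency matrix A of an undirected weighted network."""
    A = np.zeros((n, n))
    for i, j, w in edges:
        A[i, j] = A[j, i] = w  # undirected network: A_ij = A_ji
    return A

# Hypothetical toy network with 4 nodes and 3 weighted edges.
A = build_adjacency([(0, 1, 1.0), (1, 2, 2.0), (0, 3, 0.5)], n=4)
```

Unconnected node pairs keep the value 0, which is what makes the matrix sparse for real collaboration networks.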
In the design of the LNLM model, this paper mainly considers two essential parts: a local structural feature encoder and a low-order feature encoder. The adjacency matrix retains most of the network topology characteristics and directly represents first-order similarity. The model first captures a coarse-grained structural feature matrix from the original network by decomposing the adjacency matrix A.
3.2 Local structural feature extraction
In the real world, information networks often lose a lot of information, and many node pairs that are not directly connected in the adjacency matrix nevertheless have high similarity in nature. In addition, the adjacency matrix retains the directly connected edges that represent the first-order similarity of the nodes; in particular, the adjacency matrix preserves the topology of the network. Therefore, the adjacency matrix is decomposed to capture the topology of the network. In this paper, the NMF method is adopted to maintain local structure in a low-dimensional space by minimizing the following objective function:
min_{V >= 0}  O_1 = || A - V V^T ||_F^2    (1)

where V is the local eigenmatrix and k is the dimension of its space.
Because embedding the adjacency matrix alone loses a lot of crucial information, low-order features are integrated to mutually enhance the learning. At the same time, by decomposing the local eigenmatrix to obtain a low-dimensional representation of nodes that retains local and community structures, the following objective function is minimized:
min_{H, W >= 0}  O_2 = || V - H W^T ||_F^2    (2)

where H is the embedding matrix, W is an auxiliary matrix, and d is the embedding dimension.
3.3 Low-order structural feature extraction
In this paper, low-order features of nodes are extracted based on random walks. Qiu et al. (2018) proved that models based on random walks and skip-gram admit a closed-form matrix factorization and verified their effectiveness on conventional network mining tasks. The implicitly approximated and factorized matrix is:

M_0 = log ( max ( (vol(G) / (b T)) ( sum_{r=1}^{T} P^r ) D^{-1} , 1 ) ),  P = D^{-1} A    (3)

where vol(G) = sum_{i,j} A_ij is the capacity (volume) of the graph G, D is the degree matrix of the graph with D_ii = sum_j A_ij, T is the sliding window size, and b is the negative sampling number in the skip-gram model. Truncated singular value decomposition (SVD) is then performed on the constructed M_0 to capture the eigenmatrix of the nodes in a low-dimensional space, that is, M_0 ≈ U_k Σ_k V_k^T with M = U_k Σ_k^{1/2}. As the window size increases, lower-order structures can be obtained. Inspired by this, low-order structural features are extracted from the matrix M:

min_{H, U >= 0}  O_3 = || M - H U^T ||_F^2    (4)

where U is an auxiliary matrix.
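The construction in Eq. (3) can be sketched directly with dense linear algebra, which is fine for small graphs; real implementations use sparse matrices and approximate SVD. The function name and the toy path graph below are illustrative assumptions:

```python
import numpy as np

def netmf_embedding(A, T=3, b=1, d=2):
    """NetMF-style closed-form matrix (Qiu et al., 2018) plus truncated SVD."""
    vol = A.sum()                         # vol(G) = sum of all edge weights (twice)
    deg = A.sum(axis=1)
    Dinv = np.diag(1.0 / deg)             # D^{-1}; assumes no isolated nodes
    P = Dinv @ A                          # random-walk transition matrix
    S = sum(np.linalg.matrix_power(P, r) for r in range(1, T + 1))
    M0 = (vol / (b * T)) * S @ Dinv
    M0 = np.log(np.maximum(M0, 1.0))      # element-wise truncated logarithm
    U, s, _ = np.linalg.svd(M0)           # truncated SVD: keep top d factors
    return U[:, :d] * np.sqrt(s[:d])

# Toy 4-node path graph, purely for illustration.
A = np.array([[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]], float)
emb = netmf_embedding(A, T=3, b=1, d=2)
```

Each row of `emb` is the low-order feature vector of one node, i.e. one row of the matrix M fed into Eq. (4).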
4 Model Building
Because the network is very sparse, many node features will be lost if only the first-order similarity of nodes is used. Therefore, a random walk sequence is generated for each node, and low-order features are learned by controlling the window size. In addition, the low-order features obtained from the random walks are integrated into the NMF framework, and the local structures are captured in the learned node representations, so that low-order feature information and local structures are retained together. The overall architecture of this model is shown in Figure 1.
After integrating the objective functions (1), (2) and (4), the final loss function of the model is defined as follows:

min_{V, W, H, U >= 0}  L = α || A - V V^T ||_F^2 + β || V - H W^T ||_F^2 + λ || M - H U^T ||_F^2    (5)

where α, β and λ are positive parameters used to adjust the contribution of the corresponding terms.
4.1 Model optimization
Since the loss function in formula (5) is not convex, its derivative cannot be used to obtain the optimal solution directly. In this paper, the loss function is divided into four subproblems, and the four parameter matrices are optimized separately.
Each subproblem is then solved for a local optimum using the Majorization-Minimization framework (Hunter and Lange, 2004). The update strategy adopted is alternating optimization; that is, when one matrix is updated, the other three matrices are fixed. Algorithm 1 shows the pseudocode for the optimization process. The specific formulas are as follows:
For the W-subproblem, with V, H and U fixed, the related loss function is as follows:

O_W = β || V - H W^T ||_F^2    (6)
Here, W is a non-negative matrix, so the Lagrange multiplier matrix Θ is introduced to obtain the following equivalent function:

L_W = β tr( (V - H W^T)(V - H W^T)^T ) + tr( Θ W^T )    (7)
Setting the derivative of formula (7) with respect to W to 0, that is:

∂L_W / ∂W = 2β ( W H^T H - V^T H ) + Θ = 0    (8)
Following the Karush-Kuhn-Tucker (KKT) condition on the non-negativity of W, i.e. Θ_ij W_ij = 0, the following equation is obtained:

( W H^T H - V^T H )_ij W_ij = 0    (9)
W is initialized randomly and, based on (9), updated as follows:

W ← W ∘ ( V^T H ) / ( W H^T H )    (10)
where ∘ represents element-wise multiplication of matrices and the division is likewise element-wise.
For the V-subproblem, when updating V, the arguments W, H and U are fixed, and the following objective function is solved:

O_V = α || A - V V^T ||_F^2 + β || V - H W^T ||_F^2    (11)
Similar to W, the update rule for V is defined as follows:

V ← V ∘ ( 2α A V + β H W^T ) / ( 2α V V^T V + β V )    (12)
For the H-subproblem, when updating H, the parameters V, W and U are fixed, giving the following objective function:

O_H = β || V - H W^T ||_F^2 + λ || M - H U^T ||_F^2    (13)
Similarly, the update rule for H is defined as follows:

H ← H ∘ ( β V W + λ M U ) / ( β H W^T W + λ H U^T U )    (14)
For the U-subproblem, when updating U, the parameters V, W and H are fixed, giving the following objective function:

O_U = λ || M - H U^T ||_F^2    (15)
Similarly, the update rule for U is defined as follows:

U ← U ∘ ( M^T H ) / ( U H^T H )    (16)
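The alternating scheme can be illustrated with the classic two-factor Lee-Seung multiplicative updates, which are monotone non-increasing in the objective. This is a generic stand-in for the four-matrix scheme of Algorithm 1, under the assumption of the standard update rules, not the exact LNLM algorithm:

```python
import numpy as np

def alternating_nmf(X, d, iters=50, eps=1e-9, seed=0):
    """Alternating optimization: update one factor while the other is fixed.
    Lee-Seung multiplicative updates for min ||X - W H||_F^2, W, H >= 0."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    W, H = rng.random((n, d)), rng.random((d, m))
    losses = []
    for _ in range(iters):
        H *= (W.T @ X) / (W.T @ W @ H + eps)   # H-step (W fixed)
        W *= (X @ H.T) / (W @ H @ H.T + eps)   # W-step (H fixed)
        losses.append(np.linalg.norm(X - W @ H))
    return W, H, losses
```

Tracking `losses` makes the Majorization-Minimization guarantee visible: each alternating step can only keep or reduce the reconstruction error.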
Algorithm 1 describes the optimization process of the method. The input includes the network G, the low-order eigenmatrix M, the embedding dimension d, the convergence coefficient, and the balance parameters α, β and λ. First, V, W, H and U are randomly initialized with a uniform distribution. Then V, W, H and U are updated iteratively until convergence. The output is the embedding matrix H for all nodes in the network. The node representation learned by the model can thus capture both low-order features and local structures.

4.2 Model complexity
The overall computational complexity of the model depends on the matrix multiplications in the update rules: given two matrices X ∈ R^{m×n} and Y ∈ R^{n×p}, their product costs O(mnp). On this basis, since the dimensions k and d can be considered input constants with k, d ≪ n, the computational complexity of the four update formulas in Algorithm 1 is O(n^2) per iteration. In fact, most networks are very sparse, so only non-zero values need to be evaluated in the matrix multiplications; the computation is then simplified to O(|E|), where |E| is the number of edges in the network. V, W, H and U are the parameter matrices, so the space complexity is dominated by their sizes; because k and d are much smaller than n, the space cost simplifies to O(n). It follows that the complexity of the model is of the same order of magnitude as most NMF-based algorithms.
To illustrate the time cost of the model, the low-dimensional embeddings of nodes were learned for several datasets of different sizes, and the running times are displayed in Table 2. The results show that as the size of the network increases, the computing time grows accordingly.
Dataset  OAG  Wikipedia  Polblog  Hepph 
Number of nodes  13890  4777  10312  80513 
Time  17.113s  0.683s  12.152s  77.330s 
5 Experimental Verification
In this section, extensive experiments are conducted on multi-label node classification, clustering, link prediction and visualization tasks to evaluate the model's effectiveness. First, the datasets and experimental settings used in this paper are described. To prove the validity of the model, it is compared with state-of-the-art methods in extensive experiments. Finally, the sensitivity of the parameters is analyzed. All the experiments were run on a computer configured with a Windows 8 64-bit operating system, a 3.10 GHz CPU and 256 GB RAM.
5.1 Datasets
This section uses four widely adopted network datasets for multi-label node classification tasks and four real networks with ground-truth labels, including the 5G academic social network, for clustering and link prediction. The statistical characteristics of these datasets are shown in Table 3 and described in detail below.

Wikipedia: This is the word co-occurrence network of Wikipedia. Class tags are part-of-speech (POS) tags inferred by the Stanford POS Tagger (Toutanova et al., 2003).

Polblog (Adamic and Glance, 2005): PolBlog is a social network whose nodes represent the blogs of US politicians; two nodes are connected by an edge if their blogs link to each other. The tag indicates the type of politician.

Livejournal (Yang and Leskovec, 2015): LiveJournal is an online social network dataset whose nodes represent bloggers, with an edge between two nodes if they are friends. Bloggers are divided into groups based on their friendships, and these groups are used as tags.

Orkut (Yang and Leskovec, 2015): Orkut is an online social network that takes users as nodes and builds links between nodes based on friendship. The network has several clearly labelled communities, including student communities, activities, interest-based groups, and school teams.

GRQC, Hepth, Hepph (Leskovec et al., 2007): These are three author collaboration networks, in general relativity and quantum cosmology, high energy physics theory, and high energy physics phenomenology, extracted from arXiv. In these networks, vertices represent authors, and edges connect authors who have co-authored a scientific paper on arXiv.

Open Academic Graph (OAG) (Chaudhuri et al., 2012): This is an undirected author collaboration network constructed from a publicly available academic graph indexed by Microsoft Academic and AMiner. The network contains 67,768,244 authors and 895,368,962 collaboration edges. Vertex tags are defined as each author's top research area, such as computer science, physics, psychology, etc. There are 19 different fields (labels) in total, and an author can publish in multiple fields, which gives the corresponding vertices multiple labels.

Academic Social Network (ASN) (Wan et al., 2019): The data include 2,092,356 papers with 8,024,869 citations, 1,712,433 authors and 4,258,615 co-author relationships. Among them, the number of nodes related to 5G is 2,407, and the number of edges is 1,836. Over the past 10 years, the top papers highly relevant to 5G total 6,143, cited 100,572 times, involving 2,635 authors and 6,316 co-authors. Nodes are labelled according to domestic papers, foreign papers, and related patents, distinguishing the different types of research in the author cooperation network.
Dataset  Node Number  Edge Number  Category Number  Multiple Tags 
Wikipedia  4777  18412  40  yes 
Polblog  1490  16627  2  no 
Livejournal  11118  396461  26  no 
Orkut  998  23050  6  no 
GRQC  4158  13422  42  no 
Hepth  8638  24806  5  yes 
Hepph  11204  57619  38  yes 
OAG  13890  86784  19  yes 
ASN  10407  14156  9  yes 
5.2 Baseline methods
This section compares the proposed model with state-of-the-art network embedding methods, including NMF-based ones. The details are as follows:

MNMF (Wang et al., 2017b): MNMF unifies community structure characteristics and the 2-hop neighbourhood relations of nodes in the NMF framework to learn node embeddings. The model exploits the consistent relationship between node representations and the network community structure. It uses an auxiliary community representation matrix to connect the local features (first-order similarity) and the community structure features of the network, and optimizes them jointly. The embedding dimension in the experiment is 128, and the other parameters are set according to the original paper.

NetMF (Qiu et al., 2018): NetMF proved that models using negative sampling, such as DeepWalk, PTE and LINE, can be decomposed into closed-form matrices, and confirmed that it is superior to DeepWalk and LINE on conventional network analysis and mining tasks.

DeepWalk (Perozzi et al., 2014): DeepWalk generates random paths for each node and treats these node paths as sentences in a language model. It then uses the skip-gram model to learn embedding vectors. In the experiment, the parameters are consistent with those in the original paper.

Node2vec (Grover and Leskovec, 2016): Node2vec extends DeepWalk by using biased random walks. It introduces two bias parameters p and q to guide the random walk. All parameters are default settings.

LINE (Tang et al., 2015): LINE learns node embeddings by defining two loss functions to preserve first-order and second-order proximity, respectively. This paper uses the default parameter settings, except that the negative ratio is 5.

GAE (Kipf and Welling, 2016): GAE is based on a variational autoencoder and has the same convolutional architecture as GCN. This method is strongly competitive on link prediction tasks in citation networks.

SDNE (Wang et al., 2016): SDNE uses a deep autoencoder with a semi-supervised architecture to optimize the first-order and second-order similarity of nodes simultaneously, and uses explicit objective functions to clarify how to preserve the network structure. The parameters in the experiment are consistent with those set in the original paper.
5.3 Parameter sensitivity analysis
In this section, we analyze the effects of the LNLM parameters α, β, λ and k on the clustering performance on real networks, where k is the dimension of the local structure embedding space. The experimental results show that the clustering performance presents a similar trend across different datasets as the parameters change. For simplicity, the effects of the different parameters are discussed with the OAG dataset as an example.
The details of the parameter analysis are shown in Figure 2. In general, the parameter values in the following experiments are set based on this analysis. The effect of each parameter is explored by changing two parameters while keeping the others fixed; for example, two of α, β and λ are varied while the remaining parameters are fixed, and so on. Specifically, k is varied over 100, 200, 300, 400 and 500. Figures 2(a)-(c) and Table 7 show the NMI performance as these parameters vary. In Figure 2(a), the clustering performance is worst when both parameters are less than 10; within certain limits, the NMI values tend to stabilize as they increase, and clustering performance is best when the first parameter is greater than 50 and the second is less than 30. As shown in Figure 2(b), horizontally, NMI does not change much over [1, 20], indicating that the clustering performance is relatively stable as the parameters increase within a certain range. In Figure 2(c), it is noted that within a specific region, NMI tends to be stable when the two parameters are linearly correlated, while NMI reaches its maximum when they are in the range [20, 101].
The parameters of the LNLM model include the three hyperparameters α, β and λ, the local structure size k and the embedding size d. In the experiments, candidate values were searched to find the optimal parameters of the model, and k varies according to the number of labels. In the process of extracting the low-order structure matrix M, the embedding dimension d is set to 128. When the local structure dimension is greater than 200, the clustering performance gradually decreases, so it is set to 200.

5.4 Multi-label classification experiment
In this section, the performance of multi-label node classification is evaluated with the metrics Micro-F1 and Macro-F1. To reduce the contingency of the experimental results, the classification process was repeated 10 times, and the average value was taken as the result. Table 4 shows the node classification performance of our model and the baselines on four datasets for window size T = 1. All datasets in the same table share the same window size T. In these tables, bold numbers indicate the best results.
Model  BlogCatalog  OAG  Wikipedia  ASN  
Micro  Macro  Micro  Macro  Micro  Macro  Micro  Macro  
NetMF  31.54  14.86  16.01  12.10  49.9  9.25  23.86  4.26 
MNMF  21.81  6.53  18.34  11.13  48.13  7.91  25.17  8.66 
LINE  23.74  13.32  11.94  9.54  41.74  9.73  26.06  7.79 
DeepWalk  29.32  17.38  12.05  10.09  35.08  9.383  24.96  11.87 
AROPE  33.87  14.51  19.61  12.78  52.83  10.69  16.23  10.76 
GAE  27.11  25.58  16.67  11.85  50.48  10.75  26.37  11.02 
LNLM  34.66  16.19  19.74  13.71  51.02  9.51  26.18  11.69 
Hepth  OAG  Wikipedia  ASN  
Micro  Macro  Micro  Macro  Micro  Macro  Micro  Macro  
= 2  41.10  25.15  25.47  21.21  53.57  11.98  26.32  13.50 
= 3  41.99  25.98  25.45  21.17  52.75  10.94  26.08  13.58 
= 4  42.01  25.98  25.47  21.22  52.31  10.72  27.24  13.67 
= 5  42.07  26.06  25.88  21.55  51.68  10.47  27.56  13.54 
= 6  42.11  25.89  25.66  21.41  51.25  10.31  27.37  13.72 
= 7  42.13  25.97  25.56  21.38  51.21  10.24  27.26  13.53 
= 8  42.11  25.76  25.60  21.18  50.88  10.07  27.22  13.52 
= 9  42.05  25.81  25.62  21.30  50.64  9.94  27.19  13.51 
= 10  43.15  28.48  25.92  21.59  51.83  10.96  27.13  13.50 
According to the evaluation indexes Micro-F1 and Macro-F1, LNLM clearly performs better than the other models on the academic network datasets ASN, OAG and Hepph, proving the effectiveness of the proposed network embedding model for the analysis of academic networks. On Wikipedia, the AROPE model performs better than LNLM in Micro-F1 and Macro-F1. This phenomenon suggests that a relatively low order is sufficient to characterize the network structure of Wikipedia. The reason is that Wikipedia is a dense word co-occurrence network with an average degree of about 85, since two words are linked whenever they appear together in a window of size 2. The results also show that methods based on matrix factorization alone do not perform well on the classification task, proving the effectiveness of combining low-order feature learning with node representation on academic networks.
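The Micro-F1 and Macro-F1 metrics used above can be computed from 0/1 label indicator matrices as follows (a self-contained sketch; in practice a library such as scikit-learn would typically be used):

```python
import numpy as np

def micro_macro_f1(Y_true, Y_pred):
    """Micro/Macro F1 for 0/1 label indicator matrices of shape (n_samples, n_labels)."""
    Y_true = np.asarray(Y_true)
    Y_pred = np.asarray(Y_pred)
    tp = ((Y_true == 1) & (Y_pred == 1)).sum(axis=0).astype(float)  # per-label true positives
    fp = ((Y_true == 0) & (Y_pred == 1)).sum(axis=0)                # per-label false positives
    fn = ((Y_true == 1) & (Y_pred == 0)).sum(axis=0)                # per-label false negatives
    micro = 2 * tp.sum() / (2 * tp.sum() + fp.sum() + fn.sum())     # pool all counts, then F1
    per_label = 2 * tp / np.maximum(2 * tp + fp + fn, 1e-12)        # F1 of each label
    return micro, per_label.mean()                                  # macro = mean per-label F1
```

Micro-F1 pools the counts over all labels and therefore favours frequent labels, while Macro-F1 weights every label equally, which is why the two numbers can diverge on imbalanced datasets such as Wikipedia.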
As mentioned earlier, the window size T determines the order of the structures captured, so the effect of the window size on multi-label classification performance is also explored here, with window sizes set from 1 to 10. Table 5 shows the relevant results and trends; because Table 4 shows the results for T = 1, Table 5 starts with T = 2. On the Hepth, OAG and ASN datasets, the proposed LNLM achieves significantly better classification performance, in both Micro-F1 and Macro-F1, as the window size T increases. However, when T reaches 4, the performance tends to stabilize. On Wikipedia, classification performance gradually degrades when T is greater than 3. This phenomenon indicates that a larger window is not always better: the window should be set dynamically according to network sparsity to reduce the amount of computation. The LNLM model can adjust the window size according to network sparsity to learn a better node representation.

5.5 Node clustering experiment
In this section, the performance of node clustering is evaluated with the typical metric of normalized mutual information (NMI). This paper uses datasets with ground-truth labels (including Polblog, Livejournal, and Orkut) to evaluate the clustering performance on real-world data. NMI varies between 0 and 1, and the larger the value, the better the clustering performance. In the experiment, the standard K-means algorithm is applied to the embeddings of each network embedding method to obtain clustering results. Since the initial value significantly influences the clustering results, the clustering is repeated 10 times and the average value is reported.
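The NMI between a predicted clustering and the ground-truth labels can be computed as follows (a minimal sketch using the geometric-mean normalization; library implementations differ slightly in the choice of normalizer):

```python
import numpy as np
from math import log, sqrt

def nmi(labels_a, labels_b):
    """Normalized mutual information between two clusterings of the same nodes."""
    a, b = np.asarray(labels_a), np.asarray(labels_b)
    mi = 0.0
    for x in np.unique(a):
        for y in np.unique(b):
            pxy = np.mean((a == x) & (b == y))         # joint probability
            px, py = np.mean(a == x), np.mean(b == y)  # marginals
            if pxy > 0:
                mi += pxy * log(pxy / (px * py))       # mutual information term
    ha = -sum(np.mean(a == x) * log(np.mean(a == x)) for x in np.unique(a))
    hb = -sum(np.mean(b == y) * log(np.mean(b == y)) for y in np.unique(b))
    return mi / max(sqrt(ha * hb), 1e-12)              # normalize by entropies
```

Note that NMI is invariant to a relabelling of the clusters, which is why K-means cluster indices can be compared directly against ground-truth categories.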
Table 6 reports node clustering performance in terms of NMI; bold numbers again indicate the best results. LNLM achieves the best NMI on all network datasets. In particular, LNLM improves NMI over the second-best approach on the Polblog dataset. This is because our method integrates low-order structural features with local structural features, capturing diverse and comprehensive structural information about the network. SDNE and LINE only preserve proximity between network nodes and cannot effectively maintain community structure. DeepWalk and Node2vec, which are based on random walks, can capture second-order and even higher-order similarity, but they ignore community structure. AROPE can capture similarities between different nodes, and although it captures more global structural information as the proximity order increases, it still misses community (module) information. MNMF introduces a modularity term to learn node embeddings that preserve community structure; however, on sparse networks and networks without a prominent community structure, the modularity constraint in NMF makes node representations overly similar, so its performance is relatively low. The results also show that methods based on matrix factorization alone perform poorly on the clustering task. Overall, these results demonstrate the benefit of fusing low-order features into the embedding while preserving local structure.
Table 6: Node clustering performance (NMI).

Method    Polblog  Orkut  Livejournal
NetMF     0.324    0.557  0.688
MNMF      0.215    0.310  0.681
DeepWalk  0.475    0.120  0.103
Node2vec  0.453    0.331  0.117
LINE      0.226    0.211  0.565
SDNE      0.077    0.213  0.743
AROPE     0.241    0.306  0.165
GAE       0.369    0.768  0.787
LNLM      0.718    0.778  0.806
5.6 Link prediction experiment
In this section's experiment, to predict which node pairs are likely to form an edge, we hide a fraction of the edges as test data while ensuring that the rest of the network remains connected. The remaining edges are used to train the node embedding vectors. The area under the ROC curve (AUC) is used to evaluate the performance of LNLM and the other benchmark methods.
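This protocol can be sketched as follows. The embeddings, held-out edges, and non-edges below are toy stand-ins, and scoring a pair by the inner product of its embeddings is one common choice rather than the paper's stated scorer:

```python
# Sketch of link-prediction evaluation: hidden edges are positives, sampled
# non-edges are negatives, and each pair is scored by embedding similarity.
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(1)
emb = rng.normal(size=(50, 8))                       # placeholder node embeddings

pos_pairs = [(i, (i + 1) % 50) for i in range(25)]   # hypothetical held-out edges
neg_pairs = [(i, (i + 25) % 50) for i in range(25)]  # hypothetical non-edges

def score(u, v):
    return float(emb[u] @ emb[v])                    # inner-product similarity

y_true = [1] * len(pos_pairs) + [0] * len(neg_pairs)
y_score = [score(u, v) for u, v in pos_pairs + neg_pairs]
print(f"AUC: {roc_auc_score(y_true, y_score):.3f}")
```

AUC here is the probability that a randomly chosen hidden edge is scored above a randomly chosen non-edge, so 0.5 corresponds to random guessing.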
First, we evaluate the performance of LNLM after removing a fixed fraction of the edges on all network datasets. As shown in Table 7, the LNLM model achieves improvements on Polblog, Orkut, and LiveJournal. We note that MNMF, which preserves the network community structure, is second only to the LNLM model in predictive power across all datasets.
Table 7: Link prediction performance (AUC).

Method    Polblog  Orkut  Livejournal  GRQC
NetMF     0.525    0.650  0.806        0.795
MNMF      0.672    0.835  0.878        0.843
DeepWalk  0.499    0.487  0.469        0.849
Node2vec  0.495    0.516  0.498        0.530
LINE      0.471    0.470  0.515        0.508
SDNE      0.460    0.521  0.529        0.513
AROPE     0.694    0.646  0.775        0.734
GAE       0.859    0.792  0.963        0.937
LNLM      0.860    0.899  0.972        0.941
Specifically, taking the results on LiveJournal and Orkut as examples, we explore the effect of the training-data ratio. The results in Figure 3 show that the LNLM model outperforms all baselines on both datasets across different fractions of removed edges. Owing to their different network structures, some datasets reach optimal prediction accuracy with a smaller fraction of retained edges, while others require a larger fraction. Overall, the results show that the LNLM model achieves excellent link prediction, indicating the effectiveness of retaining low-order characteristics and local structural information in the network embedding.
5.7 5G academic social network analysis and prediction
From 2011 to 2014, 4G had only just been successfully developed and was gradually being popularized in China. During this period, research on 5G remained at the stage of designing network functions, covering topics such as the functional architecture of the mobile network, the architectural design of future 5G mobile networks, and remote patient monitoring over 5G infrastructure; the published papers were largely theoretical, and patent applications were relatively few. After 2018, however, with discussion of transmission-network construction strategies for the 5G era and growing attention to the challenges, methods, and directions of 5G networks, papers and patents on topics such as low-complexity universal filtered multi-carrier schemes for 5G wireless systems increased rapidly.
However, with the formal deployment of 5G networks, it is unclear whether these research hotspots will develop further, so this paper uses the popular ARMA model to predict the number of 5G-related papers and patents. Figure 5(a) shows that the data set is a stationary non-white-noise sequence, so an ARMA model can be fitted to the series. First, we compute the sample autocorrelation (ACF) and partial autocorrelation (PACF) coefficients of the observed series based on Figures 5(b) and 5(c). An ARMA(p, q) model of appropriate order is then selected to match the behavior of the sample autocorrelation and partial autocorrelation coefficients, as shown in Figure 4. The autocorrelation plot in Figure 4(b) shows that the autocorrelation coefficients of all orders fluctuate within the standard-deviation band, except that the order-1 coefficient lies within two standard deviations. The plot follows a sinusoidal trajectory, indicating that the autocorrelation coefficients decay to zero gradually rather than abruptly. From these characteristics, we judge that the sequence exhibits short-term correlation, which further confirms its stationarity.
Figure 4 also shows the partial autocorrelation coefficients decaying to zero. Notably, the first-order partial autocorrelation coefficient lies within two standard deviations, and the 13th-order coefficient within five standard deviations, so the partial autocorrelation tails off. On this basis, we run the prediction; the results are shown in Figure 5.
As shown in Figure 5, the grey line is the 120 data points used for training, the black line is the prediction of future values, and the red lines are the upper and lower limits of the confidence interval, within which the actual future values are expected to fall.
6 Industrial applications
This section applies the LNLM model to the 5G academic social network datasets. After embedding, community detection and community-evolution experiments are carried out to analyze the core scientific research teams, the team leaders, the development directions of different technical fields, and the citation relationships and development trends among scholars. The resulting author collaboration topology is shown in Figure 6.
Because many nodes are generated, this paper displays labels only for the top 6 nodes by degree; the larger a node's label, the greater its degree. As shown in the figure, Mohsen Guizani has the largest degree value and is therefore the most central node in this network. Mohsen Guizani is a well-known expert in the 5G field, an IEEE Fellow, and the editor-in-chief of IEEE Network and other top international journals; he is a professor in the Department of Electrical and Computer Engineering at the University of Idaho, and his research covers wireless communication and mobile computing, computer networks, and mobile cloud computing. He has accordingly published several high-level papers. The other labeled authors are also experts in the 5G field, so they occupy relatively central positions in the collaboration network.
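Selecting the top-degree authors for labeling, as in Figure 6, can be sketched with networkx; the edge list below is a toy stand-in for the real 5G co-authorship data:

```python
# Sketch: build the author collaboration graph and rank authors by degree,
# keeping the top nodes for labeling. Edges here are illustrative only.
import networkx as nx

G = nx.Graph()
G.add_edges_from([
    ("Guizani", "A"), ("Guizani", "B"), ("Guizani", "C"), ("Guizani", "D"),
    ("A", "B"), ("B", "C"), ("D", "A"),
])

# sort authors by degree (number of distinct collaborators), descending
top6 = sorted(G.degree, key=lambda kv: kv[1], reverse=True)[:6]
print(top6)  # the highest-degree node is the most central author
```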
According to Figure 7, the author collaboration network can be divided into 26 categories in total, such as Network Intelligence, 5G Microcell Base Station, 5G Transport Network, 5G Edge Service, and Mobile Network Architecture Evolution. These categories represent the authors' respective research directions.
To identify the hot research directions in the 5G academic social network, this paper selects the top 7 categories by clustering coefficient, including Population Health, Dual-Polarized Antenna, Mobile Network Architecture Evolution, Modulation Scheme, and Network Intelligence. Network Intelligence is the largest category, which means that many authors work in this direction, including well-known professors from Stanford University, the Massachusetts Institute of Technology, Tsinghua University, and elsewhere, who have collaborated on many papers. Researchers in this direction therefore have significant influence.
Figure 8 shows the collaboration topology of cited scholars, with a total of 26,000 nodes and 125,645 edges. It can be seen from the figure that Anonymous, Rappaport TS, and Andrews JG are highly cited scholars. These three scholars are professors at Harvard University, Stanford University, and Oxford University, respectively; their research direction is 5G intelligence, and they have published many top papers. Many scholars therefore cite their work in their own papers.
To better display the clustering of cited 5G scholars, we built a collaboration network of cited scholars and grouped it into 22 categories, including 5G Application, Channel Model, 5G Network, 5G Broadcast, and 5G Fronthaul. These categories represent the research directions of the cited scholars. All of them relate to 5G research, with 5G at the core; according to their different structures, they are divided into different directions, i.e., categories, representing the current hot research directions, as shown in Figure 9.
As shown in Figure 9, 5G Application is the category with the most nodes in the clustering, which indicates that 5G applications are the frontier of the 5G research field; many scholars have carried out related research in this direction. The figure shows that Balans CA, Huang H, and Pozah DM have the highest node degrees in this category, meaning that these three scholars are the leaders in the 5G Application research direction. Checking the relevant information, we find that these three scholars are members of the US National Academy of Sciences or technical engineers with America Mobile.
7 Conclusions and future work
With the rapid development of 5G technology, the analysis of 5G academic social networks is of great academic significance and helps guide future scientific development. The diverse forms of collaboration and the large scale of the academic social networks constructed from 5G papers make their management and analysis increasingly challenging. Therefore, this paper builds a low-order feature matrix based on random walks and combines it with an NMF framework that allows users to control the weights of different structural features in the loss, yielding an efficient and scalable network embedding algorithm. This algorithm captures the local network structure of the 5G academic social network and effectively integrates the low-order features of nodes into the nonnegative matrix factorization framework, so as to discover the key researchers and collaborative communities in the 5G academic social network. The robustness of the proposed algorithm is verified through multi-label classification, clustering, and link prediction experiments on four widely used network datasets and three real-world network datasets, against eight mainstream network representation learning models.
It would be valuable to verify LNLM on a multilayer graph; however, we could not do so in this article due to data accessibility issues. Another limitation is that although our model proposes a new approach to building low-order feature matrices based on random walks, it remains challenging to handle multilayer sparse datasets. In future work, we will try to address this problem through local methods.
References
The political blogosphere and the 2004 US election: divided they blog. In Proceedings of the 3rd International Workshop on Link Discovery, pp. 36–43.
Evolution of the social network of scientific collaborations. Physica A: Statistical Mechanics and its Applications 311 (3–4), pp. 590–614.
Relational inductive biases, deep learning, and graph networks. arXiv preprint arXiv:1806.01261.
Spectral clustering of graphs with general degrees in the extended planted partition model. In Conference on Learning Theory, pp. 35–1.
Efficient community search over large directed graphs: an augmented index-based approach. In Proceedings of the International Joint Conference on Artificial Intelligence, pp. 3544–3550.
Community structure and patterns of scientific collaboration in business and management. Scientometrics 89 (1), pp. 381–396.
MAGNN: metapath aggregated graph neural network for heterogeneous graph embedding. In Proceedings of The Web Conference, pp. 2331–2341.
Graph embedding techniques, applications, and performance: a survey. Knowledge-Based Systems 151, pp. 78–94.
Node2vec: scalable feature learning for networks. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 855–864.
Sparse deep nonnegative matrix factorization. Big Data Mining and Analytics 3 (1), pp. 13–28.
Locality preserving projections. Advances in Neural Information Processing Systems 16, pp. 153–160.
A consensus-based method to enhance a recommendation system for research collaboration. In Asian Conference on Intelligent Information and Database Systems, pp. 170–180.
Multidimensional scaling and data clustering. Advances in Neural Information Processing Systems, pp. 459–466.
Nonnegative sparse coding. In Proceedings of the 12th IEEE Workshop on Neural Networks for Signal Processing, pp. 557–565.
The impact of social diversity and dynamic influence propagation for identifying influencers in social networks. In 2013 IEEE/WIC/ACM International Joint Conferences on Web Intelligence (WI) and Intelligent Agent Technologies (IAT), Vol. 1, pp. 410–416.
A tutorial on MM algorithms. The American Statistician 58 (1), pp. 30–37.
Dimensionality reduction on anchor-graph with an efficient locality preserving projection. Neurocomputing 187, pp. 109–118.
Influential nodes in a diffusion model for social networks. In International Colloquium on Automata, Languages, and Programming, pp. 1127–1138.
Variational graph auto-encoders. arXiv preprint arXiv:1611.07308.
Random walk-based beneficial collaborators recommendation exploiting dynamic research interests and academic influence. In Proceedings of the 26th International Conference on World Wide Web Companion, pp. 1371–1377.
Graph evolution: densification and shrinking diameters. ACM Transactions on Knowledge Discovery from Data (TKDD) 1 (1), pp. 2–es.
ACRec: a co-authorship based random walk model for academic collaboration recommendation. In Proceedings of the 23rd International Conference on World Wide Web, pp. 1209–1214.
A new method to construct co-author networks. Physica A: Statistical Mechanics and its Applications 419, pp. 29–39.
Dual network embedding for representing research interests in the link prediction problem on co-authorship networks. PeerJ Computer Science 5, e172.
The structure of scientific collaboration networks. Proceedings of the National Academy of Sciences 98 (2), pp. 404–409.
Nonsmooth nonnegative matrix factorization (nsNMF). IEEE Transactions on Pattern Analysis and Machine Intelligence 28 (3), pp. 403–415.
DeepWalk: online learning of social representations. In Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 701–710.
Quantifying the impact of weak, strong, and super ties in scientific careers. Proceedings of the National Academy of Sciences 112 (34), pp. E4671–E4680.
Network embedding as matrix factorization: unifying DeepWalk, LINE, PTE, and node2vec. In Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining, pp. 459–467.
RFH: a resilient, fault-tolerant and high-efficient replication algorithm for distributed cloud storage. In International Conference on Parallel Processing.
Latent representation learning for structural characterization of catalysts. Journal of Physical Chemistry Letters 12, pp. 2086–2094.
Hierarchical feature extraction by multi-layer non-negative matrix factorization network for classification task. Neurocomputing 165, pp. 63–74.
Co-author relationship prediction in heterogeneous bibliographic networks. In 2011 International Conference on Advances in Social Networks Analysis and Mining, pp. 121–128.
LINE: large-scale information network embedding. In Proceedings of the 24th International Conference on World Wide Web, pp. 1067–1077.
Academia.edu: social network or academic network? Journal of the American Society for Information Science and Technology.
Feature-rich part-of-speech tagging with a cyclic dependency network. In Proceedings of the 2003 Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics, pp. 252–259.
A deep semi-NMF model for learning hidden representations. In International Conference on Machine Learning, pp. 1692–1700.
AMiner: search and mining of academic social networks. Data Intelligence 1 (1), pp. 58–76.
Mining advisor-advisee relationships from research publication networks. In Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 203–212.
Structural deep network embedding. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1225–1234.
Sustainable collaborator recommendation based on conference closure. IEEE Transactions on Computational Social Systems 6 (2), pp. 311–322.
Scientific collaboration patterns vary with scholars' academic ages. Scientometrics 112 (1), pp. 329–343.
Community preserving network embedding. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 31, pp. 203–209.
MVCWalker: random walk-based most valuable collaborators recommendation exploiting academic factors. IEEE Transactions on Emerging Topics in Computing 2 (3), pp. 364–375.
Defining and evaluating network communities based on ground-truth. Knowledge and Information Systems 42 (1), pp. 181–213.
Learning the hierarchical parts of objects by deep non-smooth nonnegative matrix factorization. IEEE Access 6, pp. 58096–58105.
Academic team formulation based on Liebig's barrel: discovery of anti-cask effect. IEEE Transactions on Computational Social Systems 6 (5), pp. 1083–1094.
Network representation learning: a survey. IEEE Transactions on Big Data 6 (1), pp. 3–28.
Measuring academic entities' impact by content-based citation analysis in a heterogeneous academic network. Scientometrics (1).
Arbitrary-order proximity preserved network embedding. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 2778–2786.