N2VSCDNNR: A Local Recommender System Based on Node2vec and Rich Information Network

04/12/2019 ∙ by Jinyin Chen, et al.

Recommender systems are becoming more and more important in our daily lives. However, traditional recommendation methods are challenged by data sparsity and efficiency, as the numbers of users, items, and interactions between the two in many real-world applications increase rapidly. In this work, we propose a novel clustering recommender system based on node2vec technology and rich information networks, namely N2VSCDNNR, to address these challenges. In particular, we use a bipartite network to construct the user-item network, and represent the interactions among users (or items) by the corresponding one-mode projection network. In order to alleviate the data sparsity problem, we enrich the network structure according to user and item categories, and construct the one-mode projection category network. Then, considering the data sparsity problem in the network, we employ node2vec to capture the complex latent relationships among users (or items) from the corresponding one-mode projection category network. Moreover, considering the dependency on parameter settings and the information loss problem in clustering methods, we use a novel spectral clustering method, which is based on dynamic nearest-neighbors (DNN) and a novel automatically determining cluster number (ADCN) method that determines the cluster centers based on the normal distribution, to cluster the users and items separately. After clustering, we propose a two-phase personalized recommendation to realize the personalized recommendation of items for each user. A series of experiments validate the outstanding performance of our N2VSCDNNR over several advanced embedding-based and side-information-based recommendation algorithms. Meanwhile, N2VSCDNNR has relatively lower time complexity than the baseline methods in online recommendation, indicating its potential to be widely applied in large-scale systems.


I Introduction

Due to the fast development of E-commerce, more and more items are nowadays sold online. Although convenient, online shopping is becoming more time-consuming as the diversity of items increases. Recommender systems [1, 2] have thus been developed to help people find the items they are interested in and to save time in the searching process. Recommender systems can efficiently avoid information overload, a problem caused by the overwhelming growth of data. They can efficiently predict [3, 4, 5] the likely preferences of users and recommend related items to facilitate further decisions.

One of the most critical issues for recommender systems is data sparsity. With the increasing scale of the system, the number of items often reaches millions, or even billions, making the probability that two users focus on the same items quite small. The common strategy to alleviate the data sparsity problem is clustering-based recommendation [6], also known as local recommendation. Clustering-based recommendation approaches tackle the sparsity challenge by compressing the sparse network into a series of subsets. They are quite general and can be easily implemented across domains. The first clustering-based recommendation method was proposed by Ungar et al. [7], who presented a statistical clustering model and determined suitable parameters by comparing different methods. In recent years, clustering-based recommendation methods [8, 9] have attracted much attention from researchers. Typically, there are two kinds of clustering-based methods, i.e., single-set clustering and co-clustering. Single-set clustering methods, such as user clustering [10, 11, 12] and item clustering [13], cluster the variables separately, while co-clustering recommenders [14, 15, 16] cluster users and items simultaneously. By comparison, single-set clustering methods are better suited to exploiting side information, while co-clustering methods focus on transaction information [17]. In this work, we thus propose a single-set clustering-based recommendation framework to integrate both network topology and side information.

Recently, network analysis technologies have become more and more popular for complex systems [18]. Such technologies can alleviate the data sparsity problem [19] to a certain extent and capture latent information beyond explicit features. In recommender systems, it is natural to represent user-item relationships as bipartite networks. However, relatively few network analysis methods are designed to directly analyze bipartite networks [20]. In this paper, similar to [21], we first construct the corresponding one-mode projection category networks, which contain only users or only items, respectively. Then we apply a network embedding method to these one-mode networks. In particular, we employ node2vec [22], an advanced network representation learning algorithm, to automatically extract low-dimensional vectors for nodes. This method can capture both local and global structural information. Moreover, to overcome the information loss in the process of projection and representation, we integrate category information to build a more informative network.

In particular, we first construct two basic bipartite networks, i.e., the user-item network and the item-category network, based on which we further generate a user-category network. Then, we transform the three bipartite networks into two one-mode projection networks. Next, based on the two one-mode projection networks, we use the node2vec algorithm [22] to map each node into a vector, and use our SCDNN [23] to cluster users and items separately, aiming to deal with the information loss caused by the projection. Finally, we propose a two-phase personalized recommendation to realize the personalized recommendation of items for each user. Compared with [23], the main contributions of this paper are as follows.

  • We first propose a novel network embedding based clustering recommender system, namely N2VSCDNNR, which integrates item category as side information and can alleviate the data sparsity problem to a certain extent.

  • Then, we propose a novel method for automatically determining the cluster number in SCDNN, which uses the normal distribution to model the information of the data points and determines the cluster centers based on the confidence interval principle.

  • Finally, our experimental results demonstrate the outstanding performance of N2VSCDNNR over several advanced embedding and side information based recommendation algorithms, and meanwhile N2VSCDNNR has relatively lower time complexity than the others, making it suitable for online recommendations.

The remainder of the paper is organized as follows. In Sec. II, we present the related works on clustering-based and embedding-based recommender systems. In Sec. III, we propose our N2VSCDNNR, which is based on both network embedding and clustering algorithms while integrating item category as side information. In Sec. IV, we compare our N2VSCDNNR with its previous version N2VSCDNN [23] and several advanced network-embedding or side-information based recommendation algorithms on multiple real-world datasets. Finally, we conclude the paper in Sec. V.

Fig. 1: The framework of N2VSCDNNR.

II Related Works

II-A Clustering-based Recommender Systems

During the past two decades, a large number of studies on recommendation have emerged. Many recommendation methods suffer from sparsity and scalability problems. Clustering-based recommendation methods have thus been widely developed to overcome such shortcomings to a certain extent, where users (items) are grouped into multiple classes, providing a novel way to identify nearest-neighbors.

Most clustering-based recommendation methods compute the similarity based on rating data, and then employ a basic clustering algorithm, such as the K-means method, to generate the user (item) groups. Sarwar et al. [24] proposed a bisecting K-means clustering algorithm to divide the users into multiple clusters. In this method, the nearest-neighbors of a target user are selected from the partition that the user belongs to. Puntheeranurak and Tsuji [25] proposed a hybrid recommender system, where users are clustered by a fuzzy K-means clustering algorithm; the recommendation results for the original and clustered data are combined to improve traditional collaborative filtering (CF) algorithms. Rana and Jain [12] proposed a dynamic recommender system (DRS) that clusters users via an evolutionary algorithm. Wang clustered users using the K-means algorithm and then estimated the absent ratings in the user-item matrix to predict the preference of a target user. Ji et al. [26] paid more attention to discovering the implicit similarity among users and items: the authors first clustered the user (or item) latent factor vectors into cluster-level factor vectors, and then compressed the original approximation into a cluster-level rating pattern based on these cluster-level factor vectors.

II-B Embedding-based Recommender Systems

In network science, an important question is how to properly represent the network information. Network representation learning, used to learn low-dimensional representations for nodes or links in a network, can benefit a wide range of real-world applications, such as recommender systems [27, 28, 29, 30, 31, 32, 33].

Recently, the DeepWalk algorithm [34] was proposed to automatically transform each node in a network into a vector, taking full advantage of the information in the random-walk sequences of the network. Another network representation learning algorithm based on a simple neural network is the LINE algorithm [35], which can be applied to large-scale directed weighted networks. Moreover, Grover and Leskovec [22] suggested that increasing the flexibility in searching for adjacent nodes is the key to enhancing network feature learning. They thus proposed the node2vec algorithm, which learns low-dimensional representations for nodes by optimizing a neighborhood-preserving objective. It designs a flexible neighborhood sampling strategy, a biased random walk procedure that can explore neighborhoods through breadth-first sampling (BFS) [36] or depth-first sampling (DFS) [37]. It defines a second-order random walk with two parameters guiding the walk: one controls how fast the walk explores, and the other controls how fast it leaves the neighborhood of the starting node. These two parameters allow the search to interpolate between BFS and DFS and thereby reflect an affinity for different notions of node equivalence.

Then, with the development of network embedding, a number of embedding-based recommender systems were proposed in recent years. For instance, Palumbo et al. [28] proposed the entity2rec algorithm to learn user-item relatedness from knowledge graphs, so as to realize item recommendation. Grad-Gyenge et al. [31] proposed a method that maps the users and items to the same two-dimensional embedding space to make recommendations. Dong et al. [29] introduced a heterogeneous representation learning model, called Metapath2vec++, which uses meta-path-based random walks to construct the heterogeneous neighborhood of a node, leverages a heterogeneous skip-gram model to perform node embedding, and then makes recommendations based on the network representation. Gao et al. [27] proposed a network embedding method for bipartite networks, namely BiNE. It generates node sequences that can well preserve the long-tail distribution of nodes in bipartite networks by performing biased random walks purposefully; the authors make recommendations with the generated network representation. Wen et al. [30] proposed an embedding based recommendation method, in which a network embedding method first maps each user into a low-dimensional space, and the user vectors are then incorporated into a matrix factorization model for recommendation.

III The Framework of N2VSCDNNR

In this paper, in order to recommend items to users more accurately, we propose N2VSCDNNR, whose overall framework, shown in Fig. 1, consists of the following four steps (a minimal code sketch of the pipeline follows the list).

  1. Construct two bipartite networks, i.e., user-item network and item-category network, based on which we further generate a user-category network. Then, compress the three bipartite networks by one-mode projection to generate user-user projection network and item-item projection network, as shown in Fig. 1 A, B, and C.

  2. Apply the node2vec algorithm to generate the user vectors and the item vectors according to the user-user projection network and item-item projection network, respectively, as shown in Fig. 1 D.

  3. Use the SCDNN algorithm to cluster the user vectors and the item vectors into multiple clusters, respectively, as shown in Fig. 1 E.

  4. Use the two-phase personalized recommendation to recommend suitable items to users. First, recommend item-clusters to each user-cluster based on the K-means method, as shown in Fig. 1 F; second, realize the personalized recommendation of items for each user in the user cluster, as shown in Fig. 1 G and H.
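To make the above pipeline concrete, the following minimal Python sketch strings the four steps together. The helper names are hypothetical placeholders for the procedures detailed in Secs. III-A to III-D, not our released implementation.

    # A minimal skeleton of the N2VSCDNNR pipeline (steps 1-4 above).
    # All helper functions are hypothetical placeholders standing in for
    # the procedures described in Secs. III-A to III-D.
    def n2vscdnnr(user_item_edges, item_category_edges, top_n=10):
        # Step 1: bipartite networks and one-mode projections (Sec. III-A).
        user_net, item_net = build_projection_networks(user_item_edges,
                                                       item_category_edges)
        # Step 2: node2vec embeddings of the two projections (Sec. III-B).
        user_vecs = node2vec_embed(user_net)
        item_vecs = node2vec_embed(item_net)
        # Step 3: SCDNN clustering of users and items (Sec. III-C).
        user_clusters = scdnn(user_vecs)
        item_clusters = scdnn(item_vecs)
        # Step 4: two-phase personalized recommendation (Sec. III-D).
        return two_phase_recommend(user_clusters, item_clusters,
                                   user_item_edges, top_n)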

III-A One-mode Projection of Bipartite Networks

In recommender systems, it is ubiquitous to construct user-item bipartite networks based on their relationships. However, there is always a data sparsity problem, i.e., some users have very few records, making the constructed bipartite network insufficient to capture the real relationships between users and items. We thus introduce item categories to effectively alleviate the sparsity problem. In particular, the one-mode projections of bipartite networks are performed in the following four steps.

  1. Build two bipartite networks, i.e., the user-item bipartite network and the item-category bipartite network, as shown in Fig. 1 A.

  2. Build the user-category bipartite network by integrating the user-item bipartite network and the item-category bipartite network, as shown in Fig. 1 A, where the weight between a user and a category is the total number of times that the user checks items in this category.

  3. Project the user-item network into two separate networks, i.e., a user-user network and an item-item network, where the weight $w'_{ij}$ is the number of common neighbors between user (or item) $i$ and user (or item) $j$ in the corresponding bipartite network. Similarly, we obtain another user-user network and another item-item network from the two corresponding category-based bipartite networks, with the weights denoted by $w''_{ij}$. This process is shown in Fig. 1 B.

  4. For either users or items, the two projection networks are integrated into one network, as shown in Fig. 1 C, where the link weight between user (or item) $i$ and user (or item) $j$ is defined as

    $w_{ij} = w'_{ij} + w''_{ij}$    (1)

    This indicates that, compared with the traditional one-mode projection network based only on user-item relationships, our method can naturally integrate more information about items or users; a code sketch of this step follows the list.
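The projection and integration above can be sketched in Python as follows, assuming the inputs are plain edge lists of (user, item) and (item, category) pairs; reading Eq. (1) as a simple sum of the two projection weights is our assumption.

    # A minimal sketch of the one-mode projection of Sec. III-A.
    from collections import defaultdict
    from itertools import combinations

    def project(edges):
        # One-mode projection onto the left nodes of a bipartite edge
        # list; the weight w'_ij counts the common right-neighbors.
        right_to_left = defaultdict(set)
        for left, right in edges:
            right_to_left[right].add(left)
        weights = defaultdict(int)
        for left_nodes in right_to_left.values():
            for i, j in combinations(sorted(left_nodes), 2):
                weights[(i, j)] += 1  # one more common neighbor
        return weights

    def integrate(user_item, item_category):
        # Derive user-category edges, one per (user, item, category) hit.
        item_cats = defaultdict(set)
        for item, cat in item_category:
            item_cats[item].add(cat)
        user_category = [(u, c) for u, i in user_item for c in item_cats[i]]
        w1 = project(user_item)      # w'_ij from the user-item network
        w2 = project(user_category)  # w''_ij from the user-category network
        # Eq. (1), read here as an additive combination (assumption).
        return {k: w1.get(k, 0) + w2.get(k, 0) for k in set(w1) | set(w2)}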

III-B Network Representation Using Node2vec

Though we enrich the network structure with item categories, it is difficult to capture appropriate network features using traditional network analysis methods. We thus adopt node2vec to learn continuous feature representations of the nodes in a network. Owing to its flexible neighborhood sampling strategy, node2vec can learn rich representations of a network and meanwhile reduce the effect of data sparsity on the recommendation algorithm. Here, we use it to automatically capture the network features of the generated projection networks and transform each user (or item) into a vector.
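As an illustration, the sketch below embeds the integrated projection network with the open-source node2vec package (pip install node2vec networkx); the dimension and walk settings are illustrative choices, and only p = q = 1 matches the setting used in our experiments (Sec. IV-B).

    # Embedding a weighted projection network with the node2vec package.
    import networkx as nx
    from node2vec import Node2Vec

    def node2vec_embed(weights, dimensions=64):
        graph = nx.Graph()
        for (i, j), w in weights.items():  # integrated projection network
            graph.add_edge(i, j, weight=float(w))
        n2v = Node2Vec(graph, dimensions=dimensions, walk_length=30,
                       num_walks=10, p=1, q=1, weight_key='weight')
        model = n2v.fit(window=5, min_count=1)  # skip-gram over the walks
        return {node: model.wv[str(node)] for node in graph.nodes()}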

III-C Clustering Users and Items by SCDNN

After transforming the users and items into vectors, the next important step is to cluster them. In this paper, we use the SCDNN method, which is based on DNNs and an automatic cluster-number determination algorithm, to cluster users (items) into several clusters based on the corresponding user (item) vectors, as described in the following two aspects.

First, we construct the DNNs similarity matrix. Many real-world datasets are multi-scale, i.e., the users (items) in different clusters have quite different distribution densities, and many clustering methods struggle to obtain good results on such multi-scale datasets.

Considering the above problem, Zelnik-Manor and Perona [38] proposed a novel spectral clustering method, called the self-tuning spectral clustering (STSC) method, where local-scale parameters are adopted. Its Gaussian similarity function is defined as:

$S_{ij} = \exp\left(-\frac{d^2(x_i, x_j)}{\sigma_i \sigma_j}\right)$    (2)

where $\sigma_i$ is the distance between the data point $x_i$ and its $K$-th nearest-neighbor.

In Fig. 2, for two data points lying in the same dense cluster, the distance between them is small and their local-scale parameters are also small, so Eq. (2) yields a large similarity, consistent with the fact. However, consider a data point on the border of the dense cluster and a data point in the sparse cluster: based on the definition of the local-scale parameters, the former has a small $\sigma$ while the latter has a large one, so according to Eq. (2), their similarity can be larger than the similarity between two points inside the sparse cluster, which is inconsistent with the fact.
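For reference, the following short sketch implements the similarity of Eq. (2), taking the local scale of each point as the distance to its K-th nearest-neighbor:

    # Self-tuning (STSC) similarity matrix of Eq. (2).
    import numpy as np

    def stsc_similarity(X, K=7):
        d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
        sigma = np.sort(d, axis=1)[:, K]  # K-th neighbor (column 0 is self)
        S = np.exp(-d ** 2 / np.outer(sigma, sigma))
        np.fill_diagonal(S, 0.0)
        return S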

Fig. 2: An example of multi-scale dataset.
Fig. 3: (a) The original data distribution; (b) the density-distance ($\rho$-$\delta$) distribution; and (c) the density distribution of $\gamma$, for a simple dataset.

Therefore, we can find that the similarity can indeed be affected by the density difference between data points. The density $\rho_i$ of the data point $x_i$ can be calculated from its $K$ nearest-neighbors as follows:

$\rho_i = \exp\left(-\frac{1}{K}\sum_{d \in D_i} d\right)$    (3)

$D_i = \{d(x_i, x_j) \mid x_j \text{ is among the } K \text{ nearest-neighbors of } x_i\}$    (4)

where $D_i$ is the set containing the first $K$ shortest distances between $x_i$ and the other data points.

Considering the above analysis, we propose a novel similarity function, namely the DNNs similarity function, defined by:

$S_{ij} = \exp\left(-\frac{d^2(x_i, x_j)}{\sigma_i \sigma_j}\right), \quad i \neq j$    (5)

$S_{ii} = 0$    (6)

The local-scale parameter $\sigma_i$ of data point $x_i$ is determined by the average distance between the data point and its DNNs, defined as:

$\sigma_i = \frac{1}{|\mathrm{DNN}(x_i)|} \sum_{x_j \in \mathrm{DNN}(x_i)} d(x_i, x_j)$    (7)

$\mathrm{DNN}(x_i) = \{x_j \in N_i \mid |\rho_i - \rho_j| \le \theta\}$    (8)

where $N_i$ is the initial neighbor set [39] for the data point $x_i$, and $\theta$ represents the density difference threshold.

In Fig. 2, consider again two data points located in the dense cluster and two located in the sparse cluster. Based on the definition of the DNN set, the dynamic nearest-neighbors of each data point are restricted to points of similar density, so the local-scale parameters of the dense-cluster points remain small while those of the sparse-cluster points remain large. Therefore, according to Eqs. (5) and (6), the within-cluster similarities become larger than the cross-cluster ones, which is now consistent with the fact, indicating the effectiveness of this new definition of similarity.
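A sketch of the DNN similarity matrix under the above reading of Eqs. (3)-(8) is given below; the exponential density of Eq. (3) and the fallback for an empty DNN set are our assumptions.

    # DNN similarity matrix of Eqs. (3)-(8) (one possible reading).
    import numpy as np

    def dnn_similarity(X, K=7, theta=0.1):
        d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
        order = np.argsort(d, axis=1)
        knn = order[:, 1:K + 1]  # initial neighbor sets N_i
        # Eqs. (3)-(4): density from the K shortest distances (assumed form).
        rho = np.exp(-d[np.arange(len(X))[:, None], knn].mean(axis=1))
        sigma = np.empty(len(X))
        for i in range(len(X)):
            # Eq. (8): keep only initial neighbors of similar density.
            dnn = [j for j in knn[i] if abs(rho[i] - rho[j]) <= theta]
            if not dnn:
                dnn = [knn[i][-1]]  # assumed fallback: the K-th neighbor
            sigma[i] = d[i, dnn].mean()  # Eq. (7): average DNN distance
        S = np.exp(-d ** 2 / np.outer(sigma, sigma))  # Eq. (5)
        np.fill_diagonal(S, 0.0)  # Eq. (6)
        return S, rho, d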

Second, to automatically determine the cluster number, we use the automatically determining cluster number (ADCN) method. Based on the fast density clustering algorithm [40], our proposed ADCN method determines the cluster centers automatically by constructing a normal distribution for the density-distance mapping, from which all the cluster centers can be figured out.

Definition 3: The minimum distance $\delta_i$ of data point $x_i$ is defined as the minimum distance between it and the data points of higher density:

$\delta_i = \min_{x_j \in H_i} d(x_i, x_j)$    (9)

where the set $H_i$ is composed of the data points that have higher density than data point $x_i$.

Now, based on Eq. (3) and Eq. (9), we can calculate the density-distance ($\rho$-$\delta$) distribution. As an example, Fig. 3 (a) and (b) give the original data distribution and the $\rho$-$\delta$ distribution, respectively, for a simple dataset, where we can see that the cluster centers have relatively larger $\rho$ and $\delta$ than the other data points.

Based on the density-distance distribution, we further introduce a variable $\gamma_i$ for each data point $x_i$, defined as

$\gamma_i = \rho_i \cdot \delta_i$    (10)

to determine the cluster centers more automatically. The density distribution of $\gamma$ is shown in Fig. 3 (c), where we can see that it is close to a normal distribution. The cluster centers have relatively larger $\gamma$ than the other data points, and thus can be considered as the exceptional data points. Suppose the mean value of $\gamma$ is $\mu$ and the standard deviation is $\sigma$. Based on the pauta criterion [41], the probability that $\gamma$ falls into the confidence interval $[\mu - 3\sigma, \mu + 3\sigma]$ is 99.73%. Since the number of users or the number of items in a recommender system is typically quite large, we enlarge the confidence interval to $[\mu - 6\sigma, \mu + 6\sigma]$, so that the probability that $\gamma$ falls into this interval is close to 99.9999998%. In this case, we have high confidence that almost all the data points are contained in the interval, while the exceptions, i.e., the data points with $\gamma_i > \mu + 6\sigma$, are cluster centers. In particular, we use the following two steps to automatically determine the cluster centers: 1) calculate the mean value $\mu$ and the standard deviation $\sigma$ of $\gamma$; 2) treat the data points with $\gamma_i > \mu + 6\sigma$ as the cluster centers, e.g., the labeled points in Fig. 3 (c).
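The center selection can then be sketched as follows, reusing the distance matrix and densities computed above; the 6-sigma cutoff reflects our reading of the enlarged confidence interval.

    # ADCN center selection following Eqs. (9)-(10).
    import numpy as np

    def adcn_centers(d, rho):
        n = len(rho)
        delta = np.empty(n)
        for i in range(n):
            higher = np.where(rho > rho[i])[0]  # the set H_i in Eq. (9)
            delta[i] = d[i, higher].min() if len(higher) else d[i].max()
        gamma = rho * delta  # Eq. (10)
        mu, std = gamma.mean(), gamma.std()
        return np.where(gamma > mu + 6 * std)[0]  # exceptional points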

III-D Two-phase Personalized Recommendation

After clustering the users and items according to their vectors using the SCDNN algorithm, we first recommend item clusters to user clusters, and then further realize the personalized recommendation of items for each user. We call this the two-phase personalized recommendation, described as follows.

  1. First, we quantify each item cluster by the number of user-item relationships between it and the target user cluster. Then, we use a basic clustering method, the K-means method, to divide all item clusters into two classes, and recommend the item clusters in the class with the larger average weight to the user cluster (see the sketch after this list).

  2. Based on the above cluster-level recommendation results, traditional recommendation algorithms are adopted to recommend items in the selected item clusters to the users in each user cluster, based on the related rating records.
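Phase one can be sketched as below: a one-dimensional K-means with k = 2 splits the item clusters by their link weights to the target user cluster, and the heavier class is recommended (the variable names are illustrative).

    # Phase one of the two-phase recommendation.
    import numpy as np
    from sklearn.cluster import KMeans

    def select_item_clusters(weights):
        # weights[c] = number of user-item records between the target user
        # cluster and item cluster c (one row of the cluster network).
        w = np.asarray(weights, dtype=float).reshape(-1, 1)
        labels = KMeans(n_clusters=2, n_init=10).fit_predict(w)
        heavy = max((0, 1), key=lambda l: w[labels == l].mean())
        return np.where(labels == heavy)[0]  # item clusters to recommend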

IV Experiments

The proposed method is tested on multiple real-world datasets. In this paper, the node2vec [22] method is implemented in Python, the SCDNN method is implemented in Matlab and the conventional recommendation methods are implemented in R. In this section, we first introduce the datasets and the recommendation algorithms for comparison. Meanwhile, we also visualize the networks established in our framework to help readers better understand our N2VSCDNNR. Finally, we give the experimental results with explanations.

IV-A Datasets

The real-world datasets used in the experiments are described as follows, with their basic statistics summarized in TABLE I.

  • Yelp: The Yelp dataset includes the user reviews on Yelp from 11 cities across 4 countries. Here, two American cities, i.e., Pittsburgh and Madison, are chosen, and the reviews are utilized to define user-item interactions. There are 161 different categories of items for Pittsburgh, and 150 categories of items for Madison.

  • Amazon: The Amazon dataset contains user reviews on Amazon over 26 types of goods. Here, the Musical Instruments dataset is chosen and the reviews are utilized to define user-item interactions. There are 912 categories of items in total.

  • MovieLens: The MovieLens dataset contains user ratings of movies on MovieLens; the ratings are utilized to define user-item interactions. Note that it only contains users with more than 20 ratings and demographic information. There are 50 categories of movies in total.

Dataset #User #Item #Link #Category
Yelp (Pittsburgh) 466 1,672 10,373 161
Yelp (Madison) 332 1,172 5,597 150
Amazon 6,831 32,054 71,661 912
MovieLens 943 1,682 100,000 50
TABLE I: Statistics of the four datasets.
Fig. 4: Based on the one-mode projection of (a) the original user-item bipartite network, we obtain (b) the user-user projection network and (c) the item-item projection network.

IV-B Models and Algorithms for Comparison

In order to evaluate the proposed framework, we compare our method with the following two models.

  • Original: We directly use the traditional recommendation algorithms.

  • N2VSCDNN: This method was proposed in  [23]. As the previous version of N2VSCDNNR, it is purely based on user-item interactions without any category information.

In the two-phase personalized recommendation, we use the following four popular recommendation algorithms.

  • UBCF [42]: User-based collaborative filtering (UBCF) first finds the users similar to the target user and then recommends items based on these similar users.

  • IBCF [43]: Item-based collaborative filtering (IBCF) is a collaborative filtering algorithm based on the similarity between items, calculated using the ratings of these items.

  • NMF [44]: Non-negative matrix factorization (NMF) is a group of algorithms based on multivariate analysis. It seeks to approximate the input matrix by the product of $k$-dimensional low-rank representations; in this paper, the dimension $k$ is fixed in advance.

  • Popular: It divides the subset of users and items according to certain rules or attributes, and then recommends the items with the highest popularity to the users.

To validate the effectiveness of our N2VSCDNNR, we choose two advanced embedding-based recommendation algorithms as well as a side-information-based one [45, 46, 47, 48, 27] for comparison, which are briefly described as follows.

  • Metapath2vec++ [29]: This is the state-of-the-art method for embedding heterogeneous networks. The meta-path scheme chosen in our experiments is item-user-item.

  • BiNE [27]: As a novel network representation method, it was proposed to learn the representations for bipartite networks. It jointly models both the explicit relations and high-order implicit relations in learning the representation for nodes.

  • CoFactor [49]: This is a co-factorization model inspired by word2vec [50, 51], which jointly decomposes the user-item interaction matrix and the item-item co-occurrence matrix with shared item latent factors.

For each network representation learning method, we use the implementations released by the authors, adopt the inner product kernel to estimate the preference of user $u$ on item $i$, and evaluate performance on the top-ranked results.

In this paper, we use 5-fold cross-validation to evaluate the performance of the methods, based on four basic measurements in Top-N recommendation: precision, recall, hit rate (HR), and average reciprocal hit-rank (ARHR).
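For clarity, the sketch below shows how the inner-product scoring and the HR/ARHR measurements can be computed for one fold; scoring every candidate item and counting the first hit per user are our reading of the protocol.

    # Top-N evaluation with inner-product scoring (one fold).
    import numpy as np

    def evaluate_top_n(user_vecs, item_vecs, test_items, n=10):
        hits, arhr = 0, 0.0
        for u, relevant in test_items.items():  # user -> held-out items
            scores = {i: float(np.dot(user_vecs[u], v))
                      for i, v in item_vecs.items()}
            top = sorted(scores, key=scores.get, reverse=True)[:n]
            ranks = [r for r, i in enumerate(top, 1) if i in relevant]
            if ranks:
                hits += 1               # this user is hit at least once
                arhr += 1.0 / ranks[0]  # reciprocal rank of the first hit
        return hits / len(test_items), arhr / len(test_items)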

As indicated in [52], good clustering results can be obtained when the neighborhood parameter in the ADCN method is around 1%-2% of the dataset size; we therefore determine the cluster number under this setting. Moreover, to better reflect the network structure, we tune the feature representation dimension, and set the in-out and return parameters both to 1 in the node2vec algorithm.

IV-C Network Visualization

In order to provide a more intuitive view of our method, we visualize the networks generated in the process of analyzing the Yelp (Madison) dataset as an example.

First, the user-item bipartite network is shown in Fig. 4 (a), where users and items are denoted by yellow and blue nodes, respectively. The node size is proportional to its degree in the network. Then, the bipartite network is compressed by one-mode projection to get the corresponding user projection network and item projection network, as shown in Fig. 4 (b) and (c), respectively.

Next, the node2vec algorithm is applied to transform the nodes in the user (item) projection category network into user (item) vectors, and the SCDNN algorithm is used to cluster the users (items), as shown in Fig. 5 (a). Based on this, a weighted bipartite network is generated between the user and item clusters, as shown in Fig. 5 (b), where user (item) clusters are denoted by yellow (blue) nodes. The node size is proportional to the number of users (items) in the cluster, and each link is weighted by the total number of relationships between the users (items) in the corresponding user (item) clusters.

Fig. 5: (a) The original user-item bipartite network with the user and item nodes clustered based on the corresponding node vectors. (b) The weighted bipartite network between user clusters and item clusters. (c) The user-item bipartite subnetwork by only considering the user nodes in the cluster U1. (d) The weighted bipartite network between the user cluster U1 and all the item clusters.

Taking user cluster U1 as an example, its relationships with all the item clusters are shown in Fig. 5 (c)-(d), where we can see that U1 has relatively stronger relationships with the item clusters I2 and I7 than with the others. The K-means algorithm is then used to divide all item clusters into two classes and recommend those in the class with the larger average weight to the target user cluster U1, i.e., here we recommend I2 and I7 to U1. Finally, different recommendation algorithms are used to recommend items in I2 and I7 to each user in U1, according to their ratings.

IV-D Results

In this section, we first compare the results obtained by the three models, including Original, N2VSCDNN, and N2VSCDNNR, based on the four basic recommendation algorithms, including NMF, UBCF, IBCF, and Popular, as shown in Fig. 6. We can see that, in general, our N2VSCDNNR behaves better than N2VSCDNN, both of which behave better than Original, in almost all cases, for any basic recommendation algorithm and any performance measurement. Such superiority is quite significant for the first three datasets, i.e., Yelp (Pittsburgh), Yelp (Madison), and Amazon, when adopting the NMF and UBCF recommendation algorithms, while for MovieLens it is significant only when UBCF is adopted. Meanwhile, when comparing the four basic recommendation algorithms, NMF behaves the best, UBCF follows, while IBCF and Popular behave the worst, for any model on any dataset.

(a) Yelp (Pittsburgh)
(b) Yelp (Madison)
(c) Amazon
(d) MovieLens
Fig. 6: The recommendation results of the four basic recommendation algorithms under the Original, N2VSCDNN, and N2VSCDNNR models on the four datasets: (a) Yelp (Pittsburgh), (b) Yelp (Madison), (c) Amazon, and (d) MovieLens.
(a) Yelp (Pittsburgh)
(b) Yelp (Madison)
(c) Amazon
(d) MovieLens
Fig. 7: The recommendation results of various baselines on the four datasets: (a) Yelp (Pittsburgh), (b) Yelp (Madison), (c) Amazon, and (d) MovieLens.

In the following, we thus mainly focus on the NMF and UBCF recommendation algorithms, and try to reveal the relative performance improvements introduced by N2VSCDNN and N2VSCDNNR, respectively, compared with the Original model. The results are presented in TABLEs II-V. Overall, larger relative improvements are obtained by our N2VSCDNNR in most cases, for each performance metric. Consistent with the intuitive picture of Fig. 6, for the three datasets Yelp (Pittsburgh), Yelp (Madison), and Amazon, such improvements are relatively significant for both the NMF and UBCF recommendation algorithms, while for MovieLens, the improvements are remarkable only when our N2VSCDNNR model and the UBCF recommendation algorithm are adopted together.

Algorithm Model ARHR HR Precision Recall
NMF N2VSCDNNR 46.53 28.48 22.72 52.89
N2VSCDNN 0.00 6.99 3.41 8.94
UBCF N2VSCDNNR 93.84 71.40 77.92 81.45
N2VSCDNN 43.73 50.59 42.00 57.75
TABLE II: The average relative improvements of performances (%) introduced by N2VSCDNN and N2VSCDNNR on Yelp (Pittsburgh), by adopting the NMF and the UBCF recommendation algorithms.
Algorithm Model ARHR HR Precision Recall
NMF N2VSCDNNR 3.96 2.49 4.33 8.99
N2VSCDNN -5.69 -4.25 -2.99 -3.03
UBCF N2VSCDNNR 68.58 24.57 40.47 15.91
N2VSCDNN 40.26 22.84 36.94 12.68
TABLE III: The average relative improvements of performances (%) introduced by N2VSCDNN and N2VSCDNNR on Yelp (Madison), by adopting the NMF and the UBCF recommendation algorithms.
Algorithm Model ARHR HR Precision Recall
NMF N2VSCDNNR 29.00 17.92 18.94 21.77
N2VSCDNN 16.57 2.19 7.48 9.42
UBCF N2VSCDNNR 93.84 71.40 77.92 81.45
N2VSCDNN 43.73 50.59 42.00 57.75
TABLE IV: The average relative improvements of performances (%) introduced by N2VSCDNN and N2VSCDNNR on Amazon, by adopting the NMF and the UBCF recommendation algorithms.
Algorithm Model ARHR HR Precision Recall
NMF N2VSCDNNR 4.16 1.22 5.60 5.07
N2VSCDNN -2.77 -0.65 -2.17 -2.36
UBCF N2VSCDNNR 56.32 27.40 54.74 78.25
N2VSCDNN -2.56 -0.55 -2.51 -2.88
TABLE V: The average relative improvements of performances (%) introduced by N2VSCDNN and N2VSCDNNR on MovieLens, by adopting the NMF and the UBCF recommendation algorithms.

Quite impressively, when we adopt our N2VSCDNNR model with the UBCF recommendation algorithm, we obtain huge improvements on every performance metric, e.g., close to 100% on the Yelp (Pittsburgh) and Amazon datasets. This indicates that our model is especially useful for enhancing the user-based collaborative filtering method. Although the improvements introduced by our N2VSCDNNR model seem relatively small when NMF is adopted, they are still larger than those introduced by the N2VSCDNN model. This is mainly because the NMF recommendation algorithm itself behaves quite well even under the Original model, so the potential for further improvement is relatively low.

Since the recommendation results obtained by N2VSCDNNR based on NMF are better than those based on the other basic recommendation algorithms, we further compare the results of N2VSCDNNR-NMF with those of the three advanced embedding and side information based recommendation algorithms, including Metapath2vec++, BiNE, and CoFactor. The results are presented in Fig. 7 and TABLE VI, where we can see that N2VSCDNNR-NMF outperforms all of the baseline methods in all cases, while by comparison, Metapath2vec++ performs the worst in most cases. This may be because Metapath2vec++ treats the explicit and implicit relations equally while ignoring their weights, which are useful to distinguish the importance of various relations.

Dataset Model ARHR HR Precision Recall
Yelp (Pittsburgh) N2VSCDNNR 0.0644 0.2008 0.0153 0.0818
Metapath2vec++ 0.0426 0.1164 0.0095 0.0326
BiNE 0.0526 0.1352 0.0115 0.0399
CoFactor 0.0602 0.1832 0.0144 0.0735
Yelp (Madison) N2VSCDNNR 0.0686 0.2166 0.0166 0.1042
Metapath2vec++ 0.0381 0.1214 0.0101 0.0341
BiNE 0.0481 0.1512 0.0128 0.0542
CoFactor 0.0657 0.2088 0.0160 0.0951
Amazon N2VSCDNNR 0.0528 0.1237 0.0093 0.0608
Metapath2vec++ 0.0223 0.0643 0.0038 0.0257
BiNE 0.0366 0.0875 0.0059 0.0389
CoFactor 0.0494 0.1141 0.0087 0.0575
MovieLens N2VSCDNNR 1.2241 0.9350 0.2933 0.2865
Metapath2vec++ 1.1761 0.8957 0.2810 0.2750
BiNE 0.7559 0.7108 0.1721 0.1236
CoFactor 1.1761 0.8957 0.2810 0.2750
TABLE VI: The average performances of N2VSCDNNR-NMF and the three baselines on the four datasets.

IV-E Time Complexity

Now, let us analyze the time complexities of N2VSCDNNR and the baselines. We regard the procedures before personalized recommendation as pre-training, and only focus on the time complexity of online personalized recommendation. In particular, since N2VSCDNNR behaves best when based on NMF, we give the complexity of N2VSCDNNR-NMF for simplicity. Suppose the number of users is $m$, the number of items is $n$, the average number of items in the selected item clusters is $\bar{n}$, and the number of user consumption records is $r$; in BiNE and Metapath2vec++, the window size is $w$ and the number of iterations is $T$. The time complexities of all the considered recommender systems are presented in TABLE VII, where we can find that our N2VSCDNNR-NMF has much lower time complexity than the other recommender systems when making online recommendations, especially when we divide the items into more clusters while recommending only a small number of them to the target user cluster. This indicates that N2VSCDNNR is more suitable to be applied in large-scale systems.

Algorithm Time complexity
N2VSCDNNR-NMF
Metapath2vec++
BiNE
CoFactor
TABLE VII: The time complexity of the considered recommender systems.

V Conclusion

In this paper, we enrich the network structure based on item categories. Then, we establish one-mode user and item projection networks, and further use node2vec technology to transform each user (or item) node into a user (or item) vector. After that, we cluster users (or items) based on these vectors using an improved spectral clustering algorithm, SCDNN, according to which we establish a bipartite cluster network. Based on this bipartite cluster network, for each user cluster, we keep the item clusters with the most frequent relationships with the user cluster. Finally, we use four different recommendation algorithms to recommend the items in these item clusters to each user in the user cluster. Comparisons with several advanced embedding and side information based recommendation algorithms on four real-world datasets validate the outstanding performance of our framework, in terms of both higher precision and recall. Moreover, we also analyze the time complexity of these recommendation algorithms, and find that our N2VSCDNNR has relatively lower time complexity than the others in online recommendation, indicating its potential to be widely applied in large-scale systems.

In the future, we are interested in utilizing more network representation methods, besides the node2vec algorithm, in recommender systems, and will also try to find the optimal parameters using optimization algorithms to obtain more comprehensive results.

References

  • [1] X. Yang, C. Liang, M. Zhao, H. Wang, H. Ding, Y. Liu, Y. Li, and J. Zhang, “Collaborative filtering-based recommendation of online social voting,” IEEE Transactions on Computational Social Systems, vol. 4, no. 1, pp. 1–13, 2017.
  • [2] Y.-Y. Lo, W. Liao, C.-S. Chang, and Y.-C. Lee, “Temporal matrix factorization for tracking concept drift in individual user preferences,” IEEE Transactions on Computational Social Systems, vol. 5, no. 1, pp. 156–168, 2018.
  • [3] C. Fu, M. Zhao, L. Fan, X. Chen, J. Chen, Z. Wu, Y. Xia, and Q. Xuan, “Link weight prediction using supervised learning methods and its application to yelp layered network,” IEEE Transactions on Knowledge and Data Engineering, vol. 30, no. 8, pp. 1507–1518, 2018.
  • [4] J. Chen, Y. Wu, X. Xu, Y. Chen, H. Zheng, and Q. Xuan, “Fast gradient attack on network embedding,” arXiv preprint arXiv:1809.02797, 2018.
  • [5] J. Chen, Z. Shi, Y. Wu, X. Xu, and H. Zheng, “Link prediction adversarial attack,” arXiv preprint arXiv:1810.01110, 2018.
  • [6] J. D. West, I. Wesley-Smith, and C. T. Bergstrom, “A recommendation system based on hierarchical clustering of an article-level citation network,” IEEE Transactions on Big Data, vol. 2, no. 2, pp. 113–123, 2016.
  • [7] L. H. Ungar and D. P. Foster, “Clustering methods for collaborative filtering,” in AAAI workshop on recommendation systems, vol. 1, 1998, pp. 114–129.
  • [8] G.-R. Xue, C. Lin, Q. Yang, W. Xi, H.-J. Zeng, Y. Yu, and Z. Chen, “Scalable collaborative filtering using cluster-based smoothing,” in Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval.   ACM, 2005, pp. 114–121.
  • [9] X. Yu, X. Ren, Y. Sun, Q. Gu, B. Sturt, U. Khandelwal, B. Norick, and J. Han, “Personalized entity recommendation: A heterogeneous information network approach,” in Proceedings of the 7th ACM international conference on Web search and data mining.   ACM, 2014, pp. 283–292.
  • [10] I. Esslimani, A. Brun, A. Boyer et al., “A collaborative filtering approach combining clustering and navigational based correlations.” in WEBIST, 2009, pp. 364–369.
  • [11] K. Joseph, C. H. Tan, and K. M. Carley, “Beyond local, categories and friends: clustering foursquare users with latent topics,” in Proceedings of the 2012 ACM conference on ubiquitous computing.   ACM, 2012, pp. 919–926.
  • [12] C. Rana and S. K. Jain, “An evolutionary clustering algorithm based on temporal features for dynamic recommender systems,” Swarm and Evolutionary Computation, vol. 14, pp. 21–30, 2014.
  • [13] M. O’Connor and J. Herlocker, “Clustering items for collaborative filtering,” in Proceedings of the ACM SIGIR workshop on recommender systems, vol. 128.   UC Berkeley, 1999.
  • [14] T. George and S. Merugu, “A scalable collaborative filtering framework based on co-clustering,” in Fifth IEEE International Conference on Data Mining (ICDM’05).   IEEE, 2005, 4 pp.
  • [15] Y. Zhang, M. Zhang, Y. Liu, S. Ma, and S. Feng, “Localized matrix factorization for recommendation based on matrix block diagonal forms,” in Proceedings of the 22nd international conference on World Wide Web.   ACM, 2013, pp. 1511–1520.
  • [16] B. Xu, J. Bu, C. Chen, and D. Cai, “An exploration of improving collaborative recommender systems via user-item subgroups,” in Proceedings of the 21st international conference on World Wide Web.   ACM, 2012, pp. 21–30.
  • [17] M. Deodhar and J. Ghosh, “A framework for simultaneous co-clustering and learning from complex data,” in Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining.   ACM, 2007, pp. 250–259.
  • [18] Q. Xuan, M. Zhou, Z.-Y. Zhang, C. Fu, Y. Xiang, Z. Wu, and V. Filkov, “Modern food foraging patterns: Geography and cuisine choices of restaurant patrons on yelp,” IEEE Transactions on Computational Social Systems, vol. 5, no. 2, pp. 508–517, 2018.
  • [19] Q. Xuan, H. Fang, C. Fu, and V. Filkov, “Temporal motifs reveal collaboration patterns in online task-oriented networks,” Physical Review E, vol. 91, no. 5, p. 052813, 2015.
  • [20] Q. Xuan, F. Du, and T.-J. Wu, “Empirical analysis of internet telephone network: From user id to phone,” Chaos: An Interdisciplinary Journal of Nonlinear Science, vol. 19, no. 2, p. 023101, 2009.
  • [21] T. Zhou, J. Ren, M. Medo, and Y.-C. Zhang, “Bipartite network projection and personal recommendation,” Physical Review E, vol. 76, no. 4, p. 046115, 2007.
  • [22] A. Grover and J. Leskovec, “node2vec: Scalable feature learning for networks,” in Proceedings of the 22nd ACM SIGKDD international conference on Knowledge discovery and data mining.   ACM, 2016, pp. 855–864.
  • [23] J. Chen, Y. Wu, L. Fan, X. Lin, H. Zheng, S. Yu, and Q. Xuan, “Improved spectral clustering collaborative filtering with node2vec technology,” in 2017 International Workshop on Complex Systems and Networks (IWCSN).   IEEE, 2017, pp. 330–334.
  • [24] B. M. Sarwar, G. Karypis, J. Konstan, and J. Riedl, “Recommender systems for large-scale e-commerce: Scalable neighborhood formation using clustering,” in Proceedings of the fifth international conference on computer and information technology, vol. 1, 2002, pp. 291–324.
  • [25] S. Puntheeranurak and H. Tsuji, “A multi-clustering hybrid recommender system,” in 7th IEEE International Conference on Computer and Information Technology (CIT 2007).   IEEE, 2007, pp. 223–228.
  • [26] K. Ji, R. Sun, X. Li, and W. Shu, “Improving matrix approximation for recommendation via a clustering-based reconstructive method,” Neurocomputing, vol. 173, pp. 912–920, 2016.
  • [27] M. Gao, L. Chen, X. He, and A. Zhou, “Bine: Bipartite network embedding.” in SIGIR, 2018, pp. 715–724.
  • [28] E. Palumbo, G. Rizzo, and R. Troncy, “Entity2rec: Learning user-item relatedness from knowledge graphs for top-n item recommendation,” in Proceedings of the Eleventh ACM Conference on Recommender Systems.   ACM, 2017, pp. 32–36.
  • [29] Y. Dong, N. V. Chawla, and A. Swami, “metapath2vec: Scalable representation learning for heterogeneous networks,” in Proceedings of the 23rd ACM SIGKDD international conference on knowledge discovery and data mining.   ACM, 2017, pp. 135–144.
  • [30] Y. Wen, L. Guo, Z. Chen, and J. Ma, “Network embedding based recommendation method in social networks,” in Companion of the The Web Conference 2018 on The Web Conference 2018.   International World Wide Web Conferences Steering Committee, 2018, pp. 11–12.
  • [31] L. Grad-Gyenge, A. Kiss, and P. Filzmoser, “Graph embedding based recommendation techniques on the knowledge graph,” in Adjunct Publication of the 25th Conference on User Modeling, Adaptation and Personalization.   ACM, 2017, pp. 354–359.
  • [32] D. Wang, G. Xu, and S. Deng, “Music recommendation via heterogeneous information graph embedding,” in 2017 International Joint Conference on Neural Networks (IJCNN).   IEEE, 2017, pp. 596–603.
  • [33] E. Palumbo, G. Rizzo, R. Troncy, E. Baralis, M. Osella, and E. Ferro, “Knowledge graph embeddings with node2vec for item recommendation,” in European Semantic Web Conference.   Springer, 2018, pp. 117–120.
  • [34] B. Perozzi, R. Al-Rfou, and S. Skiena, “Deepwalk: Online learning of social representations,” in Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining.   ACM, 2014, pp. 701–710.
  • [35] J. Tang, M. Qu, M. Wang, M. Zhang, J. Yan, and Q. Mei, “Line: Large-scale information network embedding,” in Proceedings of the 24th international conference on world wide web.   International World Wide Web Conferences Steering Committee, 2015, pp. 1067–1077.
  • [36] J. Yang and J. Leskovec, “Overlapping communities explain core–periphery organization of networks,” Proceedings of the IEEE, vol. 102, no. 12, pp. 1892–1902, 2014.
  • [37] K. Henderson, B. Gallagher, T. Eliassi-Rad, H. Tong, S. Basu, L. Akoglu, D. Koutra, C. Faloutsos, and L. Li, “Rolx: structural role extraction & mining in large graphs,” in Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining.   ACM, 2012, pp. 1231–1239.
  • [38] L. Zelnik-Manor and P. Perona, “Self-tuning spectral clustering,” in Advances in neural information processing systems, 2005, pp. 1601–1608.
  • [39] T. Xiang and S. Gong, “Spectral clustering with eigenvector selection,” Pattern Recognition, vol. 41, no. 3, pp. 1012–1029, 2008.
  • [40] C. Jinyin, L. Xiang, Z. Haibing, and B. Xintong, “A novel cluster center fast determination clustering algorithm,” Applied Soft Computing, vol. 57, pp. 539–555, 2017.
  • [41] M. Zhang and H. Yuan, “The pauta criterion and rejecting the abnormal value,” Journal of Zhengzhou University of Technology, vol. 18, no. 1, pp. 84–88, 1997.
  • [42] Z.-D. Zhao and M.-S. Shang, “User-based collaborative-filtering recommendation algorithms on hadoop,” in 2010 Third International Conference on Knowledge Discovery and Data Mining.   IEEE, 2010, pp. 478–481.
  • [43] B. M. Sarwar, G. Karypis, J. A. Konstan, J. Riedl et al., “Item-based collaborative filtering recommendation algorithms,” in WWW, vol. 1, 2001, pp. 285–295.
  • [44] C. H. Ding, T. Li, and M. I. Jordan, “Convex and semi-nonnegative matrix factorizations,” IEEE transactions on pattern analysis and machine intelligence, vol. 32, no. 1, pp. 45–55, 2010.
  • [45] P. K. Gopalan, L. Charlin, and D. Blei, “Content-based recommendations with poisson factorization,” in Advances in Neural Information Processing Systems, 2014, pp. 3176–3184.
  • [46] I. Porteous, A. U. Asuncion, and M. Welling, “Bayesian matrix factorization with side information and dirichlet process mixtures.” in AAAI, 2010.
  • [47] T. D. T. Do and L. Cao, “Coupled poisson factorization integrated with user/item metadata for modeling popular and sparse ratings in scalable recommendation,” AAAI2018, pp. 1–7, 2018.
  • [48] C. Hu, P. Rai, and L. Carin, “Non-negative matrix factorization for discrete data with hierarchical side-information,” in Artificial Intelligence and Statistics, 2016, pp. 1124–1132.
  • [49] D. Liang, J. Altosaar, L. Charlin, and D. M. Blei, “Factorization meets the item embedding: Regularizing matrix factorization with item co-occurrence,” in ACM Conference on Recommender Systems, 2016, pp. 59–66.
  • [50] O. Levy and Y. Goldberg, “Neural word embedding as implicit matrix factorization,” in Advances in neural information processing systems, 2014, pp. 2177–2185.
  • [51] T. Mikolov, I. Sutskever, K. Chen, G. S. Corrado, and J. Dean, “Distributed representations of words and phrases and their compositionality,” in Advances in neural information processing systems, 2013, pp. 3111–3119.
  • [52] A. Rodriguez and A. Laio, “Clustering by fast search and find of density peaks,” Science, vol. 344, no. 6191, pp. 1492–1496, 2014.