The Heterogeneous Information Network (HIN) (Sun et al., 2011) has been a popular framework in Recommender Systems (RSs) for its capability to model all sorts of heterogeneous side information (SI), which can improve recommendation performance (Yu et al., 2014; Shi et al., 2015; Zhao et al., 2017; Shi et al., 2018; Han et al., 2018; Hu et al., 2018; Wang et al., 2018; Fan et al., 2019). Most existing HIN-based RS methods compute similarity from the number of instances of a meta-path (a sequence of node types in a HIN) connecting users and items: the larger the number, the higher the similarity. For example, for in Figure 1(b), the number of meta-path instances connecting and in Figure 1(a) is , i.e., is the only meta-path instance of connecting and . In (Sun et al., 2011), the commuting matrix was proposed to represent the number of meta-path instances connecting two nodes; it can be obtained by multiplying a series of adjacency matrices built on pairs of node types. For example, the commuting matrix for meta-path can be obtained by . Each entry in represents the number of instances connecting the users and the items, and and are the adjacency matrices for the corresponding pairs of types, i.e., (User, User) and (User, Item), respectively. The meta-path based similarities are then used as features in different recommendation models.
In this paper, we argue that existing HIN-based RS methods have not fully exploited the information available in a HIN. Consider Figure 1(a): the similarities for the pairs and based on in Figure 1(b) are both 1, because both pairs have only one instance connecting their end nodes (respectively, and ). Technically speaking, models social recommendation, i.e., if we want to recommend items to based on ’s friends, and will be the same because they have the same similarity to . However, when examining Figure 1(a) carefully, we can see that it is better to recommend over to because trusts more than . This is because trusts not only directly by its link to but also indirectly via . In other words, , , and form a triadic closure, which indicates strong social relations (Simmel, 1908). This example shows that it is problematic for existing HIN-based RS methods to assume that nodes of the same type have the same weight when computing meta-path based similarities.
In the literature, the triangle, formed by and , can be generalized as a network motif, which is a local structure involving multiple nodes in a homogeneous graph, e.g., a social graph. For example, in Figure 2, we show seven typical 3-node motifs. Proposed in (Milo et al., 2002), the motif has been demonstrated to be a very important local structure underlying various complex networks. Motifs are also called higher-order relations in the literature (Benson et al., 2016a). In contrast, we call the connections that directly link nodes of the same type edge-based first-order relations. In (Benson et al., 2016a; Zhao et al., 2018a), motifs have been shown to be very important in obtaining the similarities among nodes of a graph. However, no previous work has explored the influence of motifs in HIN. In this paper, we first propose the Motif Enhanced Meta-Path (MEMP) to incorporate motif-based higher-order relations into conventional meta-path based similarity computation, and then design the motif HIN-based RS (MoHINRec) for recommendation. We then conduct experiments on two real-world datasets, Epinions and CiaoDVD, to demonstrate the effectiveness of motifs in HIN-based RSs. The code of the proposed MoHINRec is available at https://github.com/HKUST-KnowComp/MoHINRec.
In this section, we present the details of our proposed framework for integrating Motif into HIN-based Recommender (MoHINRec).
2.1. Meta-path based similarity computation
In previous HIN-based scenarios, meta-paths are used to capture the complex semantics underlying the similarities between nodes of any type. In this part, we give a brief introduction to counting-based meta-path similarity. Given a meta-path, we want to compute the similarities between the source and the target nodes, i.e., users and items (Business) in Figure 1(b). The commuting matrix (Sun et al., 2011) has been used to compute the counting-based similarity matrix of a meta-path. Suppose we have a meta-path , where ’s are node types in , which represents the entity type set in a HIN. We can define a matrix as the adjacency matrix between node type and node type . Then the commuting matrix for meta-path is , which represents the number of instances of connecting two nodes of type and . For example, in Figure 1(b), the commuting matrix for can be obtained by , where is the adjacency matrix between type and type , and is the adjacency matrix for type , i.e., recording the relations among all instance nodes of type . This shows that counting-based similarities for a meta-path can be computed by multiplying a sequence of adjacency matrices. In this paper, we adopt the counting-based similarity for users and items given a meta-path. In practice, this procedure can be implemented efficiently when the adjacency matrices are sparse.
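The chained multiplication described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the dict-of-dicts sparse format, the function names `sparse_matmul` and `commuting_matrix`, and the toy (User, User) and (User, Item) adjacency matrices are all assumptions made for the example.

```python
from collections import defaultdict

def sparse_matmul(left, right):
    """Multiply two sparse matrices stored as {row: {col: value}} dicts."""
    product = defaultdict(dict)
    for i, row in left.items():
        for k, left_val in row.items():
            # Only the non-zero entries in row k of `right` contribute.
            for j, right_val in right.get(k, {}).items():
                product[i][j] = product[i].get(j, 0) + left_val * right_val
    return dict(product)

def commuting_matrix(adjacency_chain):
    """Multiply the adjacency matrices along a meta-path, left to right."""
    result = adjacency_chain[0]
    for matrix in adjacency_chain[1:]:
        result = sparse_matmul(result, matrix)
    return result

# Hypothetical toy HIN: u1 trusts u2 and u3; u2 rated b1; u3 rated b1 and b2.
user_user = {"u1": {"u2": 1, "u3": 1}}
user_item = {"u2": {"b1": 1}, "u3": {"b1": 1, "b2": 1}}

# Number of User-User-Item path instances from each user to each item.
counts = commuting_matrix([user_user, user_item])
print(counts)  # {'u1': {'b1': 2, 'b2': 1}}
```

Because only non-zero entries are visited, the cost depends on the number of stored edges rather than on the full matrix dimensions, which is why sparsity makes the procedure cheap.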
2.2. MoHINRec Framework
In this part, we elaborate on the MoHINRec framework, which consists of the motif-based adjacency matrix, the MEMP-based similarity computation, and the recommendation model with FMG (Zhao et al., 2017).
First, we give the definition of the motif-based adjacency matrix in the context of HIN. (Footnote 1: For the formal definition of motif, we refer readers to (Benson et al., 2016b).) Given a motif , the definition of motif-based adjacency matrix for node type in a HIN is:
where , and belong to type , and is the truth-value indicator function, i.e., it equals 1 if the statement is true and 0 otherwise. Note that the weight is added to only if node and occur in the given motif . Note that for the 3-node motifs in Figure 2, there are six cases where two nodes occur in a 3-node motif because the graphs are directed. Therefore, computing the motif-based adjacency matrix requires subgraph counting. Fortunately, for the seven motifs, there are simple formulas to compute the corresponding motif-based adjacency matrices. Here, we omit the details for brevity, and refer the readers to (Benson et al., 2016b; Zhao et al., 2018a) for the formulas.
We now illustrate the motif-based adjacency matrix with the example in Figure 3. In Figure 3(a), 5 nodes belong to type , and the relations among them are recorded by . Thus, an example graph of node type can be drawn accordingly in Figure 3(a). Assume that we want to compute the motif-based adjacency matrix given the motif in Figure 2. The result is shown in Figure 3(b). Details of the computation are given in (Benson et al., 2016b; Zhao et al., 2018a). From the matrix, we can see that , because and occur in two instances of , i.e., the triangles formed by and . This example explains the meaning of the motif-based adjacency matrix: it records how often two nodes co-occur in instances of a given motif.
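To make the co-occurrence counting concrete, the sketch below builds a motif-based adjacency matrix by brute-force enumeration. It rests on several simplifying assumptions that are not from the paper: the motif is an undirected triangle rather than one of the directed motifs in Figure 2, the 5-node graph is a made-up example rather than the graph of Figure 3, and enumeration stands in for the closed-form formulas of (Benson et al., 2016b; Zhao et al., 2018a).

```python
from itertools import combinations

def triangle_motif_adjacency(nodes, edges):
    """Count, for every node pair, the triangles in which both nodes occur."""
    neighbors = {v: set() for v in nodes}
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    index = {v: i for i, v in enumerate(nodes)}
    motif_adj = [[0] * len(nodes) for _ in nodes]
    for a, b, c in combinations(nodes, 3):
        # A triple is a triangle instance only if all three edges exist.
        if b in neighbors[a] and c in neighbors[a] and c in neighbors[b]:
            for x, y in ((a, b), (a, c), (b, c)):
                motif_adj[index[x]][index[y]] += 1
                motif_adj[index[y]][index[x]] += 1
    return motif_adj

# Hypothetical undirected graph with two triangles: {1, 2, 3} and {2, 3, 4}.
nodes = [1, 2, 3, 4, 5]
edges = [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4), (4, 5)]

motif_matrix = triangle_motif_adjacency(nodes, edges)
for row in motif_matrix:
    print(row)
```

In this toy graph, nodes 2 and 3 share two triangles, so their entry is 2, while nodes 4 and 5 are linked by an edge but share no triangle, so their entry is 0. Enumerating all triples is cubic in the number of nodes and is only meant for illustration.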
Consider a meta-path , for which the commuting matrix can be obtained by . Assuming and are of the same type, we can construct the motif-based adjacency matrix for node type , denoted as , given a motif . Then, as in (Zhao et al., 2018a), we propose to use a linear combination to fuse the edge-based and motif-based adjacency matrices for node type . Specifically, we generate the MEMP-based adjacency matrix as follows.
where balances the combination of edge-based and motif-based adjacency matrices. When , it means we only use the type adjacency matrix, i.e., the edge-based adjacency matrix for node type , and when , it means we use the motif-based adjacency matrix alone. Then, in the computation of commuting matrix , we replace with . In this way, we incorporate motif into meta-path based similarity computation. For each meta-path, , we can obtain a new commuting matrix , recording the number of MEMP instances connecting users and items. In this work, we use the frequency, i.e., , to denote the similarity between user and item under the MEMP .
After we obtain the MEMP-based similarity matrices, we adopt a state-of-the-art HIN-based RS method (Zhao et al., 2017), which first factorizes each similarity matrix separately to obtain a group of user and item latent features from each matrix, and then feeds the features to a factorization machine (FM) (Rendle, 2012) to generate recommendations.
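The effect of the fusion step can be sketched on a small example. The code below assumes a combination factor named `alpha` in [0, 1], with `alpha = 0` giving the edge-based matrix alone and `alpha = 1` the motif-based matrix alone; this parameterization, the helper names `fuse` and `matmul`, and the toy matrices are assumptions for illustration, not the paper's notation or data.

```python
def fuse(edge_adj, motif_adj, alpha):
    """Linear combination (1 - alpha) * edge_adj + alpha * motif_adj."""
    return [
        [(1 - alpha) * e + alpha * m for e, m in zip(edge_row, motif_row)]
        for edge_row, motif_row in zip(edge_adj, motif_adj)
    ]

def matmul(a, b):
    """Dense matrix product of two list-of-lists matrices."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]

# Hypothetical users u1..u4: u1, u2, u3 form a triangle; u1 is also linked to u4.
edge_adj = [
    [0, 1, 1, 1],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
]
# Triangle co-occurrence counts for the same graph (u4 is in no triangle).
motif_adj = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
]
# Hypothetical ratings: u2 rated item b1, u4 rated item b2.
user_item = [
    [0, 0],
    [1, 0],
    [0, 0],
    [0, 1],
]

edge_only = matmul(fuse(edge_adj, motif_adj, 0), user_item)
fused = matmul(fuse(edge_adj, motif_adj, 0.5), user_item)
print(edge_only[0])  # [1, 1]
print(fused[0])      # [1.0, 0.5]
```

With the edge-based matrix alone, u1 is equally similar to b1 and b2. Once the motif-based matrix is mixed in, the item reached through the triangle neighbor u2 scores higher than the item reached through u4, which mirrors the motivating social-trust example.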
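As a rough illustration of the last stage, the sketch below scores one (user, item) pair with the standard second-order factorization machine equation, using the concatenated latent features from two similarity matrices as input. The feature values, the parameters `w0`, `w`, and `v`, and the function name `fm_predict` are hypothetical; the matrix factorization step and parameter learning are not shown.

```python
def fm_predict(x, w0, w, v):
    """Second-order FM score: bias + linear term + factorized pairwise term."""
    linear = sum(wi * xi for wi, xi in zip(w, x))
    pairwise = 0.0
    for f in range(len(v[0])):
        weighted = [v[i][f] * x[i] for i in range(len(x))]
        total = sum(weighted)
        # Equals the sum over i < j of v[i][f] * v[j][f] * x[i] * x[j].
        pairwise += 0.5 * (total * total - sum(t * t for t in weighted))
    return w0 + linear + pairwise

# Hypothetical latent features of one user and one item from two similarity matrices.
user_feats_path1, item_feats_path1 = [0.2, 0.1], [0.4, 0.3]
user_feats_path2, item_feats_path2 = [0.5, 0.0], [0.1, 0.6]
x = user_feats_path1 + item_feats_path1 + user_feats_path2 + item_feats_path2

# Hypothetical FM parameters: global bias, linear weights, rank-2 factor vectors.
w0 = 3.0
w = [0.1] * len(x)
v = [[0.1 * (i + 1), -0.05 * (i + 1)] for i in range(len(x))]

prediction = fm_predict(x, w0, w, v)
print(prediction)
```

The pairwise term uses the usual reformulation that avoids an explicit double loop over feature pairs, so the cost is linear in the number of features for a fixed factor rank.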
3. Experiment and Analysis
In this section, we present the experimental results.
3.1. Experimental Settings
Datasets. We conduct experiments on two real-world datasets: Epinions and CiaoDVD, which are review websites where users can write reviews on products and rate the reviews of other users. Moreover, users can add other users as trustworthy users if they like their reviews. The Epinions dataset is provided by (Tang et al., 2012), and CiaoDVD is provided by (Guo et al., 2014). The statistics of the two datasets are shown in Table 1. Note that although the datasets include other SIs such as the categories and reviews of items, the only relation that exists between nodes of the same type is the social relation. Thus, in our experiments, we use meta-paths and in Figure 1(b) to demonstrate the effectiveness of MEMP and omit the statistics of the other SIs in Table 1.
We focus on the rating prediction task, which is widely used to evaluate collaborative filtering (CF) based RSs. We use two evaluation metrics, Mean Absolute Error (MAE) and Root Mean Square Error (RMSE), to evaluate our framework.
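For reference, the two metrics can be computed as follows. This is a minimal sketch using the standard definitions of MAE and RMSE; the ratings and predictions are made-up numbers, not results from the paper.

```python
import math

def mae(y_true, y_pred):
    """Mean Absolute Error: average of |observed - predicted|."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root Mean Square Error: square root of the mean squared error."""
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))

# Hypothetical observed ratings and model predictions.
observed = [5, 3, 4, 1]
predicted = [4, 3, 5, 3]

print(mae(observed, predicted))   # 1.0
print(rmse(observed, predicted))  # sqrt(1.5), about 1.22
```

RMSE penalizes large errors more heavily than MAE, which is why the single error of 2 pushes RMSE above MAE in this example. Lower is better for both.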
Baselines. We compare our proposed models with the following RS methods: RegSVD, SoReg, SocialMF, and FMG (Zhao et al., 2017).
Note that there are other HIN-based RS methods for rating prediction (Yu et al., 2014; Shi et al., 2015), but FMG has been shown to be consistently superior (Zhao et al., 2017) to these methods. Therefore, FMG is used in the experiments as the state-of-the-art baseline representing HIN-based RSs. Our MoHINRec method is based on FMG with different MEMP-based similarities. For a given motif in Figure 2, we denote the method as MoHINRec(). In this paper, we mainly focus on 3-node motifs, i.e., the seven motifs shown in Figure 2.
Settings. We randomly split each dataset into training, validation and test data with a ratio of 8:1:1. The training data is used to fit the model, the validation data is used to choose the best parameters, and the test data is used to compute the prediction errors of the models. For RegSVD, SoReg, and SocialMF, the rank is set to be , and for FMG, we adopt the same settings as (Zhao et al., 2017), i.e., . The regularization parameters and combination factor in Eq. (2) are tuned on the validation data. Note that for simplicity, we set . We repeat each experiment five times by randomly splitting the datasets and report the average results.
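The splitting protocol can be sketched as below. The function name `split_ratings`, the use of integer seeds 0 to 4, and the synthetic rating triples are assumptions for illustration; the paper does not specify how its random splits were generated.

```python
import random

def split_ratings(ratings, seed, parts=(8, 1, 1)):
    """Shuffle the ratings and cut them into train/validation/test by ratio."""
    shuffled = list(ratings)
    random.Random(seed).shuffle(shuffled)
    total = sum(parts)
    # Integer arithmetic avoids floating-point rounding in the cut points.
    n_train = len(shuffled) * parts[0] // total
    n_valid = len(shuffled) * parts[1] // total
    train = shuffled[:n_train]
    valid = shuffled[n_train:n_train + n_valid]
    test = shuffled[n_train + n_valid:]
    return train, valid, test

# 100 hypothetical (user, item, rating) triples.
ratings = [(u, i, 1 + (u + i) % 5) for u in range(10) for i in range(10)]

# Five independent random splits; the reported error would be the average
# of the test errors obtained on these splits.
splits = [split_ratings(ratings, seed) for seed in range(5)]
for train, valid, test in splits:
    print(len(train), len(valid), len(test))  # 80 10 10
```

Fixing the seed per repetition keeps each split reproducible while still varying the partition across the five runs.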
3.2. Performance Comparison
Table 2 shows the performance of all methods in terms of RMSE and MAE. We can see that MoHINRec with any of the seven motifs outperforms all baselines consistently on both datasets. This demonstrates the effectiveness of MEMP for HIN-based recommendation. Besides, the best performance is achieved when is in . For example, on Epinions, the lowest RMSEs are obtained when for , for and for . On CiaoDVD, the lowest RMSEs are obtained when for , for and for . This aligns with our assumption that motif-based higher-order relations and edge-based first-order relations among nodes of the same type are complementary to each other in meta-path based similarity computation. The same observation has been reported in (Zhao et al., 2018a) for the user-ranking task in social networks. We highlight two further observations from Table 2 below.
First, the performance gain of MoHINRec varies across different datasets and different motifs. On Epinions, the best RMSE and MAE are achieved by , while on CiaoDVD, the best performance is obtained by for RMSE and for MAE. The performance gains of the other methods vary. This means that, despite the usefulness of MEMP for HIN-based RS methods, the performance gain depends on the motif and the dataset.
Second, FMG, the state-of-the-art HIN-based RS method, outperforms all other MF-based methods, i.e., RegSVD, SoReg, and SocialMF. This demonstrates the power of the “MF+FM” framework. However, MoHINRec clearly beats FMG. On Epinions, MoHINRec with decreases RMSE from to , and on CiaoDVD, MoHINRec with decreases RMSE from to . This means that by incorporating motif into existing meta-path based similarity computation, we can further improve the recommendation performance.
From the experimental results, we can see that motif-based relations can benefit meta-path based recommendation with the FMG model.
4. Related Work
In this section, we review related work on HIN-based RSs and motif in homogeneous graphs.
4.1. Recommendation in HIN
To make better use of rich side information in RSs, HIN has been proposed to represent disparate heterogeneous information in a single graph. Based on meta-path, several approaches have been proposed to exploit HIN for the recommendation task (Yu et al., 2014; Shi et al., 2015; Zhao et al., 2017; Shi et al., 2018; Han et al., 2018; Hu et al., 2018; Wang et al., 2018; Zhao et al., 2018b; Fan et al., 2019). However, all of these HIN-based RS methods ignore the motif-based higher-order relations among nodes of the same type.
4.2. Motif in homogeneous graphs
Motif can be used to characterize higher-order relations in homogeneous graphs. Network motif was first introduced in (Milo et al., 2002). It has been shown to be useful in many applications such as social networks (Rotabi et al., 2017), and temporal networks (Paranjape et al., 2017). Recently, it was shown that motif can also be used for graph clustering or community detection (Benson et al., 2016a; Yin et al., 2017), user analysis (Zhang et al., 2017) and ranking (Zhao et al., 2018a) in social networks. Compared to these previous studies, which are all in homogeneous graphs, we are the first to incorporate the motif-based higher-order relations into heterogeneous graphs.
5. Conclusion and Future work
In this paper, we explore motif-based higher-order relations, which have proved useful in homogeneous graphs of various domains, in HIN-based RSs. We propose the motif-enhanced meta-path (MEMP) for computing the similarities between users and items in HIN, and experimental results on two real-world datasets, Epinions and CiaoDVD, demonstrate that the proposed MoHINRec built on MEMP-based similarities is superior to existing HIN-based RS methods. For future work, we will explore motif-based relations among nodes of different types. This may lead to novel structures that can generalize meta-path and meta-graph in HIN.
Dik Lun Lee and Huan Zhao are supported by the Research Grants Council HKSAR GRF (No. 16215019). Yangqiu Song and Yingqi Zhou are supported by the Early Career Scheme (ECS, No. 26206717) from Research Grants Council in Hong Kong. We also thank the anonymous reviewers for their valuable comments and suggestions that helped improve the quality of this manuscript.
References
- Higher-order organization of complex networks. Science 353 (6295), pp. 163–166. Cited by: §1, §4.2.
- Supplementary materials for higher-order organization of complex networks. Science. Cited by: §2.2, §2.2, footnote 1.
- Metapath-guided heterogeneous graph neural network for intent recommendation. In KDD, pp. 2478–2486. Cited by: §1, §4.1.
- ETAF: an extended trust antecedents framework for trust prediction. In ASONAM, pp. 540–547. Cited by: §3.1.
- Cited by: 1st item, 2nd item, 3rd item.
- Aspect-level deep collaborative filtering via heterogeneous information networks. In IJCAI, pp. 3393–3399. Cited by: §1, §4.1.
- Leveraging meta-path based context for top-N recommendation with a neural co-attention model. In KDD, pp. 1531–1540. Cited by: §1, §4.1.
- A matrix factorization technique with trust propagation for recommendation in social networks. In RecSys, pp. 135–142. Cited by: 3rd item.
- Recommender systems with social regularization. In WSDM, pp. 287–296. Cited by: 2nd item.
- Network motifs: Simple building blocks of complex networks. Science 298 (5594), pp. 824–827. Cited by: §1, §4.2.
- Motifs in temporal networks. In WSDM, pp. 601–610. Cited by: §4.2.
- Cited by: 1st item.
- Factorization machines with libFM. ACM Transactions on Intelligent Systems and Technology (TIST) 3 (3), pp. 57:1–57:22. Cited by: §2.2.
- Cited by: §4.2.
- Heterogeneous information network embedding for recommendation. IEEE Transactions on Knowledge and Data Engineering (TKDE). Cited by: §1, §4.1.
- Semantic path based personalized recommendation on weighted heterogeneous information networks. In CIKM, pp. 453–462. Cited by: §1, §3.1, §4.1.
- Sociology: investigations on the forms of sociation. Duncker & Humblot, Berlin Germany. Cited by: §1.
- PathSim: Meta path-based top-k similarity search in heterogeneous information networks. VLDB 4 (11), pp. 992–1003. Cited by: §1, §2.1.
- ETrust: Understanding trust evolution in an online world. In KDD, pp. 253–261. Cited by: §3.1.
- Billion-scale commodity embedding for e-commerce recommendation in alibaba. In KDD, pp. 839–848. Cited by: §1, §4.1.
- Local higher-order graph clustering. In KDD, pp. 555–564. Cited by: §4.2.
- Personalized entity recommendation: a heterogeneous information network approach. In WSDM, pp. 283–292. Cited by: §1, §3.1, §4.1.
- StructInf: Mining structural influence from social streams. In AAAI, pp. 73–80. Cited by: §4.2.
- Ranking users in social networks with higher-order structures. In AAAI. Cited by: §1, §2.2, §2.2, §2.2, §3.2, §4.2.
- Meta-graph based recommendation fusion over heterogeneous information networks. In KDD, pp. 635–644. Cited by: §1, §2.2, §2.2, 4th item, §3.1, §3.1, §4.1.
- Cited by: §4.1.