1 Introduction
There is a surge of research interest in constructing recommender systems based on observed user-item interactions. Collaborative filtering (CF) (Bennett et al., 2007) is a popular recommendation technique that has achieved state-of-the-art performance and is typically based solely on users' feedback on items, including explicit feedback such as ratings, or implicit feedback such as quantized play counts (Sarwar et al., 2001; Hu et al., 2008). The feedback of users on items can often be represented as a matrix of ordinal variables, which are categorical data exhibiting a natural ordering between categories. Such data are known to be sparse, bursty, and overdispersed, making their direct use in recommender systems challenging.
Matrix Factorization (MF) (Koren et al., 2009) is a popular family of CF algorithms, which approximates the observations with a low-rank structure in which two factor matrices describe user preferences and item attributes, respectively. Each user or item can thus be represented as a low-dimensional latent vector, and each user-item interaction can be measured by the corresponding inner product. Among MF-based methods, Poisson factorization (PF) (Gopalan et al., 2015) is well suited for count data: it replaces the usual Gaussian assumption with a Poisson one, and has become popular for handling implicit feedback, achieving state-of-the-art results. PF is also often applied to a binarized version of the user-item interaction matrix, which records only whether a user has interacted with an item or not. However, the binarization stage induces a loss of information for PF, since the value associated with each interaction is removed. Although there have been several attempts in the literature to directly model raw data with PF
(Gopalan et al., 2015), or to introduce the Bernoulli-Poisson link for binarized data, referred to as Bernoulli-Poisson factorization (BePoF) (Acharya et al., 2015a), these PF-based methods still fail to fully describe the ordinal nature of data within a limited range. Rather than treating the ordinal feedback as binary or count variables, several recent works have tried to directly model the raw ordinal data. Discrete compound Poisson factorization (dcPF) (Gouvert et al., 2020b) adds a latent variable that obeys an exponential dispersion model (EDM) to its generative process. Agresti (2003) utilizes Cumulative Link Models (CLMs) to build a bridge between ordinal data and MF models, leading to the OrdMF models (Gouvert et al., 2020a). The success of OrdMF motivates us to construct a fully Bayesian model to directly handle raw ordinal data.
From another perspective, moving beyond traditional recommender systems where the users are treated as independent and identically distributed (i.i.d.), many researchers have recently started to analyze social recommender systems (Tang et al., 2013; Jiang et al., 2014), motivated by the prevalence of online social networks. Such social recommendation approaches are based on social influence theory, which states that connected people influence each other, leading to shared interests due to social interactions (Anagnostopoulos et al., 2008; Lewis et al., 2012). For instance, social regularization (or graph regularization) has been empirically proven effective for social recommendation, by assuming that connected users share similar latent embeddings (Cai et al., 2010; Chang and Blei, 2009; Acharya et al., 2015b).
In an attempt to keep as much ordinal-data information as possible while also considering social networks, we first develop a novel probabilistic generative model for jointly modeling both ordinal user-item interactions and user-user relations, named ordinal graph factor analysis (OGFA). Further, we extend OGFA in a hierarchical fashion to discover both the underlying user preferences and social communities (or groups) at different semantic levels. The contributions of this paper are as follows: We propose OGFA to jointly model both the raw ordinal user-item interactions and the user-user social network by sharing their latent representations; We extend OGFA to a hierarchical version, named Ordinal Graph Gamma Belief Network (OGGBN), to provide multilayer user latent representations, revealing both their preferences and relationships at different semantic levels; We integrate high-order social information into the deep structure of OGGBN to model recursive dynamic social diffusion, a common problem in social recommendation; For efficient inference, we develop a parallel hybrid Gibbs-EM algorithm for our models, which makes full use of the sparsity of the observed matrices and is scalable to large datasets.
2 Related Work
Recommendation with Raw Ordinal Data: Neither PFA nor BerPo-PFA makes correct assumptions when modeling ordinal data. One may better handle overdispersion by factorizing the rating matrix under the negative binomial (NB) likelihood as in NBFA (Zhou and others, 2018), but this choice still ignores the fact that ordinal data take values from a limited range. Many efforts have been devoted to developing generative processes for the raw ordinal data to achieve better representation and recommendation performance. Discrete compound Poisson factorization (dcPF) (Gouvert et al., 2020b) introduces a discrete exponential dispersion model (EDM) as a mapping function, which links the latent count variable to the discrete observation. Inspired by the same ideas, Ordinal NMF (OrdNMF) (Gouvert et al., 2020a) adapts Cumulative Link Models (CLMs), which have been applied to ordinal regression (Agresti, 2003); the underlying idea is to introduce a step function.
Social Recommendation Systems: Many researchers have recently focused on social recommender systems (Tang et al., 2013; Jiang et al., 2014; Fan et al., 2019), which have emerged as a promising direction for analyzing user preferences by incorporating social information (Guo et al., 2012). For instance, considering the influence of trusted users (including both trustees and trusters) on rating prediction, Guo et al. (2015) develop TrustSVD based on SVD++ (Koren, 2008) and ensure that user-specific vectors can be learned from trust information even if few or no ratings are given. Relational topic models (RTMs) treat each document as a user and construct a probabilistic model to jointly model the observed document-word (user-item) matrix and document-document (user-user) interactions (Chang and Blei, 2009; Rosen-Zvi et al., 2012; Acharya et al., 2015b). We argue that 1) RTMs simply utilize the Poisson likelihood to model the rating matrix, which has limited ability to represent ordinal data (Gouvert et al., 2020b), and 2) the above social models only employ first-order social information, ignoring social diffusion when making recommendations.
3 Preliminary
To comprehensively understand the importance of modeling raw ordinal data, we introduce the following background.
Poisson Factor Analysis: PFA (Gopalan et al., 2015) is a typical topic model and serves as a building block for our developed models. A common preprocessing operation for applying PFA (or its more sophisticated variants) to recommendation is to binarize the observed user-item interaction matrix into a binary matrix B ∈ {0,1}^{U×I}, where U and I indicate the numbers of users and items, respectively. The binarized user-item interaction matrix can then be factorized into a summation of K equal-size latent matrices under the Poisson likelihood, formulated as

(1) B ∼ Pois(ΘΦ^T),

where Φ ∈ R_+^{I×K} and Θ ∈ R_+^{U×K} denote the factor loading matrix and the factor scores, respectively, with K the number of latent communities. More specifically, each row of Θ, denoted as θ_u, contains the relative community intensities specific to user u; each column of Φ, denoted as φ_k, encodes the relative importance of each item in community k. Thus each binarized observation b_{ui} can be modeled with a Poisson distribution parameterized by the inner product of the corresponding user preferences and item attributes as

(2) b_{ui} ∼ Pois(Σ_{k=1}^{K} θ_{uk} φ_{ik}),
where the sparsity of the gamma-distributed θ_u indicates that each user is only interested in a few communities, contributing to exploring both the underlying user preferences and social communities.

Bernoulli-Poisson Link: Instead of directly factorizing the binarized matrix under the Poisson likelihood, a better solution is to link the binary interactions to latent count values with the Bernoulli-Poisson (BerPo) link (Zhou, 2015) and then factorize the latent count matrix. Below we consider modeling user-item interactions after binarization with BerPo-PFA, where b_{ui} equals one if and only if user u and item i are linked. Thus each interaction can be assumed to be derived as

b_{ui} = 1(m_{ui} ≥ 1), m_{ui} ∼ Pois(λ_{ui}),

where m_{ui} denotes the augmented latent count variable, 1(·) an indicator function, and λ_{ui} = θ_u^T φ_i the inner product of the corresponding latent representations of user u and item i. Integrating out m_{ui} leads to the following generative process: b_{ui} ∼ Bern(1 − e^{−λ_{ui}}), where Bern refers to the Bernoulli distribution. The conditional posterior of m_{ui} can be expressed as

(3) (m_{ui} | b_{ui}, λ_{ui}) ∼ b_{ui} · ZTP(λ_{ui}) + (1 − b_{ui}) · δ_0,

where ZTP refers to the zero-truncated Poisson distribution and δ_0 to the Dirac distribution located at 0. Moving beyond user-item interactions, we note that BerPo-PFA can also be directly applied to model the binarized user-user relations.
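As a concrete illustration, the BerPo link and the conditional posterior in Eq. (3) can be sketched in a few lines of Python. This is a minimal sketch on our part: the rejection-based zero-truncated sampler and all function names are our own, not the paper's implementation.

```python
import math
import random

def sample_poisson(lam, rng):
    """Knuth's method for m ~ Pois(lam); adequate for the moderate rates used here."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while p > threshold:
        k += 1
        p *= rng.random()
    return k - 1

def sample_berpo(lam, rng):
    """BerPo link: b = 1(m >= 1) with m ~ Pois(lam), so b ~ Bern(1 - exp(-lam))."""
    return int(sample_poisson(lam, rng) >= 1)

def sample_m_given_b(b, lam, rng):
    """Eq. (3): m | b, lam ~ b * ZTP(lam) + (1 - b) * delta_0."""
    if b == 0:
        return 0          # Dirac mass at zero
    m = 0
    while m == 0:         # zero-truncated Poisson via rejection
        m = sample_poisson(lam, rng)
    return m
```

Note that when b = 0 the latent count collapses to zero deterministically, which is exactly what makes the link attractive for sparse matrices: only nonzero entries require any sampling work.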
Cumulative Link Model for Ordinal Data: Moving beyond the binarization stage, which discards the value information associated with each user-item interaction, the Cumulative Link Model (CLM) was developed to bridge the gap between raw ordinal data and factorization models (Agresti, 2003), playing the role of a threshold model for ordinal regression without loss of generality. More specifically, for each observed ordinal user-item interaction y_{ui} ∈ {0, 1, …, V}, CLM introduces a continuous latent variable x_{ui}, which is mapped to y_{ui} through a number of contiguous intervals with boundaries

(4) 0 = b_0 < b_1 < ⋯ < b_V < b_{V+1} = ∞.

Thus a latent score falling in the v-th interval is assigned the corresponding rating, y_{ui} = v ⟺ x_{ui} ∈ [b_v, b_{v+1}). The projection between x_{ui} and y_{ui} can therefore be defined with a quantization function as y_{ui} = Q(x_{ui}). To ensure the non-negativity of x_{ui} and the ability to model overdispersion, Ordinal NMF (Gouvert et al., 2020a) introduces a nonnegative multiplicative random noise ε, with c.d.f. denoted as G, on the latent variable as x_{ui} = λ_{ui} ε. Thus the c.d.f. of the ordinal random variable y_{ui} can be formulated as

(5) P(y_{ui} ≤ v) = P(x_{ui} < b_{v+1}) = G(b_{v+1} / λ_{ui}),

where the p.m.f. can be calculated with

(6) P(y_{ui} = v) = G(b_{v+1} / λ_{ui}) − G(b_v / λ_{ui}).
Notably, various functions can be used to determine the exact nature of the multiplicative noise.
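The quantization step of the CLM is simple enough to state in code. Below is a minimal sketch, assuming the interior thresholds are stored as a sorted list (the boundary encoding and the function name are our own illustration):

```python
import bisect

def quantize(x, boundaries):
    """Map a nonnegative latent score x to an ordinal value y in {0, ..., V}.
    `boundaries` holds the interior thresholds [b_1, ..., b_V] of Eq. (4);
    b_0 = 0 and b_{V+1} = inf are implicit, so y = v iff b_v <= x < b_{v+1}."""
    return bisect.bisect_right(boundaries, x)
```

For example, with boundaries [1.0, 2.5, 4.0], a latent score of 3.1 falls in [2.5, 4.0) and is quantized to the ordinal value 2.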
4 Ordinal Graph Gamma Belief Network
Focusing on constructing a probabilistic generative model that can not only directly handle raw ordinal data, but also jointly model both user-item and user-user interactions, we propose OGFA and further extend it to a hierarchical version, discovering representative user preferences and user communities at different semantic levels.
Ordinal Graph Factor Analysis: To be consistent with the definitions in the preliminary, we denote the raw ordinal user-item matrix by Y = (y_{ui}). The user-user social network can then be represented as a set of users (nodes) and their relations (edges). In this paper, we focus on analyzing the undirected case, and represent the undirected graph as a symmetric binary adjacency matrix, which can be linked to a latent count matrix with the BerPo link described in the preliminary. As illustrated in the left part of Fig. 1, the generative model of the proposed OGFA for jointly modeling the user-item interactions and the user-user social network can be formulated as:
(7) 
where Q is the quantization function defined in Eq. (4); the latent count for each user pair represents how often two users interact due to their affiliations with a given community; and the per-community weight, which measures the importance of that community in explaining user-user interactions, also contributes to balancing the scales of the two data sources. Notably, the latent communities in OGFA are treated independently, and our model can be easily extended to consider both intra- and inter-community interactions by modifying the corresponding part of the generative process.
In our view, treating each community independently is more suitable for modeling an assortative relational network exhibiting homophily (e.g., a co-author network); a similar conclusion can be found in (Zhou, 2015). Thus, in what follows, we focus on the former, simpler model and omit inter-community interactions.
The main purpose of OGFA is to jointly infer the user preferences and item attributes as well as the threshold sequence in Eq. (4). We focus on the special case where ε is a multiplicative inverse-gamma (IG) noise with parameter α, whose c.d.f. takes the form G(t) = e^{−α/t}. The c.d.f. in Eq. (5) associated with the ordinal data can then be obtained as

(8) P(y_{ui} ≤ v) = exp(−λ_{ui} τ_{v+1}),

where τ_v = α / b_v. The sequence (τ_v) corresponds to the inverse of the thresholds and is therefore decreasing, i.e., τ_1 > τ_2 > ⋯ > τ_{V+1} = 0. Moreover, defining Δ_v = τ_v − τ_{v+1} for v ≥ 1, we have Δ_v ≥ 0, and we specifically define τ_0 = ∞. The event y_{ui} ≥ 1 then satisfies a Bernoulli distribution, 1(y_{ui} ≥ 1) ∼ Bern(1 − e^{−λ_{ui} τ_1}). It is interesting to notice that the BerPo-based relational models (Zhou, 2015; Acharya et al., 2015a) can be regarded as special cases of OGFA with V = 1 and τ_1 = 1.

Combining Eqs. (6) and (8), the p.m.f. of the observed y_{ui} can be obtained as

P(y_{ui} = v) = exp(−λ_{ui} τ_{v+1}) (1 − exp(−λ_{ui} Δ_v)),

where the log-likelihood of y_{ui} = v can be formulated as

(9) log P(y_{ui} = v) = −λ_{ui} τ_{v+1} + log(1 − exp(−λ_{ui} Δ_v)).

Notably, the log-likelihood in Eq. (9) not only brings up a linear term in λ_{ui}, but also a nonlinear term of the form log(1 − e^{−λ_{ui} Δ_v}), where the latter is similar to that of the Bernoulli-Poisson link.
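As a sanity check, the ordinal p.m.f. obtained by differencing the c.d.f. can be evaluated numerically. This is a minimal sketch assuming the exponential-form c.d.f. exp(−λτ) discussed above, with the decreasing inverse-threshold sequence stored as a plain list (the encoding and function name are our own):

```python
import math

def ordinal_pmf(lam, tau):
    """P(y = v) for v = 0..V by differencing the ordinal c.d.f.
    P(y <= v) = exp(-lam * tau_{v+1}), where tau = [tau_1, ..., tau_V] is the
    decreasing sequence of inverse thresholds; tau_0 = inf and tau_{V+1} = 0
    are implicit. Requires lam > 0."""
    t = [math.inf] + list(tau) + [0.0]
    return [math.exp(-lam * t[v + 1]) - math.exp(-lam * t[v])
            for v in range(len(tau) + 1)]
```

With a single threshold (V = 1) and tau_1 = 1, this reduces to the Bernoulli-Poisson probabilities e^{−λ} and 1 − e^{−λ}, matching the special case noted above.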
Ordinal Graph Gamma Belief Network: Since social influence recursively propagates and diffuses through the social network, personal interests change during this recursive process (Wu et al., 2019). As mentioned above, OGFA is a static model that leverages only first-order neighbor information without considering recursive diffusion in the social network, which may lead to suboptimal recommendation performance. To further exploit the multilevel semantics of user preferences in a taxonomic fashion, while capturing the social diffusion, we extend OGFA in a hierarchical fashion:
(10) 
where the number of communities may differ across layers, and the adjacency matrix raised to the power t represents the t-th order social information. To complete the hierarchical model, we place a Dirichlet prior on each column of the factor loading matrices and a gamma prior on the factor scores at the different layers.
Model Property: For an intuitive comparison between our models and traditional MF-based recommendation methods, we summarize the characteristics of OGFA in Eq. (7) and OGGBN in Eq. (10) as follows:
modeling raw ordinal data: Moving beyond the binarization stage adopted by traditional MF-based recommendation methods, which discards the value information associated with user-item interactions, the proposed OGFA and OGGBN directly handle raw ordinal data and keep as much rating information as possible, providing more expressive latent representations, as demonstrated in the following experiments.
hierarchical semantic communities: By introducing social-network relations into the user latent representations, OGFA can discover the underlying communities from observed user-user interactions. For ease of understanding, if we replace each user with a specific electronic product, the products “Apple TV” and “Smart LED TV” belong to the “TV & VIDEO” community (or category), while “Apple AirPods” and “Sony WH-1000XM3” belong to another community for “HEADPHONE”. Moving beyond discovering such shallow relations, the developed OGGBN can explore hierarchical semantic communities. Returning to the above example, although “TV & VIDEO” and “HEADPHONE” are assigned to different communities at a shallow semantic level, both of them belong to a larger “ELECTRONICS” community, reflecting the natural taxonomy present in shopping platforms and other domains (such as hobbies, behaviors, etc.).
high-order social diffusion: By marginalizing out the latent counts in Eq. (10), we can see that the adjacency matrix at a deeper layer tends to cover a wider social-network range, reflecting higher-order information about user-user relations. Moreover, the positive weight of each community reflects its importance to the generation process of the corresponding relations, and varies with both the community index and the layer index.
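To illustrate what "t-th order social information" means, the following toy sketch binarizes the t-th power of an adjacency matrix; this is our own illustration of the reachability idea, not the paper's generative mechanism, which works through latent counts:

```python
def higher_order_adjacency(A, t):
    """Return the binarized t-th power of adjacency matrix A (list of lists):
    entry (u, v) is 1 iff at least one walk of length exactly t joins u and v,
    so deeper layers 'see' increasingly distant parts of the social network."""
    n = len(A)
    P = [row[:] for row in A]
    for _ in range(t - 1):
        P = [[sum(P[u][w] * A[w][v] for w in range(n)) for v in range(n)]
             for u in range(n)]
    return [[int(P[u][v] > 0) for v in range(n)] for u in range(n)]
```

On a three-user path graph (0–1–2), the second-order adjacency links users 0 and 2 even though they share no direct edge, which is exactly the wider coverage a deeper layer exploits.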
scalability for large sparse recommendation: Both OGFA and OGGBN are scalable to large sparse recommendation problems, benefiting from the following advantages: 1) the user-item interaction matrix is sparse, and the augmentation in Eq. (12) only needs to focus on the nonzero user-item interactions; 2) the social network is also sparse, and our models only need to handle the nonzero user-user relations, taking advantage of the BerPo link (Zhou, 2015). Thus, by making full use of data sparsity, the time complexity of our models is linear in the total number of nonzero elements in the observed matrices, greatly reducing the time cost.
5 Model Inference
Below we describe the key inference equations of the parallel hybrid Gibbs-EM algorithm for our models, which can be accelerated on a GPU; more details are provided in the appendix.
The main challenge in the derivation is that the log-likelihood in Eq. (9) for ordinal data introduces the nonlinear term log(1 − e^{−λ_{ui} Δ_v}), which does not come with a conjugate prior to facilitate posterior inference. Thanks to the augmentation technique, for each observed ordinal datum y_{ui}, we can obtain a corresponding latent count variable as

(11) m_{ui} ∼ 1(y_{ui} ≥ 1) · ZTP(λ_{ui} Δ_{y_{ui}}),

where ZTP indicates the zero-truncated Poisson distribution.
After the augmentation in Eq. (11), exploiting the characteristics of the Poisson distribution, we can further augment each latent count variable into a count vector as

(12) m_{ui} = Σ_{k=1}^{K} m_{uik}, m_{uik} ∼ Pois(θ_{uk} φ_{ik} Δ_{y_{ui}}).

Taking advantage of the simplex constraint on the factor loadings, we can decouple the factors via the multinomial distribution as

(13) (m_{ui1}, …, m_{uiK}) ∼ Mult(m_{ui}; ζ_{ui1}, …, ζ_{uiK}), ζ_{uik} ∝ θ_{uk} φ_{ik}.
Then we can obtain analytic posteriors as follows:

Sample Φ: Using the conjugacy of the multinomial and Dirichlet distributions, each column of Φ can be sampled from its Dirichlet posterior.

Sample Θ: With the Poisson additive property and the conjugacy of the Poisson and gamma distributions, each factor score can be sampled from its gamma posterior, where the latent count variables are independently sampled from the corresponding user-user social network and user-item interactions, respectively.
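A minimal sketch of the two steps above, the multinomial allocation of Eqs. (12)–(13) and the resulting gamma update, is given below; the helper names and hyperparameters are illustrative, not the paper's exact parameterization:

```python
import random
from collections import Counter

def split_latent_count(m, theta_u, phi_i, rng):
    """Eq. (13): allocate the latent count m_ui across the K communities with
    a multinomial whose weights are proportional to theta_uk * phi_ik."""
    K = len(theta_u)
    weights = [theta_u[k] * phi_i[k] for k in range(K)]
    counts = Counter(rng.choices(range(K), weights=weights, k=m))
    return [counts.get(k, 0) for k in range(K)]

def gamma_posterior_params(a0, b0, aggregated_count, aggregated_rate):
    """Poisson-gamma conjugacy: theta | - ~ Gamma(a0 + sum of allocated
    counts, b0 + sum of the corresponding rate terms)."""
    return a0 + aggregated_count, b0 + aggregated_rate
```

Because the allocation step is needed only for nonzero observations, this is also where the sparsity argument from the model-property discussion pays off in practice.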
6 Experiments
6.1 Experimental Setup
Datasets.
We consider four commonly used datasets: Ciao, Epinions, MovieLens, and Taste Profile. The first two provide both the user-user social networks and the rating matrices, where the rating scale is from 1 to 5. The other two provide the unique names of items (e.g., movie names) but no social networks, which can be used for the visualization shown in Fig. 2. Each dataset is divided into a train set containing 80% of the samples and a test set containing the remaining 20%. More detailed descriptions can be found in Appendix B.

Model  Data  Ciao  Epinions  

s=1  s=3  s=5  s=1  s=3  s=5  
PF (Gopalan et al., 2015)  R  0.057 / 0.097  0.071 / 0.212  0.063 / 0.222  0.040 / 0.102  0.039 / 0.097  0.030 / 0.092 
BePoF (Acharya et al., 2015a)  0.056 / 0.097  0.070 / 0.211  0.063 / 0.221  0.039 / 0.098  0.037 / 0.093  0.027 / 0.087  
dcPF (Gouvert et al., 2020b)  R  0.057 / 0.098  0.073 / 0.215  0.068 / 0.226  0.045 / 0.108  0.042 / 0.102  0.038 / 0.103 
OrdNMF (Gouvert et al., 2020a)  R  0.056 / 0.096  0.074 / 0.215  0.066 / 0.222  0.044 / 0.106  0.042 / 0.104  0.038 / 0.104 
GNMF (Cai et al., 2010)  R  0.056 / 0.096  0.072 / 0.212  0.063 / 0.218  0.041 / 0.102  0.040 / 0.100  0.036 / 0.101 
TrustSVD (Koren, 2008)  R  0.055 / 0.095  0.071 / 0.213  0.064 / 0.219  0.039 / 0.099  0.038 / 0.098  0.035 / 0.100 
TrustMF (Ma et al., 2011)  R  0.056 / 0.097  0.074 / 0.215  0.065 / 0.220  0.043 / 0.104  0.042 / 0.104  0.035 / 0.102 
GraphRec (Fan et al., 2019)  R  0.057 / 0.097  0.084 / 0.224  0.075 / 0.233  0.046 / 0.113  0.046 / 0.114  0.038 / 0.105 
DiffNet (Wu et al., 2019)  0.057 / 0.098  0.083 / 0.222  0.073 / 0.230  0.047 / 0.115  0.043 / 0.112  0.037 / 0.103  
PGBN (Zhou et al., 2015)  R  0.056 / 0.097  0.072 / 0.214  0.063 / 0.217  0.046 / 0.109  0.042 / 0.105  0.033 / 0.096 
PGGBN  R  0.056 / 0.098  0.076 / 0.217  0.069 / 0.224  0.047 / 0.111  0.044 / 0.108  0.033 / 0.097 
OGGBN1  R  0.056 / 0.097  0.080 / 0.221  0.072 / 0.230  0.047 / 0.110  0.045 / 0.110  0.035 / 0.100 
OGGBN2  R  0.056 / 0.098  0.084 / 0.223  0.075 / 0.234  0.048 / 0.118  0.047 / 0.114  0.037 / 0.108 
OGGBN3  R  0.058 / 0.104  0.086 / 0.227  0.077 / 0.238  0.047 / 0.116  0.051 / 0.120  0.039 / 0.112 
Model  Data  MovieLens  Taste Profile  

s=1  s=4  s=10  s=1  s=6  s=51  
PF (Gopalan et al., 2015)  R/Q  0.435 / 0.481  0.433 / 0.484  0.309 / 0.568  0.204 / 0.360  0.147 / 0.336  0.106 / 0.305 
BePoF (Acharya et al., 2015a)  0.435 / 0.482  0.432 / 0.484  0.312 / 0.592  0.208 / 0.382  0.147 / 0.335  0.115 / 0.341  
dcPF (Gouvert et al., 2020b)  R/Q  0.436 / 0.481  0.438 / 0.485  0.355 / 0.612  0.209 / 0.382  0.154 / 0.342  0.121 / 0.347 
OrdNMF (Gouvert et al., 2020a)  R/Q  0.444 / 0.481  0.444 / 0.484  0.353 / 0.596  0.210 / 0.383  0.152 / 0.342  0.117 / 0.343 
GNMF (Cai et al., 2010)  R/Q  0.441 / 0.482  0.441 / 0.483  0.341 / 0.576  0.210 / 0.385  0.148 / 0.336  0.110 / 0.339 
TrustSVD (Koren, 2008)  R/Q  0.446 / 0.485  0.445 / 0.485  0.341 / 0.578  0.215 /0.392  0.151 / 0.342  0.115 / 0.342 
TrustMF (Ma et al., 2011)  R/Q  0.448 / 0.486  0.445 / 0.484  0.344 / 0.579  0.217 /0.393  0.151 / 0.343  0.117 / 0.345 
GraphRec (Fan et al., 2019)  R  0.451 / 0.491  0.447 / 0.502  0.364 / 0.631  0.222 / 0.393  0.168 / 0.404  0.135 / 0.368 
DiffNet (Wu et al., 2019)  0.453 / 0.489  0.445 / 0.495  0.347 / 0.583  0.220 / 0.395  0.152 / 0.346  0.115 / 0.341  
PGBN (Zhou et al., 2015)  R/Q  0.448 / 0.490  0.448 / 0.502  0.343 / 0.628  0.218 / 0.393  0.162 / 0.398  0.133 / 0.359 
PGGBN  R/Q  0.450 / 0.491  0.447 / 0.500  0.355 / 0.630  0.220 / 0.393  0.162 / 0.401  0.131 / 0.358 
OGGBN1  R/Q  0.452 / 0.490  0.443 / 0.500  0.362 / 0.631  0.221 / 0.395  0.165 / 0.403  0.134 / 0.361 
OGGBN2  R/Q  0.453 / 0.492  0.447 / 0.502  0.364 / 0.632  0.222 / 0.394  0.169 / 0.405  0.134 / 0.368 
OGGBN3  R/Q  0.455 / 0.495  0.448 / 0.504  0.365 / 0.636  0.222 / 0.397  0.170 / 0.409  0.136 / 0.371 
Preprocess: For MovieLens and Taste Profile, which lack social networks, we construct a handcrafted user relational graph without introducing additional information. Following (Sarwar et al., 2001), we construct user relations by calculating the cosine similarity between the observed user-item preference vectors of each pair of users, keeping an edge only when the similarity exceeds a threshold hyperparameter that controls the sparsity of the relational graph. Notably, the threshold is set separately on the MovieLens and Taste Profile datasets, to make sure that the sparsity of the simulated graph is similar to that of the real ones.

Baselines and Evaluation Metrics:
We compare OGGBN with the following baselines: 1) MF-based methods: PF, BePoF, dcPF, and OrdNMF; 2) social recommender systems: GNMF, TrustSVD, TrustMF, GraphRec, and DiffNet; 3) PGBN and its variant PGGBN for the ablation study, which generates the social relations with a Poisson distribution. The same latent size is used for all models, with a three-layer configuration for OGGBN3. Focusing on the top-N recommended items for each user, two popular ranking-based metrics are used for evaluation, Hit Ratio (HR) and normalized discounted cumulative gain (NDCG) (Wu et al., 2019), both of which are the higher the better. More details about the baselines and metrics can be found in Appendix C.

6.2 Results Analysis
In this section, we evaluate the performance of all comparison models on the four datasets under two metrics, HR@100 and NDCG@100. The experimental results on Ciao and Epinions are shown in Table 1. Compared to the MF-based methods, which only model the rating matrix, the social recommender systems achieve better performance, which is attributable to considering both the rating matrix and the social network. This indicates that social information can indeed improve performance when s is small, while performance drops sharply as s increases due to those models' inability to represent complex ordinal data. Taking advantage of both introducing social-network information and directly handling ordinal data, OGGBN outperforms the other baselines and achieves state-of-the-art performance on the two datasets. Furthermore, among our proposed models in the bottom group of Table 1, the 3-layer OGGBN achieves the best score in most cases, exhibiting the effectiveness of modeling the information-diffusion process in social recommendation systems and exploring the hierarchical relations among users. Similar conclusions can be drawn from the results on the MovieLens and Taste Profile datasets shown in Table 2. Benefiting from simultaneously modeling social networks together with ordinal data and simulating the recursive diffusion in the global social network, the 3-layer OGGBN achieves the best performance. It is worth noting that, compared to PGBN, OGGBN obtains a significant advantage when s is large, which can be explained by the learned latent thresholds presented below.
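For reference, the two evaluation metrics can be computed as follows. This is the standard binary-relevance formulation; the exact protocol in our experiments may differ in details such as tie-breaking:

```python
import math

def hr_at_n(ranked, held_out, n=100):
    """HR@N: fraction of a user's held-out items recovered in the top-N list."""
    top = set(ranked[:n])
    return sum(1 for item in held_out if item in top) / len(held_out)

def ndcg_at_n(ranked, held_out, n=100):
    """NDCG@N with binary relevance: discounted gain of the hits in the
    top-N list, normalized by the ideal ordering of the held-out items."""
    relevant = set(held_out)
    dcg = sum(1.0 / math.log2(rank + 2)
              for rank, item in enumerate(ranked[:n]) if item in relevant)
    idcg = sum(1.0 / math.log2(rank + 2)
               for rank in range(min(len(held_out), n)))
    return dcg / idcg
```

Both metrics are averaged over users, and higher values are better for both.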
The Learned Thresholds: As mentioned before, OGGBN builds the bridge between the ordinal observations and the latent variables through the threshold model. To give an intuitive understanding, we visualize the thresholds learned by OGGBN on the MovieLens and Taste Profile datasets in Fig. 3, from which we can observe that the gaps between thresholds increase as the ratings grow. Taking the threshold function on the MovieLens dataset as an example, assume the ratings of movies are roughly divided into three levels: bad, normal, and good. Our learned thresholds indicate a very limited variance among bad movies, but a much larger diversity among good ones. In other words, it is easier for a movie to improve its rating among the lower levels, but very difficult to move between the higher ones, which shows the superiority of our proposed method in modeling the intrinsic latent thresholds underlying the observed data.

Visualization: For qualitative evaluation, we also illustrate the learned communities at different layers and the corresponding inferred connections between them, as shown in Fig. 2. Each community is projected into the original item space for better visualization and interpretation. Taking the hierarchical communities of the Taste Profile dataset shown in Fig. 2 as an example, it is obvious that the communities at the bottom layer tend to correspond to specific singers, such as one community for Taylor Swift and another for John Mayer. As the network goes deeper, the communities become more general and show more diversity. A similar conclusion can be drawn from the hierarchical communities of MovieLens, which demonstrates the interpretability of our proposed recommendation system.
7 Conclusion
To jointly model the user-item rating matrix and the user-user social network for social recommendation, we first construct a shallow probabilistic model, OGFA, which can directly handle raw ordinal data. Further, considering the dynamic changes in the social diffusion process, we extend OGFA into a deeper model, named OGGBN, which captures the multiple semantic preferences of users and high-order social information. The experimental results clearly show the effectiveness and interpretability of the proposed models.
References
 A. Acharya, J. Ghosh, and M. Zhou. Nonparametric Bayesian factor analysis for dynamic count matrices. arXiv preprint arXiv:1512.08996.
 A. Acharya, D. Teffer, J. Henderson, M. Tyler, M. Zhou, and J. Ghosh. Gamma process Poisson factorization for joint modeling of network and documents. In Joint European Conference on Machine Learning and Knowledge Discovery in Databases, pp. 283–299.
 A. Agresti. Categorical Data Analysis. Vol. 482, John Wiley & Sons.
 A. Anagnostopoulos, R. Kumar, and M. Mahdian. Influence and correlation in social networks. In Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 7–15.
 J. Bennett and S. Lanning. The Netflix Prize. In Proceedings of KDD Cup and Workshop, Vol. 2007, pp. 35.
 D. Cai, X. He, J. Han, and T. S. Huang. Graph regularized nonnegative matrix factorization for data representation. IEEE Transactions on Pattern Analysis and Machine Intelligence 33 (8), pp. 1548–1560.
 J. Chang and D. M. Blei. Relational topic models for document networks. In Artificial Intelligence and Statistics, pp. 81–88.
 W. Fan, Y. Ma, Q. Li, Y. He, E. Zhao, J. Tang, and D. Yin. Graph neural networks for social recommendation. In The World Wide Web Conference, pp. 417–426.
 P. Gopalan, J. M. Hofman, and D. M. Blei. Scalable recommendation with hierarchical Poisson factorization. In UAI, pp. 326–335.
 O. Gouvert, T. Oberlin, and C. Févotte. Ordinal non-negative matrix factorization for recommendation. arXiv preprint arXiv:2006.01034.
 O. Gouvert, T. Oberlin, and C. Févotte. Recommendation from raw data with adaptive compound Poisson factorization. In Uncertainty in Artificial Intelligence, pp. 91–101.
 G. Guo, J. Zhang, and D. Thalmann. A simple but effective method to incorporate trusted neighbors in recommender systems. In International Conference on User Modeling, Adaptation, and Personalization, pp. 114–125.
 G. Guo, J. Zhang, and N. Yorke-Smith. TrustSVD: collaborative filtering with both the explicit and implicit influence of user trust and of item ratings. In AAAI, Vol. 15, pp. 123–125.
 Y. Hu, Y. Koren, and C. Volinsky. Collaborative filtering for implicit feedback datasets. In 2008 Eighth IEEE International Conference on Data Mining, pp. 263–272.
 M. Jiang, P. Cui, F. Wang, W. Zhu, and S. Yang. Scalable recommendation with social contextual information. IEEE Transactions on Knowledge and Data Engineering 26 (11), pp. 2789–2802.
 Y. Koren, R. Bell, and C. Volinsky. Matrix factorization techniques for recommender systems. Computer 42 (8), pp. 30–37.
 Y. Koren. Factorization meets the neighborhood: a multifaceted collaborative filtering model. In Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 426–434.
 K. Lewis, M. Gonzalez, and J. Kaufman. Social selection and peer influence in an online social network. Proceedings of the National Academy of Sciences 109 (1), pp. 68–72.
 H. Ma, D. Zhou, C. Liu, M. R. Lyu, and I. King. Recommender systems with social regularization. In Proceedings of the Fourth ACM International Conference on Web Search and Data Mining, pp. 287–296.
 M. Rosen-Zvi, T. Griffiths, M. Steyvers, and P. Smyth. The author-topic model for authors and documents. arXiv preprint arXiv:1207.4169.
 B. Sarwar, G. Karypis, J. Konstan, and J. Riedl. Item-based collaborative filtering recommendation algorithms. In Proceedings of the 10th International Conference on World Wide Web, pp. 285–295.
 J. Tang, X. Hu, and H. Liu. Social recommendation: a review. Social Network Analysis and Mining 3 (4), pp. 1113–1133.
 L. Wu, P. Sun, Y. Fu, R. Hong, X. Wang, and M. Wang. A neural influence diffusion model for social recommendation. In Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 235–244.
 M. Zhou, Y. Cong, and B. Chen. The Poisson gamma belief network. In Advances in Neural Information Processing Systems, pp. 3043–3051.
 M. Zhou. Nonparametric Bayesian negative binomial factor analysis. Bayesian Analysis 13 (4), pp. 1065–1093.
 M. Zhou. Infinite edge partition models for overlapping community detection and link prediction. In Artificial Intelligence and Statistics, pp. 1135–1143.