Thinking inside The Box: Learning Hypercube Representations for Group Recommendation

As a step beyond traditional personalized recommendation, group recommendation is the task of suggesting items that can satisfy a group of users. In group recommendation, the core is to design preference aggregation functions to obtain a quality summary of all group members' preferences. Such user and group preferences are commonly represented as points in the vector space (i.e., embeddings), where multiple user embeddings are compressed into one to facilitate ranking for group-item pairs. However, the resulting group representations, as points, lack adequate flexibility and capacity to account for the multi-faceted user preferences. Also, the point embedding-based preference aggregation is a less faithful reflection of a group's decision-making process, where all users have to agree on a certain value in each embedding dimension instead of a negotiable interval. In this paper, we propose a novel representation of groups via the notion of hypercubes, which are subspaces containing innumerable points in the vector space. Specifically, we design the hypercube recommender (CubeRec) to adaptively learn group hypercubes from user embeddings with minimal information loss during preference aggregation, and to leverage a revamped distance metric to measure the affinity between group hypercubes and item points. Moreover, to counteract the long-standing issue of data sparsity in group recommendation, we make full use of the geometric expressiveness of hypercubes and innovatively incorporate self-supervision by intersecting two groups. Experiments on four real-world datasets have validated the superiority of CubeRec over state-of-the-art baselines.


1. Introduction

Group activities are an integral part of human life, and e-commerce is no exception. For example, friends can sign up for the same music festivals on Facebook Events, and online platforms like Meetup allow users to form groups and host activities (Yin et al., 2020). As traditional recommender systems only target suggesting relevant items to individuals, there is an urgent demand for generating recommendations for a group of users, referred to as group recommendation. With the increasingly strong synergy between social and commercial functionalities in contemporary e-commerce platforms (e.g., Yelp, Gowalla, Steam, etc.) (Yin and Cui, 2016), there has been a recent surge in developing solutions to group recommendation (Hu et al., 2014; Cao et al., 2018; Sankar et al., 2020).

Similar to conventional recommendation, latent factor models exhibit dominant performance in group recommendation. In a general sense, groups and items are mapped to points in the vector space (i.e., embeddings), and the pairwise group-item affinity can then be estimated via intuitive distance/similarity metrics. As such, current group recommenders share a core motivation of learning representative group-level embeddings to capture the joint preferences of users. In rare cases where group-item interactions are dense, group representations can be directly learned via standard collaborative filtering models by substituting users with groups (Yin et al., 2019). However, real group activities are highly sparse (Yin et al., 2020; Sankar et al., 2020), restricting the informativeness of the learned group representations and hindering recommendations for new groups. Thus, a common practice is to aggregate a group's representation from its members' user embeddings, which are usually learned from the richer user-item interactions. Such aggregation strategies have evolved from heuristics like least misery and maximum pleasure (Amer-Yahia et al., 2009) to the more recent attentive sum (Cao et al., 2018) and social influence-based approaches (Yin et al., 2019).

Despite the increasing complexity of these recent preference aggregation approaches, the learned group representations still face a major bottleneck in their expressiveness. This is mainly due to the amount of information lost when compressing multiple user embeddings into one. In the context of embeddings, a group/user is represented as a point in the vector space, and each entry is the exact coordinate in the corresponding dimension. However, while each individual user's preference tends to be stable and can be denoted by a certain value in each embedding dimension, the whole group's overall preference reflects a blend of multiplex personal interests, challenging the capacity of point embeddings to accommodate such diversity. For example, if we associate the embedding dimensions with explicit item features, a group of users may land on different preferred values along the "price" dimension. It quickly becomes problematic to use a single point for the entire group, as this overlooks such disparate and diverse user preferences, which commonly exist in web applications (Nguyen et al., 2017; Li et al., 2021).

In some cases, though higher dimensionality can be assigned to the group embedding to encode more information and alleviate the problem, this does not resolve the issue that a single embedding is unable to precisely reflect the decision-making process within a user group. Essentially, aggregating user preferences mimics the decision-making process (Guo et al., 2020) where all group members reach a consensus, reflected by a group-level representation. In reality, the group-level representation should involve synergies of common interests as well as compromises on some personal tastes. For example, for two users that have a large discrepancy in preferred prices, it is more realistic for them to agree upon an approximate price range in between, rather than an exact value. This, unfortunately, cannot be facilitated by existing aggregation schemes where all user preferences are merged into a single point. A point group embedding forces all users to agree upon a fixed decision in each dimension, no matter whether its value is simply averaged from all users (Amer-Yahia et al., 2009) or biased towards more influential group members (Yin et al., 2019; Cao et al., 2018).

To this end, we aim to represent the group-level preferences in a more expressive way, such that the multi-faceted user demands are thoroughly encoded, and the intricate decision-making process among all group members is faithfully preserved. Hence, we advocate addressing the restricted capacity of learned group representations by seeking an alternative to the prevailing point embeddings. Specifically, we introduce the notion of hypercubes as a new way to represent the aggregated group-level preferences in the vector space. In each dimension of the vector space, while a point embedding only has a fixed value, a hypercube covers a range of values that span across a continuous interval (Ren et al., 2020). Combining the intervals shapes a "cube"-like subspace, enabling a hypercube to accommodate innumerable points in the vector space (Zhang et al., 2021b). Compared with traditional group embeddings, hypercubes enable a higher tolerance of different user preferences. Furthermore, as each dimension now encodes a preference interval, this paradigm is a more realistic reflection of how a group of distinct users give concessions to each other and eventually work out combined criteria for desired items.

However, further challenges are in place when unleashing the full potential of hypercube group representations. Firstly, as traditional aggregation schemes are dedicated to learning groups' embedding representations, a new paradigm is desired to effectively summarize all group members' vectorized preferences into a hypercube with minimal information loss in user interests. Secondly, the utilization of group hypercubes voids the applicability of existing point-wise distance measures like Euclidean distance or cosine similarity for group-item pairs, highlighting the need for a revamped distance metric that can precisely quantify and distinguish the affinity between different hypercubes (i.e., groups) and points (i.e., items). Thirdly, the long-standing problem of severe data sparsity in group-level interactions (Yin et al., 2020; Sankar et al., 2020) still persists in our case, which impedes the quality of the learned group hypercube representations and amplifies the risk of overfitting.

To address these issues, we present our solution to group recommendation, namely the hypercube recommender (CubeRec). CubeRec inherits the classic point embeddings for individual users/items, whose preferences/attributes are deterministic and can be learned from the relatively rich individual-level interactions. Meanwhile, we innovatively represent each group with the more expressive hypercube, which is first composed from the embeddings of its user members, and then optimized through observed group-level interactions. In CubeRec, we propose two alternative strategies to merge individual user embeddings into group hypercubes: a geometric bounding approach that physically encapsulates all group members' points, and an attentive approach that composes hypercubes by investigating the semantics within user embeddings. To allow for accurate item ranking, we put forward a distance function to measure the distance between each item point and group hypercube. The distance metric accounts for distances from both the inside and outside of a hypercube, thus accommodating different scenarios and maintaining a discriminative characteristic. Furthermore, we make full use of the geometry of group hypercubes and propose a novel self-supervised learning paradigm to tackle the data sparsity in group recommendation. In short, we define an intersection operation between two group representations, and leverage the common users shared between two groups as supervision signals to enrich the information encoded within every hypercube.

In summary, the main contributions of our work are:

  • We define a new schema based on hypercubes to represent group-level preferences for group recommendation. Hypercubes bypass the limited expressiveness and flexibility of conventional point embeddings, allowing the group representations to encode multi-faceted user preferences while fully imitating a group’s decision-making process.

  • We propose CubeRec, a group recommender that comes with novel designs for learning group-level hypercubes and measuring group-to-item distances. To alleviate data sparsity, CubeRec further incorporates self-supervision by innovatively utilizing the intersection between groups.

  • We conduct extensive experiments on four real-world benchmark datasets. Experimental results show that CubeRec yields superior recommendation performance compared with state-of-the-art baselines.

2. Preliminaries

In this section, we mathematically define the hypercube representations of groups, which are the key building block of CubeRec.

One can think of a hypercube as extending a rectangle into a higher-dimensional space, where each edge of the hypercube corresponds to a real-valued, closed interval on each dimension (Ren et al., 2020; Zhang et al., 2021b). For a group $\mathcal{G} \subseteq \mathcal{U}$, which is a subset of users, we define its representation via the following center-offset format:

(1) $\mathcal{C} = \{ \mathbf{p} \in \mathbb{R}^{d} : \mathbf{cen} - \mathbf{off} \preceq \mathbf{p} \preceq \mathbf{cen} + \mathbf{off} \},$

where $\mathbf{cen} \in \mathbb{R}^{d}$ is the geometric center of the hypercube, and $\mathbf{off} \in \mathbb{R}^{d}_{\geq 0}$ is the non-negative offset vector. Both the center and offset are $d$-dimensional vectors learned within the same embedding space, where the center defines the hypercube's position and the offset defines its size. Essentially, in the $k$-th dimension of $\mathbb{R}^{d}$, $\mathcal{C}$ will have its $k$-th edge spanning across $[cen_k - off_k, \, cen_k + off_k]$. For notation convenience, we define the hypercube representation for the $x$-th group as a tuple $\mathcal{C}_x = (\mathbf{cen}_x, \mathbf{off}_x)$ with its group-specific center $\mathbf{cen}_x$ and offset $\mathbf{off}_x$. We use this tuple as a shorthand for Eq.(1) in the rest of our paper. Compared with point embeddings, relaxing the fixed value in each dimension into a flexible range brings an increased tolerance of the variety among group members. It also reflects a more realistic decision process, as the group does not have to pre-approve a specific value per dimension before ranking candidate items.
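To make the center-offset format concrete, the following minimal PyTorch sketch (our illustration, not the authors' code) builds a hypercube from Eq.(1) and tests whether a point satisfies every per-dimension interval:

```python
import torch

d = 4                                 # latent dimensionality (arbitrary here)
cen = torch.randn(d)                  # geometric center of the hypercube
off = torch.rand(d)                   # non-negative offset (edge half-lengths)
p_min, p_max = cen - off, cen + off   # per-dimension interval boundaries

def contains(p: torch.Tensor) -> bool:
    # A point lies in the hypercube iff every coordinate falls in its interval
    return bool(((p >= p_min) & (p <= p_max)).all())

print(contains(cen))                  # True: the center is always inside
```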

3. The Hypercube Recommender

To provide an overview, CubeRec first learns user and item embeddings based on user-level interactions with items. Then, we propose our solution to building and learning hypercube representations at the group level. To further compensate for the high sparsity of group-item interactions when learning hypercube representations, we introduce a novel learning objective by leveraging self-supervision signals from overlapping groups. In what follows, we unfold the design details of CubeRec.

3.1. Learning User and Item Embeddings

In CubeRec, we employ LightGCN (He et al., 2020b) to learn user and item embeddings. Compared with earlier collaborative filtering (CF) methods, adopting LightGCN facilitates the modelling of higher-order collaborative signals through message passing, and offers simplicity via the omission of excessive nonlinear components (Wang et al., 2019). We keep this part brief as it is not our core contribution and can actually be replaced by arbitrary latent factor models.

Modelling User-Item Interactions. Treating each user/item as a distinct node, the user-item interactions can be formulated as a bipartite graph, where we let $\mathcal{N}(u)$ and $\mathcal{N}(i)$ denote the one-hop neighbor sets of user $u$ and item $i$, respectively. The embedding of an arbitrary user/item is updated by propagating its neighbors' embeddings into $\mathbf{u}^{(l)}$/$\mathbf{i}^{(l)}$:

(2) $\mathbf{u}^{(l+1)} = \sum_{i \in \mathcal{N}(u)} \frac{1}{\sqrt{|\mathcal{N}(u)|}\sqrt{|\mathcal{N}(i)|}} \mathbf{i}^{(l)}, \quad \mathbf{i}^{(l+1)} = \sum_{u \in \mathcal{N}(i)} \frac{1}{\sqrt{|\mathcal{N}(i)|}\sqrt{|\mathcal{N}(u)|}} \mathbf{u}^{(l)},$

where $l$ denotes the propagation layer, $\frac{1}{\sqrt{|\mathcal{N}(u)|}\sqrt{|\mathcal{N}(i)|}}$ is a graph Laplacian normalization term (He et al., 2020b; Kipf and Welling, 2017), while $\mathbf{u}^{(0)}$ and $\mathbf{i}^{(0)}$ are randomly initialized and will be trained via back-propagation. Following (He et al., 2020b), the final embeddings are obtained via mean pooling across all $L$ layers, i.e., $\mathbf{u} = \frac{1}{L+1}\sum_{l=0}^{L}\mathbf{u}^{(l)}$ and $\mathbf{i} = \frac{1}{L+1}\sum_{l=0}^{L}\mathbf{i}^{(l)}$, which are used in the subsequent recommendation phases.

Handling Explicit Social Ties. Obviously, increasing $L$ can help capture higher-order relationships including implicit social relationships like user-item-user paths, as $L$ essentially controls the number of hops to be reached during the neighborhood aggregation. As group recommendation services are commonly provided on social platforms, the explicitly available social connections between users (e.g., following relationships on social media) carry strong signals for understanding a user's preferences. However, the plain LightGCN does not account for such explicit social ties. In CubeRec, we use $\mathbf{R} \in \{0,1\}^{|\mathcal{U}| \times |\mathcal{V}|}$ to denote the interactions between users from set $\mathcal{U}$ and items from set $\mathcal{V}$, and $\mathbf{S} \in \{0,1\}^{|\mathcal{U}| \times |\mathcal{U}|}$ to denote the social relation adjacency matrix of all users. Then, we formulate the adjacency matrix of the user-item graph as the following:

(3) $\mathbf{A} = \begin{pmatrix} \mathbf{S} & \mathbf{R} \\ \mathbf{R}^{\top} & \mathbf{0} \end{pmatrix},$

where $\mathbf{0}$ is a matrix consisting of 0s, and the resulting adjacency matrix $\mathbf{A}$ now carries both user-item interactions and user-user social ties. With that, we summarize the socially enhanced LightGCN in the matrix form:

(4) $\mathbf{E}^{(l+1)} = \mathbf{D}^{-\frac{1}{2}} \mathbf{A} \mathbf{D}^{-\frac{1}{2}} \mathbf{E}^{(l)},$

where $\mathbf{D}$ is a diagonal degree matrix in correspondence to $\mathbf{A}$, while $\mathbf{E}^{(l)}$ stacks all user and item embeddings at the $l$-th layer.
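As a sketch of Eqs.(3)-(4), the snippet below (sizes and sparsity levels are placeholders of ours) assembles the socially enhanced adjacency matrix, applies the symmetric degree normalization, and mean-pools the propagated embeddings across layers in the LightGCN style:

```python
import torch

n_users, n_items, d, L = 5, 7, 8, 3
R = (torch.rand(n_users, n_items) < 0.3).float()   # user-item interactions
S = (torch.rand(n_users, n_users) < 0.2).float()   # user-user social ties
S = ((S + S.T) > 0).float(); S.fill_diagonal_(0)   # symmetrize, no self-loops

# Block adjacency of Eq.(3): social ties occupy the user-user block
A = torch.cat([torch.cat([S, R], dim=1),
               torch.cat([R.T, torch.zeros(n_items, n_items)], dim=1)], dim=0)

deg = A.sum(dim=1).clamp(min=1)                    # guard against isolated nodes
D_inv_sqrt = torch.diag(deg.pow(-0.5))
A_hat = D_inv_sqrt @ A @ D_inv_sqrt                # symmetric normalization

E = torch.randn(n_users + n_items, d)              # E^(0), randomly initialized
layers = [E]
for _ in range(L):                                 # Eq.(4): linear propagation
    E = A_hat @ E
    layers.append(E)
E_final = torch.stack(layers).mean(dim=0)          # mean pooling across layers
users, items = E_final[:n_users], E_final[n_users:]
```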

User-level Recommendation Loss. We optimize all user and item embeddings with a distance-based objective on user-level interactions, namely the hinge loss (Chen et al., 2018; Hsieh et al., 2017):

(5) $\mathcal{L}_{user} = \sum_{(u, i, i') \in \mathcal{D}_{user}} \max\big(0, \; d^2(\mathbf{u}, \mathbf{i}) + \gamma - d^2(\mathbf{u}, \mathbf{i}')\big),$

where $\mathcal{D}_{user}$ denotes the training set, $\gamma > 0$ is the safety margin size to be predefined, and $d^2(\cdot, \cdot)$ is the squared Euclidean distance between the embeddings of $u$ and $i$/$i'$. The rationale of the hinge loss is that, in the vector space, each user $u$ should always be closer to a visited item $i$ than an unvisited one $i'$. As such, a training sample is defined as a triple $(u, i, i')$, where every observed user-item pair $(u, i)$ is matched with a fixed number of uniformly sampled negative items $i'$ to construct $\mathcal{D}_{user}$.
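A compact sketch of the hinge objective in Eq.(5), assuming batched tensors of user, positive-item, and negative-item embeddings (the margin value is a placeholder of ours):

```python
import torch

def user_hinge_loss(u, i_pos, i_neg, margin=0.5):
    # max(0, d^2(u, i+) + margin - d^2(u, i-)), summed over the batch
    d_pos = ((u - i_pos) ** 2).sum(dim=-1)  # squared distance to a visited item
    d_neg = ((u - i_neg) ** 2).sum(dim=-1)  # squared distance to a sampled negative
    return torch.clamp(d_pos + margin - d_neg, min=0).sum()

# Example: a batch of 32 training triples (u, i, i') in a 64-dimensional space
u, i_pos, i_neg = (torch.randn(32, 64) for _ in range(3))
loss = user_hinge_loss(u, i_pos, i_neg)
```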

(a) Geometric fusion and projection for hypercube learning (b) Attentive fusion and projection for hypercube learning (c) Group-to-item distances for different item points (d) Our goal with self-supervised learning via hypercube intersection
Figure 1. A schematic view of the key hypercube-based computations in CubeRec. We use $d=2$ for demonstration purposes. Corresponding details can be found in Section 3.2 for (a) and (b), Section 3.3 for (c), and Section 3.4 for (d).

3.2. Composing Group-level Hypercubes

We perform pretraining with $\mathcal{L}_{user}$ to acquire initial user and item embeddings for making recommendations at the group level. As a common practice, a group's representation can be learned by merging all the embeddings of its members via means like attention networks (Cao et al., 2018; Huang et al., 2020) or graph convolution (Wang et al., 2020; Guo et al., 2021), but these inevitably incur limited expressiveness. As discussed earlier, compressing users' multi-faceted preferences into a single point in the vector space leads to significant information loss, and also misaligns with a group's decision-making process in the real world.

In this section, we present our solution to learning expressive and flexible group representations with hypercubes. Specifically, we introduce two different strategies for composing group-level hypercubes from individual users' point embeddings, both of which have their unique properties and are further evaluated in Section 4.

Geometric Bounding and Projection. Given the user embeddings $\{\mathbf{u} : u \in \mathcal{G}_x\}$ in group $\mathcal{G}_x$, geometric bounding is an intuitive way to generate the corresponding group-level hypercube. As Figure 1(a) shows, we first need to find the smallest hypercube covering all points (i.e., user embeddings) from $\mathcal{G}_x$. Such a hypercube is computed as the following:

(6) $\mathcal{C}'_x = \Big( \frac{\mathbf{p}_{max} + \mathbf{p}_{min}}{2}, \; \frac{\mathbf{p}_{max} - \mathbf{p}_{min}}{2} \Big),$

where $\mathbf{p}_{max}$ and $\mathbf{p}_{min}$ are boundary vectors computed via:

(7) $\mathbf{p}_{max} = \max(\{\mathbf{u} : u \in \mathcal{G}_x\}), \quad \mathbf{p}_{min} = \min(\{\mathbf{u} : u \in \mathcal{G}_x\}),$

where $\max(\cdot)$ and $\min(\cdot)$ both operate element-wise. However, straightforwardly utilizing $\mathcal{C}'_x$ as the final group representation is suboptimal due to its geometric property. That is, the more diverse the users in $\mathcal{G}_x$ are, the larger the area that $\mathcal{C}'_x$ will cover in the $d$-dimensional space. As the hypercube shares the same vector space with items, the recommendation will naturally prefer items that fall in the hypercube, because such items meet the preference intervals on all latent dimensions. Consequently, a highly diverse group will receive an excessively large hypercube, covering more false positive items that eventually dilute the recommendation accuracy. To alleviate this issue, we learn projection matrices for both the center and offset that help rescale $\mathcal{C}'_x$ into the final group hypercube $\mathcal{C}_x = (\mathbf{cen}_x, \mathbf{off}_x)$:

(8) $\mathbf{cen}_x = \mathbf{W}_{cen} \, \mathbf{cen}'_x, \quad \mathbf{off}_x = \mathbf{W}_{off} \, \mathbf{off}'_x,$

where $\mathbf{W}_{cen}, \mathbf{W}_{off} \in \mathbb{R}^{d \times d}$ are the corresponding projection weights. Note that $\mathbf{W}_{off}$ is non-negative such that $\mathbf{off}_x \succeq \mathbf{0}$.
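The geometric bounding of Eqs.(6)-(8) reduces to element-wise min/max plus two linear projections. A sketch under our own initialization assumptions (the non-negative W_off keeps the rescaled offset valid):

```python
import torch

def geometric_bounding(U, W_cen, W_off):
    """U: (|G|, d) member embeddings -> (cen, off) of the group hypercube."""
    p_max, p_min = U.max(dim=0).values, U.min(dim=0).values  # Eq.(7), element-wise
    cen_raw = (p_max + p_min) / 2                            # Eq.(6): box center
    off_raw = (p_max - p_min) / 2                            # Eq.(6): half-widths
    return W_cen @ cen_raw, W_off @ off_raw                  # Eq.(8): rescaling

d = 8
U = torch.randn(3, d)        # a group of three users
W_cen = torch.randn(d, d)
W_off = torch.rand(d, d)     # non-negative projection weights
cen, off = geometric_bounding(U, W_cen, W_off)
```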

Attentive Fusion and Projection. Geometric bounding aims to let a hypercube fairly cover all users' preferences within a group. While this is reasonable in many cases, recent studies also point out that each user in the group may affect the final group decision to a different extent (Cao et al., 2018; Yin et al., 2019) due to varied social influence and/or expertise. To make our learned group representations able to account for every group member's importance discrepancy, we further propose an attentive approach (Vaswani et al., 2017; Chen et al., 2020a) for composing group hypercubes:

(9) $\mathbf{g}_x = \mathrm{softmax}\Big( \frac{\mathbf{q}^{\top} \mathbf{W}_K \mathbf{U}_x}{\sqrt{d}} \Big) (\mathbf{W}_V \mathbf{U}_x)^{\top},$

where $\mathbf{q} \in \mathbb{R}^{d}$ is the learnable query vector in the above self-attention, $\mathbf{W}_K, \mathbf{W}_V \in \mathbb{R}^{d \times d}$ are respectively the key and value projection matrices, and $\mathbf{U}_x \in \mathbb{R}^{d \times |\mathcal{G}_x|}$ stacks all $d$-dimensional user embeddings in group $\mathcal{G}_x$. As each $\mathbf{u}$ reflects a user's exact interest, the aggregation in Eq.(9) attentively aggregates all group members' interests, denoted by $\mathbf{g}_x$. Then, we locate the centroid and offset of the composed hypercube via:

(10) $\mathbf{cen}_x = \mathbf{W}_1 \mathbf{g}_x, \quad \mathbf{off}_x = \mathrm{ReLU}(\mathbf{W}_2 \mathbf{g}_x + \mathbf{b}),$

where we slightly abuse notations $\mathbf{W}_1, \mathbf{W}_2 \in \mathbb{R}^{d \times d}$ to denote the learnable matrices, and $\mathbf{b} \in \mathbb{R}^{d}$ is the bias vector. Since the attentively aggregated $\mathbf{g}_x$ summarizes the core interest of the group, we directly use its linear projection as the center $\mathbf{cen}_x$. Meanwhile, the process of obtaining $\mathbf{g}_x$ sacrifices users' preference diversity, which should be preserved by the offset. Thus, we employ the more expressive nonlinearity to partially infer and recover such missing information from $\mathbf{g}_x$ and obtain the offset $\mathbf{off}_x$. The rectified linear unit (ReLU) is used to ensure non-negativity. This process is depicted by Figure 1(b). In short, the attentive approach focuses more on the semantics of user embeddings compared with its geometric counterpart that retains more physical meanings, and is relatively more efficient.
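A sketch of the attentive composition in Eqs.(9)-(10). The parameter shapes follow the description above, while the scaled dot-product scoring is our assumption:

```python
import torch
import torch.nn.functional as F

def attentive_fusion(U, q, W_K, W_V, W1, W2, b):
    """U: (|G|, d) member embeddings -> (cen, off) via self-attention."""
    d = U.size(1)
    scores = (U @ W_K.T) @ q / d ** 0.5   # one attention score per member
    alpha = F.softmax(scores, dim=0)
    g = alpha @ (U @ W_V.T)               # Eq.(9): attentive aggregation
    cen = W1 @ g                          # Eq.(10): linear projection as center
    off = F.relu(W2 @ g + b)              # ReLU keeps the offset non-negative
    return cen, off

d = 8
U = torch.randn(3, d)                     # a group of three users
q = torch.randn(d)
W_K, W_V, W1, W2 = (torch.randn(d, d) for _ in range(4))
b = torch.randn(d)
cen, off = attentive_fusion(U, q, W_K, W_V, W1, W2, b)
```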

3.3. Learning Hypercubes for Group Recommendation

To optimize the composed hypercube representations, we firstly put forward a distance function that quantifies the affinity between each pair of group and item, and then introduce how we define the loss function at the group level for learning CubeRec.

Computing Item-to-Group Distances. Compared with conventional point group embeddings, each edge of a hypercube defines a range on the corresponding dimension in the latent feature space rather than a single value. Intuitively, for any item point $\mathbf{v}$ that falls in the hypercube $\mathcal{C}_x$, each dimension of $\mathbf{v}$ satisfies the preference range specified by $\mathcal{C}_x$, making item $v$ a good fit for group $\mathcal{G}_x$. However, this is not always the case in practice. On one hand, a diverse group tends to have a relatively large coverage with the generated hypercube, containing more irrelevant items. On the other hand, a group with relatively narrow interests can have a rather small hypercube, making it hard to cover any existing item embeddings, especially when $d$ is large. In light of this, we propose an approach that measures the item-to-group distance via:

(11) $d(\mathcal{C}_x, \mathbf{v}) = d_{out}(\mathcal{C}_x, \mathbf{v}) + \lambda \, d_{in}(\mathcal{C}_x, \mathbf{v}),$

where the functions $d_{out}(\cdot, \cdot)$ and $d_{in}(\cdot, \cdot)$ measure the outer and inner item-to-group distances, respectively, which are balanced by a fixed coefficient $\lambda$. Taking the group hypercube $\mathcal{C}_x$ and item embedding $\mathbf{v}$ as inputs, these two functions are formulated as follows using squared Euclidean distance:

(12) $d_{out}(\mathcal{C}_x, \mathbf{v}) = \big\| \max(\mathbf{v} - \mathbf{p}_{max}, \mathbf{0}) + \max(\mathbf{p}_{min} - \mathbf{v}, \mathbf{0}) \big\|_2^2, \quad d_{in}(\mathcal{C}_x, \mathbf{v}) = \big\| \mathbf{cen}_x - \min\big(\mathbf{p}_{max}, \max(\mathbf{p}_{min}, \mathbf{v})\big) \big\|_2^2,$

where the two points $\mathbf{p}_{min}$ and $\mathbf{p}_{max}$ are geometrically the lower-left and upper-right corners of hypercube $\mathcal{C}_x$, respectively:

(13) $\mathbf{p}_{min} = \mathbf{cen}_x - \mathbf{off}_x, \quad \mathbf{p}_{max} = \mathbf{cen}_x + \mathbf{off}_x.$

As explained in Figure 1(c), assuming $\mathbf{v}$ locates outside $\mathcal{C}_x$, what Eq.(12) does is to: (1) locate the anchor point $\min(\mathbf{p}_{max}, \max(\mathbf{p}_{min}, \mathbf{v}))$ on $\mathcal{C}_x$'s surface that is the closest to $\mathbf{v}$; (2) use $d_{out}$ to measure the outer distance from $\mathbf{v}$ to this anchor point; and (3) use $d_{in}$ to measure the inner distance from the centroid $\mathbf{cen}_x$ to this anchor point. The outer and inner distances are then combined into $d(\mathcal{C}_x, \mathbf{v})$ via Eq.(11). When $\mathbf{v}$ is inside or on the surface of $\mathcal{C}_x$, $d_{out}(\mathcal{C}_x, \mathbf{v}) = 0$, and $d(\mathcal{C}_x, \mathbf{v})$ reduces to $\lambda$ times the distance between item $\mathbf{v}$ and centroid $\mathbf{cen}_x$, because $\min(\mathbf{p}_{max}, \max(\mathbf{p}_{min}, \mathbf{v})) = \mathbf{v}$. This is also shown in Figure 1(c).
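The distance of Eqs.(11)-(13) can be written with a single clamped "anchor point". The sketch below is equivalent to Eq.(12), since v - anchor vanishes inside the box (the λ value is a placeholder of ours):

```python
import torch

def cube_item_distance(cen, off, v, lam=0.3):
    # Eq.(13): lower-left / upper-right corners of the hypercube
    p_min, p_max = cen - off, cen + off
    # Closest point to v on or inside the box (equals v when v is inside)
    anchor = torch.min(p_max, torch.max(p_min, v))
    d_out = ((v - anchor) ** 2).sum()   # Eq.(12): zero when v is inside the box
    d_in = ((cen - anchor) ** 2).sum()  # Eq.(12): center-to-anchor distance
    return d_out + lam * d_in           # Eq.(11)
```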

Group-level Recommendation Loss. Following the definition of the item-to-group distance above, the higher the affinity between group $\mathcal{G}_x$ and item $v$, the smaller the distance $d(\mathcal{C}_x, \mathbf{v})$ will be. Analogously to Eq.(5), we adopt the following hinge loss:

(14) $\mathcal{L}_{group} = \sum_{(x, v, v') \in \mathcal{D}_{group}} \max\big(0, \; d(\mathcal{C}_x, \mathbf{v}) + \gamma' - d(\mathcal{C}_x, \mathbf{v}')\big),$

where $\gamma'$ denotes the safety margin, and $(x, v, v') \in \mathcal{D}_{group}$ are training instances constructed by negative sampling. Essentially, for cube $\mathcal{C}_x$, $\mathcal{L}_{group}$ aims to pull a positive item's embedding into $\mathcal{C}_x$ and closer to its center, while pushing the negative item's embedding away from it.

3.4. Self-supervision via Hypercube Intersections

Unfortunately, as pointed out by prior studies (Yin et al., 2019; Sankar et al., 2020), a long-lasting obstacle for learning high-quality group representations is that, the learning of group representations is purely dependant on the interacted items, which are highly sparse at the group level (see statistics in Section 4.1). As such, we propose to take advantage of the rich self-supervision signals from user-group associations for learning representative group hypercubes. Specifically, for group we firstly draw a different group , where shares at least one user in common with (details on how we deal with disjoint groups will follow). Users shared between and are termed relay users. Then intuitively, the intersection of two group hypercubes denoted by , which is also a hypercube, is supposed to reflect the properties of all the relay users.

With Figure 1(d), we illustrate our goal of self-supervised learning in CubeRec. In a geometric sense, two hypercubes’ intersection is ideally the subspace that hosts the point embeddings of relay users. On this basis, we formulate the self-supervision loss below:

(15)

where is a shorthand for the distance between the intersection and user , i.e., . For each positive user that falls in the group intersection , we sample a negative user from

with uniform distribution

. By doing so, the representations of groups and will be regularized by the relay users, so as to capture the common interests between them. Correspondingly, the learned group representations will be substantially more informative than those learned purely with sparse item-level interactions.

Hypercube Intersection Operation. We hereby define how the hypercube intersection is calculated. For convenience, we let $\mathcal{C}_{x \cap y} = (\mathbf{cen}_{x \cap y}, \mathbf{off}_{x \cap y})$. Because straightforwardly using the geometric intersection is only applicable when two hypercubes have actual overlaps in each dimension of $\mathbb{R}^{d}$, we resort to a relaxed intersection computation via a neural approach. Firstly, the center $\mathbf{cen}_{x \cap y}$ is computed by performing element-wise attention over the two group centers:

(16) $\mathbf{cen}_{x \cap y} = \mathbf{a}_x \odot \mathbf{cen}_x + \mathbf{a}_y \odot \mathbf{cen}_y, \quad [\mathbf{a}_x, \mathbf{a}_y] = \mathrm{softmax}\big( \mathrm{MLP}_1(\mathbf{cen}_x), \, \mathrm{MLP}_1(\mathbf{cen}_y) \big),$

where $\odot$ denotes element-wise multiplication, the softmax is taken element-wise over the two groups, and $\mathrm{MLP}_1(\cdot)$ is a multilayer perceptron (MLP) with $d$ neurons throughout all layers. Meanwhile, the offset $\mathbf{off}_{x \cap y}$ is determined via:

(17) $\mathbf{off}_{x \cap y} = \min(\mathbf{off}_x, \mathbf{off}_y) \odot \sigma\big( \mathrm{MLP}_2(\mathbf{off}_x + \mathbf{off}_y) \big),$

where $\mathrm{MLP}_2(\cdot)$ is another MLP with a distinct set of parameters, and $\sigma(\cdot)$ is the sigmoid function. Essentially, Eq.(16) places the intersection center somewhere between $\mathbf{cen}_x$ and $\mathbf{cen}_y$. Then, with the scaling effect of the sigmoid function, Eq.(17) shrinks the offsets of hypercubes $\mathcal{C}_x$ and $\mathcal{C}_y$ to obtain the new offset of the intersection $\mathcal{C}_{x \cap y}$.
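A sketch of the neural intersection in Eqs.(16)-(17); the MLP depth and the exact combination of the two offsets are our reading of the description above, not a verified implementation detail:

```python
import torch
import torch.nn.functional as F

class CubeIntersection(torch.nn.Module):
    def __init__(self, d):
        super().__init__()
        # MLPs with d neurons throughout all layers; the depth is an assumption
        self.mlp_cen = torch.nn.Sequential(
            torch.nn.Linear(d, d), torch.nn.ReLU(), torch.nn.Linear(d, d))
        self.mlp_off = torch.nn.Sequential(
            torch.nn.Linear(d, d), torch.nn.ReLU(), torch.nn.Linear(d, d))

    def forward(self, cen_x, off_x, cen_y, off_y):
        # Eq.(16): element-wise attention keeps the center between cen_x and cen_y
        a = F.softmax(torch.stack([self.mlp_cen(cen_x), self.mlp_cen(cen_y)]), dim=0)
        cen_z = a[0] * cen_x + a[1] * cen_y
        # Eq.(17): sigmoid scaling shrinks the offsets into the intersection's offset
        off_z = torch.min(off_x, off_y) * torch.sigmoid(self.mlp_off(off_x + off_y))
        return cen_z, off_z
```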

Handling Isolated Groups. Despite the heavily overlapping nature of user groups on social e-commerce platforms, we must take into account the situation where a group $\mathcal{G}_x$ might be disjoint with all other groups in the dataset. On this occasion, we propose the two following strategies to generate a dummy user group $\mathcal{G}'_x$ having overlapping users with $\mathcal{G}_x$ (see the sketch after this list):

  • Proportional Swap (PS): With a specified proportion $\rho$, we swap $\rho|\mathcal{G}_x|$ users in $\mathcal{G}_x$ with users uniformly sampled from $\mathcal{U}$ with replacement. We denote this as $\mathcal{G}'_x = \mathrm{PS}(\mathcal{G}_x, \rho)$.

  • Proportional Imputation (PI): With a specified proportion $\rho$, we inject $\rho|\mathcal{G}_x|$ uniformly sampled users from $\mathcal{U}$ (with replacement) into $\mathcal{G}_x$. We denote this as $\mathcal{G}'_x = \mathrm{PI}(\mathcal{G}_x, \rho)$.

The generated $\mathcal{G}'_x$ then serves as $\mathcal{G}_y$ to facilitate self-supervision via Eq.(15). We adopt a predefined value of $\rho$ in our approach, where we use PS and PI alternately during training.
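The two strategies amount to simple random edits of the member set. A sketch assuming ρ = 0.5 and ceiling-based rounding (both our own choices):

```python
import math
import random

def proportional_swap(group, all_users, rho=0.5):
    # PS: replace ceil(rho * |G|) members with users drawn uniformly (with replacement)
    members = list(group)
    k = math.ceil(rho * len(members))
    for idx in random.sample(range(len(members)), k):
        members[idx] = random.choice(all_users)
    return set(members)

def proportional_imputation(group, all_users, rho=0.5):
    # PI: inject ceil(rho * |G|) uniformly drawn users into G
    k = math.ceil(rho * len(group))
    return set(group) | {random.choice(all_users) for _ in range(k)}
```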

1: Input: user-level training set $\mathcal{D}_{user}$, group-level training set $\mathcal{D}_{group}$, hyperparameters
2: Output: All model parameters collectively referred to as $\Theta$
3: Randomly initialize $\Theta$;
4: repeat
5:     Draw a mini-batch from $\mathcal{D}_{user}$;
6:     Take a gradient step to update u and v w.r.t. $\mathcal{L}_{user}$;
7: until convergence ▷ Pretraining user and item embeddings
8: repeat
9:     Draw a mini-batch from $\mathcal{D}_{group}$ and compute $\mathcal{L}_{group}$;
10:     for each $\mathcal{G}_x$ in mini-batch do
11:          if $\exists \, \mathcal{G}_y$ s.t. $\mathcal{G}_x \cap \mathcal{G}_y \neq \emptyset$ then
12:               Compute $\mathcal{L}_{SSL}$ with a sampled $\mathcal{G}_y$;
13:          else
14:               Generate $\mathcal{G}'_x$ via either $\mathrm{PS}(\mathcal{G}_x, \rho)$ or $\mathrm{PI}(\mathcal{G}_x, \rho)$ by coin-flipping;
15:               Compute $\mathcal{L}_{SSL}$ with $\mathcal{G}'_x$;
16:     Take a gradient step to update $\Theta$ w.r.t. $\mathcal{L}_{group} + \eta \mathcal{L}_{SSL}$;
17: until convergence
Algorithm 1 Training Procedure of CubeRec

3.5. Optimizing CubeRec

As all components of CubeRec are end-to-end differentiable, we learn the model parameters on multiple objectives with the mini-batch stochastic gradient descent algorithm Adam (Kingma and Ba, 2015). As depicted by Algorithm 1, we adopt a two-stage training strategy for CubeRec. To be specific, we first obtain pretrained user and item embeddings by optimizing the user-level loss $\mathcal{L}_{user}$. Then, we fine-tune the pretrained embeddings and learn the rest of the parameters in CubeRec by optimizing towards the combined loss $\mathcal{L}_{group} + \eta \mathcal{L}_{SSL}$, where $\eta$ weighs the self-supervision. In both training stages, we update the corresponding model parameters in each iteration and repeat the entire training process until the loss converges or is sufficiently small.

We tune the hyperparameters using grid search. Specifically, the latent dimension $d$, the distance coefficient $\lambda$, and the self-supervision weight $\eta$ are each searched over a predefined candidate set. For optimizing the three losses $\mathcal{L}_{user}$, $\mathcal{L}_{group}$, and $\mathcal{L}_{SSL}$, we draw 5 negative samples for each positive ground truth case to construct the corresponding training sets. The safety margins of those losses are set following common practices for hinge loss (Zhang et al., 2021b; Chen et al., 2018). The learning rate and batch size are set according to device capacity. To prevent overfitting, we apply dropout (Srivastava et al., 2014) on all deep layers of CubeRec during training.
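A schematic of the two-stage procedure in Algorithm 1, assuming a hypothetical `model` object that exposes the three losses described in Sections 3.1-3.4 (the method names, η value, and Adam settings are placeholders of ours):

```python
import torch

def train_cuberec(model, user_batches, group_batches, eta=0.5, epochs=50):
    opt = torch.optim.Adam(model.parameters())
    for _ in range(epochs):            # Stage 1: pretrain u, v on L_user
        for batch in user_batches:
            opt.zero_grad()
            model.user_loss(batch).backward()
            opt.step()
    for _ in range(epochs):            # Stage 2: fine-tune on L_group + eta * L_SSL
        for batch in group_batches:
            opt.zero_grad()
            (model.group_loss(batch) + eta * model.ssl_loss(batch)).backward()
            opt.step()
    return model
```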

4. Experiments

Meetup Yelp Gowalla Douban
#users 24,631 34,504 60,805 70,743
#groups 13,552 24,103 78,453 109,538
#items 19,031 22,611 8,984 60,028
#user-item interactions 126,813 482,273 1,625,817 3,422,266
#group-item interactions 19,031 26,883 208,336 164,153
average #items per user 5.15 13.98 26.74 48.38
average #items per group 1.40 1.12 2.66 1.50
average #groups per user 4.83 3.11 3.23 7.53
average group size 8.79 4.45 2.31 4.86
Table 1. Statistics of experimental datasets.
Method Yelp Douban Gowalla Meetup
Recall@10 Recall@20 NDCG@10 NDCG@20 Recall@10 Recall@20 NDCG@10 NDCG@20 Recall@10 Recall@20 NDCG@10 NDCG@20 Recall@10 Recall@20 NDCG@10 NDCG@20
NCF-AVG (He et al., 2017) 0.0042 0.0090 0.0018 0.0030 0.0023 0.0042 0.0010 0.0015 0.0146 0.0222 0.0088 0.0108 0.0087 0.0092 0.0074 0.0075
MF-AVG (Vinh Tran et al., 2019) 0.0161 0.0249 0.0075 0.0097 0.0032 0.0065 0.0014 0.0023 0.0127 0.0212 0.0061 0.0082 0.0128 0.0232 0.0063 0.0089
AGREE (Cao et al., 2018) 0.0168 0.0286 0.0073 0.0103 0.0031 0.0047 0.0013 0.0017 0.0125 0.0228 0.0063 0.0088 0.0081 0.0117 0.0033 0.0041
SIGR (Yin et al., 2019) 0.0194 0.0311 0.0082 0.0114 0.0035 0.0056 0.0016 0.0020 0.0143 0.0266 0.0070 0.0099 0.0090 0.0139 0.0038 0.0049
CAGR (Yin et al., 2020) 0.0215 0.0345 0.0106 0.0138 0.0026 0.0072 0.0010 0.0021 0.0169 0.0320 0.0090 0.0129 0.0028 0.0029 0.0012 0.0013
GroupIM (Sankar et al., 2020) 0.0424 0.0535 0.0207 0.0235 0.0212 0.0316 0.0098 0.0124 0.0281 0.0384 0.0113 0.0136 0.0407 0.0525 0.0237 0.0266
CubeRec-G 0.0455 0.0592 0.0241 0.0298 0.0243 0.0354 0.0114 0.0167 0.0361 0.0575 0.0171 0.0194 0.0437 0.0616 0.0261 0.0306
CubeRec-A 0.0417 0.0520 0.0206 0.0224 0.0211 0.0309 0.0098 0.0158 0.0343 0.0495 0.0135 0.0164 0.0418 0.0525 0.0202 0.0211
Table 2. Recommendation results. Numbers in bold face are the best results for corresponding metrics.

In this section, we conduct experiments to verify the effectiveness of CubeRec in group recommendation. Specifically, we aim to answer the following research questions (RQs):

  • Is CubeRec the new state-of-the-art?

  • Are the major components proposed in CubeRec effective?

  • How does CubeRec perform w.r.t. different group sizes?

  • What is the impact of CubeRec’s key hyperparameters?

4.1. Datasets

We adopt four real-world datasets collected from different event-based social networks, namely Meetup, Yelp, Gowalla, and Douban. Both Meetup and Douban contain group events held at different venues, respectively in New York and Beijing, where event venues are the items to be recommended. Meanwhile, Yelp and Gowalla are typical check-in datasets on different restaurants (i.e., items in our case) based in the US. Since Yelp and Gowalla do not originally contain group information, we follow the widely adopted procedure (Sankar et al., 2020; Yin et al., 2019) to construct group interactions by finding overlaps on both check-in times and social relations. That is, if a set of users who are connected on the social network visit the same venue at the same time, they are regarded as members of a group, and the corresponding activities are group activities (Yin et al., 2019). The use of explicit social connections and spatiotemporal tags ensures the quality of the discovered user groups in Yelp and Gowalla. We provide the key statistics of the four datasets in Table 1. Each dataset is split into training, validation, and test sets.

4.2. Baselines and Evaluation Protocols

We compare CubeRec (our implementation is released at https://github.com/jinglong0407/CubeRec.git) with the following state-of-the-art baselines:

  • MF-AVG (Vinh Tran et al., 2019): This baseline takes the average of all group-wise user representations learned via user-item matrix factorization (MF) (Koren et al., 2009) to compose group embeddings.

  • NCF-AVG (He et al., 2017): It also represents groups with average pooling on user embeddings, where user embeddings are learned via the neural collaborative filtering instead of MF.

  • AGREE (Cao et al., 2018): This approach utilizes attention networks, thus accounting for different importance of group members when learning group embeddings.

  • SIGR (Yin et al., 2019): It exploits and integrates users’ global and local social influence to improve the group recommendation.

  • CAGR (Yin et al., 2020): This method firstly learns centrality-aware user representations, and then learns a group recommender via two-stage optimization.

  • GroupIM (Sankar et al., 2020): This state-of-the-art group recommender adopts mutual information maximization between users and groups to overcome the sparse group-level interactions.

It is worth noting that in CubeRec, we implement both the geometric and attentive approaches described in Section 3.2 for composing group hypercubes, which are respectively marked as CubeRec-G and CubeRec-A. We fix the number of propagation layers $L$ for learning user and item embeddings with LightGCN, and implement the two MLPs for hypercube intersection with a 3-layer setting and ReLU activation. Based on the hyperparameter search described in Section 3.5, we fix $d$, $\lambda$, and $\eta$ across all datasets. The impact of these hyperparameters on the recommendation performance will be further discussed in Section 4.6.

We leverage two metrics, namely recall at rank $K$ (Recall@$K$) (Chen et al., 2019, 2021) and normalized discounted cumulative gain at rank $K$ (NDCG@$K$) (Chen et al., 2020b; Zhang et al., 2021c), both widely adopted in recommendation research. We adopt $K \in \{10, 20\}$, where all items unvisited by each group are taken as negative samples for evaluation. In short, Recall@$K$ measures the ratio of the ground truth items that are present in the top-$K$ list, and NDCG@$K$ evaluates whether the model can rank the ground truth items as highly as possible.
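For reference, the two metrics can be computed from a ranked item list as follows (a plain-Python sketch with binary relevance):

```python
import math

def recall_at_k(ranked, ground_truth, k=10):
    # Fraction of ground-truth items appearing in the top-k list
    return len(set(ranked[:k]) & set(ground_truth)) / len(ground_truth)

def ndcg_at_k(ranked, ground_truth, k=10):
    # DCG of the top-k list, normalized by the ideal DCG
    dcg = sum(1.0 / math.log2(pos + 2)
              for pos, item in enumerate(ranked[:k]) if item in ground_truth)
    idcg = sum(1.0 / math.log2(pos + 2)
               for pos in range(min(k, len(ground_truth))))
    return dcg / idcg

# Example: ground truth {7, 42} with item 42 ranked 3rd in the top-10 list
print(recall_at_k([5, 1, 42, 9, 3, 8, 2, 6, 4, 0], [7, 42]))  # 0.5
```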

Figure 2. Visualization of CubeRec-G (left) and CubeRec-A (right) for the same user group and recommended positive item on Meetup. Note that only the first two dimensions of $\mathbb{R}^{d}$ are used to facilitate visualization. The number of visualized negative items differs because the two sets of representations are learned in independent vector spaces, and embeddings out of range are clipped to ensure clarity.

4.3. Recommendation Effectiveness (RQ1)

Table 2 summarizes the performance comparison among all the group recommenders, where each method’s results are averaged over five runs on every dataset. With the recommendation results, we discuss our key findings below.

Evidently, CubeRec-G consistently outperforms all baselines by a significant margin. Compared with GroupIM, the best baseline, the average improvements on Recall@10 and NDCG@10 brought by CubeRec-G are 14.4% and 23.6%, respectively. Unlike the straightforward mean pooling of user embeddings in NCF-AVG and MF-AVG, the selective preference aggregation schemes in AGREE, SIGR, CAGR and GroupIM generally achieve better performance when learning group-level representations for recommendation. However, these baselines are still subject to the limited flexibility and resilience of the point embeddings used for representing groups. In contrast, our hypercube-based group representations bypass the inherent limitations of point embeddings, thus achieving the best recommendation effectiveness.

Meanwhile, although CubeRec-A is not the best-performing variant of CubeRec, it still yields highly competitive results that are on par with or stronger than the best baseline GroupIM. On one hand, its strong performance further demonstrates the benefit of replacing point embeddings with the more expressive hypercube representations. On the other hand, a possible reason that CubeRec-G is more advantageous than CubeRec-A is that the group hypercubes learned via geometric bounding have a higher tolerance of the multi-faceted user preferences, and are overall more comprehensive representations of group-level interests. To qualitatively compare CubeRec-G and CubeRec-A, in Figure 2 we visualize the learned hypercubes of a randomly picked group from Meetup. For clarity, we only use the first two dimensions of $\mathbb{R}^{d}$. The hypercubes and users in the two plots belong to the same group in the dataset, and the same positive item has been successfully recommended in the top-10 list in both evaluation cases. As the visualization suggests, the hypercube learned by CubeRec-G is a tighter and more inclusive representation of all its members' preferences. Also, CubeRec-G appears to be more capable of pulling the positive item into the group hypercube than CubeRec-A, leading to better recommendation performance.

Another observation is the relatively inferior performance of all methods on Douban compared with the other datasets. Despite the dense user-item interactions available for learning individual users' preferences, the recommendation performance is largely impaired by the sparse group-item interactions and the largest item set among the four datasets. It is worth mentioning that methods with self-supervised learning (i.e., GroupIM and CubeRec) are still able to leverage augmented supervision signals to learn quality group representations, providing the best recommendation results even for this challenging dataset.

Method Architecture Yelp Douban Gowalla Meetup
CubeRec-G Default 0.0455 0.0243 0.0361 0.0437
Remove SR 0.0440 0.0222 0.0353 0.0414
Point Distance 0.0204 0.0098 0.0156 0.0181
Remove SSL 0.0442 0.0232 0.0342 0.0407
CubeRec-A Default 0.0417 0.0211 0.0343 0.0418
Remove SR 0.0391 0.0209 0.0326 0.0371
Point Distance 0.0202 0.0099 0.0159 0.0163
Remove SSL 0.0382 0.0195 0.0342 0.0399
Table 3. Ablation test with different model architectures (Recall@10 is demonstrated). Numbers in bold face are the best results from each model, and severe performance drops compared with the best results are marked.

4.4. Ablation Study (RQ2)

To better understand the performance gain from the core components proposed in CubeRec, we perform ablation analysis on different degraded versions of CubeRec. Table 3 summarizes the recommendation outcomes in terms of Recall@10. In what follows, we introduce all variants and discuss the effectiveness of corresponding model components.

Figure 3. Recommendation performance w.r.t. different group sizes.
Figure 4. Recommendation performance w.r.t. different hyperparameters.

Removing Social Relations (Remove SR). In CubeRec, we infuse social relations when modelling user preferences via Eq.(3). To test the usefulness of social relations in CubeRec, we remove the social relation matrix $\mathbf{S}$ from the adjacency matrix $\mathbf{A}$ by replacing it with a matrix filled with 0s. The resulting variants experience noticeable performance drops for both CubeRec-G and CubeRec-A, especially on the Meetup dataset, which has the highest sparsity of user-item interactions. Hence, the inclusion of social relations is important for CubeRec to learn representative user embeddings, which are the building block for group hypercubes.

Using Only The Distance Between Item and Group Center Points (Point Distance). A crucial difference between CubeRec and point embedding-based group recommenders is its use of hypercubes. Essentially, a group hypercube specifies a subspace in $\mathbb{R}^{d}$, where the recommended item points either locate in this subspace or maintain short distances to it. As this is a core ranking mechanism for group-item pairs in CubeRec, we verify the efficacy of the point-to-hypercube distance metric by replacing Eq.(11) with the conventional point distance, i.e., the squared Euclidean distance between the group center and item points. As a result, there is a significant decrease in recommendation accuracy for CubeRec across all datasets. This reflects that our revamped distance metric is a highly distinctive measurement of the affinity between items and groups.

Removing Self-supervised Learning (Remove SSL). This variant disables the self-supervised loss when optimizing CubeRec, i.e., the model is optimized towards $\mathcal{L}_{group}$ alone instead of $\mathcal{L}_{group} + \eta \mathcal{L}_{SSL}$. As CubeRec can no longer leverage the common users shared between groups as a supervision signal to counteract the group-level data sparsity and regularize the learned group hypercubes, it suffers from inferior recommendation accuracy on all datasets. Specifically, significant performance drops (over 5%) are observed from CubeRec-G on Gowalla and Meetup, and from CubeRec-A on Yelp and Douban. As such, the novel self-supervised learning scheme showcases its strong contribution to the performance gain in CubeRec.

4.5. Effect of Different Group Sizes (RQ3)

In group recommendation, different group sizes bring varied impact on the recommendation accuracy, and it is crucial for a recommender to remain robust in different circumstances. In this study, we further evaluate the performance of CubeRec on various group sizes. Following (Sankar et al., 2020), we segment all groups into 5 categories, namely groups with 2-3, 4-5, 6-7, 8-9, and 10 or more users. We additionally choose GroupIM as the peer method for comparison because it is the most performant baseline.

We summarize the recommendation results in Figure 3, where Recall@10 is used for benchmarking. The first observation is that CubeRec-G outperforms CubeRec-A and GroupIM in almost all scenarios. This further supports the superiority of CubeRec-G, given its capability to deal with different group sizes. Secondly, CubeRec-G yields more performance gain over GroupIM when the group sizes are relatively small (i.e., between 2 and 5). This is especially important for group recommendation, as the group sizes follow a typical long-tail distribution in all our datasets, considering it is more natural and feasible for users to form smaller groups in real life. Thirdly, we find that, in general, group sizes are positively associated with the recommendation accuracy. Though smaller groups may have weaker discrepancies in user preferences during decision-making, the learning of their group-level representations is also restricted by the limited user-item interactions. When the group size grows, the composed group representations tend to encode richer predictive signals for making accurate recommendations. Meanwhile, in some cases like Yelp and Gowalla, when the group size exceeds 9, the recommendation performance starts decreasing. This is largely associated with the excessive noise introduced by the diverse group members.

4.6. Hyperparameter Sensitivity (RQ4)

We answer RQ4 by investigating the performance fluctuations of CubeRec with varying hyperparameters, in particular the latent dimension $d$, the trade-off $\lambda$ between inner and outer distances in Eq.(11), and the self-supervision strength $\eta$. Based on the standard hyperparameter setup in Section 4.2, we tune the value of one hyperparameter while keeping the others unchanged, and record the new results in Figure 4. Note that we only showcase the hyperparameter sensitivity for CubeRec-G since it is the better-performing of the two, and CubeRec-A exhibits a highly similar trend to CubeRec-G.

Impact of $d$. In general, CubeRec benefits from a relatively larger $d$ on all four datasets. But noticeably, the performance improvement stops once $d$ reaches a certain size (64 and 128 in our case) due to overfitting.

Impact of $\lambda$. We also study the impact of $\lambda$, which weighs the contribution of the inner distance in our revamped group-to-item distance metric. As $\lambda$ increases from 0.1 to 0.3, there is a generally upward trend in CubeRec's performance. However, when $\lambda$ exceeds 0.5, the performance starts to decrease. Hence, in the revamped distance metric, weighting the outer distance slightly more offers higher resolution when distinguishing a group's preferences towards different items.

Impact of $\eta$. The self-supervised learning loss is multiplied by $\eta$ to control its regularization effect. Across the tested range of $\eta$, the best performance of CubeRec is observed with a non-zero $\eta$ in most scenarios, suggesting the necessity of utilizing our proposed self-supervision to counteract the sparsity of group-item interactions. However, CubeRec achieves better performance with a smaller $\eta$ on Yelp, which is possibly attributable to its relatively small average group size, hindering the construction of self-supervision signals via relay users.

5. Related Work

In general, group recommendation methods are designed in different ways to suit two types of audiences (Yin et al., 2019; Sankar et al., 2020; Quintarelli et al., 2016): persistent groups and occasional groups. While persistent group recommendation assumes dense interaction records at the group level, our work addresses occasional group recommendation, which is more practical and widely studied as real user groups tend to be formed ad hoc with limited group-item interactions (Yin et al., 2019). Compared with persistent group recommendation, where traditional collaborative filtering can be directly applied on group-item interactions (Hu et al., 2014; Seko et al., 2011), a common practice in occasional group recommendation is to first learn individuals' preferences from user-level interactions, then perform preference aggregation to infer the overall interest of a group. Besides, a widely acknowledged challenge (Sankar et al., 2020; Yin et al., 2020) in occasional group recommendation is the highly sparse group-level interactions. Hence, we review the two lines of research that contribute to group recommendation, namely methods for preference aggregation and for counteracting data sparsity.

5.1. Preference Aggregation

In group recommendation, all group members’ preferences can be aggregated via either late aggregation or early aggregation. The aim of late aggregation (Amer-Yahia et al., 2009) is to generate item recommendations or predicted item-wise affinity scores for each group member, which are combined at the output stage via predefined strategies to produce recommendation results for the group. In this category, the most representative approaches are average satisfaction, least misery and maximum pleasure (Amer-Yahia et al., 2009; Baltrunas et al., 2010; Quintarelli et al., 2016). Unfortunately, these predefined aggregation strategies heavily rely on heuristics, thus lacking the desired expressiveness and flexibility for optimal performance (Guo et al., 2020; Jia et al., 2021).

As such, recent group recommenders mostly rely on early aggregation methods (Cao et al., 2018; Liu et al., 2012; Vinh Tran et al., 2019; Yin et al., 2020; Yuan et al., 2014) for learning group-level preference representations. In contrast to late aggregation, such methods first aggregate the preferences of all group members into a group-level representation, and then make group recommendations accordingly. Some early aggregation methods are built upon probabilistic models (Gorla et al., 2013; Liu et al., 2012; Yuan et al., 2014) that consider both a user's own preferences and her/his impact on the group, but the same type of probability distribution is assumed across users, which is infeasible in real-world cases. To address this problem, latent factor models (Sajjadi Ghaemmaghami and Salehi-Abari, 2021; Huang et al., 2020) are adopted to map groups to embeddings in the vector space. For example, attentive neural networks are proposed in (Cao et al., 2018; Guo et al., 2020) to selectively aggregate user representations within a group, and (Vinh Tran et al., 2019) further captures the fine-grained interactions between group members via a sub-attention network. More recently, there are also studies that incorporate additional information like social connections (Cao et al., 2021) and knowledge graphs (Deng et al., 2021) into the learning process of group representations. However, as discussed in Section 1, the point embeddings used in those methods sacrifice the diversity of personal preferences. Besides, compared with alternative solutions like capsule networks (Li et al., 2019; Cen et al., 2020) that rank items by mapping the learned multi-interest representations to a single embedding, CubeRec recommends items directly with group hypercubes, thus retaining loyalty to the decision-making process in real groups.

5.2. Alleviating Data Sparsity for Groups

When recommending items to occasional groups, the other crucial goal is to mitigate the scarce interactions between groups and items, which can heavily hinder the quality of learned group representations. Some studies account for the social relationships among users to provide hints on group-level preferences (Yin et al., 2019; Cao et al., 2021; Guo et al., 2020), such as (Yin et al., 2019), which considers the impact of users' social influence on a group's decision. Afterwards, (Guo et al., 2020) proposes an intra-group voting mechanism based on self-attention (Vaswani et al., 2017) to simultaneously model the social influence and the pairwise user interactions, and (He et al., 2020a) further learns multi-view embeddings from the interactions among groups, users and items with a heterogeneous graph. As self-supervised learning (SSL) has shown its effectiveness in general recommendation tasks (Wu et al., 2021; Zhou et al., 2021; Yu et al., 2022), attempts have also been made to design SSL objectives that counteract the data sparsity in group recommendation. For example, (Zhang et al., 2021a) leverages common users among groups and designs double-scale contrastive learning to enhance the learned user representations. Meanwhile, (Sankar et al., 2020) proposes a user-group mutual information maximization scheme to jointly learn informative user and group representations. Though SSL is proven to benefit group recommendation, existing methods are only compatible with point user/group embeddings in the vector space. In contrast, our design of SSL in CubeRec takes advantage of the geometric properties of hypercubes, allowing for more expressive group representations to be learned despite data sparsity.

6. Conclusion

To learn high-quality group representations that have the capacity to encode the multi-faceted user preferences and the flexibility to fully resemble a group’s decision-making process, we move beyond the traditional point embeddings, and propose to learn hypercube group representations in this paper. The proposed solution CubeRec takes advantage of its more expressive hypercubes in the vector space, and comes with a new distance metric and an intersection-based self-supervision paradigm to respectively facilitate group-item pairwise ranking and mitigation of data sparsity. Experimental results have rigorously verified the efficacy of CubeRec, shedding light on its applications in a wider range of recommendation tasks.

Acknowledgement

This work is supported by UQ New Staff Research Start-up Grant (NS-2103). This work is partly sponsored by the Australian Research Council under the streams of Future Fellowship (FT210100624), Discovery Project (DP190101985) and Discovery Early Career Researcher Award (DE200101465), and is also supported by the National Natural Science Foundation of China (U21A20470).

References

  • Amer-Yahia et al. (2009) Sihem Amer-Yahia, Senjuti Roy, Ashish Chawlat, Gautam Das, and Cong Yu. 2009. Group recommendation: Semantics and efficiency. PVLDB 2, 1 (2009), 754–765.
  • Baltrunas et al. (2010) Linas Baltrunas, Tadas Makcinskas, and Francesco Ricci. 2010. Group recommendations with rank aggregation and collaborative filtering. In RecSys. 119–126.
  • Cao et al. (2018) Da Cao, Xiangnan He, Lianhai Miao, Yahui An, Chao Yang, and Richang Hong. 2018. Attentive group recommendation. In SIGIR. 645–654.
  • Cao et al. (2021) Da Cao, Xiangnan He, Lianhai Miao, Guangyi Xiao, Hao Chen, and Jiao Xu. 2021. Social-Enhanced Attentive Group Recommendation. TKDE 33, 03 (2021), 1195–1209.
  • Cen et al. (2020) Yukuo Cen, Jianwei Zhang, Xu Zou, Chang Zhou, Hongxia Yang, and Jie Tang. 2020. Controllable multi-interest framework for recommendation. In SIGKDD. 2942–2951.
  • Chen et al. (2018) Hongxu Chen, Hongzhi Yin, Weiqing Wang, Hao Wang, Quoc Viet Hung Nguyen, and Xue Li. 2018. PME: projected metric embedding on heterogeneous networks for link prediction. In SIGKDD.
  • Chen et al. (2019) Tong Chen, Hongzhi Yin, Hongxu Chen, Rui Yan, Quoc Viet Hung Nguyen, and Xue Li. 2019. AIR: Attentional intention-aware recommender systems. In ICDE. 304–315.
  • Chen et al. (2020a) Tong Chen, Hongzhi Yin, Quoc Viet Hung Nguyen, Wen-Chih Peng, Xue Li, and Xiaofang Zhou. 2020a. Sequence-Aware Factorization Machines for Temporal Predictive Analytics. ICDE (2020).
  • Chen et al. (2020b) Tong Chen, Hongzhi Yin, Guanhua Ye, Zi Huang, Yang Wang, and Meng Wang. 2020b. Try this instead: Personalized and interpretable substitute recommendation. In SIGIR. 891–900.
  • Chen et al. (2021) Tong Chen, Hongzhi Yin, Yujia Zheng, Zi Huang, Yang Wang, and Meng Wang. 2021. Learning elastic embeddings for customizing on-device recommenders. In SIGKDD. 138–147.
  • Deng et al. (2021) Zhiyi Deng, Changyu Li, Shujin Liu, Waqar Ali, and Jie Shao. 2021. Knowledge-Aware Group Representation Learning for Group Recommendation. In ICDE. 1571–1582.
  • Gorla et al. (2013) Jagadeesh Gorla, Neal Lathia, Stephen Robertson, and Jun Wang. 2013. Probabilistic group recommendation via information matching. In WWW. 495–504.
  • Guo et al. (2021) Lei Guo, Hongzhi Yin, Tong Chen, Xiangliang Zhang, and Kai Zheng. 2021. Hierarchical Hyperedge Embedding-based Representation Learning for Group Recommendation. TOIS (2021).
  • Guo et al. (2020) Lei Guo, Hongzhi Yin, Qinyong Wang, Bin Cui, Zi Huang, and Lizhen Cui. 2020. Group recommendation with latent voting mechanism. In ICDE. 121–132.
  • He et al. (2020b) Xiangnan He, Kuan Deng, Xiang Wang, Yan Li, Yongdong Zhang, and Meng Wang. 2020b. LightGCN: Simplifying and Powering Graph Convolution Network for Recommendation. SIGIR (2020).
  • He et al. (2017) Xiangnan He, Lizi Liao, Hanwang Zhang, Liqiang Nie, Xia Hu, and Tat-Seng Chua. 2017. Neural collaborative filtering. In WWW. 173–182.
  • He et al. (2020a) Zhixiang He, Chi-Yin Chow, and Jia-Dong Zhang. 2020a. GAME: Learning Graphical and Attentive Multi-view Embeddings for Occasional Group Recommendation. In SIGIR. 649–658.
  • Hsieh et al. (2017) Cheng-Kang Hsieh, Longqi Yang, Yin Cui, Tsung-Yi Lin, Serge Belongie, and Deborah Estrin. 2017. Collaborative metric learning. In WWW. 193–201.
  • Hu et al. (2014) Liang Hu, Jian Cao, Guandong Xu, Longbing Cao, Zhiping Gu, and Wei Cao. 2014. Deep modeling of group preferences for group-based recommendation. In AAAI.
  • Huang et al. (2020) Zhenhua Huang, Xin Xu, Honghao Zhu, and MengChu Zhou. 2020. An efficient group recommendation model with multiattention-based neural networks. TNNLS 31, 11 (2020), 4461–4474.
  • Jia et al. (2021) Renqi Jia, Xiaofei Zhou, Linhua Dong, and Shirui Pan. 2021. Hypergraph Convolutional Network for Group Recommendation. In ICDM. 260–269.
  • Kingma and Ba (2015) Diederik P Kingma and Jimmy Ba. 2015. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2015).
  • Kipf and Welling (2017) Thomas N Kipf and Max Welling. 2017. Semi-supervised classification with graph convolutional networks. ICLR (2017).
  • Koren et al. (2009) Yehuda Koren, Robert Bell, and Chris Volinsky. 2009. Matrix factorization techniques for recommender systems. Computer (2009).
  • Li et al. (2019) Chao Li et al. 2019. Multi-interest network with dynamic routing for recommendation at Tmall. In CIKM. 2615–2623.
  • Li et al. (2021) Yunchuan Li, Yan Zhao, and Kai Zheng. 2021. Preference-aware Group Task Assignment in Spatial Crowdsourcing: A Mutual Information-based Approach. In ICDM. 350–359.
  • Liu et al. (2012) Xingjie Liu, Yuan Tian, Mao Ye, and Wang-Chien Lee. 2012. Exploring personal impact for group recommendation. In CIKM. 674–683.
  • Nguyen et al. (2017) Quoc Viet Hung Nguyen, Chi Thang Duong, Thanh Tam Nguyen, Matthias Weidlich, Karl Aberer, Hongzhi Yin, and Xiaofang Zhou. 2017. Argument discovery via crowdsourcing. VLDBJ 26, 4 (2017), 511–535.
  • Quintarelli et al. (2016) Elisa Quintarelli, Emanuele Rabosio, and Letizia Tanca. 2016. Recommending new items to ephemeral groups using contextual user influence. In RecSys. 285–292.
  • Ren et al. (2020) Hongyu Ren, Weihua Hu, and Jure Leskovec. 2020. Query2box: Reasoning over knowledge graphs in vector space using box embeddings. ICLR (2020).
  • Sajjadi Ghaemmaghami and Salehi-Abari (2021) Sarina Sajjadi Ghaemmaghami and Amirali Salehi-Abari. 2021. DeepGroup: Group Recommendation with Implicit Feedback. In CIKM. 3408–3412.
  • Sankar et al. (2020) Aravind Sankar, Yanhong Wu, Yuhang Wu, Wei Zhang, Hao Yang, and Hari Sundaram. 2020. Groupim: A mutual information maximization framework for neural group recommendation. In SIGIR. 1279–1288.
  • Seko et al. (2011) Shunichi Seko, Takashi Yagi, Manabu Motegi, and Shinyo Muto. 2011. Group recommendation using feature space representing behavioral tendency and power balance among members. In RecSys. 101–108.
  • Srivastava et al. (2014) Nitish Srivastava, Geoffrey Hinton, Alex Krizhevsky, Ilya Sutskever, and Ruslan Salakhutdinov. 2014. Dropout: a simple way to prevent neural networks from overfitting. JMLR (2014), 1929–1958.
  • Vaswani et al. (2017) Ashish Vaswani et al. 2017. Attention is all you need. In NIPS. 5998–6008.
  • Vinh Tran et al. (2019) Lucas Vinh Tran, Tuan-Anh Nguyen Pham, Yi Tay, Yiding Liu, Gao Cong, and Xiaoli Li. 2019. Interact and decide: Medley of sub-attention networks for effective group recommendation. In SIGIR. 255–264.
  • Wang et al. (2020) Wen Wang, Wei Zhang, Jun Rao, Zhijie Qiu, Bo Zhang, Leyu Lin, and Hongyuan Zha. 2020. Group-Aware Long-and Short-Term Graph Representation Learning for Sequential Group Recommendation. In SIGIR. 1449–1458.
  • Wang et al. (2019) Xiang Wang, Xiangnan He, Meng Wang, Fuli Feng, and Tat-Seng Chua. 2019. Neural graph collaborative filtering. In SIGIR. 165–174.
  • Wu et al. (2021) Jiancan Wu, Xiang Wang, Fuli Feng, Xiangnan He, Liang Chen, Jianxun Lian, and Xing Xie. 2021. Self-supervised graph learning for recommendation. In SIGIR. 726–735.
  • Yin and Cui (2016) Hongzhi Yin and Bin Cui. 2016. Spatio-temporal recommendation in social media. Springer.
  • Yin et al. (2019) Hongzhi Yin, Qinyong Wang, Kai Zheng, Zhixu Li, Jiali Yang, and Xiaofang Zhou. 2019. Social influence-based group representation learning for group recommendation. In ICDE. 566–577.
  • Yin et al. (2020) Hongzhi Yin, Qinyong Wang, Kai Zheng, Zhixu Li, and Xiaofang Zhou. 2020. Overcoming data sparsity in group recommendation. TKDE (2020).
  • Yu et al. (2022) Junliang Yu, Hongzhi Yin, Xin Xia, Tong Chen, Jundong Li, and Zi Huang. 2022. Self-Supervised Learning for Recommender Systems: A Survey. arXiv preprint arXiv:2203.15876 (2022).
  • Yuan et al. (2014) Quan Yuan, Gao Cong, and Chin-Yew Lin. 2014. COM: a generative model for group recommendation. In SIGKDD. 163–172.
  • Zhang et al. (2021a) Junwei Zhang, Min Gao, Junliang Yu, Lei Guo, Jundong Li, and Hongzhi Yin. 2021a. Double-Scale Self-Supervised Hypergraph Learning for Group Recommendation. In CIKM. 2557–2567.
  • Zhang et al. (2021b) Shuai Zhang, Huoyu Liu, Aston Zhang, Yue Hu, Ce Zhang, Yumeng Li, Tanchao Zhu, Shaojian He, and Wenwu Ou. 2021b. Learning User Representations with Hypercuboids for Recommender Systems. In WSDM. 716–724.
  • Zhang et al. (2021c) Shijie Zhang, Hongzhi Yin, Tong Chen, Zi Huang, Lizhen Cui, and Xiangliang Zhang. 2021c. Graph embedding for recommendation against attribute inference attacks. In The Web Conference. 3002–3014.
  • Zhou et al. (2021) Yujia Zhou, Zhicheng Dou, Yutao Zhu, and Ji-Rong Wen. 2021. PSSL: Self-supervised Learning for Personalized Search with Contrastive Sampling. In CIKM. 2749–2758.