Heterogeneous Global Graph Neural Networks for Personalized Session-based Recommendation

07/08/2021 ∙ by Yitong Pang, et al. ∙ JD.com, Inc. ∙ William & Mary

Predicting the next interaction of a short-term interaction session is a challenging task in session-based recommendation. Almost all existing works rely on item transition patterns and neglect the impact of user historical sessions while modeling user preference, which often leads to non-personalized recommendation. Additionally, existing personalized session-based recommenders capture user preference only based on the sessions of the current user, but ignore the useful item-transition patterns from other users' historical sessions. To address these issues, we propose a novel Heterogeneous Global Graph Neural Network (HG-GNN) to exploit the item transitions over all sessions in a subtle manner for better inferring user preference from the current and historical sessions. To effectively exploit the item transitions over all sessions from all users, we propose a novel heterogeneous global graph that contains item transitions of sessions, user-item interactions and global co-occurrence items. Moreover, to capture user preference comprehensively, we propose to learn two levels of user representations from the global graph via two graph-augmented preference encoders. Specifically, we design a novel heterogeneous graph neural network (HGNN) on the heterogeneous global graph to learn long-term user preference and item representations with rich semantics. Based on the HGNN, we propose the Current Preference Encoder and the Historical Preference Encoder to capture the different levels of user preference from the current and historical sessions, respectively. To achieve personalized recommendation, we integrate the representations of the user's current preference and historical interests to generate the final user preference representation. Extensive experimental results on three real-world datasets show that our model outperforms other state-of-the-art methods.


1. Introduction

Recommendation systems are widely used in online platforms as an effective tool for addressing information overload. Recently, in some real-world applications (e.g., streaming media), recommendation systems need to focus on the interactions within the active session. However, traditional recommendation methods (e.g., collaborative filtering (Sarwar et al., 2001)), which usually learn user preferences from long-term historical interactions, are typically not suitable for these scenarios. Therefore, session-based recommendation, which generates recommendations mainly using the interactions in the active session, has attracted great attention in the past few years.

The current mainstream methods for session-based recommendation model the sequential patterns of item transitions within the current session. Many deep learning-based models have been proposed for session-based recommendation, which utilize the item transition sequences to learn the representation of the target session (Hidasi et al., 2015; Li et al., 2017; Wu et al., 2019; Qiu et al., 2019; Chen and Wong, 2020; Wang et al., 2020b). For instance, several methods rely on the sequence modeling capability of recurrent neural networks (RNNs) to extract the sequential feature of the current session (Hidasi et al., 2015; Li et al., 2017).

Although these approaches have achieved encouraging results, they still suffer from several significant limitations. First, most methods fail to consider the user's historical sessions, as users are assumed to be anonymous, leading to non-personalized recommendations. These methods mainly rely on the item information of the current session to form the session representation, and then utilize the item correlation to make recommendations. However, they overlook the influence of user features and historical behaviors, resulting in non-personalized recommendations. For instance, suppose users A and B have similar session sequences, i.e., iPhone → Phone case → AirPods. As shown in Figure 1, previous methods usually generate the same candidate items for both users: Screen protector, AirPods Pro, etc. It is worth noting that in many cases the user may be logged in, or some form of user identifier (a cookie or other identifier) may exist. In these cases it is reasonable to assume that the user behaviors in historical sessions can be useful for providing personalized recommendations.

Figure 1. A toy example of session-based recommendation.

Second, the existing studies on personalized session-based recommendation model user preference only based on the sessions of the current user, while ignoring the useful item-transition patterns from other users' historical sessions. These approaches capture the impact of historical sessions on the current session of the user to obtain the personalized session representation via RNN-based or GNN-based models (Quadrana et al., 2017; Liang et al., 2019; Zhang et al., 2020). Conceptually, utilizing the item transitions of other sessions can capture more complex item correlations and might help model the user preference.

To address these aforementioned limitations and achieve personalized session-based recommendation, we propose a novel approach, named Heterogeneous Global Graph Neural Network (HG-GNN), to exploit the item transitions over all sessions in a subtle manner for better inferring the user preference from the current and historical sessions. To effectively exploit the item transitions over all sessions from users, we first propose a novel heterogeneous global graph consisting of user nodes and item nodes. In particular, we utilize user-item historical interactions to construct user-item edges in the graph in order to capture long-term user preference. Then we adopt pairwise item transitions in the session sequences to construct connections between items. To capture the potential correlations, we calculate similar item pairs based on the global co-occurrence information to construct item edges. Moreover, to capture user preference from sessions comprehensively, we propose to learn two levels of user representations from the heterogeneous global graph via two graph-augmented preference encoders. We propose a novel heterogeneous graph neural network (HGNN) on the heterogeneous global graph to learn the long-term user preference and item representations with rich semantics. Furthermore, the Current Preference Encoder is proposed to capture the general and dynamic interest within the user's current session. In the Historical Preference Encoder, we combine the long-term user preference learned from the HGNN layer with the current session items to learn the user's historical preference. Finally, we combine the representations of the user's current preference and historical interests to generate the final user embedding, which is used to generate a more accurate and personalized recommendation list.

Figure 2. The overview of HG-GNN. The heterogeneous global graph contains two types of meta-paths: user to item and item to item. The global graph is passed as input to the HGNN layer to learn the user and item embeddings with rich semantics. We adopt the User Preference Attention to learn the impact of long-term user preference on current session. Meanwhile, we employ the Dynamic Interest Learning and General Interest Learning modules to comprehensively capture the current session preference. Finally, we aggregate the historical and current preference representations to generate the final user embedding.

Our main contributions of this work are summarized below:

  • We propose a novel heterogeneous global graph to effectively exploit the item transitions over all sessions from users, which consists of item transitions, user-item interactions, and similar item pairs constructed based on global co-occurrence information.

  • We propose a graph augmented hybrid encoder which consists of a heterogeneous graph neural network and two different-level preference encoders to generate the user preference embedding for personalized session-based recommendation.

  • Extensive experiments on three datasets demonstrate that our model is superior compared with state-of-the-art models.

2. Related Works

In this section, we review the related work of session-based recommendations.

Traditional Methods. In early research, session-based recommender systems were mainly based on co-occurrence information, following collaborative filtering (Sarwar et al., 2001; Dias and Fonseca, 2013; Wang et al., 2016). To capture the sequential patterns in the session, Markov chain-based methods predicted the next action of users given the last action (Rendle et al., 2010; Shani et al., 2005). For instance, FPMC (Rendle et al., 2010) factorized personalized transition matrices via a matrix factorization-based first-order Markov chain. However, Markov chain-based methods usually only model first-order transitions and cannot capture more complex sequential patterns.
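To make the first-order limitation concrete, the following is a minimal sketch (plain Python, with illustrative function names) of a Markov-chain-style recommender that only counts adjacent item transitions:

```python
from collections import defaultdict

def build_transition_counts(sessions):
    """Count first-order item transitions over all training sessions."""
    counts = defaultdict(lambda: defaultdict(int))
    for s in sessions:
        for a, b in zip(s, s[1:]):
            counts[a][b] += 1
    return counts

def predict_next(counts, last_item, k=2):
    """Rank candidate items by transition frequency from the last click."""
    cands = counts.get(last_item, {})
    return [v for v, _ in sorted(cands.items(), key=lambda x: -x[1])[:k]]

sessions = [["a", "b", "c"], ["a", "b", "d"], ["b", "c", "a"]]
counts = build_transition_counts(sessions)
print(predict_next(counts, "b"))  # items that most often follow "b"
```

Because the prediction depends only on the last item, such a model cannot exploit longer context or user identity, which motivates the richer sequence and graph models discussed next.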

Deep Learning-based Methods. Recently, deep learning-based methods have been widely used for session-based recommendation, including RNN-based and CNN-based methods (Hidasi et al., 2015; Li et al., 2017; Liu et al., 2018; Ren et al., 2019; Song et al., 2019). GRU4Rec (Hidasi et al., 2015) was the first RNN-based method for session-based recommendation; it captured item interaction sequences with GRU layers. NARM (Li et al., 2017) employed an attention mechanism with an RNN to learn the more representative item-transition information, an approach which has proved effective for learning session representations. Caser (Tang and Wang, 2018) represented the session items with a latent matrix and adopted a CNN to learn general preferences and sequential patterns.

GNN-based Methods. Recent years have seen a surge of interest in Graph Neural Networks (Hamilton et al., 2017; Kipf and Welling, 2016; Xu et al., 2018; Chen et al., 2019, 2020), and as a result various GNN methods have been utilized for improving recommender systems. Several GNN-based models have been proposed to learn item representations for session-based recommendation (Wu et al., 2019; Xu et al., 2019; Qiu et al., 2019; Wang et al., 2020a; Chen and Wong, 2020; Wang et al., 2020b). For example, SR-GNN (Wu et al., 2019) applied the gated GNN (GGNN) (Li et al., 2015) to a directed session graph to learn item embeddings. Based on SR-GNN, GC-SAN (Xu et al., 2019) adopted a self-attention mechanism to capture global dependencies between different positions and integrated the GGNN to generate the session embeddings. (Wang et al., 2020a) proposed to construct a multi-relational item graph for session-based behavior prediction. Chen et al. (Chen and Wong, 2020) proposed a lossless encoding scheme that preserves the edge order for better modeling of the conversion of sessions to graphs, together with a shortcut graph attention layer that captures long-range dependencies between items in the session. GCE-GNN (Wang et al., 2020b) adopted GNNs to learn session-level and global-level item embeddings from the directed session graph and the undirected global item graph, and integrated the different-level item embeddings to generate the final session embedding. DHCN (Xia et al., 2020) transformed the session data into a hypergraph to model the high-order correlations among items, and employed graph convolution on both the global hypergraph and the line graph between different sessions to improve recommendation performance. However, these methods fail to consider the historical user sessions and mainly focus on the item sequence in the session, leading to non-personalized recommendations.

Personalized Session-based Recommendation. The existing research on personalized session-based recommendation is still at an early stage. Existing methods mainly focus on the use of the user's own historical sessions, which is also called session-aware recommendation (You et al., 2019; Quadrana et al., 2017; Liang et al., 2019; Zhang et al., 2020). H-RNN (Quadrana et al., 2017) is the most representative RNN-based approach for personalized session-based recommendation; it utilized a hierarchical RNN to capture users' short- and long-term preferences from the historical sessions of the current user. Recently, A-PGNN (Zhang et al., 2020) converted each user's behaviors into a graph and extracted personalized structural information via a GNN. Besides, A-PGNN utilizes the Transformer network to explicitly model the effect of historical sessions on the current session.

However, these methods only use the sessions of the current user and ignore the useful item-transition patterns from other users' historical sessions. Historical sessions of other users can provide more useful item transition patterns and might help model the current user's preference.

3. Preliminary

In the existing literature, session-based recommendation is an instance of next-item recommendation: it aims to predict the item that the user will click next, based on the historically interacted items of the active session. The current mainstream research mainly focuses on learning the patterns of item transitions in the current session and usually ignores the past sessions of the user. To achieve accurate and personalized recommendation, we consider the user information of the session in this paper. We formulate the problem researched in this paper as below.

Let $\mathcal{V}$ and $\mathcal{U}$ denote the universal sets of items and users, respectively. We denote the interaction records of each user $u \in \mathcal{U}$ as $S_u = \{S_u^1, S_u^2, \ldots, S_u^n\}$, which contain the historical sessions of user $u$ in chronological order, where $S_u^t$ is the $t$-th session sequence of $u$. The session $S_u^t = \{v_1^t, v_2^t, \ldots, v_m^t\}$ is a sequence of items in chronological order, where $v_i^t \in \mathcal{V}$ represents the item which user $u$ interacted with at time step $i$, and $m$ denotes the session length.

Given the historical sessions $\{S_u^1, \ldots, S_u^{t-1}\}$ and the current session $S_u^t$ of user $u$, the objective of personalized session-based recommendation is to predict the next item $v_{m+1}^t$ that the user $u$ is most likely to click. Specifically, the recommendation model is trained to generate a probability score for each candidate in the item set $\mathcal{V}$, i.e. $\hat{\mathbf{y}} = \{\hat{y}_1, \hat{y}_2, \ldots, \hat{y}_{|\mathcal{V}|}\}$, where $\hat{y}_i$ denotes the prediction score of item $v_i$.

4. Methodology

4.1. Overview

In this section, we detail the design of our model, the Heterogeneous Global Graph Neural Network for personalized session-based recommendation (HG-GNN). As shown in Figure 2, to effectively exploit the item transitions over all sessions from users, we propose a novel heterogeneous global graph to organize historical sessions and extract the global information. We utilize the user-item historical interactions to construct user-item edges in the graph in order to capture user preference. To exploit the potential correlations, we calculate similar item pairs based on the global co-occurrence information to construct item edges. The pairwise item transitions in the session sequences are also used to construct connections.

Furthermore, we propose to learn two levels of user representations from the heterogeneous global graph via two graph augmented preference encoders. Specifically, we propose a novel heterogeneous graph neural network (HGNN) on the heterogeneous global graph to learn the long-term user preference and item representations with rich semantics.

Next, we employ the Current Preference Encoder, which consists of the dynamic and general interest learning modules, to comprehensively capture the current preference from the user's current session. Meanwhile, in the Historical Preference Encoder, we utilize user preference attention to learn the impact of long-term user preference on the current session. To make personalized session-based recommendations, we combine the representations of the user's current preference and historical interests to generate the final user embedding. We introduce these modules in detail in the following subsections.

4.2. Heterogeneous Global Graph Construction

In this subsection, we mainly describe how to construct a heterogeneous global graph with two types of meta-paths: item-to-user and item-to-item. In this paper, we transform all training session sequences into a directed heterogeneous global graph $\mathcal{G} = (\mathcal{V}', \mathcal{E})$, where the node set $\mathcal{V}' = \mathcal{U} \cup \mathcal{V}$ consists of user nodes $\mathcal{U}$ and item nodes $\mathcal{V}$. The edges in $\mathcal{E}$ represent the different relations of the meta-paths, and each edge is denoted as a triplet of head node, relation, and tail node.

4.2.1. Item-to-Item.

The item transition information is the basis for session-based recommendation. The transition relationship between items includes both the adjacent interaction behaviors within a session and the frequent co-occurrence behaviors across sessions; the two kinds of behaviors are complementary to each other. Different from existing methods (Chen and Wong, 2020; Wu et al., 2019; Wang et al., 2020b) that only use the adjacent interaction relationship in the session to build the session graph or the sequence model, we adopt both types of information to construct the global graph.

Similar to (Wu et al., 2019; Chen and Wong, 2020; Wang et al., 2020b), we define two edges $(v_i, r_{in}, v_j)$ and $(v_j, r_{out}, v_i)$ for the transition from $v_i$ to $v_j$ in a session. Meanwhile, for each item node $v$, we generate weights for its adjacent edges to distinguish the importance of $v$'s neighbors as follows: for each edge $(v_i, r, v_j)$ where $r \in \{r_{in}, r_{out}\}$, we utilize its frequency over all the historical sessions as the edge weight. To ensure the relevance of items, we only sample the top-$K$ edges with the highest weights for each item on $\mathcal{G}$.

Additionally, we adopt the co-occurrence information to construct edges between items. Frequent co-occurrence of two items in different sessions indicates a strong item correlation. Specifically, for item $v_i$, we calculate its co-occurring items based on all historical sessions, and we select the top-$k$ items that co-occur with $v_i$ most frequently. Thus we define the edge $(v_j, r_{sim}, v_i)$ for the two co-occurring items $v_i$ and $v_j$. The co-occurrence frequency between items $v_i$ and $v_j$ is calculated as follows:

$\mathrm{Sim}(v_i, v_j) = \dfrac{|S(v_i) \cap S(v_j)|}{\sqrt{|S(v_i)|\,|S(v_j)|}}$   (1)

where $S(v_i)$ represents the set of sessions in which $v_i$ occurs. It should be noted that there may be two edges $(v_i, r_{in/out}, v_j)$ and $(v_j, r_{sim}, v_i)$ between two items at the same time; in this case, we only keep the former edge. To avoid introducing too much noise, for each item node $v_i$, we use the number of adjacent interaction items of $v_i$ to cut off the top-$k$. Hence, only the remaining quota of edges is built as $(v_j, r_{sim}, v_i)$ for item node $v_i$:

(2)
(3)
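The co-occurrence edge selection described above can be sketched as follows. This is a plain-Python illustration with hypothetical function names; a normalized co-occurrence count is used as the similarity, which is one plausible realization of the frequency in Eq. (1):

```python
import math
from collections import defaultdict

def cooccurrence_edges(sessions, k=1):
    """For each item, pick the top-k most similar co-occurring items
    as candidate r_sim edges (similarity form is an assumption)."""
    item_sessions = defaultdict(set)  # S(v): ids of sessions containing v
    for sid, s in enumerate(sessions):
        for v in s:
            item_sessions[v].add(sid)
    edges = {}
    for vi in item_sessions:
        scores = []
        for vj in item_sessions:
            if vi == vj:
                continue
            inter = len(item_sessions[vi] & item_sessions[vj])
            if inter:
                sim = inter / math.sqrt(len(item_sessions[vi]) * len(item_sessions[vj]))
                scores.append((sim, vj))
        edges[vi] = [vj for _, vj in sorted(scores, reverse=True)[:k]]
    return edges

sessions = [["a", "b", "c"], ["a", "b"], ["c", "d"]]
print(cooccurrence_edges(sessions, k=1)["a"])  # the item most similar to "a"
```

The quota cut-off against the number of adjacent interaction neighbors (Eqs. 2–3) would then further prune this candidate list per item.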

4.2.2. Item-to-User.

The item-to-user meta-path directly represents the interaction behavior between a user and an item, revealing the implicit long-term preferences of the user. We convert each user-item interaction into two types of directed edges in the graph $\mathcal{G}$: $(u, r_{interact}, v)$ and $(v, r_{interacted\_by}, u)$, which denote that user $u$ has interacted with item $v$.

In summary, we construct a novel heterogeneous global graph with two types of nodes to effectively organize the session data. This global graph consists of the basic pairwise item transitions in sessions, the user-item historical interactions and the global co-occurrence information. We can unify the learning of user and item representations via the global graph, and capture item correlations and the long-term user preference. It is worth noting that other extra attributes of users or items can easily be integrated to construct extra edges or enhance node representations.
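A minimal sketch of collecting the typed edge lists of such a global graph from raw sessions (edge-type names like `r_in` and `interacts` are illustrative, not the paper's notation; co-occurrence edges would be added separately):

```python
from collections import defaultdict

def build_hetero_edges(sessions_by_user):
    """Collect typed edge sets: item transitions (r_in / r_out) and
    user-item interactions in both directions."""
    edges = defaultdict(set)
    for user, sessions in sessions_by_user.items():
        for s in sessions:
            for a, b in zip(s, s[1:]):
                edges["r_in"].add((a, b))
                edges["r_out"].add((b, a))
            for v in s:
                edges["interacts"].add((user, v))
                edges["interacted_by"].add((v, user))
    return edges

e = build_hetero_edges({"u1": [["a", "b"]]})
print(sorted(e["r_in"]))  # transition edges observed in u1's session
```

In a DGL-based implementation these per-type edge lists would typically be handed to a heterograph constructor, with one relation per edge type.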

4.3. Heterogeneous Global Graph Neural Network

We propose a heterogeneous graph neural network (HGNN) on the directed heterogeneous global graph to encode the user and item representations, inspired by (Hamilton et al., 2017; Schlichtkrull et al., 2018). The item IDs and user IDs are embedded into a $d$-dimensional space and used as the initial node features in our model, $h_v^{(0)} \in \mathbb{R}^d$ and $h_u^{(0)} \in \mathbb{R}^d$. Let $h_u^{(l)}$ and $h_v^{(l)}$ denote the refined embeddings of user $u$ and item $v$ after $l$ layers of propagation.

In the GNN layer, the node representations are updated by aggregating features of neighbors and passing messages along edges. Such a process can be described as:

$h_v^{(l+1)} = \mathrm{AGG}\big(h_v^{(l)}, \{h_{v'}^{(l)} : v' \in \mathcal{N}(v)\}\big)$   (4)

where AGG is the aggregation function.

According to the definition of the directed heterogeneous graph in the previous subsection, for an item node there exist three types of edges connecting the item neighbors, i.e. $r_{in}$, $r_{out}$ and $r_{sim}$, and one type of edge connecting the user neighbors: $r_{interact}$. For each specific edge type $r$, we accumulate messages over all neighbors $\mathcal{N}_r(v)$. The aggregation process can be denoted as follows:

$m_{v,r}^{(l)} = \dfrac{1}{|\mathcal{N}_r(v)|} \sum_{v' \in \mathcal{N}_r(v)} h_{v'}^{(l)}$   (5)
$h_{v,r}^{(l+1)} = \sigma\big(W_r^{(l)} m_{v,r}^{(l)} + b_r^{(l)}\big)$   (6)

where $\mathcal{N}_r(v)$ denotes the neighbors of $v$ with the edge type $r$, and $W_r^{(l)}$ and $b_r^{(l)}$ are edge-type specific parameters. $\sigma$ denotes the activation function, and we choose the ReLU function for activation. $h_{v'}^{(l)}$ is the representation of neighbor node $v'$ in the $l$-th layer, i.e. $h_u^{(l)}$ or $h_v^{(l)}$.

For item $v$, we accumulate the different messages propagated by the different types of edges and update the item node representation, as shown in the following formula:

$h_v^{(l+1)} = \mathrm{ACC}\big(h_{v, r_{in}}^{(l+1)}, h_{v, r_{out}}^{(l+1)}, h_{v, r_{sim}}^{(l+1)}\big)$   (7)

where $\mathrm{ACC}$ denotes an accumulation operation, such as $\mathrm{mean}$ or $\mathrm{sum}$, and $h_{v,r}^{(l+1)}$ is short for the message aggregated along edge type $r$. In practice, we adopt the $\mathrm{mean}$ operation to take the average of all messages.

The aggregation operation for a user node is similar to that for the item node above. The updated user node representation is as follows:

$h_u^{(l+1)} = \sigma\Big(W_{r_{interact}}^{(l)} \dfrac{1}{|\mathcal{N}(u)|} \sum_{v \in \mathcal{N}(u)} h_v^{(l)} + b_{r_{interact}}^{(l)}\Big)$   (8)

where $\mathcal{N}(u)$ is short for $\mathcal{N}_{r_{interact}}(u)$, the set of items user $u$ has interacted with.

After $L$ layers of HGNN, we combine the embeddings obtained at each layer to form the final global-level representations of a user and an item:

$h_u = \sum_{l=0}^{L} \alpha_l \, h_u^{(l)}, \quad h_v = \sum_{l=0}^{L} \alpha_l \, h_v^{(l)}$   (9)

where $\alpha_l$ indicates the importance of the $l$-th layer output. In the experiments, we empirically set $\alpha_l$ uniformly to $1/(L+1)$, following (He et al., 2020). Through the HGNN layer, we can learn the long-term user preference and the global-level refined item embeddings which contain rich semantics.
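The structure of one HGNN propagation step for an item node can be sketched in plain Python. This toy version keeps only the mean-message and mean-accumulation structure; the learned edge-type-specific transformations and the activation are omitted:

```python
def mean_vec(vecs, dim):
    """Element-wise average of a list of vectors (empty -> zero vector)."""
    if not vecs:
        return [0.0] * dim
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]

def hgnn_item_update(h, neighbors_by_type, dim=2):
    """One propagation step for an item node: average neighbor messages
    per edge type, then average across edge types (the ACC step)."""
    per_type = [mean_vec([h[n] for n in nbrs], dim)
                for nbrs in neighbors_by_type.values() if nbrs]
    return mean_vec(per_type, dim)

h = {"u1": [1.0, 0.0], "v1": [0.0, 1.0], "v2": [2.0, 0.0]}
nbrs = {"r_in": ["v1"], "r_out": ["v2"], "r_sim": []}
print(hgnn_item_update(h, nbrs))  # averages the r_in and r_out messages
```

In practice each edge type would carry its own weight matrix and bias, and the layer outputs would be combined with the uniform layer weights before use.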

4.4. Current Preference Encoder

Modeling user preferences based on the current session requires consideration of both the drifting preference and the general preference of the user. Conceptually, user preferences are dynamic and change over time; this change is largely driven by the sequential interactions in the session. Thus we propose a Dynamic Interest Learning (DIL) module to capture the item transition pattern and model the sequential feature with an RNN-based method.

Meanwhile, the user's next interaction is also flexible and not necessarily governed by the sequence dependencies of the session; we call this the general preference. Therefore, we propose a General Interest Learning (GIL) module to capture flexible dependencies based on the attention mechanism.

4.4.1. Dynamic Interest Learning

Previous works (Wu et al., 2019; Li et al., 2017; Chen and Wong, 2020; Wang et al., 2020b) show that the item sequence of the session is essential to session-based recommendation. In this module, we mainly adopt RNN-based methods to capture the dynamic interest from the item sequence. Given the current session, we select its last $m$ items as the input of the DIL module. Firstly, we employ a GRU to capture the sequential feature of the current session as follows:

$\hat{h}_i = \mathrm{GRU}\big(h_{v_i}, \hat{h}_{i-1}\big)$   (10)

where $\hat{h}_i$ denotes the $i$-th item sequential vector of the session.

To capture the user's main purpose in the current session and represent the current session as an embedding vector, we apply an item-level attention mechanism which dynamically selects and linearly combines the different sequential information, following (Li et al., 2017; Wu et al., 2019; Chen and Wong, 2020):

$\alpha_i = \mathbf{q}^\top \sigma\big(W_1 \hat{h}_m + W_2 \hat{h}_i + c\big)$   (11)
$s' = \sum_{i=1}^{m} \alpha_i \, \hat{h}_i$   (12)

where $\sigma$ is an activation function, and $\mathbf{q}$, $W_1$, $W_2$ and $c$ are trainable parameters. Finally, we combine the different levels of the sequential feature as the dynamic interest representation of user $u$:

$s_d = W_3 \big[s' \,\|\, \hat{h}_m\big]$   (13)
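The attention pooling over hidden states can be sketched as follows; this simplified version scores each state by a dot product against the last hidden state, standing in for the learned scoring function with trainable projections:

```python
import math

def attention_pool(states, query):
    """Softmax-weighted combination of hidden states, scored against a
    query vector (here: the last hidden state)."""
    scores = [sum(a * b for a, b in zip(s, query)) for s in states]
    m = max(scores)                                # stabilize the softmax
    weights = [math.exp(s - m) for s in scores]
    z = sum(weights)
    weights = [w / z for w in weights]
    dim = len(query)
    return [sum(w * s[i] for w, s in zip(weights, states)) for i in range(dim)]

states = [[1.0, 0.0], [0.0, 1.0]]
print(attention_pool(states, states[-1]))  # leans toward the last state
```

States that align with the query receive larger weights, which is how the module emphasizes items related to the user's main purpose.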

4.4.2. General Interest Learning

In this part, we employ a position-aware attention mechanism to capture the general user interest of the current session. We employ the attention mechanism to learn the importance of items in current session, and perform feature aggregation to obtain the general representation of the current session. In essence, GIL can capture the basic item correlations and general feature of the user interactions in the current session. We also consider the influence of the item position in the modeling.

The contribution of each item to the session representation is often influenced by the item position information (i.e. the chronological order in the session sequence) (Wang et al., 2020b). Therefore, given the last $m$ items of the current session, we concatenate the item representation learned from the HGNN layer with a reversed position embedding as follows:

$x_i = h_{v_i} \,\|\, p_{m-i+1}$   (14)

where the position embeddings $p_1, \ldots, p_m$ are trainable parameters.

We use an attention layer based on the general information of the current session and the position-aware item representations to capture the general correlation feature of the items:

$\bar{x} = \dfrac{1}{m} \sum_{i=1}^{m} x_i$   (15)
$\beta_i = \mathbf{q}_2^\top \sigma\big(W_4 x_i + W_5 \bar{x} + b\big)$   (16)
$s_g = \sum_{i=1}^{m} \beta_i \, h_{v_i}$   (17)

where $\mathbf{q}_2$, $W_4$, $W_5$ and $b$ are trainable parameters.

Through the DIL and GIL modules, we obtain the user's different preferences in the current session. We argue that the two types of preference representations might contribute differently when building an integrated representation. Therefore, we design the following gating mechanism to form the current session preference representation:

$g = \sigma\big(W_g [s_d \,\|\, s_g] + b_g\big)$   (18)
$s_c = g \odot s_d + (1 - g) \odot s_g$   (19)
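The gating idea can be sketched as follows. In this toy version the gate is a scalar sigmoid of a fixed logit rather than a learned projection of the two preference vectors:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gate_combine(a, b, logit=0.0):
    """Gated fusion of two preference vectors: g * a + (1 - g) * b.
    The fixed scalar logit stands in for a learned projection."""
    g = sigmoid(logit)
    return [g * x + (1 - g) * y for x, y in zip(a, b)]

print(gate_combine([1.0, 0.0], [0.0, 1.0]))  # logit 0 -> equal blend
```

A learned gate lets the model decide, per user and per session, how much the dynamic versus the general interest should drive the final representation.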

4.5. Historical Preference Encoder

Modeling only the current session cannot achieve personalized recommendation, because it does not model the user's historical interactions. Based on the heterogeneous global graph proposed in this work, we can make full use of the user's historical features and capture the long-term preferences of users through the user node embedding, which we obtain from the HGNN layer. This is the basis for making personalized session-based recommendations.

Based on the user representation learned from the HGNN layer, we consider the impact of the user's historical interactions on the current session. Specifically, given the last $m$ items of the current session, we propose a user preference attention module to calculate the impact of historical interests on the current session:

$\gamma_i = \mathrm{softmax}_i\big(h_u^\top W_a h_{v_i}\big)$   (20)
$s_p = \sum_{i=1}^{m} \gamma_i \, h_{v_i}$   (21)

where $W_a$ is a trainable parameter matrix.

Finally, we use a gating mechanism to combine the long-term user preference and the historical fusion representation, to obtain the historical preference representation:

$g' = \sigma\big(W_h [h_u \,\|\, s_p] + b_h\big)$   (22)
$s_h = g' \odot h_u + (1 - g') \odot s_p$   (23)

4.6. Prediction and Training

To achieve personalized recommendation and improve the model performance, we combine the current session and the historical preference representation to generate the final user representation:

$h_f = W_o \big[s_c \,\|\, s_h\big]$   (24)

Based on the user preference representation and the initial embeddings of candidate items, we can compute the recommendation probability of candidate items in the current session:

$\hat{y}_i = \mathrm{softmax}_i\big(h_f^\top e_i\big)$   (25)

where $\hat{y}_i$ denotes the probability that the user will click on item $v_i$ in the current session.

The objective function can be formulated as a cross entropy loss as follows:

$\mathcal{L} = -\sum_{i=1}^{|\mathcal{V}|} y_i \log(\hat{y}_i) + (1 - y_i)\log(1 - \hat{y}_i)$   (26)

where $\mathbf{y}$ is a one-hot vector of the ground truth.
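The scoring-and-loss step can be sketched as follows. Inner-product scoring of the user representation against the candidate item embeddings, followed by a softmax and cross-entropy on the target item, is an assumption consistent with the description above:

```python
import math

def softmax(logits):
    m = max(logits)                       # numerical stability
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]

def score_and_loss(user_vec, item_vecs, target):
    """Score every candidate by inner product, normalize with softmax,
    and return the cross-entropy loss for the target item."""
    logits = [sum(a * b for a, b in zip(user_vec, v)) for v in item_vecs]
    probs = softmax(logits)
    return probs, -math.log(probs[target])

probs, loss = score_and_loss([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], target=0)
print(probs)  # the aligned item gets the higher probability
```

Training then minimizes this loss over all training examples with the Adam optimizer described in Section 5.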

5. Experiments

In this section, we conduct experiments on session-based recommendation to evaluate the performance of our method compared with other state-of-the-art models (our code and data will be released for research purposes). Our purpose is to answer the following research questions:

  • RQ1: How does our model perform compared with state-of-the-art session-based recommender methods?

  • RQ2: How do different settings of the HG-GNN modules and the heterogeneous graph construction influence the performance?

  • RQ3: How do the hyper-parameters affect the effectiveness of our model?

  • RQ4: Can our model make a personalized session-based recommendation?

Statistic Last.fm Xing Reddit
No. of users 992 11,479 18,271
No. of items 38,615 59,121 27,452
No. of sessions 385,135 91,683 1,135,488
Avg. of session length 8.16 5.78 3.02
Session per user 373.19 7.99 62.15
No. of train sessions 292,703 69,135 901,161
No. of test sessions 92,432 22,548 234,327
Table 1. Statistics of datasets used in experiments.

5.1. Experimental Setup

5.1.1. Dataset.

We conduct extensive experiments on three real-world datasets: Last.fm, Xing and Reddit, which are widely used in session-based recommendation research (Chen and Wong, 2020; Wang et al., 2020b; Guo et al., 2019; Ren et al., 2019; Quadrana et al., 2017; Zhang et al., 2020). All three datasets contain the basic user information needed to support our work on personalized session-based recommendation.

  • Last.fm (http://ocelma.net/MusicRecommendationDataset/lastfm-1K.html) contains the complete listening histories of nearly 1,000 users collected from Last.fm. In this work, we focus on music artist recommendation. We keep the top 40,000 most popular artists and group interaction records within 8 hours from the same user as a session, following (Chen and Wong, 2020; Guo et al., 2019).

  • Xing (http://2016.recsyschallenge.com/) collects job postings from a social network platform and contains interactions on job postings for 770,000 users. We split each user's records into sessions manually, using the same approach as (Quadrana et al., 2017).

  • Reddit (https://www.kaggle.com/colemaclean/subreddit-interactions) records user interactions from Reddit. It contains tuples of user name, the subreddit where the user commented on a thread, and a timestamp for the interaction. We partitioned the interaction data into sessions using a 60-minute time threshold, following (Ludewig et al., 2019).

To filter out poorly informative sessions, we removed sessions having fewer than 3 interactions and kept users having 5 sessions or more, so as to have sufficient historical sessions, following (Quadrana et al., 2017; Zhang et al., 2020). For each user, we take the last 20% of sessions as the test set, while the remaining sessions form the training set. Additionally, we filtered out from the test set the interactions whose items do not appear in the training set.

The statistics of the preprocessed datasets are summarized in Table 1. Referring to (Quadrana et al., 2017; Zhang et al., 2020), we segment the sessions of each user into historical sequences and item labels. For example, for the session data of user $u$, the historical sessions, current session and target label are set as $\{S_u^1, \ldots, S_u^{t-1}\}$, $\{v_1^t, \ldots, v_{i-1}^t\}$ and $v_i^t$, respectively. The target label is the next interacted item within the current session.
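The segmentation scheme can be sketched as follows (plain Python; whether every prefix of each session is used as a separate training example is an assumption about the exact protocol):

```python
def make_examples(user_sessions):
    """Segment one user's chronologically ordered sessions into
    (historical sessions, current-session prefix, next-item label) triples."""
    examples = []
    for t, session in enumerate(user_sessions):
        history = user_sessions[:t]          # all earlier sessions
        for i in range(1, len(session)):
            examples.append((history, session[:i], session[i]))
    return examples

ex = make_examples([["a", "b"], ["c", "d", "e"]])
print(ex[0])  # first example has no history yet
```

Each triple corresponds to one prediction task: given the history and the current prefix, predict the label item.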

5.1.2. Baseline Models.

To evaluate the performance of our method on session-based recommendation, we compare it with several representative competitors, including the state-of-the-art GNN-based models and several personalized methods.

  • Item-KNN (Sarwar et al., 2001) is a conventional item-to-item model which recommends items similar to the items in the session.

  • GRU4Rec (Hidasi et al., 2015) (https://github.com/hidasib/GRU4Rec) employs the GRU to capture the representation of the item sequence through a session-parallel mini-batch training process.

  • NARM (Li et al., 2017) (https://github.com/lijingsdu/sessionRec_NARM) is also an RNN-based model, which incorporates an attention mechanism into the RNN to generate the session embedding.

  • SR-GNN (Wu et al., 2019) (https://github.com/CRIPAC-DIG/SR-GNN) converts session sequences into directed unweighted graphs and utilizes a GGNN layer (Li et al., 2015) to learn the patterns of item transitions.

  • LESSR (Chen and Wong, 2020) (https://github.com/twchen/lessr) adds shortcut connections between items in the session and considers the sequence information in graph convolution by using a GRU.

  • GCE-GNN (Wang et al., 2020b) (https://github.com/CCIIPLab/GCE-GNN) aggregates the global context and the item sequence in the current session to generate the session embedding through different-level graph neural networks.

  • H-RNN (Quadrana et al., 2017) (https://github.com/mquad/hgru4rec) is an RNN-based personalized method which utilizes hierarchical RNNs, consisting of a session-level and a user-level RNN, to model cross-session user interests.

  • A-PGNN (Zhang et al., 2020) (https://github.com/CRIPAC-DIG/A-PGNN) converts all sessions of each user into a graph and employs the GGNN model to learn the item transitions. Besides, as a personalized recommender, A-PGNN utilizes the attention mechanism to explicitly model the effect of the user's historical interests on the current session.

5.1.3. Evaluation Metrics.

To evaluate the recommendation performance, we employ two widely used metrics following (Wu et al., 2019; Wang et al., 2020b): Hit Ratio (HR@K) and Mean Reciprocal Rank (MRR@K), where K ∈ {5, 10}. The average results over all test users are reported.
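The two metrics can be computed from per-user ranked recommendation lists; `hr_mrr_at_k` below is an illustrative helper, not the paper's evaluation code:

```python
def hr_mrr_at_k(ranked_lists, targets, k):
    """ranked_lists: per-user recommended item ids, best first.
    targets: the ground-truth next item per user.
    Returns (HR@k, MRR@k) averaged over all users."""
    hits, rr = 0.0, 0.0
    for ranked, target in zip(ranked_lists, targets):
        topk = ranked[:k]
        if target in topk:
            hits += 1.0                        # counted as a hit
            rr += 1.0 / (topk.index(target) + 1)  # reciprocal rank of the target
    n = len(targets)
    return hits / n, rr / n
```

HR@K counts whether the target appears in the top-K list, while MRR@K additionally rewards ranking it higher.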

5.1.4. Implementation Details.

We implement the proposed model with PyTorch and DGL. We employ the Adam optimizer with an initial learning rate of 0.001, which decays by 0.1 after every 3 epochs, following (Wu et al., 2019; Wang et al., 2020b). The mini-batch size is set to 512 for all models. We employ grid search to find the optimal hyper-parameters, taking 10% of the training data as the validation set. The embedding size is set to 128. For the global graph construction, we treat the sampling size and the number of similar items as hyper-parameters, and we adopt a one-layer HGNN. Furthermore, we adopt an early stopping strategy, i.e., we stop training if HR@10 does not increase for 5 successive epochs.
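The learning-rate schedule and early-stopping rule above can be sketched as follows (illustrative helpers under the stated settings, not the authors' training loop):

```python
def lr_at_epoch(epoch, base_lr=1e-3, decay=0.1, step=3):
    # Initial LR 0.001, decayed by a factor of 0.1 after every 3 epochs.
    return base_lr * decay ** (epoch // step)

def train_with_early_stopping(epoch_hr10, patience=5):
    """epoch_hr10: iterable of validation HR@10 values, one per epoch.
    Stops once HR@10 has not increased for `patience` successive epochs.
    Returns (best HR@10 seen, number of epochs actually run)."""
    best, since_best, run = float("-inf"), 0, 0
    for hr in epoch_hr10:
        run += 1
        if hr > best:
            best, since_best = hr, 0
        else:
            since_best += 1
            if since_best >= patience:
                break
    return best, run
```

With PyTorch, the same schedule would typically be expressed via `torch.optim.lr_scheduler.StepLR(optimizer, step_size=3, gamma=0.1)`.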

5.2. Model Comparison (RQ1)

Models Last.fm Xing Reddit
HR@5 HR@10 MRR@5 MRR@10 HR@5 HR@10 MRR@5 MRR@10 HR@5 HR@10 MRR@5 MRR@10
ItemKNN 6.73 10.90 4.02 4.81 8.79 11.85 5.01 5.42 21.71 30.32 11.74 12.88
GRU4Rec 8.47 12.86 4.71 5.29 10.35 13.15 5.94 6.36 33.72 41.73 24.36 25.42
NARM 10.29 15.03 6.09 6.71 13.51 17.31 8.87 9.37 33.25 40.52 24.56 25.52
SR-GNN 11.89 16.90 7.23 7.85 13.38 16.71 8.95 9.39 34.96 42.38 25.90 26.88
LESSR 12.96* 17.88 8.24* 8.82* 14.84 16.77 11.98* 12.13* 36.03 43.27 26.45 27.41
GCE-GNN 12.83 18.28* 7.60 8.32 16.98* 20.86* 11.14 11.65 36.30 45.16 26.65 27.70
H-RNN 10.92 15.83 6.71 7.39 10.72 14.36 7.22 7.74 44.76 53.44 32.13 33.29
A-PGNN 12.10 17.13 7.37 8.01 14.23 17.01 10.26 10.58 49.10* 58.23* 33.54* 34.62*
HG-GNN 13.15 19.45 7.45 8.28 17.39 20.76 12.66 13.11 51.16 60.62 35.76 37.04
Improv. 1.46% 6.40% - - 2.41% - 5.68% 8.08% 4.20% 4.10% 6.62% 7.00%
Table 2. Experimental results (%) of different models in terms of HR@{5, 10} and MRR@{5, 10} on three datasets. The * marks the best result among the baseline methods. The last row reports the relative improvement of HG-GNN over the best baseline.
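The improvement row can be reproduced directly from the table values; `rel_improvement` below is an illustrative helper, not code from the paper:

```python
def rel_improvement(ours, best_baseline):
    # Percentage improvement over the best baseline, as in the last row of Table 2.
    return (ours - best_baseline) / best_baseline * 100.0

# Example with the Reddit column: HG-GNN 51.16 vs. best baseline (A-PGNN) 49.10 on HR@5.
print(round(rel_improvement(51.16, 49.10), 2))  # ≈ 4.20
```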

To demonstrate the overall performance of the proposed model, we compared it with the state-of-the-art recommendation methods. We can obtain the following important observations from the comparison results shown in Table 2.

First, HG-GNN consistently outperforms the other state-of-the-art models on most metrics across all datasets, demonstrating the superiority of our model. Among the comparison methods, the conventional Item-KNN is not competitive, and all deep-learning-based methods outperform it. This indicates that deep learning can capture the complex patterns of item transitions in a session and is effective for session-based recommendation.

Besides, we compare our model with the advanced GNN-based methods, which generally outperform the RNN-based methods. SR-GNN converts the session sequence into a directed graph and employs a GNN to encode the session graph. As a variant of SR-GNN, LESSR also achieves promising results. This demonstrates that GNN models have a stronger capability of modeling session sequences than RNN models in session-based recommendation tasks. Furthermore, GCE-GNN not only utilizes the session graph for the item sequence, but also constructs an undirected global graph from all interaction sequences. GCE-GNN outperforms SR-GNN on all datasets, showing the importance of global context information for session-based recommendation.

Compared with the state-of-the-art personalized session-based approaches, i.e., H-RNN and A-PGNN, our approach achieves a significant performance improvement on all metrics consistently. Specifically, HG-GNN outperforms A-PGNN by 4.20% in terms of HR@5 and 6.62% in terms of MRR@5 on Reddit. We attribute the success of A-PGNN to its ability to explicitly model the effect of the user’s historical interests on the current session. However, these methods focus only on the cross-session information of the current user while ignoring the global context information of other users’ historical sessions. In contrast, HG-GNN overcomes this deficiency with the heterogeneous global session graph, which effectively organizes historical sessions; through the HGNN layer we learn user and item embeddings with rich semantics. For the current session, we capture both the dynamic interest and the general interest to model the current preference comprehensively. The heterogeneous global graph design and the effective preference modeling together contribute to the remarkable performance.

Method Last.fm Xing Reddit
A-PGNN 1,795.53 188.18 1,995.20
LESSR 612.80 76.68 790.83
GCE-GNN 821.23 156.35 875.37
HG-GNN 458.56 61.23 576.13
Table 3. Runtime (seconds) of each training epoch.

We also record the runtime of several GNN-based methods and the proposed HG-GNN. We implement all models with the same 128-dimensional embedding vectors and the same batch size, and test them on the same GPU server (Tesla V100 DGXS 32GB) with sufficient resources. We record only the training time of each epoch, excluding data loading and testing. The average runtime over 10 epochs on the three datasets is reported in Table 3. It illustrates that HG-GNN is more efficient than A-PGNN. Generally, the main computation of HG-GNN consists of the aggregation operation of the HGNN and short-sequence modeling. A-PGNN applies several attention models to historical sessions and learns correlations over long item sequences, which incurs a relatively large time cost; besides, the GGNN used in A-PGNN is computationally expensive. Although HG-GNN also uses an attention mechanism, it does not perform long-sequence modeling. Meanwhile, the HGNN structure is similar to that of GraphSAGE (Hamilton et al., 2017), which is simpler and faster. These results imply that HG-GNN is more suitable for practical applications, where computational efficiency is crucial.

5.3. Ablation and Effectiveness Analyses (RQ2)

In this subsection, we conduct ablation studies on the proposed model to investigate the effectiveness of its key designs.

5.3.1. Impact of Different Modules.

In this part, we compare our method with different variants to verify the effectiveness of the critical components of HG-GNN. Specifically, we remove critical modules of HG-GNN and observe the changes in model performance. The results of the ablation studies are shown in Table 4. From the “w/o HGNN module” results, we observe that the HGNN module is pivotal for model performance. For historical preference modeling, removing the user embedding has a greater impact on the results than removing the user preference attention (UPA) module, which demonstrates that long-term user preference remains valuable for session-based recommendation. From the above comparison and analysis, we conclude that the designs of the main components of HG-GNN are effective.

Model setting Last.fm Xing
HR@5 HR@10 MRR@5 MRR@10 HR@5 HR@10 MRR@5 MRR@10
w/o HGNN module 11.98 (-8.90%) 17.42 (-10.44%) 6.81 (-8.59%) 7.53 (-9.06%) 16.54 (-4.89%) 20.14 (-2.99%) 11.76 (-7.11%) 12.24 (-6.64%)
w/o DIL module 13.09 (-0.46%) 19.39 (-0.31%) 7.35 (-1.34%) 8.18 (-1.21%) 17.25 (-0.81%) 20.30 (-2.22%) 12.23 (-3.40%) 12.79 (-2.44%)
w/o GIL module 12.83 (-2.43%) 19.16 (-1.49%) 7.07 (-5.10%) 7.91 (-4.47%) 13.16 (-24.32%) 16.53 (-20.38%) 8.70 (-31.28%) 9.15 (-30.21%)
w/o user embedding 12.48 (-5.10%) 18.06 (-7.15%) 7.12 (-4.43%) 7.86 (-5.07%) 16.72 (-3.85%) 20.24 (-2.50%) 12.00 (-5.21%) 12.23 (-6.71%)
w/o UPA module 12.96 (-1.44%) 19.22 (-1.18%) 7.13 (-4.30%) 8.04 (-2.90%) 17.24 (-0.86%) 20.38 (-1.83%) 12.21 (-3.55%) 12.74 (-2.82%)
HG-GNN 13.15 19.45 7.45 8.28 17.39 20.76 12.66 13.11
Table 4. Impact of removing different modules of HG-GNN (relative change in parentheses).

5.3.2. Impact of different Global Graph Design.

We next conduct experiments to evaluate the effectiveness of the proposed heterogeneous global graph. Specifically, we remove different types of edges and nodes in the graph to observe their impact on the model. The experimental results are shown in Table 5.

We can see that deleting nodes or edges causes a performance loss. Removing the user nodes leads to the most significant performance drop, demonstrating the importance of user information and user-item interactions for recommendation. It is worth noting that our method without user nodes is still competitive compared with the other methods in Table 2. It can also be observed that, without the similar edges constructed from co-occurrence, the performance declines to a certain extent. This indicates that constructing extra edges from the global co-occurrence information between items is effective and useful for session-based recommendation. Additionally, removing the in or out edges brings more performance loss than removing the similar edges. This demonstrates that the adjacent-interaction relationship and the global co-occurrence relationship complement each other, and that the adjacent-interaction relationship is more important for the session-based recommendation task.

Graph Setting HR@5 HR@10 MRR@5 MRR@10
w/o user nodes 17.03 19.82 12.05 12.86
w/o in edges 17.14 20.02 12.12 12.93
w/o out edges 17.11 20.00 12.02 12.87
w/o similar edges 17.24 20.09 12.32 13.04
HG-GNN 17.39 20.76 12.66 13.11
Table 5. The performance comparison w.r.t. different graph design on Xing.
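The in and out edges examined above come from adjacent interactions within sessions. A minimal sketch of extracting weighted transition edges from all sessions (an illustrative stand-in that assumes edge weights are transition counts, which may differ from the paper's exact construction):

```python
from collections import Counter

def transition_edges(sessions):
    """Count adjacent item transitions over all sessions. The forward
    direction yields the out-edges; reversing it yields the in-edges."""
    out_w = Counter()
    for s in sessions:
        for a, b in zip(s, s[1:]):
            if a != b:                 # skip self-transitions
                out_w[(a, b)] += 1
    in_w = Counter({(b, a): w for (a, b), w in out_w.items()})
    return out_w, in_w
```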

5.4. Hyper-parameters Study (RQ3)

In this subsection, we perform experiments to explore how hyper-parameters such as the sampling size and the maximum session length influence model performance.

5.4.1. Effect of maximum session length.

Figure 3. Performance comparison w.r.t. the maximum session length.

In this part, we discuss how the performance changes with the maximum session length, i.e., the number of most recent items of the current session that the model can directly utilize when modeling the current session preference. Figure 3 shows the evaluation results under different maximum session lengths. HR@10 reaches its highest score when the maximum length is 8 for Xing and 10 for Last.fm. In general, longer sessions do not lead to better performance: simply increasing the amount of current-session information does not necessarily improve the model.

5.4.2. Effect of the sampling size.

Figure 4. Performance comparison w.r.t. sampling size in the global graph construction.

In the process of building the global graph, for the sake of computational efficiency and information validity, we sample a fixed number of edges for each item node according to the weights of the adjacent in and out edges. The sampling size is therefore a pivotal hyper-parameter for our model. The experimental results are shown in Figure 4. As can be seen, the model performance reaches its highest value when the sampling size is 8 on both the Last.fm and Xing datasets. The metric changes in the figure show that a sampling size that is too large or too small degrades the model's effectiveness: the most suitable sampling size strikes a balance between effective and irrelevant information, and thus achieves the best performance.
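The sampling step can be sketched as follows. For determinism, this illustrative helper keeps the highest-weight neighbors rather than sampling stochastically; that choice is an assumption of the sketch, not necessarily the paper's exact procedure:

```python
def sample_neighbors(weighted_edges, node, s):
    """Keep at most `s` neighbors of `node`, preferring the adjacent
    edges with the largest weights.
    weighted_edges: dict mapping (src, dst) -> weight."""
    nbrs = [(dst, w) for (src, dst), w in weighted_edges.items() if src == node]
    nbrs.sort(key=lambda x: -x[1])     # heaviest edges first
    return [dst for dst, _ in nbrs[:s]]
```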

5.4.3. Effect of the number of similar items.

Figure 5. Performance comparison w.r.t. the number of top similar items in the global graph construction.

In addition to the sampling size discussed above, another key hyper-parameter for the global graph is the number of similar items. For each item, we select the most similar items based on the global co-occurrence information to construct the similar edges. In practice, the number of such edges for each item node is capped based on the number of adjacent interaction items. The experimental results are shown in Figure 5. The optimal value clearly differs across datasets. Since the similar-item relationship depends on global interactions, the distribution of similar relationships varies greatly between datasets; this hyper-parameter is therefore relatively sensitive to the data.
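A minimal sketch of selecting similar items from session-level co-occurrence counts (an illustrative stand-in for the paper's construction; here ties are broken by item id, an assumption of the sketch):

```python
from collections import Counter
from itertools import combinations

def topk_similar(sessions, k):
    """For each item, return the k items it co-occurs with most often
    across sessions (session-level co-occurrence)."""
    co = Counter()
    for s in sessions:
        for a, b in combinations(set(s), 2):   # each unordered pair once per session
            co[(a, b)] += 1
            co[(b, a)] += 1
    by_item = {}
    for (a, b), w in co.items():
        by_item.setdefault(a, []).append((w, b))
    return {a: [b for w, b in sorted(lst, key=lambda x: (-x[0], x[1]))[:k]]
            for a, lst in by_item.items()}
```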

Figure 6. A case study of two users from Last.fm data for music artist recommendation. Two users have different music preferences but similar session sequences, which are the basis to generate the recommendation. This figure presents the difference between the recommendation results (top 5 artists) generated by our model and the base model.

5.5. Case Study (RQ4)

To intuitively illustrate the personalized recommendation results of our model, we present a music artist recommendation case in Figure 6. We select two users from Last.fm who have different listening histories. Given their similar current session sequences, i.e., both contain the same artists, our model generates different recommendation lists from those of the non-personalized base model (SR-GNN).

The two users have different music preferences but partially overlapping listening histories. The former prefers classic and rock music, e.g., Golden Earring, while the latter is younger and loves pop and rock music, such as Joss Stone. Both current sessions contain Led Zeppelin (a famous rock band). For such similar sessions, the artists recommended by the base model overlap heavily between the two users (here we take only the top 5 recommendation results). Our model takes user preferences into account, so it can recommend musicians who are more relevant to the current session, such as Jimi Hendrix, as well as artists that match a user's other preferences, such as Sublime (a ska punk band) for the latter user. In summary, our model achieves more accurate and personalized recommendations.

6. Conclusion

In this paper, we proposed a heterogeneous global graph neural network for personalized session-based recommendation. In contrast to previous methods, we considered the impact of users' historical interactions and built a heterogeneous global graph that consists of historical user-item interactions, item transitions and global co-occurrence information. Furthermore, we proposed a graph-augmented hybrid encoder, which consists of a heterogeneous graph neural network and two different-level preference encoders, to capture the user preference representation comprehensively. In the experiments, our model outperformed other state-of-the-art session-based models, showing its effectiveness.

References

  • T. Chen and R. C. Wong (2020) Handling information loss of graph neural networks for session-based recommendation. In KDD, pp. 1172–1180. Cited by: §1, §2, §4.2.1, §4.2.1, §4.4.1, §4.4.1, 1st item, 5th item, §5.1.1.
  • Y. Chen, L. Wu, and M. J. Zaki (2019) Reinforcement learning based graph-to-sequence model for natural question generation. In The Eighth International Conference on Learning Representations (ICLR 2020), Cited by: §2.
  • Y. Chen, L. Wu, and M. J. Zaki (2020) Iterative deep graph learning for graph neural networks: better and robust node embeddings. In Thirty-Fourth annual conference on Neural Information Processing Systems (NeurIPS 2020), Cited by: §2.
  • R. Dias and M. J. Fonseca (2013) Improving music recommendation in session-based collaborative filtering by using temporal context. In 2013 IEEE 25th International Conference on Tools with Artificial Intelligence, pp. 783–788. Cited by: §2.
  • L. Guo, H. Yin, Q. Wang, T. Chen, A. Zhou, and N. Quoc Viet Hung (2019) Streaming session-based recommendation. In KDD, pp. 1569–1577. Cited by: 1st item, §5.1.1.
  • W. L. Hamilton, R. Ying, and J. Leskovec (2017) Inductive representation learning on large graphs. arXiv preprint arXiv:1706.02216. Cited by: §2, §4.3, §5.2.
  • X. He, K. Deng, X. Wang, Y. Li, Y. Zhang, and M. Wang (2020) Lightgcn: simplifying and powering graph convolution network for recommendation. In SIGIR, pp. 639–648. Cited by: §4.3.
  • B. Hidasi, A. Karatzoglou, L. Baltrunas, and D. Tikk (2015) Session-based recommendations with recurrent neural networks. arXiv preprint arXiv:1511.06939. Cited by: §1, §2, 2nd item.
  • T. N. Kipf and M. Welling (2016) Semi-supervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907. Cited by: §2.
  • J. Li, P. Ren, Z. Chen, Z. Ren, T. Lian, and J. Ma (2017) Neural attentive session-based recommendation. In CIKM, pp. 1419–1428. Cited by: §1, §2, §4.4.1, §4.4.1, 3rd item.
  • Y. Li, D. Tarlow, M. Brockschmidt, and R. Zemel (2015) Gated graph sequence neural networks. arXiv preprint arXiv:1511.05493. Cited by: §2, 4th item.
  • T. Liang, Y. Li, R. Li, X. Gu, O. Habimana, and Y. Hu (2019) Personalizing session-based recommendation with dual attentive neural network. In IJCNN, pp. 1–8. Cited by: §1, §2.
  • Q. Liu, Y. Zeng, R. Mokhosi, and H. Zhang (2018) STAMP: short-term attention/memory priority model for session-based recommendation. In KDD, pp. 1831–1839. Cited by: §2.
  • M. Ludewig, N. Mauro, S. Latifi, and D. Jannach (2019) Performance comparison of neural and non-neural approaches to session-based recommendation. In Proceedings of the 13th ACM conference on recommender systems, pp. 462–466. Cited by: 3rd item.
  • R. Qiu, J. Li, Z. Huang, and H. Yin (2019) Rethinking the item order in session-based recommendation with graph neural networks. In CIKM, pp. 579–588. Cited by: §1, §2.
  • M. Quadrana, A. Karatzoglou, B. Hidasi, and P. Cremonesi (2017) Personalizing session-based recommendations with hierarchical recurrent neural networks. In proceedings of the Eleventh ACM Conference on Recommender Systems, pp. 130–137. Cited by: §1, §2, 2nd item, 7th item, §5.1.1, §5.1.1, §5.1.1.
  • P. Ren, Z. Chen, J. Li, Z. Ren, J. Ma, and M. De Rijke (2019) Repeatnet: a repeat aware neural recommendation machine for session-based recommendation. In AAAI, Vol. 33, pp. 4806–4813. Cited by: §2, §5.1.1.
  • S. Rendle, C. Freudenthaler, and L. Schmidt-Thieme (2010) Factorizing personalized markov chains for next-basket recommendation. In WWW, pp. 811–820. Cited by: §2.
  • B. Sarwar, G. Karypis, J. Konstan, and J. Riedl (2001) Item-based collaborative filtering recommendation algorithms. In WWW, pp. 285–295. Cited by: §1, §2, 1st item.
  • M. Schlichtkrull, T. N. Kipf, P. Bloem, R. Van Den Berg, I. Titov, and M. Welling (2018) Modeling relational data with graph convolutional networks. In European semantic web conference, pp. 593–607. Cited by: §4.3.
  • G. Shani, D. Heckerman, R. I. Brafman, and C. Boutilier (2005) An MDP-based recommender system. Journal of Machine Learning Research 6 (9). Cited by: §2.
  • J. Song, H. Shen, Z. Ou, J. Zhang, T. Xiao, and S. Liang (2019) ISLF: interest shift and latent factors combination model for session-based recommendation.. In IJCAI, pp. 5765–5771. Cited by: §2.
  • J. Tang and K. Wang (2018) Personalized top-n sequential recommendation via convolutional sequence embedding. In the Eleventh ACM International Conference, Cited by: §2.
  • W. Wang, H. Yin, S. Sadiq, L. Chen, M. Xie, and X. Zhou (2016) SPORE: a sequential personalized spatial item recommender system. In ICDE, pp. 954–965. Cited by: §2.
  • W. Wang, W. Zhang, S. Liu, Q. Liu, B. Zhang, L. Lin, and H. Zha (2020a) Beyond clicks: modeling multi-relational item graph for session-based target behavior prediction. In Proceedings of The Web Conference 2020, pp. 3056–3062. Cited by: §2.
  • Z. Wang, W. Wei, G. Cong, X. Li, X. Mao, and M. Qiu (2020b) Global context enhanced graph neural networks for session-based recommendation. In SIGIR, pp. 169–178. Cited by: §1, §2, §4.2.1, §4.2.1, §4.4.1, §4.4.2, 6th item, §5.1.1, §5.1.3, §5.1.4.
  • S. Wu, Y. Tang, Y. Zhu, L. Wang, X. Xie, and T. Tan (2019) Session-based recommendation with graph neural networks. In AAAI, Vol. 33, pp. 346–353. Cited by: §1, §2, §4.2.1, §4.2.1, §4.4.1, §4.4.1, 4th item, §5.1.3, §5.1.4.
  • X. Xia, H. Yin, J. Yu, Q. Wang, L. Cui, and X. Zhang (2020) Self-supervised hypergraph convolutional networks for session-based recommendation. arXiv preprint arXiv:2012.06852. Cited by: §2.
  • C. Xu, P. Zhao, Y. Liu, V. S. Sheng, J. Xu, F. Zhuang, J. Fang, and X. Zhou (2019) Graph contextualized self-attention network for session-based recommendation.. In IJCAI, Vol. 19, pp. 3940–3946. Cited by: §2.
  • K. Xu, L. Wu, Z. Wang, Y. Feng, M. Witbrock, and V. Sheinin (2018) Graph2seq: graph to sequence learning with attention-based neural networks. arXiv preprint arXiv:1804.00823. Cited by: §2.
  • J. You, Y. Wang, A. Pal, P. Eksombatchai, C. Rosenburg, and J. Leskovec (2019) Hierarchical temporal convolutional networks for dynamic recommender systems. In The world wide web conference, pp. 2236–2246. Cited by: §2.
  • M. Zhang, S. Wu, M. Gao, X. Jiang, K. Xu, and L. Wang (2020) Personalized graph neural networks with attention mechanism for session-aware recommendation. IEEE Transactions on Knowledge and Data Engineering. Cited by: §1, §2, 8th item, §5.1.1, §5.1.1, §5.1.1.