1. Introduction
With the sheer volume of online information, much attention has been given to data-driven recommender systems. These systems automatically guide users to discover products or services matching their personal interests from a large pool of possible options. Numerous recommendation techniques have been developed, falling into three main categories: collaborative filtering methods, content-based methods and hybrid methods (Bobadilla et al., 2013; Lu et al., 2015). In this paper, we aim to develop a method that produces a ranked list of movies for a user at a given moment (top-N movie recommendation) by exploiting both historical user-movie interactions and the content information of movies.

Matrix factorization (MF) (Koren et al., 2009) is one of the most successful techniques in the practice of recommendation due to its simplicity, attractive accuracy and scalability. It has been used in a broad range of applications such as recommending movies, books, web pages, relevant research and services. The matrix factorization technique is usually effective because it discovers the latent features underpinning the multiplicative interactions between users and movies. Specifically, it models the user preference matrix approximately as a product of two lower-rank latent feature matrices representing user profiles and movie profiles respectively.
Despite the appeal of matrix factorization, this technique does not explicitly consider the temporal variability of data (Wu et al., 2017). Firstly, the popularity of a movie may change over time. For example, movie popularity booms or fades, which can be triggered by external events such as the appearance of an actor in a new movie. Secondly, users may change their interests and baseline ratings over time. For instance, a user who tended to rate an average movie as “4 stars” may now rate such a movie as “3 stars”. Recently, recurrent neural networks (RNNs) (Hochreiter and Schmidhuber, 1997) have gained significant attention by considering such temporal dynamics for both users and movies, and have achieved high recommendation quality (Wu et al., 2016, 2017). The basic idea of these RNN-based methods is to formulate recommendation as a sequence prediction problem: they take the latest observations as input, update their internal states, and make predictions based on the newly updated states. As shown in (Devooght and Bersini, 2016), such prediction based on short-term dependencies is likely to improve recommendation diversity.

More recent work (Wu et al., 2017) reveals that matrix factorization based and RNN-based recommendation approaches perform well for reasons that are complementary to each other. Specifically, matrix factorization approaches make movie predictions based on users' long-term interests, which change very slowly over time. In contrast, RNN approaches predict which movie the user will consume next, respecting the dynamics of users' behaviors and movies' attributes in the short term. This motivates us to devise a joint approach that takes advantage of both matrix factorization and RNNs, exploiting both long-term and short-term associations among users and movies.
Furthermore, most existing recommender systems take into account only the users' past behaviors when making recommendations. Compared with the tens of thousands of movies in the corpus, the historical rating set is too sparse to learn a well-performing model. It is therefore desirable to exploit the content information of movies for recommendation. For example, movie posters reveal a great amount of information for understanding movies and users, as demonstrated in (Zhao et al., 2016). Such a poster is usually the first contact that a user has with a movie, and plays an essential role in the user's decision to watch it or not. When a user is watching a movie presented in cold, blue and mysterious visual effects, he/she may be interested in receiving recommendations for movies with similar styles, rather than others with the same actors or subject (Zhao et al., 2016). These visual features of movies are usually captured by the corresponding posters.
In this paper, we propose a novel LSIC model, which leverages Long- and Short-term Information in Content-aware movie recommendation using adversarial training. The LSIC model employs an adversarial framework to combine the MF- and RNN-based models for top-N movie recommendation, taking the best of each to improve the final recommendation performance. In the adversarial process, we simultaneously train two models: a generative model $G$ and a discriminative model $D$. In particular, the generator takes a user and a time as input, and predicts the recommendation list for that user at that time based on the historical user-movie interactions. We implement the discriminator via a siamese network that incorporates long-term and session-based ranking models in a pairwise scenario. The two pointwise networks of the siamese network share the same set of parameters. The generator and the discriminator are optimized with a minimax two-player game. The discriminator tries to distinguish the real high-rated movies in the training data from the recommendation lists generated by the generator $G$, while the training procedure of the generator is to maximize the probability of $D$ making a mistake. Thus, this adversarial process can eventually adjust $G$ to generate plausible and high-quality recommendation lists. In addition, we integrate poster information of movies to further improve the performance of movie recommendation, which is especially essential when few ratings are available.

We summarize our main contributions as follows:

• To the best of our knowledge, we are the first to use a GAN framework to combine the MF and RNN approaches for top-N recommendation. This joint model adaptively adjusts how the contributions of the long-term and short-term information of users and movies are mixed together.

• We propose hard and soft mixture mechanisms to integrate MF and RNN. We use the hard mechanism to calculate the mixing score straightforwardly, and explore several soft mechanisms to learn the temporal dynamics with the help of the long-term profiles.

• Our model uses reinforcement learning to optimize the generator to produce highly rewarded recommendation lists. It thus effectively bypasses the non-differentiable task metric issue by directly performing policy gradient updates.

• We automatically crawl the posters of the given movies, and explore the potential of integrating poster information to improve the accuracy of movie recommendation. The release of the collected posters should push forward research on integrating content information in movie recommender systems.

• To verify the effectiveness of our model, we conduct extensive experiments on two widely used real-life datasets: the Netflix Prize Contest data and the Movielens data. The experimental results demonstrate that our model consistently outperforms the state-of-the-art methods.
The rest of the paper is organized as follows. In Section 2, we review related work on recommender systems. Section 3 presents the proposed adversarial learning framework for movie recommendation in detail. In Section 4, we describe the experimental data, implementation details, evaluation metrics and baseline methods. The experimental results and analysis are provided in Section 5. Section 6 concludes the paper.
2. Related Work
Recommender systems are an active research field. The authors of (Bobadilla et al., 2013; Lu et al., 2015) describe most of the existing techniques for recommender systems. In this section, we briefly review the major approaches most related to our work.
Matrix factorization for recommendation
Modeling the long-term interests of users, the matrix factorization method and its variants have grown to become dominant in the literature (Rennie and Srebro, 2005; Koren, 2008; Koren et al., 2009; Hernando et al., 2016; He et al., 2016). In standard matrix factorization, the recommendation task can be formulated as inferring missing values of a partially observed user-item matrix (Koren et al., 2009). Matrix factorization techniques are effective because they are designed to discover the latent features underlying the interactions between users and items. Srebro et al. (2005) suggested Maximum Margin Matrix Factorization (MMMF), which used low-norm instead of low-rank factorizations. Mnih and Salakhutdinov (2008) presented the Probabilistic Matrix Factorization (PMF) model, which characterized the user preference matrix as a product of two lower-rank user and item matrices. The PMF model was especially effective at making better predictions for users with few ratings. He et al. (2016) proposed a new MF method that considers implicit feedback for online recommendation, in which the weights of the missing data are assigned based on the popularity of items. To exploit the content of items and alleviate the sparsity issue in recommender systems, Zhao et al. (2016) presented a model for movie recommendation that uses additional visual features (e.g., posters and still frames) to better understand movies, further improving the performance of movie recommendation.
Recurrent neural network for recommendation
Traditional MF methods for recommender systems are based on the assumption that user interests and movie attributes are nearly static, which is, however, not consistent with reality. Koren (2010) discussed the effect of temporal dynamics in recommender systems and proposed a temporal extension of SVD++ (called TimeSVD++) to explicitly model the temporal bias in data. However, the features used in TimeSVD++ were hand-crafted and computationally expensive to obtain. Recently, there has been increasing interest in employing recurrent neural networks to model temporal dynamics in recommender systems. For example, Hidasi et al. (2015) applied a recurrent neural network (i.e., GRU) to session-based recommender systems. This work treats the first item a user clicked as the initial input of the GRU; each follow-up click of the user then triggers a recommendation depending on all of the previous clicks. Wu et al. (2016) proposed a recurrent neural network to perform time-heterogeneous feedback recommendation. Wu et al. (2017) used an LSTM autoregressive model for the user and movie dynamics, and employed matrix factorization to model the stationary components that encode fixed properties. Different from their work, we use a GAN framework to combine the MF and RNN approaches for top-N recommendation, aiming to generate plausible and high-quality recommendation lists. To address the cold-start problem in recommendation, Cui et al. (2016) presented a visual and textual recurrent neural network (VT-RNN), which simultaneously learned the sequential latent vectors of users' interests and captured content-based representations that help address cold start.
Generative adversarial network for recommendation
In parallel, previous work has demonstrated the effectiveness of generative adversarial networks (GANs) (Goodfellow et al., 2014) in various tasks such as image generation (Reed et al., 2016; Arjovsky et al., 2017), image captioning (Chen et al., 2017), and sequence generation (Yu et al., 2017). The work most related to ours is (Wang et al., 2017), which proposed IRGAN, a mechanism to iteratively optimize a generative retrieval component and a discriminative retrieval component. IRGAN reported impressive results on the tasks of web search, item recommendation, and question answering. Our approach differs from theirs in several aspects. First, we combine the MF approach and the RNN approach with a GAN, exploiting the performance contributions of both approaches. Second, IRGAN does not attempt to estimate future behavior, since the experimental data is split randomly in its setting; in effect, it uses future trajectories to infer historical records, which is of limited use in real-life applications. Third, we incorporate poster information of movies to deal with the cold-start issue and boost recommendation performance.
3. Our Model
Table 1. The main notations of this work.
$R$: the user-movie rating matrix
$M$, $N$: the number of users and movies
$r_{ij}$: rating score of user $i$ on movie $j$
$r_{ij}^t$: rating score of user $i$ on movie $j$ at time $t$
$p_i$: MF user factors for user $i$
$q_j$: MF movie factors for movie $j$
$b_i$: bias of user $i$ in the MF and RNN hybrid calculation
$b_j$: bias of movie $j$ in the MF and RNN hybrid calculation
$h_i^u(t)$: LSTM hidden vector at time $t$ for user $i$
$h_j^m(t)$: LSTM hidden vector at time $t$ for movie $j$
$x_i^u(t)$: the rating vector of user $i$ at time $t$ (LSTM input)
$x_j^m(t)$: the rating vector of movie $j$ at time $t$ (LSTM input)
$\alpha_k^t$: attention weight of user $k$ at time $t$
$\beta_l^t$: attention weight of movie $l$ at time $t$
$j^+$: index of a positive (high-rating) movie drawn from the entire positive movie set
$j^-$: index of a negative (low-rating) movie randomly chosen from the entire negative movie set
$j^g$: index of an item chosen by the generator at time $t$
Suppose there is a sparse user-movie rating matrix $R$ that consists of $M$ users and $N$ movies. Each entry $r_{ij}^t$ denotes the rating of user $i$ on movie $j$ at time step $t$. The rating is represented by numerical values from 1 to 5, where a higher value indicates a stronger preference. Instead of predicting the rating of a specific user-movie pair, as is done in (Adomavicius and Tuzhilin, 2005; McNee et al., 2006), the proposed LSIC model aims to provide users with ranked lists of movies (top-N recommendation) (Liu and Yang, 2008).

In this section, we elaborate each component of the LSIC model for content-aware movie recommendation. The main notations of this work are summarized in Table 1 for clarity. The LSIC model employs an adversarial framework to combine the MF- and RNN-based models for top-N movie recommendation. The overview of our proposed architecture and its dataflow are illustrated in Figure 1. In the adversarial process, we simultaneously train two models: a generative model $G$ and a discriminative model $D$.
3.1. Matrix Factorization (MF)
The MF framework (Mnih and Salakhutdinov, 2008) models the long-term states (global information) for both users ($p_i$) and movies ($q_j$). In its standard setting, the recommendation task can be formulated as inferring missing values of a partially observed user-movie rating matrix $R$. The formulation of MF is given by:

(1)  $\min_{P,Q} \sum_{i=1}^{M} \sum_{j=1}^{N} I_{ij} \big( r_{ij} - \sigma(p_i^\top q_j) \big)^2 + \lambda_p \lVert P \rVert_F^2 + \lambda_q \lVert Q \rVert_F^2$

where $P$ and $Q$ represent the user and movie latent factors in a shared dimension space, and $r_{ij}$ denotes user $i$'s rating on movie $j$. $I_{ij}$ is an indicator function that equals 1 if user $i$ rated movie $j$, and 0 otherwise. $\lambda_p$ and $\lambda_q$ are regularization coefficients. $\sigma(\cdot)$ is a logistic scoring function that bounds the range of outputs.
In most recommender systems, matrix factorization techniques (Koren et al., 2009) recommend movies based on estimated ratings. Even though the predicted ratings can be used to rank the movies, it is known that this does not yield the best predictions for top-N recommendation, because minimizing the objective function, the squared error, does not perfectly align with the goal of optimizing the ranking order. In this paper, we apply MF for ranking prediction (top-N recommendation) directly, similar to (Wang et al., 2017).
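As a concrete illustration, a logistic-scored factorization in the style of Eq. (1) can be fit with plain NumPy gradient steps. This is a minimal sketch with invented sizes and hyper-parameters, not the paper's implementation; ratings are rescaled to [0, 1] so the logistic scoring function can match them:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def mf_step(R, mask, P, Q, lr=0.05, lam=0.001):
    """One gradient step on sum_ij I_ij (r_ij - sigma(p_i^T q_j))^2
    plus L2 regularization on the factor matrices (cf. Eq. (1))."""
    pred = sigmoid(P @ Q.T)               # predicted scores in (0, 1)
    err = mask * (R - pred)               # only observed entries contribute
    g = err * pred * (1.0 - pred)         # chain rule through the sigmoid
    P_new = P + lr * (g @ Q - lam * P)    # descent step for user factors
    Q_new = Q + lr * (g.T @ P - lam * Q)  # descent step for movie factors
    return P_new, Q_new

# toy data: 3 users x 4 movies, 0 marks an unobserved rating
R = np.array([[1.0, 0.2, 0.0, 0.8],
              [0.9, 0.0, 0.4, 0.0],
              [0.0, 0.3, 0.9, 0.6]])
mask = (R > 0).astype(float)
rng = np.random.default_rng(0)
P, Q = rng.normal(0, 0.1, (3, 2)), rng.normal(0, 0.1, (4, 2))
mse_before = np.mean((mask * (R - sigmoid(P @ Q.T))) ** 2)
for _ in range(2000):
    P, Q = mf_step(R, mask, P, Q)
mse_after = np.mean((mask * (R - sigmoid(P @ Q.T))) ** 2)
# rank movies for user 0 by fitted score, as in top-N recommendation
ranking = np.argsort(-sigmoid(P[0] @ Q.T))
```

Ranking by the fitted scores, rather than thresholding predicted ratings, mirrors the use of MF directly for top-N prediction described above.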
3.2. Recurrent Neural Network (RNN)
The RNN-based recommender system focuses on modeling session-based trajectories instead of global (long-term) information (Wu et al., 2017). It predicts future behaviors and provides users with a ranking list given the users' past history. The main purpose of using an RNN is to capture the time-varying state of both users and movies. In particular, we use the LSTM cell as the basic RNN unit. Each LSTM unit at time $t$ consists of a memory cell $c_t$, an input gate $i_t$, a forget gate $f_t$, and an output gate $o_t$. These gates are computed from the previous hidden state $h_{t-1}$ and the current input $x_t$:

(2)  $i_t = \sigma(W_i x_t + U_i h_{t-1} + b_i), \quad f_t = \sigma(W_f x_t + U_f h_{t-1} + b_f), \quad o_t = \sigma(W_o x_t + U_o h_{t-1} + b_o)$

The memory cell $c_t$ is updated by partially forgetting the existing memory and adding a new memory content $\tilde{c}_t$:

(3)  $\tilde{c}_t = \tanh(W_c x_t + U_c h_{t-1} + b_c)$

(4)  $c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t$

Once the memory content of the LSTM unit is updated, the hidden state at time step $t$ is given by:

(5)  $h_t = o_t \odot \tanh(c_t)$

For simplicity of notation, the update of the hidden state of the LSTM at time step $t$ is denoted as $h_t = \mathrm{LSTM}(h_{t-1}, x_t)$.
Here, we use $x_i^u(t)$ and $x_j^m(t)$ to represent the rating vectors of user $i$ and movie $j$ at time $t$ respectively. Both serve as the input to the LSTM layer at time $t$ to infer the new states of the user and the movie:

(6)  $h_i^u(t) = \mathrm{LSTM}\big(h_i^u(t-1),\, x_i^u(t)\big)$

(7)  $h_j^m(t) = \mathrm{LSTM}\big(h_j^m(t-1),\, x_j^m(t)\big)$

Here, $h_i^u(t)$ and $h_j^m(t)$ denote the hidden states of user $i$ and movie $j$ at time step $t$ respectively.
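The gate and state updates of a standard LSTM step can be written out directly in NumPy. The sketch below stacks the four gate parameter blocks into single matrices; the parameter names and random initialization are ours, not the paper's:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_cell(x_t, h_prev, c_prev, W, U, b):
    """One LSTM step: gates from h_{t-1} and x_t, memory update by
    partial forgetting, hidden state from the output gate. W, U, b hold
    stacked parameters for the i, f, o gates and the candidate memory."""
    z = W @ x_t + U @ h_prev + b          # stacked pre-activations, shape (4H,)
    H = h_prev.shape[0]
    i_t = sigmoid(z[0:H])                 # input gate
    f_t = sigmoid(z[H:2*H])               # forget gate
    o_t = sigmoid(z[2*H:3*H])             # output gate
    c_tilde = np.tanh(z[3*H:4*H])         # new memory content
    c_t = f_t * c_prev + i_t * c_tilde    # partially forget, then add
    h_t = o_t * np.tanh(c_t)              # hidden state
    return h_t, c_t

# update a user's hidden state from one session's rating vector
rng = np.random.default_rng(1)
H, D = 10, 15                             # hidden and input sizes from Sec. 4.2
W = rng.normal(0, 0.1, (4 * H, D))
U = rng.normal(0, 0.1, (4 * H, H))
b = np.zeros(4 * H)
h, c = np.zeros(H), np.zeros(H)
x_t = rng.normal(0, 1, D)                 # the user's rating vector at time t
h, c = lstm_cell(x_t, h, c, W, U, b)
```

The same cell, with separate parameter sets, drives both the user-side and movie-side recurrences.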
In this work, we explore the potential of integrating posters of movies to boost the performance of movie recommendation. Inspired by the recent advances of CNNs in computer vision, the poster is mapped into the same space as the movie by using a CNN. More concretely, we encode each image into an FC-2k feature vector with ResNet-101 (101 layers), resulting in a 2048-dimensional vector representation. The poster $v_j$ of movie $j$ is input only once, at $t = 0$, to inform the movie LSTM about the poster content:

(8)  $h_j^m(0) = \mathrm{LSTM}\big(0,\, W_v\,\mathrm{CNN}(v_j)\big)$
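Assuming the poster has already been encoded by ResNet-101 into a 2048-dimensional feature vector, feeding it to the movie LSTM at $t = 0$ only requires a learned projection into the LSTM input space. In this sketch, `W_v` is a hypothetical projection matrix and the feature is a random stand-in for the CNN output:

```python
import numpy as np

def poster_input(poster_feat, W_v):
    """Project a 2048-d ResNet-101 poster feature into the movie LSTM's
    input space; the result is fed to the LSTM once, at t = 0."""
    return W_v @ poster_feat

rng = np.random.default_rng(2)
poster_feat = rng.normal(0, 1, 2048)   # assumed precomputed CNN feature
W_v = rng.normal(0, 0.01, (15, 2048))  # projection to the 15-d LSTM input
x_m0 = poster_input(poster_feat, W_v)  # first input to the movie LSTM
```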
3.3. RNN and MF Hybrid
While the session-based model deals with the temporal dynamics of the user and movie states, we further incorporate the long-term preferences of users and the fixed properties of movies. To exploit their advantages together, similar to (Wu et al., 2017), we define the rating prediction function as:

(9)  $\hat{r}_{ij}^t = f\big(p_i,\, q_j,\, h_i^u(t),\, h_j^m(t)\big)$

where $f$ is a score function, $p_i$ and $q_j$ denote the global latent factors of user $i$ and movie $j$ learned by Eq. (1), and $h_i^u(t)$ and $h_j^m(t)$ denote the hidden states at time step $t$ of the two RNNs learned by Eq. (6) and Eq. (7) respectively. In this work, we study four strategies to calculate the score function $f$, integrating MF and RNN. The details are described below.
LSIC-V1
This is a hard mechanism that calculates the mixing score from MF and RNN directly, with the following formulation:

(10)  $\hat{r}_{ij}^t = \sigma\big(s_{ij}^t\big)$

(11)  $s_{ij}^t = p_i^\top q_j + h_i^u(t)^\top h_j^m(t) + b_i + b_j$

where $b_i$ and $b_j$ are the biases of user $i$ and movie $j$, and $h_i^u(t)$ and $h_j^m(t)$ are computed by Eq. (6) and Eq. (7).
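A minimal sketch of the hard mixture: the long-term MF score and the short-term session score are summed with the biases and squashed to a probability-like value (the paper's exact combination may differ; all inputs here are toy values):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def hard_mixture_score(p_i, q_j, h_u, h_m, b_i, b_j):
    """Hard mixture: long-term (MF) plus short-term (LSTM) dot-product
    scores plus user/movie biases, squashed to (0, 1)."""
    return sigmoid(p_i @ q_j + h_u @ h_m + b_i + b_j)

p_i, q_j = np.array([0.2, -0.1]), np.array([0.3, 0.5])        # MF factors
h_u, h_m = np.array([0.1, 0.4, -0.2]), np.array([0.0, 0.3, 0.1])  # LSTM states
score = hard_mixture_score(p_i, q_j, h_u, h_m, 0.05, -0.02)
```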
In fact, LSIC-V1 does not exploit the global factors in learning the temporal dynamics. We therefore also design a soft mixture mechanism and provide three strategies that account for the global factors $p_i$ and $q_j$ in learning $h_i^u(t)$ and $h_j^m(t)$, as described below (i.e., LSIC-V2, LSIC-V3 and LSIC-V4).
LSIC-V2
We use the latent factors of user $i$ ($p_i$) and movie $j$ ($q_j$) pre-trained by the MF model to initialize the hidden states of the LSTM cells, $h_i^u(0)$ and $h_j^m(0)$ respectively, as depicted in Figure 2(b).
LSIC-V3
As shown in Figure 2(c), we extend LSIC-V2 by treating $p_i$ (for user $i$) and $q_j$ (for movie $j$) as static context vectors, and feed them as an extra input into the computation of the temporal hidden states of users and movies by the LSTM. At each time step, the context information assists the inference of the hidden states of the LSTM model.
LSIC-V4
This method uses an attention mechanism to compute a weight for each hidden state by exploiting the global factors. The mixing scores at time $t$ are reformulated as:

(12)  $\hat{r}_{ij}^t = \sigma\big(s_{ij}^t\big)$

(13)  $s_{ij}^t = p_i^\top q_j + h_i^u(t)^\top h_j^m(t) + b_i + b_j$

where $z_i^u(t)$ and $z_j^m(t)$ are the context vectors at time step $t$ for user $i$ and movie $j$, and $h_i^u(t)$ and $h_j^m(t)$ are the hidden states of the LSTMs at time step $t$, computed by

(14)  $h_i^u(t) = \mathrm{LSTM}\big(h_i^u(t-1),\, [x_i^u(t);\, z_i^u(t)]\big)$

(15)  $h_j^m(t) = \mathrm{LSTM}\big(h_j^m(t-1),\, [x_j^m(t);\, z_j^m(t)]\big)$

The context vectors $z_i^u(t)$ and $z_j^m(t)$ act as extra input in the computation of the hidden states in the LSTMs to make sure that every time step of the LSTMs can get full information of the context (long-term information). They are dynamic representations of the relevant long-term information for user $i$ and movie $j$ at time $t$, calculated by

(16)  $z_i^u(t) = \sum_{k=1}^{M} \alpha_k^t\, p_k, \qquad z_j^m(t) = \sum_{l=1}^{N} \beta_l^t\, q_l$

where $M$ and $N$ are the number of users and movies. The attention weights $\alpha_k^t$ and $\beta_l^t$ for user $k$ and movie $l$ at time step $t$ are computed by

(17)  $\alpha_k^t = \dfrac{\exp\big(a(h_i^u(t-1),\, p_k)\big)}{\sum_{k'=1}^{M} \exp\big(a(h_i^u(t-1),\, p_{k'})\big)}$

(18)  $\beta_l^t = \dfrac{\exp\big(a(h_j^m(t-1),\, q_l)\big)}{\sum_{l'=1}^{N} \exp\big(a(h_j^m(t-1),\, q_{l'})\big)}$

where $a(\cdot)$ is a feed-forward neural network that produces a real-valued score. The attention weights $\alpha_k^t$ and $\beta_l^t$ together determine which user and movie factors should be selected to generate the mixing score $\hat{r}_{ij}^t$.

3.4. Generative Adversarial Network (GAN) for Recommendation
A generative adversarial network (GAN) (Goodfellow et al., 2014) consists of a generator $G$ and a discriminator $D$ that compete in a two-player minimax game: the discriminator tries to distinguish real high-rated movies in the training data from the ranking or recommendation lists predicted by $G$, and the generator tries to fool the discriminator by generating (predicting) well-ranked recommendation lists. Concretely, $D$ and $G$ play the following game on $V(D,G)$:

(19)  $\min_G \max_D V(D,G) = \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\big[\log D(x)\big] + \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z))\big)\big]$

Here, $x$ is the input data from the training set, and $z$ is a noise variable sampled from a normal distribution.
We propose an adversarial framework to iteratively optimize two models: the generative model $G$, predicting recommendation lists given historical user-movie interactions, and the discriminative model $D$, predicting the relevance of the generated lists. Like standard generative adversarial networks (GANs) (Goodfellow et al., 2014), our model optimizes the two models with a minimax two-player game: $D$ tries to distinguish the real high-rated movies in the training data from the recommendation lists generated by $G$, while $G$ maximizes the probability of $D$ making a mistake. Hopefully, this adversarial process can eventually adjust $G$ to generate plausible and high-quality recommendation lists. We further elaborate the generator and discriminator below.
3.4.1. Discriminative Model
As depicted in Figure 1 (right side), we implement the discriminator via a siamese network that incorporates long-term and session-based ranking models in a pairwise scenario. The discriminator has two symmetrical pointwise networks that share parameters and are updated by minimizing a pairwise loss.
The objective of the discriminator is to maximize the probability of correctly distinguishing the ground-truth movies from the generated recommendation movies. For $G$ fixed, we can obtain the optimal parameters $\phi^*$ for the discriminator with the following formulation:

(20)  $\phi^* = \arg\max_\phi \sum_{u_i \in \mathcal{U}} \Big( \mathbb{E}_{j^+ \sim p_{\mathrm{true}}(\cdot \mid u_i, t)}\big[\log D_\phi(j^+ \mid u_i, t)\big] + \mathbb{E}_{j^g \sim G_\theta(\cdot \mid u_i, t)}\big[\log\big(1 - D_\phi(j^g \mid u_i, t)\big)\big] \Big)$

where $\mathcal{U}$ denotes the user set, $u_i$ denotes user $i$, $j^+$ is a positive (high-rating) movie, $j^-$ is a negative movie randomly chosen from the entire negative (low-rating) movie space, $\phi$ and $\theta$ are the parameters of $D$ and $G$, and $j^g$ is the movie generated by $G$ at time $t$. Here, we adopt the hinge loss as our training objective since it performs better than other training objectives. The hinge loss is widely adopted in various learning-to-rank scenarios; it aims to penalize examples that violate the margin constraint:

(21)  $\mathcal{L}(j^+, j^g) = \max\big(0,\; \epsilon - D_\phi(j^+ \mid u_i, t) + D_\phi(j^g \mid u_i, t)\big)$

where $\epsilon$ is a hyper-parameter determining the margin of the hinge loss, and we compress the outputs of $D_\phi$ to the range $[0, 1]$.
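The pairwise hinge objective is a one-liner. In this sketch the margin value is illustrative, and both scores are assumed already squashed to [0, 1] as described above:

```python
import numpy as np

def hinge_loss(d_pos, d_gen, margin=0.5):
    """Pairwise hinge loss: penalize a generated movie whose discriminator
    score comes within `margin` of the true high-rated movie's score."""
    return np.maximum(0.0, margin - d_pos + d_gen)

# the siamese discriminator scores both movies with shared parameters
loss_separated = hinge_loss(0.9, 0.1)  # well separated: zero loss
loss_violated = hinge_loss(0.6, 0.5)   # margin violated: positive loss
```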
3.4.2. Generative Model
Similar to the conditional GANs proposed in (Mirza and Osindero, 2014), our generator takes auxiliary information (user $u_i$ and time $t$) as input, and generates the ranking list for user $u_i$. Specifically, when $D$ is optimized and fixed after computing Eq. (20), the generator can be optimized by minimizing the following formulation:

(22)  $\theta^* = \arg\min_\theta \sum_{u_i \in \mathcal{U}} \mathbb{E}_{j^g \sim G_\theta(\cdot \mid u_i, t)}\big[\log\big(1 - D_\phi(j^g \mid u_i, t)\big)\big]$

Here, $j^g$ ranges over the movie set $\mathcal{M}$. As in (Goodfellow et al., 2014), instead of minimizing $\log\big(1 - D_\phi(\cdot)\big)$, we train $G_\theta$ to maximize $\log D_\phi(\cdot)$.
3.4.3. Policy Gradient
Since the sampling of the recommendation list by the generator is discrete, it cannot be directly optimized by gradient descent as in the standard GAN formulation. Therefore, we use a policy-gradient-based reinforcement learning algorithm (Sutton et al., 2000) to optimize the generator so as to generate highly rewarded recommendation lists. Concretely, we have the following derivation:

(23)  $\nabla_\theta J(\theta) = \nabla_\theta\, \mathbb{E}_{j^g \sim G_\theta(\cdot \mid u_i, t)}\big[\log D_\phi(j^g \mid u_i, t)\big] \approx \frac{1}{K} \sum_{k=1}^{K} \nabla_\theta \log G_\theta(j_k \mid u_i, t)\, \log D_\phi(j_k \mid u_i, t)$

where $K$ is the number of movies sampled by the current version of the generator and $j_k$ is the $k$-th sampled item. In reinforcement learning terminology, we treat the term $\log D_\phi(j_k \mid u_i, t)$ as the reward at time step $t$, and take an action $j_k$ at each time step. To accelerate convergence, the rewards within a batch are normalized with a Gaussian distribution to sharpen the differences among them.
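A sketch of the REINFORCE-style estimate with the batch-wise reward normalization described above. The array shapes and the toy generator here are ours; in the real model the per-sample gradients come from backpropagation through $G_\theta$:

```python
import numpy as np

def policy_gradient(log_prob_grads, rewards):
    """Average the per-sample gradients of log G_theta, each weighted by
    its batch-normalized discriminator reward (REINFORCE estimate)."""
    r = np.asarray(rewards, dtype=float)
    r = (r - r.mean()) / (r.std() + 1e-8)  # normalize rewards within the batch
    grads = np.asarray(log_prob_grads)     # shape (K, n_params)
    return (grads * r[:, None]).mean(axis=0)

# K = 4 sampled movies, a toy generator with 3 parameters
rng = np.random.default_rng(3)
log_prob_grads = rng.normal(size=(4, 3))   # d/d theta of log G(j_k | u, t)
rewards = [0.9, 0.1, 0.4, 0.6]             # discriminator scores as rewards
g = policy_gradient(log_prob_grads, rewards)
```

Normalizing the rewards within a batch keeps the gradient scale stable and makes above-average and below-average samples push the policy in opposite directions.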
4. Experimental Setup
4.1. Datasets
Dataset  Movielens-100K  Netflix-3M  Netflix-Full
Users  943  326,668  480,189
Movies  1,682  17,751  17,770
Ratings  100,000  16,080,980  100,480,507
Train Data  09/97-03/98  09/05-11/05  12/99-11/05
Test Data  03/98-04/98  12/05  12/05
Train Ratings  77,714  13,675,402  98,074,901
Test Ratings  21,875  2,405,578  2,405,578
Density  0.493  0.406  0.093
Sparsity  0.063  0.003  0.012
In order to evaluate the effectiveness of our model, we conduct experiments on two widely used real-life datasets: Movielens-100K and Netflix (called “Netflix-Full”). To evaluate the robustness of our model, we also conduct experiments on a 3-month Netflix dataset (called “Netflix-3M”), which is a small version of Netflix-Full with a different training and testing period. For each dataset, we split the whole data into several training and testing intervals based on time, as is done in (Wu et al., 2017), to simulate the actual situation of predicting future behaviors of users given data that occurred strictly before the current time step. Each testing interval is then randomly divided into a validation set and a testing set. We removed from the validation and test sets the users and movies that do not appear in the training set. The detailed statistics are presented in Table 2.¹ Following (Wang et al., 2017), we treat “5-star” ratings in Netflix, and “4-star” and “5-star” ratings in Movielens-100K, as positive feedback and all others as unknown (negative) feedback.

¹ “Density” shows the average number of 5-star ratings per user per day. “Sparsity” shows the filling rate of the user-movie rating matrix, as used in (Wu et al., 2017).
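The time-based protocol can be sketched as follows: interactions strictly after a cutoff date go to the test side, and test users or movies unseen during training are dropped. The dates and IDs below are invented:

```python
from datetime import date

def time_split(ratings, cutoff):
    """Split (user, movie, rating, day) tuples at a cutoff date, keeping in
    the test set only users and movies that appear in the training set."""
    train = [r for r in ratings if r[3] <= cutoff]
    seen_users = {u for u, _, _, _ in train}
    seen_movies = {m for _, m, _, _ in train}
    test = [r for r in ratings if r[3] > cutoff
            and r[0] in seen_users and r[1] in seen_movies]
    return train, test

ratings = [
    (1, 10, 5, date(2005, 9, 3)),
    (2, 11, 4, date(2005, 10, 21)),
    (1, 11, 4, date(2005, 12, 8)),   # kept in test: user 1 and movie 11 are in train
    (3, 10, 3, date(2005, 12, 9)),   # dropped: user 3 never appears in train
]
train, test = time_split(ratings, date(2005, 11, 30))
```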
4.2. Implementation Details
Matrix Factorization.
We use matrix factorization with 5 and 16 latent factors for Movielens and Netflix respectively (Wang et al., 2017). The parameters are randomly initialized from a uniform distribution over [-0.05, 0.05]. We apply gradient clipping to restrict gradients to the range [-0.2, 0.2]. L2 regularization is applied to the weights and biases of the user and movie factors.

Recurrent Neural Network.
We use a single-layer LSTM with 10 hidden neurons, 15-dimensional input embeddings, and 4-dimensional dynamic states, where each state covers a 7-day user/movie behavioral trajectory. That is, we take one month as the length of a session. The parameters are initialized in the same way as in MF. L2 regularization is applied to the weights and biases of the LSTM layer to avoid overfitting.

Generative Adversarial Nets.
We pre-train $G$ and $D$ on the training data in a pairwise scenario, and use the SGD algorithm to optimize their parameters. The number of sampled movies is set to 64 (i.e., $K = 64$). In addition, we use the matrix factorization model to generate 100 candidate movies, and then re-rank these movies with the LSTM. In all experiments, we conduct mini-batch training with batch size 128.
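The two-stage pipeline (MF proposes 100 candidates, the session model re-ranks them) can be sketched as below. All scores here are plain dot products over random stand-in parameters; the full model scores candidates with the mixture function of Eq. (9):

```python
import numpy as np

def recommend(p_i, Q, h_u, H_m, n_candidates=100, top_n=10):
    """Two-stage ranking: MF proposes `n_candidates` movies by long-term
    score, then the session model re-ranks them by short-term score."""
    mf_scores = Q @ p_i                           # long-term scores, all movies
    cand = np.argsort(-mf_scores)[:n_candidates]  # top candidates by MF
    rnn_scores = H_m[cand] @ h_u                  # short-term scores on candidates
    return cand[np.argsort(-rnn_scores)][:top_n]  # final top-N list

rng = np.random.default_rng(4)
Q = rng.normal(size=(500, 16))    # movie factors for a toy 500-movie corpus
H_m = rng.normal(size=(500, 10))  # movie session states
recs = recommend(rng.normal(size=16), Q, rng.normal(size=10), H_m)
```

Restricting the expensive session-based scoring to a short MF-generated candidate list keeps inference cheap while letting short-term dynamics decide the final order.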
4.3. Evaluation Metrics
To quantitatively evaluate our method, we adopt rank-based evaluation metrics to measure the performance of top-N recommendation (Liu and Yang, 2008; Cremonesi et al., 2010), including Precision@N, Normalised Discounted Cumulative Gain (NDCG@N), Mean Average Precision (MAP) and Mean Reciprocal Rank (MRR).
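For reference, the two headline metrics can be computed as follows (binary relevance, standard log2 discount):

```python
import numpy as np

def precision_at_n(ranked, relevant, n):
    """Fraction of the top-n recommended movies that are relevant."""
    return len(set(ranked[:n]) & set(relevant)) / n

def ndcg_at_n(ranked, relevant, n):
    """Binary-relevance NDCG@n: discounted gain of the hits in the top n,
    normalized by the gain of an ideal ordering."""
    dcg = sum(1.0 / np.log2(i + 2)
              for i, m in enumerate(ranked[:n]) if m in relevant)
    idcg = sum(1.0 / np.log2(i + 2) for i in range(min(len(relevant), n)))
    return dcg / idcg

ranked = [3, 1, 7, 5, 9]   # a toy recommendation list
relevant = {1, 5}          # the user's held-out positive movies
# precision_at_n(ranked, relevant, 5) == 0.4 (2 hits in the top 5)
```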
4.4. Comparison to Baselines
In the experiments, we evaluate and compare our models with several state-of-the-art methods.
Precision@3  Precision@5  Precision@10  NDCG@3  NDCG@5  NDCG@10  MRR  MAP  
BPR  0.2795  0.2664  0.2301  0.2910  0.2761  0.2550  0.4324  0.3549 
PRFM  0.2884  0.2699  0.2481  0.2937  0.2894  0.2676  0.4484  0.3885 
LambdaFM  0.3108  0.2953  0.2612  0.3302  0.3117  0.2795  0.4611  0.4014 
RRN  0.2893  0.2740  0.2480  0.2951  0.2814  0.2513  0.4320  0.3631 
IRGAN  0.3022  0.2885  0.2582  0.3285  0.3032  0.2678  0.4515  0.3744 
LSICV1  0.2946  0.2713  0.2471  0.2905  0.2801  0.2644  0.4595  0.4066 
LSICV2  0.3004  0.2843  0.2567  0.3122  0.2951  0.2814  0.4624  0.4101 
LSICV3  0.3105  0.3023  0.2610  0.3217  0.3086  0.2912  0.4732  0.4163 
LSICV4  0.3327  0.3173  0.2847  0.3512  0.3331  0.2939  0.4832  0.4321 
Impv  7.05%  7.45%  9.00%  6.36%  6.87%  5.15%  4.79%  7.65% 
Precision@3  Precision@5  Precision@10  NDCG@3  NDCG@5  NDCG@10  MRR  MAP  
BPR  0.2670  0.2548  0.2403  0.2653  0.2576  0.2469  0.3829  0.3484 
PRFM  0.2562  0.2645  0.2661  0.2499  0.2575  0.2614  0.4022  0.3712 
LambdaFM  0.3082  0.2984  0.2812  0.3011  0.2993  0.2849  0.4316  0.4043 
RRN  0.2759  0.2741  0.2693  0.2685  0.2692  0.2676  0.3960  0.3831 
IRGAN  0.2856  0.2836  0.2715  0.2824  0.2813  0.2695  0.4060  0.3718 
LSICV1  0.2815  0.2801  0.2680  0.2833  0.2742  0.2696  0.4416  0.4025 
LSICV2  0.2901  0.2883  0.2701  0.2903  0.2831  0.2759  0.4406  0.4102 
LSICV3  0.3152  0.3013  0.2722  0.2927  0.2901  0.2821  0.4482  0.4185 
LSICV4  0.3221  0.3193  0.2921  0.3157  0.3114  0.2975  0.4501  0.4247 
Impv  4.51%  7.00%  3.88%  4.85%  4.04%  4.42%  4.29%  5.05% 
Precision@3  Precision@5  Precision@10  NDCG@3  NDCG@5  NDCG@10  MRR  MAP  
BPR  0.3011  0.2817  0.2587  0.2998  0.2870  0.2693  0.3840  0.3660 
PRFM  0.2959  0.2837  0.2624  0.2831  0.2887  0.2789  0.4060  0.3916 
LambdaFM  0.3446  0.3301  0.3226  0.3450  0.3398  0.3255  0.4356  0.4067 
RRN  0.3135  0.2954  0.2699  0.3123  0.3004  0.2810  0.3953  0.3768 
IRGAN  0.3320  0.3229  0.3056  0.3319  0.3260  0.3131  0.4248  0.4052 
LSICV1  0.3127  0.3012  0.2818  0.3247  0.3098  0.2957  0.4470  0.4098 
LSICV2  0.3393  0.3271  0.3172  0.3482  0.3401  0.3293  0.4448  0.4213 
LSICV3  0.3501  0.3480  0.3291  0.3498  0.3451  0.3321  0.4503  0.4257 
LSICV4  0.3621  0.3530  0.3341  0.3608  0.3511  0.3412  0.4587  0.4327 
Impv  5.08%  6.94%  3.56%  4.58%  3.33%  4.82%  5.30%  6.39% 
Precision@3  Precision@5  Precision@10  NDCG@3  NDCG@5  NDCG@10  MRR  MAP  
LSICV4  0.3221  0.3193  0.2921  0.3157  0.3114  0.2975  0.4501  0.4247 
w/o RL  0.3012  0.2970  0.2782  0.2988  0.2927  0.2728  0.4431  0.4112 
w/o poster  0.3110  0.3012  0.2894  0.3015  0.3085  0.2817  0.4373  0.4005 
Groundtruth  IRGAN (Wang et al., 2017)  RRN (Wu et al., 2017)  LambdaFM (Yuan et al., 2016)  LSICV4  
Userid: 1382  9 Souls The Princess Bride Stuart Saves His Family The Last Valley Wax Mask After Hours Session 9 Valentin  [1] The Beatles: Love Me Do [2] Wax Mask ✓ [3] Stuart Saves His Family ✓ [4] After Hours ✓ [5] Top Secret! [6] Damn Yankees [7] Dragon Tales: It’s Cool to Be Me! [8] Play Misty for Me [9] The Last Round: Chuvalo vs. Ali’ [10] La Vie de Chateau  [1] Falling Down [2] 9 Souls ✓ [3] Wax Mask ✓ [4] After Hours ✓ [5] Stuart Saves His Family ✓ [6] Crocodile Dundee 2 [7] The Princess Bride ✓ [8] Dragon Tales: It’s Cool to Be Me! [9] They Were Expendable [10] Damn Yankees  [1] The Avengers ’63 [2] Wax Mask ✓ [3] The Boondock Saints [4] Valentin ✓ [5] 9 Souls ✓ [6] The Princess Bride ✓ [7] After Hours ✓ [8] Tekken [9] Stuart Saves His Family ✓ [10] Runn Ronnie Run  [1]9 Souls ✓ [2]The Princess Bride ✓ [3]Stuart Saves His Family ✓ [4] The Last Valley ✓ [5] Wax Mask ✓ [6] Session 9 ✓ [7] Dragon Tales: It’s Cool to Be Me! [8] Damn Yankees [9] After Hours ✓ [10] Valentin ✓ 
Userid: 8003  9 Souls Princess Bride  [1] Cheech Chong’s Up in Smoke [2] Wax Mask [3] Damn Yankees [4] Dragon Tales: It’s Cool to Be Me! [5] Top Secret! [6] Agent Cody Banks 2: Destination London [7] After Hours [8] Stuart Saves His Family [9] 9 Souls ✓ [10] The Beatles: Love Me Do  [1] Crocodile Dundee 2 [2] Session 9 [3] Falling Down [4] Wax Mask [5] After Hours [6] Stuart Saves His Family [7] 9 Souls ✓ [8] The Princess Bride ✓ [9] Dragon Tales: It’s Cool to Be Me! [10] Scream 2  [1] The Insider [2]A Nightmare on Elm Street 3 [3] Dennis the Menace Strikes Again [4] Civil Brand [5] 9 Souls ✓ [6] Falling Down [7] The Princess Bride ✓ [8] Radiohead: Meeting People [9] Crocodile Dundee 2 [10] Christmas in Connecticut  [1] 9 Souls ✓ [2]The Princess Bride ✓ [3]The Last Valley [4]Stuart Saves His Family [5]Wax Mask [6]Dragon Tales: It’s Cool to Be Me! [7]Session 9 [8]Crocodile Dundee 2 [9]Damn Yankees [10]Cheech Chong’s Up in Smoke 
Bayesian Personalised Ranking (BPR)
Given a positive movie, BPR uniformly samples negative movies to resolve the imbalance issue, and provides a basic baseline for top-N recommendation (Rendle et al., 2009).
Pairwise Ranking Factorization Machine (PRFM)
LambdaFM
It is a strong baseline for recommendation, which directly optimizes rank-biased metrics (Yuan et al., 2016). We run the LambdaFM model with the publicly available code² and use default settings for all hyper-parameters.

² https://github.com/fajieyuan/LambdaFM
Recurrent Recommender Networks (RRN)
IRGAN
This model trains the generator and discriminator alternately with MF in an adversarial process (Wang et al., 2017). We run the IRGAN model with the publicly available code³ and use default settings for all hyper-parameters.

³ https://github.com/geek-ai/irgan
5. Experimental Results
In this section, we compare our model with baseline methods quantitatively and qualitatively.
5.1. Quantitative Evaluation
We first evaluate the performance of top-N recommendation. The experimental results are summarized in Tables 3, 4 and 5. Our model substantially and consistently outperforms the baseline methods by a noticeable margin on all the experimental datasets. In particular, we have explored several versions of our model with different mixture mechanisms. As one would anticipate, LSIC-V4 achieves the best results across all evaluation metrics and all datasets. For example, on the MovieLens dataset, LSIC-V4 improves Precision@5 and NDCG@5 over all baseline methods. The main strength of our model comes from its capability of exploiting both long-term and short-term information in content-aware movie recommendation. In addition, our mixture mechanisms (hard and soft) prove effective at integrating MF and RNN.
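For concreteness, the two metrics reported throughout this section can be computed as follows with binary relevance; the ranked list and relevance set in the example are illustrative, not data from the experiments.

```python
# Precision@K and NDCG@K (binary relevance) for a single ranked list.
# The example list and relevant set are illustrative.
import math

def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k ranked items that are relevant."""
    return sum(1 for item in ranked[:k] if item in relevant) / k

def ndcg_at_k(ranked, relevant, k):
    """DCG of the top-k list divided by the DCG of an ideal ordering."""
    dcg = sum(1.0 / math.log2(i + 2)            # rank i is 0-indexed
              for i, item in enumerate(ranked[:k]) if item in relevant)
    ideal = sum(1.0 / math.log2(i + 2) for i in range(min(len(relevant), k)))
    return dcg / ideal if ideal > 0 else 0.0

ranked = ["A", "B", "C", "D", "E"]
relevant = {"A", "C", "F"}
print(precision_at_k(ranked, relevant, 5))  # 2 hits in the top 5 -> 0.4
```

In evaluation these per-user scores are averaged over all test users.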
To better understand the adversarial training process, we visualize the learning curves of LSIC-V4, as shown in Figure 3. Due to limited space, we only report the Precision@5 and NDCG@5 scores, as in (Wang et al., 2017); the other metrics exhibit a similar trend. As shown in Figure 3, after about 50 epochs both Precision@5 and NDCG@5 converge, and the winning player is the generator G, which is used to generate the recommendation list for our final top-N movie recommendation. The performance of the generator improves with the effective feedback (reward) from the discriminator D. On the other hand, once we have a set of high-quality recommended movies, the performance of D deteriorates gradually during training and it increasingly makes mistakes in its predictions. In our experiments, we use the generator G with the best performance to predict the test data.
5.2. Ablation Study
To analyze the effectiveness of the different components of our model for top-N movie recommendation, in this section we report the ablation test of LSIC-V4 by discarding the poster information (w/o poster) and by replacing the reinforcement learning with Gumbel-Softmax (Kusner and Hernández-Lobato, 2016) (w/o RL), respectively. Gumbel-Softmax is an alternative method for addressing the non-differentiability problem, so that the generator can be trained straightforwardly.
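The Gumbel-Softmax trick replaces the generator's discrete movie sampling with a differentiable "soft" sample: add Gumbel(0,1) noise to the logits and apply a temperature-scaled softmax. A minimal sketch is given below; the logits and temperature are illustrative values, not settings from the paper.

```python
# Sketch of a Gumbel-Softmax (relaxed one-hot) sample over movie logits.
# Logits and temperature are illustrative.
import numpy as np

rng = np.random.default_rng(0)

def gumbel_softmax(logits, tau=0.5):
    """Add Gumbel(0,1) noise to the logits, then apply a softmax with
    temperature tau; tau -> 0 approaches a hard argmax sample."""
    g = -np.log(-np.log(rng.uniform(size=logits.shape)))  # Gumbel(0,1) noise
    y = (logits + g) / tau
    y -= y.max()                       # subtract max for numerical stability
    e = np.exp(y)
    return e / e.sum()

probs = gumbel_softmax(np.log(np.array([0.7, 0.2, 0.1])))
```

Because the output is a smooth probability vector rather than a sampled index, gradients flow through it directly, which is exactly what the w/o RL variant exploits.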
Due to limited space, we only illustrate the experimental results for the Netflix-3M dataset, which is widely used in movie recommendation (see Table 6). Generally, both factors contribute, and reinforcement learning contributes most. This is within our expectation, since discarding reinforcement learning renders the adversarial learning inefficient: with Gumbel-Softmax, G does not benefit from the reward of D, so we do not know which movies sampled by G are good and should be reproduced. Not surprisingly, the poster information also contributes to movie recommendation.
5.3. Case Study
In this section, we further demonstrate the advantages of our model through some quintessential examples.
In Table 7, we provide the recommendation lists generated by three state-of-the-art baseline methods (i.e., IRGAN, RRN, LambdaFM) as well as the proposed LSIC-V4 model for two users randomly selected from the Netflix-3M dataset. Our model ranks the positive movies in higher positions than the other methods. For example, the ranking of the movie “9 Souls” for user “8003” rises from the 5th position (by LambdaFM) to the 1st position (by LSIC-V4). Meanwhile, some movies such as “Session 9” and “The Last Valley” that are truly attractive to user “1382” are recommended by our model, whereas they are missed by the baseline methods. In fact, our model includes all positive movies in the top-10 list for user “1382” and in the top-3 list for user “8003”. Our model benefits from the fact that both dynamic and static knowledge are incorporated into the model with adversarial training.
5.4. Rerank Effect
From our experiments, we observe that it can be time-consuming for the RNN to score all movies at inference time. In addition, users tend to be interested in only a few movies, which follow a long-tailed distribution. Motivated by these observations, we adopt a rerank strategy as used in (Covington et al., 2016): we first generate candidate movies with MF, and then rerank these candidates with our model. In this way, the inference time can be greatly reduced. Figure 4 illustrates the performance curves over the number of candidate movies generated by MF. We only report the Precision@5 and NDCG@5 results due to limited space; the other metrics exhibit a similar trend. As shown in Figure 4, when the number of candidate movies is small, Precision@5 on the Netflix-3M dataset rises gradually as the number of candidates increases. Nevertheless, the performance drops rapidly once the number of candidates grows beyond a certain point, which suggests that the long-tail candidates generated by MF are inaccurate and deteriorate the overall performance of the rerank strategy.
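The two-stage rerank strategy above can be sketched as follows: score all movies cheaply with MF, keep the top C candidates, and let the slower model rescore only those. Here `slow_score` is a stand-in for the RNN-based scorer, and all sizes are illustrative.

```python
# Two-stage retrieval sketch: MF retrieves C candidates, a slower model
# reranks only those C. `slow_score` is a placeholder, sizes illustrative.
import numpy as np

rng = np.random.default_rng(0)
n_items, dim = 1000, 16
user_vec = rng.normal(size=dim)
item_mat = rng.normal(size=(n_items, dim))

def slow_score(i):
    # Placeholder for the expensive sequence model's score.
    return float(item_mat[i] @ user_vec + 0.1 * rng.normal())

def recommend(C=100, topk=5):
    mf_scores = item_mat @ user_vec               # stage 1: MF scores all items
    candidates = np.argsort(-mf_scores)[:C]       # keep the top-C candidates
    slow = {int(i): slow_score(i) for i in candidates}  # stage 2: rescore C only
    return sorted(slow, key=slow.get, reverse=True)[:topk]

print(recommend())
```

The expensive model is thus invoked C times instead of once per movie, and C directly trades inference cost against the risk of MF pruning a relevant movie, which is the behaviour Figure 4 measures.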
5.5. Session Period Sensitivity
The above experimental results show that the session-based (short-term) information indeed improves the performance of top-N recommendation. We conduct an experiment on Netflix-3M to investigate how the session period influences the final recommendation performance. As shown in Figure 5, the purple bars (below) describe the base performance of the MF component, while the red bars (above) represent the extra improvement from LSIC-V4. The RNN plays an insignificant role in the very early sessions, since it lacks enough historical interaction data. For later sessions, the RNN component becomes more effective, and our joint model achieves a clear improvement over the MF-only model.
6. Conclusion
In this paper, we proposed a novel adversarial process for top-N recommendation. Our model incorporates both matrix factorization and a recurrent neural network to exploit the benefits of long-term and short-term knowledge. We also integrated poster information to further improve the performance of movie recommendation. Experiments on two real-life datasets showed the superiority of our model.
References
 Adomavicius and Tuzhilin (2005) Gediminas Adomavicius and Alexander Tuzhilin. 2005. Toward the next generation of recommender systems: A survey of the state-of-the-art and possible extensions. IEEE Transactions on Knowledge and Data Engineering 17, 6 (2005), 734–749.

 Arjovsky et al. (2017) Martin Arjovsky, Soumith Chintala, and Léon Bottou. 2017. Wasserstein generative adversarial networks. In International Conference on Machine Learning. 214–223.
 Bobadilla et al. (2013) Jesus Bobadilla, Fernando Ortega, Antonio Hernando, and Abraham Gutierrez. 2013. Recommender systems survey. Knowledge-Based Systems 46 (2013), 109–132.
 Chen et al. (2017) Tseng-Hung Chen, Yuan-Hong Liao, Ching-Yao Chuang, Wan-Ting Hsu, Jianlong Fu, and Min Sun. 2017. Show, Adapt and Tell: Adversarial Training of Cross-domain Image Captioner. arXiv preprint arXiv:1705.00930 (2017).
 Covington et al. (2016) Paul Covington, Jay Adams, and Emre Sargin. 2016. Deep neural networks for youtube recommendations. In Proceedings of the 10th ACM Conference on Recommender Systems. ACM, 191–198.
 Cremonesi et al. (2010) Paolo Cremonesi, Yehuda Koren, and Roberto Turrin. 2010. Performance of recommender algorithms on top-n recommendation tasks. In Proceedings of the fourth ACM conference on Recommender systems. ACM, 39–46.
 Cui et al. (2016) Qiang Cui, Shu Wu, Qiang Liu, and Liang Wang. 2016. A Visual and Textual Recurrent Neural Network for Sequential Prediction. arXiv preprint arXiv:1611.06668 (2016).
 Devooght and Bersini (2016) Robin Devooght and Hugues Bersini. 2016. Collaborative filtering with recurrent neural networks. arXiv preprint arXiv:1608.07400 (2016).
 Goodfellow et al. (2014) Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. 2014. Generative adversarial nets. In Advances in neural information processing systems. 2672–2680.
 He et al. (2016) Xiangnan He, Hanwang Zhang, Min-Yen Kan, and Tat-Seng Chua. 2016. Fast matrix factorization for online recommendation with implicit feedback. In Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval. ACM, 549–558.
 Hernando et al. (2016) Antonio Hernando, Jesús Bobadilla, and Fernando Ortega. 2016. A non negative matrix factorization for collaborative filtering recommender systems based on a Bayesian probabilistic model. KnowledgeBased Systems 97 (2016), 188–202.
 Hidasi et al. (2015) Balázs Hidasi, Alexandros Karatzoglou, Linas Baltrunas, and Domonkos Tikk. 2015. Session-based recommendations with recurrent neural networks. In ICLR.
 Hochreiter and Schmidhuber (1997) Sepp Hochreiter and Jürgen Schmidhuber. 1997. Long short-term memory. Neural computation 9, 8 (1997), 1735–1780.
 Koren (2008) Yehuda Koren. 2008. Factorization meets the neighborhood: a multifaceted collaborative filtering model. In Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 426–434.
 Koren (2010) Yehuda Koren. 2010. Collaborative filtering with temporal dynamics. Commun. ACM 53, 4 (2010), 89–97.
 Koren et al. (2009) Yehuda Koren, Robert Bell, and Chris Volinsky. 2009. Matrix factorization techniques for recommender systems. Computer 42, 8 (2009).
 Kusner and Hernández-Lobato (2016) Matt J Kusner and José Miguel Hernández-Lobato. 2016. GANS for Sequences of Discrete Elements with the Gumbel-softmax Distribution. arXiv preprint arXiv:1611.04051 (2016).
 Liu and Yang (2008) Nathan N Liu and Qiang Yang. 2008. Eigenrank: a ranking-oriented approach to collaborative filtering. In Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval. ACM, 83–90.
 Lu et al. (2015) Jie Lu, Dianshuang Wu, Mingsong Mao, Wei Wang, and Guangquan Zhang. 2015. Recommender system application developments: a survey. Decision Support Systems 74 (2015), 12–32.
 McNee et al. (2006) Sean M McNee, John Riedl, and Joseph A Konstan. 2006. Being accurate is not enough: how accuracy metrics have hurt recommender systems. In CHI’06 extended abstracts on Human factors in computing systems. ACM, 1097–1101.
 Mirza and Osindero (2014) Mehdi Mirza and Simon Osindero. 2014. Conditional generative adversarial nets. arXiv preprint arXiv:1411.1784 (2014).
 Mnih and Salakhutdinov (2008) Andriy Mnih and Ruslan R Salakhutdinov. 2008. Probabilistic matrix factorization. In Advances in neural information processing systems. 1257–1264.
 Qiang et al. (2013) Runwei Qiang, Feng Liang, and Jianwu Yang. 2013. Exploiting ranking factorization machines for microblog retrieval. In Proceedings of the 22nd ACM international conference on Conference on information & knowledge management. ACM, 1783–1788.
 Reed et al. (2016) Scott Reed, Zeynep Akata, Xinchen Yan, Lajanugen Logeswaran, Bernt Schiele, and Honglak Lee. 2016. Generative adversarial text to image synthesis. In Proceedings of the 33rd International Conference on International Conference on Machine Learning. JMLR.org, 1060–1069.

 Rendle et al. (2009) Steffen Rendle, Christoph Freudenthaler, Zeno Gantner, and Lars Schmidt-Thieme. 2009. BPR: Bayesian personalized ranking from implicit feedback. In Proceedings of the twenty-fifth conference on uncertainty in artificial intelligence. AUAI Press, 452–461.
 Rennie and Srebro (2005) Jasson DM Rennie and Nathan Srebro. 2005. Fast maximum margin matrix factorization for collaborative prediction. In Proceedings of the 22nd international conference on Machine learning. ACM, 713–719.
 Srebro et al. (2005) Nathan Srebro, Jason Rennie, and Tommi S Jaakkola. 2005. Maximum-margin matrix factorization. In Advances in neural information processing systems. 1329–1336.
 Sutton et al. (2000) Richard S Sutton, David A McAllester, Satinder P Singh, and Yishay Mansour. 2000. Policy gradient methods for reinforcement learning with function approximation. In Advances in neural information processing systems. 1057–1063.
 Wang et al. (2017) Jun Wang, Lantao Yu, Weinan Zhang, Yu Gong, Yinghui Xu, Benyou Wang, Peng Zhang, and Dell Zhang. 2017. IRGAN: A Minimax Game for Unifying Generative and Discriminative Information Retrieval Models. In Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval. 515–524.
 Wu et al. (2016) Caihua Wu, Junwei Wang, Juntao Liu, and Wenyu Liu. 2016. Recurrent neural network based recommendation for time heterogeneous feedback. Knowledge-Based Systems 109 (2016), 90–103.
 Wu et al. (2016) Chao-Yuan Wu, Amr Ahmed, Alex Beutel, and Alexander J Smola. 2016. Joint Training of Ratings and Reviews with Recurrent Recommender Networks. ICLR (2016).
 Wu et al. (2017) Chao-Yuan Wu, Amr Ahmed, Alex Beutel, Alexander J Smola, and How Jing. 2017. Recurrent recommender networks. In Proceedings of the Tenth ACM International Conference on Web Search and Data Mining. ACM, 495–503.
 Yu et al. (2017) Lantao Yu, Weinan Zhang, Jun Wang, and Yong Yu. 2017. SeqGAN: Sequence Generative Adversarial Nets with Policy Gradient.. In AAAI. 2852–2858.
 Yuan et al. (2016) Fajie Yuan, Guibing Guo, Joemon M Jose, Long Chen, Haitao Yu, and Weinan Zhang. 2016. Lambdafm: learning optimal ranking with factorization machines using lambda surrogates. In Proceedings of the 25th ACM International on Conference on Information and Knowledge Management. ACM, 227–236.
 Zhao et al. (2016) Lili Zhao, Zhongqi Lu, Sinno Jialin Pan, and Qiang Yang. 2016. Matrix Factorization+ for Movie Recommendation.. In IJCAI. 3945–3951.