NAIRS: A Neural Attentive Interpretable Recommendation System

02/20/2019 ∙ by Shuai Yu, et al.

In this paper, we develop a neural attentive interpretable recommendation system, named NAIRS. A self-attention network, as a key component of the system, is designed to assign attention weights to interacted items of a user. This attention mechanism can distinguish the importance of the various interacted items in contributing to a user profile. Based on the user profiles obtained by the self-attention network, NAIRS offers personalized high-quality recommendation. Moreover, it develops visual cues to interpret recommendations. This demo application with the implementation of NAIRS enables users to interact with a recommendation system, and it persistently collects training data to improve the system. The demonstration and experimental results show the effectiveness of NAIRS.




1 Introduction

With the huge volumes of online information, attention has been continuously paid to recommender systems [23, 25]. Item-based collaborative filtering (CF) is one of the most successful techniques in practice due to its simplicity, accuracy, and scalability [13, 21, 16]. It profiles a user with the historically interacted items and recommends similar items in terms of user profiles.

Most of the existing item-based CF methods utilize statistical measures (e.g., cosine similarity) to estimate item similarities. However, these measures usually assume equal weights for the items [13]. In other words, the items in a user's historical list are treated equally, which does not hold in many real-world recommendation applications. On the other hand, there is increasing interest in interpretable recommendations, which explain the underlying reasons for a user's potential interest in the recommended items. Traditional methods often generate explanations from textual data such as the content and reviews associated with the items [30, 5, 2, 4]. Yet, generating reasons for a recommendation remains an unsolved problem when such texts are unavailable.

Inspired by the recent successes of attention-based neural networks in computer vision and natural language processing, this paper proposes a neural attentive interpretable recommendation system (NAIRS) to alleviate the aforementioned limitations. The key to the design of NAIRS is a self-attention network that computes the attention weights of the historical items in a user profile according to their intent importance associated with the user's preferences. With the learned attention weights, NAIRS provides high-quality personalized recommendations to users according to their historical preferences. Meanwhile, it interprets the reasons for each recommendation by visualizing the learned attention weights over the user's historical list. Personalized and interpretable recommendation helps users and manufacturers identify results of interest and explore alternative choices more efficiently. In addition, NAIRS enables users to search for other users with similar preferences and for items that are similar to a chosen item. These two functions help users discover more potentially interesting items. Furthermore, NAIRS actively records users' interactive behaviors in the system, such as their input queries, liked items, and clicked results.

2 Related Work

Recommender systems are an active research field. The authors of [3, 15] survey most of the existing techniques for recommender systems. In this section, we briefly review the major approaches that are related to our work.

2.1 Item-based Collaborative Filtering

Item-based collaborative filtering [20] is one of the most successful techniques in the practice of recommendation due to its simplicity and attractive accuracy. The main idea behind item-based CF is that the prediction of a user u on a target item i depends on the similarity of i to all items the user has interacted with in the past. In [17], the authors proposed a method named SLIM (short for Sparse Linear Method), which learned item similarities by optimizing a recommendation-aware objective function. SLIM minimized the loss between the original user-item interaction matrix and the one reconstructed from the item-based CF model. FISM [13] is one of the most widely used collaborative filtering methods and achieved state-of-the-art performance among the item-based methods. In its standard setting, the prediction of a user for an item is calculated from the inner products of the historical items and the target item. [24] proposed an attention-based transaction embedding (ATEM) model. It is a shallow wide-in-wide-out neural network, which learns an attentive context embedding that is expected to be most relevant to the next choice over all the observed items in a transaction. [8] leveraged historical items as the attention source to calculate the relationship between the historical items and a newly arriving item. [29] proposed an integrated network to combine non-linear transformation with latent factors.

2.2 Deep Learning for Recommender Systems

Traditional matrix factorization (MF) methods for recommender systems are based on the assumption that user interests and movie attributes are nearly static, which is not consistent with reality. [14] discussed the effect of temporal dynamics in recommender systems and proposed a temporal extension of SVD++ (called TimeSVD++) to explicitly model the temporal bias in data. However, the features used in TimeSVD++ were hand-crafted and computationally expensive to obtain. Recently, there has been increasing interest in employing recurrent neural networks to model temporal dynamics in recommender systems. For example, [11] applied a recurrent neural network (i.e., GRU) to session-based recommender systems. This work treated the first item a user clicked as the initial input of the GRU. Each follow-up click of the user would then trigger a recommendation depending on all of the previous clicks. [26] proposed a recurrent neural network to perform time heterogeneous feedback recommendation. [27] used an LSTM autoregressive model for the user and movie dynamics and employed matrix factorization to model the stationary components that encode fixed properties.

To address the cold start problem in recommendation, [6] presented a visual and textual recurrent neural network (VT-RNN), which simultaneously learned the sequential latent vectors of users' interests and captured the content-based representations that help address the cold-start issue.

2.3 Attention Network For Recommendation

Attention mechanisms have recently become an essential part of deep neural networks, equipping a network with the ability to focus on a subset of its inputs (or features). The self-attention network [22] is a special case of the attention mechanism, which uses a token embedding from the source input itself as the attention source to calculate the distributed representation of the input sequence. It relates elements at different positions of a single input sequence by computing the attention between each pair of tokens. The self-attention mechanism has proven highly expressive for modeling both long-range and local dependencies of the input sequence. Recently, self-attention has achieved remarkable success in a variety of tasks, such as reading comprehension and neural machine translation [22]. ATRank [31] models heterogeneous user behaviors using only the attention model; behavior interactions are captured using self-attention in multiple semantic spaces. User preferences often evolve over time, so modeling their temporal dynamics is essential for recommendation. [18] proposed the Interacting Attention-gated Recurrent Network (IARN) to accommodate temporal context for better recommendation. IARN can not only accurately measure the relevance of individual time steps of user and item history for recommendation, but also capture the dependencies between user and item dynamics in shaping user-item interactions.

The studies most similar to ours are [18, 28, 31, 7]. [28] proposed a novel two-layer hierarchical attention network (SHAN) to recommend the next item the user might be interested in. Specifically, the first attention layer learns the user's long-term preferences based on the representations of historically purchased items, and the second layer outputs the final user representation by coupling the user's long-term and short-term preferences. [31] proposed an attention-based user behavior modeling framework (ATRank). The model considers heterogeneous user behaviors and projects all types of behaviors into multiple latent semantic spaces, where the behaviors can influence each other via self-attention. [7] employed an item-side interactive neural attention network (NAIS), which assigns different weights to historical items. Our model differs from NAIS in several aspects. First, we employ a self-attention mechanism to learn the representations of users, instead of calculating the attention scores with respect to specific items, as in [7]. Second, in practice, we can recommend items in real time based on the pre-computed representations of users, while [7] needs to calculate attention weights every time. In addition, NAIRS can provide user profiles, which play a crucial role in a broad range of applications. More importantly, this demonstration paper provides a live demonstration and prototype of the interpretable recommendation system.

3 Core Algorithm

We consider a user-item interaction matrix of size M × N, where M and N are the number of users and items, respectively. We also define the set of observed user-item pairs and, for each user u, the set of items that u has interacted with. As described in [13], each item has two embedding vectors p and q to distinguish its role as a history item and as a prediction target.

Figure 1: Architecture overview of NAIRS.

FISM [13] is one of the most widely used collaborative filtering methods and achieves state-of-the-art performance among the item-based methods. In its standard setting, the prediction of a user u for an item i can be calculated as below:


where the formula includes a user bias and an item bias in addition to the interaction term.
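To make the equal-weight setting concrete, the following is a minimal Python sketch of a FISM-style score. It is an illustration under stated assumptions, not the formula from this paper: the function name `fism_style_score`, the toy embeddings, and the choice to exclude the target item from the history are hypothetical, and FISM as published in [13] additionally applies a history-length normalization term that is omitted here.

```python
from typing import Dict, List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Inner product of two equally sized vectors."""
    return sum(x * y for x, y in zip(a, b))


def fism_style_score(
    history: List[int],
    target: int,
    p: Dict[int, Vector],
    q: Dict[int, Vector],
    b_user: float = 0.0,
    b_item: float = 0.0,
) -> float:
    """Equal-weight item-based score (illustrative assumption).

    Sums the inner products between the history-role embedding p[j] of
    every historical item j and the target-role embedding q[target],
    then adds the user and item biases. Every historical item
    contributes with the same weight of 1.
    """
    interaction = sum(dot(p[j], q[target]) for j in history if j != target)
    return b_user + b_item + interaction


# Toy embeddings (made up for illustration only).
p = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 1.0]}
q = {0: [0.5, 0.5], 1: [1.0, 0.0], 2: [0.0, 2.0]}
score = fism_style_score(history=[0, 1], target=2, p=p, q=q)
```

In this sketch both historical items enter the sum unweighted, which is the behavior the attention mechanism is meant to relax.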

Despite the effectiveness of FISM, we argue that its performance is hindered by assigning equal weight to each interacted item. To address this limitation, we propose a neural attentive network to assign different weights to the items according to their intent importance. Mathematically, the prediction of user u for item i can be calculated as


where each historical item j receives an attention weight that reflects its contribution to the representation of user u. Specifically, we exploit self-attention to learn the representation of user u, in which the historical items learn to align to each other. The weight of each historical item j is computed by


where an alignment model scores the contribution of item j to the representation of user u. To form a proper probability distribution over the items, we normalize the scores across the items using the softmax function and obtain the attention scores. The formula also contains a smoothing hyper-parameter that will be discussed later in this section, two weight matrices, and an activation function.

In practice, the standard attention network fails to learn from users' historical data and to perform accurate recommendation. By analyzing the attention weights output by the model, we find that the performance of the model is largely hindered by the softmax function, due to the large variance in the lengths of user histories. The attention weights of the items from a long history list are greatly decreased. To address this problem, we introduce a smoothing hyper-parameter on the denominator of the original attention formula. (Footnote 1: This smoothing method is similar to that in [7]. However, we did this work independently, and this work, which was initially submitted to SIGIR-18, was done before the publication of [7].) For one particular setting of this hyper-parameter, Eq. (3) degenerates into the original softmax. One typically chooses its value between zero and one. This smoothed setting leads to much better performance than the standard softmax function.
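The effect of smoothing can be illustrated with a short Python sketch. It rests on several assumptions that are not confirmed by the text above: the smoothing hyper-parameter is taken to be an exponent `beta` on the softmax denominator, following the scheme of [7], so that `beta = 1` recovers the ordinary softmax; the alignment model is taken to be a hypothetical two-layer form with a ReLU activation; and all names (`alignment_score`, `smoothed_attention`, `attentive_score`) are illustrative.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def alignment_score(p_j: Vector, w1: List[Vector], w2: Vector) -> float:
    """Hypothetical alignment model: w2 . relu(w1 @ p_j)."""
    hidden = [max(0.0, sum(w * x for w, x in zip(row, p_j))) for row in w1]
    return sum(w * h for w, h in zip(w2, hidden))


def smoothed_attention(scores: Sequence[float], beta: float) -> List[float]:
    """Softmax whose denominator is raised to the power beta (assumption).

    beta = 1 gives the ordinary softmax; beta < 1 shrinks the denominator
    so that items in long histories keep larger weights.
    """
    exps = [math.exp(s) for s in scores]
    denom = sum(exps) ** beta
    return [e / denom for e in exps]


def attentive_score(history_p: List[Vector], q_i: Vector, w1: List[Vector],
                    w2: Vector, beta: float,
                    b_user: float = 0.0, b_item: float = 0.0) -> float:
    """Biases plus the attention-weighted sum of inner products p_j . q_i."""
    scores = [alignment_score(p_j, w1, w2) for p_j in history_p]
    weights = smoothed_attention(scores, beta)
    interaction = sum(
        a * sum(x * y for x, y in zip(p_j, q_i))
        for a, p_j in zip(weights, history_p)
    )
    return b_user + b_item + interaction


# With identical scores, ordinary softmax gives each item 1 / (history length).
short_history = smoothed_attention([0.0] * 4, beta=1.0)
long_history = smoothed_attention([0.0] * 100, beta=1.0)
long_history_smoothed = smoothed_attention([0.0] * 100, beta=0.5)
```

Under these assumptions, an item in a history of 100 equally scored items receives a far smaller weight than one in a history of 4 when `beta = 1`, and lowering `beta` narrows that gap. Note that with `beta < 1` the weights no longer sum to one.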

Following the strategy in the previous work [9], we treat the observations as positive instances and randomly sample the unobserved items as negative instances. Cross entropy is adopted as the objective function, which minimizes the regularized log loss:


where the loss depends on the number of training instances and the regularization term is applied to the parameters of the model.
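The training objective described above can be sketched in Python as follows. This is an assumed form, not the equation from this paper: the loss is taken to be the binary cross entropy averaged over positive and sampled negative instances, with an L2 penalty on a flat list of parameters, and the names `sample_negatives`, `regularized_log_loss`, and `lam` are hypothetical.

```python
import math
import random
from typing import List, Sequence, Set


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def sample_negatives(observed: Set[int], num_items: int, k: int,
                     rng: random.Random) -> List[int]:
    """Randomly draw k unobserved item ids to serve as negative instances."""
    candidates = [i for i in range(num_items) if i not in observed]
    return rng.sample(candidates, k)


def regularized_log_loss(pos_scores: Sequence[float],
                         neg_scores: Sequence[float],
                         params: Sequence[float],
                         lam: float) -> float:
    """Mean binary cross entropy plus an L2 penalty (assumed form).

    Positive instances are pushed towards probability 1 and sampled
    negatives towards probability 0.
    """
    n = len(pos_scores) + len(neg_scores)
    log_likelihood = sum(math.log(sigmoid(s)) for s in pos_scores)
    log_likelihood += sum(math.log(1.0 - sigmoid(s)) for s in neg_scores)
    return -log_likelihood / n + lam * sum(w * w for w in params)


negatives = sample_negatives({0, 1}, num_items=5, k=2, rng=random.Random(0))
loss = regularized_log_loss([0.0], [0.0], params=[1.0, 2.0], lam=0.1)
```

A raw score of zero corresponds to a predicted probability of 0.5, so the data term in the example equals ln 2 and the penalty adds 0.1 × (1 + 4).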

4 System Architecture

The proposed NAIRS, overviewed in Figure 1, consists of five main modules. (1) The Data crawler module collects user interaction information from various websites such as Amazon, Jingdong, and IMDB. (2) The Recommendation module produces recommendation results and interpretable partial scores of user-item pairs. (3) The Interpretation module visualizes the interpretable reasons for the recommendation by scoring the user's historical list. (4) The Retrieval module enables the users to i) find people with similar preferences (together with their historical lists) and ii) explore the items that are similar to a user-specified item. This module effectively assists the users in finding more items in which they may be interested. (5) The Logging module collects user behaviors from the system, such as chosen items. The logging information is utilized to further improve the recommender system. In the rest of this section, we elaborate on each of these modules.

Figure 2: The Neural Attentive Interpretable Recommendation System. The top part shows the Interpretable Recommendation module, and the bottom part shows the Retrieval module. The user's historically interacted movies are displayed in the tag cloud. When the user clicks a movie in the recommendation list, the related movies in the tag cloud become bigger. The user can either search for similar items in the query box or click a movie's name in the tag cloud. The user can also click the link below the user logo to explore the users who have similar interests to her/him.

4.1 Data Crawler Module

The data crawler collects three types of user-item interaction data: (1) movie rating data from IMDB; (2) book rating data from Amazon; and (3) daily goods rating data from Jingdong. For the movie rating data, users are selected at random for inclusion, and all selected users have rated at least one movie. For the book rating data, we focus on the top 1000 popular books. We also collect six categories of daily goods: clothes, shoes, cosmetics, foods, toys, and smart phones. In this work, all user-sensitive information is removed.

4.2 Recommendation Module

We implement the interpretable recommendation algorithm introduced in Section 3 to perform top-N item recommendation. Our recommendation model is implemented with the TensorFlow library and trained on an NVIDIA Titan Xp GPU. After training, we obtain the representations of each user and item, which are then used to predict the rating scores and to assign weights to the items in a user's historical list for interpretation. In addition, similar users and similar items can be obtained easily from the learned user and item representations. The results learned by the Recommendation module can be used directly by the Interpretation module and the Retrieval module.

Note that during the bootstrap process NAIRS provides users a navigation page in which the users can choose the items that they are interested in. This process can alleviate the cold start problem in recommendation to some extent, especially for new users.

4.3 Interpretation Module

Given a user u, the user's historical items, and a recommended item i, the Interpretation module provides the top-N recommendation results and interprets the reasons for the recommendation by visualizing the attention scores over user u's historical list. In particular, users can add movies of interest to their profile list or delete the movies they do not like. NAIRS then shows the recommendation results (on the right of the interface) and interprets the reason for each recommendation with a tag cloud (in the center of the interface), as shown in Figure 2. The importance of each item in the user's historical list is shown with various font sizes: the larger the item name, the more the item contributes to the recommendation. For example, the movie Men in Black is recommended based on Nikata in the user's historical list, which has the highest attention score; both movies belong to the action movie category. On the other hand, the movie Escape from New York contributes little to the recommendation of In the Army Now, since they belong to different categories.
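One simple way to turn attention scores into tag-cloud font sizes is a linear rescaling, sketched below in Python. The text does not specify how NAIRS maps scores to sizes, so the function `font_sizes`, the pixel bounds, and the placeholder item names and weights are all hypothetical.

```python
from typing import Dict


def font_sizes(weights: Dict[str, float],
               min_px: float = 12.0,
               max_px: float = 36.0) -> Dict[str, float]:
    """Linearly rescale attention weights to font sizes (illustrative).

    The item with the largest weight is drawn at max_px and the item
    with the smallest weight at min_px. If all weights are equal, every
    item gets the midpoint size.
    """
    lowest, highest = min(weights.values()), max(weights.values())
    span = highest - lowest
    sizes = {}
    for name, weight in weights.items():
        fraction = 0.5 if span == 0 else (weight - lowest) / span
        sizes[name] = min_px + fraction * (max_px - min_px)
    return sizes


# Placeholder attention weights for one recommended item (made up).
sizes = font_sizes({"item_a": 0.6, "item_b": 0.3, "item_c": 0.1})
```

Recomputing the sizes whenever the user clicks a different recommended item would reproduce the behavior described for the tag cloud, where related historical items become bigger.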

4.4 Retrieval Module

4.4.1 Similar Users

The Similar Users module helps end users find other users who have similar interests. This module plays an important role in helping users who might not know exactly what they are looking for to discover potentially interesting items, based on the observation that people who agreed in the past are likely to agree again. To account for differences in average values, we calculate the similarity between users with the adjusted cosine similarity as follows:


where the two mean terms are the average values over each user's embedding dimensions. Note that we map the value space of the similarity from [-1, 1] to [0, 2] to provide positive similarity scores for better visualization. As shown in Figure 2, we visualize the similar users with their historical lists. In addition, we provide "like" and "dislike" buttons for users to select or filter the displayed items. This feedback information can be used to update our Recommendation module. In NAIRS, the calculated similarities are cached after the model is updated to speed up result retrieval.
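The following Python sketch shows one reading of this computation, assuming that each user's embedding is centered by its own mean over the embedding dimensions before the cosine is taken, and that the mapping from [-1, 1] to [0, 2] is a shift by one. The function names and the zero-vector fallback are hypothetical.

```python
import math
from typing import Sequence


def adjusted_cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity after subtracting each vector's own mean."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    du = [x - mean_u for x in u]
    dv = [y - mean_v for y in v]
    norm = math.sqrt(sum(x * x for x in du)) * math.sqrt(sum(y * y for y in dv))
    if norm == 0.0:
        return 0.0  # Fallback for constant embeddings (assumption).
    return sum(x * y for x, y in zip(du, dv)) / norm


def display_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Shift the similarity from [-1, 1] to [0, 2] for display (assumption)."""
    return adjusted_cosine(u, v) + 1.0


same_direction = display_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
opposite = display_similarity([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])
```

Because the means are removed first, two users whose embeddings differ only by a constant offset are treated as identical, which plain cosine similarity would not guarantee.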

4.4.2 Similar Items

Intuitively, a user is likely to have a similar level of interest in similar items. The Similar Items module finds items similar to the items liked or chosen by the user. In particular, we provide the end user with a search window for searching any item in the system. The items whose similarities to the queried item are above a threshold are then returned as the search results. The similarity between items is calculated as follows:


As in the Similar Users component, the similarities between items are cached in the system. When the end user requests similar items, the results can be returned promptly.
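The retrieval flow described here (pre-compute similarities, cache them, and filter by a threshold at query time) can be sketched in Python as below. The item similarity formula is not recoverable from the text, so plain cosine similarity over item embeddings is used as a stand-in, and the names `build_similarity_cache` and `similar_items` as well as the toy embeddings are hypothetical.

```python
import math
from typing import Dict, List, Optional, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 0.0 if norm == 0.0 else sum(x * y for x, y in zip(a, b)) / norm


def build_similarity_cache(
    item_vecs: Dict[str, Sequence[float]]
) -> Dict[str, List[Tuple[float, str]]]:
    """Pre-compute, for every item, its neighbours sorted by similarity."""
    cache = {}
    for i, vec_i in item_vecs.items():
        neighbours = [(cosine(vec_i, vec_j), j)
                      for j, vec_j in item_vecs.items() if j != i]
        cache[i] = sorted(neighbours, reverse=True)
    return cache


def similar_items(cache: Dict[str, List[Tuple[float, str]]],
                  item_id: str, threshold: float) -> Optional[List[str]]:
    """Return cached neighbours above the threshold.

    Returns None for an unknown item so the caller can show a warning.
    """
    if item_id not in cache:
        return None
    return [j for score, j in cache[item_id] if score >= threshold]


# Toy item embeddings (made up for illustration only).
cache = build_similarity_cache({"a": [1.0, 0.0], "b": [1.0, 0.1], "c": [0.0, 1.0]})
result = similar_items(cache, "a", threshold=0.9)
```

Rebuilding the cache only after a model update keeps query-time work to a dictionary lookup and a filter.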

5 Demonstration

5.1 Demonstration Setup

The NAIRS prototype consists of a client end and a server end. Clients access the system through the web; the client mainly renders recommendation, interpretation, search, and query results. The server is deployed on Apache Tomcat; it runs the recommendation algorithm and communicates with the clients.

5.2 Walkthrough Example

The NAIRS demo consists of the following steps:

Step 1: The user can access the system through the web on either a PC or a smart phone. After logging onto the system, the user can select a recommendation service from three categories: movies, books, and daily goods.

Step 2: If the user is new to the system, a collection of randomly chosen items is presented, and the user is asked to choose some items of interest. After the user submits the chosen items, the system offers a top-10 recommendation list based on the chosen items. Furthermore, the system shows a tag cloud of the user profile, which reveals why the system recommends the specific items to the user. The user can click any item in the recommendation list, and the user profile tag cloud changes accordingly.

Step 3: The user can query other users that have similar interests by clicking the “similar users” button. Then the similar users with their historical lists are returned to the user, and the user can choose to follow them and find the potentially interesting items via this function.

Step 4: The user can also search for items similar to an item entered by the user. If the system contains the item the user searches for, similar items are returned; otherwise, a warning message is shown. Note that, to enhance the user experience, we implement an auto-suggestion query box.

6 Quantitative Evaluation

In this section, we evaluate the performance of NAIRS quantitatively, then we investigate the interpretation of the proposed system.

Figure 3: Performance comparison on (a) Movielens HR, (b) Movielens NDCG, (c) Pinterest HR, and (d) Pinterest NDCG.

We conduct experiments on two widely used datasets, Movielens-1M and Pinterest, which are the ones used in [9]. The results are judged with hit ratio (HR) and Normalized Discounted Cumulative Gain (NDCG), which have been widely used in top-N recommendation [13, 9]. NAIRS is compared with several baseline methods including MF-BPR [19], MF-eALS [10], FISM [13], and MLP [9].

The experimental results are shown in Figure 3. We observe that our method outperforms the other competitive methods on both datasets, which shows the effectiveness of the proposed approach for the top-N recommendation used in the demonstration.
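For reference, the two metrics can be computed as in the following Python sketch. It assumes the common leave-one-out protocol in which each user has a single held-out target item, so NDCG reduces to a single discounted term; the text above does not spell out the protocol, and the function names and toy ranking are hypothetical.

```python
import math
from typing import List


def hit_ratio_at_k(ranked: List[int], target: int, k: int) -> float:
    """1.0 if the held-out target appears in the top-k list, else 0.0."""
    return 1.0 if target in ranked[:k] else 0.0


def ndcg_at_k(ranked: List[int], target: int, k: int) -> float:
    """NDCG with a single relevant item: 1 / log2(rank + 2), rank 0-based."""
    top_k = ranked[:k]
    if target not in top_k:
        return 0.0
    return 1.0 / math.log2(top_k.index(target) + 2)


# Toy ranking of item ids for one user (made up for illustration).
ranking = [5, 3, 9, 1]
hr = hit_ratio_at_k(ranking, target=3, k=3)
ndcg = ndcg_at_k(ranking, target=3, k=3)
```

Averaging both values over all test users gives the dataset-level HR and NDCG; HR only checks presence in the list, while NDCG also rewards placing the target nearer the top.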


References

  • [1] Bahdanau, D., Cho, K., Bengio, Y.: Neural machine translation by jointly learning to align and translate. ICLR (2015)
  • [2] Bauman, K., Liu, B., Tuzhilin, A.: Aspect based recommendations: Recommending items with the most valuable aspects based on user reviews. In: Proceedings of SIGKDD (2017)
  • [3] Bobadilla, J., Ortega, F., Hernando, A., Gutierrez, A.: Recommender systems survey. Knowledge-based systems 46, 109–132 (2013)
  • [4] Chelliah, M., Sarkar, S.: Product recommendations enhanced with reviews. In: Proceedings of RecSys (2017)
  • [5] Chen, L., Wang, F.: Explaining recommendations based on feature sentiments in product reviews. In: Proceedings of IUI (2017)
  • [6] Cui, Q., Wu, S., Liu, Q., Wang, L.: A visual and textual recurrent neural network for sequential prediction. arXiv preprint arXiv:1611.06668 (2016)
  • [7] He, X., He, Z., Song, J., Liu, Z., Jiang, Y.G., Chua, T.S.: Nais: Neural attentive item similarity model for recommendation. IEEE TKDE (2018)
  • [8] He, X., He, Z., Song, J., Liu, Z., Jiang, Y.G., Chua, T.S.: Nais: Neural attentive item similarity model for recommendation. IEEE Transactions on Knowledge and Data Engineering (2018)
  • [9] He, X., Liao, L., Zhang, H., Nie, L., Hu, X., Chua, T.S.: Neural collaborative filtering. In: WWW. pp. 173–182 (2017)
  • [10] He, X., Zhang, H., Kan, M.Y., Chua, T.S.: Fast matrix factorization for online recommendation with implicit feedback. In: SIGIR. pp. 549–558. ACM (2016)
  • [11] Hidasi, B., Karatzoglou, A., Baltrunas, L., Tikk, D.: Session-based recommendations with recurrent neural networks. In: ICLR (2015)
  • [12] Hu, M., Peng, Y., Qiu, X.: Reinforced mnemonic reader for machine comprehension. CoRR, abs/1705.02798 (2017)
  • [13] Kabbur, S., Ning, X., Karypis, G.: Fism: factored item similarity models for top-n recommender systems. In: SIGKDD. pp. 659–667. ACM (2013)
  • [14] Koren, Y.: Collaborative filtering with temporal dynamics. Communications of the ACM 53(4), 89–97 (2010)
  • [15] Lu, J., Wu, D., Mao, M., Wang, W., Zhang, G.: Recommender system application developments: a survey. Decision Support Systems 74, 12–32 (2015)
  • [16] Ning, X., Karypis, G.: SLIM: sparse linear methods for top-n recommender systems. In: Proceedings of ICDM (2011)
  • [17] Ning, X., Karypis, G.: Slim: Sparse linear methods for top-n recommender systems. In: ICDM. pp. 497–506. IEEE (2011)
  • [18] Pei, W., Yang, J., Sun, Z., Zhang, J., Bozzon, A., Tax, D.M.: Interacting attention-gated recurrent networks for recommendation. In: CIKM. pp. 1459–1468. ACM (2017)
  • [19] Rendle, S., Freudenthaler, C., Gantner, Z., Schmidt-Thieme, L.: Bpr: Bayesian personalized ranking from implicit feedback. In: UAI. pp. 452–461. AUAI Press (2009)
  • [20] Sarwar, B., Karypis, G., Konstan, J., Riedl, J.: Item-based collaborative filtering recommendation algorithms. In: WWW. pp. 285–295. ACM (2001)
  • [21] Sarwar, B.M., Karypis, G., Konstan, J.A., Riedl, J.: Item-based collaborative filtering recommendation algorithms. In: Proceedings of WWW (2001)
  • [22] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: NIPS. pp. 6000–6010 (2017)
  • [23] Wang, J., Yu, L., Zhang, W., Gong, Y., Xu, Y., Wang, B., Zhang, P., Zhang, D.: Irgan: A minimax game for unifying generative and discriminative information retrieval models. In: Proceedings of SIGIR. ACM (2017)
  • [24] Wang, S., Hu, L., Cao, L., Huang, X., Lian, D., Liu, W.: Attention-based transactional context embedding for next-item recommendation. AAAI (2018)
  • [25] Wang, X., Yu, L., Ren, K., Tao, G., Zhang, W., Yu, Y., Wang, J.: Dynamic attention deep model for article recommendation by learning human editors’ demonstration. In: Proceedings of SIGKDD. ACM (2017)
  • [26] Wu, C., Wang, J., Liu, J., Liu, W.: Recurrent neural network based recommendation for time heterogeneous feedback. Knowledge-Based Systems 109, 90–103 (2016)
  • [27] Wu, C.Y., Ahmed, A., Beutel, A., Smola, A.J., Jing, H.: Recurrent recommender networks. In: WSDM. pp. 495–503. ACM (2017)
  • [28] Ying, H., Zhuang, F., Zhang, F., Liu, Y., Xu, G., Xie, X., Xiong, H., Wu, J.: Sequential recommender system based on hierarchical attention networks. In: IJCAI (2018)
  • [29] Zhang, S., Yao, L., Sun, A., Wang, S., Long, G., Dong, M.: Neurec: On nonlinear transformation for personalized ranking. In: IJCAI. pp. 3669–3675 (2018)
  • [30] Zhang, Y., Lai, G., Zhang, M., Zhang, Y., Liu, Y., Ma, S.: Explicit factor models for explainable recommendation based on phrase-level sentiment analysis. In: ACM SIGIR (2014)

  • [31] Zhou, C., Bai, J., Song, J., Liu, X., Zhao, Z., Chen, X., Gao, J.: Atrank: An attention-based user behavior modeling framework for recommendation. In: AAAI (2018)