1 Introduction
Online retail is a growing market with sales accounting for $394.9 billion or 11.7% of total US retail sales in 2016. In the same year, e-commerce sales accounted for 41.6 percent of all retail sales growth. For some entertainment products such as movies, books, and music, online retailers have long outperformed traditional in-store retailers. One of the driving forces of this success is the ability of online retailers to collect purchase histories of customers, online shopping behavior, and reviews of products for a very large number of users. This data is driving several machine learning applications in online retail, of which personalized recommendation is the most important one. With recommender systems, online retailers can provide personalized product recommendations and anticipate purchasing behavior.
In addition, the availability of product reviews allows users to make more informed purchasing choices and companies to analyze customer sentiment towards their products. The latter is known as sentiment analysis and is concerned with machine learning approaches that map written text to scores. Nevertheless, even the best sentiment analysis methods cannot help in determining which new products a customer might be interested in. The obvious reason is that customer reviews are not available for products they have not purchased yet.
In recent years the availability of large corpora of product reviews has driven text-based research in the recommender system community (e.g. [21, 19, 3]). Some of these novel methods extend latent factor models to leverage review text by employing an explicit mapping from text to either user or item factors. At prediction time, these models predict product ratings based on some operation (typically the dot product) applied to the user and product representations. Sentiment analysis, however, is usually applied to some representation (e.g. bag-of-words) of review text, but in a recommender system scenario the review is not available at prediction time.
With this paper we propose TransRev, a method that combines a personalized recommendation learning objective with a sentiment analysis objective into a joint learning objective. TransRev learns vector representations for users, items, and reviews jointly. The crucial advantage of TransRev is that the review embedding is learned such that it corresponds to a translation that moves the embedding of the reviewing user to the embedding of the item the review is about. This allows TransRev to approximate a review embedding at test time as the difference of the item and user embedding despite the absence of a review from the user for that item. The approximated review embedding is then used in the sentiment analysis model to predict the review score. Moreover, the approximated review embedding can be used to retrieve reviews in the training set deemed most similar by a distance measure in the embedding space. These retrieved reviews could be used for several purposes. For instance, such reviews could be provided to users as a starting point for a review, lowering the barrier to writing reviews.
We performed an extensive set of experiments to evaluate the performance of TransRev on standard recommender system data sets. TransRev outperforms state of the art methods on 15 of the 19 data sets. Moreover, we qualitatively compare actual reviews with the retrieved ones by TransRev based on a similarity metric in the review embedding space. Finally, we discuss some weaknesses of TransRev and possible future research directions.
2 TransRev: Modeling Reviews as Translations in Vector Space
We address the problem of learning prediction models for the product recommendation problem. There is a set of users $\mathcal{U}$, a set of items $\mathcal{I}$, and a set of reviews $\mathcal{R}$. Each $rev_{(u,i)} \in \mathcal{R}$ represents a review written by user $u$ for item $i$. Hence, $rev_{(u,i)} = [t_1, \dots, t_n]$, that is, each review is a sequence of $n$ tokens. In the following we refer to $(u, rev_{(u,i)}, i)$ as a triple. Each such triple is associated with the review score $r_{(u,i)}$ given by the user $u$ to item $i$.
TransRev embeds all users, items and reviews into a latent space where the embedding of a user plus the embedding of the review is learned to be close to the embedding of the reviewed item. It simultaneously learns a regression model to predict the rating given a review text. At prediction time, reviews are not available, but the modeling assumption of TransRev allows us to predict the review embedding by taking the difference of the embeddings of the item and the user. This approximation is then used as the input feature of the regression model to perform rating prediction.
TransRev embeds all nodes and reviews into a latent space $\mathbb{R}^k$ ($k$ is a model hyperparameter). The review embeddings are computed by applying a learnable function $f$ to the token sequence of the review, $\mathbf{h}_{rev_{(u,i)}} = f(rev_{(u,i)})$. The function $f$ can be parameterized (typically with a neural network such as a recursive or convolutional neural network) but it can also be a simple parameter-free aggregation function that computes, for instance, the element-wise average or maximum of the token embeddings.
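We propose and evaluate a simple instance of $f$ where the review embedding $\mathbf{h}_{rev_{(u,i)}}$ is the average of the embeddings of the tokens occurring in the review. More formally,
$$\mathbf{h}_{rev_{(u,i)}} = f(rev_{(u,i)}) = \frac{1}{|rev_{(u,i)}|} \sum_{t \in rev_{(u,i)}} \mathbf{v}_t + \mathbf{h}_0$$ (1)
where $\mathbf{v}_t$ is the embedding associated with token $t$ and $\mathbf{h}_0$ is a review bias which is common to all reviews and takes values in $\mathbb{R}^k$. The review bias is of importance since there are some reviews none of whose tokens are in the training vocabulary. In these cases we have $\mathbf{h}_{rev_{(u,i)}} = \mathbf{h}_0$.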
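The averaging function of Equation (1) can be sketched in a few lines of plain Python. This is only an illustrative sketch: the names `review_embedding`, `token_emb` and `review_bias` are hypothetical, the toy values are made up, and we assume that out-of-vocabulary tokens are simply skipped before averaging.

```python
from typing import Dict, List


def review_embedding(tokens: List[str],
                     token_emb: Dict[str, List[float]],
                     review_bias: List[float]) -> List[float]:
    """Average the embeddings of the known tokens and add the shared review bias."""
    k = len(review_bias)
    known = [token_emb[t] for t in tokens if t in token_emb]
    if not known:
        # No token is in the training vocabulary: only the bias remains.
        return list(review_bias)
    avg = [sum(vec[d] for vec in known) / len(known) for d in range(k)]
    return [a + b for a, b in zip(avg, review_bias)]


# Toy 2-dimensional example (all values are made up).
token_emb = {"great": [1.0, 0.0], "soap": [0.0, 1.0]}
h0 = [0.1, -0.1]
h_known = review_embedding(["great", "soap"], token_emb, h0)
h_unknown = review_embedding(["zzz"], token_emb, h0)
```

In this toy case `h_unknown` equals the bias vector, which mirrors the fallback behaviour described for reviews without any in-vocabulary token.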
The learning of the item, review, and user embeddings is determined by two learning objectives. The first objective guides the joint learning of the parameters of the regression model and the review embeddings such that the regression model performs well at review score prediction
$$\min_{\Theta_1} \mathcal{L}_1 = \min_{\Theta_1} \sum_{((u, rev_{(u,i)}, i),\, r_{(u,i)}) \in S} \left( g(\mathbf{h}_{rev_{(u,i)}}) - r_{(u,i)} \right)^2$$ (2)
where $S$ is the set of training triples and their associated ratings, and $g$ is a learnable regression function that is applied to the representation of the review $\mathbf{h}_{rev_{(u,i)}}$.
While $g$ can be an arbitrarily complex function, the instance of $g$ used in this work is as follows
$$g(\mathbf{h}_{rev_{(u,i)}}) = \sigma(\mathbf{h}_{rev_{(u,i)}})\, \mathbf{w}^{\top} + b_{(u,i)}$$ (3)
where $\mathbf{w}$ are the learnable weights of the linear regressor, $\sigma$ is the sigmoid function $\sigma(x) = \frac{1}{1 + e^{-x}}$, and $b_{(u,i)}$ is the shortcut we use to refer to the sum of the bias terms, namely the user, item and overall bias: $b_{(u,i)} = b_u + b_i + b_0$.
Of course, in a real-world scenario a recommender system makes rating predictions on items that users have not rated yet and, consequently, reviews are not available for those items. The application of the linear regressor of Equation (2) to new examples, therefore, is not possible at test time. Our second learning procedure aims at overcoming this limitation by leveraging ideas from embedding-based knowledge base completion methods. We want to be able to approximate a review embedding at test time such that this review embedding can be used in conjunction with the learned regression model. Hence, in addition to the learning objective (2), we introduce a second objective that forces the embedding of a review to be close to the difference between the item and user embeddings. This translation-based modeling assumption is followed in TransE [5] and several other knowledge base completion methods [14, 11]. We include a second term in the objective that drives the distance between (a) the user embedding translated by the review embedding and (b) the embedding of the item to be small
$$\min_{\Theta_2} \mathcal{L}_2 = \min_{\Theta_2} \sum_{((u, rev_{(u,i)}, i),\, r_{(u,i)}) \in S} \left\| \mathbf{e}_u + \mathbf{h}_{rev_{(u,i)}} - \mathbf{e}_i \right\|_2^2$$ (4)
where $\mathbf{e}_u$ and $\mathbf{e}_i$ are the embeddings of the user and item, respectively. In the knowledge base embedding literature (cf. [5]) it is common for the representations to be learned via a margin-based loss, where the embeddings are updated if the score (the negative distance) of a positive triple is not larger than the score of a negative triple plus a margin. Note that this type of learning is required to avoid trivial solutions. The minimization problem of Equation (4) can easily be solved by setting $\mathbf{e}_u = \mathbf{e}_i = \mathbf{h}_{rev_{(u,i)}} = \mathbf{0}$ for all $u$ and $i$. However, this kind of trivial solution is avoided by jointly optimizing Equations (2) and (4), since a degenerate solution like the aforementioned one would lead to a high error with respect to the regression objective (Equation (2)). The overall objective can now be written as
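$$\min_{\Theta} \mathcal{L} = \min_{\Theta} \left( \mathcal{L}_1 + \lambda \mathcal{L}_2 + \mu \|\Theta\|_2^2 \right)$$ (5)
where $\lambda$ is a term that weights the approximation loss due to the modeling assumption formalized in Equation (4). In our model, $\Theta$ corresponds to the parameters $\mathbf{w}$, $\mathbf{e}$, $\mathbf{v}$, $\mathbf{h}_0$, and the bias terms $b$.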
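To make the interplay of the two objectives concrete, the following sketch evaluates the joint loss for a single training triple. It is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the regressor is assumed to be a dot product between the weights and the sigmoid of the review embedding plus a bias, and the regularization term is omitted for brevity.

```python
import math
from typing import List


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def predict_rating(h: List[float], w: List[float], bias: float) -> float:
    """Linear regressor on the sigmoid-transformed review embedding plus the summed biases."""
    return sum(sigmoid(h_d) * w_d for h_d, w_d in zip(h, w)) + bias


def triple_loss(e_u, e_i, h_rev, w, bias, rating, lam):
    """Squared rating error plus lam times the squared translation error."""
    rating_err = (predict_rating(h_rev, w, bias) - rating) ** 2
    translation_err = sum((u + h - i) ** 2 for u, h, i in zip(e_u, h_rev, e_i))
    return rating_err + lam * translation_err


# A triple whose review embedding translates the user exactly onto the item
# and whose rating is matched by the bias alone: both terms vanish.
perfect = triple_loss([0.0, 0.0], [1.0, 1.0], [1.0, 1.0],
                      w=[0.0, 0.0], bias=4.0, rating=4.0, lam=0.5)

# The degenerate all-zero solution has zero translation error,
# but the regression term still penalizes it.
degenerate = triple_loss([0.0, 0.0], [0.0, 0.0], [0.0, 0.0],
                         w=[1.0, 1.0], bias=0.0, rating=5.0, lam=0.5)
```

With these made-up numbers the all-zero embeddings satisfy the translation constraint perfectly yet incur a large rating error, which is the reason joint optimization discourages the trivial solution.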
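At test time, we can now approximate review embeddings of $(u, i)$ pairs not seen during training by computing
$$\hat{\mathbf{h}}_{rev_{(u,i)}} = \mathbf{e}_i - \mathbf{e}_u.$$
With the trained regression model $g$ we can make rating predictions $\hat{r}_{(u,i)}$ for unseen $(u, i)$ pairs by computing
$$\hat{r}_{(u,i)} = g(\hat{\mathbf{h}}_{rev_{(u,i)}}).$$ (6)
Contrary to training, now the regression model $g$ is applied over $\hat{\mathbf{h}}_{rev_{(u,i)}}$, instead of $\mathbf{h}_{rev_{(u,i)}}$, which is not available at test time.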
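The test-time procedure can be illustrated as follows. This is a hedged sketch with hypothetical names (`approx_review_embedding`, `predict`) and made-up numbers; it assumes the same form of regressor as described for training, namely weights applied to the sigmoid of the embedding plus the user, item and overall biases.

```python
import math


def approx_review_embedding(e_u, e_i):
    """Approximate the unseen review embedding as item embedding minus user embedding."""
    return [i - u for u, i in zip(e_u, e_i)]


def predict(e_u, e_i, w, b_user, b_item, b_overall):
    """Apply the trained regressor to the approximated review embedding."""
    h_hat = approx_review_embedding(e_u, e_i)
    score = sum(w_d / (1.0 + math.exp(-h_d)) for h_d, w_d in zip(h_hat, w))
    return score + b_user + b_item + b_overall


h_hat = approx_review_embedding([1.0, 2.0], [3.0, 5.0])
# Identical user and item embeddings give a zero difference vector,
# so each sigmoid evaluates to 0.5.
r_hat = predict([1.0, 1.0], [1.0, 1.0], w=[1.0, 1.0],
                b_user=0.5, b_item=0.25, b_overall=3.0)
```

No review text enters this computation, which is the point of the translation assumption: only the learned user and item embeddings are needed at prediction time.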
All parameters of the parts of the objective are jointly learned with stochastic gradient descent. More details regarding the parameter learning are contained in Section 4.4. Algorithm 1 illustrates the generic working of TransRev.
3 Related Work
There are three lines of research related to our work: recommender systems, sentiment analysis, and multi-relational graph completion. There is an extensive body of work on recommender systems [1, 6, 26, 27, 7, 30, 13, 10]. Singular Value Decomposition (SVD) [17] computes the review score prediction as the dot product between the item embeddings and the user embeddings plus some learnable bias terms. Due to its simplicity and performance on numerous data sets it is still one of the most used methods for product recommendations. Even though there has been a flurry of research on predicting ratings from the interaction of latent representations of users and items, there is not much work on incorporating review text despite its availability in several corpora. [16] was one of the first approaches that demonstrated that features extracted from review text are useful in learned models to improve the accuracy of rating predictions. Most of the previous research that explored the utility of review text for rating prediction can be classified into two categories.

Semi-supervised approaches. HFT [21] was one of the first methods combining a supervised learning objective to predict ratings with an unsupervised learning objective (e.g. latent Dirichlet allocation) for text content to regularize the parameters of the supervised model. The idea of combining two learning objectives has been explored in several additional approaches [19, 3, 9, 2]. The methods differ in the unsupervised objectives, some of which are tailored to a specific domain. For example, JMARS [9] outperforms HFT on a movie recommendation data set but it is outperformed by HFT on data sets similar to those used in our work [31].
Supervised approaches. Methods that fall into this category such as [29, 33, 8] learn latent representations of users and items from the text content so as to perform well at rating prediction. The learning of the latent representations is done via a deep architecture. The approaches differ mainly in the neural architectures they employ.
There is one crucial difference between the aforementioned methods and TransRev. TransRev predicts the review score based on an approximation of the review embedding computed at test time. Moreover, since TransRev is able to approximate a review embedding, we can use this embedding to retrieve reviews in the training set deemed most similar by a distance metric in the embedding space.
Similar to sentiment analysis methods, TransRev trains a regression model that predicts the review rating from the review text. Contrary to the typical setting in which sentiment analysis methods operate, however, review text is not available at prediction time in the recommender system setting. Consequently, the application of sentiment analysis for recommender systems is not directly possible. In the simplest case, a sentiment analysis method is a linear regressor applied to a text embedding (Equation (3)). TransRev trains such a regression model to perform well in conjunction with the approximated review embedding.
The third research theme related to TransRev is knowledge base completion. In recent years, many embedding-based methods have been proposed to infer missing relations in knowledge bases based on a function that computes a likelihood score from the embeddings of entities and relation types. Due to their simplicity and good performance, there is a large body of work on translation-based scoring functions [5, 14, 11]. [15] propose an approach to large-scale sequential sales prediction that embeds items into a transition space where user embeddings are modeled as translation vectors operating on item sequences. The associated optimization problem is formulated as a sequential Bayesian ranking problem [25]. To the best of our knowledge, [15] is the first work to leverage ideas from knowledge base completion methods for recommender systems. Whereas TransRev addresses the problem of rating prediction by incorporating review text, [15] addresses the different problem of sequential recommendation. Therefore, an experimental comparison to that work is not possible. In TransRev the review embedding translates the user embedding to the product embedding. In [15], the user embedding translates a product embedding to the embedding of the next purchased product. TransRev is also novel in that the approximated review embeddings can be used to retrieve, from an existing training set, the reviews deemed most similar by a distance metric in the embedding space.
4 Experimental Setup
We conduct several experiments to empirically compare TransRev to state of the art methods for product recommendation. More specifically, we compare TransRev to competitive matrix factorization methods as well as methods that take advantage of review text. Moreover, we provide some qualitative results on retrieving training reviews most similar to the approximated reviews at test time.
4.1 Data Sets
We evaluate the various methods on two commonly used data sets. The Yelp Business Rating Prediction Challenge (https://www.kaggle.com/c/yelprecsys2013) data set consists of reviews on restaurants in Phoenix (United States). The Amazon Product Data (http://jmcauley.ucsd.edu/data/amazon) has been extensively used in previous works [21, 22, 23]. The data set consists of reviews and product metadata from Amazon from May 1996 to July 2014. We focus on the 5-core versions (which contain at least 5 reviews for each user and item) of those data sets. There are 24 product categories from which we have selected those 12 used in [29], plus 6 randomly picked categories out of the 12 remaining ones. We treat each of these resulting 18 data sets independently in our experiments. Ratings in both benchmark data sets are integer values between 1 and 5. As in previous work, we randomly sample 80% of the reviews as training, 10% as validation, and 10% as test data. We remove reviews from the validation and test splits if they involve either a product or a user that is not part of the training data.
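The splitting and filtering protocol described above could be sketched as follows. The function name `split_reviews`, the tuple layout and the fixed seed are assumptions made for illustration; the sketch only demonstrates the 80/10/10 split and the removal of validation and test reviews whose user or item does not occur in the training split.

```python
import random


def split_reviews(reviews, seed=0):
    """reviews: list of (user, item, rating) tuples. Returns (train, valid, test)."""
    shuffled = list(reviews)
    random.Random(seed).shuffle(shuffled)
    n_train = len(shuffled) * 8 // 10
    n_valid = len(shuffled) // 10
    train = shuffled[:n_train]
    valid = shuffled[n_train:n_train + n_valid]
    test = shuffled[n_train + n_valid:]

    train_users = {u for u, _, _ in train}
    train_items = {i for _, i, _ in train}

    def seen_in_training(rows):
        # Keep only reviews whose user and item both appear in the training split.
        return [r for r in rows if r[0] in train_users and r[1] in train_items]

    return train, seen_in_training(valid), seen_in_training(test)


# Toy data: 10 users x 10 items, every pair rated 5 (values are made up).
toy = [(u, i, 5) for u in range(10) for i in range(10)]
train, valid, test = split_reviews(toy)
```

Filtering after the split means the validation and test sets can end up slightly smaller than 10% of the data, which is consistent with the uneven split sizes reported in Table 1.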
4.2 Review Text Preprocessing
We follow the same preprocessing steps for each data set. First, we lowercase the review texts and apply the regular expression “” to tokenize the text data, discarding those words that appear in less than 0.1 of the reviews of the data set under consideration. For the Amazon data sets, both full reviews and short summaries (rarely having more than 30 words) are available. Since classifying short documents into their sentiment is less challenging than doing the same for longer text [4], we have used the review summaries for our work. For the Yelp data only full reviews are available. We truncate these reviews to the first 200 words. Some statistics of the preprocessed data sets are summarized in Table 1.
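A possible shape of this preprocessing pipeline is sketched below. The regular expression `\w+` is a placeholder chosen for illustration (the expression actually used is not shown in the text), the document-frequency threshold is left as a parameter, and truncating before filtering rare words is likewise an assumption.

```python
import re
from collections import Counter

# Hypothetical tokenizer pattern; the pattern used in the experiments is not given.
TOKEN_PATTERN = re.compile(r"\w+")


def preprocess(texts, min_doc_fraction, max_tokens=200):
    """Lowercase, tokenize, truncate, and drop words that occur in too few reviews."""
    tokenized = [TOKEN_PATTERN.findall(t.lower())[:max_tokens] for t in texts]
    # Document frequency: in how many reviews does each word occur?
    doc_freq = Counter(tok for toks in tokenized for tok in set(toks))
    min_docs = min_doc_fraction * len(texts)
    vocab = {tok for tok, n in doc_freq.items() if n >= min_docs}
    filtered = [[t for t in toks if t in vocab] for toks in tokenized]
    return filtered, vocab


docs, vocab = preprocess(["Great soap", "great value", "Bad"], min_doc_fraction=0.5)
```

In the toy call only "great" occurs in at least half of the three reviews, so the third review ends up with no tokens at all; such reviews are the case the review bias of Equation (1) is meant to cover.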
Table 1: Statistics of the preprocessed data sets.
Data set  Users  Items  Words  Training  Valid.  Test
Amazon Instant Video  5,131  1,686  513  27,610  3,449  3,445
Automotive  2,929  1,836  589  15,741  1,965  1,963
Baby  22,364  12,102  497  124,978  15,612  15,613
Cds and Vinyl  75,259  64,444  576  813,897  101,600  101,581
Grocery and Gourmet Food  14,682  8,714  565  116,192  14,502  14,499
Health and Personal Care  38,610  18,535  573  261,102  32,588  32,585
Kindle Store  68,224  61,935  456  717,845  89,628  89,637
Musical Instruments  1,430  901  512  7,925  989  985
Office Products  4,906  2,421  652  41,687  5,210  5,206
Patio, Lawn and Garden  1,687  963  697  10,320  1,279  1,285
Pet Supplies  19,857  8,511  515  120,831  15,073  15,070
Tools and Home Improvement  16,639  10,218  587  103,373  12,911  12,910
Toys and Games  19,413  11,925  516  127,712  15,864  15,850
Beauty  22,364  12,102  497  150,452  18,774  18,783
Digital Music  5,542  3,569  625  48,283  6,029  6,021
Video Games  24,304  10,673  591  175,650  21,948  21,937
Sports and Outdoors  35,599  18,358  530  224,596  28,045  28,035
Cell Phones and Accessories  27,880  10,430  504  149,668  18,667  18,673
Yelp  45,981  11,538  5,314  183,886  20,315  20,294
Table 2: MSE of each method on the benchmark data sets ("-" marks results that are not available).
Data set  Offset  Attn+CNN  NMF  SVD  HFT  DeepCoNN  TransRev
Amazon Instant Video  1.180  0.936  0.946  0.904  0.888  0.943  0.884
Automotive  0.948  0.881  0.876  0.857  0.862  0.753  0.855
Baby  1.262  1.176  1.171  1.108  1.104  1.154  1.100
Cds and Vinyl  1.127  0.866  0.871  0.863  0.854  0.888  0.854
Grocery and Gourmet Food  1.165  1.004  0.985  0.964  0.961  0.973  0.957
Health and Personal Care  1.200  1.054  1.048  1.016  1.014  1.081  1.011
Kindle Store  0.87  0.617  0.624  0.607  0.593  0.648  0.599
Musical Instruments  0.733  0.703  0.725  0.694  0.692  0.723  0.690
Office Products  0.876  0.726  0.742  0.727  0.727  0.738  0.724
Patio, Lawn and Garden  1.156  0.999  0.958  0.950  0.956  1.070  0.941
Pet Supplies  1.354  1.236  1.241  1.198  1.194  1.281  1.191
Tools and Home Improvement  1.017  0.938  0.908  0.884  0.884  0.946  0.879
Toys and Games  0.975  -  0.821  0.788  0.784  0.851  0.784
Beauty  1.322  -  1.204  1.168  1.165  1.184  1.158
Digital Music  1.137  -  0.805  0.797  0.793  0.835  0.782
Video Games  1.401  -  1.138  1.093  1.086  1.133  1.082
Sports and Outdoors  0.931  -  0.856  0.828  0.824  0.882  0.823
Cell Phones and Accessories  1.455  -  1.357  1.290  1.285  1.365  1.279
  0.921  
Yelp  1.385  1.212  1.229  1.158  1.148  1.215  1.144 
4.3 Baselines
We compare to the matrix factorizationbased methods SVD and NMF (nonnegative matrix factorization) as well as approaches that leverage review text for rating prediction in a semisupervised manner like HFT, and in a supervised manner such as Attn+CNN [29, 28] and DeepCoNN [33]. We also compare to a simple baseline Offset that simply uses the average rating in the training set as the prediction.
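Table 3: MSE of TransRev for different embedding dimensions k.
k  Baby  Digital Music  Office Products  Tools and Home Improvement
4  1.100  0.782  0.724  0.880
8  1.100  0.782  0.723  0.878
16  1.100  0.782  0.724  0.879
32  1.102  0.785  0.722  0.888
64  1.099  0.787  0.726  0.888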
4.4 Parameter Setting
We set the dimension of the embedding space to $k = 16$ for all methods. We evaluated the robustness of TransRev to changes in the hyperparameter $k$ but did not observe any significant performance difference. This is in line with previous work on the Yelp and Amazon data sets that observed that HFT and SVD did not show any improvements for [21]. For SVD and NMF we used the Python package SurPRISE (https://pypi.python.org/pypi/scikitsurprise), whose optimization is performed by vanilla stochastic gradient descent, and chose the learning rate and regularization term on the validation set from the values and . For HFT we used the original implementation of the authors (http://cseweb.ucsd.edu/~jmcauley/code/codeRecSys13.tar.gz) and validated the regularization term from the values . For TransRev we validated among the values and the learning rate of the optimizer and regularization term ( in our model) from the same set of values as for SVD and NMF. To ensure a fair comparison with SVD and NMF, we also use vanilla SGD to optimize TransRev. TransRev's parameters were randomly initialized [12]. Parameters for HFT were learned with L-BFGS, which was run for 2,500 learning iterations and validated every 50 iterations.
A single learning iteration performs SGD with all review triples in the training data and their associated ratings. For TransRev we used a batch size of 64. We ran SVD, NMF and TransRev for a maximum of 500 epochs and validated every epochs. All methods are validated according to the Mean Squared Error (MSE)
$$\mathrm{MSE} = \frac{1}{|W|} \sum_{(u,i) \in W} \left( r_{(u,i)} - \hat{r}_{(u,i)} \right)^2$$
where $W$ is either the validation or test set. The implementation of Attn+CNN is not publicly available, so we directly copied the MSE from [29] where the training, validation, and test data sets have the same proportions (80%, 10%, 10%). For DeepCoNN the original author code is not available and we used a third-party implementation (https://github.com/chenchongthu/DeepCoNN). We applied the default hyperparameter values for dropout and L2 regularization and used the same embedding dimension as for all other methods.
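As a small worked example of the evaluation metric, the sketch below computes the MSE of the Offset baseline, which predicts the average training rating for every test pair. The ratings are made up and the helper name `mse` is hypothetical.

```python
def mse(pairs):
    """pairs: iterable of (true_rating, predicted_rating) tuples."""
    pairs = list(pairs)
    return sum((r - p) ** 2 for r, p in pairs) / len(pairs)


train_ratings = [5, 4, 3, 5, 3]  # toy training ratings
test_ratings = [5, 3]            # toy test ratings

# Offset baseline: the same prediction (the training mean) for every test pair.
offset = sum(train_ratings) / len(train_ratings)
test_mse = mse((r, offset) for r in test_ratings)
```

Here the training mean is 4.0 and both test ratings are off by one, so the resulting MSE is 1.0.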
4.5 Sensitivity
We randomly selected the 4 data sets Baby, Digital Music, Office Products, and Tools and Home Improvement from the Amazon data and evaluated different values of $k$ for user, item and word embedding sizes. We increase $k$ from 4 to 64 and list the MSE scores in Table 3. We observe only insignificant differences in the corresponding models' performance. This observation is in line with [21].
4.6 Results
The experimental results are listed in Table 2 where the best performance is in bold font. TransRev achieves the best performance on 17 out of the 19 data sets. In line with previous work [16, 21], both TransRev and HFT outperform methods that do not take advantage of review text. TransRev is competitive with and often outperforms HFT on the benchmark data sets under consideration. To quantify that the rating predictions made by HFT and TransRev are significantly different we have computed the dependent t-test for paired samples; for all data sets where TransRev outperforms HFT, the p-value is smaller than 0.01.
We only copied the numbers of Attn+CNN from [29] since an implementation is not available. This could lead to differences in the results due to the different randomly sampled training, validation, and test sets. However, in addition to the results in this paper, Attn+CNN was compared to some of the baselines in related work [29]. The authors there showed that Attn+CNN performs worse than either SVD or HFT or both in 10 of 12 Amazon data sets. At the same time, in our experiments, TransRev performs better than HFT and SVD on the same data sets with the exception of the Kindle Store category.
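The dependent t-test for paired samples mentioned above reduces to a one-sample t-test on the per-example differences. The sketch below computes only the test statistic with the standard library; the function name and the toy predictions are hypothetical, and we assume that the paired quantities are the two models' predictions on the same test pairs.

```python
import math
import statistics


def paired_t_statistic(a, b):
    """t statistic of the dependent t-test for paired samples a and b."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    # Mean difference divided by its standard error (sample standard deviation).
    return statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))


# Made-up predictions of two models on the same four test pairs.
model_a = [4.0, 3.0, 5.0, 2.0]
model_b = [3.5, 3.0, 4.0, 2.5]
t_stat = paired_t_statistic(model_a, model_b)
```

The statistic has n - 1 degrees of freedom; if SciPy is available, `scipy.stats.ttest_rel(model_a, model_b)` returns the same statistic together with the corresponding p-value.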
Table 4: Actual test reviews and the closest training reviews retrieved by TransRev, with ratings in parentheses.
Actual test review  Closest training review in embedding space
skin improved (5)  makes your face feel refreshed (5)
love it (5)  you’ll notice the difference (5)
best soap ever (5)  I’ll never change it (5)
it clumps (2)  gives me headaches (1)
smells like bug repellent (3)  pantene give it up (2)
fake fake fake do not buy (1)  seems to be harsh on my skin (2)
saved my skin (5)  not good quality (2)
another great release from saliva (5)  can t say enough good things about this cd (5)
a great collection (5)  definitive collection (5)
sound nice (3)  not his best nor his worst (4)
a complete massacre of an album (2)  some great songs but overall a dissapointment (3)
the very worst best of ever (1)  overall a pretty big disappointment (2)
what a boring moment (1)  overrated but still allright (3)
great cd (5)  a brilliant van halen debut album (5)
4.7 Visualization of the Word Embeddings
Review embeddings learned by TransRev are learned so as to carry information about user ratings (Equation (2)) and information about the average word embedding of the words in the review text. As a consequence the learned word embeddings are correlated with ratings. To visualize the correlation between words and ratings we proceed as follows. First, we assign a score to each word that is computed by taking the average rating of the reviews that contain the word. Second, we compute a 2-dimensional representation of the words by applying t-SNE [20] to the 16-dimensional word embeddings learned by TransRev. Figure 4 depicts these 2-dimensional word embedding vectors learned for the Amazon Baby data set. The corresponding rating scores are indicated by the color of the dots.
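The first step of this procedure, assigning each word the average rating of the reviews that contain it, can be sketched as below. The function name `word_scores` and the toy reviews are hypothetical; a word occurring several times in the same review is counted once.

```python
from collections import defaultdict


def word_scores(reviews):
    """reviews: iterable of (tokens, rating). Returns word -> mean rating of reviews containing it."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for tokens, rating in reviews:
        for word in set(tokens):  # count each review at most once per word
            totals[word] += rating
            counts[word] += 1
    return {word: totals[word] / counts[word] for word in totals}


toy_reviews = [
    (["awesome", "product"], 5),
    (["horrible", "product"], 1),
]
scores = word_scores(toy_reviews)
```

In this toy case "awesome" scores 5.0, "horrible" scores 1.0, and the neutral word "product" lands in between at 3.0, which is the kind of gradient the coloring of the visualization is meant to show.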
The clusters we discovered in Figure 4 are interpretable. They are meaningful with respect to the score, observing that the bottom cluster is mostly made up of words with negative connotations (e.g. horrible, useless, terrible), the middle one of neutral words (e.g. with, products, others) and the upper one of words with positive connotations (e.g. awesome, fantastic, excellent). This shows TransRev’s ability to learn word embeddings that also capture the sentiment of the review.
4.8 Suggesting Reviews to Users
One of the characteristics of TransRev is its ability to approximate the review representation at prediction time. This approximation is used to make a rating prediction, but it can also be used to propose a tentative review on which the user can elaborate. This is related to a number of approaches [32, 18, 24] on explainable recommendations. We think that this can lower the barrier to writing reviews. We compute the Euclidean distance between the approximated review embedding and all review embeddings from the training set. We then retrieve the review text with the most similar review embedding. We investigate the quality of the tentative reviews that TransRev retrieves for the Beauty and Digital Music data sets. The example reviews listed in Table 4 show that while the overall sentiment is correct in most cases, we can also observe the following shortcomings:

The function $f$ chosen in our work is invariant to word ordering and, therefore, cannot learn that bigrams such as “not good” have a negative meaning.

Despite matching the overall sentiment, the actual and retrieved review can refer to different aspects of the product (for example, “it clumps” and “gives me headaches”).

Reviews can be specific to a single product. A straightforward improvement could be achieved by retrieving only existing reviews for the specific product under consideration.
We believe that more sophisticated sentence and paragraph representations might lead to better results in the review retrieval task. Moreover, a promising line of research has to do with learning representations for reviews that are aspect-specific. It would allow users to obtain retrieved reviews that mention specific aspects of products such as “ease of use” and “price.” We also think that similar ideas can be followed with data modalities other than review text.
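The retrieval step used in this section, a nearest-neighbor search under Euclidean distance, can be sketched as follows. The function name `nearest_review`, the toy embeddings and the brute-force search are assumptions for illustration only.

```python
import math


def nearest_review(h_hat, train_reviews):
    """train_reviews: list of (text, embedding). Returns the text closest to h_hat."""
    def distance(embedding):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(h_hat, embedding)))
    return min(train_reviews, key=lambda pair: distance(pair[1]))[0]


# Made-up 2-dimensional review embeddings from a hypothetical training set.
train_reviews = [
    ("love it", [1.0, 1.0]),
    ("it clumps", [-1.0, -1.0]),
]
suggested = nearest_review([0.9, 0.8], train_reviews)
```

Restricting `train_reviews` to the reviews written for the product under consideration would implement the straightforward improvement mentioned in the list of shortcomings.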
5 Conclusion
TransRev is a novel approach for product recommendation combining methods and ideas from the areas of matrix factorization-based recommender systems, sentiment analysis, and knowledge graph completion. TransRev achieves state of the art performance on the data sets under consideration and outperforms existing methods in 15 of these data sets. TransRev is learned so as to be able to approximate, at test time, the embedding of the review as the difference of the embedding of the reviewed item and of the reviewing user. The approximated review embedding can be used with a sentiment analysis method to predict the review score.
References
[1] Allen, R.B.: User models: Theory, method, and practice. International Journal of Man-Machine Studies 32(5), 511–543 (1990)
[2] Almahairi, A., Kastner, K., Cho, K., Courville, A.C.: Learning distributed representations from reviews for collaborative filtering. In: RecSys. pp. 147–154 (2015)
[3] Bao, Y., Fang, H., Zhang, J.: Topicmf: Simultaneously exploiting ratings and reviews for recommendation. In: AAAI. pp. 2–8 (2014)
[4] Bermingham, A., Smeaton, A.F.: Classifying sentiment in microblogs: is brevity an advantage? In: CIKM. pp. 1833–1836 (2010)
[5] Bordes, A., Usunier, N., García-Durán, A., Weston, J., Yakhnenko, O.: Translating embeddings for modeling multi-relational data. In: NIPS. pp. 2787–2795 (2013)
[6] Breese, J.S., Heckerman, D., Kadie, C.M.: Empirical analysis of predictive algorithms for collaborative filtering. In: UAI. pp. 43–52 (1998)
[7] Brun, A., Hamad, A., Buffet, O., Boyer, A.: Towards preference relations in recommender systems. In: Preference Learning (PL 2010) ECML/PKDD 2010 Workshop (2010)
[8] Catherine, R., Cohen, W.W.: Transnets: Learning to transform for recommendation. In: RecSys. pp. 288–296 (2017)
[9] Diao, Q., Qiu, M., Wu, C., Smola, A.J., Jiang, J., Wang, C.: Jointly modeling aspects, ratings and sentiments for movie recommendation (JMARS). In: KDD. pp. 193–202 (2014)
[10] Dong, X., Yu, L., Wu, Z., Sun, Y., Yuan, L., Zhang, F.: A hybrid collaborative filtering model with deep structure for recommender systems. In: AAAI. pp. 1309–1315 (2017)
[11] García-Durán, A., Bordes, A., Usunier, N.: Composing relationships with translations. In: EMNLP. pp. 286–290. The Association for Computational Linguistics (2015)
[12] Glorot, X., Bengio, Y.: Understanding the difficulty of training deep feedforward neural networks. In: AISTATS. JMLR Proceedings, vol. 9, pp. 249–256 (2010)
[13] Guo, G., Zhang, J., Yorke-Smith, N.: Trustsvd: Collaborative filtering with both the explicit and implicit influence of user trust and of item ratings. In: AAAI. pp. 123–129 (2015)
[14] Guu, K., Miller, J., Liang, P.: Traversing knowledge graphs in vector space. In: EMNLP. pp. 318–327. The Association for Computational Linguistics (2015)
[15] He, R., Kang, W., McAuley, J.: Translation-based recommendation. In: RecSys. pp. 161–169 (2017)
[16] Jakob, N., Weber, S.H., Müller, M.C., Gurevych, I.: Beyond the stars: exploiting free-text user reviews to improve the accuracy of movie recommendations. In: 1st international CIKM workshop on Topic-sentiment analysis for mass opinion. pp. 57–64 (2009)
[17] Koren, Y., Bell, R.M., Volinsky, C.: Matrix factorization techniques for recommender systems. IEEE Computer 42(8), 30–37 (2009)
[18] Lawlor, A., Muhammad, K., Rafter, R., Smyth, B.: Opinionated explanations for recommendation systems. In: International Conference on Innovative Techniques and Applications of Artificial Intelligence. pp. 331–344. Springer (2015)
[19] Ling, G., Lyu, M.R., King, I.: Ratings meet reviews, a combined approach to recommend. In: RecSys. pp. 105–112 (2014)
[20] Maaten, L.v.d., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008)
[21] McAuley, J.J., Leskovec, J.: Hidden factors and hidden topics: understanding rating dimensions with review text. In: RecSys. pp. 165–172 (2013)
[22] McAuley, J.J., Pandey, R., Leskovec, J.: Inferring networks of substitutable and complementary products. In: KDD. pp. 785–794 (2015)
[23] McAuley, J.J., Targett, C., Shi, Q., van den Hengel, A.: Image-based recommendations on styles and substitutes. In: SIGIR. pp. 43–52 (2015)
[24] Qureshi, M.A., Greene, D.: Lit@ eve: Explainable recommendation based on wikipedia concept vectors. In: Joint European Conference on Machine Learning and Knowledge Discovery in Databases. pp. 409–413. Springer (2017)
[25] Rendle, S., Freudenthaler, C., Schmidt-Thieme, L.: Factorizing personalized markov chains for next-basket recommendation. In: WWW. pp. 811–820 (2010)
[26] Rennie, J.D.M., Srebro, N.: Fast maximum margin matrix factorization for collaborative prediction. In: ICML. ACM International Conference Proceeding Series, vol. 119, pp. 713–719 (2005)
[27] Sarwar, B.M., Karypis, G., Konstan, J.A., Riedl, J.: Item-based collaborative filtering recommendation algorithms. In: WWW. pp. 285–295 (2001)
[28] Seo, S., Huang, J., Yang, H., Liu, Y.: Interpretable convolutional neural networks with dual local and global attention for review rating prediction. In: RecSys. pp. 297–305 (2017)
[29] Seo, S., Huang, J., Yang, H., Liu, Y.: Representation learning of users and items for review rating prediction using attention-based convolutional neural network. In: 3rd International Workshop on Machine Learning Methods for Recommender Systems (MLRec) (2017)
[30] Wang, H., Wang, N., Yeung, D.: Collaborative deep learning for recommender systems. In: KDD. pp. 1235–1244 (2015)
[31] Wu, C., Beutel, A., Ahmed, A., Smola, A.J.: Explaining reviews and ratings with PACO: poisson additive co-clustering. In: WWW (Companion Volume). pp. 127–128 (2016)
[32] Zhang, Y., Lai, G., Zhang, M., Zhang, Y., Liu, Y., Ma, S.: Explicit factor models for explainable recommendation based on phrase-level sentiment analysis. In: SIGIR. pp. 83–92 (2014)
[33] Zheng, L., Noroozi, V., Yu, P.S.: Joint deep modeling of users and items using reviews for recommendation. In: WSDM. pp. 425–434 (2017)