1. Introduction
Recommender systems provide personalized suggestions of products to end users in a variety of settings. They have been applied to several domains, such as e-commerce (e.g., Amazon and eBay), online streaming (e.g., Netflix and Pandora), social media (e.g., Facebook and Twitter), and so forth.
One way to view the recommendation problem under the collaborative recommendation approach is to consider the users and items as forming a matrix whose entries are the known ratings by particular users for given items. In this framework, collaborative recommendation becomes the task of learning a function that predicts the likely values of the unknown cells in this matrix, i.e., R: Users × Items → Ratings.
In recent years, context awareness in recommender systems has attracted increasing research attention. A list of recommendations cannot stand alone without considering contexts, since a user’s tastes may vary from context to context. For example, a user may choose a different movie if he is going to watch it with kids rather than with his partner. A user may prefer a fast-food restaurant for a quick lunch by himself, while he may choose a formal restaurant for a business dinner with colleagues. As a result, context-aware recommendation turns the prediction task into a multidimensional rating function – R: Users × Items × Contexts → Ratings (Adomavicius et al., 2011).
Several context-aware recommendation algorithms have been proposed and developed in the past decade. They explore different ways to incorporate context information (such as time, location, weather, companion, etc.) (Zheng, 2015) into the recommendation models, in order to improve the quality of item recommendations. Among these effective algorithms, many are machine-learning-based approaches which can significantly improve the recommendations but are hard to interpret, such as the models based on matrix factorization. As a result, there is limited research that tries to utilize these models to interpret the contextual effects in recommender systems. It is not easy to understand how context information takes effect in the recommendation process, and how it affects the quality of the recommendations.
In this paper, we specifically focus on different contextual modeling approaches, and make a first attempt to reshape the structure of the models, in order to further exploit how the existing contextual modeling approaches can be used to interpret the contextual effects in recommender systems.
2. Related Work
In this section, we introduce the existing categories of context-aware recommendation models and then discuss existing work on interpreting contextual effects in recommender systems.
To better understand context-aware recommendation, we introduce the terminology in this domain as follows.
User  Item  Rating  Time  Location  Companion 
U1  T1  3  weekend  home  alone 
U1  T1  5  weekend  cinema  girlfriend 
U1  T1  ?  weekday  home  family 
Assume there are one user U1, one item T1, and three contextual dimensions – Time (weekend or weekday), Location (at home or cinema) and Companion (alone, girlfriend, family), as shown in the table above. In the following discussion, we use context dimension to denote the contextual variable, e.g., “Location”. The term context condition refers to a specific value in a dimension, e.g., “home” and “cinema” are two contextual conditions for “Location”. A context or context situation is, therefore, a set of contextual conditions, e.g., {weekend, home, family}.
2.1. Context-aware Recommendation Models
Context can be applied in recommendation using three basic strategies: pre-filtering, post-filtering and contextual modeling (Adomavicius et al., 2011; Adomavicius and Tuzhilin, 2011). The first two strategies rely on either filtering profiles or filtering the recommendations, but they still use standard two-dimensional recommendation algorithms in the modeling process. By contrast, in contextual modeling, the predictive models are learned from the multidimensional rating data. These scenarios are depicted in Figure 1 (Adomavicius et al., 2011).
As the name suggests, pre-filtering techniques use the contextual information to remove profiles or parts of profiles from consideration in the recommendation process. For example, context-aware splitting approaches (Baltrunas and Ricci, 2009; Zheng et al., 2014a) use context as a filter to pre-select rating profiles and then apply the recommendation algorithms only to profiles containing ratings in matching contexts. Post-filtering techniques (Panniello et al., 2009; Ramirez-Garcia and García-Valdez, 2014) utilize contexts to filter or re-rank the list of recommendations.
These filtering-based methods, including both pre-filtering and post-filtering, are straightforward and easy to interpret. However, using context information as a filter usually introduces sparsity problems, and even cold-start context problems. For example, to recommend a list of movies for a user within the context situation {cinema, at weekend, with kids}, we need rich rating profiles whose ratings were given in the same or similar contexts. Otherwise, the recommendations produced by the filtering-based approaches may not be reliable.
By contrast, contextual modeling approaches are usually machine-learning-based algorithms which are able to alleviate the sparsity problems and produce better context-aware recommendations than the filtering-based methods. These models directly incorporate context information into the predictive functions, and contexts are no longer used as filters in the recommendation process. Tensor factorization (Karatzoglou et al., 2010), context-aware matrix factorization (Baltrunas et al., 2011b) and contextual sparse linear modeling (Zheng et al., 2014b) are examples of the most effective contextual modeling algorithms in recommender systems. However, it may be difficult to interpret these models in order to understand why and how contexts play an important role in the recommendation process.
2.2. Interpretations By Contextual Filtering
Due to the difficulty of interpreting the contextual modeling approaches, most of the existing work focuses on interpretations by the contextual filtering methods, especially the pre-filtering approaches. For example, our previous work (Zheng et al., 2013b) views emotional states as contexts, and utilizes the feature selection process in differential context relaxation (DCR) (Zheng et al., 2012) and the feature weighting in differential context weighting (DCW) (Zheng et al., 2013a) to identify and interpret which emotional variables are crucial in different recommendation components or stages. DCR and DCW are two hybrid models of the contextual filtering approaches. Codina et al. (Codina et al., 2013) develop a distributional-semantics pre-filtering context-aware recommendation algorithm which is able to calculate the similarity between two contexts. The context similarity, as a result, can tell why and which rating profiles are helpful in the recommendation process. It is able to alleviate the sparsity problems, since we can select similar rating profiles to predict a user’s rating given a context, and we no longer require an exact match of the context information. As far as we know, there is no existing work that discusses interpretations by the contextual modeling approaches; we explore this direction in the following sections.
3. Interpretation By Contextual Modeling
In this paper, we are not going to develop new interpretable context-aware recommendation models. Instead, we focus on the existing contextual modeling approaches and reshape the structure of the algorithms, so that we can view these algorithms from another perspective and interpret the contextual effects by these approaches.
In this section, we introduce the dependent and independent contextual modeling approaches in our own way, and discuss their capability of interpretation after reshaping the structure of these algorithms.
3.1. Example: A Linear Model
First of all, we use the simple linear regression model (i.e., ŷ = wx + b) as an example to explain two strategies in the optimization. The visualization of the optimizations is depicted in Figure 2, where y is the ground truth and ŷ = wx + b represents the estimated linear model. The optimization goal in linear regression is to minimize the squared errors. Figure 2 a) is named the moving model, since the model varies the value of the intercept b only in order to minimize the squared errors. Figure 2 b) is the rotation model, which varies the value of the slope w only in the optimization. Of course, a third strategy could combine the moving model and the rotation model, which is the common optimization in linear regression. In this paper, we only focus on the moving and rotation models, since they are simple and straightforward. We will explore the combination of these models in our future work.
3.2. Dependent Contextual Modeling
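To make the two strategies concrete, the following minimal sketch (with hypothetical toy data, not the paper's implementation) fits a line ŷ = wx + b by gradient descent while freezing either the slope (the moving model) or the intercept (the rotation model):

```python
import numpy as np

def fit_linear(x, y, mode, lr=0.01, epochs=2000):
    """Fit y ~ w*x + b by gradient descent on the mean squared error.

    mode='moving'   updates only the intercept b (the line shifts up/down);
    mode='rotation' updates only the slope w (the line rotates);
    mode='both'     updates w and b together (ordinary linear regression).
    """
    w, b = 1.0, 0.0  # arbitrary starting point
    for _ in range(epochs):
        err = w * x + b - y  # per-point prediction errors
        if mode in ("rotation", "both"):
            w -= lr * 2 * np.mean(err * x)
        if mode in ("moving", "both"):
            b -= lr * 2 * np.mean(err)
    return w, b
```

On data generated from y = 2x + 1, the moving model keeps the slope fixed and can only shift the line vertically, while the combined strategy recovers both parameters.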
Among the contextual modeling approaches, a dependent contextual model is one that exploits the dependencies or correlations among users, items and contexts. There are two existing categories of such algorithms – deviation-based contextual modeling and similarity-based contextual modeling – which we introduce individually as follows.
3.2.1. Deviation-Based Models
Deviation-based contextual modeling is a learning algorithm that minimizes the squared rating prediction errors by learning the rating deviations between two context situations. One example is shown in Table 2.
Context  D1: Time  D2: Location 
c1  Weekend  Home 
c2  Weekday  Cinema 
Dev(Di)  0.5  -0.1 
Assume there are two context dimensions: Time and Location. Each context situation is constructed from the context conditions in these two variables. There are two contexts, c1 = {Weekend, Home} and c2 = {Weekday, Cinema}. The last row in Table 2 gives the rating deviation from c1 to c2 in each context variable. For example, Dev(D1) represents the rating deviation in the variable “Time” from c1 to c2. More specifically, it indicates that a user’s rating on a weekday is generally higher than his or her rating at the weekend by 0.5. Accordingly, Dev(D2) is -0.1, which tells that a user’s rating at the cinema is generally lower than his or her rating at home by 0.1.
Therefore, we can predict a user’s rating on a specific item within context c2 if we know his or her rating on the same item within context c1. For example, if a user rated an item in context c1 as four stars, his or her rating on the item in context c2 can simply be estimated as the four stars plus the aggregated rating deviations in each context variable. Namely, the predicted rating will be 4.4 (i.e., 4 + 0.5 - 0.1).
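The arithmetic above can be sketched as follows; the deviation values are the hypothetical ones from the Table 2 example, not values learned from real data:

```python
# Hypothetical rating deviations: moving from c1 = {Weekend, Home} to
# c2 = {Weekday, Cinema} changes a rating by +0.5 on the Time dimension
# and by -0.1 on the Location dimension.
deviations = {"Time": 0.5, "Location": -0.1}

def predict_in_target_context(rating_in_source_context, deviations):
    """Estimate the rating in the target context: the known rating in
    the source context plus the aggregated per-dimension deviations."""
    return rating_in_source_context + sum(deviations.values())

# A four-star rating in c1 becomes 4 + 0.5 - 0.1 = 4.4 in c2.
predicted = predict_in_target_context(4.0, deviations)
```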
Theoretically, we are able to learn the rating deviations between every two context conditions in the same context variable. However, this may introduce sparsity problems if there are many context conditions in a single context variable. A simple solution to alleviate this problem is to set a baseline. Take Table 3 for example: we introduce a special context situation c0, where the context conditions in all the context variables are “N/A” (i.e., not available). The ratings in c0 can be interpreted as a user’s ratings on the items without considering contextual situations.
Context  D1: Time  D2: Location 
c0  N/A  N/A 
c2  Weekday  Cinema 
Dev(Di)  0.5  -0.1 
Therefore, the predictive function for a user’s rating on an item within a context can be described by Equation 1.
F(u, i, c) = P(u, i) + Σ_{l=1..L} Dev(D_l)    (1)
where F(u, i, c) is the prediction function that estimates user u’s rating on item i within context c. P(u, i) is the predicted rating given by u on i without considering any context situation. Dev(D_l) tells the rating deviation in the l-th context variable from c0 to c, where L is the number of context dimensions in the data set.
P(u, i), as the predicted rating given by u on i, can be replaced by any predictive function from the traditional recommendation algorithms. For example, it could be the prediction function in user-based collaborative filtering, or the function produced by matrix factorization. What we are going to learn are the rating deviations in each context variable from c0 to c, i.e., the rating deviation between two context conditions in each context variable.
In Equation 1, we simply assume that Dev(D_l) is the same for all users and items. A finer-grained model may assume that Dev(D_l) varies from user to user, or from item to item. For example, a user-specific model can be described by Equation 2, where we assign a deviation Dev_u(D_l) to each user u, assuming that different users may have personalized deviation values. Accordingly, an item-specific model can be developed too.
F(u, i, c) = P(u, i) + Σ_{l=1..L} Dev_u(D_l)    (2)
The introduction above gives a high-level picture of how the rating deviations in different contexts can be incorporated into the recommendation model. Context-aware matrix factorization (CAMF) (Baltrunas et al., 2011b) is the first attempt at a deviation-based contextual modeling approach, where it replaces P(u, i) by the predictive function of matrix factorization. Deviation-based contextual sparse linear method (CSLIM) (Zheng et al., 2014b) is another example, which utilizes the prediction function of the sparse linear method (Ning and Karypis, 2011) as the component P(u, i).
Deviation-based contextual modeling is similar to the moving model described in Figure 2, where the aggregation of rating deviations (such as Σ Dev(D_l)) is equivalent to the intercept b in the linear regression model.
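As a sketch of how the deviation terms of Equation 1 could be fitted, the following stochastic-gradient loop learns one global deviation per context condition on top of a fixed baseline P(u, i). The data layout and function names are hypothetical illustrations, not CAMF’s or CSLIM’s actual implementation:

```python
def learn_deviations(ratings, baseline, n_dims, lr=0.05, epochs=200):
    """Learn a global deviation per context condition (Equation 1 style).

    ratings  : iterable of (user, item, context, rating) tuples, where
               context is a tuple of condition ids (0 means "N/A", i.e.
               the baseline context c0).
    baseline : dict (user, item) -> context-free prediction P(u, i).
    Returns dev, where dev[d][cond] is the learned rating deviation of
    condition `cond` in dimension d relative to the N/A condition.
    """
    dev = [dict() for _ in range(n_dims)]
    for _ in range(epochs):
        for u, i, ctx, r in ratings:
            pred = baseline[(u, i)] + sum(
                dev[d].get(c, 0.0) for d, c in enumerate(ctx))
            err = pred - r
            for d, c in enumerate(ctx):
                if c != 0:  # the N/A condition keeps deviation 0
                    dev[d][c] = dev[d].get(c, 0.0) - lr * err
    return dev
```

In a full model the baseline predictions and the deviations would be learned jointly; here the baseline is frozen to keep the sketch short.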
3.2.2. Similarity-Based Models
By contrast, similarity-based contextual modeling tries to learn the similarity between two context situations.
Context  D1: Time  D2: Location 
c0  N/A  N/A 
c2  Weekday  Cinema 
Sim(Di)  0.5  0.1 
Table 4 gives an example of context similarity. The table is similar to the one used to represent rating deviations in contexts. The last row in Table 4 gives the similarity of the contexts c0 and c2 in each context variable. In fact, the term “context similarity” or “similarity of contexts” actually refers to the similarity of a user’s rating behavior in two contexts.
Therefore, the predicted rating or ranking score can be described by Equation 3.
F(u, i, c) = P(u, i) × Sim(c0, c)    (3)
Again, any predictive function from the traditional recommender systems can be used in place of P(u, i). The challenge becomes how to measure the similarity between c0 and c.
Zheng et al. (Zheng et al., 2015c) propose three methods to represent the similarity of contexts. The independent context similarity (ICS) assumes that the similarity between two contexts equals the product of the similarities between the two context conditions in each context variable. In this case, the model learns the similarity between every two context conditions in the same context variable. The latent context similarity (LCS) is an improved method based on ICS, where each context condition is represented by a latent vector, and therefore the similarity between two context conditions can be estimated by the dot product of the two corresponding vectors. LCS is used to alleviate the cold-start context problem in ICS.
The multidimensional context similarity (MCS) is the most effective but also the most complicated one. An example visualization is shown in Figure 3. In MCS, each context variable is depicted as a dimension or an axis in a multidimensional space. Each context condition is assigned a real-number value so that it can be placed at a specific position. As a result, a context situation, such as {Weekend, Family, Home}, can be represented as a point in the space, and the dissimilarity of two context situations becomes the distance between the two points. In MCS, the model learns the positions of the context conditions. For example, in Figure 3, we change the positions of “Family” and “Kids”, which results in different distance values between the two contexts, since the positions of the points change when the values for these two context conditions are updated.
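A minimal sketch of the three similarity representations follows; the condition names and the parameter values are hypothetical stand-ins for what the models would actually learn:

```python
import numpy as np

def ics(ctx_a, ctx_b, cond_sim):
    """Independent Context Similarity: the product of per-dimension
    condition similarities; cond_sim[d] maps an unordered pair of
    conditions of dimension d to a (learned) similarity value."""
    sim = 1.0
    for d, (a, b) in enumerate(zip(ctx_a, ctx_b)):
        sim *= 1.0 if a == b else cond_sim[d][frozenset((a, b))]
    return sim

def lcs(ctx_a, ctx_b, vectors):
    """Latent Context Similarity: each condition is a latent vector and
    a condition pair's similarity is the dot product, so pairs unseen
    at training time still get a score (easing cold-start contexts)."""
    sim = 1.0
    for a, b in zip(ctx_a, ctx_b):
        sim *= float(np.dot(vectors[a], vectors[b]))
    return sim

def mcs_distance(ctx_a, ctx_b, position):
    """Multidimensional Context Similarity: each condition has a learned
    real-valued position, a context situation becomes a point, and the
    distance between points acts as the dissimilarity."""
    pa = np.array([position[c] for c in ctx_a])
    pb = np.array([position[c] for c in ctx_b])
    return float(np.linalg.norm(pa - pb))
```

In ICS the pairwise table must contain every condition pair, while LCS and MCS score any pair from the learned vectors or positions.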
In short, similarity-based contextual modeling is able to learn the similarity of contexts. Note that this approach is different from the semantic pre-filtering algorithm. As mentioned previously, Codina et al. (Codina et al., 2013) develop a distributional-semantics pre-filtering context-aware recommendation algorithm which is able to calculate the similarity between two contexts. In their approach, the context similarity is calculated by a fixed formula, while in similarity-based contextual modeling the similarity of contexts is learned by minimizing squared errors in the learning process.
Similarity-based contextual modeling is similar to the rotation model we introduced previously, where we vary the value of the slope w in the linear regression model. Of course, it is possible to combine the moving and rotation models. Similarly, we may also combine deviation-based and similarity-based contextual modeling, which we will explore in our future work.
3.3. Independent Contextual Modeling
Independent contextual modeling does not explicitly make assumptions about the dependencies or correlations among users, items and contexts. These models assume the user, item and context dimensions are independent and explore the interactions among the three dimensions. One example is the tensor factorization (TF) (Karatzoglou et al., 2010) that has been applied to context-aware recommendation. The optimization in TF can be realized by either Tucker decomposition or canonical polyadic (CP) decomposition. We choose CP decomposition in this paper, since each context condition can then be represented by a vector. Therefore, we can still calculate the similarity of contexts by applying the latent context similarity (LCS) (Zheng et al., 2015c) to the learned vectors in TF.
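Under CP decomposition, the predicted score is a sum over latent factors of the user, item and context vectors multiplied elementwise. The sketch below uses made-up vectors, and combines several active context conditions by elementwise product as a simplifying assumption; it also shows how the learned condition vectors feed the LCS similarity:

```python
import numpy as np

def cp_score(u_vec, i_vec, cond_vecs):
    """CP-decomposition score: sum over latent factors of the
    elementwise product of the user, item and context vectors.
    Multiple active context conditions are combined here by
    elementwise product (a simplifying assumption of this sketch)."""
    ctx = np.ones_like(u_vec)
    for c in cond_vecs:
        ctx = ctx * c
    return float(np.sum(u_vec * i_vec * ctx))

def condition_similarity(cond_a, cond_b):
    """LCS on top of TF: reuse the learned condition vectors and take
    the dot product as the similarity of two context conditions."""
    return float(np.dot(cond_a, cond_b))
```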
3.4. Capability of Interpretations
We interpreted the dependent and independent contextual modeling approaches in our own way in the previous sections. We summarize the capability of interpretation of these contextual modeling methods as follows:

- In dependent contextual modeling, the models explicitly make assumptions about the dependencies or correlations among users, items and contexts. For example, the deviation-based models can learn the rating deviations in different contexts, and these deviations can be user-specific or item-specific. By contrast, with independent contextual modeling it may be more difficult to interpret the contextual effects in the recommendation process.

- Deviation-based contextual modeling can tell the rating deviations between two contexts, e.g., that a user’s rating in one context condition is higher or lower than in another. It interprets why a user gives a higher rating in one context situation than in another, even if the user watches the same movie or listens to the same music track.

- Similarity-based contextual modeling learns the similarity between two contexts. It is quite useful, since we do not need an exact match between two context situations; instead, we can tell how similar the user ratings will be if they are given within the two contexts.

- It is much more difficult to interpret independent contextual modeling. By using CP decomposition in TF, we can interpret the similarity of contexts using the learned vectors in TF and the latent context similarity.
4. Experiment and Results
In this paper, we focus on interpreting the contextual effects, or the contextual modeling, by comparing context similarities, and ignore the rating deviations in different contexts. The underlying reasons are as follows:

- We can obtain the rating deviations in different contexts, but it is difficult to evaluate whether they are true or not. By contrast, we can examine the context similarity by common sense. For example, given a context situation, we can utilize the similarity-based contextual modeling and the TF approach to retrieve the top similar contexts, and compare the lists by our common sense.

- The TF approach can only interpret the context similarities, which matters if we would like to add the independent contextual modeling to the experimental comparison.

- We can also compare the quality of context similarity between contextual modeling and the contextual pre-filtering approach developed by Codina et al. (Codina et al., 2013).
STS Data (temperature: cold) 
CSLIM_MCS  TF  SPF 
(the top-5 lists consist of combinations of context conditions; the retrieved entries include “season: winter” and “season: winter, mood: lazy”) 
Music Data (landscape: country side) 
CSLIM_MCS  TF  SPF 
1  landscape: mountains  natural phenomena: afternoon  natural phenomena: afternoon 
2  road type: serpentine  natural phenomena: morning  landscape: mountains 
3  road type: city  road type: city  natural phenomena: morning 
4  natural phenomena: afternoon  landscape: mountains  road type: city 
5  traffic conditions: free road  road type: serpentine  natural phenomena: day time 
4.1. Experimental Setting
In the domain of context-aware recommendation, there is a very limited number of data sets available for research. One reason is that context acquisition is difficult; therefore, most data were collected from surveys, which results in small or sparse data sets. Another reason is that context information is usually related to user privacy concerns, so it is difficult to obtain large context-aware data sets from industry.
There is a list of available context-aware data sets at https://github.com/irecsys/CARSKit/tree/master/contextaware_data_sets, from which we select the South Tyrol Suggests (STS) data and the music data, since they have many more context dimensions and conditions. More specifically,

- The STS data (Braunhofer et al., 2013) was collected from a mobile app which provides context-aware suggestions for attractions, events, public services, restaurants, and much more for South Tyrol. There are 14 contextual dimensions in total, such as budget, companion, daytime, mood, season, weather, etc. The total number of context conditions over all 14 dimensions is 53. There are 2,354 ratings (scale 1–5) given by 325 users on 249 items within different context situations.

- The music data (Baltrunas et al., 2011a) was collected from InCarMusic, an Android mobile application offering music recommendations to the passengers of a car. Users were requested to enter ratings for some items using a web application. There are 8 context dimensions included in the data: driving style, road type, landscape, sleepiness, traffic conditions, mood, weather, and natural phenomena, with 26 context conditions in total. 3,251 ratings (scale 1–5) were given by 42 users on 139 items within different context situations.
4.2. Evaluation Protocol
We add the following models to the comparison: tensor factorization (TF) (Karatzoglou et al., 2010), the similarity-based contextual sparse linear method using multidimensional context similarity (CSLIM_MCS) (Zheng et al., 2015c), and the semantic pre-filtering algorithm (SPF) (Codina et al., 2013). TF is the representative of independent contextual modeling, while CSLIM_MCS has been demonstrated as the best-performing dependent contextual modeling approach utilizing context similarity. SPF is our baseline; it is a pre-filtering algorithm that utilizes context similarity.
First of all, we examine the quality of the top-10 context-aware recommendations by these algorithms. We select precision and normalized discounted cumulative gain (NDCG) as the evaluation metrics. After that, we retrieve the top-5 similar context situations given a context, and compare the lists of ranked contextual situations to see which approach is better. We use CARSKit (Zheng et al., 2015a), an open-source context-aware recommendation library, in our experiments. All of these evaluations are based on 5-fold cross-validation, since the data sets are relatively small.
4.3. Results and Findings
The quality of the top-10 context-aware recommendations is described by Figure 4, where the bars present the results in precision and the curves describe the results in NDCG, which reflects the quality of the rankings. CSLIM_MCS is clearly the best-performing model, obtaining the highest precision and NDCG on both data sets. TF performs better than SPF on the STS data, but SPF is the better one on the music data. Dependent contextual modeling is usually better than independent modeling, since there are always dependencies among users, items and contexts in the data, especially in a data set that has many more context dimensions. It is therefore not surprising that TF fails to outperform CSLIM_MCS on these two data sets.
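For reference, the two ranking metrics reported here can be computed as in the following binary-relevance sketch (an illustration, not CARSKit’s actual implementation):

```python
import math

def precision_at_k(ranked_items, relevant, k=10):
    """Fraction of the top-k recommended items that are relevant."""
    top = ranked_items[:k]
    return sum(1 for item in top if item in relevant) / len(top)

def ndcg_at_k(ranked_items, relevant, k=10):
    """Binary-relevance NDCG@k: discounted gain of the ranked list
    normalized by the gain of an ideal ranking of the relevant items."""
    dcg = sum(1.0 / math.log2(rank + 2)
              for rank, item in enumerate(ranked_items[:k])
              if item in relevant)
    ideal = sum(1.0 / math.log2(rank + 2)
                for rank in range(min(len(relevant), k)))
    return dcg / ideal if ideal > 0 else 0.0
```

Precision ignores positions within the top-k list, which is why NDCG is reported alongside it as the ranking-quality measure.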
Afterwards, we want to compare whether these models can retrieve high-quality similar contexts. In the STS data, we select “temperature: cold” as the target context situation. We run the CSLIM_MCS and SPF algorithms on the STS data and output the top-5 contextual situations most similar to this target context. For the TF approach, we utilize the latent context similarity (LCS) (Zheng et al., 2015c) based on the learned vectors in TF to retrieve the top-5 similar contexts. We choose “landscape: country side” as the target context and apply the same process to the music data.
The results are described by Table 5. Note that there is only one context condition per rating profile in the music data, while a context situation is a combination of context conditions in the STS data. Based on the results in Table 5, we can observe that the CSLIM_MCS approach retrieves more contextual situations containing combinations with “temperature: cold” in the STS data, while TF and SPF retrieve context conditions in other dimensions, such as season, daytime and mood, without the pair “temperature: cold”. This may help explain why CSLIM_MCS outperforms the other approaches: it better learns the similarity of contexts.
Similar patterns can be observed in the music data. For example, CSLIM_MCS retrieves more relevant context situations, such as the conditions in the landscape and road type dimensions, while TF and SPF retrieve more situations in the natural phenomena dimension.
Based on these comparisons, we can tell that CSLIM_MCS is the best-performing context-aware recommendation algorithm for these two data sets. According to the retrieved top-5 similar contexts, we can also observe that CSLIM_MCS is better able to learn and retrieve relevant context situations given a target context. In this way, we are able to understand why one contextual modeling approach is better than the others through comparison and interpretation.
5. Conclusions and Future Work
In this paper, we focused on the capability of interpretation of the contextual modeling approaches in recommender systems. We reshaped, or re-explained, the structure of the independent and dependent contextual modeling approaches, and exploited the interpretations of contextual effects based on the similarity of contexts. Our experimental results on the STS and music data show that CSLIM_MCS is the best-performing context-aware recommendation algorithm for these two data sets, since it is better able to learn the similarity of contexts.
In this paper, we did not evaluate the capability of interpretation of the deviation-based contextual modeling. In our future work, we will seek appropriate ways to evaluate the quality of rating deviations in different contexts, and also try to figure out a way to compare the interpretations by the deviation-based and similarity-based contextual modeling.
References
 Adomavicius et al. (2011) Gediminas Adomavicius, Bamshad Mobasher, Francesco Ricci, and Alexander Tuzhilin. 2011. Context-Aware Recommender Systems. AI Magazine 32, 3 (2011), 67–80.
 Adomavicius and Tuzhilin (2011) Gediminas Adomavicius and Alexander Tuzhilin. 2011. Context-aware recommender systems. In Recommender systems handbook. Springer, 217–253.
 Baltrunas et al. (2011a) Linas Baltrunas, Marius Kaminskas, Bernd Ludwig, Omar Moling, Francesco Ricci, Aykan Aydin, Karl-Heinz Lüke, and Roland Schwaiger. 2011a. InCarMusic: Context-aware music recommendations in a car. In E-Commerce and Web Technologies. Springer, 89–100.
 Baltrunas et al. (2011b) Linas Baltrunas, Bernd Ludwig, and Francesco Ricci. 2011b. Matrix factorization techniques for context aware recommendation. In Proceedings of the fifth ACM conference on Recommender systems. ACM, 301–304.
 Baltrunas and Ricci (2009) Linas Baltrunas and Francesco Ricci. 2009. Context-based splitting of item ratings in collaborative filtering. In Proceedings of ACM conference on Recommender systems. 245–248.
 Braunhofer et al. (2013) Matthias Braunhofer, Mehdi Elahi, Francesco Ricci, and Thomas Schievenin. 2013. Context-Aware Points of Interest Suggestion with Dynamic Weather Data Management. In Information and Communication Technologies in Tourism 2014. Springer, 87–100.
 Codina et al. (2013) Victor Codina, Francesco Ricci, and Luigi Ceccaroni. 2013. Exploiting the semantic similarity of contextual situations for pre-filtering recommendation. In User Modeling, Adaptation, and Personalization. Springer, 165–177.
 Karatzoglou et al. (2010) Alexandros Karatzoglou, Xavier Amatriain, Linas Baltrunas, and Nuria Oliver. 2010. Multiverse recommendation: n-dimensional tensor factorization for context-aware collaborative filtering. In Proceedings of the fourth ACM conference on Recommender systems. ACM, 79–86.
 Ning and Karypis (2011) Xia Ning and George Karypis. 2011. SLIM: Sparse linear methods for top-n recommender systems. In Data Mining (ICDM), 2011 IEEE 11th International Conference on. IEEE, 497–506.
 Panniello et al. (2009) Umberto Panniello, Alexander Tuzhilin, Michele Gorgoglione, Cosimo Palmisano, and Anto Pedone. 2009. Experimental comparison of pre- vs. post-filtering approaches in context-aware recommender systems. In Proceedings of the third ACM conference on Recommender systems. ACM, 265–268.
 Ramirez-Garcia and García-Valdez (2014) Xochilt Ramirez-Garcia and Mario García-Valdez. 2014. Post-filtering for a restaurant context-aware recommender system. In Recent Advances on Hybrid Approaches for Designing Intelligent Systems. Springer, 695–707.
 Zheng (2015) Yong Zheng. 2015. A revisit to the identification of contexts in recommender systems. In Proceedings of the Conference on Intelligent User Interfaces Companion. ACM, 133–136.
 Zheng et al. (2012) Yong Zheng, Robin Burke, and Bamshad Mobasher. 2012. Differential Context Relaxation for Context-Aware Travel Recommendation. In E-Commerce and Web Technologies. Springer Berlin Heidelberg, 88–99.
 Zheng et al. (2013a) Yong Zheng, Robin Burke, and Bamshad Mobasher. 2013a. Recommendation with Differential Context Weighting. In User Modeling, Adaptation, and Personalization. Springer Berlin Heidelberg, 152–164.
 Zheng et al. (2013b) Yong Zheng, Robin Burke, and Bamshad Mobasher. 2013b. The Role of Emotions in Context-aware Recommendation. In ACM RecSys’ 13, Proceedings of the 3rd International Workshop on Human Decision Making in Recommender Systems. ACM, 21–28.
 Zheng et al. (2014a) Yong Zheng, Robin Burke, and Bamshad Mobasher. 2014a. Splitting approaches for context-aware recommendation: An empirical study. In Proceedings of the 29th Annual ACM Symposium on Applied Computing. ACM, 274–279.
 Zheng et al. (2014b) Yong Zheng, Bamshad Mobasher, and Robin Burke. 2014b. CSLIM: Contextual SLIM Recommendation Algorithms. In Proceedings of the 8th ACM Conference on Recommender Systems. ACM, 301–304.
 Zheng et al. (2015a) Yong Zheng, Bamshad Mobasher, and Robin Burke. 2015a. CARSKit: A java-based context-aware recommendation engine. In Data Mining Workshop (ICDMW), 2015 IEEE International Conference on. IEEE, 1668–1671.
 Zheng et al. (2015b) Yong Zheng, Bamshad Mobasher, and Robin Burke. 2015b. Integrating Context Similarity with Sparse Linear Recommendation Model. In Proceedings of the Conference on User Modeling Adaptation and Personalization. 370–376.
 Zheng et al. (2015c) Yong Zheng, Bamshad Mobasher, and Robin Burke. 2015c. Similarity-Based Context-aware Recommendation. In Proceedings of the 2015 Conference on Web Information Systems Engineering. Springer Berlin Heidelberg, 431–447.