Introduction
With their rise in prominence, recommendation systems have greatly alleviated information overload for their users by providing personalized suggestions for countless products such as music, movies, books, housing, and jobs. We consider a specific recommender system domain, that of job recommendations, and propose to develop a novel method for this domain using Statistical Relational Learning. This domain easily scales to billions of items, including user resumes and job postings, as well as even more data in the form of user interactions between these items. CareerBuilder, the source of the data for our experiments, operates one of the largest job boards in the world. It has millions of job postings, more than 60 million actively searchable resumes, over one billion searchable documents, and receives several million searches per hour [AlJadda et al. 2014]. The scale of the data is not the only interesting aspect of this domain, however. The job recommendation use case is inherently relational in nature, readily allowing graph mining and relational learning algorithms to be employed. As Figure 1 shows, very similar kinds of relationships exist among the jobs that are applied to by the same users and among the users who share similar preferences.
One of the popular recommender approaches is content-based filtering [Basu, Hirsh, and Cohen 1998], which exploits the relations between (historically) applied-to jobs and similar features among new job opportunities under consideration (with features usually derived from textual information). An alternative recommendation approach is based on collaborative filtering [Breese, Heckerman, and Kadie 1998], which makes use of the fact that users who are interested in the same items generally have similar preferences for additional items. Clearly, using both types of information together can potentially yield a more powerful recommendation system, which is why model-based hybrid recommender systems were developed [Basilico and Hofmann 2004]. While successful, these systems typically need extensive feature engineering to make the combination practical.
Our hypothesis, which we sought to verify empirically, was that recent advancements in the fields of machine learning and Artificial Intelligence could lead to powerful and deployable recommender systems. In particular, we assessed leveraging Statistical Relational Learning (SRL) [Getoor and Taskar 2007], which combines the representation abilities of rich formalisms such as first-order logic or relational logic with the ability of probability theory to model uncertainty. We employed a state-of-the-art SRL formalism for combining content-based filtering and collaborative filtering. SRL can directly represent the probabilistic dependencies among the attributes of different objects that are related to each other through certain connections (in our domain, for example, the jobs applied to by the same user, or the users who share the same skills or employers). SRL models remove the necessity for an extensive feature engineering process, and they do not require learning separate recommendation models for each individual item or user cluster, a requirement for many standard model-based recommendation systems [Pazzani and Billsus 1997].

We propose a hybrid model combining content-based filtering and collaborative filtering that is learned by an efficient statistical relational learning approach, Relational Functional Gradient Boosting (RFGB) [Natarajan et al. 2012]. Specifically, we define the target relation as Match(User, Job), which indicates that the user-job pair is a match when the grounded relation is true; hence, that job should be recommended to the target user. The task is to predict the probability of this target relation for users based on information about the job postings, the user profile, and the application history, as well as the application histories of users that have similar preferences or profiles to the target user. RFGB is a boosted model which contains multiple relational regression trees with additive regression values at the leaf node of each path. Our hypothesis is that these trees can capture many of the weak relations that exist between the target user and the job with which he or she is matched.

In addition, this domain has practical requirements which must be considered. For example, we would rather overlook some of the candidate jobs that could match the users (false negatives) than send out numerous spam emails to the users with inappropriate job recommendations (false positives). The cost matrix thus does not contain uniform cost values, but instead needs to represent a higher cost for user-job pairs that are false positives than for those that are false negatives, i.e., precision is preferred over recall. To incorporate such domain knowledge within the cost matrix, we adapted previous work [Yang et al. 2014], which extended RFGB by introducing a penalty term into its objective function so that the trade-off between precision and recall can be tuned during the learning process.
In summary, we considered the problem of matching a user with a job and developed a hybrid content-based filtering and collaborative filtering approach. We adapted a successful SRL algorithm for learning features and weights and are the first to implement such a system in a real-world big data context. Our algorithm is capable of handling different costs for false positives and false negatives, making it extremely attractive for deployment within many kinds of recommendation systems, including those within the domain upon which we tested.
Related Work
Recommendation systems usually handle the task of estimating the relevancy or ratings of items for certain users, based on information about the target user-item pair as well as other related items and users. The recommendation problem is usually formulated as

u : C × S → R,

where C is the space of all users, S is the space of all possible items, and u is the utility function that projects all combinations of user-item pairs to a set of predicted ratings R, which is composed of non-negative integers. For a certain user c, the recommended item would be the item s with the optimal utility value, i.e. s*_c = argmax_{s∈S} u(c, s). The user space C contains the information about all the users, such as their demographic characteristics, while the item space S contains the feature information of all the items, such as the genre of the music, the director of a movie, or the author of a book.

Generally speaking, the goal of content-based filtering is to define recommendations based upon feature similarities between the items being considered and items which a user has previously rated as interesting [Adomavicius and Tuzhilin 2005], i.e., for the target user-item rating u(c, s), content-based filtering would predict the optimal recommendation based on the utility functions u(c, s_i), which encode the historical rating information of user c on items s_i similar to s. Having originated from information retrieval and information filtering, most content-based filtering systems are applied to items that are rich in textual information. From this textual information, item features are extracted and represented as keywords with respective weighting measures calculated by certain mechanisms, such as the term frequency/inverse document frequency (TF/IDF) measure [Salton 1989]. The feature space of the user is then constructed from the feature spaces of items that were previously rated by that user, through various keyword analysis techniques such as the averaging approach [Rocchio 1971] or a Bayesian classifier [Pazzani and Billsus 1997]. Finally, the utility function of the target user-item pair is calculated by some scoring heuristic, such as the cosine similarity [Salton 1989] between the user profile vector and the item feature vector, or by some traditional machine learning model [Pazzani and Billsus 1997].

On the other hand, the goal of collaborative filtering is to recommend items by learning from users with similar preferences [Adomavicius and Tuzhilin 2005, Su and Khoshgoftaar 2009, Rao et al. 2015], i.e., for the target user-item rating u(c, s), collaborative filtering builds its belief in the best recommendation by learning from the utility functions u(c_j, s) of the set of users {c_j} who have similar preferences to the target user c. The commonly employed approaches fall into two categories: memory-based (or heuristic-based) and model-based systems. The heuristic-based approaches usually predict the rating of the target user-item pair by aggregating the ratings of the most similar users for the same item with various aggregation functions, such as the mean, the similarity-weighted mean, or the adjusted similarity-weighted mean (which uses relative rating scales instead of absolute values to address rating scale differences among users). The set of most similar users and their corresponding weights can be decided by calculating the correlation (such as the Pearson correlation coefficient [Resnick et al. 1994]) or distance (such as cosine-based [Breese, Heckerman, and Kadie 1998] or mean squared difference) between the rating vectors of the target user and the candidate user on common items. Model-based algorithms, on the other hand, build a recommendation system by training certain machine learning models [Salakhutdinov, Mnih, and Hinton 2007, Breese, Heckerman, and Kadie 1998, Si and Jin 2003, Sahoo, Singh, and Mukhopadhyay 2010] based on the ratings of users that belong to the same cluster or class as the target user. Hence, prior research has focused on applying statistical relational models to collaborative filtering systems [Getoor and Sahami 1999, Newton and Greiner 2004, Gao et al. 2007, Huang, Zeng, and Chen 2005].
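As a toy illustration of the content-based machinery just described, TF/IDF weighting and cosine scoring can be sketched as follows. This is a minimal sketch with hypothetical helper names, not the system used in the paper:

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Build sparse TF/IDF vectors for a list of token lists.
    A standard textbook weighting, not the exact scheme used in the paper."""
    n = len(docs)
    df = Counter(t for doc in docs for t in set(doc))  # document frequency
    vecs = []
    for doc in docs:
        tf = Counter(doc)
        vecs.append({t: tf[t] * math.log(n / df[t]) for t in tf})
    return vecs

def cosine(u, v):
    """Cosine similarity between two sparse vectors represented as dicts."""
    dot = sum(u[t] * v.get(t, 0.0) for t in u)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0
```

A user profile vector could then be built by averaging the vectors of previously applied-to jobs, and a new job scored by its cosine similarity to that profile.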
There are hybrid approaches which combine collaborative filtering and content-based filtering into a unified system [De Campos et al. 2010, Balabanović and Shoham 1997, Basilico and Hofmann 2004]. For instance, Basilico and Hofmann [Basilico and Hofmann 2004] unified content-based and collaborative filtering by engineering features based on various kernel functions, then training a simple linear classifier (a perceptron) in this engineered feature space.
The most closely related work to ours is [Hoxha and Rettinger 2013], which proposed using Markov Logic Networks to build hybrid models combining content-based filtering and collaborative filtering. Their work employed only one type of probabilistic logic model, which is demonstrated later in this paper not to be the best one, and it did not consider the special requirement of many recommendation systems that precision be preferred over recall (or at least that the relative weight of the two be configurable).
Building Hybrid Recommendation Systems with SRL Models
In order to represent the data in a flat table, standard model-based recommendation systems need an exhaustive feature engineering process to construct the user profile by aggregating attributes over all the similar users who share the same background or similar preferences as the target user. Such aggregation-based strategies are necessary because the standard algorithms require a regular flat table to represent the data. However, the number of similar users related to the target user may vary greatly among individuals. For example, users with common preferences could have many more similar users than users with unique tastes.
We propose to employ SRL for the challenging task of implementing a hybrid recommendation system. Specifically, we consider the formulation of Relational Dependency Networks (RDNs) [Neville and Jensen 2007], which are approximate graphical models inferred using the machinery of Gibbs sampling. Figure 2 shows a template model of an RDN learned in our experiment. As can be seen, other than the attributes of the target user A and target job B, it also captures the dependencies between the target predicate and attributes of the similar user D and the previously applied job C. As an approximation of Bayesian networks, Dependency Networks (DNs) make the assumption that the joint distribution can be approximated as the product of the individual conditional probability distributions and that these conditional probability distributions are independent from each other. RDNs extend DNs to relational data and are considered one of the most successful SRL models that have been applied to real-world problems. Hence, we propose to construct a hybrid recommendation system by learning an RDN using a state-of-the-art learning approach, Relational Functional Gradient Boosting (RFGB), which has proven to be one of the most efficient relational learning approaches [Natarajan et al. 2012].

The following subsections first introduce the basic concepts of RFGB, then cover the way we incorporate domain knowledge into the cost matrix so that the proposed hybrid recommendation system can improve the confidence of recommended jobs.
Relational Functional Gradient Boosting
When fitting a probabilistic model P(y | x; θ), standard gradient ascent approaches start with initial parameters θ_0 and iteratively add the gradient (δ_m) of an objective function with respect to θ. Friedman [Friedman 2001] proposed an alternate approach in which the objective function is represented using a regression function ψ over the examples x, and the gradients are computed with respect to ψ(x). Similar to parametric gradient ascent, after m iterations of functional gradient ascent, ψ_m = ψ_0 + Δ_1 + ... + Δ_m.

Each gradient term (Δ_m) is a set of training examples and regression values given by the gradient w.r.t. ψ_{m−1}, i.e., ⟨x_i, Δ_m(x_i)⟩. To generalize from these regression examples, a regression function (generally a regression tree) is learned to fit the gradients. The final model is a sum over these regression trees. Functional gradient ascent is also known as functional gradient boosting (FGB) due to this sequential nature of learning models.
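The functional-gradient loop described above can be sketched propositionally. In this illustrative sketch a trivial mean predictor stands in for the regression-tree learner; a real RFGB implementation fits relational regression trees to the gradients instead:

```python
import math

def sigmoid(r):
    return 1.0 / (1.0 + math.exp(-r))

def fit_mean(examples, grads):
    """Stand-in for a regression-tree learner: predicts the mean gradient
    for every example. Purely illustrative."""
    m = sum(grads) / len(grads)
    return lambda x: m

def boost(examples, labels, iters=10):
    """Functional gradient boosting: each round fits a regressor to the
    pointwise gradients I(y=1) - P(y=1|x) of the log-likelihood."""
    models = []
    for _ in range(iters):
        psi = [sum(h(x) for h in models) for x in examples]
        grads = [y - sigmoid(r) for y, r in zip(labels, psi)]
        models.append(fit_mean(examples, grads))
    return models

def predict(models, x):
    """The final model is the sigmoid of the sum of all fitted regressors."""
    return sigmoid(sum(h(x) for h in models))
```

With the mean regressor, the summed ψ simply converges to the log-odds of the positive class, which makes the additive nature of the model easy to see.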
FGB has been applied to relational models [Natarajan et al. 2012, Karwath, Kersting, and Landwehr 2008, Sutton et al. 2000, Natarajan et al. 2011] due to its ability to learn the structure and parameters of these models simultaneously. Gradients are computed for every grounding/instantiation of the target first-order predicate. In our case, the grounding Match(John, Software Engineer) of the target predicate Match(User, Job) could be one example. Relational regression trees [Blockeel and Raedt 1998] are learned to fit the regression function over the relational regression examples. Since the regression function ψ is unbounded, a sigmoid function over ψ is commonly used to represent conditional distributions. Thus the RFGB log-likelihood function is:

LL = Σ_i log P(y_i | Pa(x_i)),  where  P(y_i = 1 | Pa(x_i)) = exp(ψ(x_i)) / (1 + exp(ψ(x_i))),

and x_i corresponds to a target grounding of an example with parents Pa(x_i). In our case, the target predicate is Match(User, Job), and the parents would be the attributes of the target user and target job, the jobs previously applied to by the target user, and similar users sharing the same preferences. y_i is the true label for a user-job pair, which is 1 for a positive matching pair and 0 for a negative matching pair. The key assumption is that the conditional probability of a target grounding x_i, given all the other predicates, is modeled as a sigmoid function over ψ.
The gradient w.r.t. ψ(x_i) is

Δ(x_i) = I(y_i = 1) − P(y_i = 1 | Pa(x_i)),   (1)

which is the difference between the true observation (I is the indicator function) and the current predicted probability of the match being true. Note that the indicator function returns 1 for positive examples and 0 for negative ones. Hence, the positive gradient terms for positive examples push the regression values toward +∞ and thereby the probabilities closer to 1, whereas for negative examples the regression values are pushed toward −∞ and the probabilities closer to 0.
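The pointwise gradient in Equation 1 is a one-liner; the following sketch (with a hypothetical helper name) shows that a poorly fit example of either class receives a large-magnitude weight:

```python
def rfgb_gradient(is_positive, prob_pos):
    """Gradient of the RFGB log-likelihood w.r.t. psi for one grounding:
    I(y=1) - P(y=1 | parents)."""
    return (1.0 if is_positive else 0.0) - prob_pos
```

For example, a positive grounding currently assigned probability 0.2 gets gradient +0.8, while a negative grounding assigned 0.9 gets gradient −0.9.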
Cost Sensitive Learning with RFGB
Following the work of Yang et al. [Yang et al. 2014], we propose to construct a hybrid job recommendation system by learning a cost-sensitive RDN.
As shown in Equation 1, the magnitude (absolute value) of the gradient in RFGB depends only on how well the current model fits the example. If it fits well, the probability of a positive example given the current model is close to 1 (0 for a negative example), and the gradient assigned to such an example as its training weight approaches 0. If it does not, the predicted probability of the example is far from the true label, which causes the boosting algorithm to attach a high weight to that example. As a result, this method treats false positive and false negative examples in the same way. Since most relational data suffers from class imbalance, with negative instances vastly outnumbering positive ones, the negative outliers can easily dominate the classification boundary after a few iterations. Yang et al. [Yang et al. 2014] therefore proposed a cost-sensitive relational learning approach which is able to address these issues and model the target task more faithfully. This is achieved by adding a term to the objective function that penalizes false positives and false negatives differently. They defined the cost function as:

c(y_i, ŷ_i) = α I(y_i = 1 ∧ ŷ_i = 0) + β I(y_i = 0 ∧ ŷ_i = 1),

where y_i is the true label of the instance and ŷ_i is the predicted label. α is the cost for false negatives (in our case, a matching user-job pair that is predicted as mismatching) and β is the cost for false positives (in our case, a mismatching user-job pair that is classified as matching). This cost function is then introduced into the normalization term of the objective function:

LL = Σ_i log [ exp(ψ(y_i; x_i)) / Σ_{y'} exp(ψ(y'; x_i) + c(y_i, y')) ].
Thus, in addition to simple loglikelihood of the examples, the algorithm also takes into account these additional costs.
Then, the gradient of the objective function w.r.t. ψ(y_i = 1; x_i) can be calculated as:

Δ(x_i) = I(y_i = 1) − λ P(y_i = 1 | Pa(x_i)),   (2)

where

λ = exp(c(y_i, 1)) / [ exp(c(y_i, 1)) P(y_i = 1 | Pa(x_i)) + exp(c(y_i, 0)) P(y_i = 0 | Pa(x_i)) ].   (3)
As shown above, the cost function is controlled by α when a positive example is misclassified, and by β when a negative example is misclassified.

Generally, if α < 0 (respectively β < 0), the algorithm is more tolerant of misclassified positive (negative) examples. Alternately, if α > 0 (β > 0), the algorithm penalizes misclassified positive (negative) examples even more than standard RFGB. Thus, the influence of positive and negative examples on the final learned distribution can be directly controlled by tuning the parameters α and β.
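Plugging the two cases of the cost function into Equation 3 gives the following sketch of the cost-sensitive gradient. This is illustrative code written from the equations above, not the authors' implementation:

```python
import math

def soft_margin_gradient(is_positive, prob_pos, alpha, beta):
    """Cost-sensitive gradient I(y=1) - lambda * P(y=1|parents), with
    lambda from Equation 3: alpha weights false negatives, beta weights
    false positives."""
    if is_positive:
        # c(1,1) = 0, c(1,0) = alpha
        lam = 1.0 / (prob_pos + math.exp(alpha) * (1.0 - prob_pos))
    else:
        # c(0,1) = beta, c(0,0) = 0
        lam = math.exp(beta) / (math.exp(beta) * prob_pos + (1.0 - prob_pos))
    return (1.0 if is_positive else 0.0) - lam * prob_pos
```

With alpha = beta = 0, lambda reduces to 1 and the gradient matches Equation 1; a large positive beta drives the gradient of every negative example toward its maximum magnitude of 1.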
Now, consider the special requirement on the cost matrix in most job recommendation systems: we would rather miss certain candidate jobs which to some extent match the target user than send out recommendations that are not appropriate for that user. In other words, we prefer high precision, as long as recall remains above a reasonable value so that the system does not return zero recommendations for the target user.
Since α is the parameter controlling the weight of false negative examples, we simply set it to 0, which makes λ = 1 for misclassified positive examples. As a result, the gradient of the positive examples is the same as in the original RFGB setting.
For the false positive examples, we use a harsher penalty, so the algorithm puts more effort into classifying them correctly in the next iteration. According to Equation 3, for a negative example (y_i = 0), we have

λ = exp(β) / [ exp(β) P(y_i = 1 | Pa(x_i)) + P(y_i = 0 | Pa(x_i)) ].

As β → ∞, λ → 1 / P(y_i = 1 | Pa(x_i)), hence λ P(y_i = 1 | Pa(x_i)) → 1, so Δ(x_i) → −1. This means the gradient is pushed closer to its maximum magnitude of 1, no matter how close the predicted probability is to the true label. On the other hand, when β → −∞, λ → 0, hence λ P(y_i = 1 | Pa(x_i)) → 0, which means the gradients are pushed closer to their minimum magnitude of 0. So, in our experiment, we set β to a large positive value, which amounts to putting a large negative weight on the false positive examples.
Consider a medical diagnosis task, where we would wish to correctly classify as many positive examples as possible while avoiding overfitting the negative examples. In such a case, setting β < 0 can satisfy the domain requirements on the cost matrix (i.e., classifying a negative example as positive is to some extent tolerable), and at the same time handle special properties of the data (i.e., that the class distribution is highly imbalanced, with negative examples in the majority).
In a job recommendation system, by contrast, the major goal is to avoid misclassified false positive examples. As a result, we need to eliminate the noise/outliers in the negative examples as much as possible. Most algorithms generate negative examples by randomly drawing objects from the two related variables and assuming that any pair not known to be positively related in the given facts is a negative pair. However, in our case, if we randomly draw instances from User and Job and assume a pair is negative whenever that grounded user never applied to that grounded job, we could introduce a lot of noise into the data, since not applying could be the result of any number of reasons. For example, it could simply be that the job was never seen by the user. Hence, instead of generating negative instances under a closed-world assumption, as is common with relational data, we generated the negative examples by extracting the jobs that were sent to a user as recommendations but not applied to by that user. In this way, we can guarantee that such a User-Job pair is indeed not matching.
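This negative-example generation strategy amounts to a simple set subtraction over recommendation and application logs. A sketch, assuming a hypothetical userid → jobid-set schema for both logs:

```python
def build_negative_pairs(recommended, applied):
    """Negative User-Job pairs: jobs that were recommended to a user but
    not applied to, rather than closed-world random (user, job) draws.
    `recommended` and `applied` map userid -> set of jobids (an assumed
    schema for illustration)."""
    negatives = set()
    for user, jobs in recommended.items():
        for job in jobs - applied.get(user, set()):
            negatives.add((user, job))
    return negatives
```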
Table 1: Data scale description.

| Job Title | Training pos | Training neg | Training facts | Test pos | Test neg | Test facts |
|---|---|---|---|---|---|---|
| Class20: Retail Sales Consultant | 224 | 6973 | 13340875 | 53 | 1055 | 8786938 |
Table 2: Results for Class20.

| Model | FPR | FNR | Precision | Recall | Accuracy | AUC-ROC |
|---|---|---|---|---|---|---|
| Content-based Filtering | 0.537 | 0.321 | 0.060 | 0.679 | 0.473 | 0.628 |
| Soft Content-based Filtering | 0.040 | 0.868 | 0.143 | 0.132 | 0.921 | 0.649 |
| Hybrid Recommender | 0.516 | 0 | 0.089 | 1.0 | 0.509 | 0.776 |
| Soft Hybrid Recommender | 0.045 | 0.906 | 0.096 | 0.094 | 0.914 | 0.755 |
Experiments
We extracted 4 months of user job application history and active job posting records, and evaluated our proposed model on these data. Our intention was to investigate whether our proposed model can efficiently construct a hybrid recommendation system with cost-sensitive requirements, by explicitly addressing the following questions:

(Q1) How does incorporating collaborative filtering improve performance compared with content-based filtering alone?

(Q2) Can the proposed cost-sensitive SRL learning approach reduce false positive predictions without sacrificing too much on the other evaluation measures?
To answer these questions, we extracted 9 attributes from user resumes as well as job postings, defined as the first-order predicates: JobSkill(jobid, skillid), UserSkill(userid, skillid), JobClass(jobid, classid), UserClass(userid, classid), PrAppliedJob(userid, jobid), UserJobDis(userid, jobid, distance), UserCity(userid, cityname), MostRecentCompany(userid, companyid), MostRecentJobTitle(userid, jobtitle).
There are 707820 total job postings in our sample set, and the number of possible instances the first-order variables can take is shown below.

| Variable Name | skillid | classid | distance | cityname | companyid | jobtitle |
|---|---|---|---|---|---|---|
| Num of Instances | 8534 | 1867 | 4 | 22137 | 1154623 | 823733 |
Information on JobClass and UserClass is extracted based upon the work of Javed et al. [Javed et al. 2015]. The other user-related features are UserSkill, UserCity, MostRecentCompany and MostRecentJobTitle, which are extracted from either the user's resume or the user's profile document, whereas the job feature JobSkill represents a desired skill extracted from the job posting. The predicate UserJobDis indicates the distance between the user (first argument) and the job (second argument), calculated from the user and job locations extracted from the respective documents. The UserJobDis feature is discretized into 4 classes (1: < 15 miles; 2: [15 miles, 30 miles); 3: [30 miles, 60 miles]; 4: > 60 miles). The predicate PrAppliedJob defines the previously applied jobs and serves both as an independent predicate indicating whether the target user is in a cold-start scenario, and as a bridge which introduces the attributes of other jobs related to the target user into the search space during the learning process.
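The UserJobDis discretization can be written directly as a lookup; in this sketch the boundary handling of the first and last bins is inferred from the intervals given in the text:

```python
def discretize_distance(miles):
    """Map a raw user-job distance in miles to the 4 UserJobDis classes
    described above (first/last bin edges inferred from the text)."""
    if miles < 15:
        return 1
    elif miles < 30:
        return 2
    elif miles <= 60:
        return 3
    else:
        return 4
```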
We also use three additional first-order predicates: CommSkill(userid1, userid2), CommClass(userid1, userid2) and CommCity(userid1, userid2), which are induced from the given groundings of the predicates UserSkill, UserClass and UserCity, and which also serve as bridges introducing the features of other users who share a similar background with the target user.
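Inducing a Comm* predicate from attribute groundings is essentially a self-join on shared attribute values. A sketch, assuming a hypothetical userid → value-set mapping for a predicate such as UserSkill:

```python
from itertools import combinations

def induce_comm_predicate(user_attr):
    """Induce CommSkill-style groundings: all user pairs that share at
    least one value of an attribute predicate (userid -> set of values,
    an assumed schema for illustration)."""
    by_value = {}
    for user, values in user_attr.items():
        for v in values:
            by_value.setdefault(v, set()).add(user)
    pairs = set()
    for users in by_value.values():
        for u1, u2 in combinations(sorted(users), 2):
            pairs.add((u1, u2))
    return pairs
```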
The performance of our model is evaluated on user classes whose data scale descriptions are shown in Table 1.
For each of these user classes, we experimented with our proposed model using the first-order predicates of content-based filtering alone, as well as the first-order predicates of both content-based filtering and collaborative filtering.
As Table 2 shows, although the two approaches show similar performance on False Positive Rate, Precision, and Accuracy, the hybrid recommendation system improves considerably on False Negative Rate, Recall, and AUC-ROC compared with content-based filtering alone, especially on Recall (which reached 1.0 for all three of the user classes). So, question (Q1) can be answered affirmatively: the hybrid recommendation system improves upon the performance of content-based filtering alone by taking into consideration the information of similar users who have the same expertise or location as the target user.
The first column of Table 2 shows the False Positive Rate, which we want to reduce. As the numbers show, the soft-margin approach greatly decreases the FPR compared with prior approaches that do not consider the domain preferences on the cost matrix, and it significantly improves the accuracy at the same time. Note that although recall seems to be considerably sacrificed, our goal here is not to capture all the matching jobs for the target user, but instead to increase the confidence in the recommendations we give to our users. Since we may have hundreds of millions of candidates and jobs in the data pool, we can usually guarantee a sufficient number of recommendations even with relatively low recall. Hence, question (Q2) can also be answered affirmatively. Moreover, our proposed system can satisfy various requirements on the trade-off between precision and recall for different practical considerations by tuning the parameters α and β. If one does not want the recall too low, in order to guarantee the quantity of recommendations, one can simply decrease the value of β; if one does not want the precision too low, in order to improve customer satisfaction, one can increase the value of β.
It is worth mentioning that we also tried to experiment with Markov Logic Networks on the same data using Alchemy 2 [Kok et al. 2009]. However, it failed to finish after running continuously for three months, due to the large scale of our data. This underscores one of the major contributions of this research: applying SRL in a hybrid approach to a real-world, large-scale job recommendation system.
Conclusion
We proposed an efficient statistical relational learning approach to construct a hybrid job recommendation system which can also satisfy the unique cost requirements regarding precision and recall of a specific domain. The experimental results show the ability of our model to reduce the rate of inappropriate job recommendations. Our contributions include: i. we are the first to apply statistical relational learning models to a real-world, large-scale job recommendation system; ii. our proposed model not only proves to be an efficient SRL learning approach, but also demonstrates its ability to further reduce false positive predictions; iii. the experimental results reveal a promising direction for future hybrid recommendation systems: with proper utilization of first-order predicates, an SRL-model-based hybrid recommendation system can not only obviate the need for exhaustive feature engineering or pre-clustering, but can also provide a robust way to address the cold-start problem.
References
 [Adomavicius and Tuzhilin 2005] Adomavicius, G., and Tuzhilin, A. 2005. Toward the next generation of recommender systems: A survey of the state-of-the-art and possible extensions. IEEE Trans. on Knowl. and Data Eng. 17(6):734–749.
 [AlJadda et al. 2014] AlJadda, K.; Korayem, M.; Grainger, T.; and Russell, C. 2014. Crowdsourced query augmentation through semantic discovery of domain-specific jargon. In 2014 IEEE International Conference on Big Data, 808–815. IEEE.
 [Balabanović and Shoham 1997] Balabanović, M., and Shoham, Y. 1997. Fab: content-based, collaborative recommendation. Communications of the ACM 40(3):66–72.
 [Basilico and Hofmann 2004] Basilico, J., and Hofmann, T. 2004. Unifying collaborative and content-based filtering. In Proceedings of the Twenty-first International Conference on Machine Learning.
 [Basu, Hirsh, and Cohen 1998] Basu, C.; Hirsh, H.; and Cohen, W. 1998. Recommendation as classification: Using social and content-based information in recommendation. In Fifteenth National Conference on Artificial Intelligence, 714–720. AAAI Press.

 [Blockeel and Raedt 1998] Blockeel, H., and Raedt, L. D. 1998. Top-down induction of first-order logical decision trees. Artificial Intelligence 101:285–297.
 [Breese, Heckerman, and Kadie 1998] Breese, J. S.; Heckerman, D.; and Kadie, C. 1998. Empirical analysis of predictive algorithms for collaborative filtering. In Proceedings of the Fourteenth Conference on Uncertainty in Artificial Intelligence, 43–52. Morgan Kaufmann Publishers Inc.
 [De Campos et al. 2010] De Campos, L. M.; Fernández-Luna, J. M.; Huete, J. F.; and Rueda-Morales, M. A. 2010. Combining content-based and collaborative recommendations: A hybrid approach based on Bayesian networks. International Journal of Approximate Reasoning 51(7):785–799.
 [Friedman2001] Friedman, J. H. 2001. Greedy function approximation: A gradient boosting machine. Annals of Statistics 1189–1232.
 [Gao et al. 2007] Gao, Y.; Qi, H.; Liu, J.; and Liu, D. 2007. A recommendation algorithm combining user grade-based collaborative filtering and probabilistic relational models. In Fourth International Conference on Fuzzy Systems and Knowledge Discovery, volume 1, 67–71. IEEE.
 [Getoor and Sahami1999] Getoor, L., and Sahami, M. 1999. Using probabilistic relational models for collaborative filtering. In Workshop on Web Usage Analysis and User Profiling (WEBKDD’99).
 [Getoor and Taskar2007] Getoor, L., and Taskar, B. 2007. Introduction to Statistical Relational Learning. Adaptive computation and machine learning. MIT Press.
 [Hoxha and Rettinger 2013] Hoxha, J., and Rettinger, A. 2013. First-order probabilistic model for hybrid recommendations. In 12th International Conference on Machine Learning and Applications, ICMLA 2013, 133–139.
 [Huang, Zeng, and Chen2005] Huang, Z.; Zeng, D. D.; and Chen, H. 2005. A unified recommendation framework based on probabilistic relational models. Available at SSRN 906513.
 [Javed et al.2015] Javed, F.; Luo, Q.; McNair, M.; Jacob, F.; Zhao, M.; and Kang, T. S. 2015. Carotene: A job title classification system for the online recruitment domain. In IEEE First International Conference on Big Data Computing Service and Applications, 286–293.
 [Karwath, Kersting, and Landwehr2008] Karwath, A.; Kersting, K.; and Landwehr, N. 2008. Boosting relational sequence alignments. In ICDM.
 [Kok et al.2009] Kok, S.; Sumner, M.; Richardson, M.; Singla, P.; Poon, H.; Lowd, D.; Wang, J.; and Domingos, P. 2009. The alchemy system for statistical relational AI. Technical report, Department of Computer Science and Engineering, University of Washington, Seattle, WA.
 [Natarajan et al.2011] Natarajan, S.; Joshi, S.; Tadepalli, P.; Kristian, K.; and Shavlik, J. 2011. Imitation learning in relational domains: A functionalgradient boosting approach. In IJCAI.
 [Natarajan et al.2012] Natarajan, S.; Khot, T.; Kersting, K.; Gutmann, B.; and Shavlik, J. 2012. Gradientbased boosting for statistical relational learning: The relational dependency network case. Mach. Learn. 86(1):25–56.
 [Neville and Jensen2007] Neville, J., and Jensen, D. 2007. Relational Dependency Networks. J. Mach. Learn. Res. 8:653–692.
 [Newton and Greiner2004] Newton, J., and Greiner, R. 2004. Hierarchical probabilistic relational models for collaborative filtering. In Workshop on Statistical Relational Learning, 21st International Conference on Machine Learning.
 [Pazzani and Billsus 1997] Pazzani, M., and Billsus, D. 1997. Learning and revising user profiles: The identification of interesting web sites. Mach. Learn. 27(3):313–331.
 [Rao et al.2015] Rao, N.; Yu, H.F.; Ravikumar, P. K.; and Dhillon, I. S. 2015. Collaborative filtering with graph information: Consistency and scalable methods. In Advances in Neural Information Processing Systems 28. Curran Associates, Inc. 2107–2115.
 [Resnick et al. 1994] Resnick, P.; Iacovou, N.; Suchak, M.; Bergstrom, P.; and Riedl, J. 1994. GroupLens: An open architecture for collaborative filtering of netnews. 175–186. ACM Press.
 [Rocchio 1971] Rocchio, J. 1971. Relevance feedback in information retrieval. In The SMART Retrieval System: Experiments in Automatic Document Processing. Prentice-Hall Inc. chapter 14, 313–323.

 [Sahoo, Singh, and Mukhopadhyay 2010] Sahoo, N.; Singh, P. V.; and Mukhopadhyay, T. 2010. A hidden Markov model for collaborative filtering. Management Information Systems Quarterly, Forthcoming.
 [Salakhutdinov, Mnih, and Hinton 2007] Salakhutdinov, R.; Mnih, A.; and Hinton, G. 2007. Restricted Boltzmann machines for collaborative filtering. In Proceedings of the 24th International Conference on Machine Learning, 791–798. ACM.
 [Salton 1989] Salton, G. 1989. Automatic Text Processing: The Transformation, Analysis, and Retrieval of Information by Computer. Boston, MA, USA: Addison-Wesley Longman Publishing Co., Inc.
 [Si and Jin2003] Si, L., and Jin, R. 2003. Flexible mixture model for collaborative filtering. In ICML, 704–711. AAAI Press.
 [Su and Khoshgoftaar2009] Su, X., and Khoshgoftaar, T. M. 2009. A survey of collaborative filtering techniques. Adv. in Artif. Intell. 2009:4:2–4:2.

 [Sutton et al. 2000] Sutton, R.; McAllester, D.; Singh, S.; and Mansour, Y. 2000. Policy gradient methods for reinforcement learning with function approximation. In NIPS.
 [Yang et al. 2014] Yang, S.; Khot, T.; Kersting, K.; Kunapuli, G.; Hauser, K.; and Natarajan, S. 2014. Learning from imbalanced data in relational domains: A soft margin approach. In 2014 IEEE International Conference on Data Mining, ICDM 2014, 1085–1090.