Today, the Internet is one of the most widely available media worldwide. It has essentially become a huge store of data with the potential to serve many information-centric applications in our lives. Recommendation systems are an essential part of many internet services and online applications, including social networking and the recommendation of products (films, music, articles, etc.). Recommendation techniques have been used by well-known companies such as Amazon, Netflix and eBay to recommend related items or products by estimating the probable preferences of customers. These techniques are beneficial to both the service provider and the user. According to previous work, the two popular approaches for building recommendation systems are content-based (CB) filtering and collaborative filtering (CF).
Content-based (CB) recommendation is widely adopted in recommendation system models. It uses the properties of items to create features and characteristics that are matched against user profiles, relying on the properties of the items each user likes to discover what else the user may like. One major issue of the CB filtering method is that the recommendation system must learn user preferences for some kinds of items and then apply them to other kinds of items.
Nevertheless, CF has two widely known problems: sparsity and cold start (CS). In the rating matrix, the percentage of elements that hold values is small, and even popular items may have only a few ratings. For instance, the large Netflix rating dataset provided for the Netflix Prize competition contains about 100 million ratings of about 18,000 movies given by 480,000 users, so only about 1% of the rating matrix elements have received ratings. With a sparse rating matrix, it is very challenging to make effective recommendations that depend on estimating the relationship between items and users. The CS problem is another widely known issue for the CF approach, and it can occur for new users or new items. To achieve an effective recommendation, the CF approach requires either ratings on an item or a large number of ratings from a user.
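The sparsity figure above follows directly from the quoted counts. The short sketch below reproduces the arithmetic; the function name `density` is only illustrative, and the counts are the rounded values given in the text.

```python
# Rounded figures quoted in the text for the Netflix Prize dataset.
num_ratings = 100_000_000
num_movies = 18_000
num_users = 480_000


def density(num_ratings, num_users, num_items):
    """Fraction of user-item cells in the rating matrix that hold a rating."""
    return num_ratings / (num_users * num_items)


netflix_density = density(num_ratings, num_users, num_movies)
print(f"filled cells: {netflix_density:.2%}")  # roughly 1% of the matrix
```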
Recently, researchers have proposed various methods based on probabilistic topic modeling Blei et al. (2003). Latent Dirichlet allocation (LDA) is a generative probabilistic model broadly used in the information retrieval field. Researchers have used topic modeling methods based on LDA for building recommendation systems in various subjects, including app recommendation Zhang et al. (2015); Cao et al. (2017); Chua et al. (2017); He et al. (2017); Zhu et al. (2017), event recommendation Fang et al. (2016); Cheng et al. (2016); Magnuson et al. (2015); Khrouf et al. (2013); Zhang et al. (2013); Minkov et al. (2010); Zhang et al. (2013); Hsieh et al. (2016); Zhengxing et al. (2014); Li et al. (2017), hashtag recommendation Reddy et al. (2017); Gong et al. (2017); Tang et al. (2013); Xu et al. (2011); Lu et al. (2015); Wang et al. (2015); Krestel et al. (2009); Shi et al. (2016); Wu et al. (2015); Prokofyev et al. (2012); Jin et al. (2010); Wang et al. (2014); Ma et al. (2014); Lu et al. (2011); Zhao et al. (2016); Zhang et al. (2013); She et al. (2014); Tomar et al. (2014); Jianjun et al. (2015); Li et al. (2016); Jiang et al. (2017), and social networks and media Liu et al. (2013); He et al. (2016); Fang et al. (2016); Khrouf et al. (2013); Bobadilla et al. (2013); Silva et al. (2013); Huang et al. (2017). In this paper, we present a taxonomy of recent recommendation system applications based on topic modeling (LDA) and evaluate ISWC and WWW publications in computer science from 2013 to 2017, taken from the DBLP dataset.
2 Natural language Processing and LDA topic model
Topic models are a powerful and practical tool for analyzing large collections of text documents in natural language processing. Topic models can automatically cluster words into topics and discover relationships between the documents of a dataset. For example, we can assume a three-topic model of a news dataset, with the topics “sport”, “politics” and “money”. The most common words in the sport topic (Topic 1) might be “gym”, “football”, and “tennis”; those in the politics topic (Topic 2) might be “senator”, “president”, and “election”; and the money topic (Topic 3) might be made up of words such as “dollar”, “currency”, and “euro”. Figure 1 shows a simple example of topic discovery from a group of words. LDA is a popular technique for semantic analysis in topic modeling and text mining. LDA can be applied to a variety of textual information to evaluate topic trends over time and to analyze large numbers of documents.
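In more detail, LDA defines a corpus as $D = \{\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_M\}$, where $M$ is the number of text documents in the corpus. A document is a sequence of $N$ words denoted by $\mathbf{w} = (w_1, w_2, \ldots, w_N)$, where $w_n$ is the $n$-th word in the sequence of the text document. In addition, $z$ is a latent variable representing the hidden topic associated with each observed word. The generative procedure for LDA is formally defined as:
For each topic index $k \in \{1, \ldots, K\}$:
i. Select a word distribution $\phi_k \sim \mathrm{Dir}(\beta)$
For each text document $d \in \{1, \ldots, M\}$:
i. Select a topic distribution $\theta_d \sim \mathrm{Dir}(\alpha)$
ii. For each of the $N$ words in the document:
a. Select a topic assignment $z_{d,n} \sim \mathrm{Mult}(\theta_d)$
b. Select a word $w_{d,n} \sim \mathrm{Mult}(\phi_{z_{d,n}})$
Here $\mathrm{Mult}(\cdot)$ is a multinomial distribution and $\mathrm{Dir}(\cdot)$ is a Dirichlet distribution, which is a prior distribution of $\mathrm{Mult}(\cdot)$; $\alpha$ and $\beta$ are the hyperparameters of the Dirichlet priors.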
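The generative procedure can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: symmetric Dirichlet priors, a fixed document length, and illustrative values for `alpha` and `beta`; the function name `generate_corpus` is hypothetical and does not come from any library.

```python
import numpy as np


def generate_corpus(num_docs, doc_length, num_topics, vocab_size,
                    alpha=0.5, beta=0.1, seed=0):
    """Sample a toy corpus by following the LDA generative procedure."""
    rng = np.random.default_rng(seed)
    # One word distribution per topic, drawn from a symmetric Dirichlet prior.
    phi = rng.dirichlet(np.full(vocab_size, beta), size=num_topics)
    # One topic distribution per document, drawn from a symmetric Dirichlet prior.
    theta = rng.dirichlet(np.full(num_topics, alpha), size=num_docs)
    docs, assignments = [], []
    for d in range(num_docs):
        # Topic assignment for every word position in document d.
        z = rng.choice(num_topics, size=doc_length, p=theta[d])
        # Each word is drawn from the word distribution of its assigned topic.
        w = np.array([rng.choice(vocab_size, p=phi[k]) for k in z])
        docs.append(w)
        assignments.append(z)
    return docs, assignments, theta, phi


docs, assignments, theta, phi = generate_corpus(
    num_docs=5, doc_length=20, num_topics=3, vocab_size=30)
print(docs[0])         # word ids of the first synthetic document
print(assignments[0])  # the hidden topic behind each of those words
```

Inference runs this process in reverse: only `docs` is observed, and `theta`, `phi` and the assignments have to be estimated.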
2.1 Gibbs sampling and Learning LDA
As previously mentioned, topic modeling finds a distribution over words for each topic and the relationship of the topics to each document. Several methods are available for approximate inference and learning in the LDA topic model, such as Gibbs sampling, collapsed variational Bayes, and expectation maximization. Gibbs sampling is a popular technique because of its simplicity and low latency. However, for large numbers of topics, Gibbs sampling can become unwieldy. In this paper, we use Gibbs sampling in our experiment in Section 4.
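To make the sampling step concrete, the following is a minimal sketch of a collapsed Gibbs sampler for LDA. It assumes symmetric hyperparameters `alpha` and `beta` with illustrative values, documents given as lists of integer word ids, and a hypothetical function name `lda_gibbs`; it is not the MALLET implementation used in the experiment.

```python
import numpy as np


def lda_gibbs(docs, num_topics, vocab_size, alpha=0.1, beta=0.01,
              num_iters=200, seed=0):
    """Collapsed Gibbs sampling for LDA on documents given as lists of word ids."""
    rng = np.random.default_rng(seed)
    n_dk = np.zeros((len(docs), num_topics))   # topic counts per document
    n_kw = np.zeros((num_topics, vocab_size))  # word counts per topic
    n_k = np.zeros(num_topics)                 # total word count per topic
    z = []
    # Random initial topic assignment for every word.
    for d, doc in enumerate(docs):
        z_d = rng.integers(num_topics, size=len(doc))
        z.append(z_d)
        for w, k in zip(doc, z_d):
            n_dk[d, k] += 1
            n_kw[k, w] += 1
            n_k[k] += 1
    for _ in range(num_iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the current assignment of this word from the counts.
                k = z[d][i]
                n_dk[d, k] -= 1
                n_kw[k, w] -= 1
                n_k[k] -= 1
                # Conditional distribution of the topic given all other assignments.
                p = (n_dk[d] + alpha) * (n_kw[:, w] + beta) / (n_k + vocab_size * beta)
                k = rng.choice(num_topics, p=p / p.sum())
                # Record the newly sampled assignment.
                z[d][i] = k
                n_dk[d, k] += 1
                n_kw[k, w] += 1
                n_k[k] += 1
    # Point estimates of the document-topic and topic-word distributions.
    theta = (n_dk + alpha) / (n_dk.sum(axis=1, keepdims=True) + num_topics * alpha)
    phi = (n_kw + beta) / (n_k[:, None] + vocab_size * beta)
    return theta, phi, z


toy_docs = [[0, 1, 0, 1, 1], [2, 3, 2, 3, 3], [0, 1, 2, 3, 0]]
theta, phi, z = lda_gibbs(toy_docs, num_topics=2, vocab_size=4, num_iters=50)
print(theta.round(2))
```

Each sweep touches every word once per iteration, and each update costs time proportional to the number of topics, which is one reason the sampler becomes slow when many topics are used.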
3 Topic model based on recommendation systems: Recent research
In this section, we consider six kinds of recommendation systems based on LDA: scientific paper recommendation Amami et al. (2017); Dai et al. (2017); Kim et al. (2013); Li et al. (2013); Sugiyama et al. (2013); Wang et al. (2011, 2013); Younus et al. (2014), music and video recommendation Hariri et al. (2012); Cheng et al. (2016); Yan et al. (2016); Zhang et al. (2012); Dias et al. (2013); Zheleva et al. (2010); Hu et al. (2014); Basu et al. (2016); Lee et al. (2017); Tan et al. (2016); Hariri et al. (2012), location recommendation Koren et al. (2009); Liu et al. (2013, 2013b); Ho et al. (2012); Kurashima et al. (2013); Xiong et al. (2017); Wang et al. (2017), travel and tour recommendation Quyang et al. (2015); He et al. (2016); Kavitha et al. (2017); Sun et al. (2016), app recommendation Zhang et al. (2015); Cao et al. (2017); Chua et al. (2017); He et al. (2017); Zhu et al. (2017), and friend recommendation Huang et al. (2017); Wang et al. (2016); Zhu et al. (2015), as shown in Figure 2.
3.1 Topic model based on scientific paper recommendation
In recent years a considerable amount of research has addressed the task of defining models and systems for scientific paper recommendation; this trend has emerged as a natural consequence of the rapid growth in the number of scientific publications. For example, Younus et al. proposed an approach that recommends scientific articles matching a user’s interests, based on a topic modeling framework. The authors used an LDA model to extract the topics of the followees’ tweets (followed Twitterers) and of the paper titles Younus et al. (2014). They applied the Twitter-LDA algorithm simultaneously to the followees’ tweets and the paper titles with the number of topics set to 200, and used the intersection of the topics found in both the paper titles and the followees’ tweets. Each followee of a user is assigned a ranking score that is computed from four quantities: the set of all tweets by the followee, the set of topics defining the titles of scientific articles, the set of topics defining the tweets of the followee, and the number of times a particular topic t occurs among the tweets of the followee. Based on the ranking scores of all followees of a particular user, the authors obtained the top-k researchers followed by a target user. For evaluation, they used the DBLP database as a large academic bibliographic network.
| Study | Year | Purpose | Method | Dataset |
|---|---|---|---|---|
| Younus et al. (2014) | 2014 | Present an approach to utilize this valuable information source to suggest | LDA, Twitter-LDA Zhu et al. (2015) | DBLP dataset |
| Amami et al. (2017) | 2017 | A scientific paper recommendation approach | LDA, Gibbs sampling | ArnetMiner Dataset |
| Dai et al. (2017) | 2017 | A citation recommendation; present a topic model combining with author link | LDA, Maximum A Posteriori (MAP) | ACL Anthology Network, DBLP |
| Kim et al. (2013) | 2013 | A personalized recommendation system for Digg articles | LDA, EM algorithm | Digg articles (digg.com) |
| Wang et al. (2011) | 2011 | A scientific articles recommendation to users based on both content and other users’ ratings | LDA, EM algorithm | CiteULike Dataset |
Other researchers introduced a model that combines traditional collaborative filtering with topic modeling and designed a novel algorithm, called the CTR model, for recommending scientific articles to the users of an online community. They used LDA to initialize the CTR model; in fact, they combined matrix factorization with the LDA model and showed that their approach performs better than recommendations based on matrix factorization alone. For evaluation and testing, they used a large dataset from a bibliography sharing service (CiteULike) Wang et al. (2011). Table 1 shows some notable work based on LDA for paper recommendation.
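The way CTR joins the two components can be illustrated with its prediction step. In the formulation of Wang and Blei, an item's latent vector is its LDA topic proportions plus an offset learned from ratings, and the predicted rating is the inner product with the user's latent vector. The sketch below covers only that step, not the training procedure; the function name and the example vectors are hypothetical.

```python
import numpy as np


def ctr_predict(user_vec, topic_props, offset=None):
    """Predicted rating: inner product of the user vector and the item vector.

    The item vector is the article's topic proportions plus an optional
    rating-driven offset. An article without ratings has no offset, so its
    prediction comes from content alone.
    """
    item_vec = topic_props if offset is None else topic_props + offset
    return float(user_vec @ item_vec)


user_vec = np.array([1.0, 0.0, 2.0])      # hypothetical user latent vector
topic_props = np.array([0.2, 0.5, 0.3])   # hypothetical topic proportions
offset = np.array([0.1, 0.0, -0.1])       # hypothetical learned offset

print(ctr_predict(user_vec, topic_props))          # new article, content only
print(ctr_predict(user_vec, topic_props, offset))  # article with rating history
```

The content-only case is what lets this kind of model score articles that nobody has rated yet, which plain matrix factorization cannot do.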
3.2 Topic model based on music and video recommendation
Video and music recommendation has become an essential way of helping people explore the video world and discover content that may interest them. Analyzing user interests and producing good video or music recommendations in online communities remains a big challenge. Hariri et al. proposed a combined approach, based on content and collaborative filtering methods, that generates music recommendations from the sequence of songs a user has listened to. They applied an LDA model to reduce the dimensionality of the features and to obtain the hidden relationships between songs and tags. To evaluate their approach, they collected 218,261 distinct songs from the “Art of the Mix” website Hariri et al. (2012). Yan et al. focused on the usefulness of users’ information content on online social networks and provided a personalized video recommendation solution that considers users’ cross-network social and content data. They applied a topic model based on LDA in which each user is treated as a document and the user’s hashtags as words, using the user’s information from Twitter Yan et al. (2016). They derived Twitter user topic distributions, observed user-video interactions on YouTube, and presented a solution for user preference transfer.
This solution is defined over the Twitter user tweet topic distribution matrix, the Twitter user social topic distribution matrix, and the collection of all observed user-video pairs obtained from the overlapped users’ Twitter and YouTube behaviors. A trade-off parameter balances the contribution of the different types of user behavior on Twitter, and the formulation also involves the rows, columns and entries of a matrix L. The transfer matrices are updated iteratively, with a step size given by the learning rate, until convergence or until a maximum number of iterations is reached.
With the derived transfer matrices and the video latent factor representations V, the preferences of a test user on YouTube videos can be estimated from his or her tweeting activity, friend collection, and the corresponding Twitter topical distributions.
For testing, they applied their approach to a YouTube-Twitter dataset that includes 9,253,729 tweeting behaviors and 1,097,982 video-related behaviors, and showed that combining auxiliary network information with cross-network collaboration can generate novel recommendations and increase user satisfaction.
In addition, some researchers proposed an approach based on the collaborative filtering (CF) algorithm that exploits session variety and temporal context. They applied an LDA model to extract the temporal properties of sessions, treating sessions as documents and songs as words. To evaluate this approach, they used a Last.fm dataset (log) that includes 19,150,868 entries from 992 users. The results showed that using temporal information can increase the accuracy of music recommendations Dias et al. (2013).
Other researchers used a dynamic framework based on four aspects of a user’s preferences (the collaborative aspect, the content aspect, the popularity aspect, and the randomization aspect) for movie recommendation. The authors applied a linear combination model to generate the final recommendation list Cheng et al. (2016). Hu et al. proposed a novel topic model for audio retrieval, called GaussianLDA. This approach assumes that each audio document includes various latent topics and that each topic is a Gaussian distribution. They prepared 1214 audio documents (with lengths between 0.82 s and 1 min), each of which belongs to a category such as bell, river, rain, laugh, dog, gun and so on. Their results showed that the GaussianLDA model significantly outperforms the standard LDA topic model Hu et al. (2014). Table 2 shows some notable work based on LDA for music and video recommendation.
| Study | Year | Purpose | Method | Dataset |
|---|---|---|---|---|
| Hariri et al. (2012) | 2012 | A music recommendation, tracks and detects changes in users’ preferences | LDA, Collaborative filtering | Art of the Mix [www.artofthemix.org] |
| Cheng et al. (2016) | 2016 | A venue-aware music recommender to identify suitable songs | LDA, SVM | A music dataset: Concept-Labeled Music (TC1) and Large Music (TC2) |
| Yan et al. (2016) | 2016 | A video recommendation, obtain users’ rich social | LDA | A Google+ dataset (137,317 Google+ users) |
| Dias et al. (2013) | 2013 | A music recommendation, analysis of user listening | LDA, Collaborative filtering | Last.fm |
| Zheleva et al. (2010) | 2010 | A music recommendation, characterizing user preferences in social media content | LDA, Message-passing | A Zune Social music community |
| Hu et al. (2014) | 2014 | An audio retrieval, audio analysis of | LDA, Gaussian-LDA | An audio dataset in various categories |
3.3 Topic model based on location recommendation
Location-based recommendation systems suggest a set of places that users may be interested in, based on an analysis of their history and behavior. LDA can also be used for location recommendation. Kurashima et al. proposed a novel topic model for recommending new locations to visit, called the Geo Topic Model. This model can predict a user’s interests and the user’s spatial area based on the features of visited locations. They used Tabelog-based (tabelog.com) and Flickr-based real location log data to evaluate their approach. They found that this model can discover latent topics related to art, great views, nature, atmosphere, construction and others from the logs of visited places Ho et al. (2012). Table 3 shows some notable work based on LDA for location recommendation.
| Study | Year | Purpose | Method | Dataset |
|---|---|---|---|---|
| Ho et al. (2012) | 2013 | A location recommendation, analysis of the user’s interest and the user’s spatial area | LDA, EM algorithm | A Tabelog and a Flickr dataset [tabelog.com, Flickr.com] |
| Liu et al. (2013) | 2013 | Present a topic and location aware Point-of-Interest | LDA, Gibbs sampling | A large real-world LBSN, Foursquare Dataset |
| Xiong et al. (2017) | 2017 | A spatial item recommendation, study on patterns of | LDA, Gibbs sampling, a gradient descent learning | Foursquare Dataset, a Twitter dataset |
Liu et al. investigated the POI recommendation problem in location-based social networks (LBSNs) by mining textual information, and proposed a ‘Topic and Location-aware’ probabilistic matrix factorization (TL-PMF) method for Point-of-Interest recommendation that derives personalized recommendations from favorite places Liu et al. (2013). The distribution over the observed ratings as well as the textual information is:

$$p(R \mid U, C, \sigma^2) = \prod_{i} \prod_{j} \left[ \mathcal{N}\left(r_{ij} \mid f(U_i, C_j), \sigma^2\right) \right]^{I_{ij}}$$

where $r_{ij}$ is the rating of user $u_i$ for POI $c_j$, $U_i$ and $C_j$ are the user and POI latent feature space vectors respectively, $\mathcal{N}(\cdot \mid \mu, \sigma^2)$ is a Gaussian distribution with mean $\mu$ and variance $\sigma^2$, $I_{ij}$ is the indicator function, and the function $f$ approximates the rating of user $u_i$ for POI $c_j$.
The experiments were conducted on a large real-world LBSN dataset, and the authors analyzed the topic characteristics of POIs across various geographical areas.
3.4 Topic model based on friend recommendation
Friend recommendation is a popular way of helping users make new friends and discover interesting information. In online social networks, friend recommendation is a relatively challenging problem compared with group or item recommendation Wang et al. (2016); Huang et al. (2017); Zhu et al. (2015). To address this challenge, a recent work proposed a friend recommendation method based on LDA that consists of two stages: first, tag-user information is used to produce a list of possible friends; then, a topic model is built to capture the relationship between a user’s friend-making behaviour and image features. The authors ran experiments on Flickr as a standard dataset and showed that their method recommends friends more quickly than traditional methods.
3.5 Topic model Based on travel and tour recommendation
Recommendation systems can contribute significantly to building smart travel recommendation services. Many different techniques have recently been developed to support travel recommendation based on different kinds of data. For example, in Quyang et al. (2015), the authors proposed a novel generative probabilistic model named socoLDA, with heterogeneous social influence, to better capture users’ travel interests. They introduced a travel-package recommendation framework named socoTraveler, which applies socoLDA to represent a user’s travel interests in topic space and finds similar users in order to produce recommendations with user-based collaborative filtering. In addition, in Kavitha et al. (2017), the authors proposed a framework that suggests the top-k tours with the highest scores for a user by using the photos shared by other users in an online social network. Table 4 shows some notable work based on LDA for travel and tour recommendation.
| Study | Year | Purpose | Method | Dataset |
|---|---|---|---|---|
| Quyang et al. (2015) | 2016 | A travel-package recommendation considering social influence | LDA, Gibbs sampling | A real travel dataset (from a China tourism company) |
| He et al. (2016) | 2017 | A tourism recommendation on Twitter users | LDA, Gibbs sampling | A Twitter dataset |
| Kavitha et al. (2017) | 2017 | A tour recommendation | LDA | A Flickr dataset |
| Sun et al. (2016) | 2015 | Personalized trip recommendation | LDA, collaborative filtering | Yelp and Foursquare dataset |
| Wang et al. (2017) | 2017 | A social sequential tour based POI (point of interest) | LDA, Gibbs sampling | A Twitter dataset Zheng et al. (2015) |
| Zhao et al. (2011) | 2016 | Trip recommendation based on points of | LDA, Depth-first search (DFS) | Yelp and Foursquare dataset |
3.6 Topic model based on app recommendation
Currently, a wide range of recommendation approaches have been proposed and applied to recommend mobile apps. For example, Chua et al. (2017) proposed a novel probabilistic model, named the Goal-oriented Exploratory Model (GEM), that combines the identification of exploratory behavior and mobile app recommendation in a unified framework. The authors employed the idea of LDA to design a topic model that clusters items into goals and identifies the personal distribution of goals for each user, and developed an effective and efficient algorithm that integrates the Expectation-Maximization (EM) algorithm with collapsed Gibbs sampling for model learning. They collected a mobile app dataset from Qihoo 360 Mobile Assistant, an open mobile app platform in China for Android users. Lin et al. investigated the cold-start issue by using social information from Twitter for app recommendation, and used an LDA model to discover latent groups from “Twitter personalities” in order to generate recommendations. This approach is based on a simple “averaging” method in which the probability that the target user will like the app is the expectation of how the Twitter-followers like the app. Given the set of Twitter-followers $T(a)$ of an app $a$, the probability that user $u$ likes app $a$ is defined as follows:

$$p(+ \mid a, u) = \sum_{t \in T(a)} p(+ \mid t, u)\, p(t \mid a) \tag{6}$$

where $T(a)$ is the set of possible Twitter-followers following app $a$, and where it is assumed that:
(i) Twitter-followers are examined one at a time to make a decision about whether an app is liked or disliked.
(ii) when the Twitter-follower is known, the judgement does not depend on the app any more, i.e., $p(+ \mid t, a, u) = p(+ \mid t, u)$.
(iii) given a user and an app, there is no judgement involved.
(iv) the fact that an app has a given Twitter-follower is independent of the user, i.e., $p(t \mid a, u) = p(t \mid a)$.
Equation (6) is then reduced to the estimation of two quantities:
1. The probability that user $u$ likes app $a$ given that app $a$ has Twitter-follower $t$, i.e., $p(+ \mid t, u)$, and
2. The probability of considering Twitter-follower $t$ given app $a$, i.e., $p(t \mid a)$.
The first quantity, $p(+ \mid t, u)$, is straightforward to estimate as it can be rewritten as:

$$p(+ \mid t, u) = \frac{p(+, t \mid u)}{p(+, t \mid u) + p(-, t \mid u)}$$

where $p(+, t \mid u)$ and $p(-, t \mid u)$ are derived from LDA and give the probability that Twitter-follower $t$ occurs in an app that is liked (or disliked) by user $u$.
For testing, they used Apple’s iTunes App Store and Twitter as datasets. The experimental results show that their approach performs significantly better than other state-of-the-art recommendation techniques Falher et al. (2015).
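The “averaging” idea described above can be sketched as follows: the probability that a user likes an app is the expectation, over the app's Twitter-followers, of how likely the user is to like an app that has that follower. The sketch assumes every follower of the app is weighted equally unless other weights are supplied; the function names and the numbers are hypothetical, and in the original approach the per-follower probabilities are derived from LDA rather than given by hand.

```python
def like_probability(followers, p_like_given_follower, p_follower_given_app=None):
    """Probability that a user likes an app, averaged over the app's followers."""
    if not followers:
        raise ValueError("the app has no known Twitter-followers")
    if p_follower_given_app is None:
        # Assumption: every follower of the app is equally likely to be considered.
        weight = 1.0 / len(followers)
        p_follower_given_app = {t: weight for t in followers}
    return sum(p_like_given_follower[t] * p_follower_given_app[t]
               for t in followers)


def like_given_follower(p_liked_with_follower, p_disliked_with_follower):
    """Normalize the liked and disliked probabilities of one follower."""
    return p_liked_with_follower / (p_liked_with_follower + p_disliked_with_follower)


# Hypothetical followers of one app and the user's per-follower like probabilities.
followers = ["t1", "t2", "t3"]
p_like = {"t1": 0.9, "t2": 0.6, "t3": 0.3}
print(like_probability(followers, p_like))
print(like_given_follower(0.03, 0.01))
```

Because the estimate depends only on who follows the app on Twitter, it can be computed for an app that has no ratings at all, which is what makes the method usable in the cold-start setting.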
In He et al. (2017), the authors proposed an allocation-based probabilistic mechanism that considers multiple user-app factors to help users with app recommendations. This framework can capture geographical influences on usage behaviors and effectively model user mobility patterns, which in turn affect app usage patterns. They used Gibbs sampling to approximately estimate and infer the parameters of LDA. The authors measure the similarity of two location blocks by the similarity of their app usage patterns, which is calculated with Pearson’s correlation similarity Zhang et al. (2016). An indicator is defined to be 1 if a user has launched a given app at a given location block, and 0 otherwise; the number of users who have launched the app at that location block is therefore the sum of this indicator over all users. The correlation is computed relative to the average mobile app influence at each location block.
To determine whether a location block belongs to a geographical region, they define an app usage pattern coefficient for each geographical block. The initial collaborative filter coefficient is the average mobile app influence. They then calculate the collaborative filter coefficient of the location block with respect to the region, a quantity that also depends on the number of location blocks in the geographical region. Whether the location block is included in the region is decided by comparing the value of this coefficient with a predefined threshold value.
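As an illustration of the similarity measure, the sketch below builds per-block app usage counts from a binary launch indicator and compares two location blocks with Pearson's correlation. The array layout (users × apps × blocks), the function name and the toy data are assumptions made for this example, as is the choice to return 0 when a block has constant usage.

```python
import numpy as np


def block_similarity(usage_a, usage_b):
    """Pearson correlation between the app usage patterns of two location blocks."""
    a = np.asarray(usage_a, dtype=float)
    b = np.asarray(usage_b, dtype=float)
    a_centered = a - a.mean()   # deviation from the block's average usage
    b_centered = b - b.mean()
    denom = np.sqrt((a_centered ** 2).sum() * (b_centered ** 2).sum())
    if denom == 0:
        # Assumption: a block with constant usage carries no pattern to correlate.
        return 0.0
    return float((a_centered * b_centered).sum() / denom)


# launched[u, a, b] = 1 if user u has launched app a at location block b, else 0.
launched = np.zeros((4, 3, 2), dtype=int)
launched[0, 0, 0] = launched[1, 0, 0] = launched[2, 1, 0] = 1
launched[0, 0, 1] = launched[3, 0, 1] = launched[1, 1, 1] = 1

# Number of users who have launched each app at each block (sum over users).
users_per_app = launched.sum(axis=0)          # shape: (apps, blocks)
print(users_per_app)
print(block_similarity(users_per_app[:, 0], users_per_app[:, 1]))
```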
| Study | Year | Purpose | Method | Dataset |
|---|---|---|---|---|
| Zhang et al. (2015) | 2017 | A mobile app recommendation | LDA, stochastic gradient | A benchmark dataset based upon Apple’s iTunes App Store |
| Cao et al. (2017) | 2017 | A mobile app recommendation considering users and apps data on multiple | LDA, stochastic gradient descent (SGD), Collaborative Topic modeling (C. Wang & Blei, 2011) | iphone-iPad, iphone-iPad-iMac Dataset (manually obtained) |
| Chua et al. (2017) | 2017 | A mobile app recommendation with exploratory behavior from big data | LDA, EM Collapsed Gibbs sampling algorithm | A mobile app dataset (Qihoo 360 Mobile Assistant in China) |
| He et al. (2017) | 2017 | A mobile application recommendation considering geographical location | LDA, Gibbs Sampling | A mobile app dataset by a Chinese app |
| Zhu et al. (2017) | 2016 | A semantic recommendation to evaluate mobile applications with behavior analysis | LDA, Gibbs Sampling | An app dataset of iOS mobile apps |
| Falher et al. (2015) | 2013 | Addressing the cold-start problem, an app recommendation considering Twitter-followers | LDA, Gibbs Sampling, collaborative filtering | A dataset from Apple’s iTunes App Store and Twitter |
4 Experiment: Semantic Discovery and researchers behavior analysis
We extracted the publications of the ISWC and WWW conferences from the DBLP website, considering only the years for which data was available, 2013-2017. It should be noted that in these experiments we considered the abstract and title of each article. We used MALLET (http://mallet.cs.umass.edu/) to run the inference and obtain the topic models. In addition, our full dataset is available at https://github.com/JeloH/Dataset_DBLP. The most important goals of this experiment are to discover the trends of the topics, to find the relationship between LDA topics and paper features, and to generate trusted tags.
4.2 Parameter Settings
In this paper, all experiments were carried out on a machine running Windows 7 with a Core i3 processor and 4 GB of memory. We learn an LDA model with 100 topics, using Gibbs sampling for parameter estimation. The words related to a topic are quite intuitive and comprehensive, in the sense that they supply a short semantic summary of a specific research field.
4.3 Semantic analysis and generate tags
In this section, we present the results and some of the topics discovered by the 100-topic models for the ISWC and WWW conferences.
| Semantic Web | Ontology modularity | Web content analysis | |
| Topic 14 | Topic 28 | Topic 87 | Topic 61 |
| Mapping, Query language | Cloud computing, Data stream | online social networks | |
| Topic 57 | Topic 4 | Topic 18 | Topic 7 |
| Question Answering System | Concept representation | Recommendation systems | |
| Topic 59 | Topic 25 | Topic 19 | Topic 29 |
| documents analysis | Linked Data, semantically query | Social network and security | |
| Topic 20 | Topic 58 | Topic 92 | Topic 69 |
| semantic network and natural language | RDFa for Educational websites | Mobile Networks | |
| Topic 52 | Topic 66 | Topic 14 | Topic 24 |
According to Table 6, the following observations can be made:
For the ISWC conference, Topic 25 sounds considerably more generic; it is consistent with ‘Concept Representation’ in general and is marked by the words representation, structure, investigate, detection, values, and concept. Also, among the 20 words generated for Topic 20, some are closely related in meaning, such as documents, scientific, networks, web, health, and kg, and we found that this topic covers papers that propose models for ‘Documents Analysis in health research’.
For the WWW conference, Topic 7 raises a question: does the word ‘ad’ stand for ‘Advertisement’ or for ‘Ad Hoc Network’? It could relate to either. To answer this question, note that Topic 7 contains the words social, modeling, ranking, question, ad, and browser. If we consider only the words browser, news, and social, we can predict that ‘ad’ relates to ‘Advertisement’; the topic as a whole covers papers that propose methods for ‘Question Answering and Social Media’.
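The word lists discussed above are the highest-probability words of each topic. MALLET reports these lists itself; the sketch below only illustrates how such a list is read off a topic-word matrix, with a hypothetical function name and a made-up four-word vocabulary.

```python
import numpy as np


def top_words(phi, vocab, k=20):
    """Return the k highest-probability words of every topic, best first."""
    # Sort word ids of each topic by descending probability and keep the first k.
    order = np.argsort(phi, axis=1)[:, ::-1][:, :k]
    return [[vocab[i] for i in row] for row in order]


# Toy topic-word matrix: 2 topics over a 4-word vocabulary (values are made up).
vocab = ["ad", "browser", "ontology", "query"]
phi = np.array([[0.50, 0.30, 0.10, 0.10],
                [0.05, 0.05, 0.50, 0.40]])

for topic_id, words in enumerate(top_words(phi, vocab, k=2)):
    print(topic_id, words)
```

Reading a topic label such as ‘Advertisement’ off the list remains a manual step: the model only ranks words, and an ambiguous word like ‘ad’ has to be interpreted from the other high-ranking words of the same topic.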
5 Discussion, Open Issues and Future Directions
In this study, we focused on LDA-based approaches to recommendation systems. Given the importance of this line of research, we studied recent notable articles on the subject and presented a taxonomy of recommendation systems based on LDA. We evaluated ISWC and WWW conference articles from the DBLP website and used the Gibbs sampling algorithm for parameter estimation. We succeeded in discovering the relationship between LDA topics and paper features and also identified the researchers’ interests in the research field. According to our study, some issues require further research and offer promising directions for future work.
5.1 Topic modeling methods and traditional methods in recommendation systems
There are differences between recommendation systems based on LDA and those based on traditional collaborative filtering (CF). Here we discuss the issues of ‘cold start’, ‘latent user interest’ and sparsity in the field of recommender systems. It should be noted that many models have been proposed to overcome the major weaknesses of CF-based recommendation systems, such as Wang et al. (2011).
Recommendation systems based on LDA for ‘cold start’: the cold-start problem occurs when a new item or user has just entered the system, so that it is difficult to find similar ones because there is not enough information. LDA can be effective in dealing with cold start in recommendation systems. Some LDA-based approaches to this problem are combined with CF methods. For example, Lin et al. investigated the cold-start issue by using social information from Twitter for app recommendation, used an LDA model to discover latent groups from ‘Twitter personalities’ in order to generate recommendations, and showed that their approach overcomes the difficulty of cold-start app recommendation. Some researchers have also investigated the cold-start problem in tag recommendation; for example, Hariri et al. (2012) presented an LDA-based recommendation system for cold start in music recommendation and showed that their approach can handle the case in which a new song has not occurred in the training data. Other researchers have analyzed the cold-start problem in video recommendation; for example, Yan et al. (2016) introduced a unified YouTube video recommendation solution based on cross-network collaboration and LDA to address the typical cold-start and data sparsity problems of recommender systems, and showed that this approach can be effective in terms of precision and in improving the diversity of the recommended videos.
Recommendation systems based on LDA for the ‘sparsity problem’: the data sparsity challenge arises when the ratio of observed ratings to all user-item pairs is too small to supply enough information for effective predictions in CF systems, so that the rating matrix is very sparse. Recommendation systems that combine LDA with ratings data can provide a significant advantage and, in addition, can be useful for exploratory data analysis and dimensionality reduction in large text collections. This dimensionality reduction can also help to alleviate the sparseness problem that is inherent to many traditional collaborative filtering systems. There are LDA-based approaches that deal with this sparsity problem, such as Mishar et al. (2017).
Recommendation systems based on LDA for ‘user latent interests’: a latent interest refers to a long-term interest in a specific topic. A latent interest can be viewed as one specific characteristic of the users, and the users who have this latent interest will prefer the items with the matching characteristic. Finding a community that shares a latent interest with another can help in recommending interesting new communities to a user. However, it is hard for CF systems to identify user latent interests, since the only information available is the user’s interactions with the system. Topic models, in contrast, can be used to model user latent interests, and our experiment showed how these interests can be extracted from the latent Dirichlet allocation (LDA) model with the Gibbs sampling method. In terms of items, a latent interest can likewise be viewed as one specific characteristic of the items. As a simulation tool, a topic model (e.g., LDA) can be used to learn the meaning, significance, characteristics and attributes of items in a data-driven way, i.e., from given rating records, possibly without further content or prior knowledge of these items.
In this paper, we presented a taxonomy of recent recommendation systems and applications based on LDA, including app, travel, friend, location, scientific paper, and music recommendation. Furthermore, we applied the LDA algorithm with Gibbs sampling to the publications of the ISWC and WWW conferences from 2013 to 2017. In general, recommendation systems can be an effective interface between online users and websites in Internet communities. Our study suggests that NLP methods based on LDA can discover hidden aspects that help in understanding people’s behavior and in building smart recommendation systems for online communities.
- Blei et al. (2003) D. M. Blei, A. Y. Ng, M. I. Jordan, Latent Dirichlet allocation, Journal of Machine Learning Research 3 (2003) 993-1022.
- Amami et al. (2017) M. Amami, R. Faiz, F. Stella, G. Pasi, A graph based approach to scientific paper recommendation, in: the International Conference, 2017, pp. 777-782.
- Dai et al. (2017) T. Dai, L. Zhu, X. Cai, S. Pan, S. Yuan, Explore semantic topics and author communities for citation recommendation in bipartite bibliographic network, Journal of Ambient Intelligence and Humanized Computing (9) (2017) 1-19.
- Kim et al. (2013) Y. Kim, Y. Park, K. Shim, Digtobi: a recommendation system for digg articles using probabilistic modeling (2013) 691-702.
- Li et al. (2013) Y. Li, M. Yang, Z. Zhang, Scientific articles recommendation, in: ACM International Conference on Conference on Information and Knowledge Management, 2013, pp. 1147-1156.
- Sugiyama et al. (2013) K. Sugiyama, M. Y. Kan, Exploiting potential citation papers in scholarly paper recommendation, 2013, pp. 153-162.
- Wang et al. (2011) C. Wang, D. M. Blei, Collaborative topic modeling for recommending scientific articles, in: ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2011, pp. 448-456.
- Wang et al. (2013) H. Wang, B. Chen, W. J. Li, Collaborative topic regression with social regularization for tag recommendation, in: International Joint Conference on Artificial Intelligence, 2013, pp. 2719-2725.
- Younus et al. (2014) A. Younus, M. A. Qureshi, P. Manchanda, C. ORiordan, G. Pasi, Utilizing Microblog Data in a Topic Modelling Framework for Scientific Articles Recommendation, Springer International Publishing, 2014.
- Hariri et al. (2012) N. Hariri, B. Mobasher, R. Burke, Using social tags to infer context in hybrid music recommendation, in: Twelfth International Workshop on Web Information and Data Management, 2012, pp. 41-48.
- Cheng et al. (2016) Z. Cheng, J. Shen, On effective location-aware music recommendation, ACM Transactions on Information Systems 34 (2) (2016) 1-32.
- Yan et al. (2016) M. Yan, J. Sang, C. Xu, M. S. Hossain, A united video recommendation by cross-network user modeling, ACM Transactions on Multimedia Computing Communications and Applications 12 (4) (2016) 53.
- Zhang et al. (2012) Y. C. Zhang, D. Quercia, T. Jambor, Auralist: introducing serendipity into music recommendation, in: ACM International Conference on Web Search and Data Mining, 2012, pp. 13-22.
- Dias et al. (2013) R. Dias, M. J. Fonseca, Improving music recommendation in session-based collaborative filtering by using temporal context, in: IEEE International Conference on TOOLS with Artificial Intelligence, 2013, pp. 783-788.
- Zheleva et al. (2010) E. Zheleva, J. Guiver, E. M. Rodrigues, Statistical models of music-listening sessions in social media, in: International Conference on World Wide Web, WWW 2010, Raleigh, North Carolina, USA, April, 2010, pp. 1019-1028.
- Hu et al. (2014) P. Hu, W. Liu, W. Jiang, Z. Yang, Latent topic model for audio retrieval, Pattern Recognition 47 (3) (2014) 1138-1143.
- Basu et al. (2016) S. Basu, Y. Yu, V. K. Singh, R. Zimmermann, Videopedia: Lecture video recommendation for educational blogs using topic modeling, in: Multimedia Modeling Conference, 2016, pp. 1020-1027.
- Lee et al. (2017) W. P. Lee, C. T. Chen, J. Y. Huang, J. Y. Liang, A smartphone-based activity-aware system for music streaming recommendation, Knowledge-Based Systems, 2017.
- Tan et al. (2016) E. Tan, I. Seaman, H. Leung, Y. K. Ng, Making personalized movie recommendations for children, in: International Conference on Information Integration and Web-Based Applications and Services, 2016, pp. 96-105.
- Hariri et al. (2012) N. Hariri, B. Mobasher, R. Burke, Context-aware music recommendation based on latent topic sequential patterns, in: ACM Conference on Recommender Systems, 2012, pp. 131-138.
- Koren et al. (2009) Y. Koren, R. Bell, C. Volinsky, Matrix factorization techniques for recommender systems, Computer 42 (8) (2009) 30-37.
- Liu et al. (2013) B. Liu, Y. Fu, Z. Yao, H. Xiong, Learning geographical preferences for point-of-interest recommendation, in: ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2013, pp. 1043-1051.
- Liu et al. (2013b) B. Liu, H. Xiong, Point-of-Interest Recommendation in Location Based Social Networks with Topic and Location Awareness, 2013.
- Ho et al. (2012) S. S. Ho, M. Lieberman, P. Wang, H. Samet, Mining future spatiotemporal events and their sentiment from online news articles for location-aware recommendation system, in: ACM Sigspatial International Workshop on Mobile Geographic Information Systems, 2012, pp. 25-32.
- Kurashima et al. (2013) T. Kurashima, T. Iwata, T. Hoshide, N. Takaya, K. Fujimura, Geo topic model: joint modeling of user’s activity area and interests for location recommendation (2013) 375-384.
- Xiong et al. (2017) H. Xiong, H. Xiong, H. Xiong, H. Xiong, H. Xiong, H. Xiong, A location-sentiment-aware recommender system for both home-town and out-of-town users (2017) 1135-1143.
- Wang et al. (2017) W. Wang, H. Yin, L. Chen, Y. Sun, S. Sadiq, X. Zhou, St-sage: A spatial-temporal sparse additive generative model for spatial item recommendation, ACM Transactions on Intelligent Systems and Technology 8 (3) (2017) 48.
- Reddy et al. (2017) C. K. Reddy, C. K. Reddy, C. K. Reddy, C. K. Reddy, Probabilistic social sequential model for tour recommendation, in: Tenth ACM International Conference on Web Search and Data Mining, 2017, pp. 631-640.
- Gong et al. (2017) Y. Gong, Q. Zhang, X. Huang (2017), Hashtag recommendation for multimodal microblog posts, Neurocomputing.
- Tang et al. (2013) H. Tang, L. Shen, Y. Qi, Y. Chen, Y. Shu, J. Li, D. A. Clausi, A multiscale latent dirichlet allocation model for object-oriented clustering of vhr panchromatic satellite images, IEEE Transactions on Geoscience and Remote Sensing 51 (3) (2013) 1680-1692.
- Xu et al. (2011) G. Xu, Y. Gu, P. Dolog, Y. Zhang, M. Kitsuregawa, Semrec: A semantic enhancement framework for tag based recommendation, in: AAAI Conference on Artificial Intelligence, AAAI 2011, San Francisco, California, USA, August, 2011.
- Lu et al. (2015) H. M. Lu, C. H. Lee, A twitter hashtag recommendation model that accommodates for temporal clustering effects, IEEE Intelligent Systems 30 (3) (2015) 18-25.
- Wang et al. (2015) H. Wang, X. Shi, D. Y. Yeung, Relational stacked denoising autoencoder for tag recommendation, in: Twenty-Ninth AAAI Conference on Artificial Intelligence, 2015, pp. 3052-3058.
- Krestel et al. (2009) R. Krestel, P. Fankhauser, W. Nejdl, Latent dirichlet allocation for tag recommendation, in: ACM Conference on Recommender Systems, Recsys 2009, New York, NY, USA, October, 2009, pp. 61-68.
- Shi et al. (2016) B. Shi, G. Ifrim, N. Hurley, Learning-to-rank for real-time high-precision hashtag recommendation for streaming news, in: International Conference on World Wide Web, 2016, pp. 1191-1202.
- Wu et al. (2015) H. Wu, Y. Pei, B. Li, Z. Kang, X. Liu, H. Li, Item recommendation in collaborative tagging systems via heuristic data fusion, Knowledge-Based Systems 75 (C) (2015) 124-140.
- Prokofyev et al. (2012) R. Prokofyev, A. Boyarsky, O. Ruchayskiy, K. Aberer, G. Demartini, Tag recommendation for large-scale ontology-based information systems, in: International Conference on the Semantic Web, 2012, pp. 325-336.
- Jin et al. (2010) Y. Jin, R. Li, Y. Cai, Q. Li, A. Daud, Y. Li, Semantic grounding of hybridization for tag recommendation, in: Web-Age Information Management, International Conference, WAIM 2010, Jiuzhaigou, China, July 15-17, 2010. Proceedings, 2010, pp. 139-150.
- Wang et al. (2014) Y. Wang, J. Liu, J. Qu, Y. Huang, J. Chen, X. Feng, Hashtag graph based topic model for tweet mining, in: IEEE International Conference on Data Mining, 2014, pp. 1025-1030.
- Ma et al. (2014) Z. Ma, A. Sun, Q. Yuan, G. Cong, Tagging your tweets: A probabilistic modeling of hashtag annotation in twitter, in: ACM International Conference on Conference on Information and Knowledge Management, 2014, pp. 999-1008.
- Lu et al. (2011) C. Lu, X. Hu, J. R. Park, J. Huang, Post-based collaborative filtering for personalized tag recommendation, in: iConference 2011, Inspiration, Integrity, and Intrepidity, Seattle, Washington, USA, February, 2011, pp. 561-568.
- Zhao et al. (2016) F. Zhao, Y. Zhu, H. Jin, L. T. Yang, A personalized hashtag recommendation approach using lda-based topic model in microblog environment, Future Generation Computer Systems 65 (C) (2016) 196-206.
- Zhang et al. (2013) D. Zhang, D. Zhang, L. Si, Semantic hashing using tags and topic modeling, in: International ACM SIGIR Conference on Research and Development in Information Retrieval, 2013, pp. 213-222.
- She et al. (2014) J. She, L. Chen, Tomoha: Topic model-based hashtag recommendation on twitter, in: International Conference on World Wide Web, 2014, pp. 371-372.
- Tomar et al. (2014) A. Tomar, F. Godin, B. Vandersmissen, W. D. Neve, R. V. D. Walle, Towards twitter hashtag recommendation using distributed word representations and a deep feed forward neural network, in: International Conference on Advances in Computing, Communications and Informatics, 2014, pp. 362-368.
- Jianjun et al. (2015) Jianjun, Tongyu, Combining long-term and short-term user interest for personalized hashtag recommendation, Frontiers of Computer Science 9 (4) (2015) 608-622.
- Li et al. (2016) J. Li, H. Xu, Suggest what to tag: Recommending more precise hashtags based on users dynamic interests and streaming tweet content, Knowledge-Based Systems 106 (2016) 196-205.
- Jiang et al. (2017) Y. Li, J. Jiang, T. Liu, M. Qiu, X. Sun, Personalized microtopic recommendation on microblogs, ACM Transactions on Intelligent Systems and Technology 8 (6) (2017) 1-21.
- Ouyang et al. (2015) X. Li, J. Ouyang, X. Zhou, Centroid prior topic model for multi-label classification, Pattern Recognition Letters 62 (2015) 8-13.
- He et al. (2016) J. He, H. Liu, H. Xiong, Socotraveler: Travel-package recommendations leveraging social influence of different relationship types, Information and Management 53 (8) (2016) 934-950.
- Kavitha et al. (2017) S. Kavitha, V. Jobi, S. Rajeswari, Tourism Recommendation Using Social Media Profiles, 2017.
- Sun et al. (2016) C. Y. Sun, A. J. T. Lee, Tour recommendations by mining photo sharing social media, Decision Support Systems, 2017.
- Zhang et al. (2015) C. Zhang, H. Liang, K. Wang, J. Sun, Personalized trip recommendation with poi availability and uncertain traveling time (2015) 911-920.
- Cao et al. (2017) D. Cao, L. Nie, X. He, X. Wei, J. Shen, S. Wu, T. S. Chua, Version-sensitive mobile app recommendation, Information Sciences 381 (2017) 161-175.
- Chua et al. (2017) T. S. Chua, T. S. Chua, T. S. Chua, T. S. Chua, T. S. Chua, T. S. Chua, T. S. Chua, Cross-platform app recommendation by jointly modeling ratings and texts, Acm Transactions on Information Systems 35 (4) (2017) 37.
- He et al. (2017) J. He, H. Liu, Mining exploratory behavior to improve mobile app recommendations, ACM Transactions on Information Systems 35 (4) (2017) 1-37.
- Zhu et al. (2017) K. Zhu, L. Zhang, A. Pattavina, Learning geographical and mobility factors for mobile application recommendation, IEEE Intelligent Systems 32 (3) (2017) 36-44.
- Fang et al. (2016) Z. R. Fang, S. W. Huang, F. Yu, Appreco: Behavior-aware recommendation for ios mobile applications, in: IEEE International Conference on Web Services, 2016, pp. 492-499.
- Cheng et al. (2016) X. Li, X. Cheng, S. Su, S. Li, J. Yang, A hybrid collaborative filtering model for social influence prediction in event-based social networks, Neurocomputing 230 (2016) 197-209.
- Magnuson et al. (2015) A. Magnuson, V. Dialani, D. Mallela, Event recommendation using twitter activity, 2015, pp. 331-332.
- Khrouf et al. (2013) H. Khrouf, Hybrid event recommendation using linked data and user diversity, in: ACM Conference on Recommender Systems, 2013, pp. 185-192.
- Zhang et al. (2013) Y. Zhang, H. Wu, V. Sorathia, V. K. Prasanna, Event recommendation in social networks with linked data enablement, in: 15th International Conference on Enterprise Information Systems, 2013, pp. 371-379.
- Minkov et al. (2010) E. Minkov, B. Charrow, J. Ledlie, S. Teller, T. Jaakkola, Collaborative future event recommendation, in: ACM International Conference on Information and Knowledge Management, 2010, pp. 819-828.
- Zhang et al. (2013) W. Zhang, J. Wang, W. Feng, Combining latent factor model with location features for event-based group recommendation, in: ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2013, pp. 910-918.
- Hsieh et al. (2016) C. K. Hsieh, L. Yang, H. Wei, M. Naaman, D. Estrin, Immersive recommendation: News and event recommendations using personal digital traces, in: International Conference on World Wide Web, 2016, pp. 51-62.
- Zhengxing et al. (2014) H. Zhengxing, D. Wei, J. Lei, G. Chenxi, L. Xudong, D. Huilong, Discovery of clinical pathway patterns from event logs using probabilistic topic models, Journal of Biomedical Informatics 47 (2) (2014) 39.
- Li et al. (2017) S. Li, X. Cheng, S. Su, H. Sun, Exploiting organizer influence and geographical preference for new event recommendation, Expert Systems 34 (3) (2017) e12190.
- Purushotham et al. (2016) S. Purushotham, C. C. J. Kuo, Personalized group recommender systems for location- and event-based social networks 2 (4) (2016) 1-29.
- Wang et al. (2016) Z. Wang, J. Liao, Q. Cao, H. Qi, Friendbook: A semantic-based friend recommendation system for social networks, IEEE Transactions on Mobile Computing 14 (3) (2016) 538-551.
- Pennacchiotti et al. (2011) M. Pennacchiotti, S. Gurumurthy, Investigating topic models for social media user recommendation, in: International Conference on World Wide Web, WWW 2011, Hyderabad, India, March 28 - April, 2011, pp. 101-102.
- Zhang et al. (2017) Y. Zhang, Z. Tu, Q. Wang, Temporec: Temporal-topic based recommender for social network services, Mobile Networks and Applications (2017) 1-10.
- Chu et al. (2017) W. T. Chu, Y. L. Tsai, A hybrid recommendation system considering visual information for predicting favorite restaurants, World Wide Web-internet and Web Information Systems 20 (6) (2017) 1313-1331.
- Resnick et al. (1994) P. Resnick, N. Iacovou, M. Suchak, P. Bergstrom, J. Riedl, Grouplens: an open architecture for collaborative filtering of netnews, in: ACM Conference on Computer Supported Cooperative Work, 1994, pp. 175-186.
- Shardanand et al. (1995) U. Shardanand, Social information filtering: algorithms for automating word of mouth 110 (1) (1995) 210-217.
- Nathaniel et al. (1999) Nathaniel, Schafer, J. Ben, Konstan, A. Joseph, Borchers, Sarwar, Badrul, Combining collaborative filtering with personal agents for better recommendations, in: Sixteenth National Conference on Artificial Intelligence and the Eleventh Innovative Applications of Artificial Intelligence Conference Innovative Applications of Artificial Intelligence, 1999, pp. 439-446.
- Sarwar et al. (2001) B. Sarwar, G. Karypis, J. Konstan, J. Riedl, Item-based collaborative filtering recommendation algorithms, in: International World Wide Web Conference, 2001, pp. 285-295.
- Bobadilla et al. (2013) J. Bobadilla, F. Ortega, A. Hernando, Recommender systems survey, Knowledge-Based Systems 46 (1) (2013) 109-132.
- Silva et al. (2013) T. Silva, Z. Guo, J. Ma, H. Jiang, H. Chen, A social network-empowered research analytics framework for project selection, Decision Support Systems 55 (4) (2013) 957-968.
- Huang et al. (2017) S. Huang, J. Zhang, S. Dan, L. Wang, X. S. Hua, Two-stage friend recommendation based on network alignment and series expansion of probabilistic topic model, IEEE Transactions on Multimedia 19 (6) (2017) 1314-1326.
- Zhu et al. (2015) J. Zhu, L. Li, Chun-Mei, From interest to location: Neighbor-based friend recommendation in social media, Journal of Computer Science and Technology 30 (6) (2015) 1188-1200.
- Zheng et al. (2015) N. Zheng, S. Song, H. Bao, A temporal-topic model for friend recommendations in chinese microblogging systems, IEEE Transactions on Systems, Man, and Cybernetics: Systems 45 (9) (2015) 1245-1253.
- Zhao et al. (2011) W. X. Zhao, J. Jiang, J. Weng, J. He, E. P. Lim, H. Yan, X. Li, Comparing twitter and traditional media using topic models, Lecture Notes in Computer Science 6611 (2011) 338-349.
- Falher et al. (2015) G. L. Falher, A. Gionis, M. Mathioudakis, Where is the soho of rome? measures and algorithms for finding similar neighborhoods in cities (2015) 901-910.
- Zhang et al. (2016) C. Zhang, H. Liang, K. Wang, Trip Recommendation Meets Real-World Constraints: POI Availability, Diversity, and Traveling Time Uncertainty, ACM, 2016.
- Lin et al. (2013) J. Lin, K. Sugiyama, M. Y. Kan, T. S. Chua, Addressing cold-start in app recommendation: latent user models constructed from twitter followers, in: International ACM SIGIR Conference on Research and Development in Information Retrieval, 2013, pp. 283-292.
- Lipman et al. (1985) D. J. Lipman, W. R. Pearson, Rapid and sensitive protein similarity searches, Science 227 (4693) (1985) 1435.
- Jamali et al. (2009) M. Jamali, M. Ester, Trustwalker: a random walk model for combining trust-based and item-based recommendation, in: ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2009, pp. 397-406.
- Basilico et al. (2004) J. Basilico, T. Hofmann, Unifying collaborative and content-based filtering, in: International Conference on Machine Learning, 2004, p. 9.
- Su et al. (2009) X. Su, T. M. Khoshgoftaar, A survey of collaborative filtering techniques, Hindawi Publishing Corp., 2009.
- Zhao et al. (2016) W. X. Zhao, S. Li, Y. He, E. Y. Chang, J. R. Wen, X. Li, Connecting social media to e-commerce: Cold-start product recommendation using microblogging information, IEEE Transactions on Knowledge and Data Engineering 28 (5) (2016) 1147-1159.
- Masood et al. (2017) M. A. Masood, R. A. Abbasi, O. Maqbool, M. Mushtaq, N. R. Aljohani, A. Daud, M. A. Aslam, J. S. Alowibdi, Mfs-lda: a multi-feature space tag recommendation model for cold start problem, Program Electronic Library and Information Systems (4) (2017) 00-00.
- Mishra et al. (2017) N. Mishra, S. Chaturvedi, V. Mishra, R. Srivastava, P. Bargah, Solving sparsity problem in rating-based movie recommendation system, 2017.
- Gao et al. (2016) Z. Gao, Y. Fan, C. Wu, W. Tan, J. Zhang, Y. Ni, B. Bai, S. Chen, Seco-lda: Mining service co-occurrence topics for recommendation, in: IEEE International Conference on Web Services, 2016, pp. 25-32.
- Jiang et al. (2015) S. Jiang, X. Qian, J. Shen, T. Mei, Travel recommendation via author topic model based collaborative filtering, in: International Conference on Multimedia Modeling, 2015, pp. 392-402.
- Bao et al. (2012) J. Bao, Y. Zheng, M. F. Mokbel, Location-based and preference-aware recommendation using sparse geo-social networking data, in: International Conference on Advances in Geographic Information Systems, 2012, pp. 199-208.
- Yao et al. (2015) W. Yao, J. He, H. Wang, Y. Zhang, J. Cao, Collaborative topic ranking: leveraging item meta-data for sparsity reduction, in: Twenty-Ninth AAAI Conference on Artificial Intelligence, 2015, pp. 374-380.
- Chen et al. (2012) Q. Liu, E. Chen, H. Xiong, C. H. Ding, J. Chen, Enhancing collaborative filtering by user interest expansion via personalized ranking, IEEE Transactions on Systems Man and Cybernetics Part B 42 (1) (2012) 218-233.