Cross-domain novelty seeking trait mining for sequential recommendation

03/05/2018 ∙ by Fuzhen Zhuang, et al. ∙ Microsoft Institute of Computing Technology, Chinese Academy of Sciences

Transfer learning has attracted a large amount of interest and research in recent decades, and some efforts have been made to build more precise recommendation systems. Most previous transfer recommendation systems assume that the target domain shares the same or similar rating patterns with the auxiliary source domain, which is used to improve the recommendation performance. However, to the best of our knowledge, almost none of these works consider the characteristics of sequential data. In this paper, we study a new cross-domain recommendation scenario for mining the novelty-seeking trait. Recent studies in psychology suggest that the novelty-seeking trait is highly related to consumer behavior, which has a profound business impact on online recommendation. Previous work on a single target domain may not fully characterize users' novelty-seeking trait due to data scarcity and sparsity, leading to poor recommendation performance. Along this line, we propose a new cross-domain novelty-seeking trait mining model (CDNST for short) to improve sequential recommendation performance by transferring knowledge from an auxiliary source domain. We conduct systematic experiments on data sets from three domains crawled from Douban (www.douban.com) to demonstrate the effectiveness of the proposed model. Moreover, we analyze how the temporal property of sequential data affects the performance of CDNST, and conduct simulation experiments to validate our analysis.


1. Introduction

Personalized recommendation plays a very important role in the rapid development of E-commerce. To make more precise recommendations for personal needs, we should understand users’ preferences or profiles according to their historical behaviors. For example, on the well-known E-commerce website Amazon, we may make recommendations to a user based on other users with similar consuming behaviors, or according to his own historical consuming behaviors. Therefore, recommendation systems have attracted a vast amount of interest and research in recent years to handle the information overload problem and make predictions (Bobadilla et al., 2013; Su and Khoshgoftaar, 2009).

Unlike most previous works, some recent efforts have been devoted to modeling an individual’s propensity from a psychological perspective for recommendation systems (Zhang et al., 2014, 2015). Novelty seeking is a personal trait described as the search for unfamiliar experiences and feelings that are “varied, novel, complex, and intense”, and measured by the readiness to take “physical, social, legal, and financial” risks for the sake of such experiences. Novelty seeking, as well as harm avoidance and reward dependence, has been regarded as a basic requirement for human activities (Cloninger et al., 1994). Behaviors of users are also relatively consistent in similar situations (Furr and Funder, 2004). In consumer behavior and recommender system research, understanding this personality trait is particularly crucial since consumers’ attributes are strong indicators of their purchasing behaviors (Stern, 1962). Hence, knowing whether a consumer loves trying new things makes it possible to recommend products that better match the consumer’s taste and to reach targets faster and more effectively. To that end, Zhang et al. (Zhang et al., 2014) proposed a computational framework named Novel Seeking Model (NSM) to explore the novelty-seeking trait implied by observable sequential activities. Experimental results showed that NSM can uncover the correlation of the novelty-seeking trait at different levels and improve the recommendation performance. Following this line, Zhang et al. (Zhang et al., 2015) also proposed a novelty-seeking based dining recommendation system.

Users are often active on many E-commerce websites and generate a large amount of sequential behavioral data in different domains. A user’s behaviors in different areas tend to be consistent. For example, if a user mostly watches movies featuring his favourite actors, this suggests a lower novelty-seeking propensity, so he is also likely to listen to a particular genre of music. Modeling the novelty-seeking trait in one single domain may not completely characterize each individual’s profile, while the sequential behavioral data of one user from different domains may help to exploit the novelty-seeking trait. For example, on the well-known Chinese social media platform Douban (https://www.douban.com/), users usually read books, listen to music, watch movies, and then post comments expressing their preferences. Observing these three domains of sequential behavioral data, we find that users listen to some music and then, after a period of time, watch a related movie, e.g., the music is the theme music of the movie; users also sometimes watch movies after reading the books from which the movies are derived. Based on these observations, can the sequential behavioral data of the Book and Music domains help to model the novelty-seeking trait in the Movie domain? This issue is crucial to cross-domain recommendation, especially when one domain suffers from the cold-start problem. On the other hand, transfer learning aims to transfer knowledge from a related auxiliary source domain to a target domain. Along this line, we propose a new cross-domain novelty-seeking trait mining model, termed CDNST, in which the parameters characterizing the novelty-seeking trait are shared across different domains to achieve significant improvement for recommendations.
We crawled three domains of data from the Douban website, i.e., Book, Music and Movie, and conducted extensive experiments to validate the effectiveness of the proposed model. The experiments also indicate that the performance of CDNST is sensitive to the sequential property of related domain data, which inspires us to study new cross-domain methods in the future.

Our contributions can be summarized as follows:

  • We propose a new cross-domain novelty-seeking algorithm for better modeling an individual’s propensity from psychological perspective for recommendation, in which the novelty-seeking level of each individual is shared for knowledge transfer across different domains.

  • We crawl three domains of data sets from the well-known Chinese social-media platform Douban and construct 14 transfer recommendation problems, to demonstrate the effectiveness of the proposed model CDNST.

  • We are the first to analyze how the temporal property of sequential data affects the transfer learning model, and conduct simulation experiments to validate our analysis. Moreover, we define an effective relatedness measure to decide what kinds of transfer learning problems are suitable to our model.

The remainder of this paper is organized as follows. Section 2 briefly introduces the related work. Section 3 details the problem formulation and solution derivation of CDNST. The effectiveness and analysis experiments are shown in Section 4. Finally, Section 5 concludes this paper.

2. Related Work

In this section, we briefly introduce the most related work on novelty-seeking research and transfer recommendation systems.

2.1. Novelty Seeking Research

Novelty seeking is a personality trait expressed in the generalized tendency to seek varied, novel, complex, and intense sensations and experiences and the willingness to take risks for the sake of such experiences (Zuckerman, 1979). Consumer behavior and health science research have long focused on novelty seeking (Ebstein et al., 1996). It is also construed as sensation seeking or neophilia. The notion of novelty seeking was proposed by Acker and McReynolds (Acker and McReynolds, 1967), and it was also studied by McClelland (McClelland, 1955), Fiske and Maddi (Fiske and Maddi, 1961) and Rogers (Rogers, 2010). Raju (Raju, 1980) studied personality traits, demographic variables, and exploratory behavior in the consumer context. Baumgartner and Steenkamp (Baumgartner and Steenkamp, 1996) proposed a two-factor conceptualization of exploratory consumer buying behavior. Zhang et al. (Zhang et al., 2014) presented a computational framework for exploiting the novelty-seeking trait implied by observable activities. Our work is inspired by (Zhang et al., 2014) and focuses on a transfer learning model for better recommendation.

2.2. Transfer Recommendation Systems

In order to integrate more information from different domains for better recommendation, cross-domain recommendation combines data sources from different domains with the original target data (Li et al., 2009; Fernández-Tobías et al., 2012). The basic idea of existing methods is to utilize the common latent structure shared across domains as the bridge for knowledge transfer. Recently, considering that the number of overlapped users is often small, Jiang et al. (Jiang et al., 2016) proposed a novel semi-supervised transfer learning method to address the problem of cross-platform behavior prediction. Wei et al. (Wei et al., 2017) proposed a Heterogeneous Information Ensemble framework to predict users’ personality traits by integrating heterogeneous information including self-language usage, avatar, emotion, and responsive patterns. Lian et al. (Lian et al., 2017) proposed CCCFNet, which combines collaborative filtering and content-based filtering for cross-domain recommendation. Wang et al. (Wang et al., 2017) proposed a model across multiple deep neural nets to learn the representation of each article and capture the change.

Although these cross-domain recommendation methods have achieved success in many applications, they are usually designed for static rating data. This paper focuses on a transfer recommendation system for sequential behavior data from a psychological perspective.

3. Model and Solution

In this section, we present a cross-domain framework to explore the novelty-seeking trait embodied in an individual’s behavioral data in a target domain by transferring knowledge from related auxiliary source domain data. First, we would like to clarify some of the notions commonly used in this paper, and then propose the cross-domain novelty-seeking trait mining model (CDNST). Finally, the solution of CDNST is inferred.

3.1. Preliminaries

Action and choice: An action is a specific observed behavior taken by an individual in the source domain or the target domain. Each action is selected from the optional choices of its own domain, and the number of choices may differ between the source and target domains. The granularity of choices can vary according to the data format and application. For example, an action on Amazon refers to the purchase of an item, where the choices are all available items. For Douban, in particular, an action could refer to a comment on a specific artwork, which is considered as one of the choices. Besides, every choice candidate in both domains has its context information, which involves categories, tags, keywords, etc. For example, the information of players, e.g., “Will Smith”, “Tony Stark”, could be found in the keywords of a movie (choice) in the Douban movie channel; the categories of music (choice), e.g., “folk”, “R&B”, are presented as tags in the Douban music channel. The action sequence of an individual refers to the actions taken in chronological order in a specific domain, and its length is the number of actions. We show a general example of action and choice in Fig. 1 (a) and (b). In more detail, Fig. 1 (a) demonstrates three users’ action sequences, and Fig. 1 (b) exhibits four choice candidates in the running example, in which “A”, “B”, “C”, “D” and “E” denote the context information of these choices.

Denotations
  • number of optional values for the novelty-seeking level
  • number of optional choices for an action in the source domain
  • number of optional choices for an action in the target domain
  • length of actions in the source domain
  • length of actions in the target domain
  • a vector indicating the action sequence in the source domain
  • a vector indicating the action sequence in the target domain
  • a vector indicating the novelty-seeking level sequence in the source domain
  • a vector indicating the novelty-seeking level sequence in the target domain
  • novelty-seeking level distribution
  • choice utility distribution in the source domain
  • choice utility distribution in the target domain
  • dynamic choice novelty matrix in the source domain
  • dynamic choice novelty matrix in the target domain
  • hyperparameters related to the novelty-seeking level distribution and the two choice utility distributions, respectively
Table 1. Summary table of symbols

Dynamic choice novelty (DCN): Given an arbitrary domain, the dynamic choice novelty matrix (Zhang et al., 2014) has one row per position in the action sequence and one column per choice in that domain. Every element is an integer rank. The matrix is used to present partial orders among choices at each position: each row measures the partial order among the choices at the corresponding position. For instance, User1 in Fig. 1 (a) has four choices at a given position in his action sequence, and the corresponding row of the matrix, shown in Fig. 1 (c), gives the current novelty of these four choice candidates, which is related to two factors: (1) the popularity of the choice and (2) the popularity of the choice transition, given historical observations. The more popular the two factors, the lower the ranking of the choice. The matrix for a given source domain, for example, can be computed according to the following principle

(1)

where the popularity factor refers to the frequency of terms in the context information (keywords, specifically) before the current position in this individual’s sequence; it measures the popularity at that moment in view of this individual. The transition factor refers to the transition probability of keywords before the current position in this individual’s sequence, which measures the popularity of context information transitions at that moment in view of this individual. The matrix for the target domain can be obtained in a similar way.

For example, regarding User1 in Fig. 1 (a), the novelty order of choices at -th position is o >o = o = o since the frequency of terms in o is minimal and transfer of o , o and o considering this individual’s historical behavior. This is partial order according to Equation (1), and denotes the novelty-seeking of taken by a user at position of , thus (Note that 1 indicates the lowest ranking and vice versa).
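As a rough illustration of the dynamic choice novelty idea, the sketch below ranks candidate choices by how unfamiliar they are given one individual's history. Equation (1) is not reproduced here, so the scoring rule (add-one smoothed keyword frequency multiplied by add-one smoothed transition count) and the name `novelty_ranks` are assumptions made purely for illustration, not the authors' formula.

```python
from collections import Counter

def novelty_ranks(history, candidates):
    """Rank candidates by novelty given one individual's past keywords.

    history: keywords already chosen, in chronological order.
    candidates: keywords available at the current position.
    Returns {candidate: rank}; rank 1 is the least novel (most familiar),
    and tied candidates share a rank, which yields a partial order.
    """
    freq = Counter(history)                      # keyword popularity so far
    trans = Counter(zip(history, history[1:]))   # observed keyword transitions
    prev = history[-1] if history else None

    def familiarity(c):
        # Assumed score: smoothed popularity times smoothed transition count.
        return (1 + freq[c]) * (1 + trans[(prev, c)])

    scores = {c: familiarity(c) for c in candidates}
    # More familiar -> lower novelty rank; equal scores share a rank.
    distinct = sorted(set(scores.values()), reverse=True)
    rank_of = {s: i + 1 for i, s in enumerate(distinct)}
    return {c: rank_of[scores[c]] for c in candidates}


ranks = novelty_ranks(["A", "B", "A", "B"], ["A", "B", "C", "D"])
```

In this toy history the never-seen keywords "C" and "D" tie for the highest rank, which mirrors the partial order described in the running example.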

Figure 1. Dynamic choice novelty with regards to individual in the domain of .

Novelty-seeking level: The novelty-seeking level is a positive integer, where a larger value indicates a higher novelty-seeking propensity and vice versa in a given domain. In the action sequence of an individual in a domain, each position relates to a specific novelty-seeking level, e.g., if User1 chooses the most novel candidate at the last position, it is more likely that he has a high novelty-seeking propensity at that moment and wants to explore something new in that domain. Otherwise, one of the less novel candidates might be his choice. We argue that an individual’s novelty seeking in different domains sometimes shows similar traits; we can thus transfer such knowledge between multiple domains.

Novelty-seeking trait (NST): The novelty-seeking trait is a real number ranging from 1 to the number of novelty-seeking levels, defined as the mean of a multinomial distribution over novelty-seeking levels, in which each component is the probability of having the corresponding novelty-seeking level. As introduced in (Zhang et al., 2014), the larger the NST, the greater the novelty-seeking propensity the individual possesses, and vice versa.
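Since the novelty-seeking trait is described as the mean of a multinomial distribution over novelty-seeking levels, it can be computed as a probability-weighted average of the level values. The following minimal sketch assumes the distribution is given as a list `theta` whose i-th entry is the probability of level i + 1; the function name and argument layout are hypothetical.

```python
def novelty_seeking_trait(theta):
    """Mean of a multinomial over levels 1..N; theta[i] is P(level = i + 1)."""
    if abs(sum(theta) - 1.0) > 1e-9:
        raise ValueError("theta must sum to 1")
    return sum((i + 1) * p for i, p in enumerate(theta))


nst = novelty_seeking_trait([0.2, 0.3, 0.5])
```

A distribution concentrated on the highest level gives an NST equal to the number of levels, and one concentrated on level 1 gives an NST of 1.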

3.2. The Proposed Model

In the following, we detail the proposed model, which is inspired by the NSM proposed by Zhang et al. (Zhang et al., 2014). In their work, a graphical model expressing how to generate observable actions in one specific domain was proposed. In this paper, however, the proposed CDNST attempts to transfer the novelty-seeking traits learnt from an auxiliary source domain to improve the accuracy of recommendation in the target domain. Our notation and terminology closely follow the standards in (Zhang et al., 2014) and deviate only when necessary.

Figure 2. A graphical representation of our general novelty seeking model.

The notations and denotations we use in this model are summarised in Table 1. In CDNST, we extend the framework of NST to multiple domains and give its graphical model in Fig. 2. As shown in Fig. 2, one latent variable represents the novelty-seeking level at each position in the source domain, and a corresponding latent variable represents the novelty-seeking level at each position in the target domain. Both are sampled from a shared multinomial novelty-seeking distribution. In addition, we use one latent variable per domain to represent the utility of each choice in the source and target domains, respectively. They can be interpreted as this individual’s preference for each choice in the corresponding domain. Furthermore, each of these three distributions has its own hyper-parameter. The observed actions at each position in the source and target domains are also shown in the figure. The value of an action in the source domain relies on the novelty-seeking level at the same position, the choice utility distribution of the source domain, and the previously chosen action. The generation process of an action in the target domain is similar but relies on the corresponding variables in the target domain.

The first-order dependency of the action sequence is still carried out for the different domains in CDNST for simplicity and feasibility. Hence given a dynamic choice novelty matrix  () precomputed according to the individual’s behavior, and incorporating both the utility and the novelty-seeking factors, the conditional probability is given as:

(2)
(3)

where the first-order dependency between and and that between and are embodied when we compute and .

We assume that an individual accepts a choice with a higher probability when the novelty of the choice is consistent with the novelty-seeking level at the given position. For instance, if an individual has a high novelty-seeking level at that moment, we expect he/she is more likely to accept the choice with the largest novelty in the partial order. Otherwise, he/she is likely to accept a choice with little novelty in the partial order. As a result, we give the action function adopted for both the source and target domains in CDNST as follows:

(4)
(5)

where max indicates the maximum value in the -th row of matrix , max indicates the maximum value in the -th row of matrix .

The generative process of CDNST is summarized as Algorithm 1.

  • Draw novelty-seeking level distribution ;

  • Draw choice utility distribution in the domain ;

  • For the -th position in the sequence

    • Draw novelty-seeking level ;

    • Draw item ;

  • Draw choice utility distribution in the domain ;

  • For the -th position in the sequence

    • Draw novelty-seeking level ;

    • Draw item ;

Algorithm 1 Generative process of CDNST
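The generative process above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' implementation: the Dirichlet priors, the form of the action function (a choice is favoured when its scaled novelty rank is close to the scaled novelty-seeking level), and all names (`dirichlet`, `generate_domain`, `generate_cdnst`) are hypothetical. The choice novelty matrix is passed in as fixed input, whereas in the model it is precomputed from the individual's behavior.

```python
import random

def dirichlet(alpha, rng):
    """Draw from a Dirichlet distribution via normalised Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

def generate_domain(theta, beta, dcn, rng):
    """Generate (levels, actions) for one domain given the shared theta.

    dcn[j][i] is the novelty rank of choice i at position j (1 = least novel).
    """
    n_levels = len(theta)
    phi = dirichlet(beta, rng)            # choice utility for this domain
    levels, actions = [], []
    for row in dcn:
        z = rng.choices(range(1, n_levels + 1), weights=theta)[0]
        top = max(row)
        # Assumed action function: reward agreement between the scaled
        # novelty-seeking level and the scaled novelty rank of each choice.
        match = [1.0 - abs(z / n_levels - r / top) + 1e-6 for r in row]
        weights = [p * m for p, m in zip(phi, match)]
        actions.append(rng.choices(range(len(row)), weights=weights)[0])
        levels.append(z)
    return levels, actions

def generate_cdnst(alpha, beta_s, beta_t, dcn_s, dcn_t, seed=0):
    rng = random.Random(seed)
    theta = dirichlet(alpha, rng)         # shared across both domains
    source = generate_domain(theta, beta_s, dcn_s, rng)
    target = generate_domain(theta, beta_t, dcn_t, rng)
    return theta, source, target


theta, (z_s, a_s), (z_t, a_t) = generate_cdnst(
    [1.0] * 3, [1.0] * 4, [1.0] * 2,
    [[1, 2, 3, 3], [2, 1, 1, 3]], [[1, 2], [2, 1], [1, 1]], seed=1)
```

The point of the sketch is the structure: a single novelty-seeking level distribution is drawn once and reused for both domains, while each domain keeps its own choice utility distribution.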

3.3. Model Inference

Following NSM, pointwise Gibbs sampling is applied by repeatedly drawing novelty-seeking level and novelty-seeking level distribution , and choice utility distribution and . The sampling process is summarised as follows:

  • Randomly draw from

    (6)
  • Randomly draw from

    (7)

    where is a vector that increases the position by for , is the number of novelty-seeking level with value in the current state of the sampler.

  • Randomly draw from

    (8)
  • Randomly draw from

    (9)
  • Randomly draw from

    (10)

    where is a vector that increases the position by for , is the number of novelty-seeking level with value in the current state of the sampler.

  • Randomly draw from

    (11)
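One step of the sampler can be illustrated for the shared novelty-seeking level distribution. Assuming a Dirichlet prior (an assumption consistent with the description of a hyperparameter vector being incremented by level counts, but not confirmed by the equations above), conjugacy gives a Dirichlet posterior whose parameters are the prior plus the number of positions currently assigned to each level, pooled over both domains. The function name `resample_theta` is hypothetical.

```python
import random
from collections import Counter

def resample_theta(alpha, z_source, z_target, rng=random):
    """Resample the shared level distribution given current level assignments.

    alpha: prior parameters, one per novelty-seeking level 1..N.
    z_source, z_target: current novelty-seeking levels in each domain.
    Returns (posterior parameters, one Dirichlet draw from that posterior).
    """
    counts = Counter(z_source) + Counter(z_target)   # pooled level counts
    posterior = [a + counts[n + 1] for n, a in enumerate(alpha)]
    draws = [rng.gammavariate(p, 1.0) for p in posterior]
    total = sum(draws)
    return posterior, [d / total for d in draws]


posterior, theta = resample_theta([1.0, 1.0, 1.0], [1, 1, 3], [3, 3])
```

Pooling the counts is what couples the two domains: evidence about the individual's novelty-seeking levels in the source domain directly shifts the distribution used for the target domain.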

4. Experiments

In this section, we first conduct extensive experiments to demonstrate the effectiveness of the proposed model CDNST, and then analyze how the temporal property of sequential data affects the performance of CDNST. Finally, we design some simulation experiments to validate our analysis and define an effective relatedness measure to judge what kinds of transfer learning problems are suitable to our model.

(a) A user’s watching list of movies.
(b) An example of a movie’s information.
(c) An example of a piece of music’s information.
(d) An example of a book’s information.
Figure 3. Some Examples of Data in Douban.
Source → Target | Statistics
Movie_category → Music_tags, Music_tags → Movie_category | #user, #Movie_category, #Ave_Movie_category, #Music_tags, #Ave_Music_tags
Movie_tags → Music_tags, Music_tags → Movie_tags | #user, #Movie_tags, #Ave_Movie_tags, #Music_tags, #Ave_Music_tags
Movie_dir → Music_tags, Music_tags → Movie_dir | #user, #Movie_dir, #Ave_Movie_dir, #Music_tags, #Ave_Music_tags
Music_tags → Book_tags, Book_tags → Music_tags | #user, #Music_tags, #Ave_Music_tags, #Book_tags, #Ave_Book_tags
Book_tags → Movie_category, Movie_category → Book_tags | #user, #Book_tags, #Ave_Book_tags, #Movie_category, #Ave_Movie_category
Book_tags → Movie_tags, Movie_tags → Book_tags | #user, #Book_tags, #Ave_Book_tags, #Movie_tags, #Ave_Movie_tags
Book_tags → Movie_dir, Movie_dir → Book_tags | #user, #Book_tags, #Ave_Book_tags, #Movie_dir, #Ave_Movie_dir

A_tags (category, dir) means A’s tags (category, director and players).

Table 2. The statistics of seven pairs of data sets

4.1. Data Preparation

We prepare the data sets by crawling data from Douban, which is one of the most influential social-network service websites in China, containing movie, music and book ratings from millions of registered users. In Douban, users usually write a comment on a movie, a piece of music or a book after they have watched, listened to or read it. Fig. 3(a) shows a user’s watching list of movies, in which each record contains the name of the movie and the watching time (here we regard the time when the user performs the rating as the watching time). There are also descriptions of movies, music and books, examples of which are shown in Figs. 3(b) to 3(d), respectively. For example, the description of a movie contains the movie’s category, director and players, tags and so on.

We crawled the data from the three domains of Movie (i.e., the movie’s category, director and players, and tags), Music (i.e., the music’s tags) and Book (i.e., the book’s tags), and extracted the registered users who perform sequential behaviors in at least two domains. Finally, we constructed 14 transfer sequential recommendation problems (i.e., 7 pairs of data sets, each used in both directions). For clarity, the statistics of the seven pairs of data sets are summarized in Table 2. In this table, we provide the statistical information, including the number of users, the number of records (categories, director and players, tags), and the average number of records (categories, director and players, tags) for each user. From this table, we can find that the movie data is much denser than the book and music data, and the music data is the most sparse.
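The user extraction step can be sketched as a simple filter over behavior records. The record layout `(user, domain, timestamp, item)` and the function name are assumptions for illustration; the actual crawled data format is not described beyond the figures.

```python
def users_in_multiple_domains(records, min_domains=2):
    """Return users who have behaviors in at least `min_domains` domains.

    records: iterable of (user, domain, timestamp, item) tuples.
    """
    domains_of = {}
    for user, domain, _timestamp, _item in records:
        domains_of.setdefault(user, set()).add(domain)
    return {u for u, d in domains_of.items() if len(d) >= min_domains}


sample = [
    ("u1", "Movie", "2012-04-04", "m1"),
    ("u1", "Music", "2012-03-21", "s1"),
    ("u2", "Book", "2012-01-10", "b1"),
    ("u2", "Book", "2012-02-11", "b2"),
]
kept = users_in_multiple_domains(sample)
```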

Columns (source → target), in order: Music_tags → Movie_category; Music_tags → Movie_tags; Music_tags → Movie_dir; Music_tags → Book_tags; Book_tags → Movie_category; Book_tags → Movie_tags; Book_tags → Movie_dir

MRR
OF
OF_U
MC 0.2733
MC_U
NSM
NSM_U
CDNST 0.4024 0.4746 0.2570 0.3414 0.3965 0.4162
CDNST 0.2570 0.4271 0.5032 0.2675 0.3522 0.4017 0.4273

nDCG@15
OF
OF_U
MC 0.3490
MC_U 0.2957
NSM
NSM_U
CDNST 0.4778 0.5500 0.3529 0.3823 0.4829 0.5148
CDNST 0.3058 0.4930 0.5897 0.3614 0.3916 0.4875 0.5261

p@3
OF
OF_U
MC 0.2125
MC_U
NSM
NSM_U
CDNST 0.3990 0.4636 0.2039 0.3002 0.3538 0.4040
CDNST 0.2166 0.4185 0.4792 0.2174 0.3137 0.3601 0.4109

Table 3. Recommendation Performance on 7 Data Sets
Columns (source → target), in order: Movie_category → Music_tags; Movie_tags → Music_tags; Movie_dir → Music_tags; Book_tags → Music_tags; Movie_category → Book_tags; Movie_tags → Book_tags; Movie_dir → Book_tags

MRR
OF
OF_U
MC
MC_U
NSM 0.6842 0.6842 0.6842 0.6891 0.4031 0.4031 0.4031
NSM_U
CDNST
CDNST 0.7196 0.7374 0.7063 0.7101 0.4459 0.5016 0.4282

nDCG@15
OF
OF_U
MC
MC_U
NSM 0.7599 0.7599 0.7599 0.7657 0.4989 0.4989 0.4989
NSM_U
CDNST
CDNST 0.7794 0.7930 0.7715 0.8061 0.5194 0.5147 0.5308

p@3
OF
OF_U
MC
MC_U
NSM 0.6718 0.6718 0.6718 0.6744 0.3840 0.3840 0.3840
NSM_U
CDNST
CDNST 0.6852 0.7167 0.6950 0.7192 0.4108 0.4296 0.4007

Table 4. Recommendation Performance on The Other 7 Coupled Data Sets

4.2. Evaluation Metrics and Baselines

4.2.1. Evaluation Metrics

For all compared algorithms, a recommendation list of candidate choices with prediction probabilities is given, according to which we sort the candidate choices in descending order. In our experiments, the widely used evaluation metrics nDCG (Liu, 2009), MRR (Hanani et al., 2001) and Precision (Järvelin and Kekäläinen, 2000) are adopted to evaluate the performance of all algorithms, and they are defined as follows,

(12)
(13)

where the index ranges over positions in the recommendation list and the gain reflects the user’s preference for the item at that position. Sorting the gains in descending order and computing the sum as in Equation (12) yields the ideal value; dividing the actual value by the ideal value indicates the difference between the recommended order and the true order. The closer the actual result is to the front of the prediction list, the larger the value of MRR; the closer the actual result is to the front within the top k positions, the better nDCG@k and Precision@k are.
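For readers who want a concrete reference, the three metrics can be sketched as follows. These are the commonly used textbook definitions (linear gain with a base-2 logarithmic discount for DCG); the exact variants in Equations (12) and (13) may differ, and the function names are hypothetical.

```python
import math

def reciprocal_rank(ranked, actual):
    """1 / rank of the actual item in the ranked list, or 0 if it is absent.
    MRR is the mean of this value over all test cases."""
    return 1.0 / (ranked.index(actual) + 1) if actual in ranked else 0.0

def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k recommended items that are relevant."""
    return sum(1 for item in ranked[:k] if item in relevant) / k

def dcg_at_k(gains, k):
    """Discounted cumulative gain over the first k positions."""
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains[:k]))

def ndcg_at_k(ranked, gain_of, k):
    """DCG of the recommended order divided by DCG of the ideal order."""
    gains = [gain_of.get(item, 0.0) for item in ranked]
    ideal = dcg_at_k(sorted(gain_of.values(), reverse=True), k)
    return dcg_at_k(gains, k) / ideal if ideal > 0 else 0.0


rr = reciprocal_rank(["a", "b", "c"], "b")
p3 = precision_at_k(["a", "b", "c"], {"a", "c"}, 3)
ndcg = ndcg_at_k(["a", "b", "c"], {"b": 1.0}, 3)
```

All three metrics increase as the actual choice moves toward the front of the list, which is the behavior described above.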

4.2.2. Baselines

We compare the proposed model CDNST with the following methods:

  • OF (Order by Frequency): OF method always gives a recommendation list according to the frequency in the individual’s historical behavior sequence.

  • OF_U (Order by Frequency across domains): The only difference is that in OF_U we compute the frequency in both source and target domains, while in OF only target domain is used.

  • MC (Markov Chain) (Markov, 1971): The MC method models sequential behaviors in the target domain by learning a transition graph and performing predictions.

  • MC_U (Markov Chain across domains) (Markov, 1971): The MC_U method models sequential behaviors on both source and target domains by learning a transition graph and performing predictions.

  • NSM (Novel Seeking Model) (Zhang et al., 2014): This is a data-driven model to predict the behaviour on target domain.

  • NSM_U: We run the NSM model simply on both source and target domains, rather than in transfer manner.
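The two simplest baselines can be sketched as follows, assuming a behavior history is a chronological list of choice identifiers. The function names and the alphabetical tie-breaking are assumptions for illustration and are not taken from the original experiments.

```python
from collections import Counter, defaultdict

def order_by_frequency(history):
    """OF: rank choices by how often they occur in the history."""
    counts = Counter(history)
    return sorted(counts, key=lambda c: (-counts[c], c))

def markov_chain_rank(history):
    """MC: rank next choices by first-order transition counts
    out of the most recent action."""
    if not history:
        return []
    transitions = defaultdict(Counter)
    for prev, nxt in zip(history, history[1:]):
        transitions[prev][nxt] += 1
    following = transitions[history[-1]]
    return sorted(following, key=lambda c: (-following[c], c))


history = ["a", "b", "a", "c", "a", "b", "a"]
of_list = order_by_frequency(history)
mc_list = markov_chain_rank(history)
```

Under this sketch, the cross-domain variants OF_U and MC_U would correspond to calling the same functions on a history that merges source and target behaviors in chronological order.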

We set the number of optional values for novelty-seeking level as 9 for both NSM and CDNST.

4.3. Experimental Results

We first provide the experiments on all 14 transfer learning problems, and then show how the temporal property of data across different domain affects the proposed model CDNST.

4.3.1. Effectiveness Results

The seven pairs of data sets are divided into two groups, i.e., the two directions of each pair, A → B and B → A, are put into different groups (A and B represent two domains), and all the results for the evaluation metrics nDCG@15, MRR and p@3 are shown in Tables 3 and 4. From these results, we make the following observations.
From Table 3, we find that our model CDNST outperforms all the baselines, except that on the data set “Music_tags → Movie_category”, MC is slightly better than CDNST. NSM ranks second. We investigate why MC performs best on “Music_tags → Movie_category”. Fig. 4 shows the statistics of Movie_tags, Movie_category and Movie_dir: the x-axis represents the number of transition statuses and the y-axis represents the percentage of users whose number of transition statuses is larger than the given threshold. From these results, we indeed find that the average number of transition statuses on Movie_category is much smaller than on Movie_tags and Movie_dir, which may favor the MC method.
It is also observed that simply incorporating the information from the auxiliary source domain, as in OF_U, MC_U and NSM_U, does not lead to performance improvement, which indicates that the previous models (i.e., OF, MC and NSM) cannot make full use of the auxiliary information. On the other hand, our model benefits from the source domain and achieves significant improvement compared with NSM.
Overall, NSM performs better than MC, and MC outperforms OF.
From Table 4, we find that incorporating auxiliary information from the source domain does not improve the performance of all algorithms on the second group of transfer learning problems, and the performance of the baselines even drops dramatically. After analyzing the data, we conjecture that the sequential property of the auxiliary domain data affects the performance, which is detailed in Section 4.3.2.

Figure 4. Long Tails of movie_category, movie_tags and movie_dir.

4.3.2. Analysis

Figure 5. Examples of Some Interesting Phenomenon of Users’ Sequential Behaviors on Both Domains.
Columns (source → target), in order: Movie_category → Music_tags; Movie_tags → Music_tags; Movie_dir → Music_tags; Book_tags → Music_tags; Movie_category → Book_tags; Movie_tags → Book_tags; Movie_dir → Book_tags

MRR
OF_U
MC_U
NSM_U
CDNST 0.7054 0.7122 0.6946 0.7067 0.4183 0.4942 0.4128

nDCG@15
OF_U
MC_U
NSM_U
CDNST 0.7749 0.7826 0.7652 0.7857 0.5163 0.5089 0.5281

p@3
OF_U
MC_U
NSM_U
CDNST 0.6926 0.7073 0.6824 0.7068 0.4082 0.4097 0.3928

Table 5. Recommendation Performance on The Other 7 Coupled Data Sets, Advanced Two Months in the Source Domain (The results of OF, MC and NSM are the same as the ones in Table 4 and are omitted in this table.)

Overall, the results in Tables 3 and 4 imply that the domains of Music and Book can help learn the model on the Movie domain, and Music can help the learning of Book. To show intuitively how the temporal property of auxiliary domain data may affect the performance of all algorithms, we carefully investigate the characteristics of the data set “Music_tags → Movie_dir” and find some interesting phenomena. Fig. 5 lists some examples of the sequential behaviors of two users on both domains. For User 1, 1) he/she first listened to a song by the Chemical Brothers (https://en.wikipedia.org/wiki/The_Chemical_Brothers; Electro and Techno are members of the Chemical Brothers) on 2012/03/21 in the source domain, and later watched a movie about the Chemical Brothers, “The Chemical Brothers: Don’t Think (2012)”, on 2012/04/04 in the target domain; 2) he/she first listened to the theme of the film “An Inaccurate Memoir”, composed by Pong Nan, on 2012/04/04, and then watched the movie “An Inaccurate Memoir” on 2012/05/04; 3) he/she listened to the music of Leslie Cheung on 2012/04/30, and later watched the movie “Farewell My Concubine” (https://en.wikipedia.org/wiki/Farewell_My_Concubine_(film)), in which Leslie Cheung plays, on 2012/05/30. For User 2, 1) he/she first listened to the theme of Rooftop Prince on 2012/03/21 in the source domain, and then watched Rooftop Prince, directed by Shin Yoon-sub, on 2012/05/26 in the target domain; 2) he/she listened to music sung by TVXQ on 2012/09/10 in the source domain, and then watched the movie “I AM.-SM Town Live World Tour in Madison Square Garden” (https://en.wikipedia.org/wiki/I_AM.), featuring TVXQ, in the target domain.

These examples may imply that, given the source domain data Music_tags, we can transfer information to give better recommendations on the target domain Movie_dir. In reverse, however, if we use Movie_dir as the source domain, it may not provide useful information for recommendation on Music_tags, since the related behaviors in Movie_dir occur after the related ones in Music_tags. Even worse, Movie_dir may become noise data, which degrades the performance. To further validate our analysis that the temporal property of source and target domain data affects the performance of the proposed model, we conduct simulation experiments on the second group of data sets. Specifically, we intentionally modify the occurring time of the behaviors in the source domain, i.e., we move the occurring time earlier by a fixed interval (set as two months in our experiments), and conduct the experiments again on the second group of data sets. Table 5 records all the results, which show that the recommendation performance of all the algorithms becomes better, and our model CDNST again achieves the best results.
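The simulation amounts to shifting every source-domain timestamp earlier before the two sequences are interleaved. The sketch below assumes behaviors are `(timestamp, item)` pairs and approximates two months as 60 days; both the data layout and the 60-day figure are assumptions for illustration.

```python
from datetime import datetime, timedelta

def advance_source(events, days=60):
    """Shift every (timestamp, item) event earlier by `days` days."""
    shift = timedelta(days=days)
    return [(ts - shift, item) for ts, item in events]

def merge_by_time(source_events, target_events):
    """Interleave both domains chronologically, tagging each event's domain."""
    tagged = [(ts, "source", item) for ts, item in source_events]
    tagged += [(ts, "target", item) for ts, item in target_events]
    return sorted(tagged, key=lambda e: e[0])


movie = [(datetime(2012, 6, 1), "film_x")]    # source domain in the second group
music = [(datetime(2012, 5, 1), "song_x")]    # target domain
before = [d for _, d, _ in merge_by_time(movie, music)]
after = [d for _, d, _ in merge_by_time(advance_source(movie), music)]
```

In this toy case the movie behavior originally follows the related music behavior, so it cannot inform the music recommendation; after the shift it precedes the music behavior and becomes usable as source-domain evidence.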

Table 6. The Relatedness on 7 Pairs of Data Sets
First group (source → target): Music_tags → Movie_category; Music_tags → Movie_tags; Music_tags → Movie_dir; Music_tags → Book_tags; Book_tags → Movie_category; Book_tags → Movie_tags; Book_tags → Movie_dir
Second group (source → target): Movie_category → Music_tags; Movie_tags → Music_tags; Movie_dir → Music_tags; Book_tags → Music_tags; Movie_category → Book_tags; Movie_tags → Book_tags; Movie_dir → Book_tags

Obviously, the transfer learning problem for sequential data differs from previous work, since it is directed. As far as we know, almost all previous transfer learning algorithms are undirected, i.e., they are assumed to work equally well in both transfer directions between two domains. This may lead to failure when the problem does not satisfy the temporal property. Is it possible to propose a relatedness measure to judge whether a transfer learning problem is suitable for our model CDNST? To this end, we propose an effective measure that incorporates external web data and is computed for each transfer direction. We expect CDNST to be effective when the measure favors the given transfer direction, and vice versa.

The context information consists of keywords, so before formally defining the measure, we first introduce how to compute the similarity of two keywords and the relatedness of behaviours from different domains for each user. For each keyword of both domains, we use it as a query and crawl the top 100 results from a search engine (e.g., Baidu or Google) to form a corpus. Then, we convert each keyword to a vector using the word2vec technique (http://spark.apache.org/docs/1.3.1/mllib-feature-extraction.html#word2vec), with the number of dimensions set to 50 in the experiments, and compute the similarity of two keywords from their corresponding vectors. As shown in Fig. 6, we sort the keywords of each user in the two domains in chronological order, and then the relatedness of behaviours for each user in a given transfer direction is defined as,

(14)

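where, for a keyword with a given timestamp, the set in the equation contains the keywords in the corresponding domain whose timestamps lie within the specified interval, and the remaining term is the total number of keyword pairs for the user. Finally, we are ready to define the relatedness measure,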

(15)

where the equation also involves the number of users. We compute the relatedness measure for all 14 problems and for the 7 newly constructed problems from Section 4.3.2, and all results are recorded in Table 6. From these results, we can find that the values of the relatedness measure on the first group (in the second row) are all larger than those on the second group (in the fourth row), which is consistent with our analysis. On the newly constructed problems, the values of the relatedness measure also increase significantly. Therefore, we can adopt this relatedness measure to judge whether a transfer learning problem is suitable for our model.
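To make the idea of a directed relatedness measure concrete, the following sketch shows one possible form of it. It is not a reproduction of Equations (14) and (15): the use of cosine similarity, the rule "count every pair in which the source keyword precedes the target keyword", the averaging, the integer timestamps, and all function and variable names are assumptions chosen for illustration. The toy two-dimensional vectors stand in for 50-dimensional word2vec embeddings.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (assumed similarity)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def user_relatedness(source, target, vectors):
    """Mean similarity over keyword pairs where the source behavior
    happens strictly before the target behavior.

    `source` and `target` are lists of (keyword, timestamp) pairs.
    """
    sims = [
        cosine(vectors[ws], vectors[wt])
        for ws, ts in source
        for wt, tt in target
        if ts < tt
    ]
    return sum(sims) / len(sims) if sims else 0.0


def domain_relatedness(users, vectors, reverse=False):
    """Average the per-user relatedness over all users.

    `users` maps user id -> (source behaviors, target behaviors).
    With reverse=True the two domains swap roles.
    """
    scores = []
    for source, target in users.values():
        if reverse:
            source, target = target, source
        scores.append(user_relatedness(source, target, vectors))
    return sum(scores) / len(scores) if scores else 0.0


# Toy data: one user listens to a song (time 1) and then watches a film (time 2).
vectors = {"song": (1.0, 0.0), "film": (1.0, 0.0)}
users = {"u1": ([("song", 1)], [("film", 2)])}
forward = domain_relatedness(users, vectors)
backward = domain_relatedness(users, vectors, reverse=True)
print(forward, backward)
```

In this toy case the forward direction scores higher than the reverse one, which mirrors the asymmetry between the first and second groups discussed above.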

Figure 6. The chronological order of actions from two domains for one user.

4.3.3. Transfer for Personalized Recommendation

Furthermore, we can adopt the proposed relatedness measure to make effective transfer at the user level, which is very useful for personalized recommendation. Specifically, according to Equation (14), we choose the users who are suitable for each of the two transfer directions, and then run CDNST on the corresponding users for the seven pairs of data sets. The results are shown in the last row of each metric in Tables 3 and 4 (the personalized variant of our model). From these results, we can find that this personalized variant obtains additional improvement over transfer at the domain level, which again indicates the effectiveness of the proposed relatedness measure.
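The user-level selection step might look like the sketch below. The selection rule used here (assign a user to the direction with the strictly larger per-user relatedness, and leave ties unassigned) is an assumption, as are the function name `split_users_by_direction` and the toy scores; the per-user scores are taken as given inputs.

```python
def split_users_by_direction(forward, backward):
    """Group users by the transfer direction their behavior favors.

    `forward` and `backward` map user id -> per-user relatedness in the
    source-to-target and target-to-source directions, respectively.
    Users with equal scores are assigned to neither group (assumed rule).
    """
    to_target = [u for u in forward if forward[u] > backward.get(u, 0.0)]
    to_source = [u for u in forward if forward[u] < backward.get(u, 0.0)]
    return to_target, to_source


# Hypothetical per-user scores for three users.
forward_scores = {"u1": 0.8, "u2": 0.1, "u3": 0.4}
backward_scores = {"u1": 0.2, "u2": 0.6, "u3": 0.4}
to_target, to_source = split_users_by_direction(forward_scores, backward_scores)
print(to_target, to_source)
```

The model would then be trained separately on each group, using the transfer direction that group favors.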

5. Conclusions and Remarks

In this paper, we propose a new cross-domain recommendation algorithm, in which the novelty-seeking trait of users is shared across source and target domains for effective knowledge transfer. To validate the effectiveness of the proposed model, we first crawl data sets from three domains of the well-known Chinese social-media platform Douban, and construct 14 transfer recommendation problems. The experiments show that our model is more accurate when the source and target domain data satisfy the sequential property, i.e., the related behaviors in the source domain occur before the related ones in the target domain. This may be a new cross-domain recommendation problem, which we call sequential recommendation. In the future, we aim to propose new transfer recommendation models to address this problem.

References

  • Acker and McReynolds (1967) Mary Acker and Paul McReynolds. 1967. The “need for novelty”: A comparison of six instruments. The Psychological Record (1967).
  • Baumgartner and Steenkamp (1996) Hans Baumgartner and Jan-Benedict EM Steenkamp. 1996. Exploratory consumer buying behavior: Conceptualization and measurement. International Journal of Research in Marketing 13, 2 (1996), 121–137.
  • Bobadilla et al. (2013) Jesús Bobadilla, Fernando Ortega, Antonio Hernando, and Abraham Gutiérrez. 2013. Recommender systems survey. Knowledge-Based Systems (2013).
  • Cloninger et al. (1994) C. Robert Cloninger, Thomas R. Przybeck, and Dragan M. Svrakic. 1994. The Temperament and Character Inventory (TCI): A guide to its development and use. St. Louis, MO: Center for Psychobiology of Personality, Washington University.
  • Ebstein et al. (1996) Richard P Ebstein, Olga Novick, Roberto Umansky, Beatrice Priel, Yamima Osher, Darren Blaine, Estelle R Bennett, Lubov Nemanov, Miri Katz, and Robert H Belmaker. 1996. Dopamine D4 receptor (D4DR) exon III polymorphism associated with the human personality trait of novelty seeking. Nature genetics 12, 1 (1996), 78–80.
  • Fernández-Tobías et al. (2012) Ignacio Fernández-Tobías, Iván Cantador, Marius Kaminskas, and Francesco Ricci. 2012. Cross-domain recommender systems: A survey of the state of the art. In Spanish Conference on Information Retrieval. 24.
  • Fiske and Maddi (1961) Donald W Fiske and Salvatore R Maddi. 1961. Functions of varied experience. Dorsey.
  • Furr and Funder (2004) R Michael Furr and David C Funder. 2004. Situational similarity and behavioral consistency: Subjective, objective, variable-centered, and person-centered approaches. Journal of Research in Personality 38, 5 (2004), 421–447.
  • Hanani et al. (2001) Uri Hanani, Bracha Shapira, and Peretz Shoval. 2001. Information filtering: Overview of issues, research and systems. User modeling and user-adapted interaction 11, 3 (2001), 203–259.
  • Järvelin and Kekäläinen (2000) Kalervo Järvelin and Jaana Kekäläinen. 2000. IR evaluation methods for retrieving highly relevant documents. In SIGIR. ACM, 41–48.
  • Jiang et al. (2016) Meng Jiang, Peng Cui, Nicholas Jing Yuan, Xing Xie, and Shiqiang Yang. 2016. Little Is Much: Bridging Cross-Platform Behaviors through Overlapped Crowds. In AAAI. 13–19.
  • Li et al. (2009) Bin Li, Qiang Yang, and Xiangyang Xue. 2009. Can Movies and Books Collaborate? Cross-Domain Collaborative Filtering for Sparsity Reduction. In IJCAI. 2052–2057.
  • Lian et al. (2017) Jianxun Lian, Fuzheng Zhang, Xing Xie, and Guangzhong Sun. 2017. CCCFNet: A Content-Boosted Collaborative Filtering Neural Network for Cross Domain Recommender Systems. In WWW. 817–818.
  • Liu (2009) Tie-Yan Liu. 2009. Learning to rank for information retrieval. Foundations and Trends in Information Retrieval 3, 3 (2009), 225–331.
  • Markov (1971) Andrey Markov. 1971. Extension of the limit theorems of probability theory to a sum of variables connected in a chain. Dynamic Probabilistic Systems (1971).
  • McClelland (1955) David C McClelland. 1955. Studies in motivation. Appleton-Century-Crofts.
  • Raju (1980) Puthankurissi S Raju. 1980. Optimum stimulation level: Its relationship to personality, demographics, and exploratory behavior. Journal of Consumer Research 7, 3 (1980), 272–282.
  • Rogers (2010) Everett M Rogers. 2010. Diffusion of innovations. Simon and Schuster.
  • Stern (1962) Hawkins Stern. 1962. The significance of impulse buying today. The Journal of Marketing (1962), 59–62.
  • Su and Khoshgoftaar (2009) Xiaoyuan Su and Taghi M. Khoshgoftaar. 2009. A Survey of Collaborative Filtering Techniques. Adv. in Artif. Intell. (2009).
  • Wang et al. (2017) Xuejian Wang, Lantao Yu, Kan Ren, Guanyu Tao, Weinan Zhang, Yong Yu, and Jun Wang. 2017. Dynamic attention deep model for article recommendation by learning human editors’ demonstration. In SIGKDD. ACM, 2051–2059.
  • Wei et al. (2017) Honghao Wei, Fuzheng Zhang, Nicholas Jing Yuan, Chuan Cao, Hao Fu, Xing Xie, Yong Rui, and Wei-Ying Ma. 2017. Beyond the Words: Predicting User Personality from Heterogeneous Information. In WSDM. ACM, 305–314.
  • Zhang et al. (2014) Fuzheng Zhang, Nicholas Jing Yuan, Defu Lian, and Xing Xie. 2014. Mining Novelty-seeking Trait Across Heterogeneous Domains. In WWW. 373–384.
  • Zhang et al. (2015) Fuzheng Zhang, Kai Zheng, Nicholas Jing Yuan, Xing Xie, Enhong Chen, and Xiaofang Zhou. 2015. A Novelty-Seeking Based Dining Recommender System. In WWW. 1362–1372.
  • Zuckerman (1979) Marvin Zuckerman. 1979. Sensation seeking. Wiley Online Library.