JSCN
Cross-domain recommendation can alleviate the data sparsity problem in recommender systems. To transfer knowledge from one domain to another, one can either utilize the neighborhood information or learn a direct mapping function. However, all existing methods ignore the high-order connectivity information in the cross-domain recommendation area and suffer from the domain-incompatibility problem. In this paper, we propose a Joint Spectral Convolutional Network (JSCN) for cross-domain recommendation. JSCN simultaneously operates multi-layer spectral convolutions on different graphs and jointly learns a domain-invariant user representation with a domain-adaptive user mapping module. As a result, the high-order comprehensive connectivity information can be extracted by the spectral convolutions, and the information can be transferred across domains with the domain-invariant user mapping. The domain-adaptive user mapping module helps incompatible domains transfer knowledge to each other. Extensive experiments on 24 Amazon rating datasets show the effectiveness of JSCN in cross-domain recommendation, with 9.2% improvement on recall and 36.4% improvement on MAP compared with state-of-the-art methods. Our code is available online [%s].
Recommending users with a set of preferred items is still an open problem [18, 39, 9, 24, 4, 5], especially when the dataset is very sparse. To remedy the data sparsity issue, broad-learning based models [36] and cross-domain recommender systems [14, 24] have been proposed, where information from other source domains can be transferred to the target domain. To transfer knowledge from one domain to another, one can use the overlapping users [14, 5, 24, 12] in two ways: (1) the neighborhood information of common users stores the structural information of different domains, with which we can do cross-domain recommendation [33, 5]; or (2) we can learn a mapping function [24, 14] to project latent vectors learned in one domain into another, and thus the knowledge can be transferred.
However, all existing methods ignore the high-order connectivity information [30]. High-order connectivity information consists of all the neighborhood information, the neighbors of all the neighbors, and so on, obtained by following the linkage information in the graph. The high-order connectivity information is illustrated in Figure 1, where in the middle part user A and user C are the overlapping users, the upper/green part is the target domain, and the lower/blue part is the source domain. For example, in the target domain (only the upper part), user D has a connection with item 4. Merely with the neighbor-based information [9, 14, 35], item 1 and item 2 should be ranked similarly for user D, since user D's neighbor (i.e. user C) has no direct connections with them. However, with the high-order connectivity information, we argue that user D should prefer item 2 more than item 1, as there is a path from item 2 to user D (item 2–user B–item 3–user C–item 4–user D), while item 1 is only connected with user A and apart from the others. Moreover, the preference ranking may be different if we take account of the source domain (considering both the upper and lower graphs). We can find two paths (item 1–user A–item 5–user C–item 4–user D, and item 1–user A–item 6–user C–item 4–user D) from item 1 to user D, compared with the single path from item 2 to user D. Hence user D may prefer item 1 more than item 2 if the high-order connectivity information across domains is included. However, the high-order connectivity problem is not yet well studied in cross-domain recommendation.
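The walk-counting intuition above can be checked mechanically: entry (i, j) of the k-th power of the adjacency matrix counts the walks of length k between nodes i and j, which is exactly the high-order connectivity that neighbor-only methods ignore. A minimal sketch of the target-domain part of the example (the node numbering and NumPy usage are our own, not the paper's):

```python
import numpy as np

# Toy target-domain graph: users A..D are nodes 0..3, items 1..4 are nodes 4..7.
edges = [(0, 4),            # user A - item 1
         (1, 5), (1, 6),    # user B - item 2, item 3
         (2, 6), (2, 7),    # user C - item 3, item 4
         (3, 7)]            # user D - item 4
A = np.zeros((8, 8), dtype=int)
for u, v in edges:
    A[u, v] = A[v, u] = 1   # the bipartite graph is undirected

A5 = np.linalg.matrix_power(A, 5)
# One length-5 walk connects item 2 (node 5) to user D (node 3):
# item 2 - user B - item 3 - user C - item 4 - user D.
assert A5[5, 3] == 1
# Item 1 (node 4) only touches user A, so no walk of any length reaches user D.
assert all(np.linalg.matrix_power(A, k)[4, 3] == 0 for k in range(1, 8))
```

Adding the source-domain edges (items 5 and 6) creates additional walks from item 1 to user D, which is how the cross-domain graph changes the ranking.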
To capture the connectivity information in a graph, one can transform the graph into the frequency domain by applying spectral graph theory [17, 3, 30]. In spectral theory [30, 2], the spectrum of a graph captures the comprehensive connectivity information of the graph via the graph Fourier transform, defined in terms of the eigenvectors of the graph Laplacian [30]. Based on this, we can design a spectral convolutional network [39, 1] whose convolutions are linear operators diagonalized by the Fourier basis. With the spectral convolutional network, nodes in a graph are represented as spectral vectors [17, 39]. When it comes to bipartite graphs, we can learn the spectral representations of users and items to capture the connectivity information. The spectral representation models the high-order nonlinear interactions among users and items with multi-layer spectral convolutions. Hence, recalling the problem discussed before in Figure 1, in the spectral domain item 1 will be closer to user D than item 2, as there exist more connections from item 1 to user D than from item 2.

However, different domains may be incompatible with each other, which is called the domain-incompatibility problem [29] in cross-domain recommendation. For instance, if the target domain is a Movie domain where users are connected with movie items, and the source domain is a Clothing domain where users are connected with clothing items, they will be incompatible with each other since user behavior varies a lot between them. The information from the source domain cannot be directly utilized in the target domain. Thus we need mapping methods [24, 15, 20] as a bridge for the information transfer.
In this paper, unlike previous direct mapping methods [24, 15, 33], we view the latent vector of a user in a specific domain as an interest projection from a domain-invariant representation. We show an illustration of mapping the domain-invariant user representation to domain-specific user latent vectors in Fig. 2. To learn transferable representations, we jointly learn the domain-invariant representation of users across different domains. The joint convolution can capture the high-order connectivity information across different domains and learn domain-invariant representations by keeping the spectral similarity of the overlapping users. Based on this, we design a Joint Spectral Convolutional Network (JSCN) to fuse the information from multiple domains. JSCN simultaneously operates multi-layer spectral convolutions on the graph from each domain. The extracted spectral features can then be shared across different graphs through the domain-invariant representations. Since JSCN jointly learns the spectral representations on different graphs, the high-order comprehensive connectivity information can be shared across domains, and because of the domain-invariant user representations, JSCN alleviates the domain-incompatibility problem. We summarize our main contributions as follows:
Transferable spectral representation: To the best of our knowledge, this is the first work to study how to transfer the spectral representations of bipartite graphs, which capture the high-order nonlinear user-item interactions both within a domain and across domains.
Joint spectral convolution on graphs: We design a joint spectral convolutional network for learning the representations of multiple graphs concurrently. The high-order comprehensive connectivity information can be shared across different graphs.
Domain adaptive module: To deal with the domain-incompatibility problem, we apply a novel domain adaptive module to jointly learn the domain-invariant spectral representations of users, with which we can implement the joint convolution on graphs and share information across different domains.
The rest of the paper is organized as follows. In Sec. II, we review previous works related to this paper. In Sec. III, we introduce the definitions of the notations and concepts, as well as the problem. In Sec. IV, we present the proposed model and its formulation. Finally, in Sec. V we discuss the experiments before we draw a conclusion in Sec. VI.
In this section we give a brief review of two closely related areas: (1) deep learning based recommender systems; and (2) cross-domain recommendation.
Since [28] introduced deep learning into recommender systems (RS), [41, 9, 10] have proposed deep neural network based RS to learn from either explicit or implicit data. To counter the sparsity problem, some scholars propose to utilize deep learning techniques to build hybrid recommender systems. [32] and [34] introduce Convolutional Neural Networks (CNN) and Deep Belief Networks (DBN) to assist representation learning for music data. These approaches pre-train embeddings of users and items with matrix factorization and utilize deep models to fine-tune the learned item features based on item content. In [4], a multi-view deep model is built to utilize item information from more than one domain. [16] integrates a CNN with PMF to analyze documents associated with items and predict users' future explicit ratings. [40] leverages two parallel neural networks to jointly model latent factors of users and items. To incorporate visual signals into RS, [8, 22, 25, 7] propose CNN-based models that make use of visual features extracted from product images with deep networks to enhance the performance of RS.
[35, 38] investigate how to leverage multi-view information to improve the quality of recommender systems. Due to limited space, readers can refer to [37] for more works on deep recommender systems.

Broad Learning [36] is a way to transfer information from different domains, which focuses on fusing and mining multiple information sources of large volumes and diverse varieties. To solve the cold-start problem in item recommendation, cross-domain recommendation has been proposed, either learning shallow embeddings with factorization machines [14, 31, 23, 33] or learning deep embeddings with neural networks [24, 26, 13, 12, 21]. Among the shallow embedding methods, CMF [31] jointly factorizes the user-item interaction matrices from different domains. In order to model the domain information explicitly, CDTF [14] and CDCF [23] are designed, where the former factorizes the user-item-domain triadic relation and the latter models the source-domain information as context information of users. Among the deep embedding methods, CSN [26] was first introduced in the multi-task learning scenario, where a convolutional network with cross-stitch units can share parameters across different domains. This idea was later extended by CoNet [12] with cross connections across different networks, where shared mapping matrices are introduced to transfer the knowledge. Additionally, EMCDR [24] transfers knowledge across source and target domains with a multi-layer perceptron. Our proposed JSCN model also jointly learns a deep embedding for both in-domain and cross-domain information.
In this section, the preliminaries and definitions are presented. We first formally define the user-item bipartite graph and the corresponding connectivity matrices. Then we define the bipartite graph domain, as well as the source domain and target domain, before we formulate our problem. The important notations used in this paper are summarized in Table I.
(Bipartite Graph). A bipartite user-item graph with vertices and edges for recommendation is defined as $\mathcal{G} = (\mathcal{U}, \mathcal{I}, \mathcal{E})$, where $\mathcal{U}$ and $\mathcal{I}$ are two disjoint vertex sets, i.e. the user set and the item set, respectively. Every edge in $\mathcal{E}$ connects a user and an item, denoting the interaction of the user with the item, e.g. an item is viewed/purchased/liked by a user.
A bipartite graph describes the interactions among users and items; thus we can define an implicit feedback matrix $R$ [27, 9] for a given bipartite graph as
$$R_{ij} = \begin{cases} 1, & \text{if } (u_i, v_j) \in \mathcal{E}, \\ 0, & \text{otherwise}, \end{cases} \qquad (1)$$
where $u_i$ and $v_j$ are the $i$-th user in the user set $\mathcal{U}$ and the $j$-th item in the item set $\mathcal{I}$, respectively.
Given the implicit feedback matrix $R$ of a bipartite graph, the corresponding adjacency matrix $A$ can be defined as
$$A = \begin{pmatrix} 0 & R \\ R^{\top} & 0 \end{pmatrix}, \qquad (2)$$
where the adjacency matrix $A$ is an $n \times n$ matrix and $n = |\mathcal{U}| + |\mathcal{I}|$ is the number of nodes in the bipartite graph.
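A minimal sketch of how the feedback matrix, the adjacency matrix, and the graph Laplacian of this section fit together; the toy feedback matrix and the random-walk normalization $L = I - D^{-1}A$ are our reading of the definitions here, not the paper's verbatim code:

```python
import numpy as np

# R_ij = 1 iff user i interacted with item j (Eq. (1)); A stacks R into the
# adjacency matrix of the bipartite graph (Eq. (2)).
R = np.array([[1, 0, 1],
              [0, 1, 1]], dtype=float)   # 2 users, 3 items
n_u, n_i = R.shape
n = n_u + n_i
A = np.zeros((n, n))
A[:n_u, n_u:] = R
A[n_u:, :n_u] = R.T
assert (A == A.T).all()                  # the bipartite graph is undirected

# Laplacian with D the diagonal matrix of row sums, as in Eq. (3).
D = np.diag(A.sum(axis=1))
L = np.eye(n) - np.linalg.inv(D) @ A
assert np.allclose(L.sum(axis=1), 0.0)   # rows of I - D^{-1}A sum to zero
```

The row-sum property is a quick sanity check that the degree normalization was applied on the correct axis.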
With the adjacency matrix of a bipartite graph, the Laplacian matrix of the bipartite graph can be calculated as

$$L = I - D^{-1} A, \qquad (3)$$

where $I$ is the identity matrix and $D$ is a diagonal matrix in which each entry on the diagonal is the sum of all the elements in the corresponding row of the adjacency matrix, i.e. $D_{ii} = \sum_{j} A_{ij}$.

In this paper, we focus on cross-domain recommendation. Thus we combine the information from a set of bipartite graphs and then recommend items to users. In each domain, we have a categorical mapping function which projects the items into a specific category, e.g. Movies, describing the type of the items in the domain. We assume all the items belong to one domain, and thus we have the following definition of a graph domain.
(Bipartite graph domain). A bipartite graph domain is defined by a categorical mapping function of items. Two bipartite graphs are in different domains if and only if their categorical mapping functions differ.
The source-domain bipartite graph is the interaction bipartite graph of users and items that provides auxiliary information for the target-domain bipartite graph, in which we recommend items to users. We integrate the information across the source domains and the target domain and make recommendations in the target domain.
(Problem Definition). Given a set of source-domain bipartite graphs and a target-domain graph, we aim at recommending each user in the target domain a ranked list of items with which that user has no existing interaction in the target graph. The source domains share a set of common users with each other, and the target domain also shares a set of common users with each of the source domains.
Notation  Description

bipartite graph, source graph, target graph
set of users
set of items
user, item
set of common users
common user
eigenvectors, diagonal matrix of eigenvalues
input dimension of feature vectors
spectral convolution parameter in each layer
user, item latent vectors
source, target domain-invariant user representations
dimension of spectral latent vectors
dimension of domain-invariant representation
domain-related user mapping function
categorical mapping function of items
In this section, we first explain the spectral convolutional network for collaborative filtering [39] before we introduce the domain-invariant user representation. After that, we present our proposed Joint Spectral Convolutional Network (JSCN) for cross-domain recommendation. Finally, we formulate the adaptive user mapping mechanism. The overall framework of our proposed model is given in Fig. 3. We use triangles and squares to denote users and items, respectively. Different colors for users and items denote different domains, and the same numbers on triangles represent common users in different domains.
Given a bipartite graph, we would like to learn an embedding for each node, i.e. user or item, as illustrated in the first step in Fig. 3. At first, users and items are represented as $d$-dimensional vectors, and all the user and item latent vectors can be grouped together and represented as matrices $X_u$ and $X_v$, respectively. With the graph structure information, the spectral convolutional operator is defined [30, 39, 3] based on the eigenvectors $U$ and the corresponding eigenvalues $\Lambda$ of the graph Laplacian as
$$\begin{bmatrix} X_u' \\ X_v' \end{bmatrix} = \sigma\left( U \left( I + \Lambda \right) U^{\top} \begin{bmatrix} X_u \\ X_v \end{bmatrix} \Theta \right), \qquad (4)$$
In Eq. (4), the term $U (I + \Lambda) U^{\top}$ preserves the structure information of the bipartite graph, $\Theta$ is the convolutional filter used to extract the spectral features, and $\sigma$ denotes the logistic sigmoid function. This is the SP layer in the second step in Fig. 3.

With multiple spectral convolutional operators stacked on the original feature vectors, we construct a $K$-layer spectral convolutional network on the bipartite graph as shown in Eq. (5), with which we can learn the spectral representations of the nodes in the graph,
$$X^{(l+1)} = \sigma\left( U \left( I + \Lambda \right) U^{\top} X^{(l)} \Theta^{(l)} \right), \qquad (5)$$
where $X^{(0)} = [X_u; X_v]$ and $\Theta^{(l)}$ is the filter of the $l$-th layer ($l = 0, \dots, K-1$). After the $K$-layer spectral convolutional operations, we represent the users and items as latent vectors by either concatenating the extracted spectral feature vectors of each layer or using the spectral feature vectors of the last layer. This corresponds to the third step in Fig. 3.
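The multi-layer spectral network of Eqs. (4)-(5) can be sketched as follows. The random symmetric matrix standing in for the Laplacian, the dimensions, and the layer form sigmoid(U(I+Λ)UᵀXΘ) are our assumptions based on the description above:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def spectral_layer(U, lam, X, Theta):
    # One layer of Eq. (4)/(5): filter in the spectral domain, then squash.
    return sigmoid(U @ np.diag(1.0 + lam) @ U.T @ X @ Theta)

rng = np.random.default_rng(0)
n, d, K = 6, 4, 3                         # nodes, feature dim, layers
M = rng.standard_normal((n, n))
M = (M + M.T) / 2                         # stand-in symmetric "Laplacian"
lam, U = np.linalg.eigh(M)                # Lambda (eigenvalues) and U (eigenvectors)
X = rng.standard_normal((n, d))           # X^(0): initial node features
outs = []
for _ in range(K):
    Theta = rng.standard_normal((d, d)) * 0.1   # learnable filter (random here)
    X = spectral_layer(U, lam, X, Theta)
    outs.append(X)
Z = np.concatenate(outs, axis=1)          # concatenate per-layer spectral features
assert Z.shape == (n, K * d)
```

In training, the `Theta` matrices would be learned by gradient descent rather than drawn at random.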
In terms of the loss function, we apply the BPR loss as suggested in [27, 39] to compute the in-domain loss, which models the in-domain user-item interactions,

$$\mathcal{L}_{in} = \sum_{(i, j, j') \in \mathcal{D}} -\ln \sigma\left( \mathbf{u}_i^{\top} \mathbf{v}_j - \mathbf{u}_i^{\top} \mathbf{v}_{j'} \right), \qquad (6)$$
where $(i, j, j') \in \mathcal{D}$ are triples sampled from the user-item interaction records, in which $i$ denotes the index of a user, $j$ denotes the index of an item with which the user has an interaction, and $j'$ denotes the index of an item with which the user has no interaction. We apply the dot product of the user vector and the item vector as the predicted preference. Unlike point-wise learning [19], the BPR loss maximizes the difference between $\mathbf{u}_i^{\top} \mathbf{v}_j$ and $\mathbf{u}_i^{\top} \mathbf{v}_{j'}$ under the assumption that users prefer observed items over unobserved items. We use $\mathbf{u}_i$ to denote the user latent vector of user $i$, and $\mathbf{v}_j$ and $\mathbf{v}_{j'}$ to denote the item latent vectors of items $j$ and $j'$, respectively.
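A compact sketch of the BPR in-domain loss just described (the embeddings and triples below are illustrative placeholders):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bpr_loss(Uemb, Vemb, triples):
    # triples (i, j, jp): user i interacted with item j but not with jp.
    loss = 0.0
    for i, j, jp in triples:
        diff = Uemb[i] @ Vemb[j] - Uemb[i] @ Vemb[jp]
        loss += -np.log(sigmoid(diff))   # Eq. (6), one term per triple
    return loss / len(triples)

rng = np.random.default_rng(1)
Uemb = rng.standard_normal((3, 8))        # user latent vectors
Vemb = rng.standard_normal((5, 8))        # item latent vectors
loss = bpr_loss(Uemb, Vemb, [(0, 1, 2), (1, 0, 4)])
assert loss > 0.0                         # -log(sigmoid(x)) > 0 for finite x
```

Minimizing this loss pushes the score of the observed item above that of the unobserved one for each sampled triple.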
With the in-domain loss $\mathcal{L}_{in}$, we can learn both the user and item latent vectors from the multi-layer spectral convolutional network. Recalling the problem definition in Def. 3, we have a set of source-domain bipartite graphs and one target-domain bipartite graph, and every domain has a set of overlapping users with each other.
A user requires different aspects w.r.t. different domains, which leads to different user latent vectors; however, we prefer an invariant user representation across different domains. Hence we define the domain-invariant user representation $\tilde{\mathbf{u}}$, from which we generate the domain-specific latent vector with the corresponding domain-related user mapping function $f$, i.e. $\mathbf{u} = f(\tilde{\mathbf{u}})$.
For example, a source domain shares a set of common users with the target domain. With the in-domain loss, we learn the domain-specific user latent vectors individually for the source domain and the target domain. Each of them is generated from the domain-invariant user representation by the corresponding domain-related user mapping function. With the inverse of the user mapping function, denoted as $f^{-1}$, we can obtain the domain-invariant user representation from the domain-specific user latent vector as $\tilde{\mathbf{u}} = f^{-1}(\mathbf{u})$, which is the fourth step in Fig. 3.
Since we have the domain-invariant user representations, each common user should have the same representation in every domain in which the user appears. To make this constraint trainable, we construct the cross-domain loss as the distance between the domain-invariant user representations:
$$\mathcal{L}_{cross} = \sum_{i, j} \sum_{a \in \mathcal{A}^{ij}} \left\| \tilde{\mathbf{u}}_a^{(i)} - \tilde{\mathbf{u}}_a^{(j)} \right\|_2^2, \qquad (7)$$
where $\mathcal{A}^{ij}$ denotes the set of common users between domains $i$ and $j$ as defined in Def. 3, and $\tilde{\mathbf{u}}_a^{(i)}$ and $\tilde{\mathbf{u}}_a^{(j)}$ denote the domain-invariant representations of the anchor user $a$ w.r.t. domains $i$ and $j$, respectively.
The cross-domain loss combines the information across different domains through the domain-invariant user representations of the common users. Even if a common user exists only in part of the domains, the information can still be shared across all domains, as an effect of collaborative filtering. However, we cannot directly learn the domain-invariant representation; instead, we learn the user and item latent vectors with the in-domain loss and then apply the inverse of the user mapping function to obtain the domain-invariant user representations. The cross-domain loss can thus be written as:
$$\mathcal{L}_{cross} = \sum_{i, j} \sum_{a \in \mathcal{A}^{ij}} \left\| f_i^{-1}\!\left( \mathbf{u}_a^{(i)} \right) - f_j^{-1}\!\left( \mathbf{u}_a^{(j)} \right) \right\|_2^2, \qquad (8)$$
where $\mathbf{u}_a^{(i)}$ and $\mathbf{u}_a^{(j)}$ denote the domain-specific latent vectors of the common user $a$ in domains $i$ and $j$, respectively. We present this in the fifth step in Fig. 3. Hence the joint spectral convolution model has the loss function:
$$\mathcal{L} = \sum_{i} \mathcal{L}_{in}^{s_i} + \mathcal{L}_{in}^{t} + \mathcal{L}_{cross} + \mathcal{L}_{reg}, \qquad (9)$$
where $\mathcal{L}_{in}^{s_i}$ is the in-domain loss of the source domain $s_i$, $\mathcal{L}_{in}^{t}$ is the in-domain loss of the target domain, and $\mathcal{L}_{reg}$ is the regularization term defined as:
$$\mathcal{L}_{reg} = \lambda \sum_{\Theta} \left\| \Theta \right\|_2^2, \qquad (10)$$
where $\lambda$ is the regularization hyper-parameter and the sum runs over the learnable parameters.
As described in Sec. IV-B, we can use the inverse of the domain-related user mapping function to generate the domain-invariant user representation from the spectral user latent vector. We define this inverse function as the adaptive user mapping function, which can be either a linear mapping function or a neural network based nonlinear function [9]. For simplicity, here we only present the linear mapping function, which leads to
$$\tilde{\mathbf{u}} = f_g^{-1}(\mathbf{u}) = W^{(g)} \mathbf{u}, \qquad (11)$$
where $W^{(g)}$ is the domain-adaptive matrix w.r.t. graph domain $g$. This mapping function is a kind of structural regularization [42] across different domains. It turns out the mapping can transfer the spectral information during the joint learning process.
With this adaptive user mapping matrix, we can rewrite the cross-domain loss as:
$$\mathcal{L}_{cross} = \sum_{i, j} \sum_{a \in \mathcal{A}^{ij}} \left\| W^{(i)} \mathbf{u}_a^{(i)} - W^{(j)} \mathbf{u}_a^{(j)} \right\|_2^2, \qquad (12)$$
where $W^{(i)}$ and $W^{(j)}$ are the two adaptive user mapping matrices corresponding to domains $i$ and $j$, respectively.
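The cross-domain loss with linear adaptive mappings can be sketched as follows; the dimensions, random matrices, and index pairing are our illustrative assumptions:

```python
import numpy as np

def cross_domain_loss(U_i, U_j, W_i, W_j, common):
    # common: (a, b) index pairs of the same anchor user in the two domains.
    # Each term is || W_i u_a - W_j u_b ||^2, as in the squared-distance loss.
    return sum(float(np.sum((W_i @ U_i[a] - W_j @ U_j[b]) ** 2))
               for a, b in common) / len(common)

rng = np.random.default_rng(2)
d, m = 8, 6                               # spectral dim, invariant dim
U_i = rng.standard_normal((4, d))         # user latent vectors, domain i
U_j = rng.standard_normal((5, d))         # user latent vectors, domain j
W_i = rng.standard_normal((m, d))         # adaptive user mapping matrices
W_j = rng.standard_normal((m, d))
loss = cross_domain_loss(U_i, U_j, W_i, W_j, [(0, 0), (2, 3)])
assert loss >= 0.0
# If the two domains share vectors and mappings, the loss vanishes:
assert cross_domain_loss(U_i, U_i, W_i, W_i, [(1, 1)]) == 0.0
```

Training drives `W_i` and `W_j` so that the mapped representations of each common user coincide, which is the domain-invariance constraint.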
We follow the optimization approach in [11, 39] to learn the spectral latent vectors and the domain-invariant user mapping with RMSprop. RMSprop is an adaptive variant of gradient descent that controls the step size with respect to the magnitude of the gradient: it scales the update of each weight by a running average of its gradient norm.
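A single RMSprop update, as described above, can be sketched as (the hyper-parameter values and the quadratic toy objective are illustrative):

```python
import numpy as np

def rmsprop_step(w, grad, cache, lr=1e-3, decay=0.9, eps=1e-8):
    # Running average of squared gradients scales each weight's step size.
    cache = decay * cache + (1 - decay) * grad ** 2
    w = w - lr * grad / (np.sqrt(cache) + eps)
    return w, cache

w = np.array([1.0, -2.0])
cache = np.zeros_like(w)
for _ in range(100):
    grad = 2 * w                          # gradient of the toy objective ||w||^2
    w, cache = rmsprop_step(w, grad, cache, lr=0.05)
assert np.linalg.norm(w) < 1.0            # moving toward the minimum at 0
```

The per-weight scaling is what lets RMSprop use one learning rate across parameters whose gradients differ in magnitude.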
For prediction, we focus on improving the performance on the target domain. We use the spectral representations of users and items in the target domain to make recommendations. For a specific user, we predict the user's preference over an item as the dot product of their latent vectors, and then sort the predicted preferences to form the ranking list for recommendation.
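The prediction step reduces to a dot-product scoring followed by a top-K sort; in practice one also masks items the user has already interacted with (the masking and the toy vectors are our assumptions):

```python
import numpy as np

def recommend(u, V, seen, k=2):
    scores = V @ u                        # dot-product preference per item
    scores[list(seen)] = -np.inf          # never re-recommend seen items
    return np.argsort(-scores)[:k]        # indices of the top-k items

u = np.array([1.0, 0.0])                  # user latent vector
V = np.array([[0.9, 0.1],
              [0.2, 0.8],
              [0.7, 0.3],
              [0.5, 0.5]])                # item latent vectors
top = recommend(u, V, seen={0}, k=2)
assert list(top) == [2, 3]                # item 0 is masked; 2 then 3 score highest
```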
In this section, we first introduce the dataset. After that, we discuss the baselines compared in this paper. Then we give the experimental settings, such as the evaluation metrics. Finally, we present the experiments in detail. Through the experiments, we answer the following research questions:
RQ1: Does the source-domain information help to improve the recommendation performance in the target domain?
RQ2: Are spectral features better at improving the cross-domain recommendation performance?
RQ3: Can the adaptive user mapping help to transfer information across different domains?
In this paper, we use the Amazon rating dataset [7], where we find the interactions of users and items. The rating data, where a user rates an item with a score from 1 to 5, spans May 1996 to July 2014. The dataset consists of 24 different domains; we present part of the statistics in Table II. The original dataset is rating data; we follow the convention in [39, 9] to transform the data into implicit interactions.
Domain Name  # User  # Item  # Rating 

Movies and TV  k  k  k 
Clothing, Shoes and Jewelry  k  k  k 
Apps for Android  k  k  k 
Amazon Instant Video *  k  k  k 
Each domain shares a set of common users with other domains. In the experiment, we use the Amazon Instant Video dataset as the target domain and the other domains as the source domains.
To answer the research questions above, we compare our proposed model with several state-of-the-art methods. The major task is defined in Def. 3, which focuses on improving the recommendation performance in the target domain. We categorize the baseline methods into two groups: (1) single-domain based methods; to answer RQ1 we compare our model with non-cross-domain models, e.g., BPR [27], NCF [9], and SpectralCF [39]. (2) Cross-domain based methods; for RQ2, we investigate the capability of spectral features in transferring information across different domains, e.g., CMF [31], CDCF [23], CoNet [12] and our proposed model JSCN. For RQ3, we compare different versions of our proposed model to study the function of the adaptive user mapping. We introduce these methods as follows:
NCF [9]: Neural Collaborative Filtering applies a neural architecture in place of the inner product of latent factors, so it can model the nonlinear interactions of items and users.
SpectralCF [39]: Spectral Collaborative Filtering is the state-of-the-art model for learning spectral features of users and items, based on the BPR pairwise loss.
CMF [31]: Collective Matrix Factorization is a matrix factorization based cross-domain rating prediction model. In this paper, we change the ratings to 0/1 w.r.t. the implicit interactions of users and items.
CDCF [23]: The Cross-Domain Collaborative Filtering method models the user-item interactions as context features for a factorization machine. With arbitrary source domains, CDCF treats them as input features of users and learns the latent vectors for both users and items.
CoNet [12]: It is the state-of-the-art deep learning method for learning a shared cross-domain mapping matrix such that information can be transferred. CoNet enables dual knowledge transfer across domains by introducing cross connections from one base network to another and vice versa. We implement the model with the code published by the authors (http://home.cse.ust.hk/~ghuac/conetcode_onlyhucikm1820181115.zip).
JSCN (simple version): Joint Spectral Convolutional Network is our proposed model for learning a cross-domain recommender system. It is based on a graph convolutional network that transfers the spectral features of users across different domains. This entry is a simplified version without the adaptive user mapping, which only enforces the spectral vectors of common users in different domains to be similar.
JSCN: This is the complete version of our proposed model, which includes the adaptive user mapping.
Different from the rating prediction task, the interaction prediction models in this paper should place the items a user will interact with at the top of the ranking list. Thus in the experiments, we utilize Recall@K and MAP@K to evaluate the performance of the models. Since a given domain usually has thousands of valid items, we report performance at several values of K.
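Under common definitions of the two metrics (the paper does not spell out its exact formulas, so these are our assumptions), Recall@K and MAP@K can be computed as:

```python
def recall_at_k(ranked, relevant, k):
    # fraction of the relevant items that appear in the top-k list
    return len(set(ranked[:k]) & relevant) / len(relevant)

def ap_at_k(ranked, relevant, k):
    # average precision: precision at each rank where a hit occurs
    hits, score = 0, 0.0
    for rank, item in enumerate(ranked[:k], start=1):
        if item in relevant:
            hits += 1
            score += hits / rank
    return score / min(len(relevant), k) if relevant else 0.0

ranked = [5, 3, 9, 1]                     # predicted ranking for one user
relevant = {3, 1}                         # ground-truth interactions
assert recall_at_k(ranked, relevant, 4) == 1.0
assert abs(ap_at_k(ranked, relevant, 4) - 0.5) < 1e-9   # (1/2 + 2/4) / 2
```

MAP@K then averages `ap_at_k` over all test users.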
For the baseline methods, we select the dimension of the latent vectors by validation for BPR and SpectralCF, and we follow the suggestions in the original paper for NCF to train a 3-layer MLP. We implement the CMF model using the 0/1 interaction matrix. For CDCF, the dimension is set to the same value as for all the cross-domain based models. For our proposed model, there are some hyper-parameters requiring tuning. To reduce the complexity of the proposed model, we let the dimension of the invariant user representation equal the dimension of the spectral latent vectors, and we fix the convolutional dimension parameter across layers. The number of filters is important to the performance of the model; with validation on different source-domain datasets, we find the number of filters at which JSCN performs best for most of the source domains. We present the validation of JSCN with Apps for Android as the source domain in Figure 4. We use the linear mapping for the domain-adaptive part, as suggested in Sec. V-G. For the training process, we set the learning rate and the regularization weight by validation.

To answer RQ1, in this experiment we compare the single-domain based methods with the cross-domain based models on the same domain; the target domain is the Amazon Instant Video dataset. To answer RQ2, we use the same source domain to compare the different cross-domain based methods. To answer RQ3, we compare the performance of the different versions of JSCN, i.e. the simple version and the complete version. In this section, we use three different source-domain datasets to improve the recommendation performance: Movies and TV, Clothing, Shoes and Jewelry, and Apps for Android. We analyze the results in detail.
In Fig. 5, we present the performance of the different models on the target domain w.r.t. Recall@K, and in Fig. 6, we show the performance w.r.t. MAP@K. Among the cross-domain based models, JSCN performs the best compared to all the other methods. JSCN improves the performance of SpectralCF on both recall and MAP on average, which confirms that cross-domain information can improve the performance. CMF cannot achieve a good performance compared to the other cross-domain based models. Among all the single-domain based models, according to the results in [39] and our results, SpectralCF is the best model compared to NCF and BPR, as it can not only model the positive and negative user-item interactions but also, with the graph convolution, model the interactions in a high-order nonlinear way. From the results, some cross-domain based models cannot always surpass the single-domain based models.
CDCF, CoNet, and both versions of JSCN can all transfer information across different domains. But since CDCF and CoNet have no spectral convolutional architecture, they cannot capture the high-order user-item interactions. From our results, SpectralCF can achieve performance comparable with CDCF and CoNet even without source-domain information. This suggests that we should apply spectral convolution to transfer the information across different domains. CoNet can transfer the information learned from the neural networks and shared across different networks, but it cannot capture the high-order information across domains. JSCN outperforms CoNet on both recall and MAP on average, which confirms that the spectral representations generated by JSCN can improve the performance in cross-domain recommendation.
The users in the source domain Movies and TV should request similar aspects of items as the users in the target domain Amazon Instant Video, since the items are similar. Thus it is straightforward to transfer information across these two compatible domains. The results are illustrated in Fig. 4(a) and Fig. 5(a), where the performance of the two versions of JSCN is relatively close. However, the source domain Clothing, Shoes and Jewelry is incompatible with the target domain. From the results in Fig. 4(b) and Fig. 5(b), we find that neither the simple version of JSCN nor CDCF can improve the performance compared to SpectralCF. But the complete JSCN learns the domain-invariant user representation, which can transfer information even when the domains are incompatible. As a result, the adaptive user mapping in JSCN is important for transferring information across different domains even if the domains are incompatible. The complete JSCN beats the simple version by 9.2% on recall on average and 36.4% on MAP on average, which confirms that the adaptive user mapping can solve the domain-incompatibility problem and thus improve the performance in cross-domain recommendation.
In this section, we report the cross-domain recommendation results of JSCN on the target domain with different source domains w.r.t. MAP@20 in Fig. 7. Since the recall performance varies little across different source domains, and due to the space limitation of the paper, we choose not to show the recall results.
The best result comes from the source domain Apps for Android. We find that even if some of the source domains are incompatible with the target domain Amazon Instant Video, e.g. Clothing, Shoes and Jewelry, the cross-domain recommendation still performs well. Even the source domains that perform relatively poorly, e.g. Home and Kitchen, Health and Personal Care, and Office Products, still improve the performance of SpectralCF, which suggests the benefit of source-domain information and the effectiveness of our proposed model.
label  Domain Name  # User  # Item  # Ratings

1  Home and Kitchen  k  k  k
2  Health & Personal Care  k  k  k
3  Office Products  k  k  k
From the results in Sec. V-E, we notice that our model performs differently given different source domains. Some source domains cannot provide enough information, and hence their cross-domain recommendation results are not as good as those of the other source domains. The JSCN model can combine the information from multiple source domains and share it to improve the performance on the target domain. In this section, we conduct experiments on training JSCN models with multiple source domains.
We select three source domains, Home and Kitchen, Health and Personal Care, and Office Products, which perform worst compared with the other source domains. The domain statistics are summarized in Table III. We conduct the experiment by choosing two out of the three source domains to jointly learn the JSCN model, so each run in this experiment uses two source domains. The comparison results are presented in Fig. 8.
From the results, we can find that using multiple source domains improves the performance compared with a single source domain. In particular, the combination of the Home and Kitchen and Health and Personal Care source domains improves the performance by 37.2% on average compared with using either of the two domains alone. This experiment shows that JSCN can jointly learn information from multiple source domains. When all three source domains are used together, the performance is slightly worse than the combination of source domains 1 and 2 (but still better than the other two pairings), which suggests that the number of source domains also requires tuning. The reason the combination of domains 1 and 2 outperforms the other pairings is that source domain 3 (Office Products) has a smaller density value than the other two domains, which can induce more disturbance to the model.
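The density referred to here is the fraction of observed entries in a domain's user-item rating matrix. A minimal sketch, with hypothetical counts for illustration only (not the actual Table III figures):

```python
def density(num_ratings, num_users, num_items):
    """Fraction of observed entries in the user-item rating matrix."""
    return num_ratings / (num_users * num_items)

# Hypothetical example: 100k ratings over 50k users and 20k items.
print(density(100_000, 50_000, 20_000))  # → 0.0001
```

A sparser (lower-density) source domain contributes noisier connectivity information, which is why its inclusion can hurt the joint model.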
In this part, we compare the performance of JSCN and JSCN-s with different mapping functions. Recall that JSCN-s is the simple version of the joint spectral convolutional network, which enforces the common users' latent vectors to be similar without the domain adaptive module. For the domain adaptive module of JSCN, we use either a linear mapping or a non-linear multi-layer perceptron (MLP) mapping. We use four source domains, i.e. Books, Movies and TV (MT), Clothing, Shoes and Jewelry (CSJ), and Apps for Android (AfA).
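A minimal NumPy sketch of the two mapping variants and the cross-domain objective on overlapping users; the weight shapes, hidden size, and dimension d = 8 are assumptions for illustration, not the paper's exact configuration:

```python
import numpy as np

def linear_mapping(U, W):
    """Linear map from the d-dim spectral space to the invariant space."""
    return U @ W

def mlp_mapping(U, W1, b1, W2, b2):
    """Non-linear variant: one-hidden-layer MLP with ReLU activation."""
    h = np.maximum(U @ W1 + b1, 0.0)
    return h @ W2 + b2

def cross_domain_loss(Us, Ut):
    """Mean squared distance between the invariant vectors of overlapping
    users mapped from the source (Us) and target (Ut) domains."""
    return float(np.mean(np.sum((Us - Ut) ** 2, axis=1)))

rng = np.random.default_rng(0)
Us = rng.normal(size=(4, 8))   # 4 overlapping users, d = 8 (assumed)
W = rng.normal(size=(8, 8))
loss = cross_domain_loss(linear_mapping(Us, W), linear_mapping(Us, W))
print(loss)  # identical mappings give zero loss
```

Minimizing this loss pulls the mapped representations of the same user together across domains, which is what makes the representation domain-invariant.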
Table IV
Source Domain | Books | MT | CSJ | AfA
JSCN-s | 0.02374 | 0.02291 | 0.02076 | 0.02103
JSCN-MLP | 0.02678 | 0.02375 | 0.02654 | 0.02537
JSCN (linear) | 0.02769 | 0.02364 | 0.02877 | 0.03043
Table V
Source Domain | Books | MT | CSJ | AfA
JSCN-s | 0.2011 | 0.2021 | 0.2050 | 0.2032
JSCN-MLP | 0.2107 | 0.2165 | 0.2112 | 0.2097
JSCN (linear) | 0.2187 | 0.2179 | 0.2155 | 0.2217
From the results in Table IV and Table V, we find that JSCN performs much better than JSCN-s, which shows the effectiveness of the domain adaptive user mapping module. One interesting observation is that the linear mapping beats the non-linear mapping. Since the non-linear mapping requires tuning many hyper-parameters, such as the choice of activation function and the dimension of the hidden layer, we suggest using the linear mapping function for learning the invariant user vectors. One possible explanation for this observation is that, since the spectral vectors are already low-dimensional, the MLP can easily find a mapping that makes the invariant user vectors in different domains identical, thereby overfitting the user vectors. As overfitting harms the structural regularization [42] of the domain adaptive user mapping, the information cannot be transferred as well as with the linear mapping.

In this paper, we design a Joint Spectral Convolutional Network (JSCN) to solve the cross-domain recommendation problem. Firstly, JSCN operates multi-layer spectral convolutions on different graphs simultaneously. Secondly, JSCN maps the learned spectral latent vectors to a domain-invariant user representation with an adaptive user mapping module. Finally, JSCN minimizes both the in-domain loss in the spectral latent vector space and the cross-domain loss in the domain-invariant user representation space to learn the parameters. From the experiments, we can draw three conclusions: 1) JSCN can use source-domain information to improve recommendation performance; 2) the spectral convolutions in JSCN capture comprehensive connectivity information that improves cross-domain recommendation; 3) the adaptive user mapping, by learning a domain-invariant representation, helps transfer knowledge across different domains.
This work is supported in part by NSF under grants III-1526499, III-1763325, III-1909323, CNS-1930941, and CNS-1626432. This work is also partially supported by NSF through grant IIS-1763365 and by FSU.
Session-based recommendations with recurrent neural networks. arXiv preprint arXiv:1511.06939.
Neural networks for machine learning, lecture 6a: overview of mini-batch gradient descent.
Cross-domain recommendation using vector space transfer learning. In RecSys Posters.
In Proceedings of the 26th International Joint Conference on Artificial Intelligence, pp. 2464–2470.
In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3994–4003.
The emerging field of signal processing on graphs: extending high-dimensional data analysis to networks and other irregular domains. IEEE Signal Processing Magazine 30(3), pp. 83–98.