A Survey on Cross-domain Recommendation: Taxonomies, Methods, and Future Directions

08/07/2021
by   Tianzi Zang, et al.
Shanghai Jiao Tong University

Traditional recommendation systems face two long-standing obstacles, namely the data sparsity and cold-start problems, which have promoted the emergence and development of Cross-Domain Recommendation (CDR). The core idea of CDR is to leverage information collected from other domains to alleviate the two problems in one domain. Over the last decade, much effort has been devoted to cross-domain recommendation. Recently, with the development of deep learning and neural networks, a large number of methods have emerged. However, there are few systematic surveys on CDR, especially regarding the latest proposed methods as well as the recommendation scenarios and recommendation tasks they address. In this survey paper, we first propose a two-level taxonomy of cross-domain recommendation which classifies different recommendation scenarios and recommendation tasks. We then introduce and summarize existing cross-domain recommendation approaches under different recommendation scenarios in a structured manner. We also organize commonly used datasets. We conclude this survey by providing several potential research directions in this field.

1. Introduction

In the era of information explosion, people are easily overwhelmed by a large amount of information. Millions of new products, texts, and videos are released all the time, which makes it difficult for people to find items of interest. Users’ historical interactions are believed to contain rich information about user interests that can be used to predict their future interests. This has promoted the emergence of recommendation systems (RS). The basic idea of RS is to analyze and estimate users’ interests, and then select items that users may be interested in from a large number of candidate items and recommend them to the users.

Although recommendation systems have been shown to play a significant role in a variety of applications, two long-standing obstacles greatly limit their performance. On the one hand, the number of user-item interaction records is often small and insufficient to mine user interests well, which is called the data sparsity problem. On the other hand, every service constantly gains new users for whom there are no historical interaction records. Traditional recommendation systems cannot make recommendations to these users, which is called the cold-start problem. As more and more users interact with more than one domain (e.g., music and books), there are more opportunities to leverage information collected from other domains to alleviate the two problems (i.e., data sparsity and cold-start problems) in one domain. This idea leads to Cross-Domain Recommendation (CDR), which has attracted increasing attention in recent years.

Compared with traditional recommendation systems, cross-domain recommendation is more complicated. First, considering the relations between user sets and item sets of two domains, there are different recommendation scenarios of cross-domain recommendation such as user overlap or non-overlap (Cremonesi et al., 2011; Khan et al., 2017). Second, the recommendation tasks of cross-domain recommendation vary. For example, the recommended items and the user may be in the same domain or in different domains. The goal of recommendation may be to improve performance in one specific domain or in multiple domains. Third, traditional recommendation systems only need to focus on how to model user interests from historical interaction records. Cross-domain recommendation, besides modeling user interests within a domain, also needs to consider how to transfer knowledge (i.e., user interests) among domains. This leads to two core issues for cross-domain recommendation, namely, what to transfer and how to transfer. What to transfer concerns how to mine useful knowledge in each domain, and how to transfer concerns how to establish linkages between domains and realize the transfer of knowledge.

Over the last decade or so, much effort has been devoted to cross-domain recommendation. To answer the question of what to transfer, existing studies are dedicated to applying different methods to extract useful knowledge in each domain. Traditional machine learning methods, such as matrix factorization (Singh and Gordon, 2008; Xin et al., 2015), factorization machines (Li et al., 2019; Loni et al., 2014), co-clustering (Moreno et al., 2012; Li et al., 2009a; Wang et al., 2019), and latent semantic analysis (Lu et al., 2013; Tan et al., 2014) have been widely applied. In recent years, with the emergence and development of deep learning technologies, many approaches based on deep learning have been proposed (Elkahky et al., 2015; Yan et al., 2019; Zhao et al., 2019; Hu et al., 2018a; Ma et al., [n.d.]; Hu et al., 2018b; Gao et al., 2019), which have greatly improved the accuracy and performance of cross-domain recommendation. To answer the question of how to transfer, a straightforward idea is to utilize overlapping entities, either users or items, to directly establish relationships between domains (Zhu et al., 2019, 2020; Perera and Zimmermann, 2020; Singh and Gordon, 2008; Hu et al., 2013; Yan et al., 2019; Chen et al., 2019). When there are no overlapping entities, linkages have also been established by extracting cluster-level patterns (Li et al., 2009a; Moreno et al., 2012; Shu et al., 2018; Wang et al., 2019) or by resorting to other auxiliary information (e.g., users’ generated tags, reviews, user profiles, and item content) (Fernández-Tobías and Cantador, 2014; Shi et al., 2011; Yang et al., 2015; Zhang et al., 2019a).

In the literature, there are several surveys on cross-domain recommendation. Li et al. (Li, 2011) first gave a brief survey in which they proposed that there were three different types of domains, that is, system domains, data domains, and temporal domains. To be more specific, different system domains contain different types of items (e.g., books and movies) or items of different genres (e.g., comedy movies and fiction movies). Data domains refer to the fact that users’ preferences towards items can be stored in multiple data types (e.g., explicit numeric rating data and implicit binary feedback data), and each type of data is treated as a domain. For temporal domains, the interaction records are divided into several temporal slices according to timestamps and each time slice constitutes a domain. This classification of domain types has been widely cited by later researchers. Cremonesi et al. (Cremonesi et al., 2011) identified four different cross-domain scenarios based on the relations between user sets and item sets of two domains, that is, no overlap, user overlap, item overlap, and full overlap, which have been adopted by subsequent studies (Fernández-Tobías et al., 2012; Khan et al., 2017). The most recent survey was written by Khan et al. (Khan et al., 2017), in which they made a detailed comparison and discussion of previous surveys and identified domain type, user-item overlap scenario, and recommendation tasks as three building blocks of cross-domain recommender systems. They also made a detailed summary and analysis of enabling algorithms, identified problems, and future directions.

Since existing surveys on cross-domain recommendation were published several years ago, they can no longer meet the current demand for research in this area. We write this survey mainly for the following three reasons. First, the classification of recommendation scenarios and recommendation tasks in existing surveys is coarse-grained. Considering the actual situation of cross-domain recommendation research, the classification can be further refined (see section 2.2.1 and section 2.2.2 for details). For example, existing surveys just classified the relations between user sets and item sets into two cases: overlap and non-overlap. We propose that the overlap relation can be further divided into partial overlap and full overlap. Second, many significant advances in cross-domain recommendation have happened after these surveys were published. In particular, the rising popularity and rapid development of deep learning-based technologies in recent years have affected the field of cross-domain recommendation to a large extent. Therefore, there is a need for a survey that summarizes the most recent approaches in cross-domain recommendation. Third, since there exist different recommendation scenarios and recommendation tasks in cross-domain recommendation, a method proposed in one scenario is often not applicable to another scenario. Therefore, when discussing existing studies, it is essential to categorize them according to the classification of recommendation scenarios and tasks, which has not been well addressed by existing surveys.

For a literature survey, it is essential to find every relevant piece of work. We adopted a hybrid approach for searching the relevant literature. We first used Google Scholar as the main search engine to discover related papers. We then screened most of the related high-profile conferences, such as NIPS, ICML, ICLR, SIGKDD, WWW, AAAI, SIGIR, IJCAI, and RecSys, to find recent works. The major keywords we used include “cross-domain rec”, “cross-system rec”, “cross-network rec”, and “cross-platform rec”. We also paid attention to the references mentioned in the related work section of each paper to avoid omitting relevant literature.

The contributions of this paper are summarized as follows:

  • We propose an original two-level taxonomy of cross-domain recommendation which identifies different recommendation scenarios based on the overlap of user/item sets and different recommendation tasks.

  • We summarize existing research works and make a method-based categorization of them concerning the recommendation scenarios and recommendation tasks they address.

  • We introduce frequently used datasets in cross-domain recommendation, group them into different categories, and explain how they can be used in research.

  • We outline further potential research directions of cross-domain recommendation.

The remainder of this paper is organized as follows. In Section 2, we introduce the notations adopted in this paper and our proposed two-level taxonomy of cross-domain recommendation scenarios and recommendation tasks. In Sections 3 to 5, we summarize existing cross-domain works under three widely studied recommendation scenarios, respectively, and categorize them by method in a structured manner. Section 6 introduces datasets frequently used in cross-domain recommendation. In Section 7, we present potential future research directions. Finally, we conclude this paper in Section 8.

2. Taxonomy

In this section, we first introduce the notations adopted in this paper. Then we introduce our proposed two-level taxonomy of recommendation scenarios and recommendation tasks, followed by the method-based categorization of existing research under different recommendation scenarios.

2.1. Notations

Without loss of generality, we consider the cross-domain recommendation when only two domains $D^A$ and $D^B$ are involved. The notations introduced here can be easily extended to situations with multiple domains. $\mathcal{U}^A$ and $\mathcal{I}^A$, respectively, denote the user set and item set in $D^A$, while $\mathcal{U}^B$ and $\mathcal{I}^B$ are the user set and item set in $D^B$. Two matrices $R^A \in \mathbb{R}^{m^A \times n^A}$ and $R^B \in \mathbb{R}^{m^B \times n^B}$ represent interactions between users and items in each domain, where $m^A$, $n^A$ and $m^B$, $n^B$ denote the number of users and items in $D^A$ and $D^B$. Each element $r^A_{ij}$ (or $r^B_{ij}$) can be a numeric value denoting a user’s explicit rating of an item or a binary value representing an implicit interaction (e.g., click, purchase, add to cart) between a user-item pair. In addition, in each domain, there may be another two matrices $X^A$ and $Y^A$ ($X^B$ and $Y^B$ in $D^B$) representing user profiles and item attributes, respectively.
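To make the notation concrete, the following is a minimal sketch of how the two interaction matrices could be built from raw records, one with explicit ratings and one with implicit feedback. The function name `build_interaction_matrix`, the toy records, and the convention of storing unobserved pairs as 0 are assumptions made for illustration only.

```python
import numpy as np

def build_interaction_matrix(records, num_users, num_items, implicit=False):
    """Build a dense user-item matrix from (user_index, item_index, value) records.

    Unobserved pairs are left as 0 (an illustrative convention). With
    implicit=True every observed pair is stored as 1, whatever its value.
    """
    matrix = np.zeros((num_users, num_items))
    for user, item, value in records:
        matrix[user, item] = 1.0 if implicit else float(value)
    return matrix

# Hypothetical toy data: explicit ratings in one domain, clicks in the other.
ratings_a = [(0, 0, 5.0), (0, 2, 3.0), (1, 1, 4.0)]
clicks_b = [(0, 1, 1), (2, 0, 1)]

R_a = build_interaction_matrix(ratings_a, num_users=2, num_items=3)
R_b = build_interaction_matrix(clicks_b, num_users=3, num_items=2, implicit=True)
```

Real datasets are usually far too sparse for dense arrays; a sparse matrix format would be the natural replacement in practice.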

2.2. The Proposed Two-level Taxonomy

Different from traditional single-domain recommendation systems, cross-domain recommendation is more complicated as there exist different recommendation scenarios and recommendation tasks. Therefore, we first propose a two-level taxonomy, each level of which consists of two dimensions, to classify cross-domain recommendation scenarios and recommendation tasks, respectively. With our proposed taxonomy, all of the current studies can be classified and positioned, which can help researchers quickly understand the characteristics of each study. In the following, we introduce our proposed two-level taxonomy in detail.

Figure 1. Comparison between previous classification of recommendation scenarios and ours. (a) illustration of scenarios identified by previous surveys. (b) illustration of scenarios identified in this paper.

2.2.1. The First Level: Classification of recommendation scenarios

The first level of the proposed taxonomy concerns recommendation scenarios, and the classification criterion is the relation between the user sets and item sets of two domains. To be more specific, there are two dimensions, that is, the overlap of user sets and the overlap of item sets.

The classification of recommendation scenarios was first proposed by Cremonesi et al. (Cremonesi et al., 2011), who identified four different recommendation scenarios (shown in Fig. 1 (a)), that is, no overlap, user overlap, item overlap, and full overlap. This classification was widely accepted by subsequent researchers. However, through analysis and summary of recent works, we propose that this classification is coarse-grained and can be further refined. Still considering the relations between user sets and item sets of two domains, we describe these two dimensions in detail as follows.

Dimension 1: the overlap of user sets. This dimension refers to whether there are common users between the user sets of two domains. Previous studies only divided user sets into overlap and non-overlap, but we propose that overlap can be further subdivided into partial overlap and full overlap. Therefore, under this dimension, cross-domain recommendation can be divided into three sub-categories. The first category is user non-overlap. Following the notations defined above, with $\mathcal{U}^A$ and $\mathcal{U}^B$ the user sets of the two domains, it can be denoted as $\mathcal{U}^A \cap \mathcal{U}^B = \emptyset$. The second category is user partial overlap, which means $\mathcal{U}^A \cap \mathcal{U}^B \neq \emptyset$ and $\mathcal{U}^A \neq \mathcal{U}^B$. The third category is user full overlap, which means $\mathcal{U}^A = \mathcal{U}^B$.

Dimension 2: the overlap of item sets. This dimension refers to whether there are common items between the item sets of two domains. Previous studies also only divided items into overlap and non-overlap categories. Similar to the first dimension, considering the overlap of item sets, cross-domain recommendation can also be divided into three sub-categories, that is, item non-overlap, item partial overlap, and item full overlap. Writing $\mathcal{I}^A$ and $\mathcal{I}^B$ for the item sets of the two domains, they can be denoted, respectively, as $\mathcal{I}^A \cap \mathcal{I}^B = \emptyset$; $\mathcal{I}^A \cap \mathcal{I}^B \neq \emptyset$ and $\mathcal{I}^A \neq \mathcal{I}^B$; and $\mathcal{I}^A = \mathcal{I}^B$.
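The two dimensions can be checked mechanically on concrete ID sets. The sketch below is illustrative only: the function names are hypothetical, and it assumes that full overlap means the two sets are identical and that shared users or items carry the same ID in both domains.

```python
def classify_overlap(set_a, set_b):
    """Return 'non-overlap', 'partial overlap', or 'full overlap' for two ID sets."""
    set_a, set_b = set(set_a), set(set_b)
    if not set_a & set_b:
        return "non-overlap"        # empty intersection
    if set_a == set_b:
        return "full overlap"       # identical sets (assumed definition)
    return "partial overlap"        # some, but not all, entities are shared


def classify_scenario(users_a, items_a, users_b, items_b):
    """Combine both dimensions into one of the nine (user, item) scenarios."""
    return (
        "user " + classify_overlap(users_a, users_b),
        "item " + classify_overlap(items_a, items_b),
    )


# Hypothetical example: a book domain and a movie domain sharing two users.
scenario = classify_scenario(
    users_a={"u1", "u2", "u3"}, items_a={"book1", "book2"},
    users_b={"u2", "u3", "u4"}, items_b={"movie1"},
)
```

In practice the hard part is entity resolution: when accounts cannot be linked across systems, the data must be treated as non-overlapping even if the same people are present in both domains.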

| Overlap of user sets | Overlap of item sets | Existing works |
|---|---|---|
| User non-overlap | Item non-overlap | (Li et al., 2009a; Moreno et al., 2012; Gao et al., 2013; Ren et al., 2015; He et al., 2018b; Zhang et al., 2017; Chen et al., 2013; Zhang et al., 2018; Shu et al., 2018; Li et al., 2009b; Iwata and Takeuchi, 2015; Wang et al., 2019) (Enrich et al., 2013; Fernández-Tobías and Cantador, 2014; Shi et al., 2011; Hao et al., 2016; Zhang et al., 2019a; Fang et al., 2015; Kumar et al., 2014; Yang et al., 2015; Zhao et al., 2013, 2017; Zhang et al., 2016) |
| | Item partial overlap | symmetric to user partial overlap item non-overlap |
| | Item full overlap | symmetric to user full overlap item non-overlap |
| User partial overlap | Item non-overlap | (Rafailidis and Crestani, 2016, 2017; Zhang et al., 2019b; Yang et al., 2017; Zhu et al., 2019, 2020; Perera and Zimmermann, 2017, 2020; Cui et al., 2020; Li et al., 2020; Jiang et al., 2016) (Man et al., 2017; Wang et al., 2018; Fu et al., 2019; Bi et al., 2020b, a; Zhu et al., 2018; Kang et al., 2019; Zhang et al., 2020; Zhao et al., 2020; Wang et al., 2017) |
| | Item partial overlap | - |
| | Item full overlap | - |
| User full overlap | Item non-overlap | (Lu et al., 2013; Tan et al., 2014; Sahebi et al., 2017; Singh and Gordon, 2008; Ma et al., 2008; Xin et al., 2015; Zhao et al., 2018; Huang et al., 2019; Hu et al., 2013; Liu et al., 2015; Song et al., 2017; Loni et al., 2014; Li et al., 2019; Ma et al., 2018) (Elkahky et al., 2015; Lian et al., 2017; He et al., 2018a; Zhao et al., 2019; Yan et al., 2019; Chen et al., 2019; Hu et al., 2018a; Liu et al., 2020b; Li and Tuzhilin, 2020; Ma et al., [n.d.]; Liu et al., 2020a; Hu et al., 2018b; Gao et al., 2019; Perera and Zimmermann, 2018) |
| | Item partial overlap | - |
| | Item full overlap | equivalent to single-domain recommendation |

Table 1. Categorization of existing works according to recommendation scenarios.

With these two dimensions, we extend the classification of recommendation scenarios from the previous four categories to nine categories and show the comparison in Fig. 1. We also present the categorization of existing works according to recommendation scenarios in Table 1. It is worth noting that there are currently no studies on three scenarios (i.e., user partial overlap item partial overlap, user partial overlap item full overlap, and user full overlap item partial overlap). To ensure the integrity of the tree-structured categorization, we retain these nodes. It should also be pointed out that the recommendation scenario with user non-overlap item partial overlap is symmetric to the case with user partial overlap item non-overlap. Similarly, the scenario with user non-overlap item full overlap is symmetric to the case with user full overlap item non-overlap. Most of the approaches proposed for one scenario apply to the other scenario by exchanging users and items. Therefore, we will not discuss in detail the scenarios with user non-overlap item partial overlap and user non-overlap item full overlap. Moreover, the recommendation scenario with user full overlap item full overlap is equivalent to single-domain recommendation and is therefore outside the scope of this paper.

2.2.2. The Second Level: Classification of Recommendation Tasks

The second level of our proposed taxonomy concerns recommendation tasks and also consists of two dimensions. Based on whether the recommended items are in the same domain as the user, cross-domain recommendation can be divided into two categories, that is, intra-domain recommendation and inter-domain recommendation. Considering whether the number of target domains is single or multiple, cross-domain recommendation can be divided into another two categories, i.e., single-target cross-domain recommendation and multi-target cross-domain recommendation. For ease of understanding, we elaborate on these two dimensions below.

Dimension 1: whether recommended items are in the same domain as the user. Considering this dimension, there are two different cross-domain recommendation tasks, that is, intra-domain recommendation and inter-domain recommendation. Let $\mathcal{U}^A$, $\mathcal{I}^A$ and $\mathcal{U}^B$, $\mathcal{I}^B$ denote the user and item sets of domains $D^A$ and $D^B$. For intra-domain recommendation, the recommended items are in the same domain as the user, which means we recommend a subset of items in $\mathcal{I}^A$ to a user $u \in \mathcal{U}^A$, or a subset of items in $\mathcal{I}^B$ to a user $u \in \mathcal{U}^B$. For inter-domain recommendation, the recommended items are from a different domain than the user, that is, we recommend items in $\mathcal{I}^B$ to a user $u \in \mathcal{U}^A$, or items in $\mathcal{I}^A$ to a user $u \in \mathcal{U}^B$. Inter-domain recommendation is often referred to as cold-start user recommendation because a user $u \in \mathcal{U}^A$ (or $u \in \mathcal{U}^B$) has no interactions with items in $\mathcal{I}^B$ (or $\mathcal{I}^A$) and can therefore be seen as a cold-start user in domain $D^B$ (or domain $D^A$).

Dimension 2: whether the number of target domains is single or multiple. Based on this dimension, there are another two cross-domain recommendation tasks, that is, single-target recommendation and multi-target recommendation. For single-target recommendation, the domain with a denser interaction matrix is generally treated as the auxiliary/source domain and the other domain with a sparser interaction matrix is referred to as the target domain. The goal of the recommendation is to improve the recommendation performance in the target domain. Multi-target recommendation is based on the assumption that the interaction matrices of both domains are sparse, or that both domains are sparse in some kinds of information, so that the performance of both domains can be improved by utilizing knowledge from the other domain. There is no source-target distinction, and the aim is to simultaneously enrich both domains and improve the performance of both.
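The two task dimensions can be summarized in a few lines of code. This is only an illustrative sketch: the function `classify_task` and its arguments are hypothetical, and a task is described simply by the domain the user comes from, the domain the recommended items come from, and the list of domains whose performance is to be improved.

```python
def classify_task(user_domain, item_domain, target_domains):
    """Label a recommendation task along the two task dimensions.

    user_domain / item_domain: domain names of the user and of the
    recommended items. target_domains: domains to be improved.
    """
    # Dimension 1: are the recommended items in the user's own domain?
    dim1 = "intra-domain" if user_domain == item_domain else "inter-domain"
    # Dimension 2: is performance improved in one domain or in several?
    dim2 = "single-target" if len(set(target_domains)) == 1 else "multi-target"
    return dim1, dim2


# A sparse book domain helped by a denser movie domain, recommending books
# to book users: an intra-domain single-target task.
task = classify_task("book", "book", target_domains=["book"])
```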

| Dimension 2 \ Dimension 1 | Intra-domain recommendation | Inter-domain recommendation |
|---|---|---|
| Single-target recommendation | (Li et al., 2009a; Moreno et al., 2012; He et al., 2018b; Chen et al., 2013; Zhang et al., 2018; Shi et al., 2011; Shu et al., 2018; Perera and Zimmermann, 2017, 2020; Enrich et al., 2013) (Fernández-Tobías and Cantador, 2014; Zhao et al., 2013, 2017; Loni et al., 2014; Li et al., 2019; Hu et al., 2018b; Gao et al., 2019; Ma et al., 2018; Perera and Zimmermann, 2018) | (Kumar et al., 2014; Yang et al., 2015; Wang et al., 2017; Man et al., 2017; Wang et al., 2018; Fu et al., 2019; Bi et al., 2020b, a; Zhu et al., 2018; Kang et al., 2019) |
| Multi-target recommendation | (Gao et al., 2013; Ren et al., 2015; Li et al., 2009b; Iwata and Takeuchi, 2015; Wang et al., 2019; Shi et al., 2011; Fang et al., 2015; Hao et al., 2016; Zhang et al., 2019a, 2016) (Cui et al., 2020; Zhang et al., 2020; Zhu et al., 2020, 2019; Jiang et al., 2016; Rafailidis and Crestani, 2016, 2017; Zhang et al., 2019b; Yang et al., 2017) (Singh and Gordon, 2008; Ma et al., 2008; Xin et al., 2015; Zhao et al., 2018; Hu et al., 2013; Liu et al., 2015; Song et al., 2017; Elkahky et al., 2015; Lian et al., 2017; He et al., 2018a) (Zhao et al., 2019; Yan et al., 2019; Chen et al., 2019; Hu et al., 2018a; Wang et al., 2018; Liu et al., 2020b; Li and Tuzhilin, 2020; Ma et al., [n.d.]; Liu et al., 2020a) | (Li et al., 2020; Zhao et al., 2020) |

Table 2. Categorization of existing works according to recommendation tasks.

Table 2 shows the categorization of representative cross-domain recommendation studies according to recommendation tasks. As we can see, the most widely studied task is intra-domain multi-target recommendation, followed by intra-domain single-target recommendation. Inter-domain multi-target recommendation is the least studied, with only two corresponding papers.

Figure 2. Method-based categorization under different recommendation scenarios.

2.3. Method-based Categorization under Different Recommendation Scenarios

Based on our proposed two-level taxonomy, we have identified different recommendation scenarios and different recommendation tasks. In the following, we systematically review and analyze the approaches of existing works, which is the core of this paper. However, approaches proposed under one scenario may not apply to another scenario, so it is not meaningful to present approaches together without regard to the scenario they target. Therefore, before introducing existing studies, we first make a method-based categorization of them under different recommendation scenarios, which is shown in Fig. 2. Specifically, the four gray boxes represent three unstudied recommendation scenarios and a recommendation scenario that is equivalent to single-domain recommendation, for which we have no further method-based classification. In the following sections, we introduce existing works under three recommendation scenarios in turn, i.e., user non-overlap item non-overlap, user partial overlap item non-overlap (equivalent to user non-overlap item partial overlap), and user full overlap item non-overlap (equivalent to user non-overlap item full overlap).

| Method | Approach | Venue | Year | Recommendation task (Dimension 1) | Recommendation task (Dimension 2) |
|---|---|---|---|---|---|
| Extracting Cluster-Level Rating Patterns | CBT (Li et al., 2009a) | IJCAI | 2009 | intra-domain | single-target |
| | TALMUD (Moreno et al., 2012) | CIKM | 2012 | intra-domain | single-target |
| | CLFM (Gao et al., 2013) | PKDD | 2013 | intra-domain | multi-target |
| | PCLF (Ren et al., 2015) | AAAI | 2015 | intra-domain | multi-target |
| | MINDTL (He et al., 2018b) | WSDM | 2018 | intra-domain | single-target |
| | FUSE (Chen et al., 2013) | KDD | 2013 | intra-domain | single-target |
| | ProbKT (Zhang et al., 2018) | ICONIP | 2018 | intra-domain | single-target |
| | CIT (Shi et al., 2011) | Support Syst. | 2017 | intra-domain | single-target |
| | CrossFire (Shu et al., 2018) | WSDM | 2018 | intra-domain | single-target |
| | RMGM (Li et al., 2009b) | ICML | 2009 | intra-domain | multi-target |
| | (Iwata and Takeuchi, 2015) | AISTATS | 2015 | intra-domain | multi-target |
| | CDIE-C (Wang et al., 2019) | WSDM | 2019 | intra-domain | multi-target |
| Capturing Tag Correlations | UserItemTags (Enrich et al., 2013) | EC-Web | 2013 | intra-domain | single-target |
| | TagGSVD++ (Fernández-Tobías and Cantador, 2014) | RecSys | 2014 | intra-domain | single-target |
| | TagCDCF (Shi et al., 2011) | UMAP | 2011 | intra-domain | multi-target |
| | ETagiCDCF (Hao et al., 2016) | FUZZ-IEEE | 2016 | intra-domain | multi-target |
| | SCT (Zhang et al., 2019a) | IJCNN | 2019 | intra-domain | multi-target |
| | TMT (Fang et al., 2015) | ICDMW | 2015 | intra-domain | multi-target |
| | SCD (Kumar et al., 2014) | CIDM | 2014 | inter-domain | single-target |
| | GRAPH (Yang et al., 2015) | CIKM | 2015 | inter-domain | single-target |
| Applying Active Learning | MMMF (Zhao et al., 2013) | AAAI | 2013 | intra-domain | single-target |
| | RLMF, PMF (Zhao et al., 2017) | Artif. Intell. | 2017 | intra-domain | single-target |
| | MultiAL (Zhang et al., 2016) | AAAI | 2016 | intra-domain | multi-target |

Table 3. Method-based categorization of existing approaches for the recommendation scenario with user non-overlap and item non-overlap.

3. Scenario 1: User non-overlap & Item non-overlap

Due to the independence and isolation of information in different domains, the user sets and item sets of two domains often do not overlap; even when some users or items are shared, they cannot be identified and the correspondences are not available. This leads to the first recommendation scenario, where neither user sets nor item sets overlap, which was extensively studied in early cross-domain recommendation work. In particular, there are three classes of approaches for this recommendation scenario, that is, (1) extracting cluster-level rating patterns, (2) capturing tag correlations, and (3) applying active learning technology. The method-based categorization of existing approaches is displayed in Table 3.

3.1. Extracting Cluster-Level Rating Patterns

3.1.1. The basic paradigm

The first class of approaches assumes that the services of different domains are geared towards the general population, so that users in different domains may have similar preferences while items may share some properties as well. Although there are no overlapping users or items between domains, two domains may share cluster-level rating patterns. Therefore, approaches of this class aim at extracting cluster-level rating patterns from one domain and transferring them to the other domain. Fig. 3 shows the schematic diagram of this method, and we describe the details in the following.

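The first step is to factorize the rating matrix $R^s$ of the source domain into three latent matrices $U$, $S$, and $V^\top$. Specifically, the orthogonal nonnegative matrix tri-factorization (ONMTF) algorithm is a widely used co-clustering algorithm that has been proved to be equivalent to a two-way K-means clustering algorithm:

$$\min_{U \ge 0,\, S \ge 0,\, V \ge 0} \left\| R^s - U S V^\top \right\|_F^2 \quad \text{s.t.} \quad U^\top U = I,\; V^\top V = I \tag{1}$$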

Each row of U and V is the cluster indicator for a user or an item.
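As a rough illustration of this first step, the sketch below tri-factorizes a small nonnegative matrix with multiplicative update rules, a common way to solve this kind of objective. The solver choice, the function name `onmtf`, the iteration count, and the toy matrix are all assumptions for illustration; orthogonality of the factors is only encouraged by the updates, not enforced exactly, and the input is assumed to be fully observed.

```python
import numpy as np

def onmtf(X, k, l, n_iter=200, eps=1e-9, seed=0):
    """Approximate a nonnegative m x n matrix X as U @ S @ V.T.

    U (m x k) holds soft user-cluster memberships, V (n x l) holds soft
    item-cluster memberships, and S (k x l) links the two sets of clusters.
    """
    rng = np.random.default_rng(seed)
    m, n = X.shape
    U = rng.random((m, k)) + eps
    V = rng.random((n, l)) + eps
    S = rng.random((k, l)) + eps
    for _ in range(n_iter):
        # Each factor is rescaled element-wise, so it stays nonnegative.
        XVSt = X @ V @ S.T                                   # m x k
        U *= np.sqrt(XVSt / (U @ (U.T @ XVSt) + eps))
        XtUS = X.T @ U @ S                                   # n x l
        V *= np.sqrt(XtUS / (V @ (V.T @ XtUS) + eps))
        UtXV = U.T @ X @ V                                   # k x l
        S *= np.sqrt(UtXV / (U.T @ U @ S @ V.T @ V + eps))
    return U, S, V

# Toy, fully rated source matrix with two user groups and two item groups.
X = np.array([[5., 5., 1., 1.],
              [5., 5., 1., 1.],
              [1., 1., 5., 5.],
              [1., 1., 5., 5.]])
U, S, V = onmtf(X, k=2, l=2)
user_clusters = U.argmax(axis=1)   # hard cluster index per user
item_clusters = V.argmax(axis=1)   # hard cluster index per item
```

Taking the arg-max of each row turns the soft memberships into the hard cluster indicators that the later steps rely on.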

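The next step is to construct a compact matrix $B$ that encodes the cluster-level rating patterns of clusters of users and clusters of items that are shared between domains. The third step is to get the cluster indicator matrices $U^t$ and $V^t$ in the target domain by transferring $B$ to the target domain:

$$\min_{U^t,\, V^t} \left\| \left[ R^t - U^t B (V^t)^\top \right] \odot W \right\|_F^2 \tag{2}$$

where $R^t$ is the rating matrix of the target domain, $W$ is a binary mask marking its observed entries, and $\odot$ denotes the element-wise product. Finally, the predicted ratings in the target domain are generated:

$$\hat{R}^t = W \odot R^t + (\mathbf{1} - W) \odot \left[ U^t B (V^t)^\top \right] \tag{3}$$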
Figure 3. The schematic diagram of extracting cluster-level rating patterns.
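The transfer step can be sketched as an alternating assignment: with the shared cluster-level matrix held fixed, each target user is assigned to the user cluster that best explains its observed ratings, then each target item is assigned likewise, and missing ratings are finally read off the cluster-level matrix. The code below is a simplified illustration under these assumptions (hard assignments, a fixed number of iterations, unobserved ratings stored as 0); the function name `transfer_codebook` and the toy data are hypothetical.

```python
import numpy as np

def transfer_codebook(R, W, B, n_iter=10, seed=0):
    """Assign target users/items to the clusters of a fixed k x l matrix B.

    R: m x n target ratings, W: m x n binary mask of observed entries.
    Returns hard user clusters, hard item clusters, and the filled matrix.
    """
    rng = np.random.default_rng(seed)
    k, l = B.shape
    m, n = R.shape
    item_cluster = rng.integers(0, l, size=n)       # random initial item clusters
    user_cluster = np.zeros(m, dtype=int)
    for _ in range(n_iter):
        # Masked squared error of every user against every user cluster.
        B_items = B[:, item_cluster]                                  # k x n
        user_err = (((R[:, None, :] - B_items[None, :, :]) ** 2)
                    * W[:, None, :]).sum(axis=2)                      # m x k
        user_cluster = user_err.argmin(axis=1)
        # Masked squared error of every item against every item cluster.
        B_users = B[user_cluster, :]                                  # m x l
        item_err = (((R[:, :, None] - B_users[:, None, :]) ** 2)
                    * W[:, :, None]).sum(axis=0)                      # n x l
        item_cluster = item_err.argmin(axis=1)
    # Keep observed ratings; fill the rest from the cluster-level pattern.
    R_cluster = B[user_cluster][:, item_cluster]                      # m x n
    R_filled = W * R + (1 - W) * R_cluster
    return user_cluster, item_cluster, R_filled

# Hypothetical cluster-level pattern and sparse target ratings (0 = missing).
B = np.array([[5., 1.],
              [1., 5.]])
R = np.array([[5., 0., 1., 1.],
              [5., 5., 0., 1.],
              [1., 1., 5., 0.],
              [0., 1., 5., 5.]])
W = (R > 0).astype(float)
user_cluster, item_cluster, R_filled = transfer_codebook(R, W, B)
```

Each assignment step cannot increase the masked error for the other side held fixed, but the procedure only reaches a local optimum that depends on the initialization.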

3.1.2. Approaches of this method

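The first approach of this method was proposed by Li et al. (Li et al., 2009a). The generated compact matrix $B$ is called a “codebook”. They set the largest nonnegative entry in each row of $U$ and $V$ to 1 and the others to 0, getting two binary indicator matrices $\hat{U}$ and $\hat{V}$. Then the codebook $B$ is constructed by averaging all the ratings in each user-item co-cluster as an entry in the codebook as follows:

$$B = \left[ \hat{U}^\top R^s \hat{V} \right] \oslash \left[ \hat{U}^\top \mathbf{1}\mathbf{1}^\top \hat{V} \right] \tag{4}$$

where $R^s$ is the source-domain rating matrix, $\mathbf{1}$ is an all-ones vector, and $\oslash$ denotes element-wise division.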
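The codebook construction amounts to a per-co-cluster average. The following sketch assumes a fully rated source matrix and soft membership matrices from some co-clustering step; the function name `build_codebook`, the arg-max binarization, and the choice to give empty co-clusters the value 0 are illustrative assumptions.

```python
import numpy as np

def build_codebook(R, U, V):
    """Average the ratings of each user-item co-cluster.

    R: m x n fully rated source matrix.
    U: m x k and V: n x l nonnegative (soft) cluster memberships.
    Returns the k x l matrix of co-cluster mean ratings.
    """
    m, k = U.shape
    n, l = V.shape
    # Binarize: the largest entry of each row becomes 1, the others 0.
    U_bin = np.zeros((m, k))
    U_bin[np.arange(m), U.argmax(axis=1)] = 1.0
    V_bin = np.zeros((n, l))
    V_bin[np.arange(n), V.argmax(axis=1)] = 1.0
    sums = U_bin.T @ R @ V_bin                                  # rating sum per co-cluster
    counts = np.outer(U_bin.sum(axis=0), V_bin.sum(axis=0))     # entries per co-cluster
    # Empty co-clusters get 0 (an arbitrary illustrative choice).
    return np.divide(sums, counts, out=np.zeros_like(sums), where=counts > 0)

# Hypothetical toy example: users 0-1 and items 0-1 form the first clusters.
R_src = np.array([[5., 5., 1.],
                  [5., 5., 1.],
                  [1., 1., 5.]])
U_soft = np.array([[0.9, 0.1], [0.8, 0.2], [0.1, 0.9]])
V_soft = np.array([[0.7, 0.3], [0.6, 0.4], [0.2, 0.8]])
codebook = build_codebook(R_src, U_soft, V_soft)
```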

The basic paradigm was then widely borrowed, expanded, and improved by many researchers, and some variant approaches have been proposed.

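Moreno et al. (Moreno et al., 2012) extended this paradigm by transferring knowledge from multiple source domains with varying levels of relevance and proposed an approach named TALMUD. For each source domain $n$ with rating matrix $R^{s_n}$, there is a codebook $B_n$ representing the transferred knowledge from this domain and a variable $\alpha_n$ denoting its degree of relevance to the target domain. The target rating matrix is reconstructed by linearly integrating the rating patterns of all $N$ source domains, i.e., equation (2) is changed as follows:

$$\min_{U^t_n,\, V^t_n,\, \alpha_n} \left\| \left[ R^t - \sum_{n=1}^{N} \alpha_n U^t_n B_n (V^t_n)^\top \right] \odot W \right\|_F^2 \tag{5}$$

where $R^t$ is the target rating matrix, $W$ is the binary mask of its observed entries, $\odot$ is the element-wise product, and $U^t_n$ and $V^t_n$ are the target-domain cluster indicator matrices associated with the $n$-th codebook.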
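To illustrate only the role of the relevance weights, the sketch below holds the per-source cluster-level reconstructions fixed and fits the weights by ordinary least squares on the observed target ratings. This is a simplification chosen for illustration, not the optimization procedure of TALMUD, which also updates the cluster memberships; the function names and toy matrices are hypothetical.

```python
import numpy as np

def fit_relevance_weights(R, W, reconstructions):
    """Least-squares weights for combining per-source reconstructions.

    R: m x n target ratings, W: m x n binary mask of observed entries,
    reconstructions: list of m x n matrices, one per source domain,
    each playing the role of (indicator @ codebook @ indicator.T).
    """
    observed = W.astype(bool)
    # One column per source domain, one row per observed target rating.
    A = np.stack([P[observed] for P in reconstructions], axis=1)
    y = R[observed]
    alpha, *_ = np.linalg.lstsq(A, y, rcond=None)
    return alpha

def combine(reconstructions, alpha):
    """Weighted sum of the per-source reconstructions."""
    return sum(a * P for a, P in zip(alpha, reconstructions))

# Hypothetical example: the target is an exact 0.7 / 0.3 blend of two sources.
P1 = np.array([[5., 1.], [1., 5.]])
P2 = np.array([[3., 3.], [2., 4.]])
R_tgt = 0.7 * P1 + 0.3 * P2
W_tgt = np.array([[1., 1.], [1., 0.]])     # bottom-right rating is unobserved
alpha = fit_relevance_weights(R_tgt, W_tgt, [P1, P2])
R_pred = combine([P1, P2], alpha)
```

A source domain whose patterns do not help explain the observed target ratings receives a small weight, which is the intuition behind weighting sources by relevance.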

Some researchers proposed that there simultaneously exist common cluster-level rating patterns shared across domains and domain-specific cluster-level rating patterns for each domain. The final predicted rating scores should combine ratings from the perspective of common rating patterns with ratings from the perspective of domain-specific rating patterns. Following this idea, Gao et al. (Gao et al., 2013) proposed a Cluster-Level Latent Factor Model (CLFM) which partitioned the cluster-level rating patterns $B$ in each domain into a common part $B^c$ and a domain-specific part $B^d$, that is, $B = [B^c, B^d]$. Ren et al. (Ren et al., 2015) proposed a Probabilistic Cluster-level Latent Factor (PCLF) model that jointly learnt a common cluster-level rating matrix and a domain-specific cluster-level rating matrix. Zhang et al. (Zhang et al., 2018) proposed a joint probabilistic model named ProbKT which jointly captured domain-shared group-level knowledge and domain-specific group-level knowledge. Equation (2) of these approaches is changed as follows:

$$\min_{U^t,\, V^t,\, B^c,\, B^d} \left\| \left[ R^t - U^t [B^c, B^d] (V^t)^\top \right] \odot W \right\|_F^2 \tag{6}$$

Chen et al. (Chen et al., 2013) extended this paradigm by taking users’ generated tags into consideration and proposed a tensor-factorization-based framework (FUSE). Instead of sharing a cluster-level rating matrix between domains, they constructed a shared three-dimensional cluster-level tensor to transfer knowledge. The matrix factorization process is, therefore, replaced by tensor factorization which maps users, items, and tags into a shared latent feature space. The clustering operation is simultaneously performed on users, items, and tags to obtain clusters of them.

Zhang et al. (Zhang et al., 2017) argued that domains diverge from one another and that directly transferring the cluster-level rating patterns of the source domain to the target domain may result in "negative transfer". Therefore, they utilized a domain adaptation technique to ensure the consistency of the transferred knowledge. He et al. (He et al., 2018b) pointed out that the orthogonal nonnegative matrix tri-factorization (ONMTF) algorithm adopted by existing approaches requires the matrix to be fully rated, a condition that is difficult to satisfy. They extended the TALMUD (Moreno et al., 2012) approach by applying an incomplete orthogonal nonnegative matrix tri-factorization (IONMTF) algorithm which relaxed the full-rating restriction of ONMTF and was easier to apply to real-world datasets.

Shu et al. (Shu et al., 2018) proposed a cross-media joint friend and item recommendation framework (CrossFire) which aimed to jointly recommend items and friends to users in the target domain. In addition to tri-factorizing rating matrices into user feature matrices, item feature matrices, and a shared rating feature matrix as in previous works, it also tri-factorized user-user link matrices into user feature matrices and a shared interaction matrix. CrossFire also utilized item features as auxiliary information. It assumed that items in different domains share a dictionary and, in each domain, factorized the feature matrix into the item feature matrix and the dictionary.

3.1.3. Expand to multi-target recommendation.

All of the approaches mentioned above are for single-target recommendation as they explicitly extract cluster-level rating patterns in the source domain and then transfer them to the target domain. Some researchers have tried to expand this method to multi-target recommendation.

Li et al. (Li et al., 2009b) proposed a rating-matrix generative model (RMGM) which integrated several sparse domains and assumed that multiple domains share common latent cluster-level patterns. They then learned the probabilities of each user and item belonging to this shared latent structure. Iwata et al. (Iwata and Takeuchi, 2015) proposed that the latent factors of users/items generated by matrix factorization in each domain were drawn from a common Gaussian distribution. The shared mean vector and covariance matrix helped to align the latent factors from different domains.

Wang et al. (Wang et al., 2019) focused on cross-domain session-based recommendation and proposed a cross-domain item embedding method based on co-clustering (CDIE-C). As user information is not available and items differ between domains, we also categorize it under the recommendation scenario where neither users nor items overlap. The core idea of CDIE-C is to extract cluster-level correlations by exploring cross-domain co-occurrence relations of items based on the co-clustering method. Then, both item-level sequence information within each domain and cluster-level cross-domain correlation information can be captured to generate the final cross-domain recommendation.

3.2. Capturing Tag Correlations

3.2.1. The basic paradigm.

This method is based on the assumption that although users and items differ between domains, users may use the same tags to annotate items of interest, and items in different domains may receive the same tags describing their properties. As shown in Fig. 4, domain A and domain B have different users and items. Users in domain A tag books and users in domain B tag movies, so the tags “sci-fiction” and “romantic” exist in both domains. Therefore, this method turns to user-generated tags to establish the linkages between different domains. Specifically, there are two ways to capture tag correlations. On the one hand, tags can be used to simultaneously enhance the profiles of users and items. On the other hand, matrices of similarities between users, items, and tags can be generated based on tags and then be used as constraints to learn better user and item representations.

Figure 4. The schematic diagram of capturing tag correlations.

3.2.2. Approaches of this method

For enhancing the profiles of users and items, Enrich et al. (Enrich et al., 2013) first proposed to utilize tags as implicit user feedback to enhance item factors. They proposed a tag-based cross-domain collaborative filtering approach based on the SVD++ (Koren, 2008) algorithm and explored three different adaptations of it. Fernández-Tobías et al. (Fernández-Tobías and Cantador, 2014) claimed that the approach proposed in (Enrich et al., 2013) did not fully exploit the preferences that users express through the tags they assign to items. They further proposed an approach named TagGSVD++ based on GSVD++ (Manzato, 2013) which simultaneously enriched users’ profiles with the tags they used and extended items’ profiles with the tags they received.

For constructing similarity matrices between users and items, Shi et al. (Shi et al., 2011) proposed a tag-induced cross-domain collaborative filtering (TagCDCF) algorithm that exploited shared tags to construct a cross-domain user-to-user similarity matrix and a cross-domain item-to-item similarity matrix. The tag-induced similarity matrices were then incorporated into the matrix factorization process as additional constraints by forcing similar users/items to have closer representations. The objective function is generally defined as follows:
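A generic objective of this kind can be sketched in code. The form used here is an assumption chosen for illustration (a squared-error fit term, a similarity-weighted distance penalty on user and item factors, and an L2 term); it is not claimed to match the exact TagCDCF objective, and the names `regularized_loss`, `alpha`, and `lam` are hypothetical. Users and items from both domains are assumed to be stacked into single factor lists so that a similarity entry can link entities across domains.

```python
from typing import Dict, List, Tuple

Vec = List[float]
Pairs = Dict[Tuple[int, int], float]


def dot(a: Vec, b: Vec) -> float:
    return sum(x * y for x, y in zip(a, b))


def sq_dist(a: Vec, b: Vec) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def regularized_loss(ratings: Pairs, U: List[Vec], V: List[Vec],
                     user_sim: Pairs, item_sim: Pairs,
                     alpha: float = 0.1, lam: float = 0.01) -> float:
    """Fit term + tag-similarity penalties + L2 regularization (assumed form)."""
    # Squared error on observed (user, item) ratings.
    fit = 0.5 * sum((r - dot(U[i], V[j])) ** 2
                    for (i, j), r in ratings.items())
    # Similar users (by shared tags) are pulled toward each other.
    user_pen = 0.5 * alpha * sum(s * sq_dist(U[i], U[k])
                                 for (i, k), s in user_sim.items())
    # Similar items (by shared tags) are pulled toward each other.
    item_pen = 0.5 * alpha * sum(s * sq_dist(V[j], V[l])
                                 for (j, l), s in item_sim.items())
    l2 = 0.5 * lam * (sum(dot(u, u) for u in U) + sum(dot(v, v) for v in V))
    return fit + user_pen + item_pen + l2


# Two users from different domains that share tags (similarity 1.0).
sim = {(0, 1): 1.0}
V_toy = [[1.0, 0.0]]
loss_close = regularized_loss({}, [[1.0, 0.0], [1.0, 0.0]], V_toy, sim, {}, lam=0.0)
loss_far = regularized_loss({}, [[1.0, 0.0], [0.0, 1.0]], V_toy, sim, {}, lam=0.0)
```

With no ratings and no L2 term, the penalty alone distinguishes the two configurations: identical representations for the tag-similar users cost nothing, while distant ones are penalized in proportion to `alpha`.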

Many researchers have proposed improved approaches based on this idea. Hao et al. (Hao et al., 2016) proposed that the number of shared tags between domains was limited and it was a waste to discard domain-dependent tags. They proposed an Enhanced Tag-induced Cross-Domain Collaborative Filtering (ETagiCDCF) algorithm to explore domain-dependent tags. It first grouped domain-dependent tags into clusters based on their co-occurrences with shared tags. The cross-domain user-to-user and item-to-item similarities were computed on these tag clusters. Zhang et al. (Zhang et al., 2019a) proposed that although there were a limited number of shared tags between domains, two non-identical tags in two domains might be semantically related. They proposed a cross-domain recommendation approach with semantic correlations in tagging systems (SCT). It utilized word2vec to analyze all tags and generated continuous vectors to capture contextual information and semantic similarities between tags. Both the intra-domain similarities and inter-domain similarities between users and items can be computed based on tags’ semantic similarities.

Instead of constructing the similarity matrices between users and items, Fang et al. (Fang et al., 2015) proposed to construct a tag co-occurrence matrix that captured the interrelatedness among tags. Then the rating matrix in each domain was tri-factorized into three parts, that is, the tag co-occurrence matrix, the user latent factors for tags, and the item latent factors for tags. The factorization process was optimized by forcing the generated co-occurrence matrix to be as close to the constructed co-occurrence matrix as possible.

3.2.3. Expand to Inter-domain recommendation

Kumar et al. (Kumar et al., 2014) proposed that the vocabularies of different domains may not match explicitly, but through the use of ontologies, it may be possible to derive semantic relationships between words of distinct domains. They proposed a semantic-clustering-based cross-domain (SCD) recommendation algorithm which first performed semantic clustering to obtain clusters of semantically equivalent words. SCD then got item-based topic distributions and user-based topic distributions. The user profiles were then mapped to the target domain and used to perform inter-domain recommendations.

Yang et al. (Yang et al., 2015) similarly argued that different domains use different vocabularies, so tags should be correlated at the semantic level rather than the lexical level. Therefore, they utilized online encyclopedias to achieve the semantic matching of tags and built a multi-partite graph to represent the similarities of objects in different domains. Finally, the similarity between a user and an item was measured based on graph propagation.

3.3. Applying Active Learning

3.3.1. The basic paradigm.

This method involves a certain amount of human effort within a fixed budget. For example, even when no explicit entity (i.e., user or item) correspondences between domains are known at first, some users or items may in fact overlap. It is expensive or time-consuming to recognize all such correspondences, but the partial mappings of a small number of users or items can be identified within a fixed budget. In addition, more ratings can be obtained through human effort, which alleviates the problem of data sparsity.

3.3.2. Approaches of this method.

Zhao et al. (Zhao et al., 2013) proposed a margin-based active learning approach that selected the entities in the target domain with low prediction certainty. They first applied a maximum-margin matrix factorization (MMMF) in the target domain to get the original user and item latent factors. They then queried the correspondences of entities with low prediction certainty in the source domain and an extended MMMF was proposed to transfer knowledge from the source domain to refine the user/item latent factors in the target domain. This method was later generalized to a unified framework. Two variants based on regularized low-rank matrix factorization (i.e., RLMF) and probabilistic matrix factorization (i.e., PMF) were proposed in their later works (Zhao et al., 2017).

Zhang et al. (Zhang et al., 2016) combined active learning with the previously proposed RMGM (Li et al., 2009b) approach. Their method first generated the original user and item latent factors as well as the shared cluster-level rating patterns using the RMGM algorithm. It then added the unrated user-item pairs of all domains to a pool and iteratively selected the most informative items from the pool to ask for users’ ratings. The selection criterion measured the global generalization error, which jointly considered domain-specific errors and domain-independent errors. The newly rated user-item pairs were added to the training set and used to re-train the RMGM algorithm.

4. Scenario 2: User partial overlap & Item non-overlap

| Method | Approach | Venue | Year | Recommendation Task (Dimension 1) | Recommendation Task (Dimension 2) |
|---|---|---|---|---|---|
| Collective Matrix Factorization | XPTRANS (Jiang et al., 2016) | AAAI | 2016 | intra-domain | multi-target |
| | JCSL (Rafailidis and Crestani, 2016) | PKDD | 2016 | intra-domain | multi-target |
| | CDCR (Rafailidis and Crestani, 2017) | CIKM | 2017 | intra-domain | multi-target |
| | KerKT (Zhang et al., 2019b) | - | 2019 | intra-domain | multi-target |
| | MPF (Yang et al., 2017) | SIGIR | 2017 | intra-domain | multi-target |
| Representation Combination of Overlapping Users | DTCDR (Zhu et al., 2019) | CIKM | 2019 | intra-domain | multi-target |
| | GA-DTCDR (Zhu et al., 2020) | IJCAI | 2020 | intra-domain | multi-target |
| | (Perera and Zimmermann, 2017) | Multimedia | 2017 | intra-domain | single-target |
| | (Perera and Zimmermann, 2020) | AAAI | 2020 | intra-domain | single-target |
| Embedding and Mapping | EMCDR (Man et al., 2017) | IJCAI | 2017 | inter-domain | single-target |
| | CDLFM (Wang et al., 2018) | DASFAA | 2018 | inter-domain | single-target |
| | RC-DFM (Fu et al., 2019) | AAAI | 2019 | inter-domain | single-target |
| | DCDIR (Bi et al., 2020b) | SIGIR | 2020 | inter-domain | single-target |
| | HCDIR (Bi et al., 2020a) | SIGIR | 2020 | inter-domain | single-target |
| | DCDCSR (Zhu et al., 2018) | IJCAI | 2018 | inter-domain | single-target |
| | SSCDR (Kang et al., 2019) | CIKM | 2019 | inter-domain | single-target |
| | CGN (Zhang et al., 2020) | IJCAI | 2020 | intra-domain | multi-target |
| Graph Neural Network-based Approaches | HeroGRAPH (Cui et al., 2020) | IJCAI | 2017 | intra-domain | multi-target |
| | NSCR (Wang et al., 2017) | SIGIR | 2017 | inter-domain | single-target |
| | ECHCDR (Li et al., 2020) | DASFAA | 2020 | intra/inter-domain | multi-target |
| Capturing Aspect Correlations | CATN (Zhao et al., 2020) | SIGIR | 2020 | inter-domain | multi-target |
Table 4. Method-based categorization of existing approaches for the recommendation scenario with user partial overlap and item non-overlap

In this recommendation scenario, some users have interactions in both domains while others appear in only one domain. It is symmetrical to the scenario with item partial overlap and user non-overlap, so the approaches discussed in this section can be applied to both scenarios. These approaches can be categorized into five classes: (1) collective matrix factorization, (2) representation combination of overlapping users, (3) embedding and mapping, (4) graph neural network-based approaches, and (5) capturing aspect correlations. Table 4 summarizes the method-based categorization of existing approaches for this recommendation scenario.

4.1. Collective Matrix Factorization

Figure 5. The schematic diagram of collective matrix factorization.

4.1.1. The basic paradigm.

Collective matrix factorization is a direct extension of the general matrix factorization method for the cross-domain recommendation problem. Fig. 5 shows the schematic diagram of collective matrix factorization. Similar to general matrix factorization, the rating matrix in each domain is factorized into a user latent feature matrix and an item latent feature matrix. The difference is that cross-domain knowledge is utilized to constrain the matrix factorization process within each domain.

4.1.2. Approaches of this method.

Jiang et al. (Jiang et al., 2016) proposed a semi-supervised transfer learning approach called XPTRANS in which they argued that the similarities between overlapped users were consistent across different domains. They first performed nonnegative matrix factorization on two domains. Then the user-based similarities were added as constraints to the matrix factorization process when learning the representations of users and items. Rafailidis et al. (Rafailidis and Crestani, 2016) proposed a joint cross-domain user clustering and similarity learning recommendation algorithm (JCSL) in which they jointly considered cluster-based and user-based cross-domain similarities. The resulting similarity matrices acted as social regularization terms during the matrix factorization process. Rafailidis et al. (Rafailidis and Crestani, 2017) further proposed a cross-domain recommendation approach with collaborative ranking (CDCR) which focused on the ranking performance when generating the recommendation. In this model, they learned the cross-domain user latent matrix to capture correlations of users in two domains and incorporated it as a constraint into the learning process of user/item latent factors.

Zhang et al. extended their previous study (Zhang et al., 2017) to the scenario of user partial overlap and proposed a cross-domain recommender system with kernel-induced knowledge transfer (KerKT) (Zhang et al., 2019b). The overlapping users were utilized to train the domain adaptation function to ensure the consistency of the transferred knowledge. Kernel-induced completion was conducted to measure the user similarities which were integrated into the matrix factorization process as constraints. Yang et al. (Yang et al., 2017) proposed a generative model of Multi-site Probabilistic Factorization (MPF) the basic idea of which was to model cross-site user preferences and site-specific user preferences simultaneously. For multiple-site users, their latent feature vectors in site consist of a common part and a site-specific part , i.e., . For an exclusive user in site , . In the generative process, each site had a different prior for the site-specific part of users’ latent feature vectors.

4.2. Representation Combination of Overlapping Users

4.2.1. The basic paradigm

Fig. 6 shows the schematic diagram of this method, which typically consists of three layers. The embedding layer generates embeddings for users and items in each domain. In the combination layer, the embeddings of overlapping users from both domains are combined to generate unified embeddings for the overlapping users. Finally, the prediction layer takes the embeddings of both distinct users and overlapping users to train the recommendation model on each domain separately.

Figure 6. The schematic diagram of representation combination of overlapping users.

4.2.2. Approaches of this method

Zhu et al. (Zhu et al., 2019) first proposed this paradigm for dual-target cross-domain recommendation (DTCDR), the core idea of which was to share the knowledge of overlapping users across domains. Specifically, in the embedding layer, it generated embeddings for users and items in each domain from both rating and content information. In the combination layer, the embeddings of overlapping users were combined by three different combination operations, i.e., concatenation, max-pooling, and average-pooling. Zhu et al. (Zhu et al., 2020) then proposed a Graphical and Attentional framework for Dual-Target Cross-Domain Recommendation (GA-DTCDR). In the embedding layer, it applied a graph embedding technique (i.e., Node2vec) to generate more representative user and item embeddings in each domain. In the combination layer, it employed an element-wise attention mechanism to more effectively combine the representations of overlapping users.
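The three combination operations named above (concatenation, max-pooling, and average-pooling) can be illustrated with a minimal sketch for a single overlapping user. The function name `combine` and the toy embeddings are hypothetical; the embeddings are assumed to have already been produced by the embedding layer of each domain.

```python
from typing import List


def combine(emb_a: List[float], emb_b: List[float], mode: str) -> List[float]:
    """Combine one overlapping user's embeddings from domain A and domain B."""
    if mode == "concat":
        # Keeps all information but doubles the dimensionality.
        return emb_a + emb_b
    if len(emb_a) != len(emb_b):
        raise ValueError("pooling requires embeddings of equal length")
    if mode == "max":
        # Element-wise maximum keeps the strongest signal per dimension.
        return [max(x, y) for x, y in zip(emb_a, emb_b)]
    if mode == "avg":
        # Element-wise mean treats both domains as equally informative.
        return [(x + y) / 2.0 for x, y in zip(emb_a, emb_b)]
    raise ValueError(f"unknown combination mode: {mode}")


user_in_a = [1.0, 4.0]
user_in_b = [3.0, 2.0]
combined = {m: combine(user_in_a, user_in_b, m) for m in ("concat", "max", "avg")}
```

An element-wise attention mechanism, as used in GA-DTCDR, can be viewed as replacing the fixed 0.5/0.5 weights of the average with learned per-dimension weights; that learning step is not shown here.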

Some approaches of this method focused on capturing the dynamic nature of user preferences. Perera et al. proposed to utilize time-stamped, cross-network information for recommendation to both new users (i.e., distinct users) and existing users (i.e., overlapping users) (Perera and Zimmermann, 2017). In the embedding layer, they generated users’ topical distributions by topic modeling in each domain. Before the combination, two transfer functions mapped user preferences from the topical space to the target network user space. In the combination layer, the representations of overlapping users were obtained by fusing user preferences from both domains. Later, they further proposed a time-aware unified cross-network solution (Perera and Zimmermann, 2020) which, in the embedding layer, modeled user preferences at the short-term, long-term, and long-short-term levels in each domain. Then, in the combination layer, the three-level preference representations of overlapping users (referred to as existing users in their paper) in both domains were integrated to obtain the users’ final representations. The representations of distinct users (referred to as new users) were generated directly by fusing the three representations from the source domain.

4.3. Embedding and Mapping

4.3.1. The basic paradigm

This is an inter-domain recommendation method in which one domain is treated as the source domain and the other as the target domain. Fig. 7 shows the schematic diagram of this method. There are three main steps, i.e., latent factor modeling, latent space mapping, and cross-domain recommendation. During latent factor modeling, the aim is to generate user and item latent factors in each domain. During latent space mapping, the aim is to train a mapping function that establishes the relationship between the latent spaces of the two domains:

(8)

During the cross-domain recommendation process, for a user who only has a latent factor in the source domain, the mapping function generates the user’s latent factor in the target domain:

(9)

With this mapped latent factor, recommendations in the target domain can be made for this user.
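The mapping and cold-start steps can be illustrated with a minimal sketch. It assumes that latent factors have already been learned in each domain and uses a plain linear mapping fitted by gradient descent on a squared-error loss over the overlapping users; the function names, the learning rate, and the toy factors are hypothetical, and the sketch does not reproduce any particular published model.

```python
from typing import List

Vec = List[float]


def apply_mapping(M: List[Vec], u: Vec) -> Vec:
    """Map a source-domain latent factor into the target latent space."""
    return [sum(M[r][c] * u[c] for c in range(len(u))) for r in range(len(M))]


def fit_linear_mapping(src: List[Vec], tgt: List[Vec],
                       lr: float = 0.1, steps: int = 500) -> List[Vec]:
    """Fit M so that M @ src[i] approximates tgt[i] for overlapping users."""
    d_s, d_t, n = len(src[0]), len(tgt[0]), len(src)
    M = [[0.0] * d_s for _ in range(d_t)]
    for _ in range(steps):
        grad = [[0.0] * d_s for _ in range(d_t)]
        for u_s, u_t in zip(src, tgt):
            pred = apply_mapping(M, u_s)
            for r in range(d_t):
                err = pred[r] - u_t[r]
                for c in range(d_s):
                    # Gradient of the mean squared error w.r.t. M[r][c].
                    grad[r][c] += 2.0 * err * u_s[c] / n
        for r in range(d_t):
            for c in range(d_s):
                M[r][c] -= lr * grad[r][c]
    return M


# Overlapping users: latent factors known in both domains (toy values).
source_factors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
target_factors = [[2.0, 0.0], [0.0, 2.0], [2.0, 2.0]]
M = fit_linear_mapping(source_factors, target_factors)

# Cold-start user: only a source-domain latent factor is available.
cold_start_target = apply_mapping(M, [0.5, 1.0])
```

The mapped factor can then be scored against target-domain item factors, for example by inner product. A nonlinear mapping such as a multi-layer perceptron would replace `apply_mapping` and its gradient, while the training signal from overlapping users stays the same.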

Figure 7. The schematic diagram of embedding and mapping.

4.3.2. Approaches of this method

The embedding and mapping framework for cross-domain recommendation (EMCDR), which targets the inter-domain recommendation problem, was first proposed by Man et al. (Man et al., 2017). For latent factor modeling, EMCDR applied Matrix Factorization (MF) and Bayesian Personalized Ranking (BPR), respectively, to generate user and item latent factors. For latent space mapping, it utilized both a linear function and a nonlinear function based on a Multi-Layer Perceptron (MLP) as the mapping function. The objective is to make the source-domain latent factors of overlapping users, once mapped by the mapping function, approximate their corresponding latent factors in the target domain.

Many researchers have improved on this idea, mainly in two aspects: the latent factor modeling process and the latent space mapping process.

For latent factor modeling, Wang et al. (Wang et al., 2018) proposed a Cross-Domain Latent Feature Mapping (CDLFM) model. It first defined three similarity measurements on users’ rating behaviors. During latent factor modeling, the similarity values were embedded into the matrix factorization process as constraints. Fu et al. (Fu et al., 2019) proposed a Review and Content-based Deep Fusion Model (RC-DFM). It extended stacked denoising autoencoders to effectively fuse review text and item contents with the rating matrix to generate user and item representations with more semantic information. Bi et al. (Bi et al., 2020b, a) proposed to construct a heterogeneous information network and took the interaction sequence information into consideration to learn effective user/item representations in each domain. The proposed approaches were shown to be effective in cross-domain insurance recommendation.

For latent space mapping, Zhu et al. (Zhu et al., 2018) proposed a Deep framework for both Cross-Domain and Cross-System Recommendation (DCDCSR) which took into account rating sparsity degrees of individual users/items to generate benchmark factor matrices. The mapping function was trained to map the latent factor matrices of users and items to fit the benchmark factor matrices. Kang et al. (Kang et al., 2019) proposed that in existing EMCDR-based approaches, the training of mapping functions only used overlapping users, so their performance was sensitive to the number of overlapping users. After an in-depth analysis of the Amazon dataset, they proposed that in real-world datasets, the number of overlapping users was always small, which limited the performance of existing approaches. Therefore, they proposed a Semi-Supervised framework for Cross-Domain Recommendation (SSCDR) to utilize both the overlapping users and source-domain items to train the mapping function.

4.3.3. Expand to intra-domain recommendation

The approaches described above are for inter-domain recommendation, as they all learn a mapping function between different domains and map the users’ representation in the source domain to the target domain. Zhang et al. (Zhang et al., 2020) further applied this method to intra-domain recommendation. They proposed that users’ interests and states may vary over time and it was important to quickly capture these changes for timely and accurate recommendations. Therefore, they first divided the sequence of users’ interacted items into itemsets based on timestamps of interactions to denote users’ interests during a period. Instead of capturing the mapping relationships between user representations in different domains like previous EMCDR-based approaches, the proposed cycle generation network (CGN) learned a user’s personalized dual-direction mapping function between the representation of her interaction itemsets in different domains at the same temporal period.

4.4. Graph Neural Network-based Approaches

Figure 8. The schematic diagram of graph neural network-based approaches.

4.4.1. The basic paradigm.

The basic paradigm of this method is to build shared graphs that represent the relationships among users, items, attributes, and other factors (i.e., nodes in the graphs), and to learn a representation for each node through graph representation learning. The learned representations have been shown to capture the high-order and non-linear dependencies between nodes and can be used for subsequent recommendations. Since a graph shared among domains integrates the information of the different domains, graph representation learning embeds the nodes from different domains in the same latent space and, therefore, allows cross-domain information to be transferred. Fig. 8 shows the schematic diagram of this method.
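How a shared graph lets information flow between domains can be illustrated with a deliberately simplified sketch. It assumes a toy graph with one user who interacts only in domain A (`u1`), one overlapping user (`u2`), and one item per domain, and it uses plain mean aggregation over neighbors with self-loops; real graph neural networks add learned weight matrices and non-linearities, which are omitted here. All names are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def propagate(embeddings: Dict[str, List[float]],
              edges: List[Tuple[str, str]],
              layers: int) -> Dict[str, List[float]]:
    """Repeatedly replace each node embedding by the mean over itself and its neighbors."""
    neighbors: Dict[str, Set[str]] = {n: {n} for n in embeddings}  # self-loops
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    current = {n: list(v) for n, v in embeddings.items()}
    for _ in range(layers):
        updated = {}
        for node, ns in neighbors.items():
            dim = len(current[node])
            updated[node] = [sum(current[m][k] for m in ns) / len(ns)
                             for k in range(dim)]
        current = updated
    return current


# One-hot initial embeddings; dimension order: u1, u2, item_a, item_b.
initial = {
    "u1": [1.0, 0.0, 0.0, 0.0],      # interacts only in domain A
    "u2": [0.0, 1.0, 0.0, 0.0],      # overlapping user
    "item_a": [0.0, 0.0, 1.0, 0.0],  # item in domain A
    "item_b": [0.0, 0.0, 0.0, 1.0],  # item in domain B
}
shared_edges = [("u1", "item_a"), ("u2", "item_a"), ("u2", "item_b")]

one_layer = propagate(initial, shared_edges, layers=1)
three_layers = propagate(initial, shared_edges, layers=3)
```

In this toy graph `u1` is three hops away from `item_b`, so the domain B signal reaches `u1` only after three propagation steps, which is the sense in which stacked layers capture high-order dependencies.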

4.4.2. Approaches of this method.

Wang et al. (Wang et al., 2017) first applied graph neural networks to cross-domain social recommendation and proposed a Neural Social Collaborative Ranking (NSCR) approach. It aims to recommend items in an information-oriented domain to users in a social-oriented domain, which can be seen as inter-domain recommendation. NSCR first utilizes an attribute-aware deep collaborative filtering model in the information-oriented domain to learn user-item interactions. The embeddings of overlapping users are then directly transferred to the social-oriented domain. The cross-domain knowledge is transferred by a multi-layer graph convolution network that propagates the representations of overlapping users to non-overlapping users. Finally, the embeddings of non-overlapping users in the social-oriented domain and the embeddings of items in the information-oriented domain are used for recommendation.

Cui et al. argued that previous works process user behaviors within each domain, which is an indirect way to incorporate cross-domain information. They proposed a heterogeneous graph framework (HeroGRAPH) (Cui et al., 2020) that collects user behaviors from all domains into a shared graph to directly model users’ cross-domain behaviors. Information within each domain is utilized for within-domain modeling, and graph convolution operations with recurrent attention are applied on the shared graph for cross-domain modeling. The within-domain and cross-domain representations of users and items are combined to perform the final recommendation. Li et al. (Li et al., 2020) further proposed an embedding content and heterogeneous network (ECHCDR) which incorporates an adversarial learning algorithm. It first utilizes Doc2vec to generate content representations of users/items, which are concatenated with adjacency representations to form the initial representations of users/items. It then simultaneously trains a generator and a discriminator to learn suitable representations of users and items. Finally, both intra-domain and inter-domain recommendation can be performed based on the inner product of the learned representations.

4.5. Capturing Aspect Correlations

4.5.1. The basic paradigm.

Apart from the previously introduced embedding and mapping-based approaches, capturing aspect correlations is another method for inter-domain recommendation. It assumes users’ preferences are multi-faceted and aims at modeling fine-grained semantic aspects and exploring their mutual relationships across domains. The score prediction can be performed by matching the aspect features of a user in one domain and an item in the other domain.

4.5.2. Approaches of this method.

Zhao et al. (Zhao et al., 2020) proposed a cross-domain recommendation framework via aspect transfer network (CATN) to capture users’ multi-faceted and fine-grained preferences. It first represents a user by a user document that contains all reviews written by this user, and an item by an item document that contains all reviews it receives. It then generates abstract aspect features for each user and each item from their documents. The aspect features of overlapping users are utilized to identify the global cross-domain aspect correlations. Inter-domain recommendation can be performed by utilizing the user’s review document in the source domain and the item’s review document in the target domain, and vice versa. Specifically, the rating prediction is obtained by aggregating the semantic matching scores between pairs of aspects in the aspect features of a user and an item.

5. Scenario 3: User full overlap & Item non-overlap

| Method | Approach | Venue | Year | Recommendation Task (Dimension 1) | Recommendation Task (Dimension 2) |
|---|---|---|---|---|---|
| Collective Matrix Factorization | CMF (Singh and Gordon, 2008) | KDD | 2008 | intra-domain | multi-target |
| | SoRec (Ma et al., 2008) | CIKM | 2008 | intra-domain | multi-target |
| | CTR+RBF (Xin et al., 2015) | IJCAI | 2015 | intra-domain | multi-target |
| | LSCD (Zhao et al., 2018) | DASFAA | 2018 | intra-domain | multi-target |
| Tensor Factorization | CDTF (Hu et al., 2013) | WWW | 2013 | intra-domain | multi-target |
| | HST (Liu et al., 2015) | ICML | 2015 | intra-domain | multi-target |
| | RB-JTF (Song et al., 2017) | DASFAA | 2017 | intra-domain | multi-target |
| Factorization Machines | FM-MCMC (Loni et al., 2014) | ECIR | 2014 | intra-domain | single-target |
| | CoFM (Li et al., 2019) | AAAI | 2019 | intra-domain | single-target |
| Deep Sharing User Representations | MVDNN (Elkahky et al., 2015) | WWW | 2015 | intra-domain | multi-target |
| | CCCFNet (Lian et al., 2017) | WWW | 2017 | intra-domain | multi-target |
| | GCBAN (He et al., 2018a) | ICDM | 2018 | intra-domain | multi-target |
| | PPGN (Zhao et al., 2019) | CIKM | 2019 | intra-domain | multi-target |
| | DeepAPF (Yan et al., 2019) | IJCAI | 2019 | intra-domain | multi-target |
| | EATNN (Chen et al., 2019) | SIGIR | 2019 | intra-domain | multi-target |
| Deep Dual Knowledge Transfer | CoNet (Hu et al., 2018a) | CIKM | 2018 | intra-domain | multi-target |
| | ACDN (Liu et al., 2020b) | WWW | 2020 | intra-domain | multi-target |
| | DDTCDR (Li and Tuzhilin, 2020) | WSDM | 2020 | intra-domain | multi-target |
| | π-Net (Ma et al., [n.d.]) | SIGIR | 2019 | intra-domain | multi-target |
| | BiTGCF (Liu et al., 2020a) | CIKM | 2020 | intra-domain | multi-target |
| Deep Integration of Source Domain Information | MTNet (Hu et al., 2018b) | KDD | 2018 | intra-domain | single-target |
| | NATR (Gao et al., 2019) | WWW | 2019 | intra-domain | single-target |
| | MF (Ma et al., 2018) | IJCAI | 2018 | intra-domain | single-target |
| | (Perera and Zimmermann, 2018) | IJCAI | 2018 | intra-domain | single-target |

Table 5. Method-based categorization of existing approaches for the recommendation scenario with user full overlap and item non-overlap

This category of recommendation refers to the scenario in which all users have interactions in all domains while items are disjoint among domains. Such users are sometimes called multi-homed users and have been widely studied in recent years. This recommendation scenario is symmetrical to the scenario with item full overlap and user non-overlap, so the approaches discussed in this section can be applied to both scenarios. The approaches can be divided into six classes: (1) collective matrix factorization, (2) tensor factorization, (3) factorization machines, (4) deep sharing user representations, (5) deep dual knowledge transfer, and (6) deep integration of source domain information. The first three classes generally have shallow structures, while the last three are deep learning-based approaches. Table 5 summarizes the method-based categorization of existing approaches for this recommendation scenario.

5.1. Collective Matrix Factorization

5.1.1. The basic paradigm.

As described in section 4.1, collective matrix factorization is a direct extension of the traditional matrix factorization method to the cross-domain recommendation problem. The difference is that cross-domain knowledge is utilized to constrain the matrix factorization process within each domain. To some extent, the approaches discussed in section 4.1 are applicable to this recommendation scenario. We omit the schematic diagram as it is the same as Fig. 5. In the following, we discuss approaches that are specifically proposed for this recommendation scenario.

5.1.2. Approaches of this method.

Singh et al. (Singh and Gordon, 2008) first proposed the collective matrix factorization (CMF) model which collectively factorized the rating matrices of two domains into user representations and item representations. The constraint was that the user representations are shared across the domains. Ma et al. (Ma et al., 2008) proposed a probabilistic matrix factorization approach for social recommendation (SoRec). The rating matrix was factorized into a user feature matrix and an item feature matrix, while the social network matrix was factorized into a user feature matrix and a factor feature matrix. Similar to CMF, the cross-domain constraint was that the user feature matrix was shared between the two factorizations.
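The shared-user constraint can be illustrated with a minimal sketch in which two rating matrices are factorized jointly, with one user factor matrix used by both domains and a separate item factor matrix per domain. It uses plain stochastic gradient descent on a squared-error loss with L2 regularization as an assumed training procedure; the function names, hyperparameters, and toy ratings are hypothetical and do not reproduce the original CMF or SoRec algorithms.

```python
import random
from typing import Dict, List, Tuple

Ratings = Dict[Tuple[int, int], float]  # (user index, item index) -> rating


def dot(a: List[float], b: List[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def squared_error(ratings: Ratings, U, V) -> float:
    return sum((dot(U[i], V[j]) - r) ** 2 for (i, j), r in ratings.items())


def train_shared_user_mf(ratings_a: Ratings, ratings_b: Ratings,
                         n_users: int, n_items_a: int, n_items_b: int,
                         dim: int = 2, lr: float = 0.05, reg: float = 0.01,
                         epochs: int = 200, seed: int = 0):
    """Jointly factorize two domains with a single shared user factor matrix."""
    rng = random.Random(seed)

    def init(n: int) -> List[List[float]]:
        return [[rng.uniform(0.1, 0.5) for _ in range(dim)] for _ in range(n)]

    U, V_a, V_b = init(n_users), init(n_items_a), init(n_items_b)
    for _ in range(epochs):
        # The same U is updated by ratings from both domains.
        for ratings, V in ((ratings_a, V_a), (ratings_b, V_b)):
            for (i, j), r in ratings.items():
                err = dot(U[i], V[j]) - r
                for k in range(dim):
                    u_k, v_k = U[i][k], V[j][k]
                    U[i][k] -= lr * (err * v_k + reg * u_k)
                    V[j][k] -= lr * (err * u_k + reg * v_k)
    return U, V_a, V_b


ratings_a = {(0, 0): 5.0, (0, 1): 1.0, (1, 0): 4.0, (2, 1): 5.0}
ratings_b = {(0, 0): 4.0, (1, 0): 5.0, (1, 1): 2.0, (2, 1): 4.0}

U0, Va0, Vb0 = train_shared_user_mf(ratings_a, ratings_b, 3, 2, 2, epochs=0)
U, Va, Vb = train_shared_user_mf(ratings_a, ratings_b, 3, 2, 2)
loss_before = squared_error(ratings_a, U0, Va0) + squared_error(ratings_b, Vb0 and U0, Vb0)
loss_after = squared_error(ratings_a, U, Va) + squared_error(ratings_b, U, Vb)
```

Because every user row in `U` receives gradient updates from both domains, sparse feedback in one domain is complemented by feedback in the other; the cost is that each user is forced to have identical factors in both domains.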

However, both of the above approaches force the same user to have exactly the same representation in different domains. That is, they assume that users’ characteristics and preferences are consistent across domains, which ignores users’ domain-specific characteristics. To overcome this shortcoming, Xin et al. (Xin et al., 2015) relaxed the restriction by introducing nonlinear mapping relationships between user representations: they trained two nonlinear mapping functions, one in each direction, so that a user’s representation in one domain can be mapped to the representation in the other. Zhao et al. (Zhao et al., 2018; Huang et al., 2019) proposed a low-rank and sparse cross-domain (LSCD) recommendation approach in which the user latent feature matrix of each domain was divided into two parts, a domain-shared feature matrix and a domain-specific feature matrix. The user representation is the sum of these two parts.
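The shared-user constraint of CMF can be illustrated with a minimal sketch. It assumes two small dense rating matrices over the same users, with 0 marking a missing rating, and plain full-batch gradient descent. The function names `collective_mf` and `observed_rmse`, the toy ratings, and all hyperparameters are hypothetical and are not taken from the cited papers.

```python
import numpy as np

def collective_mf(R_a, R_b, k=4, lr=0.01, reg=0.05, epochs=500, seed=0):
    """Jointly factorize two rating matrices that share the same users.

    R_a: (n_users, n_items_a), R_b: (n_users, n_items_b); 0 marks a missing rating.
    Returns shared user factors U and per-domain item factors V_a, V_b.
    """
    rng = np.random.default_rng(seed)
    U = 0.1 * rng.standard_normal((R_a.shape[0], k))
    V_a = 0.1 * rng.standard_normal((R_a.shape[1], k))
    V_b = 0.1 * rng.standard_normal((R_b.shape[1], k))
    M_a, M_b = (R_a > 0), (R_b > 0)  # masks of observed entries
    for _ in range(epochs):
        E_a = M_a * (R_a - U @ V_a.T)  # residuals on observed entries only
        E_b = M_b * (R_b - U @ V_b.T)
        # U receives gradients from both domains: this is the shared-user constraint.
        grad_U = E_a @ V_a + E_b @ V_b - reg * U
        grad_Va = E_a.T @ U - reg * V_a
        grad_Vb = E_b.T @ U - reg * V_b
        U += lr * grad_U
        V_a += lr * grad_Va
        V_b += lr * grad_Vb
    return U, V_a, V_b

def observed_rmse(R, U, V):
    """Root mean squared error over the observed (non-zero) entries of R."""
    mask = R > 0
    return float(np.sqrt(np.mean((R - U @ V.T)[mask] ** 2)))

# Toy data: 4 users, 3 items in domain A, 2 items in domain B (hypothetical values).
R_a = np.array([[5, 3, 0], [4, 0, 1], [1, 1, 5], [0, 2, 4]], dtype=float)
R_b = np.array([[4, 0], [5, 1], [0, 5], [1, 4]], dtype=float)
U, V_a, V_b = collective_mf(R_a, R_b)
```

A relaxed variant in the spirit of LSCD would replace the single matrix `U` with the sum of a shared matrix and a per-domain matrix; that variant is not implemented here.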

5.2. Tensor Factorization

Figure 9. The schematic diagram of tensor factorization-based approaches.

5.2.1. The basic paradigm.

In essence, tensor factorization is the higher-order generalization of matrix factorization. When only user factors and item factors are considered, the interaction information within each domain can be represented by a matrix. However, when domain factors, time factors, and aspect factors are further considered, the information within each domain changes from a matrix to a tensor. The core idea is to factorize the tensor into user representations, item representations, and other factor representations (i.e., domain, time, and aspect). By multiplying the user representations and domain representations, the domain-specific representations can be obtained. Similarly, time-specific representations and aspect-specific representations can be obtained. The final prediction is derived from the product of multiple related representations. The CP model, i.e., CANDECOMP/PARAFAC (canonical decomposition / parallel factor analysis), is the most widely used tensor factorization algorithm. Fig. 9 shows the schematic diagram of tensor factorization-based approaches.
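As a rough illustration of how a CP-style model produces a prediction, the sketch below rebuilds a (user, item, domain) tensor from three factor matrices and shows that multiplying user factors by domain factors yields domain-specific user representations. The names `cp_reconstruct` and `domain_specific_users` are hypothetical, the factors are random rather than learned, and a single item-factor matrix is assumed for simplicity.

```python
import numpy as np

def cp_reconstruct(U, V, D):
    """Rebuild a (user, item, domain) tensor from rank-k CP factors.

    U: (n_users, k), V: (n_items, k), D: (n_domains, k).
    Entry [u, i, d] equals sum_k U[u, k] * V[i, k] * D[d, k].
    """
    return np.einsum("uk,ik,dk->uid", U, V, D)

def domain_specific_users(U, D, d):
    """Scale user factors element-wise by the factors of domain d."""
    return U * D[d]

# Random factors stand in for learned ones (assumption for illustration only).
rng = np.random.default_rng(0)
U = rng.standard_normal((3, 2))
V = rng.standard_normal((4, 2))
D = rng.standard_normal((2, 2))
X = cp_reconstruct(U, V, D)
```

Each domain slice `X[:, :, d]` equals `domain_specific_users(U, D, d) @ V.T`, which is the "product of related representations" described above.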

5.2.2. Approaches of this method.

Hu et al. (Hu et al., 2013) proposed a Cross-Domain Triadic Factorization (CDTF) model which took into consideration the full triadic relation user-item-domain to reveal user preferences on items within various domains in depth. In addition to a user-factor matrix shared among domains and an exclusive item-factor matrix for each domain, CDTF also generated a domain-factor matrix to express the traits of each domain. Recommendations were then made based on the results of triadic interactions among user, item, and domain factors. Song et al. proposed to exploit the aspect factors extracted from review text to improve the performance of cross-domain recommendation. They extracted fine-grained aspect-level user preferences and the degrees of concern toward different aspects of items as two tensors. A review-based joint tensor factorization (RB-JTF) approach (Song et al., 2017) was proposed, in which tensor factorization was applied simultaneously in each domain. The rating matrices were factorized into user, item, and aspect latent factors, while knowledge transfer was realized by sharing user latent factors and transferring aspect latent factors.

Liu et al. argued that previous studies only identified and transferred the linearly correlated knowledge between domains. They proposed a new knowledge transfer technique, called hyper-structure transfer (HST) (Liu et al., 2015), that captured the non-linear correlations of knowledge between domains. Compared with previous approaches that directly share the cluster-level rating patterns between domains (see section 3.1), HST first generated a more complex structure, the hyper-structure, and required the transferred rating pattern matrices in each domain to be projections of it. To obtain the hyper-structure, canonical polyadic decomposition of tensors was applied.

5.3. Factorization Machines

5.3.1. The basic paradigm.

Factorization Machines (FM) is a machine learning algorithm based on matrix factorization that aims to solve the problem of feature combination in large-scale sparse data. The advantages of FM fit well with the problem scenario of recommendation systems, and FM has proved to be an effective algorithm in practice. Formula (10) shows a second-order factorization machine model, which can be easily extended to higher-order models.

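$$\hat{y}(\mathbf{x}) = w_0 + \sum_{i=1}^{n} w_i x_i + \sum_{i=1}^{n} \sum_{j=i+1}^{n} w_{ij} x_i x_j \tag{10}$$

where $n$ represents the dimension of the feature vector, $x_i$ denotes the $i$-th feature, $w_0$ and $w_i$ are the bias and the first-order weights, and $w_{ij}$ is the combination parameter representing the importance of the combination feature $x_i x_j$.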

FM has already been used for single-domain recommendation, but for cross-domain recommendation it is necessary to extend FM so that it can incorporate user interaction patterns from different domains.
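A second-order FM can be sketched in a few lines. In the standard FM formulation, each pairwise combination weight is factorized as the inner product of two latent vectors, one per feature, which lets the pairwise term be evaluated in time linear in the number of features. The function names and the random parameters below are hypothetical; for the cross-domain case, the feature vector `x` would simply be extended with features derived from the source domain.

```python
import numpy as np

def fm_predict(x, w0, w, V):
    """Second-order FM score for one feature vector x.

    x: (n,), w0: scalar bias, w: (n,) linear weights, V: (n, k) latent factors.
    Pairwise weights are w_ij = <V[i], V[j]>, evaluated in O(n * k).
    """
    linear = w0 + w @ x
    xv = V.T @ x                    # (k,): sum_i V[i, f] * x_i
    x2v2 = (V ** 2).T @ (x ** 2)    # (k,): sum_i V[i, f]^2 * x_i^2
    pairwise = 0.5 * np.sum(xv ** 2 - x2v2)
    return float(linear + pairwise)

def fm_predict_naive(x, w0, w, V):
    """Reference implementation with the explicit double sum over i < j."""
    score = w0 + w @ x
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            score += (V[i] @ V[j]) * x[i] * x[j]
    return float(score)

# Random parameters stand in for learned ones (assumption for illustration only).
rng = np.random.default_rng(0)
n, k = 6, 3
x = rng.standard_normal(n)
w0, w, V = 0.1, rng.standard_normal(n), rng.standard_normal((n, k))
fast_score = fm_predict(x, w0, w, V)
naive_score = fm_predict_naive(x, w0, w, V)
```

The two functions compute the same value; the first relies on the identity that the sum over pairs equals half of the squared sum minus the sum of squares, taken per latent dimension.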

5.3.2. Approaches of this method.

Loni et al. (Loni et al., 2014) proposed an extension of FM named FM-MCMC that incorporated user domain-specific interaction patterns from the source domain to expand feature vectors of the target domain. The expanded feature vectors served as input to the general FM model. Specifically, a domain-dependent real-valued function was defined to control the amount of knowledge that was transferred from the source domain. Li et al. (Li et al., 2019) designed coupled factorization machines (CoFM) and proposed that the coupled fields of coupled datasets contained shared characteristics as well as domain-specific uniqueness. CoFM, therefore, allowed the latent vectors of the coupled field between two domains to have a shared part and a domain-specific part.

5.4. Deep Sharing User Representations

5.4.1. The basic paradigm

The schematic diagram of this class of approaches is shown in Fig. 10. The core idea is to first generate initial embeddings for the users and for the items of each of the two domains. These initial embeddings then enter three separate deep neural network modules, one for users and one for the items of each domain, to learn the latent representations of users and items. The module that deals with users is shared between the two domains to achieve deep sharing of user representations.

5.4.2. Approaches of this method

Elkahky et al. (Elkahky et al., 2015) proposed a Multi-View Deep Neural Network (MVDNN) in which the initial embeddings were generated from users’ features and items’ attributes. User embeddings were the input of one view, while item embeddings were the input of another two separate views. All the views shared the same multi-layer perceptron structure to generate representations of users and items. The two domains shared the same user view, thus achieving the goal of deep sharing of user representations. Following this approach, Lian et al. and He et al., respectively, proposed a cross-domain content-boosted collaborative filtering neural network (CCCFNet) (Lian et al., 2017) and a general cross-domain framework via a Bayesian neural network (GCBAN) (He et al., 2018a). The core idea of these two approaches was to take both collaborative filtering and content-based factors into account when generating the initial embeddings of users and items in each view. Thus, the initial embeddings consist of two parts: one part generated from features and the other part generated from one-hot encoding. The structure of each view is the same as in MVDNN (Elkahky et al., 2015).

Figure 10. The schematic diagram of deep sharing user representations.

Zhao et al. (Zhao et al., 2019) proposed a Preference Propagation GraphNet (PPGN) which constructed a cross-domain preference matrix to model the interactions of different domains as a whole. The high-order user preferences were propagated and integrated through multiple graph convolution and propagation layers. Compared with previous works, the multi-layer perceptron was replaced by multiple graph convolutional layers.

Some researchers improved previous works by considering both users’ domain-shared representations and domain-specific representations. Yan et al. (Yan et al., 2019) proposed a Deep Attentive Probabilistic Factorization (DeepAPF) model in which the user embeddings were initialized in three parts: one domain-shared part and two domain-specific parts, one for each domain. The domain-shared representations captured the cross-domain commonality of user interests, while the domain-specific representations captured the site-peculiarity of user interests. Attention mechanisms were utilized to fuse these two kinds of user representations into the users’ final representations in each domain. Similarly, Chen et al. (Chen et al., 2019) proposed an efficient adaptive transfer neural network (EATNN) for social-aware recommendation, which jointly performed item recommendation and friend recommendation. It initialized user embeddings as a domain-shared part, an item domain-specific part, and a social domain-specific part. Two attention-based kernels were utilized to automatically estimate the difference of mutual influences between the item domain and the social domain and to combine the shared and domain-specific representations in each domain.
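The attentive fusion of a domain-shared and a domain-specific user representation can be sketched as a softmax-weighted sum. This is only an illustration of the general idea, not the attention design of DeepAPF or EATNN: the function name `attentive_fuse`, the scoring vector `h`, and the `tanh` scoring function are all assumptions.

```python
import numpy as np

def attentive_fuse(u_shared, u_specific, h):
    """Fuse shared and domain-specific user vectors with softmax attention.

    u_shared, u_specific: (d,) embeddings; h: (d,) hypothetical scoring vector.
    Returns the fused (d,) representation and the two attention weights.
    """
    parts = np.stack([u_shared, u_specific])   # (2, d)
    scores = np.tanh(parts) @ h                # (2,): one score per part
    scores = scores - scores.max()             # shift for numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()
    return weights @ parts, weights

# Hypothetical embeddings for one user in one domain.
u_s = np.array([1.0, 0.0, 2.0])
u_d = np.array([0.0, 1.0, -1.0])
h = np.array([0.5, -0.5, 1.0])
fused, weights = attentive_fuse(u_s, u_d, h)
```

Because the weights are non-negative and sum to one, the fused vector always lies between the shared and the domain-specific representation.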

5.5. Deep Dual Knowledge Transfer

5.5.1. The basic paradigm

Approaches of this class have symmetrical model structures: each domain has the same deep structure with multiple hidden layers. Because the structures are symmetrical, these methods can perform recommendations on multiple domains simultaneously and are therefore suited to multi-target recommendation. Fig. 11 shows the schematic diagram of these approaches. Consider a deep neural network with the same number of layers in each of the two domains. The deep dual knowledge transfer is realized by fusing the output of the previous layer in one domain with the output of the previous layer in the other domain to form the input of the next layer, in both directions. The inputs of the first layer are the original embeddings of users and items, and the outputs of the last layer are used for recommendation.

Figure 11. The schematic diagram of deep dual knowledge transfer.

5.5.2. Approaches of this method

Hu et al. (Hu et al., 2018a) first proposed a collaborative cross-network (CoNet) that introduced cross-connections from one base network to another. The inputs in one domain are the one-hot embeddings of a user and an item, while the inputs in the other domain are the one-hot embeddings of the same user and another item. The one-hot embeddings in each domain are concatenated to act as the input of the first layer of that domain’s network. The deep dual knowledge transfer is achieved in multi-layer feedforward networks by adding dual connections. The formulas are as follows:

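$$\mathbf{a}_A^{l+1} = \sigma\left(\mathbf{W}_A^{l}\mathbf{a}_A^{l} + \mathbf{H}^{l}\mathbf{a}_B^{l}\right), \qquad \mathbf{a}_B^{l+1} = \sigma\left(\mathbf{W}_B^{l}\mathbf{a}_B^{l} + \mathbf{H}^{l}\mathbf{a}_A^{l}\right) \tag{11}$$

where $\mathbf{a}_A^{l}$ and $\mathbf{a}_B^{l}$ denote the inputs of the $l$-th layer in the two domains (labelled $A$ and $B$ here), $\sigma$ is the activation function, $\mathbf{W}_A^{l}$ and $\mathbf{W}_B^{l}$ are domain-specific parameters while $\mathbf{H}^{l}$ are domain-shared parameters.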
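The dual connections can be sketched as a single layer in which each domain combines its own previous activation, through a domain-specific matrix, with the other domain's previous activation, through a shared matrix. The names `cross_layer`, `a_A`, `a_B`, `W_A`, `W_B`, and `H`, the ReLU activation, and the random weights are assumptions made for illustration.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def cross_layer(a_A, a_B, W_A, W_B, H):
    """One dual-transfer layer for two domains A and B.

    W_A and W_B are domain-specific weights; H is shared by both directions
    and carries the activation of the other domain into each network.
    """
    next_A = relu(W_A @ a_A + H @ a_B)
    next_B = relu(W_B @ a_B + H @ a_A)
    return next_A, next_B

# Random activations and weights stand in for learned ones.
rng = np.random.default_rng(0)
d = 4
a_A, a_B = rng.standard_normal(d), rng.standard_normal(d)
W_A, W_B, H = (rng.standard_normal((d, d)) for _ in range(3))
next_A, next_B = cross_layer(a_A, a_B, W_A, W_B, H)
```

Setting `H` to zero removes the transfer and reduces the layer to two independent feedforward networks, which makes the role of the shared matrix easy to check.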

Following CoNet, Liu et al. (Liu et al., 2020b) proposed to capture users’ aesthetic preferences when transferring knowledge between different domains. They proposed a deep aesthetic cross-domain network (ACDN) which utilized a pre-trained deep convolutional neural network to extract aesthetic features from item images in each domain. In each domain, the aesthetic features together with the one-hot embedding features of the user and the item acted as the input of the first layer. The deep dual knowledge transfer is realized in the same way as in CoNet, as shown in Formula (11).

Li and Tuzhilin (Li and Tuzhilin, 2020) proposed a Deep Dual Transfer Cross-Domain Recommendation (DDTCDR) model. Compared with CoNet, the improvements lay mainly in three aspects. First, DDTCDR used pre-trained autoencoders which encoded user and item features to generate feature representations as the model input. Second, it learned a latent orthogonal mapping function for transferring user preferences between domains. Third, it jointly modeled users’ within-domain preferences and cross-domain preferences to model user preferences and user-item interactions in each domain.

Ma et al. (Ma et al., [n.d.]) applied this method to shared-account cross-domain sequential recommendation and proposed a parallel information-sharing network (π-Net). It first utilized separate RNNs in each domain to encode user behavior sequences into sequence representations. Instead of transferring knowledge at each layer, the dual knowledge transfer was performed at each step of the RNNs. Specifically, a shared-account filter unit and a cross-domain transfer unit were designed to extract and share useful information between the two domains. At each step of the RNNs, the input was the combination of the output of the last step with the information transferred from the other domain.

Liu et al. (Liu et al., 2020a) further incorporated graph neural networks into this method and proposed a bi-directional transfer learning approach that uses a graph collaborative filtering network as the base model (BiTGCF). It constructed a user-item bipartite graph in each domain, and the multi-layer feedforward networks of previous works were replaced by multi-layer graph convolutional networks. Features of both users and items were propagated by graph convolution operations. The deep dual knowledge transfer was performed between the two graph convolutional networks. Similar to previous works, the user representations at each layer were a combination of the outputs of the previous layer in both domains.

5.6. Deep Integration of Source Domain Information

5.6.1. The basic paradigm

Unlike the above two classes of approaches, which have symmetrical model structures and aim at multi-target recommendation, this class of approaches targets single-target recommendation. As shown in Fig. 12, they treat one domain as the target domain, which plays a dominant role, and the other domains as source domains. The information from the source domains is incorporated into the target domain as additional auxiliary information.

Figure 12. The schematic diagram of deep integration of source domain information.

5.6.2. Approaches of this method

Hu et al. (Hu et al., 2018b) proposed a neural network-based approach named MTNet in which a memory network was utilized to generate high-level representations of the context text (i.e., product reviews) in the target domain and a transfer network was utilized to generate representations of source domain knowledge. Together with representations of the user-item interaction, these three kinds of representations were combined to perform the final recommendation. Gao et al. (Gao et al., 2019) designed a model named Neural Attentive-Transfer Recommendation (NATR) which focused on sharing item embeddings between domains. It first looked up the source-domain embeddings of the items that a user had interacted with in the target domain. An item-level attention unit was designed to aggregate these item embeddings into an additional user embedding. A domain-level attention unit was then designed to fuse this additional embedding with the user embedding in the target domain to generate a unified user embedding. The interaction probability was predicted by the inner product of the unified user embedding and an item embedding.

Ma et al. (Ma et al., 2018) incorporated users’ content information from social media into rating prediction and proposed a social media content enriched matrix factorization (MFS). It first adopted several different embedding learning algorithms (e.g., word2vec, stacked denoising autoencoder) to generate user embeddings from the content information. It then modified each user’s latent vector to combine this content-derived embedding with a randomly initialized component. Finally, it applied matrix factorization to train user and item embeddings, keeping the content-derived embedding unchanged and updating only the randomly initialized component.

Perera and Zimmermann (Perera and Zimmermann, 2018) focused on capturing the dynamic nature of user preferences and providing timely recommendations in the target domain. They first divided a user’s interactions in the source domains into several sets according to the timestamps of the user’s interactions in the target domain. After topic modeling and sum pooling, each set of source-domain interactions during a time interval was encoded as a high-level latent representation. An improved LSTM with attention mechanisms and time-aware gates was utilized to encode the sequence of representations and generate the final recommendations.

6. Datasets for Cross-Domain Recommendation

In this section, we introduce datasets used in cross-domain recommendation. Specifically, we first introduce three datasets with multiple domains, namely, Amazon, Douban, and Epinions. Then we introduce datasets with a single domain such as movie, book, music, social interaction, point-of-interest, and others. Table 6 shows the results of our classification of datasets and the usage of these datasets by existing works. In the following, we detail the source and composition of each dataset and explain how each dataset can be applied to cross-domain recommendation tasks.

6.1. Datasets with Multiple Domains

Datasets that contain information from multiple domains are the most widely used datasets in cross-domain recommendation as they perfectly fit the cross-domain recommendation problem scenarios. With shared user identities among domains, it is easy to find corresponding users among different domains and form cross-domain recommendation tasks. In general, there are three representative datasets with multiple domains: Amazon (ama, [n.d.]), Douban (dou, [n.d.]), and Epinions (epi, [n.d.]).

  • The Amazon dataset is crawled from Amazon, the world’s largest e-commerce platform, which has the most extensive behavior data volume and good diversity. This dataset contains information of different domains (e.g., CDs and Vinyl, Electronics, Movies and TV), depending on the type of items. For each interaction, a user gives an item an explicit rating score indicating the user’s preference for this item. This dataset also includes users’ review texts on items, the interaction timestamps, and items’ categories.

  • The Douban dataset is crawled from Douban, a renowned Chinese online social network, where users give ratings to three types of items: movie, music, and book. This dataset consists of three domains, that is, Douban-Movies, Douban-Music, and Douban-Books. Apart from ratings, this dataset also includes users’ profiles involving gender, age, place of residence, tag, etc., and the attributes of items such as names and ownerships. In addition, social relationships among users are also available.

  • The Epinions dataset is extracted from Epinions, a popular product review website, providing 587 distinct domains (e.g., Books, Videos and DVDs, Baby Care) and sub-domains. In each interaction, a user gives an item both an explicit rating score and a review. Additionally, this dataset comprises user profiles including name, location, top rank, etc., and features of items such as name, category, and description. More importantly, trust statements and “experts”, i.e., category leads, top reviewers, or advisors, are also available in this dataset. A user gives another user a positive trust statement when trusting that user’s reviews and a negative one otherwise, and “experts” offer more reliable trust statements.

Typically, datasets with multiple domains can generate two different kinds of cross-domain recommendation tasks. In most cases, any two domains of the same dataset can be used to generate a cross-domain recommendation task (e.g., Amazon-Movies and TV with Amazon-CDs and Vinyl, or Douban-Movies with Douban-Music) in which users partially or fully overlap and items do not overlap. Alternatively, a domain of one dataset together with a domain of another dataset can form another kind of cross-domain recommendation task (e.g., Amazon-Movies and TV with Douban-Movies, or Amazon-Books with Douban-Books) in which users do not overlap and items partially or fully overlap.
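To decide which kind of task a pair of domains supports, one can check how their user sets overlap. The sketch below assumes each domain is given as a list of (user_id, item_id, rating) tuples; the function name `split_users` and the toy records are hypothetical and do not come from any of the datasets above.

```python
def split_users(interactions_a, interactions_b):
    """Partition users of two domains into overlapping and domain-only sets.

    Each argument is an iterable of (user_id, item_id, rating) tuples.
    Returns (shared users, users only in A, users only in B).
    """
    users_a = {user for user, _, _ in interactions_a}
    users_b = {user for user, _, _ in interactions_b}
    return users_a & users_b, users_a - users_b, users_b - users_a

# Toy interaction logs for two domains of one platform (hypothetical IDs).
movies = [("u1", "m1", 5), ("u2", "m2", 3), ("u3", "m1", 4)]
music = [("u2", "s1", 4), ("u3", "s2", 2), ("u4", "s1", 5)]
shared, only_movies, only_music = split_users(movies, music)
```

If both domain-only sets are empty, the pair forms a user full overlap scenario; otherwise users only partially overlap. The same check applied to item identifiers covers the item-overlap case.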

Dataset Type | Domain Type | Name of Dataset | Paper
Datasets with Multiple Domains | - | Amazon | (Zhang et al., 2017; Iwata and Takeuchi, 2015; Fu et al., 2019; Wang et al., 2018; Kang et al., 2019; Zhao et al., 2020; Cui et al., 2020; Zhao et al., 2018; Hu et al., 2018a; Tan et al., 2014; Zhao et al., 2019; Loni et al., 2014; Mirbakhsh and Ling, 2015; Huang et al., 2019; Hu et al., 2013; Li et al., 2020; Song et al., 2017; Yan et al., 2020; Hu et al., 2018b; Zhang et al., 2020; Yuan et al., 2019; Wang et al., 2019; Liu et al., 2020a; Liu et al., 2020b)
Datasets with Multiple Domains | - | Douban | (Man et al., 2017; Zhu et al., 2018, 2019, 2020; Xin et al., 2015; He et al., 2018a; Zhao et al., 2013; Zhang et al., 2016; Yang et al., 2015; Lian et al., 2017; Zhang et al., 2019b; Chen et al., 2020; Ma et al., 2018; Jiang et al., 2016)
Datasets with Multiple Domains | - | Epinions | (Rafailidis and Crestani, 2016; Ma et al., 2008; Shu et al., 2018; Chen et al., 2020; Rafailidis and Crestani, 2017; Mirbakhsh and Ling, 2015)
Datasets with Single Domain | Movie | MovieLens | (Zhang et al., 2017; Li et al., 2009a; Gao et al., 2013; Iwata and Takeuchi, 2015; He et al., 2018b; Li et al., 2009b; Shi et al., 2011; Hao et al., 2016; Man et al., 2017; Zhu et al., 2018, 2019, 2020; Zhao et al., 2018; He et al., 2018a; Tan et al., 2014; Enrich et al., 2013; Fernández-Tobías and Cantador, 2014; Gao et al., 2019; Liu et al., 2015; Zhang et al., 2019b; Huang et al., 2019; Fang et al., 2015; Agarwal et al., 2011; Li et al., 2015; Pan et al., 2010; Zhang et al., 2016; Kumar et al., 2014; Pan et al., 2012; Ren et al., 2015; Lian et al., 2017; Chen et al., 2013; Perera and Zimmermann, 2020)
Datasets with Single Domain | Movie | Netflix | (Zhang et al., 2017; Iwata and Takeuchi, 2015; Moreno et al., 2012; Man et al., 2017; Zhu et al., 2018; Zhao et al., 2013; Zhang et al., 2016; Pan et al., 2011; Li et al., 2011; Gao et al., 2019; Pan et al., 2012; Zhang et al., 2019b; Li et al., 2015; Singh and Gordon, 2008; Lu et al., 2013; Pan et al., 2010)
Datasets with Single Domain | Movie | EachMovie | (Li et al., 2009a; Gao et al., 2013; Iwata and Takeuchi, 2015; He et al., 2018b; Li et al., 2009b; Zhang et al., 2016; Ren et al., 2015; Berkovsky et al., 2007)
Datasets with Single Domain | Music | Yahoo-Music | (Zhang et al., 2017; Li and Lin, 2014)
Datasets with Single Domain | Music | Last.FM | (Zhang et al., 2019a)
Datasets with Single Domain | Book | Book-Crossing | (Li et al., 2009a; Gao et al., 2013; He et al., 2018b; Li et al., 2009b; Kumar et al., 2014; Ren et al., 2015)
Datasets with Single Domain | Book | Library Thing | (Zhang et al., 2017; Shi et al., 2011; Hao et al., 2016; Fernández-Tobías and Cantador, 2014; Chen et al., 2013)
Datasets with Single Domain | Social Interaction | Twitter | (Wang et al., 2017; Perera and Zimmermann, 2020, 2017; Yan et al., 2015)
Datasets with Single Domain | Social Interaction | Youtube | (Perera and Zimmermann, 2018, 2020, 2017; Yan et al., 2015)
Datasets with Single Domain | Social Interaction | Weibo | (Yang et al., 2015; Ma et al., 2018; Jiang et al., 2016)
Datasets with Single Domain | Social Interaction | DBLP | (Liu et al., 2015)
Datasets with Single Domain | Point of Interest | Yelp | (Sahebi and Brusilovsky, 2015; Manotumruksa et al., 2019; Krishnan et al., 2020)
Datasets with Single Domain | Point of Interest | Brightkite | (Manotumruksa et al., 2019)
Datasets with Single Domain | Point of Interest | Foursquare | (Manotumruksa et al., 2019)
Datasets with Single Domain | Others | Cheetah Mobile | (Hu et al., 2018a; Yan et al., 2020; Hu et al., 2018b)
Table 6. A summary of datasets and corresponding papers

6.2. Datasets with Single Domain

6.2.1. Movie

Movie datasets are favored by a majority of works because movies naturally fall into different categories, which can be treated as different domains to form cross-domain recommendation tasks. Also, as many movies are common to these datasets, a recommendation scenario based on overlapping items can be generated. There are three widely used movie datasets: MovieLens (mov, [n.d.]), Netflix (net, [n.d.]), and EachMovie (eac, [n.d.]).

  • The MovieLens dataset is crawled from MovieLens, a well-received movie recommendation website, which provides a group of datasets graded by the number of ratings they contain. Among them, three stable benchmark datasets, i.e., MovieLens-100K, MovieLens-1M, and MovieLens-20M, are the most popular; they differ in the volume of interactions. Each interaction in this dataset includes a rating with half-rating increments, attributes of movies, and user-generated tags. Movies are separated into 18 distinct genres including Action, Adventure, Horror, and so on.

  • The Netflix dataset, offered by Netflix for a competition, consists of ratings for movies given by users. Each interaction consists of an explicit rating, the interaction time, and attributes of movies.

  • The EachMovie dataset is extracted from the EachMovie recommendation service and comprises explicit ratings entered by users for different movies.

6.2.2. Book.

Cross-domain book recommendation is another prevalent task. Book-Crossing (boo, [n.d.]) and LibraryThing (lib, [n.d.]) are two widely adopted datasets for the task.

  • The Book-Crossing dataset is crawled from the Book-Crossing community and contains users’ ratings of books. Interactions in this dataset comprise explicit ratings given to books. Demographic information about users, including location and age, is provided, and content-based information about each book, i.e., title, author, year of publication, and publisher, is also given.

  • The LibraryThing dataset is collected from LibraryThing, an online book review website, which publishes information about books and enables users to create their virtual libraries and tag books. It consists of tuples of users, books, and tags. In each interaction, a user gives a book a rating, a review, and tags. In addition, friend relations among users are given, which are similar to the trust relations of the social network in Epinions.

Although user identities are not identical across different datasets, each book in these datasets is labeled by a unique ISBN ID. It is possible to match the same book in different domains to build cross-domain recommendation scenarios where users do not overlap and items are fully or partially overlapped.

6.2.3. Music.

Music datasets record user-music interactions and Yahoo! Music (yah, [n.d.]) and Last.FM (las, [n.d.]) are two representative music datasets.

  • The Yahoo! Music dataset, offered by Yahoo Research, consists of a wealth of information about music. Users give explicit ratings to entities of four different types, i.e., tracks, albums, artists, and genres, constituting four types of interactions. In addition to ratings, each interaction also contains attributes of entities.

  • The Last.FM dataset was released by HetRec. Different from the previously introduced datasets, this dataset contains the number of times a song has been listened to by a user instead of explicit ratings, as well as tags generated by the user.

6.2.4. Social Interaction.

In most instances, social interaction datasets are used as auxiliary information to assist cross-domain recommendation. Twitter (twi, [n.d.]), Youtube (you, [n.d.]), Weibo (wei, [n.d.]) and DBLP (dbl, [n.d.]) are four typically used datasets.

  • The Twitter dataset is crawled from one of the most widespread social networks, Twitter. Each interaction includes tweet content, the interaction time, and tags users generated.

  • The Youtube dataset, given by Youtube, consists of user-video interactions provided with video ID, video title, label, class, the interaction time, and detailed description.

  • The Weibo dataset is crawled from Weibo, the largest Chinese microblogging platform and a counterpart of Twitter. Each tweet contains tweet content, user identification, interaction time, comments, and tags. Each user has a profile with gender, name, followers, and followees, which makes it possible to form a social relation network.

  • The DBLP dataset is offered by DBLP, a famous citation network. Each interaction comprises paper, abstract, authors, year, venue, title, and citations. Moreover, citations of papers are used to construct a social relation network.

6.2.5. Point-of-Interest.

Point-of-interest datasets record the behaviors of users at points of interest (POIs) and are widely used for cross-domain venue recommendation. The most popular POI datasets include Yelp (yel, [n.d.]) and two check-in datasets, Brightkite (bri, [n.d.]) and Foursquare (fou, [n.d.]).

  • The Yelp dataset is offered by Yelp, the largest review site in the U.S., for a challenge. For each interaction, this dataset provides an explicit rating, reviews, and tips, and it also includes a social relation network.

  • The Brightkite dataset is extracted from a location-based social networking service provider Brightkite, where users shared their locations by checking-in. Interactions with check-in time and location as well as friendship network are available in this dataset.

  • The Foursquare dataset is collected from Foursquare, a location data platform for understanding how people move through the real world. The dataset consists of check-in data for different cities accompanied by timestamps, GPS coordinates, and semantic meaning (represented by fine-grained venue categories).

6.2.6. Others.

  • The Cheetah Mobile dataset is a mobile dataset collected from Cheetah. It consists of two domains, that is, app installation and news browsing, which makes it naturally fit for cross-domain recommendation scenarios where users overlap but items do not. Each interaction in the news browsing domain contains contents of news, the interaction time, and user profiles, while each interaction of app installation includes app IDs and some metadata about both users and apps.

7. Future Directions

While existing works have established a solid foundation for cross-domain recommendation research, there are still some further opportunities. In this section, we outline and discuss some future research directions.

7.1. Exploring Unstudied Recommendation Scenarios

As introduced in Section 2.2.1, our proposed taxonomy distinguishes different recommendation scenarios for cross-domain recommendation. However, three of them (i.e., user partial overlap with item partial overlap, user partial overlap with item full overlap, and user full overlap with item partial overlap) have not been studied so far. One important reason is that no dataset can support these kinds of studies, because they require datasets that overlap in both user sets and item sets. However, with the increase in the number of service platforms, more and more users begin to interact on multiple platforms with similar functions, which makes the generation of such datasets possible. For example, the same users watch videos on Tencent and iQIYI, which makes the user sets of these two platforms overlap. At the same time, the same videos (i.e., the items) are provided by these two platforms. Therefore, a dataset supporting these kinds of studies can be formed. The same phenomenon also exists on e-commerce platforms like Taobao and Amazon. Future research on cross-domain recommendation can therefore be conducted under these unexplored recommendation scenarios.

7.2. Adopting Latest Advances in Deep Learning

The cross-domain recommendation problem can be regarded as a sub-problem of the recommendation system, which is closely related to the traditional recommendation problem. Therefore, first, the methods proposed for traditional recommendation problems, especially the ways of extracting knowledge in each domain, can be directly referred to and applied to cross-domain recommendation problems. Second, technological developments in other related fields can also be introduced. For example, technology in the field of natural language processing can be used to process user comments and item contents that act as auxiliary information, and technology in the field of computer vision can be used to process visual information of items (Liu et al., 2020b). In terms of specific methods, Krishnan et al. (Krishnan et al., 2020) proposed to incorporate the strengths of meta-learning into cross-domain recommendation by defining transferrable neural layers via contextual predicates, working in tandem with and guiding domain-specific representations. A novel adaptation approach with regularized residual learning was developed to incorporate new target domains with minimal overhead. Hu et al. (Hu et al., 2018b) utilized a memory network to attentively extract useful information to be transferred from text content. Nevertheless, with the rapid development of deep learning and neural networks, there are still many new technologies and structures that can be utilized to improve the performance of cross-domain recommendation.

7.3. Exploring Robustness of Recommendation

Robustness is critical to the generalization of models. If a model is not robust, a small disturbance may greatly damage its prediction accuracy. Compared to single-domain recommendation models, cross-domain recommendation models face inherent challenges to robustness. First, the core idea of cross-domain recommendation is to transfer knowledge from other domains to improve the prediction accuracy in one domain. However, the transferred knowledge may be noise rather than helpful knowledge, which is called "negative transfer". Negative transfer may undermine the robustness of the models. Second, cross-domain recommendation models tend to be more complex because they consider not only how to extract useful knowledge within each domain, but also how to share and transfer the knowledge across domains. Complex models are more susceptible to noise, which reduces their robustness. Third, as cross-domain recommendation aims at alleviating the problem of data sparsity, the datasets used are often sparser than those used in single-domain recommendation, and parameters trained on sparse datasets tend to lack robustness.

There have been several studies on the robustness of cross-domain recommendation models. Yan et al. (Yan et al., 2020) proposed an Adversarial Cross-Domain Network (ACDN), which builds on a framework similar to CoNet. When learning parameters, ACDN adds intentional perturbations to the embedding representations to generate adversarial examples, which helps to learn robust parameters. Zhang et al. (Zhang et al., 2017, 2019b) targeted the "negative transfer" problem and applied domain adaptation functions to ensure the consistency of the transferred knowledge, which can enhance the robustness of the models. However, these are only preliminary explorations, and there is still a large research space on the robustness of cross-domain recommendation models.
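The idea of perturbing embeddings to obtain adversarial examples can be sketched as follows. This is not the ACDN implementation; it is a generic first-order adversarial perturbation on a toy dot-product scorer with a logistic loss, and the function names, `epsilon`, and `lam` are assumptions introduced here for illustration.

```python
import numpy as np


def logistic_loss(user_vec, item_vec, label):
    """Binary cross-entropy of a dot-product scorer; label is 0 or 1."""
    score = float(user_vec @ item_vec)
    prob = 1.0 / (1.0 + np.exp(-score))
    return -(label * np.log(prob) + (1 - label) * np.log(1.0 - prob))


def adversarial_perturbation(user_vec, item_vec, label, epsilon=0.5):
    """Perturbation of the user embedding with L2 norm epsilon.

    It points along the gradient of the loss with respect to the user
    embedding, i.e. the direction that increases the loss fastest.
    """
    score = float(user_vec @ item_vec)
    prob = 1.0 / (1.0 + np.exp(-score))
    grad = (prob - label) * item_vec  # d(loss) / d(user_vec)
    norm = np.linalg.norm(grad)
    if norm == 0.0:
        return np.zeros_like(user_vec)
    return epsilon * grad / norm


def robust_loss(user_vec, item_vec, label, epsilon=0.5, lam=1.0):
    """Clean loss plus lam times the loss on the perturbed embedding."""
    delta = adversarial_perturbation(user_vec, item_vec, label, epsilon)
    clean = logistic_loss(user_vec, item_vec, label)
    adversarial = logistic_loss(user_vec + delta, item_vec, label)
    return clean + lam * adversarial


# Hypothetical embeddings for one observed (positive) interaction.
user = np.array([0.2, -0.1, 0.4])
item = np.array([0.5, 0.3, -0.2])
delta = adversarial_perturbation(user, item, label=1)
print(logistic_loss(user, item, 1), logistic_loss(user + delta, item, 1))
```

Minimizing an objective of the `robust_loss` form pushes the model to keep its predictions stable inside a small ball around each embedding, which is the intuition behind training with adversarial examples.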

7.4. Scalability of Deep Cross-domain Models

With the rapid development of informatization and digitalization, the amount of information grows exponentially. In recommendation systems, more and more users register for and use a service, a large number of new products or items are released every day, and tens of billions of user logs are generated. Together, these factors produce very large datasets, which raises the question of whether proposed recommendation models can really be applied in real-world applications. On the one hand, storing large-scale datasets takes up a lot of space; on the other hand, the increase in the number of users and items also causes the number of model parameters to grow, which places higher requirements on the computation power of the machines. At the same time, when complex and deep models meet large-scale datasets, the efficiency and timeliness of the models become critical, especially when they are deployed in real-world systems. These problems become more serious in cross-domain recommendation, because integrating information from multiple domains makes the datasets several times larger than those of single-domain recommendation. Therefore, the scalability of models is an issue that must be considered and studied in future cross-domain recommendation research.

7.5. Interpretability of Cross-Domain Recommendation

In recommendation systems, the interpretability of models mainly refers to explaining why items are recommended to a user, which is vital for practitioners to understand the internal working mechanism of the models and for improving users' trust in the recommendation results. In the early stage of the development of cross-domain recommendation, neighbor-based methods were highly explainable; their basic idea is to recommend to users the items that their neighbors like in other domains. The disadvantage of these methods is that they are too simple and their prediction accuracy is poor. They have gradually been surpassed and replaced by deep learning-based methods. However, deep neural networks are widely acknowledged to be hard to interpret, so making interpretable recommendations has become a tricky problem.

Generally speaking, there are three main ways to improve the interpretability of recommendation models. The first way, which is currently the most commonly used in the field of cross-domain recommendation, is to introduce user review information. Researchers believe that reviews contain more fine-grained user preferences and that reviews lead to ratings. Second, attention mechanisms can provide interpretability. Visualizing the attention weights explicitly indicates which part of the information is more important or which step of an input is more strongly associated with the outputs. However, in cross-domain recommendation, attention mechanisms are currently only used to merge two parts of user representations, rather than being applied in a more valuable way. Third, knowledge graphs have been widely combined with deep learning in recent years and are believed to mitigate the lack of interpretability of neural networks. Since knowledge graphs can connect items in two domains through relations, they are well suited to interpretable cross-domain recommendation. However, up to now, no cross-domain recommendation model combining knowledge graphs and deep learning has been proposed. To sum up, improving the interpretability of cross-domain recommendation is still an open issue worthy of study.
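As a small illustration of the second way, the sketch below merges two parts of a user representation with softmax attention and returns the weights so that they can be inspected. It is a generic dot-product attention written under assumptions made here: the function name, the use of a single query vector, and the toy vectors are hypothetical and do not correspond to any specific model surveyed above.

```python
import numpy as np


def attentive_fusion(representations, query):
    """Fuse k representation vectors with softmax attention.

    representations: array-like of shape (k, d), e.g. a source-domain and
    a target-domain user representation.
    query: array-like of shape (d,), the context the weights depend on.
    Returns the fused vector of shape (d,) and the k attention weights.
    """
    reps = np.asarray(representations, dtype=float)
    scores = reps @ np.asarray(query, dtype=float)
    scores = scores - scores.max()  # shift for numerical stability
    exp_scores = np.exp(scores)
    weights = exp_scores / exp_scores.sum()
    fused = weights @ reps
    return fused, weights


# Hypothetical source-domain and target-domain user representations.
source_rep = [1.0, 0.0]
target_rep = [0.0, 1.0]
fused, weights = attentive_fusion([source_rep, target_rep], query=[2.0, 0.0])
print(weights)  # a larger first weight means the source domain dominates
```

Reading off `weights` gives a coarse explanation of how much each domain contributed to the fused representation for a given prediction.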

8. Conclusion

In this paper, we provide a comprehensive and systematic investigation of cross-domain recommendation, which is a powerful tool for solving the data sparsity and cold-start problems in traditional recommender systems. We first propose a two-level taxonomy of cross-domain recommendation scenarios and recommendation tasks for organizing and clustering existing works. Under each research scenario, we systematically sort and summarize existing research works in terms of the methods they use. Moreover, we provide a detailed introduction to frequently used datasets, including datasets with multiple domains and datasets with a single domain. Finally, we discuss some promising research directions for cross-domain recommendation. We hope that this survey can provide both newcomers and experts with a comprehensive understanding of the problem definition of this field, present existing works clearly, and shed some light on future studies.

References

  • ama ([n.d.]) [n.d.]. “Amazon”. http://jmcauley.ucsd.edu/data/amazon. 2018.
  • boo ([n.d.]) [n.d.]. “Book-Crossing”. http://www2.informatik.uni-freiburg.de/~cziegler/BX/. 2004.
  • bri ([n.d.]) [n.d.]. “Brightkite”. http://snap.stanford.edu/data/loc-brightkite.html. 2011.
  • dbl ([n.d.]) [n.d.]. “DBLP”. https://dblp.uni-trier.de/xml/. 2019.
  • dou ([n.d.]) [n.d.]. “Douban”. https://www.douban.com/.
  • eac ([n.d.]) [n.d.]. “EachMovie”. https://www.cs.cmu.edu/~lebanon/IR-lab.htm. 2003.
  • epi ([n.d.]) [n.d.]. “Epinions”. https://projet.liris.cnrs.fr/red/. 2011.
  • fou ([n.d.]) [n.d.]. “Foursquare”. https://archive.org/details/201309_foursquare_dataset_umn. 2013.
  • las ([n.d.]) [n.d.]. “Last.FM”. https://www.last.fm/.
  • lib ([n.d.]) [n.d.]. “LibraryThing”. https://www.librarything.com.
  • mov ([n.d.]) [n.d.]. “MovieLens”. https://grouplens.org/datasets/movielens/. 2019.
  • net ([n.d.]) [n.d.]. “Netflix”. https://netflixprize.com/index.html. 2009.
  • twi ([n.d.]) [n.d.]. “Twitter”. https://twitter.com/.
  • wei ([n.d.]) [n.d.]. “Weibo”. https://weibo.com/.
  • yah ([n.d.]) [n.d.]. “Yahoo!Music”. http://webscope.sandbox.yahoo.com/.
  • yel ([n.d.]) [n.d.]. “Yelp”. https://www.yelp.com/dataset.
  • you ([n.d.]) [n.d.]. “Youtube”. https://github.com/google/youtube-8m. 2019.
  • Agarwal et al. (2011) Deepak Agarwal, Bee-Chung Chen, and Bo Long. 2011. Localized factor models for multi-context recommendation. In Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Diego, CA, USA, August 21-24, 2011. ACM, 609–617.
  • Berkovsky et al. (2007) Shlomo Berkovsky, Tsvi Kuflik, and Francesco Ricci. 2007. Cross-Domain Mediation in Collaborative Filtering. In User Modeling 2007, 11th International Conference, UM 2007, Corfu, Greece, June 25-29, 2007, Proceedings (Lecture Notes in Computer Science), Vol. 4511. Springer, 355–359.
  • Bi et al. (2020a) Ye Bi, Liqiang Song, Mengqiu Yao, Zhenyu Wu, Jianming Wang, and Jing Xiao. 2020a. DCDIR: A Deep Cross-Domain Recommendation System for Cold Start Users in Insurance Domain. In Proceedings of the 43rd International ACM SIGIR conference on research and development in Information Retrieval, SIGIR 2020, Virtual Event, China, July 25-30, 2020. ACM, 1661–1664.
  • Bi et al. (2020b) Ye Bi, Liqiang Song, Mengqiu Yao, Zhenyu Wu, Jianming Wang, and Jing Xiao. 2020b. A Heterogeneous Information Network based Cross Domain Insurance Recommendation System for Cold Start Users. In Proceedings of the 43rd International ACM SIGIR conference on research and development in Information Retrieval, SIGIR 2020, Virtual Event, China, July 25-30, 2020. ACM, 2211–2220.
  • Chen et al. (2020) Chaochao Chen, Liang Li, Bingzhe Wu, Cheng Hong, Li Wang, and Jun Zhou. 2020. Secure Social Recommendation Based on Secret Sharing. In ECAI 2020 - 24th European Conference on Artificial Intelligence, 29 August-8 September 2020, Santiago de Compostela, Spain - Including 10th Conference on Prestigious Applications of Artificial Intelligence (PAIS 2020) (Frontiers in Artificial Intelligence and Applications), Vol. 325. IOS Press, 506–512.
  • Chen et al. (2019) Chong Chen, Min Zhang, Chenyang Wang, Weizhi Ma, Minming Li, Yiqun Liu, and Shaoping Ma. 2019. An Efficient Adaptive Transfer Neural Network for Social-aware Recommendation. In Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2019, Paris, France, July 21-25, 2019. ACM, 225–234.
  • Chen et al. (2013) Wei Chen, Wynne Hsu, and Mong-Li Lee. 2013. Making recommendations from multiple domains. In The 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2013, Chicago, IL, USA, August 11-14, 2013. ACM, 892–900.
  • Cremonesi et al. (2011) Paolo Cremonesi, Antonio Tripodi, and Roberto Turrin. 2011. Cross-Domain Recommender Systems. In Data Mining Workshops (ICDMW), 2011 IEEE 11th International Conference on, Vancouver, BC, Canada, December 11, 2011. IEEE Computer Society, 496–503.
  • Cui et al. (2020) Qiang Cui, Tao Wei, Yafeng Zhang, and Qing Zhang. 2020. HeroGRAPH: A Heterogeneous Graph Framework for Multi-Target Cross-Domain Recommendation. In Proceedings of the 3rd Workshop on Online Recommender Systems and User Modeling co-located with the 14th ACM Conference on Recommender Systems (RecSys 2020), Virtual Event, September 25, 2020 (CEUR Workshop Proceedings), Vol. 2715. CEUR-WS.org.
  • Elkahky et al. (2015) Ali Mamdouh Elkahky, Yang Song, and Xiaodong He. 2015. A Multi-View Deep Learning Approach for Cross Domain User Modeling in Recommendation Systems. In Proceedings of the 24th International Conference on World Wide Web, WWW 2015, Florence, Italy, May 18-22, 2015. ACM, 278–288.
  • Enrich et al. (2013) Manuel Enrich, Matthias Braunhofer, and Francesco Ricci. 2013. Cold-Start Management with Cross-Domain Collaborative Filtering and Tags. In E-Commerce and Web Technologies - 14th International Conference, EC-Web 2013, Prague, Czech Republic, August 27-28, 2013. Proceedings (Lecture Notes in Business Information Processing), Vol. 152. Springer, 101–112.
  • Fang et al. (2015) Zhou Fang, Sheng Gao, Bo Li, Juncen Li, and Jianxin Liao. 2015. Cross-Domain Recommendation via Tag Matrix Transfer. In IEEE International Conference on Data Mining Workshop, ICDMW 2015, Atlantic City, NJ, USA, November 14-17, 2015. IEEE Computer Society, 1235–1240.
  • Fernández-Tobías and Cantador (2014) Ignacio Fernández-Tobías and Iván Cantador. 2014. Exploiting Social Tags in Matrix Factorization Models for Cross-domain Collaborative Filtering. In Proceedings of the 1st Workshop on New Trends in Content-based Recommender Systems co-located with the 8th ACM Conference on Recommender Systems, CBRecSys@RecSys 2014, Foster City, Silicon Valley, California, USA, October 6, 2014 (CEUR Workshop Proceedings), Vol. 1245. CEUR-WS.org, 34–41.
  • Fernández-Tobías et al. (2012) Ignacio Fernández-Tobías, Iván Cantador, Marius Kaminskas, and Francesco Ricci. 2012. Cross-domain recommender systems: A survey of the state of the art. In Spanish conference on information retrieval. sn, 1–12.
  • Fu et al. (2019) Wenjing Fu, Zhaohui Peng, Senzhang Wang, Yang Xu, and Jin Li. 2019. Deeply Fusing Reviews and Contents for Cold Start Users in Cross-Domain Recommendation Systems. In The Thirty-Third AAAI Conference on Artificial Intelligence, AAAI 2019, The Thirty-First Innovative Applications of Artificial Intelligence Conference, IAAI 2019, The Ninth AAAI Symposium on Educational Advances in Artificial Intelligence, EAAI 2019, Honolulu, Hawaii, USA, January 27 - February 1, 2019. AAAI Press, 94–101.
  • Gao et al. (2019) Chen Gao, Xiangning Chen, Fuli Feng, Kai Zhao, Xiangnan He, Yong Li, and Depeng Jin. 2019. Cross-domain Recommendation Without Sharing User-relevant Data. In The World Wide Web Conference, WWW 2019, San Francisco, CA, USA, May 13-17, 2019. ACM, 491–502.
  • Gao et al. (2013) Sheng Gao, Hao Luo, Da Chen, Shantao Li, Patrick Gallinari, and Jun Guo. 2013. Cross-Domain Recommendation via Cluster-Level Latent Factor Model. In Machine Learning and Knowledge Discovery in Databases - European Conference, ECML PKDD 2013, Prague, Czech Republic, September 23-27, 2013, Proceedings, Part II (Lecture Notes in Computer Science), Vol. 8189. Springer, 161–176.
  • Hao et al. (2016) Peng Hao, Guangquan Zhang, and Jie Lu. 2016. Enhancing cross domain recommendation with domain dependent tags. In 2016 IEEE International Conference on Fuzzy Systems, FUZZ-IEEE 2016, Vancouver, BC, Canada, July 24-29, 2016. IEEE, 1266–1273.
  • He et al. (2018a) Jia He, Rui Liu, Fuzhen Zhuang, Fen Lin, Cheng Niu, and Qing He. 2018a. A General Cross-Domain Recommendation Framework via Bayesian Neural Network. In IEEE International Conference on Data Mining, ICDM 2018, Singapore, November 17-20, 2018. IEEE Computer Society, 1001–1006.
  • He et al. (2018b) Ming He, Jiuling Zhang, Peng Yang, and Kaisheng Yao. 2018b. Robust Transfer Learning for Cross-domain Collaborative Filtering Using Multiple Rating Patterns Approximation. In Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining, WSDM 2018, Marina Del Rey, CA, USA, February 5-9, 2018. ACM, 225–233.
  • Hu et al. (2018a) Guangneng Hu, Yu Zhang, and Qiang Yang. 2018a. CoNet: Collaborative Cross Networks for Cross-Domain Recommendation. In Proceedings of the 27th ACM International Conference on Information and Knowledge Management, CIKM 2018, Torino, Italy, October 22-26, 2018. ACM, 667–676.
  • Hu et al. (2018b) Guangneng Hu, Yu Zhang, and Qiang Yang. 2018b. MTNet: a neural approach for cross-domain recommendation with unstructured text. KDD Deep Learning Day (2018), 1–10.
  • Hu et al. (2013) Liang Hu, Jian Cao, Guandong Xu, Longbing Cao, Zhiping Gu, and Can Zhu. 2013. Personalized recommendation via cross-domain triadic factorization. In 22nd International World Wide Web Conference, WWW ’13, Rio de Janeiro, Brazil, May 13-17, 2013. International World Wide Web Conferences Steering Committee / ACM, 595–606.
  • Huang et al. (2019) Ling Huang, Zhi-Lin Zhao, Chang-Dong Wang, Dong Huang, and Hong-Yang Chao. 2019. LSCD: Low-rank and sparse cross-domain recommendation. Neurocomputing 366 (2019), 86–96.
  • Iwata and Takeuchi (2015) Tomoharu Iwata and Koh Takeuchi. 2015. Cross-domain recommendation without shared users or items by sharing latent vector distributions. In Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics, AISTATS 2015, San Diego, California, USA, May 9-12, 2015 (JMLR Workshop and Conference Proceedings), Vol. 38. JMLR.org.
  • Jiang et al. (2016) Meng Jiang, Peng Cui, Nicholas Jing Yuan, Xing Xie, and Shiqiang Yang. 2016. Little Is Much: Bridging Cross-Platform Behaviors through Overlapped Crowds. In Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, February 12-17, 2016, Phoenix, Arizona, USA. AAAI Press, 13–19.
  • Kang et al. (2019) SeongKu Kang, Junyoung Hwang, Dongha Lee, and Hwanjo Yu. 2019. Semi-Supervised Learning for Cross-Domain Recommendation to Cold-Start Users. In Proceedings of the 28th ACM International Conference on Information and Knowledge Management, CIKM 2019, Beijing, China, November 3-7, 2019. ACM, 1563–1572.
  • Khan et al. (2017) Muhammad Murad Khan, Roliana Ibrahim, and Imran Ghani. 2017. Cross Domain Recommender Systems: A Systematic Literature Review. ACM Comput. Surv. 50, 3 (2017), 36:1–36:34.
  • Koren (2008) Yehuda Koren. 2008. Factorization meets the neighborhood: a multifaceted collaborative filtering model. In Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Las Vegas, Nevada, USA, August 24-27, 2008. ACM, 426–434.
  • Krishnan et al. (2020) Adit Krishnan, Mahashweta Das, Mangesh Bendre, Hao Yang, and Hari Sundaram. 2020. Transfer Learning via Contextual Invariants for One-to-Many Cross-Domain Recommendation. In Proceedings of the 43rd International ACM SIGIR conference on research and development in Information Retrieval, SIGIR 2020, Virtual Event, China, July 25-30, 2020. ACM, 1081–1090.
  • Kumar et al. (2014) Anil Kumar, Nitesh Kumar, Muzammil Hussain, Santanu Chaudhury, and Sumeet Agarwal. 2014. Semantic clustering-based cross-domain recommendation. In 2014 IEEE Symposium on Computational Intelligence and Data Mining, CIDM 2014, Orlando, FL, USA, December 9-12, 2014. IEEE, 137–141.
  • Li (2011) Bin Li. 2011. Cross-Domain Collaborative Filtering: A Brief Survey. In IEEE 23rd International Conference on Tools with Artificial Intelligence, ICTAI 2011, Boca Raton, FL, USA, November 7-9, 2011. IEEE Computer Society, 1085–1086.
  • Li et al. (2009a) Bin Li, Qiang Yang, and Xiangyang Xue. 2009a. Can Movies and Books Collaborate? Cross-Domain Collaborative Filtering for Sparsity Reduction. In IJCAI 2009, Proceedings of the 21st International Joint Conference on Artificial Intelligence, Pasadena, California, USA, July 11-17, 2009, Craig Boutilier (Ed.). 2052–2057.
  • Li et al. (2009b) Bin Li, Qiang Yang, and Xiangyang Xue. 2009b. Transfer learning for collaborative filtering via a rating-matrix generative model. In Proceedings of the 26th Annual International Conference on Machine Learning, ICML 2009, Montreal, Quebec, Canada, June 14-18, 2009 (ACM International Conference Proceeding Series), Vol. 382. ACM, 617–624.
  • Li et al. (2015) Bin Li, Xingquan Zhu, Ruijiang Li, and Chengqi Zhang. 2015. Rating Knowledge Sharing in Cross-Domain Collaborative Filtering. IEEE Trans. Cybern. 45, 5 (2015), 1054–1068.
  • Li et al. (2011) Bin Li, Xingquan Zhu, Ruijiang Li, Chengqi Zhang, Xiangyang Xue, and Xindong Wu. 2011. Cross-Domain Collaborative Filtering over Time. In IJCAI 2011, Proceedings of the 22nd International Joint Conference on Artificial Intelligence, Barcelona, Catalonia, Spain, July 16-22, 2011. IJCAI/AAAI, 2293–2298.
  • Li and Lin (2014) Chung-Yi Li and Shou-De Lin. 2014. Matching users and items across domains to improve the recommendation quality. In The 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’14, New York, NY, USA - August 24 - 27, 2014. ACM, 801–810.
  • Li et al. (2020) Jin Li, Zhaohui Peng, Senzhang Wang, Xiaokang Xu, Philip S. Yu, and Zhenyun Hao. 2020. Heterogeneous Graph Embedding for Cross-Domain Recommendation Through Adversarial Learning. In Database Systems for Advanced Applications - 25th International Conference, DASFAA 2020, Jeju, South Korea, September 24-27, 2020, Proceedings, Part III (Lecture Notes in Computer Science), Vol. 12114. Springer, 507–522.
  • Li et al. (2019) Lile Li, Quan Do, and Wei Liu. 2019. Cross-Domain Recommendation via Coupled Factorization Machines. In The Thirty-Third AAAI Conference on Artificial Intelligence, AAAI 2019, The Thirty-First Innovative Applications of Artificial Intelligence Conference, IAAI 2019, The Ninth AAAI Symposium on Educational Advances in Artificial Intelligence, EAAI 2019, Honolulu, Hawaii, USA, January 27 - February 1, 2019. AAAI Press, 9965–9966.
  • Li and Tuzhilin (2020) Pan Li and Alexander Tuzhilin. 2020. DDTCDR: Deep Dual Transfer Cross Domain Recommendation. In WSDM ’20: The Thirteenth ACM International Conference on Web Search and Data Mining, Houston, TX, USA, February 3-7, 2020. ACM, 331–339.
  • Lian et al. (2017) Jianxun Lian, Fuzheng Zhang, Xing Xie, and Guangzhong Sun. 2017. CCCFNet: A Content-Boosted Collaborative Filtering Neural Network for Cross Domain Recommender Systems. In Proceedings of the 26th International Conference on World Wide Web Companion, Perth, Australia, April 3-7, 2017. ACM, 817–818.
  • Liu et al. (2020b) Jian Liu, Pengpeng Zhao, Fuzhen Zhuang, Yanchi Liu, Victor S. Sheng, Jiajie Xu, Xiaofang Zhou, and Hui Xiong. 2020b. Exploiting Aesthetic Preference in Deep Cross Networks for Cross-domain Recommendation. In WWW ’20: The Web Conference 2020, Taipei, Taiwan, April 20-24, 2020. ACM / IW3C2, 2768–2774.
  • Liu et al. (2020a) Meng Liu, Jianjun Li, Guohui Li, and Peng Pan. 2020a. Cross Domain Recommendation via Bi-directional Transfer Graph Collaborative Filtering Networks. In CIKM ’20: The 29th ACM International Conference on Information and Knowledge Management, Virtual Event, Ireland, October 19-23, 2020. ACM, 885–894.
  • Liu et al. (2015) Yan-Fu Liu, Cheng-Yu Hsu, and Shan-Hung Wu. 2015. Non-Linear Cross-Domain Collaborative Filtering via Hyper-Structure Transfer. In Proceedings of the 32nd International Conference on Machine Learning, ICML 2015, Lille, France, 6-11 July 2015 (JMLR Workshop and Conference Proceedings), Vol. 37. JMLR.org, 1190–1198.
  • Loni et al. (2014) Babak Loni, Yue Shi, Martha A. Larson, and Alan Hanjalic. 2014. Cross-Domain Collaborative Filtering with Factorization Machines. In Advances in Information Retrieval - 36th European Conference on IR Research, ECIR 2014, Amsterdam, The Netherlands, April 13-16, 2014. Proceedings (Lecture Notes in Computer Science), Vol. 8416. Springer, 656–661.
  • Lu et al. (2013) Zhongqi Lu, Weike Pan, Evan Wei Xiang, Qiang Yang, Lili Zhao, and Erheng Zhong. 2013. Selective Transfer Learning for Cross Domain Recommendation. In Proceedings of the 13th SIAM International Conference on Data Mining, May 2-4, 2013. Austin, Texas, USA. SIAM, 641–649.
  • Ma et al. (2008) Hao Ma, Haixuan Yang, Michael R. Lyu, and Irwin King. 2008. SoRec: social recommendation using probabilistic matrix factorization. In Proceedings of the 17th ACM Conference on Information and Knowledge Management, CIKM 2008, Napa Valley, California, USA, October 26-30, 2008. ACM, 931–940.
  • Ma et al. ([n.d.]) Muyang Ma, Pengjie Ren, Yujie Lin, Zhumin Chen, Jun Ma, et al. 2019. π-Net: A Parallel Information-sharing Network for Shared-account Cross-domain Sequential Recommendations. In Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2019, Paris, France, July 21-25, 2019. ACM, 685–694.
  • Ma et al. (2018) Weizhi Ma, Min Zhang, Chenyang Wang, Cheng Luo, Yiqun Liu, and Shaoping Ma. 2018. Your Tweets Reveal What You Like: Introducing Cross-media Content Information into Multi-domain Recommendation. In Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, IJCAI 2018, July 13-19, 2018, Stockholm, Sweden. ijcai.org, 3484–3490.
  • Man et al. (2017) Tong Man, Huawei Shen, Xiaolong Jin, and Xueqi Cheng. 2017. Cross-Domain Recommendation: An Embedding and Mapping Approach. In Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, IJCAI 2017, Melbourne, Australia, August 19-25, 2017. ijcai.org, 2464–2470.
  • Manotumruksa et al. (2019) Jarana Manotumruksa, Dimitrios Rafailidis, Craig Macdonald, and Iadh Ounis. 2019. On Cross-Domain Transfer in Venue Recommendation. In Advances in Information Retrieval - 41st European Conference on IR Research, ECIR 2019, Cologne, Germany, April 14-18, 2019, Proceedings, Part I (Lecture Notes in Computer Science), Vol. 11437. Springer, 443–456.
  • Manzato (2013) Marcelo Garcia Manzato. 2013. gSVD++: supporting implicit feedback on recommender systems with metadata awareness. In Proceedings of the 28th Annual ACM Symposium on Applied Computing, SAC ’13, Coimbra, Portugal, March 18-22, 2013. ACM, 908–913.
  • Mirbakhsh and Ling (2015) Nima Mirbakhsh and Charles X. Ling. 2015. Improving Top-N Recommendation for Cold-Start Users via Cross-Domain Information. ACM Trans. Knowl. Discov. Data 9, 4 (2015), 33:1–33:19.
  • Moreno et al. (2012) Orly Moreno, Bracha Shapira, Lior Rokach, and Guy Shani. 2012. TALMUD: transfer learning for multiple domains. In 21st ACM International Conference on Information and Knowledge Management, CIKM’12, Maui, HI, USA, October 29 - November 02, 2012. ACM, 425–434.
  • Pan et al. (2011) Weike Pan, Nathan Nan Liu, Evan Wei Xiang, and Qiang Yang. 2011. Transfer Learning to Predict Missing Ratings via Heterogeneous User Feedbacks. In IJCAI 2011, Proceedings of the 22nd International Joint Conference on Artificial Intelligence, Barcelona, Catalonia, Spain, July 16-22, 2011. IJCAI/AAAI, 2318–2323.
  • Pan et al. (2010) Weike Pan, Evan Wei Xiang, Nathan Nan Liu, and Qiang Yang. 2010. Transfer Learning in Collaborative Filtering for Sparsity Reduction. In Proceedings of the Twenty-Fourth AAAI Conference on Artificial Intelligence, AAAI 2010, Atlanta, Georgia, USA, July 11-15, 2010. AAAI Press.
  • Pan et al. (2012) Weike Pan, Evan Wei Xiang, and Qiang Yang. 2012. Transfer Learning in Collaborative Filtering with Uncertain Ratings. In Proceedings of the Twenty-Sixth AAAI Conference on Artificial Intelligence, July 22-26, 2012, Toronto, Ontario, Canada. AAAI Press.
  • Perera and Zimmermann (2017) Dilruk Perera and Roger Zimmermann. 2017. Exploring the use of Time-Dependent Cross-Network Information for Personalized Recommendations. In Proceedings of the 2017 ACM on Multimedia Conference, MM 2017, Mountain View, CA, USA, October 23-27, 2017. ACM, 1780–1788.
  • Perera and Zimmermann (2018) Dilruk Perera and Roger Zimmermann. 2018. LSTM Networks for Online Cross-Network Recommendations. In Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, IJCAI 2018, July 13-19, 2018, Stockholm, Sweden. ijcai.org, 3825–3833.
  • Perera and Zimmermann (2020) Dilruk Perera and Roger Zimmermann. 2020. Towards Comprehensive Recommender Systems: Time-Aware Unified Recommendations Based on Listwise Ranking of Implicit Cross-Network Data. In The Thirty-Fourth AAAI Conference on Artificial Intelligence, AAAI 2020, The Thirty-Second Innovative Applications of Artificial Intelligence Conference, IAAI 2020, The Tenth AAAI Symposium on Educational Advances in Artificial Intelligence, EAAI 2020, New York, NY, USA, February 7-12, 2020. AAAI Press, 189–197.
  • Rafailidis and Crestani (2016) Dimitrios Rafailidis and Fabio Crestani. 2016. Top-N Recommendation via Joint Cross-Domain User Clustering and Similarity Learning. In Machine Learning and Knowledge Discovery in Databases - European Conference, ECML PKDD 2016, Riva del Garda, Italy, September 19-23, 2016, Proceedings, Part II (Lecture Notes in Computer Science), Vol. 9852. Springer, 426–441.
  • Rafailidis and Crestani (2017) Dimitrios Rafailidis and Fabio Crestani. 2017. A Collaborative Ranking Model for Cross-Domain Recommendations. In Proceedings of the 2017 ACM on Conference on Information and Knowledge Management, CIKM 2017, Singapore, November 06 - 10, 2017. ACM, 2263–2266.
  • Ren et al. (2015) Siting Ren, Sheng Gao, Jianxin Liao, and Jun Guo. 2015. Improving Cross-Domain Recommendation through Probabilistic Cluster-Level Latent Factor Model. In Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, January 25-30, 2015, Austin, Texas, USA. AAAI Press, 4200–4201.
  • Sahebi and Brusilovsky (2015) Shaghayegh Sahebi and Peter Brusilovsky. 2015. It Takes Two to Tango: An Exploration of Domain Pairs for Cross-Domain Collaborative Filtering. In Proceedings of the 9th ACM Conference on Recommender Systems, RecSys 2015, Vienna, Austria, September 16-20, 2015. ACM, 131–138.
  • Sahebi et al. (2017) Shaghayegh Sahebi, Peter Brusilovsky, and Vladimir Bobrokov. 2017. Cross-Domain Recommendation for Large-Scale Data. In Proceedings of the 1st Workshop on Intelligent Recommender Systems by Knowledge Transfer & Learning co-located with ACM Conference on Recommender Systems (RecSys 2017), Como, Italy, August 27, 2017 (CEUR Workshop Proceedings), Vol. 1887. CEUR-WS.org, 9–15.
  • Shi et al. (2011) Yue Shi, Martha A. Larson, and Alan Hanjalic. 2011. Tags as Bridges between Domains: Improving Recommendation with Tag-Induced Cross-Domain Collaborative Filtering. In User Modeling, Adaption and Personalization - 19th International Conference, UMAP 2011, Girona, Spain, July 11-15, 2011. Proceedings (Lecture Notes in Computer Science), Vol. 6787. Springer, 305–316.
  • Shu et al. (2018) Kai Shu, Suhang Wang, Jiliang Tang, Yilin Wang, and Huan Liu. 2018. CrossFire: Cross Media Joint Friend and Item Recommendations. In Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining, WSDM 2018, Marina Del Rey, CA, USA, February 5-9, 2018. ACM, 522–530.
  • Singh and Gordon (2008) Ajit Paul Singh and Geoffrey J. Gordon. 2008. Relational learning via collective matrix factorization. In Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Las Vegas, Nevada, USA, August 24-27, 2008. ACM, 650–658.
  • Song et al. (2017) Tianhang Song, Zhaohui Peng, Senzhang Wang, Wenjing Fu, Xiaoguang Hong, and Philip S. Yu. 2017. Review-Based Cross-Domain Recommendation Through Joint Tensor Factorization. In Database Systems for Advanced Applications - 22nd International Conference, DASFAA 2017, Suzhou, China, March 27-30, 2017, Proceedings, Part I (Lecture Notes in Computer Science), Vol. 10177. Springer, 525–540.
  • Tan et al. (2014) Shulong Tan, Jiajun Bu, Xuzhen Qin, Chun Chen, and Deng Cai. 2014. Cross domain recommendation based on multi-type media fusion. Neurocomputing 127 (2014), 124–134.
  • Wang et al. (2017) Xiang Wang, Xiangnan He, Liqiang Nie, and Tat-Seng Chua. 2017. Item Silk Road: Recommending Items from Information Domains to Social Users. In Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval, Shinjuku, Tokyo, Japan, August 7-11, 2017. ACM, 185–194.
  • Wang et al. (2018) Xinghua Wang, Zhaohui Peng, Senzhang Wang, Philip S. Yu, Wenjing Fu, and Xiaoguang Hong. 2018. Cross-Domain Recommendation for Cold-Start Users via Neighborhood Based Feature Mapping. In Database Systems for Advanced Applications - 23rd International Conference, DASFAA 2018, Gold Coast, QLD, Australia, May 21-24, 2018, Proceedings, Part I (Lecture Notes in Computer Science), Vol. 10827. Springer, 158–165.
  • Wang et al. (2019) Yaqing Wang, Chunyan Feng, Caili Guo, Yunfei Chu, and Jenq-Neng Hwang. 2019. Solving the Sparsity Problem in Recommendations via Cross-Domain Item Embedding Based on Co-Clustering. In Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining, WSDM 2019, Melbourne, VIC, Australia, February 11-15, 2019. ACM, 717–725.
  • Xin et al. (2015) Xin Xin, Zhirun Liu, Chin-Yew Lin, Heyan Huang, Xiaochi Wei, and Ping Guo. 2015. Cross-Domain Collaborative Filtering with Review Text. In Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence, IJCAI 2015, Buenos Aires, Argentina, July 25-31, 2015. AAAI Press, 1827–1834.
  • Yan et al. (2019) Huan Yan, Xiangning Chen, Chen Gao, Yong Li, and Depeng Jin. 2019. DeepAPF: Deep Attentive Probabilistic Factorization for Multi-site Video Recommendation. In Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, IJCAI 2019, Macao, China, August 10-16, 2019. ijcai.org, 1459–1465.
  • Yan et al. (2020) Haoran Yan, Pengpeng Zhao, Fuzhen Zhuang, Deqing Wang, Yanchi Liu, and Victor S. Sheng. 2020. Cross-Domain Recommendation with Adversarial Examples. In Database Systems for Advanced Applications - 25th International Conference, DASFAA 2020, Jeju, South Korea, September 24-27, 2020, Proceedings, Part III (Lecture Notes in Computer Science), Vol. 12114. Springer, 573–589.
  • Yan et al. (2015) Ming Yan, Jitao Sang, and Changsheng Xu. 2015. Unified YouTube Video Recommendation via Cross-network Collaboration. In Proceedings of the 5th ACM on International Conference on Multimedia Retrieval, Shanghai, China, June 23-26, 2015. ACM, 19–26.
  • Yang et al. (2017) Chunfeng Yang, Huan Yan, Donghan Yu, Yong Li, and Dah Ming Chiu. 2017. Multi-site User Behavior Modeling and Its Application in Video Recommendation. In Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval, Shinjuku, Tokyo, Japan, August 7-11, 2017. ACM, 175–184.
  • Yang et al. (2015) Deqing Yang, Jingrui He, Huazheng Qin, Yanghua Xiao, and Wei Wang. 2015. A Graph-based Recommendation across Heterogeneous Domains. In Proceedings of the 24th ACM International Conference on Information and Knowledge Management, CIKM 2015, Melbourne, VIC, Australia, October 19-23, 2015. ACM, 463–472.
  • Yuan et al. (2019) Feng Yuan, Lina Yao, and Boualem Benatallah. 2019. DARec: Deep Domain Adaptation for Cross-Domain Recommendation via Transferring Rating Patterns. In Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, IJCAI 2019, Macao, China, August 10-16, 2019. ijcai.org, 4227–4233.
  • Zhang et al. (2019a) Qian Zhang, Peng Hao, Jie Lu, and Guangquan Zhang. 2019a. Cross-domain Recommendation with Semantic Correlation in Tagging Systems. In International Joint Conference on Neural Networks, IJCNN 2019, Budapest, Hungary, July 14-19, 2019. IEEE, 1–8.
  • Zhang et al. (2019b) Qian Zhang, Jie Lu, Dianshuang Wu, and Guangquan Zhang. 2019b. A Cross-Domain Recommender System With Kernel-Induced Knowledge Transfer for Overlapping Entities. IEEE Trans. Neural Networks Learn. Syst. 30, 7 (2019), 1998–2012.
  • Zhang et al. (2017) Qian Zhang, Dianshuang Wu, Jie Lu, Feng Liu, and Guangquan Zhang. 2017. A cross-domain recommender system with consistent information transfer. Decis. Support Syst. 104 (2017), 49–63.
  • Zhang et al. (2018) Qian Zhang, Dianshuang Wu, Jie Lu, and Guangquan Zhang. 2018. Cross-domain Recommendation with Probabilistic Knowledge Transfer. In Neural Information Processing - 25th International Conference, ICONIP 2018, Siem Reap, Cambodia, December 13-16, 2018, Proceedings, Part III (Lecture Notes in Computer Science), Vol. 11303. Springer, 208–219.
  • Zhang et al. (2020) Yinan Zhang, Yong Liu, Peng Han, Chunyan Miao, Lizhen Cui, Baoli Li, and Haihong Tang. 2020. Learning Personalized Itemset Mapping for Cross-Domain Recommendation. In Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, IJCAI 2020. ijcai.org, 2561–2567.
  • Zhang et al. (2016) Zihan Zhang, Xiaoming Jin, Lianghao Li, Guiguang Ding, and Qiang Yang. 2016. Multi-Domain Active Learning for Recommendation. In Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, February 12-17, 2016, Phoenix, Arizona, USA. AAAI Press, 2358–2364.
  • Zhao et al. (2019) Cheng Zhao, Chenliang Li, and Cong Fu. 2019. Cross-Domain Recommendation via Preference Propagation GraphNet. In Proceedings of the 28th ACM International Conference on Information and Knowledge Management, CIKM 2019, Beijing, China, November 3-7, 2019. ACM, 2165–2168.
  • Zhao et al. (2020) Cheng Zhao, Chenliang Li, Rong Xiao, Hongbo Deng, and Aixin Sun. 2020. CATN: Cross-Domain Recommendation for Cold-Start Users via Aspect Transfer Network. In Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2020, Virtual Event, China, July 25-30, 2020. ACM, 229–238.
  • Zhao et al. (2013) Lili Zhao, Sinno Jialin Pan, Evan Wei Xiang, Erheng Zhong, Zhongqi Lu, and Qiang Yang. 2013. Active Transfer Learning for Cross-System Recommendation. In Proceedings of the Twenty-Seventh AAAI Conference on Artificial Intelligence, July 14-18, 2013, Bellevue, Washington, USA. AAAI Press.
  • Zhao et al. (2017) Lili Zhao, Sinno Jialin Pan, and Qiang Yang. 2017. A unified framework of active transfer learning for cross-system recommendation. Artif. Intell. 245 (2017), 38–55.
  • Zhao et al. (2018) Zhi-Lin Zhao, Ling Huang, Chang-Dong Wang, and Dong Huang. 2018. Low-Rank and Sparse Cross-Domain Recommendation Algorithm. In Database Systems for Advanced Applications - 23rd International Conference, DASFAA 2018, Gold Coast, QLD, Australia, May 21-24, 2018, Proceedings, Part I (Lecture Notes in Computer Science), Vol. 10827. Springer, 150–157.
  • Zhu et al. (2019) Feng Zhu, Chaochao Chen, Yan Wang, Guanfeng Liu, and Xiaolin Zheng. 2019. DTCDR: A Framework for Dual-Target Cross-Domain Recommendation. In Proceedings of the 28th ACM International Conference on Information and Knowledge Management, CIKM 2019, Beijing, China, November 3-7, 2019. ACM, 1533–1542.
  • Zhu et al. (2018) Feng Zhu, Yan Wang, Chaochao Chen, Guanfeng Liu, Mehmet A. Orgun, and Jia Wu. 2018. A Deep Framework for Cross-Domain and Cross-System Recommendations. In Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, IJCAI 2018, July 13-19, 2018, Stockholm, Sweden. ijcai.org, 3711–3717.
  • Zhu et al. (2020) Feng Zhu, Yan Wang, Chaochao Chen, Guanfeng Liu, and Xiaolin Zheng. 2020. A Graphical and Attentional Framework for Dual-Target Cross-Domain Recommendation. In Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, IJCAI 2020. ijcai.org, 3001–3008.