Latent Unexpected Recommendations

07/27/2020 ∙ by Pan Li, et al. ∙ NYU college 0

Unexpected recommender systems constitute an important tool to tackle the problems of filter bubbles and user boredom, aiming to provide recommendations that are both unexpected and satisfying to target users. Previous unexpected recommendation methods focus only on the straightforward relations between current recommendations and user expectations by modeling unexpectedness in the feature space, and thus sacrifice accuracy in order to improve unexpectedness. In contrast to these prior models, we propose to model unexpectedness in the latent space of user and item embeddings, which allows us to capture hidden and complex relations between new recommendations and historic purchases. In addition, we develop a novel Latent Closure (LC) method to construct a hybrid utility function and provide unexpected recommendations based on the proposed model. Extensive experiments on three real-world datasets illustrate the superiority of our proposed approach over state-of-the-art unexpected recommendation models: it achieves a significant increase in the unexpectedness measure without sacrificing any accuracy metric under all experimental settings in this paper.


1. Introduction

Recommender systems have been playing an important role in the process of information dissemination and online commerce, which assist users in filtering the best content while shaping their consumption behavior patterns at the same time. However, classical recommender systems are facing the problem of filter bubbles (Pariser, 2011; Nguyen et al., 2014), which means that target users only get recommendations of their most familiar items, while losing reach to many other available items. They also lead to the problem of user boredom (Kapoor et al., 2015a, b), which significantly deteriorates user satisfaction with recommender systems. For example, even a Harry Potter fan may feel unsatisfied if the system keeps recommending Harry Potter series all the time.

To address these two problems, researchers have introduced recommendation objectives beyond accuracy, including unexpectedness, serendipity, novelty and diversity (Shani and Gunawardana, 2011), the goal of which is to provide novel, surprising and satisfying recommendations. Among them, unexpectedness is of particular interest for its close relation with user satisfaction and ability to improve recommendation performance (Adamopoulos, 2014; Adamopoulos and Tuzhilin, 2015). Therefore, we focus on modeling unexpectedness and providing unexpected recommendations in this paper.

In prior literature, researchers have proposed to define unexpectedness in multiple ways, including deviations from primitive prediction results (Murakami et al., 2007; Ge et al., 2010), unexpected combination of feature patterns (Akiyama et al., 2010) and feature distance from previous consumptions (Adamopoulos and Tuzhilin, 2015). They subsequently provide unexpected recommendations based on these definitions and achieve significant performance improvements in terms of certain unexpectedness measures.

However, as shown in the prior literature (Zolaktaf et al., 2018; Zhou et al., 2010), improvements in unexpectedness come at the cost of sacrificing accuracy measures, which severely limits the practical use of unexpected recommendations since the major goal of recommender systems is to enhance overall user satisfaction. This is the case for the following reasons. First, previous models only focus on the straightforward relations between current recommendations and user expectations by modeling unexpectedness in the feature space, while not taking into account deep, complex and heterogeneous relations between users and items. Second, prior modeling of unexpectedness relies completely on explicit user and item information, and may not work well when the consumption records are sparse, noisy or even missing. Finally, the distance metric between discrete items, which is crucial for defining unexpectedness, is hard to formulate in the discrete feature space, and this may lead to unintentional biases in the estimation of user preferences. Therefore, prior unexpected recommendation models can be further improved, and this constitutes the main topic of this paper.

To address the aforementioned concerns, in this paper we propose to define unexpectedness in the latent space containing latent embeddings of users and items, as opposed to the feature space that only has the explicit information about them. Specifically, we propose a novel Latent Closure (LC) method to model unexpectedness that:

  • captures latent, complex and heterogeneous relations between users and items to effectively model the concept of unexpectedness.

  • provides unexpected recommendations without sacrificing any performance accuracy.

  • efficiently computes unexpectedness for large-scale recommendation services.

The proposed unexpected recommendation model follows a three-stage procedure. First, we map the features of users and items into the latent space and represent users and items as latent embeddings there. These embeddings are obtained using several state-of-the-art mapping approaches, including Heterogeneous Information Network Embeddings (HINE) (Sun and Han, 2013; Shi et al., 2018; Dong et al., 2017), AutoEncoder (AE) (Hinton and Salakhutdinov, 2006; Sedhain et al., 2015) and MultiModal Embeddings (ME) (Pan et al., 2016) methods. Second, we utilize the concept of “closure” from differential geometry and define the unexpectedness of a new item as the distance between the embedding of that item and the closure of all the previously consumed item embeddings. Finally, we combine this unexpectedness measure with the estimated rating of the item to construct the hybrid utility function for providing unexpected recommendations.

In this paper, we make the following contributions:

(1) We propose latent modeling of unexpectedness. Although many papers have recently explored latent spaces for recommendation purposes, it is not clear how to do it for unexpected recommendations, which constitutes the topic of this work.

(2) We construct a hybrid utility function based on the proposed unexpectedness measure and provide unexpected recommendations accordingly. We also demonstrate that this approach significantly outperforms all other unexpected recommendation baselines considered in this paper.

(3) We conduct extensive experiments in multiple settings and show that it is indeed the latent modeling of unexpectedness that leads to a significant increase in unexpectedness measures without sacrificing any accuracy performance. Thus, the proposed method helps users break out of their filter bubbles without sacrificing recommendation performance.

The rest of the paper is organized as follows. We discuss the related work in Section 2 and present our proposed latent modeling of unexpectedness in Section 3. The unexpected recommendation model is introduced in Section 4. The experimental design on three real-world datasets is described in Section 5, and the results as well as discussions are presented in Section 6. Finally, Section 7 summarizes our contributions and concludes the paper.

2. Related Work

In this section, we provide an overview on the related work covering three fields: beyond-accuracy metrics, unexpected recommendations and latent embeddings for recommendations. We highlight the importance of combining unexpected recommendations with latent modeling approaches to achieve superb recommendation performance.

2.1. Beyond-Accuracy Metrics

As researchers have pointed out, accuracy is not the only important objective of recommendations (McNee et al., 2006); other beyond-accuracy metrics should also be taken into account, including unexpectedness, serendipity, novelty, diversity, coverage and so on (Ge et al., 2010; Kaminskas and Bridge, 2016). Note that these metrics are closely related to each other but still differ in definition and formulation. Therefore, prior literature has proposed multiple recommendation models to optimize each of these metrics separately.

Serendipity measures the positive emotional response of the user about a previously unknown item and indicates how surprising these recommendations are to the target users (Shani and Gunawardana, 2011; Chen et al., 2019). Representative methods to improve serendipity performance include Serendipitous Personalized Ranking (SPR) (Lu et al., 2012), which extends traditional personalized ranking methods by considering serendipity information in the AUC optimization process; and Auralist (Zhang et al., 2012), which utilizes the topic modeling approach to capture serendipity information and provide serendipitous recommendations accordingly.

Novelty measures the percentage of new recommendations that the users have not seen before or known about (McNee et al., 2006). It is computed as the percentage of unknown items in the recommendations. Researchers have proposed multiple methods to improve novelty measure in recommendations, including clustering of long-tail items (Park and Tuzhilin, 2008), innovation diffusion (Ishikawa et al., 2008), graph-based algorithms (Shi, 2013) and ranking models (Wasilewski and Hurley, 2019; Oh et al., 2011).

Diversity measures the variety of items in a recommendation list, which is commonly modeled as the aggregate pairwise similarity of recommended items (Ziegler et al., 2005). Typical models to improve the diversity of recommendations include Determinantal Point Process (DPP) (Gartrell et al., 2017; Chen et al., 2018), which proposes a novel algorithm to greatly accelerate the greedy MAP inference and provide diversified recommendations accordingly; Greedy Re-ranking methods (Ziegler et al., 2005; Smyth and McClave, 2001; Kelly and Bridge, 2006; Vargas et al., 2014; Barraza-Urbina, 2017), which provide diversified recommendations based on the combination of the item’s relevance and its average distance to items already in the recommended list; and Latent Factor models that optimize diversity measures (Shi et al., 2012; Hurley, 2013; Su et al., 2013).

Coverage measures the degree to which recommendations cover the set of available items (Ge et al., 2010; Herlocker et al., 2004; Adomavicius and Kwon, 2011a). To improve coverage measure, researchers propose to use coverage optimization (Adomavicius and Kwon, 2011a, b) and popularity reduction methods (Vargas and Castells, 2011) to balance between relevance and coverage objectives (Wu et al., 2016).
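The list-level metrics described above can be made concrete with a small sketch. The function names, the use of cosine similarity, and the choice to report diversity as one minus the average pairwise similarity are assumptions made for illustration only; the cited works use their own formulations.

```python
from itertools import combinations


def novelty(recommended, known_items):
    """Share of recommended items that the user has not seen before."""
    if not recommended:
        return 0.0
    unseen = [item for item in recommended if item not in known_items]
    return len(unseen) / len(recommended)


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 for zero vectors)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sum(x * x for x in a) ** 0.5
    norm_b = sum(y * y for y in b) ** 0.5
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def diversity(recommended, vectors):
    """Assumed form: one minus the average pairwise cosine similarity."""
    pairs = list(combinations(recommended, 2))
    if not pairs:
        return 0.0
    avg_sim = sum(cosine(vectors[a], vectors[b]) for a, b in pairs) / len(pairs)
    return 1.0 - avg_sim


def coverage(recommendation_lists, catalog):
    """Fraction of the catalog that appears in at least one recommendation list."""
    covered = set().union(*recommendation_lists) if recommendation_lists else set()
    return len(covered & set(catalog)) / len(catalog)
```

For instance, a list of four items of which one is already known to the user has a novelty of 0.75 under this sketch.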

Among all beyond-accuracy metrics, in this paper we focus only on the unexpectedness measure and aim at providing unexpected recommendations, because of its close relation with user satisfaction and its ability to improve recommendation performance (Adamopoulos, 2014; Adamopoulos and Tuzhilin, 2015). Moreover, the proposed unexpected recommendation algorithm is capable of improving serendipity and diversity measures as well, as shown in our experiment results.

2.2. Unexpectedness in Recommendations

Different from other beyond-accuracy metrics, unexpectedness measures those recommendations that are not included in user expectations and depart from what they would expect from the recommender system. Researchers have shown the importance of incorporating unexpectedness in recommendations, which could overcome the overspecialization problem (Adamopoulos and Tuzhilin, 2015; Iaquinta et al., 2010), broaden user preferences (Herlocker et al., 2004; Zhang et al., 2012; Zheng et al., 2015) and increase user satisfaction (Adamopoulos and Tuzhilin, 2015; Zhang et al., 2012; Lu et al., 2012). Unexpectedness captures the deviation of a particular recommender system from the results obtained from other primitive prediction models (Murakami et al., 2007; Ge et al., 2010; Akiyama et al., 2010), and also the deviation from user expectations (Adamopoulos and Tuzhilin, 2015; Li and Tuzhilin, 2019a).

Existing models that improve the unexpectedness measure in the final recommendations can be classified into three categories: rule-based approaches, model-based approaches and utility-based approaches, as we show in Table 1.

Rule-based approaches typically involve pre-definition of a set of rules or recommendation strategies for unexpected recommendations, including partial similarity (Kamahara et al., 2005), k-furthest-neighbor (Said et al., 2012) and graph-based approaches (Taramigkou et al., 2013; Lee and Lee, 2015). Rule-based approaches are generally simple to implement and easy to put into practice, as most of them incorporate unexpectedness into classical models instead of starting from scratch. Besides, rule-based approaches allow for more control over the model, as the rules and recommendation strategies are often explicitly specified by the designers. This also improves the explainability and interpretability of the unexpected recommendation model. However, they require pre-defined strategies to be set prior to recommendations. Also, scalability is a big concern for the usage of rule-based methods. In addition, these models typically lack generalizability, for they focus only on specific domains and specific applications.

Model-based approaches aim to improve novelty and unexpectedness of the recommended items by proposing new models and data structures that go beyond the traditional collaborative filtering paradigm. Representative models that optimize the unexpectedness objective include personalized ranking (Wasilewski and Hurley, 2019), innovator identification (Kawamae et al., 2009; Kawamae, 2010) and transition cost graph (Shi, 2013). Model-based approaches are backed by mathematical foundations that guarantee either convergence or stability of the learning process, thus making them robust to different settings and giving them greater potential for generalizability. However, they are often hard to interpret, for there is no natural way to translate mathematical formulations into explicit rules or recommendation strategies. Therefore, it is relatively hard to control the degree of unexpectedness that we aim to incorporate into the recommendation model. Finally, model-based approaches might not take full advantage of all available information due to the restrictions of the specific model form.

Utility-based approaches involve the construction of a hybrid utility function as the combination of estimated relevance and degree of unexpectedness. Researchers in (Weng et al., 2007; Iaquinta et al., 2008; Hijikata et al., 2009) have followed this direction of research. Specifically, (Adamopoulos and Tuzhilin, 2015) proposed to include user expectation in the hybrid utility function and achieved state-of-the-art unexpected recommendation performance. Utility-based methods allow for more control of the recommendation strategy, and they are easier to implement and put into practice as well. In particular, the construction of unexpectedness does not depend on the estimation of user preferences towards the candidate item, thus making it model-agnostic. On the other hand, the unexpectedness hyperparameter plays an important role in determining the recommendation performance of the utility-based model, thus requiring proper hyperparameter optimization.

One important limitation of all prior unexpected recommendation models lies in that they only focus on the straightforward relations between users and items and define unexpectedness in the feature space, without taking into account the deep, complex interactions underlying their feature information. Therefore, previous unexpected recommendations might not reach the optimal recommendation performance, as discussed in (Yu et al., 2013; Shi et al., 2018; Yu et al., 2014). In addition, they are facing the trade-off dilemma between optimizing the accuracy and unexpectedness objectives. To address these limitations, in this paper we propose to define unexpectedness instead in the latent space, thus obtaining significant improvements over previous models.

| Model | Literature | Strength | Weakness |
|---|---|---|---|
| Rule-Based Approaches: K-Furthest Neighbor, Frequency Discount, Taxonomy-Based Similarity, Partial Similarity, Social Network, Graph Theory | (Said et al., 2012), (Chiu et al., 2011), (Kamahara et al., 2005), (Lee and Lee, 2015), (Taramigkou et al., 2013) | Easy to implement; Allow for model control; Improves interpretability | Require pre-defined rules; Lack of scalability; Lack of generalizability |
| Model-Based Approaches: Matrix Factorization, Learning to Rank, Re-Ranking, Clustering, Graph Theory | (Kawamae et al., 2009), (Kawamae, 2010), (Lu et al., 2012), (Shi, 2013), (Wasilewski and Hurley, 2019) | Robust and generalizable; Mathematical foundation; Efficient optimization | Lack of interpretability; Restricted model control; Limited model input |
| Utility-Based Approaches: Weighted Sum Model, Weighted Product Model, Probabilistic Model, Neural Network Model | (Weng et al., 2007), (Iaquinta et al., 2008), (Hijikata et al., 2009), (Zhang et al., 2012), (Adamopoulos and Tuzhilin, 2015), (Li and Tuzhilin, 2019a) | Balance between objectives; Allow for model control; Model-agnostic | Require hyperparameter optimization; Explicit information only |

Table 1. Classification of Unexpected Recommendation Research

2.3. Latent Embeddings for Recommendation

Another body of related work concerns embedding approaches that effectively map users and items into the latent space and extract the deep, complex and heterogeneous relations among them. Different embedding methods suit different recommendation applications. In the case where heterogeneous feature data is available, the Heterogeneous Information Network Embedding approach (HINE) (Shi et al., 2017, 2018; Dong et al., 2017) utilizes the data structure of a heterogeneous information network (HIN) to extract complex heterogeneous relations between user and item features and thus provide better recommendations to the target users. In the case where rich interactions between users and items are available, the AutoEncoding (AE) approach (Rumelhart et al., 1985; He et al., 2017; Hinton and Salakhutdinov, 2006; Sedhain et al., 2015; Li and Tuzhilin, 2019b, 2020) utilizes deep neural network (DNN) techniques to obtain semantic-aware representations of users and items as embeddings in the latent space, which model their relationship and provide recommendations accordingly. Finally, in the case where a multimodal dataset is available, researchers propose to use the Multimodal Embedding (ME) approach (Pan et al., 2016) to combine information from different sources and obtain superb recommendation performance.

Compared with classical approaches, latent embedding methods have several important advantages that enable recommender systems to provide more satisfying recommendations (Zhang et al., 2017; Lin et al., 2005), as discussed in Section 1. Therefore, in this paper we provide the definition of unexpectedness utilizing these latent embedding methods, which contributes to the strong recommendation performance.

3. Latent Modeling of Unexpectedness

In this section, we introduce the proposed latent modeling of unexpectedness. We compare the new definition with feature-based definitions and illustrate superiority and benefits of the proposed approach.

3.1. Latent Space

As introduced in prior literature (Murakami et al., 2007; Ge et al., 2010; Adamopoulos and Tuzhilin, 2015), an important component for modeling unexpectedness is the expected set, which contains previous consumptions of the user. The idea is that users should find nothing unexpected in recommended items that they have purchased before or that are very similar to their purchases, for they understand that typical recommender systems collect their historic behaviors and provide similar recommendations based on these records.

To construct the user expectations, (Adamopoulos and Tuzhilin, 2015) propose to form the expected set by taking into account explicit feature information of users and items. For example in the book recommendation, the expected set is constructed based on the features of alternative editions, in the same series, with same subjects and classifications, with the same tags, and so on. Unexpectedness is subsequently defined by a positive, unbounded function of the distance of the recommended item from the set of expected items.

However, this definition only focuses on the straightforward relations between users and items, and falls short of addressing deeper correlations beyond the explicit feature information. For example, if a certain user has been a frequent consumer of McDonald’s and Carl’s Jr, then the recommendation of Burger King might not be unexpected to that user, although these restaurants belong to different franchises and offer different menus, as shown in their feature information.

Besides, feature-based modeling of unexpectedness typically assumes the same importance for each feature during the calculation of unexpectedness, while in reality it is not necessarily the case. A natural example is that for music recommendations, genre information plays a more important role in determining the degree of unexpectedness than profile information, such as time of release. In addition, the distance function is also hard to define in the discrete feature space.

Therefore, in this paper, we propose to construct the expected set in the latent space by taking the closure of item embeddings. Unexpectedness is subsequently defined as the distance between the new item embedding and the closure of the expected set in the latent space. Compared with feature-based definitions, latent modeling of unexpectedness has several important advantages, as discussed in Section 1. In particular, the proposed Latent Closure (LC) model is capable of utilizing the richer information in user reviews and multi-modal data to determine the degree of unexpectedness, whereas previous models typically do not take this information into account, as shown in Table 2. These benefits are also supported by strong experiment results.

Latent Modeling Feature Modeling
Algorithms LC SPR Auralist HOM-LIN DPP
Latent Embeddings
Explicit Features
User Reviews
Pre-Defined Rules
Past Transactions
User Ratings
Table 2. Comparison of Unexpected Recommendation Methods

In the next section, we will introduce the idea of latent closure and how to construct user expectations based on the proposed latent closure method.

3.2. Latent Closure

As discussed in the previous section, we propose to compute user expectations in the latent space rather than in the original feature space. In addition, we point out that the modeling of user expectations should go beyond the direct aggregation of previous consumptions, and should also take into account those items that are similar to the consumed items, where similarities between items are captured by the Euclidean distance in the latent space. Therefore, it is natural to take the “closure” of all consumed item embeddings to model the expected set, as opposed to using individual item embeddings in the latent space.

According to mathematical theories in differential geometry (Helgason, 2001), there are three common geometric structures in high-dimensional latent spaces that can be naturally extended to modeling the closure of latent embeddings, namely Hypersphere, Hypercube and Convex Hull. The particular choice of latent closure depends on the assumption we make towards the relations between users and items in the latent space.

  • Latent HyperSphere (LHS) The hypersphere in the space $\mathbb{R}^n$ is defined as the set of n-tuple points $(x_1, x_2, \dots, x_n)$ such that $x_1^2 + x_2^2 + \cdots + x_n^2 = r^2$, where $r$ is the radius of the hypersphere. Under this definition, we assume that the expected set of items for each user grows homogeneously in all directions in the latent space.

  • Latent HyperCube (LHC) The hypercube is a closed, compact, convex figure, whose 1-skeleton consists of groups of opposite parallel line segments aligned in each of the space’s dimensions, perpendicular to each other and of the same length. Under this definition, we assume that the expected set of each user grows homogeneously in the n perpendicular directions.

  • Latent Convex Hull (LCH) The convex hull of a set of points $X$ in the Euclidean space is the smallest convex set that contains all points in $X$. Under this definition, we assume that the expected set maintains its convexity in the growing process. In addition, if we construct the expected set as the convex hull of consumed item embeddings, the convexity property will guarantee the feasibility of the recommendation as an optimization problem given by Slater’s Condition (Slater, 2014).

We visualize the definition of unexpectedness based on these geometric structures in Figures 1(a), 1(b) and 1(c). These latent closure approaches capture latent semantic interactions between users and items and construct the expected set for each user accordingly. Compared to feature-based definitions (Adamopoulos and Tuzhilin, 2015), latent closures utilize richer information including user and item features to model user expectations more precisely. The process for finding closures in high-dimensional latent spaces is not significantly different from the process in the 2-dimensional space. For LHS and LHC, we only need to find the furthest two points in the latent space to identify the centroid of the latent closure. For LCH, we follow the QuickHull algorithm (Barber et al., 1996) to identify the latent structure. Experiment results show that all three geometric structures consistently obtain significant improvements over baseline models, while no structure dominates the other two.
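The hypersphere and hypercube closures can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the experiments: the centroid is the midpoint of the two most distant consumed embeddings, the hypersphere radius is half their distance, the hypercube is axis-aligned and just large enough to contain every consumed embedding, and an item inside the closure receives an unexpectedness of zero. All function names are hypothetical, and at least two consumed items are required.

```python
import math
from itertools import combinations


def furthest_pair_center(points):
    """Midpoint and separation of the two most distant embeddings."""
    a, b = max(combinations(points, 2), key=lambda pair: math.dist(*pair))
    center = [(x + y) / 2 for x, y in zip(a, b)]
    return center, math.dist(a, b)


def hypersphere_unexpectedness(item, consumed):
    """Distance from `item` to the sphere around the furthest pair (0 if inside)."""
    center, diameter = furthest_pair_center(consumed)
    return max(0.0, math.dist(item, center) - diameter / 2)


def hypercube_unexpectedness(item, consumed):
    """Distance from `item` to an axis-aligned cube around the same center."""
    center, _ = furthest_pair_center(consumed)
    # Assumed half side length: largest per-coordinate offset of any consumed item.
    half_side = max(abs(p[k] - center[k]) for p in consumed for k in range(len(center)))
    gaps = [max(0.0, abs(x - c) - half_side) for x, c in zip(item, center)]
    return math.sqrt(sum(g * g for g in gaps))
```

The exhaustive pair search is quadratic in the number of consumed items, which is acceptable for a sketch but would likely need a faster approximation for long consumption histories.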

(a) Latent Convex Hull
(b) Latent Hypersphere
(c) Latent Hypercube
Figure 1. Visualization of Latent Closure and the Unexpectedness. Blue points stand for all the available items; Orange points represent the consumed items; Green point refers to the newly recommended item. We define unexpectedness as the distance between the new item and the latent closure generated by all consumed items.
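For the convex hull closure, the 2-dimensional case in the figure can be sketched directly. The example below is an illustration under assumptions: it uses Andrew's monotone chain algorithm in place of QuickHull because it is shorter to write in two dimensions, it works only for 2-D embeddings, it expects at least one consumed item, and it assigns zero unexpectedness to items inside or on the hull. The function names are hypothetical.

```python
import math


def cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(map(tuple, points)))
    if len(pts) <= 2:
        return pts

    def half(seq):
        chain = []
        for p in seq:
            while len(chain) >= 2 and cross(chain[-2], chain[-1], p) <= 0:
                chain.pop()
            chain.append(p)
        return chain[:-1]

    return half(pts) + half(reversed(pts))


def segment_distance(p, a, b):
    """Euclidean distance from point p to the segment ab."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    length_sq = dx * dx + dy * dy
    t = 0.0 if length_sq == 0 else ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(p[0] - (a[0] + t * dx), p[1] - (a[1] + t * dy))


def convex_hull_unexpectedness(item, consumed):
    """Distance from `item` to the convex hull of `consumed` (0 if inside)."""
    hull = convex_hull(consumed)
    edges = list(zip(hull, hull[1:] + hull[:1]))
    if len(hull) >= 3 and all(cross(a, b, item) >= 0 for a, b in edges):
        return 0.0  # the item lies inside or on the boundary of the hull
    return min(segment_distance(item, a, b) for a, b in edges)
```

In higher dimensions the distance to a convex hull is usually obtained by solving a small quadratic program over convex combinations of the consumed embeddings, which this 2-D sketch does not attempt.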

To sum up, in this paper we utilize the latent closure method to model unexpectedness in the latent space. We hereby propose the following definition of unexpectedness:

Definition 3.0.

The unexpectedness of a new item is defined as the distance between the embedding of that item and the closure of all previously consumed item embeddings.

In the next section, we will discuss the specific techniques for obtaining latent embeddings and methods to provide unexpected recommendations accordingly.

4. Unexpected Recommendation Model

4.1. Latent Embeddings

To effectively model unexpectedness in the latent space and demonstrate the robustness of the proposed model, we utilize three state-of-the-art latent embedding approaches, namely HINE, AE and ME to map users and items into the latent space and calculate the unexpectedness subsequently.

4.1.1. Heterogeneous Information Network Embeddings (HINE)

To capture the complex and multi-dimensional relations in the data record, Heterogeneous Information Network (HIN) (Sun and Han, 2013) has become an effective data structure for recommendations, which models multiple types of objects and multiple types of links in one single network. It includes users, items, transactions, ratings, entities extracted from reviews and the feature information. We link the associated entities with corresponding users and items in the network and utilize meta-path embedding approach (Dong et al., 2017) to obtain node embeddings.

We denote the heterogeneous network as $G = (V, E, T)$, in which each node $v$ and each link $e$ are assigned a specific type $T_v$ and $T_e$. To effectively learn node representations, we use the skip-gram mechanism to maximize the probability of each context node $c_t$ within the neighbors of $v$, denoted as $N_t(v)$, where we add the subscript $t$ to limit the node to a specific type:

$$\arg\max_{\theta} \sum_{v \in V} \sum_{t \in T_V} \sum_{c_t \in N_t(v)} \log P(c_t \mid v; \theta) \tag{1}$$

Thus, it is important to calculate $P(c_t \mid v; \theta)$, which represents the conditional probability of context node $c_t$ given node $v$. Therefore, we follow (Grover and Leskovec, 2016) and revise the network embedding model accordingly for dealing with heterogeneous information networks. Specifically, we propose to use a heterogeneous random walk to generate paths of multiple types of nodes in the network. Given a heterogeneous information network $G = (V, E, T)$, the metapath of the network is generated in the form of $V_1 \xrightarrow{R_1} V_2 \xrightarrow{R_2} \cdots \xrightarrow{R_{n-1}} V_n$, wherein $R = R_1 \circ R_2 \circ \cdots \circ R_{n-1}$ defines the composite relations between the start and the end of the heterogeneous random walk. The transition probability within each random walk between two nodes is defined as follows:

$$P(v^{i+1} \mid v^{i}) = \begin{cases} \dfrac{C(T_{v^{i}}, T_{v^{i+1}})}{|N_{T_{v^{i+1}}}(v^{i})|}, & (v^{i}, v^{i+1}) \in E \\ 0, & \text{otherwise} \end{cases} \tag{2}$$

where $C(T_{v^{i}}, T_{v^{i+1}})$ stands for the transition coefficient between the type of node $v^{i}$ and the type of node $v^{i+1}$. We have 6 different transition coefficients that correspond to the 6 different relations among the node types in the network (U: User, I: Item, E: Entity/Feature), and $|N_{T_{v^{i+1}}}(v^{i})|$ stands for the number of nodes of type $T_{v^{i+1}}$ in the neighborhood of $v^{i}$. We apply the heterogeneous random walk iteratively to each node and generate the collection of meta-path sequences. The user and item embeddings are then obtained through the aforementioned skip-gram mechanism.
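The heterogeneous random walk can be sketched in a few lines. The example assumes that the probability of stepping to a neighbor equals the transition coefficient for the pair of node types divided by the number of neighbors of the target type, which is one reading of the description above. The adjacency-dictionary graph format, the `coeff` lookup keyed by unordered type pairs, and all function names are hypothetical.

```python
import random


def transition_probs(node, graph, node_type, coeff):
    """Probability of stepping from `node` to each of its neighbours."""
    neighbours = graph[node]
    # Count how many neighbours of each type the current node has.
    type_counts = {}
    for nb in neighbours:
        type_counts[node_type[nb]] = type_counts.get(node_type[nb], 0) + 1
    probs = {}
    for nb in neighbours:
        pair = frozenset((node_type[node], node_type[nb]))
        probs[nb] = coeff[pair] / type_counts[node_type[nb]]
    return probs


def heterogeneous_walk(start, graph, node_type, coeff, length, rng=random):
    """Generate one walk of at most `length` nodes starting at `start`."""
    walk = [start]
    while len(walk) < length:
        probs = transition_probs(walk[-1], graph, node_type, coeff)
        if not probs:
            break  # dead end: the current node has no neighbours
        nodes, weights = zip(*probs.items())
        walk.append(rng.choices(nodes, weights=weights, k=1)[0])
    return walk


# Toy network: one user linked to two items and one review entity.
graph = {"u1": ["i1", "i2", "e1"], "i1": ["u1"], "i2": ["u1"], "e1": ["u1"]}
node_type = {"u1": "U", "i1": "I", "i2": "I", "e1": "E"}
coeff = {frozenset("UI"): 0.6, frozenset("UE"): 0.4}
```

The resulting walks would then be fed to a skip-gram model to learn node embeddings, a step this sketch leaves out.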

Figure 2. Heterogeneous Information Network Embedding Method

4.1.2. AutoEncoder (AE)

Apart from modeling interactions between users and items through HIN, AutoEncoder (AE) approach also constitutes an important tool to learn the latent representations of user and item features and transform discrete feature vectors into continuous feature embeddings.

We denote the feature information for user $u$ as $x_u \in \mathbb{R}^{d_u}$ and the feature information for item $i$ as $x_i \in \mathbb{R}^{d_i}$, where $d_u$ and $d_i$ stand for the dimensionality of user and item feature vectors respectively. The goal is to train two separate neural networks: an encoder that maps feature vectors into latent embeddings, and a decoder that reconstructs feature vectors from latent embeddings. Due to the effectiveness and efficiency of the training process, we formulate both the encoder and the decoder as multi-layer perceptrons (MLP). The MLP learns the hidden representations using the following equations:

$$w_u = \mathrm{MLP}(x_u), \qquad w_i = \mathrm{MLP}(x_i) \tag{3}$$

where $w_u$ and $w_i$ represent the latent embeddings and $\mathrm{MLP}(\cdot)$ stands for the fully connected layers with activation functions. We apply another fully connected layer for reconstruction and optimization. Note that in this step we train the global autoencoder for users and items in the entire dataset simultaneously to obtain the hidden representation.
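The encoder and decoder structure can be illustrated with an untrained forward pass. This is only a sketch under assumptions: a single fully connected layer on each side, tanh activations, random initial weights and a mean squared reconstruction error. The class and function names are hypothetical, and the training loop that would minimize the reconstruction error is omitted.

```python
import math
import random


def dense(weights, bias, x, activation=math.tanh):
    """One fully connected layer: activation(W x + b)."""
    return [activation(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, bias)]


def init_layer(n_out, n_in, rng):
    """Small random weight matrix and zero bias vector."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


class AutoEncoder:
    def __init__(self, n_features, n_latent, seed=0):
        rng = random.Random(seed)
        self.enc = init_layer(n_latent, n_features, rng)
        self.dec = init_layer(n_features, n_latent, rng)

    def encode(self, x):
        """Map a feature vector to its latent embedding."""
        return dense(*self.enc, x)

    def decode(self, w):
        """Reconstruct a feature vector from a latent embedding."""
        return dense(*self.dec, w)

    def reconstruction_error(self, x):
        """Mean squared error between x and its reconstruction."""
        x_hat = self.decode(self.encode(x))
        return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)
```

After training, only `encode` would be needed to obtain the user and item embeddings that the latent closure operates on.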

Figure 3. AutoEncoder Embedding Method

4.1.3. Multimodal Embeddings (ME)

In addition to the aforementioned approaches, when dealing with datasets that include multiple modalities, such as movie and video data (which are usually associated with images and subtitles), multimodal embeddings (Pan et al., 2016; Wei et al., 2019) constitute an efficient tool to combine the information from different sources.

Specifically, in the video recommendation task, we illustrate the model for obtaining video embeddings in Figure 4. First, we initialize the embeddings for text, audio and image data through Fully Convolutional Network (FCN) with L2-Norm as regularization term. For the text data, we use the average pooling technique as a special treatment to obtain the semantic information as the average of word embeddings. Then we concatenate these embeddings and apply another layer of Fully Convolutional Network to obtain multimodal embeddings for the input video that captures joint information of subtitles, sound and graphics.
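The pooling and concatenation steps can be sketched as follows. The example assumes that each modality has already been turned into a fixed-length vector, and it stands in a single linear layer with a ReLU activation for the final fusion network; the weights, the activation and the function names are hypothetical choices for illustration.

```python
def average_pool(word_vectors):
    """Element-wise mean of a non-empty list of equal-length word embeddings."""
    n = len(word_vectors)
    return [sum(column) / n for column in zip(*word_vectors)]


def concatenate(*vectors):
    """Join modality embeddings into one flat vector."""
    return [value for vec in vectors for value in vec]


def fuse(text_vec, audio_vec, image_vec, weights, bias):
    """Project the concatenated modalities through one linear layer with ReLU."""
    joint = concatenate(text_vec, audio_vec, image_vec)
    return [max(0.0, sum(w * x for w, x in zip(row, joint)) + b)
            for row, b in zip(weights, bias)]
```

For example, `average_pool([[1, 2], [3, 4]])` yields `[2.0, 3.0]`, which would serve as the text embedding passed to `fuse`.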

Figure 4. Multimodal Embedding Method

4.2. Hybrid Utility Function

Based on the latent embedding approaches introduced in the previous section, we map the users and items into the continuous latent space and model the expected set for each user as the latent closure of item embeddings. Specifically, we feed the user and item features as input into the latent embedding models and obtain their latent representations. We then formulate unexpectedness as the distance between the embedding of a new item and the latent expected set:

$\mathrm{Unexp}_{u,i} = d\big(i;\ \mathrm{LC}(N_u)\big)$ (4)

where $N_u$ contains the embeddings of all items consumed by user $u$, $\mathrm{LC}(N_u)$ is their latent closure, and $d$ is the distance from the embedding of the new item $i$ to that closure. This unexpectedness metric is well defined as the minimal distance from the new item to the boundary of the closure in the latent space. We then perform unexpected recommendation based on the hybrid utility function:

$\mathrm{Utility}_{u,i} = r_{u,i} + \alpha \cdot \mathrm{Unexp}_{u,i}$ (5)

which linearly combines the estimated rating $r_{u,i}$ with unexpectedness, where the hyperparameter $\alpha$ controls the weight given to unexpectedness. The key idea is that, instead of recommending similar items that users are already very familiar with, as classical recommenders do, we recommend unexpected and useful items that users might not have thought about but that nevertheless match their preferences well. The two competing forces of accuracy and unexpectedness are balanced against each other to obtain the best recommendation performance and user satisfaction. We present the entire framework in Algorithm 1.

Data: Users; Items; Historic Actions; Other feature information
Result: List of Recommended Items
Map users and items into the latent space;
for each user in Users do
      for each item in Items do
            Compute the unexpectedness of the item for the user via Equation (4);
            Compute the utility of the item for the user via Equation (5);
      end for
      Recommend Top-N(Utility);
end for
Algorithm 1 Latent Unexpected Recommendation
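
The loop in Algorithm 1 can be sketched in a few lines of Python. This is a hedged illustration rather than the authors' code: it assumes the latent closure is a hypersphere centred at the centroid of the consumed item embeddings with radius equal to the largest centre-to-item distance, it assigns zero unexpectedness to items inside the sphere, and it takes the estimated ratings as given. The names `hypersphere_unexpectedness` and `recommend_top_n` and all toy numbers are hypothetical.

```python
import math


def centroid(points):
    """Coordinate-wise mean of a non-empty list of equal-length vectors."""
    dim = len(points[0])
    return [sum(p[k] for p in points) / len(points) for k in range(dim)]


def hypersphere_unexpectedness(item_vec, consumed_vecs):
    """Distance from item_vec to the boundary of the assumed latent hypersphere.

    Assumption: items that fall inside the sphere are treated as expected (0.0).
    """
    center = centroid(consumed_vecs)
    radius = max(math.dist(center, v) for v in consumed_vecs)
    return max(0.0, math.dist(center, item_vec) - radius)


def recommend_top_n(est_ratings, item_vecs, consumed_ids, alpha=0.03, n=5):
    """Rank unseen items by estimated rating plus alpha times unexpectedness."""
    consumed_vecs = [item_vecs[i] for i in consumed_ids]
    utility = {}
    for item_id, vec in item_vecs.items():
        if item_id in consumed_ids:
            continue  # only new items are candidates
        unexp = hypersphere_unexpectedness(vec, consumed_vecs)
        utility[item_id] = est_ratings[item_id] + alpha * unexp
    return sorted(utility, key=utility.get, reverse=True)[:n]


# Toy example (made-up embeddings and ratings).
item_vecs = {"a": [0.0, 0.0], "b": [1.0, 0.0], "c": [0.5, 0.1], "d": [5.0, 5.0]}
est_ratings = {"c": 4.0, "d": 3.9}
consumed = {"a", "b"}
print(recommend_top_n(est_ratings, item_vecs, consumed, alpha=0.03, n=2))
print(recommend_top_n(est_ratings, item_vecs, consumed, alpha=0.0, n=2))
```

In this toy setting item "d" lies far outside the user's closure, so a small positive weight is enough to lift it above the slightly better-rated but familiar item "c"; with a weight of zero the ranking falls back to the estimated ratings.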

5. Experiments

To validate the performance of our approach, we conduct extensive experiments on three large-scale real-world applications and compare the results of our model with the state-of-the-art baselines. The experimental setup is introduced in this section. Specifically, we design the experiments to address the following research questions:

RQ1: How does the proposed model perform compared to baseline unexpected recommendation models?

RQ2: Can we achieve significant improvements in unexpectedness measure while keeping the same level of accuracy performance?

RQ3: Are the improvements robust to different experimental settings?

5.1. Datasets

We implement our model on three real-world datasets: the Yelp Challenge Dataset Round 12 (https://www.yelp.com/dataset/challenge), which contains ratings and reviews of users and restaurants; the TripAdvisor Dataset (http://www.cs.cmu.edu/~jiweil/html/hotel-review.html), which contains check-in information of users and hotels; and the Video Dataset, which includes the traffic logs we collected from a large-scale industrial video platform. Specifically, we use four days of traffic logs for the training process and the following day for the evaluation process. We list the descriptive statistics of these datasets in Table 3. To avoid cold-start and sparsity issues, in all three datasets we filter out users and items that appear fewer than 5 times.

Dataset Yelp TripAdvisor Video
# of Records 5,996,996 878,561 1,155,987
# of Items 188,593 576,689 287,607
# of Users 1,518,169 3,945 5,241
Sparsity 0.002% 0.039% 0.077%
Table 3. Descriptive Statistics of Three Datasets
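
The "Sparsity" row can be reproduced from the other three rows if it is read as the share of user-item pairs that have a record. The short check below makes that reading explicit; the function name `density_percent` is ours and the interpretation is an assumption, although the rounded values do agree with the table.

```python
def density_percent(records, users, items):
    """Percentage of all user-item pairs that have at least one record."""
    return 100.0 * records / (users * items)


# (records, users, items) copied from Table 3.
datasets = {
    "Yelp": (5_996_996, 1_518_169, 188_593),
    "TripAdvisor": (878_561, 3_945, 576_689),
    "Video": (1_155_987, 5_241, 287_607),
}

for name, (records, users, items) in datasets.items():
    print(f"{name}: {density_percent(records, users, items):.3f}%")
```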

5.2. Parameter Settings

We perform Bayesian optimization (Snoek et al., 2012) to select optimal hyperparameters for the proposed method as well as the baseline models. The hyperparameter $\alpha$ of the hybrid utility function is set to 0.03, where we achieve the optimal balance between the accuracy and unexpectedness measures. In addition, the dimension of the latent embeddings is 128, which captures the relations between users and items efficiently, as shown in (Zhang et al., 2017). Detailed parameter settings are further introduced in the next section.

As discussed in Section 4, we select a different state-of-the-art embedding approach for each of the three datasets to model unexpectedness in the latent space. Specifically, the Yelp dataset contains explicit information about users, items and ratings, as well as substantial amounts of meta-information, including text reviews, the friendship network, user demographics and geolocation. It is therefore suitable for the Heterogeneous Information Network Embedding (HINE) approach, which addresses the heterogeneous relationships within the Yelp dataset. For the Video dataset, whose data structure is multimodal, we utilize the Multimodal Embedding (ME) approach to calculate the unexpectedness between users and videos. Finally, the TripAdvisor dataset only includes users, items and their associated feature information, which makes the AutoEncoding (AE) approach a reasonable choice for obtaining latent embeddings.

We point out that, although it could further increase the validity of our approach if we test the same embedding approach on the three datasets, it is not practical to do so. By implementing our model through three different embedding approaches, we illustrate the strength of modeling unexpectedness in the latent space. Note that illustration of this point does not rely on the specific design of embedding approaches.

5.3. Training Procedure

Our proposed latent unexpected recommendation model follows a three-step training procedure: first, we utilize the latent embedding approaches to map users and items into the latent space; then we calculate the unexpectedness and construct the hybrid utility function for each user; finally, we provide unexpected recommendations based on the hybrid utility function and update our model accordingly.

To obtain the heterogeneous information network embeddings from the Yelp dataset, we extract the users, restaurants and feature labels from the dataset to construct the nodes of the heterogeneous information network. We link the user nodes and item nodes with their associated feature nodes, and we also link a user node with an item node if the user has visited that restaurant before. We conduct heterogeneous random walks (Shi et al., 2018) of length 100 starting from each node to generate sequences of nodes, and we repeat this process 10 times. Then we apply the skip-gram mechanism following the procedure in (Grover and Leskovec, 2016), with a window size of 2, a minimum term count of 1 and 100 iterations, to map the nodes into the latent space and obtain the corresponding latent embeddings.
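
The walk-generation step can be sketched as follows. Note the simplification: this is a plain uniform random walk over a small typed graph, not the meta-path-guided heterogeneous walk of Shi et al. (2018), and the node names, edges and function names are invented for illustration. The resulting node sequences are what one would pass to a skip-gram model (for example the one in Gensim) with the window size, minimum count and iteration settings quoted above.

```python
import random


def build_graph(edges):
    """Undirected adjacency lists from (node, node) pairs."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, []).append(v)
        graph.setdefault(v, []).append(u)
    return graph


def random_walks(graph, walk_length=100, walks_per_node=10, seed=0):
    """Start walks_per_node uniform random walks of walk_length nodes at every node."""
    rng = random.Random(seed)
    walks = []
    for _ in range(walks_per_node):
        for start in graph:
            walk = [start]
            while len(walk) < walk_length:
                walk.append(rng.choice(graph[walk[-1]]))
            walks.append(walk)
    return walks


# Hypothetical mini network: users and a restaurant linked to feature nodes,
# plus user-restaurant links for past visits.
edges = [
    ("user:u1", "item:r1"),
    ("user:u2", "item:r1"),
    ("user:u1", "feature:local"),
    ("item:r1", "feature:cafe"),
]
graph = build_graph(edges)
walks = random_walks(graph)
print(len(walks), "walks of length", len(walks[0]))
```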

To obtain the autoencoder embeddings from the TripAdvisor dataset, we utilize one layer of MLP (Multi-Layer Perceptron) as the encoder to generate latent representations for each user and item, and then use one layer of MLP as decoder to reconstruct the original information. We jointly optimize encoder and decoder to generate the latent embeddings.

To obtain the multimodal embeddings from the Video dataset, we decompose the input videos into text, audio and images, and apply an FCN (Fully-Connected Network) with the L2-norm as a regularization term to each modality to obtain separate latent embeddings. Then we concatenate the text, audio and image embeddings and pass them through another FCN layer to generate the final multimodal embeddings.

For performance comparison, we select the deep-learning based Neural Collaborative Filtering (NCF) model (He et al., 2017) as well as five popular collaborative filtering algorithms, namely the k-Nearest Neighbor approach (KNN) (Altman, 1992), the Singular Value Decomposition approach (SVD) (Sarwar et al., 2002), the Co-Clustering approach (CC) (George and Merugu, 2005), the Non-Negative Matrix Factorization approach (NMF) (Lee and Seung, 2001) and the Factorization Machine approach (FM) (Rendle, 2010), to verify the robustness of the proposed model. We implement the model in the Python environment using the "Surprise", "SciPy" and "Gensim" packages. All experiments are performed on a laptop with a 2.50GHz Intel Core i7 and 8GB RAM. The training procedure is time-efficient: it takes 3 hours, 0.5 hours and 1 hour respectively for our proposed model to obtain latent embeddings in the Yelp dataset, the TripAdvisor dataset and the Video dataset. The subsequent unexpected recommendation process takes less than one hour to complete.

5.4. Evaluation Metrics: Accuracy and Unexpectedness

To compare the performance of the proposed Latent Closure (LC) method and baseline models, we measure the recommendation results along two dimensions: accuracy, in terms of the RMSE, MAE, Precision@N and Recall@N metrics (Herlocker et al., 2004), and unexpectedness, in terms of the Unexpectedness, Serendipity and Diversity metrics (Ge et al., 2010). Specifically, we calculate unexpectedness through Equation (4) following our proposed definition, while serendipity and diversity are computed following the standard measures in the literature (Ziegler et al., 2005; Ge et al., 2010).

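$\mathrm{Serendipity} = \frac{|(\mathit{RS} \setminus \mathit{PM}) \cap \mathit{USEFUL}|}{|\mathit{RS}|}$ (6)

Serendipity is computed as the percentage of serendipitous recommendations, where RS stands for the items recommended by the target model, PM stands for the items recommended by a primitive prediction algorithm (usually linear regression) and USEFUL stands for the items whose utility is above the average level. Diversity is computed as the average intra-list distance:

$\mathrm{Diversity} = \frac{1}{|\mathit{RS}|\,(|\mathit{RS}|-1)} \sum_{i \in \mathit{RS}} \sum_{j \in \mathit{RS},\, j \neq i} d(i, j)$ (7)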
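
To make the two definitions concrete, the sketch below reads serendipity as the share of recommended items that are outside PM and inside USEFUL, and diversity as the mean pairwise distance within the recommendation list. The Euclidean distance, the function names and the toy data are assumptions made for this illustration.

```python
import math
from itertools import combinations


def serendipity(recommended, primitive, useful):
    """Share of recommended items that are not in PM but are in USEFUL."""
    rs = set(recommended)
    if not rs:
        return 0.0
    hits = (rs - set(primitive)) & set(useful)
    return len(hits) / len(rs)


def diversity(recommended, item_vecs):
    """Average intra-list distance over all unordered pairs of recommended items."""
    pairs = list(combinations(recommended, 2))
    if not pairs:
        return 0.0
    total = sum(math.dist(item_vecs[i], item_vecs[j]) for i, j in pairs)
    return total / len(pairs)


# Toy data (made up): four recommended items, two of which a primitive model also returns.
print(serendipity(["a", "b", "c", "d"], primitive=["a", "b"], useful=["b", "c"]))
vecs = {"a": [0.0, 0.0], "b": [3.0, 4.0], "c": [0.0, 4.0]}
print(diversity(["a", "b", "c"], vecs))
```

Averaging over unordered pairs gives the same value as averaging over ordered pairs with distinct items, because the distance is symmetric.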

5.5. Baseline Models

We implement several state-of-the-art unexpected recommendation models as baselines and report their performance in terms of the aforementioned metrics. The baseline models include SPR, Auralist, DPP and HOM-LIN. Note that we do not include a neural network approach because there is no deep-learning based model for unexpected recommendations in the literature.

  • SPR (Lu et al., 2012). Serendipitous Personalized Ranking is a simple and effective method for serendipitous item recommendation that extends traditional personalized ranking methods by considering item popularity in AUC optimization, which makes the ranking sensitive to the popularity of negative examples.

  • Auralist (Zhang et al., 2012). Auralist is a personalized recommendation system that balances between the desired goals of accuracy, diversity, novelty and serendipity simultaneously. Specifically in the music recommendation, the authors combine Artist-based LDA recommendation with two novel components: Listener Diversity and Musical Bubbles. We adjust the algorithm accordingly to fit in our restaurant and hotel recommendation scenario.

  • DPP (Chen et al., 2018). The determinantal point process (DPP) is an elegant probabilistic model of repulsion with applications in various machine learning tasks. The authors propose a fast greedy MAP inference approach for DPP to generate relevant and diverse recommendations.

  • HOM-LIN (Adamopoulos and Tuzhilin, 2015). HOM-LIN is the state-of-the-art unexpected recommendation algorithm, where the authors propose to define unexpectedness as the distance between items and the expected set of users in the feature space and linearly combine unexpectedness with estimated ratings to provide recommendations.

5.6. Significance Testing

To illustrate the differences in recommendation performance between our proposed model and the baseline methods, we conduct significance testing over the experimental results. Specifically, the significance level is determined by rerunning the unexpected recommendation models multiple times with random initialization and conducting Student's t-test to compute the p-value. We report the significance level together with our results in the next section.
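
A minimal sketch of such a test is given below. It computes the pooled two-sample Student's t statistic by hand and compares it with the two-sided 95% critical value for 8 degrees of freedom (2.306) instead of computing a p-value; a statistics package would normally be used for the latter. The five scores per model are invented for illustration and are not results from the paper, and the function name `student_t` is ours.

```python
import math
from statistics import mean, variance


def student_t(sample_a, sample_b):
    """Pooled-variance two-sample t statistic and its degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(sample_a) + (n_b - 1) * variance(sample_b)) / df
    std_err = math.sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))
    return (mean(sample_a) - mean(sample_b)) / std_err, df


# Hypothetical unexpectedness scores from five randomly initialised runs per model.
proposed = [0.145, 0.147, 0.144, 0.146, 0.148]
baseline = [0.075, 0.076, 0.074, 0.077, 0.073]

T_CRIT_95_DF8 = 2.306  # two-sided 95% critical value of Student's t with 8 d.o.f.
t_stat, df = student_t(proposed, baseline)
print(f"t = {t_stat:.2f}, df = {df}, significant at 95%: {abs(t_stat) > T_CRIT_95_DF8}")
```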

5.7. Cold-Start Problem

Note that the cold-start problem is very important in recommender systems. Our proposed unexpected recommender system does not encounter this problem for the following reasons. First, for the user-side cold-start problem, we do not provide unexpected recommendations, as new users have very few interactions and normally do not face the problem of boredom. Instead, we suggest providing classical recommendations, which aim at producing similar recommendations to help the users identify and reinforce the content they are interested in. Second, for the item-side cold-start problem, the new item embeddings could be obtained through classical cold-start embedding methods (Wang et al., 2018), after which we could calculate the unexpectedness and provide unexpected recommendations accordingly.

6. Results

In this section, we report the experimental results on three real-world datasets to answer the research questions in Section 5.

Yelp Dataset
Model RMSE MAE Pre@5 Rec@5 Unexp Ser Div
NCF+LC 0.9169 0.7078 0.7783* 0.6291* 0.1450 0.4905* 0.4178*
FM+LC 0.9180 0.6888* 0.7704 0.6278 0.1378 0.4603 0.4164
CC+LC 0.9514 0.7007 0.7626 0.5926 0.1355 0.4793 0.3961
SVD+LC 0.9136 0.7039 0.7722 0.6212 0.1214 0.4630 0.3511
NMF+LC 0.9522 0.7026 0.7781 0.6238 0.1466* 0.4894 0.4045
KNN+LC 0.9133* 0.7715 0.7674 0.6287 0.1288 0.4380 0.3388
SPR 1.0351 0.7729 0.7692 0.6188 0.0668 0.3720 0.2532
Auralist 1.0377 0.7799 0.7678 0.6000 0.0663 0.3637 0.2047
HOM-LIN 0.9609 0.7447 0.7621 0.6150 0.0751 0.4329 0.3011
DPP 1.0288 0.7702 0.7598 0.6012 0.0670 0.4488 0.2488
Table 4. Comparison of unexpected recommendation performance in the Yelp dataset, ”*” stands for 95% statistical significance
TripAdvisor Dataset
Model RMSE MAE Pre@5 Rec@5 Unexp Ser Div
NCF+LC 0.9624* 0.7310* 0.7201* 0.9810* 0.0586 0.4635 0.0472
FM+LC 1.0230 0.7450 0.7031 0.9638 0.0581 0.4637* 0.0388
CC+LC 1.0230 0.7539 0.6887 0.9754 0.0587 0.4629 0.0491*
SVD+LC 0.9908 0.7519 0.7093 0.9569 0.0585 0.4614 0.0477
NMF+LC 1.0280 0.7594 0.6864 0.9735 0.0584 0.4629 0.0488
KNN+LC 0.9981 0.7493 0.6909 0.9743 0.0588* 0.4625 0.0488
SPR 1.0328 0.8008 0.6395 0.9325 0.0474 0.3593 0.0375
Auralist 1.0318 0.7997 0.6460 0.9390 0.0473 0.3462 0.0355
HOM-LIN 1.0298 0.7902 0.6420 0.9418 0.0572 0.3729 0.0411
DPP 1.0304 0.8158 0.6264 0.9303 0.0464 0.3245 0.0311
Table 5. Comparison of unexpected recommendation performance in the TripAdvisor dataset, ”*” stands for 95% statistical significance
Video Dataset
Model RMSE MAE Pre@5 Rec@5 Unexp Ser Div
NCF+LC 0.3810* 0.2854* 0.2560 0.3615 0.7070 0.9830 0.2538
FM+LC 0.3924 0.3044 0.2498 0.3265 0.7096* 0.9833* 0.2510
CC+LC 0.4167 0.3296 0.2569 0.3676* 0.7053 0.9815 0.2519
SVD+LC 0.3888 0.2862 0.2455 0.3253 0.7018 0.9810 0.2412
NMF+LC 0.4405 0.3330 0.2494 0.3439 0.6999 0.9792 0.2450
KNN+LC 0.4088 0.3091 0.2608* 0.3212 0.7014 0.9814 0.2558*
SPR 0.4610 0.3638 0.2298 0.2870 0.6300 0.9593 0.2137
Auralist 0.4515 0.3610 0.2304 0.2890 0.6462 0.9462 0.1980
HOM-LIN 0.4498 0.3608 0.2310 0.2912 0.6732 0.9473 0.2154
DPP 0.4770 0.3670 0.2271 0.2870 0.6593 0.9328 0.2154
Table 6. Comparison of unexpected recommendation performance in the Video dataset, ”*” stands for 95% statistical significance

6.1. Unexpected Recommendation Performance

To start with, we compare the recommendation performance of the proposed latent unexpectedness with baseline unexpected recommendation models. Specifically, the proposed LC method provides unexpected recommendations through the hybrid utility function in Equation (5), where unexpectedness is calculated using the Latent HyperSphere introduced in Section 3.2 and estimated ratings are computed through the deep-learning based method Neural Collaborative Filtering (NCF) and five other popular collaborative filtering algorithms: Factorization Machine (FM), Co-Clustering (CC), Singular Value Decomposition (SVD), Non-negative Matrix Factorization (NMF) and K-Nearest Neighbor (KNN). We denote the corresponding unexpected recommendations provided through hybrid utility functions as NCF+LC, FM+LC, CC+LC, SVD+LC, NMF+LC and KNN+LC accordingly.

As shown in Tables 4, 5 and 6, by utilizing the proposed latent modeling of unexpectedness, all six unexpected recommendation models consistently and significantly outperform the baseline methods in both accuracy and unexpectedness measures. Specifically, in the Yelp dataset we observe an average improvement of 5.21% in RMSE, 8.11% in MAE, 1.14% in Precision, 1.57% in Recall, 48.77% in Unexpectedness, 8.30% in Serendipity and 27.69% in Diversity compared to the second best baseline model. That is to say, the proposed latent modeling of unexpectedness enables us to provide more unexpected and more useful recommendations at the same time. We also show that the superiority of latent unexpectedness is robust to the specific choice of collaborative filtering algorithm, as we obtain a significant increase in performance measures for all six algorithms and do not observe any significant difference in the unexpectedness metric among these methods.

Table 7. Comparison of recommendation performance with and without unexpectedness in the Yelp dataset, with panels (a) RMSE, (b) MAE, (c) Unexpectedness and (d) Serendipity; "*" stands for 95% statistical significance; we observe significant improvements in unexpectedness measures in (c) and (d), while no significant change in accuracy measures in (a) and (b) at the same time.
Table 8. Comparison of recommendation performance with and without unexpectedness in the TripAdvisor dataset, with panels (a) RMSE, (b) MAE, (c) Unexpectedness and (d) Serendipity; "*" stands for 95% statistical significance; we observe significant improvements in unexpectedness measures in (c) and (d), while no significant change in accuracy measures in (a) and (b) at the same time.
Table 9. Comparison of recommendation performance with and without unexpectedness in the Video dataset, with panels (a) RMSE, (b) MAE, (c) Unexpectedness and (d) Serendipity; "*" stands for 95% statistical significance; we observe significant improvements in unexpectedness measures in (c) and (d), while no significant change in accuracy measures in (a) and (b) at the same time.

6.2. Improving Unexpectedness while Keeping Accuracy

As we discuss in the previous section, an important problem with incorporating unexpectedness into recommendations is the trade-off between accuracy and novelty measures (Zolaktaf et al., 2018; Zhou et al., 2010), which is crucial to the practical use of unexpected recommendations. In this section, we compare the unexpected recommendation performance obtained with hybrid utility functions against that of classical recommender systems, which provide recommendations based on estimated ratings only.

As shown in Tables 7, 8 and 9 and the corresponding plots, when including unexpectedness in the recommendation process, we consistently obtain significant improvements in terms of unexpectedness, serendipity and diversity measures, while we do not witness any loss in the accuracy measures. Therefore, we show that it is indeed the proposed latent closure approach that enables us to provide useful and unexpected recommendations simultaneously. This is crucial for the successful deployment of unexpected recommendation models in industrial applications.

In addition, we study the impact of the hyperparameter $\alpha$ in Equation (5), which controls the balance between unexpectedness and usefulness in the hybrid utility function. Typically, a higher value of $\alpha$ indicates that the recommendation model favors unexpected recommendations over useful recommendations, while a lower value of $\alpha$ tends to recommend more useful items as opposed to unexpected items.

We plot the change of accuracy and novelty measures with respect to different values of $\alpha$ in Figure 5. This figure illustrates that when we select a relatively small value of $\alpha$ (e.g., $\alpha = 0.03$), we obtain a significant increase in unexpectedness (8.40%, 10.81% and 9.10% respectively in the three datasets), while the decrease in accuracy performance is not statistically significant for any of the three datasets. It is also worth noting that if we select a large value of $\alpha$, we risk deteriorating the accuracy of the recommendations significantly.

(a) Yelp
(b) TripAdvisor
(c) Video
Figure 5. Comparison of Accuracy-Novelty Trade-off

6.3. Robustness Analysis

In this paper, we show that the proposed latent modeling of unexpectedness significantly improves recommendation performance and provides genuinely unexpected recommendations. In total, we conduct the experiments on 3 different datasets (each paired with one of 3 different latent embedding approaches), using 6 different collaborative filtering algorithms, 7 different evaluation metrics and 3 different geometric structures for modeling unexpectedness, resulting in 3 × 6 × 7 × 3 = 378 experimental settings, all of which support our claims. We observe significant improvements in unexpectedness, serendipity and diversity measures, while we do not witness any loss in accuracy measures compared to plain collaborative filtering algorithms that do not include unexpectedness during the recommendation process. In addition, when compared to baseline unexpected recommendation models, our model significantly outperforms them in both accuracy and unexpectedness measures.

To sum up, the superiority of latent modeling of unexpectedness is robust to:

  • Various Datasets. We conduct the experiments on three different datasets: the Yelp dataset, the TripAdvisor dataset and the Video dataset, and obtain consistent improvements in all three.

  • Multiple Latent Embedding Approaches. To construct the unexpectedness in the latent space, we utilize three state-of-the-art latent embedding approaches: Heterogeneous Information Network Embeddings (HINE), Autoencoder Embeddings (AE) and Multimodal Embeddings (ME), and obtain similarly superior recommendation performance over baseline models.

  • Specific Collaborative Filtering Algorithms. We select six representative collaborative filtering algorithms to estimate user ratings and form the hybrid utility function accordingly. These methods include the deep-learning based approach NCF and five other popular models: FM, CC, SVD, NMF and KNN. The latent modeling of unexpectedness enables each collaborative filtering algorithm to provide more unexpected recommendations without losing any accuracy.

  • Selective Evaluation Metrics. We evaluate the recommendation performance using the accuracy measures RMSE, MAE, Precision and Recall and the unexpectedness measures Unexpectedness, Serendipity and Diversity. The proposed model significantly outperforms baseline unexpected recommendation models in all seven metrics.

  • Different Geometric Shapes of Latent Closures. As discussed in Section 3.2, there are three common geometric structures in high-dimensional latent space that are suitable for modeling the closure of latent embeddings: Latent HyperSphere (LHS), Latent HyperCube (LHC) and Latent Convex Hull (LCH). We calculate unexpectedness using the three structures separately and provide unexpected recommendations accordingly. As shown in Tables 10, 11 and 12, the specific choice of geometric structure does not influence the recommendation performance: we get similar results and no single structure dominates the other two. Instead, it is the latent modeling of unexpectedness itself that contributes to the significant improvements in recommendation performance.
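
To give a feel for how two of these structures differ, the sketch below computes the distance from a new item to a hypersphere and to an axis-aligned hypercube built around the same consumed items. Both constructions are assumptions made for illustration (sphere centred at the centroid with the largest centre-to-item distance as radius; cube taken as the coordinate-wise bounding box), since the exact definitions are those of Section 3.2. The convex hull variant is left out because it needs a hull or quadratic-programming routine. Function names and data are hypothetical.

```python
import math


def lhs_distance(item, consumed):
    """Distance to an assumed bounding hypersphere; 0.0 if the item lies inside."""
    dim = len(item)
    center = [sum(v[k] for v in consumed) / len(consumed) for k in range(dim)]
    radius = max(math.dist(center, v) for v in consumed)
    return max(0.0, math.dist(center, item) - radius)


def lhc_distance(item, consumed):
    """Distance to the axis-aligned bounding box; 0.0 if the item lies inside."""
    gaps = []
    for k in range(len(item)):
        low = min(v[k] for v in consumed)
        high = max(v[k] for v in consumed)
        # Positive only when the coordinate falls outside [low, high].
        gaps.append(max(low - item[k], 0.0, item[k] - high))
    return math.sqrt(sum(g * g for g in gaps))


# Four consumed items on the corners of a square, one distant and one central candidate.
consumed_items = [[0.0, 0.0], [2.0, 0.0], [2.0, 2.0], [0.0, 2.0]]
for candidate in ([5.0, 1.0], [1.0, 1.0]):
    print(candidate, lhs_distance(candidate, consumed_items), lhc_distance(candidate, consumed_items))
```

The two structures give different absolute distances for the distant candidate but agree that the central one is fully expected, which is in line with the observation that the choice of geometry has little effect on the ranking of items.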

Model RMSE MAE Pre@5 Rec@5 Unexp Ser Div
NCF+LCH 0.9158 0.7076 0.7798 0.6308 0.1478 0.4889 0.4170
NCF+LHS 0.9169 0.7078 0.7783 0.6291 0.1450 0.4905 0.4178
NCF+LHC 0.9180 0.7013 0.7725 0.6270 0.1478 0.4930 0.4178
FM+LCH 0.9178 0.6820 0.7700 0.6123 0.1422 0.4593 0.4198
FM+LHS 0.9180 0.6888 0.7704 0.6278 0.1378 0.4603 0.4164
FM+LHC 0.9162 0.6798 0.7698 0.6195 0.1402 0.4608 0.4198
CC+LCH 0.9504 0.7038 0.7596 0.5864 0.1400 0.4660 0.3869
CC+LHS 0.9514 0.7007 0.7626 0.5926 0.1355 0.4793 0.3961
CC+LHC 0.9501 0.7072 0.7645 0.5774 0.1349 0.4644 0.3847
SVD+LCH 0.9134 0.7076 0.7701 0.6175 0.1240 0.4569 0.3524
SVD+LHS 0.9136 0.7039 0.7722 0.6212 0.1214 0.4630 0.3511
SVD+LHC 0.9126 0.7081 0.7720 0.6133 0.1192 0.4534 0.3602
NMF+LCH 0.9522 0.7054 0.7722 0.6233 0.1390 0.4869 0.4030
NMF+LHS 0.9522 0.7026 0.7781 0.6238 0.1466 0.4894 0.4045
NMF+LHC 0.9558 0.7013 0.7692 0.6260 0.1471 0.4852 0.4012
KNN+LCH 0.9128 0.7751 0.7659 0.6273 0.1220 0.4365 0.3259
KNN+LHS 0.9133 0.7715 0.7674 0.6287 0.1288 0.4380 0.3388
KNN+LHC 0.9117 0.7753 0.7662 0.6272 0.1327 0.4421 0.3427

Table 10. Comparison of unexpected recommendations in the Yelp dataset using different geometric structures, ”*” stands for 95% statistical significance
Model RMSE MAE Pre@5 Rec@5 Unexp Ser Div
NCF+LCH 0.9635 0.7317 0.7210 0.9795 0.0579 0.4622 0.0478
NCF+LHS 0.9624 0.7310 0.7201 0.9810 0.0586 0.4635 0.0472
NCF+LHC 0.9652 0.7305 0.7214 0.9814 0.0593 0.4647 0.0469
FM+LCH 1.0275 0.7445 0.7040 0.9656 0.0543 0.4631 0.0393
FM+LHS 1.0230 0.7450 0.7031 0.9638 0.0581 0.4637 0.0388
FM+LHC 1.0218 0.7472 0.7020 0.9632 0.0561 0.4607 0.0407
CC+LCH 1.0285 0.7541 0.6865 0.9703 0.0552 0.4619 0.0471
CC+LHS 1.0230 0.7539 0.6887 0.9754 0.0587 0.4629 0.0491
CC+LHC 1.0200 0.7539 0.6864 0.9730 0.0562 0.4667 0.0498
SVD+LCH 0.9937 0.7517 0.7085 0.9594 0.0544 0.4621 0.0499
SVD+LHS 0.9908 0.7519 0.7093 0.9569 0.0585 0.4614 0.0477
SVD+LHC 0.9884 0.7541 0.7091 0.9474 0.0562 0.4654 0.0485
NMF+LCH 1.0262 0.7533 0.6881 0.9775 0.0544 0.4627 0.0499
NMF+LHS 1.0280 0.7594 0.6864 0.9735 0.0584 0.4629 0.0488
NMF+LHC 1.0265 0.7600 0.6853 0.9711 0.0559 0.4677 0.0504
KNN+LCH 1.0001 0.7483 0.6907 0.9763 0.0543 0.4631 0.0492
KNN+LHS 0.9981 0.7493 0.6909 0.9743 0.0588 0.4625 0.0488
KNN+LHC 0.9950 0.7524 0.6927 0.9701 0.0564 0.4671 0.0500

Table 11. Comparison of unexpected recommendations in the TripAdvisor dataset using different geometric structures, ”*” stands for 95% statistical significance
Model RMSE MAE Pre@5 Rec@5 Unexp Ser Div
NCF+LCH 0.3799 0.2870 0.2572 0.3638 0.7049 0.9819 0.2538
NCF+LHS 0.3810 0.2854 0.2560 0.3615 0.7070 0.9830 0.2538
NCF+LHC 0.3817 0.2846 0.2549 0.3632 0.7101 0.9852 0.2536
FM+LCH 0.3906 0.2998 0.2510 0.3278 0.7112 0.9840 0.2518
FM+LHS 0.3924 0.3044 0.2498 0.3265 0.7096 0.9833 0.2510
FM+LHC 0.3940 0.3056 0.2506 0.3302 0.7177 0.9833 0.2518
CC+LCH 0.4157 0.3240 0.2564 0.3624 0.7101 0.9817 0.2512
CC+LHS 0.4167 0.3296 0.2569 0.3676 0.7053 0.9815 0.2519
CC+LHC 0.4151 0.3358 0.2553 0.3659 0.7065 0.9830 0.2508
SVD+LCH 0.3841 0.2925 0.2400 0.3277 0.7010 0.9844 0.2408
SVD+LHS 0.3888 0.2862 0.2455 0.3253 0.7018 0.9810 0.2412
SVD+LHC 0.3836 0.2841 0.2433 0.3271 0.7007 0.9812 0.2454
NMF+LCH 0.4423 0.3306 0.2380 0.3491 0.7008 0.9799 0.2488
NMF+LHS 0.4405 0.3330 0.2494 0.3439 0.6999 0.9792 0.2450
NMF+LHC 0.4433 0.3387 0.2420 0.3459 0.6961 0.9803 0.2438
KNN+LCH 0.4106 0.3107 0.2584 0.3175 0.7007 0.9817 0.2558
KNN+LHS 0.4088 0.3091 0.2608 0.3212 0.7014 0.9814 0.2558
KNN+LHC 0.4069 0.3099 0.2620 0.3248 0.7073 0.9830 0.2519

Table 12. Comparison of unexpected recommendations in the Video dataset using different geometric structures, ”*” stands for 95% statistical significance

6.4. Visualization of Latent Embeddings

Finally, we conduct a case study to reveal the effectiveness of modeling unexpectedness through latent embedding approaches. Specifically, we visualize the learned embedding vectors to provide insights into their semantic information in the latent space. Taking the Yelp dataset as an example, we randomly select 100 restaurants from the dataset and obtain their corresponding embeddings through the HINE method. In Figure 6, we show the visualization of those embeddings through t-SNE (Maaten and Hinton, 2008), in which similar restaurants are clustered close to each other. We can see that cafes and bakeries are clustered on the left side, burger bars and fast food restaurants are clustered on the right side, and Asian restaurants are clustered on the far right of the latent space. This shows that the latent embedding approaches we use in this paper are indeed capable of capturing latent relations among items and thus provide a precise modeling of unexpectedness.

Figure 6. t-SNE Visualization of Latent Embeddings

7. Conclusion

In this paper, we propose a novel latent modeling of unexpectedness that simultaneously provides unexpected and satisfying recommendations. Specifically, we define the unexpectedness of a new item as the distance between the embedding of that item in the latent space and the closure of all the previously consumed item embeddings. This new definition enables us to capture latent, complex and heterogeneous relationships between users and items, which significantly improves the performance and practicability of unexpected recommendations. To achieve this, we design a hybrid utility function as the linear combination of estimated ratings and unexpectedness to optimize the accuracy and unexpectedness objectives of recommendations simultaneously. Furthermore, we demonstrate that the proposed approach consistently and significantly outperforms all other baseline models in terms of unexpectedness, serendipity and diversity measures without losing any accuracy.

The contributions of this paper are threefold. First, we propose latent modeling of unexpectedness. Though it is a common idea to explore latent space for recommendations, it is not obvious how to do it for unexpected recommendations, as we have discussed in Section 3. Second, we construct the hybrid utility function that combines the proposed unexpectedness measure with the rating estimation value and provides unexpected recommendations based on the hybrid utility values. We demonstrate that this approach significantly outperforms all other unexpected recommendation baselines. Third, we conduct extensive experiments in multiple settings and show that it is indeed the latent modeling of unexpectedness that leads to the significant increase in unexpectedness measures without sacrificing any performance accuracy. Thus, the proposed approach helps users to break out of their filter bubbles.

As future work, we plan to conduct live experiments within real business environments in order to further evaluate the effectiveness of unexpected recommendations and analyze both qualitative and quantitative aspects in online retail settings through A/B tests. Specifically, we plan to launch our model on an industrial platform and measure its performance using business metrics, including click-through rate (CTR) and gross merchandise volume (GMV). Moreover, we will further explore the impact of unexpected recommendations on user satisfaction. Finally, we plan to design algorithms that automatically incorporate the concept of unexpectedness into the deep-learning recommendation framework, so that the recommendation performance and the construction of latent embeddings are optimized at the same time.

References

  • P. Adamopoulos and A. Tuzhilin (2015) On unexpectedness in recommender systems: or how to better expect the unexpected. In ACM Transactions on Intelligent Systems and Technology (TIST), pp. 54. Cited by: §1, §1, §2.1, §2.2, §2.2, Table 1, §3.1, §3.1, §3.2, 4th item.
  • P. Adamopoulos (2014) On discovering non-obvious recommendations: using unexpectedness and neighborhood selection methods in collaborative filtering systems. In Proceedings of the 7th ACM international conference on Web search and data mining, pp. 655–660. Cited by: §1, §2.1.
  • G. Adomavicius and Y. Kwon (2011a) Improving aggregate recommendation diversity using ranking-based techniques. In IEEE Transactions on Knowledge and Data Engineering, pp. 896–911. Cited by: §2.1.
  • G. Adomavicius and Y. Kwon (2011b) Maximizing aggregate recommendation diversity: a graph-theoretic approach. Cited by: §2.1.
  • T. Akiyama, K. Obara, and M. Tanizaki (2010) Proposal and evaluation of serendipitous recommendation method using general unexpectedness.. In PRSAT@ RecSys, pp. 3–10. Cited by: §1, §2.2.
  • N. S. Altman (1992) An introduction to kernel and nearest-neighbor nonparametric regression. In The American Statistician, pp. 175–185. Cited by: §5.3.
  • C. B. Barber, D. P. Dobkin, D. P. Dobkin, and H. Huhdanpaa (1996) The quickhull algorithm for convex hulls. In ACM Transactions on Mathematical Software (TOMS), pp. 469–483. Cited by: §3.2.
  • A. Barraza-Urbina (2017) The exploration-exploitation trade-off in interactive recommender systems. In Proceedings of the Eleventh ACM Conference on Recommender Systems, pp. 431–435. Cited by: §2.1.
  • L. Chen, G. Zhang, and E. Zhou (2018) Fast greedy map inference for determinantal point process to improve recommendation diversity. In Advances in Neural Information Processing Systems, pp. 5622–5633. Cited by: §2.1, 3rd item.
  • L. Chen, Y. Yang, N. Wang, K. Yang, and Q. Yuan (2019) How serendipity improves user satisfaction with recommendations? a large-scale user evaluation. In The World Wide Web Conference, pp. 240–250. Cited by: §2.1.
  • Y. Chiu, K. Lin, and J. Chen (2011) A social network-based serendipity recommender system. In 2011 International Symposium on Intelligent Signal Processing and Communications Systems (ISPACS), pp. 1–5. Cited by: Table 1.
  • Y. Dong, N. V. Chawla, and A. Swami (2017) Metapath2vec: scalable representation learning for heterogeneous networks. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 135–144. Cited by: §1, §2.3, §4.1.1.
  • M. Gartrell, U. Paquet, and N. Koenigstein (2017) Low-rank factorization of determinantal point processes. In Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, pp. 1912–1918. Cited by: §2.1.
  • M. Ge, C. Delgado-Battenfeld, and D. Jannach (2010) Beyond accuracy: evaluating recommender systems by coverage and serendipity. In Proceedings of the fourth ACM conference on Recommender systems, pp. 257–260. Cited by: §1, §2.1, §2.1, §2.2, §3.1, §5.4.
  • T. George and S. Merugu (2005) A scalable collaborative filtering framework based on co-clustering. In Proceedings of the Fifth IEEE International Conference on Data Mining, pp. 625–628. Cited by: §5.3.
  • A. Grover and J. Leskovec (2016) Node2vec: scalable feature learning for networks. In Proceedings of the 22nd ACM SIGKDD international conference on Knowledge discovery and data mining, pp. 855–864. Cited by: §4.1.1, §5.3.
  • X. He, L. Liao, H. Zhang, L. Nie, X. Hu, and T. Chua (2017) Neural collaborative filtering. In Proceedings of the 26th international conference on world wide web, pp. 173–182. Cited by: §2.3, §5.3.
  • S. Helgason (2001) Differential geometry and symmetric spaces. Vol. 341, American Mathematical Soc. Cited by: §3.2.
  • J. L. Herlocker, J. A. Konstan, L. G. Terveen, and J. T. Riedl (2004) Evaluating collaborative filtering recommender systems. ACM Transactions on Information Systems (TOIS) 22 (1), pp. 5–53. Cited by: §2.1, §2.2, §5.4.
  • Y. Hijikata, T. Shimizu, and S. Nishida (2009) Discovery-oriented collaborative filtering for improving user satisfaction. In Proceedings of the 14th international conference on Intelligent user interfaces, pp. 67–76. Cited by: §2.2, Table 1.
  • G. E. Hinton and R. R. Salakhutdinov (2006) Reducing the dimensionality of data with neural networks. Science 313 (5786), pp. 504–507. Cited by: §1, §2.3.
  • N. J. Hurley (2013) Personalised ranking with diversity. In Proceedings of the 7th ACM conference on Recommender systems, pp. 379–382. Cited by: §2.1.
  • L. Iaquinta, M. De Gemmis, P. Lops, G. Semeraro, M. Filannino, and P. Molino (2008) Introducing serendipity in a content-based recommender system. In Hybrid Intelligent Systems, 2008. HIS’08. Eighth International Conference on, pp. 168–173. Cited by: §2.2, Table 1.
  • L. Iaquinta, M. de Gemmis, P. Lops, G. Semeraro, and P. Molino (2010) Can a recommender system induce serendipitous encounters?. In E-commerce. Cited by: §2.2.
  • M. Ishikawa, P. Geczy, N. Izumi, and T. Yamaguchi (2008) Long tail recommender utilizing information diffusion theory. In 2008 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology, Vol. 1, pp. 785–788. Cited by: §2.1.
  • J. Kamahara, T. Asakawa, S. Shimojo, and H. Miyahara (2005) A community-based recommendation system to reveal unexpected interests. In 11th international multimedia modelling conference, pp. 433–438. Cited by: §2.2, Table 1.
  • M. Kaminskas and D. Bridge (2016) Diversity, serendipity, novelty, and coverage: a survey and empirical analysis of beyond-accuracy objectives in recommender systems. ACM Transactions on Interactive Intelligent Systems (TiiS) 7 (1), pp. 1–42. Cited by: §2.1.
  • K. Kapoor, V. Kumar, L. Terveen, J. A. Konstan, and P. Schrater (2015a) I like to explore sometimes: adapting to dynamic user novelty preferences. In Proceedings of the 9th ACM Conference on Recommender Systems, pp. 19–26. Cited by: §1.
  • K. Kapoor, K. Subbian, J. Srivastava, and P. Schrater (2015b) Just in time recommendations: modeling the dynamics of boredom in activity streams. In Proceedings of the Eighth ACM International Conference on Web Search and Data Mining, pp. 233–242. Cited by: §1.
  • N. Kawamae, H. Sakano, and T. Yamada (2009) Personalized recommendation based on the personal innovator degree. In Proceedings of the third ACM conference on Recommender systems, pp. 329–332. Cited by: §2.2, Table 1.
  • N. Kawamae (2010) Serendipitous recommendations via innovators. In Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval, pp. 218–225. Cited by: §2.2, Table 1.
  • J. P. Kelly and D. Bridge (2006) Enhancing the diversity of conversational collaborative recommendations: a comparison. Artificial Intelligence Review 25 (1-2), pp. 79–95. Cited by: §2.1.
  • D. D. Lee and H. S. Seung (2001) Algorithms for non-negative matrix factorization. In Advances in neural information processing systems, pp. 556–562. Cited by: §5.3.
  • K. Lee and K. Lee (2015) Escaping your comfort zone: a graph-based recommender system for finding novel recommendations among relevant items. Expert Systems with Applications 42 (10), pp. 4851–4858. Cited by: §2.2, Table 1.
  • P. Li and A. Tuzhilin (2019a) Latent modeling of unexpectedness for recommendations. Proceedings of ACM RecSys 2019 Late-breaking Results, pp. 7–10. Cited by: §2.2, Table 1.
  • P. Li and A. Tuzhilin (2019b) Latent multi-criteria ratings for recommendations. In Proceedings of the 13th ACM Conference on Recommender Systems, pp. 428–431. Cited by: §2.3.
  • P. Li and A. Tuzhilin (2020) DDTCDR: deep dual transfer cross domain recommendation. In Proceedings of the 13th International Conference on Web Search and Data Mining, pp. 331–339. Cited by: §2.3.
  • Y. Lin, T. Liu, and H. Chen (2005) Semantic manifold learning for image retrieval. In Proceedings of the 13th annual ACM international conference on Multimedia, pp. 249–258. Cited by: §2.3.
  • Q. Lu, T. Chen, W. Zhang, D. Yang, and Y. Yu (2012) Serendipitous personalized ranking for top-n recommendation. In Web Intelligence and Intelligent Agent Technology (WI-IAT), 2012 IEEE/WIC/ACM International Conferences on, Vol. 1, pp. 258–265. Cited by: §2.1, §2.2, Table 1, 1st item.
  • L. v. d. Maaten and G. Hinton (2008) Visualizing data using t-SNE. Journal of Machine Learning Research 9 (Nov), pp. 2579–2605. Cited by: §6.4.
  • S. M. McNee, J. Riedl, and J. A. Konstan (2006) Being accurate is not enough: how accuracy metrics have hurt recommender systems. In CHI’06 extended abstracts on Human factors in computing systems, pp. 1097–1101. Cited by: §2.1, §2.1.
  • T. Murakami, K. Mori, and R. Orihara (2007) Metrics for evaluating the serendipity of recommendation lists. In Annual conference of the Japanese society for artificial intelligence, pp. 40–46. Cited by: §1, §2.2, §3.1.
  • T. T. Nguyen, P. Hui, F. M. Harper, L. Terveen, and J. A. Konstan (2014) Exploring the filter bubble: the effect of using recommender systems on content diversity. In Proceedings of the 23rd international conference on World wide web, pp. 677–686. Cited by: §1.
  • J. Oh, S. Park, H. Yu, M. Song, and S. Park (2011) Novel recommendation based on personal popularity tendency. In 2011 IEEE 11th International Conference on Data Mining, pp. 507–516. Cited by: §2.1.
  • Y. Pan, T. Mei, T. Yao, H. Li, and Y. Rui (2016) Jointly modeling embedding and translation to bridge video and language. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4594–4602. Cited by: §1, §2.3, §4.1.3.
  • E. Pariser (2011) The filter bubble: how the new personalized web is changing what we read and how we think. Penguin. Cited by: §1.
  • Y. Park and A. Tuzhilin (2008) The long tail of recommender systems and how to leverage it. In Proceedings of the 2008 ACM conference on Recommender systems, pp. 11–18. Cited by: §2.1.
  • S. Rendle (2010) Factorization machines. In Data Mining (ICDM), 2010 IEEE 10th International Conference on, pp. 995–1000. Cited by: §5.3.
  • D. E. Rumelhart, G. E. Hinton, and R. J. Williams (1985) Learning internal representations by error propagation. Technical report California Univ San Diego La Jolla Inst for Cognitive Science. Cited by: §2.3.
  • A. Said, B. Kille, B. J. Jain, and S. Albayrak (2012) Increasing diversity through furthest neighbor-based recommendation. Proceedings of the WSDM 12. Cited by: §2.2, Table 1.
  • B. Sarwar, G. Karypis, J. Konstan, and J. Riedl (2002) Incremental singular value decomposition algorithms for highly scalable recommender systems. In Fifth International Conference on Computer and Information Science. Cited by: §5.3.
  • S. Sedhain, A. K. Menon, S. Sanner, and L. Xie (2015) Autorec: autoencoders meet collaborative filtering. In Proceedings of the 24th International Conference on World Wide Web, pp. 111–112. Cited by: §1, §2.3.
  • G. Shani and A. Gunawardana (2011) Evaluating recommendation systems. In Recommender systems handbook, pp. 257–297. Cited by: §1, §2.1.
  • C. Shi, B. Hu, X. Zhao, and P. Yu (2018) Heterogeneous information network embedding for recommendation. IEEE Transactions on Knowledge and Data Engineering. Cited by: §1, §2.2, §2.3, §5.3.
  • C. Shi, Y. Li, J. Zhang, Y. Sun, and P. S. Yu (2017) A survey of heterogeneous information network analysis. IEEE Transactions on Knowledge and Data Engineering 29 (1), pp. 17–37. Cited by: §2.3.
  • L. Shi (2013) Trading-off among accuracy, similarity, diversity, and long-tail: a graph-based recommendation approach. In Proceedings of the 7th ACM conference on Recommender systems, pp. 57–64. Cited by: §2.1, §2.2, Table 1.
  • Y. Shi, X. Zhao, J. Wang, M. Larson, and A. Hanjalic (2012) Adaptive diversification of recommendation results via latent factor portfolio. In Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval, pp. 175–184. Cited by: §2.1.
  • M. Slater (2014) Lagrange multipliers revisited. In Traces and Emergence of Nonlinear Programming, pp. 293–306. Cited by: 3rd item.
  • B. Smyth and P. McClave (2001) Similarity vs. diversity. In International conference on case-based reasoning, pp. 347–361. Cited by: §2.1.
  • J. Snoek, H. Larochelle, and R. P. Adams (2012) Practical Bayesian optimization of machine learning algorithms. In Advances in Neural Information Processing Systems 25, F. Pereira, C. J. C. Burges, L. Bottou, and K. Q. Weinberger (Eds.), pp. 2951–2959. Cited by: §5.2.
  • R. Su, L. Yin, K. Chen, and Y. Yu (2013) Set-oriented personalized ranking for diversified top-n recommendation. In Proceedings of the 7th ACM conference on Recommender systems, pp. 415–418. Cited by: §2.1.
  • Y. Sun and J. Han (2013) Mining heterogeneous information networks: a structural analysis approach. ACM SIGKDD Explorations Newsletter 14 (2), pp. 20–28. Cited by: §1, §4.1.1.
  • M. Taramigkou, E. Bothos, K. Christidis, D. Apostolou, and G. Mentzas (2013) Escape the bubble: guided exploration of music preferences for serendipity and novelty. In Proceedings of the 7th ACM conference on Recommender systems, pp. 335–338. Cited by: §2.2, Table 1.
  • S. Vargas, L. Baltrunas, A. Karatzoglou, and P. Castells (2014) Coverage, redundancy and size-awareness in genre diversity for recommender systems. In Proceedings of the 8th ACM Conference on Recommender systems, pp. 209–216. Cited by: §2.1.
  • S. Vargas and P. Castells (2011) Rank and relevance in novelty and diversity metrics for recommender systems. In Proceedings of the fifth ACM conference on Recommender systems, pp. 109–116. Cited by: §2.1.
  • J. Wang, P. Huang, H. Zhao, Z. Zhang, B. Zhao, and D. L. Lee (2018) Billion-scale commodity embedding for e-commerce recommendation in alibaba. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 839–848. Cited by: §5.7.
  • J. Wasilewski and N. Hurley (2019) Bayesian personalized ranking for novelty enhancement. In Proceedings of the 27th ACM Conference on User Modeling, Adaptation and Personalization, pp. 144–148. Cited by: §2.1, §2.2, Table 1.
  • Y. Wei, X. Wang, L. Nie, X. He, R. Hong, and T. Chua (2019) MMGCN: multi-modal graph convolution network for personalized recommendation of micro-video. In Proceedings of the 27th ACM International Conference on Multimedia, pp. 1437–1445. Cited by: §4.1.3.
  • L. Weng, Y. Xu, Y. Li, and R. Nayak (2007) Improving recommendation novelty based on topic taxonomy. In 2007 IEEE/WIC/ACM International Conferences on Web Intelligence and Intelligent Agent Technology-Workshops, pp. 115–118. Cited by: §2.2, Table 1.
  • L. Wu, Q. Liu, E. Chen, N. J. Yuan, G. Guo, and X. Xie (2016) Relevance meets coverage: a unified framework to generate diversified recommendations. ACM Transactions on Intelligent Systems and Technology (TIST) 7 (3), pp. 1–30. Cited by: §2.1.
  • X. Yu, X. Ren, Y. Sun, Q. Gu, B. Sturt, U. Khandelwal, B. Norick, and J. Han (2014) Personalized entity recommendation: a heterogeneous information network approach. In Proceedings of the 7th ACM international conference on Web search and data mining, pp. 283–292. Cited by: §2.2.
  • X. Yu, X. Ren, Y. Sun, B. Sturt, U. Khandelwal, Q. Gu, B. Norick, and J. Han (2013) Recommendation in heterogeneous information networks with implicit user feedback. In Proceedings of the 7th ACM conference on Recommender systems, pp. 347–350. Cited by: §2.2.
  • S. Zhang, L. Yao, and A. Sun (2017) Deep learning based recommender system: a survey and new perspectives. arXiv preprint arXiv:1707.07435. Cited by: §2.3, §5.2.
  • Y. C. Zhang, D. Ó. Séaghdha, D. Quercia, and T. Jambor (2012) Auralist: introducing serendipity into music recommendation. In Proceedings of the fifth ACM international conference on Web search and data mining, pp. 13–22. Cited by: §2.1, §2.2, Table 1, 2nd item.
  • Q. Zheng, C. Chan, and H. H. Ip (2015) An unexpectedness-augmented utility model for making serendipitous recommendation. In Industrial Conference on Data Mining, pp. 216–230. Cited by: §2.2.
  • T. Zhou, Z. Kuscsik, J. Liu, M. Medo, J. R. Wakeling, and Y. Zhang (2010) Solving the apparent diversity-accuracy dilemma of recommender systems. Proceedings of the National Academy of Sciences 107 (10), pp. 4511–4515. Cited by: §1, §6.2.
  • C. Ziegler, S. M. McNee, J. A. Konstan, and G. Lausen (2005) Improving recommendation lists through topic diversification. In Proceedings of the 14th international conference on World Wide Web, pp. 22–32. Cited by: §2.1, §5.4.
  • Z. Zolaktaf, R. Babanezhad, and R. Pottinger (2018) A generic top-n recommendation framework for trading-off accuracy, novelty, and coverage. In 2018 IEEE 34th International Conference on Data Engineering (ICDE), pp. 149–160. Cited by: §1, §6.2.