A Heterogeneous Information Network based Cross Domain Insurance Recommendation System for Cold Start Users

07/30/2020 ∙ by Ye Bi, et al. ∙ Ping An Bank NetEase, Inc

The Internet is changing the world, and adapting to the trend of internet sales will bring revenue to traditional insurance companies. Online insurance is still in its early stages of development, where the cold start problem (prospective customers) is one of the greatest challenges. In the traditional e-commerce field, several cross-domain recommendation (CDR) methods have been studied to infer the preferences of cold start users based on their preferences in other domains. However, these CDR methods cannot be applied to the insurance domain directly because of its domain-specific properties. In this paper, we propose a novel framework called the Heterogeneous information network based Cross Domain Insurance Recommendation (HCDIR) system for cold start users. Specifically, we first try to learn more effective user and item latent features in both the source and target domains. In the source domain, we employ a gated recurrent unit (GRU) to model users’ dynamic interests. In the target domain, given the complexity of insurance products and the data sparsity problem, we construct an insurance heterogeneous information network (IHIN) based on data from PingAn Jinguanjia. The IHIN connects users, agents, insurance products and insurance product properties, giving us richer information. We then employ three-level (relational, node, and semantic) attention aggregations to obtain user and insurance product representations. After obtaining the latent features of overlapping users, a feature mapping between the two domains is learned by a multi-layer perceptron (MLP). We apply HCDIR to the Jinguanjia dataset and show that it significantly outperforms the state-of-the-art solutions.

1. Introduction

Figure 1. Online shopping on Jinguanjia. (a) Home page. (b) Nonfinancial domain, providing daily necessities. (c) Insurance domain, providing various insurance products.

The Internet is changing the world: every segment of the economy is experiencing dramatic change and has to respond to shifts in the value chain, enhanced consumer power, and altered competitive cycles. Internet insurance fits the economic boom of the internet age on two sides. On the supply side, internet insurance overcomes the limitations of in-person sales and geography, enlarging the customer base. On the demand side, internet sales are more acceptable to young people, who are the main consumers of insurance products. Adapting to the trend of internet sales will bring revenue to traditional insurance companies.

Internet insurance is still in its early stages of development, where the cold start problem (prospective customers) is one of the greatest challenges. For example, PingAn Jinguanjia, one of the most popular comprehensive applications (Apps) in China, boasts more than 100 million registered users, yet most of them are cold start users in the insurance domain (i.e., they have not bought any insurance product). There are several reasons for this situation. First, insurance policies are so complex that ordinary users lack the knowledge to understand them. Second, insurance products are typically bought to be used over a long period (e.g., one year for car insurance). Attracting prospective customers therefore plays a critical role in building a competitive edge for a traditional insurance company. Under these circumstances, our motivation for creating an online insurance recommendation system is to provide personalized recommendations for prospective customers and then build customer loyalty. To our knowledge, there are few recommendation models for the insurance domain; examples include (Rokach et al., 2013; Qazi et al., 2017; Liu et al., 2019). However, these methods treat the insurance domain and traditional e-commerce alike, neglecting the product complexity and data sparsity problems of the insurance domain.

In this paper, we focus on PingAn Jinguanjia. In addition to traditional e-commerce products (defined as nonfinancial products in this paper), e.g., electronics and household supplies, it also provides financial products such as insurance products and investment services. Besides, each registered customer is assigned an agent, who can help with enquiries and offer recommendations. As mentioned above, even though Jinguanjia has a large user base, it does not have a correspondingly large share of online insurance sales. In other words, most of the registered users have not bought any insurance product, though they have relatively abundant activity in the nonfinancial domain. As a result, we cannot get enough information from the insurance domain alone. Traditional recommendation systems, such as collaborative filtering (CF) (Abdollahi and Nasraoui, 2016) and sequence-based models (Chung et al., 2014), cannot perform effectively in the insurance domain, since most users have fewer than 2 interactions in a year. To obtain enough information and give more accurate recommendations, PingAn has tried to use side information from the Jinguanjia App (interaction behaviors from the nonfinancial domain), but to little avail.

Cross-domain recommendation (CDR) (Man et al., 2017; Ma et al., 2019; Kang et al., 2019; Fu et al., 2019), which aims to improve recommendation performance by transferring information from the source domain to the target domain, is one of the promising ways to solve the data sparsity and cold start problems. These methods assume that users and/or items overlap across the domains, and train a mapping function from the source domain into the target domain. The key factor for a CDR method is therefore to learn comprehensive and accurate user representations in both domains. However, the complexity of insurance products and the severe data sparsity hinder us from learning accurate user representations in the insurance domain. As a result, we cannot apply CDR methods to the insurance and nonfinancial domains directly.

To help users understand complex insurance policies and to obtain user representations that are as comprehensive and accurate as possible, we construct an insurance heterogeneous information network (IHIN) from the Jinguanjia App data. In the IHIN, we define four types of nodes corresponding to users, agents, insurance products and insurance product properties, and six types of edges denoting the relations between them. Graph convolutional networks (GCN) (Hamilton et al., 2017) and graph attention networks (GAT) (Velickovic et al., 2018), as powerful deep representation learning methods for graph data, have shown superior performance on recommendation. However, these methods apply an identical aggregation function to all types of edges, and the number of neighbors grows exponentially as layers are stacked, which prevents them from performing efficiently on HINs. To deal with heterogeneous information, many state-of-the-art models have emerged and proved effective (Wang et al., 2019b; Xu et al., 2019; Schlichtkrull et al., 2018). R-GCNs (Schlichtkrull et al., 2018) were developed to deal with highly multi-relational data. HAN (Wang et al., 2019b) designs two-level (node-level and semantic-level) attention to generate node embeddings by aggregating features from meta-path based neighbors.

Inspired by these models, we propose a novel framework called the Heterogeneous information network based Cross Domain Insurance Recommendation (HCDIR) system for cold start users. Specifically, we first try to learn more effective user and item latent features in both the source and target domains. In the source domain, user interactions are rich and we can easily obtain each user’s consumption sequence, so we employ a gated recurrent unit (GRU) (Chung et al., 2014) to model users’ dynamic interests. In the target domain, given the complexity of insurance products and the data sparsity problem, we construct an IHIN based on data from the Jinguanjia App. The IHIN connects users, agents, insurance products and insurance product properties, giving us richer underlying information. We then employ three-level (relational, node, and semantic) attention aggregations to obtain user and insurance product representations. After obtaining the latent features of the overlapping users, a feature mapping between the two domains is learned by a multi-layer perceptron (MLP).

In summary, our contributions in this paper are as follows:

  • To the best of our knowledge, this is the first work to combine a cross-domain mechanism with a heterogeneous information network to give personalized recommendations to cold start users in the insurance domain.

  • To handle the complexity of insurance products, we construct a heterogeneous information network that contains four types of nodes and six types of relations. We then employ three-level aggregations over the IHIN to learn more effective user and item representations in the insurance domain.

  • We conduct experiments on real-world recommendation scenarios, and the results demonstrate the efficacy of HCDIR over several state-of-the-art baselines.

2. Data and Preliminary

2.1. Dataset

Our dataset is collected from PingAn Jinguanjia, one of the largest e-commerce platforms. As shown in Figure 1, Jinguanjia provides not only nonfinancial products (traditional e-commerce products), but also financial products such as insurance products and investment services. Besides, each registered customer is assigned an agent, who can help with enquiries and offer recommendations. In this paper, we aim to provide recommendations to prospective customers in the insurance domain with a CDR method, so the users we use are overlapping users, who have interactions in both the insurance domain and the nonfinancial domain. Our dataset was collected from June 1st, 2018 to May 31st, 2019; its statistics are shown in Table 1.

| IS-domain (Target domain) | Count | NF-domain (Source domain) | Count |
|---|---|---|---|
| User Nodes | 117,613 | Users | 117,613 |
| Item Nodes | 42 | Items | 19,266 |
| Agent Nodes | 90,377 | User-Item Interactions | 1,995,168 |
| Insurance Property Nodes | 35 | | |
| User-Item Relations | 344,206 | | |
| User-Agent Relations | 97,343 | | |
| Item-Property Relations | 275 | | |

Table 1. Statistics of our dataset.

Nonfinancial Domain. The nonfinancial domain contains purchase logs of nonfinancial products (daily necessities), including clothes, skincare products, fruits, etc. Each item is associated with a description covering its category, function, and so on. We also have the order in which each user interacted with items.

Insurance Domain. The insurance domain contains short-term insurance (coverage time of less than one year), including illness insurance, accident insurance, medical insurance, education insurance and other kinds of insurance. To better learn user representations, we construct an insurance heterogeneous information network, which contains four types of nodes (user (U), agent (A), insurance product (I), insurance property (P)) and six types of relations among them (UI: purchase and be purchased by; UA: be served by and serve; IP: possess and be possessed by). Insurance policies are very complex: even for the same product, two customers in different age groups may pay different prices. To better describe insurance products, we choose the 35 insurance properties (e.g., price, age limit, coverage time) that customers care about most as nodes in the IHIN.

2.2. Observations in Real Data

Is it necessary to design a recommendation system specifically for cold start users in online insurance domain? To answer this, we start by investigating the following questions.

Q1. Is online insurance the trend? First, we might wonder whether users really buy insurance products online, or whether they are still used to buying insurance products in the traditional way. To answer this question, we calculate the growth in the number of insured orders on the PingAn Jinguanjia App from 2015 to 2019, relative to 2015, as shown in Figure 2. From the results, we can observe that online insurance experienced explosive growth from 2016 to 2018: in 2018, the number of insured orders on Jinguanjia had grown by more than two times compared with 2015. However, growth hit a bottleneck in 2019, so it is urgent for insurance companies to adjust their operation patterns to the internet trend.

Figure 2. Growth in the number of insured orders on the Jinguanjia App w.r.t. 2015.

Q2. Do users’ behaviors in the nonfinancial domain influence their behaviors in the insurance domain? To investigate the implicit relationship between users’ behaviors in the insurance domain and the nonfinancial domain, we define a metric called the group-buy-ratio. For a given group, the group-buy-ratio is the number of people in the group who buy insurance products on Jinguanjia for the first time divided by the total number of people in the group.

We first select two groups of customers in Jinguanjia. The regular customer group (RCG) includes customers who have previously bought only nonfinancial products on Jinguanjia, and the new customer group (NCG) includes newly registered customers, i.e., customers who have not bought anything. We then calculate the group-buy-ratio of the two groups over six months and summarize the results in Figure 3.
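As an illustration, the group-buy-ratio defined above can be computed as in the following minimal sketch. The function name and the toy user ids are assumptions made for this example; they are not taken from the Jinguanjia data.

```python
def group_buy_ratio(group_user_ids, first_time_buyer_ids):
    """Share of a group's users who bought insurance for the first time in the period."""
    group = set(group_user_ids)
    if not group:
        return 0.0
    first_time_buyers = group & set(first_time_buyer_ids)
    return len(first_time_buyers) / len(group)


# Hypothetical toy data: user ids are illustrative only.
rcg = ["u1", "u2", "u3", "u4"]          # regular customer group
ncg = ["u5", "u6", "u7", "u8", "u9"]    # new customer group
first_time_buyers_this_month = {"u1", "u3", "u6"}

rcg_ratio = group_buy_ratio(rcg, first_time_buyers_this_month)
ncg_ratio = group_buy_ratio(ncg, first_time_buyers_this_month)
```

Computing the ratio once per month for each group would give the kind of monthly series that Figure 3 summarizes.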

Figure 3. Group-buy-ratio of RCG and NCG.

From Figure 3, we can see that the group-buy-ratio of RCG is higher than that of NCG. There may be two reasons. First, regular customers in the nonfinancial domain might be more willing to trust Jinguanjia, since they have shopping experience in the App. Second, as shown in Figure 1(b), most goods Jinguanjia provides are health products, and customers who buy those products may care more about their own well-being. So, users’ behaviors in the nonfinancial domain may help us make recommendations in the insurance domain. Note that the group-buy-ratio in each month is the rate of new purchases in the group. Since insurance products are typically bought to be used over a long period (e.g., one year for car insurance), customers may not buy them again within a short time, so the group-buy-ratio of RCG decreases over our statistical period. Even so, the group-buy-ratio of RCG remains higher than that of NCG.

Q3. Are insurance policies really complex? In the traditional e-commerce domain, for example with clothes, customers only need to see a picture to decide whether they need the item. In the insurance domain, understanding an item may require considerable cognitive effort. For example, there are 11 main terms and 30 subsidiary terms in the “PingAn critical illness insurance clause”. The main terms include the responsibilities of the insurance company, exemptions from insured liability, the rights and obligations of both the policy holder and the insurer, etc. The subsidiary terms include explanations of medical terms and exceptions to the insurance. In short, insurance policies are complex, and the complexity comes from the large amount of content and the complicated terminology.

Q4. Do agents affect users’ buying behaviors in the insurance domain? To investigate whether agents affect users’ buying decisions, we define the ask-buy-ratio as the number of customers who buy online insurance after consulting an agent divided by the total number of customers who have consulted an agent. We first rank the agents by their communication frequency with customers, and list the ask-buy-ratio of three top-ranked groups of agents in Table 2. From Table 2, we can see that different groups of agents have different ask-buy-ratios: the ask-buy-ratio of the highest-ranked group is more than four times that of the lowest-ranked group listed. This indicates that a customer who is assigned an agent in the top communication frequency group may be more likely to buy online insurance products.

| Communication frequency order | T+1 | T+2 | T+3 |
|---|---|---|---|
| top | 4.9784 | 4.9750 | 4.7591 |
| top | 2.2700 | 2.3214 | 2.2698 |
| top | 1.0184 | 1.0635 | 1.0145 |

Table 2. Ask-buy-ratio of different agents.

To sum up, we have the following findings.

  • Online insurance is becoming more and more popular, though its growth has hit a bottleneck. A dedicated recommendation system for the online insurance domain is in demand.

  • Users’ behaviors in the nonfinancial domain influence their behaviors in the insurance domain. Users who have shopping experience in the nonfinancial domain are more willing to trust Jinguanjia and more likely to buy insurance products.

  • Insurance policies are too complex to understand easily, and traditional random initialization cannot give accurate item representations.

  • Different agents have different influence on users. A user assigned to a top agent is more likely to buy online insurance.

Given the above findings, we argue that designing a recommendation system specifically for online insurance is essential. It is also worth noting that, to give more accurate recommendations, we should represent the products accurately and take the influence of the nonfinancial domain and of agents into consideration.

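| Notations | Descriptions |
|---|---|
| $\mathcal{D}^s$, $\mathcal{D}^t$ | source domain and target domain |
| $\mathcal{U}$ | overlapping users in the two domains |
| $R^s$, $R^t$ | rating matrices of source and target domain |
| $S_u^s$, $S_u^t$ | interacted item sequences of user $u$ in source and target domain |
| $\mathcal{N}_v$ | $v$’s one-hop neighbors |
| $\mathcal{N}_v^{\Phi}$ | the set of nodes connecting to $v$ by meta-path $\Phi$ |

Table 3. Notations and descriptions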

2.3. Preliminary

A heterogeneous information network (HIN) is a special kind of information network that contains either multiple types of objects or multiple types of relations. It can be defined as follows:

Definition 2.1 (Heterogeneous Information Network (HIN) (Sun et al., 2011)).

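A HIN is defined as a directed graph $G = (\mathcal{V}, \mathcal{E})$ with a node type mapping function $\phi: \mathcal{V} \rightarrow \mathcal{A}$ and a relation type mapping function $\psi: \mathcal{E} \rightarrow \mathcal{R}$. $\mathcal{A}$ and $\mathcal{R}$ denote the sets of predefined node and relation types, where $|\mathcal{A}| + |\mathcal{R}| > 2$.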

In HINs, two objects can be connected via different semantic paths, which are called meta-paths.

Definition 2.2 (Meta-path (Sun et al., 2011)).

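A meta-path $\Phi$ is defined as a path in the form of $A_1 \xrightarrow{R_1} A_2 \xrightarrow{R_2} \cdots \xrightarrow{R_l} A_{l+1}$ (abbreviated as $A_1 A_2 \cdots A_{l+1}$), which describes a composite relation $R = R_1 \circ R_2 \circ \cdots \circ R_l$ between objects $A_1$ and $A_{l+1}$, where $\circ$ denotes the composition operator on relations.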

3. Problem Formulation

In this section, we formally define our problem, and summarize the notations and descriptions in Table 3. As mentioned above, we have two domains, a source domain (nonfinancial domain) and a target domain (insurance domain). Let $\mathcal{U}$ denote the overlapping users between the nonfinancial domain $\mathcal{D}^s$ and the insurance domain $\mathcal{D}^t$. If a user only appears in one domain, he/she is a cold start user in the other domain. The user-item interaction matrices are denoted as $R^s$ and $R^t$, which are defined according to users’ implicit feedback. We additionally use $S_u^s$ and $S_u^t$ for the sequences of items that user $u$ has interacted with. Besides, the interactions in the insurance domain can be abstracted as a heterogeneous information network (HIN), which we will illustrate later. Given the rating matrices and the HIN, our goal is to learn more effective latent features for users and items, and then learn the mapping function from the nonfinancial domain to the insurance domain, which can help us deal with cold start users.

4. HCDIR

To provide recommendations to cold start users, we propose HCDIR. As shown in Figure 4, HCDIR contains three main parts: learning latent features of users in the insurance domain, learning latent features of users in the nonfinancial domain, and mapping user latent features between the two domains.

Figure 4. The Framework of HCDIR

4.1. Latent Feature in Insurance Domain

Figure 5. The details of TAHIN (taking one node as an example). (a) illustrates the HIN constructed on the Jinguanjia dataset. (b) is relational neighbor aggregation: we first project the neighbors into the same node type space, and then aggregate them by calculating the weighted sum of the one-hop neighbors. (c) is the node and semantic attention aggregation: the left part aggregates the meta-path based neighbors, and the right part aggregates the results from the left part. (d) is the node update, which aggregates the information from (b) and (c) into the original node representation.

As mentioned above, the complexity of insurance products is non-trivial, and understanding the items may require considerable cognitive effort (Rokach et al., 2013). Under this circumstance, generating effective user embeddings is challenging. To achieve this goal, we design a three-level attention aggregation HIN method (TAHIN). In this part, we first introduce the IHIN we constructed from the Jinguanjia dataset, and then present how to learn effective user representations over it. Figure 5 shows the details of the TAHIN module. Specifically, we first use relational attention to aggregate one-hop heterogeneous neighbors, then node attention to aggregate meta-path based neighbors, and semantic attention to aggregate the meta-path based neighbor sets. Finally, we aggregate the results of relational attention aggregation and semantic attention aggregation into the original node embedding to update the node representations.

4.1.1. Insurance Heterogeneous Information Network Construction

Interactions in the insurance domain can be abstracted as an insurance heterogeneous information network (IHIN). Specifically, we define four types of nodes corresponding to user (U), agent (A), insurance product (I) and insurance product property (P), and six types of edges denoting the relations between them. As mentioned above, insurance products are very complex, and customers usually cannot understand a whole insurance policy just by reading the product title. Insurance products have several properties, which meet the different demands of different customers. Therefore, we treat insurance property as a type of node. In this paper, we choose the properties customers care about most, such as price, level of assurance, character, coverage time, insurance type, age restriction and extra characters. The schema of the IHIN is displayed in Figure 5(a) and is formally defined as:

Definition 4.1 (Insurance Heterogeneous Information Network).

The Insurance Heterogeneous Information Network (IHIN) in our work is a HIN containing four types of nodes: users $U$, agents $A$, insurance products $I$ and insurance product properties $P$. Edges exist between $U$ and $A$, denoting the be served by and serve relations; between $U$ and $I$, denoting the purchase and be purchased by relations; and between $I$ and $P$, denoting the possess and be possessed by relations.

PingAn possesses user portrait, agent portrait and item portrait data. For efficiency, we initialize the IHIN node embeddings with these data instead of initializing them randomly.

In the IHIN, two nodes can be connected via different meta-paths. As shown in Figure 5, two insurance products can be connected via multiple meta-paths, e.g., insurance product-user-insurance product (I-U-I) and insurance product-insurance property-insurance product (I-P-I). Different meta-paths may reveal different semantics. For example, I-U-I means that the two insurance products are needed by the same user, so they may be complementary. I-P-I means that the two insurance products share a property, e.g., a high level of assurance. In addition, meta-paths can also connect different types of nodes. For example, user-insurance product-insurance property (U-I-P) implies that the user bought the insurance product, possibly because she cares most about that insurance property. Now, we can give the definition of meta-path based neighbors:

Definition 4.2 (Meta-path based Neighbors (Hu et al., 2019)).

Given a node $v$ and a meta-path $\Phi$ in a HIN, the meta-path based neighbors $\mathcal{N}_v^{\Phi}$ of node $v$ are defined as the set of nodes that connect with node $v$ via meta-path $\Phi$. Note that a node’s meta-path based neighbors may have different node types.
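To make the definition concrete, the following minimal sketch builds a tiny IHIN and collects meta-path based neighbors by following node types step by step. The edge list, the id-prefix convention for node types and the function names are all assumptions made for this illustration, and excluding the start node from its own neighbor set is a design choice of the sketch rather than something the definition prescribes.

```python
from collections import defaultdict

# Hypothetical toy IHIN. Node ids are prefixed by type:
# u = user, a = agent, i = insurance product, p = insurance property.
EDGES = [
    ("u1", "i1"), ("u1", "i2"), ("u2", "i2"),   # user purchases product
    ("u1", "a1"), ("u2", "a1"),                 # user is served by agent
    ("i1", "p1"), ("i2", "p1"), ("i3", "p1"),   # product possesses property
]


def node_type(node_id):
    """Return the node type letter (U, A, I or P) encoded in the id prefix."""
    return node_id[0].upper()


def build_adjacency(edges):
    """Store every relation in both directions (e.g. purchase / be purchased by)."""
    adjacency = defaultdict(set)
    for src, dst in edges:
        adjacency[src].add(dst)
        adjacency[dst].add(src)
    return adjacency


def metapath_neighbors(adjacency, start, metapath):
    """Nodes reachable from `start` by following the node types in `metapath`."""
    if node_type(start) != metapath[0]:
        raise ValueError("start node type must match the first type in the meta-path")
    frontier = {start}
    for wanted_type in metapath[1:]:
        frontier = {
            nbr
            for node in frontier
            for nbr in adjacency[node]
            if node_type(nbr) == wanted_type
        }
    frontier.discard(start)  # a node is not treated as its own neighbor here
    return frontier


adjacency = build_adjacency(EDGES)
iui = metapath_neighbors(adjacency, "i1", "IUI")  # products sharing a buyer with i1
ipi = metapath_neighbors(adjacency, "i1", "IPI")  # products sharing a property with i1
uip = metapath_neighbors(adjacency, "u2", "UIP")  # properties of products u2 bought
```

In this toy graph, I-U-I and I-P-I return different neighbor sets for the same product, which is the sense in which different meta-paths reveal different semantics.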

4.1.2. Relational Neighbor Aggregation

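Different relations carry different information. As shown in Figure 5(a), an agent and an insurance product can both be one-hop neighbors of the same user, but they imply different information. We therefore employ relational attention aggregation over one-hop neighbors; Figure 5(b) illustrates the framework. Specifically, instead of using the same aggregation function for all one-hop neighbors, we learn a specific aggregation function for each type of relation. Let $h_v$ denote the current embedding of node $v$. Since a node’s one-hop neighbors may have a different node type from the node itself, we first project them into the same node space ($W_r$ is the projection matrix for the relation $r$ between $v$ and its neighbor $u$), and then calculate the attention score:

$$e_{vu} = att_{rel}(h_v, W_r h_u), \qquad \alpha_{vu} = \frac{\exp(e_{vu})}{\sum_{k \in \mathcal{N}_v} \exp(e_{vk})}$$

where $att_{rel}$ is the deep neural network performing relational attention, $\alpha_{vu}$ is the level of influence of node $u$ on node $v$, and $\mathcal{N}_v$ is the set of node $v$’s one-hop neighbors. Then, we aggregate information from $\mathcal{N}_v$:

$$h_v^{R} = \sigma\Big(\sum_{u \in \mathcal{N}_v} \alpha_{vu} W_r h_u\Big) \tag{1}$$

where $\sigma$ denotes the activation function.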
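The relational aggregation step can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: a dot product stands in for the attention network, tanh stands in for the activation function, and the relation names, dimensions and values are hypothetical.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def relational_aggregate(h_v, neighbors, projections):
    """Aggregate one-hop neighbors of a node with relation-specific projections.

    neighbors:   list of (relation_name, neighbor_embedding) pairs
    projections: dict mapping relation_name -> projection matrix into h_v's space
    """
    projected = [matvec(projections[rel], h_u) for rel, h_u in neighbors]
    # Stand-in for the attention network: a plain dot product with h_v.
    scores = [sum(a * b for a, b in zip(h_v, p)) for p in projected]
    alphas = softmax(scores)
    summed = [
        sum(alpha * p[k] for alpha, p in zip(alphas, projected))
        for k in range(len(h_v))
    ]
    return [math.tanh(x) for x in summed], alphas  # tanh as the activation


# Hypothetical toy example: a 2-d user with one agent and one product neighbor.
h_user = [1.0, 0.0]
neighbors = [
    ("served_by", [0.5, 0.5, 0.0]),   # 3-d agent embedding
    ("purchase", [1.0, 2.0]),         # 2-d product embedding
]
projections = {
    "served_by": [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],  # maps 3-d -> 2-d
    "purchase": [[1.0, 0.0], [0.0, 1.0]],             # identity
}
h_rel, alphas = relational_aggregate(h_user, neighbors, projections)
```

The per-relation projection is what lets neighbors of different node types, here a 3-d agent and a 2-d product, be compared and summed in the user's embedding space.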

4.1.3. Meta-path based Aggregation

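Two nodes can also be connected by meta-paths, and different meta-path based neighbors carry different information. For example, as shown in Figure 5(a), the information passed between two insurance products along I-U-I differs from the information passed along I-P-I, since the former implies a shared user and the latter implies a shared property. For efficiency, we only choose the meta-path based neighbors that have the same node type as the node. The attention score is defined as:

$$e_{vu}^{\Phi} = att_{node}(h_v, h_u; \Phi), \qquad \alpha_{vu}^{\Phi} = \frac{\exp(e_{vu}^{\Phi})}{\sum_{k \in \mathcal{N}_v^{\Phi}} \exp(e_{vk}^{\Phi})}$$

where $att_{node}$ is the deep neural network that performs the node-level attention and $\alpha_{vu}^{\Phi}$ is the level of influence of node $u$ on node $v$ under meta-path $\Phi$. Then, we employ the attention mechanism to aggregate the information of the meta-path based neighbors:

$$z_v^{\Phi} = \sigma\Big(\sum_{u \in \mathcal{N}_v^{\Phi}} \alpha_{vu}^{\Phi} h_u\Big)$$

where $\sigma$ denotes the activation function. The procedure is illustrated in the left part of Figure 5(c).

Given the meta-path set $\{\Phi_1, \ldots, \Phi_M\}$, after node attention aggregation we obtain $M$ node-level embeddings for node $v$, denoted as $\{z_v^{\Phi_1}, \ldots, z_v^{\Phi_M}\}$. All the node embeddings are denoted as $\{Z^{\Phi_1}, \ldots, Z^{\Phi_M}\}$. In the following part, we introduce how to aggregate these node-level embeddings.

To learn a more accurate node embedding, we fuse the multiple node embeddings. Taking $\{Z^{\Phi_1}, \ldots, Z^{\Phi_M}\}$ as input, as shown in the right part of Figure 5(c), we first calculate the importance of each meta-path $\Phi_m$:

$$w_{\Phi_m} = att_{sem}(Z^{\Phi_m})$$

where $att_{sem}$ denotes the network performing the semantic-level attention, and the weight for $\Phi_m$ is defined as:

$$\beta_{\Phi_m} = \frac{\exp(w_{\Phi_m})}{\sum_{j=1}^{M} \exp(w_{\Phi_j})}$$

From the definition of the attention score, we can see that the higher $\beta_{\Phi_m}$ is, the more important meta-path $\Phi_m$ is. Now, we can fuse these node-level embeddings to obtain the final node embedding:

$$z_v = \sum_{m=1}^{M} \beta_{\Phi_m} z_v^{\Phi_m} \tag{2}$$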
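The semantic-level fusion can be illustrated with a short sketch. It assumes that one scalar importance score per meta-path has already been produced by the semantic attention network; the function name, the meta-path keys and the numbers are hypothetical.

```python
import math


def fuse_metapath_embeddings(embeddings_by_path, importance_by_path):
    """Fuse per-meta-path embeddings of one node with softmax-normalized weights.

    embeddings_by_path: dict meta-path name -> embedding (list of floats)
    importance_by_path: dict meta-path name -> scalar importance score
    """
    paths = list(embeddings_by_path)
    peak = max(importance_by_path[p] for p in paths)
    exps = {p: math.exp(importance_by_path[p] - peak) for p in paths}
    total = sum(exps.values())
    betas = {p: exps[p] / total for p in paths}
    dim = len(embeddings_by_path[paths[0]])
    fused = [
        sum(betas[p] * embeddings_by_path[p][k] for p in paths)
        for k in range(dim)
    ]
    return fused, betas


# Hypothetical values: importance scores would come from the semantic attention network.
z_by_path = {"IUI": [1.0, 0.0], "IPI": [0.0, 1.0]}
importance = {"IUI": 2.0, "IPI": 2.0}
fused, betas = fuse_metapath_embeddings(z_by_path, importance)
```

With equal importance scores the two meta-paths receive equal weight, so the fused embedding is the plain average; raising one score shifts the result toward that meta-path's embedding.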

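4.1.4. Node Update

Finally, we update node $v$ by aggregating $h_v^{R}$ (from (1)) and $z_v$ (from (2)) into its original representation $h_v$.

After updating the HIN node embeddings, we obtain the user and insurance product embeddings, which are denoted as $e_u^t$ and $e_i^t$, respectively. The objective function in the target domain is:

$$\mathcal{L}^t = -\sum_{(u,i)} \Big[ R^t_{ui} \log \hat{y}_{ui} + (1 - R^t_{ui}) \log (1 - \hat{y}_{ui}) \Big] \tag{3}$$

where $\hat{y}_{ui} = \sigma(f(e_u^t, e_i^t))$, $\sigma$ is the sigmoid function, and $f$ is a ranking function, which can be a dot product or a deep neural network.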

4.2. Latent Feature in Nonfinancial Domain

In Jinguanjia, each item $i$ in the nonfinancial domain is associated with a description $c_i$. In order to learn more effective latent features, we employ word2vec (Mikolov et al., 2013). Suppose there are $n$ words in $c_i$’s content, $\{w_1, w_2, \ldots, w_n\}$. We utilize word2vec to obtain the word vectors, which are represented as $\{g_1, g_2, \ldots, g_n\}$. Then we concatenate the word vectors and apply max pooling over them to get the final item embedding:

$$e_i^s = \mathrm{maxpool}([g_1; g_2; \ldots; g_n])$$
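The pooling step can be sketched as follows. The sketch assumes that max pooling is taken element-wise across the word vectors of one description, so that the item embedding has the same dimension as a word vector; the function name and the 3-d vectors are hypothetical, and real vectors would come from a trained word2vec model.

```python
def max_pool_item_embedding(word_vectors):
    """Element-wise max over the word vectors of an item description."""
    if not word_vectors:
        raise ValueError("an item description needs at least one word vector")
    return [max(column) for column in zip(*word_vectors)]


# Hypothetical 3-d word vectors; real ones would come from a trained word2vec model.
description_vectors = [
    [0.1, -0.4, 0.7],
    [0.5, 0.2, -0.3],
    [-0.2, 0.9, 0.0],
]
item_embedding = max_pool_item_embedding(description_vectors)
```

Pooling across words gives a fixed-size item embedding regardless of how many words a description contains.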

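To model the final user latent feature $e_u^s$, we employ a GRU over the user’s interacted sequence $S_u^s$:

$$z_t = \sigma(W_z x_t + U_z h_{t-1})$$
$$r_t = \sigma(W_r x_t + U_r h_{t-1})$$
$$\tilde{h}_t = \tanh(W_h x_t + U_h (r_t \odot h_{t-1}))$$
$$h_t = (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t$$

where $x_t$ is the embedding of the $t$-th item in $S_u^s$, $\sigma$ is the sigmoid function, $\odot$ is the element-wise product, $W_z$, $W_r$, $W_h$, $U_z$, $U_r$, $U_h$ are learnable weight matrices, and $d$ is the hidden size (the dimension of $h_t$). We use the last hidden state $h_T$ to represent the user, i.e., $e_u^s = h_T$. The loss function is the same as Eq. (3), where $\hat{y}_{ui} = \sigma(f(e_u^s, e_i^s))$.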

4.3. Mapping Function Between Two Domains

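Similar to the study of (Man et al., 2017), we employ an MLP to perform latent space matching from the source domain to the target domain. We take $e_u^s$ as input and $e_u^t$ as output, and the loss function can be formalized as:

$$\mathcal{L}^{map} = \sum_{u \in \mathcal{U}} \big\| f_{mlp}(e_u^s; \theta) - e_u^t \big\|_2^2$$

where $f_{mlp}$ is the MLP mapping function with parameters $\theta$.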

4.4. Recommendation for Cold Start Users

In this paper, we assume that cold start users have interactions in the nonfinancial domain, but no interactions in the insurance domain. After learning the latent feature $e_u^s$ of a cold start user in the nonfinancial domain, we can get the corresponding mapped latent feature $\hat{e}_u^t = f_{mlp}(e_u^s; \theta)$. Based on the learned $\hat{e}_u^t$, we can make recommendations to cold start users.
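The cold start inference path can be sketched as below: a source-domain user embedding is passed through the mapping MLP, and insurance products are ranked by their score against the mapped embedding. The sketch assumes ReLU hidden layers and a dot product as the ranking function; the weights, product names and embeddings are hypothetical and stand in for trained parameters.

```python
def mlp_map(source_embedding, layers):
    """Apply a small MLP; `layers` is a list of (weight_matrix, bias) pairs."""
    hidden = source_embedding
    for index, (weights, bias) in enumerate(layers):
        hidden = [
            sum(w * x for w, x in zip(row, hidden)) + b
            for row, b in zip(weights, bias)
        ]
        if index < len(layers) - 1:           # ReLU on hidden layers only
            hidden = [max(0.0, x) for x in hidden]
    return hidden


def recommend(source_embedding, layers, product_embeddings, top_n=3):
    """Map a cold start user into the target space and rank products by dot product."""
    mapped_user = mlp_map(source_embedding, layers)
    scores = {
        product: sum(a * b for a, b in zip(mapped_user, emb))
        for product, emb in product_embeddings.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


# Hypothetical trained parameters and product embeddings.
layers = [
    ([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]),   # hidden layer
    ([[0.0, 1.0], [1.0, 0.0]], [0.0, 0.0]),   # output layer swaps the two axes
]
products = {"accident": [1.0, 0.0], "illness": [0.0, 1.0], "medical": [0.5, 0.5]}
top = recommend([2.0, 1.0], layers, products, top_n=2)
```

Only the source-domain embedding of the user is needed at inference time, which is what makes the approach applicable to users with no insurance interactions.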

5. Experiment

To evaluate the performance of HCDIR, we conduct extensive experiments and online A/B test on Jinguanjia dataset to answer the following key questions:

RQ1: How does our proposed HCDIR model perform compared with the state-of-the-art methods for CDR task?

RQ2: Can the proposed HCDIR alleviate the data sparsity problem in the target domain?

RQ3: How do different types of heterogeneous auxiliary information and other HIN options affect the recommendation performance of HCDIR?

5.1. Datasets

As described in Section 2.1, we build and release a suitable dataset for the insurance product recommendation task. We randomly split the overlapping users of the Jinguanjia dataset into a training set (60%) to learn parameters, a validation set (20%) to tune hyper-parameters, and a testing set (20%) for the final performance comparison. For the testing set, we remove the users’ information in the target domain so that they act as cold start users when evaluating recommendation performance (i.e., test users). To study how the performance of our proposed methods changes with the number of overlapping users, we restrict the number of overlapping users in a way similar to the real-world distribution. For the baseline comparison study, we build four training sets, each containing a certain fraction of the overlapping users who are not test users.
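A minimal sketch of this split is given below. The function names, the fixed random seed and the dictionary layout of the interactions are assumptions made for the illustration.

```python
import random


def split_overlapping_users(user_ids, seed=0):
    """Randomly split users into 60% train, 20% validation and 20% test."""
    shuffled = list(user_ids)
    random.Random(seed).shuffle(shuffled)
    n_train = len(shuffled) * 6 // 10
    n_valid = len(shuffled) * 2 // 10
    train = shuffled[:n_train]
    valid = shuffled[n_train:n_train + n_valid]
    test = shuffled[n_train + n_valid:]
    return train, valid, test


def hide_target_interactions(target_interactions, test_users):
    """Drop target-domain interactions of test users so they act as cold start users."""
    hidden = set(test_users)
    return {u: items for u, items in target_interactions.items() if u not in hidden}


users = [f"u{k}" for k in range(10)]
train, valid, test = split_overlapping_users(users)
```

The hidden target-domain interactions of the test users then serve as ground truth when the recommendations are evaluated.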

5.2. Baseline Models and Metrics

Four widely used recommendation algorithms are compared with HCDIR and its variants. The compared models can be divided into two groups: (1) single-domain models: BPR (Rendle et al., 2012) and GRU4REC (Hidasi et al., 2016); (2) cross-domain models: EMCDR-BPR (Man et al., 2017), EMCDR-GRU, two variants of HCDIR, and HCDIR itself. The first group is used to validate the usefulness of cross-domain recommendation models, and the second group is used to demonstrate the advantage of the TAHIN module in the insurance domain in dealing with various kinds of heterogeneous information, including user purchase logs, agents and complex insurance product properties. How to utilize different types of heterogeneous information is one of the key factors in boosting the effectiveness of the model. RGCN and HAN are two representative methods for handling heterogeneous data, so we design two variants, HCDIR-RGCN and HCDIR-HAN, which adopt RGCN and HAN, respectively. HAN is superior to other deep heterogeneous network embedding models such as Metapath2Vec. RGCN applies relation-aggregator-based GCN to heterogeneous information networks. HCDIR leverages both HAN and RGCN to process the various kinds of heterogeneous information in the insurance domain for better user representations, which can effectively improve recommendation performance.

We evaluate all models with NDCG and Rec@N (N=1,3,5), which effectively evaluate the performance of recommendation methods. NDCG is used to observe the overall performance in terms of ranking insurance products, while Recall@N is used to judge how accurately insurance products are recommended in the top N positions.

NDCG: Normalized Discounted Cumulative Gain (NDCG) extends the hit ratio (HR) by assigning higher scores to hits at higher positions in the ranking list.

Recall@N (Rec@N): Recall is the primary evaluation metric; it measures the proportion of test cases in which the relevant item is among the top N ranked items.
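Both metrics can be sketched as follows, assuming one relevant (held-out) insurance product per test case, in which case the ideal DCG is 1 and NDCG reduces to 1 / log2(rank + 1). The function names and the toy rankings are hypothetical.

```python
import math


def recall_at_n(ranked_items, relevant_item, n):
    """1.0 if the relevant item appears in the top-n of the ranking, else 0.0."""
    return 1.0 if relevant_item in ranked_items[:n] else 0.0


def ndcg(ranked_items, relevant_item):
    """NDCG for a single relevant item: 1 / log2(rank + 1), or 0.0 if it is absent."""
    if relevant_item not in ranked_items:
        return 0.0
    rank = ranked_items.index(relevant_item) + 1   # 1-based position
    return 1.0 / math.log2(rank + 1)


def evaluate(test_cases, n):
    """Average Recall@n and NDCG over (ranked_items, relevant_item) test cases."""
    recalls = [recall_at_n(ranked, rel, n) for ranked, rel in test_cases]
    ndcgs = [ndcg(ranked, rel) for ranked, rel in test_cases]
    return sum(recalls) / len(test_cases), sum(ndcgs) / len(test_cases)


# Hypothetical rankings for three test users.
cases = [
    (["i1", "i2", "i3"], "i1"),   # hit at rank 1
    (["i2", "i3", "i1"], "i1"),   # hit at rank 3
    (["i2", "i3", "i4"], "i1"),   # miss
]
rec_at_1, mean_ndcg = evaluate(cases, n=1)
```

A hit at rank 3 counts fully toward Recall@3 but contributes only 0.5 to NDCG, which is how NDCG rewards placing the relevant product nearer the top.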

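| Fraction of overlapping users | Group | Method | NDCG | Rec@1 | Rec@3 | Rec@5 |
|---|---|---|---|---|---|---|
| 10 | Single-domain RS | BPR | 0.0719 | 0.0213 | 0.0737 | 0.1248 |
| 10 | Single-domain RS | GRU4REC | 0.0036 | 0.0017 | 0.0032 | 0.0057 |
| 10 | Cross-domain RS | EMCDR-BPR | 0.0881 | 0.0324 | 0.0689 | 0.1543 |
| 10 | Cross-domain RS | EMCDR-GRU | 0.1013 | 0.0284 | 0.0961 | 0.2088 |
| 10 | Cross-domain RS | HCDIR-RGCN | 0.2468 | 0.0967 | 0.3448 | 0.3849 |
| 10 | Cross-domain RS | HCDIR-HAN | 0.3206 | 0.1236 | 0.3476 | 0.4828 |
| 10 | Cross-domain RS | HCDIR | 0.3674 | 0.1366 | 0.4002 | 0.5543 |
| 20 | Single-domain RS | BPR | 0.0789 | 0.0241 | 0.0864 | 0.1348 |
| 20 | Single-domain RS | GRU4REC | 0.0042 | 0.0022 | 0.0047 | 0.0061 |
| 20 | Cross-domain RS | EMCDR-BPR | 0.0984 | 0.0347 | 0.0848 | 0.1611 |
| 20 | Cross-domain RS | EMCDR-GRU | 0.1112 | 0.0366 | 0.1308 | 0.2257 |
| 20 | Cross-domain RS | HCDIR-RGCN | 0.2579 | 0.1003 | 0.3516 | 0.4002 |
| 20 | Cross-domain RS | HCDIR-HAN | 0.3311 | 0.1273 | 0.3656 | 0.4927 |
| 20 | Cross-domain RS | HCDIR | 0.3769 | 0.1548 | 0.4189 | 0.5683 |
| 50 | Single-domain RS | BPR | 0.0791 | 0.0274 | 0.1205 | 0.1735 |
| 50 | Single-domain RS | GRU4REC | 0.0117 | 0.0027 | 0.0114 | 0.0213 |
| 50 | Cross-domain RS | EMCDR-BPR | 0.1125 | 0.0402 | 0.1609 | 0.2281 |
| 50 | Cross-domain RS | EMCDR-GRU | 0.1289 | 0.0496 | 0.1594 | 0.2589 |
| 50 | Cross-domain RS | HCDIR-RGCN | 0.2701 | 0.1166 | 0.3611 | 0.4341 |
| 50 | Cross-domain RS | HCDIR-HAN | 0.3432 | 0.1341 | 0.3946 | 0.5372 |
| 50 | Cross-domain RS | HCDIR | 0.3895 | 0.1636 | 0.4354 | 0.5827 |
| 100 | Single-domain RS | BPR | 0.1009 | 0.0354 | 0.1627 | 0.1809 |
| 100 | Single-domain RS | GRU4REC | 0.0137 | 0.0054 | 0.0154 | 0.0221 |
| 100 | Cross-domain RS | EMCDR-BPR | 0.1359 | 0.0511 | 0.2059 | 0.2556 |
| 100 | Cross-domain RS | EMCDR-GRU | 0.1498 | 0.0806 | 0.2124 | 0.2486 |
| 100 | Cross-domain RS | HCDIR-RGCN | 0.3067 | 0.1247 | 0.3739 | 0.4974 |
| 100 | Cross-domain RS | HCDIR-HAN | 0.3703 | 0.1357 | 0.4254 | 0.5627 |
| 100 | Cross-domain RS | HCDIR | 0.4109 | 0.1873 | 0.4654 | 0.6128 |

Table 4. Performance comparison on the Jinguanjia dataset.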

5.3. Model Implementation Details

Parameter Setting. In TAHIN’s ‘relational neighbor aggregation’ part, message passing uses the mean operation and the type-wise reducer uses the sum operation. In TAHIN’s ‘meta-path based aggregation’ part, the meta-paths fall into four groups according to node type. User meta-paths are [U I U], [U A U] and [U I P I U]; item meta-paths are [I U I], [I P I] and [I U A U I]; agent meta-paths are [A U A] and [A U I U A]; and insurance product property meta-paths are [P I P] and [P I U I P]. The number of attention heads in GAT is set to 8. Because the three tasks (insurance domain, nonfinancial domain and cross domain) are trained separately in the cold-start scenario, no single type of meta-path significantly affects model performance, whereas incorporating all kinds of meta-paths boosts it. The final embedding dimension S is a key parameter of HCDIR and is discussed below. We set the GRU hidden state size to 32 due to storage constraints. We use Adam as the optimizer with a learning rate of 0.001. These settings are chosen by grid search on the validation set. To speed up training and convergence, we use a batch size of 32. We evaluate the model on the validation set after every epoch. We implement the proposed method with PyTorch and DGL (Wang et al., 2019a). All experiments are performed on Nvidia Tesla V100.
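The 'mean message passing, sum type-wise reducer' setting can be illustrated with a small sketch in plain Python. It only shows the aggregation pattern: the relation-specific weight matrices and non-linearities of a real RGCN layer are omitted, and the relation names, node ids and 2-dimensional features are hypothetical.

```python
def mean_vectors(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

def relational_aggregate(neighbors_by_relation, features, dim):
    """Mean over the neighbours of each relation, then sum over relations."""
    out = [0.0] * dim
    for relation, neighbor_ids in neighbors_by_relation.items():
        if not neighbor_ids:
            continue  # a relation without neighbours contributes nothing
        message = mean_vectors([features[n] for n in neighbor_ids])  # mean message passing
        out = [o + m for o, m in zip(out, message)]                  # sum type-wise reducer
    return out

# Toy IHIN fragment around a single user (hypothetical ids and features).
features = {
    "item_a": [1.0, 0.0],
    "item_b": [3.0, 2.0],
    "agent_x": [0.5, 0.5],
}
neighbors_of_user = {
    "purchases": ["item_a", "item_b"],  # user -> insurance product
    "served_by": ["agent_x"],           # user -> agent
}

user_repr = relational_aggregate(neighbors_of_user, features, dim=2)
print(user_repr)  # [2.5, 1.5]
```

Averaging within a relation keeps a user with many purchases from dominating the representation, while summing across relations lets each relation type contribute its own signal.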

Study of the Final Embedding Dimension S. The quality of the final embedding directly affects model performance. As shown in Figure 6, as the embedding dimension S increases, performance first rises and then slowly drops. The best value of S is 32. The main reason is that a cross-domain method needs a suitable dimension to encode the different information of the two domains, and a larger dimension may introduce redundancy.

Figure 6. Study of the Final Embedding Dimension S.


5.4. Performance Comparison

To answer RQ1 and RQ2, HCDIR and its two variants are compared with four state-of-the-art models at different data densities. Table 4 shows the performance comparison. Overall, benefiting from the proposed TAHIN module and source-domain information, HCDIR outperforms all compared methods at every level of data sparsity. The experiments reveal several findings: (1) cross-domain methods outperform the single-domain methods trained on a mixture of target- and source-domain data, the only exception being EMCDR-BPR on Rec@3 at the two sparsest levels, which demonstrates the importance of the cross-domain module; (2) owing to their ability to use different types of heterogeneous information in the insurance domain, the two variants of HCDIR (HCDIR-RGCN and HCDIR-HAN) outperform the other compared methods; (3) HCDIR maintains its advantage over the other methods on sparser data, which indicates that it better alleviates the negative impact of data sparsity.

To answer RQ3, we compare HCDIR with HCDIR-RGCN and HCDIR-HAN. From Table 4, we find that the performance of HCDIR-RGCN and HCDIR-HAN declines sharply on all metrics when a sparser dataset is used. This shows that the proposed HCDIR achieves more stable and better performance with limited data, which is mainly attributable to the various types of heterogeneous information and to the combination of RGCN and HAN for handling the different kinds of auxiliary relationships.

5.5. Ablation Study

We identify two important factors affecting model performance: the data and the corresponding data-processing module. We therefore conduct two studies at the 10% sparsity level, a data ablation and a model ablation, as shown in Table 5.

Result 1: Data Ablation

To investigate the effect of the two newly added types of heterogeneous data (agents and insurance properties), we design three variants of the proposed model: HCDIR with only interactions, HCDIR without Agent, and HCDIR without Insurance Properties (HCDIR without IP for short). Table 5 shows that using only interactions cannot reach the best performance, even with the proposed framework. We also observe that, relative to HCDIR, HCDIR without Agent declines more than HCDIR without IP on all metrics, which means that agent information is the key factor in improving the model.

Result 2: Model Ablation

Two newly added kinds of heterogeneous information are used in HCDIR, and how effectively they are leveraged affects the final performance. RGCN and HAN are two widely used methods for heterogeneous data, so we design two variants of HCDIR in the IHIN module, namely HCDIR using RGCN and HCDIR using HAN. HCDIR using HAN outperforms HCDIR using RGCN, which aggregates only 1-hop relation-aware neighbors. These results indicate the advantage of combining the attention mechanism with the higher-order heterogeneous neighbors generated by the GCN-based model in HAN. HCDIR therefore uses both HAN and RGCN to process the heterogeneous information, which yields a better result.

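| Study          | Variant                      | NDCG             | Rec@1            | Rec@3            | Rec@5            |
|----------------|------------------------------|------------------|------------------|------------------|------------------|
| Data Ablation  | HCDIR only with interactions | 0.1013 (-72.43%) | 0.0284 (-79.21%) | 0.0961 (-75.99%) | 0.2088 (-62.33%) |
| Data Ablation  | HCDIR without Agent          | 0.2157 (-41.29%) | 0.0933 (-31.70%) | 0.2287 (-42.85%) | 0.3277 (-40.88%) |
| Data Ablation  | HCDIR without IP             | 0.2313 (-37.04%) | 0.1073 (-21.45%) | 0.2512 (-37.23%) | 0.3665 (-33.88%) |
| Model Ablation | HCDIR using RGCN             | 0.2468 (-32.83%) | 0.0967 (-29.21%) | 0.3448 (-13.84%) | 0.3849 (-30.56%) |
| Model Ablation | HCDIR using HAN              | 0.3206 (-12.74%) | 0.1236 (-9.52%)  | 0.3476 (-13.14%) | 0.4828 (-12.90%) |
| Full Model     | HCDIR                        | 0.3674           | 0.1366           | 0.4002           | 0.5543           |

Table 5. Performance of variants of HCDIR on the Jinguanjia dataset at the 10% sparsity level. Values in parentheses are relative changes with respect to the full model.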
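The negative figures reported alongside each variant in Table 5 are consistent with a relative change with respect to the full HCDIR model, expressed in percent. A short check for the 'HCDIR without Agent' row, using the values from the table (the helper name is ours):

```python
# Metric values copied from Table 5.
full_model = {"NDCG": 0.3674, "Rec@1": 0.1366, "Rec@3": 0.4002, "Rec@5": 0.5543}
without_agent = {"NDCG": 0.2157, "Rec@1": 0.0933, "Rec@3": 0.2287, "Rec@5": 0.3277}

def relative_change_pct(variant, reference):
    """Per-metric relative change of `variant` against `reference`, in percent."""
    return {
        metric: round(100.0 * (variant[metric] - reference[metric]) / reference[metric], 2)
        for metric in reference
    }

changes = relative_change_pct(without_agent, full_model)
print(changes)  # NDCG -41.29, Rec@1 -31.7, Rec@3 -42.85, Rec@5 -40.88
```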

5.6. Online A/B Testing for Cold-Start Recommendation

| Improvement vs. GBaseline (%) | GHCDIR without agent, T+1 month | GHCDIR without agent, T+2 months | GHCDIR without agent, T+3 months | GHCDIR, T+1 month | GHCDIR, T+2 months | GHCDIR, T+3 months |
|-------------------------------|---------------------------------|----------------------------------|----------------------------------|-------------------|--------------------|--------------------|
| UPCR                          | 8.79                            | 12.87                            | 18.38                            | 12.94             | 18.66              | 23.20              |
| UPGR                          | 10.97                           | 13.04                            | 15.31                            | 15.25             | 20.41              | 25.62              |

Running time vs. GBaseline (one value per group): -79.94% for GHCDIR without agent and -76.59% for GHCDIR.

Table 6. Online performance of compared methods. ‘GBaseline’ indicates the baseline performance of the cold-start user group using the traditional method LightGBM; ‘GHCDIR without agent’ and ‘GHCDIR’ denote HCDIR without agent heterogeneous relationships and HCDIR, respectively.

To validate the effectiveness of HCDIR, we run an online A/B test on cold-start users in the insurance domain to show how the cross-domain method and heterogeneous insurance information affect cold-start recommendation in a real-world scenario.

For online A/B testing, cold-start users who had not purchased any insurance product by the end of August 2019 are divided into three groups with highly similar activity in the Jinguanjia app, where each group contains 150,000 users. Users in the first group are recommended insurance products by the traditional strategy using the best trained LightGBM model, denoted as GBaseline. Users in the second group are recommended products by HCDIR without agent heterogeneous information, trained with 10% of the training data of the Jinguanjia dataset used above, denoted as GHCDIR without agent. Users in the third group are recommended products by our proposed HCDIR trained with 10% of the training data, denoted as GHCDIR.

User Purchase Conversion Rate (UPCR): the number of users who purchased the recommended insurance product divided by the total number of cold-start users.

User Premium Growth Amount (UPGA): the amount of insurance premium that cold-start users paid for the recommended insurance.
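As a worked illustration of these two metrics and of an improvement percentage against a baseline group, consider the sketch below. Only the group size of 150,000 comes from the text; the purchaser counts and premium amounts are hypothetical numbers chosen for the example, and the function names are ours.

```python
GROUP_SIZE = 150_000  # users per A/B group, as stated in the text

def upcr(num_purchasers, num_cold_start_users):
    """User Purchase Conversion Rate: purchasers / all cold-start users in the group."""
    return num_purchasers / num_cold_start_users

def upga(premiums_paid):
    """User Premium Growth Amount: total premium paid for recommended products."""
    return sum(premiums_paid)

def improvement_pct(treatment, baseline):
    """Relative improvement of a treatment group over the baseline group, in percent."""
    return 100.0 * (treatment - baseline) / baseline

# Hypothetical purchaser counts (NOT from the paper).
baseline_upcr = upcr(3_000, GROUP_SIZE)   # 0.02
treatment_upcr = upcr(3_300, GROUP_SIZE)  # 0.022
upcr_gain = improvement_pct(treatment_upcr, baseline_upcr)

# Hypothetical premium amounts (NOT from the paper).
baseline_upga = upga([100.0, 250.0, 150.0])   # 500.0
treatment_upga = upga([120.0, 300.0, 180.0])  # 600.0
upga_gain = improvement_pct(treatment_upga, baseline_upga)

print(round(upcr_gain, 2), round(upga_gain, 2))
```

Because all groups have the same size, an improvement in UPCR is equivalent to the same relative improvement in the raw number of purchasers.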

Table 6 shows the results of the online A/B test, with GBaseline as the baseline. GBaseline is a LightGBM model that uses only user-item interactions and hand-designed features. Both GHCDIR without agent and GHCDIR, which uses all the heterogeneous information, consistently outperform this baseline. The improvements in UPCR and UPGR increase gradually over time, which indicates that cold-start users need time to develop insurance awareness. Specifically, GHCDIR without agent improves UPCR and UPGA by at least 8.79% and 10.97%, respectively, over the period from 1 to 3 months compared with GBaseline, which demonstrates the effectiveness of the TAHIN module in the insurance domain and of the cross-domain recommendation method. Moreover, with the help of the ‘agent’ heterogeneous auxiliary information, the improvements in UPCR and UPGR for GHCDIR are larger than those for GHCDIR without agent. As mentioned above, agents are the key channel for improving UPCR and UPGR in traditional insurance recommendation, and this result also shows that agent information can significantly boost cold-start performance in online insurance recommendation. As for training time, GBaseline takes 39.15 minutes, while GHCDIR without agent and GHCDIR take 7.86 minutes and 9.17 minutes, respectively. As shown in Table 6, the proposed HCDIR reduces training time by at least 76%.

6. Related Work

6.1. Insurance Recommendation System

To our knowledge, few papers address recommendation systems for insurance products; examples include (Rokach et al., 2013; Gupta and Jain, 2013; Mitra et al., 2014; Qazi et al., 2017; Liu et al., 2019). (Rokach et al., 2013) thoroughly describes the differences between recommendation systems for classical domains and for the insurance domain, and focuses on call centers servicing Life and Annual insurance, where the agents also have limited knowledge and experience. (Gupta and Jain, 2013) proposes a web recommendation system for the life insurance sector using association rules, one of the most well-researched data mining techniques. (Mitra et al., 2014) presents a hybrid recommendation system for the insurance domain based on a standard user-user collaborative filtering approach. (Qazi et al., 2017) uses Bayesian networks to give customers personalized recommendations based on what similar people with similar portfolios have. (Kanchinadam et al., 2018) improves on (Qazi et al., 2017) by learning the structure of the Bayesian network, which considerably speeds up both training and inference while achieving similar accuracy.

(Liu et al., 2019) proposes a causation-driven visualization system that transforms cross-media insurance data into network diagrams and performs recommendation reasoning. However, these methods neglect item complexity and the data sparsity problem.

6.2. Cross-domain Recommendation

Cross-domain recommendation (CDR) (Wang et al., 2017; Man et al., 2017; Ma et al., 2019; Kang et al., 2019; Fu et al., 2019; Lin et al., 2019), which aims to improve recommendation performance by transferring information from an auxiliary domain to the target domain, is a promising way to address the data sparsity and cold-start problems. Generally, CDR methods fall into two categories. The first aggregates knowledge between two domains; these methods aim to improve overall performance in the target domain (Wang et al., 2017; Lin et al., 2019; Ma et al., 2019), but they cannot deal with cold-start users, since such users have no interactions in the target domain. The second aims to infer the preferences of cold-start users from their preferences observed in other domains (Man et al., 2017; Kang et al., 2019; Fu et al., 2019). These methods assume that users and/or items overlap across domains, and train a mapping function from the source domain to the target domain. For cold-start users, these methods first learn representations in the source domain and then map them to the target domain.
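The embedding-and-mapping idea behind the second category can be sketched as follows. This is an illustration under simplifying assumptions, not the method of any cited paper: the mapping here is a plain linear map fitted by batch gradient descent, whereas HCDIR learns the mapping with an MLP, and all embeddings are toy 2-dimensional vectors invented for the example.

```python
def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def fit_linear_mapping(source, target, lr=0.1, epochs=500):
    """Fit W minimising the mean squared error ||W x - y||^2 over overlapping users."""
    d_out, d_in = len(target[0]), len(source[0])
    W = [[0.0] * d_in for _ in range(d_out)]
    n = len(source)
    for _ in range(epochs):
        grad = [[0.0] * d_in for _ in range(d_out)]
        for x, y in zip(source, target):
            pred = matvec(W, x)
            for i in range(d_out):
                err = pred[i] - y[i]
                for j in range(d_in):
                    grad[i][j] += 2.0 * err * x[j] / n
        for i in range(d_out):
            for j in range(d_in):
                W[i][j] -= lr * grad[i][j]
    return W

# Overlapping users: embeddings known in BOTH domains (toy values).
source_emb = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
target_emb = [[2.0, 0.0], [0.0, -1.0], [2.0, -1.0], [4.0, -1.0]]

W = fit_linear_mapping(source_emb, target_emb)

# Cold-start user: only a source-domain embedding exists, so the
# target-domain representation is obtained by mapping it.
cold_start_source = [3.0, 2.0]
cold_start_target = matvec(W, cold_start_source)
print([round(v, 4) for v in cold_start_target])
```

The mapped vector can then be scored against target-domain item embeddings to rank products for a user who has no target-domain interactions at all.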

6.3. Heterogeneous Information Networks

Recently, several representation learning methods have been proposed for HINs. These methods can be roughly divided into two groups: shallow models and deep models. Shallow models (Dong et al., 2017; Fu et al., 2017; Hu et al., 2018; Lu et al., 2019) employ factorization-based or random-walk approaches to aggregate information from neighbor nodes. For example, Metapath2vec (Dong et al., 2017) formalizes meta-path based random walks to obtain the heterogeneous neighborhood of a node and leverages the Skip-gram model to learn the network structure. However, this kind of method explores only one aspect of the information and fails to integrate more heterogeneous information. Deep models (Zhang et al., 2019; Wang et al., 2019b) aggregate neighbor information with neural networks. HetGNN (Zhang et al., 2019) jointly learns heterogeneous graph information and heterogeneous content information for node embeddings based on GNN (Scarselli et al., 2009). R-GCNs (Schlichtkrull et al., 2018) are developed to deal with highly multi-relational data. Inspired by graph attention networks, HAN (Wang et al., 2019b) designs two-level (node-level and semantic-level) attention to generate node embeddings by aggregating features from meta-path based neighbors.
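A meta-path guided random walk of the kind used by Metapath2vec can be sketched on a toy network as follows. The graph, node ids and helper name are hypothetical, and the Skip-gram training that would consume the walks is not shown.

```python
import random

def metapath_walk(adjacency, node_type, start, metapath, walk_length, rng):
    """Random walk from `start` whose node types follow `metapath` cyclically.

    `metapath` must start and end with the same type (e.g. ["U", "I", "U"]),
    so that the pattern can be repeated for walks longer than the meta-path.
    """
    if metapath[0] != metapath[-1] or node_type[start] != metapath[0]:
        raise ValueError("meta-path must be symmetric and match the start node type")
    pattern = metapath[:-1]  # e.g. ["U", "I"], repeated cyclically
    walk = [start]
    while len(walk) < walk_length:
        wanted = pattern[len(walk) % len(pattern)]
        candidates = [n for n in adjacency[walk[-1]] if node_type[n] == wanted]
        if not candidates:
            break  # dead end: no neighbour of the required type
        walk.append(rng.choice(candidates))
    return walk

# Toy heterogeneous network: users (U), insurance products (I), one agent (A).
node_type = {"u1": "U", "u2": "U", "i1": "I", "i2": "I", "a1": "A"}
adjacency = {
    "u1": ["i1", "a1"],
    "u2": ["i1", "i2", "a1"],
    "i1": ["u1", "u2"],
    "i2": ["u2"],
    "a1": ["u1", "u2"],
}

rng = random.Random(0)
walk = metapath_walk(adjacency, node_type, "u1", ["U", "I", "U"], 5, rng)
print(walk, [node_type[n] for n in walk])
```

Restricting each step to the node type required by the meta-path is what lets the walk capture one specific semantic, here 'users who bought the same product', instead of mixing all relations.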

7. Conclusion and Future Work

To deal with insurance product complexity and the cold-start problem, we propose a novel framework called HCDIR for cold-start users in the insurance domain. Specifically, we first try to learn more effective user and item latent features in both the source and target domains. In the source domain, we employ GRU to model users’ dynamic interests. In the target domain, we construct an IHIN based on data from the Jinguanjia app, and then employ three-level (relational, node, and semantic) attention aggregation to obtain user and insurance product representations. After obtaining the latent features of the overlapping users, a feature mapping between the two domains is learned by an MLP. We apply HCDIR to the PingAn Jinguanjia dataset and show that it significantly outperforms state-of-the-art solutions. In future work, we will try to construct a more complete HIN that considers more types of relations, such as the relation between agents and insurance products. We will also consider training more accurate item representations in the source domain.

References

  • B. Abdollahi and O. Nasraoui (2016) Explainable matrix factorization for collaborative filtering. See DBLP:conf/www/2016c, pp. 5–6.
  • J. Chung, Ç. Gülçehre, K. Cho, and Y. Bengio (2014) Empirical evaluation of gated recurrent neural networks on sequence modeling. CoRR abs/1412.3555.
  • Y. Dong, N. V. Chawla, and A. Swami (2017) Metapath2vec: scalable representation learning for heterogeneous networks. See DBLP:conf/kdd/2017, pp. 135–144.
  • T. Fu, W. Lee, and Z. Lei (2017) HIN2Vec: explore meta-paths in heterogeneous information networks for representation learning. See DBLP:conf/cikm/2017, pp. 1797–1806.
  • W. Fu, Z. Peng, S. Wang, Y. Xu, and J. Li (2019) Deeply fusing reviews and contents for cold start users in cross-domain recommendation systems. See DBLP:conf/aaai/2019, pp. 94–101.
  • A. Gupta and A. Jain (2013) Life insurance recommender system based on association rule mining and dual clustering method for solving cold-start problem. International Journal of Advanced Research in Computer Science and Software Engineering 3.
  • W. L. Hamilton, Z. Ying, and J. Leskovec (2017) Inductive representation learning on large graphs. See DBLP:conf/nips/2017, pp. 1024–1034.
  • B. Hidasi, A. Karatzoglou, L. Baltrunas, and D. Tikk (2016) Session-based recommendations with recurrent neural networks. See DBLP:conf/iclr/2016.
  • B. Hu, C. Shi, W. X. Zhao, and P. S. Yu (2018) Leveraging meta-path based context for top-N recommendation with a neural co-attention model. See DBLP:conf/kdd/2018, pp. 1531–1540.
  • B. Hu, Z. Zhang, C. Shi, J. Zhou, X. Li, and Y. Qi (2019) Cash-out user detection based on attributed heterogeneous information network with a hierarchical attention mechanism. See DBLP:conf/aaai/2019, pp. 946–953.
  • T. Kanchinadam, M. Qazi, J. Bockhorst, M. Y. Morell, K. J. Meissner, and G. Fung (2018) Using discriminative graphical models for insurance recommender systems. See DBLP:conf/icmla/2018, pp. 421–428.
  • S. Kang, J. Hwang, D. Lee, and H. Yu (2019) Semi-supervised learning for cross-domain recommendation to cold-start users. See DBLP:conf/cikm/2019, pp. 1563–1572.
  • T. Lin, C. Gao, and Y. Li (2019) CROSS: cross-platform recommendation for social e-commerce. See DBLP:conf/sigir/2019, pp. 515–524.
  • Z. Liu, C. Zang, K. Kuang, H. Zou, H. Zheng, and P. Cui (2019) Causation-driven visualizations for insurance recommendation. See DBLP:conf/icmcs/2019w, pp. 471–476.
  • Y. Lu, C. Shi, L. Hu, and Z. Liu (2019) Relation structure-aware heterogeneous information network embedding. See DBLP:conf/aaai/2019, pp. 4456–4463.
  • M. Ma, P. Ren, Y. Lin, Z. Chen, J. Ma, and M. de Rijke (2019) π-Net: A parallel information-sharing network for shared-account cross-domain sequential recommendations. See DBLP:conf/sigir/2019, pp. 685–694.
  • T. Man, H. Shen, X. Jin, and X. Cheng (2017) Cross-domain recommendation: an embedding and mapping approach. See DBLP:conf/ijcai/2017, pp. 2464–2470.
  • T. Mikolov, I. Sutskever, K. Chen, G. S. Corrado, and J. Dean (2013) Distributed representations of words and phrases and their compositionality. See DBLP:conf/nips/2013, pp. 3111–3119.
  • S. Mitra, N. Chaudhari, and B. Patwardhan (2014) Leveraging hybrid recommendation system in insurance domain. International Journal of Engineering and Computer Science 3.
  • M. Qazi, G. M. Fung, K. J. Meissner, and E. R. Fontes (2017) An insurance recommendation system using bayesian networks. See DBLP:conf/recsys/2017, pp. 274–278.
  • S. Rendle, C. Freudenthaler, Z. Gantner, and L. Schmidt-Thieme (2012) BPR: bayesian personalized ranking from implicit feedback. CoRR abs/1205.2618.
  • L. Rokach, G. Shani, B. Shapira, E. Chapnik, and G. Siboni (2013) Recommending insurance riders. See DBLP:conf/sac/2013, pp. 253–260.
  • F. Scarselli, M. Gori, A. C. Tsoi, M. Hagenbuchner, and G. Monfardini (2009) The graph neural network model. IEEE Transactions on Neural Networks 20 (1), pp. 61–80.
  • M. S. Schlichtkrull, T. N. Kipf, P. Bloem, R. van den Berg, I. Titov, and M. Welling (2018) Modeling relational data with graph convolutional networks. See DBLP:conf/esws/2018, pp. 593–607.
  • Y. Sun, J. Han, X. Yan, P. S. Yu, and T. Wu (2011) PathSim: meta path-based top-k similarity search in heterogeneous information networks. PVLDB 4 (11), pp. 992–1003.
  • P. Velickovic, G. Cucurull, A. Casanova, A. Romero, P. Liò, and Y. Bengio (2018) Graph attention networks. See DBLP:conf/iclr/2018.
  • M. Wang, L. Yu, D. Zheng, Q. Gan, Y. Gai, Z. Ye, M. Li, J. Zhou, Q. Huang, C. Ma, Z. Huang, Q. Guo, H. Zhang, H. Lin, J. Zhao, J. Li, A. J. Smola, and Z. Zhang (2019a) Deep graph library: towards efficient and scalable deep learning on graphs. ICLR Workshop on Representation Learning on Graphs and Manifolds.
  • X. Wang, X. He, L. Nie, and T. Chua (2017) Item silk road: recommending items from information domains to social users. See DBLP:conf/sigir/2017, pp. 185–194.
  • X. Wang, H. Ji, C. Shi, B. Wang, Y. Ye, P. Cui, and P. S. Yu (2019b) Heterogeneous graph attention network. See DBLP:conf/www/2019, pp. 2022–2032.
  • F. Xu, J. Lian, Z. Han, Y. Li, Y. Xu, and X. Xie (2019) Relation-aware graph convolutional networks for agent-initiated social e-commerce recommendation. See DBLP:conf/cikm/2019, pp. 529–538.
  • C. Zhang, D. Song, C. Huang, A. Swami, and N. V. Chawla (2019) Heterogeneous graph neural network. See DBLP:conf/kdd/2019, pp. 793–803.