1. Introduction
Online advertising is a $27.5 billion business in fiscal year 2017 (Bureau, 2017), and advertisers have been shifting their budgets to programmatic ad buying platforms. Recently, more and more advertisers are running campaigns with Cost-Per-Action (CPA) goals, seeking to maximize conversions for a given budget. To achieve such objectives, accurate prediction of conversion probability is fundamental and has attracted a lot of research attention in the past few years
(Lee et al., 2012; Agarwal et al., 2010; Chapelle et al., 2015; Lu et al., 2017; Ahmed et al., 2014). Advertising platforms insert pixels (i.e., JavaScript code snippets) into advertisers' websites to track users' conversions, and there are several types of conversions that advertisers want to track. Some pixels track whether a user fills out an online form, while other pixels track whether a user buys a product. The existence of different types of conversions makes conversion prediction challenging, because the decisive factors that drive users to convert may vary from one conversion type to another. For example, whether to fill out a form online is a personal decision, so the field User_ID, and its interaction effects with other fields, should be the decisive factor. For online purchases, on the other hand, the product itself or its corresponding brand plays a more important role.
To address this problem, one approach is to build a separate model for each conversion type. However, this is memory intensive, and it fails to leverage information from other conversion types. Another approach is to build a unified model which captures the 2-way or 3-way interactions between fields, with conversion type included as one of the fields. However, the 2-way model fails to capture the different field interaction effects for different conversion types, while the 3-way model is computationally expensive.
In this paper we study an alternative approach: formulating conversion prediction as a multi-task learning problem, so that we can jointly learn prediction models for multiple conversion types. Besides task-specific parameters, these models share low-level feature representations, providing the benefit of information sharing among different conversion types. We propose Multi-Task Field-weighted Factorization Machine (MTFwFM), based on one of the best-performing models for click prediction, i.e., Field-weighted Factorization Machine (FwFM) (Pan et al., 2018), to solve these tasks together.
Our main contribution is twofold. First, we formulate conversion prediction as a multi-task learning problem and propose MTFwFM to solve all tasks jointly. Second, we have carried out extensive experiments on a real-world conversion prediction data set to evaluate the performance of MTFwFM against existing models. The results show that MTFwFM increases the AUC of ROC on two conversion types by 0.74% and 0.84%, respectively. The weighted AUC of ROC across all tasks is also increased by 0.50%. We have also conducted a comprehensive analysis, which shows that MTFwFM indeed captures different decisive factors for different conversion types.
The rest of the paper is organized as follows. We investigate the field interaction effects for different conversion types in Section 2. Section 3 describes MTFwFM in detail. Our experiment results are presented in Section 4. In Section 5, we conduct an analysis to show that MTFwFM learns different field interaction effects for different conversion types. Section 6 and Section 7 discuss the related work and conclude the paper.
2. Field Interaction Effects for Different Conversion Types
The data used for conversion prediction are typically multi-field categorical data (Zhang et al., 2016), where features are very sparse and each feature belongs to only one field. For example, the features yahoo.com and Nike belong to the fields Page_TLD (top-level domain) and Advertiser, respectively. In click prediction, it has been verified that different field pairs have different interaction effects on multi-field categorical data (Juan et al., 2016; Pan et al., 2018).
In conversion prediction, advertisers would like to track different types of conversions, and they spend most of their budget on the following four types:

Lead: the user fills out an online form

View Content: the user views a web page such as the landing page or a product page

Purchase: the user purchases a product

Sign Up: the user signs up for an account
Table 1. Top 5 field pairs with the highest mutual information w.r.t. each conversion type.

Conversion Type  Top 5 Field Pairs
Lead  (Ad, User), (Creative, User), (Line, User), (Subdomain, User), (Advertiser, User)
View Content  (Subdomain, Hour), (Ad, Subdomain), (Creative, Subdomain), (Subdomain, Age_Bucket), (Page_TLD, Hour)
Purchase  (Ad, Subdomain), (Creative, Subdomain), (Ad, Page_TLD), (Creative, Page_TLD), (Line, Subdomain)
Sign Up  (Ad, Subdomain), (Creative, Subdomain), (Ad, Age_Bucket), (Ad, Page_TLD), (Creative, Page_TLD)
The decisive factors, i.e., the main effect terms (fields) and/or the interaction terms (field pairs) that drive a user to convert, may vary a lot among these types. Following the analysis in (Pan et al., 2018), we verify this by computing the mutual information (MI) between each field pair and each type of conversion on our real-world data set, described later in Section 4.1. Suppose there are m unique features \{f_1, \ldots, f_m\}, n different fields \{F_1, \ldots, F_n\} and T conversion types. We denote F(i) as the field that feature f_i belongs to, and t as the conversion type. The interaction effect of a field pair (F_k, F_l) with respect to conversions of type t is measured by:
MI((F_k, F_l), Y_t) = \sum_{(i,j) \in (F_k, F_l)} \sum_{y \in \{0,1\}} p((i,j), y) \log \frac{p((i,j), y)}{p(i,j)\, p(y)},    (1)
where p(i,j) is the marginal probability of the feature pair (i,j), Y_t denotes the conversion label of type t, and p(y) is the marginal probability of y. All marginal probabilities are computed based on the samples from each conversion type t.
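As a concrete illustration of Eq. (1), the mutual information between a field pair and a conversion label can be estimated from empirical frequencies. The sketch below is ours (not the authors' code) and assumes each sample is already reduced to a (feature pair, label) tuple:

```python
# Hedged sketch: estimate MI((F_k, F_l), Y_t) of Eq. (1) from samples of one
# conversion type. `samples` is a list of ((feature_i, feature_j), y) pairs,
# where the feature pair is the observed value of the field pair (F_k, F_l).
from collections import Counter
from math import log

def field_pair_mi(samples):
    n = len(samples)
    p_pair = Counter(pair for pair, _ in samples)   # marginal p(i, j)
    p_y = Counter(y for _, y in samples)            # marginal p(y)
    p_joint = Counter(samples)                      # joint p((i, j), y)
    mi = 0.0
    for (pair, y), count in p_joint.items():
        p_xy = count / n
        mi += p_xy * log(p_xy / ((p_pair[pair] / n) * (p_y[y] / n)))
    return mi
```

When the field pair perfectly determines the label the MI equals the label entropy, and it drops to zero when the two are independent.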
The top 5 field pairs that have the highest mutual information w.r.t. each conversion type are shown in Table 1. It shows that these field pairs vary among types: all 5 field pairs of Lead contain the field User_ID, and all 5 field pairs of View Content contain publisher fields (Page_TLD and Subdomain). Here Page_TLD denotes the top-level domain of a web page, while Subdomain denotes its subdomain; for example, given a web page with URL https://sports.yahoo.com/warriorsloss76ersvividillustration075301147.html, the Page_TLD is yahoo.com and the Subdomain is sports.yahoo.com. For Purchase and Sign Up, most field pairs contain one publisher field and one advertiser field (Ad, Creative, Line). The heat maps of the mutual information for all field pairs with respect to each conversion type are shown in Figure 2; please refer to Section 4.1 for the explanation of each field.
There are several approaches to capture different field interaction effects for different conversion types. The first one is to build one model for each conversion type and train each model separately. However, this is not preferred in a real-world advertising platform, because a large amount of memory is required to store the parameters of all the models. In addition, the extremely low conversion rates of some conversion types may leave insufficient positive samples to train the corresponding models.
The second approach is to build a unified model, with conversion type as one of the fields. However, 2-way state-of-the-art models, such as 2-way Factorization Machines (FM) and Field-weighted Factorization Machines (FwFM), are not able to fully capture the differences in field interaction effects among conversion types. 3-way FM and FwFM may resolve this issue, but their online computing latency is much higher. Please refer to Section 3.3.3 for the details.
3. Multi-Task Field-weighted Factorization Machine
We formulate the prediction of different types of conversions as a multi-task learning problem, and propose Multi-Task Field-weighted Factorization Machine (MTFwFM) to train these models jointly. This section is organized as follows: Section 3.1 introduces FwFM and MTFwFM in detail; the training procedure of MTFwFM is described in Section 3.2; and Section 3.3 analyzes the number of parameters as well as the computing latency of MTFwFM.
3.1. Multi-Task Field-weighted Factorization Machine (MTFwFM)
MTFwFM is a variant of Field-weighted Factorization Machine (FwFM), which was introduced in (Pan et al., 2018) for click prediction. FwFM is formulated as

\hat{y} = \sigma(\Phi_{FwFM}(\Theta, x)),    (2)

where \sigma(z) = 1 / (1 + e^{-z}) is the sigmoid function, and \Phi_{FwFM} is the sum of the main and interaction effects across all features:

\Phi_{FwFM}(\Theta, x) = w_0 + \sum_{i=1}^{m} x_i \langle v_i, w_{F(i)} \rangle + \sum_{i=1}^{m} \sum_{j=i+1}^{m} x_i x_j \langle v_i, v_j \rangle r_{F(i),F(j)}.    (3)
Here \Theta is the set of parameters \{w_0, \{v_i\}, \{w_{F(i)}\}, \{r_{F(i),F(j)}\}\}: w_0 denotes the bias term; v_i refers to the embedding vector for feature i; w_{F(i)} denotes the main term weight vector for field F(i), which is used to model the main effect of feature i; and r_{F(i),F(j)} denotes the field interaction weight between fields F(i) and F(j).

We modify FwFM in the following ways to get MTFwFM: First, instead of using one bias term w_0, MTFwFM has one bias term w_{0,t} for each conversion type t. Second, each conversion type has its own w_{F(i),t} to model the main effect of feature i. Last, each conversion type also has its own field interaction weights r_{F(i),F(j),t}. The feature embeddings v_i are kept the same as in FwFM and are shared by all conversion types. Mathematically,
\hat{y}_t = \sigma\Big( w_{0,t} + \sum_{i=1}^{m} x_i \langle v_i, w_{F(i),t} \rangle + \sum_{i=1}^{m} \sum_{j=i+1}^{m} x_i x_j \langle v_i, v_j \rangle r_{F(i),F(j),t} \Big).    (4)
MTFwFM can be regarded as a 3-layer neural network: each sample is first processed by an embedding layer that maps each binary feature x_i to an embedding vector v_i, then by a main & interaction layer which consists of the main effect nodes v_i and the interaction nodes \langle v_i, v_j \rangle. Each node in the main & interaction layer is connected to an output layer which consists of T nodes, one for each conversion type. The connections between v_i and output node t are weighted by w_{F(i),t}, while the connections between \langle v_i, v_j \rangle and output node t are weighted by the field interaction weights r_{F(i),F(j),t}. The architecture of MTFwFM is shown in Figure 1.
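A minimal numpy sketch of the MTFwFM prediction in Eq. (4); the shapes and variable names mirror the notation above, but the code is our own illustration, not the authors' implementation:

```python
# Hedged sketch of Eq. (4). Assumes one active feature per field, so the
# sample is given as `active`, a list of feature ids in field order.
import numpy as np

def mtfwfm_predict(active, t, v, w, r, w0):
    """active: list of n active feature ids, one per field.
    t: conversion type index.
    v: (m, K) shared feature embeddings.
    w: (T, n, K) per-type main-term weight vectors, one per field.
    r: (T, n, n) per-type field interaction weights (upper triangle used).
    w0: (T,) per-type bias terms."""
    n = len(active)
    phi = w0[t]
    for f, i in enumerate(active):
        phi += v[i] @ w[t, f]                                   # main effects
    for f in range(n):
        for g in range(f + 1, n):
            phi += (v[active[f]] @ v[active[g]]) * r[t, f, g]   # interactions
    return 1.0 / (1.0 + np.exp(-phi))                           # sigmoid
```

With all parameters at zero the prediction is 0.5, and a large per-type bias pushes it toward 1, matching the sigmoid in Eq. (4).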
3.2. Joint Training
The feature embedding vectors v_i are shared by all conversion types during model training and are optimized for every sample. However, the conversion type specific parameters, such as w_{0,t}, w_{F(i),t} and r_{F(i),F(j),t}, are only optimized for samples of the corresponding type. We minimize the following loss function for MTFwFM:
L(\Theta) = \sum_{(x, y, t) \in S} \big[ -y \log \hat{y}_t(x) - (1 - y) \log(1 - \hat{y}_t(x)) \big] + \lambda \Omega(\Theta),    (5)
where S denotes the training set, y \in \{0, 1\} denotes the label, and \Omega(\Theta) denotes the regularization terms w.r.t. the parameters.
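One joint-training step under this loss can be sketched as follows. This is our own plain-numpy illustration (not the authors' TensorFlow code), assuming vanilla SGD and omitting the regularization term; note how the shared embeddings move for every sample while the type-specific parameters move only for their own type:

```python
# Hedged sketch of one mini-batch SGD step of the joint training.
# batch: list of (active, y, t), with `active` = one feature id per field.
# params = (v, w, r, w0): shared embeddings v (m, K); per-type main-term
# weights w (T, n, K), interaction weights r (T, n, n), biases w0 (T,).
import numpy as np

def sgd_step(batch, params, lr=0.1):
    v, w, r, w0 = params
    for active, y, t in batch:
        n = len(active)
        # forward pass, Eq. (4)
        phi = w0[t] + sum(v[i] @ w[t, f] for f, i in enumerate(active))
        for f in range(n):
            for g in range(f + 1, n):
                phi += (v[active[f]] @ v[active[g]]) * r[t, f, g]
        p = 1.0 / (1.0 + np.exp(-phi))
        g_phi = p - y  # gradient of the log loss w.r.t. phi
        # gradient w.r.t. the shared embeddings, computed before any update
        grad_v = np.zeros_like(v)
        for f, i in enumerate(active):
            grad_v[i] += w[t, f]
        for f in range(n):
            for g in range(f + 1, n):
                grad_v[active[f]] += r[t, f, g] * v[active[g]]
                grad_v[active[g]] += r[t, f, g] * v[active[f]]
        # type-specific parameters: only those of conversion type t move
        w0[t] -= lr * g_phi
        for f, i in enumerate(active):
            w[t, f] -= lr * g_phi * v[i]
        for f in range(n):
            for g in range(f + 1, n):
                r[t, f, g] -= lr * g_phi * (v[active[f]] @ v[active[g]])
        # shared embeddings: updated for samples of every conversion type
        v -= lr * g_phi * grad_v
```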
We use mini-batch stochastic gradient descent to optimize the loss function. In each iteration, we select a batch of samples randomly, where each sample belongs to a specific task, i.e., a conversion type in our case. Within each batch, the model is updated according to the conversion type of each sample. More specifically, the shared embeddings v_i are updated for all samples, while w_{0,t}, w_{F(i),t} and r_{F(i),F(j),t} are updated only for samples with conversion type t. The training procedure is summarized in Algorithm 1.

3.3. Model Complexity
There are two key constraints when we build a conversion prediction model in a real-time serving system: the memory needed to store all parameters, and the computing latency for each sample. We analyze these two constraints in this section.
3.3.1. Number of Parameters
The number of parameters in MTFwFM is

T + mK + nKT + \frac{n(n-1)}{2} T,    (6)
where T, m, n and K refer to the number of conversion types, the number of features, the number of fields, and the dimension of the feature embedding vectors and main term weight vectors, respectively.
Thus in (6), T represents the number of bias terms w_{0,t}; mK is the number of parameters in the embedding vectors v_i; nKT corresponds to w_{F(i),t}, i.e., the main term weight vectors for all conversion types; and \frac{n(n-1)}{2} T is the number of field interaction weights r_{F(i),F(j),t}. The number of parameters approximately equals mK, given that m \gg n and m \gg T.
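A quick back-of-the-envelope check of this parameter count; the concrete sizes below are illustrative assumptions (n = 17 matches the number of fields in Section 4.1, the rest are made up), not the paper's actual configuration:

```python
# Hedged sketch of the parameter count in Eq. (6).
def mtfwfm_params(m, n, K, T):
    bias = T                             # one bias w0_t per conversion type
    embeddings = m * K                   # shared feature embeddings v_i
    main = n * K * T                     # per-type main-term weight vectors
    interactions = n * (n - 1) // 2 * T  # per-type field pair weights
    return bias + embeddings + main + interactions

# Since m >> n and m >> T, the total is dominated by the m*K embeddings.
total = mtfwfm_params(m=100_000, n=17, K=10, T=4)
```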
3.3.2. Online Computing Latency
The online computing latency for each prediction request grows linearly with the number of operations, such as float additions and multiplications. During the inference of MTFwFM, for each sample the number of operations in the main effect terms is n(2K - 1) \approx 2nK, since there is one K-dimensional dot product per field, and the number of operations in the interaction terms is \frac{n(n-1)}{2} \cdot 2K \approx n^2 K, one dot product plus one multiplication by r_{F(i),F(j),t} per field pair. Thus the total number of operations of MTFwFM is approximately n^2 K + 2nK \approx n^2 K.
3.3.3. MTFwFM vs. Using Conversion Type as a Field
Besides formulating conversion prediction as a multi-task learning problem, an alternative approach is to incorporate conversion type as one of the fields in the existing models, such as FM and FwFM. We can either consider the 2-way interactions between fields, referred to as 2-way Conversion Type as a Field (2-way CTF), or the 3-way interactions, referred to as 3-way Conversion Type as a Field (3-way CTF). 2-way CTF with FM and FwFM are used as baseline models in Section 4.
For 3-way CTF with FM or FwFM, the number of operations is much larger than that of MTFwFM, which makes them less preferred in the production environment. We discuss the number of operations of 3-way CTF with FwFM as an example here, and omit that for FM since they are very similar. The formula of 3-way CTF with FwFM is:

\hat{y} = \sigma\Big( w_0 + \sum_i x_i \langle v_i, w_{F(i)} \rangle + \sum_i \sum_{j > i} x_i x_j \langle v_i, v_j \rangle r_{F(i),F(j)} + \sum_i \sum_{j > i} x_i x_j x_c \langle v_i, v_j, v_c \rangle r_{F(i),F(j),F(c)} \Big),    (7)

where c denotes the active conversion type feature and \langle v_i, v_j, v_c \rangle = \sum_{k=1}^{K} v_{i,k} v_{j,k} v_{c,k} is a 3-way dot product.
The number of operations of 3-way CTF with FwFM is approximately

n^2 K + \frac{n(n-1)}{2} \cdot 3K \approx \frac{5}{2} n^2 K,    (8)

where the first term counts the 2-way interactions and the second term counts the 3-way dot products, each of which costs about 3K operations. This is about 150% more than the approximately n^2 K operations of MTFwFM. Thus, compared with MTFwFM, 3-way CTF with FwFM is less preferred due to its much larger number of operations.
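The operation counts above can be checked numerically. The constants below follow the rough approximations in the text (each float add/multiply counted as one operation), so the functions are estimates, not exact instruction counts:

```python
# Hedged sketch of the latency comparison in Section 3.3: approximate
# operation counts for MTFwFM inference vs. 3-way CTF with FwFM.
def ops_mtfwfm(n, K):
    main = n * (2 * K - 1)             # n dot products of K-dim vectors
    inter = n * (n - 1) // 2 * 2 * K   # dot product + scale per field pair
    return main + inter

def ops_3way_ctf(n, K):
    two_way = ops_mtfwfm(n + 1, K)         # conversion type adds one field
    three_way = n * (n - 1) // 2 * 3 * K   # 3-way dot products with v_c
    return two_way + three_way

# With n = 17 fields and K = 10, the ratio comes out to roughly 2.5x,
# i.e. about 150% more operations for 3-way CTF.
ratio = ops_3way_ctf(17, 10) / ops_mtfwfm(17, 10)
```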
4. Experiments
This section presents our experimental evaluation results. We introduce the data set in Section 4.1, and describe the implementation details in Section 4.2. Section 4.3 compares the performance of MTFwFM with that of 2-way CTF with FM and FwFM. We denote 2-way CTF with FM or FwFM simply as FM or FwFM in this section for the sake of simplicity.
4.1. Data Set
The data set is collected from the impression and conversion logs of the Verizon Media DSP advertising platform. We treat each impression as a sample, and use the conversions to label them. The labeling is done by last-touch attribution, i.e., for each conversion, only the last impression (from the same user and line, where a line is the smallest unit for advertisers to set up the budget, goal type, and targeting criteria of a group of ads) before this conversion is labeled as a positive sample. All the remaining impressions are labeled as negative samples. The type of each sample is the type of the corresponding line. A line may be associated with multiple conversions that belong to several different types. However, in this paper we focus on those lines that have only one type of conversion, since they contribute most of the traffic as well as the spend in our platform.
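The last-touch attribution rule can be sketched as follows. The record layout (`user`, `line`, `ts` keys) is our own illustrative assumption, and a production implementation would index impressions by (user, line) rather than scanning:

```python
# Hedged sketch of last-touch attribution labeling: for each conversion,
# the latest prior impression from the same user and line becomes a
# positive sample; every other impression stays negative.
def last_touch_label(impressions, conversions):
    """impressions / conversions: lists of dicts with 'user', 'line', 'ts'.
    Returns a list of 0/1 labels parallel to `impressions`."""
    labels = [0] * len(impressions)
    for conv in conversions:
        best = None
        for idx, imp in enumerate(impressions):
            if (imp['user'] == conv['user'] and imp['line'] == conv['line']
                    and imp['ts'] <= conv['ts']):
                if best is None or imp['ts'] > impressions[best]['ts']:
                    best = idx
        if best is not None:
            labels[best] = 1
    return labels
```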
We use 7 days of impression logs, denoted as days D_1 to D_7, as the training data set. Then conversions from D_1 to D_{13} are used to label those impressions. A 6-day-longer conversion time window is used because there are usually delays between impressions and conversions, and most conversions happen within 6 days after the impression. We then downsample the negative samples to mitigate the data imbalance issue, since the ratio of positive samples is on the order of 10^{-4} in the raw data set. After downsampling, we get approximately equal numbers of positive and negative samples in the training set.
The validation data set is collected from the impression logs on D_8, and the test data set is collected on D_9. Conversions from D_8 to D_{14} and from D_9 to D_{15} are used to label the validation and test sets, respectively. We do not downsample the validation and test data sets, since the evaluation should be applied to data sets that reflect the real class distribution. Table 2 summarizes the statistics of the training, validation and test data sets.
Table 2. Statistics of the training, validation and test data sets.

Data set  Type  Samples  CVR  Features
Train  Purchase  4,552,380  0.1858  11,852
Train  Lead  6,566,688  0.3402  15,728
Train  Sign Up  3,332,250  0.8797  13,227
Train  View Content  170,694  0.3690  1,171
Validation  Purchase  12,800,160  4.63E-04  11,153
Validation  Lead  17,036,604  5.59E-04  9,474
Validation  Sign Up  2,222,334  3.30E-03  5,591
Validation  View Content  441,252  4.90E-04  1,494
Test  Purchase  12,623,382  4.52E-04  11,007
Test  Lead  18,738,990  5.37E-04  9,373
Test  Sign Up  1,926,558  3.41E-03  5,553
Test  View Content  383,940  4.69E-04  1,173
Table 3. Overall and weighted AUC of each model on the training, validation and test sets.

Model  Overall AUC (Train / Validation / Test)  Weighted AUC (Train / Validation / Test)
FM  0.9706 / 0.9014 / 0.9012  0.9537 / 0.8500 / 0.8383
FwFM  0.9702 / 0.9023 / 0.9027  0.9530 / 0.8520 / 0.8400
MTFwFM  0.9728 / 0.8999 / 0.9046  0.9574 / 0.8511 / 0.8450
Table 4. AUC of each model on each conversion type.

Type  Model  Training  Validation  Test
Lead  FM  0.8393  0.8412  0.8116
Lead  FwFM  0.8357  0.8536  0.8109
Lead  MTFwFM  0.8502  0.8258  0.8190
View Content  FM  0.9523  0.9577  0.9542
View Content  FwFM  0.9511  0.9569  0.9537
View Content  MTFwFM  0.9563  0.9580  0.9545
Purchase  FM  0.9922  0.9758  0.9684
Purchase  FwFM  0.9924  0.9804  0.9761
Purchase  MTFwFM  0.9930  0.9799  0.9737
Sign Up  FM  0.9381  0.7529  0.7475
Sign Up  FwFM  0.9374  0.7564  0.7501
Sign Up  MTFwFM  0.9428  0.7545  0.7585
There are 17 fields of features, which fall into 4 categories:

Userside fields: User_ID, Gender and Age_Bucket

Publisherside fields: Page_TLD, Publisher_ID, and Subdomain

Advertiser-side fields: Advertiser_ID, Creative_ID, Ad_ID, Creative_Media_ID, Layout_ID, and Line_ID

Context fields: Hour_of_Day, Day_of_Week, Device_Type_ID, Ad_Position_ID, and Ad_Placement_ID
We use Conversion_Type_ID as an additional field for FM and FwFM. The meanings of most fields are quite straightforward so we only explain some of them:

Page_TLD: toplevel domain of a web page.

Subdomain: subdomain of a web page.

Creative_ID: identifier of a creative, which is an image or a video.

Ad_ID: identifier of a (Line_ID, Creative_ID) combination.

Creative_Media_ID: identifier of the media type of the creative, i.e., image, video or native.

Layout_ID: the size of a creative.

Device_Type_ID: identifier of whether this event happens on desktop, mobile or tablet.

Ad_Position_ID & Ad_Placement_ID: identifiers of the position of an ad on the web page.
4.2. Implementations
All baseline models as well as the proposed MTFwFM model are implemented in TensorFlow. The input is a sparse binary vector x with only n nonzero entries, one per field. In the embedding layer, the input vector is projected into n embedding vectors v_i, one for each field. The main and interaction effect terms in the next layer, i.e., the main & interaction layer, are computed based on these vectors. The main effect terms simply concatenate all the vectors, while the interaction effect terms calculate the dot product between each pair of vectors. Then, each node in the main & interaction layer is connected to the output layer, which consists of T nodes, each corresponding to one specific conversion type.

4.3. Performance Comparisons
This section compares MTFwFM with FM and FwFM on the data sets introduced above. For the hyperparameters such as regularization coefficient and learning rate in all models, we select the values that lead to the best performance on the validation set and then use them in the evaluation on the test set. We focus on the following performance metrics:
Overall AUC
AUC of ROC (AUC) specifies the probability that, given one positive and one negative sample, their pairwise rank is correct. Overall AUC calculates the AUC over samples from all conversion types.
AUC for each conversion type
The AUC on the samples from each conversion type, denoted as AUC_t.
Weighted AUC
The weighted average of the AUC over all conversion types:

\text{Weighted AUC} = \frac{\sum_{t} s_t \cdot \text{AUC}_t}{\sum_{t} s_t},

where s_t refers to the spend of conversion type t, i.e., each conversion type is weighted by its spend.
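As a sketch of these metrics (ours, not the paper's evaluation code), AUC can be computed directly from its pairwise-rank definition, counting ties as half-correct, and then averaged with spend weights:

```python
# Hedged sketch: pairwise-rank AUC and its spend-weighted average.
def auc(scores, labels):
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    # fraction of (positive, negative) pairs ranked correctly; ties = 0.5
    correct = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return correct / (len(pos) * len(neg))

def weighted_auc(per_type, spend):
    """per_type: {type: (scores, labels)}; spend: {type: spend of type}."""
    total = sum(spend.values())
    return sum(spend[t] * auc(*per_type[t]) for t in per_type) / total
```

The quadratic pair enumeration is only for clarity; a production metric would sort once and use the rank-sum formulation instead.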
Table 3 summarizes the experiment results. It shows that MTFwFM gets the best performance w.r.t. both overall and weighted AUC on the test set, with lifts of 0.19% and 0.50% over the best-performing baseline, respectively. While the performance improvement on overall AUC is marginal, the lift on weighted AUC is significant.
Table 4 compares the performance of all models on each conversion type. Among the four conversion types, View Content and Purchase have higher AUCs than the other two types under the baseline models (over 95% vs. under 82% on the test set). For these two conversion types that already get high AUC, the lifts of MTFwFM are more or less neutral, namely +0.03% and -0.24% on the test set. On the other hand, for the conversion types Lead and Sign Up, which get low performance with the baseline models, MTFwFM improves the test AUC by 0.74% and 0.84%, respectively.
Therefore, we conclude that MTFwFM outperforms FM and FwFM significantly w.r.t. the weighted AUC over all conversion types, and that this improvement mainly comes from the conversion types that get relatively low AUC with the baseline models.
5. Study of Learned Field Interaction Effects for Different Conversion Types
In this section, we analyze MTFwFM in terms of its ability to capture different field interaction effects for different conversion types. As described in Section 2, the field interaction effects are measured by the mutual information between a field pair and the conversions of each type, i.e., MI((F_k, F_l), Y_t) in Eq. (1). Figure 2 presents the visualization of these field interaction effects by heat maps.
The difference among the four heat maps in Figure 2 illustrates how field interaction effects vary among different conversion types. For Lead, User_ID has very strong interaction effects with almost all other fields, especially with Page_TLD, Subdomain, Ad and Creative. For View Content, field pairs containing publisher-side fields such as Page_TLD and Subdomain have large mutual information in general. For Purchase and Sign Up, we observe that field pairs with advertiser-side fields, such as Advertiser, Ad, Creative and Line, have strong interaction effects with other fields.
To verify whether MTFwFM captures the different patterns of field interaction effects among conversion types, we compare MI((F_k, F_l), Y_t) with the learned field interaction weight between fields F_k and F_l for conversion type t, namely r_{F_k,F_l,t}. Here we only consider the magnitude |r_{F_k,F_l,t}|, since either a large positive or a large negative value indicates a strong interaction effect. Figure 3 shows the heat maps of |r_{F_k,F_l,t}| for all conversion types.
Comparing Figure 2 and Figure 3, the learned field interaction effects show patterns similar to the corresponding mutual information for each conversion type; in general, Figure 3 looks like a pixelated version of Figure 2. For Lead, MTFwFM successfully captures that User_ID has strong interaction effects with the other fields. For View Content, field pairs including the publisher-side fields, e.g., Publisher, Page_TLD and Subdomain, generally have large magnitudes of r. For Purchase and Sign Up, advertiser-side fields, e.g., Advertiser, Ad, Creative and Line, in general have large |r| with other fields.
6. Related Work
There has been lots of work in the literature on click and conversion prediction in online advertising. Research on click prediction focuses on developing various models, including Logistic Regression (LR) (Richardson et al., 2007; Chapelle et al., 2015; McMahan et al., 2013), Polynomial-2 (Poly2) (Chang et al., 2010), tree-based models (He et al., 2014), tensor-based models (Rendle and Schmidt-Thieme, 2010), Bayesian models (Graepel et al., 2010), Field-aware Factorization Machines (FFM) (Juan et al., 2016, 2017), and Field-weighted Factorization Machines (FwFM) (Pan et al., 2018). Recently, deep learning for CTR prediction has also attracted a lot of research attention (Cheng et al., 2016; Zhang et al., 2016; Qu et al., 2016; Guo et al., 2017; Shan et al., 2016; He and Chua, 2017; Wang et al., 2017).

For conversion prediction, (Lee et al., 2012) present an approach to estimate conversion rates based on past performance observations along data hierarchies.
(Chapelle et al., 2015) and (Agarwal et al., 2010) propose a logistic regression model and a log-linear model for conversion prediction, respectively. (Rosales et al., 2012) provide a comprehensive analysis and propose a new model for post-click conversion prediction. (Bagherjeiran et al., 2010) propose a ranking model that optimizes the conversion funnel even for CPC (Cost-per-Click) campaigns. (Ji et al., 2017) propose a time-aware conversion prediction model. (Lu et al., 2017) describe a practical framework for conversion prediction to tackle several challenges, including extremely sparse conversions, delayed feedback and attribution gaps. Recently, there have also been several works on modeling the delay of conversions (Chapelle, 2014; Yoshikawa and Imai, 2018).

Multi-Task Learning (MTL) (Caruana, 1998)
has been used successfully across multiple applications, from natural language processing
(Collobert and Weston, 2008), speech recognition (Deng et al., 2013), to computer vision
(Girshick, 2015). MTL has also been applied to online advertising in (Ahmed et al., 2014) to model clicks, conversions and unattributed conversions. In (Ma et al., 2018), the authors propose a multi-task model to solve the tasks of click prediction and click-through conversion prediction jointly.

7. Conclusion
In this paper, we formulate conversion prediction as a multi-task learning problem and propose Multi-Task Field-weighted Factorization Machines (MTFwFM) to learn prediction models for multiple conversion types jointly. The feature representations are shared by all tasks while each model has its own task-specific parameters, providing the benefit of sharing information among different conversion prediction tasks. Our extensive experiment results show that MTFwFM outperforms several state-of-the-art models, including Factorization Machines (FM) and Field-weighted Factorization Machines (FwFM). We also show that MTFwFM indeed learns different field interaction effects for different conversion types. There are many potential directions for future research: to name a few, we could add more tasks to the current model, such as predicting clicks or non-attributed conversions, or build a deep neural network (DNN) on top of MTFwFM to better solve these tasks.
References
 Agarwal et al. (2010) Deepak Agarwal, Rahul Agrawal, Rajiv Khanna, and Nagaraj Kota. 2010. Estimating rates of rare events with multiple hierarchies through scalable loglinear models. In Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 213–222.
 Ahmed et al. (2014) Amr Ahmed, Abhimanyu Das, and Alexander J Smola. 2014. Scalable hierarchical multitask learning algorithms for conversion optimization in display advertising. In Proceedings of the 7th ACM international conference on Web search and data mining. ACM, 153–162.
 Bagherjeiran et al. (2010) Abraham Bagherjeiran, Andrew O Hatch, and Adwait Ratnaparkhi. 2010. Ranking for the conversion funnel. In Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval. ACM, 146–153.
 Bureau (2017) Interactive Advertising Bureau. 2017. IAB internet advertising revenue report. https://www.iab.com/wp-content/uploads/2018/05/IAB-2017-Full-Year-Internet-Advertising-Revenue-Report.REV2_.pdf
 Caruana (1998) Rich Caruana. 1998. Multitask learning. In Learning to learn. Springer, 95–133.

 Chang et al. (2010) Yin-Wen Chang, Cho-Jui Hsieh, Kai-Wei Chang, Michael Ringgaard, and Chih-Jen Lin. 2010. Training and testing low-degree polynomial data mappings via linear SVM. Journal of Machine Learning Research 11, Apr (2010), 1471–1490.
 Chapelle (2014) Olivier Chapelle. 2014. Modeling delayed feedback in display advertising. In Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 1097–1105.
 Chapelle et al. (2015) Olivier Chapelle, Eren Manavoglu, and Romer Rosales. 2015. Simple and scalable response prediction for display advertising. ACM Transactions on Intelligent Systems and Technology (TIST) 5, 4 (2015), 61.
 Cheng et al. (2016) Heng-Tze Cheng, Levent Koc, Jeremiah Harmsen, Tal Shaked, Tushar Chandra, Hrishi Aradhye, Glen Anderson, Greg Corrado, Wei Chai, Mustafa Ispir, et al. 2016. Wide & deep learning for recommender systems. In Proceedings of the 1st Workshop on Deep Learning for Recommender Systems. ACM, 7–10.
 Collobert and Weston (2008) Ronan Collobert and Jason Weston. 2008. A unified architecture for natural language processing: Deep neural networks with multitask learning. In Proceedings of the 25th international conference on Machine learning. ACM, 160–167.
 Deng et al. (2013) Li Deng, Geoffrey Hinton, and Brian Kingsbury. 2013. New types of deep neural network learning for speech recognition and related applications: An overview. In Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on. IEEE, 8599–8603.
 Girshick (2015) Ross Girshick. 2015. Fast R-CNN. In Proceedings of the IEEE international conference on computer vision. 1440–1448.
 Graepel et al. (2010) Thore Graepel, Joaquin Q Candela, Thomas Borchert, and Ralf Herbrich. 2010. Web-scale Bayesian click-through rate prediction for sponsored search advertising in Microsoft's Bing search engine. In Proceedings of the 27th international conference on machine learning (ICML-10). 13–20.
 Guo et al. (2017) Huifeng Guo, Ruiming Tang, Yunming Ye, Zhenguo Li, and Xiuqiang He. 2017. DeepFM: A Factorization-Machine based Neural Network for CTR Prediction. arXiv preprint arXiv:1703.04247 (2017).
 He and Chua (2017) Xiangnan He and Tat-Seng Chua. 2017. Neural Factorization Machines for Sparse Predictive Analytics. (2017).
 He et al. (2014) Xinran He, Junfeng Pan, Ou Jin, Tianbing Xu, Bo Liu, Tao Xu, Yanxin Shi, Antoine Atallah, Ralf Herbrich, Stuart Bowers, et al. 2014. Practical lessons from predicting clicks on ads at facebook. In Proceedings of the Eighth International Workshop on Data Mining for Online Advertising. ACM, 1–9.
 Ji et al. (2017) Wendi Ji, Xiaoling Wang, and Feida Zhu. 2017. Time-aware conversion prediction. Frontiers of Computer Science 11, 4 (2017), 702–716.
 Juan et al. (2017) Yuchin Juan, Damien Lefortier, and Olivier Chapelle. 2017. Field-aware factorization machines in a real-world online advertising system. In Proceedings of the 26th International Conference on World Wide Web Companion. International World Wide Web Conferences Steering Committee, 680–688.
 Juan et al. (2016) Yuchin Juan, Yong Zhuang, Wei-Sheng Chin, and Chih-Jen Lin. 2016. Field-aware factorization machines for CTR prediction. In Proceedings of the 10th ACM Conference on Recommender Systems. ACM, 43–50.
 Lee et al. (2012) Kuang-chih Lee, Burkay Orten, Ali Dasdan, and Wentong Li. 2012. Estimating conversion rate in display advertising from past performance data. In Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 768–776.
 Lu et al. (2017) Quan Lu, Shengjun Pan, Liang Wang, Junwei Pan, Fengdan Wan, and Hongxia Yang. 2017. A Practical Framework of Conversion Rate Prediction for Online Display Advertising. In Proceedings of the ADKDD’17. ACM, 9.
 Ma et al. (2018) Xiao Ma, Liqin Zhao, Guan Huang, Zhi Wang, Zelin Hu, Xiaoqiang Zhu, and Kun Gai. 2018. Entire Space Multi-Task Model: An Effective Approach for Estimating Post-Click Conversion Rate. arXiv preprint arXiv:1804.07931 (2018).
 McMahan et al. (2013) H Brendan McMahan, Gary Holt, David Sculley, Michael Young, Dietmar Ebner, Julian Grady, Lan Nie, Todd Phillips, Eugene Davydov, Daniel Golovin, et al. 2013. Ad click prediction: a view from the trenches. In Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 1222–1230.
 Pan et al. (2018) Junwei Pan, Jian Xu, Alfonso Lobos Ruiz, Wenliang Zhao, Shengjun Pan, Yu Sun, and Quan Lu. 2018. Field-weighted Factorization Machines for Click-Through Rate Prediction in Display Advertising. In Proceedings of the 2018 World Wide Web Conference on World Wide Web. International World Wide Web Conferences Steering Committee, 1349–1357.
 Qu et al. (2016) Yanru Qu, Han Cai, Kan Ren, Weinan Zhang, Yong Yu, Ying Wen, and Jun Wang. 2016. Product-based neural networks for user response prediction. In Data Mining (ICDM), 2016 IEEE 16th International Conference on. IEEE, 1149–1154.
 Rendle and Schmidt-Thieme (2010) Steffen Rendle and Lars Schmidt-Thieme. 2010. Pairwise interaction tensor factorization for personalized tag recommendation. In Proceedings of the third ACM international conference on Web search and data mining. ACM, 81–90.
 Richardson et al. (2007) Matthew Richardson, Ewa Dominowska, and Robert Ragno. 2007. Predicting clicks: estimating the click-through rate for new ads. In Proceedings of the 16th international conference on World Wide Web. ACM, 521–530.
 Rosales et al. (2012) Rómer Rosales, Haibin Cheng, and Eren Manavoglu. 2012. Post-click conversion modeling and analysis for non-guaranteed delivery display advertising. In Proceedings of the fifth ACM international conference on Web search and data mining. ACM, 293–302.
 Shan et al. (2016) Ying Shan, T Ryan Hoens, Jian Jiao, Haijing Wang, Dong Yu, and JC Mao. 2016. Deep Crossing: Web-scale modeling without manually crafted combinatorial features. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 255–262.
 Wang et al. (2017) Ruoxi Wang, Bin Fu, Gang Fu, and Mingliang Wang. 2017. Deep & Cross Network for Ad Click Predictions. arXiv preprint arXiv:1708.05123 (2017).
 Yoshikawa and Imai (2018) Yuya Yoshikawa and Yusaku Imai. 2018. A Nonparametric Delayed Feedback Model for Conversion Rate Prediction. arXiv preprint arXiv:1802.00255 (2018).
 Zhang et al. (2016) Weinan Zhang, Tianming Du, and Jun Wang. 2016. Deep learning over multi-field categorical data. In European conference on information retrieval. Springer, 45–57.