Optimized Cost per Click in Taobao Display Advertising

by Han Zhu, et al.

Taobao, as the largest online retail platform in the world, provides billions of online display advertising impressions for millions of advertisers every day. For commercial purposes, advertisers bid for specific spots and target crowds to compete for business traffic, and the platform chooses the most suitable ads to display within tens of milliseconds. Common pricing methods include cost per mille (CPM) and cost per click (CPC). Traditional advertising systems target certain traits of users and ad placements with fixed bids, which amounts to a coarse-grained matching of bid and traffic quality. However, fixed bids that compete for requests of varying quality cannot fully optimize the advertisers' key requirements. Moreover, the platform is responsible for both business revenue and user experience. We therefore propose a bid optimization strategy called optimized cost per click (OCPC), which automatically adjusts the bid to achieve a finer matching of bid and traffic quality at the granularity of a page view (PV) request. Our approach optimizes advertisers' demands, platform business revenue and user experience, and as a whole improves traffic allocation efficiency. We have validated the approach in the production Taobao display advertising system. Online A/B tests show that our algorithm yields substantially better results than the previous fixed-bid approach.



1. Introduction

Advertising fosters the rise of new brands and keeps established quality brands fresh. Online advertising (Goldfarb and Tucker, 2011; Lahaie et al., 2007; Evans, 2009; Karande et al., 2013), a marketing strategy that uses the internet as a medium to obtain website traffic and to target and deliver marketing messages to the right customers, has grown exponentially since the early 1990s. Real-time bidding (RTB) (Perlich et al., 2012; Yuan et al., 2014a; Muthukrishnan, 2010) technology in online advertising allows advertisers to bid for every individual impression, and much research (Zhang et al., 2014; Yuan et al., 2014b, 2013; Zhang et al., 2016) has proposed effective and efficient bidding strategies that maximize the unilateral economic surplus of one party, such as advertisers, consumers or intermediary platforms.

Beyond RTB systems, Taobao, called “the country’s biggest online marketplace” by the Economist (eco, 2015), has established one of the most advanced online advertising systems in the world. In both the mobile app and the PC website of Taobao, selected ads are presented to users in specific spots. In this paper, we focus on the bid optimization problem of CPC display advertising in the Taobao mobile app. The two ad formats involved are as follows:

Figure 1. Banner and item CPC ads displayed on Taobao mobile app home.
  • Banner CPC Ads: The ads appear in the top banner of the Taobao home page, as shown in Figure 1. Advertisers set up campaigns for a single item, a store or a brand.

  • Item CPC Ads: Single items are displayed to users in the Guess What You Like column, which includes about two hundred spots; three of them are for advertising and the others are for recommendation, as shown in Figure 1.

Connecting users and advertisers, Taobao advertising platform forms its own unique ecosystem characterized by:

  • First, unlike most RTB systems, for which it’s difficult to obtain complete user data, Taobao itself acts as demand side and supply side at the same time. This ecologically closed-loop system enables Taobao to collect integrated user data and ad campaigns’ information.

  • Second, most advertisers in the system are small and medium-sized ones who are more concerned about the increase in revenue than promoting their brands. Therefore the increase in gross merchandise volume (GMV) can better benefit these advertisers.

  • Third, while different advertisers may pursue different key performance indicators (KPI, e.g., impressions, clicks, conversions or return on investment (ROI)), they bid for clicks on the Taobao platform, i.e., CPC is adopted. We will discuss other methods such as cost per mille (CPM) and cost per sale (CPS) later.

  • Last but most important, advertising spots must meet the media requirements, which are measured by indices such as click-through rate (CTR), conversion rate (CVR) and GMV. Take GMV as an example. First, we hope that the introduction of business traffic does not unduly affect user experience, so setting GMV requirements achieves a win-win situation for business revenue and user experience. Second, Taobao’s advertisers are precisely Taobao’s sellers, and sellers spend an approximately fixed percentage (taking rate) of revenue on marketing, so raising GMV leads advertisers to increase their advertising budgets, which brings long-term benefits to the platform.

Weighing the pros and cons, we adopt CPC in the two ad formats. Although advertisers assume less risk with CPS (Edelman et al., 2007; Aggarwal et al., 2006; Varian, 2007), compared to CPC, CPS ignores the value of clicks, providing worse traffic liquidating efficiency. Since the involved ad formats are mainly for small and medium-sized advertisers, CPM poses higher risk while CPC allows advertisers to control the cost of clicks and the platform takes the risk of turning page views to clicks. With Taobao’s complete data ecology, standardized e-commerce advertising and interactive process, CPC is sufficiently effective.

Many state-of-the-art systems, such as Facebook’s (Facebook, 2012), use designs different from Taobao’s. On some large social networking services (SNSs), for example, optimized cost per mille (oCPM) lets advertisers bid for clicks while actually paying per impression (Facebook, 2012). SNS advertising interactions are usually diverse (like, click, share, etc.), while Taobao’s transactions are often accomplished by a simple series of clicks. From the data ecology point of view, all of a user’s behavior after an ad click remains on the Taobao platform, which makes follow-up interaction-based inference possible. In contrast, an SNS usually lets advertisers bid for clicks or other actions and then converts the bid to an equivalent CPM; this mechanism encourages advertisers to upload real follow-up interaction data, which is then used to further optimize the bid.

For the two ad formats mentioned above, taking ecology, efficiency and other factors into account, we choose the CPC method, which is the focus of this paper.

Taobao’s advertising system first filters millions of ads and then ranks the remaining candidates. First, by mining user preferences inferred from behavior data and the ad items’ details, the Taobao targeting system (Provost et al., 2009; Raeder et al., 2012) trains models to filter the massive pool of ads for each page view (PV) request; this is called the matching stage. Unlike recommendation (Schafer et al., 2007), which does not involve advertisers, the matching service that recalls related users has to reflect the advertisers’ willingness to bid and ensure market depth. Second, the real-time prediction (RTP) engine predicts the click-through rate (pCTR) for each eligible ad. Third, the candidate ads are traditionally ranked by $bid \times pCTR$ and displayed in that order to maximize effective cost per mille (the eCPM sorting mechanism).
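The traditional eCPM sorting step can be illustrated with a minimal sketch. The `Candidate` type, the field names and the sample values are hypothetical and only serve the illustration; the production system is not described at this level of detail.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    ad_id: str
    bid: float   # advertiser's fixed CPC bid (hypothetical field name)
    pctr: float  # predicted click-through rate from the RTP engine


def ecpm(c: Candidate) -> float:
    """Expected revenue per thousand impressions under CPC pricing."""
    return 1000.0 * c.bid * c.pctr


def rank_by_ecpm(cands):
    """Sort candidates by bid * pCTR, highest first."""
    return sorted(cands, key=ecpm, reverse=True)


# Illustrative values only.
ads = [
    Candidate("a", bid=2.0, pctr=0.04),
    Candidate("b", bid=1.5, pctr=0.06),
    Candidate("c", bid=1.0, pctr=0.04),
]
ranked_ids = [c.ad_id for c in rank_by_ecpm(ads)]
```

Note that ad "b" outranks ad "a" despite its lower bid, because its higher pCTR gives it the larger expected revenue per impression.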

Advertisers always expect the bid to match the traffic quality. Due to technical limitations, traditional methods can only set fixed bids for specific user groups and ad slots, which gives coarse-grained traffic differentiation, whereas advertisers want a finer-grained matching of bids and traffic quality. A ranking process based on fixed bids has two defects. On the one hand, it is inefficient for a single fixed bid set by an advertiser to cover continuous internet traffic of varying commercial quality. On the other hand, traditional methods maximize eCPM in pursuit of short-term commercial revenue but cannot optimize or control media requirements such as GMV, which is detrimental to Taobao’s long-term interests.

Existing remedies address only one side of these two issues. From the advertisers’ perspective, oCPM in some SNSs (Facebook, 2012), which is converted equivalently from other bidding objectives, can maximize advertisers’ interests but may not guarantee platform ecological health indices such as GMV. Conversely, excessive pursuit of media requirements like GMV by modifying the ranking formula cannot bring effective commercial benefits to advertisers and the platform.

To solve the above problems, we propose optimized cost per click (OCPC), with the following characteristics. For each PV request, on the premise of optimizing the advertiser’s demands, OCPC adjusts the bid toward the true value of the traffic quality, and meanwhile maximizes a composite score reflecting the overall ecology of user experience, advertisers’ interests and platform revenue, all while keeping the eCPM sorting mechanism unchanged. This design allows us to adapt the OCPC system flexibly and at low cost as business needs change. By optimizing traffic matching efficiency, we expect OCPC to improve user, advertiser and platform indices together. It is worth mentioning that enhanced cost per click (ECPC) (Google, 2010) in Google AdWords also attempts to adjust the bid according to the potential conversion rate. However, beyond conversion rate, platform indices like GMV, which are crucial for the Taobao platform, cannot be optimized directly with ECPC.

Our major contributions are summarized below. (i) We illuminate some characteristics of Taobao display advertising system and its subsystems. (ii) We propose a novel bid optimization approach which achieves the overall optimization of advertisers’ interests, user experience and platform revenue of Taobao ecology. (iii) Comprehensive offline and online experiments are conducted to verify the effectiveness of the proposed OCPC mechanism.

The rest of this paper is organized as follows. Section 2 gives a brief introduction to the Taobao advertising system. Section 3 presents the OCPC details. Section 4 introduces the prediction process. Finally, Section 5 presents the experimental results of the proposed approach, including model effectiveness estimation, the offline experimental mechanism, and online A/B test performance.

2. System Architecture

This section describes how data and information flow in Taobao’s display ads system, as shown in Figure 2, which helps explain why and how bid optimization works. Each system component and the sequence of events it handles, from the initial page view request to the final impression, are highlighted as follows:

The Front Server receives a page view request from a user and hands it to the Merger Server, which acts as a central coordinator and communicates with the other components during the whole process. The Merger Server requests the Matching Server to analyze the user and return a list of feature tags according to the advertisers’ user targeting requirements. Through the Merger Server, these tags are delivered to the Search Node (SN) Server, which searches for candidate ads along with their bids. In the aforementioned Guess What You Like column, the number of candidates is reduced from thousands to about four hundred. Then, the Real-time Prediction (RTP) Server predicts the click-through rate (pCTR) and conversion rate (pCVR) for the candidates from SN. For CTR prediction (Chen et al., 2016; Graepel et al., 2010; He et al., 2014), we use a mixture of logistic regression (MLR, also called LS-PLM (Gai et al., 2017)) model to deal with the high-dimensional (usually hundreds of millions of dimensions), sparse and binarized features. As a part of the merger, the Strategy Layer contains the main logic of OCPC, which optimizes traffic allocation in the ranking stage based on pCTR, pCVR and bid. The strategy layer is also responsible for the subsequent removal of duplicate ads and the final impression price calculation under the generalized second-price auction (GSP). According to the rank of ads, titles and image addresses are extracted by the Data Node (DN) Server and further optimized by the Smart Creative Service (SCS). Finally, the front server returns the ad results to the mobile app or PC website, and any subsequent click or conversion is recorded in the log system. All subsystems together constitute a complete data ecology, based on which we introduce the OCPC strategy in the next section.

Figure 2. The star schema of Taobao display advertising system and the proposed bid optimization strategy used in it.

3. Optimized Cost per Click

In this section, we first describe mathematically the demands of advertisers and the conditions for optimization. Second, we propose an algorithm to optimize the platform ecology index and platform revenue. Finally, relevant details are introduced. In practice, our algorithm framework applies to a wide range of advertisers’ demands and platform ecology indices, such as the number of page views, clicks, conversions, etc. As a typical case, this paper takes ROI and gaining quality traffic as the advertisers’ demand, and GMV as the platform ecology index, which along with platform revenue are optimized by adjusting the advertisers’ bids. Suppose $A$ is the set of ad campaigns that are eligible for a PV request. For this specific PV request and each campaign $a \in A$, there exists a corresponding bid $b_a$ preset by the advertiser. For each $a \in A$, the role of the OCPC algorithm is to adjust $b_a$ and find an optimized bid $b_a^*$ that meets the pre-designated optimization requirements.

3.1. Optimization Scope

ROI Constraint

Taking into account the small and medium-sized advertisers being more concerned about the marketing effect, we choose to optimize their revenue (GMV) while keeping or improving ROI as a primary application of our algorithm. Here we introduce relevant notations and finally derive the mathematical representation of ROI.

First, we define the probability of transaction conversion conditioned on a user $u$ and a clicked ad $a$ as $p(c \mid u, a)$. Note that for a specific item, the ad spot is not considered as a condition, because different spots eventually lead to the same item page. For a particular ad campaign $a$, define $v_a$ as the predicted pay-per-buy (PPB) by consumers, i.e., the seller’s revenue. Thus, the expected GMV for a single click is $p(c \mid u, a) \cdot v_a$.

Although the actual cost is calculated according to the GSP mechanism, here we suppose the cost of a click paid by the advertiser is $b_a$. The expected ROI for a single click is then given by Eq. 1.

$$roi_{(u,a)} = \frac{p(c \mid u, a) \cdot v_a}{b_a} \tag{1}$$

Further, the overall ROI of ad $a$ across different users and clicks is given by Eq. 2, where $n_u$ is the total number of clicks by user $u$ over a period of time. (We suppose the ROI is for a particular crowd and spot, so $b_a$ is consistent.)

$$roi_a = \frac{v_a \cdot \sum_u n_u \cdot p(c \mid u, a)}{b_a \cdot \sum_u n_u} = \frac{\mathbb{E}_u[p(c \mid u, a)] \cdot v_a}{b_a} \tag{2}$$

Equation 2 indicates that the advertiser’s overall ROI is determined by three factors: the expectation of the conversion rate $\mathbb{E}_u[p(c \mid u, a)]$, the predicted $v_a$ and the bid $b_a$, among which $v_a$ is inherent to each ad, and $\mathbb{E}_u[p(c \mid u, a)]$ is regarded as stationary in each particular auction.

In practice, the current prediction model is used to predict the pCVRs of competing ads from the past few days; the largest and smallest of these values are eliminated, and the average of the remaining ones is used as the current $\mathbb{E}_u[p(c \mid u, a)]$. The goal of bid optimization requires that $roi_a$ stay unchanged or improve (the so-called ROI constraint), while advertisers gain more high-quality traffic.
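The ROI bookkeeping above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names are hypothetical, and the trimming rule is read as dropping exactly one largest and one smallest historical pCVR, which the text does not spell out.

```python
def expected_cvr(history):
    """Trimmed mean of historical pCVR values.

    Assumption: exactly one largest and one smallest value are dropped
    before averaging.
    """
    if len(history) < 3:
        raise ValueError("need at least three historical pCVR values")
    trimmed = sorted(history)[1:-1]
    return sum(trimmed) / len(trimmed)


def click_roi(pcvr, ppb, bid):
    """Expected ROI of a single click: pCVR * pay-per-buy / cost per click."""
    return pcvr * ppb / bid


# Illustrative values only.
e_cvr = expected_cvr([0.01, 0.02, 0.03, 0.10])  # averages 0.02 and 0.03
overall_roi = click_roi(e_cvr, ppb=100.0, bid=2.0)
```

Using the trimmed mean as the pCVR argument of `click_roi` gives the overall ROI of Eq. 2, while passing a single request's pCVR gives the per-click ROI of Eq. 1.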

Bid Optimization Boundary

Equation 2 shows the linear relationship between $roi_a$ and $\mathbb{E}_u[p(c \mid u, a)]$, i.e., any bid optimization satisfying $\frac{b_a^*}{b_a} \le \frac{p(c \mid u, a)}{\mathbb{E}_u[p(c \mid u, a)]}$ will prevent ROI from falling. Considering also the advertisers’ demand for quality traffic, we adopt the following bid optimization principles: raise the bid under the ROI constraint to help advertisers compete for quality traffic ($\frac{p(c \mid u, a)}{\mathbb{E}_u[p(c \mid u, a)]} \ge 1$), and lower the bid to save cost on low-quality traffic ($\frac{p(c \mid u, a)}{\mathbb{E}_u[p(c \mid u, a)]} < 1$). The bid optimization range, which balances quality and quantity, is illustrated by the gray area of Figure 3, based on the ratio of $p(c \mid u, a)$ to $\mathbb{E}_u[p(c \mid u, a)]$. Note that there is a fixed threshold $r_a$ (e.g., $r_a = 0.4$) on the relative bid change, for the sake of safety and business settings. The lower bound is essential to avoid the situation where some advertisers get little traffic when their ROI is optimized.

Figure 3. The bid optimization scope (the gray area) under ROI constraint.

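With the area depicted in Figure 3, the lower and upper bounds of bid optimization for an ad campaign $a$, denoted as $l(b_a^*)$ and $u(b_a^*)$, are given by Eq. 3 and 4. It is worth emphasizing that the bid optimization boundaries can be generalized to other pursuits of advertisers and are not limited to ROI. If an advertiser does not authorize bid optimization, the corresponding lower and upper bounds both equal $b_a$.

$$l(b_a^*) = \begin{cases} b_a \cdot (1 - r_a), & \frac{p(c \mid u, a)}{\mathbb{E}_u[p(c \mid u, a)]} < 1 \\ b_a, & \frac{p(c \mid u, a)}{\mathbb{E}_u[p(c \mid u, a)]} \ge 1 \end{cases} \tag{3}$$

$$u(b_a^*) = \begin{cases} b_a, & \frac{p(c \mid u, a)}{\mathbb{E}_u[p(c \mid u, a)]} < 1 \\ b_a \cdot \min\left(1 + r_a, \frac{p(c \mid u, a)}{\mathbb{E}_u[p(c \mid u, a)]}\right), & \frac{p(c \mid u, a)}{\mathbb{E}_u[p(c \mid u, a)]} \ge 1 \end{cases} \tag{4}$$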
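A minimal sketch of the boundary computation follows, assuming the piecewise form implied by the principles above: lower the bid by at most a fraction r when the request's pCVR is below its expectation, and raise it by at most min(1 + r, pCVR ratio) otherwise. The function name, the parameter names and the `authorized` flag are hypothetical.

```python
def bid_bounds(bid, pcvr, expected_pcvr, r=0.4, authorized=True):
    """Return (lower, upper) bounds for the optimized bid of one ad.

    Assumed rule: the bid may drop to bid * (1 - r) for below-average
    traffic, and rise to bid * min(1 + r, pcvr / expected_pcvr) for
    above-average traffic, which keeps the per-click ROI from falling.
    """
    if not authorized:
        return bid, bid  # advertiser opted out: the bid stays fixed
    ratio = pcvr / expected_pcvr
    if ratio < 1.0:
        return bid * (1.0 - r), bid
    return bid, bid * min(1.0 + r, ratio)


# Illustrative requests (pCVR values are made up).
high_quality = bid_bounds(2.0, pcvr=0.03, expected_pcvr=0.02)    # capped by 1 + r
medium_quality = bid_bounds(1.5, pcvr=0.013, expected_pcvr=0.01) # capped by the ratio
low_quality = bid_bounds(1.5, pcvr=0.008, expected_pcvr=0.01)    # bid may only drop
```

With r = 0.4 these three calls give bid ranges of roughly [2, 2.8], [1.5, 1.95] and [0.9, 1.5], which correspond to the bid columns of Ads 1, 3 and 2 in Table 1.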
3.2. Ranking

Optimizing the bid price within the given boundary can help advertisers gain better-quality traffic and higher ROI. However, different bid prices chosen from the feasible region may result in different ad ranks under the eCPM sorting mechanism (i.e., ads are still ranked by $pctr_a \cdot b_a^*$ after bid optimization), and consequently bring different revenues or other indicators. In the rest of this section, we introduce our way of choosing $b_a^*$ from the feasible region so as to attain the best composite index, one that takes the pursuits of all sides into account, while keeping the eCPM sorting mechanism.

Assuming that we are going to display one ad under the eCPM sorting mechanism, we expect the ad to maximize the following objective:

$$\max_{b_1^*, \ldots, b_n^*} \; f(k, b_k^*) \tag{5}$$

$$\text{s.t.} \quad k = \arg\max_{i} \; pctr_i \cdot b_i^* \tag{6}$$

$$l(b_i^*) \le b_i^* \le u(b_i^*), \quad \forall i \in \{1, \ldots, n\} \tag{7}$$

where $n$ is the number of eligible ads in a PV, i.e., $n = |A|$, and $f(\cdot)$ is a function that gives a composite index covering the pursuits of all sides. Without loss of generality, we assume that $f(k, b_k^*)$ is monotone increasing w.r.t. $b_k^*$. The condition in Eq. 6 means that the auction-winning ad is the top-ranked $k$-th ad under eCPM sorting, and the optimization problem is to maximize $f(k, b_k^*)$ of the auction-winning ad. The condition in Eq. 7 keeps each optimized bid price within its determined scope. The optimization problem in Eq. 5 therefore has two aspects. On the one hand, we attempt to select the $k$-th ad that has the largest $f(k, b_k^*)$ value; on the other hand, the bid price of each ad should be adjusted so that the selected $k$-th ad has the largest eCPM. For $f(\cdot)$, we give the following two examples (with $pcvr_k$ denoting $p(c \mid u, k)$):

$$f_1(k, b_k^*) = pctr_k \cdot pcvr_k \cdot v_k$$

$$f_2(k, b_k^*) = pctr_k \cdot pcvr_k \cdot v_k + \alpha \cdot pctr_k \cdot b_k^*$$

where $f_1$ tends to promote the Taobao platform’s overall GMV, which is the revenue of all advertisers, and $f_2$ is a compromise between Taobao’s GMV and advertising revenue. Note that $\alpha$ is the trade-off coefficient between GMV and advertising revenue, and different values of $\alpha$ result in different goals of bid optimization, in the way presented in Eq. 5.

The remaining work of ranking is to find $b_a^*$ for each $a \in A$ that maximizes the objective in Eq. 5. By analogy with the boundary of the optimized bid price, we derive the boundary of the rank score $s_a^* = pctr_a \cdot b_a^*$ as $l(s_a^*)$ and $u(s_a^*)$, called the lower and upper bound of the optimized rank score ($l(s_a^*) = pctr_a \cdot l(b_a^*)$, $u(s_a^*) = pctr_a \cdot u(b_a^*)$, according to Eq. 3 and 4). To optimize Objective 5, we just need to sort the ads in descending order of $f(a, u(b_a^*))$ for each $a \in A$ (note that we use the bid’s upper bound here because we assume $f(\cdot)$ is monotone increasing w.r.t. $b_a^*$), then choose the first ad $k$ whose $u(s_k^*)$ is no less than all other ads’ $l(s_a^*)$ (to make sure that Constraints 6 and 7 can be satisfied) as the result to display, and set $b_k^* = u(b_k^*)$. Last, we update the bid prices of the other candidates within their feasible regions, which ensures that ad $k$ has the largest eCPM.

Returning to the real scenario, in which more than one ad (say $N$ ads) may be displayed in each PV, we propose the greedy algorithm in Algorithm 1 and briefly explain it as follows.

First, we sort the ads according to $f(a, u(b_a^*))$ (line 3) and pick one ad out (denote it as $k$, lines 4-5) by optimizing the objective function in Eq. 5. Then, we update the remaining ads’ $u(s_a^*)$ by limiting them to no more than $u(s_k^*)$ (and correspondingly update $u(b_a^*)$, to ensure that ad $k$ has the largest eCPM after bid optimization, as in Constraint 6; lines 8-11). Afterwards, we repeat the above two steps until $N$ ads are picked out or no candidates remain (lines 2-12). Last, we set $b_a^*$ of all ads to their bid price upper bound (lines 13-15). Since each of the at most $N$ rounds re-sorts at most $n$ candidates, the time complexity of the proposed ranking algorithm is $O(N \cdot n \log n)$. Typically, $N$ is small (e.g., $N = 3$ in Item CPC Ads), so real-time response is not an issue.

Example 3.1.

Here we give an example to help understand the ranking algorithm. Suppose that there are $n = 4$ eligible ads in $A$, given in Table 1, and the number of ads to display is $N = 2$. We are going to select 2 ads from Ads 1-4 in Table 1. According to the proposed ranking algorithm, these 4 ads are sorted in descending order of $f(a, u(b_a^*))$. The largest rank score lower bound is $l(s_3^*) = 9$ (Ad 3 in Table 1), and the top-ranked ad has $u(s_1^*) = 11.2$, which is larger than 9. Thus, Ad 1 is picked out of $A$ and inserted into the winning set $W$, and the candidate set is updated to Table 2 according to lines 6-11 of Algorithm 1. Afterwards, Ad 3 rather than Ad 2 is selected as the other winning ad in the second cycle, because Ad 2’s rank score upper bound $u(s_2^*) = 7.5$ is smaller than 9. Then the loop ends because $|W| = N$. Finally, the bid optimization result of each ad is given in Table 3.

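| Ad # | $pctr_a$ | $b_a$ | $u(b_a^*)$ | $u(s_a^*)$ | $l(s_a^*)$ | $f(a, u(b_a^*))$ |
|------|----------|-------|------------|------------|------------|------------------|
| 1    | 0.04     | 2     | 2.8        | 11.2       | 8          | 0.312            |
| 2    | 0.05     | 1.5   | 1.5        | 7.5        | 4.5        | 0.255            |
| 3    | 0.06     | 1.5   | 1.95       | 11.7       | 9          | 0.237            |
| 4    | 0.04     | 1     | 1          | 4          | 3.6        | 0.14             |

Table 1. The 4 eligible ads in $A$ and their pCTRs, bids, etc.; $r_a$ in $l(b_a^*)$ and $u(b_a^*)$ is set to 0.4. The upper bound of each ad’s rank score is $u(s_a^*) = pctr_a \cdot u(b_a^*)$, and the lower bound is $l(s_a^*) = pctr_a \cdot l(b_a^*)$; both are shown in units of $10^{-2}$.

| Ad # | $pctr_a$ | $b_a$ | $u(b_a^*)$ | $u(s_a^*)$ | $l(s_a^*)$ | $f(a, u(b_a^*))$ |
|------|----------|-------|------------|------------|------------|------------------|
| 2    | 0.05     | 1.5   | 1.5        | 7.5        | 4.5        | 0.255            |
| 3    | 0.06     | 1.5   | 1.86       | 11.2       | 9          | 0.232            |
| 4    | 0.04     | 1     | 1          | 4          | 3.6        | 0.14             |

Table 2. Remaining ads in $A$ and their updates after Ad 1 is picked out. The updated cells are Ad 3’s $u(b_a^*)$, $u(s_a^*)$ and $f(a, u(b_a^*))$.

| Ad # | $pctr_a$ | $b_a$ | $b_a^*$ | $f(a, b_a^*)$ | eCPM  |
|------|----------|-------|---------|---------------|-------|
| 1    | 0.04     | 2     | 2.8     | 0.312         | 0.112 |
| 3    | 0.06     | 1.5   | 1.86    | 0.232         | 0.112 |
| 2    | 0.05     | 1.5   | 1.5     | 0.255         | 0.075 |
| 4    | 0.04     | 1     | 1       | 0.14          | 0.04  |

Table 3. Bid optimization result of each eligible ad.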

With this ranking strategy, we decouple the final sorting index from the goal of advertising traffic. On the one hand, ads can still be sorted by $pctr_a \cdot b_a^*$, which is the way to maximize eCPM; on the other hand, the advertising platform can choose ads according to other pursuits by using a different $f(\cdot)$. Another concern is the budget constraint of advertisers. Once an ad campaign has spent its budget, it is excluded from subsequent auctions, which does not affect the bid optimization process.

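Algorithm 1 Ranking Algorithm
Input: Ad list $A$, number of ads to display $N$, corresponding boundaries of bid price $l(b_a^*)$, $u(b_a^*)$
Output: Optimized bid prices $b_a^*$ for $a \in A$
1  Winning set $W \leftarrow \emptyset$;
2  repeat
3      Sort ads in $A$ in descending order of $f(a, u(b_a^*))$;
4      $l_{max} \leftarrow$ the largest $l(s_a^*)$ for $a \in A$;
5      Find the first ad $k$ from $A$ that satisfies $u(s_k^*) \ge l_{max}$;
6      $W \leftarrow W \cup \{k\}$;
7      $A \leftarrow A \setminus \{k\}$;
8      for $a \in A$ do
9          $u(s_a^*) \leftarrow \min(u(s_a^*), u(s_k^*))$;
10         $u(b_a^*) \leftarrow u(s_a^*) / pctr_a$;
11     end for
12 until $|W| = N$ or $A = \emptyset$;
13 for $a \in W \cup A$ do
14     $b_a^* \leftarrow u(b_a^*)$;
15 end for
16 Return $b_a^*$ for each ad in $W \cup A$;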
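The greedy ranking procedure can be sketched in Python as follows. This is a minimal illustration, not the production implementation: the `Ad` type and its field names are hypothetical, the composite index is assumed to be expected GMV plus alpha times eCPM, and the `gmv` values are chosen so that the index reproduces the last column of Table 1 with alpha = 1, which is an assumption of this sketch.

```python
from dataclasses import dataclass, replace


@dataclass
class Ad:
    ad_id: int
    pctr: float
    lo: float   # lower bound of the optimized bid
    hi: float   # upper bound of the optimized bid
    gmv: float  # pctr * pcvr * v, expected GMV per impression (assumed values)


def f(ad, bid, alpha=1.0):
    """Composite index: expected GMV plus alpha times expected ad revenue."""
    return ad.gmv + alpha * ad.pctr * bid


def rank(ads, n_slots, alpha=1.0, eps=1e-12):
    """Greedy selection of up to n_slots winners; returns (winners, bids)."""
    pool = [replace(a) for a in ads]  # copies, so the inputs stay untouched
    winners = []
    while pool and len(winners) < n_slots:
        # Sort by the index evaluated at each ad's bid upper bound.
        pool.sort(key=lambda a: f(a, a.hi, alpha), reverse=True)
        # Largest rank-score lower bound among the remaining candidates.
        l_max = max(a.pctr * a.lo for a in pool)
        # First ad whose rank-score upper bound can beat every lower bound.
        idx = next(i for i, a in enumerate(pool) if a.pctr * a.hi >= l_max - eps)
        k = pool.pop(idx)
        winners.append(k)
        # Cap the remaining ads so that none can outrank the winner.
        cap = k.pctr * k.hi
        for a in pool:
            a.hi = min(a.hi, cap / a.pctr)
    # Every ad finally bids at its (possibly capped) upper bound.
    return winners, {a.ad_id: a.hi for a in winners + pool}


# The four ads of Table 1; lo is derived from the rank-score lower bounds.
table1 = [
    Ad(1, pctr=0.04, lo=2.0, hi=2.8, gmv=0.20),
    Ad(2, pctr=0.05, lo=0.9, hi=1.5, gmv=0.18),
    Ad(3, pctr=0.06, lo=1.5, hi=1.95, gmv=0.12),
    Ad(4, pctr=0.04, lo=0.9, hi=1.0, gmv=0.10),
]
winners, bids = rank(table1, n_slots=2)
```

On this input the sketch selects Ad 1 and then Ad 3, and caps Ad 3's bid at about 1.867 so that its eCPM does not exceed that of Ad 1, in line with Table 3 (which shows the value truncated to 1.86).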

3.3. Algorithm Details

Having introduced the core ideas of our OCPC mechanism, we now detail the whole strategy layer.

Calibration


From our experience of maintaining the advertising system, we find that the predicted values used in the OCPC layer carry an inherent bias, which can reduce the effectiveness of the algorithm. Since it is difficult to correct this in model training, we calibrate the predictions at the beginning of the OCPC layer.

We take pCVR calibration as the example. The RTP module usually overestimates CVR when the actual CVR is high. This phenomenon is illustrated in Figure 4. We divide all ads into 20 groups according to their pCVR. The corresponding real CVR and the ratio of predicted to real CVR are drawn in the figure. We can see that the ratio becomes larger in groups with large pCVR. Thus, we calibrate the predicted CVR as


where $t_c$ is the calibration threshold. pCVR values larger than $t_c$ are calibrated with Eq. 8, an intuitive way to reduce the gap between predicted and real CVR for ads with large pCVR values. After calibration, Figure 4 shows that the gap drops significantly in the high-pCVR region.

(a) Before calibration
(b) After calibration
Figure 4. The gap between predicted and real CVR w.r.t. different pCVR levels, before and after calibration (from Jan 10, 2017 to Jan 16, 2017).

Overall OCPC Strategy

In Algorithm 2, we give an overview of the overall OCPC strategy, from calibration to ranking. Lines 1-4, with the functions calibrate and calculateBoundary, have linear time complexity $O(n)$. The time complexity of the ranking function is $O(N \cdot n \log n)$. Therefore, the run-time bottleneck of the OCPC strategy is the ranking stage. Considering the typical values of $n$ (hundreds) and $N$, real-time performance is not an issue for the proposed approach.

Algorithm 2 OCPC Algorithm
Input: Eligible ads $A$, and corresponding predictions $pctr_a$, $pcvr_a$
Output: Optimized bid price $b_a^*$ for $a \in A$
1  for $a \in A$ do
2      calibrate($a$);
3      calculateBoundary($a$);
4  end for
5  rank($A$);
6  return each $b_a^*$;

4. Model Estimation

The bid optimization boundary of OCPC depends heavily on CVR prediction. Other predicted values, such as pCTR, also strongly affect the performance of the proposed strategy. In this section, we focus on the prediction models, along with the accuracy and stability of the predicted values.

4.1. Model and Features

In Taobao's estimation tasks, user and campaign features are sparse and have tens of millions of dimensions. Logistic regression is a widely used algorithm in tasks like CTR prediction (Richardson et al., 2007). However, the problem to solve may be non-linear. Therefore, we use the mixture of logistic regression (MLR, also called LS-PLM (Gai et al., 2017)) algorithm in the RTP server. We do not elaborate on MLR here; instead, we introduce the feature composition to help understand how the learning model works.

Figure 5. Feature composition of CTR and CVR prediction model.

In Figure 5, we illustrate the feature composition in CTR and CVR prediction, and give a brief introduction to the three kinds of features along with their combinations. Context features are related to the context of the request. For example, the spot position feature (we call it the PID feature) is used to distinguish different spots (e.g., spots on Android or iOS). User features mainly contain user profile features (gender, age, etc.) and user behavior features (e.g., click counts in different categories over a period of time). Campaign features consist of features like the ad ID. Besides the separate features of those three kinds, their combinations (e.g., the Cartesian product of nickname and ad ID) are also used. Furthermore, in CVR prediction, the results of a click quality model (used to qualify a click behavior) are used as input, which has shown significant improvement in practice.

In the CTR model, positive samples are clicked impressions and negative samples are impressions that were not clicked. In the CVR model, positive samples are impressions that were clicked and converted, and negative samples are impressions that were clicked but not converted. New models are trained every day to eliminate the variance between different days.

4.2. Model Performance

Serving precise results is very important for prediction models. In tasks like CTR prediction, AUC is a widely used metric of model effectiveness. However, existing research (Cheng et al., 2016) shows that better AUC results in testing may bring worse performance in production. This also confused us in practice when tuning our prediction model. We analyzed the problem and found that the AUC metric does not treat different users and spots differently. For example, users who never click any ad, or obscure ad spots, perturb the AUC result toward a lower value. Based on these facts and analysis, we propose an AUC-like metric called Group AUC (GAUC), given in Eq. 9. First, we aggregate all test data according to the user ($u$) and the particular position ($p$) of the ad spot. Then, the AUC is calculated within each single group (if a group contains only positive or only negative samples, we remove it from the data). Finally, we take the weighted average of these AUC results over the groups, with the weight $w_{(u,p)}$ proportional to the number of impressions or clicks in the group, as the GAUC value.

$$GAUC = \frac{\sum_{(u,p)} w_{(u,p)} \cdot AUC_{(u,p)}}{\sum_{(u,p)} w_{(u,p)}} \tag{9}$$
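The GAUC computation described above can be sketched as follows. This is a minimal illustration, assuming the group weight is the number of impressions in the group (the text also allows click counts); the record layout and function names are hypothetical, and the pairwise AUC is written for clarity rather than speed.

```python
from collections import defaultdict


def auc(labels, scores):
    """Pairwise AUC; returns None if the group lacks positives or negatives."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None
    # Count correctly ordered (positive, negative) pairs; ties count one half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def gauc(records):
    """records: iterable of (user, position, label, score) tuples."""
    groups = defaultdict(list)
    for user, position, label, score in records:
        groups[(user, position)].append((label, score))
    num = den = 0.0
    for rows in groups.values():
        labels, scores = zip(*rows)
        group_auc = auc(labels, scores)
        if group_auc is None:
            continue  # single-class groups are removed from the data
        weight = len(rows)  # assumption: weight = impressions in the group
        num += weight * group_auc
        den += weight
    return num / den if den else None


# Illustrative records only.
records = [
    ("u1", "p1", 1, 0.9), ("u1", "p1", 0, 0.1),                        # AUC 1.0
    ("u2", "p1", 1, 0.2), ("u2", "p1", 0, 0.8), ("u2", "p1", 0, 0.1),  # AUC 0.5
    ("u3", "p1", 0, 0.3), ("u3", "p1", 0, 0.4),                        # skipped
]
value = gauc(records)  # (2 * 1.0 + 3 * 0.5) / 5
```

The third user never clicks, so that group is dropped rather than dragging the metric down, which is the behavior that motivates GAUC over a single global AUC.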
CTR and CVR Model Performance

In Figure 6, we give the AUC and GAUC performance of the CTR and CVR prediction models over a 7-day period. The results show that the performance of the daily models trained with the MLR algorithm is fairly stable. The CVR model has a higher GAUC than the CTR model, because there is less noise in the samples of the CVR model.

(a) CTR model
(b) CVR model
Figure 6. The AUC and GAUC performance of the CTR and CVR models over a 7-day period (from Jan 10, 2017 to Jan 16, 2017).

In Figures 7 and 4, we illustrate the ratio of predicted to real CTR and CVR values w.r.t. different predicted value levels. The results show that the predicted CTR values are usually larger than the real ones. However, what matters more in the proposed OCPC strategy is the ordinal relation between different predicted CTR values.

The performance results show that the CTR and CVR prediction models, which are prerequisites of the proposed OCPC mechanism, are practicable.

Figure 7. The gap between predicted and real CTR w.r.t. different pCTR level (from Jan 10, 2017 to Jan 16, 2017).

5. Experimental Results


Given the above prediction model performance, we now evaluate the effectiveness of the proposed OCPC method. The following experiments have two parts: offline simulation and online A/B test.

5.1. Offline Simulation

In online advertising, it always takes several days, or even several weeks, for new algorithms to take effect. Such a long feedback time delays the development and upgrading of new algorithms. To overcome this problem, we build an offline simulation platform to accelerate the validation of new ideas. Based on log data, the pre-view procedures can be restored perfectly. In other words, given the same eligible ad list for each PV request, the auction-winning ads in the simulation environment are the same as in the production environment. For the post-view user behaviors, we use the predicted probability as a substitute for real clicks or conversions to estimate the real performance of different bid optimization strategies. For example, if a displayed ad’s predicted CTR is $p$, it contributes $p$ to the total number of clicks. In the simulation, we use a fraction of all bidding records (about twenty million PVs) in Item CPC Ads on Feb 11, 2017, and we compare 4 different bid optimization strategies:

  • Strategy 0 is the old strategy without bid optimization. Due to the sensitivity and privacy of commercial data, the results of the other strategies are shown relative to this baseline strategy.

  • Strategy 1 is a simple bid optimization strategy that takes the advertisers’ view. Here we directly optimize , where is a monotone increasing function (when ) w.r.t. , ranging in .

  • Strategy 2 is our OCPC strategy that takes GMV pursuit of the traffic into account. The index and , where the implicit term () could take effect to prompt GMV.

  • Strategy 3 also attempts to promote GMV, but in a different way: it directly sorts eligible ads in descending order of , without bid optimization.

Str 1 is a straightforward strategy like the one proposed in (Perlich et al., 2012), which attempts to optimize advertisers’ ROI. The relationship between its bid optimization result and is illustrated in Figure 8. Str 2 is the proposed OCPC strategy, which also considers Taobao’s GMV pursuit. Using and as the arguments of in , Str 2 tends to indirectly select those ads with a high GMV estimation. Str 3 also attempts to promote GMV, but through a new sorting mechanism that departs from eCPM.

Figure 8. (a) The curve of and (b) the corresponding bid adjustment ratio of Str 1 when , w.r.t. different .

Before giving the results, we introduce the metrics used to evaluate the different bid optimization strategies. RPM is the advertising revenue per thousand impressions, which measures the traffic monetization efficiency of the advertising platform. GMV per mille (GPM) is the gross merchandise volume per thousand impressions, which is related to advertisers’ revenue and the user experience of Taobao. ROI measures advertisers’ return on investment. CTR, CVR and PPC are the average click-through rate, conversion rate and pay per click, respectively.
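The way these metrics could be computed in the offline simulation can be sketched as follows. This is a minimal illustration, not the production implementation: the `Impression` record, its field names and the exact metric formulas (for example, ROI taken as expected GMV divided by expected advertising cost) are assumptions made for the example. Predicted probabilities stand in for real clicks and conversions, as described for the simulation platform.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass
class Impression:
    """One simulated ad impression (all field names are hypothetical)."""
    p_ctr: float        # predicted click-through rate
    p_cvr: float        # predicted conversion rate, given a click
    click_price: float  # price charged to the advertiser if the ad is clicked
    item_price: float   # price of the advertised item


def simulate_metrics(impressions: Iterable[Impression]) -> Dict[str, float]:
    """Aggregate expected clicks, cost, conversions and GMV into the six metrics."""
    imps = list(impressions)
    n = len(imps)
    # Each impression contributes its predicted probability, not a 0/1 outcome.
    clicks = sum(i.p_ctr for i in imps)
    cost = sum(i.p_ctr * i.click_price for i in imps)
    conversions = sum(i.p_ctr * i.p_cvr for i in imps)
    gmv = sum(i.p_ctr * i.p_cvr * i.item_price for i in imps)
    return {
        "RPM": 1000 * cost / n,     # revenue per thousand impressions
        "GPM": 1000 * gmv / n,      # GMV per thousand impressions
        "ROI": gmv / cost,          # assumed definition: GMV per unit of ad spend
        "CTR": clicks / n,
        "CVR": conversions / clicks,
        "PPC": cost / clicks,
    }
```

Under these assumed definitions, RPM equals 1000 × CTR × PPC and ROI equals GPM / RPM, which is useful when reading the lift tables that follow.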

         RPM      GPM     ROI     CTR     CVR     PPC
Str 1    -9.5%    8.8%    20.2%   -0.5%   10.1%   -7.8%
Str 2    5.6%     14.1%   8.1%    -1.9%   14.9%   9.5%
Str 3    -17.7%   23.6%   50.2%   -8.6%   74.0%   -9.8%
Table 4. Simulation results of different OCPC strategies when .

In Table 4, we give the results of Str 1, 2 and 3 relative to Str 0. The parameter in adjustment function is chosen by cross-validation and set to and for Str 1 and 2 respectively. Str 1 focuses on optimizing advertisers’ ROI and cannot ensure a better RPM. Str 3 can boost GPM by ranking with ; however, it also pulls down RPM (because both pay per click (PPC) and CTR drop). Only the proposed OCPC strategy in Str 2 achieves a win for all three of GPM, ROI and RPM.
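As a quick consistency check on Table 4, the reported lifts can be related to each other. The sketch below assumes that the first three numeric columns are RPM, GPM and ROI (the order in which the metrics are introduced in the text) and that ROI is GMV divided by advertising cost, so that ROI = GPM / RPM and relative lifts combine multiplicatively. Both points are assumptions, not statements from the table itself.

```python
# Relative lifts over Str 0, in percent, copied from Table 4.
# The column meaning (RPM, GPM, ROI) is an assumption based on the metric order in the text.
TABLE4 = {
    "Str 1": (-9.5, 8.8, 20.2),
    "Str 2": (5.6, 14.1, 8.1),
    "Str 3": (-17.7, 23.6, 50.2),
}


def implied_roi_lift(rpm_lift_pct: float, gpm_lift_pct: float) -> float:
    """ROI lift (in percent) implied by ROI = GPM / RPM."""
    return ((1 + gpm_lift_pct / 100) / (1 + rpm_lift_pct / 100) - 1) * 100


def check_table(rows, tol_pct: float = 0.2):
    """Map each strategy to (implied ROI lift, whether it matches the reported lift)."""
    report = {}
    for name, (rpm, gpm, roi) in rows.items():
        implied = implied_roi_lift(rpm, gpm)
        report[name] = (implied, abs(implied - roi) <= tol_pct)
    return report


if __name__ == "__main__":
    for name, (implied, ok) in check_table(TABLE4).items():
        print(f"{name}: implied ROI lift {implied:.2f}% (consistent: {ok})")
```

For all three strategies the implied ROI lift agrees with the reported one to within rounding, which supports the assumed column order.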

To measure the influence of different adjustment ranges , we conduct an experiment with Str 2 (which outperforms the other strategies in the above experiment); the results are shown in Table 5. The offline simulation results indicate that a larger adjustment range brings larger GPM and ROI lifts. The increase in RPM is smaller than that in GPM, which results in a higher ROI lift when the adjustment range is large.

The results in Tables 4 and 5 show that Strategy 2 is effective in boosting overall GMV, and that the ROI constraint protects advertisers’ interests.

Range   RPM     GPM     ROI     CTR     CVR     PPC
0.2     4.2%    6.5%    2.2%    -0.5%   6.5%    5.5%
0.3     5.2%    10.2%   4.8%    -1.1%   10.4%   7.6%
0.4     5.6%    14.1%   8.1%    -1.9%   14.9%   8.1%
0.5     5.5%    18.1%   11.9%   -3.1%   19.9%   11.2%
Table 5. Simulation results of Str 2 w.r.t. different adjustment ranges.

Campaign Results

Besides the overall performance, we also simulate the performance of individual campaigns under Str 2, to check whether the proposed strategy improves the advertising effect of each single campaign. The results for the 10 campaigns with the largest cost in the simulation are shown in Table 6. The metric named “Cost” is the total payment for advertising. An interesting observation is that seven campaigns’ GPM increases while their PV drops at the same time, which means that they win fewer low-quality opportunities under the OCPC mechanism. In addition, ROI improves for eight of the ten campaigns, which shows that the ROI constraint works for individual campaigns. The ROIs of campaigns 3 and 8 drop slightly, because they win more PVs.

          GMV      Cost     PV       GPM     ROI
Camp 1    -0.9%    -17.5%   -16.2%   18.2%   20.1%
Camp 2    -7.7%    -27.9%   -26.8%   26.2%   28.1%
Camp 3    2.5%     9.2%     2.6%     -0.1%   -6.2%
Camp 4    23.0%    9.2%     0.4%     22.6%   12.7%
Camp 5    -13.1%   -23.8%   -22.0%   11.4%   14.0%
Camp 6    0.0%     -4.0%    -10.3%   11.5%   4.1%
Camp 7    -5.0%    -8.0%    -9.7%    5.2%    3.2%
Camp 8    64.6%    65.7%    49.5%    10.1%   -0.6%
Camp 9    -19.2%   -30.4%   -28.5%   13.1%   16.1%
Camp 10   -4.2%    -5.3%    -8.6%    4.9%    1.2%
Table 6. Simulation results of Str 2 w.r.t. different campaigns.
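The campaign-level observations above can be re-derived from the table with a few lines of code. The column order used here (GMV, Cost, PV, GPM, ROI) is an assumption: it is the assignment under which the rows satisfy ROI ≈ GMV / Cost to within rounding, but it is not stated in the table as extracted.

```python
# Lifts in percent per campaign, copied from Table 6.
# Assumed column order: (GMV, Cost, PV, GPM, ROI).
TABLE6 = {
    1: (-0.9, -17.5, -16.2, 18.2, 20.1),
    2: (-7.7, -27.9, -26.8, 26.2, 28.1),
    3: (2.5, 9.2, 2.6, -0.1, -6.2),
    4: (23.0, 9.2, 0.4, 22.6, 12.7),
    5: (-13.1, -23.8, -22.0, 11.4, 14.0),
    6: (0.0, -4.0, -10.3, 11.5, 4.1),
    7: (-5.0, -8.0, -9.7, 5.2, 3.2),
    8: (64.6, 65.7, 49.5, 10.1, -0.6),
    9: (-19.2, -30.4, -28.5, 13.1, 16.1),
    10: (-4.2, -5.3, -8.6, 4.9, 1.2),
}


def summarize(rows):
    """Return campaigns with (GPM up and PV down) and campaigns with ROI up."""
    gpm_up_pv_down = [c for c, (_, _, pv, gpm, _) in rows.items() if gpm > 0 and pv < 0]
    roi_up = [c for c, row in rows.items() if row[4] > 0]
    return gpm_up_pv_down, roi_up


def max_roi_mismatch(rows):
    """Largest gap (in percentage points) between reported ROI lift and GMV/Cost."""
    gaps = []
    for gmv, cost, _, _, roi in rows.values():
        implied = ((1 + gmv / 100) / (1 + cost / 100) - 1) * 100
        gaps.append(abs(implied - roi))
    return max(gaps)
```

With this reading, seven campaigns gain GPM while losing PV, eight gain ROI, and the two campaigns whose ROI drops (3 and 8) are among those whose PV grows.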

5.2. Online Results of OCPC Strategy 2

After the offline simulation and a small-traffic online A/B test in an experimental environment, we decided to deploy the aforementioned Str 2 in production. Meanwhile, Str 0 is retained as the control. In this section, we study the online performance of the proposed OCPC strategy in Item CPC Ads. Results for other traffic pursuits and scenarios are also shown to demonstrate the effectiveness and generality of the OCPC mechanism.

In Table 7, we give the experimental results of Str 2 on a share of the whole production traffic; the benchmark Str 0 is also given a share of the traffic. Users are allocated to each strategy randomly, while all ad campaigns exist in both strategies. Note that Item CPC Ads receives about ninety million PVs every day. The results show a stable improvement from the proposed bid optimization strategy: advertisers’ interests (indicated by ROI), the platform’s revenue (indicated by RPM) and overall GPM all improve.
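One common way to realize such a user-level random split is deterministic hashing of the user identifier, so that a user always sees the same strategy while every campaign remains eligible in both buckets. The sketch below is only an illustration of that idea: the paper states that users are allocated randomly but does not describe the mechanism, and the function name, the salt and the `experiment_share` parameter are hypothetical.

```python
import hashlib


def assign_bucket(user_id: str, experiment_share: float, salt: str = "ocpc-exp") -> str:
    """Deterministically map a user to 'str2', 'str0' or 'other'.

    `experiment_share` is the fraction of users given to each of the two
    compared strategies (a hypothetical parameter, at most 0.5).
    """
    if not 0.0 < experiment_share <= 0.5:
        raise ValueError("experiment_share must be in (0, 0.5]")
    digest = hashlib.sha256(f"{salt}:{user_id}".encode("utf-8")).hexdigest()
    # Use the first 32 bits of the hash as an approximately uniform value in [0, 1).
    slot = int(digest[:8], 16) / 2**32
    if slot < experiment_share:
        return "str2"   # users served with the OCPC strategy
    if slot < 2 * experiment_share:
        return "str0"   # users served with the fixed-bid benchmark
    return "other"      # users outside the experiment
```

Because the assignment depends only on the user identifier and the salt, repeated requests from the same user stay in one bucket, which keeps per-user metrics comparable across the two strategies.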

% Improved 6.6% 8.9% 2.1% -1.3% 5.2%
Table 7. Online experimental results of Str 2 under of whole production traffic (, from Aug 23, 2016 to Aug 29, 2016).

After the overall performance results, we run further experiments (from Sep 8, 2016 to Sep 14, 2016) to verify the effectiveness of Str 2, and to find out whether it benefits most individual advertisers and the advertising platform in the long term.

Performance in Advertisers’ View

First, we analyze the performance of each individual campaign. Only campaigns that reach a minimum number of conversions in a week are included. In Table 8, we give the proportion of ad campaigns whose ad performance is improved. Among these campaigns, 67% get GPM and ROI improvements at the same time, and 24% are in the so-called quantity and quality exchange situation: their PV increment is larger than their ROI drop. This trade-off is also acceptable for some advertisers: the additional PVs may include secondary impressions that lower the campaign’s ROI, but more impressions can also bring more conversions.

GPM and ROI are improved 67%
Quantity and quality exchange 24%
Table 8. The proportion of ad campaigns whose performance is improved. Here we choose campaigns with more than conversions in the experiment.
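The two groups reported in Table 8 can be expressed as a simple classification rule over per-campaign lifts. The sketch below is one reading of the definitions in the text; the function names are hypothetical, and comparing the PV increment with the ROI drop directly in percentage terms is an assumption.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def classify_campaign(pv_lift: float, gpm_lift: float, roi_lift: float) -> str:
    """Classify one campaign from its relative lifts (in percent) over the baseline."""
    if gpm_lift > 0 and roi_lift > 0:
        return "gpm_and_roi_improved"
    # Quantity and quality exchange: ROI drops, but PV grows by more than ROI drops.
    if pv_lift > 0 and roi_lift < 0 and pv_lift > -roi_lift:
        return "quantity_quality_exchange"
    return "other"


def proportions(campaigns: Iterable[Tuple[float, float, float]]) -> Dict[str, float]:
    """Share of campaigns in each class; input rows are (pv_lift, gpm_lift, roi_lift)."""
    labels = Counter(classify_campaign(*c) for c in campaigns)
    total = sum(labels.values())
    return {label: count / total for label, count in labels.items()}
```

For instance, a campaign with a 49.5% PV lift, a 10.1% GPM lift and a 0.6% ROI drop falls into the exchange group under this rule.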

With the OCPC mechanism, advertisers might also be concerned about what the optimized bid prices actually are. In Figure 9, we illustrate the numerical relation between the optimized bid price and the determined bid for the ads displayed on Feb 19, 2017. We divide the bidding records into 9 groups according to their value of (ranging from to ). The results show that more than half of the impressions belong to group 5, the middle group, which includes records with . This is a reasonable observation, because the bid optimization upper bound for low-quality traffic is set to according to Eq. 4, and the proposed ranking algorithm prefers to adopt the upper bound.

Figure 9. The proportion of different ratios of optimized bid price and determined bid.

Performance in Platform’s View

From the platform’s side, focusing merely on the overall RPM, GPM and ROI results is far from enough. In the Taobao advertising system, ad items come from various categories, e.g., women’s dresses, furniture or digital products. Each category has an inherent CVR or ROI level. It is therefore possible that the overall improvements in GMV or ROI come from traffic shifting between categories, which is not good in the long run. Thus, we give an experimental result to capture the traffic shifting.

Figure 10. The variation of PV proportion of top 20 categories (ranked by category’s total advertising cost).

The variations of PV proportion are given in Figure 10. The PV proportion of a category is the ratio of the category’s PV to the total PV in an experiment bucket. The results suggest that the traffic shifting is not pronounced, with all the variations within (note that the PV proportion can differ between buckets regardless of whether their algorithms differ).
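The PV proportion defined above, and its variation between two experiment buckets, can be computed as in the following sketch. It assumes each bucket is available as a list with one category label per PV, and it measures the variation as the absolute difference of proportions; the figure may use a different convention, so this is an illustration only.

```python
from collections import Counter
from typing import Dict, Iterable


def pv_proportions(categories: Iterable[str]) -> Dict[str, float]:
    """PV proportion per category: category PV divided by total PV in the bucket."""
    counts = Counter(categories)
    total = sum(counts.values())
    return {category: n / total for category, n in counts.items()}


def proportion_shift(control: Iterable[str], treatment: Iterable[str]) -> Dict[str, float]:
    """Difference in PV proportion (treatment minus control) for every category."""
    p_control = pv_proportions(control)
    p_treatment = pv_proportions(treatment)
    all_categories = set(p_control) | set(p_treatment)
    return {
        c: p_treatment.get(c, 0.0) - p_control.get(c, 0.0)
        for c in all_categories
    }
```

Since the proportions in each bucket sum to one, the shifts always sum to zero: any category that gains PV share does so at the expense of others, which is why the per-category view is informative.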

Analogous to the advertisers’ view, we also run experiments to show the performance from the categories’ view. The results in Table 9 show that 17% of categories (with 62% of PVs) get GPM and ROI improvements at the same time.

                                Category   PV
GPM and ROI are improved        17%        62%
GPM is improved                 27%        21%
Quantity and quality exchange   30%        12%
Table 9. The proportion of categories whose performance is improved, and the corresponding PV proportion of categories.

The results from the platform’s and advertisers’ views show that the OCPC algorithm is able to allocate suitable opportunities to different ads, which improves the overall utilization of advertising traffic. Together, the above results show that OCPC has a significant positive effect for both the Taobao advertising platform and the advertisers.

5.3. Online Performance in Other Scenarios

As mentioned in Section 3.1, advertisers could have different pursuits. Before Double Eleven, Taobao sellers are more concerned with the quantity of goods added to users’ shopping carts. We run experiments with Str 2 using different before 2016’s Double Eleven event. Using the predicted probability of adding to shopping cart (predicted ASR), the index function , and .

In Table 10, we give the results of the OCPC strategy tuned to promote ASR. The results show that ASR is improved (compared to Str 2 with aforementioned in Section 5.1).

Improved 0.3% -6.1% -2.9% 21.1% 15.6%
Table 10. The online results of promoting the probability of adding to shopping cart (from Oct 30, 2016 to Nov 10, 2016, production flow).

Besides, we give the results of Str 2 in Banner CPC Ads, with instead, and , in Table 11. Note that we remove term from , because there are store campaigns in which PPBs of different items vary a lot. The results suggest large CVR and GPM improvement.

Improved 15.7% 3.6% 11.7% -0.6% 19%
Table 11. The online results in Banner Ads (from Jan 13, 2017 to Jan 15, 2017, production flow).

The above experiments show that the OCPC mechanism can act as a general framework that handles different problems across the pursuits and scenarios tested.

6. Conclusion

We introduce a number of important features of the Taobao display advertising system and elaborate on two key ad formats, i.e., banner and item ads. By analyzing the ecological characteristics and comparing with other methods, we adopt the most suitable pricing method, i.e., CPC, for the ad formats involved. We present our system architecture and ad serving process, based on which we analyze the shortcomings of the traditional CPC method and propose the OCPC algorithm to reconcile the demands of advertisers, platform ecological indices and platform revenue. We characterize the optimization objectives mathematically and give detailed algorithms together with related technical details such as prediction models, calibration, and algorithm complexity analysis. While retaining the eCPM sorting mechanism, our proposed OCPC strategy benefits, through bid optimization, not only advertisers but also other indices, including eCPM itself. In the Taobao display advertising platform, OCPC has been automatically applied to the whole mobile production traffic of Item CPC Ads, and advertisers can also choose to apply it to their own Banner CPC Ads traffic.


References
  • eco (2015) 2015. The everything creditor. Economist (2015).
  • Aggarwal et al. (2006) Gagan Aggarwal, Ashish Goel, and Rajeev Motwani. 2006. Truthful auctions for pricing search keywords. In Proceedings of the 7th ACM conference on Electronic commerce. ACM, 1–7.
  • Chen et al. (2016) Junxuan Chen, Baigui Sun, Hao Li, Hongtao Lu, and Xian-Sheng Hua. 2016. Deep CTR Prediction in Display Advertising. In Proceedings of the 2016 ACM on Multimedia Conference. ACM, 811–820.
  • Cheng et al. (2016) Heng-Tze Cheng, Levent Koc, Jeremiah Harmsen, Tal Shaked, Tushar Chandra, Hrishi Aradhye, Glen Anderson, Greg Corrado, Wei Chai, Mustafa Ispir, et al. 2016. Wide & deep learning for recommender systems. In Proceedings of the 1st Workshop on Deep Learning for Recommender Systems. ACM, 7–10.
  • Edelman et al. (2007) Benjamin Edelman, Michael Ostrovsky, and Michael Schwarz. 2007. Internet advertising and the generalized second-price auction: Selling billions of dollars worth of keywords. The American economic review 97, 1 (2007), 242–259.
  • Evans (2009) David S Evans. 2009. The online advertising industry: Economics, evolution, and privacy. The journal of economic perspectives 23, 3 (2009), 37–60.
  • Facebook (2012) Facebook. 2012. Cost per Action and Optimized Cost Per Mille. developers.facebook.com (2012).
  • Gai et al. (2017) Kun Gai, Xiaoqiang Zhu, Han Li, Kai Liu, and Zhe Wang. 2017. Learning Piece-wise Linear Models from Large Scale Data for Ad Click Prediction. arXiv preprint arXiv:1704.05194 (2017).
  • Goldfarb and Tucker (2011) Avi Goldfarb and Catherine Tucker. 2011. Online display advertising: Targeting and obtrusiveness. Marketing Science 30, 3 (2011), 389–404.
  • Google (2010) Google. 2010. Enhanced Cost per Click in Google AdWords. https://support.google.com/adwords/answer/2464964 (2010).
  • Graepel et al. (2010) Thore Graepel, Joaquin Q Candela, Thomas Borchert, and Ralf Herbrich. 2010. Web-scale bayesian click-through rate prediction for sponsored search advertising in microsoft’s bing search engine. In Proceedings of the 27th International Conference on Machine Learning (ICML-10). 13–20.
  • He et al. (2014) Xinran He, Junfeng Pan, Ou Jin, Tianbing Xu, Bo Liu, Tao Xu, Yanxin Shi, Antoine Atallah, Ralf Herbrich, Stuart Bowers, et al. 2014. Practical lessons from predicting clicks on ads at facebook. In Proceedings of the Eighth International Workshop on Data Mining for Online Advertising. ACM, 1–9.
  • Karande et al. (2013) Chinmay Karande, Aranyak Mehta, and Ramakrishnan Srikant. 2013. Optimizing budget constrained spend in search advertising. In Proceedings of the sixth ACM international conference on Web search and data mining. ACM, 697–706.
  • Lahaie et al. (2007) Sébastien Lahaie, David M Pennock, Amin Saberi, and Rakesh V Vohra. 2007. Sponsored search auctions. Algorithmic game theory (2007), 699–716.
  • Muthukrishnan (2010) S Muthukrishnan. 2010. Data Mining Problems in Internet Ad Systems.. In COMAD. 9.
  • Perlich et al. (2012) Claudia Perlich, Brian Dalessandro, Rod Hook, Ori Stitelman, Troy Raeder, and Foster Provost. 2012. Bid optimizing and inventory scoring in targeted online advertising. In Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 804–812.
  • Provost et al. (2009) Foster Provost, Brian Dalessandro, Rod Hook, Xiaohan Zhang, and Alan Murray. 2009. Audience selection for on-line brand advertising: privacy-friendly social network targeting. In Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 707–716.
  • Raeder et al. (2012) Troy Raeder, Ori Stitelman, Brian Dalessandro, Claudia Perlich, and Foster Provost. 2012. Design principles of massive, robust prediction systems. In Proceedings of the 18th ACM SIGKDD international conference on knowledge discovery and data mining. ACM, 1357–1365.
  • Richardson et al. (2007) Matthew Richardson, Ewa Dominowska, and Robert Ragno. 2007. Predicting clicks: estimating the click-through rate for new ads. In Proceedings of the 16th international conference on World Wide Web. ACM, 521–530.
  • Schafer et al. (2007) J Ben Schafer, Dan Frankowski, Jon Herlocker, and Shilad Sen. 2007. Collaborative filtering recommender systems. In The adaptive web. Springer, 291–324.
  • Varian (2007) Hal R Varian. 2007. Position auctions. International Journal of Industrial Organization 25, 6 (2007), 1163–1178.
  • Yuan et al. (2014a) Shuai Yuan, Jun Wang, Bowei Chen, Peter Mason, and Sam Seljan. 2014a. An empirical study of reserve price optimisation in real-time bidding. In Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 1897–1906.
  • Yuan et al. (2013) Shuai Yuan, Jun Wang, and Xiaoxue Zhao. 2013. Real-time bidding for online advertising: measurement and analysis. In Proceedings of the Seventh International Workshop on Data Mining for Online Advertising. ACM, 3.
  • Yuan et al. (2014b) Yong Yuan, Feiyue Wang, Juanjuan Li, and Rui Qin. 2014b. A survey on real time bidding advertising. In IEEE International Conference on Service Operations and Logistics, and Informatics. 418–423.
  • Zhang et al. (2014) Weinan Zhang, Shuai Yuan, and Jun Wang. 2014. Optimal real-time bidding for display advertising. In Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 1077–1086.
  • Zhang et al. (2016) Weinan Zhang, Tianxiong Zhou, Jun Wang, and Jian Xu. 2016. Bid-aware Gradient Descent for Unbiased Learning with Censored Data in Display Advertising. In Proceedings of the 22nd ACM SIGKDD international conference on Knowledge discovery and data mining. 665–674.