Practical Constrained Optimization of Auction Mechanisms in E-Commerce Sponsored Search Advertising

07/31/2018 ∙ by Gang Bai, et al. ∙ 0

Sponsored search in E-commerce platforms such as Amazon, Taobao and Tmall provides sellers an effective way to reach the potential buyers whose intent is most relevant. In this paper, we study the auction mechanism optimization problem in sponsored search on Alibaba's mobile E-commerce platform. Besides generating revenue, we are supposed to maintain an efficient marketplace with plenty of quality users, guarantee a reasonable return on investment (ROI) for advertisers, and meanwhile facilitate a pleasant shopping experience for the users. These requirements essentially pose a constrained optimization problem. Directly optimizing over auction parameters yields a discontinuous, non-convex problem that defies effective solution. One of our major contributions is a practical convex optimization formulation of the original problem. We devise a novel re-parametrization of the auction mechanism with discrete sets of representative instances. To construct the optimization problem, we build an auction simulation system which estimates the resulting business indicators of the selected parameters by replaying the auctions recorded from real online requests. We summarize experiments on real search traffic that analyze the effects of the fidelity of auction simulation, the efficacy under various constraint targets and the influence of regularization. The experimental results show that with proper entropy regularization, we are able to maximize revenue while constraining other business indicators within given ranges.




1. Introduction

In this article, we present our work on auction mechanism optimization in the sponsored search engine of Alibaba’s mobile E-commerce platform. In 2017, the platform enabled millions of active advertisers to proactively reach hundreds of millions of unique potential buyers and to complete sales of goods worth hundreds of billions of RMB.

Sponsored search has proved to be one of the most successful business models in online digital advertising. For each user query, the sponsored search engine renders relevant advertisements in addition to the main search results. The advertisers bid on query keywords for their advertisements. For each opportunity to show an advertisement (an impression), the sponsored search engine selects a set of advertisement candidates relevant to the search query, predicts their quality scores such as the click-through rate (CTR) and conversion rate (CVR), allocates impression opportunities to advertisements using an auction mechanism and computes the clearing price for advertisers.

The generalized second price (GSP) mechanism is arguably the most widely used mechanism in sponsored search engines (EOS2007; Aggarwal2006; Varian2007). It ranks the advertisements by their bidding price and quality score. The top-ranked advertisements get the impressions, and each pays the minimum price needed to maintain its ranking position.
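As a minimal illustration of the textbook GSP rule described above, the sketch below ranks ads by bid × quality and charges each winner the smallest bid that keeps it ahead of the next ad. The function name `gsp_auction`, the tuple layout and the zero price for an uncontested last slot are assumptions made for this example, not details of the platform.

```python
from typing import List, Tuple


def gsp_auction(ads: List[Tuple[str, float, float]], num_slots: int) -> List[Tuple[str, float]]:
    """Run a textbook GSP auction.

    ads: (ad_id, bid, quality) triples, with quality > 0 (e.g. a predicted CTR).
    Returns (ad_id, price_per_click) for the winners, best slot first.
    """
    # Rank by the product of bid and quality score, highest first.
    ranked = sorted(ads, key=lambda ad: ad[1] * ad[2], reverse=True)
    winners = []
    for i, (ad_id, _bid, quality) in enumerate(ranked[:num_slots]):
        if i + 1 < len(ranked):
            next_score = ranked[i + 1][1] * ranked[i + 1][2]
            # Smallest bid b such that b * quality >= next_score.
            price = next_score / quality
        else:
            # No competitor below; a real system would apply a reserve price.
            price = 0.0
        winners.append((ad_id, price))
    return winners


example = gsp_auction([("a", 2.0, 0.10), ("b", 3.0, 0.05), ("c", 1.0, 0.08)], num_slots=2)
```

In this example, ad "a" wins the first slot with the highest score (0.20) and pays about 1.5, less than its bid of 2.0, because that is enough to stay ahead of ad "b".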

In the literature, most auction mechanisms focus on maximizing the expected revenue in the Bayesian setting (Myerson1981), with variants that balance efficiency and revenue using reserve prices (Roberts2013; Ostrovsky2011) or trade off relevance and efficiency with an exponential weight on the quality score (Lahaie2011). However, in the realistic case of E-commerce, the auction mechanism should be optimized under many constraints, including the advertisers’ budgets, advertisement efficiency limits, etc. To make the platform revenue sustainable in the long term, we should also take care of factors such as advertisers’ return on investment (ROI, quantitatively measured as the sales amount obtained per unit of advertising cost) and users’ satisfaction with the search experience.

Moreover, most existing auction mechanisms only work in idealized environments, where the participants are perfectly rational (Tang17) and the click-through rates of advertisements are fixed according to the ad positions. These assumptions do not hold in industrial search engines, where traffic characteristics such as user propensity and search queries change dynamically from day to day.

In Alibaba’s mobile sponsored search platform, we conduct the GSP auction in a virtual space (the ranking score). Hence, the key to improving the performance of our platform is to find the right ranking score function. As mentioned above, our target is to maximize the revenue of the platform while meeting requirements such as user experience and advertisers’ ROI. It is therefore natural to formulate auction optimization as a constrained optimization problem.

However, since the ranking function space is too large, it is impractical to explore the performance of all the ranking functions online. To approximately gauge the outcome performance indicators of different ranking functions, we build a simulation system to replay the online auctions to generate virtual impressions and estimate expected user responses under a given ranking function.

To make the optimization problem practical, we introduce a discrete set of selected ranking functions as a novel representation of the auction mechanism. We re-parameterize the auction mechanism as the hitting probabilities of the elements in the set. For each impression, one of the ranking functions is selected according to its probability and is used to rank and price the candidate advertisements. The constrained optimization problem is to find the best hitting probabilities for the given set of ranking functions. With this representation, we derive a convex optimization formulation of the problem.

2. Problem and Formulation

The auction process in an E-commerce sponsored search platform can be formulated as follows. For each product search request, the search engine finds a set of advertisement candidates relevant to the user query via broad match (Yan2018) and estimates the predicted CTR and CVR of each candidate using statistical models. The predicted CTR, the predicted CVR and the bidding price of the ad on the query keyword are then mapped into a virtual space by evaluating a ranking score, and a GSP auction is conducted in that virtual space.

The ranking score, as the core of the auctions in our platform, accounts for both expected efficiency (Lahaie2011) and hidden cost (Abrams2007). It is principally defined, in Eq.(1), as a function of the bidding price and the predicted CTR and CVR, with two vectors of detailed parameters in the scoring functions. For simplicity, we treat these two vectors as one combined parameter vector. Because the GSP auction is conducted in the space of the ranking score, the price for a bidder at a given position in the ranking is the infimum of the bids with which she can still keep her position.


Each component of the parameter vector weighs different input attributes of the function and influences the outcomes of ranking and pricing differently, for example by favoring a higher bidding price or a higher click probability. For business reasons, we omit the detailed formulation of the components in Eq.(1). This omission does not hinder the illustration of our method, since the method can be applied to various formulations of the ranking score function.
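In symbols, the pricing rule described above might be written as follows. The notation is assumed here for illustration, since the original symbols are not shown: $\phi(b, \mathrm{pctr}, \mathrm{pcvr}; \theta)$ stands for the ranking score of an ad with bid $b$ under parameter $\theta$, ads are indexed by their rank $i$, and $p_i$ is the price charged to the ad at rank $i$.

```latex
p_i \;=\; \inf \Bigl\{\, b \ge 0 \;:\;
  \phi\bigl(b, \mathrm{pctr}_i, \mathrm{pcvr}_i; \theta\bigr)
  \;\ge\;
  \phi\bigl(b_{i+1}, \mathrm{pctr}_{i+1}, \mathrm{pcvr}_{i+1}; \theta\bigr)
\Bigr\}
```

If $\phi$ is continuous and increasing in the bid, this amounts to solving for the bid at which the two scores are equal. For the classic score $\phi = b \cdot q$ with quality $q$, it reduces to the familiar $p_i = b_{i+1} q_{i+1} / q_i$.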

In practice, a context-aware mechanism, which assigns different mechanisms to different traffic, is usually used to better capture the distinct properties of search requests and ad candidate sets. In our case, search requests are assigned to categories using query information. Each category has a manually tuned parameter that governs the auctions of that category.

For sustainable development of the platform, we should keep improving the satisfaction of users and advertisers while generating revenue for the platform. In our work, we measure the satisfaction of users by their engagement with the platform, i.e. the advertisement clicks and product purchases they make. For advertisers, we use the average cost of each user engagement through the advertisement as an indicator of the advertiser’s ROI. We also take the advertising PV coverage ratio (the ratio of ad impressions to total ad slots) as an important factor in users’ search experience, since excessively displaying ads among search results is typically displeasing.

Different ranking function parameters produce different ranking and pricing outcomes, which lead to different user responses and eventually to different business metrics. We define the business performance indicators accordingly: platform revenue, overall click-through rate (CTR), conversion rate (CVR) and ad PV coverage ratio (PVR), together with cost per click (CPC) and cost per conversion/acquisition (CPA), which measure how well advertisers’ goals are fulfilled.
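The indicators can be computed from aggregate counts roughly as sketched below. PVR follows the definition given in the text. CTR as clicks per ad impression and CVR as conversions per click are the common conventions and are assumed here, as are the function and argument names.

```python
def business_indicators(revenue, clicks, conversions, ad_impressions, total_ad_slots):
    """Compute the business indicators from aggregate totals.

    All denominators are assumed to be positive.
    """
    return {
        "revenue": revenue,
        "CTR": clicks / ad_impressions,          # clicks per ad impression (assumed convention)
        "CVR": conversions / clicks,             # conversions per click (assumed convention)
        "CPC": revenue / clicks,                 # average cost of one click
        "CPA": revenue / conversions,            # average cost of one conversion
        "PVR": ad_impressions / total_ad_slots,  # ad impressions over total ad slots
    }


indicators = business_indicators(
    revenue=500.0, clicks=250, conversions=10, ad_impressions=10000, total_ad_slots=12500
)
```

The numbers in the call are made up for illustration only.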

2.1. Constrained Mechanism Optimization

One may see this setup as a multi-objective optimization problem, as proposed in (Wang2012). However, the objectives are indirectly and nonlinearly correlated, and it is difficult to set the preferences among the objectives so that a particular set of requirements is met.

Instead, in line with the general business objective, we formulate our business problem as a constrained optimization that optimizes over the auction parameters of each category. In detail, the goal is to find the parameters whose auction outcomes maximize the total revenue of the platform while remaining feasible for the constraints on CTR, CPC, CVR, CPA and PVR, etc. The constraint targets are lower bounds on CTR and CVR, and lower and upper bounds on CPC, CPA and PVR.

Directly solving the optimization problem is intractable: because we are working with a second price auction mechanism, the outcomes are typically non-convex and even discontinuous with respect to the parameters. It is appealing, but challenging, to cast the optimization problem into a well-studied framework such as convex optimization (Boyd2004).

In this article, we present a novel re-parametrization of the ranking score function using a set of representative parameter instances to make the optimization problem practical. The instances in the set are selected by evenly discretizing each dimension of the parameter within a bounding box centered at the parameter of the original ranking function. In our experiments, the number of selected fixed parameter vectors, which equals the number of grid points in the bounding box, is 2025.
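The discretization can be sketched as a Cartesian grid around the original parameter. The text only states the total of 2025 instances; the two-dimensional parameter, the 45 × 45 split, the box half-widths and the function name `grid_around` below are assumptions chosen so that the count matches.

```python
import itertools


def grid_around(center, half_width, points_per_dim):
    """Evenly discretize a bounding box centered at `center`.

    center, half_width, points_per_dim: sequences with one entry per dimension.
    Returns a list of parameter tuples, one per grid point.
    """
    axes = []
    for c, h, n in zip(center, half_width, points_per_dim):
        if n == 1:
            axes.append([c])
        else:
            step = 2.0 * h / (n - 1)
            axes.append([c - h + i * step for i in range(n)])
    # Cartesian product of the per-dimension grids.
    return list(itertools.product(*axes))


# Hypothetical original parameter (1.0, 0.5) and box half-widths (0.2, 0.1).
candidates = grid_around(center=(1.0, 0.5), half_width=(0.2, 0.1), points_per_dim=(45, 45))
```

With an odd number of points per dimension, the original parameter itself is one of the grid points, so the baseline mechanism remains available as a candidate.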

This set of parameters for each request category provides a comprehensive range of auction outcomes around that of the original parameter. By weighting the elements in the set, we are able to tune the mechanism to produce specific results.

To steer the outcome of this family of ranking score functions, we assign to each instance in the set a probability of being selected for each category. The process of applying this family of mechanisms in a sponsored search auction is described in Algorithm 1. This probabilistic re-parametrization makes the constrained optimization problem practical.

We now describe how to obtain the best distribution over the parameter set that conforms to the business requirements. For each request of a category, the ranking function with a given parameter instance produces the corresponding ranking and pricing result, from which we obtain the expected user responses (click and purchase) and the price of a click. Given the auction outcomes of every request under every ranking function instance, our problem is to find the probability values, for each category and each ranking function of that category, that maximize expected revenue and meet the business requirements in expectation. Formally, the constrained optimization problem is materialized as:

(4) s.t.

where the indicator function equals 1 if its argument is true and 0 otherwise.
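One way to write the problem described above is sketched below. All symbols are assumed notation, since the original symbols are not shown, and the authors' exact Eq.(3)…(10) may differ in detail. Here $p_{c,k}$ is the probability of selecting the $k$-th parameter instance for category $c$, and $\mathrm{Rev}_{c,k}$, $\mathrm{Clk}_{c,k}$, $\mathrm{Cnv}_{c,k}$ and $\mathrm{Imp}_{c,k}$ are the simulated revenue, expected clicks, expected conversions and ad impressions accumulated over the requests of category $c$ under that instance. $N$ is the total number of ad slots, $L_{\cdot}$ and $U_{\cdot}$ are the lower and upper targets, and $\langle p, X \rangle = \sum_c \sum_k p_{c,k} X_{c,k}$.

```latex
\begin{aligned}
\max_{p}\quad & \langle p, \mathrm{Rev} \rangle \\
\text{s.t.}\quad
& \langle p, \mathrm{Clk} \rangle \ge L_{\mathrm{CTR}}\, \langle p, \mathrm{Imp} \rangle, \\
& \langle p, \mathrm{Cnv} \rangle \ge L_{\mathrm{CVR}}\, \langle p, \mathrm{Clk} \rangle, \\
& L_{\mathrm{CPC}}\, \langle p, \mathrm{Clk} \rangle \le \langle p, \mathrm{Rev} \rangle \le U_{\mathrm{CPC}}\, \langle p, \mathrm{Clk} \rangle, \\
& L_{\mathrm{CPA}}\, \langle p, \mathrm{Cnv} \rangle \le \langle p, \mathrm{Rev} \rangle \le U_{\mathrm{CPA}}\, \langle p, \mathrm{Cnv} \rangle, \\
& L_{\mathrm{PVR}}\, N \le \langle p, \mathrm{Imp} \rangle \le U_{\mathrm{PVR}}\, N, \\
& \sum_{k} p_{c,k} = 1 \ \ \forall c, \qquad p_{c,k} \ge 0 \ \ \forall c, k.
\end{aligned}
```

Each ratio constraint, such as CTR = clicks / impressions, has been multiplied through by its positive denominator, so that every constraint and the objective are linear in $p$.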

In this way, we simplify the complex continuous parameter optimization problem into a discrete, K-armed bandit style optimization problem. To construct this setup, we estimate the resulting business indicators only at the selected discrete parameter instances instead of at every possible parameter value. By focusing on this fixed set of rules, we are able to improve the accuracy of the estimations. Moreover, compared with a solution consisting of one single ranking function, a distribution over the fixed instances offers a much finer-grained spectrum of outcomes, since its outcome is essentially a linear combination of the outcomes of the ranking functions in the set. Finally, when applied online in a stochastic environment, a distribution over multiple instances works more smoothly and robustly than a single fixed one.

Output: A list of auction-winning ads.
foreach search request do
    Determine the category of the request;
    Sample a parameter configuration according to the selection probabilities of that category;
    Assign the sampled parameters to the ranking score function;
    foreach candidate ad in parallel do
        Estimate the click and conversion probabilities and evaluate the ranking score;
    end foreach
    Sort and filter the candidates by the ranking score;
    Calculate the click price for the top ads in the sorted set;
    Respond with the resulting ad list;
end foreach
Algorithm 1 Apply auction mechanism online
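Algorithm 1 can be sketched in a few lines. Because the actual ranking score is not disclosed, the sketch below substitutes a hypothetical one-parameter score, bid × pCTR^alpha, so that each parameter instance is a single exponent `alpha`. The function names, dictionary keys and the zero price for an uncontested last slot are likewise assumptions.

```python
import random


def ranking_score(bid, pctr, alpha):
    # Hypothetical stand-in for the undisclosed ranking score function.
    return bid * pctr ** alpha


def run_auction(candidates, alpha, num_slots):
    """GSP auction in ranking-score space; candidates are dicts with ad_id, bid, pctr."""
    ranked = sorted(
        candidates, key=lambda ad: ranking_score(ad["bid"], ad["pctr"], alpha), reverse=True
    )
    winners = []
    for i, ad in enumerate(ranked[:num_slots]):
        if i + 1 < len(ranked):
            nxt = ranked[i + 1]
            # Smallest bid that keeps this ad's score at or above the next ad's score.
            price = ranking_score(nxt["bid"], nxt["pctr"], alpha) / ad["pctr"] ** alpha
        else:
            price = 0.0  # a real system would apply a reserve price here
        winners.append({"ad_id": ad["ad_id"], "price": price})
    return winners


def serve_request(category, candidates, param_sets, probs, num_slots, rng):
    """Sample one parameter instance for the request's category, then rank and price."""
    alpha = rng.choices(param_sets[category], weights=probs[category], k=1)[0]
    return alpha, run_auction(candidates, alpha, num_slots)


ads = [
    {"ad_id": "a", "bid": 2.0, "pctr": 0.10},
    {"ad_id": "b", "bid": 3.0, "pctr": 0.05},
    {"ad_id": "c", "bid": 1.0, "pctr": 0.08},
]
param_sets = {"shoes": [0.0, 1.0]}  # candidate exponents for one hypothetical category
probs = {"shoes": [0.3, 0.7]}       # selection probabilities found by the optimizer
alpha, result = serve_request("shoes", ads, param_sets, probs, num_slots=2, rng=random.Random(0))
```

Sampling happens once per request, so every candidate in the same auction is scored and priced under the same parameter instance.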

3. Implementation

The key problem in constructing the problem setup in Eq.(3)…(10) is to evaluate the coefficients, which are determined by the auction outcomes and by stochastic user responses. Applying a candidate parameter directly to real online traffic for a short period yields only high-variance observations, yet applying it for a longer period is unaffordable, since a parameter that severely violates the business constraints would seriously damage the performance of the platform. We therefore carry out a biased but smooth estimation via offline replay simulation.

After estimating all the business indicators and setting up the coefficients in Eq.(3)…(10), it is fairly straightforward to solve the problem using the augmented Lagrangian method (Birgin2014). In addition, an entropy regularization term is added to the objective in Eq.(3) to make the results robust to errors in the performance indicators computed by offline auction simulation.

3.1. Offline Replay Simulation

In this work, we approximate the auction outcomes under different mechanism parameters by replaying the recorded online auctions with those parameters. Compared with applying the parameters to real online traffic, offline replay simulation is safe, since it has no effect on the online user experience or on advertisers’ ROI. More importantly, we are able to apply various ranking and pricing rules to exactly the same set of requests and ad candidates, whereas online evaluation of rules relies on splitting real traffic and therefore compares different sets of requests, which introduces variance.

Replay logs consist of the request context, the query and the user profile, and the information of the whole set of ads in the auction, including the predicted CTR, the predicted CVR and the bidding price. Under the hood, the ad serving module records each ad request, and the algorithmic module records the CTR and CVR predictions for the request as well as the bidding information of each ad candidate. The recorded data are then collected by the ETL infrastructure, aggregated and joined by request id, and eventually stored on Alibaba cloud, where we implement the offline simulation pipeline. The raw log data are organized as a table partitioned by hour and amount to hundreds of TBs each day.

Based on the replay log data, we can apply any ranking and pricing function to a snapshot of the online traffic and obtain a set of winning ads for each ad slot. By implementing the ranking and filtering logic to match its counterpart in the online ad serving module, we make the offline simulation of the auctions produce the same ranking and pricing results as the online system.

With the outcomes of simulated ranking and pricing, we estimate the expected user responses, from which we in turn estimate the business indicator metrics. We use the CTR and CVR predictions to approximate the expectations of user click and conversion.
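The replay step can be sketched as follows: every candidate parameter is applied to the same logged requests, and expected clicks, conversions and revenue are accumulated from the predicted CTR and CVR of the winning ads. The score bid × pCTR^alpha, the record layout and the function names are hypothetical, and position effects and calibration are ignored to keep the sketch short.

```python
def replay(logged_requests, alpha, num_slots):
    """Replay logged auctions under one parameter and accumulate expected outcomes.

    logged_requests: list of candidate lists; each candidate is a dict with
    'bid', 'pctr' (predicted CTR) and 'pcvr' (predicted CVR).
    """
    totals = {"impressions": 0, "clicks": 0.0, "conversions": 0.0, "revenue": 0.0}
    for candidates in logged_requests:
        ranked = sorted(candidates, key=lambda ad: ad["bid"] * ad["pctr"] ** alpha, reverse=True)
        for i, ad in enumerate(ranked[:num_slots]):
            if i + 1 < len(ranked):
                nxt = ranked[i + 1]
                price = nxt["bid"] * nxt["pctr"] ** alpha / ad["pctr"] ** alpha
            else:
                price = 0.0
            totals["impressions"] += 1
            totals["clicks"] += ad["pctr"]                    # expected clicks
            totals["conversions"] += ad["pctr"] * ad["pcvr"]  # expected conversions
            totals["revenue"] += price * ad["pctr"]           # pay-per-click revenue
    return totals


def replay_all(logged_requests, alphas, num_slots):
    """Evaluate every candidate parameter on exactly the same requests."""
    return {alpha: replay(logged_requests, alpha, num_slots) for alpha in alphas}


log = [[
    {"bid": 2.0, "pctr": 0.10, "pcvr": 0.10},
    {"bid": 3.0, "pctr": 0.05, "pcvr": 0.10},
    {"bid": 1.0, "pctr": 0.08, "pcvr": 0.10},
]]
outcomes = replay_all(log, alphas=[0.0, 1.0], num_slots=1)
```

Because both parameters see the identical request, any difference between `outcomes[0.0]` and `outcomes[1.0]` comes from the mechanism alone and not from traffic variation.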

However, due to systematic bias caused by simplifying assumptions and by variations in data distributions between training and serving time, the empirical mean of the predicted CTRs diverges from the actual CTR, especially as the position effect of the ad slot grows. This divergence biases our estimations of the metrics. We remedy this bias by statistically calibrating the predicted CTRs (McMahan2013) during replay simulation and evaluation of metrics, so that their empirical mean matches the actual CTR.

Figure 1. Calibration results of two ad slots.

We learn a calibration mapping for each advertisement slot position using isotonic regression (Zadrozny2001). The mapping is modeled as a piecewise-constant function that increases monotonically with the predicted CTR, which makes it a flexible approximator. The range of the input value is split into intervals. For each interval, given the number of ad impressions that fall within it and the actual CTR of these impressions, we learn a constant as the calibrated CTR corresponding to that interval of predicted CTR. The calibration learning task is formulated as:

(12) subject to

We set up and solve the above learning task for each ad slot position. We prepare the calibration samples by aggregating advertisement serving logs. For each ad impression at a given slot position, we extract the online predicted CTR and label whether the ad was clicked. We then group the impression data by interval bin and estimate the actual CTR of the samples that fall in each bin.
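A minimal sketch of such a calibration fit for one slot position is given below, using the pool-adjacent-violators procedure for isotonic regression. It assumes an impression-weighted squared-error objective, which is the standard form but is not confirmed by the text. The function names and the bin layout are also assumptions.

```python
import bisect


def fit_isotonic_calibration(bins):
    """Fit one calibrated CTR per predicted-CTR interval.

    bins: (impressions, actual_ctr) pairs ordered by increasing predicted-CTR
    interval, with impressions > 0. Returns non-decreasing calibrated values.
    """
    blocks = []  # each block: [total weight, weighted mean, number of bins pooled]
    for n, y in bins:
        blocks.append([float(n), float(y), 1])
        # Pool adjacent blocks while they violate monotonicity.
        while len(blocks) > 1 and blocks[-2][1] > blocks[-1][1]:
            w2, m2, c2 = blocks.pop()
            w1, m1, c1 = blocks.pop()
            w = w1 + w2
            blocks.append([w, (w1 * m1 + w2 * m2) / w, c1 + c2])
    calibrated = []
    for _w, mean, count in blocks:
        calibrated.extend([mean] * count)
    return calibrated


def calibrate(pctr, inner_edges, calibrated):
    """Map a predicted CTR to its calibrated value via the interval it falls in."""
    return calibrated[bisect.bisect_right(inner_edges, pctr)]


# Four intervals of predicted CTR separated by three inner edges (illustrative numbers).
inner_edges = [0.02, 0.05, 0.10]
table = fit_isotonic_calibration([(100, 0.01), (50, 0.04), (50, 0.02), (10, 0.08)])
```

In this example the second and third bins violate monotonicity (0.04 followed by 0.02), so they are pooled into their impression-weighted mean of about 0.03, while the first and last bins keep their observed CTRs.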

Figure 1 shows that calibration corrects the over-optimistically predicted CTRs and remediates the position bias in our platform.

With calibrated user response expectations, we calculate the business indicators on the whole dataset for each mechanism parameter in Eq.(3)…(10). With these steps, the construction of the problem setup is finished.

3.2. Regularization with Entropy Bonus

Solving the linear programming problem is fairly straightforward, since mature solvers obtain a globally optimal solution. However, because this is a data-driven approach, the coefficients in the problem setup are approximations, and the data distribution may vary from day to day. The combination of mechanism parameters that is optimal on the simulated data is therefore likely to be suboptimal online.

To remedy this problem, we introduce some additional prior information via regularization to prevent overfitting to the replay simulated data. In our case, we borrow the idea of entropy regularization (entropy bonus) from reinforcement learning research (Williams1991), where the entropy bonus was introduced to prevent convergence to a single choice of action and to enforce exploration.

We add this regularization term weighted by a hyper-parameter to the objective:
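A regularized objective of this kind might be written as below. The notation is assumed for illustration: $p_{c,k}$ is the probability of selecting the $k$-th parameter instance for category $c$, $\mathrm{Rev}_{c,k}$ is the corresponding simulated revenue, and $\lambda \ge 0$ is the regularization weight. The exact form of Eq.(13), including any normalization between the two terms, is not shown in the text.

```latex
\max_{p}\quad \sum_{c}\sum_{k} p_{c,k}\,\mathrm{Rev}_{c,k}
\;+\; \lambda \sum_{c} H(p_c),
\qquad
H(p_c) \;=\; -\sum_{k} p_{c,k} \log p_{c,k}
```

Since the entropy $H$ is concave, the objective remains concave in $p$, and larger values of $\lambda$ pull each category's distribution toward uniform.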


With the objective Eq.(13) and the constraints Eq.(4)…(10), the problem setup is a linearly constrained convex optimization problem. Among the many available methods, we use the augmented Lagrangian method for its simplicity.
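To make the solution procedure concrete, the sketch below applies an augmented Lagrangian method to a toy, single-category instance: maximize revenue plus entropy over a probability vector subject to linear inequality constraints. The inner minimization uses exponentiated-gradient steps on the simplex, which is a choice made for this sketch and not necessarily what the authors use. The function name, default hyper-parameters and data are all assumptions.

```python
import math


def solve_entropy_regularized(revenue, A, b, lam, rho=10.0, step=0.5,
                              outer_iters=60, inner_iters=200):
    """Maximize revenue.p + lam * H(p) subject to A p <= b, with p on the simplex.

    Augmented Lagrangian outer loop with exponentiated-gradient inner steps.
    `step` should be small relative to lam + rho * max|A|^2.
    """
    K, m = len(revenue), len(b)
    p = [1.0 / K] * K
    mu = [0.0] * m  # multipliers of the inequality constraints
    for _ in range(outer_iters):
        for _ in range(inner_iters):
            # Shifted multipliers max(0, mu_j + rho * g_j(p)), with g_j(p) = A_j.p - b_j.
            shifted = []
            for j in range(m):
                g = sum(A[j][k] * p[k] for k in range(K)) - b[j]
                shifted.append(max(0.0, mu[j] + rho * g))
            # Gradient of the augmented Lagrangian (in minimization form).
            grad = []
            for k in range(K):
                gk = -revenue[k] + lam * (math.log(max(p[k], 1e-300)) + 1.0)
                gk += sum(shifted[j] * A[j][k] for j in range(m))
                grad.append(gk)
            gmin = min(grad)  # shift exponents for numerical stability
            w = [p[k] * math.exp(-step * (grad[k] - gmin)) for k in range(K)]
            total = sum(w)
            p = [x / total for x in w]
        # Multiplier update.
        for j in range(m):
            g = sum(A[j][k] * p[k] for k in range(K)) - b[j]
            mu[j] = max(0.0, mu[j] + rho * g)
    return p, mu


# Toy data: three ranking functions with revenue 1, 2, 3 and a second metric 0.3, 0.2, 0.1.
# Requirement metric.p >= 0.2 is written as -metric.p <= -0.2.
p_opt, mu_opt = solve_entropy_regularized(
    revenue=[1.0, 2.0, 3.0], A=[[-0.3, -0.2, -0.1]], b=[-0.2], lam=0.1
)
```

In this toy instance the constraint reduces to requiring the first probability to be at least the third, so the revenue-only optimum is not unique. The entropy term breaks the tie, and the solver should settle near the uniform distribution.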

4. Experiments

In this section, we present experiments that measure the efficacy of the proposed method in auction mechanism optimization. We designed and launched several rounds of experiments on a fraction of the search traffic of the mobile search engine of the E-commerce platform. The baseline of the experiments is identical to the main production version of the ranking score function, which consists of manually tuned parameters for each of several thousand search categories. From the experimental results, we study the effects of various factors on the solution of the problem, and we summarize the motivations, the arrangements, the results and some analysis of our experiments.

For each experiment, we construct a particular setup of the constrained problem. We evaluate the estimated business indicators for all the fixed parameters via auction replay simulation on logged data from the past 14 days. In the same pipeline, we also simulate the baseline mechanism to estimate the baseline metrics on exactly the same requests. The upper and lower bound targets of the constraints in the problem setup Eq.(4)…(10) are set by scaling the simulated baseline business indicators up and down.
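The scaling of baseline indicators into constraint targets can be sketched as follows. The baseline values are made up for illustration, and the function name and the convention of `None` for a missing bound are assumptions.

```python
def constraint_targets(baseline, relative_bounds):
    """Turn relative targets (e.g. +1.5%, written 0.015) into absolute bounds.

    baseline: metric name -> simulated baseline value.
    relative_bounds: metric name -> (lower, upper) relative changes; None means no bound.
    """
    targets = {}
    for metric, (lower, upper) in relative_bounds.items():
        base = baseline[metric]
        targets[metric] = (
            None if lower is None else base * (1.0 + lower),
            None if upper is None else base * (1.0 + upper),
        )
    return targets


# Hypothetical simulated baseline: CTR of 4% and CPC of 2.0 currency units.
targets = constraint_targets(
    baseline={"CTR": 0.04, "CPC": 2.0},
    relative_bounds={"CTR": (0.015, None), "CPC": (-0.01, 0.0)},
)
```

Here a CTR target of +1.5% becomes a lower bound of about 0.0406, and a CPC range of -1% to 0 becomes the interval from about 1.98 to 2.0.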

We solve the constrained optimization using the augmented Lagrangian method, in which the dual variables generally indicate the difficulty of satisfying each constraint. With some particular constraint targets, the residual of a constraint term may be large, making the solution infeasible. This is expected, since the targets may lie outside the region spanned by the outcomes of all combinations of the ranking functions under valid distributions.

We launched the implementation of Algorithm 1 with each particular solution, as well as the baseline, in the online A/B test environment of our mobile search platform to handle 1% of the real search traffic. Each experiment runs for at least 24 hours to sufficiently gauge the business performance indicators.

4.1. Results and Analysis

The experimental results confirm the effectiveness of our approach in optimizing auction mechanisms with constraints and allow us to analyze the effects of various factors on it. Our approach generally meets the business requirements specified in the constraint targets while maximizing the revenue.

4.1.1. Calibration in simulation.

We check the effects of calibration in offline simulation by comparing the results of the problem setup with and without calibration. The experimental results show that calibration significantly improves the accuracy of the simulated metric estimates and produces online results that approximately comply with the constraint targets, as illustrated in Table 1.

Without calibration, the simulated results tend to be over-optimistic about the overall user response, i.e. CTR and CVR. This is because ads with high predicted CTRs and CVRs, which may be over-estimated, are more likely to win the auctions. This experiment thus also shows how the accuracy of the estimated objective and constraint terms in the problem setup influences the results.

Problem setup configuration: CTR 2.5%, CPC [-2%, 0], PVR [-0.5%, 0.5%], CVR [-1%, 1%], CPA [-1%, 1%]

Result metrics
Calibration   Revenue   CTR      CPC      PVR      CVR      CPA
with          +0.06%    +2.31%   -2.07%   -0.13%   -0.75%   -1.33%
without       1.09%     +0.89%   +0.48%   -0.28%   -0.92%   +1.12%
Table 1. With and without calibration

4.1.2. Efficacy with various constraint targets

Most importantly, we want to examine the effectiveness of the proposed method in conforming to specific requirements on business performance while maximizing revenue.

We specify different business requirements on the performance by varying the upper and lower bounds in the constraints of the problem setup. With the solutions of the various setups, we examine how the resulting metrics correlate with the targets.

The various arrangements of targets and the corresponding online experiment results are illustrated in Table 2. The metrics in the experiment results generally follow the designated constraint targets, though deviations do occur, owing to the limits of the ranking function and to overly aggressive target values.

For CVR and CPA, we use simpler constraint targets, because the expected number of conversions is less accurate: it is calculated from the expected number of clicks, which is itself an approximation.

Constraint target configurations                 Results
CTR     CPC        PVR             CVR   CPA     Revenue   CTR      CPC      PVR      CVR      CPA
1%      [-1%, 0]   [-0.5%, 0.5%]   0     0       +0.387%   +1.37%   -0.91%   -0.06%   +0.31%   -1.22%
1.5%    [-1%, 0]   [-0.5%, 0.5%]   0     0       +0.378%   +1.35%   -0.89%   -0.07%   +0.33%   -1.21%
1.5%    [-2%, 0]   [-0.5%, 0.5%]   0     0       +0.146%   +1.69%   -1.38%   -0.14%   +0.24%   -1.62%
2%      [-2%, 0]   [-0.5%, 0.5%]   0     0       -0.074%   +1.96%   -1.72%   -0.28%   +0.11%   -1.83%
2.5%    [-2%, 0]   [-0.5%, 0.5%]   0     0       -0.219%   +2.28%   -2.11%   -0.34%   -0.05%   -2.06%
3%      [-2%, 0]   [-0.5%, 0.5%]   0     0       infeasible constraints
Table 2. Various constraint targets

4.1.3. Regularization

We evaluate the effects of the entropy regularization term in the objective of the optimization problem Eq.(13). In the experiment, we choose one single set of constraints, whose targets are CTR ≥ 1.5%, -1% ≤ PPC ≤ 0, -0.5% ≤ PVR ≤ 0.5%, CVR ≥ 0 and CPA ≤ 0, where PPC denotes the price per click (CPC).

The experimental results are listed in Table 3. To show the effects of regularization, we list the estimated business indicators obtained by applying the solution to the replay simulation dataset in addition to the online experimental results.

The regularization term makes the offline simulated metrics suboptimal but, when set appropriately, it improves the robustness of the optimization result, which then performs better on online traffic.

Regularization   Simulated                               Online
weight           CTR      CPC      PVR      Revenue      CTR      CPC      PVR      Revenue
0                +2.62%   -1.35%   +0.74%   +1.98%       +0.78%   -1.25%   +0.16%   -0.32%
1e-4             +1.78%   -1.02%   +0.34%   +1.08%       +1.31%   -1.14%   -0.11%   +0.045%
1e-2             +1.53%   -0.79%   +0.11%   +0.84%       +1.35%   -0.89%   -0.07%   +0.378%
1                +1.24%   -0.65%   -0.08%   +0.501%      +1.13%   -0.73%   -0.15%   +0.241%
Table 3. Various regularization term

5. Conclusion

In this article, we present a constrained optimization formulation of the auction mechanism optimization problem for an E-commerce sponsored search platform. We show that this formulation is practical and applicable with a discretized parameterization of the auction mechanism. We illustrate the construction of the problem setup with calibrated offline simulation, and the objective with entropy regularization, which improves the robustness of the results.

From the experimental results, we conclude that our proposed method conforms approximately to the specified constraint targets while maximizing the revenue. Another contribution of our work is the auction simulation system built to accommodate our experiments, which is also extensible to other experiments that require replaying a large number of online auctions. We note that our method will cause some changes in bidding behavior once enforced, so the characteristics of the online data distribution will drift away from the snapshot of offline data. This problem is mitigated, however, because the models are updated on a daily basis, while most of the active campaigns on our platform last for days or even weeks.

The effectiveness of the proposed method is crucially affected by two factors. One is the accuracy of the offline simulation in estimating the business indicators. In future work, we will incorporate online evaluated performance indicators into the problem setup to improve this accuracy. The other factor is the fixed set of selected representative parameters. The outcome of the auctions spans the space defined by the outcomes of the individual points in the set. A significant boost in business performance requires judiciously selected candidates in the set, and even a new design of the ranking score function, which is also an important future direction of this work.


  • [1] Zoë Abrams and Michael Schwarz. Ad auction design and user experience. In Internet and Network Economics, pages 529–534, 2007.
  • [2] E. G. Birgin and J. M. Martínez. Practical Augmented Lagrangian Methods for Constrained Optimization. SIAM, 2014.
  • [3] Stephen Boyd and Lieven Vandenberghe. Convex Optimization. Cambridge University Press, 2004.
  • [4] Ben Roberts et al. Ranking and tradeoffs in sponsored search auctions. In Proceedings of the Fourteenth ACM Conference on Electronic Commerce, EC ’13, pages 751–766. ACM, 2013.
  • [5] Benjamin Edelman et al. Internet advertising and the generalized second price auction: Selling billions of dollars worth of keywords. 2007.
  • [6] Brendan McMahan et al. Ad click prediction: A view from the trenches. In Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’13, pages 1222–1230. ACM, 2013.
  • [7] Gagan Aggarwal et al. Truthful auctions for pricing search keywords. In Proceedings of the 7th ACM Conference on Electronic Commerce, EC ’06, pages 1–7. ACM, 2006.
  • [8] Su Yan et al. Beyond keywords and relevance: A personalized ad retrieval framework in e-commerce sponsored search. In Proceedings of The Web Conference, WWW’18. ACM, 2018.
  • [9] Yilei Wang et al. Multi-objective optimization for sponsored search. In Proceedings of the Sixth International Workshop on Data Mining for Online Advertising and Internet Economy, ADKDD ’12, pages 3:1–3:9. ACM, 2012.
  • [10] Sébastien Lahaie and R. Preston McAfee. Efficient ranking in sponsored search. In Internet and Network Economics, pages 254–265. Springer Berlin Heidelberg, 2011.
  • [11] Roger B. Myerson. Optimal auction design. Math. Oper. Res., 6(1):58–73, February 1981.
  • [12] Michael Ostrovsky and Michael Schwarz. Reserve prices in internet advertising auctions: A field experiment. In Proceedings of the 12th ACM Conference on Electronic Commerce, EC ’11, pages 59–60. ACM, 2011.
  • [13] Pingzhong Tang. Reinforcement mechanism design. In Proceedings of International Joint Conference on Artificial Intelligence, pages 5146–5150, 2017.
  • [14] Hal Varian. Position auctions. International Journal of Industrial Organization, 25(6):1163–1178, 2007.
  • [15] R. J. Williams and J. Peng. Function optimization using connectionist reinforcement learning algorithms. Connection Science, 3(3):241–268, 1991.
  • [16] Bianca Zadrozny and Charles Elkan. Obtaining calibrated probability estimates from decision trees and naive bayesian classifiers. In Proceedings of the Eighteenth International Conference on Machine Learning, ICML ’01, pages 609–616, 2001.