I Introduction
An online ad supply-side platform (SSP) that seeks to maximize revenue from an ad impression has traditionally implemented an auction waterfall [1, 2], where a sequence of auctions, each with different parameters, is held in order to sell the impression. The platform traverses the sequence of auction parameters, holding one auction after another, until a winning bid is found. When this occurs, the SSP stops the auction sequence and the impression is awarded to the winning bidder. For example, the first auction could be intended for buyers with exclusive, first-look access to the inventory. More information could be disclosed on the impression, and the reserve price would be higher. As one goes down the auction waterfall, less information would be disclosed to the buyers, and the reserve price would decrease. A schema of the process is shown in Fig. 1.
The auction waterfall for an ad impression can vary in length; a typical waterfall has three to five auctions. There are some ad impressions for which certain auctions in the waterfall are unlikely to produce a winning bid. Indeed, some ad impressions are not interesting for buyers, who may not even return bids in response to the bid requests. To simplify terminology, we shall also consider this to be a “lost” auction.
A simplistic implementation of the auction waterfall would be to run through the entire auction sequence regardless of the quality of the ad impression. This is not optimal for the SSP, since there is a transaction cost to running auctions, e.g., network and machine costs. A better approach is to hold only the auctions that are likely to succeed. More specifically, the SSP should hold an auction only if it has a positive expected net income (NI), where the net income is defined as the difference between the payoff and the transaction cost.
Auction theory is a rich field, and has found applicability in various fields like online ads [3, 4] and airline seat booking [5]. While single-decision or static auctions were studied at the beginning [6], there has been a trend towards studying dynamic auctions [7, 8, 9]. An advanced SSP would generate its auction waterfall dynamically to maximize revenue. To the authors’ best knowledge, however, optimizing the auction waterfall to maximize net income has not been considered in published literature before.
In this paper, we consider the case where the auction sequence is a priori unknown and seek, via a decision tree, to abort auctions that are unlikely to produce a positive NI for the ad impression. The abort decision is applied on an auction-by-auction basis.
II Problem description
An ad impression opportunity generated by a user landing on a publisher page creates an ad request \(r\). This is resolved by holding a sequence of auctions in order to sell the impression. Suppose that there are \(N\) auctions in the waterfall with publisher tags \(t_1, \ldots, t_N\). The publisher tag is a unique identifier that defines the auction parameters. \(N\) is not constant for all ad impressions, but is a configurable publisher setting. Thus, on an SSP, there are potentially waterfalls of different lengths for different publishers.
At the \(i\)th auction, given \(t_i\) and the ad request context, we want to derive a decision rule on whether or not to abort the auction so as to maximize the expected net income of the ad request, defined as the expected payoff minus the expected transaction cost. Define \(a_i \in \{0, 1\}\), \(i = 1, \ldots, N\), to be the abort flag for publisher tag \(t_i\). If \(a_i = 1\), we abort the \(i\)th auction; otherwise, we hold it. If the \(i\)th auction is aborted, the system continues to the next auction in the sequence and decides whether to hold that one. Alternatively, if the \(i\)th auction is carried out, the ad request is resolved if there is a successful bid; otherwise, the system proceeds to the next auction, just as in the case of \(a_i = 1\).
Let \(P(r)\) be the payoff and \(n(r)\) be the number of played auctions in the ad request \(r\). Clearly, \(n(r) \le N\). In order to simplify the problem, we assume that there is a fixed cost \(c\) to running an auction. This may not be true: for example, one could have a different number of bid requests for different publisher tags, and the network cost is proportional to the number of bid requests, cf. Fig. 1. We therefore want to maximize the expected net income, which is
\[ \mathrm{NI} = \mathbb{E}_r\left[P(r)\right] - c\,\mathbb{E}_r\left[n(r)\right], \tag{1} \]
where the expectation is taken over all ad requests \(r\). Suppose that \(N\) is known. To maximize \(\mathrm{NI}\), one would solve \(\max_{a_1, \ldots, a_N} \mathrm{NI}\) to obtain the optimal \(a_i\)’s.
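The waterfall mechanics and the net income of a single ad request can be illustrated with a short sketch. It assumes a fixed per-auction cost and that an aborted auction incurs no cost; the `AuctionOutcome` record, the `resolve_ad_request` function, and the numbers in the usage example are hypothetical and only serve the illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass
class AuctionOutcome:
    """Hypothetical record of what one auction in the waterfall would return."""
    tag: str                      # publisher tag identifying the auction parameters
    winning_bid: Optional[float]  # payoff if the auction is won, None if it is lost


def resolve_ad_request(waterfall: Sequence[AuctionOutcome],
                       abort_flags: Sequence[int],
                       cost: float) -> Tuple[float, int, float]:
    """Walk the waterfall and return (payoff, played auctions, net income)."""
    payoff, played = 0.0, 0
    for auction, abort in zip(waterfall, abort_flags):
        if abort == 1:
            continue              # aborted auctions are skipped at no cost (assumption)
        played += 1               # every held auction incurs the fixed cost
        if auction.winning_bid is not None:
            payoff = auction.winning_bid
            break                 # a winning bid resolves the ad request
    return payoff, played, payoff - cost * played


# Illustrative waterfall: the first two auctions are lost, the third is won.
waterfall = [AuctionOutcome("tag_a", None),
             AuctionOutcome("tag_b", None),
             AuctionOutcome("tag_c", 2.0)]
no_abort = resolve_ad_request(waterfall, [0, 0, 0], cost=0.5)
with_abort = resolve_ad_request(waterfall, [1, 1, 0], cost=0.5)
```

In this toy example, skipping the two auctions that would have been lost leaves the payoff unchanged while reducing the number of played auctions from three to one.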
In our application, we do not know \(N\) a priori, nor do we know the sequence of publisher tags. This lack of precise knowledge of the auction waterfall leads us to consider the problem of maximizing the expected net income knowing only the ad request context and the publisher tag of the current auction.
III Derivation of the optimal abort decision rule to maximize the expected net income
III-A Derivation
Suppose that we are at the auction with publisher tag \(t\) and want to decide if it is better to abort auction \(t\) or not. Here, \(P\) and \(n\) denote the payoff and the number of played auctions of the ad request, and \(c\) is the fixed cost per auction. Consider two cases.
Case 1: Abort auction \(t\). Assume that:

1) the expected payoff of the ad request is the same as in the case that auction \(t\) loses, and

2) the expected number of auctions associated with the ad request is one less than in the case that auction \(t\) loses.
The expected net income, taking into account the payoff and auction costs, is then
\[ \mathrm{NI}_{\text{abort}}(t) = \mathbb{E}\left[P \mid t,\ t \text{ loses}\right] - c\left(\mathbb{E}\left[n \mid t,\ t \text{ loses}\right] - 1\right), \tag{2} \]
where the conditional expectation is taken over all ad requests whose auction sequence includes the tag \(t\), and \(t\) loses.
Assumptions 1) and 2) may only approximately hold. For example, the auction sequence could be ordered so that the first few auctions maximize the revenue and the remaining auctions maximize the fill rate, i.e., the sale of the impression. Another possible scenario is that some buyers act strategically to withdraw or modify their bids in higher tiers of the waterfall since they anticipate being able to buy the impression more cheaply in a lower tier.
Case 2: Keep (hold) auction \(t\). The expected net income is simply
\[ \mathrm{NI}_{\text{keep}}(t) = \mathbb{E}\left[P \mid t\right] - c\,\mathbb{E}\left[n \mid t\right], \tag{3} \]
where the expectation is taken over all ad requests whose auction sequence includes the tag \(t\).
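One way to arrive at a decision rule of the form referred to as (4) is to compare the two cases using the law of total expectation. The following is a sketch under assumed notation: the win probability \(p_t\) and the shorthands \(\Delta P_t\) and \(\Delta n_t\) are symbols introduced here for illustration, and \(p_t > 0\) is assumed.

```latex
% Assumed notation (introduced for this sketch):
%   P, n : payoff and number of played auctions of the ad request
%   c    : fixed cost per auction
%   p_t  : probability that auction t wins, with p_t > 0
%   \Delta P_t = E[P | t, t wins] - E[P | t, t loses]
%   \Delta n_t = E[n | t, t wins] - E[n | t, t loses]
\begin{align*}
\mathbb{E}[P \mid t] &= p_t\,\mathbb{E}[P \mid t,\ t \text{ wins}]
                      + (1 - p_t)\,\mathbb{E}[P \mid t,\ t \text{ loses}], \\
\mathbb{E}[n \mid t] &= p_t\,\mathbb{E}[n \mid t,\ t \text{ wins}]
                      + (1 - p_t)\,\mathbb{E}[n \mid t,\ t \text{ loses}], \\
\mathrm{NI}_{\text{keep}}(t) - \mathrm{NI}_{\text{abort}}(t)
  &= p_t\,\Delta P_t - c\left(p_t\,\Delta n_t + 1\right), \\
\text{hold auction } t \iff \Delta P_t &\ge c\left(\Delta n_t + \frac{1}{p_t}\right).
\end{align*}
```

In this form, the left-hand side is the payoff difference between a win and a loss of auction \(t\), while the right-hand side vanishes when \(c = 0\) and grows without bound as \(p_t\) approaches zero.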
III-B Discussion
Consider some special cases in the application of the decision rule (4).

1) Zero transaction cost. Then, \(c = 0\), and the auction is held iff \(\mathbb{E}[P \mid t,\ t \text{ wins}] \ge \mathbb{E}[P \mid t,\ t \text{ loses}]\), where \(P\) is the payoff of the ad request. Irrespective of how small the win probability \(\Pr(t \text{ wins} \mid t)\) is, it is still better to hold auction \(t\).

2) Nonzero transaction cost and auction \(t\) has a very small win probability \(\Pr(t \text{ wins} \mid t)\). The RHS of (4) will be a large positive number and will most likely be higher than the LHS. In this case, it is better to abort auction \(t\).
It is important to realize that the optimal decision rule given in (4) maximizes the expected net income conditioned on knowing only the publisher tag \(t\). However, if other features are available, these should be used as additional conditioning variables, as the resulting expectations would generally be better estimators of the observed values. Consequently, one would expect a better abort decision rule.
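The two special cases above can be checked numerically with a small sketch of a tag-level abort rule. It compares the expected net income of holding an auction with that of aborting it under the two assumptions of Case 1, written in a form that avoids dividing by the win probability. The function names, argument names, and numbers are hypothetical; in practice the inputs would be estimated per publisher tag from historical auctions.

```python
def expected_gain_from_holding(win_prob: float,
                               payoff_win: float, payoff_lose: float,
                               n_win: float, n_lose: float,
                               cost: float) -> float:
    """Expected net income of holding the auction minus that of aborting it.

    payoff_win / payoff_lose: expected payoff of the ad request when the
        auction wins / loses.
    n_win / n_lose: expected number of played auctions when it wins / loses.
    """
    payoff_gap = payoff_win - payoff_lose
    length_gap = n_win - n_lose
    return win_prob * payoff_gap - cost * (win_prob * length_gap + 1.0)


def should_abort(win_prob: float, payoff_win: float, payoff_lose: float,
                 n_win: float, n_lose: float, cost: float) -> bool:
    """Abort when holding the auction has a negative expected gain."""
    return expected_gain_from_holding(
        win_prob, payoff_win, payoff_lose, n_win, n_lose, cost) < 0.0


# Zero cost: hold, however small the win probability.
hold_at_zero_cost = not should_abort(1e-6, 5.0, 1.0, 1.0, 3.0, cost=0.0)
# Nonzero cost and a tiny win probability: abort.
abort_when_unlikely = should_abort(1e-6, 5.0, 1.0, 1.0, 3.0, cost=0.01)
```

Working with the difference directly, rather than with a ratio involving the win probability, keeps the rule well defined even for a tag that has never won.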
III-C Adjustment for correlating effects
The assumptions in Sec. III-A may not hold if there is a correlation between tag \(t\) losing and the payoff \(P\) or the number of played auctions \(n\). Tag \(t\) losing may be correlated with a lower expected payoff. Then, the expected payoff if the auction were aborted would be higher than \(\mathbb{E}[P \mid t,\ t \text{ loses}]\). There is an opposite effect on \(n\): tag \(t\) losing may be correlated with a longer auction sequence. In this case, if the auction were aborted, the expected number of auctions would be smaller than \(\mathbb{E}[n \mid t,\ t \text{ loses}] - 1\). The net effect is to decrease the LHS of (4) and increase its RHS, making an abort decision more likely.
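The critical terms in the decision rule are the differences \(\mathbb{E}[P \mid t,\ t \text{ wins}] - \mathbb{E}[P \mid t,\ t \text{ loses}]\) and \(\mathbb{E}[n \mid t,\ t \text{ wins}] - \mathbb{E}[n \mid t,\ t \text{ loses}]\), where \(P\) is the payoff and \(n\) the number of played auctions. One way to reduce the correlating effect on the payoff is to reweigh observations in the calculation of the difference of expectations. Define \(m(r)\) to be the median bid of ad request \(r\). Instead of \(\mathbb{E}[P \mid t,\ t \text{ wins}] - \mathbb{E}[P \mid t,\ t \text{ loses}]\), compute
\[ \mathbb{E}_m\left[\, \mathbb{E}[P \mid t,\ t \text{ wins},\ m] - \mathbb{E}[P \mid t,\ t \text{ loses},\ m] \,\right], \tag{5} \]
where the outer expectation is over \(m\) and we set \(p(m \mid t,\ t \text{ wins}) = p(m \mid t,\ t \text{ loses}) = p(m \mid t)\), i.e., the distribution of \(m\) is independent of winning or losing. Having identical median bid distributions in either a win or loss brings us closer, one hopes, to the first assumption in Sec. III-A.
To perform the computation, one must obtain a clustering of the bid medians \(m\). Let \(\{B_k\}_{k=1}^{K}\) denote a partition of the bid median support, where \(B_j \cap B_k = \emptyset\) for all \(j \ne k\). Then, (5) can be calculated as
\[ \sum_{k=1}^{K} \pi_k \left( \mathbb{E}[P \mid t,\ t \text{ wins},\ m \in B_k] - \mathbb{E}[P \mid t,\ t \text{ loses},\ m \in B_k] \right), \tag{6} \]
where \(\pi_k\) is the proportion of ad requests whose \(m(r) \in B_k\). A similar computation can be carried out for \(\mathbb{E}[n \mid t,\ t \text{ wins}] - \mathbb{E}[n \mid t,\ t \text{ loses}]\). Here, we assume that \(m\) correlates with the number of auctions \(n\).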
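The binned reweighting described above can be sketched as follows. The record layout `(median_bid, won, payoff)`, the function name, and the bin edges are assumptions made for illustration. Bins that contain no win or no loss are skipped and the remaining weights renormalized, which is a simplification chosen here and not something stated in the text.

```python
from bisect import bisect_right
from statistics import mean
from typing import Dict, Iterable, List, Sequence, Tuple

Record = Tuple[float, bool, float]  # (median bid, auction won?, payoff)


def reweighted_payoff_gap(records: Iterable[Record],
                          bin_edges: Sequence[float]) -> float:
    """Weighted average over median-bid bins of the win-minus-loss payoff gap.

    bin_edges are sorted interior cut points, so K edges define K + 1 bins.
    """
    bins: Dict[int, Tuple[List[float], List[float]]] = {}
    for median_bid, won, payoff in records:
        k = bisect_right(bin_edges, median_bid)      # index of the bin
        wins, losses = bins.setdefault(k, ([], []))
        (wins if won else losses).append(payoff)

    # Keep only bins where both conditional means can be estimated.
    usable = [(w, l) for w, l in bins.values() if w and l]
    total = sum(len(w) + len(l) for w, l in usable)
    if total == 0:
        raise ValueError("no bin contains both a win and a loss")
    return sum((len(w) + len(l)) / total * (mean(w) - mean(l))
               for w, l in usable)


# Toy data: high median bids go with both more wins and higher payoffs.
records = [(0.5, True, 2.0), (0.5, False, 1.0), (0.5, False, 1.0),
           (2.0, True, 4.0), (2.0, True, 4.0), (2.0, True, 4.0),
           (2.0, False, 3.0)]
adjusted_gap = reweighted_payoff_gap(records, bin_edges=[1.0])
naive_gap = (mean(p for _, won, p in records if won)
             - mean(p for _, won, p in records if not won))
```

In this toy data the payoff gap inside each bin is 1.0, whereas the unadjusted difference of means is larger because wins are concentrated in the high-bid bin.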
IV Design of the decision tree
We noted in Sec. III-B that additional features beyond the publisher tag should be used in the abort decision rule. For example, the ad request context includes user information, which is critically important for buyers of an ad impression. These user features would consequently also be predictive of whether an auction is worth running or not.
Decision trees are well known in the machine learning community [10], having been used in different applications. We want to design an abort decision tree classifier using the ad request context and the publisher tag as the available features in order to classify an auction as either “abort” or “keep”, so \(a = 1\) and \(a = 0\) respectively. Note that the payoff and auction waterfall length associated with each auction in the leaves are functions of the ad request \(r\). The decision tree therefore incorporates auction sequence information.
A natural purity measure comes from the optimal decision rule (4), where the expected net income between “abort” and “keep” is compared. We propose using the absolute difference of the expected net income (ADENI) when the auction is held vs. when it is aborted. This is
\[ \mathrm{ADENI} = \left| \left( \mathbb{E}[P] - c\,\mathbb{E}[n] \right) - \left( \mathbb{E}[P \mid \text{auction loses}] - c\left( \mathbb{E}[n \mid \text{auction loses}] - 1 \right) \right) \right|, \tag{7} \]
where \(P\) is the payoff, \(n\) the number of played auctions, \(c\) the auction cost, and the expectations are taken over the observations in the node.
A larger ADENI is desirable. The split criterion is then the gain in the purity measure. Assuming an \(M\)-ary decision tree, and defining \(\mathrm{ADENI}(S)\) to be the ADENI computed over the set of observations \(S\), one obtains
\[ \sum_{m=1}^{M} \frac{|S_m|}{|S|}\,\mathrm{ADENI}(S_m) - \mathrm{ADENI}(S), \tag{8} \]
and \(\{S_m\}_{m=1}^{M}\) is a partition of \(S\). Two stopping heuristics are used: when the size of a node falls below a threshold, or when the increase in the ADENI purity criterion given by (8) falls below a threshold. ADENI can be adapted to (5) by simply changing the way that the differences in expectations are calculated.
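The purity measure and split gain can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the implementation used in the experiments: each observation is a hypothetical tuple `(lost, payoff, n_played)`, the children are weighted by their share of the parent's observations, and a node without any lost auction is given an ADENI of zero as a placeholder convention.

```python
from statistics import mean
from typing import List, Sequence, Tuple

Observation = Tuple[bool, float, int]  # (auction lost?, payoff, played auctions)


def adeni(observations: Sequence[Observation], cost: float) -> float:
    """Absolute difference of expected net income: keep vs. abort."""
    lost = [o for o in observations if o[0]]
    if not observations or not lost:
        return 0.0  # placeholder convention (assumption): abort value not estimable
    ni_keep = (mean(p for _, p, _ in observations)
               - cost * mean(n for _, _, n in observations))
    # Abort: same payoff as when the auction loses, one fewer auction played.
    ni_abort = (mean(p for _, p, _ in lost)
                - cost * (mean(n for _, _, n in lost) - 1))
    return abs(ni_keep - ni_abort)


def split_gain(parent: Sequence[Observation],
               children: Sequence[Sequence[Observation]],
               cost: float) -> float:
    """Size-weighted ADENI of the children minus the ADENI of the parent."""
    total = len(parent)
    weighted = sum(len(child) / total * adeni(child, cost) for child in children)
    return weighted - adeni(parent, cost)


# Toy split: one child mixes wins and losses, the other contains only losses.
child_a: List[Observation] = [(False, 4.0, 1), (True, 0.0, 2)]
child_b: List[Observation] = [(True, 1.0, 2), (True, 1.0, 2)]
gain = split_gain(child_a + child_b, [child_a, child_b], cost=0.0)
```

A candidate split would be accepted only if `gain` exceeds the stopping threshold and each child is larger than the minimum node size.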
V Experiments
A binary decision tree (\(M = 2\)) is estimated using the split criterion given by (8), and is benchmarked against the simple decision rule (4) that only takes into account the publisher tag. Define \(\mathrm{NI}(c, k)\) to be the net income assuming an auction cost of \(c\) in CPM and the abort decision rule \(k\), where \(k = 0, 1, 2\) for when there is no abort rule, the simple rule, or the decision tree rule respectively. To measure the performance of the decision rules \(k = 1, 2\), we use as performance metrics the NI delta change \(\mathrm{NI}(c, k) - \mathrm{NI}(c, 0)\) and the NI percent change \(100 \times (\mathrm{NI}(c, k) - \mathrm{NI}(c, 0)) / \mathrm{NI}(c, 0)\).
Experimental data is drawn from 50 publishers on the Oath publisher platform. The training dataset comprises randomly sampled data from one day’s worth of auctions and the test dataset comprises randomly sampled data from the following day. The training and test datasets have 195,122,188 auctions and 226,012,100 auctions respectively. We consider three possible values for the auction cost \(c\): $0.003663, $0.007326, and $0.010989 in Cost Per Mille (CPM). The nominal \(c\), $0.007326, is the estimated transaction cost based on historical billing data, and we consider values around the nominal value. The test dataset’s payoff is $8,897.82. Using the nominal transaction cost, it has a net income of $7,242.06.
The results are given in Table I below. We tested the adjustment in Sec. III-C to reduce the correlating effect of an auction loss on the payoff and auction sequence length. Since no significant difference was observed, these results are omitted for the sake of brevity. A possible explanation is that the assumptions in Case 1 of Sec. III-A approximately hold.
Abort rule  NI delta change  NI percent change
c = 0.3663¢ (CPM), NI with no abort rule = $8,069.94
Simple  $42.06  0.52%
Tree  $102.72  1.27%
c = 0.7326¢ (CPM), NI with no abort rule = $7,242.06
Simple  $187.31  2.59%
Tree  $355.71  4.91%
c = 1.0989¢ (CPM), NI with no abort rule = $6,414.17
Simple  $404.36  6.30%
Tree  $820.87  12.80%
The decision tree rule has better performance than the simple rule. As the auction cost \(c\) increases, the NI percent change increases for both abort rules. This is because, when \(c\) increases, its impact on the NI becomes more significant. If one is able to skip unsuccessful auctions, there is a noticeable savings gain to be realized.
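The baseline figures in Table I can be cross-checked with a few lines of arithmetic. The sketch below assumes that a CPM cost is charged per thousand auctions and that, without an abort rule, every auction in the test dataset is held; the constant and function names are illustrative.

```python
TEST_PAYOFF = 8897.82          # payoff of the test dataset, in dollars
TEST_AUCTIONS = 226_012_100    # number of auctions in the test dataset


def baseline_net_income(cost_cpm: float) -> float:
    """Net income with no abort rule: payoff minus the cost of all auctions."""
    return TEST_PAYOFF - cost_cpm / 1000.0 * TEST_AUCTIONS


def percent_change(delta: float, baseline: float) -> float:
    """NI percent change for a given NI delta change."""
    return 100.0 * delta / baseline


# Baseline net income at the three auction costs considered (in dollars CPM).
baselines = {c: round(baseline_net_income(c), 2)
             for c in (0.003663, 0.007326, 0.010989)}

# Percent change of the tree rule at the nominal cost, from its NI delta change.
tree_nominal_pct = round(percent_change(355.71, baselines[0.007326]), 2)
```

Under these assumptions the computed baselines agree with the net income values listed in the table's group headers, and the percent changes follow from dividing each NI delta change by the corresponding baseline.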
The savings gain varies from publisher to publisher. In Fig. 2, we plot histograms of the average NI delta in CPM dollars for each publisher at different values of the auction cost \(c\). The average NI delta is defined as the NI delta divided by the number of ad requests. From the histograms, this number generally increases in magnitude as the auction cost increases. The benefit of aborting an unprofitable auction appears to increase with the auction cost. Irrespective of \(c\), the simple rule always has several publishers with a negative average NI delta. The decision tree rule, which makes use of user-specific features, has nonnegative performance for all publishers at two of the three auction costs considered; at the remaining one, there are several publishers for which the average NI delta is negative.
VI Conclusions and future work
Under certain conditions, we derived the optimal auction abort decision rule to maximize the expected net income given only the publisher tag of the current auction and the ad request context. A purity measure and splitting criterion were proposed to construct an abort decision tree. In the experiment conducted, the decision tree rule, which takes into account additional features such as user-related features, performs better than the simple rule that uses only the publisher tag. In order to confirm the results of the experiments, we would have to run an A/B test. If we find that the opportunity cost of exploration is significant, we could consider contextual bandits.
A better abort decision rule could be achieved if more information were available regarding the auction sequence, e.g., if the auction sequence were known a priori, or if auction sequence probabilities were known or could be estimated. Indeed, the decision rule and decision tree derived in this work did not even make use of the auction index \(i\). To illustrate how much information is contained in additional knowledge of the waterfall, suppose that the SSP is at publisher tag \(t_i\) but knows that tag \(t_{i+1}\) has a higher expected payoff. It makes sense then to skip the current auction and move on to the next one in the sequence. Only a limited form of waterfall optimization has been explored in this work: rather than starting out with a suboptimal waterfall designed to maximize revenue and pruning off unprofitable auctions, one could design the auction sequence to maximize net income from the very start.
References
[1] J. Collette, A. S. Dilling, and T. Vu, “Auction tiering in online advertising auction exchanges,” US Patent Application US2014/0 006 170A1, Jan. 2, 2014.
[2] (2018, Apr.) Targeting ad sources. AOL. [Online]. Available: https://learn.onemobile.aol.com/hc/enus/articles/235673048
[3] P. Stone, R. E. Schapire, M. L. Littman, J. A. Csirik, and D. McAllester, “Decision-theoretic bidding based on learned density models in simultaneous, interacting auctions,” J. of Artificial Intelligence Research, vol. 19, pp. 209–242, 2003.
[4] M. Babaioff, J. D. Hartline, and R. D. Kleinberg, “Selling ad campaigns: Online algorithms with cancellations,” in Proc. of the 2009 ACM Conference on Electronic Commerce, 2009, pp. 61–70.
[5] J. Subramanian, S. Stidham Jr., and C. J. Lautenbacher, “Airline yield management with overbooking, cancellations, and no-shows,” Transportation Science, vol. 33, no. 2, pp. 147–167, 1999.
[6] R. B. Myerson, “Optimal auction design,” Mathematics of Operations Research, vol. 6, no. 1, pp. 58–73, 1981.
[7] G. Vulcano, G. van Ryzin, and C. Maglaras, “Optimal dynamic auctions for revenue management,” Management Science, vol. 48, no. 11, pp. 1388–1407, 2002.
[8] S. C. Fang, H. W. L. Nuttle, and D. Wang, “Fuzzy formulation of auctions and optimal sequencing for multiple auctions,” Fuzzy Sets and Systems, vol. 142, pp. 421–441, 2004.
[9] D. Bergemann and M. Said, “Dynamic auctions: A survey,” Cowles Foundation for Research in Economics, Discussion Paper No. 1757R, 2010.
[10] K. P. Murphy, Machine Learning: A Probabilistic Perspective. MIT Press, 2012.