Learning to Clear the Market

06/04/2019 ∙ by Weiran Shen, et al.

The problem of market clearing is to set a price for an item such that quantity demanded equals quantity supplied. In this work, we cast the problem of predicting clearing prices into a learning framework and use the resulting models to perform revenue optimization in auctions and markets with contextual information. The economic intuition behind market clearing allows us to obtain fine-grained control over the aggressiveness of the resulting pricing policy, grounded in theory. To evaluate our approach, we fit a model of clearing prices over a massive dataset of bids in display ad auctions from a major ad exchange. The learned prices outperform other modeling techniques in the literature in terms of revenue and efficiency trade-offs. Because of the convex nature of the clearing loss function, the convergence rate of our method is as fast as linear regression.


1 Introduction

A key difficulty in designing machine learning systems for revenue optimization in auctions and markets is the discontinuous nature of the problem. Consider the basic problem of setting a reserve price in a single-item auction (e.g., for online advertising): revenue steadily increases with price up to the point where all buyers drop out, at which point it suddenly drops to zero. The discontinuity may average away over a large market, but one is typically left with a highly non-convex objective.

We are interested in obtaining pricing policies for revenue optimization in a data-rich (i.e., contextual) environment, where each product is associated with a set of features. For example, in online display advertising, a product is an ad impression (an ad placement viewed by the user) which is annotated with features like geo-information, device type, cookies, etc. There are two main approaches to reserve pricing in this domain: one is to divide the feature space into well-defined clusters and apply a traditional (non-contextual) revenue optimization algorithm in each cluster [18, 9, 20, 19]. This is effectively a semi-parametric approach with the drawback that an overly fine clustering leads to data sparsity and an inability to learn across clusters. An overly coarse clustering, on the other hand, does not fully take advantage of the rich features available.

To overcome these difficulties, a natural alternative is to fit a parametric pricing policy by optimizing a loss function. The first instinct is to use revenue itself as a loss function, but this loss is notoriously difficult to optimize because it is discontinuous, non-convex, and has zero gradient over much of its domain—so one must look to surrogates. Medina and Mohri [14] propose a continuous surrogate loss for revenue whose gradient information is rich enough to optimize for prices. The loss is nevertheless non-convex so optimizing it relies on techniques from constrained DC-programming, which have provable convergence but limited scalability in high-dimensional contexts.

Main contribution. The main innovation in this paper is to address the revenue optimization problem by instead looking to the closely related problem of market clearing: how to set prices so that demand equals supply. The loss function for market clearing exhibits several nice properties from a learning perspective, notably convexity. The market clearing objective dates back to the economic theory of market equilibrium [2], and more recently arises in the literature on iterative auctions [4, 11, 3]. To our knowledge, our work is the first to use it as a loss function in a machine learning context.

The economic insight behind the market-clearing loss function allows us to adapt its shape to control how conservative or aggressive the resulting prices are in extracting revenue. To increase price levels, we can artificially increase demand or limit supply, which connects revenue optimization theorems from computational economics [5, 21] to regularization techniques under our loss function.

We begin by casting the problem of market clearing as a learning problem. Given a dataset where each record corresponds to an item characterized by a feature vector, together with buyer bids and seller asks for the item, the goal of the pricing policy is to quote a price that balances supply and demand; with a single seller, this simply means predicting a price between the highest and second-highest bids, which intuitively improves over the baseline of no reserve pricing.

This offers us a general framework for price optimization in contextual settings, but the objective function of market clearing is still disconnected from revenue optimization. Revenue is the aggregate price paid by buyers, while market clearing is linked to the problem of optimizing efficiency (realized value). Efficiency can be measured as social welfare (the total value of the allocated items), or more coarsely via the match rate (the number of cleared transactions). The platform faces a tension between trying to extract as much revenue as possible from buyers, while also leaving them enough surplus to discourage a move to competing platforms.

To better understand the trade-off between revenue and efficiency, we consider the linear programming duality between allocation and pricing and observe that a natural parameter that trades off revenue for efficiency is the available supply. Artificially limiting supply (or increasing demand) allows one to control the aggressiveness of the resulting clearing prices output by the model. This fundamental idea has been used multiple times in algorithmic game theory to design approximately revenue-optimal auctions [12, 9, 21, 10]. Translating this intuition to our setting, a simple modification of the primal (allocation) linear program has the effect of restricting the supply. In the dual (pricing) linear program, this is equivalent to adding a regularization term to the market-clearing objective function.

The focus of this paper is empirical. As our main application, we use this methodology to optimize reserve prices in display advertising auctions. We demonstrate the efficacy of the market clearing loss for reserve pricing by experimentally comparing it with other strategies on a real-world data set. Coupled with the experimental evaluation, we establish some theoretical guarantees on match rate and efficiency for the optimal pricing policy under clearing loss. The theory provides guidance on how to set the regularization parameters and we investigate how this translates to the desired trade-offs experimentally.

Experimental results. We evaluate our method against a linear-regression-based approach on a dataset consisting of over 200M auction records from a major display advertising exchange. The features are represented as 84K-dimensional sparse vectors and contain information such as the website on which the ad will be displayed, device and browser type, and country of origin. As benchmarks we consider standard linear regression on either the highest or second-highest bid, and models fit using the surrogate revenue loss proposed by Medina and Mohri [14]. We find that our method Pareto-dominates the benchmarks in terms of the trade-off between revenue and match rate or social welfare. For example, relative to the best revenue obtained by the regression approach, we can obtain a pricing function with at least the same revenue but 5% higher social welfare and 10% higher match rate. We also find that the convergence rate of fitting models under our loss function is as fast as a standard linear regression. In comparison, the surrogate loss of Medina and Mohri [14] has much slower convergence due to its non-convexity.

Related work. There is a large body of literature on learning algorithms for optimizing revenue; however, most of this literature deals with the non-contextual setting. Cole and Roughgarden [8], Morgenstern and Roughgarden [17, 16], Paes Leme et al. [19] study the batch-learning non-contextual problem. Roughgarden and Wang [20] study the non-contextual problem in both the online and batch learning settings. Cesa-Bianchi et al. [6] study it as a non-contextual online learning problem. Finally, there has been a lot of recent interest in the contextual online learning version [1, 7, 13], but those ideas are not applicable to the batch-learning setting.

Closest to our work are Medina and Mohri [14] and Medina and Vassilvitskii [15], who also study contextual reserve price optimization in a batch-learning setting. Medina and Mohri [14] prove generalization bounds, define a surrogate loss as a continuous approximation to the revenue loss, and propose an algorithm with provable convergence based on DC programming. The algorithm, however, requires solving a convex program in each iteration. Medina and Vassilvitskii [15] propose a clustering-based approach, which involves the following steps: learning a least-squares predictor of the bid, clustering the feature space based on the linear predictor, and optimizing the reserve using a non-contextual method in each cluster.

2 Market Clearing Loss

This section introduces our model, proceeding from the general to the specific. We first explain the duality between allocation and pricing, which motivates the form of the loss function to fit clearing prices, and provides useful economic insights into how the input data defines its shape. We next define the formal problem of learning a clearing price function in an environment with several buyers and sellers. We then specialize to a single-item, second-price auction (multiple buyers, single seller).

Allocation and Pricing

We consider a market with $n$ buyers and $m$ sellers who aim to trade quantities of an item (e.g., a stock or commodity) among themselves. Each buyer $i$ is defined by a pair $(b_i, d_i)$ where $b_i$ is a bid price and $d_i$ is a quantity. The interpretation is that the buyer is willing to buy up to $d_i$ units of the item at a price of at most $b_i$ per unit. Similarly, each seller $j$ is defined by a pair $(a_j, s_j)$ where $a_j$ is an ask price and $s_j$ is the quantity of the item the seller can supply. The ask price can be viewed as a cost of production, or as an outside offer available to the seller, so that the seller will decline to sell units for any price less than its ask.

The allocation problem associated with the market is to determine quantities of the item supplied by the sellers, and consumed by the buyers, so as to maximize the gains from trade—value consumed minus cost of production. Formally, let $x_i$ be the quantity bought by buyer $i$ and $y_j$ the quantity sold by seller $j$. The optimal gains from trade are captured by the (linear) optimization problem:

$$\max_{x, y} \; \sum_{i=1}^{n} b_i x_i - \sum_{j=1}^{m} a_j y_j$$

$$\text{s.t.} \quad \sum_{i=1}^{n} x_i \le \sum_{j=1}^{m} y_j, \tag{1}$$

$$0 \le x_i \le d_i, \qquad 0 \le y_j \le s_j.$$

The optimization is straightforward to solve: the highest bid is matched with the lowest ask, and the two agents trade as much as possible between each other. The process repeats until the highest bid falls below the lowest ask. The purpose of the linear programming formulation is to consider its dual, which corresponds to a pricing problem:

$$\min_{p \ge 0} \; \sum_{i=1}^{n} d_i \,(b_i - p)_+ + \sum_{j=1}^{m} s_j \,(p - a_j)_+ \tag{2}$$

where $(z)_+$ denotes $\max(z, 0)$. The optimal dual solution corresponds to a price that balances demand and supply, which is the central concept in this paper.
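The following is a minimal sketch, with illustrative names of our own, pairing the two problems above: a greedy procedure solves the allocation LP, and a scan over the kinks of the piecewise-linear dual objective recovers a clearing price. By strong LP duality the two optimal values coincide.

```python
import numpy as np

def gains_from_trade(bids, demands, asks, supplies):
    """Greedy primal: match highest bid with lowest ask while bid > ask."""
    buyers = sorted(zip(list(bids), list(demands)), reverse=True)
    sellers = sorted(zip(list(asks), list(supplies)))
    gains, i, j = 0.0, 0, 0
    while i < len(buyers) and j < len(sellers) and buyers[i][0] > sellers[j][0]:
        q = min(buyers[i][1], sellers[j][1])        # trade as much as possible
        gains += q * (buyers[i][0] - sellers[j][0])
        buyers[i] = (buyers[i][0], buyers[i][1] - q)
        sellers[j] = (sellers[j][0], sellers[j][1] - q)
        if buyers[i][1] == 0: i += 1                # buyer fully served
        if sellers[j][1] == 0: j += 1               # seller exhausted
    return gains

def clearing_loss(p, bids, demands, asks, supplies):
    """Dual objective (2) evaluated at price p."""
    return (sum(d * max(b - p, 0.0) for b, d in zip(bids, demands))
            + sum(s * max(p - a, 0.0) for a, s in zip(asks, supplies)))

def clearing_price(bids, demands, asks, supplies):
    # The dual is convex piecewise linear; some minimizer lies at a bid or ask.
    kinks = list(bids) + list(asks)
    return min(kinks, key=lambda p: clearing_loss(p, bids, demands, asks, supplies))

bids, demands = [5.0, 4.0, 1.0], [1.0, 1.0, 1.0]
asks, supplies = [2.0, 3.0], [1.0, 1.0]
p = clearing_price(bids, demands, asks, supplies)
print(gains_from_trade(bids, demands, asks, supplies))   # 4.0
print(clearing_loss(p, bids, demands, asks, supplies))   # 4.0, matching the primal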

Definition 1.

A price $p$ is a clearing price if, for any optimal solution $(x, y)$ to the allocation problem, we have

$$x_i \in \operatorname*{arg\,max}_{0 \le x \le d_i} \;(b_i - p)\,x \quad \text{for each buyer } i, \qquad y_j \in \operatorname*{arg\,max}_{0 \le y \le s_j} \;(p - a_j)\,y \quad \text{for each seller } j.$$

In words, a clearing price balances supply and demand by ensuring that, at an optimal allocation, each buyer buys a quantity that maximizes its utility (value minus price), and similarly each seller sells a quantity that maximizes its profit (price minus cost). In the current simple setup with a single item, buyer $i$ will buy $d_i$ units if $b_i > p$, zero units if $b_i < p$, and is indifferent to the number of units bought at $b_i = p$; similarly for each seller $j$. However, the concept of clearing prices—where each agent maximizes its utility at the optimal allocation—generalizes to much more complex allocation problems with multiple differentiated items and nonlinear valuations over bundles of items [4].

The fact that a clearing price exists, and can be obtained by solving (2), follows from standard LP duality. The complementary slackness conditions relating an optimal primal solution to an optimal dual solution amount exactly to the conditions of Definition 1. The optimal solution to the dual corresponds to a Lagrange multiplier for constraint (1), which equates demand and supply.

Learning Formulation

To cast market clearing in a learning context, we consider a generic feature space $\mathcal{X}$, with the label space consisting of bid and ask vectors $(b, a)$. For the sake of simplicity, we develop our framework assuming that the number of buyers and sellers remains fixed (at $n$ and $m$), and that the item quantity that each agent demands or supplies ($d_i$ or $s_j$) is also fixed. This information is straightforward to incorporate into the label space if needed, and our results can be adapted accordingly. The objective is to fit a price predictor $p : \mathcal{X} \rightarrow \mathbb{R}_+$ (also called a pricing policy) to a training set of data drawn from $\mathcal{X}$, to achieve good prediction performance on separate test data drawn from the same distribution as the training data.

As a concrete example, the training data could consist of bids and asks for a stock on a financial exchange throughout time, and the features might be recent economic data on the company, time of day or week, etc. The clearing problem here is equivalent to predicting a price within each datapoint’s bid-ask spread given the features. As another example, the data could consist of bids for ad impressions on a display ad exchange, and the features might be contextual information about the website (e.g., topic) and user (e.g., whether she is on mobile or desktop). The clearing problem there reduces to predicting a price between the highest and second-highest bids.

Based on our developments so far, the correct loss function to fit clearing prices is given by (2), which we call the clearing loss:

$$\ell(p; b, a) = \sum_{i=1}^{n} d_i \,(b_i - p)_+ + \sum_{j=1}^{m} s_j \,(p - a_j)_+.$$

Figure 1 illustrates the shape of the clearing loss (in green) on a small instance with several buyers and sellers. Note that although the lowest bid lies below all of the sellers' asks, it still contributes to the shape of the loss. Here any price in the flat interval at the bottom of the curve is a clearing price. If we add an extra buyer, the loss curve tilts to the right (in blue) and the clearing price becomes unique; since there is more demand, the clearing price increases. If we instead add an extra seller, the curve tilts to the left (in pink) and the clearing price decreases, with a new interval of clearing prices. This example hints at a way to control the aggressiveness of the price function fit to the data, by artificially adjusting demand or supply.

Figure 1: Effect on the shape of the clearing loss when adding a buyer or a seller.

Over a training set of data $\{(x^{(t)}, b^{(t)}, a^{(t)})\}_{t=1}^{T}$, model fitting consists of computing a pricing policy $p_\theta$ that minimizes the overall loss $\sum_{t} \ell(p_\theta(x^{(t)}); b^{(t)}, a^{(t)})$. Under a limited number of contexts $x$, it may be possible to directly compute optimal clearing prices, or even revenue-maximizing reserve prices, based on the bid distributions in each context [8, 18]. But this kind of nonparametric approach quickly runs into difficulties when there is a large number of contexts or even continuous features, where issues of data sparsity and discretization arise. Our formulation allows one to impose some structure on the pricing policy (e.g., a linear model or neural net) whenever this aids with generalization.
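As a concrete illustration, the sketch below fits a linear pricing policy $p_\theta(x) = \theta \cdot x$ by stochastic subgradient descent on the clearing loss. The synthetic data, hyperparameters, and names are our own assumptions, not the paper's implementation; it is meant only to show that the convex loss admits a standard first-order fitting loop.

```python
import numpy as np

rng = np.random.default_rng(0)
T, k, n = 1000, 10, 5                 # records, feature dim, buyers per record
X = rng.normal(size=(T, k))
base = X @ rng.normal(size=k)         # context-dependent price level (assumed)
B = base[:, None] + rng.gumbel(size=(T, n))   # n bids per record
c, s = 0.0, 1.0                       # single seller: cost 0, unit supply

theta, lr = np.zeros(k), 0.05
for step in range(2000):
    t = rng.integers(T)               # stochastic subgradient step
    p = X[t] @ theta
    # d/dp of sum_i (b_i - p)_+ + s * (p - c)_+ :
    grad_p = -np.sum(B[t] > p) + s * (p > c)
    theta -= lr * grad_p * X[t]

p_hat = X @ theta
loss = np.mean(np.sum(np.maximum(B - p_hat[:, None], 0), axis=1)
               + s * np.maximum(p_hat - c, 0))
print("average clearing loss:", loss)
```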

From a learning perspective, clearing loss has several attractive properties. It is a piecewise-linear convex function of the price, where the kink locations are determined by the bids and asks. The magnitude of its derivatives depends only on the buyer and seller quantities, which makes it robust to any outliers in the bids or asks. By its derivation via LP duality, its optimal value equals the optimal gains from trade, which are easy to compute. This gives a reference point for quantifying how well a price function fits any given dataset.

Reserve Pricing

As a practical application of the clearing loss, we consider the problem of reserve pricing in a single-item, second-price auction. In this setting every buyer demands a single unit ($d_i = 1$), and there is a single seller ($m = 1$) with cost $c$. The seller also has unit supply, but we still parametrize its quantity by $s$ to allow some control over the shape of the loss.

We write $b^{(1)}$ and $b^{(2)}$ to denote the highest and second-highest bids, respectively. In a single-item second-price auction, the item is allocated to the highest bidder as long as $b^{(1)} \ge c$, and the winner is charged $\max(b^{(2)}, c)$. Second-price auctions are extremely common and have so far been the dominant format for selling display ads online through ad exchanges, among countless other applications. It is common in second-price auctions for the seller to set a reserve price, a minimum price that the winning bidder is charged. The cost $c$ is itself a reserve price, but the seller may choose to increase this to some price $p > c$ in an attempt to extract more revenue, at the risk of leaving the item unsold if it turns out that $p > b^{(1)}$. Revenue as a function of $p$ can be negated to define a loss, which we denote $\ell_{\mathrm{rev}}$:

$$\ell_{\mathrm{rev}}(p; b) = -\max(b^{(2)}, p) \cdot \mathbb{1}\{b^{(1)} \ge p\}.$$

However, this loss is notoriously difficult to optimize directly, because it is non-convex and even discontinuous, and its gradient is 0 except over a possibly narrow range between the highest and second-highest bids. Clearing loss represents a promising alternative for reserve pricing because any price between $b^{(2)}$ and $b^{(1)}$ is a clearing price, so a correct clearing price prediction should intuitively improve over the baseline of no reserve beyond the cost $c$. The clearing loss in the auction setting takes the form:

$$\ell(p; b) = \sum_{i=1}^{n} (b_i - p)_+ + s\,(p - c)_+ \tag{3}$$

In practical applications of reserve pricing it is often desirable to achieve some degree of control over the match rate—the fraction of auctions where the item is sold—and the closely related metric of social welfare—the aggregate value of the items sold, where value is captured by the winning bid $b^{(1)}$. Formally, these concepts are defined as follows, where the notation $\mathbb{1}\{\cdot\}$ is 1 if its predicate is true and 0 otherwise.

Definition 2.

On a single data point, the match rate at price $p$ is $\mathbb{1}\{b^{(1)} \ge p\}$ and the social welfare is $b^{(1)} \cdot \mathbb{1}\{b^{(1)} \ge p\}$.
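In code, the per-record metrics look as follows; this is a sketch in our notation, with revenue included alongside the two quantities of Definition 2 for completeness.

```python
import numpy as np

def metrics(p, b, c=0.0):
    """Per-record match rate, welfare, and revenue in a second-price
    auction with reserve p, bid vector b, and seller cost c."""
    b1, b2 = np.sort(b)[-1], np.sort(b)[-2]   # highest and second-highest bids
    match = float(b1 >= p)                    # match rate indicator
    welfare = b1 * match                      # social welfare
    revenue = max(b2, p, c) * match           # price paid by the winner
    return match, welfare, revenue

print(metrics(p=3.5, b=np.array([5.0, 3.0, 1.0])))  # (1.0, 5.0, 3.5)
```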

As with the revenue objective, match rate and social welfare are discontinuous and their gradients are almost everywhere 0, so they are not directly suitable for model fitting via convex optimization (i.e., one has to look to surrogates).

Note that the clearing loss (3) effectively contains a term that approximately regularizes according to match rate. The seller's term $s\,(p - c)_+$ can be viewed as a hinge-type surrogate for match rate, since any setting of $p$ above $c$ risks impacting match rate. Increasing $s$ improves match rate, in line with the earlier economic intuition that increasing seller supply shifts the clearing price downwards. Symmetrically, $s$ can be decreased below its default unit value, within a range where the loss remains convex, which is equivalent to proportionally increasing each buyer's demand. According to the economic intuition, this shifts the clearing price upwards at the expense of match rate. The fact that the relevant range and units of the regularization weight are understood is very convenient in practice. In the next section, we derive a quantitative link between $s$ and match rate.
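The following toy illustration, under assumed bid values of our own choosing, shows the knob in action: larger $s$ pushes the minimizer of the auction clearing loss (3) down toward the cost, smaller $s$ pushes it up toward the top bid.

```python
import numpy as np

bids, c = np.array([5.0, 4.0, 1.0]), 0.0

def loss(p, s):
    # Auction clearing loss (3) with supply parameter s.
    return np.sum(np.maximum(bids - p, 0)) + s * max(p - c, 0)

grid = np.linspace(0, 6, 601)
for s in [0.5, 1.0, 2.0]:
    p_star = grid[np.argmin([loss(p, s) for p in grid])]
    print(f"s = {s}: fitted price = {p_star:.2f}")
# s = 0.5 -> 5.00, s = 1.0 -> 4.00, s = 2.0 -> 1.00
```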

3 Theoretical Guarantees

In this section we prove approximation guarantees on the match rate and efficiency performance of models fit using the clearing loss. The results of this analysis will provide guidelines for setting the regularization parameters for fine-grained control of the match rate.

We begin by characterizing the optimal pricing policy under clearing loss when there is no restriction on the policy structure, assuming that bids and costs are drawn independently (but not necessarily identically).

Proposition 3.

If, conditioned on each feature vector $x$, the bid and cost distributions are given by $F_i(\cdot \mid x)$ and $G_j(\cdot \mid x)$, then the pricing policy $p^*(x)$ that optimizes clearing loss is the solution to

$$\sum_{i=1}^{n} d_i \left(1 - F_i(p \mid x)\right) = \sum_{j=1}^{m} s_j \, G_j(p \mid x),$$

which is the policy that balances expected supply and demand.

Proof.

We can write the expectation of the market clearing loss function as follows:

$$\mathbb{E}[\ell(p)] = \sum_{i=1}^{n} d_i \, \mathbb{E}\left[(b_i - p)_+\right] + \sum_{j=1}^{m} s_j \, \mathbb{E}\left[(p - a_j)_+\right] = \sum_{i=1}^{n} d_i \int_p^{\infty} \left(1 - F_i(t \mid x)\right) dt + \sum_{j=1}^{m} s_j \int_{-\infty}^{p} G_j(t \mid x)\, dt.$$

Taking the derivative with respect to $p$ and setting it to zero leads to the result in the statement:

$$-\sum_{i=1}^{n} d_i \left(1 - F_i(p \mid x)\right) + \sum_{j=1}^{m} s_j \, G_j(p \mid x) = 0. \qquad \square$$

We now consider the single-item auction setting where $m = 1$ and $d_i = 1$ for all buyers. For simplicity, also assume that $c = 0$, which implies $G(p) = 1$ for all $p \ge 0$. In that case we can bound the match rate by a simple formula.

Proposition 4.

In the setup with a single seller with supply $s$ and cost $c = 0$, and independent buyer distributions, the expected match rate under the optimal clearing price policy is at least $1 - e^{-s}$.

Proof.

A transaction clears if there is at least one buyer with valuation above the price $p^*(x)$, which happens with probability $1 - \prod_i F_i(p^*(x) \mid x)$. Since the optimal policy is the solution of $\sum_i (1 - F_i(p \mid x)) = s$ by the previous proposition, we can bound the match rate as follows:

$$1 - \prod_{i=1}^{n} F_i(p^* \mid x) \;\ge\; 1 - \left(\frac{1}{n}\sum_{i=1}^{n} F_i(p^* \mid x)\right)^{n} \;=\; 1 - \left(1 - \frac{s}{n}\right)^{n} \;\ge\; 1 - e^{-s},$$

where the first inequality follows from the arithmetic-geometric mean inequality. ∎
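A quick Monte Carlo sanity check of this bound, under an assumed model of i.i.d. uniform bids (so the balance condition has the closed-form solution $p = 1 - s/n$):

```python
import numpy as np

n = 5                                   # buyers with b_i ~ Uniform[0, 1] (assumed)
bids = np.random.default_rng(0).uniform(size=(100_000, n))
for s in [0.5, 1.0, 2.0]:
    p = 1.0 - s / n                     # solves n * (1 - F(p)) = s for F(p) = p
    match_rate = np.mean(bids.max(axis=1) >= p)
    print(f"s = {s}: match rate {match_rate:.3f} >= bound {1 - np.exp(-s):.3f}")
```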

The preceding proposition provides a useful guideline on how to set the regularization parameter $s$ to achieve a certain target match rate. We can also obtain a similar bound for social welfare:

Corollary 5.

In the setting of the previous proposition, the social welfare obtained by the optimal clearing price policy is at least a $1 - e^{-s}$ fraction of the optimal social welfare, which is obtained by setting no reserve.

Proof.

This follows from the fact that $\mathbb{E}\left[b^{(1)} \cdot \mathbb{1}\{b^{(1)} \ge p^*\}\right] \ge \mathbb{E}[b^{(1)}] \cdot \Pr[b^{(1)} \ge p^*] \ge (1 - e^{-s}) \, \mathbb{E}[b^{(1)}]$, where the first inequality holds because $b^{(1)}$ and $\mathbb{1}\{b^{(1)} \ge p^*\}$ are both nondecreasing functions of $b^{(1)}$. ∎

Another interesting corollary is that when buyers are i.i.d., fitting a clearing price is equivalent to fitting a certain quantile of the common bid distribution.

Corollary 6.

In the setup of the previous proposition with i.i.d. buyers, the optimal clearing price policy is to set the price at $p^*(x) = F^{-1}(q \mid x)$ where $q = 1 - s/n$.

This result makes explicit how varying $s$ in the clearing loss tunes the aggressiveness of the resulting price function, by moving up or down the quantiles of the bid distribution. In particular, it is possible to span all quantiles using $s \in [0, n]$. Fitting clearing prices is not exactly equivalent to quantile regression, since the relevant quantile depends on the number of buyers, which is a property of the data and not fixed in advance.
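Corollary 6 in code form, under an assumed lognormal bid model with i.i.d. samples: the optimal clearing price is simply an empirical $(1 - s/n)$-quantile of the common bid distribution.

```python
import numpy as np

n, s = 5, 1.0
bids = np.random.default_rng(1).lognormal(size=100_000)   # assumed bid model
p_star = np.quantile(bids, 1.0 - s / n)                   # q = 1 - s/n
print(f"quantile q = {1 - s/n:.2f}, fitted price = {p_star:.3f}")
```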

4 Empirical Evaluation

In this section we evaluate our approach of using predicted clearing prices as a reserve pricing policy in second-price auctions. We collected a dataset of auction records by sampling a fraction of the logs from Google’s Ad Exchange over two consecutive days in January 2019. Our sample contains over 100M records for each day. In display advertising, online publishers (e.g., websites like nytimes.com) can choose to request an ad from an exchange when a user visits a page on their site. The exchange runs a second-price auction (the most common auction format) among eligible advertisers, possibly with a reserve price.

We clip bid vectors to the 5 highest bids. As the publisher cost $c$, we use a reserve price available in the data which is meant to capture the opportunity cost of not showing ads from other sources besides the exchange, in line with our model. (A common alternative source of display ads besides exchanges are reservation contracts: advertiser-publisher agreements to show a fixed volume of ads over a time period, with a penalty to the publisher if the contract is not fulfilled.) We also excluded additional sources of reserve prices from the dataset: (a) reserve prices configured by publishers reflecting business objectives like avoiding channel conflict (i.e., protecting the value of inventory sold through other means) and (b) automated reserve prices set by the exchange. Reserve prices are only relevant conditional on the top bid exceeding the publisher cost, so the auction records were filtered to satisfy this condition. When reporting our results this means that the baseline match rate without any reserve pricing is 100%, so we will refer to it as the relative match rate in our plots to emphasize this fact.

All the models we evaluate are linear models of the price as a function of features $x$ of the auction records; the only difference between the models is the loss function used to fit each one, so as to focus on the impact of the choice of loss function. (We evaluate the models by simulating the effect of the new reserves on a test dataset. The simulation does not take into account possible strategic responses on the part of buyers; however, since the auction format is a second-price auction, it is a dominant strategy for buyers to bid truthfully.) The features we used include: publisher id, device type (mobile, desktop, tablet), OS type (e.g., Android or iOS), country, and format (video or display). For sparse features like publisher id we used a one-hot encoding for the most common ids and an 'other' bucket for ids in the tail. The models were all fit using TensorFlow with the default Adam optimizer and minibatches of size 512 distributed over 20 machines. An iteration corresponds to one minibatch update on each machine, and therefore 512 × 20 = 10,240 data points. The models were all trained for at least 400K iterations, although for some models convergence occurred much earlier.
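A condensed sketch of this setup in TensorFlow follows. The tensor shapes, synthetic minibatch, and reduced feature dimension are our own stand-ins for the proprietary auction logs; only the use of a linear model, the clearing loss, and the default Adam optimizer mirror the description above.

```python
import tensorflow as tf

N = 5                                   # bids kept per record
model = tf.keras.Sequential([tf.keras.layers.Dense(1)])   # linear pricing policy
opt = tf.keras.optimizers.Adam()

def clearing_loss(p, bids, cost, s=1.0):
    # p: [batch, 1]; bids: [batch, N]; cost: [batch, 1]
    buyer_term = tf.reduce_sum(tf.nn.relu(bids - p), axis=1)
    seller_term = s * tf.nn.relu(tf.squeeze(p, axis=1) - tf.squeeze(cost, axis=1))
    return tf.reduce_mean(buyer_term + seller_term)

@tf.function
def train_step(x, bids, cost):
    with tf.GradientTape() as tape:
        loss = clearing_loss(model(x), bids, cost)
    grads = tape.gradient(loss, model.trainable_variables)
    opt.apply_gradients(zip(grads, model.trainable_variables))
    return loss

# One illustrative minibatch of size 512 (dense stand-in for sparse one-hots).
x = tf.random.normal([512, 64])
bids = tf.sort(tf.random.uniform([512, N]), direction='DESCENDING')
cost = tf.zeros([512, 1])
print(train_step(x, bids, cost).numpy())
```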

Besides the clearing loss used to fit our model, we considered several other losses as benchmarks:

  • Least-squares regression on the highest bid $b^{(1)}$.

  • Least-squares regression on the second-highest bid $b^{(2)}$.

  • A revenue surrogate loss function proposed by Medina and Mohri [14], a continuous alternative to the pure revenue loss $\ell_{\mathrm{rev}}$ mentioned previously. The loss has a free parameter $\gamma$ which can be tuned to control the tightness of the approximation to $\ell_{\mathrm{rev}}$. Although this loss is continuous, it is still non-convex. In our experiments we tried a range of values of $\gamma$; below we report on the setting which gave the best revenue performance.

For each loss function we added the match-rate regularization $s\,(p - c)_+$, and we varied $s$ to span a range of realized match rates. Recall that this regularization is already implicit in the clearing loss, where $s$ can be construed as the item quantity supplied by the seller. We used non-negative $s$ to ensure that convexity is preserved if the original loss is itself convex.

We used the first day of data as the training set and the second day as the test set. The performance was very similar on both for all fitted models, which is expected due to the volume of data and the generalization properties of this learning problem [17]. We report results over the test set below.

Revenue Performance

Figure 2: Trade-off between revenue improvement and decrease in match rate (top) or buyer welfare (bottom). Each point represents the performance of the fitted model under a loss function for a fixed regularization level.

We first consider the revenue performance of the different losses as it trades off against match rate and buyer welfare. Figure 2 (top) plots the ratio of realized revenue with learned reserves against the realized match rate. Both axes are normalized by the revenue and match rate of the second-price auction using only the seller's cost as reserve. Each point represents a pair of revenue and match rate (or welfare) achieved at a certain setting of the regularization weight $s$. The most immediate observation is that the curve traced out by the clearing loss Pareto-dominates the performance of the benchmark loss functions, in the sense that for any fixed match rate, the revenue of the clearing loss lies higher than the others. The best revenue performance is achieved by the clearing loss at its most aggressive regularization setting, at a correspondingly reduced match rate.

We also plot in the figure the revenue against welfare (bottom). We again normalize each axis by the revenue and welfare of the auction that uses only the seller's cost as reserve (which achieves the optimal social welfare). For the sake of clarity the range of the x-axis has been clipped. The Pareto dominance here is even more pronounced, and it is also striking that the clearing loss can achieve sizable revenue improvements with only a small impact on buyer welfare.

Another interesting aspect of Figure 2 is the range of match rates spanned by the different losses. Recall that, under the assumptions and results of Proposition 4, varying $s$ from 0 to large values should allow the clearing loss to span the full range of match rates in $[0, 1]$, and this is borne out by the plot. For the regressions on $b^{(1)}$ and $b^{(2)}$, there is a hard floor on the match rate that they can achieve with non-negative regularization, respectively at 0.38 and 0.67. Another kind of regularization term would be needed to push these further downward and reach more aggressive prices. Match rate for the surrogate loss was particularly sensitive to regularization: over a range of regularization weights spanning from 0 to 1, only two of the settings yielded match rates below 1, at 0.47 and 0.92 respectively.

Controlling Match Rate

In practice, setting the right regularization weight to achieve a target match rate is usually a process of trial and error, even just to determine the relevant range to inspect, and this was the case for all the benchmark losses. For the clearing loss, however, Proposition 4 gives a link between match rate and $s$ which can serve as a guide. Specifically, to achieve a target match rate of $\mu$, the result prescribes setting $s = \ln\left(\frac{1}{1-\mu}\right)$.
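This prescription is a one-liner in code; the target match rates below are illustrative values, not the settings used in our experiments.

```python
import numpy as np

# Invert the bound of Proposition 4: 1 - exp(-s) = mu  =>  s = ln(1 / (1 - mu)).
for mu in [0.5, 0.63, 0.8, 0.9, 0.95]:
    s = np.log(1.0 / (1.0 - mu))
    print(f"target match rate {mu:.2f} -> s = {s:.2f}")
```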

Figure 3: Realized match rate against target match rate under the model fit with the clearing loss, broken down by device type. The vertical line denotes the parameter setting $s = 1$, with a target match rate of $1 - e^{-1} \approx 0.63$.

Figure 3 plots the target match rate implied by the settings of $s$ that we used, according to this formula, against realized match rates. The vertical line shows the reference point of $s = 1$, which is the “default” form of the clearing loss without artificially increasing or limiting supply, with an associated target match rate of $1 - e^{-1} \approx 0.63$. The realized match rate tracks the target fairly well but not perfectly. A possible reason for the discrepancy is that the assumption of i.i.d. bidders that the formula relies on may not hold in practice. Another possible reason is that the linear model may not be expressive enough to fit the optimal price level within each feature context $x$. Interestingly, the target match rate from Proposition 4 tracks not only the overall match rate but also segment-specific match rates. In Figure 3, we break down the match rates by device type and find that they are very consistent across devices.

Convergence Rate

We next consider the convergence rates of model-fitting under the various loss functions, plotted in Figure 4. Convergence rates for the clearing loss and the regression losses are very comparable. The main difference between the curves has to do with initialization. Initial prices tended to be high under our random initialization scheme, which is more favorable to regression on the highest bid. All models have converged by 100K iterations. Since square loss is ideal from an optimization perspective, these results imply that models with clearing loss can be fit very quickly and conveniently in practice, in a matter of hours over large display ad datasets.

Figure 4: Convergence rate of the model under different loss functions, in minibatch iterations. We plot the value of each loss across iterations, normalized by its value upon convergence.

In Figure 5 we compare the convergence of the clearing loss with the surrogate loss. Convergence is much slower under the surrogate loss. This was expected, as the loss is non-convex and has ranges with zero gradient where the Adam optimizer (or any of the other standard TensorFlow optimizers) cannot make progress; it was nonetheless an important benchmark to evaluate since it closely mimics the true revenue objective. Medina and Mohri [14] discuss alternatives for optimizing the surrogate loss and propose a special-purpose algorithm based on DC programming (difference-of-convex-functions programming), but they only scale it to thousands of training instances. The fact that the surrogate loss has not quite converged after 400K iterations is a contributing factor to its revenue performance in Figure 2.

Effectiveness of Linear Regression

While the key takeaway of our empirical evaluation is that the clearing loss dominates the other methods in terms of revenue vs. match rate trade-offs, another surprising consequence of this study is the effectiveness of a simple regression on the top bid. The natural intuition would be that any least-squares regression should perform poorly, since it has the same penalty for underpricing (a small loss in revenue) and overpricing (which can cause the transaction to fail to clear and hence incur a large revenue loss). Indeed, an unregularized regression (the leftmost green point in Figure 2) incurs a large match rate loss, but it also achieves a significant revenue improvement (albeit with an almost 5% loss in social welfare compared to the clearing loss).

Looking into the data, we found that an explanation for this fact is that bid distributions tend to be highly skewed, which causes standard regression to underpredict for high bids and overpredict for low bids. In fact, under zero regularization the linear regression on the top bid underpredicts on 17.7% of instances with bids below the median and on 99.1% of instances with bids above the median. This behavior explains why standard regression can be effective in practice despite the fact that square loss does not encode any difference between underpredicting and overpredicting.
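The phenomenon is easy to reproduce on synthetic data; the sketch below, under an assumed lognormal bid model of our own, exhibits the same asymmetry between underprediction rates below and above the median.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50_000, 5))
b = np.exp(0.3 * X.sum(axis=1) + rng.normal(size=50_000))  # skewed top bids
A = np.column_stack([X, np.ones(len(X))])                  # add an intercept
w, *_ = np.linalg.lstsq(A, b, rcond=None)                  # least-squares fit
pred = A @ w
lo, hi = b < np.median(b), b >= np.median(b)
print("underprediction rate below median:", np.mean(pred[lo] < b[lo]))
print("underprediction rate above median:", np.mean(pred[hi] < b[hi]))
```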

Figure 5: Convergence rate of the model under the clearing and surrogate losses, in minibatch iterations. Both loss curves are smoothed using a moving average.

5 Conclusions

This paper introduced the notion of a predictive model for clearing prices in a market with bids and asks for units of an item. The loss function is obtained via the linear programming dual of the associated allocation problem. When applied to the problem of revenue optimization via reserve prices in second-price auctions, regularizing the loss has an intuitive interpretation as expanding or limiting supply, which can be formally linked to the expected match rate. Our empirical evaluation over a dataset of bids from Google’s Ad Exchange confirmed that a model of clearing prices outperforms standard regressions on bids, as well as a surrogate loss for the direct revenue objective, in terms of the trade-off between revenue and match rate (or social welfare). In future work, we plan to develop models of clearing prices for more complex allocation problems such as search advertising, where the clearing loss can be generalized (using the same duality ideas presented in this paper) to handle a vector of position prices.

References

  • Amin et al. [2014] Kareem Amin, Afshin Rostamizadeh, and Umar Syed. Repeated contextual auctions with strategic buyers. In Advances in Neural Information Processing Systems, pages 622–630, 2014.
  • Arrow and Debreu [1954] Kenneth J. Arrow and Gerard Debreu. Existence of an equilibrium for a competitive economy. Econometrica, 22(3):265–290, 1954.
  • Ausubel [2006] Lawrence M Ausubel. An efficient dynamic auction for heterogeneous commodities. American Economic Review, 96(3):602–629, 2006.
  • Bikhchandani and Mamer [1997] Sushil Bikhchandani and John W Mamer. Competitive equilibrium in an exchange economy with indivisibilities. Journal of Economic Theory, 74(2):385–413, 1997.
  • Bulow and Klemperer [1996] Jeremy Bulow and Paul Klemperer. Auctions versus negotiations. American Economic Review, 86:180–194, 1996.
  • Cesa-Bianchi et al. [2013] N. Cesa-Bianchi, C. Gentile, and Y. Mansour. Regret minimization for reserve prices in second-price auctions. In ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 1190–1204. SIAM, 2013.
  • Cohen et al. [2016] Maxime C Cohen, Ilan Lobel, and Renato Paes Leme. Feature-based dynamic pricing. In Proceedings of the 2016 ACM Conference on Economics and Computation, pages 817–817. ACM, 2016.
  • Cole and Roughgarden [2014] Richard Cole and Tim Roughgarden. The sample complexity of revenue maximization. In Proceedings of the 46th Annual ACM Symposium on Theory of Computing, pages 243–252. ACM, 2014.
  • Dhangwatnotai et al. [2015] Peerapong Dhangwatnotai, Tim Roughgarden, and Qiqi Yan. Revenue maximization with a single sample. Games and Economic Behavior, 91:318–333, 2015.
  • Eden et al. [2017] Alon Eden, Michal Feldman, Ophir Friedler, Inbal Talgam-Cohen, and S Matthew Weinberg. The competition complexity of auctions: A Bulow-Klemperer result for multi-dimensional bidders. In Proceedings of the 2017 ACM Conference on Economics and Computation, pages 343–343. ACM, 2017.
  • Gul and Stacchetti [1999] Faruk Gul and Ennio Stacchetti. Walrasian equilibrium with gross substitutes. Journal of Economic Theory, 87(1):95–124, 1999.
  • Hartline and Roughgarden [2009] Jason D. Hartline and Tim Roughgarden. Simple versus optimal mechanisms. In Proceedings 10th ACM Conference on Electronic Commerce, pages 225–234, 2009.
  • Mao et al. [2018] Jieming Mao, Renato Paes Leme, and Jon Schneider. Contextual pricing for Lipschitz buyers. In Advances in Neural Information Processing Systems, pages 5648–5656, 2018.
  • Medina and Mohri [2014] Andres M Medina and Mehryar Mohri. Learning theory and algorithms for revenue optimization in second price auctions with reserve. In Proceedings of the 31st International Conference on Machine Learning (ICML-14), pages 262–270, 2014.
  • Medina and Vassilvitskii [2017] Andres Muñoz Medina and Sergei Vassilvitskii. Revenue optimization with approximate bid predictions. In Advances in Neural Information Processing Systems 30, pages 1856–1864, 2017.
  • Morgenstern and Roughgarden [2016] Jamie Morgenstern and Tim Roughgarden. Learning simple auctions. In Proceedings of the Conference on Learning Theory (COLT), pages 1298–1318, 2016.
  • Morgenstern and Roughgarden [2015] Jamie H Morgenstern and Tim Roughgarden. On the pseudo-dimension of nearly optimal auctions. In Advances in Neural Information Processing Systems, pages 136–144, 2015.
  • Myerson [1981] Roger B Myerson. Optimal auction design. Mathematics of Operations Research, 6(1):58–73, 1981.
  • Paes Leme et al. [2016] Renato Paes Leme, Martin Pál, and Sergei Vassilvitskii. A field guide to personalized reserve prices. In Proceedings of the 25th International Conference on World Wide Web (WWW), pages 1093–1102, 2016.
  • Roughgarden and Wang [2016] Tim Roughgarden and Joshua R. Wang. Minimizing regret with multiple reserves. In Proceedings of the 2016 ACM Conference on Economics and Computation (EC), pages 601–616. ACM, 2016.
  • Roughgarden et al. [2012] Tim Roughgarden, Inbal Talgam-Cohen, and Qiqi Yan. Supply-limiting mechanisms. In Proceedings of the 13th ACM Conference on Electronic Commerce, pages 844–861. ACM, 2012.