Dynamic Reserve Prices for Repeated Auctions: Learning from Bids

02/18/2020 ∙ by Yash Kanoria, et al.

A large fraction of online advertisement is sold via repeated second price auctions. In these auctions, the reserve price is the main tool for the auctioneer to boost revenues. In this work, we investigate the following question: Can changing the reserve prices based on the previous bids improve the revenue of the auction, taking into account the long-term incentives and strategic behavior of the bidders? We show that if the distribution of the valuations is known and satisfies the standard regularity assumptions, then the optimal mechanism has a constant reserve. However, when there is uncertainty in the distribution of the valuations, previous bids can be used to learn the distribution of the valuations and to update the reserve price. We present a simple, approximately incentive-compatible, and asymptotically optimal dynamic reserve mechanism that can significantly improve the revenue over the best static reserve. The paper is from July 2014 (our submission to WINE 2014), posted later here on the arxiv to complement the 1-page abstract in the WINE 2014 proceedings.




1 Introduction

Advertising is the main component of the monetization strategies of most Internet companies. A large fraction of online advertisements are sold via advertisement exchange platforms such as Google’s Doubleclick (Adx) and Yahoo!’s Right Media. (Other examples of major ad exchanges include Rubicon, AppNexus, and OpenX.) Using these platforms, online publishers such as the New York Times and the Wall Street Journal sell the advertisement space on their webpages to advertisers. The advertisement space is allocated using auctions where advertisers bid in real time for a chance to show their ads to the users. Every day, tens of billions of online ads are sold via these exchanges (Muthukrishnan, 2009; McAfee, 2011; Balseiro et al., 2011; Celis et al., 2014).

The second-price auction is the dominant mechanism used by the advertisement exchanges. Among the reasons for such prevalence are the simplicity of the second-price auction and the fact that it incentivizes the advertisers to be truthful. The second price auction maximizes the social welfare (i.e., the value created in the system) by allocating the item to the highest bidder.

In order to maximize the revenue in a second price auction, the auctioneer can set a reserve price and not make any allocations when the bids are low. In fact, under symmetry and regularity assumptions (see Section 2), the second-price auction with an appropriately chosen reserve price is optimal and maximizes the revenue among all selling mechanisms (Myerson, 1981; Riley and Samuelson, 1981).

However, in order to set the reserve price effectively, the auctioneer requires information about the distribution of the valuations of the bidders. A natural idea, which is widely used in practice, is to construct these distributions using the history of the bids. This approach, though intuitive, raises a major concern with regard to the long-term (dynamic) incentives of the advertisers. Because the bid of an advertiser may determine the price he or she pays in future auctions, this approach may result in the advertisers shading their bids and ultimately in a loss of revenue for the auctioneer.

To understand the effects of changing reserve prices based on the previous bids, we study a setting where the auctioneer sells impressions (advertisement space) via repeated second price auctions. We demonstrate that the long-term incentives of advertisers play an important role in the performance of these repeated auctions by showing that under standard symmetry and regularity assumptions (i.e., when the valuations are drawn independently and identically from a regular distribution), the optimal mechanism is a second price auction with a constant reserve, and changing the reserve prices over time is not beneficial. However, when there is uncertainty in the distribution of the valuations, we show that there can be substantial benefit in learning the reserve prices using the previous bids.

More precisely, we consider an auctioneer selling multiple copies of an item sequentially. The item is either a high type or a low type. The type determines the distribution of the valuations of the bidders. The type of the item is not known a priori to the auctioneer. Broadly, we show the following: when there is competition between bidders and the valuation distributions for the two types are sufficiently different from each other, there is a simple dynamic reserve mechanism that can effectively “learn” the type of the item, and thereafter choose the optimal reserve for that type. (On the other hand, when the valuation distributions for the two types are close to each other, the improvement from changing the reserve is insignificant.) As a consequence, the dynamic reserve mechanism does much better than the best fixed reserve mechanism, and in fact, achieves near optimal revenue, while retaining (approximate) incentive compatibility. (Approximate incentive compatibility implies that agents behave truthfully if the gain from deviation is small; see Section 2.)

To this end, we propose a simple mechanism called the threshold mechanism. In each round, the mechanism implements a second price auction with reserve. The reserve price starts at some value, and stays there until there is a bid exceeding a pre-decided threshold, after which the reserve rises (permanently) to a higher value.

We compare the revenue of our mechanism with two benchmarks. Our baseline is the static second price auction with the optimal constant reserve. Our upper-bound benchmark is the optimal mechanism that knows the type of the impressions (e.g., high or low) in advance. These two benchmarks are typically well separated. We show that the threshold mechanism is near optimal and obtains revenue close to the upper-bound benchmark. In addition, we present numerical illustrations of our results that show up to increase in revenue by our mechanism compared with the static second price auctions. These examples demonstrate the effectiveness of dynamic reserve prices under fairly broad assumptions.

1.1 Related Work

In this section, we briefly discuss the closest work to ours in the literature along different dimensions starting with the application in online advertising.

Ostrovsky and Schwarz (2009) conducted a large-scale field experiment at Yahoo! and showed that choosing reserve prices, guided by the theory of optimal auctions, can significantly increase the revenue of sponsored search auctions. To mitigate the aforementioned incentive concerns, they dropped the highest bid from each auction when estimating the distribution of the valuations. However, they do not formally discuss the consequences of this approach.

Another common solution offered to mitigate the incentive constraints is to bundle different types of impressions (or keywords) together so that the bid of each advertiser would have small impact on the aggregate distribution learned from the history of bids. However, this approach may lead to significant estimation errors and setting a sub-optimal reserve.

To the best of our knowledge, ours is the first work that rigorously studies the long-term and dynamic incentive issues in repeated auctions with dynamic reserves.

Iyer et al. (2011) and Balseiro et al. (2013) demonstrate the importance of setting reserve prices in dynamic settings, in environments where agents are, respectively, uncertain about their own valuations and budget-constrained. We discuss the methodology of these papers in more detail at the end of Section 4. McAfee et al. (1989) and McAfee and Vincent (1992) determine reserve prices in common value settings.

Our work is closely related to the literature on behavior-based pricing strategies where the seller changes the prices for a buyer (or a segment of the buyers) based on her previous behavior; for instance, increasing the price after a purchase or reducing the price in the case of no-purchase; see Fudenberg and Villas-Boas (2007); Esteves (2009) for surveys.

The common insight from the literature is that the optimal pricing strategy is to commit to a single price over the length of the horizon (Stokey, 1979; Salant, 1989; Hart and Tirole, 1988). In fact, when customers anticipate reduction in the future prices, dynamic pricing may hurt the seller’s revenue (Taylor, 2004; Villas-Boas, 2004). Similar insights are obtained in environments where the goal is to sell a fixed initial inventory of products to unit-demand buyers who arrive over time (Aviv and Pazgal, 2008; Dasu and Tong, 2010; Aviv et al., 2013; Correa et al., 2013).

There has been renewed interest in behavior-based pricing strategies, mainly motivated by developments in e-commerce technologies that enable online retailers and other Internet companies to determine the price for the buyer based on her previous purchases. Acquisti and Varian (2005) show that when a sufficient proportion of customers are myopic or when the valuations of customers increase (by providing enhanced services), dynamic pricing may increase the revenue. Another setting where dynamic pricing could boost the revenue is when the seller is more patient than the buyer and discounts his utility over time at a lower rate than the buyer (Bikhchandani and McCardle, 2012; Amin et al., 2013). See Taylor (2004); Conitzer et al. (2012) for privacy issues and anonymization approaches in this context. In contrast with these works, our focus is on auction environments and we study the role of competition among strategic bidders.

The problem of learning the distribution of the valuations and optimal pricing has also been studied in the context of revenue management and pricing for markets where each (infinitesimal) buyer does not have an effect on the future prices and the demand curve can be learned with optimal regret (Besbes and Zeevi, 2009, 2012; Harrison et al., 2012; den Boer and Zwart, 2014; Segal, 2003). In this work, we consider a setting where the goal is to learn the optimal reserve price with strategic and forward-looking buyers, with multi-unit demand, where the action of each buyer can change the prices in the future.


The remainder of the paper is organized as follows. In Section 2, we formally present the model, followed by the description of the threshold mechanisms in Section 3. We show that the mechanism is dynamic incentive compatible in Section 4. In Section 5, we present an extension of the threshold mechanism.

2 Model and Preliminaries

A seller auctions off $T$ items to $n$ agents in $T$ rounds of second price auctions, numbered $t = 1, \ldots, T$. The items are of type high or low, denoted by $s \in \{L, H\}$, where informally we think of an item of type $H$ as being more valuable than an item of type $L$. The items are all of the same type. The type is $s$ with probability $p_s$.

The valuation of agent $i$ for an item of type $s$, denoted by $v_i$, is drawn independently and identically from distribution $F_s$, i.e., the valuations are i.i.d. conditioned on $s$. Note that agents’ valuations are identical in each round (if they participate, see below). In Section 6 we consider an extension of our model where the valuations of the agents may change over time.

Each agent $i$ participates in each auction with probability $\alpha_i$, exogenously and independently across rounds and agents. One can think of the $\alpha_i$’s as throttling probabilities or matching probabilities to a specific user demographic (Celis et al., 2014). (Due to budget and bandwidth constraints or other considerations, online advertising platforms often randomly select a subset of bidders, from all eligible advertisers, to participate in the auction. This process is referred to as throttling (Goel et al., 2010; Charles et al., 2013).) Let $X_{it}$ be the indicator random variable corresponding to the participation of agent $i$ in the auction at time $t$. Note that $\mathbb{E}[X_{it}] = \alpha_i$. We denote the realization of $X_{it}$ by $x_{it}$. Agent $i$ learns $x_{it}$ at the beginning of round $t$. In particular, our (incentive compatibility) results hold in the special case when all the agents participate in all the auctions, i.e., $\alpha_i = 1$ for all $i$. Participation probabilities allow us to model environments where a small number of bidders participate in auctions (cf. Celis et al. (2014)).

Information Structure

We assume , , and to be common knowledge. We also assume that the type of the item is common knowledge among the agents but unknown to the auctioneer, who only knows . This assumption is motivated in part by the application where sometimes advertisers may have more information about the value of a user or an impression than the publisher. Also, it corresponds to a stronger requirement for the incentive compatibility of the mechanism; hence, our results remain valid if the agents have the same information as the seller about the type of the item. Similarly, we assume that ’s are common knowledge among the agents and the auctioneer. Our mechanism remains incentive compatible, as defined below, if the agents have incomplete information about the ’s. At the beginning, each agent knows his own valuation, , but not the other agents’ valuations (but agents may make inferences about the valuations of the other agents over time).

Let us now consider the seller’s problem. The seller aims to maximize her expected revenue via a repeated second price auction.

A “generic” dynamic second price mechanism

At time , the auctioneer announces the reserve price function that maps the history observed by the mechanism to a reserve price. The history observed by the mechanism up to time , denoted by , consists of, for each round , the reserve price, the agents participating in round and their bids, and the allocation and payments at that round. More precisely,


  • is the reserve price at time .

  • . Recall that is equal to if agent participates in the auction for item .

  • where denotes the bid of agent at time . We assign if , i.e., if agent does not participate in round .

  • corresponds to the allocation vector. Since the items are allocated via the second price auction with reserve , if all the bids are smaller than , the item is not allocated. Otherwise, the item is allocated uniformly at random to an agent and we have . For all the agents that did not receive the item, is equal to .

  • is the vector of payments. If , then and if , then .
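The per-round allocation and payment rule described in the list above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `second_price_round`, the use of `None` to mark a non-participating agent, and the choice to break ties uniformly among the highest bidders are all assumptions of this sketch.

```python
import random
from typing import List, Optional, Tuple


def second_price_round(
    bids: List[Optional[float]],
    reserve: float,
    rng: Optional[random.Random] = None,
) -> Tuple[List[int], List[float]]:
    """Run one second-price auction with a reserve.

    bids[i] is None when agent i does not participate (a convention of this sketch).
    Returns (allocation, payments), each with one entry per agent.
    """
    if rng is None:
        rng = random.Random()
    n = len(bids)
    allocation = [0] * n
    payments = [0.0] * n

    # Only participating agents whose bid meets the reserve can win.
    eligible = [i for i, b in enumerate(bids) if b is not None and b >= reserve]
    if not eligible:
        return allocation, payments  # the item is not allocated

    top = max(bids[i] for i in eligible)
    winners = [i for i in eligible if bids[i] == top]
    winner = rng.choice(winners)  # ties among top bidders broken uniformly at random

    # The winner pays the larger of the reserve and the highest competing bid.
    others = [bids[j] for j in range(n) if j != winner and bids[j] is not None]
    second = max(others, default=reserve)
    allocation[winner] = 1
    payments[winner] = max(reserve, second)
    return allocation, payments
```

Agents who do not receive the item have allocation 0 and payment 0, matching the last two items of the list.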

Note that in our notation, includes a reserve price function for each . The length of the history implicitly specifies the round for which the reserve is to be computed.

An important special case is a static mechanism where the reserve is not a function of the previous bids and allocations.

We can now define the seller problem more formally. The seller chooses a reserve price function that maximizes the expected revenue, which is equal to , when the buyers play an equilibrium with respect to the choice of . In order to define the utility of the agents, let denote the history observed by agent up to time including the allocation and payments of (only) agent . Namely,

Next, we state precise definitions of a bidding strategy and a best response.

Definition 1 (Bidding Strategy)

Bidding strategy of agent maps the valuation of the agent , history , and the reserve at time to a bid . Here is the set of possible histories observed by agent .

Definition 2 (Best-Response)

Given strategy profile , is a best-response strategy to the strategy of other agents , if, for all and in the support of , it maximizes the expected utility of agent ,

where the expectation is over the valuations of other agents, the participation variables ’s, and any randomization in bidding strategies. Strategy is an -best-response if, for all in the support,

where BR denotes a best response.

A mechanism is incentive compatible if, for each agent , the truthful strategy is a best-response to the other agents being truthful. In this paper, we consider the notion of approximate incentive compatibility that implies that an agent does not deviate from the truthful strategy when the benefit from such deviation is insignificant. This notion is appealing when characterizing, or computing, the best response strategy is challenging and has been studied for static games (cf. Daskalakis et al. (2009); Chien and Sinclair (2011); Kearns and Mansour (2002); Feder et al. (2007); Hémon et al. (2008)) as well as dynamic games, such as ours, where finding the best response strategy of an agent corresponds to solving a complicated (stochastic) dynamic program (Iyer et al., 2011; Balseiro et al., 2013; Gummadi et al., 2013; Nazerzadeh et al., 2013).

Definition 3 (Approximate Incentive Compatibility)

A mechanism is -incentive compatible if the truthful strategy of agent is an -best-response to the truthful strategy of other agents for all and all in the support of .

Note that $\alpha_i T$, the participation probability of agent $i$ times the number of rounds $T$, is the expected number of rounds in which agent $i$ participates. Therefore, under an $\epsilon$-incentive compatible mechanism, on average the agent loses at most $\epsilon$ in utility, relative to playing a best response, per round of participation.
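As a small illustration of this per-round scaling, the check below compares a total utility gap against a tolerance multiplied by the expected number of participation rounds. The function and argument names are hypothetical, and the utilities are assumed to be totals over the whole horizon; this is a sketch of the bookkeeping, not the paper's formal definition.

```python
def is_eps_best_response(
    utility_strategy: float,
    utility_best_response: float,
    eps: float,
    participation_prob: float,
    num_rounds: int,
) -> bool:
    """Check whether a strategy loses at most eps per expected round of participation.

    All argument names are assumptions of this sketch. Utilities are totals over
    the whole horizon of num_rounds rounds.
    """
    expected_rounds = participation_prob * num_rounds
    loss = utility_best_response - utility_strategy
    return loss <= eps * expected_rounds
```

For instance, an agent who participates with probability 0.5 over 20 rounds is expected to take part in 10 rounds, so a tolerance of 0.1 per round allows a total loss of about 1.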

We now define a stronger notion of incentive compatibility. In Section 4, we provide conditions under which our proposed mechanism satisfies this stronger notion. By a realization, denoted by , we refer to a valuation vector along with a participation vector .

Definition 4 (Dynamic Incentive Compatibility)

We call a realization -good with respect to a mechanism, if truthfulness, for each agent and in each round , remains an (additive) -best-response to the truthful strategy of the other agents. We say that a mechanism is -dynamic-incentive-compatible if the probability of the realization being -good with respect to the mechanism is at least .

Thus, in a -dynamic-incentive-compatible mechanism, assuming truthful bidding, with probability at least the realization satisfies the following property: for each bidder and each round that the bidder participates, the average cost of truthful bidding is at most for each future round that he may participate in, relative to a best response. The above definition extends the notion of (exact) interim dynamic incentive compatibility (Bergemann and Välimäki, 2010) which implies that the agents will not deviate from the truthful strategy even as they obtain more information over time.


In the next section, we propose a simple approximately incentive compatible mechanism for the setting described above. We compare our proposed mechanism with two benchmarks that provide a lower-bound and an upper-bound on the revenue of the best dynamic second price mechanism.

The lower-bound mechanism, which we refer to as the static mechanism, at each step, implements a second price auction with a constant reserve . The reserve is chosen at time before the mechanism observes any of the bids and does not change over time. (Since the valuations of the agents are correlated through the type of the items, finding the optimal static auction is challenging and could be computationally intractable (Papadimitriou and Pierrakos, 2011). Cremer and McLean (1988) proposed a mechanism that can extract the whole surplus if the valuations are correlated; however, their mechanism is not practical and does not satisfy the desirable ex-post individual rationality property; also see Section 6.)

For the upper-bound, we consider the optimal -round mechanism that knows the type of the items.

Lemma 1 (Upper-bound)

Let be the optimal -round mechanism that knows the type of the item, . Similarly, corresponds to the optimal (static) mechanism when . Then, the revenue of , denoted by , is bounded by . Furthermore, if mechanism is ex-post incentive compatible, then, can be implemented by repeating mechanism at each step .

We prove the first claim in the appendix using a reduction argument that reduces a mechanism in a -round setting to a mechanism in a single-round. If mechanism is ex-post incentive compatible, the leakage of information from one round to another does not change the strategy of the bidders. Recall that for private value settings, ex-post incentive compatibility implies that truthfulness is a (weakly) dominant strategy for each agent for any realizations of other agents’ valuations — the second price auction with reserve satisfies this property.

Throughout this paper, we make the following standard regularity assumption (cf. Myerson (1981)).

Assumption 1 (Regularity)

Distribution $F_s$, $s \in \{L, H\}$, with density $f_s$, is regular, i.e., the c.d.f. $F_s(v)$ and $v - \frac{1 - F_s(v)}{f_s(v)}$ are strictly increasing in $v$ over the support of $F_s$.

Examples of regular distributions include many common distributions such as the uniform, Gaussian, log-normal, etc.

If $s$ is known and $F_s$ is regular, then the optimal mechanism for type $s$ is the second price auction with reserve price $r_s^*$ that is the unique solution of

$$ r_s^* = \frac{1 - F_s(r_s^*)}{f_s(r_s^*)}. \tag{1} $$
Therefore, by Lemma 1, we obtain the following.

Theorem 2.1 (No “dynamic” improvement with single type)

If the valuations are drawn i.i.d. from a regular distribution (e.g., is known and is regular), the optimal mechanism is the second price auction with a constant reserve that is the solution of Eq. (1) and there is no benefit from having dynamic reserve prices.

The theorem above is similar to the previous results in the literature for settings with a single buyer (Stokey, 1979; Salant, 1989; Hart and Tirole, 1988; Acquisti and Varian, 2005) and generalizes their insights to auction environments with multiple buyers.
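For a concrete regular distribution, the constant reserve can be computed numerically from the standard Myerson condition that the virtual value $r - (1 - F(r))/f(r)$ equals zero. The sketch below uses bisection, which is valid because regularity makes the virtual value strictly increasing. The function name, the bracketing interval, and the two test distributions are assumptions chosen for illustration.

```python
import math
from typing import Callable


def optimal_reserve(
    cdf: Callable[[float], float],
    pdf: Callable[[float], float],
    lo: float,
    hi: float,
    tol: float = 1e-10,
) -> float:
    """Find the root of the virtual value r - (1 - F(r)) / f(r) on [lo, hi] by bisection."""

    def virtual_value(r: float) -> float:
        return r - (1.0 - cdf(r)) / pdf(r)

    if virtual_value(lo) > 0 or virtual_value(hi) < 0:
        raise ValueError("the root is not bracketed by [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if virtual_value(mid) < 0:
            lo = mid  # root lies to the right of mid
        else:
            hi = mid  # root lies at or to the left of mid
    return 0.5 * (lo + hi)


# Uniform distribution on [0, 1]: virtual value is 2r - 1.
reserve_uniform = optimal_reserve(lambda r: r, lambda r: 1.0, 0.0, 1.0)

# Exponential distribution with rate 1: virtual value is r - 1.
reserve_exponential = optimal_reserve(
    lambda r: 1.0 - math.exp(-r), lambda r: math.exp(-r), 0.0, 10.0
)
```

In both cases the reserve does not depend on the number of bidders, which is a general property of the condition when valuations are i.i.d.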

In the next section, we present a simple mechanism that exploits correlations between valuations (via types ) and the competition among bidders to extract higher revenue than the static mechanism and in fact, for a broad class of distributions of valuations, obtains revenue close to the upper-bound benchmark. Further, the mechanism is approximately incentive compatible.

3 The Threshold Mechanism

In this section, we present the class of threshold mechanisms.

A threshold mechanism is defined by three parameters and is denoted by $\mathcal{M}(r_L, r_H, \rho)$, where $r_L$ is the initial reserve price. The reserve stays at $r_L$ until any of the agents bids above $\rho$; then, for all subsequent rounds, the reserve price increases to $r_H$. If there are no bids above $\rho$, the reserve stays at $r_L$ until the end.

As we demonstrate in the following, this class of mechanisms (and a generalization of it, presented in Section 5) includes good candidates for boosting revenue if the modes of $F_L$ and $F_H$ are sufficiently well separated. The idea is to choose $\rho$ such that the valuation of an agent is unlikely to be above $\rho$ if $s = L$, whereas a valuation exceeding $\rho$ is quite likely if $s = H$. Moreover, as we establish, truthful bidding forms an approximate equilibrium in this case, so for almost all realizations, the mechanism does correctly infer $s$.
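The reserve dynamics of a threshold mechanism can be sketched in a few lines. The names `r_low`, `r_high`, and `rho` below stand for the initial reserve, the raised reserve, and the bid threshold; they are labels chosen for this sketch, as is the representation of each round as a list of the bids submitted by participating agents.

```python
from typing import List


def run_threshold_mechanism(
    rounds: List[List[float]], r_low: float, r_high: float, rho: float
) -> List[float]:
    """Return the seller's revenue in each round of a threshold mechanism.

    rounds[t] holds the bids of the agents participating in round t.
    """
    reserve = r_low
    revenues = []
    for bids in rounds:
        ranked = sorted(bids, reverse=True)
        if ranked and ranked[0] >= reserve:
            # Second-price payment: the larger of the reserve and the runner-up bid.
            second = ranked[1] if len(ranked) > 1 else reserve
            revenues.append(max(reserve, second))
        else:
            revenues.append(0.0)  # no bid meets the reserve, so no sale
        # Once any bid exceeds the threshold, the reserve rises permanently,
        # taking effect from the next round.
        if any(b > rho for b in bids):
            reserve = r_high
    return revenues


example_revenues = run_threshold_mechanism(
    [[0.5, 2.0], [3.5, 2.8], [3.5, 2.8], [2.0]], r_low=1.0, r_high=3.0, rho=2.5
)
```

In this toy run the second round contains bids above the threshold, so the reserve is raised from the third round onward, and the lone bid of 2.0 in the last round no longer clears it.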

To convey the intuition behind our incentive compatibility results, we start with the following (warm up) proposition.

Proposition 1

Suppose that is supported on and is supported on for and for at least 2 agents. Consider any . Then, , for any , is incentive compatible.


Proof: Consider agent and round where participates (i.e., ). First observe that since bidding truthfully is a (weakly) dominant strategy in the second-price auction, truthfulness is a myopic best-response in our setting.

If , then bidding truthfully will not increase the reserve in the future rounds and truthfulness is a (weakly) dominant strategy. Now suppose . If in round , then again truthful bidding is a best response, since the reserve will continue to be for the remaining rounds. On the other hand, if (and ), at least one other agent will participate in the auction at time . At the equilibrium, the other agent will bid truthfully, hence above , and the reserve will be for the remaining rounds in any case. So bidding truthfully is a best response, since it is myopically a best response.

We now show that the threshold mechanism is approximately incentive compatible when the supports of the distributions overlap and the distribution of the low type is bounded. We also provide an example that shows a significant boost in the revenue.

Theorem 3.1

Let be supported on , . Let be the solution of . Consider any positive . Let . Define

Consider such that . Then, Mechanism is -incentive-compatible for all and . In addition, the expected revenue for each is at least , where is the optimal single-round mechanism that knows in advance and can be obtained using a constant reserve of .

Thus, using mechanism in this setting, truthful bidding is an approximate equilibrium and the revenue is very close to the benchmark.

Defining , an appealing feature of Theorem 3.1 is that the lower bound on the number of bidders, , grows only as . On the other hand, the lower bound on the number of rounds grows as for . This is somewhat larger than for small but this is not a major concern since the number of identical (or very similar) impressions is often large in online advertising settings. The example below gives a numerical illustration of Theorem 3.1.

Figure 1: Illustration of distributions in Example 1.

Example 1

Suppose is the normal distribution, with mean and standard deviation , truncated to interval , and , i.e., normal with mean and standard deviation . These distributions are shown in Figure 1. Also, let , for all . Note that each agent participates in rounds on average, making it non-trivial to have dynamic reserves without losing incentive compatibility if is much larger than . We have and . Using gives . Using , we obtain for so we consider in our simulations. The (optimal) static second price auction obtains average-revenue per-round equal to using (constant) reserve price . Mechanism yields per-round revenue of (Theorem 3.1 guarantees that the loss relative to the optimal revenue is at most per round) improving more than over the static mechanism. The per-round revenue of the optimal mechanism that knows the type of the impressions is equal to . The 95% confidence error in estimating the revenues is less than . The average welfare of a buyer per round of participation, averaged over and is found to be (with 95% confidence error 0.002). Note that given this implies that the threshold mechanism is -incentive compatible.
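A comparison of this kind can be reproduced in outline with a small Monte Carlo simulation. The sketch below is not the authors' simulation code, and every numerical value in it (the two distributions, the number of agents and rounds, the participation probability, the prior on the type) is a hypothetical placeholder rather than a parameter of Example 1. It assumes truthful bidding and a common participation probability for all agents.

```python
import random


def sample_valuation(rng: random.Random, item_type: str) -> float:
    """Hypothetical distributions: low type is N(1, 0.5) truncated to [0, 2]; high type is N(3, 0.5)."""
    if item_type == "L":
        while True:  # rejection sampling implements the truncation
            v = rng.gauss(1.0, 0.5)
            if 0.0 <= v <= 2.0:
                return v
    return rng.gauss(3.0, 0.5)


def average_revenue(
    r_low: float,
    r_high: float,
    rho: float,
    n: int = 5,
    T: int = 200,
    alpha: float = 0.2,
    p_high: float = 0.5,
    trials: int = 200,
    seed: int = 0,
) -> float:
    """Average per-round revenue of a threshold mechanism under truthful bidding."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        item_type = "H" if rng.random() < p_high else "L"
        values = [sample_valuation(rng, item_type) for _ in range(n)]
        reserve = r_low
        for _ in range(T):
            # Each agent participates independently with probability alpha
            # and bids its (fixed) valuation.
            bids = sorted((v for v in values if rng.random() < alpha), reverse=True)
            if bids and bids[0] >= reserve:
                second = bids[1] if len(bids) > 1 else reserve
                total += max(reserve, second)
            if bids and bids[0] > rho:
                reserve = r_high  # permanent increase from the next round on
    return total / (trials * T)


# A static mechanism is the special case in which the reserve never changes.
static_revenue = average_revenue(r_low=1.0, r_high=1.0, rho=float("inf"))
threshold_revenue = average_revenue(r_low=1.0, r_high=2.5, rho=2.0)
```

Because the random draws do not depend on the reserve, two calls with the same seed face identical valuations and participation patterns, so differences in the returned averages come only from the reserve rule.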

4 Dynamic Incentive Compatibility

In this section, we show that, with high probability, no agent has a large incentive to deviate from the truthful strategy in later rounds after acquiring new information. We also relax the requirement that needs to have bounded support.

Theorem 4.1

Recall (1). Let . Consider any and let . Then, Mechanism is -dynamic incentive compatible for all and where

Further, the expected revenue of the mechanism is additively within of the benchmark revenue under truthful bidding.

Note that this theorem requires an to be not too large. The assumed upper bound on can be eliminated in two different ways: In the (immediate) corollary below, we assume a bounded support for leading to . Later in Section 5, we introduce a generalized threshold mechanism, which then facilitates a result similar to Theorem 4.1 while allowing to be arbitrarily large (Theorem 5.1).

Corollary 1 (Bounded Support)

Recall (1). Let be supported on , . Consider any and let . Then, Mechanism is -dynamic incentive compatible for all and where

Further, the expected revenue of the mechanism is additively within of the benchmark revenue under truthful bidding.

Comparing with Theorem 3.1, we see that the cost of the stronger notion of equilibrium here is only a factor loss in and a further factor loss in .

We prove Theorem 4.1 in the appendix. We state below the main lemma leading to a proof of dynamic incentive compatibility.

Let be the event that the reserve in round is assuming truthful bidding (thus , here, is the event that a bidder with valuation exceeding participates in one of the first rounds). Let be the event that the reserve in round would have been assuming truthful bidding even if the bids of agent are removed (thus , here, is the event that a bidder with valuation exceeding participates in one of the first rounds). Let


(It turns out that .) By definition, for all and all . It follows that , so, in establishing -dynamic incentive compatibility, we can ignore the trajectories under which does not occur (these trajectories have combined probability bounded above by ). Under , the reserve in round (and all later rounds) is already , making truthful bidding an exact best response in those rounds. Also, for agents whose valuation is less than , truthful bidding is always a best response. For an agent with valuation exceeding , on the equilibrium path, the only time that may potentially benefit from not being truthful is the first time that participates (and if the reserve is still ); once has bid truthfully once, future bids of have no impact on the reserve and truthful bidding is a best response. (In the present setting with mechanism , agent will see a reserve that has already risen to in each subsequent round that participates in. Later we will generalize the threshold mechanism in Section 5, but it will still be true that if bids above once, future bids of will not affect the reserve, hence truthful bidding will be exactly optimal in subsequent rounds.) Hence, it suffices to show that under for , if an agent participates for the first time in round , truthful bidding is (additively) -optimal, assuming that others bid truthfully, even if the reserve is still in round .
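To get a feel for how quickly the reserve rises under truthful bidding, the probability that some agent with a valuation above the threshold has participated before a given round can be written in closed form under simplifying assumptions. The sketch below conditions on the item type, assumes all agents share one participation probability `alpha`, and writes `q` for the probability that a single valuation exceeds the threshold; these names and simplifications are assumptions of the sketch.

```python
def prob_reserve_raised(q: float, alpha: float, n: int, t: int) -> float:
    """Probability that the reserve has already risen at the start of round t.

    q     -- probability that one agent's valuation exceeds the threshold
    alpha -- common per-round participation probability (a simplification)
    n     -- number of agents
    t     -- round index, starting at 1
    """
    earlier_rounds = t - 1
    # An agent triggers the increase if its valuation exceeds the threshold
    # and it participates in at least one earlier round.
    p_agent = q * (1.0 - (1.0 - alpha) ** earlier_rounds)
    # Agents are independent given the type, so take the complement of "no agent triggers".
    return 1.0 - (1.0 - p_agent) ** n
```

With `q = 0.5`, `alpha = 1`, and two agents, the probability at round 2 is `1 - 0.5 ** 2 = 0.75`, and it is zero at round 1 because no bids have been observed yet.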

Lemma 2

Assume that , , . Let be defined as in Eq. (2). For any agent with valuation exceeding who participates for the first time in a round , and sees a reserve , truthful bidding is (additively) -optimal, assuming that others bid truthfully. Further, we have and .

5 The Generalized Threshold Mechanism

We now present a generalization of the threshold mechanism that allows us to significantly weaken the required bound on the right tail of the low type distribution of Theorem 4.1.

The generalized threshold mechanism is defined by four parameters and is denoted by where is the initial reserve price. The reserve stays until distinct agents bid above (possibly in different rounds). If this occurs then for all subsequent rounds, the reserve price will increase to .
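The only change relative to the basic threshold rule is that the mechanism counts distinct agents who have bid above the threshold. A minimal sketch of this bookkeeping follows; the parameter name `k` for the required number of distinct agents, and the representation of a round as a mapping from agent identifier to bid, are assumptions made here for illustration.

```python
from typing import Dict, Hashable, List, Set


def generalized_threshold_reserves(
    rounds: List[Dict[Hashable, float]],
    r_low: float,
    r_high: float,
    rho: float,
    k: int,
) -> List[float]:
    """Return the reserve posted in each round of a generalized threshold mechanism.

    rounds[t] maps each participating agent's identifier to its bid in round t.
    """
    reserve = r_low
    high_bidders: Set[Hashable] = set()
    reserves = []
    for bids in rounds:
        reserves.append(reserve)
        # Track distinct agents that have bid above the threshold, possibly in different rounds.
        high_bidders.update(agent for agent, bid in bids.items() if bid > rho)
        if len(high_bidders) >= k:
            reserve = r_high  # permanent increase from the next round on
    return reserves


example_reserves = generalized_threshold_reserves(
    [{"a": 3.0}, {"a": 3.2, "b": 1.0}, {"b": 2.9}, {"c": 1.0}],
    r_low=1.0, r_high=3.0, rho=2.5, k=2,
)
```

Repeated high bids from the same agent do not advance the count, so in the example the reserve rises only after a second agent bids above the threshold. Setting `k = 1` recovers the basic threshold rule.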

Theorem 5.1

Recall (1). Let . Assume . Fix positive . Define . Let

We provide mechanisms that work well for any and .

  • Suppose . For all and , the generalized threshold mechanism is -dynamic incentive compatible, and it is additively close to the revenue benchmark.

  • For all and , the generalized threshold mechanism is -dynamic incentive compatible, and it is additively close to the revenue benchmark.

As a remark to ease the burden of notation: note that for the values of interest, i.e., for . In other words, rounds suffice to obtain our positive results. Also, note that for , we still have and as was the case for Theorem 3.1, so our requirements on the number of bidders and number of rounds needed continue to be reasonable.

6 Discussion

Transient Valuations

So far we assumed that the valuations of the agents are constant over time. In this section, we consider the following extension of our model: each time an agent participates, he draws a new valuation from independently with some probability (our original model corresponds to the case ), and retains his previous valuation with probability .
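The redraw process can be sketched as follows. The name `beta` for the redraw probability, and the callable `draw` that samples a fresh valuation, are hypothetical labels for this sketch. Redrawing at the first participation as well leaves the distribution of the first valuation unchanged, so the sketch does not treat it as a special case.

```python
import random
from typing import Callable, List, Optional


def valuation_path(
    draw: Callable[[random.Random], float],
    participation: List[bool],
    beta: float,
    rng: random.Random,
) -> List[Optional[float]]:
    """Valuation held by one agent in each round (None when the agent is absent)."""
    current = draw(rng)  # initial valuation
    path: List[Optional[float]] = []
    for present in participation:
        if not present:
            path.append(None)
            continue
        # On each participation, draw a fresh valuation with probability beta;
        # otherwise keep the previous one.
        if rng.random() < beta:
            current = draw(rng)
        path.append(current)
    return path
```

With `beta = 0` the agent keeps a single valuation throughout, which is the original model; with `beta = 1` a fresh valuation is drawn at every participation.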

We observe that our incentive compatibility result of Theorem 3.1 (also Corollary 1) holds in this setting because the incentive of the agents to deviate is even smaller and the proofs work nearly as before for any . In addition, Theorem 2.1 holds if we consider mechanisms that are periodic ex-post individually rational (Bergemann and Välimäki, 2010); in other words, the utility of any truthful agent at the end of each round should be non-negative.

The following example shows that a mechanism that is not ex-post individually rational can obtain a higher revenue by charging the agents a high price in advance: Suppose there is only one agent (), the agent participates in all rounds (), and the agent draws a new valuation at each round from the uniform distribution over (). It is not difficult to see that the optimal constant reserve for this setting is equal to which yields the expected revenue of since the agent will purchase the item with probability . Now consider a mechanism that offers reserve price (for an arbitrarily small ) in the first round and if the agent accepts that price, the mechanism offers the item for free in the future rounds, and if the agent refuses the offer, the mechanism posts a price of at each round. Observe that the agent will accept the mechanism’s offer in the first round and the revenue obtained in this case is equal to . However, this mechanism is not ex-post individually rational. For the optimal mechanism (that does not satisfy the ex-post IR property) would take the form of contracts followed by a sequence of auctions (Kakade et al., 2013; Battaglini, 2005).
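The constant-reserve part of this example can be checked numerically. The sketch below assumes the valuation is uniform on the unit interval [0, 1]; the interval and the numerical values in the paragraph above did not survive in the text, so this is an illustrative assumption and not necessarily the paper's exact setup. With a single agent, a reserve acts as a posted price, and the per-round expected revenue is the price times the probability that the valuation is at least the price.

```python
from typing import Tuple


def expected_revenue_per_round(price: float) -> float:
    """Expected one-round revenue from a single agent with valuation ~ U[0, 1].

    The unit-interval support is an assumption made for this sketch.
    """
    prob_sale = min(1.0, max(0.0, 1.0 - price))  # P(valuation >= price)
    return price * prob_sale


def best_constant_reserve(grid_size: int = 1001) -> Tuple[float, float]:
    """Grid search over [0, 1] for the revenue-maximizing constant reserve."""
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    best = max(grid, key=expected_revenue_per_round)
    return best, expected_revenue_per_round(best)
```

Under this assumption the search agrees with the closed form: p(1 - p) is maximized at p = 1/2, giving revenue 1/4 per round, and the agent buys with probability 1/2.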

Connection to Mean Field Equilibrium

We now comment briefly on the connection between our work and the concept of mean field equilibrium. A number of recent papers study notions of mean field equilibrium: Iyer et al. (2011) and Balseiro et al. (2013) study mean field equilibria in dynamic auctions, and Gummadi et al. (2013) study mean field equilibrium in multiarmed bandit games. An agent making a mean field assumption assumes that the set of competitors (or cooperators) she faces will be drawn uniformly at random from a large pool of agents with a known distribution of types. In our work, agents participate in a particular round of a dynamic auction independently at random, but our results do not require and agents retain their valuation for all rounds in which they participate. In Theorem 4.1, one can have any fixed number of agents exceeding , and participants reason about the posterior distribution of competitors they will face in a round, given the information available to them. This posterior distribution of the valuations of competitors is in general different from the prior distribution of valuations and evolves from one round to the next.

7 Conclusion

We considered repeated auctions of items, all of the same type, with the auctioneer not knowing the type of the items a priori. In our model, the issue of incentives is challenging because a bidder typically participates in multiple auctions, and is hence sensitive to changes in future reserve prices based on current bidding behavior. We demonstrated a fairly broad setting in which a simple dynamic reserve second price auction mechanism can lead to substantial improvements in revenue over the best fixed reserve second price auction. In fact, our threshold mechanism is approximately truthful and achieves near-optimal revenue in our setting. We provide a numerical illustration of our results with a reasonable choice of model parameters, and show significant improvement in revenue over the static baseline.

In future work, we would like to investigate the effects of various properties of the (joint) distributions of the valuations of the advertisers (e.g., more than two types), the characteristics of learning algorithms (as opposed to simple threshold mechanisms), and the effect of the rate (and manner) in which the valuations of advertisers change over time on the equilibrium and the revenue of the auctioneer.


We would like to thank Brendan Lucier, Mohammad Mahdian, and Mukund Sundararajan for their insightful comments and suggestions. This work was supported in part by Microsoft Research New England. The work of the second author was supported in part by a Google Faculty Research Award.


  • Acquisti and Varian [2005] Alessandro Acquisti and Hal R. Varian. Conditioning prices on purchase history. Marketing Science, 24(3):367–381, May 2005.
  • Amin et al. [2013] Kareem Amin, Afshin Rostamizadeh, and Umar Syed. Learning prices for repeated auctions with strategic buyers. In Christopher J. C. Burges, Léon Bottou, Zoubin Ghahramani, and Kilian Q. Weinberger, editors, NIPS, pages 1169–1177, 2013.
  • Aviv and Pazgal [2008] Y. Aviv and A. Pazgal. Optimal Pricing of Seasonal Products in the Presence of Forward-looking Consumers. 10(3):339–359, 2008. ISSN 1526-5498.
  • Aviv et al. [2013] Yossi Aviv, Mingcheng Wei, and Fuqiang Zhang. Responsive pricing of fashion products: The effects of demand learning and strategic consumer behavior. 2013.
  • Balseiro et al. [2011] Santiago Balseiro, Jon Feldman, Vahab S. Mirrokni, and S. Muthukrishnan. Yield optimization of display advertising with ad exchange. In Shoham et al. [2011], pages 27–28. ISBN 978-1-4503-0261-6.
  • Balseiro et al. [2013] Santiago R. Balseiro, Omar Besbes, and Gabriel Y. Weintraub. Auctions for online display advertising exchanges: Approximations and design. In Proceedings of the Fourteenth ACM Conference on Electronic Commerce, EC ’13, pages 53–54, New York, NY, USA, 2013. ACM. ISBN 978-1-4503-1962-1.
  • Battaglini [2005] Marco Battaglini. Long-term contracting with markovian customers. American Economic Review, 95(3):637–658, 2005.
  • Bergemann and Välimäki [2010] Dirk Bergemann and Juuso Välimäki. The dynamic pivot mechanism. Econometrica, 78:771–789, 2010.
  • Besbes and Zeevi [2009] Omar Besbes and Assaf Zeevi. Dynamic pricing without knowing the demand function: risk bounds and near-optimal algorithms. Operations Research, 57:1407–1420, 2009.
  • Besbes and Zeevi [2012] Omar Besbes and Assaf Zeevi. Blind network revenue management. Operations Research, 60:1520–1536, 2012.
  • Bikhchandani and McCardle [2012] Sushil Bikhchandani and Kevin McCardle. Behavior-based price discrimination by a patient seller. B.E. Journals of Theoretical Economics, 12, June 2012.
  • Celis et al. [2014] L. Elisa Celis, Gregory Lewis, Markus Mobius, and Hamid Nazerzadeh. Buy-it-now or take-a-chance: Price discrimination through randomized auctions. Management Science, 2014.
  • Charles et al. [2013] Denis Charles, Deeparnab Chakrabarty, Max Chickering, Nikhil R. Devanur, and Lei Wang. Budget smoothing for internet ad auctions: A game theoretic approach. In Proceedings of the Fourteenth ACM Conference on Electronic Commerce, EC ’13, pages 163–180, New York, NY, USA, 2013. ACM. ISBN 978-1-4503-1962-1.
  • Chien and Sinclair [2011] Steve Chien and Alistair Sinclair. Convergence to approximate nash equilibria in congestion games. Games and Economic Behavior, 71(2):315–327, 2011.
  • Conitzer et al. [2012] Vincent Conitzer, Curtis R. Taylor, and Liad Wagman. Hide and seek: Costly consumer privacy in a market with repeat purchases. Marketing Science, 31(2):277–292, March 2012. ISSN 1526-548X.
  • Correa et al. [2013] José Correa, Ricardo Montoya, and Charles Thraves. Contingent preannounced pricing policies with strategic consumers. Working Paper, 2013.
  • Cremer and McLean [1988] Jacques Cremer and Richard P McLean. Full extraction of the surplus in bayesian and dominant strategy auctions. Econometrica, 56(6):1247–57, November 1988.
  • Daskalakis et al. [2009] Constantinos Daskalakis, Aranyak Mehta, and Christos H. Papadimitriou. A note on approximate nash equilibria. Theor. Comput. Sci., 410(17):1581–1588, 2009.
  • Dasu and Tong [2010] Sriram Dasu and Chunyang Tong. Dynamic pricing when consumers are strategic: Analysis of posted and contingent pricing schemes. European Journal of Operational Research, 204(3):662–671, August 2010.
  • den Boer and Zwart [2014] Arnoud V. den Boer and Bert Zwart. Simultaneously learning and optimizing using controlled variance pricing. Management Science, 2014.
  • Esteves [2009] Rosa Branca Esteves. A survey on the economics of behaviour-based price discrimination. NIPE Working Papers 5/2009, NIPE - Universidade do Minho, 2009.
  • Feder et al. [2007] Tomás Feder, Hamid Nazerzadeh, and Amin Saberi. Approximating nash equilibria using small-support strategies. In Jeffrey K. MacKie-Mason, David C. Parkes, and Paul Resnick, editors, ACM Conference on Electronic Commerce, pages 352–354. ACM, 2007. ISBN 978-1-59593-653-0.
  • Fudenberg and Villas-Boas [2007] Drew Fudenberg and J. Miguel Villas-Boas. Behavior-Based Price Discrimination and Customer Recognition. Elsevier Science, Oxford, 2007.
  • Goel et al. [2010] Ashish Goel, Mohammad Mahdian, Hamid Nazerzadeh, and Amin Saberi. Advertisement allocation for generalized second-pricing schemes. Operations Research Letters, 38(6):571–576, 2010.
  • Gummadi et al. [2013] Ramki Gummadi, Peter Key, and Alexandre Proutiere. Optimal bidding strategies and equilibria in dynamic auctions with budget constraints. Working Paper, 2013.
  • Harrison et al. [2012] J. Michael Harrison, N. Bora Keskin, and Assaf Zeevi. Bayesian dynamic pricing policies: Learning and earning under a binary prior distribution. Management Science, 58(3):570–586, 2012.
  • Hart and Tirole [1988] Oliver D. Hart and Jean Tirole. Contract renegotiation and coasian dynamics. Review of Economic Studies, 55:509–540, 1988.
  • Hémon et al. [2008] Sébastien Hémon, Michel de Rougemont, and Miklos Santha. Approximate nash equilibria for multi-player games. In Burkhard Monien and Ulf-Peter Schroeder, editors, SAGT, volume 4997 of Lecture Notes in Computer Science, pages 267–278. Springer, 2008. ISBN 978-3-540-79308-3.
  • Iyer et al. [2011] Krishnamurthy Iyer, Ramesh Johari, and Mukund Sundararajan. Mean field equilibria of dynamic auctions with learning. In Shoham et al. [2011], pages 339–340. ISBN 978-1-4503-0261-6.
  • Kakade et al. [2013] Sham M. Kakade, Ilan Lobel, and Hamid Nazerzadeh. Optimal dynamic mechanism design and the virtual pivot mechanism. Operations Research, 61(4):837–854, 2013.
  • Kearns and Mansour [2002] Michael J. Kearns and Yishay Mansour. Efficient nash computation in large population games with bounded influence. In Adnan Darwiche and Nir Friedman, editors, UAI, pages 259–266. Morgan Kaufmann, 2002. ISBN 1-55860-897-4.
  • McAfee [2011] Preston McAfee. The design of advertising exchanges. Review of Industrial Organization, 39(3):169–185, 2011.
  • McAfee and Vincent [1992] R Preston McAfee and Daniel Vincent. Updating the reserve price in common-value auctions. American Economic Review, 82(2):512–18, May 1992.
  • McAfee et al. [1989] R Preston McAfee, John McMillan, and Philip J Reny. Extracting the surplus in the common-value auction. Econometrica, 57(6):1451–59, November 1989.
  • Muthukrishnan [2009] S. Muthukrishnan. Ad exchanges: Research issues. In Internet and Network Economics, 5th International Workshop (WINE), pages 1–12, 2009.
  • Myerson [1986] Roger Myerson. Multistage games with communications. Econometrica, 54(2):323–358, 1986.
  • Myerson [1981] Roger B. Myerson. Optimal auction design. Mathematics of Operations Research, 6(1):58–73, 1981.
  • Nazerzadeh et al. [2013] Hamid Nazerzadeh, Amin Saberi, and Rakesh Vohra. Dynamic cost-per-action mechanisms and applications to online advertising. Operations Research, 61(1):98–111, 2013.
  • Ostrovsky and Schwarz [2009] Michael Ostrovsky and Michael Schwarz. Reserve prices in internet advertising auctions: A field experiment. Working Paper, http://faculty-gsb.stanford.edu/ostrovsky/papers/rp.pdf, 2009.
  • Papadimitriou and Pierrakos [2011] Christos H. Papadimitriou and George Pierrakos. On optimal single-item auctions. In STOC, pages 119–128, 2011.
  • Riley and Samuelson [1981] John G. Riley and William F. Samuelson. Optimal auctions. American Economic Review, 71(3):381–392, 1981.
  • Salant [1989] Stephen W Salant. When is inducing self-selection suboptimal for a monopolist? The Quarterly Journal of Economics, 104(2):391–97, May 1989.
  • Segal [2003] Ilya Segal. Optimal pricing mechanisms with unknown demand. The American Economic Review, 93(3):509–529, 2003.
  • Shoham et al. [2011] Yoav Shoham, Yan Chen, and Tim Roughgarden, editors. Proceedings 12th ACM Conference on Electronic Commerce (EC-2011), San Jose, CA, USA, June 5-9, 2011. ACM. ISBN 978-1-4503-0261-6.
  • Stokey [1979] Nancy L Stokey. Intertemporal price discrimination. The Quarterly Journal of Economics, 93(3):355–71, August 1979.
  • Taylor [2004] Curtis R. Taylor. Consumer privacy and the market for customer information. RAND Journal of Economics, 35(4):631–650, Winter 2004.
  • Villas-Boas [2004] J. Miguel Villas-Boas. Price cycles in markets with customer recognition. RAND Journal of Economics, 35(3):486–501, Autumn 2004.

Appendix 0.A Appendix

0.A.1 Proof of Theorem 3.1

We now prove Theorem 3.1 by first showing that the threshold mechanism is -incentive-compatible. First assume . Consider any agent with valuation and assume that other agents are always truthful. Since , it is clear that truthful bidding weakly dominates any other strategy, since this is true myopically, the reserve is unaffected, and the bidding behavior of others is unaffected by the bids of agent . In this case, the reserve remains and the agents bid truthfully throughout, so there is no loss in revenue.

Now assume . In Appendix 0.D, we prove the following lemma.

Lemma 3

Assume , . Fix an agent . With probability at least , irrespective of what agent does, at least bidder with valuation exceeding will bid in the first rounds. For , at least bidders different from with valuation exceeding will bid in the first rounds with probability at least .

Here we have used . We now show how the lemma, for , implies the results.


Proof of Theorem 3.1. In the first rounds, agent being truthful can cause the reserve to rise though it wouldn’t otherwise have risen, leading to a loss of at most in expected utility for the agent. If the reserve would not have risen in the first rounds but agent caused it to rise, this can lead to a further loss of up to per round of participation, and such a loss occurs with probability at most from Lemma 3, leading to a bound of for this loss to the agent. Combining yields the overall bound of on the loss incurred by agent by being truthful relative to any other strategy for .

Finally we bound the loss in revenue from using this mechanism if . Recall that the optimal auction if the auctioneer knows beforehand is to commit and run a second price auction with reserve in all rounds. Hence, similar to the above, the expected revenue loss to the auctioneer is bounded by if . Since for each possible , the expected loss in revenue is bounded above by , the same bound holds when we take expectation over .

Appendix 0.B Proof of Theorem 4.1

First assume . A simple union bound ensures that all bidders have a valuation of at most with probability at least , in which case truthful bidding weakly dominates any other strategy. (Since , it is clear that truthful bidding weakly dominates any other strategy, since this is true myopically and the bidding behavior of others is unaffected by the bids of agent .) Hence, the realization is -good with respect to the mechanism with probability at least . Further, we can easily bound the loss in expected revenue relative to the benchmark under truthful bidding: There is no loss with probability (the mechanism matches the benchmark mechanism, since the reserve remains throughout) and a loss of at most (due to the reserve rising to ) with probability . Thus, the loss in expected revenue is bounded by as required.

Now assume . For any agent with a valuation less than or equal to , truthful bidding again weakly dominates any other strategy, since this is true myopically and the bidding behavior of others is unaffected by the bids of agent . It remains to deal with agents whose valuation exceeds , to establish that truthful bidding is -incentive-compatible. In particular, we need to show that with probability at least , the realization is -good with respect to the mechanism, i.e., that each such agent loses no more than in expectation on the equilibrium path from bidding truthfully in round , for each that participates in. But this follows from Lemma 2: all realizations such that occurs are -good, and . See the argument after the statement of Theorem 4.1 in Section 4.1 for further details.

It remains to show that the loss in revenue is no more than , assuming truthful bidding under -good realizations. Now, using Lemma 3 and , we have

Under , the mechanism matches the benchmark mechanism for rounds after and hence there is no loss in revenue relative to the benchmark, after the first rounds. In any round, the loss due to setting the wrong reserve (under truthful bidding) is bounded by . Under , the loss can be this large in each of rounds, in the worst case. It follows that the overall loss in revenue is bounded by . But by definition, and , implying that the loss in revenue relative to the benchmark is at most as required, using the definition of .

Appendix 0.C Proof of Theorem 5.1

We start with the first bullet. The proof for follows exactly the same steps as the proof of Theorem 4.1, except that we make use of the second part of Lemma 3 (using ) since we are using instead of . Consider . The probability of or more bidders with valuation exceeding is . Since , we have the mean of the binomial , in particular, . Now, using a Chernoff bound (on where leading to a mean of ; clearly this binomial stochastically dominates the one we care about), we infer that

If all valuations are no more than then such a realization is clearly -good (i.e., incentive compatible in an exact sense) with respect to the mechanism. Hence, we have shown that the probability of the realization being -good is at least , implying -dynamic incentive compatibility for . Further, the loss in expected revenue for is bounded above by as required.

Now consider the second bullet. Consider . The threshold is . Let .

Lemma 4

Assume , and . Fix an agent . With probability at least , irrespective of what agent does, at least bidders different from with valuation exceeding will bid in the first rounds. with probability at least .

To use Lemma 4 we need an upper bound on . Note that using and , we have . Hence we have

using . It follows that

With reference to the upper bound on in Lemma 4, we deduce that

Hence, using Lemma 4, we deduce that under truthful bidding, the reserve rises to within with probability at least . Following the argument in the proof of Lemma 2 from here, we deduce -dynamic incentive compatibility for . We also deduce that the loss in expected revenue is small, as in the proof of Theorem 4.1.

Consider the second bullet and . The probability of or more bidders with valuation exceeding is . The mean since . We infer using a Chernoff bound that

We then complete the proof of approximate dynamic incentive compatibility and revenue optimality exactly as we did for the first bullet with .

Appendix 0.D Proofs of Lemmas


Proof of Lemma 1. To prove the first part of the claim, we construct mechanism that obtains, in expectation, revenue equal to . Since by definition is the optimal revenue that can be obtained when , we conclude that .

We construct mechanism as follows: Let be the set of agents who participate in the one-round auction. Note that each agent knows his own but not for any other agent . For all agents , draw a (hypothetical) valuation i.i.d. from the distribution of valuations . Now consider the probability space generated by simulating mechanism over a round auction by sampling ’s in each round and emulating the (optimal) bidding strategy of the agents under .

Consider the distribution of in rounds where the set of agents who participate is exactly , in this probability space. More precisely, we are considering not a single simulation, but the probability space of possible simulation trajectories. For each trajectory , the pair for each round in which agents participate contributes a weight in proportional to the probability of trajectory .

To determine the payments under , draw uniformly from distribution . The mechanism charges the agents these amounts and allocates the items according to .

We argue that the mechanism is truthful: It is not hard to see that the ex interim expected utility of an agent from participating in with bid when others bid truthfully, is exactly times the ex interim expected utility of participating in and following his equilibrium strategy for valuation there if others follow their equilibrium strategies. Recall that each agent knows his own but not for any other agent . It follows that truthful bidding is an equilibrium in mechanism . Further, under truthful bidding, it is not hard to see that the expected revenue of mechanism is , as claimed. Note that when , , the proof would be simplified and could be argued using the revelation principle (Myerson, 1986).

We now prove the second part of the claim. Note that if is ex-post incentive compatible, the leakage of information from one round to another does not change the strategy of the bidders. Therefore, repeating mechanism obtains revenue which is the upper bound on revenue.


Proof of Lemma 2. Consider any agent . By definition of , we know that for , for all agents we have


Let . Note that . It follows from Lemma 3 that


In particular, we have and , yielding the second part of the lemma.

Combining Eqs. (7) and (4) we obtain that


Hence, agent who participates for the first time in round and sees reserve , infers that the reserve will rise to by round with probability at least , due to the bids of other agents. Thus, we can bound the expected cost in future rounds to agent by causing the reserve to rise by bidding truthfully:

  • Under , agent may lose at most in future rounds (in expectation).

  • Under , agent may lose at most