Liquid Welfare guarantees for No-Regret Learning in Sequential Budgeted Auctions

1 Introduction

The work of [DBLP:conf/sigecom/BalseiroG17] focuses only on guarantees from the players’ perspective (lower and upper bounds for a player’s utility compared to the maximum utility in retrospect). [DBLP:journals/corr/GaitondeLLLS22] focus on guarantees of the pacing algorithms for liquid welfare. Liquid welfare was first introduced in [DBLP:conf/icalp/DobzinskiL14], and is the standard efficiency metric that generalizes social welfare when players are budget constrained: the liquid welfare of a player is the minimum of the value she receives and her budget. The optimal liquid welfare is a reasonable benchmark for the social value of the outcome: one cannot expect players with low budgets to be able to pay to achieve very high values. [DBLP:journals/corr/GaitondeLLLS22] prove that when all players use the pacing algorithm of [DBLP:conf/sigecom/BalseiroG17] the liquid welfare is within a factor of 2 of the optimal one, and their results hold for both first and second-price auctions. However, the work of [DBLP:journals/corr/GaitondeLLLS22] makes two restrictive assumptions. First, they assume that every player is using a variant of the aforementioned pacing algorithm of [DBLP:conf/sigecom/BalseiroG17]. Second, they assume that the players’ values are sampled from the same distribution every round.

An appealing and weaker behavioral assumption is to assume that players have small regret or small competitive ratio in the utility they achieve, which requires the players to guarantee a fraction of a benchmark up to additive factors without requiring that they use a particular algorithm. [DBLP:journals/corr/GaitondeLLLS22] prove that in sequential budgeted second-price auctions, even when players have no-regret, the resulting liquid welfare can be arbitrarily bad compared to the optimal one (see Theorem 5.1 for the result). In second-price auctions, it is often the case (even without budgets) that the item prices are low even if players do not want to deviate; this can result in a player paying little to get high utility which, coupled with a low budget, leads to small liquid welfare.

In contrast, we focus on sequential budgeted first-price auctions, where we show that no-regret, and even player utility that is a factor $\gamma$ away from optimal, does imply guarantees about liquid welfare. In particular, we focus on guarantees about the competitive ratio of the players. When players are budgeted and the setting (in our case the players’ values) is picked adversarially, no-regret guarantees are not achievable (this was also proven for the second-price setting by [DBLP:conf/sigecom/BalseiroG17]). Instead, work in this setting focuses on guarantees about the competitive ratio, which is the ratio between the reward of the best fixed action in retrospect (a shading factor, in our case) and the reward the player achieves. We summarize our main results next.

• Our first and main result in Section 4, is that in sequential budgeted first-price auctions if every player is guaranteed a competitive ratio of compared to using the best fixed shading factor (while budget lasts), then the liquid welfare is within a factor of of the optimal one (Theorem 4.3). When , the same theorem guarantees a factor of about , which is close to the bound of 2 of [DBLP:journals/corr/GaitondeLLLS22] and [DBLP:conf/sigecom/BalseiroKK22]. However, the former work assumes that the players’ type is sampled from the same distribution every round and that the players use the pacing algorithm of [DBLP:conf/sigecom/BalseiroG17], which for second-price auctions guarantees a competitive ratio of . [DBLP:conf/sigecom/BalseiroKK22] assume players with i.i.d. type and prove their bound for an equilibrium where the competitive ratio is by the definition of an equilibrium. Instead, we make only the much weaker behavioral assumption that the players employ a competitive bidding strategy.

The main technique we use to prove our theorem is the following. We compare the utility achieved by each agent to the utility possible for a well-chosen shading factor . The competitive ratio assumption guarantees that the players’ utility is comparable to what would have been achievable with such a fixed shading. We need to distinguish two cases. If by using this shading factor a player would not have run out of budget, then her utility in that case is high using ideas similar to the classic price of anarchy work (e.g., [DBLP:conf/stoc/SyrgkanisT13]). In contrast, if that multiplier would have led to the agent being limited by her budget, then almost all of the budget is used, which again yields high utility: the total value won by the agent would have essentially been at least times her budget.

• In Section 5, we prove an almost matching lower bound: if every player has competitive ratio or better then the resulting liquid welfare may be times the optimal one, for any (Theorem 5.3). In addition, for , we prove a lower bound (Theorem 5.2).

• In Section 6, we prove that any guarantee for Bandits with Knapsacks also translates to a guarantee for bidding in sequential budgeted first-price auctions. This proves that a player can always achieve a competitive ratio of $O(\log T)$ with high probability, even when her values and the other players’ bids are adversarially picked (Theorem 6.3).

To achieve this extension we have to overcome a few differences in our setting and the classical Bandits with Knapsacks setting. Our action space is continuous and the known results in that case assume that, as a function of the action, the rewards are concave and consumption of the resource is convex, which is not true in our case. Additionally, in Bandits with Knapsacks, an action that would lead to over-consumption of the resource ends the game, while in our setting the bid is adjusted to the remaining budget and the game goes on.

• Finally, in Section 7, we extend our first result to the case when users have submodular valuations instead of additive ones. In this case, we prove that if every player achieves a competitive ratio of , then the liquid welfare is within a factor of of the optimal one (Theorem 7.1), a bound that is a little worse than our bound of for the linear case.

2 Related Work

Assuming that players’ behavior is based on using a shading multiplier to shade one’s value has received a lot of attention recently. [DBLP:conf/sigecom/BalseiroG17] have proposed an adaptive pacing mechanism for sequential second-price auctions that shades one’s bid based on the total budget and payment up to that round. From a single user’s perspective, they prove that even if values and prices are picked adversarially, the player can guarantee a competitive ratio of compared to her maximum utility in retrospect, where is the maximum value she can have each round, is her budget, and is the number of rounds; they also prove that this bound is tight. For the case where values and prices are picked from the same distribution every round, they prove a competitive ratio of . They also prove the same result when every player is using their pacing algorithm and their values are sampled from a distribution. In repeated second-price auctions, bidding with a fixed shading factor can give the optimal utility in hindsight, up to a small additive error. While this is not true for first price, the simplicity of using shading factors to manage one’s budget motivates this bidding strategy in that case as well; see, for example, the works of [DBLP:conf/ec/ConitzerKPSSMW19, DBLP:conf/sigecom/BalseiroKK22].

[DBLP:journals/corr/GaitondeLLLS22] focus on guarantees of the pacing algorithm of [DBLP:conf/sigecom/BalseiroG17] in sequential budgeted second and first-price auctions. They prove that when users’ values are picked from the same distribution every round, liquid welfare is within a factor of 2 of the optimal one, up to an additive error that is sublinear in the number of rounds. [DBLP:journals/corr/GaitondeLLLS22] also prove that the weaker behavioral assumption that players have no-regret or a small competitive ratio is not enough to bound liquid welfare in second-price auctions: even when players have no-regret, the resulting liquid welfare can be arbitrarily bad compared to the optimal one. In contrast, we prove such guarantees for the case of first price.

In the offline setting, [DBLP:conf/ec/ConitzerKPSSMW19], as well as the more recent paper of [DBLP:conf/sigecom/BalseiroKK22], focus on shading-based equilibria in budgeted first-price auctions. [DBLP:conf/sigecom/BalseiroKK22] prove that when every player’s type is sampled from the same distribution, if players shade their values and then use those to bid according to the standard symmetric first-price auction equilibrium, this produces a symmetric Bayesian Nash equilibrium of a single item auction while observing the budget limit in expectation. They consider this soft budget limit (in expectation only) because, when such an equilibrium solution is repeated many times, concentration ensures that the true overall budget is essentially observed. Their work naturally extends value shading, which is one of the several ways budgets are managed in practice, to non-truthful auctions, e.g., see [DBLP:conf/wine/ConitzerKSM18], [DBLP:conf/ec/ConitzerKPSSMW19], and [DBLP:journals/ior/BalseiroKMM21] (the first and third works focus on second price and the other one on first price). Equilibria efficiency guarantees for such games were also established in [DBLP:conf/wine/AggarwalBM19] under more general payment constraints, as well as in [DBLP:conf/innovations/Babaioff0HIL21] for more general utility measures. Both show that the liquid welfare of a pure Nash equilibrium of the static game is within a factor of of the optimal liquid welfare, when the underlying auction is truthful, e.g., a second-price auction. In contrast, our efficiency results apply to player behavior in a repeated auction setting not only at equilibrium but also without converging to equilibrium.

As mentioned previously, the concept of liquid welfare was introduced in [DBLP:conf/icalp/DobzinskiL14] and was also used by [DBLP:conf/stoc/SyrgkanisT13] (the latter refers to liquid welfare as effective welfare). [DBLP:conf/stoc/SyrgkanisT13] focus on the price of anarchy of simple mechanisms (first or second-price auctions) but, for budgeted players, they compare the resulting social welfare with the optimal liquid welfare, which is a rather unfair comparison, as the former can be arbitrarily bigger than the latter. [DBLP:conf/icalp/DobzinskiL14] focus on designing mechanisms that maximize the liquid welfare. [DBLP:conf/aaai/FotakisLP19] convert known incentive compatible mechanisms that maximize social welfare for submodular players to incentive compatible mechanisms that maximize liquid welfare with similar guarantees.

Analyzing the outcome of regret-minimizing players in auctions has received attention from the research community recently. [DBLP:conf/www/KolumbusN22] study repeated auctions with two players who report their (potentially different) value to a regret-minimizing algorithm. They notice that in second-price auctions the induced dynamics might not converge to the equilibrium bidding, and therefore players might have an incentive to misreport their values to the algorithms. They also prove that this is not the case for first-price auctions. Another similar line of work is that of [DBLP:conf/aaai/0004GLMS21] and [DBLP:conf/www/DengHLZ22], both of which study the convergence to equilibria of players who bid according to mean-based learning algorithms, a category of no-regret learning algorithms. [DBLP:conf/aaai/0004GLMS21] focus on second-price, first-price, and multi-position VCG auctions when players have i.i.d. distributions and prove that the bids converge to the canonical Bayes-Nash equilibrium. [DBLP:conf/www/DengHLZ22] focus on first-price auctions when players have fixed but different values and study conditions on the players’ values that guarantee, or rule out, convergence to a Nash equilibrium.

Another interesting series of related work is that of adversarial Bandits with Knapsacks: a budget limited player is trying to maximize her total reward by picking an action each round, where each action has a reward and costs that are subtracted from the budget of each of her resources; rewards and costs are picked by an adversary. The framework was first introduced by [DBLP:conf/focs/ImmorlicaSSS19], who prove that the player can achieve competitive ratio and sublinear regret, both in expectation and with high probability, where is the number of resources. [DBLP:conf/colt/Kesselheim020] offer an improved competitive ratio. Another interesting line of work (that precedes the previous one) focuses on the same scenario, but where the rewards and costs are sampled from an unknown (potentially correlated) distribution every round. In contrast to the adversarial setting, here the competitive ratio of is achievable [DBLP:conf/focs/BadanidiyuruKS13]. An excellent discussion of both works and multi-armed bandits in general can be found in [DBLP:journals/ftml/Slivkins19].

3 Preliminaries and Model

We assume that there are $n$ players and $T$ rounds. Every player $i$ has an additive valuation (we will generalize this in Section 7): every round $t$ she has a value $v_{it}$ for the item being auctioned that round and if she gets allocated the items of rounds $\mathcal{T} \subseteq [T]$ her total allocated value is

$$V_i = \sum_{t \in \mathcal{T}} v_{it}$$

We assume that the players’ values are bounded: for every $i$ and $t$, $v_{it} \in [0, 1]$.

Unless stated otherwise, we will focus on first-price auctions. This means that every round $t$, each player $i$ submits a bid $b_{it}$ and the player with the highest bid wins the item and pays her bid (ties are broken arbitrarily). We denote with $p_t$ the price of the item, i.e., the highest bid, and with $d_{it}$ the highest competing bid faced by player $i$, i.e., $d_{it} = \max_{j \ne i} b_{jt}$. If player $i$ gets allocated the items of rounds $\mathcal{T}$, then her total payment is

$$P_i = \sum_{t \in \mathcal{T}} p_t$$

We assume that every player $i$ has a budget $B_i$ and budgeted quasi-linear utility, i.e., her utility when her total value and payment are $V_i$ and $P_i$ is

$$U_i = \begin{cases} V_i - P_i, & \text{if } P_i \le B_i \\ -\infty, & \text{otherwise} \end{cases}$$

We will focus on the case where the budget of every player is linear in time (note that if then player is never budget constrained). This assumption is used in [DBLP:conf/sigecom/BalseiroG17] and is also used to get meaningful guarantees for adversarial Bandits with Knapsacks (see [DBLP:journals/ftml/Slivkins19] for more information).

We evaluate the auction system by measuring how well it does at maximizing the Liquid Welfare: the liquid welfare of a player $i$ is

$$LW_i = \min\{B_i, V_i\}$$

and the total liquid welfare is

$$LW = \sum_{i \in [n]} LW_i$$

We denote the optimal liquid welfare with $LW^*$.
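The definitions above can be sketched in a few lines of Python. This is only an illustration: the function names and the two-player numbers are hypothetical and do not come from the paper.

```python
from typing import Sequence


def budgeted_utility(value: float, payment: float, budget: float) -> float:
    """Budgeted quasi-linear utility: V - P if P <= B, otherwise -infinity."""
    return value - payment if payment <= budget else float("-inf")


def liquid_welfare(values_won: Sequence[float], budgets: Sequence[float]) -> float:
    """Total liquid welfare: sum over players of min(budget, allocated value)."""
    if len(values_won) != len(budgets):
        raise ValueError("need one allocated value and one budget per player")
    return sum(min(b, v) for v, b in zip(values_won, budgets))


# Hypothetical example: player 0 wins value 5 but has budget 2,
# player 1 wins value 1 and has budget 3.
example_lw = liquid_welfare([5.0, 1.0], [2.0, 3.0])  # min(2, 5) + min(3, 1)
```

The example shows why liquid welfare caps a player's contribution at her budget: player 0 contributes 2 rather than 5.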

3.1 The behavioral assumption

We assume that all players learn to bid while participating in the auction. When budgeted players are participating in repeated auctions where the values and prices may be adversarially picked, no-regret learning is not possible, but learning algorithms can achieve a bounded competitive ratio. Specifically in this work, we are going to compare a player’s resulting utility with her utility had she bid $\lambda$ times her value every round for any $\lambda \in [0,1]$, up to her budget. More specifically, we denote with $\hat{U}_i(\lambda)$ the resulting utility had player $i$ bid using shading multiplier $\lambda$ every round, constrained by her budget, i.e., $\hat{U}_i(\lambda)$ is her utility if her bid on every round $t$ was

$$\hat{b}_{it} = \min\left\{ \lambda v_{it},\; B_i - \sum_{\tau=1}^{t-1} \hat{b}_{i\tau}\, \mathbb{1}\left[\hat{b}_{i\tau} > d_{i\tau}\right] \right\}$$

i.e., every round player $i$’s bid is the minimum of $\lambda v_{it}$ and her remaining budget.

We say that player $i$ has competitive ratio $\gamma$ and regret $Reg_i$, if $Reg_i$ is sublinear in $T$ and for her resulting utility it holds that

$$U_i \ge \frac{\max_{\lambda \in [0,1]} \hat{U}_i(\lambda) - Reg_i}{\gamma}$$

In the special case that $\gamma = 1$, we say that player $i$ has no-regret.
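The benchmark in this definition can be simulated directly. The sketch below is a minimal illustration under stated assumptions: the function names are hypothetical, the maximum over multipliers is approximated on a uniform grid (the true maximum is over a continuum), and the four-round instance is made up.

```python
from typing import Sequence


def shaded_utility(lam: float, values: Sequence[float],
                   competing: Sequence[float], budget: float) -> float:
    """Utility of bidding min(lam * v_t, remaining budget) in every round
    of a first-price auction; the player wins when her bid exceeds d_t."""
    remaining = budget
    utility = 0.0
    for v, d in zip(values, competing):
        bid = min(lam * v, remaining)
        if bid > d:              # win and pay own bid (first price)
            utility += v - bid
            remaining -= bid
    return utility


def best_shaded_utility(values, competing, budget, grid: int = 1000) -> float:
    """Grid approximation of the best fixed multiplier in [0, 1] (an assumption:
    the paper's benchmark maximizes over the whole interval)."""
    return max(shaded_utility(k / grid, values, competing, budget)
               for k in range(grid + 1))


def regret_needed(realized_utility: float, gamma: float,
                  values, competing, budget) -> float:
    """Smallest Reg such that U >= (benchmark - Reg) / gamma holds."""
    benchmark = best_shaded_utility(values, competing, budget)
    return max(0.0, benchmark - gamma * realized_utility)


# Hypothetical four-round instance.
values = [1.0, 0.8, 0.6, 1.0]
competing = [0.4, 0.5, 0.2, 0.3]
half_shading = shaded_utility(0.5, values, competing, budget=1.0)
```

With multiplier 0.5 the player wins rounds 1 and 3; in round 4 her bid is capped by the remaining budget of 0.2, which loses to the competing bid of 0.3.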

4 Upper Bound for Liquid Welfare

In this section, we are going to prove the upper bound guarantee for liquid welfare when all the players have a bounded competitive ratio. We start with a lower bound for the utility of a player when she plays according to the optimal in retrospect shading multiplier.

Lemma 4.1.

Fix a player , her values , and the maximum bids of the other players . Let be the items that player gets in the allocation that maximizes the total liquid welfare. Let . Then for any such that (equivalent to ), it holds that

$$\max_{\mu \in [0,1]} \hat{U}_i(\mu) \ge c(\lambda)\,\lambda\, LW^*_i - c(\lambda) \sum_{t \in O_i} d_{it} - 1$$

To prove the lemma we examine and distinguish two cases. First, if player runs out of budget, then she has spent almost all her budget, up to a constant. This allows us to lower bound her utility since every time she wins an item the value she earns from it is at least times the payment (either if she is budget constrained or not). Second, if the player is not budget constrained, we can show that by bidding according to a randomized multiplier, the desired bound holds.

Proof.

Fix and as described above. First, we examine the case where, if the player bids according to multiplier , then she runs out of budget, i.e., at some round she is budget constrained and therefore bids less than . In this case, her total payment is at least and every time she gets an item the value she gets from it is at least times the price. This proves that

$$\hat{U}_i(\lambda) \ge \left(\frac{1}{\lambda} - 1\right)(B_i - \lambda) = \left(\frac{1}{\lambda} - 1\right) B_i - 1 + \lambda$$

which proves the lemma in this case since , , and .

Now we examine the case where player is not budget constrained when using multiplier , which also proves she is not budget constrained for any multiplier . If player uses multiplier , then her utility is

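$$\hat{U}_i(\mu) = (1-\mu) \sum_{t \in [T]} v_{it}\, \mathbb{1}[\mu v_{it} > d_{it}] \ge (1-\mu) \sum_{t \in O_i} v_{it}\, \mathbb{1}[\mu v_{it} > d_{it}] \tag{1}$$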

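If the multiplier $\mu$ is picked from the distribution that has probability density function

$$f_\lambda(\mu) = \begin{cases} \dfrac{c(\lambda)}{1-\mu}, & \text{if } \mu \in [0, \lambda] \\ 0, & \text{otherwise} \end{cases}$$

then taking the expectation of (1) we get

$$\begin{aligned}
\mathbb{E}_{\mu \sim f_\lambda}\left[\hat{U}_i(\mu)\right] &\ge \int_{0}^{\lambda} (1-\mu) \sum_{t \in O_i} v_{it}\, \mathbb{1}[\mu v_{it} > d_{it}]\, \frac{c(\lambda)}{1-\mu}\, d\mu \\
&\ge c(\lambda) \sum_{t \in O_i} v_{it} \left(\lambda - \frac{d_{it}}{v_{it}}\right) \\
&= c(\lambda)\,\lambda \sum_{t \in O_i} v_{it} - c(\lambda) \sum_{t \in O_i} d_{it}
\end{aligned}$$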

The above proves what we want because and . ∎
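For reference, the two integration steps used in this case can be written out. This is a sketch under the assumption that $c(\lambda)$ denotes the normalizing constant of the density $f_\lambda$; that reading is inferred from the requirement that $f_\lambda$ integrates to one.

$$\int_0^\lambda f_\lambda(\mu)\, d\mu = c(\lambda) \int_0^\lambda \frac{d\mu}{1-\mu} = c(\lambda) \ln\frac{1}{1-\lambda} = 1 \quad\Longrightarrow\quad c(\lambda) = \frac{1}{\ln\frac{1}{1-\lambda}}$$

$$\int_0^\lambda \mathbb{1}[\mu v_{it} > d_{it}]\, d\mu = \max\left\{\lambda - \frac{d_{it}}{v_{it}},\, 0\right\} \ge \lambda - \frac{d_{it}}{v_{it}} \qquad (v_{it} > 0)$$

The factor $\frac{1}{1-\mu}$ in the density is what cancels the $(1-\mu)$ in the utility, so each item contributes an amount linear in $\lambda$ and $d_{it}$.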

We now prove a lower bound for the expected liquid welfare of player , given a bound for her competitive ratio. We require the bound for the competitive ratio to be one with high probability (see Theorem 6.3 for the guarantee we prove).

Lemma 4.2.

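Fix a player $i$, her values $v_{i1}, \dots, v_{iT}$, and the highest bids of the other players $d_{i1}, \dots, d_{iT}$. Additionally, assume that player $i$ has competitive ratio $\gamma$ and regret $Reg_i$ ($Reg_i$ is usually a function of $T$ and $\delta$) with probability $1-\delta$, i.e.,

$$U_i \ge \frac{\max_{\lambda} \hat{U}_i(\lambda) - Reg_i}{\gamma} \tag{2}$$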

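Let $O_i$ be the items player $i$ gets in the allocation that maximizes the total liquid welfare and define $LW^*_i = \min\{B_i, \sum_{t \in O_i} v_{it}\}$. Then for any $\lambda$ satisfying the condition of Lemma 4.1, it holds that

$$\mathbb{E}[LW_i] \ge (1-\delta)\left( \frac{c(\lambda)\,\lambda}{\gamma}\, LW^*_i - \frac{c(\lambda) \sum_{t \in O_i} d_{it} + 1 + Reg_i}{\gamma} \right)$$

where the expectation is taken only over the bids of player $i$.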

Proof.

Define the following two events:

$$E_1 = \{V_i \le B_i\} \quad \text{and} \quad E_2 = \{\text{(2) holds}\}$$

This makes

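$$\begin{aligned}
\mathbb{E}[LW_i] &\ge \mathbb{E}[LW_i \mid E_1^c]\, \mathbb{P}[E_1^c] + \mathbb{E}[LW_i \mid E_1 \cap E_2]\, \mathbb{P}[E_1 \cap E_2] && \text{(law of total expectation)} \\
&\ge B_i\, \mathbb{P}[E_1^c] + \mathbb{E}[V_i \mid E_1 \cap E_2]\, (\mathbb{P}[E_1] - \delta) && (\mathbb{P}[E_2] \ge 1-\delta) \\
&\ge \mathbb{E}[V_i \mid E_1 \cap E_2]\, \mathbb{P}[E_1^c] + \mathbb{E}[V_i \mid E_1 \cap E_2]\, (\mathbb{P}[E_1] - \delta) && (B_i \ge \mathbb{E}[V_i \mid E_1 \cap E_2]) \\
&= \mathbb{E}[V_i \mid E_1 \cap E_2]\, (1-\delta) \\
&\ge \mathbb{E}[U_i \mid E_1 \cap E_2]\, (1-\delta) && (V_i \ge U_i)
\end{aligned}$$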

We now focus on the expectation of the last term above:

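$$\begin{aligned}
\mathbb{E}[U_i \mid E_1 \cap E_2] &\ge \mathbb{E}\left[ \frac{\max_\lambda \hat{U}_i(\lambda) - Reg_i}{\gamma} \,\middle|\, E_1 \cap E_2 \right] && \text{(using (2))} \\
&= \frac{\max_\lambda \hat{U}_i(\lambda) - Reg_i}{\gamma} && \text{(no dependence on } i\text{'s bids)} \\
&\ge \frac{c(\lambda)\,\lambda}{\gamma}\, LW^*_i - \frac{c(\lambda) \sum_{t \in O_i} d_{it} + 1 + Reg_i}{\gamma} && \text{(Lemma 4.1)}
\end{aligned}$$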

Using the last lemma, we can prove a bound for the total liquid welfare. The trickiest part of this theorem is choosing the optimal $\lambda$, which turns out to be a complicated function of $\gamma$.

Theorem 4.3.

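Assume that every player $i$ has competitive ratio $\gamma$ and regret $Reg_i$ with probability at least $1-\delta$. Then for the resulting liquid welfare $LW$ and the optimal one $LW^*$ it holds that

$$\mathbb{E}[LW] \ge (1-\delta)\left( \frac{1}{\gamma + \sqrt{2\gamma} + O(1)}\, LW^* - \frac{1}{\gamma + \sqrt{\gamma/2} + O(1)} \left( n + \sum_i Reg_i \right) \right) \tag{3}$$

where the $O(1)$ terms only depend on $\gamma$.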

Remark 4.4.

For small , a plot of the Price of Anarchy (term in the denominator of the fraction in front of the term in (3)) is shown in Fig. 1. For , this equals about .

The proof of the theorem is based on the lemmas provided above but requires some complicated calculations to prove that the $\lambda$ we picked satisfies the requirements of Lemma 4.2.

Proof.

Fix a competitive ratio . We set

$$\lambda = \min\left\{ \lambda^*,\; 1 + \frac{1}{W_{-1}\left( -\frac{1}{e^{1+1/\gamma}} \right)} \right\}$$

where $\lambda^*$ is the solution of the equation and $W_{-1}$ is the $(-1)$-th branch of the Lambert W function (for $x \in [-1/e, 0)$, $W_{-1}(x)$ is the solution $w \le -1$ of the equation $w e^{w} = x$; read more in https://en.wikipedia.org/wiki/Lambert_W_function). Because $\lambda \le \lambda^*$, this satisfies the condition of Lemma 4.2, which allows us to use that lemma.

We add over the expectation of the inequality we get from Lemma 4.2 and using the facts that and form an allocation we get

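$$\mathbb{E}[LW] \ge (1-\delta)\left( \frac{c(\lambda)\,\lambda}{\gamma}\, LW^* - \frac{c(\lambda) \sum_{t \in [T]} \mathbb{E}[p_t] + n + \sum_i Reg_i}{\gamma} \right)$$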

Using the fact that the total payments of all the players are less than their total liquid welfare the above becomes

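$$\mathbb{E}[LW] \ge (1-\delta)\left( \frac{c(\lambda)\,\lambda}{\gamma}\, LW^* - \frac{n + \sum_i Reg_i}{\gamma} \right) - \frac{c(\lambda)}{\gamma}\, \mathbb{E}[LW]$$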

The theorem is proved by re-arranging the above inequality, substituting $\lambda$, and doing an asymptotic analysis of the terms. In Appendix A we include the code for the asymptotic bounds for completeness. ∎
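The re-arrangement can be explored numerically. The sketch below is not the code of Appendix A. It rests on two labeled assumptions: that $c(\lambda) = 1/\ln\frac{1}{1-\lambda}$ (the normalizing constant of $f_\lambda$), and that the cap $\lambda \le \lambda^*$ is ignored. Under these assumptions, moving the $\mathbb{E}[LW]$ term to the left gives the factor $(\gamma + c(\lambda)) / (c(\lambda)\lambda)$ in front of $LW^*$, and the Lambert W expression for $\lambda$ is the stationary point of that factor: writing $x = 1 - \lambda$, it solves $1/x + \ln x = 1 + 1/\gamma$.

```python
import math


def c(lam: float) -> float:
    """Assumed normalizing constant of the density c / (1 - mu) on [0, lam]."""
    return 1.0 / math.log(1.0 / (1.0 - lam))


def welfare_factor(gamma: float, lam: float) -> float:
    """Factor F such that E[LW] >= (1 - delta) * LW* / F - (additive terms),
    obtained by re-arranging the last displayed inequality."""
    return (gamma + c(lam)) / (c(lam) * lam)


def stationary_lambda(gamma: float, iters: int = 200) -> float:
    """Solve 1/x + ln(x) = 1 + 1/gamma for x in (0, 1) by bisection and
    return lam = 1 - x; this equals 1 + 1 / W_{-1}(-exp(-(1 + 1/gamma)))."""
    target = 1.0 + 1.0 / gamma
    lo, hi = 1e-12, 1.0          # h(x) = 1/x + ln(x) is decreasing on (0, 1)
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if 1.0 / mid + math.log(mid) > target:
            lo = mid
        else:
            hi = mid
    return 1.0 - (lo + hi) / 2.0


def grid_minimizer(gamma: float, steps: int = 20000) -> float:
    """Brute-force minimizer of welfare_factor over a grid in (0, 1)."""
    best_k = min(range(1, steps), key=lambda k: welfare_factor(gamma, k / steps))
    return best_k / steps
```

For large $\gamma$ the minimized factor behaves like $\gamma + \sqrt{2\gamma}$, in line with (3); the constrained choice $\lambda = \min\{\lambda^*, \cdot\}$ used in the proof may give a different constant for small $\gamma$.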

5 Lower Bounds for Liquid Welfare

In this section, we are going to prove lower bounds for the guarantee in liquid welfare when the players have a competitive ratio of . We first include the lower bound for repeated sequential budgeted second-price auctions of [DBLP:journals/corr/GaitondeLLLS22].

Theorem 5.1 ([DBLP:journals/corr/GaitondeLLLS22, Proposition D.1]).

In sequential budgeted second-price auctions, for any number of rounds there exists an instance with two players both of whom have competitive ratio and no-regret where the resulting liquid welfare is arbitrarily smaller than the optimal one.

Proof.

The first player has budget for a small and value every round. The second one has budget and value every round.

The optimal allocation is to give the second player all the items, achieving .

If player bids every round and player bids , then the resulting liquid welfare is and no player has regret: player gets every item for free, while player has no incentive to get any item at price . This completes the proof by taking , since . ∎

We now provide a lower bound for . More specifically, we provide an example where the liquid welfare is half the optimal one, even when players have competitive ratio and no-regret.

Theorem 5.2.

In sequential budgeted first-price auctions, for every number of rounds and every such that is an integer, there exists an instance where the players have competitive ratio , no-regret, and the resulting liquid welfare is times the optimal one.

Proof.

There are two players. The first has value every round and a total budget of and the second player has value every round and a total budget of .

The optimal allocation is to give player the items of the first rounds and for the rest of the rounds to give the items to player . This results in .

In contrast, if player bids infinitesimally above and player bids , then neither user has any regret and the resulting liquid welfare is . This proves the theorem. ∎

Finally, we provide an upper bound, which for large makes Theorem 4.3 tight (and also Theorem 7.1 that we present later). More specifically, we show that if players have competitive ratio then the resulting liquid welfare may be times less than the optimal one.

Theorem 5.3.

In sequential budgeted first-price auctions, for every number of rounds and such that is an integer, there is an instance where every player has competitive ratio at most and constant regret, and the resulting liquid welfare is almost times smaller than the optimal one.

Intuitively, the theorem proves that, if there is only one player who gets only a fraction of identical items, then her competitive ratio is and the liquid welfare is times less than the optimal.

Proof.

There are two players, neither of which is budget constrained. Player has value every round. Player has value for the first rounds and for , where is a small constant. The optimal liquid welfare is , by giving all the items to player .

The bids of the players are the following: for the first rounds, player bids , and player bids . For the rest of the rounds, player bids and player bids . Player has a total utility of and player has . This outcome yields liquid welfare which is fraction of .

We now only have to prove that the above outcome has competitive ratio at most and regret less than for every player. The best allocation for player would have been to get the first items for free and the rest of the items for a price of (note that this results in as much utility as using any constant shading multiplier). This allocation yields utility . One can prove that this yields competitive ratio and regret for player .

For player the best allocation would have been to get the second batch of items for free. The utility in that case is . This yields a competitive ratio of and regret . ∎

6 Algorithm for bounded competitive ratio

In this section, we are going to prove that there exists an algorithm that achieves competitive ratio and sublinear regret with high probability, as needed by Theorem 4.3. Our bound is going to hold for any behavior of the other players, even if it is adversarially picked.

Our algorithm will be based on the classic adversarial Bandits with Knapsacks (BwK) setting, first studied by [DBLP:conf/focs/ImmorlicaSSS19]. To achieve this we are first going to prove that the difference in using two shading multipliers that are very close results in a very small additive error. This will effectively prove that uniformly discretizing the action space entails a small additive error. Then we are going to reduce the problem of bidding in sequential budgeted first-price auctions to the classic framework of adversarial BwK with very small additive error.

We first prove that using shading multiplier instead of yields a small additive error. We are going to focus on a single arbitrary player, so we are going to drop the subscript throughout this section.

Lemma 6.1.

Fix any highest bids the other players have submitted, the player’s budget and her values , and let be the utility of the player when she is bidding according to multiplier . Then for any and , it holds that

$$\hat{U}(\lambda + \epsilon) \ge \hat{U}(\lambda) - O(T\epsilon) - 2$$

The lemma is proven by examining two cases. First, if using multiplier does not make the player run out of budget, then it only entails a slightly larger payment, at most . If by using multiplier the player runs out of budget, then her payment is almost her budget, which yields a high utility since the value of every item she gets is at least times the price she paid for it.

Proof.

We first study the outcome when the player is using multiplier $\lambda$. Let $\mathcal{T}$ be the rounds in which the player wins the auction when bidding with multiplier $\lambda$ and is not budget constrained, i.e., she bids $\lambda$ times her value and wins the item. If at some round the player wins an item while being budget constrained, then she bids and pays her entire remaining budget. This means that other than in the rounds $\mathcal{T}$ the player wins at most one more item, whose value is at most $1$, proving that

$$\hat{U}(\lambda) \le \sum_{t \in \mathcal{T}} (1-\lambda)\, v_t + 1 \tag{4}$$

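When the player bids with multiplier $\lambda + \epsilon$ then she is guaranteed to win every item of rounds $\mathcal{T}$, unless she runs out of budget, meaning she either gets utility at least $(1-\lambda-\epsilon)\sum_{t \in \mathcal{T}} v_t$ or pays at least $B - (\lambda + \epsilon)$. This means that

$$\hat{U}(\lambda + \epsilon) \ge \begin{cases}
(1 - \lambda - \epsilon) \sum_{t \in \mathcal{T}} v_t, & \text{if the player gets } \mathcal{T} \text{ without being budget constrained} \\
\left(\frac{1}{\lambda+\epsilon} - 1\right)\left(B - (\lambda+\epsilon)\right), & \text{if } (\lambda+\epsilon) \sum_{t \in \mathcal{T}} v_t \le B \text{, but does not get } \mathcal{T} \\
\left(\frac{1}{\lambda+\epsilon} - 1\right)\left(B - (\lambda+\epsilon)\right), & \text{if } (\lambda+\epsilon) \sum_{t \in \mathcal{T}} v_t > B
\end{cases} \tag{5}$$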

where the second and third cases come from the fact that every time the player wins an item, her value for it is at least $\frac{1}{\lambda+\epsilon}$ times the price she pays for it.

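For the second case of (5), because $(\lambda+\epsilon) \sum_{t \in \mathcal{T}} v_t \le B$, we have

$$\hat{U}(\lambda+\epsilon) \ge \left(\frac{1}{\lambda+\epsilon} - 1\right)\left(B - (\lambda+\epsilon)\right) \ge (1-\lambda-\epsilon) \sum_{t \in \mathcal{T}} v_t - 1 + \lambda + \epsilon \ge \hat{U}(\lambda) - \epsilon T - 1$$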

where the last inequality holds because of (4), , , and . The inequality above satisfies the lemma and is also similarly proved for the first case of (5).

For the third case of (5), we have that

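$$\begin{aligned}
\hat{U}(\lambda+\epsilon) &\ge \left(\frac{1}{\lambda+\epsilon} - 1\right) B - 1 + \lambda + \epsilon \\
&\ge \frac{1-\lambda-\epsilon}{\lambda+\epsilon}\, \lambda \sum_{t \in \mathcal{T}} v_t - 1 && \left(\lambda \sum_{t \in \mathcal{T}} v_t \le B \text{ and } \lambda + \epsilon \ge 0\right) \\
&\ge \left(1 - \lambda - \frac{\epsilon}{\lambda}\right) \sum_{t \in \mathcal{T}} v_t - 1 && \left(\frac{1-\lambda-\epsilon}{\lambda+\epsilon}\, \lambda \ge 1 - \lambda - \frac{\epsilon}{\lambda} \text{ when } \lambda > 0,\ \epsilon \ge 0\right) \\
&\ge \hat{U}(\lambda) - 1 - \frac{\epsilon}{\lambda} \sum_{t \in \mathcal{T}} v_t - 1 && \text{(using (4))} \\
&> \hat{U}(\lambda) - \frac{\epsilon}{\frac{B}{T} - \epsilon}\, T - 2 && \left((\lambda+\epsilon) \sum_{t \in \mathcal{T}} v_t > B \implies \lambda \ge \frac{B}{T} - \epsilon\right) \\
&\ge \hat{U}(\lambda) - O(T\epsilon) - 2 && (B = \Theta(T),\ \epsilon = o(1))
\end{aligned}$$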

The above inequality satisfies the lemma and completes the proof. ∎

Now we are going to prove that for any discretization of the action space, , we can use the algorithms from BwK to achieve a low competitive ratio and regret.

Lemma 6.2.

Fix any vector of highest bids that the other players have submitted and the budget and values of the player. Let with be any multipliers. Then there is an algorithm that uses only multipliers and achieves utility for which

$$\mathbb{P}\left[ U \ge \frac{\max_{k=0,\dots,K} \hat{U}(\lambda_k) - Reg}{O(\log T)} \right] \ge 1 - O(\delta T)$$

where the probability is taken only over the player’s actions and

$$Reg = O\left( K\, T^{3/4} \sqrt{\log(T/\delta)} \right)$$

Proof.

We are going to reduce the problem to the adversarial BwK problem. In this problem, there are actions, rounds, and a resource with a total budget of . The adversary picks rewards and costs for every round-action pair. The -th arm is assumed to have . On round , without observing the rewards and costs of that round, the player picks a (potentially randomized) action . The game ends after round or on round when the player depletes the resource, i.e., when . We denote the total reward of the player with and with the reward of best fixed action in retrospect.

To reduce our problem to BwK we set and . First, we note that there is a small mismatch between BwK and the repeated auction setting. A sequence of actions has the same rewards, costs, and remaining budget in both settings, up to before round when in the BwK setting the player runs out of budget. In BwK if the player picks an action that incurs a cost higher than the remaining budget the game stops. In contrast, in the repeated auction setting, if the bid of the player is higher than her remaining budget, then her bid is adjusted, which may or may not end the game by depleting her budget. However, this causes a small mismatch:

• For the difference between the reward of the algorithm in BwK and the utility of the same actions in the repeated auction setting (where the actions after the BwK algorithm runs out of budget are picked arbitrarily), it holds that .

• For the two benchmarks, and , it holds . The rewards when playing the -th arm and using multiplier are the same up to before the round when the game stops in the BwK setting. On that round and onwards, in the auction setting the player has less than budget remaining. As long as this remaining budget is used to win items without lowering the player’s bid (i.e., the player bids ), the player can gain at most additional utility, since the utility she gains is times the price she pays and she has remaining budget. Additionally, she might earn one more item on the round her budget is depleted. This means that after having less than remaining budget, she may earn up to utility.

Now we simply use the algorithm of [DBLP:conf/focs/ImmorlicaSSS19, Remark VI.2] that guarantees for every

$$\mathbb{P}\left[\mathrm{REW} \ge \frac{\mathrm{OPT}_{\mathrm{FA}} - \mathrm{Reg}'}{O(\log(T))}\right] \ge 1 - O(\delta T)$$

with

$$\mathrm{Reg}' = O\!\left(\frac{TK}{B}\right) T^{3/4} \sqrt{\log(T/\delta)}$$

as long as . The above fact yields the lemma by using , , and that . ∎

Finally, we combine the two lemmas of this section to prove that there exists an algorithm that achieves competitive ratio with high probability, even for adversarially picked values and prices.

Theorem 6.3.

Fix a player with values and budget . For any bids of the other players, there is an algorithm that for any achieves utility for which

$$\mathbb{P}\left[U \ge \frac{\max_{\lambda \in [0,1]} \hat{U}(\lambda) - \mathrm{Reg}}{O(\log(T))}\right] \ge 1 - O(\delta T)$$

where the probability is taken only over player ’s actions and

$$\mathrm{Reg} = O\!\left(T^{7/8} \sqrt{\log(T/\delta)}\right)$$
Proof.

Fix a such that and for let . Using the algorithm of Lemma 6.2 we get that with probability at least ,

$$U \ge \frac{\max_{k=0,\ldots,K} \hat{U}(\lambda_k) - \mathrm{Reg}}{O(\log(T))}$$

with

$$\mathrm{Reg} = O\!\left(K\, T^{3/4} \sqrt{\log(T/\delta)}\right) = O\!\left(T^{7/8} \sqrt{\log(T/\delta)}\right)$$

Fix a multiplier and let be such that . Using Lemma 6.1 we get that . This proves that and completes the proof of the theorem. ∎
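The discretization step of this proof can be illustrated with a short sketch. It assumes, as a reading consistent with the bound $K\,T^{3/4} = T^{7/8}$, that $K$ is roughly $T^{1/8}$ and that the grid points are $\lambda_k = k/K$; both choices and all function names are assumptions made for illustration.

```python
def grid_size(T):
    """Smallest integer K with K**8 >= T, i.e. K is about T**(1/8) (assumed choice)."""
    K = 1
    while K ** 8 < T:
        K += 1
    return K


def multiplier_grid(T):
    """Return K and the assumed grid of multipliers 0, 1/K, ..., 1."""
    K = grid_size(T)
    return K, [k / K for k in range(K + 1)]


def closest_grid_point(lam, grid):
    """Grid multiplier nearest to an arbitrary multiplier lam in [0, 1]."""
    return min(grid, key=lambda g: abs(g - lam))
```

With such a grid every multiplier in $[0,1]$ lies within $1/(2K)$ of some grid point, so competing with the best grid multiplier loses only a small amount relative to the best multiplier overall, while the number of arms stays small enough to keep the regret sublinear.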

Remark 6.4.

We note that even though Theorem 6.3 proves a high probability competitive ratio and sublinear regret for sequential budgeted first-price auctions, our proof can be easily adapted to use the guarantee of any Bandits with Knapsacks algorithm, either with high probability or in expectation.

7 Submodular Valuations

We now move to the final section of our results, where we generalize the results of Section 4 to the case where players have submodular valuations. If player receives a bundle of items then her value is , where is a submodular, non-decreasing, and non-negative function (a set function is submodular if for any and it holds that ). We use the standard notation for the marginal value of item for bundle : . The definitions of the players’ utilities and liquid welfare remain the same.

Before bounding the liquid welfare in this setting, we first define how bidding according to a multiplier works in this case. If player in round uses multiplier and has already gained items then she bids her current marginal value for item , , as long as she is not budget constrained (if she is budget constrained she bids her remaining budget). Because the marginal value of the item in every round (and therefore the bid) of every player depends on her past allocation, this setting is more complicated than the one we studied in previous sections. Most notably, it is not clear whether there exists an algorithm with bounded competitive ratio and regret, like the one we gave in Section 6. The reason is that a reduction to BwK is much harder, since the reward and consumption of a single round depend on the outcomes of the previous ones. We leave this question for future work.
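As an illustration of the bidding rule described above, the following sketch simulates a single player who shades her marginal value by a fixed multiplier in a sequence of first-price auctions. It assumes that the bid is the multiplier times the current marginal value, that ties are broken in the player's favor, and that the winner pays her bid; the function name and the example valuation are hypothetical.

```python
def simulate_multiplier_bidding(value, other_bids, lam, budget):
    """Bid lam * (marginal value) each round, capped by the remaining budget.

    value: set function on sets of round indices (assumed submodular).
    other_bids: highest competing bid in each round.
    Assumptions for illustration: ties go to the player, the winner pays her bid.
    Returns the set of rounds won, the total payment, and the utility.
    """
    won = frozenset()
    spent = 0.0
    for t, d in enumerate(other_bids):
        marginal = value(won | {t}) - value(won)
        # A budget-constrained player bids her remaining budget instead.
        bid = min(lam * marginal, budget - spent)
        if bid > 0 and bid >= d:
            won = won | {t}
            spent += bid
    return won, spent, value(won) - spent
```

For example, with the valuation `lambda S: min(len(S), 2)`, competing bids `[0.3, 0.3, 0.3]`, and multiplier `0.5`, the marginal value of the third item drops to zero once two items are won, so the player stops bidding. This dependence of the bid on past allocations is what complicates a reduction to BwK.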

In this section, we prove the following theorem, which gives a slightly weaker bound than Theorem 4.3.

Theorem 7.1.

Assume that every player has competitive ratio and regret with probability at least . Then, when players have submodular valuations, for the resulting liquid welfare LW and the optimal one it holds that

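$$\mathbb{E}[LW] \ge (1-\delta)\left(\frac{1}{(\sqrt{\gamma}+1)^2}\, LW^* - \frac{1}{\gamma+\sqrt{\gamma}}\left(\sum_i \mathrm{Reg}_i - n\right)\right)$$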
Remark 7.2.

We note that the factor in front of is slightly larger than the one in Theorem 4.3, but both are asymptotically . We give a plot of both for small in Fig. 2.

We start with a simple lemma that will help us lower bound the value of the bundle gained by player when she uses a fixed shading multiplier .

Lemma 7.3.

Fix a player , her submodular valuation , and the maximum bids of the other players . Let and such that

$$t \notin T_i \implies v_i\big(t \mid T_i \cap [t-1]\big) \le \frac{1}{\lambda}\, d_{it}$$

Then, for any set it holds that

$$v_i(T_i) \ge v_i(O_i) - \frac{1}{\lambda} \sum_{t \in O_i} d_{it}$$
Proof.

We have that

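$$\begin{aligned}
v_i(T_i) + \frac{1}{\lambda} \sum_{t \in O_i} d_{it}
&\ge v_i(T_i) + \frac{1}{\lambda} \sum_{t \in O_i \setminus T_i} d_{it} && (d_{it} \ge 0) \\
&\ge v_i(T_i) + \sum_{t \in O_i \setminus T_i} v_i\big(t \mid T_i \cap [t-1]\big) && (t \notin T_i) \\
&\ge v_i(T_i) + \sum_{t \in O_i \setminus T_i} v_i\big(t \mid T_i \cup (O_i \cap [t-1])\big) && (\text{submodularity}) \\
&= v_i(O_i \cup T_i) \ge v_i(O_i)
\end{aligned}$$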

where in the last equality, every term in the sum iteratively adds the marginal of an item that is in and not in . ∎
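The inequality of Lemma 7.3 can be checked numerically on small instances. The sketch below is not part of the proof: it uses a coverage function as one example of a monotone submodular valuation, builds the set of rounds won by always bidding the multiplier times the marginal value (assuming ties are won and no budget constraint), and compares against every subset of rounds by brute force. All names are hypothetical.

```python
from itertools import combinations


def coverage_value(covers):
    """v(S) = number of elements covered by the items in S (monotone submodular)."""
    def v(S):
        covered = set()
        for t in S:
            covered |= covers[t]
        return len(covered)
    return v


def winning_set(v, other_bids, lam):
    """Rounds won when bidding lam * marginal value, ignoring budgets (ties are won)."""
    won = set()
    for t, d in enumerate(other_bids):
        if lam * (v(won | {t}) - v(won)) >= d:
            won.add(t)
    return won


def lemma_holds(v, other_bids, lam):
    """Check v(T) >= v(O) - (1/lam) * sum_{t in O} d_t for every subset O of rounds."""
    T_i = winning_set(v, other_bids, lam)
    rounds = range(len(other_bids))
    for size in range(len(other_bids) + 1):
        for O in combinations(rounds, size):
            bound = v(O) - sum(other_bids[t] for t in O) / lam
            if v(T_i) < bound - 1e-9:  # small tolerance for floating point
                return False
    return True
```

Every round that is not in the winning set satisfies the premise of the lemma by construction, so `lemma_holds` should return `True` for any coverage instance and any multiplier in $(0,1]$.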

Next, we prove a lemma analogous to Lemma 4.1, where we lower bound the utility of a player if she used a fixed shading multiplier. The following lemma provides a weaker guarantee because it does not randomize over the multiplier picked.

Lemma 7.4.

Fix a player and the maximum bids of the other players . Then for any it holds that

$$\hat{U}_i(\lambda) \ge (1-\lambda)\left(LW_i^* - \frac{1}{\lambda} \sum_{t \in O_i} d_{it} - 1\right)$$

where is the bundle player gets in the allocation that maximizes the total liquid welfare.

Proof.

Assume that by using multiplier player would have gotten bundle if she was never budget constrained; in that case she would have had utility and it would have held . If she was budget constrained (in which case ) she would have spent at least , which yields a utility of at least , since the marginal value she gets from any item she bids is at least what she pays for it. This proves

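$$\hat{U}_i(\lambda) \ge \begin{cases} (1-\lambda)\, v_i(T_i), & \text{if } \lambda v_i(T_i) \le B_i \\ \left(\frac{1}{\lambda}-1\right)(B_i - \lambda), & \text{if } \lambda v_i(T_i) > B_i \end{cases} \;\ge\; (1-\lambda) \min\left\{ v_i(T_i),\ \frac{1}{\lambda} B_i - 1 \right\}$$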

Because the bundle satisfies the requirements of Lemma 7.3 the above becomes

$$\hat{U}_i(\lambda) \ge (1-\lambda) \min\left\{ v_i(O_i) - \frac{1}{\lambda} \sum_{t \in O_i} d_{it},\ \frac{1}{\lambda} B_i - 1 \right\}$$

The above proves what we want since , , and . ∎
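For completeness, the step from the two-case bound to the minimum in the proof above can be spelled out as follows. This is only an expansion of the algebra already used in the proof, with no new assumptions.

```latex
\begin{align*}
\left(\tfrac{1}{\lambda}-1\right)(B_i-\lambda)
  &= \tfrac{1-\lambda}{\lambda}\,(B_i-\lambda)
   = (1-\lambda)\left(\tfrac{1}{\lambda}B_i-1\right), \\
\lambda v_i(T_i) \le B_i
  &\implies (1-\lambda)\,v_i(T_i) \ge (1-\lambda)\min\left\{v_i(T_i),\ \tfrac{1}{\lambda}B_i-1\right\}, \\
\lambda v_i(T_i) > B_i
  &\implies (1-\lambda)\left(\tfrac{1}{\lambda}B_i-1\right) \ge (1-\lambda)\min\left\{v_i(T_i),\ \tfrac{1}{\lambda}B_i-1\right\}.
\end{align*}
```

In both cases the lower bound on the utility is at least $(1-\lambda)$ times the minimum of the two quantities, since a minimum never exceeds either of its arguments.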

Now we prove the analogue of Lemma 4.2, lower bounding the expected utility of player given that she has competitive ratio with high probability. The statement and the proof are almost identical to those of the aforementioned lemma, except that the guarantees here are slightly weaker.

Lemma 7.5.

Fix a player , her values , and the maximum bids of the other players . Assume that player has competitive ratio and regret with probability , i.e.,

$$U_i \ge \frac{\max_{\lambda} \hat{U}_i(\lambda) - \mathrm{Reg}_i}{\gamma} \tag{6}$$

Let be the items gets in the allocation that maximizes the total liquid welfare. Then for any , it holds that

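$$\mathbb{E}[LW_i] \ge (1-\delta)\left(\frac{1-\lambda}{\gamma}\, LW^* - \frac{1/\lambda - 1}{\gamma} \sum_{t \in O_i} d_{it} - \frac{1}{\gamma}\left(\mathrm{Reg}_i - 1\right)\right)$$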

where the expectation is taken only over the bids of player .

Proof.

Define the following two events:

$$E_1 = \{V_i \le B_i\} \quad \text{and} \quad E_2 = \{\text{(6) holds}\}$$

This makes

 E[LWi] ≥ E[LWi|Ec1]P[Ec1]+E[LWi|E1∩E2]P[E1∩E2] (law of total expectation