# Online Allocation and Display Ads Optimization with Surplus Supply

In this work, we study a scenario where a publisher seeks to maximize its total revenue across two sales channels: guaranteed contracts that promise to deliver a certain number of impressions to the advertisers, and spot demands through an Ad Exchange. On the one hand, if a guaranteed contract is not fully delivered, it incurs a penalty for the publisher. On the other hand, the publisher might be able to sell an impression at a high price in the Ad Exchange. How does a publisher maximize its total revenue as a sum of the revenue from the Ad Exchange and the loss from the under-delivery penalty? We study this problem parameterized by the supply factor $f$: a notion we introduce that, intuitively, captures the number of times a publisher can satisfy all its guaranteed contracts given its inventory supply. In this work we present a fast, simple, deterministic algorithm with the optimal competitive ratio. The algorithm and the optimal competitive ratio are a function of the supply factor, the penalty, and the distribution of the bids in the Ad Exchange. Beyond the yield optimization problem, classic online allocation problems such as online bipartite matching of [Karp-Vazirani-Vazirani '90] and its vertex-weighted variant of [Aggarwal et al. '11] can be studied in the presence of the additional supply guaranteed by the supply factor. We show that a supply factor of $f$ improves the approximation factors from $1-1/e$ to $f - fe^{-1/f}$. Our approximation factor is tight and approaches $1$ as $f \to \infty$.

## 1 Introduction

An overwhelming majority of publishers on the web monetize their service by displaying ads alongside their content. The revenue stream of such publishers typically comes from two key channels, often referred to as direct sales and indirect sales. In the direct sales channel the publisher strikes several contracts with some major advertisers. The prices of such contracts are often negotiated and decided on a per-impression basis before the serving begins. In the indirect sales channel, the ad is selected by seeking, in real time, bids in an Ad Exchange platform (AdEx for short). In this case an auction is conducted to select the winner and decide how much they pay. A comprehensive yield optimization consists of jointly optimizing the publisher's revenue across both channels. Revenue optimization in this context is important because the display ads industry represents a giant (\$50B) marketplace that is growing fast even at its current size.

#### Basic setting and preliminaries.

Unweighted edges for contractual advertisers are reasonable because these contracts are mostly based on the number of impressions delivered. In a few cases the contracts are based on the number of clicks or conversions, in which case the edges will be weighted based on the probability of click or conversion. Contracts based on impressions form such a large majority that having unweighted edges is almost without loss of generality.

The edges incident on the AdEx node, by contrast, could have an arbitrary weight depending on the highest bid from the Exchange. AdEx is modeled by the distribution of highest bids in the exchange: i.e., regardless of the query that arrives, when it is assigned to AdEx, the publisher accrues a profit that is equal to a draw from this distribution. The publisher's basic problem is to decide, on a per-query basis, whether to assign the query to a contract advertiser (and if so, to which one) or to AdEx.

The publisher's goal is to maximize its overall revenue. Publishers typically have pre-negotiated prices for each contractual advertiser. The total revenue of the publisher is the sum of three parts: (i) the revenue from AdEx (i.e., the sum of edge weights of queries assigned to AdEx), (ii) the revenue from contracts, and (iii) the revenue lost due to under-delivery, i.e., the negative of the penalty paid. Note that (ii) is a constant, and is unaffected by the allocation algorithm. Thus, while computing the competitive ratio, we compute it w.r.t. the sum of (i) and (iii).

#### Supply factor.

An important concept that we introduce is what we call the supply factor of an instance, which captures the (potentially fractional) number of times that a publisher will be able to satisfy their contractual advertisers' demands. Formally, let a complete matching be defined as one where all contractual advertisers' demands are fully satisfied, i.e., all the offline vertices are fully saturated. The supply factor of an instance is defined as the largest positive real number $f$ s.t. there exists an offline solution with $f$ complete matchings. If there are many such solutions, we pick one to be the supply-factor-determining-offline-solution. In this work, we assume that the number of arriving online queries is exactly $fN$, where $N=\sum_a n_a$ is the total number of impressions demanded and $n_a$ is the demand of advertiser $a$. The algorithm designer is aware of $f$, the $n_a$'s, and the highest bid distribution from AdEx.

There are several important practical aspects of the yield optimization problem that previous work does not capture and that we aim to address:

1. The first aspect is that publishers typically have more inventory than they are able to sell via the direct sales channel (contracts), and indeed that is the main reason that most publishers sell through the indirect sales channel of AdEx as well. Most previous works on joint yield optimization either address the objectives of the two channels separately (bi-criteria objective), or study them in the absence of supply factor/penalties/AdEx bid distribution. Studying the yield optimization problem with a single unified objective (AdEx revenue - penalty) in the presence of supply factor and AdEx bid distribution surfaces the nature of the optimal tradeoff between the supply factor and how on-track a contract is towards hitting its goals. Clearly, when a contract is lagging behind, we should allocate a query to AdEx only when the AdEx bid is high enough. But how does this "high enough" vary as we increase/decrease the publisher's supply, captured by the supply factor $f$? This is explicitly answered in our work. Similarly, the dependence on the penalty and the AdEx distribution is also explicitly revealed.

2. Even in classic online allocation problems like the online bipartite matching of Karp et al. [17] and the online vertex-weighted bipartite matching of Aggarwal et al. [1], it is interesting to inquire what happens to the competitive ratio when there is a supply factor $f$.

3. Prior works mostly studied the problem in a fully stochastic model or a fully adversarial model. In reality, while user browsing patterns might have significant variations across days, in response to events, state of mind, etc. (and hence an adversarial arrival of queries is reasonable), advertiser bidding/spending patterns are far more predictable because advertisers have daily and hourly spending budgets. We incorporate this in our model by having a distribution over the highest bids from AdEx, even though query arrival is adversarial. The inclusion of the AdEx bid distribution not only represents reality better, but also leads to a crisp algorithm that sheds ample light on the role of the distribution in the joint yield optimization problem.

### 1.1 Our Results

One of our contributions, as just discussed, is to present an economical model that crisply captures the reality of display ads monetization. Our main result is a fast simple deterministic algorithm that obtains the optimal competitive ratio as values grow large. The algorithm is as follows: let be the points in the support of the distribution of highest bid in AdEx (highest bid is often referred to as reward for short). As a pre-processing step, compute thresholds as a function of (we define and ), and the AdEx bid distribution. Let the satisfaction-ratio of a contractual advertiser be the ratio of the number of impressions delivered to the contract thus far, to the number of impressions requested by the contract. For each arriving query, the algorithm picks the contract with the lowest satisfaction ratio, call it . Find such that . Assign the query to AdEx if the highest bid in the exchange exceeds . And if not, assign the query to the contract with the lowest satisfaction ratio. Algorithm 1 summarizes this. We highlight a few important aspects of this algorithm.

1. Once the pre-processing step is over (which is a one-time computation), the algorithm is very simple to implement in real time while serving queries, even in a distributed fashion. Each relevant advertiser for the current query (i.e., each offline node with a matching edge to the current online node) just responds with its satisfaction ratio. From there on, the algorithm simply computes the smallest satisfaction ratio, does a simple lookup over the pre-computed thresholds, and decides the allocation based on how big the AdEx bid is.

2. The algorithm is quite intuitive. As the satisfaction ratio of the most needy contract gets lower, the AdEx bid has to be correspondingly higher to merit snatching this impression from the contract. That this tradeoff takes such a simple symmetric form, where one looks for the mirror image of the index to which the satisfaction ratio gets mapped, is quite surprising. Importantly, the supply factor and penalty are used only in the pre-processing step to compute the thresholds, and don't appear at serving time at all.

3. The algorithm need not fully know the highest bid from AdEx. It just needs to be able to compare the highest bid against a reserve price. Further, extending the algorithm to deal with multiple Ad Exchanges is simple: broadcast the same reserve to all exchanges, and pick the highest bidding exchange that clears the reserve (we just need to know which exchange is the highest bidder, and whether they clear the reserve, not the exact value of the bid). If no exchange clears the reserve, allocate to the advertiser with the lowest satisfaction ratio.

4. While the algorithm is intuitive in hindsight, it is far from obvious that it obtains the optimal competitive ratio.
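The serving-time step described above can be sketched as a small lookup. This is a minimal illustration, not the paper's Algorithm 1: the names `serve_query`, `cut_points`, and `reserves` are hypothetical, and the table mapping each satisfaction-ratio bucket to a reserve price is assumed to have been produced by the pre-processing step.

```python
from bisect import bisect_right
import math


def serve_query(ratios, eligible, cut_points, reserves, adex_bid):
    """Decide where one arriving query goes.

    ratios:     dict contract id -> satisfaction ratio (delivered / demanded)
    eligible:   contract ids that have an edge to this query
    cut_points: sorted satisfaction-ratio thresholds (assumed precomputed)
    reserves:   one reserve price per bucket, len(cut_points) + 1 entries;
                a hypothetical table standing in for the paper's thresholds
    adex_bid:   highest bid in the exchange for this query
    """
    if not eligible:
        return "adex"
    # The neediest contract is the eligible one with the lowest ratio.
    neediest = min(eligible, key=lambda a: ratios[a])
    # Map its satisfaction ratio to a bucket, then look up the reserve.
    bucket = bisect_right(cut_points, ratios[neediest])
    if adex_bid > reserves[bucket]:
        return "adex"
    return neediest


# Binary-style example: below ratio 0.5 the contract always wins;
# at or above 0.5 any positive bid goes to the exchange.
cut_points = [0.5]
reserves = [math.inf, 0.0]
ratios = {"a": 0.2, "b": 0.7}
```

Only a comparison against the reserve is needed, so the exact bid value never has to be known, which matches the third remark in the list.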

As mentioned earlier, apart from analyzing the joint yield optimization problem, we also show the benefits that a supply factor can bring in classic online algorithmic problems. For the seminal online bipartite matching problem of Karp et al. [17], we show that the same RANKING algorithm of [17] with a supply factor of $f$ yields a tight competitive ratio of $f - fe^{-1/f}$, which increases with $f$, and approaches $1$ as $f \to \infty$. Likewise, for the vertex-weighted generalization of this problem studied by Aggarwal et al. [1], the same generalized vertex-weighted RANKING algorithm of [1] (a.k.a. Perturbed Greedy) yields a competitive ratio of $f - fe^{-1/f}$. We defer these analyses to Appendix A, and include them primarily to show how the supply factor influences the competitive ratio of some well-known problems.
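To get a feel for how the ratio $f - fe^{-1/f}$ stated in the abstract behaves, the following short sketch evaluates it for a few supply factors. The function name `ranking_ratio` is only an illustrative label.

```python
import math


def ranking_ratio(f: float) -> float:
    """Competitive ratio f - f * exp(-1/f) for supply factor f >= 1."""
    return f * (1.0 - math.exp(-1.0 / f))


for f in (1, 2, 5, 100):
    print(f, round(ranking_ratio(f), 4))
```

At $f = 1$ the expression reduces to the classic $1 - 1/e$ bound, and it climbs towards $1$ as the surplus supply grows.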

#### Overview of analysis techniques.

We use a max-min approach to analyze the performance of our algorithm. Given the thresholds, our algorithm is completely defined. Therefore the adversary can compute the instance that minimizes the objective of our algorithm given the thresholds, and the algorithm can optimize the thresholds knowing the best response of the adversary. The minimization problem of the adversary can be captured by a succinct LP, and we reason about the structure of the optimal solution to this LP. This sets up the maximization problem of the algorithm, which turns out to be a non-linear, non-convex optimization problem. Nevertheless, we develop a simple poly-time dynamic programming algorithm that obtains the optimal solution (the optimal thresholds) up to a small additive error. For tightness, we construct an example which is a modified version of the "upper triangular graph" of Karp et al. [17], and show that no algorithm can obtain an objective value larger than the objective value achieved as the optimal solution to the max-min problem described above. This establishes that the class of threshold-based algorithms is optimal. As a warm-up for the general distribution section, we begin with the special case of distributions with support size two. In this case, the maximization problem of the algorithm in the max-min problem above is a single-variable concave maximization problem, and already yields clear insights on how the optimal threshold computed by the algorithm depends on the supply factor $f$ and the penalty $c$.

#### Bid-to-budget ratio vs supply factor.

On the surface, it might appear that the notion of supply factor is just like the "large budgets" assumption, where it is assumed that the budget (in our case the number of impressions $n_a$ demanded by each advertiser $a$) is much larger than the bid (i.e., the value of an edge). However, these two concepts are quite different. In particular, even with the large budgets assumption, without a supply factor larger than $1$, any algorithm will be very conservative and will essentially always allocate to the contracts (assuming the penalty is larger than the AdEx reward). The supply factor is a property of the entire setup of the publisher: the demands of the contracts and the nature of traffic (the set of online nodes arriving, i.e., users/queries that visit their website).

#### Extensions.

A natural question to ask is what happens if the publishers have different under-delivery penalties for different advertisers. To show a proof of concept extension of our results to this setting, we consider the simpler setting of our problem where the AdEx rewards are equal to for every query (i.e., a deterministic distribution ), and show how the technique and results extend to handle different ’s. We conjecture that the same approach extends to the general AdEx distributions as well, and leave it as an open problem. In a different direction, in this work, we focus on a deterministic algorithm because of its many virtues when deployed in a production system: the ability to replay and hence debug easily, ex-post fairness, etc. While we show that it achieves the optimal competitive ratio (i.e., even randomized algorithms cannot improve further), this necessarily requires values being large (for a deterministic algorithm to be optimal, large budgets are necessary even for the much simpler -matching problem [16]). In practice, however, large budget assumption essentially always holds, as advertiser contractual demands are much larger than the edge weight of . Nevertheless, one could ask whether one could use randomized algorithms to remove the dependence of ’s being large. Again, as a proof of concept extension of our results, we show that for the special case where AdEx rewards are uniformly equal to for every query, randomized algorithms can get the same competitive ratio as deterministic ones for any value of , not just large ones.

#### Comparison to closely related work.

In terms of works that consider joint optimization across the two channels, the closest to ours is that of Dvorák and Henzinger [11], who also consider the objective of maximizing revenue across two channels: the fundamental differences are (a) the absence of a supply factor in their work, (b) they model adversarially both the arrivals and the AdEx bids, and (c) they achieve separate approximation factors for each channel as opposed to our approximating the joint unified objective. Equally close is the work of Balseiro et al. [4], who study the same problem, with the differences being (a) the absence of a supply factor, (b) they model stochastically both the arrivals and AdEx bids.

Another closely related work is by Devanur and Jain [8] in which they consider the adwords problem with concave returns in the objective: while their model can capture penalties, it does not handle the AdEx reward distribution. Our model takes the reward distribution and penalties into account simultaneously. Additionally, the supply factor notion is absent in [8].

Karp et al. [17] wrote the seminal paper on online bipartite matching, and Aggarwal et al. [1] consider the generalization of it to vertex-weighted settings. Mehta et al. [19] introduced the influential Adwords problem and gave a $1-1/e$ approximation for it, with a recent breakthrough result by Huang et al. [15] showing how to beat a $1/2$ approximation for this problem even with small budgets. Devanur et al. [9] give a randomized primal-dual algorithm that gives a unified analysis of [17, 1, 19]. We refer the reader to [18] for a survey on the online matching literature.

## 2 Optimal Algorithm for Binary Ad Exchange Distribution

In this section, we consider a special case where the highest AdEx bid (referred to as AdEx reward often) of each query is drawn from a distribution of support size two. We consider the general distribution in Section 3. We first provide an algorithm, and later show that this algorithm is optimal. Formally we consider the following setting:

###### Definition 2.1 (Binary reward distribution with parameters q and r).

We consider the setting where the AdEx reward is $0$ with probability $q$, and is $r$ with probability $1-q$.

Without loss of generality we assume that the two support points are $0$ and $r$, rather than two arbitrary positive values. This is because, in the latter case, we can subtract the smaller value from each support point, and also from the penalty, and it yields the distribution in the format we need. Also, without loss of generality we assume that the support point $r$ is such that $r < c$, where $c$ is the penalty. Note that if $r \ge c$, then clearly whenever the AdEx reward is $r$ (i.e., non-zero), an optimal algorithm can always allocate the query to AdEx, so there is nothing to study here.

### 2.1 An Optimal Algorithm

Now we propose a simple greedy algorithm (we basically specialize Algorithm 1 for binary distributions), analyze its performance, and establish its optimality. The analysis can be extended to more general distributions of AdEx rewards, but with more involved techniques. We do this in Section 3.

Algorithm 2 is our algorithm for binary reward distributions. Here, we compute an appropriate threshold $s$ as a pre-processing step. At the arrival of a query, let $a$ be the available advertiser (i.e., an advertiser with an edge to the incoming vertex) with the lowest satisfaction ratio. The algorithm allocates the impression to AdEx if and only if the satisfaction ratio of $a$ is at least $s$ and the query has a non-zero AdEx reward of $r$. I.e., the algorithm first greedily allocates queries to available advertisers that are furthest from being satisfied, no matter how large the AdEx weight of arriving queries. However, when the advertisers are satisfied to some extent (i.e., their satisfaction ratio is at least $s$), satisfying contracts becomes less of a priority, and AdEx is preferred when it offers a non-zero reward.
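The rule for binary rewards can be simulated on a fixed query sequence. The sketch below is an illustration under stated assumptions rather than the paper's Algorithm 2: the function name `run_binary_threshold` is hypothetical, the penalty is taken to be `c` per undelivered impression, and contracts that are already fully delivered are assumed to stop competing for queries.

```python
def run_binary_threshold(demands, queries, s, c):
    """Simulate the binary-reward threshold rule on a fixed query sequence.

    demands: dict advertiser -> number of impressions demanded
    queries: list of (eligible advertisers, AdEx reward) in arrival order
    s:       satisfaction-ratio threshold in [0, 1]
    c:       under-delivery penalty per missing impression (assumption)
    Returns AdEx revenue minus the total under-delivery penalty.
    """
    delivered = {a: 0 for a in demands}
    adex_revenue = 0.0
    for eligible, reward in queries:
        # Assumption: contracts that are already full no longer compete.
        open_contracts = [a for a in eligible if delivered[a] < demands[a]]
        if not open_contracts:
            adex_revenue += reward
            continue
        # Neediest contract: lowest satisfaction ratio among eligible ones.
        a = min(open_contracts, key=lambda k: delivered[k] / demands[k])
        ratio = delivered[a] / demands[a]
        if ratio >= s and reward > 0:
            adex_revenue += reward  # AdEx wins only above the threshold
        else:
            delivered[a] += 1
    shortfall = sum(demands[a] - delivered[a] for a in demands)
    return adex_revenue - c * shortfall


demands = {"a": 2}
queries = [(["a"], 5), (["a"], 0), (["a"], 5), (["a"], 5)]
```

On this toy instance a threshold of `0.5` protects the contract early on, while a threshold of `0` hands every positive-reward query to the exchange and pays a penalty for the missing impression.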

Before proving the competitive ratio, we set some notation that we use in our analysis throughout the paper. These concepts are also demonstrated in Figure 1.

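$$\beta_j=\sum_{a\in\cup_{\ell\ge j}A_\ell}\frac{1}{t}\,n_a=\frac{1}{t}\Big(N-\sum_{\ell<j}\alpha_\ell\Big), \tag{1}$$

where $\alpha_\ell=\sum_{a\in A_\ell}n_a$ is the total demand of the advertisers in $A_\ell$. Thus

$$\alpha_j=t(\beta_j-\beta_{j+1}). \tag{2}$$
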
###### Lemma 2.2.

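Based on the definitions of $\alpha_j$ and $\beta_j$ above, for any $j$,

$$\sum_{\ell\le j}f\alpha_\ell\le\begin{cases}\displaystyle\sum_{\ell\le st}\beta_\ell+\sum_{st<\ell\le j}\frac{1}{q}\beta_\ell, & \text{if } j\ge st;\\[2mm] \displaystyle\sum_{\ell\le j}\beta_\ell, & \text{if } j<st.\end{cases} \tag{3}$$
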
###### Proof.

The RHS represents the set of queries for which, when they arrived, the most deserving (lowest satisfaction ratio) eligible contractual advertiser was of type at most $j$. To see this, note that when the lowest satisfaction ratio is below $s$, every arriving query is allocated to the contract (hence the second line of the RHS). When the lowest satisfaction ratio is at least $s$, only a $q$ fraction of the considered queries are allocated to the contract; thus the considered queries = allocated queries / $q$, which gives the $1/q$ factor in the first line of the RHS.

The LHS represents the number of queries that were allocated to an advertiser of type at most $j$ in the supply-factor-determining-offline-solution.

It is immediate that LHS is at most RHS because every query counted in the LHS will count for RHS when it arrives. ∎

Notice that the total expected reward of the algorithm can be divided into the following parts:

• The baseline penalty is $c$ per demanded impression if no impression is allocated to contracts, so the total such penalty is $Nc$. The total AdEx reward that may be obtained by assigning everything to AdEx is $Nf(1-q)r$. The next points capture the change to the objective when we move away from this extreme solution of giving everything to AdEx.

• Any impression that is allocated to an advertiser with satisfaction ratio below $s$ (which is the set of impressions counted in $\beta_j$ for $j\le st$), with probability $1-q$, loses a reward of $r$ from AdEx. Thus in expectation each such impression adds $c-(1-q)r$ to the objective;

• Each time an impression is allocated to an advertiser with satisfaction ratio at least $s$ (which is the set of impressions counted in $\beta_j$ for $j>st$), the impression always has reward $0$ for AdEx, and adds $c$ to the objective.

Therefore the expected total reward ALG of the algorithm is

$$\mathrm{ALG}=Nf(1-q)r-Nc+\sum_{j\le st}\big(c-(1-q)r\big)\beta_j+\sum_{j>st}c\,\beta_j. \tag{4}$$

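We can add (2) and (3) as constraints to get a linear program that lower bounds the reward of the algorithm as follows:

$$\begin{aligned}\text{minimize}\quad & Nf(1-q)r-Nc+\sum_{j\le st}\big(c-(1-q)r\big)\beta_j+\sum_{j>st}c\,\beta_j\\ \text{s.t.}\quad & ft(\beta_1-\beta_{j+1})\le\sum_{\ell\le j}\beta_\ell, && \forall j<st;\\ & ft(\beta_1-\beta_{j+1})\le\sum_{\ell\le st}\beta_\ell+\sum_{st<\ell\le j}\frac{1}{q}\beta_\ell, && \forall j\ge st.\end{aligned} \tag{5}$$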

The constraints are explained immediately by expanding and doing a telescopic summation using (2) and (3). We set because in all but pathological instances we have that every advertiser ends up with at least fraction of their demand satisfied (note that is large, just that ). Even in the pathological instances where this is not true, i.e., only holds, by setting , there is just a additive error we have introduced. Namely, when proving optimality of our algorithm, we will just have proved it up to additive terms. From now on, we take .

###### Claim 2.3.

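By setting the $\beta_j$ values as follows we get an optimal solution to the linear program (5):

$$\beta^*_j=\begin{cases}\dfrac{N}{t}\Big(1-\dfrac{1}{tf}\Big)^{j-1}, & \text{if } j\le st+1;\\[3mm] \dfrac{N}{t}\Big(1-\dfrac{1}{tf}\Big)^{st}\Big(1-\dfrac{1}{qtf}\Big)^{j-st-1}, & \text{if } j>st+1.\end{cases}$$
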
###### Proof.

We prove that there exists an optimal solution of the LP (5) such that all non-trivial constraints are tight. Then the claim follows by observing that as defined satisfy this tightness property. To show that the leads to tightness, start with a simple assignment of and iteratively find the solution to the system of linear equations formed by replacing the inequalities with equalities. This is straightforward.

To show why tightness is wlog, for any being an optimal solution to the above LP, let be the smallest index such that the corresponding constraint (of either type) is not tight. If , we can see that there exists such that , is a new feasible solution with the objective staying the same, while the th constraint becomes tight. Otherwise, if , then , is a new feasible solution with the objective decrease by , while the th constraint can become tight. By repeating this process we can construct a solution, with at least the same objective, in which all non-trivial constraints becoming tight. ∎
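The tightness property used in the proof can be checked numerically. The sketch below assumes that, after substituting (2) into (3) and telescoping, the constraints read $ft(\beta_1-\beta_{j+1})\le\sum_{\ell\le j}\beta_\ell$ for $j<st$ and $ft(\beta_1-\beta_{j+1})\le\sum_{\ell\le st}\beta_\ell+\sum_{st<\ell\le j}\beta_\ell/q$ for $j\ge st$, and that $st$ is an integer. The helper names are hypothetical.

```python
def beta_star(N, t, f, q, s):
    """beta*_j for j = 1..t+1, following the closed form of Claim 2.3."""
    k = round(s * t)                 # number of levels below the threshold
    rho = 1 - 1 / (t * f)            # decay while every query goes to contracts
    sigma = 1 - 1 / (q * t * f)      # decay once only reward-0 queries do
    return [
        (N / t) * rho ** (j - 1) if j <= k + 1
        else (N / t) * rho ** k * sigma ** (j - k - 1)
        for j in range(1, t + 2)
    ]


def constraint_gaps(N, t, f, q, s):
    """RHS minus LHS of each telescoped constraint, evaluated at beta*."""
    k = round(s * t)
    beta = beta_star(N, t, f, q, s)  # beta[j - 1] holds beta_j
    gaps, rhs = [], 0.0
    for j in range(1, t + 1):
        rhs += beta[j - 1] if j <= k else beta[j - 1] / q
        lhs = f * t * (beta[0] - beta[j])
        gaps.append(rhs - lhs)
    return gaps


gaps = constraint_gaps(N=1000, t=50, f=2.0, q=0.5, s=0.4)
print(max(abs(g) for g in gaps))
```

All gaps come out at floating-point noise level, which is what tight constraints should produce.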

We can use the above observations on structure of ALG to compute the appropriate threshold in the following claim:

###### Claim 2.4.

The objective of the algorithm is maximized when the threshold is set to $s^*=\max\{0,\,1+qf\ln(1-r/c)\}$.

###### Proof.

Using Claim 2.3 we have,

$$\mathrm{ALG}\ \ge\ Nf(1-q)r-Nc+\sum_{j\le st}\big(c-(1-q)r\big)\beta^*_j+\sum_{j>st}c\,\beta^*_j.$$

For large $t$, the right hand side is approximated by the expression $\mathrm{RHS}(x)$ below, where $x=s/f$. Then to maximize the reward, we consider the following expression on the right hand side:

$$\mathrm{RHS}(x)=Nc(f-1)+(1-q)(r-c)fNe^{-x}-qfNce^{\frac{1-q}{q}x-\frac{1}{qf}}. \tag{6}$$

To optimize the threshold we take the derivative with respect to $x$, and compare the obtained value with the boundary values for $x$. We have,

$$\mathrm{RHS}'(x)=(1-q)(c-r)fNe^{-x}-(1-q)fNce^{\frac{1-q}{q}x-\frac{1}{qf}}.$$

The unique zero of $\mathrm{RHS}'$ is $x^*=\frac{1}{f}+q\ln(1-r/c)$. Since the allowed range of $x$ is $[0,1/f]$, we need to consider the following two cases. When $x^*\ge 0$, $\mathrm{RHS}$ is maximized at $x^*$. Then we set $s=fx^*=1+qf\ln(1-r/c)$ in the algorithm, with

$$\mathrm{ALG}\ge\mathrm{RHS}(x^*)=Ncf\Big((1-1/f)-(1-r/c)^{1-q}e^{-1/f}\Big).$$

When the solution found is not in range, note that $x^*$ can only be less than $0$ and never greater than $1/f$. This is because $r<c$, and thus clearly $q\ln(1-r/c)<0$. Given the concavity of the objective, this means that in such a case optimality is achieved at $x=0$. Then we set $s=0$ in the algorithm, with

$$\mathrm{ALG}\ge\mathrm{RHS}(0)=Ncf\Big((1-1/f)-(1-q)(1-r/c)-qe^{-\frac{1}{qf}}\Big).$$
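As a sanity check on the optimization of (6), the following sketch compares a brute-force grid search over $[0,1/f]$ with the stationary point obtained by setting the derivative to zero, $x^*=\frac1f+q\ln(1-r/c)$, clipped at $0$. The function names are illustrative, and $r<c$ is assumed throughout.

```python
import math


def rhs(x, N, f, q, r, c):
    """The single-variable objective from equation (6)."""
    return (N * c * (f - 1)
            + (1 - q) * (r - c) * f * N * math.exp(-x)
            - q * f * N * c * math.exp((1 - q) / q * x - 1 / (q * f)))


def best_x(f, q, r, c):
    """Zero of the derivative of (6), clipped to the lower boundary 0."""
    return max(0.0, 1 / f + q * math.log(1 - r / c))


def grid_argmax(N, f, q, r, c, steps=5000):
    """Brute-force maximizer of rhs over an even grid on [0, 1/f]."""
    xs = [i / (steps * f) for i in range(steps + 1)]
    return max(xs, key=lambda x: rhs(x, N, f, q, r, c))


print(best_x(2, 0.5, 5, 10), grid_argmax(1, 2, 0.5, 5, 10))
```

With a large supply factor and a reward close to the penalty (for instance `f=5, q=0.9, r=9, c=10`), the stationary point falls below zero and both methods return the boundary value `0`.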

#### Useful insights.

Interesting insights already flow out of this binary support distribution case. It shows that the optimal threshold $s$ that we set is an affine function of the supply factor $f$. The higher the supply factor, the lower the threshold we set (note that the coefficient of $f$ in $s$, namely $q\ln(1-r/c)$, is negative). Also, the dependence on the penalty $c$ and the AdEx reward $r$ is quite non-trivial and intriguing. The binary support is often a good first-order approximation of reality when we bucket bids into "high" and "low" types.

### 2.2 Optimality of Algorithm 2

We now prove the optimality of the algorithm in the previous section by showing an example for which no algorithm can perform better. Consider a binary distribution with parameters $q$ and $r$ as defined earlier. We use a modification of the "upper triangular graph" instance of [17] as follows:

###### Example 2.5.

Suppose that there are $m$ advertisers, and each advertiser demands $N/m$ impressions. There are $fN$ queries arriving in $m$ groups, with $fN/m$ queries per group, and the queries in group $i$ all have an edge to the same advertisers, determined as follows: consider a random permutation $\pi$ of the advertisers; then the queries in group $i$ are available to advertisers $\pi(j)$ with $j\ge i$.

At a high-level, in this instance, all advertisers are available to the first group of queries arriving. Then with each group one random advertiser is removed from the set of available advertisers to the group. We next argue that Algorithm 2 is optimal for this instance by showing that any online algorithm will not lead to a better reward.
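The structure of this instance can be generated in a few lines. This is a minimal sketch with hypothetical names (`upper_triangular_instance`, `queries_per_group`); it only builds the adjacency lists and ignores AdEx rewards.

```python
import random


def upper_triangular_instance(m, queries_per_group, seed=None):
    """Build the group-wise adjacency of the upper triangular instance.

    Returns the random permutation of advertisers and, for each group,
    a list with one adjacency list (available advertisers) per query.
    """
    rng = random.Random(seed)
    perm = list(range(m))
    rng.shuffle(perm)
    groups = []
    for i in range(m):
        available = perm[i:]  # one more random advertiser dropped per group
        groups.append([list(available) for _ in range(queries_per_group)])
    return perm, groups


perm, groups = upper_triangular_instance(m=4, queries_per_group=2, seed=0)
for i, group in enumerate(groups, start=1):
    print(i, sorted(group[0]))
```

Each group sees exactly one advertiser fewer than the previous one, and the last group is adjacent to a single advertiser.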

###### Theorem 2.6.

For Example 2.5, the competitive ratio of any randomized online algorithm matches the competitive ratio obtained by Algorithm 2 up to a small additive factor .

###### Proof.

First we have the following observation about deterministic algorithms. By Yao’s minimax principle, we only need to consider the performance of any deterministic algorithm over the randomness of the instance.

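Fix any deterministic algorithm. Let $q_{ij1}$ be the fraction of queries in group $i$ with AdEx reward $0$ that is allocated to advertiser $\pi(j)$, and $q_{ij2}$ be the fraction of queries in group $i$ with AdEx reward $r$ that is allocated to advertiser $\pi(j)$. Then for $u=1$ and $2$,

$$\mathbb{E}_\pi[q_{iju}]\le\begin{cases}\dfrac{1}{m-i+1}, & \text{if } j\ge i;\\[2mm] 0, & \text{if } j<i.\end{cases}$$

Also later we use $\mathbb{E}_\pi[q_{iju}]=\mathbb{E}_\pi[q_{imu}]$ for $j\ge i$. This is because for each group $i$, there are $m-i+1$ random advertisers that have an edge connected to impressions in group $i$. If $j\ge i$, then $\pi(j)$ is a uniformly random advertiser among this group of advertisers. Thus $\mathbb{E}_\pi[q_{iju}]\le\frac{1}{m-i+1}$, and for any $j\ge i$ it holds that $\mathbb{E}_\pi[q_{iju}]=\mathbb{E}_\pi[q_{imu}]$. If $j<i$, then advertiser $\pi(j)$ does not have an edge to impressions in group $i$. Then the expected reward we get from the algorithm, using the same reasoning as in the previous section, is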

$$-Nc+fN(1-q)r+\sum_{i=1}^{m}\sum_{j=i}^{m}\Big(\frac{fN}{m}q\,\mathbb{E}_\pi[q_{ij1}]\,c+\frac{fN}{m}(1-q)\,\mathbb{E}_\pi[q_{ij2}]\,(c-r)\Big).$$

Here the first term and the second term are the total reward from not allocating anything to the contract advertisers, while the third term is the total reward gain from the allocation of the algorithm: there are in expectation $\frac{fN}{m}q$ queries with AdEx reward $0$ (or $\frac{fN}{m}(1-q)$ with reward $r$) from group $i$, and a $q_{ij1}$ (or $q_{ij2}$) fraction of them is allocated to advertiser $\pi(j)$, with each impression contributing a reward gain of $c$ (or $c-r$) compared to being allocated to AdEx.

As we discussed, $\mathbb{E}_\pi[q_{iju}]=\mathbb{E}_\pi[q_{imu}]$ for any $j\ge i$. Hence we can simplify the overall expectation:

$$-cN+fN(1-q)r+\sum_{i=1}^{m}(m-i+1)\Big(\frac{fN}{m}q\,\mathbb{E}_\pi[q_{im1}]\,c+\frac{fN}{m}(1-q)\,\mathbb{E}_\pi[q_{im2}]\,(c-r)\Big).$$

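Then the reward of the algorithm is upper bounded by the solution of the following linear program, where the variables $y_{iu}$ represent the expected values $\mathbb{E}_\pi[q_{imu}]$.

$$\begin{aligned}\text{maximize}\quad & -cN+fN(1-q)r+\frac{fN}{m}\sum_{i=1}^{m}(m-i+1)\big(q\,y_{i1}\,c+(1-q)\,y_{i2}\,(c-r)\big)\\ \text{s.t.}\quad & \sum_{i=1}^{m}\Big(\frac{fN}{m}q\,y_{i1}+\frac{fN}{m}(1-q)\,y_{i2}\Big)\le\frac{N}{m};\\ & 0\le y_{i1},y_{i2}\le\frac{1}{m-i+1},\quad\forall i,\ 1\le i\le m.\end{aligned} \tag{7}$$

Here the left hand side of the first constraint is the total expected number of impressions allocated to advertiser $\pi(m)$, which is at most its demand $N/m$.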

Next, we show a structural property of any optimal solution to this LP that captures a threshold-based behavior, which can be related to the algorithm we presented in the previous section:

###### Lemma 2.7.

For an optimal solution $y$ to the above LP, there exist thresholds $z_1\ge z_2$ such that, for $u=1,2$, $y_{iu}=\frac{1}{m-i+1}$ for $i\le z_u$, and $y_{iu}=0$ for $i>z_u$.

###### Proof.

First, we show that in any optimal solution and a threshold , such that for , and for . Then a similar claim follows for . To show such a threshold behavior holds for values in any optimal solution , where , we equivalently argue that there cannot be such that , for . Let us assume by contradiction that such exists. Then setting , for small enough leads to a new feasible solution since all constraints are still feasible. Furthermore, in the objective function has coefficient , which is the coefficient of . Thus after perturbing this way we get feasible solution with a larger objective value. This contradicts the assumption of being optimal.

Next, we show that , i.e. the thresholds are monotone. For any optimal solution , if , then for , , while . Then setting , for small enough leads to a new feasible solution since all constraints are still feasible. Furthermore, the increase of the objective due to is which is the decrease of the objective due to . Thus after perturbing this way we get a new feasible solution with a larger objective value. This contradicts the assumption of being optimal. ∎

From the above two lemmas, we know that the optimal strategy for Example 2.5 has the following form: for queries in group $i\le z_2$, all impressions are allocated uniformly to all available advertisers; for queries in group $z_2<i\le z_1$, only queries with AdEx reward $0$ are allocated uniformly to all available advertisers; for queries in group $i>z_1$, no impression is allocated to a contract.

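By setting the $y$ values as determined by Lemma 2.7, we can simplify the objective function of linear program (7) with thresholds $z_1$ and $z_2$, and bound the reward ALG obtained by an online algorithm as follows. The objective can be written as

$$-cN+(1-q)fNr+\frac{fN}{m}\Big(\sum_{i=1}^{z_2}\big(qc+(1-q)(c-r)\big)+\sum_{i=z_2+1}^{z_1}qc\Big)=-cN+(1-q)fNr+\frac{fN}{m}\big(z_1qc+z_2(1-q)(c-r)\big).$$

Then we get,

$$\begin{aligned}\mathrm{ALG}\le\max_{z_1,z_2}\quad & -cN+(1-q)fNr+\frac{fN}{m}\big(z_1qc+z_2(1-q)(c-r)\big)\\ \text{s.t.}\quad & \sum_{i=1}^{z_2}f\cdot\frac{1}{m-i+1}+\sum_{i=z_2+1}^{z_1}f\cdot\frac{1}{m-i+1}\,q=1.\end{aligned} \tag{8}$$

When $m$ is large enough, the constraint can be replaced by

$$f\ln\frac{m}{m-z_2}+fq\ln\frac{m-z_2}{m-z_1}=1.$$

Let $x=\ln\frac{m}{m-z_2}$, then $x\in[0,\frac{1}{f}]$. Then we can express $z_1$ and $z_2$ by $x$ as follows: $z_2=m(1-e^{-x})$, and $z_1=m\big(1-e^{\frac{1-q}{q}x-\frac{1}{fq}}\big)$. Applying these to (8) we have

$$\begin{aligned}\mathrm{ALG}&\le\max_{x\in[0,\frac{1}{f}]}\ -cN+(1-q)fNr+fN\big(1-e^{\frac{1-q}{q}x-\frac{1}{fq}}\big)qc+fN(1-e^{-x})(1-q)(c-r)\\ &=\max_{x\in[0,\frac{1}{f}]}\ Nc(f-1)+(1-q)(r-c)fNe^{-x}-qfNce^{\frac{1-q}{q}x-\frac{1}{qf}}.\end{aligned}$$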

Notice that the optimization problem here is identical to the optimization problem (6) that we described in the analysis of Algorithm 2. Thus the upper bound of the performance of any online algorithm for this instance matches the lower bound of the performance of Algorithm 2 for any underlying graph. As the optimal offline allocation has the same expected reward for any instance (see Theorem D.1 for a more detailed discussion), we prove the optimality of Algorithm 2.
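The change of variables in this proof is easy to verify numerically. The sketch below takes the large-$m$ form of the constraint, $f\ln\frac{m}{m-z_2}+fq\ln\frac{m-z_2}{m-z_1}=1$, and the expressions for $z_1$ and $z_2$ in terms of $x$, and checks that they are compatible. The function names are hypothetical.

```python
import math


def z_from_x(x, m, f, q):
    """Thresholds (z1, z2) expressed through x = ln(m / (m - z2))."""
    z2 = m * (1 - math.exp(-x))
    z1 = m * (1 - math.exp((1 - q) / q * x - 1 / (f * q)))
    return z1, z2


def constraint_value(z1, z2, m, f, q):
    """Left hand side of the large-m capacity constraint; should equal 1."""
    return (f * math.log(m / (m - z2))
            + f * q * math.log((m - z2) / (m - z1)))


m, f, q = 1000, 2.0, 0.5
for x in (0.0, 0.1, 0.25, 0.5):
    z1, z2 = z_from_x(x, m, f, q)
    print(x, round(z1, 2), round(z2, 2), constraint_value(z1, z2, m, f, q))
```

For every $x\in[0,1/f]$ the constraint evaluates to $1$ and $z_1\ge z_2$ holds, with equality at $x=1/f$, where reward-$0$ and reward-$r$ queries stop being allocated to contracts at the same group.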

#### Going from Section 2 to Section 3.

In Section 3, we use a max-min approach similar to that of Section 2. However, the max-min problem of the algorithm is no longer a simple single-variable concave maximization problem. It is a multi-variate, non-linear, and non-convex optimization problem. While we cannot solve it exactly in general, we show a dynamic program that solves it to near optimality with a small additive error. Also, establishing tightness was simpler in Section 2 because we only had to compare the upper bound from the hard example to the single-variable expression and show that the two expressions coincide. In Section 3, by contrast, we establish that the non-linear mathematical programs obtained in the maximization problem of the algorithm and in the hard example are identical. The non-trivial roles that the supply factor $f$, the AdEx distribution, and the penalty play in determining the optimal thresholds are the core contribution of our work.

## 3 Optimal Algorithm for a General Ad Exchange Distribution

In this section we consider a general AdEx reward distribution. More formally, we have a constant penalty $c$, each query has an AdEx reward drawn from a discrete distribution with a fixed support size $d$ (footnote 4: the assumption of a fixed support can be relaxed using a standard discretization approach, at a small cost in the competitive ratio that depends on the discretization), and the supply factor is $f$. We propose a threshold-based algorithm in which a set of thresholds are chosen based on an optimization problem that takes into account. We then show that this algorithm is optimal. We consider the same instance used in Section 2.2, and show that the optimal solutions on this instance for the two optimization problems are the same when the number of advertisers is sufficiently large. Finally, we show that the binary distribution is the worst-case distribution for any class of algorithm with a fixed mean . This allows us to obtain a competitive ratio, that depends on using our results in Section 2.

### 3.1 Optimal Algorithm for General AdEx Distribution

In this section, we provide a threshold-based algorithm , and in future sections we discuss the computational aspects and prove tightness. First, let us formalize the notation:

###### Definition 3.1 (AdEx distribution with parameters $((r_i, q_i)_{i\in[d]})$).

We consider an AdEx distribution with support size $d$, rewards $r_1, \dots, r_d$, where the probability that the reward is is . Also we set .

In other words, for , with probability , we have , ; . Without loss of generality, we assume . Otherwise, we can shift the rewards and the penalty by , since is the smallest reward from any allocation. We also assume , since otherwise, when a query with AdEx reward at least arrives, an optimal strategy always allocates the impression to AdEx , and hence we can disregard such queries.

Our algorithm is presented in Algorithm 1 (see Section 1). For any query that arrives, if is the advertiser with the lowest satisfaction ratio, and , then the impression is allocated to if and only if its AdEx reward . Here we define for completeness.
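The threshold rule can be sketched in Python as follows. This is a hedged illustration, not the paper's Algorithm 1: the exact conditions are not recoverable from the text above, so the sketch assumes sorted thresholds $s_1 \le \dots \le s_d$ (with $s_0 = 0$), sorted rewards $r_1 \le \dots \le r_d$, and the rule that an advertiser whose satisfaction ratio lies in $(s_{u-1}, s_u]$ receives the impression exactly when the AdEx reward is at most $r_{d+1-u}$. That rule is inferred from the conditional expectation $\mathbb{E}_D[r \mid r \le r_{d+1-u}]$ in the reward expression later in this section; the function and argument names are hypothetical.

```python
from bisect import bisect_left


def allocate(ratio, adex_reward, thresholds, rewards):
    """Decide where one impression goes under the assumed threshold rule.

    ratio       -- lowest satisfaction ratio among the eligible advertisers
    adex_reward -- AdEx reward offered for this query
    thresholds  -- assumed sorted list [s_1, ..., s_d]
    rewards     -- assumed sorted list [r_1, ..., r_d]
    Returns "contract" or "adex".
    """
    d = len(rewards)
    # u is the band index with s_{u-1} < ratio <= s_u (1-indexed, s_0 = 0).
    u = bisect_left(thresholds, ratio) + 1
    if u > d:
        # Above the last threshold: assumed to always go to the exchange.
        return "adex"
    cutoff = rewards[d - u]  # r_{d+1-u} in the paper's 1-indexed notation
    return "contract" if adex_reward <= cutoff else "adex"


if __name__ == "__main__":
    s = [0.3, 0.6, 0.9]   # hypothetical thresholds
    r = [0.0, 1.0, 2.0]   # hypothetical reward support
    print(allocate(0.1, 2.0, s, r))   # low ratio: contract is favoured
    print(allocate(0.5, 2.0, s, r))   # higher ratio: high bids go to AdEx
```

Under this reading, a poorly satisfied advertiser is served even against high AdEx bids, while a well-satisfied one is served only when the AdEx bid is low.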

We use the same setup as in the analysis of the algorithm in Section 2. Recall that we discretize the algorithm into $t$ steps. An advertiser has type if at the end of the algorithm, . We defined to be the total demand of advertisers in the set of all advertisers of type , and to be the expected total number of impressions that get allocated to an advertiser with by the algorithm at the time the query arrives. We can relate the values of and using reasoning similar to that of Lemma 2.2. Formally,

###### Lemma 3.2.

Consider an AdEx distribution with parameters , where penalty , , and let be as defined above. We have,

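$$\sum_{\ell\le j} f\alpha_\ell \le \begin{cases} \dfrac{1}{q_d}\displaystyle\sum_{0<\ell\le j}\beta_\ell, & \text{if } j\le s_1 t;\\[2ex] \dfrac{1}{q_d}\displaystyle\sum_{0<\ell\le s_1 t}\beta_\ell + \dfrac{1}{q_{d-1}}\displaystyle\sum_{s_1 t<\ell\le j}\beta_\ell, & \text{if } s_1 t < \cdots \end{cases}$$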

The proof is omitted, since it is a straightforward extension of Lemma 2.2 that was used for the binary distribution.

A case-by-case analysis similar to (4) allows us to write an expression for the total expected reward by considering the following parts:

• The baseline penalty: if no impression is allocated to contracts, the total penalty is $cN$.

• The total AdEx reward that may be obtained is $\sum_{u=1}^{d} fN(q_u - q_{u-1})\,r_u$.

• Any impression that is allocated to an advertiser with satisfaction ratio in , in expectation gets a reward of . Thus in expectation each query has reward added to the total penalty.

Therefore the expected total reward ALG of the algorithm is

$$\textsc{ALG} = -cN + \sum_{u=1}^{d} fN(q_u - q_{u-1})\,r_u + \sum_{u=1}^{d}\;\sum_{j=s_{u-1}t+1}^{s_u t} N\beta_j\bigl(c - \mathbb{E}_D[r \mid r\le r_{d+1-u}]\bigr).$$

We can add (1), (2) and Lemma 3.2 as constraints of a linear program that lower bounds the reward of the algorithm as follows:

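$$\begin{aligned} \text{minimize}\quad & \textsc{ALG} \\ \text{s.t.}\quad & ft\beta_1 - ft\beta_{j+1} \le \sum_{\ell\le j}\frac{1}{q_d}\beta_\ell, && \forall j\le s_1 t;\\ & ft\beta_1 - ft\beta_{j+1} \le \sum_{\ell\le s_1 t}\frac{1}{q_d}\beta_\ell + \sum_{s_1 t<\ell\le j}\frac{1}{q_{d-1}}\beta_\ell, && \forall s_1 t < \cdots \end{aligned} \tag{10}$$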

Next, using similar arguments as in Claim 2.3 we argue that by solving a system of linear equations formed by the LP constraints, we can obtain optimal solutions. For this consider the following values:

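$$\beta^*_j = \begin{cases} \dfrac{N}{t}\left(1 - \dfrac{1/q_d}{tf}\right)^{j-1}, & \text{if } j\le s_1 t+1;\\[2ex] \dfrac{N}{t}\left(1 - \dfrac{1/q_d}{tf}\right)^{s_1 t - s_0 t}\left(1 - \dfrac{1/q_{d-1}}{tf}\right)^{j - s_1 t - 1}, & \text{if } s_1 t+1 < \cdots \end{cases}$$
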
###### Claim 3.3.

The values $\beta^*_j$ defined above form an optimal solution to LP (10).
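To see where the geometric form of these values comes from, note that making two consecutive constraints of LP (10) tight and subtracting them gives $ft(\beta_j - \beta_{j+1}) = \beta_j / q$, i.e. $\beta_{j+1} = \beta_j\bigl(1 - \frac{1}{q\,tf}\bigr)$, with $q = q_d$ for $j \le s_1 t$ and $q = q_{d-1}$ afterwards. The Python sketch below builds the values by this recurrence and checks that the two constraint families shown in LP (10) hold with equality. It covers only the first two regimes; the function names and all numeric parameters are illustrative assumptions.

```python
def beta_star(N, t, f, q_d, q_dm1, s1t, j_max):
    """Build beta*_1 .. beta*_{j_max+1} (1-indexed list) by the tight-constraint recurrence."""
    beta = [0.0] * (j_max + 2)
    beta[1] = N / t
    for j in range(1, j_max + 1):
        q = q_d if j <= s1t else q_dm1  # regime switches after index s1*t
        beta[j + 1] = beta[j] * (1.0 - 1.0 / (q * t * f))
    return beta


def constraint_gap(beta, t, f, q_d, q_dm1, s1t, j):
    """Left side minus right side of the j-th constraint of LP (10)."""
    lhs = f * t * beta[1] - f * t * beta[j + 1]
    low = sum(beta[l] for l in range(1, min(j, s1t) + 1)) / q_d
    high = sum(beta[l] for l in range(s1t + 1, j + 1)) / q_dm1
    return lhs - (low + high)


if __name__ == "__main__":
    # Hypothetical parameters: t = 200 steps, first threshold at s1*t = 80.
    params = dict(t=200, f=1.5, q_d=0.9, q_dm1=0.6, s1t=80)
    beta = beta_star(N=1.0, j_max=150, **params)
    worst = max(abs(constraint_gap(beta, j=j, **params)) for j in range(1, 151))
    print("largest constraint gap:", worst)
```

A gap of zero (up to floating-point error) for every `j` indicates that all listed constraints are tight at these values, which is the property the claim relies on.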

The argument is similar to the proof of Claim 2.3, and a sketch is provided in Appendix C. Next, similarly to Section 2, the performance of the algorithm is lower bounded by the following formula based on LP (10):

$$\textsc{ALG}(s_1,\dots,s_d) \ge -cN + \sum_{u=1}^{d} fN(q_u - q_{u-1})\,r_u + \sum_{u=1}^{d}\;\sum_{j=s_{u-1}t+1}^{s_u t} \beta^{*}\cdots$$