 # Non-zero-sum Stackelberg Budget Allocation Game for Computational Advertising

Computational advertising has been studied to design efficient marketing strategies that maximize the number of acquired customers. In an increasingly competitive market, however, a market leader (a leader) requires the acquisition of new customers as well as the retention of her loyal customers because there often exists a competitor (a follower) who tries to attract customers away from the market leader. In this paper, we formalize a new model called the Stackelberg budget allocation game with a bipartite influence model by extending a budget allocation problem over a bipartite graph to a Stackelberg game. To find a strong Stackelberg equilibrium, a standard solution concept of the Stackelberg game, we propose two algorithms: an approximation algorithm with provable guarantees and an efficient heuristic algorithm. In addition, for a special case where customers are disjoint, we propose an exact algorithm based on linear programming. Our experiments using real-world datasets demonstrate that our algorithms outperform a baseline algorithm even when the follower is a powerful competitor.


## 1 Introduction

An aim of computational advertising is to find the best advertisement that can help build customer loyalty. More specifically, the purpose of advertisers is to devise an optimum allocation of budgets to media, such as newspapers, radio stations, TV, and websites, in order to maximize the number of activated customers. Recently, Alon et al. proposed a model to deal with a simple case of the problem, called a bipartite influence model. In this study, we extend the model by integrating a game-theoretic framework, namely the non-zero-sum Stackelberg game. Let us explain the model more precisely below.

In the bipartite influence model, we consider a bipartite graph where one side is a set of media, the other is a set of customers, and each edge is associated with a probability. Intuitively, each edge between a medium and a customer indicates that the customer is influenced by the medium with some given probability that depends on the budget allocated to the medium. We aim to allocate budgets on media so that the expected number of activated customers is maximized. The problem can be formulated as a combinatorial optimization problem. Constant-factor approximation algorithms for the problem have been developed in a framework of submodularity [1, 12, 13].

In this paper, we extend the above-mentioned model to deal with a duopoly in which a market leader has occupied the market of a certain product for a long time and a competitor tries to break into the market. The competitor tries to grab a share of the market by aggressively marketing its product. On the other hand, the market leader wants to gain customers and retain her loyal customers simultaneously. This implies that the leader’s gain does not necessarily result in the competitor’s loss. In order to capture the dynamics of this market, we exploit a Stackelberg game framework to model the interactions between the market leader and the competitor. The Stackelberg game is a two-player two-period game, in which one player (a leader) can commit to an action before the other player (a follower) plays an action. A standard solution concept of this game is the strong Stackelberg equilibrium, which is an optimal solution maximizing the leader’s utility under the constraint that the follower plays a best response to the leader’s action (i.e., an action that maximizes the follower’s utility).

The Stackelberg game is well suited to modeling our problem setting because the leader wants to increase the number of activated customers and, at the same time, prevent the outflow of her customers, which is achieved by finding a strong Stackelberg equilibrium. In a strong Stackelberg equilibrium, the leader plays a mixed strategy and the follower plays a pure strategy, where a pure strategy and a mixed strategy correspond to a budget allocation and a probability distribution over the pure strategies, respectively.

In this paper, we propose a new model called the Stackelberg budget allocation game with a bipartite influence model, which is an extension of the budget allocation problem presented by Alon et al. The difficulties of our game lie in the leader’s utility function. Our game is a non-zero-sum game, and the utility function is a submodular (nonlinear) function even when the follower’s action is fixed. It is hard to construct an approximation algorithm for the following reasons: (i) the cumbersome constraint that the follower responds optimally and (ii) the leader’s utility may change nonlinearly with the follower’s strategy. Thus, existing techniques for submodular functions cannot be directly applied to our problem. Furthermore, the leader’s utility function is not necessarily monotone, that is, the utility does not always increase with the amount of budget allocated. As a result, the leader must also consider pure strategies that do not exhaust her budget, which increases the number of pure strategies. Designing an efficient algorithm is therefore an arduous task.

In this paper, we propose three efficient algorithms:

• We design an approximation algorithm with a theoretical guarantee. The key idea is to create a zero-sum game close to the original non-zero-sum game, and to find an approximate minimax strategy of the zero-sum game with the aid of submodularity.

• We give an efficient heuristic algorithm that repeatedly finds a leader’s pure strategy greedily and picks uniformly from these pure strategies. The running time is polynomial in the leader’s budget. This heuristic can handle situations in which the leader should not spend her whole budget because of the non-monotonicity of the utility function. We also evaluate its performance by numerical experiments.

• If the customers are disjoint, we prove that a strong Stackelberg equilibrium can be found efficiently, even when the leader has exponentially many pure strategies, by using multiple linear programming (LP for short) formulations. The key point in the disjoint case is that we can aggregate a leader’s mixed strategy into a fractional budget allocation. At the same time, we can recover a mixed strategy in a compact representation without loss of the leader’s utility. This enables us to save the memory needed to store a mixed strategy and to reduce the size of the LP instances.

The rest of the paper is organized as follows. We describe related work in Section 2 and define notation in Section 3. We formalize our model and analyze its mathematical properties in Section 4. We then provide an approximation algorithm and a heuristic algorithm in Section 5, and an exact algorithm for disjoint customers in Section 6. In Section 7, we empirically evaluate the performance of our algorithms, and we conclude the study in Section 8.

## 2 Related work

Our problem setting can be viewed as a non-monotone non-zero-sum Stackelberg game with submodular functions. Vanek et al. modeled a non-zero-sum Stackelberg game with submodular functions where the defender (the leader) cares about minimizing the loss of her utility. In our game, the leader maximizes her utility while accounting for her loss caused by the follower’s action. Thus, the goal of the leader is different. Moreover, direct application of their technique to find a Stackelberg equilibrium does not seem to work well in our setting. Recently, Wilder et al. extended a bipartite influence model to a zero-sum Stackelberg game, which is closely related to our problem setting. They proved that the problem is APX-hard, while it admits an FPTAS for some special cases.

In combinatorial optimization and machine learning, approximation algorithms for maximizing submodular functions under certain constraints have been extensively studied. Our problem can be viewed as a submodular maximization under a best-response constraint, which is more cumbersome than typical constraints in the submodular maximization literature (e.g., cardinality constraint and knapsack constraint).

The budget allocation problem with the bipartite influence model has been extended in [11, 12, 7, 16]. In particular, some formulations have incorporated the view of multi-agent systems. Maehara et al. extended budget allocation in the bipartite influence model to a strategic form game, called the budget allocation game with a bipartite influence model. Hatano et al. extended the budget allocation problem to a problem with two kinds of participants: advertisers and a match maker. In that problem, there are multiple advertisers who cooperatively maximize the influence on customers and a single match maker who allocates slots of media to advertisers.

## 3 Preliminary

Let be the set of non-negative integers. For an integer , let be the set . In this section, we describe the budget allocation problem with a bipartite influence model and the Stackelberg game.

### 3.1 Bipartite influence model

Let $G = (U, V; E)$ be a bipartite graph, where $(U, V)$ is a bipartition and $E$ is a set of edges. Each vertex $u \in U$ corresponds to a medium and each vertex $v \in V$ corresponds to a customer. Let and be the sizes of $U$ and $V$, respectively. Each edge $(u, v) \in E$ is associated with a probability $p_{uv}$, which means that allocating a budget to medium $u$ activates customer $v$ with probability $p_{uv}$. We assume that the activation events are independent. The advertiser has a total available budget of , and each medium has a slot to which the advertiser can allocate her budget. The goal is to find the optimal budget allocation with that maximizes the expected number of activated customers. Throughout this paper, we identify a set of media with its characteristic vector $z$. The probability that a customer $v$ is activated by the advertiser’s trial from the media in $z$ is given by

$$P_v(z) = 1 - \prod_{u \in N_v :\, z_u = 1} (1 - p_{uv}), \tag{1}$$

where $N_v$ is the set of the neighbors of $v$. The expected number of customers activated through the budget allocation $z$ is given by $\sum_{v \in V} P_v(z)$. The objective of the budget allocation problem with a bipartite influence model is to find an allocation $z$ that maximizes this expected number subject to the budget constraint.

This objective function is shown to be monotone and submodular. Here, a function $h$ is submodular if it satisfies $h(x) + h(y) \ge h(x \vee y) + h(x \wedge y)$ for all $x, y$ in its domain, where $x \vee y$ and $x \wedge y$ denote the vectors of component-wise maxima and minima, respectively, i.e., $(x \vee y)_u = \max\{x_u, y_u\}$ and $(x \wedge y)_u = \min\{x_u, y_u\}$. A function $h$ is monotone if it satisfies $h(x) \le h(y)$ for all $x \le y$, i.e., $x_u \le y_u$ for all $u$. Thus the budget allocation problem is a special case of the submodular maximization problem with a cardinality constraint, and it is well known that the problem is NP-hard and has a $(1 - 1/e)$-approximation algorithm.
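The quantities in this subsection can be sketched in a few lines of Python. The following is a minimal illustration, not the authors' code: it evaluates the activation probability (1), the expected number of activated customers, and the classical greedy rule that is commonly used to obtain a $(1 - 1/e)$ guarantee for monotone submodular maximization under a cardinality constraint. The dictionary layout `p[(u, v)]`, the function names, and the toy instance are assumptions made for illustration.

```python
from math import prod

def activation_prob(v, chosen, p):
    """Equation (1): probability that customer v is activated by the media in `chosen`."""
    return 1.0 - prod(1.0 - p[(u, v)] for u in chosen if (u, v) in p)

def expected_activated(chosen, customers, p):
    """Expected number of activated customers for a set of chosen media."""
    return sum(activation_prob(v, chosen, p) for v in customers)

def greedy_allocation(media, customers, p, budget):
    """Greedy rule: repeatedly add the medium with the largest resulting objective value."""
    chosen = set()
    for _ in range(budget):
        rest = [u for u in media if u not in chosen]
        if not rest:
            break
        best = max(rest, key=lambda u: expected_activated(chosen | {u}, customers, p))
        chosen.add(best)
    return chosen

# Hypothetical toy instance: two media, three customers.
media, customers = ["a", "b"], [1, 2, 3]
p = {("a", 1): 0.5, ("a", 2): 0.5, ("b", 2): 0.5, ("b", 3): 0.2}

print(greedy_allocation(media, customers, p, budget=1))
print(expected_activated({"a", "b"}, customers, p))
```

On this toy instance, medium `a` alone yields an expected value of 1.0 and medium `b` alone yields 0.7, so a budget of one goes to `a`.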

### 3.2 Stackelberg game

The Stackelberg game is played between two players: the leader and the follower. Both players can play a mixed strategy, but it is sufficient to consider that the follower plays a pure strategy. Let and be the sets of pure strategies of the leader and the follower, respectively. We denote the set of mixed strategies of the leader by , each of which is a probability distribution on pure strategies in . We define and as utility functions of the leader and the follower, respectively. We define an instance of the game as . Let be the set of best responses of the follower against . In this game, the leader will commit to play a mixed strategy before the follower plays his strategy. Thus the leader needs to find a mixed strategy maximizing under the constraint that the follower would choose a best-response pure strategy . More precisely, the goal of this game is to find a leader’s mixed strategy that forms a strong Stackelberg equilibrium, as indicated below.

###### Definition 3.1.

A strong Stackelberg equilibrium of is a pair that satisfies

$$f(x^*, y^*) \ge f(x, y)$$

for all , , and .

## 4 Stackelberg budget allocation game

In this section, we extend the budget allocation problem with a bipartite influence model to a Stackelberg game. For any set and , we denote by a characteristic vector in such that for and for (). For a mixed strategy , the support of is the set of pure strategies that is played with non-zero probability under , i.e., .

### 4.1 Definition

Let be a bipartite graph consisting of a set of media, a set of customers, and a set of edges between them. For each , we denote by a probability that a customer is activated through a medium by a leader’s or a follower’s trial, and by a probability that a medium activates a customer who has been already activated by the leader. Two probabilities intuitively mean that is a basic activation probability in the market, and is a probability that the follower recaptures customers who were activated by the leader. Let and be the budgets of the leader and the follower, respectively. An instance of the Stackelberg budget allocation game with a bipartite influence model is parameterized by .

We construct a Stackelberg game from an instance as follows. A pure strategy for the leader (respectively the follower) is a set of at most media (respectively media). and of the game are defined by setting (or equivalently ) and .

Let be any customer. Let and be a leader’s and a follower’s pure strategies, respectively. The probability that the leader activates is given by the equation (1). If is not activated by the leader, then the activation probability for the follower is given by the same basic probability, that is . If is activated by the leader, then the probability that the follower attracts a customer away from the leader is .

###### Example 4.1.

We explain the difference between and . Consider a game instance illustrated in Figure 1. There are three media and four customers . For an arbitrary edge , and . The budget for the leader and the follower is and , respectively. At first, the leader plays a mixed strategy that chooses w.p. . Suppose the situation in Figure 1(a) where is chosen and , , and are activated w.p. 0.8, who are shown in gray. After that, the follower plays a pure strategy that chooses . In Figure 1(b), the customer switches to the follower w.p.  if is activated by the leader, and otherwise is activated w.p. . Thus, the probability to activate is . In addition, is activated w.p.  because is non-activated.

The utility functions and are given as follows. The expected number of customers that are activated by the leader but do not shift to the follower is given by

$$f(z, y) = \sum_{v \in V} P_v(z)\,\bigl(1 - P_{F,v}(y)\bigr).$$

The expected number of activated customers for the follower is given by

$$g(z, y) = \sum_{v \in V} \bigl(P_v(z)\,P_{F,v}(y) + (1 - P_v(z))\,P_v(y)\bigr).$$

When the leader uses a mixed strategy , we abuse the notation and write . Here, for a probability distribution over a domain , means that we sample from the distribution . Similarly, we write , and .

The goal of the Stackelberg budget allocation game with a bipartite influence model is to find a leader’s mixed strategy in a strong Stackelberg equilibrium of the game constructed above. We define a function $f_{\mathrm{BR}}$ that receives a mixed strategy $x$ and returns the leader’s utility when the follower takes a best response, i.e.,

$$f_{\mathrm{BR}}(x) = \max\{\, f(x, y) \mid y \in \mathrm{BR}(x) \,\}.$$

We aim to solve

$$\max\ f_{\mathrm{BR}}(x) \quad \text{s.t.} \quad x \in D_L. \tag{2}$$

Note that is an optimal solution to (2) if and only if is a strong Stackelberg equilibrium, where is a best response against . We can evaluate for in time by evaluating times. To obtain the value of , we evaluate for , which takes time.
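To make the evaluation of $f_{\mathrm{BR}}$ concrete, the following is a minimal brute-force sketch that enumerates all follower pure strategies (sets of at most $k_F$ media), keeps those maximizing $g$, and returns the largest $f$ among them. Several pieces are assumptions for illustration: the formula for $P_{F,v}$ is taken to have the same product form as equation (1) with the recapture probabilities, a mixed strategy is represented as a dictionary from sets of media to probabilities, and the function names and the toy instance are hypothetical.

```python
from itertools import combinations
from math import prod

def act(v, chosen, prob):
    """Product-form activation probability of customer v, as in equation (1)."""
    return 1.0 - prod(1.0 - prob[(u, v)] for u in chosen if (u, v) in prob)

def f_pure(z, y, V, p, pF):
    """Leader's utility f(z, y) for pure strategies z and y."""
    return sum(act(v, z, p) * (1.0 - act(v, y, pF)) for v in V)

def g_pure(z, y, V, p, pF):
    """Follower's utility g(z, y) for pure strategies z and y."""
    return sum(act(v, z, p) * act(v, y, pF) + (1.0 - act(v, z, p)) * act(v, y, p)
               for v in V)

def expect(util, x, y, V, p, pF):
    """Expected utility under a leader mixed strategy x = {frozenset of media: probability}."""
    return sum(w * util(z, y, V, p, pF) for z, w in x.items())

def pure_strategies(U, k):
    """All sets of at most k media."""
    return [frozenset(c) for r in range(k + 1) for c in combinations(U, r)]

def f_br(x, U, V, p, pF, kF, tol=1e-12):
    """f_BR(x): leader's utility when the follower best-responds, ties broken for the leader."""
    ys = pure_strategies(U, kF)
    g_vals = [expect(g_pure, x, y, V, p, pF) for y in ys]
    g_max = max(g_vals)
    best = [y for y, gv in zip(ys, g_vals) if gv >= g_max - tol]
    return max(expect(f_pure, x, y, V, p, pF) for y in best)

# Hypothetical toy instance: each customer has one medium, follower budget 1.
U, V = ["a", "b"], [1, 2]
p = {("a", 1): 0.8, ("b", 2): 0.8}
pF = {("a", 1): 0.5, ("b", 2): 0.5}
x_pure = {frozenset({"a"}): 1.0}
x_mixed = {frozenset({"a"}): 0.5, frozenset({"b"}): 0.5}

print(f_br(x_pure, U, V, p, pF, kF=1))
print(f_br(x_mixed, U, V, p, pF, kF=1))
```

The enumeration is exponential in the follower's budget, which is why this sketch is only practical when that budget is a small constant.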

We now see that the leader’s optimal strategy may not be a pure strategy.

###### Example 4.2.

Consider an instance with , , , and . For a set of media, denotes a unit vector in with . The instance including activation probabilities is depicted in Figure 2(a). Consider an instance depicted in Figure 2(a) with . In this case, an optimal strategy for the leader is and where the best response of the follower is . However, and .

We next see that the leader may not use the whole budget in her optimal strategy.

###### Example 4.3.

Consider an instance depicted in Figure 2(b) where , , , , and . Also, and . Consider an instance depicted in Figure 2(b) with and . Then while .

There also exists an instance without a pure Stackelberg equilibrium, as the following example shows.

###### Example 4.4.

Consider an instance with and where and . Let and let and for all . We have for all . There exists a mixed Stackelberg equilibrium , where and . However, there is no pure Stackelberg equilibrium in this instance.

### 4.2 Hardness

In this subsection, we show hardness results. We observe that finding a leader’s optimal pure strategy when $k_F = 0$ is equivalent to the optimal budget allocation problem. Thus, it is NP-hard to find the leader’s mixed strategy that forms a Stackelberg equilibrium even if $k_F = 0$, since our problem (2) with $k_F = 0$ always has a leader’s optimal strategy that is pure. It is also known that the approximation ratio $1 - 1/e$ is best possible for the maximum coverage problem under the assumption that P $\neq$ NP. Hence, our problem (2) is also inapproximable within a ratio better than $1 - 1/e$ unless P $=$ NP.

Moreover, when $k_F$ is not a fixed constant, it is even NP-hard to evaluate $f_{\mathrm{BR}}(x)$ for a given $x$. The proof is by reduction from the maximum coverage problem, which is known to be NP-hard. Given an integer and a collection of sets , the maximum coverage problem is to find a subset of at most sets such that the number of covered elements is maximized.

###### Theorem 4.5.

It is NP-hard to compute $f_{\mathrm{BR}}(x)$ for a given mixed strategy $x$.

###### Proof.

Let be any instance of the maximum coverage problem. We consider an instance of Stackelberg budget allocation problem: , , , and for each . Let and .

We fix a leader’s mixed strategy as . We denote be an all-one-vector. To evaluate , it is necessary to know since .

We show that there exists that covers at least elements if and only if for some . By construction, we have . In addition, if is covered by some set with , and otherwise. Thus, .

Then, if that covers at least elements, then attains , where is defined by if and otherwise. Conversely, if satisfies , then covers at least elements. This completes the proof. ∎

## 5 Algorithms for non-disjoint customers

In this section, for the non-disjoint customers setting, which has no assumption about the graph structure, we propose two types of algorithms for (2). Let be a game instance created from an instance and let be its data size. Due to the hardness result (Theorem 4.5), in this section we assume that is a constant.

### 5.1 Approximation algorithm via zero-sum game

We shall approximately solve a game by solving a zero-sum game close to . The core idea of constructing such a zero-sum game is to keep the same set of best-responses of the follower for any strategy of the leader as . Let us focus on the structure of and , which include the term and its negation, respectively. We define a utility function for the leader as

$$\begin{aligned} \Phi(x, y) &= -g(x, y) + \sum_{v \in V} P_v(x) \\ &= \sum_{v \in V} \bigl[ P_v(x)\,(1 - P_{F,v}(y)) - P_v(y)\,(1 - P_v(x)) \bigr]. \end{aligned}$$

Note that and we can compute in polynomial time since is polynomially bounded. Let be a zero-sum game .
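The two expressions for $\Phi$ above, and the fact that the follower's best responses are unchanged, can be checked numerically. The sketch below is an illustration under assumptions: $P_{F,v}$ is taken to have the same product form as equation (1), and the instance, the function names, and the restriction to pure leader strategies are hypothetical choices. Because $\sum_{v} P_v(x)$ does not depend on $y$, a follower strategy minimizes $\Phi$ exactly when it maximizes $g$.

```python
from itertools import combinations
from math import prod

def act(v, chosen, prob):
    """Product-form activation probability of customer v, as in equation (1)."""
    return 1.0 - prod(1.0 - prob[(u, v)] for u in chosen if (u, v) in prob)

def g(z, y, V, p, pF):
    """Follower's utility g(z, y)."""
    return sum(act(v, z, p) * act(v, y, pF) + (1.0 - act(v, z, p)) * act(v, y, p)
               for v in V)

def phi(z, y, V, p, pF):
    """First form: Phi(z, y) = -g(z, y) + sum_v P_v(z)."""
    return -g(z, y, V, p, pF) + sum(act(v, z, p) for v in V)

def phi_expanded(z, y, V, p, pF):
    """Second form: sum_v [P_v(z)(1 - P_{F,v}(y)) - P_v(y)(1 - P_v(z))]."""
    return sum(act(v, z, p) * (1.0 - act(v, y, pF))
               - act(v, y, p) * (1.0 - act(v, z, p)) for v in V)

# Hypothetical toy instance with overlapping customers.
U, V = ["a", "b"], [1, 2, 3]
p = {("a", 1): 0.6, ("a", 2): 0.3, ("b", 2): 0.5, ("b", 3): 0.4}
pF = {("a", 1): 0.2, ("a", 2): 0.1, ("b", 2): 0.3, ("b", 3): 0.1}
subsets = [frozenset(c) for r in range(len(U) + 1) for c in combinations(U, r)]

# Largest disagreement between the two forms of Phi over all pairs (z, y).
max_gap = max(abs(phi(z, y, V, p, pF) - phi_expanded(z, y, V, p, pF))
              for z in subsets for y in subsets)

# For every leader strategy z, a minimizer of Phi attains the maximum of g.
same_best = all(
    abs(g(z, min(subsets, key=lambda y: phi(z, y, V, p, pF)), V, p, pF)
        - max(g(z, y, V, p, pF) for y in subsets)) < 1e-12
    for z in subsets
)
print(max_gap, same_best)
```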

For reals and , we call an algorithm -approximation for (resp. ) if it provides a strategy profile such that and (resp. ). Such is called an -approximate solution.

###### Lemma 5.1.

Let be an -approximate solution of a zero-sum game , and let be a strong Stackelberg equilibrium of the original game . Let and . Then is an -approximate solution for the game .

###### Proof.

We remark that can be rewritten by as . Let be the minimax strategy of . We have

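$$\begin{aligned} f(x', y') &= \Phi(x', y') + \epsilon_1 \ge \alpha\,\Phi(\tilde{x}, \tilde{y}) - \epsilon + \epsilon_1 \\ &\ge \alpha\,\Phi(x^*, y^*) - \epsilon + \epsilon_1 = \alpha f(x^*, y^*) - (\alpha\epsilon_2 - \epsilon_1 + \epsilon), \end{aligned}$$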

where the second inequality holds by . ∎

To find an approximate strong Stackelberg equilibrium, it suffices to find an approximate minimax strategy for . Note that since is an exponential size, finding a minimax strategy for is still intractable.

To this end, we use the multiplicative weight update method . Based on this method, Kawase and Sumita  showed that, for any nonnegative monotone submodular functions and , there exists an algorithm that finds a -approximate solution of in polynomial time in , and . We set for all pure strategies and . By the definition, is nonnegative monotone submodular for any . Thus, we see that we can compute a -approximate solution for in polynomial time in and . This solution is -approximate for . Therefore, by Lemma 5.1, we observe the following result.

###### Theorem 5.2.

For any , there exists a -approximation algorithm where and the running time is polynomial with respect to and .

### 5.2 Heuristic algorithm

In this subsection, we propose a heuristic algorithm. Intuitively, in the algorithm, the players fictitiously play a game times. Here is a parameter. Let us assume that the leader would know that the follower estimates the leader’s mixed strategy by observing the past budget allocations. In every phase, the leader needs to allocate her budgets so that the mixed strategy estimated by the follower maximizes the leader’s utility. The algorithm outputs a mixed strategy by repeating this phase times.

We informally describe our algorithm, which is summarized in Algorithm 1. The algorithm repeatedly computes pure strategies , and outputs the best mixed strategy among . In the first round, is chosen to maximize . Each is computed greedily (lines 4–9).

In each round , we evaluate times, and each evaluation of takes time. Thus the total running time is .
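The description above can be turned into a rough Python sketch. This is a loose interpretation of the prose, not a transcription of Algorithm 1, and several details are assumptions: the follower is assumed to see the uniform mixture over the pure strategies played so far, each new pure strategy is grown greedily and stopped early when no medium improves the leader's best-response utility, $P_{F,v}$ is assumed to have the product form of equation (1), and all names and the toy instance are hypothetical.

```python
from itertools import combinations
from math import prod

def act(v, chosen, prob):
    """Product-form activation probability of customer v, as in equation (1)."""
    return 1.0 - prod(1.0 - prob[(u, v)] for u in chosen if (u, v) in prob)

def f(z, y, V, p, pF):
    return sum(act(v, z, p) * (1.0 - act(v, y, pF)) for v in V)

def g(z, y, V, p, pF):
    return sum(act(v, z, p) * act(v, y, pF) + (1.0 - act(v, z, p)) * act(v, y, p)
               for v in V)

def f_br(pures, U, V, p, pF, kF, tol=1e-12):
    """f_BR of the uniform mixture over the list `pures` (repetitions allowed)."""
    ys = [frozenset(c) for r in range(kF + 1) for c in combinations(U, r)]

    def avg(util, y):
        return sum(util(z, y, V, p, pF) for z in pures) / len(pures)

    g_max = max(avg(g, y) for y in ys)
    return max(avg(f, y) for y in ys if avg(g, y) >= g_max - tol)

def greedy_uniform_heuristic(U, V, p, pF, kL, kF, T, eps=1e-12):
    """Play T rounds; return the best uniform mixture over a prefix of the rounds."""
    history, best_val, best_mix = [], float("-inf"), None
    for _ in range(T):
        z = frozenset()
        val = f_br(history + [z], U, V, p, pF, kF)
        while len(z) < kL:
            options = [z | {u} for u in U if u not in z]
            if not options:
                break
            cand = max(options, key=lambda s: f_br(history + [s], U, V, p, pF, kF))
            cand_val = f_br(history + [cand], U, V, p, pF, kF)
            if cand_val <= val + eps:
                break  # utility is not monotone: stop before the budget is spent
            z, val = cand, cand_val
        history.append(z)
        if val > best_val + eps:
            best_val, best_mix = val, list(history)
    return best_mix, best_val

# Hypothetical toy instance.
U, V = ["a", "b"], [1, 2]
p = {("a", 1): 0.8, ("b", 2): 0.8}
pF = {("a", 1): 0.5, ("b", 2): 0.5}
best_mix, best_val = greedy_uniform_heuristic(U, V, p, pF, kL=1, kF=1, T=3)
print(best_mix, best_val)
```

The returned list may contain repeated sets; the mixed strategy it represents plays each entry with equal probability.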

## 6 Algorithm for disjoint customers

In this section, we focus on the disjoint customers setting where each customer is interested in only one medium, i.e., $|N_v| = 1$ for all $v \in V$. This means that the utility functions are bilinear. In this special case, we propose an LP-based algorithm, and modify it so that it runs fast when is small. We denote by the data size of an input game instance . The following proposition is the main result in this section.

###### Proposition 6.1.

When $|N_v| = 1$ for all $v \in V$, we can find a strong Stackelberg equilibrium in polynomial time with respect to and .

As we will see in Section 6.1, it is easy to compute a strong Stackelberg equilibrium by a multiple LP formulation. The running time is polynomial with respect to , , and . However, this is not sufficient since could be exponentially large with respect to and . To remove the dependency on , we reduce the size of each LP in Section 6.2. The idea is a projection of a leader’s mixed strategy onto a fractional budget allocation .

### 6.1 Multiple LP formulation

We first describe a simple exact algorithm to solve (2). The problem (2) is rewritten as

$$\max\ f(x, y) \quad \text{s.t.} \quad x \in D_L, \quad g(x, y) \ge g(x, y') \quad \forall y' \in D_F. \tag{3}$$

When we fix , LP (3) is equivalent to the following LP:

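$$\begin{aligned} \max\ & \sum_{z \in S_L} f(z, y^*)\, x_z \\ \text{s.t.}\ & \sum_{z \in S_L} \bigl(g(z, y^*) - g(z, y')\bigr)\, x_z \ge 0 \quad \forall y' \in D_F, \\ & \sum_{z \in S_L} x_z = 1, \quad x_z \ge 0 \quad \forall z \in S_L. \end{aligned} \tag{4}$$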

The simple algorithm solves (3) exactly by solving (4) for each . Each LP (4) is solvable in polynomial time with respect to , and , and the algorithm produces instances of LP (4). Thus this algorithm runs in polynomial time with respect to , , and .

### 6.2 Reduced formulation

Let be a matrix in whose rows are all pure strategies. For notational convenience, we denote for each . We denote by a fractional budget allocation with . We remark that a fractional budget allocation is a different notion from a mixed strategy ; the former is uniquely defined from the latter as , but the converse may not hold.

We first observe that projects a mixed strategy to a fractional budget allocation . Let .

###### Lemma 6.2.

For any vector , it holds that if and only if

$$0 \le z \le 1, \qquad \sum_{u \in U} z_u \le k_L. \tag{5}$$

###### Proof.

For any set of media, we define . It is not difficult to see that a vector is in if and only if satisfies

$$\sum_{u \in S} z_u \le q(S) \quad (\forall S \subseteq U), \qquad z \ge 0. \tag{6}$$

Moreover, $z$ satisfies (6) if and only if it satisfies (5). The only-if part is clear. To see the if part, assume that for some . If , then it holds that . Otherwise, i.e., , since , we have for some . ∎
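Lemma 6.2 implies that every vector satisfying (5) is the vector of marginals of some mixed strategy over sets of at most $k_L$ media. One standard way to recover such a mixed strategy with small support is systematic sampling, sketched below. This is not necessarily the construction used in the paper; the function name `decompose` and the example are assumptions for illustration. The coordinates are laid out as consecutive intervals of lengths $z_u$, and a random offset $\theta \in [0, 1)$ selects every medium whose interval contains a point $\theta + j$ for some integer $j$.

```python
from fractions import Fraction
from math import ceil, floor

def decompose(z, k):
    """Write z (0 <= z_u <= 1, sum(z) <= k) as a mixture of sets of at most k media.

    Returns {frozenset of indices: weight}; the weights sum to 1 and the
    probability that index i is selected equals z[i].
    """
    assert all(0 <= zu <= 1 for zu in z) and sum(z) <= k
    cum = [0]
    for zu in z:
        cum.append(cum[-1] + zu)  # cum[i] = z[0] + ... + z[i-1]
    # The selected set only changes when the offset crosses a fractional part of cum[i].
    cuts = sorted({c - floor(c) for c in cum} | {0, 1})
    mixture = {}
    for lo, hi in zip(cuts, cuts[1:]):
        theta = (lo + hi) / 2  # any offset strictly inside the piece (lo, hi)
        # Index i is selected iff some point theta + j lies in [cum[i], cum[i+1]).
        chosen = frozenset(
            i for i in range(len(z))
            if theta + ceil(cum[i] - theta) < cum[i + 1]
        )
        mixture[chosen] = mixture.get(chosen, 0) + (hi - lo)
    return mixture

# Hypothetical example: three media, each with marginal 1/2, and budget 2.
z = [Fraction(1, 2), Fraction(1, 2), Fraction(1, 2)]
mix = decompose(z, k=2)
for pure, weight in mix.items():
    print(sorted(pure), weight)
```

The support has at most one set per distinct fractional part of the cumulative sums, so its size is at most the number of media plus one. Exact rationals via `Fraction` avoid spurious breakpoints that floating-point rounding could introduce.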

We can rewrite $P_v(z)$ and $P_{F,v}(y)$ as $p_{uv} z_u$ and $p_{F,uv}\, y_u$, where $u$ is the only neighbor of $v$. Then $f$ and $g$ are simplified as

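$$\begin{aligned} f(x, y) &= \sum_{u \in U} \sum_{v \in N_u} p_{uv}\,(Ax)_u\,(1 - p_{F,uv}\, y_u), \\ g(x, y) &= \sum_{u \in U} \sum_{v \in N_u} p_{uv}\, y_u\,(1 - p'_{uv}\,(Ax)_u). \end{aligned}$$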

The utility functions $f$ and $g$ are bilinear. Moreover, they depend on the fractional budget allocation $Ax$ rather than on $x$ itself.

###### Lemma 6.3.

Assume that for all . For each and , it holds that and for any such that .
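The lemma can be checked numerically on a small disjoint instance: two different mixed strategies with the same marginals give the same utilities, and both agree with the simplified bilinear expressions. The sketch below rests on stated assumptions: `nbr[v]` is the single medium adjacent to customer `v`, the instance and the function names are hypothetical, and the coefficient $p'_{uv}$ is taken to be $p_{uv} - p_{F,uv}$, which is what substituting the single-neighbor forms into the definition of $g$ yields.

```python
def f_mixed(x, y, nbr, p, pF):
    """E_{z ~ x} of sum_v P_v(z) (1 - P_{F,v}(y)) with one neighbor per customer."""
    return sum(
        w * sum(p[v] * (u in z) * (1 - pF[v] * (u in y)) for v, u in nbr.items())
        for z, w in x.items()
    )

def g_mixed(x, y, nbr, p, pF):
    """E_{z ~ x} of sum_v [P_v(z) P_{F,v}(y) + (1 - P_v(z)) P_v(y)]."""
    total = 0.0
    for z, w in x.items():
        for v, u in nbr.items():
            Pz, PFy, Py = p[v] * (u in z), pF[v] * (u in y), p[v] * (u in y)
            total += w * (Pz * PFy + (1 - Pz) * Py)
    return total

def marginals(x, U):
    """Fractional budget allocation induced by the mixed strategy x."""
    return {u: sum(w for z, w in x.items() if u in z) for u in U}

def f_reduced(r, y, nbr, p, pF):
    """Simplified f, written in terms of the fractional allocation r."""
    return sum(p[v] * r[u] * (1 - pF[v] * (u in y)) for v, u in nbr.items())

def g_reduced(r, y, nbr, p, pF):
    """Simplified g, assuming p'_{uv} = p_{uv} - p_{F,uv}."""
    return sum(p[v] * (u in y) * (1 - (p[v] - pF[v]) * r[u]) for v, u in nbr.items())

# Hypothetical disjoint instance: customers 1 and 2 follow medium "a", customer 3 follows "b".
U = ["a", "b"]
nbr = {1: "a", 2: "a", 3: "b"}
p = {1: 0.8, 2: 0.5, 3: 0.6}
pF = {1: 0.4, 2: 0.1, 3: 0.3}
y = frozenset({"a"})

# Two mixed strategies with identical marginals (1/2, 1/2).
x1 = {frozenset({"a", "b"}): 0.5, frozenset(): 0.5}
x2 = {frozenset({"a"}): 0.5, frozenset({"b"}): 0.5}
r = marginals(x1, U)

print(f_mixed(x1, y, nbr, p, pF), f_mixed(x2, y, nbr, p, pF), f_reduced(r, y, nbr, p, pF))
print(g_mixed(x1, y, nbr, p, pF), g_mixed(x2, y, nbr, p, pF), g_reduced(r, y, nbr, p, pF))
```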

This lemma gives us an intuition that we solve (4) for a fractional budget allocation and recover a mixed strategy . We claim that LP (4) is polynomially equivalent to

 max ∑u∈U∑v∈Nupuv(1−pF,uvy∗u)rus.t. ∑u∈U∑v∈Nupuvy′ur′u≥0, y′u=y∗u−yu∀u∈U,y∈DF, r′u=1−p