Strategic investments in multi-stage General Lotto games

In adversarial interactions, one is often required to make strategic decisions over multiple periods of time, wherein decisions made earlier impact a player's competitive standing as well as how choices are made in later stages. In this paper, we study such scenarios in the context of General Lotto games, which model the competitive allocation of resources over multiple battlefields between two players. We propose a two-stage formulation where one of the players has reserved resources that can be strategically pre-allocated across the battlefields in the first stage. The pre-allocation then becomes binding and is revealed to the other player. In the second stage, the players engage by simultaneously allocating their real-time resources against each other. Our main contribution is a complete characterization of equilibrium payoffs in the two-stage game, revealing the interplay between performance and the amount of resources expended in each stage of the game. We find that real-time resources are at least twice as effective as pre-allocated resources. We then determine the player's optimal investment when there are linear costs associated with purchasing each type of resource before play begins, and there is a limited monetary budget.

I Introduction

In resource allocation problems, system planners must make investment decisions to mitigate the risks posed by disturbances or strategic interference. In many practical settings, these investments are made over several stages leading up to the actual time of allocation. For example, security measures in cyber-physical systems are deployed over long periods of time. As such, attackers can use knowledge of pre-deployed elements to identify vulnerabilities and exploits in the defender’s strategy [3, 23]. As another example, power grid operators must bid on forward-capacity (i.e., day-ahead, hour-ahead and real-time) markets to fulfill future demand. Although grid operators can significantly reduce energy prices and carbon emissions by procuring capacity in day- and hour-ahead markets, they still rely on real-time markets to account for uncertainty in the demand signal [2, 14]. Further examples include R&D contests, team management in competitive sports, and political lobbying [22].

Indeed, there are numerous real-world examples of systems in which both early and late investments contribute to the system performance. Notably, many of these scenarios consist of strategic interactions between competitors, and exhibit trade-offs when investing in pre-allocated and real-time resources (e.g., resource costs vs. flexibility in deployment, long-term vs. short-term gains). In such scenarios, system planners must choose their dynamic investments while accounting for their competitors’ decision making, and balancing the trade-offs in early and late investment.

In this manuscript, we seek to characterize the interplay between early and late investment in competitive resource allocation settings. We pursue this research agenda in the context of General Lotto games, a game-theoretic framework that explicitly describes the competitive allocation of resources between opponents. The General Lotto game is a popular variant of the classic Colonel Blotto game, wherein two budget-constrained players, and , compete over a set of valuable battlefields. The player that deploys more resources to a battlefield wins its associated value, and the objective for each player is to win as much value as possible. Outcomes in the standard formulations are determined by a single, simultaneous allocation of resources. In the novel formulation introduced in this paper, one of the players can strategically decide how to deploy resources before the actual engagement takes place. The placement of the pre-allocated resources thus has an effect on how the allocation decisions are made at the time of competition.

Specifically, we consider the following two-stage scenario. Player is endowed with resources to be pre-allocated, and both players possess real-time resources to be allocated at the time of competition. In the first stage, player decides how to deterministically deploy the pre-allocated resources over the battlefields. Player ’s endowments and pre-allocation decision then become known to player . In the second stage, both players engage in a General Lotto game where they simultaneously decide how to deploy their real-time resources, and payoffs are subsequently derived. We assume player does not have any pre-allocated resources at its disposal, and only has real-time resources to compete with. Each player can randomize the deployment of her real-time resources. Here, player must overcome both the pre-allocated and real-time resources deployed by player to secure a battlefield. A full summary of our contributions is provided below:

Our Contributions: Our main contribution in this paper is a full characterization of equilibrium payoffs to both players in our two-stage General Lotto game, given player has pre-allocated resources, real-time resources, and player has real-time resources (Theorem III-.1). This result also specifies how player should optimally deploy its pre-allocated resources to the battlefields, each of which has an arbitrary associated value. Our characterization explicitly reveals the relative effectiveness of pre-allocated and real-time resources: for any desired performance level against , we provide the set of all pairs that achieve the payoff for player (Theorem III-.2). As a consequence, we show that, to achieve the same performance using only one type of resource, player needs at least twice as many pre-allocated resources as real-time resources (Corollary 3.1).

Leveraging the main results, we then derive the optimal investment pair for player when there are linear per-unit costs to invest in both types of resources and a limited monetary budget is available. We note that it is optimal to invest in both resources only if the per-unit cost of pre-allocated resources is lower than that of real-time resources. Indeed, pre-allocated resources are less effective than real-time resources, since their deployment is not randomized and player has knowledge of their placement.

Related works: This manuscript takes preliminary steps towards developing analytical insights about competitive resource allocation in multi-stage scenarios. There is widespread interest in this research objective, where the focus ranges from zero-sum games [15, 9, 13], and dynamic games [8, 19], to Colonel Blotto games [20, 1, 18, 12]. The goal of many of these works is to develop computational tools to compute decision-making policies for agents in adversarial and uncertain environments. In comparison, our work provides explicit, analytical characterizations of equilibrium strategies, allowing for insights relating the players’ performance with various elements of adversarial interaction to be drawn. As such, our work is related to a recent research thread studying Colonel Blotto games in which allocation decisions are made over multiple stages [10, 7, 6, 16, 4].

Our work also draws significantly from the primary literature on Colonel Blotto and General Lotto games [5, 17, 11, 21]. In particular, the simultaneous-move subgame played in the second stage of our model was first proposed by Vu and Loiseau [21], and is known as the General Lotto game with favoritism (GL-F). Favoritism refers to the fact that pre-allocated resources provide an inherent advantage to one player’s competitive chances. Their work establishes existence of equilibria and develops computational methods to calculate them to arbitrary precision. However, this prior work considers pre-allocated resources as exogenous parameters of the game. In contrast, we model the deployment of pre-allocated resources as a strategic element of the competitive interaction. Furthermore, we provide the first analytical characterization of equilibria and the corresponding payoffs in GL-F games.

II Problem formulation

Fig. 1: (a) The two-stage General Lotto game under consideration. Players and compete over battlefields, whose valuations are given by . In Stage 1, player decides how to deploy pre-allocated resources to the battlefields. Player observes the deployment. In Stage 2, the players simultaneously decide how to deploy their real-time resources and , thus engaging in a General Lotto game with favoritism. (b) A contour map of player ’s equilibrium payoff in the Stage 2 game under the optimal deployment of her pre-allocated resources in Stage 1. The dashed lines indicate level curves, i.e. the set of resource pairs that achieve a given desired performance level (Theorem III-.2). Here, we have normalized the battlefield values and player ’s budget such that and . (c) This table shows the relative effectiveness of pre-allocated to real-time resources, . Here, is defined as the endowment (i.e. without real-time resources) that achieves the same performance as the endowment (i.e. without pre-allocated resources) for a given . We find that real-time resources are at least twice as effective as pre-allocated resources, and can be arbitrarily more effective in certain settings (Corollary 3.1).

The General Lotto game with pre-allocations (GL-P) is a two-stage game with players and , who compete over a set of battlefields, denoted as . Each battlefield is associated with a known valuation , which is common to both players. Player is endowed with a pre-allocated resource budget and a real-time resource budget . Player is endowed with a real-time resource budget , but no pre-allocated resources. (Footnote 1: Recent computational advances (see, e.g., [21]) permit the study of the scenario where both players are endowed with pre-allocated resources. In this work, we seek to provide analytical characterizations of equilibrium payoffs, and, thus, consider the simpler, unilateral pre-allocation setting.) The two stages are played as follows:

Stage 1: Player decides how to allocate her pre-allocated resources to the battlefields, i.e., she selects a vector . We term the vector as player ’s pre-allocation profile. No payoffs are derived in Stage 1, and ’s choice becomes binding and common knowledge.

Stage 2: Players and compete in a simultaneous-move sub-game over with their real-time resource budgets , . Here, both players can randomly allocate these resources as long as their expenditure does not exceed their budgets in expectation. Specifically, a strategy for player is an -variate (cumulative) distribution over allocations that satisfies

(1)

We use to denote the set of all strategies that satisfy (1). Given that player chose in Stage 1, the expected payoff to player is given by

(2)

where if , and otherwise for any two numbers . (Footnote 2: The tie-breaking rule (i.e., deciding who wins if ) can be assumed to be arbitrary, without affecting any of our results. This property is common in the General Lotto literature; see, e.g., [11, 21].) In words, player must overcome player ’s pre-allocated resources as well as player ’s allocation of real-time resources in order to win battlefield . The parameter is the relative quality of player ’s real-time resources against player ’s resources. When (resp. ), they are less (resp. more) effective than player ’s resources. The payoff to player is , where we denote .
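The win rule and the expected-budget constraint described above can be illustrated with a small Monte Carlo sketch. This is only an illustration under stated assumptions, not the equilibrium analysis of the paper: the function names are hypothetical, the relative-quality parameter is taken to be 1, ties are awarded to the pre-allocating player (the text notes that tie-breaking is arbitrary), and the sampling strategy `proportional_uniform` is a simple placeholder that meets its budget only in expectation, not an equilibrium strategy.

```python
import random


def wins(own_total, rival_total):
    """Return 1.0 if own_total >= rival_total, else 0.0.

    Ties go to the first argument here (an assumption; the source states
    that the tie-breaking rule can be chosen arbitrarily).
    """
    return 1.0 if own_total >= rival_total else 0.0


def realized_payoff(values, pre_alloc, own_rt, rival_rt):
    """Value won by the pre-allocating player for one realized allocation.

    On each battlefield the rival must beat the pre-allocated amount plus
    the real-time amount placed there (quality parameter assumed equal to 1).
    """
    return sum(
        w * wins(p + x, y)
        for w, p, x, y in zip(values, pre_alloc, own_rt, rival_rt)
    )


def proportional_uniform(values, budget):
    """Hypothetical placeholder strategy, not an equilibrium strategy.

    Draws each battlefield's allocation uniformly on [0, 2 * m_b], where
    m_b = budget * w_b / sum(values). The expected total spend equals the
    budget, although single realizations may exceed it.
    """
    total_value = sum(values)
    means = [budget * w / total_value for w in values]

    def sample(rng):
        return [rng.uniform(0.0, 2.0 * m) for m in means]

    return sample


def estimate_payoff(values, pre_alloc, own_sampler, rival_sampler,
                    n_draws=20000, seed=0):
    """Monte Carlo estimate of the pre-allocating player's expected payoff."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        total += realized_payoff(
            values, pre_alloc, own_sampler(rng), rival_sampler(rng)
        )
    return total / n_draws


if __name__ == "__main__":
    values = [1.0, 2.0]        # battlefield valuations (illustrative)
    pre_alloc = [0.3, 0.0]     # Stage 1 pre-allocation profile (illustrative)
    own = proportional_uniform(values, budget=1.0)
    rival = proportional_uniform(values, budget=1.0)
    print(estimate_payoff(values, pre_alloc, own, rival))
```

Because the budget is enforced only in expectation, individual draws from `proportional_uniform` can spend more than the budget; this is the feature that distinguishes General Lotto strategies from Colonel Blotto strategies, whose budgets must hold on every realization.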

Stages 1 and 2 of GL-P are illustrated in Figure 1a. We denote an instance of GL-P as , and note that the Stage 2 sub-game (i.e., the game with fixed pre-allocation profile) is an instance of the General Lotto game with favoritism [21]. For a given GL-P instance , we define an equilibrium as any joint strategy profile that satisfies

(3)

for any , and . Notably, player ’s strategy consists of her deterministic pre-allocation profile in Stage 1, as well as her randomized allocation of real-time resources in Stage 2. It follows from the results in [21] that an equilibrium exists in any GL-P instance , and that the equilibrium payoffs , , are unique. For ease of notation, we will use , , to denote players’ equilibrium payoffs in when the dependence on the vector is clear.

III Main results

In this section, we present our main result: the characterization of players’ equilibrium payoffs in the GL-P game. We then use this result to derive an expression for the level sets of the function in , and to compare the relative effectiveness of pre-allocated and real-time resources.

The result below provides an explicit characterization of player ’s equilibrium payoff . Note that player ’s equilibrium payoff is simply .

Theorem III-.1.

Consider a GL-P game instance with , and . The following conditions characterize player ’s equilibrium payoff :

  1. If , or and , then is

    (4)
  2. Otherwise, is

    (5)

The derivation of the above result is challenging because explicit expressions for the players’ payoffs in the Stage 2 sub-game are generally not attainable for arbitrary . Moreover, these payoffs are not generally concave. Our approach is to show that for any , the payoff is nondecreasing in the direction pointing to . The full proof is given in Appendix B and relies on methods developed in [21], which are summarized in Appendix A.

As a consequence of our main result, we are able to characterize expressions for the level curves of the function . That is, for a desired performance level and fixed , we provide the set of all pairs such that .

Theorem III-.2.

Given any and , fix a desired performance level . The set of all pairs that satisfy is given by the following conditions:

If , then

(6)

If , then

(7)

If , then for any .

We plot the surface for as well as the level curves corresponding to in Figure 1b. Notably, for any , the level curve is strictly decreasing and convex in , where we use to explicitly note the dependence on . Hence, the function is quasi-concave in .

We can use the result in Theorem III-.2 to obtain an expression for the relative effectiveness of pre-allocated and real-time resources when these are deployed in isolation. In the following corollary, we provide this expression, and observe that real-time resources are at least twice as valuable as pre-allocated resources, and can be arbitrarily more valuable in specific settings:

Corollary 3.1.

For given , the unique value such that is characterized by the following expression:

(8)

Notably, the ratio is lower-bounded by , and as .

The table in Figure 1c compares the relative effectiveness of pre-allocated and real-time resources corresponding with the performance levels considered in Figure 1b.

IV Optimal investment decisions

In this section, we consider a scenario where player has an opportunity to make an investment decision regarding its resource endowments. That is, the pair is a strategic choice made by before the game is played. Given a monetary budget for player , any pair must belong to the following set of feasible investments:

(9)

where is the per-unit cost for purchasing pre-allocated resources, and we assume the per-unit cost for purchasing real-time resources is 1 without loss of generality. We are interested in characterizing player ’s optimal investment subject to the above cost constraint, and given player ’s resource endowment . This is formulated as the following optimization problem:

(10)

In the result below, we derive the complete solution to the optimal investment problem (10).

Corollary 4.1.

Fix a monetary budget , relative per-unit cost , and real-time resources for player . Then, player ’s optimal investment in pre-allocated resources for the optimization problem in (10) under the linear cost constraint in (9) is

(11)

where . The optimal investment in real-time resources is . The resulting payoff to player is given by

(12)

The above solution is obtained by leveraging the level set characterization from Theorem III-.2, and the fact that the level sets are strictly decreasing and convex for . We omit details of the proof for space considerations. A visual illustration of how the optimal investments are determined is shown in Figure 2. The budget constraint is a line segment in , and we thus seek the level curve that lies tangent to the segment. In cases where the cost is sufficiently high, no level curve lies tangent to , and, thus, player invests all of her budget in real-time resources.
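The search along the budget line can be sketched numerically. The snippet below is a hedged illustration only: it assumes the budget constraint binds, so that feasible investments lie on the segment where real-time spending plus cost times pre-allocated spending equals the monetary budget (consistent with the line segment described for Figure 2), and it takes the payoff as a user-supplied callable. The names `feasible_investments`, `best_investment`, and `toy_payoff` are hypothetical, and `toy_payoff` is a stand-in with decreasing, convex level curves, not the equilibrium payoff of Theorem III-.1.

```python
import math


def feasible_investments(budget, cost, n_points=201):
    """Yield (pre_allocated, real_time) pairs on the budget segment.

    Assumes real_time + cost * pre_allocated = budget with both amounts
    nonnegative, cost > 0, and n_points >= 2.
    """
    max_pre = budget / cost
    for k in range(n_points):
        pre = max_pre * k / (n_points - 1)
        real = max(budget - cost * pre, 0.0)  # clamp rounding error at the endpoint
        yield pre, real


def best_investment(payoff, budget, cost, n_points=201):
    """Grid search for the feasible pair that maximizes payoff(pre, real)."""
    return max(
        feasible_investments(budget, cost, n_points),
        key=lambda pair: payoff(*pair),
    )


def toy_payoff(pre, real):
    """Stand-in payoff (NOT the paper's formula).

    Its level curves real = v + 1 - sqrt(pre + 1) are decreasing and convex
    in pre, which is the only property borrowed from the text.
    """
    return real + math.sqrt(pre + 1.0) - 1.0


if __name__ == "__main__":
    # Low cost of pre-allocated resources: an interior (tangency) solution.
    print(best_investment(toy_payoff, budget=2.0, cost=0.25))
    # High cost: the whole budget goes to real-time resources.
    print(best_investment(toy_payoff, budget=2.0, cost=0.8))
```

With the stand-in payoff, a low per-unit cost yields a mixed investment where a level curve is tangent to the budget segment, while a sufficiently high cost pushes the maximizer to the endpoint with no pre-allocated resources, mirroring the two cases discussed above. Substituting the closed-form payoff from the main result for `toy_payoff` would be needed to reproduce the paper's actual numbers.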

Fig. 2: The optimal investment subject to the linear cost constraint in (9). Here, we consider the optimal investment problem when , , and . Observe that the set of feasible investments is the line segment connecting and . The optimal investment lies on the level curve tangent to this line segment. For example, when , the optimal investment is (unfilled circle), as (dotted, black line) is tangent to the level curve with (solid, orange line). For sufficiently high cost , will not be tangent to any level curve, and the optimal investment is . For example, when , observe that (dashed, black line) is not tangent even to the level curve with (solid, blue line), and the optimal investment is (filled square).

V Conclusion

In this manuscript, we studied the strategic role of pre-allocations in competitive interactions under a two-stage General Lotto game model. We identified an explicit expression for the set of pre-allocated and real-time budget pairs that correspond with a given desired performance. We then used this explicit expression to derive the optimal dynamic investment strategy under a given linear cost constraint, and to compare the relative effectiveness of pre-allocated and real-time resources when deployed in isolation. Exciting directions for future work include studying the strategic outcomes (i.e., equilibria) when both players can make pre-allocations, and introducing heterogeneities in players’ battlefield valuations and resource effectiveness to the model.

References

  • [1] T. Aidt, K. A. Konrad, and D. Kovenock (2019) Dynamics of conflict. European Journal of Political Economy 60, pp. 101838.
  • [2] A. Ben-Tal, A. Goryashko, E. Guslitzer, and A. Nemirovski (2004) Adjustable robust solutions of uncertain linear programs. Mathematical Programming 99 (2), pp. 351–376.
  • [3] G. Brown, M. Carlyle, J. Salmerón, and K. Wood (2006) Defending critical infrastructure. Interfaces 36 (6), pp. 530–544.
  • [4] R. Chandan, K. Paarporn, D. Kovenock, M. Alizadeh, and J. R. Marden (2021) The art of concession in General Lotto games. ESI Working Paper 21-24.
  • [5] O. Gross and R. Wagner (1950) A continuous Colonel Blotto game. Technical report RAND Project, Air Force, Santa Monica.
  • [6] A. Gupta, T. Başar, and G. Schwartz (2014) A three-stage Colonel Blotto game: when to provide more information to an adversary. In International Conference on Decision and Game Theory for Security, pp. 216–233.
  • [7] A. Gupta, G. Schwartz, C. Langbort, S. S. Sastry, and T. Başar (2014) A three-stage Colonel Blotto game with applications to cyberphysical security. In 2014 American Control Conference, pp. 3820–3825.
  • [8] R. Isaacs (1965) Differential games: a mathematical theory with applications to warfare and pursuit, control and optimization. Wiley.
  • [9] D. Kartik and A. Nayyar (2021) Upper and lower values in zero-sum stochastic games with asymmetric information. Dynamic Games and Applications 11 (2), pp. 363–388.
  • [10] D. Kovenock and B. Roberson (2012) Coalitional Colonel Blotto games with application to the economics of alliances. Journal of Public Economic Theory 14 (4), pp. 653–676.
  • [11] D. Kovenock and B. Roberson (2020) Generalizations of the General Lotto and Colonel Blotto games. Economic Theory, pp. 1–36.
  • [12] V. Leon and S. R. Etesami (2021) Bandit learning for dynamic Colonel Blotto game with a budget constraint. In 2021 60th IEEE Conference on Decision and Control (CDC), pp. 3818–3823.
  • [13] L. Li and J. S. Shamma (2020) Efficient strategy computation in zero-sum asymmetric information repeated games. IEEE Transactions on Automatic Control 65 (7), pp. 2785–2800.
  • [14] S. Li, A. Shetty, K. Poolla, and P. Varaiya (2020) Optimal resource procurement and the price of causality. IEEE Transactions on Automatic Control 66 (8), pp. 3489–3501.
  • [15] A. Nayyar, A. Gupta, C. Langbort, and T. Başar (2013) Common information based Markov perfect equilibria for stochastic games with asymmetric information: finite games. IEEE Transactions on Automatic Control 59 (3), pp. 555–570.
  • [16] K. Paarporn, R. Chandan, D. Kovenock, M. Alizadeh, and J. R. Marden (2021) Strategically revealing intentions in General Lotto games. arXiv preprint arXiv:2110.12099.
  • [17] B. Roberson (2006) The Colonel Blotto game. Economic Theory 29 (1), pp. 1–24.
  • [18] D. Shishika, Y. Guan, M. Dorothy, and V. Kumar (2021) Dynamic defender-attacker Blotto game. arXiv preprint arXiv:2112.09890.
  • [19] A. Von Moll, M. Pachter, D. Shishika, and Z. Fuchs (2020) Guarding a circular target by patrolling its perimeter. In 2020 59th IEEE Conference on Decision and Control (CDC), pp. 1658–1665.
  • [20] D. Q. Vu, P. Loiseau, and A. Silva (2019) Combinatorial bandits for sequential learning in Colonel Blotto games. In 2019 IEEE 58th Conference on Decision and Control (CDC), pp. 867–872.
  • [21] D. Q. Vu and P. Loiseau (2021) Colonel Blotto games with favoritism: competitions with pre-allocations and asymmetric effectiveness. In Proceedings of the 22nd ACM Conference on Economics and Computation, pp. 862–863.
  • [22] H. Yildirim (2005) Contests with multiple rounds. Games and Economic Behavior 51 (1), pp. 213–227.
  • [23] C. Zhang and J. E. Ramirez-Marquez (2013) Protecting critical infrastructures against intentional attacks: a two-stage game with incomplete information. IIE Transactions 45 (3), pp. 244–258.