A well understood problem in combinatorial optimization is that of maximizing a linear function over a polymatroid. As shown in Edmonds (1970), the solution of the problem is given by a simple greedy algorithm whose output is some vertex of the base of the polymatroid. A similar algorithm can be used to minimize a linear function over a contrapolymatroid. (All concepts will be defined precisely in Section 2.)
Many optimization problems can be viewed as special cases of this problem. The general approach is to associate some “performance vector” with each possible choice of policy for the problem in question, then to show that the convex hull of these vectors is the base of a polymatroid or a contrapolymatroid. The objective function is then expressed as a linear function over this polyhedron, so that it can be optimized using the classic greedy algorithm.
One example of such a problem is the single machine scheduling problem of choosing the order in which to process a finite set of jobs with given processing times so as to minimize their weighted sum of completion times. The solution to this problem is given by a simple index rule known as Smith’s Rule (Smith, 1956). Queyranne (1993) later showed that this solution can be derived by showing that the convex hull of the vectors of completion times of jobs is the base of a contrapolymatroid, and observing that the weighted sum of completion times is a linear function over this polyhedron. See also Queyranne and Schulz (1994). The problem is equivalent to a search problem considered in Bellman (1957) [Chapter III, Exercise 3, p. 90], where a target is located in one of a finite number of boxes with search costs according to a known probability distribution and the aim is to locate the target with minimal expected cost.
introduced a single machine scheduling problem where a finite set of jobs each has a given reward and a given probability of being successfully processed. If the machine fails to process a job it cannot process any further jobs. The problem is to choose in which order to schedule the jobs to maximize the total expected reward. Like the first scheduling problem described above, this problem is also solved by a simple index rule. More recently, the same problem was studied independently under the nomenclature of the unreliable jobs problem by Agnetis et al. (2009), who considered performance vectors corresponding to the probabilities that the jobs are successfully processed under a given schedule, and showed that the convex hull of these vectors is the base of a polymatroid. Hence, by writing the objective of the problem as a linear function over this polyhedron, the index rule for this problem can also be derived from the greedy algorithm for optimizing a linear function over a polymatroid. Kodialam (2001) had previously studied this same polymatroid to solve a different but related problem in sequential testing.
By considering so-called conservation laws, Federgruen and Groenevelt (1988) showed that the performance space of several multiclass queueing systems has a polymatroid structure, and this was extended to many other queueing problems by Shanthikumar and Yao (1992). Some cases of the well-known priority rule for queueing can be derived by considering a linear optimization problem over the base of a polymatroid.
In this paper we introduce and solve a new max-min version of the classic problem of maximizing a linear function over a polymatroid. It generalizes many problems on search games, sequential testing and queueing; some known and some new. In particular, we solve a case of the weighted search game introduced by Yolmeh and Baykal-Gürsoy (2021), where a Searcher aims to minimize a weighted time to find a target hidden among a finite number of locations with varying weights and search times. We extend the weighted search game to incorporate the variable speed search paradigm of Alpern and Lidbetter (2014), and give a solution to this problem too. We show that the solution of a search and rescue game introduced by Lidbetter (2020) also follows from a corollary of our main results; furthermore we solve a more elaborate search and rescue game.
We show that our approach yields an alternative solution to a problem in sequential testing previously solved by Kodialam (2001) and Condon et al. (2009), in which operators sequentially perform tests on some tuples until obtaining a negative test, and the objective is to find a randomized routing of tuples to maximize throughput.
Our main problem can also be used to address some max-min (or min-max) multiclass queueing problems, which, as far as we know, have not previously been considered in the literature. Although there are several possible applications, we consider one concrete example of a multiclass queueing problem in which one server processes jobs with exponentially distributed service times that arrive according to a Poisson process. The objective is to choose a randomized priority rule to minimize the maximum expected holding cost of any job class in the steady state of the system. A solution to this problem follows from our results.
In Section 2 we review the notion of a polymatroid and the classic greedy algorithm of Edmonds (1970). We then describe and solve our main problem, framing it as a zero-sum game between a maximizer whose pure strategies are the set of vertices of the base of a polymatroid and a minimizer whose pure strategies are the coordinates . The payoff of the game is for some fixed positive weights (in contrast to the classic problem where the objective is ). We give optimal strategies for both players and an expression for the value of the game. Furthermore, we give a complete characterization of the set of optimal strategies for Player 2. The value of the game and optimal strategies for both players can be found in strongly polynomial time (in the dimension of the polymatroid). We show that in some special cases, the value of the game can be found particularly quickly. In Section 3, we consider some variations of our game involving contrapolymatroids and min-max objectives.
We apply our general result to several search games in Section 4, some of which have known solutions and some of which do not. These are two-person zero-sum games, where one player hides a “target”, which the other player must locate. See Alpern and Gal (2003) or Hohzaki (2016) for an overview of the search games literature. We also make a link to a problem in sequential testing in Section 5, showing that it is a special case of the problem addressed in this paper. In Section 6, we discuss further implications of our results in the field of queueing theory, giving an example of a min-max problem in queueing theory whose solution follows from this work.
Finally, in Section 7, we consider a special case of our game in which the payoff function satisfies a certain monotonicity property. For this case, we give an efficient procedure that implements an optimal strategy for Player 1. The support of this strategy is of exponential size, and the procedure does not output an explicit representation of it as a convex combination of pure strategies. Instead, the procedure can be used to efficiently generate a pure strategy, drawn from the support of this optimal strategy with appropriate probability.
2 Problem Statement and Solution
In this section we define and solve our main problem, then consider some special cases and variations. But first, we review the definition of polymatroids and some elementary facts about them.
2.1 Review of Elementary Polymatroid Theory
Recall that a function is submodular if for all and is supermodular if for all .
For the rest of this section we assume that is a non-negative, non-decreasing (with respect to set inclusion) submodular function with , where for some positive integer . (We set if .) We assume that the values are given by an oracle. Let be the polymatroid associated with , given by
where . We first review the problem of maximizing a linear function over , where is a constant. Let be a permutation (or bijection) of such that . The classic solution to the problem, given in Edmonds (1970) is the point given by
Notice that for any and , we have , so an equivalent problem is to maximize over the base polyhedron of , given by
The vertices of are given by all points defined by (1), as ranges over the set of all possible permutations of .
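The greedy construction reviewed above can be sketched in a few lines. This is a minimal illustration, not the authors' code: it assumes the oracle is a Python callable `f` on frozensets with `f(frozenset()) == 0`, and the names `greedy_vertex`, `maximize_linear`, `f_example` and `c_example` are hypothetical. Each coordinate is set to the marginal value of its element given the elements placed before it, and the linear objective is maximized by visiting elements in non-increasing order of their coefficients.

```python
from itertools import permutations


def greedy_vertex(f, order):
    """Vertex of the base polyhedron generated by a visiting order.

    f     -- set-function oracle on frozensets (assumed submodular, f(empty) = 0)
    order -- sequence listing every element of the ground set exactly once
    """
    x = {}
    prefix = frozenset()
    prev = f(prefix)
    for j in order:
        prefix = prefix | {j}
        cur = f(prefix)
        x[j] = cur - prev  # marginal value of j given the earlier elements
        prev = cur
    return x


def maximize_linear(f, c):
    """Greedy maximizer of sum_j c[j] * x[j] over the base polyhedron."""
    order = sorted(c, key=lambda j: -c[j])  # non-increasing coefficients
    return greedy_vertex(f, order)


# Hypothetical instance: f(S) = min(|S|, 2), the rank function of a uniform matroid.
def f_example(S):
    return min(len(S), 2)


c_example = {1: 3.0, 2: 1.0, 3: 2.0}
x_star = maximize_linear(f_example, c_example)
```

On this small instance the greedy point can be compared against every vertex by enumerating all permutations with `greedy_vertex`, which is only feasible for very small ground sets.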
Later, we will use the following fact, which is easy to verify.
If , then for any that maximizes , there exists some such that and (that is, precedes in ).
We note that in giving running times, we assume that it takes only constant time to answer an oracle query.
2.2 The Main Problem
The problem we consider in this paper is that of finding some to maximize . Equivalently, we consider a zero-sum game in which a pure strategy for Player 1 (the maximizer) is a permutation of (or, equivalently, a vertex of ) and a pure strategy for Player 2 (the minimizer) is a direction . For a given pair of pure strategies and , the payoff is given by
We will usually drop the and from the subscript of . We denote this game by . (In Section 3 we will consider a version of the game where Player 1 is the minimizer.) A mixed strategy for Player 1 corresponds to a point of and the expected payoff of such a strategy against a pure strategy of Player 2 is .
A mixed strategy for Player 2 is a randomized choice of directions, where each is chosen with some probability , where . For such a mixed strategy, the payoff against a strategy of Player 1 is
where , and is the th coordinate vector.
Equivalently, we may consider a mixed strategy for Player 2 as a point of the simplex
so that a pure strategy for Player 2 is a vertex of . In a small abuse of our notation, we write for the expected payoff when Player 1 uses strategy and Player 2 uses strategy . When one player uses a pure strategy and the other uses a mixed strategy, we extend the use of in the natural way.
Since each player has a finite number of pure strategies, the game has optimal mixed strategies and a value , by the minimax theorem for zero-sum games, where
Note that for a given mixed strategy of Player 2, the problem of finding a best response for Player 1 is that of choosing to maximize . This is the classical problem solved in Edmonds (1970) of maximizing a linear function over . With this observation, it follows that an optimal strategy for Player 1 can be computed in polynomial time using the ellipsoid algorithm (see e.g., Hellerstein et al. (2019)). In what follows, we give a strongly polynomial time algorithm.
Any given mixed strategy of Player 2 can be expressed uniquely as a convex combination of his pure strategies (that is, vertices of ) simply by taking . A given mixed strategy of Player 1 can be written as a convex combination of at most of her pure strategies , by Carathéodory’s Theorem. In general, as discussed in Hoeksma et al. (2014), such a representation can be found in strongly polynomial time by combining the generic approach of Grötschel et al. (2012) with the algorithm of Fonlupt and Skoda (2009) for finding the intersection of a line with a polymatroid. The runtime of this algorithm is . For particular problems it is possible to exploit the structure of in order to find a more efficient algorithm for representing a Player 1 mixed strategy as a convex combination of at most of her pure strategies.
For a subset , , denote by . Consider the Player 2 mixed strategy
For a Player 1 strategy , the expected payoff against is
by definition of . We summarize this in the following lemma.
If Player 2 uses the strategy for some , the expected payoff is at most .
We will show in Theorem 4 that the strategy is optimal for Player 2, where is chosen to minimize . A minimizing set can be found in strongly polynomial time, using a parametric search (see Iwata et al. (1997) [Section 6] for a parametric search algorithm for minimizing the ratio of a submodular function to a non-negative supermodular function). This relies on an algorithm for minimizing a submodular function. The fastest known strongly polynomial algorithm for submodular function minimization is that of Orlin (2009), whose runtime is , so that the minimization of takes time .
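For very small instances, the minimizing set and the associated Player 2 strategy can be found by brute force. The sketch below rests on assumptions that are not visible in the text as extracted: the ratio being minimized is taken to be f(A)/w(A) with w(A) the sum of the weights in A, the payoff of coordinate j against a point x is taken to be x[j]/w[j], and the Player 2 strategy for a set A plays j in A with probability w[j]/w(A). All function names are hypothetical, and the enumeration is exponential, unlike the parametric search cited above.

```python
from itertools import combinations


def nonempty_subsets(ground):
    """Yield every non-empty subset of the ground set as a frozenset."""
    items = sorted(ground)
    for r in range(1, len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)


def min_ratio_set(f, w, tol=1e-12):
    """Brute-force minimizer of f(A) / w(A); ties go to the larger set."""
    best_set, best_ratio = None, float("inf")
    for A in nonempty_subsets(w):
        ratio = f(A) / sum(w[j] for j in A)
        better = ratio < best_ratio - tol
        tie_but_larger = (
            best_set is not None
            and abs(ratio - best_ratio) <= tol
            and len(A) > len(best_set)
        )
        if better or tie_but_larger:
            best_set, best_ratio = A, ratio
    return best_set, best_ratio


def player2_strategy(A, w):
    """Assumed form of the strategy: probability w[j] / w(A) on each j in A."""
    total = sum(w[j] for j in A)
    return {j: (w[j] / total if j in A else 0.0) for j in w}


def expected_payoff(x, p, w):
    """Expected payoff of point x against mixed strategy p (assumed payoff x[j] / w[j])."""
    return sum(p[j] * x[j] / w[j] for j in w)


# Hypothetical instance.
def f_example(S):
    return min(len(S), 2)


w_example = {1: 1.0, 2: 1.0, 3: 2.0}
A_star, ratio_star = min_ratio_set(f_example, w_example)
p_star = player2_strategy(A_star, w_example)
```

Under these assumptions the expected payoff of any point x of the polymatroid against `p_star` equals x(A)/w(A), which is bounded by f(A)/w(A), in line with the lemma above.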
Before stating and proving the theorem, we define a strategy which will be optimal for Player 1. To do this, we recursively define a partition of into subsets .
Definition 3 (- decomposition)
Set and suppose have already been defined for some . Then if is equal to , set . If not, we define to be any set that minimizes , where
We call an - decomposition of .
Note that the function is the ratio of a submodular function and a modular function, therefore, as remarked earlier, it can be minimized in strongly polynomial time. Since is defined in terms of and , a more informative notation is , but we omit the superscripts in general when they are clear from the context.
We now define the Player 1 strategy by
To show that it is indeed a strategy, we need to prove that it lies in . Let be arbitrary and let for . Also set . Then
by definition of . Since is submodular, , so
Hence, . It is also easy to see that , so that .
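The recursive partition and the resulting Player 1 point can be illustrated by brute force on a toy instance. The sketch assumes, since the displayed formulas are not available in this text, that at each stage the next set S minimizes (f(T ∪ S) − f(T))/w(S) over non-empty subsets S of the uncovered elements, where T is the union of the sets chosen so far, and that each j in that set receives x[j] = w[j] times the minimum ratio. The function and variable names are hypothetical, and the enumeration is exponential.

```python
from itertools import combinations


def decomposition(f, w, tol=1e-12):
    """Brute-force sketch of the recursive partition and the induced point x."""
    remaining = set(w)
    covered = frozenset()
    parts, x = [], {}
    while remaining:
        best, best_ratio = None, float("inf")
        items = sorted(remaining)
        for r in range(1, len(items) + 1):
            for combo in combinations(items, r):
                S = frozenset(combo)
                # marginal value of S beyond the covered elements, per unit weight
                ratio = (f(covered | S) - f(covered)) / sum(w[j] for j in S)
                if ratio < best_ratio - tol:
                    best, best_ratio = S, ratio
        for j in best:
            x[j] = w[j] * best_ratio
        parts.append((best, best_ratio))
        covered = covered | best
        remaining -= best
    return parts, x


# Hypothetical instance.
def f_example(S):
    return min(len(S), 2)


w_example = {1: 1.0, 2: 1.0, 3: 4.0}
parts, x = decomposition(f_example, w_example)
```

On this instance the stage ratios come out non-decreasing and the coordinates sum to f of the whole ground set, matching the feasibility argument above; the smallest value of x[j]/w[j] is attained on the first set of the partition.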
Suppose is a non-empty set that minimizes . Then the value of the game is equal to . An optimal strategy for Player 2 is . An optimal strategy for Player 1 is , where is any - decomposition.
Proof. By Lemma 2, the value of the game is at most . To complete the proof, we will show that ensures a payoff at least against any Player 2 strategy. Note that for a pure strategy of Player 2 with , the expected payoff against is
So it is sufficient to show that is non-decreasing in . By definition of , we have
for . Writing and rearranging yields
This is equivalent to , and the proof is complete.
In general, both players have multiple optimal strategies. For Player 2, we can characterize these strategies.
Let be the family of sets that minimize , so that the value of the game is equal to for any . We also set to be equal to , so that . It is useful to note that is a lattice. Indeed, suppose . In the following calculation, we use the observation that for any , if then , where the second inequality is tight if the first is also tight. We have
where the equality and second inequality follow from our observation and the first inequality follows from the submodularity of . Therefore, the two inequalities hold with equality, and .
A Player 2 strategy is optimal if and only if it is in the convex hull of .
Proof. By Theorem 4, each element of is optimal, so any convex combination of such points is also optimal.
For the opposite direction, suppose that is an optimal Player 2 strategy. By relabeling, let us assume that . Then recalling that for and setting , we can write as
where . Note that
where the final equality follows from the fact that . So is a convex combination of the strategies . We claim that if for some then , so that is in fact a convex combination of strategies with . Indeed, suppose that , so that . Since any pure strategy best response to maximizes , by Lemma 1, we can express as a point such that the first terms of are in some order. So by definition of ,
Equation (3) also holds for any mixed strategy which is a best response to (since must be a mixture of pure best responses to ). In particular, it holds for , where is any - decomposition of whose first element is the maximal element of .
We claim that . Let and suppose for some . Since and any Player 2 pure strategy in the support of that is played with positive probability must be a best response to , it follows that strategy is a best response to . But by the maximality of , inequality (2) with is strict, and rearranging gives . Since is non-decreasing, for any ,
so cannot be a best response to , a contradiction. Hence, so .
Now, by definition of ,
where is the value of the game. Combining this with (3) yields , so . This completes the proof.
2.3 Special Cases
To find optimal strategies in the game , it is necessary to minimize the function . As previously remarked, there is a strongly polynomial time algorithm for this problem with runtime . To calculate an optimal Player 1 strategy, this algorithm must be run at most times, so the overall runtime is . For some functions , this minimization can be performed much faster, as we show in the remainder of this section.
We say that the payoff is -decreasing if there exists such that for any and any with ,
If we say is -increasing. If for all , then we say is decreasing (or respectively increasing).
If the payoff is -decreasing (or increasing) we assume that the values are given as part of the input of the problem.
Suppose is -decreasing. Then is equal to for some .
Proof. It is sufficient to show that if is -decreasing and and , then . Let be any optimal Player 1 strategy such that the first set in the partition is , and write as a convex combination of pure strategies. Since , for any best response to , we can write , where , by Lemma 1. Since every pure strategy in the support of must be a best response to , we can assume that if then . It follows from (4) that if and , then
By Theorem 5, every element of (in particular, ) is in the support of some optimal Player 2 strategy and cannot be in the support of any Player 2 strategy. Therefore, must be a best response to and cannot be a best response, so that
It is worth pointing out that although the definition of -decreasing and the proof of Lemma 7 are given in game theoretic terms, the lemma is not exactly a game theoretic result, and could be stated without reference to the game . Indeed, it is easy to see that is -decreasing if and only if there exists such that
for any with .
Lemma 7 implies that for games with a -decreasing payoff function, the set can be found in time , simply by relabeling the elements of so that they are in non-decreasing order of the index , computing for each and choosing the largest that minimizes this function. (Note that these computations can be done in time by keeping a record of each time and adding to obtain .) Therefore the value of the game and the optimal Player 2 strategy can be found in time .
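The prefix scan described above can be sketched as follows. The sketch assumes that the quantity computed for each prefix is f(prefix)/w(prefix) and that an index for each element is supplied as input; it does not check that the payoff actually has the monotonicity property required by Lemma 7. The names `best_prefix`, `index_example` and the instance are hypothetical.

```python
def best_prefix(f, w, index, tol=1e-12):
    """Scan prefixes in non-decreasing index order; return the largest minimizing prefix."""
    order = sorted(w, key=lambda j: index[j])
    prefix, weight = [], 0.0
    best_len, best_ratio = 0, float("inf")
    for k, j in enumerate(order, start=1):
        prefix.append(j)
        weight += w[j]  # running total of weights, updated in constant time
        ratio = f(frozenset(prefix)) / weight  # one oracle call per prefix
        if ratio <= best_ratio + tol:  # "<=" keeps the largest minimizer on ties
            best_len = k
            best_ratio = min(best_ratio, ratio)
    return order[:best_len], best_ratio


# Hypothetical instance and index values.
def f_example(S):
    return min(len(S), 2)


w_example = {1: 1.0, 2: 1.0, 3: 4.0}
index_example = {3: 0.0, 1: 1.0, 2: 2.0}
prefix_star, ratio_star = best_prefix(f_example, w_example, index_example)
```

The scan makes one oracle call per prefix after a single sort. Rebuilding the frozenset at each step is a convenience of this sketch and adds work that an incremental oracle would avoid.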
In order to compute the optimal Player 1 strategy it is necessary to calculate an - decomposition , which involves at most minimizations of functions of the form . It is easy to check that if is -decreasing, then so is the function , where
It follows that an - decomposition can be found in time . (However, expressing as a convex combination of at most pure strategies takes additional computation in general.)
We conclude this section by showing that when the payoff is decreasing, the solution of the game is particularly simple.
If is decreasing then is non-increasing in and the value of the game is . The strategy is optimal for Player 1, where consists only of the set , and is optimal for Player 2.
Proof. Let be a proper subset of , and without loss of generality, assume that for some . Let and let be any permutation of that starts with . Since is decreasing, for any ,
Then setting , we obtain
This proves that is non-increasing in , so the value of the game is .
The optimality of the stated strategies is immediate from Theorem 4.
3 Other Variations of the Game
Now let be an arbitrary non-decreasing, supermodular function with . The contrapolymatroid associated with is defined by
The base of is given by
Consider the game where a mixed strategy for Player 1 (the maximizer) is some , a mixed strategy for Player 2 (the minimizer) is some and the payoff is . Let be the dual of , given by . It is easy to show that is submodular and non-decreasing with and . Moreover, is -increasing if and only if is -decreasing. Therefore, the game is equivalent to , and the solution follows immediately from Theorems 4 and 5. Versions of Lemmas 7 and 8 also hold.
So far, we have considered two equivalent zero-sum games where Player 1 is the maximizer and Player 2 is the minimizer. We now consider an alternative game with Player 1 as minimizer and Player 2 as maximizer. Let be a non-decreasing, supermodular function with , and let be the game which is the same as except that Player 1 is the minimizer and Player 2 is the maximizer. Similarly, for a non-decreasing, submodular function with , we define to be the same as except that Player 1 is the minimizer and Player 2 is the maximizer.
Although the games and do not seem to be equivalent to the games of the previous section, the solutions and analysis are almost identical. We briefly describe the solutions here and leave the proofs as an exercise.
Analogously to an - decomposition, we define a - max-decomposition as follows. Set and suppose have already been defined for some . Then if is equal to , set . If not, we define to be any set that maximizes . This time, the function is the ratio of a supermodular function and a modular function and can be maximized by using the procedure of Iwata et al. (1997) to minimize the inverse ratio. Then the Player 1 strategy is defined in precisely the same way as in the original version of the game.
Let be a non-decreasing submodular function with and let be a non-decreasing supermodular function with . Then the solutions to the games , , and are given in Table 1. The value and an optimal Player 1 strategy are indicated in the second and third columns of the table. In each case, the set of optimal Player 2 strategies is the convex hull of the set of where ranges over all possible values as given in the second column of the table. The fourth column gives a condition on the payoff for the set to have the form given in the fifth column. The sixth column gives a condition for to be equal to .
|Game||Value||for optimal Player 1||Condition on payoff||, if condition on payoff||Condition on payoff for|
4 Applications to Search Games
In this section we apply our results to a number of search games between a Searcher (Player 1) and a Hider (Player 2), where corresponds to a set of hiding locations. In each example, a Searcher pure strategy is a permutation of , where is the location that is in position in the order of search and a Hider pure strategy is a location at which a target is hidden.
4.1 A Weighted Search Game
Consider a game where the time to search location is given by and each location has a weight , corresponding to the rate of damage incurred at location while the target has not been found. The payoff is given by , for a permutation and , where
This payoff is the total time to find the Hider multiplied by the rate of damage. The Searcher is the minimizer and the Hider is the maximizer. This game was considered by Yolmeh and Baykal-Gürsoy (2021), who solved the special case when the search times are all equal to 1, using a polyhedral approach. (Yolmeh and Baykal-Gürsoy (2021) also applied a column and row generation approach to the game in a more general network setting, with multiple searchers and targets.)
Condon et al. (2009) studied the special case of this game for , which they called the game theoretic multiplicative regret game. This case was also studied by Angelopoulos et al. (2019). Implicit in the results of Condon et al. (2009) is an optimal Player 1 (Searcher) strategy and the value of the game for the general weighted search game with arbitrary .
Here we give an alternative solution using the framework presented in Sections 2 and 3. The searching of locations is analogous to the processing of jobs in single machine scheduling, and in the language of scheduling theory, we can interpret the time as the processing time of job and the time as the completion time of job under the schedule . We associate a Searcher pure strategy with a point given by . It is well known from scheduling theory (see Queyranne and Schulz (1994)) that the set of feasible vectors are the vertices of , where is the supermodular function given by
and . The polyhedron is known as the scheduling polyhedron and corresponds to the set of Searcher mixed strategies in the search game. Let . Then for a Hider pure strategy , the expected payoff against a Searcher mixed strategy given by is . Hence, this is the game and its solution follows from Theorem 9. The value of the game is
It is easy to see that the payoff is -increasing where . Hence, by Theorem 9, the value and optimal strategies can be found in time . To express the optimal Searcher strategy as a mixture of at most pure strategies, one can use the strongly polynomial time decomposition algorithm of Hoeksma et al. (2014).
We note that two different solutions of the special case when the rates of damage are all equal to 1 were given by Lidbetter (2013) and Alpern and Lidbetter (2013), though in each solution the size of the support of the optimal Searcher strategy was exponential in . Condon et al. (2009) also considered this special case, calling it the game theoretic total cost problem. They found an optimal Searcher strategy of support size . Theorem 9 implies an alternative polynomial time algorithm for finding an optimal Searcher strategy with support size . Furthermore, the payoff is increasing in this case, so Theorem 9 implies that the optimal Hider strategy given by Lidbetter (2013) and Alpern and Lidbetter (2013) is unique.
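The polyhedral fact used in this subsection can be checked numerically on a small instance. The sketch assumes the standard form of the scheduling polyhedron from Queyranne (1993): with x[j] equal to the search time t[j] multiplied by the completion time of j, every subset S satisfies x(S) ≥ (t(S)² + Σ_{j∈S} t[j]²)/2, with equality for the full set. Whether this matches the exact normalization used in the displayed formula here is an assumption, and the function names are hypothetical.

```python
from itertools import combinations, permutations


def completion_times(order, t):
    """Time at which each location has been fully searched under `order`."""
    elapsed, C = 0.0, {}
    for j in order:
        elapsed += t[j]
        C[j] = elapsed
    return C


def g(S, t):
    """Assumed supermodular function: (t(S)^2 + sum of t[j]^2 over S) / 2."""
    total = sum(t[j] for j in S)
    return 0.5 * (total ** 2 + sum(t[j] ** 2 for j in S))


def check_scheduling_polyhedron(t, tol=1e-9):
    """Check x(S) >= g(S) for all S, and x(N) = g(N), over every search order."""
    items = sorted(t)
    for order in permutations(items):
        C = completion_times(order, t)
        x = {j: t[j] * C[j] for j in items}
        for r in range(1, len(items) + 1):
            for S in combinations(items, r):
                if sum(x[j] for j in S) < g(S, t) - tol:
                    return False
        if abs(sum(x.values()) - g(items, t)) > tol:
            return False
    return True


t_example = {1: 1.0, 2: 2.0, 3: 3.0}
ok = check_scheduling_polyhedron(t_example)
```

The check enumerates all orders and subsets, so it is a sanity test for toy instances rather than a practical procedure.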
4.2 A Weighted Search Game with Variable Speeds
We can extend the model of the previous subsection by adopting the variable speed network model, as considered by Alpern and Lidbetter (2014). Suppose that we think of the set of locations as endpoints of arcs, whose other endpoint is a common point . The Searcher successively travels from to the end of each arc and back again, where the time to travel from to the end of arc is and the time to travel back again is . Let be the tour time of arc . Similarly to the previous subsection, the vector is defined as
and corresponds to the times the Searcher reaches each location under .
We consider a weighted search game with a minimizing Searcher and a maximizing Hider, whose payoff for a permutation and is given by . If for all , then and this is equivalent to the model of the previous subsection.
The special case when the rates of damage are all equal to 1 was solved by Alpern and Lidbetter (2014) in the more general setting of tree networks, but the optimal Searcher strategy given had exponential support size even in the case of no network structure. The case of arbitrary has not been considered before.
Let and let , so that the payoff for a Searcher strategy and a Hider strategy is . Note that we can write , where is defined as in the previous subsection and is given by . Therefore, the convex hull of the vectors is equal to , where is the non-decreasing supermodular function given by
Therefore, this is the game , and its solution follows from Theorem 9. The value of the game is