 # On a Competitive Secretary Problem with Deferred Selections

We study secretary problems in settings with multiple agents. In the standard secretary problem, a sequence of arbitrary awards arrive online, in a random order, and a single decision maker makes an immediate and irrevocable decision whether to accept each award upon its arrival. The requirement to make immediate decisions arises in many cases due to an implicit assumption regarding competition. Namely, if the decision maker does not take the offered award immediately, it will be taken by someone else. The novelty in this paper is in introducing a multi-agent model in which the competition is endogenous. In our model, multiple agents compete over the arriving awards, but the decisions need not be immediate; instead, agents may select previous awards as long as they are available (i.e., not taken by another agent). If an award is selected by multiple agents, ties are broken either randomly or according to a global ranking. This induces a multi-agent game in which the time of selection is not enforced by the rules of the games, rather it is an important component of the agent's strategy. We study the structure and performance of equilibria in this game. For random tie breaking, we characterize the equilibria of the game, and show that the expected social welfare in equilibrium is nearly optimal, despite competition among the agents. For ranked tie breaking, we give a full characterization of equilibria in the 3-agent game, and show that as the number of agents grows, the winning probability of every agent under non-immediate selections approaches her winning probability under immediate selections.


## 1 Introduction

In the classic secretary problem, a decision maker observes a sequence of non-negative real-valued awards, which are unknown in advance, in a random order. At each time step, the decision maker observes the current award and needs to make an immediate and irrevocable decision whether or not to accept it. If she accepts, the game terminates with the value of that award; otherwise, the award is gone forever and the game continues to the next round. The objective of the decision maker is to maximize the probability of choosing the maximal award. A tight competitive ratio of $1/e$ is well known for this problem.
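As a minimal sketch of where the $1/e$ figure comes from, the following Python snippet evaluates the classical cutoff rule exactly: skip the first `r` awards, then accept the first award that beats everything seen so far. The rule itself and the names `success_probability` and `best_cutoff` are illustrative assumptions, since the text above does not spell out the strategy.

```python
import math
from typing import Tuple

def success_probability(n: int, r: int) -> float:
    """Exact probability that the cutoff rule picks the maximal of n awards.

    Assumed rule (the standard one): reject the first r awards, then accept
    the first award that is better than all previous ones.
    """
    if r == 0:
        return 1.0 / n  # the first award is accepted; it is the maximum w.p. 1/n
    # The maximum sits at position t (prob. 1/n) and is reached iff the best of
    # the first t-1 awards lies among the first r (prob. r/(t-1)).
    return (r / n) * sum(1.0 / (t - 1) for t in range(r + 1, n + 1))

def best_cutoff(n: int) -> Tuple[int, float]:
    """Return the cutoff r that maximizes the success probability, and that probability."""
    return max(((r, success_probability(n, r)) for r in range(n)), key=lambda x: x[1])

r, p = best_cutoff(100)
print(r, round(p, 4), round(1 / math.e, 4))
```

For moderately large `n`, the best cutoff sits near `n / e` and the success probability is close to $1/e$.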

This problem (and variants thereof) is an abstraction that captures many real-life scenarios, such as an employer who interviews potential workers over time, renters looking for a potential house, a person looking for a potential partner for life, and so on. This problem has interesting implications for mechanism design, auctions and pricing, for both welfare and revenue maximization, in various markets such as online advertising markets (see, e.g., [2, 3, 4, 6, 13, 14]).

##### Competing Agents.

Most attention in the literature on secretary problems has been given to scenarios in which a single agent makes immediate and irrevocable decisions. The requirement to make an immediate decision arises in many cases from an implicit competition. Namely, if the decision maker does not take the current offered award, then it may be taken by someone else. For example, a potential employee who does not get a job offer following an interview will probably get a job in another firm. Indeed, competition among agents is a fundamental component in many real-life online scenarios.

Recent work has considered the competition aspect in secretary-type problems. For example, Karlin and Lei and Immorlica et al. considered settings with multiple decision makers who compete over awards that arrive online. In these studies, as in the standard setting, decisions are immediate and irrevocable.

In this paper, we introduce a multi-agent model in which the competition is endogenous. In particular, the different agents compete over the sequence of arriving awards, but unlike previous models, decisions need not be immediate. It is the endogenous competition that may drive agents to make fast selections, rather than the rules of the game. That is, the time to select an award is part of an agent’s strategy. In particular, every previously arriving award can be selected as long as it has not been taken by a different agent. Thus, in our model a sequence of non-negative real-valued awards arrives over time, unknown from the outset. At each time step, all agents observe the newly arrived award, and each needs to decide whether to select an available award or to pass.

One issue that arises in this model is how to resolve ties among agents; that is, who gets the award if several agents select it. We consider two natural tie-breaking rules: random tie-breaking (where ties are broken uniformly at random) and ranked tie-breaking (where agents are a priori ranked by some global order, and ties are broken in favor of higher-ranked agents). Random tie-breaking fits scenarios with symmetric agents, whereas ranked tie-breaking fits scenarios where some agents are preferred over others, according to some global preference order. For example, it is reasonable to assume that a higher-position/salary job is preferred over a lower-position/salary job, or that firms in some industry are globally ordered from most to least desired. Random and ranked tie-breaking rules were considered by Immorlica et al. and Karlin and Lei, respectively, in secretary settings with immediate and irrevocable decisions.

Two natural objectives have been considered in settings with competition. The first is to maximize the probability of receiving the maximal award (see, e.g., [10, 11]). The second is to outperform the competitors (see, e.g., the dueling framework studied by Immorlica et al.). We consider an extension of the latter objective, where an agent wishes to maximize the probability of winning the first place, then of winning the second place, and so on. Our goal is to study the structure and quality of equilibria in these settings.

### 1.1 Our Results and Techniques

#### 1.1.1 Random Tie-Breaking

For the random tie-breaking rule, we characterize the equilibria of the induced game, and show that the expected social welfare in equilibrium is nearly optimal, despite competition between the agents. This is cast in Theorems 3.1 and 3.2; a simplified statement follows:

###### Theorem.

(Theorems 3.1 and 3.2) In every $k$-agent game with random tie-breaking, there exists a simple time-threshold strategy that guarantees each agent a winning probability of $1/k$, regardless of the strategies of the other agents. The strategy profile where all agents play this strategy is a subgame perfect equilibrium (SPE). Moreover, the expected social welfare of this SPE is at least a $1 - 1/k$ fraction of the sum of the top $k$ awards.

In particular, we show that each of the agents can guarantee herself a winning probability of $1/k$ by following a simple time-threshold strategy that depends only on the current time, the number of active agents, and whether the maximal award so far is available. By symmetry, this is the maximal possible guarantee. This guarantee is then used to fully characterize the set of subgame perfect equilibria of the game.

We then establish that in equilibrium, the expected social welfare is at least a $1 - 1/k$ fraction of the sum of the top $k$ awards, which is the optimal welfare (i.e., we bound the price of competition). We do so by using the following two observations: First, we show that in equilibrium the expected number of selected awards among the top $k$ awards is high. Second, we observe that the probability of an award being selected in equilibrium is monotone in its rank among the awards.

We complement this result with a matching upper bound (up to a constant factor), which is derived by observing that in equilibrium there is a constant probability that the first selected award is not one of the top $k$ awards.

#### 1.1.2 Ranked Tie-Breaking

For the ranked tie-breaking rule, we show that for a sufficiently large number of agents, the winning probabilities under immediate- and non-immediate selections are roughly the same.

###### Theorem.

(Informal Theorem 4.2) Under the ranked tie-breaking rule, for every rank $i$, as the number of agents grows, the winning probability of the $i$-th ranked agent under non-immediate selections approaches her winning probability under immediate selections.

To prove this result, we use observations from [15, 11, 4] to show that in the immediate decision model, the probability that the maximal award is allocated goes to $1$ as the number of agents grows. Since an agent in the non-immediate decision model can always mimic the strategy of an agent in the immediate decision model, her winning probability (which equals her probability of receiving the maximal award) in the non-immediate model is at least her probability of receiving the maximal award in the immediate decisions game. We therefore deduce that the winning probabilities in the non-immediate model converge to those in the immediate model. This claim essentially formalizes the intuition that as competition grows, the urgency to select awards faster grows.

In addition, we fully characterize the equilibria of the three-agent game.

###### Theorem.

(Theorem 4.1) In every equilibrium of the 3-agent game, agent 1 wins with probability approximately $0.38$, while each of agents 2 and 3 wins with probability approximately $0.31$.

Notice that agent 1 (the highest-ranked agent) can always guarantee herself a winning probability of at least $1/e$ by acting according to the optimal strategy in the classical secretary problem. The last theorem shows that in a setting with 3 agents, the benefit that agent 1 derives from her ability to postpone decisions is quite small (approximately $0.38$ vs. $1/e \approx 0.37$). As implied by Theorem 4.2, this benefit shrinks as the number of agents grows.

### 1.2 Related Work

The classical secretary problem and variants thereof have attracted broad interest and have resulted in a vast amount of literature over the years. For a comprehensive survey, see, e.g., .

##### Competing Agents.

The closest papers to our work are the studies by Karlin and Lei and Immorlica et al., who study secretary settings with competing agents, with the ranked and random tie-breaking rules, respectively. The main difference between their models and ours is that they consider multi-agent settings where agents must make decisions immediately, while in our model the competition is endogenous; namely, past awards can be selected as long as they are available. Karlin and Lei show that under the ranked tie-breaking rule, the optimal strategy for each agent is a time-threshold strategy, which is given in the form of a recursive formula (albeit not in a closed form). Immorlica et al. characterize the Nash equilibria under the random tie-breaking rule. Another related work is the dueling framework by Immorlica et al., who considered, among other settings, a dueling scenario between two secretary algorithms, whose objective is to outperform the opponent algorithm.

##### Matroid- and Uniform Matroid Secretaries.

In this paper we derive insights from studies on secretary variants in which a decision maker can choose multiple awards, based on some feasibility constraints. Babaioff et al. introduced the matroid secretary problem, where a decision maker selects multiple awards under a matroid constraint. It has been shown that a constant competitive ratio can be achieved for some matroid structures, but the optimal competitive ratio for arbitrary matroids is still open. An interesting special case (which was also studied in earlier works) is one where the decision maker may choose up to $k$ awards (also known as a $k$-uniform matroid constraint). Gilbert and Mosteller, Sakaguchi, Matsui and Ano, and Ezra et al. studied secretary models in which a decision maker wishes to maximize the probability of getting the highest award, but may choose up to $k$ awards. We draw interesting connections between these models and the one studied in our paper.

##### Non-Immediate and Irrevocable Decisions.

Other relaxations of the requirement to select immediately have been considered in the literature. Ho and Krishnan  consider a sliding-window variant, where decisions may be delayed for a constant amount of time. A similar model is considered by Kesselheim et al. , where decisions may be delayed for a randomized (not known in advance) amount of time. Ezra et al.  study settings where the irrevocability assumption is relaxed. Specifically, they consider a setting where the decision maker can select up to elements immediately and irrevocably, but her performance is measured by the top elements in the selected set. This work is complementary to ours in the sense that it relaxes the irrevocability assumption, while our work relaxes the immediacy assumption.

### 1.3 Paper’s Structure

Our model is presented in Section 2. In Sections 3 and 4 we present our results with respect to the random tie-breaking rule, and the ranked tie-breaking rule, respectively. We conclude this paper in Section 5, where we discuss future directions.

## 2 Model

We consider a variant of the classical secretary setting, where a set of $n$ arbitrary awards is revealed online in a uniformly random order, one award per time step. Unlike the classical secretary problem that involves a single decision maker, in our setting there are $k$ agents who compete over the awards. Upon the revelation of each award, every agent who has not yet received an award may select one of the awards that have not been assigned yet. An award that is selected by a single agent is assigned to this agent. An award that is selected by more than one agent is assigned to one of these agents either randomly (hereafter, random tie-breaking), or according to a predefined ranking (hereafter, ranked tie-breaking). Agents who received awards are no longer active. Awards that were assigned are no longer available. Thus, at every time step, the set of available awards is the subset of revealed awards that have not been assigned yet. The game continues as long as there are active agents; i.e., after time $n$, if active agents remain, the agents compete (without newly arriving awards) on the remaining available awards until all agents are allocated.

Given an instance of a game, the history at a given time includes all the relevant information revealed up to that time; i.e., the awards revealed so far and the assignments made so far. (In our setting, additional information, such as the history of selections, in contrast to assignments, is irrelevant for future decision making.) A strategy of an agent is a function from the set of all possible histories to a selection decision (either selecting one of the available awards, or passing). A strategy profile specifies a strategy for every agent; we also write a profile as the strategy of one agent together with the strategy profile of all other agents. Every strategy profile induces a distribution over assignments of awards to agents. For ranked tie-breaking, the distribution is with respect to the random order of award arrival, and possibly the randomness in the agent strategies. For random tie-breaking, the randomness is also with respect to the randomness in the tie-breaking.

As is natural in competitive settings, every agent wishes to win the game; that is, to receive a higher award than her competitors. We say that an agent wins the $j$-th place in the game if she receives the $j$-th highest award among all allocated awards. The outcome of an agent is summarized by a $k$-dimensional vector whose $j$-th entry is the probability of winning the $j$-th place. Given two such vectors, the first is preferred over the second if it is lexicographically greater, and weakly preferred if it is lexicographically greater or equal. That is, every agent wishes to maximize her probability of winning the first place, upon equality to maximize the probability of winning the second place, and so on.
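The lexicographic preference over place-probability vectors can be sketched in a few lines of Python. The function names `prefers` and `weakly_prefers` and the tolerance `eps` are hypothetical, introduced only for illustration; they are not notation from the paper.

```python
from typing import Sequence

def prefers(p: Sequence[float], q: Sequence[float], eps: float = 1e-12) -> bool:
    """True if vector p is lexicographically greater than q.

    Entry j of each vector is the probability of winning the (j+1)-th place.
    Entries closer than eps are treated as equal (an assumption for floats).
    """
    for a, b in zip(p, q):
        if abs(a - b) > eps:
            return a > b  # the first differing place decides
    return False          # all entries equal: no strict preference

def weakly_prefers(p: Sequence[float], q: Sequence[float], eps: float = 1e-12) -> bool:
    """True if p is lexicographically greater than or equal to q."""
    return not prefers(q, p, eps)

# A higher first-place probability dominates any second-place advantage.
print(prefers([0.5, 0.1], [0.4, 0.6]))
```

The comparison stops at the first place on which the two vectors differ, which mirrors the "first place, then second place, and so on" ordering of the objective.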

A strategy profile induces a -dimensional probability vector for each agent , where is the probability that agent wins the place under strategy profile . Agent derives higher (respectively, weakly higher) utility from strategy profile than strategy profile , denoted (resp., ), if (resp., ). We use and interchangeably, and similarly for and .

Note that is the probability that agent wins the first place under strategy profile . We sometimes refer to it as agent ’s winning probability under .

##### Equilibrium notions.

We consider the following equilibrium notions.

• Nash Equilibrium: A strategy profile is a Nash equilibrium (NE) if for every agent and every strategy , it holds that .

• Subgame perfect equilibrium: A strategy profile is a subgame perfect equilibrium (SPE) if it is a NE for every subgame of the game. I.e. for every initial history , is a NE in the game induced by history .

SPE is a refinement of NE; namely, every SPE is a NE, but not vice versa.

## 3 Random Tie-Breaking

In this section, we study the setting of the random tie-breaking rule. We characterize the SPEs and give simple time-threshold strategies with optimal utility guarantees. We show that in the SPE where all agents play according to this optimal guarantee strategy, at least a $1 - 1/k$ fraction of the optimal social welfare is achieved in expectation.

Consider the following strategy for agent , where denotes the number of active agents at time (including agent ).

• If , then select the maximal available award.

• If , and the maximal award so far is available, then select it.

• If , , and the maximal award so far is available, then select it.

• Else, pass (i.e., select no award).
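The strategy above can be explored with a small simulation. This is a sketch under stated assumptions rather than the paper's exact definition: the threshold condition is taken to be $t/n \ge 1/k_t$ (as suggested by the proof sketch of Lemma 3.1), the tie case at equality is resolved by selecting, award values are simply $1, \dots, n$, and the names `play_once` and `win_frequencies` are hypothetical.

```python
import random
from typing import Dict, List

def play_once(n: int, k: int, rng: random.Random) -> Dict[int, int]:
    """Simulate one game with random tie-breaking; return {agent: award}. Needs n >= k."""
    values = list(range(1, n + 1))   # distinct award values (arbitrary choice)
    rng.shuffle(values)              # uniformly random arrival order
    available = set()                # revealed and not yet assigned awards
    active = list(range(k))          # agents that have no award yet
    assigned: Dict[int, int] = {}
    best_so_far = 0
    for t in range(1, n + 1):
        if not active:
            break
        award = values[t - 1]
        available.add(award)
        best_so_far = max(best_so_far, award)
        # Assumed threshold: with k_t active agents, select once t/n >= 1/k_t.
        if t < n and t * len(active) >= n and best_so_far in available:
            winner = rng.choice(active)  # all active agents select; tie broken at random
            active.remove(winner)
            available.remove(best_so_far)
            assigned[winner] = best_so_far
    # From time n on, all active agents select the maximal available award.
    while active:
        winner = rng.choice(active)
        top = max(available)
        active.remove(winner)
        available.remove(top)
        assigned[winner] = top
    return assigned

def win_frequencies(n: int, k: int, trials: int, seed: int = 0) -> List[float]:
    """Empirical first-place frequency of each agent over repeated games."""
    rng = random.Random(seed)
    wins = [0] * k
    for _ in range(trials):
        result = play_once(n, k, rng)
        wins[max(result, key=result.get)] += 1  # holder of the highest allocated award
    return [w / trials for w in wins]

print(win_frequencies(n=40, k=4, trials=4000))
```

When all agents follow the same rule, each empirical frequency should hover around $1/k$, which is what the symmetry argument predicts.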

We denote by the set of strategies in which agent plays according to up to the following cases:

• If , , and both the highest and the second highest awards so far are available, then the agent can either pass or select the highest award so far.

• If and the highest award so far is available, then the agent can either pass or select the highest available award.

We next show that strategies in are the only strategies that guarantee a utility of at least . By symmetry, there is no strategy such that for every . Thus, the strategy profiles where each agent plays according to a strategy in , are the only SPEs.

###### Theorem 3.1.

For every agent , and , it holds that . For every there exists such that . Moreover, the strategy profiles where each agent plays according to a strategy in , are the only SPEs.

Before proving Theorem 3.1, we show the following:

###### Observation 3.1.

For every time , selecting an element that is not the maximal so far cannot guarantee a winning probability of .

Thus, we can assume that agents do not select elements that are not the maximal so far up to time . We now give lower bounds on the probabilities of winning first and second places in a strategy given a time , and whether the maximal and second maximal awards so far are available. Let

be an ordered pair denoting a lower bound on the probabilities of agent

winning first and second places under strategy profile , conditioned on the event that at time (after observing the award , but before making selections in time ) agent is active, there are active agents (including agent ), and the maximal and second maximal awards up to time are available. Similarly, let be a lower bound on the probabilities of agent winning first and second places under strategy profile , conditioned on the event that at time , agent is active, there are active agents (including agent ), and the maximal award up to time is available, but the second maximal award is not available.

Let be a lower bound on the probabilities of agent winning first and second places under strategy profile , conditioned on the event that after the allocations of time , agent is active, there are active agents (including agent ), and the maximal award up to time is not available, but the second maximal award is available. Finally, let be a lower bound on the probabilities of agent winning first and second places under strategy profile , conditioned on the event that after the allocations of time , agent is active, there are active agents (including agent ), and none of the maximal and the second maximal awards up to time are available.

In Lemma 3.1 we lower bound the above terms. The full proof of the lemma is deferred to the appendix.

###### Lemma 3.1.

For every , and every , it holds that:

###### Proof sketch.

We observe that the winning probability of an agent depends on the time step $t$, the number of active agents $k$, and whether the maximal award so far is available or not. If at time $t$ an agent receives the maximal award up to time $t$, she wins with probability $t/n$ (which is the probability that this award is the global maximum). If another agent receives the maximal award up to time $t$, then by symmetry, each remaining active agent can guarantee a winning probability of $(1 - t/n)/(k-1)$. Thus, selecting the maximal award so far is better whenever $t/n > 1/k$, and passing is better whenever $t/n < 1/k$. In cases where $t/n = 1/k$, both passing and selecting the maximal award so far give a winning probability of $1/k$, and the agents break this tie based on the probability of winning the second place. For the four states of whether the maximal and second maximal awards so far are available, we establish lower bounds on the probabilities of winning first and second places, by induction on the time step and the number of active agents. ∎
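The comparison behind the threshold can be written out explicitly. The following is a sketch under assumed notation ($t$ for the current time, $n$ for the number of awards, $k \ge 2$ for the number of active agents), and it takes for granted that the remaining $k-1$ agents split the residual winning probability evenly, as the symmetry argument suggests.

```latex
\begin{align*}
\Pr[\text{the best of the first } t \text{ awards is the global maximum}] &= \frac{t}{n},\\
\text{guarantee of each remaining agent if another agent takes it} &= \frac{1 - t/n}{k-1},\\
\frac{t}{n} \ge \frac{1 - t/n}{k-1}
  \;\iff\; (k-1)\,t \ge n - t
  \;&\iff\; \frac{t}{n} \ge \frac{1}{k}.
\end{align*}
```

At equality both options yield exactly $1/k$, which is why the second-place probability is needed to break the tie.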

We are now ready to prove Theorem 3.1.

###### Proof of Theorem 3.1.

It follows from the proof of Lemma 3.1 that for every strategy that is not in , if each agent plays according to a strategy in , agent ’s utility is smaller than . It also shows that if agent plays according to a strategy in , and there exists an agent that plays according to a strategy not in , is greater than .

Thus, the only SPEs are profiles in which each agent plays according to a strategy in , and by symmetry, we get that the utility of agent is exactly , as desired. ∎

We next show that despite the competition, the social welfare of the SPE where all agents play according to is at least of the maximal possible social welfare. We use to denote the maximal value among . The optimal social welfare is .

###### Theorem 3.2.

The expected sum of the allocated awards in the SPE profile is at least .

###### Proof.

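Let $X$ be the random variable denoting the first time at which one of the top $k$ awards appears, let $A$ be the number of awards selected in the SPE profile before time $X$, and let $B$ be the number of awards arriving at times $t$ with $n/k < t < X$ that are the maximal so far upon their arrival. Notice that at most one award is selected up to time $n/k$, and any award that gets selected at time $t > n/k$ is the best award so far; thus, $A \le B + 1$. We start by bounding the expectation of $A$.

$$
\begin{aligned}
E[A] &= E[A \mid X > n/k]\cdot\Pr[X > n/k] + E[A \mid X \le n/k]\cdot\Pr[X \le n/k] \\
&\le E[B+1 \mid X > n/k]\cdot\Pr[X > n/k] + 0 && \text{(1)} \\
&= \Pr(X > n/k)\cdot 1 + \sum_{x=\lfloor n/k \rfloor+1}^{n} \Pr(X = x)\cdot\sum_{t=\lfloor n/k \rfloor+1}^{x-1}\frac{1}{t} && \text{(2)} \\
&= \Pr(X > n/k) + \sum_{t=\lfloor n/k \rfloor+1}^{n} \Pr(X > t)\cdot\frac{1}{t} \\
&\le \left(\frac{n - n/k}{n}\right)^{k} + \sum_{t=\lfloor n/k \rfloor+1}^{n} \frac{1}{t}\left(\frac{n-t}{n}\right)^{k} \\
&\le \frac{1}{e} + \sum_{t=\lfloor n/k \rfloor+1}^{n} \frac{e^{-tk/n}}{t} \\
&\le \frac{1}{e} + \frac{k}{n}\sum_{t=\lfloor n/k \rfloor+1}^{n} e^{-tk/n} \\
&\le \frac{2.7}{e} < 1, && \text{(3)}
\end{aligned}
$$

where Inequality (1) holds since if $X \le n/k$, then $A = 0$, and since $A \le B + 1$. Equality (2) is true since the award arriving at time $t$ is the maximal so far with probability $1/t$. Inequality (3) holds for every . We conclude that in expectation, less than one award among the top $k$ awards is not selected. Since the arrival order is uniform and the algorithm decisions depend only on the current ranks of awards and not on their actual values, the probability that the $i$-th highest award is chosen is monotonically decreasing in $i$. Monotonicity follows from the fact that given an order of awards such that a lower-valued award is chosen while a higher-valued award is not, if these two awards are switched, then the higher-valued award is chosen while the lower-valued one is not. The monotonicity, together with the fact that in expectation at least $k-1$ awards out of the $k$ highest awards are chosen, gives expected social welfare of at least a $1 - 1/k$ fraction of the sum of the top $k$ awards. ∎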
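The bound on $E[A]$ can be sanity-checked numerically. The sketch below evaluates the expression $\Pr(X > n/k) + \sum_{t} \Pr(X > t)/t$ exactly, using the fact that under a uniformly random order $\Pr(X > t) = \binom{n-t}{k} / \binom{n}{k}$. The function name `expected_misses_bound` is hypothetical, the sum starts at $\lfloor n/k \rfloor + 1$, and the check assumes $n \ge k$.

```python
from math import comb

def expected_misses_bound(n: int, k: int) -> float:
    """Evaluate Pr[X > floor(n/k)] + sum_{t > floor(n/k)} Pr[X > t] / t.

    X is the first arrival time of any of the top-k awards, so
    Pr[X > t] = C(n - t, k) / C(n, k) for a uniformly random order.
    Assumes n >= k, so that floor(n/k) >= 1 and C(n, k) > 0.
    """
    start = n // k
    total = comb(n, k)

    def pr_greater(t: int) -> float:
        # comb returns 0 when fewer than k positions remain after time t
        return comb(n - t, k) / total

    tail = sum(pr_greater(t) / t for t in range(start + 1, n + 1))
    return pr_greater(start) + tail

for n, k in [(100, 2), (100, 3), (1000, 10), (500, 50)]:
    print(n, k, round(expected_misses_bound(n, k), 4))
```

In these instances the value stays well below 1, in line with the conclusion that fewer than one of the top awards is missed in expectation.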

We complement this result by showing an instance of awards where the social welfare in every SPE is at most for every .

###### Example 1.

Suppose and for every such that . In every SPE, the first selection is made at time no later than . If none of the top awards appeared up to time , at least one of the agents gets an award of 0. The probability that none of appeared by time is approximately . Thus, the expected social welfare is at most .

## 4 Ranked Tie-Breaking

In this section, we study competition under the ranked tie-breaking rule.

We first claim that it is without loss of generality to assume that for every agent, the winning probability equals the probability of receiving the highest award. To show this, we observe that whenever exactly one agent is active, she may as well wait until the last time step and only then select the maximal award without harming her utility. Thus, it can be assumed that the maximal award is always allocated, and the winning agent receives it. Consequently, we may assume that the first-order objective of every agent is to maximize the probability of receiving the maximal award, as in the standard secretary problem and previous multi-agent extensions.

In Section 4.1 we present general observations regarding equilibria in this setting. We then characterize the equilibrium in the 3-agent game in Section 4.2. In Section 4.3 we show that as the number of competing agents goes to infinity, the agents’ probabilities of receiving the highest award (which equal the agents’ winning probabilities) converge to the corresponding probabilities in the immediate decisions model described by Karlin and Lei.

### 4.1 General Observations

We first make observations about the structure of the subgame perfect equilibria (SPE) of the game.

###### Proposition 4.1.

A strategy profile is an SPE if for every agent , is described by a set of time thresholds for every such that . At time , agent selects the highest award so far if it is available and , where the current number of active agents is and agent is ranked among them. (If , then the agent is indifferent between selecting and passing.) In addition, if agents are active and , then the lowest-ranked active agent makes a selection, even if the highest award so far is not available.

###### Proof.

Since the objective is lexicographic, agents wish to maximize the probability to win the first place, and only if this is hopeless, they will attempt to win lower places. This implies that agents make selections only if at least one of the following conditions holds:

1. The selected award has a non-zero probability of being the maximal allocated award.

2. Winning the first place has zero probability, independent of whether a selection is made.

In the first case, an agent can either make a selection of the highest award so far or pass. This decision depends only on the number of awards revealed so far and the number of active agents with higher ranks. For a given number of active agents, the winning probability is monotonically increasing in the number of revealed awards. Thus, there exists some such that at time , agent selects the highest award so far if it is available and , where is ranked -th among the active agents.

The second case can only occur if the number of active agents exceeds the number of remaining awards. We show by induction that if there are active agents and , the best strategy of the lowest-ranked active agent is to select the best available award. For the base of the induction, if a single agent remains at time , she clearly makes a selection. For the induction step, suppose . The lowest-ranked agent among the active agents knows (by the induction assumption) that at any future time step , a selection is going to be made by a higher-ranked agent. Thus, she clearly makes a selection. ∎

We proceed with several observations about the time thresholds in the SPE of the game.

Since any agent can always mimic the strategy of an agent ranked lower than her, in equilibrium a lower-ranked agent would be willing to receive any award that a higher-ranked agent would be willing to receive. In the threshold terminology, it means that:

###### Observation 4.1.

For any number of active agents , for every pair of ranks such that , without loss of generality it holds that .

The following observation gives bounds on the time threshold of the lowest-ranked active agent relative to the second-lowest ranked active agent.

###### Observation 4.2.

For any number of active agents , it holds that .

###### Proof.

By Observation 4.1 we have that . On the other hand, the lowest-ranked active agent never makes a selection before time , because she can only benefit from waiting as long as no other active agent makes a selection. The claim now follows since, by Observation 4.1, . ∎

Recall that the winning probability of agent under strategy profile is denoted by . Throughout this section, we make two simplifications in notation. First, we omit . Second, we omit the subscript 1, since we consider only the probability of winning the first place. Consequently, we denote the probability that agent $i$ wins the first place by $p_i$.

The following observation gives bounds on the winning probability of the lowest-ranked agent relative to the second-lowest agent.

###### Observation 4.3.

It holds that .

###### Proof.

By definition of , the -ranked agent is willing to select the maximal award among and is not willing to select the maximal award among . Thus,

$$\frac{T^k_i - 1}{n} \le p_i \le \frac{T^k_i}{n}. \tag{4}$$

When is strictly smaller than , agent can guarantee herself a winning probability of and the right inequality becomes equality. That is, . Thus, by Observation 4.2, either (i) , in which case , or (ii) , in which case and . The observation follows. ∎

### 4.2 The 3-Agent Game

In a 2-agent game, Observation 4.3 implies that $p_1 \approx p_2 \approx 1/2$. That is, both agents win with probability roughly a half. This symmetry breaks as more agents join the game, and the setting becomes interesting already in the case of 3 agents.

Notice that the highest-ranked agent can always guarantee herself a probability of at least $1/e$ of receiving the highest award by adopting the optimal strategy in the classical secretary problem. An interesting question is whether the opportunity to make non-immediate decisions increases this probability for the highest-ranked agent.

We show that in a game with 3 agents this advantage exists, but is very small. That is, the winning probability of the highest-ranked agent is nearly the same as in the immediate decision model. More accurately, we show that:

###### Theorem 4.1.

In a setting with 3 agents, in any SPE, agent 1 wins with probability approximately $0.38$, while each of agents 2 and 3 wins with probability approximately $0.31$.

###### Proof.

We start with the case where , and and later on show how to handle the cases where the equality is broken. Let . At time both agents select the maximal award so far and it is allocated to agent . Hence, agent 2’s winning probability is:

$$p_2 = \frac{\tau}{n}. \tag{5}$$

Agent wins in two cases. Case 1: The maximal award arrives at time for . Case 2: Agent does not select an award before time (this happens with probability ), agent selects an award at time (such selection is made if and only if is maximal so far, which happens with probability ) , and the maximal award arrives later than time (this happens with probability ). Hence:

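$$p_3 = \frac{n/2 - \tau}{n} + \frac{\tau}{n/2}\cdot\sum_{s=n/2+1}^{n}\frac{1}{s}\left(1 - \frac{s}{n}\right) \approx \frac{1}{2} - \frac{\tau}{n} + 2\left(\ln 2 - \frac{1}{2}\right)\frac{\tau}{n}. \tag{6}$$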

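Combining the fact that $p_2 \approx p_3$ (by Observation 4.3) with Equations (5) and (6) gives:

$$\frac{\tau}{n} \approx \frac{1}{2} - \frac{\tau}{n} + 2\left(\ln 2 - \frac{1}{2}\right)\frac{\tau}{n}.$$

Solving for $\tau$, we get $\tau/n \approx 1/(6 - 4\ln 2) \approx 0.31$. The assertion of the theorem follows.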

In the proof we assumed that . By Observation 4.2, the equality is sometimes broken and we have that . In this case, agent makes the first selection and the "names" of agents and are switched in the remainder of the proof. That is, Equations (5) and (6) denote the winning probabilities of agent and respectively. The same comment applies with respect to the two active agents at time . ∎
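The numbers in Theorem 4.1 can be reproduced from Equations (5) and (6). The sketch below solves $x = \tfrac{1}{2} - x + 2Sx$ for $x = \tau/n$, where $S$ is either the limit $\ln 2 - \tfrac{1}{2}$ or the finite sum from Equation (6). The function name `three_agent_probabilities` is hypothetical, and the first agent's probability is obtained as $1 - 2x$ under the assumption (made at the start of Section 4) that the maximal award is always allocated.

```python
import math
from typing import Optional, Tuple

def three_agent_probabilities(n: Optional[int] = None) -> Tuple[float, float, float]:
    """Return approximate winning probabilities (p1, p2, p3) in the 3-agent game.

    If n is None, use the limit S = ln 2 - 1/2; otherwise use the finite sum
    S = sum_{s = n/2 + 1}^{n} (1/s) * (1 - s/n) from Equation (6).
    """
    if n is None:
        s_val = math.log(2) - 0.5
    else:
        half = n // 2
        s_val = sum((1 / s) * (1 - s / n) for s in range(half + 1, n + 1))
    # x = 1/2 - x + 2*S*x  =>  x * (2 - 2*S) = 1/2  =>  x = 1 / (4 * (1 - S))
    x = 1 / (4 * (1 - s_val))
    return 1 - 2 * x, x, x

p1, p2, p3 = three_agent_probabilities()
print(round(p1, 4), round(p2, 4), round(1 / math.e, 4))
```

The closed form is $x = 1/(6 - 4\ln 2)$, so the highest-ranked agent's advantage over the classical $1/e$ guarantee is only about one percentage point.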

### 4.3 Immediate vs. Non-Immediate Selection Models

In this section, we compare the immediate and non-immediate models for games with a large number of agents. Let $p_{i,k}$ denote the probability that agent $i$ wins in a $k$-agent game with non-immediate selections, and let $q_i$ denote the probability that agent $i$ receives the highest award in a game with immediate selections. We note that under immediate selections, $q_i$ is independent of the number of agents in the game.

The main result here is that the agents’ winning probabilities in equilibrium under non-immediate selections approach their winning probabilities under immediate selections as the number of agents grows.

###### Theorem 4.2.

For every $i$,

$$\lim_{k\to\infty} p_{i,k} = q_i.$$

Before we prove our main theorem, we restate a result by Karlin and Lei  regarding the immediate decision model.

###### Theorem 4.3 (Karlin and Lei ).

For every and every , there is a unique (independent of k) such that agent plays a -threshold strategy in SPE; namely, wait until time , then make a selection whenever a best-so-far award appears. It holds that , and , for all .
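To make the notion of a threshold strategy concrete, here is a minimal simulation sketch of the rule "wait until time $\tau$, then select the first best-so-far award" for a single decision maker in the classic secretary problem. It does not use the thresholds of Theorem 4.3; the threshold $n/e$ is the classical single-agent choice and is used purely for illustration, and the names `threshold_wins` and `estimate_win_probability` are hypothetical.

```python
import math
import random

def threshold_wins(values, tau):
    """Return True if the tau-threshold rule picks the overall maximum.

    The rule skips the first `tau` awards, then accepts the first award
    that is better than everything seen so far.
    """
    best_seen = max(values[:tau], default=float("-inf"))
    for v in values[tau:]:
        if v > best_seen:
            # First best-so-far award after time tau: it is selected.
            return v == max(values)
    return False  # no best-so-far award appeared after time tau

def estimate_win_probability(n, tau, trials, seed=0):
    """Monte Carlo estimate of the win probability under a random arrival order."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        values = list(range(n))
        rng.shuffle(values)
        wins += threshold_wins(values, tau)
    return wins / trials

n = 50
tau = round(n / math.e)  # classical single-agent threshold (illustrative assumption)
print(estimate_win_probability(n, tau, trials=20_000))
```

With this illustrative threshold the estimate should land near the classical $1/e$ guarantee; in the multi-agent game each agent would use her own threshold instead.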

We are now ready to present the proof of Theorem 4.2.

###### Proof.

Matsui and Ano showed an interesting connection between a $k$-agent game with immediate selections and a scenario where a single decision maker is allowed to select $k$ (out of $n$) awards, and wishes to maximize the probability of getting the maximal award. Denote this last probability by $\tau_k$. Specifically, they show that:

$$\sum_{i=1}^{k} q_i = \tau_k. \tag{7}$$

It is also known that (see, e.g.,  and )

$$\lim_{k\to\infty} \tau_k = 1. \tag{8}$$

By combining Equations (7) and (8), it follows that:

$$\lim_{k\to\infty} \sum_{i\le k} q_i = 1. \tag{9}$$

A straightforward corollary of Theorem 4.3 is that every agent $i$ has a strategy that guarantees her a winning probability of at least $q_i$ under non-immediate selections, independent of the strategies played by other agents. To see this, observe that the -threshold strategy gives this guarantee, by the monotonicity of . Formally, for every number of agents and every ,

$$p_{i,k} \ge q_i. \tag{10}$$

Combining all of the above we get that:

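$$\begin{aligned}
\lim_{k\to\infty} p_{i,k} &= \lim_{k\to\infty}\Bigl(1-\sum_{j\in[k]\setminus\{i\}} p_{j,k}\Bigr)\\
&\le \lim_{k\to\infty}\Bigl(1-\sum_{j\in[k]\setminus\{i\}} q_j\Bigr)\\
&= \lim_{k\to\infty}\Bigl(1+q_i-\sum_{j\in[k]} q_j\Bigr)\\
&= q_i.
\end{aligned} \tag{11}$$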

The assertion of the theorem follows by Equations (10) and (11). ∎
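The last step of the proof can be read as a sandwich argument. The display below is a sketch of that reading, assuming that the chain in (11) uses (10) for its inequality and (9) for its final equality; $\liminf$ and $\limsup$ are introduced here only to avoid presupposing that the limit exists.

$$q_i \;\le\; \liminf_{k\to\infty} p_{i,k} \;\le\; \limsup_{k\to\infty} p_{i,k} \;\le\; \limsup_{k\to\infty}\Bigl(1+q_i-\sum_{j\in[k]} q_j\Bigr) \;=\; q_i.$$

The leftmost inequality is (10), and the rightmost equality is (9). Both outer terms equal $q_i$, so the limit exists and equals $q_i$.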

## 5 Discussion and Future Directions

In this work we study secretary settings with competing decision makers. While in previous secretary settings, including ones where competition among multiple agents is considered, decisions must be made immediately, we introduce a model where the time of selection is part of the agent’s strategy, and thus the competition is endogenous. In particular, decisions need not be immediate, and agents may select previous awards as long as they are available. We believe that this model captures many real-world scenarios, where agents compete over “awards” that may remain available until taken by a competitor.

This work suggests open problems and directions for future research. For the ranked tie-breaking rule, we fully characterize the equilibria of a 3-agent game, and derive the corresponding utilities of the agents. Extending this characterization to any number of agents is an interesting open problem.

Below we list some future directions that we find particularly natural.

• Study competition in additional problems related to optimal stopping theory, such as prophet and Pandora’s box settings.

• Study competition in secretary settings under additional tie-breaking rules, such as random tie breaking with a non-uniform distribution, and tie-breaking rules that allow awards to be split among agents.

• Study competition in secretary settings under additional feasibility constraints. For example, scenarios where agents can choose up to awards, or other matroid constraints.

• Extend the current study to additional objective functions.

## References

• Babaioff et al. (2008) Moshe Babaioff, Nicole Immorlica, David Kempe, and Robert Kleinberg. 2008. Online auctions and generalized secretary problems. ACM SIGecom Exchanges 7, 2 (2008), 1–11.
• Babaioff et al. (2007) Moshe Babaioff, Nicole Immorlica, and Robert Kleinberg. 2007. Matroids, secretary problems, and online mechanisms. In Proceedings of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms. 434–443.
• Ezra et al. (2018) Tomer Ezra, Michal Feldman, and Ilan Nehama. 2018. Prophets and Secretaries with Overbooking. In Proceedings of the 2018 ACM Conference on Economics and Computation. 319–320.
• Ferguson (1989) Thomas S. Ferguson. 1989. Who solved the secretary problem? Statistical Science 4, 3 (1989).
• Freeman (1983) P. R. Freeman. 1983. The Secretary Problem and Its Extensions: A Review. International Statistical Review / Revue Internationale de Statistique 51, 2 (1983), 189–206.
• Gilbert and Mosteller (1966) John P. Gilbert and Frederick Mosteller. 1966. Recognizing the Maximum of a Sequence. J. Amer. Statist. Assoc. 61, 313 (1966), 35–73.
• Ho and Krishnan (2015) Shan-Yuan Ho and Abijith Krishnan. 2015. A Secretary Problem with a Sliding Window for Recalling Applicants. arXiv preprint arXiv:1508.07931 (2015).
• Immorlica et al. (2011) Nicole Immorlica, Adam Tauman Kalai, Brendan Lucier, Ankur Moitra, Andrew Postlewaite, and Moshe Tennenholtz. 2011. Dueling algorithms. In Proceedings of the Forty-Third Annual ACM Symposium on Theory of Computing. ACM, 215–224.
• Immorlica et al. (2006) Nicole Immorlica, Robert Kleinberg, and Mohammad Mahdian. 2006. Secretary problems with competing employers. In International Workshop on Internet and Network Economics. Springer, 389–400.
• Karlin and Lei (2015) Anna Karlin and Eric Lei. 2015. On a competitive secretary problem. In Twenty-Ninth AAAI Conference on Artificial Intelligence.
• Kesselheim et al. (2019) Thomas Kesselheim, Alexandros Psomas, and Shai Vardi. 2019. How to Hire Secretaries with Stochastic Departures. In Web and Internet Economics - 15th International Conference, WINE 2019, Vol. 11920. 343.
• Kesselheim et al. (2013) Thomas Kesselheim, Klaus Radke, Andreas Tönnis, and Berthold Vöcking. 2013. An optimal online algorithm for weighted bipartite matching and extensions to combinatorial auctions. In European Symposium on Algorithms. Springer, 589–600.
• Kleinberg (2005) Robert D. Kleinberg. 2005. A multiple-choice secretary algorithm with applications to online auctions. In SODA, Vol. 5. 630–631.
• Matsui and Ano (2016) Tomomi Matsui and Katsunori Ano. 2016. Lower bounds for Bruss’ odds problem with multiple stoppings. Mathematics of Operations Research 41, 2 (2016), 700–714.
• Sakaguchi (1978) M. Sakaguchi. 1978. Dowry problems and OLA policies. Rep. Stat. Appl. Res., JUSE 25 (1978), 124–128.

## Appendix A Proof of Lemma 3.1

We prove the lemma by induction on and (, and ). For the base of the induction (either or ), observe that:

• for all . Indeed, if at some point, agent is the only active agent, and the highest award so far is still available, then agent selects the maximal award. Thus, the winning probability is .

• for all . Indeed, at time according to strategy , agent selects the maximal award, and therefore wins with probability at least . If he loses, then he selects the maximal award and consequently wins second place with probability at least .

• for all . Indeed, at time according to strategy , agent selects the maximal award, and therefore wins with probability at least .

• for all . Indeed, if the highest award (among all the awards) has been already allocated, then the agent selects the maximal available award, which is the second highest, and if he gets it, then he wins second place. The probability of winning second place is at least .

• for every . According to , agent waits until the maximal award so far is available, and if such an award does not arrive, then he selects the maximal award available at time . If the highest award appears between time and , then agent wins first place; else, he wins second place.

• for all . This holds trivially.

• for every . According to , agent waits until the maximal award so far is available, and if such an award does not arrive, then he selects the maximal award available at time . If the highest award appears between time and , then the agent wins first place (which happens with probability ). Else, if the second maximal award appears at this time (which happens with probability ), he wins second place.

Notice that the case of has different bounds than for general . We now show the step of the induction. Let , and . The following hold:

1. For the case where and , according to , agent competes over the maximal award so far. Assuming that agents (including ) compete over this award, we get that:

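$$\begin{aligned}
A^{\ell_t}_{t} &\ge \min_{1\le \ell'\le \ell_t}\left\{\frac{1}{\ell'}\cdot\left(\frac{t}{n},\frac{n-t}{n}\right)+\frac{\ell'-1}{\ell'}\,C^{\ell_t-1}_{t}\right\}\\
&\ge \min_{1\le \ell'\le \ell_t}\left\{\frac{1}{\ell'}\cdot\left(\frac{t}{n},\frac{n-t}{n}\right)+\frac{\ell'-1}{\ell'}\left(\frac{n-t}{n(\ell_t-1)},\frac{t}{n}\right)\right\}\\
&= \frac{1}{2}\cdot\left(\frac{t}{n},\frac{n-t}{n}\right)+\frac{1}{2}\left(\frac{n-t}{n},\frac{t}{n}\right)\\
&= \left(\frac{1}{2},\frac{1}{2}\right),
\end{aligned}$$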

where the first inequality holds since the probability of receiving the award under competition is , and the probability of this award to be the maximal is , and to be the second maximal is . The second inequality is by the induction hypothesis. In this case, if and , then both values of give the same lower bound.

Similarly,

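$$\begin{aligned}
B^{\ell_t}_{t} &\ge \min_{1\le \ell'\le \ell_t}\left\{\frac{1}{\ell'}\cdot\left(\frac{t}{n},\frac{n-t}{n}\right)+\frac{\ell'-1}{\ell'}\,D^{\ell_t-1}_{t}\right\}\\
&\ge \min_{1\le \ell'\le \ell_t}\left\{\frac{1}{\ell'}\cdot\left(\frac{t}{n},\frac{n-t}{n}\right)+\frac{\ell'-1}{\ell'}\left(\frac{n-t}{n(\ell_t-1)},\frac{t(n-t)}{n(n-1)}\right)\right\}\\
&= \frac{1}{2}\cdot\left(\frac{t}{n},\frac{n-t}{n}\right)+\frac{1}{2}\left(\frac{n-t}{n},\frac{t(n-t)}{n(n-1)}\right)\\
&= \left(\frac{1}{2},\frac{(n+t-1)(n-t)}{2n(n-1)}\right),
\end{aligned}$$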

where the first inequality holds since the probability of receiving the award under competition is , and the probability of this award to be the maximal is , and to be the second maximal is . The second inequality is by the induction hypothesis. In this case if and , the unique minimum is reached at .

2. For the case where and , according to , agent does not compete over any of the available awards. Thus, either no other agent selects an award and the winning probability of agent is at least , or the number of active agents decreases by one. Thus,

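$$\begin{aligned}
A^{\ell_t}_{t} &\ge \min\left(C^{\ell_t-1}_{t},\,A^{\ell_t}_{t+1}\right)\\
&\ge \min\left(\left(\frac{n-t}{n(\ell_t-1)},\frac{t}{n(\ell_t-1)}\right),\left(\frac{1}{\ell_t},\frac{1}{\ell_t}\right)\right)\\
&= \left(\frac{1}{\ell_t},\frac{1}{\ell_t}\right),
\end{aligned}$$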

where the first inequality holds since the first term lower bounds the case where there exists another agent that selects the maximal available award, and the second term lower bounds the case that no other agent selects the maximal available award. The second inequality is by the induction hypothesis.

Similarly,

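$$\begin{aligned}
B^{\ell_t}_{t} &\ge \min\left(D^{\ell_t-1}_{t},\,\frac{t-1}{t+1}B^{\ell_t}_{t+1}+\frac{2}{t+1}A^{\ell_t}_{t+1}\right)\\
&\ge \min\left(\left(\frac{n-t}{n(\ell_t-1)},\frac{t(n-t)}{n(n-1)(\ell_t-1)}\right),\,\frac{t-1}{t+1}\left(\frac{1}{\ell_t},\frac{(n-t-1)(n+t)}{n(n-1)\ell_t}\right)+\frac{2}{t+1}\left(\frac{1}{\ell_t},\frac{1}{\ell_t}\right)\right)\\
&= \left(\frac{1}{\ell_t},\frac{(n-t)(n+t-1)}{n(n-1)\ell_t}\right),
\end{aligned}$$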

where the first inequality holds since the first term lower bounds the case where there exists another agent that selects the maximal available award, and the second term lower bounds the case that no other agent selects the maximal available award, and thus at time , both the maximal and second maximal awards will be available with probability . The second inequality is by the induction hypothesis.

3. For the case where and , according to , agent competes over the maximal award so far. Assuming that agents (including ) compete over this award, we get that:

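$$\begin{aligned}
A^{\ell_t}_{t} &\ge \min_{1\le \ell'\le \ell_t}\left\{\frac{1}{\ell'}\cdot\left(\frac{t}{n},\frac{t(n-t)}{n(n-1)}\right)+\frac{\ell'-1}{\ell'}\,C^{\ell_t-1}_{t}\right\}\\
&\ge \min_{1\le \ell'\le \ell_t}\left\{\frac{1}{\ell'}\cdot\left(\frac{t}{n},\frac{t(n-t)}{n(n-1)}\right)+\frac{\ell'-1}{\ell'}\left(\frac{n-t}{n(\ell_t-1)},\frac{n^2+t^2-tn-n}{n(n-1)(\ell_t-1)}\right)\right\}\\
&= \frac{1}{\ell_t}\cdot\left(\frac{t}{n},\frac{t(n-t)}{n(n-1)}\right)+\frac{\ell_t-1}{\ell_t}\left(\frac{n-t}{n(\ell_t-1)},\frac{n^2+t^2-tn-n}{n(n-1)(\ell_t-1)}\right)\\
&= \left(\frac{1}{\ell_t},\frac{1}{\ell_t}\right),
\end{aligned}$$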

where the first inequality holds since the probability of receiving the award under competition is , and the probability of this award to be the maximal is , and to be the second maximal is . The second inequality is by the induction hypothesis.

Similarly,

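$$\begin{aligned}
B^{\ell_t}_{t} &\ge \min_{1\le \ell'\le \ell_t}\left\{\frac{1}{\ell'}\cdot\left(\frac{t}{n},\frac{t(n-t)}{n(n-1)}\right)+\frac{\ell'-1}{\ell'}\,D^{\ell_t-1}_{t}\right\}\\
&\ge \min_{1\le \ell'\le \ell_t}\left\{\frac{1}{\ell'}\cdot\left(\frac{t}{n},\frac{t(n-t)}{n(n-1)}\right)+\frac{\ell'-1}{\ell'}\left(\frac{n-t}{n(\ell_t-1)},\frac{n-t}{n(\ell_t-1)}\right)\right\}\\
&= \frac{1}{\ell_t}\cdot\left(\frac{t}{n},\frac{t(n-t)}{n(n-1)}\right)+\frac{\ell_t-1}{\ell_t}\left(\frac{n-t}{n(\ell_t-1)},\frac{n-t}{n(\ell_t-1)}\right)\\
&= \left(\frac{1}{\ell_t},\frac{(n-t)(n+t-1)}{n(n-1)\ell_t}\right),
\end{aligned}$$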

where the first inequality holds since the probability of receiving the award under competition is , and the probability of this award to be the maximal is , and to be the second maximal is . The second inequality is by the induction hypothesis.

4. For the case where and , according to , agent does not compete over any of the available awards. Thus, either no other agent selects an award and the winning probability of agent is at least , or the number of active agents decreases by one. Thus,

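$$\begin{aligned}
A^{\ell_t}_{t} &\ge \min\left(C^{\ell_t-1}_{t},\,A^{\ell_t}_{t+1}\right)\\
&\ge \min\left(\left(\frac{n-t}{n(\ell_t-1)},\frac{n^2+t^2-tn-n}{n(n-1)(\ell_t-1)}\right),\left(\frac{1}{\ell_t},\frac{1}{\ell_t}\right)\right)\\
&= \left(\frac{1}{\ell_t},\frac{1}{\ell_t}\right),
\end{aligned}$$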

where the first inequality holds since the first term lower bounds the case where there exists another agent that selects the maximal available award, and the second term lower bounds the case that no other agent selects the maximal available award. The second inequality is by the induction hypothesis.

Similarly,

 Bℓtt ≥ min(Dℓt−1t,t−1t+1Bℓtt+1+2t+1Aℓtt+1) ≥ min((n−tn(ℓt−1),n−tn(ℓt−1)),<