A Note on the Expected Number of Interviews When Talent is Uniformly Distributed

10/26/2018
by   Simon Demers, et al.
Microsoft

Optimal stopping problems give rise to random distributions describing how many applicants the decision-maker will observe or interview before choosing one, a quantity sometimes referred to as the optimal stopping time. Despite the fact that it has important practical implications, this quantity is not widely studied. This short research note considers how many interviews are expected to be conducted when a decision-maker has to choose a candidate from a pool of sequential applicants with uniformly distributed talent and no recall, in the vein of previously studied Cayley-Moser and Secretary Problems. In terms of theoretical contribution, we derive algebraically the expected number of interviews when the decision-maker can only assess candidates using a rank-based indicator. In terms of empirical contribution, we show how the expected number of interviews relates to the size of the applicant pool when payoff values are observable. Finally, we present a new conjecture around the median number of interviews that will be conducted in the full-information setting.


1 Introduction

Although optimal stopping problems have been studied and refined extensively over time (Gilbert and Mosteller, 1966; Ferguson, 1989), little is currently known about the statistical properties of the random distributions they give rise to. For example, it is not necessarily obvious how the expected number of observations (“applicants”) to be considered (“interviewed”) before one is ultimately chosen (“hired”) will vary when the decision-maker (“employer”) can observe the actual payoff values or when the sample size (“applicant pool”) grows. This is unfortunate because it might be useful in some practical settings to anticipate, for example, how many interviews the decision-maker should plausibly schedule or prepare for. This is what this short note attempts to elucidate.

The focus is on the problem of choosing a candidate from a pool of applicants with uniformly distributed talent. We first provide some background information about the problem and summarize some key earlier results (Section 2). In terms of theoretical contribution, we derive algebraically the expected number of interviewed candidates within the Bearden (2006) setting where the decision-maker can only assess candidates using a rank-based indicator (Section 3). In terms of empirical contribution, we show how the expected number of interviewed candidates relates to the applicant pool size within the Moser (1956) setting where payoff values are observable (Section 4). This allows us to show that both settings result in the same expected number of interviews when there are 63 or 64 applicants under consideration. Finally, we present a new conjecture around the median number of interviews that will be conducted in the Moser (1956) setting (Section 5).

2 Background and Known Results

Consider the problem of a decision-maker (“employer”) who is looking for the best possible candidate (“hire”) out of a sequence of applicants sampled from a uniform distribution but cannot recall previously considered applicants who have been passed on. Assume that the decision-maker’s payoff value is determined by the selected candidate’s attractiveness, quality, or intrinsic value (interchangeably).

Moser (1956) studied the case where the decision-maker can observe each sequential applicant’s attractiveness (“payoff value”). Let $X_i$ be the payoff associated with the $i$th applicant, a value that is observable by the decision-maker. Assume that the $n$ observations are independent and identically distributed, drawn from a known uniform distribution scaled on the interval $[0, 1]$. As a reminder, when there are $m$ applicants left to be observed (“interviewed”), the optimal stopping rule consists in stopping and choosing the $i$th applicant whenever $X_i > E_m$, where $E_m$ is defined recursively (inductively) with $E_0 = 0$, and $E_{m+1} = \frac{1}{2}\left(1 + E_m^2\right)$ (Moser, 1956). Asymptotically, when the pool of applicants left to be observed is large enough, the minimum threshold can be approximated as: $E_m \approx 1 - \frac{2}{m + \ln(m+1) + 1.767}$ (Gilbert and Mosteller, 1966). Interestingly, Mazalov and Peshkov (2004) previously proved that the expected number of interviews in the Moser (1956) setting converges asymptotically to $n/3$.
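The threshold recursion lends itself to a short numerical check. The sketch below is illustrative only: it assumes $E_0 = 0$ and $E_{m+1} = (1 + E_m^2)/2$, and it compares the exact thresholds with the asymptotic form $1 - 2/(m + \ln(m+1) + 1.767)$ usually attributed to Gilbert and Mosteller (1966). The function names are hypothetical.

```python
import math


def moser_thresholds(m_max):
    """Return [E_0, ..., E_m_max] with E_0 = 0 and E_{m+1} = (1 + E_m**2) / 2."""
    thresholds = [0.0]
    for _ in range(m_max):
        thresholds.append((1.0 + thresholds[-1] ** 2) / 2.0)
    return thresholds


def gilbert_mosteller_approx(m):
    """Assumed asymptotic form of E_m; only meant for large m."""
    return 1.0 - 2.0 / (m + math.log(m + 1) + 1.767)


if __name__ == "__main__":
    exact = moser_thresholds(1000)
    for m in (1, 10, 100, 1000):
        # Print the exact threshold next to its approximation.
        print(m, exact[m], gilbert_mosteller_approx(m))
```

The approximation is loose for very small $m$ (for $m = 1$ the exact threshold is 0.5) and tightens as $m$ grows.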

Fifty years later, Bearden (2006) considered a similar problem but assumed instead that the decision-maker could only observe an indicator revealing the relative rank of each candidate, in the true tradition of the Secretary Problem (Ferguson, 1989). As shown by Bearden (2006), the optimal strategy for the decision-maker who can only observe the relative attractiveness of each subsequent applicant but cannot observe the actual payoff values is to reject the first $\sqrt{n} - 1$ applicants (rounded to the nearest integer) and select the next candidate identified as the relatively best so far (or the $n$th applicant if none turns out to be relatively better than the $\sqrt{n} - 1$ applicants observed initially).

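The reasoning behind the result obtained by Bearden (2006) is as follows. The decision-maker only observes an indicator $I_i$, where $I_i = 1$ if and only if the $i$th candidate is the most attractive (best) so far and $I_i = 0$ otherwise. Let $N$ be the number of candidates interviewed before one is hired. Given a pool of $n$ applicants and an arbitrary cutoff of $c$ candidates with $c < i < n$, the $i$th candidate is selected when the best of the first $i - 1$ candidates is among the first $c$ (probability $\frac{c}{i-1}$) and the $i$th candidate is the best of the first $i$ (probability $\frac{1}{i}$). The probability that the $i$th candidate will be the relatively best candidate, and hence the one selected, is therefore:

$$P(N = i) = \frac{c}{i(i-1)} \tag{1}$$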

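Given that the $i$th candidate is identified as the relatively best, its expected payoff value is $\frac{i}{i+1}$. If the decision-maker is instead compelled to select the last ($n$th) candidate by default, which happens with probability $\frac{c}{n-1}$, the expected value is simply the unconditional mean: $\frac{1}{2}$. Combining these arguments, Bearden (2006) showed that the expected value for the decision-maker, written here as $V(c)$, is:

$$V(c) = \sum_{i=c+1}^{n-1} \frac{c}{i(i-1)} \cdot \frac{i}{i+1} + \frac{c}{n-1} \cdot \frac{1}{2} = 1 - \frac{1}{2(c+1)} - \frac{c}{2n} \tag{2}$$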

It is a matter of algebra to show that this expected value is maximized by setting $c = \sqrt{n} - 1$. That is, the optimal strategy for the decision-maker is to reject the first $\sqrt{n} - 1$ candidates (rounded to the nearest integer) and select the next candidate identified as the relatively best so far. Under this optimal selection strategy, the decision-maker’s expected value is $1 - \frac{1}{\sqrt{n}} + \frac{1}{2n}$.
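The maximization can be checked by brute force. The sketch below is a minimal illustration, assuming that the cutoff $c$ counts the applicants rejected outright, that candidate $i$ (with $c < i < n$) is selected with probability $c/(i(i-1))$ and is then worth $i/(i+1)$ on average, and that the last applicant is taken by default with probability $c/(n-1)$ and is worth $1/2$ on average. Function names are hypothetical, and exact rational arithmetic is used so that ties and rounding cannot affect the comparison.

```python
from fractions import Fraction


def bearden_expected_payoff(n, c):
    """Expected payoff when the first c of n applicants are rejected outright."""
    total = Fraction(0)
    for i in range(c + 1, n):
        p_select = Fraction(c, i * (i - 1))        # i-th is first relatively best after cutoff
        total += p_select * Fraction(i, i + 1)     # mean of the best of i uniform draws
    total += Fraction(c, n - 1) * Fraction(1, 2)   # last applicant accepted by default
    return total


def best_cutoff(n):
    """Cutoff in 1..n-1 that maximizes the expected payoff."""
    return max(range(1, n), key=lambda c: bearden_expected_payoff(n, c))


if __name__ == "__main__":
    for n in (9, 25, 100):
        c = best_cutoff(n)
        print(n, c, float(bearden_expected_payoff(n, c)))
```

For perfect squares such as $n = 9$ or $n = 100$, the maximizing cutoff found this way coincides with $\sqrt{n} - 1$.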

3 New Theoretical Results

Importantly, the number of candidates expected to be observed (“interviewed”) by the decision-maker before a candidate is ultimately chosen is much greater than the cutoff $c = \sqrt{n} - 1$ in the Bearden (2006) setting. In fact, in repeated samples, the decision-maker would be expected to interview at least twice as many candidates almost half the time. This can be seen by recognizing that the theoretical median of $N$, based on the probability distribution function in Eq. (1), is $2c = 2(\sqrt{n} - 1)$.
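The median can be recovered by accumulating the selection probabilities. The sketch below is illustrative, assuming $P(N = i) = c/(i(i-1))$ for $c < i < n$, with the remaining probability mass on the last applicant; the function name is hypothetical. Since the cumulative probability up to $k$ is $1 - c/k$, it reaches one half exactly at $k = 2c$.

```python
from fractions import Fraction


def bearden_median_interviews(n, c):
    """Smallest k with P(N <= k) >= 1/2 when the first c of n applicants are rejected."""
    cdf = Fraction(0)
    for i in range(c + 1, n):
        cdf += Fraction(c, i * (i - 1))   # P(N = i) for c < i < n
        if cdf >= Fraction(1, 2):
            return i
    return n  # otherwise the median falls on the last applicant


if __name__ == "__main__":
    for n, c in ((9, 2), (25, 4), (100, 9)):
        print(n, c, bearden_median_interviews(n, c))
```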

For its part, the expected number of interviewed candidates is:

(3)

where $\psi$ is the digamma function. It turns out that the overestimation error of the approximation is roughly proportional to .
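A digamma-based expression of this kind is easy to evaluate without special-function libraries, because $\psi(n) - \psi(c) = \sum_{k=c}^{n-1} 1/k$ for positive integers $c < n$. The sketch below rests on an assumption: it evaluates the closed form $c\,[\psi(n) - \psi(c)] + \frac{nc}{n-1}$ with $c = \sqrt{n} - 1$ rounded to the nearest integer. That form is not quoted from the text; it was chosen because it agrees with the Bearden (2006) means reported in Table 1 for pool sizes such as 9, 100 and 10000. The function name is hypothetical.

```python
import math


def bearden_expected_interviews(n):
    """Assumed closed form c * (psi(n) - psi(c)) + n * c / (n - 1), with c = round(sqrt(n) - 1)."""
    c = round(math.sqrt(n) - 1)
    if c < 1 or c >= n:
        raise ValueError("pool size too small for a cutoff of at least one applicant")
    # For integers, psi(n) - psi(c) equals the harmonic difference sum_{k=c}^{n-1} 1/k.
    psi_diff = sum(1.0 / k for k in range(c, n))
    return c * psi_diff + n * c / (n - 1)


if __name__ == "__main__":
    for n in (9, 100, 10000):
        print(n, bearden_expected_interviews(n))
```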

4 Numerical Estimates

Unfortunately, deriving algebraically the expected number of interviewed candidates in the case where the decision-maker observes the actual payoff values remains an open problem. However, it is relatively straightforward to estimate numerically the expected number of interviews in the Moser (1956) setting for any given pool size $n$, as long as we are able to compute the optimal stopping thresholds with a sufficient degree of precision. This approach has the potential to be especially useful and practical with smaller applicant pools for which the asymptotic limit of $n/3$ derived by Mazalov and Peshkov (2004) is not a satisfactory approximation.

Let $E_{n-i}$ refer to the optimal stopping threshold computed recursively for the $i$th candidate observed by the decision-maker or, stated differently, when there are $n - i$ applicants left to be observed. As before, let $N$ be the number of candidates interviewed before one is hired. Keeping in mind that the payoff values observed sequentially by the decision-maker are drawn from a uniform distribution, the probability that no candidate before the $i$th candidate will have a payoff value above its corresponding threshold is $\prod_{j=1}^{i-1} E_{n-j}$. Since the $i$th candidate will have a payoff value above its corresponding threshold with probability $1 - E_{n-i}$, the ex ante probability that the $i$th candidate will be selected by the decision-maker is $P(N = i) = (1 - E_{n-i}) \prod_{j=1}^{i-1} E_{n-j}$. In turn, the expected number of interviews in the Moser (1956) setting is:

$$E(N) = \sum_{i=1}^{n} i \, (1 - E_{n-i}) \prod_{j=1}^{i-1} E_{n-j} \tag{4}$$

In the presence of $n = 10^7$ applicants, for example, the decision-maker can expect to conduct approximately 3,333,339 interviews on average. Of course, the fact that this closely approximates $n/3$ should not come as a surprise in light of the asymptotic result previously derived by Mazalov and Peshkov (2004).
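The numerical procedure can be sketched in a few lines. The code below is a minimal illustration rather than the implementation used for Table 1: it assumes thresholds $E_0 = 0$, $E_{m+1} = (1 + E_m^2)/2$, applies threshold $E_{n-i}$ to the $i$th candidate, and accumulates $i \cdot P(N = i)$. The function name is hypothetical.

```python
def moser_expected_interviews(n):
    """Expected number of interviews with observable uniform payoffs and n applicants."""
    # thresholds[m] is the optimal threshold when m applicants remain to be seen.
    thresholds = [0.0]
    for _ in range(n - 1):
        thresholds.append((1.0 + thresholds[-1] ** 2) / 2.0)

    expected = 0.0
    reach = 1.0  # probability that no earlier candidate exceeded its threshold
    for i in range(1, n + 1):
        threshold = thresholds[n - i]
        expected += i * reach * (1.0 - threshold)  # i * P(N = i)
        reach *= threshold
    return expected


if __name__ == "__main__":
    for n in (9, 100, 10000):
        value = moser_expected_interviews(n)
        print(n, value, value / n)
```

For $n = 9$ this gives roughly 4.238, in line with the first row of Table 1, and the ratio to $n$ drifts towards one third as the pool grows.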

More numerical results are summarized in Table 1. The corresponding theoretical predictions in the Bearden (2006) setting are included for comparison purposes.

Pool Size   Moser (1956)                             Bearden (2006)
(n)         E(N)      E(N)/n   Median   Median/n     E(N)      Median
9           4.23844   0.4709   3        0.3333       5.68571   4
25          9.87275   0.3949   8        0.3200       11.9371   8
49          18.0805   0.3690   15       0.3061       19.1777   12
64          23.1646   0.3619   20       0.3125       23.0590   14
100         35.3069   0.3531   30       0.3000       31.2265   18
400         135.758   0.3394   118      0.2950       77.4217   38
2500        836.365   0.3345   734      0.2936       242.191   98
10000       3336.83   0.3337   2931     0.2931       556.413   198
1000000     333338    0.3333   292897   0.2929       7901.35   1998
Table 1: Expected number of interviews when the decision-maker can observe the uniformly distributed payoff values (Moser, 1956) or only a rank-based indicator (Bearden, 2006).

We note in passing that fewer interviews are expected to take place when the payoff values are observable by the decision-maker (Moser, 1956) compared to when the decision-maker only receives a rank-based signal of relative attractiveness (Bearden, 2006), as long as there are fewer than 64 or so available applicants under consideration. As soon as the pool of applicants grows beyond 64 or so choices, the decision-maker who can observe the actual payoff values will actually be expected to conduct more interviews on average.

5 Conjecture

Using the same probability distribution function in Eq. (4) to calculate the median, it is possible to show that the decision-maker who has access to $10^6$ applicants is equally likely to conduct more or less than 292,897 interviews. With $10^7$ applicants, the median is 2,928,936 interviews.

This leads us to speculate, without being able to prove mathematically, that the median number of interviews in the Moser (1956) setting converges asymptotically to $\left(1 - 1/\sqrt{2}\right) n \approx 0.2929\,n$. In other words, a decision-maker in the Moser (1956) setting who deals with a large pool of applicants should plan to interview at least (no more than) 29.29% of those applicants roughly half the time.
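The conjecture can be probed numerically for moderate pool sizes. The sketch below is illustrative and rests on two assumptions: thresholds follow $E_0 = 0$, $E_{m+1} = (1 + E_m^2)/2$, with $E_{n-i}$ applied to the $i$th candidate, and the median is taken as the largest $k$ for which the probability of needing more than $k$ interviews is still at least one half. That convention was chosen because it returns 3 for $n = 9$, as in Table 1. The function name is hypothetical.

```python
def moser_median_interviews(n):
    """Largest k with P(N > k) >= 1/2 in the observable-payoff setting (assumed convention)."""
    # thresholds[m] is the optimal threshold when m applicants remain to be seen.
    thresholds = [0.0]
    for _ in range(n - 1):
        thresholds.append((1.0 + thresholds[-1] ** 2) / 2.0)

    survive = 1.0  # P(N > k): none of the first k candidates was selected
    median = 0
    for k in range(1, n + 1):
        survive *= thresholds[n - k]
        if survive < 0.5:
            break
        median = k
    return median


if __name__ == "__main__":
    for n in (9, 1000, 5000):
        m = moser_median_interviews(n)
        print(n, m, m / n)
```

The printed ratio can be compared with the conjectured limit of about 0.2929 as $n$ increases.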

6 Conclusion

Optimal stopping problems give rise to random distributions describing how many interviews might be conducted by the decision-maker. Despite the fact that they have practical implications, these probability distributions are rarely studied. This short research note focuses on the problem of choosing a candidate from a pool of applicants with uniformly distributed talent.

In terms of new theoretical results, we offer that the expected number of interviews in the Bearden (2006) setting admits a close approximation, while the median is the integer closest to $2(\sqrt{n} - 1)$ (Section 3).

A good rule-of-thumb within the Moser (1956) setting is that approximately (and no fewer than) one third of all applicants can expect to be interviewed on average, an asymptotic result due to Mazalov and Peshkov (2004). We confirm this empirically. In fact, the numerical estimates in Section 4 allow us to add that approximately 29.29% of all applicants should be interviewed at least half the time (Section 5). Proving this conjecture mathematically remains an open problem.

References

  • Bearden (2006) Bearden, J. N. (2006). A New Secretary Problem with Rank-Based Selection and Cardinal Payoffs. Journal of Mathematical Psychology, 50(1):58–59. DOI: 10.1016/j.jmp.2005.11.003.
  • Ferguson (1989) Ferguson, T. S. (1989). Who Solved the Secretary Problem? Statistical Science, 4(3):282–289. DOI: 10.1214/ss/1177012493.
  • Gilbert and Mosteller (1966) Gilbert, J. P. and Mosteller, F. (1966). Recognizing the Maximum of a Sequence. Journal of the American Statistical Association, 61(313):35–73. DOI: 10.1080/01621459.1966.10502008.
  • Mazalov and Peshkov (2004) Mazalov, V. V. and Peshkov, N. V. (2004). On Asymptotic Properties of Optimal Stopping Time. Theory of Probability & Its Applications, 48(3):549–555. DOI: 10.1137/S0040585X97980580.
  • Moser (1956) Moser, L. (1956). On a Problem of Cayley. Scripta Mathematica, 22:289–292.