3 other extensions of the Secretary Problem

The class of potentially optimal strategies for the standard version (find the best, one selection) can be derived as follows: It never makes sense to choose a candidate worse than the best seen so far. Thus a strategy only needs to specify when to choose a candidate who's the best candidate so far; let's call such a candidate a plausible candidate. Given that the current candidate is plausible, the relative order of the other candidates seen so far provides no information on the probability of the current candidate being the best of all. Thus the decision whether to accept a plausible candidate can only depend on the time of arrival. It's clear that the probability of a plausible candidate being the best of all increases over time, so the solution must be not to accept any candidates up to a certain time and to accept plausible candidates from then on.

Analogous reasoning can be applied to Problem 1. Now a plausible candidate is someone who's either the best or the worst seen so far. The problem is symmetric with respect to best and worst, so the optimal strategy must be to reject all candidates up to a certain time $r$ and to accept plausible candidates from then on. The calculation of the optimal value of $r$ proceeds as in the Wikipedia section on the standard version; the probability of choosing the best or worst candidate is

$$ \begin{align} P(r) &= \sum_{i=1}^{n} P\left(i \text{ is selected}\mid i \text{ is best or worst}\right) \cdot P\left(i \text{ is best or worst}\right) \\ &= 0 + \sum_{i=r}^{n} P\left( \text{best and worst of first } i-1 \text{ are among first } r-1 \mid i \text{ is best or worst} \right) \cdot\frac{2}{n} \\ &= \sum_{i=r}^{n} \frac{(r-1)(r-2)}{(i-1)(i-2)} \cdot \frac{2}{n} \\ &= \frac{(r-1)(r-2)}n\sum_{i=r}^{n} \frac{2}{(i-1)(i-2)}. \end{align} $$

In the limit $n\to\infty$, with $x=r/n$ and $t=i/n$ this becomes

$$x^2\int_x^1\frac2{t^2}\mathrm dt=2x^2\left(\frac1x-1\right)=2x(1-x)\;.$$

The maximum is at $x=1/2$, and the probability achieved is $1/2$. That is, in the limit of large $n$, the optimal strategy is to reject half the candidates and then accept the next candidate who is either the best or the worst seen so far; the probability of choosing the best or worst candidate by this strategy is $1/2$.

Another approach is to write a recurrence relation for the value $a_i$ of the game at time step $i$ under the assumption that plausible candidates are always chosen, solve the recurrence and choose $r$ so as to maximize $a_r$. In the standard version, since candidate $i$ being the best candidate of all and candidate $i$ being rejected are mutually exclusive events, the probability $a_i$ of success is just the sum of the probability $1/n$ of candidate $i$ being the best and the probability $a_{i+1}(i-1)/i$ of candidate $i$ being rejected and then the rest of the process succeeding:

$$a_i=\frac1n+\frac{i-1}ia_{i+1}\;.$$

Dividing through by $1/n$ and rearranging yields

$$\frac{a_{i+1}-a_i}{1/n}=-1+\frac{a_{i+1}}i\;,$$

which in the limit $n\to\infty$ with $t=i/n$ and $a(i/n)=a_i$ becomes

$$a'(t)=-1+\frac{a(t)}t$$

with solution

$$a(t)=-t\log t$$

as expected. The corresponding derivation for Problem $1$ is

$$a_i=\frac2n+\frac{i-2}ia_{i+1}\;,$$

$$\frac{a_{i+1}-a_i}{1/n}=-2+2\frac{a_{i+1}}i\;,$$

$$a'(t)=-2+2\frac{a(t)}t$$

with solution

$$a(t)=2t(1-t)$$

as expected.

Problem $3$ is just two independent instances of the standard version, since the very first candidate is never chosen and all remaining candidates can be a plausible candidate for at most one of the targets. Thus in this case the optimal strategy is to pick the candidates who are either the best or the worst so far beginning at $x=1/\mathrm e$, and the resulting probability of success is $(1/\mathrm e)^2$.

Problem $2$ is more difficult, and the probability of finding the median tends to $0$ for $n\to\infty$. Basically, this is because in the other problems success is almost certain when choosing plausible candidates near the end, whereas in Problem $2$ only the very last candidate has probability $1$ of being the median if it's a plausible candidate, and the penultimate one already only has probablity roughly $1/2$, and so on.

The problem can be solved numerically as follows: The decisions can only depend on the rank of the current candidate. Early in the process all candidates should be rejected; later candidates in some range of central ranks should be accepted, where the range depends on time. To determine the optimal range at each time step, we can start at the end, and at each time step for each rank compare the probability of success if we choose the candidate, given its rank, to the probability of success if we skip the candidate, and pick the greater one. Then by summing these success probabilities over all ranks we get the overall success probability, which becomes the probability of success on skipping for the previous time step.

Given that the $i$-th candidate has rank $j$, its probability of being the median is (assuming odd $n$)

$$P(i\text{ is the median}\mid i\text{ has rank } j)=\frac in\frac{\binom m{j-1}\binom m{i-j}}{\binom{2m}{i-1}}$$

(the probability $1/n$ of being the median over the probability $1/i$ of having rank $j$ times the probability of choosing $j-1$ out of $m$ values above this one and $i-j$ out of $m$ values below this one, where $m=(n-1)/2$). Approximating this using Stirling's formula doesn't lead to a scale-invariant equation as in the other problems. Numerical solution shows that the success probability decreases with $n$, and the fraction of candidates after which one should start accepting candidates with central ranks increases with $n$. Up to $n=729$, the range of central ranks in which candidates should be accepted never grows beyond $5$ on either side of the centre. Here are some results for the success probability $p$ and for the number $r$ and fraction $x$ of initial candidates to be ignored:

$$ \begin{array}{c|c|c} n&p&r&x\\\hline 1&1\\ 3&0.333333\\ 9&0.256085&4&0.444444\\ 27&0.174775&16&0.592593\\ 81&0.122964&54&0.666667\\ 243&0.086665&180&0.740741\\ 729&0.060955&590&0.809328 \end{array} $$

(The entries at the top right are empty because there's no choice in the first line and the choice doesn't make a difference in the second line.) Here's the code I used for this; it also performs a simulation in the end to check that the optimized strategy achieves the calculated probability.

It looks as if the success probability may follow a power law as function of $n$; if so, the exponent is approximately $-0.32$.