How to use the 1/e law of best choice?

Caution: I'm not a mathematician, but I remember some of what I learned in college.

I was reading about the Secretary Problem on Wikipedia, essentially about determining the optimal moment to stop evaluating new options. I'm interested in the variation where there is an unknown number of applicants, which they say can be solved with the $\frac{1}{e}$ law of best choice.

Wikipedia gives a simple formula for the time distribution function, where $f$ is the frequency of applicants, and $T$ is the maximum time you can wait.

$$F(t) = \int_{0}^{t} f(s)ds \ \ , \ \ 0\le t\le T$$

If I understand correctly, the article then asserts that the correct thing to do is to wait until $F(t) = \frac{1}{e}$.

I am puzzled at what this would mean in the real world. I'm interpreting the definite integral to mean the cumulative number of candidates. But it can't be asking us to wait until half a candidate shows up. Or is it saying to discard the first candidate, and take the next one that's as good? Or do I have this completely wrong and it's saying to wait until the density of candidates is $\frac{1}{e}$? If so that is unintuitive.

The Wikipedia article is unclear (to me, anyway) and I haven't found a better reference.


Solution 1:

Suppose that all applicants have independently of each other the same arrival time density $f$ on $[0,T]$ and let $F$ denote the corresponding arrival time distribution function $$ F(t) = \int_0^t f(s) \, ds, \, 0 \leq t \leq T. $$

The author is stretching the analogy somewhat to give intuition for the integral of the probability density function. In this model, you might tell the applicants to email their resumes from midnight this morning to midnight tonight. Suppose the resumes arrive randomly according to the probability density function $f(s)$. Let $\tau$ be the time such that $F(\tau) = \frac{1}{e}$. In other words, a $\frac{1}{e}$ fraction of the area under the probability density function has been witnessed, which should roughly correspond to having seen a $\frac{1}{e}$ fraction of the applicants. (In practice, you could be so unlucky that all applicants randomly choose to arrive just before midnight tonight, but this is highly unlikely.) The strategy recommends that you dismiss all applicants who emailed before time $\tau$ and hire the first applicant after that who is better than all the previously dismissed ones.

In short, your goal is to reject the first $\frac{1}{e}$ fraction of applicants and hire the next applicant who is better than all the previously dismissed ones. If you happen to know there will be exactly $N$ applicants, then you can dismiss the first $\frac{N}{e}$ of them. If you do not know how many applicants will apply, then you must make some judgement about the distribution of the application times and choose the time when you expect a $\frac{1}{e}$ fraction have already applied.

Solution 2:

You are right, Neilk, if you say " In the real world, distributions of events are messier, and often never settle down to any average frequency. – neilk "

The point is that the 1/e-strategy is the best you can do in practice if you cannot estimate the number of candidates beforehand. Do not think of F(t) as as fraction of the number of candidates. F is the cumulative distribution function with which you model your decision problem, and then you choose t such that F(t)=1/e. Thus you expect to see (as the two preceding answers explain correctly) the fraction 1/e, thus around 37 percent, of the unknown number of candidates, but this is an expected value which need not be an integer. Choose t and wait and see.