Better than random

As has been pointed out, the Wikipedia page contains more than enough information. Here is my answer anyway:

Let $X$ be a r.v. with a distribution of your choice. The only important thing is that it gives positive weight to each (nonempty) interval of the reals. For example, let $X$ be a standard normal variable.

Let the two numbers be $a$ and $b$, with $a < b$. Compare a realisation of $X$ (independent of your choice of $a$ or $b$) with the value of the first number you see. If $X$ is bigger, switch, otherwise keep the number.

The probability of winning can be computed as follows:

  • If you choose $a$ (with chance $\frac{1}{2}$), then you win if you switch, i.e. if $X > a$. This has probability $P(X>a)$.
  • Similarly, if you choose $b$ (with chance $\frac{1}{2}$), you win if you don't switch, i.e. if $X \leq b$. This has probability $P(X\leq b)$.
  • So the overall probability of winning is $\frac{1}{2}P(X>a) + \frac{1}{2}P(X\leq b) = \frac{1}{2} + \frac{1}{2}P(a < X \leq b)$, which is strictly larger than $\frac{1}{2}$ because $X$ gives positive weight to the interval $(a, b]$.
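The computation above can be checked numerically. The following is only a sketch under stated assumptions: the values of $a$ and $b$ are arbitrary illustrations, $X$ is taken to be standard normal as in the example, and the function names `play_once`, `win_rate` and `exact_win_probability` are made up for this illustration.

```python
import random
from statistics import NormalDist


def play_once(a, b, rng):
    """Play one round: open a box at random, switch if X exceeds what we see."""
    seen, other = (a, b) if rng.random() < 0.5 else (b, a)
    x = rng.gauss(0.0, 1.0)  # realisation of X ~ N(0, 1), independent of the box choice
    final = other if x > seen else seen
    return final == max(a, b)


def win_rate(a, b, trials=100_000, seed=0):
    """Estimate the winning probability by simulation."""
    rng = random.Random(seed)
    return sum(play_once(a, b, rng) for _ in range(trials)) / trials


def exact_win_probability(a, b):
    """Evaluate 1/2 + 1/2 * P(a < X <= b) for a standard normal X."""
    lo, hi = min(a, b), max(a, b)
    nd = NormalDist()
    return 0.5 + 0.5 * (nd.cdf(hi) - nd.cdf(lo))


# Hypothetical example values; any two distinct reals would do.
a, b = -0.3, 1.2
print(win_rate(a, b), exact_win_probability(a, b))
```

The two printed values should agree up to simulation noise. For numbers that are far out in the tails or very close together, the advantage over 1/2 is still positive but becomes tiny.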

There's a detailed discussion about this on MO: https://mathoverflow.net/questions/9037. There's also a question on math.SE on the Card doubling paradox, but this is about expectation values for doubled amounts, not about the probability of guessing correctly. And as Didier pointed out in a comment, there's a section in the Wikipedia article on the two envelopes problem that deals with this as an extension of that.


The Wikipedia page on the two envelopes problem given in the comments contains the answer in its second-to-last section, called "Randomized solutions." It instructs the player to draw a random number from the distribution of the numbers in the boxes and to switch if the number in the box the player opens is lower than the random number drawn.

Obviously, if the random number is greater than or less than both of the numbers, the chances are still 50%, but if the random number is between the two numbers, the player always ends up with the right box. This gives a winning probability of $P = .5 + .5\,p$, where $p$ is the probability that lower number < random number < higher number, so $P$ is greater than .5 whenever $p > 0$.


Note: I have never heard this before, I'm only reasoning through it, so don't take my answer as the answer.

This problem is defined by its ambiguity. Can ANY number be in either box, from negative infinity to infinity, with equal distribution? It doesn't say anything about it. From 0 to infinity? Only integers?

If it's negative infinity to positive infinity (integers or not), you can do better than 50% only in a trivial sense: if you see -5.6 billion in the first one, there are infinitely many numbers in either direction, so you have a 50% chance either way. So, no, you can't do better than 50% given the equal-distribution-negative-infinity-to-positive-infinity assumption.

If there is a known discrete range, then it's obviously easy to get better than 50%. But with any infinite range, you can't do better than 50%.

Note: I am really guessing here that this problem was formed with this thought: If the range is 0 to infinity, then whatever finite number I get in the first box has finitely many numbers smaller than it and infinitely many larger; therefore, I should always pick larger, and I am correct with a probability approaching (but not exactly equal to) 100%. This is an absurd conclusion, since it implies that no matter which box I pick first, it will always have the smallest number, when in fact this will only happen 50% of the time.

Not being a mathematician, my guess is that the unstated assumption that leads to this contradiction is what the comment below points out: The entire idea of a uniform distribution over an infinite set is impossible.

So what does this say about the question? Well, just that it doesn't really make sense with an infinite range, and is horribly simple with a finite range.


If you know the distribution of random numbers, for instance if you know that the numbers come randomly from an interval (0,L) of real numbers, then your probability of winning is

2/3, or about 67% of the trials, if you base the decision on a random number drawn from the same distribution, and

3/4, so 75% of the trials, if you use the median of the distribution as the threshold.

To see the first, pick a third random number from the same distribution. The probability that this number lies between the other two is 1/3, as you can check from the six equally likely orderings, and in this case the strategy always selects the right answer. If the number is greater or smaller than both boxes, you still have the probability 1/2 of a random guess, so the total probability is 1/3 + 2/3 * 1/2 = 2/3.

To check this, you can try the following Python program:

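from random import random

trials = 10000
winrandom = 0  # wins when the final box is chosen by a fair coin flip
winmethod = 0  # wins when the decision is based on a random threshold

for _ in range(trials):
    abox = random()
    bbox = random()
    # Baseline: pick one of the two boxes at random.
    choose = "a" if random() < 0.5 else "b"
    if choose == "a" and abox > bbox: winrandom += 1
    if choose == "b" and bbox > abox: winrandom += 1
    # Method: open box a, draw a threshold from the same distribution,
    # keep a if the threshold is below abox, otherwise switch to b.
    decide = "a" if random() < abox else "b"
    if decide == "a" and abox > bbox: winmethod += 1
    if decide == "b" and bbox > abox: winmethod += 1

# Number of trials, baseline win rate, method win rate.
print(trials, winrandom / trials, winmethod / trials)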
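
The 2/3 figure can also be checked exactly rather than by simulation. The sketch below assumes only that the two boxes and the threshold are independent draws from the same continuous distribution, so that all six rank orderings are equally likely. The function name `exact_win_probability` is made up for this illustration.

```python
from fractions import Fraction
from itertools import permutations


def exact_win_probability():
    """Enumerate the 3! equally likely rank orderings of
    (opened box, other box, threshold) and count the wins."""
    orderings = list(permutations(range(3)))
    total = Fraction(0)
    for first, second, threshold in orderings:
        # Keep the opened box if the threshold is below it, otherwise switch.
        final = first if threshold < first else second
        if final == max(first, second):
            total += Fraction(1, len(orderings))
    return total


print(exact_win_probability())  # prints 2/3
```

Four of the six orderings are wins: the two in which the threshold lies between the boxes, plus half of the remaining four.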

Generically, if you use a different probability distribution, your total probability is still given by the formula offered in the other solutions, $p + {1\over 2}(1-p)$, or $${1\over 2} + {1\over 2} p$$
with $p$ the probability that the random number lies between the two boxes. Note that some formulae combine this $p$ with the probability of the first box being smaller than the second, to hide the 1/2 factor.

A distribution concentrated on the median of the original one is a better strategy, because the probability that the median lies between the two numbers is 0.5, and so the winning probability rises to 75%.
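
A quick simulation of the median strategy, as a sketch only: it assumes the numbers are uniform on $(0, L)$, so the median is $L/2$. The function name `median_strategy_win_rate` and the default parameters are illustrative choices, not part of the original program.

```python
import random


def median_strategy_win_rate(L=1.0, trials=100_000, seed=0):
    """Keep the opened box if it exceeds the median L/2, otherwise switch."""
    rng = random.Random(seed)
    median = L / 2
    wins = 0
    for _ in range(trials):
        seen = rng.uniform(0, L)   # the box that is opened
        other = rng.uniform(0, L)  # the box that stays closed
        final = seen if seen > median else other
        wins += final == max(seen, other)
    return wins / trials


print(median_strategy_win_rate())
```

The printed rate should be close to 0.75: the boxes straddle the median half of the time, in which case the strategy always wins, and otherwise it wins half of the time.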