Alice has two bags, having random values in [0, 1]. She reveals one bag to Bob. Bob picks one bag, Alice gets the other. What is the optimal strategy?

Alice is given two bags. The values of the contents of the bags are each uniformly (and independently) distributed over $[0, 1]$. She looks in both bags and finds out their value. She then reveals the value of one of the bags to Bob. Bob then picks either the revealed bag or the hidden bag. Bob and Alice are both trying to maximize their expected value. What are their optimal strategies? Assume that Bob knows Alice's strategy when he chooses his strategy.

It's clear that Bob can always achieve an expected value of $0.5$ by just picking either the revealed bag or the hidden bag uniformly at random.

One proposed strategy for Alice is to reveal the bag having value closer to $0.5$. In this case, after a bag of value $x$ is revealed to Bob, his posterior distribution over the value of the hidden bag is uniform over $[0, 0.5 - t] \cup [0.5 + t, 1]$, where $t = |x - 0.5|$. Since this distribution is symmetric about 0.5, Bob's expected value for the hidden bag is always $0.5$. Accordingly, Bob will choose the revealed bag with value $x$ if $x \ge 0.5$, and will otherwise prefer the hidden bag having expected value $0.5$.

Since the distribution of the value $x$ of the revealed bag is symmetric about $0.5$, the probability of $x \ge 0.5$ is $0.5$. The probability density of $x$ conditioned on $x \ge 0.5$ is proportional to $p(x) = 1 - 2(x - 0.5) = 2 - 2x$; normalization gives us $p(x) = 8 - 8x$. Bob's expected value when $x \ge 0.5$ is therefore $\int_{0.5}^{1} x (8 - 8x) \,dx = 2/3$, so his overall expected value is

$$ \frac{1}{2} \cdot \frac{1}{2} + \frac{1}{2} \cdot \frac{2}{3} = \frac{7}{12} = 0.5 + \frac{1}{12}. $$

As we can see, this strategy still gives Bob $1/12$ more expected value than the proven lower bound of $0.5$. My questions are:

  • Is there a better strategy for Alice?
  • I suspect the only way to bring Bob's expected value to 0.5 would be if Alice had a strategy where, after she revealed a bag of value $x$ to Bob, Bob's expected value of the hidden bag is $x$. Only then is Bob unable to gain any advantage out of his ability to choose. Is such a strategy possible?
  • Can we prove a better lower bound? Can we prove the above strategy optimal?

Solution 1:

Here, I will expand the argument given by @QuesterZen in his last paragraph, and argue that Alice has no strategy that helps her in any way against Bob's strategy of "take anything $\geq 0.5$ and switch otherwise", and show that $7/12$ is indeed the best.

There are essentially 3 cases we have to consider:

  1. $a < b < 0.5$: Alice can ensure that Bob gets the $a$ by showing $b$.
  2. $a < 0.5 < b$: Alice has no choice but let Bob get the larger one.
  3. $0.5 < a < b$: Alice can ensure that Bob gets $a$ by first showing $a$.

In none of these strategies did Alice do anything suboptimal; the crux of Bob's gains are coming from the fact that whenever the two numbers are separated by the threshold Bob chose (0.5), Bob is guaranteed to make big bucks. (this is similar to the two envelopes puzzle if you have heard of it)

The expected gains of Bob under this strategy matches your $7/12$ as:

$\frac{1}{4} \frac{1}{6} + \frac{1}{2} \frac{3}{4} + \frac{1}{4} (\frac{1}{2} + \frac{1}{6}) = \frac{7}{12}$,

which shows that Bob can indeed always achieve 7/12.

Solution 2:

Perhaps we can build up to the continuous case with smaller finite cases?

With only {0,1}, Bob always wins by accepting 1 and rejecting 0. His expected return is $\frac{3}{4}$.

With only {0,0.5,1}, the only interesting case is if Alice has either (0.5,0) or (0.5,1), when she has to show the 0.5 either way. Bob can accept it or chose randomly with equal results. His expected return is $\frac{11}{18}$.

With {0,0.33,0.67,1} Alice gains nothing by showing the 0 or 1. In all other cases Bob can’t do better than when he sees 0.33 he swaps and if 0.67 he holds and Alice can’t make any better choices even knowing this. His expected return is $\frac{5}{8}$.

Five and six values are basically the same.

So far it is looking very likely you are correct.

For the continuous case, I’m fairly sure Alice can’t beat Bob’s obvious strategy or Bob beat Alice’s. This would mean your pair of strategies is a Nash equilibrium for the game.

To see that neither has a better strategy in the face of the others’ obvious strategy, take for example the situation where Alice has two low values a,b with a<b<0.5. If she shows value a, she expects Bob to swap it, so she can’t beat showing b hoping to “trick” Bob into swapping. But for each such case there is a case where she has b,(1-a) with equal probability. Again she has to show b as 1-a>0.5 so Bob would take it and win. For Bob if he sees the value b, it is equally likely that Alice has a or 1-a in her bag since either way she would show b, so he should always swap, gaining an expected value of 0.5>b.