Most efficient strategy for guessing outcome of (fair) dice roll?

Each yes-no question asked can provide at most one bit of information. A fair dice roll has six equally probable outcomes, so $\log_26=2.585$ bits of entropy. Any strategy to correctly guess the dice roll cannot use fewer than this many questions on average.

A strategy that halves the search space with each new question as evenly as possible will take two questions to correctly guess for two rolls (say 1, 6) and three questions for the other four rolls, thus averaging $\frac83=2.667$ questions. This is optimal, since if three rolls only required two questions the average number of questions would be 2.5 – below the theoretical lower bound.

The strategy that asks whether the roll is 1, then 2, then 3 and so on requires 5 questions if the roll was 5 or 6, thus requiring $\frac{1+2+3+4+5+5}6=\frac{10}3=3.333$ questions on average; it is sub-optimal.


Your question is essentially a coding problem. When you have decided on your strategy, each outcome is a series of yes-no-answers. By encoding "yes" answers as $1$ and "no" answers as $0$, you get a binary codeword for each of the outcomes.

The set of the codewords must be a prefix code, that is, no codeword is a prefix of another. That's because, for example, if one outcome corresponds to answer sequence "yes-no-yes", you clearly can't have a outcome that corresponds to answer sequence "yes-no". (Otherwise you'd know the answer after 2 questions but still would ask a third question which would change the answer, which is clearly impossible.)

On the other hand, you can transform a prefix code to a question strategy; you'll just formulate your $k$th question as "Using this coding, is the $k$th bit in the codeword $1$?" and stop when you know the answer.

So your question is exactly equivalent to "What is the most efficient prefix code for a die throw outcome?" Here, efficiency is the expected value of the length of the codeword.

Given a known set of outcomes with a known probability distribution, the most efficient prefix code is the Huffman code. I'll leave constructing the code as an exercise for the reader, but the optimal expected length in this case is $2+\frac{2}{3}$.