Magic 8 Ball Problem

This problem is similar to the German tank problem. You can use the same approaches as for that problem, with the number of distinct sides seen and its conditional distribution taking the place of the highest serial number seen and its conditional distribution. (Note that the only information we have about the number of sides of the magic ball is the number of distinct sides we've seen; the details of which side came up on which shake, both the ones you recorded in your sequence and the ones you didn't record, are irrelevant, since their conditional distribution given the number of distinct sides seen is the same irrespective of the total number of sides, i.e. they carry no information about the total number of sides.)

I'll work out the Bayesian approach in analogy to the treatment in the Wikipedia article linked to above; I won't go into much detail for the parts that are common to the two calculations.

So let $n$ be the number of sides of the magic ball, $k$ the number of shakes performed so far, and $m$ the number of distinct sides seen. (All the notation is in analogy to the article.) Then the desired credibility $(n\mid m,k)$ is given by

$$ (n\mid m,k) = (m\mid n,k)\frac {(n\mid k)}{(m\mid k)}\;. $$

We can take the required conditional probability $(m\mid n,k)$ of seeing $m$ distinct sides on $k$ shakes of a ball with $n$ sides from this answer:

$$ (m\mid n,k)=\frac{m!}{n^k}\binom nm\left\{k\atop m\right\}\;, $$

where the $\left\{k\atop m\right\}$ are Stirling numbers of the second kind. As in the article, we can eliminate $(m\mid k)$ by marginalizing over all possible $n$:

$$ (m\mid k)=\sum_{n=1}^\infty(m\mid n,k)(n\mid k)\;. $$
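As a quick sanity check on this likelihood, here's a minimal Python sketch (the helper names are mine) that computes the Stirling numbers by their standard recurrence and verifies that the probabilities sum to $1$ over $m$, which amounts to the exact integer identity $\sum_m m!\binom nm\left\{k\atop m\right\}=n^k$:

```python
from functools import lru_cache
from math import comb, factorial

@lru_cache(maxsize=None)
def stirling2(k, m):
    """Stirling number of the second kind, via the standard recurrence."""
    if k == m:
        return 1
    if m == 0 or m > k:
        return 0
    return m * stirling2(k - 1, m) + stirling2(k - 1, m - 1)

def likelihood(m, n, k):
    """(m | n, k): probability of seeing m distinct sides in k shakes of an n-sided ball."""
    return factorial(m) * comb(n, m) * stirling2(k, m) / n**k

# the probabilities over m = 1..k sum to 1 -- exactly, as integers:
n, k = 20, 20
assert sum(factorial(m) * comb(n, m) * stirling2(k, m)
           for m in range(1, k + 1)) == n**k
print(likelihood(13, 20, 20))  # about 0.28; this case reappears below
```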

Now comes the shady part, $(n\mid k)$, your prior credibility that there are $n$ sides after having performed $k$ shakes. In contrast to the tank problem, where observing a tank changes not only the highest serial number seen but also the minimum number of tanks, in our case $(n\mid k)$ doesn't depend on $k$. We can use the same trick of introducing a maximum number $\Omega$ of sides up to which the prior credibility is uniform, and hoping that $\Omega$ will magically drop out of the result. Then $(n\mid k)=1/\Omega$ for $1\le n\le\Omega$. Putting it all together, we have

$$ (n\mid m,k)=\frac{(m\mid n,k)}{\sum_{n=1}^\Omega(m\mid n,k)}=\frac{\frac{m!}{n^k}\binom nm\left\{k\atop m\right\}}{\sum_{n=1}^\Omega\frac{m!}{n^k}\binom nm\left\{k\atop m\right\}}=\frac{\frac1{n^k}\binom nm}{\sum_{n=1}^\Omega\frac1{n^k}\binom nm}\;. $$

Now, interestingly, if we want to let the sum in the denominator run to infinity to get rid of $\Omega$, we find that it converges if and only if $m\le k-2$. That is, if you had no prior idea how many sides the magic ball might have and you've never seen the same side twice, then you still have no idea how many sides it might have. That's perhaps to be expected; what's perhaps not quite as expected is that this is still true (though "only" with a logarithmic divergence in the denominator) if you've seen a single reoccurrence of a side. Only with at least two reoccurrences can you overcome your prior ignorance as to the possible scale of the number of sides.
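You can watch the three regimes in the partial sums of $\sum_n\binom nm/n^k$; here's a small sketch (the cutoffs are arbitrary choices of mine):

```python
from math import comb

def partial_sum(m, k, omega):
    """Partial sum of binom(n, m)/n^k for n = 1..omega."""
    return sum(comb(n, m) / n**k for n in range(1, omega + 1))

k = 5
for m in (k, k - 1, k - 2):  # 0, 1 and 2 reoccurrences
    print(m, [partial_sum(m, k, om) for om in (10**3, 10**4, 10**5)])
# m = k:     grows linearly with the cutoff (divergent)
# m = k - 1: grows like the log of the cutoff (divergent)
# m = k - 2: settles to a finite limit (convergent)
```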

Let's work out some examples. The first convergent case is $m=1$, $k=3$, i.e. you shook three times and got the same side every time. Then

$$ (n\mid 1,3)=\frac{\frac1{n^2}}{\sum_{n=1}^\infty\frac1{n^2}}=\frac6{(n\pi)^2}\;. $$

So at this point you're already $6/\pi^2\approx61\%$ sure that the thing is a lemon and only ever shows this one side. More generally, we have

$$ (n\mid 1,k)=\frac1{n^{k-1}\zeta(k-1)}\;, $$

so after seeing the same side four times in a row your inclination to go to the store to get your money back has risen to $1/\zeta(3)\approx83\%$. On the other hand, if you do see $m=2$ different sides in $k=4$ shakes, you have

$$ (n\mid 2,4)=\frac{\frac1{n^4}\binom n2}{\sum_{n=1}^\infty\frac1{n^4}\binom n2}=\frac{\frac{n-1}{n^3}}{\sum_{n=1}^\infty\frac{n-1}{n^3}}=\frac{\frac{n-1}{n^3}}{\zeta(2)-\zeta(3)}\approx 2.258\frac{n-1}{n^3}\;, $$

so now only about $28\%$ of the credibility is concentrated on the possibility that these are the only two sides.
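These percentages are easy to double-check numerically, e.g. with mpmath (a minimal sketch):

```python
from mpmath import zeta, pi

print(6 / pi**2)                          # (n=1 | m=1, k=3): about 0.61
print(1 / zeta(3))                        # (n=1 | m=1, k=4): about 0.83
print((1 / 2**3) / (zeta(2) - zeta(3)))   # (n=2 | m=2, k=4): about 0.28
```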

Now, you'd asked about an estimate of the number of sides. The expected value of $n$ under the posterior $(n\mid m,k)$ is

$$ \mathbb E(n\mid m,k)=\frac{\sum_{n=1}^\Omega\frac1{n^{k-1}}\binom nm}{\sum_{n=1}^\Omega\frac1{n^k}\binom nm}\;. $$

So it turns out that even after two reoccurrences you can't get a finite estimate of the number of sides if you didn't have any prior clue how many there might be: you need three reoccurrences for the sum in the numerator to converge. In the case of always observing a single side, the expected value is

$$ \mathbb E(n\mid1,k)=\frac{\zeta(k-2)}{\zeta(k-1)}\;, $$

so after seeing the same side four times and nothing else, your estimate of the number of sides is $\zeta(2)/\zeta(3)\approx1.368$.
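Both effects are visible numerically: with two reoccurrences the truncated posterior mean keeps growing with the cutoff $\Omega$, with three it settles down, and for $m=1$, $k=4$ it approaches $\zeta(2)/\zeta(3)$. A minimal sketch (the cutoffs are mine):

```python
from math import comb

def posterior_mean(m, k, omega):
    """E(n | m, k) under a uniform prior truncated at omega."""
    num = sum(comb(n, m) / n**(k - 1) for n in range(m, omega + 1))
    den = sum(comb(n, m) / n**k for n in range(m, omega + 1))
    return num / den

for omega in (10**2, 10**3, 10**4, 10**5):
    print(omega,
          posterior_mean(3, 5, omega),  # two reoccurrences: keeps growing
          posterior_mean(2, 5, omega),  # three reoccurrences: converges
          posterior_mean(1, 4, omega))  # same side 4 times: -> zeta(2)/zeta(3) ~ 1.368
```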

To give a more realistic example, let's say you shake the ball with $n=20$ sides $k=20$ times. Then the mode of the distribution of the number of distinct sides you see is at $m=13$ (at a probability of about $28\%$). So let's say you observe that most probable value and try to estimate the number of sides. Then

$$ (n\mid 13,20)=\frac{\frac1{n^{20}}\binom n{13}}{\sum_{n=1}^\infty\frac1{n^{20}}\binom n{13}}\approx\frac{8.25\cdot10^{19}}{n^{20}}\binom n{13}\;. $$

Here's a plot, with a nice mode at $n=20$. The estimate from the expectation value is

$$ \mathbb E(n\mid 13,20)=\frac{\sum_{n=1}^\Omega\frac1{n^{19}}\binom n{13}}{\sum_{n=1}^\Omega\frac1{n^{20}}\binom n{13}}\approx26.5\;, $$

which perhaps shows that you might be better off using the mode as the estimator rather than the mean, at least if you don't expect huge numbers of sides.
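The numbers in this example are reproducible with a short script; here's a minimal sketch (the cutoff $\Omega=10^4$ is my choice; it's ample here, since the posterior terms fall off like $n^{-7}$):

```python
from functools import lru_cache
from math import comb, factorial

@lru_cache(maxsize=None)
def stirling2(k, m):
    """Stirling number of the second kind, via the standard recurrence."""
    if k == m:
        return 1
    if m == 0 or m > k:
        return 0
    return m * stirling2(k - 1, m) + stirling2(k - 1, m - 1)

n, k = 20, 20
# distribution of the number of distinct sides seen in k shakes
dist = {m: factorial(m) * comb(n, m) * stirling2(k, m) / n**k
        for m in range(1, k + 1)}
m_mode = max(dist, key=dist.get)
print(m_mode, dist[m_mode])                        # 13, about 0.28

# posterior over n after seeing m = 13 distinct sides in 20 shakes
m, omega = 13, 10**4
w = {n: comb(n, m) / n**k for n in range(m, omega + 1)}
total = sum(w.values())
print(1 / total)                                   # about 8.25e19
print(max(w, key=w.get))                           # posterior mode: 20
print(sum(n * p for n, p in w.items()) / total)    # posterior mean: about 26.5
```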

Another question we might ask is: If we've seen all $m=n$ sides, what number $k$ of trials do we need before we're, say, $99\%$ sure that that's all there is? That is, for which $k$ do we first have

$$ (n=m\mid m,k)=\frac{\frac1{m^k}\binom mm}{\sum_{n=1}^\infty\frac1{n^k}\binom nm}=\frac{\frac1{m^k}}{\sum_{n=1}^\infty\frac1{n^k}\binom nm}\ge0.99\;? $$

Some trial and error on the borders of Wolfram|Alpha's time limits shows that for $m=n=20$ this requires a full $k=157$ trials. (If the calculation times out, just try again.)
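If you'd rather avoid the time limits, the same search takes only a moment locally; a minimal sketch (the cutoff is an assumption of mine; near the answer the terms fall off so fast that it's ample):

```python
from math import comb

def credibility(m, k, omega=10**3):
    """(n = m | m, k): posterior credibility that the m sides seen are all of them."""
    den = sum(comb(n, m) / n**k for n in range(m, omega + 1))
    return (1 / m**k) / den

m, k = 20, 20
while credibility(m, k) < 0.99:
    k += 1
print(k)  # 157
```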

Regarding the approximation considered in the comments: If the term for $n=m$ is dominant, we can compare it to the next term, for $n=m+1$:

$$ \frac{(m\mid n=m,k)}{(m\mid n=m+1,k)} = \frac{\frac1{m^k}\binom mm}{\frac1{(m+1)^k}\binom{m+1}m} = \frac{\left(1+\frac1m\right)^k}{m+1} = \frac{\mathrm e^{k\log\left(1+\frac1m\right)}}{m+1}\;. $$

We can already see that this is going to get very large for large $k$ (no approximation so far); to see more clearly how large, we can use $\log(1+x)\simeq x$ to get

$$ \frac{(m\mid n=m,k)}{(m\mid n=m+1,k)} \simeq \frac{\mathrm e^{k/m}}{m+1}\;. $$

Thus, for $k\gg m$ this ratio grows exponentially. The third term is again smaller by a similar factor, so we can ignore it and all remaining terms and approximate the credibility for $n=m$ as

\begin{eqnarray} (n=m\mid m,k) &\simeq& \frac{(m\mid n=m,k)}{(m\mid n=m,k)+(m\mid n=m+1,k)} \\ &\simeq& \frac1{1+(m+1)\mathrm e^{-k/m}} \\ &\simeq& 1-(m+1)\mathrm e^{-k/m}\;. \end{eqnarray}

So our doubts whether indeed $n=m$ decay exponentially with $k/m$.
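To see how good the approximation is, here's a last small sketch comparing it with the exact credibility (cutoff again mine):

```python
from math import comb, exp

def exact(m, k, omega=10**3):
    """Exact (n = m | m, k) with a flat prior truncated at omega."""
    den = sum(comb(n, m) / n**k for n in range(m, omega + 1))
    return (1 / m**k) / den

def approx(m, k):
    """The approximation 1 - (m+1) e^{-k/m} derived above."""
    return 1 - (m + 1) * exp(-k / m)

m = 20
for k in (100, 157, 250):
    print(k, exact(m, k), approx(m, k))
```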