Bag of Marbles problem from exercises for feynman lectures

You have 100 marbles numbered 0 to 99 in a bag. You repeatedly draw a marble from the bag (all marbles being equiprobable), note its number, and replace it in the bag. On average, how many of the marbles numbered 1 through 99 will have been drawn from the bag one or more timesbefore drawing marble #0?

This is a problem from exercises for feynman lectures.

I have tried to solve this problem. First i found the probability of the occurance of an $n$ length sequence excluding the terminating zero. which results in $$\frac{99^n}{100^{n+1}}$$ then i found the probability that this $n$ length sequence containing only $k$ unique numbers from $1 to 99$. Which i got as $$\frac{99C_k *k^{n-k}}{100^n}$$. Then the joined combination is summed over $k from 1 to 99 & n from 1 to Infinity$. But i couldn't get the correct answer which is $\frac{99}{2}$ Please tell me where i went wrong.


Solution 1:

A far simpler approach:

For each $i\in \{1,2,\dots,99\}$ let $X_i$ be the indicator random variable equal to $1$ if at least one marble numbered $i$ was drawn before any marble numbered $0$, and equal to $0$ otherwise.

It is easy to show that $Pr(X_i=1)=Pr(X_i=0)=\frac{1}{2}$, due to the symmetry of the problem. Search elsewhere on this site for a formal proof if you are unconvinced.

As $X_i$ is an indicator randomvariable it follows that $E[X_i]=Pr(X_i=1)=\frac{1}{2}$

Now, recognize that the total number of marble id's which have been seen before a $0$ is drawn is $X=\sum\limits_{i=1}^{99}X_i$ and by the linearity of expectation you have:

$$E[X]=E[\sum\limits_{i=1}^{99}X_i]=\sum\limits_{i=1}^{99}E[X_i] = 99\cdot \frac{1}{2}$$


As for your mistake... it appears as though you are asserting that the probability of exactly $k$ different numbers appearing in $n$ pulls given that the $n+1$'st pull is $0$ as being $\dfrac{\binom{99}{k} k^{n-k}}{100^n}$. I assume that your logic went something like "we'll first choose which $k$ of the numbers were seen, then we'll guarantee an occurrence of each, then for the rest of the pulls we let them be any from the list of selected numbers"

This is a good idea, but implemented incorrectly. The probability you found was in actuality the probability that exactly $k$ values were seen, and very specifically within the first $k$ pulls the $k$ values all appear exactly once each in increasing order after which you have the remaining pulls all being any from the list of selected values. This however does not account for situations where you repeat any value within the first $k$ pulls nor does it account for situations where the $k$ values first appear in an order other than increasing order.

To correct your approach, you should look into using Stirling Numbers of the Second Kind, however this is overly complicated compared to the argument above using the linearity of expectation.