Compute the expectation of steps making $n$ different balls the same

Suppose that we are given $n$ balls of different colors. Each time we pick up two balls randomly and change the color of the second ball to that of the first, then we put them back. The question is what is the expectation of steps when these balls become the same color.

I solved the problem for $n=2, 3, 4$. The naive solution is to solve a system of recursive equations. But there are too many states if $n$ is lager than $4$.

As requested by the comment, I now present my solution for $n=3$; similar method works for $n=4$. Let $E(1, 2, 3)$ be the desired expectation, we have $$ E(1, 2, 3)=1+\frac{1}{6}(E(1, 1, 2)+E(1, 2, 2)+etc.)=1+E(1,1, 2) $$ and $$ E(1, 1, 2)=1+\frac{1}{3}(E(1,1,2)+E(1,1,1)+E(1, 2, 2))=1+\frac{2}{3}E(1,1,2). $$ It not hard to see that $E(1,1,2)$ is finite, hence $E(1,1,2)=3$ and $E(1,2,3)=4$.

P.S. I was asked this question by a phone interviewer. If I or he didn't get the question wrong, then I hope there might be a 'clever and quick' solution for general cases.


Solution 1:

Following up on Christian Blatter's answer, I crunched through all the equations for $n=5$ and, using Christian's notation, found

$$E_{11111}=16$$

Together with the trivial results $E_1=0$ and $E_{11}=1$, the OP's reported $E_{111}=4$, and another computation to get $E_{1111}=9$, the sequence of expected values,

$$0,1,4,9,16,\ldots$$

looks suspiciously familiar. It would behoove someone to take a look at $n=6$. It might also help if someone independently checked my work on $n=5$ (and $n=4$, for that matter), to make sure I'm not fooling myself. (Remark, added later: I wrote a quick and dirty computer simulation and looked at cases up to $n=10$. The results seem to agree with $(n-1)^2$ as the expected value. This gives me a bit more confidence that I did the $n=5$ case correctly. Still, no one should take my word for it.)

If indeed the expected values are all squares, there ought to be a simple proof, but I have no idea what it might be.

Solution 2:

This is almost a Moran process: the only difference is that in a Moran process, the two balls are sampled with replacement, so with probability $1/n$ a ball replaces itself and nothing happens. In the Moran process, the expected number of steps until "fixation" (all balls being the same color) is $t^*=n(n-1)$ (see, e.g., WJ Ewens, Mathematical Population Genetics, 2004, (3.54)). Since a fraction $1/n$ of those steps are self-replacements that are omitted from this process, the expected number of steps in this case is $t=(1-1/n)t^*=(n-1)^2$.

I should mention that the key step in the derivation for the Moran process is that to calculate the expected time to fixation, by symmetry one can pick a single ball and condition on it being the one to take over, and then just track the number of balls with its color. This drastically reduces the size of the state space and gives a simple recursion.

Solution 3:

Note: The following "old" answer deals with the organization of numerical experiments concerning this problem. For a full solution of the problem check my other answer below.

When analyzing this game we should forget the names of the colors. The states of the game are the partitions of $n$, i.e., the different ways of writing $n$ as a sum of positive integers. For $n=3$ these are $$(3), (2,1), (1,1,1)\ ,$$ and for $n=5$ there are $7$ of them, namely $$(5), (4,1), (3,2), (3,1,1), (2,2,1), (2,1,1,1), (1,1,1,1,1)=:(1^5)\ .$$ For $n=10$ we would have $42$ states. The initial state of the game is $(1^n)$, and the end state is $(n)$.

In the case of $n=5$ we have the equations $$\eqalign{E_5&=0\cr E_{41}&=1+{4\over5}\cdot{3\over4}E_{41}+{4\over5}\cdot{1\over4}E_5+{1\over5}E_{32}\cr E_{32}&=1+{3\over5}\cdot{2\over4}E_{32}+{3\over5}\cdot{2\over4}E_{41}+{2\over5}\cdot{3\over4}E_{32}+{2\over5}\cdot{1\over4}E_{32}\cr}$$ and $4$ more of this kind.

Maybe you can manage the cases $5$, $6$, and $7$; then ask OEIS whether the found numbers mean something..