I'm looking at this question and the solution given and I understand it, but I'm unable to see where I'm going wrong.

The question states that there are $k$ equally frequent colors and we do not know $k.$ We examine four smarties and notice that they are red, green, red, orange. We wish to find the maximum likelihood estimate $\hat{k}$.

The solution given is that $$\text{lik}(k) = \frac{(k-1)(k-2)}{k^3} $$ since we are looking at the probability that the second color differs from the first, the third is equal to the first, and the fourth differs from both colors seen so far. This can easily be seen to be maximized when $\hat{k} = 5$.
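As a quick numerical check of that maximizer, here is a minimal sketch that evaluates the likelihood above exactly for a range of $k$. The function name `lik`, the variable names, and the upper search bound of 50 are my own choices for illustration and not part of the given solution; the bound is harmless because the expression decreases for $k \ge 5$.

```python
from fractions import Fraction


def lik(k: int) -> Fraction:
    """Likelihood (k-1)(k-2)/k^3 from the given solution, as an exact fraction."""
    return Fraction((k - 1) * (k - 2), k ** 3)


# Three distinct colours were observed, so the search starts at k = 3.
# The upper bound 50 is an arbitrary cut-off chosen for illustration.
candidates = range(3, 51)
k_hat = max(candidates, key=lik)

# Print the first few values to see the likelihood rise and then fall.
for k in range(3, 8):
    print(k, lik(k), float(lik(k)))
print("k_hat =", k_hat)
```

Using `Fraction` avoids floating-point ties when comparing neighbouring values such as $k=4$ and $k=5$.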

My issue is why we are looking at the probability that the second and fourth colours are different from the first.

If we instead take the likelihood to be the probability of seeing exactly R, G, R, O given that there are $k$ colours, then the likelihood function is just $$\text{lik}(k)=\frac{1}{k^4}$$ since all sequences of colours are equally likely. This is maximized when $k=3$, as there must be at least 3 different colors.

I can sort of see that I'm going wrong somewhere as my answer is independent of the sequence we get, but where exactly am I going wrong? And why is the correct interpretation to ignore the actual sequence we get and only look at the differentiation between the colors?

EDIT: I'm trying to reimagine a question with $k$ being the maximum positive integer allowed and we see a specific sequence 3, 1, 3, 7. In that case I believe my interpretation would probably be correct. So it must have something to do with the fact that colors aren't ordered, but I'm not able to convince myself exactly what the issue is.


Solution 1:

We know in advance that there is at least one color. Thus when we observe the color of the first one we are seeing an event of probability $1,$ regardless of what $k$ is.

The larger $k$ is, the more probable it is that the second differs from the first. That probability is $(k-1)/k.$

The larger $k$ is, the less probable it is that the third is the same as the first. That probability is $1/k.$

Then the probability that the fourth differs from the two colors observed so far is $(k-2)/k.$
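Multiplying the four factors just described (a worked version of the steps above, in the same notation as the question) recovers the likelihood quoted there:

$$\text{lik}(k) = 1\cdot\frac{k-1}{k}\cdot\frac{1}{k}\cdot\frac{k-2}{k} = \frac{(k-1)(k-2)}{k^3}.$$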