You see a route 14 bus on the moon. What is the most likely number of bus routes on the moon?

This question was asked on a forum. Many argued that the answer is 14, since the probability of seeing a route 14 bus is greatest in that case; I argued that they were working backwards. My claim is that this question is invalid as there is no method to determine the probability of number of bus routes.

I'm looking for clarification as to the right answer (with proof, obviously).


Solution 1:

Your question touches on fundamental issues of the interpretation of probabilities. You won't get a proof, at least not in the mathematical sense, since this is not a mathematical question but an interpretational question.

Basically, there are two popular interpretations of probability theory, the frequentist one and the Bayesian one. In the frequentist interpretation, probabilities specify the relative frequencies of an event if you perform the "same" experiment many times. Obviously you can't create a large number of moons and count the number of bus routes on each, or, as you put it, "there is no method to determine the probability of number of bus routes", and thus in this interpretation there is no such thing as that probability. As William Feller put it:

"There is no place in our system for speculations concerning the probability that the sun will rise tomorrow."

In the Bayesian interpretation, probability theory allows us to reason about uncertain events, and more specifically to rationally update our assessments of how likely events are when new information comes in. In this framework, you always need some prior assessment of likelihoods, and then the theory tells you how to adjust that using the data you observe.

For some sorts of events, such as rolling a die, there are rational grounds for choosing prior probabilities (the same probability for each number). In other cases, such as bus routes on the moon, there isn't one obvious set of prior probabilities, but still reasoning about how prior assessments of likelihood should rationally be modified by incoming data can be useful.

In the present case, any prior assessment of the likelihood of various numbers of bus routes on the moon would presumably have exhibited a very dominant spike at $0$ and then a very low and rather flat tail for all other numbers. The Bayesian probability update requires us to multiply the a priori probability for each possible number of bus routes by the conditional probability that you would have observed a route $14$ bus if there were that many bus routes, and then normalize the resulting probabilities so that they sum to $1$, which gives the a posteriori probabilities. If we follow the assumption you seem to be making in the question, that the bus routes are numbered sequentially beginning with $1$ and that we have an equal probability of encountering a bus from any one of the existing bus routes, then the conditional probability of observing a route $14$ bus given $n$ bus routes is $0$ for $n\le13$ and $1/n$ for $n\ge14$.
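
In symbols, the update just described can be written as follows. The notation is introduced here purely for illustration: $\pi(n)$ stands for whatever prior probability one assigns to $n$ bus routes, and $L(n)$ for the conditional probability of seeing a route $14$ bus given $n$ routes, under the sequential-numbering and equal-encounter assumptions above.

$$P(n \mid \text{route } 14 \text{ seen}) = \frac{\pi(n)\,L(n)}{\sum_{m\ge0} \pi(m)\,L(m)}, \qquad L(n) = \begin{cases} 0 & \text{if } n \le 13, \\ 1/n & \text{if } n \ge 14. \end{cases}$$

Note that the spike of the prior at $0$ drops out entirely, since $L(0)=0$; only the shape of the tail from $14$ onwards matters.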

Now although there is no way to agree on any particular prior in the present case, it seems rather plausible that apart from the dominant spike at $0$, the prior would have been relatively flat. That is, there was no strong a priori reason to favour, say, the number $15$ over the number $14$, and so the ratio of the prior probability for $15$ bus routes to the prior probability for $14$ bus routes would not have exceeded $15/14$. That is exactly the ratio by which the observation of a route $14$ bus raises the probability for $14$ bus routes relative to the probability for $15$ bus routes. So although there's no proof and no one right answer, we can nevertheless plausibly argue that most sensible priors would not favour any number $n>14$ over the number $14$ by a factor of $n/14$ or more, and thus, in a Bayesian framework, the number of bus routes with the highest a posteriori probability would be $14$.
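
To make this concrete, here is a minimal numerical sketch in Python. Everything specific in it is an assumption chosen for illustration, not something the question fixes: the cutoff `MAX_ROUTES`, the weight `SPIKE_AT_ZERO` placed on zero routes, and the perfectly flat tail. The function name `posterior` is likewise just a label for this sketch.

```python
from fractions import Fraction

def posterior(prior, observed_route):
    """Bayesian update; `prior` maps n (number of routes) to its prior probability."""
    def likelihood(n):
        # Assumes routes are numbered 1..n and each is equally likely to be seen.
        return Fraction(1, n) if n >= observed_route else Fraction(0)

    unnormalized = {n: p * likelihood(n) for n, p in prior.items()}
    total = sum(unnormalized.values())
    return {n: w / total for n, w in unnormalized.items()}

# Illustrative prior (an assumption): a dominant spike at 0 and a flat tail
# spread evenly over 1..MAX_ROUTES.
MAX_ROUTES = 1000
SPIKE_AT_ZERO = Fraction(999, 1000)
tail = (1 - SPIKE_AT_ZERO) / MAX_ROUTES
prior = {0: SPIKE_AT_ZERO, **{n: tail for n in range(1, MAX_ROUTES + 1)}}

post = posterior(prior, observed_route=14)
most_likely = max(post, key=post.get)
print(most_likely)  # 14
```

With a different tail, for instance one that rises faster than linearly in $n$ beyond $14$, the maximum would move away from $14$, which is precisely why the conclusion depends on the prior being reasonably flat.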