Does observing life on Earth increase the probability of life elsewhere?

Say I have an implausibly large sack of balls. All I know is that the balls are numbered randomly from $1$ to $n$. For all I know, any value of $n$ (a positive integer) is equally likely.

I reach into the sack and choose a ball randomly. The ball says $42$. Does this change at all the probabilities of the values of $n$ used to number the balls where $n \geq 42$?

(Intuitively it might seem that $n$ is a low number, in that if $n$ were very, very large (say $2^{42}$), it seems implausible that we would hit on such a low number with the first ball sampled. On the other hand, if $n$ is very, very large, $42$ is as likely as any other ball to emerge.)


Another simplified version might be where the balls are either blue or red, but I don't know how many are blue or how many are red. The first ball I choose is blue. Does this increase the probability of observing further blue balls in later samples?

(Again, if there were only one blue ball, intuitively it seems unlikely that we would choose it on the first sample. On the other hand, if there were only one blue ball, that ball is as likely to emerge on the first sample as any other.)
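To test my intuition I tried a quick Monte Carlo simulation (entirely my own toy setup, so the parameters are arbitrary: a sack of $M$ balls whose blue count is uniform on $0,\dots,M$):

```python
import random

M = 100            # sack size (arbitrary)
TRIALS = 200_000

first_blue = 0
second_blue_given_first_blue = 0

for _ in range(TRIALS):
    b = random.randint(0, M)          # uniform prior on the number of blue balls
    if random.random() < b / M:       # first draw is blue
        first_blue += 1
        if random.random() < (b - 1) / (M - 1):   # second draw, without replacement
            second_blue_given_first_blue += 1

print("P(second blue | first blue) ~", second_blue_given_first_blue / first_blue)
```

I consistently get roughly $0.66$ rather than $0.5$, which suggests the first blue ball does shift later draws, but I can't see how to prove anything from this.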


It seems to be a question that crops up a lot, for example in the argument that since there's life here on Earth, it would be an improbable fluke if there were no life elsewhere. Of course that is a more complex question than just what colour the balls are, but the thrust of the argument seems to be a probabilistic one: it boils down to the idea that we know there's one blue ball in the tiny sample we've seen, so there must be lots of blue balls in the implausibly large sack to explain that.

I'm not convinced this latter argument makes sense, but on the other hand, I don't know how to reason about the problem or prove one way or the other whether seeing a blue ball early on affects the (relative) probabilities of the possible numbers of blue balls in the population. Hence I'm wondering if there's some general theorem from probability that addresses this?


This is a really interesting question. I suggest the following approach using Bayes' theorem.

Suppose there are $n$ planets in total.

Define $E_r$ = the event that there are exactly $r$ planets with life ("blue" planets). You can easily check that the events $E_0, E_1, \dots, E_n$ are mutually exclusive and exhaustive.

$A$ = the event of observing a blue planet on a single random draw.

We shall calculate $P(E_r \mid A) = \frac{P(E_r)\,P(A \mid E_r)}{\sum_i P(E_i)\,P(A \mid E_i)}$.

Assuming that the creator painted each planet blue or red independently with probability $\frac{1}{2}$ each, what is the probability that exactly $r$ of them are blue?

Clearly it is $P(E_r) = \frac{\binom{n}{r}}{2^n}$.

Also, $P(A \mid E_r) = \frac{r}{n}$.

Substituting, and noting that the denominator equals $\frac{1}{n\,2^n}\sum_{i=0}^n i\binom{n}{i} = \frac{1}{2}$, we have

$$P(E_r \mid A) = \frac{(n-1)!}{(r-1)!\,(n-r)!\,2^{n-1}} = \frac{\binom{n-1}{r-1}}{2^{n-1}}$$

Suppose that $n$ is comparatively small, say about a million. Note how negligibly small the posterior probability of there being only one blue planet ($r=1$) becomes: $P(E_1 \mid A) = \frac{1}{2^{n-1}}$.
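As a sanity check, here is a short Python sketch (my own addition, assuming the same coin-flip colouring; $n = 20$ is an arbitrary choice) comparing the direct Bayes computation with the closed form:

```python
from math import comb

n = 20  # small enough to inspect the exact arithmetic

prior = [comb(n, r) / 2**n for r in range(n + 1)]         # P(E_r)
likelihood = [r / n for r in range(n + 1)]                # P(A | E_r)
evidence = sum(p * l for p, l in zip(prior, likelihood))  # comes out to 1/2

for r in (1, 5, 10):
    bayes = prior[r] * likelihood[r] / evidence
    closed_form = comb(n - 1, r - 1) / 2**(n - 1)
    print(r, bayes, closed_form)   # the two columns agree
```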


Let's look at your first problem, the one with the numbered balls.

Well, one problem with this problem is that there is no uniform distribution over all natural numbers. However, we can consider the case where $n$ is uniformly distributed in the range $1$ to $N$, and see what statements we can make as $N$ goes to infinity.

So let's assume that we have a sack with $1\le n\le N$ numbered balls, and each value of $n$ in that range is initially equally likely; that is, we have a uniform prior for $n$. Now we draw at random (that is, again with uniform probability) a single ball from the sack, and get $42$. The question is: what is the probability distribution for $n$ after drawing that ball?

According to Bayes' theorem, we have $$P(n=n_0|\text{42 drawn}) = \frac{P(n=n_0)\,P(\text{42 drawn}|n=n_0)}{\sum_k P(n=k)\,P(\text{42 drawn}|n=k)}$$ Now $P(n=k) = \frac{1}{N}$ and $$P(\text{42 drawn}|n=k)=\begin{cases} \frac{1}{k} & k\ge 42\\ 0 & k<42 \end{cases}$$

Therefore for $n_0\ge 42$ we have $$P(n=n_0|\text{42 drawn}) = \frac{1}{n_0\sum_{k=42}^N\frac{1}{k}}$$ The sum in the denominator is independent of $n_0$; it is just the normalization constant that makes the probabilities add up to $1$. The relevant information is therefore $$P(n=n_0|\text{42 drawn}) \propto \frac{1}{n_0}$$ So small values of $n_0$ (with the restriction $n_0\ge 42$, of course) are indeed favoured, but only very weakly; in particular, the individual probabilities still go to zero as $N\to\infty$.
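For concreteness, here is a short numerical sketch of this posterior (my own illustration; the cutoff $N = 10\,000$ is arbitrary):

```python
N = 10_000   # arbitrary cutoff for the uniform prior on n
DRAWN = 42

# Unnormalized posterior: 1/n0 for n0 >= 42, zero below.
weights = [0.0] * (N + 1)
for n0 in range(DRAWN, N + 1):
    weights[n0] = 1.0 / n0
Z = sum(weights)                       # the harmonic normalization constant
posterior = [w / Z for w in weights]

print(posterior[DRAWN] / posterior[2 * DRAWN])   # = 2.0: n = 42 is exactly twice as likely as n = 84
```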

Let's calculate the expectation value of $n$: $$\langle n\rangle = \sum_{n_0=42}^N n_0\,P(n=n_0|\text{42 drawn}) = \frac{N-41}{\sum_{k=42}^N\frac{1}{k}}$$ Since the numerator grows linearly while the denominator grows only logarithmically, this diverges as $N\to\infty$. The information we get from the single ball is therefore not sufficient to cut the expectation value down to a finite value, although it now grows more slowly with $N$ (roughly like $N/\ln N$) than under the prior, where it grows linearly with $N$.
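A quick numerical check of this growth rate (again with arbitrary values of $N$):

```python
from math import log

for N in (10**3, 10**4, 10**5, 10**6):
    H = sum(1.0 / k for k in range(42, N + 1))   # sum_{k=42}^{N} 1/k
    print(N, (N - 41) / H, N / log(N))           # expectation vs. the rough scale N / ln N
```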

Note that if we draw a second ball, the posterior probabilities go as $\sim \frac{1}{k^2}$, which gives a convergent series; drawing two balls is therefore sufficient to force a well-defined limiting distribution as $N\to\infty$. The expectation value, however, then behaves like $\sum_k k\cdot\frac{1}{k^2} = \sum_k \frac{1}{k}$, which still diverges logarithmically; it takes a third ball to make the expectation value finite as well.
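Here is a sketch of the two-ball case (my own illustration, assuming for simplicity that the larger of the two drawn numbers is $42$):

```python
m = 42   # posterior support starts at the larger drawn number

for N in (10**3, 10**5, 10**6):
    Z = sum(1.0 / k**2 for k in range(m, N + 1))               # converges as N grows
    mean = sum(k * (1.0 / k**2) / Z for k in range(m, N + 1))
    print(N, Z, mean)   # Z stabilizes near 0.024, while the mean keeps growing like log N
```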