Balls are placed into 3 urns. Expected time until some urn has 100 balls.

Computation of the distribution

Let $n$ be the number of balls to draw. Let $m$ be the number of urns. Let $k$ be the target number of balls when the game stops.

You could express the probability to reach a stop in $n$ balls in terms of the probability that the number of balls in each urn is $k-1$ or less (the cumulative distribution).

The number of ways to put $n$ balls into $m$ urns is $m^n$ (with or without reaching the stop condition).
The number of ways to put $n$ balls into $m$ urns but not having reached the stop condition (that is having at most $k-1$ in each of them) can be found by enumerating over the set $S$ of vectors $\vec{k}$ (the numbers $(k_i)$ depicting the the number of balls in each $i$-th urn) that satisfy the condition $$\sum_i k_i = n \quad \text{and} \quad \forall i:0 \leq k_i < k$$ And for each of the vector $\vec{k}$ (a set of numbers $k_1,k_2,k_3$) that satisfies these conditions we compute the the number of ways to distribute the balls into the urns with those numbers which is $$\text{number of ways to put $k_i$ balls in urn $i$} = \frac{n!}{\prod_i{k_i!}}$$ Then we take the sum over all of this $$P(N \leq n) = \frac{1}{m^n}\sum_{\vec{k} \in S} \frac{n!}{\prod_{k_i\in \vec{k}}{k_i!}} $$ where the sum over $\vec{k} \in S$ means the summation over all vectors with numbers $k_i$ that satisfy the conditions and the product over $k_i \in \vec{k}$ means the product with all $k_i$ in the $\vec{k}$.

See below for an implementation in R code:

output image from code

# computation
n <- 99
sum <- rep(0,3*n+1)
for (k1 in 0:n) {
  for (k2 in 0:n) {
    for (k3 in 0:n) {
      t = (k1+k2+k3)
      sum[t+1] = sum[t+1]+exp(lfactorial(t)-lfactorial(k1)-lfactorial(k2)-lfactorial(k3))
    }
  }
}
x <- c(0:(3*n))
Xcum <- c(sum/3^x,0)

# simulation
set.seed(1)

draw <- function() {
  s <- sample(c(1:3),size = 300, replace=TRUE)
  min(which((cumsum(s==1)==100) | (cumsum(s==2)==100) | (cumsum(s==3)==100)))
}
q <- replicate(10^5,draw())

# computation using beta function

drn <- function(n,k) {
  a <- max(0,n-2*k+1)
  b <- min(k-1,n-k)
  choose(n-1,k-1) * 2^(n-k) / 3^(n-1) *
      ( zipfR::Ibeta(0.5,n-k-b+1,b+1)/beta(n-k-b+1,b+1) - 
        zipfR::Ibeta(0.5,n-(k-1)-(a-1),(a-1)+1)/beta(n-(k-1)-(a-1),(a-1)+1) )
  #choose(n-1,k-1) * 2^(n-k) / 3^(n-1) * (pbinom(b,n-k,0.5)-pbinom(a-1,n-k,0.5))
}
drn <- Vectorize(drn)


#plotting both together

h <- hist(q, breaks=c(0:298)+0.5, xlim=c(200,300),
          xlab = "N", ylab = "probability", freq = FALSE, main="")
lines(1:298,-diff(Xcum),col=2)
lines(c(100:298),drn(c(100:298),100),col=3)

multinomial distribution

You can view this as related to the multinomial distribution which has the pdf

$$\frac {n!}{k_1! k_2! ... k_m!} p_1^{k_1} p_2^{k_2} ... p_m^{k_m} $$

which becomes for equal $p_i = 1/m $ the following

$$\frac {1}{m^n}\frac {n!}{k_1! k_2! ... k_m!} $$

which shows similarity with the expresssion before. Then the probability that for $n$ draws you did not reach 100 yet is equal to the probability that in 100 draws each $k_i<100$. And you can see the computation of you probability density as the computation of the CDF for the multinomial distribution.

Expression in terms of regularized incomplete beta function

For the case of three urns we may write an explicit expression for the probability in terms of the regularized incomplete beta function.

The probability that there are in the $n$-th draw $k$ balls in the first urn and less than $k$ in the others is, is equal to the 1/3 of the probability that there are in the $n-1$ draw $l= k-1$ balls in the first urn and equal or less than $l$ in the others is :

$$\begin{array}{rcrl} P_{k_1=l=k-1,k_2 \leq l,k_3 \leq l \vert n-1} &=& &\sum_{a \leq k_2 \leq b} \frac {1}{3^{n-1}}\frac {(n-1)!}{l! k_2! (n-1-l-k_2)!} \\ & = & \frac{(n-1)!}{l! 3^{n-1}} &\sum_{a \leq k_2 \leq b} \frac {1}{k_2! (n-1-l-k_2)!} \\ & = & {{n-1}\choose{l}} \frac{2^{n-1-l}}{3^{n-1}}& \sum_{a \leq k_2 \leq b} \underbrace{{n-1-l\choose{k_2}} \frac{1}{2^{n-1-l}}}_{\text{this is a binomial distribution}} \\ & = & {{n-1}\choose{k-1}} \frac{2^{n-k}}{3^{n-1}} & \left( I_{1/2}(n-k-b+1,b+1) - I_{1/2}(n-k-a+2,a) \right) \end{array}$$

with $a = max(0,n-2k+1)$ and $b = min(k-1,n-k)$

Computation of the expectation value

In the first part we computed $P(n>k) = 1-P(n\leq k)$. To obtain the mean you can sum over all of these. $\mu = \sum 1-P(n\leq k)$. This will give:

$$\sum_{k_1=0}^{99}\sum_{k_2=0}^{99}\sum_{k_3=0}^{99} \frac{1}{3^{k_1+k_2+k_3}} \frac{(k_1+k_2+k_3)!}{k_1!k_2!k_3!} = 274.9186 $$

The expected time can be expressed in terms of the incomplete gamma function as follows (inspired by this paper and comments here):

In general: We want the expected value of the time to wait $T$ until one of the $3$ urns contains $n$ ($=100$) balls. Then

$$E_{n}[T] = \sum_{t=1}^\infty P(T\ge t) = \sum_{t=0}^\infty p_{n}(t) \tag1$$

where $p_{n}(t) $ is the probability that after $t$ rounds ($t$ balls) all $3$ urns have less than $n$ balls. But this is equivalent to

$$ \sum_{x=0}^{n-1}\sum_{y=0}^{n-1}\sum_{z=0}^{n-1} \frac{1}{3^{x+y+z}} \frac{(x+y+z)!}{x! \, y! \, z!} \tag2$$

Further, we use a property of the (upper) incomplete gamma function:

$$\begin{align} \left( \frac{\Gamma(n,a)}{\Gamma(n)} \right)^3 &= \left( e^{-a} \sum_{r=0}^{n-1}\frac{a^r}{r!} \right)^3 \\&= e^{-3a} \sum_{x=0}^{n-1}\sum_{y=0}^{n-1}\sum_{z=0}^{n-1}\frac{a^{x+y+z}}{x! \, y! \, z!} \tag3 \end{align}$$

Integrating and using $\int_0^\infty \exp(-3a) a^p da = p!/3^{p+1}$ we get

$$ \int_0^\infty \left( \frac{\Gamma(n,a)}{\Gamma(n)} \right)^3 da= \sum_{x=0}^{n-1}\sum_{y=0}^{n-1}\sum_{z=0}^{n-1} \frac{1}{3^{x+y+z+1}} \frac{(x+y+z)!}{x! \, y! \, z!} \tag4$$

and finally

$$E_{n}[T] = 3 \int_0^\infty \left( \frac{\Gamma(n,a)}{\Gamma(n)} \right)^3 da \tag5$$

More in general, if we have $d$ urns:

$$E_{n,d}[T] = d \int_0^\infty \left( \frac{\Gamma(n,a)}{\Gamma(n)} \right)^d da \tag6$$

This can be evaluated numerically, I don't know about asymptotics (asked here).

Empirically , it seems that $E = 3 n - \beta \sqrt{n} +O(1)$ where $\beta \approx 2.5$

And here's a numerical recursive computation (in Java):

public class MSE3368225 {

    static Double[] cache = new Double[(1<<21)];

    static double ex(int x, int y, int z) {
        if (x == 0 || y == 0 || z == 0)
            return 0;
        if (x > 127 || y > 127 || z > 127) 
            throw new RuntimeException("Out of range");
        int k = (x << 14) | (y << 7) | z; // packs three variables in one integer
        Double d = cache[k];
        if (d == null) {
            d = 1 + (ex(x - 1, y, z) + ex(x, y - 1, z) + ex(x, y, z - 1)) / 3.0;
            cache[k] = d;
        }
        return d;
    }

    public static void main(String[] args) {
        System.out.println(ex(100, 100, 100));
    }
}

This solves the recursion

$$g(x,y,z)=\begin{cases} 0 & \text {if $x=0$ or $y=0$ or $z=0$}\\ 1+ \frac13\left(g(x-1,y,z)+g(x,y-1,z)+g(x,y,z-1)\right) & \text{elsewhere} \end{cases} $$

where $g(x,y,z)$ is the expected remaining time, when there remains $(x,y,z)$ balls for each urn.

The result is $E_{100}[T]=274.9186440$

Some values

  n     E
  2  2.888889 
  3  5.049383 
  4  7.348270 
  5  9.734204
 10  22.34468
 20  48.99126
 50  132.3676
100  274.9186

If the sum of eigenvectors is an eigenvector, then they all correspond to the same eigenvalue

If $X$ is an order topology and $Y \subset X$ is closed, do the subspace topology and order topology on $Y$ coincide?

Classify finitely generated modules over the ring $\mathbb{C}[\epsilon]$ where $\epsilon^2=0$

When $L^p \subset L^q$ for $p <q$.

Angle between two planes in four dimensions

Computing $ \int_0^{\infty} \frac{ \sqrt [3] {x+1}-\sqrt [3] {x}}{\sqrt{x}} \mathrm dx$

If $ax^2-bx+c=0$ has two distinct real roots lying in the interval $(0,1)$ $a,b,c$ belongs to natural prove that $\log_5 {abc}\geq2$

Does there exist a set of exactly five positive integers such that the sum of any three distinct elements is prime?

Coordinate ring of the unit circle is never a UFD?

Define $I_n=\int_0^1\frac{x^n}{\sqrt{x^2+1}}dx$ for every $n\in\mathbb{N}$. Prove that $\lim_{n\to\infty}nI_n=\frac{1}{\sqrt 2}$.

Applications of additive version of Hilbert's theorem 90

Hardy's inequality again

Balls are placed into 3 urns. Expected time until some urn has 100 balls.

Computation of the distribution

multinomial distribution

Expression in terms of regularized incomplete beta function

Computation of the expectation value

Related

Recent Posts