Properties of the Minimum of Two Poisson Random Variables

I stumbled upon the following problem in my research. We are trying to analyze $Z=\min(X,Y)$, where $X \sim Pois(p\lambda)$ and $Y\sim Pois((1-p)\lambda)$ are independent. Note that the two expectations are related but not, in general, equal.

What we are most interested in is a closed-form expression for $\mathbb{E}Z$, or, alternatively, an expression simple enough to let us prove that the maximum of $\mathbb{E}Z$ is attained at $p=\frac{1}{2}$.

I managed to find very little literature on the subject. In some places this scenario is called a "Poisson Race", but I couldn't find anything relevant to my question.

I tried to go the manual way: \begin{equation} \begin{split} \mathbb{E} Z & = \sum_{n\geq 1} \Pr(\min(X,Y) \geq n) \\ & = \sum_{n\geq 1} \Pr(X\geq n\ \text{and}\ Y\geq n) \\ & = \sum_{n\geq 1} \Pr(X\geq n)\cdot \Pr(Y\geq n) \\ & = \sum_{n\geq 1}\Bigg[\Bigg(\sum_{i\geq n} \frac{(p \lambda)^i e^{-p\lambda}}{i!} \Bigg)\Bigg(\sum_{i\geq n} \frac{((1-p) \lambda)^i e^{-(1-p)\lambda}}{i!} \Bigg)\Bigg] \\ & = e^{-\lambda}\sum_{n\geq 1}\Bigg[\Bigg(\sum_{i\geq n} \frac{(p \lambda)^i}{i!} \Bigg)\Bigg(\sum_{i\geq n} \frac{((1-p) \lambda)^i }{i!} \Bigg)\Bigg] \\ & = e^{-\lambda}\sum_{n\geq 1}\Bigg[\Bigg(e^{p\lambda}-e_{n-1}(p\lambda) \Bigg)\Bigg(e^{(1-p)\lambda} - e_{n-1}((1-p)\lambda) \Bigg)\Bigg] \\ \end{split} \end{equation}

But this didn't lead to any relatively simple terms. I also tried looking into the Taylor partial sums of $e^x$ and Gamma functions $\Gamma(x)$, but again with no result.

What is obvious, due to the symmetry of the function under $p \mapsto 1-p$, is that the max is attained at $p=\frac{1}{2}$. Does anyone see a way to prove this without differentiating once or twice and doing all the dirty work?


$e_n(x) = \sum_{k=0}^{n} \frac{x^k}{k!}$ is the Exponential Sum Function.
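
The tail-sum identity in the derivation above is easy to check numerically. Here is a minimal R sketch (the truncation at 200 terms, the test values lamb = 10 and p = 0.3, and the variable names are arbitrary choices, not from the question):

lamb = 10; p = 0.3
n = 0:200       # truncation point; both Poisson tails are negligible out here
# E[Z] = sum_{n>=1} P(X >= n) P(Y >= n) = sum_{n>=0} P(X > n) P(Y > n)
mu.tail = sum( ppois(n, p*lamb, lower.tail=FALSE) *
               ppois(n, (1-p)*lamb, lower.tail=FALSE) )
# direct computation of E[min(X,Y)] from the joint pmf, for comparison
k = 0:200
mu.direct = sum( outer(k, k, pmin) * outer(dpois(k, p*lamb), dpois(k, (1-p)*lamb)) )
c(mu.tail, mu.direct)   # the two values agree to machine precision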


Solution 1:

$\newcommand{\bbx}[1]{\,\bbox[8px,border:1px groove navy]{\displaystyle{#1}}\,} \newcommand{\braces}[1]{\left\lbrace\,{#1}\,\right\rbrace} \newcommand{\bracks}[1]{\left\lbrack\,{#1}\,\right\rbrack} \newcommand{\dd}{\mathrm{d}} \newcommand{\ds}[1]{\displaystyle{#1}} \newcommand{\expo}[1]{\,\mathrm{e}^{#1}\,} \newcommand{\ic}{\mathrm{i}} \newcommand{\mc}[1]{\mathcal{#1}} \newcommand{\mrm}[1]{\mathrm{#1}} \newcommand{\pars}[1]{\left(\,{#1}\,\right)} \newcommand{\partiald}[3][]{\frac{\partial^{#1} #2}{\partial #3^{#1}}} \newcommand{\root}[2][]{\,\sqrt[#1]{\,{#2}\,}\,} \newcommand{\totald}[3][]{\frac{\mathrm{d}^{#1} #2}{\mathrm{d} #3^{#1}}} \newcommand{\verts}[1]{\left\vert\,{#1}\,\right\vert}$ \begin{align} \mathbb{E}\bracks{Z} & = \mathbb{E}\bracks{\min\braces{X,Y}} = \mathbb{E}\bracks{X + Y - \verts{X - Y} \over 2} = {1 \over 2}\,\mathbb{E}\bracks{X} + {1 \over 2}\,\mathbb{E}\bracks{Y} - {1 \over 2}\,\mathbb{E}\bracks{\verts{X - Y}} \\[5mm] & = {1 \over 2}\,p\lambda + {1 \over 2}\,\pars{1 - p}\lambda - {1 \over 2}\,\mathbb{E}\bracks{\verts{X - Y}} = {1 \over 2}\lambda - {1 \over 2}\,\color{#66f}{\mathbb{E}\bracks{\verts{X - Y}}} \end{align}
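
For completeness, the first step uses the elementary identity $\min\{a,b\} = \frac{a + b - \vert a - b\vert}{2}$, which can be checked case by case:

\begin{align} a \le b &\implies \frac{(a + b) - (b - a)}{2} = a = \min\{a,b\} \\[2mm] a > b &\implies \frac{(a + b) - (a - b)}{2} = b = \min\{a,b\} \end{align}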


With $\ds{x \equiv p\lambda}$ and $\ds{y \equiv \pars{1 - p}\lambda}$: \begin{align} \color{#66f}{\mathbb{E}\bracks{\verts{X - Y}}} & = \sum_{m = 0}^{\infty}{x^{m}\expo{-p\lambda} \over m!} \sum_{n = 0}^{\infty}{y^{n}\expo{-\pars{1 - p}\lambda} \over n!}\,\verts{m - n} \\[5mm] & = \expo{-\lambda} \sum_{m = 0}^{\infty}\sum_{n = 0}^{m}{x^{m}\,y^{n} \over m!\,n!}\pars{m - n} + \expo{-\lambda} \sum_{m = 0}^{\infty}\sum_{n = m}^{\infty}{x^{m}\,y^{n} \over m!\,n!} \pars{n - m} \\[5mm] & = \expo{-\lambda} \sum_{n = 0}^{\infty}\sum_{m = n}^{\infty}{x^{m}\,y^{n} \over m!\,n!} \pars{m - n} + \expo{-\lambda} \sum_{m = 0}^{\infty}\sum_{n = m}^{\infty}{x^{m}\,y^{n} \over m!\,n!} \pars{n - m} \\[5mm] & = \expo{-\lambda}\sum_{n = 0}^{\infty}\sum_{m = n}^{\infty} {x^{m}\,y^{n} + x^{n}\,y^{m} \over m!\,n!}\pars{m - n} = \expo{-\lambda}\sum_{n = 0}^{\infty}\sum_{m = 0}^{\infty} {x^{m + n}\,y^{n} + x^{n}\,y^{m + n} \over \pars{m + n}!\,n!}m \\[5mm] & = \expo{-\lambda}\sum_{m = 0}^{\infty}m\pars{x^{m} + y^{m}} \sum_{n = 0}^{\infty}{\pars{xy}^{n} \over \pars{m + n}!\,n!} \\[5mm] & = \expo{-\lambda}\sum_{m = 0}^{\infty}m\bracks{\pars{x \over y}^{m/2} + \pars{y \over x}^{m/2}} \,\mrm{I}_{m}\pars{2\root{xy}} \end{align}

where $\ds{\,\mrm{I}_{\nu}}$ is the Modified Bessel Function of the First Kind.

Our result, so far, is given by \begin{align} \mathbb{E}\bracks{Z} & = \mathbb{E}\bracks{\min\braces{X,Y}} \\[5mm] & = {1 \over 2}\lambda - {1 \over 2}\,\expo{-\lambda} \sum_{m = 0}^{\infty}m \bracks{\pars{p \over 1 - p}^{m/2} + \pars{1 - p \over p}^{m/2}} \,\mrm{I}_{m}\pars{2\root{p\bracks{1 - p}}\lambda} \end{align}
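
(Since $X - Y$ of two independent Poissons has a Skellam distribution, the series is just $\mathbb{E}|X - Y|$ written out through the Skellam pmf.) It converges quickly and is easy to evaluate with R's built-in besselI; a minimal sketch (test values and truncations are arbitrary choices) comparing it with a direct computation of $\mathbb{E}\min(X,Y)$:

lamb = 10; p = 0.3
x = p*lamb; y = (1-p)*lamb
m = 1:60      # the m = 0 term vanishes; later terms underflow to 0
EabsXY = exp(-lamb) * sum( m * ((x/y)^(m/2) + (y/x)^(m/2)) * besselI(2*sqrt(x*y), m) )
EZ = lamb/2 - EabsXY/2
# direct computation from the joint pmf, for comparison
k = 0:200
EZ.direct = sum( outer(k, k, pmin) * outer(dpois(k, p*lamb), dpois(k, (1-p)*lamb)) )
c(EZ, EZ.direct)   # should agree to near machine precision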

Solution 2:

Comment. I played with this without getting anything nearly as elegant as @FelixMartin's Answer (+1). I did a quick simulation and found that the relationship between $\mu = E(Z)$ and $p$ depends on the value of $\lambda.$ (In view of @Misha's result, I had initial hopes $\lambda$ might not be crucial, but that seemed counter-intuitive.) For what they may be worth, I post graphs of $\mu/\lambda$ against $p$ for six values of $\lambda.$ (The simulated values should be accurate within the resolution of the plots.)

[Figure: six panels plotting $\mu/\lambda$ against $p$, one for each $\lambda \in \{0.5, 1, 10, 25, 100, 1000\}$.]

Addendum. Crude R code is provided below, as requested in the comments. There are two alternative lines beginning z = replicate.... The one with pmin was my initial method. The one with abs was to verify that @FelixMartin's formula gives the same results as mine. Put # at the beginning of the line you want to omit. (Increase 10^3 to 10^4 and 5000 to 10000 for smaller simulation error; slower and not necessary for graphs.) Of course, simulation is for visualization and verification only.

par(mfrow=c(2,3))  # enables six panels per plot
lamb = c(.5, 1, 10, 25, 100, 1000); m=6
for(j in 1:m)      # outer loop for 6 values of lambda
  {
  lam = lamb[j]
  p=seq(.0, 1, by=.05); B = length(p); mu=numeric(B)
  for(i in 1:B)    # inner loop for B values of p
    {   
    pp=p[i]
    z = replicate( 10^3, lam/2 - .5*mean(abs(rpois(5000,pp*lam)-rpois(5000,(1-pp)*lam))) )
    # z = replicate( 10^3, mean(pmin(rpois(5000,pp*lam),rpois(5000,(1-pp)*lam))) )
    # 2nd 'replicate' for z can be substituted for first
    mu[i] = mean(z) }
                   # end inner loop
  plot(p, mu/lam, pch=19, ylim=c(0,.5), main=paste("lambda =",lamb[j]))  }                
                   # end outer loop
par(mfrow=c(1,1))  # returns to default single-panel plot

Solution 3:

This answer is a continuation of the one by @MishaLavrov. Specifically, I prove the following:

Claim: $\forall n: E[Z|X+Y=n]$ (considered as a function of $p$) is maximized at $p=1/2$.

This allows us to conclude that $E[Z] = \sum_n E[Z|X+Y=n] P(X+Y=n)$ is also maximized at $p=1/2$: since $X+Y \sim Pois(\lambda)$ regardless of $p$, the weights $P(X+Y=n)$ do not depend on $p$.

Several people have pointed out that this result is "obvious", and I agree. :) So my proof might be a bit more detailed than usual, because the whole point is to be more rigorous than perhaps customary. Apologies in advance for the tediousness!

Also, the proof is a symmetry argument, and does not come close to obtaining a closed form for general $p$ (which can then be differentiated, etc).


Consider a "trinomial" experiment: You roll an unfair $3$-sided die with faces $a,b,c$ and probabilities $p, ({1\over 2} - p), {1\over 2}$ respectively, where $p \in [0, {1\over 2}]$. You roll this die $n$ times and record the results as a sequence $\omega \in \{a,b,c\}^n$, e.g. $\omega = ccabacbcaac$. In other words $\Omega = \{a,b,c\}^n$ is the sample space and each $\omega$ is a sample point.

(Preview: the symmetry exploited will be changing every $a$ to $c$ and vice versa, but we need some preliminaries before we get there.)

Let $A,B,C$ be random variables denoting the number of $a,b,c$ (respectively) in $\omega$. Next we define/identify:

  • $X = A \sim Bin(n,p)$

  • $Y = B+C = n-X \sim Bin(n,1-p)$

  • $Z = \min(X,Y) = \min(A,B+C)$ is the value of interest

  • $X' = A+B \sim Bin(n,1/2)$

  • $Y' = C = n-X' \sim Bin(n,1/2)$

  • $Z' = \min(X',Y') = \min(A+B,C)$ is what $Z$ would have been if $p =1/2$

  • $W = Z' - Z$

  • $D = C - A$

Claim: $E[W] \ge 0\ \forall p \in [0,1/2]$, with equality iff $p = 1/2$ (for $n \ge 2$; when $n \le 1$ we have $Z = Z' = 0$, so $W \equiv 0$ and equality is trivial).

Corollary: Above claim $\implies E[Z'] \ge E[Z] \ \forall p \in [0,1/2]$, i.e. $E[Z]$ is maximized at $p=1/2$.

Proof: Partition $\Omega$ into $5$ events based on $D = C - A$:

  • $E_0: C - A = 0:$ in this case $W = 0$

  • $E_1: C - A > B:$ in this case $W = (A+B) - A = B$

  • $E_2: C - A < -B:$ in this case $W = C - (B+C) = -B$

  • $E_3: B \ge C - A > 0:$ in this case $W = C - A = D$ (Note: this event $\implies B>0$)

  • $E_4: -B \le C - A < 0:$ in this case $W = C - A = D$ (Note: this event $\implies B>0$)

(As a quick check of the case analysis: with $n=5$ and $\omega = ccabb$ we have $A=1, B=2, C=2$, so $D = 1$ and $B \ge D > 0$ puts $\omega \in E_3$; indeed $Z = \min(1,4) = 1$, $Z' = \min(3,2) = 2$, and $W = Z' - Z = 1 = D$.)

Now the symmetry: consider the mapping $f:\Omega \rightarrow \Omega$ where for each sample point, i.e. every sequence $\omega, f()$ changes every $a$ into $c$ and every $c$ into $a$. E.g. $f(ccabcabc) = aacbacba$. Clearly, $f$ is bijective and its own inverse. More importantly:

  • $\forall \omega: A(\omega) = C(f(\omega)), C(\omega) = A(f(\omega))$

  • $\forall \omega: D(\omega) = -D(f(\omega))$ since $D = C-A$

  • Defining, as usual, $f(E_j)$ as the range $\{f(\omega) | \omega \in E_j\}$, then we have: $f(E_0) = E_0, f(E_1) = E_2, f(E_2) = E_1, f(E_3) = E_4, f(E_4) = E_3$ since the event definitions involve ranges that are symmetric about $0$.

  • $\forall \omega: W(\omega) = -W(f(\omega))$. This takes a little more algebra to see:

    • For $E_0: W=D=0$

    • For $E_1, E_2:$ the values are $W = \pm B$; since $B(\omega) = B(f(\omega))$ and $f(E_1) = E_2, f(E_2) = E_1$, the sign of $W$ flips under $f$.

    • For $E_3, E_4: W = D$ and we have $W(\omega) = D(\omega) = -D(f(\omega)) = -W(f(\omega))$

Now by definition, $E[W] = \sum_{\omega \in \Omega} W(\omega) P(\omega) = \sum^4_{j=0} \sum_{\omega \in E_j} W(\omega) P(\omega)$. The $E_0$ term contributes nothing and can be dropped since in that case $W = 0$. Thus:

$$E[W] =\sum_{\omega \in E_1} W(\omega) P(\omega) + \sum_{\omega \in E_2} W(\omega) P(\omega) + \sum_{\omega \in E_3} W(\omega) P(\omega) + \sum_{\omega \in E_4} W(\omega) P(\omega)$$

Next, consider the first pair of events:

$$\sum_{\omega \in E_1} W(\omega) P(\omega) + \sum_{\omega \in E_2} W(\omega) P(\omega)$$

$$= \sum_{\omega \in E_1} W(\omega) P(\omega) + \sum_{\omega \in E_1} W(f(\omega)) P(f(\omega))$$

$$= \sum_{\omega \in E_1} W(\omega) P(\omega) + \sum_{\omega \in E_1} -W(\omega) P(f(\omega))$$

$$= \sum_{\omega \in E_1} W(\omega) \big(P(\omega) - P(f(\omega))\big) = (**)$$

Finally we get to the root of the symmetry. Consider any specific sequence $\omega$ where the random variables $A,B,C$ take values $n_A, n_B, n_C$. Then:

  • $P(\omega) = p^{n_A} \left({1\over 2} - p\right)^{n_B} \left({1\over 2}\right)^{n_C}$ (No need for the multinomial ${n \choose n_A, n_B, n_C}$ because $\omega$ is one specific sequence.)

  • $P(f(\omega)) = p^{n_C} \left({1\over 2} - p\right)^{n_B} \left({1\over 2}\right)^{n_A}$

  • If $n_C > n_A$ then $P(\omega) / P(f(\omega)) = ({1 \over 2} / p)^{n_C - n_A} \ge 1$, i.e. $P(\omega) \ge P(f(\omega))$, with equality iff $p = 1/2$.

Now for $\omega \in E_1$, we have $W(\omega) = B \ge 0$; and since $\omega \in E_1$ forces $n_C - n_A > n_B \ge 0$, the previous bullet gives $P(\omega) - P(f(\omega)) \ge 0$. Hence $(**) \ge 0$.

The same argument applies to the pair $E_3, E_4$; there $W = D > 0$ strictly on $E_3$, which is what forces $E[W] > 0$ when $p < 1/2$ (and $n \ge 2$). To conclude, $E[W] = \sum^4_{j=0} \sum_{\omega \in E_j} W(\omega) P(\omega) \ge 0$ with equality iff $p = 1/2$.
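
For the skeptical reader, the claim is also easy to confirm by exact enumeration. Below is a minimal R sketch (the choice $n = 6$ and the grid of $p$ values are arbitrary) that computes $E[W]$ by summing over the trinomial distribution of $(A,B,C)$ instead of over all $3^n$ sequences:

n = 6
EW = function(p) {        # exact E[W] = E[Z' - Z] for the n-roll experiment
  tot = 0
  for (A in 0:n) for (B in 0:(n-A)) {
    C = n - A - B
    prob = dmultinom( c(A, B, C), prob=c(p, 1/2 - p, 1/2) )
    tot = tot + ( min(A+B, C) - min(A, B+C) ) * prob   # W = Z' - Z
  }
  tot
}
round( sapply(c(0, .1, .25, .4, .5), EW), 4 )   # nonnegative, and 0 at p = .5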