Covariance between an exponential random variable and the maximum of several exponential random variables

Suppose $X_1, \ldots, X_n$ are i.i.d. exponential random variables with parameter $\lambda$, and let $Z = \max\{X_1, \ldots, X_6\}$, so that $n = 6$. My goal is to find $\mathrm{cov}(X_1, Z)$.

I already know the c.d.f., p.d.f., and expectations of each $X_i$ and of $Z$, and I think I even found the joint p.d.f. of $Z$ and $X_1$ as:

$$f_{Z,\ X_1}(z,x_1) = \begin{cases} 5\lambda^2 e^{-\lambda x_1}(1-e^{-\lambda z})^4 & z \geq x_1 \geq 0 \\ 0 & \mathrm{otherwise} \end{cases}. $$ (This joint distribution is not correct, see below).

I found this by using $f_{Z,\ X_1} = f_{X_1} \ f_{Z|X_1}$. But now I am sort of stuck because the integral $$E[X_1 Z] = \iint_{\mathbb{R}^2} x_1 z \ f_{Z,\ X_1} \ \mathrm{d}A$$ needed for $\mathrm{cov}(X_1,Z)$ gets cumbersome (and on my first attempt did not converge). Is there some slick thing to notice or do that I am missing? Note I am not yet confident with generating functions and moments.


Solution 1:

The divergence of the integral as stated in the question is due to the presence of an atom (discrete point mass), as pointed out in BGM's comment. There are "proper" ways to deal with the point mass, but this answer will present the common technique that allows for an elementary solution.

Most of the analysis will be in the first section, and most of the calculation is left to the second. Two appendices come after the code block (numerical verification) to provide technical details.

Strategy and Analysis

The analysis presented here applies to general $n$ and not just the requested $n = 6$.

Denote by $W$ the largest of the $n-1$ variables excluding $X_1$, in contrast to $Z$ being the max of all $n$: $$W \equiv \max\{X_2,\,X_3,\,\ldots,\, X_n\} $$ Clearly $W$ is independent of $X_1,\,$ and its marginal density $f_W$ has the same form as $f_Z$ except with $n-1$ in place of $n$. The distribution of $Z$ as the max of $n$ iid exponentials is well known, as are its moments. The same goes for $W$.

Knowing that $X_1$ is interchangeable with any of the $n-1$ other $X_i$, we have the "$\frac1n$ and $\frac{n-1}n$" split that allows $W$ to come in. \begin{align*} &\phantom{{}={}} E\bigl[ ZX_1] \\ &= E\bigl[ZX_1~\big|\, Z = X_1 \bigr] \Pr\{Z = X_1\} + E\bigl[ZX_1~\big|\, Z > X_1\bigr] \Pr\{Z > X_1\} \\ &= E\left[Z^2~\middle|\, Z = X_1\right] \frac1n + E\bigl[ZX_1~\big|\, Z > X_1 \bigr] \frac{n-1}n \tag{1.a} \label{Eq_conditional_split} \\ &= E\left[Z^2 \right] \frac1n + E\bigl[WX_1~\big|\, W > X_1 \bigr] \frac{n-1}n \tag{1.b} \label{Eq_conditional_split_transfer} \end{align*} With the moments of $Z$ and $W$ well known, at this point the analysis is already done, and one may wish to skip directly to the next section, Calculation and Results. Allow me to elaborate on the intricacies below.

Going from the second line Eq\eqref{Eq_conditional_split} to the third line Eq\eqref{Eq_conditional_split_transfer} is trivial yet at the same time profound.

The first term $E\left[Z^2~\middle|\, Z = X_1 \right] = E\left[Z^2 \right]$ is effectively unconditional, because $Z = X_i$ for exactly one index $i \in \{1, \ldots, n\}$ almost surely (ties have measure zero) while all $X_i$ play the same role. To be specific, due to $E\left[Z^2~\middle|\, Z = X_i \right] = E\left[Z^2 \,\middle|\, Z = X_j \right]$ for all $i \neq j$, we have $$\begin{align} E\left[Z^2\right] &= \sum_{i=1}^n \frac1n E\left[Z^2\,\middle|\, Z = X_i \right] \\ &= n \cdot \frac1n E\left[Z^2~\middle|\, Z = X_1 \right] = E\left[Z^2~\middle|\, Z = X_1 \right]\end{align}$$ By the same line of reasoning, we in fact have the identical distribution $\left(Z^2~\middle|\, Z = X_1 \right) \overset{d}{=} Z^2$. Note that this is conditioning on the event $Z = X_1$ and NOT conditioning on a given value of $Z = X_1 = x_1$.

As for the second term with $X_1 < Z$ going from Eq\eqref{Eq_conditional_split} to Eq\eqref{Eq_conditional_split_transfer}, by definition $Z$ is the max of $W$ and $X_1$, therefore upon conditioning on $Z > X_1$ we have $W = Z$ as an identity between random variables (and not just identical in distribution).

Lastly, to be pedantic, the earlier "$\frac1n$ and $\frac{n-1}n$" split can be stated formally (note that ties, e.g. $X_3$ and $X_4$ both attaining the max, have zero probability marginally): $$\Pr\{ W > X_1 \} = 1 - \Pr\{ W \leq X_1 \} = 1 - \Pr\{ X_1~\text{is max} \} = 1 - \frac1n $$
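As a quick sanity check on this probability (separate from the Mathematica verification further down), here is a short Python sketch that estimates $\Pr\{W > X_1\}$ at $n = 6$ by Monte Carlo; the sample size and seed are arbitrary choices:

```python
import random

random.seed(0)  # fixed seed for reproducibility
n, trials = 6, 200_000
count = 0
for _ in range(trials):
    xs = [random.expovariate(1.0) for _ in range(n)]
    # the event {W > X_1} means X_1 is not the max of all n draws
    if max(xs[1:]) > xs[0]:
        count += 1
freq = count / trials
print(freq)  # should be close to (n-1)/n = 5/6 ≈ 0.8333
```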


Summary of the Strategy: diagonal split

The space can be split into disjoint cases as done in Eq\eqref{Eq_conditional_split}. For the region $Z = X_1$ we bypass the formidable joint density $f_{X_1Z}$ by utilizing the easy marginal density $f_Z$. For the other region, $Z > X_1$, we bypass the joint $f_{X_1Z}$ by recognizing its identical replacement $f_{X_1W}$, a joint density easily handled due to $W \perp X_1$.


This "diagonal split" applies to expectation of any well-behaving function $g(X_1, Z)$ in general and not just the product $g(X_1, Z)=X_1 Z$. Namely, we can compute $E[g(X_1, Z)]$ while not having to deal with $f_{X_1Z}$ which is cumbersome.

Nonetheless, $\, f_{X_1Z}$ is rather interesting and worth examining. Formally, we have the Heaviside step function (one form of indicator function), its derivative in the form of $\delta(\cdot)$ the Dirac delta function, and the first derivative of the Dirac delta function $\delta'(\cdot)$. Please consult wiki or Eq.(10) of the Wolfram page to see how $\delta'(\cdot)$ works. See Appendix.2 for an outline of the derivation of the full form of $f_{X_1Z}$ as in Eq\eqref{Eq_joint_density_full_form}.

Calculation and Results

For the contribution from the "boundary" $Z = X_1$, one can obtain the moments with minor effort (see Appendix.1 for details if needed) \begin{align*} \text{At}~n = 6\,,& & E[Z] &= \frac{49}{20} \frac1{\lambda} & E\left[Z^2 \right] &= \frac{13489}{1800}\frac1{\lambda^2 } &&\qquad \tag{2} \label{Eq_Z_2nd_moment_n=6} \end{align*} For what it's worth, $13489 = 7\cdot 41 \cdot 47$ and $13489/1800 \approx 7.49388888\ldots$
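These values can be reproduced with exact rational arithmetic: in Eq\eqref{Eq_Z_moments_general_n} the $\pi^2/6$ cancels against the polygamma term via $\psi'(n+1) = \pi^2/6 - \sum_{k=1}^n 1/k^2$, leaving $E[Z^2] = \bigl(\mathcal{H}(n)^2 + \sum_{k=1}^n k^{-2}\bigr)/\lambda^2$. A short Python check (my own, standard library only):

```python
from fractions import Fraction

n = 6
H1 = sum(Fraction(1, k) for k in range(1, n + 1))      # harmonic number H(n)
H2 = sum(Fraction(1, k * k) for k in range(1, n + 1))  # sum of 1/k^2 up to n

EZ = H1             # E[Z] in units of 1/lambda
EZ2 = H1**2 + H2    # E[Z^2] in units of 1/lambda^2
print(EZ, EZ2)      # 49/20 13489/1800
```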

In the $Z > X_1$ region, since $W \perp X_1$ the calculation of $E\bigl[WX_1~\big|\, W > X_1 \bigr]$ is easy to setup. The joint density is just $f_{X_1W}(x, w) = f_X(x)\, f_W(w)$ where the marginal density for $X_1$ is $f_X$ that is common to all $X_i$.

$$ E\bigl[WX_1~\big|\, W > X_1 \bigr] = \frac{ E\bigl[W X_1\cdot \mathcal{I}_{W>X_1} \bigr] }{ \Pr\bigl\{ W > X_1 \bigr \} } = \frac{ \int\limits_{x=0}^{\infty} \int\limits_{w=x}^{\infty} \displaystyle ~x \, w\, f_{X_1W}(x, w) \,\mathrm{d}w \,\mathrm{d}x }{ (n-1)/n }$$ where the $\mathcal{I}_{\text{statement}}$ is the indicator function, as in $\mathcal{I}_{\text{statement}} = 1$ wherever "statement" is true and $\mathcal{I}_{\text{statement}} = 0$ otherwise. Again, the denominator $\Pr\{ W > X_1 \}$ equals $(n-1)/n$ due to symmetry.

The integral that one needs to evaluate is $$\begin{align} &E\bigl[WX_1~\big|\, W > X_1 \bigr] \\ & \qquad {}= \frac{n}{n-1} \int\limits_{x=0}^{\infty} \int\limits_{w=x}^{\infty} \lambda^2 (n-1) x\,w\,e^{-\lambda(w+x)} \left(1 - e^{-\lambda w} \right)^{n-2} \,\mathrm{d}w \,\mathrm{d}x \tag{3.a} \label{Eq_E[XZ_on_X-smaller]_general_n} \end{align}$$ Note that the leading factor $n/(n-1)$ will cancel with that in Eq\eqref{Eq_conditional_split_transfer}, and this is not a coincidence.

The smaller $n$ cases are borderline manageable by hand, while in general one should not attempt this without a CAS (computer algebra system); see the latter half of Appendix.1. At $n = 6$, this evaluates to

$$E\bigl[WX_1~\big|\, W > X_1 \bigr] = \frac65 \frac{17381}{10800} \frac1{\lambda^2} \tag{3.b} \label{Eq_E[XZ_on_X-smaller]_n=6} $$ For what it's worth, $17381 = 7 \cdot 13 \cdot 191$ and $17381/10800 \approx 1.609351851851851\ldots~$, while $6/5$ times that is $\approx 1.93122222\ldots$

Substituting Eq\eqref{Eq_E[XZ_on_X-smaller]_n=6} and Eq\eqref{Eq_Z_2nd_moment_n=6} into Eq\eqref{Eq_conditional_split}, we have for $n = 6$:

\begin{align*} E[Z X_1] &= \frac16 E\left[ Z^2 ~\middle|\, Z = X_1\right] + \frac56 E\bigl[ Z X_1~\big|\, Z > X_1 \bigr] \\ &= \frac16 E\left[ Z^2 \right] + \frac56 E\bigl[WX_1~\big|\, W > X_1 \bigr] \\ &= \frac{13489}{10800}\frac1{\lambda^2 } + \frac{17381}{10800} \frac1{\lambda^2}\\ &\hspace{-7pt} \bbox[4pt,border:2pt solid #55BB11]{ \begin{aligned}[t] &=\frac{343}{120} \frac1{\lambda^2} \\ &\approx \frac{ 2.8583333 }{\lambda^2} \end{aligned} } \end{align*} From here, one readily obtains the covariance using $E[Z]$ above in Eq\eqref{Eq_Z_2nd_moment_n=6} and $E[X_1] = 1 / \lambda$. The covariance for $n = 6$ is $49/(120 \lambda^2)$. Using $E\left[ Z^2 \right]$ to get $V[Z]$, together with $V\left[ X_1 \right] = 1/\lambda^2$, one obtains the correlation coefficient as $\frac72\sqrt{7\,/\,767} \approx 0.3343639$, moderately positive as expected.
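The final bookkeeping can be double-checked with exact rational arithmetic. A small Python sketch (the moment values are those derived above, taken in units where $\lambda = 1$):

```python
from fractions import Fraction
from math import sqrt

# Moments from the text, in units where lambda = 1
EZX = Fraction(343, 120)       # E[Z X_1]
EZ = Fraction(49, 20)          # E[Z]
EZ2 = Fraction(13489, 1800)    # E[Z^2]
EX, VX = Fraction(1), Fraction(1)  # E[X_1] and V[X_1] for Exp(1)

cov = EZX - EZ * EX            # covariance of X_1 and Z
VZ = EZ2 - EZ**2               # variance of Z
rho = float(cov) / sqrt(float(VZ) * float(VX))
print(cov, rho)                # 49/120 and about 0.3343639
```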

Here's a quick-and-dirty piece of Mathematica code verifying the results numerically (Monte Carlo).

ClearAll@SimulateEZX;
SimulateEZX[itrIn_: 17, InN_: 10^5, nIn_: 6, prcIn_: 20] := Module[{n (* # of iid *), itr, vN (* vector size *), itbyN, GTotal (* grand total*), data, Z, X, sumZX, prec, XisZ, sumBound, LenBound, bdN(* boundary case count *), bdTotal}, Timing[{itr, vN, n, prec} = Ceiling@{Norm@itrIn, Norm@InN, Norm@nIn, Norm@prcIn};  data = ConstantArray[0, {vN, n}]; Z = ConstantArray[0, vN];  X = Z; sumZX = ConstantArray[0, itr]; sumBound = sumZX;  LenBound = sumZX; XisZ = sumZX;
Do[ data = RandomVariate[ExponentialDistribution@1, {vN, n}, WorkingPrecision -> prec];
 Z = Max /@ data; X = data[[;; , 1]];     sumZX[[k]] = Total[X Z];
 XisZ[[k]] = Pick[X, MapThread[SameQ, {X, Z}]]; (* boundary case when X_1 \[Equal] Z *)
 LenBound[[k]] = Length@XisZ[[k]];    sumBound[[k]] = Total[ XisZ[[k]]^2 ]; 
 Print[{Row@{k, "th loop: "}, Row@{"sample_E[\!\(\*SubscriptBox[\(ZX\), \(1\)]\)] = ", sumZX[[k]]/vN}, Row@{"  sample_E[\!\(\*SubscriptBox[\(ZX\), \(1\)]\) | \ Z=\!\(\*SubscriptBox[\(X\), \(1\)]\)] = ", sumBound[[k]]/LenBound[[k]]}}]
 , {k, itr}];
{itbyN, GTotal, bdN, bdTotal} = {itr  vN, Total@sumZX, Total@LenBound, Total[sumBound]};
{ Row@{"Sample size = ", itbyN},  Row@{"  sample_E[\!\(\*SubscriptBox[\(ZX\), \(1\)]\)] = ",      GTotal/itbyN} , Row@{"boundary case: count = ", bdN, " ,freq = ", N[bdN/itbyN, prec]}, Row@{"  sample_E[\!\(\*SubscriptBox[\(ZX\), \(1\)]\) | \ Z=\!\(\*SubscriptBox[\(X\), \(1\)]\)] = ", bdTotal/bdN},  Row@{"  sample_E[\!\(\*SubscriptBox[\(ZX\), \(1\)]\) | \ \!\(\*SubscriptBox[\(X\), \(1\)]\)<Z] = ", (GTotal - bdTotal)/(itbyN - bdN)}
} ]  ];
SimulateEZX[] (* default quick run  *)
SimulateEZX[20, 4.5 10^5, 6, 15] (* run with your own parameters *)

$$\color{white}.$$

Appendix.1 : Distribution and Moments of Max of iid Exponential

Denote $Z$ as the maximum of $n$ random variables with iid exponential distribution with rate parameter $\lambda$. \begin{align*} F_Z(t) &= \left( 1 - e^{-\lambda t} \right)^n &&& f_Z(t) &= \lambda n e^{-\lambda t} \left( 1 - e^{-\lambda t} \right)^{n-1} \tag{A.1.1} \label{Eq_CDF_and_density_of_max} \end{align*} The moments can be obtained via routine (not necessarily easy) integration to be \begin{align*} E[Z] &= \frac1{\lambda} \mathcal{H}(n) & E\left[Z^2 \right] &= \frac1{ \lambda^2} \left( \frac{ \pi^2 }6 + \mathcal{H}(n)^2 -\psi'(n+1) \right) \tag{A.1.2} \label{Eq_Z_moments_general_n} \end{align*} where $\mathcal{H}(n)\equiv \sum_{k=1}^n \frac1k$ is the harmonic number and $\psi'(\cdot)$ is the first derivative of the digamma function (a.k.a. the polygamma function of order 1).
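As a cross-check of Eq\eqref{Eq_Z_moments_general_n} at $n = 6$ that avoids the polygamma function entirely, one can integrate the density of Eq\eqref{Eq_CDF_and_density_of_max} numerically. A Python sketch with Simpson's rule (the truncation point and step size are my choices; the tail beyond $t = 60$ is negligible):

```python
import math

lam, n = 1.0, 6

def f_Z(t):
    # density of the max of n iid Exp(lam), Eq (A.1.1)
    return lam * n * math.exp(-lam * t) * (1.0 - math.exp(-lam * t)) ** (n - 1)

# composite Simpson's rule on [0, 60]; m must be even
a, b, m = 0.0, 60.0, 60_000
h = (b - a) / m
EZ = EZ2 = 0.0
for i in range(m + 1):
    t = a + i * h
    wgt = 1 if i in (0, m) else (4 if i % 2 else 2)
    EZ += wgt * t * f_Z(t)
    EZ2 += wgt * t * t * f_Z(t)
EZ *= h / 3
EZ2 *= h / 3
print(EZ, EZ2)  # ≈ 49/20 = 2.45 and ≈ 13489/1800 ≈ 7.4938888
```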

A related calculation is the indispensable Eq\eqref{Eq_E[XZ_on_X-smaller]_general_n}. Ignoring the leading $n/(n-1)$ factor, this integral evaluates to a sequence $S_n$ that satisfies the following 2nd order difference equation

$$S_{n+1} = \frac{ n+ 1}{ (n-1) (n+2)^3 } \biggl[ n-4 + (n+2) \Bigl( 2 S_{n+1} \left(n^2+n-1 \right) - S_n + n^2 \Bigr) \biggr]$$

for $n \geq 2$, with the initial two terms $S_2 = 1/2$ (obvious) and $S_3 = 47/54$ (not so obvious). The first few terms are $\{\frac12, \frac{ 47 }{ 54 }, \frac{ 335 }{ 288 }, \frac{ 12641 }{ 9000 }, \frac{ 17381 }{ 10800 }, \frac{ 1103219 }{ 617400 }, \ldots\}$. Here for $n = 6$ what goes into Eq\eqref{Eq_E[XZ_on_X-smaller]_n=6} is $S_6 = 17381/10800$.
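The listed values can also be reproduced without the recurrence: expanding $(1 - e^{-\lambda w})^{n-2}$ in Eq\eqref{Eq_E[XZ_on_X-smaller]_general_n} by the binomial theorem and integrating term by term yields a finite sum (this closed form is my own derivation, not part of the CAS output above): $$S_n = (n-1)\sum_{k=0}^{n-2}(-1)^k\binom{n-2}{k}\left[\frac{2}{(k+1)(k+2)^3} + \frac{1}{(k+1)^2(k+2)^2}\right]$$ A Python check with exact fractions:

```python
from fractions import Fraction
from math import comb

def S(n):
    # finite-sum form of Eq (3.a) without the n/(n-1) prefactor,
    # obtained by termwise binomial expansion (derivation sketched above)
    total = Fraction(0)
    for k in range(n - 1):
        term = Fraction(2, (k + 1) * (k + 2) ** 3) \
             + Fraction(1, (k + 1) ** 2 * (k + 2) ** 2)
        total += (-1) ** k * comb(n - 2, k) * term
    return (n - 1) * total

print(*(S(n) for n in range(2, 8)))
# 1/2 47/54 335/288 12641/9000 17381/10800 1103219/617400
```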

$$\color{white}.$$

Appendix.2 : Derivation of Unconditional Joint Density $f_{ZX_1}$

I shall NOT include a sketch of the $X_1$-$W$ plane, with $X_1$ as the horizontal axis. Please make the sketch yourself, as it will be helpful. The rectangular region that $F_{X_1Z}(x,z)$ covers is always tall-and-thin (the height is at least as large as the width) but never short-and-fat.

Keep in mind that $Z = \max\{W, X_1\}$. From the sketch it should be easy to see that \begin{align*} F_{X_1Z}( x \, ,\, z ) &= \Pr\bigl\{ W \leq z ~~\& ~~X_1 \leq x \bigr\} &&\\ &= \Pr\bigl\{ W \leq z \bigr\} \cdot \Pr\bigl\{X_1 \leq x \bigr\} \qquad \because W \perp X_1 \\ &= F_W(z) \cdot F_X(x) \tag{A.2.1} \label{Eq_joint_CDF_elementary} \end{align*} where again $F_W$ is as seen in Eq\eqref{Eq_CDF_and_density_of_max} with $W$ instead of $Z$ and $n-1$ in place of $n$.

Of course, $F_{X_1Z}$ should differ from $F_{X_1W}$, so one needs a more explicit expression.

$$ F_{X_1Z}( x,\, z ) = F_W(z) \cdot F_X(x) \cdot \mathcal{I}_{z \geq x} \tag{A.2.2} \label{Eq_joint_CDF_indicator}$$

Even in its full indicator-form (with indicators regarding the range of $X$ and $W$), $F_{X_1W} = F_W(z) \cdot F_X(x)$ doesn't have this particular indicator $\mathcal{I}_{z \geq x}$.

Moving on. The indicator is a legitimate function, and the derivative of Eq\eqref{Eq_joint_CDF_indicator} follows the product rule as usual. \begin{align*} f_{X_1Z}( x \, ,\, z ) &= \frac{ \partial^2 F_W(z) F_X(x)}{ \partial x \partial z } \mathcal{I}_{z \geq x} + F_W(z) F_X(x)\frac{ \partial^2 \mathcal{I}_{z \geq x}}{ \partial x \partial z } \\ &\hspace{36pt} + \frac{ \partial F_W(z) F_X(x) }{ \partial z } \frac{ \partial \mathcal{I}_{z \geq x} }{ \partial x } + \frac{ \partial F_W(z) F_X(x) }{ \partial x } \frac{ \partial \mathcal{I}_{z \geq x} }{ \partial z } \end{align*} Apply the relations between the $\{ \mathcal{I}_{\cdot},\delta(\cdot),\delta'(\cdot) \}$ to eventually arrive at \begin{align*} f_{X_1Z}( x \, ,\, z ) &= \color{magenta}{ f_W(z)\, f_X(x)\, \mathcal{I}_{z \geq x} } + \delta(z-x) \biggl( \frac{ \partial }{ \partial z } \Bigl( F_W(z) F_X(x) \mathcal{I}_{z \geq x} \Bigr) \\ & \hspace{102pt} + F_W(z) f_X(x) - f_W(z) F_X(x) \biggr) \tag{A.2.3} \label{Eq_joint_density_full_form} \end{align*} The CDFs $F_W, F_X$ and their densities $f_W$, $f_X$ are established. One can pair up the densities in Eq\eqref{Eq_joint_density_full_form} to make it into a nice "matching" form if so desired.

The leading term (highlighted in magenta) corresponds to the region $Z > X_1$, where it coincides with the independent-pair density $f_{X_1W}$.
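One consistency check on Eq\eqref{Eq_joint_density_full_form}: the magenta term should carry total mass $\Pr\{W > X_1\} = (n-1)/n = 5/6$. Integrating out $z$ first leaves $\int_0^\infty f_X(x)\,\bigl(1 - F_W(x)\bigr)\,\mathrm{d}x$; in fact the substitution $u = 1 - e^{-\lambda x}$ gives $\int_0^1 (1 - u^{n-1})\,\mathrm{d}u = 1 - \frac1n$ exactly. A numerical Python sketch (Simpson's rule, with my own truncation and step size):

```python
import math

lam, n = 1.0, 6

def f_X(x):
    # marginal density of X_1 ~ Exp(lam)
    return lam * math.exp(-lam * x)

def F_W(x):
    # CDF of W, the max of n-1 iid Exp(lam)
    return (1.0 - math.exp(-lam * x)) ** (n - 1)

# mass of the magenta term f_W(z) f_X(x) I_{z >= x}: after integrating out z,
# integrate f_X(x) * (1 - F_W(x)) by composite Simpson's rule on [0, 60]
a, b, m = 0.0, 60.0, 60_000
h = (b - a) / m
mass = 0.0
for i in range(m + 1):
    x = a + i * h
    wgt = 1 if i in (0, m) else (4 if i % 2 else 2)
    mass += wgt * f_X(x) * (1.0 - F_W(x))
mass *= h / 3
print(mass)  # ≈ 5/6 = 0.8333...
```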