Density and expectation of the range of a sample of uniform random variables

If the variables $\alpha_1, \ldots, \alpha_n$ are independent and uniformly distributed in $(0,1)$,

  1. How do I show that the spread $\alpha_{(n)}$ - $\alpha_{(1)}$ has density $n (n-1) x^{n-2} (1-x)$ and expectation $(n-1)/(n+1)$?
  2. What is the probability that all $n$ points lie within an interval of length $t$?

This is a (straightforward?) exercise in order statistics.

From Wikipedia, the pdf for the joint order statistic is:

$$ f_{ X_{(i)}, X_{(j)} }(u, v) = \frac{ n! }{ (i-1)! (j-1-i)! (n-j)! } [ F_X(u) ]^{i-1} [F_X(v) - F_X(u)]^{j-1-i} [1 - F_X(v)]^{n-j} f_X(u) f_X(v) $$

(Note the change in argument names.) Here $i < j$ and $u < v$, $F_X$ is the cumulative distribution function (cdf), and $f_X$ is the probability density function (pdf) of the random variable $X$.

Plugging in $i=1$ and $j=n$ with $X = \alpha$, $f_{\alpha}(u) = 1$ and $F_{\alpha}(u) = u$, we get:

$$ f_{ \alpha_{(1)}, \alpha_{(n)} }(s, s+t) = \frac{ n! }{ (n-2)! } [ s+t - s ]^{n-2} = n (n-1) t^{n-2} $$

for a starting point $s \in [0, 1-t]$ and interval length $t$. To get the density of the range $\alpha_{(n)} - \alpha_{(1)}$ at $t$, we integrate the joint density over all permissible starting positions $s$:

$$ \int_0^{1-t} f_{\alpha_{(1)},\alpha_{(n)}}(s, s+t) ds = \int_{0}^{1-t} n (n-1) t^{n-2} ds = n (n-1) t^{n-2} \int_0^{1-t} ds $$ $$ = n (n-1) t^{n-2} (1-t) $$

This gives the first part of the answer to question 1. For the second part, you could integrate $t$ times the above density over $t \in [0, 1]$, or you could take the expectation of $\alpha_{(n)}$ minus the expectation of $\alpha_{(1)}$. I will do the latter. The pdf of the $k$'th order statistic is:

$$ f_{X_{(k)}}(u) = \frac{ n! }{ (k-1)! (n-k)! } [F_X(u)]^{k-1} [1-F_X(u)]^{n-k} f_X(u) $$

and plugging in $X = \alpha$, $k=1$ and $k=n$ for the 1st and $n$'th order statistic (of the uniform distribution) respectively, we get:

$$ f_{\alpha_{(1)}}(u) = n (1 - u)^{n-1} $$ $$ f_{\alpha_{(n)}}(u) = n u^{n-1} $$

and taking the expectation of their difference:

$$ E[ \alpha_{(n)} - \alpha_{(1)} ] = \int_{0}^{1} u (n u^{n-1} - n (1 - u)^{n-1} ) du = \frac{n}{n+1} - \frac{1}{n+1} = \frac{n-1}{n+1} $$

As desired. (Note that $\alpha_{(n)}$ and $\alpha_{(1)}$ are not independent, but expectation is linear, so the expectation of their difference equals the difference of their expectations regardless.)
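For comparison, here is a sketch of the other route mentioned above, integrating $t$ against the range density $n (n-1) t^{n-2} (1-t)$. It should give the same value:

$$ E[ \alpha_{(n)} - \alpha_{(1)} ] = \int_0^1 t \, n (n-1) t^{n-2} (1-t) \, dt = n (n-1) \int_0^1 ( t^{n-1} - t^{n} ) \, dt = n (n-1) \left( \frac{1}{n} - \frac{1}{n+1} \right) = \frac{n-1}{n+1} $$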

To answer part 2, one must find $\Pr\{ \alpha_{(n)} - \alpha_{(1)} \le t \}$. We just derived the pdf of the range $\alpha_{(n)} - \alpha_{(1)}$, so now we just have to integrate it over all lengths up to $t$:

$$ \int_0^{t} n (n-1) x^{n-2} (1-x) dx = t^{n-1} ( n - (n-1) t) $$
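As a sanity check, the closed forms above can be compared against a quick Monte Carlo simulation. This is only a minimal sketch: the function names `simulate_range` and `range_cdf`, the seed, and the choices $n = 5$, $t = 0.6$ are arbitrary and introduced here for illustration.

```python
import random


def simulate_range(n, trials, seed=0):
    """Draw `trials` samples of n independent Uniform(0,1) values; return each sample's range."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    ranges = []
    for _ in range(trials):
        sample = [rng.random() for _ in range(n)]
        ranges.append(max(sample) - min(sample))
    return ranges


def range_cdf(n, t):
    """Closed form derived above: P(range <= t) = t^(n-1) * (n - (n-1) t), for 0 <= t <= 1."""
    return t ** (n - 1) * (n - (n - 1) * t)


n, t, trials = 5, 0.6, 100_000
ranges = simulate_range(n, trials)

# Fraction of simulated samples whose n points fit in an interval of length t.
empirical_cdf = sum(r <= t for r in ranges) / trials
# Sample mean of the range, to compare with (n-1)/(n+1).
empirical_mean = sum(ranges) / trials

print(f"P(range <= {t}): simulated {empirical_cdf:.4f}, formula {range_cdf(n, t):.4f}")
print(f"E[range]: simulated {empirical_mean:.4f}, formula {(n - 1) / (n + 1):.4f}")
```

With this many trials, the simulated values should agree with the formulas to about two decimal places.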

I must admit that I just looked up the pdf of the joint order statistic. Could anyone tell me (or give a reference) on how to derive this equation?


Here's an approach for computing the density function that doesn't rely on knowledge of order statistics. First let $R = \alpha_{(n)} - \alpha_{(1)}$ be the range. We compute the cumulative distribution function $F_R(x)$ by splitting on whether the event $\alpha_{(n)} \leq x$ occurs.

\begin{align} F_R(x) &= P\{R \leq x\} \\ &= P\{R \leq x \text{ and } \alpha_{(n)} \leq x\}+P\{R \leq x \text{ and } \alpha_{(n)} > x\} \\ &=P\{R \leq x \text{ | } \alpha_{(n)} \leq x\}P\{\alpha_{(n)} \leq x\}+P\{R \leq x \text{ and } \alpha_{(n)} > x\} \label{a}\tag{1} \end{align}

To compute the first term in $\ref{a}$, observe that if the largest value is at most $x$, then the range $R$ must also be at most $x$, so we have

\begin{align} P\{R \leq x \text{ | } \alpha_{(n)} \leq x\} = 1 \end{align}

Also,

\begin{align} P\{\alpha_{(n)}\leq x\} &= P\{\alpha_1 \leq x, \alpha_2 \leq x,\ldots,\alpha_n \leq x\} \\ &= P\{\alpha_1 \leq x\}P\{\alpha_2 \leq x\}\cdots P\{\alpha_n \leq x\} \\ &= x^n \end{align}

To finish our computation of $F_R$, we must compute $P\{R \leq x \text{ and } \alpha_{(n)} > x\}$. Let $A$ be the event that $R \leq x \text{ and } \alpha_{(n)} > x$. Let $A_i$ be the event that $\alpha_i > x$ and $\alpha_i - x \leq \alpha_j \leq \alpha_i$ for all $j \neq i$. Clearly $P\{\alpha_i > x\} = 1-x$. Given $\alpha_i = a$ with $a > x$, the interval $[a - x, a]$ has width $x$ and lies entirely inside $[0,1]$, so each $\alpha_j$ with $j \neq i$ falls in it with probability $x$, whatever the value of $a$. Since the $\alpha_j$ are independent, this implies

$$P\{\alpha_i - x \leq \alpha_j \leq \alpha_i \text{ for all } j \neq i \text{ | } \alpha_i > x\} = x^{n-1}$$

Therefore

$$P\{A_i\}= (1-x)x^{n-1} $$

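Notice that $A = \bigcup\limits_{i=1}^n A_i$, since $A_i$ says precisely that $\alpha_i$ is the maximum, that it exceeds $x$, and that the range is at most $x$. The events $A_i$ are mutually exclusive (ties have probability zero), so we have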

$$P(A) = P\{R \leq x \text{ and } \alpha_{(n)} > x\} = n(1-x)x^{n-1}$$

Substituting these results into $\ref{a}$, we get

$$F_R(x) = x^n + n(1-x)x^{n-1}$$
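The two terms of $F_R$ can also be checked separately by simulation. The following is a rough sketch only: the helper name `estimate_split`, the seed, and the values $n = 4$, $x = 0.5$ are illustrative choices, not part of the argument above.

```python
import random


def estimate_split(n, x, trials, seed=1):
    """Estimate P(max <= x) and P(range <= x and max > x) for n independent Uniform(0,1) values."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    max_below = 0  # counts samples with max <= x (range <= x is then automatic)
    event_a = 0    # counts samples with max > x and range <= x
    for _ in range(trials):
        sample = [rng.random() for _ in range(n)]
        hi, lo = max(sample), min(sample)
        if hi <= x:
            max_below += 1
        elif hi - lo <= x:
            event_a += 1
    return max_below / trials, event_a / trials


n, x = 4, 0.5
p_max_below, p_a = estimate_split(n, x, trials=100_000)

print(f"P(max <= x): simulated {p_max_below:.4f}, formula {x ** n:.4f}")
print(f"P(A): simulated {p_a:.4f}, formula {n * (1 - x) * x ** (n - 1):.4f}")
```

The two estimates should land close to $x^n$ and $n(1-x)x^{n-1}$ respectively, and their sum close to $F_R(x)$.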

Differentiation and a little algebra give the desired density function

\begin{align} f_R(x) &= F_R'(x) \\ &= n(n-1)x^{n-2}(1-x) \end{align}
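For readers who want the "little algebra" spelled out, one way to carry out the differentiation is to expand $F_R(x) = x^n + n x^{n-1} - n x^n$ first:

\begin{align} F_R'(x) &= n x^{n-1} + n(n-1) x^{n-2} - n^2 x^{n-1} \\ &= n(n-1) x^{n-2} - n(n-1) x^{n-1} \\ &= n(n-1) x^{n-2} (1-x) \end{align}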


Here's a hint: The answer to part 2 is quite easy. And if you consider it, you might see a trick on how to get the answer to part 1.