If $m$ tickets are drawn out of $n$ tickets numbered $1$ to $n$, find the variance of the sum of the numbers on the tickets
$m$ tickets are drawn out of $n$ tickets numbered from $1$ to $n$. Let $X$ denote the sum of the numbers on the tickets drawn. Find $V(X)$.
We can write $X = X_1+X_2+\cdots+X_m$ if $X_i$ is taken to be the $i$th number drawn. Alternatively, $X_i$ can be taken to be the indicator variable of ticket number $i$, $i=1,2,\ldots,n$, in which case $X=\sum_{i=1}^n iX_i$.
Either way, I am able to get the expectation, since dependence between the variables does not matter there. For the variance, however, dependence does matter. When calculating $E(X_iX_j)$, the second draw is supposed to depend on the first, since there is a constraint on the sum $X$.
Please answer.
Solution 1:
Here is a slightly indirect way of obtaining the variance:
Let $X_k$ be the number on the $k$th ticket drawn, $k=1,2,\ldots,m$.
Each $X_k$ is then marginally uniform on $\{1,2,\ldots,n\}$, namely
$$ P(X_k=j)=\begin{cases}\frac{1}{n}&,\text{ if }j=1,2,\ldots,n\\\,0&,\text{ otherwise }\end{cases}$$
So,
\begin{align} \operatorname{Var}(X_k)&=E(X_k^2)-(E(X_k))^2 \\&=\frac{n^2-1}{12}=\sigma^2\,,\text{ say } \end{align}
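For completeness, here is a sketch of where the value $\frac{n^2-1}{12}$ comes from, using only the standard formulas for $\sum j$ and $\sum j^2$:
\begin{align} E(X_k)&=\frac{1}{n}\sum_{j=1}^n j=\frac{n+1}{2} \\ E(X_k^2)&=\frac{1}{n}\sum_{j=1}^n j^2=\frac{(n+1)(2n+1)}{6} \\ \operatorname{Var}(X_k)&=\frac{(n+1)(2n+1)}{6}-\frac{(n+1)^2}{4}=\frac{(n+1)(n-1)}{12} \end{align}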
If the correlation between $X_i$ and $X_j$ $\,(i\ne j)$ is $\rho$ (by symmetry it is the same for every pair), then $$\rho=\dfrac{\operatorname{Cov}(X_i,X_j)}{\sigma^2}$$
You are looking for \begin{align}\operatorname{Var}(X)&=\operatorname{Var}\left(\sum_{k=1}^m X_k\right)\\&=\sum_{k=1}^m \operatorname{Var}(X_k)+2\sum_{i<j}\text{Cov}(X_i,X_j)\\&=m\sigma^2+2\binom{m}{2}\rho\sigma^2 \\&=m\sigma^2(1+(m-1)\rho)\tag{1}\end{align}
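Now note that the joint distribution of $(X_i,X_j)\,,i\ne j$ does not depend on $m$, so neither does $\rho$.
Taking $m=n$ in $(1)$, where every ticket is drawn and the sum is the constant $\frac{n(n+1)}{2}$, we see that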
\begin{align} \operatorname{Var}\left(\sum_{k=1}^{\color{red}{n}}X_k\right)&=\operatorname{Var}(\text{constant})=0 \\&\implies\color{red}{n}\sigma^2(1+(\color{red}{n}-1)\rho)=0 \\&\implies\rho=\frac{1}{1-n} \end{align}
Substituting this value of $\rho$ and the value of $\sigma^2$ in $(1)$, we finally get the variance of $X$ as
$$\operatorname{Var}(X)=m\cdot\frac{n^2-1}{12}\left(1-\frac{m-1}{n-1}\right)=\frac{m(n+1)(n-m)}{12}$$
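As a sanity check, the closed form can be compared against brute-force enumeration for small $n$. The sketch below assumes the tickets are drawn without replacement and that every $m$-subset is equally likely; the function names `exact_variance` and `formula_variance` are only illustrative and are not part of the answer above.

```python
from fractions import Fraction
from itertools import combinations


def exact_variance(n, m):
    """Exact Var(X) by enumerating every m-subset of {1, ..., n}."""
    sums = [sum(c) for c in combinations(range(1, n + 1), m)]
    count = len(sums)
    mean = Fraction(sum(sums), count)
    mean_sq = Fraction(sum(s * s for s in sums), count)
    return mean_sq - mean ** 2


def formula_variance(n, m):
    """Closed form m(n+1)(n-m)/12 derived above."""
    return Fraction(m * (n + 1) * (n - m), 12)


# Compare the two for all 1 <= m <= n with n up to 8.
for n in range(2, 9):
    for m in range(1, n + 1):
        assert exact_variance(n, m) == formula_variance(n, m)
```

Exact rational arithmetic via `Fraction` avoids any floating-point doubt in the comparison. The case $m=n$ gives variance $0$ in both functions, matching the step used to find $\rho$.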