Variance of sample variance?
What is the variance of the sample variance? In other words I am looking for $\mathrm{Var}(S^2)$.
I started by expanding $\mathrm{Var}(S^2)$ as $E(S^4) - [E(S^2)]^2$.
I know that $[E(S^2)]^2 = \sigma^4$, and that is as far as I got.
Solution 1:
Here's a general derivation that does not assume normality.
Let's rewrite the sample variance $S^2$ as an average over all pairs of indices: $$S^2={1\over{n\choose 2}}\sum_{\{i,j\}} {1\over2}(X_i-X_j)^2.$$ Since $\mathbb{E}[(X_i-X_j)^2/2]=\sigma^2$, we see that $S^2$ is an unbiased estimator for $\sigma^2$.
The variance of $S^2$ is the expected value of $$\left({1\over{n\choose 2}}\sum_{\{i,j\}} \left[{1\over2}(X_i-X_j)^2-\sigma^2\right]\right)^2.$$
When you expand the outer square, there are 3 types of cross product terms $$\left[{1\over2}(X_i-X_j)^2-\sigma^2\right] \left[{1\over2}(X_k-X_\ell)^2-\sigma^2\right]$$ depending on the size of the intersection $\{i,j\}\cap\{k,\ell\}$.
When this intersection is empty, the factors are independent and the expected cross product is zero.
There are $n(n-1)(n-2)$ terms where $|\{i,j\}\cap\{k,\ell\}|=1$ and each has an expected cross product of $(\mu_4-\sigma^4)/4$.
There are ${n\choose 2}$ terms where $|\{i,j\}\cap\{k,\ell\}|=2$ and each has an expected cross product of $(\mu_4+\sigma^4)/2$.
Putting it all together shows that $$\mbox{Var}(S^2)={\mu_4\over n}-{\sigma^4\,(n-3)\over n\,(n-1)}.$$ Here $\mu_4=\mathbb{E}[(X-\mu)^4]$ is the fourth central moment of $X$.
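To spell out that last step (a sketch of the arithmetic, using only the counts and expected cross products listed above), sum the contributions and divide by ${n\choose 2}^2$: $$\begin{align*} \mbox{Var}(S^2) & = {1\over{n\choose 2}^2}\left[n(n-1)(n-2)\,{\mu_4-\sigma^4\over 4}+{n\choose 2}\,{\mu_4+\sigma^4\over 2}\right] \\ & = {4\over n^2(n-1)^2}\cdot{n(n-1)\over 4}\left[(n-2)(\mu_4-\sigma^4)+(\mu_4+\sigma^4)\right] \\ & = {(n-1)\,\mu_4-(n-3)\,\sigma^4\over n\,(n-1)} \\ & = {\mu_4\over n}-{\sigma^4\,(n-3)\over n\,(n-1)}. \end{align*}$$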
Solution 2:
Maybe this will help. Let's suppose the samples are taken from a normal distribution. Then, using the fact that $\frac{(n-1)S^2}{\sigma^2}$ is a chi-squared random variable with $(n-1)$ degrees of freedom, we get $$\begin{align*} \text{Var}~\frac{(n-1)S^2}{\sigma^2} & = \text{Var}~\chi^{2}_{n-1} \\ \frac{(n-1)^2}{\sigma^4}\text{Var}~S^2 & = 2(n-1) \\ \text{Var}~S^2 & = \frac{2(n-1)\sigma^4}{(n-1)^2}\\ & = \frac{2\sigma^4}{(n-1)}, \end{align*}$$
where we have used the fact that $\text{Var}~\chi^{2}_{n-1}=2(n-1)$.
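As a sanity check against the distribution-free formula in Solution 1: for a normal population the fourth central moment is $\mu_4=3\sigma^4$, and substituting that value recovers the same result, $$\begin{align*} \frac{\mu_4}{n}-\frac{\sigma^4(n-3)}{n(n-1)} & = \frac{3\sigma^4}{n}-\frac{\sigma^4(n-3)}{n(n-1)} \\ & = \frac{\sigma^4\left[3(n-1)-(n-3)\right]}{n(n-1)} \\ & = \frac{2n\,\sigma^4}{n(n-1)} = \frac{2\sigma^4}{n-1}. \end{align*}$$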
Hope this helps.
Solution 3:
There can be some confusion in defining the sample variance ... 1/n vs 1/(n-1). The OP here is, I take it, using the sample variance with 1/(n-1) ... namely the unbiased estimator of the population variance, otherwise known as the second h-statistic:
h2 = HStatistic[2][[2]]
These sorts of problems can now be solved by computer. Here is the solution using the mathStatica add-on to Mathematica. In particular, we seek Var[h2], where the variance is just the 2nd central moment, and we express the answer in terms of central moments of the population:
CentralMomentToCentral[2, h2]
We could just as easily find, say, the 4th central moment of the sample variance, as:
CentralMomentToCentral[4, h2]
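For readers without mathStatica, here is a rough independent check in Python. It is not a replacement for the symbolic computation above: it only enumerates every possible sample from a small discrete distribution and compares the exact variance of the 1/(n-1) sample variance with the closed form $\mu_4/n-\sigma^4(n-3)/(n(n-1))$ from Solution 1. The function names and the example distribution are made up for illustration.

```python
from fractions import Fraction
from itertools import product


def exact_var_of_sample_variance(values, probs, n):
    """Exact Var(S^2) for n iid draws from a finite discrete distribution.

    Enumerates all len(values)**n samples, so it is only meant for tiny cases.
    `values` and `probs` are assumed to be lists of Fractions.
    """
    e_s2 = Fraction(0)  # accumulates E[S^2]
    e_s4 = Fraction(0)  # accumulates E[S^4]
    for idx in product(range(len(values)), repeat=n):
        p = Fraction(1)
        for i in idx:
            p *= probs[i]
        xs = [values[i] for i in idx]
        mean = sum(xs, Fraction(0)) / n
        s2 = sum((x - mean) ** 2 for x in xs) / (n - 1)  # unbiased sample variance
        e_s2 += p * s2
        e_s4 += p * s2 * s2
    return e_s4 - e_s2 ** 2


def formula_var_of_sample_variance(values, probs, n):
    """Closed form mu4/n - sigma^4 (n-3) / (n (n-1)) for the same distribution."""
    mu = sum(p * v for v, p in zip(values, probs))
    sigma2 = sum(p * (v - mu) ** 2 for v, p in zip(values, probs))
    mu4 = sum(p * (v - mu) ** 4 for v, p in zip(values, probs))
    return mu4 / n - sigma2 ** 2 * Fraction(n - 3, n * (n - 1))


# Hypothetical, deliberately skewed example distribution.
values = [Fraction(0), Fraction(1), Fraction(3)]
probs = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
for n in (2, 3, 4, 5):
    exact = exact_var_of_sample_variance(values, probs, n)
    closed = formula_var_of_sample_variance(values, probs, n)
    print(n, exact, closed, exact == closed)
```

Because everything is done in exact rational arithmetic, the two numbers should agree exactly rather than only approximately, which makes this a stricter check than a Monte Carlo simulation would be.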
Solution 4:
Here is the derivation of the identity $E(\left[{1\over2}(X-Y)^2-\sigma^2\right]^2) = (\mu_4+\sigma^4)/2$ used in user940's answer, where $X$ and $Y$ are independent copies of the same random variable:
LHS:
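$$\begin{align*} E\left(\left[{1\over2}(X-Y)^2-\sigma^2\right]^2\right) & = E\left(\frac{1}{4}(X-Y)^4 - (X-Y)^2 \sigma^2 + \sigma^4\right) \\ & = E\left(\frac{1}{4}(X-Y)^4\right) - 2\sigma^2\sigma^2 + \sigma^4 \\ & = E\left(\frac{1}{4}(X-Y)^4\right) - \sigma^4 \\ & = \frac{1}{4}E(X^4 -4X^3Y +6X^2Y^2 -4XY^3 + Y^4) -\sigma^4 \\ & = \frac{1}{4}\left(2E(X^4) -8E(X)E(X^3) +6 E(X^2)^2\right) - \sigma^4 \\ & = \frac{1}{2}\left(E(X^4)-4E(X)E(X^3) +3 E(X^2)^2 - 2\sigma^4\right) \end{align*}$$
I use the fact that $E((X-Y)^2) = 2\sigma^2$ here, and that $X$ and $Y$ are independent and identically distributed, so that for example $E(X^3Y) = E(X^3)E(X)$ and $E(X^2Y^2) = E(X^2)^2$.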
RHS:
$$\require{cancel} \begin{align*} (\mu_4+\sigma^4)/2 & = \frac{1}{2}\left(E((X-\mu)^4) + \sigma^4\right) \\ & = \frac{1}{2}\left(E((X-E(X))^4) + \sigma^4\right) \\ & = \frac{1}{2}\left(E\left(X^4 -4X^3E(X) + 6X^2E(X)^2 -4XE(X)^3 + E(X)^4\right) + \sigma^4\right) \\ & = \frac{1}{2}\left(E\left(X^4 -4X^3E(X) + 6X^2E(X^2) - 6X^2\sigma^2 -4XE(X)(E(X^2)-\sigma^2) + (E(X^2)-\sigma^2)^2\right) + \sigma^4\right) \\ & = \frac{1}{2}\left(E(X^4) -4E(X^3)E(X) + 6E(X^2)^2 - 6E(X^2)\sigma^2 -4E(X)^2(E(X^2)-\sigma^2) + (E(X^2)-\sigma^2)^2 + \sigma^4\right) \\ & = \frac{1}{2}\left(E(X^4) -4E(X^3)E(X) + 6E(X^2)^2 - \cancel{6E(X^2)\sigma^2} -4E(X^2)^2 +\cancel{4E(X^2)\sigma^2 +4E(X^2)\sigma^2} - 4\sigma^4 + E(X^2)^2-\cancel{2E(X^2)\sigma^2} + \sigma^4 + \sigma^4\right) \\ & = \frac{1}{2}\left(E(X^4) -4E(X)E(X^3) + 3E(X^2)^2 - 2\sigma^4\right) \end{align*}$$
I use the fact that $E(X) = \mu$ and that $E(X)^2 = E(X^2) - \sigma^2$.
Now LHS = RHS.
Solution 5:
I will use this as an example of the following theorem (from Seber, G.A. and Lee, A.J. (2012)).
Let $X_1, X_2, \ldots, X_n$ be independent random variables with means $\theta_1, \theta_2, \ldots, \theta_n$ and common central moments $\mu_2,\mu_3,\mu_4$, and write $X=(X_1,\ldots,X_n)'$ and $\theta=(\theta_1,\ldots,\theta_n)'$. If $A$ is any $n\times n$ symmetric matrix and $a$ is the column vector of the diagonal elements of $A$, then
$$\mathrm{var}[X'AX]=(\mu_4-3\mu^2_2)a'a+2\mu^2_2\,\mathrm{tr}(A^2)+4\mu_2\theta'A^2\theta+4\mu_3\theta'Aa.
$$
Let $1_n$ denote the $n$-dimensional column vector whose elements are all 1. For the sample variance,
$$S^2=\frac{1}{n-1}X'AX, \quad\text{where } A=I_n-\frac{1}{n}1_n1_n'.
$$
We have $A^2=A$ and $a=(1-\frac{1}{n})1_n$.
Since the $X_i$ in our case are iid, say with mean $\mu$, we have $\theta=\mu1_n$,
so the third and fourth terms are $0$, since
$$\begin{align*}
A^2\theta&=A\theta=\mu\left(1_n-\frac{1}{n}1_n(1_n'1_n)\right)=0,\\
Aa&=\left(1-\frac{1}{n}\right)\left(1_n-\frac{1}{n}1_n(1_n'1_n)\right)=0.
\end{align*}$$
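Then, using $a'a=n(1-\frac{1}{n})^2$ and $\mathrm{tr}(A^2)=\mathrm{tr}(A)=n-1$, $$
\mathrm{var}[S^2]=\frac{1}{(n-1)^2}\left[(\mu_4-3\mu_2^2)\left(1-\frac{1}{n}\right)^2n+2\mu_2^2(n-1)\right]=\frac{\mu_4}{n}-\frac{n-3}{n(n-1)}\mu_2^2.
$$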
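The matrix bookkeeping above can be checked mechanically. The following is a minimal sketch in exact rational arithmetic that builds $A=I_n-\frac{1}{n}1_n1_n'$, evaluates the right-hand side of the theorem term by term, and compares the result with the closed form. The helper names and the moment values in the loop are arbitrary choices for illustration, not taken from Seber and Lee.

```python
from fractions import Fraction


def centering_matrix(n):
    """A = I_n - (1/n) 1_n 1_n', stored as a list of rows of Fractions."""
    return [[Fraction(int(i == j)) - Fraction(1, n) for j in range(n)]
            for i in range(n)]


def matmul(P, Q):
    """Plain matrix product of two square matrices given as lists of rows."""
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def matvec(P, v):
    """Matrix-vector product."""
    return [sum(row[k] * v[k] for k in range(len(v))) for row in P]


def var_s2_from_quadratic_form(n, mu2, mu3, mu4, mean):
    """Evaluate var[X'AX] / (n-1)^2 using all four terms of the theorem."""
    A = centering_matrix(n)
    A2 = matmul(A, A)
    a = [A[i][i] for i in range(n)]   # diagonal of A
    theta = [mean] * n                # iid case: theta = mean * 1_n
    a_dot_a = sum(x * x for x in a)
    trace_A2 = sum(A2[i][i] for i in range(n))
    theta_A2_theta = sum(t * y for t, y in zip(theta, matvec(A2, theta)))
    theta_A_a = sum(t * y for t, y in zip(theta, matvec(A, a)))
    var_quad = ((mu4 - 3 * mu2 ** 2) * a_dot_a + 2 * mu2 ** 2 * trace_A2
                + 4 * mu2 * theta_A2_theta + 4 * mu3 * theta_A_a)
    return var_quad / (n - 1) ** 2


def closed_form(n, mu2, mu4):
    """mu4/n - (n-3) / (n (n-1)) * mu2^2."""
    return Fraction(mu4) / n - Fraction(n - 3, n * (n - 1)) * mu2 ** 2


# Arbitrary illustrative moments; mean and mu3 should not affect the result.
for n in range(2, 7):
    lhs = var_s2_from_quadratic_form(n, Fraction(2), Fraction(-1),
                                     Fraction(11), Fraction(5, 3))
    print(n, lhs, lhs == closed_form(n, Fraction(2), Fraction(11)))
```

The third and fourth terms are computed rather than dropped, so the check also confirms that they vanish for this choice of $A$ and $\theta$.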