Proof of the variance of the Geometric Distribution

I have a Geometric Distribution, where the stochastic variable $X$ represents the number of failures before the first success.

The probability mass function is $P(X=x) = q^x p$ for $x=0,1,2,\ldots$, where $q = 1-p$.

Now, I know the definition of the expected value is: $E[X] = \sum_{i}{x_i p_i}$

So, I proved the expected value of the Geometric Distribution like this:

$E[X]=\sum _{ i=0 }^{ \infty }{ iP(X=i) } = \sum _{i=0}^{\infty}{i q^i p} = p\sum _{i=0}^{\infty}{i q^i} = pq \sum _{i=0}^{\infty}{iq^{i-1}}$

$\qquad = pq \sum _{i=0}^{\infty}{\frac{d}{dq}q^i} = pq \frac{d}{dq}(\sum _{i=0}^{\infty}{q^i}) = pq \frac{d}{dq}(\frac{1}{1-q})$

$\qquad = pq \frac{1}{(1-q)^2} = \frac{pq}{p^2} = \frac{q}{p}$

So now, I would like to prove that $Var[X] = \frac{q}{p^2}$. I know I have to use a similar trick as above (with the differentiation).

$Var[X] = E[X^2] - E[X]^2 = \sum _{i=0}^{\infty}{i^2 q^i p} - (\frac{q}{p})^2 = p \sum _{i=0}^{\infty}{i^2 q^i} - (\frac{q}{p})^2 = pq \sum _{i=0}^{\infty}{i^2 q^{i-1}} - (\frac{q}{p})^2$

$\qquad = pq \sum _{i=0}^{\infty}{\frac{d}{dq}i q^i} - (\frac{q}{p})^2 = pq \frac{d}{dq} \sum _{i=0}^{\infty}{iq^i}-(\frac{q}{p})^2$

Then I'm stuck. How can I get another $q$ out of the sum? Won't it mess up the first differentiation?


I have a proof which follows the approach of @Math1000, but in a slightly different way. It may be useful if you're not familiar with generating functions.

However, I'm using the other variant of the geometric distribution: in my case $X$ is the number of trials until the first success, so $E[X]=\frac{1}{p}$. Either way, both variants have the same variance.

So, assuming we already know that $E[X]=\frac{1}{p}$, the variance can be calculated as follows: $$ Var[X]=E[X^2]-(E[X])^2=\boxed{E[X(X-1)]} + E[X] -(E[X])^2 = \boxed{E[X(X-1)]} + \frac{1}{p} - \frac{1}{p^2} $$ So the trick is splitting up $E[X^2]$ into $E[X(X-1)]+E[X]$, which is easier to determine. To determine $\boxed{E[X(X-1)]}$ we have to evaluate the following series for $p\in(0,1)$: $$ \sum_{k=1}^\infty k(k-1)p(1-p)^{k-1} $$
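(In case the split is not obvious: since $X^2 = X(X-1)+X$, linearity of expectation gives $E[X^2]=E[X(X-1)]+E[X]$.) The series above is just $E[X(X-1)]$ written out with the probability mass function $P(X=k)=p(1-p)^{k-1}$, $k=1,2,\ldots$, of this variant.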

Here's how it can be done (as an alternative to Math1000's approach): $$ \begin{align} \sum_{k=1}^\infty k(k-1)p(1-p)^{k-1} &= p\sum_{k=1}^\infty k(k-1)(1-p)^{k-1} \qquad\text{Subst. }q:=(1-p)\\\\ &= p\sum_{k=1}^\infty (k-1)kq^{k-1} \\\\ &= p\frac{d}{dq}\left(\sum_{k=1}^\infty (k-1)q^k\right) \\\\ &= p\frac{d}{dq}\left(q^2\sum_{k=1}^\infty (k-1)q^{k-2}\right) \\\\ &= p\frac{d}{dq}\left(q^2\sum_{k=2}^\infty (k-1)q^{k-2}\right) \\\\ &= p\frac{d}{dq}\left(q^2\frac{d}{dq}\left(\sum_{k=2}^\infty q^{k-1}\right)\right) \\\\ &= p\frac{d}{dq}\left(q^2\frac{d}{dq}\left(\sum_{k=1}^\infty q^{k}\right)\right) \\\\ &= p\frac{d}{dq}\left(q^2\frac{d}{dq}\left(\frac{1}{1-q}-1\right)\right) \\\\ &= p\frac{d}{dq}\left(\frac{q^2}{(1-q)^2}\right) \\\\ &= p\left(\frac{-2q}{(q-1)^3}\right)\qquad\text{Backsub. }q=(1-p) \\\\ &= p\left(\frac{-2(1-p)}{((1-p)-1)^3}\right) = p\left(\frac{-2+2p}{-p^3}\right) \\\\ &= \frac{-2+2p}{-p^2} =\frac{2(p-1)}{-p^2} = \frac{2(1-p)}{p^2}. \\\\ \end{align} $$ Now putting the result back into the equation for $Var[X]$ gives us: $$ Var[X]=\boxed{E[X(X-1)]} + E[X] -(E[X])^2 =\frac{2(1-p)}{p^2} + \frac{1}{p} - \frac{1}{p^2} = \frac{2-2p+p-1}{p^2} = \frac{1-p}{p^2}. $$
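As a quick cross-check, here is a sketch of the same idea but differentiating the geometric series twice instead of once: $$ E[X(X-1)]=\sum_{k=1}^\infty k(k-1)p q^{k-1} = pq\sum_{k=2}^\infty k(k-1)q^{k-2} = pq\,\frac{d^2}{dq^2}\left(\sum_{k=0}^\infty q^k\right) = pq\,\frac{d^2}{dq^2}\left(\frac{1}{1-q}\right) = \frac{2pq}{(1-q)^3} = \frac{2q}{p^2} = \frac{2(1-p)}{p^2}. $$ Finally, note that $\frac{1-p}{p^2}=\frac{q}{p^2}$, which is exactly the expression asked for in the question: the failures-before-first-success variable is just $X-1$, and shifting by a constant does not change the variance.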


Not an answer to your question, but a suggestion to follow an alternative route (too long for a comment).

Let $S$ denote the event that the first experiment is a success and let $F$ denote the event that the first experiment is a failure. Then make use of: $$\mathbb EX^n=\mathbb E(X^n|S)P(S)+\mathbb E(X^n|F)P(F)=\mathbb E(1+X)^nq$$ (given $S$ we have $X=0$, so the first term vanishes, and given $F$ the experiments start over, so $X$ is distributed as $1+X$). Use this for $n=1$ and $n=2$ respectively.

It leads to expressions for $\mathbb EX$, $\mathbb EX^2$ and consequently $\text{Var}X=\mathbb EX^2-(\mathbb EX)^2$.
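For concreteness, a quick sketch of where this leads (with $q=1-p$ as in the question): $$\mathbb EX=q\,\mathbb E(1+X)=q(1+\mathbb EX)\ \Longrightarrow\ p\,\mathbb EX=q\ \Longrightarrow\ \mathbb EX=\frac{q}{p},$$ $$\mathbb EX^2=q\,\mathbb E(1+X)^2=q\left(1+2\,\mathbb EX+\mathbb EX^2\right)\ \Longrightarrow\ p\,\mathbb EX^2=q\left(1+\frac{2q}{p}\right)\ \Longrightarrow\ \mathbb EX^2=\frac{q(p+2q)}{p^2},$$ and therefore $$\text{Var}X=\mathbb EX^2-(\mathbb EX)^2=\frac{q(p+2q)-q^2}{p^2}=\frac{q(p+q)}{p^2}=\frac{q}{p^2}.$$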