Proof for the calculation of mean in negative binomial distribution

Here is a purely algebraic approach. We begin by first showing that the PMF for a negative binomial distribution does in fact sum to $1$ over its support. Suppose $X \sim \operatorname{NegBinomial}(r,p)$, with PMF $$\Pr[X = x] = \binom{x-1}{r-1} p^r (1-p)^{x-r}, \quad x = r, r+1, r+2, \ldots.$$ This is the parametrization you chose. Consider the function $$f_m(z) = \sum_{k=0}^\infty \binom{k+m}{m} z^k.$$ We recall the identity $$\binom{k+m}{m} = \binom{k+m-1}{m-1} + \binom{k+m-1}{m},$$ from which it follows that $$\begin{align*} f_m(z) &= \sum_{k=0}^\infty \binom{k+m-1}{m-1}z^k + \binom{k-1+m}{m} z^k \\ &= f_{m-1}(z) + z \sum_{k=1}^\infty \binom{k-1+m}{m} z^{k-1} \\ &= f_{m-1}(z) + z f_m(z). \end{align*}$$ Consequently, $$f_m(z) = \frac{f_{m-1}(z)}{1-z}.$$ But because $$f_0(z) = \sum_{k=0}^\infty \binom{k}{0} z^k = \frac{1}{1-z},$$ it immediately follows that $$f_m(z) = (1-z)^{-(m+1)}.$$ Now letting $m = r-1$, $z = 1-p$, and $k = x-r$, we obtain $$\sum_{x=r}^\infty \Pr[X = x] = p^r (1 - (1-p))^{-(r-1+1)} = p^r p^{-r} = 1, \quad 0 < p < 1.$$ This proves that $\Pr[X = x]$ does define a valid PMF.


Next, we use this property to calculate $\operatorname{E}[X]$. By definition, $$\operatorname{E}[X] = \sum_{x=r}^\infty x \Pr[X = x].$$ But since $$x \binom{x-1}{r-1} = \frac{x!}{(r-1)!(x-r)!} = r \frac{x!}{r! (x-r)!} = r \binom{x}{r},$$ we find $$\operatorname{E}[X] = \sum_{x=r}^\infty r \binom{x}{r} p^r (1-p)^{x-r} = \frac{r}{p} \sum_{x=r+1}^\infty \binom{x-1}{(r+1)-1} p^{r+1} (1-p)^{x-(r+1)},$$ where we obtained this last expression by incrementing the lower index of summation by $1$, and decrementing the index in the summand by $1$. But you will notice that we have also rewritten the summand so that it is now apparent that it is the sum of the PMF of a negative binomial distribution with parameters $r+1$ and $p$. Thus this sum equals $1$, and we conclude $\operatorname{E}[X] = r/p$.


It is worth noting that for this purely algebraic approach, we have spent most of our effort to show that this parametrization is a valid PMF. The calculation of the expectation is quite straightforward by contrast. Also, if the variance is desired, it is best to consider $\operatorname{E}[X(X-1)],$ rather than $\operatorname{E}[X^2]$, since the former expression more readily yields to the same type of binomial coefficient manipulation that we used for $\operatorname{E}[X]$. I leave this computation as an exercise for the reader.

A final word: perhaps the most elegant computation is to exploit the fact that the negative binomial distribution is a generalization (i.e., a sum of IID) geometric random variables. But the purpose of this answer is to show how the computation can be done purely as an algebraic manipulation with very few prerequisites.


Note that in your argumentation

  • By binomial theorem, $\sum_{k=0}^{\infty}\frac{(k+r)!}{r!k!}p^r(1-p)^k$ becomes $[p+(1-p)]^{k+r} = 1$

the equation $[p+(1-p)]^{k+r} = 1$ depends on the summation index $k$ which is not permissible. Furthermore the expression

\begin{align*} [p+(1-p)]^{k+r}=\sum_{j=0}^{k+r}\binom{k+r}{j}p^j(1-p)^{k+r-j}=1 \end{align*} is a sum with a finite number of summands and not helpful in your calculation. But, besides this argumentation you are on the right track.

Starting from your last expression we obtain \begin{align*} E(x)&=r\sum_{k=0}^{\infty}\frac{(k+r)!}{r!k!}p^r(1-p)^k\\ &=rp^r\sum_{k=0}^{\infty}\binom{k+r}{k}(1-p)^k\tag{1}\\ &=rp^r\sum_{k=0}^{\infty}\binom{-(r+1)}{k}(-1)^k(1-p)^k\tag{2}\\ &=rp^r(1-(1-p))^{-(r+1)}\\ &=\frac{r}{p} \end{align*}

Comment:

  • In (1) we use the identity $\binom{-n}{k}=\binom{n+k-1}{k}(-1)^k$

\begin{align*} \binom{-n}{k}&=\frac{(-n)(-n-1)\cdots(-n-k+1)}{k!}\\ &=(-1)^k\frac{n(n+1)\cdots(n+k-1)}{k!}\\ &=(-1)^k\frac{(n+k-1)!}{k!(n-1)!}\\ &=(-1)^k\binom{n+k-1}{k} \end{align*}

  • In (2) we use the binomial series expansion

Your utilization of the Binomial theorem is wrong. In

$$\sum_{k=0}^{\infty}\frac{(k+r)!}{r!k!}p^r(1-p)^k$$

the sum $r+k$ isn't constant.


For instance, with $r=2$,

$$\sum_{k=0}^{\infty}\frac{(k+r)!}{r!k!}p^rq^k=\frac{2!}{2!}p^2+\frac{3!}{2!}p^2q+\frac{4!}{2!2!}p^2q^2+\frac{5!}{2!3!}p^2q^3\cdots\\ =\frac{p^2}2\left(2\cdot1+3\cdot2q+4\cdot3q^2+5\cdot4q^3+\cdots\right)=\frac{p^2}2\frac2{(1-q)^3}=\frac1p.$$