Calculate expectation of a geometric random variable
Your question essentially boils down to finding the expected value of a geometric random variable. That is, if $X$ is the number of trials needed to download one non-corrupt file, then $$X\sim Geo(0.2)$$ In general, if $X\sim Geo(p)$, then $$E(X)=\frac1 p$$ So in your case the expected number of trials to download an uncorrupted file is $$E(X)=\frac{1}{0.2}=5$$ Addendum: Here is a derivation of the above-mentioned result.
First note that $P(X=k)=p(1-p)^{k-1}$. The expected value is thus $$\begin{align*}E(X)&=\sum_{k=1}^{\infty}kp(1-p)^{k-1} \\ &=p\sum_{k=1}^{\infty}k(1-p)^{k-1} \\ &=p\left(-\frac{d}{dp}\sum_{k=1}^{\infty}(1-p)^k\right) \\ &=p\left(-\frac{d}{dp}\frac{1-p}{p}\right) \\ &=p\left(\frac{d}{dp}\left(1-\frac{1}{p}\right)\right)=p\left(\frac{1}{p^2}\right)=\frac1p\end{align*}$$
Derivative step: (answer to comment)
A simple application of the chain rule gives: $$ -\frac{d}{dp}\sum_{k=1}^{\infty}(1-p)^k = \sum_{k=1}^{\infty}k(1-p)^{k-1} $$ It is clear that $$ -\frac{d}{dp}\sum_{k=1}^{\infty}(1-p)^k = -\frac{d}{dp}\left( \sum_{k=1}^{\infty}(1-p)^k \right)$$ (differentiating term by term is justified because the power series converges uniformly on compact subsets of $(0,1)$). Given that $ 0 <1 - p < 1$, we can use the geometric series formula to obtain: $$ \sum_{k=1}^{\infty}(1-p)^k = \frac{1 - p}{p} $$ and the proof follows accordingly.
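A quick numeric sanity check of this step (a sketch, with $p=0.2$ assumed purely for illustration): the closed form of the inner sum, its negative derivative approximated by a central difference, and the term-by-term differentiated series should all agree.

```python
p = 0.2

# Inner sum: sum_{k>=1} (1-p)^k = (1-p)/p (geometric series); 500 terms suffice
# since the tail decays like 0.8^k.
inner = sum((1 - p)**k for k in range(1, 500))
assert abs(inner - (1 - p) / p) < 1e-12

# Negative derivative of (1-p)/p with respect to p, via a central difference...
f = lambda x: (1 - x) / x
h = 1e-6
neg_deriv = -(f(p + h) - f(p - h)) / (2 * h)

# ...matches the term-by-term differentiated series sum_{k>=1} k(1-p)^(k-1) = 1/p^2.
series = sum(k * (1 - p)**(k - 1) for k in range(1, 500))
print(neg_deriv, series)  # both close to 1/p**2 = 25
```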
An intuitive and telling approach to this is to find a functional identity (see note at the end) that the random number $X$ of downloads necessary to get an uncorrupted file satisfies. The everyday situation you describe amounts to the following:
- With probability $p=0.2$, $X=1$ (first file uncorrupted).
- With probability $1-p=0.8$, $X=1+Y$, where $Y$ is distributed like $X$ (first file corrupted, then continue with the next files).
Thus, $E[Y]=E[X]$ hence $$E[X]=p\cdot1+(1-p)\cdot(1+E[X]), $$ from which the arch-classical formula $E[X]=1/p$ follows.
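The identity $E[X]=p\cdot1+(1-p)\cdot(1+E[X])$ can also be checked numerically: the map $e\mapsto 1+(1-p)e$ is a contraction with factor $1-p$, so fixed-point iteration converges to $1/p$. A minimal sketch, assuming $p=0.2$:

```python
# Fixed-point iteration on E = p*1 + (1-p)*(1+E); p = 0.2 assumed for illustration.
p = 0.2
e = 0.0  # any starting guess works: each step shrinks the error by a factor 1-p
for _ in range(200):
    e = p * 1 + (1 - p) * (1 + e)
print(e)  # converges to 1/p = 5
```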
Note that this also yields the full distribution of $X$, for example, for every $|s|\leqslant1$, $g(s)=E[s^X]$ is such that $E[s^{Y}]=g(s)$ hence $g(s)$ must solve the corresponding identity $$g(s)=p\cdot s+(1-p)\cdot s\cdot g(s), $$ hence $$ \sum_{n\geqslant0}P[X=n]s^n=g(s)=\frac{ps}{1-(1-p)s}=ps\sum_{n\geqslant0}(1-p)^ns^n, $$ from which $P[X=n]=p(1-p)^{n-1}$ follows, for every $n\geqslant1$.
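The expansion of $g$ can be cross-checked with a short power-series computation (a sketch, $p=0.2$ assumed): matching coefficients in $(1-(1-p)s)\,g(s)=ps$ gives the recurrence $a_1=p$, $a_n=(1-p)a_{n-1}$ for $n\geqslant2$, which reproduces $P[X=n]=p(1-p)^{n-1}$.

```python
p = 0.2
q = 1 - p

# Coefficients a_n of g(s) = p*s/(1 - q*s): from (1 - q*s)*g(s) = p*s,
# matching powers of s gives a_0 = 0, a_1 = p, and a_n = q*a_{n-1} for n >= 2.
a = [0.0, p]
for n in range(2, 10):
    a.append(q * a[n - 1])

# Compare with the claimed distribution P[X = n] = p*(1-p)^(n-1).
for n in range(1, 10):
    assert abs(a[n] - p * q**(n - 1)) < 1e-12
print(a[1:4])  # the first few geometric probabilities p, pq, pq^2
```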
Note: Since some user was kind enough to upvote this a long time after it was written, I just reread the whole page. Frankly, I found appalling the insistence of a character on confusing binomial distributions with geometric distributions, but I also realized that the functional identity referred to in the first sentence of the present answer had not been made explicit, so here it is.
The distribution of the number $X$ of downloads to get an uncorrupted file is the only solution of the identity in distribution $$X\stackrel{(d)}{=}1+BX,$$ where the random variable $B$ on the RHS is independent of $X$ and Bernoulli distributed with $$P(B=0)=p,\qquad P(B=1)=1-p.$$
This merely summarizes the description in words at the beginning of this post, and allows one to deduce all the mathematical results above. This also yields a representation of $X$ as
$$X\stackrel{(d)}{=}1+\sum_{n=1}^\infty\prod_{k=1}^nB_k,\qquad\text{with $(B_k)$ i.i.d. and distributed as }B.$$
Finally, note that every positive-integer-valued random variable $X$ can be represented as the sum of such a series for some independent sequence of Bernoulli random variables $(B_k)$, but that the distribution of $B_k$ being independent of $k$ characterizes the fact that the distribution of the sum $X$ is geometric.
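A short Monte Carlo sketch of this representation (the value $p=0.2$ and the helper name `sample_X` are assumptions for illustration): the partial products $\prod_{k\leqslant n}B_k$ stay equal to $1$ until the first $B_k=0$ and vanish afterwards, so the series telescopes into counting the leading run of $1$s.

```python
import random

def sample_X(p, rng, max_terms=10_000):
    """Draw X = 1 + sum_{n>=1} prod_{k=1}^n B_k with i.i.d. B_k, P(B_k = 1) = 1 - p.

    Each partial product is 1 until the first B_k = 0 and 0 from then on,
    so the sum simply counts the leading run of B_k = 1."""
    x = 1
    for _ in range(max_terms):
        if rng.random() < p:   # B_k = 0 with probability p: the series stops growing
            break
        x += 1                 # B_k = 1: the partial product is still 1
    return x

p, n = 0.2, 100_000
rng = random.Random(1)  # fixed seed for reproducibility
mean = sum(sample_X(p, rng) for _ in range(n)) / n
print(mean)  # close to 1/p = 5, consistent with X being Geo(p)
```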
A clever way to find the expected value of a geometric r.v. is the one employed in this video lecture of the MITx course "Introduction to Probability: Part 1 - The Fundamentals" (by the way, an extremely enjoyable course), based on (a) the memoryless property of the geometric r.v. and (b) the total expectation theorem.
If you compute $E[X]$ as the sum over the two leaves of the probability tree for the first outcome, you end up with
$$E[X] = 1 + p\,E[X-1 \mid X=1] + (1-p)\,E[X-1 \mid X>1].$$
The first conditional expectation is $0$, and by the memoryless property $E[X-1 \mid X>1] = E[X]$, so
$$E[X] = 1 + 0 + (1-p)E[X],$$
from which, solving for $E[X]$, you find $E[X] = 1/p$.
Let $X \sim Geom(p)$. Then
$\begin{align} \mathbb{E}[X] & = \sum_{n=1}^\infty n(1-p)^{n-1}p\\ & = p\Sigma_1 \end{align}$
where $\Sigma_1 = \sum_{n=1}^\infty n(1-p)^{n-1}$. Then let $\Sigma_0 = 1 + (1-p) + (1-p)^2 + \ldots = \frac{1}{1 - (1-p)} = \frac{1}{p}$, as we have a geometric series. Then we have
$\begin{align} \Sigma_1 & = 1 + 2(1-p) + 3(1-p)^2 + \ldots\\ (1-p)\Sigma_1 & = (1-p) + 2(1-p)^2 + \ldots\\ (1 - 1 + p)\Sigma_1 & = p\Sigma_1 = 1 + (1-p) + (1-p)^2 + \ldots = \Sigma_0 = \frac{1}{p}\\ \Sigma_1 & = \frac{1}{p^2} \end{align}$
And from above we know that $\mathbb{E}[X] = p\Sigma_1$. So finally:
$\begin{equation*} \mathbb{E}[X] = p\Sigma_1 = p\cdot\frac{1}{p^2} = \frac{1}{p} \end{equation*}$
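The shifted-series manipulation can be sanity-checked with partial sums (a sketch, $p=0.2$ assumed): the tails decay geometrically, so a few hundred terms reproduce $\Sigma_0=1/p$ and $\Sigma_1=1/p^2$ to machine precision.

```python
p = 0.2
q = 1 - p

# Partial sums of Sigma_0 and Sigma_1; 500 terms suffice since the tails are O(q^n).
sigma0 = sum(q**n for n in range(500))                # -> 1/p
sigma1 = sum(n * q**(n - 1) for n in range(1, 500))   # -> 1/p**2
print(sigma0, sigma1, p * sigma1)  # ~5, ~25, and E[X] = p*Sigma_1 ~ 5
```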