A binomial random variable is a sum of independent Bernoulli random variables. So if you accept that a Bernoulli random variable has variance $p(1-p)$, then the formula for the variance of a binomial random variable follows from the "variance of sum" rule. Moreover, the variance of a Bernoulli random variable can be seen at a glance using the formula $\text{Var}(X) = E(X^2) - E(X)^2$.
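To spell out that "at a glance" computation: a Bernoulli variable $X$ with $P(X=1)=p$ satisfies $X^2 = X$, so $E(X^2) = E(X) = p$ and
$$\text{Var}(X) = E(X^2) - E(X)^2 = p - p^2 = p(1-p).$$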


I understand the logic to be the same in the binomial case as in the Bernoulli case because of independence: once the Bernoulli variance is in hand, the binomial variance follows by adding up independent copies. For a Bernoulli random variable, just as for the binomial, the concept of variance is implicit in the notion of expected value itself. The further the expected value per trial is from 1 or 0, the more variability we have in our results: we can say less and less that the outcomes are all ones or all zeros, and the mix gets closer to equal numbers of ones and zeros. Variability is therefore maximised as the mean value $p$ approaches $1/2$.

When the expected value is neither one nor zero (that is, when there is genuine randomness in the outcome), we start to record distances of $x$ from the mean value. When $x = 0$, i.e. the outcome does not occur, which happens with probability $1-p$, the distance from the mean is $p$, the expected value itself. When $x = 1$, with the complementary probability $p$, the distance is the complement $1-p$. Calculating the expected squared distance of $x$ from its mean means taking account of both distances weighted by their probabilities, and by symmetry each of the two terms contains the same product: $p$ (the expected value of $x$) times $1-p$ (the probability that the outcome does not occur). This symmetry reflects the fact that deviations from $x = 0$ and from $x = 1$ are equally relevant to the variability, because each equally measures the deviation from certainty we have about the event; however, the contribution each provides has in both instances to be tempered by the probability of the value it is measured from. Those leftover factors, $p$ and $1-p$, sum to one, so the duplicated product collapses to a single copy. Thus the variance is simply the product of the expected value of $x$ and the probability of the non-occurrence of the outcome: $p(1-p)$.
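In symbols, the verbal argument above is just the computation of the expected squared distance from the mean $p$:
$$\text{Var}(X) = (0-p)^2(1-p) + (1-p)^2\,p = p(1-p)\big[p + (1-p)\big] = p(1-p).$$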


You have $\text{Var}(X+Y) = \text{Var}(X)+\text{Var}(Y)$ for independent variables, and we can write $$ K = \sum_{i=1}^{n} I_i, $$ with i.i.d. indicators $I_i$, because the number of successes has the same distribution as the number of times $I_i$ equals 1, and the indicators sum to the number of successes.
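As a quick sanity check on the indicator-sum representation, here is a minimal simulation sketch (assuming NumPy is available; the values $n=20$, $p=0.3$ are illustrative, not from the question): the number of successes built as a sum of Bernoulli indicators has the same mean and variance as a direct binomial draw.

```python
import numpy as np

rng = np.random.default_rng(0)       # seed chosen only for reproducibility
n, p, trials = 20, 0.3, 100_000      # illustrative values

# K built as a sum of n independent Bernoulli(p) indicators I_i
indicators = rng.random((trials, n)) < p
K_from_indicators = indicators.sum(axis=1)

# K drawn directly from the Binomial(n, p) distribution
K_binomial = rng.binomial(n, p, size=trials)

print(K_from_indicators.mean(), K_binomial.mean())  # both ≈ n*p       = 6.0
print(K_from_indicators.var(),  K_binomial.var())   # both ≈ n*p*(1-p) = 4.2
```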

Now $I_i = 1$ with probability $p$ and $I_i = 0$ with probability $1-p$, and also $$ \text{Var}(I_i) = p(1-p). $$ One can motivate this formula by noting that the variance is quadratic in $p$: if almost all the mass is at 1 then we have almost zero variance, and if almost all the mass is at 0 then, symmetrically, we have almost zero variance; from there you get the formula. The factor $n$ comes from the principle that the variance of a sum of independent variables equals the sum of the individual variances.
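Spelled out, with independence doing the work in the middle equality:
$$\text{Var}(K) = \text{Var}\!\left(\sum_{i=1}^{n} I_i\right) = \sum_{i=1}^{n}\text{Var}(I_i) = n\,p(1-p).$$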