The derivation of the Wald interval

I'm asking about the binomial proportion confidence interval, also known as the Wald interval.

Recall that $$\lim_{n \to \infty}{P_p \left( -z_{1-\frac{\alpha}{2}}\frac{\sigma}{\sqrt{n}}+\bar{X_n} \leq p \leq z_{1-\frac{\alpha}{2}}\frac{\sigma}{\sqrt{n}}+\bar{X_n} \right)} = 1-\alpha, $$ with $\sigma = \sqrt{p(1-p)}$.

Starting from the expression above, and the fact that for $\hat{p}=\dfrac{\sum X_i}{n}$ the estimator $\hat{\sigma}=\sqrt{\hat{p}(1-\hat{p})}$ is consistent for $\sigma$ (where $X_i \sim \mathrm{Bin}(1,p)$), what argument can I use to show that $$\left[-z_{1-\frac{\alpha}{2}}\frac{\hat{\sigma}}{\sqrt{n}}+\bar{X_n} ,\ z_{1-\frac{\alpha}{2}}\frac{\hat{\sigma}}{\sqrt{n}}+\bar{X_n}\right]$$ is a confidence interval?


Solution 1:

The Wald confidence interval for binomial success probability $p$ depends on two approximations.

(1) That $Z = \frac{\hat p - p}{\sqrt{p(1-p)/n}}$ is approximately standard normal, $\mathsf{Norm}(0, 1)$. Thus one would have $P(-1.96 < Z < 1.96) \approx 0.95.$ This is a good approximation if $n$ is large and $p$ is not too far from $1/2.$ [A common rule of thumb is that $np$ and $n(1-p)$ should both exceed 5.]

From there, simple algebra gives $$P\left(\hat p - 1.96\sqrt{p(1-p)/n} < p < \hat p + 1.96\sqrt{p(1-p)/n} \right) \approx .95.$$ This is promising because $p$ is 'isolated' (after a fashion) between two 'bounds', but not useful in practice for making a confidence interval because $\sqrt{p(1-p)/n}$ is unknown.
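The algebra behind the display above can be spelled out as follows, writing $s = \sqrt{p(1-p)/n}$ as shorthand (a symbol introduced here, not in the original argument) and assuming $0 < p < 1$ so that $s > 0$:

$$\begin{aligned} -1.96 < \frac{\hat p - p}{s} < 1.96 &\iff -1.96\,s < \hat p - p < 1.96\,s \\ &\iff -\hat p - 1.96\,s < -p < -\hat p + 1.96\,s \\ &\iff \hat p - 1.96\,s < p < \hat p + 1.96\,s. \end{aligned}$$

The last step multiplies through by $-1$, which reverses both inequalities.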

(2) This leads to the second assumption, that if $n$ is sufficiently large, then $\hat p$ will be sufficiently close to $p$ that we can write

$$P\left(\hat p - 1.96\sqrt{\hat p(1-\hat p)/n} < p < \hat p + 1.96\sqrt{\hat p(1- \hat p)/n} \right) \approx .95.$$
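One standard way to make this step rigorous as $n \to \infty$ is Slutsky's theorem. The sketch below uses the question's notation ($\sigma = \sqrt{p(1-p)}$, $\hat\sigma = \sqrt{\hat p(1-\hat p)}$, $\hat p = \bar X_n$) and assumes $0 < p < 1$; it is a supplement to the heuristic argument here, not part of it.

$$\frac{\hat p - p}{\hat\sigma/\sqrt{n}} \;=\; \underbrace{\frac{\hat p - p}{\sigma/\sqrt{n}}}_{\xrightarrow{\;d\;}\ \mathsf{Norm}(0,1)\ \text{(CLT)}} \cdot \underbrace{\frac{\sigma}{\hat\sigma}}_{\xrightarrow{\;P\;}\ 1\ \text{(consistency)}} \;\xrightarrow{\;d\;}\; \mathsf{Norm}(0,1).$$

Consequently

$$\lim_{n\to\infty} P_p\left(-z_{1-\frac{\alpha}{2}} \le \frac{\hat p - p}{\hat\sigma/\sqrt{n}} \le z_{1-\frac{\alpha}{2}}\right) = 1-\alpha,$$

and rearranging the event inside the probability gives $\hat p - z_{1-\frac{\alpha}{2}}\hat\sigma/\sqrt{n} \le p \le \hat p + z_{1-\frac{\alpha}{2}}\hat\sigma/\sqrt{n}$. The event $\hat\sigma = 0$, where the ratio is undefined, has probability tending to $0$, so it does not affect the limit.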

Thus an approximate 95% confidence interval for $p$ is of the form $\hat p \pm 1.96\sqrt{\hat p(1- \hat p)/n}.$ The same holds for other confidence levels, with the appropriate standard normal quantile replacing 1.96 (for example, 1.645 for a 90% CI and 2.576 for a 99% CI).
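As a small illustration, the interval formula can be computed as in the sketch below. The function name `wald_interval` and its arguments are made up for this example, and the normal quantile is taken from Python's standard library rather than from a table.

```python
from math import sqrt
from statistics import NormalDist

def wald_interval(successes, n, conf=0.95):
    """Wald interval: p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n).

    Illustrative helper (hypothetical name); `conf` is the nominal level.
    """
    # Two-sided standard normal quantile, about 1.96 for conf = 0.95.
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    p_hat = successes / n
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width

lo, hi = wald_interval(40, 100)
print(round(lo, 3), round(hi, 3))  # 0.304 0.496
```

Note that with 0 successes (or 0 failures) the half-width is 0 and the interval collapses to a single point, which is one symptom of the poor coverage discussed in the notes.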

*Notes:* Unfortunately, as shown by intensive computations for various values of $n$ and $p,$ the actual 'coverage probability' of the Wald interval computed with 1.96 can be far from 95% (and, what is worse, often far *below* 95%). The same is true for other 'target' confidence levels. (A key reference is Brown, Cai, and DasGupta, 2001.)

If $n$ is several hundred or several thousand (as in a public opinion poll or a large-scale simulation), the Wald interval is tolerably accurate. Otherwise, for a 95% CI with smaller $n$, a considerable improvement is to add two artificial successes and two artificial failures to the data before computing $\hat p$ and $n$. This adjustment (due to Agresti and Coull, 1998) is now widely used instead of the Wald interval. (See Wikipedia.)
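A minimal sketch of this "two extra successes, two extra failures" adjustment for a 95% interval follows. The name `plus_four_interval` is hypothetical, and clipping the endpoints to $[0, 1]$ is a convenience added here rather than part of the adjustment itself.

```python
from math import sqrt

def plus_four_interval(successes, n, z=1.96):
    """Adjusted 95% interval: add 2 successes and 2 failures, then use the Wald form.

    Illustrative helper (hypothetical name); z = 1.96 targets 95% confidence.
    """
    n_adj = n + 4                      # two extra successes + two extra failures
    p_adj = (successes + 2) / n_adj    # adjusted point estimate
    half_width = z * sqrt(p_adj * (1 - p_adj) / n_adj)
    # Clipping to [0, 1] is an assumption of this sketch.
    return max(0.0, p_adj - half_width), min(1.0, p_adj + half_width)

# With 0 successes in 10 trials the Wald interval would be the single point 0;
# the adjusted interval has positive width.
print(plus_four_interval(0, 10))
```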

The Wilson interval (again, see Wikipedia) results from squaring the inequality $-1.96 < Z < 1.96$ and then solving the resulting quadratic in $p$, which (truly) isolates $p$ without making assumption (2). Replacing 1.96 by 2 in the 95% Wilson CI gives nearly the same result as the simpler Agresti-Coull interval.
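For concreteness, solving $(\hat p - p)^2 < z^2\,p(1-p)/n$ for $p$ gives the closed form used in the sketch below. The function name `wilson_interval` is chosen for this example only.

```python
from math import sqrt

def wilson_interval(successes, n, z=1.96):
    """Wilson interval: roots in p of (p_hat - p)^2 = z^2 * p * (1 - p) / n.

    Illustrative helper (hypothetical name).
    """
    p_hat = successes / n
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half_width = (z / denom) * sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2))
    return center - half_width, center + half_width

lo, hi = wilson_interval(40, 100)
print(round(lo, 4), round(hi, 4))

# With z = 2 the midpoint is (x + 2) / (n + 4), the adjusted estimate
# obtained by adding two successes and two failures.
lo2, hi2 = wilson_interval(40, 100, z=2)
print((lo2 + hi2) / 2, 42 / 104)
```

The second print illustrates why the two intervals nearly agree: with $z = 2$ they share the same center, and only the widths differ slightly.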

The plots below show *actual* coverage probabilities of nominal 95% Wald and Agresti-Coull CIs for 2000 values of $p$ between 0 and 1, with $n = 100$. The rapid oscillation of the coverage probability under even small changes in $p$ is due to the discreteness of the binomial distribution.

[Figure: actual coverage probability versus $p$ for nominal 95% Wald and Agresti-Coull intervals, $n = 100$.]
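Coverage probabilities of this kind can be computed exactly by summing binomial probabilities over the outcomes $x$ whose interval contains $p$. The sketch below is one way to do it, not necessarily how the original plots were produced; the function names and the particular grid of 2000 values of $p$ are assumptions of this example.

```python
from math import comb, sqrt

def wald(x, n, z=1.96):
    p_hat = x / n
    h = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - h, p_hat + h

def agresti_coull(x, n, z=1.96):
    # Add two successes and two failures, then apply the Wald form.
    n_adj = n + 4
    p_adj = (x + 2) / n_adj
    h = z * sqrt(p_adj * (1 - p_adj) / n_adj)
    return p_adj - h, p_adj + h

def coverage(interval, n, p):
    """Exact P(interval(X, n) contains p) for X ~ Binomial(n, p)."""
    total = 0.0
    for x in range(n + 1):
        lo, hi = interval(x, n)
        if lo <= p <= hi:
            total += comb(n, x) * p**x * (1 - p)**(n - x)
    return total

n = 100
# 2000 values of p strictly between 0 and 1 (grid chosen for this sketch).
grid = [(i + 0.5) / 2000 for i in range(2000)]
wald_cov = [coverage(wald, n, p) for p in grid]
ac_cov = [coverage(agresti_coull, n, p) for p in grid]

print("min Wald coverage:         ", min(wald_cov))
print("min Agresti-Coull coverage:", min(ac_cov))
```

Plotting `wald_cov` and `ac_cov` against `grid` with any plotting library would give a picture of the oscillating coverage described above.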