What is continuity correction in statistics

The continuity correction comes up most often when we are using the normal approximation to the binomial. It comes up sometimes when we are approximating a Poisson distribution with large $\lambda$ by a normal.

Let $X$ be a binomially distributed random variable that represents the number of successes in $n$ independent trials, where the probability of success on any trial is $p$. Let $Y$ be a normal random variable with the same mean $np$ and the same variance $np(1-p)$ as $X$.

Suppose that $np(1-p)$ is not too small. Then if $k$ is an integer, $\Pr(X\le k)$ is reasonably well-approximated by $\Pr(Y\le k)$. It is ordinarily better approximated by $\Pr(Y\le k+\frac{1}{2})$. The difference can be significant when $n$ is not large. When $np(1-p)$ is big, say bigger than $100$, the continuity correction makes little practical difference.
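To see how much the correction matters at different sizes, here is a minimal sketch using only the Python standard library. The helper names (`binom_cdf`, `normal_cdf`, `compare`) and the particular values of $n$ and $k$ are my own choices for illustration, not anything canonical.

```python
from math import comb, erf, sqrt

def binom_cdf(k, n, p):
    """Exact Pr(X <= k) for X ~ Binomial(n, p), summed term by term."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def normal_cdf(x, mean, sd):
    """Pr(Y <= x) for Y ~ Normal(mean, sd^2), via the error function."""
    return 0.5 * (1 + erf((x - mean) / (sd * sqrt(2))))

def compare(k, n, p):
    """Return (exact, plain approximation, continuity-corrected approximation)."""
    mean, sd = n * p, sqrt(n * p * (1 - p))
    return (binom_cdf(k, n, p),
            normal_cdf(k, mean, sd),
            normal_cdf(k + 0.5, mean, sd))

# Illustrative cases (assumed values): n*p*(1-p) = 5 and n*p*(1-p) = 100.
small = compare(12, 20, 0.5)
large = compare(210, 400, 0.5)

for label, (exact, plain, corrected) in (("n=20", small), ("n=400", large)):
    print(f"{label}: exact={exact:.4f} plain={plain:.4f} corrected={corrected:.4f}")
```

In both cases $k$ sits about one standard deviation above the mean, so the two rows are comparable. The gap between the plain and corrected approximations should shrink as $n$ grows.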

The continuity correction is less important than it used to be, since with modern software we can compute $\Pr(X\le k)$ essentially exactly.

It is easy to get confused when using the continuity correction. In particular, the question that you asked comes up: when do we add $\frac{1}{2}$, and when do we subtract? I deal with that by remembering only one rule. To repeat,

Rule: If $k$ is an integer, then $\Pr(X\le k)\approx \Pr(Y\le k+\frac{1}{2})$, where $Y$ is a normal with the same mean and variance as $X$.

Let us look at a couple of examples. Let $X$ have a binomial distribution. Approximate the probability that $X\lt k$, where $k$ is an integer. This doesn't quite look like our Rule. Note we have $\lt k$, not $\le k$. But $X\lt k$ if and only if $X\le k-1$. Now it has the right shape. The answer is, approximately, $\Pr(Y\le (k-1)+\frac{1}{2})$, where $Y$ is the appropriate normal. This is $\Pr(Y\le k-\frac{1}{2})$, so in a sense we subtracted. But it all came from the one Rule, where we always add, but pay close attention to the difference between $\lt$ and $\le$.

What is the probability that $X\gt k$? This is $1-\Pr(X\le k)$. Thus we get that the result is approximately $1-\Pr(Y\le k+\frac{1}{2})$.
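The same bookkeeping can be written as code. This is only a sketch: the function names `approx_le`, `approx_lt`, and `approx_gt` are hypothetical, and the point is that the $+\frac{1}{2}$ appears in exactly one place, with the other cases reduced to it.

```python
from math import erf, sqrt

def normal_cdf(x, mean, sd):
    """Pr(Y <= x) for Y ~ Normal(mean, sd^2), via the error function."""
    return 0.5 * (1 + erf((x - mean) / (sd * sqrt(2))))

def approx_le(k, n, p):
    """The Rule: Pr(X <= k) is approximated by Pr(Y <= k + 1/2)."""
    mean, sd = n * p, sqrt(n * p * (1 - p))
    return normal_cdf(k + 0.5, mean, sd)

def approx_lt(k, n, p):
    """Pr(X < k) = Pr(X <= k - 1), so apply the Rule to k - 1."""
    return approx_le(k - 1, n, p)

def approx_gt(k, n, p):
    """Pr(X > k) = 1 - Pr(X <= k), so apply the Rule to k and take the complement."""
    return 1 - approx_le(k, n, p)
```

For instance, `approx_lt(10, 20, 0.5)` evaluates the normal at $9.5$, which is the "subtract" case, without any separate rule for it.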

A numerical example: Toss a fair coin $100$ times. Approximate the probability that the number of heads is $\le 55$.

Working directly with the binomial, using software, I get that this is, to $6$ figures, $0.864373$. That's the "right" answer.

Using $\Pr(Y\le 55)$, where $Y$ is normal with mean $50$ and standard deviation $5$, and no continuity correction, I get the approximation $0.8413$.

Using the continuity correction, that is, $\Pr(Y\le 55.5)$, I get the approximation $0.8643$. I should really do a few other examples; the continuity correction is too good here!
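If you want to check these numbers yourself, here is a small sketch using only the Python standard library. The variable names are mine, and the normal CDF is computed from `math.erf` rather than from a statistics package.

```python
from math import comb, erf, sqrt

n, k = 100, 55

# Exact binomial: with a fair coin every outcome has probability 1/2**n,
# so Pr(X <= k) is a count of outcomes divided by 2**n.
exact = sum(comb(n, i) for i in range(k + 1)) / 2**n

mean = n * 0.5              # 50
sd = sqrt(n * 0.5 * 0.5)    # 5

def normal_cdf(x, mean, sd):
    """Pr(Y <= x) for Y ~ Normal(mean, sd^2), via the error function."""
    return 0.5 * (1 + erf((x - mean) / (sd * sqrt(2))))

plain = normal_cdf(k, mean, sd)            # no continuity correction
corrected = normal_cdf(k + 0.5, mean, sd)  # with continuity correction

print(f"exact     = {exact:.6f}")
print(f"plain     = {plain:.4f}")
print(f"corrected = {corrected:.4f}")
```

The three printed values should land close to the figures quoted above.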