The Wikipedia page for the binomial distribution states the following lower bound, which I suppose can also be generalized to a Chernoff-style lower bound.

$$\Pr(X \le k) \geq \frac{1}{(n+1)^2} \exp\left(-nD\left(\frac{k}{n}\,\Big\|\,p\right)\right) \quad\quad\mbox{if } p<\frac{k}{n}<1$$

Clearly this is tight up to the $(n+1)^{-2}$ factor, since the exponential term matches the Chernoff upper bound on the corresponding tail.

However, computationally it seems that $(n+1)^{-1}$ would be tight as well; even $(n+1)^{-0.7}$ seems to be fine.
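For concreteness, here is a minimal sketch of such a check in Python (assuming SciPy; the grid of $n$ and $p$ values is arbitrary):

```python
import numpy as np
from scipy.stats import binom

def kl(a, p):
    """Bernoulli KL divergence D(a || p) in nats."""
    return a * np.log(a / p) + (1 - a) * np.log((1 - a) / (1 - p))

def holds(n, p, c):
    """Does Pr(X <= k) >= (n+1)^(-c) exp(-n D(k/n || p)) whenever p < k/n < 1?"""
    ks = np.arange(int(np.floor(n * p)) + 1, n)  # all k with p < k/n < 1
    if len(ks) == 0:
        return True
    lhs = binom.cdf(ks, n, p)
    rhs = (n + 1) ** (-c) * np.exp(-n * kl(ks / n, p))
    return bool(np.all(lhs >= rhs))

# Conjecture: c = 1 (and apparently even c = 0.7) is enough.
print(all(holds(n, p, 0.7)
          for n in range(2, 200)
          for p in np.linspace(0.05, 0.95, 19)))
```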

It's not as easy to find lower bounds for tails as it is to find upper bounds, but for the normal distribution there is a standard bound, valid for $x>0$:

$$\int_x^\infty e^{-t^2/2}dt \ge (1/x-1/x^3)e^{-x^2/2}$$
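This one is easy to sanity-check numerically, using the identity $\int_x^\infty e^{-t^2/2}\,dt = \sqrt{\pi/2}\,\operatorname{erfc}(x/\sqrt{2})$ (a Python sketch assuming SciPy; the range of $x$ values is arbitrary):

```python
import numpy as np
from scipy.special import erfc

x = np.linspace(0.5, 8, 200)
tail = np.sqrt(np.pi / 2) * erfc(x / np.sqrt(2))  # integral of e^{-t^2/2} over [x, inf)
lower = (1 / x - 1 / x**3) * np.exp(-x**2 / 2)
print(np.all(tail >= lower))
```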

My question is thus: is the $\frac{1}{(n+1)^2}$ factor the best known, or can $\frac{1}{n+1}$ be shown to be sufficient?

Update: Here is the region in which the conjecture holds numerically, computed with Mathematica:

*[plot: where the conjecture holds]*


Solution 1:

Update: I wrote a note surveying different proofs of varying precision; it gets all the way down to $1+o(1)$ sharpness.

I can at least show that $(n+1)^2$ can be improved to $\sqrt{2n}$:

$$\begin{align} \sum_{i=0}^k {n \choose i} p^i (1-p)^{n-i} &\ge {n \choose k} p^k (1-p)^{n-k}\\ &= {n \choose k} \exp\left(-n\left(\tfrac{k}{n} \log\tfrac{1}{p} + \left(1-\tfrac{k}{n}\right)\log\tfrac{1}{1-p}\right)\right)\\ &\ge \frac{\exp\left(n\text{H}(k/n)\right)}{\sqrt{8k(1-k/n)}}\, \exp\left(-n\left(\text{D}(k/n\,\|\,p) + \text{H}(k/n)\right)\right)\\ &= \frac{1}{\sqrt{8k(1-k/n)}}\exp\left(-n\text{D}(k/n\,\|\,p)\right)\\ &\ge \frac{1}{\sqrt{2n}}\exp\left(-n\text{D}(k/n\,\|\,p)\right) \end{align}$$

The third line rewrites the exponent via $\tfrac{k}{n}\log\tfrac{1}{p}+\left(1-\tfrac{k}{n}\right)\log\tfrac{1}{1-p}=\text{D}(k/n\,\|\,p)+\text{H}(k/n)$, and the last line uses $\tfrac{k}{n}\left(1-\tfrac{k}{n}\right)\le\tfrac14$, so that $8k(1-k/n)\le 2n$.

Here I've used the lower bound ${n\choose an}\ge\frac{1}{\sqrt{8na(1-a)}}\exp(n\text{H}(a))$ for the binomial coefficient, from http://www.lkozma.net/inequalities_cheat_sheet/ineq.pdf. I'd be happy if anyone can provide a better reference.
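That coefficient bound is also easy to verify by brute force (plain Python; the range of $n$ is arbitrary, and a hair of floating-point slack is allowed since equality is attained at $n=2$, $k=1$):

```python
from math import comb, exp, log, sqrt

def H(a):
    """Binary entropy in nats."""
    return -a * log(a) - (1 - a) * log(1 - a)

# C(n, k) >= exp(n H(k/n)) / sqrt(8 k (1 - k/n)), i.e. 8na(1-a) with a = k/n.
print(all(
    comb(n, k) >= (1 - 1e-9) * exp(n * H(k / n)) / sqrt(8 * k * (1 - k / n))
    for n in range(2, 400)
    for k in range(1, n)
))
```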

We see that it is sharp in the sense that ${2\choose1} = \frac{1}{\sqrt{2\cdot2}}\exp(2\text{H}(1/2))$. Also, by A.S.'s comments, we see that the bound is asymptotically sharp, up to a constant depending on $p$ and $k$.
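Finally, a numerical sanity check of the $\frac{1}{\sqrt{2n}}$ bound itself (again a Python sketch assuming SciPy; the parameter grid is arbitrary):

```python
import numpy as np
from scipy.stats import binom

def kl(a, p):
    """Bernoulli KL divergence D(a || p) in nats."""
    return a * np.log(a / p) + (1 - a) * np.log((1 - a) / (1 - p))

# Pr(X <= k) >= exp(-n D(k/n || p)) / sqrt(2n), for all 0 < k < n.
ok = True
for n in (10, 50, 200):
    for p in (0.1, 0.3, 0.5, 0.7, 0.9):
        ks = np.arange(1, n)
        ok &= bool(np.all(binom.cdf(ks, n, p)
                          >= np.exp(-n * kl(ks / n, p)) / np.sqrt(2 * n)))
print(ok)
```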

Update: R. B. Ash, *Information Theory*, is a reference for the binomial coefficient approximation, and in fact it derives exactly the same bound for the distribution as well.