How do we know PA is incomparable with PRA + $\epsilon_0$?

This is far from optimal, but it is fun:

Note that PRA isn't literally in the same language as PA, so by "PRA" we really mean "an appropriate rephrasing of PRA in the language of PA." This isn't an issue in this context, but it's worth pointing out.

Let's shift attention. The theory $I\Sigma_1$ consists of basic arithmetic (the ordered semiring axioms) and induction for $\Sigma_1$-formulas. It's not hard to check that $I\Sigma_1$ contains PRA ($I\Sigma_1$ proves "every primitive recursive function is total").

So it's enough to show that "$I\Sigma_1+\epsilon_0$" doesn't contain PA. Why is $I\Sigma_1$ a better theory to use here? Well, it turns out that $I\Sigma_1$ is finitely axiomatizable. Hence $I\Sigma_1+\epsilon_0$ is also finitely axiomatizable.

  • The key here: induction for bounded-complexity formulas along $\epsilon_0$ looks like a scheme, but can be captured by a single sentence via the appropriate truth-predicate. (Of course that breaks down if we drop "bounded" ...)

... So what? Well, this is the neat bit: there is a beautiful model-theoretic proof of the incompleteness of PA (due to Kripke, written up by Putnam) which has as a corollary that no finitely axiomatizable extension of PA in the language of PA is consistent (the key aspect of PA being that it proves the consistency of each of its finite subtheories). Bam, we're done.