Why is $\omega$-consistency needed in Gödel's original Incompleteness proof?

The issue is already present in step 1. Consistency is not a strong enough assumption on the theory to guarantee that the provability relation is representable.

For example, if $T$ is taken to be $\text{PA} + \lnot\text{Con}(PA)$, which is not $\omega$-consistent, then we have $T \vdash \text{Pvbl}_T(\phi)$ for all $\phi$, but $T$ is consistent, so for many $\phi$ we also have $T \not \vdash \phi$. Thus we do not have the whole scheme $T \vdash \phi \Leftrightarrow T \vdash \text{Pvbl}_T(\phi)$, as claimed by (1).

The source of confusion may be that you seem to be quantifying over predicates when you write "a predicate $PRV_F$ exists s.t. $F \vdash A \Leftrightarrow F \vdash PRV_F(A)$". In fact we construct a specific predicate $\text{Pvbl}$ in order to derive scheme (1) from the question.

In order to show that $T \vdash A$ implies $T \vdash \text{Pvbl}(A)$, which is part of showing that $\text{Pvbl}$ does represent the provability relation, we rely on $\text{Pvbl}$ being the actual provability predicate, so that we can transform any derivation of $A$ into a derivation of $\text{Pvbl}(A)$.

Similarly, in showing that $T \vdash \text{Pvbl}(A)$ implies $T \vdash A$, we use the $\omega$-consistency of $T$ to know that, if $T \vdash \text{Pvbl}(A)$ then there is actually a natural number $n$ that codes a derivation of $A$ from $T$.

You may be thinking of general results that show that certain theories are able to represent every computable set, or weakly represent every r.e. set. It is true that the representability of the provability relation follows from these general results. But the proofs of these general results also assume that the theory is $\omega$-consistent, or at least $1$-consistent, not just that the theory is consistent.


Since I had some issues with this topic, too, I would like to share my thoughts on it.

Notice, that I use $\color{red}{red}$ whenever the so denoted things are meant to be syntactic ones. This was (and remains) to me always the key to get things clear: in which "world" are these objects I'm thinking about?

1. There are two different predicates involved:

First, one construct the proof-predicate $Pr_F(x,y)$ in the recursive world, which states that $x$ codes a proof (in $F$ with the Hilbert Calculus) for the formula which is coded by $y$. This is strongly representable in $F$, hence, there is a formula $\color{red}{Pr_F}$ such that:

$$\begin{align}Pr_F(x,y) &\Leftrightarrow F\vdash \color{red}{Pr_F(\overline{x},\overline{y})}\\\neg Pr_F(x,y)&\Leftrightarrow F \vdash\color{red}{\neg Pr_F(\overline x,\overline y)}\end{align}$$

Now, what we are interested in is more: we want to say that some formula is provable, i.e. that there is a proof. So, second, we define the provable-predicate in the first-order language world:

$$\color{red}{P_F(y)\leftrightarrow\exists x . Pr_F(x,y)}$$

Now, what you've stated in (1) holds for $Pr_F$, but it makes no sense to state such a property for $\color{red}{P_F}$, since this is just a formula, but not a primitive recursive predicate. Even more: if we would like to define it as a primitive recursive predicate (at first, then get it via representability into the syntactic world), we couldn't, since there is an unbound quantifier involved!

2. These predicates aren't equivalent in $F$

What we have now is the following:

$$\begin{align} F \vdash \color{red}\phi &\Longleftrightarrow \text{exists closed term $\color{red}{t}$, such that }F\vdash\color{red}{Pr_F(t,\ulcorner\phi\urcorner)}\\&\Longrightarrow F\vdash\color{red}{P_F(\ulcorner\phi\urcorner)}\end{align}$$

but the other direction does not need to hold. You can see it clearly here, what we would need is a closed term $\color{red}t$.

Consider the following note: The quantifiers range over the hole universe, but not every element in the universe must be expressible within the given language. (If you find some time, there are pretty simple examples you could construct. The most obvious is of course the language with one constant symbol, etc.)

Now, $\omega$-consistency to the rescue gets us the $\color{red}t$.

3. A historical note

Although Gödel's original proof assumed $\omega$-consistency it had its effect already on the community. So it shouldn't be overrated, it's more a small taint.