Computability viewpoint of Godel/Rosser's incompleteness theorem
Here I shall present very simple computability-based proofs of Godel/Rosser's incompleteness theorem, which require only basic knowledge about programs. I feel that these proofs are little known despite giving a very general form of the incompleteness theorems, and also easy to make rigorous without even depending on much background knowledge in logic. Typical proofs use something like the fixed-point lemma, which is essentially a fixed-point combinator 'applied' to provability, but that is quite a bit harder to understand and prove rigorously than the unsolvability of the halting problem. It suffices to say that all proofs use diagonalization in some way or another.
Take any practical programming language L in which programs can perform basic operations on string/integer variables and conditional jumps (or while-loops). From now on all the programs that we shall refer to are programs in L. For convenience we shall also consider every string to be a program. If it is not ordinarily a valid program, we shall consider it to represent a program that simply does an infinite-loop. (Any interpreter for L can easily be modified to implement this.)
First I shall show how the unsolvability of the halting problem implies essentially Godel's (first) incompleteness theorem.
Halting problem
Define the halting problem to be:
Given a program P and input X:
If P halts on X, then the answer is "true".
If P does not halt on X, then the answer is "false".
It is not hard to prove that there is no program that solves the halting problem. For a program to solve the halting problem, it must halt on every input (P,X), and also must output the right answer as specified in the problem. If you do not already know the proof, try before looking at the proof below!
Given any program H:
Let C be the program that does the following on input P:
If H(P,P) = "false" then output "haha" (and halt) otherwise infinite-loop (not halting).
If H solves the halting problem:
H(C,C) halts, and hence C(C) halts iff H(C,C) = "false", by definition of C.
Contradiction with definition of H.
Thus H does not solve the halting problem.
Key definitions concerning formal systems
Take any formal system T.
We say that V is a proof verifier for T iff all the following hold:
V is a program.
Given any sentence φ over T and proof x:
V(φ,x) decides (halts and answers) whether x is a proof of φ.
We say that T can reason about programs iff:
For every program P that halts on an input X and outputs Y:
T can prove the following for any string Z distinct from Y:
"The program P halts on input X."
"The program P halts on input X and outputs Y."
"It is not true that the program P halts on input X and outputs Z."
(Here P,X,Y,Z are inserted as encoded strings, but you should get the idea.)
We say that T is consistent iff:
There is no sentence φ about programs such that T proves both φ and its negation.
We say that T is complete iff:
For every sentence φ about programs we have that T proves either φ or its negation.
We say that T is sound for program halting iff:
If T proves that a program halts on an input then it really halts.
Godel's incompleteness theorem via the halting problem
Take any formal system T with proof verifier V that can reason about programs.
Let H be the following program on input (P,X):
For each string s in length-lexicographic order:
If V( "The program P halts on input X." , s ) then output "true".
If V( "The program P does not halt on input X." , s ) then output "false".
If T is complete and consistent and sound for program halting:
Given any program P and input X:
T proves exactly one of the following sentences:
A = "The program P halts on input X."
B = "The program P does not halt on input X."
Thus H halts on input (P,X) because s will eventually be a proof of A or of B.
If P halts on X:
T can prove A, since T can reason about programs, and hence H(P,X) = "true".
If P does not halt on X:
T does not prove A, by soundness for program halting.
Thus T proves B, and hence H(P,X) = "false".
Therefore H(P,X) is the correct answer to whether P halts on X.
Therefore H solves the halting problem.
Contradiction with unsolvability of the halting problem.
Therefore T is either incomplete or inconsistent or unsound for program halting.
Rosser's strengthening of Godel's incompleteness theorem
After Godel's theorem was published, Rosser came up with a trick to strengthen it, and I came across a blog post by Scott Aaronson that shows that if we use something called the zero-guessing problem, instead of the halting problem, we can get the same strengthening! Specifically, we can then drop the condition of soundness for program halting. I shall give a simplified self-contained version of the zero-guessing problem and the proof of Rosser's incompleteness theorem, exactly parallel to how the halting problem unsolvability implies Godel's incompleteness theorem. If you want a challenge, you can first read the definition of the zero-guessing problem and then try to find the proof on your own by following the ideas in the earlier proof.
Zero-guessing problem
Define the zero-guessing problem to be:
Given a program P and input X:
If P halts on X, then the answer is 0 if P(X) = 0 and is 1 otherwise.
(If P does not halt on X, then any answer is fine.)
For a program to solve the zero-guessing problem, it must halt on every input (P,X), and also must output the right answer as specified in the problem. This implies that a solver is allowed to output arbitrary nonsense if P does not halt on X. Like the halting problem, the zero-guessing problem cannot be solved by a program. Try to prove this before looking at the proof below!
Given any program G :
Let D be the program that does the following on input P:
If G(P,P) = 0 then output 1 otherwise output 0.
If G solves the zero-guessing problem:
G(D,D) halts, and hence D(D) ≠ 0 iff G(D,D) = 0, by definition of D.
Contradiction with definition of G.
Therefore G does not solve the zero-guessing problem.
Note that the choice of 0 and 1 is really arbitrary. You can assume 0 and 1 stand for any fixed distinct strings that you like.
Rosser's incompleteness theorem via the zero-guessing problem
Take any formal system T with proof verifier V that can reason about programs.
Let G be the following program on input (P,X):
For each string s in length-lexicographic order:
If V( "The program P halts on input X and outputs 0." , s ) then output 0.
If V( "It is not true that the program P halts on input X and outputs 0." , s ) then output 1.
If T is complete and consistent:
Given any program P and input X:
T proves exactly one of the following sentences:
A = "The program P halts on input X and outputs 0."
B = "It is not true that the program P halts on input X and output 0."
Thus G halts on input (P,X) because s will eventually be a proof of A or of B.
If P halts on X:
Recall that T can prove the correct output of P on X.
If P(X) = 0, then T can prove A and hence G(P,X) = 0.
If P(X) ≠ 0, then T can prove B and hence G(P,X) = 1.
Therefore G solves the zero-guessing problem.
Contradiction with unsolvability of the zero-guessing problem.
Therefore T is either not complete or not consistent.
Explicitly independent sentence
In the above proofs we used the unsolvability of some computability problem as a black-box to show that T is incomplete. In both cases, we can actually merge the unsolvability proof with the incompleteness proof to obtain explicit sentences that are independent over T, meaning that T can prove neither it nor its negation.
Let H be the program constructed in the proof of Godel's incompleteness theorem.
Let C be the program constructed in the proof that H does not solve the halting problem.
Let Q = "The program C halts on input C.".
Let Q' be the negation of Q.
If T is consistent and sound for program halting:
If C(C) halts:
T proves Q but not Q', since T can reason about programs.
Thus H(C,C) = "true", and hence C(C) does not halt.
Contradiction.
Therefore C(C) does not halt.
Thus T does not prove Q, since T is sound for program halting.
If T proves Q':
H(C,C) = "false", and hence C(C) halts.
Contradiction as above.
Therefore T does not prove Q'.
Thus T proves neither Q nor Q', but Q' is actually true.
Let G be the program constructed in the proof of Rosser's incompleteness theorem.
Let D be the program constructed in the proof that G does not solve the zero-guessing problem.
Let W = "The program D halts on input D and outputs 0.".
Let W' be the negation of W.
If T is consistent:
If D(D) halts:
Recall that T can reason about programs.
Thus T proves W if D(D) = 0 and T proves W' if D(D) = 1.
Thus G(D,D) = D(D), by definition of G.
But D(D) ≠ G(D,D), by definition of D.
Contradiction.
Therefore D(D) does not halt.
If T proves W or W':
G(D,D) halts, and hence D(D) halts.
Contradiction as above.
Therefore T proves neither W nor W', but W' is actually true.
From the computability perspective, although T is complete for program halting (since it can reason about programs), it is incomplete for program non-halting (some program on some input will not halt and T cannot prove that fact).
Zero-guessing versus halting
Note that we avoid the need for T to be sound for program halting in the above proofs based on the zero-guessing problem because it has a weaker requirement than the halting problem in the case when the given program P does not halt on the given input X.
Soundness versus consistency
It is customary to assume that T is classical (for programs), meaning that T can use the rules of classical logic in deducing sentences about programs. But the above proofs do not assume that T is classical. Note that if T is classical (or at least has the principle of explosion), then soundness of T for any sentence would imply consistency of T, because if T is inconsistent then T proves every sentence about programs. Also, if T is classical then its soundness for program halting is stronger than its consistency, because it is possible (which we shall prove in a later paragraph) that T proves a sentence of the form "The program P halts on input X." and yet P actually does not halt on input X in reality. Note also that for classical T, soundness of T for program non-halting is equivalent to just consistency of T, because if a program P actually halts on input X in reality, then T can prove that fact, and so if T proves "The program P does not halt on input X." then T is also inconsistent.
Another fact for classical T is that T is consistent iff Q' is true. We have already shown that if T is consistent then Q' is true. If T is inconsistent and classical then by the principle of explosion T proves both Q and Q', and so H(C,C) halts, and hence C(C) does not halt. Similarly if T is classical then T is consistent iff D(D) does not halt.
To show the claim in the first paragraph, we need to construct a formal system that is classical and consistent but yet unsound for program halting. To do so, let S be a formal system that can perform merely classical reasoning about finite binary strings, and hence can reason about programs because it can reason about the execution of any program for any finite number of steps. (We shall explain in a later section how S can do this. Of course we need to translate sentences about programs to sentences about finite binary strings, but there is a natural computable translation.) We believe that S is consistent and sound for finite binary strings, and hence is also sound for program halting. We have already seen that S proves neither Q nor Q', but Q is actually false. Now let S' be S+Q, namely that S' is the formal system that proves every theorem that can be deduced classically from the axioms of S plus Q. Then S' is classical and has a proof verifier (exercise for you) and is unsound for program halting. But S' is consistent, because otherwise there is a proof of contradiction over S+Q, which can be converted into a proof of Q' over S. This last claim is an instance of the deduction theorem, which is obvious for Fitch-style natural deduction.
Godel's original theorem required T to be ω-consistent, but his proof in fact only requires T to be Σ1-sound. By a trick of Godel's called the β-lemma, Σ1-soundness is essentially equivalent to soundness for program-halting. So in this precise sense one can say that the weaker theorem is essentially equivalent to the theorem shown by Godel's original proof. Actually soundness for program halting is always taken for granted for any formal system that we use in practice, since we would really want it not to prove false sentences about programs. But the stronger theorem is beautiful from the modern CS perspective, since it reveals severe fundamental limitations in any consistent formal system that can reason about finite program execution, which is a very concrete notion in the real world!
Encoding program execution in a string
In this section we explain how program execution can be expressed in a single finite binary string, so that we can use sentences over these strings to describe program behaviour, including halting. First notice that binary is not a severe restriction, and there are many ways to go around it. The simplest way is to use unary numbers (k is encoded as a string of k ones) separated by zeroes to represent finite strings of naturals! This representation gives a one-to-one correspondence between finite binary string to natural strings. Next observe that we can represent finite sequences of natural strings using a single natural string, by adding one to each item and using zeroes as separators. For example the sequence ((3,1,4),(1),(),(5,9)) would be represented by (4,2,5,0,2,0,0,6,10). Now every program can be represented easily as a natural string. Furthermore, the entire state of any given program executing on a given input can be captured, by a pair of natural strings representing the program and input with the 'current step' highlighted, plus a sequence of pairs of natural strings where each pair (x,v) denotes that variable x has value v. So the entire program execution state can be represented by a single natural string. If L is simple enough, you should be able to imagine how to express by a classical sentence about strings s,t that s→t is a valid program state transition, meaning that after one step from state s the program will be in state t. Since a finite sequence of program states can be represented as a single natural string, we can express by a classical sentence about strings p,x,y that the program p on input x will halt and output y, which is basically the sentence ( There is a finite sequence of program states, starting with program p having input x, in which every pair of adjacent states in that sequence is a valid state transition, and ending with output y. ). Finally, note that if a program p really halts on input x, then any formal system S that can reason about natural strings can prove each step of the execution of p on x, and then string all these proofs together to prove the fact of halting.
In the last part of the above paragraph, we appealed slightly to our intuition that we can do the appropriate translation of that sentence given any conceivable practical programming language. It is not really illuminating to completely formalize this, but it could be done in many ways. One way is to do it only for some specific universal flavour of Turing machines. Another is to do it only for some specific assembly-like language. A third is to do it for some variant of lambda calculus. Whatever it is, it has to be equivalent to Turing machines. This issue is not peculiar to this version of the incompleteness theorem, since the original theorem concerns systems that can reason about basic arithmetic, which turns out to be equivalent due to Godel's β-lemma. The reason Godel's theorem was about arithmetic seems to be because that was what mathematicians at that time thought was fundamental to mathematics. The main benefits of proving my version of the theorem are that it avoids the number theory in Godel's β-lemma and the concept of primitive recursion, and shows that just basic facts of string concatenation (not even any form of induction) are enough for the incompleteness phenomenon to arise, and does not require the formal system to be based on classical logic.
Popular misconceptions about incompleteness
When people first encounter the statement about incompleteness of Peano Arithmetic (PA), many wrongly suspect various aspects of PA as being the 'cause'.
It is not due to induction, nor even the infinite number of axioms of PA. The reason is that PA− is enough, and PA− has finitely many axioms. PA− plus induction gives PA. Similarly, the Theory of Concatenation (TC) described here, which is a simple candidate for the formal system S above that can merely reason about finite binary strings, has only finitely many axioms.
It is not due to any deep number-theoretic phenomenon. I myself wrongly thought that it was, because Presburger arithmetic is consistent and complete, until I saw the computability-based proof above, which applies to TC since TC can reason about programs. It is true that TC is in some sense equivalent to PA−, but TC has nothing but concatenation, and the axioms of TC are just a few 'obvious' facts about strings.
It is not due to classical logic. This a common 'criticism' of the incompleteness theorems, but is completely unfounded. As shown in the proof above, it applies to any formal system that has a proof verifier and can reason about programs, whether or not it is classical. Notice that nowhere did I say anything about syntactic or deductive rules, because there need not be any. The formal system T could even be so crazy that if we are given an arbitrary program that halts on some input the easiest way to find a proof of that fact φ over T would be to run V(φ,x) for every possible string x until you find one that V says is valid! For a trivial but relevant example, consider the formal system R (for "runner") whose proof verifier does the following given input (φ,k): It first checks if φ is of the form "[It is not true that] the program P halts on input X [and outputs Y]." (where the phrases in square brackets are optional), and then runs P on X for length(k) steps, and then answers that the proof is valid if P has halted [and its output is as claimed], and answers that the proof is invalid in all other cases. You can see that R indeed halts on every input (φ,k), and only affirms validity of the so-called proof when φ is actually true and k is long enough. You can also see that R does not affirm validity of any proof when φ is of some other form or P does not halt. Thus R satisfies the requirements of the above incompleteness theorem. If you wish, you can let R* be the closure of R under intuitionistic deduction, and then R* would be an intuitionistic non-classical example.
In my opinion the phenomenon actually 'responsible' for causing incompleteness is the ability to reason about programs. Someone once said that Godel's proof was essentially like constructing a compiler in arithmetic, just so that he could run primitive recursive programs (those that only use for-loops whose counter cannot be changed inside the loop).
Finally, the foundational system needed to prove the incompleteness theorems can be very weak. A key assumption is that program behaviour is well-defined, namely that given any program P and input X, either P halts or P does not halt, and the output if any is unique. This assumption is necessary otherwise even the concepts of consistency and completeness are not well-defined. In short, having classical logic for program behaviour suffices. Note that since program behaviour can be encoded as a sentence about strings (as in the previous section), this implies that in some sense we only need to assume classical logic for strings to be able to prove the incompleteness theorems in an encoded form. If you want to prove it in a more natural form, then you would need the foundational system to natively support reasoning about finite sequences.
Generalization
We can fully generalize the incompleteness theorems by relaxing the condition that the formal system T has a proof verifier V that always halts. It is sufficient to require that V(φ,x) outputs "yes" exactly when x is a proof of φ, and it does not matter if V does not halt when x is not a proof of φ! The proof is the same except that you simply need to construct the program to parallelize all the calls to V. In any reasonable programming language, this can be done as follows. Each call to V triggers a step-by-step simulation of the execution of V on the given inputs in parallel with the rest of the program, so at any time there may be multiple (but finitely many) ongoing simulations. If any simulation reaches the end, the entire program is terminated and the output in that simulation is returned as the output of the entire program. As before, the proof will show that if T is consistent then exactly one of A,B will be proven and so the output will correspond to which one is proven, and the rest of the proof is unchanged.
Note also that this full generalization is equivalent to replacing the criterion of T having a proof verifier with the criterion of T having a theorem generator M that is a program that runs forever and eventually outputs every theorem of T and does not output any string that is not a theorem of T. Then the program in the proof merely needs to simulate M and wait until M generates A or B and then terminating the entire program and outputting accordingly. In the above proofs I did not use the full generalization because it is not obvious what kind of programming languages are strong enough that programs in them can simulate other programs, and all practical formal systems do have a proof verifier anyway.
Generalization to uncomputable formal systems
One nice aspect of this computability-based viewpoint is that it automatically relativizes to any kind of class Ω of oracle programs. In particular, the same proof yields the incompleteness theorems for formal systems whose proof verifier is a program in Ω and that can reason about programs in Ω. This result can be used to prove that the arithmetical hierarchy does not collapse to any level, like shown in this post.
Further reading
The blog post by Scott Aaronson that inspired some of this, citing Kleene's 1967 Mathematical Logic text for a similar proof of Rosser's theorem (Theorem VIII and Corollary I on pages 286−288).
A 1944 paper by Emil Post sketching a proof corresponding loosely to the above proof via the halting problem for formal systems that are sound for program halting. (Thanks Philip White!)
A formal version of the above proof of Rosser's theorem for formal systems that interpret PA−.
A discussion of foundational issues regarding the halting problem and the incompleteness theorem.
An explanation of the fixed-point combinator in λ-calculus mentioned in the opening paragraph.