Why does Cantor's diagonal argument yield uncomputable numbers?
As everyone knows, the set of real numbers is uncountable. The most ubiquitous proof of this fact uses Cantor's diagonal argument. However, I was surprised to learn about a gap in my perception of the real numbers:
A computable number is a real number that can be computed to within any desired precision by a finite, terminating algorithm.
Turns out that the set of computable numbers is countable. My mind is effectively blown at this point.
So I'm trying to reconcile this with Cantor's diagonal argument. Wikipedia has this to say: "...Cantor's diagonal argument cannot be used to produce uncountably many computable reals; at best, the reals formed from this method will be uncomputable."
So much for background information.
Say I have a list of real numbers $.a_{1n}a_{2n}a_{3n}\ldots$ for $n\geq 1$. Why do we get more than just computable numbers if we select digits different from the diagonal digits $a_{ii}$? I.e. if I make a number $.b_1b_2b_3\ldots$ where $b_i\neq a_{ii}$, why is this number not always computable?
The main issue I'm having is that it seems like I'm "computing" the digits $b_i$ in some sense. Is the problem that I have a choice for each digit? Or is there some other subtlety that I'm missing?
You are only computing the digits $b_i$ relative to the given enumeration $a_{i,n}$. So if the function $f(i,n) = a_{i,n}$ is computable then the $b$ you construct is also computable, and not equal any of the reals $a_i$. However, if you have no way to compute the digits $a_{i,n}$ then you can't compute $b$ either. It only seems like you are computing $b_i$ because you are assuming that you already have a way to compute the digits $a_{i,n}$.
However, there is no computable function $f(i,n)$ that satisfies these three requirements:
- for any $i$ and $n$, $f(i,n)$ eventually returns some output. In other words, $f(i,n)$ is a total function
- for every $i$, the function $n \mapsto f(i,n)$ gives a decimal expansion of a real number
- for every computable real $a$ there is some $i$ with $a_n = f_{i,n}$ for all $n$
The diagonalization argument is actually the "standard" way to prove there is no $f(i,n)$ with those properties. If there was, then the $b$ you get from it would be another computable real, which is impossible.
Working in set theory, you can do the construction with a noncomputable function $f(i,n)$ which lists every computable real. Since this $f$ is not computable, $b$ need not be computable, and in fact it won't be if $f$ lists all the computable reals.
The reason that the computable numbers are countable is that a computable number must be given by an algorithm of finite length, and any such algorithm must be written in a language with finitely (say $n$) many symbols. The number of algorithms of length $m$ is thus at most $m^n$, so we can label each algorithm as (length $n$, natural number $x \le m^n$) and thus count them using an argument similar to Cantor's.
I'd like to take a closer look at the apparent contradiction you get when trying to apply Cantor's diagonal slash to the computable numbers, so let me repeat the argument in somewhat more detail: A real number $r$ is computable if and only if there exists an algorithm which, given $n\in \mathbb{N}$, computes the n-th digit of the decimal expansion of $r$ (care has to taken because the decimal expansion is not unique in general; in this case the algorithm has to output digits of one expansion consistently). So you pick an explicit machine model (for example, Turing machines; the word "algorithm" means "Turing machine") and an enumeration of all algorithms $A_1,A_2,...$, the outputs of which are called $a_{i,n}$ You construct a new algorithm which, given $n$, computes $a_{n,n}$ (for Turing machines this is essentially a Universal Turing machine) and outputs $b_n$ which is different from $a_{n,n}$. You can make an explicit choice here, for example, $b_n:=2$ if $a_{n,n}=1$, and $b_n:=1$ otherwise (thus avoiding potential problems with non-unique decimal expansions, which end in a sequence of zeroes or nines). This seems to define a computable number which at the same time is uncomputable because it is different from any number that $A_i$ defines.
The trouble with this "proof" is that it ignores the termination issues, and thus $a_{i,n}$ isn't even well-defined. You could make several attempts to repair this argument, but you will fail one way or the other. For example, if you say that all $A_i$s that do not terminate on some input $n$ are to be skipped in the enumeration, $i\mapsto a_{i,i}$ is no longer computable because you would have to filter the faulty algorithms, which you can't by the undecidability of the halting problem. If you just define $a_{i,n}$ as zero if $A_i$ does not terminate for the input $n$, then again $(i,n)\mapsto a_{i,n}$ is not computable.
So the undecidability of the halting problem is the real issue here, and it implies other notable properties of computable numbers, for example, you can't computably decide if two computable numbers are equal.
(The original last paragraph before the edit, which I will keep here to understand the comments below, was: So the undecidability of the halting problem is the real issue here, and it makes the very notion of computable numbers rather worthless in my opinion (you can't even computably decide if two computable numbers are equal, again because of the halting problem).)
Since there are countably many computable real numbers (see Alex's answer), our listing of "all the real numbers" may in fact include each of these without any problem. However, when you apply Cantor's diagonalisation argument to this list, you get a real number that is not on the list, and must therefore be uncomputable. (Similarly, this new real number may fail to be rational, since our listing of "all the real numbers" may include all rational numbers.)