Axiom of Choice: What exactly is a choice, and when and why is it needed?

Solution 1:

The ability to choose single elements during a proof is built into the rules of first-order logic; we don't need a set-theoretic axiom for doing so. More precisely, if we know $\exists x.\varphi(x)$, then it is admissible to pick an $x$ that $\varphi$ holds for. Either the $\exists x$ was proved earlier in the proof -- in which case we actually know a specific thing with this property, because that is how you prove an $\exists$ -- or the $\exists x$ comes from an axiom, in which case the ability to pick such an $x$ is "morally" part of what the axiom promises us.

So when the Axiom of Choice promises us that "$\exists$" a choice function, it doesn't only say that there are choice functions out there, but also that we're allowed to pick one.


Now, why would we want to have a choice function exist inside set theory? The main reason is such that we can wrap up the many choices it represents and handle them as a single object -- particularly in the axiom schemas that allow us to supply a formula to specify what we're doing. The formula may contain set parameters, but syntactically it can only contain a finite number of parameters. The Axiom of Choice allows us to parameterize an instance of the axiom schema with an infinity of choices when we need it.

The two axiom schemas this is relevant for is the Axiom of Replacement and the Axiom of Selection.

The Axiom of Replacement allows us to form $\{F(x)\mid x\in A\}$ for any set $A$. The function $F$ does not a priori have to exist as a set within the set-theoretic universe; it can be given by a logical formula. But we need to be able to prove that there is only (or at most) one $F(x)$ for each $x$ -- otherwise the axiom might produce a set so large that it needs to be a proper class instead. However, often we actually want to use an underspecified $F$ that just says "choose something with such-and-such property", where we can prove that at least one something always exists. For this we need to wrap up all of the choices as a set parameter to the formula that represents $F$; the Axiom of Choice is what we need to do that. First-order logic allows us to make single choices one by one during a proof, but not to write down choose something as part of a formula.

The Axiom of Selection allows us to form $\{x\in A\mid \varphi(x)\}$. Here, $\varphi$ can be an arbitrary formula (with parameters), but sometimes we want $\varphi$ to depend on some arbitrary choices that need to be the same for all the $x$s in $A$ we consider. Again the problem -- of part of it -- is that the formula $\varphi$ cannot itself speak about making choices in a controlled way, but we can parameterize $\varphi$ by a choice function that we select once and for all outside the application of the Selection axiom.


All of the above is for standard first-order logic. There is an alternative -- but nowadays mostly forgotten, for unrelated reasons -- logical formalism, Hilbert's $\varepsilon$-calculus where instead of quantifiers at the formula level one has a term $\varepsilon x.\varphi(x)$ which intuitively stands for "some $x$ that makes $\varphi$ true, if such a thing exists, otherwise an arbitrary object". There's a standard translation from standard first-order formulas to formulas in the $\varepsilon$-calculus, and it is an conservative extension of FOL with respect to these formulas.

In the $\varepsilon$-calculus we don't need an explicit choice axiom, because now formulas can speak about making arbitrary-but-consistent choices; that's exactly what the $\varepsilon$ operator does.

On the other hand, if we want to formulate set theory without AC in the $\varepsilon$-calculus, we need to be careful to restrict the axioms of Replacement and Selection such that the parameter formulas must be of the restricted shape that arises as translation from standard FOL formulas. If this restriction is not done, the Axiom of Choice can be proved: if $X$ is a set of nonempty sets, then $$ \{ (x,\varepsilon y.y\in x)\mid x\in X\}$$ is a choice function for $X$.

Solution 2:

Edward Nelson explained this very well in Chapter 2 of a paper that used to be on his homepage at Princeton ("removed for revision" several years ago). I'll try to do same using my understanding gained from it since I can't point you to the paper anymore. In logic it is shown (extension by adjunction of a function symbol) that if $\mathbf{\varphi}$ is a formula whose free variables are $\mathbf{x, x_1, \dots, x_n}$, and if $\mathbf{\forall x_1 \dots \forall x_n \exists x \varphi}$ is a theorem, then we can adjoin a new $n$-ary function symbol $\mathbf{f}$ and the axiom $\mathbf{\forall x_1 \dots \forall x_n [fx_1 \dots x_n/x] \varphi}$ to get a conservative extension. But such a conservative extension if applied to set theory does not contain new set theory axioms allowing the new function symbol to appear in the axioms obtained from the separation (subset) and replacement (or collection) schema. In practice, this makes no difference if the extension is what is called an extension by definition, which happens when the stronger statement $\mathbf{\forall x_1 \dots \forall x_n \exists ! x \varphi}$ holds (where $\mathbf{\exists ! x}$ means "there exists a unique $\mathbf{x}$ such that"). The reason is that in this case any formula containing $\mathbf{f}$ can be translated into an equivalent (in the extension) formula not containing $\mathbf{f}$, which formula is permissible in separation and replacement axioms. Also, when $\mathbf{\varphi}$ contains no free variables other than $\mathbf{x}$ there is no problem with allowing the (special constant, $0$-ary) $\mathbf{f}$ to appear in the replacement and separation axioms, since such would-be axioms would hold when $\mathbf{f}$ is replaced by a variable not appearing in the would-be axioms (and substitution holds).

Similarly, if $\mathbf{\psi}$ is a formula with free variables $\mathbf{x_1, \dots, x_n}$, we can adjoin a new $n$-ary predicate symbol $\mathbf{p}$, along with the defining axiom $\mathbf{\forall x_1 \dots \forall x_n (p x_1 \dots x_n \leftrightarrow \psi)}$. These also are called extensions by definition, presumably because statements involving $\mathbf{p}$ are translatable to equivalent statements not involving $\mathbf{p}$. Being translatable, there is again no problem allowing new predicate symbols arising from extensions by definition of a predicate symbol to appear in separation and replacement axioms.

In practice, in doing set theory or the higher math based on set theory, one may use extensions by definition and extensions by special constants with abandon, even in separation and replacement axioms. Indeed, math would be impractically tedious unless this or something similar were done. But whenever one uses extensions by adjunction of a function symbol which are not extensions by definition of a function symbol, as happens when the uniqueness condition fails, one needs something like the axiom of choice to reformulate those function symbols so they may be used in separation and replacement axioms. In practice, what this means is that any such function symbols should correspond to functions obtainable by the axiom of choice. Often (e.g., when variables range over sets like the set of real numbers) the axiom of choice can be used to replace something like $\mathbf{\forall x_1 \dots \forall x_n \exists x \varphi}$ with something like $\mathbf{\exists f \forall x_1 \dots \forall x_n [f(x_1, \dots, x_n)/x] \varphi}$, where here $\mathbf{f}$ is a variable representing a set-theoretic function and the parentheses denote a function symbol defining evaluation (to every function and argument there corresponds a unique evaluation of the function at that argument, and so evaluation may be defined problems-free by extension by definition of function symbol). The statement $\mathbf{\exists f \forall x_1 \dots \forall x_n [f(x_1, \dots, x_n)/x] \varphi}$ can be used to adjoin a function symbol $\mathbf{f_c}$ corresponding to $\mathbf{f}$ that would cause no problems appearing in separation or replacement axioms whenever $\mathbf{\varphi}$ involves only $\mathbf{x, x_1, \dots, x_n}$ as free variables, because then $\mathbf{f_c}$ would be defined as a special constant (confusingly, $\mathbf{f_c}$ needs to be $0$-ary).

Bourbaki, by using Hilbert's tau operator (which I guess is similar or the same as Hilbert's epsilon operator) does something similar to allowing formulas obtained by mere adjunction of function symbols to appear in separation and replacement axioms (both approaches would cause the axiom of choice to be a theorem), but as much as I like Bourbaki, I have come to prefer here the standard way logicians do things, which uses an axiom of choice. Assuming too much is not aesthetically pleasant.

Solution 3:

As ad addition to Henning Makholm's explanation.

Example.
Consider finite collection of sets $A_k, \; k\in S=\{1, \ldots , K\}$, where every $A_k$ is countable. What about the union $\bigcup\limits_{k=1}^{K}A_k$?
Well, we can tell that as $A_1$ is countable there exists an enumeration $f_1$, $A_2$ is countable — there exists $f_2$. Then we can repeat this phrase $K$ times and after we can work with these functions $f_1 \ldots f_K$. Then we can do something simple: take a bijection $g: \mathbb{N}\times S \to \mathbb{N}$ and this is it: finite union of countable sets is countable.

OK, but consider countable collection of countable sets $A_k, \; k\in\mathbb{N}$ — what about the union $\bigcup\limits_{k=1}^{\infty}A_k$? We can't repeat phrase about enumeration infinite times because every proof in mathematics must be finite. So, we have a problem here — for every $k$ there exists a function $f_k: \mathbb{N} \to A_k$, but does the set $\{ f_k : k \in \mathbb{N} \}$ exists? You can't tell without axiom of choice (actually, it will be enough to have weaker version of it, but still).