Why do we use von Neumann ordinals and not Zermelo ordinals?

Solution 1:

There is no real, deep, fundamental reason. There is a bijection between the set of von Neumann naturals and the set of Zermelo naturals that sends $0$ to $0$ and respects successors, so anything you can do with one set you can do with the other.

However, von Neumann naturals are more convenient in practice for several reasons. For one, the set we call $n$ has exactly $n$ elements. That means we can use the set $n$ itself as a cardinality, defining "the set $A$ has $n$ elements" to mean that there is a bijection between $A$ and $n$. For another, a convenient way to define the ordinals is to say that they are the transitive sets which are linearly ordered by $\in$ (assuming the axiom of foundation, as in ZFC, which makes this ordering a well-order). The von Neumann naturals are then precisely the finite ordinals, which is a natural and important way to think about the natural numbers.
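The two properties mentioned above can be checked on small cases. The following is a minimal sketch that models hereditarily finite sets as Python `frozenset`s; the helper names `von_neumann`, `zermelo`, and `is_transitive` are made up for illustration, and the Zermelo successor is assumed to be $n+1 := \{n\}$.

```python
def von_neumann(n):
    """Build the von Neumann natural n, using the successor n + 1 = n ∪ {n}."""
    current = frozenset()
    for _ in range(n):
        current = current | {current}
    return current


def zermelo(n):
    """Build the Zermelo natural n, using the successor n + 1 = {n}."""
    current = frozenset()
    for _ in range(n):
        current = frozenset({current})
    return current


def is_transitive(s):
    """A set is transitive when every element of it is also a subset of it."""
    return all(x <= s for x in s)


for n in range(5):
    # Columns: n, size of von Neumann n, size of Zermelo n,
    # transitivity of von Neumann n, transitivity of Zermelo n.
    print(n, len(von_neumann(n)), len(zermelo(n)),
          is_transitive(von_neumann(n)), is_transitive(zermelo(n)))
```

In this model the von Neumann $n$ always has $n$ elements and is transitive, whereas the Zermelo $n$ has a single element for every $n \ge 1$ and stops being transitive from $2$ onward.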

Solution 2:

A simple motivation for the von Neumann style is to write it this way: $$ n := \{m \mid m<n\}. $$ That is, the ordinal $n$ is the set of all ordinals less than $n$. This is, I'd say, the idea behind the von Neumann ordinals, though obviously it isn't quite a proper definition ($m$ from what base set? What is $<$?).

An alternative way to express it: $$\begin{align} 0 &:= \{\} \\ n+1 &:= \{0, \ldots, n\} \end{align}$$
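To make the recursion concrete, here are the first few naturals unfolded by hand. For contrast, the last line shows the Zermelo $3$, assuming the Zermelo successor $n+1 := \{n\}$ (that convention is not spelled out above).

$$\begin{align} 0 &= \{\} \\ 1 &= \{0\} = \{\{\}\} \\ 2 &= \{0, 1\} = \{\{\}, \{\{\}\}\} \\ 3 &= \{0, 1, 2\} = \{\{\}, \{\{\}\}, \{\{\}, \{\{\}\}\}\} \\ 3_{\text{Zermelo}} &= \{\{\{\{\}\}\}\} \end{align}$$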

Solution 3:

Von Neumann ordinals arise naturally as an answer to the problem of classifying well-orders on sets. In "naive set theory" we might typically say something along the lines of: given two well-ordered sets $(A, <_A)$ and $(B, <_B)$, we will say they are equivalent if there is an order isomorphism between $A$ and $B$. Then an "ordinal" will be defined as an "equivalence class" of this equivalence relation.

In axiomatic set theory, however, we run into the problem that the relation described above is "too big" to be an actual object of ZFC, and so is the "equivalence class" of any nonempty well-order. What we can find instead is that each "equivalence class" has a canonical representative, given by the Mostowski Collapse Lemma:

Suppose we have a set $A$ and a relation $R$ on $A$ which is well-founded and extensional. Here "extensional" means: for all $x,y\in A$, if $\{ z\in A \mid z \mathrel{R} x \} = \{ z\in A \mid z \mathrel{R} y \}$, then $x = y$. A strict well-order is automatically extensional. Then there exists a unique transitive set $B$ such that $(A, R) \simeq (B, {\in}|_{B\times B})$, and the isomorphism is unique as well.

By restricting to well-orders, we see that each "equivalence class" of well-orders has a unique representative of this type, which is a transitive set $X$ such that ${\in}|_{X \times X}$ is a well-order on $X$. This is precisely the definition of von Neumann ordinals.
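For a finite relation, the collapsing isomorphism can be computed directly from the standard recursion $\pi(x) = \{\pi(y) \mid y \mathrel{R} x\}$. The following is a rough sketch under that assumption, again modelling sets as `frozenset`s; the function name `mostowski_collapse` and its parameters are hypothetical, chosen only for this illustration.

```python
def mostowski_collapse(elements, related):
    """Collapse a finite well-founded relation via pi(x) = {pi(y) : y R x}.

    `elements` is an iterable of hashable items and `related(y, x)` returns
    True exactly when y R x. Returns a dict mapping each x to pi(x).
    """
    elements = list(elements)
    image = {}

    def collapse(x, active=()):
        # Meeting x again while it is still being computed means R has a
        # cycle, so R is not well-founded and the recursion is undefined.
        if x in active:
            raise ValueError("relation is not well-founded")
        if x not in image:
            image[x] = frozenset(
                collapse(y, active + (x,)) for y in elements if related(y, x)
            )
        return image[x]

    for x in elements:
        collapse(x)
    return image


# A three-element strict well-order: "a" < "b" < "c".
ranks = mostowski_collapse(["a", "b", "c"], lambda y, x: y < x)

zero = frozenset()
one = frozenset({zero})
two = frozenset({zero, one})
print(ranks == {"a": zero, "b": one, "c": two})
```

On this three-element well-order the images are the von Neumann naturals $0$, $1$, $2$, so the transitive set $B$ is the ordinal $3$. Extensionality is what makes $\pi$ injective; without it, distinct elements with the same predecessors would collapse to the same set.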

So the von Neumann ordinals sidestep the "too big" collections that cannot be formed in ZFC. That restriction is a feature of ZFC, not a bug: it is what blocks contradictions such as Russell's paradox and, specifically for the class of all ordinals, the Burali-Forti paradox (naive set theory glosses over these issues). In addition, having a representative of each well-order whose relation is just $\in$, a primitive symbol of the language of ZFC, has numerous technical advantages.