Explanation of Proof of Zorn's lemma in Halmos's Book

I have been reading Halmos's book on naive set theory on my own and have got stuck in the Chapter on Zorn's lemma. The 2nd and 3rd paragraphs are not very clear to me. Here is the text:

Zorn's lemma. If $X$ is a partially ordered set such that every chain in $X$ has an upper bound, then $X$ contains a maximal element.

Proof. The first step is to replace the abstract partial ordering in $X$ by the inclusion order in a suitable collection of sets. More precisely, we consider, for each element $x \in X$, the weak initial segment $\bar{s}(x)$ consisting of $x$ and all its predecessors. The range $\mathscr{S}$ of the function $\bar{s}$ (from $X$ to$\wp(X)$) is a certain collection of subsets of $X$, which we may, of course, regard as (partially) ordered by inclusion. The function $\bar{s}$ is one-to-one, and a necessary and sufficient condition that $\bar{s}(x)\subseteq \bar{s}(y)$ is that $x\leq y$. In view of this, the task of finding a maximal element in $X$ is the same as the task of finding a maximal set in $\mathscr{S}$. The hypothesis about chains in $X$ implies (and is, in fact, equivalent to) the corresponding statement about chains in $\mathscr{S}$.

Let $\mathscr{X}$ be the set of all chains in $X$; every member of $\mathscr{X}$ is included in $\bar{s}(x)$ for some $x \in X$. The collection $\mathscr{X}$ is a non-empty collection of sets, partially ordered by inclusion, and such that if $\mathscr{C}$ is a chain in $\mathscr{X}$, then the union of the sets in $\mathscr{C}$ (i.e., $\bigcup_{A \in \mathscr{C}}A$) belongs to $\mathscr{X}$. Since each set in $\mathscr{X}$ is dominated by some set in $\mathscr{S}$, the passage from $\mathscr{S}$ to $\mathscr{X}$ cannot introduce any new maximal elements. One advantage of the collection $\mathscr{X}$ is the slightly more specific form that the chain hypothesis assumes; instead of saying that each chain $\mathscr{C}$ has some upper bound in $\mathscr{S}$, we can say explicitly that the union of the sets of $\mathscr{C}$, which is clearly an upper bound of $\mathscr{C}$, is an element of the collection $\mathscr{X}$. Another technical advantage of $\mathscr{X}$ is that it contains all the subsets of each of its sets; this makes it possible to enlarge non-maximal sets in $\mathscr{X}$ slowly, one element at a time.

Now we can forget about the given partial order in $X$. In what follows we consider a non-empty collection $\mathscr{X}$ of subsets of a non-empty set $X$, subject to two conditions: every subset of each set in $\mathscr{X}$ is in $\mathscr{X}$, and the union of each chain of sets in $\mathscr{X}$ is in $\mathscr{X}$. Note that the first condition implies that $\varnothing\in\mathscr{X}$. Our task is to prove that there exists in $\mathscr{X}$ a maximal set.

and the proof continues...

In the 2nd paragraph:

'Since each set in $\mathscr{X}$ is dominated by some set in $\mathscr{S}$, the passage from $\mathscr{S}$ to $\mathscr{X}$ cannot introduce any new maximal elements.'

Here I am able to prove that every maximal element of $\mathscr{S}$ has to be a maximal element of $\mathscr{X}$, but surely there can be maximal elements of $\mathscr{X}$ which are not maximal elements of $\mathscr{S}$ and hence 'extra'- please explain..

In the 3rd para- The author considers a set $\mathscr{X}$ with the given properties, and states that the problem of finding a maximal element in $X$ is equivalent to finding a maximal set in $\mathscr{X}$. How come?

Detailed but simple answers would be much appreciated.


Solution 1:

(This is a fairly major re-write of the original answer.)

For your first question, the proper answer is that the statement doesn't matter too much. But to go through it, we begin with the following two facts:

Fact 1: If $\bar{s} (x)$ is a maximal set in $\mathscr{S}$, then $x$ is a maximal element of $X$.

Proof. If $z \geq x$ holds for some $z \in X$, then $\bar{s}(x) \subseteq \bar{s}(z)$. By maximality of $\bar{s}(x)$ it follows that $\bar{s}(z) = \bar{s}(x)$, and therefore $z \in \bar{s}(x)$ meaning that $z \leq x$. Thus $z = x$. $\dashv$

Fact 2: If $Y$ is a maximal set in $\mathscr{X}$, then $Y$ contains a maximum element $x$ (i.e. $y \leq x$ holds for all $y \in Y$), and $x$ is a maximal element of $X$.

Proof. As $Y$ is a chain in $X$, it has an upper bound $x$. We show that $x \in Y$ (and as it is an upper bound of $Y$ it follows that $x$ is the maximum element of $Y$) and that $x$ is a maximal element of $X$.

Suppose that $z \geq x$ holds for some $z \in X$. Note that $Y \cup \{ z \}$ is also a chain in $X$, and so by maximality of $Y$ it must be that $Y \cup \{ z \} = Y$, and therefore $z \in Y$. (In particular, as $x \geq x$ we have $x \in Y$.) As $x$ is an upper bound of $Y$ we have $z \leq x$, and so $z = x$. Thus $x$ is a maximal element of $X$. $\dashv$

Thus starting with maximal sets in either $\mathscr{S}$ or $\mathscr{X}$ will lead you to maximal elements of the partially ordered set $X$. It might appear, at first glance, that as chains are "more general" than weak initial segments , you get more maximal elements of $X$ by starting with maximal sets in $\mathscr{X}$ than you do by starting with maximal sets in $\mathscr{S}$.

Halmos's statement is that this is not the case. More precisely, one can show that if $Y$ is a maximal set in $\mathscr{X}$ with maximum element $x$, then $\bar{s} (x)$ is a maximal set in $\mathscr{S}$. Therefore, the maximal element $x$ of $X$ obtained by begining with a maximal set in $\mathscr{X}$ could also have been obtained by begining with a maximal set in $\mathscr{S}$.

I honestly wouldn't worry about this too much. The real crux of the proof lies in Fact 2.

Your second question should (now) find its answer in the statement and proof of Fact 2, above.

Additional notes:

Note that in the 2nd paragraph Halmos mentions that the family $\mathscr{X}$ of all chains in the partially ordered set $X$ satisfies the following conditions:

  1. $\mathscr{X}$ is non-empty;
  2. every subset of an element of $\mathscr{X}$ is also an element of $\mathscr{X}$; and
  3. the union of every chain of elements of $\mathscr{X}$ is an element of $\mathscr{X}$.

He then makes the following claims:

Claim 1: If there is a maximal set in $\mathscr{X}$, then there is a maximal element of $X$.

(This was established in Fact 2, above.)

Claim 2: Suppose $\mathscr{Z}$ is any family of subsets of some non-empty set $Z$ satisfying the following conditions:

  1. $\mathscr{Z}$ is non-empty;
  2. every subset of an element of $\mathscr{Z}$ is also an element of $\mathscr{Z}$; and
  3. the union of every chain of elements of $\mathscr{Z}$ is an element of $\mathscr{Z}$.

Then there is a maximal set in $\mathscr{Z}$.

(This is what the remainder of Halmos's proof establishes, except that -- in possibly bad form -- he uses $X$ and $\mathscr{X}$ (which symbols are already used in the proof) in place of my $Z$ and $\mathscr{Z}$, respectively.)

Once these two claims are established, the proof of Zorn's Lemma is complete: Since the family $\mathscr{X}$ of all chains in $X$ has the desired propeties mentioned in the premise of Claim 2, by this it follows that $\mathscr{X}$ contains a maximal set. By Claim 1 it then follows that the partially ordered set $X$ contains a maximal element.

Solution 2:

We move from the ordered set $X$ to the ordered set of chains $\mathscr X$. The assumption on $X$ was that every chain has an upper bound, therefore if $Y$ was a maximal element in $\mathscr X$ it is a chain in $X$ and thus has an upper bound.

From this we have that an upper bound of a maximal chain is a maximal element, do you see why? (Hover to see the answer)

Otherwise we could have added it to the chain and it would contradict its maximality. On the other hand, if there was someone bigger than it in the chain, it would not be an upper bound!