What is it about modern set theory that prevents us from defining the set of all sets which are not members of themselves?

We can clearly define a set of sets. I feel intuitively like we ought to define sets which do contain themselves; the set of all sets which contain sets as elements, for instance. Does that set produce a contradiction?

I do not have a very firm grasp on what constitutes a set versus what constitutes a class. I understand that all sets are classes, but that there exist classes which are not sets, and this apparently resolves Russell's paradox, but I don't think I see exactly how it does so. Can classes not contain classes? Can a class contain itself? Can a set?


Solution 1:

Russell's paradox

In Zermelo set theory, the proof of the titular question is straightforward:

  • Assume there is such a set. Call it $R$.
    • Fact: $x \notin x$ if and only if $x \in R$. This is the defining property of $R$.
    • Assume $R \in R$.
      • By the fact, this means $R \notin R$.
      • Contradiction!
    • Therefore $R \notin R$.
    • By the fact, this means $R \in R$.
    • Contradiction!
  • Therefore no such set exists.

There is an immediate corollary: there is no set of all sets.

  • Assume there is a set of all sets. Call it S.
    • There is a subset $R \subseteq S$ containing exactly those sets $x$ for which that $x \notin x$
    • Contradiction!
  • Therefore, there is no set of all sets.

Rationale for Zermelo set theory

One of the most important features of a set theory is having tools to actually construct sets. Cantor's 'naive' set theory had the most powerful rule of all: if you could name any property $P$, then there was a set of all sets that have property $P$. This let you construct any set you could image! Unfortunately, it lets you construct the set of Russell's paradox, and thus Cantor's set theory is self contradictory.

Zermelo took a more modest approach*: he looked for a more conservative collection of constructions that sufficed for mathematics, but isn't so strong as to create any of the known paradoxical sets. Fraenkel added another useful construction, and gave us the axiom of foundation which simplifies technical arguments.

Among the constructions of Zermelo set theory is the restricted form of Cantor's "comprehension principle": if we have any property $P$ and a set $S$, then we can form the subset of $S$ of things satisfying property $P$.

The axiom of restricted comprehension exactly the property of a universe of sets that is needed to make the argument in the opening section.

*: I do not know if this is historically accurate. Really, I'm espousing an a postiori observation about it.

Classes

Set-builder notation is very useful notation to denote sets. Recall that each of the following notations define sets in ZFC:

$$ \{ x \in S \mid P(S) \} \qquad \qquad \{ f(x) \mid x \in S \} \qquad \qquad \{ a, b \} $$

where $a,b,S$ are all sets, $P$ is a unary predicate whose domain includes $S$, and $f$ is a function whose domain includes $S$.

The same notation turns out to be quite useful to define predicates. For example, predicate

P(x) = "x contains the empty set"

is easily notated as

$$ P = \{ x \mid \emptyset \in x \} $$

and the assertion that $x$ satisfies the predicate $P$ can be written as

$$ x \in P. $$

This notation, formally, has nothing to do with sets: it is alternative notation for logic. When we do this, we call a predicate a "class".

The way you manipulate logic in the form of classes is so strikingly similar to the way you manipulate sets that this unified notation is extremely useful.

To answer a question you had, the only objects are still sets. The only thing that can be a member of a set is a set. The only thing that can be a member of a class is a set. Classes can't be members of anything, because they aren't objects: they're logic. (at least, if we stick to first-order logic....)

It can be technically awkward when you hav0e to pay attention to what is a set and what is a class, especially if you want to reason in a 'stripped down' version of formal logic.

So, Von Neumann, Bernays, and Gödel invented (NBG) set theory*. The objects of NBG set theory are classes. It might be a little confusing to use the same word as we did for the alternative view of logic above; however in practice it's not a problem.

NBG set theory includes a class called $\mathbf{Set}$. $V$ is another commonly used name for this class. There is a theorem/axiom that says if $x \in y$, then $x \in \mathbf{Set}$.

NBG can also be presented (and usually is, I think) as a theory with two sorts: a sort of sets and a sort of classes. Only sets may be elements of things. But for any set there is a class that has the same elements, and it is reasonable to conflate the two.

*: Again, this is not meant to be a historically accurate presentation.

Universes

Another approach to dealing with classes is a Grothendieck universe. However, using them requires assuming a large cardinal axiom.

A Grothendieck universe is, briefly, a set $U$ with the property that the elements of $U$ collectively have good enough properties to be justifiably called a 'universe of sets'. We call the elements of $U$ "small sets". The things we would normally call classes are all subsets of $U$.

In this way (other than having had to assume a large cardinal axiom) we don't have to do much that is special -- everything we are talking about is a set. We just occasionally have to take note of which sets are "small" and which are not.

Solution 2:

Since the other users have answered the question in the context of well-founded set theory, let me say a few words about other set theories.

Before we can really answer this question, we must first think about what a ‘set’ is in the first place. Intuitively, a set is something that has members and which is wholly determined by what its members are. This is codified in the axiom of extensionality:

Extensionality. If $X$ and $Y$ are sets, and for all $z$, $z \in X \iff z \in Y$, then $X = Y$.

Notice, however, that the quantifier “for all $z$” is unbounded – that is, there is no restriction on the type of $z$. Let us non-commitally fix a universe of discourse $\mathbf{U}$ and say that $z$ is required to be in $\mathbf{U}$. So, can a set be a member of another set? Well, that depends – are there any sets in $\mathbf{U}$? If not, then obviously a set cannot be a member of any set. This is rather unacceptable for doing modern mathematics, so we must rectify this somehow.

Russell's own solution to his paradox was to introduce the notion of a type. (What I describe here is the unramified type theory TST, not the type theory of Principia Mathematica.) We start with some basic type $\mathbf{U}_0$ – say, the natural numbers. We define sets whose members are of type $\mathbf{U}_0$ – and this is a new type $\mathbf{U}_1$. We repeat this procedure infinitely, forming at each stage the type $\mathbf{U}_{n+1}$ corresponding to sets of things of type $\mathbf{U}_n$. Thus, we get sets whose members are other sets; on the other hand, it is clear that the universal ‘set’ does not exist in this ontology: if $X$ is a set, then it is of type $\mathbf{U}_{n+1}$ for some natural number $n$, and its members must be of type $\mathbf{U}_n$ – so in particular, $X \notin X$. We could even entirely banish the formula “$X \in X$” because there is no possible assignment of types that makes it a well-formed formula!

Unfortunately, we have had to introduce infinitely many types of sets, and it seems rather complicated to keep track of all these types in practice. Modern set theory resolves this by taking $\mathbf{U}_0$ to be the empty type and collapsing all the higher types into a single type $\mathbf{U}$. Thus, everything in the universe of discourse is a set (but that does not mean all sets are in $\mathbf{U}$!), and it makes sense to ask whether $x \in y$ for any $x$ and $y$ in $\mathbf{U}$. In particular, the once-banished formula $x \in x$ is well-formed again – so again we have to find some other solution to Russell's paradox.

Digging a little bit deeper, we discover that one of the assumptions of the paradox is the naïve axiom of comprehnsion: that is, whenever $\varphi (x)$ is a well-formed formula, then there exists a set $\{ x : \varphi (x) \}$ in $\mathbf{U}$ whose members are precisely those $x$ in $\mathbf{U}$ for which $\varphi (x)$ is satisfied. As such, we must be more careful about the sets we assume are in $\mathbf{U}$. This is where the set–class distinction comes from: in the usual parlance, ‘set’ refers to sets that are in $\mathbf{U}$, and ‘class’ refers to sets whose members are in $\mathbf{U}$ but are not necessarily in $\mathbf{U}$ themselves. To avoid confusion, I will say $\mathbf{U}$-set for the former.

So what should we assume instead of the naïve axiom of comprehension? Quine's New Foundations (NF) offers one option:

Stratified comprehension. Let us say a well-formed formula $\varphi (x)$ is stratified if there is a way to assign types to all the variables appearing in $\varphi (x)$ so that whenever $y \in z$ appears in $\varphi (x)$, $y$ is of type $\mathbf{U}_n$ and $z$ is of type $\mathbf{U}_{n+1}$, and whenever $y = z$ appears in $\varphi (x)$, both $y$ and $z$ are of type $\mathbf{U}_n$. Then, whenever $\varphi (x)$ is a stratified formula, the class $\{ x : \varphi(x) \}$ is a $\mathbf{U}$-set.

Roughly speaking, any set that exists under TST also exists under NF. In particular, the class $\{ x : x = x \}$ is a $\mathbf{U}$-set under NF – so NF admits a universal set. On the other hand, the paradoxical class $\{ x : x \notin x \}$ does not exist in NF, because $x \notin x$ is not a stratified formula. Now, the relative consistency of NF is not well-understood, but the related theory NFU (obtained by allowing $\mathbf{U}_0$ to be non-empty) is known to be consistent relative to ZF set theory. Thus, if we believe ZF is consistent, then we should also believe that there is a consistent set theory in which the universal set exists – in particular, the universal set does not produce a contradiction on its own.

Having mentioned it, I suppose I should also say how comprehension is handled in ZF. We have the following axiom:

Separation. For any $\mathbf{U}$-set $X$, if $\varphi (x)$ is any well-formed formula, the class $\{ x \in X : \varphi (x) \}$ is a $\mathbf{U}$-set.

Obviously, in the presence of a universal set, the axiom of separation is equivalent to the naïve axiom of comprehension, so we had better do something about that.

Regularity. Any $\mathbf{U}$-set $X$ has a member $Y$ such that any member of $Y$ is not a member of $X$. (Equivalently, $X \cap Y = \emptyset$.)

In particular, there is no universal set. It is tempting to call say that the membership relation $\in$ is well-founded on $\mathbf{U}$, but there is a subtlety here: only $\mathbf{U}$-sets are guaranteed to have a $\in$-minimal member. There are still other problems to fix, however – so far, there are no axioms that guarantee our universe $\mathbf{U}$ is non-empty! But that is a story for another day.

Finally, we should discuss formal class–set theories such as von Neumann–Bernays–Gödel (NBG) or Morse–Kelley (MK). In these theories, the universe of discourse $\mathbf{U}$ consists of ‘classes’, and a ‘set’ is defined to be a class that is a member of some class. To avoid confusion, let us say $\mathbf{V}$-class for the former and $\mathbf{V}$-set for the latter. A proper $\mathbf{V}$-class is a $\mathbf{V}$-class that is not also a $\mathbf{V}$-set.

We have a class comprehension axiom governing the formation of $\mathbf{V}$-classes:

Bounded class comprehension. If $\varphi (x)$ is a well-formed formula that does not have any bound variables ranging over $\mathbf{V}$-classes, then the class $\{ x : x \text{ is a } \mathbf{V} \text{-set and } \varphi (x) \}$ is a $\mathbf{V}$-class.

Full class comprehension. If $\varphi (x)$ is any well-formed formula, then the class $\{ x : x \text{ is a } \mathbf{V} \text{-set and } \varphi (x) \}$ is a $\mathbf{V}$-class.

NBG uses the bounded class comprehension axiom, while MK uses the full class comprehension axiom. Either way, we are guaranteed the existence of the $\mathbf{V}$-class $$\mathbf{V} = \{ x : x \text{ is a } \mathbf{V} \text{-set and } x = x \}$$ which contains all $\mathbf{V}$-sets. But is $\mathbf{V}$ itself a $\mathbf{V}$-set? To answer that we need an axiom telling us which $\mathbf{V}$-classes are $\mathbf{V}$-sets.

Limitation of size. Let us say that a bijection is a $\mathbf{U}$-bijection if its graph exists in $\mathbf{U}$, i.e. if it can be defined by a $\mathbf{V}$-class function. A $\mathbf{V}$-class $X$ is a $\mathbf{V}$-set if and only if there does not exist a $\mathbf{U}$-bijection between $X$ and $\mathbf{V}$.

In particular, $\mathbf{V}$ must be a proper $\mathbf{V}$-class. Note that by definition a proper $\mathbf{V}$-class cannot contain itself. Unfortunately, this doesn't answer the question of whether a $\mathbf{V}$-set can be contained in itself. In NBG and MK, this question is settled by the regularity axiom applied to classes:

Class regularity. Any $\mathbf{V}$-class $X$ has a member $Y$ such that any member of $Y$ is not a member of $X$.

Thus, no $\mathbf{V}$-set can contain itself – at least in NBG or MK.

Solution 3:

I don't entirely understand, "We can clearly define a set of sets. I feel intuitively like we ought to define sets which do contain themselves"

You can define the collection of all sets. It is defined by the formula $x = x$. $V = \{x : x = x\}$. Similarly, you can define the collection of all sets that are not members of themselves. The formula that does this is $x \notin x$. Let $A = \{x : x \notin x\}$. In any structure in the language of set theory, these would correspond to definable classes.

However, both of these are not sets. You are interested in the fact that $A$ is not a set. Suppose that $A$ is a set. You have that $A \in A$ or $A \notin A$. If $A \in A$, then $A$ does not satisfy the formula defining $A$; hence, you $A \notin A$. Contradiction. Now suppose that $A \notin A$. Then $A$ satisfies the defining formula for $A$. So $A \in A$. Contradiction. Since neither can occur, you must have that $A$ is not a set.

Russel paradox is resolved by limiting the comprehension axiom. Instead of all definable classes being sets, the axiom of specification asserts that the intersection of any definable class with a set is a set.