How can there be genuine models of set theory?

I know that this is a beginner's question that has been asked too many times, but I still haven't found an answer that lets me stop asking:

Given that a model/interpretation of a theory (in the Tarskian sense) is a set with some structure, how can there be models of set theory, since we know that the class of all sets (the range of the quantifiers of set theory) is not a set?

I especially wonder why

a) it is stressed so often and so strongly that models must be sets?

b) nevertheless sometimes proper classes are allowed? (see Wikipedia's Inner Model Theory: "models are transitive subsets or subclasses")


I especially wonder why

a) it is stressed so often and so strongly that models must be sets?

There are several reasons why model theory texts only look at models that are sets. These reasons are all related to the fact that model theory is itself studied using some (usually informal) set theory.

One benefit of sticking with set-sized models is that it makes it possible to perform algebraic operations on models, such as taking products and ultrapowers, without any set-theoretic worries.

Another benefit of requiring models to be sets is that it makes it possible to define the satisfaction relation $\vDash$ for each model. In other words, given a model $M$ in a language $L(M)$, we want to form $T(M) = \{ \phi \in L(M) : M \vDash \phi\}$. This can be done when $M$ is a set, by going through Skolem normal form. But it cannot be done, in general, when $M$ is a proper class, because of Tarski's undefinability theorem. In particular, if we let $M$ be the class-sized model $V$ of the language of set theory, then Tarski's theorem shows that $T(M)$ is not definable in $V$. We can define the truth of each individual formula (using the formula itself), but in general there may be no global definition of truth in a proper-class-sized model.
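To make the set-sized case concrete, here is a minimal sketch of a satisfaction relation computed by recursion on formulas over a finite structure. It uses the direct Tarski-style recursion rather than the Skolem normal form route, and the tuple encoding of formulas, the function name `satisfies`, and the three-element example structure are all assumptions made for illustration.

```python
def satisfies(domain, rel, formula, env=None):
    """Decide whether (domain, rel) satisfies `formula` under assignment `env`.

    Formulas are nested tuples (a hypothetical encoding):
      ("in", x, y), ("eq", x, y), ("not", f), ("and", f, g),
      ("implies", f, g), ("iff", f, g), ("forall", x, f), ("exists", x, f)
    """
    env = {} if env is None else env
    op = formula[0]
    if op == "in":       # atomic formula: x E y, using the structure's relation
        return (env[formula[1]], env[formula[2]]) in rel
    if op == "eq":       # atomic formula: x = y
        return env[formula[1]] == env[formula[2]]
    if op == "not":
        return not satisfies(domain, rel, formula[1], env)
    if op == "and":
        return (satisfies(domain, rel, formula[1], env)
                and satisfies(domain, rel, formula[2], env))
    if op == "implies":
        return (not satisfies(domain, rel, formula[1], env)
                or satisfies(domain, rel, formula[2], env))
    if op == "iff":
        return (satisfies(domain, rel, formula[1], env)
                == satisfies(domain, rel, formula[2], env))
    if op == "forall":   # quantifiers range over the whole domain, which is a set
        return all(satisfies(domain, rel, formula[2], {**env, formula[1]: a})
                   for a in domain)
    if op == "exists":
        return any(satisfies(domain, rel, formula[2], {**env, formula[1]: a})
                   for a in domain)
    raise ValueError(f"unknown connective: {op!r}")


# Toy structure: three elements with 0 E 1, 0 E 2, 1 E 2.
domain = {0, 1, 2}
rel = {(0, 1), (0, 2), (1, 2)}

sentences = {
    "empty_set_exists":
        ("exists", "x", ("forall", "y", ("not", ("in", "y", "x")))),
    "universal_set_exists":
        ("exists", "x", ("forall", "y", ("in", "y", "x"))),
    "extensionality":
        ("forall", "x", ("forall", "y",
            ("implies",
             ("forall", "z", ("iff", ("in", "z", "x"), ("in", "z", "y"))),
             ("eq", "x", "y")))),
}

# The fragment of T(M) restricted to the sentences listed above.
theory = {name for name, phi in sentences.items()
          if satisfies(domain, rel, phi)}
```

The quantifier clauses are the point: they loop over the whole domain, which only makes sense as a single definition because the domain is a set. For a proper class such as $V$ there is no corresponding uniform definition inside the theory, which is what Tarski's theorem rules out.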

Moreover, in model theory, there is no real need to look at proper-class-sized models, because there is already enough interesting behavior from set-sized models. The motivating examples are all sets (algebraic structures, partial orders, etc). And the completeness theorem shows that any consistent theory has a set-sized model (this includes ZFC). So model theorists generally restrict themselves to set-sized models.

b) nevertheless sometimes proper classes are allowed?

Generally, people are only interested in proper-class-sized models in the context of set theory. The reason for the interest is that ZFC can't prove that there is a set model of ZFC (because ZFC can't prove Con(ZFC)), but it is possible to form proper-class-sized models of ZFC from a given proper-class-sized model of ZFC (e.g. the inner model $L$). This allows for some model-theoretic results about set theory, but many things that are taken for granted in model theory have to be re-checked when we move to proper-class-sized models. The re-checking is usually routine, and it only comes up in advanced settings, where an author is not likely to make a big fuss about it. The benefit of this labor is that we can sometimes avoid having to assume Con(ZFC) as a hypothesis for a theorem about models of set theory.

In summary, in any non-set-theoretic context, "model" will mean "set-sized model". In the context of set theory, this is still what "model" usually means; set theorists usually say "inner model" or "class model" for a proper-class-sized model. But some attention to context is needed when you are working with "models" of set theory, to make sure you read what the author intended.


Some comments. The basic distinction you need to make is between the external and internal notions of set. Let me take as granted a primitive and unspecified notion of set: this will be our external notion of set. For any first-order theory $T$ in a language $L$, a model of $T$ is a set (in this external sense) equipped with functions and relations satisfying the axioms of $T$. In particular, a model of, say, ZFC is a set $M$ equipped with a binary relation satisfying the axioms of ZFC.

(Requiring that models themselves be sets is likely a matter of convenience. In category theory this requirement can be restrictive, and one way to get around it is the notion of a Grothendieck universe. But I won't say more about this; it isn't central to your misunderstanding.)

Now the elements $m \in M$ of a model of set theory are themselves supposed to be interpreted as sets, but the word "set" here means something different: it is an internal notion specified by $T$ (and $L$). To prevent confusion here it would really be best to replace "set" with some other word, such as "foo." Thus we should speak of foo theory and the class of all foos, which is not a foo. (A class is just an external subset of $M$ specified by some formula.)

When we say that the class of all sets is not a set, what we mean is that there does not exist an element of $M$ which contains all other elements of $M$ (by the axiom of regularity). We don't mean that $M$ is itself not an external set, because it is by definition an external set.
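A small sketch may help separate the two notions. It builds the finite stage $V_3$ of the cumulative hierarchy as a Python `frozenset` and checks both claims on it. This is only a toy structure and is very far from a model of ZFC; the helper names `powerset` and `V` are assumptions for illustration.

```python
from itertools import chain, combinations


def powerset(s):
    """Return the set of all subsets of s, each as a frozenset."""
    items = list(s)
    return frozenset(
        frozenset(c)
        for c in chain.from_iterable(
            combinations(items, r) for r in range(len(items) + 1)
        )
    )


def V(n):
    """Stage n of the cumulative hierarchy: V(0) is empty, V(k+1) = P(V(k))."""
    level = frozenset()
    for _ in range(n):
        level = powerset(level)
    return level


# Externally, M is an ordinary (finite) set; its elements are the "foos".
M = V(3)

# Internal question: is there a foo that contains every foo?
has_universal_foo = any(all(a in m for a in M) for m in M)

# Is the external set M itself one of the foos?
M_is_a_foo = M in M

print(len(M), has_universal_foo, M_is_a_foo)
```

Here `M` is a perfectly good external set with four elements, yet no element of `M` contains all elements of `M`, and `M` is not one of its own elements. That is the sense in which "the class of all foos is not a foo" is compatible with $M$ being a set.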

I believe the Wikipedia article on inner models is talking about internal classes (which are still just subsets of $M$, an external set), but I'm not sure.

One last thing: by the incompleteness theorem, ZFC cannot prove that a model of ZFC exists unless ZFC is inconsistent, since exhibiting such a model would prove that ZFC is consistent.