Why is it considered unlikely that there could be a contradiction in ZF/ZFC?

EDIT:

No answer addresses the "bottleneck" question. It's not surprising to me because the question is vague. But I would like to know whether that is indeed the reason, or perhaps something else. The question is interesting to me and I would be grateful for any help with it.

MAIN PART:

Unfortunately, this question will be somewhat vague, both because of its nature and because of my own limitations. I would appreciate your understanding, and any help with formulating it better.

I know (without knowing the proof) that the consistency of ZF or ZFC cannot be proven within these theories if they are consistent, whereas their inconsistency can be proven within them if they are inconsistent. I have read -- to my relief -- that it is considered unlikely that they are inconsistent. I would like to understand this statement better.

One simple argument that comes to mind is that no contradiction has been found so far, even though the theories have been extensively researched. But this alone seems a bit weak to me. The space of all provable statements in ZF or ZFC is clearly infinite. I have noticed that when mathematicians make statements about the likelihood of the truth of statements about elements of infinite classes, they usually give finer arguments than just the truth of the statement for the elements of some finite subclass. For example, many mathematicians seem to believe that Goldbach's conjecture is true, and they base their belief on theorems about the distribution of prime numbers in natural numbers.

Are there any arguments of this kind (unfortunately, I don't seem to be able to define what "this kind" means precisely here) for there not being a contradiction in ZF or ZFC? I've been thinking about how it could happen that there would actually be a contradiction in, say, ZF. I think we could define the "length" of a theorem in ZF to be the minimal number of symbols in a proof of the theorem. (Please tell me if there is something wrong with such a definition.) If we assume that ZF is inconsistent, then the proof of its inconsistency has a finite length, say $n$. For every natural number $k$ there is a finite number of theorems of length at most $k$, so we should be able to tell when we have proven all theorems of length at most $k$. The mathematical community has proven many theorems in ZF. Is it known how far we have gotten in this scale? For example, have we gotten past $k=10$? Let $m$ be the greatest natural number such that all theorems of length at most $m$ are known. Clearly, $n$ would have to be greater than $m$.
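As a rough check on the finiteness claim, here is a count under one assumption that is not part of the question itself: that ZF is written over a finite alphabet of $s \geq 2$ symbols (for instance, with variables encoded as $v$ followed by primes). Then the number of strings of at most $k$ symbols, and hence an upper bound on the number of proofs of at most $k$ symbols, is

$$\sum_{i=0}^{k} s^i = \frac{s^{k+1}-1}{s-1}.$$

Since checking whether a string is a valid proof is mechanical, all theorems of length at most $k$ can in principle be listed, although the bound grows exponentially in $k$.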

But I think many theorems must have been proven with length greater than $m$. Can we meaningfully talk about the chance of hitting the proof of the contradiction of ZF by making random correct reasonings of length $\geq n$? I've been trying to define an "inference bottleneck" that could cause the contradiction to be hard to hit, but I've failed. Since I haven't defined it, it may be difficult or impossible to understand what I mean by "inference bottleneck", but I hope it's not. I mean a theorem that can be proven by only a "small" number of reasonings, only I have trouble saying exactly in comparison to what it should be small.

I would like to ask if it actually is possible to define such "bottlenecks" and if so, would it be possible to prove that they cannot be too narrow? I'm thinking such a theorem could be a more convincing argument for there not being a contradiction in ZF.

And the more general question, to reiterate it, is what other arguments mathematicians (or philosophers?) give for ZF and ZFC being consistent. The belief in the consistency of those theories seems to me to be very strong among mathematicians, even though they tend to be very careful about saying things about other unproven statements. Why is that?

ZFC is meant to capture a certain notion, the cumulative hierarchy of sets. For this justification to make sense you need to think of "well ordering" or "ordinal" as a pre-existing mathematical concept, not one based on ZFC. This justification is explained at greater length in Shoenfield's article in the Handbook of Mathematical Logic from 1977, and elsewhere. It dates back to the early 20th century.

Assuming that we have a collection of "ordinals" $O$, which is downward closed and has minimal element $0$, we can define a collection $V^O$ as follows.

$V^O(0) = \emptyset$
If $\alpha + 1$ is in $O$ then $V^O(\alpha + 1)$ is the powerset of $V^O(\alpha)$
If $\lambda$ is a limit ordinal in $O$ then $V^O(\lambda)$ is the powerset of $\bigcup_{\alpha < \lambda} V^O(\alpha)$

Finally, $V^O$ itself is $\bigcup_{\alpha \in O} V^O(\alpha)$. The informal way to put this is: think of the elements of $O$ as "stages". Then a set will be put into $V^O$ at stage $\alpha$ if all of its elements have already appeared at stages earlier than $\alpha$.
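The finite stages of this construction can be computed directly. The following is a minimal Python sketch, assuming $O$ consists only of the finite ordinals $0, \dots, n$, so the limit clause never applies. The helper names `powerset` and `stages` are illustrative and do not come from any source.

```python
from itertools import combinations

def powerset(s):
    """Return the set of all subsets of the finite set s."""
    items = list(s)
    return frozenset(
        frozenset(c)
        for r in range(len(items) + 1)
        for c in combinations(items, r)
    )

def stages(n):
    """Return [V(0), ..., V(n)] with V(0) empty and V(a+1) = powerset(V(a))."""
    v = [frozenset()]
    for _ in range(n):
        v.append(powerset(v[-1]))
    return v

V = stages(4)
sizes = [len(stage) for stage in V]
# Each stage contains the previous one, which is why the hierarchy is "cumulative".
cumulative = all(V[i] <= V[i + 1] for i in range(len(V) - 1))
```

The sizes grow as iterated powers of two, so going beyond the fifth stage is already impractical; the sketch only illustrates the bottom of the hierarchy.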

We can ask: which axioms of set theory does $V^O$ satisfy?

$V^O$ will contain the empty set as long as $O$ has at least two elements, because the empty set appears in $V^O(1)$.
$V^O$ will satisfy the separation axiom. To see this, assume $z \in V^O$ and that we want to prove that $y = \{x \in z : \phi(x)\}$ is in $V^O$. Well, $z$ was formed at some stage, so all the elements of $z$ were formed at earlier stages. But this means that all the elements of $y$ were also formed at stages earlier than the one where $z$ was formed, so $y$ will be formed no later than $z$.
Every subset of a set $z$ is formed at the same time that $z$ is formed. Therefore, the powerset of $z$ will be formed at the stage after $z$ is formed, assuming there is a next stage. So if $O$ has no maximal element then $V^O$ satisfies the axiom of power set.
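The separation and power set arguments above can be checked mechanically on the finite stages. This is only a sketch under stated assumptions: it uses the finite ordinals $0, \dots, 4$, it replaces a formula $\phi$ of set theory by an arbitrary Python predicate, and the names `stage_of` and `separate` are made up for the illustration.

```python
from itertools import combinations

def powerset(s):
    """Return the set of all subsets of the finite set s."""
    items = list(s)
    return frozenset(
        frozenset(c)
        for r in range(len(items) + 1)
        for c in combinations(items, r)
    )

# Finite stages V(0), ..., V(4), following the definition above.
V = [frozenset()]
for _ in range(4):
    V.append(powerset(V[-1]))

def stage_of(x):
    """First stage at which the set x appears (x must lie in some V(i))."""
    return next(i for i, s in enumerate(V) if x in s)

def separate(z, phi):
    """Form {x in z : phi(x)} for a Python predicate phi (stand-in for a formula)."""
    return frozenset(x for x in z if phi(x))

def is_empty(x):
    return len(x) == 0

# Separation: the subset y of z is formed no later than z.
separation_ok = all(
    stage_of(separate(z, is_empty)) <= stage_of(z) for z in V[3]
)
# Power set: the powerset of z is present one stage after z is formed.
powerset_ok = all(powerset(z) in V[stage_of(z) + 1] for z in V[3])
```

Both checks succeed for every $z$ in $V^O(3)$, matching the informal argument; a finite check of this kind says nothing, of course, about the infinite stages.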

These examples suggest that the axioms that are satisfied by $V^O$ will depend on how "long" $O$ is. Indeed, it turns out that if we assume sufficient properties about $O$ then we can argue in a similar way that $V^O$ satisfies all the ZFC axioms. In particular, if we let $O$ contain all the ordinals, then $V^O$ will satisfy ZFC, and we usually just write $V$ instead of $V^O$ in this case. This $V$ based on all the ordinals will be a proper class, not a set.

This argument cannot be captured in ZFC itself, although various properties of the cumulative hierarchy can be captured in ZFC. But the argument does give some motivation for why ZFC should be consistent, by giving a conception of sets (as elements of $V$) which seems intuitively reasonable. Indeed, it appears that all we need to have in order to form $V$ is a well-determined collection of ordinals, the ability to take powersets, the ability to take unions, and the ability to iterate these operations along the ordinals.

So where could ZFC be inconsistent, even if this argument is correct? One place is the separation axiom. In the argument above we assumed that $\{x \in z : \phi(x)\}$ actually does define a subset of $z$ whenever $\phi$ is a formula of set theory. If somehow there were formulas $\phi$ which do not determine subsets of $z$, our argument for why the separation axiom holds in $V$ would not go through. There is a certain sense in which the argument above is proving the consistency of second-order ZFC rather than first-order ZFC, just as the informal proof of consistency of Peano arithmetic that says "$\mathbb{N}$ is a model" is really a consistency proof for second-order PA rather than first-order PA.

I am not a set theorist, so this is a slightly naive opinion.

1) As you say, many people have been exploring the consequences of ZF(C) for about 100 years, without having found a contradiction yet.

You mention this in your question and add that you don't find it very convincing. I think it is in a certain sense very convincing, just not in the usual sense of mathematics. If you think about it, strong inductive evidence is the most convincing evidence we have for most things outside of mathematics. Moreover, we are certainly willing to put our money where our mouths are for this kind of speculation: we believe, for instance, that factoring numbers is a computationally intractable problem. Why do we believe this? So far as I know, the best reason is that people have been trying really, really hard for hundreds of years to come up with efficient factoring methods and have not yet succeeded. I am not aware of any "programme" to prove this belief about factoring. (It would be disproved, of course, if it turned out that $P = NP$, but very few people believe that!) But anyway, this belief is good enough for us to have made much of contemporary banking and government security depend on the difficulty of factoring!

2) Before axiomatic ZF(C) set theory, people had intuitive ideas about sets, and by the way, we still do. If I interpret the axioms of ZF(C) set theory as axioms about the sets that I think about and use every day, then they are all assertions that I am quite sure hold true. By Gödel's Completeness Theorem, an axiom system is formally consistent exactly when it has a model, so consistency can be shown by exhibiting one. In other words, when we worry that a formal system is inconsistent, we are precisely worried that it has no model. With ZF(C) we have an axiom system based on an informal model that most of us already have in our heads. (In fact, to be honest, most working mathematicians know only the intuitive model of sets, not the ZF(C) axioms.) Now an "informal model" is not a model in the sense of mathematical logic: it's not even a mathematical object! But as intuition, it seems convincing: axiom systems that are chosen to model something that I already think exists are not the axiom systems where I'm worried about deriving a formal contradiction.

Unfortunately intuition -- especially, naive or untested intuition -- in mathematics often turns out to be wrong, which takes me back to my first point. If something that I deeply believe is true has held up to a century of formal attacks, then yes, I feel pretty good about it. It would be better if we could prove it, but apparently that's not how it works...
