An (apparently) vicious circle in logic

Solution 1:

In any mathematical study there are two things that have to be clear: the "object" topic that we wish to study and the "background" or "meta" system that we use to study it. We can pick each of these with complete freedom. (Of course, if the background system is too weak, it won't be able to say anything about the object topic, but that's life).

For example, suppose we want to study real numbers as our object topic.

  • We could use Euclidean geometry as our background system, so we would end up looking at numbers that can be constructed by ruler and compass. We would declare some segment $S$ to have length $1$ "by convention" and then other segments would represent real numbers depending on how the ratio of their length to the length of $S$. With more work we could even find ways to represent polynomials in terms of finite arrangements of points on the plane, so we could ask which polynomials can be constructed and which constructible polynomials have constructible roots.

  • We could use ZFC, and define real numbers as some sort of sets, and study these particular sets.

  • We could work more axiomatically by working in ZFC and studying the theory of a complete Archimedean ordered field. One difference between this and the previous bullet is that if we think of a real number just as an element of some field then it makes no sense to ask whether $5 \in \pi$, because there is no "$\in$" in the language of fields; but if we think of a real number as just some particular set then it does make sense to ask whether $5 \in \pi$. Similarly, in geometry we could ask whether two numbers (represented by line segments) are parallel.

The same sort of thing works for any other object theory. For example, if we want to study first-order logic, we could:

  • We could work with a very weak theory that is only able to manipulate finite binary strings. By appropriate coding techniques we can view certain strings as representing formulas from first-order logic, in the same way that Euclidean geometry uses line segments to represent real numbers. We can then see what sort of constructions we can do on these strings.

  • We could work in ZFC, develop a semantics for first-order logic, and study the relationship between provability and semantics.

The various ways of studying a single topic each act as "filters" that hide certain aspects of the topic and emphasize other aspects. At the same time, as was seen above, many of the meta systems add their own idiosyncracies as well. Most of the commonly used background systems can be arranged in a linear hierarchy of increasing strength (e.g. PRA < PA < ZF < ZFC), which Simpson has called the "Gödel hierarchy". Different portions of this hierarchy provide tools to study different aspects of the object topics.

There is no reason that we cannot use a certain system as a background system to study itself. For example, we can study ZFC within ZFC. This is no more paradoxical than using a wrench to fix the very same casting machine that made that wrench. Similarly, when we want to study ZFC, we may decide to use syntactic first-order logic as our base system, and when we want to study first-order logic, we may decide to take ZFC as our base system.

Solution 2:

As cody pointed out to me, this answers the question from the viewpoint of a mathematical platonist.

There are 2 different ZFCs. First is the ZFC that we work in, the one we know and love. In this ZFC, a "proof" means what we mean colloquially by a "proof". In this "casual": ZFC, helped by our real world intuition and experience, we know that "A and B" means what it does in english: both A and B are true. We know what "A(x) is true for all x" means - what it does in english. We know that 2+2=4, not because of number theory or the Peano axioms, but because that reflects our daily experience. We learn how to recognize proofs, not by their ability to be transformed into formal language, but by experience, intuition, and gut feeling.

Within this "casual" ZFC, we create the rigorous discipline known as logic. Within this logic, we formalize the notion of "truth", we formalize the notion of "proof", we formalize, well, everything. Most of our definitions which formalize these notions are based on our feelings/experience in "casual" ZFC. For example, in logic, we define $(A\wedge B)$ to hold if and only if $A$ holds and $B$ holds.

Finally, within logic, we define "formal" ZFC. Here, ZFC is a list of axioms and "proof" means "a chain of sentences, each of which is either a hypothesis, axiom, or can be formally deduced from a previous sentence in the chain by modus ponens." "Truth" becomes a semantic notion, often depending on which particular model of ZFC we are working with.

Solution 3:

Formally, first-order logic is the metatheory in which most mathematics takes place. In this sense, first-order logic is prior to set theory in that when we prove something in set theory, we are using finitistic reasoning which is formalized rigorously in our metatheory. However, it is a phenomenon of sufficiently strong theories (ZF, Robinson Arithmetic) that they are able to formalize what it is to be a proof within themselves. Thus, you have a formula in ZFC $\psi$ that takes Gödel numbers of sentences $\phi$ ($\ulcorner\phi\urcorner$) s.t $\psi(\ulcorner\phi\urcorner)=1$ iff there is a proof in ZFC of $\phi$. This phenomenon is the source of Gödel's incompleteness theorems. It is important to keep these two things separate, the first order logic formalized in the metatheory and the phenomenon that sufficiently strong theories are able to express fragments of this.

Solution 4:

The question what is actually meant by "ZFC is a basis for FOL" and "FOL is a basis for ZFC"?

The idea seems to be that all expressions can be reduced to a formal language ZFC/FOL. That was Hilbert's program. Turns out that this has some problems. And Gödel's incompleteness theory is not the only one in my view.

I think a much better analogy is a compiler which compiles source code into machine code. Machine code is a language which has to be realizable.

I believe this is a very deep question, and that the more or less arbitrary definition of level and meta-level is not the answer. The system itself would have incorporate a definition of levels. This is related to the question how to define definition.