How is Gödel's Completeness Theorem not a tautology?

As a physicist trying to understand the foundations of modern mathematics (in particular model theory), I have a hard time coping with the border between syntax and semantics. I believe a lot would become clearer for me if I stated what I think Gödel's Completeness Theorem is about (after studying various materials, including Wikipedia, it seems redundant to me) and someone knowledgeable clarified my misconceptions. So here it goes:

As I understand it, if we have a set $U$ with a particular structure (functions, relations, etc.), we can interpret it, through a particular signature (e.g. the group signature $\{ e, \cdot \}$), as a model $\mathfrak{A}$ of a certain mathematical theory $\mathcal{T}$ (a theory being a set of axioms together with their consequences). The theory is satisfied by $\mathfrak{A}$ only if $U$'s structure satisfies the axioms.

Enter Gödel's theorem: for every first-order theory $\mathcal{T}$:

$$\left( \exists \textrm{ model } \mathfrak{A}: \mathfrak{A} \models \mathcal{T} \right) \iff \mathcal{T} \textrm{ is consistent}$$

So I'm confused. Isn't $\mathcal{T}$ being consistent a natural requirement which implies that a set $U$ with a corresponding structure always exists (because ZFC set theory gives us the freedom to construct sets as we please, without any concerns regarding what constitutes them)? And doesn't that in turn always allow us to create a model $\mathfrak{A}$ with an interpretation of the signature of $\mathcal{T}$ in terms of $U$'s structure?

Where am I making mistakes? What concepts do I need to understand better in order to properly comprehend this theorem, and to see what model theory is and is not about? Please help!


Solution 1:

It may help to look at things from a more general perspective. Presentations that focus on just first-order logic may obscure the fact that specific choices are implicit in the definitions of first-order logic; the general perspective highlights these choices. I want to write this up in detail, as a reference.

General "logics"

We define a particular type of general "logic" with negation. This definition is intended to be very general. In particular, it accommodates much broader types of "syntax" and "semantics" than first-order logic.

A general "logic" will consist of:

  • A set of "sentences" $L$. These do not have to be sentences in the sense of first-order logic; they can be any objects at all.

  • A function $N: L \to L$ that assigns to each $x \in L$ a "negation" or "denial" $N(x)$.

  • A set of "deductive rules", which are given as a closure operation on the powerset of $L$. So we have a function $c: 2^L \to 2^L$ such that

    1. $S \subseteq c(S)$ for each $S \subseteq L$

    2. $c(c(S)) = c(S)$ for each $S \subseteq L$

    3. If $S \subseteq S'$ then $c(S) \subseteq c(S')$.

  • A set of "models" $M$. These do not have to be structures in the sense of first-order logic. The only assumption is that each $m \in M$ comes with a set $v_m \subseteq L$ of sentences that are "satisfied" (in some sense) by $m$:

    1. If $S \subseteq L$ and $x \in v_m$ for each $x \in S$ then $y \in v_m $ for each $y \in c(S)$

    2. There is no $m \in M$ and $x \in L$ with $x \in v_m$ and $N(x) \in v_m$

The exact nature of the "sentences", "deductive rules", and "models", and the definition of a model "satisfying" a sentence, are irrelevant, as long as they satisfy the axioms listed above. These axioms are compatible with both classical and intuitionistic logic. They are also compatible with infinitary logics such as $L_{\omega_1, \omega}$, with modal logics, and with other logical systems.

The main restriction in the definition is that we have included a notion of negation or denial, so that we can talk about consistency.

  • We say that a set $S \subseteq L$ is syntactically consistent if there is no $x \in L$ such that $x$ and $N(x)$ are both in $c(S)$.

  • We say $S$ is semantically consistent if there is an $m \in M$ such that $x \in v_m$ for all $x \in S$.

The definition of a general "logic" is designed to imply that each semantically consistent theory is syntactically consistent.
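
To make this concrete, here is a minimal Python sketch of the definition, under the simplifying assumption that $L$ is finite so that $c$ and the consistency checks are computable; all names (`GeneralLogic`, `syntactically_consistent`, and so on) are illustrative, not taken from any library.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Iterable

@dataclass(frozen=True)
class GeneralLogic:
    sentences: FrozenSet    # the set L
    negate: Callable        # the function N : L -> L
    close: Callable         # the closure operator c : 2^L -> 2^L
    models: tuple           # the set M
    satisfied: Callable     # m  |->  v_m, the set of sentences m satisfies

def syntactically_consistent(logic: GeneralLogic, S: Iterable) -> bool:
    """No sentence x has both x and N(x) in c(S)."""
    closure = logic.close(frozenset(S))
    return not any(logic.negate(x) in closure for x in closure)

def semantically_consistent(logic: GeneralLogic, S: Iterable) -> bool:
    """Some model m satisfies every sentence of S, i.e. S is a subset of v_m."""
    S = frozenset(S)
    return any(S <= logic.satisfied(m) for m in logic.models)
```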

First-order logic as a general logic

To see how the definition of a general "logic" works, here is how to view first-order logic in any fixed signature as a general "logic". Fix a signature $\sigma$.

  • $L$ will be the set of all $\sigma$-sentences.

  • $N$ will take a sentence $x$ and return $\lnot x$, the canonical negation of $x$.

  • $c$ will take $S \subseteq L$ and return the set of all $\sigma$-sentences provable from $S$.

  • $M$ will be the set of all $\sigma$-structures. For each $m \in M$, $v_m$ is given by the usual Tarski definition of truth.

With these definitions, syntactic consistency and semantic consistency in the general sense match up with syntactic consistency and semantic consistency as usually defined for first-order logic.
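
A full implementation of this instance is out of reach for a short sketch, since first-order provability is undecidable; but the same construction goes through verbatim for propositional logic over finitely many atoms, which makes a faithful toy case. In the sketch below (reusing `GeneralLogic` from above), $c$ is computed semantically, which is justified only because the propositional completeness theorem says this coincides with deductive closure:

```python
from itertools import product

# Propositional formulas as nested tuples: an atom is a string, and
# ('not', f), ('and', f, g), ('or', f, g) build compound sentences.
def holds(m, f):
    """Tarski-style truth of f under an assignment m (a dict atom -> bool)."""
    if isinstance(f, str):
        return m[f]
    op, *args = f
    if op == 'not':
        return not holds(m, args[0])
    if op == 'and':
        return holds(m, args[0]) and holds(m, args[1])
    if op == 'or':
        return holds(m, args[0]) or holds(m, args[1])
    raise ValueError(f"unknown connective {op!r}")

ATOMS = ('p', 'q')
MODELS = tuple(dict(zip(ATOMS, bits))
               for bits in product((True, False), repeat=len(ATOMS)))

# Keep L finite so that c is computable; extend it with formulas of interest.
L = frozenset({'p', 'q', ('not', 'p'), ('not', 'q'), ('and', 'p', ('not', 'p'))})

def close(S):
    # c(S): every sentence of L true in all models of S.  By propositional
    # completeness this coincides with deductive closure (within L).
    mods = [m for m in MODELS if all(holds(m, x) for x in S)]
    return frozenset(x for x in L if all(holds(m, x) for m in mods))

prop = GeneralLogic(
    sentences=L,
    negate=lambda x: ('not', x),
    close=close,
    models=MODELS,
    satisfied=lambda m: frozenset(x for x in L if holds(m, x)),
)
```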

The completeness theorem

Gödel's completeness theorem simply says that, if we treat first-order logic in a fixed signature as a general "logic" (as above), then syntactic consistency is equivalent to semantic consistency.
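
In the propositional toy instance above, we can spot-check this equivalence directly (a demonstration, of course, not a proof):

```python
# Syntactic and semantic consistency agree on the toy instance.
for S in (frozenset({'p'}),
          frozenset({'p', ('not', 'q')}),
          frozenset({('and', 'p', ('not', 'p'))})):   # the last is inconsistent
    assert syntactically_consistent(prop, S) == semantically_consistent(prop, S)
```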

The benefit of the general perspective is that we can see how things could go wrong if we change just one part of the interpretation of first-order logic with signature $\sigma$ as a general "logic":

  1. If we were to replace $c$ with a weaker operator, syntactic consistency may not imply semantic consistency. For example, we could take $c(S) = S$ for all $S$. Then there would be syntactically consistent theories that have no model. In practical terms, making $c$ weaker means removing deduction rules. (All three failure modes are demonstrated in the code sketch after this list.)

  2. If we were to replace $M$ with a smaller class of models, syntactic consistency may not imply semantic consistency. For example, if we take $M$ to be just the set of finite $\sigma$-structures, there are syntactically consistent theories that have no model. In practical terms, making $M$ smaller means excluding some structures from consideration.

  3. If we were to replace $c$ with a stronger closure operator, semantic consistency may not imply syntactic consistency. For example, we could take $c(S) = L$ for all $S$. Then there would be semantically consistent theories that are syntactically inconsistent. In practical terms, making $c$ stronger means adding new deduction rules.
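
Continuing the toy instance, each failure mode can be exhibited directly. For mode 2 the class of models has to be shrunk by hand, since every satisfiable propositional theory here already has a model among the finitely many assignments; for mode 3, note that the modified structure violates the first model axiom above, which is exactly why the semantic-to-syntactic direction breaks:

```python
# Failure mode 1: weaken c to the identity (no deduction rules at all).
# S = {p & ~p} is then "syntactically consistent" (its closure contains no
# sentence together with its negation), yet it has no model.
weak = GeneralLogic(sentences=prop.sentences, negate=prop.negate,
                    close=lambda S: S,
                    models=prop.models, satisfied=prop.satisfied)
contradiction = frozenset({('and', 'p', ('not', 'p'))})
assert syntactically_consistent(weak, contradiction)
assert not semantically_consistent(weak, contradiction)

# Failure mode 2: shrink M by hand, keeping only assignments where p is false.
# S = {p} stays syntactically consistent but now has no model.
small = GeneralLogic(sentences=prop.sentences, negate=prop.negate,
                     close=prop.close,
                     models=tuple(m for m in prop.models if not m['p']),
                     satisfied=prop.satisfied)
assert syntactically_consistent(small, {'p'})
assert not semantically_consistent(small, {'p'})

# Failure mode 3: strengthen c so everything is "provable" from anything.
# S = {p} is semantically consistent but now syntactically inconsistent.
# This c violates the first model axiom, so it is no longer a general "logic".
strong = GeneralLogic(sentences=prop.sentences, negate=prop.negate,
                      close=lambda S: prop.sentences,
                      models=prop.models, satisfied=prop.satisfied)
assert semantically_consistent(strong, {'p'})
assert not syntactically_consistent(strong, {'p'})
```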

On the other hand, some changes would preserve the equivalence of syntactic and semantic consistency. For example, if we take $M$ to be just the set of finite or countable $\sigma$-structures, we can still prove the corresponding completeness theorem for first-order logic. In this sense, the choice of $M$ to be the set of all $\sigma$-structures is arbitrary.

Other completeness theorems

We say that the "completeness theorem" for a general "logic" is the theorem that syntactic consistency is equivalent to semantic consistency in that logic.

  • There is a natural completeness theorem for intuitionistic first-order logic. Here we let $c$ be the closure operator derived from any of the usual deductive systems for intuitionistic logic, and let $M$ be the set of Kripke models (a minimal sketch of Kripke semantics follows this list).

  • There is a completeness theorem for second-order logic (in a fixed signature) with Henkin semantics. Here we let $c$ be the closure operator derived from the usual deductive system for second-order logic, and let $M$ be the set of Henkin models. On the other hand, if we let $M$ be the set of all "full" models, the corresponding completeness theorem fails, because this class of models is too small.

  • There are similar completeness theorems for propositional and first-order modal logics using Kripke frames.
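
To make "Kripke models" concrete, here is a minimal sketch of Kripke semantics for intuitionistic propositional logic (the names and the two-world model are illustrative): a formula is forced at a world only if it remains true at every later world, and the standard two-world model refutes excluded middle, which is why these models match intuitionistic rather than classical deduction.

```python
# A minimal Kripke model: two worlds w0 <= w1, with the atom p true only at
# w1.  SUCCESSORS encodes the (reflexive, transitive) order; the valuation
# is monotone, so truth persists as we move to later worlds.
WORLDS = ('w0', 'w1')
SUCCESSORS = {'w0': ('w0', 'w1'), 'w1': ('w1',)}
VALUATION = {'w0': set(), 'w1': {'p'}}

def forces(w, f):
    """w ||- f, by the usual Kripke clauses; formulas are nested tuples as in
    the earlier sketch, plus ('imp', f, g) for implication."""
    if isinstance(f, str):
        return f in VALUATION[w]
    op, *args = f
    if op == 'and':
        return forces(w, args[0]) and forces(w, args[1])
    if op == 'or':
        return forces(w, args[0]) or forces(w, args[1])
    if op == 'imp':   # every later world forcing the antecedent forces the consequent
        return all(not forces(u, args[0]) or forces(u, args[1])
                   for u in SUCCESSORS[w])
    if op == 'not':   # no later world forces f
        return all(not forces(u, args[0]) for u in SUCCESSORS[w])
    raise ValueError(f"unknown connective {op!r}")

# Excluded middle fails intuitionistically: w0 forces neither p nor ~p,
# because p is not yet true at w0 but becomes true at the later world w1.
assert not forces('w0', ('or', 'p', ('not', 'p')))
assert forces('w1', 'p')
```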

In each of those three cases, the historical development began with a deductive system, and the corresponding set of models was identified later. But, in other cases, we may begin with a set of models and look for a deductive system (including, in this sense, a set of axioms) that leads to a generalized completeness theorem. This is related to a common problem in model theory, which is to determine whether a given class of structures is "axiomatizable".

Solution 2:

The usual form of the completeness theorem is this: $T \models \phi \implies T \vdash \phi$; that is, if $\phi$ is true in all models $\mathcal{M} \models T$, then there is a proof of $\phi$ from $T$. This is a non-trivial statement: structures and models are about sets with operations and relations that satisfy sentences, while proofs ignore the sets with structure and just give rules for deriving new sentences from old.

If you go to second-order logic, this is no longer true. We can have a theory $PA$ which has only one model, $\mathbb N \models PA$, but there are sentences $\phi$ with $PA \models \phi$ and $PA \not\vdash \phi$ ("true but not provable" sentences). This follows from the incompleteness theorem, which says that truth in the particular model $\mathbb N$ cannot be pinned down by proofs. First-order logic avoids this because a first-order theory cannot pin down the single model $\mathbb N \models PA$: it must also admit non-standard models (this follows from the Löwenheim-Skolem theorem).

This theorem, along with the soundness theorem $T \vdash \phi \implies T \models \phi$, gives a strong correspondence between the syntax and semantics of first-order logic.

Your main confusion is that consistency here is a syntactic notion, so by itself it says nothing about models. The correspondence between the usual form of the completeness theorem and your form comes from putting the contradiction $\bot$ in place of $\phi$ and taking the contrapositive. If $T \not\vdash \bot$ (that is, $T$ is consistent), then $T \not\models \bot$. By definition this means there is a model $\mathcal M \models T$ with $\mathcal M \not\models \bot$; but no structure satisfies $\bot$, so the second condition is automatic, and we simply get that $T$ has a model.
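
Spelled out, with the first arrow being the contrapositive of completeness applied to $\phi = \bot$:

$$T \not\vdash \bot \;\Longrightarrow\; T \not\models \bot \;\Longrightarrow\; \exists \mathcal{M} \left( \mathcal{M} \models T \textrm{ and } \mathcal{M} \not\models \bot \right) \;\Longrightarrow\; \exists \mathcal{M}: \mathcal{M} \models T.$$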