Why does one have to check if axioms are true?

Solution 1:

I believe Tao means that the axioms of reflexivity, symmetry, and transitivity are adequate to capture our pre-existing (this is the key) intuition about what "equality" between two objects ought to be. Let me try two contrasting examples to help unpack what I mean.

Version 1

You: Sets $A$ and $B$ are shmequal provided $x \in A \Leftrightarrow x \in B$ for all $x$.

Me: That sounds like a fine relation to investigate. Creative name, by the way.

Version 2

You: Sets $A$ and $B$ are equal provided $x \in A \Leftrightarrow x \in B$ for all $x$.

Me: Now, hold on just a second. By "equal", you mean "identical" or "exactly the same"? I'm not sure I'm ready to accept that this abstract definition captures all that. You would need to show me that the relation $x \in A \Leftrightarrow x \in B$ for all $x$ is reflexive, symmetric, and transitive before I'm willing to concede that this deserves a name like "equal".

Comment from OP

An example came to mind: we define equality for ordered pairs: $(x,y)=(a,b)⟺x=a∧y=b$. To show that equality for ordered pairs is reflexive we need to show $(x,y)=(x,y)$, which by definition means $x=x∧y=y$. But $x$ and $y$ could itself be ordered pairs. So now we are in the situation where we have to prove that every ordered pair is equal to itself but where we also have to accept this as given.

A weak response from me

I struggle to find good words to address your question, but it might help to remember that we are not checking whether $(x,y) = (x,y)$. Rather this is one of the things we insist should be the case if "equality" is to mean anything; it must apply to identical objects. Instead, we are asking "For ordered pairs, does the $x = a \wedge y = b$ property capture this self-evident truth about equality?". We find that it does: $x = x$ and $y = y$ are both true statements because we are comparing two identical objects in each case.

Solution 2:

You are probably confused because you think that axioms are (by definition) statements that we take as true without any proofs. However, this word has a slightly different meaning.

Axioms are a starting point of a mathematical theory. When you build a theory, for example, Arithmetic, from scratch, you need some preliminary facts, otherwise you cannot prove anything. In Arithmetic and a bunch of other mathematical theories the described properties of equality are indeed axioms, that one does not prove. Equality is a primitive notion, and the only sensible way to actually define it is to postulate that these natural (as it seems to us humans) properties hold.

However, in set theory, these "axioms" are not the definition of equality. Rather, equality is defined via the formula above: two sets are equal when they consist of identical elements. But when we define equality in this way, there is a natural question: why we are naming this as "equality" at all? This is why we prove "axioms of equality", which we are already used to, to show that the naming "equality" is adequate. And when we prove them, they become theorems of set theory and properties of equality rather than axioms. This is because set theory is more fundamental and more powerful than most mathematical theories in a sense that you can build (almost) all mathematics based on it.

Solution 3:

There are axioms and then there axioms. Most of the time mathematicians use the word "axiom" they mean it in a definitional sense. Instead of "An equality is a relation satisfying the reflexivity, symmetry, transitivity, and substitutivity axioms" think instead "By definition, equality is any relation for which reflexivity, symmetry, transitivity, and substitutivity hold". In other words, you can call some relation an equality if you can show that it meets the definition, i.e. that reflexivity, symmetry, transitivity, and substitutivity are true of it. So the answer to your first question is "yes". To put it another way, these "axioms" are true of equality relations, but you have to show that your relation is in fact an equality relation. This is exactly the same situation as the definition of a mathematical group for instance.

To put it an even better way, these "axioms" are axioms in the "theory of equality" and you want to show that a particular relation is a model/interpretation/semantics for that theory. Very briefly and roughly sketching, in formal logic, a theory is a collection of symbols and a collection of rules. A theory will define certain arrangements of symbols to be formulas (or sentences or terms). There will also be a notion of "theoremhood" defined by the rules. Typically the rules will have the form "if these arrangements of symbols are theorems, then this arrangement of symbols is a theorem". You could call these rules "axioms", though in this context usually only rules with no premises, i.e. that simply baldly state "this arrangement of symbols is a theorem" with no conditions, are called "axioms". However, the "axiom of symmetry", say, corresponds more closely to a rule than an axiom in this stricter sense. A theory (and the logic in which it is formulated) thus gives rise to a language.

Of course, we usually want to talk about circles and torii and other mathematical objects that we don't (usually...) think of a "arrangements of symbols". To connect a theory to some mathematical objects we use a semantics which is an assignation of mathematical objects to the arrangements of symbols in a consistent manner (usually satisfying some conditions dependent on the logic in which the theory is formulated) such that the rules are satisfied. If the rules aren't satisfied, then the assignment is not a semantics for the theory.

So Tao is (implicitly) specifying a theory of equality and these exercises are asking you to show that particular interpretations (i.e. assignments) are semantics for that theory.

So how does this relate to set theory as the "foundation" of mathematics or axioms as "self evident truths"? As far as "foundations" are concerned the situation is that mathematicians for the most part have agreed to (pretend to) work within the language of Zermelo-Fraenkel (ZF) set theory which is a theory in first-order logic. There is no question of "true" or "false" in this scenario. We simply have some formulas that are called theorems and rules for making more theorems. The rules/axioms simply define what a "theorem" is. The axioms in the stricter sense are then the starting points for deriving theorems as Wolfram said. However, we can talk about semantics for ZF set theory and to show that an assignment is a semantics we would indeed be obliged to prove the "axiom of pairing" and the "axiom of infinity" and all the other axioms of ZF set theory hold for our interpretation. This is something that is done in set theory and logic.

Pragmatically, as I stated in the first paragraph, you should just interpret "axiom" in this and most cases as "condition that needs to hold to meet the definition". The rest of this answer was more explaining how this usage of the word "axiom" is, in fact, more or less consistent with the usage in e.g. "axioms of set theory" or "axioms of geometry".

Solution 4:

I've personally come to think of axioms as little components of a big definition. For example, the axioms at the start of the Elements define what we mean by "Euclidean geometry"; the Peano axioms serve to define what we mean by "natural number", and so forth.

This gives a somewhat cleaner style than a definition that goes on for a paragraph or two with lots of conjunctions. And it serves to make the parts distinct, so that we can study the effects of maybe swapping out or changing one of them and keeping the rest the same.

So Tao is listing all the sub-parts in the definition of what we mean by "equality". What equality means is assumed in advance by these axioms. Now the question is: Does a proposed new relation really count as equality, or not? That needs to be established by proving it meets all of the criteria, that is, all of the component axioms of the definition.

Solution 5:

So he gives the exercise to prove the axioms of equality for sets. Why does one has to prove axioms? Or, put differently: if one can prove these things, why does he state them as axioms? $ \def\imp{\Rightarrow} \def\eq{\Leftrightarrow} $

He does not prove the axiom as stated. The axiom claims equality between two sets iff they have exactly the same members. This equality symbol "$=$" is the symbol in the foundational system itself, and there is no way you will be able to prove an axiom of the foundational system if it is independent of the other axioms. In fact, the equality symbol is part of first-order logic itself. So what exactly does Terence Tao mean?

He wrote:

One can easily verify that this notion of equality is reflexive, symmetric, and transitive (Exercise 3.1.1). Observe that if $x∈A$ and $A=B$, then $x∈B$, by Definition 3.1.4. Thus the "is an element of" relation $∈$ obeys the axiom of substitution.

This is logically not precise unless he is working in first-order logic without equality. It would be clearer to define the binary relation $\sim$ such that $A \sim B$ iff $\forall x\ ( x \in A \eq x \in B )$. Then it makes sense to ask whether $\sim$ is an equivalence relation or not, and whether it obeys substitution. It is trivial to prove that it is indeed symmetric, reflexive and transitive. In a formal system, we should think of the notion that a relation $\sim$ obeys substitution as meaning that $x \sim y$ implies that $P(x) \eq P(y)$ for any $1$-input sentence $P$. It turns out that in a first-order language we do not need to check for every $1$-input sentence, because it suffices to check for each of the non-logical symbols of the language. Since the language of set theory only has one non-logical symbol "$\in$", the precise notion of "obeys substitution" in set theory is hence:

$\forall A,B\ ( A \sim B \imp \forall C\ ( C \in A \eq C \in B ) \land \forall C\ ( A \in C \eq B \in C ) )$.

Observe that the first half of the claim is trivially true by definition of $\sim$. I don't see the second half in the quotes of Terence Tao that you have in your question. If I'm not mistaken, it is not provable, in which case Terence didn't really prove full substitutivity.