Why do we not have to prove definitions?

I am a beginning-level math student, and I recently read (in a book written by someone with a Ph.D. in Mathematics Education) that mathematical definitions do not get "proven"; that is, they can't be proven. Why not? It seems like some definitions should have a foundation based on proof. How simple (or intuitive) does something have to be to become a definition? I mean to ask this and get a clear answer. Hopefully this is not an opinion-based question, and if it is, will someone please provide the answer: "opinion-based question."


Solution 1:

I'd like to take a somewhat broader view, because I suspect your question is based on a very common problem among people who are starting to do "rigorous" or "theorem-proof" mathematics. The problem is that they often fail to fully recognize that, when a mathematical term is defined, its meaning is given exclusively by the definition. Any meaning the word has in ordinary English is totally irrelevant. For example, if I were to define "A number is called teensy if and only if it is greater than a million", this would conflict with what English-speakers and dictionaries think "teensy" means, but, as long as I'm doing mathematics on the basis of my definition, the opinions of all English-speakers and dictionaries are irrelevant. "Teensy" means exactly what the definition says.

If the word "teensy" already had a mathematical meaning (for example, if you had already given a different definition), then there would be a question whether my definition agrees with yours. That would be something susceptible to proof or disproof. (And, while the question is being discussed, we should use different words instead of using "teensy" with two possibly different meanings; mathematicians would often use "Zduff-teensy" and "Blass-teensy" in such a situation.)

But if, as is usually the case, a word has only one mathematical definition, then, there is nothing that could be mathematically proved or disproved about the definition. If my definition of "teensy" is the only mathematical one (which I suspect is the case), and if someone asked "Does 'teensy' really mean 'greater than a million'?" then the only possible answer would be "Yes, by definition." A long discussion of the essence of teensiness would add no mathematically relevant information. (It might show that the discussants harbor some meaning of "teensy" other than the definition. If so, they should get rid of that idea.)

(I should add that mathematicians don't usually give definitions that conflict so violently with the ordinary meanings of words. I used a particularly bad-looking example to emphasize the complete irrelevance of the ordinary meanings.)
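To make the stipulative nature of definitions concrete in code (a hypothetical sketch; the function name and threshold simply restate the "teensy" definition above), a mathematical definition behaves like a function whose name carries no meaning beyond the body we give it:

```python
# Hypothetical illustration: a definition is a stipulation, not a claim.
# The name "is_teensy" means exactly what the body says, nothing more.
def is_teensy(n: int) -> bool:
    """A number is teensy if and only if it is greater than a million."""
    return n > 1_000_000

# Asking "does 'teensy' really mean 'greater than a million'?" is like
# asking whether this function really has this body: yes, by definition.
print(is_teensy(2_000_000))  # True
print(is_teensy(5))          # False
```

There is nothing to prove about the function itself; proofs only enter once you state claims *using* it, such as "every teensy number is positive."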

Solution 2:

The other answers did not explain the background of logic that is the key to understanding this issue. In any formal system where we write proofs, we have to use some formal language that specifies the valid syntax of sentences, and we must follow some formal rules that specify which sentences we can write down in which contexts. In mathematics we usually use classical first-order logic, which consists of both the language of first-order logic and classical inference rules. This language is sufficient, but it would be extremely cumbersome if we were not allowed to make any definitions.

For example, if we are working in Peano Arithmetic where the only objects are natural numbers, then if we want to prove that an odd number multiplied by an odd number is odd, we effectively have to prove: $\def\imp{\Rightarrow}$

$\forall m \forall n ( \exists a ( m = 2a+1 ) \land \exists b ( n = 2b+1 ) \imp \exists c ( mn = 2c+1 ) )$.

Now certainly we can do this and completely avoid defining "odd", but as the theorems grow in complexity (and this example is an incredibly trivial theorem) it would become practically impossible to do without definitions.

What is a definition, then? In first-order logic it can be understood as simply a shorthand for some expression.

Continuing the above example, if for any expression $E$ we define "$odd(E)$" to mean "$\exists x ( E = 2x+1 )$" where "$x$" is a variable not used in "$E$", then we can rewrite the theorem as:

$\forall m \forall n ( odd(m) \land odd(n) \imp odd(mn) )$.

See? Much shorter and clearer.
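For completeness, the proof behind this theorem amounts to producing an explicit witness for the existential quantifier in the definition of $odd$:

```latex
% Given m = 2a+1 and n = 2b+1, expand the product:
\begin{aligned}
mn &= (2a+1)(2b+1)\\
   &= 4ab + 2a + 2b + 1\\
   &= 2(2ab + a + b) + 1.
\end{aligned}
% Hence c = 2ab + a + b witnesses \exists c\,( mn = 2c+1 ), i.e. odd(mn).
```

Note that nothing here proves the *definition* of odd; the definition is merely unfolded, and the proof obligation is the algebra above.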

Solution 3:

In a definition, there is nothing to prove because the general form of a definition is:

An object $X$ is called [name] provided [conditions hold].

The reason that there is nothing to prove is that, before the definition, [name] is undefined (so it has no content). The [conditions] are like a checklist of properties. If $X$ satisfies every property on the checklist, then $X$ is a [name].

The reason that a definition can't be proven is that it isn't a mathematical statement. There are no if-then statements in a definition; a definition is merely a list of conditions, and if all the conditions are true then $X$ is a [name]. Since [name] had no meaning before the definition, there is nothing against which you could check that [name] means the same as the conditions.
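The checklist picture can be sketched in code (hypothetical names: "special" is an invented term standing in for [name], and the two conditions stand in for [conditions]):

```python
# Hypothetical example of the pattern "X is called [name] provided [conditions]".
# Before this definition, "special" has no mathematical meaning; afterwards,
# it means exactly this checklist and nothing more.
def is_special(n: int) -> bool:
    conditions = [
        n > 0,       # condition 1: n is positive
        n % 2 == 0,  # condition 2: n is even
    ]
    # X is a "special" number precisely when every condition holds.
    return all(conditions)

print(is_special(4))   # True: both conditions hold
print(is_special(3))   # False: fails condition 2
print(is_special(-2))  # False: fails condition 1
```

There is nothing here to prove or disprove; the definition only becomes useful once theorems are stated about "special" numbers.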