When is a proof or definition formal?

When is a proof or definition formal?

I've been searching for an explanation of when or proof or definition is formal.

Sometimes, authors call their proofs or definitions informal without further explanation.

I've an idea that a formal proof is derived directly from the axioms (with references to already proven statements), or are the requirements less strict?

When is a definition formal? May words whose meaning follow only from the context be used in a formal definition, or should each use of a word be explicitly defined?


Solution 1:

It's not a very clear-cut line. There's also a complication, because mathematicians often use certain styles of language in a formal proof, making the proof sound more formal-in-the-colloquial-English-sense. They do this to signal "this is a formal-in-your-sense proof". The converse is also true.

Here's an informal intensional definition:

A formal proof is basically one where nothing is hidden.

(The definition of "nothing" is flexible, depending on the level of the proof. A formal proof aimed at world-class mathematicians may miss out more details than a first-year undergraduate formal proof, which is not allowed to miss out many details at all.)

One may use words whose meaning follows only from context, as long as it's clear what the meaning is; that's not hiding anything. A proof can skip out details and still be formal, as long as it tells you how to complete the details: that's hiding stuff but making it clear that the stuff is a) hidden, and b) easy to recover.

Here's an informal extensional definition.

A proof which leaves large chunks to the reader is informal: it hides large amounts of detail. A definition which misses out some annoying special cases (perhaps to make the statement more slick) is informal. A proof which pedantically goes over every statement is formal. A proof in which every statement clearly follows from earlier statements or from axioms/hypotheses is formal.

Solution 2:

The problem here is that the answer depends on whether "formal" is formally or informally defined:)

If it is formally defined it would probably be defined as a mathematical term and the requirement of the proofs and definition would be quite strict. They would probably need to be formulated in a formal language adhering to certain rules. Normally one does not see these kind of definitions and proofs.

Otherwise the term "formal" would have a more relaxed meaning and one would not require a formal definition or proof to adhere to some very strictly defined standards. In this case one would mean that formal means more strict formulation than informal (which is more sloppy formulation) - where you draw the line is depending on context.

Also which one is used can also be seen in context. If you don't see any formal language (note however that a formal language can look natural) or formal definition of "formal" it would most probably be used in the second meaning.

Solution 3:

There is actually a whole spectrum of "formality" in mathematics. In informal terms, "formal" it refers to what is considered as rigorous, but that is of course subjective.

  • Absolutely formal: Written in a language that can be verified by a program that implements some formal system. Check out Mizar and Coq, two well-known proof assistants based on two different foundational systems for mathematics.

  • Extremely formal: Written in a manner such that any logician is capable of translating it by hand into an absolutely formal version. For example, "$\forall x \in S\ ( P(x) )$" where "$P$" is a predicate can be written as a short-form for "$\forall x\ ( x \in S \to P(x) )$", where the latter is permitted in pure first-order set theory.

  • Obviously formal: Written in a manner such that all logicians will agree can be translated mechanically into an absolutely formal version. It may well be impractically troublesome to actually do it, but nobody doubts that it can be done. For example, consideration of symmetry such as in "Without loss of generality $a < b < c$, ..." can be done whenever all the previously deduced statements are symmetric under permutation of $a,b,c$ and $a,b,c$ come from a total order. The easiest way to absolutely formalize it would be to simply repeat the subsequent argument for each permutation, but it is prohibitive especially if there are a lot of symmetry considerations!

  • Provably formal: Written in a manner such that any logician can prove that it can be translated mechanically into an absolutely formal version. This is where it starts to get hairy, because one logician might use some meta-system MS to prove this, but another skeptical logician might reject some inference rule or axiom of MS that the first logician used. So the first logician might insist that an absolutely formal proof exists, but the second might insist that it does not exist. For example, ZF set theory proves that the Hydra game terminates. In particular the following play terminates: At step $n$, if the tree is just a single node then the game ends, otherwise we remove the right-most leaf node and then if its parent is not the root we duplicate the resulting parent subtree $n$ times. This holds even if the initial tree has depth $100$, and ZF additionally can prove that a weaker formal system called PA can prove it. But the shortest proof of this in PA will have more deductive steps than the number of particles in the observable universe! So a logician who rejects ZF but accepts PA might not believe that an absolutely formal proof exists...

  • Reasonably formal: Written in the standard of modern mathematics, that is, in sufficient detail to convince experts in that field of mathematics that it is rigorous and correct. Lots of steps may be skipped, and yet it may be considered perfect. For example, one may define the diameter of a finite graph with weighted edges to be the largest distance between two nodes in the graph, where distance between $x,y$ is defined as the minimum possible sum of weights along a path from $x$ to $y$. It is clear to any graph theorist that this is well-defined, even though technically one has to prove the implicit claims of existence of "minimum" and "maximum" here; it is not totally trivial!

  • Somewhat Informal: Written (or more often spoken) to fellow mathematicians in a manner that is not even reasonably formal, especially when diagrams are involved. You will find less diagrams in more formal proofs, simply because nearly all diagrams require some amount of intuitive interpretation by the intended audience to determine what the diagram is supposed to illustrate, and diagrams almost always depict a single instance and not the general case. Good diagrams will depict a non-trivial instance while bad diagrams may actually mislead viewers to make false assumptions! Also, informal mathematics often arises together with phrases like "it can be shown" and "this will be left as an exercise for the reader"...

  • Very informal: Done with vague words like "near" and "close" in analysis. This is sometimes sufficient to convince a fellow expert in the same field, but it is usually insufficient to enable mathematicians not in the same field to actually write down a formal proof without much effort. Worse still, informal proofs by non-mathematicians tend to be wrong, but often the flaws are not seen by other non-mathematicians.

Solution 4:

There is one aspect of "definitions" that is not exactly asked in your question, but is relevant to mathematics. There are actually two kinds of completely formal definitions, arising from two separate mechanisms: $ \def\nn{\mathbb{N}} $

  1. Definition by existential instantiation: When we have an axiom or a deduced sentence asserting the existence of some object, namely "$\exists x\ ( P(x) )$" where $P$ is some predicate, we can instantiate it and name a reference to it, by saying "Let $c$ be such that $P(c)$.". This is far more general than you might think. For example when you define a function $f : \nn \to \nn$ such that $f(0) = 0$ and $f(n+1) = 2f(n)+1$ for every $n \in \nn$, what you are actually doing is to invoke a theorem that says "$\exists f : \nn \to \nn\ ( f(0) = 0 \land \forall n\in\nn\ ( f(n+1) = 2f(n)+1 ) )$". Don't laugh! This is not necessarily a trivial theorem, depending on your choice of foundational system for mathematics. The bottom line is, such definitions are only valid if you can prove the existential statement that guarantees the existence of the object that you are giving a name to.

  2. Definitorial expansion: In some foundational systems, we are unable to use the above type of definition, simply because we cannot prove the required existential statement. For example in ZF set theory one can actually prove that there is no set $S$ whose members are exactly the sets that are not members of themselves. Namely, $\neg \exists S\ \forall x\ ( x \in S\ \Leftrightarrow \neg x \in x )$. But we still can define a predicate $R$ where $R(S) \overset{def}\equiv \forall x\ ( x \in S\ \Leftrightarrow \neg x \in x )$ for every set $S$. $R$ is not a set, but we can still talk about sets that satisfy $R$. For example, $R(\{\})$ because the empty-set is not a member of itself. This notion of definition can itself be formally defined as described in this post, and more details about the issue with Russel's paradox can be found in this post. This is also more common than you might think. For example in ZFC set theory, there is no function that maps every set to its cardinality, and so the cardinality notation in "$\#(S)$" or "$|S|$" cannot be defined using the first type of definition. But it can be defined via this type of definition. Similarly, when you define the diameter $diam(G)$ of every finite graph $G$, note that actually $diam$ is not a function in ZFC set theory, because there is no set of all finite graphs! But the notion of "finite graph" and $diam$ are both definable via definitorial expansion, and we can use $diam$ on any finite graph. Things get a bit messy with higher-order things like the family of functions on finite graphs. That cannot even be defined via definitorial expansion, but we can cheat in many cases. For example we can define even by mere existential instantiation the family of all functions on graphs whose vertices are a subset of $\nn$.