What does a "convention" mean in mathematics?

We all know that $0!=1$, that the degree of the zero polynomial equals $-\infty$, that the interval $[a,a)=(a,a]=(a,a)=\emptyset$, and so on, are conventions in mathematics. So is a convention something that we can't prove with mathematical logic, or is it just intuition, or something that mathematicians agree on? Are conventions the same as axioms? What does "convention" mean in mathematics? And is $i^2 = -1$ a convention? If not, how can we prove the existence of such a number?


Solution 1:

To answer the question in the title, I would say: 'convention' in mathematics means exactly the same as in ordinary English.

As for your examples: $0!:=1$ and $[a,a):=\emptyset$ are definitions. The convention lies in adopting these definitions rather than different ones, and rather than leaving the expressions undefined. Of course in this sense, every definition is a convention.

I think that informally, one says a certain definition (such as the two above) is '(just) convention' to mean that it covers an 'extreme' or 'degenerate' case: leaving it undefined would still let the theory go through, but it is more convenient to define it anyway (for example, to avoid having to exclude the extreme case in the statements of theorems). For example, I think you could get by without defining $[a,a)$, or $[a,b]$ for $b<a$, but then in statements (and proofs) about general intervals $[a,b)$ you would be forced to explicitly state and check that $b>a$, which could be tiresome.
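To illustrate the point about degenerate cases, here is a minimal Python sketch. The helper names `binomial` and `half_open` are my own, and modelling $[a,b)$ by the integers in `range(a, b)` is only a stand-in for real intervals. It shows that with $0!=1$ and $[a,a)=\emptyset$ the general formulas need no special cases.

```python
import math

def binomial(n, k):
    # n! / (k! (n-k)!) with no special-casing of k == 0 or k == n;
    # this only works at the endpoints because factorial(0) is defined as 1.
    return math.factorial(n) // (math.factorial(k) * math.factorial(n - k))

def half_open(a, b):
    # Integer stand-in for the interval [a, b): empty whenever b <= a.
    return list(range(a, b))

def split_at(a, c, b):
    # [a, b) as the concatenation of [a, c) and [c, b), for a <= c <= b.
    return half_open(a, c) + half_open(c, b)
```

Here `binomial(5, 0)` and `binomial(5, 5)` go through the same formula as every other case, and `split_at(1, 1, 4)` agrees with `half_open(1, 4)` even though the first piece is the degenerate interval $[1,1)$.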

Solution 2:

A convention is a choice made because it is convenient--or at least, less inconvenient than the alternative(s).

For an example of a definition of convenience, let me answer your question regarding $i$ (from comments and original post) more explicitly. In particular, we may define the complex plane and complex arithmetic as follows: Let $\Bbb C:=\Bbb R^2$ with componentwise addition--that is, $$\langle a,b\rangle\oplus\langle c,d\rangle:=\langle a+c,b+d\rangle,$$ with "$+$" being real addition--and with multiplication defined as $$\langle a,b\rangle\odot\langle c,d\rangle:=\langle a\cdot c-b\cdot d,a\cdot d+b\cdot c\rangle,$$ with "$\cdot$" being real multiplication and "$-$" being real subtraction.

Now, we can show that these are well-defined operations, and it is readily verified that $\bigl\{\langle a,0\rangle:a\in\Bbb R\bigr\}$ under $\oplus$ and $\odot$ behaves exactly like $\Bbb R$ under $+$ and $\cdot$. Treating $\Bbb C$ as a real vector space in two dimensions (which it is, as I alluded to in my comment above), the standard ordered basis for $\Bbb R^2$ is $\bigl\{\langle 1,0\rangle,\langle 0,1\rangle\bigr\}$. The former acts as the $\odot$-multiplicative identity on all of $\Bbb C$, and we see also that $\langle 0,1\rangle\odot\langle 0,1\rangle=\langle 0\cdot 0-1\cdot 1,\,0\cdot 1+1\cdot 0\rangle=\langle -1,0\rangle$. Identifying those $\Bbb C$-pairs of the form $\langle a,0\rangle$ with their real counterparts $a$, and defining $$i:=\langle 0,1\rangle,$$ we find that every complex number can be uniquely expressed in the form $x+iy$ (with $x,y\in\Bbb R$), and that $i^2=-1$. (The other properties of $\Bbb C$ can also be deduced, but that's another story.)
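If it helps to see the construction run, here is a minimal Python sketch of the pair arithmetic defined above. The function names `add` and `mul` and the use of plain tuples are my own choices, and integer components are used only to keep the comparisons exact.

```python
def add(z, w):
    # Componentwise addition: <a,b> (+) <c,d> := <a+c, b+d>
    a, b = z
    c, d = w
    return (a + c, b + d)

def mul(z, w):
    # <a,b> (.) <c,d> := <a*c - b*d, a*d + b*c>
    a, b = z
    c, d = w
    return (a * c - b * d, a * d + b * c)

ONE = (1, 0)  # first standard basis vector, the multiplicative identity
I = (0, 1)    # second standard basis vector, the element we call i

# The pair <x, y> is x + i*y once <a, 0> is identified with the real a.
def from_parts(x, y):
    return add((x, 0), mul(I, (y, 0)))
```

With these definitions, `mul(I, I)` evaluates to `(-1, 0)`, which is the pair identified with the real number $-1$, and `from_parts(x, y)` returns `(x, y)`.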

Note that we didn't prove that such a number as $i$ "exists"; we simply defined a structure and specified an element of that structure, which happens to have the properties that we ascribe to $i$. We could just as easily have defined $$i:=\langle 0,-1\rangle,$$ without losing any of those properties, and really only chose the definition we did because it accords with the standard ordered basis for $\Bbb R^2$.
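As a quick check of that last claim, here is a small Python sketch. The names `mul`, `conj` and `ALT_I` are mine, and integer pairs are used so the comparisons are exact. It verifies that $\langle 0,-1\rangle$ also squares to $\langle -1,0\rangle$, and that swapping the two candidates (that is, conjugating) respects the multiplication on a sample pair.

```python
def mul(z, w):
    # <a,b> (.) <c,d> := <a*c - b*d, a*d + b*c>
    a, b = z
    c, d = w
    return (a * c - b * d, a * d + b * c)

def conj(z):
    # Swaps the roles of <0,1> and <0,-1> while fixing every <a,0>.
    a, b = z
    return (a, -b)

ALT_I = (0, -1)  # the alternative choice for i

# Sample check that conjugation is compatible with multiplication.
z, w = (1, 2), (3, 4)
compatible = conj(mul(z, w)) == mul(conj(z), conj(w))
```

A check on one sample pair is of course not a proof, but the same identity can be verified symbolically from the definition of $\odot$.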