Why “characteristic zero” and not “infinite characteristic”?

The characteristic of a ring (with unity, say) is the smallest positive number $n$ such that $$\underbrace{1 + 1 + \cdots + 1}_{n \text{ times}} = 0,$$ provided such an $n$ exists. Otherwise, we define it to be $0$.

But why characteristic zero? Why do we not define it to be $\infty$ instead? Under this alternative definition, the characteristic of a ring is simply the “order” of the additive cyclic group generated by the unit element $1$.

My feeling is that there is a precise and convincing explanation for the common convention, but none comes to mind. I couldn't find the answer in the Wikipedia article either.

Solution 1:

There are two orderings of the set $\mathbb N = \{0,1,\dots\}$:

magnitude $a \leq b$
divisibility $a\mid b$ (i.e. $\exists c. b = a c$)

They are mostly compatible: whenever $a \mid b$ and $b \neq 0$, we have $a \leq b$. The only exception is $b = 0$, since every $a$ divides $0$.

Some definitions are phrased using the magnitude ("greater than") ordering, even though the divisibility ordering is what really matters.

For example, the greatest common divisor of $a$ and $b$ might be defined as the greatest number that is a common divisor of both $a$ and $b$. The characteristic of a ring $R$ might be defined as the smallest number $n>0$ that satisfies $n \cdot 1 = 0$.

Under such commonly taught definitions, it seems natural that $\operatorname{gcd}(0,0)=\infty$ and $\operatorname{char} \mathbb Z = \infty$.

However, those definitions implicitly rely on ideals, and are better phrased using the divisibility order. The incompatibility is then more visible: $0$ is the largest element in the divisibility order, while it is the smallest in the magnitude order. The magnitude order has no largest element, and $\infty$ is often added to cover this case.

So let's formulate the definitions again, but this time using divisibility ordering.

The greatest common divisor of two numbers $a,b$ is the greatest number (in the sense of $\mid$) that is a divisor of both $a$ and $b$ (i.e. is smaller than $a$ and $b$ in the divisibility ordering). This is prettier - $\operatorname{gcd}$ is now the $\wedge$ operator in the lattice $(\mathbb N, \mid)$; it also makes $\mathbb N$ a monoid, with $0$ as identity element, since $\operatorname{gcd}(a,0)=a$. Additionally, the definition can be adapted to any ring.
The characteristic of a ring $R$ is the smallest number $n$ (in the sense of $\mid$) that satisfies $n \cdot 1 =0$. As a bonus, compared to the previous definition, we can remove the $n>0$ restriction: zero is always a valid "annihilator" but it is often not the smallest one. Now we get $\operatorname{char} \mathbb Z = 0$.
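To see the divisibility-based definition of $\operatorname{gcd}$ in action, here is a minimal Python sketch. The helper names `divides` and `gcd_by_divisibility`, and the finite search `bound`, are illustrative assumptions rather than anything from the answer; the brute-force search only makes sense when `bound` exceeds both inputs.

```python
from math import gcd

def divides(a, b):
    """a | b on the naturals: there is some c with b == a * c."""
    if a == 0:
        return b == 0          # 0 divides only 0
    return b % a == 0

def gcd_by_divisibility(a, b, bound):
    """Greatest common divisor in the sense of '|', searched in range(bound).

    Returns the common divisor d such that every other common divisor
    divides d. Assumes bound > max(a, b).
    """
    commons = [d for d in range(bound) if divides(d, a) and divides(d, b)]
    for d in commons:
        if all(divides(e, d) for e in commons):
            return d
    raise ValueError("no greatest common divisor found below bound")

print(gcd_by_divisibility(12, 18, 50))  # 6
print(gcd_by_divisibility(0, 0, 50))    # 0, not "infinity"
```

With this ordering no special case is needed for $\operatorname{gcd}(0,0)$: every number divides $0$, so $0$ is the greatest common divisor. Python's built-in `math.gcd` follows the same convention.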

Characteristic is a "multiplicative" notion, like gcd. If you have a (unital) homomorphism of rings $f: A \to B$, then we must have $\operatorname{char} B \mid \operatorname{char} A$, because $(\operatorname{char} A) \cdot 1_B = f((\operatorname{char} A) \cdot 1_A) = 0$. For example, there is no ring homomorphism ${\mathbb Z}_2 \to {\mathbb Z}_4$, since $4 \nmid 2$ - in a sense, ${\mathbb Z}_2$ is "smaller" than ${\mathbb Z}_4$. "Bigger" rings have "more divisible" characteristic: their characteristics are greater in the sense of divisibility. And the "most divisible" number is $0$. Another example is $\operatorname{char}(A \times B) = \operatorname{lcm}(\operatorname{char} A, \operatorname{char} B)$.
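The product formula can be checked by brute force for rings of the form $\mathbb Z_m \times \mathbb Z_n$. The sketch below is only an illustration under stated assumptions: modulus `0` is used to stand for $\mathbb Z$ itself, `char_of_product` and `search_limit` are hypothetical names, and "no annihilator found below `search_limit`" is taken to mean that none exists.

```python
from math import gcd

def lcm(a, b):
    """Least common multiple on the naturals, with lcm(a, 0) == 0."""
    if a == 0 or b == 0:
        return 0
    return a * b // gcd(a, b)

def char_of_product(m, n, search_limit=10_000):
    """Characteristic of Z/m x Z/n, where modulus 0 stands for Z.

    Looks for the least positive k with k * (1, 1) == (0, 0) and returns 0
    if no such k is found below search_limit (assumption: none exists).
    """
    def is_zero(k, modulus):
        # k * 1 == 0 in Z/modulus; in Z (modulus 0) only k == 0 qualifies.
        return k == 0 if modulus == 0 else k % modulus == 0

    for k in range(1, search_limit):
        if is_zero(k, m) and is_zero(k, n):
            return k
    return 0

print(char_of_product(4, 6), lcm(4, 6))  # 12 12
print(char_of_product(0, 5), lcm(0, 5))  # 0 0
```

With the convention $\operatorname{char} \mathbb Z = 0$, the identity holds with the usual rule $\operatorname{lcm}(a, 0) = 0$; with $\infty$ one would have to add a separate rule for $\operatorname{lcm}(\infty, n)$.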

In a bit more abstract language: given any ideal $I \subseteq \mathbb Z$, we associate to it the smallest nonnegative element, under the divisibility order. By properties of $\mathbb Z$, every other element of $I$ is a multiple of it. Let's call this number $\operatorname{min}(I)$.

We can now define $\operatorname{gcd}(a,b)=\operatorname{min} ((a) + (b))$, and $\operatorname{char} R = \min (\ker f)$, where $f \colon \mathbb Z \to R$ is the canonical map.

The definition of $\operatorname{min}(I)$ works for any PID (where it is determined up to multiplication by a unit); it does not require a magnitude order. In any PID, $I = (\operatorname{min}(I))$.

(I dislike saying the ideal $\{0\}$ is "generated" by $0$; although this is true, it is also generated by the empty set. We do not say that $(2)$ is generated by $0$ and $2$.)

Solution 2:

Given a ring $R$ there is a unique ring homomorphism $\varphi:\mathbb Z\to R$. The characteristic of $R$ is the (canonical, non-negative) generator of $\ker \varphi$.

Solution 3:

Consider the following statement:

Let $n\geq 0$. The characteristic of $R$ is $n$ if and only if ($ka=0$ for all $a\in R$ $\iff$ $n|k$).

The statement holds for positive characteristic, but it also holds for characteristic $0$, since $0$ is the only multiple of $0$. This would not hold for any ring if we defined the characteristic to be $\infty$. This definition also makes sense for rings without $1$.
For rings with unity, the definition follows as indicated by lhf: the characteristic of $R$ is the nonnegative generator of the kernel of the canonical map from $\mathbb{Z}$ to $R$.
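As a sanity check, the statement can be tested numerically for the rings $\mathbb Z_n$ (with $n = 0$ standing for $\mathbb Z$). This is a rough sketch, not a proof: the function names are made up for illustration, only $k$ in a finite range is tested, and for $\mathbb Z$ a finite sample of elements is used (enough here because the sample contains $1$).

```python
def annihilates(k, n, sample):
    """True if k * a == 0 in Z/n for every a in sample (n == 0 means Z)."""
    if n == 0:
        return all(k * a == 0 for a in sample)
    return all((k * a) % n == 0 for a in sample)

def divides(n, k):
    """n | k in the integers; 0 divides only 0."""
    return k == 0 if n == 0 else k % n == 0

def statement_holds(n, k_range=range(-30, 31)):
    """Check 'k annihilates the ring <=> n | k' for every k in k_range."""
    # For Z/n the residues 0..n-1 are all elements; for Z we sample a few.
    sample = range(n) if n > 0 else range(-5, 6)
    return all(annihilates(k, n, sample) == divides(n, k) for k in k_range)

print(all(statement_holds(n) for n in range(13)))  # True
```

The case `n == 0` passes precisely because $0$ divides only $0$, which matches the fact that only $k = 0$ annihilates $\mathbb Z$.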

Solution 4:

Recall that an R-algebra is a ring A containing a central image of the ring R. This image is $\,\cong$ R/I so it is characterized by the kernel I. For example, if R = $\mathbb Z$ then an R-algebra is simply a ring A, and the kernel $\rm\ I = (n)\ $ characterizes the canonical image of $\mathbb Z$ in A, via $\rm 1\mapsto 1_A.\,$ Therefore we say that A has characteristic n because n characterizes the canonical image of $\:\mathbb Z\:$ in A.

Remark $\ $ For more general notions of "characteristic rings", see the following paper.

W.D. Burgess; P.N. Stewart. The characteristic ring and the "best" way to adjoin a one.
J. Austral. Math. Soc. 47 (1989) 483-496.

Solution 5:

This is all just a convention, and I'm adding an answer 8 years late because nobody else has pointed out yet that many years ago some people did use the term "characteristic $\infty$". Schilling does this in his book "Theory of Valuations" (AMS, 1950). Either way, you get used to it.