Why are the solutions of polynomial equations so unconstrained over the quaternions?

An $n$th-degree polynomial has at most $n$ distinct zeroes in the complex numbers. But it may have an uncountable set of zeroes in the quaternions. For example, $x^2+1$ has two zeroes in $\mathbb C$, but in $\mathbb H$, ${\bf i}\cos x + {\bf j}\sin x$ is a distinct zero of this polynomial for every $x$ in $[0, 2\pi)$, and obviously there are many other zeroes.

What is it about $\mathbb H$ that makes its behavior in this regard to be so different from the behavior of $\mathbb R$ and $\mathbb C$? Is it simply because $\mathbb H$ is four-dimensional rather than two-dimensional? Are there any theorems that say when a ring will behave like $\mathbb H$ and when it will behave like $\mathbb C$?

Do all polynomials behave like this in $\mathbb H$? Or is this one unusual?

When I was first learning abstract algebra, the professor gave the usual sequence of results for polynomials over a field: the Division Algorithm, the Remainder Theorem, and the Factor Theorem, followed by the Corollary that if $D$ is an integral domain, and $E$ is any integral domain that contains $D$, then a polynomial of degree $n$ with coefficients in $D$ has at most $n$ distinct roots in $E$.

He then challenged us, as a homework, to go over the proof of the Factor Theorem and to point out exactly which, where, and how the axioms of a field used in the proof.

Every single one of us missed the fact that commutativity is used.

Here's the issue: the division algorithm (on either side), does hold in $\mathbb{H}[x]$ (in fact, over any ring, commutative or not, in which the leading coefficient of the divisor is a unit). So given a polynomial $p(x)$ with coefficients in $\mathbb{H}$, and a nonzero $a(x)\in\mathbb{H}[x]$, there exist unique $q(x)$ and $r(x)$ in $\mathbb{H}[x]$ such that $p(x) = q(x)a(x) + r(x)$, and $r(x)=0$ or $\deg(r)\lt\deg(a)$. (There also exist unique $q'(x)$ and $s(x)$ such that $p(x) = a(x)q'(x) + s(x)$ and $s(x)=0$ or $\deg(s)\lt\deg(a)$.

The usual argument runs as follows: given $a\in\mathbb{H}$ and $p(x)$, divide $p(x)$ by $x-a$ to get $p(x) = q(x)(x-a) + r$, with $r$ constant. Evaluating at $a$ we get $p(a) = q(a)(a-a)+r = r$, so $r=p(a)$. Hence $a$ is a root if and only if $(x-a)$ divides $p(x)$.

If $b$ is a root of $p(x)$, $b\neq a$, then evaluating at $b$ we have $0=p(b) = q(b)(b-a)$; since $b-a\neq 0$, then $q(b)=0$, so $b$ must be a root of $q$; since $\deg(q)=\deg(p)-1$, an inductive hypothesis tells us that $q(x)$ has at most $\deg(p)-1$ distinct roots, so $p$ has at most $\deg(p)$ roots.

And that is where we are using commutativity: to go from $p(x) = q(x)(x-a)$ to $p(b) = q(b)(b-a)$.

Let $R$ be a ring, and let $a\in R$. Then $a$ induces a set-theoretic map from $R[x]$ to $R$, "evaluation at $a$", $\varepsilon_a\colon R[x]\to R$ by evaluation: $$\varepsilon_a(b_0+b_1x+\cdots + b_nx^n) = b_0 + b_1a + \cdots + b_na^n.$$ This map is a group homomorphism, and if $a$ is central, also a ring homomorphism; if $a$ is not central, then it is not a ring homomorphism: given $b\in R$ such that $ab\neq ba$, then we have $bx = xb$ in $R[x]$, but $\varepsilon_a(x)\varepsilon_a(b) = ab\neq ba = \varepsilon_a(xb)$.

The "evaluation" map also induces a set theoretic map from $R[x]$ to $R^R$, the ring of all $R$-valued functions in $R$, with the pointwise addition and multiplication ($(f+g)(a) = f(a)+g(a)$, $(fg)(a) = f(a)g(a)$); the map sends $p(x)$ to the function $\mathfrak{p}\colon R\to R$ given by $\mathfrak{p}(a) = \varepsilon_a(p(x))$. This map is a group homomorphism, but it is not a ring homomorphism unless $R$ is commutative.

This means that from $p(x) = q(x)(x-a) + r(x)$ we cannot in general conclude that $p(c) = q(c)(c-a) +r(c)$ unless $c$ commutes in $R$ with $a$. ~~So the Remainder Theorem may fail to hold (if the coefficients involved do not commute with $a$ in $R$), which in turn means that the Factor Theorem may fail to hold~~ So one has to be careful in the statements (see Marc van Leeuwen's answer). And even when both of them hold for the particular $a$ in question, the inductive argument will fail if $b$ does not commute with $a$, because we cannot go from $p(x) = q(x)(x-a)$ to $p(b)=q(b)(b-a)$.

This is exactly what happens with, say, $p(x) = x^2+1$ in $\mathbb{H}[x]$. We are fine as far as showing that, say, $x-i$ is a factor of $p(x)$, because it so happens that when we divide by $x-i$, all coefficients involved centralize $i$ (we just get $(x+i)(x-i)$). But when we try to argue that any root different from $i$ must be a root of $x+i$, we run into the problem that we cannot guarantee that $b^2+1$ equals $(b+i)(b-i)$ unless we know that $b$ centralizes $i$. As it happens, the centralizer of $i$ in $\mathbb{H}$ is $\mathbb{R}[i]$, so we only conclude that the only other complex root is $-i$. But this leaves the possibility open that there may be some roots of $x^2+1$ that do not centralize $i$, and that is exactly what occurs: $j$, and $k$, and all numbers of the form $ai+bj+ck$ with $a^2+b^2+c^2=1$ are roots, and if either $b$ or $c$ are nonzero, then they don't centralize $i$, so we cannot go from $x^2+1 = (x+i)(x-i)$ to "$(ai+bj+ck)^2+1 = (ai+bj+ck+i)(ai+bj+ck-i)$".

And that is what goes wrong, and there is where commutativity is hiding.

The finiteness of the number of roots of a polynomial $f(x)\in K[x]$ where $K$ is a field depends on two interlaced facts:

$K[x]$ is a Unique Factorization Domain: every polynomial $f(x)$ factors in an essentially unique way as a product of irreducibles;
if $f(\alpha)=0$ then $f(x)=(x-\alpha)g(x)$ where $\deg g(x)=(\deg f(x))-1$.

The combination of these two facts (the first one in particular) does not hold anymore if you think the polynomial $f(x)$ as a polynomial with coefficients in the ring $\Bbb H$ of Hamilton quaternions. This is because the latter is not commutative.

You may also ponder on this fact: in a commutative environment the transformation $a\mapsto\phi_h(a)=hah^{-1}$ (conjugation) is always trivial. Not so in $\Bbb H$, again as a side effect of non-commutativity. The point is that if an element $a$ satisfies a certain algebraic relation with real coefficient (such as $a^2=1$), so will all its conjugates $\phi_h(a)$.

Why are the solutions of polynomial equations so unconstrained over the quaternions?

Related

Recent Posts