I have seen that $x^{2}+x+1$ and $x^{4}+x+1$ are irreducible over $\mathbb{Z}_2$ and I thought a polynomial of the form $x^{2^m}+x+1$ for $m\ge3$ would be irreducible too. However using WolframAlpha, my hunch was wrong. It could be factored over $\mathbb{Z}_2$. WolframAlpha could only generate the factors for $m=3,\ldots, 13$ and so far, I observed that for odd $m$, my polynomial is divisible by $x^{2}+x+1$.

I want to see how one can show that $x^{2^m}+x+1$ for $m\ge3$ is reducible.


Solution 1:

I will write $\mathbb F_2$ rather than $\mathbb Z_2$; here the $\mathbb F$ stands for "finite field" and the subscript is the number of elements in the finite field. This notation is convenient when one wants to consider larger finite fields, as I will in this answer.

The polynomial $x^2 + x + 1$ splits in the field $\mathbb F_4$ with $4$ elements; its two roots are the two elements of $\mathbb F_4$ that aren't in $\mathbb F_2$. Now if $\alpha \in \mathbb F_4$ then $\alpha^4 = \alpha$. Thus if $n = 2 m + 1 $ is odd, we have that $$\alpha^{2^{2m + 1}} = (\alpha^{4^m})^2 = \alpha^2,$$ and so $$\alpha^{2^{2m+1}} + \alpha + 1 = 0.$$ Consequently each of the two roots of $x^2 + x + 1$ is also a root of $x^{2^{2m+1}} + x + 1$ as well, which explains why $x^2 + x + 1$ divides $x^{2^{2m + 1}} + x + 1.$

To handle the general case of $x^{2^n} + x + 1$ (i.e. the case when $n$ is not necessarily odd) it helps to recall some Galois theory for finite fields. The key fact is that if $\alpha$ lies in any finite extension field of $\mathbb F_2$, namely some $\mathbb F_{2^f}$ for some $f \geq 1$, then the algebraic conjugates of $\alpha$ are all of the form $\alpha, \alpha^2,\alpha^4, \ldots.$ (This list will be finite, because --- assuming I chose $f$ minimally --- then $\alpha^{2^{f-1}} \neq \alpha,$ but $\alpha^{2^f} = \alpha$, and so $\alpha$ has exactly $f$ distinct conjugates over $\mathbb F_2$.)

Now suppose that $\alpha$ satisfies $$\alpha^{2^n} + \alpha + 1 = 0.$$ Adding $\alpha + 1$ to both sides (and remembering that $ 1 = - 1$ because we in characteristic $2$) we find that $$\alpha^{2^n} = \alpha+1.$$ Hence $$\alpha^{2^{2n}} = (\alpha^{2^n})^{2^n} = (\alpha + 1)^{2^n} = \alpha^{2^n} + 1 = (\alpha +1)+1 = \alpha$$ (using the facts that $(a+b)^2 = a^2 + b^2$ and $1 + 1 = 0$, which both hold in characteristic $2$).

So we see that the "$f$" for $\alpha$ is at most $2n$, i.e. $\alpha$ has at most $2n$ distinct algebraic conjugates over $\mathbb F_2$, and so its minimal polynomial over $\mathbb F_2$ has degree at most $2n$. Since $\alpha$ is a root of $x^{2^n} + x + 1$, this minimal polynomial divides $x^{2^n} + x + 1$. If $n \geq 3$, then $2n$ is strictly less than $2^n$, and so this minimal polynomial of $\alpha$ also has degree strictly less than that of $x^{2^n} + x + 1$. Thus when $n \geq 3,$ the polynomial $x^{2^n} + x + 1$ must be reducible.

Solution 2:

I would like to present a solution that is completely different from Matt E's method. The starting point is the numerical observation (which I found with PARI) that when you factor $x^{2^n} + x + 1$ in ${\mathbf F}_2[x]$ the number of irreducible factors that occur for $n = 1, 2, 3,...,11$ is $$ 1, 1, 2, 2, 4, 6, 10, 16, 30, 52, 94. $$ The striking thing about those numbers after the second one is that they are all even. The lesson here is that when you were looking at data to understand why these polynomials were usually reducible, you should not only have been looking for some "obvious" irreducible factor, but also at the number of irreducible factors. Without showing there is any one particular "obvious" irreducible factor I am going to explain why, for $n \geq 3$, $x^{2^n} + x + 1$ must have an even number of irreducible factors in ${\mathbf F}_2[x]$. That will show this polynomial is reducible in ${\mathbf F}_2[x]$, since an irreducible polynomial has only one irreducible factor.

In elementary number theory, there is a standard function which counts the parity of the number of prime factors of a positive integer. It's called the Möbius function and is defined as follows: for a positive integer $N$, $$ \mu(N) = \begin{cases} 0, & \text{ if } N \text{ has a multiple prime factor}, \\ (-1)^r, &\text{ if } N = p_1\cdots p_r \text{ where the } p_i\text{'s} \text{ are distinct primes}. \end{cases} $$ So $\mu(N) = 1$ if $N$ is a product of an even number of distinct primes (in particular, $\mu(1) = 1$ using $r = 0$) and $\mu(N) = -1$ if $N$ is a product of an odd number of distinct primes, while $\mu(N) = 0$ if there is some prime factor of $N$ appearing more than once (e.g., $\mu(12) = 0$ since 2 is a factor of 12 twice). So the Möbius function really only counts the parity of the number of prime factors of squarefree positive integers. For any prime $p$, $\mu(p) = -1$. Therefore if $\mu(N) = 1$ then $N$ is definitely not a prime number.

The catch with the Möbius function is that there's no known way to figure out what $\mu(N)$ is without essentially factoring $N$. Well, to be on the safe side, let me just say that if we know $N$ is squarefree, so $\mu(N) = \pm 1$, then figuring out whether $\mu(N)$ is $1$ or $-1$ can't be done without factoring $N$. Put simply, there is no formula for $\mu(N)$ other than its definition. (Of course, if you haven't seen the Möbius function before, then another catch is that it looks like a completely crazy function. Why is it significant? The reason is the Möbius inversion formula, in which the Möbius function plays a prominent role. One of its consequences is that the multiplicative inverse of the Riemann zeta-function $\zeta(s) = \sum_{N \geq 1} 1/N^s$ is $\sum_{N \geq 1} \mu(N)/N^s$, so the Möbius function arises naturally if you want to write $1/\zeta(s)$ as a Dirichlet series.)

There are a lot of analogies between ${\mathbf Z}$ and ${\mathbf F}_p[x]$, for prime $p$, and in particular we can define a Möbius function on monic polynomials in ${\mathbf F}_p[x]$. (Think of monic polynomials as analogous to positive integers.) For any monic $f(x)$ in ${\mathbf F}_p[x]$, set $$ \mu(f) = \begin{cases} 0, & \text{ if } f \text{ has a multiple irreducible factor}, \\ (-1)^r, & \text{ if } f = \pi_1\cdots\pi_r \text{ where the } \pi_i\text{'s} \text{ are distinct monic irreducibles}. \end{cases} $$ For example, $\mu(\pi) = -1$ when $\pi$ is irreducible, so $\mu(x+c) = -1$ for any constant $c$ in ${\mathbf F}_p$. Note that, as with the classical Möbius function, $\mu(f)$ tells us the parity of the number of monic irreducible factors of $f$ when $f$ is squarefree. But there is a huge difference between ${\mathbf Z}$ and ${\mathbf F}_p[x]$ when it comes to determining that something is squarefree, because there is a method of determining that a polynomial in ${\mathbf F}_p[x]$ is squarefree without factoring it: $f(x)$ is squarefree iff $f(x)$ and $f'(x)$ are relatively prime. Note that being squarefree in ${\mathbf F}_p[x]$ is the same thing as being separable (no multiple roots), which ties this in with more standard concepts from field theory for polynomials.

Example. In ${\mathbf F}_2[x]$, $x^{2^n}+x+1$ is squarefree since its derivative is $1$, which is definitely relatively prime to the original polynomial.

The really amazing thing is that it's possible to compute $\mu(f)$ without having to factor $f$.

Theorem. If $p$ is an odd prime then $\mu(f) = (-1)^{\deg(f)}(\frac{\text{disc}(f)}{p})$, where $(\frac{\cdot}{p})$ is the Legendre symbol and $\text{disc}(f)$ is the discriminant of $f(x)$. If $p = 2$ and $f(x)$ is squarefree in ${\mathbf F}_2[x]$, let $F(x)$ be a monic lifting of $f(x)$ to ${\mathbf Z}[x]$. Then $\mu(f) = 1$ if $\text{disc}(F) \equiv 1 \bmod 8$ and $\mu(f) = -1$ if $\text{disc}(F) \equiv 5 \bmod 8$.

It is a theorem of Stickelberger that a monic polynomial in ${\mathbf Z}[x]$ with odd discriminant must have discriminant that is $1 \bmod 4$, so mod 8 the discriminant is 1 or 5.

I will not prove the theorem here, but I'll tell you some background. This theorem is due to Pellet (1878) when $p$ is odd. The case $p=2$ was given by Stickelberger in a short paper in the first ICM in 1897 and Swan rediscovered it in 1962 (Pacific J. Math) in terms of a lifting from ${\mathbf F}_2$ to the 2-adic integers instead of ${\mathbf Z}$. The proof for odd $p$ just needs Galois theory of finite fields, but for $p = 2$ more work is required. I should point out that Pellet, Stickelberger, and Swan were not thinking directly in terms of a Möbius function on polynomials over a finite field. For example, Pellet and Stickelberger wrote the formula with the discriminant term on one side and the rest on the other side with the Möbius value appearing as a power of $-1$: they were studying the "quadratic" nature of the discriminant of a polynomial mod $p$ when the discriminant mod $p$ is nonzero. (The formula in the theorem is trivial when the discriminant is 0, since both sides are 0, but P, S, and S didn't write the formula in that case since it was of no interest to them.)

To do computations with this theorem, let me recall one of the definitions of the discriminant of a monic polynomial $h(x)$ of degree $d$ over a field: $\text{disc}(h) = (-1)^{d(d-1)/2}\prod_{i=1}^d h'(r_i)$, where $h(x) = \prod_{i=1}^d (x-r_i)$ in a splitting field. In particular, note $(-1)^{d(d-1)/2} = 1$ when $d$ is a multiple of 4.

Now it's time to get back to your original question. We will apply this theorem to your polynomial $x^{2^n} + x + 1$ in ${\mathbf F}_2[x]$ by showing $\mu(x^{2^n} + x + 1) = 1$ for any $n \geq 3$, where $\mu$ is the Möbius function on ${\mathbf F}_2[x]$. First observe that in characteristic 2, $x^{2^n} + x + 1 = (x+1)^{2^n} + x$.

Theorem. For any nonconstant polynomial $g(x)$ in ${\mathbf F}_2[x]$, $\mu(g^8 + x) = 1$.

The conclusion of this theorem is false when $g$ is constant, and we'll see early in the proof how $g$ being nonconstant matters: it means the degree of $g^8+x$ is determined by the $g^8$ part and not the $x$ part.

Proof of theorem: Let $G(x)$ in ${\mathbf Z}[x]$ be any monic lifting of $g(x)$. (For example, each mod 2 coefficient of $g(x)$ that is 0 or 1 mod 2 could be replaced with the integers 0 or 1.) Then $\deg(G) = \deg(g) \geq 1$, so $G^8 + x$ has degree $8\deg(G)$, which is a multiple of 8. That's the step that required $g$, and hence $G$, to be nonconstant. And the derivative of $G^8 + x$ is $8G^7G' + 1$, so if we let $r_1,\dots,r_D$ be the roots of $G^8 + x$, so $D$ is a multiple of 8, $$ \text{disc}(G^8 + x) = \prod_{i=1}^D (1 + 8G(r_i)^7G'(r_i)). $$ Since $G^8 + x$ is monic , its roots are algebraic integers, so the number on the right is a product of algebraic integers which are all 1 plus 8 times an algebraic integer. Therefore their product is 1 plus 8 times an algebraic integer. Since the left side $\text{disc}(G^8 + x)$ is in ${\mathbf Z}$, the number on the left must be $1 \bmod 8$, so by the Möbius formula in ${\mathbf F}_2[x]$, $\mu(g^8 + x) = 1$. QED

When $n \geq 3$, $2^n$ is a multiple of 8 and $(x+1)^{2^n} + x = g^8 + x$ for $g(x) = (x+1)^{2^{n-3}}$, so this theorem implies $x^{2^n} + x + 1$ is reducible over ${\mathbf F}_2$ because we showed it has an even number of irreducible factors. While Matt E's solution is much shorter than mine, and is a pleasant application of Galois theory of finite fields (I'll have to remember it for the next time I teach a Galois theory course), notice the theorem above is a much more general result: $g^8 + x$ is reducible for any nonconstant $g$ in ${\mathbf F}_2[x]$ because it has an even number of irreducible factors.

There is an analogue of this in all characteristics: for any prime $p$, $\mu(g^{4p}+x) = 1$ for all nonconstant $g(x)$ in ${\mathbf F}_p[x]$.

Two settings where this Möbius formula was applied somewhat recently are described in http://www.math.uconn.edu/~kconrad/articles/texel.pdf.