How to determine $\mathrm{rank}(p(A))$ if you know the rational canonical form of $A$

For example, let $A$ be a matrix with entries in $\mathbb{F}_2$ such that $$ \text{RCF of } A = C \left[ x + 1 \right] \oplus C \left[ (x + 1) ^ 2 \right] \oplus C \left[ (x^2 + x + 1) ^ 3 \right] \oplus C \left[ (x^3 + x + 1) ^ 2 \right] $$ [where $C \left[ f(x) \right] = \text{the companion matrix of } f(x)$]

How can I get the rank of the matrix $A ^ 3 + I$?


Since applying an polynomial to the $\text{RCF of } A$ is the same as applying a polynomial to every component on the $\text{RCF of } A$, it is enough to work on every component in order to determine the rank of $p(A)$ [in our case, $p (x) = x^3 + 1$].

Since we are in $\mathbb{F}_2 [x]$, then $x ^ 3 + 1 = (x + 1) (x^2 + x + 1)$.

I tried to find the $\text{RCF of } g \left( C \left[ f(x) \right] \right)$ (where $f,g$ are polynomials in $\mathbb{F}_2 [x]$) by considering some cases, but I didn't managed to do it. I suspect that in the case of $\gcd (f, g) = 1$, the $\text{RCF of } g \left( C \left[ f(x) \right] \right)$ is $C \left[ f(x) \right]$, but didn't managed to prove it.


Solution 1:

First, to handle the easiest case: note that if $f\mid g$, then $g(C[f]) = 0$.

Generally speaking, we will have $\dim \ker(g(C[f])) = \deg(\gcd(f,g))$, and then the rank can be obtained via the rank-nullity theorem. With this fact, it is easy to see that for each of the summands in the RCF, the ranks will be $0,1,4,6$. Adding these together gives us our answer: the rank of $A^3 + I$ is $11$.


Regarding your conjecture: your conjecture is incorrect, but the case where $f,g$ are relatively prime is important. For such $f,g$, the matrix $g(C[f])$ is invertible.

To see that this is the case, note that by Bezout's lemma, there exist polynomials $p,q$ such that $$ p(x)f(x) + q(x) g(x) = 1. $$ Plugging $C[f]$ into this equation yields $$ p(C[f])f(C[f]) + q(C[f])g(C[f]) = I \implies q(C[f])g(C[f]) = I, $$ so that $g(C[f])$ has inverse $q(C[f])$.


Regarding the fact that $\dim \ker(g(C[f]) = \deg(\gcd(f,g))$:

There are only two ways that I can think of proving this: one is to appeal to the existence of an algebraic closure, so that we can use Jordan canonical form. The other, which I present here, uses some abstract algebra that you might not be comfortable with.

So, here is the abstract algebra based proof. Suppose that $f$ has degree $n$. Note that $C[f]$ is the matrix of the linear operator $$ \mu_x: \Bbb F[x]/\langle f(x) \rangle \to \Bbb F[x]/\langle f(x) \rangle, \quad \mu_x(g(x)) = xg(x). $$ Note that as a vector space over $\Bbb F$, $V:=\Bbb F[x]/\langle f(x) \rangle$ has dimension $n$ because it is spanned by the independent elements $1,x,\dots,x^{n-1}$.

On the other had, $g(\mu_x) = \mu_{g(x)}$. The image of $\mu_{g(x)}$ is $\langle g(x) \rangle$ (the ideal generated by $g(x)$), which is equal to $\langle d(x) \rangle$ where $d = \gcd(f,g)$. Let $m = \deg(d)$; note that $m \leq n$. An element of $\langle d \rangle \subset V$ can be expressed in the form $d(x)p(x)$, where $\deg(d(x)p(x)) \leq n-1$. It follows that the set $\{d(x),xd(x),\dots,x^{n-m-1}d(x)\}$ is a basis of $\langle d(x)\rangle$.

Thus, we conclude that the image of $g(\mu_x)$ has dimension $n-m$. Thus, the rank of $g(C[f])$ is $n-m$, or equivalently the dimension of the kernel of $g(C[f])$ is $m$, which was what we wanted.


Here's an attempt to translate the above proof into a more linear algebraic context, essentially in the style of Hoffman and Kunze.

To begin, apply Bezout's lemma again to note that there exists polynomials $p,q$ such that $$ p(x)f(x) + q(x)g(x) = d(x). $$ Plugging in $C := C[f]$ yields $$ p(C)f(C) + q(C)g(C) = d(C) \implies q(C)g(C) = d(C), $$ which means that $\operatorname{rank}(d(C)) \leq \operatorname{rank}(g(C))$. On the other hand, because $d|g$, we can let $g_0 = g/d$ and find that $$ g(C) = g_0(C)d(C), $$ which means that $\operatorname{rank}(d(C)) \geq \operatorname{rank}(g(C))$. Thus, $\operatorname{rank}(d(C)) = \operatorname{rank}(g(C))$. With that, it suffices to determine the rank or nullity of $d(C)$, where $d \mid f$.

Note that $\Bbb F^n$ is a $C$-cyclic subspace, and that $v = (1,0,\dots,0)$ is a cyclic vector for $C$. It follows that the image of $d(C)$ is the $d(C)$-cyclic subspace generated by $v$. From there, we can see that the annihilating polynomial of $d(C)v$ must be $p = f/d$. Thus, the $d(C)$-cyclic subspace generated by $v$ has dimension equal to $\deg(p) = \deg(f) - \deg(d)$. So, we conclude that the rank of $d(C)$ is equal to $\deg(f) - \deg(d)$, which was what we wanted.


Thoughts on the RCF of $g(C[f])$:

One easy case occurs when $f(x) = (x-\lambda_1)\cdots(x - \lambda_k)$ (so that $f$ is diagonalizable) and $g(\lambda_i) \neq g(\lambda_j)$ for all $i \neq j$. In that case, $g(C[f])$ has RCF $C[(x - f(\lambda_1) \cdots (x - f(\lambda_k))]$.

Another easy case occurs when $f(x) = (x - \lambda)^k$ and $g$ is not divisible by $(x-\lambda)$. In that case, it turns out that $g(C[f])$ has RCF $C[(x - f(\lambda))^k]$.

It is also helpful to note that whenever $f_1,f_2$ are relatively prime, we have $$ C[f_1(x)f_2(x)] \cong C[f_1(x)] \oplus C[f_2(x)]. $$

It's not clear how these observations might be joined into a general rule.

An example computation of the RCF of $g(C[f])$:

Take $f(x) = (x^2 + 1)^2$ with coefficients in $\Bbb R$, and consider the polynomial $g(x) = x^3 + x$. We note that the Jordan form of $C[f]$ is given by $$ J = \pmatrix{i&1\\&i\\&&-i&1\\&&&-i}. $$ Correspondingly, the derivative of $g$ is $g'(x) = 3x^2 + 1$, so we compute $$ g(J) = \pmatrix{g(i) & g'(i)\\&g(i)\\&&g(-i)&g'(-i)\\&&&g(-i)} = \pmatrix{0 & -2\\&0\\ &&0&-2\\&&&0}. $$ The Jordan form of this matrix is $$ J_2 = \pmatrix{0 & 1\\&0\\&&0&1\\&&&0}, $$ and depending on your definition of the companion matrix, either $J_2$ or its transpose is already in RCF. We find that the RCF of $g(C[f])$ is given by $$ C[x^2] \oplus C[x^2]. $$ On the other hand, suppose that $g(x) = x^3 - x$. This time we find that $$ g(J) = \pmatrix{-2i & -4\\&-2i\\&&2i&-4\\&&&2i}. $$ The corresponding Jordan matrix is $$ J_2 = \pmatrix{-2i & 1\\&-2i\\&&2i&1\\&&&2i}, $$ and the JCF of $J_2$ is given by $$ C[(x^2 + 4)^2]. $$