Is the square root of a triangular matrix necessarily triangular?

$$ \left( \begin{matrix} 1 & 1 & 1 \\ 4 & 1 & 2 \\ 1 & -2 & -3 \end{matrix} \right)^2 = \left( \begin{matrix} 6 & 0 & 0 \\ 10 & 1 & 0 \\ -10 & 5 & 6 \end{matrix} \right)$$

This example was found more or less by

  • Picking an arbitrary top row
  • Filling out the middle column to make a 0 in the product
  • Filling out the right column to make a 0 in the product
  • Filling out the middle row to make a 0 in the product
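For anyone who wants to check this quickly, here is a minimal numpy sketch (the variable names are mine, just for illustration) that squares the matrix above and confirms the strict upper part vanishes:

```python
import numpy as np

# The non-triangular matrix from the example above
A = np.array([[1,  1,  1],
              [4,  1,  2],
              [1, -2, -3]])

S = A @ A
print(S)
# [[  6   0   0]
#  [ 10   1   0]
#  [-10   5   6]]

# The strictly upper-triangular part of the square is zero
assert np.array_equal(np.triu(S, k=1), np.zeros((3, 3), dtype=int))
```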

$$\begin{pmatrix}0 & 1 & 0 \\ 1 & 0 & 0 \\ 1 & 1 & 1\end{pmatrix}\begin{pmatrix}0 & 1 & 0 \\ 1 & 0 & 0 \\ 1 & 1 & 1\end{pmatrix}=\begin{pmatrix}1 & 0 & 0 \\ 0 & 1 & 0 \\ 2 & 2 & 1\end{pmatrix}$$

I simply started from $\begin{pmatrix}0 & 1 \\ 1 & 0\end{pmatrix}^2 = I_2$, and then passed to a $3\times3$ matrix to make it non-diagonal. The key here is that if you start with a matrix whose square is lower triangular, append $(0,0,\ldots,0,1)^T$ as the last column, and put anything else in the last row, its square stays lower triangular...
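To illustrate the padding step, here is a small sketch (assuming numpy; the $2\times2$ seed and the last row are the ones used in the example above) showing that the square of the padded matrix stays lower triangular:

```python
import numpy as np

A = np.array([[0, 1],
              [1, 0]])          # A squares to the identity, trivially lower triangular

# Pad with last column (0, ..., 0, 1)ᵀ and an arbitrary last row
B = np.block([[A,                  np.zeros((2, 1), dtype=int)],
              [np.array([[1, 1]]), np.array([[1]])]])

S = B @ B
print(S)
# [[1 0 0]
#  [0 1 0]
#  [2 2 1]]

# The square of the padded matrix is again lower triangular
assert np.array_equal(np.triu(S, k=1), np.zeros((3, 3), dtype=int))
```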


With counterexamples abounding, I'll try here to explain why, nonetheless, for most triangular $L$ any solution of $X^2=L$ will be likewise triangular, and investigate which properties of $L$ (beyond merely having a non-zero below-diagonal entry) ensure this. The key observations are

  • Lower triangularity of $L$ means that the subspaces $\langle e_i,\ldots,e_n\rangle$ for $i=1,\ldots,n$ are $L$-stable,
  • Any solution $X$ of $X^2=L$ will commute with $L$,
  • Therefore the kernel of any polynomial $P_L$ in $L$ will be $X$-stable.

(For the last point, if $P_L(v)=0$, then $P_L(X(v))=X(P_L(v))=0$.)

Now the subspaces $\langle e_i,\ldots,e_n\rangle$ are $L$-stable if $L$ is triangular, but they need not be kernels of polynomials in $L$; should they all be such kernels, however, then by the last point they will all be $X$-stable for any solution $X$, and $X$ will have to be lower triangular. For this to happen it suffices that the diagonal coefficients $a_{1,1},\ldots,a_{n,n}$ of $L$ are all distinct: in that case $\langle e_i,\ldots,e_n\rangle$ is precisely $\ker((L-a_{i,i}Id)\circ\cdots\circ(L-a_{n,n}Id))$, as is easily checked. The reason this argument fails when some $a_{j,j}=a_{k,k}$ with $j<k$ is that the eigenspace for this $a_{j,j}$ might have dimension${}>1$, in which case any polynomial in $L$ containing a factor $L-a_{j,j}Id$ will kill not only $e_k$ but also $e_j$, so no polynomial in $L$ can have exactly $\langle e_k,\ldots,e_n\rangle$ as its kernel.
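Here is a symbolic sanity check of the distinct-diagonal case (a sketch using sympy; the particular $L$, with distinct diagonal entries $1,2,3$, is my own choice, and the code uses 0-based indices):

```python
import sympy as sp

# A lower-triangular L with distinct diagonal entries (chosen arbitrarily)
L = sp.Matrix([[1, 0, 0],
               [5, 2, 0],
               [7, 8, 3]])
n = L.rows
a = [L[i, i] for i in range(n)]

for i in range(n):
    # P = (L - a_i Id)(L - a_{i+1} Id) ... (L - a_{n-1} Id)
    P = sp.eye(n)
    for j in range(i, n):
        P = P * (L - a[j] * sp.eye(n))
    ker = P.nullspace()
    # ker P = <e_i, ..., e_{n-1}>: it has dimension n - i, and every
    # kernel vector has zeros in its first i coordinates
    assert len(ker) == n - i
    assert all(v[r] == 0 for v in ker for r in range(i))
```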

In fact an even weaker condition on $L$ forces $X$ to be triangular: if the minimal polynomial of $L$ is equal to its characteristic polynomial $(t-a_{1,1})\cdots(t-a_{n,n})$, then the kernels of the $n$ polynomials $(L-a_{i,i}Id)\circ\cdots\circ(L-a_{n,n}Id)$ are all distinct, and therefore necessarily equal to $\langle e_i,\ldots,e_n\rangle$ respectively. Another argument for this case is that here the only matrices commuting with $L$ are polynomials in $L$, and therefore triangular; this applies in particular to $X$. The condition on the minimal polynomial can be seen to be equivalent to all eigenspaces of $L$ having dimension $1$.
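A small sympy sketch of that last equivalence (the helper name and both test matrices are mine; the second matrix anticipates the example below):

```python
import sympy as sp

def eigenspaces_all_one_dimensional(L):
    """True iff every eigenspace of L is 1-dimensional, i.e. iff the
    minimal polynomial of L equals its characteristic polynomial."""
    n = L.rows
    return all(len((L - lam * sp.eye(n)).nullspace()) == 1
               for lam in L.eigenvals())

# Repeated diagonal entry, yet all eigenspaces 1-dimensional:
# any square root of this L must still be triangular.
L1 = sp.Matrix([[1, 0, 0],
                [1, 1, 0],
                [0, 1, 2]])
print(eigenspaces_all_one_dimensional(L1))   # True

# The diagonalisable L constructed below (x = y = z = 3):
# the eigenvalue 1 has a 2-dimensional eigenspace.
L2 = sp.Matrix([[1, 0, 0],
                [3, 4, 0],
                [3, 3, 1]])
print(eigenspaces_all_one_dimensional(L2))   # False
```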

I believe that this sufficient condition for $L$ to force $X$ to be triangular is also necessary; in other words, once there is an eigenspace of dimension${}>1$, this can be exploited to construct a solution $X$ that is not triangular. Here is an example. Suppose we want $L$ to have a double eigenvalue $1$ and a simple eigenvalue $4$ (which makes taking square roots easier). It will help to make the entries $1$ on the diagonal non-adjacent, so take $L$ of the form $$ L=\begin{pmatrix}1&0&0\\x&4&0\\y&z&1\end{pmatrix}. $$

Now we need the eigenspace of $L$ for $1$ to have dimension $2$ (in other words, $L$ should be diagonalisable), so $L-I_3$ should have rank $1$, which here means $xz-3y=0$. Let us take $x=y=z=3$ to arrange this. This also gives an easy second eigenvector $e_1-e_2$, in addition to the inevitable eigenvector $e_3$, at $\lambda=1$. Computing $L-4I_3$ shows that for these values $e_2+e_3$ is the eigenvector at $\lambda=4$. Now it happens that we can choose the matrix $P$ with these eigenvectors as columns (which we will use for the base change) to itself be lower triangular; this is of no relevance, but it is fun, and it helps in finding the inverse: $$ P=\begin{pmatrix}1&0&0\\-1&1&0\\0&1&1\end{pmatrix} \quad\text{for which}\quad P^{-1}=\begin{pmatrix}1&0&0\\1&1&0\\-1&-1&1\end{pmatrix}. $$

The most important point is to decide what $X$ does with the eigenspace of $L$ for $\lambda=1$. As a linear map, $X$ must stabilize this eigenspace globally, and its restriction to it must square to the identity (it must be an involution); taking the restriction to be plus or minus the identity will not give a counterexample, so let us take it to have both eigenvalues $1$ and $-1$. A simple way to do this is to interchange the two eigenvectors we found (swapping two basis vectors is a reflection, with eigenvalues $\pm1$). We still have the choice of square roots $\pm2$ on the eigenspace for $\lambda=4$, so one gets $$ X=P\cdot\begin{pmatrix}0&0&1\\0&\pm2&0\\1&0&0\end{pmatrix}\cdot P^{-1}, $$ giving $$ X=\begin{pmatrix}-1&-1&1\\3&3&-1\\3&2&0\end{pmatrix} \quad\text{and}\quad X=-\begin{pmatrix}1&1&-1\\1&1&1\\1&2&0\end{pmatrix}, $$ which indeed both square to $L$.

You can experiment with variants, like changing the double eigenvalue of $L$ to $\lambda=-1$ instead of $\lambda=1$; one can still find a real square root of the restriction of $L$ to this eigenspace, though the flavour is a bit different.
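To close, a short sympy sketch reproducing this construction end to end (everything here comes from the computation above; only the variable names are mine):

```python
import sympy as sp

L = sp.Matrix([[1, 0, 0],
               [3, 4, 0],
               [3, 3, 1]])

# Base-change matrix: columns are the eigenvectors
# e1 - e2 (λ = 1), e2 + e3 (λ = 4), e3 (λ = 1)
P = sp.Matrix([[ 1, 0, 0],
               [-1, 1, 0],
               [ 0, 1, 1]])

for s in (2, -2):                  # both square roots of the eigenvalue 4
    # Swap the two λ = 1 eigenvectors (an involution with eigenvalues ±1),
    # and scale the λ = 4 eigenvector by ±2
    D = sp.Matrix([[0, 0, 1],
                   [0, s, 0],
                   [1, 0, 0]])
    X = P * D * P.inv()
    assert X**2 == L               # X squares to L, yet X is not triangular
    print(X)
```

The two printed matrices are exactly the two non-triangular square roots displayed above.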