Choosing an orthonormal basis in which a linear operator has a sparse matrix

We have $k_{\mathbb{C}}(n)=k(n)=\frac12n(n-1)$ for all $n > 2$ (for $n=2$, we have $k_{\mathbb{C}}(2)=1$, $k(2)=0$).

First, I'll show that these are upper bounds before showing that they can be attained. I'll use the fact that a smooth map from an $m$-dimensional manifold (or finite union of $m$-dimensional manifolds) to an $n$-dimensional manifold cannot be onto unless $n\le m$. It was noted in the question that $k(n)\le\frac12n(n-1)$, so I'll just look at the complex case. The space ${\rm U}(n)$ of $n\times n$ unitary matrices has dimension $n^2$, and the group ${\rm GL}_{\mathbb{C}}(n)$ of invertible complex $n\times n$ matrices has (real) dimension $2n^2$.

If $U$ is unitary such that $U^HAU$ has $k$ zero entries, then $V^HAV$ also has $k$ zero entries, where $V=UD$ for a diagonal unitary matrix $D$, since $V^HAV=D^H(U^HAU)D$ only rescales the entries by unit-modulus factors. In particular, it is always possible to choose $D$ such that the first nonzero element of each column of $V$ is real. Letting $R$ be the space of $n\times n$ unitary matrices whose first nonzero element in each column is real, this has dimension $n(n-1)$. Letting $S$ be the set of $n\times n$ complex matrices with at least $k$ zero entries (a finite union of subspaces), this has dimension $2(n^2-k)$. If every matrix can be brought to a form with $k$ zero entries by a unitary change of basis, then the map $R\times S\to {\rm GL}_{\mathbb{C}}(n)$, $(U,M)\mapsto UMU^H$, is onto. So, $$ {\rm dim}(R\times S)=n(n-1)+2(n^2-k)\ge{\rm dim}({\rm GL}_{\mathbb{C}}(n))=2n^2, $$ so, again, $k\le\frac12n(n-1)$.
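Just to make the normalisation step concrete, here is a small NumPy sketch (not part of the argument, and the variable names are my own) checking that right-multiplying $U$ by a diagonal unitary $D$ makes the first entry of each column real while leaving the zero pattern of $U^HAU$ unchanged:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4

# Random unitary U (QR of a complex Gaussian matrix) and a random complex A.
Z = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
U, _ = np.linalg.qr(Z)
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
M = U.conj().T @ A @ U

# Choose a diagonal unitary D so that the first entry of each column of
# V = U D is real and non-negative (generically the first entry is nonzero).
D = np.diag(np.exp(-1j * np.angle(U[0, :])))
V = U @ D

# V^H A V = D^H M D just rescales entries by unit-modulus factors,
# so its (near-)zero pattern is the same as that of M.
assert np.allclose(np.abs(V.conj().T @ A @ V), np.abs(M))
print(np.allclose(V[0, :].imag, 0.0))   # True: first row of V is now real
```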

Now, we can prove the lower bound. It was noted in the question that $k_{\mathbb{C}}(n)\ge\frac12n(n-1)$, as every complex matrix can be put in upper triangular form by a Schur decomposition. So, only the real case remains. For positive integers $n_1+n_2+\cdots+n_m=n$, we can express an $n\times n$ matrix $A$ in block form $$ A=\pmatrix{ A_{11} & A_{12} &\cdots&A_{1m}\\ A_{21}&A_{22}&\cdots&A_{2m}\\ \vdots&&\ddots&\vdots\\ A_{m1}&A_{m2}&\cdots& A_{mm} } $$ where $A_{ij}$ is an $n_i\times n_j$ real matrix. As noted by Omnomnomnom, using the real Schur form (i.e., replacing $A$ by $Q^TAQ$ for an orthogonal $Q$), this can be done so that $A$ is block upper triangular ($A_{ij}=0$ for $i > j$) and such that each $n_i$ is 1 or 2. We can take $n_1=n_2=\cdots=n_r=2$ and $n_{r+1}=n_{r+2}=\cdots=n_m=1$, where $r\in\{0,1,\ldots,m\}$ is the number of complex-conjugate pairs of eigenvalues of $A$. The only possibly nonzero below-diagonal entries of $A$ are then the below-diagonal entries of the $2\times2$ diagonal blocks $A_{ii}$, $i\le r$. There are $r$ of these, so $A$ has at least $\frac12n(n-1)-r$ zero entries below the diagonal. We cannot remove these remaining $r$ below-diagonal nonzero entries, but we can introduce an additional $r$ zeros above the diagonal.

Note first that if $Q$ is a real matrix in block form $Q=(Q_{ij})_{i,j=1,\ldots,m}$ with $Q_{ij}=0$ for $i\not=j$ (i.e., block-diagonal) and each $Q_{ii}$ orthogonal, then $Q$ is orthogonal. Furthermore, $B=Q^TAQ$ has block form $B=(B_{ij})$ with $B_{ij}=Q_{ii}^TA_{ij}Q_{jj}$. So, $B$ is also block upper triangular.
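As a quick sanity check of that last fact, here is a purely illustrative NumPy sketch with a block pattern of my choosing ($n_1=n_2=2$, $n_3=1$):

```python
import numpy as np
from scipy.linalg import block_diag

rng = np.random.default_rng(1)
sizes = [2, 2, 1]                       # n_1 = n_2 = 2, n_3 = 1, so n = 5, r = 2
offs = np.cumsum([0] + sizes)
n = offs[-1]

# A block upper triangular matrix with this block pattern.
A = rng.standard_normal((n, n))
for i in range(len(sizes)):
    for j in range(i):
        A[offs[i]:offs[i+1], offs[j]:offs[j+1]] = 0.0

# Block-diagonal Q with orthogonal diagonal blocks.
Qblocks = [np.linalg.qr(rng.standard_normal((s, s)))[0] for s in sizes]
Q = block_diag(*Qblocks)
assert np.allclose(Q.T @ Q, np.eye(n))  # Q is orthogonal

# B = Q^T A Q is still block upper triangular, with B_ij = Q_ii^T A_ij Q_jj.
B = Q.T @ A @ Q
for i in range(len(sizes)):
    for j in range(len(sizes)):
        Aij = A[offs[i]:offs[i+1], offs[j]:offs[j+1]]
        Bij = B[offs[i]:offs[i+1], offs[j]:offs[j+1]]
        assert np.allclose(Bij, Qblocks[i].T @ Aij @ Qblocks[j])
        if i > j:
            assert np.allclose(Bij, 0.0)
```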

(1) For any $k < m$ with $n_k=2$, there is an orthogonal block-diagonal matrix $Q$ such that $B=Q^TAQ$ has block entries $B_{ij}=A_{ij}$ for $i,j\not=k$ and $B_{km}$ has a zero entry.

To show this, let $v\in\mathbb{R}^2$ be the first column of $A_{km}$ and $R$ be a $2\times2$ rotation matrix with $R^Tv=(\lVert v\rVert,0)^T$. Then the first column of $R^TA_{km}$ is $(\lVert v\rVert,0)^T$, which contains a zero entry, and letting $Q$ be the block-diagonal matrix with $Q_{kk}=R$ and all other diagonal blocks being the identity gives the result.
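Concretely, $R$ is just a Givens rotation; a minimal numerical sketch (the case $n_m=1$, so $A_{km}$ is a $2\times1$ block):

```python
import numpy as np

rng = np.random.default_rng(2)
A_km = rng.standard_normal((2, 1))      # the 2 x n_m block, here with n_m = 1
v = A_km[:, 0]                          # its first column

# Givens rotation R with R^T v = (||v||, 0)^T.
c, s = v / np.linalg.norm(v)
R = np.array([[c, -s],
              [s,  c]])

assert np.allclose(R.T @ R, np.eye(2))                    # R is orthogonal
assert np.allclose(R.T @ v, [np.linalg.norm(v), 0.0])     # a zero entry appears
print(R.T @ A_km)
```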

As long as $r < m$ or, equivalently, $A$ has at least one real eigenvalue, we can apply (1) so that $A_{rm}$ has a zero entry, then so that $A_{r-1,m}$ has a zero entry, and so on, to end up with each $A_{im}$ having a zero entry for $i\le r$. If $r=m$ or, equivalently, $A$ has no real eigenvalues, we can use the following.

(2) If $i < j$ is such that $n_i=n_j=2$ then there exists an orthogonal block-diagonal matrix $Q$ such that $B=Q^TAQ$ has block entry $B_{ij}$ being diagonal.

By the singular value decomposition, there exist $2\times2$ orthogonal matrices $R,S$ with $R^TA_{ij}S$ diagonal. Then, letting $Q$ be the block-diagonal matrix with $Q_{ii}=R$, $Q_{jj}=S$ and all other diagonal blocks being the identity gives the result.
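Again, this step is easy to check numerically (illustrative only):

```python
import numpy as np

rng = np.random.default_rng(3)
A_ij = rng.standard_normal((2, 2))      # the 2 x 2 block between two 2 x 2 diagonal blocks

# SVD: A_ij = R diag(s) S^T with R, S orthogonal, so R^T A_ij S is diagonal.
R, s, St = np.linalg.svd(A_ij)
S = St.T

assert np.allclose(R.T @ A_ij @ S, np.diag(s))   # diagonal: two new zero entries
```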

Now, in the case where $n > 2$ and $A$ has no real eigenvalues (so $r=m\ge2$), we can apply (2) with $i=r-1$, $j=r$ to make $A_{r-1,r}$ diagonal, so that it has two zero entries. Iteratively applying (1) with $k=r-2,r-3,\ldots,1$ then introduces a zero entry in each of the blocks $A_{km}$ for $k\le r-2$ (here $m=r$). This leaves $A$ in block upper triangular form with $r$ zeros above the diagonal. In either case, then, $A$ has at least $\frac12n(n-1)-r$ zeros below the diagonal and $r$ above, so at least $\frac12n(n-1)$ zero entries in total, as required.


With Schur triangularization, you can get the slightly improved result of $\frac 12 n(n-1)$ zero entries in the complex case, or in the real case under the assumption that the matrix has exclusively real eigenvalues.

For a real matrix with complex eigenvalues, we can always put the matrix into its "real Schur form", which is a "block-upper-triangular" form. That is, for $A \in \mathbb{R}^{n\times n}$, there exists an orthogonal matrix $Q$ for which $$ Q^TAQ = \pmatrix{ A_1 &&&*\\ &A_2&&\\ &&\ddots&\\ 0&&& A_k } $$ where each $A_i$ is either $1\times 1$ or a $2\times 2$ real matrix with a pair of complex-conjugate eigenvalues.

This means that for any $A$, we can always choose an orthogonally equivalent matrix with at least $\frac 12 n(n-1) - n/2 = \frac 12 n(n-2)$ zero entries (each $2\times 2$ diagonal block costs one below-diagonal zero, and there are at most $n/2$ of them), which is to say at least $\lceil \frac 12 n(n-2) \rceil$ zero entries.
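If you want to see this bound in action, here is a quick check using SciPy's real Schur form (`scipy.linalg.schur` with `output='real'` computes exactly the factorization above); purely illustrative:

```python
import numpy as np
from math import ceil
from scipy.linalg import schur

rng = np.random.default_rng(4)
n = 7
A = rng.standard_normal((n, n))

# Real Schur form: A = Q T Q^T with Q orthogonal and T quasi-upper-triangular.
T, Q = schur(A, output='real')
assert np.allclose(Q @ T @ Q.T, A)

# Count zero entries strictly below the diagonal of T.  Each 2x2 block costs
# one below-diagonal zero, and there are at most floor(n/2) such blocks.
zeros_below = int(np.sum(np.abs(np.tril(T, k=-1)) < 1e-12))
print(zeros_below, ceil(n * (n - 2) / 2))
assert zeros_below >= ceil(n * (n - 2) / 2)
```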

I hope you find this useful.


It may be possible to generalize your latest result.

For any odd $n$, any $T \in \mathbb{R}^{n\times n}$ must have some eigenvector $v_1$.

So, consider any transformation $A$ in odd dimension. We can write $T = Q^TAQ$ in the above form, insisting that $A_1$ be a $1 \times 1$ matrix. That is, for some $a \in \mathbb{R}$, we can write $$ Q^TAQ = \pmatrix{ a &&&*\\ 0&A_2&&\\ \vdots&&\ddots&\\ 0&0&& A_k } $$ Note that the first column of $Q$ (call it $q_1$) is an eigenvector, and that $a$ is the corresponding eigenvalue. If $T$ is non-invertible (which is true if and only if $A$ is non-invertible), we can insist that $a=0$, and we have our extra zero entry as desired.
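A little numerical illustration of that last point (the way I build $Q$ here, via a QR factorization with a unit kernel vector in the first column, is just one convenient choice, not part of the argument):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 5

# A singular matrix (rank n-1 by construction).
A = rng.standard_normal((n, n - 1)) @ rng.standard_normal((n - 1, n))

# A unit vector q1 in the kernel: the right singular vector for the
# smallest (here zero) singular value.
q1 = np.linalg.svd(A)[2][-1]
assert np.allclose(A @ q1, 0.0)

# An orthogonal Q whose first column is (up to sign) q1, via QR.
Q, _ = np.linalg.qr(np.column_stack([q1, rng.standard_normal((n, n - 1))]))

# The first column of Q^T A Q is Q^T A q1 = 0: the eigenvalue a is 0,
# and the rest of that column vanishes as well.
B = Q.T @ A @ Q
print(np.round(B, 3))
assert np.allclose(B[:, 0], 0.0)
```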

If $T$ is invertible, I think we might be able to set $v_1 = q_1$, and choose a new basis $v_2,\dots,v_n$ of $P$, the span of $q_2,\dots,q_n$. Maybe there's some way to choose $v_2,\dots,v_n$ so that $v_2 \in P \cap T(P)$, while maintaining this block upper-triangular form.

Food for thought, I suppose.


Here is a proof that $k(3)=3$.

Let $T:\mathbb R^3\to\mathbb R^3$ be a linear operator. If the kernel of $T$ is nontrivial, choose a unit vector in the kernel as one of the basis vectors; this already creates a column of $3$ zeros in the matrix. Suppose, then, that $T$ is invertible. Since the space is odd-dimensional, $T$ has at least one eigenvector (the characteristic polynomial has at least one real root). Let $v_1$ be a normalized eigenvector of $T$. Let $P$ be the 2-dimensional subspace orthogonal to $v_1$. Since $T^{-1}P$ is also 2-dimensional, the intersection $P\cap T^{-1}P$ contains a line. Let $v_2$ be a unit vector on that line, and notice that $Tv_2 \perp v_1$. Finally, complete the basis with $v_3=v_1\times v_2$.

In the basis $v_1,v_2,v_3$ the matrix of $T$ has the form $$\begin{pmatrix} * & 0 & * \\ 0 & * & * \\ 0 & * & * \end{pmatrix} $$ Thus, $k(3)\ge 3$. The reverse inequality follows from (1) in the question.
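The construction is easy to carry out numerically; here is a NumPy sketch (it assumes the generic situation where $v_1$ and $T^Tv_1$ are not parallel, so the line $P\cap T^{-1}P$ can be found with a cross product):

```python
import numpy as np

rng = np.random.default_rng(6)
T = rng.standard_normal((3, 3))        # generically invertible

# A real eigenvalue/eigenvector pair (one always exists in dimension 3).
w, V = np.linalg.eig(T)
i = np.argmin(np.abs(w.imag))
v1 = np.real(V[:, i])
v1 /= np.linalg.norm(v1)

# v2 spans P ∩ T^{-1}P: it must be orthogonal to both v1 and T^T v1
# (assuming these two vectors are not parallel, which holds generically).
v2 = np.cross(v1, T.T @ v1)
v2 /= np.linalg.norm(v2)
v3 = np.cross(v1, v2)

# Matrix of T in the orthonormal basis v1, v2, v3.
B = np.column_stack([v1, v2, v3])
M = B.T @ T @ B
print(np.round(M, 3))
assert np.allclose([M[1, 0], M[2, 0], M[0, 1]], 0.0)   # the three zeros
```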


If all the eigenvalues are distinct, then the matrix is fully diagonalizable, so there would be $n^2 - n = n(n-1)$ zeros. If some eigenvalues coincide, then the matrix is only block-diagonalizable; if there are $m\le n$ distinct eigenvalues with multiplicities $n_1,\dots,n_m$, then the blocks would be $n_i \times n_i$ squares and we could guarantee at least $n^2 - \sum_{i=1}^mn_i^2$ zeros.

Unfortunately, the worst case for this line of reasoning is when all eigenvalues are identical. In that case it cannot guarantee any zeros at all.

The $k$-numbers under investigation here can be used to tighten the above result. Since the restriction of the linear operator to the $i$th block must still be able to have $k(n_i)$ zeros forced into it, we can guarantee at least $n^2 - \sum_{i=1}^mn_i^2 + \sum_{i=1}^mk(n_i)$ zeros, which will be a higher number if at least one $n_i\ge 3$ (so that $k(n_i)>0$).
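For example, plugging in the value $k(3)=3$ established above: if $n=6$ and there are $m=2$ distinct eigenvalues with $n_1=n_2=3$, the block count alone guarantees $n^2-\sum_i n_i^2=36-18=18$ zeros, while the refined bound gives $36-18+k(3)+k(3)=24$.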

If we think of the real matrix as a complex matrix, then its Jordan Normal Form will have at most $n$ non-zero entries on the diagonal and at most $\sum_{i=1}^m(n_i-1) = n-m$ non-zero entries on the superdiagonal. Therefore, we would get no more than $2n-m$ non-zero entries, and at least $n^2 - 2n + m$ zeros. However, the JNF of a real matrix is not necessarily real itself unless all the eigenvalues are real, so getting this many zeros in the general case would require allowing complex values (on the main diagonal only) in the JNF.