Which matrices $A\in\text{Mat}_{n\times n}(\mathbb{K})$ are orthogonally diagonalizable over $\mathbb{K}$?
Update 1. I still need help with Question 1, Question 2' (as well as the bonus question under Question 2'), and Question 3'.
Update 2. I believe that all questions have been answered when $\mathbb{K}$ has characteristic not equal to $2$. The only thing that remains is to deal with what happens when $\text{char}(\mathbb{K})=2$.
Let $\mathbb{K}$ be a field and $n$ a positive integer. The notation $\text{Mat}_{n\times n}(\mathbb{K})$ represents the set of all $n$-by-$n$ matrices with entries in $\mathbb{K}$. The subset $\text{GL}_n(\mathbb{K})$ of $\text{Mat}_{n\times n}(\mathbb{K})$ consists of the invertible matrices. Here, $(\_)^\top$ is the usual transpose operator, and $\langle\_,\_\rangle$ is the standard nondegenerate bilinear form on $\mathbb{K}^n$, given by $\langle u,v\rangle:=u^\top\,v$.
Definition 1. A matrix $A\in\text{Mat}_{n\times n}(\mathbb{K})$ is said to be orthogonally diagonalizable over $\mathbb{K}$ if there exist matrices $D\in\text{Mat}_{n\times n}(\mathbb{K})$ and $Q\in\text{GL}_{n}(\mathbb{K})$ where $D$ is diagonal and $Q$ is orthogonal (i.e., $Q^\top=Q^{-1}$) such that $$A=QDQ^{\top}\,.$$
Definition 2. A matrix $A\in\text{Mat}_{n\times n}(\mathbb{K})$ is said to be seminormal if $$AA^\top=A^\top A\,.$$
For clarification, when $\mathbb{K}=\mathbb{R}$, seminormal matrices are the same as normal matrices. However, when $\mathbb{K}=\mathbb{C}$, seminormal (defined via the transpose) and normal (defined via the conjugate transpose) are different notions. We have an obvious proposition.
Proposition. Let $A\in\text{Mat}_{n\times n}(\mathbb{K})$.
(a) If $A$ is orthogonally diagonalizable over $\mathbb{K}$, then $A$ is symmetric.
(b) If $A$ is symmetric, then $A$ is seminormal.
The converse of (a) does not hold (but it does if $\mathbb{K}=\mathbb{R}$). For example, when $\mathbb{K}$ is the field $\mathbb{C}$, or any field containing a square root of $-1$, we can take $$A:=\begin{bmatrix}1&\sqrt{-1}\\\sqrt{-1}&-1\end{bmatrix}\,.$$ Then, $A$ is symmetric, but being nonzero and nilpotent, it is not diagonalizable. The converse of (b) does not hold trivially (nonzero antisymmetric matrices are seminormal, but not symmetric).
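For completeness, the nilpotency of the matrix $A$ above can be verified directly: $$A^2=\begin{bmatrix}1&\sqrt{-1}\\\sqrt{-1}&-1\end{bmatrix}\begin{bmatrix}1&\sqrt{-1}\\\sqrt{-1}&-1\end{bmatrix}=\begin{bmatrix}1+(-1)&\sqrt{-1}-\sqrt{-1}\\\sqrt{-1}-\sqrt{-1}&(-1)+1\end{bmatrix}=\begin{bmatrix}0&0\\0&0\end{bmatrix}\,.$$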
Here are my questions. Crossed-out questions already have answers.
Question 1. Is there a way to characterize all orthogonally diagonalizable matrices over an arbitrary field $\mathbb{K}$?
As in Proposition (a), these matrices must be symmetric, but the counterexample above shows that this is not a sufficient condition. Thanks to the answer by user277182 (reproduced at the end of this post), I believe that the following is a correct statement.
Theorem. Suppose that $\text{char}(\mathbb{K})\neq 2$. A matrix $A\in\text{Mat}_{n\times n}(\mathbb{K})$ is orthogonally diagonalizable over $\mathbb{K}$ if and only if
(a) $A$ is symmetric and diagonalizable over $\mathbb{K}$, and
(b) there exists a basis $\{v_1,v_2,\ldots,v_n\}$ of $\mathbb{K}^n$ consisting of eigenvectors of $A$ such that $\langle v_i,v_i\rangle$ is a nonzero perfect square element of $\mathbb{K}$ for each $i=1,2,\ldots,n$.
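The necessity of condition (b) in the "only if" direction, for instance, can be checked directly. If $A=QDQ^\top$ with $Q$ orthogonal and $D=\text{diag}(d_1,\ldots,d_n)$, then $AQ=QD$, so the columns $q_1,q_2,\ldots,q_n$ of $Q$ form a basis of $\mathbb{K}^n$ consisting of eigenvectors of $A$, and $$\langle q_i,q_i\rangle=\big(Q^\top Q\big)_{i,i}=\big(I_n\big)_{i,i}=1$$ is certainly a nonzero perfect square for each $i$.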
In the case where every element of $\mathbb{K}$ has a square root in $\mathbb{K}$ (for example, when $\mathbb{K}$ is algebraically closed), the condition (b) in the theorem above is redundant. This theorem also answers Question 2' below (in the case $\text{char}(\mathbb{K})\neq 2$).
Question 2. If a symmetric matrix $A\in\text{Mat}_{n\times n}(\mathbb{K})$ is already known to be diagonalizable over $\mathbb{K}$, is it also orthogonally diagonalizable over $\mathbb{K}$?
The answer to Question 2 turns out to be no (see a counterexample in my answer below). In light of this discovery, I propose a modified version of Question 2.
Question 2'. Let $\mathbb{K}$ be an algebraically closed field. If a symmetric matrix $A\in\text{Mat}_{n\times n}(\mathbb{K})$ is diagonalizable over $\mathbb{K}$, is it also orthogonally diagonalizable over $\mathbb{K}$?
Bonus. If $\mathbb{K}$ is not an algebraically closed field, what is a minimal requirement on $\mathbb{K}$ such that, if a symmetric matrix $A\in\text{Mat}_{n\times n}(\mathbb{K})$ is diagonalizable over $\mathbb{K}$, it is always also orthogonally diagonalizable over $\mathbb{K}$? This requirement may depend on $n$.
My guess for the bonus question is that, for every $x_1,x_2,\ldots,x_n\in\mathbb{K}$, the sum $x_1^2+x_2^2+\ldots+x_n^2$ must have a square root in $\mathbb{K}$. For example, the field of constructible real numbers is a subfield of $\mathbb{R}$ with this property. Any field of characteristic $2$ automatically satisfies this condition.
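The characteristic-$2$ case of the last claim is immediate from the Frobenius endomorphism: when $\text{char}(\mathbb{K})=2$, all cross terms $2\,x_i\,x_j$ vanish, so $$\left(x_1+x_2+\ldots+x_n\right)^2=x_1^2+x_2^2+\ldots+x_n^2\,,$$ and every sum of squares is itself a square.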
Edit. According to this paper and that paper, when $\mathbb{K}=\mathbb{C}$, a symmetric matrix $A$ with an isotropic eigenvector $v$ (that is, $v^\top\,v=0$) is nonsemisimple (i.e., not diagonalizable). Therefore, at least when $\mathbb{K}$ is a subfield of $\mathbb{C}$ such that, for every $x_1,x_2,\ldots,x_n\in\mathbb{K}$, $x_1^2+x_2^2+\ldots+x_n^2$ has a square root in $\mathbb{K}$, a symmetric matrix $A\in\text{Mat}_{n\times n}(\mathbb{K})$ is orthogonally diagonalizable over $\mathbb{K}$ if and only if it is diagonalizable over $\mathbb{K}$. The result for other fields is currently unknown (to me).
Question 3. As a generalization of this question, suppose that $A\in\text{Mat}_{n\times n}(\mathbb{K})$ is diagonalizable over $\mathbb{K}$. Does it hold that $A$ and $A^\top$ have the same set of eigenspaces if and only if $A$ is seminormal?
Only the forward direction ($\Rightarrow$) of this biconditional statement is known to be true. It is clear, however, that when $A$ is orthogonally diagonalizable over $\mathbb{K}$, then $A$ is symmetric, whence $A$ and $A^\top$ have the same eigenspaces. As a result, the converse is true at least when $\mathbb{K}$ is a subfield of $\mathbb{R}$, because the seminormal (whence normal) matrices that are diagonalizable over $\mathbb{R}$ are precisely the symmetric matrices.
The answer to Question 3 is yes. I forgot that diagonalizable matrices commute if and only if they can be simultaneously diagonalized. See my answer in the other thread for a more detailed proof. Therefore, I propose a more general version of Question 3.
Question 3'. Let $A\in\text{Mat}_{n\times n}(\mathbb{K})$ be such that all roots of the characteristic polynomial of $A$ lie in $\mathbb{K}$. What is a necessary and sufficient condition for $A$ and $A^\top$ to have the same set of generalized eigenspaces?
Clearly, seminormality is not such a condition. Over any field $\mathbb{K}$, the matrix $A:=\begin{bmatrix}0&1\\0&0\end{bmatrix}$ has the same set of generalized eigenspaces as does $A^\top$. (The only eigenvalue of $A$ is $0$, and the generalized eigenspace associated to this eigenvalue is the whole of $\mathbb{K}^2$. The same goes for $A^\top$.) However, $$AA^\top=\begin{bmatrix}1&0\\0&0\end{bmatrix}\neq \begin{bmatrix}0&0\\0&1\end{bmatrix}=A^\top A\,.$$ In fact, any matrix $A\in\text{Mat}_{2\times 2}(\mathbb{K})$ which has a single eigenvalue in $\mathbb{K}$ with multiplicity $2$ has $\mathbb{K}^2$ as its unique generalized eigenspace (by the Cayley–Hamilton theorem, $(A-\lambda I)^2=0$), and it follows immediately that $A$ and $A^\top$ have the same generalized eigenspaces.
Here are some worked examples that provide an answer to Question 2. The seminormal matrices in $\text{Mat}_{2\times 2}(\mathbb{K})$ are the symmetric matrices and the matrices of the form $$T(a,b):=\begin{bmatrix}a&b\\-b&a \end{bmatrix}\,,$$ where $a$ and $b$ are elements of $\mathbb{K}$. A symmetric matrix $$S(a,b,d):=\begin{bmatrix}a&b\\b&d\end{bmatrix}$$ is diagonalizable over $\mathbb{K}$ if and only if $a=d$ and $b=0$, or its characteristic polynomial $$x^2-(a+d)\,x+(ad-b^2)\in\mathbb{K}[x]$$ has two distinct roots in $\mathbb{K}$ (if $\text{char}(\mathbb{K})\neq 2$, the second condition is equivalent to stating that $$\Delta(a,b,d):=\sqrt{\left(\dfrac{a-d}{2}\right)^2+b^2}\in\bar{\mathbb{K}}$$ is a nonzero element of $\mathbb{K}$; see the derivation after the list below). It turns out that, if $S(a,b,d)$ is diagonalizable over $\mathbb{K}$, then
- when $\mathbb{K}$ is of characteristic $2$, $S(a,b,d)$ is also orthogonally diagonalizable over $\mathbb{K}$; and
- when $\mathbb{K}$ has characteristic not equal to $2$, $S(a,b,d)$ is orthogonally diagonalizable over $\mathbb{K}$ if and only if $a=d$ and $b=0$, or $\mathbb{K}$ contains both $\Delta(a,b,d)$ and $$\Xi(a,b,d):=\sqrt{2\,\Delta(a,b,d)\,\left(\Delta(a,b,d)-\frac{a-d}{2}\right)}\in\bar{\mathbb{K}}\,.$$
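Here is a sketch of where $\Delta$ and $\Xi$ come from, assuming $\text{char}(\mathbb{K})\neq 2$. By the quadratic formula, the eigenvalues of $S(a,b,d)$ are $$\lambda_{\pm}=\frac{a+d}{2}\pm\sqrt{\left(\frac{a+d}{2}\right)^2-(ad-b^2)}=\frac{a+d}{2}\pm\Delta(a,b,d)\,,$$ and, when $b\neq 0$, the vector $v_{+}:=\begin{bmatrix}b\\\Delta(a,b,d)-\frac{a-d}{2}\end{bmatrix}$ is an eigenvector for $\lambda_{+}$. Using $b^2=\Delta(a,b,d)^2-\left(\frac{a-d}{2}\right)^2$, its norm is $$\langle v_+,v_+\rangle=b^2+\left(\Delta(a,b,d)-\frac{a-d}{2}\right)^2=2\,\Delta(a,b,d)\left(\Delta(a,b,d)-\frac{a-d}{2}\right)=\Xi(a,b,d)^2\,,$$ so condition (b) of the theorem above asks precisely for $\Xi(a,b,d)$ to lie in $\mathbb{K}$; the eigenvector for $\lambda_{-}$ is handled similarly.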
This provides a counterexample to Question 2. For example, when $\mathbb{K}$ is the field of rational numbers $\mathbb{Q}$, we can take $(a,b,d):=(6,4,0)$, so that $\Delta(6,4,0)=5$ and $\Xi(6,4,0)=2\sqrt{5}\notin\mathbb{Q}$. Therefore, $$S(6,4,0)=\begin{bmatrix}6&4\\4&0\end{bmatrix}$$ is not orthogonally diagonalizable over $\mathbb{Q}$. However, $S(6,4,0)$ is diagonalizable over $\mathbb{Q}$ because $\Delta(6,4,0)=5\in\mathbb{Q}_{\neq 0}$.
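Concretely, the eigenvalues of $S(6,4,0)$ are $3\pm 5$, i.e., $8$ and $-2$, and $$S(6,4,0)\begin{bmatrix}4\\2\end{bmatrix}=\begin{bmatrix}6&4\\4&0\end{bmatrix}\begin{bmatrix}4\\2\end{bmatrix}=\begin{bmatrix}32\\16\end{bmatrix}=8\begin{bmatrix}4\\2\end{bmatrix}\,,$$ so $v:=\begin{bmatrix}4\\2\end{bmatrix}$ spans the eigenspace for $8$. Every eigenvector for $8$ has the form $c\,v$ with $c\in\mathbb{Q}_{\neq 0}$, and $\langle c\,v,c\,v\rangle=20\,c^2$ is never a perfect square in $\mathbb{Q}$, so condition (b) of the theorem above fails.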
One subfield $\mathbb{K}$ of $\mathbb{R}$ over which any matrix $S(a,b,d)$, with $a,b,d\in\mathbb{K}$, that is diagonalizable over $\mathbb{K}$ is always also orthogonally diagonalizable over $\mathbb{K}$ is the field of constructible real numbers. Over this field, $S(6,4,0)$ is no longer a counterexample. The same can be said for any field $\mathbb{K}$ in which every element has a square root (that is, if $S(a,b,d)$ is diagonalizable over such a $\mathbb{K}$, then it is also orthogonally diagonalizable).
Now, we analyze $T(a,b)$. If $\text{char}(\mathbb{K})=2$, then $T(a,b)$ is diagonalizable over $\mathbb{K}$ if and only if $b=0$, in which case $T(a,b)$ is also orthogonally diagonalizable. If $\text{char}(\mathbb{K})\neq 2$, then $T(a,b)$ is diagonalizable over $\mathbb{K}$ if and only if $b=0$ or $\sqrt{-1}\in\mathbb{K}$; however, when $b\neq 0$, $T(a,b)$ is never orthogonally diagonalizable over $\mathbb{K}$, even when $\mathbb{K}$ contains $\sqrt{-1}$, because it is not symmetric. Unfortunately, when $b\neq 0$ and $\sqrt{-1}\in\mathbb{K}$, the eigenspaces of both $T(a,b)$ and $\big(T(a,b)\big)^\top$ are identical: $$\mathbb{K}\,\begin{bmatrix}1\\+\sqrt{-1}\end{bmatrix}\text{ and }\mathbb{K}\,\begin{bmatrix}1\\-\sqrt{-1}\end{bmatrix}\,.$$
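To verify this, assuming $b\neq 0$, $\sqrt{-1}\in\mathbb{K}$, and $\text{char}(\mathbb{K})\neq 2$, one computes $$T(a,b)\begin{bmatrix}1\\\pm\sqrt{-1}\end{bmatrix}=\begin{bmatrix}a\pm b\sqrt{-1}\\-b\pm a\sqrt{-1}\end{bmatrix}=\big(a\pm b\sqrt{-1}\big)\begin{bmatrix}1\\\pm\sqrt{-1}\end{bmatrix}\,,$$ so these vectors are eigenvectors of $T(a,b)$ for the distinct eigenvalues $a\pm b\sqrt{-1}$; replacing $b$ by $-b$ handles $\big(T(a,b)\big)^\top=T(a,-b)$, which swaps the two eigenvalues but keeps the same eigenspaces. Note also that both eigenvectors are isotropic: $\langle v,v\rangle=1+\big(\pm\sqrt{-1}\big)^2=0$.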
The counterexamples in $\text{Mat}_{2\times 2}(\mathbb{K})$ for Question 2 above can be extended to counterexamples in $\text{Mat}_{n\times n}(\mathbb{K})$ whenever $n>2$, e.g., by padding with an identity block, as sketched below. Neither $S(a,b,d)$ nor $T(a,b)$, even when diagonalizable over $\mathbb{K}$ but not orthogonally diagonalizable over $\mathbb{K}$, provides a counterexample for Question 3.
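For instance, here is a sketch of the padding construction over $\mathbb{Q}$ (using the theorem above, so assuming $\text{char}(\mathbb{K})\neq 2$): the block-diagonal matrix $$B:=\begin{bmatrix}S(6,4,0)&0\\0&I_{n-2}\end{bmatrix}\in\text{Mat}_{n\times n}(\mathbb{Q})$$ is symmetric and diagonalizable over $\mathbb{Q}$ (its eigenvalues are $8$, $-2$, and $1$), but its eigenspace for the eigenvalue $8$ is still spanned by $\begin{bmatrix}4&2&0&\cdots&0\end{bmatrix}^\top$, whose nonzero scalar multiples all have norm $20\,c^2$, never a perfect square in $\mathbb{Q}$; hence condition (b) fails, and $B$ is not orthogonally diagonalizable over $\mathbb{Q}$.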
Here's a proof of Question 2' in the affirmative: if $M$ is diagonalisable and symmetric over $K$ algebraically closed of characteristic not equal to $2$, then it can be diagonalised by an orthogonal matrix. View our $K$-vector space $V$ as having the nondegenerate bilinear form $\langle e_i,e_j\rangle=\delta_{i,j}$, where the $e_i$ are the standard basis which we use to describe our linear maps as matrices. Then $M$ being symmetric is to say that $\langle Mv,w\rangle=\langle v,Mw\rangle$ with respect to this form. From this property, we see that distinct eigenspaces are orthogonal with respect to this form, since $$\lambda_1\langle v,w\rangle=\langle Mv,w\rangle=\langle v,Mw\rangle=\lambda_2 \langle v,w\rangle$$ for eigenvectors $v$ and $w$ with distinct eigenvalues $\lambda_1\neq\lambda_2$, forcing $\langle v,w\rangle=0$. So, since $M$ is diagonalisable, $V$ splits into an orthogonal sum of eigenspaces $V_\lambda$, where orthogonal is with respect to our form. So within each eigenspace, our form restricts to a nondegenerate bilinear form, and we can find orthogonal bases within each $V_\lambda$. This is a theorem about nondegenerate bilinear forms, a proof of which can be found in Serre's "A Course in Arithmetic". EDIT: This relies on the characteristic not being $2$; I'm not sure what the result looks like in that situation. So we now have an orthogonal basis $\{v_i\}$ of $V$ such that each $v_i$ is an eigenvector for $M$. Now, if $\langle v_i,v_i\rangle=a_i$, replace $v_i$ by $v_i':=\frac{1}{\sqrt{a_i}}v_i$ to get a new orthogonal basis $\{v_i'\}$ of $V$, and note that $\langle v_i',v_i'\rangle =1$ for all $i$, so these form an orthonormal basis with respect to this form.
Now take the linear map $P$ sending $e_i\mapsto v_i'$. By construction, $P^{-1}MP$ is diagonal with respect to the basis $e_i$, and since the $v_i'$ are orthonormal, the matrix $P$ is an orthogonal matrix, giving the result. It seems that, for this argument to work, we only need $K$ to be closed under taking square roots; the only point where we used algebraic closedness was to scale our $v_i$.
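To spell out the last step: the columns of $P$ are the $v_i'$, so $$\big(P^\top P\big)_{i,j}=\langle v_i',v_j'\rangle=\delta_{i,j}\,,$$ i.e., $P^\top P=I$, which is exactly the orthogonality condition $P^\top=P^{-1}$ in Definition 1.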