Square root of positive definite nonsymmetric matrix
Let $N$ be a nilpotent matrix in $M_n({\mathbb R})$, such that $(I+N)^2$ is “positive definite” (but not necessarily symmetric) in the sense that $\langle X,(I+N)^2X\rangle$ is positive for any nonzero $X\in{\mathbb R}^n$ (here $\langle\cdot,\cdot\rangle$ denotes the usual scalar product on ${\mathbb R}^n$). Is it true that $I+N$ must then also be “positive definite”?
UPDATE: To clarify the question, I know about http://mathworld.wolfram.com/PositiveDefiniteMatrix.html which makes many promising related statements, but without any proof. This webpage has a list of references at the bottom, but I’ve checked them one by one and was unable to find this particular result.
I think this question is a rather special case of the properties of the "unique positive square root function" described in the link. I'm hoping for an “elementary” proof.
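For what it's worth, here is a quick numerical experiment (a Python/NumPy sketch of my own, not a proof) that supports the claim; `is_pd` tests positive definiteness in the above sense via the symmetric part:

```python
import numpy as np

rng = np.random.default_rng(0)

def is_pd(M):
    # <X, MX> > 0 for all nonzero X  iff  the symmetric part M + M^T is SPD
    return np.all(np.linalg.eigvalsh(M + M.T) > 0)

n, hits = 4, 0
for _ in range(5000):
    # strictly upper triangular, hence nilpotent, N
    N = 0.5 * np.triu(rng.normal(size=(n, n)), k=1)
    A = np.eye(n) + N
    if is_pd(A @ A):      # hypothesis: (I+N)^2 is positive definite
        assert is_pd(A)   # claim: then I+N is positive definite
        hits += 1
print(hits, "random instances satisfied the hypothesis; no counterexample found")
```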
Solution 1:
I believe $I+N$ must indeed be "positive definite".
Apply induction on the dimension. We are given a nilpotent matrix $N'$ which by Schur decomposition we can assume is strictly upper triangular. As a block matrix, $N'=\begin{pmatrix}N&x\\0&0\end{pmatrix}$ with $N$ a square strictly upper triangular matrix and $x$ a column vector. By assumption $(I+N')^2+(I+N')^{2t}$ is SPD (SPD=symmetric positive definite; superscript $2t$ means square and transpose). We wish to show that $(I+N')+(I+N')^t$ is SPD.
Let $U=I+N$ and note $$\begin{pmatrix}U&x\\0&1\end{pmatrix}^2+\begin{pmatrix}U&x\\0&1\end{pmatrix}^{2t}=\begin{pmatrix}U^2+U^{2t}&(I+U)x\\x^t(I+U)^t&2\end{pmatrix}.$$
By the Schur complement characterization of SPD matrices we have by assumption:
- $U^2+U^{2t}$ is SPD
- $2-x^t(I+U)^t(U^2+U^{2t})^{-1}(I+U)x>0$
And we need to show:
- $U+U^t$ is SPD
- $2-x^t(U+U^t)^{-1}x>0.$
$U+U^t$ is SPD by induction. The latter inequality would follow from $$x^t(I+U)^t(U^2+U^{2t})^{-1}(I+U)x\geq x^t(U+U^t)^{-1}x$$ Letting $y=(I+U)x,$ this is $$y^t(U^2+U^{2t})^{-1}y\geq y^t(I+U^t)^{-1}(U+U^t)^{-1}(I+U)^{-1}y$$
This would follow from this result and symmetric positive definiteness of $$(I+U)(U+U^t)(I+U^t)-(U^2+U^{2t})=U+U^t+2UU^t+U(U+U^t)U^t.$$ Since $U+U^t$ is SPD and $U=I+N$ is invertible, both $U(U+U^t)U^t$ and $UU^t$ are SPD. So we are done.
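As a sanity check, the displayed identity can be verified numerically; here is a small NumPy sketch (the random unipotent $U$ is just a test case of my choosing):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5
I = np.eye(n)
U = I + np.triu(rng.normal(size=(n, n)), k=1)  # unipotent U = I + N

lhs = (I + U) @ (U + U.T) @ (I + U.T) - (U @ U + U.T @ U.T)
rhs = U + U.T + 2 * U @ U.T + U @ (U + U.T) @ U.T
assert np.allclose(lhs, rhs)  # the identity holds for any U
```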
See loup blanc's answer for a generalization of the above argument from unipotent matrices to all matrices with positive spectrum.
Here is a reference with a nice proof: Uniqueness of matrix square roots and an application by Charles R. Johnson, Kazuyoshi Okubo, Robert Reams, Theorem 7. It uses the following theorem:
Theorem (Lyapunov). Let $A\in\mathbb C^{n\times n}$ (not necessarily Hermitian) and let $X\in\mathbb C^{n\times n}$ be Hermitian. If the eigenvalues of $A$ all have positive real part and $AX+XA^*$ is positive definite, then $X$ is positive definite.
In particular if $A$ has eigenvalues with positive real part and $A(A+A^*)+(A+A^*)A^*=A^2+(A^*)^2+2AA^*$ is positive definite, then $A+A^*$ is positive definite.
Proof 1 (sketch) following Horn and Johnson's Topics in Matrix Analysis:
Suppose not. We can take a kind of Jordan normal form, but making any above-diagonal $1$'s arbitrarily small; since the eigenvalues of $A$ have positive real part, this gives a non-singular $S$ such that $S^{-1}AS+(S^{-1}AS)^*$ is positive definite. Setting $G=SS^*$ we find that $AG+GA^*$ is positive definite. For $0\leq \theta\leq 1$ define $X_{\theta}=\theta G+(1-\theta)X.$ Note:
- $X_0=X$ is not positive definite
- $X_1=G$ is positive definite
- $AX_\theta+X_\theta A^*$ is a convex combination of positive definite matrices so must be positive definite.
All the matrices $X_\theta$ have real eigenvalues, and by continuity some $X_\theta$ must have $0$ as an eigenvalue: $X_\theta v=0$ for some non-zero $v.$ This implies $v^*(AX_\theta+X_\theta A^*)v=0,$ contradicting positive definiteness of $AX_\theta+X_\theta A^*.$
Proof 2.
Consider the function defined by $f(W)=\int_0^\infty e^{-tA}We^{-tA^*}dt$ (the integral converges because the eigenvalues of $A$ have positive real part). We can compute
\begin{align*} W&=-\frac d{d\tau}\Bigr|_{\tau=0}\int_{\tau}^\infty e^{-tA}We^{-tA^*}dt\\ &=-\frac{d}{d\tau}\Bigr|_{\tau=0}\int_0^\infty e^{-\tau A}e^{-tA}We^{-tA^*}e^{-\tau A^*}dt\\ &=Af(W)+f(W)A^*. \end{align*}
This means $f$ is a right inverse of the map $L:X\mapsto AX+XA^*$; equivalently, $L$ is a left inverse of $f.$ Since $f$ is an $\mathbb R$-linear map from the space of Hermitian matrices to itself and has a left inverse, it must be an isomorphism, and hence so is $L.$ So the solution $X$ of $AX+XA^*=W$ is unique. And if $W$ is positive definite, then so is $e^{-tA}W^{1/2}W^{1/2}e^{-tA^*}$ for every $t,$ so $f(W)$ is positive definite by construction. This proves that if $AX+XA^*$ is positive definite then so is $X.$
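As a numerical illustration of Lyapunov's theorem (not part of the proof): SciPy's `solve_continuous_lyapunov` solves $AX+XA^*=Q$, so one can check that $X$ comes out positive definite whenever $A$ has spectrum in the right half-plane and $Q$ is positive definite. A sketch, with an arbitrarily chosen shifted random $A$:

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

rng = np.random.default_rng(2)
n = 5
# A with all eigenvalues of positive real part: random matrix plus a large shift
A = rng.normal(size=(n, n)) + 4 * n * np.eye(n)
W = np.eye(n)  # a positive definite right-hand side

X = solve_continuous_lyapunov(A, W)     # solves A X + X A^H = W
assert np.allclose(A @ X + X @ A.T, W)  # X is the (unique) solution
# Lyapunov: X must be positive definite
assert np.all(np.linalg.eigvalsh((X + X.T) / 2) > 0)
```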
Solution 2:
A very partial answer.
Proposition 1. Let $A$ be a P.D. matrix. Then there is a unique $B$ s.t. $B^2=A$ and every eigenvalue $\lambda$ of $B$ satisfies $Re(\lambda)>0$; moreover, if $A$ admits a P.D. square root, then it is necessarily $B$.
Proof. The key point is: if $U\in M_n$ is P.D., then every eigenvalue $\mu$ of $U$ satisfies $Re(\mu)>0$. (Beware, the converse is false: e.g. $\begin{pmatrix}1&5\\0&1\end{pmatrix}$ has spectrum $\{1\}$ but is not P.D.)
In particular, our $A$ has no eigenvalues in $(-\infty,0]$ and, therefore, admits a unique square root $B$ s.t. every eigenvalue $\lambda$ of $B$ satisfies $Re(\lambda)>0$ (cf. Higham, Functions of Matrices). Thus $B$ is the only candidate that can be P.D.
Remark. That does not imply (despite Mathworld's article) that $A$ admits a P.D. square root.
EDIT 1. @Dap gave a very pretty proof. I had thought of attempting such an induction, but I was sure that it would not work; which just shows that, in mathematics, you have to believe!
Moreover, using Dap's proof (mutatis mutandis), we can prove the following improvement
Proposition 2. Let $A\in M_n(\mathbb{R})$ be a P.D. matrix that satisfies $spectrum(A)\subset (0,+\infty)$. Then its principal square root (cf. Proposition 1.) is P.D.
Proof. Note that $B$ (the principal square root of $A$) has $>0$ eigenvalues and that it is triangularizable over $\mathbb{R}$ with a change of orthonormal basis.
Let $N'=\begin{pmatrix}N&x\\0&\alpha\end{pmatrix}$ (where $\alpha>0$ and $N$ is upper triangular with positive diagonal) be the matrix $B$ after triangularization. We follow Dap's proof.
We know that $N^2+N^{2T}>0,2\alpha^2-x^T(N+\alpha I)^T(N^2+N^{2T})^{-1}(N+\alpha I)x>0$ and we want to show that
$N+N^T>0,2\alpha-x^T(N+N^T)^{-1}x>0$.
It suffices to show that
$\Delta=x^T(N+\alpha I)^T(N^2+N^{2T})^{-1}(N+\alpha I)x- \alpha\,x^T(N+N^T)^{-1}x$ is non-negative: then $\alpha\,x^T(N+N^T)^{-1}x<2\alpha^2$, which gives the second inequality.
We find $$(N+\alpha I)(N+N^T)(N+\alpha I)^T-\alpha(N^2+N^{2T})=N(N+N^T)N^T+2\alpha NN^T+\alpha^2(N+N^T),$$ which is P.D. since $N+N^T>0$ by induction and $N$ is invertible; as in Dap's argument this yields $\Delta\geq 0$, and we are done.
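A quick NumPy check of the matrix identity used here (with a randomly chosen upper triangular $N$ with positive diagonal and an arbitrary $\alpha>0$):

```python
import numpy as np

rng = np.random.default_rng(3)
n, alpha = 5, 0.7
I = np.eye(n)
# upper triangular N with positive diagonal, as after real Schur triangularization
N = np.triu(rng.normal(size=(n, n)), k=1) + np.diag(rng.uniform(0.5, 2.0, size=n))

lhs = (N + alpha * I) @ (N + N.T) @ (N + alpha * I).T - alpha * (N @ N + N.T @ N.T)
rhs = N @ (N + N.T) @ N.T + 2 * alpha * N @ N.T + alpha**2 * (N + N.T)
assert np.allclose(lhs, rhs)  # the identity holds for any N and alpha
```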
Remark 1. It remains to study the case when a P.D. matrix $A$ has non-real eigenvalues with positive real part.
Remark 2. Of course, Dap deserves the bounty.
EDIT 2. I just read the article ([1], by Johnson et al.) cited by @Dap. Ewan will be happy; the result given by MathWorld is true (which surprises me).
If $A\in M_n(\mathbb{C})$, let $F(A)=\{x^*Ax;||x||=1,x\in \mathbb{C}^n\}$ be its numerical range, or field of values. Note that $A$ is P.D. iff $F(A)\subset \{z;Re(z)>0\}$ iff $A+A^*$ is H.P.D.
[1] Theorem 7 (Kato, Masser, Neumann). If $A\in M_n(\mathbb{C})$ is s.t. $F(A)\cap (-\infty,0]=\emptyset$ (which is the case when $A$ is P.D.), then its principal square root is P.D.
[1] Corollary 8 (Johnson et al.). If $A\in M_n(\mathbb{R})$ is s.t. $F(A)\cap (-\infty,0]=\emptyset$ (which is the case when $A$ is P.D. in the sense: for every $x\in \mathbb{R}^n\setminus\{0\}$, $x^TAx>0$), then its principal square root (which is a real matrix) is P.D.
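Corollary 8 is easy to test numerically: SciPy's `sqrtm` computes the principal square root. A sketch (the nonsymmetric P.D. test matrix, built as SPD plus skew-symmetric, is my own choice):

```python
import numpy as np
from scipy.linalg import sqrtm

rng = np.random.default_rng(4)
n = 5
S = rng.normal(size=(n, n)); S = S @ S.T + n * np.eye(n)  # SPD part
K = rng.normal(size=(n, n)); K = K - K.T                  # skew-symmetric part
A = S + K  # real and P.D. (x^T A x = x^T S x > 0), but not symmetric

B = np.asarray(sqrtm(A)).real  # principal square root; real since spectrum(A) avoids (-inf, 0]
assert np.allclose(B @ B, A)
assert np.all(np.linalg.eigvalsh(B + B.T) > 0)  # B is P.D., as the corollary predicts
```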