Is the set $\{X \in \mathcal{M}({m \times n}) : \rho(M-NX) < 1\} $ connected?

$\newcommand{\NN}{{\mathbb{N}}}\newcommand{\CC}{{\mathbb{C}}}\newcommand{\RR}{{\mathbb{R}}}\newcommand{\QQ}{{\mathbb Q}}\newcommand{\ra}{\rightarrow}\newcommand{\ds}{\displaystyle}\newcommand{\mat}[1]{\left(\begin{matrix}#1\end{matrix}\right)}\newcommand{\lam}{\lambda}\renewcommand{\P}{{\mathcal P}}\newcommand{\N}{{\mathcal N}}\newcommand{\M}{{\mathcal M}}\renewcommand{\L}{{\mathcal L}}\newcommand{\EQ}[2]{\begin{equation} {{#2}\label{#1}} \end{equation}}\newcommand{\eps}{\varepsilon}\newcommand{\tabb}[1]{\begin{matrix}#1\end{matrix}} \DeclareMathOperator{\rank}{rank} \DeclareMathOperator{\bldiag}{bldiag}$ Theorem: $E = \{X \in \mathcal{M}(m \times n; \mathbb R) : \rho(M-NX) < 1\}$ is connected.

The proof is surprisingly difficult (at least the one I found). After some preliminaries, I will first treat the case $\rank(N)=1$, which is relatively simple, then show that the case $\rank(N)=2$ implies the general case, next treat the case $n=3$ as a preparation and finally prove the Theorem for $\rank(N)=2$ and all $n$. The case $\rank(N)=n$ (that is essentially $M=0$ and $N=I$, see below) is trivial, because any $X\in E$ can be connected to $0$ by the path $X_s=sX$, $s\in[0,1]$, which stays in $E$. It will not be mentioned again.

As explained in the linked answer to the question of the closure of $E$, we can assume that $N=\mat{I_m\\0}$ and that the first $m$ rows of $M$ vanish.

Here a further reduction is useful. Consider the matrix $\tilde M(\lam)=\lam\bldiag(0,I_{n-m})-M$ which equals $\lam I-M$ with the first $m$ diagonal elements replaced by zeros. Consider also the gcd $G(\lam)$ of all subdeterminants of $\tilde M(\lam)$ of size $n-m$.

Claim 1: It is sufficient to treat the case that $G(\lam)=1$.

This observation simplifies the presentation here; in the linked answer it was not needed.

Proof of Claim 1. Suppose that $G(\lam)\neq 1$ and suppose first that it has a real zero $\mu$. Since all subdeterminants of size $n-m$ of $\tilde M(\mu)$ vanish, its last $n-m$ rows are linearly dependent. Thus there exists a nonzero row vector $v=(0,\dots,0,v_{m+1},\dots,v_n)$ such that $v\,\tilde M(\mu)=0$ and hence $v\,M=\mu\,v$. We can assume that $v_n\neq0$: there exists some $k>m$ such that $v_k\neq0$, and we may exchange rows $k$ and $n$ and columns $k$ and $n$ of $M$ as well as the components $v_k$ and $v_n$ of $v$; since $k>m$, this similarity keeps the first $m$ rows of $M$ equal to zero. Consider now the matrix $T$ with last row $v$ and first $n-1$ rows $e_j$, $j=1,\dots,n-1$, where $e_j$ denotes the $j$th unit vector. Let $M'=T\,M\,T^{-1}$. A small calculation (using $M'\,T=T\,M$) shows that the first $m$ rows of $M'$ vanish and the last row of $M'$ is $\mu e_n$. Since $TN=N$, the matrices $M-NX$ and $M'-NX'$ with $X'=XT^{-1}$ are similar and hence have the same spectral radius. By construction, $M'-NX'$ is block triangular for all $X'$ and always has the eigenvalue $\mu$. This eigenvalue must satisfy $|\mu|<1$ because there exists $X'_0$ such that $\rho(M'-NX'_0)<1$. Therefore the last row and the last column of $M'-NX'$ are irrelevant for the decision whether $\rho(M'-NX')<1$ or not. We have shown that $n$ can be reduced if $G(\lam)$ has a real zero.
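
To illustrate the reduction in the real-zero case, here is a small sympy sketch; the $4\times4$ matrix $M$ (with $m=2$) is an illustrative choice for which $v=(0,0,1,1)$ is a left eigenvector with eigenvalue $\mu=1/2$, so that $G(\lam)$ has the real zero $1/2$:

```python
import sympy as sp

# Illustrative example with n = 4, m = 2: the first two rows of M vanish and
# v = (0, 0, 1, 1) is a left eigenvector for mu = 1/2, so G(lam) vanishes at 1/2.
mu = sp.Rational(1, 2)
M = sp.Matrix([[0, 0, 0, 0],
               [0, 0, 0, 0],
               [1, 2, 3, -1],
               [-1, -2, sp.Rational(-5, 2), sp.Rational(3, 2)]])
v = sp.Matrix([[0, 0, 1, 1]])
assert v*M == mu*v                     # v M = mu v

# T has last row v and unit vectors as its first n-1 rows; then M' = T M T^{-1}.
T = sp.eye(4)
T[3, :] = v
Mp = T*M*T.inv()
N = sp.Matrix([[1, 0], [0, 1], [0, 0], [0, 0]])
assert T*N == N                        # the transformation leaves N unchanged
print(Mp)                              # first two rows vanish, last row is mu*e_4
```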

If $G(\lam)$ has a nonreal zero $\mu=\alpha+\beta i$ then there is a left eigenvector $v=r+i s$, where the first $m$ components of $r$ and $s$ vanish. Now $(r+i s)M=(\alpha +\beta i)(r + is)$ means that $rM=\alpha r-\beta s$ and $sM=\beta r +\alpha s$. We proceed as above with $T$ having the last two rows $r,s$. The result is a block triangular matrix $M'$ with lower right block $\mat{\alpha&-\beta\\\beta&\alpha}$ and again $n$ can be reduced.

As we have shown that $n$ can be reduced if $G(\lam)\neq 1$, this can be repeated until $G(\lam)=1$ and Claim 1 is proved.

Case $\mathbf {m=1}$: First, we recall some considerations from the linked answer. We use the Laplace (or cofactor) expansion along the first row of the determinant giving the characteristic polynomial $p_{M-NX}(\lam)$: $${\det(\lam I - M+ NX)=(x_1+\lam)C_1(\lam)+\sum_{k=2}^n x_kC_k(\lam)}$$ where $X=(x_1,\dots,x_n)$ and $C_j(\lam)$, $j=1,\dots,n$, are the $1,j$ cofactors of $\lam I - M$, that is $C_j(\lam)=(-1)^{j+1}M_{1j}(\lam)$, where $M_{1j}(\lam)$ is the determinant of the $n-1$ by $n-1$ matrix obtained from $\lam I-M$ by deleting its first row and its $j$-th column. Observe that $C_1(\lam)\in \lam^{n-1}+\P_{n-1}$ and $C_j(\lam)\in\P_{n-1}$, where $\P_k$ denotes the vector space of all polynomials of degree less than $k$. Before continuing the proof, we need an auxiliary statement.

Lemma 1 Let $D(\lam)$ denote the gcd of $C_j(\lam)$, $j=1,\dots,n$ and let $d$ denote its degree. Then the mapping $${L:\RR^n\to D(\lam)\P_{n-d},\ \ L(X)=\sum_{k=1}^n x_kC_k(\lam)\mbox{ if } X=(x_1,\dots,x_n)}$$ is surjective (or onto).

Remark: The characteristic polynomial of $M-NX$ is then given by $p_{M-NX}(\lam)=\lam C_1(\lam)+L(X)$.

For the proof, see the linked answer.

Now we continue with the proof in the case $m=1$. By Claim 1, we can assume that $G(\lam)=1$, and hence, by Lemma 1, $L$ is an isomorphism (note that for $m=1$ the gcd $D(\lam)$ of the cofactors coincides with $G(\lam)$, since the subdeterminants of size $n-1$ of $\tilde M(\lam)$ that do not use its zero first row are, up to sign, the $M_{1j}(\lam)$). It induces a bijection between the vectors $X$ and the characteristic polynomials $p_{M-NX}(\lam)$ as stated in the above remark.

Consider now any element $X_1$ of $E$ and factor $q_1(\lam):=p_{M-NX_1}(\lam)=\prod_{k=1}^{n}(\lam-\mu_k)$ with certain (not necessarily real) $\mu_k$, $|\mu_k|<1$. Now define for $s\in[0,1]$ the polynomials $q_s(\lam)=\prod_{k=1}^{n}(\lam-s\mu_k)$. By construction, their coefficients are real and their zeros are in the open unit disk. Therefore the matrices $X_s$ corresponding to them by $q_s(\lam)=\lam C_1(\lam)+L(X_s)$ are elements of $E$ and connect $X_1$ to the uniquely determined $X_0$ with $p_{M-NX_0}(\lam)=\lam^n$. This shows that all elements of $E$ are connected to $X_0$ within $E$ and proves the Theorem for $m=1$. Actually, this argument can be used to show that $E$ is even simply connected in the case $m=1$.
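
Here is a minimal numerical sketch of the $m=1$ construction (numpy assumed; the matrix $M$ and the spectrum $\mu_k$ are random, purely illustrative data):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4

# Normal form for m = 1: the first row of M vanishes, N = e_1 (a column vector).
M = rng.normal(size=(n, n))
M[0] = 0.0

def MNX(x):                        # the matrix M - N X for a row vector X = x
    A = M.copy(); A[0] -= x
    return A

def rho(A):
    return max(abs(np.linalg.eigvals(A)))

def coeffs(x):                     # non-leading coefficients of det(lam I - M + N X)
    return np.poly(MNX(x))[1:]

# By the Laplace expansion, x -> coeffs(x) is affine; for generic M (G(lam) = 1)
# it is a bijection, so it can be inverted by plain linear algebra (Lemma 1).
b = coeffs(np.zeros(n))
A0 = np.column_stack([coeffs(e) - b for e in np.eye(n)])
to_X = lambda c: np.linalg.solve(A0, c - b)

# Take some X1 in E and connect it to X0 along q_s(lam) = prod_k (lam - s*mu_k).
mu = 0.9 * (rng.random(n) - 0.5)           # a real spectrum inside the unit disk
X1 = to_X(np.poly(mu)[1:])
assert rho(MNX(X1)) < 1

for s in np.linspace(1.0, 0.0, 11):
    Xs = to_X(np.poly(s * mu)[1:])
    assert rho(MNX(Xs)) < 1                # the whole path stays in E
print("path from X1 to X0 stays in E")
```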

The case $\mathbf{m=2}$ implies the Theorem for all $\mathbf m$. Again, we suppose that $G(\lam)=1$ (see Claim 1). Consider two elements $X_a,X_b$ of $E$, i.e. we have $\rho(M-NX_a)<1$ and $\rho(M-NX_b)<1$. By passing to sufficiently close matrices (recall that $E$ is open), we can assume that the elements of $X_a,X_b$ are algebraically independent transcendentals over $\QQ(M)$. Then we construct for $k=2,\dots,m-1$ matrices $X_k$ such that $X_a$ can be connected to $X_b$ in $E$ via these auxiliary matrices, modifying only two rows at a time.

For $k\in\{2,\dots,m-1\}$, we first use the $m$ by $n$ matrix $\tilde X_k$ with the first $k$ rows from $X_b$ and the last $m-k$ rows from $X_a$. The formula is $\tilde X_k=\bldiag(I_k,0)X_b+\bldiag(0,I_{m-k})X_a$. There is no reason why $\tilde X_k$ should be in $E$, but we claim

Claim 2: The gcd of all $1,j$ cofactors $C_j^{(k)}(\lam)$ of $\lam I - (M-N\tilde X_k)$ is 1.

The proof of this Claim is very similar to the considerations below Corollary 2 in the linked answer. Consider the gcd $D(\lam)$ of $C_j^{(k)}(\lam)$, $j=1,\dots,n$. A priori, $D(\lam)$ is a polynomial in $\lam$ with coefficients in $\QQ(M,\hat X_k)$ because the $C_j^{(k)}(\lam)$ are; here $\hat X_k$ denotes the matrix obtained from $\tilde X_k$ by erasing the first row. The $C_j^{(k)}(\lam)$ are actually in $K[\hat X_k][\lam]$, where $K=\QQ(M)$. Since $K$ is a field, $K[\hat X_k]$ is a unique factorisation domain because the elements of $\hat X_k$ are algebraically independent transcendentals. We multiply $D(\lam)$ by the lcm of the denominators of the coefficients (seen as elements of $K(\hat X_k)$), and then divide by its content (that is the gcd of the coefficients seen as elements of $K[\hat X_k]$). We call the result $D(\lam)$ again. Then $D(\lam)\in K[\hat X_k][\lam]$ has content 1 and divides every $C_j^{(k)}(\lam)$ in $K(\hat X_k)[\lam]$ by definition. By Gauss's Lemma, it follows that $D(\lam)$ divides every $C_j^{(k)}(\lam)$ in $K[\hat X_k][\lam]$.

Now, for $j=1,\dots,n$, $C_j^{(k)}(\lam)$ does not contain any element of the $j$-th column of $\hat X_k$. Therefore $D(\lam)$ must actually be an element of $K[\lam]$ and it divides the $n$ cofactors $C_j^{(k)}(\lam)$ in $K[\hat X_k][\lam]$. Therefore $D(\lam)$ must also divide the $1,j$ cofactors $\bar C_j(\lam)$ of $\lam I-(M-N\bar X)$, where $\bar X$ is any $m$ by $n$ matrix. This follows by substitution of the elements of $\hat{\bar X}$ for the corresponding elements of $\hat X_k$. By choosing the rows of $\hat{\bar X}$ as appropriate unit vectors, we can achieve that one of the cofactors is any of the $n-m$ by $n-m$ subdeterminants of $\tilde M(\lam)$ we want. Therefore $D(\lam)$ divides any of the latter subdeterminants and we have assumed that their gcd $G(\lam)$ is 1. This proves Claim 2.

At this point, Lemma 1 applies and yields that the mapping $L^{(k)}:\RR^n\to \P_n$, $V_1=(x_1,\dots,x_n)\mapsto L^{(k)}(V_1)=\sum_{j=1}^n x_j C_j^{(k)}$ is bijective. We choose a vector $V_1^{(k)}$ such that $\lam C_1^{(k)}+L^{(k)}(V_1^{(k)})=\lam^n$. This means that the matrix $X_k$ obtained from $\hat X_k$ by adding the first row $V_1^{(k)}$ gives a nilpotent $M-NX_k$; in any case $X_k$ is an element of $E$. We complete $X_2,\dots,X_{m-1}$ by $X_1=X_a$ and $X_m=X_b$.

For any $k\in\{2,\dots,m\}$, the matrices $X_{k-1}$ and $X_{k}$ have only two different rows: the first and the $k$th. Both are in $E$ as we have seen. Here the case $m=2$ can be applied after exchanging the second and the $k$th rows and columns, respectively, of the two matrices (the rows of $X$ that are not modified can be absorbed into $M$, so that only two rows remain as unknowns). Hence they are connected by a path in $E$. Combining these $m-1$ paths, we obtain a path in $E$ connecting $X_a$ and $X_b$. This proves the Theorem for all $m\geq3$ if it is true for $m=2$.
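
A minimal numerical sketch of this chaining (numpy assumed; the random matrices are illustrative stand-ins for the "generic" perturbed $X_a,X_b$, and each auxiliary first row is chosen so that $M-NX_k$ is even nilpotent):

```python
import numpy as np

rng = np.random.default_rng(2)
n, m = 5, 3

M = rng.normal(size=(n, n)); M[:m] = 0.0               # normal form: first m rows of M vanish
N = np.vstack([np.eye(m), np.zeros((n - m, m))])
rho = lambda X: max(abs(np.linalg.eigvals(M - N @ X)))

def coeffs(first_row, rest):                            # char. poly. coefficients; affine in the first row
    return np.poly(M - N @ np.vstack([first_row, rest]))[1:]

def complete(rest):                                     # first row making M - N X nilpotent (char. poly. lam^n)
    b = coeffs(np.zeros(n), rest)
    A = np.column_stack([coeffs(e, rest) - b for e in np.eye(n)])
    return np.linalg.solve(A, -b)

Ra, Rb = rng.normal(size=(m - 1, n)), rng.normal(size=(m - 1, n))
Xa, Xb = np.vstack([complete(Ra), Ra]), np.vstack([complete(Rb), Rb])   # two elements of E

chain = [Xa]
for k in range(2, m):                                   # auxiliary matrices X_2, ..., X_{m-1}
    rest = np.vstack([Xb[1:k], Xa[k:]])                 # rows 2..k from X_b, the remaining rows from X_a
    chain.append(np.vstack([complete(rest), rest]))
chain.append(Xb)

for Xk in chain:
    assert rho(Xk) < 1                                  # every auxiliary matrix lies in E
for Xp, Xq in zip(chain, chain[1:]):
    print(np.flatnonzero((Xp != Xq).any(axis=1)))       # consecutive matrices differ in only two rows
```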

Case $\mathbf{n=3,\ m=2}$. We denote the third row of $M$ by $\mat{a&b&c}$, the second row of $X$ by $\mat{d&e&f}$. By Claim 1, we can assume that $a\neq0$ or $b\neq0$. In order to apply Lemma 1, it is interesting to study the gcd of all $2\times2$ subdeterminants of $\mat{d&\lam+e&f\\-a&-b&\lam-c}$, in particular to establish a condition for this gcd to be 1. As above, it is not equal to 1 iff there is a (possibly complex) number $\mu$ such that $B(\mu)=\mat{d&\mu+e&f\\-a&-b&\mu-c}$ has rank 1. If $a=0$, then $b\neq0$ and $B(\mu)$ has rank 2 for all $\mu$ iff $d\neq0$. If $a\neq0$, we first add a multiple of the second row of $B(\mu)$ to the first to obtain $\mat{0&\mu+e-\frac{db}a&f+\frac da \mu-\frac{dc}a\\-a&-b&\mu-c}$ and then add a multiple of the second column to the third one to obtain $\tilde B(\mu)=\mat{0&\mu+e-\frac{db}a&f-\frac{de}a+\frac{d^2b}{a^2}- \frac{dc}a\\-a&-b&\mu-c+\frac{db}a}$, a matrix of the same rank as $B(\mu)$. It has rank 2 for all $\mu$ iff the third element of the first row is nonzero, that is iff $\Delta(X)=a^2f-{de}a+{d^2b}-{dc}a\neq0$. Actually the condition in the case $a=0$ can be included in the latter one since $b\neq0$. Observe that $\Delta(X)$ is closely related to the determinant of the matrix associated to the operator $L$ of Lemma 1 with respect to canonical bases of $\RR^n$ and $\P_n$. This is not surprising because $L$ is invertible iff this determinant is nonzero and, by Lemma 1, iff the gcd of the $1,j$ cofactors of $\lam I-M+NX$ equals 1. We call the set where $\Delta(X)=0$ the singular surface.
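
A quick symbolic check of this computation (sympy assumed, purely illustrative): on the singular surface the three $2\times2$ minors acquire a common root, namely the zero of the factor $\lam+e-\frac{db}a$ that reappears below as $D(\lam)$.

```python
import sympy as sp

a, b, c, d, e, f, lam = sp.symbols('a b c d e f lam')

# The three 2x2 minors of B(lam) = [[d, lam+e, f], [-a, -b, lam-c]]
m12 = d*(-b) - (lam + e)*(-a)        # columns 1,2
m13 = d*(lam - c) - f*(-a)           # columns 1,3
m23 = (lam + e)*(lam - c) - f*(-b)   # columns 2,3

Delta = a**2*f - d*e*a + d**2*b - d*c*a

# On the singular surface Delta = 0 (solved for f, assuming a != 0) all three
# minors vanish at lam = d*b/a - e, so they share the factor lam + e - d*b/a.
f_sing = sp.solve(sp.Eq(Delta, 0), f)[0]
root = d*b/a - e
for minor in (m12, m13, m23):
    print(sp.simplify(minor.subs({f: f_sing, lam: root})))   # expected: 0, three times
```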

Consider now two matrices $X_1,X_2\in E$. If we can find a path connecting their second rows $\mat{d_1&e_1&f_1}$ and $\mat{d_2&e_2&f_2}$ on which $\Delta$ does not vanish then we can extend it to a path connecting $X_1$ and $X_2$ in $E$. Indeed, we can first find a path connecting the characteristic polynomials of $M-NX_1$ and $M-NX_2$ (for example via $\lam^3$) and then the corresponding path connecting the first rows of $X_1,X_2$ by Lemma 1. More precisely, consider a path $Z_s=\mat{d_s&e_s&f_s}$, $s\in[1,2]$ in $\RR^3$ such that $a^2f_s-{d_se_s}a+{d_s^2b}-{d_sc}a\neq0$ for all $s$. We denote the cofactors of Lemma 1 by $C_j^s$ and the operator associated with $Z_s$ by $L_s$. Observe that $L_s$ and $L_s^{-1}$ depend continuously upon $s$. If the characteristic polynomials of $M-NX_i$ are $p_i(\lam)=\prod_{k=1}^3(\lam-\mu_k^i)$, we connect them by $p_s(\lam)$ defined by $$p_s(\lam)=\left\{\begin{array}{l}\prod_{k=1}^3\left(\lam-(3-2s)\mu_k^1\right)\mbox{ if }1\leq s\leq 1.5\\[0.2cm] \prod_{k=1}^3\left(\lam-(2s-3)\mu_k^2\right) \mbox{ if }1.5\leq s\leq 2\end{array}\right.$$ Observe that for every $s$, the zeros of $p_s$ are in the open unit disk. Then put $Y_s=L_s^{-1}\left(p_s(\lam)-\lam C_1^s\right)$ and let $X_s$ be the matrices with rows $Y_s$ and $Z_s$. By construction and the remark below Lemma 1, for every $s$, the characteristic polynomial of $M-NX_s$ is then $p_s(\lam)$, therefore $X_s\in E$.
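
For concreteness, a small numerical sketch of this extension step for $n=3$ (numpy assumed; the values of $a,b,c$, the second rows $Z_1,Z_2$ and the spectra are illustrative and chosen so that the straight segment between $Z_1$ and $Z_2$ keeps $\Delta>0$):

```python
import numpy as np

a, b, c = 0.7, -0.4, 0.3                     # third row of M (illustrative, a != 0)
M = np.zeros((3, 3)); M[2] = [a, b, c]
N = np.vstack([np.eye(2), np.zeros((1, 2))])

rho = lambda A: max(abs(np.linalg.eigvals(A)))
MNX = lambda Y, Z: M - N @ np.vstack([Y, Z])

def coeffs(Y, Z):                            # non-leading coefficients of the characteristic polynomial
    return np.poly(MNX(Y, Z))[1:]

def first_row_for(Z, target):                # invert Y -> coeffs(Y, Z); possible whenever Delta(Z) != 0
    b0 = coeffs(np.zeros(3), Z)
    A0 = np.column_stack([coeffs(e, Z) - b0 for e in np.eye(3)])
    return np.linalg.solve(A0, target - b0)

Z1, Z2 = np.array([0.5, 0.1, 1.0]), np.array([-0.3, 0.8, 1.5])    # second rows, Delta > 0 for both
mu1, mu2 = np.array([0.2, -0.5, 0.6]), np.array([0.4, 0.1, -0.7]) # spectra of M - N X_1 and M - N X_2

for s in np.linspace(1.0, 2.0, 21):
    Zs = (2 - s)*Z1 + (s - 1)*Z2                                  # path of second rows, Delta(Zs) != 0
    mus = (3 - 2*s)*mu1 if s <= 1.5 else (2*s - 3)*mu2            # roots of p_s(lam)
    Ys = first_row_for(Zs, np.poly(mus)[1:])
    assert rho(MNX(Ys, Zs)) < 1                                   # X_s = (Y_s, Z_s) stays in E
print("path stays in E")
```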

It is easy to see that such a path connecting the second rows of $X_1, X_2$ exists if $\Delta(X_j)=a^2f_j-{d_je_j}a+{d_j^2b}-{d_jc}a$, $j=1,2$, are nonzero and have the same sign. If $a\neq0$, then we can first move $f_1$ to a value $f$ of large modulus with $a^2f$ of the same sign as the $\Delta(X_j)$ (so that $\Delta$ never vanishes), then modify $d_1,e_1$ into $d_2,e_2$ and finally change $f$ to $f_2$. The case $a=0$ is even simpler and omitted.

Since $E$ is open, we can assume that both $\Delta(X_j)\neq0.$ It remains to treat the case that they have opposite signs. If $a\neq0$ and $\Delta(X)=0$ then the above calculation shows that the gcd $D(\lam)$ of the $2\times 2$ subdeterminants of $B(\lam)$ for an $X$ with second row $\mat{d&e&f}$ is $D(\lam)=\lam+e-\frac{db}a$. As $D(\lam)$ is a factor of the characteristic polynomial of $M-NX$, a path connecting $X_1,X_2$ in $E$ must cross the singular surface at a point $X$ where $|e-\frac{db}a|<1$. This makes it clear that the singular surface causes problems in finding a path in $E$ between two given elements, not only for $n=3$, but also for general $n$.

Still in the case $a\neq0$, $\Delta(X_1)>0>\Delta(X_2)$, we choose as crossing point a point $X^{c}$ with second row $\mat{0&0&0}$ (hence $D(\lam)=\lam$) and such that $\rho(M-NX^c)<1$. This is possible using Lemma 1. Then we consider two matrices $X_j^c$, $j=1,2$, which differ from $X^c$ only in the second row: it is $\mat{0&0&+\eps}$ for $X_1^c$ and $\mat{0&0&-\eps}$ for $X_2^c$. If $\eps>0$ is sufficiently small then both matrices are in $E$. As seen above, we can connect $X_1$ and $X_1^c$ in $E$ as they are both in the set $\Delta>0$ and we can connect $X_2$ and $X_2^c$ in $E$ as $\Delta<0$ for both. Obviously, $X_1^c$ and $X_2^c$ are connected in $E$ via the point $X^c$ on the singular surface. This completes the proof in the case $a\neq0$.

In the case $a=0$, the proof is quite similar and we only mention the differences. The singular surface is simply $d=0$ and on it, we have $D(\lam)=\det\mat{\lam+e&f\\-b&\lam-c}$. Since $b\neq0$, we can use $X^c$ with second row $\mat{0&c&c^2/b}$ (so that $D^c(\lam)=\lam^2$) and with vanishing first row: the characteristic polynomial of $M-NX^c$ is $\lam^3$. Then we use $X_j^c$ with second row $\mat{\pm\eps&c&c^2/b}$ and small $\eps$. This completes the proof in the case $n=3$, $m=2$.
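
A quick sympy check of this choice of $X^c$ (purely illustrative):

```python
import sympy as sp

b, c, lam = sp.symbols('b c lam')
M  = sp.Matrix([[0, 0, 0], [0, 0, 0], [0, b, c]])    # a = 0, b != 0
N  = sp.Matrix([[1, 0], [0, 1], [0, 0]])
Xc = sp.Matrix([[0, 0, 0], [0, c, c**2/b]])          # vanishing first row, second row (0, c, c^2/b)
print(sp.factor((lam*sp.eye(3) - M + N*Xc).det()))   # expected: lam**3
```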

General $\mathbf n$ and $\mathbf{m=2}$. We suppose again that $G(\lam)=1$ (see Claim 1). As above, we can associate to any row vector $Z\in\RR^n$ the gcd $D^Z(\lam)$ of the $(n-1)\times(n-1)$ subdeterminants $C_j^Z(\lam)$, $j=1,\dots,n$, of the $n-1$ by $n$ matrix $\bar M(Z,\lam)$ formed by the row $\lam e_2+Z$ followed by the last $n-2$ rows of $\tilde M(\lam)$ (see before Claim 1 for $\tilde M(\lam)$). It is 1 iff the operator $L^Z$ associated to the $C_j^Z(\lam)$ by Lemma 1 is invertible, and this is the case iff the determinant $\Delta(Z)$ of the matrix associated to $L^Z$ with respect to the canonical bases of $\RR^n$ and $\P_n$ is nonzero. The singular surface is again the set where $\Delta(Z)=0$.

Consider two matrices $X_i$, $i=1,2$, in $E$ and let $Y_i,Z_i$ denote the two rows of each of them. As for $n=3$, we can assume that $\Delta(Z_i)\neq0$.

As in the case $n=3,m=2$ we can prove

Lemma 2: If there is a path $Z_s$, $s\in[1,2]$ in $\RR^n$ connecting $Z_1$ and $Z_2$ such that in each of its intersection points $Z^c$ with the singular surface, the gcd $D^{Z^c}(\lam)$ has only zeros in the open unit disk, then it can be completed by a path $Y_s$ such that $X_s=\mat{Y_s\\Z_s}$ connects $X_1$ and $X_2$ in $E$.

Using this Lemma, the Theorem is proved once we show

Claim 3: There always exists a path connecting $Z_1,Z_2$ such that, in each of its intersection points $Z^c$ with the singular surface, the gcd $D^{Z^c}(\lam)$ has only zeros in the open unit disk.

Proof: We prove the Claim by induction over $n$. For $n=3$, it has already been shown above. Suppose now that it has been shown for $n-1$. We consider two cases: a) the first column of $M$ vanishes, b) it contains a nonzero element.

Case a) This case is almost identical to the case $n=3,m=2,a=0$. Denote by $\check M$ the $(n-2)\times(n-1)$ matrix obtained from $M$ by deleting its first two (zero) rows and its first column. Then the gcd of the $n-2$ by $n-2$ subdeterminants of $\lam(0,I_{n-2})-\check M$ is the same as the one for $\tilde M(\lam)$ and hence equals $G(\lam)=1$. Therefore the condition $\Delta(Z)=0$ is equivalent to the first component of $Z$ being 0. This follows by cofactor expansion with respect to the first column. If $\Delta(Z)=0$ then $D(\lam)$ is the characteristic polynomial of $\mat{ -\check Z\\\check M }$, where $\check Z\in\RR^{n-1}$ is obtained from $Z$ by erasing the first component.

Since the gcd of the $n-2$ by $n-2$ subdeterminants of $\lam(0,I_{n-2})-\check M$ is $1$, we can find a vector $\check Z_c\in\RR^{n-1}$ such that the matrix $\mat{ -\check Z_c\\\check M }$ has a characteristic polynomial $q(\lam)$ with zeros in the open unit disk. We just have to use Lemma 1 in dimension $n-1$.

Now recall $X_i$, $Y_i$, $Z_i$, $i=1,2$. If the first components of $Z_1$ and $Z_2$ have the same sign, we can simply connect them by any path on which the first component does not vanish. The critical case is when the first component $Z_{11}$ of $Z_1$ is positive, say, and the corresponding $Z_{21}$ is negative. Then we can connect $Z_1$ to the vector $(+\eps,\check Z_c)$ and $Z_2$ to the vector $(-\eps,\check Z_c)$ by paths that do not cross the singular surface. Finally, we can simply connect $(+\eps,\check Z_c)$ and $(-\eps,\check Z_c)$ by the straight segment between them. It crosses the singular surface at the point $(0,\check Z_c)$ and by construction, $D^{(0,\check Z_c)}(\lam)=q(\lam)$ has only zeros in the open unit disk. This completes the proof in case a).

Case b) The first column of $M$ contains some nonzero element. This is, of course, the generic case, and it took me a long time to find a reduction: I had tried to find directly, for any $n$, a path crossing the singular surface only at points $Z^c$ with $D^{Z^c}(\lam)$ having zeros $\mu$, $|\mu|<1$, but this turned out to be very complicated. The present proof using induction is comparatively simple.

First, we prepare the matrix $M$ for the inductive step; more precisely, we transform the expression $M-NX$ by similarity transformations that leave $N$ unchanged. In particular, the eigenvalues of $M-NX$ are not changed.

Suppose that the first element of the $k$th row of $M$ is nonzero. Then we exchange the third and the $k$th row of $M$ and the third and the $k$th columns of $M$ and $X$. Since the row operation does not have any effect on $N$, this is a similarity transformation for $M-NX$ leaving $N$ unchanged. Next we multiply the third row of $M$ by a convenient number and divide the third columns of $M$ and $X$ by the same number. So we can assume that the first element of the third row of $M$ is 1.

If $\alpha$ denotes the first element of the fourth row of $M$, we now add $-\alpha$ times the third row to it. This row operation is equivalent to multiplication of $M$ from the left by some elementary matrix $T$. Observe that again, $N$ is unchanged by this operation. Then we multiply from the right by the inverse of $T$. This amounts to adding $\alpha$ times the fourth columns of $M$ and $X$ to their third columns. Thus we can assume that the fourth row of $M$ starts with a 0. In the same way, we treat the remaining rows of $M$. So we can assume that the first column of $M$ is the third unit vector.

It is also useful to add a certain multiple of the first column to the $k$th for $M$ and $X$ and to subtract the same multiple of the $k$th row from the first row for $M$. This temporarily leads to a matrix $M$ whose first row is not zero, but this can be repaired later. We can do this without interfering with $N$ if $k\geq3$. As it is a similarity transformation, the eigenvalues of $M-NX$ do not change. In the case $k=2$, the above transformation can still be done, but we have to clarify how to deal with $N$. So denote by $T_n$ the $n\times n$ matrix corresponding to adding a certain multiple of the first column to the second one. Then we modify $M-NX$ into $T_n^{-1}M T_n- (T_n^{-1}N T_2) (T_2^{-1}XT_n)$. Since $T_n^{-1}N T_2=N$, this leads again to an expression of the same form if we subtract a multiple of the second row from the first not only for $M$, but also for $X$. After having done all these modifications, we simply incorporate the new first row into $X$. So we can choose the elements of the third row of $M$ besides the leading 1 as we like; only $X$ has to be modified accordingly. Observe that all these operations can be undone.

In summary, we can assume from now on that:

1. the first column of $M$ is the third unit vector,
2. the elements of $M$ in the third row besides the leading 1 are algebraically independent transcendentals over $\QQ(\check M)$.

Here $\check M$ denotes the matrix obtained from $M$ by removing the first three rows and the first column. Let us write $$M=\mat{0&\tabb{\dots&0}\\ 0&\tabb{\dots&0}\\ 1& R\\ 0_{n-3}&\check M},$$ where $R$ denotes a row vector of $n-1$ elements and $0_{n-3}$ indicates a column vector of $n-3$ zeros.

Consider first the $(n-1)\times(n-1)$ matrix $L=\mat{0\\R\\\check M}$ and the $(n-1)\times1$ matrix $K=\mat{1\\0_{n-2}}$. With row vectors $U\in\RR^{n-1}$ we can consider the matrices $L-KU$. Here we are actually in the case $m=1$. The gcd $F(\lam)$ of the $(n-2)\times(n-2)$ subdeterminants of $\tilde L(\lam)=\lam\bldiag(0,I_{n-2})-L$ (analogous to $\tilde M(\lam)$) satisfies $F(\lam)=1$. This follows from the fact that the second row $R$ of $L$ consists of algebraically independent transcendentals over $\QQ(\check M)$; the argument was given in the proof of Claim 2. Therefore, by Lemma 1, for any prescribed monic polynomial with roots in the open unit disk there exists a vector $U_c$ such that the characteristic polynomial of $L-KU_c$ is that polynomial.

Recall again $X_i$, $Y_i$, $Z_i$, $i=1,2$, and denote by $\mu_i$ the first element of $Z_i$. Then we want to connect $Z_1$ and $Z_2$ by a path as required in Claim 3 by first connecting $Z_1$ and $(\mu_1,U_c)$, then $(\mu_1,U_c)$ and $(\mu_2,U_c)$, and finally $(\mu_2,U_c)$ and $Z_2$. By slightly modifying $\mu_1,\,\mu_2$ and $U_c$, if necessary, we can assume that none of the four points is on the singular surface.

Connection of $(\mu_1,U_c)$ and $(\mu_2,U_c)$. Let us assume that $\mu_2\geq\mu_1$. Then the straight path $s\to(s,U_c)$, $s\in[\mu_1,\mu_2]$, connects the two points. As the $(\mu_j,U_c)$ are not on the singular surface, this path can cross the singular surface only at finitely many points because the equation for the corresponding $s$ is polynomial. If $s$ is some crossing point then the corresponding gcd $D^{(s,U_c)}(\lam)$ is a divisor of the subdeterminant formed by the columns $2$ to $n$ of $\bar M((s,U_c),\lam)$, which happens to be the characteristic polynomial of $L-KU_c$ discussed above. The vector $U_c$ had been chosen such that all zeros of this characteristic polynomial are in the open unit disk. Hence this is also the case for the zeros of $D^{(s,U_c)}(\lam)$.

Connection of $Z_1$ and $(\mu_1,U_c)$. We want to achieve this by a path $(\mu_1,U(s))$ where $U:[0,1]\to\RR^{n-1}$ has to be determined such that $U(1)=U_c$ and $(\mu_1,U(0))=Z_1$. For convenience, we write $U(s)=(U(s)_2,\dots,U(s)_n)$, $R=(R_2,\dots,R_n)$, $\check M=(m_{ij})_{i\geq4,j\geq2}$ and abbreviate $D^{(\mu_1,U(s))}(\lam)$ by $D_s(\lam).$ We want the gcd $D_s(\lam)$ of the $(n-1)\times(n-1)$ subdeterminants of $$\bar M(s,\lam)=\left( \begin{array}{cccccc} \mu_1&\lam+U(s)_2&U(s)_3&\dots&\dots&U(s)_n\\-1&-R_2&\lam-R_3&\dots&\dots&-R_n\\ 0&-m_{42}&-m_{43}&\lam-m_{44}&\dots&-m_{4n}\\ \vdots&\vdots&&&\ddots&\vdots\\ 0&-m_{n2}&-m_{n3}&\dots&\dots&\lam-m_{nn}\end{array}\right)$$ to have only zeros in the open unit disk whenever the path crosses the singular surface.

Here we use that the gcd of these subdeterminants does not change if we add a multiple of one row or column to another row or column, respectively. More precisely, we first add $\mu_1$ times row 2 to the first row in order to have a zero in the upper left corner; we obtain as first row $$\mat{0&\tabb{\lam+U(s)_2-\mu_1R_2&U(s)_3+\mu_1\lam-\mu_1R_3&\dots&U(s)_n-\mu_1R_n}}.$$ Then we add $-\mu_1$ times column 2 to column 3 to remove the multiple of $\lam$ from the third component in row 1. We obtain as new first row $\mat{0&\tabb{\lam+\tilde U(s)_2&\tilde U(s)_3&\dots&\tilde U(s)_n}},$ where $\tilde U(s)_j=U(s)_j-\mu_1R_j$ for $j\neq3$ and $\tilde U(s)_3=U(s)_3-\mu_1U(s)_2-\mu_1R_3+\mu_1^2R_2.$ Observe that $\tilde U(0)$ and $\tilde U(1)$ are determined by $Z_1$ and $U_c$ and that we have to find a path connecting them with the property wanted in Claim 3.

Altogether, for the matrix $$\tilde{\bar M}(s,\lam)=\left( \begin{array}{cccccc} 0&\lam+\tilde U(s)_2&\tilde U(s)_3&\tilde U(s)_4&\dots&\tilde U(s)_n\\ -1&-R_2&\lam-R_3+\mu_1R_2&-R_4&\dots&-R_n\\ 0&-m_{42}&\mu_1m_{42}-m_{43}&\lam-m_{44}&\dots&-m_{4n}\\ \vdots&\vdots&&&\ddots&\vdots\\ 0&-m_{n2}&\mu_1m_{n2}-m_{n3}&-m_{n4}&\dots&\lam-m_{nn}\end{array}\right),$$ the gcd of its $(n-1)\times (n-1)$ subdeterminants is the same as for $\bar M(s,\lam)$, namely $D_s(\lam)$.

This gcd is the same as the gcd of the $(n-2)\times (n-2)$ subdeterminants for the matrix $\hat M(s,\lam)$ obtained from $\tilde{\bar M}(s,\lam)$ by erasing the first column and the second row. Indeed, any $(n-1)\times(n-1)$ subdeterminant of $\tilde{\bar M}(s,\lam)$ containing the first column agrees, up to sign, with the $(n-2)\times (n-2)$ subdeterminant of $\hat M(s,\lam)$ obtained by deleting the corresponding column; this follows by cofactor expansion along the first column $(0,-1,0,\dots,0)^T$. The $(n-1)\times(n-1)$ subdeterminant of $\tilde{\bar M}(s,\lam)$ not containing the first column is a linear combination of these smaller determinants by Laplace (or cofactor) expansion along its second row. It is convenient now to exchange the first two columns of $\hat M(s,\lam)$ to have $$\hat M(s,\lam)=\left( \begin{array}{ccccc} \tilde U(s)_3&\lam+\tilde U(s)_2&\tilde U(s)_4&\dots&\tilde U(s)_n\\ \mu_1m_{42}-m_{43}&-m_{42}&\lam-m_{44}&\dots&-m_{4n}\\ \vdots&&&\ddots&\vdots\\ \mu_1m_{n2}-m_{n3}&-m_{n2}&-m_{n4}&\dots&\lam-m_{nn}\end{array}\right).$$ Observe that, by construction, the gcd of its $(n-2)\times(n-2)$ subdeterminants is $D_s(\lam)$. $\hat M(s,\lam)$ is the analogue of $\bar M(s,\lam)$ for the $(n-3)\times (n-1)$ matrix $$\hat M'=\left(\begin{array}{ccccc} -\mu_1m_{42}+m_{43}&m_{42}& m_{44}&\dots&m_{4n}\\ \vdots&&&&\vdots\\ -\mu_1m_{n2}+m_{n3}&m_{n2}&m_{n4}&\dots&m_{nn}\end{array}\right).$$ Here the induction hypothesis applies and yields that there is a path $s\to(\tilde U(s)_3,\tilde U(s)_2,\tilde U(s)_4,\dots,\tilde U(s)_n)$ which crosses the singular surface (where $D_s(\lam)\neq1$) only at points $s$ where the zeros of $D_s(\lam)$ are in the open unit disk. Tracing back the transformations we have made from $\bar M(s,\lam)$ to $\hat M(s,\lam)$, we have found a path $s\to(\mu_1, U(s)_2, U(s)_3,\dots, U(s)_n)$ connecting $Z_1$ and $(\mu_1,U_c)$ which crosses the singular surface only at points where the zeros of $D_s(\lam)$ are in the open unit disk. Observe that in the above reduction of the dimension, it was crucial that the first component of the points on the path remained constant.

As the connection from $(\mu_2,U_c)$ to $Z_2$ is completely analogous to the one from $Z_1$ to $(\mu_1,U_c)$, the proof of Claim 3, and therefore the proof of the Theorem, is complete.