In the proof of the Perron-Frobenius theorem, why can we take a strictly positive eigenvector corresponding to the eigenvalue $1$? Before that, why can we even take a non-negative eigenvector?

Books simply take such a vector, with no explanation whatsoever. Pisses me off.

Wikipedia only proves it assuming the matrix is irreducible.

Any help? Thanks.

EDIT: Nevermind, got it. Delete this question, please.


Solution 1:

It really depends on how your book does it.

The easiest way I know of is to consider the simplex $S = \{ (x_i) \mid x_i \geq 0, \sum_i x_i = 1\}$ and the map $f$ induced on it by the matrix $M$, namely $f(s) = Ms / \|Ms\|_1$ (i.e. apply $M$, then normalize so that the image lies back in the simplex; note that $Ms \neq 0$ for $s \in S$ because $M$ is positive, so the map is well defined).

This gives a continuous map from the simplex to itself, which therefore has a fixed point by Brouwer's fixed point theorem. It remains to show that no boundary point of the simplex is fixed: a boundary point $s$ has some zero entry, but since all entries of $M$ are positive, every entry of $Ms$ (and hence of $f(s)$) is strictly positive, so $f(s) \neq s$. Hence the fixed point lies in the interior, and it is exactly the strictly positive eigenvector you are looking for.

Note that this argument proves neither uniqueness nor that the corresponding eigenvalue has strictly largest modulus among all eigenvalues.
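If you want to see the fixed point concretely: for a positive matrix the normalized iteration actually converges (this is power iteration, and convergence is stronger than the bare existence Brouwer's theorem gives). Here is a minimal numerical sketch, assuming numpy; the positive matrix $M$ is randomly generated and all names are illustrative.

```python
import numpy as np

# Sketch of the simplex map from Solution 1: s -> Ms / ||Ms||_1.
# For a positive matrix this iteration converges to the interior
# fixed point, i.e. the strictly positive Perron eigenvector.

rng = np.random.default_rng(0)
M = rng.uniform(0.1, 1.0, size=(4, 4))  # entrywise positive matrix

s = np.full(4, 0.25)                    # start at the barycenter of the simplex
for _ in range(200):
    s = M @ s
    s /= s.sum()                        # ||Ms||_1 = sum of entries, all positive

print(s)                                # strictly positive fixed point
print(M @ s / s)                        # entrywise ratio is constant: the Perron eigenvalue
```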

Solution 2:

(Apparently the OP's question concerns doubly stochastic matrices. In that case the existence of a positive Perron vector is trivial (the all-ones vector works), but here we will deal with the general case. As the OP does not want further discussion, the following only serves as a quick reference for myself. In the sequel, "$A>0$" or "$A$ is positive" means $A$ is entrywise positive; "nonnegative" matrices/vectors are interpreted analogously. The spectral radius of a matrix $A$ is denoted by $\rho(A)$.)

Concerning the existence of Perron vectors, there are actually three statements:

  1. A positive matrix has a positive Perron vector.
  2. A nonnegative matrix has a nonnegative Perron vector.
  3. An irreducible nonnegative matrix has a positive Perron vector.

For statement 1, the elementary proof of Lemma 8.2.1 on p. 495 of Horn and Johnson's Matrix Analysis is fairly easy and not too long; we will go through it below. It is certainly not the simplest proof available, but it contains many useful facts that are interesting in their own right.

Statement 2 can be obtained from statement 1 by a limit argument: write the nonnegative matrix $A$ as the limit of the positive matrices $A_k=A+\frac{1}{k}ee^T$ (where $e$ is the all-ones vector), normalize the positive Perron vectors of the $A_k$ to have unit norm, and pass to a convergent subsequence; by continuity of the spectral radius, the limit is a nonnegative Perron vector of $A$.

Statement 3 is based on the observation that if $A$ is an irreducible nonnegative matrix of order $n$, then $(I+A)^{n-1}>0$. Intuitively this is because any two vertices in a strongly connected graph are joined by a walk of length at most $n-1$. Now, given a nonnegative Perron vector $x$ corresponding to the eigenvalue $\rho(A)$, we have $(I+A)^{n-1}x=(1+\rho(A))^{n-1}x$. The LHS of this equality is the product of a positive matrix and a nonnegative but nonzero vector, hence positive. Therefore the $x$ on the RHS is positive too.
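The observation behind statement 3 is easy to check numerically. The sketch below (assuming numpy; the cyclic matrix is just one illustrative choice) takes the permutation matrix of a directed $n$-cycle, which is irreducible but full of zeros, and verifies that $(I+A)^{n-1}$ is entrywise positive, so that multiplying it by any nonnegative nonzero vector produces a strictly positive one.

```python
import numpy as np

# Statement 3, numerically: for an irreducible nonnegative A of order n,
# (I + A)^(n-1) is entrywise positive.

n = 5
A = np.roll(np.eye(n), 1, axis=1)       # directed n-cycle: irreducible, nonnegative

P = np.linalg.matrix_power(np.eye(n) + A, n - 1)
print(bool((P > 0).all()))              # True: every entry of (I+A)^(n-1) is positive

# Hence (I + A)^(n-1) x is strictly positive for any nonnegative nonzero x,
# which is how a nonnegative Perron vector is upgraded to a positive one.
x = np.zeros(n)
x[0] = 1.0                              # nonnegative, nonzero
print(P @ x)                            # strictly positive vector
```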

It remains to prove statement 1. Let $x$ be an eigenvector of $A$ corresponding to an eigenvalue $\lambda$ of modulus $\rho(A)$. (We do not assume the existence of a Perron eigenvalue yet, so what we assume is $|\lambda|=\rho(A)$, not $\lambda=\rho(A)$.) Clearly $\rho(A) > 0$: otherwise all eigenvalues of $A$ would vanish and $\operatorname{tr}(A)=0$, contradicting that $A$ is positive. Observe that \begin{equation} \rho(A) |x| = |\lambda x| = |Ax| \le |A||x| = A|x|.\tag{1} \end{equation} ($|x|$ and $|A|$ denote the vector and matrix obtained by taking the entrywise moduli of the entries of $x$ and $A$; the last equality holds because $A$ is positive.) If equality holds in $(1)$, then $|x|$ is an eigenvector of $A$ corresponding to the eigenvalue $\rho(A)$. The RHS of $(1)$ is $z=A|x|$, the product of a positive matrix and a nonnegative but nonzero vector; hence $z$ is positive, and in turn the $|x|$ on the LHS is also positive. It is our positive Perron vector.
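Inequality $(1)$, and the equality case that the rest of the proof establishes, can be observed numerically. A quick sketch, assuming numpy and a randomly generated positive matrix $A$:

```python
import numpy as np

# Check inequality (1): rho(A)|x| <= A|x| entrywise, for an eigenvector x
# belonging to an eigenvalue of maximal modulus, and that equality holds.

rng = np.random.default_rng(1)
A = rng.uniform(0.1, 1.0, size=(4, 4))  # entrywise positive matrix

eigvals, eigvecs = np.linalg.eig(A)
i = np.argmax(np.abs(eigvals))          # index of an eigenvalue with |lambda| = rho(A)
rho = np.abs(eigvals[i])
x = eigvecs[:, i]                       # possibly complex eigenvector

lhs = rho * np.abs(x)                   # rho(A)|x|
rhs = A @ np.abs(x)                     # A|x|
print(bool(np.all(lhs <= rhs + 1e-12))) # True: (1) holds entrywise
print(bool(np.allclose(lhs, rhs)))      # True: equality, so |x| is a Perron vector
```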

We now prove that equality must hold in $(1)$. Suppose the contrary. Then $z-\rho(A)|x|$ is nonnegative and nonzero. Since $A$ is positive, left-multiplying by $A$ gives $Az-\rho(A)A|x|>0$, i.e. $Az>\rho(A)z$, and hence $Az\ge\alpha z$ for some $\alpha>\rho(A)$.

Let $r_i(X)=\sum_j |x_{ij}|$ denote the absolute row sum of the $i$-th row of a matrix $X$, and let \begin{align} B&=\operatorname{diag}(z)^{-1}A\operatorname{diag}(z),\\ C&=\operatorname{diag}\left(\frac{\min_i r_i(B)}{r_1(B)},\ldots,\frac{\min_i r_i(B)}{r_n(B)}\right)B. \end{align} Since $r_i(B)=(Az)_i/z_i$, the inequality $Az\ge\alpha z$ implies that \begin{equation} \alpha \le \min_i r_i(B).\tag{2} \end{equation} By construction, all absolute row sums of $C$ are equal to $\min_i r_i(B)$. Consequently \begin{equation} \min_i r_i(B)=\max_i r_i(C)=\|C\|_\infty.\tag{3} \end{equation} Since $(1,\ldots,1)^T$ is an eigenvector of $C$ corresponding to the eigenvalue $\|C\|_\infty$, we further infer that $\|C\|_\infty\le\rho(C)$. Yet the spectral radius of a matrix cannot exceed any matrix norm of it. So we must have \begin{equation} \|C\|_\infty=\rho(C).\tag{4} \end{equation}

Finally, by construction $B\ge C>0$ entrywise, hence $0<C^m\le B^m$ entrywise and $\|C^m\|_F^{1/m}\le\|B^m\|_F^{1/m}$ for every $m$ ($\|\cdot\|_F$ denotes the Frobenius norm). Letting $m\to\infty$ and applying Gelfand's formula, we get \begin{equation} \rho(C)\le\rho(B).\tag{5} \end{equation} Yet $\rho(B)=\rho(A)$ because $B$ is similar to $A$. Therefore $(2)-(5)$ give $\alpha\le\rho(A)$, which contradicts the choice of $\alpha>\rho(A)$.
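The two facts used in the last step, the entrywise monotonicity $0<C\le B\Rightarrow\|C^m\|_F\le\|B^m\|_F$ and Gelfand's formula $\|B^m\|_F^{1/m}\to\rho(B)$, are also easy to observe numerically. A sketch, assuming numpy; here $B$ is a random positive matrix and $C$ an arbitrary entrywise-dominated one, chosen purely for illustration:

```python
import numpy as np

# Gelfand's formula and entrywise monotonicity, as used in (5):
# 0 < C <= B entrywise implies ||C^m||_F <= ||B^m||_F for every m,
# and ||B^m||_F^(1/m) -> rho(B) as m -> infinity.

rng = np.random.default_rng(2)
B = rng.uniform(0.1, 1.0, size=(4, 4))
C = 0.5 * B                             # entrywise 0 < C <= B

for m in (1, 5, 20, 80):
    gB = np.linalg.norm(np.linalg.matrix_power(B, m), 'fro') ** (1 / m)
    gC = np.linalg.norm(np.linalg.matrix_power(C, m), 'fro') ** (1 / m)
    print(m, gC <= gB, gB)              # monotone for each m; gB approaches rho(B)

print(max(abs(np.linalg.eigvals(B))))   # rho(B), the limit of the gB column
```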