What happens if I repeatedly alternately normalize the rows and columns of a matrix?

Solution 1:

Here is a partial answer:

  1. The sign of each entry is preserved under normalisation. So, we may assume without loss of generality that $M$ is entrywise nonnegative.
  2. Therefore, if $A_k$ is the entrywise square of the $k$-th iterate $M_k$, then the iterates $A_k$ are normalised by row sums and column sums rather than by the norms of the rows and columns. I think it is easier to work with $\{A_k\}$ than with $\{M_k\}$.
  3. Thus $A_k$ is row stochastic when $k$ is odd and column stochastic when $k\ge2$ is even. Without loss of generality, assume that $A_0$ (the entrywise square of the initial $M$) is also column stochastic.
  4. Clearly, $A$ is a fixed point if and only if it is doubly stochastic. (In other words, in the original setting, $M$ is a fixed point if and only if it is the Hadamard product of a $\{-1,1\}$-valued matrix and an entrywise square root of a doubly stochastic matrix.) Also, as the set of all row or column stochastic matrices is bounded, each of $\{A_k\}_{k\ge0},\ \{A_{2k}\}_{k\ge0}$ and $\{A_{2k+1}\}_{k\ge0}$ always has a convergent subsequence.
  5. This iterative procedure of alternating normalisation by row sums and column sums is known as the Sinkhorn-Knopp algorithm, which is used to determine the "$DAD$ form" in Sinkhorn's theorem. It is known that $\{A_k\}_{k\ge0}$ converges if and only if $A_0$ contains a positive diagonal, i.e. iff for some permutation matrix $P$, all entries of $A_0$ at the positions of the ones in $P$ are positive. See

    R. Sinkhorn and P. Knopp (1967), Concerning Nonnegative Matrices and Doubly Stochastic Matrices, Pacific J. Math., 21(2):343-348.

  6. So, $\{A_k\}_{k\ge0}$ may not converge. In fact, cycles can exist. Example: $$ \pmatrix{0&\frac12&0\\ 1&0&1\\ 0&\frac12&0} \rightarrow\pmatrix{0&1&0\\ \frac12&0&\frac12\\ 0&1&0} \rightarrow\pmatrix{0&\frac12&0\\ 1&0&1\\ 0&\frac12&0} $$ (To get an example for $\{M_k\}_{k\ge0}$, just take the entrywise square roots of the above matrices.) A numerical check of this cycle is sketched after this list.

  7. We may also modify the above example to obtain a divergent $\{A_k\}_{k\ge0}$ that contains no cycle. First, consider the following sequence: $$ A_0'=\pmatrix{1&\frac12\\ 0&\frac12} \rightarrow A_1'=\pmatrix{\frac23&\frac13\\ 0&1} \rightarrow A_2'=\pmatrix{1&\frac14\\ 0&\frac34} \rightarrow\cdots . $$ In other words, suppose $$ A_{2k}'=\pmatrix{1&\frac1{2k+2}\\ 0&\frac{2k+1}{2k+2}}, \ A_{2k+1}'=\pmatrix{\frac{2k+2}{2k+3}&\frac1{2k+3}\\ 0&1} $$ for each $k$. Clearly, $A_k'$ converges to $I$, but it never reaches a fixed point or enters any cycle. Now, consider the sequence defined by $$ A_{2k}=\pmatrix{0&\frac12A_{2k}'&0\\ A_{2k}'&0&A_{2k}'\\ 0&\frac12A_{2k}'&0}, \ A_{2k+1}=\pmatrix{0&A_{2k+1}'&0\\ \frac12A_{2k+1}'&0&\frac12A_{2k+1}'\\ 0&A_{2k+1}'&0}. $$ Then $\{A_k\}_{k\ge0}$ is divergent and it contains no cycles.
  8. However, I am not sure whether the odd- or even-indexed subsequences are always convergent. In all of the above examples they are. The details in the aforementioned paper, or in other papers on the Sinkhorn-Knopp algorithm, may be helpful in this regard.
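To double-check the 2-cycle in item 6 numerically, here is a minimal sketch (Python/NumPy; the helper names are for illustration only) that performs one row-sum step and one column-sum step:

```python
import numpy as np

def normalize_row_sums(A):
    """Divide each row of A by its sum (row-sum normalisation)."""
    return A / A.sum(axis=1, keepdims=True)

def normalize_col_sums(A):
    """Divide each column of A by its sum (column-sum normalisation)."""
    return A / A.sum(axis=0, keepdims=True)

# The column-stochastic matrix from item 6.
A0 = np.array([[0.0, 0.5, 0.0],
               [1.0, 0.0, 1.0],
               [0.0, 0.5, 0.0]])

A1 = normalize_row_sums(A0)   # A1 is row stochastic
A2 = normalize_col_sums(A1)   # A2 is column stochastic

print(np.allclose(A2, A0))  # True: the iteration returns to A0, a 2-cycle
```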

Solution 2:

I will supplement mvw's analytical answer with a numerical one.

First, I implemented this row and column normalization in Mathematica and applied it to randomly generated $4\times 4$ matrices with entries in $[-1,1]$. The algorithm appears to converge quite quickly. (This is not a proof of convergence, but every term of the sequence except possibly the first lies in the bounded set of matrices with entries in $[-1,1]$, so a convergent subsequence always exists.) The mean matrix norm of the difference between the 10th and the 100th iterate was less than $0.001$. After many iterations the matrices had, as you would expect, nearly normalized rows and columns.
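Here is a rough Python/NumPy re-implementation of the same experiment (the original was done in Mathematica; the function names and the random seed are my own):

```python
import numpy as np

def normalize_rows(M):
    """Scale each row of M to unit Euclidean norm."""
    return M / np.linalg.norm(M, axis=1, keepdims=True)

def normalize_cols(M):
    """Scale each column of M to unit Euclidean norm."""
    return M / np.linalg.norm(M, axis=0, keepdims=True)

def iterate(M, n):
    """Apply n rounds of row normalization followed by column normalization."""
    for _ in range(n):
        M = normalize_cols(normalize_rows(M))
    return M

rng = np.random.default_rng(0)
diffs = []
for _ in range(100):
    M = rng.uniform(-1.0, 1.0, size=(4, 4))
    diffs.append(np.linalg.norm(iterate(M, 100) - iterate(M, 10)))

print(np.mean(diffs))  # small, consistent with the fast apparent convergence
```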

Matrices with exactly normalized rows and columns are, obviously, fixed points of your algorithm. Let us call these unit row-column matrices. An interesting question is: are there any fixed points of your algorithm that are not unit row-column matrices? I expect that if there are any such fixed points, they are unstable.

For $2\times 2$ matrices, we can pretty easily see what all the unit row-column matrices are. Given the upper-left element of the matrix, we may make exactly two choices for the upper-right element, exactly two choices for the lower-left element, and finally two choices for the lower-right element. Therefore, we have eight subclasses of matrix: $$\left(\begin{matrix} \cos t & \pm\sin t\\ \pm\sin t & \pm\cos t\end{matrix}\right),$$ where the three signs may be chosen independently.
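As a quick numerical confirmation (my own addition, not part of the original answer), the sketch below checks that every one of the eight sign choices gives a fixed point of one row-then-column normalization round:

```python
import numpy as np

def step(M):
    """One round: normalize rows to unit norm, then columns to unit norm."""
    M = M / np.linalg.norm(M, axis=1, keepdims=True)
    return M / np.linalg.norm(M, axis=0, keepdims=True)

t = 0.7  # an arbitrary angle
for s1 in (1, -1):
    for s2 in (1, -1):
        for s3 in (1, -1):
            M = np.array([[np.cos(t), s1 * np.sin(t)],
                          [s2 * np.sin(t), s3 * np.cos(t)]])
            assert np.allclose(step(M), M)  # unit row-column matrices are fixed
print("all eight subclasses are fixed points")
```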

I have created the following interesting diagram:

[Figure: the unit circle with protruding "hairs" traced by the top rows of random $2\times 2$ matrices during the iteration]

Above you see the unit circle with protruding "hairs." The end of each hair not touching the unit circle is the top row of a random $2\times 2$ matrix; the hair is the path taken by that top row as the iteration converges to the unit circle. As you can see, the convergence is sensitive to initial conditions.
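Here is a rough Python/Matplotlib sketch of how such a diagram can be produced (the original figure was generated in Mathematica; all names below are my own):

```python
import numpy as np
import matplotlib.pyplot as plt

def step(M):
    """One round of the iteration: unit-norm rows, then unit-norm columns."""
    M = M / np.linalg.norm(M, axis=1, keepdims=True)
    return M / np.linalg.norm(M, axis=0, keepdims=True)

rng = np.random.default_rng(1)
fig, ax = plt.subplots(figsize=(6, 6))

# Unit circle for reference.
theta = np.linspace(0, 2 * np.pi, 400)
ax.plot(np.cos(theta), np.sin(theta), color="black", linewidth=0.8)

# Each "hair" is the trajectory of the top row of a random 2x2 matrix.
for _ in range(200):
    M = rng.uniform(-1.0, 1.0, size=(2, 2))
    path = [M[0].copy()]
    for _ in range(30):
        M = step(M)
        path.append(M[0].copy())
    path = np.array(path)
    ax.plot(path[:, 0], path[:, 1], linewidth=0.5)

ax.set_aspect("equal")
plt.show()
```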

Solution 3:

You have $$ F(A) = C(R(A)) $$ with $$ R(A) = \operatorname{diag}\!\left(\frac{1}{\lVert e_1^T A\rVert},\dots,\frac{1}{\lVert e_n^T A\rVert}\right) A, \qquad C(A) = A\,\operatorname{diag}\!\left(\frac{1}{\lVert A e_1\rVert},\dots,\frac{1}{\lVert A e_n\rVert}\right), $$ i.e. $R$ scales the $i$-th row by the reciprocal of its norm $\lVert e_i^T A\rVert = \lVert A^T e_i\rVert$, and $C$ scales the $i$-th column by the reciprocal of its norm $\lVert A e_i\rVert$. This is a non-linear operation, since the scaling factors depend on $A$.

In practice it defines an iteration $$ A_n = F^n(A), $$ which has at least the fixed point $A = I$ (and, more generally, any matrix whose rows and columns all have unit norm).
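As an illustration (my own addition, in Python/NumPy; the function names mirror $R$, $C$ and $F$ above), a minimal sketch:

```python
import numpy as np

def R(A):
    """Scale each row of A by the reciprocal of its Euclidean norm."""
    return np.diag(1.0 / np.linalg.norm(A, axis=1)) @ A

def C(A):
    """Scale each column of A by the reciprocal of its Euclidean norm."""
    return A @ np.diag(1.0 / np.linalg.norm(A, axis=0))

def F(A):
    """One full step: row normalisation followed by column normalisation."""
    return C(R(A))

I = np.eye(3)
print(np.allclose(F(I), I))  # True: the identity is a fixed point

# F is not linear: scaling the input does not scale the output, since F(cA) = F(A).
A = np.array([[1.0, 2.0], [3.0, 4.0]])
print(np.allclose(F(2 * A), F(A)))  # True
```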