The arithmetic-geometric mean for symmetric positive definite matrices

Solution 1:

Nice question!

What follows below is not really an answer, but it is too long to fit in a comment. It would be helpful if you could write your Mathematica algorithm in traditional mathematical notation so it is easier to see what is going on.

First, I am sure you are aware of the standard definitions of matrix means. I recall them below for the benefit of others reading your question.

  1. The arithmetic mean is $(A+B)/2$, as expected. It satisfies several of the standard desirable properties of a mean of SPD matrices (it is positive definite; if $A \le B$ in the Loewner order, then $A \le AM(A,B) \le B$; etc.)

  2. The standard geometric mean is: $$GM(A,B) = A^{1/2}(A^{-1/2}BA^{-1/2})^{1/2}A^{1/2}$$ This particular choice of GM satisfies: $GM(A,B)=GM(B,A)$.
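As a quick numerical illustration (not part of the original answer), this definition of $GM$ can be evaluated with SciPy's matrix square root; the function name and test matrices below are my own.

```python
# Sketch: the standard geometric mean of two SPD matrices,
#   GM(A,B) = A^{1/2} (A^{-1/2} B A^{-1/2})^{1/2} A^{1/2},
# and a numerical check of the symmetry GM(A,B) = GM(B,A).
import numpy as np
from scipy.linalg import sqrtm

def geometric_mean(A, B):
    """Standard geometric mean of SPD matrices A and B."""
    Ah = sqrtm(A)                 # A^{1/2}
    Ahinv = np.linalg.inv(Ah)     # A^{-1/2}
    return Ah @ sqrtm(Ahinv @ B @ Ahinv) @ Ah

# Random SPD test matrices (shifted Gram matrices)
rng = np.random.default_rng(0)
M = rng.standard_normal((3, 3))
A = M @ M.T + 3 * np.eye(3)
N = rng.standard_normal((3, 3))
B = N @ N.T + 3 * np.eye(3)

print(np.allclose(geometric_mean(A, B), geometric_mean(B, A)))  # True
```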

Now if you iterate the AGM, using the above two definitions of the means, then I think convergence, etc. can be proven---though you might have to resort to some kind of fixed-point theorem. If I get time, I will think more about this.
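The iteration described here is easy to try numerically. The following is a sketch (my own function names, using the AM and GM from items 1 and 2 above); convergence is only observed empirically here, not proven.

```python
# Sketch of the matrix AGM iteration: repeatedly replace (A, B) by
# (AM(A,B), GM(A,B)) until the two iterates agree to tolerance.
import numpy as np
from scipy.linalg import sqrtm

def geometric_mean(A, B):
    Ah = sqrtm(A)
    Ahinv = np.linalg.inv(Ah)
    return Ah @ sqrtm(Ahinv @ B @ Ahinv) @ Ah

def matrix_agm(A, B, tol=1e-12, max_iter=50):
    for _ in range(max_iter):
        A, B = (A + B) / 2, geometric_mean(A, B)
        if np.linalg.norm(A - B) < tol * np.linalg.norm(A):
            break
    return (A + B) / 2

rng = np.random.default_rng(1)
M = rng.standard_normal((3, 3))
A = M @ M.T + 2 * np.eye(3)
N = rng.standard_normal((3, 3))
B = N @ N.T + 2 * np.eye(3)

L = matrix_agm(A, B)
# The numerical limit is symmetric and positive definite
print(np.allclose(L, L.T), np.all(np.linalg.eigvalsh(L) > 0))
```

In practice only a handful of iterations are needed, consistent with the fast convergence of the scalar AGM.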

Solution 2:

I'll use the characterisation of the geometric mean as given by Slowsolver. As I mentioned in a comment to his answer, the geometric mean $GM(A,B)$ is the geodesic midpoint corresponding to the Riemannian metric on the space of symmetric positive definite matrices with

$$ g_A(X,Y) = \operatorname{tr}(A^{-1}XA^{-1}Y)~,\quad X,Y\in T_A(SPD) $$

where the tangent space $T_A(SPD)$ is the space of symmetric matrices. This interpretation has the following advantage: both the AM and the GM are invariant under congruence. Let $P$ be a non-singular matrix; then $AM(PAP^T,PBP^T) = P(AM(A,B))P^T$ trivially, and similarly for $GM$, using that $$g_A(X,Y) = g_{PAP^T}(PXP^T, PYP^T)$$ so that a geodesic $\gamma$ joining $A,B$ becomes a geodesic $P\gamma P^T$ joining $PAP^T$ and $PBP^T$.

By first conjugating with $P = A^{-1/2}$, we can assume WLOG that $A = I$. By then conjugating with a suitable orthogonal matrix $O$ to diagonalize $B$, we can assume WLOG that $A = I$ and $B$ is diagonal.
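The two-step reduction just described can be checked numerically; the following sketch (variable names are my own) carries it out for random SPD matrices.

```python
# Sketch of the reduction: conjugate by P1 = A^{-1/2} to send A -> I,
# then by an orthogonal O to diagonalize B while keeping A = I.
import numpy as np
from scipy.linalg import sqrtm

rng = np.random.default_rng(2)
M = rng.standard_normal((3, 3))
A = M @ M.T + 2 * np.eye(3)
N = rng.standard_normal((3, 3))
B = N @ N.T + 2 * np.eye(3)

# Step 1: P1 = A^{-1/2} (symmetric, so P1.T == P1)
P1 = np.linalg.inv(sqrtm(A))
A1 = P1 @ A @ P1.T    # = I
B1 = P1 @ B @ P1.T    # still SPD

# Step 2: orthogonal diagonalization of B1; conjugating by O.T fixes A1 = I
w, O = np.linalg.eigh(B1)
A2 = O.T @ A1 @ O     # = I
B2 = O.T @ B1 @ O     # diagonal, entries w > 0

print(np.allclose(A2, np.eye(3)), np.allclose(B2, np.diag(w)))
```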

The result on convergence now follows from the convergence of the AGM in the single real-variable case.

For your specific questions:

  1. The rate of convergence is the same as the scalar case, up to a constant factor depending on the initial values of $A,B$ (which determines $P$).
  2. In this definition $GM(A,B) = GM(B,A)$, so the question is moot.
  3. This is in some sense the natural generalisation of the scalar case. Notice that the scalar geometric mean corresponds to the geodesic midpoint for the multiplicative Lie group $(\mathbb{R}_+,\times)$ with the invariant metric. A first generalisation is to the abelian Lie group $\mathbb{R}^k_+$ with componentwise multiplication. (This corresponds to the diagonal case.) What we showed above is that the entire AGM sequence for arbitrary starting SPD matrices $A$ and $B$ is contained, up to conjugation by $P\cdot P^T$, in one such abelian case. Note that this is slightly less related to eigenvalues per se, since the transformation we use above does not preserve eigenvalues.
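A sketch of point 3 (my own function names, not from the answer): with $A = I$ and $B = \mathrm{diag}(b_1,\dots,b_k)$, the matrix AGM iteration acts componentwise, so its limit is $\mathrm{diag}(agm(1,b_1),\dots,agm(1,b_k))$ where $agm$ is the scalar AGM.

```python
# In the diagonal case the matrix AGM reduces to k independent
# scalar AGMs, one per diagonal entry.
import numpy as np

def agm(a, b, tol=1e-15):
    """Scalar arithmetic-geometric mean; converges quadratically."""
    while abs(a - b) > tol * max(a, b):
        a, b = (a + b) / 2, (a * b) ** 0.5
    return (a + b) / 2

b = np.array([0.5, 2.0, 9.0])
limit = np.diag([agm(1.0, bi) for bi in b])
print(limit)
```

The quadratic convergence of the scalar iteration is what underlies the rate claimed in point 1.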

EDIT: It just occurred to me to do a literature search. In Lawson, J. D. & Lim, Y., "The geometric mean, matrices, metrics, and more", Amer. Math. Monthly, 2001, 108, 797-812, many properties of this geometric mean on SPD matrices are derived in Sections 2 and 3. In particular, the crucial property (as Slowsolver remarks below) of being able to set $A = I$ and $B$ diagonal is the content of Lemma 3.1 of that paper. That is followed by the harmonic-geometric-arithmetic-mean inequality for SPD matrices. By starting with that and playing with monotonicity properties, you presumably also get another proof of the convergence result.