Proof of the Sherman-Morrison Formula

I was reading a few proofs for the Sherman-Morrison Formula, which states that if $A$ is invertible and $M = A + \mathbf{u}\mathbf{v}^T$, then $M^{-1}$ is given by:

$$A^{-1} - A^{-1}\mathbf{u} \mathbf{v}^T A^{-1}/(1+\mathbf{v}^TA^{-1}\mathbf{u}).$$

There is a proof (verification) of this on Wikipedia as well as here, but neither of them justifies why $(1+\mathbf{v}^TA^{-1}\mathbf{u})$ is a scalar. Why is it a scalar?

Basically I am trying to prove this formula somewhat rigorously, without too many assumptions. Can I prove the formula without assuming it is true, i.e. using only the facts that $A$ is invertible and $M = A + \mathbf{u}\mathbf{v}^T$?


The idea here is to prove a formula for the inverse of $A+uv^T$, a "rank one update" of an invertible matrix. The formula shows that the inverse is a rank one update of $A^{-1}$, so there's a nice bilateral relationship.

The verification of the Sherman-Morrison formula is straightforward but not terribly elegant. We want to show:

$$ (A + uv^T)\left(A^{-1} - \frac{A^{-1}uv^TA^{-1}}{1 + v^T A^{-1} u}\right) = I $$

where $A$ is an $n\times n$ invertible matrix and $u,v$ are $n\times 1$ (column) vectors. The reader is invited to verify that all the terms in the formula have compatible dimensions, e.g. $uv^T$ is an $n\times n$ matrix that can properly be added to $A$ (respectively, multiplied by $A^{-1}$).
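To spell out the dimension count for the denominator, which is what makes it a scalar: the product of a $1\times n$ row, an $n\times n$ matrix, and an $n\times 1$ column is a $1\times 1$ matrix, and that is identified with a number. The small numbers below are purely illustrative (they are not from the question), chosen with $A = I$ to keep the arithmetic short.

$$ \underbrace{v^T}_{1\times n}\;\underbrace{A^{-1}}_{n\times n}\;\underbrace{u}_{n\times 1} \quad\text{is } 1\times 1, \qquad\text{whereas}\qquad \underbrace{u}_{n\times 1}\;\underbrace{v^T}_{1\times n} \quad\text{is } n\times n. $$

$$ A = I,\quad u = \begin{pmatrix} 1 \\ 2 \end{pmatrix},\quad v = \begin{pmatrix} 3 \\ 4 \end{pmatrix} \quad\Longrightarrow\quad v^T A^{-1} u = 3\cdot 1 + 4\cdot 2 = 11, \qquad uv^T = \begin{pmatrix} 3 & 4 \\ 6 & 8 \end{pmatrix}. $$

So in this instance $1 + v^T A^{-1} u = 12$ is an ordinary number, and dividing by it makes sense.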

The validity of the formula depends on the scalar $1 + v^T A^{-1} u$ being nonzero, since the indicated "division" by this term actually means multiplying by the reciprocal of that scalar. Our algebra will be somewhat simplified if we replace that scalar reciprocal temporarily by a variable, say $c$, and then substitute the correct value at the end. In what follows we freely use the commutativity of scalar multiplication with matrix multiplication.

We begin by distributing (matrix multiplication over matrix addition), and then pull the scalar $v^T A^{-1}u$ out to the front of the last term:

$$ \begin{align*} (A + uv^T)(A^{-1} - cA^{-1}uv^TA^{-1}) &= AA^{-1} - cAA^{-1}uv^TA^{-1} + uv^T A^{-1} - cu(v^T A^{-1}u)v^T A^{-1} \\ &= I - cI uv^T A^{-1} + uv^T A^{-1} - c(v^T A^{-1}u)uv^T A^{-1} \\ &= I + (-c + 1 -cv^T A^{-1}u) uv^T A^{-1} \end{align*} $$

where in the last step we have grouped together all three terms that are scalar multiples of the matrix term $uv^T A^{-1}$. The final right-hand side in this last step is just $I$ whenever the combined scalar coefficient of that matrix term is zero:

$$ -c + 1 -cv^T A^{-1}u = 0 $$

But this is equivalent to:

$$ 1 = c (1 + v^T A^{-1} u) $$

$$ c = (1 + v^T A^{-1} u)^{-1} $$

Therefore, when $c$ is assigned this value (the reciprocal of the scalar $1 + v^T A^{-1} u$), the Sherman-Morrison formula holds, because carrying out the matrix multiplication indicated above gives the identity $I$ as the result.
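As a sanity check on the algebra, here is a minimal numerical sketch in Python with NumPy. The helper name `sherman_morrison_inverse` and the particular $3\times 3$ matrix and vectors are my own illustrative choices, not part of the question. It is not a replacement for the proof, only a way to see the identity hold on one concrete example.

```python
import numpy as np

def sherman_morrison_inverse(A_inv, u, v):
    """Inverse of A + u v^T from A_inv = A^{-1} (hypothetical helper name)."""
    A_inv_u = A_inv @ u            # column vector A^{-1} u
    vT_A_inv = v @ A_inv           # row vector v^T A^{-1}
    denom = 1.0 + v @ A_inv_u      # the scalar 1 + v^T A^{-1} u
    if np.isclose(denom, 0.0):
        raise ValueError("1 + v^T A^{-1} u is zero; the formula does not apply")
    # A^{-1} - c * (A^{-1} u)(v^T A^{-1}) with c = 1 / denom
    return A_inv - np.outer(A_inv_u, vT_A_inv) / denom

# Illustrative data (assumed for this sketch): det(A) = 18, so A is invertible.
A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
u = np.array([1.0, 2.0, 3.0])
v = np.array([1.0, 0.0, -1.0])

M = A + np.outer(u, v)                                  # rank one update of A
M_inv = sherman_morrison_inverse(np.linalg.inv(A), u, v)

# Compare M @ M_inv with the identity, up to floating-point tolerance.
print(np.allclose(M @ M_inv, np.eye(3)))
```

Note that `v @ A_inv_u` is a single number in NumPy as well, which mirrors the point that $v^T A^{-1} u$ is a scalar, while `np.outer` builds the $n\times n$ matrix.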