What would $\sum_{i=1}^n v_i^T(VV^T)^{-1}v_i$ be, if all $v_i$ are columns of $V$?

I have $n$ column vectors $v_1, \ldots, v_n$, each in $\mathbb{R}^m$. These vectors are placed side by side as the columns of a matrix $V$ of dimension $m \times n$. Assume $n > m$ and that $V$ has $m$ linearly independent columns, so that $V$ has rank $m$ and $VV^T$ is invertible.

Under these conditions, why does the relation $\sum_{i=1}^n v_i^T(VV^T)^{-1}v_i = m$ hold true?

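When $m=n$ this is easy to see: $V$ is invertible and $V^{-1}v_i = e_i$, the $i$th standard basis vector, so $$\sum_{i=1}^n v_i^T(VV^T)^{-1}v_i = \sum_{i=1}^n (V^{-1}v_i)^T (V^{-1}v_i) = \sum_{i=1}^n e_i^T e_i = n = m $$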

I don't see how this extends to the case $n > m$, though I have verified it numerically using the following Python code:

import numpy as np
m = 3
n = 5

V = np.random.rand(m, n)
VVti = np.linalg.inv(V @ V.T) # VV^T inverse

each = list()
for i in range(n):
    each.append(V[:, i] @ VVti @ V[:, i])

print("V:")
print(V)
print("rank of V:", np.linalg.matrix_rank(V))

print("each:", each)
print("sum of each:", sum(each))

Please help me "see" why the above result holds true.


Let $e_1, \ldots, e_n$ be the standard basis vectors of $\Bbb{R}^n$, expressed as column vectors. Fun fact about matrix multiplication: if $A$ is an $m \times n$ matrix, then $Ae_i$ is the $i$th column of $A$ (you should verify this for yourself). In particular $v_i = Ve_i$, so the sum becomes $$\sum_{i=1}^n e_i^\top V^\top (VV^\top)^{-1} Ve_i.$$ Since $e_i^\top M e_i$ is the $(i,i)$ entry of a square matrix $M$, this is exactly the trace of the $n \times n$ matrix $V^\top (VV^\top)^{-1} V$. Using the property $\operatorname{tr}(AB) = \operatorname{tr}(BA)$, which holds whenever both products are defined (here $A = V^\top (VV^\top)^{-1}$ is $n \times m$ and $B = V$ is $m \times n$), $$\operatorname{tr}(V^\top (VV^\top)^{-1} V) = \operatorname{tr}(VV^\top (VV^\top)^{-1}) = \operatorname{tr} I_{m \times m} = m.$$
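If it helps to see the steps numerically, here is a small sketch in the same spirit as the code in the question. It checks each stage of the argument separately: the original sum, the trace of the $n \times n$ matrix, and the trace after cycling. The names `G_inv`, `P`, `quad_sum`, `trace_P` and `trace_cycled` are my own, and the fixed seed is only there for reproducibility; a random matrix like this has rank $m$ with probability 1, which is an assumption of the check rather than something it enforces.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen only for reproducibility
m, n = 3, 5
V = rng.random((m, n))          # random m x n matrix (assumed to have rank m)

G_inv = np.linalg.inv(V @ V.T)  # (V V^T)^{-1}, an m x m matrix

# Original sum over the columns v_i of V: v_i^T (V V^T)^{-1} v_i
quad_sum = sum(V[:, i] @ G_inv @ V[:, i] for i in range(n))

# n x n matrix whose (i, i) entry is the i-th summand above
P = V.T @ G_inv @ V
trace_P = np.trace(P)

# Same trace after moving V to the front: an m x m product equal to I_m
trace_cycled = np.trace(V @ V.T @ G_inv)

print("sum of quadratic forms:", quad_sum)
print("tr(V^T (VV^T)^{-1} V): ", trace_P)
print("tr(VV^T (VV^T)^{-1}):  ", trace_cycled)
```

All three printed values should agree with $m = 3$ up to floating-point rounding. Note that the individual summands (the diagonal of `P`) are generally not equal to each other; only their total is pinned down.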