Understanding the singular value decomposition (SVD)

One geometric interpretation of the singular values of a matrix is the following. Suppose $A$ is an $m\times n$ matrix (real valued, for simplicity). Think of it as a linear transformation $\mathbb R^n \to \mathbb R^m$ in the usual way. Now take the unit sphere $S$ in $\mathbb R^n$. Being a linear transformation, $A$ maps $S$ to an ellipsoid in $\mathbb R^m$. The lengths of the semi-axes of this ellipsoid are precisely the non-zero singular values of $A$. The zero singular values tell us the dimension of the ellipsoid: counting all $n$ singular values (the square roots of the eigenvalues of $A^TA$), the dimension is $n$ minus the number of zero singular values, i.e. the rank of $A$.
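
To make this concrete, here is a minimal NumPy sketch (the $2\times 3$ matrix is an arbitrary choice for illustration): sampling the unit sphere in $\mathbb R^3$ and pushing it through $A$, the largest stretch $\max_{\|x\|=1}\|Ax\|$ matches the largest singular value.

```python
import numpy as np

# An arbitrary 2 x 3 matrix: a map from R^3 to R^2
A = np.array([[3.0, 0.0, 0.0],
              [0.0, 2.0, 0.0]])

# Its non-zero singular values are the semi-axes of the image ellipse
print(np.linalg.svd(A, compute_uv=False))     # [3. 2.]

# Sample points on the unit sphere S in R^3 and map them through A
rng = np.random.default_rng(0)
pts = rng.normal(size=(3, 100_000))
pts /= np.linalg.norm(pts, axis=0)            # normalise onto the sphere
img = A @ pts

# The largest stretch ||Ax|| over the sphere approaches sigma_max = 3
print(np.linalg.norm(img, axis=0).max())      # close to 3.0
```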


Think of a matrix $A$ as a transformation of space. SVD is a way to decompose this transformation into a sequence of three canonical transformations: a first rotation, an axis-aligned scaling, and a second rotation. There is a nice picture on Wikipedia showing this (the "Representation of SVD" illustration in the singular value decomposition article).
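
A minimal NumPy check of this picture, with an arbitrarily chosen shear matrix: the outer factors are orthogonal (rotations, possibly combined with a reflection), and the middle factor only scales along the axes.

```python
import numpy as np

# An arbitrary shear of the plane
A = np.array([[1.0, 2.0],
              [0.0, 1.0]])

U, s, Vt = np.linalg.svd(A)

# U and V^T are orthogonal: rotations (possibly including a reflection)
print(np.allclose(U.T @ U, np.eye(2)), np.allclose(Vt @ Vt.T, np.eye(2)))

# Rotation -> axis-aligned scaling -> rotation composes back to the shear
print(np.allclose(U @ np.diag(s) @ Vt, A))    # True
```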

Another thing that helps to build the intuition is to derive the equation yourself. For any matrix $A$ of size $n\times m$, the row space and the column space are subspaces of $\mathbb R^m$ and $\mathbb R^n$ respectively. You are looking for an orthonormal basis of the row space of $A$, say $V$, whose images under $A$ remain orthogonal; normalising those images gives an orthonormal basis of the column space of $A$. Let's call this transformed basis $U$. So each $v_i$ gets mapped to some $u_i$, possibly with some stretching, say scaled by $\sigma_i$: $$ Av_i = \sigma_i u_i $$ In matrix notation we get: $$ AV = U \Sigma $$

In other words, if you think of matrices as linear transformations: multiplying a vector $v \in \mathbb R^m$ by $A$ gives a vector in $\mathbb R^n$. SVD is about finding a set of $m$ mutually orthogonal vectors such that, after you multiply each of them by $A$, they stay perpendicular in the new space.

Now we can multiply both sides from the right by $V^{-1}$. Since $V$ is square with orthonormal columns, $V^TV = I$ and hence $V^{-1} = V^{T}$, so we get: $$ A = U \Sigma V^T $$
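
A quick NumPy sketch of both identities (the $3\times 2$ matrix is just a placeholder):

```python
import numpy as np

# An arbitrary 3 x 2 matrix
A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])

# Reduced SVD: U is 3 x 2, s holds the singular values, Vt is 2 x 2
U, s, Vt = np.linalg.svd(A, full_matrices=False)
V = Vt.T

# A V = U Sigma: column v_i is mapped to sigma_i * u_i
print(np.allclose(A @ V, U * s))              # broadcasting scales column i by s[i]

# ... and therefore A = U Sigma V^T
print(np.allclose(U @ np.diag(s) @ Vt, A))    # True
```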

Side notes:

  • SVD performs two changes of basis - if you multiply a vector $x \in \mathbb R^m$ by $U \Sigma V^T$, it will first be expressed in the new basis by $V^T$. Then, $\Sigma$ will scale the new coordinates by the singular values (and add/remove some dimensions). Finally, $U$ will perform another change of basis.

  • SVD works for both real and complex matrices, so in general $A = U \Sigma V^*$, where $V^*$ is the conjugate transpose of $V$.

  • SVD is a generalisation of the spectral decomposition (eigendecomposition), which is also a diagonal factorisation, but for symmetric matrices only (or, in the complex case, Hermitian).

  • Eigendecomposition (for a diagonalisable square matrix $A$, given by $A = PDP^{-1}$), in contrast to SVD, operates in a single vector space: the change of basis is performed once by $P^{-1}$ and then undone by $P$. A sketch comparing the two on a symmetric matrix follows these notes.
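
For a symmetric positive semi-definite matrix the two factorisations coincide (up to ordering and signs of the columns); a small NumPy sketch with an arbitrarily chosen matrix:

```python
import numpy as np

# An arbitrary symmetric positive semi-definite matrix
B = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# Spectral decomposition: B = P D P^T with orthogonal P
eigvals, P = np.linalg.eigh(B)                # eigenvalues in ascending order

# SVD: B = U Sigma V^T, singular values in descending order
U, s, Vt = np.linalg.svd(B)

# Here the singular values are the eigenvalues, just reordered
print(np.allclose(np.sort(s), eigvals))       # True

# Both factorisations reproduce B
print(np.allclose(P @ np.diag(eigvals) @ P.T, B),
      np.allclose(U @ np.diag(s) @ Vt, B))    # True True
```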


[Image: step-by-step illustration of the SVD] The inserted image hopefully adds to the answer of TheSHETTY-Paradise.

It shows the three operations (traced step by step in the sketch after this list):

(1) a rotation in the 'domain' space,

(2) a scaling and a change of dimension, and then

(3) a rotation in the image space.
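
A minimal sketch of these three steps, assuming an arbitrary $2\times 3$ matrix and NumPy's full SVD:

```python
import numpy as np

# An arbitrary 2 x 3 matrix: a map from R^3 (domain) to R^2 (image space)
A = np.array([[1.0, 0.0, 2.0],
              [0.0, 3.0, 1.0]])
x = np.array([1.0, 1.0, 1.0])

U, s, Vt = np.linalg.svd(A, full_matrices=True)   # U: 2x2, Vt: 3x3

y = Vt @ x                    # (1) rotation in the domain space R^3
Sigma = np.zeros((2, 3))      # (2) scaling and change of dimension:
Sigma[:2, :2] = np.diag(s)    #     the 2 x 3 Sigma drops one coordinate
z = Sigma @ y
result = U @ z                # (3) rotation in the image space R^2

print(np.allclose(result, A @ x))   # True: the three steps compose to A
```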

The other image shows all the possible shapes of $\Sigma$: a rectangular diagonal matrix that is square, tall, or wide depending on the dimensions of $A$.

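To see those shapes concretely, here is a small NumPy sketch (the random matrices are only placeholders) that rebuilds the full $m \times n$ $\Sigma$ for a tall, a square, and a wide matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
for m, n in [(4, 2), (3, 3), (2, 4)]:
    A = rng.normal(size=(m, n))
    U, s, Vt = np.linalg.svd(A, full_matrices=True)
    # Full m x n Sigma: the diagonal of singular values padded with
    # zero rows (m > n) or zero columns (m < n)
    Sigma = np.zeros((m, n))
    Sigma[:len(s), :len(s)] = np.diag(s)
    print(f"{m}x{n} Sigma:\n{Sigma.round(2)}")
    assert np.allclose(U @ Sigma @ Vt, A)
```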