How unique are $U$ and $V$ in the Singular Value Decomposition?
According to Wikipedia:
A common convention is to list the singular values in descending order. In this case, the diagonal matrix $\Sigma$ is uniquely determined by $M$ (though the matrices $U$ and $V$ are not).
My question is, are $U$ and $V$ uniquely determined up to some equivalence relation (and what equivalence relation)?
Solution 1:
Let $A = U_1 \Sigma V_1^* = U_2 \Sigma V_2^*$. Let us assume that $\Sigma$ has distinct diagonal elements and that $A$ is tall. Then
$$A^* A = V_1 \Sigma^* \Sigma V_1^* = V_2 \Sigma^* \Sigma V_2^*.$$
From this, we get
$$\Sigma^* \Sigma V_1^* V_2 = V_1^* V_2 \Sigma^* \Sigma.$$
Notice that $\Sigma^* \Sigma$ is diagonal with all different diagonal elements (this is why we needed $A$ to be tall: a wide $A$ would give $\Sigma^* \Sigma$ several zero diagonal entries, hence repetitions) and $V_1^* V_2$ is unitary. Defining $V := V_1^* V_2$ and $D := \Sigma^* \Sigma$, we have
$$D V = V D.$$
Now, since $V$ and $D$ commute and $D$ has distinct eigenvalues, every eigenvector of $D$ is also an eigenvector of $V$. But $D$ is a diagonal matrix with distinct diagonal elements (i.e., distinct eigenvalues), so its eigenvectors are the elements of the canonical basis. That means that $V$ is diagonal too and, being also unitary, of the form
$$V = \operatorname{diag}(e^{{\rm i}\varphi_1}, e^{{\rm i}\varphi_2}, \dots, e^{{\rm i}\varphi_n}),$$
for some $\varphi_i$, $i=1,\dots,n$.
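Here is a minimal numerical sketch of this step (my own check, not part of the proof): comparing the $(j,k)$ entries of $DV = VD$ gives $(d_j - d_k)V_{jk} = 0$, so with distinct $d_j$ only diagonal unitaries commute with $D$.

```python
import numpy as np

# Sketch of the claim D V = V D  =>  V diagonal (for distinct d_j):
# entrywise, (D V - V D)_{jk} = (d_j - d_k) V_{jk}, so V_{jk} = 0 for j != k.

rng = np.random.default_rng(0)
D = np.diag([3.0, 2.0, 1.0])                            # distinct diagonal
V = np.diag(np.exp(1j * rng.uniform(0, 2 * np.pi, 3)))  # diagonal phases
assert np.allclose(D @ V, V @ D)                        # these commute

Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))            # generic non-diagonal unitary
assert not np.allclose(D @ Q, Q @ D)                    # ...does not commute
```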
In other words, $V_2 = V_1 V$. Plug that back into the formula for $A$ and you get
$$A = U_1 \Sigma V_1^* = U_2 \Sigma V_2^* = U_2 \Sigma V^* V_1^* = U_2 V^* \Sigma V_1^*,$$
where the last equality holds because the diagonal matrices $\Sigma$ and $V^*$ commute when $\Sigma$ is square. So, $U_2 = U_1 V$ if $\Sigma$ (and, by extension, $A$) is square and nonsingular. Other options, somewhat similar to this, are possible if $\Sigma$ has zeroes on the diagonal and/or is rectangular.
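As a minimal NumPy sketch of that extra freedom, assuming a rank-deficient $A$: a column of $V$ paired with a zero singular value can be changed on its own, without touching $U$, and $A$ stays the same.

```python
import numpy as np

# Sketch: for a rank-deficient A, the column of V paired with a zero
# singular value can change independently of U without affecting A.

rng = np.random.default_rng(1)
A = rng.normal(size=(4, 2)) @ rng.normal(size=(2, 3))   # 4x3, rank 2
U, s, Vh = np.linalg.svd(A, full_matrices=False)        # s[2] is numerically zero

Vh2 = Vh.copy()
Vh2[2, :] *= -1                       # flip only the null-space direction of V
assert np.allclose(U @ np.diag(s) @ Vh2, A)             # A is unchanged
```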
If $\Sigma$ has repeated diagonal elements, much more can be done to change $U$ and $V$ (for example, both can simultaneously permute the columns corresponding to a repeated singular value).
If $A$ is not tall, but wide, you can do the same thing by starting with $AA^*$.
So, to answer your question: for a square, nonsingular $A$, there is a nice relation between different pairs of $U$ and $V$ (multiplication by a unitary diagonal matrix, applied in the same way to both of them). Otherwise, you get quite a bit more freedom, which I believe is hard to formalize.
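A quick NumPy sketch of this relation for a random square nonsingular $A$: multiplying both $U$ and $V$ by the same diagonal unitary yields another valid SVD, since the phases cancel against $\Sigma$.

```python
import numpy as np

# Sketch: for square nonsingular A, (U L, Sigma, V L) is another SVD for any
# diagonal unitary L, because L and Sigma commute and L L^* = I.

rng = np.random.default_rng(2)
A = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
U, s, Vh = np.linalg.svd(A)                             # A = U @ diag(s) @ Vh

L = np.diag(np.exp(1j * rng.uniform(0, 2 * np.pi, 4)))  # diagonal unitary
U2, Vh2 = U @ L, L.conj().T @ Vh                        # U2 = U L, V2 = V L

assert np.allclose(U2 @ np.diag(s) @ Vh2, A)            # same A, new U and V
```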
Solution 2:
SVD in dyadic notation removes "trivial" redundancies
The SVD of an arbitrary matrix $A$ can be written in dyadic notation as $$A=\sum_k s_k u_k v_k^*,\tag A$$ where $s_k\ge0$ are the singular values, and $\{u_k\}_k$ and $\{v_k\}_k$ are orthonormal sets of vectors spanning $\mathrm{im}(A)$ and $\ker(A)^\perp$, respectively. The connection between this and the more standard way of writing the SVD of $A$ as $A=UDV^\dagger$ is that $u_k$ is the $k$-th column of $U$, and $v_k$ is the $k$-th column of $V$.
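As a concrete illustration (a minimal NumPy sketch), the dyadic form (A) can be checked directly against `numpy.linalg.svd`, keeping in mind that the rows of the returned $V^\dagger$ are the $v_k^*$:

```python
import numpy as np

# Sketch of the dyadic form (A): A equals the sum of rank-one terms
# s_k u_k v_k^*, with u_k the k-th column of U and v_k^* the k-th row of Vh.

rng = np.random.default_rng(3)
A = rng.normal(size=(5, 3)) + 1j * rng.normal(size=(5, 3))
U, s, Vh = np.linalg.svd(A, full_matrices=False)        # thin SVD

A_dyadic = sum(s[k] * np.outer(U[:, k], Vh[k, :]) for k in range(len(s)))
assert np.allclose(A_dyadic, A)
```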
Global phase redundancies are always present
If $A$ is nondegenerate (i.e., has no repeated singular values), the only freedom in the choice of the vectors $u_k,v_k$ is their global phase: replacing $u_k\mapsto e^{i\phi_k}u_k$ and $v_k\mapsto e^{i\phi_k}v_k$ does not affect $A$, since the phases cancel in each product $u_k v_k^*$.
Degeneracy gives more freedom
On the other hand, when there are repeated singular values, there is additional freedom in the choice of $u_k,v_k$, similarly to how there is more freedom in the choice of eigenvectors corresponding to degenerate eigenvalues. More precisely, note that (A) implies $$AA^\dagger=\sum_k s_k^2 \underbrace{u_k u_k^*}_{\equiv\mathbb P_{u_k}}, \qquad A^\dagger A=\sum_k s_k^2 \mathbb P_{v_k}.$$ This tells us that, whenever there are degenerate singular values, the corresponding set of principal components is defined only up to a unitary rotation in the corresponding degenerate eigenspace. In other words, the set of vectors $\{u_k\}$ in (A) can be chosen as any orthonormal basis of the eigenspace $\ker(AA^\dagger-s_k^2)$, and similarly $\{v_k\}_k$ can be any orthonormal basis of $\ker(A^\dagger A-s_k^2)$.
However, note that a choice of $\{v_k\}_k$ determines $\{u_k\}$, and vice versa: for $s_k>0$, applying (A) to $v_k$ and using the orthonormality of $\{v_k\}_k$ gives $u_k = A v_k/s_k$ (otherwise $A$ wouldn't be well-defined).
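A short numerical sketch of this determination, assuming all $s_k > 0$:

```python
import numpy as np

# Sketch: for s_k > 0, the choice of v_k fixes u_k via u_k = A v_k / s_k.

rng = np.random.default_rng(4)
A = rng.normal(size=(5, 3)) + 1j * rng.normal(size=(5, 3))
U, s, Vh = np.linalg.svd(A, full_matrices=False)

for k in range(len(s)):
    v_k = Vh[k, :].conj()                 # k-th column of V
    assert np.allclose(A @ v_k / s[k], U[:, k])
```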
TL;DR
A choice of $U$ uniquely determines $V$, so we can restrict ourselves to reasoning about the freedom in the choice of $U$. There are two main sources of redundancy:
- The vectors can always be scaled by a phase factor: $u_k\mapsto e^{i\phi_k}u_k$ and $v_k\mapsto e^{i\phi_k}v_k$. In matrix notation, this corresponds to changing $U\mapsto U \Lambda$ and $V\mapsto V\Lambda$ for an arbitrary diagonal unitary matrix $\Lambda$.
- When there are "degenerate singular values" $s_k$ (that is, singular values corresponding to degenerate eigenvalues of $A^\dagger A$), there is additional freedom in the choice of $U$, which can be chosen as any matrix whose columns form a basis for the eigenspace $\ker(AA^\dagger-s_k^2)$.
Finally, we should note that the former point is included in the latter, which therefore encodes all of the freedom allowed in choosing the vectors $\{u_k\}$. This is because multiplying the elements of an orthonormal basis by phases does not affect its being an orthonormal basis.
Solution 3:
I will complete the proof of @Vedran for the case when there are repeated singular values, which justifies what @glS has said. Let $A = U \Sigma V^T = U' \Sigma V'^T$ be a matrix with real entries (the case with complex entries is similar). Then, $$A^T A = V \Sigma^T \Sigma V^{T} = V' \Sigma^T \Sigma V'^T.$$
From this, we get $$\Sigma^T \Sigma V^T V' = V^T V' \Sigma^T \Sigma.$$ Defining the square matrix $Q = V^T V'$, we have $Q^T Q = (V^T V')^T V^T V' = I$ and, similarly, $Q Q^T = I$. Hence, $Q$ is an orthogonal matrix that satisfies the Sylvester equation $$Q\Sigma^T\Sigma - \Sigma^T \Sigma Q = 0.\tag{1}$$
Aiming to simplify the Sylvester equation (1) a little, counting multiplicities we can write $$\Sigma^T \Sigma = \sigma_1^2 I_{n_1} \oplus \sigma_2^2 I_{n_2} \oplus \cdots \oplus \sigma_{k}^2 I_{n_k} = \text{diag}(\sigma_1^2 I_{n_1}, \sigma_2^2 I_{n_2}, \dots, \sigma_{k}^2 I_{n_k}),$$ with $\sigma_i=\sigma_j$ iff $i=j$ and $n_i$ the multiplicity of $\sigma_{i}$.
Now, writing $Q$ in blocks conformably with $\Sigma^T \Sigma$, i.e., so that the block products in $Q \Sigma^T \Sigma$ and $\Sigma^T \Sigma Q$ make sense, we get a system of Sylvester equations $$Q_{ij}\,\sigma_j^2 I_{n_j} - \sigma_i^2 I_{n_i}\, Q_{ij}=0,$$ one for each $1\leq i,j\leq k$. Since the matrices $\sigma_i^2 I_{n_i}$ and $\sigma_j^{2}I_{n_j}$ do not share any eigenvalues for $i\not=j$, by the result discussed here we necessarily have $Q_{ij} = 0$ for $i\not= j$. This means that $V^T V' = \text{diag} (Q_{11}, Q_{22},\dots, Q_{kk})$ for orthogonal matrices $Q_{ii}$, meaning that $$V' = V \text{diag}(Q_{11}, Q_{22},\dots, Q_{kk})=V Q.\tag{2}$$
We can repeat the argument above for the matrix $A A^T$ and conclude that there exists an orthogonal matrix $\bar{Q}$ such that $$\bar{Q} = \text{diag}(\bar{Q}_{11}, \bar{Q}_{22},\dots, \bar{Q}_{kk})\tag{3}$$ and $U' = U \bar{Q}$, with the matrices $\bar{Q}_{ii}$ of the same size as $Q_{ii}$ whenever $\sigma_i\not=0$; this is possible because the matrices $A^T A$ and $A A^T$ have the same eigenvalues with the same multiplicities, except possibly the null eigenvalue. Taking into account that $A = U \Sigma V^T = U' \Sigma V'^T$, we conclude that $\Sigma = U^T U' \Sigma V'^T V = U^T U \bar{Q} \Sigma Q^T V^T V$, which simplifies to $$\Sigma = \bar{Q} \Sigma Q^T.$$ Lastly, considering the nonzero blocks of $\Sigma$ associated with the matrices $Q_{ii}$ and $\bar{Q}_{ii}$, we conclude that $\bar{Q}_{ii} = Q_{ii}$ in the expression (3) whenever $\sigma_i\not=0$.
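The block structure in (2)-(3) can be illustrated numerically. A sketch under an assumed spectrum with the repeated singular value $\sigma_1 = \sigma_2 = 2$: applying the same block-orthogonal $Q$ to the columns of both $U$ and $V$ leaves $A$ unchanged, while applying it to only one side does not.

```python
import numpy as np

# Sketch: with a repeated singular value, the same block-orthogonal Q
# applied to both U and V gives another valid SVD of A.

rng = np.random.default_rng(5)
U, _ = np.linalg.qr(rng.normal(size=(4, 4)))       # random orthogonal U
V, _ = np.linalg.qr(rng.normal(size=(4, 4)))       # random orthogonal V
s = np.array([2.0, 2.0, 1.0, 0.5])                 # sigma_1 = sigma_2 repeated
A = U @ np.diag(s) @ V.T

theta = 0.7                                        # rotation inside the 2x2 block
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
Q = np.block([[R, np.zeros((2, 2))],
              [np.zeros((2, 2)), np.eye(2)]])      # Q = diag(Q_11, I_1, I_1)

U2, V2 = U @ Q, V @ Q                              # same Q on both sides
assert np.allclose(U2 @ np.diag(s) @ V2.T, A)      # A is unchanged

# Applying Q to only one factor breaks the decomposition:
assert not np.allclose(U @ Q @ np.diag(s) @ V.T, A)
```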