When are the pseudoinverse and the inverse of an invertible square matrix equal, and when do they differ?

I computed the inverse and the pseudoinverse of an invertible symmetric matrix in MATLAB, using the functions inv and pinv respectively, but I got different outputs, and I don't understand the reason for this.

Therefore, I want to know: in which cases will pinv and inv produce the same result, and in which cases will they produce different results?
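Here is a minimal sketch of what I mean (the matrix below is just an illustrative stand-in, not my actual data):

    % Compare inv and pinv on an invertible symmetric matrix.
    A = [4 1; 1 3];           % symmetric, positive definite, hence invertible
    Ainv  = inv(A);           % classic inverse
    Apinv = pinv(A);          % Moore-Penrose pseudoinverse
    disp(norm(Ainv - Apinv))  % tiny, but usually not exactly zero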


Solution 1:

What is the relationship between the classic matrix inverse $\mathbf{A}^{-1}$ and the Moore-Penrose pseudoinverse matrix $\mathbf{A}^{+}$?

The classic matrix inverse $\mathbf{A}^{-1}$ exists when the matrix $\mathbf{A} \in\mathbb{C}^{m\times m}_{m}$ is square and has full rank: all eigenvalues are nonzero, and the determinant is nonzero.

The Moore-Penrose pseudoinverse $\mathbf{A}^{+}$ exists for every nonzero matrix $\mathbf{A} \in\mathbb{C}^{m\times n}_{\rho}$. How does this general concept connect to the classic inverse when $n=\rho=m$?
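As a quick MATLAB illustration of this wider domain (the matrix is an arbitrary example), pinv succeeds where inv is not even defined:

    % pinv is defined for rectangular (and rank-deficient) matrices,
    % where the classic inverse does not exist.
    A  = [1 2 3; 4 5 6];    % 2x3, rank 2
    Ap = pinv(A);           % 3x2 Moore-Penrose pseudoinverse
    disp(norm(A*Ap*A - A))  % Penrose identity A*Ap*A = A, up to round-off
    % inv(A) would raise an error here: the matrix must be square.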

Fundamental theorem of linear algebra

The fundamental theorem provides a nice way to connect the general problem with the specific case.

A matrix $\mathbf{A} \in\mathbb{C}^{m\times n}_{\rho}$ induces four fundamental subspaces which are resolved into $\color{blue}{range}$ and $\color{red}{null}$ spaces. The domain $\mathbb{C}^{n}$ and the codomain $\mathbb{C}^{m}$ are resolved as

$$ \begin{align} \mathbb{C}^{n} &= \color{blue}{\mathcal{R} \left( \mathbf{A}^{*} \right)} \oplus \color{red}{\mathcal{N} \left( \mathbf{A} \right)} \\ \mathbb{C}^{m} &= \color{blue}{\mathcal{R} \left( \mathbf{A} \right)} \oplus \color{red}{\mathcal{N} \left( \mathbf{A}^{*} \right)} \end{align} $$
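A numerical sanity check of this splitting, on an arbitrary example matrix, using MATLAB's orth and null:

    % The domain C^n splits as R(A*) (+) N(A); the dimensions add up to n.
    A = [1 2 3; 2 4 6];                   % rank 1, so rho = 1
    rowsp  = orth(A');                    % orthonormal basis for R(A*), dim rho
    nullsp = null(A);                     % orthonormal basis for N(A), dim n - rho
    disp(size(rowsp,2) + size(nullsp,2))  % = n = 3
    disp(norm(rowsp' * nullsp))           % ~0: the two subspaces are orthogonal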

Singular value decomposition: general case

The singular value decomposition resolves the four subspaces with orthonormal bases:
$$
\begin{align}
\mathbf{A} &= \mathbf{U} \, \Sigma \, \mathbf{V}^{*} \\
&=
\left[ \begin{array}{cc} \color{blue}{\mathbf{U}_{\mathcal{R}}} & \color{red}{\mathbf{U}_{\mathcal{N}}} \end{array} \right]
\left[ \begin{array}{cccc|ccc} \sigma_{1} & 0 & \dots & & & \dots & 0 \\ 0 & \sigma_{2} \\ \vdots && \ddots \\ & & & \sigma_{\rho} \\\hline & & & & 0 & \\ \vdots &&&&&\ddots \\ 0 & & & & & & 0 \end{array} \right]
\left[ \begin{array}{c} \color{blue}{\mathbf{V}_{\mathcal{R}}^{*}} \\ \color{red}{\mathbf{V}_{\mathcal{N}}^{*}} \end{array} \right] \\
&=
\left[ \begin{array}{cccccc} \color{blue}{u_{1}} & \dots & \color{blue}{u_{\rho}} & \color{red}{u_{\rho+1}} & \dots & \color{red}{u_{m}} \end{array} \right]
\left[ \begin{array}{cc} \mathbf{S}_{\rho\times \rho} & \mathbf{0} \\ \mathbf{0} & \mathbf{0} \end{array} \right]
\left[ \begin{array}{c} \color{blue}{v_{1}^{*}} \\ \vdots \\ \color{blue}{v_{\rho}^{*}} \\ \color{red}{v_{\rho+1}^{*}} \\ \vdots \\ \color{red}{v_{n}^{*}} \end{array} \right]
\end{align}
$$
The connection to the subspaces is immediate:
$$
\begin{align}
\color{blue}{\mathcal{R} \left( \mathbf{A} \right)} &= \text{span} \left\{ \color{blue}{u_{1}}, \dots , \color{blue}{u_{\rho}} \right\} \\
\color{blue}{\mathcal{R} \left( \mathbf{A}^{*} \right)} &= \text{span} \left\{ \color{blue}{v_{1}}, \dots , \color{blue}{v_{\rho}} \right\} \\
\color{red}{\mathcal{N} \left( \mathbf{A}^{*} \right)} &= \text{span} \left\{ \color{red}{u_{\rho+1}}, \dots , \color{red}{u_{m}} \right\} \\
\color{red}{\mathcal{N} \left( \mathbf{A} \right)} &= \text{span} \left\{ \color{red}{v_{\rho+1}}, \dots , \color{red}{v_{n}} \right\}
\end{align}
$$
A natural expression of the Moore-Penrose pseudoinverse is in terms of the SVD:
$$
\begin{align}
\mathbf{A}^{+} &= \mathbf{V} \, \Sigma^{+} \, \mathbf{U}^{*} \\
&=
\left[ \begin{array}{cc} \color{blue}{\mathbf{V}_{\mathcal{R}}} & \color{red}{\mathbf{V}_{\mathcal{N}}} \end{array} \right]
\left[ \begin{array}{cccc|ccc} \sigma_{1}^{-1} & 0 & \dots & & & \dots & 0 \\ 0 & \sigma_{2}^{-1} \\ \vdots && \ddots \\ & & & \sigma_{\rho}^{-1} \\\hline & & & & 0 & \\ \vdots &&&&&\ddots \\ 0 & & & & & & 0 \end{array} \right]
\left[ \begin{array}{c} \color{blue}{\mathbf{U}_{\mathcal{R}}^{*}} \\ \color{red}{\mathbf{U}_{\mathcal{N}}^{*}} \end{array} \right] \\
&=
\left[ \begin{array}{cccccc} \color{blue}{v_{1}} & \dots & \color{blue}{v_{\rho}} & \color{red}{v_{\rho+1}} & \dots & \color{red}{v_{n}} \end{array} \right]
\left[ \begin{array}{cc} \mathbf{S}^{-1}_{\rho\times \rho} & \mathbf{0} \\ \mathbf{0} & \mathbf{0} \end{array} \right]
\left[ \begin{array}{c} \color{blue}{u_{1}^{*}} \\ \vdots \\ \color{blue}{u_{\rho}^{*}} \\ \color{red}{u_{\rho+1}^{*}} \\ \vdots \\ \color{red}{u_{m}^{*}} \end{array} \right]
\end{align}
$$
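This formula translates directly into MATLAB. The following sketch, with an arbitrary example matrix, inverts only the nonzero singular values and reproduces pinv:

    % Build the pseudoinverse from the SVD: invert the nonzero
    % singular values and transpose the zero block.
    A = [1 0 1; 0 1 0];               % 2x3, rank 2
    [U, S, V] = svd(A);               % full SVD: U is 2x2, S is 2x3, V is 3x3
    tol = max(size(A)) * eps(norm(A));
    r   = sum(diag(S) > tol);         % numerical rank rho
    Sp  = zeros(size(A'));            % Sigma^+ is n x m
    Sp(1:r,1:r) = diag(1 ./ diag(S(1:r,1:r)));
    Ap = V * Sp * U';                 % A^+ = V * Sigma^+ * U^*
    disp(norm(Ap - pinv(A)))          % ~0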

Singular value decomposition: nonsingular matrices

The question asks about the special case of nonsingular matrices. The matrices have trivial null spaces. That is,
$$ \color{red} {\mathcal{N} \left( \mathbf{A} \right)} = \mathbf{0}, \qquad \color{red} {\mathcal{N} \left( \mathbf{A}^{*} \right)} = \mathbf{0}. $$
The SVD simplifies to having only range space components:
$$ \mathbf{A} = \color{blue}{\mathbf{U}_{\mathcal{R}}} \, \mathbf{S} \, \color{blue}{\mathbf{V}^{*}_{\mathcal{R}}} $$
as does the pseudoinverse:
$$ \mathbf{A}^{+} = \color{blue}{\mathbf{V}_{\mathcal{R}}} \, \mathbf{S}^{-1} \, \color{blue}{\mathbf{U}^{*}_{\mathcal{R}}} $$

Does $\mathbf{A}^{+}=\mathbf{A}^{-1}$?

The classic inverse is defined by the properties $$ \mathbf{A}^{-1} \mathbf{A} = \mathbf{A} \mathbf{A}^{-1} = \mathbf{I}_{m}; $$ that is, it is both a left inverse and a right inverse.

$$ \mathbf{A}^{+} \mathbf{A} = \left( \color{blue}{\mathbf{V}_{\mathcal{R}}} \, \mathbf{S}^{-1} \, \color{blue}{\mathbf{U}^{*}_{\mathcal{R}}} \right) \left( \color{blue}{\mathbf{U}_{\mathcal{R}}} \, \mathbf{S} \, \color{blue}{\mathbf{V}^{*}_{\mathcal{R}}} \right) = \color{blue}{\mathbf{V}_{\mathcal{R}}} \, \mathbf{S}^{-1} \, \mathbf{S} \, \color{blue}{\mathbf{V}^{*}_{\mathcal{R}}} = \color{blue}{\mathbf{V}_{\mathcal{R}}} \, \color{blue}{\mathbf{V}^{*}_{\mathcal{R}}} = \mathbf{I}_{m} $$ A similar computation establishes $$ \mathbf{A} \mathbf{A}^{+} = \mathbf{I}_{m}. $$ The pseudoinverse for $\mathbf{A}\in\mathbb{C}^{m\times m}_{m}$ is therefore identically the classic inverse.
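A quick numerical check of these identities, on an arbitrary well-conditioned example:

    % For a full-rank square matrix, pinv is both a left and a right
    % inverse, and agrees with inv up to round-off.
    rng(0);                       % reproducible example
    A  = randn(5) + 5*eye(5);     % comfortably nonsingular
    Ap = pinv(A);
    disp(norm(Ap*A - eye(5)))     % ~machine epsilon
    disp(norm(A*Ap - eye(5)))     % ~machine epsilon
    disp(norm(Ap - inv(A)))       % ~machine epsilon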

Explaining the numerical differences

The SVD is very powerful, but also expensive: the routines are slower, yet more robust. For a well-conditioned matrix, the numerical differences between inv and pinv should be of the order of machine epsilon for the precision employed.

If the matrix is poorly conditioned, the differences can be extreme, and they highlight the need for the SVD. Thinking of rank as a discrete number is overly simplistic.

Consider the matrix $$ \mathbf{A} = \left[ \begin{array}{cc} 1 & 1 \\ 0 & \epsilon \end{array} \right] $$ If $\epsilon\ne0$, the matrix has rank 2. Numerically, however, as $\epsilon\to 0$ the condition number grows without bound, and the quality of the computed inverse degrades with it.
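A short MATLAB sketch of that erosion, sweeping $\epsilon$ toward zero:

    % As epsilon -> 0, the matrix approaches rank 1; the condition
    % number blows up and inv and pinv drift apart.
    for ep = [1e-2 1e-8 1e-14]
        A = [1 1; 0 ep];
        fprintf('epsilon = %g   cond = %g   |inv - pinv| = %g\n', ...
                ep, cond(A), norm(inv(A) - pinv(A)));
    end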

Solution 2:

In exact arithmetic, the pseudoinverse does indeed equal the inverse for invertible matrices.

According to the documentation, MATLAB's inv is based on an LU or LDL decomposition, while pinv is based on the singular value decomposition. Different algorithms are used even when the matrix is invertible, so rounding error accumulates differently. You should never expect exact equality when dealing with floating-point arithmetic.
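The two code paths can be made explicit. The following is a rough sketch of the idea, not MATLAB's actual internals:

    % Two routes to the inverse of a nonsingular matrix: an LU-style
    % route (roughly what inv does) and the SVD route (what pinv does).
    A = [4 1; 1 3];
    [L, U, P] = lu(A);
    invLU = U \ (L \ P);          % from P*A = L*U  =>  A^{-1} = U^{-1} L^{-1} P
    [Uv, S, Vv] = svd(A);
    invSVD = Vv * diag(1./diag(S)) * Uv';
    disp(norm(invLU - invSVD))    % small, but generally not exactly zero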

If the difference is large for a well-conditioned matrix, file a bug report.