Truly intuitive geometric interpretation for the transpose of a square matrix
I'm looking for an easily understandable interpretation of the transpose of a square matrix $A$: an intuitive visual demonstration of how $A^{T}$ relates to $A$. I want to be able to instantly visualize in my mind what I'm doing to the space when transposing the vectors of a matrix.
From experience, understanding linear algebra concepts in two dimensions is often enough to understand them in any higher dimension, so an explanation for two-dimensional spaces should be enough, I think.
All explanations I found so far were not intuitive enough, as I want to be able to instantly imagine (and draw) what $A^{T}$ looks like given $A$. I'm not a mathematician, btw.
Here is what I found so far (but not intuitive enough for me)
- $(Ax)\cdot y=(Ax)^{T}y=x^{T}A^{T}y=x\cdot(A^{T}y)$
As far as I understand, the dot product is a projection (of $x$ onto $y$, or of $y$ onto $x$; both interpretations give the same result) followed by a scaling by the length of the other vector.
This would mean that mapping $x$ through $A$ and projecting $y$ onto the result is the same as mapping $y$ through $A^{T}$ and then projecting the unmapped $x$ onto $A^{T}y$.
So $A^{T}$ is the specific matrix $B$ such that $Ax\cdot y=x\cdot By$ for every pair of vectors $(x,y)$.
This doesn't tell me instantly what $A^{T}$, drawn as vectors, would look like based on $A$ drawn as vectors.
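The identity itself is at least easy to check numerically. Here is a minimal sketch, assuming NumPy; the random matrix and vectors are just arbitrary examples:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((2, 2))   # an arbitrary 2x2 matrix
x = rng.standard_normal(2)
y = rng.standard_normal(2)

lhs = np.dot(A @ x, y)            # (Ax) . y
rhs = np.dot(x, A.T @ y)          # x . (A^T y)
print(np.isclose(lhs, rhs))       # True: transposing moves A to the other side of the dot product
```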
- "reassigning dimensions"
This one is hard to explain, so let me do it with a drawing:
[Figure: parallel projections]
This explanation is much more visual, but far too messy to do in my head instantly. There are also multiple ways I could have rotated and arranged the vectors around the result $A^{T}$, which is shown in the middle. Also, it doesn't feel like it makes me truly understand the transposing of matrices, especially in higher dimensions.
- some kind of weird rotation
Symmetric matrices can be decomposed into a rotation, a scaling along the eigenvectors (by the eigenvalues in $\Lambda$), and a rotation back:
$A=R\Lambda R^{T}$
So in this specific case, the transpose is a rotation in the opposite direction of the original. I don't know how to generalize that to arbitrary matrices. I'm wildly guessing that if $A$ is not symmetric anymore, $R^{T}$ must also include some additional operations besides rotation.
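As a quick numerical sanity check of the symmetric case, here is a sketch assuming NumPy; note that `np.linalg.eigh` returns an orthogonal eigenvector matrix, i.e. a rotation possibly combined with a reflection:

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.standard_normal((2, 2))
A = M + M.T                               # a symmetric 2x2 matrix

lam, R = np.linalg.eigh(A)                # eigenvalues and orthogonal eigenvector matrix
Lam = np.diag(lam)

print(np.allclose(A, R @ Lam @ R.T))      # A = R Lambda R^T
print(np.allclose(R.T, np.linalg.inv(R))) # R^T really is the inverse (the "rotation back")
print(np.allclose(A.T, A))                # so transposing changes nothing in this case
```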
Can anyone help me find a way to easily and instantly imagine/draw what $A^{T}$ looks like given $A$ in two-dimensional space? (In a way of understanding that is generalizable to higher dimensions.)
Edit 1: While working on the problem I was curious to see what B in
$BA=A^{T}$
looks like. B would describe what needs to be done to A in order to geometrically transpose it. My temporary result looks interesting but I'm still trying to bring it to an interpretable form. If we assume the following indexing order
$$A= \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \\ \end{bmatrix} $$
and $det(A)\neq0$ then
$$B=\frac{1}{det(A)} \begin{bmatrix} a_{11} a_{22} - a_{21}^2 & a_{11} (a_{21} - a_{12}) \\ a_{22} (a_{12} - a_{21}) & a_{11} a_{22} - a_{12}^2 \\ \end{bmatrix} $$
What's visible at first sight is that the prefactor $\frac{1}{det(A)}$ rescales $B'$ so that the resulting $B$ has determinant exactly 1.
$B$ must indeed preserve area, as $det(A^{T})=det(A)$. This means that the matrix
$B'=\begin{bmatrix} a_{11} a_{22} - a_{21}^2 & a_{11} (a_{21} - a_{12}) \\ a_{22} (a_{12} - a_{21}) & a_{11} a_{22} - a_{12}^2 \\ \end{bmatrix}$
scales areas by $det(A)^{2}$ while doing the transposing.
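The closed form above can be verified symbolically. A small sketch assuming SymPy (since $BA=A^{T}$, $B$ is just $A^{T}A^{-1}$):

```python
import sympy as sp

a11, a12, a21, a22 = sp.symbols('a11 a12 a21 a22')
A = sp.Matrix([[a11, a12], [a21, a22]])

B = sp.simplify(A.T * A.inv())                       # from B A = A^T

# closed form from Edit 1
B_closed = sp.Matrix([[a11*a22 - a21**2, a11*(a21 - a12)],
                      [a22*(a12 - a21), a11*a22 - a12**2]]) / A.det()

print(sp.simplify(B - B_closed))                     # zero matrix: the closed form matches
print(sp.simplify(B.det()))                          # 1: B preserves area
```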
Edit 2:
The same matrix can be written as
$B'=\begin{bmatrix} \begin{bmatrix} a_{11} & a_{21} \\ \end{bmatrix} \begin{bmatrix} a_{22} \\ -a_{21} \\ \end{bmatrix} & \begin{bmatrix} a_{11} & a_{21} \\ \end{bmatrix} \begin{bmatrix} -a_{12} \\ a_{11} \\ \end{bmatrix} \\ \begin{bmatrix} a_{12} & a_{22} \\ \end{bmatrix} \begin{bmatrix} a_{22} \\ -a_{21} \\ \end{bmatrix} & \begin{bmatrix} a_{12} & a_{22} \\ \end{bmatrix} \begin{bmatrix} -a_{12} \\ a_{11} \\ \end{bmatrix} \\ \end{bmatrix}$
Which is
$B'=\begin{bmatrix} a_{1}^{T} \begin{bmatrix} a_{22} \\ -a_{21} \\ \end{bmatrix} & a_{1}^{T} \begin{bmatrix} -a_{12} \\ a_{11} \\ \end{bmatrix} \\ a_{2}^{T} \begin{bmatrix} a_{22} \\ -a_{21} \\ \end{bmatrix} & a_{2}^{T} \begin{bmatrix} -a_{12} \\ a_{11} \\ \end{bmatrix} \\ \end{bmatrix}= \begin{bmatrix} a_{1}\cdot \begin{bmatrix} a_{22} \\ -a_{21} \\ \end{bmatrix} & a_{1}\cdot \begin{bmatrix} -a_{12} \\ a_{11} \\ \end{bmatrix} \\ a_{2}\cdot \begin{bmatrix} a_{22} \\ -a_{21} \\ \end{bmatrix} & a_{2}\cdot \begin{bmatrix} -a_{12} \\ a_{11} \\ \end{bmatrix} \\ \end{bmatrix}$
I find the vectors $c_{1}=\begin{bmatrix} a_{22} \\ -a_{21} \\ \end{bmatrix}$ and $c_{2}=\begin{bmatrix} -a_{12} \\ a_{11} \\ \end{bmatrix}$ interesting. When I draw them, it looks like I only need to rotate each by 90 degrees in opposite directions to end up with the column vectors of the transpose.
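That observation is easy to test numerically. A minimal sketch assuming NumPy; `rot_ccw` and `rot_cw` are the standard +90°/-90° rotation matrices:

```python
import numpy as np

rng = np.random.default_rng(2)
a11, a12, a21, a22 = rng.standard_normal(4)
A = np.array([[a11, a12], [a21, a22]])

c1 = np.array([a22, -a21])
c2 = np.array([-a12, a11])

rot_ccw = np.array([[0, -1], [1, 0]])        # rotate by +90 degrees
rot_cw  = np.array([[0, 1], [-1, 0]])        # rotate by -90 degrees

print(np.allclose(rot_ccw @ c1, A.T[:, 1]))  # +90 deg on c1 gives the second column of A^T
print(np.allclose(rot_cw  @ c2, A.T[:, 0]))  # -90 deg on c2 gives the first column of A^T
```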
Edit 3:
Maybe I'm fooling myself, but I think I'm getting closer. The matrix built from these column vectors
$C= \begin{bmatrix} c_{1} & c_{2} \\ \end{bmatrix} = \begin{bmatrix} a_{22} & -a_{12} \\ -a_{21} & a_{11} \\ \end{bmatrix}$
is related to $A^{-1}$ because:
$AC=\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \\ \end{bmatrix} \cdot \begin{bmatrix} a_{22} & -a_{12} \\ -a_{21} & a_{11} \\ \end{bmatrix} = \begin{bmatrix} det(A) & 0 \\ 0 & det(A) \\ \end{bmatrix} =det(A) I$
So
$C=A^{-1}det(A)$
$B'$ can also be written like this:
$B'=\begin{bmatrix} A^{T}c_{1} & A^{T}c_{2} \\ \end{bmatrix} = \begin{bmatrix} \begin{bmatrix} a_{11} & a_{21} \\ a_{12} & a_{22} \\ \end{bmatrix} \begin{bmatrix} a_{22} \\ -a_{21} \\ \end{bmatrix} & \begin{bmatrix} a_{11} & a_{21} \\ a_{12} & a_{22} \\ \end{bmatrix} \begin{bmatrix} -a_{12} \\ a_{11} \\ \end{bmatrix} \\ \end{bmatrix}$, i.e. the columns of $B'$ are $A^{T}c_{1}$ and $A^{T}c_{2}$,
or like this
$B'=\begin{bmatrix} \begin{bmatrix} a_{11} & a_{21} \\ \end{bmatrix} \begin{bmatrix} a_{22} & -a_{12} \\ -a_{21} & a_{11} \\ \end{bmatrix} \\ \begin{bmatrix} a_{12} & a_{22} \\ \end{bmatrix} \begin{bmatrix} a_{22} & -a_{12} \\ -a_{21} & a_{11} \\ \end{bmatrix} \\ \end{bmatrix} = \begin{bmatrix} a_1^{T} \begin{bmatrix} c_{1} & c_{2} \\ \end{bmatrix} \\ a_2^{T} \begin{bmatrix} c_{1} & c_{2} \\ \end{bmatrix} \\ \end{bmatrix} = \begin{bmatrix} a_1^{T}C \\ a_2^{T}C \\ \end{bmatrix} = det(A) \begin{bmatrix} a_1^{T}A^{-1} \\ a_2^{T}A^{-1} \\ \end{bmatrix}$
Therefore for $BA=A^{T}$ we have
$B=\begin{bmatrix} a_1^{T}A^{-1} \\ a_2^{T}A^{-1} \\ \end{bmatrix}$
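Both claims in this edit ($AC=det(A)I$ and $BA=A^{T}$) can be checked symbolically. A small SymPy sketch, with $a_1, a_2$ taken as the columns of $A$ as above:

```python
import sympy as sp

a11, a12, a21, a22 = sp.symbols('a11 a12 a21 a22')
A = sp.Matrix([[a11, a12], [a21, a22]])
C = sp.Matrix([[a22, -a12], [-a21, a11]])            # columns c1, c2 from Edit 2

print(sp.simplify(A * C - A.det() * sp.eye(2)))      # zero matrix: A C = det(A) I

a1, a2 = A[:, 0], A[:, 1]                            # columns of A
B = sp.Matrix.vstack(a1.T * A.inv(), a2.T * A.inv()) # stack the two rows of B
print(sp.simplify(B * A - A.T))                      # zero matrix: B A = A^T
```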
Edit 4:
I think I will post my own answer soon. Going down the path of $A^{-1}$, I had the idea that one can exploit the symmetry of $AA^{T}$. Symmetry means that $AA^{T}$ decomposes more nicely:
$AA^{T} = R_{AA^{T}} \Lambda_{AA^{T}} (R^{-1})_{AA^{T}}$
Now if you multiply both sides with $A^{-1}$ you'll get
$A^{T} = A^{-1} R_{AA^{T}} \Lambda_{AA^{T}} (R^{-1})_{AA^{T}}$
When I do an example with numbers, I can also see that in my example $R_{AA^{T}} = (R^{-1})_{AA^{T}}$.
$R_{AA^{T}}$ mirrors the space across the y-axis and then rotates by some angle $\alpha$. So my suspicion right now is:
$A^{T}=A^{-1} R_{AA^{T}} \Lambda_{AA^{T}} R_{AA^{T}}$
Now if I define
$R_{AA^{T}}^{'} = \begin{bmatrix} \cos \alpha & -\sin \alpha \\ \sin \alpha & \cos \alpha \\ \end{bmatrix}$
to get the mirroring out of the matrix $R_{AA^{T}}$ then I get
$A^{T}=A^{-1} R_{AA^{T}}^{'} \begin{bmatrix} -1 & 0 \\ 0 & 1 \\ \end{bmatrix} \Lambda_{AA^{T}} R_{AA^{T}}^{'} \begin{bmatrix} -1 & 0 \\ 0 & 1 \\ \end{bmatrix} $
So generally
$A^{T}=A^{-1} R_{\alpha} M_y \Lambda R_{\alpha} M_y$
With $M_y$ being the mirroring across the y-axis, $R_{\alpha}$ some counterclockwise rotation by $\alpha$, and $\Lambda$ some scaling.
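The part of Edit 4 that does not depend on the specific example is the identity $A^{T} = A^{-1} R_{AA^{T}} \Lambda_{AA^{T}} (R^{-1})_{AA^{T}}$, and that is easy to verify numerically. A sketch assuming NumPy (`np.linalg.eigh` returns an orthogonal $R$, so $R^{-1}=R^{T}$; splitting off the mirroring $M_y$ is specific to the example above):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((2, 2))          # a generic (non-symmetric, invertible) example

lam, R = np.linalg.eigh(A @ A.T)         # A A^T is symmetric, so R is orthogonal
Lam = np.diag(lam)

lhs = A.T
rhs = np.linalg.inv(A) @ R @ Lam @ R.T   # A^{-1} R Lambda R^{-1}
print(np.allclose(lhs, rhs))             # True
```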
One geometric description of $A^T$ can be obtained from the singular value decomposition (SVD); this will be similar to your third point. Any square matrix $A \in M_n(\mathbb{R})$ can be written as a product $A = S \Lambda R^T$ where $\Lambda$ is diagonal with non-negative entries and both $S,R$ are orthogonal matrices. The diagonal entries of $\Lambda$ are called the singular values of $A$, while the columns of $S$ and $R$ are called the left and right singular vectors of $A$ respectively, and they can be computed explicitly (or at least as explicitly as one can compute eigenvalues and eigenvectors). Using this decomposition, we can describe $A^T$ as
$$ A^T = (S\Lambda R^T)^T = R \Lambda S^T. $$
What does this mean geometrically? Assume for simplicity that $n = 2$ (or $n = 3$) and that $\det S = \det R = 1$ so $R,S$ are rotations. If $A$ is symmetric, we can write $A = R \Lambda R^T$ where $R$ is a rotation and $\Lambda$ is diagonal. Geometrically, this describes the action of $A$ as the composition of three operations:
- Perform the rotation $R^T$.
- Stretch each of the coordinate axes $e_i$ by a factor $\lambda_i$ (which is the $(i,i)$-entry of $\Lambda$).
- Finally, perform the rotation $R$ which is the inverse of the rotation $R^T$.
In other words, $A$ acts by rotating, stretching the standard basis vectors and then rotating back.
When $A$ is not symmetric, we can't have such a description, but the decomposition $A = S \Lambda R^T$ gives us the next best thing. It describes the action of $A$ as the composition of three operations:
- First, perform the rotation $R^T$.
- Stretch each of the coordinate axes $e_i$ by a factor $\sigma_i$ (which is the $(i,i)$-entry of $\Lambda$).
- Finally, perform a different rotation $S$ which is not necessarily the inverse of $R^T$.
Unlike the case when $A$ was symmetric, here $R \neq S$ so the action of $A$ is a rotation, followed by stretching and then by another rotation. The action of $A^T = R\Lambda S^T$ is then obtained by reversing the roles of $R,S$ while keeping the same stretch factors. Namely, $A$ rotates by $R^T$, stretches by $\Lambda$ and rotates by $S$ while $A^T$ rotates by $S^T$, stretches by $\Lambda$ and rotates by $R$.
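For readers who want to see this concretely, here is a minimal numerical sketch assuming NumPy, whose `np.linalg.svd` returns the factors as $S$, the singular values, and $R^{T}$:

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((2, 2))

S, sigma, Rt = np.linalg.svd(A)             # A = S @ diag(sigma) @ Rt, with Rt = R^T
Lam = np.diag(sigma)

print(np.allclose(A,   S @ Lam @ Rt))       # A   = S Lambda R^T
print(np.allclose(A.T, Rt.T @ Lam @ S.T))   # A^T = R Lambda S^T: same stretches, rotations swapped
```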
When trying to grasp the relation between $A$, $A^T$ and $A^{-1}$, I created the attached plot.
For $A^T$ this reads:
- $\mathcal{r}_{U^T}$ is the rotation performed by $U^T$
- $\mathcal{s}_{\Sigma}$ is the scaling performed by $\Sigma$
- $\mathcal{r}_{V}$ is the rotation performed by $V$
The three axes show the SVD-decomposition of the three incarnations of $A$.
- A green line between two axes indicates equality.
- A red line indicates a contraposition.
In short, this says
"$A^T$ scales like $A$, but rotates like $A^{-1}$."
So, $A^T$ has more in common with $A^{-1}$ than it has in common with $A$.
Not all matrices have an inverse.
If the inverse does not exist, the plot can still be made, replacing $A^{-1}$ with $A^{\dagger}$ and $\Sigma^{-1}$ with $\Sigma^{\dagger}$.
$A^{\dagger}$ is the generalized inverse (Moore–Penrose pseudoinverse) of $A$.
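Concretely, "replace $\Sigma^{-1}$ with $\Sigma^{\dagger}$" means inverting only the non-zero singular values. A small sketch assuming NumPy; the singular example matrix and the tolerance `1e-12` are just illustrative choices:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, 4.0]])                       # singular: the second row is twice the first

U, sigma, Vt = np.linalg.svd(A)                  # A = U Sigma V^T
sigma_dagger = np.array([1/s if s > 1e-12 else 0.0 for s in sigma])

A_dagger = Vt.T @ np.diag(sigma_dagger) @ U.T    # A^dagger = V Sigma^dagger U^T
print(np.allclose(A_dagger, np.linalg.pinv(A)))  # matches NumPy's pseudoinverse
```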
Some more detail can be found on: www.heavisidesdinner.com
[Figure: SVD relations between $A$, $A^{T}$ and $A^{-1}$]