Truly intuitive geometric interpretation for the transpose of a square matrix

I'm looking for an easily understandable interpretation of the transpose of a square matrix $A$: an intuitive, visual demonstration of how $A^{T}$ relates to $A$. I want to be able to instantly visualize in my mind what I'm doing to the space when transposing the vectors of a matrix.

From experience, understanding linear algebra concepts in two dimensions is often enough to understand them in any higher dimension, so I think an explanation for two-dimensional spaces should be enough.

All explanations I have found so far were not intuitive enough, as I want to be able to instantly imagine (and draw) what $A^{T}$ looks like given $A$. I'm not a mathematician, by the way.

Here is what I have found so far (but not intuitive enough for me):

  1. $(Ax)\cdot y=(Ax)^{T}y=x^{T}A^{T}y=x\cdot A^{T}y$

As far as I understand, the dot product is a projection ($x$ onto $y$, or $y$ onto $x$; both interpretations give the same result) followed by a scaling by the length of the other vector.

This would mean that mapping $x$ through $A$ and projecting $y$ onto the result is the same as mapping $y$ through $A^{T}$ and then projecting the unmapped $x$ onto $A^{T}y$.

So $A^{T}$ is the specific matrix $B$ such that $Ax\cdot y=x\cdot By$ for every pair of vectors $(x,y)$.

This doesn't tell me instantly what $A^{T}$, drawn as vectors, would look like based on $A$ drawn as vectors.
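As a quick numerical sanity check of this identity, here is a small numpy sketch (the random matrix, the vectors and the variable names are just for illustration):

```python
import numpy as np

# Check (Ax).y = x.(A^T y) for a random 2x2 matrix and random vectors.
rng = np.random.default_rng(0)
A = rng.standard_normal((2, 2))
x = rng.standard_normal(2)
y = rng.standard_normal(2)

lhs = np.dot(A @ x, y)       # map x by A, then project onto y
rhs = np.dot(x, A.T @ y)     # map y by A^T, then project onto x
print(np.isclose(lhs, rhs))  # True
```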

  1. "reassigning dimensions"

This one is hard to explain so let me do this with a drawing:

(figure: parallel projections)

This explanation is much more visual, but far too messy to do in my head instantly. There are also multiple ways I could have rotated and arranged the vectors around the result $A^{T}$, which is shown in the middle. Also, it doesn't feel like it makes me truly understand the transposition of matrices, especially in higher dimensions.

  3. some kind of weird rotation

Symmetric matrices can be decomposed into a rotation, a scaling $\Lambda$ along the eigenvectors, and a rotation back:

$A=R\Lambda R^{T}$

So in this specific case, the transpose is a rotation in the opposite direction of the original. I don't know how to generalize that to arbitrary matrices. I'm wildly guessing that if $A$ is not symmetric anymore, $R^{T}$ must also include some additional operations besides rotation.
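For what it's worth, this can be checked numerically; the following numpy sketch (with a made-up symmetric example) confirms that the $R$ from the eigendecomposition is orthogonal, so $R^{T}$ is exactly the inverse rotation:

```python
import numpy as np

# For a symmetric matrix, eigh returns real eigenvalues and an orthogonal R,
# so A = R diag(lam) R^T and R^T undoes the rotation R.
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])   # arbitrary symmetric example
lam, R = np.linalg.eigh(A)

print(np.allclose(A, R @ np.diag(lam) @ R.T))  # True
print(np.allclose(R.T @ R, np.eye(2)))         # True: R^T = R^{-1}
```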

Can anyone help me find a way to easily and instantly imagine/draw what $A^{T}$ looks like given $A$ in two-dimensional space? (In a way of understanding that generalizes to higher dimensions.)

Edit 1: While working on the problem, I was curious to see what $B$ in

$BA=A^{T}$

looks like. $B$ would describe what needs to be done to $A$ in order to transpose it geometrically. My preliminary result looks interesting, but I'm still trying to bring it into an interpretable form. If we assume the following indexing order

$$A= \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \\ \end{bmatrix} $$

and $det(A)\neq0$ then

$$B=\frac{1}{det(A)} \begin{bmatrix} a_{11} a_{22} - a_{21}^2 & a_{11} (a_{21} - a_{12}) \\ a_{22} (a_{12} - a_{21}) & a_{11} a_{22} - a_{12}^2 \\ \end{bmatrix} $$

What's visible at first sight is that the prefactor $\frac{1}{det(A)}$ rescales everything (before the actual matrix $B'$ is applied) so that the overall map $B$ preserves area.

Indeed, $B$ must preserve area, since $det(A^{T})=det(A)$ forces $det(B)=1$. It means that the matrix

$B'=\begin{bmatrix} a_{11} a_{22} - a_{21}^2 & a_{11} (a_{21} - a_{12}) \\ a_{22} (a_{12} - a_{21}) & a_{11} a_{22} - a_{12}^2 \\ \end{bmatrix}$

multiplies areas by exactly $det(A)^{2}$ while transposing.
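Here is a small numpy sketch (random invertible example, names are mine) that checks both claims: $B=\frac{1}{det(A)}B'$ satisfies $BA=A^{T}$, and $det(B')=det(A)^{2}$:

```python
import numpy as np

# Verify Edit 1 numerically for a random (almost surely invertible) 2x2 matrix.
rng = np.random.default_rng(1)
A = rng.standard_normal((2, 2))
(a11, a12), (a21, a22) = A

Bp = np.array([[a11*a22 - a21**2, a11*(a21 - a12)],
               [a22*(a12 - a21), a11*a22 - a12**2]])  # B' from above
detA = np.linalg.det(A)
B = Bp / detA

print(np.allclose(B @ A, A.T))                 # True: B A = A^T
print(np.isclose(np.linalg.det(Bp), detA**2))  # True: B' scales areas by det(A)^2
```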

Edit 2:

The same matrix can be written as

$B'=\begin{bmatrix} \begin{bmatrix} a_{11} & a_{21} \\ \end{bmatrix} \begin{bmatrix} a_{22} \\ -a_{21} \\ \end{bmatrix} & \begin{bmatrix} a_{11} & a_{21} \\ \end{bmatrix} \begin{bmatrix} -a_{12} \\ a_{11} \\ \end{bmatrix} \\ \begin{bmatrix} a_{12} & a_{22} \\ \end{bmatrix} \begin{bmatrix} a_{22} \\ -a_{21} \\ \end{bmatrix} & \begin{bmatrix} a_{12} & a_{22} \\ \end{bmatrix} \begin{bmatrix} -a_{12} \\ a_{11} \\ \end{bmatrix} \\ \end{bmatrix}$

Which is

$B'=\begin{bmatrix} a_{1}^{T} \begin{bmatrix} a_{22} \\ -a_{21} \\ \end{bmatrix} & a_{1}^{T} \begin{bmatrix} -a_{12} \\ a_{11} \\ \end{bmatrix} \\ a_{2}^{T} \begin{bmatrix} a_{22} \\ -a_{21} \\ \end{bmatrix} & a_{2}^{T} \begin{bmatrix} -a_{12} \\ a_{11} \\ \end{bmatrix} \\ \end{bmatrix}= \begin{bmatrix} a_{1}\cdot \begin{bmatrix} a_{22} \\ -a_{21} \\ \end{bmatrix} & a_{1}\cdot \begin{bmatrix} -a_{12} \\ a_{11} \\ \end{bmatrix} \\ a_{2}\cdot \begin{bmatrix} a_{22} \\ -a_{21} \\ \end{bmatrix} & a_{2}\cdot \begin{bmatrix} -a_{12} \\ a_{11} \\ \end{bmatrix} \\ \end{bmatrix}$

I find the vectors $c_{1}=\begin{bmatrix} a_{22} \\ -a_{21} \\ \end{bmatrix}$ and $c_{2}=\begin{bmatrix} -a_{12} \\ a_{11} \\ \end{bmatrix}$ interesting. When I draw them, it looks like I only need to rotate each by 90 degrees in different directions to end up with the column vectors of the transpose.

Edit 3:

Maybe I'm fooling myself, but I think I'm getting closer. The matrix built from these columns

$C= \begin{bmatrix} c_{1} & c_{2} \\ \end{bmatrix} = \begin{bmatrix} a_{22} & -a_{12} \\ -a_{21} & a_{11} \\ \end{bmatrix}$

is related to $A^{-1}$ because:

$AC=\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \\ \end{bmatrix} \cdot \begin{bmatrix} a_{22} & -a_{12} \\ -a_{21} & a_{11} \\ \end{bmatrix} = \begin{bmatrix} det(A) & 0 \\ 0 & det(A) \\ \end{bmatrix} =det(A) I$

So

$C=A^{-1}det(A)$

$B'$ can also be written column-wise like this:

$B'=\begin{bmatrix} \begin{bmatrix} a_{11} & a_{21} \\ a_{12} & a_{22} \\ \end{bmatrix} \begin{bmatrix} a_{22} \\ -a_{21} \\ \end{bmatrix} & \begin{bmatrix} a_{11} & a_{21} \\ a_{12} & a_{22} \\ \end{bmatrix} \begin{bmatrix} -a_{12} \\ a_{11} \\ \end{bmatrix} \\ \end{bmatrix} = \begin{bmatrix} A^{T} c_{1} & A^{T} c_{2} \\ \end{bmatrix}$

or like this

$B'=\begin{bmatrix} \begin{bmatrix} a_{11} & a_{21} \\ \end{bmatrix} & \begin{bmatrix} a_{22} & -a_{12} \\ -a_{21} & a_{11} \\ \end{bmatrix} \\ \begin{bmatrix} a_{12} & a_{22} \\ \end{bmatrix} & \begin{bmatrix} a_{22} & -a_{12} \\ -a_{21} & a_{11} \\ \end{bmatrix} \\ \end{bmatrix} = \begin{bmatrix} a_1^{T} & \begin{bmatrix} c_{1} & c_{2} \\ \end{bmatrix} \\ a_2^{T} & \begin{bmatrix} c_{1} & c_{2} \\ \end{bmatrix} \\ \end{bmatrix} = \begin{bmatrix} a_1^{T}C \\ a_2^{T}C \\ \end{bmatrix} = det(A) \begin{bmatrix} a_1^{T}A^{-1} \\ a_2^{T}A^{-1} \\ \end{bmatrix}$

Therefore for $BA=A^{T}$ we have

$B=\begin{bmatrix} a_1^{T}A^{-1} \\ a_2^{T}A^{-1} \\ \end{bmatrix}$
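A quick numpy check of this row form (with $a_{1},a_{2}$ taken as the columns of $A$, as above; the random example is just for illustration):

```python
import numpy as np

# The matrix with rows a1^T A^{-1} and a2^T A^{-1} should equal A^T A^{-1}
# and should map A to A^T from the left.
rng = np.random.default_rng(2)
A = rng.standard_normal((2, 2))
Ainv = np.linalg.inv(A)

a1, a2 = A[:, 0], A[:, 1]                # columns of A
B = np.vstack([a1 @ Ainv, a2 @ Ainv])    # rows a_i^T A^{-1}

print(np.allclose(B, A.T @ Ainv))  # True
print(np.allclose(B @ A, A.T))     # True: B A = A^T
```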

Edit 4:

I think I will post my own answer soon. Going down the path of $A^{-1}$, I had the idea that one can exploit the symmetry of $AA^{T}$. Symmetry means that $AA^{T}$ decomposes more nicely:

$AA^{T} = R_{AA^{T}} \Lambda_{AA^{T}} (R^{-1})_{AA^{T}}$

Now if you multiply both sides from the left with $A^{-1}$, you get

$A^{T} = A^{-1} R_{AA^{T}} \Lambda_{AA^{T}} (R^{-1})_{AA^{T}}$
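This step can also be confirmed numerically; the following numpy sketch (random invertible example) rebuilds $A^{T}$ from $A^{-1}$ and the eigendecomposition of $AA^{T}$:

```python
import numpy as np

# A A^T is symmetric, so eigh gives A A^T = R diag(lam) R^T with orthogonal R,
# and therefore A^T = A^{-1} (A A^T) = A^{-1} R diag(lam) R^T.
rng = np.random.default_rng(3)
A = rng.standard_normal((2, 2))

lam, R = np.linalg.eigh(A @ A.T)
reconstructed = np.linalg.inv(A) @ R @ np.diag(lam) @ R.T
print(np.allclose(reconstructed, A.T))  # True
```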

When I work through an example with numbers, I can also see that in my example $R_{AA^{T}} = (R^{-1})_{AA^{T}}$.

$R_{AA^{T}}$ mirrors the space along the y axis and then rotates by some angle $\alpha$. So my suspicion right now is:

$A^{T}=A^{-1} R_{AA^{T}} \Lambda_{AA^{T}} R_{AA^{T}}$

Now if I define

$R_{AA^{T}}^{'} = \begin{bmatrix} \cos \alpha & -\sin \alpha \\ \sin \alpha & \cos \alpha \\ \end{bmatrix}$

to get the mirroring out of the matrix $R_{AA^{T}}$, then I get

$A^{T}=A^{-1} R_{AA^{T}}^{'} \begin{bmatrix} -1 & 0 \\ 0 & 1 \\ \end{bmatrix} \Lambda_{AA^{T}} R_{AA^{T}}^{'} \begin{bmatrix} -1 & 0 \\ 0 & 1 \\ \end{bmatrix} $

So generally

$A^{T}=A^{-1} R_{\alpha} M_y \Lambda R_{\alpha} M_y$

with $M_y$ being the mirroring along the y axis, $R_{\alpha}$ a counter-clockwise rotation by $\alpha$, and $\Lambda$ a scaling.


One geometric description of $A^T$ can be obtained from the SVD (this will be similar to your third point). Any square matrix $A \in M_n(\mathbb{R})$ can be written as a product $A = S \Lambda R^T$ where $\Lambda$ is diagonal with non-negative entries and both $S,R$ are orthogonal matrices. The diagonal entries of $\Lambda$ are called the singular values of $A$, while the columns of $S$ and $R$ are called the left and right singular vectors of $A$ respectively, and they can be computed explicitly (or at least as explicitly as one can compute eigenvalues and eigenvectors). Using this decomposition, we can describe $A^T$ as

$$ A^T = (S\Lambda R^T)^T = R \Lambda S^T. $$

What does this mean geometrically? Assume for simplicity that $n = 2$ (or $n = 3$) and that $\det S = \det R = 1$ so $R,S$ are rotations. If $A$ is symmetric, we can write $A = R \Lambda R^T$ where $R$ is a rotation and $\Lambda$ is diagonal. Geometrically, this describes the action of $A$ as the composition of three operations:

  1. Perform the rotation $R^T$.
  2. Stretch each of the coordinate axes $e_i$ by a factor $\lambda_i$ (which is the $(i,i)$-entry of $\Lambda$).
  3. Finally, perform the rotation $R$ which is the inverse of the rotation $R^T$.

In other words, $A$ acts by rotating, stretching the standard basis vectors and then rotating back.

When $A$ is not symmetric, we can't have such a description, but the decomposition $A = S \Lambda R^T$ gives us the next best thing: it describes the action of $A$ as the composition of three operations:

  1. First, perform the rotation $R^T$.
  2. Stretch each of the coordinate axes $e_i$ by a factor $\sigma_i$ (which is the $(i,i)$-entry of $\Lambda$).
  3. Finally, perform a different rotation $S$ which is not necessarily the inverse of $R^T$.

Unlike the case when $A$ is symmetric, here $R \neq S$ in general, so the action of $A$ is a rotation, followed by a stretching, followed by another rotation. The action of $A^T = R\Lambda S^T$ is then obtained by swapping the roles of $R$ and $S$ while keeping the same stretch factors. Namely, $A$ rotates by $R^T$, stretches by $\Lambda$ and rotates by $S$, while $A^T$ rotates by $S^T$, stretches by $\Lambda$ and rotates by $R$.
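If it helps, here is a short numpy illustration of this answer (numpy's `svd` returns $S$, the diagonal of $\Lambda$ and $R^{T}$; the orthogonal factors need not have determinant $1$, but the algebra is the same):

```python
import numpy as np

# A = S Lambda R^T, and transposing swaps the roles of S and R.
rng = np.random.default_rng(4)
A = rng.standard_normal((2, 2))

S, sigma, Rt = np.linalg.svd(A)   # A = S @ diag(sigma) @ Rt
Lam = np.diag(sigma)

print(np.allclose(A, S @ Lam @ Rt))        # A   = S Lambda R^T
print(np.allclose(A.T, Rt.T @ Lam @ S.T))  # A^T = R Lambda S^T
```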


When trying to grasp the relation between $A$, $A^T$ and $A^{-1}$, I created the attached plot.
For $A^T$ this reads:

  • $r_{U^T}$ is the rotation performed by $U^T$
  • $s_{\Sigma}$ is the scaling performed by $\Sigma$
  • $r_{V}$ is the rotation performed by $V$

The three axes show the SVD-decomposition of the three incarnations of $A$.

  • A green line between two axes indicates equality.
  • A red line indicates an opposition.

In short, this says
"$A^T$ scales like $A$, but rotates like $A^{-1}$."
So, $A^T$ has more in common with $A^{-1}$ than it has in common with $A$.
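A small numpy sketch of this statement (assuming $A$ is invertible; with $A = U\Sigma V^{T}$, both $A^{T}$ and $A^{-1}$ use the orthogonal factors $V$ and $U^{T}$, but $A^{T}$ keeps the singular values of $A$ while $A^{-1}$ inverts them):

```python
import numpy as np

# "A^T scales like A, but rotates like A^{-1}"
rng = np.random.default_rng(5)
A = rng.standard_normal((2, 2))          # almost surely invertible

U, s, Vh = np.linalg.svd(A)              # A = U @ diag(s) @ Vh
print(np.allclose(A.T, Vh.T @ np.diag(s) @ U.T))                 # same rotations as A^{-1}, same scaling as A
print(np.allclose(np.linalg.inv(A), Vh.T @ np.diag(1/s) @ U.T))  # same rotations, inverted scaling
print(np.allclose(np.linalg.svd(A.T)[1], s))                     # singular values of A^T equal those of A
```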

Not all matrices have an inverse.
If the inverse does not exist, the plot can still be made, replacing $A^{-1}$ with $A^{\dagger}$ and $\Sigma^{-1}$ with $\Sigma^{\dagger}$.
$A^{\dagger}$ is the generalized inverse (Moore–Penrose pseudoinverse) of $A$.

Some more detail can be found at: www.heavisidesdinner.com

(plot: relation between the SVD of $A$, $A^T$ and $A^{-1}$)