How many geometrical interpretations does matrix multiplication have?

I am wondering what the geometrical interpretation of matrix multiplication is, and in how many different ways it can be interpreted. The one obvious use is transformations... I understand this a bit in 3D, but what about n-dimensional multiplication --- what kinds of transformations arise there?

So, what I want to know is: 1. How can we interpret matrix multiplication geometrically? 2. In how many different ways can it be interpreted? 3. What role do the coefficients actually play in the transformation? 4. How do we interpret matrix multiplication in more than three dimensions?

Thanks a lot.


In order to have a geometrical understanding of matrix multiplication, we must first establish a reasonable geometric understanding of matrices themselves. We do this by representing points in space as column vectors (how to do this is in itself somewhat subtle, but I'm going to skimp on the details for brevity) and then using matrices to represent transformations of column vectors, which I'm going to write as rows $(x_1,x_2,\dots,x_n)$ for convenience.

For some $m \times n$ matrix $A$, and a column vector $\mathbf x = (x_1, x_2, \dots, x_n)$, let's define the transformation $f_A(\mathbf x)=\mathbf y = (y_1, \dots,y_m)$ by $y_i = \sum_{j = 1}^n A_{ij} x_j$.
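
If you want to see this definition in action, here is a small sketch in Python using numpy (the particular matrix and vector are just made up for illustration); it checks that the component-wise sum above agrees with the usual matrix-vector product:

```python
import numpy as np

# A hypothetical 2x3 matrix A and 3-component vector x, chosen only for illustration.
A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])
x = np.array([1.0, -1.0, 2.0])

# Component-wise definition: y_i = sum over j of A_ij * x_j
y_manual = np.array([sum(A[i, j] * x[j] for j in range(A.shape[1]))
                     for i in range(A.shape[0])])

y_builtin = A @ x          # the built-in matrix-vector product

print(y_manual)                           # [ 5. 11.]
print(np.allclose(y_manual, y_builtin))   # True
```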

Now suppose $A$ is an $m \times n$ matrix and $B$ is an $n \times p$ matrix, and they have corresponding functions on column vectors $f_A$ and $f_B$. Notice that $f_B$ outputs vectors with $n$ components, and $f_A$ takes vectors of $n$ components as input: we might ask what happens if we put the output of $f_B$ into the input of $f_A$. Let $\mathbf v = (v_1,\dots,v_p)$, and write things like $f_B(\mathbf v)_j$ to mean "the $j^\mathrm{th}$ component of $f_B(\mathbf v)$". Then:

\begin{align*} f_A(f_B(\mathbf v))_i &= \sum_{j=1}^n A_{ij} f_B(\mathbf v)_j \\ &= \sum_{j=1}^n A_{ij} \sum_{k=1}^p B_{jk} v_k \\ &= \sum_{k=1}^p \sum_{j=1}^n A_{ij} B_{jk} v_k \\ &= \sum_{k=1}^p (AB)_{ik}v_k \\ &= f_{AB}(\mathbf v) \end{align*}

where we define the matrix multiplication $AB$ as the $m\times p$ matrix given by $(AB)_{ik} = \sum_{j=1}^n A_{ij} B_{jk}$.

So when we associate functions with matrices in the way I described above, the matrix product of $A$ and $B$ gives the matrix whose function is the function of $B$ followed by the function of $A$. Note that in order for this to make sense, the input of $f_A$ has to be the same size as the output of $f_B$, and so the width of $A$ (its number of columns) has to equal the height of $B$ (its number of rows), which is exactly the condition for matrix multiplication to be defined.
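
If you want to verify the composition identity numerically, here is a small sketch in Python with numpy (the sizes $m, n, p$ and the random matrices are arbitrary choices, purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, p = 2, 3, 4                      # arbitrary sizes for illustration
A = rng.standard_normal((m, n))        # f_A maps R^n -> R^m
B = rng.standard_normal((n, p))        # f_B maps R^p -> R^n
v = rng.standard_normal(p)

# Applying B first and then A gives the same result as applying AB once.
print(np.allclose(A @ (B @ v), (A @ B) @ v))   # True
```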


It helps to think of bases first. Consider the Euclidean space $\mathbb{R}^{n}$, spanned by the basis $\{x_{1},\dots,x_{n}\}$. The standard basis for this space is $\left(\begin{matrix}1\\ 0\\ \vdots\\ 0 \end{matrix}\right),\left(\begin{matrix}0\\ 1\\ \vdots\\ 0 \end{matrix}\right),\dots,\left(\begin{matrix}0\\ \vdots\\ 0\\ 1 \end{matrix}\right)$, but you can use any basis (though using the standard basis makes it a little easier to understand what is happening). A matrix (or linear transformation) $A$ provides the directions, so to speak, for how to map these vectors into your new space $\mathbb{R}^{m}$. Write $A=\left(A(x_{1})|\cdots|A(x_{n})\right)$. The column $A_{i}$ is the vector that $x_{i}$ maps onto under the transformation $A$. For now, don't think so much about the coefficients; think more about the columns. For example, the matrix $$ A=\left(x_{1},0,\dots,0\right) $$

maps $x_{1}$ onto $x_{1}$ and annihilates the other $x_{j}$. Now, if we consider any vector $X$, we can write it in terms of the basis vectors of $\mathbb{R}^{n}$ as $X=\sum_{i=1}^{n}c_{i}x_{i}$, and we can see what $A$ does to it. Simply take $AX=A\sum_{i=1}^{n}c_{i}x_{i}=\sum_{i=1}^{n}c_{i}Ax_{i}=\sum_{i=1}^{n}c_{i}A_{i}$ by linearity, where $A_{i}$ is the $i^{th}$ column of $A$. Thus we can see that $A$, in this case, extracts the first coordinate of $X$ and multiplies it by $x_{1}$. We can consider a more complicated example: $$ A=\left(x_{2},x_{1},0,\dots,0\right) $$

In this example, $A$ sends the first basis vector to the second basis vector, and the second basis vector to the first basis vector. So we have $AX=c_{1}x_{2}+c_{2}x_{1}$.
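
As a small numerical sketch of this swap example, assuming the standard basis of $\mathbb{R}^{3}$ and using Python with numpy (the coefficients in $X$ are made up for illustration):

```python
import numpy as np

# In R^3 with the standard basis: first column is x2, second is x1, the rest are zero.
A = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 0.0],
              [0.0, 0.0, 0.0]])

e1 = np.array([1.0, 0.0, 0.0])
e2 = np.array([0.0, 1.0, 0.0])

print(A @ e1)        # [0. 1. 0.]  -- the first basis vector goes to the second
print(A @ e2)        # [1. 0. 0.]  -- the second basis vector goes to the first

# For a general vector X with (made-up) coefficients c1=3, c2=5, c3=7:
X = np.array([3.0, 5.0, 7.0])
print(A @ X)         # [5. 3. 0.] = c1*x2 + c2*x1
```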

Now we can consider matrices $A$ whose columns are linear combinations of the $x_{i}$. For example, $$ A=\left(x_{1}+x_{2},0,\dots,0\right) $$

sends $x_{1}$ to $x_{1}+x_{2}$; i.e., we would have $AX=c_{1}(x_{1}+x_{2})$. Many of the transformations we encounter can be thought of in the same way.
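
The same kind of numerical check works here too (again a sketch with numpy, standard basis, and made-up coefficients):

```python
import numpy as np

# First column is x1 + x2, every other column is zero (3x3 case for illustration).
A = np.array([[1.0, 0.0, 0.0],
              [1.0, 0.0, 0.0],
              [0.0, 0.0, 0.0]])

X = np.array([2.0, -1.0, 4.0])   # made-up coefficients c1=2, c2=-1, c3=4
print(A @ X)                     # [2. 2. 0.] = c1 * (x1 + x2)
```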

Consider rotation matrices. A rotation matrix sends the basis vectors $x_{1},\dots,x_{n}$ to a new set of basis vectors that have been rotated through some set of angles $\{\theta,\phi_{1},\dots,\phi_{n-1}\}$; its columns are exactly these rotated basis vectors. Using what we have seen above, we can then find what the matrix does to any vector $X=\sum_{i=1}^{n}c_{i}x_{i}$.
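
For instance, in two dimensions (a sketch with numpy; the angle is an arbitrary choice), the columns of the rotation matrix are precisely the images of the standard basis vectors, and any vector is carried along by linearity:

```python
import numpy as np

theta = np.pi / 4                     # an arbitrary angle
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

e1 = np.array([1.0, 0.0])
e2 = np.array([0.0, 1.0])

print(R @ e1)    # approx [0.707 0.707]  -- e1 rotated by theta (first column of R)
print(R @ e2)    # approx [-0.707 0.707] -- e2 rotated by theta (second column of R)

# Any X = c1*e1 + c2*e2 is sent to c1*(R e1) + c2*(R e2), by linearity.
X = np.array([3.0, 1.0])
print(np.allclose(R @ X, 3.0 * (R @ e1) + 1.0 * (R @ e2)))   # True
```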

Hope this helps!