How does the dot product convert a matrix into a scalar?

Sometimes, we just say that a $1\times 1$ matrix is the same as a scalar. After all, when it comes to addition and multiplication of $1\times 1$ matrices versus addition and multiplication of scalars, the only difference between something like $\begin{bmatrix}3\end{bmatrix}$ and $3$ is some brackets. Consider $$(3+5)\cdot 4 = 32 \\ (\begin{bmatrix} 3\end{bmatrix} + \begin{bmatrix} 5\end{bmatrix})\begin{bmatrix} 4\end{bmatrix} = \begin{bmatrix} 32\end{bmatrix}$$ The algebra works out exactly the same. So it's not ridiculous to think of $1\times 1$ matrices as just another way of writing scalars.

But if you do want to distinguish the two, then just think of the formula $a\cdot b = a^Tb$ as a way of finding out which scalar you get from the dot product of $a$ and $b$, and not literally as the dot product value itself (which should be a scalar). That is, we calculate the dot product of $\begin{bmatrix} 1 \\ 2\end{bmatrix}$ and $\begin{bmatrix} 3 \\ 4\end{bmatrix}$ by using the formula $$\begin{bmatrix} 1 \\ 2\end{bmatrix}^T\begin{bmatrix} 3 \\ 4\end{bmatrix} = \begin{bmatrix} 1 & 2\end{bmatrix}\begin{bmatrix} 3 \\ 4\end{bmatrix} = \begin{bmatrix} 11\end{bmatrix}$$ and then say that this tells us that the dot product is really $11$, the single entry of that $1\times 1$ matrix. So the formula $a^Tb$ is just an algorithm we use to find the correct scalar.
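As a rough sketch of this "compute a $1\times 1$ matrix, then read off its entry" view, here is the same example in plain Python. The helpers `transpose` and `matmul` are hypothetical names written for this illustration (matrices are stored as lists of rows); they are not from any particular library.

```python
def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows.

    The number of columns of `a` must equal the number of rows of `b`.
    """
    if len(a[0]) != len(b):
        raise ValueError("inner dimensions do not match")
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


a = [[1], [2]]  # 2x1 column vector
b = [[3], [4]]  # 2x1 column vector

product = matmul(transpose(a), b)  # a 1x1 matrix, still a list of rows
dot = product[0][0]                # the scalar stored in its only entry

print(product)  # [[11]]
print(dot)      # 11
```

The two printed values differ only by brackets, which is exactly the distinction discussed above.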

You can view it either way. It doesn't really make a difference.


You are not wrong, and it's always good to examine statements very carefully. We usually do not distinguish $1\times 1$ matrices from scalars, and in some sense you can think of scalar multiplication as a special rule for when one of the matrices is $1\times 1$. But the truth is that it is a convenient abuse of notation.

Here's one other thing to think about. A $1 \times 1$ real matrix is supposed to represent a linear mapping from a one-dimensional real vector space into a one-dimensional real vector space, essentially just $\mathbb R$ into $\mathbb R$. The only such mappings are those that take $x \mapsto ax$ for some fixed real number $a$. But this mapping is entirely determined by that real number $a$, so the matrix $\begin{bmatrix} a\end{bmatrix}$ and the scalar $a$ are essentially equivalent.
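To spell out why every linear map from $\mathbb R$ to $\mathbb R$ has this form, call the map $f$ (a name introduced here only for illustration) and set $a = f(1)$. Linearity lets the scalar $x$ be pulled out: $$f(x) = f(x\cdot 1) = x\,f(1) = ax.$$ So knowing the single number $f(1)$ is the same as knowing the whole map.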


It is true that you can only multiply an $m \times n$ matrix by an $n \times p$ matrix, i.e., the number of columns of the left matrix has to match the number of rows of the right matrix. From this we can conclude that the matrix product of a $1 \times 1$ matrix with an $n \times p$ matrix makes no sense for $n > 1$.

That being said, the space of matrices is a vector space, so it comes with a multiplication between scalars and matrices. So it makes sense to multiply a scalar by a matrix. Once you realize that scalars and $1 \times 1$ matrices can be identified, the confusion goes away.
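The difference between the two operations can be sketched in plain Python. The helpers `matmul` and `scale` below are hypothetical names written for this illustration, with matrices stored as lists of rows: the matrix product enforces the dimension rule, while scalar multiplication has no such restriction.

```python
def matmul(a, b):
    """Matrix product; columns of `a` must match rows of `b`."""
    if len(a[0]) != len(b):
        raise ValueError("inner dimensions do not match")
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def scale(c, m):
    """Scalar multiplication: multiply every entry of `m` by the scalar `c`."""
    return [[c * x for x in row] for row in m]


m = [[1, 2], [3, 4]]  # a 2x2 matrix

# Treating 3 as the 1x1 matrix [[3]] and using the matrix product fails,
# because [[3]] has 1 column but m has 2 rows.
try:
    matmul([[3]], m)
    product_defined = True
except ValueError:
    product_defined = False

# Treating 3 as a scalar works for a matrix of any shape.
scaled = scale(3, m)

print(product_defined)  # False
print(scaled)           # [[3, 6], [9, 12]]
```

When the shapes do happen to be compatible, for example `matmul([[3]], [[1, 2]])` and `scale(3, [[1, 2]])`, both give `[[3, 6]]`, which is why identifying the two rarely causes trouble.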