Relationship between tuples, vectors and column/row matrices

Solution 1:

As with many aspects of linear algebra, this question is greatly cleared up by working in a coordinate-independent fashion. An $n \times m$ matrix is really a particular representation of a linear transformation $T$ from a vector space of dimension $m$ to a vector space of dimension $n$. To write down such a representation you need a basis for both the source and the target vector spaces.

If $V$ is a single $n$-dimensional vector space, then

  • $n \times n$ matrices generally denote linear transformations $T : V \to V$ with respect to a basis $e_1, \dots, e_n$. Note that we are using the same basis for both source and target.
  • $n \times 1$ matrices generally denote elements of $V$. Note that this is the same thing as a linear transformation $k \to V$ where $k$ is the base field (I assume $k = \mathbb{R}$ here).
  • $1 \times n$ matrices generally denote linear transformations $V \to k$, otherwise known as elements of the dual space $V^{\ast}$.
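A minimal numpy sketch of the three shapes above (the variable names are mine, not part of the answer): a square matrix acts on column vectors and stays in $V$, a row acting on a column produces a scalar, and a row pulled back through a square matrix stays a row.

```python
import numpy as np

n = 3
rng = np.random.default_rng(0)

A = rng.standard_normal((n, n))   # n x n: a linear map V -> V
v = rng.standard_normal((n, 1))   # n x 1: an element of V (a column)
w = rng.standard_normal((1, n))   # 1 x n: an element of V* (a row)

Av = A @ v   # n x 1: T applied to a vector is again a vector in V
wv = w @ v   # 1 x 1: a covector evaluated on a vector, i.e. a scalar
wA = w @ A   # 1 x n: a covector composed with T is again a covector

assert Av.shape == (n, 1)
assert wv.shape == (1, 1)
assert wA.shape == (1, n)
```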

Once we pick a basis $e_1, \dots, e_n$ of $V$, every element $v \in V$ can be uniquely expressed in the form

$$v = \sum_i v_i e_i.$$

Now (and this is somewhat confusing, but it is a lesson well worth learning) the coefficients $v_i$ actually define linear transformations $V \to k$; in other words, they define distinguished elements of $V^{\ast}$ called the dual basis $e_i^{\ast}$ associated to $e_1, \dots, e_n$. Again, this is confusing, so I'll repeat: $e_i^{\ast}$ is the linear transformation $V \to k$ which sends a vector $v$ to the coefficient $v_i$ of $e_i$ in the unique representation of $v$ in the basis $e_1, \dots, e_n$.
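In coordinates, if the basis vectors $e_1, \dots, e_n$ are the columns of an invertible matrix $B$, then the dual basis covectors are the rows of $B^{-1}$: row $i$ of $B^{-1}$ times column $j$ of $B$ is $\delta_{ij}$, which is exactly the defining property. A quick sketch (this particular basis is my own example):

```python
import numpy as np

# Columns of B are a non-orthonormal basis e_1, e_2 of R^2.
B = np.array([[1.0, 1.0],
              [0.0, 1.0]])

# Rows of B^{-1} are the dual basis e_1*, e_2* in coordinates.
B_inv = np.linalg.inv(B)

v = np.array([3.0, 2.0])
coeffs = B_inv @ v              # v_i = e_i*(v)

# Check: v really is sum_i v_i e_i.
assert np.allclose(B @ coeffs, v)
# e_1*(v) = 3 - 2 = 1, not the raw first entry 3 of v: the dual
# basis element e_1* depends on the whole basis, not just on e_1.
assert np.allclose(coeffs, [1.0, 2.0])
```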

The problem with working with matrices instead of linear transformations is that nobody ever tells you about the dual basis, and people generally treat the basis and the dual basis as if they were the same thing, which they are not: they transform differently under a change of coordinates. When you take the transpose of a matrix, what you are actually doing is switching basis elements with dual basis elements. This operation is coordinate-dependent, and nobody ever tells you this either. (Another reason you can go a long time without learning this lesson is that the distinction doesn't matter when $V$ has an inner product with respect to which $e_1, \dots, e_n$ is an orthonormal basis, since the operation is then coordinate-independent with respect to orthogonal changes of coordinates.)
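The coordinate-dependence of the transpose can be seen numerically (the matrices below are my own illustrative choices): transposing does not commute with conjugation by a general change of basis $P$, but it does commute with conjugation by an orthogonal $Q$, for which $Q^{-1} = Q^{T}$.

```python
import numpy as np

A = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0],
              [0.0, 0.0, 0.0]])

# A non-orthogonal change of basis P.
P = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 2.0]])
lhs1 = np.linalg.inv(P) @ A.T @ P     # transpose, then change basis
rhs1 = (np.linalg.inv(P) @ A @ P).T   # change basis, then transpose
assert not np.allclose(lhs1, rhs1)    # "transpose" depends on the basis

# An orthogonal Q (Q^{-1} = Q^T): the two orders now agree.
Q, _ = np.linalg.qr(np.random.default_rng(1).standard_normal((3, 3)))
lhs2 = Q.T @ A.T @ Q
rhs2 = (Q.T @ A @ Q).T
assert np.allclose(lhs2, rhs2)
```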

You can multiply a $1 \times n$ matrix by an $n \times 1$ matrix to get a $1 \times 1$ matrix; this is a basis-dependent way of talking about the dual pairing $V^{\ast} \times V \to k$ given by evaluating a linear transformation $V \to k$ at a given element of $V$. Note that the dual pairing is coordinate-independent, and it is determined by what it does to a basis of $V^{\ast}$ and a basis of $V$. Predictably, it sends $(e_i^{\ast}, e_j)$ to $1$ if $i = j$ and $0$ otherwise.
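The coordinate-independence of the pairing can also be checked directly (numbers and the change-of-basis matrix below are my own): under a change of basis $P$, vector coordinates transform as $v \mapsto P^{-1} v$ while covector coordinates transform as $w \mapsto w P$, so the product $w v$ is unchanged.

```python
import numpy as np

v = np.array([[2.0], [1.0], [3.0]])   # vector coordinates (column)
w = np.array([[1.0, -1.0, 0.5]])      # covector coordinates (row)

pairing = w @ v                       # 1 x 1 matrix: <w, v> = 2.5
assert np.allclose(pairing, [[2.5]])

# Change of basis: v -> P^{-1} v, w -> w P; the pairing is invariant.
P = np.array([[2.0, 1.0, 0.0],
              [0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0]])
assert np.allclose((w @ P) @ (np.linalg.inv(P) @ v), pairing)
```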

You can also multiply an $n \times 1$ matrix by a $1 \times n$ matrix, in the other order, to get an $n \times n$ matrix; this is a basis-dependent way of talking about the isomorphism between $\text{End}(V)$ (the space of endomorphisms of $V$, i.e. linear transformations $V \to V$) and the tensor product $V^{\ast} \otimes V$. Explicitly, the isomorphism is as follows: if $T$ is a linear transformation with matrix $a_{ij}$, so that

$$T(e_i) = \sum_j e_j a_{ji},$$

then it gets sent to the element $\sum_{i,j} a_{ji} \, e_i^{\ast} \otimes e_j \in V^{\ast} \otimes V$. The dual pairing then gives a pairing $V \times (V^{\ast} \otimes V) \to V$ which is precisely the evaluation of a linear transformation at an element of $V$. Again, this is probably extremely confusing, but, again, it is a lesson well worth learning; it is the abstract way to talk about representing a linear transformation by a matrix.
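In coordinates, the rank-one operator $e_i^{\ast} \otimes e_j$ is the outer product $e_j e_i^{T}$ (column times row), so the isomorphism just says that any matrix decomposes as a sum of such outer products. A sketch with an arbitrary matrix of my own choosing:

```python
import numpy as np

n = 3
A = np.arange(1.0, 10.0).reshape(n, n)   # matrix of some T in a basis

E = np.eye(n)   # standard basis columns e_j; its rows are the e_i*

# End(V) ~ V* (x) V: A = sum_{i,j} a_{ji} e_j e_i^T, a sum of
# rank-one outer products (column times row).
A_rebuilt = sum(A[j, i] * np.outer(E[:, j], E[i, :])
                for i in range(n) for j in range(n))
assert np.allclose(A_rebuilt, A)
```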


What the word "vector" means is also worth clearing up. A vector is just an element of a vector space. Here that means an element of $V$, but sometimes it will actually mean an element of $V^{\ast}$. This is a bad habit on the part of people who work in coordinates; they should really call these dual vectors, since vectors and dual vectors (column vectors and row vectors) transform differently under a change of coordinates.