What is the difference between a vector and its transpose?

Basically, the difference is one of dual spaces. Low-level explanation: a vector is acted on by matrices via $$ v \mapsto Av. $$ The transpose of a vector (also called a covector) is acted on via $$ a \mapsto aA, $$ i.e. we multiply on the left for vectors and on the right for covectors. The product of a covector $u^T$ and a vector $v$, in that order, is a number, namely $u^T v = \langle u, v \rangle$.
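To make the left/right multiplication concrete, here is a small numpy sketch (the matrix and vectors are made-up numbers, chosen only for illustration):

```python
import numpy as np

# A made-up 2x2 matrix and two vectors, purely for illustration.
A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
v = np.array([5.0, 6.0])   # a (column) vector: acted on as v -> A v
u = np.array([7.0, 8.0])   # think of u.T as a covector: acted on as u.T -> u.T A

Av = A @ v    # matrix acts on a vector from the left
uTA = u @ A   # matrix acts on a covector from the right

# The product of a covector and a vector (in that order) is a scalar:
# u^T v = <u, v>.
assert np.isclose(u @ v, np.dot(u, v))
```

In numpy a 1-D array plays both roles, so `u @ A` silently treats `u` as a row vector; the covector/vector distinction is exactly the row/column distinction made explicit.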

Mid-level, linear algebra explanation: Let $V$ be a vector space over a field $k$. The transpose takes the vectors $v \in V$ to linear maps in the dual space, normally written $V^*$. The dual space is defined as the set of linear functionals on $V$: that is, $\alpha \in V^*$ is a function $V \to k$ such that $$ \alpha(\lambda u+\mu v) = \lambda \alpha(u) + \mu \alpha(v) $$ for all $u,v \in V$, $\lambda,\mu \in k$.

Now, if $V$ is also equipped with an inner product $\langle \cdot , \cdot \rangle : V \times V \to k$, and satisfies some other conditions (finite-dimensional's good enough for now), one can show that every element of $V^*$ can be written in the form $$ \alpha(\cdot) = \langle u , \cdot \rangle, $$ for some $u \in V$. This is basically what $u^{T}$ is. To see how matrices act on these things, consider the member of $k$ given by $u^T A v$, where $A$ is a linear map $V \to V$ (or matrix, if you choose a basis). This can either be written as $$ u^T (Av) = \langle u, Av \rangle, $$ or, using $u^T A = (A^T u)^T$, as $$ (u^T A) v = (A^T u)^T v = \langle A^T u, v \rangle. $$ (This is normally actually the definition of the adjoint/transpose of $A$, but I'm trying to explain from a basic point of view.) You can go on to talk about bases, how to change them, and so on. Wikipedia no doubt has some good articles on this, since its general linear algebra coverage is excellent.
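The identity $\langle u, Av \rangle = \langle A^T u, v \rangle$ is easy to sanity-check numerically; a quick sketch with random data:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(3, 3))   # an arbitrary linear map V -> V
u = rng.normal(size=3)
v = rng.normal(size=3)

lhs = np.dot(u, A @ v)        # u^T (A v) = <u, A v>
rhs = np.dot(A.T @ u, v)      # (A^T u)^T v = <A^T u, v>

# The two groupings of u^T A v agree, which is the adjoint property.
assert np.isclose(lhs, rhs)
```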

(There's a stupidly high-powered category-theoretic answer too, but I'll spare all of us the details: basically the transpose is a contravariant functor that takes the category of vector spaces over a field to its opposite category, and so on...)


So in general the transpose of a vector is a linear mapping of vectors to scalars.

There are many reasons to consider them distinct. One of them is that it makes it really simple to distinguish these two expressions: $$\mathbf{a}\mathbf{b}^T ~~\text{vs.}~~ \mathbf{a}^T\mathbf{b}.$$ In another notation, borrowed from quantum mechanics, you can see the difference even more directly: $$| a \rangle\langle b | ~~\text{vs.}~~ \langle a | b \rangle.$$ The first is an outer product (a matrix); the second is an inner product (a scalar). If we just wrote $\mathbf{a}\mathbf{b}$ we would lack this distinction. In fact, in yet another notation, called abstract-index notation, we can write things like $a^i b_j c_k$, a "product" of three vectors which maps an ordinary vector to a matrix. So being able to make these distinctions simply is helpful. It's great to be able to say things like $\det(\mathbf u \mathbf u^T) = 0$ for any $\mathbf u$ in dimension $\ge 2$.
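The inner/outer distinction is one line each in numpy (vectors are made-up numbers for illustration):

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0, 6.0])

inner = a @ b           # <a|b>: a scalar
outer = np.outer(a, b)  # |a><b|: a 3x3 matrix

assert inner == 32.0
assert outer.shape == (3, 3)

# A rank-one outer product is singular in dimension >= 2:
assert np.isclose(np.linalg.det(np.outer(a, a)), 0.0)
```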

That notational convenience gives us a trebuchet to launch our understanding into new mathematics: suppose that instead of identifying the components $v_i = v^i$ as being absolutely identical, we allow an invertible matrix to sit between them: $v_m = \sum_n g_{mn} ~ v^n$. Then we find that actually, the norm of a vector in such a space is $\sum_m v_m ~ v^m = \sum_{mn} g_{mn} ~ v^m ~v^n$, which might not be $\sum_n v^n ~ v^n$. You can now begin to talk about the curvature of the space in terms of details of $g_{mn}$ and how it interacts with derivative operators $\partial_k$ -- in other words, you can start to think about manifolds. Or you can do some useful physics -- in particular, this lets us model things like special relativity, where everyone who has some individual idea $(ct, x, y, z)$ of where and when an event happens will all (due to the way the laws work) agree on the number $(ct)^2 - x^2 - y^2 - z^2$. If you insert those negatives for the transposes (duals) of your vectors consistently, you'll get a very simple, easy-to-follow mathematics. If you try to work with the vectors $(ct, ix, iy, iz)$ you will have more trouble. One example: the electromagnetic field tensor is $\partial_i A_j - \partial_j A_i$, and it is completely obvious why you want the dual vector to $A^i$ here and not the vector itself -- $\partial_i A^j - \partial_j A^i$ would be a type error (the indices don't match up) and therefore wouldn't map to any real quantity!
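The index-lowering rule $v_m = \sum_n g_{mn} v^n$ and the invariant $(ct)^2 - x^2 - y^2 - z^2$ can be sketched with the Minkowski metric $g_{mn} = \mathrm{diag}(1,-1,-1,-1)$ (one common sign convention; the event components below are made up):

```python
import numpy as np

# Minkowski metric g_mn = diag(1, -1, -1, -1) -- one common sign convention.
g = np.diag([1.0, -1.0, -1.0, -1.0])

# An event's components v^n = (ct, x, y, z); numbers are illustrative only.
v_up = np.array([5.0, 3.0, 0.0, 4.0])

# Lower the index: v_m = sum_n g_mn v^n
v_down = g @ v_up

# The invariant everyone agrees on: sum_m v_m v^m = (ct)^2 - x^2 - y^2 - z^2
norm = v_down @ v_up
```

For these particular components the invariant comes out to $25 - 9 - 0 - 16 = 0$, i.e. a light-like separation.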

Even if you're not doing relativity or differential geometry, you might prefer a different basis for your dual vectors. For example, in order to get the simple expression $$\vec a \cdot \vec b = a_x ~ b_x + a_y ~ b_y + a_z ~ b_z,$$ there is actually a pretty huge step taken, if you think about the more general expression $$\vec a \cdot \vec b = \sum_{mn} a_m ~ b_n ~ \hat e_m \cdot \hat e_n.$$ The huge step is to say that $$\hat e_m \cdot \hat e_n = \delta_{mn} = 1 ~\text{if}~ m = n ~\text{else}~ 0.$$ But what if you're dealing with a crystal that has really simple properties only in a handful of directions which are not mutually orthogonal? There are lots of lattices that are not rectangular! Then you're in a skewed coordinate system and you'll want your basis vectors to be non-orthogonal. A good dual space in this circumstance rescues you from madness. You start with your skewed basis vectors $\hat e_m$ and invent their "transposes" or duals $\hat e^m$ such that $$\hat e^m \cdot \hat e_n = \delta^m_n = 1 ~\text{if}~ m = n ~\text{else}~ 0.$$ In other words, we choose (in 3D) for $\hat e^1$ the vector which is perpendicular to $\hat e_2$ and $\hat e_3$ and has dot product $1$ with $\hat e_1$. The coordinates of your new "dual vectors" live in a "dual space" to your normal vector space, but you can now restore the simple-ish law $\vec a \cdot \vec b = a_x ~ b^x + a_y ~ b^y + a_z ~ b^z$.
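The dual basis has a tidy matrix formulation: if the skewed basis vectors are stacked as the rows of a matrix $E$, the condition $\hat e^m \cdot \hat e_n = \delta^m_n$ says the matrix of dual vectors is $(E^T)^{-1}$. A sketch with a made-up skewed basis:

```python
import numpy as np

# A made-up skewed (non-orthogonal) 3D basis; row m is the vector e_m.
E = np.array([[1.0, 0.0, 0.0],
              [0.5, 1.0, 0.0],
              [0.0, 0.3, 1.0]])

# Stack the duals e^m as rows of D. The condition e^m . e_n = delta^m_n
# reads D @ E.T = I, so D = inv(E.T).
D = np.linalg.inv(E.T)

# Every dual vector is perpendicular to the two "other" basis vectors
# and has dot product 1 with its own partner:
assert np.allclose(D @ E.T, np.eye(3))
```

This is essentially the reciprocal-lattice construction from crystallography, minus the conventional factor of $2\pi$.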

In general the answer is, "because you can do a lot more stuff if you can keep things which are similar-but-not-quite-the-same mentally distinct: lots of things become less confusing."