What is the tensor product of covectors? [duplicate]
As a physics student, I've come across mathematical objects called tensors in several different contexts. Perhaps confusingly, I've also been given both the mathematician's and the physicist's definitions, which I believe are slightly different.
I currently think of them in the following ways, but have a tough time reconciling the different views:
- An extension/abstraction of scalars, vectors, and matrices in mathematics.
- A multi-dimensional array of elements.
- A mapping between vector spaces that represents a coordinate-independent transformation.
In fact, I'm not even sure how correct these three definitions are. Is there a particularly relevant (rigorous, even) definition of tensors and their uses, that might be suitable for a mathematical physicist?
Direct answers/explanations, as well as links to good introductory articles, would be much appreciated.
At least to me, it is helpful to think in terms of bases. (I'll only be talking about tensor products of finite-dimensional vector spaces here.) This makes the universal mapping property that Zach Conn talks about a bit less abstract (in fact, almost trivial).
First recall that if $L: V \to U$ is a linear map, then $L$ is completely determined by what it does to a basis $\{ e_i \}$ for $V$: $$L(x)=L\left( \sum_i x_i e_i \right) = \sum_i x_i L(e_i).$$ (The coefficients of $L(e_i)$ in a basis for $U$ give the $i$th column in the matrix for $L$ with respect to the given bases.)
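To make this concrete, here is a minimal numpy sketch (the matrix and vector are made up for illustration): the columns of the matrix hold the coefficients of the $L(e_i)$, and $L(x)$ is the corresponding weighted sum of columns.

```python
import numpy as np

# A linear map L : R^3 -> R^2, stored as the matrix whose i-th column
# holds the coefficients of L(e_i) in a basis for U.
L = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 3.0]])

x = np.array([2.0, -1.0, 4.0])   # x = sum_i x_i e_i

direct = L @ x                                          # L(x)
from_columns = sum(x[i] * L[:, i] for i in range(3))    # sum_i x_i L(e_i)

assert np.allclose(direct, from_columns)
```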
Tensors come into the picture when one studies multilinear maps. If $B: V \times W \to U$ is a bilinear map, then $B$ is completely determined by the values $B(e_i,f_j)$ where $\{ e_i \}$ is a basis for $V$ and $\{ f_j \}$ is a basis for $W$: $$B(x,y) = B\left( \sum_i x_i e_i,\sum_j y_j f_j \right) = \sum_i \sum_j x_i y_j B(e_i,f_j).$$ For simplicity, consider the particular case when $U=\mathbf{R}$; then the values $B(e_i,f_j)$ make up a set of $N=mn$ real numbers (where $m$ and $n$ are the dimensions of $V$ and $W$), and these numbers are all that we need to keep track of in order to know everything about the bilinear map $B:V \times W \to \mathbf{R}$.
Notice that in order to compute $B(x,y)$ we don't really need to know the individual vectors $x$ and $y$, but rather the $N=mn$ numbers $\{ x_i y_j \}$. Another pair of vectors $v$ and $w$ with $v_i w_j = x_i y_j$ for all $i$ and $j$ will satisfy $B(v,w)=B(x,y)$.
This leads to the idea of splitting the computation of $B(x,y)$ into two stages. Take an $N$-dimensional vector space $T$ (they're all isomorphic so it doesn't matter which one we take) with a basis $(g_1,\dots,g_N)$. Given $x=\sum x_i e_i$ and $y=\sum y_j f_j$, first form the vector in $T$ whose coordinates with respect to the basis $\{ g_k \}$ are given by the column vector $$(x_1 y_1,\dots,x_1 y_n,x_2 y_1,\dots,x_2 y_n,\dots,x_m y_1,\dots,x_m y_n)^T.$$ Then run this vector through the linear map $\tilde{B}:T\to\mathbf{R}$ whose matrix is the row vector $$(B_{11},\dots,B_{1n},B_{21},\dots,B_{2n},\dots,B_{m1},\dots,B_{mn}),$$ where $B_{ij}=B(e_i,f_j)$. This gives, by construction, $\sum_i \sum_j B_{ij} x_i y_j=B(x,y)$.
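In numpy the two stages are literally an outer product followed by one dot product; here is a small sketch with made-up dimensions $m=2$, $n=3$ and random data:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 2, 3
Bmat = rng.standard_normal((m, n))   # B_ij = B(e_i, f_j)
x = rng.standard_normal(m)
y = rng.standard_normal(n)

# Direct evaluation: B(x, y) = sum_ij B_ij x_i y_j.
direct = x @ Bmat @ y

# Stage 1: form x ⊗ y, a vector in the N = mn dimensional space T.
x_tensor_y = np.outer(x, y).ravel()

# Stage 2: apply the linear map B~ : T -> R whose matrix is the row
# vector (B_11, ..., B_1n, B_21, ..., B_mn).
two_stage = Bmat.ravel() @ x_tensor_y

assert np.isclose(direct, two_stage)
```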
We'll call the space $T$ the tensor product of the vector spaces $V$ and $W$ and denote it by $T=V \otimes W$; it is “uniquely defined up to isomorphism”, and its elements are called tensors. The vector in $T$ that we formed from $x\in V$ and $y\in W$ in the first stage above will be denoted $x \otimes y$; it's a “bilinear mixture” of $x$ and $y$ which doesn't allow us to reconstruct $x$ and $y$ individually, but still contains exactly all the information needed in order to compute $B(x,y)$ for any bilinear map $B$; we have $B(x,y)=\tilde{B}(x \otimes y)$. This is the “universal property”; any bilinear map $B$ from $V \times W$ can be computed by taking a “detour” through $T$, and this detour is unique, since the map $\tilde{B}$ is constructed uniquely from the values $B(e_i,f_j)$.
To tidy this up, one would like to make sure that the definition is basis-independent. One way is to check that everything transforms properly under changes of bases. Another way is to do the construction by forming a much bigger space and taking a quotient with respect to suitable relations (without ever mentioning bases). Then, by untangling definitions, one can for example show that a bilinear map $B:V \times W \to \mathbf{R}$ can be canonically identified with an element of the space $V^* \otimes W^*$, and dually an element of $V \otimes W$ can be identified with a bilinear map $V^* \times W^* \to \mathbf{R}$. Yet other authors find this a convenient starting point, so that they instead define $V \otimes W$ to be the space of bilinear maps $V^* \times W^* \to \mathbf{R}$. So it's no wonder that one can become a little confused when trying to compare different definitions...
In mathematics, tensors are one of the first objects encountered which cannot be fully understood without their accompanying universal mapping property.
Before talking about tensors, one needs to talk about the tensor product of vector spaces. You are probably already familiar with the direct sum of vector spaces. This is an addition operation on spaces. The tensor product provides a multiplication operation on vector spaces.
The key feature of the tensor product is that it replaces bilinear maps on a cartesian product of vector spaces with linear maps on the tensor product of the two spaces. In essence, if $V,W$ are vector spaces, there is a bijective correspondence between the set of bilinear maps on $V\times W$ (to any target space) and the set of linear maps on $V\otimes W$ (the tensor product of $V$ and $W$).
This can be phrased in terms of a universal mapping property. Given vector spaces $V,W$, a tensor product $V\otimes W$ of $V$ and $W$ is a space together with a map $\otimes : V\times W \rightarrow V\otimes W$ such that for any vector space $X$ and any bilinear map $f : V\times W \rightarrow X$ there exists a unique linear map $\tilde{f} : V\otimes W \rightarrow X$ such that $f = \tilde{f}\circ \otimes$. In other words, every bilinear map on the cartesian product factors uniquely through the tensor product.
It can be shown using a basic argument that the tensor product is unique up to isomorphism, so you can speak of "the" tensor product of two spaces rather than "a" tensor product, as I did in the previous paragraph.
A tensor is just an element of a tensor product.
One must show that such a tensor product exists. The standard construction is to take the free vector space over $V\times W$ and introduce various bilinearity relations. See my link at the bottom for an article that does this explicitly. In my experience, however, the key is to be able to use the above mapping property; the particular construction doesn't matter much in the long run. The map $\otimes : V\times W \rightarrow V\otimes W$ sends the pair $(v,w) \in V\times W$ to $v\otimes w \in V\otimes W$. The image of $\otimes$ is the space of so-called elementary tensors, but a general element of $V\otimes W$ is not an elementary tensor but rather a linear combination of elementary tensors. (In fact, due to bilinearity, it is enough to say that a general tensor is a sum of elementary tensors with the coefficients all being 1.)
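In coordinates (identifying $V \otimes W$ with $m \times n$ arrays, as in the previous answer), elementary tensors correspond to rank-one matrices, so a quick numpy check shows that a sum of elementary tensors need not be elementary:

```python
import numpy as np

v1, v2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
w1, w2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])

elementary = np.outer(v1, w1)                   # v1 ⊗ w1, a rank-one matrix
general = np.outer(v1, w1) + np.outer(v2, w2)   # v1 ⊗ w1 + v2 ⊗ w2

print(np.linalg.matrix_rank(elementary))  # 1: elementary
print(np.linalg.matrix_rank(general))     # 2: not of the form v ⊗ w
```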
The most generic reason why tensors are useful is that the tensor product is a machine for replacing bilinear maps with linear ones. In much of mathematics and physics, one seeks to find linear approximations to things; tensors can be seen as one tool for this, although exactly how they accomplish it is less obvious than with many other tools in the same vein. Here are some more specific reasons why they are useful.
For finite-dimensional spaces $V,W$, the tensor product $V^*\otimes W$ is isomorphic to the space of homomorphisms $\text{Hom}(V,W)$. So in other words every linear map $V \rightarrow W$ has a tensor expansion, i.e., a representation as a tensor in $V^* \otimes W$. For instance, if $\{v_i\}$ is a basis of $V$ and $\{x_i\}$ is the dual basis of $V^*$, then $\sum x_i \otimes v_i \in V^* \otimes V$ is a tensor representation of the identity map on $V$.
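Under this identification, $x_i \otimes v_i$ becomes the rank-one matrix $v_i x_i^T$, and the identity expansion can be checked in a couple of lines of numpy (a sketch for $\dim V = 3$ with the standard basis):

```python
import numpy as np

dim = 3
basis = np.eye(dim)   # rows are the standard basis vectors v_i = e_i

# Each term x_i ⊗ v_i corresponds to the rank-one matrix e_i e_i^T;
# summing over i reassembles the identity map on V.
identity_expansion = sum(np.outer(basis[i], basis[i]) for i in range(dim))

assert np.allclose(identity_expansion, np.eye(dim))
```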
Tensor products tend to appear in a lot of unexpected places. For instance, in analyzing the linear representations of a finite group, once the irreducible representations are known it can be of benefit to construct also a "tensor product table" which decomposes the tensor products of all pairs of irreducible representations as direct sums of irreducible representations.
In physics, one often talks about a rank $n$ tensor being an assembly of numbers which transform in a certain way under change of coordinates. What one is really describing here is all the different coordinate representations of an abstract tensor in a tensor power $V^{\otimes n}$.
If one takes the direct sum of all tensor powers of a vector space $V$, one obtains the tensor algebra over $V$. In other words, the tensor algebra is the construction $k\oplus V\oplus (V\otimes V) \oplus (V\otimes V\otimes V) \oplus \dots$, where $k$ is the base field. The tensor algebra is naturally graded, and it admits several extremely useful quotient algebras, including the well-known exterior algebra of $V$. The exterior algebra provides the natural machinery for differential forms in differential geometry.
Here's an example of the exterior algebra in practice. Suppose one wishes to classify all nonabelian two-dimensional Lie algebras $\mathfrak{g}$. The Lie bracket $[\cdot,\cdot]$ is antisymmetric and bilinear, so the machinery of tensor products turns it into a linear map $\bigwedge^2 V \rightarrow V$, where $V$ is the underlying vector space of the algebra. Now $\bigwedge^2 V$ is one-dimensional and since the algebra is nonabelian the Lie bracket is not everywhere zero; hence as a linear map the Lie bracket has a one-dimensional image. Then one can choose a basis $\{X,Y\}$ of $V$ such that $[X,Y] = X$, and we conclude that there is essentially only one nonabelian Lie algebra structure on a two-dimensional vector space.
A fantastic reference on tensor products of modules was written by Keith Conrad: http://www.math.uconn.edu/~kconrad/blurbs/linmultialg/tensorprod.pdf
Once you understand what a tensor product is and what a dual space is, then a tensor of type $(n, m)$ is an element of $V^{\ast \otimes m} \otimes V^{\otimes n}$ where $V$ is some vector space. This is the same thing as a multilinear map $V^m \to V^{\otimes n}$ or, if you don't like the asymmetry, a multilinear map $V^{\ast n} \times V^{m} \to F$ (where $F$ is the underlying field). Examples:
- A tensor of type $(0, 0)$ is a scalar.
- A tensor of type $(1, 0)$ is a vector.
- A tensor of type $(0, 1)$ is a covector.
- A tensor of type $(1, 1)$ is a linear transformation.
- A tensor of type $(0, 2)$ is a bilinear form, for example an inner product.
When you pick a basis of $V$, you can write tensors in terms of the natural basis on $V^{\ast \otimes m} \otimes V^{\otimes n}$ coming from taking products of the basis on $V$ with the corresponding dual basis on $V^{\ast}$. This is where the "multidimensional array" definition of a tensor comes from, since this is the natural generalization of writing a matrix as a square array (which is equivalent to writing an element of $V^{\ast} \otimes V$ in terms of the basis $e_i^{\ast} \otimes e_j$ where $\{ e_i \}$ is a basis).
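In numpy terms, the coordinates of, say, a type $(1, 2)$ tensor form a three-dimensional array, and contracting its covariant slots against vectors is a single einsum call; a small sketch with random data:

```python
import numpy as np

rng = np.random.default_rng(1)
dim = 3

# Coordinates t^i_{jk} of a type (1,2) tensor, i.e. a bilinear map
# V x V -> V, stored as a 3-dimensional array (generalizing a matrix).
t = rng.standard_normal((dim, dim, dim))

v = rng.standard_normal(dim)
w = rng.standard_normal(dim)

# (t(v, w))^i = sum_jk t^i_{jk} v^j w^k
result = np.einsum('ijk,j,k->i', t, v, w)
print(result.shape)   # (3,): an element of V
```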
When a physicist says "tensor," sometimes they mean a tensor field. This is a "globalization" of the above definition: it is a compatible set of choices, for each tangent space $V = T_p(M)$ of a smooth manifold $M$, of a tensor of type $(n, m)$ as defined above. Note that $V^{\ast}$ is the cotangent space. Examples:
- A tensor field of type $(0, 0)$ is a smooth function.
- A tensor field of type $(1, 0)$ is a vector field.
- A tensor field of type $(0, 1)$ is a differential $1$-form.
- A tensor field of type $(1, 1)$ is a morphism of vector fields.
- A tensor field of type $(0, 2)$ which is symmetric and nondegenerate is a metric tensor. If it is also positive-definite, it is a Riemannian metric. If it has signature $(1, n-1)$, it is a Lorentzian metric.
Mathematicians and physicists use very different languages when they talk about tensors. Fortunately, they are talking about the same thing, but unfortunately, this is not obvious at all. Let me explain.
For simplicity, I'm going to focus on covariant 2-tensors, since this case already contains the main intuition. Also, I'm not going to talk about the distinction between covariant and contravariant, but I'll get all the indices right for future study.
Physicist's definition
Definition: A covariant 2-tensor is a set of numbers $t_{ij}$ with two indices that transforms in a particular way under a change of coordinates… Wait, wait, coordinates in what space? Physicists usually don't mention it, but they mean coordinates in a given vector space $V$.
More precisely, let $\{\vec e_i\}$ be a basis of the vector space $V$. Then, every vector $\vec v$ can be expressed in terms of its coordinates $v^i$ as follows:
$$\vec v = \sum_i v^i \vec e_i .$$
So, there are two objects: the vector $\vec v$ which I think of as "solid" or "fundamental", and its coordinates $v^i$, which are "ephemeral", since I have to choose a basis $\vec e_i$ before I can talk about them at all.
Furthermore, in a different basis $\{\vec e'_i\}$ of our vector space, the coordinates of one and the same vector $\vec v$ are very different numbers:
$$ \vec v = \sum_i v^i \vec e_i = \sum_i v'^i \vec e'_i ,$$
but $v^i \neq v'^i$ in general. So, the vector is the fundamental thing. Its coordinates are useful for calculations, but they are ephemeral and depend heavily on the choice of basis.
Now, when defining a covariant 2-tensor, physicists do something very mysterious: they define a fundamental object (= the 2-tensor) not by describing it directly, but only by specifying what its ephemeral coordinates look like and how they change when switching to a different basis. Namely, a change of basis
$$ \vec e'_i = \sum_a R_i^a \vec e_a $$
will change the coordinates $t_{ij}$ of the tensor via
$$ t'_{ij} = \sum_{ab} R^a_i R^b_j t_{ab} .$$
If that is not completely unintuitive, I don't know what is.
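Unintuitive or not, the rule is easy to carry out mechanically: storing the coefficients as a matrix with entries $R[i, a] = R^a_i$, it is just $t' = R\,t\,R^T$. A quick numpy check with random data:

```python
import numpy as np

rng = np.random.default_rng(2)
dim = 3
R = rng.standard_normal((dim, dim))   # R[i, a] = R^a_i
t = rng.standard_normal((dim, dim))   # old coordinates t_ab

# t'_ij = sum_ab R^a_i R^b_j t_ab, spelled out with einsum...
t_new = np.einsum('ia,jb,ab->ij', R, R, t)

# ...which is nothing but the matrix product R t R^T.
assert np.allclose(t_new, R @ t @ R.T)
```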
Mathematician's definition
Mathematicians define tensors differently. Namely, they give a direct, fundamental description of what a 2-tensor is and only then ponder how it looks in different coordinate systems.
Here is the definition: a covariant 2-tensor $t$ is a bilinear map $t : V\times V \to \mathbb{R}$. That's it. (Bilinear = linear in both arguments).
In other words, a covariant 2-tensor $t$ is a thing that eats two vectors $\vec v$, $\vec w$ and returns a number $t(\vec v, \vec w) \in\mathbb{R}$.
Now, what does this thing look like in coordinates? Choosing a basis $\lbrace \vec e_i \rbrace$, bilinearity allows us to write
$$ t(\vec v, \vec w) = t(\sum_i v^i \vec e_i, \sum_j w^j \vec e_j) = \sum_{ij} v^iw^j t(\vec e_i,\vec e_j) .$$
Now, we simply call the numbers $t_{ij} = t(\vec e_i, \vec e_j)$ the coordinates of the tensor $t$ in the basis $\{\vec e_i\}$. You can calculate that these numbers behave just as the physicists tell us when you change the basis to $\{\vec e'_i\}$. So, the physicist's tensor and the mathematician's tensor are one and the same thing.
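That calculation is also easy to verify numerically: pick a random bilinear $t$ (a matrix $T$ with $t(\vec v, \vec w) = v^T T w$) and a random change of basis, compute $t'_{ij} = t(\vec e'_i, \vec e'_j)$ from the mathematician's definition, and the physicist's transformation rule pops out. A numpy sketch:

```python
import numpy as np

rng = np.random.default_rng(3)
dim = 3
T = rng.standard_normal((dim, dim))   # t(v, w) = v^T T w in the old basis
R = rng.standard_normal((dim, dim))   # e'_i = sum_a R[i, a] e_a

# Mathematician's definition: t'_ij = t(e'_i, e'_j), where e'_i has
# old-basis coordinates R[i, :].
t_new = np.array([[R[i] @ T @ R[j] for j in range(dim)]
                  for i in range(dim)])

# Physicist's rule: t'_ij = sum_ab R^a_i R^b_j t_ab. Same numbers.
assert np.allclose(t_new, np.einsum('ia,jb,ab->ij', R, R, T))
```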
Tensor product
Actually, mathematicians do something more advanced: they define a so-called tensor product of vector spaces. The previous definition as a bilinear map is still correct, but mathematicians like to write this as "$t\in V^*\otimes V^*$" instead of "$t: V\times V \to \mathbb{R}$ and $t$ bilinear".
However, for a first understanding of the physicist's vs the mathematician's definition, it is not necessary to understand the mathematical tensor product.
The most general notion I know is the tensor product of modules. You can read about it here: http://en.wikipedia.org/wiki/Tensor_product_of_modules
Since vector spaces are modules, this definition specializes to vector spaces. The tensor product of elements of these vector spaces that one usually sees in engineering and physics texts (frequently matrices) is basically an element of the tensor product of the corresponding vector spaces.
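For matrices, the tensor product of two linear maps is realized concretely by the Kronecker product; a short numpy sketch of the defining property $(A \otimes B)(x \otimes y) = (Ax) \otimes (By)$:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
B = np.array([[0.0, 1.0],
              [1.0, 0.0]])

AB = np.kron(A, B)   # coordinate form of A ⊗ B; shape (4, 4)

x = np.array([1.0, -1.0])
y = np.array([2.0, 5.0])

# The tensor product of maps acts on elementary tensors factorwise.
assert np.allclose(AB @ np.kron(x, y), np.kron(A @ x, B @ y))
```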