Understanding the definition of tensors as multilinear maps

The question arises from the definition of the space of $(p,q)$ tensors as the set of multilinear maps from a Cartesian product of copies of a vector space and its dual to the field, equipped with pointwise addition and s-multiplication, given in this series, at this point, as follows:

A $(p,q)$ tensor $T$ is a MULTILINEAR MAP that takes $p$ copies of $V^*$ and $q$ copies of $V$ and maps multilinearly (linearly in each entry) to $K$:

$$T: \underset{p}{\underbrace{V^*\times \cdots \times V^*}}\times \underset{q}{\underbrace{V\times \cdots \times V}} \overset{\sim}{\rightarrow} K\tag 1$$

The $(p,q)$ TENSOR SPACE is defined as a set:

$$\begin{align}T^p_q\,V &= \underset{p}{\underbrace{V\color{darkorange}{\otimes}\cdots\color{darkorange}{\otimes} V}} \color{darkorange}{\otimes} \underset{q}{\underbrace{V^*\color{darkorange}{\otimes}\cdots\color{darkorange}{\otimes} V^*}}:=\{T\, |\, T\, \text{ is a } (p,q) \text{ tensor}\}\tag2\\[3ex]&=\{T: \underset{p}{\underbrace{V^*\times \cdots \times V^*}}\times \underset{q}{\underbrace{V\times \cdots \times V}} \overset{\sim}{\rightarrow} K\}\tag3\end{align}$$

This expression symbolizes the set of all $(p,q)$ tensors $T$, equipped with pointwise addition and s-multiplication.

This is (not surprisingly) consistent with the Wikipedia definition of tensors as multilinear maps.


QUESTION:

I don't understand why in Eq. (2) $p$ counts the number of copies of the vector space $V$ in the tensor product, while in Eq. (3) the same $p$ counts the number of copies of the dual space $V^*$ in the domain.

Can we say then that the $q$ factors $V^*\otimes V^*\otimes\cdots$ in Eq. (2) are linear functionals "waiting" for the same number of vectors from $V$ to produce scalars that are later multiplied?

If so, what are the $p$ factors of $V$ in $V\otimes V\otimes\cdots$ in Eq. (2) doing? Are they vectors in $V$ "waiting" for a functional in order to be mapped into $K$? And if so, where is that functional defined? I guess that since we are defining a set, it can be any functional?

Is this the correct interpretation?

And how are the $\color{darkorange}{\otimes}$ and $\times$ operations to be interpreted in these equations?


Solution 1:

Let's first look at a very special type of tensor, namely the $(0,1)$ tensor. What is it? Well, it is the tensor product of zero copies of $V$ and one copy of $V^*$. That is, it is simply a member of $V^*$.

But what is a member of $V^*$? Well, by the very definition of $V^*$ it is a linear function $\phi:V\to K$. Let's write this explicitly: $$T^0_1V = V^* = \{\phi:V\to K\mid\phi \text{ is linear}\}$$ You see, already at this point, where we didn't even use a tensor product, we get a $V^*$ on one side, and a $V$ on the other, simply by inserting the definition of $V^*$.

From this, it is obvious why $(0,q)$-tensors have $q$ copies of $V^*$ in the tensor product $(2)$, but $q$ copies of $V$ in the domain of the multilinear function in $(3)$.

OK, but why do you have a $V^*$ in the map in $(3)$ for each factor $V$ in the tensor product? After all, vectors are not functions, are they?

Well, in some sense they are: There is a natural linear map from $V$ to its double dual $V^{**}$, that is, the set of linear functions from $V^*$ to $K$. Indeed, for finite dimensional vector spaces, you even have that $V^{**} \cong V$. This natural map is defined by the condition that applying the image of $v$ to $\phi\in V^*$ gives the same value as applying $\phi$ to $v$. I suspect that the lecture assumes finite dimensional vector spaces. In that case, you can identify $V$ with $V^{**}$, and therefore you get $$T^1_0V = V = V^{**} = \{T:V^*\to K\mid T \text{ is linear}\}$$ Here the second equality is exactly that identification.

Now again it should be obvious why $p$ copies of $V$ in the tensor product $(2)$ give $p$ factors of $V^*$ for the domain of the multilinear functions in $(3)$.

Edit: On request in the comments, something about the relations of those terms to the Kronecker product.

The tensor product $\color{darkorange}{\otimes}$ in $(2)$ is a tensor product not of (co)vectors, but of (co)vector spaces. The result of that tensor product describes not one tensor, but the set of all tensors of a given type. The tensors are then elements of the corresponding set. And given a basis of $V$, the tensors can then be specified by giving their coefficients in that basis.

This is completely analogous to the vector space itself. We have the vector space, $V$, this vector space contains vectors $v\in V$, and given a basis $\{e_i\}$ of $V$, we can write the vector in components, $v = \sum_i v^i e_i$.

Similarly for $V^*$, we can write each member $\phi\in V^*$ in the dual basis $\omega^i$ (defined by $\omega^i(e_j)=\delta^i_j$) as $\sum_i \phi_i \omega^i$. An alternative way to get the components $\phi_i$ is to notice that $\phi(e_k) = \sum_i \phi_i \omega^i(e_k) = \sum_i \phi_i \delta^i_k = \phi_k$. That is, the components of the covector are just the function values at the basis vectors.

This way one also sees immediately that $\phi(v) = \sum_i \phi(v^i e_i) = \sum_i v^i\phi(e_i) = \sum_i v^i \phi_i$, which is sort of like an inner product, but not exactly, because it behaves differently under a change of basis.
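As a quick numerical illustration of the last two paragraphs, here is a minimal sketch in Python/NumPy (the specific covector and vector are my own invented examples): the components of a covector really are its values on the basis vectors, and $\phi(v)=\sum_i v^i\phi_i$.

```python
import numpy as np

# Standard basis e_1, e_2, e_3 of R^3; the dual basis omega^i reads off the
# i-th coordinate, so omega^i(e_j) = delta^i_j holds automatically.
e = np.eye(3)

# An arbitrary covector phi: V -> K (my example), phi(v) = 2v^1 - v^2 + 5v^3.
phi = lambda v: 2*v[0] - v[1] + 5*v[2]

# Its components are just its values on the basis vectors: phi_i = phi(e_i).
phi_components = np.array([phi(e[i]) for i in range(3)])
print(phi_components)                  # [ 2. -1.  5.]

# phi(v) = sum_i v^i phi_i, "sort of like an inner product":
v = np.array([1.0, 2.0, 3.0])
print(phi(v), v @ phi_components)      # 15.0 15.0
```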

Now let's look at a $(0,2)$ tensor, that is, a bilinear function $f:V\times V\to K$. Note that $f\in V^*\color{darkorange}{\otimes} V^*$, as $V^*\color{darkorange}{\otimes} V^*$ is by definition the set of all such functions (see eq. $(3)$). Now by being a bilinear function, one again only needs to know the values at the basis vectors, as $$f(v,w) = f(\sum_i v^i e_i, \sum_j w^j e_j) = \sum_{i,j}v^i w^j f(e_i,e_j)$$ and therefore we can define as components $f_{ij} = f(e_i,e_j)$ and get $f(v,w)=\sum_{i,j}f_{ij}v^i w^j$.
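The same kind of sketch for a $(0,2)$ tensor (again, the particular bilinear function is an arbitrary choice of mine): its component matrix is obtained by evaluating on pairs of basis vectors, and then reproduces the function on arbitrary arguments.

```python
import numpy as np

e = np.eye(3)

# An arbitrary bilinear function f: V x V -> K, i.e. a (0,2) tensor (my example).
f = lambda v, w: v[0]*w[1] + 3*v[2]*w[2] - v[1]*w[0]

# Components f_ij = f(e_i, e_j) fill a matrix:
F = np.array([[f(e[i], e[j]) for j in range(3)] for i in range(3)])

# f(v, w) = sum_{i,j} f_ij v^i w^j, i.e. v @ F @ w:
v = np.array([1.0, 2.0, 0.0])
w = np.array([0.0, 1.0, 4.0])
print(f(v, w), v @ F @ w)              # 1.0 1.0
```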

This goes also for general tensors: A single tensor $T\in T^p_qV$ is a multilinear function $T:(V^*)^p\times V^q\to K$, and it is completely determined by the values you get when inserting basis vectors and basis covectors everywhere, giving the components $$T^{i\ldots j}_{k\ldots l}=T(\underbrace{\omega^i,\ldots,\omega^j}_{p},\underbrace{e_k,\ldots,e_l}_{q})$$

OK, we now have components, but we have still not defined the tensor product of tensors. But that is actually quite easy:

Let $x\in T^p_qV$ and $y\in T^r_sV$. That is, $x$ is a function that takes $p$ covectors and $q$ vectors and gives a scalar, while $y$ takes $r$ covectors and $s$ vectors to a scalar. Then the tensor product $x\color{blue}{\otimes} y$ is a function that takes $p+r$ covectors and $q+s$ vectors, feeds the first $p$ covectors and the first $q$ vectors to $x$ and the remaining $r$ covectors and $s$ vectors to $y$, and then multiplies the results. That is, $$(x\color{blue}{\otimes} y)(\underbrace{\kappa,\ldots,\lambda,\mu,\ldots,\nu}_{p+r},\underbrace{u,\ldots,v,w,\ldots,z}_{q+s}) = x(\underbrace{\kappa,\ldots,\lambda}_p,\underbrace{u,\ldots,v}_q)\cdot y(\underbrace{\mu,\ldots,\nu}_{r},\underbrace{w,\ldots,z}_{s})$$ It is not hard to check that this function is indeed also multilinear, and therefore $x\color{blue}{\otimes} y\in T^{p+r}_{q+s}V$.

And now finally, we get to the question what the components of $x\color{blue}{\otimes} y$ are. Well, the components of $x\color{blue}{\otimes} y$ are just the function values when inserting basis vectors and basis covectors, and when you do that and use the definition of the tensor product, you find that indeed, the components of the tensor product are the Kronecker product of the components of the factors.
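Here is a minimal sketch of that claim (the helper `tensor_product` and all concrete numbers are mine, not from the lecture): it implements the tensor product of multilinear maps exactly as defined above, and checks for two covectors that the components of the product are the outer/Kronecker product of the components of the factors.

```python
import numpy as np

e = np.eye(3)

def tensor_product(x, p, q, y, r, s):
    """Tensor product of multilinear maps, as defined above: feed the first
    p covectors and first q vectors to x, the rest to y, then multiply."""
    def xy(*args):
        covectors, vectors = args[:p + r], args[p + r:]
        return x(*covectors[:p], *vectors[:q]) * y(*covectors[p:], *vectors[q:])
    return xy

# Two (0,1) tensors, i.e. covectors (arbitrary examples):
phi = lambda v: 2*v[0] - v[1]
psi = lambda v: v[1] + 3*v[2]

g = tensor_product(phi, 0, 1, psi, 0, 1)   # a (0,2) tensor

# Components of g, and of the two factors:
G = np.array([[g(e[i], e[j]) for j in range(3)] for i in range(3)])
phi_c = np.array([phi(e[i]) for i in range(3)])
psi_c = np.array([psi(e[i]) for i in range(3)])

# The components of the tensor product are the outer/Kronecker product:
print(np.allclose(G, np.outer(phi_c, psi_c)))          # True
print(np.allclose(G.ravel(), np.kron(phi_c, psi_c)))   # True
```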

Also, it can be shown that $T^p_q V$ is a vector space in its own right, and therefore the $(p,q)$-tensors can be written as linear combinations of basis tensors, each of which is $1$ for exactly one combination of basis vectors and basis covectors and $0$ for all other combinations. It is then easy to see that such a basis tensor is just the tensor product of the corresponding vectors/dual covectors. Since furthermore, in that basis, the coefficients are just the components of the tensor as introduced before, we finally arrive at the formula $$T = \sum T^{i\ldots j}_{k\ldots l}\,\underbrace{e_i\color{blue}{\otimes}\cdots\color{blue}{\otimes} e_j}_{p}\color{blue}{\otimes}\underbrace{\omega^k\color{blue}{\otimes}\cdots\color{blue}{\otimes}\omega^l}_{q}$$
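As a closing sanity check on this expansion formula (a sketch under the same assumptions as the earlier snippets, reusing the same hypothetical bilinear function), one can rebuild a $(0,2)$ tensor from its components and the dual basis covectors:

```python
import numpy as np

e = np.eye(3)
omega = [lambda v, i=i: v[i] for i in range(3)]   # dual basis: omega^i(e_j) = delta^i_j

# The (0,2) tensor from the earlier snippet and its components F[k,l] = f(e_k, e_l):
f = lambda v, w: v[0]*w[1] + 3*v[2]*w[2] - v[1]*w[0]
F = np.array([[f(e[k], e[l]) for l in range(3)] for k in range(3)])

# Rebuild f as T = sum_{k,l} F[k,l] * omega^k (x) omega^l, evaluated pointwise:
T = lambda v, w: sum(F[k, l] * omega[k](v) * omega[l](w)
                     for k in range(3) for l in range(3))

v, w = np.array([1.0, 2.0, 0.0]), np.array([0.0, 1.0, 4.0])
print(f(v, w), T(v, w))                # 1.0 1.0 -- the expansion reproduces f
```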

Solution 2:

Taking into account what has been said here, recall that if $\{e_1,...,e_n\}$ is a basis of $V$ and $\{\omega^1,...,\omega^n\}$ its dual basis, then any $T\in\mathcal T^{(p,q)}(V)$ can be written:

\begin{equation} T=\sum \lambda_{i_1, \cdots, i_p}^{j_1, \cdots, j_q} e_{i_1} \otimes \cdots \otimes e_{i_p} \otimes \omega^{j_1} \otimes \cdots \otimes \omega^{j_q} \end{equation}

Think about the simple cases. Tensors of type $(0,1)$ are linear maps $T:V\rightarrow K$, that is, elements of $V^*$. Now, tensors of type $(0,2)$ are usually called bilinear forms. They are multilinear maps $T:V\times V \rightarrow K$. The easiest way of building a $(0,2)$-tensor is picking two covectors $f_1$ and $f_2$ and using the tensor product: $$T=f_1\otimes f_2$$ (note that I wrote subscripts here; there is no real difference from writing superscripts, and it's advisable to distinguish them in the general case, but for $\mathcal T^{(0,2)}(V)$ it's all right).

Now, using the first equation, every $(0,2)$-tensor looks like this: $$T=\sum \lambda_{ij} \ \omega^i\otimes \omega^j$$ (recall that $\{\omega^1,...,\omega^n\}$ is a basis of $V^*$). So now you can see that $\mathcal T^{(0,2)}(V)=V^*\otimes V^*$.

In the general case, all tensors are sums of multiples of tensors like this: $$e_{i_1} \otimes \cdots \otimes e_{i_p} \otimes \omega^{j_1} \otimes \cdots \otimes \omega^{j_q} \qquad \qquad (1)$$

so this is why $\mathcal T^{(p,q)}(V)=\underbrace{V\otimes \cdots \otimes V}_{p \text{ times}} \otimes \underbrace{V^*\otimes \cdots \otimes V^*}_{q \text{ times}}$.


Let us focus on tensors like $(1)$. If $\xi^1, \ldots, \xi^p\in V^*$ and $v_1, \ldots, v_q\in V$, then

$$e_{i_1} \otimes \cdots \otimes e_{i_p} \otimes \omega^{j_1} \otimes \cdots \otimes \omega^{j_q}(\xi^1, \ldots, \xi^p, v_1, \ldots, v_q)=e_{i_1}(\xi^1)\cdots e_{i_p}(\xi^p) \cdot \omega^{j_1}(v_1)\cdots \omega^{j_q}(v_q)$$

as I suppose you know. So, in a certain way, you could say that these $\omega$'s are waiting for a vector to produce numbers that later will be multiplied.
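A tiny numerical check of this evaluation rule for a single elementary $(1,1)$ tensor (all concrete numbers are my own choices; note Python's $0$-based indexing against the $1$-based indices in the math):

```python
import numpy as np

e = np.eye(3)
omega = [lambda v, i=i: v[i] for i in range(3)]   # dual basis of R^3

# Elementary (1,1) tensor e_2 (x) omega^1 evaluated on (xi, v); remember that
# Python is 0-based, so e_2 is e[1] and omega^1 is omega[0].
xi = lambda u: 4*u[0] + u[1]           # a covector with components (4, 1, 0)
v = np.array([2.0, 5.0, 7.0])

# e_2 acts on xi through the double dual: e_2(xi) = xi(e_2) = xi_2.
value = xi(e[1]) * omega[0](v)         # xi_2 * v^1
print(value)                           # 1.0 * 2.0 = 2.0
```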

Now, forget that you know the elements of $V^*$ are functionals. They are now simply vectors (as they are elements of a vector space), and the dual of $V^*$ is $V^{**}\cong V$. So the $e$'s, considered as functionals over $V^*$, are also waiting for a vector (but now a "vector" is an element of $V^*$) to produce numbers that will be multiplied.

The difficulty with these issues is that one has to keep the different points of view in mind. It is true that $V^*$ is the dual of $V$, and the elements of $V^*$ are functionals over $V$. But, as $V^*$ is again a vector space, we can forget for a moment about $V$, consider the elements of $V^*$ as simple vectors, and take the dual $V^{**}$. So, in the end, vectors and covectors behave in really similar ways.

Regarding the isomorphism $\phi: V\to V^{**}$: pick $v\in V$. Now $\phi(v)$ is a functional on $V^*$, and to define what $\phi(v)$ is, we have to say what the value $\phi(v)(\omega)\in K$ is for every $\omega\in V^*$. But we know $\omega$ is a functional on $V$, so $\omega(v)\in K$. So we use this fact to define $\phi(v)$:

$$\phi(v)(\omega)=\omega(v)$$

In the end, we would simply write $\phi(v)=v$, so we get $$v(\omega)=\omega(v)$$ and this is why the theorem stating the isomorphism $V\to V^{**}$ is called the reflexivity theorem. When you see $v(\omega)$, just think of it as the same as $\omega(v)$.
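A one-line numerical rendering of this identification (the helper `as_double_dual` is a hypothetical name of mine):

```python
import numpy as np

# A vector v, viewed as a functional on covectors via phi(v)(omega) = omega(v):
as_double_dual = lambda v: (lambda omega: omega(v))

v = np.array([1.0, 2.0, 3.0])
omega = lambda u: 2*u[0] - u[1] + 5*u[2]    # some covector (my example)

v_dd = as_double_dual(v)
print(v_dd(omega), omega(v))                # 15.0 15.0 : "v(omega) = omega(v)"
```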

Solution 3:

The space of linear maps $f:V\to k$ is $V^*$. More generally, one can show that the space of multilinear maps from $\prod_{i=1}^n M_i$ to $k$ is exactly $M_1^*\otimes M_2^*\otimes\cdots\otimes M_n^*$, which is what is going on here.

So the space of multilinear maps $\prod_1^p V^*\times \prod_1^q V\to k$ would be $\bigotimes_1^p V^{**}\otimes \bigotimes_1^q V^*$.

Since this definition seems to be implicitly working with finite-dimensional spaces, $V^{**}$ can be replaced with $V$, which recovers Eq. (2).