I'm trying to understand tensors, and I know they have something to do with the basis and the dual basis of a vector space and its dual space. First I will give a concrete example to make clear what I want to understand.

Let $$v_1=\left(\begin{array}{cc} 2 \\ 3 \\ \end{array}\right)$$

$$v_2= \left(\begin{array}{cc} 1 \\ 2 \\ \end{array}\right) $$

be a basis for some vector space $V$ (over the reals), and let $v$ be in $V$.

$$v=\left(\begin{array}{cc} 4 \\ 2 \\ \end{array}\right) $$

Then I find the coordinates of $v$ in this basis, i.e., scalars $c_1, c_2$ such that

$$\left(\begin{array}{cc} 4 \\ 2 \\ \end{array}\right) = c_1\left(\begin{array}{cc} 2 \\ 3 \\ \end{array}\right)+ c_2\left(\begin{array}{cc} 1 \\ 2 \\ \end{array}\right) $$

In matrix form: $$\left(\begin{array}{cc} 4 \\ 2 \\ \end{array}\right)= \left(\begin{array}{cc}2&1\\3&2\end{array}\right)\left(\begin{array}{cc} c_1 \\ c_2 \\ \end{array}\right) $$

With solution:

$$\left(\begin{array}{cc} c_1 \\ c_2 \\ \end{array}\right)= \left(\begin{array}{cc} 6 \\ -8 \\ \end{array}\right)$$
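
As a quick check that these coefficients really reproduce $v$:

$$6\left(\begin{array}{cc} 2 \\ 3 \\ \end{array}\right)-8\left(\begin{array}{cc} 1 \\ 2 \\ \end{array}\right)=\left(\begin{array}{cc} 12-8 \\ 18-16 \\ \end{array}\right)=\left(\begin{array}{cc} 4 \\ 2 \\ \end{array}\right)$$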

So I have two transformations:

1. $$\left(\begin{array}{cc} 4 \\ 2 \\ \end{array}\right)= \left(\begin{array}{cc}2&1\\3&2\end{array}\right)\left(\begin{array}{cc} 6 \\ -8 \\ \end{array}\right)$$

2. With the inverse matrix: $$\left(\begin{array}{cc}2&-1\\-3&2\end{array}\right)\left(\begin{array}{cc} 4 \\ 2 \\ \end{array}\right)= \left(\begin{array}{cc} 6 \\ -8 \\ \end{array}\right)$$

Now I know that the dual basis is just the rows of the inverse matrix:

$$ B^* = (2, -1), (-3, 2)$$ which gives the Kronecker delta when multiplied, in matrix form, with the original (non-inverted) matrix.
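
As a check, each of these rows paired with the basis vectors really gives $\delta_{ij}$:

$$(2,-1)\left(\begin{array}{cc} 2 \\ 3 \\ \end{array}\right)=1,\quad (2,-1)\left(\begin{array}{cc} 1 \\ 2 \\ \end{array}\right)=0,\quad (-3,2)\left(\begin{array}{cc} 2 \\ 3 \\ \end{array}\right)=0,\quad (-3,2)\left(\begin{array}{cc} 1 \\ 2 \\ \end{array}\right)=1$$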

For simplicity let's write:

$$S = \left(\begin{array}{cc}2&1\\3&2\end{array}\right)$$ $$S^{-1} = \left(\begin{array}{cc}2&-1\\-3&2\end{array}\right) = T $$

So we can write it as this:

1. $$\left(\begin{array}{cc} 4 \\ 2 \\ \end{array}\right)= S\left(\begin{array}{cc} 6 \\ -8 \\ \end{array}\right)$$

2. $$T\left(\begin{array}{cc} 4 \\ 2 \\ \end{array}\right)= \left(\begin{array}{cc} 6 \\ -8 \\ \end{array}\right)$$

Now I'm almost there to get a tensor, I think. But I'm confused by all the bases and transformations. What role does the dual basis play in this? And how can I construct a tensor out of these bases? Do I just glue all the bases together, with dual basis indices up and ordinary basis indices down?

I think what would help me is if someone could show me how to do the procedure above, but with covectors (another name for dual vectors) instead of vectors. For example, let's have a dual basis and some covector, and then find the coefficients with the inverse of the dual basis, and so on. I would appreciate a concrete example similar to mine above. Thanks for responding!

Happy Holidays


Solution 1:

The short answer: You do essentially the same thing as before, but in reverse. You take the matrix whose rows are your covectors, take its inverse, and the columns of the resulting matrix will be the dual basis to the covectors.
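
To make this concrete with the numbers from the question (the mirror image of the computation there): take the covectors $\phi_1=(2,-1)$ and $\phi_2=(-3,2)$ as the starting data, and pick some covector to decompose, say $\psi=(4,2)$ (an arbitrary choice, just for illustration). The matrix with rows $\phi_1,\phi_2$ is $T$, its inverse is $S$, and the columns of $S$, namely $v_1$ and $v_2$, form the basis of $V$ dual to $\phi_1,\phi_2$ (viewing vectors as co-covectors). The coefficients of $\psi$ in the basis $\phi_1,\phi_2$ come from multiplying by the inverse on the right:

$$(a_1,a_2)=\psi\, T^{-1}=(4,2)\pmatrix{2 & 1 \\ 3 & 2}=(14,8),$$

and indeed $14\,(2,-1)+8\,(-3,2)=(28-24,\;-14+16)=(4,2)=\psi$.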

The long answer:

It is important to separate out a few different things here, which are easy to conflate when you are working with numerical examples. The big difficulty is that there are multiple different isomorphic vector spaces and maps between them, and the algorithms for dealing with everything can be confusing if you don't separate what things are from how they are represented. In what follows, I will be dealing with finite dimensional real vector spaces, but most of the discussion generalizes.

Given a finite dimensional real vector space $V$ of dimension $n$, we have a non-canonical isomorphism $\mathbb R^n\to V$, and the images of the standard basis vectors (e.g., $e_2=(0,1,0,0,\ldots,0)^T$) will be a basis for $V$. Where things get confusing is if $V$ was already $\mathbb R^n$ to begin with, in which case we will have two bases floating around: the standard basis, and a second basis. In your first example, your second basis is given by the columns of $A=\pmatrix{2 & 1 \\ 3 & 2}$. Here, $A$ is just the map $\mathbb R^n \to V$ mentioned above. It is convenient, therefore, to represent vectors as column vectors (i.e., $n\times 1$ matrices), and bases as matrices whose columns are the basis vectors.
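
For instance, with the question's numbers, $A=S$ sends the standard basis to the chosen basis:

$$A e_1 = \pmatrix{2 & 1 \\ 3 & 2}\pmatrix{1 \\ 0}=\pmatrix{2 \\ 3}=v_1, \qquad A e_2 = \pmatrix{2 & 1 \\ 3 & 2}\pmatrix{0 \\ 1}=\pmatrix{1 \\ 2}=v_2.$$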

Given a vector space $V$, we have a dual vector space $V^*$, which is just the collection of linear maps $V\to \mathbb R$. Given a basis $v_1,\ldots, v_n$ of $V$, the dual basis of $V^*$ is by definition the collection $\phi_1, \ldots, \phi_n\in V^*$ such that $\phi_i(v_j)=\delta_{ij}$, where $\delta$ is the Kronecker delta.

A map $\mathbb R^n\to \mathbb R$ is given by a $1\times n$ matrix (a row vector), and the action of such covectors on our (representations of) vectors is just given by matrix multiplication. Given a basis of $V^*$, we associate to it the matrix whose rows are the covectors in the basis. If the rows of the matrix $B$ are the covectors $\phi_1,\ldots, \phi_n$ and the columns of the matrix $A$ are the vectors $v_1, \ldots, v_n$, then the $ij$th entry of the matrix $BA$ is $\phi_i(v_j)$. We therefore have the dual basis if $BA=I_n$, which explains your process for getting the dual basis.
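
In the question's example, $B=T$ and $A=S$, and indeed

$$BA=\pmatrix{2 & -1 \\ -3 & 2}\pmatrix{2 & 1 \\ 3 & 2}=\pmatrix{1 & 0 \\ 0 & 1}=I_2,$$

so the rows of $T$ form the dual basis to the columns of $S$.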

It is worth pointing out that when we write a covector as a row vector, we are implicitly writing it in terms of the transposes of the standard basis vectors. If we take the standard basis vectors as a basis for $V=\mathbb R^n$, their transposes form the dual basis for $V^*$. What we have done, therefore, is taken a basis for $\mathbb R^n$, expressed in terms of the standard basis, and expressed its dual basis in terms of the dual of the standard basis.
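
Concretely, in the two-dimensional example: $e_1^*=(1,0)$ and $e_2^*=(0,1)$, and the dual basis covectors found above are

$$\phi_1=(2,-1)=2e_1^*-e_2^*, \qquad \phi_2=(-3,2)=-3e_1^*+2e_2^*.$$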


In the above, we've chosen to represent vectors of $\mathbb R^n$ as column vectors and the corresponding covectors as row vectors. The difference between being a row vector and a column vector is just a matter of taking the transpose, and if we don't have vectors for them to act on, covectors aren't really anything special; they are just vectors in a (different, yet isomorphic) vector space. (N.B., the fact that the connection between our vectors and covectors boils down to a transpose essentially comes from the fact that $\mathbb R^n$ has the dot product; this is important, and the key to some things you do with tensors, but it is outside the scope of the question.)

Because the space of covectors is just a particular vector space, you can repeat the process of taking the dual to this vector space, and given a basis of covectors, you can ask about the dual basis of co-covectors. But what is a co-covector? How do we represent such objects? And in that representation, what do our algorithms look like?

Given a vector space $V$ and its dual space $V^*$, we can form the double dual $(V^*)^*$. If $f\in (V^*)^*$, it takes in a covector and spits out a number. We already have an easy way to get a number out of a covector: evaluate it at a vector. This gives us a map $V\to (V^*)^*$, $v\mapsto f_v$, defined by $f_v(\phi)=\phi(v)$. This map is injective, and so $V\subset (V^*)^*$. In general, $V$ is a proper subspace, but when we are finite dimensional, we have equality! If $V\cong W$, then $V^*\cong W^*$, and so because $(\mathbb R^n)^*\cong \mathbb R^n$, we have $((\mathbb R^n)^*)^*\cong (\mathbb R^n)^*\cong \mathbb R^n$. In our case, co-covectors are just vectors.
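
To see this in the running example: with $v=\pmatrix{4 \\ 2}$ and the covectors $\phi_1=(2,-1)$, $\phi_2=(-3,2)$ from before,

$$f_v(\phi_1)=\phi_1(v)=2\cdot 4-1\cdot 2=6, \qquad f_v(\phi_2)=\phi_2(v)=-3\cdot 4+2\cdot 2=-8,$$

which are exactly the coordinates of $v$ in the basis $v_1, v_2$ — as they should be, since $\phi_1,\phi_2$ is the dual basis.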


We now have a choice to make. We can either

  • Write our covectors as vectors in the basis $e_1^*, e_2^*, \ldots, e_n^*$ (the dual basis of the standard basis for $\mathbb R^n$), write these covectors as column vectors, and perform our original algorithm (take the rows of the inverse matrix as the dual basis).

  • Use the fact that co-covectors (acting on the left) are the same thing as vectors (acting on the right, by being acted upon), so that we're back where we started, and run our algorithm in reverse.

Both these choices give the same answer. To see why, let's write out what we do more explicitly.

Let $B$ be the matrix whose rows are our covectors. If we make the first choice, then we form the matrix $B^T$, take the inverse, and take the rows. This gives us our dual basis, expressed in terms of the dual of the dual basis of the standard basis (which is just the standard basis).

If we make the second choice, then we are looking at the columns of $B^{-1}$. Let us compare our two choices. Since $(XY)^T=Y^TX^T$ and $(X^T)^{-1}=(X^{-1})^T$, the rows of $(B^T)^{-1}=(B^{-1})^T$ are just the columns of $B^{-1}$, so the two approaches give exactly the same result (beyond the superficial difference that we have row vectors in one version and column vectors in the other).
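
Running both choices on the question's numbers, with $B=T$: the first choice gives

$$(B^T)^{-1}=\pmatrix{2 & -3 \\ -1 & 2}^{-1}=\pmatrix{2 & 3 \\ 1 & 2},$$

whose rows are $(2,3)$ and $(1,2)$; the second choice gives the columns of $B^{-1}=S$, namely $\pmatrix{2 \\ 3}$ and $\pmatrix{1 \\ 2}$ — the same vectors $v_1, v_2$, written once as rows and once as columns.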