Intuitive way to understand covariance and contravariance in Tensor Algebra

I'm trying to understand basic tensor analysis. I understand the basic concept that the valency of the tensor determines how it is transformed, but I am having trouble visualizing the difference between different valencies when it comes to higher order tensors.

I have this picture in my mind for the lower order tensors

$X^i = \left(\begin{array}{c} x^1 \\\\ x^2 \\\\ x^3\end{array}\right)$

$X_i = \left(\begin{array}{ccc} x_1 & x_2 & x_3\end{array}\right)$

$X^i_j = \left(\begin{array}{ccc} x^1_1 & x^1_2 & x^1_3 \\\\ x^2_1 & x^2_2 & x^2_3 \\\\ x^3_1 & x^3_2 & x^3_3\end{array} \right)$

For $X^{ij}$ and $X_{ij}$, the components are laid out in the same 2D array, but the action on a vector isn't defined in the same way as for matrices.

What I am having trouble with is intuitively understanding the difference between $X^{ijk}$, $X_{k}^{ij}$, $X_{jk}^{i}$, and $X_{ijk}$ (other permutations of the valence $(2,1)$ and $(1,2)$ omitted for brevity).

ADDED: After reading the responses and their comments, I came up with this new picture in my head for higher-order tensors.

Since I am somewhat comfortable with tensor products in quantum mechanics, I can draw a parallel with the specific tensor space I'm used to.

If we consider a rank-5 tensor with a valence of $(2,3)$, then we can write it in bra-ket notation as

$ \langle \psi_i \mid \otimes \ \langle \psi_j \mid \otimes \ \langle \psi_k \mid \otimes \mid \psi_l \rangle \ \otimes \mid \psi_m \rangle = X_{ijk}^{lm} $

Now if we operate with this tensor on a rank-3 contravariant tensor, we are left with a constant (from the inner products) times a rank-2 contravariant tensor, the remaining unmixed tensor product $\begin{eqnarray}(\langle \psi_i \mid \otimes \ \langle \psi_j \mid \otimes \ \langle \psi_k \mid \otimes \mid \psi_l \rangle \ \otimes \mid \psi_m \rangle)(\mid \Psi_i \rangle \ \otimes \mid \Psi_j \rangle \ \otimes \mid \Psi_k \rangle) &=& c \mid \psi_l \rangle \ \otimes \mid \psi_m \rangle \\\\ &=& X_{ijk}^{lm}\Psi^{ijk} = cX'^{lm}\end{eqnarray}$

If we were to further operate with a rank-2 covariant tensor (from the right, per the convention that a covector and a vector facing each other form an implied inner product) we would simply get a number out.
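To convince myself the index bookkeeping works out, here is a small numerical sketch of my own (not part of any standard notation), using numpy's einsum with dimension 3 and random placeholder components: contracting the three covariant slots against a rank-3 contravariant tensor leaves a rank-2 contravariant object, and feeding in a rank-2 covariant tensor afterwards leaves a plain number.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 3

X = rng.normal(size=(dim,) * 5)      # components X_{ijk}^{lm}, index order (i, j, k, l, m)
Psi = rng.normal(size=(dim,) * 3)    # rank-3 contravariant tensor Psi^{ijk}
Y = rng.normal(size=(dim,) * 2)      # rank-2 covariant tensor Y_{lm}

# Contracting the three covariant slots against Psi leaves only the
# two contravariant indices l, m.
X_prime = np.einsum('ijklm,ijk->lm', X, Psi)
print(X_prime.shape)                 # (3, 3)

# Contracting the remaining two slots against a covariant tensor
# uses up every index and leaves a single number.
print(np.einsum('ijklm,ijk,lm->', X, Psi, Y))
```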

One thing I am confused about, though, is that in one of the answers to this question a point was made that we are taking tensor products of a vector space with itself (and possibly its dual). However, in the quantum mechanics picture (although I didn't rely on it in this example) we often take tensor products between different, often disjoint, subspaces of the enormous Hilbert space that describes the quantum mechanical universe. Does the tensor picture change in this case?

Any comments on my example would be appreciated.


Since you asked for an intuitive way to understand covariance and contravariance, I think this will do.

First of all, remember that the reason for having covariant or contravariant tensors is that you want to represent the same thing in different coordinate systems. Such a new representation is achieved by a transformation using a set of partial derivatives. In tensor analysis, a good transformation is one that leaves invariant the quantity you are interested in.

For example, we consider the transformation from one coordinate system $x^1,...,x^{n}$ to another $x^{'1},...,x^{'n}$:

$x^{i}=f^{i}(x^{'1},x^{'2},...,x^{'n})$ where $f^{i}$ are certain functions.

Take a look at a couple of specific quantities. How do the coordinate differentials transform? The answer is:

$dx^{i}=\displaystyle \frac{\partial x^{i}}{\partial x^{'k}}dx^{'k}$

Every quantity which, under a transformation of coordinates, transforms like the coordinate differentials is called a contravariant tensor.
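To make this concrete, here is a small numerical check (my own sketch, using numpy; the Cartesian/polar pair and the sample point are arbitrary choices) that the coordinate differentials really do transform with the Jacobian $\partial x^{i}/\partial x^{'k}$:

```python
import numpy as np

# Primed system: polar coordinates (r, theta); unprimed: Cartesian (x, y).
r, theta = 2.0, 0.5                  # an arbitrary sample point
dr, dtheta = 1e-6, 2e-6              # a small primed displacement dx'^k

# Jacobian dx^i/dx'^k of x = r cos(theta), y = r sin(theta)
J = np.array([[np.cos(theta), -r * np.sin(theta)],
              [np.sin(theta),  r * np.cos(theta)]])

dx_contravariant = J @ np.array([dr, dtheta])

# Compare with the actual change of the Cartesian coordinates.
x0 = np.array([r * np.cos(theta), r * np.sin(theta)])
x1 = np.array([(r + dr) * np.cos(theta + dtheta),
               (r + dr) * np.sin(theta + dtheta)])

print(np.allclose(dx_contravariant, x1 - x0, atol=1e-10))  # True, to first order
```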

How do the partial derivatives of some scalar $\Phi$ transform?

$\displaystyle \frac{\partial \Phi}{\partial x^{i}}=\frac{\partial \Phi}{\partial x^{'k}}\frac{\partial x^{'k}}{\partial x^{i}}$

Every quantity which, under a coordinate transformation, transforms like the derivatives of a scalar is called a covariant tensor.
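Again as a sketch of my own (same Cartesian/polar pair as above, with the arbitrary choice $\Phi(x,y)=x^2+y$), the primed components $\partial\Phi/\partial x^{'k}$ contracted with $\partial x^{'k}/\partial x^{i}$ reproduce the Cartesian gradient, exactly as the covariant rule says:

```python
import numpy as np

r, theta = 2.0, 0.5
x, y = r * np.cos(theta), r * np.sin(theta)

# Phi(x, y) = x**2 + y, written in polar coordinates:
# Phi(r, theta) = r**2 cos(theta)**2 + r sin(theta)
dPhi_dr = 2 * r * np.cos(theta)**2 + np.sin(theta)
dPhi_dtheta = -2 * r**2 * np.cos(theta) * np.sin(theta) + r * np.cos(theta)

# dx'^k/dx^i is the inverse of the Jacobian dx^i/dx'^k used above.
J = np.array([[np.cos(theta), -r * np.sin(theta)],
              [np.sin(theta),  r * np.cos(theta)]])
J_inv = np.linalg.inv(J)

# Covariant transformation: contract the primed components with dx'^k/dx^i.
grad_cartesian = np.array([dPhi_dr, dPhi_dtheta]) @ J_inv

print(np.allclose(grad_cartesian, [2 * x, 1.0]))  # gradient of x**2 + y is (2x, 1)
```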

Accordingly, a reasonable generalization is having a quantity which transforms like the product of the components of two contravariant tensors, that is

$A^{ik}=\displaystyle \frac{\partial x^{i}}{\partial x^{'l}}\frac{\partial x^{k}}{\partial x^{'m}}A^{'lm}$

which is called a contravariant tensor of rank two. The same applies to covariant tensors of rank $n$ or mixed tensors of rank $n$.
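As a quick sanity check of the rank-two rule (my own numpy sketch; the Jacobian and the components $A^{'lm}$ are random placeholders), one Jacobian factor goes with each contravariant index, which for rank two is just $J A' J^{T}$ in matrix form:

```python
import numpy as np

rng = np.random.default_rng(1)
dim = 3

J = rng.normal(size=(dim, dim))          # stands in for dx^i/dx'^l at a fixed point
A_primed = rng.normal(size=(dim, dim))   # components A'^{lm}

# One Jacobian factor per contravariant index.
A = np.einsum('il,km,lm->ik', J, J, A_primed)

print(np.allclose(A, J @ A_primed @ J.T))  # True: same as the matrix product J A' J^T
```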

Having in mind the analogy to coordinate differentials and derivative of a scalar, take a look at this picture, which I think will help to make it clearer:

From Wikipedia:

[Figure: the contravariant and covariant components of a vector]

The contravariant components of a vector are obtained by projecting onto the coordinate axes. The covariant components are obtained by projecting onto the normal lines to the coordinate hyperplanes.
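Here is a small numeric illustration of that picture (my own sketch; the oblique basis of $\mathbb{R}^2$ below is an arbitrary choice): the contravariant components are the expansion coefficients along the axes, the covariant components are the dot products with the basis vectors, and the metric converts one into the other.

```python
import numpy as np

# An oblique (non-orthogonal) basis of R^2, chosen arbitrarily.
e1 = np.array([1.0, 0.0])
e2 = np.array([1.0, 1.0])
E = np.column_stack([e1, e2])

v = np.array([2.0, 3.0])

v_contra = np.linalg.solve(E, v)          # v = v^1 e_1 + v^2 e_2
v_co = np.array([v @ e1, v @ e2])         # v_i = v . e_i

g = E.T @ E                               # metric g_ij = e_i . e_j
print(np.allclose(v_co, g @ v_contra))    # True: lowering the index, v_i = g_ij v^j
```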

Finally, you may want to read: Basis vectors

By the way, I don't recommend relying blindly on the picture given by matrices, especially when you are doing calculations.


I prefer to think of them as maps instead of matrices. When you move to tensor bundles over manifolds, you won't have global coordinates, so it might be preferable to think this way.

So $x_i$ is a map which sends vectors to reals. Since it's a tensor, you're only concerned with how it acts on basis elements. It's nice to think of them in terms of dual bases: then $x_i(x^j)=\delta_{ij}$, which is defined as $1$ when $i=j$ and $0$ otherwise.

Similarly, $x^i$ is a map which sends covectors to reals, and is defined by $x^i(x_j)=\delta_{ij}$.

If you have more indices, then you're dealing with a tensor product $V^*\otimes\dotsb\otimes V^*\otimes V\otimes\dotsb\otimes V$, say with $n$ copies of the vector space and $m$ copies of the dual. An element of this vector space takes in $m$ vectors and gives you back $n$ vectors (more precisely, an element of the $n$-fold tensor product of $V$), again in a tensorial way. So, for example, $X_{ijk}$ is a trilinear map; $X^{ijk}$ is a trivector (an ordered triple of vectors up to linearity); $X_{ij}^k$ is a bilinear map taking two vectors to one vector; and so on.

It's worth thinking about these in terms of the tensors you've seen already. The dot product, for example, is your basic (0,2)-tensor. The cross product is a (1,2)-tensor. If you study Riemannian manifolds, it turns out you can use the metric to "raise and lower indices"; so the Riemannian curvature tensor, for example, is alternately defined as a (1,3)-tensor and a (0,4)-tensor, depending on the author's needs.
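To make the "tensors as maps" picture concrete, here is a small sketch of mine (using numpy, in an orthonormal basis of $\mathbb{R}^3$, so index placement is not critical): the cross product written as a (1,2)-tensor built from the Levi-Civita symbol, which eats two vectors and returns one.

```python
import numpy as np

# Levi-Civita symbol eps[k, i, j], so that (u x v)^k = eps^k_{ij} u^i v^j.
eps = np.zeros((3, 3, 3))
for k, i, j in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    eps[k, i, j] = 1.0
    eps[k, j, i] = -1.0

u = np.array([1.0, 2.0, 3.0])
v = np.array([4.0, 5.0, 6.0])

# The (1,2)-tensor takes two vectors and gives back one vector.
w = np.einsum('kij,i,j->k', eps, u, v)

print(np.allclose(w, np.cross(u, v)))  # True
```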


The covariance or contravariance of a quantity tells you how to transform it to keep the result invariant under a change of coordinate system. You transform covariant quantities one way, while you do the inverse with the contravariant ones.

To describe a vector you need coordinates $v^j$ and basis vectors $\mathbf{e_j}$. So the linear combination of the two gives you the actual vector $v^j \mathbf{e_j}$.

But you are free to choose the basis, so in a different basis the same vector may be described as $w^j \mathbf{f_j}$.

So $v^j \mathbf{e_j} = w^j \mathbf{f_j}$

The basis vectors themselves can be expressed as linear combinations of the other basis vectors:

$\mathbf{e_j} = A^k_j \mathbf{f_k}$.

Here $A$ is the basis transformation matrix. Let's have another matrix $B$ which is the inverse of $A$, so their product gives the identity matrix (Kronecker delta):

$B^l_j A^k_l = \delta^k_j$

Let's take $w^j \mathbf{f_j}$ and multiply it by the identity; nothing changes:

$w^j \delta^k_j \mathbf{f_k}$

Expand the delta as a product of the two matrices, nothing changes:

$w^j B^l_j A^k_l \mathbf{f_k}$

Parenthesize it like this and you can see something:

$\left( w^j B^l_j \right) \left( A^k_l \mathbf{f_k} \right)$

In the right bracket you get back $\mathbf{e_l}$, while the left bracket must be $v^l$.

You can see the basis vectors are transformed with $A$, while the coordinates are transformed with $B$. The basis vectors vary in one way, while the coordinates vary in exactly the opposite way. The basis vectors are covariant, the coordinates are contravariant.
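Here is the same argument as a small numerical check (my own sketch; the dimension and the random matrices are arbitrary): with $B = A^{-1}$, sending the components one way and the basis vectors the other way leaves the vector itself untouched.

```python
import numpy as np

rng = np.random.default_rng(2)
dim = 3

f = np.eye(dim)                     # columns are the basis vectors f_k (standard basis here)
A = rng.normal(size=(dim, dim))     # basis change: e_j = A^k_j f_k
B = np.linalg.inv(A)                # so that B^l_j A^k_l = delta^k_j

e = f @ A                           # ambient components of the basis vectors e_j

w = rng.normal(size=dim)            # components w^j in the f basis
v = B @ w                           # components v^l = w^j B^l_j in the e basis

# The geometric vector is the same in both descriptions.
print(np.allclose(e @ v, f @ w))    # True
```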

Upper and lower indexes just denote whether you need to use the basis change matrix or its inverse. So if you have a tensor, say ${F^{abc}}_{defg}$, then based on the index placement alone you already know how to transform it to a different coordinate system: ${F^{abc}}_{defg} B^h_a B^i_b B^j_c A^d_k A^e_l A^f_m A^g_n$.
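Written out with numpy's einsum (a sketch of mine; $A$, $B$ and the components of $F$ are random placeholders, and the dimension is kept at 2 to keep the array small), the rule is literally one factor of $B$ per upper index and one factor of $A$ per lower index:

```python
import numpy as np

rng = np.random.default_rng(3)
dim = 2

A = rng.normal(size=(dim, dim))
B = np.linalg.inv(A)

F = rng.normal(size=(dim,) * 7)     # components F^{abc}_{defg}

# One B per upper index (a, b, c), one A per lower index (d, e, f, g).
F_new = np.einsum('abcdefg,ha,ib,jc,dk,el,fm,gn->hijklmn',
                  F, B, B, B, A, A, A, A)

print(F_new.shape)                  # (2, 2, 2, 2, 2, 2, 2)
```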

Also, if you take care to always match the upper indexes with the lower ones when multiplying, the result will be invariant and coordinate-system independent. This is an opportunity to self-check your work.

Index placement is also helpful to check whether an object is really a tensor or just a symbol.

For example, the metric tensor $g_{ij}$ has two covariant indexes, which means that in a different coordinate system it must look like this: $g_{ij} = \tilde g_{kl} A^k_i A^l_j$.

And indeed: $g_{ij} = \mathbf{e_i} \cdot \mathbf{e_j} = \left( \mathbf{f_k} A^k_i \right) \cdot \left( \mathbf{f_l} A^l_j \right) = \left( \mathbf{f_k} \cdot \mathbf{f_l} \right) A^k_i A^l_j = \tilde{g}_{kl} A^k_i A^l_j $
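The same check, done numerically (my own sketch; the dimension and the random bases are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(4)
dim = 3

f = rng.normal(size=(dim, dim))   # columns f_k of some basis (not necessarily orthonormal)
A = rng.normal(size=(dim, dim))   # e_i = A^k_i f_k
e = f @ A                         # ambient components of the e basis

g = e.T @ e                       # g_ij = e_i . e_j
g_tilde = f.T @ f                 # g~_kl = f_k . f_l

print(np.allclose(g, np.einsum('kl,ki,lj->ij', g_tilde, A, A)))  # True
```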

Similarly you can check that the Christoffel symbols $\Gamma^m_{jk}$ aren't tensors, because they don't transform like that.

The covariant derivative $\nabla_k v^m = \partial_k v^m + v^j \Gamma^m_{jk}$, on the other hand, does transform like a tensor, but showing that would require more symbol folding.