Why is a tensor from a vector space covariant, not contravariant?

The notions "covariant" and "contravariant" are rather old. They are tied to the coordinate representation of vectors with respect to a basis of the underlying vector space.

Let $\mathcal B = (b_1,\ldots, b_n)$ be an ordered basis of $V$. The dual basis $\mathcal B^* = (b_1^*,\ldots, b_n^*)$ is given by the linear maps $b_i^* : V \to \mathbb R, b_i^*(b_j) = \delta_{ij}$. Then any linear $T : V \to \mathbb R$ can be written uniquely as $T = \sum_i T_i b_i^*$ where $T_i = T(b_i)$. In your question you write $b_i^* = \theta^i$. For the sake of transparency let us write $T_i = T_i(\mathcal B)$ and $T(\mathcal B) = (T_1(\mathcal B),\ldots,T_n(\mathcal B)) \in \mathbb R^n$. The latter is the coordinate representation of $T$ with respect to the basis $\mathcal B^*$ of $V^*$.

If $\mathcal C = (c_1,\ldots,c_n)$ is another ordered basis of $V$, then there exists a unique (invertible) matrix $A = (a_{ij})$ such that $$c_i = \sum_j a_{ij}b_j .$$ $A$ is the transformation matrix of the change of basis $\mathcal B \mapsto \mathcal C$. Note that if $A^{-1} = (a'_{ij})$, then $$b_i = \sum_j a'_{ij}c_j .$$ That is, $A^{-1}$ is the transformation matrix of the inverse change of basis $\mathcal C \mapsto \mathcal B$.

With respect to the new basis $\mathcal C$ we have $T = \sum_i T_i(\mathcal C) c_i^*$. What is the relation between $T(\mathcal C) = (T_1(\mathcal C),\ldots,T_n(\mathcal C))$ and $T(\mathcal B) = (T_1(\mathcal B),\ldots,T_n(\mathcal B))$? We have $$T_i(\mathcal C) = T(c_i) = T\left(\sum_j a_{ij}b_j\right) = \sum_j a_{ij}T(b_j) = \sum_j a_{ij}T_j(\mathcal B) .$$ That is, the transformation formula for a change of basis $\mathcal B \mapsto \mathcal C$ of $V$ and the induced transformation formula for $T(\mathcal B) \mapsto T(\mathcal C)$ are the "same", i.e. have the same transformation matrix $A$. This means that the coordinate representation $T(\mathcal B)$ covaries with $\mathcal B$, and it is the reason why $T$ is called covariant.
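The covariant formula $T_i(\mathcal C) = \sum_j a_{ij}T_j(\mathcal B)$ can be checked on a small example. The following is only a sketch: the space $V = \mathbb R^2$, the basis `B`, the matrix `A`, and the functional `T` are arbitrary choices made for illustration and do not come from the question.

```python
# Hypothetical example data: V = R^2, an arbitrary basis B,
# an invertible matrix A, and an arbitrary linear functional T.
B = [(1, 1), (0, 2)]        # b_1, b_2
A = [[2, 1], [5, 3]]        # det = 1, so A is invertible


def lin_comb(coeffs, vectors):
    """Return sum_j coeffs[j] * vectors[j] as a tuple."""
    dim = len(vectors[0])
    return tuple(sum(c * v[k] for c, v in zip(coeffs, vectors)) for k in range(dim))


def T(v):
    """A linear functional on R^2, chosen only for illustration."""
    return 3 * v[0] - 5 * v[1]


# New basis: c_i = sum_j a_ij b_j
C = [lin_comb(row, B) for row in A]

T_B = [T(b) for b in B]     # components T_i(B) = T(b_i)
T_C = [T(c) for c in C]     # components T_i(C) = T(c_i)

# Right-hand side of the transformation formula: sum_j a_ij T_j(B)
A_T_B = [sum(a * t for a, t in zip(row, T_B)) for row in A]

print(T_C, A_T_B)           # both lists agree
```

Here the components are computed twice, once by evaluating $T$ on the new basis vectors and once by applying $A$ to the old components, and the two results coincide.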

What about a tensor $\tilde T : V^* \to \mathbb R$? As Michael Seifert says in his answer, we have $\tilde T \in V^{**}$ and $V^{**}$ can be identified naturally with $V$. Let us nevertheless do it a bit more formally. Let $\mathcal B^{**}$ be the dual basis for $\mathcal B^*$. Then $\tilde T = \sum_i \tilde T^i b_i^{**}$ with $\tilde T^i = \tilde T(b_i^*)$. Let us write $\tilde T^i = \tilde T^i(\mathcal B)$ and $\tilde T(\mathcal B)= (\tilde T^1(\mathcal B),\ldots,\tilde T^n(\mathcal B))$. A change of basis $\mathcal B \mapsto \mathcal C$ of $V$ induces a change of basis $\mathcal C^* \mapsto \mathcal B^*$ as follows: $$b_i^* = \sum_j a_{ji}c_j^*$$ because both sides agree on every basis vector $b_k$: $$\left(\sum_j a_{ji}c_j^*\right)(b_k) = \sum_j a_{ji} c_j^*\left(\sum_l a'_{kl}c_l\right) = \sum_{j,l} a_{ji} a'_{kl} c_j^*(c_l) = \sum_j a_{ji} a'_{kj} = \sum_j a'_{kj} a_{ji} = \delta_{ki} = b_i^*(b_k).$$ That is, the transformation matrix of $\mathcal C^* \mapsto \mathcal B^*$ is the transposed matrix $A^t$. Hence the transformation matrix of $\mathcal B^* \mapsto \mathcal C^*$ is $\tilde A= (A^t)^{-1} = (A^{-1})^t$. Applying the above considerations to $V^*$ in place of $V$ (the components $\tilde T^i(\mathcal B) = \tilde T(b_i^*)$ covary with the basis $\mathcal B^*$ of $V^*$), we see that the transformation matrix of $\tilde T(\mathcal B) \mapsto \tilde T(\mathcal C)$ is also $\tilde A$. This means that the coordinate representation $\tilde T(\mathcal B)$ contravaries with $\mathcal B$, and it is the reason why $\tilde T$ is called contravariant.
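The relation $b_i^* = \sum_j a_{ji}c_j^*$ between the two dual bases can also be verified numerically. This is a minimal sketch under assumed data: $V = \mathbb R^2$, the basis `B` and the matrix `A` are arbitrary illustrative choices, covectors are stored as coefficient tuples acting by the dot product, and the helpers `lin_comb`, `inverse_2x2`, and `dual_basis` are hypothetical names introduced here.

```python
from fractions import Fraction

# Hypothetical example data: V = R^2, an arbitrary basis B, and an invertible A.
B = [(1, 1), (0, 2)]        # b_1, b_2
A = [[2, 1], [5, 3]]        # det = 1


def lin_comb(coeffs, vectors):
    """Return sum_j coeffs[j] * vectors[j] as a tuple."""
    dim = len(vectors[0])
    return tuple(sum(c * v[k] for c, v in zip(coeffs, vectors)) for k in range(dim))


def inverse_2x2(M):
    """Exact inverse of a 2x2 matrix given as two rows."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]


def dual_basis(basis):
    """Covectors beta_i with beta_i . basis[j] = delta_ij.

    The rows of `basis` are the basis vectors, so beta_i is the
    i-th column of the inverse matrix.
    """
    inv = inverse_2x2(basis)
    return [(inv[0][i], inv[1][i]) for i in range(2)]


C = [lin_comb(row, B) for row in A]     # c_i = sum_j a_ij b_j
B_dual = dual_basis(B)                  # b_1^*, b_2^*
C_dual = dual_basis(C)                  # c_1^*, c_2^*
A_inv = inverse_2x2(A)                  # entries a'_ij

# b_i^* = sum_j a_ji c_j^*  (transformation matrix A^t for C^* -> B^*)
from_C_dual = [lin_comb([A[j][i] for j in range(2)], C_dual) for i in range(2)]

# c_i^* = sum_j a'_ji b_j^*  (transformation matrix (A^{-1})^t for B^* -> C^*)
from_B_dual = [lin_comb([A_inv[j][i] for j in range(2)], B_dual) for i in range(2)]

print(from_C_dual == B_dual, from_B_dual == C_dual)
```

Exact rational arithmetic via `fractions.Fraction` is used so that the equality checks are not affected by floating-point rounding.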

Due to the natural identification $V^{**} \approx V$ the behavior of $\tilde T(\mathcal B)$ is the same as that of the coordinate representation of vectors of $V$. In fact, for $x \in V$ write $x = \sum_i x_i(\mathcal B) b_i$. Then $x(\mathcal B) = (x_1(\mathcal B),\ldots,x_n(\mathcal B))$ is the coordinate representation of $x$ with respect to $\mathcal B$. We get $$x = \sum_i x_i(\mathcal B) b_i = \sum_i x_i(\mathcal B) \sum_j a'_{ij} c_j = \sum_{i,j} x_i(\mathcal B) a'_{ij} c_j = \sum_j \left(\sum_i a'_{ij}x_i(\mathcal B) \right)c_j \\= \sum_i \left(\sum_j a'_{ji}x_j(\mathcal B) \right)c_i = \sum_i x_i(\mathcal C) c_i $$ and therefore $$x_i(\mathcal C) = \sum_j a'_{ji}x_j(\mathcal B) .$$ That is, the transformation matrix of $x(\mathcal B) \mapsto x(\mathcal C)$ is $\tilde A$, i.e. the coordinate representation $x(\mathcal B)$ contravaries with $\mathcal B$.
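As a sanity check of $x_i(\mathcal C) = \sum_j a'_{ji}x_j(\mathcal B)$, one can compute the new coordinates with $(A^{-1})^t$ and confirm that they reproduce the same vector. Again this is only a sketch: $V = \mathbb R^2$, the basis `B`, the matrix `A`, and the coordinates `x_B` are arbitrary assumptions, and `lin_comb` and `inverse_2x2` are hypothetical helper names.

```python
from fractions import Fraction

# Hypothetical example data: V = R^2, an arbitrary basis B, and an invertible A.
B = [(1, 1), (0, 2)]        # b_1, b_2
A = [[2, 1], [5, 3]]        # det = 1


def lin_comb(coeffs, vectors):
    """Return sum_j coeffs[j] * vectors[j] as a tuple."""
    dim = len(vectors[0])
    return tuple(sum(c * v[k] for c, v in zip(coeffs, vectors)) for k in range(dim))


def inverse_2x2(M):
    """Exact inverse of a 2x2 matrix given as two rows."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]


C = [lin_comb(row, B) for row in A]     # c_i = sum_j a_ij b_j
A_inv = inverse_2x2(A)                  # entries a'_ij

x_B = [4, -1]                           # coordinates x_i(B), chosen arbitrarily
x = lin_comb(x_B, B)                    # the vector x = sum_i x_i(B) b_i

# x_i(C) = sum_j a'_ji x_j(B), i.e. multiplication by (A^{-1})^t
x_C = [sum(A_inv[j][i] * x_B[j] for j in range(2)) for i in range(2)]

# Rebuilding x from the new coordinates and the new basis gives the same vector.
x_rebuilt = lin_comb(x_C, C)

print(x_C, x, x_rebuilt)
```

The basis changes with $A$ while the coordinates change with $(A^{-1})^t$, so the two changes cancel and the vector itself is unchanged.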


Any tensor $T: V \to \mathbb{R}$ is a member of $V^{*}$, by the definition of $V^*$. And since $V^*$ is itself a vector space, this means that $V^*$ has a basis $\{ \theta^i \}$, and any $T \in V^*$ can be expressed in this basis as $T = T_i \theta^i$, where the coefficients $T_i$ are simply real numbers.

Similarly, any tensor $T: V^* \to \mathbb{R}$ is often viewed as a member of $V$. (More accurately, for finite-dimensional $V$ there is a canonical isomorphism between $V^{**}$, the space of all linear maps $V^* \to \mathbb{R}$, and $V$.) Any member of $V$ can be written in terms of a basis $\{e_i\}$ for $V$ as $T = T^i e_i$, where the $T^i$ coefficients are (again) real numbers.