Tensors: Acting on Vectors vs Multilinear Maps
Let's first set some terminology.
Let $V$ be an $n$-dimensional real vector space, and let $V^*$ denote its dual space. We let $V^k = V \times \cdots \times V$ ($k$ times).
- A tensor of type $(r,s)$ on $V$ is a multilinear map $T\colon V^r \times (V^*)^s \to \mathbb{R}$.
- A covariant $k$-tensor on $V$ is a multilinear map $T\colon V^k \to \mathbb{R}$.
In other words, a covariant $k$-tensor is a tensor of type $(k,0)$. This is what Spivak refers to as simply a "$k$-tensor."
- A contravariant $k$-tensor on $V$ is a multilinear map $T\colon (V^*)^k\to \mathbb{R}$.
In other words, a contravariant $k$-tensor is a tensor of type $(0,k)$.
- We let $T^r_s(V)$ denote the vector space of tensors of type $(r,s)$. So, in particular,
$$\begin{align*} T^k(V) := T^k_0(V) & = \{\text{covariant $k$-tensors}\} \\ T_k(V) := T^0_k(V) & = \{\text{contravariant $k$-tensors}\}. \end{align*}$$ Two important special cases are: $$\begin{align*} T^1(V) & = \{\text{covariant $1$-tensors}\} = V^* \\ T_1(V) & = \{\text{contravariant $1$-tensors}\} = V^{**} \cong V. \end{align*}$$ This last line means that we can regard vectors $v \in V$ as contravariant 1-tensors. That is, every vector $v \in V$ can be regarded as a linear functional $V^* \to \mathbb{R}$ via $$v(\omega) := \omega(v),$$ where $\omega \in V^*$.
- The rank of an $(r,s)$-tensor is defined to be $r+s$.
In particular, vectors (contravariant 1-tensors) and dual vectors (covariant 1-tensors) have rank 1.
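To make these definitions concrete, here is a minimal Python sketch. This is entirely my own illustration -- the representation of vectors and covectors as NumPy arrays and the names `covector` and `as_contravariant` are assumptions, not any standard API. It models a dual vector as a linear functional and implements the identification $v(\omega) := \omega(v)$ from above.

```python
import numpy as np

# A dual vector (covariant 1-tensor): a linear functional V -> R,
# represented here by an array acting via the dot product.
def covector(w):
    return lambda v: float(np.dot(w, v))

# A vector v regarded as a contravariant 1-tensor V* -> R,
# via v(omega) := omega(v), as in the identification above.
def as_contravariant(v):
    return lambda omega: omega(v)

omega = covector(np.array([4.0, 5.0, 6.0]))
v = np.array([1.0, 2.0, 3.0])
assert as_contravariant(v)(omega) == omega(v)  # both equal 32.0
```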
If $S \in T^{r_1}_{s_1}(V)$ is an $(r_1,s_1)$-tensor, and $T \in T^{r_2}_{s_2}(V)$ is an $(r_2,s_2)$-tensor, we can define their tensor product $S \otimes T \in T^{r_1 + r_2}_{s_1 + s_2}(V)$ by
$$(S\otimes T)(v_1, \ldots, v_{r_1 + r_2}, \omega_1, \ldots, \omega_{s_1 + s_2}) = \\ S(v_1, \ldots, v_{r_1}, \omega_1, \ldots,\omega_{s_1})\cdot T(v_{r_1 + 1}, \ldots, v_{r_1 + r_2}, \omega_{s_1 + 1}, \ldots, \omega_{s_1 + s_2}).$$
Taking $s_1 = s_2 = 0$, we recover Spivak's definition as a special case.
Example: Let $u, v \in V$. Again, since $V \cong T_1(V)$, we can regard $u, v \in T_1(V)$ as $(0,1)$-tensors. Their tensor product $u \otimes v \in T_2(V)$ is a $(0,2)$-tensor defined by $$(u \otimes v)(\omega, \eta) = u(\omega)\cdot v(\eta).$$
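Continuing the sketch above, here is a hedged implementation of this definition (the representation and all names are again my own, not standard): a type $(r,s)$-tensor is modeled as a function of a tuple of $r$ vectors and a tuple of $s$ covectors, and `tensor_product` splits its arguments exactly as in the displayed formula.

```python
# A type (r, s)-tensor is modeled as a function of a tuple of r vectors
# and a tuple of s covectors; tensor_product splits the arguments
# exactly as in the displayed formula.
def tensor_product(S, r1, s1, T):
    def ST(vectors, covectors):
        return (S(vectors[:r1], covectors[:s1])
                * T(vectors[r1:], covectors[s1:]))
    return ST

# The example above: u, v in V regarded as (0,1)-tensors.
u = np.array([1.0, 2.0, 3.0])
v = np.array([4.0, 5.0, 6.0])
u_t = lambda vectors, covectors: covectors[0](u)
v_t = lambda vectors, covectors: covectors[0](v)

u_otimes_v = tensor_product(u_t, 0, 1, v_t)   # a (0,2)-tensor

omega = covector(np.array([1.0, 0.0, 0.0]))   # picks out the 1st coordinate
eta = covector(np.array([0.0, 1.0, 0.0]))     # picks out the 2nd coordinate
assert u_otimes_v((), (omega, eta)) == 5.0    # u(omega) * v(eta) = 1 * 5
```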
As I suggested in the comments, every bilinear map -- i.e. every rank-2 tensor, be it of type $(0,2)$, $(1,1)$, or $(2,0)$ -- can be regarded as a matrix, and vice versa.
Admittedly, sometimes the notation can be constraining. That is, we're used to considering vectors as column vectors, and dual vectors as row vectors. So, when we write something like $$u^\top A v,$$ our notation suggests that $u^\top \in T^1(V)$ is a dual vector and that $v \in T_1(V)$ is a vector. This means that the bilinear map $V \times V^* \to \mathbb{R}$ given by $$(v, u^\top) \mapsto u^\top A v$$ is a type $(1,1)$-tensor.
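The two-way correspondence is easy to exhibit concretely. The following is a sketch under the row/column conventions just described; `matrix_of` and `tensor_of` are hypothetical names of my own. Evaluating a $(1,1)$-tensor on standard basis vectors recovers its matrix, and conversely any matrix $A$ defines a $(1,1)$-tensor via $(v, \omega) \mapsto \omega A v$.

```python
# Recover the matrix of a (1,1)-tensor B: (v, omega) -> omega A v
# by evaluating B on standard basis vectors: A[i, j] = B(e_j, e_i).
# For brevity, covectors are kept as plain row arrays here.
def matrix_of(B, n):
    E = np.eye(n)
    return np.array([[B(E[j], E[i]) for j in range(n)]
                     for i in range(n)])

# Conversely, a matrix A defines the (1,1)-tensor (v, omega) -> omega A v.
def tensor_of(A):
    return lambda v, omega: float(omega @ A @ v)

A = np.array([[1.0, 2.0], [3.0, 4.0]])
assert np.allclose(matrix_of(tensor_of(A), 2), A)  # round trip
```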
Example: Let $V = \mathbb{R}^3$. Write $u = (1,2,3)^\top \in V$ in the standard basis, and $\eta = (4,5,6) \in V^*$ in the dual basis. For the inputs, let's also write $v = (p,q,r)^\top \in V$ and $\omega = (x,y,z) \in V^*$. Then $$\begin{align*} (u \otimes \eta)(v, \omega) & = u(\omega) \cdot \eta(v) \\ & = (x,y,z)\begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix} \cdot (4,5,6) \begin{pmatrix} p \\ q \\ r \end{pmatrix} \\ & = (x + 2y + 3z)(4p + 5q + 6r) \\ & = 4px + 5qx + 6rx \\ & \ \ \ \ + 8py + 10qy + 12ry \\ & \ \ \ \ + 12pz + 15qz + 18rz \\ & = (x,y,z)\begin{pmatrix} 4 & 5 & 6 \\ 8 & 10 & 12 \\ 12 & 15 & 18 \end{pmatrix}\begin{pmatrix} p \\ q \\ r \end{pmatrix} \\ & = \omega \begin{pmatrix} 4 & 5 & 6 \\ 8 & 10 & 12 \\ 12 & 15 & 18 \end{pmatrix} v. \end{align*}$$
Conclusion: The tensor $u \otimes \eta \in T^1_1(V)$ is the bilinear map $(v, \omega)\mapsto \omega A v$, where $A$ is the $3 \times 3$ matrix above.
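The whole computation can also be checked numerically. In the sketch's terms (again my own illustration), the matrix of $u \otimes \eta$ is simply the outer product of the two coefficient arrays:

```python
u = np.array([1.0, 2.0, 3.0])      # the vector u
eta = np.array([4.0, 5.0, 6.0])    # the dual vector eta

A = np.outer(u, eta)               # the 3x3 matrix from the example:
                                   # [[ 4,  5,  6],
                                   #  [ 8, 10, 12],
                                   #  [12, 15, 18]]

rng = np.random.default_rng(0)
v = rng.standard_normal(3)         # arbitrary vector (p, q, r)
omega = rng.standard_normal(3)     # arbitrary dual vector (x, y, z)

# (u ⊗ eta)(v, omega) = u(omega) * eta(v) = (omega · u)(eta · v) = omega A v
assert np.isclose((omega @ u) * (eta @ v), omega @ A @ v)
```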
The Wikipedia article you linked to would then regard the matrix $A$ as being equal to the tensor product $u \otimes \eta$.
Finally, I should point out two things that you might encounter in the literature.
First, some authors take the definition of an $(r,s)$-tensor to mean a multilinear map $V^s \times (V^*)^r \to \mathbb{R}$ (note that the $r$ and $s$ are reversed). This also means that some indices will be raised instead of lowered, and vice versa. You'll just have to check each author's conventions every time you read something.
Second, note that there is also a notion of tensor products of vector spaces. Many textbooks, particularly ones focused on abstract algebra, regard this as the central concept. I won't go into this here, but note that there is an isomorphism $$T^r_s(V) \cong \underbrace{V^* \otimes \cdots \otimes V^*}_{r\text{ copies}} \otimes \underbrace{V \otimes \cdots \otimes V}_{s \text{ copies}}.$$
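For example, in the case $(r,s) = (1,1)$ this isomorphism sends a tensor $T \in T^1_1(V)$ to $$\sum_{i,j} T(e_i, \varepsilon^j)\, \varepsilon^i \otimes e_j \;\in\; V^* \otimes V,$$ where $(e_i)$ is any basis of $V$ and $(\varepsilon^i)$ is its dual basis; evaluating both sides on the pairs $(e_k, \varepsilon^l)$ shows that they agree.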
Confusingly, some books on differential geometry define the tensor product of vector spaces in this way, but I think this is becoming rarer.
This is not intended to be an answer, but it won't fit into a comment. I'd like to mention two things here.
First, while you are indeed addressing a topic that seems to be an infinite source of confusion, and is certainly a pain for most students beginning to work with it, I also think you should react a little less dismissively to a comment (the one given by Jesse Madnick) which, if you gave it a few moments of thought, might well help you reach a better understanding of the issue. Also, in mathematics, getting angry and being annoyed never helps. Because of this, I am actually considering downvoting your question.
Second, as long as you are in the realm of vector spaces over the real or complex numbers, all the definitions of tensor products you will encounter are equivalent, unless one is genuinely in error. This is, admittedly, not always obvious, especially to the novice. It is due to the fact that, on the one hand, the tensor product is a rather abstract object in linear algebra (a universal solution to a certain kind of abstract problem, namely representing multilinear maps by linear ones in a universal way), while on the other hand it turns out to be a rather important construction requiring hands-on computation in areas like differential geometry and, e.g., applications to physics (where it usually comes with the additional difficulty that one needs the construction for bundles rather than for single vector spaces).
If you do want to see why the different constructions are equivalent, you should set aside some time, find a treatment of multilinear algebra (you might ask here for advice on which one to use), read it, and then try to figure out how the various definitions you found fit into the picture. If your interest is in differential (Riemannian) geometry, be sure to get a decent understanding of the isomorphism between the tangent space of a manifold and its dual induced by the Riemannian metric, and of how this is reflected in coordinate and other notations. All this will be, admittedly, a bit boring and painful.
Otherwise, if you don't want to invest the time, treat the different definitions as different concepts. If you actually work with them for a while, you'll come to see how they correspond to each other.
I can't comment yet (or I don't know how to, if I can), but I echo Thomas's response and want to add one thing.
The tensor product of two vector spaces (or, more generally, of modules over a ring) is an abstract construction that allows you to "multiply" vectors from the two spaces. A very readable and well-motivated introduction is given in Dummit & Foote's book. (I always thought the actual construction seemed very strange before reading D&F -- they manage to make it intuitive, motivating it as an attempt to extend the set of scalars you can multiply by.)
The collection of $k$-multilinear functions on a vector space is itself a vector space -- each multilinear map is a vector in that space. The connection between the two seemingly different definitions is that you're performing the "abstract" tensor product on those spaces of multilinear maps.
It always seemed to me that the tensor product definition in Spivak was a particularly nice, concrete example of the more general, abstract definition.