Why use geometric algebra and not differential forms?
I just want to point out that GA can be used to make covariant multivectors (or differential forms) on $\mathbb R^n$ without forcing a metric onto it. In other words, the distinction between vectors and covectors (or between $\mathbb R^n$ and its dual) can be maintained.
This is done with a pseudo-Euclidean space $\mathbb R^{n,n}$.
Take an orthonormal set of spacelike vectors $\{\sigma_i\}$ (which square to ${^+}1$) and timelike vectors $\{\tau_i\}$ (which square to ${^-}1$). Define null vectors
$$\Big\{\nu_i=\frac{\sigma_i+\tau_i}{\sqrt2}\Big\}$$
$$\Big\{\mu_i=\frac{\sigma_i-\tau_i}{\sqrt2}\Big\};$$
they're null because
$${\nu_i}^2=\frac{{\sigma_i}^2+2\sigma_i\cdot\tau_i+{\tau_i}^2}{2}=\frac{(1)+2(0)+({^-}1)}{2}=0$$
$${\mu_i}^2=\frac{{\sigma_i}^2-2\sigma_i\cdot\tau_i+{\tau_i}^2}{2}=\frac{(1)-2(0)+({^-}1)}{2}=0.$$
More generally,
$$\nu_i\cdot\nu_j=\frac{\sigma_i\cdot\sigma_j+\sigma_i\cdot\tau_j+\tau_i\cdot\sigma_j+\tau_i\cdot\tau_j}{2}=\frac{(\delta_{i,j})+0+0+({^-}\delta_{i,j})}{2}=0$$
and
$$\mu_i\cdot\mu_j=0.$$
So the spaces spanned by $\{\nu_i\}$ or $\{\mu_i\}$ each have degenerate quadratic forms. But the dot product between them is non-degenerate:
$$\nu_i\cdot\mu_i=\frac{\sigma_i\cdot\sigma_i-\sigma_i\cdot\tau_i+\tau_i\cdot\sigma_i-\tau_i\cdot\tau_i}{2}=\frac{(1)-0+0-({^-}1)}{2}=1$$
$$\nu_i\cdot\mu_j=\frac{\sigma_i\cdot\sigma_j-\sigma_i\cdot\tau_j+\tau_i\cdot\sigma_j-\tau_i\cdot\tau_j}{2}=\frac{(\delta_{i,j})-0+0-({^-}\delta_{i,j})}{2}=\delta_{i,j}$$
Of course, we could have just started with the definition that $\mu_i\cdot\nu_j=\delta_{i,j}=\nu_i\cdot\mu_j$, and $\nu_i\cdot\nu_j=0=\mu_i\cdot\mu_j$, instead of going through "spacetime".
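(If you want to check these relations numerically, here's a quick sketch for $n=2$ in Python with numpy; the array names are just for illustration.)

```python
import numpy as np

# Metric of R^{2,2}: sigma_1, sigma_2 square to +1; tau_1, tau_2 square to -1.
G = np.diag([1.0, 1.0, -1.0, -1.0])
s = np.eye(4)[:2]                 # rows: sigma_1, sigma_2
t = np.eye(4)[2:]                 # rows: tau_1, tau_2
nu = (s + t) / np.sqrt(2)         # nu_i = (sigma_i + tau_i)/sqrt(2)
mu = (s - t) / np.sqrt(2)         # mu_i = (sigma_i - tau_i)/sqrt(2)

def dot(a, b):                    # the dot product of R^{2,2}
    return a @ G @ b

def gram(A, B):                   # table of dot products A_i . B_j
    return np.round([[dot(A[i], B[j]) for j in range(2)] for i in range(2)], 12)

print(gram(nu, nu))               # all zeros:  nu_i . nu_j = 0
print(gram(mu, mu))               # all zeros:  mu_i . mu_j = 0
print(gram(mu, nu))               # identity:   mu_i . nu_j = delta_ij
```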
The space $V$ will be generated by $\{\nu_i\}$, and its dual $V^*$ by $\{\mu_i=\nu^i\}$. An element of $V^*$ plays the role of a differential 1-form: taking its dot product with something in $V$ just evaluates the form on that vector, giving a scalar. You can make contravariant multivectors from wedge products of things in $V$, and covariant multivectors from wedge products of things in $V^*$.
You can also take the wedge product of something in $V^*$ with something in $V$.
$$\mu_i\wedge\nu_i=\frac{\sigma_i\wedge\sigma_i+\sigma_i\wedge\tau_i-\tau_i\wedge\sigma_i-\tau_i\wedge\tau_i}{2}=\frac{0+\sigma_i\tau_i-\tau_i\sigma_i-0}{2}=\sigma_i\wedge\tau_i$$
$$\mu_i\wedge\nu_j=\frac{\sigma_i\sigma_j+\sigma_i\tau_j-\tau_i\sigma_j-\tau_i\tau_j}{2},\quad i\neq j$$
What does this mean? ...I suppose it could be a matrix (a mixed variance tensor)!
A matrix can be defined as a bivector:
$$M = \sum_{i,j} M^i\!_j\;\nu_i\wedge\mu_j = \sum_{i,j} M^i\!_j\;\nu_i\wedge\nu^j$$
where each $M^i_j$ is a scalar. Note that $(\nu_i\wedge\mu_j)\neq{^-}(\nu_j\wedge\mu_i)$, so $M$ is not necessarily antisymmetric. The corresponding linear function $f:V\to V$ is (with $\cdot$ the "fat dot product")
$$f(x) = M\cdot x = \frac{Mx-xM}{2}$$
$$= \sum_{i,j} M^i_j(\nu_i\wedge\mu_j)\cdot\sum_k x^k\nu_k$$
$$= \sum_{i,j,k} M^i_jx^k\frac{\nu_i\mu_j-\mu_j\nu_i}{2}\cdot\nu_k$$
$$= \sum_{i,j,k} M^i_jx^k\frac{(\nu_i\mu_j)\nu_k-\nu_k(\nu_i\mu_j)-(\mu_j\nu_i)\nu_k+\nu_k(\mu_j\nu_i)}{4}$$
(the $\nu$'s anticommute because their dot product is zero:)
$$= \sum_{i,j,k} M^i_jx^k\frac{\nu_i\mu_j\nu_k+\nu_i\nu_k\mu_j+\mu_j\nu_k\nu_i+\nu_k\mu_j\nu_i}{4}$$
$$= \sum_{i,j,k} M^i_jx^k\frac{\nu_i(\mu_j\nu_k+\nu_k\mu_j)+(\mu_j\nu_k+\nu_k\mu_j)\nu_i}{4}$$
$$= \sum_{i,j,k} M^i_jx^k\frac{\nu_i(\mu_j\cdot\nu_k)+(\mu_j\cdot\nu_k)\nu_i}{2}$$
$$= \sum_{i,j,k} M^i_jx^k\frac{\nu_i(\delta_{j,k})+(\delta_{j,k})\nu_i}{2}$$
$$= \sum_{i,j,k} M^i_jx^k\big(\delta_{j,k}\nu_i\big)$$
$$= \sum_{i,j} M^i_jx^j\nu_i$$
This agrees with the conventional definition of matrix multiplication.
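(This, too, is easy to verify numerically. Here's a sketch using the `clifford` Python package; I'm assuming its convention that in Cl(2,2) the first two basis vectors square to ${^+}1$ and the last two to ${^-}1$, and all the variable names are just for illustration.)

```python
from math import sqrt
from clifford import Cl                        # pip install clifford

layout, blades = Cl(2, 2)                      # R^{2,2}
s1, s2, t1, t2 = blades['e1'], blades['e2'], blades['e3'], blades['e4']

n1, n2 = (s1 + t1)/sqrt(2), (s2 + t2)/sqrt(2)  # nu_i, spanning V
m1, m2 = (s1 - t1)/sqrt(2), (s2 - t2)/sqrt(2)  # mu_i = nu^i, spanning V*

# Bivector encoding the matrix [[2, 3], [5, 7]]:
M = 2*(n1^m1) + 3*(n1^m2) + 5*(n2^m1) + 7*(n2^m2)

x = 11*n1 + 13*n2                              # a vector in V with components (11, 13)
fx = (M*x - x*M)/2                             # the "fat dot product" M.x

print(fx)                                      # should agree (up to rounding) with
print((2*11 + 3*13)*n1 + (5*11 + 7*13)*n2)     # ordinary matrix multiplication
```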
In fact, it even works for non-square matrices; the above calculations work the same if the $\nu_i$'s on the left in $M$ are basis vectors for a different space. A bonus is that it also works for a non-degenerate quadratic form; the calculations don't rely on ${\mu_i}^2=0$, nor ${\nu_i}^2=0$, but only on $\nu_i$ being orthogonal to $\nu_k$, and $\mu_j$ being reciprocal to $\nu_k$. So you could instead have $\mu_j$ (the right factors in $M$) be in the same space as $\nu_k$ (the generators of $x$), and $\nu_i$ (the left factors in $M$) in a different space. A downside is that it won't map a non-degenerate space to itself.
I admit that this is worse than the standard matrix algebra; the dot product is neither invertible nor associative. Still, it's good to have this connection between the different algebras. And it's interesting to think of a matrix as a bivector that "rotates" a vector through the dual space and back to a different point in the original space (or a new space).
Speaking of matrix transformations, I should discuss the underlying principle for "contra/co variance": that the basis vectors may vary.
We want to be able to apply any (invertible) linear transformation to the null space $V$, and have the inverse (contragredient) transformation apply to $V^*$. Arbitrary linear transformations of the ambient $\mathbb R^{n,n}$ will not preserve $V$; the transformed $\nu_i$ may not be null. It suffices to consider transformations that preserve the dot product on $\mathbb R^{n,n}$. One obvious type is the hyperbolic rotation
$$\sigma_1\mapsto\sigma_1\cosh\phi+\tau_1\sinh\phi={\sigma_1}'$$
$$\tau_1\mapsto\sigma_1\sinh\phi+\tau_1\cosh\phi={\tau_1}'$$
$$\sigma_2={\sigma_2}',\quad\sigma_3={\sigma_3}',\quad\cdots$$
$$\tau_2={\tau_2}',\quad\tau_3={\tau_3}',\quad\cdots$$
(or, more compactly, $x\mapsto\exp(-\sigma_1\tau_1\phi/2)x\exp(\sigma_1\tau_1\phi/2)$ ).
The induced transformation of the null vectors is
$${\nu_1}'=\frac{{\sigma_1}'+{\tau_1}'}{\sqrt2}=\exp(\phi)\nu_1$$
$${\mu_1}'=\frac{{\sigma_1}'-{\tau_1}'}{\sqrt2}=\exp(-\phi)\mu_1$$
$${\nu_2}'=\nu_2,\quad{\nu_3}'=\nu_3,\quad\cdots$$
$${\mu_2}'=\mu_2,\quad{\mu_3}'=\mu_3,\quad\cdots$$
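Spelling out the first of these explicitly,
$${\nu_1}'=\frac{\sigma_1\cosh\phi+\tau_1\sinh\phi+\sigma_1\sinh\phi+\tau_1\cosh\phi}{\sqrt2}=(\cosh\phi+\sinh\phi)\,\frac{\sigma_1+\tau_1}{\sqrt2}=e^\phi\,\nu_1,$$
and likewise ${\mu_1}'=(\cosh\phi-\sinh\phi)\,\mu_1=e^{-\phi}\mu_1$.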
The vector $\nu_1$ is multiplied by some positive number $e^\phi$, and the covector $\mu_1$ is divided by the same number. The dot product is still ${\mu_1}'\cdot{\nu_1}'=1$.
You can get a negative multiplier for $\nu_1$ simply by the inversion $\sigma_1\mapsto{^-}\sigma_1,\quad\tau_1\mapsto{^-}\tau_1$; this will also negate $\mu_1$. The result is that you can multiply $\nu_1$ by any non-zero Real number, and $\mu_1$ will be divided by the same number.
Of course, this only varies one basis vector in one direction. You could try to rotate the vectors, but a simple rotation in a $\sigma_i\sigma_j$ plane will mix $V$ and $V^*$ together. This problem is solved by an isoclinic rotation in $\sigma_i\sigma_j$ and $\tau_i\tau_j$, which causes the same rotation in $\nu_i\nu_j$ and $\mu_i\mu_j$ (while keeping them separate).
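For example, an equal-angle rotation by $\theta$ in the $\sigma_1\sigma_2$ plane and in the $\tau_1\tau_2$ plane gives
$${\nu_1}'=\frac{(\sigma_1\cos\theta+\sigma_2\sin\theta)+(\tau_1\cos\theta+\tau_2\sin\theta)}{\sqrt2}=\nu_1\cos\theta+\nu_2\sin\theta,$$
and similarly ${\mu_1}'=\mu_1\cos\theta+\mu_2\sin\theta$, so $V$ and $V^*$ each rotate within themselves.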
Combine these stretches, reflections, and rotations, and you can generate any invertible linear transformation on $V$, all while maintaining the degeneracy ${\nu_i}^2=0$ and the duality $\mu_i\cdot\nu_j=\delta_{i,j}$. This shows that $V$ and $V^*$ do have the correct "variance".
See also Hestenes' Tutorial, page 5 ("Quadratic forms vs contractions").
This seems to be best answered by Lounesto's paper "Marcel Riesz's Work on Clifford Algebras" (see here or here). In what follows:
$\bigwedge V=$ the exterior algebra over $V$
$C\ell(Q)=$ the Clifford (geometric) algebra over $V$ w.r.t. the quadratic form $Q$
Note in particular that we always have $C\ell(0)=\bigwedge V$, where $0$ denotes the identically zero (totally degenerate) quadratic form.
On p. 221, Professor Lounesto discusses, given a non-degenerate quadratic form $Q$, how to define an "inner product" (contraction $\rfloor$) on the exterior algebra $\bigwedge V$.
On p. 223, Professor Lounesto discusses how to extend the inner product (by combining it with the wedge product of the exterior algebra) to produce a Clifford/geometric product on $\bigwedge V$, which makes $\bigwedge V$ isomorphic to $C\ell(Q)$ (the Clifford algebra w.r.t. the quadratic form $Q$).
One can also go the other way around, as M. Riesz originally did in 1958 (see section 1.3, beginning on p. 230, "Riesz's Introduction of an Exterior Product in $C\ell(Q)$ "), and use the Clifford product to define a notion of exterior product which makes $C\ell(Q)$ isomorphic to $\bigwedge V$.
In other words, we do indeed have:
(exterior algebra over $\mathbb{R}^n +$ inner product) $\simeq$ geometric algebra over $\mathbb{R}^n$
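At the level of vectors the dictionary is transparent: the contraction of one vector into another is just the bilinear form associated with $Q$, the Clifford product splits as
$$uv=u\rfloor v+u\wedge v,$$
and conversely the exterior product is recovered from the Clifford product as $u\wedge v=\tfrac12(uv-vu)$.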
One should note that $\bigwedge \mathbb{R}^n$, the exterior algebra over $\mathbb{R}^n$, consists of alternating contravariant tensors of rank $k$ over $\mathbb{R}^n$. However, differential forms are alternating covariant tensors of rank $k$ over $\mathbb{R}^n$. So in general they behave differently.
Nevertheless, an inner product on $V$ gives a linear isomorphism between $V$ and its dual $V^*$, which can be used to argue that covariant tensors of rank $k$ and contravariant tensors of rank $k$ are "similar". (Mixed-variance tensors complicate things somewhat further, but are not relevant to this question.)
Thus, differential forms are "similar" to $\bigwedge \mathbb{R}^n$ (since they are essentially $\bigwedge (\mathbb{R}^n)^*$). Also, we can just as easily construct a Clifford algebra from $\bigwedge V$ as from $\bigwedge V^*$, so we can extend differential forms to "covariant geometric algebras" by introducing an inner product based on a quadratic form $Q$.
So, perhaps less convincingly, we also do have (at least in an algebraic sense):
(differential forms + inner product) $\simeq$ "covariant geometric algebra" over $\mathbb{R}^n$
It is also worth noting that, according to Professor Lounesto on p. 218, Elie Cartan also studied Clifford algebras, in addition to introducing the modern notions of differential form and exterior derivative. So it is not all too surprising that they should actually be related to one another.
In fact, thinking about (covariant) geometric algebra in terms of "differential forms + inner product", while using the geometric intuition afforded by geometric algebra, actually makes the ideas behind differential forms much clearer. See for example here. I'm only beginning to process all of the implications, but as an example, a $k$-blade represents a $k$-dimensional subspace, and its Hodge dual is the differential form of rank $n-k$ which represents its orthogonal complement. The reason orthogonal complements are represented in the dual space is that the inner product of two vectors can also be expressed as the pairing of a vector with a covector (w.r.t. our choice of non-degenerate quadratic form $Q$).
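(For instance, in $\mathbb{R}^3$ with the standard inner product, the $2$-blade $e_1\wedge e_2$, representing the $xy$-plane, has as its Hodge dual the $1$-form $e^3$, which represents the orthogonal complement, the $z$-axis.)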
All of this should be generalizable from $\mathbb{R}^n$ to the tangent and cotangent spaces of arbitrary smooth manifolds, unless I am missing something. This is especially the case for Riemannian manifolds, where we also get a non-degenerate quadratic form for each (co)tangent space for free.
(Which raises the question of why David Hestenes wants us to throw out smooth manifolds in favor of vector manifolds, a topic for future research.)
As to the answer to "why use geometric algebra and not differential forms", for now my answer is:
Use the tensor algebras over $\mathbb{R}^n$ and $(\mathbb{R}^n)^*$, while appreciating the special properties of their exterior sub-algebras and remembering that, given our favorite quadratic form $Q$, we can always introduce an additional notion of "contraction" or "inner product" to make them into Clifford (geometric) algebras.
Hyping geometric algebra alone misses the importance of linear duals and arbitrary tensors. Likewise, focusing on differential forms alone seems like a good way to do differential geometry without geometric intuition (i.e. with a mathematical lobotomy). Sensible minds may disagree.
Note: There are a lot of differences in the theory when the base field is $\mathbb{F}_2$ (or, more generally, has characteristic $2$). To some extent we should expect this, since in that case we don't even have "alternating = anti-symmetric".
In particular, we don't have bivectors for fields of characteristic two, and defining/identifying a grading of the Clifford algebra via an isomorphism with the exterior algebra is impossible (at least if I am interpreting Professor Lounesto's paper correctly).
In any case, when I say "geometric algebra", I essentially mean "Clifford algebras of vector spaces with base field the real numbers", so the exceptions thrown up in the case that the characteristic equals 2 don't really matter for the purposes of this question; we are dealing exclusively with characteristic 0, although generalizations are possible.