Different definitions of a "one-form"

This is a difficult question, one that I grappled with for a long time. Differential forms are simultaneously some of the easiest and most difficult concepts in basic mathematics.

Let me begin by saying that how you think about differential forms depends heavily on how deep you want to go. If this is your first exposure, I suggest that you look at this article by Tao. I don't think there is anything I could say here in a short amount of time that would trump that.

On to your actual question. The relationship between the two objects, just like differential forms themselves, is simultaneously a sophisticated and a simple idea. You have a bunch of questions all rolled into one in your post, so I will try to address just one. Namely, regardless of who you are talking to, a one-form is defined to be a section of the cotangent bundle of a manifold. In less cryptic terms, it is a mapping which associates to each $p\in M$ an element of $T_p^\ast M$. Thus, I'd like to explain how elements of $T_p^\ast M$ can be conflated quite literally with functionals on $T_p M$.

Let us first give a definition of tangent and cotangent space/bundle that will make the idea particularly clear.

Let $M$ be a smooth manifold of dimension $n$ and let $p\in M$ be a point. Let us then define an equivalence relation on the set $C^\infty(M)$ by declaring that $f\sim g$ if there exists a chart $(x,U)$ centered at $p$ (so that $x(p)=0$) such that $$\frac{\partial(f\circ x^{-1})}{\partial x_i}\bigg|_{0}=\frac{\partial (g\circ x^{-1})}{\partial x_i}\bigg|_{0}$$

for all $i=1,\cdots,n$. One can quickly verify that this definition is independent of chart choice. Moreover, one can prove that the set $T_p^\ast M:=C^\infty(M)/\sim$ inherits a vector-space structure by merely declaring that $\alpha[f]+\beta[g]:=[\alpha f+\beta g]$ where $[h]$ denotes the equivalence class of a function $h\in C^\infty(M)$ under $\sim$. This resulting vector space is called the cotangent space of $M$ at $p$.
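
As a concrete illustration (a sketch under the assumption that $M=\mathbb{R}^2$, $p=(0,0)$, and the identity map is used as the centered chart; the functions below are chosen only for illustration), take

$$f(x_1,x_2)=x_1+x_2^2+3,\qquad g(x_1,x_2)=\sin x_1,\qquad h(x_1,x_2)=x_1+x_2.$$

At the origin the partial derivatives are

$$\left(\frac{\partial f}{\partial x_1},\frac{\partial f}{\partial x_2}\right)\bigg|_{0}=(1,0),\qquad \left(\frac{\partial g}{\partial x_1},\frac{\partial g}{\partial x_2}\right)\bigg|_{0}=(1,0),\qquad \left(\frac{\partial h}{\partial x_1},\frac{\partial h}{\partial x_2}\right)\bigg|_{0}=(1,1),$$

so $[f]=[g]\neq[h]$. In this example the class $[f]$ remembers exactly the first-order behavior of $f$ at $p$ and nothing else (the constant $3$ and the higher-order term $x_2^2$ are discarded), and sending $[f]$ to its pair of partial derivatives at $0$ identifies $T_p^\ast\mathbb{R}^2$ with $\mathbb{R}^2$.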

Similarly, let us define a space which, in a sense, is just the cotangent space but with the arrows reversed. Namely, let us denote by $\mathcal{F}_p$ the set of all smooth functions $f:U\to M$ where $U$ is a neighborhood of $0$ in $\mathbb{R}$ and $f(0)=p$. Let us then define $\mathcal{G}_p$ to be the set of germs $\mathcal{F}_p/\simeq$ where $f\simeq g$ if $f$ and $g$ agree on some neighborhood of $0$ (derivatives only care about how a function behaves locally at $0$, so we identify two functions that are indistinguishable near $0$). We shall denote the germ of a function $f\in\mathcal{F}_p$ by $\{f\}$. Lastly, let's define $T_p M:=\mathcal{G}_p/\approx$ where $\{f\}\approx\{g\}$ if there exists a chart $(x,U)$ centered at $p$ such that

$$(x\circ f)'(0)=(x\circ g)'(0)$$

(note that since $x\circ f$ and $x\circ g$ are functions from a subset of $\mathbb{R}$ to a subset of $\mathbb{R}^n$, this just means that the derivatives at $0$ of the coordinate functions agree: $(x_i\circ f)'(0)=(x_i\circ g)'(0)$ for all $i=1,\cdots,n$). Let us denote an equivalence class of $f\in\mathcal{F}_p$ (really of $\{f\}\in\mathcal{G}_p$) by $\underline{f}$. We see then that, at least as a set, $T_p M$ really is just like $T_p^\ast M$ but with all of the arrows reversed. We can define a vector space operation on $T_p M$ by saying that $\alpha \underline{f}+\beta\underline{g}=\underline{x^{-1}\circ(x\circ f\circ m_\alpha+x\circ g\circ m_\beta)}$ where $m_\gamma$ is multiplication by $\gamma$. I leave it to you to check that this is really a vector space operation; it is set up precisely so that the derivatives add and scale correctly.
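
To make this concrete (a sketch assuming $M=\mathbb{R}^2$, $p=(0,0)$, and the identity map as the centered chart $x$; the curves are illustrative choices, not part of the definition), consider

$$f(t)=(t,t^2),\qquad g(t)=(\sin t,0).$$

Both pass through $p$ at $t=0$, and

$$(x\circ f)'(0)=(1,0)=(x\circ g)'(0),$$

so $\underline{f}=\underline{g}$ even though the curves differ away from $0$. For the vector space operation, the chain rule gives

$$(x\circ f\circ m_\alpha)'(0)=\alpha\,(x\circ f)'(0),$$

so the curve used to define $\alpha\underline{f}+\beta\underline{g}$ has velocity $\alpha\,(x\circ f)'(0)+\beta\,(x\circ g)'(0)$ in the chart, which is exactly what one would hope for.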

Now, while the above definitions may look non-standard to you (for example, you may have had $T_p M$ defined for you in terms of derivations at $p$ of $C^\infty(M)$), you can check that these vector spaces are canonically isomorphic to whichever definition you used. So, why did I use them? Because they highlight a very natural pairing $\langle -,-\rangle:T_p M\times T_p^\ast M\to \mathbb{R}$. Namely,

$$\langle \underline{f},[g]\rangle=(g\circ f)'(0)$$

I leave it to you to check that not only is $\langle-,-\rangle$ well-defined (independent of the representatives chosen from the equivalence classes) but that it is actually bilinear and non-degenerate. Note that by how we have defined $T_p M$ and $T_p^\ast M$ this pairing is absolutely natural.
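
One way to see why these checks go through (a sketch of the computation, not a full proof) is to write $g\circ f=(g\circ x^{-1})\circ(x\circ f)$ near $0$ for a chart $(x,U)$ centered at $p$ and apply the chain rule:

$$\langle \underline{f},[g]\rangle=(g\circ f)'(0)=\sum_{i=1}^{n}\frac{\partial (g\circ x^{-1})}{\partial x_i}\bigg|_{0}\,(x_i\circ f)'(0).$$

The first factor in each term depends only on the class $[g]$, and the second only on the class $\underline{f}$, which gives well-definedness. In these coordinates the pairing is just the standard dot product on $\mathbb{R}^n$, which is bilinear and non-degenerate.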

So what? What does this extremely natural pairing give us? Well, up until this point we have had one way to think about elements of $T_p^\ast M$: as just elements of $T_p^\ast M$ (equivalence classes of functions)! But this bilinear pairing allows us to define a very natural map $T_p^\ast M\to (T_p M)^\ast$ (where the right-hand side is the vector dual space of $T_p M$) by taking $[g]$ to $\langle -,[g]\rangle$. The bilinearity and non-degeneracy are precisely what we need to show that this mapping is a well-defined linear isomorphism. Because this isomorphism is so natural (coming from such a natural pairing), it allows us to blur the lines between $T_p^\ast M$ and $(T_p M)^\ast$.
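
For instance (a sketch, assuming a chart $(x,U)$ centered at $p$ with coordinate functions $x_1,\dots,x_n$, each extended to a smooth function on all of $M$ without changing it near $p$; the curves $c_j$ are notation introduced here only for illustration), let $c_j(t)=x^{-1}(t e_j)$, where $e_j$ is the $j$-th standard basis vector of $\mathbb{R}^n$. Then

$$\langle \underline{c_j},[x_i]\rangle=(x_i\circ c_j)'(0)=\frac{d}{dt}\Big|_{t=0}(t e_j)_i=\delta_{ij}.$$

So under the map $[g]\mapsto\langle -,[g]\rangle$, the classes $[x_1],\dots,[x_n]$ go to the basis of $(T_p M)^\ast$ dual to $\underline{c_1},\dots,\underline{c_n}$, which is one explicit way to see the isomorphism at work.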

This ability to treat $T_p^\ast M$ as $(T_p M)^\ast$ and vice versa is why people rarely make a fuss about whether one-forms are associations of points to cotangent vectors or associations of points to linear functionals that eat tangent vectors.

The highfalutin way of phrasing the above is that a one-form is a section of the cotangent bundle $T^\ast M$. But, using the above ideas you can show that the cotangent bundle is isomorphic to the dual bundle $(TM)^\ast$ of the tangent bundle $TM$. This is where all of the naturality comes in--we needed the isomorphism $T_p^\ast M\cong (T_p M)^\ast$ (which is clear by dimension counting) to not rely on choosing a basis so that the fiberwise isomorphism pieces together smoothly.

As I pointed out above, Terry Tao is the right man to get an intuition for what one-forms are. How to deal with them in practice, and which definition to take, is more difficult. The simple answer is that the situation you are in dictates which idea of one-forms is most natural. You should invest time in understanding, and being comfortable with, all of these different ways of thinking about one-forms and how to move seamlessly between them. It will not only make your life easier now, insofar as you will erase any question of the right definition (they are all right!), but will be absolutely crucial once you start using one-forms in the future.

"What is all this madness?"

Welcome to differential geometry, where the notation's made up and the signs don't matter (usually).

As others have said, Definition (1) is the standard one. A differential $1$-form is a smooth map $\theta\colon M \to T^*M$ such that $\theta(p) \in T_p^*M$ for each $p \in M$.

This is equivalent to Definition (2). Why? Well:

Let $\theta\colon M \to T^*M$ be a differential $1$-form in the sense of Definition (1), so for each $p \in M$, we have $\theta(p) \in T_p^*M$. This means that at each $p \in M$, we have a linear map $\theta(p)\colon T_pM \to \mathbb{R}$. That is, $\theta(p)$ is a function that inputs vectors $X_p \in T_pM$ and outputs real numbers $\theta(p)(X_p) \in \mathbb{R}$.

Thus, given a single vector $X_p \in T_pM$, we associate the scalar $\theta(p)(X_p) \in \mathbb{R}$. Given a vector field $X$, we associate the scalar field (function) $p \mapsto \theta(p)(X_p)$. In this way, we can regard $\theta$ as a map $\widehat{\theta}\colon \{\text{vector fields}\} \to C^\infty(M)$, which is Definition (2).
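
Here is a small worked example (a sketch assuming $M=\mathbb{R}^2$ with each $T_pM$ identified with $\mathbb{R}^2$ in the usual way; the particular $\theta$ and $X$ are illustrative choices). Define, for $p=(p_1,p_2)$,

$$\theta(p)(a,b)=p_1\,b,\qquad X_p=(1,p_2).$$

(In the usual notation, which this answer has not introduced, this is $\theta=x_1\,dx_2$.) Then

$$\widehat{\theta}(X)(p)=\theta(p)(X_p)=p_1p_2,$$

which is a smooth function of $p$. Note also that rescaling $X$ by a smooth function $h$ rescales the output pointwise: $\widehat{\theta}(hX)(p)=h(p)\,\widehat{\theta}(X)(p)$.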

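Conversely, given a map $\widehat{\theta}\colon \{\text{vector fields}\} \to C^\infty(M)$ that is $C^\infty(M)$-linear, one can define a differential $1$-form in the sense of Definition (1) by $\theta\colon M \to T^*M$ via $p \mapsto [X_p \mapsto \widehat{\theta}(X)(p)]$, where $X$ is any vector field taking the value $X_p$ at $p$. That is, $\theta(p) \in T_p^*M$ is the linear map $\theta(p)\colon T_pM \to \mathbb{R}$ defined by $\theta(p)(X_p) = \widehat{\theta}(X)(p)$. The $C^\infty(M)$-linearity is what guarantees that $\widehat{\theta}(X)(p)$ depends only on $X_p$ and not on the choice of $X$.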

What about Definition (3)? Now for some reason, many linear algebra books have taken to calling any linear functional on an arbitrary vector space a "$1$-form." This is clearly not the same thing as a differential $1$-form as given by Definition (1). The connection is that for any differential $1$-form $\theta\colon M \to T^*M$, the maps $\theta(p) \colon T_pM \to \mathbb{R}$ are linear functionals (hence "$1$-forms").

Finally, as you notice in the Edit, some authors describe the images $\theta(p) \in T_p^*M$ as being the $1$-forms. This is in keeping with the linear algebra terminology described in the previous paragraph. However, I personally don't like this conflation very much and avoid it whenever possible.

Also (Edit): What's really going on in these last two paragraphs is this: just as a vector field is a field of vectors, and a function is a field of scalars, so is a differential 1-form a field of linear functionals on $T_pM$ (aka "$1$-forms" on $T_pM$). It might sound odd to say, but many times it causes little harm to conflate an object (like a vector) with a field of objects (like a vector field). That's what's happening here.
