What do the fibers of the double tangent bundle look like?

Consider the tangent bundle $\pi:TM\to M$ of a smooth manifold $M$. As outlined in the Wikipedia page, we can then consider the double tangent bundle via the projection $\pi_*:TTM \to TM$, with $\pi_*$ the pushforward of the canonical projection $\pi$.

In the above page, they then proceed to mention that, given $$\xi =\xi^k \frac{\partial}{\partial x^k}\Bigg|_x\in T_x M, \qquad X =X^k \frac{\partial}{\partial x^k}\Bigg|_x \in T_x M,\tag A$$ and "applying the associated coordinate system" $\xi\mapsto (x^1,...,x^n,\xi^1,...,\xi^n)$ on $TM$, the fiber on $TTM$ at $X\in T_x M$ takes the form $$(\pi_*)^{-1}(X)=\left\{ X^k\frac{\partial}{\partial x^k}\Bigg|_\xi + Y^k\frac{\partial}{\partial \xi^k}\Bigg|_\xi : \,\, \xi\in T_x M,\,\, Y^1,...,Y^n\in\mathbb R \right\}.\tag B$$ I'm struggling to understand where this expression comes from.

I think I understand that $(x^1,...,x^n,\xi^1,...,\xi^n)$ is a (local) parametrisation for $TM$, and I can see that the fiber we are interested in is $T_X TM$, that is, the set of elements of $TTM$ above $X$, but I don't understand what the expression in the last equation represents.

If I were to write a fiber $T_x M$ of the tangent bundle, this would be the set of pairs $(x,v)$ with $v$ ranging across all (equivalence classes of) smooth curves $I\to M$ passing through $x$. In local coordinates, and focusing on the "curve component" of tangent vectors, I suppose we could write this as the set $$\pi^{-1}(x)=\left\{v^k \frac{\partial}{\partial x^k}\Bigg|_x : \,\, v^k\in\mathbb R\right\}.$$ By way of analogy, I'd guess $T_X TM$ to be the set of pairs $(X,V)$ with $V$ (equivalence classes of) curves $I\to TM$ passing through $X\in T_x M$. But even switching to local coordinates, I'm not sure how to go from this description to the expression in (B).


In the end, this is just a simple case of computing a tangent map, and the fact that $\pi:TTM\to TM$ (the map you call $\pi_*$) defines a second vector bundle structure on $TTM$ is not relevant for the problem at hand. (Part of the difficulty is that you are using a notation that makes it hard to distinguish between an object and its expression in local coordinates.)

Just start by thinking about how one constructs charts on the tangent bundle $TM$: You start with an open subset $U\subset M$ and a diffeomorphism from $U$ to an open subset $V\subset\mathbb R^n$, whose components are the local coordinates $x^i$. Denoting by $p:TM\to M$ the canonical projection, you get an induced diffeomorphism $p^{-1}(U)\to V\times\mathbb R^n$ that is used as a chart on $TM$. The first $n$ components of this are just the local coordinates $x^i$, and if you go through the construction, you see that the tangent vector with coordinates $(x^1,\dots,x^n,\xi^1,\dots,\xi^n)$ is exactly $\sum_i\xi^i\tfrac{\partial}{\partial x^i}|_x$, where $x$ is the point with coordinates $x^i$. This shows that $(x^i,\xi^i)$ is a local coordinate system on $p^{-1}(U)\subset TM$. Hence, given a point $\xi_x\in p^{-1}(U)$, any tangent vector at that point can be written as $\sum_i\alpha^i \tfrac{\partial}{\partial x^i}|_{\xi_x}+\sum_j\beta^j\tfrac{\partial}{\partial \xi^j}|_{\xi_x}$ for real numbers $\alpha^i$ and $\beta^j$.
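In terms of curves, which is how you describe tangent vectors in the question, one possible representative of such a vector (written in the chart, and only defined for small $t$) is $$c(t)=\left(x^1+t\alpha^1,\dots,x^n+t\alpha^n,\ \xi^1+t\beta^1,\dots,\xi^n+t\beta^n\right).$$ This is a curve in $TM$ with $c(0)=\xi_x$, and its velocity at $t=0$ is $\sum_i\alpha^i \tfrac{\partial}{\partial x^i}|_{\xi_x}+\sum_j\beta^j\tfrac{\partial}{\partial \xi^j}|_{\xi_x}$. The straight-line form is just a convenient choice; any curve with the same first-order behaviour at $t=0$ represents the same tangent vector.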

At this point, it is best to forget about vector bundle structures and all that and just notice that $\pi:TTM\to TM$ is the tangent map (i.e. the derivative) of $p:TM\to M$. Since $p$ maps $p^{-1}(U)$ to $U$, this tangent map maps $Tp^{-1}(U)$ to $TU$ and hence can be expressed in the local coordinates we have constructed. In these local coordinates, we simply get $p(x^1,\dots,x^n,\xi^1,\dots,\xi^n)=(x^1,\dots,x^n)$. This readily shows that the tangent map sends $\tfrac{\partial}{\partial x^i}$ to $\tfrac{\partial}{\partial x^i}$ and $\tfrac{\partial}{\partial \xi^j}$ to zero. Otherwise put, $$\pi\left(\sum_i\alpha^i \frac{\partial}{\partial x^i}\Bigg|_{\xi_x}+\sum_j\beta^j\frac{\partial}{\partial \xi^j}\Bigg|_{\xi_x}\right)=\sum_i\alpha^i \frac{\partial}{\partial x^i}\Bigg|_x$$ and this equals the given vector $X=\sum_i X^i\tfrac{\partial}{\partial x^i}|_x$ if and only if $\alpha^i=X^i$ for all $i$, which is exactly what you want to see.
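As a sanity check, the same computation can be phrased with the Jacobian of the coordinate expression of $p$. This is only a restatement in matrix form, writing a tangent vector at a point of $p^{-1}(U)$ as the column $(\alpha,\beta)\in\mathbb R^n\times\mathbb R^n$ with respect to the basis $\tfrac{\partial}{\partial x^i},\tfrac{\partial}{\partial \xi^j}$: $$Dp=\begin{pmatrix} I_n & 0\end{pmatrix},\qquad Dp\begin{pmatrix}\alpha\\ \beta\end{pmatrix}=\alpha.$$ So the $\beta^j$ are unconstrained, and only the $\alpha^i$ are seen by the tangent map.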

Edit (to address the issue on dimensions raised in your comment): This again is mainly an issue of notation. Since you have denoted points in $M$ by $x$ and local coordinates by $x^i$, I have tried to use similar notation for tangent vectors, and this gets a bit misleading. What I have actually shown above is that $$ T_{Y_x}p\left(\sum_i\alpha^i\tfrac{\partial}{\partial x^i}|_{Y_x}+\sum_j\beta^j\tfrac{\partial}{\partial \xi^j}|_{Y_x}\right)=\sum_i\alpha^i\tfrac{\partial}{\partial x^i}|_x. $$ So for fixed $Y_x$, there is an $n$-dimensional affine subspace in $T_{Y_x}TM$ (the one given by $\alpha^i=X^i$) that gets mapped to $X_x\in TM$. But the actual pre-image of $X_x$ in $TTM$ is the union of these affine subspaces over all the points $Y_x\in T_xM$, and these points form an $n$-dimensional vector space. So it looks like the product of $\mathbb R^n$ with an $n$-dimensional affine subspace of $\mathbb R^{2n}$ and hence has dimension $2n$, as expected. In the expression that you wrote in the question, the $n$ missing dimensions come from the free tangent vector $\xi\in T_xM$.

Even easier, if you extend what you know about local coordinates on $TM$ to $TTM$, you see that in the notation above, we get local coordinates $(x^i,\xi^j,\alpha^k,\beta^\ell)$ on an open subset of $TTM$ and in these coordinates, we get $\pi(x^i,\xi^j,\alpha^k,\beta^\ell)=(x^i,\alpha^k)$. Hence the pre-image of $(x^1,\dots, x^n,X^1,\dots,X^n)$ has $2n$ free parameters (the $\xi^j$ and the $\beta^\ell$).
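
To make this concrete, here is the simplest case, assuming $M=\mathbb R$ with its global coordinate $x$ (so $n=1$; this example is my own choice and not taken from the Wikipedia page). Then $TM\cong\mathbb R^2$ with coordinates $(x,\xi)$, $TTM\cong\mathbb R^4$ with coordinates $(x,\xi,\alpha,\beta)$, and $$\pi(x,\xi,\alpha,\beta)=(x,\alpha),\qquad \pi^{-1}(x_0,X)=\left\{(x_0,\xi,X,\beta):\ \xi,\beta\in\mathbb R\right\},$$ which is a $2$-dimensional plane in $\mathbb R^4$. For each fixed $\xi$, the slice $\{(x_0,\xi,X,\beta):\beta\in\mathbb R\}$ is an affine line in $T_{(x_0,\xi)}TM$, and letting $\xi$ vary sweeps out the full fiber.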