Chain rule for second derivative
If $f:A\subset\mathbb{R}^n\to \mathbb{R}^m$ and $g:B\subset\mathbb{R}^m \to \mathbb{R}^p$ are twice differentiable and $f(A)\subset B$, then for $x_0\in A$, $x,y\in \mathbb{R}^n$, show that: $$D^2(g\circ f)(x_0)(x,y)=D^2g(f(x_0))(Df(x_0)\cdot x, Df(x_0)\cdot y)+Dg(f(x_0)) \cdot D^2f(x_0)(x,y) $$
I know that I should apply the chain rule twice, and I get:
$$D^2(g\circ f)(x_0)(x,y)=D^2g(f(x_0))Df(x_0)(x,y)+Dg(f(x_0))D^2f(x_0)(x,y)$$
But I'm a little confused with the notation and the vector product, because $Df(x_0)\in \mathbb{R}^m$ and $x\in \mathbb{R}^n$ so $Df(x_0)\cdot x$ does not make sense. I would be grateful if someone could write this properly.
Thank you for your help.
Just to make sure you at least understand how all the compositions work:
$x_0$ is a point in $\mathbb{R}^n$.
$f: \mathbb{R}^n \to \mathbb{R}^m, g: \mathbb{R}^m \to \mathbb{R}^p$ are functions.
So $f(x_0)$ is a point in $\mathbb{R}^m$ and $g(f(x_0))$ is a point in $\mathbb{R}^p$.
$Df\big|_{x_0}: \mathbb{R}^n \to \mathbb{R}^m$ is a linear map. You should think of it eating a tangent vector based at $x_0$ and returning a tangent vector based at $f(x_0)$.
$Dg\big|_{f(x_{0})}: \mathbb{R}^m \to \mathbb{R}^p$ is a linear map. You should think of it eating a tangent vector based at $f(x_0)$ and returning a tangent vector based at $g(f(x_0))$.
$D^2f\big|_{x_0}:\mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}^m$ is a bilinear map. You should think of it as eating two tangent vectors based at $x_0$ and returning one tangent vector based at $f(x_0)$.
$D^2g\big|_{f(x_0)}: \mathbb{R}^m \times \mathbb{R}^m \to \mathbb{R}^p$ is a bilinear map. You should think of it as eating two tangent vectors based at $f(x_0)$ and returning one tangent vector based at $g(f(x_0))$.
You are trying to compute $D^2 (g \circ f)\big|_{x_0}$ which should be a bilinear map from $\mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}^p$. It eats two tangent vectors based at $x_0$ and returns a tangent vector based at $g(f(x_0))$.
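In coordinates these maps are the familiar Jacobian and Hessian expressions. As a sketch, assuming the standard bases and writing $f=(f_1,\dots,f_m)$, $v=(v_1,\dots,v_n)$, $w=(w_1,\dots,w_n)$, the $k$-th components are
$$\left(Df\big|_{x_0}(v)\right)_k=\sum_{i=1}^{n}\frac{\partial f_k}{\partial x_i}(x_0)\,v_i,\qquad \left(D^2f\big|_{x_0}(v,w)\right)_k=\sum_{i,j=1}^{n}\frac{\partial^2 f_k}{\partial x_i\,\partial x_j}(x_0)\,v_i\,w_j .$$
So $Df\big|_{x_0}$ is represented by an $m\times n$ matrix, and $Df(x_0)\cdot x$ is that matrix applied to the vector $x\in\mathbb{R}^n$, not a product of two vectors.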
The formula you are trying to prove is now
$D^2(g\circ f)\big|_{x_0}\left(v,w\right) = D^2g\big|_{f(x_0)}\left(Df\big|_{x_0}(v), Df\big|_{x_0}(w)\right) + Dg\big|_{f(x_{0})}\left(D^2f\big|_{x_0}(v,w)\right)$
In the first term $D^2g\big|_{f(x_0)}$ is eating two tangent vectors based at $f(x_0)$, namely $Df\big|_{x_0}(v)$ and $Df\big|_{x_0}(w)$, and returning a tangent vector based at $g(f(x_0))$.
In the second term $Dg\big|_{f(x_{0})}$ is eating a tangent vector based at $f(x_0)$, namely $D^2f\big|_{x_0}(v,w)$.
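If it helps to see the two terms with concrete numbers, here is a small numerical sanity check. It is only a sketch: the maps `f` and `g` are hypothetical examples chosen for illustration (they are not from the question), their derivatives are written out by hand, and the result is compared against a central finite difference of $g\circ f$.

```python
# Hypothetical example maps (not from the question):
#   f(x1, x2) = (x1*x2, x1 + x2**2),   g(u1, u2) = u1**2 * u2.

def f(x):
    x1, x2 = x
    return (x1 * x2, x1 + x2 ** 2)

def g(u):
    u1, u2 = u
    return u1 ** 2 * u2

def Df(x, v):
    """Df at x applied to v: the Jacobian [[x2, x1], [1, 2*x2]] times v."""
    x1, x2 = x
    v1, v2 = v
    return (x2 * v1 + x1 * v2, v1 + 2 * x2 * v2)

def D2f(x, v, w):
    """D^2 f at x applied to (v, w); for this f it does not depend on x."""
    v1, v2 = v
    w1, w2 = w
    return (v1 * w2 + v2 * w1, 2 * v2 * w2)

def Dg(u, a):
    """Dg at u applied to a: the gradient (2*u1*u2, u1**2) dotted with a."""
    u1, u2 = u
    a1, a2 = a
    return 2 * u1 * u2 * a1 + u1 ** 2 * a2

def D2g(u, a, b):
    """D^2 g at u applied to (a, b); the Hessian is [[2*u2, 2*u1], [2*u1, 0]]."""
    u1, u2 = u
    a1, a2 = a
    b1, b2 = b
    return 2 * u2 * a1 * b1 + 2 * u1 * (a1 * b2 + a2 * b1)

def chain_rule_formula(x0, v, w):
    """D^2 g(f(x0))(Df v, Df w) + Dg(f(x0))(D^2 f(x0)(v, w))."""
    y0 = f(x0)
    return D2g(y0, Df(x0, v), Df(x0, w)) + Dg(y0, D2f(x0, v, w))

def finite_difference(x0, v, w, eps=1e-4):
    """Central-difference estimate of D^2 (g o f)(x0)(v, w)."""
    def h(s, t):
        point = (x0[0] + s * v[0] + t * w[0], x0[1] + s * v[1] + t * w[1])
        return g(f(point))
    return (h(eps, eps) - h(eps, -eps) - h(-eps, eps) + h(-eps, -eps)) / (4 * eps ** 2)

x0, v, w = (1.0, 2.0), (0.3, -1.2), (2.0, 0.5)
print(chain_rule_formula(x0, v, w), finite_difference(x0, v, w))
```

The two printed numbers should agree up to the finite-difference error. Dropping either term from `chain_rule_formula` breaks the agreement, which is a quick way to convince yourself that both terms are needed.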
This is a whole lot of crazy maps flying around, and it is kind of tough to keep it all straight. Hopefully now that you at least know what all the maps are doing (at least what their domains and codomains are!) you can figure out the rest. If you have trouble, let me know! I can post another answer.
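In case it is useful, here is one way the remaining step can be sketched, assuming enough regularity to apply the chain rule and the product rule to the maps involved. Fix $w\in\mathbb{R}^n$. The first-order chain rule gives $D(g\circ f)\big|_{x}(w)=A(x)\,b(x)$, where $A(x)=Dg\big|_{f(x)}$ is a linear map and $b(x)=Df\big|_{x}(w)$ is a vector in $\mathbb{R}^m$ (the names $A$ and $b$ are introduced here only for this sketch). Differentiating this product at $x_0$ in the direction $v$:
$$\begin{aligned}
D^2(g\circ f)\big|_{x_0}(v,w)
&=\left(DA\big|_{x_0}(v)\right)b(x_0)+A(x_0)\left(Db\big|_{x_0}(v)\right)\\
&=D^2g\big|_{f(x_0)}\left(Df\big|_{x_0}(v),\,Df\big|_{x_0}(w)\right)+Dg\big|_{f(x_0)}\left(D^2f\big|_{x_0}(v,w)\right),
\end{aligned}$$
where the first term uses the chain rule once more, $DA\big|_{x_0}(v)=D^2g\big|_{f(x_0)}\left(Df\big|_{x_0}(v),\,\cdot\,\right)$, and the second uses $Db\big|_{x_0}(v)=D^2f\big|_{x_0}(v,w)$. As a sanity check, for $n=m=p=1$ this reduces to the familiar $(g\circ f)''=g''(f)\,(f')^2+g'(f)\,f''$.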