Chain rule for second derivative

If $f:A\subset\mathbb{R}^n\to \mathbb{R}^m$ and $g:B\subset\mathbb{R}^m \to \mathbb{R}^p$ are twice differentiable and $f(A)\subset B$, then for $x_0\in A$ and $x,y\in \mathbb{R}^n$, show that: $$D^2(g\circ f)(x_0)(x,y)=D^2g(f(x_0))(Df(x_0)\cdot x, Df(x_0)\cdot y)+Dg(f(x_0)) \cdot D^2f(x_0)(x,y) $$

I know that I should apply the chain rule twice to get:

$$D^2(g\circ f)(x_0)(x,y)=D^2g(f(x_0))Df(x_0)(x,y)+Dg(f(x_0))D^2f(x_0)(x,y)$$

But I'm a little confused by the notation and the vector product: as I understand it, $Df(x_0)\in \mathbb{R}^m$ and $x\in \mathbb{R}^n$, so $Df(x_0)\cdot x$ does not make sense. I would be grateful if someone could write this out properly.

Thank you for your help.

Just to make sure you at least understand how all the compositions work:

$x_0$ is a point in $\mathbb{R}^n$.

$f: \mathbb{R}^n \to \mathbb{R}^m, g: \mathbb{R}^m \to \mathbb{R}^p$ are functions.

So $f(x_0)$ is a point in $\mathbb{R}^m$ and $g(f(x_0))$ is a point in $\mathbb{R}^p$.

$Df\big|_{x_0}: \mathbb{R}^n \to \mathbb{R}^m$ is a linear map. You should think of it eating a tangent vector based at $x_0$ and returning a tangent vector based at $f(x_0)$.

$Dg\big|_{f(x_{0})}: \mathbb{R}^m \to \mathbb{R}^p$ is a linear map. You should think of it eating a tangent vector based at $f(x_0)$ and returning a tangent vector based at $g(f(x_0))$.

$D^2f\big|_{x_0}:\mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}^m$ is a bilinear map. You should think of it as eating two tangent vectors based at $x_0$ and returning one tangent vector based at $f(x_0)$.

$D^2g\big|_{f(x_0)}: \mathbb{R}^m \times \mathbb{R}^m \to \mathbb{R}^p$ is a bilinear map. You should think of it as eating two tangent vectors based at $f(x_0)$ and returning one tangent vector based at $g(f(x_0))$.

You are trying to compute $D^2 (g \circ f)\big|_{x_0}$ which should be a bilinear map from $\mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}^p$. It eats two tangent vectors based at $x_0$ and returns a tangent vector based at $g(f(x_0))$.

The formula you are trying to prove is now

$D^2(g \circ f)\big|_{x_0}\left(v,w\right) = D^2g\big|_{f(x_0)}\left(Df\big|_{x_0}(v), Df\big|_{x_0}(w)\right) + Dg\big|_{f(x_{0})}\left(D^2f\big|_{x_0}(v,w)\right)$

In the first term $D^2g\big|_{f(x_0)}$ is eating two tangent vectors based at $f(x_0)$, namely $Df\big|_{x_0}(v)$ and $Df\big|_{x_0}(w)$, and returning a tangent vector based at $g(f(x_0))$.

In the second term $Dg\big|_{f(x_{0})}$ is eating a tangent vector based at $f(x_0)$, namely $D^2f\big|_{x_0}(v,w)$.
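As a quick sanity check on the shape of the formula, consider the one-dimensional case $n=m=p=1$, where every linear map is multiplication by a number. The specific functions $f(x)=x^2$ and $g(y)=\sin y$ below are only an illustration and are not part of the original problem:

$$(g\circ f)''(x_0)\,vw = g''(f(x_0))\,\big(f'(x_0)v\big)\big(f'(x_0)w\big) + g'(f(x_0))\,f''(x_0)\,vw$$

$$\frac{d^2}{dx^2}\sin(x^2) = \underbrace{-\sin(x^2)\cdot(2x)^2}_{g''(f)\,(f')^2} + \underbrace{\cos(x^2)\cdot 2}_{g'(f)\,f''}$$

Differentiating $\sin(x^2)$ twice by hand gives the same two terms, so the general formula reduces to the familiar single-variable one.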

This is a whole lot of crazy maps flying around, and it is kind of tough to keep it all straight. Hopefully, now that you know what all the maps are doing (or at least what their domains and codomains are!), you can figure out the rest. If you have trouble, let me know! I can post another answer.
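For reference, here is one way the remaining computation could go. It is a sketch rather than a complete proof, and it assumes the convention that $D^2h\big|_{x}(w,v)$ is the derivative at $x$, in the direction $w$, of the map $x \mapsto Dh\big|_{x}(v)$, where $h = g\circ f$ is shorthand introduced here. The first chain rule gives, for a fixed $v$,

$$Dh\big|_{x}(v) = Dg\big|_{f(x)}\left(Df\big|_{x}(v)\right).$$

The right-hand side is a linear map $Dg\big|_{f(x)}$ applied to a vector $Df\big|_{x}(v)$, and both depend on $x$. Evaluation $(A,u)\mapsto A(u)$ is bilinear, so the product rule applies: differentiate each factor in the direction $w$ while holding the other fixed. Using the chain rule once more on the first factor,

$$D^2h\big|_{x_0}(w,v) = D^2g\big|_{f(x_0)}\left(Df\big|_{x_0}(w),\, Df\big|_{x_0}(v)\right) + Dg\big|_{f(x_0)}\left(D^2f\big|_{x_0}(w,v)\right).$$

Since second derivatives are symmetric bilinear maps, the order of $v$ and $w$ does not matter, and this is the formula stated above.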
