\begin{align*}
dy & = d(x^{T}Ax) = d(Ax\cdot x) = d\left(\sum_{i=1}^{n}(Ax)_{i}x_{i}\right) \\
& = d\left(\sum_{i=1}^{n}\sum_{j=1}^{n}a_{i,j}x_{j}x_{i}\right) = \sum_{i=1}^{n}\sum_{j=1}^{n}a_{i,j}x_{i}\,dx_{j}+\sum_{i=1}^{n}\sum_{j=1}^{n}a_{i,j}x_{j}\,dx_{i} \\
& = \sum_{i=1}^{n}(Ax)_{i}\,dx_{i}+\sum_{i=1}^{n}(A\,dx)_{i}\,x_{i} = (dx)^{T}Ax+x^{T}A\,dx \\
& = (dx)^{T}Ax+(dx)^{T}A^{T}x = (dx)^{T}(A+A^{T})x.
\end{align*}


Step 2 might be the result of a simple computation. Consider $u(x)=x^TAx$. Then $$ u(x+h)=(x+h)^TA(x+h)=x^TAx+h^TAx+x^TAh+h^TAh, $$ that is, $u(x+h)=u(x)+x^T(A+A^T)h+r_x(h)$ with $r_x(h)=h^TAh$. This uses the identity $h^TAx=x^TA^Th$, which holds because $m=h^TAx$ is a $1\times1$ matrix, hence $m^T=m$.

Since $|h^TAh|\leqslant\|A\|\,\|h\|^2$, where $\|A\|$ denotes the operator norm of $A$, one sees that $r_x(h)=o(\|h\|)$ when $h\to0$. This proves that the differential of $u$ at $x$ is the linear function $\nabla u(x):\mathbb R^n\to\mathbb R$, $h\mapsto x^T(A+A^T)h$, which can be identified with the unique vector $z$ such that $\nabla u(x)(h)=z^Th$ for every $h$ in $\mathbb R^n$, that is, $z=(A+A^T)x$.
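
For readers who want a numerical sanity check, here is a minimal sketch in Python, assuming NumPy is available. The function names (`u`, `grad_u`, `numerical_grad`) and the random test data are illustrative choices and not part of the derivation. It compares $z=(A+A^T)x$ with a central finite-difference gradient, and checks that the remainder $u(x+h)-u(x)-z^Th$ equals $h^TAh$.

```python
import numpy as np

def u(A, x):
    """Quadratic form u(x) = x^T A x."""
    return x @ A @ x

def grad_u(A, x):
    """Closed-form gradient z = (A + A^T) x from the derivation."""
    return (A + A.T) @ x

def numerical_grad(A, x, eps=1e-6):
    """Central finite-difference approximation of the gradient of u at x."""
    g = np.zeros(x.size)
    for i in range(x.size):
        e = np.zeros(x.size)
        e[i] = eps
        g[i] = (u(A, x + e) - u(A, x - e)) / (2 * eps)
    return g

# Illustrative data: a random, non-symmetric A (an assumption for the demo).
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
x = rng.standard_normal(4)
h = 1e-3 * rng.standard_normal(4)

# Largest componentwise gap between the closed form and finite differences.
max_err = np.max(np.abs(grad_u(A, x) - numerical_grad(A, x)))

# Remainder r_x(h) = u(x+h) - u(x) - z^T h, to be compared with h^T A h.
remainder = u(A, x + h) - u(A, x) - grad_u(A, x) @ h
quadratic_term = h @ A @ h
```

Taking `A` non-symmetric on purpose makes the test distinguish $(A+A^T)x$ from the symmetric special case $2Ax$. The two formulas agree only when $A=A^T$.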