1. By direct computation: the first directional derivative of $f:\mathbf{R}^m\rightarrow \mathbf{R}$ in the direction of $u$ at $x$ is given by \begin{equation} \partial_u f(x):=\lim_{t\rightarrow 0}\frac{f(x+tu)-f(x)}{t}=\nabla f(x) \cdot u = \sum_{i=1}^{m} u_i\partial_{x_i}f(x). \end{equation} The second directional derivative along the direction $u$ is obtained in a similar fashion: \begin{align*} \partial^2_{uu}f(x)&=\partial_u(\partial_u f)(x)\\ &=\lim_{t\rightarrow 0}\frac{\partial_u f(x+tu)-\partial_u f(x)}{t}\\ &=\lim_{t\rightarrow 0}\frac{\nabla f(x+tu)\cdot u-\nabla f(x)\cdot u}{t}\\ &=\sum_{i=1}^{m}u_i\lim_{t\rightarrow 0}\frac{\partial_{x_i}f(x+tu)-\partial_{x_i}f(x)}{t}\\ &=\sum_{i=1}^{m}\sum_{j=1}^{m}u_i\, \partial_{x_i x_j} f(x)\,u_j\\ &=u^THu \end{align*} where $H=D^2 f(x)$ is the Hessian matrix of $f$ at $x$, which is symmetric when $f$ is twice continuously differentiable.

2. Saying that $d$ is a direction means $\|d\|=1$, where the norm is the usual Euclidean norm on $\mathbf{R}^m$, i.e., $\|d\|=\sqrt{d_1^2+\cdots+d_m^2}$. Therefore, if $d=\sum_{i=1}^{m}c_i e_i$, where $\left\{ e_i \right\}$ is an orthonormal basis consisting of eigenvectors of $H$ (one exists because $H$ is symmetric), then by the Pythagorean theorem, \begin{equation} 1=\left\|d\right\|^2=\sum_{i=1}^{m}c_i^2 \end{equation} from which we can conclude that each $c_i^2$ lies between $0$ and $1$.

3. For any direction $d$, we know from 1 that \begin{equation} \partial_{dd}^2 f(x)=d^T H d \end{equation} Write $d=\sum_{i=1}^{m}c_i e_i$, where $He_i=\lambda_i e_i$; then we have \begin{align*} d^THd&=\left( \sum_{i=1}^{m}c_i e_i \right)^T H\left( \sum_{i=1}^{m}c_i e_i \right)\\ &=\left( \sum_{i=1}^{m}c_i e_i \right)^{T}\left( \sum_{i=1}^{m} c_i\lambda_i e_i\right)\\ &=\sum_{i=1}^{m}c_i^2 \lambda_i \leq \lambda_{\max}\sum_{i=1}^{m}c_i^2\\ &=\lambda_{\max} \end{align*} where the third equality uses the orthonormality of the $e_i$, and the last uses the Pythagorean theorem again for $\sum_{i=1}^{m}c_i^2=1$.

On the other hand, if we let $e_1$ be the unit eigenvector associated with $\lambda_{\max}$, then we have \begin{equation} \partial^2_{e_1 e_1}f(x)=e_1^T He_1=e_1^T \lambda_{\max} e_1=\lambda_{\max} \end{equation} In conclusion, for every direction $d$, \begin{equation} \partial^2_{dd}f(x)\leq \lambda_{\max}=\partial^2_{e_1 e_1}f(x). \end{equation}
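The steps above can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the test function `f`, its hand-computed `hessian`, the point `x`, and the helper `second_directional_derivative` are all made up for illustration and do not come from the question.

```python
import numpy as np

def f(x):
    # Hypothetical smooth test function on R^3 (chosen only for illustration).
    return x[0]**2 * x[1] + np.sin(x[1] * x[2]) + x[2]**3

def hessian(x):
    # Hessian of the test function above, computed by hand.
    x0, x1, x2 = x
    s, c = np.sin(x1 * x2), np.cos(x1 * x2)
    return np.array([
        [2 * x1, 2 * x0,          0.0],
        [2 * x0, -x2**2 * s,      c - x1 * x2 * s],
        [0.0,    c - x1 * x2 * s, 6 * x2 - x1**2 * s],
    ])

def second_directional_derivative(f, x, u, t=1e-4):
    # Central second difference of f along the line x + t*u.
    return (f(x + t * u) - 2.0 * f(x) + f(x - t * u)) / t**2

x = np.array([0.7, -1.2, 0.4])   # arbitrary evaluation point
H = hessian(x)

# eigh handles symmetric matrices: eigenvalues ascending, eigenvectors orthonormal.
eigvals, eigvecs = np.linalg.eigh(H)
lam_max = eigvals[-1]
e_max = eigvecs[:, -1]

# Random unit directions d, one per row.
rng = np.random.default_rng(0)
D = rng.normal(size=(1000, 3))
D /= np.linalg.norm(D, axis=1, keepdims=True)

# Quadratic form d^T H d for every row d of D.
quad = np.einsum("ij,jk,ik->i", D, H, D)

print("largest sampled d^T H d:", quad.max())
print("lambda_max:             ", lam_max)
print("e_max^T H e_max:        ", e_max @ H @ e_max)
print("finite difference vs d^T H d along D[0]:",
      second_directional_derivative(f, x, D[0]), quad[0])
```

The sampled values of $d^THd$ should never exceed `lam_max`, and the bound should be attained at `e_max`; the finite-difference value only agrees with $d^THd$ up to discretization and rounding error.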


The directional derivative $\nabla_uf = \nabla f \cdot \frac {u}{\|u\|}$ is the rate of change of $f$ per unit step in the direction of $u.$ The second directional derivative is the rate of change of the first directional derivative along that same direction.

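If $d$ is not in the direction of one of the eigenvectors, we can still write $d = c_1v_1 + c_2v_2 + \cdots + c_nv_n$, where $v_1,\dots,v_n$ are orthonormal eigenvectors of the Hessian $H$ with eigenvalues $\lambda_1,\dots,\lambda_n$, and then $d^THd = c_1^2\lambda_1 + \cdots +c_n^2\lambda_n.$

Since $d$ is a unit vector, the weights $c_k^2$ are nonnegative and sum to $1$, so $c_1^2\lambda_1 + \cdots +c_n^2\lambda_n$ is a weighted average of the eigenvalues. It is largest when all of the weight falls on the largest $\lambda_k$, and smallest when all of the weight falls on the smallest $\lambda_k.$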
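As a small worked instance of this weighting argument (the matrix below is chosen purely for illustration and is not taken from the question), take $H=\begin{pmatrix}3&0\\0&1\end{pmatrix}$, so that $v_1=(1,0)^T$, $v_2=(0,1)^T$, $\lambda_1=3$, $\lambda_2=1$. Any unit vector can be written as $d=(\cos\theta,\sin\theta)^T$, i.e. $c_1=\cos\theta$ and $c_2=\sin\theta$, and then \begin{align*} d^THd&=c_1^2\lambda_1+c_2^2\lambda_2\\ &=3\cos^2\theta+\sin^2\theta\\ &=1+2\cos^2\theta. \end{align*} This ranges over $[1,3]$: it equals $\lambda_1=3$ when $\theta=0$ (all of the weight on $v_1$) and $\lambda_2=1$ when $\theta=\pi/2$ (all of the weight on $v_2$).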