Difference between gradient and Jacobian
Solution 1:
These are two matrix representations of the derivative of a differentiable function $f,$ used in two cases:
- when $f:\mathbb{R}^n\to\mathbb{R},$ then for $x$ in $\mathbb{R}^n$, $$\mathrm{grad}_x(f):=\left[\frac{\partial f}{\partial x_1}\frac{\partial f}{\partial x_2}\dots\frac{\partial f}{\partial x_n}\right]\!\bigg\rvert_x$$ is the $1\times n$ matrix of the linear map $Df(x)$ expressed with respect to the canonical basis of $\mathbb{R}^n$ and the canonical basis $(1)$ of $\mathbb{R}$. Because in this case the matrix has only one row, you can think of it as the vector $$\nabla f(x):=\left(\frac{\partial f}{\partial x_1},\frac{\partial f}{\partial x_2},\dots,\frac{\partial f}{\partial x_n}\right)\!\bigg\rvert_x\in\mathbb{R}^n.$$ This vector $\nabla f(x)$ is the unique vector of $\mathbb{R}^n$ such that $Df(x)(y)=\langle\nabla f(x),y\rangle$ for all $y\in\mathbb{R}^n$ (see the Riesz representation theorem), where $\langle\cdot,\cdot\rangle$ is the usual scalar product $$\langle(x_1,\dots,x_n),(y_1,\dots,y_n)\rangle=x_1y_1+\dots+x_ny_n.$$
- when $f:\mathbb{R}^n\to\mathbb{R}^m,$ then for $x$ in $\mathbb{R}^n$, $$\mathrm{Jac}_x(f)=\left.\begin{bmatrix}\frac{\partial f_1}{\partial x_1}&\frac{\partial f_1}{\partial x_2}&\dots&\frac{\partial f_1}{\partial x_n}\\\frac{\partial f_2}{\partial x_1}&\frac{\partial f_2}{\partial x_2}&\dots&\frac{\partial f_2}{\partial x_n}\\ \vdots&\vdots&&\vdots\\\frac{\partial f_m}{\partial x_1}&\frac{\partial f_m}{\partial x_2}&\dots&\frac{\partial f_m}{\partial x_n}\\\end{bmatrix}\right|_x$$ is the $m\times n$ matrix of the linear map $Df(x)$ expressed with respect to the canonical bases of $\mathbb{R}^n$ and $\mathbb{R}^m.$
For example, with $f:\mathbb{R}^2\to\mathbb{R}$ defined by $f(x,y)=x^2+y$ you get $\mathrm{grad}_{(x,y)}(f)=[2x \,\,\,1]$ (or $\nabla f(x,y)=(2x,1)$), and with $f:\mathbb{R}^2\to\mathbb{R}^2$ defined by $f(x,y)=(x^2+y,y^3)$ you get $\mathrm{Jac}_{(x,y)}(f)=\begin{bmatrix}2x&1\\0&3y^2\end{bmatrix}.$
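These two worked examples can be checked numerically. Below is a minimal sketch using central finite differences (the helper `numerical_jacobian` is a hypothetical name introduced here, not a standard library function); for a scalar-valued $f$ the $1\times n$ Jacobian it returns is exactly the gradient row.

```python
import numpy as np

def numerical_jacobian(f, x, h=1e-6):
    """Approximate the Jacobian of f at x by central differences."""
    x = np.asarray(x, dtype=float)
    fx = np.atleast_1d(f(x))
    J = np.zeros((fx.size, x.size))
    for j in range(x.size):
        e = np.zeros_like(x)
        e[j] = h
        J[:, j] = (np.atleast_1d(f(x + e)) - np.atleast_1d(f(x - e))) / (2 * h)
    return J

# f(x, y) = x^2 + y (scalar-valued): the 1x2 Jacobian is the gradient row [2x, 1].
grad = numerical_jacobian(lambda v: v[0] ** 2 + v[1], [3.0, 2.0])
print(grad)  # ≈ [[6. 1.]]

# f(x, y) = (x^2 + y, y^3) (vector-valued): a full 2x2 Jacobian.
jac = numerical_jacobian(lambda v: np.array([v[0] ** 2 + v[1], v[1] ** 3]),
                         [3.0, 2.0])
print(jac)   # ≈ [[6. 1.], [0. 12.]], since 3y^2 = 12 at y = 2
```

The same routine handles both cases, which illustrates the point of the answer: the gradient is just the $m=1$ instance of the Jacobian.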
Solution 2:
The gradient vector of a scalar function $f(\mathbf{x})$ that maps $\mathbb{R}^n\to\mathbb{R}$, where $\mathbf{x}=\langle x_1,x_2,\ldots,x_n\rangle$, is written as $$\nabla f(\mathbf{x})=\frac{\partial f(\mathbf{x})}{\partial x_1}\hat{x}_1+\frac{\partial f(\mathbf{x})}{\partial x_2}\hat{x}_2+\ldots+\frac{\partial f(\mathbf{x})}{\partial x_n}\hat{x}_n$$
The Jacobian, by contrast, is taken of a vector-valued function $\mathbf{f}(\mathbf{x})$ that maps $\mathbb{R}^n\to\mathbb{R}^m$, where $\mathbf{f}=\langle f_1,f_2,\ldots,f_m\rangle$ and $\mathbf{x}=\langle x_1,x_2,\ldots,x_n\rangle$. The Jacobian is written as
$$J_\mathbf{f} = \frac{\partial (f_1,\ldots,f_m)}{\partial(x_1,\ldots,x_n)} = \left[ \begin{matrix} \frac{\partial f_1}{\partial x_1} & \cdots & \frac{\partial f_1}{\partial x_n} \\ \vdots & \ddots & \vdots \\ \frac{\partial f_m}{\partial x_1} & \cdots & \frac{\partial f_m}{\partial x_n} \end{matrix} \right]$$
Note that when $m=1$ the Jacobian is the same as the gradient (written as a row vector): the Jacobian is a generalization of the gradient to vector-valued functions.
The Jacobian determinant can be used for changes of variables because it can be viewed as the ratio of an infinitesimal change in the variables of one coordinate system to another. This requires that the function $\mathbf{f}(\mathbf{x})$ maps $\mathbb{R}^n\to\mathbb{R}^n$, which produces an $n\times n$ square matrix for the Jacobian. For example
$$\iiint_R f(x,y,z) \,dx\,dy\,dz = \iiint_S f(x(u,v,w),y(u,v,w),z(u,v,w))\left|\frac{\partial (x,y,z)}{\partial(u,v,w)}\right|\,du\,dv\,dw$$
where the Jacobian $J_\mathbf{g}$ is taken of the function
$$\mathbf{g}(u,v,w)=x(u,v,w)\hat{\imath}+y(u,v,w)\hat{\jmath}+z(u,v,w)\hat{k}$$
and the regions $R$ and $S$ correspond to each other under this change of variables.
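As a concrete instance of this change-of-variables factor, take $\mathbf{g}$ to be the spherical-coordinate map $x=r\sin\theta\cos\phi$, $y=r\sin\theta\sin\phi$, $z=r\cos\theta$, whose Jacobian determinant is the familiar $r^2\sin\theta$. The sketch below (again using a hand-rolled finite-difference Jacobian, an illustrative helper rather than a library routine) checks this numerically at one point:

```python
import numpy as np

def spherical_to_cartesian(p):
    """g(r, theta, phi) = (x, y, z), physics convention."""
    r, theta, phi = p
    return np.array([
        r * np.sin(theta) * np.cos(phi),
        r * np.sin(theta) * np.sin(phi),
        r * np.cos(theta),
    ])

def numerical_jacobian(f, x, h=1e-6):
    """3x3 Jacobian of f at x by central differences."""
    x = np.asarray(x, dtype=float)
    J = np.zeros((3, 3))
    for j in range(3):
        e = np.zeros_like(x)
        e[j] = h
        J[:, j] = (f(x + e) - f(x - e)) / (2 * h)
    return J

p = np.array([2.0, 0.7, 1.1])  # an arbitrary point (r, theta, phi)
det = np.linalg.det(numerical_jacobian(spherical_to_cartesian, p))
analytic = p[0] ** 2 * np.sin(p[1])  # r^2 sin(theta)
print(det, analytic)  # the two values agree to finite-difference accuracy
```

This is the factor $\left|\frac{\partial(x,y,z)}{\partial(u,v,w)}\right|$ one multiplies by when converting a triple integral from Cartesian to spherical coordinates.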