Why do we make a distinction between derivatives and partial derivatives?
The definition of a partial derivative is the "derivative of a multi-variable function relative to a single variable when all other variables are held constant".
But isn't the regular derivative (for one-variable functions) just a trivial case of this, where there are no other variables to hold constant? Why do we need the separate notation for partial derivatives (that is, writing $\displaystyle \frac{\partial f}{\partial x}$ rather than just $\displaystyle \frac{df}{dx}$)?
Solution 1:
Because there exists a notion of "total derivative" that is the multivariate analogue of the 1-D derivative you're familiar with. The total derivative of a function $f: \mathbb{R}^n \to \mathbb{R}^m$ is known as the Jacobian of $f$. You may have worked with this while doing change of variables with multiple integrals. See http://en.wikipedia.org/wiki/Jacobian_matrix_and_determinant. When we say $f$ is differentiable at a point $a$, we mean that this total derivative exists at $a$. There is a theorem that says that if all of the partial derivatives of $f$ exist and are continuous at $a$, then $f$ is differentiable at $a$, but the converse isn't true. See Can "being differentiable" imply "having continuous partial derivatives"?.
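For a concrete example (with a function chosen just for illustration), take $f: \mathbb{R}^2 \to \mathbb{R}^2$ given by $f(x, y) = (x^2 y, \sin y)$. Its Jacobian packages all of the partial derivatives into a single matrix:

$$J_f(x, y) = \begin{pmatrix} \dfrac{\partial f_1}{\partial x} & \dfrac{\partial f_1}{\partial y} \\[6pt] \dfrac{\partial f_2}{\partial x} & \dfrac{\partial f_2}{\partial y} \end{pmatrix} = \begin{pmatrix} 2xy & x^2 \\ 0 & \cos y \end{pmatrix}.$$

So the partial derivatives are the entries, while the total derivative is the single linear map they assemble into.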
Furthermore, the total derivative defines a linear map and is used as a linear approximation of $f$ via Taylor's Theorem (http://en.wikipedia.org/wiki/Taylor_series#Taylor_series_in_several_variables), as well as in the Implicit Function Theorem (http://en.wikipedia.org/wiki/Implicit_function_theorem), among other results.
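Continuing the example above, at the point $a = (1, 0)$ (again, a point picked just for illustration) the total derivative gives the best linear approximation of $f$ near $a$:

$$f(1 + h_1,\, h_2) \approx f(1, 0) + J_f(1, 0)\begin{pmatrix} h_1 \\ h_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix} + \begin{pmatrix} 0 & 1 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} h_1 \\ h_2 \end{pmatrix} = \begin{pmatrix} h_2 \\ h_2 \end{pmatrix},$$

which agrees with expanding $\big((1+h_1)^2 h_2,\ \sin h_2\big)$ to first order.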
Now, the notion of the total derivative of $f: \mathbb{R}^n \to \mathbb{R}^m$ can be extended to functions $f: \Omega \subset V \to W$ where $V$ and $W$ are normed vector spaces and $\Omega$ is open (probably further, but I haven't gotten there yet).
Definition: We say that $f: \Omega \subset V \to W$ is differentiable at $a \in \Omega$ if there exists a bounded linear operator $L_f : V \to W$ such that:
$f(a+h) = f(a) + L_f[h] + E[h]$
where $\displaystyle \lim_{h \to 0} \frac{\|E[h]\|_W}{\|h\|_V} = 0$, i.e. the error term vanishes faster than $h$ as $h$ approaches zero (think about the definition of the derivative in $\mathbb{R}$).
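To see that this recovers the one-variable derivative, take $V = W = \mathbb{R}$ and $L_f[h] = f'(a)\,h$. Then $E[h] = f(a+h) - f(a) - f'(a)h$, and the condition above reads

$$\lim_{h \to 0} \frac{|f(a+h) - f(a) - f'(a)h|}{|h|} = 0 \quad \Longleftrightarrow \quad \lim_{h \to 0} \frac{f(a+h) - f(a)}{h} = f'(a),$$

which is exactly the usual limit definition of the derivative.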
An aside: "bounded linear operator" here means that $\|L_f(x) - L_f(y)\|_W \leq C\,\|x - y\|_V$ for all $x, y \in V$ and some constant $C$.
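By linearity this is equivalent to the more common formulation $\|L_f(h)\|_W \leq C\,\|h\|_V$ for all $h \in V$: setting $y = 0$ gives one direction, and

$$\|L_f(x) - L_f(y)\|_W = \|L_f(x - y)\|_W \leq C\,\|x - y\|_V$$

gives the other.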