Is the total differential the same as the directional derivative?

multivariable-calculus

The way I understand it, the total differential and the directional derivative are both linear approximations of the change in a function at a certain point.

So if I know the change in $x$ and $y$ from the initial point, then I plug those into the total differential to find the approximate change in $z$.

But isn't this the same as finding the directional derivative in the direction of

$$ v = (\text{change in } x, \text{change in } y)? $$

I stumbled onto this question because I had a related question about something else. Your question is old, but I believe my answer can help others with a similar question. There are essentially two types of derivatives in single-variable calculus and, analogously, two types of derivatives in multivariate calculus. This answer is long, but stay with me: the length is a cost, but it buys clarity and organization (hopefully). The total derivative comes at the end; I have to go through some other things before I get there.

Case 1: How a function changes by changing the function directly

In single variable calculus, consider the function $f(x)$. How does $f$ change as we directly change the function through the variable $x$? This is the derivative $df(x)/dx$. It determines how $f$ changes for every unit change in the direct or domain variable $x$.

In multivariate calculus, consider the function $f(x,y)$. Again we ask: how does $f$ change as we directly change the function through the variables $x$ and $y$? We can consider how $f$ changes when only $x$ changes, which is $\partial f/\partial x$. This gives the rate of change of $f$ per unit change in the $x$ direction, holding $y$ fixed. Likewise for the $y$ variable, holding $x$ fixed. But we can generalize this partial derivative along either $x$ or $y$ to any straight-line direction. This is called the directional derivative. Again, just to repeat myself, we are asking how $f$ changes directly by changing its domain variables. Just to be complete, the directional derivative takes the form $\nabla f \cdot \vec{v}$, with $\vec{v}$ a unit vector in the chosen direction. We can generalize the directional derivative even further. Instead of asking how $f$ changes along a straight-line path, we ask how $f$ changes tangent to an arbitrary path in our domain. The only difference between an arbitrary path and a straight line is that the tangent vectors on an arbitrary path change along the path, while the tangent vectors along a straight line do not. Therefore, the derivative still takes the form $\nabla f \cdot \vec{v}$, but $\vec{v}$ changes as you move from point to point on your path.

To recap: this section is all about finding how $f$ changes directly as we change its direct variables (the domain variables). In the multivariate setting, we can ask how $f$ changes per unit change along its $x$ and $y$ axes, which we generalize to arbitrary straight lines (the directional derivative), which we can in turn generalize to arbitrary paths. In every case, it's always how $f$ changes per unit change in its direct or domain variables.
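As a quick numerical sanity check of case 1, here is a minimal Python sketch. The function `f`, the point $(1, 2)$ and the direction $(3, 4)$ are made up for illustration, and I normalize the direction to a unit vector so that the result is a rate per unit distance in the domain. It compares $\nabla f \cdot \vec{v}$ with a finite-difference quotient taken along the straight line.

```python
import math

def f(x, y):
    # Hypothetical example function, chosen only for illustration.
    return x**2 * y + math.sin(y)

def grad_f(x, y):
    # Analytic gradient of the example f: (df/dx, df/dy).
    return (2 * x * y, x**2 + math.cos(y))

def unit(v):
    # Normalize v so the derivative is "per unit distance" in the domain.
    norm = math.hypot(v[0], v[1])
    return (v[0] / norm, v[1] / norm)

def directional_derivative(x, y, v):
    # Gradient formula: grad f . u, with u the unit vector along v.
    u = unit(v)
    g = grad_f(x, y)
    return g[0] * u[0] + g[1] * u[1]

def directional_derivative_fd(x, y, v, h=1e-6):
    # Central difference along the straight line through (x, y) with direction u.
    u = unit(v)
    forward = f(x + h * u[0], y + h * u[1])
    backward = f(x - h * u[0], y - h * u[1])
    return (forward - backward) / (2 * h)

exact = directional_derivative(1.0, 2.0, (3.0, 4.0))
approx = directional_derivative_fd(1.0, 2.0, (3.0, 4.0))
print(exact, approx)
```

Choosing the direction `(1.0, 0.0)` reduces `directional_derivative` to the partial derivative $\partial f/\partial x$, which is the special case the paragraph above starts from.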

Case 2: How a function changes by changing the function indirectly

In single variable calculus, consider $f(x)$. But what if $f(x) = f(g(t))$? That is, $f$ is a composite function. You can change $f$ directly by changing $x$, or indirectly by changing $t$. Turning the $t$ parameter knob changes $x$, which in turn changes $f$. So what is $df(g(t))/dt$? We are asking how the function changes indirectly through $t$. The derivative won't have units of $f$ per unit of the domain variable $x$. It will have units of $f$ per unit of the indirect variable $t$. To be complete, the chain rule gives $df(g(t))/dt = f'(g(t))g'(t)$, that is, the derivative of the outside with respect to the inside, times the derivative of the inside. In case 1 above, we only considered how $f$ changed directly with respect to its domain variable. To be clear, if you write out $f(g(t))$, all you'd see would be $x$'s. Applying $d/dt$ to the function asks how $f$ changes indirectly through the parameter $t$.

In the multivariate case, consider $f(x,y)$. Is there such a thing as a 'composite' multivariate function? Yes, and it looks like $f(x,y) = f(g(t), h(t))$. This is just the multivariate version of a composite function. Likewise, we have something called the multivariate chain rule. Again, case 1 above asks how $f$ changes directly with its domain variables. However, we can ask how $f$ changes through $t$, which is outside its domain. Changing $t$ changes $x$ and $y$, and consequently changes $f$. The multivariate chain rule looks similar to the single variable chain rule. It gives $df(g(t),h(t))/dt = \nabla f \cdot \langle g'(t), h'(t) \rangle$, which is just the derivative of the outside times the derivative of the inside, summed over the components. This is also called the chain rule for paths, but I prefer to call it the multivariate chain rule. Also notice that this derivative happens to have the same form as the directional derivative and the arbitrary path derivative above. The reason is that straight lines and arbitrary paths in your domain require a parameterization $x = g(t)$ and $y = h(t)$. Then you can apply the correct limit definition of your derivative and see that its form... (do not get me started on limits and directional derivatives. I think this is poorly taught... the limit definitions for how $f$ changes directly through domain variables and for how $f$ changes indirectly through parameters are different, yet this difference is never brought to light). Nonetheless, realize that case 1 derivatives give the change in $f$ per unit change in domain variables, while case 2 derivatives give the change in $f$ per unit change in indirect variable(s), which can have different units. These are two types of derivatives that you can tell apart simply by looking at units, or just by determining whether $f$ is changing directly or indirectly.
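To see the multivariate chain rule in action, here is a small Python sketch. The choices $f = x^2 + y$, $g(t) = \cos t$ and $h(t) = t^2$ are assumptions picked only for illustration. It evaluates $\nabla f \cdot \langle g'(t), h'(t) \rangle$ and compares it with a finite-difference derivative of the composite function with respect to $t$.

```python
import math

def f(x, y):
    # Example outer function (an assumption for this sketch).
    return x**2 + y

def g(t):
    # Hypothetical parameterization x = g(t).
    return math.cos(t)

def h(t):
    # Hypothetical parameterization y = h(t).
    return t**2

def chain_rule(t):
    # grad f evaluated at (g(t), h(t)), dotted with <g'(t), h'(t)>.
    x = g(t)
    df_dx, df_dy = 2 * x, 1.0
    dg_dt, dh_dt = -math.sin(t), 2 * t
    return df_dx * dg_dt + df_dy * dh_dt

def numeric(t, eps=1e-6):
    # Central difference of the composite t -> f(g(t), h(t)).
    return (f(g(t + eps), h(t + eps)) - f(g(t - eps), h(t - eps))) / (2 * eps)

print(chain_rule(0.7), numeric(0.7))
```

Both numbers are rates of change of $f$ per unit of $t$, not per unit of distance in the $(x, y)$ plane, which is the units distinction made above.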

Total Derivative

So which camp does the total derivative fall into: direct or indirect? The answer is both. Consider a 'multivariate composite function' $f(g(t), h(t))$. If I wrote out $f$, you wouldn't see any $t$'s. Say, as an example, $f = x^2 + y$ where $x = g(t)$ and $y = h(t)$. The point: there are no $t$'s. Now what if I consider $f(t, g(t), h(t))$? Now you do see $t$'s in the equation of $f$, such as $f = tx^2 + y - t$. So if I asked for the derivative of $f$ with respect to $t$, $\frac{d}{dt} f(t, g(t), h(t))$, will $f$ change directly or indirectly? The answer is both, because as I change $t$ I'm changing a direct (domain) variable, but I'm also changing a parameter outside the domain which indirectly affects $x$ and $y$. However, we can do a 'trick'. Although $f(t, g(t), h(t))$ is not 'completely' composite, it can be if we consider $f(t, g(t), h(t)) = f(t(t), g(t), h(t))$. What I'm doing is letting the direct variable $t$ be determined by an indirect variable $t$. In other words, $t_{direct} = t_{indirect}$. Previously, I had my hands turning knobs on both a direct variable $t$ and an indirect parameter $t$ living outside the domain. Now, with $t = t$, I have a completely composite function and therefore I can use the multivariate chain rule, which gives

$$\frac{d}{dt}f(t,g(t),h(t)) = \nabla f \cdot \langle \frac{d}{dt}t, g'(t), h'(t) \rangle$$

where, since $f = f(t, g(t), h(t))$, I ''redefined'' the gradient to include $\partial f/\partial t$ in the first slot. I'm only doing this to give a new perspective. Or you can do what textbooks and Wikipedia do: keep the gradient as is (that is, containing only the partials with respect to $x$ and $y$) and pull the $\partial f/\partial t$ out front. Anyway, think about it like this. $t$ is a direct domain variable, but by setting $t_{direct} = t_{indirect}$ it's like we have two gears perfectly linked. Rotate one gear and the other rotates in unison. Change $t_{indirect}$ and $t_{direct}$ changes identically. Therefore, even though $f$ changes both directly and indirectly, our hands are only on the indirect parameter knob and we have a completely composite function. And we now know how to find the derivative via case 2 above.
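Here is a short Python sketch of this total derivative for the example $f = tx^2 + y - t$ from above. The parameterizations $g(t) = \sin t$ and $h(t) = t^3$ are assumptions I am adding just to have something concrete. It shows that the total derivative is the explicit partial $\partial f/\partial t$ plus the indirect contributions through $x$ and $y$, and that the explicit partial alone gives a different number.

```python
import math

def f(t, x, y):
    # The example from the text: f = t*x^2 + y - t.
    return t * x**2 + y - t

def g(t):
    # Hypothetical x = g(t), chosen for illustration.
    return math.sin(t)

def h(t):
    # Hypothetical y = h(t), chosen for illustration.
    return t**3

def partial_t(t):
    # Explicit (direct) dependence only: df/dt with x and y held fixed.
    x = g(t)
    return x**2 - 1.0

def total_derivative(t):
    # Extended gradient <df/dt, df/dx, df/dy> dotted with <dt/dt, g'(t), h'(t)>.
    x = g(t)
    df_dt, df_dx, df_dy = x**2 - 1.0, 2 * t * x, 1.0
    return df_dt * 1.0 + df_dx * math.cos(t) + df_dy * 3 * t**2

def numeric(t, eps=1e-6):
    # Central difference of t -> f(t, g(t), h(t)).
    hi = f(t + eps, g(t + eps), h(t + eps))
    lo = f(t - eps, g(t - eps), h(t - eps))
    return (hi - lo) / (2 * eps)

print(partial_t(0.5), total_derivative(0.5), numeric(0.5))
```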

No, no and no: they are very different things. The derivative (also called differential) is the best linear approximation at a point. The directional derivative is a one-dimensional object that describes the "infinitesimal" variation of a function at a point only along a prescribed direction. I will not write down the definitions here.

So to speak, the directional derivative gives you information about the local behavior of a function restricted to a straight line. The derivative gives you information about the local behavior of a function in a whole neighborhood of some point.

There are classical theorems describing the interplay between the two objects. In particular, a differentiable function possesses all directional derivatives (which you compute by applying the derivative to the direction vector). Conversely, a function can possess all directional derivatives and nevertheless fail to be differentiable.
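To illustrate the last point, here is a Python sketch using a standard textbook counterexample, which is my own choice and not something from the question: $f(x,y) = x^2 y/(x^4 + y^2)$ with $f(0,0) = 0$. Along every straight line through the origin the difference quotient converges, so all directional derivatives exist there. Along the parabola $y = x^2$, however, the function is constantly $1/2$, so it is not even continuous at the origin, let alone differentiable.

```python
def f(x, y):
    # Standard counterexample (an assumption of this sketch), with f(0, 0) = 0.
    if x == 0.0 and y == 0.0:
        return 0.0
    return x**2 * y / (x**4 + y**2)

def quotient(a, b, t):
    # Difference quotient at the origin along the direction (a, b).
    # Since f(0, 0) = 0, this is just f(t*a, t*b) / t.
    return f(t * a, t * b) / t

# Along straight lines the quotient settles down as t -> 0
# (the limit is a^2 / b when b != 0, and 0 when b == 0).
for t in (1e-1, 1e-2, 1e-4):
    print(t, quotient(1.0, 1.0, t), quotient(1.0, 2.0, t), quotient(1.0, 0.0, t))

# Along the parabola y = x^2 the values do not approach f(0, 0) = 0.
for x in (1e-1, 1e-2, 1e-3):
    print(x, f(x, x**2))
```

Note also that the limit $a^2/b$ is not linear in $(a, b)$, so no gradient vector could reproduce these directional derivatives through a dot product.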

In your case it seems to me that you are applying the first result: the directional derivative of a function $z=z(x,y)$ along a vector $\vec{v}$ is simply $$ \frac{\partial z}{\partial \vec{v}} = \nabla z \cdot \vec{v}, $$ where $\nabla z$ is the gradient, i.e. the vector that represents the (total) derivative.

Total differential

Let's say you have a function $z=f(x,y)$. The total differential is defined as: $$dz=\frac{\partial f}{\partial x}\Delta x + \frac{\partial f}{\partial y}\Delta y$$ In words: for an increase $\Delta x$ of $x$ at the point $x_0$ and an increase $\Delta y$ of $y$ at the point $y_0$, the total differential approximates, to first order, the resulting increase $\Delta z$ of the value of your function $f(x,y)$.
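A small Python sketch of how good this approximation is, using a made-up function $f = x^2 y$ at the point $(1, 2)$ (both are assumptions for illustration). It compares the actual increase of $f$ with the value given by the formula above for shrinking steps.

```python
def f(x, y):
    # Hypothetical example function.
    return x**2 * y

def total_differential(x0, y0, dx, dy):
    # (df/dx) * dx + (df/dy) * dy, with partials evaluated at (x0, y0).
    return 2 * x0 * y0 * dx + x0**2 * dy

def error(step):
    # Gap between the actual increase and the linear estimate
    # when both x and y are increased by `step` from (1, 2).
    actual = f(1.0 + step, 2.0 + step) - f(1.0, 2.0)
    return abs(actual - total_differential(1.0, 2.0, step, step))

for step in (0.1, 0.01, 0.001):
    print(step, error(step))
```

In this example the error shrinks roughly a hundredfold each time the step shrinks tenfold, which is what a first-order (linear) approximation should do.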

For the directional derivative, you'll have to understand the gradient of a function. The gradient of a function is a vector that points in the direction in which the increase per unit of distance is at its maximum.

Gradient

The gradient of a function $f(x,y)$ at the point $(x_0,y_0)$ is a vector defined as: $$\operatorname{grad}(f) = \overrightarrow{\nabla f} = \frac{\partial f}{\partial x}\overrightarrow{e_x} + \frac{\partial f}{\partial y}\overrightarrow{e_y}$$ where the partial derivatives are evaluated at $(x_0,y_0)$ and $\overrightarrow{e_i}$ denotes the $i$-th unit vector of the standard basis.

The directional derivative

The directional derivative can be defined as the increase of $f$ per unit of distance in a given direction, where $\alpha$ is the angle between that direction and the gradient:

$$\frac{df}{ds}=|\overrightarrow{\nabla f}|\cos(\alpha)$$
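The cosine formula is the same thing as the dot product of the gradient with a unit vector in the chosen direction, provided $\alpha$ is read as the angle between that direction and the gradient. A minimal Python sketch, where the gradient value $(3, 4)$ is a made-up number for illustration:

```python
import math

# Hypothetical gradient value at some point, for illustration only.
grad = (3.0, 4.0)

def via_dot(theta):
    # grad . u, where u is the unit vector at angle theta from the x axis.
    u = (math.cos(theta), math.sin(theta))
    return grad[0] * u[0] + grad[1] * u[1]

def via_cos(theta):
    # |grad| * cos(alpha), with alpha the angle between u and the gradient.
    alpha = theta - math.atan2(grad[1], grad[0])
    return math.hypot(grad[0], grad[1]) * math.cos(alpha)

for theta in (0.0, 1.0, 2.5):
    print(theta, via_dot(theta), via_cos(theta))

# Pointing along the gradient (alpha = 0) gives the largest value, |grad|.
steepest = via_dot(math.atan2(grad[1], grad[0]))
print(steepest)
```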

One way you can look at it is to view $\langle \partial f/\partial x, \partial f/\partial y \rangle$ as the direction of maximum change of your function. If you take any other direction, the change will be smaller. Here $f$ is assumed to be a differentiable function.

The total differential and the directional derivative are a bit different. If you have a scalar function and you take the total derivative in the strict sense, it is a scalar value, whereas the directional derivative involves vectors.

Let's give an example:

say $f(x,y) = xy$

The total derivative is $df=\frac{\partial f}{\partial x} dx+ \frac {\partial f}{\partial y}dy = ydx+xdy$

Whereas the directional derivative in some arbitrary direction $\vec n$ is defined as $\langle \partial f/\partial x, \partial f/\partial y \rangle \cdot \vec n$. I am assuming the $x,y$ plane here.
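To connect the two for this example, here is a Python sketch with numbers I picked for illustration: the point $(2, 3)$ and the displacement $(\Delta x, \Delta y) = (0.1, -0.2)$. Plugging the displacement into $df$ gives the same number as the directional derivative along the unit vector $\vec n$ in that direction, multiplied by the length of the displacement.

```python
import math

def f(x, y):
    # The example from the text: f(x, y) = x*y.
    return x * y

def grad_f(x, y):
    # (df/dx, df/dy) = (y, x).
    return (y, x)

x0, y0 = 2.0, 3.0      # hypothetical base point
dx, dy = 0.1, -0.2     # hypothetical displacement

g = grad_f(x0, y0)

# Total differential evaluated on the displacement: y*dx + x*dy.
df = g[0] * dx + g[1] * dy

# Directional derivative along the unit vector n = (dx, dy) / |(dx, dy)|.
length = math.hypot(dx, dy)
n = (dx / length, dy / length)
dir_deriv = g[0] * n[0] + g[1] * n[1]

# Actual change of f, for comparison with the linear estimate df.
actual = f(x0 + dx, y0 + dy) - f(x0, y0)

print(df, length * dir_deriv, actual)
```

So the directional derivative is a rate (change per unit distance), while $df$ applied to a displacement is an estimated amount of change; they differ by the factor $|(\Delta x, \Delta y)|$.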