Can someone give me a deeper understanding of implicit differentiation?

I'm doing calculus and I want to be an engineer so I would like to understand the essence of the logic of implicit differentials rather than just memorizing the algorithm. Yes, I could probably memorize it and get a 100% on a test, but it means nothing unless I understand it and can acquire a practical understanding of it. I would really appreciate if someone can enlighten me.

What I understand already:

I understand how the derivatives of normal functions are found the long way... i.e.

$ f'(x)=\lim_{h\to 0}\frac{f(x+h)-f(x)}{(x+h)-x} $

That is, I understand that the derivative is the limit of the above quotient as $h$, representing the distance between the two points, approaches 0.

I also understand that the chain rule, quotient rule, etc. are just algorithms that speed up the process of finding the derivative. I don't need a proof for that.

An example problem

When we do implicit differentiation problems such as this one:

A ladder 8.5 m long is leaning against a wall. The bottom of the ladder is 6.0 m from the wall and is sliding away from the wall at a rate of 2.5 m/s.

$x^2 + y^2 = h^2$ (Pythagorean theorem) ($x$ is the x value, $y$ is the y value, $h$ is the hypotenuse). We can find that $y = \sqrt{8.5^2 - 6.0^2} \approx 6.0$ m.

The derivative is $2x\frac{\mathrm{d} x}{\mathrm{d} t} + 2y\frac{\mathrm{d} y}{\mathrm{d} t} = 2h\frac{\mathrm{d} h}{\mathrm{d} t}$

What I don't understand: (although I can do them by memorizing the algorithm)

The implicit differentiation itself. How do we relate all the terms to the change in time? I mean, how do we know we can take that equation's derivative with respect to $t$? Would it be possible to take them all with respect to the change in $x$? If so, please demonstrate; I think that would help a lot, as my biggest lack of understanding is how to know what the bottom term should be for each derivative term.

I really appreciate any help, thanks.


Solution 1:

Don't treat implicit differentiation as an idea that is distinct from "regular" differentiation. Just as the chain rule is involved in every derivative, and is not a separate rule that governs only certain situations, e.g. $$y=x^2=(x)^2$$ $$ \begin{align} \frac{d}{dx}(y)&=\frac{d}{dx}(x)^2\\ &=2(x)^1\frac{d}{dx}(x)\\ &=2x\cdot 1\\ &=2x \end{align} $$ implicit differentiation is something that you're doing all the time, you just don't see it as such. For example, when you differentiate $y=3x^2+2$, view it as differentiating both sides with respect to the symbol $x$: $$ \begin{align} \frac{d}{dx}(y)&=\frac{d}{dx}(3x^2+2)\\ &=\frac{d}{dx}(3x^2)+\frac{d}{dx}(2)\\ &=3\frac{d}{dx}(x^2)+0\\ &=3\cdot 2(x)^1\frac{d}{dx}(x)\quad\text{ (via chain rule)}\\ &=6\cdot x\cdot 1\\ &=6x \end{align} $$ I think that there are two things that aid in understanding, both of which treat differentiation in a symbolic manner.

  1. The first is to realize that nothing changes if the equation is rearranged, for example if we change it to $y-3x^2=2$, the same process works. Differentiating both sides with respect to the symbol $x$ gives: $$ \begin{align} \frac{d}{dx}(y-3x^2)&=\frac{d}{dx}(2)\\ \frac{d}{dx}(y)-\frac{d}{dx}(3x^2)&=\frac{d}{dx}(2)\\ \frac{d}{dx}(y)-6x\frac{d}{dx}(x)&=0\\ \frac{dy}{dx}&=6x \end{align} $$
  2. The second is to realize (or treat) $x$ as simply being an arbitrary symbol. That is, think of $\dfrac{d}{dx}$ as $\dfrac{d}{d\square}$. The operations above work just the same. We would end up with: $$ \frac{d}{d\square}(y)-6x\frac{d}{d\square}(x)=0 $$ Then, if $\square =x$ we get (as above) $$ \frac{dy}{dx}=6x $$ while if $\square=y$ we get $$ \begin{align} \frac{d}{dy}(y)-6x\frac{d}{dy}(x)&=0\\ 1-6x\frac{dx}{dy}&=0\\ \frac{dx}{dy}&=\frac{1}{6x} \end{align} $$ Finally, if $\square=t$ then no "cancellation" occurs and we get $$ \frac{d}{dt}(y)-6x\frac{d}{dt}(x)=0 $$

The only other thing to remember is that the chain rule needs to be applied to all symbols as in: $$ \frac{d}{d\square}(y)^2=2(y)^1\frac{d}{d\square}(y) $$
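For readers who like to check symbolic claims numerically, here is a minimal sketch of the $\square=t$ and $\square=y$ cases for $y=3x^2+2$. The motion $x(t)=\sin t+2$, the sample time `t0`, and the helper `derivative` are assumptions made purely for illustration; any smooth $x(t)$ with nonzero rate should behave the same way.

```python
import math

def derivative(f, t, eps=1e-6):
    """Central-difference estimate of f'(t) (hypothetical helper)."""
    return (f(t + eps) - f(t - eps)) / (2 * eps)

# Assumed motion, chosen only for illustration: x depends on t.
def x(t):
    return math.sin(t) + 2.0

# y is tied to x by the relation y = 3x^2 + 2 from the answer.
def y(t):
    return 3.0 * x(t) ** 2 + 2.0

t0 = 0.7  # arbitrary sample time
dx_dt = derivative(x, t0)
dy_dt = derivative(y, t0)

# Box = t: d/dt(y) - 6x d/dt(x) should be (numerically) zero.
residual = dy_dt - 6.0 * x(t0) * dx_dt

# Box = y: dx/dy = 1/(6x); compare with the ratio of the two t-rates.
dx_dy_ratio = dx_dt / dy_dt
dx_dy_formula = 1.0 / (6.0 * x(t0))

print(residual, dx_dy_ratio, dx_dy_formula)
```

The residual should come out at the level of rounding error, and the two values for $dx/dy$ should agree to several digits, which is the numerical counterpart of the "cancellation" described in item 2.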

Solution 2:

We are describing a physical process that happens over some time: the ladder is sliding down. Therefore, it is reasonable to assume that the observables $x,y,h$ depend on the time $t$ and (natura non facit saltus, at least before the introduction of quantum theory) that these dependencies are smooth (continuous and differentiable). By the physical interpretation it is clear that at each time $t$, only one specific value of $x$ and $y$ and $h$ is valid. On the other hand, we are not guaranteed a priori that for each $x$ there is only one $y$ and one $h$ (the ladder might swing back and forth while changing its shape instead of just monotonically sliding down; under such circumstances several $y$-values might correspond to one $x$, which forbids us to view $y$ as a function of $x$).

So once we have established that we can work with functions $x(t), y(t),h(t)$, then $x^2+y^2=h^2$ becomes an equation of functions (because it holds for all $t$). As the derivative of a function depends only on the function and not on the expression we used to write the function down, we are allowed to apply the differential operator $\frac{\mathrm d}{\mathrm dt}$ to both sides. This is just the same as adding the same value to both sides of an equation or multiplying both sides by the same value.

$$ x(t)^2+y(t)^2=h(t)^2\quad\text{for all }t$$ implies $$\frac{\mathrm d}{\mathrm dt}\left( x(t)^2+y(t)^2\right)=\frac{\mathrm d}{\mathrm dt}\left(h(t)^2\right)\quad\text{for all }t.$$ The rest is just application of rules of differentiation (e.g. chain rule) to arrive at $$2x(t)\frac{\mathrm dx(t)}{\mathrm dt}+2y(t)\frac{\mathrm dy(t)}{\mathrm dt}=2h(t)\frac{\mathrm dh(t)}{\mathrm dt}\quad\text{for all }t.$$ In this specific problem, additionally we know that $h$ is constant, that is $h'=0$. This allows us to solve for $\frac{\frac{\mathrm dy(t)}{\mathrm dt}}{\frac{\mathrm dx(t)}{\mathrm dt}}$, which is in fact the same as $\frac{\mathrm dy(x)}{\mathrm dx}$ (naively by cancelling terms, but that can be made rigorous).
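To make this concrete, here is a rough numerical sketch using the numbers from the question (ladder length 8.5 m, foot 6.0 m from the wall, foot moving at 2.5 m/s). The names `H`, `X0`, `DX_DT`, and `height`, and the assumption that $x(t)$ is locally linear in $t$ for the finite-difference check, are illustrative choices and not part of the original problem statement.

```python
import math

H = 8.5        # ladder length in m (constant, so dh/dt = 0)
X0 = 6.0       # distance of the foot from the wall in m
DX_DT = 2.5    # rate at which the foot slides away, in m/s

def height(x):
    """Height of the top of the ladder, from x^2 + y^2 = H^2."""
    return math.sqrt(H**2 - x**2)

y0 = height(X0)

# From 2x dx/dt + 2y dy/dt = 0 (the differentiated relation with dh/dt = 0):
dy_dt = -(X0 / y0) * DX_DT

# Differentiating with respect to x instead gives 2x + 2y dy/dx = 0:
dy_dx = -X0 / y0

# Finite-difference check, assuming x(t) = X0 + DX_DT * t near t = 0.
dt = 1e-6
dy_dt_numeric = (height(X0 + DX_DT * dt) - height(X0 - DX_DT * dt)) / (2 * dt)

print(round(y0, 2), round(dy_dt, 2), round(dy_dx, 3))
```

With these inputs the height comes out at about 6.02 m and $\frac{\mathrm dy}{\mathrm dt}$ at about $-2.49$ m/s (negative because the top of the ladder moves down). Note that `dy_dt` equals `dy_dx * DX_DT`, which is the "cancelling" relation between the $t$-derivatives and the $x$-derivative mentioned above.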

Solution 3:

For me, one of the keys to understanding implicit differentiation is that the variables in your example (and in all other implicit differentiation problems) are all dependent variables (meaning functions of another, independent variable). In your particular problem it would be time, $t$, that acts as the independent variable for each of the dependent variables. The $t$ is not an arbitrary choice; it comes from the fact that those three quantities ($x, y, h$) change over time (meaning they are functions of time, or depend on time) in this specific physical situation.

So think of $x, y, h$ not as variables, but functions: $x(t), y(t), h(t)$. Now you are just applying the chain rule when differentiating them (the algorithm that you have memorized).
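As a worked illustration of how the "bottom term" is decided, compare differentiating the same expression with respect to $t$ and with respect to $x$. The second line assumes that, near the moment of interest, $y$ can also be viewed as a function of $x$; that is an extra assumption, not something guaranteed in general.

$$ \begin{align} \frac{d}{dt}\left[x(t)^2+y(t)^2\right]&=2x(t)\frac{dx}{dt}+2y(t)\frac{dy}{dt}\\ \frac{d}{dx}\left[x^2+y(x)^2\right]&=2x\frac{dx}{dx}+2y(x)\frac{dy}{dx}=2x+2y\frac{dy}{dx} \end{align} $$

In both lines the chain rule produces a factor "derivative of the inner function with respect to the chosen variable"; the only difference is that $\frac{dx}{dx}=1$, so that factor disappears when the chosen variable is $x$ itself.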

I'm not certain if this is helpful in your understanding, but ask some follow up questions in the comments and I can edit or answer further in the comments.