Derivatives 101: what does "with respect to" mean?

I'm studying derivatives 101 and I can't get my head around the phrasing "with respect to" something.

E.g., in the chain rule we calculate the derivative of the outer function with respect to the inner one, times the derivative of the inner function with respect to x. But what does it actually mean (in human language) to say a derivative is "with respect to" anything at all?

Thanks a ton.


If a function depends on only one variable, then its derivative is necessarily 'with respect to' that variable, so there is no need to say which one we are talking about.

With two variables the distinction becomes clearer. For $f(x,y)$, the derivative with respect to $x$ is $\frac{\partial f}{\partial x}$ and the derivative with respect to $y$ is $\frac{\partial f}{\partial y}$. For example, for $$ f(x,y) = x + y^2 $$ we get $$ \frac{\partial f}{\partial x} = 1 \qquad \frac{\partial f}{\partial y} = 2y $$ and we can see these quantities are not the same. The derivative with respect to $x$ answers "at what rate does $f$ change as $x$ changes, with $y$ held fixed?"; in this case it is a constant, $1$. The derivative with respect to $y$ answers "at what rate does $f$ change as $y$ changes, with $x$ held fixed?"; here it is $2y$.
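To see the "rate of change as one variable moves" reading concretely, here is a minimal numerical sketch. The helper `partial`, the step size `h`, and the sample point $(3, 5)$ are choices made for this illustration only; it nudges one argument at a time and measures how much $f$ responds.

```python
def f(x, y):
    # The example function from the answer: f(x, y) = x + y^2
    return x + y**2

def partial(func, point, index, h=1e-6):
    """Central-difference estimate of the derivative of func
    with respect to argument number `index`, holding the others fixed."""
    up = list(point)
    down = list(point)
    up[index] += h      # nudge only the chosen variable upward
    down[index] -= h    # and downward
    return (func(*up) - func(*down)) / (2 * h)

point = (3.0, 5.0)
df_dx = partial(f, point, 0)  # vary x only; should be close to 1
df_dy = partial(f, point, 1)  # vary y only; should be close to 2*y = 10
print(df_dx, df_dy)
```

The same function gives two different answers depending on which variable is nudged, which is all that "with respect to" is keeping track of.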

I hope that is what you are looking for.

Note: Hurkyl's comments below are very important. In this instance we have to use a slightly different notation, $\partial$, for the derivative when there is more than one parameter, because there may be co-dependence between the parameters. I had originally intended to keep the explanation simple, as you had indicated it was 'derivatives 101'.
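As a small illustration of why the co-dependence matters, suppose (purely as an assumed example) that in $f(x,y) = x + y^2$ the variable $y$ is itself tied to $x$ by $y = x^2$. Holding $y$ fixed gives the partial derivative, while letting $y$ move along with $x$ gives a different rate:

$$ \frac{\partial f}{\partial x} = 1 \qquad \frac{\mathrm{d}f}{\mathrm{d}x} = \frac{\partial f}{\partial x} + \frac{\partial f}{\partial y} \frac{\mathrm{d}y}{\mathrm{d}x} = 1 + 2y \cdot 2x = 1 + 4x^3 $$

This agrees with substituting first: $f = x + x^4$, whose ordinary derivative is $1 + 4x^3$.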


In a variable-centric approach to algebra, one generally works with a collection of interdependent variables.

For example, consider modeling a problem where you have a point that loops counter-clockwise around the unit circle at a constant rate. One might introduce three separate variables:

  • $t$, the current time
  • $x$, the first coordinate of the point's position
  • $y$, the second coordinate of the point's position

and these are interrelated by various equations:

$$ x = \cos(t) \qquad y = \sin(t) \qquad x^2 + y^2 = 1 $$

There is a gadget called a "differential", which we write as $\mathrm{d}x$, which captures the idea of "the rate at which $x$ varies", in an absolute way.

Now, a differential is not a number; it's a new kind of mathematical gadget. Introductory calculus classes generally don't teach it; instead they want a way to do calculus that avoids working with them.

It turns out that differentials satisfy analogs of the basic derivative laws; the differentials of the above equations are

$$ \mathrm{d}x = -\sin(t) \, \mathrm{d}t \qquad \mathrm{d}y = \cos(t) \, \mathrm{d}t \qquad 2x \, \mathrm{d}x + 2y \, \mathrm{d}y = 0 $$

The main thing to observe here is that, in this situation, the differentials are all proportional to one another; we can meaningfully ask for their ratios, and get an informative result that doesn't involve differentials at all in their expression. We can solve these equations to get

$$ \frac{\mathrm{d}x}{\mathrm{d}t} = -\sin(t) \qquad \frac{\mathrm{d}y}{\mathrm{d}t} = \cos(t) \qquad \frac{\mathrm{d}y}{\mathrm{d}x} = -\frac{x}{y} $$

Since each ratio compares the rates at which two variables change, these derivatives are, respectively, the "derivative of $x$ with respect to $t$", the "derivative of $y$ with respect to $t$", and the "derivative of $y$ with respect to $x$".

In these terms, the chain rule is simply the algebraic property of chaining proportions. And you can check, for example,

$$ \frac{\mathrm{d}y}{\mathrm{d}t} = \frac{\mathrm{d}y}{\mathrm{d}x} \frac{\mathrm{d}x}{\mathrm{d}t} = \left(-\frac{x}{y}\right) \cdot (-\sin(t)) = \frac{\cos(t)}{\sin(t)} \cdot \sin(t) = \cos(t) $$
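The ratios above can also be checked numerically. The following is a rough sketch, assuming an arbitrary sample time `t0 = 0.7` and a hypothetical helper `derivative` that estimates a rate of change by a central difference; neither comes from the original answer.

```python
import math

def derivative(func, t, h=1e-6):
    """Central-difference estimate of the rate of change of func at t."""
    return (func(t + h) - func(t - h)) / (2 * h)

t0 = 0.7                               # arbitrary sample time (assumption)
x0, y0 = math.cos(t0), math.sin(t0)    # position on the unit circle

dx_dt = derivative(math.cos, t0)       # rate of x with respect to t
dy_dt = derivative(math.sin, t0)       # rate of y with respect to t

ratio = dy_dt / dx_dt                  # dy/dx as a ratio of the two rates
slope = -x0 / y0                       # dy/dx from 2x dx + 2y dy = 0

print(ratio, slope)                    # the two values should nearly agree
print(slope * dx_dt, dy_dt)            # chain rule: (dy/dx)(dx/dt) vs dy/dt
```

Dividing the two time-rates recovers the slope $-x/y$, and multiplying that slope by $\mathrm{d}x/\mathrm{d}t$ gives back $\mathrm{d}y/\mathrm{d}t$, which is the "chaining proportions" reading of the chain rule.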

Now, to connect it all back to the familiar definition of the derivative in terms of functions, we have the following theorem:

Theorem: If $x$ and $y$ are related by the equation $y = f(x)$ where $f$ is differentiable, then $$\frac{\mathrm{d}y}{\mathrm{d}x} = f'(x) $$
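To tie this back to the chain rule as phrased in the question, here is a sketch using names introduced only for illustration: an intermediate variable $u = g(x)$ (the "inner" function) and $y = f(u)$ (the "outer" function). The theorem gives $\frac{\mathrm{d}y}{\mathrm{d}u} = f'(u)$ and $\frac{\mathrm{d}u}{\mathrm{d}x} = g'(x)$, and chaining the proportions gives

$$ \frac{\mathrm{d}y}{\mathrm{d}x} = \frac{\mathrm{d}y}{\mathrm{d}u} \frac{\mathrm{d}u}{\mathrm{d}x} = f'(g(x)) \, g'(x) $$

For instance, with $y = \sin(x^2)$ and $u = x^2$:

$$ \frac{\mathrm{d}y}{\mathrm{d}u} = \cos(u) \qquad \frac{\mathrm{d}u}{\mathrm{d}x} = 2x \qquad \frac{\mathrm{d}y}{\mathrm{d}x} = 2x \cos(x^2) $$

So "the derivative of the outer function with respect to the inner" is $\frac{\mathrm{d}y}{\mathrm{d}u}$: the rate at which $y$ changes compared to $u$, rather than compared to $x$.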