What do "function of" and "differentiate with respect to" mean?

Solution 1:

As a student of math and physics, this has been one of the biggest annoyances for me; I'll give my two cents on the matter. Throughout my entire answer, whenever I use the term "function", it will always mean in the usual math sense (a rule with a certain domain and codomain blablabla).

I generally find two ways in which people use the phrase "... is a function of ..." The first is as you say: "$f$ is a function of $x$" simply means that for the remainder of the discussion, we shall agree to denote the input of the function $f$ by the letter $x$. This is just a notational choice as you say, so there's no real math going on. We just make this choice of notation to in a sense "standardize everything". Of course, we usually allow for variants on the letter $x$. So, we may write things like $f(x), f(x_0), f(x_1), f(x'), f(\tilde{x}), f(\bar{x})$ etc. The way to interpret this is as usual: this is just the result obtained by evaluating the function $f$ on a specific element of its domain.

Also, you're right that the input label is completely arbitrary, so we can say $f(t), f(y), f(\ddot{\smile})$ or whatever else we like. But again, it is often convenient to reserve certain letters for certain purposes (this makes for easier reading and reduces notational conflicts), and as much as possible it is a good idea to conform to widely used notation. At the end of the day, math is about communicating ideas, and one must find a balance between absolute precision and rigour on the one hand and clarity/flow of thought on the other.


btw as a side remark, I think I am a very very very nitpicky individual regarding issues like: $f$ vs $f(x)$ for a function, I'm also always careful to use my quantifiers properly etc. However, there have been a few textbooks I glossed over, which are also extremely picky and explicit and precise about everything; but while what they wrote was $100 \%$ correct, it was difficult to read (I had to pause often etc). This is as opposed to some other books/papers which leave certain issues implicit, but convey ideas more clearly. This is what I meant above regarding balance between precision and flow of thought.


Now, back to the issue at hand. In your third and fourth paragraphs, I think you have made a couple of true statements, but you're missing the point. (One of) the job(s) of any scientist is to quantitatively describe and explain observations made in real life. For example, you introduced the example of the amount of wax burnt, $w$. If all you wish to do is study properties of functions which map $\Bbb{R} \to \Bbb{R}$ (or subsets thereof), then there is clearly no point in calling $w$ the wax burnt or whatever.

But given that you have $w$ as the amount of wax burnt, the most naive model for describing how this changes is to assume that the flame which is burning the wax is kept constant and all other variables are kept constant etc. Then, clearly the amount of wax burnt will only depend on the time elapsed. From the moment you start your measurement/experiment process, at each time $t$, there will be a certain amount of wax burnt off, $w(t)$. In other words, we have a function $w: [0, \tau] \to \Bbb{R}$, where the physical interpretation is that for each $t \in [0, \tau]$, $w(t)$ is the amount of wax burnt off $t$ units of time after starting the process. Let's for the sake of definiteness say that $w(t) = t^3$ (with the above domain and codomain).


"Sure, $w$ only has the interpretation we think it does (cumulative amount of wax burnt) when we provide a (real number in the domain of definition, which we interpret as) time as its argument"

True.

"...Sure, we can't really interpret $w$ in the same way if I did this, but there is nothing in the definition of w which stops me from doing this."

Also true.

But here's where you're missing the point. If you didn't want to give a physical interpretation of what elements in the domain and target space of $w$ mean, why would you even talk about the example of burning wax? Why not just tell me the following:

Fix a number $\tau > 0$, and define $w: [0, \tau] \to \Bbb{R}$ by $w(t) = t^3$.

This is a perfectly self-contained mathematical statement. And now, I can tell you a bunch of properties of $w$. Such as:

  • $w$ is an increasing function
  • For all $t \in [0, \tau]$, $w'(t) = 3t^2$ (derivatives at end points of course are interpreted as one-sided limits)
  • $w$ has exactly one root (of multiplicity $3$) on this interval of definition.

(and many more other properties). So, if you want to completely forget about the physical context, and just focus on the function and its properties, then of course you can do so. Sometimes, such an abstraction is very useful as it removes any "clutter".
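The listed properties can be checked numerically. The following is only a sketch: the value of `tau`, the helper `central_difference`, and the sample points are assumptions made for illustration, not part of the original statement.

```python
def w(t: float) -> float:
    """The model from the text: w(t) = t^3 on [0, tau]."""
    return t ** 3


def central_difference(f, t: float, h: float = 1e-6) -> float:
    """Approximate f'(t) with a symmetric difference quotient."""
    return (f(t + h) - f(t - h)) / (2 * h)


tau = 2.0  # assumed value; the text only requires tau > 0
# Interior sample points, so t - h and t + h stay inside [0, tau].
samples = [tau * i / 10 for i in range(1, 10)]

# Increasing: each sampled value is strictly larger than the previous one.
is_increasing = all(w(a) < w(b) for a, b in zip(samples, samples[1:]))

# Derivative: the numerical slope should agree with 3 t^2 up to rounding.
max_error = max(abs(central_difference(w, t) - 3 * t ** 2) for t in samples)

print(is_increasing, max_error < 1e-6)
```

Nothing in this check refers to wax or time; it only uses the rule $t \mapsto t^3$ and its domain.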

However, I really don't think it is (always) a good idea to completely disconnect mathematical ideas from their physical origins/interpretations. And the reason that in the sciences people often assign such interpretations is because their purpose is to use the powerful tool of mathematics to quantitatively model an actual physical observation.

So, while you have made a few technically true statements in your third and fourth paragraphs, I believe you've missed the point of why people assign physical meaning to certain quantities.


For your fifth paragraph however, I agree with the sentiment you're describing, and questions like this have tortured me. You're right that $w$ is a function of a single variable (where in this physical context, we interpret the arguments as time). If you now ask me how does $w$ change in relation to the distance I have started to walk, then I completely agree that there is no relation whatsoever.

But what is really going on is a terrible, annoying, confusing abuse of notation, where we use the same letter $w$ to have two different meanings. Physicists love such abuse of notation, and this has confused me for so long (and it still does from time to time). Of course, the intuitive idea of why the amount of wax burnt should depend on distance is clear: the further I walk, the more time has passed, and hence the more wax has burnt. So, this is really a two step process.

To formalize this, we need to introduce a second function $\gamma$ (between certain subsets of $\Bbb{R}$), where the interpretation is that $\gamma(x)$ is the time taken to walk a distance $x$. Then when we (by abuse of language) say $w$ is a function of distance, what we really mean is that

The composite function $w \circ \gamma$ has the physical interpretation that for each $x \in \text{domain}(\gamma)$, $(w \circ \gamma)(x)$ is the amount of wax burnt when I walk a distance $x$.

Very often, this composition is not made explicit. In the Leibniz chain rule notation \begin{align} \dfrac{dw}{dx} &= \dfrac{dw}{dt} \dfrac{dt}{dx} \end{align} where on the LHS $w$ is miraculously a function of distance, even though on the RHS (and initially) $w$ was a function of time, what is really going on is that the $w$ on the LHS is a complete abuse of notation. And of course, the precise way of writing it is $(w \circ \gamma)'(x) = w'(\gamma(x)) \cdot \gamma'(x)$.

In general, whenever you initially have a function $f$ "as a function of $x$" and then suddenly it becomes a "function of $t$", what is really meant is that we are given two functions $f$ and $\gamma$; and when we say "consider $f$ as a function of $x$", we really mean to just consider the function $f$, but when we say "consider $f$ as a function of time", we really mean to consider the (completely different) function $f \circ \gamma$.
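To make the hidden composition concrete, here is a small sketch. The walking model `gamma` (constant speed of 1.5 distance units per time unit), the helpers `compose` and `derivative`, and the sample distance are all assumptions chosen for illustration.

```python
def w(t: float) -> float:
    """Wax burnt after time t (the model w(t) = t^3 from above)."""
    return t ** 3


def gamma(x: float) -> float:
    """Time taken to walk a distance x, assuming a constant speed of 1.5."""
    return x / 1.5


def compose(f, g):
    """Return the composite function f o g."""
    return lambda x: f(g(x))


def derivative(f, x: float, h: float = 1e-6) -> float:
    """Approximate f'(x) with a symmetric difference quotient."""
    return (f(x + h) - f(x - h)) / (2 * h)


# "w as a function of distance" is really this different function:
w_of_distance = compose(w, gamma)

x = 6.0  # a distance, so gamma(x) = 4.0 is the corresponding time

# Feeding a distance straight into w gives a number with no physical meaning.
print(w(x))              # 216.0
print(w_of_distance(x))  # 64.0

# Chain rule: (w o gamma)'(x) = w'(gamma(x)) * gamma'(x)
lhs = derivative(w_of_distance, x)
rhs = derivative(w, gamma(x)) * derivative(gamma, x)
print(abs(lhs - rhs) < 1e-5)
```

The two printed values differ because `w` and `w_of_distance` are different functions, even though the sloppy notation calls both of them $w$.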

Summary: if the arguments of a function suddenly change interpretations (e.g. from time to distance or really anything else) then you immediately know that the author is being sloppy/lazy in explicitly mentioning that there is a hidden composition.

Solution 2:

Excellent question. There are already good answers, so I'll try to make a few concise points.

Be nice to your readers

You should try to be nice to people reading and using your definitions, including your future self. It means that you should stick to conventions when possible.

Variable names imply domain and codomain

If you write that "$f$ is a function of $x$", readers will assume that it means that $f:\mathbb{R}\rightarrow\mathbb{R}$.

Similarly, if you write $f(z)$ it will imply that $f:\mathbb{C}\rightarrow\mathbb{C}$, and $f(n)$ might be for $f:\mathbb{N}\rightarrow\mathbb{Z}$.

It wouldn't be wrong to define $f:\mathbb{C}\rightarrow\mathbb{C}$ as $f(n)= \frac{in+1}{\overline{n}-i}$ but it would be surprising and might lead to incorrect assumptions (e.g. $\overline{n} = n$).

Free and bound variables

You might be interested in knowing the distinction between free and bound variables.

$$\sum_{k=1}^{10} f(k, n)$$

$n$ is a free variable and $k$ is a bound variable; consequently the value of this expression depends on the value of $n$, but there is nothing called $k$ on which it could depend.
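The same distinction shows up in code. In this sketch the body of `f` is a hypothetical choice (the text leaves $f$ unspecified); the point is only which names the result depends on.

```python
def f(k: int, n: int) -> int:
    """Hypothetical summand; any function of two arguments would do."""
    return k * n


def total(n: int) -> int:
    """The sum over k = 1..10 of f(k, n); n is free, k is bound by the sum."""
    return sum(f(k, n) for k in range(1, 11))


def total_renamed(n: int) -> int:
    """Same sum with the bound variable renamed from k to j."""
    return sum(f(j, n) for j in range(1, 11))


print(total(2), total(3))              # depends on n
print(total(2) == total_renamed(2))    # renaming the bound variable changes nothing
```

The caller of `total` supplies `n` and never sees `k`, which is exactly why the expression cannot depend on anything called $k$.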

Here's a related answer on StackOverflow.

"All models are wrong, some are useful", George Box

Your simplified amount of wax burnt as a function of time is probably wrong (it cannot perfectly know or describe the status of every atom) but it might at least be useful.

The amount of wax burnt as a function of "the distance you have walked since the candle was lit" will be even less correct and much less useful.

Physical variable names have meaning

Physical variable names are not just placeholders. They are linked to physical quantities and units. Replacing $l$ by $t$ as a variable name for a function will not just be surprising to readers, it will break dimensional homogeneity.

Solution 3:

Sometimes, especially in physical contexts, the view is not of functions acting on arguments but rather of constraints acting on variables. The simplest example is that maybe we have variables $w$ and $t$ representing the length of wax burned and the duration since the candle was lit respectively, and we observe the following relation: $$w=\left(1\,\frac{\text{meter}}{\text{second}}\right)\cdot t$$ You can imagine this as the implicit definition of a curve in a $w$-$t$ plane. It's legal to take "the derivative" of both sides to get: $$dw=\left(1\,\frac{\text{meter}}{\text{second}}\right) \cdot dt$$ where the items on either side are formally known as differential forms. Here, you can't just swap out variables because $w$ was not defined as a function - it is related to some other quantity in a fixed way! One can read this equation as saying that, no matter how we change the state, over a small enough amount of change, the amount of candle burned is proportional to the duration passed as long as this equation holds.

A somewhat more practical idea of this is to consider what would happen if we wanted to represent a point on the circle. We know that a point $(x,y)$ is only a valid state if $$x^2+y^2=1$$ and we can take the derivative of both sides to get $$2x\,dx+2y\,dy=0$$ or, simplifying $$x\,dx + y\,dy = 0$$ which essentially reads that, no matter how this system moves or what laws might dictate how $x$ and $y$ vary through time or any other parameter, for small changes, the sum of each coordinate times its instantaneous rate of change must be zero. We could also rearrange to $dx=\frac{-y}x\,dy$ which clarifies that the derivative of $x$ with respect to $y$ is $\frac{-y}x$, meaning that the changes $dx$ and $dy$ in these variables are proportional by this constant.

Note that we can also add more information freely; suppose that $x$ is actually varying in time and is given as $x=t^2$. Then $dx=2t\,dt$. We could substitute this in to the prior formula to find out that $$x\cdot(2t\,dt) + y\,dy = 2t^3\,dt+y\,dy = 0$$ in a perfectly rigorous fashion. Then, we can see that the derivative of $y$ with respect to $t$ is $\frac{-2t^3}y$ by rearranging to get $dy$ as the product of $dt$ by that expression. Notice how the variables are integral to this point of view: "the derivative of $x$" is perhaps an acceptable way to refer to $dx$, but that symbol tells you nothing; the idea of "derivative of $x$ with respect to $y$" tells you a meaningful relationship between $dx$ and $dy$ - which are objects in their own right (differential forms), rather than evaluations of $f'$ for some function $f$. This is actually a rather convenient way to do calculus - for instance, the fact that you can substitute in for anything (including $dx$) replaces both the chain rule and the formulas for integration by substitution, which makes calculus feel more like algebra.
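The result $dy = \frac{-2t^3}{y}\,dt$ can be sanity-checked numerically. This sketch assumes we stay on the upper half of the circle (so $y = \sqrt{1 - x^2} > 0$); the sample value of `t` and the helper `derivative` are illustrative choices.

```python
import math


def x_of_t(t: float) -> float:
    """The extra relation from the text: x = t^2."""
    return t ** 2


def y_of_t(t: float) -> float:
    """y on the upper half of the unit circle, given x = t^2."""
    return math.sqrt(1 - x_of_t(t) ** 2)


def derivative(f, t: float, h: float = 1e-6) -> float:
    """Approximate f'(t) with a symmetric difference quotient."""
    return (f(t + h) - f(t - h)) / (2 * h)


t = 0.5  # sample parameter value with x = 0.25 inside the circle
y = y_of_t(t)

numeric = derivative(y_of_t, t)   # slope of y against t, measured directly
predicted = -2 * t ** 3 / y       # coefficient read off from 2t^3 dt + y dy = 0

print(abs(numeric - predicted) < 1e-6)
```

Here the coefficient was obtained purely by substituting $dx = 2t\,dt$ into the relation between differentials, without ever solving for $y$ as a function of $t$; the explicit formula in `y_of_t` is only used to check the answer.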

Okay, but how does this relate to the idea of "function of" and "differentiate with respect to"? Well, whenever we have some expression of the form $$da=k\cdot db$$ where $a$ and $b$ and $k$ are variables, we might write that $k=\frac{da}{db}$ (which is an abuse of notation, not literal division - you cannot divide differential forms!) is the derivative of $a$ with respect to $b$ since it's the constant of proportionality relating the change of those variables. Similarly, expressions of the form $$a=f(b)$$ can often be read as saying that $a$ is a function of $b$ - in the very literal sense where "is" means "equals" and "a function" refers to $f$ and "of" refers to function application. These are still variables, but there's a function involved now, and we do indeed have $$da= f'(b)\,db$$ where $f'$ is the derivative of the (abstract) function $f$. Of course, if you consider $f$ as a function whose domain is the set of durations and whose codomain is the set of lengths, you will find that $f'$ carries units of speed by definition of the derivative - so there is still some concrete information in $f$, even if we could take some other duration $c$ and write $f(c)$ (though we wouldn't know that this was equal to anything of interest). Sometimes we even say $a$ is a function of $b$ if a relation like $a=f(b)$ just holds on some section of the space of states (e.g. if the coordinates are just restricted to be on some circle, where no relation like this holds globally).

Unless you are working in a one-dimensional space of states (as is the case for a circle or a line in the earlier examples), the derivative of one variable with respect to another needn't exist - which also indicates another meaning of "differentiate with respect to". For instance, suppose we wanted to consider a sphere: $$x^2+y^2+z^2=1$$ We can differentiate and rearrange to get that if $x\neq 0$ then $$dx = \frac{-y}{x}\,dy + \frac{-z}x\,dz$$ If we agree that $y$ and $z$ are the canonical coordinates, then the coefficients $\frac{-y}x$ and $\frac{-z}x$ are the derivatives of $x$ with respect to $y$ and $z$ respectively. This can also be thought of as a two step process where we look at the sets of states where the $z$ coordinate is fixed (which is then one dimensional) and find a coefficient of proportionality between $dx$ and $dy$ - noting that this meaning of the word does depend on the definition of $z$, so you have to actually choose a whole coordinate system to get any well-defined notion of "differentiate with respect to" out of multiple dimensions.
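The dependence on the whole coordinate system can be seen in a small numerical sketch on the unit sphere $x^2+y^2+z^2=1$, restricted to the patch $x > 0$. The second coordinate $u = y + z$ is a hypothetical alternative introduced only for this illustration, as are the helper names and the sample point.

```python
import math


def x_from_yz(y: float, z: float) -> float:
    """x on the patch x > 0 of the unit sphere, in the coordinates (y, z)."""
    return math.sqrt(1 - y ** 2 - z ** 2)


def x_from_yu(y: float, u: float) -> float:
    """The same x, in the hypothetical coordinates (y, u) with u = y + z."""
    return x_from_yz(y, u - y)


def partial_first(f, a: float, b: float, h: float = 1e-6) -> float:
    """Approximate the derivative of f in its first argument, second held fixed."""
    return (f(a + h, b) - f(a - h, b)) / (2 * h)


y, z = 0.3, 0.4           # sample state on the sphere
x = x_from_yz(y, z)
u = y + z

dx_dy_fixed_z = partial_first(x_from_yz, y, z)  # expected: -y / x
dx_dy_fixed_u = partial_first(x_from_yu, y, u)  # expected: (z - y) / x

print(dx_dy_fixed_z, -y / x)
print(dx_dy_fixed_u, (z - y) / x)
```

Both numbers are "the derivative of $x$ with respect to $y$" at the same state, yet they differ, because each one holds a different second coordinate fixed.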

In summary, a lot of this terminology arises because there are multiple formal viewpoints on calculus; you are largely writing about the view that calculus studies functions $\mathbb R\rightarrow\mathbb R$, but it is also valid to view calculus as studying variables defined on a space. This latter view better explains terms like "function of" and "derivative with respect to" which refer literally to variables that are not treated as functions.


Formal disclaimer: Largely, this view is associated to differential geometry where we have some differentiable manifold $M$ (i.e. a set with enough structure that we can do differential calculus on - like a curve or a surface) which represents the set of all possible states of a system (e.g. all the points on a circle or all the states that a burning candle passes through) and then each "variable" is a function $M\rightarrow\mathbb R$ that reads off some quality of that state (e.g. the $x$ coordinate or the amount of wax burned). Note that this is somewhat backwards from the functional view, since there is no separation between inputs and outputs and no parameterization of the manifold $M$ implied - and since one can work purely off of the relationships between these variables. However, note that this largely avoids the "function of what?" problem because our variables, though they are functions, are functions on a very meaningful domain: the set of legal states of a system - and while you might be able to parameterize these states by real numbers, these states needn't be thought of as real numbers. Even better is that we don't have to think of the codomain of variables as being $\mathbb R$ - for instance $w$ could be a map from $M$ to the space of lengths and $t$ could be a map to the space of durations, which can both be parameterized by real numbers, but inherently have units and are therefore not naturally equal to the real numbers. So, as is surprisingly common in mathematics, we have really just taken a function and said "we're going to call it a variable and use the notation we'd use for a real number", but everything works out like you'd expect, so it's okay. The point of view basically boils down to "we need to define $M$ in order to make this rigorous, but we will never mention it if we don't have to."

Formal disclaimer 2: Sometimes this notion is also used in connection with the study of differential algebras, which is fairly different from what is presented here, but it's unlikely that you'd encounter these things unless you were really looking for them, so don't worry about it.