Why don't standard analysis texts relax the injectivity requirement of the multivariable change of variables theorem?
In standard multivariable analysis texts, the change of variables for multivariable integration in Euclidean space is almost always stated for a $C^1$ diffeomorphism $\phi$, giving the familiar equation (for continuous $f$, say)
$$\int_{\phi(U)}f=\int_U(f\circ\phi)\cdot|\det D\phi|$$
Of course, this result by itself is not very useful in practice because a diffeomorphism is usually hard to come by. The better advanced calculus and multivariable analysis texts explain explicitly how the hypothesis that $\phi$ is injective with $\det D\phi\neq0$ can be relaxed to handle problems along sets of measure zero -- a result which is necessary for almost all practical applications of the theorem, starting with polar coordinates.
Despite offering this slight generalization, very few of the standard texts state that the situation can be improved further still: there is an analogous theorem for arbitrary $C^1$ mappings $\phi$, not just those that are injective everywhere except on a set of measure zero. We simply account for how many times a point in the image gets hit by $\phi$, giving
$$\int_{\phi(U)}f\cdot\,\text{card}(\phi^{-1})=\int_U(f\circ\phi)\cdot|\det D\phi|$$
where $\text{card}(\phi^{-1})$ denotes the function $x\mapsto\text{card}(\phi^{-1}(x))$ on $\phi(U)$, i.e. the number of points of $U$ that $\phi$ sends to $x$.
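A quick one-dimensional sanity check, offered only as an illustration and not taken from any of the texts discussed here: take $U=(-1,1)$, $\phi(t)=t^2$, and $f$ continuous on $[0,1]$. Then $\phi(U)=[0,1)$, and every $x\in(0,1)$ has exactly two preimages $\pm\sqrt{x}$, so $\text{card}(\phi^{-1})=2$ except at the single point $x=0$, which is a null set. Using the evenness of the integrand and the substitution $x=t^2$ on $(0,1)$,
$$\int_U(f\circ\phi)\cdot|\phi'|=\int_{-1}^{1}f(t^2)\,|2t|\,dt=2\int_0^1 f(t^2)\,2t\,dt=2\int_0^1 f(x)\,dx=\int_{\phi(U)}f\cdot\,\text{card}(\phi^{-1})$$
whereas the formula without the multiplicity factor would have $\int_0^1 f$ on the image side and be off by a factor of two.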
I think this theorem is a lot more natural and satisfying than -- and surely just as heuristically plausible as -- the first. For one thing, it removes a huge restriction, bringing the theorem closer to the standard one-variable change of variables for which injectivity is not required (though of course the one-variable theorem is really a theorem about differential forms). In particular, it emphasizes that regularity is what's important, not injectivity. For another thing, it's not a big step from here to the geometric intuition for degree theory or for the "area formula" in geometric measure theory. (Indeed, the factor $\text{card}(\phi^{-1})$ is a special case of what old references in geometric measure theory called the "multiplicity function" or the "Banach indicatrix.") It's also used in multivariate probability to write down densities of non-injective transformations of random variables. And last, it's in the spirit of modern approaches to gesture at the most general possible result. The traditional statement is really just a special case; injectivity only becomes essential when we define the integral over a manifold (rather than a parametrized manifold), which we want to be independent of parametrization. I think teaching the more general result would greatly clarify these matters, which are a constant source of confusion to beginners.
Yet many of the standard multivariable analysis texts (Spivak, Rudin PMA and RCA, Folland, Loomis/Sternberg, Munkres, Duistermaat/Kolk, Burkill) don't mention this result, even in passing, as far as I can tell. The impression a typical undergraduate gets is that the traditional statement is the final word on the matter, not to be improved upon; after all, the possibility of improvement isn't even hinted at, even when the multivariable result is compared to the single variable result. So I've had to hunt for discussions of the extension; I've found it here:
- Zorich, Mathematical Analysis II (page 150, exercise 9, for the Riemann integral)
- Kuttler, Modern Analysis (page 258, for the Lebesgue integral; used later in a discussion of probability densities)
- Csikós, Differential Geometry (page 72, for the Lebesgue integral)
- Ciarlet, Linear and Nonlinear Functional Analysis with Applications (page 34, for the Lebesgue integral)
- Bogachev, Measure Theory I (page 381, for the Lebesgue integral)
- the Planet Math page on multivariable change of variables (Theorem 2)
I'm also confident I've seen it in some multivariable probability books, but I can't remember which. But none of these is a standard textbook, except perhaps for Zorich.
My question: are there standard analysis references with nice discussions of this extension of the more familiar result? Probability references are fine, but I'm especially curious whether I've missed some definitive treatment in one of the classic analysis texts.
Also feel free to speculate, or explain, why so few texts mention it. (Is there really any good reason for not mentioning it, when failing to do so implicitly trains students to think injectivity is an essential ingredient for this kind of result?) I'm hoping there's a more interesting answer than "most authors don't mention it because the texts they learned from didn't either" or "even an extra sentence alluding to the possibility of a more general result is too much to ask for since the traditional theorem is hard enough to prove on its own."
(Cross-posted on MSE.)
Solution 1:
> The better advanced calculus and multivariable analysis texts explain explicitly how the hypothesis that $\varphi$ is injective with $\det D \varphi \neq 0$ can be relaxed to handle problems along sets of measure zero -- a result which is necessary for almost all practical applications of the theorem, starting with polar coordinates.
I speculate that most authors don't go beyond the injective immersion condition because it is sufficient for most practical applications of the theorem, polar and spherical coordinates being among the most important examples. The difficulty of formulating and proving the change of variables formula for integration is out of proportion to the rest of the content of an advanced calculus course, so if you are writing a textbook on the subject then there is a strong temptation not to stray too far from what you need to handle the basic examples and applications. This also explains why some authors are happy to live with the assumption that $\phi$ is a diffeomorphism: if your goal is just to prove Stokes' theorem, then why make it harder?
It wouldn't hurt, I suppose, to allude to more general versions of the theorem in a parenthetical remark or an extended exercise, but I don't think the stakes are very high. Undergraduates are usually accustomed to the fact that they aren't getting the most general possible theorems in their classes.
Solution 2:
As Paul Siegel says in his answer: The usual formula is sufficient for most practical applications. I would go further and say:
The plain form of the change of variables theorem makes it much clearer that the main motivation for this theorem is just to compute integrals.
The change of variables theorem is really a workhorse theorem, a tool to work with, and (as far as I can see) not something of structural importance. See Christian Blatter's answer at MSE, where a mathematician with years of experience tells you how often he has actually used the non-injective form.
Also, the plain form really shows that the Jacobian is the crucial thing here, and the proofs of the plain form (at least the ones I know) make that pretty clear. However, if you want to prove the more general form, I don't know any way other than to start from the plain result and build on top of it.
And my last point: as I said above, the change of variables theorem is a workhorse for doing something, namely computing integrals. If you ever had to calculate an integral of the form
$$\int_{\phi(U)}f\cdot\,\text{card}(\phi^{-1})=\int_U(f\circ\phi)\cdot|\det D\phi|$$
what would you do? You would check for each point how often it is reached by $\phi$ and patch the results together (neglecting the issues with null sets), using the plain form on each patch. This is something that a student would come up with on their own. Hence, the general form is not at all helpful for doing the very thing for which the plain form of the theorem is intended. Weighing how complicated the proof of the more general result is against how intuitive and (most importantly) how practically unhelpful the result is, it seems clear what you should do when writing a textbook.
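A small numerical sketch of this patching, with every specific chosen purely for illustration (none of it comes from the question or the answers): assume $\phi(r,\theta)=(r\cos\theta,r\sin\theta)$ on $U=(0,1)\times(0,4\pi)$, which wraps twice around the unit disk, so that $\text{card}(\phi^{-1})=2$ off a null set, and assume the test integrand $f(x,y)=x^2+y^2$. The function names and grid sizes are hypothetical choices. The code splits $U$ into the two patches $\theta\in(0,2\pi)$ and $\theta\in(2\pi,4\pi)$, on each of which $\phi$ is injective, integrates $(f\circ\phi)\cdot|\det D\phi|$ on each patch with a midpoint rule, and compares the sum with a direct midpoint-rule estimate of $\int_{\phi(U)}f\cdot\text{card}(\phi^{-1})$ in Cartesian coordinates.

```python
import math


def f(x, y):
    # Test integrand; an arbitrary choice made for this sketch.
    return x * x + y * y


def polar_patch(theta_lo, theta_hi, n_r=200, n_theta=200):
    """Midpoint-rule estimate of the integral of (f o phi) * |det Dphi| over
    (0, 1) x (theta_lo, theta_hi), with phi(r, t) = (r cos t, r sin t)
    and |det Dphi| = r."""
    dr = 1.0 / n_r
    dt = (theta_hi - theta_lo) / n_theta
    total = 0.0
    for i in range(n_r):
        r = (i + 0.5) * dr
        for j in range(n_theta):
            t = theta_lo + (j + 0.5) * dt
            total += f(r * math.cos(t), r * math.sin(t)) * r
    return total * dr * dt


def weighted_disk_integral(n=400):
    """Midpoint-rule estimate of the integral of f * card(phi^{-1}) over the
    unit disk, taking card = 2 inside the disk (valid off a null set)."""
    h = 2.0 / n
    total = 0.0
    for i in range(n):
        x = -1.0 + (i + 0.5) * h
        for j in range(n):
            y = -1.0 + (j + 0.5) * h
            if x * x + y * y < 1.0:
                total += 2.0 * f(x, y)
    return total * h * h


# Plain theorem applied on each injective patch, then patched together.
patch1 = polar_patch(0.0, 2.0 * math.pi)
patch2 = polar_patch(2.0 * math.pi, 4.0 * math.pi)
rhs = patch1 + patch2

# Image-side integral with the multiplicity factor.
lhs = weighted_disk_integral()

print("patch 1:", patch1)
print("patch 2:", patch2)
print("sum of patches:", rhs)
print("integral of f * card over the disk:", lhs)
```

For this choice of $f$ both sides equal $\pi$ exactly, so `rhs` and `lhs` should both come out close to $\pi$, with each patch contributing about $\pi/2$, the value the plain theorem gives on a single injective patch. The Cartesian estimate is the cruder of the two because the grid only approximates the boundary of the disk.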