Why should the substitution be injective when integrating by substitution?

I made a silly mistake in evaluating some integral by using a non-injective $u$-substitution. But why should $u$-substitutions be injective in the first place?

I reasoned as follows: the formula $$ \int_{\phi(a)}^{\phi(b)}g(x)\ dx = \int_a^b g(\phi(t))\phi^\prime(t)\ dt $$ holds for a general $C^1$ function $\phi$, even if it is not injective. To evaluate an integral of the form $\int_a^b f(\phi(t))\ dt$ by using the formula above from right to left, you have to find a function $g$ such that $$ f(\phi(t)) = g(\phi(t))\phi^\prime(t), $$ which does not exist if $\phi$ is not injective, i.e., if $\phi^\prime(t) = 0$ for some $t$. This is why substitutions should be injective.

Is my reasoning correct? If so, I believe that if $\phi^\prime(t) = 0 \Rightarrow f(\phi(t)) = 0$, a function $g$ that satisfies the formula above may exist and $\phi$ should not necessarily be injective. Is this right?

I am often confused by the requirement that $\phi$ be injective. Is there an intuitive way to interpret it, so that I always remember to take a $\phi$ that is injective?

I would be grateful if you could help me understand this matter.


Solution 1:

When $f:\ I\to{\mathbb R}$ has a primitive $F$ on the interval $I$, then by definition $$\int_a^b f(t)\ dt =F(b)-F(a)$$ for any $a$, $b\in I$; in particular $b<a$ is allowed.

When $\phi$ is differentiable on $[a,b]$ and $g$ has a primitive $G$ on an interval $I$ containing $\phi\bigl([a,b]\bigr)$, then by the chain rule $G \circ \phi$ is a primitive of $(g\circ\phi)\cdot\phi'$ on $[a,b]$. It follows that $$\int_{\phi(a)}^{\phi(b)} g(x)\ dx =G\bigl(\phi(b)\bigr)-G\bigl(\phi(a)\bigr)=\int_a^bg\bigl(\phi(t)\bigr)\phi'(t)\ dt\ .\tag{1}$$ No question of injectivity here.

Now there is a second kind of substitution. Here we are given an integral $$\int_a^b f(x)\ dx$$ with no $\phi$ visible either in the limits or in the integrand. It is up to us to choose a clever $\phi$ defined on some interval $J$ such that (i) $a$, $b\in \phi(J)$ and (ii) $f\circ\phi$ is defined on $J$. Assume that $\phi(a')=a$, $\>\phi(b')=b$ for some $a'$, $b'\in J$. Then according to $(1)$ we have $$\int_a^b f(x)\ dx=\int_{a'}^{b'}f\bigl(\phi(t)\bigr)\>\phi'(t)\ dt\ .$$ No question of injectivity here, either. Consider the following example, with $\phi(t)=\sin t$, $\>a'=-\pi$, $\>b'=25\pi/6$: $$\int_0^{1/2} x^2\ dx=\int_{-\pi}^{25\pi/6}\sin^2 t\>\cos t\ dt.$$ It is true that for this second kind of substitution one usually chooses an injective $\phi$, so that one can immediately write $\phi^{-1}(a)$ and $\phi^{-1}(b)$ instead of "take an $a'$ such that $\phi(a')=a\ $".
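
As a sanity check, the example identity can be tested numerically. The following is only a minimal sketch: the helper `simpson` is a hypothetical name introduced here (a plain composite Simpson rule using nothing beyond the standard library), not part of the argument above. Both sides should come out close to $1/24$, even though $\sin$ is far from injective on $[-\pi, 25\pi/6]$.

```python
import math

def simpson(f, a, b, n=20000):
    """Composite Simpson's rule for the integral of f from a to b.

    Hypothetical helper for illustration; n is rounded up to an even number.
    """
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        # Interior nodes get weight 4 (odd index) or 2 (even index).
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

# Left side: integral of x^2 over [0, 1/2].
lhs = simpson(lambda x: x**2, 0.0, 0.5)

# Right side: substitute x = sin(t) with t running from -pi to 25*pi/6,
# so that sin(-pi) = 0 and sin(25*pi/6) = 1/2.
rhs = simpson(lambda t: math.sin(t)**2 * math.cos(t), -math.pi, 25 * math.pi / 6)

print(lhs, rhs)
```

The extra full periods of $\sin$ contribute nothing to the right-hand side, because the path $x=\sin t$ retraces the same stretch of the $x$-axis in both directions.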

Solution 2:

Well, imagine the substitution as tracing a path (along the $x$-axis in this case). If you go from $a$ to $b$ and then back from $b$ to $a$ you will cancel out the integral and not compute the integral on $[a,b]$ as you intended. And all sorts of intermediate things can happen.

Try "parametrizing" $[0,1]$ by $x=\sin t$, $0\le t\le\pi$, and computing $\displaystyle\int_0^1 x\,dx$, for example. Of course, if you do the official substitution, you end up with $\displaystyle\int_0^\pi \sin t\cos t\,dt = \int_0^0 x\,dx = 0$ rather than the intended value $\tfrac12$. The function has "covered" the interval $[0,1]$ and then "uncovered" it.
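
The cancellation can be made visible by splitting the $t$-interval at $\pi/2$, where $\sin t$ turns around. The snippet below is a rough numerical sketch under the same assumptions as any quadrature check: `simpson` is a hypothetical helper (composite Simpson rule, standard library only) introduced just for this illustration.

```python
import math

def simpson(f, a, b, n=20000):
    """Composite Simpson's rule for the integral of f from a to b.

    Hypothetical helper for illustration; n is rounded up to an even number.
    """
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        # Interior nodes get weight 4 (odd index) or 2 (even index).
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def integrand(t):
    # x = sin(t), dx = cos(t) dt, so x dx becomes sin(t) cos(t) dt.
    return math.sin(t) * math.cos(t)

# Outward leg: x = sin(t) moves from 0 up to 1 ("covering" [0, 1]).
outward = simpson(integrand, 0.0, math.pi / 2)

# Return leg: x = sin(t) moves from 1 back down to 0 ("uncovering" [0, 1]).
back = simpson(integrand, math.pi / 2, math.pi)

# Whole path: the two legs cancel.
whole = simpson(integrand, 0.0, math.pi)

print(outward, back, whole)
```

The outward leg alone is close to $\tfrac12$, which is the value of $\int_0^1 x\,dx$; restricting $t$ to $[0,\pi/2]$, where $\sin$ is injective, is what recovers the intended integral.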