Justification behind the working of "integration by substitution"

I am very confused about "integration by substitution". For example:

We know that $\int x^2 \, dx = x^3/3 + C$.

Just for practice, or to test integration by substitution, we could have made the substitution $x^2=u$. Then $x=u^{1/2}$ (taking $x \ge 0$) and $dx= (u^{-1/2}/2) du$. Consequently:

$\int\ x^2 dx = \int\ u (u^{-1/2}/2) du = (1/2) \int\ u^{1/2} du = (1/3) u^{3/2} + C$

Since $u^{3/2} = x^{3}$, we see that we got the right answer.

My question is, why did we get the right answer? This method looks like magic to me, since I don't know the justification behind it. How do we prove that we can make the change of variable and change the $dx$ accordingly to obtain the right answer? "Integration by parts" is based on the product rule, for example. On what differentiation rule or rules is integration by substitution based? Is it the chain rule? By the way, I have no idea what I am doing. I just want to know the logic behind it. I have not seen an explanation in the books I have searched. It seems to be introduced by examples, without justification. I am sorry if my question is bad, I am really confused right now.


Solution 1:

The chain rule is $$\frac{d}{dx}f(g(x))=f^{\prime}(g(x))g^{\prime}(x)$$ Integrating both sides of this equation gives, up to a constant of integration,

$$f(g(x))=\int f^{\prime}(g(x))g^{\prime}(x)dx$$

If we were to write $y=g(x)$, then the previous equation becomes $$f(y)=\int f^{\prime}(y)dy$$ Putting the last two together we could express the rule as

$$\int f^{\prime}(y)dy=\int f^{\prime}(g(x))g^{\prime}(x)dx$$ or, with a change of notation, writing $f$ in place of $f^{\prime}$, $$\int f(y)dy=\int f(g(x))g^{\prime}(x)dx$$ and this is exactly substitution as you are using it.
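
As a sketch of how this rule covers the example in the question, take $g(x)=x^2$ (so $u=g(x)$ and $g^{\prime}(x)=2x$) and, assuming $x>0$ so that $\sqrt{x^2}=x$, choose $f(u)=\tfrac{1}{2}u^{1/2}$. This choice of $f$ is made here only for illustration; it is picked so that $f(g(x))g^{\prime}(x)$ reproduces the integrand:

$$f(g(x))\,g^{\prime}(x)=\tfrac{1}{2}\left(x^2\right)^{1/2}\cdot 2x = x^2$$

The rule then gives

$$\int x^2\,dx=\int f(g(x))\,g^{\prime}(x)\,dx=\int f(u)\,du=\tfrac{1}{2}\int u^{1/2}\,du=\tfrac{1}{3}u^{3/2}+C=\tfrac{1}{3}x^3+C$$

which is the same computation as in the question, with each step justified by the chain rule.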

Solution 2:

The purpose of the $du$ is to soak up the extra factor, the derivative of the inner function, that the chain rule spits out.
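
A small illustration of that idea, using an integrand chosen here as an example (it does not appear in the question): with $u=x^2$ we have $du=2x\,dx$, so the factor $2x$ is absorbed into $du$:

$$\int \cos(x^2)\,2x\,dx=\int \cos(u)\,du=\sin(u)+C=\sin(x^2)+C$$

Differentiating the result with the chain rule gives back the factor that $du$ absorbed:

$$\frac{d}{dx}\sin(x^2)=\cos(x^2)\cdot 2x$$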