Why does changing variables work?

Solution 1:

Good question, good answers.

As @EricS points out, you don't have to substitute in this particular case - you can do all the work with the original variable. But you can substitute. The advantage is that changing the name of the variable in a systematic way makes the shape of the problem and solution a little clearer.

Since all you're doing here is finding isolated solutions, all you need is bijectivity to make sure your translation from one name space to another and back is faithful.

If you want to do more than algebra you may need a better dictionary, that is, a substitution with better properties. If, as @mweiss comments, you want to do calculus on the transformed equation, then the substitution and its inverse must be differentiable. That's essentially what the Wikipedia page is saying, in a more abstract context.

When you study abstract algebra you'll want your "substitutions" to respect the algebraic properties of the domain and range. That's the essence of @asymplectomorphic 's comment about linear algebra.

Solution 2:

By the rules of algebra,

$$x^6 - 9 x^3 + 8 = 0$$

is strictly equivalent to

$$(x^3)^2 - 9 x^3 + 8 = 0.$$

Then it does no harm to substitute $u=x^3$ and solve

$$u^2-9u+8=0,$$

leading to a set of solutions $u\in S=\{s_k\}$ (here $S=\{1,8\}$). This is equivalent to $x^3\in S$, or $x\in\{\sqrt[3]{s_k}\}$.
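As a minimal sketch of the steps above (solve in $u$, then undo $u=x^3$), assuming we only want real solutions; the helper names `solve_quadratic`, `real_cbrt` and `solve_sextic` are made up for this illustration:

```python
import math

def solve_quadratic(a, b, c):
    """Return the real roots of a*u**2 + b*u + c = 0, smallest first."""
    disc = b * b - 4 * a * c
    if disc < 0:
        return []
    r = math.sqrt(disc)
    return sorted({(-b - r) / (2 * a), (-b + r) / (2 * a)})

def real_cbrt(s):
    """Real cube root; also valid for negative inputs."""
    return math.copysign(abs(s) ** (1.0 / 3.0), s)

def solve_sextic():
    """Solve x^6 - 9x^3 + 8 = 0 over the reals via u = x^3."""
    u_roots = solve_quadratic(1, -9, 8)      # u^2 - 9u + 8 = 0
    return [real_cbrt(u) for u in u_roots]   # x^3 = u has one real solution

x_roots = solve_sextic()
for x in x_roots:
    # The residual of the ORIGINAL equation should be (numerically) zero.
    print(x, x**6 - 9 * x**3 + 8)
```

Because cubing is a bijection on the reals, each $u$-root corresponds to exactly one real $x$, which is why nothing is lost or gained on the way back.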

What matters for the substitution to be valid is that the domain of $u$ includes the range of $x^3$, so that no solution is lost (some $x^3$ satisfying the equation but not covered by $u$); on the other hand, no alien solution is introduced when inverting $u=x^3$, as the domain of $x$ takes precedence.

I don't think that any other condition, such as continuity or differentiability, needs to be imposed on the substitution.


For the sake of the illustration, let us consider the substitution $u=x^3-\text{sign}(x)$, which is neither continuous nor invertible. We have a branch with $x<0,u<1$, and another with $x>0,u>-1$.

The equation is split for the two branches

$$u<1\land(u-1)^2-9(u-1)+8=u^2-11u+18=0,\\ u>-1\land(u+1)^2-9(u+1)+8=u^2-7u=0.$$

These give the solution sets $u\in\{\}$ (no $u$ is admissible) and $u\in\{0,7\}$. Then for the second branch, $x^3\in\{1,8\}$, which is correct.
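The branch-by-branch bookkeeping can be checked with a short sketch. It assumes only real roots matter, and the helper names (`sign`, `substitution`, `real_roots`) are hypothetical, introduced just for this example:

```python
import math

def sign(x):
    """Return -1, 0 or 1 according to the sign of x."""
    return (x > 0) - (x < 0)

def substitution(x):
    """The discontinuous substitution u = x^3 - sign(x)."""
    return x**3 - sign(x)

def real_roots(a, b, c):
    """Real roots of a*u**2 + b*u + c = 0, smallest first."""
    disc = b * b - 4 * a * c
    if disc < 0:
        return []
    r = math.sqrt(disc)
    return sorted({(-b - r) / (2 * a), (-b + r) / (2 * a)})

# Branch x < 0: x^3 = u - 1 with u < 1, giving u^2 - 11u + 18 = 0.
neg_branch = [u for u in real_roots(1, -11, 18) if u < 1]

# Branch x > 0: x^3 = u + 1 with u > -1, giving u^2 - 7u = 0.
pos_branch = [u for u in real_roots(1, -7, 0) if u > -1]

# Undo the substitution on each branch separately.
# (x = 0 is its own case; it is not a root, since 0 - 0 + 8 != 0.)
x_cubed = sorted([u - 1 for u in neg_branch] + [u + 1 for u in pos_branch])

print(neg_branch, pos_branch, x_cubed)
```

The admissibility filters (`u < 1`, `u > -1`) do the work that invertibility would otherwise do: they discard values of $u$ that no $x$ on that branch can produce.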

Solution 3:

I must admit I never gave this method as much thought as you did, and I do not understand the formal introduction on the Wikipedia page (I assume you referred to this one). But perhaps this may provide you with some more insight into why the method works.

Again consider the equation $x^6-9x^3+8=0$. Over the real numbers,

$$ x^6-9x^3+8=\left(x^3\right)^2-9\left(x^3\right)+8=0 $$
$$\iff$$
$$ \left(\left(x^3\right)-8\right)\left(\left(x^3\right)-1\right)=0 $$
$$\iff$$
$$ x^3=8\ \lor\ x^3=1 $$
$$\iff$$
$$ x=\sqrt[3]{8}=2\ \lor\ x=1 $$

Now, we both know this is just substitution without writing it explicitly, but the reason it works is that we simply rewrote the equation in an attractive form and solved a quadratic equation. So I'd say the reason substitution works depends on what kind of equation you're solving and the technique you use for solving such equations. In the above case, substitution worked because you aren't changing the original equation at all, merely rewriting it, and because the quadratic formula works for solving quadratic equations.

Kind of a non-mathematical answer. Again, I hope it helps. If not, please forgive me :)

Solution 4:

This answer will give an idea on why change of variables works/is allowed when integrating.

I will give you an idea (heuristic) on why these requirements for $u$ are needed. The basic idea behind substitution of variables is that you choose a different basis over which you know the solution.

Basically you want to solve $\int_a^b f(x) dx$.

If you think about a two-dimensional Euclidean $(x,y)$-grid (actually: manifold), then you can think of $dx$ as a vector (actually: covector) that defines the direction-step in the $x$-direction.

In the integral expression, $x$ is just a dummy, so you can choose it to be anything you like, but then you need to change it everywhere it occurs.

You've chosen $u = x^{\frac{1}{3}}$. This function is "sufficiently nice" in the sense that it is continuous and invertible on all of $\mathbb{R}$, and its inverse $x = u^3$ can be differentiated infinitely many times.

You can replace $dx$ now by $[\text{Something}]du$. You know $x = u^3$, so $x'(u) = \frac{dx}{du} = 3 u^2$. Multiply both sides by the differential of $u$ to find $dx = 3 u^2 du$.

Now you can transform everything from the basis in $x$ to a basis in $u$, so $$\int_a^b f(x)\, dx = \int_{a^{1/3}}^{b^{1/3}} f(u^3)\, (3 u^2\, du)$$
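A quick numerical sanity check of this change of variables, as a sketch only: the integrand $f=\cos$, the limits $a=1$, $b=8$, and the helper `midpoint_rule` are all assumptions chosen for illustration. The transformed integrand is $f$ evaluated at $x=u^3$, multiplied by $dx/du = 3u^2$.

```python
import math

def midpoint_rule(g, lo, hi, n=20000):
    """Approximate the integral of g over [lo, hi] with n midpoint samples."""
    h = (hi - lo) / n
    return h * sum(g(lo + (i + 0.5) * h) for i in range(n))

f = math.cos          # assumed example integrand
a, b = 1.0, 8.0       # assumed example limits (positive, so a**(1/3) is real)

# Left-hand side: integrate f(x) dx over [a, b].
lhs = midpoint_rule(f, a, b)

# Right-hand side: x = u^3, dx = 3 u^2 du, limits become a^(1/3), b^(1/3).
rhs = midpoint_rule(lambda u: f(u**3) * 3 * u**2, a ** (1 / 3), b ** (1 / 3))

exact = math.sin(b) - math.sin(a)   # antiderivative of cos is sin
print(lhs, rhs, exact)
```

All three numbers should agree up to the discretisation error of the midpoint rule.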

Why do you need differentiability? For instance, consider $u = \frac{1}{x}$ and suppose that the point $x=0$ is within the interval $\langle a,b\rangle$. What is the value of $u$ when we consider $x = 0$?

The same goes for requiring a diffeomorphism. Suppose that the basis transformation is not a bijection between the two intervals (here it is discontinuous and skips values). For instance, consider $u = x$ if $x<0$ and $u = x+2$ if $x\geq 0$. There is no value of $x$ that maps to $u=1$. Now try to integrate over $x$ from $-1$ to $1$. Then $\int_{-1}^1 f(x)\, dx = \int_{-1}^{3} f(?)\, du$. What would be the value of the latter integral?
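
One way to make sense of this example (a sketch, not the only possible reading) is to split the integral at the jump, so that each piece uses a substitution that is smooth and invertible onto its own image. On $x<0$ we have $x=u$, and on $x\geq 0$ we have $x=u-2$, with $dx=du$ in both cases:

$$\int_{-1}^{1} f(x)\,dx = \int_{-1}^{0} f(u)\,du + \int_{2}^{3} f(u-2)\,du.$$

The values $0\leq u<2$ are never reached by any $x$, so there is nothing sensible to put in place of $f(?)$ on that part of $[-1,3]$.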