Formal basis for variable substitution in limits

The complete story is as follows:

If the functions $g:\ A\to B$ and $f:\ B\to C$ have limits $$\lim_{x\to\xi}g(x)=:\eta\ ,\qquad \lim_{y\to\eta}f(y)=:\zeta\ ,$$ and if $f$ is continuous at $\eta$ in case $\eta$ occurs as value of $g$, then $$\lim_{x\to\xi}f\bigl(g(x)\bigr)=\lim_{y\to\eta} f(y)\ .$$

This also holds if any of $\xi$, $\eta$, $\zeta$ is $\infty$.

The extra condition "and if $f$ is continuous $\ldots$" is usually fulfilled, but one cannot do without it: Consider the example $g(x):\equiv 1$ and $f(y):=2$ $\ (y=1)$, $\ f(y):=3$ $\ (y\ne1)$. Then $\lim_{x\to1}f\bigl(g(x)\bigr)=2$, but $\lim_{x\to1}g(x)=1$, $\ \lim_{y\to1}f(y)=3$.
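The counterexample above can be checked numerically. The following is only a minimal sketch: the sample points `1 + 10^{-k}` and the variable names are illustrative assumptions, and a finite sample can suggest a limit but never prove one.

```python
def g(x):
    # Inner function from the counterexample: g(x) = 1 for every x.
    return 1.0

def f(y):
    # Outer function: f(1) = 2 and f(y) = 3 otherwise (discontinuous at y = 1).
    return 2.0 if y == 1.0 else 3.0

# Sample points approaching 1 without ever being equal to 1.
points = [1.0 + 10.0 ** (-k) for k in range(1, 8)]

composite_values = [f(g(x)) for x in points]  # samples of f(g(x)) as x -> 1
outer_values = [f(y) for y in points]         # samples of f(y) as y -> 1

print(composite_values)
print(outer_values)
```

The composite only ever sees the value $f(1)$, while the limit of $f$ at $1$ ignores exactly that value, which is why the two disagree.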

Yes, that's essentially the idea.

But this follows from the definition of continuity: we just need to show that if $f$ is continuous at $g(a)$ and $g$ is continuous at $a$, then $f\circ g$ is continuous at $a$. Because then both sides evaluate to $f(g(a))$, by definition of continuity (which requires that the limit exist and be equal to the value of the function at the point).

To prove that if $g$ is continuous at $a$ and $f$ is continuous at $g(a)$, then $f\circ g$ is continuous at $a$, note first that $f\circ g$ is defined at $a$. Now, let $\epsilon\gt0$. Then we know that there exists $\delta_1\gt 0$ such that if $|y-g(a)|\lt\delta_1$, then $|f(y)-f(g(a))|\lt\epsilon$; this holds because $f$ is continuous at $g(a)$.

Now, since $g$ is continuous at $a$, and $\delta_1\gt 0$, this means that there exists $\delta\gt 0$ such that if $|x-a|\lt\delta$, then $|g(x)-g(a)|\lt\delta_1$.

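Now, if $|x-a|\lt\delta$, then $|g(x)-g(a)|\lt\delta_1$, and taking $y=g(x)$ in the first step gives $|f(g(x))-f(g(a))|\lt\epsilon$. Thus, for all $\epsilon\gt 0$ there exists $\delta\gt 0$ such that if $|x-a|\lt\delta$, then $|f\circ g(x) - f\circ g(a)|\lt\epsilon$. Therefore, $f\circ g$ is continuous at $a$.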

Therefore, we have that $\lim\limits_{y\to g(a)} f(y) = f(g(a))$, since $f$ is continuous at $g(a)$; and $\lim\limits_{x\to a}f\circ g(x) = f\circ g(a) = f(g(a))$, since $f\circ g$ is continuous at $a$.

You don't quite need $g$ to be continuous: if $\lim\limits_{x\to a}g(x)=L$ and $f$ is continuous at $L$, then we have $$\lim_{x\to a}f\circ g(x) = \lim_{y\to L}f(y) = f(L).$$ To verify this, let $\epsilon\gt 0$. Then there exists $\delta_1\gt 0$ such that for all $y$, $|y-L|\lt\delta_1$ implies $|f(y)-f(L)|\lt\epsilon$. Since $\lim\limits_{x\to a}g(x)=L$, there exists $\delta\gt 0$ such that if $0\lt |x-a|\lt \delta$ then $|g(x)-L|\lt\delta_1$. So, suppose that $0\lt |x-a|\lt\delta$. Then $|g(x)-L|\lt\delta_1$, and therefore $|f(g(x))-f(L)|\lt\epsilon$. Therefore, for every $\epsilon\gt 0$ there exists $\delta\gt 0$ such that if $0\lt |x-a|\lt\delta$, then $|f(g(x))-f(L)|\lt\epsilon$. This proves that if $\lim\limits_{x\to a}g(x) = L$ and $f$ is continuous at $L$, then $$\lim\limits_{x\to a}f\circ g(x) = \lim\limits_{y\to L}f(y) = f(L).$$ In particular, if $g$ is continuous at $a$, then we replace $L$ with $g(a)$.
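As a numerical illustration of this statement, here is a small sketch with $g(x)=\sin(x)/x$ (undefined at $0$, with limit $L=1$ there) and $f=\cos$ (continuous at $1$). These particular functions and the sample points are my own illustrative choices, not part of the question; the statement predicts $\lim_{x\to0}\cos(\sin(x)/x)=\cos(1)$.

```python
import math

def g(x):
    # Inner function: not defined at x = 0, but g(x) -> 1 as x -> 0.
    return math.sin(x) / x

def f(y):
    # Outer function: continuous at L = 1.
    return math.cos(y)

L = 1.0
xs = [10.0 ** (-k) for k in range(1, 7)]  # x = 0.1, 0.01, ..., 1e-6

# Distance between f(g(x)) and the predicted limit f(L) = cos(1).
errors = [abs(f(g(x)) - f(L)) for x in xs]

for x, err in zip(xs, errors):
    print(f"x = {x:.0e}   |f(g(x)) - f(L)| = {err:.3e}")
```

The printed distances shrink as $x$ approaches $0$, which is consistent with the $\epsilon$-$\delta$ argument but of course does not replace it.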

We cannot omit the continuity of $f$ at $L$, though: take $g(x) = 0$ for all $x$, and let $$f(x) = \left\{\begin{array}{ll} 1 &\text{if }x\neq 0\\ 0 &\text{if }x=0. \end{array}\right.$$ Then $\lim\limits_{x\to a}f(g(x)) = 0$, because $f(g(x))=f(0)=0$. But $$\lim\limits_{y\to 0}f(y) = 1,$$ because we never take the value $y=0$ in evaluating the limit.

One can replace continuity of $f$ with other conditions; for example, we may ask that $g$ have a limit $L$ at $a$, and moreover, that for every $\delta\gt0$ there exist an $\eta\gt 0$ such that $g$ takes all values on $(L-\eta,L+\eta)$, except perhaps $L$ itself, on $(a-\delta,a+\delta)-\{a\}$.

Added. The situation with limits as $x\to\infty$ is essentially the same if $\lim\limits_{x\to\infty}g(x)$ exists and is real. It is more complicated when the limit of $g$ is $\pm\infty$. See this answer for some discussion on that.

While this is a fairly old question, I stumbled upon it and wanted to give a different take for the case where the continuity of $f$ cannot be assumed. To make things work, then, we need to strengthen the hypotheses on $g$.

Theorem. Suppose $g$ is continuous and injective on an open interval $I$ containing $a$ and $f$ is some function. Then $\displaystyle \lim_{x \to a}{(f \circ g)(x)}$ exists if and only if $\displaystyle \lim_{t \to g(a)}{f(t)}$ exists, and they are equal when they do exist.

Proof. Suppose $\displaystyle \lim_{x \to a}{f(g(x))} = L$. We claim that $\displaystyle \lim_{t \to g(a)}{f(t)} = L$. To that effect, let $\epsilon>0$ be arbitrary. Our hypothesis means that there is a $\delta > 0$ such that $$0 < |x-a| < \delta \implies |f(g(x)) - L| < \epsilon.$$ Because $g$ is one-to-one and continuous on an open interval $I$, $g^{-1}$ is also one-to-one and continuous on the open interval $g[I]$. Thus, there exists a $\gamma>0$ such that $$|t - g(a)| < \gamma \implies |g^{-1}(t) - a| < \delta.$$ Because $g^{-1}(t) = a$ only occurs when $t=g(a)$, we similarly have $$0 < |t - g(a)| < \gamma \implies 0 < |g^{-1}(t) - a| < \delta.$$ Thus, for all $t$, $$0 < |t - g(a)| < \gamma \implies 0 < |g^{-1}(t) - a| < \delta \implies |f(g(g^{-1}(t))) - L| < \epsilon$$ or $$0 < |t - g(a)| < \gamma \implies |f(t) - L| < \epsilon.$$ Since $\epsilon>0$ was arbitrary, it follows that $\displaystyle \lim_{t \to g(a)}{f(t)} = L$.

The opposite direction follows from the above direction: if we let $h = f \circ g$ and $b=g(a)$, then $\displaystyle \lim_{t \to g(a)}{f(t)}$ may be written as $\displaystyle \lim_{t \to b}{h(g^{-1}(t))}$, and $\displaystyle \lim_{x \to a}{f(g(x))}$ may be written as $\displaystyle \lim_{x \to g^{-1}(b)}{h(x)}$. Since $g^{-1}$ satisfies the same hypotheses as $g$ and there were no requirements on $f$, the above argument goes through in this new arrangement. Q.E.D.
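To see the theorem at work on a discontinuous $f$, here is a rough numerical sketch. The choices $g(x)=x^3$, $a=0$, and the step-like $f$ are illustrative assumptions of mine; the constant function is included only as a contrast, since it is not injective and the theorem does not apply to it.

```python
def f(t):
    # Outer function, discontinuous at 0: f(0) = 0 and f(t) = 1 otherwise.
    return 0.0 if t == 0.0 else 1.0

def g_injective(x):
    # Continuous and injective on the real line, with g(0) = 0.
    return x ** 3

def g_constant(x):
    # Continuous but not injective: always returns 0.
    return 0.0

# Points approaching a = 0 from both sides, never equal to 0.
xs = [s * 10.0 ** (-k) for k in range(1, 7) for s in (1.0, -1.0)]

injective_values = [f(g_injective(x)) for x in xs]  # samples of f(g(x)), g injective
constant_values = [f(g_constant(x)) for x in xs]    # samples of f(g(x)), g constant
outer_values = [f(t) for t in xs]                   # samples of f(t) as t -> 0

print(injective_values)
print(constant_values)
print(outer_values)
```

With the injective $g$ the composite agrees with $f$ near $0$, because $g(x)\neq g(a)$ whenever $x\neq a$; with the constant $g$ the composite only ever evaluates $f$ at $0$.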

Here is my point of view. Let $f:\text{D}_f\to\mathbb{R}$ and $g:\text{D}_g\to \text{R}_g\subseteq\text{D}_f$ be functions so that $f\circ g:\text{D}_g\to\mathbb{R}$ is well defined. Well, I assume that you are aware of the following theorem.

Theorem 1. If $g$ is continuous at $a$ and $f$ is continuous at $g(a)$ then $f\circ g$ is continuous at $a$. In terms of limit notation, if $\lim_{x\to a}g(x)=g(a)$ and $\lim_{x\to g(a)}f(x)=f(g(a))$ then we have $\lim_{x\to a}(f\circ g)(x)=(f\circ g)(a)$.

A slight generalization of this theorem is the following

Theorem 2. Suppose that $\lim_{x\to a}g(x)=b$ and $\lim_{x\to b}f(x)=c$. Furthermore, assume that $g$ satisfies the following condition $$\exists r>0,\,\,\,0<|x-a|<r\implies 0<|g(x)-b|.$$ Then we have $\lim_{x\to a}(f\circ g)(x)=c$.

In your example, we have $f:x\mapsto x^2$ and $g:x\mapsto x+1$, $a=5$, $b=6$. We clearly have $\lim_{x\to5}x+1=6$ and $\lim_{x\to 6}x^2=36$. More importantly, choose $r=1$ so whenever $0<|x-5|<1$ we have that $|(x+1)-6|>0$. These two imply that $\lim_{x\to 5}(x+1)^2=36$. In practice, one usually writes this as

$$\lim_{x\to 5}(x+1)^2=\lim_{y\to 6}y^2=36.$$

What happens in the mind of calculus students is that they detect $g:x\mapsto x+1$ instantly, compute $\lim_{x\to 5}x+1=6$ in their (unconscious) mind and get the first equality, which is then easily computed to get the second one. One tricky point about this is that, as you do the writing from left to right, you may think that the implications also run in this order, whereas they run in exactly the opposite direction, right to left! Also, $y$ here is just a dummy variable and can be replaced with other variables such as $t$ or $x$ itself.
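To see why the condition on $g$ in Theorem 2 cannot simply be dropped, consider the following sketch (the functions are illustrative choices of mine, not taken from the question). Let $f(y)=1$ for $y\neq 0$ and $f(0)=0$, and let $g(x)=x\sin(1/x)$ for $x\neq 0$. Then $\lim_{x\to 0}g(x)=0$ and $\lim_{y\to 0}f(y)=1$, but $g$ vanishes at points arbitrarily close to $0$, so no $r$ as in Theorem 2 exists. Along two sequences tending to $0$ we get $$x_k=\frac{1}{k\pi}:\quad g(x_k)=0,\quad f(g(x_k))=0;\qquad x_k'=\frac{2}{(4k+1)\pi}:\quad g(x_k')=x_k'\neq 0,\quad f(g(x_k'))=1.$$ Hence $\lim_{x\to 0}f(g(x))$ does not exist, even though both $\lim_{x\to 0}g(x)$ and $\lim_{y\to 0}f(y)$ do.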

We can also express another variation of Theorem 2 as below.

Theorem 3. Suppose that $\lim_{x\to a}g(x)=b$ and $\lim_{x\to b}f(x)=c$. Furthermore, assume that $f$ is continuous at $b$. Then we have $\lim_{x\to a}(f\circ g)(x)=c$.

Remark 1. Sometimes, instead of writing $\lim_{x\to a}(f\circ g)(x)=c$, we use the more intuitive symbolism of $\lim_{g(x)\to b}f(g(x))=c$.

Remark 2. These theorems can naturally be generalized for metric spaces.

Let $L_1 = \lim\limits_{x \to g(a)} f(x)$ and $L_2 = \lim\limits_{x \to a} f(g(x))$. $L_1$ is unique in $\mathbb{R}$ so that for all $\varepsilon > 0$, there exists $\delta_1(\varepsilon) > 0$ such that if $|x-g(a)| < \delta_1(\varepsilon)$ then $|f(x) - L_1| < \varepsilon$. Similarly $L_2$ is unique so that there exists $\delta_2(\varepsilon) > 0$ such that if $|x-a| < \delta_2(\varepsilon)$ then $|f(g(x)) - L_2| < \varepsilon$.

Now we will show that $L_1$ also satisfies the property that $L_2$ satisfies uniquely. Take $\varepsilon > 0$. When $|x-g(a)| < \delta_1(\varepsilon/2)$,

\begin{align*} |f(g(x)) - L_1| &= |f(g(x)) + f(x) - f(x) - L_1| \\ &\le |f(g(x)) - f(x)| + |f(x) - L_1| \\ &< \varepsilon/2 + |f(g(x)) - f(x)|. \end{align*}

Since $f$ is continuous, there exists $\delta(\varepsilon/2) > 0$ such that when $|x-g(a)| < \delta(\varepsilon/2)$, $|f(g(x)) - f(x)| < \varepsilon / 2$. Thus if $|x - g(a)| < \min\{\delta_1(\varepsilon/2), \delta(\varepsilon/2)\}$, then we get \begin{align*} |f(g(x)) - L_1| < \varepsilon/2 + \varepsilon/2 = \varepsilon. \end{align*} Thus $L_1$ is also the limit $\lim\limits_{x\to a} f(g(x))$ and by uniqueness we conclude $L_1 = L_2$, i.e. $\lim\limits_{x\to a} f(g(x)) = \lim\limits_{x\to g(a)} f(x)$.
