Does a continuous point-wise limit imply uniform convergence?

Question

Let $(f_n)_{n \in \mathbb N}$ be a sequence of continuous functions $f_n : X \rightarrow Y$ between metric spaces $X$ and $Y$, and define $$ f : X \rightarrow Y, \quad f(x) = \lim_{n \rightarrow \infty} f_n(x), $$ assuming this pointwise limit exists for every $x \in X$.

If $f$ is continuous, is it true that $(f_n)$ converges uniformly to $f$?

Are there any restrictions on $(f_n)$, $f$, $X$ and/or $Y$ for this to be true?

Thoughts

This seems like a very useful result, yet it does not seem to be mentioned anywhere, which leads me to think it is either completely obvious, or false. This site usually has a question or two pertaining to 'obvious' results, but my attempts to find them have been unsuccessful. I did find Dini's theorem, which is closely related but additionally requires the sequence to be monotone and $X$ to be compact.

I know this statement is false if the $f_n$ are allowed to be discontinuous. Take as a counterexample $$ f_n : \mathbb R \rightarrow \mathbb R, \quad f_n(x) = \begin{cases} x^n & x \in [0, 1) \\ 0 & \text{otherwise} \end{cases} $$ (each $f_n$ is discontinuous at $x = 1$); then the pointwise limit $f \equiv 0$ is continuous, but $(f_n)$ does not converge uniformly to it.
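Concretely, uniform convergence fails here because the sup distance never shrinks: $$ \sup_{x \in \mathbb R} \, |f_n(x) - f(x)| = \sup_{x \in [0, 1)} x^n = 1 \quad \text{for every } n. $$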

I have attempted a proof below, and if it is correct, then it suffices to require that $X$ be compact.

Attempted Proof

We wish to show that given $\varepsilon > 0$, there exists an $N$ such that for all $n > N$, $$ \forall x \in X, \quad d(f_n(x), f(x)) < \varepsilon. $$

Since each $f_n$ is continuous, for every $x \in X$ there exists a neighbourhood $N_x \subseteq X$ such that $$ \forall y \in N_x, \quad d(f_n(x), f_n(y)) < \varepsilon, $$ and similarly for $f$.

Additionally, we know that for each $x \in X$ there exists $N(x) \in \mathbb N$ such that $$ \forall n > N(x), \quad d(f_n(x), f(x)) < \varepsilon. $$

We now fix $\varepsilon > 0$, and suppose there does not exist an $N$ such that for all $n > N$, $\sup_{x \in X} \{d(f_n(x), f(x))\} < \varepsilon$. This means we can always pick some $x_0 \in X$ such that $d(f_n(x_0), f(x_0)) \geq \varepsilon$.

By the triangle inequality, we have, for all $x \in X$, $$ d(f_n(x_0), f(x_0)) \leq d(f_n(x_0), f_n(x)) + d(f_n(x), f(x)) + d(f(x), f(x_0)). $$ As $f_n$ and $f$ are continuous, we can pick $x$ in a neighbourhood of $x_0$ such that the first and third terms are less than any $\varepsilon > 0$. In other words, we have $$ d(f_n(x_0), f(x_0)) < d(f_n(x), f(x)). $$ Thus in the neighbourhood of any 'badly behaving' point, there exists another badly behaving point.

If we require $X$ to be compact, then there exists a point $y \in X$ such that $d(f_n(y), f(y)) = \sup_{x \in X}\{ d(f_n(x), f(x)) \}$ (since $f_n$, $f$ and $d$ are all continuous, the map $x \mapsto d(f_n(x), f(x))$ is continuous and so attains its supremum on the compact set $X$), in which case the above result would lead to a contradiction.

--

Some of the answers have shown the claim to be false. Where does my proof go wrong?


Solution 1:

That the claim is false even on $[0,1]$ can be seen by considering functions $f_n$ which are zero everywhere except for a peak between $(n-1)/n$ and $1$, with a maximum of $1$ and $f_n(1) = 0$: for any fixed $x$, the peak eventually moves past $x$, so $f_n \to 0$ pointwise, yet $\sup_x |f_n(x)| = 1$ for every $n$.
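For concreteness, one explicit choice with these properties (my own formula; the answer leaves the exact shape of the peak open) is $$ f_n(x) = \max\left(0,\ 1 - 2n\left|x - \left(1 - \tfrac{1}{2n}\right)\right|\right), \qquad x \in [0,1], $$ which vanishes outside $\left(\tfrac{n-1}{n}, 1\right)$, attains its maximum of $1$ at $x = 1 - \tfrac{1}{2n}$, and satisfies $f_n(1) = 0$.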

Solution 2:

Proving or disproving this claim (over $\mathbb R$, if I remember correctly) was the task given to me for my first-year math project many years ago. The counterexample I came up with was basically the same as the one given by Mike: a sequence of piecewise linear tent functions supported on a shrinking interval:

$$f_n(x) = \max\left(1-|1-nx|, 0\right) = \begin{cases} nx & \text{if } 0 < x \le \frac1n, \\ 2-nx & \text{if } \frac1n < x \le \frac2n, \\ 0 & \text{otherwise.} \end{cases}$$

Clearly, all these functions vanish for $x \le 0$. On the other hand, if $x > 0$, then $f_n(x) = 0$ for all $n \ge \frac2x$, so $f_n(x) \to 0$ pointwise for every $x \in \mathbb R$. Yet $f_n(\frac1n) = 1$ for all $n$, so the functions $f_n$ cannot converge uniformly to zero.
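Because the construction is completely explicit, it is easy to check numerically. Here is a minimal Python sketch (my own addition, not part of the original answer) confirming both halves of the argument:

```python
# Tent function f_n(x) = max(1 - |1 - n*x|, 0) from the answer above.
def f(n: int, x: float) -> float:
    return max(1 - abs(1 - n * x), 0)

# Pointwise convergence: for each fixed x > 0, f_n(x) = 0 once n >= 2/x.
for x in (0.5, 0.1, 0.01):
    print(f"x = {x}:", [f(n, x) for n in (1, 10, 100, 1000)])

# Failure of uniform convergence: the peak at x = 1/n has height 1 for all n.
print("f_n(1/n):", [f(n, 1 / n) for n in (1, 10, 100, 1000)])
```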

There are several easy ways to tweak this counterexample. For example, you can replace the piecewise linear $f_n$ above with smooth bump functions to make them not just continuous but $C^\infty$. Or you can scale each $f_n$ by $n$ to make the distance $\sup_x |f_n(x)-0|$ diverge to infinity instead of being identically $1$, while still retaining pointwise convergence.
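For instance, one standard smooth bump that can replace the tent above (my choice of formula; any $C^\infty$ bump supported in $(0, \frac2n)$ works) is $$ g_n(x) = \begin{cases} \exp\left(1 - \dfrac{1}{1 - (nx-1)^2}\right) & \text{if } |nx - 1| < 1, \\ 0 & \text{otherwise,} \end{cases} $$ which is $C^\infty$ on all of $\mathbb R$, vanishes outside $(0, \frac2n)$, and still satisfies $g_n(\frac1n) = 1$.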


As for why your "proof" fails, the first thing I'd like to note is that it's quite hard to keep track of what's actually going on, since you don't clearly mark which variables are dependent on which others. For example, in the paragraph:

We now fix $\varepsilon > 0$, and suppose there does not exist an $N$ such that for all $n > N$, $\sup_{x \in X} \{d(f_n(x), f(x))\} < \varepsilon$. This means we can always pick some $x_0 \in X$ such that $d(f_n(x_0), f(x_0)) \geq \varepsilon$.

your "$x_0$" can be different for each $n$ (and $\varepsilon$), even though that's not at all obvious at a glance. It would be much clearer to, say, denote it by $x_n$ (or even $x_{n,\varepsilon}$) instead, so that it would be immediately obvious that its value may depend on $n$.

I wouldn't normally put quite so much stress on a simple "stylistic" point like this, if it weren't for the fact that a very common error in proofs like this is getting your variable dependencies mixed up, say, accidentally turning a dependent existential statement like $\forall n \, \exists x: \dotsc$ into the much stronger statement $\exists x \, \forall n: \dotsc$ by forgetting that $x$ may depend on $n$.

It looks like something similar may have happened in your proof. Explicitly marking such dependencies (e.g. with subscripts like $\forall n \exists x_n: \dotsc$) is a good habit to get into, since it makes such mistakes much easier to spot and avoid.

(Since we're on the subject of good variable naming style, I'd also like to note that having $N$, $N_x$ and $N(x)$ in the same proof may not be the best choice, especially when the middle one has nothing to do with the other two $N$s. Try to find some other letter to use instead.)


Another issue I spotted is that the step where you go from:

$$d(f_n(x_0), f(x_0)) \leq d(f_n(x_0), f_n(x)) + d(f_n(x), f(x)) + d(f(x), f(x_0))$$

to:

$$d(f_n(x_0), f(x_0)) < d(f_n(x), f(x))$$

is invalid. I don't actually think this is the main mistake in your proof, but it certainly is a mistake.

The most you can derive from the former inequality (by noting that the first and last terms on the RHS can be made arbitrarily small by choosing $x$ near $x_0$) is:

$$d(f_n(x_0), f(x_0)) \le d(f_n(x), f(x)) + \varepsilon^*$$

for some arbitrarily small but positive $\varepsilon^*$, which doesn't really say much; it's perfectly possible that $d(f_n(x), f(x))$ in fact attains its (local) maximum at $x = x_0$. (Also, again, the variable naming is quite confusing here; if I were to rewrite this, I'd rename $x_0$ to $x_n$ and $x$ to, say, $x'_n$ or even $x'_{n,\varepsilon^*}$, and make sure to distinguish $\varepsilon^*$ from the earlier $\varepsilon$s (hence the $^*$).)


Finally, I'd like to note a very useful general verification technique: if you think you have both a proof and a counterexample for the same claim, try applying the proof, step by step, to the specific counterexample and see what goes wrong. You'll generally find that one of the following happens:

  1. The proof fails at some specific step; you can then step back and see if the proof is completely unsalvageable, or if it might be possible to rescue part of it by imposing some extra constraints needed to actually make the step work.

  2. The counterexample turns out not to be a counterexample after all. This tends to be less common, since counterexamples are often simpler and more concrete than general proofs, but sometimes it does happen.

  3. You get confused and lose track of what actually happens in your counterexample when you try to apply the proof to it. This can be a sign that your proof (or, just possibly, the counterexample) is badly written, and the muddled structure conceals some hidden assumption or other flaw. In such cases, try to rewrite the proof until you can actually follow it yourself.

This can actually be a useful strategy even when you don't yet have a fully formed proof or counterexample in mind. Basically, if you have a claim so abstract that you don't have a good intuition for why it is or isn't generally true, try to come up with a concrete example and work through that first. This will often give you at least some intuition about the general case, which you can then use to try to sketch a proof. Then, if you can convince yourself that the claim holds at least with certain assumptions, try to come up with a new example that breaks those assumptions, and see what happens there.

For the first example, you can often simplify a lot: a general topological space $X$ can be $X = \mathbb R$; an arbitrary function $f$ can be $f(x) = x$, or $f(x) = 0$; an arbitrary number $n$ can be, say, $n = 1$. If you have a free parameter $x$, see what happens when $x$ is very close to zero, or very, very large. Just pick something simple enough that you can keep track of all the details, and familiar enough that you can apply your existing knowledge and intuition.

I've found this to be a particularly useful technique for following lectures, where you don't always have time to sit down and work out at your leisure just why the claim written on the blackboard really holds. Several times, it has also helped me spot lecturer mistakes or unstated assumptions — it's quite remarkable how often a claim that looks like it should hold in general can fail on a very simple counterexample.

Solution 3:

Mike's example, in fact, generalizes significantly.

Theorem: Let $X$ be a metric space. If $X$ contains a sequence $(U_n)$ of pairwise disjoint nonempty open sets, then there is a sequence $(f_n:X\to\Bbb R)$ of continuous functions that converges pointwise to a continuous function, but not uniformly.

Let $V_n\subseteq U_n$ be a nonempty closed set inside of $U_n$; one must exist (a singleton already works). Since $X$ is a metric space, it is normal, and so Urysohn's Lemma guarantees the existence of a continuous function $f_n : X \to [0,1]$ that is $1$ on $V_n$ and $0$ on $(U_n)^c$.
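(In fact, in a metric space one can bypass Urysohn's Lemma and write such a function down directly; this explicit formula is my addition, not part of the original argument: $$ f_n(x) = \frac{d(x, (U_n)^c)}{d(x, (U_n)^c) + d(x, V_n)}. $$ The denominator never vanishes, since no point lies in both of the disjoint closed sets $(U_n)^c$ and $V_n$, so $f_n$ is continuous, equal to $1$ on $V_n$ and $0$ on $(U_n)^c$. Note that $(U_n)^c \neq \emptyset$, since the other $U_m$ lie in it.)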

Evidently, $f_n\to 0$ pointwise: since the $U_n$ are disjoint, every point is mapped to zero by all of the functions except possibly one. However, it is also clear that $\sup_x|f_n(x)-0|=1$ (the value $1$ is attained on $V_n$), and so the convergence is not uniform.

(I believe you can replace the disjoint-sequence-of-open-sets condition with $|X|=\infty$: relax it to a sequence of distinct points, and then I'm pretty sure the sequence is either discrete or has a Cauchy subsequence. If it is discrete, then we expand the points to disjoint open sets and we're good. If it has a Cauchy subsequence, then pass to that subsequence, and we can build the sets inductively by putting a closed set around the limit point so that there are only finitely many sets to consider. Normality kicks in and we're still good.)

EDIT: It may generalize in another direction as well. The concept of uniform convergence most generally makes sense in uniform spaces. In general a uniform space need not be normal, but it must be Tychonoff, and this question shows that Urysohn's Lemma almost holds in that setting. Since singletons are compact in Tychonoff spaces, we should be able to repeat the argument. But I have not worked much with uniform spaces and so I may be missing some subtleties.