Why do introductory real analysis courses teach bottom up?

Although I never had any difficulty with the $\epsilon$-$\delta$ definition, I still found that continuity made much more sense after I encountered the general topological setting. However, I ended up a set-theoretic topologist; after some forty years of teaching mathematics, I’m quite certain that this is a minority experience. I’m also quite certain that there is no single right answer to the question of how to teach continuity in a first rigorous approach: the answer depends not only on the individual student, but also on the preferences of the instructor. I do think that it’s worth being aware of the range of possibilities and some of their strengths and weaknesses.

I’m familiar with five approaches to teaching continuity in a rigorous fashion.

  1. The traditional $\epsilon$-$\delta$ approach. This has the overwhelming advantage of universal familiarity and use, and the distinct disadvantage that while the intuition underlying continuity is covariant (if $x$ is near $a$, then $f(x)$ is near $f(a)$), the definition is contravariant: it starts from a tolerance $\epsilon$ on the output and works back to a tolerance $\delta$ on the input. The definition is also $\forall\exists\forall$, which is logically rather complex. On the other hand, it uses concrete, quantitative measures of approximation, which for many students is a plus, and its connection with uniform continuity (and other stronger forms of continuity) is straightforward.

  2. Sequences first, with $\epsilon$-$N$, then $f$ is continuous iff it preserves limits of sequences. The main advantages are that a sequence is a particularly simple kind of function whose convergence is easily visualized (‘it eventually gets inside any given $\epsilon$-nbhd and stays there’), and that the definition of continuity is covariant. It’s also relatively easy eventually to move on to the $\epsilon$-$\delta$ definition of continuity. The problem with this approach is that the ‘thinness’ of sequences tends to obscure what’s really going on, namely, that everything near $a$ is being sent near $f(a)$. It also doesn’t generalize well to uniform continuity, to put it mildly.

  3. Open sets. In full generality this is simply too abstract for most students in a first rigorous encounter with continuity. If we work only with functions on the reals and limit ourselves to open intervals, it isn’t really much different from the $\epsilon$-$\delta$ approach. It may make some of the theory just a hair simpler, but the main advantage is that it makes the generalization to $\mathbb{R}^n$ easier: open boxes are more convenient than open (Euclidean) balls. On the other hand, general open intervals aren’t as concrete a notion of approximation as $\epsilon$-nbhds.

  4. An abstract nearness relation. This approach was outlined in P. Cameron, J. G. Hocking and S. A. Naimpally, Nearness - A Better Approach to Continuity and Limits, The American Mathematical Monthly, Vol. 81, No. 7 (Aug. - Sep., 1974), pp. 739-745. It has the advantage of a covariant definition of continuity, of treating convergence of sequences as just another instance of continuity, and of having a very intuitive underpinning. It has the disadvantage of being very much outside the mainstream, so that one must eventually derive the more usual characterizations of continuity anyway. One pays for the more intuitive introduction by having to spend extra time introducing the standard approach, and it’s not clear to me whether there’s a net benefit. One definite strength of the approach, however, is that it yields uniform continuity in a very straightforward way: one simply replaces the notion of a point being near a set with the notion of a set being near another set, thereby getting a proximity space.

  5. Infinitesimals. The classic example of this approach is H. Jerome Keisler, Elementary Calculus: An Infinitesimal Approach; another example is Keith Stroyan, Foundations of Infinitesimal Calculus. Pretty much everything that I said about the nearness approach applies here as well, including the ease of defining uniform continuity. The main differences are that the preliminaries are a bit more complicated, but the payoff is in my view a bit greater: rigorous infinitesimals are, I think, more useful than the axiomatized notion of nearness used in (4).
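
For reference, the five characterizations can be put side by side for a function $f:\mathbb{R}\to\mathbb{R}$ and a point $a$. This is only a sketch, and the notation is an assumption made for illustration rather than something taken from the cited texts: in (4), "near" is read metrically ($x$ is near $A$ when $\inf_{y\in A}|x-y|=0$), and in (5), $f^*$ denotes the extension of $f$ to the hyperreals and $\approx$ means "differs by an infinitesimal".

$$
\begin{aligned}
&(1)\quad \forall\epsilon>0\ \ \exists\delta>0\ \ \forall x:\ \ |x-a|<\delta \implies |f(x)-f(a)|<\epsilon\\
&(2)\quad x_n\to a \implies f(x_n)\to f(a) \quad\text{for every sequence } (x_n)\\
&(3)\quad \text{for every open } U\ni f(a) \text{ there is an open } V\ni a \text{ with } f[V]\subseteq U\\
&(4)\quad a \text{ near } A \implies f(a) \text{ near } f[A] \quad\text{for every } A\subseteq\mathbb{R}\\
&(5)\quad x\approx a \implies f^*(x)\approx f(a) \quad\text{for every hyperreal } x
\end{aligned}
$$

Uniform continuity comes from (1) by letting the point vary inside the scope of $\delta$:

$$
\forall\epsilon>0\ \ \exists\delta>0\ \ \forall x\ \forall y:\ \ |x-y|<\delta \implies |f(x)-f(y)|<\epsilon .
$$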

There is a strong pragmatic argument for (1) or (2), especially in a school that has a lot of transfer students. One can make pretty good pædagogical arguments for (4) and (5), but in most situations they may well be overridden by practical concerns. In practice some combination of ideas from (1), (2), and (3) is likely to be as effective as anything.


My thoughts:

  • People who go on to become professional users of advanced mathematics generally have a different standard of "understandable" than people who don't. Frankly, if I were writing an introductory real analysis book for people who could be counted on to do things like hold multiple equivalent definitions of an object in their heads, to the point that they can compare and contrast them and form a favorite, the book I would produce would be very different from the standard real analysis textbook. But that isn't the world in which those books are written.

    I guess what I'm saying is that although you might feel it's more "understandable" to do it that way, I suspect that if the books actually did that, many would be flummoxed by open sets and happy when they got to $\epsilon$ and $\delta$. The only people who would be happy in either world are the ones for whom the choice makes no difference. (This reminds me, a little bit, of people who have a favorite set-theoretic construction of $\mathbb{R}$ from the natural numbers, with pedagogical justifications for their favorite. For most people, any construction of $\mathbb{R}$ from anything is going to be a huge stumbling block: huge, that is, compared to the size of any pedagogical choice made about how to do it.)

  • The other obvious choices for a definition of continuity require forming a mental image of very large sets: the collection of all open subsets of $\mathbb{R}$, or the collection of all convergent sequences with a given limit. Many people struggle with conceptualizing such large collections, and try to base proofs involving them on misconceptions of what an arbitrary element of such a collection must "look like." (You will see this no matter when you talk about open sets.)

  • At most schools, the introductory real analysis class must also accommodate the needs of future teachers of K-12 mathematics, and people whose future jobs will not involve mathematics at all (but whose majors require one advanced math class). Many of these people will not be happy with any precise definitions, because definitions are used for writing proofs, and they don't see any point in writing proofs. The current choice is partly an acknowledgement of this reality, I think, because...

  • ...the $\epsilon$-$\delta$ definition of continuity is perhaps the closest of many possibilities to the "intuitive" conception of continuity given in calculus classes ($f$ is continuous at $c$ if for however strictly you interpret $\approx$, you can always ensure that $f(x) \approx f(c)$ by taking $x$ sufficiently close to $c$).

    I do think that, for a population of future proof writers, the open sets definition should definitely get more emphasis than playing with $\epsilon$ and $\delta$. (But you can use nothing but the $\epsilon$-$\delta$ definition, and still minimize the amount of playing with $\epsilon$ and $\delta$. It comes down to writing style. I can't defend the writing style of many real analysis textbooks, but I feel these issues usually go far beyond just what choices are made in the definitions.)

  • How do you define an open subset of $\mathbb{R}$? Maybe: a set $G$ is open if for all $g$ in $G$ there are $a$ and $b$ in $G$ satisfying $a < g < b$ and $(a,b) \subseteq G$. OK. To verify that a given subset of $\mathbb{R}$ is open directly from this definition (and not from nice theorems about the sets that satisfy the conditions of this definition), you need to fix $g$'s, and produce $a$'s and $b$'s making the above claim true. But this is a Roman letter version of what you probably don't like about $\epsilon$ and $\delta$.
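
As a small illustration of that last point (a sketch only; the particular choice of $a$ and $b$ is just one convenient option): to check directly that $G=(0,1)$ is open, fix $g\in G$ and take

$$
a=\frac{g}{2},\qquad b=\frac{g+1}{2}.
$$

Then $0<a<g<b<1$, so $a,b\in G$ and $(a,b)\subseteq G$. Producing $a$ and $b$ in terms of $g$ is the same kind of move as producing $\delta$ in terms of $\epsilon$.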


Personally, I found the definition of continuity in terms of open sets much easier to get my head around than the $\varepsilon$-$\delta$-definition. Indeed, when I was first learning this material (self-taught from various texts) I couldn't really parse the latter definition, but then I found the definition in terms of open sets and that made immediate sense to me, since it was quite structural. (It was reminiscent of a homomorphism of groups, for example; we had a certain structure that had to be preserved by the map.)

On the other hand, I have always found algebraic concepts easier than analytic ones, and I'm not sure that my intuition was really improved by the open subset definition; while I found it easier to accept, I don't know that it gave me any better feeling for what a continuous map really is.

What's more, many students are seeing this material in a class before learning other, more structural, mathematical concepts. When everything is new and unfamiliar, having one more layer of structure to contemplate (in this case, the notion of open sets) can just make things more opaque, rather than less; it takes a certain mathematical maturity for structural definitions and explanations to seem more natural, rather than simply mystifying.

Taking all this together, the conclusion that I draw is that probably one has to be exposed to all the various facets of continuous maps in order to build up a solid technical intuition: the approximation viewpoint, expressed in terms of $\varepsilon$s and $\delta$s; the open set viewpoint; the "commuting with convergence of sequences" viewpoint, and so on. And among these, the $\varepsilon$-$\delta$ viewpoint has the merit (from a technical viewpoint) of lending itself to computations (of the type "find the $\delta$ which serves for this given $\varepsilon$"), which are important training for more sophisticated and technical analytic investigations (such as keeping track of errors when trying to interchange various limiting processes).
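
A minimal worked instance of such a computation, using $f(x)=x^2$ at a point $c$ as an assumed example (it is not taken from the discussion above): given $\varepsilon>0$, set $\delta=\min\!\left(1,\ \dfrac{\varepsilon}{1+2|c|}\right)$. Then for $|x-c|<\delta$,

$$
|x^2-c^2| \;=\; |x-c|\,|x+c| \;\le\; |x-c|\,\bigl(|x-c|+2|c|\bigr) \;<\; \delta\,\bigl(1+2|c|\bigr) \;\le\; \varepsilon .
$$

The cap $\delta\le 1$ is what keeps the factor $|x+c|$ under control; this "bound one factor first, then shrink the other" pattern is the sort of bookkeeping that reappears when tracking errors in harder arguments.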


Why not just introduce continuity in terms of open sets first (e.g. it would be a better visual representation)?

You're missing the fact it's not even going to be a different visual representation for the beginning student. So you teach him about open sets, and the topology on the real line. What is he going to imagine when you say "open set"? He's going to picture an open interval.

Or worse, maybe he'll be sharp and he'll be able to picture unions of open intervals, in which case you're going to have to introduce the notion of a basis (and reformulate continuity in terms of bases) in order to invoke the right mental image.

And in the end, you haven't gained anything -- the student is still picturing open intervals, he will still have to use $\epsilon$-$\delta$ arguments, but now he has to hold all these other ideas in his head too.

And, IMO, the most central idea to real analysis (and especially real calculus) is that of approximation -- how to use approximation schemes to prove exact truths, how to come up with good approximations, and how to combine them to form new ones.

The $\epsilon$-$\delta$ definition of continuity is probably the simplest and most basic example of this important idea. If I have a continuous function, then I can always find a suitable approximation of an output value by using a sufficiently good approximation of the input value. Conversely, to prove a function continuous, I show that this is possible.
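
To make this concrete with an assumed example (not one from the question): take $f(x)=\sqrt{x}$ at a point $c>0$. For $x\ge 0$,

$$
\bigl|\sqrt{x}-\sqrt{c}\bigr| \;=\; \frac{|x-c|}{\sqrt{x}+\sqrt{c}} \;\le\; \frac{|x-c|}{\sqrt{c}},
$$

so $|x-c|<\delta:=\varepsilon\sqrt{c}$ guarantees $|\sqrt{x}-\sqrt{c}|<\varepsilon$. With $c=2$ and $\varepsilon=10^{-3}$, this gives $\delta=\sqrt{2}\cdot 10^{-3}\approx 1.4\times 10^{-3}$: any input within that distance of $2$ yields an output within $10^{-3}$ of $\sqrt{2}$.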

It is incredibly important to learn how this idea is expressed precisely, and to be able to work with it. Even if you do decide to teach continuity in some other fashion, you're going to have to break it down into $\epsilon$-$\delta$ anyways, teach them to understand that this form really does capture what's going on, and how to work with it.

The general idea of an open set doesn't capture this notion of approximation anyways; the idea is built into specific examples rather than the general concept. For example, in the Zariski topology, open sets are not about approximation, but instead about avoiding points that are solutions to some polynomial equation (e.g. to avoid singularities or other edge cases in an argument you're making).
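
To spell the Zariski example out in the simplest case (a standard description, stated here for the affine line over a field $k$, which is an added assumption): the closed sets are the common zero sets of families of polynomials,

$$
V(S)=\{\,x\in k : p(x)=0 \text{ for all } p\in S\,\},\qquad S\subseteq k[x],
$$

and the open sets are their complements. Since a nonzero polynomial has only finitely many roots, every nonempty open set is the complement of a finite set. Over $k=\mathbb{R}$, such a set contains points arbitrarily far from any given point, so membership in it carries no information about closeness.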