On "familiarity" (or How to avoid "going down the Math Rabbit Hole"?)

Anyone trying to learn mathematics on his/her own has had the experience of "going down the Math Rabbit Hole."

For example, suppose you come across the novel term "vector space" and want to learn more about it. You look up various definitions, and they all refer to something called a "field." So now you're off to learn what a field is, but it's the same story all over again: all the definitions you find refer to something called a "group." Off you go to learn what a group is. Ad infinitum. That's what I call "going down the Math Rabbit Hole."

Upon first encountering the situation described above, one may think: "well, if that's what it takes to learn about vector spaces, then I'll have to toughen up and do it." I picked this particular example, however, because I'm sure that this course of action is not just arduous: it is in fact utterly misguided.

I can say so with some confidence, for this particular case, thanks to some serendipitous personal experience. Luckily for me, a kind calculus professor in college gave me the tip to take a course in linear algebra (something I would never have thought of on my own), and therefore I had the luxury of learning about vector spaces without having to venture into the dreaded MRH. I did well in this class and got a good intuitive grasp of vector spaces, but even after I had studied for the final exam (let alone on the first day of class), I couldn't have said what a field was. Therefore, from my experience, and that of pretty much all my fellow students in that class, I know that one does not need to know a whole lot about fields to get the hang of vector spaces. All one needs is a familiarity with some field (say $\mathbb{R}$).

Now, it's hard to pin down more precisely what this familiarity amounts to. The only thing that I can say about it is that it is a state somewhere between, and quite distinct from, (a) the state right after reading and understanding the definition of whatever it is one wants to learn about (say, "vector spaces"), and (b) the state right after acing a graduate-level pure math course in that topic.

Even harder than defining this familiarity is coming up with an efficient way to attain it...

I'd like to ask all the math autodidacts reading this: how do you avoid falling into the Math Rabbit Hole? And more specifically, how do you efficiently attain enough familiarity with pre-requisite concepts to move on to the topics that you want to learn about?

PS: John von Neumann allegedly once said "Young man, in mathematics you don't understand things. You just get used to them." I think that this "getting used to things" is much of what I'm calling familiarity above. The problem of learning mathematics efficiently then becomes the problem of "getting used to things" quickly.

EDIT: Several answers and comments have suggested using textbooks rather than, say, Wikipedia, to learn math. But textbooks usually have the same problem. There are exceptions, such as Gilbert Strang's books, which generally avoid technicalities and instead focus on the big picture. They are indeed ideal introductions to a subject, but they are exceedingly rare. For example, as I already mentioned in one comment, I've been looking for an intro book on homotopy theory that focuses on the big picture, to no avail; all the books I've found bristle with technicalities from the get-go: Hausdorff this, locally compact that, yadda yadda...

I'm sure that when one mathematician asks another for an introduction to some branch of math, the latter does not start spewing all these formal technicalities, but instead gives a big-picture account, based on simple examples. I wish authors of mathematics books sometimes wrote books in such an informal vein. Note that I'm not talking here about books written for math-phobes (in fact I detest it when a math book adopts a condescending "for-dummies", "let's-not-fry-our-little-brains-now" tone). Informal does not mean "dumbed down". There's a huge gap in the mathematics literature (at least in English), and I can't figure out why.

(BTW, I'm glad that MJD brought up Strang's Linear Algebra book, because it's a concrete example that shows it's not impossible to write a successful math textbook that stays on the big picture, and doesn't fuss over technicalities. It goes without saying that I'm not advocating that all math books be written this way. Attention to such technical details, precision, and rigor are all essential to doing mathematics, but they can easily overwhelm an introductory exposition.)


Your example makes me think of graphs.

Imagine some nice, helpful fellow came along and made a big graph of every math concept ever, where each concept is one node and related concepts are connected by edges. Now you can take a copy of this graph and color each node based on whether you "know" that concept: green for known, grey for unknown.

How to define "know"? In this case: when somebody mentions that concept while talking about something, do you immediately feel confused and get the urge to look the concept up? If not, then you know it (funnily enough, you may be deluding yourself into thinking you know something that you completely misunderstand, and it would be classed as "knowing" based on this rule - but that's fine, and I'll explain why in a bit). For the purposes of determining whether you "know" it, assume that the particular thing the person is talking about isn't some intricate argument that hinges on obscure details of the concept or bizarre interpretations - it's just mentioned matter-of-factly, as a tangential remark.

When you are studying a topic, you are basically picking one grey node and trying to color it green. But you may discover that to do this, you must color some adjacent grey nodes first. So the moment you discover a prerequisite node, you go to color it right away, and put your original topic on hold. But this node also has prerequisites, so you put it on hold, and... What you are doing is known as a depth-first search. It's natural for it to feel like a rabbit hole - you are trying to go as deep as possible. The hope is that sooner or later you will run into a wall of greens, which is when your long, arduous search will have borne fruit, and you will get to feel that unique rush of climbing back up the stack with your little jewel of a recursion-terminating return value.

Then you get back to coloring your original node and find out about the other prerequisite, so now you can do it all over again.
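
To make the analogy concrete, here is a minimal Python sketch of that depth-first strategy (the toy concept graph and the names in it are mine, purely for illustration):

```python
# Toy concept graph: each concept maps to its prerequisites.
prereqs = {
    "vector space": ["field"],
    "field": ["group"],
    "group": ["set"],
    "set": [],
}

known = {"set"}  # the "green" nodes

def learn_dfs(concept):
    """Depth-first learning: chase every prerequisite before
    returning to the topic you actually wanted."""
    if concept in known:
        return
    for p in prereqs[concept]:
        learn_dfs(p)        # down the rabbit hole...
    print("studying", concept)
    known.add(concept)      # ...and climb back up the stack

learn_dfs("vector space")
# studying group
# studying field
# studying vector space
```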

DFS is suited for some applications, but it is bad for others. If your goal is to color the whole graph (i.e., learn all of math), any strategy will have you visit the same number of nodes, so the choice doesn't matter as much. But if you are not seriously attempting to learn everything right now, DFS is not the best choice.

So, the solution to your problem is straightforward - use a more appropriate search algorithm!

Immediately obvious is breadth-first search. This means: when reading an article (or page, or book chapter), don't rush off to look up every new term as soon as you see it. Circle it or make a note of it on a separate sheet of paper, but force yourself to finish your text even if it's completely incomprehensible to you without knowing the new term. You will now have a list of prerequisite nodes, and can deal with them in a more organized manner.
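
Here is the same toy example done breadth-first (again, the names are purely illustrative; the point is only that unknown terms go into a queue instead of being chased immediately):

```python
from collections import deque

prereqs = {
    "vector space": ["field"],
    "field": ["group"],
    "group": ["set"],
    "set": [],
}

def learn_bfs(start, known):
    """Breadth-first learning: finish reading the current topic,
    note its unknown prerequisites, then work through the list."""
    queue = deque([start])
    while queue:
        topic = queue.popleft()
        if topic in known:
            continue
        print("reading about", topic)
        known.add(topic)
        # Queue unknown prerequisites rather than recursing on them.
        queue.extend(p for p in prereqs[topic] if p not in known)

learn_bfs("vector space", known={"set"})
# reading about vector space
# reading about field
# reading about group
```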

Compared to your DFS, this already makes it much easier to avoid straying too far from your original area of interest. It also has another benefit which is not common in actual graph problems: often in math, and in general, understanding is cooperative. If you have a concept A with prerequisite concepts B and C, you may find that B is very difficult to understand (it leads down a deep rabbit hole) - but only if you don't yet know the very easy topic C, which, if you do know it, makes B very easy to "get", because you quickly figure out the salient and relevant points (or it may turn out that knowing either B or C is sufficient to learn A). In this case, you really don't want a learning strategy that fails to make sure you do C before B!

BFS not only allows you to exploit such cooperativity, but it also allows you to manage your time better. After your first pass, let's say you ended up with a list of 30 topics you need to learn first. They won't all be equally hard. Maybe 10 will take you 5 minutes of skimming Wikipedia to figure out. Maybe another 10 are so simple that the first Google Images diagram explains everything. Then there will be 1 or 2 which will take days or even months of work. You don't want to get tripped up on the big ones while you have the small ones to take care of. After all, it may turn out that the big topic is not essential, but the small topic is. If that's the case, you would feel very silly if you tried to tackle the big topic first! But if the small one proves useless, you haven't really lost much energy or time.

Once you're doing BFS, you might as well benefit from the other very nice and clever twists on it, such as Dijkstra's algorithm or A*. When you have the list of topics, can you order them by how promising they seem? Chances are you can, and chances are your intuition will be right. Another thing to do: since ultimately your aim is to link up with some green nodes, why not prioritize topics which seem to be getting closer to things you do know? The beauty of A* is that these heuristics don't even have to be very accurate - even "wrong" or "unrealistic" heuristics may end up making your search faster.
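
A sketch of that best-first flavor, using a priority queue (the `effort` numbers are invented stand-ins for your own intuition about how costly or promising a topic looks; note how the cheap prerequisite C gets done before the expensive B, as discussed above):

```python
import heapq

# Toy graph echoing the A/B/C example above.
prereqs = {"A": ["B", "C"], "B": ["deep rabbit hole"],
           "C": [], "deep rabbit hole": []}

# Guessed cost of each topic; lower means "do this sooner".
effort = {"A": 2, "B": 30, "C": 0.5, "deep rabbit hole": 90}

def learn_best_first(start, known):
    """Best-first learning, in the spirit of Dijkstra/A*:
    always pick the most promising unknown topic next."""
    frontier = [(effort[start], start)]
    while frontier:
        _, topic = heapq.heappop(frontier)
        if topic in known:
            continue
        print("reading about", topic)
        known.add(topic)
        for p in prereqs[topic]:
            if p not in known:
                heapq.heappush(frontier, (effort[p], p))

learn_best_first("A", known=set())
# reading about A
# reading about C
# reading about B
# reading about deep rabbit hole
```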


You don't learn what a vector space is by swallowing a definition that says

A vector space $\langle V, S\rangle$ is a set $V$ and a field $S$ that satisfy the following 8 axioms: …

Or at least I don't, and from the sound of things that isn't working for you either. That definition is for someone who not only already knows what a field is, but who also already knows what a vector space is, and for whom the formal statement may illuminate what they already know.

Instead, if you want to learn what a vector space is, you pick up an elementary textbook on linear algebra and you start reading it. I picked up Linear Algebra and its Applications (G. Strang, 1988) from next to the bed just now, and I find that "vector space" isn't even defined. The first page of chapter 2 (“Vector Spaces and Linear Equations”) introduces the idea informally, leaning heavily on the example of $\Bbb R^n$, which was already introduced in Chapter 1, and then emphasizes the crucial property: “We can add any two vectors, and we can multiply vectors by scalars.” The next page reiterates this idea: “a real vector space is a set of ‘vectors’ together with rules for vector addition and multiplication by real numbers.” Then there follow three examples that are different from the $\Bbb R^n$ examples.

A good textbook will do this: it will reduce those 8 axioms to a brief statement of what the axioms are actually about, and provide a set of illuminating examples. In the case of the vector space, the brief statement I quoted, boldface in the original, was it: we can add any two vectors, and we can multiply vectors by scalars.

You don't need to know what a field is to understand any of this, because it's restricted to real vector spaces, rather than to vector spaces over arbitrary fields. But it sets you up to understand the idea in its full generality once you do find out what a field is: “Just like the vector spaces you're used to, except instead of the scalars being real numbers, they can be elements of any field.”
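
(For reference - standard material, not a quotation from Strang - the eight axioms being compressed say, for all $u, v, w \in V$ and scalars $a, b \in \Bbb R$: $u+v=v+u$; $(u+v)+w=u+(v+w)$; there is a zero vector $0$ with $v+0=v$; every $v$ has a $-v$ with $v+(-v)=0$; $1v=v$; $a(bv)=(ab)v$; $a(u+v)=au+av$; and $(a+b)v=av+bv$. Every one of them is an instance of "adding vectors and multiplying by scalars behaves the way you'd expect.")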

If you find yourself chasing an endless series of definitions, that's because you're trying to learn mathematics from a mathematical encyclopedia. Well, it's worth a try; it worked for Ramanujan. But if you find that you're not Ramanujan, you might try what the rest of us non-Ramanujans do, and try reading a textbook instead. And if the textbook starts off by saying something like:

A vector space $\langle V, S\rangle$ is a set $V$ and a field $S$ that satisfy the following 8 axioms: …

then that means you have mistakenly gotten hold of a textbook that was written for people who already know what a vector space is, and you need to put it aside and get another one. (This is not a joke; there are many such books.)

The Strang book is really good, by the way. I recommend it.

One last note: It's not usually enough to read the book; you have to do a bunch of the exercises also.


A very well-known mathematician showed me how he avoids the rabbit hole. I copied his method, and now I can stay out of it most of the time.

I had private weekly seminars with him. Every week, he would research a topic he knew nothing about (that was our deal, and that's what was in it for him). I would name the topic (examples: Bloom filters, the Knuth-Bendix theorem, linear logic), and the following week he would give a zero-frills PowerPoint presentation of what he found out. The presentations had a uniform pattern:

Motivating Example
Definitions
Lemmas and Theorems
Applications

By beginning with the motivating example, we never got lost in the thicket of technicalities, and the Applications section would circle back and explain the Motivating Example (and maybe some others if time allowed) in terms of the technicalities.

This is how he taught himself a topic without going down the MRH:

Limit your rabbit-hole time (one week); the presentation must be one hour long.
Focus on a Motivating Example; do just enough technicalities to explain the example and optional variations.

I have since copied this style. When I teach myself a new topic, I make a slide presentation like that, and then I present it to others in a weekly reading group.


I think that sometimes you don't really need to know exactly what every term means - not right away, anyway. Most of the time, a vague idea is enough to get you started.

Check out the definition (without necessarily understanding it at first - ploughing through a huge mess of a formal definition is not always helpful at this point, but it helps to see its general structure), then see some examples, tinker a little, see how it works. If I told you everything about horse riding for a month, you probably wouldn't be as good at horse riding as you would be if you had instead practised horse riding for a week (and not just because I don't know a thing about horse riding ;) ).

As you get deeper into the subject matter, it might help to understand the details of the definitions, as well as the auxiliary objects. What are they for? What do they really mean? But at first, you shouldn't expect to understand everything, especially when studying more in-depth stuff which (unlike vector spaces) can get you really deep into... the rabbit hole.

Familiarity comes with experience. There is no other way.

As a side comment about your vector spaces example: I don't think you can really understand linear algebra if you restrict yourself to the reals. They have characteristic zero, they are not algebraically closed, and they are naturally ordered... this can be very misleading. It's good for starters, but I wouldn't say you understand vector spaces if you only understand real vector spaces.


It is a good idea to learn about vector spaces first in the context of real scalars rather than general fields. But afterward, it is worthwhile to observe that, in most of what you learned (everything short of inner product spaces, in the usual presentations of the subject), you never used the fact that the real numbers come with an ordering; you never needed to consider whether numbers were positive or negative. And for some purposes, like eigenvalues and eigenvectors, it's actually helpful to allow complex numbers into your picture.

In fact, all you needed about the real numbers was that you can add, subtract, multiply, and divide them (except of course that you can't divide by $0$) and you can manipulate equations as you learned in elementary algebra. That's why it's safe to allow complex numbers into your picture: they share all those essential (for linear algebra) properties of the real numbers.

And at this point, you know what a field is, even if you've never seen the definition or even the word, because a field is just a collection of things that resemble numbers to the extent that you can add, subtract, multiply, and divide them (except of course that you can't divide by $0$) and you can manipulate equations as you learned in elementary algebra.

The formal axioms that define "field" are just the result of the observation that all those algebraic rules you learned are consequences of just a few of the rules; i.e., most of them are redundant. So "field" can be defined by giving just the necessary rules, not all the redundant ones. Of course, that makes it easier to check that something is a field, because you have far fewer rules to verify, and it also makes it easier to write the definition of "field" in a book, because it's shorter than it would otherwise be. But the true idea of "field" remains that all the usual manipulations of equations are valid.
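
For concreteness (this is the standard list, nothing special to this answer), the few necessary rules can be written out as follows: a field is a set $F$ with operations $+$ and $\cdot$ such that, for all $a, b, c \in F$,

$$(a+b)+c = a+(b+c), \qquad (ab)c = a(bc),$$
$$a+b = b+a, \qquad ab = ba,$$
$$a(b+c) = ab+ac,$$

there are distinct elements $0$ and $1$ with $a+0 = a$ and $a \cdot 1 = a$, every $a$ has an additive inverse $-a$, and every $a \neq 0$ has a multiplicative inverse $a^{-1}$. All the other familiar rules for manipulating equations follow from these.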