How to catch proof errors during self study? [closed]
I completed a Bachelor's in Mathematics in May 2018 with a 3.6 major GPA. I had trouble with real analysis, scoring B-, B, B, B+ in the four courses I took on the subject despite significant effort and paying for a PhD candidate as a tutor.
My goal is to go to graduate school in Machine Learning. I want to learn as much theory as I can on my own while working to afford graduate school. Over the next ~6 years, I want to self-study up to 12 graduate-level topics related to Probability / Point Estimation / Optimization or Control / Dynamics and Statistical Learning, or as many of them as I can finish in that time.
I will also schedule time over those ~6 years to work on programming at least 6 non-trivial personal projects in Machine Learning and on replicating one peer-reviewed academic paper every 1 - 2 months. After that, I'll crack open the Deep Learning Book and Reinforcement Learning Book over another year to study them thoroughly using my experience and studied theory while applying to graduate schools.
The answer here (https://www.quora.com/How-can-I-self-study-functional-analysis) raises an important point:
"if you want to understand it [Functional Analysis] in depth, you have to solve problems, which usually means proving stuff (as opposed to calculating stuff), and that's pretty hard for anyone to self-critique.
You'll want someone to help you out of tight spots as you're reading the text, and look over your solutions to see if you're actually getting it. It's not too hard to delude yourself into thinking that you've proved something while in fact you did not. If you miss a subtlety or fail to understand a definition, you might be proving the wrong thing or nothing at all - and you may have no way of even realizing that."
Partial progress is still amazing. However, what proactive strategies help avoid these pitfalls? I'm not always an A student, and I want to avoid spending more than 8 months on average per subject. Should I post every proof I attempt on Stack Exchange for advice and correctness checks, and occasionally contact a professor from my alma mater when I am really stuck?
You can post some (not all) proofs here with the proof-verification tag. It would be helpful if you flagged the few particular places where you were in doubt.
If an old professor is willing to spend occasional time, go for it.
One suggestion. Rather than learning the basics from the bottom up, start with something you really want to know for its own sake and work backwards through the prerequisites as necessary. You will probably discover that you need a lot more linear algebra than you thought, and a lot less functional analysis.
Finally, six years is a long time to study all alone. Good grad schools do support their students. Consider applying sooner.
To answer your first question, about how you can catch errors during self-study, I think that you need to have others check your proofs. The history of mathematics has numerous alleged proofs by well-known mathematicians that were later shown to be incomplete or wrong. So, I think you need to find a community of researchers, online or not, to exchange ideas with.
As a matter of fact, these days I talk to many people about my future B.Sc. thesis which is going to be about machine learning. What I'm going to write is something that has been said to me by my professors and students studying at higher levels, and I don't claim that it's the best possible approach. So, please keep that in mind.
I think the starting point is to get a copy of The Elements of Statistical Learning by Hastie, Tibshirani and Friedman. As a more advanced text to supplement it, you can use Pattern Recognition and Machine Learning by C. Bishop. I think you already know this or probably have even better suggestions for this part.
After reading these two books, you can read the book on deep learning by Ian Goodfellow, Yoshua Bengio and Aaron Courville, titled simply Deep Learning. Once you start reading it, you will be surprised to see how little you need to know to get through the chapters.
You need to take a course in Stochastic Processes. Now, engineering students take this course too. If you can, take this course from the engineering department because they usually avoid measure theory and depending on the lecturers, you may learn some things about signals and systems during the course.
If you want to take the rigorous path, you will need to learn measure theory first. Then you'll be able to understand stochastic calculus rigorously. Last semester, I took a course in stochastic processes from the computer engineering department. You will be surprised to know that most computer engineers know little about the rigorous treatment of the material they work with every day. A book that engineers use for a more or less mathematical treatment is Gallager's Stochastic Processes, which is a terrible book in my opinion. It doesn't satisfy mathematicians, nor does it explain the beautiful intuitions that engineering sometimes offers.
One advantage of the rigorous path is that you get to learn about some other fields, like financial mathematics, as well. The rigorous approach is helpful when you want to define things like conditional expectation and the Radon-Nikodym derivative. But after all, I think it's not wise to spend too much time on 'abstractions'.
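To give a flavour of what the rigorous definitions look like, here is a rough sketch of how conditional expectation is usually defined through the Radon-Nikodym theorem. The notation (the probability space, the sub-sigma-algebra, the measure called nu) is my own choice for illustration and is not taken from any particular book.

```latex
% Setting (assumed): a probability space (\Omega, \mathcal{F}, P), an integrable
% random variable X, and a sub-sigma-algebra \mathcal{G} \subseteq \mathcal{F}.
\[
  \nu(A) = \int_A X \, dP, \qquad A \in \mathcal{G}.
\]
% \nu is a finite signed measure on \mathcal{G} with \nu \ll P|_{\mathcal{G}},
% so the Radon-Nikodym theorem gives a \mathcal{G}-measurable density:
\[
  \mathbb{E}[X \mid \mathcal{G}] := \frac{d\nu}{dP|_{\mathcal{G}}},
  \qquad\text{i.e.}\qquad
  \int_A \mathbb{E}[X \mid \mathcal{G}] \, dP = \int_A X \, dP
  \quad \text{for all } A \in \mathcal{G}.
\]
```

The density is only determined up to sets of probability zero, which is why conditional expectation is said to be unique "almost surely".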
You need to spend a lot of time on programming. Learn Python or R, preferably. You need to learn about Markov chain Monte Carlo methods. You may also need to learn about the calculus of variations at some point. Overall, the list of things that you can learn is endless. You may like to learn differential geometry to understand information geometry, which is more theoretical than practical. Also, some knowledge from physics, like thermodynamics, can be helpful when you study things like the Boltzmann machine. Again, I would like to emphasize that many of the recent advances in neural networks and deep learning do not really require advanced (abstract) mathematics. Some linear algebra, a good understanding of probability theory, some experience with matrix calculations as in The Matrix Cookbook, and some of the creativity that engineers have are enough to start your journey. Once you have started and have chosen your final destination, you will acquire the knowledge you need along the way.
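As a small taste of what Markov chain Monte Carlo looks like in Python, here is a minimal sketch of a random-walk Metropolis sampler targeting a standard normal distribution. The function names, the step size, the seed and the burn-in length are all arbitrary choices made for illustration, and real work would normally use an established library rather than a hand-rolled loop.

```python
import math
import random


def metropolis(log_density, x0, n_samples, step=1.0, seed=0):
    """Random-walk Metropolis sampler for a 1-D unnormalized log density."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    x = x0
    log_p = log_density(x)
    samples = []
    for _ in range(n_samples):
        # Propose a move by adding Gaussian noise to the current state.
        proposal = x + rng.gauss(0.0, step)
        log_p_new = log_density(proposal)
        # Accept with probability min(1, p(proposal) / p(x)).
        accept_prob = math.exp(min(0.0, log_p_new - log_p))
        if rng.random() < accept_prob:
            x, log_p = proposal, log_p_new
        # A rejected proposal repeats the current state in the chain.
        samples.append(x)
    return samples


def standard_normal_log_density(x):
    # Log density of N(0, 1) up to an additive constant.
    return -0.5 * x * x


samples = metropolis(standard_normal_log_density, x0=0.0, n_samples=20000)
kept = samples[2000:]  # discard an (arbitrarily chosen) burn-in period
mean = sum(kept) / len(kept)
var = sum((s - mean) ** 2 for s in kept) / len(kept)
print(f"mean ~ {mean:.2f}, variance ~ {var:.2f}")
```

If the sampler is working, the printed mean should be close to 0 and the variance close to 1. Checking a simulation against a known answer like this is also a modest way of catching your own errors.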
By not stopping. If you keep thinking about the topic with a misconception in your head (even if it's a small untrue lemma that you never stated explicitly), you will eventually prove something you know to be untrue. Then you can have lots of fun* going over everything you thought you knew with a fine-toothed comb, and trying to find the bug.
*: Your mileage may vary.