Mathematical Background Required for Advanced Machine Learning Concepts

Solution 1:

It depends on what you want to do, since ML is now a large and diverse field. A quick summary might look like this:

Basics (i.e. prerequisites for the more advanced areas below)

  • Linear algebra (e.g. matrix operations and decompositions, vector spaces)
  • Multivariate calculus (e.g. gradients and Jacobians for optimization)
  • Basic probability and statistics (e.g. basic distributions & estimators)
  • Algorithmic analysis
  • Basic signal processing (e.g. convolutions, Fourier series)
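To make the "gradients for optimization" item concrete, here is a minimal sketch that compares a hand-derived gradient with a finite-difference approximation. The function f(x, y) = x^2 + 3xy and all names below are made up for illustration; they are not taken from any particular library or course.

```python
def f(x, y):
    """Example scalar function of two variables: f(x, y) = x^2 + 3xy."""
    return x**2 + 3.0 * x * y


def analytic_gradient(x, y):
    """Gradient derived by hand: (df/dx, df/dy) = (2x + 3y, 3x)."""
    return (2.0 * x + 3.0 * y, 3.0 * x)


def numerical_gradient(func, x, y, h=1e-5):
    """Approximate the gradient with central finite differences."""
    df_dx = (func(x + h, y) - func(x - h, y)) / (2.0 * h)
    df_dy = (func(x, y + h) - func(x, y - h)) / (2.0 * h)
    return (df_dx, df_dy)


# Evaluate both at the (arbitrarily chosen) point (1, 2).
exact = analytic_gradient(1.0, 2.0)
approx = numerical_gradient(f, 1.0, 2.0)
print(exact, approx)
```

Checking a derivation against a numerical approximation like this is a common sanity test before trusting a gradient inside an optimizer.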

Mathematical Theory (e.g. PAC theory)

  • Analysis & measure theory (e.g. advanced probability)
  • Functional analysis

Probabilistic Modelling (e.g. Bayesian deep learning, generative modelling)

  • Stochastic processes & information theory (e.g. MCMC, variational inference)
  • Advanced statistics (e.g. properties of estimators, convergence of distributions)
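As a rough illustration of the MCMC item, the sketch below runs a random-walk Metropolis sampler. The target (a standard normal), the step size, and the function names are assumptions chosen only to keep the example short; real probabilistic models are far more involved.

```python
import math
import random


def log_density(x):
    """Unnormalised log-density of a standard normal (the assumed target)."""
    return -0.5 * x * x


def metropolis(n_samples, step=1.0, seed=0):
    """Random-walk Metropolis sampler with a Gaussian proposal."""
    rng = random.Random(seed)
    x = 0.0
    samples = []
    for _ in range(n_samples):
        proposal = x + rng.gauss(0.0, step)
        log_ratio = log_density(proposal) - log_density(x)
        # Accept with probability min(1, p(proposal) / p(x)).
        if rng.random() < math.exp(min(0.0, log_ratio)):
            x = proposal
        samples.append(x)  # a rejected proposal repeats the current state
    return samples


samples = metropolis(20000)
mean = sum(samples) / len(samples)
variance = sum((s - mean) ** 2 for s in samples) / len(samples)
print(mean, variance)  # should land near 0 and 1 for this target
```

Understanding why the chain's samples approximate the target distribution is where the stochastic-process background comes in.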

Implementation-Oriented ML

  • Optimization (e.g. convex optimization)
  • Numerical analysis (e.g. discretizations)
  • Computational numerics (e.g. error accumulation, matrix algorithms)
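The "error accumulation" item can be seen even in a tiny example. This sketch (the helper name naive_sum is made up for illustration) adds 0.1 ten times with a plain loop and compares the result with Python's math.fsum, which compensates for the rounding lost at each step.

```python
import math


def naive_sum(values):
    """Add left to right, rounding after every addition."""
    total = 0.0
    for v in values:
        total += v
    return total


values = [0.1] * 10
naive = naive_sum(values)     # rounding error builds up across additions
accurate = math.fsum(values)  # keeps track of the low-order bits that get lost
print(naive, accurate)
```

The gap is tiny here, but the same effect grows with the number of operations, which is why it matters for long training runs and large matrix computations.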


Solution 2:

Machine learning as a whole is incredibly diverse, and the math you will encounter depends largely on the kinds of questions you're interested in. Regardless of your interests, a strong background in linear algebra and probability/statistics is a must.

Assuming that you're interested in Deep Learning (a fair assumption, considering it's at the height of a huge popularity wave), you will want to make sure you're comfortable with multivariable calculus and some basic optimization (the latter follows from a strong understanding of calculus and linear algebra).
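To give a feel for how multivariable calculus and basic optimization fit together, here is a minimal gradient descent sketch. The quadratic loss, the learning rate, and the step count are arbitrary choices for illustration, not values from any specific deep learning setup.

```python
def loss(x, y):
    """Convex quadratic with its minimum at (1, -2)."""
    return (x - 1.0) ** 2 + 2.0 * (y + 2.0) ** 2


def gradient(x, y):
    """Partial derivatives of the loss with respect to x and y."""
    return (2.0 * (x - 1.0), 4.0 * (y + 2.0))


def gradient_descent(x, y, learning_rate=0.1, steps=200):
    """Repeatedly step against the gradient from the starting point (x, y)."""
    for _ in range(steps):
        gx, gy = gradient(x, y)
        x -= learning_rate * gx
        y -= learning_rate * gy
    return x, y


x_min, y_min = gradient_descent(0.0, 0.0)
print(x_min, y_min, loss(x_min, y_min))
```

Training a neural network follows the same loop in spirit, except that the loss has many more parameters and the gradient is computed by backpropagation.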

Deep Learning at a practical level is very accessible to anyone who has taken early undergraduate mathematics courses. You could simply pick up a book and go through the introductory sections to see what kind of theory the authors use. Deep Learning by Goodfellow et al. (freely available online) dedicates the first of its three parts to building up the prerequisite math and statistics. That part is intended as a review of concepts, so you can supplement it with more specialized texts if you need to.

My biggest recommendation, though, is to not get too caught up in reading. Since you're interested in data science, you need to actually work with and explore real datasets. When you are introduced to a new algorithm, don't just go through the main ideas, strengths, weaknesses, etc.; implement it yourself on some reasonable dataset (MNIST is a very popular beginner dataset for computer vision). Deep Learning is still in its infancy as a scientific discipline, and many results still come from intuition, which you can only gain by actually working with these things. Come up with ideas to address issues, run tests, and see what works and what doesn't. You will understand things a lot better this way.
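As an example of what "implement it yourself" can look like, here is a bare-bones perceptron. MNIST needs a download, so this sketch uses four made-up 2D points instead; the data, function names, and epoch limit are all placeholders for illustration.

```python
def train_perceptron(points, labels, epochs=20):
    """Classic perceptron rule: nudge the boundary on every mistake."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        mistakes = 0
        for (x1, x2), y in zip(points, labels):
            score = w[0] * x1 + w[1] * x2 + b
            if y * score <= 0:  # misclassified or exactly on the boundary
                w[0] += y * x1
                w[1] += y * x2
                b += y
                mistakes += 1
        if mistakes == 0:  # a full pass with no errors, so stop early
            break
    return w, b


def predict(w, b, point):
    """Return +1 or -1 depending on which side of the boundary the point is."""
    return 1 if w[0] * point[0] + w[1] * point[1] + b > 0 else -1


# Toy, linearly separable data (invented for this example).
points = [(2.0, 1.0), (3.0, 2.5), (-1.0, -1.5), (-2.0, -0.5)]
labels = [1, 1, -1, -1]
w, b = train_perceptron(points, labels)
predictions = [predict(w, b, p) for p in points]
print(w, b, predictions)
```

Once something this small works, swapping in a real dataset and watching where it breaks tends to teach more than another chapter of reading.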