What are the best books to study Neural Networks from a purely mathematical perspective?

Solution 1:

I'd recommend Deep Learning by Goodfellow, Bengio and Courville. I don't know if I'd call it "purely mathematical", but it covers a good amount of math background in the first few chapters. No exercises, though.

Solution 2:

For MLPs, there is a rigorous derivation in An Introduction to Optimization, the textbook by Edwin Chong and Stanislaw Zak, although it is notation-heavy, as all things related to neural networks must be.

This book is for some reason freely available online. See page 219 of https://eng.uok.ac.ir/mfathi/Courses/Advanced%20Eng%20Math/An%20Introduction%20to%20Optimization-%20E.%20Chong,%20S.%20Zak.pdf

I think there is essentially no good mathematical textbook on convolutional or recurrent neural networks in existence; people mostly just base their intuition on MLPs. But it is not hard to construct a mathematically rigorous derivation of forward and backward propagation for a CNN or RNN, along the lines of the MLP sketch below.
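For concreteness, here is a minimal sketch of what such a derivation looks like for an MLP (generic notation, not tied to any of the books above). Write the forward pass as $z^{(l)} = W^{(l)} a^{(l-1)} + b^{(l)}$ and $a^{(l)} = \sigma(z^{(l)})$ for layers $l = 1, \dots, L$, with a loss $\mathcal{L}$ evaluated at the output $a^{(L)}$. The backward pass is just the chain rule applied layer by layer:

$$\delta^{(L)} = \nabla_{a^{(L)}} \mathcal{L} \odot \sigma'\big(z^{(L)}\big), \qquad \delta^{(l)} = \Big( \big(W^{(l+1)}\big)^{\top} \delta^{(l+1)} \Big) \odot \sigma'\big(z^{(l)}\big),$$

$$\frac{\partial \mathcal{L}}{\partial W^{(l)}} = \delta^{(l)} \big(a^{(l-1)}\big)^{\top}, \qquad \frac{\partial \mathcal{L}}{\partial b^{(l)}} = \delta^{(l)}.$$

The CNN and RNN cases follow the same pattern once you write convolution as multiplication by a structured (Toeplitz-like) matrix and unroll the recurrence in time.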

Solution 3:

Gilbert Strang (of MIT OCW Linear Algebra lectures and Introduction to Linear Algebra fame) has a new textbook on linear algebra for deep learning, Linear Algebra and Learning from Data.

It contains a decent course in linear algebra, some statistics and optimization, and the calculus needed for stochastic gradient descent, and then applies all of it to neural network models.
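For reference, the calculus in question culminates in the stochastic gradient descent update (standard notation, not necessarily Strang's):

$$\theta_{t+1} = \theta_t - \eta_t \, \nabla_\theta \, \ell\big(\theta_t;\, x_{i_t}, y_{i_t}\big),$$

where $\ell$ is the loss on a single sample or mini-batch $(x_{i_t}, y_{i_t})$ drawn at step $t$, and $\eta_t$ is the learning rate; computing that gradient for a network is exactly backpropagation.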

Solution 4:

One of my favorite books on the theoretical aspects of neural networks is Anthony and Bartlett's "Neural Network Learning: Theoretical Foundations".

This book studies neural networks in the context of statistical learning theory. You will find loads of estimates of the VC dimension of classes of networks and all that fun stuff.
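In case the term is unfamiliar: the VC dimension of a class $\mathcal{H}$ of binary classifiers is the size of the largest set of points that $\mathcal{H}$ can shatter, i.e.

$$\operatorname{VCdim}(\mathcal{H}) = \max \left\{ n : \exists\, x_1, \dots, x_n \text{ with } \big| \{ (h(x_1), \dots, h(x_n)) : h \in \mathcal{H} \} \big| = 2^n \right\},$$

and the book bounds this quantity for networks in terms of the number of weights, the depth, and the activation function.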

I should say that this book does not go into detail on CNNs and RNNs.

Solution 5:

This field is still in its nascent stage, so there is not much material for lovers of the "purely mathematical". Perhaps you would like to take a look at Stanford's STATS 385 course (Theories of Deep Learning).