This post lists some of the best books on AI that I have read (so far). The list is in no particular order and is not exhaustive.
The list for this post includes the following:
Understanding Deep Learning (MIT Press 2023) by Simon JD Prince
Deep Learning with Python (2nd ed, Manning Publications 2021) by François Chollet
Thinking Clearly with Data: A Guide to Quantitative Reasoning and Analysis (Princeton University Press 2021) by Ethan Bueno de Mesquita and Anthony Fowler
AI Engineering: Building Applications with Foundation Models (O'Reilly Media 2024) by Chip Huyen
Please share your thoughts on these books in the comments if you have read them, or let me know if there are other books on AI that you like.
Understanding Deep Learning
If you are working with AI systems, it can certainly help to know, at a deep level, how they are intended to function, which Simon JD Prince comprehensively explains in this book.
Understanding Deep Learning is about the theory underpinning deep learning and how it is supposed to work. Grasping that theory helps readers apply deep learning in the real world.
Simon JD Prince is a research scientist specialising in AI and deep learning, an author of other books on machine learning, and an Honorary Professor of Computer Science at the University of Bath.
The book covers a range of fundamental areas of deep learning, including currently relevant topics like unsupervised learning, transformers, diffusion models and reinforcement learning. It does so with plenty of mathematical formulas and diagrams.
The book also includes chapters on other aspects of deep learning, including one titled 'Why Does Deep Learning Work?'. In this chapter, Simon JD Prince points out that, while deep learning models do (kind of) work, we still lack a good understanding of why they work:
...we now take it for granted that with sufficient hidden units, deep networks will classify almost any training set nearly-perfectly. We also take for granted that the fitted model will generalize to new data. However, it is not at all obvious either that the training process should succeed or that the resulting model should generalize.1
Deep Learning with Python
I have tried a few textbooks for learning how to code AI systems, and this book is by far the best one I have come across.
Deep Learning with Python does not just provide step-by-step instructions on how to build various machine learning models. It also provides detailed, accessible explanations of these steps and of what the models are specifically designed to do.
François Chollet is a French software engineer and the creator of the Keras deep learning library. He is also a co-founder of the AI startup Ndea and of the ARC Prize, a competition built around ARC-AGI, a benchmark for testing the reasoning capabilities of AI models.
I particularly like the way Chollet explains the manifold hypothesis and interpolation, which are fundamental concepts for understanding how deep learning models generalise to unseen data. They are also critical for understanding the importance of training data: the higher the quality and quantity of the data, the better your chance of building a good AI system.
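A minimal sketch of the interpolation idea, not taken from the book: the "model" here is plain piecewise-linear interpolation over samples of sin(x), standing in for a trained network, and the names `linear_interpolate`, `train_x` and `train_y` are invented for illustration. It only touches the interpolation half of the argument, showing that predictions between training points are accurate while predictions outside the sampled range are not.

```python
import math


def linear_interpolate(xs, ys, x):
    """Predict y at x from sorted training points using piecewise-linear interpolation.

    Outside the training range the function falls back to the nearest endpoint,
    because it has no information about what happens there.
    """
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    for i in range(1, len(xs)):
        if x <= xs[i]:
            t = (x - xs[i - 1]) / (xs[i] - xs[i - 1])
            return ys[i - 1] + t * (ys[i] - ys[i - 1])


# Toy training data: 21 evenly spaced samples of sin(x) on [0, pi].
train_x = [i * math.pi / 20 for i in range(21)]
train_y = [math.sin(x) for x in train_x]


def mean_abs_error(points):
    """Average absolute gap between the prediction and the true sin(x)."""
    errors = [abs(linear_interpolate(train_x, train_y, x) - math.sin(x)) for x in points]
    return sum(errors) / len(errors)


inside = [0.1 + 0.3 * i for i in range(10)]             # within the training range
outside = [math.pi + 0.3 * (i + 1) for i in range(10)]  # beyond the training range

error_inside = mean_abs_error(inside)
error_outside = mean_abs_error(outside)

print(f"mean error inside the training range:  {error_inside:.4f}")
print(f"mean error outside the training range: {error_outside:.4f}")
```

The gap between the two errors is the point: denser, better coverage of the region you care about gives the model more to interpolate between.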
Thinking Clearly with Data
While not on AI specifically, this book is still quite helpful for understanding some of the important data science concepts relevant for AI development.
Thinking Clearly with Data is essential reading for those first getting into data science. It explains, among other things, correlation vs causation, p-hacking and reversion to the mean.
Both authors are professors at the Harris School of Public Policy at the University of Chicago.
The second chapter, on correlation, is one of my favourites. It explains not only what correlation is but also how it can be used to measure certain aspects of the world, which highlights its relevance to data engineering in machine learning:
...correlations tell us what we should predict about some features of the world given what we know about other features of the world.2
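To make that concrete, here is a small sketch that computes a Pearson correlation and then uses a least-squares line to predict one feature from another. The data (hours studied and exam scores) is made up purely for illustration and does not come from the book, and the function names are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def fit_line(xs, ys):
    """Least-squares slope and intercept for predicting y from x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return slope, mean_y - slope * mean_x


# Invented toy data, for illustration only.
hours_studied = [1, 2, 3, 4, 5]
exam_score = [52, 58, 61, 70, 74]

r = pearson(hours_studied, exam_score)
slope, intercept = fit_line(hours_studied, exam_score)


def predict(hours):
    """Predicted exam score for a given number of hours studied."""
    return intercept + slope * hours


print(f"correlation: {r:.3f}")
print(f"predicted score after 6 hours: {predict(6):.1f}")
```

A strong correlation like this one supports prediction, but says nothing on its own about whether studying causes the higher scores.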
AI Engineering
If you want to know how the so-called 'GPT wrappers' are built, then you need to read about AI engineering.
Chip Huyen's timely textbook covers the different approaches to building applications and systems on top of foundation models. These include prompt engineering, building retrieval-augmented generation (RAG) systems, building agents and fine-tuning.
Huyen is a computer scientist from Vietnam with a focus on AI systems in production.
AI Engineering has proven quite helpful in my own work, both for understanding and writing about the current crop of AI systems being introduced into the world and for trying to create them myself. If you want to learn how to build on top of foundation models, this is the book to get:
There are multiple techniques you can use to get the model to generate what you want. For example, you can craft detailed instructions with examples of the desirable product descriptions. This approach is prompt engineering. You can connect the model to a database of customer reviews that the model can leverage to generate better descriptions. Using a database to supplement the instructions is called retrieval-augmented generation (RAG). You can also finetune—further train—the model on a dataset of high-quality product descriptions.3
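The retrieval step in that passage can be sketched in a few lines. This is a toy illustration under loud assumptions, not code from the book: the reviews are invented, the retriever ranks by simple word overlap where a real system would more likely use embeddings or a search index, and the helper names `retrieve` and `build_prompt` are hypothetical. No model is called; the sketch stops at the prompt that would be sent.

```python
def tokenize(text):
    """Lowercase a string and split it into a set of words, ignoring commas and full stops."""
    return set(text.lower().replace(",", " ").replace(".", " ").split())


def retrieve(query, documents, k=2):
    """Return the k documents sharing the most words with the query (a stand-in retriever)."""
    query_words = tokenize(query)
    ranked = sorted(
        documents,
        key=lambda doc: len(query_words & tokenize(doc)),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(instruction, retrieved):
    """Combine the instruction with the retrieved context into a single prompt string."""
    context = "\n".join(f"- {doc}" for doc in retrieved)
    return f"{instruction}\n\nCustomer reviews to draw on:\n{context}"


# Invented customer reviews, for illustration only.
reviews = [
    "The kettle boils water quickly and is very quiet.",
    "Delivery was slow but the packaging was fine.",
    "The kettle handle stays cool and the lid is easy to open.",
]

query = "Write a product description for the kettle"
top_reviews = retrieve(query, reviews, k=2)
prompt = build_prompt(query, top_reviews)

print(prompt)
```

The instruction on its own is the prompt engineering part; adding the retrieved reviews is what turns it into RAG.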
1. Simon JD Prince, Understanding Deep Learning (MIT Press 2023), p.401.
2. Ethan Bueno de Mesquita and Anthony Fowler, Thinking Clearly with Data: A Guide to Quantitative Reasoning and Analysis (Princeton University Press 2021), p.18.
3. Chip Huyen, AI Engineering: Building Applications with Foundation Models (O'Reilly Media 2024), p.11.