The world’s most valuable resource is now data! That’s how far The Economist went in describing the value of data as a commodity, claiming it has now surpassed even oil. Indeed and LinkedIn have consistently ranked Machine Learning Engineer and Data Scientist among the hottest professions of the last few years, and predict that demand will continue to grow well into the decade.
Learning and mastering machine learning and data science can be overwhelming on the technical side. Countless books, online courses, and graduate degrees offer this knowledge in varying breadth and depth. So, where do you start?
Through the rest of the article, we’ll review some of the most popular books on data science and machine learning for various levels of expertise, covering both technical and non-technical topics. The books below are listed in no particular order and span a large part of the machine learning and data science fields. Some of them require familiarity with a programming language, mathematical concepts, or even prior machine learning experience, and we’ll be sure to mention it when that’s the case.
Superintelligence: Paths, Dangers, Strategies
Nick Bostrom, in this book, artfully leverages his background in computational neuroscience and Artificial Intelligence (AI) with his inherent philosophical thinking to provide an unprecedented analysis of the future with AI.
Bostrom imagines how we could create an AI far superior to anything we have imagined and what risks it would pose to the world and to societal power dynamics, while delving into how things could go wrong and whether a superintelligence could replace us as the dominant lifeform on Earth. He talks about steering the future course through this uncharted territory. The mere projection of such potentially grim scenarios forces us to begin considering solutions for them, and the author examines several.
This book is written from a philosophical standpoint and offers great insight into the world we are headed towards and the policies, regulations, and thought leadership required for a harmonious co-existence.
The book was recommended by Bill Gates and Elon Musk as worth reading.
Deep Learning with Python
In this book, François Chollet covers practical deep learning implementation with the Keras library, a high-level API that is considered a much simpler entry point to deep learning for coders than, for example, TensorFlow. The text strikes a relatively good balance between basic foundational concepts and the topics and considerations that matter to advanced practitioners and researchers.
Mathematical concepts are explained using code snippets and the book is well suited for anyone with basic Python coding experience.
The first part of the book provides a general understanding of the fundamentals and the mathematical building blocks of both machine learning and deep learning. However, you will likely need another book for a more in-depth look at the theoretical side of deep learning, such as “Deep Learning (Adaptive Computation and Machine Learning series)” by Ian Goodfellow, Yoshua Bengio, and Aaron Courville, which is the next book discussed.
The second part of the book introduces different practical applications of deep learning networks, including computer vision with convolutional neural networks (CNNs), natural language processing and transactional data (time series) with recurrent neural networks (RNNs), and generating text and images using variational autoencoders (VAEs) as well as generative adversarial networks (GANs). A recurring theme within deep learning is the problem of overfitting, and the book addresses a range of techniques for dealing with it.
Although all the examples are implemented using Keras, the topics are covered from a general perspective, and the knowledge can be applied to other similar high-level frameworks with relative ease.
Deep Learning (Adaptive Computation and Machine Learning series)
Written by luminaries in the field such as Yoshua Bengio, considered one of the world’s leading experts in AI and a pioneer in deep learning, this book is a rigorous, up-to-date, and virtually self-contained reference on deep learning algorithms.
The book has a good balance of mathematics (advanced statistics, linear algebra, numerical optimization), applications, and research topics. The applications section is definitely the meat of the book and the research topics will mostly be of interest to fellow machine learning researchers.
Do note that this is a rather theoretical book, and it is great for building intuition about many of the concepts behind deep learning techniques that are often taken for granted. Someone without any prior knowledge of machine learning would probably benefit from studying other introductory machine learning texts before tackling this one. Without a solid background, or at least a deep interest, in mathematics and some practical hands-on experience with machine learning, this might end up being a dense and challenging read.
The Book of Why: The New Science of Cause and Effect
The Book of Why is about causality and discusses how it differs from Big Data and correlation. Judea Pearl, the main author, argues that data and pattern recognition can only get you so far in understanding complex patterns, and that the more crucial piece is building transparent models of how the data was collected and produced in the first place. Pearl expects that leveraging causal reasoning could provide machines with human-level intelligence.
Terms and concepts explored in the book include causality, correlation, mediation, transportability, and counterfactuals, among others. Causality, as Pearl describes it, is about data transparency: understanding why a certain conclusion is reached. He provides many examples throughout the book of how correlation alone fails because of a lack of upstream transparency. The concepts are certainly on the abstract side and complex, with many intermingled ideas and thoughts, but the book makes a clear case against relying solely on data to predict future events.
Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems
If you are comfortable with coding in Python and are looking for a quick introduction to both classical machine learning and deep learning techniques from an experienced practitioner, this book is well suited for exactly that. Hands-On Machine Learning is a great surface-level introduction to a vast array of machine learning and deep learning models, including their implementation in Scikit-Learn, Keras, and TensorFlow (2.0). The book is comprehensive, written in a friendly tone, and contains a large set of excellent exercises, making it a great introduction to these knowledge areas as well as a useful reference text. It also manages to strike a good balance between classical machine learning and deep learning, with the right amount of theory accompanied by references.
AI Superpowers: China, Silicon Valley, and the New World Order
This book is a potential eye-opener for those of us unfamiliar with the wide-ranging capabilities and imminent impact of AI. Kai-Fu Lee contrasts China’s work in AI with that of the US. While Chinese AI is based on technologies developed in the US, Chinese companies are now developing their own strategic direction. Lee suggests a future that is not only filled with promise but also fraught with social challenges, making a strong case that AI will soon determine the relative economic power of nations.
In this book, Lee explains with striking clarity how China is quickly becoming the AI superpower, having the perfect combination of an indomitable entrepreneurial spirit, supportive government policies, well-trained AI scientists, and more data than any other country in the world.
The book is an apt recommendation to anyone who wants an understanding of the first principles of AI, why it is so powerful and why we should be concerned about it from a political and societal standpoint. However, AI Superpowers doesn’t really delve into the solutions side of things such as law, regulations, and moral responsibilities.
An Introduction to Statistical Learning: with Applications in R (Springer Texts in Statistics)
An Introduction to Statistical Learning covers all the foundational topics systematically, assuming no prior knowledge from the reader other than very basic statistics. If you consider yourself a newbie, this book will help you ease into data science.
Core statistical ideas of model optimization, such as the bias-variance tradeoff, are discussed in depth and revisited through multiple example problems. Chapters are like small case studies that develop progressively in complexity, each starting off with a research question, hypothesized models, and data descriptions, followed by the data analysis and modelling, with the R code demonstrated in sufficient detail. No prior knowledge of R is required, and the included R exercises are particularly helpful for beginners learning the language.
Deep Medicine: How Artificial Intelligence Can Make Healthcare Human Again
Deep Medicine provides a quick overview of AI and the sectors in which it is being used, and then proceeds to discuss its application within the different branches of healthcare, including drug discovery, omics, diagnosis from radiography, virtual medicine (telemedicine, chatbots), nutrition, and mental health. Eric Topol argues that handing AI the mundane and repetitive tasks that have well-defined boundaries of scope will eventually free up doctors to focus on value-added human-to-human interactions with their patients, which machines, no matter how sophisticated, are unlikely to be capable of duplicating. This book is potentially a great read not only for healthcare workers but also for policy makers interested in driving change in the industry.
Life 3.0: Being Human in the Age of Artificial Intelligence
In Life 3.0, Max Tegmark describes the possible impacts of AI in the present, the future, and the distant future, accompanied by a discussion of the popular controversies, myths, and misconceptions associated with AI, before moving on to what we really mean by “intelligence”.
Overall, Tegmark remains optimistic, stating that technology is responsible for nearly all the improvements in quality of life since the Stone Age and will only continue on that path. At the same time, he convincingly illustrates the need to further evaluate our goals as a society and our plans for AI’s integration into our lives, a discussion that may be the most important conversation of our time.
The author breaks down complex concepts with simple comparisons and steers clear of exaggeration, offering an in-depth breakdown of the scenarios AI might lead to, both promising and grim. Regardless of your field of study or profession, the book really makes you ponder what “values” you want machines and AI to have.
Introduction to Machine Learning with Python: A Guide for Data Scientists
For individuals familiar with Python who are eager to apply it in genuine machine learning applications, this book can lay a solid foundation in the mainstream machine learning algorithms. It focuses mostly on the Scikit-learn package, with some exposure to NumPy, pandas, and matplotlib. Clustering, feature engineering (including PCA), and model performance evaluation metrics are discussed in detail as well, rounding off the different segments of a typical machine learning pipeline. Deep learning topics such as Artificial Neural Networks (ANNs) are covered only briefly, with the focus being on core machine learning.
Artificial Intelligence in Finance: A Python-Based Guide
Finance is a huge application area for machine learning. For example, one of the more traditional applications is stock price prediction with time series and transactional data. Artificial Intelligence in Finance has not yet been released, with pre-orders being accepted leading up to its expected release in Q4 2020. There isn’t much published information or many reviews on the book yet, but it addresses a market gap for a good reference book on AI applications in finance.
Grokking Algorithms: An illustrated guide for programmers and other curious people
What makes this book stand out is its visual approach to teaching algorithms. With its plethora of illustrations, it is a must-read for visual learners who want to learn to code. The book is geared towards beginner-level programmers and individuals without a background in Computer Science. Advanced software engineers, or individuals with a Computer Science degree who are well versed in algorithms, will likely find it too high level.
The focus of the book is data structures and algorithms, more so than machine learning specifically. It doesn’t cover all the data structures and algorithms you generally see within Computer Science, but the ones it does cover, it covers very well!
Data structures and algorithms are a very interesting knowledge area, especially the more advanced topics, but they can quickly grow complex and hard to visualize if not explained well, and that is where this book stands out.
Hands-On Machine Learning with Scikit-Learn and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems
In the first few pages of the book, Géron provides an overview of Machine Learning (ML) systems, the main challenges around ML models, and perspectives on testing and validation of models.
This is followed by a few hands-on, end-to-end ML projects implemented with Scikit-learn libraries. Through the chapters, the author provides guidance on how you should approach framing a given problem, selecting the performance evaluation measure(s), extracting the data, performing exploratory data analysis and visualization, preparing and cleaning the data, selecting and training the model, tuning the model hyperparameters, and concluding with topics on launching, monitoring, and maintaining the system.
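To make the workflow above a little more concrete, here is a minimal sketch in plain Python. It is not taken from the book and does not use Scikit-learn; the synthetic data, the single-feature linear model, and the helper names (make_data, split, fit_line, rmse) are all assumptions chosen purely for illustration. It walks through a few of the listed steps: obtaining data, holding out a test set, training a model, and evaluating it with a chosen performance measure.

```python
import math
import random


def make_data(n, seed=0):
    """Synthetic data (an assumption for illustration): y = 3x + 2 plus Gaussian noise."""
    rng = random.Random(seed)
    xs = [rng.uniform(0.0, 10.0) for _ in range(n)]
    ys = [3.0 * x + 2.0 + rng.gauss(0.0, 1.0) for x in xs]
    return xs, ys


def split(xs, ys, test_fraction=0.2, seed=0):
    """Shuffle the row indices, then hold out a fraction of the rows as a test set."""
    rng = random.Random(seed)
    idx = list(range(len(xs)))
    rng.shuffle(idx)
    n_test = int(len(idx) * test_fraction)
    test, train = idx[:n_test], idx[n_test:]
    return ([xs[i] for i in train], [ys[i] for i in train],
            [xs[i] for i in test], [ys[i] for i in test])


def fit_line(xs, ys):
    """Ordinary least squares for a single feature; returns (slope, intercept)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def rmse(xs, ys, slope, intercept):
    """Root mean squared error, the performance measure chosen for this sketch."""
    squared = [(slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)]
    return math.sqrt(sum(squared) / len(squared))


# Get the data, hold out a test set, train on the rest, evaluate on unseen rows.
xs, ys = make_data(200)
x_train, y_train, x_test, y_test = split(xs, ys)
slope, intercept = fit_line(x_train, y_train)
test_error = rmse(x_test, y_test, slope, intercept)
print(f"slope={slope:.2f} intercept={intercept:.2f} test RMSE={test_error:.2f}")
```

The error is measured only on the held-out rows, so it estimates how the model behaves on data it has not seen. In practice, a library such as Scikit-learn provides ready-made versions of the splitting, fitting, and scoring steps.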
Multiple ML models are discussed, including, but not limited to, linear and logistic regressions, Support Vector Machines (SVMs), decision trees, ensemble methods (GBM, random forest), as well as feature engineering topics such as dimensionality reduction using Principal Component Analysis (PCA).
A conceptual introduction to neural nets is also included in the book, covering the Multilayer Perceptron (MLP), Deep Neural Networks (DNNs), Convolutional Neural Networks (CNNs), and Recurrent Neural Networks (RNNs) with TensorFlow.
Hands-On Machine Learning with Scikit-Learn and TensorFlow is better suited to advanced novices. The book provides good practical tips and hints to help you build good instincts as a data scientist. It is best if you are already comfortable with coding in Python, and without a background in statistics, you could be prone to making poor choices once you start modelling problems.