5 "Best" Machine Learning & AI Books of All Time (2021)

The world of AI can be intimidating because of its terminology and the many different machine learning algorithms that are available. After reading over 50 of the most highly recommended books on machine learning, I have compiled my personal list of must-read books.

The books were chosen based on the types of ideas they introduce and on how well they present concepts such as deep learning, reinforcement learning, and genetic algorithms. Most importantly, the list favors the books that best pave the path forward for futurists and researchers towards building provably responsible and explainable AI.

#5. Life 3.0 by Max Tegmark

“Life 3.0” has an ambitious goal: to explore the possibilities of how we will co-exist with AI in the future. In the intelligence explosion argument made by British mathematician Irving Good back in 1965, Artificial General Intelligence (AGI) is an eventual and inevitable outcome. The argument stipulates that superhuman intelligence will be the result of a machine that can continuously self-improve. The famous quote describing the intelligence explosion is as follows:

“Let an ultraintelligent machine be defined as a machine that can far surpass all the intellectual activities of any man however clever. Since the design of machines is one of these intellectual activities, an ultraintelligent machine could design even better machines; there would then unquestionably be an ‘intelligence explosion,’ and the intelligence of man would be left far behind. Thus the first ultraintelligent machine is the last invention that man need ever make.”

Max Tegmark launches the book into a theoretical future in which the world is controlled by an AGI. From this moment onwards, explosive questions are asked: What is intelligence? What is memory? What is computation? What is learning? And how do these questions and their possible answers lead to a machine that can use various types of machine learning to achieve the breakthroughs in self-improvement needed to reach human-level intelligence, and then the superintelligence that would follow?

These are the types of forward-thinking and important questions that “Life 3.0” explores. Life 1.0 consists of simple lifeforms such as bacteria, which can only change through evolution modifying their DNA. Life 2.0 consists of lifeforms that can redesign their own software, for example by learning a new language or skill. Life 3.0 is an AI that can modify not only its own behavior and skills but also its own hardware, for example by upgrading its robotic self.

Only when we understand the benefits and pitfalls of an AGI can we begin reviewing options to ensure that we build a friendly AI that can align with our goals. To do this, we may also need to understand what consciousness is, and how AI consciousness will differ from our own.

Many hot topics are explored in this book, and it should be mandatory reading for anyone who truly wishes to understand how AGI is both a potential threat and a potential lifeline for the future of human civilization.

#4. Human Compatible by Stuart Russell

What happens if we succeed in building an intelligent agent, something that perceives, that acts, and that is more intelligent than its creators? How will we convince the machines to achieve our objectives instead of their own objectives?

These questions lead to one of the most important concepts of the book “Human Compatible: Artificial Intelligence and the Problem of Control”: we must avoid “putting a purpose into the machine,” as Norbert Wiener once said. An intelligent machine that is too certain of its fixed objectives is the ultimate type of dangerous AI. In other words, if the AI becomes unwilling to consider the possibility that it is wrong in performing its pre-programmed purpose and function, then it may be impossible to have the AI system shut itself down.

The difficulty, as outlined by Stuart Russell, is in instructing the AI or robot that no command is intended to be achieved at any cost. It is not okay to sacrifice human life to fetch a coffee, or to grill the cat to supply lunch. It must be understood that “take me to the airport as fast as possible” does not imply that speeding laws may be broken, even if this restriction is not stated explicitly. Should the AI get the above wrong, the fail-safe is a certain pre-programmed level of uncertainty. With some uncertainty, the AI can challenge itself before completing a task, perhaps by seeking verbal confirmation.

In a 1965 paper titled “Speculations Concerning the First Ultraintelligent Machine”, I. J. Good, a brilliant mathematician who worked alongside Alan Turing, stated, “The survival of man depends on the early construction of an ultraintelligent machine”. It is entirely possible that, to save ourselves from ecological, biological, and humanitarian disaster, we must build the most advanced AI that we can.

This seminal paper explains the intelligence explosion: the theory that an ultraintelligent machine can design better and superior machines with each iteration, which inevitably leads to the creation of an AGI. While the AGI may initially be of equal intelligence to a human, it would surpass humans within a short time span. Given this outlook, it is important for AI developers to actualize the core principles that are shared in this book and to learn how to safely apply them to designing AI systems that are capable not only of serving humans, but of saving humans from themselves.

As outlined by Stuart Russell, retreating from AI research is not an option; we must press forward. This book is a roadmap to guide us towards designing safe, responsible, and provably beneficial AI systems.

#3. How to Create a Mind by Ray Kurzweil

Ray Kurzweil is one of the world’s leading inventors, thinkers, and futurists. He has been referred to as “the restless genius” by The Wall Street Journal and “the ultimate thinking machine” by Forbes magazine. He is also a co-founder of Singularity University, and he is best known for his groundbreaking book “The Singularity is Near”. “How to Create a Mind” deals less with the issues of exponential growth that are hallmarks of his other work. Instead, it focuses on how we need to understand the human brain in order to reverse engineer it and create the ultimate thinking machine.

One of the core principles outlined in this seminal work is how pattern recognition works in the human brain. How do humans recognize patterns in everyday life? How are these connections formed in the brain? The book begins with hierarchical thinking, which means understanding a structure composed of diverse elements arranged in a pattern. This arrangement represents a symbol such as a letter or character, which is further arranged into a more advanced pattern such as a word, and eventually a sentence. Eventually these patterns form ideas, and these ideas are transformed into the products that humans are responsible for building.

Since it is a Ray Kurzweil book, it of course does not take long before exponential thinking is introduced. The “Law of Accelerating Returns” is a hallmark of this seminal book. This law describes how the pace of technological change is itself accelerating, because advances tend to feed on themselves and further increase the rate of progress. This thinking can then be applied to how fast we are learning to understand and reverse engineer the human brain. This accelerated understanding of pattern recognition systems in the human brain can in turn be applied towards building an AGI system.

This book was so influential that Google co-founder Larry Page recruited Ray Kurzweil to work on AI projects at Google after reading it. It’s impossible to outline all of the ideas and concepts that are discussed in a short article. Nonetheless, it is an essential must-read book for better understanding how human neural networks work in order to design an advanced artificial neural network.

Pattern recognition is the key element for deep learning, and this book illustrates why.

#2. The Master Algorithm by Pedro Domingos

The central hypothesis of “The Master Algorithm” is that all knowledge, past, present, and future, can be derived from data by a single, universal learning algorithm, which the book calls the Master Algorithm. The book details some of the top machine learning methodologies and gives detailed explanations of how different algorithms work, how they can be optimized, and how they can work together towards the ultimate goal of creating the Master Algorithm. This is an algorithm that would be capable of solving any problem that we feed it, and this includes curing cancer.

The reader will start off by learning about Naïve Bayes, a simple algorithm that can be explained in one simple equation. From there the book accelerates full speed into more interesting machine learning techniques. To understand the technologies that are accelerating us towards this master algorithm, we learn about converging fundamentals. First, from neuroscience we learn about brain plasticity and human neural networks. Second, we move on to natural selection, in a lesson on how to design a genetic algorithm that simulates evolution. With a genetic algorithm, a population of hypotheses in each generation crosses over and mutates, and the fittest hypotheses then produce the next generation. This evolution offers the ultimate in self-improvement.
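To make the genetic algorithm loop described above more concrete, here is a minimal Python sketch. It is not code from the book: the toy problem (evolving a bit string towards all ones), the `fitness`, `crossover`, `mutate`, and `evolve` helpers, and every parameter value are illustrative assumptions.

```python
import random


def fitness(bits):
    # Toy fitness (an assumption for this sketch): count the ones.
    return sum(bits)


def crossover(a, b, rng):
    # Single-point crossover: prefix from one parent, suffix from the other.
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]


def mutate(bits, rate, rng):
    # Flip each bit independently with probability `rate`.
    return [1 - b if rng.random() < rate else b for b in bits]


def evolve(pop_size=30, length=20, generations=40, rate=0.02, seed=0):
    rng = random.Random(seed)
    # Start from a random population of candidate "hypotheses".
    population = [
        [rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)
    ]
    for _ in range(generations):
        # Selection: the fitter half survives unchanged and becomes the parents.
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]
        # Reproduction: children come from crossover followed by mutation.
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rate, rng)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    return max(population, key=fitness)


best = evolve()
print(fitness(best), best)
```

Because the fitter half of each generation is carried over unchanged, the best fitness in this sketch can never decrease from one generation to the next. Real genetic algorithms vary widely in how they select, recombine, and mutate candidates.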

Other arguments come from physics, statistics, and of course the best of computer science. It’s impossible to comprehensively review all of the different facets this book touches upon, due to the book’s ambitious scope of laying out the framework for building the Master Algorithm. It is this framework that has pushed this book to second place, as all of the other machine learning books build on it in some shape or form.

#1. A Thousand Brains by Jeff Hawkins

“A Thousand Brains” builds on the concepts discussed in Jeff Hawkins’s previous book, “On Intelligence”. “On Intelligence” explored a framework for understanding how human intelligence works, and how these concepts can be applied towards building the ultimate AI and AGI systems. It fundamentally analyzes how our brains predict what we will experience before we experience it.

While “A Thousand Brains” is a great standalone book, it will be best enjoyed and appreciated if “On Intelligence” is read first.

“A Thousand Brains” builds on the latest research by Jeff Hawkins and Numenta, the company he founded. Numenta’s primary goal is to develop a theory of how the neocortex works. Its secondary objective is to apply this theory of the brain to machine learning and machine intelligence.

Numenta’s first major discovery, in 2010, concerned how neurons make predictions, and the second, in 2016, involved maplike reference frames in the neocortex. The book details first and foremost what the “Thousand Brains theory” is, what reference frames are, and how the theory works in the real world. One of the most fundamental components behind this theory is understanding how the neocortex evolved to its current size.

The human neocortex started small, similar to that of other mammals, but it grew rapidly (limited only by the size of the birth canal), not by creating anything new but by copying a basic circuit repeatedly. In essence, what differentiates humans is not the organic material of the brain but the number of copies of the identical elements that form the neocortex.

The theory then turns to how the neocortex is formed of approximately 150,000 cortical columns, which cannot be told apart under a microscope because there are no visible boundaries between them. The way these cortical columns communicate with one another is the implementation of a fundamental algorithm that is responsible for every aspect of perception and intelligence.

More importantly, the book unveils how this theory can be applied towards building intelligent machines, and the possible future implications for society. For example, the brain learns a model of the world by observing how inputs change over time, especially when movement is applied. The cortical columns require a reference frame that is fixed to an object. These reference frames allow a cortical column to learn the locations of the features that define an object. In essence, reference frames can organize any type of knowledge.

This leads to the most important question of this seminal book: can reference frames be the vital missing link towards building a more advanced AI or even an AGI system? Hawkins himself believes in an inevitable future in which an AGI will learn models of the world using maplike reference frames similar to those of the neocortex, and he does a remarkable job of illustrating why he believes this.
