🎙 Jeff Hawkins, author of A Thousand Brains, about the path to AGI
It’s so inspiring to learn from practitioners and thinkers. The experience of researchers, engineers, and entrepreneurs doing real ML work is an excellent source of insight and inspiration. Share this interview if you like it. No subscription is needed.
👤 Quick bio / Jeff Hawkins
Tell us a bit about yourself: your background, your current role, and how you got started in machine learning.
Jeff Hawkins (JH): I studied electrical engineering in college and started my career at Intel. But soon after, I began reading about the brain. I was struck by the fact that scientists had amassed many details of the brain’s architecture, but how it worked was a mystery. It was as if we had a circuit diagram of the brain but no idea how it functioned. I felt we could solve this mystery in my lifetime, and when we did, we would have a much better idea of what intelligence is and how to build intelligent machines. I found this challenge exciting, and I have pursued that goal ever since.
My career path has not been linear. Along the way, I founded two mobile computing companies, Palm and Handspring, and I created and ran the Redwood Neuroscience Institute, which is now at U.C. Berkeley. But throughout my career, my long-term goal has always been the same: understand the brain and then create machines that work on the same principles.
Today, I am co-founder and chief scientist at Numenta. We spent a decade reverse-engineering the neocortex, and we had a lot of success. We are now applying what we learned to improve existing neural networks and create a new form of AI based on sensory-motor learning.
🛠 AI Work
You have published two iconic books that explore the relationship between the underpinnings of the human brain and AI. On Intelligence was the book that originally inspired me to study neuroscience, and A Thousand Brains outlines a very original thesis about the foundations of intelligence. Is understanding the human brain, and specifically the neocortex, a key requirement for achieving AGI, or can we build sophisticated AI systems that don’t require detailed knowledge of the human brain?
JH: As I said, I believe the quickest and surest way to create AGI is to study brains. This didn’t have to be the case. Perhaps we could have created truly intelligent machines by paying no attention to neuroscience. The early attempts at symbolic AI took this approach. They failed. Today’s artificial neural networks have achieved some remarkable results, and they are loosely modeled on brain principles. But today’s AI is still far from being intelligent. We are all familiar with the shortcomings of today’s neural networks. They are difficult to train. They are brittle. They don’t generalize. If we want to claim that a deep learning system understands something, we at least have to admit that its understanding is very shallow. No AI system today has the kind of general knowledge that a human has.
Numenta dove deep into neuroscience, not because we want to emulate a human brain but to discover the principles of how it works. It seems obvious that there are some basic things that brains do that we are missing in today’s AI. Once we understand the brain’s operating principles, we can leave neuroscience behind.
Fortunately, we have already learned most of the techniques used by the brain that I believe will be essential for AGI. I provide a list of these in my recent book. Let me give you one example here: the brain is a sensory-motor learning system. We learn by moving and sensing, moving and sensing. We move our bodies, our eyes, and our touch sensors. The brain keeps track of where our sensors are relative to things in the world using a type of neural reference frame. This allows the brain to integrate sensation and movement to quickly learn three-dimensional models of the environments and objects we interact with. Learning through movement and storing knowledge in reference frames is essential. I believe that all intelligent machines in the future will work this way.
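This idea of learning by moving and sensing, with knowledge stored at locations in a reference frame, can be pictured with a toy sketch. The following Python is purely illustrative (all class and method names are hypothetical, and this is not Numenta's implementation): features are stored at sensor locations relative to an object, and the same structure supports both learning and prediction.

```python
class ReferenceFrameModel:
    """Toy model: learn an object as features stored at locations
    in a reference frame attached to the object."""

    def __init__(self):
        self.features_at = {}       # location -> sensed feature
        self.location = (0, 0)      # current sensor location relative to object

    def move(self, dx, dy):
        # Movement updates where the sensor is relative to the object.
        x, y = self.location
        self.location = (x + dx, y + dy)

    def sense(self, feature):
        # Learning: associate the sensed feature with the current location.
        self.features_at[self.location] = feature

    def predict(self):
        # Inference: predict what will be sensed at the current location.
        return self.features_at.get(self.location)


# Learn a "coffee cup" by moving and sensing, then recognize by prediction.
model = ReferenceFrameModel()
model.sense("rim")
model.move(0, -1)
model.sense("handle")
model.move(0, 1)                    # move back to the rim
print(model.predict())              # -> rim
```

The point of the sketch is the pairing: every sensation is tagged with a location, so the stored model captures structure, not just a bag of features.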
In A Thousand Brains you outline a theory, inspired by the work of Vernon Mountcastle, that the neocortex is divided into individual cortical columns that act as independent learning machines. Can you elaborate on the relevance of the Thousand Brains Theory from the AI perspective?
JH: Right, Vernon Mountcastle was the first scientist to propose that the neocortex is made up of tens of thousands of nearly identical units he called cortical columns. A cortical column is about the size of a grain of rice. Imagine 150,000 grains of rice standing on end, packed side by side, and you get a picture of the human neocortex. Cortical columns are complex; they have many types of neurons arranged in multiple layers, connected with hundreds of millions of synapses. Mountcastle proposed that each cortical column performs the same intrinsic function, although applied to different problems, such as vision, hearing, and touch.
We believe that each cortical column is a complete sensory-motor learning system. Each cortical column uses reference frames to learn models and store knowledge. Therefore, the entire neocortex is a distributed sensory-motor modeling system. For example, the brain has separate models of what a coffee cup looks like, sounds like, and feels like. This is why we call it the Thousand Brains Theory. These separate models communicate with each other to reach a consensus on what is happening in the world.
There are several advantages to this type of distributed architecture from an AI perspective. It makes it easy to build AI systems using multiple sensors and multiple sensor modalities. For example, a car manufacturer could easily swap in and out different types of sensors to provide different capabilities. This could be done without retraining. A distributed architecture also makes an AI system robust to noise, occlusions, and complete loss of one or more sensors. But perhaps most importantly, it allows us to create smaller and larger AI systems by adding and deleting cortical columns. The primary difference between a rat’s neocortex, a monkey’s neocortex, and a human neocortex is the number of cortical columns.
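One way to picture the consensus mechanism described above is as a vote: each column independently maintains a set of candidate objects consistent with its own evidence, and the columns agree on the intersection. This is a toy sketch, not Numenta's algorithm, and all names are hypothetical:

```python
def vote(column_beliefs):
    """Combine the beliefs of independent columns.

    Each column contributes the set of candidate objects consistent
    with its own sensory evidence; consensus is their intersection.
    """
    candidates = None
    for beliefs in column_beliefs:
        candidates = beliefs if candidates is None else candidates & beliefs
    return candidates


# Three columns (e.g. a visual column and two touch columns) each
# narrow the object down differently; together they reach consensus.
visual  = {"coffee cup", "bowl", "vase"}
touch_1 = {"coffee cup", "stapler"}
touch_2 = {"coffee cup", "bowl"}
print(vote([visual, touch_1, touch_2]))   # -> {'coffee cup'}
```

The sketch also hints at the robustness point: dropping any single column still leaves a (merely less certain) consensus, rather than breaking the system.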
In the future, I believe we will create silicon equivalents to cortical columns. Chips will contain varying numbers of these modeling units, analogous to how CPU chips contain varying numbers of cores. I see no reason why we can’t make machines that have more cortical column equivalents than a human.
One of the aspects of the Thousand Brains Theory that impacted me the most was the relevance of time-based patterns in human cognition, and how this is largely missing from most AI techniques. Do we need to rethink the current generation of AI techniques to align them more closely with time-based representations?
JH: Incorporating time is absolutely critical. And you’re right, most of today’s ANNs don’t incorporate time at all. But think about how you interact with the world. If I ask you to learn what a new object feels like, you move your fingers over the object's surface. Similarly, when you look at something, your eyes are constantly moving, about three times a second, attending to different parts of the world. When we move any part of our body, the inputs to the brain change. Therefore, the inputs to the brain are constantly changing over time. The brain is able to make sense of this time-based stream of inputs by associating each input with locations relative to objects, environments, and the body. The key point is that time-changing inputs are not a problem to be compensated for but are the essence of how we learn and infer.
Other aspects highlighted in the Thousand Brains Theory are memory and the ability to develop models of the world. Do you see value in recent techniques such as self-supervised learning (SSL) or continual learning as paths to building more efficient representations of the world and memory capabilities?
JH: Yes. The way I view it, there are many people trying to build intelligent machines. Ultimately, we will all reach a consensus on the key principles and key attributes needed for AI. Two of those attributes will be self-supervised learning and continual learning. We have shown that ANNs that use neurons with dendrites can learn continuously and rapidly, without catastrophic forgetting. We have also shown that a system with predictive models can use prediction error for self-supervised learning. Perhaps there are other methods for achieving self-supervised and continual learning, but it is clear to me that these will be essential for AI.
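The prediction-error idea can be made concrete in a few lines: the system predicts its next input, and the mismatch between prediction and reality is the training signal, so no external labels are needed. Below is a minimal, hypothetical sketch using a linear predictor and a delta-rule update (illustrative only, not Numenta's dendrite-based model):

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(size=(4, 4)) * 0.1       # toy linear predictor: x_t -> x_{t+1}

def step(x_t, x_next, lr=0.1):
    """One self-supervised update driven purely by prediction error."""
    global w
    prediction = w @ x_t
    error = x_next - prediction         # the teaching signal comes from the data itself
    w += lr * np.outer(error, x_t)      # delta rule: reduce future error
    return float(np.sum(error ** 2))

# A repeating sensory sequence; prediction errors shrink as the model learns it.
seq = [np.eye(4)[i] for i in (0, 1, 2, 3)] * 50
errors = [step(seq[t], seq[t + 1]) for t in range(len(seq) - 1)]
print(errors[0] > errors[-1])           # -> True
```

The essential point survives the simplification: the error signal is generated internally by comparing prediction to observation, which is what makes the learning self-supervised.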
For decades, AI has evolved amid a tension between neural networks (deep learning) and symbolic systems. Based on your current theory of intelligence, do you think the path towards AGI is based on:
a) deep learning
b) symbolic systems
c) a combination of both
d) a new type of technique that we haven’t seen before
JH: I love this question, but the answer isn’t as simple as picking a, b, c, or d.
First, although I believe deep learning is not on the path to AGI, it is not going away either. Deep learning is a useful technology that can outperform humans on many tasks. At Numenta, we have shown that we can dramatically improve the performance of deep learning networks using principles of sparsity that we learned by studying the brain. As far as I can tell, we are leading in this area. We have a team dedicated to this as we think it is environmentally and financially valuable.
But when it comes to AGI, we need something different. Symbolic AI pioneers had the right idea but the wrong execution. They argued, correctly in my opinion, that intelligent machines need to have everyday knowledge about the world. Thus, they believed that achieving the intelligence of a five-year-old was more important than, say, creating the world’s best chess player. Where they went wrong is that they tried to manually collate knowledge and then encode that knowledge via software. This turned out to be impossible. I recall one AI researcher saying something like, “how to represent knowledge in a computer is not just a difficult problem for AI, it is the ONLY problem of AI.”
The Thousand Brains Theory provides the solution. Our brains learn models of everything we interact with. These models are built using the neural equivalent of reference frames, allowing them to capture the structure and behavior of the things we observe. Knowledge is stored in the models. Take, for example, a stapler. How can we store knowledge of staplers, such as how the parts move and what happens when you press down on the top? In the old days, AI researchers would make a list of stapler facts and behaviors and then try to encode them in software. By contrast, the brain observes the stapler, learns a model of it, and, when asked how it works, uses the model to mentally play back what it previously observed. I would say these models are symbolic, but not in the way AI researchers have used the term in the past.
In summary, AGI systems will use sensory-motor learning to create models of environments and objects. Knowledge is encoded in these models via reference frames. Whether we call this form of knowledge representation symbolic or not is not important. Today’s deep learning networks have nothing equivalent to this.
💥 Recommended book
What book can you recommend to an aspiring ML engineer?
JH: I wrote both of my books (On Intelligence and A Thousand Brains) with this reader in mind. There are many good books about machine learning, but there are few places you can read about intelligence from a broader perspective, including what the brain tells us about intelligence. That’s why I wrote A Thousand Brains. I am not trying to sell books, but I seriously believe a young ML student would benefit from reading A Thousand Brains early in their education.