🎙 William Falcon: "We did our job right if the term MLOps disappears"
Learning from the experience of researchers, engineers, and entrepreneurs doing real ML work can be a great source of insight and inspiration.
👤 Quick bio / William Falcon
Tell us a bit about yourself. What is your background and current role, and how did you get started in machine learning?
William Falcon (WF): I'm the co-founder of Grid.ai and the creator of PyTorch Lightning. In undergrad, I majored in CS and statistics which focused on ML. Got involved in deep learning as an undergrad studying how the brain and eyes encode light into neural activity (computational neuroscience).
🛠 ML Work
Both PyTorch Lightning and Grid.ai are projects focused on removing the dependencies between data science research and complex ML engineering. Both projects remind me of what Ruby on Rails or Heroku did for web and cloud application development, respectively. Can you elaborate on the vision behind the two projects and why this is a very important problem to solve?
(WF): Love the analogies! A bunch of years ago, data scientists heard about deep learning and tried it. Some spent 6+ months trying to get it to work but never did. The problem is that deep learning has a LOT of moving parts... model structure, parameters, implementations, etc...
All these moving pieces make everything really hard to get right. Lightning and Grid.ai help solve A LOT of these issues because we can get rid of most of the engineering you have to do.
You can have math people focus on... math, data analysis, and building models instead of learning to be engineers.
ML infrastructure platforms are advancing at a very rapid pace and the space is incredibly fragmented. Recently, there has been a plethora of new frameworks for streamlining different aspects of ML engineering such as parallel training, hyperparameter optimization, debugging and many others. What is the right balance between abstracting those infrastructure building blocks away from data scientists and giving them some control over the infrastructure used to train, optimize and run ML models?
(WF): It's the approach we take with PyTorch Lightning. Give the user full flexibility for configuration but give them a structured framework in which to operate.
PyTorch Lightning changed the way people do deep learning. It removed most of the engineering but made it powerful enough for pros to do really advanced things.
Grid.ai is doing the same. In fact, I think we did our job right if the term MLOps disappears.
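To make the "structured framework, full flexibility" idea concrete, here is a minimal pure-Python sketch. It is not the PyTorch Lightning API: `ToyModule`, `ToyTrainer`, and `LineFit` are hypothetical names invented for illustration, and the method names `training_step` and `fit` only loosely echo Lightning's naming. The point it tries to show is the split: the user writes the math, and a reusable trainer owns the loop.

```python
from abc import ABC, abstractmethod


class ToyModule(ABC):
    """Hypothetical base class: the user supplies only the 'math'."""

    @abstractmethod
    def training_step(self, batch, weight):
        """Return the gradient of the loss w.r.t. the weight for one batch."""

    def on_epoch_end(self, epoch, weight):
        """Optional hook; override it for custom behaviour."""


class ToyTrainer:
    """Hypothetical trainer: owns the loop so the user does not rewrite it."""

    def __init__(self, max_epochs, lr):
        self.max_epochs = max_epochs
        self.lr = lr

    def fit(self, module, data):
        weight = 0.0
        for epoch in range(self.max_epochs):
            for batch in data:
                grad = module.training_step(batch, weight)
                weight -= self.lr * grad  # plain gradient descent update
            module.on_epoch_end(epoch, weight)
        return weight


class LineFit(ToyModule):
    """Fits y = w * x with squared error; records the weight after each epoch."""

    def __init__(self):
        self.history = []

    def training_step(self, batch, weight):
        x, y = batch
        return 2.0 * (weight * x - y) * x  # d/dw of (w*x - y)^2

    def on_epoch_end(self, epoch, weight):
        self.history.append(weight)


model = LineFit()
data = [(1.0, 3.0), (2.0, 6.0)]  # samples from y = 3x
learned = ToyTrainer(max_epochs=50, lr=0.1).fit(model, data)
```

The design choice worth noticing is that `LineFit` never touches the loop, while the `on_epoch_end` hook still lets an advanced user inject custom logic.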
PyTorch Lightning enables a simpler and more abstract way to build ML applications. Frameworks like Keras or TensorFlow 2.0 have certainly taken steps in similar directions, although without so much focus on the infrastructure aspects of ML solutions. From your perspective, how much simpler can machine learning programming get, and what do we need to get there?
(WF): Einstein said, "Everything should be made as simple as possible, but no simpler."
I think PyTorch Lightning and Grid.ai are definitely very close to that bound of very simple but no simpler. It's possible to go simpler, but then you're back to square one of being a black box.
What role can trends like AutoML, Neural Architecture Search (NAS), or self-supervised learning play in the simplification of ML programming?
(WF): The dirty secret of deep learning is that NAS is very hyped... It works at best okay in very specific research papers. Maybe we can get there in a few years, but I'm not confident that will happen.
AutoML has been done over and over and works great if you're a business analyst or just starting out but doesn't really get you the performance you need for real-world production systems.
The most promising tech is self-supervised learning which in fact has been my focus throughout my Ph.D. and the focus of my research advisors.
Self-supervised learning is already successful in models like BERT where the model uses its own inputs as labels. In my opinion, that trend will continue but not via contrastive learning. I think the computational expense involved is a serious problem for the field.
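As a rough illustration of "the model uses its own inputs as labels", the sketch below builds a masked-token training example in plain Python. It is a simplification under stated assumptions: the function name `make_masked_example`, the `[MASK]` string, and the 15% default are chosen for illustration, and BERT's actual masking procedure has more detail than this.

```python
import random

MASK = "[MASK]"


def make_masked_example(tokens, mask_prob=0.15, seed=0):
    """Hide some tokens and keep the hidden originals as prediction targets.

    Returns (inputs, labels). labels[i] is the original token where the input
    was masked, and None where there is nothing to predict.
    """
    rng = random.Random(seed)  # seeded so the example is reproducible
    inputs, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            inputs.append(MASK)   # the model sees a placeholder
            labels.append(tok)    # the label comes from the input itself
        else:
            inputs.append(tok)
            labels.append(None)   # no loss is computed at this position
    return inputs, labels


sentence = "the model uses its own inputs as labels".split()
masked_inputs, targets = make_masked_example(sentence, mask_prob=0.3, seed=1)
```

No human annotation is involved: every target is recovered from the raw text, which is what makes the setup self-supervised.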
Are platforms like AWS SageMaker or Azure ML just too complex for mainstream developers? Do we need a higher-level programming model?
(WF): Depends who you are! Some people love to spend their time doing MLOps. :) Others just want to build and ship models.
It's like the difference between Linux users and Mac users. Some people really want to spend time configuring every part of a system even if it's not directly relevant to their immediate work (as an engineer I find this fun too). Others just want to turn on their machines, install apps and go...
I'm somewhere in between depending on how much free time I have, haha.
💥 Miscellaneous – a set of rapid-fire questions
Is the Turing Test still relevant? Is there a better alternative?
(WF): The banter test is what I'd like to propose. Can you banter with an AI for 20 minutes without detecting that it's a robot?
Favorite math paradox?
(WF): Hilbert's paradox of the Grand Hotel. Because comparing infinite sets is so much fun, haha. For example, count all positive real numbers... that's infinity. Now count all the positive AND negative real numbers... that's also infinity. But wait, there are at least twice as many in the second set!
Any book you would recommend to aspiring data scientists?
(WF): Deep Learning by Ian Goodfellow, Yoshua Bengio, and Aaron Courville
Probabilistic Machine Learning: An Introduction (by Kevin Murphy)
Does P equal NP?
(WF): P = 42