William Falcon: "We did our job right if the term MLOps disappears"
Getting to know the experience gained by researchers, engineers, and entrepreneurs doing real ML work can become a great source of insights and inspiration.
Quick bio / William Falcon
Tell us a bit about yourself. What is your background and current role, and how did you get started in machine learning?
William Falcon (WF): I'm the co-founder of Grid.ai and the creator of PyTorch Lightning. In undergrad, I majored in CS and statistics with a focus on ML. I got involved in deep learning as an undergrad studying how the brain and eyes encode light into neural activity (computational neuroscience).
ML Work
Both PyTorch Lightning and Grid.ai are projects focused on removing the dependencies between data science research and complex ML engineering. Both projects remind me of what Ruby on Rails or Heroku did for web and cloud application development, respectively. Can you elaborate on the vision behind the two projects and why this is such an important problem to solve?
(WF): Love the analogies! A bunch of years ago, data scientists heard about deep learning and tried it. Some spent 6+ months trying to get it to work but never did. The problem is that deep learning has a LOT of moving parts... model structure, parameters, implementations, etc...
All these moving pieces make everything really hard to get right. Lightning and Grid.ai help solve A LOT of these issues because we can get rid of most of the engineering you have to do.
You can have math people focus on... math, data analysis, and building models instead of learning to be engineers.
ML infrastructure platforms are advancing at a very rapid pace, and the space is incredibly fragmented. Recently, there has been a plethora of new frameworks for streamlining different aspects of ML engineering such as parallel training, hyperparameter optimization, debugging, and many others. What is the right balance between abstracting those infrastructure building blocks away from data scientists and giving them some control over the infrastructure used to train, optimize, and run ML models?
(WF): That's the approach we take with PyTorch Lightning: give the user full flexibility for configuration, but give them a structured framework in which to operate.
PyTorch Lightning changed the way people do deep learning. It removed most of the engineering but made it powerful enough for pros to do really advanced things.
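To make the "structured framework, full flexibility" idea concrete, here is a minimal sketch (added for illustration, not taken from the interview) of how a PyTorch Lightning model is typically organized: the research code lives in a LightningModule, and the Trainer owns the engineering. The model and dummy data below are placeholders.

```python
# A minimal sketch of the structure PyTorch Lightning imposes: research code
# (forward pass, loss, optimizer choice) goes in a LightningModule, while the
# Trainer handles the engineering (loops, devices, checkpointing, logging).
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset
import pytorch_lightning as pl


class LitClassifier(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(28 * 28, 128), nn.ReLU(), nn.Linear(128, 10))

    def training_step(self, batch, batch_idx):
        # The "math" part: forward pass and loss. No device handling, no loops.
        x, y = batch
        loss = nn.functional.cross_entropy(self.net(x.view(x.size(0), -1)), y)
        self.log("train_loss", loss)
        return loss

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=1e-3)


if __name__ == "__main__":
    # Dummy data stands in for a real dataset.
    data = TensorDataset(torch.randn(256, 1, 28, 28), torch.randint(0, 10, (256,)))
    trainer = pl.Trainer(max_epochs=1)  # the Trainer owns the engineering
    trainer.fit(LitClassifier(), DataLoader(data, batch_size=32))
```

More advanced users can still override additional hooks (for example, optimizer behavior or the backward pass) when they need finer control, which is where the "full flexibility" part of the framework comes in.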
Grid.ai is doing the same. In fact, I think we did our job right if the term MLOps disappears.
PyTorch Lightning enables a simpler and more abstract way to build ML applications. Frameworks like Keras or TensorFlow 2.0 have certainly taken steps in similar directions, although without as much focus on the infrastructure aspects of ML solutions. From your perspective, how much simpler can machine learning programming get, and what do we need to get there?
(WF): Einstein said, "Everything should be made as simple as possible, but no simpler."
I think PyTorch Lightning and Grid.ai are definitely very close to that bound of very simple but no simpler. It's possible to go simpler, but then you're back to square one of being a black box.
What role can trends like AutoML, Neural Architecture Search (NAS), or self-supervised learning play in the simplification of ML programming?
(WF): The dirty secret of deep learning is that NAS is very hyped... It works at best okay in very specific research papers. Maybe we can get there in a few years, but I'm not confident that will happen.
AutoML has been done over and over and works great if you're a business analyst or just starting out but doesn't really get you the performance you need for real-world production systems.
The most promising tech is self-supervised learning, which in fact has been my focus throughout my Ph.D. and the focus of my research advisors.
Self-supervised learning is already successful in models like BERT where the model uses its own inputs as labels. In my opinion, that trend will continue but not via contrastive learning. I think the computational expense involved is a serious problem for the field.
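As a rough illustration of "the model uses its own inputs as labels," here is a toy BERT-style masked-prediction sketch in plain PyTorch (added for illustration, not from the interview; the tiny vocabulary and model are placeholders).

```python
# A toy sketch of BERT-style self-supervision: mask random tokens in the
# input and train the model to predict the original tokens, so the labels
# come from the data itself. Sizes and the model are illustrative only.
import torch
from torch import nn

vocab_size, mask_id = 1000, 0
tokens = torch.randint(1, vocab_size, (8, 16))   # a batch of "sentences"

# Randomly mask ~15% of positions; the originals become the targets.
mask = torch.rand(tokens.shape) < 0.15
inputs = tokens.masked_fill(mask, mask_id)

model = nn.Sequential(nn.Embedding(vocab_size, 64), nn.Linear(64, vocab_size))
logits = model(inputs)                            # (batch, seq, vocab)

# Loss is computed only on the masked positions.
loss = nn.functional.cross_entropy(logits[mask], tokens[mask])
loss.backward()
print(loss.item())
```

The targets are simply the original tokens at the masked positions, so no human labeling is required.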
Are platforms like AWS SageMaker or Azure ML just too complex for mainstream developers? Do we need a higher-level programming model?
(WF): Depends who you are! Some people love to spend their time doing MLOps. :) Others just want to build and ship models.
It's like the difference between Linux users and Mac users. Some people really want to spend time configuring every part of a system even if it's not directly relevant to their immediate work (as an engineer I find this fun too). Others just want to turn on their machines, install apps and go...
I'm somewhere in between depending on how much free time I have, haha.
Miscellaneous: a set of rapid-fire questions
Is the Turing Test still relevant? Is there a better alternative?
(WF): The banter test is what I'd like to propose. Can you banter with an AI for 20 minutes without detecting that it's a robot?
Favorite math paradox?
(WF): Hilbert's paradox of the Grand Hotel. Because comparing infinite sets is so much fun, haha. For example, count all positive real numbers... that's infinity. Now count all the positive AND negative real numbers... that's also infinity. But wait, there are at least twice as many in the second set!
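A quick aside for readers (not part of the interview): the "twice as many" intuition is exactly what the paradox plays with. An explicit bijection shows the two sets actually have the same cardinality, for example:

```latex
% ln maps the positive reals one-to-one onto all of R, with inverse e^y,
% so the two "infinities" are the same size (same cardinality).
f : (0, \infty) \to \mathbb{R}, \qquad f(x) = \ln x, \qquad f^{-1}(y) = e^{y}
```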
Any book you would recommend to aspiring data scientists?
(WF): Deep Learning by Ian Goodfellow, Yoshua Bengio, and Aaron Courville
Probabilistic Machine Learning: An Introduction by Kevin Murphy
Does P equal NP?
(WF): P = 42