🎙 Fabio Buso on How Hopsworks Feature Store Became Fully Serverless
Learning from the experience of researchers, engineers, and entrepreneurs doing real ML work is an excellent source of insight and inspiration. Share this interview if you like it. No subscription is needed.
👤 Quick bio / Fabio Buso
Fabio Buso is VP of Engineering at Hopsworks, where he leads the Feature Store development team. Fabio holds a master's degree in Cloud Computing and Services with a focus on data-intensive applications.
Tell us a bit about yourself: your background, your current role, and how you got started in machine learning.
Fabio Buso (FB): I got started in machine learning the old-fashioned way: with Andrew Ng's ML course on Coursera. I've always been fascinated by the data side of ML applications. During my master's I had a minor in data-intensive applications, and it was during my internship that I met the folks at Hopsworks. I started working with them in the early days of the company.
Since then, I have led several projects, from infrastructure all the way to the development of the Hopsworks Feature Store. Now I look after the entire engineering team. Working closely with customers and users, I'm always impressed by the models they build and the challenges they solve.
🛠 ML Work
Hopsworks recently announced a new release of its feature store solution. Could you walk us through the evolution of the platform in the last few years and the new capabilities of this release?
FB: The first two versions of the Hopsworks feature store were Spark-centric. Most of the operations to create features and training datasets required data scientists to interact directly with Spark. Many data scientists have a soft spot for Pandas and can be very creative in avoiding Spark, even when Spark is needed. Already in Hopsworks 2.x we started building support for pure Python clients. With Hopsworks 3.x this capability has reached a level of maturity that allows data scientists not only to write feature pipelines with Pandas, but also to create training datasets and perform batch/online inference from Python.
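As a minimal sketch of what such a pure-Python feature pipeline can look like: the column names and the `customer_stats` feature group below are illustrative, and the Hopsworks calls (which follow the 3.x Python API) need an API key and a live project, so they are shown commented out.

```python
import pandas as pd

# Toy transaction data standing in for a real source (hypothetical columns).
df = pd.DataFrame({
    "customer_id": [1, 1, 2],
    "amount": [10.0, 30.0, 5.0],
})

# Plain-Pandas feature engineering -- no Spark required on the client.
features = (
    df.groupby("customer_id")
      .agg(txn_count=("amount", "count"), avg_amount=("amount", "mean"))
      .reset_index()
)

# Registering the features with Hopsworks (sketch; requires a live project):
# import hopsworks
# project = hopsworks.login()
# fs = project.get_feature_store()
# fg = fs.get_or_create_feature_group(
#     name="customer_stats", version=1, primary_key=["customer_id"])
# fg.insert(features)
```

The point is that the client-side code is ordinary Pandas; persisting the result to the feature store is a single `insert` call.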
The Python API improvements were part of the bigger theme of Hopsworks 3.0: improving the data science experience.
The new release also brings more mature model serving infrastructure, built on KServe. We put particular focus on having a tight connection between the model serving infrastructure and the feature store. This allows data scientists to deploy models at scale on KServe and have easy access to the precomputed features that power those real-time predictions.
One of the key emphases of Hopsworks 3.0 is its serverless architecture, which deviates from traditional designs in feature store platforms. What were the drivers behind that design choice, and what key capabilities does a serverless approach enable for Hopsworks 3.0?
FB: The serverless architecture is also part of the developer experience work we did for Hopsworks 3.0. Hopsworks brings lots of cool functionality to feature engineering pipelines (a UI to discover features and collaborate with your team, lineage, tags, and more). Serverless is the answer to a question we posed ourselves: how can we make it easier for data scientists to start building features and leverage the functionality Hopsworks provides, without having to deploy and manage a Hopsworks platform?
Today with Hopsworks Serverless, data scientists can be up and running with Hopsworks in a matter of seconds, without the need to connect any cloud account or install any software.
Another interesting design choice in Hopsworks 3.0 was to move away from the Spark-centric architecture that is very common in feature stores to a Python-native architecture. Is this a tradeoff between computational power and flexibility? What are the benefits of this architecture, and in which areas does Spark still make sense?
FB: The new API architecture has the same philosophy as frameworks like PyTorch and TensorFlow. There, data scientists interact with the Python API, while the heavy lifting is done by optimized C++ routines running on different hardware that users never have to touch.
Something similar happens in Hopsworks. Users create and register features with Python, and in the backend Hopsworks leverages Spark to persist those features in feature groups and to join features together when creating a training dataset. However, users don't need any knowledge of Spark to use the feature store.
For advanced use cases that require extremely fresh features or large feature engineering pipelines, users can still interact directly with Spark, as was the case in previous versions of Hopsworks.
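To illustrate the join described above: the local Pandas `merge` below stands in conceptually for the server-side Spark join that Hopsworks performs, while the commented-out calls sketch the 3.x Python API for building a training dataset from a feature view (the feature view name and split parameters are illustrative, and the calls need a live project).

```python
import pandas as pd

# Two feature groups' worth of data, joined on the entity key -- conceptually
# what Hopsworks does server-side (with Spark) when building a training dataset.
customers = pd.DataFrame({"customer_id": [1, 2], "avg_amount": [20.0, 5.0]})
labels = pd.DataFrame({"customer_id": [1, 2], "churned": [0, 1]})
training_df = customers.merge(labels, on="customer_id")

# The equivalent with the Hopsworks Python API (sketch; needs a live project):
# query = customer_fg.select_all().join(label_fg.select(["churned"]))
# fv = fs.create_feature_view(name="churn_fv", version=1, query=query)
# X_train, X_test, y_train, y_test = fv.train_test_split(test_size=0.2)
```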
Hopsworks 3.0 improves model serving and deployment using open-source frameworks like KServe and Istio. What are the differentiators of these two frameworks that make them a good fit for an MLOps pipeline?
FB: Istio is the go-to secure service mesh in K8s. KServe is the de facto framework to deploy models on K8s and make those models available to users through REST APIs. KServe leverages Istio to provide service discovery, request routing, and secure external endpoints. Both Istio and KServe are widely used and battle-tested, making them good candidates for running production-grade model serving infrastructure.
What we did in Hopsworks is extend KServe with access control using Hopsworks API keys and with feature/prediction logging, seamlessly integrating it with the Hopsworks ecosystem. Hopsworks' model registry is designed for managing versioned KServe deployments (artifacts, transformers, predictors), and the Hopsworks Python API gives you secure access to both the feature store and model deployments on KServe.
But most importantly, we worked on providing models deployed on KServe with the real-time data they need to make predictions, using the feature store. At the same time, Hopsworks provides the infrastructure to log predictions back to the feature store, enabling analysis and debugging of models, and even generating new feature data for models.
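A sketch of that serving-time pattern: a transformer-style class (the shape KServe transformers take) that enriches each request with precomputed features before the model runs. The in-memory dict below is a stand-in for the online feature store; in a real deployment the hypothetical lookup would be replaced by the feature view's `get_feature_vector` call, shown commented out.

```python
# Hypothetical precomputed features, standing in for the online feature store.
PRECOMPUTED = {1: {"txn_count": 2, "avg_amount": 20.0}}

class Transformer:
    """Serving-time transformer: enrich requests, post-process responses."""

    def preprocess(self, inputs: dict) -> dict:
        # Real version (needs a live feature store connection):
        # vector = self.fv.get_feature_vector({"customer_id": inputs["customer_id"]})
        vector = PRECOMPUTED[inputs["customer_id"]]
        # Hand the model the full precomputed feature vector, not the raw request.
        return {"instances": [[vector["txn_count"], vector["avg_amount"]]]}

    def postprocess(self, outputs: dict) -> dict:
        # Predictions could also be logged back to the feature store here.
        return outputs
```

The client only sends an entity key (here `customer_id`); the transformer pulls everything else from precomputed features, which is what keeps online inference fast.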
The feature store space remains extremely crowded, with platforms that seem to provide similar feature sets. Do you think the market is going to become more differentiated and less fragmented in the next couple of years? What are some of the main challenges, and what technical innovations are we likely to see in the next wave of feature store platforms?
FB: From a technology perspective, I expect all platforms to start focusing on data scientists as well and on lowering the barrier to adoption, as we did for Hopsworks 3.0.
The market is already going through a consolidation phase, with several startups that offered feature stores last year no longer doing so this year. The remaining vendors are maturing their platforms and focusing on adding use cases. Like any other segment, I expect some players not to make it, and I would not be surprised if some big players make moves to augment or bootstrap their ML/AI capabilities.
💥 Miscellaneous – a set of rapid-fire questions
Favorite math paradox?
All horses are the same color: an attempt to use proof by induction to show that any group of horses is of the same color. It seems reasonable at first, but the induction step breaks down for a group of just two horses.
What book can you recommend to an aspiring ML engineer?
Feature Engineering Bookcamp by Sinan Ozdemir (Manning). In particular, the hands-on chapters that use Hopsworks!
Is the Turing Test still relevant? Any clever alternatives?
Chatbots, and even GPT-3, have become very good at fooling humans into thinking they are talking to other humans. However, they were developed for this very purpose, and therefore it's hard to infer intelligence from that.
Most exciting area of deep learning research at the moment?
Diffusion models are all the rage these days. It will be interesting to see how far the community can push these models and, more importantly, whether we'll be able to make them cheaper to train.