📌 Event: Leverage your Snowflake, BigQuery, Redshift Data Warehouse with a Real-Time Feature Store / Sept 21
Building historical and reproducible training datasets from data warehouses
The Hopsworks feature store can be configured to leverage the contents of data warehouses to simplify the data science workflow. For data scientists, using data directly from a data warehouse presents three challenges:
Data in the data warehouse is often updated, which makes it impossible to reproduce previously generated training data and earlier experiments.
Data warehouses often lack a historical view of the data, leaving data scientists with the chore of building it.
Finally, productionizing a model often requires building additional pipelines to make the same data available in a low-latency database for online serving.
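To make the first two challenges concrete, here is a minimal sketch in plain Python of why keeping a history of changes makes a training dataset reproducible. It does not use the Hopsworks API; the `commits` log, the `read_as_of` helper, and the sample rows are all hypothetical and exist only for illustration.

```python
from datetime import datetime

# Hypothetical append-only log: every change is kept with its commit time,
# instead of overwriting the row in place as a warehouse table typically would.
commits = [
    (datetime(2022, 9, 1), {"customer_id": 1, "avg_spend": 40.0}),
    (datetime(2022, 9, 1), {"customer_id": 2, "avg_spend": 15.0}),
    (datetime(2022, 9, 10), {"customer_id": 1, "avg_spend": 55.0}),  # later update
]


def read_as_of(log, as_of):
    """Return the latest row per customer_id committed at or before `as_of`."""
    snapshot = {}
    for committed_at, row in sorted(log, key=lambda c: c[0]):
        if committed_at <= as_of:
            snapshot[row["customer_id"]] = row
    return [snapshot[key] for key in sorted(snapshot)]


# A training dataset built on September 5 can be rebuilt later, unchanged,
# even though customer 1 was updated on September 10.
training_v1 = read_as_of(commits, datetime(2022, 9, 5))
latest = read_as_of(commits, datetime(2022, 9, 15))
```

Reading the log as of a fixed timestamp always returns the same rows, which is the property an in-place updated table cannot offer.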
What is this talk about?
In this talk, we will discuss how Hopsworks can be connected to existing cloud-native data warehouses like Snowflake, Redshift and BigQuery. We will show how to use data warehouses as a source of data to build historical and reproducible training datasets. We will also show how to leverage the core functionalities of Hopsworks (Python-centric APIs, time travel, statistics, search and data validation) to build historical, clean and reproducible datasets for training and productionizing machine learning models.
What: Leverage your Snowflake, BigQuery, Redshift Data Warehouse with a Real-Time Feature Store
Who: Fabio Buso, VP of Engineering at Hopsworks
When: Wednesday, September 21st | 9 AM PDT