☁️ Edge#140: cnvrg.io’s Metacloud aims to help AI developers to fight vendor lock-in
This is an example of TheSequence Edge, a Premium newsletter that our subscribers receive every Tuesday and Thursday. On Thursdays, we do deep dives into one of the freshest research papers or technology frameworks that is worth your attention.
💥 What’s New in AI: cnvrg.io’s Metacloud aims to help AI developers to fight vendor lock-in
Fragmentation is one of the key challenges faced when building machine learning (ML) applications in the real world. Given the early stage of the ML tools market, data science teams commonly end up using multiple technology stacks to optimize different stages of the ML application lifecycle, such as training, hyperparameter optimization, monitoring, and deployment. That level of fragmentation creates regular friction in the development of ML solutions, given the lack of integration and the inconsistent user experience between ML stacks. Many people would argue that it’s too early in the ML market to propose end-to-end solutions for implementing ML pipelines, yet that doesn’t prevent ambitious startups from trying.
Today, we would like to discuss cnvrg.io. Despite its relatively short time in the ML market, cnvrg.io can’t be considered a startup anymore. In 2020, Intel acquired cnvrg.io to accelerate its ML offering but has kept the company relatively independent. cnvrg.io has rapidly evolved its offering, providing key building blocks that cover nearly all aspects of the lifecycle of ML models. With the recently launched cnvrg.io Metacloud, the company claims to have become one of the most complete platforms, one that provides an end-to-end experience for implementing and operationalizing ML solutions and frees AI developers from vendor lock-in. Let’s look into it.
The cnvrg.io Platform
A good way to think about the cnvrg.io platform is as a single experience for building and managing all aspects of ML pipelines. From a functional standpoint, cnvrg.io includes key building blocks that give data scientists and ML engineers a consistent experience for managing the lifecycle of ML models. The feature set of the cnvrg.io platform can be decomposed into the following key capabilities:
Machine Learning Pipelines
The cnvrg.io platform provides a visual workflow interface for designing end-to-end ML pipelines. The visual environment improves the reusability and traceability of ML components as well as their optimization for different environments. cnvrg.io’s ML pipeline capabilities include automatic hyperparameter tuning as well as out-of-the-box integration with runtimes such as Spark or Kubernetes. The platform also includes performance monitoring and retraining triggers based on the runtime behavior of ML workflows.
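To make the idea of a retraining trigger concrete, here is a minimal sketch in plain Python. It is not cnvrg.io's implementation or API: the `RetrainTrigger` class, the rolling-accuracy rule, and the window and threshold values are all assumptions chosen for illustration. The trigger fires once accuracy over the most recent labeled predictions falls below a threshold.

```python
from collections import deque


class RetrainTrigger:
    """Hypothetical trigger: fires when rolling accuracy drops below a threshold."""

    def __init__(self, window: int = 100, threshold: float = 0.9):
        self.threshold = threshold
        # Keep only the most recent `window` outcomes (1 = correct, 0 = wrong).
        self.outcomes = deque(maxlen=window)

    def observe(self, prediction, label) -> bool:
        """Record one labeled prediction; return True if retraining should start."""
        self.outcomes.append(1 if prediction == label else 0)
        if len(self.outcomes) < self.outcomes.maxlen:
            return False  # not enough evidence yet
        accuracy = sum(self.outcomes) / len(self.outcomes)
        return accuracy < self.threshold


# Illustrative stream of (prediction, label) pairs: the model starts out
# correct and then begins to miss.
trigger = RetrainTrigger(window=4, threshold=0.75)
stream = [(1, 1), (0, 0), (1, 1), (1, 1), (0, 1), (0, 1)]
fired = [trigger.observe(p, y) for p, y in stream]
```

In a real pipeline, the `True` result would kick off a training job instead of being collected in a list. A production trigger would likely also watch data drift and latency, not accuracy alone.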
ML Deployment
cnvrg.io automates the deployment of ML models to Kubernetes clusters across different cloud and on-premises runtimes. The deployment module natively integrates with the ML performance monitoring features to enable the continuous retraining and redeployment of ML models.
Dataset Management
cnvrg.io enables native integration with data sources such as Snowflake, S3, relational databases, and many others. The platform includes a version control system as well as a labeling interface that facilitates the creation of training datasets. Additionally, cnvrg.io associates training datasets with models, creating the necessary feedback loops for optimization and retraining.
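The link between dataset versions and models can be illustrated with a small, generic sketch. It does not use cnvrg.io's SDK. The content-hash versioning scheme and the `Lineage` class are assumptions made purely to show how associating a model with the exact data it was trained on yields a retraining signal.

```python
import hashlib
import json


def dataset_version(records: list) -> str:
    """Derive a deterministic version id from the dataset contents."""
    canonical = json.dumps(records, sort_keys=True).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()[:12]


class Lineage:
    """Hypothetical registry mapping each model to its training dataset version."""

    def __init__(self):
        self._trained_on = {}

    def record(self, model: str, records: list) -> str:
        """Remember which dataset version the model was trained on."""
        version = dataset_version(records)
        self._trained_on[model] = version
        return version

    def needs_retraining(self, model: str, current_records: list) -> bool:
        """True when the current data differs from what the model was trained on."""
        return self._trained_on.get(model) != dataset_version(current_records)


# Illustrative data: a second labeled example arrives after training.
v1 = [{"text": "good", "label": 1}]
lineage = Lineage()
lineage.record("sentiment", v1)
v2 = v1 + [{"text": "bad", "label": 0}]
stale = lineage.needs_retraining("sentiment", v2)
```

Here `stale` is `True` because the dataset changed after the model was recorded. Checking against `v1` again would report that no retraining is needed.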
AI Library
cnvrg.io includes a catalog of pre-configured models and ML components that can be seamlessly integrated into new ML applications. The catalog goes beyond algorithms and contains elements such as Docker images and runtime configurations. Data scientists can access pre-configured ML modules using the web interface or the Python/CLI SDKs.
ML Tracking
The cnvrg.io platform includes a series of capabilities to enable the monitoring and tracking of ML models. The monitoring engine tracks runtime metrics such as GPU and memory usage as well as key performance indicators of ML models. Data scientists can integrate cnvrg.io’s ML monitoring capabilities using just a few lines of Python code.
Open Compute
cnvrg.io seamlessly integrates with different cloud infrastructure environments such as AWS, GCP or Azure and enables the elastic scaling of compute resources for ML models. cnvrg.io also enables the management of compute resources for ML pipelines from a single, centralized interface. This capability has been accelerated with the recent release of cnvrg.io Metacloud.
cnvrg.io Metacloud
cnvrg.io Metacloud is a significant addition to the cnvrg.io platform. Infrastructure dependency is one of the handicaps of modern ML solutions. Whether you are using AWS or Azure container engines for your ML solutions, you are likely to incur a dependency on that platform. cnvrg.io’s Metacloud abstracts the lifecycle of ML models from the underlying compute infrastructure. The result is a consistent experience and workflow to scale ML pipelines across different infrastructures without incurring specific dependencies on any of them.
From the functional standpoint, Metacloud accelerates the following capabilities:
Open Compute: Metacloud lets teams use any cloud solution or on-premises hardware instantly. On Metacloud’s marketplace of OEMs, a user can select from an extensive catalog of ML platforms, hardware architectures and accelerators from vendors such as Dell, Intel, Lenovo, Supermicro, Lambda Labs and others. With this optionality, data scientists can run AI workloads on whichever compute is cheaper or faster, rather than being limited to the infrastructure their organization runs on. A typical use case for Metacloud would be for a data science team to run a preprocessing pipeline on AWS, train on GPUs or another accelerator, and deploy on CPUs.
BYO Compute & Storage: In addition to the preset catalog, Metacloud allows ML engineers to leverage their own storage and compute by simply connecting it to cnvrg.io and running any AI workload on those providers from the cnvrg.io control plane.
Hosted/Managed SaaS: As a managed service, cnvrg.io Metacloud can be deployed instantly, allowing teams to work with cnvrg.io without any concern about managing the underlying Kubernetes cluster or disparate infrastructures.
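The use case described under Open Compute, where each pipeline stage runs on the compute that suits it best, can be sketched as a simple selection over a catalog. Everything in this example is hypothetical: the target names, prices, speed figures, and the `pick` helper are invented for illustration and do not reflect Metacloud's catalog or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ComputeTarget:
    name: str
    kind: str              # "cpu" or "gpu"
    cost_per_hour: float   # hypothetical price
    relative_speed: float  # hypothetical throughput, higher is faster


# Invented catalog mixing cloud and on-premises resources.
CATALOG = [
    ComputeTarget("cloud-cpu", "cpu", 0.40, 1.0),
    ComputeTarget("cloud-gpu", "gpu", 3.00, 12.0),
    ComputeTarget("onprem-gpu", "gpu", 1.50, 8.0),
    ComputeTarget("onprem-cpu", "cpu", 0.20, 0.8),
]


def pick(catalog, kind: str, objective: str) -> ComputeTarget:
    """Choose the cheapest or fastest target of the requested kind."""
    candidates = [t for t in catalog if t.kind == kind]
    if not candidates:
        raise ValueError(f"no compute target of kind {kind!r}")
    if objective == "cheapest":
        return min(candidates, key=lambda t: t.cost_per_hour)
    if objective == "fastest":
        return max(candidates, key=lambda t: t.relative_speed)
    raise ValueError(f"unknown objective {objective!r}")


# Each stage states what it needs; the plan maps stages to target names.
PIPELINE = {
    "preprocess": ("cpu", "cheapest"),
    "train": ("gpu", "fastest"),
    "deploy": ("cpu", "cheapest"),
}
plan = {stage: pick(CATALOG, kind, obj).name for stage, (kind, obj) in PIPELINE.items()}
```

With this invented catalog, preprocessing and deployment land on the cheapest CPU target while training goes to the fastest GPU. The pipeline definition itself never names a vendor, which is the decoupling the section describes.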
Conclusion
cnvrg.io has evolved into a complete ML stack that can be adapted to highly heterogeneous scenarios. For instance, the platform has incorporated robust security and access control mechanisms, which are highly relevant in mission-critical enterprise ML solutions. Additionally, cnvrg.io’s Data Science Workbench provides a single code environment to use different ML frameworks such as TensorFlow or PyTorch as well as native acceleration with stacks such as NVIDIA CUDA. The addition of cnvrg.io Metacloud positions cnvrg.io as one of the most relevant cloud- and hardware-agnostic, end-to-end ML platforms in the market.