🍮 Edge#147: MLOps – Model Serving

plus an overview of the TensorFlow Serving paper and TorchServe

Dec 07, 2021

In this issue:

we explain what model serving is;
we explore the TensorFlow Serving paper;
we cover TorchServe, a super simple serving framework for PyTorch.

💡 ML Concept of the Day: Model Serving

Continuing our MLOps series, we turn to the serving of machine learning (ML) models. Model deployment and serving is arguably one of the most challenging parts of an MLOps pipeline, partly because serving architectures have little to do with data science and far more to do with ML engineering. Some ML models take hours to execute and require large computation pipelines, while others run in seconds on a mobile phone. A solid ML serving infrastructure should be able to adapt to these diverse requirements from ML applications.
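
To make the idea concrete, here is a minimal sketch of what "serving" means at its simplest: a model placed behind a network endpoint that accepts a request and returns a prediction. It uses only the Python standard library and is not how TensorFlow Serving or TorchServe work internally. The `predict` function, its weights, the `{"features": [...]}` request schema and the port are all hypothetical placeholders chosen for illustration.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def predict(features):
    """Hypothetical stand-in for a trained model: a fixed linear scorer."""
    weights = [0.5, -0.25, 1.0]  # assumed parameters, not from a real model
    return sum(w * x for w, x in zip(weights, features))


def handle_request(body: bytes):
    """Turn a JSON request body into (HTTP status, JSON response body)."""
    try:
        features = json.loads(body)["features"]
        score = predict([float(x) for x in features])
    except (ValueError, KeyError, TypeError):
        # Malformed JSON, missing key, or non-numeric input
        error = {"error": "expected a JSON object with a 'features' list"}
        return 400, json.dumps(error).encode()
    return 200, json.dumps({"score": score}).encode()


class PredictHandler(BaseHTTPRequestHandler):
    """Exposes handle_request over HTTP POST."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        status, payload = handle_request(self.rfile.read(length))
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port: int = 8080):
    """Block forever, answering prediction requests on localhost."""
    HTTPServer(("127.0.0.1", port), PredictHandler).serve_forever()
```

Calling `serve()` starts the endpoint, and a client can then POST a body such as `{"features": [2, 4, 1]}` to it. Keeping `handle_request` separate from the HTTP handler means the prediction logic can be tested without opening a socket. A production serving system adds what this sketch leaves out, such as batching, model versioning, hardware acceleration and monitoring.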

Many requirements shape a model serving architecture. Over the last few years, four fundamental model serving patterns have emerged in modern ML architectures:
