TheSequence
๐Ÿฎ Edge#147: MLOPs โ€“ Model Serving

๐Ÿฎ Edge#147: MLOPs โ€“ Model Serving

Plus an overview of the TensorFlow Serving paper and TorchServe

Dec 07, 2021


In this issue:

  • we explain what model serving is;

  • we explore the TensorFlow serving paper;

  • we cover TorchServe, a super simple serving framework for PyTorch.


💡 ML Concept of the Day: Model Serving

Continuing with our MLOps series, we would like to focus on the serving of machine learning (ML) models. Model deployment/serving can be considered one of the most challenging aspects of an MLOps pipeline. This is partly because model serving architectures have little to do with data science and are more closely related to ML engineering. Some ML models take hours to execute and require large computation pipelines, while others can be executed in seconds on a mobile phone. A solid ML serving infrastructure should be able to adapt to these diverse application requirements.

Many unique requirements can influence a model serving architecture. Over the last few years, four fundamental model serving patterns have emerged in modern ML architectures.
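Whatever the pattern, most serving frameworks (TorchServe included) structure each request around the same lifecycle: deserialize the payload, run inference, and serialize the predictions. The sketch below illustrates that lifecycle with a hypothetical handler class and a hard-coded stand-in model; `DummyModelHandler` and its linear "model" are illustrative assumptions, not part of any real framework's API.

```python
# Minimal sketch of the request lifecycle a model server manages.
# The "model" here is a stand-in (a hard-coded linear function);
# a real server would load a trained artifact instead.

import json


class DummyModelHandler:
    """Illustrative handler: preprocess -> inference -> postprocess."""

    def __init__(self):
        # Stand-in for loading model weights from a file or registry.
        self.weight, self.bias = 2.0, 1.0

    def preprocess(self, raw_request: bytes) -> list:
        # Deserialize the incoming payload into model inputs.
        return json.loads(raw_request)["inputs"]

    def inference(self, inputs: list) -> list:
        # Run the (dummy) model on the parsed inputs.
        return [self.weight * x + self.bias for x in inputs]

    def postprocess(self, outputs: list) -> bytes:
        # Serialize predictions back into a transport format.
        return json.dumps({"predictions": outputs}).encode()

    def handle(self, raw_request: bytes) -> bytes:
        return self.postprocess(self.inference(self.preprocess(raw_request)))


handler = DummyModelHandler()
response = handler.handle(b'{"inputs": [1.0, 2.0]}')
print(response)  # b'{"predictions": [3.0, 5.0]}'
```

Production frameworks wrap this same lifecycle in an HTTP/gRPC front end and add concerns the sketch omits: batching, versioning, and hardware placement.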

© 2025 Jesus Rodriguez