Introduction to Model Deployment with Ray Serve
Welcome to the tutorial at MLOps World 2022 in Toronto!
This is a two-part, introductory, hands-on 👩‍💻 guided tutorial. Part one is a hands-on coding tour through the Ray Core APIs, which provide powerful yet easy-to-use design patterns (tasks and actors) for implementing distributed systems in Python. Building on that foundation, part two focuses on Ray Serve: what it is and why you would use it, its scalable architecture, and its model deployment patterns.
Then, using code examples 👩‍💻 in Jupyter notebooks, we will take a coding tour of creating, exposing, and deploying models to Ray Serve using core deployment APIs.
And lastly, we will touch on Ray Serve's integration with model registries such as MLflow, walk through an end-to-end example, and show Ray Serve's integration with FastAPI.
Key takeaways for students:
- 👩‍💻 Code with the Ray Core APIs to convert Python functions and classes into a distributed setting
- 📖 Use the Ray Serve APIs to create, expose, and deploy models
- ☁️ Access and call deployment endpoints in Ray Serve via Python or HTTP
- ⚙️ Configure compute resources and replicas to scale models in production
- 🔗 Learn about Ray Serve integrations with MLflow and FastAPI
Outline for the Tutorial Lessons 📖
| Notebooks | Module 1 | Ray Core API Patterns for Tasks, Objects, and Actors |
|---|---|---|
| 00 | Ray Tutorial Overview | Overview of this tutorial. |
| 01 | Ray Remote Functions | The Remote Function as Stateless Tasks pattern. |
| 02 | Ray Remote Objects | The Remote Objects as Futures pattern. |
| 03 | Ray Remote Classes | The Remote Classes as Stateful Actors pattern. |
| | **Module 2** | **Introduction to Ray Serve and model deployments** |
| 04 | Ray Serve Model Serving Challenges | What the model serving challenges are. |
| 05 | Ray Serve Model Composition | The model composition deployment pattern: a sentiment-analysis model using Hugging Face 🤗 Transformers. |
| | **Extras** | **Ray Serve integrations and an end-to-end example** |
| 06 | Ray Serve and MLflow Integration | A simple model trained, logged to the MLflow registry, and served from a Ray Serve deployment. |
| 07 | Ray Serve and FastAPI Integration | A simple XGBoost model trained, deployed, and accessed via FastAPI endpoints. |
| 08 | Ray Serve End-to-End Example | An end-to-end classification example using XGBoost, Tune, and Ray Serve with the diabetes dataset. |
🧑‍🎓 Prerequisite knowledge
Some prior experience with Python and Jupyter notebooks will be helpful, but we'll explain most details as we go if you haven't used notebooks before. Knowledge of basic machine learning concepts, including hyperparameters, model serving, and the principles of distributed computing, is helpful but not required.
All exercises are optional and can be done on your laptop, preferably one running Linux or macOS. Because you won't have access to a Ray cluster, we will run Ray locally and parallelize all your tasks across all your cores.
Python 3.7+ is required on your laptop, along with a minimal installation of Python packages using conda and pip.
👩‍🏫 Instructions to get started
We assume that you have conda installed.
```bash
conda create -n ray-core-serve-tutorial python=3.8
conda activate ray-core-serve-tutorial
git clone git@github.com:dmatrix/ray-core-serve-tutorial-mlops.git
cd <cloned_dir>
python3 -m pip install -r requirements.txt
python3 -m ipykernel install
conda install jupyterlab
jupyter lab
```
If you are using an Apple M1 laptop 🍎, run the following additional command:

```bash
conda install grpcio
```
Let's have 🎉 fun with Ray @ MLOps World 2022!
Thank you 🙏,
Jules & Archit