Introduction to Model Deployment with Ray Serve
Welcome to the tutorial at MLOps World 2022 in Toronto!
This is a two-part, introductory, hands-on 👩‍💻 guided tutorial. Part one is a hands-on coding tour through the Ray Core APIs, which provide powerful yet easy-to-use design patterns (tasks and actors) for implementing distributed systems in Python. Building on that foundation, part two focuses on Ray Serve: what it is and why you would use it, its scalable architecture, and its model deployment patterns.
Then, using code examples 👩‍💻 in Jupyter notebooks, we will take a coding tour of creating, exposing, and deploying models to Ray Serve using core deployment APIs.
And lastly, we will touch on Ray Serve's integration with model registries such as MLflow, walk through an end-to-end example, and show Ray Serve's integration with FastAPI.
Key takeaways for students:
- 👩‍💻 Code with the Ray Core APIs to convert Python functions and classes into a distributed setting
- 📖 Use the Ray Serve APIs to create, expose, and deploy models
- ☁️ Access and call deployment endpoints in Ray Serve via Python or HTTP
- ⚙️ Configure compute resources and replicas to scale models in production
- 🔗 Learn about Ray Serve integrations with MLflow and FastAPI
Outline for the Tutorial Lessons 📖
| Notebooks | Module 1 | Ray Core API Patterns for Tasks, Objects, and Actors |
|---|---|---|
| 00 | Ray Tutorial Overview | Overview of this tutorial. |
| 01 | Ray Remote Functions | The Remote Function as Stateless Tasks pattern. |
| 02 | Ray Remote Objects | The Remote Objects as Futures pattern. |
| 03 | Ray Remote Classes | The Remote Classes as Stateful Actors pattern. |
| | **Module 2** | **Introduction to Ray Serve and model deployments** |
| 04 | Ray Serve Model Serving Challenges | What the model serving challenges are. |
| 05 | Ray Serve Model Composition | The model composition deployment pattern: a sentiment-analysis model using Hugging Face 🤗 Transformers. |
| | **Extras** | **Ray Serve integrations and an end-to-end example** |
| 06 | Ray Serve and MLflow Integration | A simple model trained, logged to the MLflow registry, and served from a Ray Serve deployment. |
| 07 | Ray Serve and FastAPI Integration | A simple XGBoost model trained, deployed, and accessed via FastAPI endpoints. |
| 08 | Ray Serve End-to-End Example | An end-to-end classification example using XGBoost, Tune, and Ray Serve with the diabetes dataset. |
🧑‍🎓 Prerequisite knowledge
Some prior experience with Python and Jupyter notebooks will be helpful, but we'll explain most details as we go if you haven't used notebooks before. Knowledge of basic machine learning concepts, including hyperparameters, model serving, and the principles of distributed computing, is helpful but not required.
All exercises are optional and can be done on your laptop, preferably one running Linux or macOS. Because you won't have access to a Ray cluster, we will run Ray locally and parallelize all your tasks across all your cores.
Python 3.7+ is required on your laptop, along with a minimal installation of Python packages using conda and pip.
👩‍🏫 Instructions to get started
We assume that you have conda installed.
```bash
conda create -n ray-core-serve-tutorial python=3.8
conda activate ray-core-serve-tutorial
git clone git@github.com:dmatrix/ray-core-serve-tutorial-mlops.git
cd <cloned_dir>
python3 -m pip install -r requirements.txt
python3 -m ipykernel install
conda install jupyterlab
jupyter lab
```
If you are using an Apple M1 laptop 🍎, run the following additional command:

```bash
conda install grpcio
```
Let's have 🎉 fun with Ray @ MLOps World 2022!
Thank you 🙏,
Jules & Archit