Amazon SageMaker Safe Deployment Pipeline

Introduction

This is a sample solution to build a safe deployment pipeline for Amazon SageMaker. This example could be useful for any organization looking to operationalize machine learning with native AWS development tools such as AWS CodePipeline, AWS CodeBuild and AWS CodeDeploy.

This solution provides a safe deployment by creating an AWS Lambda API that calls into an Amazon SageMaker Endpoint for real-time inference.
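As a rough illustration of the Lambda-to-endpoint call described above, the sketch below forwards a request body to a SageMaker endpoint using the `invoke_endpoint` call of the boto3 `sagemaker-runtime` client. The `ENDPOINT_NAME` environment variable, the CSV content type and the response shape are assumptions made for this example; they are not taken from this repository's api/app.py.

```python
import json
import os


def make_handler(runtime_client, endpoint_name):
    """Build a Lambda-style handler that proxies requests to a SageMaker endpoint."""

    def handler(event, context=None):
        # API Gateway proxy events carry the request payload in "body".
        payload = event.get("body") or ""
        response = runtime_client.invoke_endpoint(
            EndpointName=endpoint_name,
            ContentType="text/csv",  # assumed payload format
            Body=payload,
        )
        # The response "Body" is a streaming object; read and decode it.
        prediction = response["Body"].read().decode("utf-8")
        return {
            "statusCode": 200,
            "headers": {"Content-Type": "application/json"},
            "body": json.dumps({"prediction": prediction}),
        }

    return handler


def lambda_handler(event, context):
    # boto3 is imported lazily so the module loads without AWS access.
    import boto3

    client = boto3.client("sagemaker-runtime")
    # ENDPOINT_NAME is a hypothetical environment variable.
    return make_handler(client, os.environ["ENDPOINT_NAME"])(event, context)
```

Passing the client in as an argument keeps the handler easy to exercise locally with a stub before it is wired to a real endpoint.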

Architecture

Following is a diagram of the continuous delivery stages in AWS CodePipeline.

Build Artifacts: Runs an AWS CodeBuild job to create AWS CloudFormation templates.
Train: Trains an Amazon SageMaker pipeline and Baseline Processing Job.
Deploy Dev: Deploys a development Amazon SageMaker Endpoint.
Deploy Prod: Deploys an Amazon API Gateway and AWS Lambda function in front of Amazon SageMaker Endpoints, using AWS CodeDeploy for blue/green deployment and rollback.
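To check where a run currently sits among the stages listed above, one option is to read the pipeline state with the boto3 CodePipeline client. The sketch below is a minimal example under the assumption that you know your pipeline's name; the "NotStarted" label is a placeholder chosen for this example and is not a CodePipeline status.

```python
def summarize_stages(pipeline_state):
    """Map each stage name to the status of its latest execution."""
    summary = {}
    for stage in pipeline_state.get("stageStates", []):
        latest = stage.get("latestExecution", {})
        # Stages that have never run carry no latestExecution entry;
        # "NotStarted" is a label invented for this sketch.
        summary[stage["stageName"]] = latest.get("status", "NotStarted")
    return summary


def fetch_stage_summary(pipeline_name):
    """Fetch the live pipeline state and summarize it (needs AWS credentials)."""
    import boto3  # imported lazily so the helper above works offline

    client = boto3.client("codepipeline")
    return summarize_stages(client.get_pipeline_state(name=pipeline_name))
```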

Component Details

Amazon SageMaker – This solution uses SageMaker to train the model and to host it at an endpoint, where it can be accessed via HTTP/HTTPS requests.
AWS CodePipeline – CodePipeline has various stages defined in CloudFormation, which step through the actions that must be taken, and in which order, to go from source code to creation of the production endpoint.
AWS CodeBuild – This solution uses CodeBuild to build the source code from GitHub.
AWS CloudFormation – This solution uses the CloudFormation template language, in either YAML or JSON, to create each resource, including custom resources.
Amazon S3 – Artifacts created throughout the pipeline, as well as the data for the model, are stored in an Amazon Simple Storage Service (S3) bucket.

Deployment Steps

Following is the list of steps required to get up and running with this sample.

Step 1. Prepare an AWS Account

Create your AWS account at http://aws.amazon.com by following the instructions on the site.

Step 2. Fork this GitHub Repository

Fork this GitHub Repository so that you can run with your own GitHub Auth Token.

Step 3. Create a GitHub OAuth Token

Create your token at GitHub's Token Settings, making sure to select scopes of repo and admin:repo_hook. After clicking Generate Token, make sure to save your OAuth Token in a secure location. The token will not be shown again.
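Before launching the stack, it can help to confirm that the token really carries both scopes. For classic personal access tokens, GitHub reports the granted scopes in the `X-OAuth-Scopes` response header of an API call. The sketch below checks that header; the function names are hypothetical, and the check is a simple exact match on scope names.

```python
import urllib.request

# Scopes this walkthrough asks for.
REQUIRED_SCOPES = {"repo", "admin:repo_hook"}


def missing_scopes(scopes_header):
    """Return the required scopes absent from an X-OAuth-Scopes header value."""
    granted = {s.strip() for s in (scopes_header or "").split(",") if s.strip()}
    return REQUIRED_SCOPES - granted


def check_token(token):
    """Call the GitHub API once and report which required scopes are missing."""
    request = urllib.request.Request(
        "https://api.github.com/user",
        headers={"Authorization": f"token {token}", "User-Agent": "scope-check"},
    )
    with urllib.request.urlopen(request) as response:
        return missing_scopes(response.headers.get("X-OAuth-Scopes"))
```

An empty set from `check_token` means both scopes are present.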

Step 4. Launch the Stack

Click on the Launch Stack button below to launch the CloudFormation Stack to set up the SageMaker Pipeline. Before launching, make sure the architecture and configuration are set as desired.

You can launch the same stack using the AWS CLI. Here's an example:

aws cloudformation create-stack --stack-name sagemaker-safe-deployment \
    --template-body file://pipeline.yml \
    --capabilities CAPABILITY_IAM \
    --parameters \
        ParameterKey=GitHubUser,ParameterValue=youremailaddress@example.com \
        ParameterKey=GitHubToken,ParameterValue=YOURGITHUBTOKEN12345ab1234234 \
        ParameterKey=ModelName,ParameterValue=mymodelname

Following is a list of the parameters for the CloudFormation stack.

Parameters	Description
Email	The email where CodePipeline will send SNS notifications.
GitHubUser	GitHub Username.
GitHubToken	A Secret OAuthToken with access to the GitHub repo.
GitHubRepo	The name (not URL) of the GitHub repository to pull from.
GitHubBranch	The name (not URL) of the GitHub repository’s branch to use.
ModelName	The short name to namespace all the mlops resources.

Make sure you also set the GitHubUser stack parameter to the GitHub account that owns your fork.
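If you prefer Python to the AWS CLI, the stack launch can be sketched with the boto3 CloudFormation client, whose `create_stack` call takes parameters as a list of `ParameterKey`/`ParameterValue` pairs. The helper names below are hypothetical, and the client is passed in so the example does not assume any particular credentials setup.

```python
def to_cfn_parameters(values):
    """Convert a plain dict into the Parameters list that create_stack expects."""
    return [
        {"ParameterKey": key, "ParameterValue": str(value)}
        for key, value in sorted(values.items())
    ]


def launch_stack(cfn_client, stack_name, template_body, values):
    """Create the stack using an injected CloudFormation client."""
    return cfn_client.create_stack(
        StackName=stack_name,
        TemplateBody=template_body,
        Capabilities=["CAPABILITY_IAM"],
        Parameters=to_cfn_parameters(values),
    )


# Example call (requires boto3 and AWS credentials):
#   import boto3
#   with open("pipeline.yml") as template:
#       launch_stack(
#           boto3.client("cloudformation"),
#           "sagemaker-safe-deployment",
#           template.read(),
#           {"GitHubUser": "...", "GitHubToken": "...", "ModelName": "mymodelname"},
#       )
```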

Step 5. Start, Test and Approve the Deployment

Once the deployment has completed, there will be a new AWS CodePipeline created linked to your GitHub source. You will notice initially that it will be in a Failed state as it is waiting on an S3 data source.

Launch the newly created SageMaker Notebook in your AWS console, navigate to the notebook directory and open the notebook by clicking on the mlops.ipynb link.

Once the notebook is running, you will be guided through a series of steps: downloading the New York City Taxi dataset, then uploading it, along with the data source metadata, to an Amazon SageMaker S3 bucket to trigger a new build in the AWS CodePipeline.

Once your pipeline is kicked off, it will run model training and deploy a development SageMaker Endpoint.

There is a manual approval step which you can action directly within the SageMaker Notebook to promote this to production, send some traffic to the live endpoint and create a REST API.
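Outside the notebook, a manual approval can also be actioned through the CodePipeline API: `get_pipeline_state` exposes a token while an approval action is pending, and `put_approval_result` submits the decision. The sketch below shows that flow; the stage and action names you pass in are assumptions, since they depend on how the pipeline is defined in pipeline.yml.

```python
def find_approval_token(pipeline_state, stage_name, action_name):
    """Return the token of a pending manual approval, or None if not waiting."""
    for stage in pipeline_state.get("stageStates", []):
        if stage.get("stageName") != stage_name:
            continue
        for action in stage.get("actionStates", []):
            if action.get("actionName") == action_name:
                # The token is only present while the action awaits approval.
                return action.get("latestExecution", {}).get("token")
    return None


def approve(client, pipeline_name, stage_name, action_name, summary="Approved"):
    """Approve the pending action; returns False when nothing is waiting."""
    state = client.get_pipeline_state(name=pipeline_name)
    token = find_approval_token(state, stage_name, action_name)
    if token is None:
        return False
    client.put_approval_result(
        pipelineName=pipeline_name,
        stageName=stage_name,
        actionName=action_name,
        result={"summary": summary, "status": "Approved"},
        token=token,
    )
    return True
```

Here `client` would be `boto3.client("codepipeline")`.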

Subsequent deployments of the pipeline will use AWS CodeDeploy to perform a blue/green deployment to shift traffic from the Original to Replacement endpoint over a period of 5 minutes.
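To make the traffic shift concrete, the sketch below models one possible shape for it. The text above only states that the shift takes 5 minutes; the canary pattern and the 10% figure are assumptions, borrowed from CodeDeploy's predefined `CodeDeployDefault.LambdaCanary10Percent5Minutes` configuration, and may not match what this pipeline actually uses.

```python
def replacement_share(minute, canary_percent=10, interval_minutes=5):
    """Percent of traffic on the replacement at a given minute of a canary shift."""
    if minute < 0:
        raise ValueError("minute must be non-negative")
    # A canary sends a small share first, then everything after the interval.
    return canary_percent if minute < interval_minutes else 100


def schedule(total_minutes, canary_percent=10, interval_minutes=5):
    """List (minute, percent on replacement) pairs from minute 0 onward."""
    return [
        (m, replacement_share(m, canary_percent, interval_minutes))
        for m in range(total_minutes + 1)
    ]
```

Under these assumptions the replacement serves 10% of requests for the first five minutes and 100% afterwards, which leaves a window in which a failing deployment can be rolled back with limited impact.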

Finally, the SageMaker Notebook provides the ability to retrieve the results from the Monitoring Schedule that is run on the hour.
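The same results can be looked up outside the notebook with the boto3 SageMaker client, whose `list_monitoring_executions` call returns a summary for each run of a schedule. The following is a minimal sketch; the schedule name is something you would need to look up, and the helper names are hypothetical.

```python
# Statuses that indicate a finished run with results available.
FINISHED = {"Completed", "CompletedWithViolations"}


def latest_completed(summaries):
    """Pick the most recently scheduled execution that finished, if any."""
    done = [s for s in summaries if s.get("MonitoringExecutionStatus") in FINISHED]
    if not done:
        return None
    return max(done, key=lambda s: s["ScheduledTime"])


def fetch_latest(sm_client, schedule_name):
    """Query recent executions of a schedule and return the latest finished one."""
    response = sm_client.list_monitoring_executions(
        MonitoringScheduleName=schedule_name,
        SortBy="ScheduledTime",
        SortOrder="Descending",
        MaxResults=10,
    )
    return latest_completed(response["MonitoringExecutionSummaries"])
```

A status of `CompletedWithViolations` indicates that the run finished and found constraint violations worth inspecting.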

Approximate Times:

Following is a list of approximate running times for the pipeline.

Full Pipeline: 45 minutes
Start Build: 2 minutes
Model Training and Baseline: 5 minutes
Launch Dev Endpoint: 10 minutes
Launch Prod Endpoint: 25 minutes
Monitoring Schedule: Runs on the hour

Customising for your own model

This project is written in Python, and designed to be customised for your own model and API.

.
├── api
│   ├── __init__.py
│   ├── app.py
│   ├── post_traffic_hook.py
│   └── pre_traffic_hook.py
├── model
│   ├── buildspec.yml
│   ├── requirements.txt
│   └── run.py
├── notebook
│   └── mlops.ipynb
└── pipeline.yml

Edit the get_training_params method in the model/run.py script that is run as part of the AWS CodeBuild step to add your own estimator or model definition.
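As an illustration of what a customised training definition might look like, the sketch below returns a dictionary shaped like a SageMaker `CreateTrainingJob` request. The function signature, instance type, channel name and limits are all assumptions for this example; the real `get_training_params` in model/run.py may take different arguments and return a different structure.

```python
def get_training_params(job_name, image_uri, role_arn, train_uri, output_uri,
                        hyperparameters):
    """Hypothetical example: build a CreateTrainingJob-style request dict."""
    return {
        "TrainingJobName": job_name,
        "AlgorithmSpecification": {
            "TrainingImage": image_uri,
            "TrainingInputMode": "File",
        },
        "RoleArn": role_arn,
        "InputDataConfig": [
            {
                "ChannelName": "train",  # assumed channel name
                "DataSource": {
                    "S3DataSource": {
                        "S3DataType": "S3Prefix",
                        "S3Uri": train_uri,
                        "S3DataDistributionType": "FullyReplicated",
                    }
                },
                "ContentType": "text/csv",
            }
        ],
        "OutputDataConfig": {"S3OutputPath": output_uri},
        "ResourceConfig": {
            "InstanceType": "ml.m5.xlarge",  # assumed instance type
            "InstanceCount": 1,
            "VolumeSizeInGB": 30,
        },
        "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
        # The training API expects hyperparameter values as strings.
        "HyperParameters": {k: str(v) for k, v in hyperparameters.items()},
    }
```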

Extend the AWS Lambda hooks in api/pre_traffic_hook.py and api/post_traffic_hook.py to add your own validation or inference against the deployed Amazon SageMaker endpoints. You can also edit the api/app.py Lambda to add any enrichment or transformation to the request/response payload.
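For orientation, a CodeDeploy traffic hook for Lambda receives an event containing `DeploymentId` and `LifecycleEventHookExecutionId`, and reports its verdict through `put_lifecycle_event_hook_execution_status`. The sketch below wraps a custom validation in that contract; `make_pre_traffic_hook` and the `validate` callable are hypothetical names, and this is not the code shipped in api/pre_traffic_hook.py.

```python
def make_pre_traffic_hook(codedeploy_client, validate):
    """Build a hook handler that reports the outcome of validate() to CodeDeploy."""

    def handler(event, context=None):
        try:
            # validate() is your own check, e.g. a test call to the endpoint.
            status = "Succeeded" if validate() else "Failed"
        except Exception:
            # Any error in validation fails the deployment and triggers rollback.
            status = "Failed"
        codedeploy_client.put_lifecycle_event_hook_execution_status(
            deploymentId=event["DeploymentId"],
            lifecycleEventHookExecutionId=event["LifecycleEventHookExecutionId"],
            status=status,
        )
        return {"status": status}

    return handler
```

In a deployed function, `codedeploy_client` would be `boto3.client("codedeploy")`. Treating exceptions as failures is a design choice: it keeps a broken check from letting traffic shift to an unverified endpoint.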

Opening Issues

If you encounter a bug with this project, we would like to hear about it. Search the existing issues and try to make sure your problem doesn’t already exist before opening a new issue. It’s helpful if you include the version of Python you’re using. Please include a stack trace and a reduced repro case when appropriate, too.

License

This project is distributed under the Apache License, Version 2.0; see LICENSE.txt and NOTICE.txt for more information.

VaibhavSingh98 / sagemaker-safe-deployment-pipeline