This repo is an end-to-end template for training and deploying containerized models and for creating and serving endpoints with Amazon SageMaker.
This repository is a copy of shapemaker, which is licensed under the MIT license.
The template has some modifications that allow the setup to work on M2 laptops. The following functionality is supported:
- a minimalistic template for model code
- a template for a docker image for model training
- an endpoint docker image template for real-time inference
- command-line functions for interacting with the model/endpoint
- command-line functions for delivering and integrating the model w/SageMaker
- workflows enabling continuous integration/delivery.
The setup was done on a Mac (M2); it might need to be tweaked slightly to accommodate other devices.
Cloud Services
- Amazon Web Services
  - User and Policy: Create a new user 'vmascarenhas' and attach the managed policy AmazonSageMakerFullAccess together with the following custom policy:
        {
          "Version": "2012-10-17",
          "Statement": [
            {
              "Sid": "VisualEditor0",
              "Effect": "Allow",
              "Action": [
                "ec2:DescribeNetworkInterfaces",
                "ec2:DescribeVpcs",
                "kms:ListAliases",
                "iam:ListRoles",
                "ec2:DescribeSubnets",
                "ec2:DescribeSecurityGroups",
                "ec2:CreateNetworkInterface",
                "iam:CreatePolicy",
                "ec2:DeleteNetworkInterface",
                "iam:CreateRole",
                "iam:AttachRolePolicy"
              ],
              "Resource": "*"
            }
          ]
        }
  - S3 bucket: The S3 bucket is where you store data for the various ML processes, including the training and test data passed to the ML algorithms, temporary data, and the algorithms' output (e.g. model files). Be sure to create the S3 bucket in the same region in which you intend to create the SageMaker instance.
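The region advice above can be illustrated with a small sketch. It assumes boto3 is installed and AWS credentials are configured, neither of which this README states, and the helper names `create_bucket_kwargs` and `create_bucket` are hypothetical rather than part of the template. One detail to keep in mind: S3 expects no explicit location constraint for us-east-1, while any other region has to be named.

```python
from typing import Any, Dict


def create_bucket_kwargs(bucket: str, region: str) -> Dict[str, Any]:
    """Build the keyword arguments for boto3's S3 `create_bucket` call."""
    kwargs: Dict[str, Any] = {"Bucket": bucket}
    # S3 rejects an explicit LocationConstraint for us-east-1;
    # every other region must be passed explicitly.
    if region != "us-east-1":
        kwargs["CreateBucketConfiguration"] = {"LocationConstraint": region}
    return kwargs


def create_bucket(bucket: str, region: str) -> None:
    """Create the bucket in `region` (needs boto3 and valid AWS credentials)."""
    import boto3  # imported lazily so the helper above works without boto3

    s3 = boto3.client("s3", region_name=region)
    s3.create_bucket(**create_bucket_kwargs(bucket, region))
```

Calling `create_bucket("sagemaker-emails-data", "eu-west-1")` would then create the bucket in the same region you plan to use for SageMaker; the bucket name here is just the example used later in this README.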
Software
- AWS CLI
    curl "https://awscli.amazonaws.com/AWSCLIV2.pkg" -o "AWSCLIV2.pkg"
    sudo installer -pkg ./AWSCLIV2.pkg -target /
    which aws
    aws --version
- direnv
    brew install direnv
- gettext
    brew install gettext
- docker
- docker compose
- pyenv
    brew install pyenv
    brew install pyenv-virtualenv
    pyenv install -v 3.11.3
    pyenv virtualenv 3.11.3 v3.11.3
    pyenv activate v3.11.3
- python
- Cookiecutter
    pip install cookiecutter
CI/CD
Create a project template using cookiecutter:
cookiecutter gh:VipulMascarenhas/sagemaker-template
The inputs for the template are described below:
| Input | Description |
|---|---|
| PROJECT_NAME | Name of the model project. |
| PY_VERSION | Which version of Python to use, e.g. '3.11.3'. |
| DIR_MODEL_LOCAL | Local directory for model artifact storage, e.g. './artifacts'. |
| DIR_TMP | Temporary files directory, e.g. '/tmp'. |
| AWS_ACCOUNT_ID | 12-digit AWS account ID. |
| AWS_DEFAULT_REGION | AWS default region. |
| ECR_REPO | Name of the AWS ECR repository where containers are published. |
| SAGEMAKER_ROLE | Name of the execution role to be assumed by SageMaker, e.g. SageMakerRole. |
| BUCKET_ARTIFACTS | Name of the S3 bucket for model artifact storage. NOTE: prefix with 'sagemaker' for SageMaker access, e.g. 'sagemaker-emails-data'. |

NOTE: do NOT wrap input values in quotes.
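The constraints stated for the inputs (a 12-digit account ID, a bucket name prefixed with 'sagemaker', no quoted values) are easy to get wrong at the prompt. The following is a minimal sketch of a pre-flight check, assuming you collect the values in a dictionary first; `check_inputs` is a hypothetical helper and not something the template provides, and it only checks the rules listed in the table.

```python
import re
from typing import Dict, List


def check_inputs(inputs: Dict[str, str]) -> List[str]:
    """Return a list of problems found in the template inputs (empty if none)."""
    problems = []
    for name, value in inputs.items():
        # The template asks for values without surrounding quotes.
        if value[:1] in ("'", '"') or value[-1:] in ("'", '"'):
            problems.append(f"{name}: remove the surrounding quotes")
    # AWS account IDs consist of exactly 12 digits.
    if not re.fullmatch(r"\d{12}", inputs.get("AWS_ACCOUNT_ID", "")):
        problems.append("AWS_ACCOUNT_ID: expected exactly 12 digits")
    # The README asks for a 'sagemaker' prefix on the artifact bucket.
    if not inputs.get("BUCKET_ARTIFACTS", "").startswith("sagemaker"):
        problems.append("BUCKET_ARTIFACTS: expected a name starting with 'sagemaker'")
    return problems
```

An empty list means the checked values look plausible; it does not confirm that the account, role or bucket actually exist.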
Initialize project by executing make init from the command line in the project directory. The init target makes the included shell scripts executable and provisions relevant AWS infrastructure.
Export project-specific environment variables automatically with direnv, i.e. by invoking direnv allow.
To help you navigate in this template, here is an overview of the folder structure:
./
├── .github/
│   └── workflows/              # Workflows for automation, CI/CD.
├── modelpkg/                   # Python package defining model logic.
│   ├── construct.py            # Code for constructing and training the model etc.
│   └── tests/                  # Unit tests for model code.
├── aws/                        # Shell scripts for integrating the project with SageMaker.
├── configs/                    # Configurations for SageMaker endpoints, training jobs, etc.
├── images/                     # Docker images for model training and model endpoint.
├── server/                     # Configuration for a default NGINX web server for the model endpoint.
├── .envrc                      # Project-specific environment variables.
├── Makefile                    # Command-line functions for project-specific tasks.
├── train.py                    # Script for training the model. Builds into training image.
├── app.py                      # Application code for the model endpoint. Builds into endpoint image.
├── requirements_modelpkg.txt   # Python packages required by the model.
└── requirements_dev.txt        # All other Python packages needed in development mode.
We can modify this template depending on our specific use-cases.
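To give a feel for what a script like train.py has to do when you adapt the template, here is a minimal sketch. It is not the template's actual train.py: `MeanModel`, `train` and the file name `model.json` are hypothetical placeholders. The one fixed point is the SageMaker convention that whatever a training container writes to /opt/ml/model is packaged and uploaded to S3 when the job finishes.

```python
import json
from pathlib import Path
from typing import List

# Directory whose contents SageMaker uploads to S3 after a training job.
SAGEMAKER_MODEL_DIR = Path("/opt/ml/model")


class MeanModel:
    """Hypothetical stand-in for the model defined in modelpkg."""

    def fit(self, values: List[float]) -> "MeanModel":
        # "Training" here is just computing the mean of the inputs.
        self.mean_ = sum(values) / len(values)
        return self


def train(values: List[float], model_dir: Path = SAGEMAKER_MODEL_DIR) -> Path:
    """Fit the placeholder model and write its parameters to `model_dir`."""
    model = MeanModel().fit(values)
    model_dir.mkdir(parents=True, exist_ok=True)
    path = model_dir / "model.json"
    path.write_text(json.dumps({"mean": model.mean_}))
    return path
```

Passing a local directory such as `Path("./artifacts")` as `model_dir` lets the same function run outside SageMaker, which mirrors the role of DIR_MODEL_LOCAL in the template inputs.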
All tasks related to interacting with the model project are implemented as command-line functions in ./Makefile, so each function is invoked with make [target], e.g. make build_training_image.
If you want to build, train and deploy a model on the fly, you can do so by invoking a sequence of make targets:
make init
make build_training_image
make push_training_image
make create_training_job
make build_endpoint_image
make push_endpoint_image
make create_endpoint
Typing make + space + tab + tab lists all available make targets.
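If you would rather run the whole sequence in one go, the targets listed above can be chained from a short script. This is a sketch under the assumption that each target exits with a non-zero status on failure; `run_make` and `run_pipeline` are hypothetical helpers, not part of the template. It does not wait for the SageMaker training job to finish unless the corresponding make target itself does.

```python
import subprocess
from typing import Callable, List, Sequence

# Targets in the order given in this README.
TARGETS = [
    "init",
    "build_training_image",
    "push_training_image",
    "create_training_job",
    "build_endpoint_image",
    "push_endpoint_image",
    "create_endpoint",
]


def run_make(target: str) -> int:
    """Run `make <target>` in the current directory and return its exit code."""
    return subprocess.run(["make", target]).returncode


def run_pipeline(
    targets: Sequence[str] = TARGETS,
    runner: Callable[[str], int] = run_make,
) -> List[str]:
    """Run the targets in order, stopping at the first failure.

    Returns the list of targets that completed successfully.
    """
    done: List[str] = []
    for target in targets:
        if runner(target) != 0:
            break
        done.append(target)
    return done
```

Calling `run_pipeline()` from the project directory runs everything; the `runner` argument exists so the ordering logic can be exercised without invoking make.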
This template ships with a number of automation (CI/CD) workflows implemented with GitHub Actions.
To enable the CI/CD workflows, upload your project to GitHub and connect the GitHub repository with your AWS account by providing your AWS credentials as GitHub Secrets. The secrets should be named:
- AWS_ACCESS_KEY_ID
- AWS_SECRET_ACCESS_KEY
By default, every commit to main triggers the workflow ./.github/workflows/deliver_images.yaml, which runs the unit tests and builds and pushes the training and endpoint images.
All workflows can be run manually.