brunopistone / multi-model-train-template

The purpose of this template is to deploy a SageMaker training pipeline for parallel training of multiple models, and a scheduled batch inference using SageMaker Batch Transform and SageMaker Pipelines, given two `ModelGroupPackageName` values from the Amazon SageMaker Model Registry.

Multi-model training and batch inference pipeline

Purpose

This template deploys a SageMaker training pipeline that trains multiple models in parallel, together with a scheduled batch inference job built on SageMaker Batch Transform and SageMaker Pipelines, given two `ModelGroupPackageName` values from the Amazon SageMaker Model Registry.

Architecture

Architecture.png

Prerequisites

Copy the sample dataset from the public S3 bucket sagemaker-sample-files to your default bucket, replacing DOC-EXAMPLE-BUCKET with your bucket name:

aws s3 cp s3://sagemaker-sample-files/datasets/tabular/tweets_dataset/TheSocialDilemma.csv s3://DOC-EXAMPLE-BUCKET/datasets/tabular/tweets_dataset/TheSocialDilemma.csv
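
For readers who prefer to script this step, the same copy can be sketched in Python with boto3. This is a minimal sketch, not part of the template: the helper names `split_s3_uri`, `build_copy_request`, and `copy_dataset` are hypothetical, and it assumes boto3 is installed and that your credentials can read the public bucket and write to the destination bucket.

```python
from urllib.parse import urlparse

# Source object taken from the CLI command above.
SOURCE_URI = (
    "s3://sagemaker-sample-files/datasets/tabular/"
    "tweets_dataset/TheSocialDilemma.csv"
)


def split_s3_uri(uri):
    """Split an s3://bucket/key URI into (bucket, key)."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"Not an S3 URI: {uri!r}")
    return parsed.netloc, parsed.path.lstrip("/")


def build_copy_request(source_uri, dest_bucket):
    """Build keyword arguments for the boto3 S3 client's copy() method.

    The object key is kept unchanged, mirroring the CLI command.
    """
    source_bucket, key = split_s3_uri(source_uri)
    return {
        "CopySource": {"Bucket": source_bucket, "Key": key},
        "Bucket": dest_bucket,
        "Key": key,
    }


def copy_dataset(dest_bucket, source_uri=SOURCE_URI):
    """Copy the sample dataset into dest_bucket (hypothetical helper)."""
    import boto3  # imported lazily so the helpers above work without AWS access

    s3 = boto3.client("s3")
    s3.copy(**build_copy_request(source_uri, dest_bucket))


if __name__ == "__main__":
    # DOC-EXAMPLE-BUCKET is a placeholder; replace it with your own bucket.
    print(build_copy_request(SOURCE_URI, "DOC-EXAMPLE-BUCKET"))
```

Calling `copy_dataset("DOC-EXAMPLE-BUCKET")` with a real bucket name performs the copy; running the file directly only prints the request that would be sent.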

Instructions

Part 1: Create initial Service Catalog Product

  1. To create the Service Catalog product for this project, download create-batch-inference-product.yaml and upload it in the CloudFormation console: https://console.aws.amazon.com/cloudformation

  2. Update the Parameters section:

    • Supply a unique name for the stack

    • Enter your Service Catalog portfolio ID, which can be found in the Outputs tab of your deployed portfolio stack or in the Service Catalog portfolio list: https://console.aws.amazon.com/servicecatalog/home?#/portfolios

    • Update the Product Information. The product name and description are visible inside of SageMaker Studio. Other fields are visible to users that consume this directly through Service Catalog.

    • Support information is not available inside of SageMaker Studio, but is available in the Service Catalog Dashboard.

    • Update the source code repository so that it points to the current repo.

  3. Choose Next, Next again, check the box acknowledging that the template will create IAM resources, and then choose Create Stack.

  4. Your template should now be visible inside of SageMaker Studio.
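
The console steps in Part 1 could also be scripted. The sketch below uses the boto3 CloudFormation client under stated assumptions: the helper names are hypothetical, and the parameter key `PortfolioId` is only an illustrative guess, so check the Parameters section of create-batch-inference-product.yaml for the real keys before using it.

```python
def build_stack_request(stack_name, template_body, parameters):
    """Assemble keyword arguments for CloudFormation's create_stack call.

    `parameters` is a plain dict; its keys must match the parameter names
    declared in the template.
    """
    return {
        "StackName": stack_name,
        "TemplateBody": template_body,
        "Parameters": [
            {"ParameterKey": key, "ParameterValue": str(value)}
            for key, value in parameters.items()
        ],
        # Equivalent of ticking the box that acknowledges IAM resource creation.
        "Capabilities": ["CAPABILITY_IAM", "CAPABILITY_NAMED_IAM"],
    }


def create_product_stack(stack_name, template_path, parameters):
    """Create the stack and wait until creation completes (hypothetical helper)."""
    import boto3  # imported lazily so build_stack_request works without AWS access

    with open(template_path, encoding="utf-8") as handle:
        template_body = handle.read()

    cloudformation = boto3.client("cloudformation")
    response = cloudformation.create_stack(
        **build_stack_request(stack_name, template_body, parameters)
    )
    cloudformation.get_waiter("stack_create_complete").wait(StackName=stack_name)
    return response["StackId"]


if __name__ == "__main__":
    # "PortfolioId" and "port-EXAMPLE" are placeholders, not names from the template.
    request = build_stack_request(
        "multi-model-train-product",
        "Resources: {}",
        {"PortfolioId": "port-EXAMPLE"},
    )
    print(request["Parameters"])
```

Keeping the request construction separate from the AWS call makes it easy to inspect the parameters before anything is deployed.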

Part 2: Deploy the Project inside of SageMaker Studio

  1. Open SageMaker Studio and sign in to your user profile.

  2. Choose the SageMaker components and registries icon on the left, and choose the Create project button.

  3. The default view displays SageMaker templates. Switch to the Organization templates tab to see custom project templates.

  4. The template you created is displayed in the template list. (If you do not see it yet, make sure that the correct execution role is added to the product and that the sagemaker:studio-visibility tag with a value of true is added to the Service Catalog product.)

  5. Choose the correct project template and click Select project template.

  6. Fill out the required fields for this project.

    • Name: A unique name for the project deployment.

  7. Choose Create Project.

  8. After a few minutes, your example project should be deployed.
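
As an alternative to the Studio UI, a project can be created from the Service Catalog product with the SageMaker `create_project` API. The following is a minimal sketch, assuming boto3 is installed; the helper names are hypothetical, and the product and provisioning artifact IDs shown are placeholders that you would look up in the Service Catalog console.

```python
def build_project_request(project_name, product_id, artifact_id, description=""):
    """Assemble keyword arguments for the SageMaker create_project call."""
    request = {
        "ProjectName": project_name,
        "ServiceCatalogProvisioningDetails": {
            "ProductId": product_id,
            "ProvisioningArtifactId": artifact_id,
        },
    }
    if description:
        request["ProjectDescription"] = description
    return request


def create_project(project_name, product_id, artifact_id, description=""):
    """Create the SageMaker project and return its ARN (hypothetical helper)."""
    import boto3  # imported lazily so build_project_request works without AWS access

    sagemaker = boto3.client("sagemaker")
    response = sagemaker.create_project(
        **build_project_request(project_name, product_id, artifact_id, description)
    )
    return response["ProjectArn"]


if __name__ == "__main__":
    # "prod-EXAMPLE" and "pa-EXAMPLE" are placeholders for your own IDs.
    print(build_project_request("multi-model-demo", "prod-EXAMPLE", "pa-EXAMPLE"))
```

The project name plays the same role as the Name field in step 6 and must be unique for the deployment.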
