SageMaker Experiments Python SDK
Experiment tracking in SageMaker Training Jobs, Processing Jobs, and Notebooks.
Overview
SageMaker Experiments is an AWS service for tracking machine learning Experiments. The SageMaker Experiments Python SDK is a high-level interface to this service that helps you track Experiment information using Python.
Experiment tracking powers the machine learning integrated development environment Amazon SageMaker Studio.
For a detailed API reference, see the documentation on Read the Docs.
Concepts
- Experiment: A collection of related Trials. Add Trials to an Experiment that you wish to compare together.
- Trial: A description of a multi-step machine learning workflow. Each step in the workflow is described by a Trial Component. Trial Components within a Trial have no defined relationship to one another, such as ordering.
- Trial Component: A description of a single step in a machine learning workflow. Examples include data cleaning, feature extraction, model training, and model evaluation.
- Tracker: A Python context manager for logging information about a single Trial Component.
For more information, see Amazon SageMaker Experiments - Organize, Track, and Compare Your Machine Learning Trainings.
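The relationship between these concepts can be pictured with a small, plain-Python model. This is only an illustrative sketch: the dataclasses below are hypothetical and are not the SDK's own classes, and the names used are made up for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class TrialComponent:
    """One step of a workflow, e.g. data cleaning or model training."""
    name: str
    parameters: Dict[str, float] = field(default_factory=dict)


@dataclass
class Trial:
    """A multi-step workflow; its components carry no implied ordering."""
    name: str
    components: List[TrialComponent] = field(default_factory=list)


@dataclass
class Experiment:
    """A collection of related trials meant to be compared together."""
    name: str
    trials: List[Trial] = field(default_factory=list)


# Hypothetical names, chosen only for illustration.
preprocess = TrialComponent("data-cleaning")
training = TrialComponent("model-training", {"mini_batch_size": 200})
trial = Trial("linear-learner", [preprocess, training])
experiment = Experiment("MNIST", [trial])

# Sort the names because components have no defined order within a trial.
component_names = sorted(c.name for t in experiment.trials for c in t.components)
print(component_names)  # ['data-cleaning', 'model-training']
```

In the real service these objects are created and stored through the SDK rather than held in memory.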
Using the SDK
You can use this SDK to:
- Manage Experiments, Trials, and Trial Components within Python scripts, programs, and notebooks.
- Add tracking information to a SageMaker notebook, allowing you to model your notebook in SageMaker Experiments as a multi-step ML workflow.
- Record experiment information from inside your running SageMaker Training and Processing Jobs.
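As a rough sketch of the last point, the helper below logs hyperparameters and metrics through any tracker-like object. It assumes the SDK's Tracker exposes log_parameters and log_metric, as described in its API reference; RecordingTracker and record_step are hypothetical names introduced here so the example runs without AWS credentials.

```python
class RecordingTracker:
    """Hypothetical stand-in that mimics two Tracker logging methods."""

    def __init__(self):
        self.parameters = {}
        self.metrics = []

    def log_parameters(self, parameters):
        self.parameters.update(parameters)

    def log_metric(self, metric_name, value):
        self.metrics.append((metric_name, value))


def record_step(tracker, parameters, metrics):
    """Log hyperparameters once, then each metric value in order."""
    tracker.log_parameters(parameters)
    for metric_name, value in metrics:
        tracker.log_metric(metric_name, value)


# Inside a job, the tracker would instead come from the SDK, for example:
#     from smexperiments.tracker import Tracker
#     with Tracker.load() as tracker:
#         record_step(tracker, ...)
demo = RecordingTracker()
record_step(demo, {"mini_batch_size": 200}, [("loss", 0.42), ("loss", 0.31)])
print(demo.parameters, demo.metrics)
```

Keeping the logging calls behind a small function like this makes the training script testable locally, since the stand-in can be swapped for the real Tracker at run time.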
Installation
pip install sagemaker-experiments
Examples
import boto3
import os, pickle, gzip, numpy, urllib.request, json
import io
import numpy as np
import sagemaker.amazon.common as smac
import sagemaker
from sagemaker import get_execution_role
from sagemaker import analytics
from smexperiments import experiment
# Specify training container
from sagemaker.amazon.amazon_estimator import get_image_uri
container = get_image_uri(boto3.Session().region_name, 'linear-learner')
# Load the dataset
urllib.request.urlretrieve("http://deeplearning.net/data/mnist/mnist.pkl.gz", "mnist.pkl.gz")
with gzip.open('mnist.pkl.gz', 'rb') as f:
    train_set, valid_set, test_set = pickle.load(f, encoding='latin1')
vectors = np.array([t.tolist() for t in train_set[0]]).astype('float32')
labels = np.where(np.array([t.tolist() for t in train_set[1]]) == 0, 1, 0).astype('float32')
buf = io.BytesIO()
smac.write_numpy_to_dense_tensor(buf, vectors, labels)
buf.seek(0)
key = 'recordio-pb-data'
bucket = '{YOUR-BUCKET}'
prefix = 'sagemaker/DEMO-linear-mnist'
boto3.resource('s3').Bucket(bucket).Object(os.path.join(prefix, 'train', key)).upload_fileobj(buf)
s3_train_data = 's3://{}/{}/train/{}'.format(bucket, prefix, key)
output_location = 's3://{}/{}/output'.format(bucket, prefix)
my_experiment = experiment.Experiment.create(experiment_name='MNIST')
my_trial = my_experiment.create_trial(trial_name='linear-learner')
role = get_execution_role()
sess = sagemaker.Session()
linear = sagemaker.estimator.Estimator(container,
                                       role,
                                       train_instance_count=1,
                                       train_instance_type='ml.c4.xlarge',
                                       output_path=output_location,
                                       sagemaker_session=sess)
linear.set_hyperparameters(feature_dim=784,
                           predictor_type='binary_classifier',
                           mini_batch_size=200)
linear.fit(inputs={'train': s3_train_data}, experiment_config={
    "ExperimentName": my_experiment.experiment_name,
    "TrialName": my_trial.trial_name,
    "TrialComponentDisplayName": "MNIST-linear-learner",
})
trial_component_analytics = analytics.ExperimentAnalytics(experiment_name=my_experiment.experiment_name)
analytic_table = trial_component_analytics.dataframe()
analytic_table
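The experiment_config dictionary passed to fit in the example above uses three keys. A minimal sketch of a helper that assembles it is shown below; build_experiment_config is a hypothetical name and not part of the SDK, and the check that both names are non-empty is a choice made for this sketch rather than a documented service rule.

```python
def build_experiment_config(experiment_name, trial_name, display_name=None):
    """Return a dictionary suitable for the experiment_config argument."""
    # Validation here is this sketch's own assumption, not an SDK requirement.
    if not experiment_name or not trial_name:
        raise ValueError("experiment_name and trial_name are required")
    config = {"ExperimentName": experiment_name, "TrialName": trial_name}
    if display_name:
        config["TrialComponentDisplayName"] = display_name
    return config


config = build_experiment_config("MNIST", "linear-learner", "MNIST-linear-learner")
print(config)
```

The result could then be passed as linear.fit(inputs=..., experiment_config=config).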
For more examples, see sagemaker-experiments in the AWS Labs Amazon SageMaker Examples repository.
License
This library is licensed under the Apache 2.0 License.
Running Tests
Unit Tests
tox tests/unit
Integration Tests
To run the integration tests, the following prerequisites must be met:
- AWS account credentials are available in the environment for the boto3 client to use.
- The AWS account has an IAM role with SageMaker permissions.
tox tests/integ
- To test against a different region, pass the --region option:
tox -e py37 -- --region cn-north-1
Generate Docs
tox -e docs