rtlemos / AWS-SageMaker-R-Workshop

Samples for Using R in Amazon SageMaker

Using R in SageMaker


Disclaimer:

  • The content provided in this repository is for demonstration purposes and not meant for production. You should use your own discretion when using the content.
  • The ideas and opinions outlined in these examples are my own and do not represent the opinions of AWS.

This GitHub repository provides examples of coding in R in the SageMaker environment. The examples include the following:

  1. Running RStudio on EC2 Instance. This example explains how to launch a CloudFormation stack that provisions an EC2 instance with all the resources needed to run RStudio.

  2. Using R Kernel in SageMaker Notebook Instances: Basic Hello World Example. A simple example of writing an R script in SageMaker: downloading data, processing and visualizing it, and then storing it in S3.

  3. Using R Kernel in SageMaker Notebook Instance: End-2-End Example. This sample notebook describes how to train and deploy a machine learning (ML) model, and retrieve predictions from it, using Amazon SageMaker and R. The model predicts abalone age as measured by the number of rings in the shell. The reticulate package serves as the R interface to the Amazon SageMaker Python SDK, which makes the API calls to Amazon SageMaker. The reticulate package translates between R and Python objects, and Amazon SageMaker provides a serverless data science environment to train and deploy ML models at scale.

  4. SageMaker Batch Transform using R Kernel. This sample notebook describes how to run a batch transform with the SageMaker Transformer in R. The notebook uses the abalone dataset and the XGBoost regressor algorithm.

  5. Bring Your Own R Algorithm to SageMaker. This notebook focuses mainly on integrating hyperparameter tuning with a custom algorithm container, then hosting the tuned model and making inferences through the endpoint.

  6. Hyperparameter Optimization for XGBoost in R and Batch Transform. This sample notebook describes how to run hyperparameter tuning and batch transform to predict abalone age as measured by the number of rings in the shell. The notebook uses the public abalone dataset hosted by the UCI Machine Learning Repository.

  7. Using Spark EMR Clusters in SageMaker with R Kernel. This example demonstrates how a SageMaker notebook with the R kernel can connect to an EMR cluster through the sparklyr package to run Spark jobs, including data processing, SQL queries, machine learning, and reading and writing data in different formats. The example uses the iris, abalone, and mtcars public datasets.

  8. Creating a Persistent Custom R Environment for SageMaker. This example walks you through creating a custom R environment with user-specific packages for Amazon SageMaker, making the environment persistent between sessions, and using it when creating new SageMaker instances.
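
As a rough illustration of the pattern behind examples 3, 4, and 6, the sketch below wraps SageMaker Python SDK calls in R functions through reticulate. It is not taken from the notebooks. It assumes version 2 of the SDK (argument names such as `instance_count` differ in version 1), and the helper `s3_uri`, the function names, the key prefix, the instance type, the container version, and the hyperparameter values are all placeholders chosen for illustration.

```r
# Minimal sketch, assuming SageMaker Python SDK v2 is installed in the
# Python environment that reticulate uses. Nothing contacts AWS until the
# functions are called from a session that has AWS credentials.

# Hypothetical helper: join a bucket and key parts into an S3 URI.
s3_uri <- function(bucket, ...) {
  paste0("s3://", paste(c(bucket, ...), collapse = "/"))
}

# Train the built-in XGBoost algorithm on CSV data under <bucket>/<prefix>/train.
train_xgboost <- function(bucket, prefix = "abalone") {
  sagemaker <- reticulate::import("sagemaker")
  session <- sagemaker$Session()
  role <- sagemaker$get_execution_role()

  # Look up the built-in XGBoost container image for the session's region.
  image <- sagemaker$image_uris$retrieve(
    framework = "xgboost",
    region = session$boto_region_name,
    version = "1.0-1"                 # placeholder container version
  )

  estimator <- sagemaker$estimator$Estimator(
    image_uri = image,
    role = role,
    instance_count = 1L,              # the L suffix passes a Python int
    instance_type = "ml.m5.large",    # placeholder instance type
    output_path = s3_uri(bucket, prefix, "output"),
    sagemaker_session = session
  )
  estimator$set_hyperparameters(
    objective = "reg:squarederror",   # placeholder hyperparameters
    num_round = 100L
  )

  train_input <- sagemaker$inputs$TrainingInput(
    s3_data = s3_uri(bucket, prefix, "train"),
    content_type = "csv"
  )
  # A named R list is converted to a Python dict of input channels.
  estimator$fit(inputs = list(train = train_input))
  estimator
}

# Run a batch transform job with a trained estimator.
batch_predict <- function(estimator, input_uri, output_uri) {
  transformer <- estimator$transformer(
    instance_count = 1L,
    instance_type = "ml.m5.large",    # placeholder instance type
    output_path = output_uri
  )
  transformer$transform(
    data = input_uri,
    content_type = "text/csv",
    split_type = "Line"
  )
  transformer
}
```

From a notebook with an execution role, a call such as `est <- train_xgboost("my-bucket")` followed by `batch_predict(est, s3_uri("my-bucket", "abalone", "test"), s3_uri("my-bucket", "abalone", "predictions"))` would start the two jobs; `my-bucket` is a made-up name.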

These examples use two libraries that provide R interfaces to Amazon SageMaker and other AWS services:

  • reticulate library: provides an R interface to the Amazon SageMaker Python SDK, which makes the API calls to Amazon SageMaker. The reticulate package translates between R and Python objects, and Amazon SageMaker provides a serverless data science environment to train and deploy ML models at scale.
  • paws library: provides an interface for making API calls to AWS services, similar to how boto3 works. boto3 is the Amazon Web Services (AWS) SDK for Python. It enables Python developers to create, configure, and manage AWS services, such as EC2 and S3. boto3 provides an easy-to-use, object-oriented API, as well as low-level access to AWS services. paws provides the same capabilities in R.
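
To make the comparison concrete, here is a minimal sketch that lists S3 bucket names twice: once with paws, and once with boto3 called through reticulate. The function names and the `bucket_names` helper are hypothetical, and both AWS calls assume credentials are already configured (for example, through a notebook's execution role).

```r
# Hypothetical helper: pull the Name field out of a list of bucket records.
bucket_names <- function(buckets) {
  vapply(buckets, function(b) b$Name, character(1))
}

# paws: a native R client, no Python required.
list_buckets_paws <- function() {
  s3 <- paws::s3()
  resp <- s3$list_buckets()
  bucket_names(resp$Buckets)
}

# boto3 through reticulate: the same API call made via Python.
# reticulate converts the returned Python dict into a named R list.
list_buckets_boto3 <- function() {
  boto3 <- reticulate::import("boto3")
  resp <- boto3$client("s3")$list_buckets()
  bucket_names(resp$Buckets)
}
```

Both functions read the same `Buckets` field from the response, which is why code written against boto3 tends to carry over to paws with little change.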

More Useful Resources:

NOTE: The author of this GitHub repository does not endorse or approve the content provided in these resources. The links are provided here for educational purposes, and the reader is encouraged to review and check their contents.


Languages

Jupyter Notebook 94.7%, R 3.3%, Shell 1.8%, Dockerfile 0.2%