sfc-gh-twhite / Intro_SnowML

An introduction to using Snowflake-ML for Machine Learning

Intro_SnowML

Create a .env file with the following contents, filling in your account URL, username, and password:

SNOWFLAKE_ACCOUNT =
SNOWFLAKE_USER =
SNOWFLAKE_PASSWORD =
SNOWFLAKE_ROLE = sysadmin
SNOWFLAKE_WAREHOUSE = compute_wh
SNOWFLAKE_DATABASE = snowpark
SNOWFLAKE_SCHEMA = titanic
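
As a rough sketch of how these values might be turned into a Snowpark session, the snippet below parses the .env text and maps each variable to a Snowpark connection setting. The helper names (`parse_env`, `connection_parameters`, `create_session`) are hypothetical and not taken from the notebooks, and the parser is a small stand-in; a library such as python-dotenv would also work. It assumes snowflake-snowpark-python is installed when `create_session` is called.

```python
from pathlib import Path

# Maps the .env variable names above to Snowpark connection settings.
ENV_TO_PARAM = {
    "SNOWFLAKE_ACCOUNT": "account",
    "SNOWFLAKE_USER": "user",
    "SNOWFLAKE_PASSWORD": "password",
    "SNOWFLAKE_ROLE": "role",
    "SNOWFLAKE_WAREHOUSE": "warehouse",
    "SNOWFLAKE_DATABASE": "database",
    "SNOWFLAKE_SCHEMA": "schema",
}


def parse_env(text):
    """Parse KEY = value lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def connection_parameters(env):
    """Build the Snowpark parameter dict, failing early on empty values."""
    missing = [key for key in ENV_TO_PARAM if not env.get(key)]
    if missing:
        raise ValueError("Missing .env values: " + ", ".join(missing))
    return {param: env[key] for key, param in ENV_TO_PARAM.items()}


def create_session(env_path=".env"):
    """Open a Snowpark session from the .env file (hypothetical helper)."""
    # Imported here so the parsing helpers work without Snowpark installed.
    from snowflake.snowpark import Session

    env = parse_env(Path(env_path).read_text())
    # Note: Snowpark's "account" setting normally takes the account
    # identifier rather than a full https:// URL.
    return Session.builder.configs(connection_parameters(env)).create()
```

Calling `create_session()` from the directory that holds the .env file should return a session that the two notebooks' steps can reuse.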

Run the load_data notebook, which performs the following tasks:

  • Load the Titanic dataset from Seaborn, uppercase the column names, and convert it to CSV
  • Put the CSV file into a Snowflake Internal Stage
  • Create a Snowpark DataFrame from the CSV in the stage
  • Write the Snowpark DataFrame to Snowflake as a table
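
The steps above can be sketched roughly as follows. This is not the notebook's code: the stage name, table name, and function names are placeholders chosen for illustration, and reading the staged file with `PARSE_HEADER` and `INFER_SCHEMA` assumes a Snowpark version that supports those CSV options. `session` is an existing Snowpark session.

```python
from pathlib import Path

STAGE_NAME = "TITANIC_STAGE"  # hypothetical stage name
TABLE_NAME = "TITANIC"        # hypothetical table name
CSV_NAME = "titanic.csv"


def uppercase_columns(columns):
    """Return the column names in upper case, so Snowflake treats them
    as ordinary unquoted identifiers."""
    return [str(name).upper() for name in columns]


def export_titanic_csv(out_dir="."):
    """Load the Seaborn Titanic dataset and write it out as a CSV file."""
    # Imported lazily; load_dataset downloads the data on first use.
    import seaborn as sns

    df = sns.load_dataset("titanic")
    df.columns = uppercase_columns(df.columns)
    path = Path(out_dir) / CSV_NAME
    df.to_csv(path, index=False)
    return path


def load_csv_to_table(session, csv_path):
    """Stage the CSV, read it as a Snowpark DataFrame, and save a table."""
    session.sql(f"CREATE STAGE IF NOT EXISTS {STAGE_NAME}").collect()
    # Upload uncompressed so the staged file keeps its .csv name.
    session.file.put(
        str(csv_path), f"@{STAGE_NAME}", auto_compress=False, overwrite=True
    )
    staged_df = (
        session.read
        .option("PARSE_HEADER", True)   # first row holds the column names
        .option("INFER_SCHEMA", True)   # let Snowflake infer column types
        .csv(f"@{STAGE_NAME}/{Path(csv_path).name}")
    )
    staged_df.write.mode("overwrite").save_as_table(TABLE_NAME)
    return session.table(TABLE_NAME)
```

A typical call would be `load_csv_to_table(session, export_titanic_csv())`.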

Run the snowml notebook, which performs the following tasks:

  • Create a Snowpark DataFrame from the Titanic table
  • Check for null values
  • Drop columns with a high count of nulls
  • Convert the Fare data type
  • Impute categorical columns that contain nulls
  • One-hot encode categorical values
  • Split into train and test sets
  • Train an XGBoost classifier model
  • Perform predictions on the test set
  • Return accuracy, precision, and recall
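
For orientation, here is one way the steps above could look with the snowflake-ml-python modeling classes. Treat it as a sketch under assumptions rather than the notebook's actual code: the feature columns, the 15% null threshold, the 80/20 split, the `PREDICTION` column name, and the helper names are all choices made for this example. It assumes the Titanic table is named `TITANIC` and that `session` is an existing Snowpark session.

```python
LABEL = "SURVIVED"
CATEGORICAL = ["SEX", "EMBARKED"]               # assumed feature choice
NUMERIC = ["PCLASS", "SIBSP", "PARCH", "FARE"]  # assumed feature choice


def high_null_columns(null_counts, n_rows, threshold=0.15):
    """Return columns whose share of nulls exceeds the (assumed) threshold."""
    return [c for c, n in null_counts.items() if n_rows and n / n_rows > threshold]


def count_nulls(df):
    """Count nulls per column of a Snowpark DataFrame in a single query."""
    from snowflake.snowpark import functions as F

    exprs = [
        F.sum(F.iff(F.col(c).is_null(), F.lit(1), F.lit(0))).alias(c)
        for c in df.columns
    ]
    row = df.select(exprs).collect()[0]
    return {name: int(value or 0) for name, value in row.as_dict().items()}


def train_and_evaluate(session, table_name="TITANIC"):
    from snowflake.snowpark import functions as F
    from snowflake.snowpark.types import DoubleType
    from snowflake.ml.modeling.impute import SimpleImputer
    from snowflake.ml.modeling.preprocessing import OneHotEncoder
    from snowflake.ml.modeling.xgboost import XGBClassifier
    from snowflake.ml.modeling.metrics import (
        accuracy_score, precision_score, recall_score,
    )

    df = session.table(table_name)

    # Check nulls, then drop the columns that are mostly empty.
    to_drop = high_null_columns(count_nulls(df), df.count())
    if to_drop:
        df = df.drop(to_drop)

    # Keep the label plus the assumed features, and cast FARE to a double.
    df = df.select(LABEL, *CATEGORICAL, *NUMERIC)
    df = df.with_column("FARE", F.col("FARE").cast(DoubleType()))

    # Fill missing categorical values with the most frequent value.
    imputer = SimpleImputer(
        input_cols=CATEGORICAL, output_cols=CATEGORICAL, strategy="most_frequent"
    )
    df = imputer.fit(df).transform(df)

    # One-hot encode; the generated column names depend on the category values.
    encoder = OneHotEncoder(
        input_cols=CATEGORICAL, output_cols=CATEGORICAL, drop_input_cols=True
    )
    df = encoder.fit(df).transform(df)

    train_df, test_df = df.random_split([0.8, 0.2], seed=42)

    feature_cols = [c for c in train_df.columns if c != LABEL]
    model = XGBClassifier(
        input_cols=feature_cols, label_cols=[LABEL], output_cols=["PREDICTION"]
    )
    model.fit(train_df)
    predictions = model.predict(test_df)

    scores = {}
    for name, metric in [("accuracy", accuracy_score),
                         ("precision", precision_score),
                         ("recall", recall_score)]:
        scores[name] = metric(
            df=predictions, y_true_col_names=LABEL, y_pred_col_names="PREDICTION"
        )
    return scores
```

Because the imputer, encoder, classifier, and metrics all operate on Snowpark DataFrames, the data stays in Snowflake throughout; only the final scores come back to the notebook.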

Languages

Language:Jupyter Notebook 100.0%