Azure Databricks Hands-on (Tutorials)
Follow the instructions in each notebook below.
- Storage Settings
- Basics of PySpark, Spark Dataframe, and Spark Machine Learning
- Spark Machine Learning Pipeline
- Hyper-parameter Tuning
- MLeap (requires ML runtime)
- Horovod Runner with TensorFlow (requires ML runtime)
- Structured Streaming (Basic)
- Structured Streaming with Azure EventHub or Kafka
- Delta Lake
- MLflow (requires ML runtime)
- Orchestration with Azure Data Services
- Delta Live Tables
How to start
- Create an Azure Databricks resource in Microsoft Azure. After the resource is created, launch the Databricks workspace UI by clicking "Launch Workspace".
- Create a compute (cluster) in the Databricks UI. (Select the "Compute" menu and proceed.) Databricks Runtime 10.2 ML or above is recommended for this tutorial.
- Download HandsOn.dbc and import it into your workspace as follows:
  - Select "Workspace" in the workspace UI.
  - Go to your user folder, click your e-mail address (the arrow icon), and then select the "Import" command.
  - Pick HandsOn.dbc to import.
- Open each notebook and attach the compute (your cluster) created above. (Select the compute at the top of each notebook.)
- Run the "Exercise 01 : Storage Settings (Prepare)" notebook first, before running any other notebook.
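If you prefer the command line over the workspace UI, the import step can also be sketched with the legacy Databricks CLI. This is a minimal sketch, assuming you have installed the CLI and generated a personal access token; the target user folder path is an example placeholder, not a value from this tutorial:

```shell
# Install and authenticate the (legacy) Databricks CLI first:
#   pip install databricks-cli
#   databricks configure --token   # prompts for your workspace URL and personal access token

# Import the downloaded archive into your user folder as a DBC archive.
# Replace the e-mail address with your own workspace user name (example value).
databricks workspace import \
  --format DBC \
  ./HandsOn.dbc \
  /Users/your-name@example.com/HandsOn
```

After the import completes, the notebooks appear under the target folder in the "Workspace" view, just as with the UI-based import.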
Note : You cannot use an Azure trial (free) subscription because of its limited quota. Please upgrade to pay-as-you-go if you are on an Azure free subscription. (Your remaining credit is preserved even after you transition to pay-as-you-go.)
Tsuyoshi Matsuzaki @ Microsoft