Prepare a data pipeline based on the aforesaid diagram. Complete the following steps:
1. Extract the data from the Oracle RDS instance.
2. Upload the data to your S3 bucket using Python (or Postman).
3. Import the data into Snowflake from the S3 bucket.
4. Write the queries from the Oracle/SQL assignments.
5. Export the output of the queries (whichever queries generate output) from Snowflake into S3.
6. Create a basic data visualization (graphs) using the data available on S3 from step 5.
7. Publish the whole code to GitHub.
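
The first two steps (extract from Oracle, upload to S3) could be sketched roughly as below. This is only an illustration under assumptions, not the required solution: it assumes the `oracledb` and `boto3` packages are installed, and the connection details, the `employees` table, the bucket name and the object key are all hypothetical placeholders. The helpers take an already-open DB-API connection and an S3 client as arguments, so they can be tried against any database before pointing them at the RDS instance.

```python
import csv
import io


def fetch_rows(connection, query):
    """Run a query on any DB-API connection and return (column_names, rows)."""
    cursor = connection.cursor()
    try:
        cursor.execute(query)
        columns = [col[0] for col in cursor.description]
        rows = cursor.fetchall()
    finally:
        cursor.close()
    return columns, rows


def rows_to_csv_bytes(columns, rows):
    """Serialize a header line plus data rows to UTF-8 encoded CSV bytes."""
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer)
    writer.writerow(columns)
    writer.writerows(rows)
    return buffer.getvalue().encode("utf-8")


def upload_csv(s3_client, bucket, key, data):
    """Upload bytes to the given bucket/key with a boto3 S3 client."""
    s3_client.put_object(Bucket=bucket, Key=key, Body=data)
    return f"s3://{bucket}/{key}"


def main():
    # Third-party imports live here so the helpers above work without them.
    import boto3
    import oracledb

    # All values below are placeholders (assumptions), not real settings.
    conn = oracledb.connect(user="USER", password="PASSWORD",
                            dsn="HOST:1521/SERVICE")
    try:
        columns, rows = fetch_rows(conn, "SELECT * FROM employees")
    finally:
        conn.close()
    payload = rows_to_csv_bytes(columns, rows)
    print(upload_csv(boto3.client("s3"), "my-bucket",
                     "raw/employees.csv", payload))
```

Calling `main()` runs the two steps end to end once the placeholders are replaced. Building the CSV in memory keeps the sketch short; for large tables, writing to a temporary file and uploading that file would likely be the safer choice.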
View the steps in the Databricks notebook: https://databricks-prod-cloudfront.cloud.databricks.com/public/4027ec902e239c93eaaa8714f173bcfc/1454017710891412/1008713947465588/8419640203592435/latest.html
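
For the Snowflake import from S3 and the export of query output back to S3, one possible approach is a pair of Snowflake `COPY INTO` statements run from Python. The sketch below assumes an external stage pointing at the S3 bucket already exists; the stage name `my_s3_stage`, the table names, the file paths and the connection parameters are hypothetical, and the `snowflake-connector-python` package is assumed to be installed. The helpers interpolate identifiers directly into SQL, so they should only be given trusted values.

```python
def copy_into_table_sql(table, stage, path):
    """Build a COPY INTO <table> statement that loads a CSV file from a stage."""
    return (
        f"COPY INTO {table} "
        f"FROM @{stage}/{path} "
        "FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1 "
        "FIELD_OPTIONALLY_ENCLOSED_BY = '\"')"
    )


def unload_query_sql(query, stage, path):
    """Build a COPY INTO <location> statement that writes a query result to a stage."""
    return (
        f"COPY INTO @{stage}/{path} "
        f"FROM ({query}) "
        "FILE_FORMAT = (TYPE = CSV COMPRESSION = NONE) "
        "HEADER = TRUE OVERWRITE = TRUE SINGLE = TRUE"
    )


def run_statements(connection, statements):
    """Execute each statement in order on a DB-API style connection."""
    cursor = connection.cursor()
    try:
        for statement in statements:
            cursor.execute(statement)
    finally:
        cursor.close()


def main():
    # Third-party import lives here so the helpers above work without it.
    import snowflake.connector

    # All values below are placeholders (assumptions), not real settings.
    conn = snowflake.connector.connect(
        user="USER", password="PASSWORD", account="ACCOUNT",
        warehouse="WAREHOUSE", database="DATABASE", schema="SCHEMA",
    )
    try:
        run_statements(conn, [
            # Load the uploaded CSV into an existing table.
            copy_into_table_sql("EMPLOYEES", "my_s3_stage",
                                "raw/employees.csv"),
            # Write one query's output back to the bucket.
            unload_query_sql(
                "SELECT department_id, COUNT(*) AS headcount "
                "FROM EMPLOYEES GROUP BY department_id",
                "my_s3_stage", "results/headcount.csv",
            ),
        ])
    finally:
        conn.close()
```

The target table must exist before the load runs. `SINGLE = TRUE` asks Snowflake to write one output file, which is convenient for a later plotting step but may not suit large result sets.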