shubhammirajkar

Shubham Mirajkar's repositories

tokyo_olympics_de_project

Explore the Tokyo Olympics data journey! We ingested a GitHub CSV into Azure via Data Factory, stored it in Data Lake Storage Gen2, performed transformations in Databricks, conducted advanced analytics in Azure Synapse, and visualized insights in Synapse or Power BI.

Language: Jupyter Notebook

uber_etl_data_engineering_project

An ETL pipeline built on GCP and orchestrated by Mage. It extracts data from a GCS bucket, builds a dimensional model, loads the data into BigQuery, and feeds a Looker dashboard for further analysis.

Language: Jupyter Notebook
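
As a rough illustration of the "dimensional model" step mentioned above, the sketch below splits flat records into a dimension lookup and a fact table with surrogate keys. It is not taken from the repository: the column names (`payment_type`, `fare`) and the sample rows are hypothetical, and a real pipeline would more likely do this with pandas or SQL inside Mage blocks.

```python
def build_dimension(rows, column):
    """Map each distinct value of `column` to a surrogate key (first-seen order)."""
    dim = {}
    for row in rows:
        # setdefault only assigns a new key the first time a value appears
        dim.setdefault(row[column], len(dim) + 1)
    return dim


def build_fact(rows, dims):
    """Replace each dimension column with a `<column>_id` foreign key."""
    fact = []
    for row in rows:
        # keep measures and any other non-dimension columns unchanged
        fact_row = {k: v for k, v in row.items() if k not in dims}
        for column, dim in dims.items():
            fact_row[f"{column}_id"] = dim[row[column]]
        fact.append(fact_row)
    return fact


# Hypothetical sample rows, for illustration only
trips = [
    {"payment_type": "card", "fare": 10.0},
    {"payment_type": "cash", "fare": 5.0},
    {"payment_type": "card", "fare": 7.5},
]
dims = {"payment_type": build_dimension(trips, "payment_type")}
fact = build_fact(trips, dims)
```

The fact rows keep only measures and keys, so each dimension can be loaded as its own table and joined back on `<column>_id`.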

banks_webscraping_etl_project

Python script for ETL operations on the world's largest banks' data, utilizing web scraping to extract information from a Wikipedia page, performing data transformations, and storing results in CSV and SQLite.

Language: Python
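
The transform-and-load half of a script like this could look roughly as follows. This is a minimal sketch using only the standard library, not the repository's actual code: the scraping step is omitted, and the exchange rates, column names, and table name (`banks`) are placeholder assumptions.

```python
import csv
import sqlite3

# Hypothetical exchange rates, placeholders only (not real market data)
RATES = {"EUR": 0.5, "GBP": 0.25}


def transform(rows, rates=RATES):
    """Turn (name, market_cap_usd) pairs into records with converted columns."""
    records = []
    for name, usd in rows:
        record = {"name": name, "mc_usd": usd}
        for currency, rate in rates.items():
            record[f"mc_{currency.lower()}"] = round(usd * rate, 2)
        records.append(record)
    return records


def load_csv(records, path):
    """Write the records to a CSV file with a header row."""
    with open(path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=list(records[0].keys()))
        writer.writeheader()
        writer.writerows(records)


def load_sqlite(records, conn, table="banks"):
    """Recreate `table` and insert the records using named placeholders."""
    cols = list(records[0].keys())
    col_defs = ", ".join(
        f"{c} TEXT" if c == "name" else f"{c} REAL" for c in cols
    )
    conn.execute(f"DROP TABLE IF EXISTS {table}")
    conn.execute(f"CREATE TABLE {table} ({col_defs})")
    placeholders = ", ".join(f":{c}" for c in cols)
    conn.executemany(f"INSERT INTO {table} VALUES ({placeholders})", records)
    conn.commit()
```

In use, the extracted rows would be passed through `transform`, then handed to both `load_csv` and `load_sqlite` (for example with `sqlite3.connect("banks.db")`), so the CSV and the database hold the same records.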

etl_using_spark

Language: Jupyter Notebook

machine_learning__practice_repo

Language: Jupyter Notebook

PySpark-Practice-Projects

PySpark Practice Projects

Language: Jupyter Notebook

sales-outlet-etl-pipeline

An end-to-end ETL pipeline that extracts data from an Azure SQL Server database, transforms the data using Databricks, and loads the transformed dataset into Azure Data Lake Storage (ADLS).

Language: Jupyter Notebook

superstore_azure_de_project

Copies data from an Amazon S3 bucket to an Azure Blob container using an Azure Data Factory pipeline. The data is then mounted in Databricks and analyzed further with Spark SQL.

Language: Python