Donatus Enebuse's repositories
Water-Quality-DW-on-SQL-Server
This is an MSSQL Data Warehouse and manual ETL demo on a specially formatted Water Quality dataset from DEFRA, UK. It is a personal, academic-grade exercise exploring the basic concepts of data warehousing and the manual ETL process.
Amazon-Product-Sales
This is an Exploratory Data Analysis of the Amazon Product Sales dataset from Kaggle.
Assignment-on-Data-Scraping
Analyzing Historical Stock/Revenue Data and Building a Dashboard for my IBM Data Analyst Certificate Programme
Automobile-Data
This is an Exploratory Data Analysis of an Automobile dataset from Kaggle.
Cities-Weather-S3-Snowflake-Slack-notif-ETL-by-Airflow-on-EC2
This is my second AWS Cloud ETL project. This data pipeline orchestration uses Apache Airflow on AWS EC2. It demonstrates how to build an ETL data pipeline that extracts data (JSON) from the OpenWeatherMap API, transforms it, dumps it as CSV in an S3 bucket, then copies it to destination tables in a Snowflake DW and sends a Slack notification.
City-Weather-and-S3File-RDS-S3-BigQuery-ETL-by-Airflow-on-EC2
This is my third AWS Cloud ETL project. This data pipeline orchestration uses Apache Airflow on AWS EC2. It demonstrates how to build an ETL data pipeline that extracts data into a database in parallel with a separate loading process into the same database, joins the resulting tables, copies the joined data to S3, and finally copies the S3 file to a BigQuery DW.
Countries-Population
This is an Exploratory Data Analysis of a Countries dataset from Kaggle.
Customer-Churn-Data-Analytics-ETL-Pipeline-by-Airflow-on-EC2
This is an end-to-end AWS Cloud ETL project. This orchestration uses Apache Airflow on AWS EC2 as well as AWS Glue. It demonstrates how to build an ETL pipeline that transforms data using a Glue job/crawler and loads it into a Redshift table. It also shows how to connect Amazon Athena to the Glue Data Catalog, and Power BI to Redshift.
Cyclistic-Ride-Sharing-Company
This is my Google Data Analytics Certificate case study for the Cyclistic ride-sharing company.
Foresight-Institution
This is a Data Analysis case study done on the Foresight Institution dataset.
Foresight-Pharmaceutical
This is a Data Analysis case study done on the Foresight Pharmaceutical Company dataset.
Istanbul-Shopping
This is an Exploratory Data Analysis of the Istanbul Shopping dataset from Kaggle.
Lagos-Weather-S3-Snowflake-Email-notif-ETL-by-Airflow-on-EC2
This is my first-ever AWS Cloud ETL project. This data pipeline orchestration uses Apache Airflow on AWS EC2. It demonstrates how to build an ETL data pipeline that extracts data (JSON) from the OpenWeatherMap API, transforms it, dumps it as CSV in an S3 bucket, then copies it to a destination table in a Snowflake DW and sends an email notification.
NoSQL-and-Big-Data-demonstration
This is a fun assignment I undertook to explore the world of NoSQL and Big Data technologies.
Redfin-Analytics-ETL-Data-Engg-Pipeline-by-Airflow-on-EC2
This is an end-to-end AWS Cloud ETL project. This data pipeline orchestration uses Apache Airflow on AWS EC2 as well as Snowpipe. It demonstrates how to build an ETL data pipeline that transforms data using Python on Apache Airflow and ingests it automatically into a Snowflake data warehouse via Snowpipe. It also features Power BI.
Redfin-Analytics-ETL-using-Amazon-EMR-by-Airflow-on-EC2
This is an end-to-end AWS Cloud ETL project. This data pipeline uses an Amazon EMR cluster managed by Apache Airflow running on an AWS EC2 instance. It demonstrates how to build an orchestration that transforms data using Amazon EMR and ingests it automatically into Snowflake via Snowpipe. It also features Power BI.
Salifort-Motors-and-Waze-Churn
Employee retention predictive model development for Salifort Motors and Waze. This is the final project I completed to earn the Google Advanced Data Analytics Professional Certificate.
Water-Quality-DW-on-Oracle-Database
This is an Oracle DB Data Warehouse and manual ETL demo on a specially formatted Water Quality dataset from DEFRA, UK. It is a personal, academic-grade exercise exploring the basic concepts of data warehousing and the manual ETL process.
Zillow-Rapid-API-end-to-end-ETL-data-pipeline-by-Airflow-on-EC2
This is an end-to-end AWS Cloud ETL project. This data pipeline orchestration uses Apache Airflow on AWS EC2 as well as AWS Lambda. It demonstrates how to build an ETL data pipeline that transforms data using a Lambda function and loads it into a Redshift cluster table. The data is then visualized using Amazon QuickSight.
Image-Background-Remover-demo-with-Python
This is a simple but fun exercise demonstrating the power of Python in image manipulation using libraries like Pillow (PIL) and Rembg, as well as leveraging ONNX Runtime for faster processing on a GPU.
little-lemon-booking-system-db
Database project for managing the table booking system of the Little Lemon restaurant. This is a capstone project I undertook to earn the Meta Database Engineer Professional Certificate.