Khaled Shabasy (khaledshabasy)

Location: Cairo, Egypt

Khaled Shabasy's repositories

Data-Modeling-Spark-udacity-capstone

An ETL pipeline that combines the I94 immigration, global land temperatures, and US demographics datasets into an analytics database of immigration events. The data model is built with pandas and PySpark to find patterns of immigration to the United States.

Language: Jupyter Notebook | Stargazers: 1 | Issues: 1 | Issues: 0
Language: CSS | Stargazers: 0 | Issues: 0 | Issues: 0

Data-Lake-Spark-EMR

[Sparkify] Build an ETL pipeline that extracts data from S3, processes it with Spark, and loads it back into S3 as a set of dimensional tables, so the analytics team can continue finding insights into what songs users are listening to.

Language: Python | Stargazers: 0 | Issues: 0 | Issues: 0

Data-Modeling-Cassandra

[Sparkify] A non-relational database schema and ETL pipeline for data that resides in a directory of CSV logs of user activity for a music app, along with metadata on the songs in the app.

Language: Jupyter Notebook | Stargazers: 0 | Issues: 0 | Issues: 0

Data-Modeling-Postgres

[Sparkify] A database schema and ETL pipeline for data that resides in a directory of JSON logs of user activity for a music app, as well as a directory of JSON metadata on the songs in the app.

Language: Jupyter Notebook | Stargazers: 0 | Issues: 0 | Issues: 0

Data-Pipelines-with-Airflow

[Sparkify] Build high-grade data pipelines that are dynamic, built from reusable tasks, can be monitored, and allow easy backfills. Data quality plays a big part when analyses run on top of the data warehouse, so tests are run against the datasets after the ETL steps execute to catch any discrepancies.

Language: Python | Stargazers: 0 | Issues: 0 | Issues: 0

Data-Warehouse-AWS-Redshift

[Sparkify] Build an ETL pipeline that extracts data from S3, stages it in Redshift, and transforms it into a set of dimensional tables that form a data warehouse, so the analytics team can continue finding insights into what songs users are listening to.

Language: Jupyter Notebook | Stargazers: 0 | Issues: 0 | Issues: 0
Language: CSS | Stargazers: 0 | Issues: 0 | Issues: 0