Salman AlSuwaina's repositories
Data_Lake
In this project, we use Spark to build an ETL pipeline for a data lake hosted on S3: load the data from S3, process it into analytics tables with Spark, and write those tables back to S3. The Spark job is deployed on a cluster using AWS.
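The load-process-write flow can be sketched without a cluster. The snippet below simulates the three ETL steps with only the standard library, using invented JSON song records; in the actual project each step would run through a `pyspark.sql.SparkSession` against S3, as noted in the comments.

```python
import json
import io

# Hypothetical raw records as they might arrive from S3 (assumption:
# one JSON object per line; the field names here are illustrative).
raw = io.StringIO(
    '{"song_id": "S1", "title": "A", "artist_id": "AR1", "year": 2001, "duration": 200.5}\n'
    '{"song_id": "S1", "title": "A", "artist_id": "AR1", "year": 2001, "duration": 200.5}\n'
    '{"song_id": "S2", "title": "B", "artist_id": "AR2", "year": 1999, "duration": 180.0}\n'
)

# Extract: parse each line into a record (Spark: spark.read.json("s3a://...")).
records = [json.loads(line) for line in raw]

# Transform: keep the songs-table columns and drop duplicate song_ids
# (Spark: df.select(...).dropDuplicates(["song_id"])).
seen, songs_table = set(), []
for r in records:
    if r["song_id"] not in seen:
        seen.add(r["song_id"])
        songs_table.append({k: r[k] for k in ("song_id", "title", "artist_id", "year", "duration")})

# Load: group rows by a partition key, mirroring parquet output partitioned
# by year and artist (Spark: df.write.partitionBy("year", "artist_id").parquet(...)).
partitions = {}
for row in songs_table:
    partitions.setdefault((row["year"], row["artist_id"]), []).append(row)
```

The stdlib version is only a stand-in for the logic; Spark performs the same select/dedupe/partition steps distributed across the cluster.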
Data_Modeling_Apache-cassandra
Applying data modeling to a NoSQL database with Apache Cassandra and building an ETL pipeline using Python. The data is modeled by creating tables in Apache Cassandra designed around the queries they need to serve.
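Cassandra modeling is query-first: each table is shaped around one query, with a partition key that localizes that query to a single partition. The sketch below shows a hypothetical table of that shape (the CQL string and a stdlib dict standing in for the Cassandra storage model; names and rows are invented for illustration).

```python
# Hypothetical target query: "all songs heard in a given session, in play
# order" -> partition key session_id, clustering column item_in_session.
create_cql = """
CREATE TABLE IF NOT EXISTS session_songs (
    session_id int,
    item_in_session int,
    artist text,
    song text,
    PRIMARY KEY (session_id, item_in_session)
)
"""

# Stdlib stand-in for the table: rows grouped by partition key and kept
# sorted by the clustering column, which is how Cassandra lays them out.
table = {}

def insert(session_id, item_in_session, artist, song):
    partition = table.setdefault(session_id, [])
    partition.append((item_in_session, artist, song))
    partition.sort()  # clustering order within the partition

insert(338, 1, "Faithless", "Music Matters")
insert(338, 0, "Des'ree", "You Gotta Be")
insert(42, 0, "The Cure", "Lovesong")

# The target query reads exactly one partition -- the access pattern
# the table was modeled for.
rows = table[338]
```

Because the table matches the query, there is no join or full scan: the session id selects one partition and the clustering column returns the rows already ordered.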
Data_Modeling_PostGres
Modeling the data with Postgres and building an ETL pipeline using Python. I define fact and dimension tables for a star schema around a particular analytic focus, and write an ETL pipeline that transfers data from files in two local directories into these tables in Postgres using Python and SQL.
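The fact/dimension split can be shown in a few lines. Below is a minimal star-schema sketch using `sqlite3` in place of Postgres so it runs self-contained; the table and column names are illustrative, not the project's exact schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Dimension tables describe the who/what of each event.
cur.execute("CREATE TABLE users (user_id INTEGER PRIMARY KEY, name TEXT)")
cur.execute("CREATE TABLE songs (song_id TEXT PRIMARY KEY, title TEXT)")

# The fact table records the events themselves and points at the dimensions.
cur.execute("""
CREATE TABLE songplays (
    songplay_id INTEGER PRIMARY KEY,
    user_id INTEGER REFERENCES users(user_id),
    song_id TEXT REFERENCES songs(song_id)
)""")

cur.execute("INSERT INTO users VALUES (1, 'Lily')")
cur.execute("INSERT INTO songs VALUES ('S1', 'Blue Train')")
cur.execute("INSERT INTO songplays VALUES (100, 1, 'S1')")

# An analytic query joins the fact table out to its dimensions.
cur.execute("""
SELECT u.name, s.title
FROM songplays p
JOIN users u ON p.user_id = u.user_id
JOIN songs s ON p.song_id = s.song_id
""")
result = cur.fetchall()
```

The star shape keeps analytic queries simple: one central fact table, one join per dimension.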
Data_Warehouses_AWS
Applying data warehousing tools and AWS to build an ETL pipeline for a database hosted on Redshift: loading data from an AWS S3 bucket into staging tables on Redshift, then executing SQL statements that build the analytics tables from those staging tables.
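The staging-then-transform pattern can be sketched locally. Here `sqlite3` stands in for Redshift and plain INSERTs stand in for the `COPY ... FROM 's3://...'` load step; the `INSERT INTO ... SELECT` at the end is the same pattern the Redshift SQL would use. Table names and rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Staging table: raw events land here first (on Redshift, via COPY from S3).
cur.execute("CREATE TABLE staging_events (user_id INTEGER, song TEXT, page TEXT)")
cur.executemany(
    "INSERT INTO staging_events VALUES (?, ?, ?)",
    [(1, "Blue Train", "NextSong"),
     (1, "Blue Train", "Home"),       # non-play event, filtered out below
     (2, "So What", "NextSong")],
)

# Analytics table built from staging with INSERT INTO ... SELECT --
# filtering and reshaping happen inside the warehouse, in SQL.
cur.execute("CREATE TABLE songplays (user_id INTEGER, song TEXT)")
cur.execute("""
INSERT INTO songplays (user_id, song)
SELECT user_id, song FROM staging_events WHERE page = 'NextSong'
""")

cur.execute("SELECT COUNT(*) FROM songplays")
plays = cur.fetchone()[0]
```

Splitting load and transform this way keeps the raw data queryable in staging while the analytics tables hold only the cleaned, play-event rows.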
Machine_Learning_Arrival_prediction
Analyzing a dataset of hospital patients that records whether each patient showed up for their booked appointment. The goal of this project is to build a machine learning model that predicts, from a patient's information, whether they will arrive.
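A toy version of the prediction task, using a hand-rolled logistic regression so the sketch stays dependency-free (the real project would more likely use a library such as scikit-learn). The features and data are invented: each row is `[days_waited, received_sms]` and the label is 1 if the patient arrived.

```python
import math

# Invented training data: short waits with an SMS reminder -> arrived (1),
# long waits without one -> no-show (0).
X = [[1, 1], [2, 1], [3, 1], [20, 0], [25, 0], [30, 0]]
y = [1, 1, 1, 0, 0, 0]

w, b, lr = [0.0, 0.0], 0.0, 0.1

def predict_prob(x):
    # Sigmoid of the linear score: probability the patient arrives.
    z = b + sum(wi * xi for wi, xi in zip(w, x))
    return 1.0 / (1.0 + math.exp(-z))

# Plain stochastic gradient descent on the log-loss.
for _ in range(2000):
    for xi, yi in zip(X, y):
        err = predict_prob(xi) - yi
        for j in range(len(w)):
            w[j] -= lr * err * xi[j]
        b -= lr * err

short_wait = predict_prob([2, 1])    # should lean toward "arrives"
long_wait = predict_prob([28, 0])    # should lean toward "no-show"
```

On the real dataset the same idea applies at scale: encode the patient's attributes as features, fit a classifier, and read the predicted probability of arrival.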