fgrullon / Data-Engineering-Udacity

This repository contains my solutions for the Udacity Data Engineering Nanodegree.

Projects

| Project Folder | Description | Done |
| --- | --- | --- |
| [Project 1a - PostgreSQL] | Building a star schema in PostgreSQL and inserting data via Python | ✔️ |
| [Project 1b - Cassandra] | Building a star schema in Cassandra and inserting data via Python | ✔️ |
| [Project 2 - AWS Redshift] | Building a star schema in AWS Redshift and inserting data from AWS S3 via Python | ✖️ |
| [Project 3 - Spark] | Reading and transforming data from AWS S3 with Spark and writing it to partitioned Parquet files | ✖️ |
| [Project 4 - Airflow Pipelines] | Building an Airflow pipeline that automates parsing and transforming files from AWS S3 and loading them into AWS Redshift | ✖️ |
| [Project 5 - Capstone Project] | Integrating files from S3 into PostgreSQL via Spark | ✖️ |

| Status | Symbol |
| --- | --- |
| Completed | ✔️ |
| Pending | ✖️ |
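
As a rough illustration of what "building a star schema and inserting data via Python" can look like, here is a minimal sketch. It is not taken from the project notebooks. It uses the standard-library `sqlite3` module as a stand-in for PostgreSQL so that it runs without a database server, and every table, column, and function name (`dim_user`, `dim_song`, `fact_songplay`, `build_star_schema`, `insert_play`, `plays_per_user`) is a hypothetical choice made for this example.

```python
import sqlite3


def build_star_schema(conn):
    """Create two dimension tables and one fact table (hypothetical names)."""
    conn.executescript(
        """
        CREATE TABLE IF NOT EXISTS dim_user (
            user_id INTEGER PRIMARY KEY,
            name    TEXT NOT NULL
        );
        CREATE TABLE IF NOT EXISTS dim_song (
            song_id TEXT PRIMARY KEY,
            title   TEXT NOT NULL
        );
        -- The fact table references each dimension by its key.
        CREATE TABLE IF NOT EXISTS fact_songplay (
            songplay_id INTEGER PRIMARY KEY AUTOINCREMENT,
            user_id     INTEGER NOT NULL REFERENCES dim_user(user_id),
            song_id     TEXT    NOT NULL REFERENCES dim_song(song_id),
            played_at   TEXT    NOT NULL
        );
        """
    )


def insert_play(conn, user, song, played_at):
    """Insert one event: upsert-style dimension rows, then the fact row."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR IGNORE INTO dim_user (user_id, name) VALUES (?, ?)", user
        )
        conn.execute(
            "INSERT OR IGNORE INTO dim_song (song_id, title) VALUES (?, ?)", song
        )
        conn.execute(
            "INSERT INTO fact_songplay (user_id, song_id, played_at) "
            "VALUES (?, ?, ?)",
            (user[0], song[0], played_at),
        )


def plays_per_user(conn):
    """Example analytical query: join the fact table to a dimension."""
    return conn.execute(
        """
        SELECT u.name, COUNT(*)
        FROM fact_songplay AS f
        JOIN dim_user AS u ON u.user_id = f.user_id
        GROUP BY u.name
        ORDER BY u.name
        """
    ).fetchall()
```

With PostgreSQL the same shape would typically go through a driver such as `psycopg2`, with `%s` placeholders and `ON CONFLICT DO NOTHING` in place of SQLite's `INSERT OR IGNORE`; the exact schema used in the project folders may differ.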

Languages

Jupyter Notebook 92.3%, Python 7.7%