nroldanf / data-engineering-course

Data engineering course taught by Pete Fein.

Data Pipeline

1. Ingestion

Download

Process (Unzip and choose CSV)
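The ingestion steps above could be sketched in Python roughly as follows. This is a minimal illustration, not the course's actual code: the function names `download` and `extract_csv` are hypothetical, and the source URL is left to the caller because the README does not name it.

```python
import urllib.request
import zipfile
from pathlib import Path


def download(url, dest):
    """Download `url` to the local path `dest` and return it as a Path."""
    dest = Path(dest)
    dest.parent.mkdir(parents=True, exist_ok=True)
    urllib.request.urlretrieve(url, dest)
    return dest


def extract_csv(zip_path, out_dir):
    """Unzip only the first CSV member of the archive into `out_dir`.

    Assumption: the archive holds at least one .csv file and the first
    one found is the one we want.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as zf:
        csv_names = [n for n in zf.namelist() if n.lower().endswith(".csv")]
        if not csv_names:
            raise ValueError(f"no CSV file found in {zip_path}")
        extracted = zf.extract(csv_names[0], path=out_dir)
    return Path(extracted)
```

Extracting a single member rather than the whole archive keeps the working directory free of files the later steps do not need.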

2. Data Warehouse

Create Dataset and Tables

Check data quality

Load data into db
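As a rough sketch of the create-and-load steps, the snippet below uses SQLite from the Python standard library. SQLite is only a stand-in here: the README does not say which warehouse the course uses, and the function name `load_csv` and the choice to store every column as TEXT are assumptions made for illustration.

```python
import csv
import sqlite3


def load_csv(conn, csv_path, table):
    """Create `table` from the CSV header and load every row into it.

    Assumptions: the first CSV row is the header, all columns are stored
    as TEXT, and empty cells are loaded as NULL.
    Returns the number of rows in the table after loading.
    """
    with open(csv_path, newline="") as f:
        reader = csv.reader(f)
        header = next(reader)
        columns = ", ".join(f'"{name}" TEXT' for name in header)
        conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({columns})')
        placeholders = ", ".join("?" for _ in header)
        # Turn empty strings into None so they become NULL in the table.
        rows = ([cell if cell != "" else None for cell in row] for row in reader)
        conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', rows)
    conn.commit()
    return conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
```

A call such as `load_csv(sqlite3.connect("warehouse.db"), "data.csv", "trips")` would load one file; the file and table names in that call are placeholders.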

3. Data Quality

Check for null values
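A null-value check like the one listed above can be expressed as one count per column. The sketch below assumes the data sits in a SQLite table, which may not match the database used in the course, and `count_nulls` is a hypothetical helper name.

```python
import sqlite3


def count_nulls(conn, table):
    """Return a dict mapping each column of `table` to its NULL count."""
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    columns = [row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')]
    counts = {}
    for name in columns:
        query = f'SELECT COUNT(*) FROM "{table}" WHERE "{name}" IS NULL'
        counts[name] = conn.execute(query).fetchone()[0]
    return counts
```

A pipeline step could fail the build when any count is nonzero, for example by raising an error if `any(count_nulls(conn, "trips").values())` is true; the table name in that call is a placeholder.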

To run everything:

make

Generate a graph view of the Makefile (requires Graphviz for the dot command)

wget https://raw.githubusercontent.com/vak/makefile2dot/master/makefile2dot.py
python makefile2dot.py dag.makefile | dot -Tpng > example-dag.png
