This repository contains notes and exercises I made while taking the Data Engineering Zoomcamp offered by DataTalks.Club.
Data used: New York City yellow taxi trip data
The data can be downloaded using: `wget https://s3.amazonaws.com/nyc-tlc/trip+data/yellow_tripdata_2021-01.csv`
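
Once the file is downloaded, loading it into a database in fixed-size batches keeps memory use small. The following is only a minimal sketch of that idea, not the code used in the course or in these notes. It uses only the Python standard library and swaps in an in-memory SQLite database for Postgres so that it runs without a server. The function name `load_csv_in_chunks`, the table name `yellow_taxi_data`, and the three-row sample are assumptions made for illustration.

```python
import csv
import io
import sqlite3
from itertools import islice


def load_csv_in_chunks(lines, conn, table, chunk_size=100_000):
    """Insert CSV rows into `table` in batches; return the number of rows loaded.

    `lines` is any iterable of CSV text lines (for example an open file).
    The table is created from the CSV header with untyped columns.
    """
    reader = csv.reader(lines)
    header = next(reader)  # first line holds the column names
    columns = ", ".join(f'"{name}"' for name in header)
    placeholders = ", ".join("?" for _ in header)

    conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({columns})')
    insert_sql = f'INSERT INTO "{table}" ({columns}) VALUES ({placeholders})'

    total = 0
    while True:
        # Pull at most `chunk_size` rows so the whole file never sits in memory.
        chunk = list(islice(reader, chunk_size))
        if not chunk:
            break
        conn.executemany(insert_sql, chunk)
        conn.commit()  # one commit per batch
        total += len(chunk)
    return total


# Hypothetical three-row sample standing in for the downloaded CSV.
sample = io.StringIO(
    "VendorID,passenger_count,trip_distance\n"
    "1,1,2.10\n"
    "2,3,0.80\n"
    "1,2,5.45\n"
)

# Assumption: SQLite in memory instead of a running Postgres instance.
conn = sqlite3.connect(":memory:")
loaded = load_csv_in_chunks(sample, conn, "yellow_taxi_data", chunk_size=2)
row_count = conn.execute('SELECT COUNT(*) FROM "yellow_taxi_data"').fetchone()[0]
print(loaded, row_count)
```

For the real file, an open handle such as `open("yellow_tripdata_2021-01.csv", newline="")` could be passed in place of `sample`. Against Postgres, the connection would come from a Postgres driver instead of `sqlite3`, and the placeholder syntax and column types would likely need adjusting.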
- Postgres
- Load the data into a database
- Use pgcli to connect to Postgres
- pgAdmin
- Use the web interface to look at the data
- Docker
- Getting started with Docker
- Use Docker to start Postgres
- Use Docker to start pgAdmin
- Use both in the same network
- docker-compose
- Use one YAML file to start pgAdmin and Postgres in the same network
- Introduction to Terraform
- Introduction to Google Cloud
- Homework
- Data Lake
- Workflow orchestration
- Introduction to Prefect
- ETL with GCP & Prefect
- Store data in GCS and BigQuery
- Parametrizing workflows
- Prefect Cloud and additional resources
- Homework