Dorian Teffo's starred repositories

public-apis

A collective list of free APIs

Language: Python | License: MIT | Stargazers: 304533 | Issues: 4142 | Issues: 609

data-engineering-zoomcamp

Free Data Engineering course!

Language: Jupyter Notebook | Stargazers: 23857 | Issues: 437 | Issues: 124

data-engineering-practice

Data Engineering Practice Problems

ssh-deploy

GitHub Action for deploying code via rsync over ssh. (with NodeJS)

Language: JavaScript | License: MIT | Stargazers: 1139 | Issues: 8 | Issues: 119

data_engineering_project_template

A template repository to create a data project with IaC, CI/CD, data migrations, & testing

Language: HTML | License: MIT | Stargazers: 210 | Issues: 7 | Issues: 9

pypi-duck-flow

End-to-end data engineering project to get insights from PyPI using Python, DuckDB, MotherDuck & Evidence

Language: Python | Stargazers: 107 | Issues: 4 | Issues: 0

beginner_de_project_stream

Simple stream processing pipeline

bitcoinMonitor

Near real time ETL to populate a dashboard.

online_store

End to end data engineering project

Language: Python | License: MIT | Stargazers: 49 | Issues: 3 | Issues: 3

modern-data-platform

End-to-end data platform leveraging the Modern data stack

Language: Python | Stargazers: 30 | Issues: 1 | Issues: 0

unitTestPySpark

How to unit test your PySpark code

DataEngineeringProjects

Some example projects for Data Engineers to build, end-to-end.

crypto_api_kafka_airflow_streaming

Get crypto data from an API and stream it to Kafka with Airflow. Write the data to MySQL and visualize it with Metabase.

Language: Python | Stargazers: 12 | Issues: 1 | Issues: 0
Language: HCL | Stargazers: 12 | Issues: 3 | Issues: 0

tdf

Dagster and dbt assets for the fancy data stack project.

Language: Python | Stargazers: 9 | Issues: 1 | Issues: 0

etl_pipeline_docker_metabase

Data pipeline to build a data warehouse on Postgres

Language: Python | Stargazers: 9 | Issues: 1 | Issues: 0

data-engineering-projects

Welcome to my data engineering projects repository! Here you will find a collection of data engineering projects that I have worked on.

DuckdbAndDeltaLake

Learning how to query remote s3 Delta Lake with DuckDB.

Language: Python | Stargazers: 7 | Issues: 2 | Issues: 0

vg-sales-glue-spark-terraform

ETL job with AWS Glue

Language: Python | Stargazers: 4 | Issues: 1 | Issues: 0

INSERT-UPDATE-DELETE-READ-CRUD-on-Delta-lakes-S3-using-Glue-PySpark-Custom-Jar-Files-Athen

INSERT | UPDATE | DELETE | READ | CRUD | on Delta lakes (S3) using Glue PySpark Custom Jar Files & Athena

Language: Python | License: Apache-2.0 | Stargazers: 4 | Issues: 3 | Issues: 0

ci_cd_lambda

Use Docker, Terraform, and GitHub Actions to deploy Lambda code to AWS

Language: HCL | Stargazers: 3 | Issues: 1 | Issues: 0

dbt_python_docker

Use dbt to create a star schema

Language: Python | Stargazers: 2 | Issues: 1 | Issues: 0

analytics-project-SQL-PowerBI

Used SQL and Power BI to help an e-commerce products business gain insights into its data by presenting key metrics.

Language: HCL | Stargazers: 1 | Issues: 1 | Issues: 0

designer-website-scraping

Extract designer information from https://www.dexigner.com/

Language: Python | Stargazers: 1 | Issues: 1 | Issues: 0

uber-eats-airflow-spark-glue-athena

Ingest CSV files and load them to S3, upload a Spark script to S3, and run the Spark code on an EMR cluster, which pulls the raw UberEats data from S3, cleans it, and loads it back to S3 in the proper schema. All of this is orchestrated with Airflow.

Language: Python | Stargazers: 1 | Issues: 1 | Issues: 0

unit_test_sample

Run simple unit tests

Language: Python | Stargazers: 1 | Issues: 1 | Issues: 0