Dorian Teffo's starred repositories
public-apis
A collective list of free APIs
data-engineering-zoomcamp
Free Data Engineering course!
data-engineering-practice
Data Engineering Practice Problems
ssh-deploy
GitHub Action for deploying code via rsync over SSH (with NodeJS).
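For context, a minimal sketch of the rsync-over-SSH call an action like this wraps; the host, user, key path, and directories are all hypothetical.

```python
import subprocess

# Hypothetical host, user, key, and paths; this mirrors the kind of
# rsync call that ssh-deploy-style actions run under the hood.
subprocess.run(
    [
        "rsync",
        "-avz",       # archive mode, verbose, compress in transit
        "--delete",   # remove remote files that no longer exist locally
        "-e", "ssh -i ~/.ssh/deploy_key -o StrictHostKeyChecking=no",
        "./dist/",                            # local build output
        "deploy@example.com:/var/www/app/",   # remote target
    ],
    check=True,
)
```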
data_engineering_project_template
A template repository to create a data project with IaC, CI/CD, data migrations, & testing
pypi-duck-flow
End-to-end data engineering project to get insights from PyPI using Python, DuckDB, MotherDuck & Evidence
beginner_de_project_stream
Simple stream processing pipeline
bitcoinMonitor
Near-real-time ETL to populate a dashboard.
online_store
End-to-end data engineering project
modern-data-platform
End-to-end data platform leveraging the modern data stack
unitTestPySpark
How to unit test your PySpark code
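A minimal sketch of the pattern this repo teaches, assuming a local pyspark install; `add_total` is a hypothetical transformation under test.

```python
import pytest
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

@pytest.fixture(scope="session")
def spark():
    # Local SparkSession so tests run without a cluster.
    session = SparkSession.builder.master("local[1]").appName("tests").getOrCreate()
    yield session
    session.stop()

def add_total(df):
    # Hypothetical transformation under test: price * quantity.
    return df.withColumn("total", F.col("price") * F.col("quantity"))

def test_add_total(spark):
    df = spark.createDataFrame([(2.0, 3)], ["price", "quantity"])
    result = add_total(df).collect()
    assert result[0]["total"] == 6.0
```

Keeping the SparkSession fixture session-scoped matters: spinning up a JVM per test makes the suite painfully slow.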
DataEngineeringProjects
Example end-to-end projects for data engineers to build.
crypto_api_kafka_airflow_streaming
Get crypto data from an API and stream it to Kafka with Airflow; write the data to MySQL and visualize it with Metabase
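A minimal sketch of the extract-and-produce step (not the repo's actual code), assuming kafka-python and a local broker; the topic name is a placeholder, and the public CoinGecko endpoint stands in for whichever API the pipeline uses.

```python
import json
import requests
from kafka import KafkaProducer  # pip install kafka-python

# Broker address and topic name are assumptions for illustration.
producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

# Fetch the current BTC price from a public endpoint.
resp = requests.get(
    "https://api.coingecko.com/api/v3/simple/price",
    params={"ids": "bitcoin", "vs_currencies": "usd"},
    timeout=10,
)
resp.raise_for_status()

producer.send("crypto_prices", resp.json())
producer.flush()
```

In the repo, an Airflow task would run this on a schedule rather than as a one-off script.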
etl_pipeline_docker_metabase
Data pipeline to build a data warehouse on Postgres
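A minimal sketch of such a load step, assuming pandas, SQLAlchemy, and a reachable Postgres; the connection string, CSV path, and table name are placeholders.

```python
import pandas as pd
from sqlalchemy import create_engine  # pip install sqlalchemy psycopg2-binary

# Connection string and table name are placeholders.
engine = create_engine("postgresql+psycopg2://user:password@localhost:5432/warehouse")

# Extract: read a raw CSV export (path is hypothetical).
orders = pd.read_csv("data/orders.csv", parse_dates=["order_date"])

# Transform: light cleanup before loading.
orders = orders.dropna(subset=["order_id"]).drop_duplicates("order_id")

# Load: append into a warehouse table, creating it on first run.
orders.to_sql("fact_orders", engine, if_exists="append", index=False)
```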
data-engineering-projects
Welcome to my data engineering projects repository! Here you will find a collection of data engineering projects that I have worked on.
DuckdbAndDeltaLake
Learning how to query a remote Delta Lake table on S3 with DuckDB.
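A minimal sketch of what that looks like, assuming a recent DuckDB with the delta and httpfs extensions available; the bucket and table path are hypothetical.

```python
import duckdb  # pip install duckdb

con = duckdb.connect()
# delta provides delta_scan(); httpfs handles the s3:// access.
for ext in ("delta", "httpfs"):
    con.install_extension(ext)
    con.load_extension(ext)

# Pick up AWS credentials from the usual environment/config chain
# (uses the aws extension, autoloaded in recent DuckDB builds).
con.sql("CREATE SECRET (TYPE s3, PROVIDER credential_chain)")

# Bucket and table path are hypothetical.
rows = con.sql(
    "SELECT count(*) AS n FROM delta_scan('s3://my-bucket/my-delta-table/')"
).fetchone()
print(rows)
```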
vg-sales-glue-spark-terraform
ETL job with AWS Glue
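For reference, the standard Glue PySpark job skeleton a project like this builds on; it only runs inside a Glue job, and the S3 paths are placeholders.

```python
# Standard AWS Glue PySpark boilerplate; runs inside a Glue job,
# not on a plain laptop. Paths below are placeholders.
import sys
from awsglue.utils import getResolvedOptions
from awsglue.context import GlueContext
from awsglue.job import Job
from pyspark.context import SparkContext

args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glue_context = GlueContext(SparkContext.getOrCreate())
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Read raw CSVs from S3 into a DynamicFrame.
sales = glue_context.create_dynamic_frame.from_options(
    connection_type="s3",
    connection_options={"paths": ["s3://my-bucket/raw/vgsales/"]},
    format="csv",
    format_options={"withHeader": True},
)

# Write back as Parquet to the curated zone.
glue_context.write_dynamic_frame.from_options(
    frame=sales,
    connection_type="s3",
    connection_options={"path": "s3://my-bucket/curated/vgsales/"},
    format="parquet",
)

job.commit()
```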
INSERT-UPDATE-DELETE-READ-CRUD-on-Delta-lakes-S3-using-Glue-PySpark-Custom-Jar-Files-Athen
INSERT | UPDATE | DELETE | READ (CRUD) on Delta Lake (S3) using Glue, PySpark, custom JAR files & Athena
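The UPDATE/INSERT half of that CRUD story is usually a Delta MERGE; a minimal sketch with delta-spark, where the table path and join key are hypothetical (s3a access additionally needs the hadoop-aws jars and credentials).

```python
from delta import configure_spark_with_delta_pip
from delta.tables import DeltaTable  # pip install delta-spark
from pyspark.sql import SparkSession

# Delta-enabled SparkSession, per the delta-spark quickstart.
builder = (
    SparkSession.builder.appName("crud-demo")
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog",
            "org.apache.spark.sql.delta.catalog.DeltaCatalog")
)
spark = configure_spark_with_delta_pip(builder).getOrCreate()

# Incoming changes; schema and join key are hypothetical.
updates = spark.createDataFrame([(1, "shipped")], ["order_id", "status"])
target = DeltaTable.forPath(spark, "s3a://my-bucket/delta/orders/")

# MERGE covers the UPDATE and INSERT halves of CRUD in one statement.
(
    target.alias("t")
    .merge(updates.alias("s"), "t.order_id = s.order_id")
    .whenMatchedUpdateAll()
    .whenNotMatchedInsertAll()
    .execute()
)
```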
ci_cd_lambda
Use Docker, Terraform, and GitHub Actions to deploy Lambda code to AWS
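The final deploy step often boils down to a single boto3 call; a minimal sketch where the function name and ECR image URI are placeholders, and the real repo drives this from CI rather than by hand.

```python
import boto3  # pip install boto3

# Function name, region, and image URI are placeholders.
client = boto3.client("lambda", region_name="us-east-1")

# Point the existing Lambda at a freshly pushed container image.
response = client.update_function_code(
    FunctionName="my-etl-function",
    ImageUri="123456789012.dkr.ecr.us-east-1.amazonaws.com/my-etl:latest",
)
print(response["LastModified"])
```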
dbt_python_docker
Use dbt to create a star schema
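A star schema splits measures from descriptive attributes. This is not the repo's dbt code, just a pandas sketch of the same idea: deriving dimension and fact tables, with surrogate keys, from a flat extract with hypothetical columns.

```python
import pandas as pd

# Flat extract with one row per order line (columns are hypothetical).
flat = pd.DataFrame({
    "order_id": [1, 2],
    "customer_name": ["Ada", "Grace"],
    "product_name": ["widget", "gadget"],
    "amount": [9.99, 24.50],
})

# Dimensions: one row per distinct entity, with a surrogate key.
dim_customer = flat[["customer_name"]].drop_duplicates().reset_index(drop=True)
dim_customer["customer_key"] = dim_customer.index

dim_product = flat[["product_name"]].drop_duplicates().reset_index(drop=True)
dim_product["product_key"] = dim_product.index

# Fact table: measures plus foreign keys into the dimensions.
fact_sales = (
    flat.merge(dim_customer, on="customer_name")
        .merge(dim_product, on="product_name")
        [["order_id", "customer_key", "product_key", "amount"]]
)
```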
analytics-project-SQL-PowerBI
Used SQL and Power BI to surface key metrics and give an e-commerce business insight into its data.
designer-website-scraping
Extract designer information from https://www.dexigner.com/
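A minimal scraping sketch with requests and BeautifulSoup; the CSS selector is a guess, so inspect the live page before relying on it.

```python
import requests
from bs4 import BeautifulSoup  # pip install beautifulsoup4

# Fetch the landing page; the selector below is a guess at how
# designer links are marked up, not the repo's actual logic.
resp = requests.get(
    "https://www.dexigner.com/",
    headers={"User-Agent": "Mozilla/5.0"},
    timeout=10,
)
resp.raise_for_status()

soup = BeautifulSoup(resp.text, "html.parser")
designers = [a.get_text(strip=True) for a in soup.select("a[href*='/design']")]
print(designers[:10])
```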
uber-eats-airflow-spark-glue-athena
Ingest CSV files and load them to S3, upload a Spark script to S3, and run the Spark job on an EMR cluster, which pulls the raw UberEats data from S3, cleans it, and loads it back to S3 in the proper schema. All of this is orchestrated with Airflow.
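A skeleton of the orchestration layer only, assuming Airflow 2.4+; the task bodies are placeholders where the repo wires in real S3 uploads and an EMR step.

```python
from datetime import datetime
from airflow import DAG
from airflow.operators.python import PythonOperator

def upload_raw_to_s3():
    ...  # push the UberEats CSVs to s3://my-bucket/raw/ (placeholder)

def upload_spark_script():
    ...  # push the cleaning job to s3://my-bucket/scripts/ (placeholder)

def submit_emr_step():
    ...  # add a spark-submit step to the EMR cluster via boto3 (placeholder)

with DAG(
    dag_id="uber_eats_pipeline",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    raw = PythonOperator(task_id="upload_raw", python_callable=upload_raw_to_s3)
    script = PythonOperator(task_id="upload_script", python_callable=upload_spark_script)
    emr = PythonOperator(task_id="run_on_emr", python_callable=submit_emr_step)

    # Both uploads must finish before the EMR job starts.
    [raw, script] >> emr
```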
unit_test_sample
Run simple unit tests
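A minimal pytest example of what "simple unit tests" means in practice; `slugify` is a hypothetical function under test.

```python
# Minimal pytest example (pip install pytest, run with `pytest`).
def slugify(title: str) -> str:
    # Function under test: lowercase and hyphenate a title.
    return "-".join(title.lower().split())

def test_slugify():
    assert slugify("Hello World") == "hello-world"

def test_slugify_extra_spaces():
    assert slugify("  Data   Engineering ") == "data-engineering"
```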