Dorian Teffo's repositories
modern-data-platform
End-to-end data platform leveraging the modern data stack
etl_pipeline_docker_metabase
Data pipeline to build a data warehouse on Postgres
ci_cd_lambda
Use Docker, Terraform, and GitHub Actions to deploy Lambda code to AWS
vg-sales-glue-spark-terraform
ETL job with AWS Glue
dbt_python_docker
Use dbt to create a star schema
sql_interview
Solving a SQL interview query for data analysts, asked by a product-based company
amazon_products_analysis
EDA on Amazon product data, plus a web app to track the performance of the top 10 manufacturers on Amazon
analytics-project-SQL-PowerBI
Used SQL and Power BI to help an e-commerce business gain insights into its data by presenting key metrics.
designer-website-scraping
Extract designer information from https://www.dexigner.com/
fifa21_datacleaning_python
Using the pandas library in Python to clean the FIFA 21 dataset
impact_analysis-SQL-Powerbi
Quantify the impact of large-scale supply changes on the sales performance of a business
spotify-airflow-s3-pipeline
Build an ETL pipeline with the Spotify API and find insights in the extracted data
uber-eats-airflow-spark-glue-athena
Ingest CSV files and load them to S3, upload the Spark script to S3, then run the Spark code on an EMR cluster, which pulls the raw UberEats data from S3, cleans it, and loads it back to S3 in the proper schema. All of this is orchestrated with Airflow.
unit_test_sample
Run simple unit tests
data-engineering-practice
Data Engineering Practice Problems
github_scheduling
Just a repository to test the GitHub Actions scheduling feature.