Mike Houngbadji's repositories
airflow_minio_twitter_data_pipeline
A simple example of a data pipeline using Apache Airflow (orchestrator) and MinIO (S3-like object storage)
twitter_data-lakehouse_minio_drill_superset
Building a Data Lakehouse for Analyzing Elon Musk Tweets using MinIO, Apache Airflow, Apache Drill and Apache Superset
genesis_data_generator
Fake data generator for different destinations
streamlit_duckdb
Tutorial on Streamlit + DuckDB for a blazing-fast web app
tuto_python_kafka_cassandra_postgres
Data Engineering Project: Consume data from Kafka and insert it into Cassandra and PostgreSQL databases. There is also some data enrichment.
coincap_etl_cron
DE Pipeline Tutorial
dataquest.io-predict_sp500
ML Tutorial
duckdb_job_runner_on_s3
A Project on using DuckDB as an Engine
astronomer-docs
This repository contains all content and code for Astro and Astronomer Software documentation.
data-dockerfiles
A curated list of docker-compose files prepared for testing data engineering tools, databases, and open source libraries.
data-engineering-zoomcamp
Data Engineering Zoomcamp inspired by DataTalksClub
gcp-oss-alternatives
Google Cloud Platform Open Source Software Alternatives
gcp-training-data-analyst
Labs and demos for courses for GCP Training (http://cloud.google.com/training).
gcp_tutorials
Testing GCP Python Clients
google-colab
Google Colabs
mikekenneth
GitHub Profile
tuto_airflow_soda_postgres
A tutorial on data quality checks with Soda in Airflow