FadyGrAb / data-engineering-mini-projects

I'm always learning. And I love learning through practice. So, I've created this repo to share with everybody what I learn.

Data Engineering mini projects

What's this?

Whenever I talk to a colleague or a friend about Data Engineering, I often sense confusion, as if they were telepathically asking me, "What are you talking about? Opening an MS Excel workbook with pandas, you mean?!" So, I'll try to create simple Data Engineering projects so that anyone who wants to know more about the field can explore it, have fun, and play with well-known Data Engineering practices and tools.

I don't know anything about Data Engineering. Can I learn it from this repo?

To some extent, yes. But I'm not designing the projects as step-by-step guides, so you'll have to do some work yourself to figure out all the details.

What are the absolute bare-minimum prerequisites I need to follow along?

You don't have to be super tech-savvy, but you do need some software development background and to be comfortable with Docker, Git, IDEs, the cloud, and so on. I don't think this repo is suitable for beginners without any coding experience.

What tools and programming languages will you use?

Essentially, anything that can add value to a Data Engineering project. Mainly, though, I'll use the most popular tools in the Data Engineering space, such as Apache Spark, Kafka, and Flink, along with a wide range of databases and some cloud services. As for programming languages, I'll use Python, but I'll do some Rust coding as well.

License: MIT License


Languages

PLpgSQL 53.7%, Python 23.7%, Jupyter Notebook 15.1%, Rust 4.9%, Shell 2.1%, Dockerfile 0.5%