Ayoub-etoullali / ETL_Training

A comprehensive training program that equips developers with essential skills across the data engineering and data science life cycles, covering data processing, software development, ML/AI, and KPI visualization for real-world business problem-solving.

NTT DATA

Data Engineering and Data Science Training Program

Welcome to our comprehensive data training project, designed as part of our training curriculum for newcomers in the field. This program offers an immersive learning experience to practice and master the key aspects of data engineering and data science through their complete life cycles.

Introduction

In this training program, participants work through the complete life cycles of both data engineering and data science. The data engineering part covers data acquisition, cleaning, conversion, disambiguation, and deduplication. The data science part covers problem definition, data collection, preparation, exploratory data analysis, model building, and deployment.
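To make the data engineering steps more concrete, here is a minimal sketch in plain Scala of cleaning, conversion, and deduplication applied to a handful of records. The `RawCustomer` and `Customer` types, their fields, and the `CleaningSketch` object are hypothetical names chosen for illustration and are not taken from the training material. Disambiguation is left out because it depends on domain rules.

```scala
import java.time.LocalDate
import scala.util.Try

// Hypothetical record types; the field names are illustrative only.
final case class RawCustomer(id: String, name: String, signupDate: String)
final case class Customer(id: Int, name: String, signupDate: LocalDate)

object CleaningSketch {
  // Cleaning: trim every field and collapse repeated whitespace in names.
  def clean(r: RawCustomer): RawCustomer =
    r.copy(
      id = r.id.trim,
      name = r.name.trim.replaceAll("\\s+", " "),
      signupDate = r.signupDate.trim
    )

  // Conversion: parse strings into typed fields.
  // Rows whose id or ISO date cannot be parsed are dropped (None).
  def convert(r: RawCustomer): Option[Customer] =
    for {
      id   <- Try(r.id.toInt).toOption
      date <- Try(LocalDate.parse(r.signupDate)).toOption
    } yield Customer(id, r.name, date)

  // Deduplication: keep the first record seen for each id (Scala 2.13+).
  def deduplicate(customers: Seq[Customer]): Seq[Customer] =
    customers.distinctBy(_.id)

  // The full pipeline: clean, convert, then deduplicate.
  def run(raw: Seq[RawCustomer]): Seq[Customer] =
    deduplicate(raw.map(clean).flatMap(convert))
}
```

In a real pipeline the same three stages would typically run on a distributed engine such as Spark rather than on in-memory collections; the sketch only shows how the stages compose.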

It's important to note that while both life cycles share some common steps, they require distinct skill sets. Data engineers need to excel in software development, designing data pipelines, and managing databases and processing systems. On the other hand, data scientists must be well-versed in machine learning, artificial intelligence, specialized model development, and working with pristine datasets.

Participants will practice programming in Scala and Python, write batch scripts, and work with SQL. On the data engineering side, they will gain hands-on experience with Airflow, Spark, HDFS, Postgres, MariaDB, and Hive for software development, data pipeline creation, and database management. On the data science side, Jupyter notebooks, Grafana, and possibly Spark ML will be used to explore machine learning and AI techniques, develop specialized models, and visualize KPIs.
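As an illustration of the kind of KPI that might later be charted in a dashboard, the following sketch computes total order amount per day with plain Scala collections. The `Order` type, its fields, and the `KpiSketch` object are assumptions made for this example; the program's actual datasets and KPIs are not specified here.

```scala
// Hypothetical order record; `day` is an ISO date string such as "2024-01-01".
final case class Order(day: String, amount: BigDecimal)

object KpiSketch {
  // Total order amount per day, sorted by day. This has the same shape as:
  //   SELECT day, SUM(amount) FROM orders GROUP BY day ORDER BY day
  def revenuePerDay(orders: Seq[Order]): Seq[(String, BigDecimal)] =
    orders
      .groupBy(_.day)
      .map { case (day, rows) => day -> rows.map(_.amount).sum }
      .toSeq
      .sortBy(_._1)
}
```

`BigDecimal` is used instead of `Double` so that monetary sums do not accumulate floating-point rounding error.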

We encourage you to fully immerse yourself in this learning journey and enjoy the process!

Goals

The primary objective of this training program is to equip developers with the necessary skills to thrive in real-world business projects. By gaining proficiency in essential tools, programming languages, and frameworks, participants will be well-prepared to tackle various business challenges. Additionally, the program aims to foster familiarity with emerging frameworks and processes that enhance developer capabilities in analysis, development, deployment, and testing.

We look forward to guiding you through this enriching learning experience. Happy learning!


With ❤️ By Ayoub ETOULLALI


Languages

Scala 95.7%, Groovy 4.3%