Three simple examples of how to create an ETL pipeline
In this repository, three Apache Spark ETL pipelines are built with PySpark to load data into MySQL.
Objectives:
- Become familiar with Spark.
- Become familiar with the ETL process (extracting, transforming, and loading data from one system to another using data pipelines).
- Become familiar with connecting to MySQL and loading data from and to it.
Datasets:
- The friends dataset was generated randomly.
- The used cars dataset comes from [Kaggle](https://www.kaggle.com/lepchenkov/usedcarscatalog).
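
As a rough illustration of the extract, transform, and load steps named in the objectives, here is a minimal sketch of a PySpark job that reads a CSV file and appends it to a MySQL table. It is not the code from this repository. The helper names `mysql_jdbc_options` and `run_etl`, the file path, the database and table names, and the credentials are all hypothetical. The sketch also assumes that the MySQL Connector/J driver (class `com.mysql.cj.jdbc.Driver`) is available on Spark's classpath.

```python
def mysql_jdbc_options(host, port, database, table, user, password):
    """Build the options for Spark's JDBC data source (hypothetical helper)."""
    return {
        "url": f"jdbc:mysql://{host}:{port}/{database}",
        # Assumes MySQL Connector/J 8.x is on the Spark classpath.
        "driver": "com.mysql.cj.jdbc.Driver",
        "dbtable": table,
        "user": user,
        "password": password,
    }


def run_etl(csv_path, options):
    """Extract a CSV file, apply a simple cleanup, and load it into MySQL."""
    # Imported here so the helper above can be used without Spark installed.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("simple-etl").getOrCreate()

    # Extract: read the raw file and let Spark infer the column types.
    raw = spark.read.csv(csv_path, header=True, inferSchema=True)

    # Transform: drop incomplete rows and duplicates, then stamp the load time.
    cleaned = (
        raw.dropna()
        .dropDuplicates()
        .withColumn("loaded_at", F.current_timestamp())
    )

    # Load: append the result to the target MySQL table over JDBC.
    cleaned.write.format("jdbc").options(**options).mode("append").save()

    spark.stop()


# Example call (placeholder values, adjust to your environment):
# opts = mysql_jdbc_options("localhost", 3306, "etl_demo", "friends", "user", "password")
# run_etl("data/friends.csv", opts)
```

Keeping the connection settings in one small function makes it easy to reuse them when reading from MySQL as well, for example with `spark.read.format("jdbc").options(**opts).load()`.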