Tech-with-Vidhya / Building_ETL_Data_Pipeline_on-AWS_EMR_Cluster_Hive_Tables_Tableau_Visualisation

This project implements an ETL batch data pipeline for sales data using Python and AWS services. The batch sales data is persisted in an AWS S3 bucket and ingested into an AWS Elastic MapReduce (EMR) cluster. The ingested data is then transformed using Apache Hive tables and finally consumed by Tableau, which displays the sales visualisations as a dashboard.

Tools & Technologies: Python, Boto3, AWS CLI, AWS S3, AWS EMR, Apache Hive, Tableau
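The Hive transformation stage described above could be driven from Python roughly as follows. This is a minimal sketch, not the repository's actual code: the function names `build_hive_step` and `submit_step`, the step name, and the `INPUT`/`OUTPUT` Hive variables are hypothetical and chosen for illustration. It assumes the Hive script is stored in S3 and that an EMR cluster with Hive installed is already running.

```python
from typing import Dict, List


def build_hive_step(script_uri: str, input_uri: str, output_uri: str) -> Dict:
    """Build one EMR step definition that runs a Hive script stored in S3.

    The INPUT and OUTPUT variables are illustrative; the Hive script would
    reference them as ${INPUT} and ${OUTPUT}.
    """
    args: List[str] = [
        "hive-script",
        "--run-hive-script",
        "--args",
        "-f", script_uri,
        "-d", f"INPUT={input_uri}",
        "-d", f"OUTPUT={output_uri}",
    ]
    return {
        "Name": "Transform sales data with Hive",  # hypothetical step name
        "ActionOnFailure": "CONTINUE",
        "HadoopJarStep": {"Jar": "command-runner.jar", "Args": args},
    }


def submit_step(cluster_id: str, step: Dict) -> str:
    """Submit the step to a running EMR cluster and return its step ID."""
    import boto3  # imported lazily so build_hive_step works without AWS access

    emr = boto3.client("emr")
    response = emr.add_job_flow_steps(JobFlowId=cluster_id, Steps=[step])
    return response["StepIds"][0]
```

Keeping the step definition separate from the submission call means the configuration can be inspected or unit-tested without AWS credentials; `submit_step` would only be called with the ID of a real cluster.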

