esvs2202 / Customer-Lifetime-value-Analysis-on-Amazon-Retail-sales-data

The aim of this project is to build a cost-efficient data warehouse on Amazon's retail sales data and perform customer lifetime value (CLV) analysis.

Customer-Lifetime-Analysis-on-Amazon-Retail-sales-data

Problem: Design a cost-efficient data engineering solution (a data warehouse) for analyzing customer lifetime value from Amazon retail sales data.

Architecture: image

Final Data model: image

Solution Steps:

  1. Azure resource group: esv_retail image

Synapse authorization: SQL Authentication

  1. Upload the raw “Amazon Retail Sales” Excel file into the “raw” directory inside the Data Lake storage. image

  2. In Data Factory, design data flows that create staging files from the Excel file:
     a. orders.csv
     b. returns.csv
     c. customer.csv
     d. product.csv
     Store all of these files in a separate container, “transformed”, inside the same Data Lake storage. image image image

  3. Provision a Dedicated SQL pool inside the Synapse workspace. It hosts the data warehouse. image

  4. Launch the Synapse workspace. Under the “Data” tab, the “transformed” container in the Data Lake storage holds all the staging data. image

  5. Under the “Workspace” tab, the SQL database appears inside the Dedicated SQL pool. image

  6. Create staging tables as external tables, one per CSV file, inside the “esv_sql” database. image

  7. Create fact and dimension tables inside the “esv_sql” database. image
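
Step 2 is built as Data Factory data flows rather than as code. Purely as an illustration of what those flows do, the Python sketch below projects flat sales rows onto the four staging files and drops duplicate rows. The column names, the `returned` flag, and the `split_into_staging` helper are all assumptions, because the workbook's real layout is not listed in this README.

```python
import csv
import io

# Hypothetical column layout for each staging file (not taken from the workbook).
STAGING_COLUMNS = {
    "orders": ["order_id", "order_date", "customer_id", "product_id", "sales"],
    "returns": ["order_id", "returned"],
    "customer": ["customer_id", "customer_name"],
    "product": ["product_id", "product_name"],
}


def split_into_staging(rows):
    """Return {staging name: CSV text} built from flat sales rows (list of dicts)."""
    staging = {}
    for name, columns in STAGING_COLUMNS.items():
        seen = set()
        buffer = io.StringIO()
        writer = csv.DictWriter(
            buffer, fieldnames=columns, extrasaction="ignore", lineterminator="\n"
        )
        writer.writeheader()
        for row in rows:
            # Only returned orders belong in returns.csv (assumed "Yes"/"No" flag).
            if name == "returns" and row.get("returned") != "Yes":
                continue
            key = tuple(row[column] for column in columns)
            if key in seen:  # skip rows already written for this staging file
                continue
            seen.add(key)
            writer.writerow(row)
        staging[name] = buffer.getvalue()
    return staging
```

Each value in the returned dictionary is the text of one staging file, for example `split_into_staging(rows)["customer"]` for customer.csv. In the actual pipeline the equivalent output is written to the “transformed” container.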

This is the final data warehouse. Now connect it to Power BI Desktop.
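
To make the fact and dimension layout from step 7 more concrete, here is a minimal sketch that uses Python's built-in `sqlite3` module as a local stand-in for the Dedicated SQL pool. The table names (`dim_customer`, `dim_product`, `fact_sales`), their columns, and the `build_star_schema` helper are hypothetical. The project's actual data model is the one in the “Final Data model” image, and Synapse T-SQL uses different DDL options than SQLite.

```python
import sqlite3

# Example analytical query: revenue per customer, joining the fact table to a dimension.
REVENUE_BY_CUSTOMER = """
    SELECT c.customer_name, SUM(f.sales) AS revenue
    FROM fact_sales AS f
    JOIN dim_customer AS c ON c.customer_id = f.customer_id
    GROUP BY c.customer_name
    ORDER BY c.customer_name
"""


def build_star_schema(orders, customers, products):
    """Create an in-memory star schema and load it from staging rows (tuples)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE dim_customer (customer_id TEXT PRIMARY KEY, customer_name TEXT);
        CREATE TABLE dim_product (product_id TEXT PRIMARY KEY, product_name TEXT);
        CREATE TABLE fact_sales (
            order_id TEXT,
            order_date TEXT,
            customer_id TEXT REFERENCES dim_customer (customer_id),
            product_id TEXT REFERENCES dim_product (product_id),
            sales REAL
        );
        """
    )
    # Load dimensions first, then the fact rows that reference them.
    conn.executemany("INSERT INTO dim_customer VALUES (?, ?)", customers)
    conn.executemany("INSERT INTO dim_product VALUES (?, ?)", products)
    conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?, ?, ?)", orders)
    conn.commit()
    return conn
```

Running `conn.execute(REVENUE_BY_CUSTOMER).fetchall()` on the returned connection gives one row per customer. A reporting tool such as Power BI issues the same kind of join against the fact and dimension tables.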

Data Analysis using Power BI: Performed Customer Lifetime value analysis: image image image
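
This README does not state which CLV formula the Power BI report uses. One common simplified definition is average order value × purchase frequency × expected customer lifespan, and the Python sketch below computes it under that assumption. The `clv_summary` helper, the input tuple layout, and the `lifespan_years` default are hypothetical, and the orders are assumed to cover one year.

```python
from collections import defaultdict


def clv_summary(orders, lifespan_years=3.0):
    """Compute a simplified CLV from (customer_id, order_id, sales) line items.

    Assumes the line items cover one year, so purchase frequency is orders
    per customer per year. lifespan_years is an assumed retention period.
    """
    revenue = defaultdict(float)
    order_ids = defaultdict(set)
    for customer_id, order_id, sales in orders:
        revenue[customer_id] += sales
        order_ids[customer_id].add(order_id)  # several line items may share an order

    if not revenue:
        raise ValueError("orders must contain at least one line item")

    total_revenue = sum(revenue.values())
    total_orders = sum(len(ids) for ids in order_ids.values())
    avg_order_value = total_revenue / total_orders
    purchase_frequency = total_orders / len(revenue)
    return {
        "avg_order_value": avg_order_value,
        "purchase_frequency": purchase_frequency,
        "clv": avg_order_value * purchase_frequency * lifespan_years,
        "revenue_by_customer": dict(revenue),
    }


if __name__ == "__main__":
    sample = [("C1", "O1", 100.0), ("C1", "O1", 50.0), ("C1", "O2", 50.0), ("C2", "O3", 100.0)]
    print(clv_summary(sample, lifespan_years=2.0)["clv"])  # 300.0
```

In the sample, three orders total 300 in revenue across two customers, so the average order value is 100, the frequency is 1.5 orders per customer, and a two-year lifespan gives a CLV of 300.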

Languages

Language:TSQL 100.0%