vishal2232 / Project_1-Spark-using-Scala-API-

Problem statement: get the daily revenue and number of orders from order_items.

Project_1-Spark-using-Scala-API-

Analysis of Production Data

Requirements: VMware, CentOS, Hadoop, Apache Spark, Scala

First, import both tables (orders and order_items, from the retail_db database) from MySQL into HDFS using Sqoop.

Then perform the necessary analytics using Apache Spark.
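The analytics step could be sketched with the Spark RDD API as below. This is a minimal illustration, not the code from this repository. It assumes the Sqoop import wrote comma-delimited text files, that orders has the columns order_id, order_date, order_customer_id, order_status, and that order_items has order_item_id, order_item_order_id, order_item_product_id, order_item_quantity, order_item_subtotal, order_item_product_price. The object name DailyRevenue, the helper names, and the three path arguments are hypothetical.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SparkSession

// Hypothetical sketch: daily revenue and number of orders from retail_db data in HDFS.
object DailyRevenue {

  // Assumed layout: order_id,order_date,order_customer_id,order_status
  // Returns (order_id, date as yyyy-MM-dd).
  def parseOrder(line: String): (Int, String) = {
    val f = line.split(",")
    (f(0).toInt, f(1).take(10))
  }

  // Assumed layout: order_item_id,order_item_order_id,order_item_product_id,
  //                 order_item_quantity,order_item_subtotal,order_item_product_price
  // Returns (order_item_order_id, order_item_subtotal).
  def parseOrderItem(line: String): (Int, Double) = {
    val f = line.split(",")
    (f(1).toInt, f(4).toDouble)
  }

  // Returns (date, (revenue, number of orders)), sorted by date.
  // Only orders that have at least one row in order_items are counted.
  def dailyStats(orders: RDD[String],
                 orderItems: RDD[String]): RDD[(String, (Double, Long))] = {
    // Sum the item subtotals of each order, so every order appears exactly once.
    val revenuePerOrder = orderItems.map(parseOrderItem).reduceByKey(_ + _)

    orders.map(parseOrder)                         // (order_id, date)
      .join(revenuePerOrder)                       // (order_id, (date, revenue))
      .map { case (_, (date, revenue)) => (date, (revenue, 1L)) }
      .reduceByKey((a, b) => (a._1 + b._1, a._2 + b._2))
      .sortByKey()
  }

  def main(args: Array[String]): Unit = {
    // args(0): orders path, args(1): order_items path, args(2): output path
    val spark = SparkSession.builder().appName("DailyRevenue").getOrCreate()
    val sc = spark.sparkContext

    dailyStats(sc.textFile(args(0)), sc.textFile(args(1)))
      .map { case (date, (revenue, count)) => f"$date,$revenue%.2f,$count" }
      .saveAsTextFile(args(2))

    spark.stop()
  }
}
```

Aggregating order_items per order before the join keeps the order count at one per order without a separate distinct step. The sketch does not filter on order_status; if only completed orders should count, a filter on that column would go before the join.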
