M10han / Hadoop

Analysis of the Yelp dataset using MapReduce and Spark

Hadoop

MapReduce jobs that analyze the Yelp dataset:

  1. List the unique business categories, together with the business addresses, for businesses located in "various cities".
  2. Find the top 10 rated businesses by average rating.
  3. List the business_id, full address and categories of the top 10 businesses by average rating.
  4. List the 'user id' and 'rating' of users who reviewed businesses located in a given city.
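
To make the "top 10 by average rating" query concrete, here is a minimal sketch of its map, shuffle and reduce steps in plain Java, with no Hadoop dependency. It is an illustration under stated assumptions, not the repository's actual job: the `::`-delimited review layout (`review_id::user_id::business_id::stars`), the class name `TopRatedSketch` and the method names are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class TopRatedSketch {

    // Map step: turn one review record into a (business_id, stars) pair.
    // Assumed (hypothetical) layout: review_id::user_id::business_id::stars
    static Map.Entry<String, Double> map(String line) {
        String[] fields = line.split("::");
        return Map.entry(fields[2], Double.parseDouble(fields[3]));
    }

    // Reduce step: average all ratings collected for one business.
    static double reduce(List<Double> ratings) {
        double sum = 0;
        for (double r : ratings) {
            sum += r;
        }
        return sum / ratings.size();
    }

    // Runs map, an in-memory "shuffle" (group by key), reduce, then keeps the n best.
    static List<Map.Entry<String, Double>> topN(List<String> reviewLines, int n) {
        Map<String, List<Double>> grouped = new HashMap<>();
        for (String line : reviewLines) {
            Map.Entry<String, Double> kv = map(line);
            grouped.computeIfAbsent(kv.getKey(), k -> new ArrayList<>()).add(kv.getValue());
        }

        List<Map.Entry<String, Double>> averages = new ArrayList<>();
        for (Map.Entry<String, List<Double>> e : grouped.entrySet()) {
            averages.add(Map.entry(e.getKey(), reduce(e.getValue())));
        }

        // Highest average first; ties broken by business id so the order is deterministic.
        averages.sort((a, b) -> {
            int byAverage = Double.compare(b.getValue(), a.getValue());
            return byAverage != 0 ? byAverage : a.getKey().compareTo(b.getKey());
        });
        return new ArrayList<>(averages.subList(0, Math.min(n, averages.size())));
    }

    public static void main(String[] args) {
        // Made-up sample records, only to exercise the sketch.
        List<String> reviews = Arrays.asList(
                "r1::u1::b1::5", "r2::u2::b1::4", "r3::u1::b2::3", "r4::u3::b3::5");
        for (Map.Entry<String, Double> e : topN(reviews, 10)) {
            System.out.println(e.getKey() + "\t" + e.getValue());
        }
    }
}
```

In a real Hadoop job the grouping is done by the framework between the mapper and the reducer, and the final top-10 selection is typically done in a single reducer or a follow-up job; the sketch folds all of that into one method for readability.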

The project derives statistics from the Yelp dataset using Hadoop, with multiple queries implemented using MapReduce and Spark.
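
One of the queries listed earlier, reporting the user id and rating for reviews of businesses in a given city, is essentially a join between business records and review records. The sketch below shows one common way to express it: collect the matching business ids into a small in-memory set, then filter the reviews against it, which is analogous to a map-side join. Everything specific here is an assumption for illustration: the `::`-delimited layouts (`business_id::full_address::categories` and `review_id::user_id::business_id::stars`), the class name `CityReviewJoinSketch` and the method names. The repository may well use a different join strategy.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class CityReviewJoinSketch {

    // Collect the ids of businesses whose address mentions the city.
    // Assumed (hypothetical) layout: business_id::full_address::categories
    static Set<String> businessIdsInCity(List<String> businessLines, String city) {
        Set<String> ids = new HashSet<>();
        for (String line : businessLines) {
            String[] fields = line.split("::");
            if (fields[1].contains(city)) {
                ids.add(fields[0]);
            }
        }
        return ids;
    }

    // Keep only reviews of those businesses and emit "user_id<TAB>stars".
    // Assumed (hypothetical) layout: review_id::user_id::business_id::stars
    static List<String> usersAndRatings(List<String> reviewLines, Set<String> businessIds) {
        List<String> out = new ArrayList<>();
        for (String line : reviewLines) {
            String[] fields = line.split("::");
            if (businessIds.contains(fields[2])) {
                out.add(fields[1] + "\t" + fields[3]);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up sample records, only to exercise the sketch.
        List<String> businesses = Arrays.asList(
                "b1::1 Main St, Palo Alto, CA::Cafes",
                "b2::9 Elm St, Austin, TX::Bars");
        List<String> reviews = Arrays.asList(
                "r1::u1::b1::5", "r2::u2::b2::4", "r3::u3::b1::2");

        Set<String> ids = businessIdsInCity(businesses, "Palo Alto");
        for (String row : usersAndRatings(reviews, ids)) {
            System.out.println(row);
        }
    }
}
```

This approach only works when the filtered business ids fit in memory on each mapper; when they do not, a reduce-side join keyed on the business id is the usual alternative.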

Languages

Java 100.0%