sushmithas99 / YOUTUBE-DATA-ANALYSIS-USING-HADOOP

YOUTUBE-DATA-ANALYSIS-USING-HADOOP

Introduction

  • Big Data refers to datasets that are extremely large and complex. The hard part is analyzing such data: how it can be captured, stored, searched, shared, analyzed, and visualized. Traditional database management tools cannot handle data at this scale. For this reason, Hadoop is one of the best technologies for processing these extremely large and complex datasets.

Objective

  • The main goal of this project is to show how to analyze a YouTube dataset in order to make targeted, well-informed decisions in real time. By analyzing YouTube data and extracting meaningful results, the project helps in understanding how viewers' trends change over time.

How to analyze YouTube data with Hadoop?

  • Hadoop stores the collected data in HDFS, and several applications available on the market can then be used to analyze it. Among them:
    Pig ------ MapReduce ------- Hive
    For this type of analysis, MapReduce is the most convenient choice.
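
As a rough illustration of what a MapReduce-style analysis looks like, the sketch below counts videos per category in plain Java. It does not use the Hadoop API and is not the project's actual job. It only mimics the map, shuffle, and reduce steps in memory. The class name `CategoryCountSketch`, the tab delimiter, and the position of the category field (fourth column) are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java sketch of the map -> shuffle -> reduce flow; no Hadoop classes are used.
final class CategoryCountSketch {

    // Assumption: fields are tab-separated and the category is column 4 (index 3).
    private static final String DELIMITER = "\t";
    private static final int CATEGORY_INDEX = 3;

    // "Map" step: emit (category, 1) for one input line, or nothing if the line is too short.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        String[] fields = line.split(DELIMITER, -1);
        if (fields.length > CATEGORY_INDEX && !fields[CATEGORY_INDEX].isEmpty()) {
            out.add(Map.entry(fields[CATEGORY_INDEX], 1));
        }
        return out;
    }

    // "Reduce" step: sum all values emitted for a single key.
    static int reduce(List<Integer> values) {
        int sum = 0;
        for (int value : values) {
            sum += value;
        }
        return sum;
    }

    // Driver: maps every line, groups the pairs by key (the "shuffle"), then reduces each group.
    static Map<String, Integer> run(List<String> lines) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String line : lines) {
            for (Map.Entry<String, Integer> pair : map(line)) {
                grouped.computeIfAbsent(pair.getKey(), key -> new ArrayList<>())
                       .add(pair.getValue());
            }
        }
        Map<String, Integer> result = new TreeMap<>();
        grouped.forEach((category, ones) -> result.put(category, reduce(ones)));
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample rows, for illustration only.
        List<String> lines = List.of(
            "aaaaaaaaaaa\tuserA\t1135\tMusic\t252\t1000\t12\t34\tUS",
            "bbbbbbbbbbb\tuserB\t1140\tComedy\t90\t500\t3\t8\tUS");
        System.out.println(run(lines));
    }
}
```

In a real Hadoop job the same two steps would live in a Mapper and a Reducer class, and the framework would perform the grouping across the cluster instead of the in-memory `TreeMap` used here.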

Description of Dataset

  • Column 1: Video ID, an 11-character string.
  • Column 2: Name of the uploader.
  • Column 3: Date of publication.
  • Column 4: Category of the video.
  • Column 5: Length of the video.
  • Column 6: Number of views for the video.
  • Column 7: Number of comments.
  • Column 8: Number of "likes".
  • Column 9: Countries.
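
The nine columns above can be read into a small value object before any analysis. The following is a minimal sketch, not code taken from the project. It assumes tab-separated fields and that the length, views, comments, and likes columns hold whole numbers. The names `YouTubeRecordParser` and `VideoRecord` are hypothetical, and the date column is kept as a plain string because its format is not specified here.

```java
import java.util.Optional;

// Hypothetical parser for one dataset line; the delimiter and numeric types are assumptions.
final class YouTubeRecordParser {

    // Simple holder mirroring the nine columns in the dataset description.
    static final class VideoRecord {
        final String videoId;
        final String uploader;
        final String published;   // kept as text: the date format is not specified
        final String category;
        final long length;        // unit not specified in the description
        final long views;
        final long comments;
        final long likes;
        final String countries;

        VideoRecord(String[] f, long length, long views, long comments, long likes) {
            this.videoId = f[0];
            this.uploader = f[1];
            this.published = f[2];
            this.category = f[3];
            this.length = length;
            this.views = views;
            this.comments = comments;
            this.likes = likes;
            this.countries = f[8];
        }
    }

    // Returns an empty Optional for rows that do not match the described layout.
    static Optional<VideoRecord> parse(String line) {
        String[] f = line.split("\t", -1);
        // Column 1 is described as an 11-character identifier.
        if (f.length != 9 || f[0].length() != 11) {
            return Optional.empty();
        }
        try {
            return Optional.of(new VideoRecord(f,
                Long.parseLong(f[4].trim()),
                Long.parseLong(f[5].trim()),
                Long.parseLong(f[6].trim()),
                Long.parseLong(f[7].trim())));
        } catch (NumberFormatException e) {
            // A non-numeric count means the row is malformed; skip it.
            return Optional.empty();
        }
    }
}
```

Skipping malformed rows instead of throwing keeps a long batch run from failing on a single bad line, which is a common choice when processing large raw datasets.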

Languages

Language:Java 100.0%