Fasust / stackexchange_data_analysis

An analysis of the Science Fiction section of stackexchange using the Hadoop MapReduce framework.

stackexchange_data_analysis

This project was part of the final assignment of the module “Big Data” at Høyskolen Kristiania.

We used the Hadoop MapReduce framework to analyse a large collection of data from the Stack Exchange forum on the topic “Science Fiction”. The analysis focused mainly on users and blog posts. This repository includes all the MapReduce jobs we created to fulfill the tasks outlined in the assignment.
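To give a rough idea of what such a job looks like, here is a minimal sketch of the map and reduce phases for counting posts per user. It is not taken from this repository: it uses plain Java collections instead of the Hadoop API so it runs without a cluster, and it assumes the input is one `<row ... />` element per line with an `OwnerUserId` attribute. The class name `PostsPerUserSketch` and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only: a plain-Java stand-in for a MapReduce job, not the repository's code.
class PostsPerUserSketch {

    // Assumption: each input line is one <row .../> element carrying an OwnerUserId attribute.
    private static final Pattern OWNER = Pattern.compile("\\bOwnerUserId=\"(\\d+)\"");

    // Map phase: emit (userId, 1) for every row that names an owner; emit nothing otherwise.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        Matcher m = OWNER.matcher(line);
        if (m.find()) {
            out.add(Map.entry(m.group(1), 1));
        }
        return out;
    }

    // Reduce phase: sum all values that were emitted for a single key.
    static int reduce(String userId, List<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    // Driver: groups the mapped pairs by key (the job of the shuffle step), then reduces each group.
    static Map<String, Integer> countPostsPerUser(List<String> lines) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String line : lines) {
            for (Map.Entry<String, Integer> kv : map(line)) {
                grouped.computeIfAbsent(kv.getKey(), k -> new ArrayList<>()).add(kv.getValue());
            }
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : grouped.entrySet()) {
            result.put(e.getKey(), reduce(e.getKey(), e.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample rows, for demonstration only.
        List<String> sample = List.of(
            "<row Id=\"1\" PostTypeId=\"1\" OwnerUserId=\"42\" />",
            "<row Id=\"2\" PostTypeId=\"2\" OwnerUserId=\"7\" />",
            "<row Id=\"3\" PostTypeId=\"2\" OwnerUserId=\"42\" />",
            "<row Id=\"4\" PostTypeId=\"1\" />");
        System.out.println(countPostsPerUser(sample));
    }
}
```

In a real Hadoop job, `map` and `reduce` would live in subclasses of `Mapper` and `Reducer`, and the framework would do the grouping that `countPostsPerUser` performs here by hand.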

The full documentation of the development process and an analysis of the results can be found here.


Languages

Java 95.6%, Python 4.0%, PigLatin 0.4%