jester94 / BDT

Big Data Technology

This repository contains the lab assignments for the Big Data Technology course, taken in Semester 6 of our B.Tech in Data Science (Business Analytics), 2018 - 2022.

Lab 1 - Map Reduce in Java - Using Eclipse on Cloudera, we counted the number of words in a paragraph.

Lab 2 - Map Reduce in Hive - Using Hive, we counted the number of words in the paragraph.

Lab 3 - Big Dask Tutorial - Examples comparing pandas and Dask.

Lab 4 - Pyspark implementation - Examples showing how PySpark is used.

Lab 5 - Movie Recommendations using Pyspark - A movie recommendation system built with PySpark.
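
For readers new to the word-count exercise in Labs 1 and 2, the following is a minimal sketch of the map, shuffle, and reduce steps in plain Python. It is not the code from the labs (those use Java on Cloudera and Hive); the function names `map_phase`, `shuffle_phase`, `reduce_phase`, and `word_count`, the punctuation handling, and the sample sentence are all illustrative assumptions.

```python
from collections import defaultdict


def map_phase(text):
    """Emit a (word, 1) pair for every whitespace-separated token."""
    for token in text.lower().split():
        # Assumption: strip common punctuation so "data," and "data" match.
        word = token.strip(".,;:!?\"'()")
        if word:
            yield word, 1


def shuffle_phase(pairs):
    """Group the emitted values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Sum the list of ones for each word to get its total count."""
    return {word: sum(counts) for word, counts in grouped.items()}


def word_count(text):
    """Run the three phases in sequence on a single string."""
    return reduce_phase(shuffle_phase(map_phase(text)))


if __name__ == "__main__":
    # Hypothetical sample input, not taken from the lab data.
    sample = "big data is big and data is everywhere"
    print(word_count(sample))
```

In a real Hadoop or Hive job the shuffle step is handled by the framework across machines; here it is written out only to make the data flow visible.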

Submitted By-

Vidhi Kapoor - J021 | Kartikay Laddha - J025
