yvgupta03 / Big_Data_Project_Movies_Search_Engine

Search engine for finding movies by their summaries, built with PySpark Map-Reduce and ranking results by tf-idf and cosine similarity.

Big_Data_Movies_Search_Engine

Search engine that matches a query against movie summaries and ranks the movies by cosine similarity, implemented with PySpark Map-Reduce.
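
The ranking idea can be illustrated with a minimal sketch. This is not the code from the project notebooks: it uses plain Python instead of PySpark so it runs without a cluster, the corpus `SUMMARIES` is a made-up toy example, and the helper names (`map_term_counts`, `reduce_by_key`, `build_tfidf`, `cosine`, `search`) are hypothetical. The idf formula `log(N / df)` is one common variant and is an assumption here.

```python
import math
from collections import Counter

# Hypothetical toy corpus: movie id -> plot summary (not the project's dataset).
SUMMARIES = {
    "m1": "a young wizard attends a school of magic",
    "m2": "a detective hunts a killer in the city",
    "m3": "a wizard and a detective fight dark magic",
}


def tokenize(text):
    """Lowercase and split on whitespace (no stop-word removal or stemming)."""
    return text.lower().split()


def map_term_counts(doc_id, text):
    """Map step: emit ((term, doc_id), 1) for every token in a summary."""
    return [((term, doc_id), 1) for term in tokenize(text)]


def reduce_by_key(pairs):
    """Reduce step: sum the values per key, similar in spirit to reduceByKey."""
    totals = Counter()
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def build_tfidf(summaries):
    """Return (per-document tf-idf vectors, idf per term)."""
    pairs = [p for doc_id, text in summaries.items()
             for p in map_term_counts(doc_id, text)]
    tf = reduce_by_key(pairs)                               # (term, doc) -> count
    df = reduce_by_key((term, 1) for (term, _doc) in tf)    # term -> number of docs
    n_docs = len(summaries)
    idf = {term: math.log(n_docs / count) for term, count in df.items()}
    vectors = {doc_id: {} for doc_id in summaries}
    for (term, doc_id), count in tf.items():
        vectors[doc_id][term] = count * idf[term]
    return vectors, idf


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def search(query, summaries, top_k=3):
    """Rank documents by cosine similarity between query and summary vectors."""
    vectors, idf = build_tfidf(summaries)
    query_counts = Counter(tokenize(query))
    # Query terms that never occur in the corpus are ignored.
    query_vec = {t: c * idf[t] for t, c in query_counts.items() if t in idf}
    scores = [(doc_id, cosine(query_vec, vec)) for doc_id, vec in vectors.items()]
    return sorted(scores, key=lambda item: item[1], reverse=True)[:top_k]


if __name__ == "__main__":
    for doc_id, score in search("wizard magic", SUMMARIES):
        print(doc_id, round(score, 3))
```

In a PySpark version, the map and reduce helpers would typically correspond to RDD transformations such as `flatMap` and `reduceByKey`; how the notebooks actually structure these steps is not shown here.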

The project was developed on a Databricks cluster. The input datasets and files used in the code are stored as tables on the Databricks cluster so that the notebooks can access them. To run the code, upload the notebooks to your Databricks account and run them on a cluster.

Languages

Language:Jupyter Notebook 100.0%