kaushikj / Inverted-Index-And-full-text-search-Implementation-with-Pyspark-and-mysql

This project is an implementation of full-text search with an inverted index stored in a MySQL database. It is implemented using PySpark and was tested in Google Colab.

Inverted-Index-And-full-text-search-Implementation-with-Pyspark-and-mysql

This project is an implementation of full-text search with an inverted index stored in a MySQL database. It is implemented using PySpark and was tested in Google Colab.

The entire project is divided into 6 steps:
-Step 0 - Set up the environment and import packages
-Step 1 - Create the inverted index and document magnitudes and store them in a file
-Step 2 - Store them in a MySQL database
-Step 3 - Look up the inverted index and get metrics
-Step 4 - Calculate cosine similarity
-Step 5 - Rank the documents
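
The data flow behind Steps 1 and 3 to 5 can be sketched in a few lines. This is only an illustration, not the notebook's code: it uses plain Python instead of PySpark so the structures are easy to follow, it skips the MySQL storage of Step 2, and it assumes raw term frequencies as the vector weights. The function names (`tokenize`, `build_index`, `rank`) and the sample documents are hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on non-alphanumeric characters (an assumed tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Step 1: build term -> {doc_id: term frequency} and doc_id -> vector magnitude."""
    index = defaultdict(dict)
    magnitude = {}
    for doc_id, text in docs.items():
        counts = Counter(tokenize(text))
        for term, tf in counts.items():
            index[term][doc_id] = tf
        # Magnitude of the document's term-frequency vector.
        magnitude[doc_id] = math.sqrt(sum(tf * tf for tf in counts.values()))
    return dict(index), magnitude


def rank(query, index, magnitude):
    """Steps 3-5: look up postings, compute cosine similarity, sort by score."""
    q_counts = Counter(tokenize(query))
    q_mag = math.sqrt(sum(tf * tf for tf in q_counts.values()))
    dots = defaultdict(float)
    for term, q_tf in q_counts.items():
        # Only documents that contain the term contribute to the dot product.
        for doc_id, d_tf in index.get(term, {}).items():
            dots[doc_id] += q_tf * d_tf
    scores = {d: dot / (q_mag * magnitude[d]) for d, dot in dots.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical sample documents, not taken from input_docs.zip.
docs = {
    "d1": "spark builds the inverted index",
    "d2": "mysql stores the index",
    "d3": "cosine similarity ranks documents",
}
index, magnitude = build_index(docs)
results = rank("inverted index", index, magnitude)
```

Here `results` lists `d1` ahead of `d2`, and `d3` is absent because it shares no term with the query. Precomputing the document magnitudes in Step 1 means a query only needs the postings of its own terms to compute each score.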

#Steps to run the code
Unzip "input_docs.zip" and run the .ipynb notebook.
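
If a shell `unzip` is not available, the archive can also be extracted from a notebook cell with Python's standard `zipfile` module. This is a minimal sketch: the helper name `extract_docs` and the `input_docs` target folder are assumptions, so adjust the path to whatever the notebook expects.

```python
import zipfile


def extract_docs(archive="input_docs.zip", target="input_docs"):
    """Extract the archive into `target` and return the names of its members."""
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(target)
        return zf.namelist()


# Example call (assumes input_docs.zip is in the working directory):
# extract_docs()
```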

Languages

Language:Jupyter Notebook 100.0%