pwdz / Persian-Search-Engine

Persian-Search-Engine

Processing Persian News:

  • Preprocessing data

    • Removing punctuation
    • Lemmatizing
    • Normalizing
    • Tokenizing
  • Building inverted index

  • Building champion lists
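
The indexing steps above can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the repository's actual code: the names `preprocess`, `lemmatize`, `build_index`, and `build_champions`, the champion list size `r`, and the choice to rank champions by raw term frequency are all assumptions. Lemmatization is left as a pass-through hook, since it needs a Persian morphological resource.

```python
import re
from collections import Counter, defaultdict

# Assumption: normalization maps Arabic yeh/kaf to their Persian forms.
CHAR_MAP = str.maketrans({"\u064a": "\u06cc", "\u0643": "\u06a9"})
# Anything that is not a word character, whitespace, or ZWNJ (the Persian
# half-space, U+200C) is treated as punctuation.
PUNCT = re.compile(r"[^\w\s\u200c]")


def lemmatize(token):
    # Hypothetical hook: a real system would map the token to its lemma here.
    return token


def preprocess(text):
    """Normalize, strip punctuation, tokenize on whitespace, lemmatize."""
    text = text.translate(CHAR_MAP)
    text = PUNCT.sub(" ", text)
    return [lemmatize(token) for token in text.split()]


def build_index(docs):
    """docs: {doc_id: raw text} -> {term: {doc_id: term frequency}}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, tf in Counter(preprocess(text)).items():
            index[term][doc_id] = tf
    return index


def build_champions(index, r=2):
    """For each term, keep the r doc ids with the highest term frequency."""
    return {
        term: sorted(postings, key=lambda d: (-postings[d], d))[:r]
        for term, postings in index.items()
    }
```

At query time, a champion list lets the engine score only the `r` most promising documents per query term instead of every posting, trading some recall for speed.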

User Query & Finding Results:

  • Cosine similarity
  • Showing results ranked from most to least similar using a heap
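
A rough sketch of the query side is shown below, assuming documents are already tokenized. The README names only cosine similarity and a heap, so the log-scaled tf-idf weighting, the function names (`tfidf_vectors`, `cosine`, `search`), and the cutoff `k` are assumptions made for illustration.

```python
import heapq
import math
from collections import Counter


def tfidf_vectors(docs):
    """docs: {doc_id: list of tokens} -> (sparse tf-idf vectors, idf table)."""
    n = len(docs)
    df = Counter(term for tokens in docs.values() for term in set(tokens))
    idf = {term: math.log10(n / count) for term, count in df.items()}
    vectors = {
        doc_id: {t: (1 + math.log10(tf)) * idf[t]
                 for t, tf in Counter(tokens).items()}
        for doc_id, tokens in docs.items()
    }
    return vectors, idf


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm = (math.sqrt(sum(w * w for w in u.values()))
            * math.sqrt(sum(w * w for w in v.values())))
    return dot / norm if norm else 0.0


def search(query_tokens, vectors, idf, k=3):
    """Return up to k (doc_id, score) pairs, most similar first."""
    # Query terms that never occur in the collection are ignored.
    q = {t: (1 + math.log10(tf)) * idf[t]
         for t, tf in Counter(query_tokens).items() if t in idf}
    # heapq is a min-heap, so scores are negated to pop the best match first.
    heap = [(-cosine(q, vec), doc_id) for doc_id, vec in vectors.items()]
    heapq.heapify(heap)
    results = []
    while heap and len(results) < k:
        neg_score, doc_id = heapq.heappop(heap)
        if neg_score == 0:  # every remaining document has zero similarity
            break
        results.append((doc_id, -neg_score))
    return results
```

Building the heap costs O(n) and each pop costs O(log n), so returning the top `k` results avoids fully sorting all scored documents.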

Languages

Language: Python 100.0%