yadnyi / VectorSpaceModel

The vector space model is an algebraic model for representing text documents as vectors; cosine similarity is used to compute the similarity between documents and queries.

VectorSpaceModel

The vector space model is an algebraic model that represents text documents as vectors. Cosine similarity is used to compute the similarity between documents and queries, and results are filtered with a threshold of alpha = 0.005. The model is used in information retrieval, indexing, and relevancy ranking.
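The ranking described above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the function names (`tf_idf_vectors`, `cosine`, `search`), the log-scaled TF-IDF weighting, and the strict `>` comparison against alpha are all assumptions made for the example.

```python
import math
from collections import Counter

ALPHA = 0.005  # threshold from the README; using a strict ">" is an assumption


def tf_idf_vectors(token_lists):
    """Build one sparse TF-IDF vector (dict: term -> weight) per document.

    The weighting (1 + log10(tf)) * log10(N / df) is an assumed scheme.
    """
    n_docs = len(token_lists)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in token_lists for term in set(tokens))
    idf = {term: math.log10(n_docs / count) for term, count in df.items()}
    vectors = []
    for tokens in token_lists:
        tf = Counter(tokens)
        vectors.append({t: (1 + math.log10(c)) * idf[t] for t, c in tf.items()})
    return vectors, idf


def cosine(u, v):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def search(query_tokens, doc_vectors, idf, alpha=ALPHA):
    """Return (doc_index, score) pairs scoring above alpha, best first."""
    # Query terms that never occur in the collection are ignored.
    tf = Counter(t for t in query_tokens if t in idf)
    q = {t: (1 + math.log10(c)) * idf[t] for t, c in tf.items()}
    scored = [(i, cosine(q, d)) for i, d in enumerate(doc_vectors)]
    hits = [(i, s) for i, s in scored if s > alpha]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Toy collection standing in for the short stories (invented for illustration).
docs = [["king", "queen", "castle"], ["love", "hate", "love"], ["water", "travel"]]
doc_vectors, idf = tf_idf_vectors(docs)
results = search(["king", "queen"], doc_vectors, idf)
```

In this toy run only the first document shares terms with the query, so it is the only one that survives the alpha filter; a query made entirely of unseen terms returns an empty list.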

Files in this project include:

  1. ShortStories (50)
  2. Stop-words list
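A stop-words list is typically applied while tokenizing each story and each query. The sketch below shows one plausible way to do that; the `tokenize` helper, the regular expression, and the tiny `STOP_WORDS` set are hypothetical stand-ins, not the project's actual list or code.

```python
import re

# Illustrative subset only; the project ships its own stop-words list.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in"}


def tokenize(text, stop_words=STOP_WORDS):
    """Lowercase the text, split it into alphanumeric words, drop stop words."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return [w for w in words if w not in stop_words]


tokens = tokenize("The King and the Queen!")
```

Applying the same function to documents and queries keeps both in the same vocabulary, which matters when their vectors are compared.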

Sample Queries:

  1. q= "love hate"

    4 documents

    43.txt - 0.03291

    6.txt - 0.01589

    1.txt - 0.00635

    9.txt - 0.00531

  2. q= "lodie"

    1 document

    24.txt - 0.12751

  3. q= "travel water"

    3 documents

    21.txt - 0.02184

    19.txt - 0.01530

    11.txt - 0.01274

  4. q= "king queen"

    8 documents

    31.txt - 0.25304

    7.txt - 0.01565

    34.txt - 0.01546

    43.txt - 0.01163

    49.txt - 0.00764

    40.txt - 0.00744

    25.txt - 0.00667

    40.txt - 0.00614

Languages

Language:Jupyter Notebook 100.0%