Houssem-Ousji / Text-Analyzer-using-NLTK

Text-Analyzer-using-NLTK

This is a Python script (a class assignment) made up of two parts:

The indexing phase:

  • word tokenization
  • line tokenization
  • stop-word removal
  • word stemming
  • word lemmatisation
  • word labeling (part-of-speech tagging)
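
The indexing steps can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the function name `index_text` and the tiny `STOP_WORDS` set are hypothetical, and only NLTK components that need no downloaded data are used. Lemmatization (`WordNetLemmatizer`) and labeling (`nltk.pos_tag`) are left out because they require NLTK data packages to be installed first.

```python
from nltk.stem import PorterStemmer
from nltk.tokenize import LineTokenizer, RegexpTokenizer

# Hypothetical, deliberately tiny stop-word list (an assumption for this
# sketch); a real script would more likely use NLTK's stopwords corpus.
STOP_WORDS = {"a", "an", "and", "are", "in", "is", "of", "the", "to"}


def index_text(text):
    """Return one list of stemmed, stop-word-free tokens per non-blank line."""
    line_tokenizer = LineTokenizer()          # line tokenization
    word_tokenizer = RegexpTokenizer(r"\w+")  # word tokenization
    stemmer = PorterStemmer()                 # stemming

    indexed = []
    for line in line_tokenizer.tokenize(text):
        # Lowercase so that stop-word matching is case-insensitive.
        words = [w.lower() for w in word_tokenizer.tokenize(line)]
        kept = [w for w in words if w not in STOP_WORDS]
        indexed.append([stemmer.stem(w) for w in kept])
    return indexed


if __name__ == "__main__":
    print(index_text("The cats are running\nin the garden"))
```

Keeping one token list per line preserves the result of line tokenization, so later steps can still tell which line a word came from.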

The search phase:

  • getting the list of documents containing a given word
  • getting the number of occurrences of a given word in each returned document
  • getting the weight of a given word in each returned document
  • getting the tf-idf of a given word in each returned document
  • getting the most relevant document for a given word
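
The search steps might look like the sketch below. It rests on several assumptions that the description does not confirm: the "weight" of a word is taken to be its relative frequency in the document, idf is computed as log(N / df), and the most relevant document is the one with the highest tf-idf. The names `search` and `most_relevant` are hypothetical.

```python
import math
from collections import Counter


def search(word, docs):
    """Look up `word` in `docs`, a dict mapping document names to token lists.

    Returns, for each document containing the word, its occurrence count,
    its weight (assumed here to be count / document length), and its tf-idf.
    """
    counts = {name: Counter(tokens)[word] for name, tokens in docs.items()}
    hits = {name: count for name, count in counts.items() if count > 0}
    if not hits:
        return {}

    # Assumed idf variant: log(total documents / documents containing the word).
    idf = math.log(len(docs) / len(hits))

    results = {}
    for name, count in hits.items():
        weight = count / len(docs[name])
        results[name] = {
            "occurrences": count,
            "weight": weight,
            "tf_idf": weight * idf,
        }
    return results


def most_relevant(word, docs):
    """Return the name of the document with the highest tf-idf, or None."""
    results = search(word, docs)
    if not results:
        return None
    return max(results, key=lambda name: results[name]["tf_idf"])


if __name__ == "__main__":
    corpus = {
        "a.txt": ["cat", "dog", "cat"],
        "b.txt": ["dog", "bird"],
        "c.txt": ["fish"],
    }
    print(search("dog", corpus))
    print(most_relevant("dog", corpus))
```

With this idf variant, a word that appears in every document gets a tf-idf of zero everywhere, so ranking such a word falls back to whichever document `max` encounters first.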
