Rajspeaks / Machine-Learning-approach-to-Bengali-POS-Tagging-using-NLTK

Bengali POS tagging using the Indian corpus through NLTK. A sample exercise in applying POS tagging, carried out under the supervision of Prof. Sandipan Ganguly, HIT-K.

Machine Learning approach to Bengali POS Tagging using NLTK on Indian-Corpus

The Indian corpus is a collection of language data for four Indian languages: Bengali, Hindi, Marathi, and Telugu. NLTK is the Natural Language Toolkit, a Python library.

Methodology

  • Imported NLTK (Natural Language Toolkit).
  • Imported the Indian corpus from NLTK.
  • Selected the Bengali file 'bangla.pos' from the Indian corpus.
  • Stored 'bangla.pos' in the variable 'tagged_set'.
  • Stored the Bengali sentences from that file in the variable 'word_set'.
  • Used a for loop to count the number of sentences in the corpus.
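
The steps above can be sketched roughly as follows. This is a minimal illustration and not the notebook's actual code: the function names `load_bengali_corpus` and `count_sentences` are hypothetical, and it assumes that 'tagged_set' holds the file identifier 'bangla.pos' and that 'word_set' is obtained with `indian.sents`.

```python
def load_bengali_corpus(fileid="bangla.pos"):
    """Return (tagged_set, word_set) for the Bengali file of NLTK's Indian corpus.

    Hypothetical helper; the variable names follow the methodology steps.
    """
    # Imported here so the counting helper below can be used without NLTK.
    import nltk
    from nltk.corpus import indian

    # Fetch the corpus data on first use (requires network access once).
    nltk.download("indian", quiet=True)

    tagged_set = fileid                   # assumed: the file identifier itself
    word_set = indian.sents(tagged_set)   # Bengali sentences as lists of tokens
    return tagged_set, word_set


def count_sentences(sentences):
    """Count sentences with an explicit for loop, mirroring the last step."""
    count = 0
    for _ in sentences:
        count += 1
    return count
```

In a Colab or Jupyter cell, `tagged_set, word_set = load_bengali_corpus()` followed by `print(count_sentences(word_set))` would print the number of Bengali sentences. The tagged version of the same sentences should be available through `indian.tagged_sents(tagged_set)`.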

Tools & Library requirements:

  • Google Colab/Jupyter
  • Language: Python
  • NLTK Library

Mentor:

Prof. Sandipan Ganguly

Developer:

Rajdeep Das

Reference:

Click here to read the source article.

License: GNU General Public License v3.0

