jaissugam / NLP-for-Nepali-Language

This project covers Natural Language Processing for the Nepali language. It implements two tasks: text summarization and word segmentation.

Text Summarization

→ Text summarization uses the TF-IDF technique.
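
As a rough illustration of TF-IDF-based extractive summarization, here is a minimal sketch. It is not taken from the project's notebooks. The function names `split_sentences` and `summarize`, the parameter `k`, whitespace tokenization, and the choice to treat each sentence as one document are all assumptions made for the example.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    # Split on the Devanagari danda (।), question mark, or exclamation mark.
    parts = re.split(r"[।?!]", text)
    return [p.strip() for p in parts if p.strip()]


def summarize(text, k=2):
    """Return the k highest-scoring sentences, kept in their original order."""
    sentences = split_sentences(text)
    tokenized = [s.split() for s in sentences]  # assumption: whitespace tokens
    n = len(sentences)

    # Document frequency: number of sentences that contain each word.
    df = Counter(word for tokens in tokenized for word in set(tokens))

    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Sentence score: sum over distinct words of
        # (relative term frequency) * log(n / document frequency).
        score = sum(
            (count / len(tokens)) * math.log(n / df[word])
            for word, count in tf.items()
        )
        scores.append(score)

    # Pick the k best sentences, then restore document order.
    top = sorted(range(n), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]
```

Calling `summarize(text, k=3)` on a Nepali paragraph would return its three highest-scoring sentences. Words that appear in every sentence get an IDF of zero here, so sentences made up only of very common words score lowest.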

Word Segmentation

→ The dataset includes the most common words used in the Nepali language, covering various parts of speech (POS).
→ The project implements verbal inflections.
→ Non-verbal words from the dataset serve as the dictionary for the dictionary-based algorithms.
→ For the verbal part, the most common Nepali verbs were studied and categorized into types. Each type has its own rules for adding prefixes and suffixes.
→ A finite-state automaton (FSA) is implemented for morphological recognition.
→ Various orthographic rules are also taken into account in the FSA implementation.
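
The dictionary-based step can be illustrated with forward maximum matching, one common dictionary-based segmentation technique. The README does not say which algorithm the project uses, so this is an assumption, as are the function name `segment` and the tiny dictionary in the usage note.

```python
def segment(text, dictionary, max_len=None):
    """Split unspaced text into words using forward maximum matching."""
    if max_len is None:
        # Longest dictionary entry bounds the candidate window.
        max_len = max((len(w) for w in dictionary), default=1)

    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking until a word matches.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if candidate in dictionary:
                words.append(candidate)
                i += size
                break
        else:
            # No dictionary word starts here: emit one code point as an
            # unknown token. For Devanagari this can split a syllable, so a
            # real implementation would likely step by grapheme cluster.
            words.append(text[i])
            i += 1
    return words
```

For example, `segment("मघर", {"म", "घर"})` splits the input into the two dictionary entries. The greedy longest-match choice is simple but can pick a wrong split when a long entry overlaps two shorter words.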

Summary of Regular Verbs

[Image: summary table of regular verbs]
Apart from these regular verbs, the two most common irregular verbs are also implemented in the program.

Finite State Automata (FSA) for Morphological Recognition of Regular Verb Conjugations

[Image: FSA diagram for regular verb conjugations]
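
As a rough illustration of how an FSA can recognize a conjugated verb as a stem followed by a suffix, here is a toy recognizer. It does not reproduce the project's automaton. The state names, the romanized stems and suffixes in `TRANSITIONS`, and the function `recognize` are hypothetical placeholders, and no prefixes or orthographic rules are modeled.

```python
# Hypothetical transition table: state -> list of (morpheme, next state).
# The romanized stems and suffixes are illustrative placeholders only.
TRANSITIONS = {
    "START": [("gar", "STEM"), ("bas", "STEM")],
    "STEM": [("chu", "END"), ("cha", "END"), ("yo", "END")],
}
ACCEPTING = {"END"}


def recognize(word, state="START"):
    """Return the list of morphemes if the FSA accepts `word`, else None."""
    if not word:
        # Input consumed: accept only if we stopped in an accepting state.
        return [] if state in ACCEPTING else None
    for morpheme, next_state in TRANSITIONS.get(state, []):
        if word.startswith(morpheme):
            rest = recognize(word[len(morpheme):], next_state)
            if rest is not None:
                return [morpheme] + rest
    return None
```

With this table, `recognize("garchu")` yields the stem and suffix as separate morphemes, while a bare stem or a word with trailing unknown characters is rejected. Extending the table with more states is one way verb types with different affix rules could be encoded.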

Languages

Language:Jupyter Notebook 100.0%