Pruthwik / CRF-Based-Malayalam-POS-Tagger

How to run the POS Tagger

sh run_malayalam_pos_model_and_save_to_file.sh input_file_path output_file_path pos_model_path
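
If you would rather call the script from Python, the invocation above could be wrapped roughly as follows. This is only a minimal sketch: the helper names `build_command` and `run_tagger` are hypothetical and not part of this repository, and the sketch assumes the script sits in the current working directory.

```python
import subprocess

# Script name taken from the usage line above.
SCRIPT = "run_malayalam_pos_model_and_save_to_file.sh"


def build_command(input_path, output_path, model_path, script=SCRIPT):
    """Return the argument list in the order the script expects."""
    return ["sh", script, str(input_path), str(output_path), str(model_path)]


def run_tagger(input_path, output_path, model_path):
    """Run the tagger script; raises CalledProcessError on a non-zero exit."""
    subprocess.run(build_command(input_path, output_path, model_path), check=True)
```

Passing the arguments as a list avoids shell quoting problems when a path contains spaces.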

Please install the CRF++ toolkit before running the script. Installation details are available on the following website:

https://taku910.github.io/crfpp/
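
To confirm that CRF++ is installed, one option is to check that its `crf_test` binary is on your `PATH`. The sketch below does this with the standard library. `crf_test` is the decoding tool shipped with CRF++; that the shell script calls it directly is an assumption, and the helper name `find_executable` is hypothetical.

```python
import shutil


def find_executable(name="crf_test"):
    """Return the full path of `name` if it is on PATH, else None."""
    return shutil.which(name)


if __name__ == "__main__":
    path = find_executable()
    if path is None:
        print("crf_test not found on PATH; install CRF++ first.")
    else:
        print(f"Found crf_test at {path}")
```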

For example, you can run the following command:

sh run_malayalam_pos_model_and_save_to_file.sh malayalam_input_raw_sentences.txt malayalam_pos_conll.txt malayalam-pos-model.m
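
The output file name suggests CoNLL-style output. Assuming that means one token per line with the predicted tag in the last column and a blank line between sentences (an assumption, not something this README states), the result could be read back like this. The function name `read_conll` is hypothetical.

```python
def read_conll(lines):
    """Group (token, tag) pairs into sentences.

    Assumes whitespace-separated columns with the token first and the
    tag last, and blank lines between sentences.
    """
    sentences, current = [], []
    for line in lines:
        columns = line.split()
        if not columns:
            # A blank line closes the current sentence, if there is one.
            if current:
                sentences.append(current)
                current = []
            continue
        current.append((columns[0], columns[-1]))
    if current:
        # The file may not end with a blank line.
        sentences.append(current)
    return sentences
```

For instance, opening `malayalam_pos_conll.txt` with `encoding="utf-8"` and passing the file object to `read_conll` would return a list of sentences, each a list of `(token, tag)` tuples. Check the actual output file before relying on this column layout.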

Make sure that the sentences in the input file are tokenized.

For a Malayalam tokenizer, see the following repository:

https://github.com/Pruthwik/Tokenizer_for_Indian_Languages
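
As a quick sanity check before tagging, a heuristic like the one below can flag sentences that still look untokenized. It assumes one sentence per line with tokens separated by spaces, which this README does not state explicitly, and it only looks for ASCII punctuation attached to the start or end of a token. It is a rough sketch with a hypothetical function name, not a substitute for the tokenizer above.

```python
import string

PUNCT = set(string.punctuation)


def untokenized_tokens(sentence):
    """Return tokens that still have punctuation attached at either end."""
    flagged = []
    for tok in sentence.split():
        if len(tok) < 2:
            continue  # a single character cannot have anything attached
        if all(ch in PUNCT for ch in tok):
            continue  # pure punctuation such as "..." is left alone
        if tok[0] in PUNCT or tok[-1] in PUNCT:
            flagged.append(tok)
    return flagged
```

An empty list means the heuristic found nothing suspicious in that sentence; a non-empty list names the tokens worth inspecting.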

About

License: MIT License


Languages

Python 88.2%, Shell 11.8%