POS-Tagger

Part-of-speech tagger using a Hidden Markov Model and the Viterbi decoding algorithm.

This is a Hidden Markov Model part-of-speech tagger for Catalan. The training data is provided tokenized and tagged; the test data is provided tokenized only, and the tagger adds the tags.
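As a rough illustration of how an HMM tagger with Viterbi decoding can work, here is a minimal sketch. It is not taken from this repository: the function names, the `<s>`/`</s>` boundary markers, the add-one smoothing, and the toy DET/NOUN/VERB tags are all assumptions made for the example, and the real Catalan tag set will differ.

```python
import math
from collections import Counter, defaultdict

START, END = "<s>", "</s>"  # hypothetical sentence-boundary markers


def train_hmm(tagged_sentences):
    """Count tag-to-tag transitions and tag-to-word emissions."""
    transitions = defaultdict(Counter)
    emissions = defaultdict(Counter)
    for sentence in tagged_sentences:
        prev = START
        for word, tag in sentence:
            transitions[prev][tag] += 1
            emissions[tag][word] += 1
            prev = tag
        transitions[prev][END] += 1
    return transitions, emissions


def log_prob(counter, key, n_outcomes):
    """Add-one smoothed log probability (the smoothing choice is an assumption)."""
    return math.log((counter[key] + 1) / (sum(counter.values()) + n_outcomes))


def viterbi(words, transitions, emissions):
    """Return the most likely tag sequence for a list of words."""
    if not words:
        return []
    tags = list(emissions)
    n_tags = len(tags) + 1  # +1 so END counts as a possible next state
    vocab = {w for counts in emissions.values() for w in counts}
    n_words = len(vocab) + 1  # +1 reserves probability mass for unknown words

    # best[i][tag] = (log score of the best path ending in tag at i, previous tag)
    best = [{
        tag: (log_prob(transitions[START], tag, n_tags)
              + log_prob(emissions[tag], words[0], n_words), None)
        for tag in tags
    }]
    for i in range(1, len(words)):
        column = {}
        for tag in tags:
            emit = log_prob(emissions[tag], words[i], n_words)
            column[tag] = max(
                (best[i - 1][prev][0]
                 + log_prob(transitions[prev], tag, n_tags) + emit, prev)
                for prev in tags
            )
        best.append(column)

    # Pick the final tag, including the transition into END, then backtrack.
    last = max(tags, key=lambda t: best[-1][t][0]
               + log_prob(transitions[t], END, n_tags))
    path = [last]
    for i in range(len(words) - 1, 0, -1):
        path.append(best[i][path[-1]][1])
    path.reverse()
    return path


# Toy training data with made-up tags, for illustration only.
toy_corpus = [
    [("el", "DET"), ("gat", "NOUN"), ("dorm", "VERB")],
    [("la", "DET"), ("gata", "NOUN"), ("menja", "VERB")],
]
toy_transitions, toy_emissions = train_hmm(toy_corpus)
toy_tags = viterbi(["el", "gata", "dorm"], toy_transitions, toy_emissions)
```

Working in log space avoids numerical underflow on long sentences, and the smoothing keeps unseen words and unseen tag pairs from zeroing out an entire path.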

The data set consists of three files. The first contains tagged training data in the word/TAG format, with words separated by spaces and each sentence on a new line. The second contains untagged development data, with words separated by spaces and each sentence on a new line. The third contains the same development data tagged in the word/TAG format, again with one sentence per line, to serve as an answer key.
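The word/TAG format described above can be read and written with a few lines of Python. The sketch below is illustrative only: the helper names are hypothetical, the example tags are made up, and it assumes the tag follows the last slash in each token so that words containing a slash stay intact.

```python
def parse_tagged_line(line):
    """Split one 'word/TAG word/TAG ...' sentence into (word, tag) pairs."""
    pairs = []
    for token in line.split():
        # Assumption: the tag follows the LAST slash, so a word that
        # itself contains '/' is not cut in the wrong place.
        word, _, tag = token.rpartition("/")
        pairs.append((word, tag))
    return pairs


def format_tagged_line(pairs):
    """Join (word, tag) pairs back into a single word/TAG line."""
    return " ".join(f"{word}/{tag}" for word, tag in pairs)


def tagging_accuracy(predicted, gold):
    """Fraction of tokens whose predicted tag matches the answer key."""
    total = correct = 0
    for pred_sentence, gold_sentence in zip(predicted, gold):
        for (_, pred_tag), (_, gold_tag) in zip(pred_sentence, gold_sentence):
            total += 1
            correct += pred_tag == gold_tag
    return correct / total if total else 0.0


# Made-up example line; the tags are placeholders, not the real tag set.
example_pairs = parse_tagged_line("El/DET gat/NOUN dorm/VERB")
```

Comparing the tagger's output on the untagged development file with the tagged answer key through a function like `tagging_accuracy` gives a per-token accuracy figure.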
