jungyeul / sjmorph

sjmorph

Park J & Tyers F. A New Annotation Scheme for the Sejong Part-of-speech Tagged Corpus. In: Proceedings of the 13th Linguistic Annotation Workshop. Florence, Italy: Association for Computational Linguistics; 2019:195-202. https://www.aclweb.org/anthology/W19-4022.

The sjmorph.model file for UDPipe (http://ufal.mff.cuni.cz/udpipe) is available at https://doi.org/10.5281/zenodo.3236528.
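A minimal sketch of how the model might be loaded from Python, assuming the `ufal.udpipe` bindings are installed and the model file has been downloaded locally as `sjmorph.model` (both are assumptions, not stated in this README). The helper names `parse_conllu` and `tag_text` are hypothetical. The parser step is disabled because this README only reports segmentation and tagging results; which CoNLL-U column carries the Sejong tags is not stated here, so both UPOS and XPOS are returned.

```python
def parse_conllu(conllu_text):
    """Return one list of (form, upos, xpos) tuples per sentence."""
    sentences, current = [], []
    for line in conllu_text.splitlines():
        if not line.strip():
            # A blank line closes the current sentence.
            if current:
                sentences.append(current)
                current = []
            continue
        if line.startswith("#"):
            continue  # sentence-level comment
        cols = line.split("\t")
        # Skip multiword-token ranges (e.g. 1-2) and empty nodes (e.g. 1.1).
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        current.append((cols[1], cols[3], cols[4]))
    if current:
        sentences.append(current)
    return sentences


def tag_text(text, model_path="sjmorph.model"):
    """Tokenize and tag raw text with a UDPipe model (path is an assumption)."""
    # Imported lazily so parse_conllu works without the bindings installed.
    from ufal.udpipe import Model, Pipeline, ProcessingError

    model = Model.load(model_path)
    if model is None:
        raise RuntimeError(f"cannot load UDPipe model from {model_path}")
    # Tokenizer on, default tagger, no parser, CoNLL-U output.
    pipeline = Pipeline(model, "tokenize", Pipeline.DEFAULT, Pipeline.NONE, "conllu")
    error = ProcessingError()
    output = pipeline.process(text, error)
    if error.occurred():
        raise RuntimeError(error.message)
    return parse_conllu(output)
```

Calling `tag_text("...")` on a Korean sentence should then return one list of `(form, upos, xpos)` tuples per sentence, provided the model file is present.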

We obtain an F1 score of 99.88% for segmentation and an accuracy of 94.75% for POS tagging for Sejong tag sets.
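For readers unfamiliar with these two metrics, the sketch below shows one common way to compute them. It is an illustration only: this README does not describe the exact evaluation procedure, so the span-based F1 definition, the function names, and the assumption that gold and predicted segmentations concatenate to the same character string are all ours.

```python
def token_spans(tokens):
    """Map a token sequence to a set of (start, end) character offsets."""
    spans, pos = set(), 0
    for tok in tokens:
        spans.add((pos, pos + len(tok)))
        pos += len(tok)
    return spans


def segmentation_f1(gold_tokens, pred_tokens):
    """F1 over token spans: a predicted token counts only if both boundaries match."""
    gold, pred = token_spans(gold_tokens), token_spans(pred_tokens)
    correct = len(gold & pred)
    if correct == 0:
        return 0.0
    precision = correct / len(pred)
    recall = correct / len(gold)
    return 2 * precision * recall / (precision + recall)


def tagging_accuracy(gold_tags, pred_tags):
    """Fraction of aligned tokens whose predicted tag equals the gold tag."""
    if len(gold_tags) != len(pred_tags):
        raise ValueError("tag sequences must be aligned token by token")
    if not gold_tags:
        return 0.0
    return sum(g == p for g, p in zip(gold_tags, pred_tags)) / len(gold_tags)
```

With gold tokens `["ab", "c", "d"]` and predicted tokens `["ab", "cd"]`, only the first span matches, so precision is 1/2, recall is 1/3, and F1 is 0.4.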

history

[October 2020] We updated the UPOS tags in the SJMorph model based on a new mapping table in kim-colineau:2020:LREC (to be announced).

[September 2020] We implemented Korean NER based on the SJMorph model (to be announced).

[July 2020] Using sjmorph_v3.model, we obtain an F1 score of 99.88% for token segmentation and an accuracy of 94.77% for POS tagging for Sejong tag sets. We also fixed encoding problems on macOS. For the newest version of sjmorph.model, please contact jungyeul (dot) park (at) gmail (dot) com.
