Olvija / dict_uk

Project to generate POS tag dictionary for Ukrainian language

This is a project to generate a part-of-speech (POS) tag dictionary for the Ukrainian language.

Description:

dict_uk/expand/expand_all.py -aff data/affix -dict data/dict

For every file in data/dict, the project generates all possible word forms with POS tags
by applying the affix rules from the files in data/affix.
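As a rough illustration of what affix-based expansion means, the sketch below applies a few toy rules to a lemma and emits (word form, lemma, tag) triples. It is not the project's code: the `AffixRule` structure, the `expand` function, the rule set and the tag strings are all hypothetical, and the real file formats in data/affix and data/dict may look quite different.

```python
import re
from typing import NamedTuple


class AffixRule(NamedTuple):
    """Hypothetical rule: if the lemma matches `condition`,
    remove the suffix `strip` and append `add`."""
    strip: str
    add: str
    condition: str  # regular expression the lemma must match
    tag: str        # illustrative tag, not necessarily the project's tag set


def expand(lemma, rules):
    """Return (word form, lemma, tag) triples for every rule that applies."""
    forms = []
    for rule in rules:
        if not re.search(rule.condition, lemma):
            continue
        if rule.strip and not lemma.endswith(rule.strip):
            continue
        # Cut the stripped suffix (if any) and attach the new ending.
        stem = lemma[: len(lemma) - len(rule.strip)] if rule.strip else lemma
        forms.append((stem + rule.add, lemma, rule.tag))
    return forms


# Toy rules for feminine nouns ending in "а" (assumed, heavily simplified).
RULES = [
    AffixRule("", "", r"а$", "noun:f:v_naz"),    # nominative: unchanged
    AffixRule("а", "и", r"а$", "noun:f:v_rod"),  # genitive: книга -> книги
    AffixRule("а", "у", r"а$", "noun:f:v_zna"),  # accusative: книга -> книгу
]

for form, lemma, tag in expand("книга", RULES):
    print(form, lemma, tag)
```

A real rule set also has to handle consonant alternations and exceptions, which this sketch deliberately leaves out.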

How to run:

# dict_uk/expand/expand_all.py -aff data/affix -dict data/dict -corp -indent -mfl -wordlist
Output:

    * dict_corp_vis.txt - Dictionary in visual (indented) format for review, analysis or conversion
    * dict_corp_lt.txt - Dictionary for LT (LanguageTool), used for annotating the corpus
    * words.txt, lemmas.txt, tags.txt - lists of all unique words, lemmas and tags

# dict_uk/expand/expand_all.py -aff data/affix -dict data/dict
Output:

    * dict_rules_lt.txt - Dictionary file for LT (LanguageTool), used for checking grammar rules
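To work with the generated dictionary from Python, something like the following could be a starting point. It assumes each line holds three whitespace-separated fields (word form, lemma, tag); that layout is an assumption, not something stated here, so check the actual output before relying on it. The `load_dictionary` helper and the sample lines are hypothetical.

```python
from collections import defaultdict


def load_dictionary(lines):
    """Group (word form, tag) pairs by lemma.

    Assumes lines of the form "<word form> <lemma> <tag>";
    lines that do not have exactly three fields are skipped.
    """
    by_lemma = defaultdict(list)
    for line in lines:
        parts = line.split()
        if len(parts) != 3:
            continue
        form, lemma, tag = parts
        by_lemma[lemma].append((form, tag))
    return dict(by_lemma)


# Made-up sample lines in the assumed format.
SAMPLE = [
    "книга книга noun:f:v_naz",
    "книги книга noun:f:v_rod",
    "",
]

index = load_dictionary(SAMPLE)
print(index["книга"])
```

If the real file matches the assumed layout, it could be loaded with `load_dictionary(open("dict_rules_lt.txt", encoding="utf-8"))`.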

License: GNU General Public License v3.0


Languages: Python 47.0%, Groovy 43.0%, Java 7.4%, Shell 2.6%