The aim of the project is to create a grammatical dictionary of the Polish language based on the marisa-trie structure.
pip install git+https://github.com/kaszubab/Grammatical-Dictionary-of-Polish
- Build a new grammatical dictionary from a given list of files:
>>> dictionary = dict.Dictionary(["words.txt"])
Each line of the file must have the following form:
<infinitive>:<flexographic label>:<derivatives separated with colon>
where:
<flexographic label> is a sequence of capital letters,
optionally with an asterisk at the beginning
Codes:
* - ambiguous word
AA - masculine personal noun (pl. rzeczownik męski osobowy)
AB - masculine animate noun (pl. rzeczownik męski żywotny)
AC - masculine inanimate noun (pl. rzeczownik męski nieżywotny)
AD - feminine noun, etc. (pl. rzeczownik żeński itd.)
AF - neuter noun (pl. rzeczownik nijaki)
B - verb
C - adjective
F - adverb
<derivatives separated with colon> are word forms arranged in a fixed order
that depends on the part of speech, which is
defined by the first letter of the label.
Example line in file:
pies : *ABABAB:pies:psa:psu:psa:psem:psie:psie:psy:psów:psom:psy:psami:psach:psy:
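The line format described above can be illustrated with a small parser. This is only a sketch and not the package's own loading code: the `Entry` type and the `parse_line` function are hypothetical names, and the handling of whitespace around colons and of the trailing colon is an assumption based on the example line.

```python
from typing import List, NamedTuple


class Entry(NamedTuple):
    """One parsed dictionary line (hypothetical helper type)."""
    lemma: str        # the <infinitive> field
    label: str        # the label without the leading asterisk
    ambiguous: bool   # True if the label started with '*'
    forms: List[str]  # derivatives in their fixed order


def parse_line(line: str) -> Entry:
    """Split one line into its colon-separated fields.

    Assumption: spaces around fields are not significant, and a
    trailing colon does not denote an extra (empty) derivative.
    """
    fields = [field.strip() for field in line.strip().split(":")]
    # Drop empty fields produced by trailing colons.
    while fields and fields[-1] == "":
        fields.pop()
    if len(fields) < 2:
        raise ValueError(f"expected at least a word and a label: {line!r}")

    lemma, raw_label, *forms = fields
    ambiguous = raw_label.startswith("*")
    label = raw_label[1:] if ambiguous else raw_label
    # The label must be a non-empty sequence of capital letters.
    if not (label.isalpha() and label.isupper()):
        raise ValueError(f"invalid label: {raw_label!r}")
    return Entry(lemma, label, ambiguous, forms)


example = "pies : *ABABAB:pies:psa:psu:psa:psem:psie:psie:psy:psów:psom:psy:psami:psach:psy:"
entry = parse_line(example)
print(entry.lemma, entry.label, entry.ambiguous, len(entry.forms))
```

Keeping the asterisk as a separate boolean means the part of speech can be read directly from the first letter of `label`.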
- Get derivatives of a word:
>>> dictionary.get_children("pies")
- Get the infinitive (base form) of a word:
>>> dictionary.get_parent("psa")
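To make the two lookups concrete, the following sketch models them with plain Python dictionaries. It does not reproduce the package's marisa-trie based implementation. The `ToyIndex` class is hypothetical, and the return types (a list of forms, a sorted list of base forms) are assumptions, since the actual return values are not documented here.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


class ToyIndex:
    """In-memory stand-in for the two lookups (hypothetical class)."""

    def __init__(self, entries: Iterable[Tuple[str, List[str]]]) -> None:
        # base form -> its derivatives, in file order
        self._children: Dict[str, List[str]] = {}
        # derivative -> every base form it can come from
        self._parents: Dict[str, Set[str]] = defaultdict(set)
        for lemma, forms in entries:
            self._children[lemma] = list(forms)
            for form in forms:
                self._parents[form].add(lemma)

    def get_children(self, lemma: str) -> List[str]:
        """Return the derivatives of a base form (empty if unknown)."""
        return list(self._children.get(lemma, []))

    def get_parent(self, form: str) -> List[str]:
        """Return the base forms a word may derive from (empty if unknown)."""
        return sorted(self._parents.get(form, set()))


index = ToyIndex([("pies", ["pies", "psa", "psu", "psem", "psie"])])
print(index.get_children("pies"))
print(index.get_parent("psa"))
```

Returning a list from `get_parent` in this sketch reflects that one word form can belong to more than one base form, which is what the `*` (ambiguous word) code suggests.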