kdebiec / Grammatical-Dictionary-of-Polish

Grammatical-Dictionary-of-Polish

The aim of the project is to create a grammatical dictionary of the Polish language based on the marisa-trie structure.

Installation

pip install git+https://github.com/kaszubab/Grammatical-Dictionary-of-Polish

Usage

  • Build a new grammatical dictionary from a given list of files:
>>> dictionary = dict.Dictionary(["words.txt"])

The file must be in the following form:

<infinitive>:<flexographic label>:<derivatives separated by colons>
where:
    <flexographic label> is a sequence of capital letters,
                         optionally with an asterisk at the beginning.
                         Codes:
        *  - ambiguous word
        AA - masculine personal noun (Polish: rzeczownik męski osobowy)
        AB - masculine animate noun (Polish: rzeczownik męski żywotny)
        AC - masculine inanimate noun (Polish: rzeczownik męski nieżywotny)
        AD - feminine noun etc. (Polish: rzeczownik żeński itd.)
        AF - neuter noun (Polish: rzeczownik nijaki)
        B  - verb
        C  - adjective
        F  - adverb

    <derivatives separated by colons> are word forms arranged in a fixed order
                                      that depends on the part of speech, which
                                      is defined by the first letter of the label.
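
The label rules above can be illustrated with a small decoder. This is only a sketch built from the code table in this README; the function name `describe_label` and the returned dictionary layout are hypothetical and are not part of the package. Codes that the table does not list are reported as `None` rather than guessed.

```python
# Hypothetical helper, not part of the package: decodes a flexographic label
# using only the codes listed in this README.
PART_OF_SPEECH = {"A": "noun", "B": "verb", "C": "adjective", "F": "adverb"}
NOUN_CLASS = {
    "AA": "masculine personal",
    "AB": "masculine animate",
    "AC": "masculine inanimate",
    "AD": "feminine",
    "AF": "neuter",
}


def describe_label(label):
    """Return the ambiguity flag, part of speech and noun class of a label."""
    ambiguous = label.startswith("*")
    codes = label[1:] if ambiguous else label
    if not (codes.isalpha() and codes.isupper()):
        raise ValueError(f"not a sequence of capital letters: {label!r}")
    # The first letter of the label defines the part of speech.
    part_of_speech = PART_OF_SPEECH.get(codes[0])
    # For nouns, the first two letters select the noun class.
    noun_class = NOUN_CLASS.get(codes[:2]) if codes[0] == "A" else None
    return {
        "ambiguous": ambiguous,
        "part_of_speech": part_of_speech,
        "noun_class": noun_class,
    }
```

For the label `*ABABAB` this reports an ambiguous masculine animate noun.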

Example line in file:

pies :  *ABABAB:pies:psa:psu:psa:psem:psie:psie:psy:psów:psom:psy:psami:psach:psy:
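
A line such as the example above could be split into its three parts as follows. This is a minimal sketch that assumes whitespace around fields is insignificant and that a trailing colon carries no extra form; `Entry` and `parse_line` are hypothetical names, and the package's own parser may behave differently.

```python
from typing import List, NamedTuple


class Entry(NamedTuple):
    """One dictionary line (hypothetical container, not the package's type)."""
    base: str          # the <infinitive> field
    label: str         # the <flexographic label> field
    forms: List[str]   # the derivatives, in file order


def parse_line(line: str) -> Entry:
    # Split on colons and trim whitespace around each field (assumption).
    fields = [field.strip() for field in line.strip().split(":")]
    # A trailing colon, as in the example line, leaves an empty last field.
    if fields and fields[-1] == "":
        fields.pop()
    if len(fields) < 2:
        raise ValueError(f"expected at least a base form and a label: {line!r}")
    base, label, *forms = fields
    return Entry(base, label, forms)


entry = parse_line(
    "pies :  *ABABAB:pies:psa:psu:psa:psem:psie:psie:psy:"
    "psów:psom:psy:psami:psach:psy:"
)
print(entry.base, entry.label, len(entry.forms))
```
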

  • Get the derivatives of a word:
>>> dictionary.get_children("pies")

  • Get the infinitive (base form) of a word:
>>> dictionary.get_parent("psa")
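
The two lookups above go in opposite directions: from a base form to its derivatives, and from a derivative back to its base form. The sketch below shows that idea with plain Python dictionaries. It is not the package's implementation, which is based on marisa-trie; the class `FormIndex`, its methods, and the choice to return lists are all assumptions made for illustration.

```python
from collections import defaultdict


class FormIndex:
    """Hypothetical two-way index between base forms and derivatives."""

    def __init__(self):
        self._forms = {}                  # base form -> list of derivatives
        self._bases = defaultdict(set)    # derivative -> set of base forms

    def add(self, base, forms):
        self._forms[base] = list(forms)
        for form in forms:
            # One derivative may belong to several base forms.
            self._bases[form].add(base)

    def forms_of(self, base):
        """Derivatives of a base form, in file order (empty if unknown)."""
        return list(self._forms.get(base, []))

    def bases_of(self, form):
        """Base forms that list this derivative, sorted (empty if unknown)."""
        return sorted(self._bases.get(form, ()))


index = FormIndex()
index.add("pies", ["pies", "psa", "psu", "psem", "psie"])
print(index.forms_of("pies"))
print(index.bases_of("psa"))
```

A trie serves the same purpose with a much smaller memory footprint for large word lists, which is presumably why the project builds on marisa-trie.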

Authors
