hunspell / hunspell

The most popular spellchecking library.

Home Page: http://hunspell.github.io/

Modifying first character while adding suffixes

shantanuo opened this issue

In Sanskrit, when the suffix "िक" or "ी" is added, the first character of the word is also modified (this is called vruddhi, वृद्धि). For example: नगर → नागरिक / नागरी. The same applies to the suffix 'य', for example: सहाय → साहाय्य.
The current suffix rules cannot express vruddhi, because a suffix rule only changes the end of a word. Therefore an additional tag is required:

vruddhi p

I should then be able to use the "p" tag in my dictionary file, for example:

नगर/p
सहाय/p

The rule would be useful not only for Sanskrit but also for other Indian languages such as Marathi, Hindi, Gujarati, etc.

A Python implementation of the tag can be found here:
https://gist.github.com/shantanuo/19087db135474c2e59ca7f57e0ee9f36

As you can see, it gives the expected results:

sanskrit_expand('नगर')
# returns ('नागरिक', 'नागरी', 'नागर्य')

These three words should be accepted as part of the dictionary.
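
For readers who do not want to open the gist, the vruddhi rule described above can be sketched roughly as follows. This is not the code from the gist and not anything Hunspell provides. The names `vruddhi` and `expand_with_vruddhi` are made up for illustration, and the sketch assumes the word starts with a consonant and ends in a consonant carrying the inherent "a". Only a → ā, i/ī/e → ai and u/ū/o → au are handled.

```python
VIRAMA = "\u094d"   # the halant sign that joins consonants into a cluster
AA_SIGN = "\u093e"  # dependent vowel sign for long "aa"

# Dependent vowel signs and their vruddhi grade (assumption: only these cases).
STRENGTHEN = {
    "\u093f": "\u0948",  # i  -> ai
    "\u0940": "\u0948",  # ii -> ai
    "\u0947": "\u0948",  # e  -> ai
    "\u0941": "\u094c",  # u  -> au
    "\u0942": "\u094c",  # uu -> au
    "\u094b": "\u094c",  # o  -> au
}

# The three suffixes mentioned in this issue.
SUFFIXES = ("िक", "ी", "्य")


def is_consonant(ch):
    """True for the Devanagari consonants KA..HA (U+0915..U+0939)."""
    return "\u0915" <= ch <= "\u0939"


def vruddhi(word):
    """Strengthen the vowel of the first syllable; other words are returned as-is."""
    if not word or not is_consonant(word[0]):
        return word  # words starting with an independent vowel are out of scope
    i = 0
    # Skip an initial consonant cluster (consonant + virama pairs).
    while i + 2 < len(word) and word[i + 1] == VIRAMA:
        i += 2
    head, tail = word[: i + 1], word[i + 1 :]
    sign = tail[:1]
    if sign in STRENGTHEN:
        return head + STRENGTHEN[sign] + tail[1:]
    if AA_SIGN <= sign <= VIRAMA:
        return word  # already strong, or a vowel sign this sketch does not handle
    return head + AA_SIGN + tail  # inherent "a" becomes long "aa"


def expand_with_vruddhi(word):
    """Apply vruddhi, then append each suffix to the strengthened stem."""
    stem = vruddhi(word)
    return tuple(stem + suffix for suffix in SUFFIXES)


print(expand_with_vruddhi("नगर"))  # ('नागरिक', 'नागरी', 'नागर्य')
```

For सहाय the third form comes out as साहाय्य, which matches the example given earlier. A real implementation would also need to deal with initial vowels, ऋ, and words ending in a vowel sign.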

Have you already tried CIRCUMFIX for modifying both the start and the end of a word? The CIRCUMFIX test example in the Hunspell sources shows such a modification in Hungarian. Hunspell has mostly received stability fixes in recent years, and new tags do not seem to be planned any time soon.