invalid endchars in check compound pattern
shantanuo opened this issue · comments
I have this dictionary and affix file for the word भानूत्सवः, and it is working correctly.
# cat dicts/sa.dic
2
भानु/x
उत्सवः/x
# cat dicts/sa.aff
SET UTF-8
COMPOUNDMIN 1
COMPOUNDFLAG x
CHECKCOMPOUNDPATTERN 1
CHECKCOMPOUNDPATTERN ु उ ू
But the same word is marked as incorrect if I add this entry:
CHECKCOMPOUNDPATTERN ा आ ा
I do not see any reason why adding an entry should cause a word that was previously accepted to be marked as incorrect.
There is no problem if I add an entry like this instead:
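To make the expectation concrete, here is a minimal pure-Python sketch of how I read the documented CHECKCOMPOUNDPATTERN fields (endchars, beginchars, optional replacement). `simplified_forms` is a hypothetical helper, not part of Hunspell, and it ignores flags and every other check Hunspell performs; it only shows which entries would even apply at the boundary between the two words.

```python
def simplified_forms(first, second, patterns):
    """Return the compound forms produced by matching pattern entries.

    Each pattern is (endchars, beginchars, replacement). This is a
    simplified model of the documented behaviour, not Hunspell's code.
    """
    forms = []
    for endchars, beginchars, replacement in patterns:
        # An entry applies only if the first word ends with endchars
        # and the second word begins with beginchars.
        if first.endswith(endchars) and second.startswith(beginchars):
            stem = first[: len(first) - len(endchars)]
            rest = second[len(beginchars):]
            forms.append(stem + replacement + rest)
    return forms


patterns = [
    ("ु", "उ", "ू"),  # the original entry
    ("ा", "आ", "ा"),  # the entry that was added later
]
print(simplified_forms("भानु", "उत्सवः", patterns))
```

In this model the added entry never matches, because भानु does not end in "ा" and उत्सवः does not begin with "आ", so it should have no effect on this word.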
CHECKCOMPOUNDPATTERN ा आ ू
This suggests that Hunspell does not accept "ा" as an endchar, and that this single entry stops the entire affix file from working.
This is not expected and looks like a bug.
If some characters are not allowed as endchars, that should be documented.
Closing this issue because when I tested the same word in Python, it worked as expected.
import hunspell

spellchecker = hunspell.HunSpell(
    "./sa_IN.dic",
    "./sa_IN.aff",
)
spellchecker.spell('भानूत्सवः')
It seems that applications like Firefox integrate Hunspell in different ways.
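For anyone who wants to compare the two affix variants side by side, a rough reproduction sketch follows. It assumes the pyhunspell package (the `hunspell` module used above) is installed, and it returns None when it is not. `build_aff` and `check` are hypothetical helper names. One assumption to note: the script writes the count on the first CHECKCOMPOUNDPATTERN line to match the number of entries, which is how the affix format documents that header; the report above does not say whether the count was updated when the second entry was added.

```python
import tempfile
from pathlib import Path

DIC = "2\nभानु/x\nउत्सवः/x\n"
AFF_HEADER = "SET UTF-8\nCOMPOUNDMIN 1\nCOMPOUNDFLAG x\n"


def build_aff(patterns):
    """Build affix file text; the header count matches the entry count."""
    lines = [f"CHECKCOMPOUNDPATTERN {len(patterns)}"]
    lines += [f"CHECKCOMPOUNDPATTERN {e} {b} {r}" for e, b, r in patterns]
    return AFF_HEADER + "\n".join(lines) + "\n"


def check(word, patterns):
    """Return Hunspell's verdict, or None if pyhunspell is unavailable."""
    try:
        import hunspell  # third-party: pyhunspell
    except ImportError:
        return None
    with tempfile.TemporaryDirectory() as tmp:
        dic = Path(tmp) / "sa.dic"
        aff = Path(tmp) / "sa.aff"
        dic.write_text(DIC, encoding="utf-8")
        aff.write_text(build_aff(patterns), encoding="utf-8")
        return hunspell.HunSpell(str(dic), str(aff)).spell(word)


if __name__ == "__main__":
    one = [("ु", "उ", "ू")]
    two = one + [("ा", "आ", "ा")]
    print("one entry:", check("भानूत्सवः", one))
    print("two entries:", check("भानूत्सवः", two))
```

Running the same two files through another application's spell checker would show whether the difference comes from the dictionary or from how that application loads it.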