Exact word marked as a near miss
ciesiolka opened this issue
This may not be a bug, but rather my misunderstanding of how Hunspell dictionaries work.
I am trying to create a dictionary for Latin with accents. Let's consider the word románus. According to its declension, one of its forms is romanórum. To represent that, I created the following dic and aff files:
dic:
1
románus/A
aff:
SET UTF-8
SFX A N 1
SFX A us órum
This doesn't work because those rules generate the word románórum, which is invalid since it has two accents. So I added an OCONV entry:
(...)
OCONV 1
OCONV ánó anó
Running hunspell with that dictionary gives an odd result: románórum is accepted, but romanórum is considered a near miss with the suggested spelling romanórum (exactly the same).
Hunspell 1.7.0
románórum
+ románus
romanórum
& romanórum 1 0: romanórum
Maybe I simply misunderstood how ICONV and OCONV work; the explanation in the man page isn't very detailed.
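One plausible reading of the output above, assuming OCONV is applied only to text that Hunspell emits (such as suggestions) and not to the word being checked: the lookup still only knows románórum, and the suggestion románórum is rewritten to romanórum just before it is printed. The Python sketch below is a toy model of that assumption, not Hunspell's implementation; the names `ACCEPTED`, `check`, and `display_suggestion` are hypothetical.

```python
# Toy model (assumption): OCONV rewrites output only, never the input word.
ACCEPTED = {"románus", "románórum"}   # forms generated by the original SFX rule
OCONV = [("ánó", "anó")]              # the conversion table from the aff file


def oconv(text):
    """Apply every OCONV pair to a string that is about to be printed."""
    for src, dst in OCONV:
        text = text.replace(src, dst)
    return text


def check(word):
    """Look the word up exactly as typed; OCONV is not consulted here."""
    return word in ACCEPTED


def display_suggestion(internal_form):
    """A suggestion is found in its internal form, then converted for output."""
    return oconv(internal_form)


print(check("románórum"))               # True: the generated form is accepted
print(check("romanórum"))               # False: the desired form is unknown
print(display_suggestion("románórum"))  # romanórum: printed like the input
```

Under this reading, the suggestion only looks identical to the input: internally it is still the two-accent form.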
I'd suggest changing -ánus into -anórum with this aff file:
SET UTF-8
SFX A N 1
SFX A ánus anórum
Depending on stress patterns, you'll probably end up with several flags, one for each stress pattern.
echo "romanórum" | hunspell -d la
+ románus
echo "románus" | hunspell -d la
*
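The difference between the two suffix rules in this thread can be sketched in Python. This is a simplified model of a single SFX line (strip a string from the end of the stem, then append another); it ignores conditions and flags, and `apply_suffix` is a hypothetical helper, not part of Hunspell.

```python
def apply_suffix(stem, strip, add):
    """Toy model of one SFX line: remove `strip` from the end, append `add`."""
    if not stem.endswith(strip):
        return None  # the rule does not apply to this stem
    return stem[: len(stem) - len(strip)] + add


# Original rule: SFX A us órum (the accented á of the stem is left in place)
original = apply_suffix("románus", "us", "órum")

# Suggested rule: SFX A ánus anórum (the accented á is stripped with the ending)
suggested = apply_suffix("románus", "ánus", "anórum")

print(original)   # románórum
print(suggested)  # romanórum
```

Stripping the stressed vowel together with the ending is what lets the rule move the accent, so no OCONV entry is needed.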