NEEDAFFIX with both PFX and SFX
srtxg opened this issue · comments
Hello,
In the Walloon language, the beginning of a word changes depending on the phonetics of the previous word;
I implemented that with PFX rules.
Then I use various SFX rules.
For verbs I use second-level SFX rules, to reduce the number of rules.
It works quite well; however, when both SFX and PFX flags are used, the stem is treated as a valid word, despite being explicitly flagged with NEEDAFFIX.
(at least, that is how it behaves in version 1.7.0)
---- x.aff ----
SET UTF-8
FLAG UTF-8
TRY ersainthocuxdlpéymbzîvjåfèwgkêôûçERSAINTHOCUXDLPÉYMBZÎVJÅFÈWGKÊÔÛ’'Ç-
NEEDAFFIX *
# "v" flag is for verbs;
# if the stem given in dic file ends in "é" it is a verb of 1st group (flag "1"),
# and it is also "stem A" of verbs (they can have several stems, but kept simple here)
# the ending "é" is stripped, but I use the "*" flag to tell this stripped stem is not a word
SFX v Y 2
SFX v é /1* é po:v
SFX v é /A* é po:v
# rules for 1st group of verbs, the ending "é" is added back; is stemA (bdjA) and past participle (p.p.) form
SFX 1 Y 1
SFX 1 0 é . is:bdjA is:p.p.
# rules for "stemA" of verbs (here a single rule for the 1st person plural of present tense)
SFX A Y 2
SFX A 0 ans [^k] is:bdjA is:pr. is:1pl
SFX A k cans k is:bdjA is:pr. is:1pl
# and prefix rule, di- can be elided to d- (eg: diné -> dné)
PFX i Y 2
PFX i 0 0 di sp:plin
PFX i di d di sp:spotch
------- x.dic -----
2
diné/iv* st:diner
viké/v* st:viker
------ test.txt -----
diné
dné
dinans
dnans
din
dn
viké
vicans
vik
When only SFX is used (e.g. viké/v*), the stripped stem "vik" is correctly rejected as a word;
however, when PFX is also used (e.g. diné/iv*), the stripped stems "din" and "dn" are incorrectly accepted as valid:
$ hunspell -d x -m test.txt
diné sp:plin st:diner
diné st:diner po:v is:bdjA is:p.p.
diné sp:plin st:diner po:v is:bdjA is:p.p.
dné sp:spotch st:diner
dné sp:spotch st:diner po:v is:bdjA is:p.p.
dinans st:diner po:v is:bdjA is:pr. is:1pl
dinans sp:plin st:diner po:v is:bdjA is:pr. is:1pl
dnans sp:spotch st:diner po:v is:bdjA is:pr. is:1pl
din sp:plin st:diner po:v <=== wrong
dn sp:spotch st:diner po:v <==== wrong
viké st:viker po:v is:bdjA is:p.p.
vicans st:viker po:v is:bdjA is:pr. is:1pl
vik
I would have expected
diné/iv* st:diner
to be equivalent to
diné/v* sp:plin st:diner
dné/v* sp:spotch st:diner
(if I do write it like that, it's ok; but that would require rewriting a thousand lines)
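For reference, that rewrite could be scripted rather than done by hand. Below is a minimal Python sketch, assuming single-character flags and only the two "PFX i" rules shown above, which are hard-coded here and not parsed from x.aff. The names PFX_I_RULES, expand_entry and expand_dic are hypothetical, chosen for illustration.

```python
import re

# Assumption: the two "PFX i" rules from x.aff, hard-coded as
# (strip, add, condition, morphological field). Not parsed from the .aff file.
PFX_I_RULES = [
    ("", "", "di", "sp:plin"),        # PFX i 0 0 di sp:plin
    ("di", "d", "di", "sp:spotch"),   # PFX i di d di sp:spotch
]


def expand_entry(line, flag="i"):
    """Expand one .dic entry carrying `flag` into explicit prefixed entries."""
    m = re.match(r"^(\S+?)/(\S+)(.*)$", line)
    if not m or flag not in m.group(2):
        return [line]  # no flags, or prefix flag absent: keep the entry as-is
    word, flags, rest = m.groups()
    kept = flags.replace(flag, "")  # drop the prefix flag, keep the others
    out = []
    for strip, add, cond, morph in PFX_I_RULES:
        if word.startswith(cond):
            out.append(f"{add}{word[len(strip):]}/{kept} {morph}{rest}")
    return out or [line]


def expand_dic(lines):
    """Expand every entry after the count line and recompute the count."""
    entries = [e for line in lines[1:] for e in expand_entry(line)]
    return [str(len(entries))] + entries


if __name__ == "__main__":
    dic = ["2", "diné/iv* st:diner", "viké/v* st:viker"]
    print("\n".join(expand_dic(dic)))
```

This only sidesteps the problem by removing the PFX flag from the dictionary; it does not explain why NEEDAFFIX is ignored when both flags are present.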
Is there something I am missing, or is this behaviour incorrect?
Thanks