Fanaen / Hunspell2WordList

Java library/tool generating every possible word from Hunspell dictionaries

words missing when defined in .aff?

danielnaber opened this issue

When exporting Dicollecte 6.4.1, words like "étant" are missing. It seems they are not in the *.dic, but in the *.aff, like this:

SFX zE être étant/n'q'l'm't's' être po:ppre

Are there any plans to extend H2WL to export these words, too?

(Original issue at LanguageTool: languagetool-org/languagetool#1633)

Oh dear, this project is so old I don't even remember how it works.
Well, this may be simple to do. Let me look into it.

Please note H2WL is not meant to create an absolutely exhaustive list, in the sense that some words may still be missing.

The problem is solved! "étant" and such are generated as well.
We now have "étant" and its combination with other affixes, like this:

Qu'êtres
Qu'Êtres
être
être
s'être
t'être
d'être
l'être
m'être
n'être
qu'être
étant
s'étant
t'étant
l'étant
m'étant
n'étant
qu'étant
été
suis
es
t'es
l'es
m'es
n'es
qu'es

Technical explanation: it turns out that words like "être" (the base of "étant") match the condition in the affix rule exactly, meaning the condition covers the whole word rather than just its ending. The regex did not allow for that case.
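To illustrate the case described above, here is a minimal Java sketch of how a single SFX rule such as the one quoted in the issue could be parsed and applied. It is not H2WL's actual code: the class name `SuffixRuleSketch` and its methods are hypothetical, and the stricter pattern shown in `main` is only a guess at the kind of regex that would reject a whole-word match, since the thread does not show the original pattern. Non-ASCII letters are written as Unicode escapes so the file compiles regardless of source encoding.

```java
import java.util.Optional;
import java.util.regex.Pattern;

/** Hypothetical sketch of one Hunspell suffix (SFX) rule; not H2WL's real classes. */
final class SuffixRuleSketch {
    private final String strip;      // characters removed from the end of the base word
    private final String add;        // characters appended after stripping
    private final Pattern condition; // condition checked against the end of the base word

    SuffixRuleSketch(String strip, String add, String condition) {
        // In .aff files, "0" stands for "nothing to strip" or "nothing to add".
        this.strip = strip.equals("0") ? "" : strip;
        this.add = add.equals("0") ? "" : add;
        // Anchor at the end only, so the condition may cover the whole word.
        // A condition of "." means "any word".
        this.condition = Pattern.compile((condition.equals(".") ? "" : condition) + "$");
    }

    /** Parses a rule line of the form: SFX flag strip add[/flags] condition [morphology]. */
    static SuffixRuleSketch parse(String line) {
        String[] fields = line.trim().split("\\s+");
        if (fields.length < 5 || !fields[0].equals("SFX")) {
            throw new IllegalArgumentException("Not an SFX rule line: " + line);
        }
        String add = fields[3];
        int slash = add.indexOf('/');
        if (slash >= 0) {
            add = add.substring(0, slash); // drop continuation flags such as /n'q'l'm't's'
        }
        return new SuffixRuleSketch(fields[2], add, fields[4]);
    }

    /** Returns the derived word, or empty if the rule does not apply to this base word. */
    Optional<String> apply(String word) {
        if (!word.endsWith(strip) || !condition.matcher(word).find()) {
            return Optional.empty();
        }
        return Optional.of(word.substring(0, word.length() - strip.length()) + add);
    }

    public static void main(String[] args) {
        // \u00EA is e-circumflex and \u00E9 is e-acute: this is the rule quoted in the issue.
        SuffixRuleSketch rule =
                parse("SFX zE \u00EAtre \u00E9tant/n'q'l'm't's' \u00EAtre po:ppre");
        String base = "\u00EAtre";
        System.out.println(rule.apply(base));

        // Assumption for illustration only: a pattern that demands at least one
        // character before the condition rejects a word that equals the condition.
        Pattern strict = Pattern.compile(".+\u00EAtre$");
        System.out.println(strict.matcher(base).find());
    }
}
```

With the rule from the issue, the base word "être" is stripped completely and replaced by "étant", which is why a condition check that only accepts proper word endings would skip it.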

I'm closing the issue, but feel free to reopen it if necessary.

Thanks!