hunspell / hunspell

The most popular spellchecking library.

Home Page: http://hunspell.github.io/


How exactly do the `/ABC` affix flags work in Hunspell?

lancejpollard opened this issue

I asked a related question the other day and mostly answered it myself, but I still have a few outstanding questions about how the `/ABC` affix flags work.

My main question for this issue: how exactly does `/ABC` work when a rule carries n affix flags? I'm not sure whether there is a limit. Here are some examples:

# an_ES.aff
SFX A Y 311		# FLEXION VERBAL 
SFX A 0 /CDEF r		# infinitivo
SFX A r u/EF [ai]r	# participio
SFX A r to/EF [ai]r
SFX A r us/EF [ai]r
SFX A r tos/EF [ai]r
SFX A r da/EF [ai]r
SFX A r ta/EF [ai]r
SFX A r das/EF [ai]r
SFX A r tas/EF [ai]r
SFX A er iu/EF er
SFX A er ito/EF er
SFX A er ius/EF er
SFX A er itos/EF er
SFX A er ida/EF er
SFX A er ita/EF er
SFX A er idas/EF er
SFX A er itas/EF er
SFX A r ndo/CDE r		# cherundio
SFX A ar o/E [^u]ar 		# present -ar
SFX A ar as/E [^u]ar
SFX A ar a/CDE [^u]ar
SFX A ar amos/E [^u]ar
SFX A ar atz/CDE [^u]ar
# da_DK.aff
SFX 1 Y 4
SFX 1 0 de/34,22 e	+DATID
SFX 1 0 ede/34,22 [^e]	+DATID
SFX 1 0 et/34,22 [^e]	+PERF_PART
SFX 1 0 t/34,22 e	+PERF_PART
# de_AT_frami.aff
PFX i Y 1
PFX i 0 -/coyf .

SFX j Y 3
SFX j 0 0/xoc .
SFX j 0 -/zocf .
SFX j 0 -/cz .
# eo.aff
NEEDAFFIX X
COMPOUNDFLAG 2

SFX A Y 37
SFX A 0 0/2XmNEV€ .
SFX A 0 eg/2XmNEV .
SFX A 0 et/2XmNEV .
SFX A 0 ul/2XmNE€ .
SFX A 0 ulin/2XmNE .
SFX A 0 egul/2XmNE .

Those examples should cover most of the questions I have for this post. Here are some basic questions to start:

  1. It appears that affix flags can be uppercase letters, lowercase letters, digits, and non-ASCII symbols like `€`. I thought digits were aliases and uppercase letters were direct links, and I wasn't aware that lowercase letters were allowed. Does that still hold in the example `SFX A 0 0/2XmNEV€ .`? Does it mean alias `2`, link `X`, link `m`, and so on? And in `SFX 1 0 t/34,22`, are the digits aliases separated by a comma?
  2. One rule has nothing before the slash, `SFX A 0 /CDEF r`, and another has a `0` there, `SFX A 0 0/2XmNEV€ .`. What does the empty field mean, and what does the `0` mean in that context?
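
To make these two questions concrete, here is a minimal Python sketch of how a rule line could be split into its fields. It is not Hunspell's own parser. It assumes the flag encodings described in the hunspell(5) man page for the `FLAG` option (single characters by default, `long` for two-character flags, `num` for comma-separated decimal numbers, `UTF-8` for Unicode characters), and it assumes the file has no `AF` alias lines, which are a separate feature. The names `split_flags`, `AffixRule`, and `parse_rule` are hypothetical. Treating a missing affix before the slash the same as `0` is also an assumption.

```python
from dataclasses import dataclass


def split_flags(field: str, mode: str = "char") -> list[str]:
    """Split the text after '/' into individual flags for one FLAG mode."""
    if mode == "num":    # decimal numbers separated by commas, e.g. "34,22"
        return [part for part in field.split(",") if part]
    if mode == "long":   # two characters per flag
        return [field[i:i + 2] for i in range(0, len(field), 2)]
    return list(field)   # "char" or "UTF-8": one character per flag


@dataclass
class AffixRule:
    kind: str                 # "PFX" or "SFX"
    flag: str                 # the flag this rule belongs to
    strip: str                # characters removed from the word ("" for 0)
    add: str                  # characters added to the word ("" for 0)
    continuation: list[str]   # flags listed after the slash
    condition: str            # pattern the word must match, "." for any


def parse_rule(line: str, mode: str = "char") -> AffixRule:
    # Only the first five whitespace-separated fields are read; trailing
    # comments or morphological fields such as "+DATID" are ignored here.
    kind, flag, strip, add_field, condition = line.split()[:5]
    add, _, flag_field = add_field.partition("/")
    return AffixRule(
        kind=kind,
        flag=flag,
        strip="" if strip == "0" else strip,   # 0 marks an empty field
        add="" if add == "0" else add,         # assumption: "" and 0 agree
        continuation=split_flags(flag_field, mode),
        condition=condition,
    )
```

Under these assumptions, `parse_rule("SFX A 0 et/2XmNEV .")` yields six single-character flags, with `2` being an ordinary flag character and not an alias, while the da_DK line needs `mode="num"` to yield the two flags `34` and `22`.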

Now on to the main question.

  1. In a case like `SFX A 0 et/2XmNEV .`, what is the order of operations for the flags `2XmNEV` relative to the current `SFX` rule? How should I conceptualize this generically so that I can better understand all the examples? Does it mean there are 6 flags: 1 alias (`2`) and 5 direct links (`XmNEV`)? Are they then applied from left to right? Or, since some of them could be prefixes, suffixes, or circumfixes, how do they get resolved? How is that determined, and are there docs for it that I missed?
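
One possible mental model for this question, sketched in Python and not taken from Hunspell's code: applying a suffix rule produces a new form, and the flags after the slash are attached to that form as if it were a dictionary entry carrying those flags. The sketch assumes the flags behave as an unordered set, so that writing order does not matter; that is an assumption to check against the Hunspell documentation and source. The helper names `apply_suffix` and `derive` and the sample words `cantar` and `actuar` are hypothetical. Conditions are matched with Python's `re` module, which is only an approximation of Hunspell's condition syntax.

```python
import re
from typing import Optional


def apply_suffix(word: str, strip: str, add: str, condition: str) -> Optional[str]:
    """Apply one suffix rule; return None when the rule does not fit."""
    # The condition must match the end of the word ("." accepts anything).
    if condition != "." and not re.search(f"(?:{condition})$", word):
        return None
    # The characters to strip must actually be at the end of the word.
    if strip and not word.endswith(strip):
        return None
    stem = word[:len(word) - len(strip)] if strip else word
    return stem + add


def derive(word: str, rule: tuple) -> Optional[tuple]:
    """rule is (strip, add, continuation_flags, condition)."""
    strip, add, continuation, condition = rule
    form = apply_suffix(word, strip, add, condition)
    if form is None:
        return None
    # Assumption: the derived form carries the continuation flags as a set,
    # which is what would let further affix rules apply to it.
    return form, frozenset(continuation)


# Rule taken from the an_ES excerpt: SFX A r ndo/CDE r
gerund_rule = ("r", "ndo", "CDE", "r")
```

With this model, `derive("cantar", gerund_rule)` gives the form `cantando` together with the flag set `{C, D, E}`, and `frozenset("2XmNEV") == frozenset("VENmX2")` shows why left-to-right order plays no role under the set assumption. Which of those flags name prefixes, suffixes, or special options would be decided by looking each flag up in the rest of the `.aff` file.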

That is the main thing I would like to know: how many affix flags a rule can have, and how, and in what order, they are evaluated. Does the order in which you specify them matter?

Thank you again in advance for your time.