Quantified concatenation using < or > fails when not escaped, different from xfst & hfst
snomos opened this issue
Sjur N Moshagen commented
In writing a URL parser, I have the following lexicon:
LEXICON realdomain
< [ a | b | c | d | e | f | g | h | i | j | k
| l | m | n | o | p | q | r | s | t | u | v
| w | x | y | z | A | B | C | D | E | F | G
| H | I | J | K | L | M | N | O | P | Q | R
| S | T | U | V | W | X | Y | Z |%-
|%0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 ]^>1 %. > topdomainlist ;
This fails in Foma with the following error:
***Syntax error on line 49 column 616 at '>'
If I escape the quantifier as follows: ^%>1
the regex compiles in Foma, but fails in both Xfst:
*** Warning: regex_parse: Positive integer expeted, got 0. ***
and Hfst-xfst:
*** xre parsing failed: syntax error, unexpected LEXER_ERROR, expecting end of file
*** parsing […]
|%_ |%? |%& |%= |%% |%@ |%. |%/ |%~ ]^%>1 [near ^] on line 27...
Unable to parse regular expression
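A possible workaround, not verified against any of the three compilers: since A^>1 denotes more than one concatenation of A, it should be equivalent to A A+, which keeps the angle bracket out of the lexc regex altogether. A minimal sketch with a shortened character class (the Root and topdomainlist entries are hypothetical placeholders, not taken from the original source file):

```lexc
! Sketch only: [ a | b | c ] stands in for the full character class above.
LEXICON Root
realdomain ;

LEXICON realdomain
! [X] [X]+ is meant to express "two or more X", i.e. the same as [X]^>1,
! without any unescaped or escaped '>' inside the < ... > regex entry.
< [ a | b | c ] [ a | b | c ]+ %. > topdomainlist ;

LEXICON topdomainlist
! Hypothetical placeholder entry so the fragment is complete.
com # ;
```

If this compiles identically in Foma, Xfst and Hfst, it would sidestep the incompatibility, at the cost of repeating the character class.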