skvadrik / re2c

Lexer generator for C, C++, Go and Rust.

Home Page: https://re2c.org

Bad codepoint range

ccleve opened this issue

I'm trying to compile the exact example here:

https://re2c.org/manual/manual_rust.html#encoding-support

I've copied unicode_categories.re into the test directory and attempted to compile the code using re2rust. I get this error:

tests/pipeline/tokenizers/re2c/unicode_categories.re:2:68: error: bad code point range: '0xF8 - 0x2C1'

What am I doing wrong?

Also: I see that unicode_categories.re is three years old at this point. Should it be regenerated with a more recent version of Unicode?

> What am I doing wrong?

Can you provide your command line? Did you forget the --utf8 argument?

> Also: I see that unicode_categories.re is three years old at this point. Should it be regenerated with a more recent version of Unicode?

Yes, it should. We even had a project for rewriting the generator (#235 (comment)) but that somehow got stuck.

Yes, I did forget the --utf8 argument. Thank you, it works now.

I'll take a look at the regeneration code. Maybe I can help.
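
For anyone landing here with the same error, the likely reason the range '0xF8 - 0x2C1' is rejected without --utf8 is that the upper bound 0x2C1 does not fit in the default 8-bit encoding. The sketch below only illustrates that bounds check. It is not re2c's actual implementation: the `Encoding` enum, `max_code_point`, and `range_is_valid` are hypothetical names, and the limits 0xFF and 0x10FFFF are assumptions about the two encodings.

```rust
/// Hypothetical model of an input encoding; not a type from re2c.
#[derive(Clone, Copy)]
enum Encoding {
    /// Assumed default when no encoding option is passed: one byte per symbol.
    EightBit,
    /// Assumed to be what the --utf8 option selects: the full Unicode range.
    Utf8,
}

impl Encoding {
    /// Largest code point the encoding can represent (assumed values).
    fn max_code_point(self) -> u32 {
        match self {
            Encoding::EightBit => 0xFF,
            Encoding::Utf8 => 0x10FFFF,
        }
    }
}

/// A range is accepted only if it is ordered and its upper bound
/// does not exceed the encoding's largest code point.
fn range_is_valid(lo: u32, hi: u32, enc: Encoding) -> bool {
    lo <= hi && hi <= enc.max_code_point()
}
```

Under these assumptions, `range_is_valid(0xF8, 0x2C1, Encoding::EightBit)` is false and the same call with `Encoding::Utf8` is true, which matches the behavior reported in this thread.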