begriffs / flexicode

Tools for scanning Unicode in Flex

Utilities for Unicode in Flex

charclass

Outputs a Flex-compatible regex that matches the UTF-8 byte sequences of every code point matched by an ICU Unicode regular expression.

# all Chinese characters
./charclass '\p{Han}'

# horizontal whitespace
./charclass '\h'

The \p{...} escape is especially powerful because it matches on Unicode properties, such as a script (\p{Han}) or a general category (\p{Lu}).
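To see where the byte sequences in the output come from, here is a minimal sketch of UTF-8 encoding for a single code point, formatted as Flex-style \x escapes. It is an illustration only and not necessarily how charclass is implemented; utf8_encode and utf8_escape are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: encode code point cp as UTF-8 into out.
   Returns the number of bytes written, or 0 if cp is a surrogate
   or lies beyond U+10FFFF. */
size_t utf8_encode(uint32_t cp, unsigned char out[4])
{
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0; /* surrogates are not encodable */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}

/* Hypothetical helper: write cp's UTF-8 bytes into buf as \xNN escapes.
   buf must hold at least 17 bytes (4 escapes plus the terminator).
   Returns the number of bytes that were encoded. */
size_t utf8_escape(uint32_t cp, char *buf, size_t buflen)
{
    unsigned char bytes[4];
    size_t n = utf8_encode(cp, bytes);
    size_t i;

    assert(buflen >= 4 * n + 1);
    for (i = 0; i < n; i++)
        snprintf(buf + 4 * i, 5, "\\x%02x", (unsigned)bytes[i]);
    buf[4 * n] = '\0';
    return n;
}
```

For example, calling utf8_escape(0x00A0, buf, sizeof buf) with a 17-byte buf leaves the text \xc2\xa0 in buf, which is the form NO-BREAK SPACE takes in a generated pattern.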

To use the regexes, give them aliases in your Flex file:

/* from charclass '\h' */
whitespace \x09|\x20|\xc2\xa0|\xe1\x9a\x80|\xe2\x80[\x80-\x8a]|\xe2\x80\xaf|\xe2\x81\x9f

%%

{whitespace}  { /* ... */ }
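Flex matches these patterns one byte at a time, so each alternative in the alias is just a fixed byte sequence or a byte range. As a rough illustration of what the whitespace alias accepts, here is a hand-written C equivalent. The name match_hspace is hypothetical, and a generated Flex scanner would use a table-driven automaton rather than code like this.

```c
#include <stddef.h>

/* Hypothetical helper mirroring the alternatives of the whitespace alias.
   Returns the length in bytes of the whitespace character at the start
   of s (with len bytes available), or 0 if there is none. */
size_t match_hspace(const unsigned char *s, size_t len)
{
    /* \x09 | \x20 */
    if (len >= 1 && (s[0] == 0x09 || s[0] == 0x20))
        return 1;

    /* \xc2\xa0 */
    if (len >= 2 && s[0] == 0xC2 && s[1] == 0xA0)
        return 2;

    /* \xe1\x9a\x80 */
    if (len >= 3 && s[0] == 0xE1 && s[1] == 0x9A && s[2] == 0x80)
        return 3;

    /* \xe2\x80[\x80-\x8a] | \xe2\x80\xaf */
    if (len >= 3 && s[0] == 0xE2 && s[1] == 0x80 &&
        ((s[2] >= 0x80 && s[2] <= 0x8A) || s[2] == 0xAF))
        return 3;

    /* \xe2\x81\x9f */
    if (len >= 3 && s[0] == 0xE2 && s[1] == 0x81 && s[2] == 0x9F)
        return 3;

    return 0;
}
```

Working on raw bytes like this is what lets a plain 8-bit Flex scanner handle UTF-8 input without any Unicode support of its own.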

Installation

Requires a C99 compiler, ICU, and pkg-config.

./configure
make


