Seth-Harlaar / flexer-unicode-regular-expressions

A Python script that generates flexer-compatible regular expressions matching every Unicode code point in specific general categories.

flexer-unicode-regular-expressions

A Python script that generates flexer-compatible regular expressions matching every Unicode code point in specific general categories.

This script was used in the development of the cooklang-c project.

How it Works

  1. The script parses every code point listed in the all_unicode_chars.txt file and looks up each code point's general category.
  2. After some processing, the script selects the desired code points, based on the conditional statements in lines 50-56.
  3. The script then generates a flexer-compatible regular expression from the selected code points, formatted according to a line length and delimiter strings.
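The three steps above can be sketched roughly as follows. This is a minimal illustration and not the script's actual code. It assumes the standard-library unicodedata module as the source of general categories (instead of all_unicode_chars.txt, whose format is not described here), and it assumes the output consists of hex escapes of each code point's UTF-8 bytes. The helper names code_points_in_categories, utf8_hex_escape, and build_lines are hypothetical.

```python
import unicodedata


def code_points_in_categories(categories, limit=0x110000):
    """Return the code points below `limit` whose general category is wanted.

    Surrogates (U+D800..U+DFFF) are skipped because they cannot be
    encoded as UTF-8.
    """
    wanted = set(categories)
    return [
        cp
        for cp in range(limit)
        if not 0xD800 <= cp <= 0xDFFF
        and unicodedata.category(chr(cp)) in wanted
    ]


def utf8_hex_escape(cp):
    """Render one code point as hex escapes of its UTF-8 bytes."""
    return "".join(f"\\x{byte:02X}" for byte in chr(cp).encode("utf-8"))


def build_lines(code_points, delimiter="|", max_line_length=80):
    """Join the escapes with `delimiter`, starting a new line whenever
    adding the next escape would exceed `max_line_length`."""
    lines, current = [], ""
    for cp in code_points:
        piece = utf8_hex_escape(cp)
        candidate = piece if not current else current + delimiter + piece
        if current and len(candidate) > max_line_length:
            lines.append(current)
            current = piece
        else:
            current = candidate
    if current:
        lines.append(current)
    return lines


if __name__ == "__main__":
    # Example: decimal digits (category Nd) in the Basic Latin block only.
    for line in build_lines(code_points_in_categories(["Nd"], limit=0x80)):
        print(line)
```

Emitting UTF-8 byte escapes is a design assumption here: byte-oriented lexer generators match input one byte at a time, so a code point above U+007F has to be written as its multi-byte sequence.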

To use

  1. Clone the repository.
  2. Customize the conditional statements in lines 50-56 so that they select the general categories you need expressions for.
  3. Edit the end of the script to print the generated expressions using the make_hex_string() function.
  4. Run the script with python unicode.py.
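As an illustration of the customization in step 2, a category filter might look something like the sketch below. The conditionals in lines 50-56 of unicode.py are not reproduced here, so the names WANTED and is_wanted, and the idea of accepting either a full category ("Nd") or a major class prefix ("L"), are assumptions made for this example only.

```python
import unicodedata

# Hypothetical selection: all letters (L*), decimal digits (Nd),
# and connector punctuation (Pc).
WANTED = ("L", "Nd", "Pc")


def is_wanted(char, wanted=WANTED):
    """Return True if the character's general category starts with
    any of the entries in `wanted`."""
    category = unicodedata.category(char)
    return category.startswith(tuple(wanted))


if __name__ == "__main__":
    for char in ["A", "7", "_", " ", "+"]:
        print(repr(char), unicodedata.category(char), is_wanted(char))
```

Changing the entries in WANTED plays the same role as editing the conditional statements: it decides which general categories end up in the generated expression.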

Languages

Python: 100.0%