Mateuszd6 / simd-lexing

A very fast SIMD (AVX2) accelerated lexer that does not rely on tables

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

A very fast single-header C99 AVX2-accelerated lexer that does not use any
tables, can be customized by manipulating macros and is on average 3-5 times
faster than corresponding flex lexer.

It should work out of a box, but it's more of a prove-of-concept right now.

Nice things about it:
* No code generation, include and call a function
* No tables, custom language, etc
* It's fast
* Can define you own callback (through macros) on each token for advanced usage
* Handles some tricky things, which some "example" flex files don't, like
  BACKSLASH-LF in string and single-line comments, CR-LF line endings etc

Examples comming soon, I hope

Speed for lexing the GCC source code (all GCC files concatenated into a single
file, from: https://people.csail.mit.edu/smcc/projects/single-file-programs/ ),
on my Skylake 2.40GHz i5 laptop. This only counts idents, strings etc, does not
parse integers or others (this is on the way, but the reference flex file does
not do that either):
|-------------+----------|
| Lexer:      | time(ms) |
|-------------+----------|
| flex        |      231 |
| flex --fast |      102 |
| flex --full |      106 |
| simd-lex    |       34 |
|-------------+----------|

About

A very fast SIMD (AVX2) accelerated lexer that does not rely on tables

License:zlib License


Languages

Language:C 94.8%Language:Makefile 2.4%Language:Shell 2.0%Language:Batchfile 0.7%