skvadrik / re2c

Lexer generator for C, C++, Go and Rust.

Home Page: https://re2c.org

Stack overflow due to recursion in src/dfa/dead_rules.cc

Me19m4 opened this issue.

Operating System Version: Ubuntu 20.04

re2c version: 2.2

Error function: re2c::backprop

==9992==ERROR: AddressSanitizer: stack-overflow on address 0x7ffdf3f83ff8 (pc 0x00000066f8e0 bp 0x000000135534 sp 0x7ffdf3f84000 T0)
#0 0x66f8e0 in re2c::backprop(re2c::rdfa_t const&, bool*, unsigned long, unsigned long) re2c/src/dfa/dead_rules.cc:149:9
#1 0x66f8e4 in re2c::backprop(re2c::rdfa_t const&, bool*, unsigned long, unsigned long) re2c/src/dfa/dead_rules.cc:149:9
#2 0x66f8e4 in re2c::backprop(re2c::rdfa_t const&, bool*, unsigned long, unsigned long) re2c/src/dfa/dead_rules.cc:149:9
#3 0x66f8e4 in re2c::backprop(re2c::rdfa_t const&, bool*, unsigned long, unsigned long) re2c/src/dfa/dead_rules.cc:149:9
Omit.....
#245 0x66f8e4 in re2c::backprop(re2c::rdfa_t const&, bool*, unsigned long, unsigned long) re2c/src/dfa/dead_rules.cc:149:9
#246 0x66f8e4 in re2c::backprop(re2c::rdfa_t const&, bool*, unsigned long, unsigned long) re2c/src/dfa/dead_rules.cc:149:9
#247 0x66f8e4 in re2c::backprop(re2c::rdfa_t const&, bool*, unsigned long, unsigned long) re2c/src/dfa/dead_rules.cc:149:9
#248 0x66f8e4 in re2c::backprop(re2c::rdfa_t const&, bool*, unsigned long, unsigned long) re2c/src/dfa/dead_rules.cc:149:9
AddressSanitizer: stack-overflow re2c/src/dfa/dead_rules.cc:149:9 in re2c::backprop(re2c::rdfa_t const&, bool*, unsigned long, unsigned long)
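
For reference, the repeated frames show one function calling itself once per step of a traversal, so the call depth grows with the size of the automaton. A common way to remove that dependency is to replace the recursion with an explicit worklist. The sketch below is a generic illustration only: `ReverseGraph` and `mark_backward` are hypothetical names, not re2c's data structures, and it does not reproduce the actual code in dead_rules.cc.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical reverse graph: preds[s] lists the states that have a
// transition into state s. Every index is assumed to be < preds.size().
using ReverseGraph = std::vector<std::vector<std::size_t>>;

// Marks every state from which `start` can be reached, walking the
// transitions backward. The pending states live in a heap-allocated
// vector, so the depth of the call stack stays constant no matter how
// long the chain of states is.
std::vector<bool> mark_backward(const ReverseGraph& preds, std::size_t start)
{
    std::vector<bool> live(preds.size(), false);
    if (start >= preds.size()) {
        return live; // nothing to mark for an out-of-range start state
    }

    std::vector<std::size_t> worklist;
    live[start] = true;
    worklist.push_back(start);

    while (!worklist.empty()) {
        const std::size_t s = worklist.back();
        worklist.pop_back();
        for (std::size_t p : preds[s]) {
            if (!live[p]) {
                live[p] = true; // mark before pushing, so each state is queued once
                worklist.push_back(p);
            }
        }
    }
    return live;
}
```

With this shape, a chain of several hundred thousand states only grows the worklist vector; it does not add stack frames.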

Test example link:

https://drive.google.com/file/d/1bLXgifNQhcTQI6937lJhapAa3hgwEugT/view?usp=sharing

Run the following command to reproduce the error:

$ ./re2c example

Thanks for the bug report. Did you find these examples with some kind of fuzzer?

There are two different places where re2c should enforce reasonable size limits:

  • NFA size and depth
  • DFA size and depth

Regular expressions themselves do not need a limit, because counted repetition is only unrolled when the NFA is constructed (so an RE cannot get much larger than its text representation in the source file). For the NFA and the DFA the limits should be enforced separately, because a very large NFA may result in a very small DFA (e.g. for something like (((""){0,100}){0,100}){0,100} the NFA will have about 100^3 states, but the DFA will have just one state). The opposite can also happen: a small NFA can trigger pathological exponential DFA size.
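
To make the 100^3 figure concrete, here is a rough sketch of how the size of an unrolled nested repetition could be estimated and capped before any states are built. It assumes that unrolling r{0,n} produces about n copies of r; the function name `estimate_unrolled_states` and the limit parameter are made up for illustration and are not taken from re2c.

```cpp
#include <cstdint>
#include <vector>

// Estimates how many NFA states result from unrolling nested counted
// repetitions, innermost first. `upper_bounds` holds the upper bound of
// each repetition level, e.g. {100, 100, 100} for three nested {0,100}.
// Returns limit + 1 as soon as the estimate would exceed `limit`, so the
// multiplication can never overflow (assumes limit < UINT64_MAX).
std::uint64_t estimate_unrolled_states(std::uint64_t inner_states,
                                       const std::vector<std::uint32_t>& upper_bounds,
                                       std::uint64_t limit)
{
    std::uint64_t states = inner_states;
    if (states > limit) {
        return limit + 1;
    }
    for (std::uint32_t n : upper_bounds) {
        // Treat {0,0} as a single copy to keep the estimate non-zero.
        const std::uint64_t copies = (n == 0) ? 1 : n;
        if (states > limit / copies) {
            return limit + 1; // states * copies would exceed the limit
        }
        states *= copies;
    }
    return states;
}
```

With one inner state and bounds {100, 100, 100}, the estimate reaches 1,000,000, which a caller with a smaller limit (say 100,000) would reject before unrolling anything.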

In the second test case re2c should check that the lower repetition bound is less than or equal to the upper bound.
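
Such a check is small. The sketch below shows one possible shape for it, assuming the parser has already read both bounds as unsigned integers; `BoundsCheck` and `check_repeat_bounds` are hypothetical names, and the error text is not re2c's actual diagnostic.

```cpp
#include <cstdint>
#include <string>

// Result of validating a counted repetition r{lower,upper}.
struct BoundsCheck {
    bool ok;
    std::string message; // empty when ok is true
};

// Rejects repetitions whose lower bound is greater than the upper bound,
// such as r{5,2}, before they reach NFA construction.
BoundsCheck check_repeat_bounds(std::uint32_t lower, std::uint32_t upper)
{
    if (lower > upper) {
        return {false,
                "repetition lower bound " + std::to_string(lower) +
                    " exceeds upper bound " + std::to_string(upper)};
    }
    return {true, ""};
}
```

A caller would report `message` as a parse error and stop, rather than passing the inverted bounds on to later stages.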

Fixed in commits a3473fd and 039c189.

CVE-2022-23901 was assigned to this.

NVD also rated it as a Critical CVE:
https://nvd.nist.gov/vuln/detail/CVE-2022-23901

I didn't have any involvement in this assignment; I'm just posting here for reference.

This bug has been fixed (see above for details). The fix was released in re2c-3.1.