Parse errors on perl-ish negative lookahead regex syntax in routes-small.json, routes-large.json
hlein opened this issue
The routes-small.json and routes-large.json files at https://wordlists-cdn.assetnote.io/rawdata/kiterunner/ trigger parse errors, visible when running kr scan ... -v debug, such as:
7:05PM DBG failed to generate regex string crumb value error="error parsing regexp: invalid or unsupported Perl syntax: `(?!`" name=owner regex=[a-z0-9](?:-(?!-)|[a-z0-9])+[a-z0-9]
7:05PM DBG failed to generate regex string crumb value error="error parsing regexp: invalid or unsupported Perl syntax: `(?!`" name=id regex=[a-z0-9](?:-(?!-)|[a-z0-9])+[a-z0-9]
7:05PM DBG failed to generate regex string crumb value error="error parsing regexp: invalid or unsupported Perl syntax: `(?!`" name=streamId regex=[a-z0-9](?:-(?!-)|[a-z0-9]){1,93}[a-z0-9]
7:05PM DBG failed to generate regex string crumb value error="error parsing regexp: invalid or unsupported Perl syntax: `(?!`" name=owner regex=[a-z0-9](?:-(?!-)|[a-z0-9])+[a-z0-9]
...
Those regexes seem perfectly fine; these simple tests do what I think they should:
$ for A in foo f-oo f--oo ; do echo "## $A" ; echo $A | grep -P '[a-z0-9](?:-(?!-)|[a-z0-9])+[a-z0-9]' ; echo $A | perl -ne 'print if /[a-z0-9](?:-(?!-)|[a-z0-9])+[a-z0-9]/' ; done
## foo
foo
foo
## f-oo
f-oo
f-oo
## f--oo
$
...But I think those errors are being emitted by binaryregexp, which, although it has Perl / PerlX flags to turn on various Perl-regex-like features, apparently does not support negative lookahead, i.e. the (?! construct.
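To illustrate, here is a minimal sketch that compiles one of the logged patterns with Go's standard regexp package. It assumes binaryregexp shares the standard package's RE2-style parser, which I have not verified against the binaryregexp source; the helper name compileErr is made up for this example.

```go
package main

import (
	"fmt"
	"regexp"
)

// One of the patterns from the debug log (name=owner).
const ownerPattern = `[a-z0-9](?:-(?!-)|[a-z0-9])+[a-z0-9]`

// compileErr (hypothetical helper) returns the error, if any, from compiling
// pattern with Go's standard regexp package. regexp.Compile already parses
// with the syntax.Perl flags, so there is no extra flag to switch on here.
func compileErr(pattern string) error {
	_, err := regexp.Compile(pattern)
	return err
}

func main() {
	// The lookahead pattern is rejected at compile time.
	fmt.Println("lookahead pattern:", compileErr(ownerPattern))
	// A pattern without lookaround compiles fine.
	fmt.Println("plain pattern:    ", compileErr(`[a-z0-9]+`))
}
```

If the assumption holds, this would explain why the Perl flags make no difference: the RE2 syntax leaves out lookaround entirely.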
It's not clear to me whether this bug belongs here or in https://github.com/assetnote/wordlists. Given that those routes files exist solely for kiterunner, either they ought to stick to syntax that kiterunner supports, or kiterunner ought to switch to a regex library that supports negative lookahead, such as go-pcre.
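As a rough sketch of the first option, the unbounded pattern can be written without lookahead. The rewrite below is my own and not taken from either repository. It assumes the intent is "starts and ends with [a-z0-9], at least three characters, no two hyphens in a row". It does not cover the bounded {1,93} variant, and I have not checked how kiterunner would generate strings from it.

```go
package main

import (
	"fmt"
	"regexp"
)

// Hypothetical lookahead-free rewrite of
//   [a-z0-9](?:-(?!-)|[a-z0-9])+[a-z0-9]
// The first group forces at least three characters in total; after that,
// every hyphen must be followed directly by an alphanumeric character.
const rewritten = `[a-z0-9](?:-[a-z0-9]|[a-z0-9]-?[a-z0-9])(?:-?[a-z0-9])*`

var re = regexp.MustCompile(rewritten)

// matches reports whether s contains a match, like the unanchored
// grep -P test earlier in this issue.
func matches(s string) bool {
	return re.MatchString(s)
}

func main() {
	for _, s := range []string{"foo", "f-oo", "f--oo"} {
		fmt.Printf("%s: %v\n", s, matches(s))
	}
}
```

On the three inputs from the grep/perl test above, this matches foo and f-oo and rejects f--oo, the same as the original pattern.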