assetnote / kiterunner

Contextual Content Discovery Tool

Parse errors on perl-ish negative lookahead regex syntax in routes-small.json, routes-large.json

hlein opened this issue

The routes-small.json and routes-large.json files at https://wordlists-cdn.assetnote.io/rawdata/kiterunner/ trigger parse errors, visible when running kr scan ... -v debug, such as:

7:05PM DBG failed to generate regex string crumb value error="error parsing regexp: invalid or unsupported Perl syntax: `(?!`" name=owner regex=[a-z0-9](?:-(?!-)|[a-z0-9])+[a-z0-9]
7:05PM DBG failed to generate regex string crumb value error="error parsing regexp: invalid or unsupported Perl syntax: `(?!`" name=id regex=[a-z0-9](?:-(?!-)|[a-z0-9])+[a-z0-9]
7:05PM DBG failed to generate regex string crumb value error="error parsing regexp: invalid or unsupported Perl syntax: `(?!`" name=streamId regex=[a-z0-9](?:-(?!-)|[a-z0-9]){1,93}[a-z0-9]
7:05PM DBG failed to generate regex string crumb value error="error parsing regexp: invalid or unsupported Perl syntax: `(?!`" name=owner regex=[a-z0-9](?:-(?!-)|[a-z0-9])+[a-z0-9]
...

Those regexes seem perfectly fine; these simple tests do what I think they should:

$ for A in foo f-oo f--oo ; do echo "## $A" ; echo $A | grep -P '[a-z0-9](?:-(?!-)|[a-z0-9])+[a-z0-9]' ; echo $A | perl -ne 'print if /[a-z0-9](?:-(?!-)|[a-z0-9])+[a-z0-9]/' ; done
## foo
foo
foo
## f-oo
f-oo
f-oo
## f--oo
$

...But I think those errors are being emitted by binaryregexp. Although it has Perl / PerlX flags to turn on various Perl-like regex features, it apparently does not support negative lookahead, i.e. the (?!...) construct.
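To illustrate, here is a minimal sketch using Go's standard regexp package rather than binaryregexp itself. That binaryregexp shares the same RE2-style parser is my assumption, based on the identical error text in the log above. The helper name lookaheadCompileError is made up for this example.

```go
package main

import (
	"fmt"
	"regexp"
)

// ownerPattern is the "owner" regex copied from the debug log above.
const ownerPattern = `[a-z0-9](?:-(?!-)|[a-z0-9])+[a-z0-9]`

// lookaheadCompileError (hypothetical helper) returns the error reported by
// Go's standard regexp package for p, or nil if p compiles.
// Assumption: binaryregexp rejects the same syntax for the same reason.
func lookaheadCompileError(p string) error {
	_, err := regexp.Compile(p)
	return err
}

func main() {
	if err := lookaheadCompileError(ownerPattern); err != nil {
		fmt.Println("compile failed:", err)
		return
	}
	fmt.Println("compiled OK")
}
```

RE2-style engines leave out lookaround on purpose so that matching stays linear-time, so I would not expect a flag to enable it.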

It's not clear to me whether this bug belongs here or in https://github.com/assetnote/wordlists. Given that those routes files exist solely for kiterunner, either they ought to stick to syntax that kiterunner supports, or kiterunner ought to switch to a regex library that supports negative lookahead, maybe go-pcre.
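If the fix ends up on the wordlist side, the lookahead can be avoided for this particular pattern. Below is a rough sketch of a rewrite that I believe accepts the same strings (lowercase alphanumerics, at least three characters, no leading or trailing hyphen, no double hyphen). It is my own suggestion, not something taken from kiterunner or the wordlists repo, and it does not cover the {1,93} length-bounded variant. The names lookaheadFree and matchesOwner are made up for this example.

```go
package main

import (
	"fmt"
	"regexp"
)

// lookaheadFree is a hypothetical RE2-compatible stand-in for
//   [a-z0-9](?:-(?!-)|[a-z0-9])+[a-z0-9]
// Every hyphen is forced to be followed by an alphanumeric, which rules out
// "--" without needing a lookahead. The second group keeps the minimum
// length at three characters, as in the original.
var lookaheadFree = regexp.MustCompile(
	`[a-z0-9](?:-[a-z0-9]|[a-z0-9]-?[a-z0-9])(?:-?[a-z0-9])*`)

// matchesOwner reports whether s contains a match, mirroring the unanchored
// grep -P / perl tests earlier in this issue.
func matchesOwner(s string) bool {
	return lookaheadFree.MatchString(s)
}

func main() {
	// Same three inputs as the shell loop above.
	for _, s := range []string{"foo", "f-oo", "f--oo"} {
		fmt.Printf("%-6s %v\n", s, matchesOwner(s))
	}
}
```

On the three inputs from the shell test this should agree with grep -P and perl: foo and f-oo match, f--oo does not. Other lookahead patterns in the routes files would each need their own rewrite, and some may not have one.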