Use expression character class instead of lookaround
RunDevelopment opened this issue · comments
Motivation
Due to the lack of set operations for character classes, I wrote regexes like (?!\s)[\w\P{ASCII}] in the past (another example). They can be transformed to use set operations instead. Using set operations not only makes the intent clearer, it is also more performant. I ran some performance tests a while back, and (?!\s)[\w\P{ASCII}] was about 4x slower than [\w...whatever ranges are equivalent].
Description
Combine single-character lookarounds with character classes in regexes with the v flag. E.g. (?=[x])[y] => [y&&x] and (?![x])[y] => [y--x]. Regexes without the v flag should not be affected.
If the element after the lookaround is a character set, then the transformation should still be applied. E.g. (?!\s)\P{ASCII} => [\P{ASCII}--\s].
Examples
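// good (no v flag, so these are left alone)
/(?!\s)[\w\x80-\uFFFF]/;
/(?!\s)[\w\P{ASCII}]/u;
// bad (v flag with a single-character lookaround)
const bad = /(?!\s)[\w\P{ASCII}]/v;
// fixed: the lookaround becomes a set subtraction
const good = /[[\w\P{ASCII}]--\s]/v;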