t/perl/regexp.t fails with PCRE2 10.32-RC1
ppisar opened this issue
Petr Pisar commented
PCRE2 has a release candidate for 10.32 and these t/perl/regexp.t tests fail with it:
$ perl -Iblib/{arch,lib} t/perl/regexp.t 1 t/perl/re_tests 1443 1444 1960 1961
1..4
# 1 iterations
not ok 1 () /\N{U+41}\x{c1}/i:a\x{e1}:y:$&:a\x{e1} => `/', match=
$subject = "a\341";
$got = "/";
;
$match = ($subject =~ m/\N{U+41}\x{c1}/i) while $c--;
$got = "$&";
not ok 2 () /[\N{U+41}\x{c1}]/i:\x{e1}:y:$&:\x{e1} => `/', match=
$subject = "\341";
$got = "/";
;
$match = ($subject =~ m/[\N{U+41}\x{c1}]/i) while $c--;
$got = "$&";
not ok 3 () foo(*ACCEPT:foo):foo:y:$::REGMARK:foo => `', match=1
$subject = "foo";
$got = "";
;
$match = ($subject =~ m'foo(*ACCEPT:foo)') while $c--;
$got = "$::REGMARK";
not ok 4 () (foo(*ACCEPT:foo)):foo:y:$::REGMARK:foo => `', match=1
$subject = "foo";
$got = "";
;
$match = ($subject =~ m'(foo(*ACCEPT:foo))') while $c--;
$got = "$::REGMARK";
This may be caused by these new PCRE2 features:
27. (*ACCEPT:ARG), (*FAIL:ARG), and (*COMMIT:ARG) are now supported.
29. Add support for \N{U+dddd}, but not in EBCDIC environments.
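For comparison, here is a minimal sketch of the same four patterns run against Perl's built-in regex engine, with no `use re::engine::PCRE2` in effect. The variable names (`$t1` .. `$t4`, `$mark3`, `$mark4`) are made up for illustration. The re_tests entries above expect all four matches to succeed and `$::REGMARK` to be `foo` for the last two. The mark value may depend on the Perl version, so the script prints it rather than assuming it.

```perl
use strict;
use warnings;

# Tests 1 and 2: the subject is a plain byte string, but \N{U+41} in the
# pattern makes Perl match with Unicode rules, so under /i the pattern's
# \x{c1} is expected to match the subject's \x{e1}.
my $t1 = ("a\x{e1}" =~ /\N{U+41}\x{c1}/i)   ? 1 : 0;
my $t2 = ("\x{e1}"  =~ /[\N{U+41}\x{c1}]/i) ? 1 : 0;

# Tests 3 and 4: the match itself succeeds; re_tests additionally expects
# $::REGMARK to hold the (*ACCEPT:foo) argument afterwards.
our $REGMARK;
my $t3    = ("foo" =~ m'foo(*ACCEPT:foo)') ? 1 : 0;
my $mark3 = defined $REGMARK ? $REGMARK : '(undef)';
my $t4    = ("foo" =~ m'(foo(*ACCEPT:foo))') ? 1 : 0;
my $mark4 = defined $REGMARK ? $REGMARK : '(undef)';

printf "test 1 matched: %d\n", $t1;
printf "test 2 matched: %d\n", $t2;
printf "test 3 matched: %d, REGMARK: %s\n", $t3, $mark3;
printf "test 4 matched: %d, REGMARK: %s\n", $t4, $mark4;
```

Running the same script with the PCRE2 engine enabled should show which of the four results differ from core Perl.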
Petr Pisar commented
I confirm that the failures are triggered by these new features, which were introduced in the following PCRE2 commits:
commit 1ad8a5e6add80b53753a4b78589ff41fc58dad18
Author: ph10 <ph10@6239d852-aaf2-0410-a92c-79f79f948069>
Date: Sat Jul 21 14:34:51 2018 +0000
Allow :NAME on (*ACCEPT), (*FAIL), and (*COMMIT) and fix bug with (*MARK)
followed by (*ACCEPT) in an assertion. More small updates to perltest.sh.
git-svn-id: svn://vcs.exim.org/pcre2/code/trunk@968 6239d852-aaf2-0410-a92c-79f79f948069
and
commit f0921f962e383718a302729151ee21860b419d79
Author: ph10 <ph10@6239d852-aaf2-0410-a92c-79f79f948069>
Date: Fri Jul 27 16:30:40 2018 +0000
Add support for \N{U+dd...}, for ASCII and Unicode modes only.
git-svn-id: svn://vcs.exim.org/pcre2/code/trunk@972 6239d852-aaf2-0410-a92c-79f79f948069
Reini Urban commented
Thanks, confirmed
Reini Urban commented
I've filed two PCRE2 bugs for these: https://bugs.exim.org/show_bug.cgi?id=2306 and
https://bugs.exim.org/show_bug.cgi?id=2305.
2305 is clearly a PCRE2 regression; 2306 also looks like a PCRE2 bug to me.
Reini Urban commented
For the upcoming 0.15 release, I added specializations to the test cases where pcre2 deviates from perl5:
fixup for libpcre2 >= 10.32 unicode semantic changes:
- Allow :NAME on (*ACCEPT), (*FAIL), and (*COMMIT) and fix bug with (*MARK)
  followed by (*ACCEPT) in an assertion.
- Add support for \N{U+dd...}, for ASCII and Unicode modes only.
Caused unicode regression https://bugs.exim.org/show_bug.cgi?id=2305
(need to observe unicode folding rules for \N{U+NNNN} chars)
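The folding rule in question can be checked in plain Perl. This is a small sketch using the core `fc` function (Perl 5.16 or later) and the variable names `$upper` and `$lower` are only illustrative. U+00C1 and U+00E1 are a case pair under Unicode rules, which is why a case-insensitive match of `\x{c1}` against `\x{e1}` is expected to succeed once `\N{U+41}` has switched the pattern to Unicode semantics.

```perl
use strict;
use warnings;
use feature qw(fc unicode_strings);

# U+00C1 (LATIN CAPITAL LETTER A WITH ACUTE) and
# U+00E1 (LATIN SMALL LETTER A WITH ACUTE).
my $upper = "\x{c1}";
my $lower = "\x{e1}";

# With Unicode string semantics, both characters have the same case-folded
# form, so a /i match between them should succeed.
my $same_fold = (fc($upper) eq fc($lower)) ? 1 : 0;

printf "fc(U+00C1) eq fc(U+00E1): %d\n", $same_fold;
```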