t/perl/regexp.t fails with PCRE2 10.32-RC1 #31

Closed
ppisar opened this issue Aug 31, 2018 · 4 comments

ppisar commented Aug 31, 2018

PCRE2 has a release candidate for 10.32 and these t/perl/regexp.t tests fail with it:

$ perl -Iblib/{arch,lib} t/perl/regexp.t 1 t/perl/re_tests 1443 1444 1960 1961 
1..4
# 1 iterations
not ok 1 () /\N{U+41}\x{c1}/i:a\x{e1}:y:$&:a\x{e1} => `/', match=
$subject = "a\341";

$got = "/";

                ;
                $match = ($subject =~ m/\N{U+41}\x{c1}/i) while $c--;
                $got = "$&";

not ok 2 () /[\N{U+41}\x{c1}]/i:\x{e1}:y:$&:\x{e1} => `/', match=
$subject = "\341";

$got = "/";

                ;
                $match = ($subject =~ m/[\N{U+41}\x{c1}]/i) while $c--;
                $got = "$&";

not ok 3 () foo(*ACCEPT:foo):foo:y:$::REGMARK:foo => `', match=1
$subject = "foo";

$got = "";

                ;
                $match = ($subject =~ m'foo(*ACCEPT:foo)') while $c--;
                $got = "$::REGMARK";

not ok 4 () (foo(*ACCEPT:foo)):foo:y:$::REGMARK:foo => `', match=1
$subject = "foo";

$got = "";

                ;
                $match = ($subject =~ m'(foo(*ACCEPT:foo))') while $c--;
                $got = "$::REGMARK";

This may be caused by these new PCRE2 features:

27. (*ACCEPT:ARG), (*FAIL:ARG), and (*COMMIT:ARG) are now supported.

29. Add support for \N{U+dddd}, but not in EBCDIC environments.

ppisar commented Aug 31, 2018

I confirm that the failures are triggered by these new features, which were introduced in the following PCRE2 commits:

commit 1ad8a5e6add80b53753a4b78589ff41fc58dad18
Author: ph10 <ph10@6239d852-aaf2-0410-a92c-79f79f948069>
Date:   Sat Jul 21 14:34:51 2018 +0000

    Allow :NAME on (*ACCEPT), (*FAIL), and (*COMMIT) and fix bug with (*MARK)
    followed by (*ACCEPT) in an assertion. More small updates to perltest.sh.
    
    
    git-svn-id: svn:https://vcs.exim.org/pcre2/code/trunk@968 6239d852-aaf2-0410-a92c-79f79f948069

and

commit f0921f962e383718a302729151ee21860b419d79
Author: ph10 <ph10@6239d852-aaf2-0410-a92c-79f79f948069>
Date:   Fri Jul 27 16:30:40 2018 +0000

    Add support for \N{U+dd...}, for ASCII and Unicode modes only.
    
    
    git-svn-id: svn:https://vcs.exim.org/pcre2/code/trunk@972 6239d852-aaf2-0410-a92c-79f79f948069

@rurban rurban self-assigned this Sep 1, 2018
@rurban rurban added the bug label Sep 1, 2018

rurban commented Sep 1, 2018

Thanks, confirmed

rurban added a commit that referenced this issue Sep 1, 2018
See [GH #31], thank to Petr Pisar.
Need to check how to map these 2 new features:

Allow :NAME on (*ACCEPT), (*FAIL), and (*COMMIT) and fix bug with (*MARK)
followed by (*ACCEPT) in an assertion.

Add support for \N{U+dd...}, for ASCII and Unicode modes only.

rurban commented Sep 1, 2018

I've filed two PCRE2 bugs for these: https://bugs.exim.org/show_bug.cgi?id=2305 and https://bugs.exim.org/show_bug.cgi?id=2306.
2305 is clearly a PCRE2 regression; 2306 also looks like a PCRE2 bug to me.

rurban added a commit that referenced this issue Sep 3, 2018
See [GH #31], thank to Petr Pisar.
Need to check how to map these 2 new features:

Allow :NAME on (*ACCEPT), (*FAIL), and (*COMMIT) and fix bug with (*MARK)
followed by (*ACCEPT) in an assertion.

Add support for \N{U+dd...}, for ASCII and Unicode modes only.
Caused unicode regression https://bugs.exim.org/show_bug.cgi?id=2305
(need to observe unicode folding rules for \N{U+NNNN} chars)

rurban commented Apr 8, 2019

For the upcoming 0.15 release, I added specializations to the test cases where pcre2 deviates from perl5.

fixup for libpcre2 >= 10.32 unicode semantic changes:

  • Allow :NAME on (*ACCEPT), (*FAIL), and (*COMMIT) and fix bug with (*MARK)
    followed by (*ACCEPT) in an assertion.
  • Add support for \N{U+dd...}, for ASCII and Unicode modes only.
    Caused unicode regression https://bugs.exim.org/show_bug.cgi?id=2305
    (need to observe unicode folding rules for \N{U+NNNN} chars)
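
The folding rule mentioned above can be sketched outside of either engine. The snippet below is only an illustration, and it assumes Python's `re` module as a neutral stand-in: it is neither Perl's regex engine nor PCRE2, and it writes the code points as plain string escapes because Python does not accept the `\N{U+41}` syntax. It shows the result that tests 1 and 2 of the report expect, namely that under case-insensitive matching U+00C1 matches U+00E1.

```python
import re

# Test 1 from the report: the pattern is U+0041 followed by U+00C1,
# the subject is "a" followed by U+00E1.
pattern_seq = "\u0041\u00c1"
subject_seq = "a\u00e1"
m1 = re.search(pattern_seq, subject_seq, re.IGNORECASE)

# Test 2 from the report: the same two code points inside a character
# class, matched against U+00E1 alone.
pattern_cls = "[\u0041\u00c1]"
subject_cls = "\u00e1"
m2 = re.search(pattern_cls, subject_cls, re.IGNORECASE)

# Both searches succeed when Unicode case folding is applied.
print(m1 is not None, m2 is not None)
```

An engine that treats a `\N{U+NNNN}` code point as an opaque literal, without applying Unicode folding to the rest of the pattern, would fail to match in both cases, which is consistent with the empty match reported for tests 1 and 2.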

@rurban rurban closed this as completed Apr 8, 2019