This is a live mirror of the Perl 5 development currently hosted at https://github.com/perl/perl5
[gh18096] assume worst-case for GOSUBs we don't analyse
author Hugo van der Sanden <hv@crypt.org>
Tue, 15 Sep 2020 13:02:54 +0000 (14:02 +0100)
committer Hugo van der Sanden <hv@crypt.org>
Tue, 22 Sep 2020 11:51:47 +0000 (12:51 +0100)
During study_chunk, under various conditions we avoid recursing into
a GOSUB. But we must avoid giving the enclosing scope the idea that
this GOSUB would match only an empty string, since that could trigger
wrong optimizations (e.g. CURLYX => CURLYM in the ticket).

So we mark the construct as infinite, as in the code branch where we
_do_ recurse into it.
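
The reasoning above can be illustrated with a toy model. This is only a sketch under stated assumptions, not the real study_chunk: the node kinds, the study_result struct, and the functions toy_study and looks_empty_only are all hypothetical names invented for illustration. The one thing it borrows from the commit is the idea that a skipped GOSUB adds nothing to the minimum length and therefore needs an "infinite" flag, so that an enclosing scope does not conclude the construct can only match the empty string.

```c
#include <stddef.h>

/* Hypothetical, heavily simplified node kinds; not the real regnode types. */
typedef enum { NODE_LITERAL, NODE_GOSUB } node_kind;

typedef struct {
    node_kind kind;
    size_t    len;      /* literal length; unused for NODE_GOSUB */
} node;

typedef struct {
    size_t min_len;     /* lower bound on the matched length */
    int    is_inf;      /* 1 if no finite upper bound is known */
} study_result;

/* Study a flat sequence of nodes without ever recursing into a GOSUB.
 * assume_worst_case == 0 models the behaviour before the fix,
 * assume_worst_case == 1 models the behaviour after it. */
study_result toy_study(const node *nodes, size_t count, int assume_worst_case)
{
    study_result res = { 0, 0 };
    size_t i;

    for (i = 0; i < count; i++) {
        if (nodes[i].kind == NODE_LITERAL) {
            res.min_len += nodes[i].len;
        } else {
            /* Unanalysed GOSUB: adding 0 to min_len is a safe lower
             * bound, but the upper bound is unknown, so flag it. */
            if (assume_worst_case)
                res.is_inf = 1;
        }
    }
    return res;
}

/* An enclosing scope in this toy model treats a chunk as "matches only
 * the empty string" when min_len is 0 and the length is bounded. */
int looks_empty_only(study_result r)
{
    return r.min_len == 0 && !r.is_inf;
}
```

For a chunk consisting of a single skipped GOSUB, toy_study with assume_worst_case set to 0 yields a result that looks_empty_only accepts, which is the kind of misleading signal the commit message describes. With assume_worst_case set to 1 the same chunk is flagged as unbounded and the check rejects it.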

regcomp.c
t/re/re_tests

index 124ea5b..fae3f80 100644
--- a/regcomp.c
+++ b/regcomp.c
@@ -5212,7 +5212,12 @@ S_study_chunk(pTHX_ RExC_state_t *pRExC_state, regnode **scanp,
                      * might result in a minlen of 1 and not of 4,
                      * but this doesn't make us mismatch, just try a bit
                      * harder than we should.
-                     * */
+                     *
+                     * However we must assume this GOSUB is infinite, to
+                     * avoid wrongly applying other optimizations in the
+                     * enclosing scope - see GH 18096, for example.
+                     */
+                    is_inf = is_inf_internal = 1;
                     scan= regnext(scan);
                     continue;
                 }
index 554a700..ab5a0d8 100644
--- a/t/re/re_tests
+++ b/t/re/re_tests
@@ -2023,6 +2023,8 @@ AB\s+\x{100}      AB \x{100}X     y       -       -
 /(?iaax:A? \K +)/      African_Feh     c       -       \\K + is forbidden - matches null string many times in regex
 /(?iaa:A?\K+)/ African_Feh     c       -       \\K+ is forbidden - matches null string many times in regex
 /(?iaa:A?\K*)/ African_Feh     c       -       \\K* is forbidden - matches null string many times in regex
+^((\w|<(\s)*(?1)(?3)*>)(?:(?3)*\+(?3)*(?2))*)(?3)*\+   a + b + <c + d> y       $1      a + b           # [GH #18096]
+^((\w|<(\s)*(?1)(?3)*>)(?:(?3)*\+(?3)*(?2))*)(?3)*\+   a + <b> + c     y       $1      a + <b>         # [GH #18096]
 # Keep these lines at the end of the file
 # pat  string  y/n/etc expr    expected-expr   skip-reason     comment
 # vim: softtabstop=0 noexpandtab