This is a live mirror of the Perl 5 development currently hosted at https://github.com/perl/perl5
fix 68564: /g failure with zero-width patterns
This is based on a patch by Father Chrysostomos <sprout@cpan.org>
The start class optimisation has two modes, "try every valid start
position" (doevery) and "flip flop mode" (!doevery) where it trys
only the first valid start position in a sequence.
Consider /(\d+)X/ and the string "123456Y", now we know that if we fail
to match X after matching "123456" then we will also fail to match after
"23456" (assuming no evil tricks are in place, which disable the
optimisation anyway), so we know we can skip forward until the check
/fails/ and only then start looking for a real match. This is flip-flop
mode.
Now consider the case with zero-width lookahead under /g: /(?=(\d+)X)/.
In this case we have an additional failure mode, that is failure when
we match a zero-width string twice at the same pos(). So now, the
"flip-flop" logic breaks as it /is/ possible that we could match at
"23456" when we couldn't match at "123456" because of the zero-length
twice at the same pos() rule. For instance:
print $1 for "123"=~/(?=(\d+))/g
should first match "123". Since $& is zero length, pos() is not
incremented. We then match again, successfully, except that the match
is rejected despite technical-success because its $& is also zero
length and pos() has not advanced. If the flip-flop mode is enabled
we wont retry until we find a failing character first.
The point here is that it makes perfect sense to disable the
"flip-flop" mode optimisation when the start class is inside
a lookahead as it really doesnt apply.