This is a live mirror of the Perl 5 development currently hosted at https://github.com/perl/perl5
re_op_compile: recalc code indexes on utf8 upgrade
author David Mitchell <davem@iabyn.com>
Fri, 18 Nov 2011 12:37:59 +0000 (12:37 +0000)
committer David Mitchell <davem@iabyn.com>
Wed, 13 Jun 2012 12:25:53 +0000 (13:25 +0100)
commit 2bd8e0da284e556e0ebae220a2fa52570cd96ca3
tree b64c881eebed9be5199023ce8aecb91215d76ca5
parent f5cf2abdb19e03f24bd768767fd145f15f076d40

As part of the compilation, we calculate the start and end positions
of the text of each literal code block within the pattern string.

The "if the pattern unexpectedly gets upgraded to UTF8, longjmp and
restart the compilation" mechanism means that these indices can become
invalid, so if this happens, recalculate them. We do this by unrolling a
call to Perl_bytes_to_utf8(), so that the inlined copy updates the indices
at the same time that it updates the string.
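
As an illustration only (this is not the code in regcomp.c), the idea of
converting and re-indexing in a single pass can be sketched as below. The
struct code_block type, the upgrade_and_reindex() name and the convention
that 'end' is the offset one past the block are all assumptions made for
this sketch; the input is assumed to be Latin-1 bytes, and the blocks are
assumed to be sorted and non-overlapping.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical record: byte offsets of one literal code block. */
struct code_block {
    size_t start;   /* offset of the first byte of the block */
    size_t end;     /* offset one past the last byte (assumed convention) */
};

/*
 * Convert a Latin-1 buffer to UTF-8 and, in the same pass, rewrite each
 * block's offsets so they index into the converted buffer.
 * Returns a malloc'd, NUL-terminated buffer (caller frees), or NULL on
 * allocation failure. *newlen receives the converted length in bytes.
 */
static unsigned char *
upgrade_and_reindex(const unsigned char *src, size_t len,
                    struct code_block *blocks, size_t nblocks,
                    size_t *newlen)
{
    /* Worst case: every input byte expands to two output bytes. */
    unsigned char *dst = malloc(len * 2 + 1);
    size_t s, d = 0, n = 0;
    int do_end = 0;     /* 0: waiting for a start, 1: waiting for an end */

    if (!dst)
        return NULL;

    for (s = 0; s <= len; s++) {
        /* Remap every block boundary that falls at source offset s. */
        while (n < nblocks) {
            if (!do_end && blocks[n].start == s) {
                blocks[n].start = d;
                do_end = 1;
            }
            if (do_end && blocks[n].end == s) {
                blocks[n].end = d;
                do_end = 0;
                n++;
                continue;   /* the next block may also start at s */
            }
            break;
        }
        if (s == len)
            break;

        if (src[s] < 0x80) {
            dst[d++] = src[s];                      /* ASCII: unchanged */
        } else {
            dst[d++] = 0xC0 | (src[s] >> 6);        /* lead byte */
            dst[d++] = 0x80 | (src[s] & 0x3F);      /* continuation byte */
        }
    }
    dst[d] = '\0';
    *newlen = d;
    return dst;
}
```

For example, with the 9-byte input 'a', 0xE9, "(?{1})", 0xE9 and one block
at offsets 2..8, the converted string is 11 bytes long and the block moves
to offsets 3..9, because the single high byte before it grew by one byte.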

Note that some of the new TODO tests are actually passing, but for
the wrong reason. They're supposed to test for forced recompilation of
non-literal code blocks even if the pattern string hasn't changed (which I
haven't implemented yet), but instead they're passing because the "don't
recompile if strings match" check isn't UTF8-aware. I'll fix this
(pre-existing) bug in the next commit.
regcomp.c
t/re/pat_re_eval.t