From 4044502721ac7b89c6d21cf1099a3a518717eeba Mon Sep 17 00:00:00 2001
From: David Mitchell
Date: Wed, 24 Jul 2013 15:20:22 +0100
Subject: [PATCH] perlvar.pod: add a separate section on $& et al

Add a new separate section explaining the performance issues of
$`, $& and $'; plus descriptions of the various workarounds like
@-, /p and COW, and which perl version they were each introduced in.

Then in the entries for each individual var, strip out any commentary
about performance, and just include a link to the new performance section.
---
 pod/perlvar.pod | 86 ++++++++++++++++++++++++++++++++++++++-------------------
 1 file changed, 58 insertions(+), 28 deletions(-)

diff --git a/pod/perlvar.pod b/pod/perlvar.pod
index a278d10..4d869f1 100644
--- a/pod/perlvar.pod
+++ b/pod/perlvar.pod
@@ -801,16 +801,51 @@ we have not made another match:
     $1 is Mutt; $2 is Jeff
     $1 is Wallace; $2 is Grommit
 
-The C and C
-modules can help you find uses of these
-problematic match variables in your code.
+=head3 Performance issues
 
-Since Perl v5.10.0, you can use the C</p> match operator flag and the
-C<${^PREMATCH}>, C<${^MATCH}>, and C<${^POSTMATCH}> variables instead
-so you only suffer the performance penalties.
+Traditionally in Perl, any use of any of the three variables C<$`>, C<$&>
+or C<$'> (or their C equivalents) anywhere in the code, caused
+all subsequent successful pattern matches to make a copy of the matched
+string, in case the code might subsequently access one of those variables.
+This imposed a considerable performance penalty across the whole program,
+so generally the use of these variables has been discouraged.
 
-If you are using Perl v5.20.0 or higher, you do not need to worry about
-this, as the three naughty variables are no longer naughty.
+In Perl 5.6.0 the C<@-> and C<@+> dynamic arrays were introduced that
+supply the indices of successful matches. So you could for example do
+this:
+
+    $str =~ /pattern/;
+
+    print $`, $&, $'; # bad: performance hit
+
+    print             # good: no performance hit
+        substr($str, 0, $-[0]),
+        substr($str, $-[0], $+[0]-$-[0]),
+        substr($str, $+[0]);
+
+In Perl 5.10.0 the C</p> match operator flag and the C<${^PREMATCH}>,
+C<${^MATCH}>, and C<${^POSTMATCH}> variables were introduced, that allowed
+you to suffer the penalties only on patterns marked with C</p>.
+
+In Perl 5.18.0 onwards, perl started noting the presence of each of the
+three variables separately, and only copied that part of the string
+required; so in
+
+    $`; $&; "abcdefgh" =~ /d/
+
+perl would only copy the "abcd" part of the string. That could make a big
+difference in something like
+
+    $str = 'x' x 1_000_000;
+    $&; # whoops
+    $str =~ /x/g # one char copied a million times, not a million chars
+
+In Perl 5.20.0 a new copy-on-write system was enabled by default, which
+finally fixes all performance issues with these three variables, and makes
+them safe to use anywhere.
+
+The C and C modules can help you
+find uses of these problematic match variables in your code.
 
 =over 8
 
@@ -834,12 +869,8 @@ The string matched by the last successful pattern match (not counting
 any matches hidden within a BLOCK or C enclosed by the current
 BLOCK).
 
-In Perl v5.18 and earlier, the use of this variable
-anywhere in a program imposes a considerable
-performance penalty on all regular expression matches. To avoid this
-penalty, you can extract the same substring by using L. Starting
-with Perl v5.10.0, you can use the C</p> match flag and the C<${^MATCH}>
-variable to do the same thing for particular match operations.
+See L</Performance issues> above for the serious performance implications
+of using this variable (even once) in your code.
 
 This variable is read-only and dynamically-scoped.
 
@@ -850,6 +881,9 @@ X<${^MATCH}>
 
 This is similar to C<$&> (C<$MATCH>) except that it does not incur the
 performance penalty associated with that variable.
+
+See L</Performance issues> above.
+
 In Perl v5.18 and earlier, it is only guaranteed
 to return a defined value when the pattern was compiled or executed with
 the C</p> modifier. In Perl v5.20, the C</p> modifier does nothing, so
@@ -868,13 +902,8 @@ The string preceding whatever was matched by the last successful
 pattern match, not counting any matches hidden within a BLOCK or C
 enclosed by the current BLOCK.
 
-In Perl v5.18 and earlier, the use of this variable
-anywhere in a program imposes a considerable
-performance penalty on all regular expression matches. To avoid this
-penalty, you can extract the same substring by using L. Starting
-with Perl v5.10.0, you can use the C</p> match flag and the
-C<${^PREMATCH}> variable to do the same thing for particular match
-operations.
+See L</Performance issues> above for the serious performance implications
+of using this variable (even once) in your code.
 
 This variable is read-only and dynamically-scoped.
 
@@ -885,6 +914,9 @@ X<$`> X<${^PREMATCH}>
 
 This is similar to C<$`> ($PREMATCH) except that it does not incur the
 performance penalty associated with that variable.
+
+See L</Performance issues> above.
+
 In Perl v5.18 and earlier, it is only guaranteed
 to return a defined value when the pattern was compiled or executed with
 the C</p> modifier. In Perl v5.20, the C</p> modifier does nothing, so
@@ -907,13 +939,8 @@ enclosed by the current BLOCK). Example:
     /def/;
     print "$`:$&:$'\n"; # prints abc:def:ghi
 
-In Perl v5.18 and earlier, the use of this variable
-anywhere in a program imposes a considerable
-performance penalty on all regular expression matches.
-To avoid this penalty, you can extract the same substring by
-using L. Starting with Perl v5.10.0, you can use the C</p> match flag
-and the C<${^POSTMATCH}> variable to do the same thing for particular
-match operations.
+See L</Performance issues> above for the serious performance implications
+of using this variable (even once) in your code.
 
 This variable is read-only and dynamically-scoped.
 
@@ -924,6 +951,9 @@ X<${^POSTMATCH}> X<$'> X<$POSTMATCH>
 
 This is similar to C<$'> (C<$POSTMATCH>) except that it does not incur the
 performance penalty associated with that variable.
+
+See L</Performance issues> above.
+
 In Perl v5.18 and earlier, it is only guaranteed
 to return a defined value when the pattern was compiled or executed with
 the C</p> modifier. In Perl v5.20, the C</p> modifier does nothing, so
-- 
1.8.3.1
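
A small standalone script may help when reviewing the two workarounds the new section describes. It is only a sketch and not part of the patch: the sample string and the variable names @by_offsets and @by_p_flag are illustrative assumptions. It extracts the prematch, match and postmatch first from the @- and @+ offsets and then from the /p variables, so the two results can be compared. It assumes Perl 5.10.0 or later, since /p and the ${^MATCH} family do not exist before that.

```perl
use strict;
use warnings;

# Illustrative sample data; any string and pattern would do.
my $str = 'abcdefghi';

my (@by_offsets, @by_p_flag);

if ($str =~ /def/p) {
    # $-[0] is the start offset of the whole match, $+[0] its end offset.
    @by_offsets = (
        substr($str, 0, $-[0]),                # text before the match
        substr($str, $-[0], $+[0] - $-[0]),    # the match itself
        substr($str, $+[0]),                   # text after the match
    );

    # With /p on the pattern, these variables hold the same three pieces.
    @by_p_flag = (${^PREMATCH}, ${^MATCH}, ${^POSTMATCH});
}

print join(':', @by_offsets), "\n";
print join(':', @by_p_flag),  "\n";
```

Both print statements should produce the same line. Neither approach mentions $`, $& or $' anywhere, so on the older perls discussed in the patch the program-wide copying is not triggered.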