typedef AV PADNAMELIST;
typedef SV PADNAME;
-/* XXX for 5.18, disable the COW by default
- * #if !defined(PERL_OLD_COPY_ON_WRITE) && !defined(PERL_NEW_COPY_ON_WRITE) && !defined(PERL_NO_COW)
- * # define PERL_NEW_COPY_ON_WRITE
- * #endif
- */
+/* enable PERL_NEW_COPY_ON_WRITE by default */
+#if !defined(PERL_OLD_COPY_ON_WRITE) && !defined(PERL_NEW_COPY_ON_WRITE) && !defined(PERL_NO_COW)
+# define PERL_NEW_COPY_ON_WRITE
+#endif
#if defined(PERL_OLD_COPY_ON_WRITE) || defined(PERL_NEW_COPY_ON_WRITE)
# if defined(PERL_OLD_COPY_ON_WRITE) && defined(PERL_NEW_COPY_ON_WRITE)
=item *
-XXX
+Perl has a new copy-on-write mechanism that avoids the need to copy the
+internal string buffer when assigning from one scalar to another. This
+makes copying large strings appear much faster. Modifying one of the two
+(or more) strings after an assignment will force a copy internally. This
+makes it unnecessary to pass strings by reference for efficiency.
+
+This feature was already available in 5.18.0, but wasn't enabled by
+default. It is the default now, and so you no longer need to build perl with
+the F<Configure> argument:
+
+ -Accflags=-DPERL_NEW_COPY_ON_WRITE
+
+It can be disabled (for now) in a perl build with:
+
+ -Accflags=-DPERL_NO_COW
=back
=item *
+Perl's new copy-on-write mechanism (which is now enabled by default)
+allows any C<SvPOK> scalar to be automatically upgraded to a copy-on-write
+scalar when copied. A reference count on the string buffer is stored in
+the string buffer itself.
+
+For example:
+
+ $ perl -MDevel::Peek -e'$a="abc"; $b = $a; Dump $a; Dump $b'
+ SV = PV(0x260cd80) at 0x2620ad8
+ REFCNT = 1
+ FLAGS = (POK,IsCOW,pPOK)
+ PV = 0x2619bc0 "abc"\0
+ CUR = 3
+ LEN = 16
+ COW_REFCNT = 1
+ SV = PV(0x260ce30) at 0x2620b20
+ REFCNT = 1
+ FLAGS = (POK,IsCOW,pPOK)
+ PV = 0x2619bc0 "abc"\0
+ CUR = 3
+ LEN = 16
+ COW_REFCNT = 1
+
+Note that both scalars share the same PV buffer and have a COW_REFCNT
+greater than zero.
+
+This means that XS code which wishes to modify the C<SvPVX()> buffer of an
+SV should call C<SvPV_force()> or similar first, to ensure a valid (and
+unshared) buffer, and call C<SvSETMAGIC()> afterwards. This in fact has
+always been the case (for example hash keys were already copy-on-write);
+this change just spreads the COW behaviour to a wider variety of SVs.
+
+One important difference is that before 5.18.0, shared hash-key scalars
+used to have the C<SvREADONLY> flag set; this is no longer the case.
+
+This new behaviour can still be disabled by running F<Configure> with
+B<-Accflags=-DPERL_NO_COW>. This option will probably be removed in Perl
+5.22.
+
+=item *
+
+C<PL_sawampersand> is now a constant. The switch this variable provided
+(to enable/disable the pre-match copy depending on whether C<$&> had been
+seen) has been removed and replaced with copy-on-write, eliminating a few
+bugs.
+
+The previous behaviour can still be enabled by running F<Configure> with
+B<-Accflags=-DPERL_SAWAMPERSAND>.
+
+=item *
+
The functions C<my_swap>, C<my_htonl> and C<my_ntohl> have been removed.
It is unclear why these functions were ever marked as I<A>, part of the
API. XS code can't call them directly, as it can't rely on them being
Preserve the string matched such that ${^PREMATCH}, ${^MATCH}, and
${^POSTMATCH} are available for use after matching.
+In Perl 5.20 and higher this is ignored. Due to a new copy-on-write
+mechanism, ${^PREMATCH}, ${^MATCH}, and ${^POSTMATCH} will be available
+after the match regardless of the modifier.
+
=item g and c
X</g> X</c>
which makes it easier to write code that tests for a series of more
specific cases and remembers the best match.
-B<WARNING>: Once Perl sees that you need one of C<$&>, C<$`>, or
+B<WARNING>: If your code is to run on Perl 5.16 or earlier,
+beware that once Perl sees that you need one of C<$&>, C<$`>, or
C<$'> anywhere in the program, it has to provide them for every
-pattern match. This may substantially slow your program. Perl
-uses the same mechanism to produce C<$1>, C<$2>, etc, so you also pay a
-price for each pattern that contains capturing parentheses. (To
-avoid this cost while retaining the grouping behaviour, use the
+pattern match. This may substantially slow your program.
+
+Perl uses the same mechanism to produce C<$1>, C<$2>, etc, so you also
+pay a price for each pattern that contains capturing parentheses.
+(To avoid this cost while retaining the grouping behaviour, use the
extended regular expression C<(?: ... )> instead.) But if you never
use C<$&>, C<$`> or C<$'>, then patterns I<without> capturing
parentheses will not be penalized. So avoid C<$&>, C<$'>, and C<$`>
if you can, but if you can't (and some algorithms really appreciate
them), once you've used them once, use them at will, because you've
-already paid the price. As of 5.17.4, the presence of each of the three
-variables in a program is recorded separately, and depending on
-circumstances, perl may be able be more efficient knowing that only C<$&>
-rather than all three have been seen, for example.
+already paid the price.
X<$&> X<$`> X<$'>
-As a workaround for this problem, Perl 5.10.0 introduces C<${^PREMATCH}>,
+Perl 5.16 introduced a slightly more efficient mechanism that notes
+separately whether each of C<$`>, C<$&>, and C<$'> has been seen, and
+thus may only need to copy part of the string. Perl 5.20 introduced a
+much more efficient copy-on-write mechanism which eliminates any slowdown.
+
+As another workaround for this problem, Perl 5.10.0 introduced C<${^PREMATCH}>,
C<${^MATCH}> and C<${^POSTMATCH}>, which are equivalent to C<$`>, C<$&>
and C<$'>, B<except> that they are only guaranteed to be defined after a
successful match that was executed with the C</p> (preserve) modifier.
The use of these variables incurs no global performance penalty, unlike
their punctuation char equivalents, however at the trade-off that you
-have to tell perl when you want to use them.
+have to tell perl when you want to use them. As of Perl 5.20, these three
+variables are equivalent to C<$`>, C<$&> and C<$'>, and C</p> is ignored.
X</p> X<p modifier>
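
Reviewer's illustration, not part of the patch: a minimal sketch of code that should behave the same on Perl 5.10.0 and later by always passing C</p>. According to the text above, the modifier is required before 5.20 and ignored from 5.20 on. The input string and variable names are made up for the example.

```perl
use strict;
use warnings;

my $line = "key=value";    # illustrative input, not taken from the patch

my ( $pre, $match, $post );

# /p is needed on 5.10 through 5.18 for the ${^...} variables to be
# defined; the patched docs say it is simply ignored on 5.20 and later.
if ( $line =~ /=/p ) {
    ( $pre, $match, $post ) = ( ${^PREMATCH}, ${^MATCH}, ${^POSTMATCH} );
}

printf "before=[%s] match=[%s] after=[%s]\n", $pre, $match, $post;
```

Keeping C</p> in place costs nothing on newer perls and avoids undefined values on older ones.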
=head2 Quoting metacharacters
${^MATCH} Entire matched string
${^POSTMATCH} Everything after to matched string
+Note to those still using Perl 5.18 or earlier:
The use of C<$`>, C<$&> or C<$'> will slow down B<all> regex use
within your program. Consult L<perlvar> for C<@->
to see equivalent expressions that won't cause slow down.
can also use the equivalent variables C<${^PREMATCH}>, C<${^MATCH}>
and C<${^POSTMATCH}>, but for them to be defined, you have to
specify the C</p> (preserve) modifier on your regular expression.
+In Perl 5.20, the use of C<$`>, C<$&> and C<$'> makes no speed difference.
$1, $2 ... hold the Xth captured expr
$+ Last parenthesized pattern match
In the second match, C<$`> equals C<''> because the regexp matched at the
first character position in the string and stopped; it never saw the
-second 'the'. It is important to note that using C<$`> and C<$'>
+second 'the'.
+
+If your code is to run on Perl versions earlier than
+5.20, it is worthwhile to note that using C<$`> and C<$'>
slows down regexp matching quite a bit, while C<$&> slows it down to a
lesser extent, because if they are used in one regexp in a program,
they are generated for I<all> regexps in the program. So if raw
$' is the same as substr( $x, $+[0] )
As of Perl 5.10, the C<${^PREMATCH}>, C<${^MATCH}> and C<${^POSTMATCH}>
-variables may be used. These are only set if the C</p> modifier is present.
-Consequently they do not penalize the rest of the program.
+variables may be used. These are only set if the C</p> modifier is
+present. Consequently they do not penalize the rest of the program. In
+Perl 5.20, C<${^PREMATCH}>, C<${^MATCH}> and C<${^POSTMATCH}> are available
+whether or not C</p> has been used (the modifier is ignored), and
+C<$`>, C<$'> and C<$&> do not cause any speed difference.
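
Reviewer's illustration, not part of the patch: for code that must also run on perls older than 5.20, the C<substr> equivalents mentioned in the surrounding text can be sketched as below. They rely only on C<@-> and C<@+>, so they do not involve the match variables at all. The sample string is illustrative.

```perl
use strict;
use warnings;

my $x = "the cat in the hat";    # illustrative input

my ( $pre, $match, $post );
if ( $x =~ /cat/ ) {
    # $-[0] is the offset where the whole match starts,
    # $+[0] is the offset just past where it ends.
    $pre   = substr( $x, 0, $-[0] );                # same text as $`
    $match = substr( $x, $-[0], $+[0] - $-[0] );    # same text as $&
    $post  = substr( $x, $+[0] );                   # same text as $'
}

print "[$pre] [$match] [$post]\n";
```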
=head2 Non-capturing groupings
$1 is Mutt; $2 is Jeff
$1 is Wallace; $2 is Grommit
-Due to an unfortunate accident of Perl's implementation, C<use
+If you are using Perl v5.18 or earlier, note that C<use
English> imposes a considerable performance penalty on all regular
expression matches in a program because it uses the C<$`>, C<$&>, and
C<$'>, regardless of whether they occur in the scope of C<use
C<${^PREMATCH}>, C<${^MATCH}>, and C<${^POSTMATCH}> variables instead
so you only suffer the performance penalties.
+If you are using Perl v5.20.0 or higher, you do not need to worry about
+this, as the three naughty variables are no longer naughty.
+
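
Reviewer's illustration, not part of the patch: on Perl v5.18 and earlier, the usual way to get the long English names without the match-variable penalty is the C<-no_match_vars> import option of the English module. A minimal sketch, with an input string borrowed from the Mutt/Jeff example nearby:

```perl
use strict;
use warnings;

# On v5.18 and earlier this avoids aliasing $`, $& and $', so other
# regex matches in the program are not slowed down. It is still
# accepted on later versions.
use English qw( -no_match_vars );

my $text = "Mutt and Jeff";    # illustrative input
my ( $first, $second );
if ( $text =~ /(\w+) and (\w+)/ ) {
    ( $first, $second ) = ( $1, $2 );
}

# Other long names such as $PROCESS_ID (an alias for $$) remain available.
print "$first and $second (pid $PROCESS_ID)\n";
```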
=over 8
=item $<I<digits>> ($1, $2, ...)
any matches hidden within a BLOCK or C<eval()> enclosed by the current
BLOCK).
-The use of this variable anywhere in a program imposes a considerable
+In Perl v5.18 and earlier, the use of this variable
+anywhere in a program imposes a considerable
performance penalty on all regular expression matches. To avoid this
penalty, you can extract the same substring by using L</@->. Starting
with Perl v5.10.0, you can use the C</p> match flag and the C<${^MATCH}>
X<${^MATCH}>
This is similar to C<$&> (C<$MATCH>) except that it does not incur the
-performance penalty associated with that variable, and is only guaranteed
+performance penalty associated with that variable.
+In Perl v5.18 and earlier, it is only guaranteed
to return a defined value when the pattern was compiled or executed with
-the C</p> modifier.
+the C</p> modifier. In Perl v5.20, the C</p> modifier does nothing, so
+C<${^MATCH}> does the same thing as C<$MATCH>.
This variable was added in Perl v5.10.0.
pattern match, not counting any matches hidden within a BLOCK or C<eval>
enclosed by the current BLOCK.
-The use of this variable anywhere in a program imposes a considerable
+In Perl v5.18 and earlier, the use of this variable
+anywhere in a program imposes a considerable
performance penalty on all regular expression matches. To avoid this
penalty, you can extract the same substring by using L</@->. Starting
with Perl v5.10.0, you can use the C</p> match flag and the
X<$`> X<${^PREMATCH}>
This is similar to C<$`> ($PREMATCH) except that it does not incur the
-performance penalty associated with that variable, and is only guaranteed
+performance penalty associated with that variable.
+In Perl v5.18 and earlier, it is only guaranteed
to return a defined value when the pattern was compiled or executed with
-the C</p> modifier.
+the C</p> modifier. In Perl v5.20, the C</p> modifier does nothing, so
+C<${^PREMATCH}> does the same thing as C<$PREMATCH>.
This variable was added in Perl v5.10.0
/def/;
print "$`:$&:$'\n"; # prints abc:def:ghi
-The use of this variable anywhere in a program imposes a considerable
+In Perl v5.18 and earlier, the use of this variable
+anywhere in a program imposes a considerable
performance penalty on all regular expression matches.
To avoid this penalty, you can extract the same substring by
using L</@->. Starting with Perl v5.10.0, you can use the C</p> match flag
X<${^POSTMATCH}> X<$'> X<$POSTMATCH>
This is similar to C<$'> (C<$POSTMATCH>) except that it does not incur the
-performance penalty associated with that variable, and is only guaranteed
+performance penalty associated with that variable.
+In Perl v5.18 and earlier, it is only guaranteed
to return a defined value when the pattern was compiled or executed with
-the C</p> modifier.
+the C</p> modifier. In Perl v5.20, the C</p> modifier does nothing, so
+C<${^POSTMATCH}> does the same thing as C<$POSTMATCH>.
This variable was added in Perl v5.10.0.
{
# [perl #4289] First mention $& after a match
- local $::TODO = "these tests fail without Copy-on-Write enabled";
fresh_perl_is(
'$_ = "abc"; /b/g; $_ = "hello"; print eval q|$&|, "\n"',
"b\n", {}, '$& first mentioned after match');