X-Git-Url: https://perl5.git.perl.org/perl5.git/blobdiff_plain/a4a5e2bf376c447a0f259a4a8033781dfc6c46fa..b7f87eaf73269868bae45510d6e24e6b3dac919e:/pod/perldelta.pod diff --git a/pod/perldelta.pod b/pod/perldelta.pod index 4996dd1..9e96279 100644 --- a/pod/perldelta.pod +++ b/pod/perldelta.pod @@ -1,72 +1,136 @@ =encoding utf8 -=for comment -This has been completed up to 0aae26c14, except for: -803e389 rurban CYG17 utf8 paths -d9298c1 rurban mymalloc isn't thread safe - =head1 NAME -[ this is a template for a new perldelta file. Any text flagged as -XXX needs to be processed before release. ] - -perldelta - what is new for perl v5.15.8 +perldelta - what is new for perl v5.16.0 =head1 DESCRIPTION -This document describes differences between the 5.15.7 release and -the 5.15.8 release. +This document describes differences between the 5.14.0 release and +the 5.16.0 release. -If you are upgrading from an earlier release such as 5.15.6, first read -L, which describes differences between 5.15.6 and -5.15.7. +If you are upgrading from an earlier release such as 5.12.0, first read +L, which describes differences between 5.12.0 and +5.14.0. =head1 Notice -XXX Any important notices here +As described in L, the release of Perl 5.16.0 marks the +official end of support for Perl 5.12. Users of Perl 5.12 or earlier +should consider upgrading to a more recent release of Perl. =head1 Core Enhancements -XXX New core language features go here. Summarise user-visible core language -enhancements. Particularly prominent performance optimisations could go -here, but most should go in the L section. +=head2 C> -[ List each enhancement as a =head2 entry ] +As of this release, version declarations like C now disable +all features before enabling the new feature bundle. 
This means that +the following holds true: -=head2 Improved ability to mix locales and Unicode, including UTF-8 locales + use 5.016; + # only 5.16 features enabled here + use 5.014; + # only 5.14 features enabled here (not 5.16) -An optional parameter has been added to C +C and higher continue to enable strict, but explicit C and C now override the version declaration, even +when they come first: - use locale ':not_characters'; + no strict; + use 5.012; + # no strict here -which tells Perl to use all but the C and C -portions of the current locale. Instead, the character set is assumed -to be Unicode. This allows locales and Unicode to be seamlessly mixed, -including the increasingly frequent UTF-8 locales. When using this -hybrid form of locales, the C<:locale> layer to the L pragma can -be used to interface with the file system, and there are CPAN modules -available for ARGV and environment variable conversions. +There is a new ":default" feature bundle that represents the set of +features enabled before any version declaration or C has +been seen. Version declarations below 5.10 now enable the ":default" +feature set. This does not actually change the behaviour of C, because features added to the ":default" set are those that were +traditionally enabled by default, before they could be turned off. -Full details are in L. +C<< no feature >> now resets to the default feature set. To disable all +features (which is likely to be a pretty special-purpose request, since +it presumably won't match any named set of semantics) you can now +write C<< no feature ':all' >>. -=head2 New function C and corresponding escape sequence C<\F> for Unicode foldcase +C<$[> is now disabled under C. It is part of the default +feature set and can be turned on or off explicitly with C. -Unicode foldcase is an extension to lowercase that gives better results -when comparing two strings case-insensitively. It has long been used -internally in regular expression C matching. 
Now it is available -explicitly through the new C function call (enabled by -S>, or C, or explicitly callable via -C) or through the new C<\F> sequence in double-quotish -strings. +=head2 C<__SUB__> -Full details are in L. +The new C<__SUB__> token, available under the C feature +(see L) or C, returns a reference to the current +subroutine, making it easier to write recursive closures. -=head2 C<_> in subroutine prototypes +=head2 New and Improved Built-ins -The C<_> character in subroutine prototypes is now allowed before C<@> or -C<%>. +=head3 More consistent C + +The C operator sometimes treats a string argument as a sequence of +characters and sometimes as a sequence of bytes, depending on the +internal encoding. The internal encoding is not supposed to make any +difference, but there is code that relies on this inconsistency. + +The new C and C features (enabled under C) resolve this. The C feature causes C to treat the string always as Unicode. The C +feature provides a function, itself called C, which +evaluates its argument always as a string of bytes. + +These features also fix oddities with source filters leaking to outer +dynamic scopes. + +See L for more detail. + +=head3 C lvalue revamp + +=for comment Does this belong here, or under Incompatible Changes? + +When C is called in lvalue or potential lvalue context with two +or three arguments, a special lvalue scalar is returned that modifies +the original string (the first argument) when assigned to. + +Previously, the offsets (the second and third arguments) passed to +C would be converted immediately to match the string, negative +offsets being translated to positive and offsets beyond the end of the +string being truncated. + +Now, the offsets are recorded without modification in the special +lvalue scalar that is returned, and the original string is not even +looked at by C itself, but only when the returned lvalue is +read or modified. 
+ +These changes result in an incompatible change: -=head1 Supports (I) Unicode 6.1 +If the original string changes length after the call to C but +before assignment to its return value, negative offsets will remember +their position from the end of the string, affecting code like this: + + my $string = "string"; + my $lvalue = \substr $string, -4, 2; + print $$lvalue, "\n"; # prints "ri" + $string = "bailing twine"; + print $$lvalue, "\n"; # prints "wi"; used to print "il" + +The same thing happens with an omitted third argument. The returned +lvalue will always extend to the end of the string, even if the string +becomes longer. + +Since this change also allowed many bugs to be fixed (see +L operator>), and since the behaviour +of negative offsets has never been specified, the +change was deemed acceptable. + +=head3 Return value of C + +The value returned by C on a tied variable is now the actual +scalar that holds the object to which the variable is tied. This +allows ties to be weakened with C. + +=head2 Unicode Support + +=head3 Supports (I) Unicode 6.1 Besides the addition of whole new scripts, and new characters in existing scripts, this new version of Unicode, as always, makes some @@ -83,16 +147,31 @@ other changes in 6.1, the Perl regular expression construct C<\X> now works differently for some characters in Thai and Lao. New aliases (synonyms) have been defined for many property values; -these, along with the previously existing ones, are all cross indexed in +these, along with the previously existing ones, are all cross-indexed in L. -The return value of C is affected by other changes. -One of these is that the preferred name (which is what C -returns) for the character at U+2118 has been changed from SCRIPT CAPITAL P -to WEIERSTRASS ELLIPTIC FUNCTION. 
But most of these changes are the +The return value of C is affected by other +changes: + + Code point Old Name New Name + U+000A LINE FEED (LF) LINE FEED + U+000C FORM FEED (FF) FORM FEED + U+000D CARRIAGE RETURN (CR) CARRIAGE RETURN + U+0085 NEXT LINE (NEL) NEXT LINE + U+008E SINGLE-SHIFT 2 SINGLE-SHIFT-2 + U+008F SINGLE-SHIFT 3 SINGLE-SHIFT-3 + U+0091 PRIVATE USE 1 PRIVATE USE-1 + U+0092 PRIVATE USE 2 PRIVATE USE-2 + U+2118 SCRIPT CAPITAL P WEIERSTRASS ELLIPTIC FUNCTION + +Perl will accept any of these names as input, but +C now returns the new name of each pair. The +change for U+2118 is considered by Unicode to be a correction; that is, +the original name was a mistake (but again, it will remain forever valid +to use it to refer to U+2118). But most of these changes are the fallout of the mistake Unicode 6.0 made in naming a character used in -Japanese cell phones to be "BELL", which conflicts with the long -standing industry use of (and Unicode's recommendation to use) that name +Japanese cell phones to be "BELL", which conflicts with the longstanding +industry use of (and Unicode's recommendation to use) that name to mean the ASCII control character at U+0007. As a result, that name has been deprecated in Perl since v5.14; and any use of it will raise a warning message (unless turned off). The name "ALERT" is now the @@ -105,21 +184,23 @@ this character, and not U+0007. Unicode has taken steps to make sure that this sort of mistake does not happen again. The Standard now includes all the generally accepted names and abbreviations for control characters, whereas previously it -didn't. This means that all the names that Perl had previously -deprecated (except BELL) are no longer deprecated, such as FILE -SEPARATOR. 
Also, the names for four rarely used characters are subtly -different (a hyphen instead of a space) than before: - - Code point Old Name New Name - U+008E SINGLE-SHIFT 2 SINGLE-SHIFT-2 - U+008F SINGLE-SHIFT 3 SINGLE-SHIFT-3 - U+0091 PRIVATE USE 1 PRIVATE USE-1 - U+0092 PRIVATE USE 2 PRIVATE USE-2 - -Perl will accept either name as input, but C now -returns the new name. - -Additional name abbreviations are accepted: +didn't (though there were recommended names for most of them, which Perl +used). This means that most of those recommended names are now +officially in the Standard. Unicode did not recommend names for the +four code points listed above between U+008E and U+0092, and in +standardizing them Unicode subtly changed the names that Perl had +previously given them, by replacing the final blank in each name with a +hyphen. Unicode also officially accepts names that Perl had deprecated, +such as FILE SEPARATOR. Now the only deprecated name is BELL. +Finally, Perl now uses the new official names instead of the old +(now considered obsolete) names for the first four code points in the +list above (the ones which have the parentheses in them). + +Now that the names have been placed in the Unicode standard, these kinds +of changes should not happen again, though corrections, such as to +U+2118, are still possible. + +Unicode also added some name abbreviations, which Perl now accepts: SP for SPACE; TAB for CHARACTER TABULATION; NEW LINE, END OF LINE, NL, and EOL for LINE FEED; @@ -130,566 +211,3526 @@ and ZWNBSP for ZERO WIDTH NO-BREAK SPACE. More details on this version of Unicode are provided in L. -=head1 Security - -XXX Any security-related notices go here. In particular, any security -vulnerabilities closed should be noted here rather than in the -L section. 
+=head3 C is no longer needed for C<\N{I}> -[ List each security issue as a =head2 entry ] +When C<\N{I}> is encountered, the C module is now +automatically loaded when needed as if the C<:full> and C<:short> +options had been specified. See L for more information. -=head1 Incompatible Changes +=head3 C<\N{...}> can now have Unicode loose name matching -XXX For a release on a stable branch, this section aspires to be: +This is described in the C item in +L below. - There are no changes intentionally incompatible with 5.XXX.XXX - If any exist, they are bugs, and we request that you submit a - report. See L below. +=head3 Unicode Symbol Names -[ List each incompatible change as a =head2 entry ] +Perl now has proper support for Unicode in symbol names. It used to be +that C<*{$foo}> would ignore the internal UTF8 flag and use the bytes of +the underlying representation to look up the symbol. That meant that +C<*{"\x{100}"}> and C<*{"\xc4\x80"}> would return the same thing. All +these parts of Perl have been fixed to account for Unicode: -=head2 Special blocks called in void context +=over -Special blocks (C, C, C, C, C) are now -called in void context. This avoids wasteful copying of the result of the -last statement [perl #108794]. +=item * -=head2 The C pragma and regexp objects +Method names (including those passed to C) -With C, regular expression objects returned by C are -now stringified as "Regexp=REGEXP(0xbe600d)" instead of the regular -expression itself [perl #108780]. +=item * -=head2 Two XS typemap Entries removed +Typeglob names (including names of variables, subroutines and filehandles) -Two presumably unused XS typemap entries have been removed from the -core typemap: T_DATAUNIT and T_CALLBACK. If you are, against all odds, -a user of these, please see the instructions on how to regain them -in L. +=item * -=head2 Unicode 6.1 has incompatibilities with Unicode 6.0 +Package names -These are detailed in L above. 
+=item * -=head2 Changed returns for some properties in C +C -The return values for C have been changed for some -properties to make the returned lists significantly smaller. This -allows those lists to be searched faster. +=item * -This function was introduced earlier in the v5.15 series of releases, -and the API will not be considered stable until v5.16. +Symbolic dereferencing -See L for details on the new interface. +=item * -=head1 Deprecations +Second argument to C and C -XXX Any deprecated features, syntax, modules etc. should be listed here. -In particular, deprecated modules should be listed here even if they are -listed as an updated module in the L section. +=item * -[ List each deprecation as a =head2 entry ] +Return value of C -=head1 Performance Enhancements +=item * -XXX Changes which enhance performance without changing behaviour go here. There -may well be none in a stable release. +Subroutine prototypes -[ List each enhancement as a =item entry ] +=item * -=over 4 +Attributes =item * -XXX +Various warnings and error messages that mention variable names or values, +methods, etc. =back -=head1 Modules and Pragmata +In addition, a parsing bug has been fixed that prevented C<*{é}> from +implicitly quoting the name, but instead interpreted it as C<*{+é}>, which +would cause a strict violation. -XXX All changes to installed files in F, F, F and F -go here. If Module::CoreList is updated, generate an initial draft of the -following sections using F, which prints stub -entries to STDOUT. Results can be pasted in place of the '=head2' entries -below. A paragraph summary for important changes should then be added by hand. -In an ideal world, dual-life modules would have a F file that could be -cribbed. +C<*{"*a::b"}> automatically strips off the * if it is followed by an ASCII +letter. That has been extended to all Unicode identifier characters. 
-[ Within each section, list entries as a =item entry ] +One-character non-ASCII non-punctuation variables (like C<$é>) are now +subject to "Used only once" warnings. They used to be exempt, as they +were treated as punctuation variables. -=head2 New Modules and Pragmata +Also, single-character Unicode punctuation variables (like C<$‰>) are now +supported [perl #69032]. -=over 4 +=head3 Improved ability to mix locales and Unicode, including UTF-8 locales -=item * +An optional parameter has been added to C -The C PerlIO layer is no longer implemented by perl itself, but has -been moved out into the new L module. + use locale ':not_characters'; -=back +which tells Perl to use all but the C and C +portions of the current locale. Instead, the character set is assumed +to be Unicode. This allows locales and Unicode to be seamlessly mixed, +including the increasingly frequent UTF-8 locales. When using this +hybrid form of locales, the C<:locale> layer to the L pragma can +be used to interface with the file system, and there are CPAN modules +available for ARGV and environment variable conversions. -=head2 Updated Modules and Pragmata +Full details are in L. -=over 4 +=head3 New function C and corresponding escape sequence C<\F> for Unicode foldcase -=item * +Unicode foldcase is an extension to lowercase that gives better results +when comparing two strings case-insensitively. It has long been used +internally in regular expression C matching. Now it is available +explicitly through the new C function call (enabled by +S>, or C, or explicitly callable via +C) or through the new C<\F> sequence in double-quotish +strings. -L has been upgraded from version 0.03 to version 0.04. +Full details are in L. -List slices no longer modify items on the stack belonging to outer lists -[perl #109570]. +=head3 The Unicode C property is now supported. -=item * +New in Unicode 6.0, this is an improved C