initialisation of simple aggregate state variables

diff --git a/pod/perlfunc.pod b/pod/perlfunc.pod

index efd6198..b571faf 100644
--- a/pod/perlfunc.pod
+++ b/pod/perlfunc.pod
@@ -199,7 +199,7 @@ L<C<flock>|/flock FILEHANDLE,OPERATION>, L<C<format>|/format>,
  L<C<getc>|/getc FILEHANDLE>, L<C<print>|/print FILEHANDLE LIST>,
  L<C<printf>|/printf FILEHANDLE FORMAT, LIST>,
  L<C<read>|/read FILEHANDLE,SCALAR,LENGTH,OFFSET>,
-L<C<readdir>|/readdir DIRHANDLE>, L<C<readline>|/readline EXPR>
+L<C<readdir>|/readdir DIRHANDLE>, L<C<readline>|/readline EXPR>,
  L<C<rewinddir>|/rewinddir DIRHANDLE>, L<C<say>|/say FILEHANDLE LIST>,
  L<C<seek>|/seek FILEHANDLE,POSITION,WHENCE>,
  L<C<seekdir>|/seekdir DIRHANDLE,POS>,
@@ -258,7 +258,7 @@ X<control flow>
  L<C<break>|/break>, L<C<caller>|/caller EXPR>,
  L<C<continue>|/continue BLOCK>, L<C<die>|/die LIST>, L<C<do>|/do BLOCK>,
  L<C<dump>|/dump LABEL>, L<C<eval>|/eval EXPR>,
-L<C<evalbytes>|/evalbytes EXPR> L<C<exit>|/exit EXPR>,
+L<C<evalbytes>|/evalbytes EXPR>, L<C<exit>|/exit EXPR>,
  L<C<__FILE__>|/__FILE__>, L<C<goto>|/goto LABEL>,
  L<C<last>|/last LABEL>, L<C<__LINE__>|/__LINE__>,
  L<C<next>|/next LABEL>, L<C<__PACKAGE__>|/__PACKAGE__>,
@@ -1541,10 +1541,9 @@ makes it spring into existence the first time that it is called; see
  L<perlsub>.
  
  Use of L<C<defined>|/defined EXPR> on aggregates (hashes and arrays) is
-deprecated.  It
-used to report whether memory for that aggregate had ever been
-allocated.  This behavior may disappear in future versions of Perl.
-You should instead use a simple test for size:
+no longer supported. It used to report whether memory for that
+aggregate had ever been allocated.  You should instead use a simple
+test for size:
  
      if (@an_array) { print "has array elements\n" }
      if (%a_hash)   { print "has hash members\n"   }
@@ -1806,22 +1805,42 @@ See L<perlsyn> for alternative strategies.
  X<do>
  
  Uses the value of EXPR as a filename and executes the contents of the
-file as a Perl script.
+file as a Perl script:
  
+    # load the exact specified file (./ and ../ special-cased)
+    do '/foo/stat.pl';
+    do './stat.pl';
+    do '../foo/stat.pl';
+
+    # search for the named file within @INC
      do 'stat.pl';
+    do 'foo/stat.pl';
  
-is largely like
+C<do './stat.pl'> is largely like
  
      eval `cat stat.pl`;
  
-except that it's more concise, runs no external processes, keeps track of
-the current filename for error messages, searches the
-L<C<@INC>|perlvar/@INC> directories, and updates L<C<%INC>|perlvar/%INC>
-if the file is found.  See L<perlvar/@INC> and L<perlvar/%INC> for these
-variables.  It also differs in that code evaluated with C<do FILE>
-cannot see lexicals in the enclosing scope; C<eval STRING> does.  It's
-the same, however, in that it does reparse the file every time you call
-it, so you probably don't want to do this inside a loop.
+except that it's more concise, runs no external processes, and keeps
+track of the current filename for error messages. It also differs in that
+code evaluated with C<do FILE> cannot see lexicals in the enclosing
+scope; C<eval STRING> does.  It's the same, however, in that it does
+reparse the file every time you call it, so you probably don't want
+to do this inside a loop.
+
+Using C<do> with a relative path (except for F<./> and F<../>), like
+
+    do 'foo/stat.pl';
+
+will search the L<C<@INC>|perlvar/@INC> directories, and update
+L<C<%INC>|perlvar/%INC> if the file is found.  See L<perlvar/@INC>
+and L<perlvar/%INC> for these variables. In particular, note that
+whilst historically L<C<@INC>|perlvar/@INC> contained '.' (the
+current directory) making these two cases equivalent, that is no
+longer necessarily the case, as '.' is not included in C<@INC> by default
+in perl versions 5.26.0 onwards. Instead, perl will now warn:
+
+    do "stat.pl" failed, '.' is no longer in @INC;
+    did you mean do "./stat.pl"?
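
The difference between the two lookup modes described above can be sketched as follows. This is an illustration only: the scratch directory and the file name stat.pl are created by the example itself and are hypothetical, and @INC is localized so that the search path is under the example's control.

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempdir);

# Write a tiny script into a scratch directory (hypothetical file name).
my $dir  = tempdir(CLEANUP => 1);
my $path = File::Spec->catfile($dir, 'stat.pl');
open(my $out, '>', $path) or die "Cannot write $path: $!";
print {$out} "40 + 2;\n";
close($out) or die "Cannot close $path: $!";

# An absolute path is loaded exactly as given; @INC is not consulted.
my $direct = do $path;
die "do '$path' failed: ", ($@ || $!) unless defined $direct;

# A bare relative name is found only through @INC.
my $searched;
{
    local @INC = ($dir);
    $searched = do 'stat.pl';
}
die "do 'stat.pl' failed: ", ($@ || $!) unless defined $searched;

print "direct=$direct searched=$searched\n";
```

Both calls return the value of the last expression in the file, so checking for an undefined result together with $@ and $! covers the failure cases.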
  
  If L<C<do>|/do EXPR> can read the file but cannot compile it, it
  returns L<C<undef>|/undef EXPR> and sets an error message in
@@ -1839,7 +1858,8 @@ if there's a problem.
  You might like to use L<C<do>|/do EXPR> to read in a program
  configuration file.  Manual error checking can be done this way:
  
-    # read in config files: system first, then user
+    # Read in config files: system first, then user.
+    # Beware of using relative pathnames here.
      for $file ("/share/prog/defaults.rc",
                 "$ENV{HOME}/.someprogrc")
      {
@@ -2036,86 +2056,187 @@ X<error, handling> X<exception, handling>
  
  =for Pod::Functions catch exceptions or compile and run code
  
-In the first form, often referred to as a "string eval", the return
-value of EXPR is parsed and executed as if it
-were a little Perl program.  The value of the expression (which is itself
-determined within scalar context) is first parsed, and if there were no
-errors, executed as a block within the lexical context of the current Perl
-program.  This means, that in particular, any outer lexical variables are
-visible to it, and any package variable settings or subroutine and format
-definitions remain afterwards.
-
-Note that the value is parsed every time the L<C<eval>|/eval EXPR>
-executes.  If EXPR is omitted, evaluates L<C<$_>|perlvar/$_>.  This form
-is typically used to delay parsing and subsequent execution of the text
-of EXPR until run time.
-
-If the
-L<C<"unicode_eval"> feature|feature/The 'unicode_eval' and 'evalbytes' features>
-is enabled (which is the default under a
-C<use 5.16> or higher declaration), EXPR or L<C<$_>|perlvar/$_> is
-treated as a string of characters, so L<C<use utf8>|utf8> declarations
-have no effect, and source filters are forbidden.  In the absence of the
-L<C<"unicode_eval"> feature|feature/The 'unicode_eval' and 'evalbytes' features>,
-will sometimes be treated as characters and sometimes as bytes,
-depending on the internal encoding, and source filters activated within
-the L<C<eval>|/eval EXPR> exhibit the erratic, but historical, behaviour
-of affecting some outer file scope that is still compiling.  See also
-the L<C<evalbytes>|/evalbytes EXPR> operator, which always treats its
-input as a byte stream and works properly with source filters, and the
-L<feature> pragma.
-
-Problems can arise if the string expands a scalar containing a floating
-point number.  That scalar can expand to letters, such as C<"NaN"> or
-C<"Infinity">; or, within the scope of a L<C<use locale>|locale>, the
-decimal point character may be something other than a dot (such as a
-comma).  None of these are likely to parse as you are likely expecting.
-
-In the second form, the code within the BLOCK is parsed only once--at the
-same time the code surrounding the L<C<eval>|/eval EXPR> itself was
-parsed--and executed
+C<eval> in all its forms is used to execute a little Perl program,
+trapping any errors encountered so they don't crash the calling program.
+
+Plain C<eval> with no argument is just C<eval EXPR>, where the
+expression is understood to be contained in L<C<$_>|perlvar/$_>.  Thus
+there are only two real C<eval> forms; the one with an EXPR is often
+called "string eval".  In a string eval, the value of the expression
+(which is itself determined within scalar context) is first parsed, and
+if there were no errors, executed as a block within the lexical context
+of the current Perl program.  This form is typically used to delay
+parsing and subsequent execution of the text of EXPR until run time.
+Note that the value is parsed every time the C<eval> executes.
+
+The other form is called "block eval".  It is less general than string
+eval, but the code within the BLOCK is parsed only once (at the same
+time the code surrounding the C<eval> itself was parsed) and executed
  within the context of the current Perl program.  This form is typically
-used to trap exceptions more efficiently than the first (see below), while
-also providing the benefit of checking the code within BLOCK at compile
-time.
-
-The final semicolon, if any, may be omitted from the value of EXPR or within
-the BLOCK.
+used to trap exceptions more efficiently than the first, while also
+providing the benefit of checking the code within BLOCK at compile time.
+Since errors are trapped, block eval is often used to check whether a
+given feature is available.
  
  In both forms, the value returned is the value of the last expression
-evaluated inside the mini-program; a return statement may be also used, just
+evaluated inside the mini-program; a return statement may also be used, just
  as with subroutines.  The expression providing the return value is evaluated
  in void, scalar, or list context, depending on the context of the
-L<C<eval>|/eval EXPR> itself.  See L<C<wantarray>|/wantarray> for more
+C<eval> itself.  See L<C<wantarray>|/wantarray> for more
  on how the evaluation context can be determined.
  
  If there is a syntax error or runtime error, or a L<C<die>|/die LIST>
-statement is executed, L<C<eval>|/eval EXPR> returns
-L<C<undef>|/undef EXPR> in scalar context or an empty list in list
+statement is executed, C<eval> returns
+L<C<undef>|/undef EXPR> in scalar context, or an empty list in list
  context, and L<C<$@>|perlvar/$@> is set to the error message.  (Prior to
  5.16, a bug caused L<C<undef>|/undef EXPR> to be returned in list
  context for syntax errors, but not for runtime errors.) If there was no
  error, L<C<$@>|perlvar/$@> is set to the empty string.  A control flow
  operator like L<C<last>|/last LABEL> or L<C<goto>|/goto LABEL> can
  bypass the setting of L<C<$@>|perlvar/$@>.  Beware that using
-L<C<eval>|/eval EXPR> neither silences Perl from printing warnings to
+C<eval> neither silences Perl from printing warnings to
  STDERR, nor does it stuff the text of warning messages into
  L<C<$@>|perlvar/$@>.  To do either of those, you have to use the
  L<C<$SIG{__WARN__}>|perlvar/%SIG> facility, or turn off warnings inside
  the BLOCK or EXPR using S<C<no warnings 'all'>>.  See
  L<C<warn>|/warn LIST>, L<perlvar>, and L<warnings>.
  
-Note that, because L<C<eval>|/eval EXPR> traps otherwise-fatal errors,
+Note that, because C<eval> traps otherwise-fatal errors,
  it is useful for determining whether a particular feature (such as
  L<C<socket>|/socket SOCKET,DOMAIN,TYPE,PROTOCOL> or
  L<C<symlink>|/symlink OLDFILE,NEWFILE>) is implemented.  It is also
  Perl's exception-trapping mechanism, where the L<C<die>|/die LIST>
  operator is used to raise exceptions.
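
A minimal sketch of the behaviour described so far, covering both forms, the return value, $@, and a feature probe. The error text "disk full" is made up for the example; the symlink probe follows the idiom commonly used for this purpose.

```perl
use strict;
use warnings;

# Block eval: die raises the exception, eval traps it.
my $ok = eval {
    die "disk full\n";
    1;                      # not reached
};
my $error = $@;             # copy $@ promptly; a later eval overwrites it
print "trapped: $error" unless $ok;

# String eval: parsed at run time, returns the last value evaluated.
my $value = eval "6 * 7";

# A syntax error inside a string eval is trapped as well.
my $broken = eval "6 *";
my $syntax_error = $@;

# Feature probe: symlink raises an exception where it is unimplemented.
my $has_symlink = eval { symlink("", ""); 1 } ? "yes" : "no";
print "value=$value symlink=$has_symlink\n";
```

Ending the block with a true value and testing the result of eval, rather than testing $@ alone, is a common defensive choice.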
  
-If you want to trap errors when loading an XS module, some problems with
-the binary interface (such as Perl version skew) may be fatal even with
-L<C<eval>|/eval EXPR> unless C<$ENV{PERL_DL_NONLAZY}> is set.  See
-L<perlrun>.
+Before Perl 5.14, the assignment to L<C<$@>|perlvar/$@> occurred before
+restoration
+of localized variables, which means that for your code to run on older
+versions, a temporary is required if you want to mask some, but not all
+errors:
+
+ # alter $@ on nefarious repugnancy only
+ {
+    my $e;
+    {
+      local $@; # protect existing $@
+      eval { test_repugnancy() };
+      # $@ =~ /nefarious/ and die $@; # Perl 5.14 and higher only
+      $@ =~ /nefarious/ and $e = $@;
+    }
+    die $e if defined $e
+ }
+
+There are some different considerations for each form:
+
+=over 4
+
+=item String eval
+
+Since the return value of EXPR is executed as a block within the lexical
+context of the current Perl program, any outer lexical variables are
+visible to it, and any package variable settings or subroutine and
+format definitions remain afterwards.
+
+=over 4
+
+=item Under the L<C<"unicode_eval"> feature|feature/The 'unicode_eval' and 'evalbytes' features>
+
+If this feature is enabled (which is the default under a C<use 5.16> or
+higher declaration), EXPR is considered to be
+in the same encoding as the surrounding program.  Thus if
+S<L<C<use utf8>|utf8>> is in effect, the string will be treated as being
+UTF-8 encoded.  Otherwise, the string is considered to be a sequence of
+independent bytes.  Bytes that correspond to ASCII-range code points
+will have their normal meanings for operators in the string.  The
+treatment of the other bytes depends on whether the
+L<C<"unicode_strings"> feature|feature/The 'unicode_strings' feature> is
+in effect.
+
+In a plain C<eval> without an EXPR argument, being in S<C<use utf8>> or
+not is irrelevant; the UTF-8ness of C<$_> itself determines the
+behavior.
+
+Any S<C<use utf8>> or S<C<no utf8>> declarations within the string have
+no effect, and source filters are forbidden.  (C<unicode_strings>,
+however, can appear within the string.)  See also the
+L<C<evalbytes>|/evalbytes EXPR> operator, which works properly with
+source filters.
+
+Variables defined outside the C<eval> and used inside it retain their
+original UTF-8ness.  Everything inside the string follows the normal
+rules for a Perl program with the given state of S<C<use utf8>>.
+
+=item Outside the C<"unicode_eval"> feature
+
+In this case, the behavior is problematic and is not so easily
+described.  Here are two bugs that cannot easily be fixed without
+breaking existing programs:
+
+=over 4
+
+=item *
+
+It can lose track of whether something should be encoded as UTF-8 or
+not.
+
+=item *
+
+Source filters activated within C<eval> leak out into whichever file
+scope is currently being compiled.  To give an example with the CPAN module
+L<Semi::Semicolons>:
+
+ BEGIN { eval "use Semi::Semicolons; # not filtered" }
+ # filtered here!
+
+L<C<evalbytes>|/evalbytes EXPR> fixes that to work the way one would
+expect:
+
+ use feature "evalbytes";
+ BEGIN { evalbytes "use Semi::Semicolons; # filtered" }
+ # not filtered
+
+=back
+
+=back
+
+Problems can arise if the string expands a scalar containing a floating
+point number.  That scalar can expand to letters, such as C<"NaN"> or
+C<"Infinity">; or, within the scope of a L<C<use locale>|locale>, the
+decimal point character may be something other than a dot (such as a
+comma).  None of these are likely to parse as you are likely expecting.
+
+You should be especially careful to remember what's being looked at
+when:
+
+    eval $x;        # CASE 1
+    eval "$x";      # CASE 2
+
+    eval '$x';      # CASE 3
+    eval { $x };    # CASE 4
+
+    eval "\$$x++";  # CASE 5
+    $$x++;          # CASE 6
+
+Cases 1 and 2 above behave identically: they run the code contained in
+the variable $x.  (Although case 2 has misleading double quotes making
+the reader wonder what else might be happening (nothing is).)  Cases 3
+and 4 likewise behave in the same way: they run the code C<'$x'>, which
+does nothing but return the value of $x.  (Case 4 is preferred for
+purely visual reasons, but it also has the advantage of compiling at
+compile-time instead of at run-time.)  Case 5 is a place where
+normally you I<would> like to use double quotes, except that in this
+particular situation, you can just use symbolic references instead, as
+in case 6.
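
Cases 5 and 6 can be compared directly. In this sketch the variable name counter is hypothetical, and strict refs is switched off only around the symbolic reference.

```perl
use strict;
use warnings;

our $counter = 10;          # package variable $main::counter
my $x = 'counter';          # holds the *name* of the variable

# CASE 5: the string "\$$x++" interpolates to '$counter++', then runs.
eval "\$$x++";
die $@ if $@;

# CASE 6: the same effect through a symbolic reference, no eval needed.
{
    no strict 'refs';
    $$x++;
}

print "$counter\n";         # prints 12
```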
+
+An C<eval ''> executed within a subroutine defined
+in the C<DB> package doesn't see the usual
+surrounding lexical scope, but rather the scope of the first non-DB piece
+of code that called it.  You don't normally need to worry about this unless
+you are writing a Perl debugger.
+
+The final semicolon, if any, may be omitted from the value of EXPR.
+
+=item Block eval
  
  If the code to be executed doesn't vary, you may use the eval-BLOCK
  form to trap run-time errors without incurring the penalty of
@@ -2135,6 +2256,11 @@ Examples:
      # a run-time error
      eval '$answer =';   # sets $@
  
+If you want to trap errors when loading an XS module, some problems with
+the binary interface (such as Perl version skew) may be fatal even with
+C<eval> unless C<$ENV{PERL_DL_NONLAZY}> is set.  See
+L<perlrun>.
+
  Using the C<eval {}> form as an exception trap in libraries does have some
  issues.  Due to the current arguably broken state of C<__DIE__> hooks, you
  may wish not to trigger any C<__DIE__> hooks that user code may have installed.
@@ -2160,56 +2286,13 @@ messages:
  Because this promotes action at a distance, this counterintuitive behavior
  may be fixed in a future release.
  
-With an L<C<eval>|/eval EXPR>, you should be especially careful to
-remember what's being looked at when:
-
-    eval $x;        # CASE 1
-    eval "$x";      # CASE 2
-
-    eval '$x';      # CASE 3
-    eval { $x };    # CASE 4
-
-    eval "\$$x++";  # CASE 5
-    $$x++;          # CASE 6
-
-Cases 1 and 2 above behave identically: they run the code contained in
-the variable $x.  (Although case 2 has misleading double quotes making
-the reader wonder what else might be happening (nothing is).)  Cases 3
-and 4 likewise behave in the same way: they run the code C<'$x'>, which
-does nothing but return the value of $x.  (Case 4 is preferred for
-purely visual reasons, but it also has the advantage of compiling at
-compile-time instead of at run-time.)  Case 5 is a place where
-normally you I<would> like to use double quotes, except that in this
-particular situation, you can just use symbolic references instead, as
-in case 6.
-
-Before Perl 5.14, the assignment to L<C<$@>|perlvar/$@> occurred before
-restoration
-of localized variables, which means that for your code to run on older
-versions, a temporary is required if you want to mask some but not all
-errors:
-
-    # alter $@ on nefarious repugnancy only
-    {
-       my $e;
-       {
-         local $@; # protect existing $@
-         eval { test_repugnancy() };
-         # $@ =~ /nefarious/ and die $@; # Perl 5.14 and higher only
-         $@ =~ /nefarious/ and $e = $@;
-       }
-       die $e if defined $e
-    }
-
  C<eval BLOCK> does I<not> count as a loop, so the loop control statements
  L<C<next>|/next LABEL>, L<C<last>|/last LABEL>, or
  L<C<redo>|/redo LABEL> cannot be used to leave or restart the block.
  
-An C<eval ''> executed within a subroutine defined
-in the C<DB> package doesn't see the usual
-surrounding lexical scope, but rather the scope of the first non-DB piece
-of code that called it.  You don't normally need to worry about this unless
-you are writing a Perl debugger.
+The final semicolon, if any, may be omitted from within the BLOCK.
+
+=back
  
  =item evalbytes EXPR
  X<evalbytes>
@@ -2218,18 +2301,42 @@ X<evalbytes>
  
  =for Pod::Functions +evalbytes similar to string eval, but intend to parse a bytestream
  
-This function is like L<C<eval>|/eval EXPR> with a string argument,
-except it always parses its argument, or L<C<$_>|perlvar/$_> if EXPR is
-omitted, as a string of bytes.  A string containing characters whose
-ordinal value exceeds 255 results in an error.  Source filters activated
-within the evaluated code apply to the code itself.
+This function is similar to a L<string eval|/eval EXPR>, except it
+always parses its argument (or L<C<$_>|perlvar/$_> if EXPR is omitted)
+as a string of independent bytes.
  
-L<C<evalbytes>|/evalbytes EXPR> is available only if the
-L<C<"evalbytes"> feature|feature/The 'unicode_eval' and 'evalbytes' features>
-is enabled or if it is prefixed with C<CORE::>.  The
+If called when S<C<use utf8>> is in effect, the string will be assumed
+to be encoded in UTF-8, and C<evalbytes> will make a temporary copy to
+work from, downgraded to non-UTF-8.  If this is not possible
+(because one or more characters in it require UTF-8), the C<evalbytes>
+will fail with the error stored in C<$@>.
+
+Bytes that correspond to ASCII-range code points will have their normal
+meanings for operators in the string.  The treatment of the other bytes
+depends on whether the L<C<"unicode_strings"> feature|feature/The
+'unicode_strings' feature> is in effect.
+
+Of course, variables that are UTF-8 and are referred to in the string
+retain that:
+
+ my $a = "\x{100}";
+ evalbytes 'print ord $a, "\n"';
+
+prints
+
+ 256
+
+and C<$@> is empty.
+
+Source filters activated within the evaluated code apply to the code
+itself.
+
+L<C<evalbytes>|/evalbytes EXPR> is available starting in Perl v5.16.  To
+access it, you must say C<CORE::evalbytes>, but you can omit the
+C<CORE::> if the
  L<C<"evalbytes"> feature|feature/The 'unicode_eval' and 'evalbytes' features>
-is enabled automatically with a C<use v5.16> (or higher) declaration in
-the current scope.
+is enabled.  This is enabled automatically with a C<use v5.16> (or
+higher) declaration in the current scope.
  
  =item exec LIST
  X<exec> X<execute>
@@ -2554,9 +2661,12 @@ A special token that returns the name of the file in which it occurs.
  =item fileno FILEHANDLE
  X<fileno>
  
+=item fileno DIRHANDLE
+
  =for Pod::Functions return file descriptor from filehandle
  
-Returns the file descriptor for a filehandle, or undefined if the
+Returns the file descriptor for a filehandle or directory handle,
+or undefined if the
  filehandle is not open.  If there is no real file descriptor at the OS
  level, as can happen with filehandles connected to memory objects via
  L<C<open>|/open FILEHANDLE,EXPR> with a reference for the third
@@ -2653,8 +2763,8 @@ Here's a mailbox appender for BSD systems.
      sub lock {
          my ($fh) = @_;
          flock($fh, LOCK_EX) or die "Cannot lock mailbox - $!\n";
-
-        # and, in case someone appended while we were waiting...
+        # and, in case we're running on a very old UNIX
+        # variant without the modern O_APPEND semantics...
          seek($fh, 0, SEEK_END) or die "Cannot seek - $!\n";
      }
  
@@ -2879,6 +2989,9 @@ Returns the current priority for a process, a process group, or a user.
  (See L<getpriority(2)>.)  Will raise a fatal exception if used on a
  machine that doesn't implement L<getpriority(2)>.
  
+C<WHICH> can be any of C<PRIO_PROCESS>, C<PRIO_PGRP> or C<PRIO_USER>
+imported from L<POSIX/RESOURCE CONSTANTS>.
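
As a sketch, assuming a POSIX module recent enough to export the resource constants named above, the priority of the current process could be read like this. The eval guards against platforms where getpriority or the constant is unavailable.

```perl
use strict;
use warnings;
use POSIX qw(PRIO_PROCESS);

# A WHO of 0 means "the current process" for PRIO_PROCESS.
my $priority = eval { getpriority(PRIO_PROCESS, 0) };
if (defined $priority) {
    print "current priority: $priority\n";
}
else {
    print "getpriority unavailable: ", ($@ || $!), "\n";
}
```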
+
  Portability issues: L<perlport/getpriority>.
  
  =item getpwnam NAME
@@ -3764,8 +3877,8 @@ many elements these have.  For that, use C<scalar @array> and C<scalar keys
  Like all Perl character operations, L<C<length>|/length EXPR> normally
  deals in logical
  characters, not physical bytes.  For how many bytes a string encoded as
-UTF-8 would take up, use C<length(Encode::encode_utf8(EXPR))> (you'll have
-to C<use Encode> first).  See L<Encode> and L<perlunicode>.
+UTF-8 would take up, use C<length(Encode::encode('UTF-8', EXPR))>
+(you'll have to C<use Encode> first).  See L<Encode> and L<perlunicode>.
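
The distinction between characters and encoded bytes can be seen with a short string containing non-ASCII characters. The sample text is arbitrary.

```perl
use strict;
use warnings;
use Encode ();

# "caf", e-acute, a space, and U+263A WHITE SMILING FACE.
my $text  = "caf\x{e9} \x{263A}";
my $chars = length($text);                           # logical characters
my $bytes = length(Encode::encode('UTF-8', $text));  # bytes once encoded

print "$chars characters, $bytes bytes\n";           # 6 characters, 9 bytes
```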
  
  =item __LINE__
  X<__LINE__>
@@ -3964,12 +4077,11 @@ X<map>
  =for Pod::Functions apply a change to a list to get back a new list with the changes
  
  Evaluates the BLOCK or EXPR for each element of LIST (locally setting
-L<C<$_>|perlvar/$_> to each element) and returns the list value composed
-of the
-results of each such evaluation.  In scalar context, returns the
-total number of elements so generated.  Evaluates BLOCK or EXPR in
-list context, so each element of LIST may produce zero, one, or
-more elements in the returned value.
+L<C<$_>|perlvar/$_> to each element) and composes a list of the results of
+each such evaluation.  Each element of LIST may produce zero, one, or more
+elements in the generated list, so the number of elements in the generated
+list may differ from that in LIST.  In scalar context, returns the total
+number of elements so generated.  In list context, returns the generated list.
  
      my @chars = map(chr, @numbers);
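
A short sketch of how the generated list can differ in size from LIST, and of the scalar-context return value. The sample data is arbitrary.

```perl
use strict;
use warnings;

my @lines = ('a b c', '', 'd');

# Each element may yield zero, one, or several results.
my @words = map { split ' ', $_ } @lines;     # ('a', 'b', 'c', 'd')

# In scalar context map returns the number of elements generated.
my $count = map { split ' ', $_ } @lines;     # 4

print scalar(@lines), " lines, $count words: @words\n";
```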
  
@@ -4406,9 +4518,9 @@ argument being L<C<undef>|/undef EXPR>:
  
      open(my $tmp, "+>", undef) or die ...
  
-opens a filehandle to an anonymous temporary file.  Also using C<< +< >>
-works for symmetry, but you really should consider writing something
-to the temporary file first.  You will need to
+opens a filehandle to a newly created empty anonymous temporary file.
+(This happens under any mode, which makes C<< +> >> the only useful and
+sensible mode to use.)  You will need to
  L<C<seek>|/seek FILEHANDLE,POSITION,WHENCE> to do the reading.
  
  Perl is built using PerlIO by default.  Unless you've
@@ -4423,6 +4535,13 @@ To (re)open C<STDOUT> or C<STDERR> as an in-memory file, close it first:
      open(STDOUT, ">", \$variable)
         or die "Can't open STDOUT: $!";
  
+The scalars for in-memory files are treated as octet strings: unless
+the file is being opened with truncation, the scalar may not contain
+any code points over 0xFF.
+
+Opening in-memory files I<can> fail for a variety of reasons.  As with
+any other C<open>, check the return value for success.
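
The following sketch writes to and reads from an in-memory file, checking every open. The final part probes the octet-string restriction stated above without assuming a particular outcome, since the result may depend on the Perl version in use.

```perl
use strict;
use warnings;

# Writing: the scalar receives the bytes printed to the handle.
my $buffer = '';
open(my $out, '>', \$buffer) or die "Cannot open in-memory file: $!";
print {$out} "first line\nsecond line\n";
close($out) or die "Cannot close in-memory file: $!";

# Reading: the scalar's contents are read back like any other file.
open(my $in, '<', \$buffer) or die "Cannot reopen in-memory file: $!";
my @lines = <$in>;
close($in);

print "read ", scalar(@lines), " lines\n";

# Per the text above, a scalar holding code points over 0xFF is expected
# to be rejected when opened without truncation (for example, for append).
my $wide = "\x{100}";
{
    no warnings 'utf8';
    if (open(my $fh, '>>', \$wide)) {
        close($fh);
        print "append open succeeded\n";
    }
    else {
        print "append open failed: $!\n";
    }
}
```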
+
  See L<perliol> for detailed info on PerlIO.
  
  General examples:
@@ -4668,7 +4787,8 @@ DIRHANDLE may be an expression whose value can be used as an indirect
  dirhandle, usually the real dirhandle name.  If DIRHANDLE is an undefined
  scalar variable (or array or hash element), the variable is assigned a
  reference to a new anonymous dirhandle; that is, it's autovivified.
-DIRHANDLEs have their own namespace separate from FILEHANDLEs.
+Dirhandles are the same objects as filehandles; an I/O object can only
+be open as one of these handle types at once.
  
  See the example at L<C<readdir>|/readdir DIRHANDLE>.
  
@@ -4852,7 +4972,7 @@ of values, as follows:
            those.  Raises an exception otherwise.)
  
      i  A signed integer value.
-    I  A unsigned integer value.
+    I  An unsigned integer value.
           (This 'integer' is _at_least_ 32 bits wide.  Its exact
            size depends on what a local C compiler calls 'int'.)
  
@@ -5669,7 +5789,7 @@ returning the filehandle value instead, in which case the LIST may not be
  omitted:
  
      print { $files[$i] } "stuff\n";
-    print { $OK ? STDOUT : STDERR } "stuff\n";
+    print { $OK ? *STDOUT : *STDERR } "stuff\n";
  
  Printing to a closed pipe or socket will generate a SIGPIPE signal.  See
  L<perlipc> for more on signal handling.
@@ -6091,7 +6211,7 @@ Note the I<characters>: depending on the status of the socket, either
  (8-bit) bytes or characters are received.  By default all sockets
  operate on bytes, but for example if the socket has been changed using
  L<C<binmode>|/binmode FILEHANDLE, LAYER> to operate with the
-C<:encoding(utf8)> I/O layer (see the L<open> pragma), the I/O will
+C<:encoding(UTF-8)> I/O layer (see the L<open> pragma), the I/O will
  operate on UTF8-encoded Unicode
  characters, not bytes.  Similarly for the C<:encoding> layer: in that
  case pretty much any characters can be read.
@@ -6651,7 +6771,7 @@ of the file) from the L<Fcntl> module.  Returns C<1> on success, false
  otherwise.
  
  Note the emphasis on bytes: even if the filehandle has been set to operate
-on characters (for example using the C<:encoding(utf8)> I/O layer), the
+on characters (for example using the C<:encoding(UTF-8)> I/O layer), the
  L<C<seek>|/seek FILEHANDLE,POSITION,WHENCE>,
  L<C<tell>|/tell FILEHANDLE>, and
  L<C<sysseek>|/sysseek FILEHANDLE,POSITION,WHENCE>
@@ -6890,7 +7010,7 @@ Note the I<characters>: depending on the status of the socket, either
  (8-bit) bytes or characters are sent.  By default all sockets operate
  on bytes, but for example if the socket has been changed using
  L<C<binmode>|/binmode FILEHANDLE, LAYER> to operate with the
-C<:encoding(utf8)> I/O layer (see L<C<open>|/open FILEHANDLE,EXPR>, or
+C<:encoding(UTF-8)> I/O layer (see L<C<open>|/open FILEHANDLE,EXPR>, or
  the L<open> pragma), the I/O will operate on UTF-8
  encoded Unicode characters, not bytes.  Similarly for the C<:encoding>
  layer: in that case pretty much any characters can be sent.
@@ -6919,6 +7039,9 @@ Sets the current priority for a process, a process group, or a user.
  (See L<setpriority(2)>.)  Raises an exception when used on a machine
  that doesn't implement L<setpriority(2)>.
  
+C<WHICH> can be any of C<PRIO_PROCESS>, C<PRIO_PGRP> or C<PRIO_USER>
+imported from L<POSIX/RESOURCE CONSTANTS>.
+
  Portability issues: L<perlport/setpriority>.
  
  =item setsockopt SOCKET,LEVEL,OPTNAME,OPTVAL
@@ -7168,9 +7291,7 @@ If the subroutine's prototype is C<($$)>, the elements to be compared are
  passed by reference in L<C<@_>|perlvar/@_>, as for a normal subroutine.
  This is slower than unprototyped subroutines, where the elements to be
  compared are passed into the subroutine as the package global variables
-C<$a> and C<$b> (see example below).  Note that in the latter case, it
-is usually highly counter-productive to declare C<$a> and C<$b> as
-lexicals.
+C<$a> and C<$b> (see example below).
  
  If the subroutine is an XSUB, the elements to be compared are pushed on
  to the stack, the way arguments are usually passed to XSUBs.  C<$a> and
@@ -7315,16 +7436,63 @@ C<find_records()> then you can use:
      my @contact = sort(find_records @key);
      my @contact = sort(find_records (@key));
  
-You I<must not> declare C<$a>
-and C<$b> as lexicals.  They are package globals.  That means
-that if you're in the C<main> package and type
-
-    my @articles = sort {$b <=> $a} @files;
-
-then C<$a> and C<$b> are C<$main::a> and C<$main::b> (or C<$::a> and C<$::b>),
-but if you're in the C<FooPack> package, it's the same as typing
-
-    my @articles = sort {$FooPack::b <=> $FooPack::a} @files;
+C<$a> and C<$b> are set as package globals in the package the sort() is
+called from.  That means C<$main::a> and C<$main::b> (or C<$::a> and
+C<$::b>) in the C<main> package, C<$FooPack::a> and C<$FooPack::b> in the
+C<FooPack> package, etc.  If the sort block is in scope of a C<my> or
+C<state> declaration of C<$a> and/or C<$b>, you I<must> spell out the full
+names of the variables in the sort block:
+
+   package main;
+   my $a = "C"; # DANGER, Will Robinson, DANGER !!!
+
+   print sort { $a cmp $b }               qw(A C E G B D F H);
+                                          # WRONG
+   sub badlexi { $a cmp $b }
+   print sort badlexi                     qw(A C E G B D F H);
+                                          # WRONG
+   # the above prints BACFEDGH or some other incorrect ordering
+
+   print sort { $::a cmp $::b }           qw(A C E G B D F H);
+                                          # OK
+   print sort { our $a cmp our $b }       qw(A C E G B D F H);
+                                          # also OK
+   print sort { our ($a, $b); $a cmp $b } qw(A C E G B D F H);
+                                          # also OK
+   sub lexi { our $a cmp our $b }
+   print sort lexi                        qw(A C E G B D F H);
+                                          # also OK
+   # the above print ABCDEFGH
+
+With proper care you may mix package and C<my> (or C<state>) C<$a> and/or C<$b>:
+
+   my $a = {
+      tiny   => -2,
+      small  => -1,
+      normal => 0,
+      big    => 1,
+      huge   => 2
+   };
+
+   say sort { $a->{our $a} <=> $a->{our $b} }
+       qw{ huge normal tiny small big};
+
+   # prints tinysmallnormalbighuge
+
+C<$a> and C<$b> are implicitly local to the sort() execution and regain their
+former values upon completing the sort.
+
+Sort subroutines written using C<$a> and C<$b> are bound to their calling
+package. It is possible, but of limited interest, to define them in a
+different package, since the subroutine must still refer to the calling
+package's C<$a> and C<$b>:
+
+   package Foo;
+   sub lexi { $Bar::a cmp $Bar::b }
+   package Bar;
+   ... sort Foo::lexi ...
+
+Use the prototyped versions (see above) for a more generic alternative.
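
A sketch of the prototyped alternative: because a C<($$)> subroutine receives its operands in @_, it can live in one package and be called from another. The package and subroutine names here are hypothetical.

```perl
use strict;
use warnings;

package Foo;

# With a ($$) prototype the two elements arrive in @_, so the
# subroutine does not depend on any package's $a and $b.
sub by_length($$) {
    my ($left, $right) = @_;
    return length($left) <=> length($right) || $left cmp $right;
}

package Bar;

my @sorted = sort Foo::by_length qw(pear fig banana kiwi);
print "@sorted\n";          # fig kiwi pear banana
```

The price, as noted earlier in this entry, is that prototyped comparison subroutines are slower than the C<$a>/C<$b> form.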
  
  The comparison function is required to behave.  If it returns
  inconsistent results (sometimes saying C<$x[1]> is less than C<$x[2]> and
@@ -7406,6 +7574,8 @@ X<split>
  
  Splits the string EXPR into a list of strings and returns the
  list in list context, or the size of the list in scalar context.
+(Prior to Perl 5.11, it also overwrote C<@_> with the list in
+void and scalar context. If you target old perls, beware.)
  
  If only PATTERN is given, EXPR defaults to L<C<$_>|perlvar/$_>.
  
@@ -7442,6 +7612,10 @@ If PATTERN is C</^/>, then it is treated as if it used the
  L<multiline modifier|perlreref/OPERATORS> (C</^/m>), since it
  isn't much use otherwise.
  
+C<E<sol>m> and any of the other pattern modifiers valid for C<qr>
+(summarized in L<perlop/qrE<sol>STRINGE<sol>msixpodualn>) may be
+specified explicitly.
+
  As another special case,
  L<C<split>|/split E<sol>PATTERNE<sol>,EXPR,LIMIT> emulates the default
  behavior of the
@@ -7458,6 +7632,14 @@ special case was restricted to the use of a plain S<C<" ">> as the
  pattern argument to split; in Perl 5.18.0 and later this special case is
  triggered by any expression which evaluates to the simple string S<C<" ">>.
  
+As of Perl 5.28, this special-cased whitespace splitting works as expected in
+the scope of L<< S<C<"use feature 'unicode_strings'">>|feature/The
+'unicode_strings' feature >>. In previous versions, and outside the scope of
+that feature, it exhibits L<perlunicode/The "Unicode Bug">: characters that are
+whitespace according to Unicode rules but not according to ASCII rules can be
+treated as part of fields rather than as field separators, depending on the
+string's internal encoding.
+
  If omitted, PATTERN defaults to a single space, S<C<" ">>, triggering
  the previously described I<awk> emulation.
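
The awk emulation, a literal single-space pattern, and an explicit modifier can be compared side by side. The sample strings are arbitrary and contain only ASCII whitespace, so the Unicode caveat above does not come into play.

```perl
use strict;
use warnings;

my $line = "  alpha beta  gamma ";

# A pattern of ' ' skips leading whitespace and splits on runs of it.
my @awk = split ' ', $line;          # ('alpha', 'beta', 'gamma')

# A literal / / splits on every single space, keeping the empty
# leading and interior fields (trailing empty fields are dropped).
my @literal = split / /, $line;

# Modifiers valid for qr// may be given explicitly, here /i.
my @parts = split /\s*and\s*/i, "red AND green and blue";

print scalar(@awk), " vs ", scalar(@literal), " fields; parts: @parts\n";
```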
  
@@ -8159,7 +8341,7 @@ If more than one variable is listed, the list must be placed in
  parentheses.  With a parenthesised list, L<C<undef>|/undef EXPR> can be
  used as a
  dummy placeholder.  However, since initialization of state variables in
-list context is currently not possible this would serve no purpose.
+such lists is currently not possible this would serve no purpose.
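
By contrast, a single scalar state variable can be initialised in its declaration. A minimal sketch, with a hypothetical subroutine name:

```perl
use strict;
use warnings;
use feature 'state';

sub next_id {
    state $id = 0;      # initialised once, on the first call only
    return ++$id;
}

my @ids = map { next_id() } 1 .. 3;
print "@ids\n";         # 1 2 3
```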
  
  L<C<state>|/state VARLIST> is available only if the
  L<C<"state"> feature|feature/The 'state' feature> is enabled or if it is
@@ -8489,7 +8671,7 @@ to the current position plus POSITION; and C<2> to set it to EOF plus
  POSITION, typically negative.
  
  Note the emphasis on bytes: even if the filehandle has been set to operate
-on characters (for example using the C<:encoding(utf8)> I/O layer), the
+on characters (for example using the C<:encoding(UTF-8)> I/O layer), the
  L<C<seek>|/seek FILEHANDLE,POSITION,WHENCE>,
  L<C<tell>|/tell FILEHANDLE>, and
  L<C<sysseek>|/sysseek FILEHANDLE,POSITION,WHENCE>
@@ -8656,7 +8838,7 @@ the actual filehandle.  If FILEHANDLE is omitted, assumes the file
  last read.
  
  Note the emphasis on bytes: even if the filehandle has been set to operate
-on characters (for example using the C<:encoding(utf8)> I/O layer), the
+on characters (for example using the C<:encoding(UTF-8)> I/O layer), the
  L<C<seek>|/seek FILEHANDLE,POSITION,WHENCE>,
  L<C<tell>|/tell FILEHANDLE>, and
  L<C<sysseek>|/sysseek FILEHANDLE,POSITION,WHENCE>
@@ -9410,10 +9592,12 @@ extend the string with sufficiently many zero bytes.   It is an error
  to try to write off the beginning of the string (i.e., negative OFFSET).
  
  If the string happens to be encoded as UTF-8 internally (and thus has
-the UTF8 flag set), this is ignored by L<C<vec>|/vec EXPR,OFFSET,BITS>,
-and it operates on the
-internal byte string, not the conceptual character string, even if you
-only have characters with values less than 256.
+the UTF8 flag set), L<C<vec>|/vec EXPR,OFFSET,BITS> tries to convert it
+to use a one-byte-per-character internal representation. However, if the
+string contains characters with values of 256 or higher, that conversion
+will fail, and a deprecation message will be raised.  In that situation,
+C<vec> will operate on the underlying buffer regardless, in its internal
+UTF-8 representation.  In Perl 5.32, this will be a fatal error.
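
One way to stay clear of the situation described above is to hand vec only byte strings. The sketch below encodes a wide-character string to UTF-8 bytes explicitly before using vec on it; whether that is the right representation depends on what the bits are meant to represent.

```perl
use strict;
use warnings;

# Plain byte strings: vec reads and writes fixed-width bit fields.
my $bits = '';
vec($bits, 0, 8) = 0x41;            # first byte  -> "A"
vec($bits, 1, 8) = 0x42;            # second byte -> "B"

# A string with a code point of 256 or higher cannot be downgraded,
# so encode it to UTF-8 bytes explicitly before handing it to vec.
my $text  = "\x{100}";
my $bytes = $text;
utf8::encode($bytes);               # in place: now the two bytes C4 80
my $first = vec($bytes, 0, 8);

printf "%s %#x\n", $bits, $first;   # AB 0xc4
```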
  
  Strings created with L<C<vec>|/vec EXPR,OFFSET,BITS> can also be
  manipulated with the logical