L<C<getc>|/getc FILEHANDLE>, L<C<print>|/print FILEHANDLE LIST>,
L<C<printf>|/printf FILEHANDLE FORMAT, LIST>,
L<C<read>|/read FILEHANDLE,SCALAR,LENGTH,OFFSET>,
-L<C<readdir>|/readdir DIRHANDLE>, L<C<readline>|/readline EXPR>
+L<C<readdir>|/readdir DIRHANDLE>, L<C<readline>|/readline EXPR>,
L<C<rewinddir>|/rewinddir DIRHANDLE>, L<C<say>|/say FILEHANDLE LIST>,
L<C<seek>|/seek FILEHANDLE,POSITION,WHENCE>,
L<C<seekdir>|/seekdir DIRHANDLE,POS>,
L<C<break>|/break>, L<C<caller>|/caller EXPR>,
L<C<continue>|/continue BLOCK>, L<C<die>|/die LIST>, L<C<do>|/do BLOCK>,
L<C<dump>|/dump LABEL>, L<C<eval>|/eval EXPR>,
-L<C<evalbytes>|/evalbytes EXPR> L<C<exit>|/exit EXPR>,
+L<C<evalbytes>|/evalbytes EXPR>, L<C<exit>|/exit EXPR>,
L<C<__FILE__>|/__FILE__>, L<C<goto>|/goto LABEL>,
L<C<last>|/last LABEL>, L<C<__LINE__>|/__LINE__>,
L<C<next>|/next LABEL>, L<C<__PACKAGE__>|/__PACKAGE__>,
L<perlsub>.
Use of L<C<defined>|/defined EXPR> on aggregates (hashes and arrays) is
-deprecated. It
-used to report whether memory for that aggregate had ever been
-allocated. This behavior may disappear in future versions of Perl.
-You should instead use a simple test for size:
+no longer supported. It used to report whether memory for that
+aggregate had ever been allocated. You should instead use a simple
+test for size:
if (@an_array) { print "has array elements\n" }
if (%a_hash) { print "has hash members\n" }
X<do>
Uses the value of EXPR as a filename and executes the contents of the
-file as a Perl script.
+file as a Perl script:
+ # load the exact specified file (./ and ../ special-cased)
+ do '/foo/stat.pl';
+ do './stat.pl';
+ do '../foo/stat.pl';
+
+ # search for the named file within @INC
do 'stat.pl';
+ do 'foo/stat.pl';
-is largely like
+C<do './stat.pl'> is largely like
eval `cat stat.pl`;
-except that it's more concise, runs no external processes, keeps track of
-the current filename for error messages, searches the
-L<C<@INC>|perlvar/@INC> directories, and updates L<C<%INC>|perlvar/%INC>
-if the file is found. See L<perlvar/@INC> and L<perlvar/%INC> for these
-variables. It also differs in that code evaluated with C<do FILE>
-cannot see lexicals in the enclosing scope; C<eval STRING> does. It's
-the same, however, in that it does reparse the file every time you call
-it, so you probably don't want to do this inside a loop.
+except that it's more concise, runs no external processes, and keeps
+track of the current filename for error messages. It also differs in that
+code evaluated with C<do FILE> cannot see lexicals in the enclosing
+scope; C<eval STRING> does. It's the same, however, in that it does
+reparse the file every time you call it, so you probably don't want
+to do this inside a loop.
+
+Using C<do> with a relative path (except for F<./> and F<../>), like
+
+ do 'foo/stat.pl';
+
+will search the L<C<@INC>|perlvar/@INC> directories, and update
+L<C<%INC>|perlvar/%INC> if the file is found. See L<perlvar/@INC>
+and L<perlvar/%INC> for these variables. In particular, note that
+whilst historically L<C<@INC>|perlvar/@INC> contained '.' (the
+current directory) making these two cases equivalent, that is no
+longer necessarily the case, as '.' is not included in C<@INC> by default
+in perl versions 5.26.0 onwards. Instead, perl will now warn:
+
+ do "stat.pl" failed, '.' is no longer in @INC;
+ did you mean do "./stat.pl"?
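The difference between the two lookup styles can be sketched as follows. This is a minimal illustration, not part of the patch: the file name C<do_demo_settings.pl> and the temporary directory are hypothetical, and C<'.'> is stripped from a localized C<@INC> so the result does not depend on the environment.

```perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Temp qw(tempdir);

# Hypothetical file name, chosen only so it is unlikely to exist in @INC.
my $name = 'do_demo_settings.pl';

my $start = getcwd();
my $dir   = tempdir(CLEANUP => 1);
chdir($dir) or die "Cannot chdir to $dir: $!";

open(my $fh, '>', $name) or die "Cannot write $name: $!";
print {$fh} "42;\n";
close($fh) or die "Cannot close $name: $!";

# Explicit ./ prefix: loads exactly this file, @INC is not consulted.
my $explicit = do "./$name";

# Bare relative name: only @INC is searched. '.' is removed here so the
# behaviour matches a default perl 5.26+ regardless of the environment.
my $searched;
{
    local @INC = grep { $_ ne '.' } @INC;
    no warnings;    # silence the "'.' is no longer in @INC" hint
    $searched = do $name;
}

chdir($start) or die "Cannot chdir back to $start: $!";

print "explicit: $explicit\n";
print "searched: ", (defined $searched ? $searched : 'undef'), "\n";
```

Under these assumptions the explicit form returns the file's last value, while the bare name finds nothing and yields C<undef>.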
If L<C<do>|/do EXPR> can read the file but cannot compile it, it
returns L<C<undef>|/undef EXPR> and sets an error message in
You might like to use L<C<do>|/do EXPR> to read in a program
configuration file. Manual error checking can be done this way:
- # read in config files: system first, then user
+ # Read in config files: system first, then user.
+ # Beware of using relative pathnames here.
for $file ("/share/prog/defaults.rc",
"$ENV{HOME}/.someprogrc")
{
=for Pod::Functions catch exceptions or compile and run code
-In the first form, often referred to as a "string eval", the return
-value of EXPR is parsed and executed as if it
-were a little Perl program. The value of the expression (which is itself
-determined within scalar context) is first parsed, and if there were no
-errors, executed as a block within the lexical context of the current Perl
-program. This means, that in particular, any outer lexical variables are
-visible to it, and any package variable settings or subroutine and format
-definitions remain afterwards.
-
-Note that the value is parsed every time the L<C<eval>|/eval EXPR>
-executes. If EXPR is omitted, evaluates L<C<$_>|perlvar/$_>. This form
-is typically used to delay parsing and subsequent execution of the text
-of EXPR until run time.
-
-If the
-L<C<"unicode_eval"> feature|feature/The 'unicode_eval' and 'evalbytes' features>
-is enabled (which is the default under a
-C<use 5.16> or higher declaration), EXPR or L<C<$_>|perlvar/$_> is
-treated as a string of characters, so L<C<use utf8>|utf8> declarations
-have no effect, and source filters are forbidden. In the absence of the
-L<C<"unicode_eval"> feature|feature/The 'unicode_eval' and 'evalbytes' features>,
-will sometimes be treated as characters and sometimes as bytes,
-depending on the internal encoding, and source filters activated within
-the L<C<eval>|/eval EXPR> exhibit the erratic, but historical, behaviour
-of affecting some outer file scope that is still compiling. See also
-the L<C<evalbytes>|/evalbytes EXPR> operator, which always treats its
-input as a byte stream and works properly with source filters, and the
-L<feature> pragma.
-
-Problems can arise if the string expands a scalar containing a floating
-point number. That scalar can expand to letters, such as C<"NaN"> or
-C<"Infinity">; or, within the scope of a L<C<use locale>|locale>, the
-decimal point character may be something other than a dot (such as a
-comma). None of these are likely to parse as you are likely expecting.
-
-In the second form, the code within the BLOCK is parsed only once--at the
-same time the code surrounding the L<C<eval>|/eval EXPR> itself was
-parsed--and executed
+C<eval> in all its forms is used to execute a little Perl program,
+trapping any errors encountered so they don't crash the calling program.
+
+Plain C<eval> with no argument is just C<eval EXPR>, where the
+expression is understood to be contained in L<C<$_>|perlvar/$_>. Thus
+there are only two real C<eval> forms; the one with an EXPR is often
+called "string eval". In a string eval, the value of the expression
+(which is itself determined within scalar context) is first parsed, and
+if there were no errors, executed as a block within the lexical context
+of the current Perl program. This form is typically used to delay
+parsing and subsequent execution of the text of EXPR until run time.
+Note that the value is parsed every time the C<eval> executes.
+
+The other form is called "block eval". It is less general than string
+eval, but the code within the BLOCK is parsed only once (at the same
+time the code surrounding the C<eval> itself was parsed) and executed
within the context of the current Perl program. This form is typically
-used to trap exceptions more efficiently than the first (see below), while
-also providing the benefit of checking the code within BLOCK at compile
-time.
-
-The final semicolon, if any, may be omitted from the value of EXPR or within
-the BLOCK.
+used to trap exceptions more efficiently than the first, while also
+providing the benefit of checking the code within BLOCK at compile time.
+BLOCK is parsed and compiled just once. Since errors are trapped, it
+is often used to check whether a given feature is available.
In both forms, the value returned is the value of the last expression
-evaluated inside the mini-program; a return statement may be also used, just
+evaluated inside the mini-program; a return statement may also be used, just
as with subroutines. The expression providing the return value is evaluated
in void, scalar, or list context, depending on the context of the
-L<C<eval>|/eval EXPR> itself. See L<C<wantarray>|/wantarray> for more
+C<eval> itself. See L<C<wantarray>|/wantarray> for more
on how the evaluation context can be determined.
If there is a syntax error or runtime error, or a L<C<die>|/die LIST>
-statement is executed, L<C<eval>|/eval EXPR> returns
-L<C<undef>|/undef EXPR> in scalar context or an empty list in list
+statement is executed, C<eval> returns
+L<C<undef>|/undef EXPR> in scalar context, or an empty list in list
context, and L<C<$@>|perlvar/$@> is set to the error message. (Prior to
5.16, a bug caused L<C<undef>|/undef EXPR> to be returned in list
context for syntax errors, but not for runtime errors.) If there was no
error, L<C<$@>|perlvar/$@> is set to the empty string. A control flow
operator like L<C<last>|/last LABEL> or L<C<goto>|/goto LABEL> can
bypass the setting of L<C<$@>|perlvar/$@>. Beware that using
-L<C<eval>|/eval EXPR> neither silences Perl from printing warnings to
+C<eval> neither silences Perl from printing warnings to
STDERR, nor does it stuff the text of warning messages into
L<C<$@>|perlvar/$@>. To do either of those, you have to use the
L<C<$SIG{__WARN__}>|perlvar/%SIG> facility, or turn off warnings inside
the BLOCK or EXPR using S<C<no warnings 'all'>>. See
L<C<warn>|/warn LIST>, L<perlvar>, and L<warnings>.
-Note that, because L<C<eval>|/eval EXPR> traps otherwise-fatal errors,
+Note that, because C<eval> traps otherwise-fatal errors,
it is useful for determining whether a particular feature (such as
L<C<socket>|/socket SOCKET,DOMAIN,TYPE,PROTOCOL> or
L<C<symlink>|/symlink OLDFILE,NEWFILE>) is implemented. It is also
Perl's exception-trapping mechanism, where the L<C<die>|/die LIST>
operator is used to raise exceptions.
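As a rough sketch of the exception-trapping use described above, the following wraps a division in a block C<eval>. The helper name C<try_divide> is hypothetical. Ending the block with a true value is one common way to tell success from failure without depending on the value the guarded code returns.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (quotient, '') on success,
# or (undef, error message) if the block died.
sub try_divide {
    my ($n, $d) = @_;
    my $quotient;
    my $ok = eval {
        die "division by zero\n" if $d == 0;
        $quotient = $n / $d;
        1;    # the block's value is true only if nothing died
    };
    return $ok ? ($quotient, '') : (undef, $@);
}

my ($q1, $err1) = try_divide(10, 4);
my ($q2, $err2) = try_divide(1, 0);

print "10/4 = $q1\n";
print "1/0 failed: $err2";
```

Because the error string ends in a newline, C<die> does not append the file name and line number, so C<$err2> holds exactly the message given.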
-If you want to trap errors when loading an XS module, some problems with
-the binary interface (such as Perl version skew) may be fatal even with
-L<C<eval>|/eval EXPR> unless C<$ENV{PERL_DL_NONLAZY}> is set. See
-L<perlrun>.
+Before Perl 5.14, the assignment to L<C<$@>|perlvar/$@> occurred before
+restoration
+of localized variables, which means that for your code to run on older
+versions, a temporary is required if you want to mask some, but not all
+errors:
+
+ # alter $@ on nefarious repugnancy only
+ {
+ my $e;
+ {
+ local $@; # protect existing $@
+ eval { test_repugnancy() };
+ # $@ =~ /nefarious/ and die $@; # Perl 5.14 and higher only
+ $@ =~ /nefarious/ and $e = $@;
+ }
+ die $e if defined $e
+ }
+
+There are some different considerations for each form:
+
+=over 4
+
+=item String eval
+
+Since the return value of EXPR is executed as a block within the lexical
+context of the current Perl program, any outer lexical variables are
+visible to it, and any package variable settings or subroutine and
+format definitions remain afterwards.
+
+=over 4
+
+=item Under the L<C<"unicode_eval"> feature|feature/The 'unicode_eval' and 'evalbytes' features>
+
+If this feature is enabled (which is the default under a C<use 5.16> or
+higher declaration), EXPR is considered to be
+in the same encoding as the surrounding program. Thus if
+S<L<C<use utf8>|utf8>> is in effect, the string will be treated as being
+UTF-8 encoded. Otherwise, the string is considered to be a sequence of
+independent bytes. Bytes that correspond to ASCII-range code points
+will have their normal meanings for operators in the string. The
+treatment of the other bytes depends on whether the
+L<C<"unicode_strings"> feature|feature/The 'unicode_strings' feature> is
+in effect.
+
+In a plain C<eval> without an EXPR argument, being in S<C<use utf8>> or
+not is irrelevant; the UTF-8ness of C<$_> itself determines the
+behavior.
+
+Any S<C<use utf8>> or S<C<no utf8>> declarations within the string have
+no effect, and source filters are forbidden. (C<unicode_strings>,
+however, can appear within the string.) See also the
+L<C<evalbytes>|/evalbytes EXPR> operator, which works properly with
+source filters.
+
+Variables defined outside the C<eval> and used inside it retain their
+original UTF-8ness. Everything inside the string follows the normal
+rules for a Perl program with the given state of S<C<use utf8>>.
+
+=item Outside the C<"unicode_eval"> feature
+
+In this case, the behavior is problematic and is not so easily
+described. Here are two bugs that cannot easily be fixed without
+breaking existing programs:
+
+=over 4
+
+=item *
+
+It can lose track of whether something should be encoded as UTF-8 or
+not.
+
+=item *
+
+Source filters activated within C<eval> leak out into whichever file
+scope is currently being compiled. To give an example with the CPAN module
+L<Semi::Semicolons>:
+
+ BEGIN { eval "use Semi::Semicolons; # not filtered" }
+ # filtered here!
+
+L<C<evalbytes>|/evalbytes EXPR> fixes that to work the way one would
+expect:
+
+ use feature "evalbytes";
+ BEGIN { evalbytes "use Semi::Semicolons; # filtered" }
+ # not filtered
+
+=back
+
+=back
+
+Problems can arise if the string expands a scalar containing a floating
+point number. That scalar can expand to letters, such as C<"NaN"> or
+C<"Infinity">; or, within the scope of a L<C<use locale>|locale>, the
+decimal point character may be something other than a dot (such as a
+comma). None of these are likely to parse as you are likely expecting.
+
+You should be especially careful to remember what's being looked at
+when:
+
+ eval $x; # CASE 1
+ eval "$x"; # CASE 2
+
+ eval '$x'; # CASE 3
+ eval { $x }; # CASE 4
+
+ eval "\$$x++"; # CASE 5
+ $$x++; # CASE 6
+
+Cases 1 and 2 above behave identically: they run the code contained in
+the variable $x. (Although case 2 has misleading double quotes making
+the reader wonder what else might be happening (nothing is).) Cases 3
+and 4 likewise behave in the same way: they run the code C<'$x'>, which
+does nothing but return the value of $x. (Case 4 is preferred for
+purely visual reasons, but it also has the advantage of compiling at
+compile-time instead of at run-time.) Case 5 is a place where
+normally you I<would> like to use double quotes, except that in this
+particular situation, you can just use symbolic references instead, as
+in case 6.
+
+An C<eval ''> executed within a subroutine defined
+in the C<DB> package doesn't see the usual
+surrounding lexical scope, but rather the scope of the first non-DB piece
+of code that called it. You don't normally need to worry about this unless
+you are writing a Perl debugger.
+
+The final semicolon, if any, may be omitted from the value of EXPR.
+
+=item Block eval
If the code to be executed doesn't vary, you may use the eval-BLOCK
form to trap run-time errors without incurring the penalty of
# a run-time error
eval '$answer ='; # sets $@
+If you want to trap errors when loading an XS module, some problems with
+the binary interface (such as Perl version skew) may be fatal even with
+C<eval> unless C<$ENV{PERL_DL_NONLAZY}> is set. See
+L<perlrun>.
+
Using the C<eval {}> form as an exception trap in libraries does have some
issues. Due to the current arguably broken state of C<__DIE__> hooks, you
may wish not to trigger any C<__DIE__> hooks that user code may have installed.
Because this promotes action at a distance, this counterintuitive behavior
may be fixed in a future release.
-With an L<C<eval>|/eval EXPR>, you should be especially careful to
-remember what's being looked at when:
-
- eval $x; # CASE 1
- eval "$x"; # CASE 2
-
- eval '$x'; # CASE 3
- eval { $x }; # CASE 4
-
- eval "\$$x++"; # CASE 5
- $$x++; # CASE 6
-
-Cases 1 and 2 above behave identically: they run the code contained in
-the variable $x. (Although case 2 has misleading double quotes making
-the reader wonder what else might be happening (nothing is).) Cases 3
-and 4 likewise behave in the same way: they run the code C<'$x'>, which
-does nothing but return the value of $x. (Case 4 is preferred for
-purely visual reasons, but it also has the advantage of compiling at
-compile-time instead of at run-time.) Case 5 is a place where
-normally you I<would> like to use double quotes, except that in this
-particular situation, you can just use symbolic references instead, as
-in case 6.
-
-Before Perl 5.14, the assignment to L<C<$@>|perlvar/$@> occurred before
-restoration
-of localized variables, which means that for your code to run on older
-versions, a temporary is required if you want to mask some but not all
-errors:
-
- # alter $@ on nefarious repugnancy only
- {
- my $e;
- {
- local $@; # protect existing $@
- eval { test_repugnancy() };
- # $@ =~ /nefarious/ and die $@; # Perl 5.14 and higher only
- $@ =~ /nefarious/ and $e = $@;
- }
- die $e if defined $e
- }
-
C<eval BLOCK> does I<not> count as a loop, so the loop control statements
L<C<next>|/next LABEL>, L<C<last>|/last LABEL>, or
L<C<redo>|/redo LABEL> cannot be used to leave or restart the block.
-An C<eval ''> executed within a subroutine defined
-in the C<DB> package doesn't see the usual
-surrounding lexical scope, but rather the scope of the first non-DB piece
-of code that called it. You don't normally need to worry about this unless
-you are writing a Perl debugger.
+The final semicolon, if any, may be omitted from within the BLOCK.
+
+=back
=item evalbytes EXPR
X<evalbytes>
=for Pod::Functions +evalbytes similar to string eval, but intend to parse a bytestream
-This function is like L<C<eval>|/eval EXPR> with a string argument,
-except it always parses its argument, or L<C<$_>|perlvar/$_> if EXPR is
-omitted, as a string of bytes. A string containing characters whose
-ordinal value exceeds 255 results in an error. Source filters activated
-within the evaluated code apply to the code itself.
+This function is similar to a L<string eval|/eval EXPR>, except it
+always parses its argument (or L<C<$_>|perlvar/$_> if EXPR is omitted)
+as a string of independent bytes.
-L<C<evalbytes>|/evalbytes EXPR> is available only if the
-L<C<"evalbytes"> feature|feature/The 'unicode_eval' and 'evalbytes' features>
-is enabled or if it is prefixed with C<CORE::>. The
+If called when S<C<use utf8>> is in effect, the string will be assumed
+to be encoded in UTF-8, and C<evalbytes> will make a temporary copy to
+work from, downgraded to non-UTF-8. If this is not possible
+(because one or more characters in it require UTF-8), the C<evalbytes>
+will fail with the error stored in C<$@>.
+
+Bytes that correspond to ASCII-range code points will have their normal
+meanings for operators in the string. The treatment of the other bytes
+depends on whether the L<C<"unicode_strings"> feature|feature/The
+'unicode_strings' feature> is in effect.
+
+Of course, variables that are UTF-8 and are referred to in the string
+retain that:
+
+ my $a = "\x{100}";
+ evalbytes 'print ord $a, "\n"';
+
+prints
+
+ 256
+
+and C<$@> is empty.
+
+Source filters activated within the evaluated code apply to the code
+itself.
+
+L<C<evalbytes>|/evalbytes EXPR> is available starting in Perl v5.16. To
+access it, you must say C<CORE::evalbytes>, but you can omit the
+C<CORE::> if the
L<C<"evalbytes"> feature|feature/The 'unicode_eval' and 'evalbytes' features>
-is enabled automatically with a C<use v5.16> (or higher) declaration in
-the current scope.
+is enabled. This is enabled automatically with a C<use v5.16> (or
+higher) declaration in the current scope.
=item exec LIST
X<exec> X<execute>
=item fileno FILEHANDLE
X<fileno>
+=item fileno DIRHANDLE
+
=for Pod::Functions return file descriptor from filehandle
-Returns the file descriptor for a filehandle, or undefined if the
+Returns the file descriptor for a filehandle or directory handle,
+or undefined if the
filehandle is not open. If there is no real file descriptor at the OS
level, as can happen with filehandles connected to memory objects via
L<C<open>|/open FILEHANDLE,EXPR> with a reference for the third
sub lock {
my ($fh) = @_;
flock($fh, LOCK_EX) or die "Cannot lock mailbox - $!\n";
-
- # and, in case someone appended while we were waiting...
+ # and, in case we're running on a very old UNIX
+ # variant without the modern O_APPEND semantics...
seek($fh, 0, SEEK_END) or die "Cannot seek - $!\n";
}
(See L<getpriority(2)>.) Will raise a fatal exception if used on a
machine that doesn't implement L<getpriority(2)>.
+C<WHICH> can be any of C<PRIO_PROCESS>, C<PRIO_PGRP> or C<PRIO_USER>
+imported from L<POSIX/RESOURCE CONSTANTS>.
+
Portability issues: L<perlport/getpriority>.
=item getpwnam NAME
Like all Perl character operations, L<C<length>|/length EXPR> normally
deals in logical
characters, not physical bytes. For how many bytes a string encoded as
-UTF-8 would take up, use C<length(Encode::encode_utf8(EXPR))> (you'll have
-to C<use Encode> first). See L<Encode> and L<perlunicode>.
+UTF-8 would take up, use C<length(Encode::encode('UTF-8', EXPR))>
+(you'll have to C<use Encode> first). See L<Encode> and L<perlunicode>.
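A small sketch of the character count versus the encoded byte count; the sample string is an arbitrary choice for illustration.

```perl
use strict;
use warnings;
use Encode ();

# Six logical characters: c, a, f, e-acute (U+00E9), space, U+263A.
my $text = "caf\x{e9} \x{263A}";

my $chars = length($text);
my $bytes = length(Encode::encode('UTF-8', $text));

# U+00E9 takes two bytes and U+263A three bytes in UTF-8,
# so the byte count is 3 + 2 + 1 + 3.
print "characters: $chars, UTF-8 bytes: $bytes\n";
```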
=item __LINE__
X<__LINE__>
=for Pod::Functions apply a change to a list to get back a new list with the changes
Evaluates the BLOCK or EXPR for each element of LIST (locally setting
-L<C<$_>|perlvar/$_> to each element) and returns the list value composed
-of the
-results of each such evaluation. In scalar context, returns the
-total number of elements so generated. Evaluates BLOCK or EXPR in
-list context, so each element of LIST may produce zero, one, or
-more elements in the returned value.
+L<C<$_>|perlvar/$_> to each element) and composes a list of the results of
+each such evaluation. Each element of LIST may produce zero, one, or more
+elements in the generated list, so the number of elements in the generated
+list may differ from that in LIST. In scalar context, returns the total
+number of elements so generated. In list context, returns the generated list.
my @chars = map(chr, @numbers);
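To illustrate that each input element may yield zero, one, or several output elements, here is a short sketch with made-up data: odd numbers are dropped and even numbers contribute two elements each.

```perl
use strict;
use warnings;

my @numbers = (1, 2, 3, 4);

# Odd values produce an empty list; even values produce two elements.
my @expanded = map { $_ % 2 ? () : ($_, $_ * 10) } @numbers;

# In scalar context map returns the number of elements generated.
my $count = map { $_ % 2 ? () : ($_, $_ * 10) } @numbers;

print "expanded: @expanded\n";
print "count: $count\n";
```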
open(my $tmp, "+>", undef) or die ...
-opens a filehandle to an anonymous temporary file. Also using C<< +< >>
-works for symmetry, but you really should consider writing something
-to the temporary file first. You will need to
+opens a filehandle to a newly created empty anonymous temporary file.
+(This happens under any mode, which makes C<< +> >> the only useful and
+sensible mode to use.) You will need to
L<C<seek>|/seek FILEHANDLE,POSITION,WHENCE> to do the reading.
Perl is built using PerlIO by default. Unless you've
open(STDOUT, ">", \$variable)
or die "Can't open STDOUT: $!";
+The scalars for in-memory files are treated as octet strings: unless
+the file is being opened with truncation, the scalar may not contain
+any code points over 0xFF.
+
+Opening in-memory files I<can> fail for a variety of reasons. As with
+any other C<open>, check the return value for success.
+
See L<perliol> for detailed info on PerlIO.
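A minimal sketch of checking the result of an in-memory C<open>. The variable names are illustrative only. The second open is expected to fail on perls that enforce the octet-string rule described above, which is why the code reports the outcome rather than assuming it.

```perl
use strict;
use warnings;

# Reading from an ordinary octet string works like any other file.
my $buffer = "line one\nline two\n";
open(my $in, '<', \$buffer)
    or die "Cannot open in-memory file: $!";
my @lines = <$in>;
close($in) or die "Cannot close in-memory file: $!";

# A scalar holding a code point above 0xFF is not an octet string, so a
# non-truncating open on it is expected to fail on perls that enforce
# the rule described above.
my $wide = "\x{263A}\n";
my $opened;
{
    no warnings 'utf8';    # the failure may also emit a utf8 warning
    $opened = open(my $bad, '<', \$wide);
}

print "read ", scalar(@lines), " lines\n";
print "wide open ", ($opened ? "succeeded" : "failed: $!"), "\n";
```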
General examples:
dirhandle, usually the real dirhandle name. If DIRHANDLE is an undefined
scalar variable (or array or hash element), the variable is assigned a
reference to a new anonymous dirhandle; that is, it's autovivified.
-DIRHANDLEs have their own namespace separate from FILEHANDLEs.
+Dirhandles are the same objects as filehandles; an I/O object can only
+be open as one of these handle types at once.
See the example at L<C<readdir>|/readdir DIRHANDLE>.
those. Raises an exception otherwise.)
i A signed integer value.
- I A unsigned integer value.
+ I An unsigned integer value.
(This 'integer' is _at_least_ 32 bits wide. Its exact
size depends on what a local C compiler calls 'int'.)
omitted:
print { $files[$i] } "stuff\n";
- print { $OK ? STDOUT : STDERR } "stuff\n";
+ print { $OK ? *STDOUT : *STDERR } "stuff\n";
Printing to a closed pipe or socket will generate a SIGPIPE signal. See
L<perlipc> for more on signal handling.
(8-bit) bytes or characters are received. By default all sockets
operate on bytes, but for example if the socket has been changed using
L<C<binmode>|/binmode FILEHANDLE, LAYER> to operate with the
-C<:encoding(utf8)> I/O layer (see the L<open> pragma), the I/O will
+C<:encoding(UTF-8)> I/O layer (see the L<open> pragma), the I/O will
operate on UTF8-encoded Unicode
characters, not bytes. Similarly for the C<:encoding> layer: in that
case pretty much any characters can be read.
otherwise.
Note the emphasis on bytes: even if the filehandle has been set to operate
-on characters (for example using the C<:encoding(utf8)> I/O layer), the
+on characters (for example using the C<:encoding(UTF-8)> I/O layer), the
L<C<seek>|/seek FILEHANDLE,POSITION,WHENCE>,
L<C<tell>|/tell FILEHANDLE>, and
L<C<sysseek>|/sysseek FILEHANDLE,POSITION,WHENCE>
(8-bit) bytes or characters are sent. By default all sockets operate
on bytes, but for example if the socket has been changed using
L<C<binmode>|/binmode FILEHANDLE, LAYER> to operate with the
-C<:encoding(utf8)> I/O layer (see L<C<open>|/open FILEHANDLE,EXPR>, or
+C<:encoding(UTF-8)> I/O layer (see L<C<open>|/open FILEHANDLE,EXPR>, or
the L<open> pragma), the I/O will operate on UTF-8
encoded Unicode characters, not bytes. Similarly for the C<:encoding>
layer: in that case pretty much any characters can be sent.
(See L<setpriority(2)>.) Raises an exception when used on a machine
that doesn't implement L<setpriority(2)>.
+C<WHICH> can be any of C<PRIO_PROCESS>, C<PRIO_PGRP> or C<PRIO_USER>
+imported from L<POSIX/RESOURCE CONSTANTS>.
+
Portability issues: L<perlport/setpriority>.
=item setsockopt SOCKET,LEVEL,OPTNAME,OPTVAL
passed by reference in L<C<@_>|perlvar/@_>, as for a normal subroutine.
This is slower than unprototyped subroutines, where the elements to be
compared are passed into the subroutine as the package global variables
-C<$a> and C<$b> (see example below). Note that in the latter case, it
-is usually highly counter-productive to declare C<$a> and C<$b> as
-lexicals.
+C<$a> and C<$b> (see example below).
If the subroutine is an XSUB, the elements to be compared are pushed on
to the stack, the way arguments are usually passed to XSUBs. C<$a> and
my @contact = sort(find_records @key);
my @contact = sort(find_records (@key));
-You I<must not> declare C<$a>
-and C<$b> as lexicals. They are package globals. That means
-that if you're in the C<main> package and type
-
- my @articles = sort {$b <=> $a} @files;
-
-then C<$a> and C<$b> are C<$main::a> and C<$main::b> (or C<$::a> and C<$::b>),
-but if you're in the C<FooPack> package, it's the same as typing
-
- my @articles = sort {$FooPack::b <=> $FooPack::a} @files;
+C<$a> and C<$b> are set as package globals in the package the sort() is
+called from. That means C<$main::a> and C<$main::b> (or C<$::a> and
+C<$::b>) in the C<main> package, C<$FooPack::a> and C<$FooPack::b> in the
+C<FooPack> package, etc. If the sort block is in scope of a C<my> or
+C<state> declaration of C<$a> and/or C<$b>, you I<must> spell out the full
+name of the variables in the sort block:
+
+ package main;
+ my $a = "C"; # DANGER, Will Robinson, DANGER !!!
+
+ print sort { $a cmp $b } qw(A C E G B D F H);
+ # WRONG
+ sub badlexi { $a cmp $b }
+ print sort badlexi qw(A C E G B D F H);
+ # WRONG
+ # the above prints BACFEDGH or some other incorrect ordering
+
+ print sort { $::a cmp $::b } qw(A C E G B D F H);
+ # OK
+ print sort { our $a cmp our $b } qw(A C E G B D F H);
+ # also OK
+ print sort { our ($a, $b); $a cmp $b } qw(A C E G B D F H);
+ # also OK
+ sub lexi { our $a cmp our $b }
+ print sort lexi qw(A C E G B D F H);
+ # also OK
+ # the above print ABCDEFGH
+
+With proper care you may mix package and my (or state) C<$a> and/or C<$b>:
+
+ my $a = {
+ tiny => -2,
+ small => -1,
+ normal => 0,
+ big => 1,
+ huge => 2
+ };
+
+ say sort { $a->{our $a} <=> $a->{our $b} }
+ qw{ huge normal tiny small big};
+
+ # prints tinysmallnormalbighuge
+
+C<$a> and C<$b> are implicitly local to the sort() execution and regain their
+former values upon completing the sort.
+
+Sort subroutines written using C<$a> and C<$b> are bound to their calling
+package. It is possible, but of limited interest, to define them in a
+different package, since the subroutine must still refer to the calling
+package's C<$a> and C<$b>:
+
+ package Foo;
+ sub lexi { $Bar::a cmp $Bar::b }
+ package Bar;
+ ... sort Foo::lexi ...
+
+Use the prototyped versions (see above) for a more generic alternative.
The comparison function is required to behave. If it returns
inconsistent results (sometimes saying C<$x[1]> is less than C<$x[2]> and
Splits the string EXPR into a list of strings and returns the
list in list context, or the size of the list in scalar context.
+(Prior to Perl 5.11, it also overwrote C<@_> with the list in
+void and scalar context. If you target old perls, beware.)
If only PATTERN is given, EXPR defaults to L<C<$_>|perlvar/$_>.
L<multiline modifier|perlreref/OPERATORS> (C</^/m>), since it
isn't much use otherwise.
+C<E<sol>m> and any of the other pattern modifiers valid for C<qr>
+(summarized in L<perlop/qrE<sol>STRINGE<sol>msixpodualn>) may be
+specified explicitly.
+
As another special case,
L<C<split>|/split E<sol>PATTERNE<sol>,EXPR,LIMIT> emulates the default
behavior of the
pattern argument to split; in Perl 5.18.0 and later this special case is
triggered by any expression which evaluates to the simple string S<C<" ">>.
+As of Perl 5.28, this special-cased whitespace splitting works as expected in
+the scope of L<< S<C<"use feature 'unicode_strings'">>|feature/The
+'unicode_strings' feature >>. In previous versions, and outside the scope of
+that feature, it exhibits L<perlunicode/The "Unicode Bug">: characters that are
+whitespace according to Unicode rules but not according to ASCII rules can be
+treated as part of fields rather than as field separators, depending on the
+string's internal encoding.
+
If omitted, PATTERN defaults to a single space, S<C<" ">>, triggering
the previously described I<awk> emulation.
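The I<awk> emulation can be contrasted with a literal single-space pattern in a short sketch; the sample string is arbitrary.

```perl
use strict;
use warnings;

my $line = "  alpha beta  gamma ";

# A plain string " " triggers awk emulation: leading whitespace is
# skipped and runs of whitespace act as a single separator.
my @awk = split ' ', $line;

# The regex / / splits on every single space, so leading and interior
# empty fields are kept (trailing empty fields are still dropped).
my @literal = split / /, $line;

print scalar(@awk), " fields with ' ': @awk\n";
print scalar(@literal), " fields with / /\n";
```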
parentheses. With a parenthesised list, L<C<undef>|/undef EXPR> can be
used as a
dummy placeholder. However, since initialization of state variables in
-list context is currently not possible this would serve no purpose.
+such lists is currently not possible, this would serve no purpose.
L<C<state>|/state VARLIST> is available only if the
L<C<"state"> feature|feature/The 'state' feature> is enabled or if it is
POSITION, typically negative.
Note the emphasis on bytes: even if the filehandle has been set to operate
-on characters (for example using the C<:encoding(utf8)> I/O layer), the
+on characters (for example using the C<:encoding(UTF-8)> I/O layer), the
L<C<seek>|/seek FILEHANDLE,POSITION,WHENCE>,
L<C<tell>|/tell FILEHANDLE>, and
L<C<sysseek>|/sysseek FILEHANDLE,POSITION,WHENCE>
last read.
Note the emphasis on bytes: even if the filehandle has been set to operate
-on characters (for example using the C<:encoding(utf8)> I/O layer), the
+on characters (for example using the C<:encoding(UTF-8)> I/O layer), the
L<C<seek>|/seek FILEHANDLE,POSITION,WHENCE>,
L<C<tell>|/tell FILEHANDLE>, and
L<C<sysseek>|/sysseek FILEHANDLE,POSITION,WHENCE>
to try to write off the beginning of the string (i.e., negative OFFSET).
If the string happens to be encoded as UTF-8 internally (and thus has
-the UTF8 flag set), this is ignored by L<C<vec>|/vec EXPR,OFFSET,BITS>,
-and it operates on the
-internal byte string, not the conceptual character string, even if you
-only have characters with values less than 256.
+the UTF8 flag set), L<C<vec>|/vec EXPR,OFFSET,BITS> tries to convert it
+to use a one-byte-per-character internal representation. However, if the
+string contains characters with values of 256 or higher, that conversion
+will fail, and a deprecation message will be raised. In that situation,
+C<vec> will operate on the underlying buffer regardless, in its internal
+UTF-8 representation. In Perl 5.32, this will be a fatal error.
Strings created with L<C<vec>|/vec EXPR,OFFSET,BITS> can also be
manipulated with the logical