sv.c, rs.t, perlvar.pod (Coverity finding: did you know what happens with $/=\0?)

diff --git a/pod/perlvar.pod b/pod/perlvar.pod

index 817f1e5..e8a38cb 100644
--- a/pod/perlvar.pod
+++ b/pod/perlvar.pod
@@ -13,14 +13,20 @@ you need only say
  
      use English;
  
-at the top of your program.  This will alias all the short names to the
-long names in the current package.  Some even have medium names,
-generally borrowed from B<awk>.
+at the top of your program. This aliases all the short names to the long
+names in the current package. Some even have medium names, generally
+borrowed from B<awk>. In general, it's best to use the
  
-If you don't mind the performance hit, variables that depend on the
-currently selected filehandle may instead be set by calling an
-appropriate object method on the IO::Handle object.  (Summary lines
-below for this contain the word HANDLE.)  First you must say
+    use English '-no_match_vars';
+
+invocation if you don't need $PREMATCH, $MATCH, or $POSTMATCH, as it avoids
+a certain performance hit with the use of regular expressions. See
+L<English>.
+
+Variables that depend on the currently selected filehandle may be set by
+calling an appropriate object method on the IO::Handle object, although
+this is less efficient than using the regular built-in variables. (Summary
+lines below for this contain the word HANDLE.) First you must say
  
      use IO::Handle;
  
@@ -33,10 +39,11 @@ or more safely,
      HANDLE->method(EXPR)
  
  Each method returns the old value of the IO::Handle attribute.
-The methods each take an optional EXPR, which if supplied specifies the
+The methods each take an optional EXPR, which, if supplied, specifies the
  new value for the IO::Handle attribute in question.  If not supplied,
  most methods do nothing to the current value--except for
  autoflush(), which will assume a 1 for you, just to be different.
+
  Because loading in the IO::Handle class is an expensive operation, you should
  learn how to use the regular built-in variables.
  
@@ -44,6 +51,71 @@ A few of these variables are considered "read-only".  This means that if
  you try to assign to this variable, either directly or indirectly through
  a reference, you'll raise a run-time exception.
  
+You should be very careful when modifying the default values of most
+special variables described in this document. In most cases you want
+to localize these variables before changing them, since if you don't,
+the change may affect other modules which rely on the default values
+of the special variables that you have changed. This is one of the
+correct ways to read the whole file at once:
+
+    open my $fh, "foo" or die $!;
+    local $/; # enable localized slurp mode
+    my $content = <$fh>;
+    close $fh;
+
+But the following code is quite bad:
+
+    open my $fh, "foo" or die $!;
+    undef $/; # enable slurp mode
+    my $content = <$fh>;
+    close $fh;
+
+since some other module may want to read data from some file in the
+default "line mode", so if the code we have just presented has been
+executed, the global value of C<$/> is now changed for any other code
+running inside the same Perl interpreter.
+
+Usually when a variable is localized you want to make sure that this
+change affects the shortest scope possible. So unless you are already
+inside some short C<{}> block, you should create one yourself. For
+example:
+
+    my $content = '';
+    open my $fh, "foo" or die $!;
+    {
+        local $/;
+        $content = <$fh>;
+    }
+    close $fh;
+
+Here is an example of how your own code can break:
+
+    for (1..5){
+        nasty_break();
+        print "$_ ";
+    }
+    sub nasty_break {
+        $_ = 5;
+        # do something with $_
+    }
+
+You probably expect this code to print:
+
+    1 2 3 4 5
+
+but instead you get:
+
+    5 5 5 5 5
+
+Why? Because nasty_break() modifies C<$_> without localizing it
+first. The fix is to add local():
+
+        local $_ = 5;
+
+It's easy to notice the problem in such a short example, but in more
+complicated code you are looking for trouble if you don't localize
+changes to the special variables.
+
  The following list is ordered by scalar variables first, then the
  arrays, then the hashes.
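
The scoping advice above can be checked with a small, self-contained sketch. It assumes an in-memory filehandle in place of the file "foo" so that it runs anywhere; the variable names are illustrative only.

```perl
use strict;
use warnings;

# In-memory filehandle standing in for a real file (an assumption
# made so the sketch is self-contained).
my $data = "line one\nline two\nline three\n";
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";

my $content;
{
    local $/;            # slurp mode, only inside this block
    $content = <$fh>;    # reads everything in one go
}
close $fh;

# Outside the block $/ is back to its default of "\n".
my $separator_restored = (defined $/ && $/ eq "\n") ? 1 : 0;
print "slurped ", length($content), " bytes; ",
      "separator restored: ", ($separator_restored ? "yes" : "no"), "\n";
```

Because the change to C<$/> ends with the block, any later code that reads line by line is unaffected.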
  
@@ -105,6 +177,11 @@ test.  Outside a C<while> test, this will not happen.
  
  =back
  
+As C<$_> is a global variable, this may lead in some cases to unwanted
+side-effects.  As of perl 5.9.1, you can now use a lexical version of
+C<$_> by declaring it in a file or in a block with C<my>.  Moreover,
+declaring C<our $_> restores the global C<$_> in the current scope.
+
  (Mnemonic: underline is understood in certain operations.)
  
  =back
@@ -117,10 +194,9 @@ test.  Outside a C<while> test, this will not happen.
  
  Special package variables when using sort(), see L<perlfunc/sort>.
  Because of this specialness $a and $b don't need to be declared
-(using local(), use vars, or our()) even when using the strict
-vars pragma.  Don't lexicalize them with C<my $a> or C<my $b>
-if you want to be able to use them in the sort() comparison block
-or function.
+(using use vars, or our()) even when using the C<strict 'vars'> pragma.
+Don't lexicalize them with C<my $a> or C<my $b> if you want to be
+able to use them in the sort() comparison block or function.
  
  =back
  
@@ -144,7 +220,7 @@ BLOCK).  (Mnemonic: like & in some editors.)  This variable is read-only
  and dynamically scoped to the current BLOCK.
  
  The use of this variable anywhere in a program imposes a considerable
-performance penalty on all regular expression matches.  See L<BUGS>.
+performance penalty on all regular expression matches.  See L</BUGS>.
  
  =item $PREMATCH
  
@@ -156,7 +232,7 @@ enclosed by the current BLOCK).  (Mnemonic: C<`> often precedes a quoted
  string.)  This variable is read-only.
  
  The use of this variable anywhere in a program imposes a considerable
-performance penalty on all regular expression matches.  See L<BUGS>.
+performance penalty on all regular expression matches.  See L</BUGS>.
  
  =item $POSTMATCH
  
@@ -167,14 +243,14 @@ pattern match (not counting any matches hidden within a BLOCK or eval()
  enclosed by the current BLOCK).  (Mnemonic: C<'> often follows a quoted
  string.)  Example:
  
-    $_ = 'abcdefghi';
+    local $_ = 'abcdefghi';
      /def/;
      print "$`:$&:$'\n";        # prints abc:def:ghi
  
  This variable is read-only and dynamically scoped to the current BLOCK.
  
  The use of this variable anywhere in a program imposes a considerable
-performance penalty on all regular expression matches.  See L<BUGS>.
+performance penalty on all regular expression matches.  See L</BUGS>.
  
  =item $LAST_PAREN_MATCH
  
@@ -196,7 +272,7 @@ with the rightmost closing parenthesis) of the last successful search
  pattern.  (Mnemonic: the (possibly) Nested parenthesis that most
  recently closed.)
  
-This is primarly used inside C<(?{...})> blocks for examining text
+This is primarily used inside C<(?{...})> blocks for examining text
  recently matched. For example, to effectively capture text to a variable
  (in addition to C<$1>, C<$2>, etc.), replace C<(...)> with
  
@@ -222,48 +298,41 @@ past where $2 ends, and so on.  You can use C<$#+> to determine
  how many subgroups were in the last successful match.  See the
  examples given for the C<@-> variable.
  
-=item $MULTILINE_MATCHING
+=item HANDLE->input_line_number(EXPR)
  
-=item $*
+=item $INPUT_LINE_NUMBER
  
-Set to a non-zero integer value to do multi-line matching within a
-string, 0 (or undefined) to tell Perl that it can assume that strings
-contain a single line, for the purpose of optimizing pattern matches.
-Pattern matches on strings containing multiple newlines can produce
-confusing results when C<$*> is 0 or undefined. Default is undefined.
-(Mnemonic: * matches multiple things.) This variable influences the
-interpretation of only C<^> and C<$>. A literal newline can be searched
-for even when C<$* == 0>.
+=item $NR
  
-Use of C<$*> is deprecated in modern Perl, supplanted by 
-the C</s> and C</m> modifiers on pattern matching.
+=item $.
  
-Assigning a non-numerical value to C<$*> triggers a warning (and makes
-C<$*> act if C<$* == 0>), while assigning a numerical value to C<$*>
-makes that an implicit C<int> is applied on the value.
+Current line number for the last filehandle accessed. 
  
-=item input_line_number HANDLE EXPR
+Each filehandle in Perl counts the number of lines that have been read
+from it.  (Depending on the value of C<$/>, Perl's idea of what
+constitutes a line may not match yours.)  When a line is read from a
+filehandle (via readline() or C<< <> >>), or when tell() or seek() is
+called on it, C<$.> becomes an alias to the line counter for that
+filehandle.
  
-=item $INPUT_LINE_NUMBER
+You can adjust the counter by assigning to C<$.>, but this will not
+actually move the seek pointer.  I<Localizing C<$.> will not localize
+the filehandle's line count>.  Instead, it will localize perl's notion
+of which filehandle C<$.> is currently aliased to.
  
-=item $NR
+C<$.> is reset when the filehandle is closed, but B<not> when an open
+filehandle is reopened without an intervening close().  For more
+details, see L<perlop/"IE<sol>O Operators">.  Because C<< <> >> never does
+an explicit close, line numbers increase across ARGV files (but see
+examples in L<perlfunc/eof>).
  
-=item $.
+You can also use C<< HANDLE->input_line_number(EXPR) >> to access the
+line counter for a given filehandle without having to worry about
+which handle you last accessed.
  
-The current input record number for the last file handle from which
-you just read() (or called a C<seek> or C<tell> on).  The value
-may be different from the actual physical line number in the file,
-depending on what notion of "line" is in effect--see C<$/> on how
-to change that.  An explicit close on a filehandle resets the line
-number.  Because C<< <> >> never does an explicit close, line
-numbers increase across ARGV files (but see examples in L<perlfunc/eof>).
-Consider this variable read-only: setting it does not reposition
-the seek pointer; you'll have to do that on your own.  Localizing C<$.>
-has the effect of also localizing Perl's notion of "the last read
-filehandle".  (Mnemonic: many programs use "." to mean the current line
-number.)
-
-=item input_record_separator HANDLE EXPR
+(Mnemonic: many programs use "." to mean the current line number.)
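
A minimal sketch of the aliasing behaviour described for C<$.>, assuming two in-memory filehandles (the handle and variable names are made up for illustration):

```perl
use strict;
use warnings;
use IO::Handle;

my ($text_a, $text_b) = ("a1\na2\na3\n", "b1\n");
open my $fh_a, '<', \$text_a or die $!;
open my $fh_b, '<', \$text_b or die $!;

my $line;
$line = <$fh_a>;
$line = <$fh_a>;
my $after_a = $.;     # $. is aliased to $fh_a's counter here

$line = <$fh_b>;
my $after_b = $.;     # $. now follows $fh_b's counter instead

# Per-handle access does not depend on which handle was read last.
my $count_a = $fh_a->input_line_number;

print "after a: $after_a, after b: $after_b, a again: $count_a\n";
```

The method call is the safer choice whenever more than one handle is being read in the same stretch of code.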
+
+=item IO::Handle->input_record_separator(EXPR)
  
  =item $INPUT_RECORD_SEPARATOR
  
@@ -285,8 +354,8 @@ blindly assume that the next input character belongs to the next
  paragraph, even if it's a newline.  (Mnemonic: / delimits
  line boundaries when quoting poetry.)
  
-    undef $/;          # enable "slurp" mode
-    $_ = <FH>;         # whole file now here
+    local $/;           # enable "slurp" mode
+    local $_ = <FH>;    # whole file now here
      s/\n[ \t]+/ /g;
  
  Remember: the value of C<$/> is a string, not a regex.  B<awk> has to be
@@ -297,15 +366,16 @@ scalar that's convertible to an integer will attempt to read records
  instead of lines, with the maximum record size being the referenced
  integer.  So this:
  
-    $/ = \32768; # or \"32768", or \$var_containing_32768
-    open(FILE, $myfile);
-    $_ = <FILE>;
+    local $/ = \32768; # or \"32768", or \$var_containing_32768
+    open my $fh, $myfile or die $!;
+    local $_ = <$fh>;
  
  will read a record of no more than 32768 bytes from FILE.  If you're
  not reading from a record-oriented file (or your OS doesn't have
  record-oriented files), then you'll likely get a full chunk of data
  with every read.  If a record is larger than the record size you've
-set, you'll get the record back in pieces.
+set, you'll get the record back in pieces.  Setting the record size to
+zero or less causes the (rest of the) whole file to be read in.
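
As an illustration of fixed-size record reads, here is a short sketch that assumes a 10-byte in-memory "file" and a record size of 4; both values are arbitrary choices for the example.

```perl
use strict;
use warnings;

my $data = 'abcdefghij';            # 10 bytes of sample input
open my $fh, '<', \$data or die $!;

my @records;
{
    local $/ = \4;                  # read records of at most 4 bytes
    while (defined(my $record = <$fh>)) {
        push @records, $record;
    }
}
close $fh;

print join('|', @records), "\n";
```

The last record is shorter than the others because only two bytes remain when it is read.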
  
  On VMS, record reads are done with the equivalent of C<sysread>,
  so it's best not to mix record and non-record reads on the same
@@ -316,7 +386,7 @@ non-record reads of a file.
  
  See also L<perlport/"Newlines">.  Also see C<$.>.
  
-=item autoflush HANDLE EXPR
+=item HANDLE->autoflush(EXPR)
  
  =item $OUTPUT_AUTOFLUSH
  
@@ -334,7 +404,7 @@ a Perl program under B<rsh> and want to see the output as it's
  happening.  This has no effect on input buffering.  See L<perlfunc/getc>
  for that.  (Mnemonic: when you want your pipes to be piping hot.)
  
-=item output_field_separator HANDLE EXPR
+=item IO::Handle->output_field_separator EXPR
  
  =item $OUTPUT_FIELD_SEPARATOR
  
@@ -342,14 +412,11 @@ for that.  (Mnemonic: when you want your pipes to be piping hot.)
  
  =item $,
  
-The output field separator for the print operator.  Ordinarily the
-print operator simply prints out its arguments without further
-adornment.  To get behavior more like B<awk>, set this variable as
-you would set B<awk>'s OFS variable to specify what is printed
-between fields.  (Mnemonic: what is printed when there is a "," in
-your print statement.)
+The output field separator for the print operator.  If defined, this
+value is printed between each of print's arguments.  Default is C<undef>.
+(Mnemonic: what is printed when there is a "," in your print statement.)
  
-=item output_record_separator HANDLE EXPR
+=item IO::Handle->output_record_separator EXPR
  
  =item $OUTPUT_RECORD_SEPARATOR
  
@@ -357,14 +424,10 @@ your print statement.)
  
  =item $\
  
-The output record separator for the print operator.  Ordinarily the
-print operator simply prints out its arguments as is, with no
-trailing newline or other end-of-record string added.  To get
-behavior more like B<awk>, set this variable as you would set
-B<awk>'s ORS variable to specify what is printed at the end of the
-print.  (Mnemonic: you set C<$\> instead of adding "\n" at the
-end of the print.  Also, it's just like C<$/>, but it's what you
-get "back" from Perl.)
+The output record separator for the print operator.  If defined, this
+value is printed after the last of print's arguments.  Default is C<undef>.
+(Mnemonic: you set C<$\> instead of adding "\n" at the end of the print.
+Also, it's just like C<$/>, but it's what you get "back" from Perl.)
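
The effect of C<$,> and C<$\> on print can be sketched as follows. The example writes to an in-memory filehandle so the result can be inspected; the separator values are arbitrary.

```perl
use strict;
use warnings;

my $out = '';
open my $fh, '>', \$out or die $!;
{
    local $, = '-';     # printed between print's arguments
    local $\ = "!\n";   # printed after the last argument
    print {$fh} 'a', 'b', 'c';
}
print {$fh} 'x', 'y';   # defaults restored: nothing is added
close $fh;

print $out, "\n";
```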
  
  =item $LIST_SEPARATOR
  
@@ -406,21 +469,7 @@ taken for something more important.)
  Consider using "real" multidimensional arrays as described
  in L<perllol>.
  
-=item $OFMT
-
-=item $#
-
-The output format for printed numbers.  This variable is a half-hearted
-attempt to emulate B<awk>'s OFMT variable.  There are times, however,
-when B<awk> and Perl have differing notions of what counts as 
-numeric.  The initial value is "%.I<n>g", where I<n> is the value
-of the macro DBL_DIG from your system's F<float.h>.  This is different from
-B<awk>'s default OFMT setting of "%.6g", so you need to set C<$#>
-explicitly to get B<awk>'s value.  (Mnemonic: # is the number sign.)
-
-Use of C<$#> is deprecated.
-
-=item format_page_number HANDLE EXPR
+=item HANDLE->format_page_number(EXPR)
  
  =item $FORMAT_PAGE_NUMBER
  
@@ -430,7 +479,7 @@ The current page number of the currently selected output channel.
  Used with formats.
  (Mnemonic: % is page number in B<nroff>.)
  
-=item format_lines_per_page HANDLE EXPR
+=item HANDLE->format_lines_per_page(EXPR)
  
  =item $FORMAT_LINES_PER_PAGE
  
@@ -441,7 +490,7 @@ output channel.  Default is 60.
  Used with formats.
  (Mnemonic: = has horizontal lines.)
  
-=item format_lines_left HANDLE EXPR
+=item HANDLE->format_lines_left(EXPR)
  
  =item $FORMAT_LINES_LEFT
  
@@ -461,9 +510,9 @@ C<$-[>I<n>C<]> is the offset of the start of the substring matched by
  I<n>-th subpattern, or undef if the subpattern did not match.
  
  Thus after a match against $_, $& coincides with C<substr $_, $-[0],
-$+[0] - $-[0]>.  Similarly, C<$>I<n> coincides with C<substr $_, $-[>I<n>C<],
-$+[>I<n>C<] - $-[>I<n>C<]> if C<$-[>I<n>C<]> is defined, and $+ coincides with
-C<substr $_, $-[$#-], $+[$#-]>.  One can use C<$#-> to find the last
+$+[0] - $-[0]>.  Similarly, $I<n> coincides with C<substr $_, $-[n],
+$+[n] - $-[n]> if C<$-[n]> is defined, and $+ coincides with
+C<substr $_, $-[$#-], $+[$#-] - $-[$#-]>.  One can use C<$#-> to find the last
  matched subgroup in the last successful match.  Contrast with
  C<$#+>, the number of subgroups in the regular expression.  Compare
  with C<@+>.
@@ -472,10 +521,8 @@ This array holds the offsets of the beginnings of the last
  successful submatches in the currently active dynamic scope.
  C<$-[0]> is the offset into the string of the beginning of the
  entire match.  The I<n>th element of this array holds the offset
-of the I<n>th submatch, so C<$+[1]> is the offset where $1
-begins, C<$+[2]> the offset where $2 begins, and so on.
-You can use C<$#-> to determine how many subgroups were in the
-last successful match.  Compare with the C<@+> variable.
+of the I<n>th submatch, so C<$-[1]> is the offset where $1
+begins, C<$-[2]> the offset where $2 begins, and so on.
  
  After a match against some variable $var:
  
@@ -491,11 +538,11 @@ After a match against some variable $var:
  
  =item C<$2> is the same as C<substr($var, $-[2], $+[2] - $-[2])>
  
-=item C<$3> is the same as C<substr $var, $-[3], $+[3] - $-[3])>
+=item C<$3> is the same as C<substr($var, $-[3], $+[3] - $-[3])>
  
  =back
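
The equivalences listed above can be verified with a short sketch; the sample string and pattern are assumptions chosen only for illustration.

```perl
use strict;
use warnings;

my $var = 'key=value; other';
my ($whole, $first, $second);

if ($var =~ /(\w+)=(\w+)/) {
    $whole  = substr($var, $-[0], $+[0] - $-[0]);   # the entire match
    $first  = substr($var, $-[1], $+[1] - $-[1]);   # same text as $1
    $second = substr($var, $-[2], $+[2] - $-[2]);   # same text as $2
    print "offsets agree with captures\n"
        if $first eq $1 && $second eq $2;
}
```

Computing the whole match from C<$-[0]> and C<$+[0]> avoids mentioning C<$&>, and with it the performance penalty described earlier.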
  
-=item format_name HANDLE EXPR
+=item HANDLE->format_name(EXPR)
  
  =item $FORMAT_NAME
  
@@ -505,7 +552,7 @@ The name of the current report format for the currently selected output
  channel.  Default is the name of the filehandle.  (Mnemonic: brother to
  C<$^>.)
  
-=item format_top_name HANDLE EXPR
+=item HANDLE->format_top_name(EXPR)
  
  =item $FORMAT_TOP_NAME
  
@@ -515,7 +562,7 @@ The name of the current top-of-page format for the currently selected
  output channel.  Default is the name of the filehandle with _TOP
  appended.  (Mnemonic: points to top of page.)
  
-=item format_line_break_characters HANDLE EXPR
+=item IO::Handle->format_line_break_characters EXPR
  
  =item $FORMAT_LINE_BREAK_CHARACTERS
  
@@ -526,7 +573,7 @@ fill continuation fields (starting with ^) in a format.  Default is
  S<" \n-">, to break on whitespace or hyphens.  (Mnemonic: a "colon" in
  poetry is a part of a line.)
  
-=item format_formfeed HANDLE EXPR
+=item IO::Handle->format_formfeed EXPR
  
  =item $FORMAT_FORMFEED
  
@@ -552,7 +599,7 @@ L<perlfunc/formline()>.
  The status returned by the last pipe close, backtick (C<``>) command,
  successful call to wait() or waitpid(), or from the system()
  operator.  This is just the 16-bit status word returned by the
-wait() system call (or else is made up to look like it).  Thus, the
+traditional Unix wait() system call (or else is made up to look like it).  Thus, the
  exit value of the subprocess is really (C<<< $? >> 8 >>>), and
  C<$? & 127> gives which signal, if any, the process died from, and
  C<$? & 128> reports whether there was a core dump.  (Mnemonic:
@@ -574,10 +621,29 @@ change the exit status of your program.  For example:
  
  Under VMS, the pragma C<use vmsish 'status'> makes C<$?> reflect the
  actual VMS exit status, instead of the default emulation of POSIX
-status.
+status; see L<perlvms/$?> for details.
  
  Also see L<Error Indicators>.
  
+=item ${^CHILD_ERROR_NATIVE}
+
+The native status returned by the last pipe close, backtick (C<``>)
+command, successful call to wait() or waitpid(), or from the system()
+operator.  On POSIX-like systems this value can be decoded with the
+WIFEXITED, WEXITSTATUS, WIFSIGNALED, WTERMSIG, WIFSTOPPED, WSTOPSIG
+and WIFCONTINUED functions provided by the L<POSIX> module.
+
+Under VMS this reflects the actual VMS exit status; i.e. it is the same
+as $? when the pragma C<use vmsish 'status'> is in effect.
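
A minimal sketch of decoding C<$?> as described above, assuming a Unix-like system and using C<$^X> to start a child perl that exits with an arbitrary status of 3:

```perl
use strict;
use warnings;

# Run a child perl that exits with status 3; $^X is the path of the
# running interpreter.
system($^X, '-e', 'exit 3');

my $exit_value  = $? >> 8;     # exit status of the child
my $signal      = $? & 127;    # signal that killed it, if any
my $core_dumped = $? & 128;    # non-zero if a core dump occurred

print "exit=$exit_value signal=$signal core=$core_dumped\n";
```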
+
+=item ${^ENCODING}
+
+The I<object reference> to the Encode object that is used to convert
+the source code to Unicode.  Thanks to this variable your perl script
+does not have to be written in UTF-8.  Default is I<undef>.  The direct
+manipulation of this variable is highly discouraged.  See L<encoding>
+for more details.
+
  =item $OS_ERROR
  
  =item $ERRNO
@@ -585,10 +651,26 @@ Also see L<Error Indicators>.
  =item $!
  
  If used numerically, yields the current value of the C C<errno>
-variable, with all the usual caveats.  (This means that you shouldn't
-depend on the value of C<$!> to be anything in particular unless
-you've gotten a specific error return indicating a system error.)
-If used an a string, yields the corresponding system error string.
+variable, or in other words, if a system or library call fails, it
+sets this variable.  This means that the value of C<$!> is meaningful
+only I<immediately> after a B<failure>:
+
+    if (open(FH, $filename)) {
+       # Here $! is meaningless.
+       ...
+    } else {
+       # ONLY here is $! meaningful.
+       ...
+       # Already here $! might be meaningless.
+    }
+    # Since here we might have either success or failure,
+    # here $! is meaningless.
+
+In the above I<meaningless> stands for anything: zero, non-zero,
+C<undef>.  A successful system or library call does B<not> set
+the variable to zero.
+
+If used as a string, yields the corresponding system error string.
  You can assign a number to C<$!> to set I<errno> if, for instance,
  you want C<"$!"> to return the string for error I<n>, or you want
  to set the exit value for the die() operator.  (Mnemonic: What just
@@ -596,6 +678,18 @@ went bang?)
  
  Also see L<Error Indicators>.
  
+=item %!
+
+Each element of C<%!> has a true value only if C<$!> is set to that
+value.  For example, C<$!{ENOENT}> is true if and only if the current
+value of C<$!> is C<ENOENT>; that is, if the most recent error was
+"No such file or directory" (or its moral equivalent: not all operating
+systems give that exact error, and certainly not all languages).
+To check if a particular key is meaningful on your system, use
+C<exists $!{the_key}>; for a list of legal keys, use C<keys %!>.
+See L<Errno> for more information, and also see above for the
+validity of C<$!>.
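
The following sketch shows C<$!> and C<%!> being read immediately after a failure. It assumes a path inside a fresh temporary directory that is never created, so the open is expected to fail; the file name is hypothetical.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir     = tempdir(CLEANUP => 1);
my $missing = "$dir/no-such-file";   # hypothetical name, never created

my ($errno, $message, $is_enoent);
if (open my $fh, '<', $missing) {
    close $fh;                       # not expected; $! is meaningless here
} else {
    # Capture the values at once; later calls may overwrite $!.
    ($errno, $message) = ($! + 0, "$!");
    $is_enoent = $!{ENOENT} ? 1 : 0;
}

print "errno=$errno message=$message\n" if defined $errno;
```

Saving the numeric and string forms right away keeps them usable even if later code makes further system calls.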
+
  =item $EXTENDED_OS_ERROR
  
  =item $^E
@@ -650,6 +744,12 @@ The process number of the Perl running this script.  You should
  consider this variable read-only, although it will be altered
  across fork() calls.  (Mnemonic: same as shells.)
  
+Note for Linux users: on Linux, the C functions C<getpid()> and
+C<getppid()> return different values from different threads. In order to
+be portable, this behavior is not reflected by C<$$>, whose value remains
+consistent across threads. If you want to call the underlying C<getpid()>,
+you may use the CPAN module C<Linux::Pid>.
+
  =item $REAL_USER_ID
  
  =item $UID
@@ -658,7 +758,9 @@ across fork() calls.  (Mnemonic: same as shells.)
  
  The real uid of this process.  (Mnemonic: it's the uid you came I<from>,
  if you're running setuid.)  You can change both the real uid and
-the effective uid at the same time by using POSIX::setuid().
+the effective uid at the same time by using POSIX::setuid().  Since
+changes to $< require a system call, check $! after a change attempt to 
+detect any possible errors.
  
  =item $EFFECTIVE_USER_ID
  
@@ -672,7 +774,8 @@ The effective uid of this process.  Example:
      ($<,$>) = ($>,$<); # swap real and effective uid
  
  You can change both the effective uid and the real uid at the same
-time by using POSIX::setuid().
+time by using POSIX::setuid().  Changes to $> require a check of $!
+to detect any possible errors after an attempted change.
  
  (Mnemonic: it's the uid you went I<to>, if you're running setuid.)
  C<< $< >> and C<< $> >> can be swapped only on machines
@@ -695,7 +798,8 @@ set the real gid.  So the value given by C<$(> should I<not> be assigned
  back to C<$(> without being forced numeric, such as by adding zero.
  
  You can change both the real gid and the effective gid at the same
-time by using POSIX::setgid().
+time by using POSIX::setgid().  Changes to $( require a check of $!
+to detect any possible errors after an attempted change.
  
  (Mnemonic: parentheses are used to I<group> things.  The real gid is the
  group you I<left>, if you're running setgid.)
@@ -721,6 +825,8 @@ list, say C< $) = "5 5" >.
  
  You can change both the effective gid and the real gid at the same
  time by using POSIX::setgid() (use only a single numeric argument).
+Changes to $) require a check of $! to detect any possible errors
+after an attempted change.
  
  (Mnemonic: parentheses are used to I<group> things.  The effective gid
  is the group that's I<right> for you, if you're running setgid.)
@@ -733,16 +839,36 @@ and C<$)> can be swapped only on machines supporting setregid().
  
  =item $0
  
-Contains the name of the program being executed.  On some operating
-systems assigning to C<$0> modifies the argument area that the B<ps>
-program sees.  This is more useful as a way of indicating the current
-program state than it is for hiding the program you're running.
-(Mnemonic: same as B<sh> and B<ksh>.)
+Contains the name of the program being executed.
+
+On some (read: not all) operating systems assigning to C<$0> modifies
+the argument area that the C<ps> program sees.  On some platforms you
+may have to use special C<ps> options or a different C<ps> to see the
+changes.  Modifying C<$0> is more useful as a way of indicating the
+current program state than it is for hiding the program you're
+running.  (Mnemonic: same as B<sh> and B<ksh>.)
+
+Note that there are platform specific limitations on the maximum
+length of C<$0>.  In the most extreme case it may be limited to the
+space occupied by the original C<$0>.
+
+On some platforms there may be an arbitrary amount of padding, for
+example space characters, after the modified name as shown by C<ps>.
+On some platforms this padding may extend all the way to the original
+length of the argument area, no matter what you do (this is the case
+for example with Linux 2.2).
  
  Note for BSD users: setting C<$0> does not completely remove "perl"
-from the ps(1) output.  For example, setting C<$0> to C<"foobar"> will
-result in C<"perl: foobar (perl)">.  This is an operating system
-feature.
+from the ps(1) output.  For example, setting C<$0> to C<"foobar"> may
+result in C<"perl: foobar (perl)"> (whether both the C<"perl: "> prefix
+and the " (perl)" suffix are shown depends on your exact BSD variant
+and version).  This is an operating system feature; Perl cannot help it.
+
+In multithreaded scripts Perl coordinates the threads so that any
+thread may modify its copy of C<$0> and the change becomes visible
+to ps(1) (assuming the operating system plays along).  Note that
+the view of C<$0> the other threads have will not change since they
+have their own copies of it.
  
  =item $[
  
@@ -754,8 +880,14 @@ subscripting and when evaluating the index() and substr() functions.
  
  As of release 5 of Perl, assignment to C<$[> is treated as a compiler
  directive, and cannot influence the behavior of any other file.
+(That's why you can only assign compile-time constants to it.)
  Its use is highly discouraged.
  
+Note that, unlike other compile-time directives (such as L<strict>),
+assignment to C<$[> can be seen from outer lexical scopes in the same file.
+However, you can use local() on it to strictly bind its value to a
+lexical block.
+
  =item $]
  
  The version + patchlevel / 1000 of the Perl interpreter.  This variable
@@ -768,10 +900,9 @@ of perl in the right bracket?)  Example:
  See also the documentation of C<use VERSION> and C<require VERSION>
  for a convenient way to fail if the running Perl interpreter is too old.
  
-The use of this variable is deprecated.  The floating point representation
-can sometimes lead to inaccurate numeric comparisons.  See C<$^V> for a
-more modern representation of the Perl version that allows accurate string
-comparisons.
+The floating point representation can sometimes lead to inaccurate
+numeric comparisons.  See C<$^V> for a more modern representation of
+the Perl version that allows accurate string comparisons.
  
  =item $COMPILING
  
@@ -788,7 +919,23 @@ C<$^C = 1> is similar to calling C<B::minus_c>.
  =item $^D
  
  The current value of the debugging flags.  (Mnemonic: value of B<-D>
-switch.)
+switch.) May be read or set. Like its command-line equivalent, you can use
+numeric or symbolic values, e.g. C<$^D = 10> or C<$^D = "st">.
+
+=item ${^RE_DEBUG_FLAGS}
+
+The current value of the regex debugging flags. Set to 0 for no debug output
+even when the re 'debug' module is loaded. See L<re> for details.
+
+=item ${^RE_TRIE_MAXBUF}
+
+Controls how certain regex optimisations are applied and how much memory they
+utilize. This value by default is 65536 which corresponds to a 512kB temporary
+cache. Set this to a higher value to trade memory for speed when matching
+large alternations. Set it to a lower value if you want the optimisations to
+be as conservative of memory as possible but still occur, and set it to a
+negative value to prevent the optimisation and conserve the most memory.
+Under normal situations this variable should be of no interest to you.
  
  =item $SYSTEM_FD_MAX
  
@@ -866,15 +1013,16 @@ inplace editing.  (Mnemonic: value of B<-i> switch.)
  By default, running out of memory is an untrappable, fatal error.
  However, if suitably built, Perl can use the contents of C<$^M>
  as an emergency memory pool after die()ing.  Suppose that your Perl
-were compiled with -DPERL_EMERGENCY_SBRK and used Perl's malloc.
+were compiled with C<-DPERL_EMERGENCY_SBRK> and used Perl's malloc.
  Then
  
      $^M = 'a' x (1 << 16);
  
  would allocate a 64K buffer for use in an emergency.  See the
  F<INSTALL> file in the Perl distribution for information on how to
-enable this option.  To discourage casual use of this advanced
-feature, there is no L<English|English> long name for this variable.
+add custom C compilation flags when compiling perl.  To discourage casual
+use of this advanced feature, there is no L<English|English> long name for
+this variable.
  
  =item $OSNAME
  
@@ -885,6 +1033,18 @@ built, as determined during the configuration process.  The value
  is identical to C<$Config{'osname'}>.  See also L<Config> and the 
  B<-V> command-line switch documented in L<perlrun>.
  
+On Windows platforms, $^O is not very helpful: since it is always
+C<MSWin32>, it doesn't tell the difference between
+95/98/ME/NT/2000/XP/CE/.NET.  Use Win32::GetOSName() or
+Win32::GetOSVersion() (see L<Win32> and L<perlport>) to distinguish
+between the variants.
+
+=item ${^OPEN}
+
+An internal variable used by PerlIO.  A string in two parts, separated
+by a C<\0> byte: the first part describes the input layers, the second
+part describes the output layers.
+
  =item $PERLDB
  
  =item $^P
@@ -935,6 +1095,10 @@ Provide informative "file" names for evals based on the place they were compiled
  Provide informative names to anonymous subroutines based on the place they
  were compiled.
  
+=item 0x400
+
+Debug assertion subroutines enter/exit.
+
  =back
  
  Some bits may be relevant at compile-time only, some at
@@ -951,9 +1115,15 @@ regular expression assertion (see L<perlre>).  May be written to.
  
  =item $^S
  
-Current state of the interpreter.  Undefined if parsing of the current
-module/eval is not finished (may happen in $SIG{__DIE__} and
-$SIG{__WARN__} handlers).  True if inside an eval(), otherwise false.
+Current state of the interpreter.
+
+    $^S         State
+    ---------   -------------------
+    undef       Parsing module/eval
+    true (1)    Executing an eval
+    false (0)   Otherwise
+
+The first state may happen in $SIG{__DIE__} and $SIG{__WARN__} handlers.
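
One common use of C<$^S> is inside a C<$SIG{__DIE__}> handler, to tell an exception that an eval will catch from one that is about to end the program. The sketch below assumes a handler that merely records messages; the variable names are illustrative.

```perl
use strict;
use warnings;

my $inside_eval = eval { $^S };   # true while an eval is executing

my @fatal;
local $SIG{__DIE__} = sub {
    return if $^S;                # inside an eval: let the eval handle it
    push @fatal, $_[0];           # otherwise record the fatal message
};

eval { die "caught\n" };
my $err = $@;                     # the message the eval trapped

print "inside eval: $inside_eval; recorded as fatal: ",
      scalar(@fatal), "\n";
```

The handler above treats an undefined C<$^S> (the parsing state) the same as false, which may or may not be what a real handler wants.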
  
  =item $BASETIME
  
@@ -963,6 +1133,32 @@ The time at which the program began running, in seconds since the
  epoch (beginning of 1970).  The values returned by the B<-M>, B<-A>,
  and B<-C> filetests are based on this value.
  
+=item ${^TAINT}
+
+Reflects whether taint mode is on or off.  1 for on (the program was run with
+B<-T>), 0 for off, -1 when only taint warnings are enabled (i.e. with
+B<-t> or B<-TU>).  This variable is read-only.
+
+=item ${^UNICODE}
+
+Reflects certain Unicode settings of Perl.  See L<perlrun>
+documentation for the C<-C> switch for more information about
+the possible values. This variable is set during Perl startup
+and is thereafter read-only.
+
+=item ${^UTF8CACHE}
+
+This variable controls the state of the internal UTF-8 offset caching code.
+1 for on (the default), 0 for off, -1 to debug the caching code by checking
+all its results against linear scans, and panicking on any discrepancy.
+
+=item ${^UTF8LOCALE}
+
+This variable indicates whether a UTF-8 locale was detected by perl at
+startup. This information is used by perl when it's in
+adjust-utf8ness-to-locale mode (as when run with the C<-CL> command-line
+switch); see L<perlrun> for more info on this.
+
  =item $PERL_VERSION
  
  =item $^V
@@ -979,6 +1175,11 @@ Control.)  Example:
  
      warn "No \"our\" declarations!\n" if $^V and $^V lt v5.6.0;
  
+To convert C<$^V> into its string representation use sprintf()'s
+C<"%vd"> conversion:
+
+    printf "version is v%vd\n", $^V;  # Perl's version
+
  See the documentation of C<use VERSION> and C<require VERSION>
  for a convenient way to fail if the running Perl interpreter is too old.
  
@@ -997,27 +1198,67 @@ related to the B<-w> switch.)  See also L<warnings>.
  The current set of warning checks enabled by the C<use warnings> pragma.
  See the documentation of C<warnings> for more details.
  
-=item ${^WIDE_SYSTEM_CALLS}
-
-Global flag that enables system calls made by Perl to use wide character
-APIs native to the system, if available.  This is currently only implemented
-on the Windows platform.
-
-This can also be enabled from the command line using the C<-C> switch.
-
-The initial value is typically C<0> for compatibility with Perl versions
-earlier than 5.6, but may be automatically set to C<1> by Perl if the system
-provides a user-settable default (e.g., C<$ENV{LC_CTYPE}>).
-
-The C<bytes> pragma always overrides the effect of this flag in the current
-lexical scope.  See L<bytes>.
-
  =item $EXECUTABLE_NAME
  
  =item $^X
  
-The name that the Perl binary itself was executed as, from C's C<argv[0]>.
-This may not be a full pathname, nor even necessarily in your path.
+The name used to execute the current copy of Perl, from C's
+C<argv[0]> or (where supported) F</proc/self/exe>.
+
+Depending on the host operating system, the value of $^X may be
+a relative or absolute pathname of the perl program file, or may
+be the string used to invoke perl but not the pathname of the
+perl program file.  Also, most operating systems permit invoking
+programs that are not in the PATH environment variable, so there
+is no guarantee that the value of $^X is in PATH.  For VMS, the
+value may or may not include a version number.
+
+You can usually use the value of $^X to re-invoke an independent
+copy of the same perl that is currently running, e.g.,
+
+  @first_run = `$^X -le "print int rand 100 for 1..100"`;
+
+But recall that not all operating systems support forking or
+capturing of the output of commands, so this complex statement
+may not be portable.
+
+It is not safe to use the value of $^X as a path name of a file,
+as some operating systems that have a mandatory suffix on
+executable files do not require use of the suffix when invoking
+a command.  To convert the value of $^X to a path name, use the
+following statements:
+
+  # Build up a set of file names (not command names).
+  use Config;
+  $this_perl = $^X;
+  if ($^O ne 'VMS')
+     {$this_perl .= $Config{_exe}
+          unless $this_perl =~ m/$Config{_exe}$/i;}
+
+Because many operating systems permit anyone with read access to
+the Perl program file to make a copy of it, patch the copy, and
+then execute the copy, the security-conscious Perl programmer
+should take care to invoke the installed copy of perl, not the
+copy referenced by $^X.  The following statements accomplish
+this goal, and produce a pathname that can be invoked as a
+command or referenced as a file.
+
+  use Config;
+  $secure_perl_path = $Config{perlpath};
+  if ($^O ne 'VMS')
+     {$secure_perl_path .= $Config{_exe}
+          unless $secure_perl_path =~ m/$Config{_exe}$/i;}
+
+=item ARGV
+
+The special filehandle that iterates over command-line filenames in
+C<@ARGV>. Usually written as the null filehandle in the angle operator
+C<< <> >>. Note that currently C<ARGV> only has its magical effect
+within the C<< <> >> operator; elsewhere it is just a plain filehandle
+corresponding to the last file opened by C<< <> >>. In particular,
+passing C<\*ARGV> as a parameter to a function that expects a filehandle
+may not cause your function to automatically read the contents of all the
+files in C<@ARGV>.
  
  =item $ARGV
  
@@ -1030,6 +1271,13 @@ the script.  C<$#ARGV> is generally the number of arguments minus
  one, because C<$ARGV[0]> is the first argument, I<not> the program's
  command name itself.  See C<$0> for the command name.
  
+=item ARGVOUT
+
+The special filehandle that points to the currently open output file
+when doing edit-in-place processing with B<-i>.  Useful when you have
+to do a lot of inserting and don't want to keep modifying $_.  See
+L<perlrun> for the B<-i> switch.
+
  =item @F
  
  The array @F contains the fields of each line read in when autosplit
@@ -1044,13 +1292,18 @@ C<require>, or C<use> constructs look for their library files.  It
  initially consists of the arguments to any B<-I> command-line
  switches, followed by the default Perl library, probably
  F</usr/local/lib/perl>, followed by ".", to represent the current
-directory.  If you need to modify this at runtime, you should use
+directory.  ("." will not be appended if taint checks are enabled, either by
+C<-T> or by C<-t>.)  If you need to modify this at runtime, you should use
  the C<use lib> pragma to get the machine-dependent library properly
  loaded also:
  
      use lib '/mypath/libdir/';
      use SomeMod;
  
+You can also insert hooks into the file inclusion system by putting Perl
+code directly into @INC.  Those hooks may be subroutine references, array
+references or blessed objects.  See L<perlfunc/require> for details.
+
  =item @_
  
  Within a subroutine the array @_ contains the parameters passed to that
@@ -1065,6 +1318,12 @@ value is the location of the file found.  The C<require>
  operator uses this hash to determine whether a particular file has
  already been included.
  
+If the file was loaded via a hook (e.g. a subroutine reference, see
+L<perlfunc/require> for a description of these hooks), this hook is
+by default inserted into %INC in place of a filename.  Note, however,
+that the hook may have set the %INC entry by itself to provide some more
+specific info.
+
  =item %ENV
  
  =item $ENV{expr}
@@ -1107,20 +1366,11 @@ Be sure not to use a bareword as the name of a signal handler,
  lest you inadvertently call it. 
  
  If your system has the sigaction() function then signal handlers are
-installed using it.  This means you get reliable signal handling.  If
-your system has the SA_RESTART flag it is used when signals handlers are
-installed.  This means that system calls for which restarting is supported
-continue rather than returning when a signal arrives.  If you want your
-system calls to be interrupted by signal delivery then do something like
-this:
-
-    use POSIX ':signal_h';
-
-    my $alarm = 0;
-    sigaction SIGALRM, new POSIX::SigAction sub { $alarm = 1 }
-       or die "Error setting SIGALRM handler: $!\n";
+installed using it.  This means you get reliable signal handling.
  
-See L<POSIX>.
+The default delivery policy of signals changed in Perl 5.8.0 from 
+immediate (also known as "unsafe") to deferred, also known as 
+"safe signals".  See L<perlipc> for more information.
  
  Certain internal hooks can be also set using the %SIG hash.  The
  routine indicated by C<$SIG{__WARN__}> is called when a warning message is
@@ -1184,9 +1434,9 @@ To illustrate the differences between these variables, consider the
  following Perl expression, which uses a single-quoted string:
  
      eval q{
-       open PIPE, "/cdrom/install |";
-       @res = <PIPE>;
-       close PIPE or die "bad pipe: $?, $!";
+       open my $pipe, "/cdrom/install |" or die $!;
+       my @res = <$pipe>;
+       close $pipe or die "bad pipe: $?, $!";
      };
  
  After execution of this statement all 4 variables may have been set.  
@@ -1195,7 +1445,7 @@ C<$@> is set if the string to be C<eval>-ed did not compile (this
  may happen if C<open> or C<close> were imported with bad prototypes),
  or if Perl code executed during evaluation die()d .  In these cases
  the value of $@ is the compile error, or the argument to C<die>
-(which will interpolate C<$!> and C<$?>!).  (See also L<Fatal>,
+(which will interpolate C<$!> and C<$?>).  (See also L<Fatal>,
  though.)
  
  When the eval() expression above is executed, open(), C<< <PIPE> >>,
@@ -1254,18 +1504,19 @@ used safely in programs.  C<$^_> itself, however, I<is> reserved.
  
  Perl identifiers that begin with digits, control characters, or
  punctuation characters are exempt from the effects of the C<package>
-declaration and are always forced to be in package C<main>.  A few
-other names are also exempt:
+declaration and are always forced to be in package C<main>; they are
+also exempt from C<strict 'vars'> errors.  A few other names are also
+exempt in these ways:
  
         ENV             STDIN
         INC             STDOUT
         ARGV            STDERR
-       ARGVOUT
+       ARGVOUT         _
         SIG
  
  In particular, the new special C<${^_XYZ}> variables are always taken
  to be in package C<main>, regardless of any C<package> declarations
-presently in scope.
+presently in scope.  
  
  =head1 BUGS
  
@@ -1275,7 +1526,7 @@ expression matches in a program, regardless of whether they occur
  in the scope of C<use English>.  For that reason, saying C<use
  English> in libraries is strongly discouraged.  See the
  Devel::SawAmpersand module documentation from CPAN
-(http://www.perl.com/CPAN/modules/by-module/Devel/)
+( http://www.cpan.org/modules/by-module/Devel/ )
  for more information.
  
  Having to even think about the C<$^S> variable in your exception