X-Git-Url: https://perl5.git.perl.org/perl5.git/blobdiff_plain/9bf2270250326fb85445d6849ed84a94434dd12c..5e220227379abcad75ff0534a6b23aa30c22c695:/pod/perlvar.pod
diff --git a/pod/perlvar.pod b/pod/perlvar.pod
index bcd8ecf..14324a5 100644
--- a/pod/perlvar.pod
+++ b/pod/perlvar.pod
@@ -4,120 +4,77 @@ perlvar - Perl predefined variables
=head1 DESCRIPTION
-=head2 Predefined Names
+=head2 The Syntax of Variable Names
-The following names have special meaning to Perl. Most
-punctuation names have reasonable mnemonics, or analogs in the
-shells. Nevertheless, if you wish to use long variable names,
-you need only say
-
- use English;
-
-at the top of your program. This aliases all the short names to the long
-names in the current package. Some even have medium names, generally
-borrowed from B<awk>. In general, it's best to use the
-
- use English '-no_match_vars';
-
-invocation if you don't need $PREMATCH, $MATCH, or $POSTMATCH, as it avoids
-a certain performance hit with the use of regular expressions. See
-L<English>.
-
-Variables that depend on the currently selected filehandle may be set by
-calling an appropriate object method on the IO::Handle object, although
-this is less efficient than using the regular built-in variables. (Summary
-lines below for this contain the word HANDLE.) First you must say
-
- use IO::Handle;
-
-after which you may use either
-
- method HANDLE EXPR
-
-or more safely,
-
- HANDLE->method(EXPR)
-
-Each method returns the old value of the IO::Handle attribute.
-The methods each take an optional EXPR, which, if supplied, specifies the
-new value for the IO::Handle attribute in question. If not supplied,
-most methods do nothing to the current value--except for
-autoflush(), which will assume a 1 for you, just to be different.
-
-Because loading in the IO::Handle class is an expensive operation, you should
-learn how to use the regular built-in variables.
-
-A few of these variables are considered "read-only". This means that if
-you try to assign to this variable, either directly or indirectly through
-a reference, you'll raise a run-time exception.
-
-You should be very careful when modifying the default values of most
-special variables described in this document. In most cases you want
-to localize these variables before changing them, since if you don't,
-the change may affect other modules which rely on the default values
-of the special variables that you have changed. This is one of the
-correct ways to read the whole file at once:
-
- open my $fh, "<", "foo" or die $!;
- local $/; # enable localized slurp mode
- my $content = <$fh>;
- close $fh;
-
-But the following code is quite bad:
-
- open my $fh, "<", "foo" or die $!;
- undef $/; # enable slurp mode
- my $content = <$fh>;
- close $fh;
-
-since some other module, may want to read data from some file in the
-default "line mode", so if the code we have just presented has been
-executed, the global value of C<$/> is now changed for any other code
-running inside the same Perl interpreter.
+Variable names in Perl can have several formats. Usually, they
+must begin with a letter or underscore, in which case they can be
+arbitrarily long (up to an internal limit of 251 characters) and
+may contain letters, digits, underscores, or the special sequence
+C<::> or C<'>. In this case, the part before the last C<::> or
+C<'> is taken to be a I<package qualifier>; see L<perlmod>.
-Usually when a variable is localized you want to make sure that this
-change affects the shortest scope possible. So unless you are already
-inside some short C<{}> block, you should create one yourself. For
-example:
+Perl variable names may also be a sequence of digits or a single
+punctuation or control character. These names are all reserved for
+special uses by Perl; for example, the all-digits names are used
+to hold data captured by backreferences after a regular expression
+match. Perl has a special syntax for the single-control-character
+names: It understands C<^X> (caret C<X>) to mean the control-C<X>
+character. For example, the notation C<$^W> (dollar-sign caret
+C<W>) is the scalar variable whose name is the single character
+control-C<W>. This is better than typing a literal control-C<W>
+into your program.
- my $content = '';
- open my $fh, "<", "foo" or die $!;
- {
- local $/;
- $content = <$fh>;
- }
- close $fh;
+Since Perl v5.6.0, Perl variable names may be alphanumeric
+strings that begin with control characters (or better yet, a caret).
+These variables must be written in the form C<${^Foo}>; the braces
+are not optional. C<${^Foo}> denotes the scalar variable whose
+name is a control-C<F> followed by two C<o>'s. These variables are
+reserved for future special uses by Perl, except for the ones that
+begin with C<^_> (control-underscore or caret-underscore). No
+control-character name that begins with C<^_> will acquire a special
+meaning in any future version of Perl; such names may therefore be
+used safely in programs. C<$^_> itself, however, I<is> reserved.
-Here is an example of how your own code can go broken:
+Perl identifiers that begin with digits, control characters, or
+punctuation characters are exempt from the effects of the C<package>
+declaration and are always forced to be in package C<main>; they are
+also exempt from C<strict 'vars'> errors. A few other names are also
+exempt in these ways:
- for (1..5){
- nasty_break();
- print "$_ ";
- }
- sub nasty_break {
- $_ = 5;
- # do something with $_
- }
+    ENV      STDIN
+    INC      STDOUT
+    ARGV     STDERR
+    ARGVOUT
+    SIG
-You probably expect this code to print:
+In particular, the special C<${^_XYZ}> variables are always taken
+to be in package C<main>, regardless of any C<package> declarations
+presently in scope.
- 1 2 3 4 5
+=head1 SPECIAL VARIABLES
-but instead you get:
+The following names have special meaning to Perl. Most punctuation
+names have reasonable mnemonics, or analogs in the shells.
+Nevertheless, if you wish to use long variable names, you need only say:
- 5 5 5 5 5
+ use English;
-Why? Because nasty_break() modifies C<$_> without localizing it
-first. The fix is to add local():
+at the top of your program. This aliases all the short names to the long
+names in the current package. Some even have medium names, generally
+borrowed from B<awk>. To avoid a performance hit, if you don't need the
+C<$PREMATCH>, C<$MATCH>, or C<$POSTMATCH> it's best to use the C<English>
+module without them:
- local $_ = 5;
+ use English '-no_match_vars';
-It's easy to notice the problem in such a short example, but in more
-complicated code you are looking for trouble if you don't localize
-changes to the special variables.
+Before you continue, note the sort order for variables. In general, we
+first list the variables in case-insensitive, almost-lexigraphical
+order (ignoring the C<{> or C<^> preceding words, as in C<${^UNICODE}>
+or C<$^T>), although C<$_> and C<@_> move up to the top of the pile.
+For variables with the same identifier, we list it in order of scalar,
+array, hash, and bareword.
-The following list is ordered by scalar variables first, then the
-arrays, then the hashes.
+=head2 General Variables
=over 8
@@ -129,7 +86,7 @@ X<$_> X<$ARG>
The default input and pattern-searching space. The following pairs are
equivalent:
- while (<>) {...} # equivalent only in while!
+ while (<>) {...} # equivalent only in while!
while (defined($_ = <>)) {...}
/^Subject:/
@@ -141,19 +98,20 @@ equivalent:
chomp
chomp($_)
-Here are the places where Perl will assume $_ even if you
-don't use it:
+Here are the places where Perl will assume C<$_> even if you don't use it:
=over 3
=item *
-The following functions:
+The following functions use C<$_> as a default argument:
-abs, alarm, chomp, chop, chr, chroot, cos, defined, eval, exp, glob,
-hex, int, lc, lcfirst, length, log, lstat, mkdir, oct, ord, pos, print,
+abs, alarm, chomp, chop, chr, chroot,
+cos, defined, eval, evalbytes, exp, fc, glob, hex, int, lc,
+lcfirst, length, log, lstat, mkdir, oct, ord, pos, print, printf,
quotemeta, readlink, readpipe, ref, require, reverse (in scalar context only),
-rmdir, sin, split (on its second argument), sqrt, stat, study, uc, ucfirst,
+rmdir, say, sin, split (for its second
+argument), sqrt, stat, study, uc, ucfirst,
unlink, unpack.
=item *
@@ -161,7 +119,6 @@ unlink, unpack.
All file tests (C<-f>, C<-d>) except for C<-t>, which defaults to STDIN.
See L<perlfunc/-X>
-
=item *
The pattern matching operations C<m//>, C<s///> and C<tr///>
(aka C<y///>)
@@ -174,434 +131,897 @@ variable is supplied.
=item *
-The implicit iterator variable in the grep() and map() functions.
+The implicit iterator variable in the C<grep()> and C<map()> functions.
modifier.
+LinuxThreads is now obsolete on Linux, and caching C<getpid()>
+like this made embedding perl unnecessarily complex (since you'd have
+to manually update the value of $$), so now C<$$> and C<getppid()>
+will always return the same values as the underlying C library.
-=item $PREMATCH
+Debian GNU/kFreeBSD systems also used LinuxThreads up until and
+including the 6.0 release, but after that moved to FreeBSD thread
+semantics, which are POSIX-like.
-=item $`
-X<$`> X<$PREMATCH>
+To see if your system is affected by this discrepancy check if
+C<getconf GNU_LIBPTHREAD_VERSION | grep -q NPTL> returns a false
+value. NTPL threads preserve the POSIX semantics.
-The string preceding whatever was matched by the last successful
-pattern match (not counting any matches hidden within a BLOCK or eval
-enclosed by the current BLOCK). (Mnemonic: C<`> often precedes a quoted
-string.) This variable is read-only.
+Mnemonic: same as shells.
-The use of this variable anywhere in a program imposes a considerable
-performance penalty on all regular expression matches. See L</BUGS>.
+=item $PROGRAM_NAME
-See L</@-> for a replacement.
+=item $0
+X<$0> X<$PROGRAM_NAME>
-=item ${^PREMATCH}
-X<${^PREMATCH}>
+Contains the name of the program being executed.
-This is similar to C<$`> ($PREMATCH) except that it does not incur the
-performance penalty associated with that variable, and is only guaranteed
-to return a defined value when the pattern was compiled or executed with
-the C</p> modifier.
+On some (but not all) operating systems assigning to C<$0> modifies
+the argument area that the C<ps> program sees. On some platforms you
+may have to use special C<ps> options or a different C<ps> to see the
+changes. Modifying the C<$0> is more useful as a way of indicating the
+current program state than it is for hiding the program you're
+running.
-=item $POSTMATCH
+Note that there are platform-specific limitations on the maximum
+length of C<$0>. In the most extreme case it may be limited to the
+space occupied by the original C<$0>.
-=item $'
-X<$'> X<$POSTMATCH>
+In some platforms there may be arbitrary amount of padding, for
+example space characters, after the modified name as shown by C<ps>.
+In some platforms this padding may extend all the way to the original
+length of the argument area, no matter what you do (this is the case
+for example with Linux 2.2).
-The string following whatever was matched by the last successful
-pattern match (not counting any matches hidden within a BLOCK or eval()
-enclosed by the current BLOCK). (Mnemonic: C<'> often follows a quoted
-string.) Example:
+Note for BSD users: setting C<$0> does not completely remove "perl"
+from the ps(1) output. For example, setting C<$0> to C<"foobar"> may
+result in C<"perl: foobar (perl)"> (whether both the C<"perl: "> prefix
+and the " (perl)" suffix are shown depends on your exact BSD variant
+and version). This is an operating system feature, Perl cannot help it.
- local $_ = 'abcdefghi';
- /def/;
- print "$`:$&:$'\n"; # prints abc:def:ghi
+In multithreaded scripts Perl coordinates the threads so that any
+thread may modify its copy of the C<$0> and the change becomes visible
+to ps(1) (assuming the operating system plays along). Note that
+the view of C<$0> the other threads have will not change since they
+have their own copies of it.
-This variable is read-only and dynamically scoped to the current BLOCK.
+If the program has been given to perl via the switches C<-e> or C<-E>,
+C<$0> will contain the string C<"-e">.
-The use of this variable anywhere in a program imposes a considerable
-performance penalty on all regular expression matches. See L</BUGS>.
+On Linux as of perl v5.14.0 the legacy process name will be set with
+C<prctl(2)>, in addition to altering the POSIX name via C<argv[0]> as
+perl has done since version 4.000. Now system utilities that read the
+legacy process name such as ps, top and killall will recognize the
+name you set when assigning to C<$0>. The string you supply will be
+cut off at 16 bytes, this is a limitation imposed by Linux.
-See L</@-> for a replacement.
+Mnemonic: same as B<sh> and B<ksh>.
-=item ${^POSTMATCH}
-X<${^POSTMATCH}>
+=item $REAL_GROUP_ID
-This is similar to C<$'> (C<$POSTMATCH>) except that it does not incur the
-performance penalty associated with that variable, and is only guaranteed
-to return a defined value when the pattern was compiled or executed with
-the C</p> modifier.
+=item $GID
-=item $LAST_PAREN_MATCH
+=item $(
+X<$(> X<$GID> X<$REAL_GROUP_ID>
-=item $+
-X<$+> X<$LAST_PAREN_MATCH>
+The real gid of this process. If you are on a machine that supports
+membership in multiple groups simultaneously, gives a space separated
+list of groups you are in. The first number is the one returned by
+C<getgid()>, and the subsequent ones by C<getgroups()>, one of which may be
+the same as the first number.
-The text matched by the last bracket of the last successful search pattern.
-This is useful if you don't know which one of a set of alternative patterns
-matched. For example:
+However, a value assigned to C<$(> must be a single number used to
+set the real gid. So the value given by C<$(> should I<not> be assigned
+back to C<$(> without being forced numeric, such as by adding zero. Note
+that this is different to the effective gid (C<$)>) which does take a
+list.
- /Version: (.*)|Revision: (.*)/ && ($rev = $+);
+You can change both the real gid and the effective gid at the same
+time by using C<POSIX::setgid()>. Changes
+to C<$(> require a check to C<$!>
+to detect any possible errors after an attempted change.
-(Mnemonic: be positive and forward looking.)
-This variable is read-only and dynamically scoped to the current BLOCK.
+Mnemonic: parentheses are used to I<group> things. The real gid is the
+group you I<left>, if you're running setgid.
-=item $LAST_SUBMATCH_RESULT
+=item $EFFECTIVE_GROUP_ID
-=item $^N
-X<$^N>
+=item $EGID
-The text matched by the used group most-recently closed (i.e. the group
-with the rightmost closing parenthesis) of the last successful search
-pattern. (Mnemonic: the (possibly) Nested parenthesis that most
-recently closed.)
+=item $)
+X<$)> X<$EGID> X<$EFFECTIVE_GROUP_ID>
-This is primarily used inside C<(?{...})> blocks for examining text
-recently matched. For example, to effectively capture text to a variable
-(in addition to C<$1>, C<$2>, etc.), replace C<(...)> with
+The effective gid of this process. If you are on a machine that
+supports membership in multiple groups simultaneously, gives a space
+separated list of groups you are in. The first number is the one
+returned by C<getegid()>, and the subsequent ones by C<getgroups()>,
+one of which may be the same as the first number.
- (?:(...)(?{ $var = $^N }))
+Similarly, a value assigned to C<$)> must also be a space-separated
+list of numbers. The first number sets the effective gid, and
+the rest (if any) are passed to C<setgroups()>. To get the effect of an
+empty list for C<setgroups()>, just repeat the new effective gid; that is,
+to force an effective gid of 5 and an effectively empty C<setgroups()>
+list, say C< $) = "5 5" >.
-By setting and then using C<$var> in this way relieves you from having to
-worry about exactly which numbered set of parentheses they are.
+You can change both the effective gid and the real gid at the same
+time by using C<POSIX::setgid()> (use only a single numeric argument).
+Changes to C<$)> require a check to C<$!> to detect any possible errors
+after an attempted change.
-This variable is dynamically scoped to the current BLOCK.
+C<< $< >>, C<< $> >>, C<$(> and C<$)> can be set only on
+machines that support the corresponding I<set[re][ug]id()> routine. C<$(>
+and C<$)> can be swapped only on machines supporting C<setregid()>.
-=item @LAST_MATCH_END
+Mnemonic: parentheses are used to I<group> things. The effective gid
+is the group that's I<right> for you, if you're running setgid.
-=item @+
-X<@+> X<@LAST_MATCH_END>
+=item $REAL_USER_ID
-This array holds the offsets of the ends of the last successful
-submatches in the currently active dynamic scope. C<$+[0]> is
-the offset into the string of the end of the entire match. This
-is the same value as what the C<pos> function returns when called
-on the variable that was matched against. The I<n>th element
-of this array holds the offset of the I<n>th submatch, so
-C<$+[1]> is the offset past where $1 ends, C<$+[2]> the offset
-past where $2 ends, and so on. You can use C<$#+> to determine
-how many subgroups were in the last successful match. See the
-examples given for the C<@-> variable.
+=item $UID
-=item %LAST_PAREN_MATCH
+=item $<
+X<< $< >> X<$UID> X<$REAL_USER_ID>
-=item %+
-X<%+>
+The real uid of this process. You can change both the real uid and the
+effective uid at the same time by using C<POSIX::setuid()>. Since
+changes to C<< $< >> require a system call, check C<$!> after a change
+attempt to detect any possible errors.
-Similar to C<@+>, the C<%+> hash allows access to the named capture
-buffers, should they exist, in the last successful match in the
-currently active dynamic scope.
+Mnemonic: it's the uid you came I<from>, if you're running setuid.
-For example, C<$+{foo}> is equivalent to C<$1> after the following match:
+=item $EFFECTIVE_USER_ID
- 'foo' =~ /(?<foo>foo)/;
+=item $EUID
-The keys of the C<%+> hash list only the names of buffers that have
-captured (and that are thus associated to defined values).
+=item $>
+X<< $> >> X<$EUID> X<$EFFECTIVE_USER_ID>
-The underlying behaviour of C<%+> is provided by the
-L<Tie::Hash::NamedCapture> module.
+The effective uid of this process. For example:
-B<Note:> C<%-> and C<%+> are tied views into a common internal hash
-associated with the last successful regular expression. Therefore mixing
-iterative access to them via C<each> may have unpredictable results.
-Likewise, if the last successful match changes, then the results may be
-surprising.
+ $< = $>; # set real to effective uid
+ ($<,$>) = ($>,$<); # swap real and effective uids
-=item HANDLE->input_line_number(EXPR)
+You can change both the effective uid and the real uid at the same
+time by using C<POSIX::setuid()>. Changes to C<< $> >> require a check
+to C<$!> to detect any possible errors after an attempted change.
-=item $INPUT_LINE_NUMBER
+C<< $< >> and C<< $> >> can be swapped only on machines
+supporting C<setreuid()>.
-=item $NR
+Mnemonic: it's the uid you went I<to>, if you're running setuid.
-=item $.
-X<$.> X<$NR> X<$INPUT_LINE_NUMBER> X<line number>
+=item $SUBSCRIPT_SEPARATOR
-Current line number for the last filehandle accessed.
+=item $SUBSEP
-Each filehandle in Perl counts the number of lines that have been read
-from it. (Depending on the value of C<$/>, Perl's idea of what
-constitutes a line may not match yours.) When a line is read from a
-filehandle (via readline() or C<< <> >>), or when tell() or seek() is
-called on it, C<$.> becomes an alias to the line counter for that
-filehandle.
+=item $;
+X<$;> X<$SUBSEP> X<SUBSCRIPT_SEPARATOR>
-You can adjust the counter by assigning to C<$.>, but this will not
-actually move the seek pointer. I<Localizing C<$.> will not localize
-the filehandle's line count>. Instead, it will localize perl's notion
-of which filehandle C<$.> is currently aliased to.
+The subscript separator for multidimensional array emulation. If you
+refer to a hash element as
+
+ $foo{$a,$b,$c}
+
+it really means
+
+ $foo{join($;, $a, $b, $c)}
+
+But don't put
+
+ @foo{$a,$b,$c} # a slice--note the @
+
+which means
+
+ ($foo{$a},$foo{$b},$foo{$c})
+
+Default is "\034", the same as SUBSEP in B<awk>. If your keys contain
+binary data there might not be any safe value for C<$;>.
+
+Consider using "real" multidimensional arrays as described
+in L<perllol>.
+
+Mnemonic: comma (the syntactic subscript separator) is a semi-semicolon.
+
+=item $a
+
+=item $b
+X<$a> X<$b>
+
+Special package variables when using C<sort()>, see L<perlfunc/sort>.
+Because of this specialness C<$a> and C<$b> don't need to be declared
+(using C