Windows issues with select() are already documented in perlport.

diff --git a/pod/perlfunc.pod b/pod/perlfunc.pod
index a7bbacc..b2c6776 100644
--- a/pod/perlfunc.pod
+++ b/pod/perlfunc.pod
@@ -366,6 +366,12 @@ Example:
      print "Text\n" if -T _;
      print "Binary\n" if -B _;
  
+As of Perl 5.9.1, as a form of purely syntactic sugar, you can stack file
+test operators, so that C<-f -w -x $file> is equivalent to
+C<-x $file && -w _ && -f _>.  (This is only fancy syntax: if you use
+the return value of C<-f $file> as an argument to another filetest
+operator, no special magic will happen.)
+
  =item abs VALUE
  
  =item abs
@@ -2145,22 +2151,13 @@ In scalar context, C<gmtime()> returns the ctime(3) value:
  
      $now_string = gmtime;  # e.g., "Thu Oct 13 04:54:34 1994"
  
-Also see the C<timegm> function provided by the C<Time::Local> module,
-and the strftime(3) function available via the POSIX module.
-
-This scalar value is B<not> locale dependent (see L<perllocale>), but
-is instead a Perl builtin.  Also see the C<Time::Local> module, and the
-strftime(3) and mktime(3) functions available via the POSIX module.  To
-get somewhat similar but locale dependent date strings, set up your
-locale environment variables appropriately (please see L<perllocale>)
-and try for example:
-
-    use POSIX qw(strftime);
-    $now_string = strftime "%a %b %e %H:%M:%S %Y", gmtime;
+If you need local time instead of GMT, use the L</localtime> builtin.
+See also the C<timegm> function provided by the C<Time::Local> module,
+and the strftime(3) and mktime(3) functions available via the L<POSIX> module.
  
-Note that the C<%a> and C<%b> escapes, which represent the short forms
-of the day of the week and the month of the year, may not necessarily
-be three characters wide in all locales.
+This scalar value is B<not> locale dependent (see L<perllocale>), but is
+instead a Perl builtin.  To get somewhat similar but locale dependent date
+strings, see the example in L</localtime>.
  
  =item goto LABEL
  
@@ -2230,6 +2227,11 @@ element of a list returned by grep (for example, in a C<foreach>, C<map>
  or another C<grep>) actually modifies the element in the original list.
  This is usually something to be avoided when writing clear code.
  
+If C<$_> is lexical in the scope where the C<grep> appears (because it has
+been declared with C<my $_>) then, in addition to being locally aliased to
+the list elements, C<$_> remains lexical inside the block; i.e. it
+can't be seen from the outside, avoiding any potential side-effects.
+
  See also L</map> for a list composed of the results of the BLOCK or EXPR.
  
  =item hex EXPR
@@ -2527,17 +2529,20 @@ In scalar context, C<localtime()> returns the ctime(3) value:
  
      $now_string = localtime;  # e.g., "Thu Oct 13 04:54:34 1994"
  
-This scalar value is B<not> locale dependent, see L<perllocale>, but
-instead a Perl builtin.  Also see the C<Time::Local> module
-(to convert the second, minutes, hours, ... back to seconds since the
-stroke of midnight the 1st of January 1970, the value returned by
-time()), and the strftime(3) and mktime(3) functions available via the
-POSIX module.  To get somewhat similar but locale dependent date
-strings, set up your locale environment variables appropriately
-(please see L<perllocale>) and try for example:
+This scalar value is B<not> locale dependent but is a Perl builtin. For GMT
+instead of local time use the L</gmtime> builtin. See also the
+C<Time::Local> module (to convert the seconds, minutes, hours, ... back to
+the integer value returned by time()), and the L<POSIX> module's strftime(3)
+and mktime(3) functions.
+
+To get somewhat similar but locale dependent date strings, set up your
+locale environment variables appropriately (please see L<perllocale>) and
+try for example:
  
      use POSIX qw(strftime);
      $now_string = strftime "%a %b %e %H:%M:%S %Y", localtime;
+    # or for GMT formatted appropriately for your locale:
+    $now_string = strftime "%a %b %e %H:%M:%S %Y", gmtime;
  
  Note that the C<%a> and C<%b>, the short forms of the day of the week
  and the month of the year, may not necessarily be three characters wide.
@@ -2615,6 +2620,11 @@ Using a regular C<foreach> loop for this purpose would be clearer in
  most cases.  See also L</grep> for an array composed of those items of
  the original list for which the BLOCK or EXPR evaluates to true.
  
+If C<$_> is lexical in the scope where the C<map> appears (because it has
+been declared with C<my $_>) then, in addition to being locally aliased to
+the list elements, C<$_> remains lexical inside the block; i.e. it
+can't be seen from the outside, avoiding any potential side-effects.
+
  C<{> starts both hash references and blocks, so C<map { ...> could be either
  the start of map BLOCK LIST or map EXPR, LIST. Because perl doesn't look
  ahead for the closing C<}> it has to take a guess at which its dealing with
@@ -3262,34 +3272,14 @@ of values, as follows:
      h  A hex string (low nybble first).
      H  A hex string (high nybble first).
  
-    c  A signed char value.
+    c  A signed char (8-bit) value.
      C  An unsigned char value.  Only does bytes.  See U for Unicode.
  
-    s  A signed short value.
+    s  A signed short (16-bit) value.
      S  An unsigned short value.
-         (This 'short' is _exactly_ 16 bits, which may differ from
-          what a local C compiler calls 'short'.  If you want
-          native-length shorts, use the '!' suffix.)
  
-    i  A signed integer value.
-    I  An unsigned integer value.
-         (This 'integer' is _at_least_ 32 bits wide.  Its exact
-           size depends on what a local C compiler calls 'int',
-           and may even be larger than the 'long' described in
-           the next item.)
-
-    l  A signed long value.
+    l  A signed long (32-bit) value.
      L  An unsigned long value.
-         (This 'long' is _exactly_ 32 bits, which may differ from
-          what a local C compiler calls 'long'.  If you want
-          native-length longs, use the '!' suffix.)
-
-    n  An unsigned short in "network" (big-endian) order.
-    N  An unsigned long in "network" (big-endian) order.
-    v  An unsigned short in "VAX" (little-endian) order.
-    V  An unsigned long in "VAX" (little-endian) order.
-         (These 'shorts' and 'longs' are _exactly_ 16 bits and
-          _exactly_ 32 bits, respectively.)
  
      q  A signed quad (64-bit) value.
      Q  An unsigned quad value.
@@ -3297,14 +3287,23 @@ of values, as follows:
            integer values _and_ if Perl has been compiled to support those.
             Causes a fatal error otherwise.)
  
-    j   A signed integer value (a Perl internal integer, IV).
-    J   An unsigned integer value (a Perl internal unsigned integer, UV).
+    i  A signed integer value.
+    I  An unsigned integer value.
+         (This 'integer' is _at_least_ 32 bits wide.  Its exact
+           size depends on what a local C compiler calls 'int'.)
+ 
+    n  An unsigned short (16-bit) in "network" (big-endian) order.
+    N  An unsigned long (32-bit) in "network" (big-endian) order.
+    v  An unsigned short (16-bit) in "VAX" (little-endian) order.
+    V  An unsigned long (32-bit) in "VAX" (little-endian) order.
+
+    j   A Perl internal signed integer value (IV).
+    J   A Perl internal unsigned integer value (UV).
  
      f  A single-precision float in the native format.
      d  A double-precision float in the native format.
  
-    F  A floating point value in the native native format
-           (a Perl internal floating point value, NV).
+    F  A Perl internal floating point value (NV) in the native format
      D  A long double-precision float in the native format.
           (Long doubles are available only if your system supports long
            double values _and_ if Perl has been compiled to support those.
@@ -3328,6 +3327,27 @@ of values, as follows:
          the innermost ()-group.
      (  Start of a ()-group.
  
+Some letters in the TEMPLATE may optionally be followed by one or
+more of these modifiers (the second column lists the letters for
+which the modifier is valid):
+
+    !   sSlLiI     Forces native (short, long, int) sizes instead
+                   of fixed (16-/32-bit) sizes.
+
+        xX         Make x and X act as alignment commands.
+
+        nNvV       Treat integers as signed instead of unsigned.
+
+    >   sSiIlLqQ   Force big-endian byte-order on the type.
+        jJfFdDpP   (The "big end" touches the construct.)
+
+    <   sSiIlLqQ   Force little-endian byte-order on the type.
+        jJfFdDpP   (The "little end" touches the construct.)
+
+The C<E<gt>> and C<E<lt>> modifiers can also be used on C<()>-groups,
+in which case they force a certain byte-order on all components of
+that group, including subgroups.
+
  The following rules apply:
  
  =over 8
@@ -3432,6 +3452,11 @@ The C<P> type packs a pointer to a structure of the size indicated by the
  length.  A NULL pointer is created if the corresponding value for C<p> or
  C<P> is C<undef>, similarly for unpack().
  
+If your system has a strange pointer size (i.e. a pointer is neither as
+big as an int nor as big as a long), it may not be possible to pack or
+unpack pointers in big- or little-endian byte order.  Attempting to do
+so will result in a fatal error.
+
  =item *
  
  The C</> template character allows packing and unpacking of strings where
@@ -3463,7 +3488,7 @@ which Perl does not regard as legal in numeric strings.
  =item *
  
  The integer types C<s>, C<S>, C<l>, and C<L> may be
-immediately followed by a C<!> suffix to signify native shorts or
+followed by a C<!> modifier to signify native shorts or
  longs--as you can see from above for example a bare C<l> does mean
  exactly 32 bits, the native C<long> (as seen by the local C compiler)
  may be larger.  This is an issue mainly in 64-bit platforms.  You can
@@ -3529,12 +3554,45 @@ via L<Config>:
  Byteorders C<'1234'> and C<'12345678'> are little-endian, C<'4321'>
  and C<'87654321'> are big-endian.
  
-If you want portable packed integers use the formats C<n>, C<N>,
-C<v>, and C<V>, their byte endianness and size are known.
+If you want portable packed integers you can either use the formats
+C<n>, C<N>, C<v>, and C<V>, or you can use the C<E<gt>> and C<E<lt>>
+modifiers.  These modifiers are only available as of perl 5.8.5.
  See also L<perlport>.
  
  =item *
  
+All integer and floating point formats as well as C<p> and C<P> and
+C<()>-groups may be followed by the C<E<gt>> or C<E<lt>> modifiers
+to force big- or little-endian byte-order, respectively.
+This is especially useful, since C<n>, C<N>, C<v> and C<V> don't cover
+signed integers, 64-bit integers and floating point values.  However,
+there are some things to keep in mind.
+
+Exchanging signed integers between different platforms only works
+if all platforms store them in the same format.  Most platforms store
+signed integers in two's complement, so usually this is not an issue.
+
+The C<E<gt>> or C<E<lt>> modifiers can only be used on floating point
+formats on big- or little-endian machines.  Otherwise, attempting to
+do so will result in a fatal error.
+
+Forcing big- or little-endian byte-order on floating point values for
+data exchange can only work if all platforms are using the same
+binary representation (e.g. IEEE floating point format).  Even if all
+platforms are using IEEE, there may be subtle differences.  Being able
+to use C<E<gt>> or C<E<lt>> on floating point values can be very useful,
+but also very dangerous if you don't know exactly what you're doing.
+It is definitely not a general way to portably store floating point
+values.
+
+When using C<E<gt>> or C<E<lt>> on a C<()>-group, this will affect
+all types inside the group that accept the byte-order modifiers,
+including all subgroups.  It will silently be ignored for all other
+types.  You are not allowed to override the byte-order within a group
+that already has a byte-order modifier suffix.
+
+=item *
+
  Real numbers (floats and doubles) are in the native machine format only;
  due to the multiplicity of floating formats around, and the lack of a
  standard "network" representation, no facility for interchange has been
@@ -3543,10 +3601,13 @@ may not be readable on another - even if both use IEEE floating point
  arithmetic (as the endian-ness of the memory representation is not part
  of the IEEE spec).  See also L<perlport>.
  
-Note that Perl uses doubles internally for all numeric calculation, and
-converting from double into float and thence back to double again will
-lose precision (i.e., C<unpack("f", pack("f", $foo)>) will not in general
-equal $foo).
+If you know exactly what you're doing, you can use the C<E<gt>> or C<E<lt>>
+modifiers to force big- or little-endian byte-order on floating point values.
+
+Note that Perl uses doubles (or long doubles, if configured) internally for
+all numeric calculation, and converting from double into float and thence back
+to double again will lose precision (i.e., C<unpack("f", pack("f", $foo))>
+will not in general equal $foo).
  
  =item *
  
@@ -3592,9 +3653,17 @@ both result in no-ops.
  
  =item *
  
+C<n>, C<N>, C<v> and C<V> accept the C<!> modifier. In this case they
+will represent signed 16-/32-bit integers in big-/little-endian order.
+This is only portable if all platforms sharing the packed data use the
+same binary representation for signed integers (e.g. all platforms are
+using two's complement representation).
+
+=item *
+
  A comment in a TEMPLATE starts with C<#> and goes to the end of line.
  White space may be used to separate pack codes from each other, but
-a C<!> modifier and a repeat count must follow immediately.
+modifiers and a repeat count must follow immediately.
  
  =item *
  
@@ -3654,6 +3723,15 @@ Examples:
      # short 12, zero fill to position 4, long 34
      # $foo eq $bar
  
+    $foo = pack('nN', 42, 4711);
+    # pack big-endian 16- and 32-bit unsigned integers
+    $foo = pack('S>L>', 42, 4711);
+    # exactly the same
+    $foo = pack('s<l<', -42, 4711);
+    # pack little-endian 16- and 32-bit signed integers
+    $foo = pack('(sl)<', -42, 4711);
+    # exactly the same
+
  The same template may generally also be used in unpack().
  
  =item package NAMESPACE
@@ -4441,7 +4519,8 @@ You can effect a sleep of 250 milliseconds this way:
      select(undef, undef, undef, 0.25);
  
  Note that whether C<select> gets restarted after signals (say, SIGALRM)
-is implementation-dependent.
+is implementation-dependent.  See also L<perlport> for notes on the
+portability of C<select>.
  
  B<WARNING>: One should not attempt to mix buffered I/O (like C<read>
  or <FH>) with C<select>, except as permitted by POSIX, and even
@@ -4874,8 +4953,9 @@ Example, assuming array lengths are passed before arrays:
  
  =item split
  
-Splits a string into a list of strings and returns that list.  By default,
-empty leading fields are preserved, and empty trailing ones are deleted.
+Splits the string EXPR into a list of strings and returns that list.  By
+default, empty leading fields are preserved, and empty trailing ones are
+deleted.
  
  In scalar context, returns the number of fields found and splits into
  the C<@_> array.  Use of split in scalar context is deprecated, however,
@@ -5335,7 +5415,7 @@ as follows:
             = stat($filename);
  
  Not all fields are supported on all filesystem types.  Here are the
-meaning of the fields:
+meanings of the fields:
  
    0 dev      device number of filesystem
    1 ino      inode number
@@ -5353,13 +5433,13 @@ meaning of the fields:
  
  (The epoch was at 00:00 January 1, 1970 GMT.)
  
-(*) The ctime field is non-portable, in particular you cannot expect
+(*) The ctime field is non-portable.  In particular, you cannot expect
  it to be a "creation time", see L<perlport/"Files and Filesystems">
  for details.
  
-If stat is passed the special filehandle consisting of an underline, no
+If C<stat> is passed the special filehandle consisting of an underline, no
  stat is done, but the current contents of the stat structure from the
-last stat or filetest are returned.  Example:
+last C<stat>, C<lstat>, or filetest are returned.  Example:
  
      if (-x $file && (($d) = stat(_)) && $d < 0) {
         print "$file is executable NFS file\n";
@@ -5404,7 +5484,7 @@ You can import symbolic mode constants (C<S_IF*>) and functions
      $is_setgid     =  S_ISDIR($mode);
  
  You could write the last two using the C<-u> and C<-d> operators.
-The commonly available S_IF* constants are
+The commonly available C<S_IF*> constants are
  
      # Permissions: read, write, execute, for user, group, others.
  
@@ -5425,7 +5505,7 @@ The commonly available S_IF* constants are
  
      S_IREAD S_IWRITE S_IEXEC
  
-and the S_IF* functions are
+and the C<S_IF*> functions are
  
      S_IMODE($mode)     the part of $mode containing the permission bits
                         and the setuid/setgid/sticky bits
@@ -5434,7 +5514,7 @@ and the S_IF* functions are
                         which can be bit-anded with e.g. S_IFREG
                          or with the following functions
  
-    # The operators -f, -d, -l, -b, -c, -p, and -s.
+    # The operators -f, -d, -l, -b, -c, -p, and -S.
  
      S_ISREG($mode) S_ISDIR($mode) S_ISLNK($mode)
      S_ISBLK($mode) S_ISCHR($mode) S_ISFIFO($mode) S_ISSOCK($mode)
@@ -5446,7 +5526,7 @@ and the S_IF* functions are
      S_ISENFMT($mode) S_ISWHT($mode)
  
  See your native chmod(2) and stat(2) documentation for more details
-about the S_* constants.  To get status info for a symbolic link
+about the C<S_*> constants.  To get status info for a symbolic link
  instead of the target file behind the link, use the C<lstat> function.
  
  =item study SCALAR
@@ -5562,15 +5642,21 @@ replacement string as the 4th argument.  This allows you to replace
  parts of the EXPR and return what was there before in one operation,
  just as you can with splice().
  
-If the lvalue returned by substr is used after the EXPR is changed in
-any way, the behaviour may not be as expected and is subject to change.
-This caveat includes code such as C<print(substr($foo,$a,$b)=$bar)> or
-C<(substr($foo,$a,$b)=$bar)=$fud> (where $foo is changed via the
-substring assignment, and then the substr is used again), or where a
-substr() is aliased via a C<foreach> loop or passed as a parameter or
-a reference to it is taken and then the alias, parameter, or deref'd
-reference either is used after the original EXPR has been changed or
-is assigned to and then used a second time.
+Note that the lvalue returned by the 3-arg version of substr() acts as
+a 'magic bullet'; each time it is assigned to, it remembers which part
+of the original string is being modified; for example:
+
+    $x = '1234';
+    for (substr($x,1,2)) {
+        $_ = 'a';   print $x,"\n";     # prints 1a4
+        $_ = 'xyz'; print $x,"\n";     # prints 1xyz4
+        $x = '56789';
+        $_ = 'pq';  print $x,"\n";     # prints 5pq9
+    }
+
+
+Prior to Perl version 5.9.1, the result of using an lvalue multiple times was
+unspecified.
  
  =item symlink OLDFILE,NEWFILE
  
@@ -6602,10 +6688,10 @@ and for other examples.
  
  =item wantarray
  
-Returns true if the context of the currently executing subroutine is
-looking for a list value.  Returns false if the context is looking
-for a scalar.  Returns the undefined value if the context is looking
-for no value (void context).
+Returns true if the context of the currently executing subroutine or
+eval() block is looking for a list value.  Returns false if the context is
+looking for a scalar.  Returns the undefined value if the context is
+looking for no value (void context).
  
      return unless defined wantarray;   # don't bother doing more
      my @a = complex_calculation();