X-Git-Url: https://perl5.git.perl.org/perl5.git/blobdiff_plain/2341804c04aefb306adfdbe22c23e6cd11f0e48f..1336785ebf927e0609f93578343ec6e6c67ef90c:/pod/perlfunc.pod diff --git a/pod/perlfunc.pod b/pod/perlfunc.pod index 54f5ae2..6bbdf39 100644 --- a/pod/perlfunc.pod +++ b/pod/perlfunc.pod @@ -107,8 +107,8 @@ than one place. =item Functions for SCALARs or strings X X X -C, C, C, C, C, C, C, C, -C, C, C, C, C, C, C, +C, C, C, C, C, C, C, C, +C, C, C, C, C, C, C, C, C, C, C, C, C, C, C =item Regular expressions and pattern matching @@ -161,34 +161,45 @@ C, C, C =item Keywords related to the control flow of your Perl program X -C, C, C, C, C, C, C, -C, C, C, C, C, C, C +C, C, C, C, +C, C, C C, +C<__FILE__>, C, C, C<__LINE__>, C, C<__PACKAGE__>, +C, C, C, C<__SUB__>, C + +C<__SUB__> is only available with a C (or higher) declaration or +with the C<"current_sub"> feature (see L). =item Keywords related to the switch feature -C, C, CC, C +C, C, C, C, C -These are available only if you enable the C<"switch"> feature. -See L and L. -Alternately, include a C or later to the current scope. +Except for C, these are available only if you enable the +C<"switch"> feature or use the C prefix. +See L and L. +Alternately, include a C or later to the current scope. In Perl +5.14 and earlier, C required the C<"switch"> feature, like the +other keywords. =item Keywords related to scoping C, C, C, C, C, C, C, C -C is available only if the C<"state"> feature is enabled. See +C is available only if the C<"state"> feature +is enabled or if it is prefixed with C. See L. Alternately, include a C or later to the current scope. =item Miscellaneous functions -C, C, C, C, C, C, C, +C, C, C, C, +C, C, C, C, C, C, C, C, C =item Functions for processes and process groups X X X C, C, C, C, C, C, C, -C, C, C, C, C, C, +C, C, C, C, +C, C, C, C, C, C =item Keywords related to Perl modules @@ -236,21 +247,12 @@ X. 
Equivalent examples: @@ -1338,7 +1374,8 @@ determined from the values of C<$!> and C<$?> with this pseudocode: exit 255; # last resort The intent is to squeeze as much possible information about the likely cause -into the limited space of the system exit code. However, as C<$!> is the value +into the limited space of the system exit +code. However, as C<$!> is the value of C's C, which can be set by any system call, this means that the value of the exit code used by C can be non-predictable, so should not be relied upon, other than to be non-zero. @@ -1394,7 +1431,7 @@ X X Not really a function. Returns the value of the last command in the sequence of commands indicated by BLOCK. When modified by the C or C loop modifier, executes the BLOCK once before testing the loop -condition. (On other statements the loop modifiers test the conditional +condition. (On other statements the loop modifiers test the conditional first.) C does I count as a loop, so the loop control statements @@ -1471,10 +1508,12 @@ be open any more when the program is reincarnated, with possible resulting confusion by Perl. This function is now largely obsolete, mostly because it's very hard to -convert a core file into an executable. That's why you should now invoke +convert a core file into an executable. That's why you should now invoke it as C, if you don't want to be warned against a possible typo. +Portability issues: L. + =item each HASH X X @@ -1483,11 +1522,12 @@ X =item each EXPR -When called in list context, returns a 2-element list consisting of the key -and value for the next element of a hash, or the index and value for the -next element of an array, so that you can iterate over it. When called in -scalar context, returns only the key (not the value) in a hash, or the index -in an array. +When called on a hash in list context, returns a 2-element list +consisting of the key and value for the next element of a hash. 
In Perl +5.12 and later only, it will also return the index and value for the next +element of an array so that you can iterate over it; older Perls consider +this a syntax error. When called in scalar context, returns only the key +(not the value) in a hash, or the index in an array. Hash entries are returned in an apparently random order. The actual random order is subject to change in future versions of Perl, but it is @@ -1498,14 +1538,15 @@ for security reasons (see L). After C has returned all entries from the hash or array, the next call to C returns the empty list in list context and C in -scalar context. The next call following that one restarts iteration. Each -hash or array has its own internal iterator, accessed by C, C, -and C. The iterator is implicitly reset when C has reached -the end as just described; it can be explicitly reset by calling C or -C on the hash or array. If you add or delete a hash's elements -while iterating over it, entries may be skipped or duplicated--so don't do -that. Exception: It is always safe to delete the item most recently -returned by C, so the following code works properly: +scalar context; the next call following I one restarts iteration. +Each hash or array has its own internal iterator, accessed by C, +C, and C. The iterator is implicitly reset when C has +reached the end as just described; it can be explicitly reset by calling +C or C on the hash or array. If you add or delete a hash's +elements while iterating over it, entries may be skipped or duplicated--so +don't do that. Exception: In the current implementation, it is always safe +to delete the item most recently returned by C, so the following +code works properly: while (($key, $value) = each %hash) { print $key, "\n"; @@ -1526,6 +1567,14 @@ The exact behaviour may change in a future version of Perl. while (($key,$value) = each $hashref) { ... 
} +To avoid confusing would-be users of your code who are running earlier +versions of Perl with mysterious syntax errors, put this sort of thing at +the top of your file to signal that your code will work I on Perls of +a recent vintage: + + use 5.012; # so keys/values/each work on arrays + use 5.014; # so keys/values/each work on scalars (experimental) + See also C, C, and C. =item eof FILEHANDLE @@ -1574,7 +1623,7 @@ of the very last file only. Examples: print "--------------\n"; } print; - last if eof(); # needed if we're reading from a terminal + last if eof(); # needed if we're reading from a terminal } Practical hint: you almost never need to use C in Perl, because the @@ -1593,7 +1642,7 @@ In the first form, the return value of EXPR is parsed and executed as if it were a little Perl program. The value of the expression (which is itself determined within scalar context) is first parsed, and if there were no errors, executed as a block within the lexical context of the current Perl -program. This means, that in particular, any outer lexical variables are +program. This means, that in particular, any outer lexical variables are visible to it, and any package variable settings or subroutine and format definitions remain afterwards. @@ -1601,6 +1650,17 @@ Note that the value is parsed every time the C executes. If EXPR is omitted, evaluates C<$_>. This form is typically used to delay parsing and subsequent execution of the text of EXPR until run time. +If the C feature is enabled (which is the default under a +C or higher declaration), EXPR or C<$_> is treated as a string of +characters, so C declarations have no effect, and source filters +are forbidden. In the absence of the C feature, the string +will sometimes be treated as characters and sometimes as bytes, depending +on the internal encoding, and source filters activated within the C +exhibit the erratic, but historical, behaviour of affecting some outer file +scope that is still compiling. 
See also the L keyword, which +always treats its input as a byte stream and works properly with source +filters, and the L pragma. + In the second form, the code within the BLOCK is parsed only once--at the same time the code surrounding the C itself was parsed--and executed within the context of the current Perl program. This form is typically @@ -1620,12 +1680,12 @@ determined. If there is a syntax error or runtime error, or a C statement is executed, C returns C in scalar context -or an empty list--or, for syntax errors, a list containing a single -undefined value--in list context, and C<$@> is set to the error -message. The discrepancy in the return values in list context is -considered a bug by some, and will probably be fixed in a future -release. If there was no error, C<$@> is guaranteed to be the empty -string. Beware that using C neither silences Perl from printing +or an empty list in list context, and C<$@> is set to the error +message. (Prior to 5.16, a bug caused C to be returned +in list context for syntax errors, but not for runtime errors.) +If there was no error, C<$@> is set to the empty string. A +control flow operator like C or C can bypass the setting of +C<$@>. Beware that using C neither silences Perl from printing warnings to STDERR, nor does it stuff the text of warning messages into C<$@>. To do either of those, you have to use the C<$SIG{__WARN__}> facility, or turn off warnings inside the BLOCK or EXPR using S>. @@ -1638,7 +1698,7 @@ the die operator is used to raise exceptions. If you want to trap errors when loading an XS module, some problems with the binary interface (such as Perl version skew) may be fatal even with -C unless C<$ENV{PERL_DL_NONLAZY}> is set. See L. +C unless C<$ENV{PERL_DL_NONLAZY}> is set. See L. 
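The return-value and C<$@> rules described in the hunk above can be exercised with a small sketch. This is only an illustration, not text from the patch: C<try_run> is a hypothetical helper name, and the trailing C<1> inside the block is a common idiom for telling "the block finished" apart from "the block died", rather than something C<eval> requires.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of perlfunc): runs a code reference
# inside eval BLOCK and reports whether it completed or died.
sub try_run {
    my ($code) = @_;

    # eval BLOCK returns undef in scalar context if the block dies;
    # the trailing 1 gives a true value when it runs to completion.
    my $ok = eval { $code->(); 1 };

    # Copy $@ right away, before any other eval can overwrite it.
    my $err = $ok ? '' : ($@ || 'unknown error');

    return ($ok ? 1 : 0, $err);
}

my ($ok1, $err1) = try_run(sub { my $sum = 1 + 1; return $sum });
my ($ok2, $err2) = try_run(sub { die "boom\n" });

print "first call:  ok=$ok1\n";
print "second call: ok=$ok2, error=$err2";
```

Testing the return value of C<eval> rather than the truth of C<$@> sidesteps the cases mentioned above where C<$@> is not set as expected.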
If the code to be executed doesn't vary, you may use the eval-BLOCK form to trap run-time errors without incurring the penalty of @@ -1705,7 +1765,7 @@ particular situation, you can just use symbolic references instead, as in case 6. Before Perl 5.14, the assignment to C<$@> occurred before restoration -of localised variables, which means that for your code to run on older +of localized variables, which means that for your code to run on older versions, a temporary is required if you want to mask some but not all errors: @@ -1726,9 +1786,24 @@ C, C, or C cannot be used to leave or restart the block. An C executed within the C package doesn't see the usual surrounding lexical scope, but rather the scope of the first non-DB piece -of code that called it. You don't normally need to worry about this unless +of code that called it. You don't normally need to worry about this unless you are writing a Perl debugger. +=item evalbytes EXPR +X + +=item evalbytes + +This function is like L with a string argument, except it always +parses its argument, or C<$_> if EXPR is omitted, as a string of bytes. A +string containing characters whose ordinal value exceeds 255 results in an +error. Source filters activated within the evaluated code apply to the +code itself. + +This function is only available under the C feature, a +C (or higher) declaration, or with a C prefix. See +L for more information. + =item exec LIST X X @@ -1805,6 +1880,8 @@ open handles to avoid lost output. Note that C will not call your C blocks, nor will it invoke C methods on your objects. +Portability issues: L. + =item exists EXPR X X @@ -1895,10 +1972,12 @@ The exit() function does not always exit immediately. It calls any defined C routines first, but these C routines may not themselves abort the exit. Likewise any object destructors that need to be called are called before the real exit. C routines and destructors -can change the exit status by modifying C<$?>. 
If this is a problem, you +can change the exit status by modifying C<$?>. If this is a problem, you can call C to avoid END and destructor processing. See L for details. +Portability issues: L. + =item exp EXPR X X X X X @@ -1907,6 +1986,54 @@ X X X X X Returns I (the natural logarithm base) to the power of EXPR. If EXPR is omitted, gives C. +=item fc EXPR +X X X X X + +=item fc + +Returns the casefolded version of EXPR. This is the internal function +implementing the C<\F> escape in double-quoted strings. + +Casefolding is the process of mapping strings to a form where case +differences are erased; comparing two strings in their casefolded +form is effectively a way of asking if two strings are equal, +regardless of case. + +Roughly, if you ever found yourself writing this + + lc($this) eq lc($that) # Wrong! + # or + uc($this) eq uc($that) # Also wrong! + # or + $this =~ /\Q$that/i # Right! + +Now you can write + + fc($this) eq fc($that) + +And get the correct results. + +Perl only implements the full form of casefolding. +For further information on casefolding, refer to +the Unicode Standard, specifically sections 3.13 C, +4.2 C, and 5.18 C, +available at L, as well as the +Case Charts available at L. + +If EXPR is omitted, uses C<$_>. + +This function behaves the same way under various pragma, such as in a locale, +as L does. + +While the Unicode Standard defines two additional forms of casefolding, +one for Turkic languages and one that never maps one character into multiple +characters, these are not provided by the Perl core; However, the CPAN module +C may be used to provide an implementation. + +This keyword is available only when the C<"fc"> feature is enabled, +or when prefixed with C; See L. Alternately, +include a C or later to the current scope. + =item fcntl FILEHANDLE,FUNCTION,SCALAR X @@ -1944,6 +2071,13 @@ on your own, though. $flags = fcntl(REMOTE, F_SETFL, $flags | O_NONBLOCK) or die "Can't set flags for the socket: $!\n"; +Portability issues: L. 
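As a rough illustration of the C<fc> entry added earlier in this hunk, the sketch below wraps the casefolded comparison in a helper. C<same_ignoring_case> is a hypothetical name chosen for this example, the inputs are plain ASCII, and the code assumes Perl 5.16 or later so that the C<"fc"> feature exists.

```perl
use strict;
use warnings;
use feature 'fc';    # assumption: Perl 5.16 or later

# Hypothetical helper: compares two strings by their casefolded forms,
# following the fc($this) eq fc($that) pattern shown in the fc entry.
sub same_ignoring_case {
    my ($this, $that) = @_;
    return fc($this) eq fc($that) ? 1 : 0;
}

my $match    = same_ignoring_case("Perl", "PERL");
my $mismatch = same_ignoring_case("Perl", "Pearl");

print "Perl vs PERL:  ", ($match    ? "equal" : "different"), "\n";
print "Perl vs Pearl: ", ($mismatch ? "equal" : "different"), "\n";
```

For ASCII-only input this gives the same answer as comparing C<lc> results; the difference shows up with characters whose casefolded form is not simply their lower-case form.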
+ +=item __FILE__ +X<__FILE__> + +A special token that returns the name of the file in which it occurs. + =item fileno FILEHANDLE X @@ -2044,6 +2178,8 @@ function lose their locks, making it seriously harder to write servers. See also L for other flock() examples. +Portability issues: L. + =item fork X X X @@ -2073,6 +2209,15 @@ if you exit, then the remote server (such as, say, a CGI script or a backgrounded job launched from a remote shell) won't think you're done. You should reopen those to F if it's any issue. +On some platforms such as Windows, where the fork() system call is not available, +Perl can be built to emulate fork() in the Perl interpreter. +The emulation is designed, at the level of the Perl program, +to be as compatible as possible with the "Unix" fork(). +However it has limitations that have to be considered in code intended to be portable. +See L for more details. + +Portability issues: L. + =item format X @@ -2164,6 +2309,8 @@ returns the empty string, use C. Do not consider C for authentication: it is not as secure as C. +Portability issues: L. + =item getpeername SOCKET X X @@ -2186,25 +2333,31 @@ doesn't implement getpgrp(2). If PID is omitted, returns the process group of the current process. Note that the POSIX version of C does not accept a PID argument, so only C is truly portable. +Portability issues: L. + =item getppid X X X Returns the process id of the parent process. Note for Linux users: on Linux, the C functions C and -C return different values from different threads. In order to +C return different values from different threads. In order to be portable, this behavior is not reflected by the Perl-level function -C, that returns a consistent value across threads. If you want +C, that returns a consistent value across threads. If you want to call the underlying C, you may use the CPAN module C. +Portability issues: L. + =item getpriority WHICH,WHO X X X Returns the current priority for a process, a process group, or a user. 
-(See C.) Will raise a fatal exception if used on a +(See L.) Will raise a fatal exception if used on a machine that doesn't implement getpriority(2). +Portability issues: L. + =item getpwnam NAME X X X X X X X X X X @@ -2357,9 +2510,16 @@ you can write this: $ip_address = inet_ntoa($packed_ip); } -Make sure is called in SCALAR context and that +Make sure C is called in SCALAR context and that its return value is checked for definedness. +The C function, even though it only takes one argument, +has the precedence of a list operator, so beware: + + getprotobynumber $number eq 'icmp' # WRONG + getprotobynumber($number eq 'icmp') # actually means this + getprotobynumber($number) eq 'icmp' # better this way + If you get tired of remembering which element of the return list contains which return value, by-name interfaces are provided in standard modules: C, C, C, @@ -2376,6 +2536,8 @@ Even though it looks as though they're the same method calls (uid), they aren't, because a C object is different from a C object. +Portability issues: L to L. + =item getsockname SOCKET X @@ -2396,15 +2558,15 @@ X Queries the option named OPTNAME associated with SOCKET at a given LEVEL. Options may exist at multiple protocol levels depending on the socket type, but at least the uppermost socket level SOL_SOCKET (defined in the -C module) will exist. To query options at another level the +C module) will exist. To query options at another level the protocol number of the appropriate protocol controlling the option -should be supplied. For example, to indicate that an option is to be +should be supplied. For example, to indicate that an option is to be interpreted by the TCP protocol, LEVEL should be set to the protocol number of TCP, which you can get using C. The function returns a packed string representing the requested socket option, or C on error, with the reason for the error placed in -C<$!>. Just what is in the packed string depends on LEVEL and OPTNAME; +C<$!>. 
Just what is in the packed string depends on LEVEL and OPTNAME; consult getsockopt(2) for details. A common case is that the option is an integer, in which case the result is a packed integer, which you can decode using C with the C (or C) format. @@ -2421,13 +2583,15 @@ Here's an example to test whether Nagle's algorithm is enabled on a socket: my $nodelay = unpack("I", $packed); print "Nagle's algorithm is turned ", $nodelay ? "off\n" : "on\n"; +Portability issues: L. =item given EXPR BLOCK X =item given BLOCK -C is analogous to the C keyword in other languages. C +C is analogous to the C +keyword in other languages. C and C are used in Perl to implement C/C like statements. Only available after Perl 5.10. For example: @@ -2444,7 +2608,7 @@ Only available after Perl 5.10. For example: } } -See L for detailed information. +See L for detailed information. =item glob EXPR X X X X @@ -2452,10 +2616,10 @@ X X X X =item glob In list context, returns a (possibly empty) list of filename expansions on -the value of EXPR such as the standard Unix shell F would do. In +the value of EXPR such as the standard Unix shell F would do. In scalar context, glob iterates through such filename expansions, returning -undef when the list is exhausted. This is the internal function -implementing the C<< <*.c> >> operator, but you can use it directly. If +undef when the list is exhausted. This is the internal function +implementing the C<< <*.c> >> operator, but you can use it directly. If EXPR is omitted, C<$_> is used. The C<< <*.c> >> operator is discussed in more detail in L. @@ -2463,6 +2627,19 @@ Note that C splits its arguments on whitespace and treats each segment as separate pattern. As such, C matches all files with a F<.c> or F<.h> extension. The expression C matches all files in the current working directory. +If you want to glob filenames that might contain whitespace, you'll +have to use extra quotes around the spacey filename to protect it. 
+For example, to glob filenames that have an C followed by a space +followed by an C, use either of: + + @spacies = <"*e f*">; + @spacies = glob '"*e f*"'; + @spacies = glob q("*e f*"); + +If you had to get a variable through, you could do this: + + @spacies = glob "'*${var}e f*'"; + @spacies = glob qq("*${var}e f*"); If non-empty braces are the only wildcard characters used in the C, no filenames are matched, but potentially many strings @@ -2475,6 +2652,8 @@ Beginning with v5.6.0, this operator is implemented using the standard C extension. See L for details, including C which does not treat whitespace as a pattern separator. +Portability issues: L. + =item gmtime EXPR X X X @@ -2487,7 +2666,7 @@ Note: When called in list context, $isdst, the last value returned by gmtime, is always C<0>. There is no Daylight Saving Time in GMT. -See L for portability concerns. +Portability issues: L. =item goto LABEL X X X @@ -2497,7 +2676,7 @@ X X X =item goto &NAME The C form finds the statement labeled with LABEL and -resumes execution there. It can't be used to get out of a block or +resumes execution there. It can't be used to get out of a block or subroutine given to C. It can be used to go almost anywhere else within the dynamic scope, including out of subroutines, but it's usually better to use some other construct such as C or C. @@ -2513,8 +2692,8 @@ necessarily recommended if you're optimizing for maintainability: goto ("FOO", "BAR", "GLARCH")[$i]; As shown in this example, C is exempt from the "looks like a -function" rule. A pair of parentheses following it does not (necessarily) -delimit its argument. C is equivalent to C. +function" rule. A pair of parentheses following it does not (necessarily) +delimit its argument. C is equivalent to C. Use of C or C to jump into a construct is deprecated and will issue a warning. Even then, it may not be used to @@ -2587,7 +2766,7 @@ L.) If EXPR is omitted, uses C<$_>. Hex strings may only represent integers. 
Strings that would cause integer overflow trigger a warning. Leading whitespace is not stripped, -unlike oct(). To present something as hex, look into L, +unlike oct(). To present something as hex, look into L, L, and L. =item import LIST @@ -2609,9 +2788,8 @@ It returns the position of the first occurrence of SUBSTR in STR at or after POSITION. If POSITION is omitted, starts searching from the beginning of the string. POSITION before the beginning of the string or after its end is treated as if it were the beginning or the end, -respectively. POSITION and the return value are based at C<0> (or whatever -you've set the C<$[> variable to--but don't do that). If the substring -is not found, C returns one less than the base, ordinarily C<-1>. +respectively. POSITION and the return value are based at zero. +If the substring is not found, C returns -1. =item int EXPR X X X X X @@ -2664,6 +2842,8 @@ system: The special string C<"0 but true"> is exempt from B<-w> complaints about improper numeric conversions. +Portability issues: L. + =item join EXPR,LIST X @@ -2682,8 +2862,10 @@ X X =item keys EXPR -Returns a list consisting of all the keys of the named hash, or the indices -of an array. (In scalar context, returns the number of keys or indices.) +Called in list context, returns a list consisting of all the keys of the +named hash, or in Perl 5.12 or later only, the indices of an array. Perl +releases prior to 5.12 will produce a syntax error if you try to use an +array argument. In scalar context, returns the number of keys or indices. The keys of a hash are returned in an apparently random order. The actual random order is subject to change in future versions of Perl, but it @@ -2734,7 +2916,7 @@ buckets will be retained even if you do C<%hash = ()>, use C if you want to free the storage while C<%hash> is still in scope. 
You can't shrink the number of buckets allocated for the hash using C in this way (but you needn't worry about doing this by accident, -as trying has no effect). C in an lvalue context is a syntax +as trying has no effect). C in an lvalue context is a syntax error. Starting with Perl 5.14, C can take a scalar EXPR, which must contain @@ -2745,9 +2927,19 @@ experimental. The exact behaviour may change in a future version of Perl. for (keys $hashref) { ... } for (keys $obj->get_arrayref) { ... } +To avoid confusing would-be users of your code who are running earlier +versions of Perl with mysterious syntax errors, put this sort of thing at +the top of your file to signal that your code will work I on Perls of +a recent vintage: + + use 5.012; # so keys/values/each work on arrays + use 5.014; # so keys/values/each work on scalars (experimental) + See also C, C, and C. =item kill SIGNAL, LIST + +=item kill SIGNAL X X Sends a signal to a list of processes. Returns the number of @@ -2765,7 +2957,8 @@ alive (even if only as a zombie) and hasn't changed its UID. See L for notes on the portability of this construct. Unlike in the shell, if SIGNAL is negative, it kills process groups instead -of processes. That means you usually want to use positive not negative signals. +of processes. That means you usually +want to use positive not negative signals. You may also use a signal name in quotes. The behavior of kill when a I number is zero or negative depends on @@ -2774,6 +2967,20 @@ signal the current process group and -1 will signal all processes. See L for more details. +On some platforms such as Windows where the fork() system call is not available. +Perl can be built to emulate fork() at the interpreter level. +This emulation has limitations related to kill that have to be considered, +for code running on Windows and in code intended to be portable. + +See L for more details. + +If there is no I of processes, no signal is sent, and the return +value is 0. 
This form is sometimes used, however, because it causes +tainting checks to be run. But see +L. + +Portability issues: L. + =item last LABEL X X @@ -2829,21 +3036,29 @@ respectively. =back -=item Otherwise, If EXPR has the UTF8 flag set +=item Otherwise, if C (but not C) is in effect: + +Respects current LC_CTYPE locale for code points < 256; and uses Unicode +semantics for the remaining code points (this last can only happen if +the UTF8 flag is also set). See L. -If the current package has a subroutine named C, it will be used to -change the case -(See L.) -Otherwise Unicode semantics are used for the case change. +A deficiency in this is that case changes that cross the 255/256 +boundary are not well-defined. For example, the lower case of LATIN CAPITAL +LETTER SHARP S (U+1E9E) in Unicode semantics is U+00DF (on ASCII +platforms). But under C, the lower case of U+1E9E is +itself, because 0xDF may not be LATIN SMALL LETTER SHARP S in the +current locale, and Perl has no way of knowing if that character even +exists in the locale, much less what code point it is. Perl returns +the input character unchanged, for all instances (and there aren't +many) where the 255/256 boundary would otherwise be crossed. -=item Otherwise, if C is in effect +=item Otherwise, If EXPR has the UTF8 flag set: -Respects current LC_CTYPE locale. See L. +Unicode semantics are used for the case change. -=item Otherwise, if C is in effect: +=item Otherwise, if C or C) is in effect: -Unicode semantics are used for the case change. Any subroutine named -C will be ignored. +Unicode semantics are used for the case change. =item Otherwise: @@ -2894,12 +3109,19 @@ characters, not physical bytes. For how many bytes a string encoded as UTF-8 would take up, use C (you'll have to C first). See L and L. +=item __LINE__ +X<__LINE__> + +A special token that compiles to the current line number. + =item link OLDFILE,NEWFILE X Creates a new filename linked to the old filename. 
Returns true for success, false otherwise. +Portability issues: L. + =item listen SOCKET,QUEUESIZE X @@ -2948,15 +3170,11 @@ This makes it easy to get a month name from a list: print "$abbr[$mon] $mday"; # $mon=9, $mday=18 gives "Oct 18" -C<$year> is the number of years since 1900, B just the last two digits -of the year. That is, C<$year> is C<123> in year 2023. The proper way -to get a 4-digit year is simply: +C<$year> contains the number of years since 1900. To get a 4-digit +year write: $year += 1900; -Otherwise you create non-Y2K-compliant programs--and you wouldn't want -to do that, would you? - To get the last two digits of the year (e.g., "01" in 2001) do: $year = sprintf("%02d", $year % 100); @@ -2975,8 +3193,9 @@ In scalar context, C returns the ctime(3) value: $now_string = localtime; # e.g., "Thu Oct 13 04:54:34 1994" -This scalar value is B locale-dependent but is a Perl builtin. For GMT -instead of local time use the L builtin. See also the +The format of this scalar value is B locale-dependent +but built into Perl. For GMT instead of local +time use the L builtin. See also the C module (for converting seconds, minutes, hours, and such back to the integer value returned by time()), and the L module's strftime(3) and mktime(3) functions. @@ -2993,8 +3212,6 @@ try for example: Note that the C<%a> and C<%b>, the short forms of the day of the week and the month of the year, may not necessarily be three characters wide. -See L for portability concerns. - The L and L modules provide a convenient, by-name access mechanism to the gmtime() and localtime() functions, respectively. @@ -3002,12 +3219,17 @@ respectively. For a comprehensive date and time representation look at the L module on CPAN. +Portability issues: L. + =item lock THING X This function places an advisory lock on a shared variable or referenced object contained in I until the lock goes out of scope. 
+The value returned is the scalar itself, if the argument is a scalar, or a +reference, if the argument is a hash, array or subroutine. + lock() is a "weak keyword" : this means that if you've defined a function by this name (before any calls to it), that function will be called instead. If you are not under C this does nothing. @@ -3031,9 +3253,13 @@ divided by the natural log of N. For example: See also L for the inverse operation. -=item lstat EXPR +=item lstat FILEHANDLE X +=item lstat EXPR + +=item lstat DIRHANDLE + =item lstat Does the same thing as the C function (including setting the @@ -3044,6 +3270,8 @@ information, please see the documentation for C. If EXPR is omitted, stats C<$_>. +Portability issues: L. + =item m// The match operator. See L. @@ -3071,7 +3299,7 @@ translates a list of numbers to their squared values. my @squares = map { $_ > 5 ? ($_ * $_) : () } @numbers; shows that number of returned elements can differ from the number of -input elements. To omit an element, return an empty list (). +input elements. To omit an element, return an empty list (). This could also be achieved by writing my @squares = map { $_ * $_ } grep { $_ > 5 } @numbers; @@ -3080,7 +3308,7 @@ which makes the intention more clear. Map always returns a list, which can be assigned to a hash such that the elements -become key/value pairs. See L for more details. +become key/value pairs. See L for more details. %hash = map { get_a_key_for($_) => $_ } @array; @@ -3104,11 +3332,12 @@ the list elements, C<$_> keeps being lexical inside the block; that is, it can't be seen from the outside, avoiding any potential side-effects. C<{> starts both hash references and blocks, so C could be either -the start of map BLOCK LIST or map EXPR, LIST. Because Perl doesn't look +the start of map BLOCK LIST or map EXPR, LIST. Because Perl doesn't look ahead for the closing C<}> it has to take a guess at which it's dealing with -based on what it finds just after the C<{>. 
Usually it gets it right, but if it +based on what it finds just after the +C<{>. Usually it gets it right, but if it doesn't it won't realize something is wrong until it gets to the C<}> and -encounters the missing (or unexpected) comma. The syntax error will be +encounters the missing (or unexpected) comma. The syntax error will be reported close to the C<}>, but you'll need to change something near the C<{> such as using a unary C<+> to give Perl some help: @@ -3168,6 +3397,8 @@ C<"0 but true"> for zero, or the actual return value otherwise. See also L and the documentation for C and C. +Portability issues: L. + =item msgget KEY,FLAGS X @@ -3176,6 +3407,8 @@ id, or C on error. See also L and the documentation for C and C. +Portability issues: L. + =item msgrcv ID,VAR,SIZE,TYPE,FLAGS X @@ -3188,6 +3421,8 @@ Taints the variable. Returns true if successful, false on error. See also L and the documentation for C and C. +Portability issues: L. + =item msgsnd ID,MSG,FLAGS X @@ -3199,6 +3434,8 @@ C. Returns true if successful, false on error. See also the C and C documentation. +Portability issues: L. + =item my EXPR X @@ -3340,7 +3577,7 @@ created if necessary. You can put a C<+> in front of the C<< > >> or C<< < >> to indicate that you want both read and write access to the file; thus C<< +< >> is almost always preferred for read/write updates--the -C<< +> >> mode would clobber the file first. You cant usually use +C<< +> >> mode would clobber the file first. You can't usually use either read-write mode for updating textfiles, since they have variable-length records. See the B<-i> switch in L for a better approach. The file is created with permissions of C<0666> @@ -3379,15 +3616,18 @@ or C<-> opens STDIN and opening C<< >- >> opens STDOUT. 
You may (and usually should) use the three-argument form of open to specify I/O layers (sometimes referred to as "disciplines") to apply to the handle that affect how the input and output are processed (see L and -L for more details). For example: +L for more details). For example: open(my $fh, "<:encoding(UTF-8)", "filename") || die "can't open UTF-8 encoded filename: $!"; opens the UTF8-encoded file containing Unicode characters; -see L. Note that if layers are specified in the +see L. Note that if layers are specified in the three-argument form, then default layers stored in ${^OPEN} (see L; usually set by the B pragma or the switch B<-CioD>) are ignored. +Those layers will also be ignored if you specify a colon with no name +following it. In that case the default layer for the operating system +(:raw on Unix, :crlf on Windows) is used. Open returns nonzero on success, the undefined value otherwise. If the C involved a pipe, the return value happens to be the pid of @@ -3492,7 +3732,8 @@ duped (as C) and opened. You may use C<&> after C<< > >>, C<<< >> >>>, C<< < >>, C<< +> >>, C<<< +>> >>>, and C<< +< >>. The mode you specify should match the mode of the original filehandle. (Duping a filehandle does not take into account any existing contents -of IO buffers.) If you use the three-argument form, then you can pass either a +of IO buffers.) If you use the three-argument +form, then you can pass either a number, the name of a filehandle, or the normal "reference to a glob". Here is a script that saves, redirects, and restores C and @@ -3564,7 +3805,7 @@ Use C or C to determine whether the open was successful. For example, use either - $child_pid = open(FROM_KID, "|-") // die "can't fork: $!"; + $child_pid = open(FROM_KID, "-|") // die "can't fork: $!"; or $child_pid = open(TO_KID, "|-") // die "can't fork: $!"; @@ -3664,7 +3905,7 @@ but will not work on a filename that happens to have a trailing space, while will have exactly the opposite restrictions. 
-If you want a "real" C C (see C on your system), then you +If you want a "real" C C (see L on your system), then you should use the C function, which involves no such magic (but may use subtly different filemodes than Perl open(), which is mapped to C fopen()). This is another way to protect your filenames from @@ -3708,6 +3949,8 @@ yourself and inspect the return value. See L for some details about mixing reading and writing. +Portability issues: L. + =item opendir DIRHANDLE,EXPR X @@ -3726,8 +3969,7 @@ X X =item ord -Returns the numeric (the native 8-bit encoding, like ASCII or EBCDIC, -or Unicode) value of the first character of EXPR. +Returns the numeric value of the first character of EXPR. If EXPR is an empty string, returns 0. If EXPR is omitted, uses C<$_>. (Note I, not byte.) @@ -3799,9 +4041,9 @@ An C declaration may also have a list of attributes associated with it. The exact semantics and interface of TYPE and ATTRS are still -evolving. TYPE is currently bound to the use of C pragma, -and attributes are handled using the C pragma, or starting -from Perl 5.8.0 also via the C module. See +evolving. TYPE is currently bound to the use of the C pragma, +and attributes are handled using the C pragma, or, starting +from Perl 5.8.0, also via the C module. See L for details, and L, L, and L. @@ -3824,7 +4066,8 @@ of values, as follows: A A text (ASCII) string, will be space padded. Z A null-terminated (ASCIZ) string, will be null padded. - b A bit string (ascending bit order inside each byte, like vec()). + b A bit string (ascending bit order inside each byte, + like vec()). B A bit string (descending bit order inside each byte). h A hex string (low nybble first). H A hex string (high nybble first). @@ -3841,49 +4084,52 @@ of values, as follows: q A signed quad (64-bit) value. Q An unsigned quad value. - (Quads are available only if your system supports 64-bit - integer values _and_ if Perl has been compiled to support those. - Raises an exception otherwise.) 
+ (Quads are available only if your system supports 64-bit + integer values _and_ if Perl has been compiled to support + those. Raises an exception otherwise.) i A signed integer value. I A unsigned integer value. - (This 'integer' is _at_least_ 32 bits wide. Its exact - size depends on what a local C compiler calls 'int'.) + (This 'integer' is _at_least_ 32 bits wide. Its exact + size depends on what a local C compiler calls 'int'.) n An unsigned short (16-bit) in "network" (big-endian) order. N An unsigned long (32-bit) in "network" (big-endian) order. v An unsigned short (16-bit) in "VAX" (little-endian) order. V An unsigned long (32-bit) in "VAX" (little-endian) order. - j A Perl internal signed integer value (IV). - J A Perl internal unsigned integer value (UV). + j A Perl internal signed integer value (IV). + J A Perl internal unsigned integer value (UV). f A single-precision float in native format. d A double-precision float in native format. F A Perl internal floating-point value (NV) in native format D A float of long-double precision in native format. - (Long doubles are available only if your system supports long - double values _and_ if Perl has been compiled to support those. - Raises an exception otherwise.) + (Long doubles are available only if your system supports + long double values _and_ if Perl has been compiled to + support those. Raises an exception otherwise.) p A pointer to a null-terminated string. P A pointer to a structure (fixed-length string). u A uuencoded string. - U A Unicode character number. Encodes to a character in character mode - and UTF-8 (or UTF-EBCDIC in EBCDIC platforms) in byte mode. + U A Unicode character number. Encodes to a character in char- + acter mode and UTF-8 (or UTF-EBCDIC in EBCDIC platforms) in + byte mode. - w A BER compressed integer (not an ASN.1 BER, see perlpacktut for - details). Its bytes represent an unsigned integer in base 128, - most significant digit first, with as few digits as possible. 
Bit - eight (the high bit) is set on each byte except the last. + w A BER compressed integer (not an ASN.1 BER, see perlpacktut + for details). Its bytes represent an unsigned integer in + base 128, most significant digit first, with as few digits + as possible. Bit eight (the high bit) is set on each byte + except the last. x A null byte (a.k.a ASCII NUL, "\000", chr(0)) X Back up a byte. @ Null-fill or truncate to absolute position, counted from the start of the innermost ()-group. - . Null-fill or truncate to absolute position specified by the value. + . Null-fill or truncate to absolute position specified by + the value. ( Start of a ()-group. One or more modifiers below may optionally follow certain letters in the @@ -3897,8 +4143,8 @@ TEMPLATE (the second column lists letters for which the modifier is valid): nNvV Treat integers as signed instead of unsigned. @. Specify position as byte offset in the internal - representation of the packed string. Efficient but - dangerous. + representation of the packed string. Efficient + but dangerous. > sSiIlLqQ Force big-endian byte-order on the type. jJfFdDpP (The "big end" touches the construct.) @@ -3921,7 +4167,7 @@ count. A numeric repeat count may optionally be enclosed in brackets, as in C. The repeat count gobbles that many values from the LIST when used with all format types other than C, C, C, C, C, C, C, C<@>, C<.>, C, C, and C
, where it means -something else, dscribed below. Supplying a C<*> for the repeat count +something else, described below. Supplying a C<*> for the repeat count instead of a number means to use however many items are left, except for: =over @@ -3980,7 +4226,7 @@ bigger then the group level. =back The repeat count for C is interpreted as the maximal number of bytes -to encode per line of output, with 0, 1 and 2 replaced by 45. The repeat +to encode per line of output, with 0, 1 and 2 replaced by 45. The repeat count should not be more than 65. =item * @@ -4073,18 +4319,18 @@ unpacking has encoded the sizes or repeat counts for some of its fields within the structure itself as separate fields. For C, you write ICI, and the -I describes how the length value is packed. Formats likely +I describes how the length value is packed. Formats likely to be of most use are integer-packing ones like C for Java strings, C for ASN.1 or SNMP, and C for Sun XDR. For C, I may have a repeat count, in which case the minimum of that and the number of available items is used as the argument -for I. If it has no repeat count or uses a '*', the number +for I. If it has no repeat count or uses a '*', the number of available items is used. For C, an internal stack of integer arguments unpacked so far is -used. You write CI and the repeat count is obtained by -popping off the last element from the stack. The I must not +used. You write CI and the repeat count is obtained by +popping off the last element from the stack. The I must not have a repeat count. If I refers to a string type (C<"A">, C<"a">, or C<"Z">), @@ -4092,12 +4338,14 @@ the I is the string length, not the number of strings. With an explicit repeat count for pack, the packed string is adjusted to that length. 
For example: - unpack("W/a", "\004Gurusamy") gives ("Guru") - unpack("a3/A A*", "007 Bond J ") gives (" Bond", "J") - unpack("a3 x2 /A A*", "007: Bond, J.") gives ("Bond, J", ".") + This code: gives this result: + + unpack("W/a", "\004Gurusamy") ("Guru") + unpack("a3/A A*", "007 Bond J ") (" Bond", "J") + unpack("a3 x2 /A A*", "007: Bond, J.") ("Bond, J", ".") - pack("n/a* w/a","hello,","world") gives "\000\006hello,\005world" - pack("a/W2", ord("a") .. ord("z")) gives "2ab" + pack("n/a* w/a","hello,","world") "\000\006hello,\005world" + pack("a/W2", ord("a") .. ord("z")) "2ab" The I is not returned explicitly from C. @@ -4262,8 +4510,9 @@ will not in general equal $foo. Pack and unpack can operate in two modes: character mode (C mode) where the packed string is processed per character, and UTF-8 mode (C mode) where the packed string is processed in its UTF-8-encoded Unicode form on -a byte-by-byte basis. Character mode is the default unless the format string -starts with C. You can always switch mode mid-format with an explicit +a byte-by-byte basis. Character mode is the default +unless the format string starts with C. You +can always switch mode mid-format with an explicit C or C in the format. This mode remains in effect until the next mode change, or until the end of the C<()> group it (directly) applies to. @@ -4299,7 +4548,7 @@ handle their output and input as flat sequences of characters. A C<()> group is a sub-TEMPLATE enclosed in parentheses. A group may take a repeat count either as postfix, or for unpack(), also via the C template character. Within each repetition of a group, positioning with -C<@> starts over at 0. Therefore, the result of +C<@> starts over at 0. Therefore, the result of pack("@1A((@2A)@3A)", qw[X Y Z]) @@ -4309,7 +4558,7 @@ is the string C<"\0X\0\0YZ">. C and C accept the C modifier to act as alignment commands: they jump forward or back to the closest position aligned at a multiple of C -characters. 
For example, to pack() or unpack() a C structure like +characters. For example, to pack() or unpack() a C structure like struct { char c; /* one signed, 8-bit character */ @@ -4357,12 +4606,14 @@ Examples: $foo = pack("W4",0x24b6,0x24b7,0x24b8,0x24b9); # same thing with Unicode circled letters. $foo = pack("U4",0x24b6,0x24b7,0x24b8,0x24b9); - # same thing with Unicode circled letters. You don't get the UTF-8 - # bytes because the U at the start of the format caused a switch to - # U0-mode, so the UTF-8 bytes get joined into characters + # same thing with Unicode circled letters. You don't get the + # UTF-8 bytes because the U at the start of the format caused + # a switch to U0-mode, so the UTF-8 bytes get joined into + # characters $foo = pack("C0U4",0x24b6,0x24b7,0x24b8,0x24b9); # foo eq "\xe2\x92\xb6\xe2\x92\xb7\xe2\x92\xb8\xe2\x92\xb9" - # This is the UTF-8 encoding of the string in the previous example + # This is the UTF-8 encoding of the string in the + # previous example $foo = pack("ccxxcc",65,66,67,68); # foo eq "AB\0\0CD" @@ -4479,6 +4730,11 @@ On systems that support a close-on-exec flag on files, that flag is set on all newly opened file descriptors whose Cs are I than the current value of $^F (by default 2 for C). See L. +=item __PACKAGE__ +X<__PACKAGE__> + +A special token that returns the name of the package in which it occurs. + =item pop ARRAY X X @@ -4498,6 +4754,13 @@ reference to an unblessed array. The argument will be dereferenced automatically. This aspect of C is considered highly experimental. The exact behaviour may change in a future version of Perl. 
+To avoid confusing would-be users of your code who are running earlier +versions of Perl with mysterious syntax errors, put this sort of thing at +the top of your file to signal that your code will work I on Perls of +a recent vintage: + + use 5.014; # so push/pop/etc work on scalars (experimental) + =item pos SCALAR X X @@ -4505,14 +4768,14 @@ X X Returns the offset of where the last C search left off for the variable in question (C<$_> is used when the variable is not -specified). Note that 0 is a valid match offset. C indicates +specified). Note that 0 is a valid match offset. C indicates that the search position is reset (usually due to match failure, but can also be because no match has yet been run on the scalar). C directly accesses the location used by the regexp engine to store the offset, so assigning to C will change that offset, and so will also influence the C<\G> zero-width assertion in regular -expressions. Both of these effects take place for the next match, so +expressions. Both of these effects take place for the next match, so you can't affect the position with C during the current match, such as in C<(?{pos() = 5})> or C. @@ -4537,7 +4800,7 @@ FILEHANDLE may be a scalar variable containing the name of or a reference to the filehandle, thus introducing one level of indirection. (NOTE: If FILEHANDLE is a variable and the next token is a term, it may be misinterpreted as an operator unless you interpose a C<+> or put -parentheses around the arguments.) If FILEHANDLE is omitted, prints to the +parentheses around the arguments.) If FILEHANDLE is omitted, prints to the last selected (see L) output handle. If LIST is omitted, prints C<$_> to the currently selected output handle. To use FILEHANDLE alone to print the content of C<$_> to it, you must use a real filehandle like @@ -4576,10 +4839,12 @@ X Equivalent to C, except that C<$\> (the output record separator) is not appended. The first argument of the -list will be interpreted as the C format. 
See C for an -explanation of the format argument. If you omit the LIST, C<$_> is used; +list will be interpreted as the C format. See +L for an +explanation of the format argument. If you omit the LIST, C<$_> is used; to use FILEHANDLE without a LIST, you must use a real filehandle like -C, not an indirect one like C<$fh>. If C is in effect and +C, not an indirect one like C<$fh>. If C (including +C) is in effect and POSIX::setlocale() has been called, the character used for the decimal separator in formatted floating-point numbers is affected by the LC_NUMERIC locale setting. See L and L. @@ -4623,6 +4888,13 @@ reference to an unblessed array. The argument will be dereferenced automatically. This aspect of C is considered highly experimental. The exact behaviour may change in a future version of Perl. +To avoid confusing would-be users of your code who are running earlier +versions of Perl with mysterious syntax errors, put this sort of thing at +the top of your file to signal that your code will work I on Perls of +a recent vintage: + + use 5.014; # so push/pop/etc work on scalars (experimental) + =item q/STRING/ =item qq/STRING/ @@ -4653,7 +4925,7 @@ If EXPR is omitted, uses C<$_>. quotemeta (and C<\Q> ... C<\E>) are useful when interpolating strings into regular expressions, because by default an interpolated variable will be -considered a mini-regular expression. For example: +considered a mini-regular expression. For example: my $sentence = 'The quick brown fox jumped over the lazy dog'; my $substring = 'quick.*?fox'; @@ -4674,7 +4946,8 @@ Or: my $quoted_substring = quotemeta($substring); $sentence =~ s{$quoted_substring}{big bad wolf}; -Will both leave the sentence as is. Normally, when accepting literal string +Will both leave the sentence as is. +Normally, when accepting literal string input from the user, quotemeta() or C<\Q> must be used. 
In Perl 5.14, all characters whose code points are above 127 are not @@ -4709,8 +4982,8 @@ B is not cryptographically secure. You should not rely on it in security-sensitive situations.> As of this writing, a number of third-party CPAN modules offer random number generators intended by their authors to be cryptographically secure, -including: L, L, and -L. +including: L, L, L, +and L. =item read FILEHANDLE,SCALAR,LENGTH,OFFSET X X @@ -4732,7 +5005,8 @@ results in the string being padded to the required size with C<"\0"> bytes before the result of the read is appended. The call is implemented in terms of either Perl's or your system's native -fread(3) library function. To get a true read(2) system call, see C. +fread(3) library function. To get a true read(2) system call, see +L. Note the I: depending on the status of the filehandle, either (8-bit) bytes or characters are read. By default, all @@ -4767,6 +5041,13 @@ which will set C<$_> on every iteration. } closedir $dh; +To avoid confusing would-be users of your code who are running earlier +versions of Perl with mysterious failures, put this sort of thing at the +top of your file to signal that your code will work I on Perls of a +recent vintage: + + use 5.012; # so readdir assigns to $_ in a lone while test + =item readline EXPR =item readline @@ -4803,7 +5084,7 @@ C and dies if the result is not defined. } Note that you can't handle C errors that way with the -C filehandle. In that case, you have to open each element of +C filehandle. In that case, you have to open each element of C<@ARGV> yourself since C handles C differently. foreach my $arg (@ARGV) { @@ -4826,6 +5107,8 @@ implemented. If not, raises an exception. If there is a system error, returns the undefined value and sets C<$!> (errno). If EXPR is omitted, uses C<$_>. +Portability issues: L. + =item readpipe EXPR =item readpipe @@ -4906,7 +5189,7 @@ X X =item ref Returns a non-empty string if EXPR is a reference, the empty -string otherwise. 
If EXPR +string otherwise. If EXPR is not specified, C<$_> will be used. The value returned depends on the type of thing the reference is a reference to. Builtin types include: @@ -4934,8 +5217,8 @@ name is returned instead. You can think of C as a C operator. } The return value C indicates a reference to an lvalue that is not -a variable. You get this from taking the reference of function calls like -C or C. C is returned if the reference points +a variable. You get this from taking the reference of function calls like +C or C. C is returned if the reference points to a L. The result C indicates that the argument is a regular expression @@ -4959,6 +5242,8 @@ rename(2) manpage or equivalent system documentation for details. For a platform independent C function look at the L module. +Portability issues: L. + =item require VERSION X @@ -5055,7 +5340,7 @@ will complain about not finding "F" there. In this case you can do: Now that you understand how C looks for files with a bareword argument, there is a little extra functionality going on behind the scenes. Before C looks for a "F<.pm>" extension, it will -first look for a similar filename with a "F<.pmc>" extension. If this file +first look for a similar filename with a "F<.pmc>" extension. If this file is found, it will be loaded in place of any file ending in a "F<.pm>" extension. @@ -5078,7 +5363,7 @@ A filehandle, from which the file will be read. =item 2 -A reference to a subroutine. If there is no filehandle (previous item), +A reference to a subroutine. If there is no filehandle (previous item), then this subroutine is expected to generate one line of source code per call, writing the line into C<$_> and returning 1, then finally at end of file returning 0. If there is a filehandle, then the subroutine will be @@ -5088,7 +5373,7 @@ returned. =item 3 -Optional state for the subroutine. The state is passed in as C<$_[1]>. A +Optional state for the subroutine. The state is passed in as C<$_[1]>. 
A reference to the subroutine itself is passed in as C<$_[0]>. =back @@ -5139,7 +5424,7 @@ into package C
.) Here is a typical code layout: push @INC, Foo->new(...); These hooks are also permitted to set the %INC entry -corresponding to the files they have loaded. See L. +corresponding to the files they have loaded. See L. For a yet-more-powerful import facility, see L and L. @@ -5175,7 +5460,7 @@ X Returns from a subroutine, C, or C with the value given in EXPR. Evaluation of EXPR may be in list, scalar, or void context, depending on how the return value will be used, and the context -may vary from one execution to the next (see C). If no EXPR +may vary from one execution to the next (see L). If no EXPR is given, returns an empty list in list context, the undefined value in scalar context, and (of course) nothing at all in void context. @@ -5219,6 +5504,8 @@ X Sets the current position to the beginning of the directory for the C routine on DIRHANDLE. +Portability issues: L. + =item rindex STR,SUBSTR,POSITION X @@ -5258,7 +5545,8 @@ simply an abbreviation for C<{ local $\ = "\n"; print LIST }>. To use FILEHANDLE without a LIST to print the contents of C<$_> to it, you must use a real filehandle like C, not an indirect one like C<$fh>. -This keyword is available only when the C<"say"> feature is enabled; see +This keyword is available only when the C<"say"> feature +is enabled, or when prefixed with C; see L. Alternately, include a C or later to the current scope. @@ -5377,6 +5665,8 @@ methods, preferring to write the last example as: use IO::Handle; STDERR->autoflush(1); +Portability issues: L. + =item select RBITS,WBITS,EBITS,TIMEOUT X behaves just like select(2): it returns On some Unixes, select(2) may report a socket file descriptor as "ready for reading" even when no data is available, and thus any subsequent C -would block. This can be avoided if you always use O_NONBLOCK on the -socket. See select(2) and fcntl(2) for further details. +would block. This can be avoided if you always use O_NONBLOCK on the +socket. See select(2) and fcntl(2) for further details. 
The standard C module provides a user-friendlier interface to C, except as permitted by POSIX, and even then only on POSIX systems. You have to use C instead. +Portability issues: L. + =item semctl ID,SEMNUM,CMD,ARG X @@ -5457,6 +5749,8 @@ short integers, which may be created with C. See also L, C, C documentation. +Portability issues: L. + =item semget KEY,NSEMS,FLAGS X @@ -5465,6 +5759,8 @@ the undefined value on error. See also L, C, C documentation. +Portability issues: L. + =item semop KEY,OPSTRING X @@ -5483,6 +5779,8 @@ To signal the semaphore, replace C<-1> with C<1>. See also L, C, and C documentation. +Portability issues: L. + =item send SOCKET,MSG,FLAGS,TO X @@ -5513,6 +5811,8 @@ it defaults to C<0,0>. Note that the BSD 4.2 version of C does not accept any arguments, so only C is portable. See also C. +Portability issues: L. + =item setpriority WHICH,WHO,PRIORITY X X X X @@ -5520,6 +5820,8 @@ Sets the current priority for a process, a process group, or a user. (See setpriority(2).) Raises an exception when used on a machine that doesn't implement setpriority(2). +Portability issues: L. + =item setsockopt SOCKET,LEVEL,OPTNAME,OPTVAL X @@ -5534,6 +5836,8 @@ An example disabling Nagle's algorithm on a socket: use Socket qw(IPPROTO_TCP TCP_NODELAY); setsockopt($socket, IPPROTO_TCP, TCP_NODELAY, 1); +Portability issues: L. + =item shift ARRAY X @@ -5554,6 +5858,13 @@ reference to an unblessed array. The argument will be dereferenced automatically. This aspect of C is considered highly experimental. The exact behaviour may change in a future version of Perl. +To avoid confusing would-be users of your code who are running earlier +versions of Perl with mysterious syntax errors, put this sort of thing at +the top of your file to signal that your code will work I on Perls of +a recent vintage: + + use 5.014; # so push/pop/etc work on scalars (experimental) + See also C, C, and C. 
C and C do the same thing to the left end of an array that C and C do to the right end. @@ -5571,6 +5882,8 @@ structure. Returns like ioctl: C for error; "C<0> but true" for zero; and the actual return value otherwise. See also L and C documentation. +Portability issues: L. + =item shmget KEY,SIZE,FLAGS X @@ -5578,6 +5891,8 @@ Calls the System V IPC function shmget. Returns the shared memory segment id, or C on error. See also L and C documentation. +Portability issues: L. + =item shmread ID,VAR,POS,SIZE X X @@ -5590,9 +5905,11 @@ detaching from it. When reading, VAR must be a variable that will hold the data read. When writing, if STRING is too long, only SIZE bytes are used; if STRING is too short, nulls are written to fill out SIZE bytes. Return true if successful, false on error. -shmread() taints the variable. See also L, +shmread() taints the variable. See also L, C, and the C module from CPAN. +Portability issues: L and L. + =item shutdown SOCKET,HOW X @@ -5656,7 +5973,7 @@ For delays of finer granularity than one second, the Time::HiRes module distribution) provides usleep(). You may also use Perl's four-argument version of select() leaving the first three arguments undefined, or you might be able to use the C interface to access setitimer(2) if -your system supports it. See L for details. +your system supports it. See L for details. See also the POSIX module's C function. @@ -5697,6 +6014,8 @@ See L for an example of socketpair use. Perl 5.8 and later will emulate socketpair using IP sockets to localhost if your system implements sockets but not socketpair. +Portability issues: L. + =item sort SUBNAME LIST X X X X @@ -5724,13 +6043,18 @@ into the subroutine as the package global variables $a and $b (see example below). Note that in the latter case, it is usually highly counter-productive to declare $a and $b as lexicals. +If the subroutine is an XSUB, the elements to be compared are pushed on to +the stack, the way arguments are usually passed to XSUBs. 
$a and $b are +not set. + The values to be compared are always passed by reference and should not be modified. You also cannot exit out of the sort block or subroutine using any of the loop control operators described in L or with C. -When C is in effect, C sorts LIST according to the +When C (but not C) is in +effect, C sorts LIST according to the current collation locale. See L. sort() returns aliases into the original list, much as a for loop's index @@ -5762,7 +6086,7 @@ Examples: @articles = sort {$a cmp $b} @files; # now case-insensitively - @articles = sort {uc($a) cmp uc($b)} @files; + @articles = sort {fc($a) cmp fc($b)} @files; # same thing in reversed order @articles = sort {$b cmp $a} @files; @@ -5779,7 +6103,7 @@ Examples: # sort using explicit subroutine name sub byage { - $age{$a} <=> $age{$b}; # presuming numeric + $age{$a} <=> $age{$b}; # presuming numeric } @sortedclass = sort byage @class; @@ -5799,8 +6123,8 @@ Examples: my @new = sort { ($b =~ /=(\d+)/)[0] <=> ($a =~ /=(\d+)/)[0] - || - uc($a) cmp uc($b) + || + fc($a) cmp fc($b) } @old; # same thing, but much more efficiently; @@ -5809,22 +6133,22 @@ Examples: my @nums = @caps = (); for (@old) { push @nums, ( /=(\d+)/ ? $1 : undef ); - push @caps, uc($_); + push @caps, fc($_); } my @new = @old[ sort { - $nums[$b] <=> $nums[$a] - || - $caps[$a] cmp $caps[$b] - } 0..$#old - ]; + $nums[$b] <=> $nums[$a] + || + $caps[$a] cmp $caps[$b] + } 0..$#old + ]; # same thing, but without any temps @new = map { $_->[0] } sort { $b->[1] <=> $a->[1] - || - $a->[2] cmp $b->[2] - } map { [$_, /=(\d+)/, uc($_)] } @old; + || + $a->[2] cmp $b->[2] + } map { [$_, /=(\d+)/, fc($_)] } @old; # using a prototype allows you to use any comparison subroutine # as a sort subroutine (including other package's subroutines) @@ -5843,7 +6167,7 @@ Examples: @new = sort { substr($a, 3, 5) cmp substr($b, 3, 5) } @old; Warning: syntactical care is required when sorting the list returned from -a function. 
If you want to sort the list returned by the function call +a function. If you want to sort the list returned by the function call C, you can use: @contact = sort { $a cmp $b } find_records @key; @@ -5876,8 +6200,7 @@ sometimes saying the opposite, for example) the results are not well-defined. Because C<< <=> >> returns C when either operand is C -(not-a-number), and laso because C raises an exception unless the -result of a comparison is defined, be careful when sorting with a +(not-a-number), be careful when sorting with a comparison function like C<< $a <=> $b >> any lists that might contain a C. The following example takes advantage that C to eliminate any Cs from the input list. @@ -5902,11 +6225,11 @@ If OFFSET is negative then it starts that far from the end of the array. If LENGTH is omitted, removes everything from OFFSET onward. If LENGTH is negative, removes the elements from OFFSET onward except for -LENGTH elements at the end of the array. -If both OFFSET and LENGTH are omitted, removes everything. If OFFSET is +If both OFFSET and LENGTH are omitted, removes everything. If OFFSET is past the end of the array, Perl issues a warning, and splices at the end of the array. -The following equivalences hold (assuming C<< $[ == 0 and $#a >= $i >> ) +The following equivalences hold (assuming C<< $#a >= $i >> ) push(@a,$x,$y) splice(@a,@a,0,$x,$y) pop(@a) splice(@a,-1) @@ -5932,6 +6255,13 @@ reference to an unblessed array. The argument will be dereferenced automatically. This aspect of C is considered highly experimental. The exact behaviour may change in a future version of Perl. 
+To avoid confusing would-be users of your code who are running earlier +versions of Perl with mysterious syntax errors, put this sort of thing at +the top of your file to signal that your code will work I on Perls of +a recent vintage: + + use 5.014; # so push/pop/etc work on scalars (experimental) + =item split /PATTERN/,EXPR,LIMIT X @@ -5941,124 +6271,158 @@ X =item split -Splits the string EXPR into a list of strings and returns that list. By -default, empty leading fields are preserved, and empty trailing ones are -deleted. (If all fields are empty, they are considered to be trailing.) +Splits the string EXPR into a list of strings and returns the +list in list context, or the size of the list in scalar context. + +If only PATTERN is given, EXPR defaults to C<$_>. + +Anything in EXPR that matches PATTERN is taken to be a separator +that separates the EXPR into substrings (called "I") that +do B include the separator. Note that a separator may be +longer than one character or even have no characters at all (the +empty string, which is a zero-width match). + +The PATTERN need not be constant; an expression may be used +to specify a pattern that varies at runtime. + +If PATTERN matches the empty string, the EXPR is split at the match +position (between characters). As an example, the following: -In scalar context, returns the number of fields found. + print join(':', split('b', 'abc')), "\n"; -If EXPR is omitted, splits the C<$_> string. If PATTERN is also omitted, -splits on whitespace (after skipping any leading whitespace). Anything -matching PATTERN is taken to be a delimiter separating the fields. (Note -that the delimiter may be longer than one character.) +uses the 'b' in 'abc' as a separator to produce the output 'a:c'. +However, this: + + print join(':', split('', 'abc')), "\n"; + +uses empty string matches as separators to produce the output +'a:b:c'; thus, the empty string may be used to split EXPR into a +list of its component characters. 
+ +As a special case for C, the empty pattern given in +L syntax (C) specifically matches the empty string, which is contrary to its usual +interpretation as the last successful match. + +If PATTERN is C, then it is treated as if it used the +L (C), since it +isn't much use otherwise. + +As another special case, C emulates the default behavior of the +command line tool B when the PATTERN is either omitted or a I composed of a single space character (such as S> or +S>, but not e.g. S>). In this case, any leading +whitespace in EXPR is removed before splitting occurs, and the PATTERN is +instead treated as if it were C; in particular, this means that +I contiguous whitespace (not just a single space character) is used as +a separator. However, this special treatment can be avoided by specifying +the pattern S> instead of the string S>, thereby allowing +only a single space character to be a separator. + +If omitted, PATTERN defaults to a single space, S>, triggering +the previously described I emulation. If LIMIT is specified and positive, it represents the maximum number -of fields the EXPR will be split into, though the actual number of -fields returned depends on the number of times PATTERN matches within -EXPR. If LIMIT is unspecified or zero, trailing null fields are -stripped (which potential users of C would do well to remember). -If LIMIT is negative, it is treated as if an arbitrarily large LIMIT -had been specified. Note that splitting an EXPR that evaluates to the -empty string always returns the empty list, regardless of the LIMIT -specified. +of fields into which the EXPR may be split; in other words, LIMIT is +one greater than the maximum number of times EXPR may be split. Thus, +the LIMIT value C<1> means that EXPR may be split a maximum of zero +times, producing a maximum of one field (namely, the entire value of +EXPR). 
For instance: -A pattern matching the empty string (not to be confused with -an empty pattern C, which is just one member of the set of patterns -matching the epmty string), splits EXPR into individual -characters. For example: + print join(':', split(//, 'abc', 1)), "\n"; - print join(':', split(/ */, 'hi there')), "\n"; +produces the output 'abc', and this: -produces the output 'h:i:t:h:e:r:e'. + print join(':', split(//, 'abc', 2)), "\n"; -As a special case for C, the empty pattern C specifically -matches the empty string; this is not be confused with the normal use -of an empty pattern to mean the last successful match. So to split -a string into individual characters, the following: +produces the output 'a:bc', and each of these: - print join(':', split(//, 'hi there')), "\n"; + print join(':', split(//, 'abc', 3)), "\n"; + print join(':', split(//, 'abc', 4)), "\n"; -produces the output 'h:i: :t:h:e:r:e'. +produces the output 'a:b:c'. -Empty leading fields are produced when there are positive-width matches at -the beginning of the string; a zero-width match at the beginning of -the string does not produce an empty field. For example: +If LIMIT is negative, it is treated as if it were instead arbitrarily +large; as many fields as possible are produced. - print join(':', split(/(?=\w)/, 'hi there!')); +If LIMIT is omitted (or, equivalently, zero), then it is usually +treated as if it were instead negative but with the exception that +trailing empty fields are stripped (empty leading fields are always +preserved); if all fields are empty, then all fields are considered to +be trailing (and are thus stripped in this case). Thus, the following: -produces the output 'h:i :t:h:e:r:e!'. Empty trailing fields, on the other -hand, are produced when there is a match at the end of the string (and -when LIMIT is given and is not 0), regardless of the length of the match. 
-For example: + print join(':', split(',', 'a,b,c,,,')), "\n"; - print join(':', split(//, 'hi there!', -1)), "\n"; - print join(':', split(/\W/, 'hi there!', -1)), "\n"; +produces the output 'a:b:c', but the following: -produce the output 'h:i: :t:h:e:r:e:!:' and 'hi:there:', respectively, -both with an empty trailing field. + print join(':', split(',', 'a,b,c,,,', -1)), "\n"; -The LIMIT parameter can be used to split a line partially +produces the output 'a:b:c:::'. - ($login, $passwd, $remainder) = split(/:/, $_, 3); +In time-critical applications, it is worthwhile to avoid splitting +into more fields than necessary. Thus, when assigning to a list, +if LIMIT is omitted (or zero), then LIMIT is treated as though it +were one larger than the number of variables in the list; for the +following, LIMIT is implicitly 4: -When assigning to a list, if LIMIT is omitted, or zero, Perl supplies -a LIMIT one larger than the number of variables in the list, to avoid -unnecessary work. For the list above LIMIT would have been 4 by -default. In time critical applications it behooves you not to split -into more fields than you really need. + ($login, $passwd, $remainder) = split(/:/); -If the PATTERN contains parentheses, additional list elements are -created from each matching substring in the delimiter. +Note that splitting an EXPR that evaluates to the empty string always +produces zero fields, regardless of the LIMIT specified. - split(/([,-])/, "1-10,20", 3); +An empty leading field is produced when there is a positive-width +match at the beginning of EXPR. For instance: -produces the list value + print join(':', split(/ /, ' abc')), "\n"; - (1, '-', 10, ',', 20) +produces the output ':abc'. 
However, a zero-width match at the +beginning of EXPR never produces an empty field, so that: -If you had the entire header of a normal Unix email message in $header, -you could split it up into fields and their values this way: + print join(':', split(//, ' abc')); - $header =~ s/\n(?=\s)//g; # fix continuation lines - %hdrs = (UNIX_FROM => split /^(\S*?):\s*/m, $header); +produces the output S<' :a:b:c'> (rather than S<': :a:b:c'>). -The pattern C may be replaced with an expression to specify -patterns that vary at runtime. (To do runtime compilation only once, -use C.) +An empty trailing field, on the other hand, is produced when there is a +match at the end of EXPR, regardless of the length of the match +(of course, unless a non-zero LIMIT is given explicitly, such fields are +removed, as in the last example). Thus: -As a special case, specifying a PATTERN of space (S>) will split on -white space just as C with no arguments does. Thus, S> can -be used to emulate B's default behavior, whereas S> -will give you as many initial null fields (empty string) as there are leading spaces. -A C on C is like a S> except that any leading -whitespace produces a null first field. A C with no arguments -really does a S> internally. + print join(':', split(//, ' abc', -1)), "\n"; -A PATTERN of C is treated as if it were C, since it isn't -much use otherwise. +produces the output S<' :a:b:c:'>. -Example: +If the PATTERN contains +L, +then for each separator, an additional field is produced for each substring +captured by a group (in the order in which the groups are specified, +as per L); if any group does not +match, then it captures the C value instead of a substring. Also, +note that any such additional field is produced whenever there is a +separator (that is, whenever a split occurs), and such an additional field +does B count towards the LIMIT. 
Consider the following expressions +evaluated in list context (each returned list is provided in the associated +comment): - open(PASSWD, '/etc/passwd'); - while () { - chomp; - ($login, $passwd, $uid, $gid, - $gcos, $home, $shell) = split(/:/); - #... - } + split(/-|,/, "1-10,20", 3) + # ('1', '10', '20') + + split(/(-|,)/, "1-10,20", 3) + # ('1', '-', '10', ',', '20') -As with regular pattern matching, any capturing parentheses that are not -matched in a C will be set to C when returned: + split(/-|(,)/, "1-10,20", 3) + # ('1', undef, '10', ',', '20') - @fields = split /(A)|B/, "1A2B3"; - # @fields is (1, 'A', 2, undef, 3) + split(/(-)|,/, "1-10,20", 3) + # ('1', '-', '10', undef, '20') + + split(/(-)|(,)/, "1-10,20", 3) + # ('1', '-', undef, '10', undef, ',', '20') =item sprintf FORMAT, LIST X Returns a string formatted by the usual C conventions of the C library function C. See below for more details -and see C or C on your system for an explanation of +and see L or L on your system for an explanation of the general principles. For example: @@ -6076,7 +6440,8 @@ Non-standard extensions in your local sprintf(3) are therefore unavailable from Perl. Unlike C, C does not do what you probably mean when you -pass it an array as your first argument. The array is given scalar context, +pass it an array as your first argument. +The array is given scalar context, and instead of using the 0th element of the array as the format, Perl will use the count of elements in the array as the format, which is almost never useful. 
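A minimal sketch of the array-as-first-argument pitfall described above, with one possible workaround. The variable names and the sample format string are hypothetical and chosen only for illustration.

```perl
use strict;
use warnings;

# Hypothetical data: a format followed by its arguments in one array.
my @args = ('%s has %d items', 'cart', 3);

# Pitfall: the array is given scalar context, so the format becomes
# the element count rather than the 0th element.
my $wrong = sprintf(@args);

# One workaround: peel the format off before calling sprintf.
my ($format, @values) = @args;
my $right = sprintf($format, @values);

print "$wrong\n";
print "$right\n";
```

Separating the format into its own scalar keeps the call explicit about which value is the format and which values are to be formatted.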
@@ -6103,7 +6468,7 @@ In addition, Perl permits the following widely-supported conversions: %B like %b, but using an upper-case "B" with the # flag %p a pointer (outputs the Perl value's address in hexadecimal) %n special: *stores* the number of characters output so far - into the next variable in the parameter list + into the next argument in the parameter list Finally, for backward (and we do mean "backward") compatibility, Perl permits these unnecessary but widely-supported conversions: @@ -6128,7 +6493,7 @@ In order, these are: =item format parameter index -An explicit format parameter index, such as C<2$>. By default sprintf +An explicit format parameter index, such as C<2$>. By default sprintf will format the next unused argument in the list, but this allows you to take the arguments out of order: @@ -6176,9 +6541,9 @@ the precision is incremented if it's necessary for the leading "0". =item vector flag This flag tells Perl to interpret the supplied string as a vector of -integers, one for each character in the string. Perl applies the format to +integers, one for each character in the string. Perl applies the format to each integer in turn, then joins the resulting strings with a separator (a -dot C<.> by default). This can be useful for displaying ordinal values of +dot C<.> by default). This can be useful for displaying ordinal values of characters in arbitrary strings: printf "%vd", "AB\x{100}"; # prints "65.66.256" @@ -6198,7 +6563,7 @@ the join string using something like C<*2$v>; for example: =item (minimum) width Arguments are usually formatted to be only as wide as required to -display the given value. You can override the width by putting +display the given value. 
You can override the width by putting a number here, or get the width from the next argument (with C<*>) or from a specified argument (e.g., with C<*2$>): @@ -6228,7 +6593,7 @@ For example: printf '<%.1e>', 10; # prints "<1.0e+01>" For "g" and "G", this specifies the maximum number of digits to show, -including thoe prior to the decimal point and those after it; for +including those prior to the decimal point and those after it; for example: # These examples are subject to system-specific variation. @@ -6290,7 +6655,7 @@ example using C<.*2$>: =item size For numeric conversions, you can specify the size to interpret the -number as using C, C, C, C, C, or C. For integer +number as using C, C, C, C, C, or C. For integer conversions (C), numbers are usually assumed to be whatever the default integer size is on your platform (usually 32 or 64 bits), but you can override this to use instead one of the standard C types, @@ -6329,7 +6694,7 @@ You can find out whether your Perl supports quads via L: For floating-point conversions (C), numbers are usually assumed to be the default floating-point size on your platform (double or long double), but you can force "long double" with C, C, or C if your -platform supports them. You can find out whether your Perl supports long +platform supports them. You can find out whether your Perl supports long doubles via L: use Config; @@ -6356,7 +6721,7 @@ integer or floating-point number", which is the default. =item order of arguments Normally, sprintf() takes the next unused argument as the value to -format for each format specification. If the format specification +format for each format specification. If the format specification uses C<*> to require additional arguments, these are consumed from the argument list in the order they appear in the format specification I the value to format. 
Where an argument is @@ -6386,7 +6751,8 @@ index, the C<$> may need escaping: =back -If C is in effect and POSIX::setlocale() has been called, +If C (including C) is in effect +and POSIX::setlocale() has been called, the character used for the decimal separator in formatted floating-point numbers is affected by the LC_NUMERIC locale. See L and L. @@ -6410,11 +6776,14 @@ X X X Sets and returns the random number seed for the C operator. -The point of the function is to "seed" the C function so that -C can produce a different sequence each time you run your -program. When called with a parameter, C uses that for the seed; -otherwise it (semi-)randomly chooses a seed. In either case, starting with -Perl 5.14, it returns the seed. +The point of the function is to "seed" the C function so that C +can produce a different sequence each time you run your program. When +called with a parameter, C uses that for the seed; otherwise it +(semi-)randomly chooses a seed. In either case, starting with Perl 5.14, +it returns the seed. To signal that your code will work I on Perls +of a recent vintage: + + use 5.014; # so srand returns the seed If C is not called explicitly, it is called implicitly without a parameter at the first use of the C operator. However, this was not true @@ -6425,10 +6794,7 @@ C at all. But there are a few situations in recent Perls where programs are likely to want to call C. One is for generating predictable results generally for testing or debugging. There, you use C, with the same C<$seed> -each time. Another other case is where you need a cryptographically-strong -starting point rather than the generally acceptable default, which is based on -time of day, process ID, and memory allocation, or the F device -if available. And still another case is that you may want to call C +each time. Another case is that you may want to call C after a C to avoid child processes sharing the same seed value as the parent (and consequently each other). 
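The testing and debugging use of an explicit seed mentioned above might look like the following sketch. The seed value 42 and the variable names are arbitrary choices for illustration, and the return-value check assumes Perl 5.14 or later, as the text states. The fork case is not covered here.

```perl
use 5.014;    # so srand returns the seed
use strict;
use warnings;

my $seed = 42;    # arbitrary value, used only for this illustration

# Seeding twice with the same value replays the same rand() sequence.
srand($seed);
my @first = map { rand() } 1 .. 3;

my $returned = srand($seed);
my @second = map { rand() } 1 .. 3;

print "sequence repeats\n" if "@first" eq "@second";
print "srand returned $returned\n";
```

Recording the value returned by srand makes it possible to rerun a failing test with the same sequence later.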
@@ -6447,16 +6813,6 @@ current C