perl 4.0 patch 7: patch #4, continued
.rn '' }`
''' $RCSfile:,v $$Revision: $$Date: 91/06/07 11:41:23 $
'''
''' $Log:,v $
''' Revision 91/06/07 11:41:23 lwall
''' patch4: added global modifier for pattern matches
''' patch4: default top-of-form format is now FILEHANDLE_TOP
''' patch4: added $^P variable to control calling of perldb routines
''' patch4: added $^F variable to specify maximum system fd, default 2
''' patch4: changed old $^P to $^X
'''
''' Revision 91/04/11 17:50:44 lwall
''' patch1: fixed some typos
'''
''' Revision 4.0 91/03/20 01:38:08 lwall
''' 4.0 baseline.
'''
.de Sh
.br
.ne 5
.PP
\fB\\$1\fR
.PP
..
.de Sp
.if t .sp .5v
.if n .sp
..
.de Ip
.br
.ie \\n(.$>=3 .ne \\$3
.el .ne 3
.IP "\\$1" \\$2
..
'''
''' Set up \*(-- to give an unbreakable dash;
''' string Tr holds user defined translation string.
''' Bell System Logo is used as a dummy character.
'''
.tr \(*W-|\(bv\*(Tr
.ie n \{\
.ds -- \(*W-
.if (\n(.H=4u)&(1m=24u) .ds -- \(*W\h'-12u'\(*W\h'-12u'-\" diablo 10 pitch
.if (\n(.H=4u)&(1m=20u) .ds -- \(*W\h'-12u'\(*W\h'-8u'-\" diablo 12 pitch
.ds L" ""
.ds R" ""
.ds L' '
.ds R' '
'br\}
.el\{\
.ds -- \(em\|
.tr \*(Tr
.ds L" ``
.ds R" ''
.ds L' `
.ds R' '
'br\}
.TH PERL 1 "\*(RP"
.SH NAME
perl \- Practical Extraction and Report Language
.SH SYNOPSIS
.B perl
[options] filename args
.SH DESCRIPTION
.I Perl
is an interpreted language optimized for scanning arbitrary text files,
extracting information from those text files, and printing reports based
on that information.
It's also a good language for many system management tasks.
The language is intended to be practical (easy to use, efficient, complete)
rather than beautiful (tiny, elegant, minimal).
It combines (in the author's opinion, anyway) some of the best features of C,
\fIsed\fR, \fIawk\fR, and \fIsh\fR,
so people familiar with those languages should have little difficulty with it.
(Language historians will also note some vestiges of \fIcsh\fR, Pascal, and
even BASIC-PLUS.)
Expression syntax corresponds quite closely to C expression syntax.
Unlike most Unix utilities,
.I perl
does not arbitrarily limit the size of your data\*(--if you've got
the memory,
.I perl
can slurp in your whole file as a single string.
Recursion is of unlimited depth.
And the hash tables used by associative arrays grow as necessary to prevent
degraded performance.
.I Perl
uses sophisticated pattern matching techniques to scan large amounts of
data very quickly.
Although optimized for scanning text,
.I perl
can also deal with binary data, and can make dbm files look like associative
arrays (where dbm is available).
Setuid
.I perl
scripts are safer than C programs
through a dataflow tracing mechanism which prevents many stupid security holes.
If you have a problem that would ordinarily use \fIsed\fR
or \fIawk\fR or \fIsh\fR, but it
exceeds their capabilities or must run a little faster,
and you don't want to write the silly thing in C, then
.I perl
may be for you.
There are also translators to turn your
.I sed
and
.I awk
scripts into
.I perl
scripts.
OK, enough hype.
.PP
Upon startup,
.I perl
looks for your script in one of the following places:
.Ip 1. 4 2
Specified line by line via
.B \-e
switches on the command line.
.Ip 2. 4 2
Contained in the file specified by the first filename on the command line.
(Note that systems supporting the #! notation invoke interpreters this way.)
.Ip 3. 4 2
Passed in implicitly via standard input.
This only works if there are no filename arguments\*(--to pass
arguments to a
.I stdin
script you must explicitly specify a \- for the script name.
.PP
After locating your script,
.I perl
compiles it to an internal form.
If the script is syntactically correct, it is executed.
.Sh "Options"
Note: on first reading this section may not make much sense to you.  It's here
at the front for easy reference.
.PP
A single-character option may be combined with the following option, if any.
This is particularly useful when invoking a script using the #! construct which
only allows one argument.  Example:
.nf
.ne 2
.sp
	#!/usr/bin/perl \-spi.bak	# same as \-s \-p \-i.bak
	.\|.\|.
.sp
.fi
Options include:
.TP 5
.BI \-0 digits
specifies the record separator ($/) as an octal number.
If there are no digits, the null character is the separator.
Other switches may precede or follow the digits.
For example, if you have a version of
.I find
which can print filenames terminated by the null character, you can say this:
.nf
.sp
	find . \-name '*.bak' \-print0 | perl \-n0e unlink
.sp
.fi
The special value 00 will cause Perl to slurp files in paragraph mode.
The value 0777 will cause Perl to slurp files whole since there is no
legal character with that value.
.TP 5
.B \-a
turns on autosplit mode when used with a
.B \-n
or
.BR \-p .
An implicit split command to the @F array
is done as the first thing inside the implicit while loop produced by
the
.B \-n
or
.BR \-p .
.nf
.sp
	perl \-ane \'print pop(@F), "\en";\'
.sp
is equivalent to
.sp
	while (<>) {
		@F = split(\' \');
		print pop(@F), "\en";
	}
.sp
.fi
.TP 5
.B \-c
causes
.I perl
to check the syntax of the script and then exit without executing it.
.TP 5
.BI \-d
runs the script under the perl debugger.
See the section on Debugging.
.TP 5
.BI \-D number
sets debugging flags.
To watch how it executes your script, use
.BR \-D14 .
(This only works if debugging is compiled into your
.IR perl .)
Another nice value is \-D1024, which lists your compiled syntax tree.
And \-D512 displays compiled regular expressions.
.TP 5
.BI \-e " commandline"
may be used to enter one line of script.
Multiple
.B \-e
commands may be given to build up a multi-line script.
If
.B \-e
is given,
.I perl
will not look for a script filename in the argument list.
.TP 5
.BI \-i extension
specifies that files processed by the <> construct are to be edited
in-place.
It does this by renaming the input file, opening the output file by the
same name, and selecting that output file as the default for print statements.
The extension, if supplied, is added to the name of the
old file to make a backup copy.
If no extension is supplied, no backup is made.
Saying \*(L"perl \-p \-i.bak \-e "s/foo/bar/;" .\|.\|. \*(R" is the same as using
the script:
.nf
.ne 2
.sp
	#!/usr/bin/perl \-pi.bak
	s/foo/bar/;
.sp
which is equivalent to
.ne 14
.sp
	#!/usr/bin/perl
	while (<>) {
		if ($ARGV ne $oldargv) {
			rename($ARGV, $ARGV . \'.bak\');
			open(ARGVOUT, ">$ARGV");
			select(ARGVOUT);
			$oldargv = $ARGV;
		}
		s/foo/bar/;
	}
	continue {
		print;	# this prints to original filename
	}
	select(STDOUT);
.sp
.fi
except that the
.B \-i
form doesn't need to compare $ARGV to $oldargv to know when
the filename has changed.
It does, however, use ARGVOUT for the selected filehandle.
Note that
.I STDOUT
is restored as the default output filehandle after the loop.
.Sp
You can use eof to locate the end of each input file, in case you want
to append to each file, or reset line numbering (see example under eof).
.TP 5
.BI \-I directory
may be used in conjunction with
.B \-P
to tell the C preprocessor where to look for include files.
By default /usr/include and /usr/lib/perl are searched.
.TP 5
.BI \-l octnum
enables automatic line-ending processing.  It has two effects:
first, it automatically chops the line terminator when used with
.B \-n
or
.B \-p ,
and second, it assigns $\e to have the value of
.I octnum
so that any print statements will have that line terminator added back on.  If
.I octnum
is omitted, sets $\e to the current value of $/.
For instance, to trim lines to 80 columns:
.nf
.sp
	perl -lpe \'substr($_, 80) = ""\'
.sp
.fi
Note that the assignment $\e = $/ is done when the switch is processed,
so the input record separator can be different than the output record
separator if the
.B \-l
switch is followed by a
.B \-0
switch:
.nf
.sp
	gnufind / -print0 | perl -ln0e 'print "found $_" if -p'
.sp
.fi
This sets $\e to newline and then sets $/ to the null character.
.TP 5
.B \-n
causes
.I perl
to assume the following loop around your script, which makes it iterate
over filename arguments somewhat like \*(L"sed \-n\*(R" or \fIawk\fR:
.nf
.ne 3
.sp
	while (<>) {
		.\|.\|.		# your script goes here
	}
.sp
.fi
Note that the lines are not printed by default.
See
.B \-p
to have lines printed.
Here is an efficient way to delete all files older than a week:
.nf
.sp
	find . \-mtime +7 \-print | perl \-nle \'unlink;\'
.sp
.fi
This is faster than using the \-exec switch of find because you don't have to
start a process on every filename found.
.TP 5
.B \-p
causes
.I perl
to assume the following loop around your script, which makes it iterate
over filename arguments somewhat like \fIsed\fR:
.nf
.ne 5
.sp
	while (<>) {
		.\|.\|.		# your script goes here
	} continue {
		print;
	}
.sp
.fi
Note that the lines are printed automatically.
To suppress printing use the
.B \-n
switch.
A
.B \-p
overrides a
.B \-n
switch.
.TP 5
.B \-P
causes your script to be run through the C preprocessor before
compilation by
.IR perl .
(Since both comments and cpp directives begin with the # character,
you should avoid starting comments with any words recognized
by the C preprocessor such as \*(L"if\*(R", \*(L"else\*(R" or \*(L"define\*(R".)
.TP 5
.B \-s
enables some rudimentary switch parsing for switches on the command line
after the script name but before any filename arguments (or before a \-\|\-).
Any switch found there is removed from @ARGV and sets the corresponding variable in the
.I perl
script.
The following script prints \*(L"true\*(R" if and only if the script is
invoked with a \-xyz switch.
.nf
.ne 2
.sp
	#!/usr/bin/perl \-s
	if ($xyz) { print "true\en"; }
.sp
.fi
.TP 5
.B \-S
makes
.I perl
use the PATH environment variable to search for the script
(unless the name of the script starts with a slash).
Typically this is used to emulate #! startup on machines that don't
support #!, in the following manner:
.nf
.sp
	#!/usr/bin/perl
	eval "exec /usr/bin/perl \-S $0 $*"
		if $running_under_some_shell;
.sp
.fi
The system ignores the first line and feeds the script to /bin/sh,
which proceeds to try to execute the
.I perl
script as a shell script.
The shell executes the second line as a normal shell command, and thus
starts up the
.I perl
interpreter.
On some systems $0 doesn't always contain the full pathname,
so the
.B \-S
tells
.I perl
to search for the script if necessary.
After
.I perl
locates the script, it parses the lines and ignores them because
the variable $running_under_some_shell is never true.
A better construct than $* would be ${1+"$@"}, which handles embedded spaces
and such in the filenames, but doesn't work if the script is being interpreted
by csh.
In order to start up sh rather than csh, some systems may have to replace the
#! line with a line containing just
a colon, which will be politely ignored by perl.
Other systems can't control that, and need a totally devious construct that
will work under any of csh, sh or perl, such as the following:
.nf
.ne 3
.sp
	eval '(exit $?0)' && eval 'exec /usr/bin/perl -S $0 ${1+"$@"}'
	& eval 'exec /usr/bin/perl -S $0 $argv:q'
		if 0;
.sp
.fi
.TP 5
.B \-u
causes
.I perl
to dump core after compiling your script.
You can then take this core dump and turn it into an executable file
by using the undump program (not supplied).
This speeds startup at the expense of some disk space (which you can
minimize by stripping the executable).
(Still, a "hello world" executable comes out to about 200K on my machine.)
If you are going to run your executable as a set-id program then you
should probably compile it using taintperl rather than normal perl.
If you want to execute a portion of your script before dumping, use the
dump operator instead.
Note: availability of undump is platform specific and may not be available
for a specific port of perl.
.TP 5
.B \-U
allows
.I perl
to do unsafe operations.
Currently the only \*(L"unsafe\*(R" operation is the unlinking of directories while
running as superuser.
.TP 5
.B \-v
prints the version and patchlevel of your
.I perl
executable.
.TP 5
.B \-w
prints warnings about identifiers that are mentioned only once, and scalar
variables that are used before being set.
Also warns about redefined subroutines, and references to undefined
filehandles or filehandles opened readonly that you are attempting to
write on.
Also warns you if you use == on values that don't look like numbers, and if
your subroutines recurse more than 100 deep.
.TP 5
.BI \-x directory
tells
.I perl
that the script is embedded in a message.
Leading garbage will be discarded until the first line that starts
with #! and contains the string "perl".
Any meaningful switches on that line will be applied (but only one
group of switches, as with normal #! processing).
If a directory name is specified, Perl will switch to that directory
before running the script.
The
.B \-x
switch only controls the disposal of leading garbage.
The script must be terminated with __END__ if there is trailing garbage
to be ignored (the script can process any or all of the trailing garbage
via the DATA filehandle if desired).
.Sh "Data Types and Objects"
.PP
.I Perl
has three data types: scalars, arrays of scalars, and
associative arrays of scalars.
Normal arrays are indexed by number, and associative arrays by string.
.PP
The interpretation of operations and values in perl sometimes
depends on the requirements
of the context around the operation or value.
There are three major contexts: string, numeric and array.
Certain operations return array values
in contexts wanting an array, and scalar values otherwise.
(If this is true of an operation it will be mentioned in the documentation
for that operation.)
Operations which return scalars don't care whether the context is looking
for a string or a number, but
scalar variables and values are interpreted as strings or numbers
as appropriate to the context.
A scalar is interpreted as TRUE in the boolean sense if it is not the null
string or 0.
Booleans returned by operators are 1 for true and 0 or \'\' (the null
string) for false.
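.PP
The truth rule above can be tried out with a short sketch.
The variable names are illustrative only, and the list of candidate
values is just a sample chosen for this example.

```perl
# Classify a few scalars the way a conditional would.
# Only the null string and 0 (numeric or as the string '0') are false here.
@candidates = ('', 0, '0', 'a', 1, -1);
@verdicts = ();
foreach $value (@candidates) {
    push(@verdicts, $value ? 'true' : 'false');
}
print join(' ', @verdicts), "\n";

# Booleans returned by comparison operators.
$yes = (2 > 1);     # true value
$no  = (1 > 2);     # false value, prints as the null string
print "yes is '$yes', no is '$no'\n";
```

Note that a negative number counts as true, since it is neither the
null string nor 0.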
.PP
There are actually two varieties of null string: defined and undefined.
Undefined null strings are returned when there is no real value for something,
such as when there was an error, or at end of file, or when you refer
to an uninitialized variable or element of an array.
An undefined null string may become defined the first time you access it, but
prior to that you can use the defined() operator to determine whether the
value is defined or not.
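.PP
A minimal sketch of the distinction, assuming a variable named $count
(a name made up for this example) that nothing has assigned to yet:

```perl
# $count has never been assigned, so it starts out undefined.
$before = defined($count) ? 'defined' : 'undefined';

# Assigning the null string makes it a defined null string.
$count = '';
$after = defined($count) ? 'defined' : 'undefined';

print "before: $before, after: $after\n";
```

Both states are false in a boolean test, so defined() is the way to
tell them apart.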
.PP
References to scalar variables always begin with \*(L'$\*(R', even when referring
to a scalar that is part of an array.
Thus:
.nf
.ne 3
.sp
	$days	\h'|2i'# a simple scalar variable
	$days[28]	\h'|2i'# 29th element of array @days
	$days{\'Feb\'}\h'|2i'# one value from an associative array
	$#days	\h'|2i'# last index of array @days
.sp
but entire arrays or array slices are denoted by \*(L'@\*(R':
.sp
	@days	\h'|2i'# ($days[0], $days[1],\|.\|.\|. $days[n])
	@days[3,4,5]\h'|2i'# same as @days[3.\|.5]
	@days{'a','c'}\h'|2i'# same as ($days{'a'},$days{'c'})
.sp
and entire associative arrays are denoted by \*(L'%\*(R':
.sp
	%days	\h'|2i'# (key1, val1, key2, val2 .\|.\|.)
.sp
.fi
Any of these eight constructs may serve as an lvalue,
that is, may be assigned to.
(It also turns out that an assignment is itself an lvalue in
certain contexts\*(--see examples under s, tr and chop.)
Assignment to a scalar evaluates the righthand side in a scalar context,
while assignment to an array or array slice evaluates the righthand side
in an array context.
.PP
You may find the length of array @days by evaluating
\*(L"$#days\*(R", as in
.IR csh .
(Actually, it's not the length of the array, it's the subscript of the last element, since there is (ordinarily) a 0th element.)
Assigning to $#days changes the length of the array.
Shortening an array by this method does not actually destroy any values.
Lengthening an array that was previously shortened recovers the values that
were in those elements.
You can also gain some measure of efficiency by preextending an array that
is going to get big.
(You can also extend an array by assigning to an element that is off the
end of the array.
This differs from assigning to $#whatever in that intervening values
are set to null rather than recovered.)
You can truncate an array down to nothing by assigning the null list () to
it.
The following are exactly equivalent
.nf
.sp
	@whatever = ();
	$#whatever = $[ \- 1;
.sp
.fi
.PP
If you evaluate an array in a scalar context, it returns the length of
the array.
The following is always true:
.nf
.sp
	@whatever == $#whatever \- $[ + 1;
.sp
.fi
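.PP
The relationship between the last subscript and the length can be
sketched as follows.
The example assumes the default array base of 0 and deliberately leaves
out the recovery of values after re-lengthening, since that detail may
differ between perl versions.

```perl
@days = ('Mon', 'Tue', 'Wed', 'Thu', 'Fri');

$last  = $#days;        # 4: subscript of the last element
$count = @days;         # 5: an array in a scalar context gives its length

$#days = 1;             # shorten the array to two elements
$short = @days;         # 2

$days[4] = 'Fri';       # assigning off the end extends the array
$longer = @days;        # 5

@days = ();             # truncate to nothing
$empty = $#days;        # -1, one less than the array base

print "$last $count $short $longer $empty\n";
```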
.PP
Multi-dimensional arrays are not directly supported, but see the discussion
of the $; variable later for a means of emulating multiple subscripts with
an associative array.
You could also write a subroutine to turn multiple subscripts into a single
subscript.
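.PP
Both approaches can be sketched in a few lines.
The subroutine name flat_index, the variable $COLS and the row-major
layout are assumptions made for this example, not something the
language prescribes.

```perl
# Hypothetical helper: map (row, column) onto one subscript of a flat
# array, assuming a fixed number of columns per row.
$COLS = 3;
sub flat_index {
    local($row, $col) = @_;
    $row * $COLS + $col;
}

@grid = ();
$grid[&flat_index(1, 2)] = 'x';         # stored at subscript 5

# The associative array route: a comma-separated subscript is joined
# with the value of $; to form a single string key.
$cell{1, 2} = 'y';
$same = $cell{join($;, 1, 2)};

print "flat: $grid[5], assoc: $same\n";
```

The subroutine keeps the data in a normal array, which suits dense
tables; the associative array suits sparse ones.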
.PP
Every data type has its own namespace.
You can, without fear of conflict, use the same name for a scalar variable,
an array, an associative array, a filehandle, a subroutine name, and/or
a label.
Since variable and array references always start with \*(L'$\*(R', \*(L'@\*(R',
or \*(L'%\*(R', the \*(L"reserved\*(R" words aren't in fact reserved
with respect to variable names.
(They ARE reserved with respect to labels and filehandles, however, which
don't have an initial special character.
Hint: you could say open(LOG,\'logfile\') rather than open(log,\'logfile\').
Using uppercase filehandles also improves readability and protects you
from conflict with future reserved words.)
Case IS significant\*(--\*(L"FOO\*(R", \*(L"Foo\*(R" and \*(L"foo\*(R" are all
different names.
Names which start with a letter may also contain digits and underscores.
Names which do not start with a letter are limited to one character,
e.g. \*(L"$%\*(R" or \*(L"$$\*(R".
(Most of the one character names have a predefined significance to
.IR perl .
More later.)
.PP
Numeric literals are specified in any of the usual floating point or
integer formats:
.nf
.ne 5
.sp
	12345
	12345.67
	.23E-10
	0xffff	# hex
	0377	# octal
.sp
.fi
String literals are delimited by either single or double quotes.
They work much like shell quotes:
double-quoted string literals are subject to backslash and variable
substitution; single-quoted strings are not (except for \e\' and \e\e).
The usual backslash rules apply for making characters such as newline, tab,
etc., as well as some more exotic forms:
.nf
.sp
	\et	tab
	\en	newline
	\er	return
	\ef	form feed
	\eb	backspace
	\ea	alarm (bell)
	\ee	escape
	\e033	octal char
	\ex1b	hex char
	\ec[	control char
	\el	lowercase next char
	\eu	uppercase next char
	\eL	lowercase till \eE
	\eU	uppercase till \eE
	\eE	end case modification
.sp
.fi
You can also embed newlines directly in your strings, i.e. they can end on
a different line than they begin.
This is nice, but if you forget your trailing quote, the error will not be
reported until
.I perl
finds another line containing the quote character, which
may be much further on in the script.
Variable substitution inside strings is limited to scalar variables, normal
array values, and array slices.
(In other words, identifiers beginning with $ or @, followed by an optional
bracketed expression as a subscript.)
The following code segment prints out \*(L"The price is $100.\*(R"
.nf
.ne 2
.sp
	$Price = \'$100\';\h'|3.5i'# not interpreted
	print "The price is $Price.\e\|n";\h'|3.5i'# interpreted
.sp
.fi
Note that you can put curly brackets around the identifier to delimit it
from following alphanumerics.
Also note that a single quoted string must be separated from a preceding
word by a space, since single quote is a valid character in an identifier
(see Packages).
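.PP
The curly bracket form matters whenever a variable is followed
directly by letters or digits.
A short sketch, with variable names invented for this example:

```perl
$day = 'Mon';

# Without brackets perl looks for a variable named $days, which is unset here.
$plain = "No classes on $days";

# With brackets the identifier ends at the closing bracket.
$braced = "No classes on ${day}s";

print "$plain\n$braced\n";
```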
.PP
Two special literals are __LINE__ and __FILE__, which represent the current
line number and filename at that point in your program.
They may only be used as separate tokens; they will not be interpolated
into strings.
In addition, the token __END__ may be used to indicate the logical end of the
script before the actual end of file.
Any following text is ignored (but may be read via the DATA filehandle).
The two control characters ^D and ^Z are synonyms for __END__.
.PP
A word that doesn't have any other interpretation in the grammar will be
treated as if it had single quotes around it.
For this purpose, a word consists only of alphanumeric characters and underline,
and must start with an alphabetic character.
As with filehandles and labels, a bare word that consists entirely of
lowercase letters risks conflict with future reserved words, and if you
use the
.B \-w
switch, Perl will warn you about any such words.
.PP
Array values are interpolated into double-quoted strings by joining all the
elements of the array with the delimiter specified in the $" variable,
space by default.
(Since in versions of perl prior to 3.0 the @ character was not a metacharacter
in double-quoted strings, the interpolation of @array, $array[EXPR],
@array[LIST], $array{EXPR}, or @array{LIST} only happens if array is
referenced elsewhere in the program or is predefined.)
The following are equivalent:
.nf
.ne 4
.sp
	$temp = join($",@ARGV);
	system "echo $temp";
.sp
	system "echo @ARGV";
.sp
.fi
Within search patterns (which also undergo double-quotish substitution)
there is a bad ambiguity: Is /$foo[bar]/ to be
interpreted as /${foo}[bar]/ (where [bar] is a character class for the
regular expression) or as /${foo[bar]}/ (where [bar] is the subscript to
array @foo)?
If @foo doesn't otherwise exist, then it's obviously a character class.
If @foo exists, perl takes a good guess about [bar], and is almost always right.
If it does guess wrong, or if you're just plain paranoid,
you can force the correct interpretation with curly brackets as above.
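.PP
The two forced interpretations can be compared side by side.
This is a sketch using a numeric subscript in place of the bar shown
above; the variable names and test strings are made up for the example.

```perl
$foo = 'ab';
@foo = ('zero', 'one');

# ${foo}[01] is the scalar $foo followed by the character class [01],
# so the pattern means "ab" followed by a 0 or a 1.
$class_hit  = ('ab1' =~ /${foo}[01]/) ? 1 : 0;
$class_miss = ('one' =~ /${foo}[01]/) ? 1 : 0;

# ${foo[1]} is element 1 of the array @foo, so the pattern means "one".
$elem_hit  = ('one' =~ /${foo[1]}/) ? 1 : 0;
$elem_miss = ('ab1' =~ /${foo[1]}/) ? 1 : 0;

print "$class_hit $class_miss $elem_hit $elem_miss\n";
```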
.PP
A line-oriented form of quoting is based on the shell here-is syntax.
Following a << you specify a string to terminate the quoted material, and all lines
following the current line down to the terminating string are the value
of the item.
The terminating string may be either an identifier (a word), or some
quoted text.
If quoted, the type of quotes you use determines the treatment of the text,
just as in regular quoting.
An unquoted identifier works like double quotes.
There must be no space between the << and the identifier.
(If you put a space it will be treated as a null identifier, which is
valid, and matches the first blank line\*(--see Merry Christmas example below.)
The terminating string must appear by itself (unquoted and with no surrounding
whitespace) on the terminating line.
.nf
.sp
	print <<EOF;		# same as above
The price is $Price.
EOF
.sp
	print <<"EOF";		# same as above
The price is $Price.
EOF
.sp
	print << x 10;		# null identifier is delimiter
Merry Christmas!

	print <<`EOC`;		# execute commands
echo hi there
echo lo there
EOC
.sp
	print <<foo, <<bar;	# you can stack them
I said foo.
foo
I said bar.
bar
.sp
.fi
Array literals are denoted by separating individual values by commas, and
enclosing the list in parentheses:
.nf
.sp
	(LIST)
.sp
.fi
In a context not requiring an array value, the value of the array literal
is the value of the final element, as in the C comma operator.
For example,
.nf
.ne 4
.sp
	@foo = (\'cc\', \'\-E\', $bar);
.sp
assigns the entire array value to array foo, but
.sp
	$foo = (\'cc\', \'\-E\', $bar);
.sp
.fi
assigns the value of variable bar to variable foo.
Note that the value of an actual array in a scalar context is the length
of the array; the following assigns to $foo the value 3:
.nf
.ne 2
.sp
	@foo = (\'cc\', \'\-E\', $bar);
	$foo = @foo;		# $foo gets 3
.sp
.fi
You may have an optional comma before the closing parenthesis of an
array literal, so that you can say:
.nf
.sp
	@foo = (
		1,
		2,
		3,
	);
.sp
.fi
When a LIST is evaluated, each element of the list is evaluated in
an array context, and the resulting array value is interpolated into LIST
just as if each individual element were a member of LIST.  Thus arrays
lose their identity in a LIST\*(--the list
.sp
	(@foo,@bar,&SomeSub)
.sp
contains all the elements of @foo followed by all the elements of @bar,
followed by all the elements returned by the subroutine named SomeSub.
.PP
A list value may also be subscripted like a normal array.
Examples:
.nf
.sp
	$time = (stat($file))[8];	# stat returns array value
	$digit = ('a','b','c','d','e','f')[$digit-10];
	return (pop(@foo),pop(@foo))[0];
.sp
.fi
.PP
Array lists may be assigned to if and only if each element of the list
is an lvalue:
.nf
.sp
	($a, $b, $c) = (1, 2, 3);
.sp
	($map{\'red\'}, $map{\'blue\'}, $map{\'green\'}) = (0x00f, 0x0f0, 0xf00);
.sp
The final element may be an array or an associative array:
.sp
	($a, $b, @rest) = split;
	local($a, $b, %rest) = @_;
.sp
.fi
You can actually put an array anywhere in the list, but the first array
in the list will soak up all the values, and anything after it will get
a null value.
This may be useful in a local().
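.PP
A small sketch of the soaking-up behavior, including its use in a
local().
The subroutine name describe and all variable names are hypothetical.

```perl
# The array in the middle takes everything that is left,
# so the scalar after it gets no value.
($first, @middle, $last) = (1, 2, 3, 4);
$middle = join(',', @middle);
$last_state = defined($last) ? 'defined' : 'undefined';
print "first=$first middle=$middle last is $last_state\n";

# In a subroutine: one named argument, then all remaining arguments.
sub describe {
    local($name, @values) = @_;
    local($howmany) = $#values + 1;
    "$name: $howmany value(s)";
}
$summary = &describe('sizes', 10, 20, 30);
print "$summary\n";
```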
.PP
An associative array literal contains pairs of values to be interpreted
as a key and a value:
.nf
.ne 2
.sp
	# same as map assignment above
	%map = ('red',0x00f,'blue',0x0f0,'green',0xf00);
.sp
.fi
Array assignment in a scalar context returns the number of elements
produced by the expression on the right side of the assignment:
.nf
.sp
	$x = (($foo,$bar) = (3,2,1));	# set $x to 3, not 2
.sp
.fi
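.PP
One common use of this rule is counting fields while storing them.
The sample line and variable names below are invented for the example.

```perl
$line = 'root:x:0:0:Super-User:/:/bin/sh';

# The inner assignment fills @fields; the outer one sees it in a
# scalar context and gets the number of elements produced by split.
$count = (@fields = split(/:/, $line));

# The count reflects the right side, even when the left side
# has room for only two of the values.
$copied = (($user, $password) = @fields);

print "$count fields, $copied offered, user is $user\n";
```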
.PP
There are several other pseudo-literals that you should know about.
If a string is enclosed by backticks (grave accents), it first undergoes
variable substitution just like a double quoted string.
It is then interpreted as a command, and the output of that command
is the value of the pseudo-literal, like in a shell.
In a scalar context, a single string consisting of all the output is
returned.
In an array context, an array of values is returned, one for each line
of output.
(You can set $/ to use a different line terminator.)
The command is executed each time the pseudo-literal is evaluated.
The status value of the command is returned in $? (see Predefined Names
for the interpretation of $?).
Unlike in \f2csh\f1, no translation is done on the return
data\*(--newlines remain newlines.
Unlike in any of the shells, single quotes do not hide variable names
in the command from interpretation.
To pass a $ through to the shell you need to hide it with a backslash.
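.PP
The points above can be seen in a short sketch.
It assumes a Unix-like system where the shell provides an echo command;
the variable names are illustrative.

```perl
$name = 'world';

# Scalar context: all the output as one string, newline included.
# $name is substituted by perl before the shell sees the command.
$greeting = `echo hello $name`;
$status = $?;                   # status of the command just run

# Array context: one element per line of output.
@lines = `echo one; echo two`;

# A backslash hides the $ from perl, so the shell expands HOME itself.
$home = `echo \$HOME`;
chop($home);

print $greeting, "status $status, ", $#lines + 1, " lines, home $home\n";
```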
.PP
Evaluating a filehandle in angle brackets yields the next line
from that file (newline included, so it's never false until EOF, at
which time an undefined value is returned).
Ordinarily you must assign that value to a variable,
but there is one situation where an automatic assignment happens.
If (and only if) the input symbol is the only thing inside the conditional of a
.I while
loop, the value is
automatically assigned to the variable \*(L"$_\*(R".
(This may seem like an odd thing to you, but you'll use the construct
in almost every
.I perl
script you write.)
Anyway, the following lines are equivalent to each other:
.nf
.ne 5
.sp
	while ($_ = <STDIN>) { print; }
	while (<STDIN>) { print; }
	for (\|;\|<STDIN>;\|) { print; }
	print while $_ = <STDIN>;
	print while <STDIN>;
.sp
.fi
The filehandles
.IR STDIN ,
.I STDOUT
and
.I STDERR
are predefined.
(The filehandles
.IR stdin ,
.I stdout
and
.I stderr
will also work except in packages, where they would be interpreted as
local identifiers rather than global.)
Additional filehandles may be created with the
.I open
function.
.PP
If a <FILEHANDLE> is used in a context that is looking for an array, an array
consisting of all the input lines is returned, one line per array element.
It's easy to make a LARGE data space this way, so use with care.
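.PP
The difference between the two contexts can be sketched with a scratch
file.
The file name under /tmp and the filehandle names are assumptions made
for this example.

```perl
# Create a small scratch file to read back (hypothetical path).
$file = "/tmp/perlman.$$";
open(OUT, ">$file") || die "Can't create $file: $!";
print OUT "alpha\nbeta\ngamma\n";
close(OUT);

open(IN, $file) || die "Can't open $file: $!";
$first = <IN>;          # scalar context: just the next line
@rest  = <IN>;          # array context: all the remaining lines
close(IN);
unlink($file);

print "first line: $first";
print "remaining lines: ", $#rest + 1, "\n";
```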
.PP
The null filehandle <> is special and can be used to emulate the behavior of
\fIsed\fR and \fIawk\fR.
Input from <> comes either from standard input, or from each file listed on
the command line.
Here's how it works: the first time <> is evaluated, the ARGV array is checked,
and if it is null, $ARGV[0] is set to \'-\', which when opened gives you standard
input.
The ARGV array is then processed as a list of filenames.
The loop
.nf
.ne 3
.sp
	while (<>) {
		.\|.\|.			# code for each line
	}
.ne 10
.sp
is equivalent to
.sp
	unshift(@ARGV, \'\-\') \|if \|$#ARGV < $[;
	while ($ARGV = shift) {
		open(ARGV, $ARGV);
		while (<ARGV>) {
			.\|.\|.		# code for each line
		}
	}
.sp
.fi
except that it isn't as cumbersome to say.
It really does shift array ARGV and put the current filename into
variable ARGV.
It also uses filehandle ARGV internally.
You can modify @ARGV before the first <> as long as you leave the first
filename at the beginning of the array.
Line numbers ($.) continue as if the input was one big happy file.
(But see example under eof for how to reset line numbers on each file.)
.PP
.ne 5
If you want to set @ARGV to your own list of files, go right ahead.
If you want to pass switches into your script, you can
put a loop on the front like this:
.nf
.ne 10
.sp
	while ($_ = $ARGV[0], /\|^\-/\|) {
		shift;
		last if /\|^\-\|\-$\|/\|;
		/\|^\-D\|(.*\|)/ \|&& \|($debug = $1);
		/\|^\-v\|/ \|&& \|$verbose++;
		.\|.\|.		# other switches
	}
	while (<>) {
		.\|.\|.		# code for each line
	}
.sp
.fi
The <> symbol will return FALSE only once.
If you call it again after this it will assume you are processing another
@ARGV list, and if you haven't set @ARGV, will input from
.IR STDIN .
.PP
If the string inside the angle brackets is a reference to a scalar variable
(e.g. <$foo>),
then that variable contains the name of the filehandle to input from.
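.PP
A minimal sketch of an indirect filehandle, again using a scratch file.
The path, the filehandle name LOG and the variable $handle are
assumptions made for this example.

```perl
# Create a two-line scratch file (hypothetical path).
$file = "/tmp/perlman.$$";
open(OUT, ">$file") || die "Can't create $file: $!";
print OUT "one\ntwo\n";
close(OUT);

open(LOG, $file) || die "Can't open $file: $!";

# $handle holds the NAME of the filehandle, not the file name.
$handle = 'LOG';
$count = 0;
while (<$handle>) {
    $count++;
}
close(LOG);
unlink($file);

print "read $count lines through $handle\n";
```

This is handy when a subroutine is told by name which of several open
files to read.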
.PP
If the string inside angle brackets is not a filehandle, it is interpreted
as a filename pattern to be globbed, and either an array of filenames or the
next filename in the list is returned, depending on context.
One level of $ interpretation is done first, but you can't say <$foo>
because that's an indirect filehandle as explained in the previous
paragraph.
You could insert curly brackets to force interpretation as a
filename glob: <${foo}>.
Example:
.nf
.ne 3
.sp
	while (<*.c>) {
		chmod 0644, $_;
	}
.sp
is equivalent to
.ne 5
.sp
	open(foo, "echo *.c | tr \-s \' \et\er\ef\' \'\e\e012\e\e012\e\e012\e\e012\'|");
	while (<foo>) {
		chop;
		chmod 0644, $_;
	}
.sp
.fi
In fact, it's currently implemented that way.
(Which means it will not work on filenames with spaces in them unless
you have /bin/csh on your machine.)
Of course, the shortest way to do the above is:
.nf
.sp
	chmod 0644, <*.c>;
.sp
.fi
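.PP
Globbing in an array context, with one level of $ interpretation, can
be sketched as follows.
The scratch directory under /tmp and the file names are made up for
this example.

```perl
# Build a scratch directory holding two .c files and one other file.
$dir = "/tmp/perlglob.$$";
mkdir($dir, 0755) || die "Can't mkdir $dir: $!";
foreach $name ('a.c', 'b.c', 'notes.txt') {
    open(OUT, ">$dir/$name") || die "Can't create $dir/$name: $!";
    close(OUT);
}

# $dir is substituted first, then the pattern is globbed.
@sources = <$dir/*.c>;
$count = @sources;
$changed = chmod 0644, @sources;    # chmod returns the number of files changed

# Clean up the scratch files and directory.
unlink(<$dir/*>);
rmdir($dir);

print "$count matched, $changed changed\n";
```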
1012.Sh "Syntax"
1015.I perl
1016script consists of a sequence of declarations and commands.
1017The only things that need to be declared in
1018.I perl
1019are report formats and subroutines.
1020See the sections below for more information on those declarations.
1021All uninitialized user-created objects are assumed to
1022start with a null or 0 value until they
1023are defined by some explicit operation such as assignment.
1024The sequence of commands is executed just once, unlike in
1025.I sed
1027.I awk
1028scripts, where the sequence of commands is executed for each input line.
1029While this means that you must explicitly loop over the lines of your input file
1030(or files), it also means you have much more control over which files and which
1031lines you look at.
1032(Actually, I'm lying\*(--it is possible to do an implicit loop with either the
1033.B \-n
1034or
1035.B \-p
1036switch.)
1038A declaration can be put anywhere a command can, but has no effect on the
1039execution of the primary sequence of commands\*(--declarations all take effect
1040at compile time.
1041Typically all the declarations are put at the beginning or the end of the script.
1043.I Perl
1044is, for the most part, a free-form language.
1045(The only exception to this is format declarations, for fairly obvious reasons.)
1046Comments are indicated by the # character, and extend to the end of the line.
1047If you attempt to use /* */ C comments, it will be interpreted either as
1048division or pattern matching, depending on the context.
1049So don't do that.
1050.Sh "Compound statements"
1051In
1052.IR perl ,
1053a sequence of commands may be treated as one command by enclosing it
1054in curly brackets.
1055We will call this a BLOCK.
1057The following compound commands may be used to control flow:
1059.ne 4
1061 if (EXPR) BLOCK
1062 if (EXPR) BLOCK else BLOCK
1063 if (EXPR) BLOCK elsif (EXPR) BLOCK .\|.\|. else BLOCK
1064 LABEL while (EXPR) BLOCK
1065 LABEL while (EXPR) BLOCK continue BLOCK
1066 LABEL for (EXPR; EXPR; EXPR) BLOCK
1067 LABEL foreach VAR (ARRAY) BLOCK
1068 LABEL BLOCK continue BLOCK
1071Note that, unlike C and Pascal, these are defined in terms of BLOCKs, not
1072statements.
1073This means that the curly brackets are \fIrequired\fR\*(--no dangling statements allowed.
1074If you want to write conditionals without curly brackets there are several
1075other ways to do it.
1076The following all do the same thing:
1078.ne 5
1080 if (!open(foo)) { die "Can't open $foo: $!"; }
1081 die "Can't open $foo: $!" unless open(foo);
1082 open(foo) || die "Can't open $foo: $!"; # foo or bust!
1083 open(foo) ? \'hi mom\' : die "Can't open $foo: $!";
1084 # a bit exotic, that last one
1088The
1089.I if
1090statement is straightforward.
1091Since BLOCKs are always bounded by curly brackets, there is never any
1092ambiguity about which
1093.I if
1094an
1095.I else
1096goes with.
1097If you use
1098.I unless
1099in place of
1100.IR if ,
1101the sense of the test is reversed.
1103The
1104.I while
1105statement executes the block as long as the expression is true
1106(does not evaluate to the null string or 0).
1107The LABEL is optional, and if present, consists of an identifier followed by
1108a colon.
1109The LABEL identifies the loop for the loop control statements
1110.IR next ,
1111.IR last ,
1112and
1113.I redo
1114(see below).
1115If there is a
1116.I continue
1117BLOCK, it is always executed just before
1118the conditional is about to be evaluated again, similarly to the third part
1119of a
1120.I for
1121loop in C.
1122Thus it can be used to increment a loop variable, even when the loop has
1123been continued via the
1124.I next
1125statement (similar to the C \*(L"continue\*(R" statement).
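A minimal sketch of the point about
.I continue
and
.IR next :
the loop below skips even values with next, yet the continue BLOCK still increments the counter. The names $i and $sum are illustrative only.

```perl
# Sketch: sum the odd numbers below 10; $i and $sum are illustrative names.
$i = 0;
$sum = 0;
while ($i < 10) {
    next if $i % 2 == 0;    # skip even values; the continue BLOCK still runs
    $sum += $i;
} continue {
    $i++;                   # executed before the condition is tested again
}
print "sum of odd numbers below 10: $sum\n";
```

Had the increment been placed at the bottom of the main BLOCK instead, the first next would skip it and the loop would never terminate.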
1127If the word
1128.I while
1129is replaced by the word
1130.IR until ,
1131the sense of the test is reversed, but the conditional is still tested before
1132the first iteration.
1134In either the
1135.I if
1136or the
1137.I while
1138statement, you may replace \*(L"(EXPR)\*(R" with a BLOCK, and the conditional
1139is true if the value of the last command in that block is true.
1141The
1142.I for
1143loop works exactly like the corresponding
1144.I while
1145loop:
1147.ne 12
1149 for ($i = 1; $i < 10; $i++) {
1150 .\|.\|.
1151 }
1153is the same as
1155 $i = 1;
1156 while ($i < 10) {
1157 .\|.\|.
1158 } continue {
1159 $i++;
1160 }
1163The foreach loop iterates over a normal array value and sets the variable
1164VAR to be each element of the array in turn.
1165The variable is implicitly local to the loop, and regains its former value
1166upon exiting the loop.
1167The \*(L"foreach\*(R" keyword is actually identical to the \*(L"for\*(R" keyword,
1168so you can use \*(L"foreach\*(R" for readability or \*(L"for\*(R" for brevity.
1169If VAR is omitted, $_ is set to each value.
1170If ARRAY is an actual array (as opposed to an expression returning an array
1171value), you can modify each element of the array
1172by modifying VAR inside the loop.
1175.ne 5
1177 for (@ary) { s/foo/bar/; }
1179 foreach $elem (@elements) {
1180 $elem *= 2;
1181 }
1182.ne 3
1184 for ((10,9,8,7,6,5,4,3,2,1,\'BOOM\')) {
1185 print $_, "\en"; sleep(1);
1186 }
1188 for (1..15) { print "Merry Christmas\en"; }
1189.ne 3
1191 foreach $item (split(/:[\e\e\en:]*/, $ENV{\'TERMCAP\'})) {
1192 print "Item: $item\en";
1193 }
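The two properties of the loop variable described above (it aliases the array elements, and it is local to the loop) can be checked with a short sketch. The names @prices and $p are hypothetical.

```perl
# Sketch: @prices and $p are illustrative names.
@prices = (10, 20, 30);
$p = 'outer';
foreach $p (@prices) {
    $p *= 2;                 # $p aliases each element, so @prices changes
}
print "@prices\n";           # the array itself has been modified
print "$p\n";                # $p has regained its former value
```

Looping over an expression such as (@prices, 40) would still set $p to each value in turn, but, as noted above, modifying elements through VAR is only promised when ARRAY is an actual array.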
1197The BLOCK by itself (labeled or not) is equivalent to a loop that executes
1198once.
1199Thus you can use any of the loop control statements in it to leave or
1200restart the block.
1201The
1202.I continue
1203block is optional.
1204This construct is particularly nice for doing case structures.
1206.ne 6
1208 foo: {
1209 if (/^abc/) { $abc = 1; last foo; }
1210 if (/^def/) { $def = 1; last foo; }
1211 if (/^xyz/) { $xyz = 1; last foo; }
1212 $nothing = 1;
1213 }
1216There is no official switch statement in perl, because there
1217are already several ways to write the equivalent.
1218In addition to the above, you could write
1220.ne 6
1222 foo: {
1223 $abc = 1, last foo if /^abc/;
1224 $def = 1, last foo if /^def/;
1225 $xyz = 1, last foo if /^xyz/;
1226 $nothing = 1;
1227 }
1229or
1230.ne 6
1232 foo: {
1233 /^abc/ && do { $abc = 1; last foo; };
1234 /^def/ && do { $def = 1; last foo; };
1235 /^xyz/ && do { $xyz = 1; last foo; };
1236 $nothing = 1;
1237 }
1239or
1240.ne 6
1242 foo: {
1243 /^abc/ && ($abc = 1, last foo);
1244 /^def/ && ($def = 1, last foo);
1245 /^xyz/ && ($xyz = 1, last foo);
1246 $nothing = 1;
1247 }
1249or even
1250.ne 8
1252 if (/^abc/)
1253 { $abc = 1; }
1254 elsif (/^def/)
1255 { $def = 1; }
1256 elsif (/^xyz/)
1257 { $xyz = 1; }
1258 else
1259 {$nothing = 1;}
1262As it happens, these are all optimized internally to a switch structure,
1263so perl jumps directly to the desired statement, and you needn't worry
1264about perl executing a lot of unnecessary statements when you have a string
1265of 50 elsifs, as long as you are testing the same simple scalar variable
1266using ==, eq, or pattern matching as above.
1267(If you're curious as to whether the optimizer has done this for a particular
1268case statement, you can use the \-D1024 switch to list the syntax tree
1269before execution.)
1270.Sh "Simple statements"
1271The only kind of simple statement is an expression evaluated for its side
1272effects.
1273Every expression (simple statement) must be terminated with a semicolon.
1274Note that this is like C, but unlike Pascal (and
1275.IR awk ).
1277Any simple statement may optionally be followed by a
1278single modifier, just before the terminating semicolon.
1279The possible modifiers are:
1281.ne 4
1283 if EXPR
1284 unless EXPR
1285 while EXPR
1286 until EXPR
1289The
1290.I if
1291and
1292.I unless
1293modifiers have the expected semantics.
1294The
1295.I while
1296and
1297.I until
1298modifiers also have the expected semantics (conditional evaluated first),
1299except when applied to a do-BLOCK or a do-SUBROUTINE command,
1300in which case the block executes once before the conditional is evaluated.
1301This is so that you can write loops like:
1303.ne 4
1305 do {
1306 $_ = <STDIN>;
1307 .\|.\|.
1308 } until $_ \|eq \|".\|\e\|n";
1311(See the
1312.I do
1313operator below. Note also that the loop control commands described later will
1314NOT work in this construct, since modifiers don't take loop labels.)
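A small sketch of the difference just described: the same modifier tests its condition first on an ordinary statement, but only after one pass on a do-BLOCK. The names $count and $skipped are illustrative.

```perl
# Sketch: $count and $skipped are illustrative names.
$count = 0;
do {
    $count++;               # runs once even though the condition is already true
} until 1;

$skipped = 0;
$skipped++ until 1;         # ordinary statement: condition tested first

print "do-BLOCK ran $count time(s); plain statement ran $skipped time(s)\n";
```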
1316.Sh "Expressions"
1317Since
1318.I perl
1319expressions work almost exactly like C expressions, only the differences
1320will be mentioned here.
1322Here's what
1323.I perl
1324has that C doesn't:
1325.Ip ** 8 2
1326The exponentiation operator.
1327.Ip **= 8
1328The exponentiation assignment operator.
1329.Ip (\|) 8 3
1330The null list, used to initialize an array to null.
1331.Ip . 8
1332Concatenation of two strings.
1333.Ip .= 8
1334The concatenation assignment operator.
1335.Ip eq 8
1336String equality (== is numeric equality).
1337For a mnemonic just think of \*(L"eq\*(R" as a string.
1338(If you are used to the
1339.I awk
1340behavior of using == for either string or numeric equality
1341based on the current form of the comparands, beware!
1342You must be explicit here.)
1343.Ip ne 8
1344String inequality (!= is numeric inequality).
1345.Ip lt 8
1346String less than.
1347.Ip gt 8
1348String greater than.
1349.Ip le 8
1350String less than or equal.
1351.Ip ge 8
1352String greater than or equal.
1353.Ip cmp 8
1354String comparison, returning -1, 0, or 1.
1355.Ip <=> 8
1356Numeric comparison, returning -1, 0, or 1.
1357.Ip =~ 8 2
1358Certain operations search or modify the string \*(L"$_\*(R" by default.
1359This operator makes that kind of operation work on some other string.
1360The right argument is a search pattern, substitution, or translation.
1361The left argument is what is supposed to be searched, substituted, or
1362translated instead of the default \*(L"$_\*(R".
1363The return value indicates the success of the operation.
1364(If the right argument is an expression other than a search pattern,
1365substitution, or translation, it is interpreted as a search pattern
1366at run time.
1367This is less efficient than an explicit search, since the pattern must
1368be compiled every time the expression is evaluated.)
1369The precedence of this operator is lower than unary minus and autoincrement/decrement, but higher than everything else.
1370.Ip !~ 8
1371Just like =~ except the return value is negated.
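A brief sketch of both binding operators, using hypothetical variables $line and $copy: the substitution is applied to $copy rather than $_, and the match results are kept as boolean values.

```perl
# Sketch: $line, $copy, $has_port and $no_user are illustrative names.
$line = 'host=alpha port=80';
($copy = $line) =~ s/alpha/beta/;      # substitute in $copy, leave $line alone
$has_port = ($line =~ /port=\d+/);     # true: the pattern matches
$no_user  = ($line !~ /user=/);        # true: the pattern does not match
print "$copy\n";
```

The parentheses around the assignment matter: without them the substitution would bind to $line, because =~ has higher precedence than assignment.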
1372.Ip x 8
1373The repetition operator.
1374Returns a string consisting of the left operand repeated the
1375number of times specified by the right operand.
1376In an array context, if the left operand is a list in parens, it repeats
1377the list.
1380 print \'\-\' x 80; # print row of dashes
1381 print \'\-\' x80; # illegal, x80 is identifier
1383 print "\et" x ($tab/8), \' \' x ($tab%8); # tab over
1385 @ones = (1) x 80; # an array of 80 1's
1386 @ones = (5) x @ones; # set all elements to 5
1389.Ip x= 8
1390The repetition assignment operator.
1391Only works on scalars.
1392.Ip .\|. 8
1393The range operator, which is really two different operators depending
1394on the context.
1395In an array context, returns an array of values counting (by ones)
1396from the left value to the right value.
1397This is useful for writing \*(L"for (1..10)\*(R" loops and for doing
1398slice operations on arrays.
1400In a scalar context, .\|. returns a boolean value.
1401The operator is bistable, like a flip-flop.
1402Each .\|. operator maintains its own boolean state.
1403It is false as long as its left operand is false.
1404Once the left operand is true, the range operator stays true
1405until the right operand is true,
1406AFTER which the range operator becomes false again.
1407(It doesn't become false till the next time the range operator is evaluated.
1408It can become false on the same evaluation it became true, but it still returns
1409true once.)
1410The right operand is not evaluated while the operator is in the \*(L"false\*(R" state,
1411and the left operand is not evaluated while the operator is in the \*(L"true\*(R" state.
1412The scalar .\|. operator is primarily intended for doing line number ranges
1413after
1414the fashion of \fIsed\fR or \fIawk\fR.
1415The precedence is a little lower than || and &&.
1416The value returned is either the null string for false, or a sequence number
1417(beginning with 1) for true.
1418The sequence number is reset for each range encountered.
1419The final sequence number in a range has the string \'E0\' appended to it, which
1420doesn't affect its numeric value, but gives you something to search for if you
1421want to exclude the endpoint.
1422You can exclude the beginning point by waiting for the sequence number to be
1423greater than 1.
1424If either operand of scalar .\|. is static, that operand is implicitly compared
1425to the $. variable, the current line number.
1428.ne 6
1430As a scalar operator:
1431 if (101 .\|. 200) { print; } # print 2nd hundred lines
1433 next line if (1 .\|. /^$/); # skip header lines
1435 s/^/> / if (/^$/ .\|. eof()); # quote body
1436.ne 4
1438As an array operator:
1439 for (101 .\|. 200) { print; } # print $_ 100 times
1441 @foo = @foo[$[ .\|. $#foo]; # an expensive no-op
1442 @foo = @foo[$#foo-4 .\|. $#foo]; # slice last 5 items
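The sequence numbers returned by the scalar form can be seen in a short sketch. It assumes an in-memory list instead of input lines, so both operands are ordinary expressions and nothing is compared against $. here; @lines, @seen and @inner are hypothetical names.

```perl
# Sketch: @lines, @seen and @inner are illustrative names.
@lines = ('a', 'BEGIN', 'b', 'END', 'c');
@seen = ();
@inner = ();
foreach $l (@lines) {
    $state = ($l eq 'BEGIN') .. ($l eq 'END');   # scalar context: flip-flop
    push(@seen, "$l=$state") if $state;          # true from BEGIN through END
    # exclude both endpoints: skip sequence number 1 and the 'E0' value
    push(@inner, $l) if $state > 1 && $state !~ /E0$/;
}
print join(' ', @seen), "\n";
print join(' ', @inner), "\n";
```

The first list records the sequence numbers 1, 2 and 3E0; the second keeps only the element strictly between the two markers.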
1445.Ip \-x 8
1446A file test.
1447This unary operator takes one argument, either a filename or a filehandle,
1448and tests the associated file to see if something is true about it.
1449If the argument is omitted, tests $_, except for \-t, which tests
1450.IR STDIN .
1451It returns 1 for true and \'\' for false, or the undefined value if the
1452file doesn't exist.
1453Precedence is higher than logical and relational operators, but lower than
1454arithmetic operators.
1455The operator may be any of:
1457 \-r File is readable by effective uid.
1458 \-w File is writable by effective uid.
1459 \-x File is executable by effective uid.
1460 \-o File is owned by effective uid.
1461 \-R File is readable by real uid.
1462 \-W File is writable by real uid.
1463 \-X File is executable by real uid.
1464 \-O File is owned by real uid.
1465 \-e File exists.
1466 \-z File has zero size.
1467 \-s File has non-zero size (returns size).
1468 \-f File is a plain file.
1469 \-d File is a directory.
1470 \-l File is a symbolic link.
1471 \-p File is a named pipe (FIFO).
1472 \-S File is a socket.
1473 \-b File is a block special file.
1474 \-c File is a character special file.
1475 \-u File has setuid bit set.
1476 \-g File has setgid bit set.
1477 \-k File has sticky bit set.
1478 \-t Filehandle is opened to a tty.
1479 \-T File is a text file.
1480 \-B File is a binary file (opposite of \-T).
1481 \-M Age of file in days when script started.
1482 \-A Same for access time.
1483 \-C Same for inode change time.
1486The interpretation of the file permission operators \-r, \-R, \-w, \-W, \-x and \-X
1487is based solely on the mode of the file and the uids and gids of the user.
1488There may be other reasons you can't actually read, write or execute the file.
1489Also note that, for the superuser, \-r, \-R, \-w and \-W always return 1, and
1490\-x and \-X return 1 if any execute bit is set in the mode.
1491Scripts run by the superuser may thus need to do a stat() in order to determine
1492the actual mode of the file, or temporarily set the uid to something else.
1494Example:
1496.ne 7
1498 while (<>) {
1499 chop;
1500 next unless \-f $_; # ignore specials
1501 .\|.\|.
1502 }
1505Note that \-s/a/b/ does not do a negated substitution.
1506Saying \-exp($foo) still works as expected, however\*(--only single letters
1507following a minus are interpreted as file tests.
1509The \-T and \-B switches work as follows.
1510The first block or so of the file is examined for odd characters such as
1511strange control codes or metacharacters.
1512If too many odd characters (>10%) are found, it's a \-B file, otherwise it's a \-T file.
1513Also, any file containing null in the first block is considered a binary file.
1514If \-T or \-B is used on a filehandle, the current stdio buffer is examined
1515rather than the first block.
1516Both \-T and \-B return TRUE on a null file, or a file at EOF when testing
1517a filehandle.
1519If any of the file tests (or either stat operator) are given the special
1520filehandle consisting of a solitary underline, then the stat structure
1521of the previous file test (or stat operator) is used, saving a system
1522call.
1523(This doesn't work with \-t, and you need to remember that lstat and -l
1524will leave values in the stat structure for the symbolic link, not the
1525real file.)
1529 print "Can do.\en" if -r $a || -w _ || -x _;
1530.ne 9
1532 stat($filename);
1533 print "Readable\en" if -r _;
1534 print "Writable\en" if -w _;
1535 print "Executable\en" if -x _;
1536 print "Setuid\en" if -u _;
1537 print "Setgid\en" if -g _;
1538 print "Sticky\en" if -k _;
1539 print "Text\en" if -T _;
1540 print "Binary\en" if -B _;
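As a runnable illustration of the underline filehandle, the sketch below creates a scratch file, tests it once by name, and then reuses the saved stat structure. The path in $file is hypothetical and assumes a writable /tmp.

```perl
# Sketch: $file is a hypothetical scratch path; assumes /tmp is writable.
$file = "/tmp/ftest$$";
open(TMP, ">$file") || die "Can't create $file: $!";
print TMP "hello\n";
close(TMP);

$size = -s $file;                   # stats the file; returns its size
$plain = -f _;                      # reuses the stat structure from -s
$is_dir = -d _;                     # still no new system call
$missing = -e "$file.absent";       # false: no such file
unlink($file);

print "size $size\n" if $plain;
```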
1544Here is what C has that
1545.I perl
1546doesn't:
1547.Ip "unary &" 12
1548Address-of operator.
1549.Ip "unary *" 12
1550Dereference-address operator.
1551.Ip "(TYPE)" 12
1552Type casting operator.
1554Like C,
1555.I perl
1556does a certain amount of expression evaluation at compile time, whenever
1557it determines that all of the arguments to an operator are static and have
1558no side effects.
1559In particular, string concatenation happens at compile time between literals that don't do variable substitution.
1560Backslash interpretation also happens at compile time.
1561You can say
1563.ne 2
1565 \'Now is the time for all\' . "\|\e\|n" .
1566 \'good men to come to.\'
1569and this all reduces to one string internally.
1571The autoincrement operator has a little extra built-in magic to it.
1572If you increment a variable that is numeric, or that has ever been used in
1573a numeric context, you get a normal increment.
1574If, however, the variable has only been used in string contexts since it
1575was set, and has a value that is not null and matches the
1576pattern /^[a\-zA\-Z]*[0\-9]*$/, the increment is done
1577as a string, preserving each character within its range, with carry:
1580 print ++($foo = \'99\'); # prints \*(L'100\*(R'
1581 print ++($foo = \'a0\'); # prints \*(L'a1\*(R'
1582 print ++($foo = \'Az\'); # prints \*(L'Ba\*(R'
1583 print ++($foo = \'zz\'); # prints \*(L'aaa\*(R'
1586The autodecrement is not magical.
1588The range operator (in an array context) makes use of the magical
1589autoincrement algorithm if the minimum and maximum are strings.
1590You can say
1592 @alphabet = (\'A\' .. \'Z\');
1594to get all the letters of the alphabet, or
1596 $hexdigit = (0 .. 9, \'a\' .. \'f\')[$num & 15];
1598to get a hexadecimal digit, or
1600 @z2 = (\'01\' .. \'31\'); print @z2[$mday];
1602to get dates with leading zeros.
1603(If the final value specified is not in the sequence that the magical increment
1604would produce, the sequence goes until the next value would be longer than
1605the final value specified.)
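A few more string ranges, as a sketch of the carry behavior described above. The variable names are illustrative only.

```perl
# Sketch: @letters, @days and $hex are illustrative names.
@letters = ('x' .. 'ab');                    # 'z' carries over to 'aa'
@days    = ('08' .. '11');                   # leading zero is preserved
$hex     = (0 .. 9, 'a' .. 'f')[255 & 15];   # element 15 of the digit list
print "@letters\n@days\n$hex\n";
```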
1607The || and && operators differ from C's in that, rather than returning 0 or 1,
1608they return the last value evaluated.
1609Thus, a portable way to find out the home directory might be:
1612 $home = $ENV{'HOME'} || $ENV{'LOGDIR'} ||
1613 (getpwuid($<))[7] || die "You're homeless!\en";
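The same behavior with simple values, as a minimal sketch (all variable names are hypothetical): each result is the last operand evaluated, not a 0 or 1 flag.

```perl
# Sketch: all variable names here are illustrative.
$empty = '';
$name  = $empty || 'anonymous';     # first true operand is returned
$port  = 0 || 8080;                 # 0 is false, so the right side is returned
$both  = 'left' && 'right';         # both true: the last value evaluated
$none  = 0 && 'never';              # evaluation stops at the false operand
print "$name $port $both\n";
```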
1617Along with the literals and variables mentioned earlier,
1618the operations in the following section can serve as terms in an expression.
1619Some of these operations take a LIST as an argument.
1620Such a list can consist of any combination of scalar arguments or array values;
1621the array values will be included in the list as if each individual element were
1622interpolated at that point in the list, forming a longer single-dimensional
1623array value.
1624Elements of the LIST should be separated by commas.
1625If an operation is listed both with and without parentheses around its
1626arguments, it means you can either use it as a unary operator or
1627as a function call.
1628To use it as a function call, the next token on the same line must
1629be a left parenthesis.
1630(There may be intervening white space.)
1631Such a function then has highest precedence, as you would expect from
1632a function.
1633If any token other than a left parenthesis follows, then it is a
1634unary operator, with a precedence depending only on whether it is a LIST
1635operator or not.
1636LIST operators have lowest precedence.
1637All other unary operators have a precedence greater than relational operators
1638but less than arithmetic operators.
1639See the section on Precedence.
1640.Ip "/PATTERN/" 8 4
1641See m/PATTERN/.
1642.Ip "?PATTERN?" 8 4
1643This is just like the /pattern/ search, except that it matches only once between
1644calls to the
1645.I reset
1646operator.
1647This is a useful optimization when you only want to see the first occurrence of
1648something in each file of a set of files, for instance.
1649Only ?? patterns local to the current package are reset.
1650.Ip "accept(NEWSOCKET,GENERICSOCKET)" 8 2
1651Does the same thing that the accept system call does.
1652Returns true if it succeeded, false otherwise.
1653See example in section on Interprocess Communication.
1654.Ip "alarm(SECONDS)" 8 4
1655.Ip "alarm SECONDS" 8
1656Arranges to have a SIGALRM delivered to this process after the specified number
1657of seconds (minus 1, actually) have elapsed. Thus, alarm(15) will cause
1658a SIGALRM at some point more than 14 seconds in the future.
1659Only one timer may be counting at once. Each call disables the previous
1660timer, and an argument of 0 may be supplied to cancel the previous timer
1661without starting a new one.
1662The returned value is the amount of time remaining on the previous timer.
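A minimal sketch of the return value, assuming a system that supports alarm: a timer is started and immediately cancelled, so no signal is ever delivered. The exact number returned may differ by a second between systems; $remaining is an illustrative name.

```perl
# Sketch: $remaining is an illustrative name.
alarm(30);                  # start a 30-second timer
$remaining = alarm(0);      # cancel it; returns the time that was left
print "$remaining seconds were left\n";
```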
1663.Ip "atan2(Y,X)" 8 2
1664Returns the arctangent of Y/X in the range
1665.if t \-\(*p to \(*p.
1666.if n \-PI to PI.
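For instance, since atan2(1,1) is one quarter of PI, the constant can be derived without a library. This is only a sketch; $pi, $left and $down are hypothetical names.

```perl
# Sketch: $pi, $left and $down are illustrative names.
$pi   = 4 * atan2(1, 1);    # atan2(1,1) is PI/4
$left = atan2(0, -1);       # the negative X axis: PI
$down = atan2(-1, 0);       # the negative Y axis: -PI/2
print "pi is about $pi\n";
```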
1667.Ip "bind(SOCKET,NAME)" 8 2
1668Does the same thing that the bind system call does.
1669Returns true if it succeeded, false otherwise.
1670NAME should be a packed address of the proper type for the socket.
1671See example in section on Interprocess Communication.
1672.Ip "binmode(FILEHANDLE)" 8 4
1673.Ip "binmode FILEHANDLE" 8 4
1674Arranges for the file to be read in \*(L"binary\*(R" mode in operating systems
1675that distinguish between binary and text files.
1676Files that are not read in binary mode have CR LF sequences translated
1677to LF on input and LF translated to CR LF on output.
1678Binmode has no effect under Unix.
1679If FILEHANDLE is an expression, the value is taken as the name of
1680the filehandle.
1681.Ip "caller(EXPR)"
1682.Ip "caller"
1683Returns the context of the current subroutine call:
1686 ($package,$filename,$line) = caller;
1689With EXPR, returns some extra information that the debugger uses to print
1690a stack trace. The value of EXPR indicates how many call frames to go
1691back before the current one.
1692.Ip "chdir(EXPR)" 8 2
1693.Ip "chdir EXPR" 8 2
1694Changes the working directory to EXPR, if possible.
1695If EXPR is omitted, changes to home directory.
1696Returns 1 upon success, 0 otherwise.
1697See example under
1698.IR die .
1699.Ip "chmod(LIST)" 8 2
1700.Ip "chmod LIST" 8 2
1701Changes the permissions of a list of files.
1702The first element of the list must be the numerical mode.
1703Returns the number of files successfully changed.
1705.ne 2
1707 $cnt = chmod 0755, \'foo\', \'bar\';
1708 chmod 0755, @executables;
1711.Ip "chop(LIST)" 8 7
1712.Ip "chop(VARIABLE)" 8
1713.Ip "chop VARIABLE" 8
1714.Ip "chop" 8
1715Chops off the last character of a string and returns the character chopped.
1716It's used primarily to remove the newline from the end of an input record,
1717but is much more efficient than s/\en// because it neither scans nor copies
1718the string.
1719If VARIABLE is omitted, chops $_.
1722.ne 5
1724 while (<>) {
1725 chop; # avoid \en on last field
1726 @array = split(/:/);
1727 .\|.\|.
1728 }
1731You can actually chop anything that's an lvalue, including an assignment:
1734 chop($cwd = \`pwd\`);
1735 chop($answer = <STDIN>);
1738If you chop a list, each element is chopped.
1739Only the value of the last chop is returned.
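A short sketch of the return values described above, with hypothetical variable names:

```perl
# Sketch: $line, $removed, @words and $last are illustrative names.
$line = "root:x:0\n";
$removed = chop($line);     # returns the newline; $line no longer ends in it
@words = ('ab', 'cd');
$last = chop(@words);       # every element loses its final character
print "$line @words $last\n";
```

Note that chop removes the last character whatever it is, so chopping a line that has no trailing newline discards real data.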
1740.Ip "chown(LIST)" 8 2
1741.Ip "chown LIST" 8 2
1742Changes the owner (and group) of a list of files.
1743The first two elements of the list must be the NUMERICAL uid and gid,
1744in that order.
1745Returns the number of files successfully changed.
1747.ne 2
1749 $cnt = chown $uid, $gid, \'foo\', \'bar\';
1750 chown $uid, $gid, @filenames;
1751.ne 23
1754Here's an example that looks up non-numeric uids in the passwd file:
1757 print "User: ";
1758 $user = <STDIN>;
1759 chop($user);
1760 print "Files: ";
1761 $pattern = <STDIN>;
1762 chop($pattern);
1763.ie t \{\
1764 open(pass, \'/etc/passwd\') || die "Can't open passwd: $!\en";
1765'br\}
1766.el \{\
1767 open(pass, \'/etc/passwd\')
1768 || die "Can't open passwd: $!\en";
1769'br\}
1770 while (<pass>) {
1771 ($login,$pass,$uid,$gid) = split(/:/);
1772 $uid{$login} = $uid;
1773 $gid{$login} = $gid;
1774 }
1775 @ary = <${pattern}>; # get filenames
1776 if ($uid{$user} eq \'\') {
1777 die "$user not in passwd file";
1778 }
1779 else {
1780 chown $uid{$user}, $gid{$user}, @ary;
1781 }
1784.Ip "chroot(FILENAME)" 8 5
1785.Ip "chroot FILENAME" 8
1786Does the same as the system call of that name.
1787If you don't know what it does, don't worry about it.
1788If FILENAME is omitted, does chroot to $_.
1789.Ip "close(FILEHANDLE)" 8 5
1790.Ip "close FILEHANDLE" 8
1791Closes the file or pipe associated with the file handle.
1792You don't have to close FILEHANDLE if you are immediately going to
1793do another open on it, since open will close it for you.
1794(See
1795.IR open .)
1796However, an explicit close on an input file resets the line counter ($.), while
1797the implicit close done by
1798.I open
1799does not.
1800Also, closing a pipe will wait for the process executing on the pipe to complete,
1801in case you want to look at the output of the pipe afterwards.
1802Closing a pipe explicitly also puts the status value of the command into $?.
1805 4
1807 open(OUTPUT, \'|sort >foo\'); # pipe to sort
1808 .\|.\|. # print stuff to output
1809 close OUTPUT; # wait for sort to finish
1810 open(INPUT, \'foo\'); # get sort's results
1813FILEHANDLE may be an expression whose value gives the real filehandle name.
1814.Ip "closedir(DIRHANDLE)" 8 5
1815.Ip "closedir DIRHANDLE" 8
1816Closes a directory opened by opendir().
1817.Ip "connect(SOCKET,NAME)" 8 2
1818Does the same thing that the connect system call does.
1819Returns true if it succeeded, false otherwise.
1820NAME should be a packed address of the proper type for the socket.
1821See example in section on Interprocess Communication.
1822.Ip "cos(EXPR)" 8 6
1823.Ip "cos EXPR" 8 6
1824Returns the cosine of EXPR (expressed in radians).
1825If EXPR is omitted takes cosine of $_.
1826.Ip "crypt(PLAINTEXT,SALT)" 8 6
1827Encrypts a string exactly like the crypt() function in the C library.
1828Useful for checking the password file for lousy passwords.
1829Only the guys wearing white hats should do this.
1830.Ip "dbmclose(ASSOC_ARRAY)" 8 6
1831.Ip "dbmclose ASSOC_ARRAY" 8
1832Breaks the binding between a dbm file and an associative array.
1833The values remaining in the associative array are meaningless unless
1834you happen to want to know what was in the cache for the dbm file.
1835This function is only useful if you have ndbm.
1836.Ip "dbmopen(ASSOC,DBNAME,MODE)" 8 6
1837This binds a dbm or ndbm file to an associative array.
1838ASSOC is the name of the associative array.
1839(Unlike normal open, the first argument is NOT a filehandle, even though
1840it looks like one).
1841DBNAME is the name of the database (without the .dir or .pag extension).
1842If the database does not exist, it is created with protection specified
1843by MODE (as modified by the umask).
1844If your system only supports the older dbm functions, you may only have one
1845dbmopen in your program.
1846If your system has neither dbm nor ndbm, calling dbmopen produces a fatal
1847error.
1849Values assigned to the associative array prior to the dbmopen are lost.
1850A certain number of values from the dbm file are cached in memory.
1851By default this number is 64, but you can increase it by preallocating
1852that number of garbage entries in the associative array before the dbmopen.
1853You can flush the cache if necessary with the reset command.
1855If you don't have write access to the dbm file, you can only read
1856associative array variables, not set them.
1857If you want to test whether you can write, either use file tests or
1858try setting a dummy array entry inside an eval, which will trap the error.
1860Note that functions such as keys() and values() may return huge array values
1861when used on large dbm files.
1862You may prefer to use the each() function to iterate over large dbm files.
1865.ne 6
1867 # print out history file offsets
1868 dbmopen(HIST,'/usr/lib/news/history',0666);
1869 while (($key,$val) = each %HIST) {
1870 print $key, ' = ', unpack('L',$val), "\en";
1871 }
1872 dbmclose(HIST);
1875.Ip "defined(EXPR)" 8 6
1876.Ip "defined EXPR" 8
1877Returns a boolean value saying whether the lvalue EXPR has a real value
1878or not.
1879Many operations return the undefined value under exceptional conditions,
1880such as end of file, uninitialized variable, system error and such.
1881This function allows you to distinguish between an undefined null string
1882and a defined null string with operations that might return a real null
1883string, in particular referencing elements of an array.
1884You may also check to see if arrays or subroutines exist.
1885Use on predefined variables is not guaranteed to produce intuitive results.
1888.ne 7
1890 print if defined $switch{'D'};
1891 print "$val\en" while defined($val = pop(@ary));
1892 die "Can't readlink $sym: $!"
1893 unless defined($value = readlink $sym);
1894 eval '@foo = ()' if defined(@foo);
1895 die "No XYZ package defined" unless defined %_XYZ;
1896 sub foo { defined &bar ? &bar(@_) : die "No bar"; }
1899See also undef.
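The distinction between a defined null string and an undefined value can be sketched as follows; %opt, @empty and the other names are hypothetical.

```perl
# Sketch: %opt, @empty and the scalar names are illustrative.
%opt = ('v', '');                   # key 'v' exists with a null-string value
$has_v = defined $opt{'v'};         # true: defined, though null
$has_q = defined $opt{'q'};         # false: no such element
@empty = ();
$popped = pop(@empty);              # pop on an empty array yields undef
$got = defined($popped) ? 'value' : 'nothing';
print "$got\n";
```

A plain truth test would treat $opt{'v'} and $opt{'q'} alike, since both are false.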
1900.Ip "delete $ASSOC{KEY}" 8 6
1901Deletes the specified value from the specified associative array.
1902Returns the deleted value, or the undefined value if nothing was deleted.
1903Deleting from $ENV{} modifies the environment.
1904Deleting from an array bound to a dbm file deletes the entry from the dbm
1905file.
1907The following deletes all the values of an associative array:
1909.ne 3
1911 foreach $key (keys %ARRAY) {
1912 delete $ARRAY{$key};
1913 }
1916(But it would be faster to use the
1917.I reset
1918command.
1919Saying undef %ARRAY is faster yet.)
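A small sketch of the return value, using a hypothetical %age array:

```perl
# Sketch: %age and the scalar names are illustrative.
%age = ('tom', 30, 'ann', 25);
$gone  = delete $age{'tom'};            # returns the deleted value
$again = delete $age{'tom'};            # nothing left to delete: undefined
$left  = join(',', sort keys %age);     # only the other key remains
print "deleted $gone, remaining: $left\n";
```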
1920.Ip "die(LIST)" 8
1921.Ip "die LIST" 8
1922Outside of an eval, prints the value of LIST to
1923.I STDERR
1924and exits with the current value of $!
1925(errno).
1926If $! is 0, exits with the value of ($? >> 8) (\`command\` status).
1927If ($? >> 8) is 0, exits with 255.
1928Inside an eval, the error message is stuffed into $@ and the eval is terminated
1929with the undefined value.
1931Equivalent examples:
1933.ne 3
1935.ie t \{\
1936 die "Can't cd to spool: $!\en" unless chdir \'/usr/spool/news\';
1937'br\}
1938.el \{\
1939 die "Can't cd to spool: $!\en"
1940 unless chdir \'/usr/spool/news\';
1941'br\}
1943 chdir(\'/usr/spool/news\') || die "Can't cd to spool: $!\en";
1947If the value of EXPR does not end in a newline, the current script line
1948number and input line number (if any) are also printed, and a newline is
1949supplied.
1950Hint: sometimes appending \*(L", stopped\*(R" to your message will cause it to make
1951better sense when the string \*(L"at foo line 123\*(R" is appended.
1952Suppose you are running script \*(L"canasta\*(R".
1954.ne 7
1956 die "/etc/games is no good";
1957 die "/etc/games is no good, stopped";
1959produce, respectively
1961 /etc/games is no good at canasta line 123.
1962 /etc/games is no good, stopped at canasta line 123.
1965See also
1966.IR exit .
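The behavior inside an eval can be sketched as follows. The messages and variable names are hypothetical; the file name that appears in the trapped message depends on how the code is run.

```perl
# Sketch: the messages and variable names are illustrative.
$result = eval 'die "no good, stopped"; 1;';
$trapped = $@;              # message plus "at ... line N." and a newline

eval 'die "plain\n";';      # message already ends in a newline
$verbatim = $@;             # so nothing is appended

print $trapped, $verbatim;
```

Because the die is trapped, the script keeps running, and $result is undefined rather than 1.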
1967.Ip "do BLOCK" 8 4
1968Returns the value of the last command in the sequence of commands indicated
1969by BLOCK.
1970When modified by a loop modifier, executes the BLOCK once before testing the
1971loop condition.
1972(On other statements the loop modifiers test the conditional first.)
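As a minimal sketch, a do-BLOCK can supply a value in the middle of an expression; $n and $parity are illustrative names.

```perl
# Sketch: $n and $parity are illustrative names.
$n = 7;
$parity = do { $n % 2 ? 'odd' : 'even'; };   # value of the last command
print "$n is $parity\n";
```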
1973.Ip "do SUBROUTINE (LIST)" 8 3
1974Executes a SUBROUTINE declared by a
1975.I sub
1976declaration, and returns the value
1977of the last expression evaluated in SUBROUTINE.
1978If there is no subroutine by that name, produces a fatal error.
1979(You may use the \*(L"defined\*(R" operator to determine if a subroutine
1980exists.)
1981If you pass arrays as part of LIST you may wish to pass the length
1982of the array in front of each array.
1983(See the section on subroutines later on.)
1984SUBROUTINE may be a scalar variable, in which case the variable contains
1985the name of the subroutine to execute.
1986The parentheses are required to avoid confusion with the \*(L"do EXPR\*(R"
1987form.
1989As an alternate form, you may call a subroutine by prefixing the name with
1990an ampersand: &foo(@args).
1991If you aren't passing any arguments, you don't have to use parentheses.
1992If you omit the parentheses, no @_ array is passed to the subroutine.
The & form is also used to specify subroutines to the defined and undef
operators.
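As a minimal sketch of the difference between the two ampersand forms, assuming a perl 5 interpreter (the subroutine names are hypothetical and exist only for this illustration): with parentheses an empty @_ is passed, while without them no new @_ is set up, so the called subroutine sees the @_ of its caller.

```perl
# Hypothetical subroutines, named only for this illustration.
sub count_args {
    return scalar(@_);        # number of arguments visible to this call
}

sub with_parens {
    return &count_args();     # parentheses given: an empty @_ is passed
}

sub without_parens {
    return &count_args;       # parentheses omitted: no new @_ is set up
}

$explicit = &with_parens('a', 'b', 'c');
$implicit = &without_parens('a', 'b', 'c');
print "with parens: $explicit, without: $implicit\n";
```

If a subroutine must never see its caller's arguments, the parenthesized form is the safer habit.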
1995.Ip "do EXPR" 8 3
1996Uses the value of EXPR as a filename and executes the contents of the file
as a
.I perl
script.
Its primary use is to include subroutines from a
2001.I perl
2002subroutine library.
2005 do \'\';
2007is just like
2009 eval \`cat\`;
2012except that it's more efficient, more concise, keeps track of the current
2013filename for error messages, and searches all the
2014.B \-I
2015libraries if the file
2016isn't in the current directory (see also the @INC array in Predefined Names).
2017It's the same, however, in that it does reparse the file every time you
2018call it, so if you are going to use the file inside a loop you might prefer
2019to use \-P and #include, at the expense of a little more startup time.
2020(The main problem with #include is that cpp doesn't grok # comments\*(--a
2021workaround is to use \*(L";#\*(R" for standalone comments.)
2022Note that the following are NOT equivalent:
.ne 2
2026 do $foo; # eval a file
2027 do $foo(); # call a subroutine
2030Note that inclusion of library routines is better done with
2031the \*(L"require\*(R" operator.
2032.Ip "dump LABEL" 8 6
2033This causes an immediate core dump.
2034Primarily this is so that you can use the undump program to turn your
2035core dump into an executable binary after having initialized all your
2036variables at the beginning of the program.
2037When the new binary is executed it will begin by executing a "goto LABEL"
2038(with all the restrictions that goto suffers).
2039Think of it as a goto with an intervening core dump and reincarnation.
2040If LABEL is omitted, restarts the program from the top.
2041WARNING: any files opened at the time of the dump will NOT be open any more
2042when the program is reincarnated, with possible resulting confusion on the part
2043of perl.
2044See also \-u.
.ne 16
2050 #!/usr/bin/perl
2051 require '';
2052 require '';
2053 %days = (
2054 'Sun',1,
2055 'Mon',2,
2056 'Tue',3,
2057 'Wed',4,
2058 'Thu',5,
2059 'Fri',6,
2060 'Sat',7);
	dump QUICKSTART if $ARGV[0] eq '-d';

    QUICKSTART:
	do Getopt('f');
2068.Ip "each(ASSOC_ARRAY)" 8 6
2069.Ip "each ASSOC_ARRAY" 8
2070Returns a 2 element array consisting of the key and value for the next
2071value of an associative array, so that you can iterate over it.
2072Entries are returned in an apparently random order.
2073When the array is entirely read, a null array is returned (which when
2074assigned produces a FALSE (0) value).
2075The next call to each() after that will start iterating again.
2076The iterator can be reset only by reading all the elements from the array.
2077You must not modify the array while iterating over it.
2078There is a single iterator for each associative array, shared by all
2079each(), keys() and values() function calls in the program.
2080The following prints out your environment like the printenv program, only
2081in a different order:
.ne 3
2085 while (($key,$value) = each %ENV) {
2086 print "$key=$value\en";
2087 }
2090See also keys() and values().
2091.Ip "eof(FILEHANDLE)" 8 8
2092.Ip "eof()" 8
2093.Ip "eof" 8
2094Returns 1 if the next read on FILEHANDLE will return end of file, or if
2095FILEHANDLE is not open.
2096FILEHANDLE may be an expression whose value gives the real filehandle name.
2097(Note that this function actually reads a character and then ungetc's it,
2098so it is not very useful in an interactive context.)
2099An eof without an argument returns the eof status for the last file read.
2100Empty parentheses () may be used to indicate the pseudo file formed of the
2101files listed on the command line, i.e. eof() is reasonable to use inside
2102a while (<>) loop to detect the end of only the last file.
2103Use eof(ARGV) or eof without the parentheses to test EACH file in a while (<>) loop.
.ne 7
2108 # insert dashes just before last line of last file
2109 while (<>) {
2110 if (eof()) {
2111 print "\-\|\-\|\-\|\-\|\-\|\-\|\-\|\-\|\-\|\-\|\-\|\-\|\-\|\-\en";
2112 }
2113 print;
2114 }
.ne 7
2117 # reset line numbering on each input file
2118 while (<>) {
2119 print "$.\et$_";
2120 if (eof) { # Not eof().
2121 close(ARGV);
2122 }
2123 }
2126.Ip "eval(EXPR)" 8 6
2127.Ip "eval EXPR" 8 6
EXPR is parsed and executed as if it were a little
.I perl
program.
It is executed in the context of the current
2132.I perl
2133program, so that
2134any variable settings, subroutine or format definitions remain afterwards.
2135The value returned is the value of the last expression evaluated, just
2136as with subroutines.
2137If there is a syntax error or runtime error, or a die statement is
2138executed, an undefined value is returned by
2139eval, and $@ is set to the error message.
2140If there was no error, $@ is guaranteed to be a null string.
2141If EXPR is omitted, evaluates $_.
2142The final semicolon, if any, may be omitted from the expression.
2144Note that, since eval traps otherwise-fatal errors, it is useful for
2145determining whether a particular feature
2146(such as dbmopen or symlink) is implemented.
2147It is also Perl's exception trapping mechanism, where the die operator is
2148used to raise exceptions.
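The exception-trapping use of eval can be sketched as follows. This is only an illustration: the subroutine safe_divide and the variable names are hypothetical, and the sketch assumes nothing beyond the eval and die behavior described above.

```perl
# Hypothetical helper, for illustration only: raises an exception with die.
sub safe_divide {
    local($num, $den) = @_;
    die "division by zero\n" if $den == 0;
    return $num / $den;
}

$good     = eval '&safe_divide(10, 2)';
$good_err = $@;                 # null string: no error occurred

$bad      = eval '&safe_divide(1, 0)';
$bad_err  = $@;                 # the message that was passed to die

print "good=$good\n";
print "trapped: $bad_err" unless defined($bad);
```

Copying $@ into another variable right after each eval keeps the message from being overwritten by a later eval.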
2149.Ip "exec(LIST)" 8 8
2150.Ip "exec LIST" 8 6
2151If there is more than one argument in LIST, or if LIST is an array with
2152more than one value,
2153calls execvp() with the arguments in LIST.
2154If there is only one scalar argument, the argument is checked for shell metacharacters.
2155If there are any, the entire argument is passed to \*(L"/bin/sh \-c\*(R" for parsing.
2156If there are none, the argument is split into words and passed directly to
2157execvp(), which is more efficient.
2158Note: exec (and system) do not flush your output buffer, so you may need to
2159set $| to avoid lost output.
2163 exec \'/bin/echo\', \'Your arguments are: \', @ARGV;
2164 exec "sort $outfile | uniq";
2168If you don't really want to execute the first argument, but want to lie
2169to the program you are executing about its own name, you can specify
2170the program you actually want to run by assigning that to a variable and
2171putting the name of the variable in front of the LIST without a comma.
2172(This always forces interpretation of the LIST as a multi-valued list, even
2173if there is only a single scalar in the list.)
.ne 2
2178 $shell = '/bin/csh';
2179 exec $shell '-sh'; # pretend it's a login shell
2182.Ip "exit(EXPR)" 8 6
2183.Ip "exit EXPR" 8
2184Evaluates EXPR and exits immediately with that value.
.ne 2
2189 $ans = <STDIN>;
2190 exit 0 \|if \|$ans \|=~ \|/\|^[Xx]\|/\|;
2193See also
2194.IR die .
2195If EXPR is omitted, exits with 0 status.
2196.Ip "exp(EXPR)" 8 3
2197.Ip "exp EXPR" 8
Returns
.I e
2200to the power of EXPR.
2201If EXPR is omitted, gives exp($_).
.Ip "fcntl(FILEHANDLE,FUNCTION,SCALAR)" 8 4
Implements the fcntl(2) function.
2204You'll probably have to say
2207 require ""; # probably /usr/local/lib/perl/
2210first to get the correct function definitions.
If that file doesn't exist or doesn't have the correct definitions
2212you'll have to roll
2213your own, based on your C header files such as <sys/fcntl.h>.
2214(There is a perl script called h2ph that comes with the perl kit
2215which may help you in this.)
2216Argument processing and value return works just like ioctl below.
Note that fcntl will produce a fatal error if used on a machine that doesn't implement
fcntl(2).
2219.Ip "fileno(FILEHANDLE)" 8 4
2220.Ip "fileno FILEHANDLE" 8 4
2221Returns the file descriptor for a filehandle.
2222Useful for constructing bitmaps for select().
2223If FILEHANDLE is an expression, the value is taken as the name of
2224the filehandle.
2225.Ip "flock(FILEHANDLE,OPERATION)" 8 4
2226Calls flock(2) on FILEHANDLE.
2227See manual page for flock(2) for definition of OPERATION.
2228Returns true for success, false on failure.
Will produce a fatal error if used on a machine that doesn't implement
flock(2).
2231Here's a mailbox appender for BSD systems.
.ne 20
2235 $LOCK_SH = 1;
2236 $LOCK_EX = 2;
2237 $LOCK_NB = 4;
2238 $LOCK_UN = 8;
2240 sub lock {
2241 flock(MBOX,$LOCK_EX);
2242 # and, in case someone appended
2243 # while we were waiting...
2244 seek(MBOX, 0, 2);
2245 }
2247 sub unlock {
2248 flock(MBOX,$LOCK_UN);
2249 }
2251 open(MBOX, ">>/usr/spool/mail/$ENV{'USER'}")
2252 || die "Can't open mailbox: $!";
2254 do lock();
2255 print MBOX $msg,"\en\en";
2256 do unlock();
2259.Ip "fork" 8 4
2260Does a fork() call.
2261Returns the child pid to the parent process and 0 to the child process.
2262Note: unflushed buffers remain unflushed in both processes, which means
2263you may need to set $| to avoid duplicate output.
2264.Ip "getc(FILEHANDLE)" 8 4
2265.Ip "getc FILEHANDLE" 8
2266.Ip "getc" 8
2267Returns the next character from the input file attached to FILEHANDLE, or
2268a null string at EOF.
2269If FILEHANDLE is omitted, reads from STDIN.
2270.Ip "getlogin" 8 3
2271Returns the current login from /etc/utmp, if any.
2272If null, use getpwuid.
2274 $login = getlogin || (getpwuid($<))[0] || "Somebody";
2276.Ip "getpeername(SOCKET)" 8 3
2277Returns the packed sockaddr address of other end of the SOCKET connection.
.ne 4
	# An internet sockaddr
	$sockaddr = 'S n a4 x8';
	$hersockaddr = getpeername(S);
.ie t \{\
	($family, $port, $heraddr) = unpack($sockaddr,$hersockaddr);
'br\}
.el \{\
	($family, $port, $heraddr) =
			unpack($sockaddr,$hersockaddr);
'br\}
2293.Ip "getpgrp(PID)" 8 4
2294.Ip "getpgrp PID" 8
Returns the current process group for the specified PID, 0 for the current
process.
Will produce a fatal error if used on a machine that doesn't implement
getpgrp(2).
If PID is omitted, returns process group of current process.
2300.Ip "getppid" 8 4
2301Returns the process id of the parent process.
2302.Ip "getpriority(WHICH,WHO)" 8 4
2303Returns the current priority for a process, a process group, or a user.
2304(See getpriority(2).)
Will produce a fatal error if used on a machine that doesn't implement
getpriority(2).
2307.Ip "getpwnam(NAME)" 8
2308.Ip "getgrnam(NAME)" 8
2309.Ip "gethostbyname(NAME)" 8
2310.Ip "getnetbyname(NAME)" 8
2311.Ip "getprotobyname(NAME)" 8
2312.Ip "getpwuid(UID)" 8
2313.Ip "getgrgid(GID)" 8
2314.Ip "getservbyname(NAME,PROTO)" 8
2315.Ip "gethostbyaddr(ADDR,ADDRTYPE)" 8
2316.Ip "getnetbyaddr(ADDR,ADDRTYPE)" 8
2317.Ip "getprotobynumber(NUMBER)" 8
2318.Ip "getservbyport(PORT,PROTO)" 8
2319.Ip "getpwent" 8
2320.Ip "getgrent" 8
2321.Ip "gethostent" 8
2322.Ip "getnetent" 8
2323.Ip "getprotoent" 8
2324.Ip "getservent" 8
2325.Ip "setpwent" 8
2326.Ip "setgrent" 8
2327.Ip "sethostent(STAYOPEN)" 8
2328.Ip "setnetent(STAYOPEN)" 8
2329.Ip "setprotoent(STAYOPEN)" 8
2330.Ip "setservent(STAYOPEN)" 8
2331.Ip "endpwent" 8
2332.Ip "endgrent" 8
2333.Ip "endhostent" 8
2334.Ip "endnetent" 8
2335.Ip "endprotoent" 8
2336.Ip "endservent" 8
2337These routines perform the same functions as their counterparts in the
2338system library.
2339The return values from the various get routines are as follows:
2342 ($name,$passwd,$uid,$gid,
2343 $quota,$comment,$gcos,$dir,$shell) = getpw.\|.\|.
2344 ($name,$passwd,$gid,$members) = getgr.\|.\|.
2345 ($name,$aliases,$addrtype,$length,@addrs) = gethost.\|.\|.
2346 ($name,$aliases,$addrtype,$net) = getnet.\|.\|.
2347 ($name,$aliases,$proto) = getproto.\|.\|.
2348 ($name,$aliases,$port,$proto) = getserv.\|.\|.
2351The $members value returned by getgr.\|.\|. is a space separated list
2352of the login names of the members of the group.
2354The @addrs value returned by the gethost.\|.\|. functions is a list of the
2355raw addresses returned by the corresponding system library call.
2356In the Internet domain, each address is four bytes long and you can unpack
2357it by saying something like:
2360 ($a,$b,$c,$d) = unpack('C4',$addr[0]);
2363.Ip "getsockname(SOCKET)" 8 3
2364Returns the packed sockaddr address of this end of the SOCKET connection.
.ne 4
	# An internet sockaddr
	$sockaddr = 'S n a4 x8';
	$mysockaddr = getsockname(S);
.ie t \{\
	($family, $port, $myaddr) = unpack($sockaddr,$mysockaddr);
'br\}
.el \{\
	($family, $port, $myaddr) =
			unpack($sockaddr,$mysockaddr);
'br\}
2380.Ip "getsockopt(SOCKET,LEVEL,OPTNAME)" 8 3
2381Returns the socket option requested, or undefined if there is an error.
2382.Ip "gmtime(EXPR)" 8 4
2383.Ip "gmtime EXPR" 8
2384Converts a time as returned by the time function to a 9-element array with
2385the time analyzed for the Greenwich timezone.
2386Typically used as follows:
.ne 3
.ie t \{\
	($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = gmtime(time);
'br\}
.el \{\
	($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) =
						gmtime(time);
'br\}
2399All array elements are numeric, and come straight out of a struct tm.
2400In particular this means that $mon has the range 0.\|.11 and $wday has the
2401range 0.\|.6.
2402If EXPR is omitted, does gmtime(time).
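Because the elements come straight out of a struct tm, $mon and $wday are most useful as array subscripts, and $year counts years since 1900. The following sketch assumes a Unix system whose time 0 is midnight GMT on 1 January 1970; the arrays @month and @day and the variable $stamp are names made up for the illustration.

```perl
# Month and weekday names indexed the way struct tm numbers them.
@month = ('Jan','Feb','Mar','Apr','May','Jun',
          'Jul','Aug','Sep','Oct','Nov','Dec');
@day   = ('Sun','Mon','Tue','Wed','Thu','Fri','Sat');

# Time 0 is used so that the result does not depend on the clock.
($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = gmtime(0);

# $year is the number of years since 1900, as in struct tm.
$stamp = sprintf("%s %s %2d %02d:%02d:%02d %d",
    $day[$wday], $month[$mon], $mday, $hour, $min, $sec, 1900 + $year);
print "$stamp\n";
```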
2403.Ip "goto LABEL" 8 6
2404Finds the statement labeled with LABEL and resumes execution there.
2405Currently you may only go to statements in the main body of the program
2406that are not nested inside a do {} construct.
This statement is not implemented very efficiently, and is here only to make
the
.IR sed -to- perl
translator easier.
I may change its semantics at any time, consistent with support for translated
.I sed
scripts.
Use it at your own risk.
2415Better yet, don't use it at all.
2416.Ip "grep(EXPR,LIST)" 8 4
2417Evaluates EXPR for each element of LIST (locally setting $_ to each element)
2418and returns the array value consisting of those elements for which the
2419expression evaluated to true.
2420In a scalar context, returns the number of times the expression was true.
2423 @foo = grep(!/^#/, @bar); # weed out comments
2426Note that, since $_ is a reference into the array value, it can be
2427used to modify the elements of the array.
2428While this is useful and supported, it can cause bizarre results if
2429the LIST is not a named array.
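The three behaviors described above (array context, scalar context, and modification through $_) can be sketched together. The array contents below are sample data invented for the illustration.

```perl
# Sample data, invented for this illustration.
@bar = ('# header', 'alpha', '# note', 'beta');

@foo   = grep(!/^#/, @bar);    # array context: the elements that matched
$count = grep(/^#/, @bar);     # scalar context: how many times EXPR was true

# $_ refers to each element itself, so s/// edits the named array in place.
@words   = ('alpha', 'beta');
$changed = grep(s/^/> /, @words);

print "kept: @foo\n";
print "comments: $count\n";
print "words: @words\n";
```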
2430.Ip "hex(EXPR)" 8 4
2431.Ip "hex EXPR" 8
Returns the decimal value of EXPR interpreted as a hex string.
2433(To interpret strings that might start with 0 or 0x see oct().)
2434If EXPR is omitted, uses $_.
2435.Ip "index(STR,SUBSTR,POSITION)" 8 4
2436.Ip "index(STR,SUBSTR)" 8 4
Returns the position of the first occurrence of SUBSTR in STR at or after
POSITION.
2439If POSITION is omitted, starts searching from the beginning of the string.
2440The return value is based at 0, or whatever you've
2441set the $[ variable to.
2442If the substring is not found, returns one less than the base, ordinarily \-1.
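A common use of the POSITION argument is to find every occurrence of a substring by restarting the search just past the previous hit. The sketch below assumes $[ has been left at its default of 0; the string and variable names are invented for the illustration.

```perl
$str = "hello world, hello perl";

# Collect the position of every occurrence of "hello".
@positions = ();
$at = index($str, "hello");
while ($at >= 0) {
    push(@positions, $at);
    $at = index($str, "hello", $at + 1);   # resume just past the last hit
}

$missing = index($str, "python");          # not found: one less than the base

print "found at: @positions\n";
print "missing: $missing\n";
```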
2443.Ip "int(EXPR)" 8 4
2444.Ip "int EXPR" 8
2445Returns the integer portion of EXPR.
2446If EXPR is omitted, uses $_.
.Ip "ioctl(FILEHANDLE,FUNCTION,SCALAR)" 8 4
Implements the ioctl(2) function.
2449You'll probably have to say
2452 require ""; # probably /usr/local/lib/perl/
2455first to get the correct function definitions.
If that file doesn't exist or doesn't have the correct definitions
2457you'll have to roll
2458your own, based on your C header files such as <sys/ioctl.h>.
2459(There is a perl script called h2ph that comes with the perl kit
2460which may help you in this.)
2461SCALAR will be read and/or written depending on the FUNCTION\*(--a pointer
2462to the string value of SCALAR will be passed as the third argument of
2463the actual ioctl call.
2464(If SCALAR has no string value but does have a numeric value, that value
2465will be passed rather than a pointer to the string value.
2466To guarantee this to be true, add a 0 to the scalar before using it.)
2467The pack() and unpack() functions are useful for manipulating the values
2468of structures used by ioctl().
2469The following example sets the erase character to DEL.
.ne 9
2473 require '';
2474 $sgttyb_t = "ccccs"; # 4 chars and a short
2475 if (ioctl(STDIN,$TIOCGETP,$sgttyb)) {
2476 @ary = unpack($sgttyb_t,$sgttyb);
2477 $ary[2] = 127;
2478 $sgttyb = pack($sgttyb_t,@ary);
2479 ioctl(STDIN,$TIOCSETP,$sgttyb)
2480 || die "Can't ioctl: $!";
2481 }
2484The return value of ioctl (and fcntl) is as follows:
.ne 4
2488 if OS returns:\h'|3i'perl returns:
2489 -1\h'|3i' undefined value
2490 0\h'|3i' string "0 but true"
2491 anything else\h'|3i' that number
2494Thus perl returns true on success and false on failure, yet you can still
2495easily determine the actual value returned by the operating system:
2498 ($retval = ioctl(...)) || ($retval = -1);
2499 printf "System returned %d\en", $retval;
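The return convention in the table above can be tried without a real device by using a stand-in for the system call. In this sketch fake_syscall is a hypothetical subroutine, not part of perl; it only maps a pretended operating system return value to what ioctl or fcntl would hand back.

```perl
# Hypothetical stand-in for ioctl(): applies the mapping from the table above.
sub fake_syscall {
    local($os_value) = @_;
    return undef        if $os_value == -1;   # failure
    return "0 but true" if $os_value == 0;    # success, OS returned 0
    return $os_value;                         # success, any other value
}

@seen = ();
foreach $code (-1, 0, 5) {
    ($retval = &fake_syscall($code)) || ($retval = -1);
    push(@seen, $retval + 0);                 # "0 but true" is numerically 0
}
print "System returned: @seen\n";
```

The string "0 but true" is true in a conditional yet equals 0 when used as a number, which is what lets the one-line idiom recover the original value.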
2501.Ip "join(EXPR,LIST)" 8 8
2502.Ip "join(EXPR,ARRAY)" 8
2503Joins the separate strings of LIST or ARRAY into a single string with fields
2504separated by the value of EXPR, and returns the string.
.ie t \{\
	$_ = join(\|\':\', $login,$passwd,$uid,$gid,$gcos,$home,$shell);
'br\}
.el \{\
	$_ = join(\|\':\',
			$login,$passwd,$uid,$gid,$gcos,$home,$shell);
'br\}
See
.IR split .
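Since join and split are inverses when the separator does not occur inside a field, a record can be round-tripped as in this sketch. The field values are hypothetical sample data.

```perl
# Hypothetical field values, for illustration only.
@record = ('guest', '*', 500, 500, 'Guest User', '/home/guest', '/bin/sh');

$line   = join(':', @record);     # fields separated by the value of EXPR
@fields = split(/:/, $line);      # and taken apart again

print "$line\n";
print "fields: ", scalar(@fields), "\n";
```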
2519.Ip "keys(ASSOC_ARRAY)" 8 6
2520.Ip "keys ASSOC_ARRAY" 8
Returns a normal array consisting of all the keys of the named associative
array.
2523The keys are returned in an apparently random order, but it is the same order
2524as either the values() or each() function produces (given that the associative array
2525has not been modified).
2526Here is yet another way to print your environment:
.ne 5
2530 @keys = keys %ENV;
2531 @values = values %ENV;
2532 while ($#keys >= 0) {
2533 print pop(@keys), \'=\', pop(@values), "\en";
2534 }
2536or how about sorted by key:
.ne 3
2539 foreach $key (sort(keys %ENV)) {
2540 print $key, \'=\', $ENV{$key}, "\en";
2541 }
2544.Ip "kill(LIST)" 8 8
2545.Ip "kill LIST" 8 2
2546Sends a signal to a list of processes.
2547The first element of the list must be the signal to send.
2548Returns the number of processes successfully signaled.
2551 $cnt = kill 1, $child1, $child2;
2552 kill 9, @goners;
2555If the signal is negative, kills process groups instead of processes.
2556(On System V, a negative \fIprocess\fR number will also kill process groups,
2557but that's not portable.)
2558You may use a signal name in quotes.
2559.Ip "last LABEL" 8 8
2560.Ip "last" 8
The
.I last
2563command is like the
2564.I break
2565statement in C (as used in loops); it immediately exits the loop in question.
2566If the LABEL is omitted, the command refers to the innermost enclosing loop.
The
.I continue
2569block, if any, is not executed:
.ne 4
2573 line: while (<STDIN>) {
2574 last line if /\|^$/; # exit when done with header
2575 .\|.\|.
2576 }
2579.Ip "length(EXPR)" 8 4
2580.Ip "length EXPR" 8
2581Returns the length in characters of the value of EXPR.
2582If EXPR is omitted, returns length of $_.
2583.Ip "link(OLDFILE,NEWFILE)" 8 2
2584Creates a new filename linked to the old filename.
2585Returns 1 for success, 0 otherwise.
2586.Ip "listen(SOCKET,QUEUESIZE)" 8 2
2587Does the same thing that the listen system call does.
2588Returns true if it succeeded, false otherwise.
2589See example in section on Interprocess Communication.
2590.Ip "local(LIST)" 8 4
2591Declares the listed variables to be local to the enclosing block,
2592subroutine, eval or \*(L"do\*(R".
2593All the listed elements must be legal lvalues.
2594This operator works by saving the current values of those variables in LIST
2595on a hidden stack and restoring them upon exiting the block, subroutine or eval.
2596This means that called subroutines can also reference the local variable,
2597but not the global one.
2598The LIST may be assigned to if desired, which allows you to initialize
2599your local variables.
2600(If no initializer is given for a particular variable, it is created with
2601an undefined value.)
2602Commonly this is used to name the parameters to a subroutine.
.ne 13
2607 sub RANGEVAL {
2608 local($min, $max, $thunk) = @_;
2609 local($result) = \'\';
2610 local($i);
2612 # Presumably $thunk makes reference to $i
2614 for ($i = $min; $i < $max; $i++) {
2615 $result .= eval $thunk;
2616 }
2618 $result;
2619 }
.ne 6
2622 if ($sw eq \'-v\') {
2623 # init local array with global array
2624 local(@ARGV) = @ARGV;
2625 unshift(@ARGV,\'echo\');
2626 system @ARGV;
2627 }
2628 # @ARGV restored
.ne 6
2631 # temporarily add to digits associative array
2632 if ($base12) {
2633 # (NOTE: not claiming this is efficient!)
2634 local(%digits) = (%digits,'t',10,'e',11);
2635 do parse_num();
2636 }
2639Note that local() is a run-time command, and so gets executed every time
2640through a loop, using up more stack storage each time until it's all
2641released at once when the loop is exited.
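The remark that called subroutines see the local variable rather than the global one can be illustrated with a small sketch. The variable $level and both subroutines are hypothetical names used only here.

```perl
$level = 'global';

sub show_level {
    return $level;                # sees whichever value is current
}

sub with_local {
    local($level) = 'local';      # saved on the hidden stack, then replaced
    return &show_level();         # the called subroutine sees 'local'
}

$inside  = &with_local();
$outside = &show_level();         # the global value has been restored

print "inside: $inside, outside: $outside\n";
```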
2642.Ip "localtime(EXPR)" 8 4
2643.Ip "localtime EXPR" 8
2644Converts a time as returned by the time function to a 9-element array with
2645the time analyzed for the local timezone.
2646Typically used as follows:
.ne 3
.ie t \{\
	($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = localtime(time);
'br\}
.el \{\
	($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) =
						localtime(time);
'br\}
2659All array elements are numeric, and come straight out of a struct tm.
2660In particular this means that $mon has the range 0.\|.11 and $wday has the
2661range 0.\|.6.
2662If EXPR is omitted, does localtime(time).
2663.Ip "log(EXPR)" 8 4
2664.Ip "log EXPR" 8
2665Returns logarithm (base
2666.IR e )
2667of EXPR.
2668If EXPR is omitted, returns log of $_.
2669.Ip "lstat(FILEHANDLE)" 8 6
2670.Ip "lstat FILEHANDLE" 8
2671.Ip "lstat(EXPR)" 8
2672.Ip "lstat SCALARVARIABLE" 8
2673Does the same thing as the stat() function, but stats a symbolic link
2674instead of the file the symbolic link points to.
2675If symbolic links are unimplemented on your system, a normal stat is done.
2676.Ip "m/PATTERN/gio" 8 4
2677.Ip "/PATTERN/gio" 8
2678Searches a string for a pattern match, and returns true (1) or false (\'\').
2679If no string is specified via the =~ or !~ operator,
2680the $_ string is searched.
2681(The string specified with =~ need not be an lvalue\*(--it may be the result of an expression evaluation, but remember the =~ binds rather tightly.)
2682See also the section on regular expressions.
2684If / is the delimiter then the initial \*(L'm\*(R' is optional.
2685With the \*(L'm\*(R' you can use any pair of non-alphanumeric characters
2686as delimiters.
2687This is particularly useful for matching Unix path names that contain \*(L'/\*(R'.
2688If the final delimiter is followed by the optional letter \*(L'i\*(R', the matching is
2689done in a case-insensitive manner.
2690PATTERN may contain references to scalar variables, which will be interpolated
2691(and the pattern recompiled) every time the pattern search is evaluated.
2692(Note that $) and $| may not be interpolated because they look like end-of-string tests.)
2693If you want such a pattern to be compiled only once, add an \*(L"o\*(R" after
2694the trailing delimiter.
2695This avoids expensive run-time recompilations, and
2696is useful when the value you are interpolating won't change over the
2697life of the script.
2698If the PATTERN evaluates to a null string, the most recent successful
2699regular expression is used instead.
2701If used in a context that requires an array value, a pattern match returns an
array consisting of the subexpressions matched by the parentheses in the
pattern,
2704i.e. ($1, $2, $3.\|.\|.).
2705It does NOT actually set $1, $2, etc. in this case, nor does it set $+, $`, $&
2706or $'.
2707If the match fails, a null array is returned.
2708If the match succeeds, but there were no parentheses, an array value of (1)
2709is returned.
.ne 4
2715 open(tty, \'/dev/tty\');
2716 <tty> \|=~ \|/\|^y\|/i \|&& \|do foo(\|); # do foo if desired
2718 if (/Version: \|*\|([0\-9.]*\|)\|/\|) { $version = $1; }
2720 next if m#^/usr/spool/uucp#;
.ne 5
2723 # poor man's grep
2724 $arg = shift;
2725 while (<>) {
2726 print if /$arg/o; # compile only once
2727 }
2729 if (($F1, $F2, $Etc) = ($foo =~ /^(\eS+)\es+(\eS+)\es*(.*)/))
2732This last example splits $foo into the first two words and the remainder
2733of the line, and assigns those three fields to $F1, $F2 and $Etc.
The conditional is true if any variables were assigned, i.e. if the pattern
matched.
2737The \*(L"g\*(R" modifier specifies global pattern matching\*(--that is,
2738matching as many times as possible within the string. How it behaves
2739depends on the context. In an array context, it returns a list of
2740all the substrings matched by all the parentheses in the regular expression.
2741If there are no parentheses, it returns a list of all the matched strings,
2742as if there were parentheses around the whole pattern. In a scalar context,
2743it iterates through the string, returning TRUE each time it matches, and
2744FALSE when it eventually runs out of matches. (In other words, it remembers
2745where it left off last time and restarts the search at that point.) It
2746presumes that you have not modified the string since the last match.
2747Modifying the string between matches may result in undefined behavior.
2748(You can actually get away with in-place modifications via substr()
2749that do not change the length of the entire string. In general, however,
2750you should be using s///g for such modifications.) Examples:
2753 # array context
2754 ($one,$five,$fifteen) = (\`uptime\` =~ /(\ed+\e.\ed+)/g);
2756 # scalar context
	$/ = ""; $* = 1;		# paragraph mode, multi-line matching
2758 while ($paragraph = <>) {
2759 while ($paragraph =~ /[a-z][\'")]*[.!?]+[\'")]*\es/g) {
2760 $sentences++;
2761 }
2762 }
2763 print "$sentences\en";
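For a version of the \*(L"g\*(R" examples that does not depend on an external command or on input files, the following sketch applies the same patterns to fixed strings. The strings and variable names are invented for the illustration.

```perl
$text = "load averages: 0.42, 1.05, 2.10";

# Array context: one element per parenthesized match.
@loads = ($text =~ /(\d+\.\d+)/g);

# Scalar context: each evaluation continues where the last one stopped.
$count = 0;
while ($text =~ /\d+\.\d+/g) {
    $count++;
}

# No parentheses: the whole of each match is returned.
@digits = ("a1b22c333" =~ /\d+/g);

print "loads: @loads\n";
print "count: $count\n";
print "digits: @digits\n";
```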
2766.Ip "mkdir(FILENAME,MODE)" 8 3
2767Creates the directory specified by FILENAME, with permissions specified by
2768MODE (as modified by umask).
2769If it succeeds it returns 1, otherwise it returns 0 and sets $! (errno).
2770.Ip "msgctl(ID,CMD,ARG)" 8 4
2771Calls the System V IPC function msgctl. If CMD is &IPC_STAT, then ARG
2772must be a variable which will hold the returned msqid_ds structure.
2773Returns like ioctl: the undefined value for error, "0 but true" for
2774zero, or the actual return value otherwise.
2775.Ip "msgget(KEY,FLAGS)" 8 4
2776Calls the System V IPC function msgget. Returns the message queue id,
2777or the undefined value if there is an error.
2778.Ip "msgsnd(ID,MSG,FLAGS)" 8 4
2779Calls the System V IPC function msgsnd to send the message MSG to the
2780message queue ID. MSG must begin with the long integer message type,
2781which may be created with pack("L", $type). Returns true if
2782successful, or false if there is an error.
2783.Ip "msgrcv(ID,VAR,SIZE,TYPE,FLAGS)" 8 4
2784Calls the System V IPC function msgrcv to receive a message from
2785message queue ID into variable VAR with a maximum message size of
2786SIZE. Note that if a message is received, the message type will be
2787the first thing in VAR, and the maximum length of VAR is SIZE plus the
2788size of the message type. Returns true if successful, or false if
2789there is an error.
2790.Ip "next LABEL" 8 8
2791.Ip "next" 8
The
.I next
2794command is like the
2795.I continue
2796statement in C; it starts the next iteration of the loop:
.ne 4
2800 line: while (<STDIN>) {
2801 next line if /\|^#/; # discard comments
2802 .\|.\|.
2803 }
2806Note that if there were a
2807.I continue
2808block on the above, it would get executed even on discarded lines.
2809If the LABEL is omitted, the command refers to the innermost enclosing loop.
2810.Ip "oct(EXPR)" 8 4
2811.Ip "oct EXPR" 8
2812Returns the decimal value of EXPR interpreted as an octal string.
2813(If EXPR happens to start off with 0x, interprets it as a hex string instead.)
2814The following will handle decimal, octal and hex in the standard notation:
2817 $val = oct($val) if $val =~ /^0/;
2820If EXPR is omitted, uses $_.
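The one-line idiom above can be wrapped in a subroutine and tried on each notation. The name to_number is hypothetical and exists only for this sketch.

```perl
# Hypothetical wrapper around the idiom shown above.
sub to_number {
    local($val) = @_;
    $val = oct($val) if $val =~ /^0/;   # leading 0: octal, or hex if 0x
    return $val + 0;
}

@results = (&to_number("42"), &to_number("0755"), &to_number("0x1F"));
print "@results\n";
```

Strings without a leading 0 never reach oct and are simply treated as decimal.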
2821.Ip "open(FILEHANDLE,EXPR)" 8 8
2822.Ip "open(FILEHANDLE)" 8
2823.Ip "open FILEHANDLE" 8
Opens the file whose filename is given by EXPR, and associates it with
FILEHANDLE.
2826If FILEHANDLE is an expression, its value is used as the name of the
2827real filehandle wanted.
2828If EXPR is omitted, the scalar variable of the same name as the FILEHANDLE
2829contains the filename.
If the filename begins with \*(L"<\*(R" or nothing, the file is opened for
input.
2832If the filename begins with \*(L">\*(R", the file is opened for output.
2833If the filename begins with \*(L">>\*(R", the file is opened for appending.
2834(You can put a \'+\' in front of the \'>\' or \'<\' to indicate that you
2835want both read and write access to the file.)
2836If the filename begins with \*(L"|\*(R", the filename is interpreted
2837as a command to which output is to be piped, and if the filename ends
with a \*(L"|\*(R", the filename is interpreted as a command which pipes
2839input to us.
2840(You may not have a command that pipes both in and out.)
2841Opening \'\-\' opens
2842.I STDIN
2843and opening \'>\-\' opens
2844.IR STDOUT .
2845Open returns non-zero upon success, the undefined value otherwise.
2846If the open involved a pipe, the return value happens to be the pid
2847of the subprocess.
.ne 3
	$article = 100;
	open article || die "Can't find article $article: $!\en";
	while (<article>) {\|.\|.\|.

.ie t \{\
	open(LOG, \'>>/usr/spool/news/twitlog\'\|);	# (log is reserved)
'br\}
.el \{\
	open(LOG, \'>>/usr/spool/news/twitlog\'\|);
				# (log is reserved)
'br\}

.ie t \{\
	open(article, "caesar <$article |"\|);		# decrypt article
'br\}
.el \{\
	open(article, "caesar <$article |"\|);
				# decrypt article
'br\}

.ie t \{\
	open(extract, "|sort >/tmp/Tmp$$"\|);		# $$ is our process#
'br\}
.el \{\
	open(extract, "|sort >/tmp/Tmp$$"\|);
				# $$ is our process#
'br\}
.ne 7
2881 # process argument list of files along with any includes
2883 foreach $file (@ARGV) {
2884 do process($file, \'fh00\'); # no pun intended
2885 }
2887 sub process {
2888 local($filename, $input) = @_;
2889 $input++; # this is a string increment
2890 unless (open($input, $filename)) {
2891 print STDERR "Can't open $filename: $!\en";
2892 return;
		}
.ie t \{\
		while (<$input>) {		# note the use of indirection
'br\}
.el \{\
		while (<$input>) {		# note use of indirection
'br\}
2900 if (/^#include "(.*)"/) {
2901 do process($1, $input);
2902 next;
2903 }
2904 .\|.\|. # whatever
2905 }
2906 }
2909You may also, in the Bourne shell tradition, specify an EXPR beginning
2910with \*(L">&\*(R", in which case the rest of the string
2911is interpreted as the name of a filehandle
2912(or file descriptor, if numeric) which is to be duped and opened.
2913You may use & after >, >>, <, +>, +>> and +<.
2914The mode you specify should match the mode of the original filehandle.
Here is a script that saves, redirects, and restores
.I STDOUT
and
.IR STDERR :
.ne 21
2922 #!/usr/bin/perl
2923 open(SAVEOUT, ">&STDOUT");
2924 open(SAVEERR, ">&STDERR");
2926 open(STDOUT, ">foo.out") || die "Can't redirect stdout";
2927 open(STDERR, ">&STDOUT") || die "Can't dup stdout";
2929 select(STDERR); $| = 1; # make unbuffered
2930 select(STDOUT); $| = 1; # make unbuffered
2932 print STDOUT "stdout 1\en"; # this works for
2933 print STDERR "stderr 1\en"; # subprocesses too
2935 close(STDOUT);
2936 close(STDERR);
2938 open(STDOUT, ">&SAVEOUT");
2939 open(STDERR, ">&SAVEERR");
2941 print STDOUT "stdout 2\en";
2942 print STDERR "stderr 2\en";
2945If you open a pipe on the command \*(L"\-\*(R", i.e. either \*(L"|\-\*(R" or \*(L"\-|\*(R",
then there is an implicit fork done, and the return value of open
is the pid of the child within the parent process, and 0 within the child
process.
(Use defined($pid) to determine if the open was successful.)
The filehandle behaves normally for the parent, but i/o to that
filehandle is piped from/to the
.IR STDOUT / STDIN
of the child process.
In the child process the filehandle isn't opened\*(--i/o happens from/to
the new
.I STDOUT
or
.IR STDIN .
2959Typically this is used like the normal piped open when you want to exercise
2960more control over just how the pipe command gets executed, such as when
2961you are running setuid, and don't want to have to scan shell commands
2962for metacharacters.
2963The following pairs are more or less equivalent:
.ne 5
2967 open(FOO, "|tr \'[a\-z]\' \'[A\-Z]\'");
2968 open(FOO, "|\-") || exec \'tr\', \'[a\-z]\', \'[A\-Z]\';
2970 open(FOO, "cat \-n '$file'|");
2971 open(FOO, "\-|") || exec \'cat\', \'\-n\', $file;
2974Explicitly closing any piped filehandle causes the parent process to wait for the
2975child to finish, and returns the status value in $?.
2976Note: on any operation which may do a fork,
2977unflushed buffers remain unflushed in both
2978processes, which means you may need to set $| to
2979avoid duplicate output.
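As a rough sketch of the implicit-fork form described above, the parent below reads a line that the child writes to its STDOUT. The filehandle KID and the array @lines are names chosen for this illustration only, and $| is set first so that no buffered output is duplicated across the fork.

```perl
$| = 1;                          # flush before forking to avoid duplicate output

$pid = open(KID, "-|");          # implicit fork; child's STDOUT feeds KID
die "Can't fork: $!" unless defined($pid);

if ($pid) {                      # parent: nonzero pid of the child
    @lines = <KID>;              # read everything the child printed
    close(KID);                  # waits for the child; status lands in $?
} else {                         # child: open returned 0
    print "hello from child\n";  # goes down the pipe to the parent
    exit(0);
}
```

The child must exit (or exec) on its own; otherwise it would fall through and run the rest of the parent's script as well.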
2981The filename that is passed to open will have leading and trailing
2982whitespace deleted.
2983In order to open a file with arbitrary weird characters in it, it's necessary
2984to protect any leading and trailing whitespace thusly:
2988 $file =~ s#^(\es)#./$1#;
2989 open(FOO, "< $file\e0");
2992.Ip "opendir(DIRHANDLE,EXPR)" 8 3
2993Opens a directory named EXPR for processing by readdir(), telldir(), seekdir(),
2994rewinddir() and closedir().
2995Returns true if successful.
2996DIRHANDLEs have their own namespace separate from FILEHANDLEs.
2997.Ip "ord(EXPR)" 8 4
2998.Ip "ord EXPR" 8
2999Returns the numeric ascii value of the first character of EXPR.
3000If EXPR is omitted, uses $_.
3001''' Comments on f & d by 22/11/89
3002.Ip "pack(TEMPLATE,LIST)" 8 4
3003Takes an array or list of values and packs it into a binary structure,
3004returning the string containing the structure.
3005The TEMPLATE is a sequence of characters that give the order and type
3006of values, as follows:
3009 A An ascii string, will be space padded.
3010 a An ascii string, will be null padded.
3011 c A signed char value.
3012 C An unsigned char value.
3013 s A signed short value.
3014 S An unsigned short value.
3015 i A signed integer value.
3016 I An unsigned integer value.
3017 l A signed long value.
3018 L An unsigned long value.
3019 n A short in \*(L"network\*(R" order.
3020 N A long in \*(L"network\*(R" order.
3021 f A single-precision float in the native format.
3022 d A double-precision float in the native format.
3023 p A pointer to a string.
3024 x A null byte.
3025 X Back up a byte.
3026 @ Null fill to absolute position.
3027 u A uuencoded string.
3028 b A bit string (ascending bit order, like vec()).
3029 B A bit string (descending bit order).
3030 h A hex string (low nybble first).
3031 H A hex string (high nybble first).
Each letter may optionally be followed by a number which gives a repeat
count.
3036With all types except "a", "A", "b", "B", "h" and "H",
3037the pack function will gobble up that many values
3038from the LIST.
3039A * for the repeat count means to use however many items are left.
The "a" and "A" types gobble just one value, but pack it as a string of length
count,
padding with nulls or spaces as necessary.
3043(When unpacking, "A" strips trailing spaces and nulls, but "a" does not.)
3044Likewise, the "b" and "B" fields pack a string that many bits long.
3045The "h" and "H" fields pack a string that many nybbles long.
3046Real numbers (floats and doubles) are in the native machine format
3047only; due to the multiplicity of floating formats around, and the lack
3048of a standard \*(L"network\*(R" representation, no facility for
3049interchange has been made.
3050This means that packed floating point data
3051written on one machine may not be readable on another - even if both
3052use IEEE floating point arithmetic (as the endian-ness of the memory
3053representation is not part of the IEEE spec).
3054Note that perl uses
3055doubles internally for all numeric calculation, and converting from
3056double -> float -> double will lose precision (i.e. unpack("f",
3057pack("f", $foo)) will not in general equal $foo).
3062 $foo = pack("cccc",65,66,67,68);
3063 # foo eq "ABCD"
3064 $foo = pack("c4",65,66,67,68);
3065 # same thing
3067 $foo = pack("ccxxcc",65,66,67,68);
3068 # foo eq "AB\e0\e0CD"
3070 $foo = pack("s2",1,2);
3071 # "\e1\e0\e2\e0" on little-endian
3072 # "\e0\e1\e0\e2" on big-endian
3074 $foo = pack("a4","abcd","x","y","z");
3075 # "abcd"
3077 $foo = pack("aaaa","abcd","x","y","z");
3078 # "axyz"
3080 $foo = pack("a14","abcdefg");
3081 # "abcdefg\e0\e0\e0\e0\e0\e0\e0"
3083 $foo = pack("i9pl", gmtime);
3084 # a real struct tm (on my system anyway)
3086 sub bintodec {
3087 unpack("N", pack("B32", substr("0" x 32 . shift, -32)));
3088 }
3090The same template may generally also be used in the unpack function.
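The round trip through pack and unpack can be sketched as follows. The record layout (a 4-byte space-padded name, a network-order short, and one unsigned char) and all variable names are assumptions made up for the illustration. The last two lines show the float narrowing mentioned above; the comparison is expected to be true on machines with IEEE floats but depends on the native format.

```perl
# Pack a small fixed-layout record: A4 pads "ab" with spaces to 4 bytes,
# n stores 258 as two bytes in network (big-endian) order, C stores 65.
$rec = pack("A4nC", "ab", 258, 65);
$len = length($rec);                    # 4 + 2 + 1 bytes

# The same template unpacks it; "A" strips the space padding again.
($name, $num, $chr) = unpack("A4nC", $rec);

# A double squeezed through a native float generally loses precision.
$back  = unpack("f", pack("f", 0.1));
$lossy = ($back != 0.1);
```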
.Ip "pipe(READHANDLE,WRITEHANDLE)" 8 3
Opens a pair of connected pipes like the corresponding system call.
3093Note that if you set up a loop of piped processes, deadlock can occur
3094unless you are very careful.
3095In addition, note that perl's pipes use stdio buffering, so you may need
3096to set $| to flush your WRITEHANDLE after each command, depending on
3097the application.
3098[Requires version 3.0 patchlevel 9.]
3099.Ip "pop(ARRAY)" 8
3100.Ip "pop ARRAY" 8 6
3101Pops and returns the last value of the array, shortening the array by 1.
3102Has the same effect as
3105 $tmp = $ARRAY[$#ARRAY\-\|\-];
3108If there are no elements in the array, returns the undefined value.
3109.Ip "print(FILEHANDLE LIST)" 8 10
3110.Ip "print(LIST)" 8
3111.Ip "print FILEHANDLE LIST" 8
3112.Ip "print LIST" 8
3113.Ip "print" 8
3114Prints a string or a comma-separated list of strings.
3115Returns non-zero if successful.
3116FILEHANDLE may be a scalar variable name, in which case the variable contains
3117the name of the filehandle, thus introducing one level of indirection.
3118(NOTE: If FILEHANDLE is a variable and the next token is a term, it may be
3119misinterpreted as an operator unless you interpose a + or put parens around
3120the arguments.)
3121If FILEHANDLE is omitted, prints by default to standard output (or to the
3122last selected output channel\*(--see select()).
3123If LIST is also omitted, prints $_ to
3124.IR STDOUT .
To set the default output channel to something other than
.I STDOUT
use the select operation.
3128Note that, because print takes a LIST, anything in the LIST is evaluated
3129in an array context, and any subroutine that you call will have one or more
3130of its expressions evaluated in an array context.
3131Also be careful not to follow the print keyword with a left parenthesis
3132unless you want the corresponding right parenthesis to terminate the
3133arguments to the print\*(--interpose a + or put parens around all the arguments.
3134.Ip "printf(FILEHANDLE LIST)" 8 10
3135.Ip "printf(LIST)" 8
3136.Ip "printf FILEHANDLE LIST" 8
3137.Ip "printf LIST" 8
3138Equivalent to a \*(L"print FILEHANDLE sprintf(LIST)\*(R".
3139.Ip "push(ARRAY,LIST)" 8 7
3140Treats ARRAY (@ is optional) as a stack, and pushes the values of LIST
3141onto the end of ARRAY.
3142The length of ARRAY increases by the length of LIST.
3143Has the same effect as
3146 for $value (LIST) {
3147 $ARRAY[++$#ARRAY] = $value;
3148 }
3151but is more efficient.
3152.Ip "q/STRING/" 8 5
3153.Ip "qq/STRING/" 8
3154.Ip "qx/STRING/" 8
3155These are not really functions, but simply syntactic sugar to let you
3156avoid putting too many backslashes into quoted strings.
3157The q operator is a generalized single quote, and the qq operator a
3158generalized double quote.
3159The qx operator is a generalized backquote.
3160Any non-alphanumeric delimiter can be used in place of /, including newline.
3161If the delimiter is an opening bracket or parenthesis, the final delimiter
3162will be the corresponding closing bracket or parenthesis.
3163(Embedded occurrences of the closing bracket need to be backslashed as usual.)
3168 $foo = q!I said, "You said, \'She said it.\'"!;
3169 $bar = q(\'This is it.\');
3170 $today = qx{ date };
3171 $_ .= qq
3172*** The previous line contains the naughty word "$&".\en
3173 if /(ibm|apple|awk)/; # :-)
3176.Ip "rand(EXPR)" 8 8
3177.Ip "rand EXPR" 8
3178.Ip "rand" 8
3179Returns a random fractional number between 0 and the value of EXPR.
3180(EXPR should be positive.)
3181If EXPR is omitted, returns a value between 0 and 1.
3182See also srand().
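A common use is to scale and truncate the fractional result into an integer range. The following is a minimal sketch; the names $roll and $coin are illustrative only.

```perl
srand(time);                              # seed once, as srand with no EXPR does

$roll = int(rand(6)) + 1;                 # whole number from 1 through 6
$coin = rand(1) < 0.5 ? 'heads' : 'tails';
```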
.Ip "read(FILEHANDLE,SCALAR,LENGTH,OFFSET)" 8 5
.Ip "read(FILEHANDLE,SCALAR,LENGTH)" 8 5
Attempts to read LENGTH bytes of data into variable SCALAR from the specified
FILEHANDLE.
Returns the number of bytes actually read, or undef if there was an error.
3188SCALAR will be grown or shrunk to the length actually read.
3189An OFFSET may be specified to place the read data at some other place
3190than the beginning of the string.
3191This call is actually implemented in terms of stdio's fread call. To get
3192a true read system call, see sysread.
3193.Ip "readdir(DIRHANDLE)" 8 3
3194.Ip "readdir DIRHANDLE" 8
3195Returns the next directory entry for a directory opened by opendir().
If used in an array context, returns all the rest of the entries in the
directory.
3198If there are no more entries, returns an undefined value in a scalar context
3199or a null list in an array context.
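The two calling contexts might be used like this. The sketch assumes the current directory is readable; DIR, @visible, $entry and $count are names invented for the example.

```perl
# Array context: fetch every remaining entry at once, dropping dot files.
opendir(DIR, ".") || die "Can't opendir .: $!";
@visible = sort(grep(!/^\./, readdir(DIR)));
closedir(DIR);

# Scalar context: step through one entry at a time until undef comes back.
opendir(DIR, ".") || die "Can't opendir .: $!";
$count = 0;
$count++ while defined($entry = readdir(DIR));
closedir(DIR);
```

The defined() test matters in the scalar loop, since a file named "0" is a legitimate entry that would otherwise end the loop early.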
3200.Ip "readlink(EXPR)" 8 6
3201.Ip "readlink EXPR" 8
3202Returns the value of a symbolic link, if symbolic links are implemented.
3203If not, gives a fatal error.
3204If there is some system error, returns the undefined value and sets $! (errno).
3205If EXPR is omitted, uses $_.
3206.Ip "recv(SOCKET,SCALAR,LEN,FLAGS)" 8 4
3207Receives a message on a socket.
Attempts to receive LEN bytes of data into variable SCALAR from the specified
3209SOCKET filehandle.
3210Returns the address of the sender, or the undefined value if there's an error.
3211SCALAR will be grown or shrunk to the length actually read.
3212Takes the same flags as the system call of the same name.
3213.Ip "redo LABEL" 8 8
3214.Ip "redo" 8
The
.I redo
command restarts the loop block without evaluating the conditional again.
The
.I continue
block, if any, is not executed.
3221If the LABEL is omitted, the command refers to the innermost enclosing loop.
3222This command is normally used by programs that want to lie to themselves
3223about what was just input:
3227 # a simpleminded Pascal comment stripper
3228 # (warning: assumes no { or } in strings)
3229 line: while (<STDIN>) {
3230 while (s|\|({.*}.*\|){.*}|$1 \||) {}
3231 s|{.*}| \||;
3232 if (s|{.*| \||) {
3233 $front = $_;
3234 while (<STDIN>) {
3235 if (\|/\|}/\|) { # end of comment?
3236 s|^|$front{|;
3237 redo line;
3238 }
3239 }
3240 }
3241 print;
3242 }
3245.Ip "rename(OLDNAME,NEWNAME)" 8 2
3246Changes the name of a file.
3247Returns 1 for success, 0 otherwise.
3248Will not work across filesystem boundaries.
3249.Ip "require(EXPR)" 8 6
3250.Ip "require EXPR" 8
3251.Ip "require" 8
3252Includes the library file specified by EXPR, or by $_ if EXPR is not supplied.
3253Has semantics similar to the following subroutine:
3256 sub require {
3257 local($filename) = @_;
3258 return 1 if $INC{$filename};
3259 local($realfilename,$result);
3260 ITER: {
3261 foreach $prefix (@INC) {
3262 $realfilename = "$prefix/$filename";
3263 if (-f $realfilename) {
3264 $result = do $realfilename;
3265 last ITER;
3266 }
3267 }
3268 die "Can't find $filename in \e@INC";
3269 }
3270 die $@ if $@;
3271 die "$filename did not return true value" unless $result;
3272 $INC{$filename} = $realfilename;
3273 $result;
3274 }
3277Note that the file will not be included twice under the same specified name.
3278.Ip "reset(EXPR)" 8 6
3279.Ip "reset EXPR" 8
3280.Ip "reset" 8
3281Generally used in a
3282.I continue
3283block at the end of a loop to clear variables and reset ?? searches
3284so that they work again.
3285The expression is interpreted as a list of single characters (hyphens allowed
3286for ranges).
3287All variables and arrays beginning with one of those letters are reset to
3288their pristine state.
3289If the expression is omitted, one-match searches (?pattern?) are reset to
3290match again.
3291Only resets variables or searches in the current package.
3292Always returns 1.
3297 reset \'X\'; \h'|2i'# reset all X variables
3298 reset \'a\-z\';\h'|2i'# reset lower case variables
3299 reset; \h'|2i'# just reset ?? searches
Note: resetting \*(L"A\-Z\*(R" is not recommended since you'll wipe out your ARGV and ENV
arrays.
3305The use of reset on dbm associative arrays does not change the dbm file.
3306(It does, however, flush any entries cached by perl, which may be useful if
3307you are sharing the dbm file.
3308Then again, maybe not.)
3309.Ip "return LIST" 8 3
3310Returns from a subroutine with the value specified.
3311(Note that a subroutine can automatically return
3312the value of the last expression evaluated.
3313That's the preferred method\*(--use of an explicit
3314.I return
3315is a bit slower.)
3316.Ip "reverse(LIST)" 8 4
3317.Ip "reverse LIST" 8
3318In an array context, returns an array value consisting of the elements
3319of LIST in the opposite order.
3320In a scalar context, returns a string value consisting of the bytes of
3321the first element of LIST in the opposite order.
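The difference between the two contexts is easy to trip over, so here is a small sketch. The variable names are illustrative only.

```perl
@list = reverse('a', 'b', 'c');        # array context: elements in opposite order
$str  = reverse('hello');              # scalar context: bytes in opposite order

# An array context applied to a single string leaves it untouched,
# because a one-element list reversed is the same list.
$same = join('', reverse('hello'));
```

Since print supplies an array context, print reverse('hello') prints the string unreversed; assign to a scalar first (or use scalar()) to get the byte-reversed form.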
3322.Ip "rewinddir(DIRHANDLE)" 8 5
3323.Ip "rewinddir DIRHANDLE" 8
3324Sets the current position to the beginning of the directory for the readdir() routine on DIRHANDLE.
3325.Ip "rindex(STR,SUBSTR,POSITION)" 8 6
3326.Ip "rindex(STR,SUBSTR)" 8 4
3327Works just like index except that it
3328returns the position of the LAST occurrence of SUBSTR in STR.
If POSITION is specified, returns the last occurrence at or before that
position.
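For instance, with the default $[ of 0 (the string and variable names here are made up for the illustration):

```perl
$str = "hello world";

$last   = rindex($str, "o");       # 7: the "o" in "world"
$before = rindex($str, "o", 6);    # 4: last "o" at or before position 6
$none   = rindex($str, "z");       # -1: not found
```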
3331.Ip "rmdir(FILENAME)" 8 4
3332.Ip "rmdir FILENAME" 8
3333Deletes the directory specified by FILENAME if it is empty.
3334If it succeeds it returns 1, otherwise it returns 0 and sets $! (errno).
3335If FILENAME is omitted, uses $_.
3336.Ip "s/PATTERN/REPLACEMENT/gieo" 8 3
3337Searches a string for a pattern, and if found, replaces that pattern with the
3338replacement text and returns the number of substitutions made.
3339Otherwise it returns false (0).
3340The \*(L"g\*(R" is optional, and if present, indicates that all occurrences
3341of the pattern are to be replaced.
3342The \*(L"i\*(R" is also optional, and if present, indicates that matching
3343is to be done in a case-insensitive manner.
3344The \*(L"e\*(R" is likewise optional, and if present, indicates that
3345the replacement string is to be evaluated as an expression rather than just
3346as a double-quoted string.
3347Any non-alphanumeric delimiter may replace the slashes;
3348if single quotes are used, no
3349interpretation is done on the replacement string (the e modifier overrides
3350this, however); if backquotes are used, the replacement string is a command
3351to execute whose output will be used as the actual replacement text.
3352If no string is specified via the =~ or !~ operator,
3353the $_ string is searched and modified.
3354(The string specified with =~ must be a scalar variable, an array element,
3355or an assignment to one of those, i.e. an lvalue.)
3356If the pattern contains a $ that looks like a variable rather than an
end-of-string test, the variable will be interpolated into the pattern at
run-time.
3359If you only want the pattern compiled once the first time the variable is
3360interpolated, add an \*(L"o\*(R" at the end.
3361If the PATTERN evaluates to a null string, the most recent successful
3362regular expression is used instead.
3363See also the section on regular expressions.
3367 s/\|\e\|bgreen\e\|b/mauve/g; # don't change wintergreen
3369 $path \|=~ \|s|\|/usr/bin|\|/usr/local/bin|;
3371 s/Login: $foo/Login: $bar/; # run-time pattern
3373 ($foo = $bar) =~ s/bar/foo/;
3375 $_ = \'abc123xyz\';
3376 s/\ed+/$&*2/e; # yields \*(L'abc246xyz\*(R'
    s/\ed+/sprintf("%5d",$&)/e;	# yields \*(L'abc  246xyz\*(R'
    s/\ew/$& x 2/eg;	# yields \*(L'aabbcc  224466xxyyzz\*(R'
3380 s/\|([^ \|]*\|) *\|([^ \|]*\|)\|/\|$2 $1/; # reverse 1st two fields
3383(Note the use of $ instead of \|\e\| in the last example. See section
3384on regular expressions.)
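The return value is often as useful as the modified string. The sketch below uses invented variable names and strings; it counts replacements, edits a copy while leaving the source alone, and shows the false return when nothing matches.

```perl
$text  = "green wintergreen green";
$count = ($text =~ s/\bgreen\b/mauve/g);   # number of substitutions made

# Assign first, then substitute in the copy; the literal is not modified.
($copy = "abc123xyz") =~ s/\d+/$& * 2/e;   # replacement evaluated as an expression

$unchanged = "no digits";
$n = ($unchanged =~ s/\d+//);              # false: pattern not found
```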
3385.Ip "scalar(EXPR)" 8 3
3386Forces EXPR to be interpreted in a scalar context and returns the value
3387of EXPR.
.Ip "seek(FILEHANDLE,POSITION,WHENCE)" 8 3
Randomly positions the file pointer for FILEHANDLE, just like the fseek()
3390call of stdio.
3391FILEHANDLE may be an expression whose value gives the name of the filehandle.
3392Returns 1 upon success, 0 otherwise.
3393.Ip "seekdir(DIRHANDLE,POS)" 8 3
3394Sets the current position for the readdir() routine on DIRHANDLE.
3395POS must be a value returned by telldir().
3396Has the same caveats about possible directory compaction as the corresponding
3397system library routine.
3398.Ip "select(FILEHANDLE)" 8 3
3399.Ip "select" 8 3
3400Returns the currently selected filehandle.
3401Sets the current default filehandle for output, if FILEHANDLE is supplied.
3402This has two effects: first, a
3403.I write
3404or a
3405.I print
3406without a filehandle will default to this FILEHANDLE.
Second, references to variables related to output will refer to this output
channel.
3409For example, if you have to set the top of form format for more than
3410one output channel, you might do the following:
3414 select(REPORT1);
3415 $^ = \'report1_top\';
3416 select(REPORT2);
3417 $^ = \'report2_top\';
3420FILEHANDLE may be an expression whose value gives the name of the actual filehandle.
3424 $oldfh = select(STDERR); $| = 1; select($oldfh);
3427.Ip "select(RBITS,WBITS,EBITS,TIMEOUT)" 8 3
3428This calls the select system call with the bitmasks specified, which can
3429be constructed using fileno() and vec(), along these lines:
3432 $rin = $win = $ein = '';
3433 vec($rin,fileno(STDIN),1) = 1;
3434 vec($win,fileno(STDOUT),1) = 1;
3435 $ein = $rin | $win;
3438If you want to select on many filehandles you might wish to write a subroutine:
3441 sub fhbits {
3442 local(@fhlist) = split(' ',$_[0]);
3443 local($bits);
3444 for (@fhlist) {
3445 vec($bits,fileno($_),1) = 1;
3446 }
3447 $bits;
3448 }
3449 $rin = &fhbits('STDIN TTY SOCK');
3452The usual idiom is:
3455 ($nfound,$timeleft) =
3456 select($rout=$rin, $wout=$win, $eout=$ein, $timeout);
3458or to block until something becomes ready:
    $nfound = select($rout=$rin, $wout=$win, $eout=$ein, undef);
3469Any of the bitmasks can also be undef.
3470The timeout, if specified, is in seconds, which may be fractional.
3471NOTE: not all implementations are capable of returning the $timeleft.
3472If not, they always return $timeleft equal to the supplied $timeout.
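Because the bitmasks may be undef and the timeout may be fractional, the call can also serve as a sub-second sleep or as a non-blocking poll. This is a sketch under the assumption that the underlying system call accepts fractional timeouts; the quarter-second figure and the variable names are arbitrary.

```perl
# Pause for a quarter of a second: no filehandles watched, only a timeout.
select(undef, undef, undef, 0.25);

# Poll STDIN without blocking: a timeout of 0 returns immediately.
$rin = '';
vec($rin, fileno(STDIN), 1) = 1;
($nfound, $timeleft) = select($rout = $rin, undef, undef, 0);
# $nfound is 1 if STDIN has input (or end of file) pending, else 0.
```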
3473.Ip "semctl(ID,SEMNUM,CMD,ARG)" 8 4
3474Calls the System V IPC function semctl. If CMD is &IPC_STAT or
3475&GETALL, then ARG must be a variable which will hold the returned
3476semid_ds structure or semaphore value array. Returns like ioctl: the
3477undefined value for error, "0 but true" for zero, or the actual return
3478value otherwise.
3479.Ip "semget(KEY,NSEMS,SIZE,FLAGS)" 8 4
3480Calls the System V IPC function semget. Returns the semaphore id, or
3481the undefined value if there is an error.
3482.Ip "semop(KEY,OPSTRING)" 8 4
3483Calls the System V IPC function semop to perform semaphore operations
3484such as signaling and waiting. OPSTRING must be a packed array of
3485semop structures. Each semop structure can be generated with
3486\&'pack("sss", $semnum, $semop, $semflag)'. The number of semaphore
3487operations is implied by the length of OPSTRING. Returns true if
3488successful, or false if there is an error. As an example, the
3489following code waits on semaphore $semnum of semaphore id $semid:
3492 $semop = pack("sss", $semnum, -1, 0);
3493 die "Semaphore trouble: $!\en" unless semop($semid, $semop);
3496To signal the semaphore, replace "-1" with "1".
3497.Ip "send(SOCKET,MSG,FLAGS,TO)" 8 4
3498.Ip "send(SOCKET,MSG,FLAGS)" 8
3499Sends a message on a socket.
3500Takes the same flags as the system call of the same name.
3501On unconnected sockets you must specify a destination to send TO.
3502Returns the number of characters sent, or the undefined value if
3503there is an error.
3504.Ip "setpgrp(PID,PGRP)" 8 4
Sets the current process group for the specified PID, 0 for the current
process.
Will produce a fatal error if used on a machine that doesn't implement
setpgrp(2).
3509.Ip "setpriority(WHICH,WHO,PRIORITY)" 8 4
3510Sets the current priority for a process, a process group, or a user.
3511(See setpriority(2).)
Will produce a fatal error if used on a machine that doesn't implement
setpriority(2).
3514.Ip "setsockopt(SOCKET,LEVEL,OPTNAME,OPTVAL)" 8 3
3515Sets the socket option requested.
3516Returns undefined if there is an error.
3517OPTVAL may be specified as undef if you don't want to pass an argument.
3518.Ip "shift(ARRAY)" 8 6
3519.Ip "shift ARRAY" 8
3520.Ip "shift" 8
3521Shifts the first value of the array off and returns it,
3522shortening the array by 1 and moving everything down.
3523If there are no elements in the array, returns the undefined value.
3524If ARRAY is omitted, shifts the @ARGV array in the main program, and the @_
3525array in subroutines.
3526(This is determined lexically.)
3527See also unshift(), push() and pop().
3528Shift() and unshift() do the same thing to the left end of an array that push()
3529and pop() do to the right end.
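The four operations side by side, as a small sketch (the array contents and the subroutine name first_arg are invented for the example):

```perl
@queue = ('b', 'c');
unshift(@queue, 'a');          # add at the left end:   a b c
push(@queue, 'd');             # add at the right end:  a b c d
$first = shift(@queue);        # take from the left:  'a', leaving b c d
$last  = pop(@queue);          # take from the right: 'd', leaving b c

# With no ARRAY, shift inside a subroutine works on @_.
sub first_arg { shift; }
$arg = &first_arg('x', 'y');
```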
3530.Ip "shmctl(ID,CMD,ARG)" 8 4
3531Calls the System V IPC function shmctl. If CMD is &IPC_STAT, then ARG
3532must be a variable which will hold the returned shmid_ds structure.
3533Returns like ioctl: the undefined value for error, "0 but true" for
3534zero, or the actual return value otherwise.
3535.Ip "shmget(KEY,SIZE,FLAGS)" 8 4
3536Calls the System V IPC function shmget. Returns the shared memory
3537segment id, or the undefined value if there is an error.
3538.Ip "shmread(ID,VAR,POS,SIZE)" 8 4
3539.Ip "shmwrite(ID,STRING,POS,SIZE)" 8
3540Reads or writes the System V shared memory segment ID starting at
3541position POS for size SIZE by attaching to it, copying in/out, and
3542detaching from it. When reading, VAR must be a variable which
3543will hold the data read. When writing, if STRING is too long,
3544only SIZE bytes are used; if STRING is too short, nulls are
written to fill out SIZE bytes. Returns true if successful, or
3546false if there is an error.
3547.Ip "shutdown(SOCKET,HOW)" 8 3
3548Shuts down a socket connection in the manner indicated by HOW, which has
3549the same interpretation as in the system call of the same name.
3550.Ip "sin(EXPR)" 8 4
3551.Ip "sin EXPR" 8
3552Returns the sine of EXPR (expressed in radians).
3553If EXPR is omitted, returns sine of $_.
3554.Ip "sleep(EXPR)" 8 6
3555.Ip "sleep EXPR" 8
3556.Ip "sleep" 8
3557Causes the script to sleep for EXPR seconds, or forever if no EXPR.
May be interrupted by sending the process a SIGALRM.
3559Returns the number of seconds actually slept.
.Ip "socket(SOCKET,DOMAIN,TYPE,PROTOCOL)" 8 3
Opens a socket of the specified kind and attaches it to filehandle SOCKET.
3562DOMAIN, TYPE and PROTOCOL are specified the same as for the system call
3563of the same name.
3564You may need to run h2ph on sys/socket.h to get the proper values handy
3565in a perl library file.
Returns true if successful.
3567See the example in the section on Interprocess Communication.
3568.Ip "socketpair(SOCKET1,SOCKET2,DOMAIN,TYPE,PROTOCOL)" 8 3
Creates an unnamed pair of sockets in the specified domain, of the specified
type.
3571DOMAIN, TYPE and PROTOCOL are specified the same as for the system call
3572of the same name.
3573If unimplemented, yields a fatal error.
Returns true if successful.
3575.Ip "sort(SUBROUTINE LIST)" 8 9
3576.Ip "sort(LIST)" 8
3577.Ip "sort SUBROUTINE LIST" 8
3578.Ip "sort LIST" 8
3579Sorts the LIST and returns the sorted array value.
3580Nonexistent values of arrays are stripped out.
3581If SUBROUTINE is omitted, sorts in standard string comparison order.
3582If SUBROUTINE is specified, gives the name of a subroutine that returns
3583an integer less than, equal to, or greater than 0,
3584depending on how the elements of the array are to be ordered.
(The <=> and cmp operators are extremely useful in such routines.)
3586In the interests of efficiency the normal calling code for subroutines
3587is bypassed, with the following effects: the subroutine may not be a recursive
3588subroutine, and the two elements to be compared are passed into the subroutine
3589not via @_ but as $a and $b (see example below).
3590They are passed by reference so don't modify $a and $b.
3591SUBROUTINE may be a scalar variable name, in which case the value provides
3592the name of the subroutine to use.
3597 sub byage {
        $age{$a} <=> $age{$b};	# presuming integers
3599 }
3600 @sortedclass = sort byage @class;
    sub reverse { $b cmp $a; }
3604 @harry = (\'dog\',\'cat\',\'x\',\'Cain\',\'Abel\');
3605 @george = (\'gone\',\'chased\',\'yz\',\'Punished\',\'Axed\');
3606 print sort @harry;
3607 # prints AbelCaincatdogx
3608 print sort reverse @harry;
3609 # prints xdogcatCainAbel
3610 print sort @george, \'to\', @harry;
3611 # prints AbelAxedCainPunishedcatchaseddoggonetoxyz
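The default string ordering is a frequent surprise with numbers, which is where <=> earns its keep. The following sketch uses invented data and subroutine names (bynum, bydesc, byage) to contrast the orderings.

```perl
sub bynum  { $a <=> $b; }            # ascending numeric
sub bydesc { $b <=> $a; }            # descending numeric

@nums    = (10, 9, 100, 1);
@default = sort @nums;               # string order:  1 10 100 9
@numeric = sort bynum @nums;         # numeric order: 1 9 10 100
@falling = sort bydesc @nums;        # reversed:      100 10 9 1

# Sorting keys of an associative array by their values.
%age = ('al', 30, 'bo', 25, 'cy', 41);
sub byage { $age{$a} <=> $age{$b}; }
@names = sort byage keys(%age);      # youngest first: bo al cy
```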
3614.Ip "splice(ARRAY,OFFSET,LENGTH,LIST)" 8 8
3615.Ip "splice(ARRAY,OFFSET,LENGTH)" 8
3616.Ip "splice(ARRAY,OFFSET)" 8
3617Removes the elements designated by OFFSET and LENGTH from an array, and
3618replaces them with the elements of LIST, if any.
3619Returns the elements removed from the array.
3620The array grows or shrinks as necessary.
3621If LENGTH is omitted, removes everything from OFFSET onward.
3622The following equivalencies hold (assuming $[ == 0):
3625 push(@a,$x,$y)\h'|3.5i'splice(@a,$#a+1,0,$x,$y)
3626 pop(@a)\h'|3.5i'splice(@a,-1)
3627 shift(@a)\h'|3.5i'splice(@a,0,1)
3628 unshift(@a,$x,$y)\h'|3.5i'splice(@a,0,0,$x,$y)
3629 $a[$x] = $y\h'|3.5i'splice(@a,$x,1,$y);
3631Example, assuming array lengths are passed before arrays:
3633 sub aeq { # compare two array values
3634 local(@a) = splice(@_,0,shift);
3635 local(@b) = splice(@_,0,shift);
3636 return 0 unless @a == @b; # same len?
3637 while (@a) {
3638 return 0 if pop(@a) ne pop(@b);
3639 }
3640 return 1;
3641 }
3642 if (&aeq($len,@foo[1..$len],0+@bar,@bar)) { ... }
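A short walk through removal, insertion, truncation and replacement, assuming $[ is 0. The array contents are made up for the illustration.

```perl
@a = ('a', 'b', 'c', 'd', 'e');

@removed = splice(@a, 1, 2);         # removes b c;     @a is now a d e
splice(@a, 1, 0, 'x', 'y');          # inserts x y;     @a is now a x y d e
@tail = splice(@a, 3);               # no LENGTH: removes d e; @a is a x y
splice(@a, 1, 1, 'B');               # replaces x by B; @a is now a B y
```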
3645.Ip "split(/PATTERN/,EXPR,LIMIT)" 8 8
3646.Ip "split(/PATTERN/,EXPR)" 8 8
3647.Ip "split(/PATTERN/)" 8
3648.Ip "split" 8
3649Splits a string into an array of strings, and returns it.
3650(If not in an array context, returns the number of fields found and splits
3651into the @_ array.
3652(In an array context, you can force the split into @_
3653by using ?? as the pattern delimiters, but it still returns the array value.))
3654If EXPR is omitted, splits the $_ string.
3655If PATTERN is also omitted, splits on whitespace (/[\ \et\en]+/).
3656Anything matching PATTERN is taken to be a delimiter separating the fields.
3657(Note that the delimiter may be longer than one character.)
3658If LIMIT is specified, splits into no more than that many fields (though it
3659may split into fewer).
3660If LIMIT is unspecified, trailing null fields are stripped (which
3661potential users of pop() would do well to remember).
3662A pattern matching the null string (not to be confused with a null pattern //,
3663which is just one member of the set of patterns matching a null string)
3664will split the value of EXPR into separate characters at each point it
3665matches that way.
3666For example:
3669 print join(\':\', split(/ */, \'hi there\'));
3672produces the output \*(L'h:i:t:h:e:r:e\*(R'.
The LIMIT parameter can be used to partially split a line:
3677 ($login, $passwd, $remainder) = split(\|/\|:\|/\|, $_, 3);
3680(When assigning to a list, if LIMIT is omitted, perl supplies a LIMIT one
3681larger than the number of variables in the list, to avoid unnecessary work.
3682For the list above LIMIT would have been 4 by default.
3683In time critical applications it behooves you not to split into
3684more fields than you really need.)
3686If the PATTERN contains parentheses, additional array elements are created
3687from each matching substring in the delimiter.
3689 split(/([,-])/,"1-10,20");
3691produces the array value
3693 (1,'-',10,',',20)
3695The pattern /PATTERN/ may be replaced with an expression to specify patterns
3696that vary at runtime.
3697(To do runtime compilation only once, use /$variable/o.)
3698As a special case, specifying a space (\'\ \') will split on white space
3699just as split with no arguments does, but leading white space does NOT
3700produce a null first field.
3701Thus, split(\'\ \') can be used to emulate
3702.IR awk 's
3703default behavior, whereas
3704split(/\ /) will give you as many null initial fields as there are
3705leading spaces.
3711 open(passwd, \'/etc/passwd\');
    while (<passwd>) {
        ($login, $passwd, $uid, $gid, $gcos, $home, $shell) = split(\|/\|:\|/\|);
3720 .\|.\|.
3721 }
3724(Note that $shell above will still have a newline on it. See chop().)
3725See also
3726.IR join .
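The rules above for trailing fields, LIMIT, the special space pattern and parenthesized delimiters can be seen together in this sketch. The input strings and array names are invented for the example.

```perl
@f = split(/,/, "a,b,,c,,");           # a, b, '', c: trailing null fields stripped
@g = split(/,/, "a,b,c,d", 2);         # a and "b,c,d": at most 2 fields
@h = split(' ', "  leading space");    # awk-style: no null first field
@i = split(/ /, "  leading space");    # two null fields for two leading spaces
@j = split(/([,-])/, "1-10,20");       # delimiters kept: 1 - 10 , 20
```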
3727.Ip "sprintf(FORMAT,LIST)" 8 4
3728Returns a string formatted by the usual printf conventions.
3729The * character is not supported.
3730.Ip "sqrt(EXPR)" 8 4
3731.Ip "sqrt EXPR" 8
Returns the square root of EXPR.
3733If EXPR is omitted, returns square root of $_.
3734.Ip "srand(EXPR)" 8 4
3735.Ip "srand EXPR" 8
Sets the random number seed for the
.I rand
operator.
If EXPR is omitted, does srand(time).
3740.Ip "stat(FILEHANDLE)" 8 8
3741.Ip "stat FILEHANDLE" 8
3742.Ip "stat(EXPR)" 8
3743.Ip "stat SCALARVARIABLE" 8
3744Returns a 13-element array giving the statistics for a file, either the file
3745opened via FILEHANDLE, or named by EXPR.
3746Typically used as follows:
3750 ($dev,$ino,$mode,$nlink,$uid,$gid,$rdev,$size,
3751 $atime,$mtime,$ctime,$blksize,$blocks)
3752 = stat($filename);
3755If stat is passed the special filehandle consisting of an underline,
3756no stat is done, but the current contents of the stat structure from
3757the last stat or filetest are returned.
3762 if (-x $file && (($d) = stat(_)) && $d < 0) {
3763 print "$file is executable NFS file\en";
3764 }
(This only works on machines for which the device number is negative under NFS.)
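A more portable sketch of reusing the stat structure through the underline filehandle follows. It stats the current directory, which is assumed to exist and be accessible; the variable names are illustrative only.

```perl
($dev, $ino, $mode, $nlink) = stat(".");   # one real stat call
$perms  = $mode & 07777;                   # keep only the permission bits
$is_dir = -d _;                            # filetest on the saved structure
($dev2, $ino2) = stat(_);                  # same structure again, no new stat
```

Each use of _ after the first call avoids another trip to the filesystem, which adds up when several tests are applied to the same file.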
3768.Ip "study(SCALAR)" 8 6
3769.Ip "study SCALAR" 8
3770.Ip "study"
3771Takes extra time to study SCALAR ($_ if unspecified) in anticipation of
3772doing many pattern matches on the string before it is next modified.
3773This may or may not save time, depending on the nature and number of patterns
3774you are searching on, and on the distribution of character frequencies in
3775the string to be searched\*(--you probably want to compare runtimes with and
3776without it to see which runs faster.
3777Those loops which scan for many short constant strings (including the constant
3778parts of more complex patterns) will benefit most.
3779You may have only one study active at a time\*(--if you study a different
3780scalar the first is \*(L"unstudied\*(R".
3781(The way study works is this: a linked list of every character in the string
to be searched is made, so we know, for example, where all the \*(L'k\*(R' characters
are.
3784From each search string, the rarest character is selected, based on some
3785static frequency tables constructed from some C programs and English text.
3786Only those places that contain this \*(L"rarest\*(R" character are examined.)
3788For example, here is a loop which inserts index producing entries before any line
3789containing a certain pattern:
3793 while (<>) {
3794 study;
3795 print ".IX foo\en" if /\ebfoo\eb/;
3796 print ".IX bar\en" if /\ebbar\eb/;
3797 print ".IX blurfl\en" if /\ebblurfl\eb/;
3798 .\|.\|.
3799 print;
3800 }
3803In searching for /\ebfoo\eb/, only those locations in $_ that contain \*(L'f\*(R'
3804will be looked at, because \*(L'f\*(R' is rarer than \*(L'o\*(R'.
3805In general, this is a big win except in pathological cases.
3806The only question is whether it saves you more time than it took to build
3807the linked list in the first place.
3809Note that if you have to look for strings that you don't know till runtime,
3810you can build an entire loop as a string and eval that to avoid recompiling
3811all your patterns all the time.
3812Together with undefining $/ to input entire files as one record, this can
3813be very fast, often faster than specialized programs like fgrep.
3814The following scans a list of files (@files)
3815for a list of words (@words), and prints out the names of those files that
3816contain a match:
3820 $search = \'while (<>) { study;\';
3821 foreach $word (@words) {
3822 $search .= "++\e$seen{\e$ARGV} if /\eb$word\eb/;\en";
3823 }
3824 $search .= "}";
3825 @ARGV = @files;
3826 undef $/;
3827 eval $search; # this screams
3828 $/ = "\en"; # put back to normal input delim
3829 foreach $file (sort keys(%seen)) {
3830 print $file, "\en";
3831 }
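The same trick can be tried without any files.
The following is only a sketch: the names @lines and %hits and the sample data are made up for illustration.
It builds the loop as a string, so that each word becomes a constant pattern, and then evals the whole thing once:

```perl
@words = ('foo', 'bar');
@lines = ("a foo here\n", "nothing\n", "crowbar\n", "bar none\n");

# Build the loop as text; each word becomes a constant pattern.
$code = 'foreach $line (@lines) {' . "\n";
foreach $word (@words) {
    $code .= "    ++\$hits{'$word'} if \$line =~ /\\b$word\\b/;\n";
}
$code .= "}\n";

eval $code;             # compile and run the generated loop once
die $@ if $@;           # report any syntax error in the generated text

foreach $word (sort keys(%hits)) {
    print "$word: $hits{$word}\n";
}
```

Words containing pattern metacharacters would need to be quoted before being pasted into the string.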
3834.Ip "substr(EXPR,OFFSET,LEN)" 8 2
3835.Ip "substr(EXPR,OFFSET)" 8 2
3836Extracts a substring out of EXPR and returns it.
3837First character is at offset 0, or whatever you've set $[ to.
3838If OFFSET is negative, starts that far from the end of the string.
3839If LEN is omitted, returns everything to the end of the string.
3840You can use the substr() function as an lvalue, in which case EXPR must
3841be an lvalue.
3842If you assign something shorter than LEN, the string will shrink, and
3843if you assign something longer than LEN, the string will grow to accommodate it.
To keep the string the same length you may need to pad or chop your value using
sprintf().
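A short sketch of both uses follows; the sample string and variable names are illustrative only, and $[ is assumed to have its default value of 0:

```perl
$s = "The quick fox";
$word = substr($s, 4, 5);          # 5 characters starting at offset 4
$tail = substr($s, -3);            # negative OFFSET counts from the end

substr($s, 4, 5) = "slow";         # shorter than LEN: the string shrinks
$shrunk = $s;

# Pad or truncate to exactly 4 characters to keep the length fixed.
substr($s, 4, 4) = sprintf("%-4.4s", "lazybones");

print "$word|$tail|$shrunk|$s\n";
```

The sprintf() format "%-4.4s" both pads short values and cuts long ones, so the assignment never changes the length of $s.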
3846.Ip "symlink(OLDFILE,NEWFILE)" 8 2
3847Creates a new filename symbolically linked to the old filename.
3848Returns 1 for success, 0 otherwise.
3849On systems that don't support symbolic links, produces a fatal error at
3850run time.
3851To check for that, use eval:
3854 $symlink_exists = (eval \'symlink("","");\', $@ eq \'\');
3857.Ip "syscall(LIST)" 8 6
3858.Ip "syscall LIST" 8
3859Calls the system call specified as the first element of the list, passing
3860the remaining elements as arguments to the system call.
3861If unimplemented, produces a fatal error.
3862The arguments are interpreted as follows: if a given argument is numeric,
3863the argument is passed as an int.
3864If not, the pointer to the string value is passed.
You are responsible for making sure that any string argument is pre-extended
long enough to receive any result that might be written into it.
3867If your integer arguments are not literals and have never been interpreted
3868in a numeric context, you may need to add 0 to them to force them to look
3869like numbers.
3872 require ''; # may need to run h2ph
3873 syscall(&SYS_write, fileno(STDOUT), "hi there\en", 9);
3877.Ip "sysread(FILEHANDLE,SCALAR,LENGTH)" 8 5
3878Attempts to read LENGTH bytes of data into variable SCALAR from the specified
3879FILEHANDLE, using the system call read(2).
It bypasses stdio, so mixing this with other kinds of reads may cause
confusion.
3882Returns the number of bytes actually read, or undef if there was an error.
3883SCALAR will be grown or shrunk to the length actually read.
3884An OFFSET may be specified to place the read data at some other place
3885than the beginning of the string.
3886.Ip "system(LIST)" 8 6
3887.Ip "system LIST" 8
3888Does exactly the same thing as \*(L"exec LIST\*(R" except that a fork
3889is done first, and the parent process waits for the child process to complete.
3890Note that argument processing varies depending on the number of arguments.
The return value is the exit status of the program as returned by the wait()
call.
3893To get the actual exit value divide by 256.
3894See also
3895.IR exec .
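As a small sketch of recovering the exit value, the following runs a second copy of perl that simply exits with 3.
The use of $^X to name the running perl and the one-line child script are choices made for this illustration, not part of the description above:

```perl
# The child is another perl whose whole program is "exit 3".
$status = system($^X, '-e', 'exit 3');

# The wait() status carries the exit value in its high byte.
$exit_value = int($status / 256);

print "exit value: $exit_value\n";
```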
3897.Ip "syswrite(FILEHANDLE,SCALAR,LENGTH)" 8 5
3898Attempts to write LENGTH bytes of data from variable SCALAR to the specified
3899FILEHANDLE, using the system call write(2).
It bypasses stdio, so mixing this with prints may cause
confusion.
3902Returns the number of bytes actually written, or undef if there was an error.
An OFFSET may be specified to write the data from some place in the string
other than the beginning.
3905.Ip "tell(FILEHANDLE)" 8 6
3906.Ip "tell FILEHANDLE" 8 6
3907.Ip "tell" 8
3908Returns the current file position for FILEHANDLE.
FILEHANDLE may be an expression whose value gives the name of the actual
filehandle.
3911If FILEHANDLE is omitted, assumes the file last read.
3912.Ip "telldir(DIRHANDLE)" 8 5
3913.Ip "telldir DIRHANDLE" 8
3914Returns the current position of the readdir() routines on DIRHANDLE.
3915Value may be given to seekdir() to access a particular location in
3916a directory.
3917Has the same caveats about possible directory compaction as the corresponding
3918system library routine.
3919.Ip "time" 8 4
3920Returns the number of non-leap seconds since 00:00:00 UTC, January 1, 1970.
3921Suitable for feeding to gmtime() and localtime().
3922.Ip "times" 8 4
3923Returns a four-element array giving the user and system times, in seconds, for this
3924process and the children of this process.
3926 ($user,$system,$cuser,$csystem) = times;
.Ip "tr/SEARCHLIST/REPLACEMENTLIST/cds" 8 5
.Ip "y/SEARCHLIST/REPLACEMENTLIST/cds" 8 5
Translates all occurrences of the characters found in the search list with
3931the corresponding character in the replacement list.
3932It returns the number of characters replaced or deleted.
3933If no string is specified via the =~ or !~ operator,
3934the $_ string is translated.
3935(The string specified with =~ must be a scalar variable, an array element,
3936or an assignment to one of those, i.e. an lvalue.)
For
.I sed
devotees,
.I y
is provided as a synonym for
.IR tr .
3944If the c modifier is specified, the SEARCHLIST character set is complemented.
3945If the d modifier is specified, any characters specified by SEARCHLIST that
3946are not found in REPLACEMENTLIST are deleted.
3947(Note that this is slightly more flexible than the behavior of some
3948.I tr
3949programs, which delete anything they find in the SEARCHLIST, period.)
3950If the s modifier is specified, sequences of characters that were translated
3951to the same character are squashed down to 1 instance of the character.
3953If the d modifier was used, the REPLACEMENTLIST is always interpreted exactly
3954as specified.
3955Otherwise, if the REPLACEMENTLIST is shorter than the SEARCHLIST,
3956the final character is replicated till it is long enough.
3957If the REPLACEMENTLIST is null, the SEARCHLIST is replicated.
3958This latter is useful for counting characters in a class, or for squashing
3959character sequences in a class.
.nf

	$ARGV[1] \|=~ \|y/A\-Z/a\-z/;	\h'|3i'# canonicalize to lower case

	$cnt = tr/*/*/;	\h'|3i'# count the stars in $_

	$cnt = tr/0\-9//;	\h'|3i'# count the digits in $_

	tr/a\-zA\-Z//s;	\h'|3i'# bookkeeper \-> bokeper

	($HOST = $host) =~ tr/a\-z/A\-Z/;

	y/a\-zA\-Z/ /cs;	\h'|3i'# change non-alphas to single space

	tr/\e200\-\e377/\e0\-\e177/;\h'|3i'# delete 8th bit

.fi
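To see the return value and the modifiers side by side, here is a small sketch on made-up sample strings (the variable names are illustrative only).
Each translation works on a copy so that the original stays intact:

```perl
$_ = "bookkeeper 1991";

$digits = tr/0-9//;                 # null REPLACEMENTLIST: count only

($squeezed = $_) =~ tr/a-zA-Z//s;   # squash runs of the same letter
($nodigits = $_) =~ tr/0-9//d;      # delete digits, nothing to map them to
($words = "hello, world!") =~ tr/a-zA-Z/ /cs;   # non-letters to one space

print "$digits|$squeezed|$nodigits|$words|\n";
```

Note that the counting form leaves $_ unchanged, since every character is translated to itself.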
3979.Ip "truncate(FILEHANDLE,LENGTH)" 8 4
3980.Ip "truncate(EXPR,LENGTH)" 8
Truncates the file opened on FILEHANDLE, or named by EXPR, to the specified
length.
3983Produces a fatal error if truncate isn't implemented on your system.
3984.Ip "umask(EXPR)" 8 4
3985.Ip "umask EXPR" 8
3986.Ip "umask" 8
3987Sets the umask for the process and returns the old one.
3988If EXPR is omitted, merely returns current umask.
3989.Ip "undef(EXPR)" 8 6
3990.Ip "undef EXPR" 8
3991.Ip "undef" 8
3992Undefines the value of EXPR, which must be an lvalue.
3993Use only on a scalar value, an entire array, or a subroutine name (using &).
3994(Undef will probably not do what you expect on most predefined variables or
3995dbm array values.)
3996Always returns the undefined value.
3997You can omit the EXPR, in which case nothing is undefined, but you still
3998get an undefined value that you could, for instance, return from a subroutine.
.nf

.ne 6
	undef $foo;
	undef $bar{'blurfl'};
	undef @ary;
	undef %assoc;
	undef &mysub;
	return (wantarray ? () : undef) if $they_blew_it;

.fi
4011.Ip "unlink(LIST)" 8 4
4012.Ip "unlink LIST" 8
4013Deletes a list of files.
4014Returns the number of files successfully deleted.
.nf

.ne 2
	$cnt = unlink \'a\', \'b\', \'c\';
	unlink @goners;
	unlink <*.bak>;

.fi
4023Note: unlink will not delete directories unless you are superuser and the
4024.B \-U
4025flag is supplied to
4026.IR perl .
4027Even if these conditions are met, be warned that unlinking a directory
4028can inflict damage on your filesystem.
4029Use rmdir instead.
4030.Ip "unpack(TEMPLATE,EXPR)" 8 4
4031Unpack does the reverse of pack: it takes a string representing
a structure and expands it out into an array value, returning the array
value.
4034(In a scalar context, it merely returns the first value produced.)
4035The TEMPLATE has the same format as in the pack function.
4036Here's a subroutine that does substring:
.nf

.ne 4
	sub substr {
		local($what,$where,$howmuch) = @_;
		unpack("x$where a$howmuch", $what);
	}

.ne 3
and then there's

	sub ord { unpack("c",$_[0]); }

.fi
4051In addition, you may prefix a field with a %<number> to indicate that
4052you want a <number>-bit checksum of the items instead of the items themselves.
4053Default is a 16-bit checksum.
4054For example, the following computes the same number as the System V sum program:
.nf

.ne 4
	while (<>) {
	    $checksum += unpack("%16C*", $_);
	}
	$checksum %= 65536;

.fi
4064.Ip "unshift(ARRAY,LIST)" 8 4
4065Does the opposite of a
4066.IR shift .
4067Or the opposite of a
4068.IR push ,
4069depending on how you look at it.
4070Prepends list to the front of the array, and returns the number of elements
4071in the new array.
4074 unshift(ARGV, \'\-e\') unless $ARGV[0] =~ /^\-/;
4077.Ip "utime(LIST)" 8 2
4078.Ip "utime LIST" 8 2
4079Changes the access and modification times on each file of a list of files.
4080The first two elements of the list must be the NUMERICAL access and
4081modification times, in that order.
4082Returns the number of files successfully changed.
4083The inode modification time of each file is set to the current time.
4084Example of a \*(L"touch\*(R" command:
.nf

.ne 3
	#!/usr/bin/perl
	$now = time;
	utime $now, $now, @ARGV;

.fi
4093.Ip "values(ASSOC_ARRAY)" 8 6
4094.Ip "values ASSOC_ARRAY" 8
Returns a normal array consisting of all the values of the named associative
array.
4097The values are returned in an apparently random order, but it is the same order
4098as either the keys() or each() function would produce on the same array.
4099See also keys() and each().
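A quick way to convince yourself of the ordering guarantee is to walk the two arrays in parallel.
This is only a sketch; the array %age and its contents are invented for the purpose:

```perl
%age = ('tom', 30, 'ann', 25, 'bob', 41);

@names = keys(%age);
@ages  = values(%age);

# The Nth value should belong to the Nth key.
$mismatch = 0;
foreach $i (0 .. $#names) {
    $mismatch++ unless $age{$names[$i]} == $ages[$i];
}
print "mismatches: $mismatch\n";
```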
4100.Ip "vec(EXPR,OFFSET,BITS)" 8 2
4101Treats a string as a vector of unsigned integers, and returns the value
4102of the bitfield specified.
4103May also be assigned to.
4104BITS must be a power of two from 1 to 32.
4106Vectors created with vec() can also be manipulated with the logical operators
4107|, & and ^,
which will assume a bit vector operation is desired when both operands are
strings.
4110This interpretation is not enabled unless there is at least one vec() in
4111your program, to protect older programs.
4113To transform a bit vector into a string or array of 0's and 1's, use these:
4116 $bits = unpack("b*", $vector);
4117 @bits = split(//, unpack("b*", $vector));
4120If you know the exact length in bits, it can be used in place of the *.
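The pieces above can be put together roughly as follows.
The variable names are illustrative, and single-bit elements (BITS of 1) are assumed throughout:

```perl
$vector = '';
vec($vector, 0, 1) = 1;         # set bits 0, 3 and 6
vec($vector, 3, 1) = 1;
vec($vector, 6, 1) = 1;

$other = '';
vec($other, 1, 1) = 1;          # a second vector with bit 1 set

$union = $vector | $other;      # both operands are strings: bit vector "or"

$bits  = unpack("b*", $vector);
$ubits = unpack("b*", $union);
@set   = grep(vec($union, $_, 1), 0 .. 7);   # indexes of the bits that are on

print "$bits $ubits @set\n";
```

The "b*" template lists bit 0 first, so the string reads in the same order as the vec() offsets.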
4121.Ip "wait" 8 6
4122Waits for a child process to terminate and returns the pid of the deceased
4123process, or -1 if there are no child processes.
4124The status is returned in $?.
4125.Ip "waitpid(PID,FLAGS)" 8 6
4126Waits for a particular child process to terminate and returns the pid of the deceased
4127process, or -1 if there is no such child process.
4128The status is returned in $?.
4129If you say
4132 require "sys/wait.h";
4133 .\|.\|.
4134 waitpid(-1,&WNOHANG);
4137then you can do a non-blocking wait for any process. Non-blocking wait
is only available on machines supporting either the
.I waitpid (2)
or
.I wait4 (2)
system calls.
4143However, waiting for a particular pid with FLAGS of 0 is implemented
4144everywhere. (Perl emulates the system call by remembering the status
4145values of processes that have exited but have not been harvested by the
4146Perl script yet.)
4147.Ip "wantarray" 8 4
4148Returns true if the context of the currently executing subroutine
4149is looking for an array value.
4150Returns false if the context is looking for a scalar.
4153 return wantarray ? () : undef;
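A minimal sketch of the two contexts, using a made-up subroutine named context:

```perl
sub context { wantarray ? 'array' : 'scalar'; }

@list   = &context();       # assignment to an array: array context
$scalar = &context();       # assignment to a scalar: scalar context

print "$list[0] $scalar\n";
```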
4156.Ip "warn(LIST)" 8 4
4157.Ip "warn LIST" 8
4158Produces a message on STDERR just like \*(L"die\*(R", but doesn't exit.
4159.Ip "write(FILEHANDLE)" 8 6
4160.Ip "write(EXPR)" 8
4161.Ip "write" 8
4162Writes a formatted record (possibly multi-line) to the specified file,
4163using the format associated with that file.
By default the format for a file is the one having the same name as the
4165filehandle, but the format for the current output channel (see
4166.IR select )
4167may be set explicitly
4168by assigning the name of the format to the $~ variable.
4170Top of form processing is handled automatically:
4171if there is insufficient room on the current page for the formatted
4172record, the page is advanced by writing a form feed,
4173a special top-of-page format is used
4174to format the new page header, and then the record is written.
4175By default the top-of-page format is \*(L"top\*(R", but it
4176may be set to the
4177format of your choice by assigning the name to the $^ variable.
4178The number of lines remaining on the current page is in variable $-, which
4179can be set to 0 to force a new page.
4181If FILEHANDLE is unspecified, output goes to the current default output channel,
which starts out as
STDOUT
but may be changed by the
.I select
operator.
4187If the FILEHANDLE is an EXPR, then the expression is evaluated and the
4188resulting string is used to look up the name of the FILEHANDLE at run time.
4189For more on formats, see the section on formats later on.
4191Note that write is NOT the opposite of read.
4192.Sh "Precedence"
4193.I Perl
4194operators have the following associativity and precedence:
.nf

nonassoc\h'|1i'print printf exec system sort reverse
\h'1.5i'chmod chown kill unlink utime die return
left\h'|1i',
right\h'|1i'= += \-= *= etc.
right\h'|1i'?:
nonassoc\h'|1i'.\|.
left\h'|1i'||
left\h'|1i'&&
left\h'|1i'| ^
left\h'|1i'&
nonassoc\h'|1i'== != <=> eq ne cmp
nonassoc\h'|1i'< > <= >= lt gt le ge
nonassoc\h'|1i'chdir exit eval reset sleep rand umask
nonassoc\h'|1i'\-r \-w \-x etc.
left\h'|1i'<< >>
left\h'|1i'+ \- .
left\h'|1i'* / % x
left\h'|1i'=~ !~
right\h'|1i'! ~ and unary minus
right\h'|1i'**
nonassoc\h'|1i'++ \-\|\-
left\h'|1i'\*(L'(\*(R'

.fi
4221As mentioned earlier, if any list operator (print, etc.) or
4222any unary operator (chdir, etc.)
4223is followed by a left parenthesis as the next token on the same line,
4224the operator and arguments within parentheses are taken to
4225be of highest precedence, just like a normal function call.
.nf

	chdir $foo || die;\h'|3i'# (chdir $foo) || die
	chdir($foo) || die;\h'|3i'# (chdir $foo) || die
	chdir ($foo) || die;\h'|3i'# (chdir $foo) || die
	chdir +($foo) || die;\h'|3i'# (chdir $foo) || die

but, because * is higher precedence than ||:

	chdir $foo * 20;\h'|3i'# chdir ($foo * 20)
	chdir($foo) * 20;\h'|3i'# (chdir $foo) * 20
	chdir ($foo) * 20;\h'|3i'# (chdir $foo) * 20
	chdir +($foo) * 20;\h'|3i'# chdir ($foo * 20)

	rand 10 * 20;\h'|3i'# rand (10 * 20)
	rand(10) * 20;\h'|3i'# (rand 10) * 20
	rand (10) * 20;\h'|3i'# (rand 10) * 20
	rand +(10) * 20;\h'|3i'# rand (10 * 20)

.fi
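Since rand gives a different answer each time, the same four forms are easier to check with a unary operator whose result is predictable.
The sketch below uses int, on the assumption that it parses like the other named unary operators:

```perl
$x1 = int 7.5 * 2;          # int (7.5 * 2)
$x2 = int(7.5) * 2;         # (int 7.5) * 2
$x3 = int (7.5) * 2;        # (int 7.5) * 2
$x4 = int +(7.5) * 2;       # int (7.5 * 2)

print "$x1 $x2 $x3 $x4\n";
```

The first and last forms truncate after the multiplication, the middle two before it.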
4247In the absence of parentheses,
4248the precedence of list operators such as print, sort or chmod is
4249either very high or very low depending on whether you look at the left
side of the operator or the right side of it.
4251For example, in
4254 @ary = (1, 3, sort 4, 2);
4255 print @ary; # prints 1324
4258the commas on the right of the sort are evaluated before the sort, but
4259the commas on the left are evaluated after.
4260In other words, list operators tend to gobble up all the arguments that
follow them, and then act like a simple term with regard to the preceding
expression.
4263Note that you have to be careful with parens:
.nf

.ne 3
	# These evaluate exit before doing the print:
	print($foo, exit);	# Obviously not what you want.
	print $foo, exit;	# Nor is this.

.ne 4
	# These do the print before evaluating exit:
	(print $foo), exit;	# This is what you want.
	print($foo), exit;	# Or this.
	print ($foo), exit;	# Or even this.

.fi
4277Also note that
4279 print ($foo & 255) + 1, "\en";
4282probably doesn't do what you expect at first glance.
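The reason is the parenthesis rule given earlier: the print is taken to be print($foo & 255), after which 1 is added to whatever print returned, and the newline is never printed.
The same effect can be observed without any output by using join, another list operator, in its place.
This is only a sketch with invented data:

```perl
# The parenthesis right after join ends its argument list early.
@a = (join ("-", "a", "b"), "c");

# Without that leading parenthesis, join gobbles everything to its right.
@b = (join "-", ("a", "b"), "c");

# Commas to the right of sort are evaluated first, those to the left after.
@ary = (1, 3, sort 4, 2);

print scalar(@a), " ", scalar(@b), " ", join('', @ary), "\n";
```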
4283.Sh "Subroutines"
4284A subroutine may be declared as follows:
4287 sub NAME BLOCK
4291Any arguments passed to the routine come in as array @_,
4292that is ($_[0], $_[1], .\|.\|.).
4293The array @_ is a local array, but its values are references to the
4294actual scalar parameters.
4295The return value of the subroutine is the value of the last expression
4296evaluated, and can be either an array value or a scalar value.
4297Alternately, a return statement may be used to specify the returned value and
4298exit the subroutine.
To create local variables see the
.I local
operator.
4303A subroutine is called using the
4304.I do
4305operator or the & operator.
.nf

.ne 12
	sub MAX {
		local($max) = pop(@_);
		foreach $foo (@_) {
			$max = $foo \|if \|$max < $foo;
		}
		$max;
	}

	.\|.\|.
	$bestday = &MAX($mon,$tue,$wed,$thu,$fri);

.ne 21
	# get a line, combining continuation lines
	#  that start with whitespace
	sub get_line {
		$thisline = $lookahead;
		line: while ($lookahead = <STDIN>) {
			if ($lookahead \|=~ \|/\|^[ \^\e\|t]\|/\|) {
				$thisline \|.= \|$lookahead;
			}
			else {
				last line;
			}
		}
		$thisline;
	}

	$lookahead = <STDIN>;	# get first line
	while ($_ = do get_line(\|)) {
		.\|.\|.
	}

.fi
.ne 6
4348Use array assignment to a local list to name your formal arguments:
4350 sub maybeset {
4351 local($key, $value) = @_;
4352 $foo{$key} = $value unless $foo{$key};
4353 }
4356This also has the effect of turning call-by-reference into call-by-value,
4357since the assignment copies the values.
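The difference is easy to see with two small subroutines, sketched here under invented names: one works directly on $_[0], the other on a local copy.

```perl
sub bump_in_place { $_[0]++; }      # $_[0] refers to the caller's variable

sub bump_copy {
    local($n) = @_;                 # the assignment copies the value
    $n++;
    $n;
}

$count = 5;
&bump_in_place($count);             # changes $count itself
$result = &bump_copy($count);       # leaves $count alone

print "$count $result\n";
```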
4359Subroutines may be called recursively.
4360If a subroutine is called using the & form, the argument list is optional.
4361If omitted, no @_ array is set up for the subroutine; the @_ array at the
time of the call is visible to the subroutine instead.
4365 do foo(1,2,3); # pass three arguments
4366 &foo(1,2,3); # the same
4368 do foo(); # pass a null list
4369 &foo(); # the same
4370 &foo; # pass no arguments\*(--more efficient
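The effect of leaving the list off can be shown by counting what the called routine sees.
The subroutine names in this sketch are made up:

```perl
sub count_args { scalar(@_); }

sub with_list    { &count_args(); }     # null list: callee gets an empty @_
sub without_list { &count_args; }       # no list: callee sees our own @_

$n1 = &with_list(1, 2, 3);
$n2 = &without_list(1, 2, 3);

print "$n1 $n2\n";
```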
4373.Sh "Passing By Reference"
4374Sometimes you don't want to pass the value of an array to a subroutine but
4375rather the name of it, so that the subroutine can modify the global copy
4376of it rather than working with a local copy.
4377In perl you can refer to all the objects of a particular name by prefixing
4378the name with a star: *foo.
4379When evaluated, it produces a scalar value that represents all the objects
4380of that name, including any filehandle, format or subroutine.
4381When assigned to within a local() operation, it causes the name mentioned
4382to refer to whatever * value was assigned to it.
.nf

	sub doubleary {
	    local(*someary) = @_;
	    foreach $elem (@someary) {
		$elem *= 2;
	    }
	}
	do doubleary(*foo);
	do doubleary(*bar);

.fi
4396Assignment to *name is currently recommended only inside a local().
4397You can actually assign to *name anywhere, but the previous referent of
4398*name may be stranded forever.
4399This may or may not bother you.
4401Note that scalars are already passed by reference, so you can modify scalar
4402arguments without using this mechanism by referring explicitly to the $_[nnn]
4403in question.
4404You can modify all the elements of an array by passing all the elements
4405as scalars, but you have to use the * mechanism to push, pop or change the
4406size of an array.
4407The * mechanism will probably be more efficient in any case.
4409Since a *name value contains unprintable binary data, if it is used as
4410an argument in a print, or as a %s argument in a printf or sprintf, it
4411then has the value '*name', just so it prints out pretty.
4413Even if you don't want to modify an array, this mechanism is useful for
4414passing multiple arrays in a single LIST, since normally the LIST mechanism
4415will merge all the array values so that you can't extract out the
4416individual arrays.
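As a sketch of both points, the following routine (the names are hypothetical) receives two arrays as separate * values, changes the size of the first, and can still tell the two apart:

```perl
sub append_sizes {
    local(*first, *second) = @_;        # one * value per array
    push(@first, scalar(@second));      # grows the caller's array
    scalar(@first) + scalar(@second);
}

@foo = (1, 2, 3);
@bar = ('a', 'b');
$total = &append_sizes(*foo, *bar);

print "@foo / @bar / $total\n";
```

Had the arrays been passed as (@foo, @bar), the routine would have seen a single list of five elements.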
4417.Sh "Regular Expressions"
4418The patterns used in pattern matching are regular expressions such as
4419those supplied in the Version 8 regexp routines.
4420(In fact, the routines are derived from Henry Spencer's freely redistributable
4421reimplementation of the V8 routines.)
4422In addition, \ew matches an alphanumeric character (including \*(L"_\*(R") and \eW a nonalphanumeric.
4423Word boundaries may be matched by \eb, and non-boundaries by \eB.
4424A whitespace character is matched by \es, non-whitespace by \eS.
4425A numeric character is matched by \ed, non-numeric by \eD.
4426You may use \ew, \es and \ed within character classes.
4427Also, \en, \er, \ef, \et and \eNNN have their normal interpretations.
4428Within character classes \eb represents backspace rather than a word boundary.
4429Alternatives may be separated by |.
4430The bracketing construct \|(\ .\|.\|.\ \|) may also be used, in which case \e<digit>
4431matches the digit'th substring.
4432(Outside of the pattern, always use $ instead of \e in front of the digit.
4433The scope of $<digit> (and $\`, $& and $\')
4434extends to the end of the enclosing BLOCK or eval string, or to
4435the next pattern match with subexpressions.
4436The \e<digit> notation sometimes works outside the current pattern, but should
4437not be relied upon.)
4438You may have as many parentheses as you wish. If you have more than 9
4439substrings, the variables $10, $11, ... refer to the corresponding
4440substring. Within the pattern, \e10, \e11,
4441etc. refer back to substrings if there have been at least that many left parens
before the backreference. Otherwise (for backward compatibility) \e10
4443is the same as \e010, a backspace,
4444and \e11 the same as \e011, a tab.
4445And so on.
4446(\e1 through \e9 are always backreferences.)
4448$+ returns whatever the last bracket match matched.
4449$& returns the entire matched string.
4450($0 used to return the same thing, but not any more.)
4451$\` returns everything before the matched string.
4452$\' returns everything after the matched string.
.nf

	s/\|^\|([^ \|]*\|) \|*([^ \|]*\|)\|/\|$2 $1\|/;	# swap first two words

.ne 5
	if (/\|Time: \|(.\|.\|):\|(.\|.\|):\|(.\|.\|)\|/\|) {
		$hours = $1;
		$minutes = $2;
		$seconds = $3;
	}

.fi
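The other match variables, and a backreference inside a pattern, can be sketched the same way.
The sample strings and variable names below are made up; the values are copied out at once, since the next successful match would replace them:

```perl
$_ = "Time: 12:34:56 elapsed";
if (/(\d\d):(\d\d):(\d\d)/) {
    # before the match, the match itself, after it, last bracket matched
    ($before, $match, $after, $last) = ($`, $&, $', $+);
}

# \1 inside the pattern refers back to the first bracketed substring.
$doubled = $1 if "bookkeeper" =~ /(\w)\1/;

print "[$before][$match][$after][$last][$doubled]\n";
```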
4466By default, the ^ character is only guaranteed to match at the beginning
4467of the string,
the $ character only at the end (or before the newline at the end)
and
4470.I perl
4471does certain optimizations with the assumption that the string contains
4472only one line.
4473The behavior of ^ and $ on embedded newlines will be inconsistent.
4474You may, however, wish to treat a string as a multi-line buffer, such that
4475the ^ will match after any newline within the string, and $ will match
4476before any newline.
4477At the cost of a little more overhead, you can do this by setting the variable
4478$* to 1.
4479Setting it back to 0 makes
4480.I perl