This is a live mirror of the Perl 5 development currently hosted at https://github.com/perl/perl5
=head1 NAME

perlop - Perl operators and precedence

=head1 SYNOPSIS

Perl operators have the following associativity and precedence,
listed from highest precedence to lowest. Note that all operators
borrowed from C keep the same precedence relationship with each other,
even where C's precedence is slightly screwy. (This makes learning
Perl easier for C folks.)

    left        terms and list operators (leftward)
    left        ->
    nonassoc    ++ --
    right       **
    right       ! ~ \ and unary + and -
    left        =~ !~
    left        * / % x
    left        + - .
    left        << >>
    nonassoc    named unary operators
    nonassoc    < > <= >= lt gt le ge
    nonassoc    == != <=> eq ne cmp
    left        &
    left        | ^
    left        &&
    left        ||
    nonassoc    ..
    right       ?:
    right       = += -= *= etc.
    left        , =>
    nonassoc    list operators (rightward)
    left        not
    left        and
    left        or xor

In the following sections, these operators are covered in precedence order.

=head1 DESCRIPTION

=head2 Terms and List Operators (Leftward)

A TERM has the highest precedence in Perl. These include variables,
quote and quotelike operators, any expression in parentheses,
and any function whose arguments are parenthesized. Actually, there
aren't really functions in this sense, just list operators and unary
operators behaving as functions because you put parentheses around
the arguments. These are all documented in L<perlfunc>.

If any list operator (print(), etc.) or any unary operator (chdir(), etc.)
is followed by a left parenthesis as the next token, the operator and
arguments within parentheses are taken to be of highest precedence,
just like a normal function call.

In the absence of parentheses, the precedence of list operators such as
C<print>, C<sort>, or C<chmod> is either very high or very low depending on
whether you look at the left side of the operator or the right side of it.
For example, in

    @ary = (1, 3, sort 4, 2);
    print @ary;         # prints 1324

the commas on the right of the sort are evaluated before the sort, but
the commas on the left are evaluated after. In other words, list
operators tend to gobble up all the arguments that follow them, and
then act like a simple TERM with regard to the preceding expression.
Note that you have to be careful with parens:

    # These evaluate exit before doing the print:
    print($foo, exit);  # Obviously not what you want.
    print $foo, exit;   # Nor is this.

    # These do the print before evaluating exit:
    (print $foo), exit; # This is what you want.
    print($foo), exit;  # Or this.
    print ($foo), exit; # Or even this.

Also note that

    print ($foo & 255) + 1, "\n";

probably doesn't do what you expect at first glance. See
L<Named Unary Operators> for more discussion of this.

Also parsed as terms are the C<do {}> and C<eval {}> constructs, as
well as subroutine and method calls, and the anonymous
constructors C<[]> and C<{}>.

See also L<Quote and Quotelike Operators> toward the end of this section,
as well as L<I/O Operators>.

=head2 The Arrow Operator

Just as in C and C++, "C<-E<gt>>" is an infix dereference operator. If the
right side is either a C<[...]> or C<{...}> subscript, then the left side
must be either a hard or symbolic reference to an array or hash (or
a location capable of holding a hard reference, if it's an lvalue (assignable)).
See L<perlref>.

Otherwise, the right side is a method name or a simple scalar variable
containing the method name, and the left side must either be an object
(a blessed reference) or a class name (that is, a package name).
See L<perlobj>.

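A minimal sketch of both uses of the arrow. The C<Counter> class and
every variable name here are made up for illustration; none of them
come with Perl.

```perl
use strict;
use warnings;

# Hypothetical class, defined only for this illustration.
package Counter;
sub new  { my ($class) = @_; return bless { count => 0 }, $class }
sub bump { my ($self)  = @_; return ++$self->{count} }

package main;

my $aref = [10, 20, 30];
my $href = { name => 'camel' };

my $second = $aref->[1];      # [...] subscript through an array reference
my $name   = $href->{name};   # {...} subscript through a hash reference

my $obj    = Counter->new;    # class name on the left, method on the right
my $method = 'bump';
$obj->bump;                   # object on the left
my $count  = $obj->$method;   # method name held in a simple scalar

print "$second $name $count\n";   # prints "20 camel 2"
```
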
=head2 Autoincrement and Autodecrement

"++" and "--" work as in C. That is, if placed before a variable, they
increment or decrement the variable before returning the value, and if
placed after, increment or decrement the variable after returning the value.

The autoincrement operator has a little extra built-in magic to it. If
you increment a variable that is numeric, or that has ever been used in
a numeric context, you get a normal increment. If, however, the
variable has only been used in string contexts since it was set, and
has a value that is not null and matches the pattern
C</^[a-zA-Z]*[0-9]*$/>, the increment is done as a string, preserving each
character within its range, with carry:

    print ++($foo = '99');      # prints '100'
    print ++($foo = 'a0');      # prints 'a1'
    print ++($foo = 'Az');      # prints 'Ba'
    print ++($foo = 'zz');      # prints 'aaa'

The autodecrement operator is not magical.

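As a rough illustration of the carry rule, and of the lack of magic in
autodecrement, here is a short sketch. The variable names are
arbitrary, and the C<no warnings 'numeric'> line is only there to keep
the deliberate string-to-number conversion quiet.

```perl
use strict;
use warnings;

my @out;
for my $start ('a9', 'Zz', 'zz', 'az') {
    my $s = $start;     # only ever used as a string so far
    $s++;               # magical string increment, with carry
    push @out, $s;
}
print "@out\n";         # prints "b0 AAa aaa ba"

my $n = 'aa';
{
    no warnings 'numeric';
    $n--;               # no magic: 'aa' is treated as 0, giving -1
}
print "$n\n";           # prints "-1"
```
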
=head2 Exponentiation

Binary "**" is the exponentiation operator. Note that it binds even more
tightly than unary minus, so -2**4 is -(2**4), not (-2)**4. (This is
implemented using C's pow(3) function, which actually works on doubles
internally.)

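The binding and the right associativity listed in the precedence table
can be checked with a few lines. This is only a sketch with arbitrary
variable names.

```perl
use strict;
use warnings;

my $neg   = -2 ** 4;       # parsed as -(2 ** 4)
my $paren = (-2) ** 4;     # parentheses force the other reading
my $chain = 2 ** 3 ** 2;   # right associative: 2 ** (3 ** 2)

print "$neg $paren $chain\n";   # prints "-16 16 512"
```
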
=head2 Symbolic Unary Operators

Unary "!" performs logical negation, i.e. "not". See also C<not> for a lower
precedence version of this.

Unary "-" performs arithmetic negation if the operand is numeric. If
the operand is an identifier, a string consisting of a minus sign
concatenated with the identifier is returned. Otherwise, if the string
starts with a plus or minus, a string starting with the opposite sign
is returned. One effect of these rules is that C<-bareword> is equivalent
to C<"-bareword">.

Unary "~" performs bitwise negation, i.e. 1's complement.

Unary "+" has no effect whatsoever, even on strings. It is useful
syntactically for separating a function name from a parenthesized expression
that would otherwise be interpreted as the complete list of function
arguments. (See examples above under L<Terms and List Operators (Leftward)>.)

Unary "\" creates a reference to whatever follows it. See L<perlref>.
Do not confuse this behavior with the behavior of backslash within a
string, although both forms do convey the notion of protecting the next
thing from interpretation.

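A small sketch of the symbolic unary operators at work. All names are
illustrative. The C<& 0xFF> mask is an assumption added here so that
the result of "~" does not depend on the machine's integer width.

```perl
use strict;
use warnings;

my $word  = 'foo';
my $minus = -$word;          # identifier string: yields '-foo'
my $plus  = -$minus;         # leading minus flips to plus: '+foo'
my $not   = !1;              # logical negation of a true value: ''
my $low8  = ~0x0F & 0xFF;    # 1's complement, masked to 8 bits: 240

my $ref = \$word;            # reference to $word
$$ref = 'bar';               # assigning through it changes $word

print "$minus $plus [$not] $low8 $word\n";   # prints "-foo +foo [] 240 bar"
```
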
=head2 Binding Operators

Binary "=~" binds an expression to a pattern match. Certain operations
search or modify the string $_ by default. This operator makes that kind
of operation work on some other string. The right argument is a search
pattern, substitution, or translation. The left argument is what is
supposed to be searched, substituted, or translated instead of the default
$_. The return value indicates the success of the operation. (If the
right argument is an expression rather than a search pattern,
substitution, or translation, it is interpreted as a search pattern at run
time. This is less efficient than an explicit search, since the pattern
must be compiled every time the expression is evaluated--unless you've
used C</o>.)

Binary "!~" is just like "=~" except the return value is negated in
the logical sense.

=head2 Multiplicative Operators

Binary "*" multiplies two numbers.

Binary "/" divides two numbers.

Binary "%" computes the modulus of the two numbers.

Binary "x" is the repetition operator. In a scalar context, it
returns a string consisting of the left operand repeated the number of
times specified by the right operand. In a list context, if the left
operand is a list in parens, it repeats the list.

    print '-' x 80;             # print row of dashes

    print "\t" x ($tab/8), ' ' x ($tab%8);      # tab over

    @ones = (1) x 80;           # a list of 80 1's
    @ones = (5) x @ones;        # set all elements to 5

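The difference between string repetition and list repetition is
perhaps easiest to see side by side. A minimal sketch with arbitrary
variable names:

```perl
use strict;
use warnings;

my $rule  = '-' x 5;         # scalar context: repeat the string
my @pair  = (1, 2) x 2;      # list in parens, list context: repeat the list
my @fives = (5) x @pair;     # right operand is the element count of @pair
my $rem   = 17 % 5;          # modulus

print "$rule | @pair | @fives | $rem\n";
# prints "----- | 1 2 1 2 | 5 5 5 5 | 2"
```
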
=head2 Additive Operators

Binary "+" returns the sum of two numbers.

Binary "-" returns the difference of two numbers.

Binary "." concatenates two strings.

=head2 Shift Operators

Binary "<<" returns the value of its left argument shifted left by the
number of bits specified by the right argument. Arguments should be
integers.

Binary ">>" returns the value of its left argument shifted right by the
number of bits specified by the right argument. Arguments should be
integers.

=head2 Named Unary Operators

The various named unary operators are treated as functions with one
argument, with optional parentheses. These include the filetest
operators, like C<-f>, C<-M>, etc. See L<perlfunc>.

If any list operator (print(), etc.) or any unary operator (chdir(), etc.)
is followed by a left parenthesis as the next token, the operator and
arguments within parentheses are taken to be of highest precedence,
just like a normal function call. Examples:

    chdir $foo    || die;       # (chdir $foo) || die
    chdir($foo)   || die;       # (chdir $foo) || die
    chdir ($foo)  || die;       # (chdir $foo) || die
    chdir +($foo) || die;       # (chdir $foo) || die

but, because * is higher precedence than ||:

    chdir $foo * 20;            # chdir ($foo * 20)
    chdir($foo) * 20;           # (chdir $foo) * 20
    chdir ($foo) * 20;          # (chdir $foo) * 20
    chdir +($foo) * 20;         # chdir ($foo * 20)

    rand 10 * 20;               # rand (10 * 20)
    rand(10) * 20;              # (rand 10) * 20
    rand (10) * 20;             # (rand 10) * 20
    rand +(10) * 20;            # rand (10 * 20)

See also L<Terms and List Operators (Leftward)>.

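Because C<rand> and C<chdir> are awkward to check by eye, the same
rules can be sketched with C<length> and C<ref>, which are also named
unary operators. The variable names are arbitrary.

```perl
use strict;
use warnings;

my $s    = 'ab';
my $data = [1, 2];

my $whole = length $s x 2;            # x binds tighter: length($s x 2)
my $paren = length($s) x 2;           # parens win: (length $s) x 2
my $is_array = ref $data eq 'ARRAY';  # eq binds looser: (ref $data) eq 'ARRAY'

print "$whole $paren $is_array\n";    # prints "4 22 1"
```
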
=head2 Relational Operators

Binary "<" returns true if the left argument is numerically less than
the right argument.

Binary ">" returns true if the left argument is numerically greater
than the right argument.

Binary "<=" returns true if the left argument is numerically less than
or equal to the right argument.

Binary ">=" returns true if the left argument is numerically greater
than or equal to the right argument.

Binary "lt" returns true if the left argument is stringwise less than
the right argument.

Binary "gt" returns true if the left argument is stringwise greater
than the right argument.

Binary "le" returns true if the left argument is stringwise less than
or equal to the right argument.

Binary "ge" returns true if the left argument is stringwise greater
than or equal to the right argument.

=head2 Equality Operators

Binary "==" returns true if the left argument is numerically equal to
the right argument.

Binary "!=" returns true if the left argument is numerically not equal
to the right argument.

Binary "<=>" returns -1, 0, or 1 depending on whether the left argument
is numerically less than, equal to, or greater than the right argument.

Binary "eq" returns true if the left argument is stringwise equal to
the right argument.

Binary "ne" returns true if the left argument is stringwise not equal
to the right argument.

Binary "cmp" returns -1, 0, or 1 depending on whether the left argument
is stringwise less than, equal to, or greater than the right argument.

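The numeric and stringwise families give different answers on the same
data, which matters most when sorting. A short sketch, with names
chosen only for illustration:

```perl
use strict;
use warnings;

my @data = (10, 9, 100);

my @numeric = sort { $a <=> $b } @data;   # numeric order
my @string  = sort { $a cmp $b } @data;   # stringwise order

my $num_lt = (10 < 9)      ? 'yes' : 'no';   # numerically, 10 is not less than 9
my $str_lt = ('10' lt '9') ? 'yes' : 'no';   # stringwise, '1' sorts before '9'

print "@numeric | @string | $num_lt $str_lt\n";
# prints "9 10 100 | 10 100 9 | no yes"
```
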
=head2 Bitwise And

Binary "&" returns its operands ANDed together bit by bit.

=head2 Bitwise Or and Exclusive Or

Binary "|" returns its operands ORed together bit by bit.

Binary "^" returns its operands XORed together bit by bit.

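A quick sketch of the three bitwise operators on two small integers.
The values and names are arbitrary.

```perl
use strict;
use warnings;

my ($x, $y) = (12, 10);     # binary 1100 and 1010

my $bits = sprintf "%04b %04b %04b", $x & $y, $x | $y, $x ^ $y;
print "$bits\n";            # prints "1000 1110 0110"
```
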
=head2 C-style Logical And

Binary "&&" performs a short-circuit logical AND operation. That is,
if the left operand is false, the right operand is not even evaluated.
Scalar or list context propagates down to the right operand if it
is evaluated.

=head2 C-style Logical Or

Binary "||" performs a short-circuit logical OR operation. That is,
if the left operand is true, the right operand is not even evaluated.
Scalar or list context propagates down to the right operand if it
is evaluated.

The C<||> and C<&&> operators differ from C's in that, rather than returning
0 or 1, they return the last value evaluated. Thus, a reasonably portable
way to find out the home directory (assuming it's not "0") might be:

    $home = $ENV{'HOME'} || $ENV{'LOGDIR'} ||
        (getpwuid($<))[7] || die "You're homeless!\n";

As more readable alternatives to C<&&> and C<||>, Perl provides "and" and
"or" operators (see below). The short-circuit behavior is identical. The
precedence of "and" and "or" is much lower, however, so that you can
safely use them after a list operator without the need for
parentheses:

    unlink "alpha", "beta", "gamma"
            or gripe(), next LINE;

With the C-style operators that would have been written like this:

    unlink("alpha", "beta", "gamma")
            || (gripe(), next LINE);

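The "last value evaluated" rule and the short-circuiting can be
sketched as follows. The C<%config> hash and the C<note> helper are
hypothetical, invented for this example. The port line also shows why
the text above adds the caveat about a value of "0".

```perl
use strict;
use warnings;

my %config = (name => '', port => 0);

my $name = $config{name} || 'anonymous';   # '' is false, so the default wins
my $port = $config{port} || 8080;          # 0 is false too, so 0 is replaced
my $both = 'left' && 'right';              # both true: the last value, 'right'

# Hypothetical helper: records its label, then returns the given value.
my @calls;
sub note { push @calls, $_[0]; return $_[1] }

note('a', 1) || note('b', 1);   # left is true: 'b' is never evaluated
note('c', 0) && note('d', 1);   # left is false: 'd' is never evaluated

print "$name $port $both @calls\n";   # prints "anonymous 8080 right a c"
```
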
=head2 Range Operator

Binary ".." is the range operator, which is really two different
operators depending on the context. In a list context, it returns an
array of values counting (by ones) from the left value to the right
value. This is useful for writing C<for (1..10)> loops and for doing
slice operations on arrays. Be aware that under the current implementation,
a temporary array is created, so you'll burn a lot of memory if you
write something like this:

    for (1 .. 1_000_000) {
        # code
    }

In a scalar context, ".." returns a boolean value. The operator is
bistable, like a flip-flop, and emulates the line-range (comma) operator
of B<sed>, B<awk>, and various editors. Each ".." operator maintains its
own boolean state. It is false as long as its left operand is false.
Once the left operand is true, the range operator stays true until the
right operand is true, I<AFTER> which the range operator becomes false
again. (It doesn't become false till the next time the range operator is
evaluated. It can test the right operand and become false on the same
evaluation it became true (as in B<awk>), but it still returns true once.
If you don't want it to test the right operand till the next evaluation
(as in B<sed>), use three dots ("...") instead of two.) The right
operand is not evaluated while the operator is in the "false" state, and
the left operand is not evaluated while the operator is in the "true"
state. The precedence is a little lower than || and &&. The value
returned is either the null string for false, or a sequence number
(beginning with 1) for true. The sequence number is reset for each range
encountered. The final sequence number in a range has the string "E0"
appended to it, which doesn't affect its numeric value, but gives you
something to search for if you want to exclude the endpoint. You can
exclude the beginning point by waiting for the sequence number to be
greater than 1. If either operand of scalar ".." is a numeric literal,
that operand is implicitly compared to the C<$.> variable, the current
line number. Examples:

As a scalar operator:

    if (101 .. 200) { print; }  # print 2nd hundred lines
    next line if (1 .. /^$/);   # skip header lines
    s/^/> / if (/^$/ .. eof()); # quote body

As a list operator:

    for (101 .. 200) { print; }         # print $_ 100 times
    @foo = @foo[$[ .. $#foo];           # an expensive no-op
    @foo = @foo[$#foo-4 .. $#foo];      # slice last 5 items

The range operator (in a list context) makes use of the magical
autoincrement algorithm if the operands are strings. You
can say

    @alphabet = ('A' .. 'Z');

to get all the letters of the alphabet, or

    $hexdigit = (0 .. 9, 'a' .. 'f')[$num & 15];

to get a hexadecimal digit, or

    @z2 = ('01' .. '31');  print $z2[$mday];

to get dates with leading zeros. If the final value specified is not
in the sequence that the magical increment would produce, the sequence
goes until the next value would be longer than the final value
specified.

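The sequence numbers returned by the scalar form are easier to follow
on concrete input. The following sketch runs the flip-flop over a
made-up list of lines rather than a file, so neither operand is
compared with C<$.>, and then shows two string ranges in list context.

```perl
use strict;
use warnings;

my @lines = ('intro', 'BEGIN', 'body', 'END', 'outro');

my @state;
for (@lines) {
    my $seq = /BEGIN/ .. /END/;   # scalar context: the flip-flop form
    push @state, $seq;
}
print join('|', @state), "\n";    # prints "|1|2|3E0|"

my @letters = ('aa' .. 'ad');     # list context: magical string increment
my @days    = ('08' .. '11');     # leading zeros are preserved
print "@letters @days\n";         # prints "aa ab ac ad 08 09 10 11"
```

The null strings at either end are the "false" results for the lines
outside the range, and the C<E0> suffix marks the line that matched
the right operand.
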
=head2 Conditional Operator

Ternary "?:" is the conditional operator, just as in C. It works much
like an if-then-else. If the argument before the ? is true, the
argument before the : is returned, otherwise the argument after the :
is returned. For example:

    printf "I have %d dog%s.\n", $n,
            ($n == 1) ? '' : "s";

Scalar or list context propagates downward into the 2nd
or 3rd argument, whichever is selected.

    $a = $ok ? $b : $c;  # get a scalar
    @a = $ok ? @b : @c;  # get an array
    $a = $ok ? @b : @c;  # oops, that's just a count!

The operator may be assigned to if both the 2nd and 3rd arguments are
legal lvalues (meaning that you can assign to them):

    ($a_or_b ? $a : $b) = $c;

This is not necessarily guaranteed to contribute to the readability of your program.

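A minimal sketch of the lvalue form and of context propagation, using
arbitrary variable names:

```perl
use strict;
use warnings;

my ($x, $y) = (1, 2);
my $use_x   = 0;

($use_x ? $x : $y) = 99;          # condition is false, so $y is assigned

my @list  = (4, 5, 6);
my $count = $use_x ? 0 : @list;   # scalar context reaches @list: a count

print "$x $y $count\n";           # prints "1 99 3"
```
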
=head2 Assignment Operators

"=" is the ordinary assignment operator.

Assignment operators work as in C. That is,

    $a += 2;

is equivalent to

    $a = $a + 2;

although without duplicating any side effects that dereferencing the lvalue
might trigger, such as from tie(). Other assignment operators work similarly.
The following are recognized:

    **=    +=    *=    &=    <<=    &&=
           -=    /=    |=    >>=    ||=
           .=    %=    ^=
                 x=

Note that while these are grouped by family, they all have the precedence
of assignment.

Unlike in C, the assignment operator produces a valid lvalue. Modifying
an assignment is equivalent to doing the assignment and then modifying
the variable that was assigned to. This is useful for modifying
a copy of something, like this:

    ($tmp = $global) =~ tr [A-Z] [a-z];

Likewise,

    ($a += 2) *= 3;

is equivalent to

    $a += 2;
    $a *= 3;

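The same points in a form that can be run under C<use strict>. This is
a sketch only, and the variable names are arbitrary.

```perl
use strict;
use warnings;

my $n = 1;
($n += 2) *= 3;                          # $n becomes 3, then 9

my $global = 'MiXeD';
(my $lower = $global) =~ tr/A-Z/a-z/;    # modify the copy, not $global

my $s = 'ab';
$s x= 3;                                 # same as $s = $s x 3

my $unset;
$unset ||= 'default';                    # assign only if currently false

print "$n $global $lower $s $unset\n";   # prints "9 MiXeD mixed ababab default"
```
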
=head2 Comma Operator

Binary "," is the comma operator. In a scalar context it evaluates
its left argument, throws that value away, then evaluates its right
argument and returns that value. This is just like C's comma operator.

In a list context, it's just the list argument separator, and inserts
both its arguments into the list.

The => digraph is mostly just a synonym for the comma operator. It's useful for
documenting arguments that come in pairs. As of release 5.001, it also forces
any word to the left of it to be interpreted as a string.

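To make the scalar and list behaviors visible, the sketch below uses a
hypothetical C<step> helper that logs each call. The helper and all
names are invented for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: logs its argument and returns it.
my @log;
sub step { push @log, $_[0]; return $_[0] }

my $last = (step('a'), step('b'), step('c'));   # scalar: keeps only the last
my @all  = (step('d'), step('e'));              # list: keeps every value

my %opt = (verbose => 1, depth => 2);   # words left of => are taken as strings

print "$last | @all | @log\n";          # prints "c | d e | a b c d e"
```
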
=head2 List Operators (Rightward)

On its right side, a list operator has very low precedence,
such that it controls all comma-separated expressions found there.
The only operators with lower precedence are the logical operators
"and", "or", and "not", which may be used to evaluate calls to list
operators without the need for extra parentheses:

    open HANDLE, "filename"
        or die "Can't open: $!\n";

See also discussion of list operators in L<Terms and List Operators (Leftward)>.

=head2 Logical Not

Unary "not" returns the logical negation of the expression to its right.
It's the equivalent of "!" except for the very low precedence.

=head2 Logical And

Binary "and" returns the logical conjunction of the two surrounding
expressions. It's equivalent to && except for the very low
precedence. This means that it short-circuits: i.e. the right
expression is evaluated only if the left expression is true.

=head2 Logical Or and Exclusive Or

Binary "or" returns the logical disjunction of the two surrounding
expressions. It's equivalent to || except for the very low
precedence. This means that it short-circuits: i.e. the right
expression is evaluated only if the left expression is false.

Binary "xor" returns the exclusive-OR of the two surrounding expressions.
It cannot short-circuit, of course.

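The practical effect of the lower precedence shows up when these
operators follow a list operator. The sketch below uses C<join> as the
list operator. The variable names are arbitrary.

```perl
use strict;
use warnings;

my $tight = join '-', 'a', '' || 'b';     # join('-', 'a', ('' || 'b'))
my $loose = (join '-', 'a', '' or 'b');   # (join('-', 'a', '')) or 'b'

my $bang = !1 + 1;                        # (!1) + 1
my $not  = (not 1 + 1);                   # not (1 + 1)

my $xor  = (1 xor 0) ? 'true' : 'false';  # exactly one side is true

print "$tight $loose $bang [$not] $xor\n";   # prints "a-b a- 1 [] true"
```
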
=head2 C Operators Missing From Perl

Here is what C has that Perl doesn't:

=over 8

=item unary &

Address-of operator. (But see the "\" operator for taking a reference.)

=item unary *

Dereference-address operator. (Perl's prefix dereferencing
operators are typed: $, @, %, and &.)

=item (TYPE)

Type casting operator.

=back

=head2 Quote and Quotelike Operators

While we usually think of quotes as literal values, in Perl they
function as operators, providing various kinds of interpolating and
pattern matching capabilities. Perl provides customary quote characters
for these behaviors, but also provides a way for you to choose your
quote character for any of them. In the following table, a C<{}> represents
any pair of delimiters you choose. Non-bracketing delimiters use
the same character fore and aft, but the 4 sorts of brackets
(round, angle, square, curly) will all nest.

    Customary  Generic     Meaning    Interpolates
        ''       q{}       Literal         no
        ""      qq{}       Literal         yes
        ``      qx{}       Command         yes
                qw{}      Word list        no
        //       m{}    Pattern match      yes
                 s{}{}   Substitution      yes
                tr{}{}   Translation       no

For constructs that do interpolation, variables beginning with "C<$>" or "C<@>"
are interpolated, as are the following sequences:

    \t          tab
    \n          newline
    \r          return
    \f          form feed
    \v          vertical tab, whatever that is
    \b          backspace
    \a          alarm (bell)
    \e          escape
    \033        octal char
    \x1b        hex char
    \c[         control char
    \l          lowercase next char
    \u          uppercase next char
    \L          lowercase till \E
    \U          uppercase till \E
    \E          end case modification
    \Q          quote regexp metacharacters till \E

Patterns are subject to an additional level of interpretation as a
regular expression. This is done as a second pass, after variables are
interpolated, so that regular expressions may be incorporated into the
pattern from the variables. If this is not what you want, use C<\Q> to
interpolate a variable literally.

Apart from the above, there are no multiple levels of interpolation. In
particular, contrary to the expectations of shell programmers, backquotes
do I<NOT> interpolate within double quotes, nor do single quotes impede
evaluation of variables when used within double quotes.

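A short sketch contrasting the interpolating and non-interpolating
forms, together with two of the case and quoting escapes from the
table. All names are arbitrary.

```perl
use strict;
use warnings;

my $who  = 'world';
my @list = (1, 2, 3);

my $single = q{Hello, $who\n};         # no interpolation: kept literally
my $double = qq{Hello, $who: @list};   # $ and @ variables are interpolated
my $cased  = "\u\LHELLO wORLD\E!";     # lowercase all, then uppercase first
my $quoted = "\Qa.b*c\E";              # regexp metacharacters get backslashed
my @words  = qw(alpha beta gamma);     # word list, no interpolation

print "$double\n";                     # prints "Hello, world: 1 2 3"
print "$cased $quoted\n";              # prints "Hello world! a\.b\*c"
```
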
=head2 Regexp Quotelike Operators

Here are the quotelike operators that apply to pattern
matching and related activities.

=over 8

=item ?PATTERN?

This is just like the C</pattern/> search, except that it matches only
once between calls to the reset() operator. This is a useful
optimization when you only want to see the first occurrence of
something in each file of a set of files, for instance. Only C<??>
patterns local to the current package are reset.

This usage is vaguely deprecated, and may be removed in some future
version of Perl.

=item m/PATTERN/gimosx

=item /PATTERN/gimosx

Searches a string for a pattern match, and in a scalar context returns
true (1) or false (''). If no string is specified via the C<=~> or
C<!~> operator, the $_ string is searched. (The string specified with
C<=~> need not be an lvalue--it may be the result of an expression
evaluation, but remember the C<=~> binds rather tightly.) See also
L<perlre>.

Options are:

    g   Match globally, i.e. find all occurrences.
    i   Do case-insensitive pattern matching.
    m   Treat string as multiple lines.
    o   Only compile pattern once.
    s   Treat string as single line.
    x   Use extended regular expressions.

If "/" is the delimiter then the initial C<m> is optional. With the C<m>
you can use any pair of non-alphanumeric, non-whitespace characters as
delimiters. This is particularly useful for matching Unix path names
that contain "/", to avoid LTS (leaning toothpick syndrome).

PATTERN may contain variables, which will be interpolated (and the
pattern recompiled) every time the pattern search is evaluated. (Note
that C<$)> and C<$|> might not be interpolated because they look like
end-of-string tests.) If you want such a pattern to be compiled only
once, add a C</o> after the trailing delimiter. This avoids expensive
run-time recompilations, and is useful when the value you are
interpolating won't change over the life of the script. However, mentioning
C</o> constitutes a promise that you won't change the variables in the pattern.
If you change them, Perl won't even notice.

If the PATTERN evaluates to a null string, the last
successfully executed regular expression is used instead.

If used in a context that requires a list value, a pattern match returns a
list consisting of the subexpressions matched by the parentheses in the
pattern, i.e. ($1, $2, $3...). (Note that here $1 etc. are also set, and
that this differs from Perl 4's behavior.) If the match fails, a null
array is returned. If the match succeeds, but there were no parentheses,
a list value of (1) is returned.

Examples:

    open(TTY, '/dev/tty');
    <TTY> =~ /^y/i && foo();    # do foo if desired

    if (/Version: *([0-9.]*)/) { $version = $1; }

    next if m#^/usr/spool/uucp#;

    # poor man's grep
    $arg = shift;
    while (<>) {
        print if /$arg/o;       # compile only once
    }

    if (($F1, $F2, $Etc) = ($foo =~ /^(\S+)\s+(\S+)\s*(.*)/))

This last example splits $foo into the first two words and the
remainder of the line, and assigns those three fields to $F1, $F2 and
$Etc. The conditional is true if any variables were assigned, i.e. if
the pattern matched.

The C</g> modifier specifies global pattern matching--that is, matching
as many times as possible within the string. How it behaves depends on
the context. In a list context, it returns a list of all the
substrings matched by all the parentheses in the regular expression.
If there are no parentheses, it returns a list of all the matched
strings, as if there were parentheses around the whole pattern.

In a scalar context, C<m//g> iterates through the string, returning TRUE
each time it matches, and FALSE when it eventually runs out of
matches. (In other words, it remembers where it left off last time and
restarts the search at that point. You can actually find the current
match position of a string using the pos() function--see L<perlfunc>.)
If you modify the string in any way, the match position is reset to the
beginning. Examples:

    # list context
    ($one,$five,$fifteen) = (`uptime` =~ /(\d+\.\d+)/g);

    # scalar context
    $/ = ""; $* = 1;  # $* deprecated in Perl 5
    while ($paragraph = <>) {
        while ($paragraph =~ /[a-z]['")]*[.!?]+['")]*\s/g) {
            $sentences++;
        }
    }
    print "$sentences\n";

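Since the examples above depend on external input, here is a
self-contained sketch of the three behaviors: a list-context match
without C</g>, a list-context match with C</g>, and scalar-context
iteration with pos(). The strings and names are made up.

```perl
use strict;
use warnings;

# List context, no /g: the parenthesized subexpressions are returned.
my ($key, $value) = 'colour=mauve' =~ /^(\w+)=(\w+)$/;

my $text = 'a1b22c333';

# List context with /g: every match of the parentheses.
my @nums = $text =~ /(\d+)/g;

# Scalar context with /g: one match per evaluation; pos() reports
# where the next search will start.
my @ends;
while ($text =~ /\d+/g) {
    push @ends, pos($text);
}

print "$key $value | @nums | @ends\n";   # prints "colour mauve | 1 22 333 | 2 5 9"
```
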
=item q/STRING/

=item C<'STRING'>

A single-quoted, literal string. Backslashes are ignored, unless
followed by the delimiter or another backslash, in which case the
delimiter or backslash is interpolated.

    $foo = q!I said, "You said, 'She said it.'"!;
    $bar = q('This is it.');

=item qq/STRING/

=item "STRING"

A double-quoted, interpolated string.

    $_ .= qq
     (*** The previous line contains the naughty word "$1".\n)
                if /(tcl|rexx|python)/;      # :-)

=item qx/STRING/

=item `STRING`

A string which is interpolated and then executed as a system command.
The collected standard output of the command is returned. In scalar
context, it comes back as a single (potentially multi-line) string.
In list context, returns a list of lines (however you've defined lines
with $/ or $INPUT_RECORD_SEPARATOR).

    $today = qx{ date };

See L<I/O Operators> for more discussion.

=item qw/STRING/

Returns a list of the words extracted out of STRING, using embedded
whitespace as the word delimiters. It is exactly equivalent to

    split(' ', q/STRING/);

Some frequently seen examples:

    use POSIX qw( setlocale localeconv );
    @EXPORT = qw( foo bar baz );

=item s/PATTERN/REPLACEMENT/egimosx

Searches a string for a pattern, and if found, replaces that pattern
with the replacement text and returns the number of substitutions
made. Otherwise it returns false (0).

If no string is specified via the C<=~> or C<!~> operator, the C<$_>
variable is searched and modified. (The string specified with C<=~> must
be a scalar variable, an array element, a hash element, or an assignment
to one of those, i.e. an lvalue.)

If the delimiter chosen is single quote, no variable interpolation is
done on either the PATTERN or the REPLACEMENT. Otherwise, if the
PATTERN contains a $ that looks like a variable rather than an
end-of-string test, the variable will be interpolated into the pattern
at run-time. If you only want the pattern compiled once the first time
the variable is interpolated, use the C</o> option. If the pattern
evaluates to a null string, the last successfully executed regular
expression is used instead. See L<perlre> for further explanation on these.

Options are:

    e   Evaluate the right side as an expression.
    g   Replace globally, i.e. all occurrences.
    i   Do case-insensitive pattern matching.
    m   Treat string as multiple lines.
    o   Only compile pattern once.
    s   Treat string as single line.
    x   Use extended regular expressions.

Any non-alphanumeric, non-whitespace delimiter may replace the
slashes. If single quotes are used, no interpretation is done on the
replacement string (the C</e> modifier overrides this, however). If
backquotes are used, the replacement string is a command to execute
whose output will be used as the actual replacement text. If the
PATTERN is delimited by bracketing quotes, the REPLACEMENT has its own
pair of quotes, which may or may not be bracketing quotes, e.g.
C<s(foo)(bar)> or C<sE<lt>fooE<gt>/bar/>. A C</e> will cause the
replacement portion to be interpreted as a full-fledged Perl expression
and eval()ed right then and there. It is, however, syntax checked at
compile-time.

Examples:

    s/\bgreen\b/mauve/g;                # don't change wintergreen

    $path =~ s|/usr/bin|/usr/local/bin|;

    s/Login: $foo/Login: $bar/; # run-time pattern

    ($foo = $bar) =~ s/this/that/;

    $count = ($paragraph =~ s/Mister\b/Mr./g);

    $_ = 'abc123xyz';
    s/\d+/$&*2/e;               # yields 'abc246xyz'
    s/\d+/sprintf("%5d",$&)/e;  # yields 'abc  246xyz'
    s/\w/$& x 2/eg;             # yields 'aabbcc  224466xxyyzz'

    s/%(.)/$percent{$1}/g;      # change percent escapes; no /e
    s/%(.)/$percent{$1} || $&/ge;       # expr now, so /e
    s/^=(\w+)/&pod($1)/ge;      # use function call

    # /e's can even nest;  this will expand
    # simple embedded variables in $_
    s/(\$\w+)/$1/eeg;

    # Delete C comments.
    $program =~ s {
        /\*     # Match the opening delimiter.
        .*?     # Match a minimal number of characters.
        \*/     # Match the closing delimiter.
    } []gsx;

    s/^\s*(.*?)\s*$/$1/;        # trim white space

    s/([^ ]*) *([^ ]*)/$2 $1/;  # reverse 1st two fields

Note the use of $ instead of \ in the last example. Unlike
B<sed>, we only use the \<I<digit>> form in the left hand side.
Anywhere else it's $<I<digit>>.

Occasionally, you can't just use a C</g> to get all the changes
to occur. Here are two common cases:

    # put commas in the right places in an integer
    1 while s/(.*\d)(\d\d\d)/$1,$2/g;      # perl4
    1 while s/(\d)(\d\d\d)(?!\d)/$1,$2/g;  # perl5

    # expand tabs to 8-column spacing
    1 while s/\t+/' ' x (length($&)*8 - length($`)%8)/e;

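A self-contained sketch of three of the points above: the C</e>
modifier, the returned substitution count, and a repeated substitution
driven by C<1 while>. The sample strings and variable names are made
up for illustration.

```perl
use strict;
use warnings;

# /e: the replacement is evaluated as an expression.
my $text = 'abc123xyz';
(my $doubled = $text) =~ s/\d+/$& * 2/e;

# The return value is the number of substitutions made.
my $para  = 'Mister Ed met Mister Rogers';
my $count = ($para =~ s/Mister\b/Mr./g);

# Repeat until the pattern no longer matches: one comma per pass.
my $n = '1234567';
1 while $n =~ s/(\d)(\d\d\d)(?!\d)/$1,$2/;

print "$doubled | $count: $para | $n\n";
# prints "abc246xyz | 2: Mr. Ed met Mr. Rogers | 1,234,567"
```
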
=item tr/SEARCHLIST/REPLACEMENTLIST/cds

=item y/SEARCHLIST/REPLACEMENTLIST/cds

Translates all occurrences of the characters found in the search list
into the corresponding characters in the replacement list.  It returns
the number of characters replaced or deleted.  If no string is
specified via the =~ or !~ operator, the $_ string is translated.  (The
string specified with =~ must be a scalar variable, an array element,
or an assignment to one of those, i.e. an lvalue.)  For B<sed> devotees,
C<y> is provided as a synonym for C<tr>.  If the SEARCHLIST is
delimited by bracketing quotes, the REPLACEMENTLIST has its own pair of
quotes, which may or may not be bracketing quotes, e.g. C<tr[A-Z][a-z]>
or C<tr(+-*/)/ABCD/>.

Options:

    c   Complement the SEARCHLIST.
    d   Delete found but unreplaced characters.
    s   Squash duplicate replaced characters.

If the C</c> modifier is specified, the SEARCHLIST character set is
complemented.  If the C</d> modifier is specified, any characters specified
by SEARCHLIST not found in REPLACEMENTLIST are deleted.  (Note
that this is slightly more flexible than the behavior of some B<tr>
programs, which delete anything they find in the SEARCHLIST, period.)
If the C</s> modifier is specified, sequences of characters that were
translated to the same character are squashed down to a single instance of the
character.

If the C</d> modifier is used, the REPLACEMENTLIST is always interpreted
exactly as specified.  Otherwise, if the REPLACEMENTLIST is shorter
than the SEARCHLIST, the final character is replicated till it is long
enough.  If the REPLACEMENTLIST is null, the SEARCHLIST is replicated.
This latter is useful for counting characters in a class or for
squashing character sequences in a class.

Examples:

    $ARGV[1] =~ tr/A-Z/a-z/;    # canonicalize to lower case

    $cnt = tr/*/*/;             # count the stars in $_

    $cnt = $sky =~ tr/*/*/;     # count the stars in $sky

    $cnt = tr/0-9//;            # count the digits in $_

    tr/a-zA-Z//s;               # bookkeeper -> bokeper

    ($HOST = $host) =~ tr/a-z/A-Z/;

    tr/a-zA-Z/ /cs;             # change non-alphas to single space

    tr [\200-\377]
       [\000-\177];             # delete 8th bit

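The interaction between a short REPLACEMENTLIST and the C</d>, C</s> and
C</c> modifiers is easier to see side by side.  The following is only a
sketch; the variable names and sample strings are made up for the
illustration.

```perl
use strict;
use warnings;

# No /d: the short REPLACEMENTLIST "xy" is padded with its final
# character, so c, d and e all become y.
(my $padded = "abcde") =~ tr/a-e/xy/;

# With /d: the REPLACEMENTLIST is taken exactly as written, so c, d
# and e have no counterpart and are deleted.
(my $deleted = "abcde") =~ tr/a-e/xy/d;

# With /s: a, b and c all translate to x, and the resulting run of
# x's is squashed to a single x.  The d's are not in the SEARCHLIST.
(my $squashed = "aabbccdd") =~ tr/a-c/x/s;

# With /c and an empty REPLACEMENTLIST: count everything that is
# not a letter, leaving the string unchanged.
my $sample      = "a1b2 c3";
my $non_letters = ($sample =~ tr/a-zA-Z//c);

print "$padded $deleted $squashed $non_letters\n";
```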
If multiple translations are given for a character, only the first one is used:

    tr/AAA/XYZ/

will translate any A to X.

Note that because the translation table is built at compile time, neither
the SEARCHLIST nor the REPLACEMENTLIST is subjected to double quote
interpolation.  That means that if you want to use variables, you must use
an eval():

    eval "tr/$oldlist/$newlist/";
    die $@ if $@;

    eval "tr/$oldlist/$newlist/, 1" or die $@;

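A slightly fuller sketch of the eval() approach, applied to a named
variable rather than to C<$_>.  The list contents and the variable names
are assumptions made for the example.  Because the lists are compiled as
Perl code, they should come only from trusted sources.

```perl
use strict;
use warnings;

# Hypothetical lists; in a real program these might be computed.
my $oldlist = 'a-c';
my $newlist = 'A-C';

my $text = 'abcabc xyz';

# tr/// does not interpolate, so build the whole operator as a string.
# The escaped \$text keeps the variable name in the evaluated code,
# while $oldlist and $newlist are expanded before eval() compiles it.
my $count = eval "\$text =~ tr/$oldlist/$newlist/";
die $@ if $@;

print "$count characters changed: $text\n";
```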
=back

=head2 I/O Operators

There are several I/O operators you should know about.
A string enclosed by backticks (grave accents) first undergoes
variable substitution just like a double quoted string.  It is then
interpreted as a command, and the output of that command is the value
of the pseudo-literal, like in a shell.  In a scalar context, a single
string consisting of all the output is returned.  In a list context,
a list of values is returned, one for each line of output.  (You can
set C<$/> to use a different line terminator.)  The command is executed
each time the pseudo-literal is evaluated.  The status value of the
command is returned in C<$?> (see L<perlvar> for the interpretation
of C<$?>).  Unlike in B<csh>, no translation is done on the return
data--newlines remain newlines.  Unlike in any of the shells, single
quotes do not hide variable names in the command from interpretation.
To pass a $ through to the shell you need to hide it with a backslash.
The generalized form of backticks is C<qx//>.  (Because backticks
always undergo shell expansion as well, see L<perlsec> for
security concerns.)

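The difference between the two contexts can be sketched as follows.  The
example assumes a Unix-like shell in which a C<printf> command is
available; on other systems a different command would be needed.

```perl
use strict;
use warnings;

# Scalar context: all of the output comes back as one string.
# The doubled backslash reaches the shell as \n, which printf expands.
my $all = `printf 'one\\ntwo\\n'`;

# List context: one element per line, newlines included.
my @lines = `printf 'one\\ntwo\\n'`;

# The exit status of the most recent command is in the high byte of $?.
my $status = $? >> 8;

print scalar(@lines), " lines, exit status $status\n";
```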
Evaluating a filehandle in angle brackets yields the next line from
that file (newline included, so it's never false until end of file, at
which time an undefined value is returned).  Ordinarily you must assign
that value to a variable, but there is one situation where an automatic
assignment happens.  I<If and ONLY if> the input symbol is the only
thing inside the conditional of a C<while> loop, the value is
automatically assigned to the variable C<$_>.  The assigned value is
then tested to see if it is defined.  (This may seem like an odd thing
to you, but you'll use the construct in almost every Perl script you
write.)  Anyway, the following lines are equivalent to each other:

    while (defined($_ = <STDIN>)) { print; }
    while (<STDIN>) { print; }
    for (;<STDIN>;) { print; }
    print while defined($_ = <STDIN>);
    print while <STDIN>;

The filehandles STDIN, STDOUT and STDERR are predefined.  (The
filehandles C<stdin>, C<stdout> and C<stderr> will also work except in
packages, where they would be interpreted as local identifiers rather
than global.)  Additional filehandles may be created with the open()
function.  See L<perlfunc/open()> for details on this.

If a <FILEHANDLE> is used in a context that is looking for a list, a
list consisting of all the input lines is returned, one line per list
element.  It's easy to make a I<LARGE> data space this way, so use with
care.

The null filehandle <> is special and can be used to emulate the
behavior of B<sed> and B<awk>.  Input from <> comes either from
standard input, or from each file listed on the command line.  Here's
how it works: the first time <> is evaluated, the @ARGV array is
checked, and if it is null, C<$ARGV[0]> is set to "-", which when opened
gives you standard input.  The @ARGV array is then processed as a list
of filenames.  The loop

    while (<>) {
        ...                     # code for each line
    }

is equivalent to the following Perl-like pseudo code:

    unshift(@ARGV, '-') if $#ARGV < $[;
    while ($ARGV = shift) {
        open(ARGV, $ARGV);
        while (<ARGV>) {
            ...         # code for each line
        }
    }

except that it isn't so cumbersome to say, and will actually work.  It
really does shift array @ARGV and put the current filename into variable
$ARGV.  It also uses filehandle I<ARGV> internally--<> is just a synonym
for <ARGV>, which is magical.  (The pseudo code above doesn't work
because it treats <ARGV> as non-magical.)

You can modify @ARGV before the first <> as long as the array ends up
containing the list of filenames you really want.  Line numbers (C<$.>)
continue as if the input were one big happy file.  (But see example
under eof() for how to reset line numbers on each file.)

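The following sketch shows both points at once: @ARGV is set by the
program before the first <>, and C<$.> is reset at the end of each file by
closing the ARGV filehandle.  The temporary files and their contents are
invented for the example so that it can run on its own.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create two small input files (contents are made up for the sketch).
my @names;
for my $text ("a\nb\n", "c\n") {
    my ($fh, $name) = tempfile(UNLINK => 1);
    print {$fh} $text;
    close $fh;
    push @names, $name;
}

# Set @ARGV ourselves before the first <>.
@ARGV = @names;

my @seen;
while (<>) {
    chomp;
    push @seen, "$.:$_";        # line number within the current file
} continue {
    close ARGV if eof;          # eof without parens: end of this file only
}

print "@seen\n";
```

Without the C<close ARGV>, the last entry would carry line number 3,
because C<$.> would keep counting across the file boundary.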
If you want to set @ARGV to your own list of files, go right ahead.  If
you want to pass switches into your script, you can use one of the
Getopts modules or put a loop on the front like this:

    while ($_ = $ARGV[0], /^-/) {
        shift;
        last if /^--$/;
        if (/^-D(.*)/) { $debug = $1 }
        if (/^-v/)     { $verbose++  }
        ...             # other switches
    }
    while (<>) {
        ...             # code for each line
    }

The <> symbol will return FALSE only once.  If you call it again after
this it will assume you are processing another @ARGV list, and if you
haven't set @ARGV, will input from STDIN.

If the string inside the angle brackets is a reference to a scalar
variable (e.g. <$foo>), then that variable contains the name of the
filehandle to input from, or a reference to the same.  For example:

    $fh = \*STDIN;
    $line = <$fh>;

If the string inside angle brackets is not a filehandle or a scalar
variable containing a filehandle name or reference, then it is interpreted
as a filename pattern to be globbed, and either a list of filenames or the
next filename in the list is returned, depending on context.  One level of
$ interpretation is done first, but you can't say C<E<lt>$fooE<gt>>
because that's an indirect filehandle as explained in the previous
paragraph.  (In older versions of Perl, programmers would insert curly
brackets to force interpretation as a filename glob: C<E<lt>${foo}E<gt>>.
These days, it's considered cleaner to call the internal function directly
as C<glob($foo)>, which is probably the right way to have done it in the
first place.)  Example:

    while (<*.c>) {
        chmod 0644, $_;
    }

is equivalent to

    open(FOO, "echo *.c | tr -s ' \t\r\f' '\\012\\012\\012\\012'|");
    while (<FOO>) {
        chop;
        chmod 0644, $_;
    }

In fact, it's currently implemented that way.  (Which means it will not
work on filenames with spaces in them unless you have csh(1) on your
machine.)  Of course, the shortest way to do the above is:

    chmod 0644, <*.c>;

Because globbing invokes a shell, it's often faster to call readdir() yourself
and just do your own grep() on the filenames.  Furthermore, due to its current
implementation of using a shell, the glob() routine may get "Arg list too
long" errors (unless you've installed tcsh(1L) as F</bin/csh>).

A glob only evaluates its (embedded) argument when it is starting a new
list.  All values must be read before it will start over.  In a list
context this isn't important, because you automatically get them all
anyway.  In a scalar context, however, the operator returns the next value
each time it is called, or a FALSE value if you've just run out.  Again,
FALSE is returned only once.  So if you're expecting a single value from
a glob, it is much better to say

    ($file) = <blurch*>;

than

    $file = <blurch*>;

because the latter will alternate between returning a filename and
returning FALSE.

If you're trying to do variable interpolation, it's definitely better
to use the glob() function, because the older notation can cause people
to become confused with the indirect filehandle notation.

    @files = glob("$dir/*.[ch]");
    @files = glob($files[$i]);

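The context-dependent behavior described above can be observed with a
short sketch.  The directory and file names are invented for the example,
and the sketch assumes the temporary directory path contains no spaces.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Build a scratch directory with two C files and one other file.
my $dir = tempdir(CLEANUP => 1);
for my $name (qw(a.c b.c notes.txt)) {
    open(my $fh, '>', "$dir/$name") or die "Cannot create $dir/$name: $!";
    close $fh;
}

# List context: all matches at once.
my @sources = glob("$dir/*.c");

# List assignment to a single variable: takes the first match and
# leaves no pending state behind.
my ($first) = glob("$dir/*.c");

# Scalar context at one place in the code: each evaluation returns the
# next match, then an undefined value once the list runs out.
my @scalar_results;
for (1 .. 3) {
    my $file = glob("$dir/*.c");
    push @scalar_results, defined $file ? 'name' : 'undef';
}

print scalar(@sources), " matches; scalar calls gave @scalar_results\n";
```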
=head2 Constant Folding

Like C, Perl does a certain amount of expression evaluation at
compile time, whenever it determines that all of the arguments to an
operator are static and have no side effects.  In particular, string
concatenation happens at compile time between literals that don't do
variable substitution.  Backslash interpretation also happens at
compile time.  You can say

    'Now is the time for all' . "\n" .
    'good men to come to.'

and this all reduces to one string internally.  Likewise, if
you say

    foreach $file (@filenames) {
        if (-s $file > 5 + 100 * 2**16) { ... }
    }

the compiler will pre-compute the number that
expression represents so that the interpreter
won't have to.


=head2 Integer arithmetic

By default Perl assumes that it must do most of its arithmetic in
floating point.  But by saying

    use integer;

you may tell the compiler that it's okay to use integer operations
from here to the end of the enclosing BLOCK.  An inner BLOCK may
countermand this by saying

    no integer;

which lasts until the end of that BLOCK.

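A small sketch of the scoping, using division because it makes the
difference visible.  The variable names are arbitrary.

```perl
use strict;
use warnings;

my $float = 10 / 3;             # floating point by default

my ($int, $inner);
{
    use integer;
    $int = 10 / 3;              # integer division in this BLOCK

    {
        no integer;
        $inner = 10 / 4;        # floating point again in the inner BLOCK
    }
}

print "$float $int $inner\n";
```

Here C<$int> is 3 and C<$inner> is 2.5, while C<$float> keeps its
fractional part because it was computed outside the C<use integer> BLOCK.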