This is a live mirror of the Perl 5 development currently hosted at https://github.com/perl/perl5
perl 5.000
[perl5.git] / pod / perlop.pod
1=head1 NAME
2
3perlop - Perl operators and precedence
4
5=head1 SYNOPSIS
6
7Perl operators have the following associativity and precedence,
8listed from highest precedence to lowest. Note that all operators
9borrowed from C keep the same precedence relationship with each other,
10even where C's precedence is slightly screwy. (This makes learning
11Perl easier for C folks.)
12
    left        terms and list operators (leftward)
    left        ->
    nonassoc    ++ --
    right       **
    right       ! ~ \ and unary + and -
    left        =~ !~
    left        * / % x
    left        + - .
    left        << >>
    nonassoc    named unary operators
    nonassoc    < > <= >= lt gt le ge
    nonassoc    == != <=> eq ne cmp
    left        &
    left        | ^
    left        &&
    left        ||
    nonassoc    ..
    right       ?:
    right       = += -= *= etc.
    left        , =>
    nonassoc    list operators (rightward)
    left        not
    left        and
    left        or xor
37
38In the following sections, these operators are covered in precedence order.
39
40=head1 DESCRIPTIONS
41
42=head2 Terms and List Operators (Leftward)
43
Any TERM has the highest precedence in Perl. Terms include variables,
quote and quotelike operators, any expression in parentheses,
and any function whose arguments are parenthesized. Actually, there
aren't really functions in this sense, just list operators and unary
operators behaving as functions because you put parentheses around
the arguments. These are all documented in L<perlfunc>.
50
51If any list operator (print(), etc.) or any unary operator (chdir(), etc.)
52is followed by a left parenthesis as the next token, the operator and
53arguments within parentheses are taken to be of highest precedence,
54just like a normal function call.
55
In the absence of parentheses, the precedence of list operators such as
C<print>, C<sort>, or C<chmod> is either very high or very low depending on
whether you look at the left side of the operator or the right side of it.
For example, in
60
61 @ary = (1, 3, sort 4, 2);
62 print @ary; # prints 1324
63
64the commas on the right of the sort are evaluated before the sort, but
65the commas on the left are evaluated after. In other words, list
66operators tend to gobble up all the arguments that follow them, and
67then act like a simple TERM with regard to the preceding expression.
68Note that you have to be careful with parens:
69
70 # These evaluate exit before doing the print:
71 print($foo, exit); # Obviously not what you want.
72 print $foo, exit; # Nor is this.
73
74 # These do the print before evaluating exit:
75 (print $foo), exit; # This is what you want.
76 print($foo), exit; # Or this.
77 print ($foo), exit; # Or even this.
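
The way a list operator gobbles up the arguments to its right can be
checked with a small sketch like the following. The variable names are
made up for illustration, and C<join> is used only to make the resulting
lists easy to compare.

```perl
use strict;

# Without parentheses, sort takes everything to its right as its arguments.
my @gobbled = (1, 3, sort 4, 2, 0);

# With parentheses, sort(...) behaves like a term and takes only what is inside.
my @limited = (1, 3, sort(4, 2), 0);

print join('', @gobbled), "\n";   # prints 13024
print join('', @limited), "\n";   # prints 13240
```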
78
79Also note that
80
81 print ($foo & 255) + 1, "\n";
82
83probably doesn't do what you expect at first glance. See
84L<Named Unary Operators> for more discussion of this.
85
86Also parsed as terms are the C<do {}> and C<eval {}> constructs, as
87well as subroutine and method calls, and the anonymous
88constructors C<[]> and C<{}>.
89
90See also L<Quote and Quotelike Operators> toward the end of this section,
91as well as L<I/O Operators>.
92
93=head2 The Arrow Operator
94
95Just as in C and C++, "C<-E<gt>>" is an infix dereference operator. If the
96right side is either a C<[...]> or C<{...}> subscript, then the left side
97must be either a hard or symbolic reference to an array or hash (or
98a location capable of holding a hard reference, if it's an lvalue (assignable)).
99See L<perlref>.
100
101Otherwise, the right side is a method name or a simple scalar variable
102containing the method name, and the left side must either be an object
103(a blessed reference) or a class name (that is, a package name).
104See L<perlobj>.
105
106=head2 Autoincrement and Autodecrement
107
108"++" and "--" work as in C. That is, if placed before a variable, they
109increment or decrement the variable before returning the value, and if
110placed after, increment or decrement the variable after returning the value.
111
112The autoincrement operator has a little extra built-in magic to it. If
113you increment a variable that is numeric, or that has ever been used in
114a numeric context, you get a normal increment. If, however, the
115variable has only been used in string contexts since it was set, and
116has a value that is not null and matches the pattern
117C</^[a-zA-Z]*[0-9]*$/>, the increment is done as a string, preserving each
118character within its range, with carry:
119
120 print ++($foo = '99'); # prints '100'
121 print ++($foo = 'a0'); # prints 'a1'
122 print ++($foo = 'Az'); # prints 'Ba'
123 print ++($foo = 'zz'); # prints 'aaa'
124
125The autodecrement operator is not magical.
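
A short sketch of the carry behavior, and of the contrast with
autodecrement, assuming each string is freshly assigned and never used
in a numeric context beforehand. The variable names are illustrative only.

```perl
use strict;

my @results;
for my $start ('a9', 'Zz', 'zz', 'Az') {
    my $s = $start;      # fresh string, never used numerically
    $s++;                # magical string increment with carry
    push @results, "$start -> $s";
}
print "$_\n" for @results;   # a9 -> b0, Zz -> AAa, zz -> aaa, Az -> Ba

my $d = 'aa';
$d--;                    # no magic: 'aa' is treated as 0, giving -1
print "$d\n";            # prints -1
```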
126
127=head2 Exponentiation
128
129Binary "**" is the exponentiation operator. Note that it binds even more
130tightly than unary minus, so -2**4 is -(2**4), not (-2)**4.
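
For instance, the following sketch shows the binding relative to unary
minus, together with the right associativity listed in the precedence
table above:

```perl
use strict;

my $neg   = -2**4;      # ** binds tighter than unary minus: -(2**4)
my $paren = (-2)**4;    # parentheses force the negation first
my $chain = 2**3**2;    # right associative: 2**(3**2)

print "$neg $paren $chain\n";   # prints -16 16 512
```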
131
132=head2 Symbolic Unary Operators
133
134Unary "!" performs logical negation, i.e. "not". See also C<not> for a lower
135precedence version of this.
136
137Unary "-" performs arithmetic negation if the operand is numeric. If
138the operand is an identifier, a string consisting of a minus sign
139concatenated with the identifier is returned. Otherwise, if the string
140starts with a plus or minus, a string starting with the opposite sign
141is returned. One effect of these rules is that C<-bareword> is equivalent
142to C<"-bareword">.
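
These rules for unary minus can be sketched as follows. Quoted strings
are used instead of barewords to keep the example unambiguous, and the
variable names are illustrative only.

```perl
use strict;

my $word  = "foo";
my $minus = -$word;     # identifier-like string: "-foo"
my $plus  = -$minus;    # string starting with a minus: "+foo"
my $back  = -$plus;     # string starting with a plus: "-foo"
my $num   = -"10";      # numeric operand: arithmetic negation

print "$minus $plus $back $num\n";   # prints -foo +foo -foo -10
```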
143
144Unary "~" performs bitwise negation, i.e. 1's complement.
145
146Unary "+" has no effect whatsoever, even on strings. It is useful
147syntactically for separating a function name from a parenthesized expression
148that would otherwise be interpreted as the complete list of function
149arguments. (See examples above under L<List Operators>.)
150
151Unary "\" creates a reference to whatever follows it. See L<perlref>.
152Do not confuse this behavior with the behavior of backslash within a
153string, although both forms do convey the notion of protecting the next
154thing from interpretation.
155
156=head2 Binding Operators
157
158Binary "=~" binds an expression to a pattern match.
159Certain operations search or modify the string $_ by default. This
160operator makes that kind of operation work on some other string. The
161right argument is a search pattern, substitution, or translation. The
162left argument is what is supposed to be searched, substituted, or
163translated instead of the default $_. The return value indicates the
164success of the operation. (If the right argument is an expression
165rather than a search pattern, substitution, or translation, it is
166interpreted as a search pattern at run time. This is less efficient
167than an explicit search, since the pattern must be compiled every time
168the expression is evaluated--unless you've used C</o>.)
169
170Binary "!~" is just like "=~" except the return value is negated in
171the logical sense.
172
173=head2 Multiplicative Operators
174
175Binary "*" multiplies two numbers.
176
177Binary "/" divides two numbers.
178
179Binary "%" computes the modulus of the two numbers.
180
181Binary "x" is the repetition operator. In a scalar context, it
182returns a string consisting of the left operand repeated the number of
183times specified by the right operand. In a list context, if the left
184operand is a list in parens, it repeats the list.
185
186 print '-' x 80; # print row of dashes
187
188 print "\t" x ($tab/8), ' ' x ($tab%8); # tab over
189
190 @ones = (1) x 80; # a list of 80 1's
191 @ones = (5) x @ones; # set all elements to 5
192
193
194=head2 Additive Operators
195
196Binary "+" returns the sum of two numbers.
197
198Binary "-" returns the difference of two numbers.
199
200Binary "." concatenates two strings.
201
202=head2 Shift Operators
203
204Binary "<<" returns the value of its left argument shifted left by the
205number of bits specified by the right argument. Arguments should be
206integers.
207
208Binary ">>" returns the value of its left argument shifted right by the
209number of bits specified by the right argument. Arguments should be
210integers.
211
212=head2 Named Unary Operators
213
214The various named unary operators are treated as functions with one
215argument, with optional parentheses. These include the filetest
216operators, like C<-f>, C<-M>, etc. See L<perlfunc>.
217
218If any list operator (print(), etc.) or any unary operator (chdir(), etc.)
219is followed by a left parenthesis as the next token, the operator and
220arguments within parentheses are taken to be of highest precedence,
221just like a normal function call. Examples:
222
223 chdir $foo || die; # (chdir $foo) || die
224 chdir($foo) || die; # (chdir $foo) || die
225 chdir ($foo) || die; # (chdir $foo) || die
226 chdir +($foo) || die; # (chdir $foo) || die
227
228but, because * is higher precedence than ||:
229
230 chdir $foo * 20; # chdir ($foo * 20)
231 chdir($foo) * 20; # (chdir $foo) * 20
232 chdir ($foo) * 20; # (chdir $foo) * 20
233 chdir +($foo) * 20; # chdir ($foo * 20)
234
235 rand 10 * 20; # rand (10 * 20)
236 rand(10) * 20; # (rand 10) * 20
237 rand (10) * 20; # (rand 10) * 20
238 rand +(10) * 20; # rand (10 * 20)
239
240See also L<"List Operators">.
241
242=head2 Relational Operators
243
244Binary "<" returns true if the left argument is numerically less than
245the right argument.
246
247Binary ">" returns true if the left argument is numerically greater
248than the right argument.
249
250Binary "<=" returns true if the left argument is numerically less than
251or equal to the right argument.
252
253Binary ">=" returns true if the left argument is numerically greater
254than or equal to the right argument.
255
256Binary "lt" returns true if the left argument is stringwise less than
257the right argument.
258
259Binary "gt" returns true if the left argument is stringwise greater
260than the right argument.
261
262Binary "le" returns true if the left argument is stringwise less than
263or equal to the right argument.
264
265Binary "ge" returns true if the left argument is stringwise greater
266than or equal to the right argument.
267
268=head2 Equality Operators
269
270Binary "==" returns true if the left argument is numerically equal to
271the right argument.
272
273Binary "!=" returns true if the left argument is numerically not equal
274to the right argument.
275
276Binary "<=>" returns -1, 0, or 1 depending on whether the left argument is numerically
277less than, equal to, or greater than the right argument.
278
279Binary "eq" returns true if the left argument is stringwise equal to
280the right argument.
281
282Binary "ne" returns true if the left argument is stringwise not equal
283to the right argument.
284
285Binary "cmp" returns -1, 0, or 1 depending on whether the left argument is stringwise
286less than, equal to, or greater than the right argument.
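
The difference between the numeric and stringwise operators shows up
most clearly when sorting or comparing numeric-looking strings. A
minimal sketch, with illustrative data:

```perl
use strict;

my @nums = (10, 9, 2, 33);

my @numeric    = sort { $a <=> $b } @nums;   # compared as numbers
my @stringwise = sort { $a cmp $b } @nums;   # compared as strings

print "@numeric\n";      # prints 2 9 10 33
print "@stringwise\n";   # prints 10 2 33 9

my $num_equal = ("1.0" == "1") ? "yes" : "no";   # same number
my $str_equal = ("1.0" eq "1") ? "yes" : "no";   # different strings

print "$num_equal $str_equal\n";   # prints yes no
```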
287
288=head2 Bitwise And
289
Binary "&" returns its operands ANDed together bit by bit.
291
292=head2 Bitwise Or and Exclusive Or
293
Binary "|" returns its operands ORed together bit by bit.
295
Binary "^" returns its operands XORed together bit by bit.
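
A small worked example of the three bitwise operators on integer
operands (the values are chosen only for illustration):

```perl
use strict;

my ($x, $y) = (12, 10);   # binary 1100 and 1010

my $and = $x & $y;        # 1000
my $or  = $x | $y;        # 1110
my $xor = $x ^ $y;        # 0110

print "$and $or $xor\n";  # prints 8 14 6
```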
297
298=head2 C-style Logical And
299
300Binary "&&" performs a short-circuit logical AND operation. That is,
301if the left operand is false, the right operand is not even evaluated.
302Scalar or list context propagates down to the right operand if it
303is evaluated.
304
305=head2 C-style Logical Or
306
307Binary "||" performs a short-circuit logical OR operation. That is,
308if the left operand is true, the right operand is not even evaluated.
309Scalar or list context propagates down to the right operand if it
310is evaluated.
311
312The C<||> and C<&&> operators differ from C's in that, rather than returning
3130 or 1, they return the last value evaluated. Thus, a reasonably portable
314way to find out the home directory (assuming it's not "0") might be:
315
316 $home = $ENV{'HOME'} || $ENV{'LOGDIR'} ||
317 (getpwuid($<))[7] || die "You're homeless!\n";
318
319As more readable alternatives to C<&&> and C<||>, Perl provides "and" and
320"or" operators (see below). The short-circuit behavior is identical. The
321precedence of "and" and "or" is much lower, however, so that you can
322safely use them after a list operator without the need for
323parentheses:
324
325 unlink "alpha", "beta", "gamma"
326 or gripe(), next LINE;
327
328With the C-style operators that would have been written like this:
329
330 unlink("alpha", "beta", "gamma")
331 || (gripe(), next LINE);
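
The following sketch illustrates both points made above: the operators
return the last value evaluated, and the right operand is skipped when
the left operand already decides the result. The C<bump> subroutine and
the variable names are hypothetical, introduced only to make the
short-circuit visible.

```perl
use strict;

my $calls = 0;
sub bump { $calls++; return 1 }   # counts how often it is called

my ($zero, $one) = (0, 1);

my $first_true  = $zero || '' || 'fallback' || 'unused';   # first true value
my $last_of_and = 'left' && 'right';                        # last value evaluated
my $kept        = $one  || bump();    # bump() is never called
my $stopped     = $zero && bump();    # bump() is never called

print "$first_true $last_of_and $kept $stopped $calls\n";
# prints fallback right 1 0 0
```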
332
333=head2 Range Operator
334
335Binary ".." is the range operator, which is really two different
336operators depending on the context. In a list context, it returns an
337array of values counting (by ones) from the left value to the right
338value. This is useful for writing C<for (1..10)> loops and for doing
339slice operations on arrays. Be aware that under the current implementation,
340a temporary array is created, so you'll burn a lot of memory if you
341write something like this:
342
343 for (1 .. 1_000_000) {
344 # code
345 }
346
347In a scalar context, ".." returns a boolean value. The operator is
348bistable, like a flip-flop, and emulates the line-range (comma) operator
349of B<sed>, B<awk>, and various editors. Each ".." operator maintains its
350own boolean state. It is false as long as its left operand is false.
351Once the left operand is true, the range operator stays true until the
352right operand is true, I<AFTER> which the range operator becomes false
353again. (It doesn't become false till the next time the range operator is
354evaluated. It can test the right operand and become false on the same
355evaluation it became true (as in B<awk>), but it still returns true once.
356If you don't want it to test the right operand till the next evaluation
357(as in B<sed>), use three dots ("...") instead of two.) The right
358operand is not evaluated while the operator is in the "false" state, and
359the left operand is not evaluated while the operator is in the "true"
360state. The precedence is a little lower than || and &&. The value
361returned is either the null string for false, or a sequence number
362(beginning with 1) for true. The sequence number is reset for each range
363encountered. The final sequence number in a range has the string "E0"
364appended to it, which doesn't affect its numeric value, but gives you
365something to search for if you want to exclude the endpoint. You can
366exclude the beginning point by waiting for the sequence number to be
367greater than 1. If either operand of scalar ".." is a numeric literal,
368that operand is implicitly compared to the C<$.> variable, the current
369line number. Examples:
370
371As a scalar operator:
372
373 if (101 .. 200) { print; } # print 2nd hundred lines
374 next line if (1 .. /^$/); # skip header lines
375 s/^/> / if (/^$/ .. eof()); # quote body
376
377As a list operator:
378
379 for (101 .. 200) { print; } # print $_ 100 times
380 @foo = @foo[$[ .. $#foo]; # an expensive no-op
381 @foo = @foo[$#foo-4 .. $#foo]; # slice last 5 items
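
The scalar (flip-flop) behavior, including the sequence numbers and the
"E0" marker, can be observed with a sketch like this one. It works on an
in-memory list rather than on input lines, so neither operand is a
numeric literal and C<$.> is not involved. The data and variable names
are illustrative only.

```perl
use strict;

my @lines = ("header", "BEGIN", "a", "b", "END", "trailer");
my (@picked, @seqs, @inner);

for my $line (@lines) {
    # Scalar assignment puts ".." in scalar context, so it acts as a flip-flop.
    my $seq = ($line =~ /^BEGIN$/) .. ($line =~ /^END$/);
    next unless $seq;                 # null string while the operator is false
    push @picked, $line;
    push @seqs,   $seq;
    # Skip the first line (sequence number 1) and the last (marked with "E0").
    push @inner, $line if $seq > 1 && $seq !~ /E0$/;
}

print "@picked\n";   # prints BEGIN a b END
print "@seqs\n";     # prints 1 2 3 4E0
print "@inner\n";    # prints a b
```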
382
The range operator (in a list context) makes use of the magical
autoincrement algorithm if the operands are strings. You
can say
386
387 @alphabet = ('A' .. 'Z');
388
389to get all the letters of the alphabet, or
390
391 $hexdigit = (0 .. 9, 'a' .. 'f')[$num & 15];
392
393to get a hexadecimal digit, or
394
395 @z2 = ('01' .. '31'); print $z2[$mday];
396
397to get dates with leading zeros. If the final value specified is not
398in the sequence that the magical increment would produce, the sequence
399goes until the next value would be longer than the final value
400specified.
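
A few more string ranges, as a sketch of how the magical increment
carries from one character position to the next (the values are
illustrative):

```perl
use strict;

my @letters = ('a' .. 'e');
my @pairs   = ('ay' .. 'bb');    # carries from 'az' to 'ba'
my @padded  = ('08' .. '11');    # leading zero is preserved

print "@letters\n";   # prints a b c d e
print "@pairs\n";     # prints ay az ba bb
print "@padded\n";    # prints 08 09 10 11
```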
401
402=head2 Conditional Operator
403
404Ternary "?:" is the conditional operator, just as in C. It works much
405like an if-then-else. If the argument before the ? is true, the
406argument before the : is returned, otherwise the argument after the :
407is returned. Scalar or list context propagates downward into the 2nd
408or 3rd argument, whichever is selected. The operator may be assigned
409to if both the 2nd and 3rd arguments are legal lvalues (meaning that you
410can assign to them):
411
412 ($a_or_b ? $a : $b) = $c;
413
414Note that this is not guaranteed to contribute to the readability of
415your program.
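
A minimal sketch of both properties, assignment through the conditional
and context propagation, using made-up variable names:

```perl
use strict;

my ($x, $y) = (1, 2);
my $use_x = 0;

($use_x ? $x : $y) = 99;     # assigns to $y because $use_x is false

# List context propagates into whichever branch is selected.
my @chosen = ($use_x ? (10, 20) : (30, 40, 50));

print "$x $y\n";       # prints 1 99
print "@chosen\n";     # prints 30 40 50
```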
416
=head2 Assignment Operators
418
419"=" is the ordinary assignment operator.
420
421Assignment operators work as in C. That is,
422
423 $a += 2;
424
425is equivalent to
426
427 $a = $a + 2;
428
429although without duplicating any side effects that dereferencing the lvalue
430might trigger, such as from tie(). Other assignment operators work similarly.
431The following are recognized:
432
    **=    +=    *=    &=    <<=    &&=
           -=    /=    |=    >>=    ||=
           .=    %=    ^=
                 x=
437
438Note that while these are grouped by family, they all have the precedence
439of assignment.
440
441Unlike in C, the assignment operator produces a valid lvalue. Modifying
442an assignment is equivalent to doing the assignment and then modifying
443the variable that was assigned to. This is useful for modifying
444a copy of something, like this:
445
446 ($tmp = $global) =~ tr [A-Z] [a-z];
447
448Likewise,
449
450 ($a += 2) *= 3;
451
452is equivalent to
453
454 $a += 2;
455 $a *= 3;
456
=head2 Comma Operator
458
459Binary "," is the comma operator. In a scalar context it evaluates
460its left argument, throws that value away, then evaluates its right
461argument and returns that value. This is just like C's comma operator.
462
463In a list context, it's just the list argument separator, and inserts
464both its arguments into the list.
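
The two behaviors can be compared with a sketch like the following. The
C<note> subroutine is hypothetical; it only records the order in which
the operands are evaluated.

```perl
use strict;

my @log;
sub note { push @log, $_[0]; return $_[0] }   # records each call

# Scalar context: every operand is evaluated, only the last value is kept.
my $last = (note('a'), note('b'), note('c'));

# List context: the comma just separates list elements.
my @all = (note('x'), note('y'));

print "$last\n";    # prints c
print "@all\n";     # prints x y
print "@log\n";     # prints a b c x y
```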
465
466=head2 List Operators (Rightward)
467
On its right side, a list operator has very low precedence,
such that it controls all comma-separated expressions found there.
The only operators with lower precedence are the logical operators
"and", "or", and "not", which may be used to evaluate calls to list
operators without the need for extra parentheses:
473
474 open HANDLE, "filename"
475 or die "Can't open: $!\n";
476
477See also discussion of list operators in L<List Operators (Leftward)>.
478
479=head2 Logical Not
480
481Unary "not" returns the logical negation of the expression to its right.
482It's the equivalent of "!" except for the very low precedence.
483
484=head2 Logical And
485
486Binary "and" returns the logical conjunction of the two surrounding
487expressions. It's equivalent to && except for the very low
488precedence. This means that it short-circuits: i.e. the right
489expression is evaluated only if the left expression is true.
490
=head2 Logical Or and Exclusive Or
492
493Binary "or" returns the logical disjunction of the two surrounding
494expressions. It's equivalent to || except for the very low
495precedence. This means that it short-circuits: i.e. the right
496expression is evaluated only if the left expression is false.
497
498Binary "xor" returns the exclusive-OR of the two surrounding expressions.
499It cannot short circuit, of course.
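
The practical consequence of the very low precedence is easiest to see
next to an assignment. In this sketch the C<fallback> subroutine and the
variable names are hypothetical:

```perl
use strict;

my $fallbacks = 0;
sub fallback { $fallbacks++; return 5 }

my $tight = 0 || fallback();   # parsed as $tight = (0 || fallback())
my $loose = 0 or fallback();   # parsed as ($loose = 0) or fallback()

print "$tight $loose $fallbacks\n";   # prints 5 0 2

my $one_true  = (1 xor 0) ? 'true' : 'false';
my $both_true = (1 xor 1) ? 'true' : 'false';

print "$one_true $both_true\n";       # prints true false
```

In both assignments fallback() is called, but only the C<||> form stores
its return value.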
500
501=head2 C Operators Missing From Perl
502
503Here is what C has that Perl doesn't:
504
505=over 8
506
507=item unary &
508
509Address-of operator. (But see the "\" operator for taking a reference.)
510
511=item unary *
512
513Dereference-address operator. (Perl's prefix dereferencing
514operators are typed: $, @, %, and &.)
515
516=item (TYPE)
517
518Type casting operator.
519
520=back
521
522=head2 Quote and Quotelike Operators
523
524While we usually think of quotes as literal values, in Perl they
525function as operators, providing various kinds of interpolating and
526pattern matching capabilities. Perl provides customary quote characters
527for these behaviors, but also provides a way for you to choose your
528quote character for any of them. In the following table, a C<{}> represents
529any pair of delimiters you choose. Non-bracketing delimiters use
530the same character fore and aft, but the 4 sorts of brackets
531(round, angle, square, curly) will all nest.
532
    Customary  Generic     Meaning        Interpolates
        ''       q{}       Literal             no
        ""      qq{}       Literal             yes
        ``      qx{}       Command             yes
                qw{}       Word list           no
        //       m{}       Pattern match       yes
                 s{}{}     Substitution        yes
                tr{}{}     Translation         no
541
For constructs that do interpolation, variables beginning with "C<$>" or "C<@>"
are interpolated, as are the following sequences:
544
    \t          tab
    \n          newline
    \r          return
    \f          form feed
    \v          vertical tab, whatever that is
    \b          backspace
    \a          alarm (bell)
    \e          escape
    \033        octal char
    \x1b        hex char
    \c[         control char
    \l          lowercase next char
    \u          uppercase next char
    \L          lowercase till \E
    \U          uppercase till \E
    \E          end case modification
    \Q          quote regexp metacharacters till \E
562
563Patterns are subject to an additional level of interpretation as a
564regular expression. This is done as a second pass, after variables are
565interpolated, so that regular expressions may be incorporated into the
566pattern from the variables. If this is not what you want, use C<\Q> to
567interpolate a variable literally.
568
569Apart from the above, there are no multiple levels of interpolation. In
570particular, contrary to the expectations of shell programmers, backquotes
571do I<NOT> interpolate within double quotes, nor do single quotes impede
572evaluation of variables when used within double quotes.
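
The points above can be sketched as follows: C<\Q> makes an interpolated
variable match literally, and neither single quotes nor backquotes have
any special meaning inside a double-quoted string. The variable names
and sample strings are illustrative only.

```perl
use strict;

my $pat = "a.c";

my $loose   = ("abc" =~ /^$pat$/)     ? 1 : 0;   # "." acts as a metacharacter
my $literal = ("abc" =~ /^\Q$pat\E$/) ? 1 : 0;   # "." must match itself
my $exact   = ("a.c" =~ /^\Q$pat\E$/) ? 1 : 0;

print "$loose $literal $exact\n";   # prints 1 0 1

my $name   = "perl";
my $quoted = "'$name'";             # single quotes do not stop interpolation
my $ticks  = "`echo $name`";        # backquotes are ordinary characters here
my $cased  = "\LHELLO\E \uworld";   # case-modifying escapes

print "$quoted\n";   # prints 'perl'
print "$ticks\n";    # prints `echo perl`
print "$cased\n";    # prints hello World
```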
573
574=over 8
575
576=item ?PATTERN?
577
578This is just like the C</pattern/> search, except that it matches only
579once between calls to the reset() operator. This is a useful
580optimization when you only want to see the first occurrence of
581something in each file of a set of files, for instance. Only C<??>
582patterns local to the current package are reset.
583
584This usage is vaguely deprecated, and may be removed in some future
585version of Perl.
586
587=item m/PATTERN/gimosx
588
589=item /PATTERN/gimosx
590
591Searches a string for a pattern match, and in a scalar context returns
592true (1) or false (''). If no string is specified via the C<=~> or
593C<!~> operator, the $_ string is searched. (The string specified with
594C<=~> need not be an lvalue--it may be the result of an expression
595evaluation, but remember the C<=~> binds rather tightly.) See also
596L<perlre>.
597
598Options are:
599
600 g Match globally, i.e. find all occurrences.
601 i Do case-insensitive pattern matching.
602 m Treat string as multiple lines.
603 o Only compile pattern once.
604 s Treat string as single line.
605 x Use extended regular expressions.
606
607If "/" is the delimiter then the initial C<m> is optional. With the C<m>
608you can use any pair of non-alphanumeric, non-whitespace characters as
609delimiters. This is particularly useful for matching Unix path names
610that contain "/", to avoid LTS (leaning toothpick syndrome).
611
612PATTERN may contain variables, which will be interpolated (and the
613pattern recompiled) every time the pattern search is evaluated. (Note
614that C<$)> and C<$|> might not be interpolated because they look like
615end-of-string tests.) If you want such a pattern to be compiled only
616once, add a C</o> after the trailing delimiter. This avoids expensive
617run-time recompilations, and is useful when the value you are
618interpolating won't change over the life of the script. However, mentioning
619C</o> constitutes a promise that you won't change the variables in the pattern.
620If you change them, Perl won't even notice.
621
622If the PATTERN evaluates to a null string, the most recently executed
623(and successfully compiled) regular expression is used instead.
624
625If used in a context that requires a list value, a pattern match returns a
626list consisting of the subexpressions matched by the parentheses in the
627pattern, i.e. ($1, $2, $3...). (Note that here $1 etc. are also set, and
628that this differs from Perl 4's behavior.) If the match fails, a null
629array is returned. If the match succeeds, but there were no parentheses,
630a list value of (1) is returned.
631
632Examples:
633
634 open(TTY, '/dev/tty');
635 <TTY> =~ /^y/i && foo(); # do foo if desired
636
637 if (/Version: *([0-9.]*)/) { $version = $1; }
638
639 next if m#^/usr/spool/uucp#;
640
641 # poor man's grep
642 $arg = shift;
643 while (<>) {
644 print if /$arg/o; # compile only once
645 }
646
647 if (($F1, $F2, $Etc) = ($foo =~ /^(\S+)\s+(\S+)\s*(.*)/))
648
649This last example splits $foo into the first two words and the
650remainder of the line, and assigns those three fields to $F1, $F2 and
651$Etc. The conditional is true if any variables were assigned, i.e. if
652the pattern matched.
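
The list-context return values described above can be compared side by
side in a short sketch (the sample string and variable names are made up):

```perl
use strict;

my $text = "Version: 5.000";

my ($major, $minor) = ($text =~ /(\d+)\.(\d+)/);   # one element per parenthesis
my @no_parens       = ($text =~ /Version/);        # success, no parentheses
my @failed          = ($text =~ /Release/);        # failure: null list

print "$major $minor\n";          # prints 5 000
print "@no_parens\n";             # prints 1
print scalar(@failed), "\n";      # prints 0
```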
653
654The C</g> modifier specifies global pattern matching--that is, matching
655as many times as possible within the string. How it behaves depends on
656the context. In a list context, it returns a list of all the
657substrings matched by all the parentheses in the regular expression.
658If there are no parentheses, it returns a list of all the matched
659strings, as if there were parentheses around the whole pattern.
660
661In a scalar context, C<m//g> iterates through the string, returning TRUE
662each time it matches, and FALSE when it eventually runs out of
663matches. (In other words, it remembers where it left off last time and
664restarts the search at that point. You can actually find the current
665match position of a string using the pos() function--see L<perlfunc>.)
666If you modify the string in any way, the match position is reset to the
667beginning. Examples:
668
669 # list context
670 ($one,$five,$fifteen) = (`uptime` =~ /(\d+\.\d+)/g);
671
672 # scalar context
673 $/ = ""; $* = 1; # $* deprecated in Perl 5
674 while ($paragraph = <>) {
675 while ($paragraph =~ /[a-z]['")]*[.!?]+['")]*\s/g) {
676 $sentences++;
677 }
678 }
679 print "$sentences\n";
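
A smaller sketch of the same distinction, which also records the match
position reported by pos() after each scalar-context match. The sample
strings and variable names are illustrative only.

```perl
use strict;

my $s = "a1b22c333";

# List context: all matches at once.
my @numbers = ($s =~ /\d+/g);
my @pairs   = ("k1=v1,k2=v2" =~ /(\w+)=(\w+)/g);

# Scalar context: one match per evaluation, resuming where the last one ended.
my @ends;
while ($s =~ /\d+/g) {
    push @ends, pos($s);
}

print "@numbers\n";   # prints 1 22 333
print "@pairs\n";     # prints k1 v1 k2 v2
print "@ends\n";      # prints 2 5 9
```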
680
681=item q/STRING/
682
683=item C<'STRING'>
684
685A single-quoted, literal string. Backslashes are ignored, unless
686followed by the delimiter or another backslash, in which case the
687delimiter or backslash is interpolated.
688
689 $foo = q!I said, "You said, 'She said it.'"!;
690 $bar = q('This is it.');
691
692=item qq/STRING/
693
694=item "STRING"
695
696A double-quoted, interpolated string.
697
698 $_ .= qq
699 (*** The previous line contains the naughty word "$1".\n)
700 if /(tcl|rexx|python)/; # :-)
701
702=item qx/STRING/
703
704=item `STRING`
705
706A string which is interpolated and then executed as a system command.
707The collected standard output of the command is returned. In scalar
708context, it comes back as a single (potentially multi-line) string.
709In list context, returns a list of lines (however you've defined lines
710with $/ or $INPUT_RECORD_SEPARATOR).
711
712 $today = qx{ date };
713
714See L<I/O Operators> for more discussion.
715
716=item qw/STRING/
717
718Returns a list of the words extracted out of STRING, using embedded
719whitespace as the word delimiters. It is exactly equivalent to
720
721 split(' ', q/STRING/);
722
723Some frequently seen examples:
724
    use POSIX qw( setlocale localeconv );
    @EXPORT = qw( foo bar baz );
727
728=item s/PATTERN/REPLACEMENT/egimosx
729
730Searches a string for a pattern, and if found, replaces that pattern
731with the replacement text and returns the number of substitutions
732made. Otherwise it returns false (0).
733
734If no string is specified via the C<=~> or C<!~> operator, the C<$_>
735variable is searched and modified. (The string specified with C<=~> must
736be a scalar variable, an array element, a hash element, or an assignment
737to one of those, i.e. an lvalue.)
738
739If the delimiter chosen is single quote, no variable interpolation is
740done on either the PATTERN or the REPLACEMENT. Otherwise, if the
741PATTERN contains a $ that looks like a variable rather than an
742end-of-string test, the variable will be interpolated into the pattern
743at run-time. If you only want the pattern compiled once the first time
744the variable is interpolated, use the C</o> option. If the pattern
745evaluates to a null string, the most recently executed (and successfully compiled) regular
746expression is used instead. See L<perlre> for further explanation on these.
747
748Options are:
749
750 e Evaluate the right side as an expression.
751 g Replace globally, i.e. all occurrences.
752 i Do case-insensitive pattern matching.
753 m Treat string as multiple lines.
754 o Only compile pattern once.
755 s Treat string as single line.
756 x Use extended regular expressions.
757
Any non-alphanumeric, non-whitespace delimiter may replace the
slashes. If single quotes are used, no interpretation is done on the
replacement string (the C</e> modifier overrides this, however). If
backquotes are used, the replacement string is a command to execute
whose output will be used as the actual replacement text. If the
PATTERN is delimited by bracketing quotes, the REPLACEMENT has its own
pair of quotes, which may or may not be bracketing quotes, e.g.
C<s(foo)(bar)> or C<sE<lt>fooE<gt>/bar/>. A C</e> will cause the
replacement portion to be interpreted as a full-fledged Perl expression
and eval()ed right then and there. It is, however, syntax checked at
compile-time.
769
770Examples:
771
772 s/\bgreen\b/mauve/g; # don't change wintergreen
773
774 $path =~ s|/usr/bin|/usr/local/bin|;
775
776 s/Login: $foo/Login: $bar/; # run-time pattern
777
778 ($foo = $bar) =~ s/this/that/;
779
780 $count = ($paragraph =~ s/Mister\b/Mr./g);
781
    $_ = 'abc123xyz';
    s/\d+/$&*2/e;               # yields 'abc246xyz'
    s/\d+/sprintf("%5d",$&)/e;  # yields 'abc  246xyz'
    s/\w/$& x 2/eg;             # yields 'aabbcc  224466xxyyzz'
786
787 s/%(.)/$percent{$1}/g; # change percent escapes; no /e
788 s/%(.)/$percent{$1} || $&/ge; # expr now, so /e
789 s/^=(\w+)/&pod($1)/ge; # use function call
790
791 # /e's can even nest; this will expand
792 # simple embedded variables in $_
793 s/(\$\w+)/$1/eeg;
794
795 # Delete C comments.
796 $program =~ s {
797 /\* (?# Match the opening delimiter.)
798 .*? (?# Match a minimal number of characters.)
799 \*/ (?# Match the closing delimiter.)
800 } []gsx;
801
802 s/^\s*(.*?)\s*$/$1/; # trim white space
803
804 s/([^ ]*) *([^ ]*)/$2 $1/; # reverse 1st two fields
805
806Note the use of $ instead of \ in the last example. Unlike
807B<sed>, we only use the \<I<digit>> form in the left hand side.
808Anywhere else it's $<I<digit>>.
809
810Occasionally, you can't just use a C</g> to get all the changes
811to occur. Here are two common cases:
812
813 # put commas in the right places in an integer
814 1 while s/(.*\d)(\d\d\d)/$1,$2/g; # perl4
815 1 while s/(\d)(\d\d\d)(?!\d)/$1,$2/g; # perl5
816
817 # expand tabs to 8-column spacing
818 1 while s/\t+/' ' x (length($&)*8 - length($`)%8)/e;
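
Several of the substitutions above can be tried on concrete data. The
following is a sketch with made-up input strings; it applies each
substitution to a named variable with C<=~> rather than to C<$_>.

```perl
use strict;

# Repeat the substitution until no more commas can be inserted.
my $n = "1234567";
1 while $n =~ s/(\d)(\d\d\d)(?!\d)/$1,$2/g;

# The return value is the number of substitutions made.
my $paragraph = "Mister Smith met Mister Jones.";
my $count = ($paragraph =~ s/Mister\b/Mr./g);

# /e evaluates the replacement as an expression.
(my $doubled = "abc123xyz") =~ s/\d+/$& * 2/e;

# Unknown percent escapes are left alone by falling back to $&.
my %percent = (n => "name");
(my $filled = "%n and %z") =~ s/%(.)/$percent{$1} || $&/ge;

print "$n\n";           # prints 1,234,567
print "$count\n";       # prints 2
print "$paragraph\n";   # prints Mr. Smith met Mr. Jones.
print "$doubled\n";     # prints abc246xyz
print "$filled\n";      # prints name and %z
```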
819
820
=item tr/SEARCHLIST/REPLACEMENTLIST/cds

=item y/SEARCHLIST/REPLACEMENTLIST/cds

Replaces all occurrences of the characters found in the search list
with the corresponding character in the replacement list. It returns
the number of characters replaced or deleted. If no string is
specified via the =~ or !~ operator, the $_ string is translated. (The
string specified with =~ must be a scalar variable, an array element,
or an assignment to one of those, i.e. an lvalue.) For B<sed> devotees,
C<y> is provided as a synonym for C<tr>. If the SEARCHLIST is
delimited by bracketing quotes, the REPLACEMENTLIST has its own pair of
quotes, which may or may not be bracketing quotes, e.g. C<tr[A-Z][a-z]>
or C<tr(+-*/)/ABCD/>.

Options:

    c   Complement the SEARCHLIST.
    d   Delete found but unreplaced characters.
    s   Squash duplicate replaced characters.

If the C</c> modifier is specified, the SEARCHLIST character set is
complemented. If the C</d> modifier is specified, any characters specified
by SEARCHLIST that have no corresponding character in REPLACEMENTLIST
are deleted. (Note that this is slightly more flexible than the behavior
of some B<tr> programs, which delete anything they find in the SEARCHLIST,
period.) If the C</s> modifier is specified, sequences of characters that
were translated to the same character are squashed down to a single
instance of the character.

If the C</d> modifier is used, the REPLACEMENTLIST is always interpreted
exactly as specified. Otherwise, if the REPLACEMENTLIST is shorter
than the SEARCHLIST, its final character is replicated until it is long
enough. If the REPLACEMENTLIST is null, the SEARCHLIST is replicated.
The latter is useful for counting characters in a class or for
squashing character sequences in a class.

Examples:

    $ARGV[1] =~ tr/A-Z/a-z/;    # canonicalize to lower case

    $cnt = tr/*/*/;             # count the stars in $_

    $cnt = $sky =~ tr/*/*/;     # count the stars in $sky

    $cnt = tr/0-9//;            # count the digits in $_

    tr/a-zA-Z//s;               # bookkeeper -> bokeper

    ($HOST = $host) =~ tr/a-z/A-Z/;

    tr/a-zA-Z/ /cs;             # change non-alphas to single space

    tr [\200-\377]
       [\000-\177];             # delete 8th bit

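To see the return value and each modifier side by side, the following sketch applies several translations to copies of one sample string. The variable names and the sample text are made up for illustration.

```perl
use strict;
use warnings;

my $text = "bookkeeper, 2 llamas!!";

# Empty REPLACEMENTLIST: nothing changes, but the count is returned.
my $digits = ($text =~ tr/0-9//);

# /s alone: squash runs of the same letter.
(my $squashed = $text) =~ tr/a-zA-Z//s;

# /d: delete characters that have no counterpart in REPLACEMENTLIST.
(my $no_vowels = $text) =~ tr/aeiou//d;

# /cs: turn every run of non-letters into a single space.
(my $words = $text) =~ tr/a-zA-Z/ /cs;

# Short REPLACEMENTLIST: its final character (B) is replicated.
(my $padded = "abcde") =~ tr/a-e/AB/;

print "$digits\n$squashed\n$no_vowels\n$words\n$padded\n";
```

Each translation works on a copy made by the parenthesized assignment, so C<$text> itself is never modified.
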
Note that because the translation table is built at compile time, neither
the SEARCHLIST nor the REPLACEMENTLIST is subjected to double quote
interpolation. That means that if you want to use variables, you must use
an eval():

    eval "tr/$oldlist/$newlist/";
    die $@ if $@;

    eval "tr/$oldlist/$newlist/, 1" or die $@;

=back

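If the same run-time lists are used many times, one option is to pay for the eval() once and keep the compiled translation in a subroutine reference. This is only a sketch: C<make_translator> is a hypothetical helper, and it assumes the lists come from trusted code and contain no slash or backslash characters.

```perl
use strict;
use warnings;

# Hypothetical helper: build a translator from lists known only at run time.
# Assumes $from and $to are trusted and contain no "/" or "\".
sub make_translator {
    my ($from, $to) = @_;
    # The tr/// is compiled once, inside the eval, with the lists
    # already substituted into the source text.
    my $code = eval "sub { (my \$s = shift) =~ tr/$from/$to/; return \$s }";
    die $@ if $@;
    return $code;
}

my $rot13 = make_translator('A-Za-z', 'N-ZA-Mn-za-m');
print $rot13->("Hello, World"), "\n";
```
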
=head2 I/O Operators

There are several I/O operators you should know about.
A string enclosed by backticks (grave accents) first undergoes
variable substitution just like a double quoted string. It is then
interpreted as a command, and the output of that command is the value
of the pseudo-literal, as in a shell. In a scalar context, a single
string consisting of all the output is returned. In a list context,
a list of values is returned, one for each line of output. (You can
set C<$/> to use a different line terminator.) The command is executed
each time the pseudo-literal is evaluated. The status value of the
command is returned in C<$?> (see L<perlvar> for the interpretation
of C<$?>). Unlike in B<csh>, no translation is done on the return
data--newlines remain newlines. Unlike in any of the shells, single
quotes do not hide variable names in the command from interpretation.
To pass a $ through to the shell you need to hide it with a backslash.
The generalized form of backticks is C<qx//>.

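The points above can be tried out with a few harmless commands. This sketch assumes a Unix-like system where backticks run commands through F</bin/sh> and C<echo> is available. The environment variable C<DEMO_GREETING> is invented for the demonstration.

```perl
use strict;
use warnings;

# Assumes a Unix-like system where commands run through /bin/sh.
my $output = `echo one; echo two`;      # scalar context: one string
my @lines  = `echo one; echo two`;      # list context: one element per line

# $? holds the wait status; the exit code is in the high byte.
my $ignored   = `sh -c 'exit 3'`;
my $exit_code = $? >> 8;

# Single quotes do not protect $name from Perl's interpolation...
my $name   = 'world';
my $quoted = `echo 'hello $name'`;

# ...but a backslash passes the $ through for the shell to expand.
$ENV{DEMO_GREETING} = 'hi';             # hypothetical variable for this demo
my $from_shell = `echo \$DEMO_GREETING`;

print $output, scalar(@lines), "\n", $exit_code, "\n", $quoted, $from_shell;
```
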
Evaluating a filehandle in angle brackets yields the next line from
that file (newline included, so it's never false until end of file, at which
time an undefined value is returned). Ordinarily you must assign that
value to a variable, but there is one situation where an automatic
assignment happens. I<If and ONLY if> the input symbol is the only
thing inside the conditional of a C<while> loop, the value is
automatically assigned to the variable C<$_>. (This may seem like an
odd thing to you, but you'll use the construct in almost every Perl
script you write.) Anyway, the following lines are equivalent to each
other:

    while ($_ = <STDIN>) { print; }
    while (<STDIN>) { print; }
    for (;<STDIN>;) { print; }
    print while $_ = <STDIN>;
    print while <STDIN>;

The filehandles STDIN, STDOUT and STDERR are predefined. (The
filehandles C<stdin>, C<stdout> and C<stderr> will also work except in
packages, where they would be interpreted as local identifiers rather
than global.) Additional filehandles may be created with the open()
function.

If a <FILEHANDLE> is used in a context that is looking for a list, a
list consisting of all the input lines is returned, one line per list
element. It's easy to make a I<LARGE> data space this way, so use with
care.

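The difference between the C<while> form, a plain scalar read, and a list read can be seen without touching STDIN. The sketch below assumes Perl 5.8 or later, where open() can read from a string held in memory, and it keeps the filehandle in a scalar variable named C<$in>, which is an arbitrary choice.

```perl
use strict;
use warnings;

# Assumes Perl 5.8 or later, where open() can read from a string.
my $data = "alpha\nbeta\ngamma\n";

open(my $in, '<', \$data) or die "Cannot open in-memory file: $!";
my @one_by_one;
while (<$in>) {              # each line is assigned to $_ automatically
    chomp;
    push @one_by_one, $_;
}
close($in);

open($in, '<', \$data) or die "Cannot reopen in-memory file: $!";
my $first = <$in>;           # scalar context: just the next line
my @rest  = <$in>;           # list context: all remaining lines
close($in);

print "@one_by_one\n", $first, scalar(@rest), "\n";
```
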
The null filehandle <> is special and can be used to emulate the
behavior of B<sed> and B<awk>. Input from <> comes either from
standard input, or from each file listed on the command line. Here's
how it works: the first time <> is evaluated, the @ARGV array is
checked, and if it is null, C<$ARGV[0]> is set to "-", which when opened
gives you standard input. The @ARGV array is then processed as a list
of filenames. The loop

    while (<>) {
        ...                     # code for each line
    }

is equivalent to the following Perl-like pseudo code:

    unshift(@ARGV, '-') if $#ARGV < $[;
    while ($ARGV = shift) {
        open(ARGV, $ARGV);
        while (<ARGV>) {
            ...                 # code for each line
        }
    }

except that it isn't so cumbersome to say, and will actually work. It
really does shift array @ARGV and put the current filename into variable
$ARGV. It also uses filehandle I<ARGV> internally--<> is just a synonym
for <ARGV>, which is magical. (The pseudo code above doesn't work
because it treats <ARGV> as non-magical.)

You can modify @ARGV before the first <> as long as the array ends up
containing the list of filenames you really want. Line numbers (C<$.>)
continue as if the input were one big happy file. (But see example
under eof() for how to reset line numbers on each file.)

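The behavior described above (the shifting of @ARGV, the current name in $ARGV, and the continuous C<$.>) can be observed by pointing <> at two scratch files. This sketch assumes the File::Temp module is available, as it is in modern Perl distributions. The file contents are arbitrary.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create two small scratch files to stand in for command-line arguments.
my @names;
for my $content ("a\nb\n", "c\n") {
    my ($fh, $name) = tempfile(UNLINK => 1);
    print {$fh} $content;
    close($fh) or die "Cannot close $name: $!";
    push @names, $name;
}

@ARGV = @names;                  # set before the first <>
my (@report, %files_seen);
while (<>) {
    chomp;
    $files_seen{$ARGV}++;        # $ARGV names the file being read
    push @report, "$.:$_";       # $. keeps counting across files
}

print "@report\n";
print scalar(keys %files_seen), " files, ", scalar(@ARGV), " names left\n";
```
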
If you want to set @ARGV to your own list of files, go right ahead. If
you want to pass switches into your script, you can use one of the
Getopt modules or put a loop on the front like this:

    while ($_ = $ARGV[0], /^-/) {
        shift;
        last if /^--$/;
        if (/^-D(.*)/) { $debug = $1 }
        if (/^-v/)     { $verbose++  }
        ...             # other switches
    }
    while (<>) {
        ...             # code for each line
    }

The <> symbol will return FALSE only once. If you call it again after
this it will assume you are processing another @ARGV list, and if you
haven't set @ARGV, will input from STDIN.

If the string inside the angle brackets is a reference to a scalar
variable (e.g. <$foo>), then that variable contains the name of the
filehandle to input from.

If the string inside angle brackets is not a filehandle, it is
interpreted as a filename pattern to be globbed, and either a list of
filenames or the next filename in the list is returned, depending on
context. One level of $ interpretation is done first, but you can't
say C<E<lt>$fooE<gt>> because that's an indirect filehandle as explained in the
previous paragraph. You could insert curly brackets to force
interpretation as a filename glob: C<E<lt>${foo}E<gt>>. (Alternatively, you can
call the internal function directly as C<glob($foo)>, which is probably
the right way to have done it in the first place.) Example:

    while (<*.c>) {
        chmod 0644, $_;
    }

is equivalent to

    open(FOO, "echo *.c | tr -s ' \t\r\f' '\\012\\012\\012\\012'|");
    while (<FOO>) {
        chop;
        chmod 0644, $_;
    }

In fact, it's currently implemented that way. (Which means it will not
work on filenames with spaces in them unless you have csh(1) on your
machine.) Of course, the shortest way to do the above is:

    chmod 0644, <*.c>;

Because globbing invokes a shell, it's often faster to call readdir() yourself
and just do your own grep() on the filenames. Furthermore, due to its current
implementation of using a shell, the glob() routine may get "Arg list too
long" errors (unless you've installed tcsh(1L) as F</bin/csh>).

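The readdir() and grep() alternative mentioned above might look like the following sketch. It builds a scratch directory with File::Temp (assumed to be available) so that it can run anywhere. The file names are invented for the example.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Build a scratch directory holding two C files and one other file.
my $dir = tempdir(CLEANUP => 1);
for my $name (qw(main.c util.c notes.txt)) {
    open(my $fh, '>', "$dir/$name") or die "Cannot create $dir/$name: $!";
    close($fh);
}

# Read the directory ourselves and keep only names ending in ".c",
# which is what the pattern *.c would have selected.
opendir(my $dh, $dir) or die "Cannot open $dir: $!";
my @c_files = sort grep { /\.c$/ } readdir($dh);
closedir($dh);

# readdir() returns bare names, so add the directory back before use.
my $changed = chmod 0644, map { "$dir/$_" } @c_files;

print "@c_files ($changed changed)\n";
```

No shell is started in this version, and names containing spaces need no special handling.
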
=head2 Constant Folding

Like C, Perl does a certain amount of expression evaluation at
compile time, whenever it determines that all of the arguments to an
operator are static and have no side effects. In particular, string
concatenation happens at compile time between literals that don't do
variable substitution. Backslash interpretation also happens at
compile time. You can say

    'Now is the time for all' . "\n" .
    'good men to come to.'

and this all reduces to one string internally. Likewise, if
you say

    foreach $file (@filenames) {
        if (-s $file > 5 + 100 * 2**16) { ... }
    }

the compiler will pre-compute the number that
expression represents so that the interpreter
won't have to.

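One way to look at the folded result is to ask the B::Deparse module to print a subroutine as the compiler stored it. This sketch assumes a modern Perl in which B::Deparse is installed. The exact layout of the printed code varies between versions, so only the folded number is of interest.

```perl
use strict;
use warnings;
use B::Deparse;

# Compile a sub containing the constant expression from the text, then
# ask B::Deparse to print the code as the compiler stored it.
my $threshold = sub { 5 + 100 * 2**16 };
my $source    = B::Deparse->new->coderef2text($threshold);

print $source, "\n";
print $threshold->(), "\n";
```

If folding took place, the printed body contains the single number rather than the original arithmetic.
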
=head2 Integer Arithmetic

By default Perl assumes that it must do most of its arithmetic in
floating point. But by saying

    use integer;

you may tell the compiler that it's okay to use integer operations
from here to the end of the enclosing BLOCK. An inner BLOCK may
countermand this by saying

    no integer;

which lasts until the end of that BLOCK.

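The scoping can be seen by performing the same division at each level of nesting. The variable names in this sketch are illustrative.

```perl
use strict;
use warnings;

my $float_quotient = 10 / 3;        # floating point by default

my ($int_quotient, $inner_quotient);
{
    use integer;
    $int_quotient = 10 / 3;         # integer division inside this block
    {
        no integer;
        $inner_quotient = 10 / 3;   # floating point again in the inner block
    }
}
my $after_block = 10 / 3;           # the pragma ended with its block

print "$float_quotient $int_quotient $inner_quotient $after_block\n";
```
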