C<6 * 5 == 30>.
I<Operator associativity> defines what happens if a sequence of the same
-operators is used one after another: whether they will be grouped at the left
+operators is used one after another:
+usually that they will be grouped at the left
or the right. For example, in C<9 - 3 - 2>, subtraction is left associative,
so C<9 - 3> is grouped together as the left-hand operand of the second
subtraction, rather than C<3 - 2> being grouped together as the right-hand
all; in general, the top-level operator in an expression has control of
operand evaluation.
+Some comparison operators, as their associativity, I<chain> with some
+operators of the same precedence (but never with operators of different
+precedence). This chaining means that each comparison is performed
+on the two arguments surrounding it, with each interior argument taking
+part in two comparisons, and the comparison results are implicitly ANDed.
+Thus S<C<"$x E<lt> $y E<lt>= $z">> behaves exactly like S<C<"$x E<lt>
+$y && $y E<lt>= $z">>, assuming that C<"$y"> is as simple a scalar as
+it looks. The ANDing short-circuits just like C<"&&"> does, stopping
+the sequence of comparisons as soon as one yields false.
+
+In a chained comparison, each argument expression is evaluated at most
+once, even if it takes part in two comparisons, but the result of the
+evaluation is fetched for each comparison. (It is not evaluated
+at all if the short-circuiting means that it's not required for any
+comparisons.) This matters if the computation of an interior argument
+is expensive or non-deterministic. For example,
+
+ if($x < expensive_sub() <= $z) { ...
+
+is not entirely like
+
+ if($x < expensive_sub() && expensive_sub() <= $z) { ...
+
+but instead closer to
+
+ my $tmp = expensive_sub();
+ if($x < $tmp && $tmp <= $z) { ...
+
+in that the subroutine is only called once. However, it's not exactly
+like this latter code either, because the chained comparison doesn't
+actually involve any temporary variable (named or otherwise): there is
+no assignment. This doesn't make much difference where the expression
+is a call to an ordinary subroutine, but matters more with an lvalue
+subroutine, or if the argument expression yields some unusual kind of
+scalar by other means. For example, if the argument expression yields
+a tied scalar, then the expression is evaluated to produce that scalar
+at most once, but the value of that scalar may be fetched up to twice,
+once for each comparison in which it is actually used.
+
+In this example, the expression is evaluated only once, and the tied
+scalar (the result of the expression) is fetched for each comparison that
+uses it.
+
+ if ($x < $tied_scalar < $z) { ...
+
+In the next example, the expression is evaluated only once, and the tied
+scalar is fetched once as part of the operation within the expression.
+The result of that operation is fetched for each comparison, which
+normally doesn't matter unless that expression result is also magical due
+to operator overloading.
+
+ if ($x < $tied_scalar + 42 < $z) { ...
+
+Some operators are instead non-associative, meaning that it is a syntax
+error to use a sequence of those operators of the same precedence.
+For example, S<C<"$x .. $y .. $z">> is an error.
+
Perl operators have the following associativity and precedence,
listed from highest precedence to lowest. Operators borrowed from
C keep the same precedence relationship with each other, even where
left ->
nonassoc ++ --
right **
- right ! ~ \ and unary + and -
+ right ! ~ ~. \ and unary + and -
left =~ !~
left * / % x
left + - .
left << >>
nonassoc named unary operators
- nonassoc < > <= >= lt gt le ge
- nonassoc == != <=> eq ne cmp ~~
- left &
- left | ^
+ chained < > <= >= lt gt le ge
+ chain/na == != eq ne <=> cmp ~~
+ nonassoc isa
+ left & &.
+ left | |. ^ ^.
left &&
left || //
nonassoc .. ...
Starting in Perl 5.28, it is a fatal error to try to complement a string
containing a character with an ordinal value above 255.
-If the experimental "bitwise" feature is enabled via S<C<use feature
-'bitwise'>>, then unary C<"~"> always treats its argument as a number, and an
+If the "bitwise" feature is enabled via S<C<use
+feature 'bitwise'>> or C<use v5.28>, then unary
+C<"~"> always treats its argument as a number, and an
alternate form of the operator, C<"~.">, always treats its argument as a
string. So C<~0> and C<~"0"> will both give 2**32-1 on 32-bit platforms,
-whereas C<~.0> and C<~."0"> will both yield C<"\xff">. This feature
-produces a warning unless you use S<C<no warnings 'experimental::bitwise'>>.
+whereas C<~.0> and C<~."0"> will both yield C<"\xff">. Until Perl 5.28,
+this feature produced a warning in the C<"experimental::bitwise"> category.
Unary C<"+"> has no effect whatsoever, even on strings. It is useful
syntactically for separating a function name from a parenthesized expression
than or equal to the right argument.
X<< ge >>
+A sequence of relational operators, such as S<C<"$x E<lt> $y E<lt>=
+$z">>, performs chained comparisons, in the manner described above in
+the section L</"Operator Precedence and Associativity">.
+Beware that they do not chain with equality operators, which have lower
+precedence.
+
=head2 Equality Operators
X<equality> X<equal> X<equals> X<operator, equality>
to the right argument.
X<!=>
+Binary C<"eq"> returns true if the left argument is stringwise equal to
+the right argument.
+X<eq>
+
+Binary C<"ne"> returns true if the left argument is stringwise not equal
+to the right argument.
+X<ne>
+
+A sequence of the above equality operators, such as S<C<"$x == $y ==
+$z">>, performs chained comparisons, in the manner described above in
+the section L</"Operator Precedence and Associativity">.
+Beware that they do not chain with relational operators, which have
+higher precedence.
+
Binary C<< "<=>" >> returns -1, 0, or 1 depending on whether the left
argument is numerically less than, equal to, or greater than the right
argument. If your platform supports C<NaN>'s (not-a-numbers) as numeric
(Note that the L<bigint>, L<bigrat>, and L<bignum> pragmas all
support C<"NaN">.)
-Binary C<"eq"> returns true if the left argument is stringwise equal to
-the right argument.
-X<eq>
-
-Binary C<"ne"> returns true if the left argument is stringwise not equal
-to the right argument.
-X<ne>
-
Binary C<"cmp"> returns -1, 0, or 1 depending on whether the left
argument is stringwise less than, equal to, or greater than the right
argument.
is described in the next section.
X<~~>
+The two-sided ordering operators C<"E<lt>=E<gt>"> and C<"cmp">, and the
+smartmatch operator C<"~~">, are non-associative with respect to each
+other and with respect to the equality operators of the same precedence.
+
C<"lt">, C<"le">, C<"ge">, C<"gt"> and C<"cmp"> use the collation (sort)
order specified by the current C<LC_COLLATE> locale if a S<C<use
locale>> form that includes collation is in effect. See L<perllocale>.
C<L<Unicode::Collate::Locale>> modules offer much more powerful
solutions to collation issues.
-For case-insensitive comparisions, look at the L<perlfunc/fc> case-folding
+For case-insensitive comparisons, look at the L<perlfunc/fc> case-folding
function, available in Perl v5.16 or later:
if ( fc($x) eq fc($y) ) { ... }
+=head2 Class Instance Operator
+X<isa operator>
+
+Binary C<isa> evaluates to true when the left argument is an object instance of
+the class (or a subclass derived from that class) given by the right argument.
+If the left argument is not defined, not a blessed object instance, nor does
+not derive from the class given by the right argument, the operator evaluates
+as false. The right argument may give the class either as a bareword or a
+scalar expression that yields a string class name:
+
+ if( $obj isa Some::Class ) { ... }
+
+ if( $obj isa "Different::Class" ) { ... }
+ if( $obj isa $name_of_class ) { ... }
+
+This is an experimental feature and is available from Perl 5.31.6 when enabled
+by C<use feature 'isa'>. It emits a warning in the C<experimental::isa>
+category.
+
=head2 Smartmatch Operator
-Binary C<~~> does a "smartmatch" between its arguments. The smartmatch
+First available in Perl 5.10.1 (the 5.10.0 version behaved differently),
+binary C<~~> does a "smartmatch" between its arguments. This is mostly
+used implicitly in the C<when> construct described in L<perlsyn>, although
+not all C<when> clauses call the smartmatch operator. Unique among all of
+Perl's operators, the smartmatch operator can recurse. The smartmatch
operator is L<experimental|perlpolicy/experimental> and its behavior is
-subject to change. It first became available in Perl 5.10, but prior
-to Perl 5.28 its behaviour was quite different from its present behaviour.
-
-The C<~~> operator applies some kind of matching criterion to its
-left-hand operand, and returns a truth value result. The criterion to
-apply is determined by the right-hand operand, which must be a reference
-to an object blessed into a class that overloads the C<~~> operator for
-this purpose. The class into which compiled regexp objects are blessed
-by the C<qr//> operator has such an overloading, which checks whether
-the left-hand operand matches the regexp. If the right-hand operand is
-not a reference to such a matcher object, an exception is raised.
-
-Overloading of C<~~> only applies when the object reference is the
-right-hand operand. An object reference as the left-hand operand is
-subjected to whatever criterion is specified by the right-hand operand,
-regardless of its own overloading.
+subject to change.
+
+It is also unique in that all other Perl operators impose a context
+(usually string or numeric context) on their operands, autoconverting
+those operands to those imposed contexts. In contrast, smartmatch
+I<infers> contexts from the actual types of its operands and uses that
+type information to select a suitable comparison mechanism.
+
+The C<~~> operator compares its operands "polymorphically", determining how
+to compare them according to their actual types (numeric, string, array,
+hash, etc.). Like the equality operators with which it shares the same
+precedence, C<~~> returns 1 for true and C<""> for false. It is often best
+read aloud as "in", "inside of", or "is contained in", because the left
+operand is often looked for I<inside> the right operand. That makes the
+order of the operands to the smartmatch operand often opposite that of
+the regular match operator. In other words, the "smaller" thing is usually
+placed in the left operand and the larger one in the right.
+
+The behavior of a smartmatch depends on what type of things its arguments
+are, as determined by the following table. The first row of the table
+whose types apply determines the smartmatch behavior. Because what
+actually happens is mostly determined by the type of the second operand,
+the table is sorted on the right operand instead of on the left.
+
+ Left Right Description and pseudocode
+ ===============================================================
+ Any undef check whether Any is undefined
+ like: !defined Any
+
+ Any Object invoke ~~ overloading on Object, or die
+
+ Right operand is an ARRAY:
+
+ Left Right Description and pseudocode
+ ===============================================================
+ ARRAY1 ARRAY2 recurse on paired elements of ARRAY1 and ARRAY2[2]
+ like: (ARRAY1[0] ~~ ARRAY2[0])
+ && (ARRAY1[1] ~~ ARRAY2[1]) && ...
+ HASH ARRAY any ARRAY elements exist as HASH keys
+ like: grep { exists HASH->{$_} } ARRAY
+ Regexp ARRAY any ARRAY elements pattern match Regexp
+ like: grep { /Regexp/ } ARRAY
+ undef ARRAY undef in ARRAY
+ like: grep { !defined } ARRAY
+ Any ARRAY smartmatch each ARRAY element[3]
+ like: grep { Any ~~ $_ } ARRAY
+
+ Right operand is a HASH:
+
+ Left Right Description and pseudocode
+ ===============================================================
+ HASH1 HASH2 all same keys in both HASHes
+ like: keys HASH1 ==
+ grep { exists HASH2->{$_} } keys HASH1
+ ARRAY HASH any ARRAY elements exist as HASH keys
+ like: grep { exists HASH->{$_} } ARRAY
+ Regexp HASH any HASH keys pattern match Regexp
+ like: grep { /Regexp/ } keys HASH
+ undef HASH always false (undef can't be a key)
+ like: 0 == 1
+ Any HASH HASH key existence
+ like: exists HASH->{Any}
+
+ Right operand is CODE:
+
+ Left Right Description and pseudocode
+ ===============================================================
+ ARRAY CODE sub returns true on all ARRAY elements[1]
+ like: !grep { !CODE->($_) } ARRAY
+ HASH CODE sub returns true on all HASH keys[1]
+ like: !grep { !CODE->($_) } keys HASH
+ Any CODE sub passed Any returns true
+ like: CODE->(Any)
+
+Right operand is a Regexp:
+
+ Left Right Description and pseudocode
+ ===============================================================
+ ARRAY Regexp any ARRAY elements match Regexp
+ like: grep { /Regexp/ } ARRAY
+ HASH Regexp any HASH keys match Regexp
+ like: grep { /Regexp/ } keys HASH
+ Any Regexp pattern match
+ like: Any =~ /Regexp/
+
+ Other:
+
+ Left Right Description and pseudocode
+ ===============================================================
+ Object Any invoke ~~ overloading on Object,
+ or fall back to...
+
+ Any Num numeric equality
+ like: Any == Num
+ Num nummy[4] numeric equality
+ like: Num == nummy
+ undef Any check whether undefined
+ like: !defined(Any)
+ Any Any string equality
+ like: Any eq Any
+
+
+Notes:
+
+=over
+
+=item 1.
+Empty hashes or arrays match.
+
+=item 2.
+That is, each element smartmatches the element of the same index in the other array.[3]
+
+=item 3.
+If a circular reference is found, fall back to referential equality.
+
+=item 4.
+Either an actual number, or a string that looks like one.
+
+=back
+
+The smartmatch implicitly dereferences any non-blessed hash or array
+reference, so the C<I<HASH>> and C<I<ARRAY>> entries apply in those cases.
+For blessed references, the C<I<Object>> entries apply. Smartmatches
+involving hashes only consider hash keys, never hash values.
+
+The "like" code entry is not always an exact rendition. For example, the
+smartmatch operator short-circuits whenever possible, but C<grep> does
+not. Also, C<grep> in scalar context returns the number of matches, but
+C<~~> returns only true or false.
+
+Unlike most operators, the smartmatch operator knows to treat C<undef>
+specially:
+
+ use v5.10.1;
+ @array = (1, 2, 3, undef, 4, 5);
+ say "some elements undefined" if undef ~~ @array;
+
+Each operand is considered in a modified scalar context, the modification
+being that array and hash variables are passed by reference to the
+operator, which implicitly dereferences them. Both elements
+of each pair are the same:
+
+ use v5.10.1;
+
+ my %hash = (red => 1, blue => 2, green => 3,
+ orange => 4, yellow => 5, purple => 6,
+ black => 7, grey => 8, white => 9);
+
+ my @array = qw(red blue green);
+
+ say "some array elements in hash keys" if @array ~~ %hash;
+ say "some array elements in hash keys" if \@array ~~ \%hash;
+
+ say "red in array" if "red" ~~ @array;
+ say "red in array" if "red" ~~ \@array;
+
+ say "some keys end in e" if /e$/ ~~ %hash;
+ say "some keys end in e" if /e$/ ~~ \%hash;
+
+Two arrays smartmatch if each element in the first array smartmatches
+(that is, is "in") the corresponding element in the second array,
+recursively.
+
+ use v5.10.1;
+ my @little = qw(red blue green);
+ my @bigger = ("red", "blue", [ "orange", "green" ] );
+ if (@little ~~ @bigger) { # true!
+ say "little is contained in bigger";
+ }
+
+Because the smartmatch operator recurses on nested arrays, this
+will still report that "red" is in the array.
+
+ use v5.10.1;
+ my @array = qw(red blue green);
+ my $nested_array = [[[[[[[ @array ]]]]]]];
+ say "red in array" if "red" ~~ $nested_array;
+
+If two arrays smartmatch each other, then they are deep
+copies of each others' values, as this example reports:
+
+ use v5.12.0;
+ my @a = (0, 1, 2, [3, [4, 5], 6], 7);
+ my @b = (0, 1, 2, [3, [4, 5], 6], 7);
+
+ if (@a ~~ @b && @b ~~ @a) {
+ say "a and b are deep copies of each other";
+ }
+ elsif (@a ~~ @b) {
+ say "a smartmatches in b";
+ }
+ elsif (@b ~~ @a) {
+ say "b smartmatches in a";
+ }
+ else {
+ say "a and b don't smartmatch each other at all";
+ }
+
+
+If you were to set S<C<$b[3] = 4>>, then instead of reporting that "a and b
+are deep copies of each other", it now reports that C<"b smartmatches in a">.
+That's because the corresponding position in C<@a> contains an array that
+(eventually) has a 4 in it.
+
+Smartmatching one hash against another reports whether both contain the
+same keys, no more and no less. This could be used to see whether two
+records have the same field names, without caring what values those fields
+might have. For example:
+
+ use v5.10.1;
+ sub make_dogtag {
+ state $REQUIRED_FIELDS = { name=>1, rank=>1, serial_num=>1 };
+
+ my ($class, $init_fields) = @_;
+
+ die "Must supply (only) name, rank, and serial number"
+ unless $init_fields ~~ $REQUIRED_FIELDS;
+
+ ...
+ }
+
+However, this only does what you mean if C<$init_fields> is indeed a hash
+reference. The condition C<$init_fields ~~ $REQUIRED_FIELDS> also allows the
+strings C<"name">, C<"rank">, C<"serial_num"> as well as any array reference
+that contains C<"name"> or C<"rank"> or C<"serial_num"> anywhere to pass
+through.
+
+The smartmatch operator is most often used as the implicit operator of a
+C<when> clause. See the section on "Switch Statements" in L<perlsyn>.
+
+=head3 Smartmatching of Objects
+
+To avoid relying on an object's underlying representation, if the
+smartmatch's right operand is an object that doesn't overload C<~~>,
+it raises the exception "C<Smartmatching a non-overloaded object
+breaks encapsulation>". That's because one has no business digging
+around to see whether something is "in" an object. These are all
+illegal on objects without a C<~~> overload:
+
+ %hash ~~ $object
+ 42 ~~ $object
+ "fred" ~~ $object
+
+However, you can change the way an object is smartmatched by overloading
+the C<~~> operator. This is allowed to
+extend the usual smartmatch semantics.
+For objects that do have an C<~~> overload, see L<overload>.
+
+Using an object as the left operand is allowed, although not very useful.
+Smartmatching rules take precedence over overloading, so even if the
+object in the left operand has smartmatch overloading, this will be
+ignored. A left operand that is a non-overloaded object falls back on a
+string or numeric comparison of whatever the C<ref> operator returns. That
+means that
+
+ $object ~~ X
+
+does I<not> invoke the overload method with C<I<X>> as an argument.
+Instead the above table is consulted as normal, and based on the type of
+C<I<X>>, overloading may or may not be invoked. For simple strings or
+numbers, "in" becomes equivalent to this:
+
+ $object ~~ $number ref($object) == $number
+ $object ~~ $string ref($object) eq $string
+
+For example, this reports that the handle smells IOish
+(but please don't really do this!):
+
+ use IO::Handle;
+ my $fh = IO::Handle->new();
+ if ($fh ~~ /\bIO\b/) {
+ say "handle smells IOish";
+ }
+
+That's because it treats C<$fh> as a string like
+C<"IO::Handle=GLOB(0x8039e0)">, then pattern matches against that.
=head2 Bitwise And
X<operator, bitwise, and> X<bitwise and> X<&>
print "Even\n" if ($x & 1) == 0;
-If the experimental "bitwise" feature is enabled via S<C<use feature
-'bitwise'>>, then this operator always treats its operand as numbers. This
-feature produces a warning unless you also use C<S<no warnings
-'experimental::bitwise'>>.
+If the "bitwise" feature is enabled via S<C<use feature 'bitwise'>> or
+C<use v5.28>, then this operator always treats its operands as numbers.
+Before Perl 5.28 this feature produced a warning in the
+C<"experimental::bitwise"> category.
=head2 Bitwise Or and Exclusive Or
X<operator, bitwise, or> X<bitwise or> X<|> X<operator, bitwise, xor>
print "false\n" if (8 | 2) != 10;
-If the experimental "bitwise" feature is enabled via S<C<use feature
-'bitwise'>>, then this operator always treats its operand as numbers. This
-feature produces a warning unless you also use S<C<no warnings
-'experimental::bitwise'>>.
+If the "bitwise" feature is enabled via S<C<use feature 'bitwise'>> or
+C<use v5.28>, then this operator always treats its operands as numbers.
+Before Perl 5.28. this feature produced a warning in the
+C<"experimental::bitwise"> category.
=head2 C-style Logical And
X<&&> X<logical and> X<operator, logical, and>
@foo = @foo[0 .. $#foo]; # an expensive no-op
@foo = @foo[$#foo-4 .. $#foo]; # slice last 5 items
-The range operator (in list context) makes use of the magical
-auto-increment algorithm if the operands are strings. You
-can say
+Because each operand is evaluated in integer form, S<C<2.18 .. 3.14>> will
+return two elements in list context.
- @alphabet = ("A" .. "Z");
+ @list = (2.18 .. 3.14); # same as @list = (2 .. 3);
-to get all normal letters of the English alphabet, or
+The range operator in list context can make use of the magical
+auto-increment algorithm if both operands are strings, subject to the
+following rules:
- $hexdigit = (0 .. 9, "a" .. "f")[$num & 15];
+=over
+
+=item *
+
+With one exception (below), if both strings look like numbers to Perl,
+the magic increment will not be applied, and the strings will be treated
+as numbers (more specifically, integers) instead.
+
+For example, C<"-2".."2"> is the same as C<-2..2>, and
+C<"2.18".."3.14"> produces C<2, 3>.
+
+=item *
+
+The exception to the above rule is when the left-hand string begins with
+C<0> and is longer than one character, in this case the magic increment
+I<will> be applied, even though strings like C<"01"> would normally look
+like a number to Perl.
-to get a hexadecimal digit, or
+For example, C<"01".."04"> produces C<"01", "02", "03", "04">, and
+C<"00".."-1"> produces C<"00"> through C<"99"> - this may seem
+surprising, but see the following rules for why it works this way.
+To get dates with leading zeros, you can say:
@z2 = ("01" .. "31");
print $z2[$mday];
-to get dates with leading zeros.
+If you want to force strings to be interpreted as numbers, you could say
+
+ @numbers = ( 0+$first .. 0+$last );
+
+B<Note:> In Perl versions 5.30 and below, I<any> string on the left-hand
+side beginning with C<"0">, including the string C<"0"> itself, would
+cause the magic string increment behavior. This means that on these Perl
+versions, C<"0".."-1"> would produce C<"0"> through C<"99">, which was
+inconsistent with C<0..-1>, which produces the empty list. This also means
+that C<"0".."9"> now produces a list of integers instead of a list of
+strings.
+
+=item *
+
+If the initial value specified isn't part of a magical increment
+sequence (that is, a non-empty string matching C</^[a-zA-Z]*[0-9]*\z/>),
+only the initial value will be returned.
+
+For example, C<"ax".."az"> produces C<"ax", "ay", "az">, but
+C<"*x".."az"> produces only C<"*x">.
+
+=item *
+
+For other initial values that are strings that do follow the rules of the
+magical increment, the corresponding sequence will be returned.
+
+For example, you can say
+
+ @alphabet = ("A" .. "Z");
+
+to get all normal letters of the English alphabet, or
+
+ $hexdigit = (0 .. 9, "a" .. "f")[$num & 15];
+
+to get a hexadecimal digit.
+
+=item *
If the final value specified is not in the sequence that the magical
increment would produce, the sequence goes until the next value would
-be longer than the final value specified.
+be longer than the final value specified. If the length of the final
+string is shorter than the first, the empty list is returned.
+
+For example, C<"a".."--"> is the same as C<"a".."zz">, C<"0".."xx">
+produces C<"0"> through C<"99">, and C<"aaa".."--"> returns the empty
+list.
+
+=back
As of Perl 5.26, the list-context range operator on strings works as expected
in the scope of L<< S<C<"use feature 'unicode_strings">>|feature/The
that feature, it exhibits L<perlunicode/The "Unicode Bug">: its behavior
depends on the internal encoding of the range endpoint.
-If the initial value specified isn't part of a magical increment
-sequence (that is, a non-empty string matching C</^[a-zA-Z]*[0-9]*\z/>),
-only the initial value will be returned. So the following will only
-return an alpha:
+Because the magical increment only works on non-empty strings matching
+C</^[a-zA-Z]*[0-9]*\z/>, the following will only return an alpha:
use charnames "greek";
my @greek_small = ("\N{alpha}" .. "\N{omega}");
L<experimental feature|perlrecharclass/Extended Bracketed Character
Classes> C<S</(?[ \p{Greek} & \p{Lower} ])+/>>).
-Because each operand is evaluated in integer form, S<C<2.18 .. 3.14>> will
-return two elements in list context.
-
- @list = (2.18 .. 3.14); # same as @list = (2 .. 3);
-
=head2 Conditional Operator
X<operator, conditional> X<operator, ternary> X<ternary> X<?:>
side of the assignment.
The three dotted bitwise assignment operators (C<&.=> C<|.=> C<^.=>) are new in
-Perl 5.22 and experimental. See L</Bitwise String Operators>.
+Perl 5.22. See L</Bitwise String Operators>.
=head2 Comma Operator
X<comma> X<operator, comma> X<,>
qXfooX # WRONG!
The following escape sequences are available in constructs that interpolate,
-and in transliterations:
+and in transliterations whose delimiters aren't single quotes (C<"'">).
+In all the ones with braces, any number of blanks and/or tabs adjoining
+and within the braces are allowed (and ignored).
X<\t> X<\n> X<\r> X<\f> X<\b> X<\a> X<\e> X<\x> X<\0> X<\c> X<\N> X<\N{}>
X<\o{}>
\b backspace (BS)
\a alarm (bell) (BEL)
\e escape (ESC)
- \x{263A} [1,8] hex char (example: SMILEY)
+ \x{263A} [1,8] hex char (example shown: SMILEY)
+ \x{ 263A } Same, but shows optional blanks inside and
+ adjoining the braces
\x1b [2,8] restricted range hex char (example: ESC)
\N{name} [3] named Unicode character or character sequence
\N{U+263D} [4,8] Unicode character (example: FIRST QUARTER MOON)
\o{23072} [6,8] octal char (example: SMILEY)
\033 [7,8] restricted range octal char (example: ESC)
+Note that any escape sequence using braces inside interpolated
+constructs may have optional blanks (tab or space characters) adjoining
+with and inside of the braces, as illustrated above by the second
+S<C<\x{ }>> example.
+
=over 4
=item [1]
The result is the character specified by the hexadecimal number between
the braces. See L</[8]> below for details on which character.
-Only hexadecimal digits are valid between the braces. If an invalid
-character is encountered, a warning will be issued and the invalid
-character and all subsequent characters (valid or invalid) within the
-braces will be discarded.
+Blanks (tab or space characters) may separate the number from either or
+both of the braces.
+
+Otherwise, only hexadecimal digits are valid between the braces. If an
+invalid character is encountered, a warning will be issued and the
+invalid character and all subsequent characters (valid or invalid)
+within the braces will be discarded.
If there are no valid digits between the braces, the generated character is
the NULL character (C<\x{00}>). However, an explicit empty brace (C<\x{}>)
The result is the character specified by the octal number between the braces.
See L</[8]> below for details on which character.
-If a character that isn't an octal digit is encountered, a warning is raised,
-and the value is based on the octal digits before it, discarding it and all
-following characters up to the closing brace. It is a fatal error if there are
-no octal digits at all.
+Blanks (tab or space characters) may separate the number from either or
+both of the braces.
+
+Otherwise, if a character that isn't an octal digit is encountered, a
+warning is raised, and the value is based on the octal digits before it,
+discarding it and all following characters up to the closing brace. It
+is a fatal error if there are no octal digits at all.
=item [7]
C</o> modifier has is not propagated, being restricted to those patterns
explicitly using it.
-The last four modifiers listed above, added in Perl 5.14,
+The C</a>, C</d>, C</l>, and C</u> modifiers (added in Perl 5.14)
control the character set rules, but C</a> is the only one you are likely
to want to specify explicitly; the other three are selected
automatically by various pragmas.
s/([^ ]*) *([^ ]*)/$2 $1/; # reverse 1st two fields
+ $foo !~ s/A/a/g; # Lowercase all A's in $foo; return
+ # 0 if any were found and changed;
+ # otherwise return 1
+
Note the use of C<$> instead of C<\> in the last example. Unlike
B<sed>, we use the \<I<digit>> form only in the left hand side.
Anywhere else it's $<I<digit>>.
# expand tabs to 8-column spacing
1 while s/\t+/' ' x (length($&)*8 - length($`)%8)/e;
+X</c>While C<s///> accepts the C</c> flag, it has no effect beyond
+producing a warning if warnings are enabled.
+
=back
=head2 Quote-Like Operators
=item C<qq/I<STRING>/>
X<qq> X<quote, double> X<"> X<"">
-=item "I<STRING>"
+=item C<"I<STRING>">
A double-quoted, interpolated string.
=item C<`I<STRING>`>
A string which is (possibly) interpolated and then executed as a
-system command with F</bin/sh> or its equivalent. Shell wildcards,
-pipes, and redirections will be honored. The collected standard
-output of the command is returned; standard error is unaffected. In
-scalar context, it comes back as a single (potentially multi-line)
-string, or C<undef> if the command failed. In list context, returns a
+system command, via F</bin/sh> or its equivalent if required. Shell
+wildcards, pipes, and redirections will be honored. Similarly to
+C<system>, if the string contains no shell metacharacters then it will
+executed directly. The collected standard output of the command is
+returned; standard error is unaffected. In scalar context, it comes
+back as a single (potentially multi-line) string, or C<undef> if the
+shell (or command) could not be started. In list context, returns a
list of lines (however you've defined lines with C<$/> or
-C<$INPUT_RECORD_SEPARATOR>), or an empty list if the command failed.
+C<$INPUT_RECORD_SEPARATOR>), or an empty list if the shell (or command)
+could not be started.
Because backticks do not affect standard error, use shell file descriptor
syntax (assuming the shell supports this) if you care to address this.
use open IN => ":encoding(UTF-8)";
my $x = `cmd-producing-utf-8`;
+C<qx//> can also be called like a function with L<perlfunc/readpipe>.
+
See L</"I/O Operators"> for more discussion.
=item C<qw/I<STRING>/>
=item C<y/I<SEARCHLIST>/I<REPLACEMENTLIST>/cdsr>
-Transliterates all occurrences of the characters found in the search list
-with the corresponding character in the replacement list. It returns
-the number of characters replaced or deleted. If no string is
-specified via the C<=~> or C<!~> operator, the C<$_> string is transliterated.
+Transliterates all occurrences of the characters found (or not found
+if the C</c> modifier is specified) in the search list with the
+positionally corresponding character in the replacement list, possibly
+deleting some, depending on the modifiers specified. It returns the
+number of characters replaced or deleted. If no string is specified via
+the C<=~> or C<!~> operator, the C<$_> string is transliterated.
+
+For B<sed> devotees, C<y> is provided as a synonym for C<tr>.
If the C</r> (non-destructive) option is present, a new copy of the string
is made and its characters transliterated, and this copy is returned no
scalar variable, an array element, a hash element, or an assignment to one
of those; in other words, an lvalue.
-A character range may be specified with a hyphen, so C<tr/A-J/0-9/>
-does the same replacement as C<tr/ACEGIBDFHJ/0246813579/>.
-For B<sed> devotees, C<y> is provided as a synonym for C<tr>. If the
-I<SEARCHLIST> is delimited by bracketing quotes, the I<REPLACEMENTLIST>
-must have its own pair of quotes, which may or may not be bracketing
-quotes; for example, C<tr[aeiouy][yuoiea]> or C<tr(+\-*/)/ABCD/>.
+The characters delimitting I<SEARCHLIST> and I<REPLACEMENTLIST>
+can be any printable character, not just forward slashes. If they
+are single quotes (C<tr'I<SEARCHLIST>'I<REPLACEMENTLIST>'>), the only
+interpolation is removal of C<\> from pairs of C<\\>.
+
+Otherwise, a character range may be specified with a hyphen, so
+C<tr/A-J/0-9/> does the same replacement as
+C<tr/ACEGIBDFHJ/0246813579/>.
-Characters may be literals or any of the escape sequences accepted in
-double-quoted strings. But there is no variable interpolation, so C<"$">
-and C<"@"> are treated as literals. A hyphen at the beginning or end, or
-preceded by a backslash is considered a literal. Escape sequence
-details are in L<the table near the beginning of this section|/Quote and
-Quote-like Operators>.
+If the I<SEARCHLIST> is delimited by bracketing quotes, the
+I<REPLACEMENTLIST> must have its own pair of quotes, which may or may
+not be bracketing quotes; for example, C<tr[aeiouy][yuoiea]> or
+C<tr(+\-*/)/ABCD/>.
+
+Characters may be literals, or (if the delimiters aren't single quotes)
+any of the escape sequences accepted in double-quoted strings. But
+there is never any variable interpolation, so C<"$"> and C<"@"> are
+always treated as literals. A hyphen at the beginning or end, or
+preceded by a backslash is also always considered a literal. Escape
+sequence details are in L<the table near the beginning of this
+section|/Quote and Quote-like Operators>.
Note that C<tr> does B<not> do regular expression character classes such as
C<\d> or C<\pL>. The C<tr> operator is not equivalent to the C<L<tr(1)>>
removes from C<$string> all the platform's characters which are
equivalent to any of Unicode U+0020, U+0021, ... U+007D, U+007E. This
is a portable range, and has the same effect on every platform it is
-run on. It turns out that in this example, these are the ASCII
+run on. In this example, these are the ASCII
printable characters. So after this is run, C<$string> has only
controls and characters which have no ASCII equivalents.
But, even for portable ranges, it is not generally obvious what is
-included without having to look things up. A sound principle is to use
-only ranges that begin from and end at either ASCII alphabetics of equal
-case (C<b-e>, C<B-E>), or digits (C<1-4>). Anything else is unclear
-(and unportable unless C<\N{...}> is used). If in doubt, spell out the
-character sets in full.
+included without having to look things up in the manual. A sound
+principle is to use only ranges that both begin from, and end at, either
+ASCII alphabetics of equal case (C<b-e>, C<B-E>), or digits (C<1-4>).
+Anything else is unclear (and unportable unless C<\N{...}> is used). If
+in doubt, spell out the character sets in full.
Options:
c Complement the SEARCHLIST.
d Delete found but unreplaced characters.
- s Squash duplicate replaced characters.
r Return the modified string and leave the original string
untouched.
+ s Squash duplicate replaced characters.
-If the C</c> modifier is specified, the I<SEARCHLIST> character set
-is complemented. If the C</d> modifier is specified, any characters
-specified by I<SEARCHLIST> not found in I<REPLACEMENTLIST> are deleted.
-(Note that this is slightly more flexible than the behavior of some
-B<tr> programs, which delete anything they find in the I<SEARCHLIST>,
-period.) If the C</s> modifier is specified, sequences of characters
-that were transliterated to the same character are squashed down
-to a single instance of the character.
-
-If the C</d> modifier is used, the I<REPLACEMENTLIST> is always interpreted
-exactly as specified. Otherwise, if the I<REPLACEMENTLIST> is shorter
-than the I<SEARCHLIST>, the final character is replicated till it is long
-enough. If the I<REPLACEMENTLIST> is empty, the I<SEARCHLIST> is replicated.
-This latter is useful for counting characters in a class or for
-squashing character sequences in a class.
-
-Examples:
-
- $ARGV[1] =~ tr/A-Z/a-z/; # canonicalize to lower case ASCII
-
- $cnt = tr/*/*/; # count the stars in $_
-
- $cnt = $sky =~ tr/*/*/; # count the stars in $sky
-
- $cnt = tr/0-9//; # count the digits in $_
-
- tr/a-zA-Z//s; # bookkeeper -> bokeper
-
- ($HOST = $host) =~ tr/a-z/A-Z/;
- $HOST = $host =~ tr/a-z/A-Z/r; # same thing
-
- $HOST = $host =~ tr/a-z/A-Z/r # chained with s///r
- =~ s/:/ -p/r;
+If the C</d> modifier is specified, any characters specified by
+I<SEARCHLIST> not found in I<REPLACEMENTLIST> are deleted. (Note that
+this is slightly more flexible than the behavior of some B<tr> programs,
+which delete anything they find in the I<SEARCHLIST>, period.)
- tr/a-zA-Z/ /cs; # change non-alphas to single space
+If the C</s> modifier is specified, sequences of characters, all in a
+row, that were transliterated to the same character are squashed down to
+a single instance of that character.
- @stripped = map tr/a-zA-Z/ /csr, @original;
- # /r with map
+ my $a = "aaabbbca";
+ $a =~ tr/ab/dd/s; # $a now is "dcd"
- tr [\200-\377]
- [\000-\177]; # wickedly delete 8th bit
+If the C</d> modifier is used, the I<REPLACEMENTLIST> is always interpreted
+exactly as specified. Otherwise, if the I<REPLACEMENTLIST> is shorter
+than the I<SEARCHLIST>, the final character, if any, is replicated until
+it is long enough. There won't be a final character if and only if the
+I<REPLACEMENTLIST> is empty, in which case I<REPLACEMENTLIST> is
+copied from I<SEARCHLIST>. An empty I<REPLACEMENTLIST> is useful
+for counting characters in a class, or for squashing character sequences
+in a class.
+
+ tr/abcd// tr/abcd/abcd/
+ tr/abcd/AB/ tr/abcd/ABBB/
+ tr/abcd//d s/[abcd]//g
+ tr/abcd/AB/d (tr/ab/AB/ + s/[cd]//g) - but run together
+
+If the C</c> modifier is specified, the characters to be transliterated
+are the ones NOT in I<SEARCHLIST>, that is, it is complemented. If
+C</d> and/or C</s> are also specified, they apply to the complemented
+I<SEARCHLIST>. Recall, that if I<REPLACEMENTLIST> is empty (except
+under C</d>) a copy of I<SEARCHLIST> is used instead. That copy is made
+after complementing under C</c>. I<SEARCHLIST> is sorted by code point
+order after complementing, and any I<REPLACEMENTLIST> is applied to
+that sorted result. This means that under C</c>, the order of the
+characters specified in I<SEARCHLIST> is irrelevant. This can
+lead to different results on EBCDIC systems if I<REPLACEMENTLIST>
+contains more than one character, hence it is generally non-portable to
+use C</c> with such a I<REPLACEMENTLIST>.
+
+Another way of describing the operation is this:
+If C</c> is specified, the I<SEARCHLIST> is sorted by code point order,
+then complemented. If I<REPLACEMENTLIST> is empty and C</d> is not
+specified, I<REPLACEMENTLIST> is replaced by a copy of I<SEARCHLIST> (as
+modified under C</c>), and these potentially modified lists are used as
+the basis for what follows. Any character in the target string that
+isn't in I<SEARCHLIST> is passed through unchanged. Every other
+character in the target string is replaced by the character in
+I<REPLACEMENTLIST> that positionally corresponds to its mate in
+I<SEARCHLIST>, except that under C</s>, the 2nd and following characters
+are squeezed out in a sequence of characters in a row that all translate
+to the same character. If I<SEARCHLIST> is longer than
+I<REPLACEMENTLIST>, characters in the target string that match a
+character in I<SEARCHLIST> that doesn't have a correspondence in
+I<REPLACEMENTLIST> are either deleted from the target string if C</d> is
+specified; or replaced by the final character in I<REPLACEMENTLIST> if
+C</d> isn't specified.
+
+Some examples:
+
+ $ARGV[1] =~ tr/A-Z/a-z/; # canonicalize to lower case ASCII
+
+ $cnt = tr/*/*/; # count the stars in $_
+ $cnt = tr/*//; # same thing
+
+ $cnt = $sky =~ tr/*/*/; # count the stars in $sky
+ $cnt = $sky =~ tr/*//; # same thing
+
+ $cnt = $sky =~ tr/*//c; # count all the non-stars in $sky
+ $cnt = $sky =~ tr/*/*/c; # same, but transliterate each non-star
+ # into a star, leaving the already-stars
+ # alone. Afterwards, everything in $sky
+ # is a star.
+
+ $cnt = tr/0-9//; # count the ASCII digits in $_
+
+ tr/a-zA-Z//s; # bookkeeper -> bokeper
+ tr/o/o/s; # bookkeeper -> bokkeeper
+ tr/oe/oe/s; # bookkeeper -> bokkeper
+ tr/oe//s; # bookkeeper -> bokkeper
+ tr/oe/o/s; # bookkeeper -> bokkopor
+
+ ($HOST = $host) =~ tr/a-z/A-Z/;
+ $HOST = $host =~ tr/a-z/A-Z/r; # same thing
+
+ $HOST = $host =~ tr/a-z/A-Z/r # chained with s///r
+ =~ s/:/ -p/r;
+
+ tr/a-zA-Z/ /cs; # change non-alphas to single space
+
+ @stripped = map tr/a-zA-Z/ /csr, @original;
+ # /r with map
+
+ tr [\200-\377]
+ [\000-\177]; # wickedly delete 8th bit
+
+ $foo !~ tr/A/a/ # transliterate all the A's in $foo to 'a',
+ # return 0 if any were found and changed.
+ # Otherwise return 1
If multiple transliterations are given for a character, only the
first one is used:
- tr/AAA/XYZ/
+ tr/AAA/XYZ/
will transliterate any A to X.
interpolation. That means that if you want to use variables, you
must use an C<eval()>:
- eval "tr/$oldlist/$newlist/";
- die $@ if $@;
+ eval "tr/$oldlist/$newlist/";
+ die $@ if $@;
- eval "tr/$oldlist/$newlist/, 1" or die $@;
+ eval "tr/$oldlist/$newlist/, 1" or die $@;
=item C<< <<I<EOF> >>
X<here-doc> X<heredoc> X<here-document> X<<< << >>>
The terminating string may be either an identifier (a word), or some
quoted text. An unquoted identifier works like double quotes.
There may not be a space between the C<< << >> and the identifier,
-unless the identifier is explicitly quoted. (If you put a space it
-will be treated as a null identifier, which is valid, and matches the
-first empty line.) The terminating string must appear by itself
-(unquoted and with no surrounding whitespace) on the terminating line.
+unless the identifier is explicitly quoted. The terminating string
+must appear by itself (unquoted and with no surrounding whitespace)
+on the terminating line.
If the terminating string is quoted, the type of quotes used determine
the treatment of the text.
remain newlines. Unlike in any of the shells, single quotes do not
hide variable names in the command from interpretation. To pass a
literal dollar-sign through to the shell you need to hide it with a
-backslash. The generalized form of backticks is C<qx//>. (Because
+backslash. The generalized form of backticks is C<qx//>, or you can
+call the L<perlfunc/readpipe> function. (Because
backticks always undergo shell expansion as well, see L<perlsec> for
security concerns.)
X<qx> X<`> X<``> X<backtick> X<glob>
odd thing to you, but you'll use the construct in almost every Perl
script you write.) The C<$_> variable is not implicitly localized.
You'll have to put a S<C<local $_;>> before the loop if you want that
-to happen.
+to happen. Furthermore, if the input symbol or an explicit assignment
+of the input symbol to a scalar is used as a C<while>/C<for> condition,
+then the condition actually tests for definedness of the expression's
+value, not for its regular truth value.
-The following lines are equivalent:
+Thus the following lines are equivalent:
while (defined($_ = <STDIN>)) { print; }
while ($_ = <STDIN>) { print; }
C<< <I<FILEHANDLE>> >> may also be spelled C<readline(*I<FILEHANDLE>)>.
See L<perlfunc/readline>.
-The null filehandle C<< <> >> is special: it can be used to emulate the
+The null filehandle C<< <> >> (sometimes called the diamond operator) is
+special: it can be used to emulate the
behavior of B<sed> and B<awk>, and any other Unix filter program
that takes a list of filenames, doing the same to each line
of input from all of them. Input from C<< <> >> comes either from
and call it with S<C<perl dangerous.pl 'rm -rfv *|'>>, it actually opens a
pipe, executes the C<rm> command and reads C<rm>'s output from that pipe.
If you want all items in C<@ARGV> to be interpreted as file names, you
-can use the module C<ARGV::readonly> from CPAN, or use the double bracket:
+can use the module C<ARGV::readonly> from CPAN, or use the double
+diamond bracket:
while (<<>>) {
print;
@files = glob("$dir/*.[ch]");
@files = glob($files[$i]);
+If an angle-bracket-based globbing expression is used as the condition of
+a C<while> or C<for> loop, then it will be implicitly assigned to C<$_>.
+If either a globbing expression or an explicit assignment of a globbing
+expression to a scalar is used as a C<while>/C<for> condition, then
+the condition actually tests for definedness of the expression's value,
+not for its regular truth value.
+
=head2 Constant Folding
X<constant folding> X<folding>
$baz = 0+$foo & 0+$bar; # both ops explicitly numeric
$biz = "$foo" ^ "$bar"; # both ops explicitly stringy
-This somewhat unpredictable behavior can be avoided with the experimental
-"bitwise" feature, new in Perl 5.22. You can enable it via S<C<use feature
-'bitwise'>>. By default, it will warn unless the C<"experimental::bitwise">
-warnings category has been disabled. (S<C<use experimental 'bitwise'>> will
-enable the feature and disable the warning.) Under this feature, the four
+This somewhat unpredictable behavior can be avoided with the "bitwise"
+feature, new in Perl 5.22. You can enable it via S<C<use feature
+'bitwise'>> or C<use v5.28>. Before Perl 5.28, it used to emit a warning
+in the C<"experimental::bitwise"> category. Under this feature, the four
standard bitwise operators (C<~ | & ^>) are always numeric. Adding a dot
after each operator (C<~. |. &. ^.>) forces it to treat its operands as
strings:
- use experimental "bitwise";
+ use feature "bitwise";
$foo = 150 | 105; # yields 255 (0x96 | 0x69 is 0xFF)
$foo = '150' | 105; # yields 255
$foo = 150 | '105'; # yields 255