perlexperiment: document the private_use experiment

diff --git a/pod/perlop.pod b/pod/perlop.pod

index 67fcb3a..33e558a 100644
--- a/pod/perlop.pod
+++ b/pod/perlop.pod
@@ -36,7 +36,8 @@ C<(2 + 4) * 5>. So the expression yields C<2 + 20 == 22>, rather than
  C<6 * 5 == 30>.
  
  I<Operator associativity> defines what happens if a sequence of the same
-operators is used one after another: whether they will be grouped at the left
+operators is used one after another:
+usually that they will be grouped at the left
  or the right. For example, in C<9 - 3 - 2>, subtraction is left associative,
  so C<9 - 3> is grouped together as the left-hand operand of the second
  subtraction, rather than C<3 - 2> being grouped together as the right-hand
@@ -59,6 +60,63 @@ special evaluation rules that can result in an operand not being evaluated at
  all; in general, the top-level operator in an expression has control of
  operand evaluation.
  
+Some comparison operators, as their associativity, I<chain> with some
+operators of the same precedence (but never with operators of different
+precedence).  This chaining means that each comparison is performed
+on the two arguments surrounding it, with each interior argument taking
+part in two comparisons, and the comparison results are implicitly ANDed.
+Thus S<C<"$x E<lt> $y E<lt>= $z">> behaves exactly like S<C<"$x E<lt>
+$y && $y E<lt>= $z">>, assuming that C<"$y"> is as simple a scalar as
+it looks.  The ANDing short-circuits just like C<"&&"> does, stopping
+the sequence of comparisons as soon as one yields false.
+
+In a chained comparison, each argument expression is evaluated at most
+once, even if it takes part in two comparisons, but the result of the
+evaluation is fetched for each comparison.  (It is not evaluated
+at all if the short-circuiting means that it's not required for any
+comparisons.)  This matters if the computation of an interior argument
+is expensive or non-deterministic.  For example,
+
+    if($x < expensive_sub() <= $z) { ...
+
+is not entirely like
+
+    if($x < expensive_sub() && expensive_sub() <= $z) { ...
+
+but instead closer to
+
+    my $tmp = expensive_sub();
+    if($x < $tmp && $tmp <= $z) { ...
+
+in that the subroutine is only called once.  However, it's not exactly
+like this latter code either, because the chained comparison doesn't
+actually involve any temporary variable (named or otherwise): there is
+no assignment.  This doesn't make much difference where the expression
+is a call to an ordinary subroutine, but matters more with an lvalue
+subroutine, or if the argument expression yields some unusual kind of
+scalar by other means.  For example, if the argument expression yields
+a tied scalar, then the expression is evaluated to produce that scalar
+at most once, but the value of that scalar may be fetched up to twice,
+once for each comparison in which it is actually used.
+
+In this example, the expression is evaluated only once, and the tied
+scalar (the result of the expression) is fetched for each comparison that
+uses it.
+
+    if ($x < $tied_scalar < $z) { ...
+
+In the next example, the expression is evaluated only once, and the tied
+scalar is fetched once as part of the operation within the expression.
+The result of that operation is fetched for each comparison, which
+normally doesn't matter unless that expression result is also magical due
+to operator overloading.
+
+    if ($x < $tied_scalar + 42 < $z) { ...
+
+Some operators are instead non-associative, meaning that it is a syntax
+error to use a sequence of those operators of the same precedence.
+For example, S<C<"$x .. $y .. $z">> is an error.
+
  Perl operators have the following associativity and precedence,
  listed from highest precedence to lowest.  Operators borrowed from
  C keep the same precedence relationship with each other, even where
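
Illustration, not part of the patch: a minimal sketch of the chained-comparison rules added in this hunk, assuming Perl 5.32 or later. The subroutines `middle` and `never_called` are hypothetical stand-ins for an expensive operand.

```perl
use v5.32;    # chained comparisons need Perl 5.32 or later
use warnings;

my $calls = 0;
sub middle { $calls++; return 5 }    # hypothetical "expensive" operand

# The interior operand takes part in two comparisons but is called once.
my $in_range = ( 1 < middle() <= 10 ) ? 1 : 0;

# Short-circuit: the first comparison is false, so the last operand
# is never evaluated.
my $skipped = 1;
sub never_called { $skipped = 0; return 100 }
my ( $lo, $hi ) = ( 7, 3 );
my $ordered = ( $lo < $hi < never_called() ) ? 1 : 0;

say "in_range=$in_range calls=$calls ordered=$ordered skipped=$skipped";
```

If the interior operand were a tied scalar rather than a subroutine call, the expression would still be evaluated once, but the scalar's value could be fetched once per comparison, as the text above describes.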
@@ -70,16 +128,17 @@ values only, not array values.
      left       ->
      nonassoc   ++ --
      right      **
-    right      ! ~ \ and unary + and -
+    right      ! ~ ~. \ and unary + and -
      left       =~ !~
      left       * / % x
      left       + - .
      left       << >>
      nonassoc   named unary operators
-    nonassoc   < > <= >= lt gt le ge
-    nonassoc   == != <=> eq ne cmp ~~
-    left       &
-    left       | ^
+    chained    < > <= >= lt gt le ge
+    chain/na   == != eq ne <=> cmp ~~
+    nonassoc   isa
+    left       & &.
+    left       | |. ^ ^.
      left       &&
      left       || //
      nonassoc   ..  ...
@@ -516,6 +575,12 @@ Binary C<"ge"> returns true if the left argument is stringwise greater
  than or equal to the right argument.
  X<< ge >>
  
+A sequence of relational operators, such as S<C<"$x E<lt> $y E<lt>=
+$z">>, performs chained comparisons, in the manner described above in
+the section L</"Operator Precedence and Associativity">.
+Beware that they do not chain with equality operators, which have lower
+precedence.
+
  =head2 Equality Operators
  X<equality> X<equal> X<equals> X<operator, equality>
  
@@ -527,6 +592,20 @@ Binary C<< "!=" >> returns true if the left argument is numerically not equal
  to the right argument.
  X<!=>
  
+Binary C<"eq"> returns true if the left argument is stringwise equal to
+the right argument.
+X<eq>
+
+Binary C<"ne"> returns true if the left argument is stringwise not equal
+to the right argument.
+X<ne>
+
+A sequence of the above equality operators, such as S<C<"$x == $y ==
+$z">>, performs chained comparisons, in the manner described above in
+the section L</"Operator Precedence and Associativity">.
+Beware that they do not chain with relational operators, which have
+higher precedence.
+
  Binary C<< "<=>" >> returns -1, 0, or 1 depending on whether the left
  argument is numerically less than, equal to, or greater than the right
  argument.  If your platform supports C<NaN>'s (not-a-numbers) as numeric
@@ -544,14 +623,6 @@ X<spaceship>
  (Note that the L<bigint>, L<bigrat>, and L<bignum> pragmas all
  support C<"NaN">.)
  
-Binary C<"eq"> returns true if the left argument is stringwise equal to
-the right argument.
-X<eq>
-
-Binary C<"ne"> returns true if the left argument is stringwise not equal
-to the right argument.
-X<ne>
-
  Binary C<"cmp"> returns -1, 0, or 1 depending on whether the left
  argument is stringwise less than, equal to, or greater than the right
  argument.
@@ -561,6 +632,10 @@ Binary C<"~~"> does a smartmatch between its arguments.  Smart matching
  is described in the next section.
  X<~~>
  
+The two-sided ordering operators C<"E<lt>=E<gt>"> and C<"cmp">, and the
+smartmatch operator C<"~~">, are non-associative with respect to each
+other and with respect to the equality operators of the same precedence.
+
  C<"lt">, C<"le">, C<"ge">, C<"gt"> and C<"cmp"> use the collation (sort)
  order specified by the current C<LC_COLLATE> locale if a S<C<use
  locale>> form that includes collation is in effect.  See L<perllocale>.
@@ -575,6 +650,25 @@ function, available in Perl v5.16 or later:
  
      if ( fc($x) eq fc($y) ) { ... }
  
+=head2 Class Instance Operator
+X<isa operator>
+
+Binary C<isa> evaluates to true when the left argument is an object instance of
+the class (or a subclass derived from that class) given by the right argument.
+If the left argument is not defined, is not a blessed object instance, or does
+not derive from the class given by the right argument, the operator evaluates
+as false. The right argument may give the class either as a bareword or a
+scalar expression that yields a string class name:
+
+    if( $obj isa Some::Class ) { ... }
+
+    if( $obj isa "Different::Class" ) { ... }
+    if( $obj isa $name_of_class ) { ... }
+
+This is an experimental feature and is available from Perl 5.31.6 when enabled
+by C<use feature 'isa'>. It emits a warning in the C<experimental::isa>
+category.
+
  =head2 Smartmatch Operator
  
  First available in Perl 5.10.1 (the 5.10.0 version behaved differently),
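
Illustration, not part of the patch: a small sketch of the C<isa> operator described in this hunk, assuming Perl 5.32 or later. The classes `Animal` and `Dog` are hypothetical and exist only for this example.

```perl
use v5.32;
use warnings;
use feature 'isa';
no warnings 'experimental::isa';

# Hypothetical classes, defined only for this example.
package Animal { sub new { my $class = shift; return bless {}, $class } }
package Dog    { our @ISA = ('Animal') }

my $dog        = Dog->new;
my $nothing;                         # undefined left operand
my $plain      = { name => 'x' };    # unblessed reference
my $class_name = 'Animal';

my $is_dog      = ( $dog isa Dog )         ? 1 : 0;    # bareword class
my $is_animal   = ( $dog isa "Animal" )    ? 1 : 0;    # string class name
my $via_scalar  = ( $dog isa $class_name ) ? 1 : 0;    # scalar expression
my $undef_left  = ( $nothing isa Animal )  ? 1 : 0;    # false, no error
my $plain_left  = ( $plain isa Animal )    ? 1 : 0;    # false, not blessed

say "$is_dog $is_animal $via_scalar $undef_left $plain_left";
```

Unlike calling C<< $obj->isa(...) >> as a method, the operator form does not die when the left operand is undefined or unblessed; it simply evaluates as false.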
@@ -593,7 +687,7 @@ type information to select a suitable comparison mechanism.
  
  The C<~~> operator compares its operands "polymorphically", determining how
  to compare them according to their actual types (numeric, string, array,
-hash, etc.)  Like the equality operators with which it shares the same
+hash, etc.).  Like the equality operators with which it shares the same
  precedence, C<~~> returns 1 for true and C<""> for false.  It is often best
  read aloud as "in", "inside of", or "is contained in", because the left
  operand is often looked for I<inside> the right operand.  That makes the
@@ -1081,26 +1175,89 @@ And now some examples as a list operator:
      @foo = @foo[0 .. $#foo];        # an expensive no-op
      @foo = @foo[$#foo-4 .. $#foo];  # slice last 5 items
  
-The range operator (in list context) makes use of the magical
-auto-increment algorithm if the operands are strings.  You
-can say
+Because each operand is evaluated in integer form, S<C<2.18 .. 3.14>> will
+return two elements in list context.
  
-    @alphabet = ("A" .. "Z");
+    @list = (2.18 .. 3.14); # same as @list = (2 .. 3);
  
-to get all normal letters of the English alphabet, or
+The range operator in list context can make use of the magical
+auto-increment algorithm if both operands are strings, subject to the
+following rules:
  
-    $hexdigit = (0 .. 9, "a" .. "f")[$num & 15];
+=over
+
+=item *
+
+With one exception (below), if both strings look like numbers to Perl,
+the magic increment will not be applied, and the strings will be treated
+as numbers (more specifically, integers) instead.
  
-to get a hexadecimal digit, or
+For example, C<"-2".."2"> is the same as C<-2..2>, and
+C<"2.18".."3.14"> produces C<2, 3>.
+
+=item *
+
+The exception to the above rule is when the left-hand string begins with
+C<0> and is longer than one character; in this case the magic increment
+I<will> be applied, even though strings like C<"01"> would normally look
+like a number to Perl.
+
+For example, C<"01".."04"> produces C<"01", "02", "03", "04">, and
+C<"00".."-1"> produces C<"00"> through C<"99"> - this may seem
+surprising, but see the following rules for why it works this way.
+To get dates with leading zeros, you can say:
  
      @z2 = ("01" .. "31");
      print $z2[$mday];
  
-to get dates with leading zeros.
+If you want to force strings to be interpreted as numbers, you could say
+
+    @numbers = ( 0+$first .. 0+$last );
+
+B<Note:> In Perl versions 5.30 and below, I<any> string on the left-hand
+side beginning with C<"0">, including the string C<"0"> itself, would
+cause the magic string increment behavior. This means that on these Perl
+versions, C<"0".."-1"> would produce C<"0"> through C<"99">, which was
+inconsistent with C<0..-1>, which produces the empty list. This also means
+that C<"0".."9"> now produces a list of integers instead of a list of
+strings.
+
+=item *
+
+If the initial value specified isn't part of a magical increment
+sequence (that is, a non-empty string matching C</^[a-zA-Z]*[0-9]*\z/>),
+only the initial value will be returned.
+
+For example, C<"ax".."az"> produces C<"ax", "ay", "az">, but
+C<"*x".."az"> produces only C<"*x">.
+
+=item *
+
+For other initial values that are strings that do follow the rules of the
+magical increment, the corresponding sequence will be returned.
+
+For example, you can say
+
+    @alphabet = ("A" .. "Z");
+
+to get all normal letters of the English alphabet, or
+
+    $hexdigit = (0 .. 9, "a" .. "f")[$num & 15];
+
+to get a hexadecimal digit.
+
+=item *
  
  If the final value specified is not in the sequence that the magical
  increment would produce, the sequence goes until the next value would
-be longer than the final value specified.
+be longer than the final value specified. If the length of the final
+string is shorter than the first, the empty list is returned.
+
+For example, C<"a".."--"> is the same as C<"a".."zz">, C<"0".."xx">
+produces C<"0"> through C<"99">, and C<"aaa".."--"> returns the empty
+list.
+
+=back
  
  As of Perl 5.26, the list-context range operator on strings works as expected
  in the scope of L<< S<C<"use feature 'unicode_strings">>|feature/The
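
Illustration, not part of the patch: a short sketch exercising several of the list-context range rules from this hunk. The variable names are chosen only for this example.

```perl
use strict;
use warnings;

my @ints    = ( 2.18 .. 3.14 );     # operands are truncated to integers
my @numeric = ( "-2" .. "2" );      # both look like numbers: numeric range
my @padded  = ( "01" .. "04" );     # leading zero: magic string increment
my @letters = ( "ax" .. "az" );     # ordinary magic increment
my @by_len  = ( "a" .. "--" );      # runs until the next value is too long
my @empty   = ( "aaa" .. "--" );    # final string shorter than the first

print "ints:    @ints\n";
print "numeric: @numeric\n";
print "padded:  @padded\n";
print "letters: @letters\n";
print "by_len:  ", scalar(@by_len), " elements\n";
print "empty:   ", scalar(@empty),  " elements\n";
```

The C<"a".."--"> case yields the 26 one-letter strings followed by the 676 two-letter strings, which is the same list as C<"a".."zz">.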
@@ -1108,10 +1265,8 @@ in the scope of L<< S<C<"use feature 'unicode_strings">>|feature/The
  that feature, it exhibits L<perlunicode/The "Unicode Bug">: its behavior
  depends on the internal encoding of the range endpoint.
  
-If the initial value specified isn't part of a magical increment
-sequence (that is, a non-empty string matching C</^[a-zA-Z]*[0-9]*\z/>),
-only the initial value will be returned.  So the following will only
-return an alpha:
+Because the magical increment only works on non-empty strings matching
+C</^[a-zA-Z]*[0-9]*\z/>, the following will only return an alpha:
  
      use charnames "greek";
      my @greek_small =  ("\N{alpha}" .. "\N{omega}");
@@ -1131,11 +1286,6 @@ you could use the pattern C</(?:(?=\p{Greek})\p{Lower})+/> (or the
  L<experimental feature|perlrecharclass/Extended Bracketed Character
  Classes> C<S</(?[ \p{Greek} & \p{Lower} ])+/>>).
  
-Because each operand is evaluated in integer form, S<C<2.18 .. 3.14>> will
-return two elements in list context.
-
-    @list = (2.18 .. 3.14); # same as @list = (2 .. 3);
-
  =head2 Conditional Operator
  X<operator, conditional> X<operator, ternary> X<ternary> X<?:>
  
@@ -1443,7 +1593,9 @@ is a word character (meaning it matches C</\w/>):
      qXfooX  # WRONG!
  
  The following escape sequences are available in constructs that interpolate,
-and in transliterations:
+and in transliterations whose delimiters aren't single quotes (C<"'">).
+In all the ones with braces, any number of blanks and/or tabs adjoining
+and within the braces are allowed (and ignored).
  X<\t> X<\n> X<\r> X<\f> X<\b> X<\a> X<\e> X<\x> X<\0> X<\c> X<\N> X<\N{}>
  X<\o{}>
  
@@ -1455,7 +1607,9 @@ X<\o{}>
      \b                  backspace         (BS)
      \a                  alarm (bell)      (BEL)
      \e                  escape            (ESC)
-    \x{263A}     [1,8]  hex char          (example: SMILEY)
+    \x{263A}     [1,8]  hex char          (example shown: SMILEY)
+    \x{ 263A }          Same, but shows optional blanks inside and
+                        adjoining the braces
      \x1b         [2,8]  restricted range hex char (example: ESC)
      \N{name}     [3]    named Unicode character or character sequence
      \N{U+263D}   [4,8]  Unicode character (example: FIRST QUARTER MOON)
@@ -1463,6 +1617,11 @@ X<\o{}>
      \o{23072}    [6,8]  octal char        (example: SMILEY)
      \033         [7,8]  restricted range octal char  (example: ESC)
  
+Note that any escape sequence using braces inside interpolated
+constructs may have optional blanks (tab or space characters) adjoining
+and inside the braces, as illustrated above by the second
+S<C<\x{ }>> example.
+
  =over 4
  
  =item [1]
@@ -1470,10 +1629,13 @@ X<\o{}>
  The result is the character specified by the hexadecimal number between
  the braces.  See L</[8]> below for details on which character.
  
-Only hexadecimal digits are valid between the braces.  If an invalid
-character is encountered, a warning will be issued and the invalid
-character and all subsequent characters (valid or invalid) within the
-braces will be discarded.
+Blanks (tab or space characters) may separate the number from either or
+both of the braces.
+
+Otherwise, only hexadecimal digits are valid between the braces.  If an
+invalid character is encountered, a warning will be issued and the
+invalid character and all subsequent characters (valid or invalid)
+within the braces will be discarded.
  
  If there are no valid digits between the braces, the generated character is
  the NULL character (C<\x{00}>).  However, an explicit empty brace (C<\x{}>)
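
Illustration, not part of the patch: a minimal check of the optional blanks inside braces, assuming Perl 5.34 or later (older versions treat the blank as an invalid digit). Octal 23072 and hexadecimal 263A are the same code point.

```perl
use v5.34;
use warnings;

my $plain  = "\x{263A}";
my $spaced = "\x{ 263A }";     # blanks adjoining the braces are ignored
my $octal  = "\o{ 23072 }";    # same code point, written in octal

my $all_same = ( $plain eq $spaced && $plain eq $octal ) ? 1 : 0;
printf "U+%04X all_same=%d\n", ord($plain), $all_same;
```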
@@ -1559,10 +1721,13 @@ To get platform independent controls, you can use C<\N{...}>.
  The result is the character specified by the octal number between the braces.
  See L</[8]> below for details on which character.
  
-If a character that isn't an octal digit is encountered, a warning is raised,
-and the value is based on the octal digits before it, discarding it and all
-following characters up to the closing brace.  It is a fatal error if there are
-no octal digits at all.
+Blanks (tab or space characters) may separate the number from either or
+both of the braces.
+
+Otherwise, if a character that isn't an octal digit is encountered, a
+warning is raised, and the value is based on the octal digits before it,
+discarding it and all following characters up to the closing brace.  It
+is a fatal error if there are no octal digits at all.
  
  =item [7]
  
@@ -2211,6 +2376,10 @@ Examples:
  
      s/([^ ]*) *([^ ]*)/$2 $1/; # reverse 1st two fields
  
+    $foo !~ s/A/a/g;    # Lowercase all A's in $foo; return
+                        # false if any were found and changed;
+                        # otherwise return true
+
  Note the use of C<$> instead of C<\> in the last example.  Unlike
  B<sed>, we use the \<I<digit>> form only in the left hand side.
  Anywhere else it's $<I<digit>>.
@@ -2224,6 +2393,9 @@ to occur that you might want.  Here are two common cases:
      # expand tabs to 8-column spacing
      1 while s/\t+/' ' x (length($&)*8 - length($`)%8)/e;
  
+X</c>While C<s///> accepts the C</c> flag, it has no effect beyond
+producing a warning if warnings are enabled.
+
  =back
  
  =head2 Quote-Like Operators
@@ -2247,7 +2419,7 @@ the delimiter or backslash is interpolated.
  =item C<qq/I<STRING>/>
  X<qq> X<quote, double> X<"> X<"">
  
-=item "I<STRING>"
+=item C<"I<STRING>">
  
  A double-quoted, interpolated string.
  
@@ -2262,13 +2434,16 @@ X<qx> X<`> X<``> X<backtick>
  =item C<`I<STRING>`>
  
  A string which is (possibly) interpolated and then executed as a
-system command with F</bin/sh> or its equivalent.  Shell wildcards,
-pipes, and redirections will be honored.  The collected standard
-output of the command is returned; standard error is unaffected.  In
-scalar context, it comes back as a single (potentially multi-line)
-string, or C<undef> if the command failed.  In list context, returns a
+system command, via F</bin/sh> or its equivalent if required.  Shell
+wildcards, pipes, and redirections will be honored.  Similarly to
+C<system>, if the string contains no shell metacharacters then it will
+be executed directly.  The collected standard output of the command is
+returned; standard error is unaffected.  In scalar context, it comes
+back as a single (potentially multi-line) string, or C<undef> if the
+shell (or command) could not be started.  In list context, returns a
  list of lines (however you've defined lines with C<$/> or
-C<$INPUT_RECORD_SEPARATOR>), or an empty list if the command failed.
+C<$INPUT_RECORD_SEPARATOR>), or an empty list if the shell (or command)
+could not be started.
  
  Because backticks do not affect standard error, use shell file descriptor
  syntax (assuming the shell supports this) if you care to address this.
@@ -2366,6 +2541,8 @@ output of the command, for example:
    use open IN => ":encoding(UTF-8)";
    my $x = `cmd-producing-utf-8`;
  
+C<qx//> can also be called like a function with L<perlfunc/readpipe>.
+
  See L</"I/O Operators"> for more discussion.
  
  =item C<qw/I<STRING>/>
@@ -2403,10 +2580,14 @@ X<tr> X<y> X<transliterate> X</c> X</d> X</s>
  
  =item C<y/I<SEARCHLIST>/I<REPLACEMENTLIST>/cdsr>
  
-Transliterates all occurrences of the characters found in the search list
-with the corresponding character in the replacement list.  It returns
-the number of characters replaced or deleted.  If no string is
-specified via the C<=~> or C<!~> operator, the C<$_> string is transliterated.
+Transliterates all occurrences of the characters found (or not found
+if the C</c> modifier is specified) in the search list with the
+positionally corresponding character in the replacement list, possibly
+deleting some, depending on the modifiers specified.  It returns the
+number of characters replaced or deleted.  If no string is specified via
+the C<=~> or C<!~> operator, the C<$_> string is transliterated.
+
+For B<sed> devotees, C<y> is provided as a synonym for C<tr>.
  
  If the C</r> (non-destructive) option is present, a new copy of the string
  is made and its characters transliterated, and this copy is returned no
@@ -2418,22 +2599,27 @@ Unless the C</r> option is used, the string specified with C<=~> must be a
  scalar variable, an array element, a hash element, or an assignment to one
  of those; in other words, an lvalue.
  
-A character range may be specified with a hyphen, so C<tr/A-J/0-9/>
-does the same replacement as C<tr/ACEGIBDFHJ/0246813579/>.
+The characters delimiting I<SEARCHLIST> and I<REPLACEMENTLIST>
+can be any printable character, not just forward slashes.  If they
+are single quotes (C<tr'I<SEARCHLIST>'I<REPLACEMENTLIST>'>), the only
+interpolation is removal of C<\> from pairs of C<\\>.
  
-For B<sed> devotees, C<y> is provided as a synonym for C<tr>.
+Otherwise, a character range may be specified with a hyphen, so
+C<tr/A-J/0-9/> does the same replacement as
+C<tr/ACEGIBDFHJ/0246813579/>.
  
-If the
-I<SEARCHLIST> is delimited by bracketing quotes, the I<REPLACEMENTLIST>
-must have its own pair of quotes, which may or may not be bracketing
-quotes; for example, C<tr[aeiouy][yuoiea]> or C<tr(+\-*/)/ABCD/>.
+If the I<SEARCHLIST> is delimited by bracketing quotes, the
+I<REPLACEMENTLIST> must have its own pair of quotes, which may or may
+not be bracketing quotes; for example, C<tr[aeiouy][yuoiea]> or
+C<tr(+\-*/)/ABCD/>.
  
-Characters may be literals or any of the escape sequences accepted in
-double-quoted strings.  But there is no variable interpolation, so C<"$">
-and C<"@"> are treated as literals.  A hyphen at the beginning or end, or
-preceded by a backslash is considered a literal.  Escape sequence
-details are in L<the table near the beginning of this section|/Quote and
-Quote-like Operators>.
+Characters may be literals, or (if the delimiters aren't single quotes)
+any of the escape sequences accepted in double-quoted strings.  But
+there is never any variable interpolation, so C<"$"> and C<"@"> are
+always treated as literals.  A hyphen at the beginning or end, or
+preceded by a backslash, is also always considered a literal.  Escape
+sequence details are in L<the table near the beginning of this
+section|/Quote and Quote-like Operators>.
  
  Note that C<tr> does B<not> do regular expression character classes such as
  C<\d> or C<\pL>.  The C<tr> operator is not equivalent to the C<L<tr(1)>>
@@ -2472,85 +2658,128 @@ range's end points are expressed as C<\N{...}>
  removes from C<$string> all the platform's characters which are
  equivalent to any of Unicode U+0020, U+0021, ... U+007D, U+007E.  This
  is a portable range, and has the same effect on every platform it is
-run on.  It turns out that in this example, these are the ASCII
+run on.  In this example, these are the ASCII
  printable characters.  So after this is run, C<$string> has only
  controls and characters which have no ASCII equivalents.
  
  But, even for portable ranges, it is not generally obvious what is
-included without having to look things up.  A sound principle is to use
-only ranges that both begin from and end at either ASCII alphabetics of
-equal case (C<b-e>, C<B-E>), or digits (C<1-4>).  Anything else is
-unclear (and unportable unless C<\N{...}> is used).  If in doubt, spell
-out the character sets in full.
+included without having to look things up in the manual.  A sound
+principle is to use only ranges that both begin from, and end at, either
+ASCII alphabetics of equal case (C<b-e>, C<B-E>), or digits (C<1-4>).
+Anything else is unclear (and unportable unless C<\N{...}> is used).  If
+in doubt, spell out the character sets in full.
  
  Options:
  
      c  Complement the SEARCHLIST.
      d  Delete found but unreplaced characters.
-    s  Squash duplicate replaced characters.
      r  Return the modified string and leave the original string
         untouched.
+    s  Squash duplicate replaced characters.
  
-If the C</c> modifier is specified, the I<SEARCHLIST> character set
-is complemented. So for example these two are equivalent (the exact
-maximum number will depend on your platform):
-
-    tr/\x00-\xfd/ABCD/c
-    tr/\xfe-\x{7fffffff}/ABCD/
+If the C</d> modifier is specified, any characters specified by
+I<SEARCHLIST> not found in I<REPLACEMENTLIST> are deleted.  (Note that
+this is slightly more flexible than the behavior of some B<tr> programs,
+which delete anything they find in the I<SEARCHLIST>, period.)
  
-If the C</d> modifier is specified, any characters
-specified by I<SEARCHLIST> not found in I<REPLACEMENTLIST> are deleted.
-(Note that this is slightly more flexible than the behavior of some
-B<tr> programs, which delete anything they find in the I<SEARCHLIST>,
-period.)
+If the C</s> modifier is specified, sequences of characters, all in a
+row, that were transliterated to the same character are squashed down to
+a single instance of that character.
  
-If the C</s> modifier is specified, runs of the same character in the
-result, where each those characters were substituted by the
-transliteration, are squashed down to a single instance of the character.
+ my $a = "aaabbbca";
+ $a =~ tr/ab/dd/s;     # $a now is "dcd"
  
  If the C</d> modifier is used, the I<REPLACEMENTLIST> is always interpreted
  exactly as specified.  Otherwise, if the I<REPLACEMENTLIST> is shorter
-than the I<SEARCHLIST>, the final character is replicated till it is long
-enough.  If the I<REPLACEMENTLIST> is empty, the I<SEARCHLIST> is replicated.
-This latter is useful for counting characters in a class or for
-squashing character sequences in a class. For example, each of these pairs
-are equivalent:
+than the I<SEARCHLIST>, the final character, if any, is replicated until
+it is long enough.  There won't be a final character if and only if the
+I<REPLACEMENTLIST> is empty, in which case I<REPLACEMENTLIST> is
+copied from I<SEARCHLIST>.  An empty I<REPLACEMENTLIST> is useful
+for counting characters in a class, or for squashing character sequences
+in a class.
  
      tr/abcd//            tr/abcd/abcd/
      tr/abcd/AB/          tr/abcd/ABBB/
      tr/abcd//d           s/[abcd]//g
      tr/abcd/AB/d         (tr/ab/AB/ + s/[cd]//g)  - but run together
  
+If the C</c> modifier is specified, the characters to be transliterated
+are the ones NOT in I<SEARCHLIST>, that is, it is complemented.  If
+C</d> and/or C</s> are also specified, they apply to the complemented
+I<SEARCHLIST>.  Recall that if I<REPLACEMENTLIST> is empty (except
+under C</d>) a copy of I<SEARCHLIST> is used instead.  That copy is made
+after complementing under C</c>.  I<SEARCHLIST> is sorted by code point
+order after complementing, and any I<REPLACEMENTLIST> is applied to
+that sorted result.  This means that under C</c>, the order of the
+characters specified in I<SEARCHLIST> is irrelevant.  This can
+lead to different results on EBCDIC systems if I<REPLACEMENTLIST>
+contains more than one character, hence it is generally non-portable to
+use C</c> with such a I<REPLACEMENTLIST>.
+
+Another way of describing the operation is this:
+If C</c> is specified, the I<SEARCHLIST> is sorted by code point order,
+then complemented.  If I<REPLACEMENTLIST> is empty and C</d> is not
+specified, I<REPLACEMENTLIST> is replaced by a copy of I<SEARCHLIST> (as
+modified under C</c>), and these potentially modified lists are used as
+the basis for what follows.  Any character in the target string that
+isn't in I<SEARCHLIST> is passed through unchanged.  Every other
+character in the target string is replaced by the character in
+I<REPLACEMENTLIST> that positionally corresponds to its mate in
+I<SEARCHLIST>, except that under C</s>, in a run of consecutive characters
+that all translate to the same character, the second and following ones
+are squeezed out.  If I<SEARCHLIST> is longer than
+I<REPLACEMENTLIST>, characters in the target string that match a
+character in I<SEARCHLIST> that doesn't have a correspondence in
+I<REPLACEMENTLIST> are either deleted from the target string, if C</d> is
+specified, or replaced by the final character in I<REPLACEMENTLIST>, if
+C</d> isn't specified.
+
  Some examples:
  
-    $ARGV[1] =~ tr/A-Z/a-z/;   # canonicalize to lower case ASCII
+ $ARGV[1] =~ tr/A-Z/a-z/;   # canonicalize to lower case ASCII
+
+ $cnt = tr/*/*/;            # count the stars in $_
+ $cnt = tr/*//;             # same thing
+
+ $cnt = $sky =~ tr/*/*/;    # count the stars in $sky
+ $cnt = $sky =~ tr/*//;     # same thing
  
-    $cnt = tr/*/*/;            # count the stars in $_
+ $cnt = $sky =~ tr/*//c;    # count all the non-stars in $sky
+ $cnt = $sky =~ tr/*/*/c;   # same, but transliterate each non-star
+                            # into a star, leaving the already-stars
+                            # alone.  Afterwards, everything in $sky
+                            # is a star.
  
-    $cnt = $sky =~ tr/*/*/;    # count the stars in $sky
+ $cnt = tr/0-9//;           # count the ASCII digits in $_
  
-    $cnt = tr/0-9//;           # count the digits in $_
+ tr/a-zA-Z//s;              # bookkeeper -> bokeper
+ tr/o/o/s;                  # bookkeeper -> bokkeeper
+ tr/oe/oe/s;                # bookkeeper -> bokkeper
+ tr/oe//s;                  # bookkeeper -> bokkeper
+ tr/oe/o/s;                 # bookkeeper -> bokkopor
  
-    tr/a-zA-Z//s;              # bookkeeper -> bokeper
+ ($HOST = $host) =~ tr/a-z/A-Z/;
+  $HOST = $host  =~ tr/a-z/A-Z/r; # same thing
  
-    ($HOST = $host) =~ tr/a-z/A-Z/;
-     $HOST = $host  =~ tr/a-z/A-Z/r;   # same thing
+ $HOST = $host =~ tr/a-z/A-Z/r   # chained with s///r
+               =~ s/:/ -p/r;
  
-    $HOST = $host =~ tr/a-z/A-Z/r    # chained with s///r
-                  =~ s/:/ -p/r;
+ tr/a-zA-Z/ /cs;                 # change non-alphas to single space
  
-    tr/a-zA-Z/ /cs;            # change non-alphas to single space
+ @stripped = map tr/a-zA-Z/ /csr, @original;
+                                 # /r with map
  
-    @stripped = map tr/a-zA-Z/ /csr, @original;
-                               # /r with map
+ tr [\200-\377]
+    [\000-\177];                 # wickedly delete 8th bit
  
-    tr [\200-\377]
-       [\000-\177];            # wickedly delete 8th bit
+ $foo !~ tr/A/a/    # transliterate all the A's in $foo to 'a',
+                    # return false if any were found and changed.
+                    # Otherwise return true
  
  If multiple transliterations are given for a character, only the
  first one is used:
  
-    tr/AAA/XYZ/
+ tr/AAA/XYZ/
  
  will transliterate any A to X.
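
Illustration, not part of the patch: a runnable sketch of the C</s>, C</d>, C</c>, and C</r> modifiers discussed in this hunk. It uses C</r> (Perl 5.14 or later) so that each result can be captured without modifying a variable; the variable names are chosen only for this example.

```perl
use strict;
use warnings;

# /s squashes runs of characters that were transliterated to the same
# character; untouched characters are never squashed.
my $squashed = "aaabbbca"   =~ tr/ab/dd/sr;
my $letters  = "bookkeeper" =~ tr/a-zA-Z//sr;
my $padded   = "bookkeeper" =~ tr/oe/o/sr;     # "o" is extended to "oo"

# /d deletes SEARCHLIST characters with no REPLACEMENTLIST counterpart.
my $deleted  = "abcdabcd" =~ tr/abcd/AB/dr;

# /c complements SEARCHLIST; with /s, each run of non-letters becomes
# a single space.
my $spaced   = "hello, world!!" =~ tr/a-zA-Z/ /csr;

# An empty REPLACEMENTLIST only counts; $sky is left unchanged.
my $sky       = "*.*.*";
my $stars     = ( $sky =~ tr/*// );
my $non_stars = ( $sky =~ tr/*//c );

print "$squashed $letters $padded $deleted [$spaced] $stars $non_stars\n";
```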
  
@@ -2559,10 +2788,10 @@ the I<SEARCHLIST> nor the I<REPLACEMENTLIST> are subjected to double quote
  interpolation.  That means that if you want to use variables, you
  must use an C<eval()>:
  
-    eval "tr/$oldlist/$newlist/";
-    die $@ if $@;
+ eval "tr/$oldlist/$newlist/";
+ die $@ if $@;
  
-    eval "tr/$oldlist/$newlist/, 1" or die $@;
+ eval "tr/$oldlist/$newlist/, 1" or die $@;
  
  =item C<< <<I<EOF> >>
  X<here-doc> X<heredoc> X<here-document> X<<< << >>>
@@ -2578,10 +2807,9 @@ want to use L</Indented Here-docs> (see below).
  The terminating string may be either an identifier (a word), or some
  quoted text.  An unquoted identifier works like double quotes.
  There may not be a space between the C<< << >> and the identifier,
-unless the identifier is explicitly quoted.  (If you put a space it
-will be treated as a null identifier, which is valid, and matches the
-first empty line.)  The terminating string must appear by itself
-(unquoted and with no surrounding whitespace) on the terminating line.
+unless the identifier is explicitly quoted.  The terminating string
+must appear by itself (unquoted and with no surrounding whitespace)
+on the terminating line.
  
  If the terminating string is quoted, the type of quotes used determine
  the treatment of the text.
@@ -3091,7 +3319,8 @@ Unlike in B<csh>, no translation is done on the return data--newlines
  remain newlines.  Unlike in any of the shells, single quotes do not
  hide variable names in the command from interpretation.  To pass a
  literal dollar-sign through to the shell you need to hide it with a
-backslash.  The generalized form of backticks is C<qx//>.  (Because
+backslash.  The generalized form of backticks is C<qx//>, or you can
+call the L<perlfunc/readpipe> function.  (Because
  backticks always undergo shell expansion as well, see L<perlsec> for
  security concerns.)
  X<qx> X<`> X<``> X<backtick> X<glob>
@@ -3162,7 +3391,8 @@ way, so use with care.
  C<< <I<FILEHANDLE>> >>  may also be spelled C<readline(*I<FILEHANDLE>)>.
  See L<perlfunc/readline>.
  
-The null filehandle C<< <> >> is special: it can be used to emulate the
+The null filehandle C<< <> >> (sometimes called the diamond operator) is
+special: it can be used to emulate the
  behavior of B<sed> and B<awk>, and any other Unix filter program
  that takes a list of filenames, doing the same to each line
  of input from all of them.  Input from C<< <> >> comes either from
@@ -3203,7 +3433,8 @@ it interprets special characters, so if you have a script like this:
  and call it with S<C<perl dangerous.pl 'rm -rfv *|'>>, it actually opens a
  pipe, executes the C<rm> command and reads C<rm>'s output from that pipe.
  If you want all items in C<@ARGV> to be interpreted as file names, you
-can use the module C<ARGV::readonly> from CPAN, or use the double bracket:
+can use the module C<ARGV::readonly> from CPAN, or use the double
+diamond bracket:
  
      while (<<>>) {
          print;