Add the URL for annotated svn of S03.

diff --git a/pod/perlop.pod b/pod/perlop.pod

index 159cf34..7b0b0d2 100644
--- a/pod/perlop.pod
+++ b/pod/perlop.pod
@@ -53,7 +53,7 @@ values only, not array values.
      nonassoc   list operators (rightward)
      right      not
      left       and
-    left       or xor err
+    left       or xor
  
  In the following sections, these operators are covered in precedence order.
  
@@ -200,7 +200,7 @@ concatenated with the identifier is returned.  Otherwise, if the string
  starts with a plus or minus, a string starting with the opposite sign
  is returned.  One effect of these rules is that -bareword is equivalent
  to the string "-bareword".  If, however, the string begins with a
-non-alphabetic character (exluding "+" or "-"), Perl will attempt to convert
+non-alphabetic character (excluding "+" or "-"), Perl will attempt to convert
  the string to a numeric and the arithmetic negation is performed. If the
  string cannot be cleanly converted to a numeric, Perl will give the warning
  B<Argument "the string" isn't numeric in negation (-) at ...>.
@@ -260,19 +260,29 @@ X<*>
  Binary "/" divides two numbers.
  X</> X<slash>
  
-Binary "%" computes the modulus of two numbers.  Given integer
+Binary "%" is the modulo operator, which computes the division
+remainder of its first argument with respect to its second argument.
+Given integer
  operands C<$a> and C<$b>: If C<$b> is positive, then C<$a % $b> is
-C<$a> minus the largest multiple of C<$b> that is not greater than
+C<$a> minus the largest multiple of C<$b> less than or equal to
  C<$a>.  If C<$b> is negative, then C<$a % $b> is C<$a> minus the
  smallest multiple of C<$b> that is not less than C<$a> (i.e. the
  result will be less than or equal to zero).  If the operands
-C<$a> and C<$b> are floting point values, only the integer portion
-of C<$a> and C<$b> will be used in the operation.
+C<$a> and C<$b> are floating point values and the absolute value of
+C<$b> (that is C<abs($b)>) is less than C<(UV_MAX + 1)>, only
+the integer portion of C<$a> and C<$b> will be used in the operation
+(Note: here C<UV_MAX> means the maximum of the unsigned integer type).
+If the absolute value of the right operand (C<abs($b)>) is greater than
+or equal to C<(UV_MAX + 1)>, "%" computes the floating-point remainder
+C<$r> in the equation C<($r = $a - $i*$b)> where C<$i> is a certain
+integer that makes C<$r> have the same sign as the right operand
+C<$b> (B<not> as the left operand C<$a> like C function C<fmod()>)
+and the absolute value less than that of C<$b>.
  Note that when C<use integer> is in scope, "%" gives you direct access
-to the modulus operator as implemented by your C compiler.  This
+to the modulo operator as implemented by your C compiler.  This
  operator is not as well defined for negative operands, but it will
  execute faster.
-X<%> X<remainder> X<modulus> X<mod>
+X<%> X<remainder> X<modulo> X<mod>
  
  Binary "x" is the repetition operator.  In scalar context or if the left
  operand is not enclosed in parentheses, it returns a string consisting
@@ -438,9 +448,7 @@ argument.
  X<cmp>
  
  Binary "~~" does a smart match between its arguments. Smart matching
-is described in L<perlsyn/"Smart Matching in Detail">.
-This operator is only available if you enable the "~~" feature:
-see L<feature> for more information.
+is described in L<perlsyn/"Smart matching in detail">.
  X<~~>
  
  "lt", "le", "ge", "gt" and "cmp" use the collation (sort) order specified
@@ -514,9 +522,9 @@ for selecting between two aggregates for assignment:
      @a = scalar(@b) || @c;     # really meant this
      @a = @b ? @b : @c;         # this works fine, though
  
-As more readable alternatives to C<&&>, C<//> and C<||> when used for
-control flow, Perl provides C<and>, C<err> and C<or> operators (see below).
-The short-circuit behavior is identical.  The precedence of "and", "err"
+As more readable alternatives to C<&&> and C<||> when used for
+control flow, Perl provides the C<and> and C<or> operators (see below).
+The short-circuit behavior is identical.  The precedence of "and"
  and "or" is much lower, however, so that you can safely use them after a
  list operator without the need for parentheses:
  
@@ -560,7 +568,7 @@ right operand is true, I<AFTER> which the range operator becomes false
  again.  It doesn't become false till the next time the range operator is
  evaluated.  It can test the right operand and become false on the same
  evaluation it became true (as in B<awk>), but it still returns true once.
-If you don't want it to test the right operand till the next
+If you don't want it to test the right operand until the next
  evaluation, as in B<sed>, just use three dots ("...") instead of
  two.  In all other regards, "..." behaves just like ".." does.
  
@@ -801,6 +809,39 @@ between keys and values in hashes, and other paired elements in lists.
          %hash = ( $key => $value );
          login( $username => $password );
  
+=head2 Yada Yada Operators
+X<...> X<... operator> X<!!!> X<!!! operator> X<???> X<??? operator>
+X<yada yada operator>
+
+The yada yada operators are placeholders for code.  They parse without error,
+but when executed either throw an exception or emit a warning.
+
+The C<...> operator takes no arguments.  When executed, it throws an exception
+with the text C<Unimplemented>:
+
+    sub foo { ... }
+    foo();
+
+    Unimplemented at <file> line <line number>.
+
+The C<!!!> operator is similar, but it takes one argument, a string to use as
+the text of the exception:
+
+    sub bar { !!! "Don't call me, Ishmael!" }
+    bar();
+
+    Don't call me, Ishmael! at <file> line <line number>.
+
+The C<???> operator also takes one argument, but it emits a warning instead of
+throwing an exception:
+
+    sub baz { ??? "Who are you?  What do you want?" }
+    baz();
+    say "Why are you here?";
+
+    Who are you?  What do you want? at <file> line <line number>.
+    Why are you here?
+
  =head2 List Operators (Rightward)
  X<operator, list, rightward> X<list operator>
  
@@ -830,9 +871,9 @@ precedence.  This means that it short-circuits: i.e., the right
  expression is evaluated only if the left expression is true.
  
  =head2 Logical or, Defined or, and Exclusive Or
-X<operator, logical, or> X<operator, logical, xor> X<operator, logical, err>
+X<operator, logical, or> X<operator, logical, xor>
  X<operator, logical, defined or> X<operator, logical, exclusive or>
-X<or> X<xor> X<err>
+X<or> X<xor>
  
  Binary "or" returns the logical disjunction of the two surrounding
  expressions.  It's equivalent to || except for the very low precedence.
@@ -857,13 +898,6 @@ takes higher precedence.
  
  Then again, you could always use parentheses.
  
-Binary "err" is equivalent to C<//>--it's just like binary "or", except it
-tests its left argument's definedness instead of its truth.  There are two
-ways to remember "err":  either because many functions return C<undef> on
-an B<err>or, or as a sort of correction:  C<$a = ($b err 'default')>. This
-keyword is only available when the 'err' feature is enabled: see
-L<feature> for more information.
-
  Binary "xor" returns the exclusive-OR of the two surrounding expressions.
  It cannot short circuit, of course.
  
@@ -954,14 +988,22 @@ X<\t> X<\n> X<\r> X<\f> X<\b> X<\a> X<\e> X<\x> X<\0> X<\c> X<\N>
      \b         backspace       (BS)
      \a         alarm (bell)    (BEL)
      \e         escape          (ESC)
-    \033       octal char      (ESC)
-    \x1b       hex char        (ESC)
-    \x{263a}   wide hex char   (SMILEY)
-    \c[                control char    (ESC)
+    \033       octal char      (example: ESC)
+    \x1b       hex char        (example: ESC)
+    \x{263a}   wide hex char   (example: SMILEY)
+    \c[                control char    (example: ESC)
      \N{name}   named Unicode character
  
+The character following C<\c> is mapped to some other character by
+converting letters to upper case and then (on ASCII systems) by inverting
+the 7th bit (0x40). The most interesting range is from '@' to '_'
+(0x40 through 0x5F), resulting in a control character from 0x00
+through 0x1F. A '?' maps to the DEL character. On EBCDIC systems only
+'@', the letters, '[', '\', ']', '^', '_' and '?' will work, resulting
+in 0x00 through 0x1F and 0x7F.
+
  B<NOTE>: Unlike C and other languages, Perl has no \v escape sequence for
-the vertical tab (VT - ASCII 11).
+the vertical tab (VT - ASCII 11), but you may use C<\ck> or C<\x0b>.
  
  The following escape sequences are available in constructs that interpolate
  but not in transliterations.
@@ -1004,8 +1046,9 @@ But method calls such as C<< $obj->meth >> are not.
  
  Interpolating an array or slice interpolates the elements in order,
  separated by the value of C<$">, so is equivalent to interpolating
-C<join $", @array>.    "Punctuation" arrays such as C<@+> are only
-interpolated if the name is enclosed in braces C<@{+}>.
+C<join $", @array>.    "Punctuation" arrays such as C<@*> are only
+interpolated if the name is enclosed in braces C<@{*}>, but special
+arrays C<@_>, C<@+>, and C<@-> are interpolated, even without braces.
  
  You cannot include a literal C<$> or C<@> within a C<\Q> sequence.
  An unescaped C<$> or C<@> interpolates the corresponding variable,
@@ -1032,33 +1075,81 @@ matching and related activities.
  
  =over 8
  
-=item ?PATTERN?
-X<?>
+=item qr/STRING/msixpo
+X<qr> X</i> X</m> X</o> X</s> X</x> X</p>
  
-This is just like the C</pattern/> search, except that it matches only
-once between calls to the reset() operator.  This is a useful
-optimization when you want to see only the first occurrence of
-something in each file of a set of files, for instance.  Only C<??>
-patterns local to the current package are reset.
+This operator quotes (and possibly compiles) its I<STRING> as a regular
+expression.  I<STRING> is interpolated the same way as I<PATTERN>
+in C<m/PATTERN/>.  If "'" is used as the delimiter, no interpolation
+is done.  Returns a Perl value which may be used instead of the
+corresponding C</STRING/msixpo> expression. The returned value is a
+normalized version of the original pattern. It magically differs from
+a string containing the same characters: C<ref(qr/x/)> returns "Regexp",
+even though dereferencing the result returns undef.
  
-    while (<>) {
-       if (?^$?) {
-                           # blank line between header and body
-       }
-    } continue {
-       reset if eof;       # clear ?? status for next file
+For example,
+
+    $rex = qr/my.STRING/is;
+    print $rex;                 # prints (?si-xm:my.STRING)
+    s/$rex/foo/;
+
+is equivalent to
+
+    s/my.STRING/foo/is;
+
+The result may be used as a subpattern in a match:
+
+    $re = qr/$pattern/;
+    $string =~ /foo${re}bar/;  # can be interpolated in other patterns
+    $string =~ $re;            # or used standalone
+    $string =~ /$re/;          # or this way
+
+Since Perl may compile the pattern at the moment of execution of the qr()
+operator, using qr() may have speed advantages in some situations,
+notably if the result of qr() is used standalone:
+
+    sub match {
+       my $patterns = shift;
+       my @compiled = map qr/$_/i, @$patterns;
+       grep {
+           my $success = 0;
+           foreach my $pat (@compiled) {
+               $success = 1, last if /$pat/;
+           }
+           $success;
+       } @_;
      }
  
-This usage is vaguely deprecated, which means it just might possibly
-be removed in some distant future version of Perl, perhaps somewhere
-around the year 2168.
+Precompilation of the pattern into an internal representation at
+the moment of qr() avoids a need to recompile the pattern every
+time a match C</$pat/> is attempted.  (Perl has many other internal
+optimizations, but none would be triggered in the above example if
+we did not use the qr() operator.)
  
-=item m/PATTERN/cgimosx
+Options are:
+
+    m  Treat string as multiple lines.
+    s  Treat string as single line. (Make . match a newline)
+    i  Do case-insensitive pattern matching.
+    x  Use extended regular expressions.
+    p  When matching preserve a copy of the matched string so
+        that ${^PREMATCH}, ${^MATCH}, ${^POSTMATCH} will be defined.
+    o  Compile pattern only once.
+
+If a precompiled pattern is embedded in a larger pattern then the effect
+of 'msixp' will be propagated appropriately.  The effect of the 'o'
+modifier is not propagated, being restricted to those patterns
+explicitly using it.
+
+See L<perlre> for additional information on valid syntax for STRING, and
+for a detailed look at the semantics of regular expressions.
+
+=item m/PATTERN/msixpogc
  X<m> X<operator, match>
  X<regexp, options> X<regexp> X<regex, options> X<regex>
-X</c> X</i> X</m> X</o> X</s> X</x>
+X</m> X</s> X</i> X</x> X</p> X</o> X</g> X</c>
  
-=item /PATTERN/cgimosx
+=item /PATTERN/msixpogc
  
  Searches a string for a pattern match, and in scalar context returns
  true if it succeeds, false if it fails.  If no string is specified
@@ -1069,15 +1160,11 @@ rather tightly.)  See also L<perlre>.  See L<perllocale> for
  discussion of additional considerations that apply when C<use locale>
  is in effect.
  
-Options are:
+Options are as described in C<qr//>; in addition, the following match
+process modifiers are available:
  
-    c  Do not reset search position on a failed match when /g is in effect.
      g  Match globally, i.e., find all occurrences.
-    i  Do case-insensitive pattern matching.
-    m  Treat string as multiple lines.
-    o  Compile pattern only once.
-    s  Treat string as single line.
-    x  Use extended regular expressions.
+    c  Do not reset search position on a failed match when /g is in effect.
  
  If "/" is the delimiter then the initial C<m> is optional.  With the C<m>
  you can use any pair of non-alphanumeric, non-whitespace characters
@@ -1095,7 +1182,9 @@ the trailing delimiter.  This avoids expensive run-time recompilations,
  and is useful when the value you are interpolating won't change over
  the life of the script.  However, mentioning C</o> constitutes a promise
  that you won't change the variables in the pattern.  If you change them,
-Perl won't even notice.  See also L<"qr/STRING/imosx">.
+Perl won't even notice.  See also L<"qr/STRING/msixpo">.
+
+=item The empty pattern //
  
  If the PATTERN evaluates to the empty string, the last
  I<successfully> matched regular expression is used instead. In this
@@ -1113,6 +1202,8 @@ will assume you meant defined-or.  If you meant the empty regex, just
  use parentheses or spaces to disambiguate, or even prefix the empty
  regex with an C<m> (so C<//> becomes C<m//>).
  
+=item Matching in list context
+
  If the C</g> option is not used, C<m//> in list context returns a
  list consisting of the subexpressions matched by the parentheses in the
  pattern, i.e., (C<$1>, C<$2>, C<$3>...).  (Note that here C<$1> etc. are
@@ -1159,6 +1250,8 @@ search position to the beginning of the string, but you can avoid that
  by adding the C</c> modifier (e.g. C<m//gc>).  Modifying the target
  string also resets the search position.
  
+=item \G assertion
+
  You can intermix C<m//g> matches with C<m/\G.../g>, where C<\G> is a
  zero-width assertion that matches the exact position where the previous
  C<m//g>, if any, left off.  Without the C</g> modifier, the C<\G> assertion
@@ -1216,7 +1309,7 @@ doing different actions depending on which regexp matched.  Each
  regexp tries to match where the previous one leaves off.
  
   $_ = <<'EOL';
-      $url = new URI::URL "http://www/";   die if $url eq "xXx";
+      $url = URI::URL->new( "http://www/" );   die if $url eq "xXx";
   EOL
   LOOP:
      {
@@ -1237,6 +1330,139 @@ Here is the output (split into several lines):
   lowercase lowercase line-noise lowercase lowercase line-noise
   MiXeD line-noise. That's all!
  
+=item ?PATTERN?
+X<?>
+
+This is just like the C</pattern/> search, except that it matches only
+once between calls to the reset() operator.  This is a useful
+optimization when you want to see only the first occurrence of
+something in each file of a set of files, for instance.  Only C<??>
+patterns local to the current package are reset.
+
+    while (<>) {
+       if (?^$?) {
+                           # blank line between header and body
+       }
+    } continue {
+       reset if eof;       # clear ?? status for next file
+    }
+
+This usage is vaguely deprecated, which means it just might possibly
+be removed in some distant future version of Perl, perhaps somewhere
+around the year 2168.
+
+=item s/PATTERN/REPLACEMENT/msixpogce
+X<substitute> X<substitution> X<replace> X<regexp, replace>
+X<regexp, substitute> X</m> X</s> X</i> X</x> X</p> X</o> X</g> X</c> X</e>
+
+Searches a string for a pattern, and if found, replaces that pattern
+with the replacement text and returns the number of substitutions
+made.  Otherwise it returns false (specifically, the empty string).
+
+If no string is specified via the C<=~> or C<!~> operator, the C<$_>
+variable is searched and modified.  (The string specified with C<=~> must
+be a scalar variable, an array element, a hash element, or an assignment
+to one of those, i.e., an lvalue.)
+
+If the delimiter chosen is a single quote, no interpolation is
+done on either the PATTERN or the REPLACEMENT.  Otherwise, if the
+PATTERN contains a $ that looks like a variable rather than an
+end-of-string test, the variable will be interpolated into the pattern
+at run-time.  If you want the pattern compiled only once the first time
+the variable is interpolated, use the C</o> option.  If the pattern
+evaluates to the empty string, the last successfully executed regular
+expression is used instead.  See L<perlre> for further explanation on these.
+See L<perllocale> for discussion of additional considerations that apply
+when C<use locale> is in effect.
+
+Options are as with m// with the addition of the following replacement
+specific options:
+
+    e  Evaluate the right side as an expression.
+    ee Evaluate the right side as a string, then eval the result.
+
+Any non-alphanumeric, non-whitespace delimiter may replace the
+slashes.  If single quotes are used, no interpretation is done on the
+replacement string (the C</e> modifier overrides this, however).  Unlike
+Perl 4, Perl 5 treats backticks as normal delimiters; the replacement
+text is not evaluated as a command.  If the
+PATTERN is delimited by bracketing quotes, the REPLACEMENT has its own
+pair of quotes, which may or may not be bracketing quotes, e.g.,
+C<s(foo)(bar)> or C<< s<foo>/bar/ >>.  A C</e> will cause the
+replacement portion to be treated as a full-fledged Perl expression
+and evaluated right then and there.  It is, however, syntax checked at
+compile-time. A second C<e> modifier will cause the replacement portion
+to be C<eval>ed before being run as a Perl expression.
+
+Examples:
+
+    s/\bgreen\b/mauve/g;               # don't change wintergreen
+
+    $path =~ s|/usr/bin|/usr/local/bin|;
+
+    s/Login: $foo/Login: $bar/; # run-time pattern
+
+    ($foo = $bar) =~ s/this/that/;     # copy first, then change
+
+    $count = ($paragraph =~ s/Mister\b/Mr./g);  # get change-count
+
+    $_ = 'abc123xyz';
+    s/\d+/$&*2/e;              # yields 'abc246xyz'
+    s/\d+/sprintf("%5d",$&)/e; # yields 'abc  246xyz'
+    s/\w/$& x 2/eg;            # yields 'aabbcc  224466xxyyzz'
+
+    s/%(.)/$percent{$1}/g;     # change percent escapes; no /e
+    s/%(.)/$percent{$1} || $&/ge;      # expr now, so /e
+    s/^=(\w+)/pod($1)/ge;      # use function call
+
+    # expand variables in $_, but dynamics only, using
+    # symbolic dereferencing
+    s/\$(\w+)/${$1}/g;
+
+    # Add one to the value of any numbers in the string
+    s/(\d+)/1 + $1/eg;
+
+    # This will expand any embedded scalar variable
+    # (including lexicals) in $_ : First $1 is interpolated
+    # to the variable name, and then evaluated
+    s/(\$\w+)/$1/eeg;
+
+    # Delete (most) C comments.
+    $program =~ s {
+       /\*     # Match the opening delimiter.
+       .*?     # Match a minimal number of characters.
+       \*/     # Match the closing delimiter.
+    } []gsx;
+
+    s/^\s*(.*?)\s*$/$1/;       # trim whitespace in $_, expensively
+
+    for ($variable) {          # trim whitespace in $variable, cheap
+       s/^\s+//;
+       s/\s+$//;
+    }
+
+    s/([^ ]*) *([^ ]*)/$2 $1/; # reverse 1st two fields
+
+Note the use of $ instead of \ in the last example.  Unlike
+B<sed>, we use the \<I<digit>> form in only the left hand side.
+Anywhere else it's $<I<digit>>.
+
+Occasionally, you can't use just a C</g> to get all the changes
+to occur that you might want.  Here are two common cases:
+
+    # put commas in the right places in an integer
+    1 while s/(\d)(\d\d\d)(?!\d)/$1,$2/g;
+
+    # expand tabs to 8-column spacing
+    1 while s/\t+/' ' x (length($&)*8 - length($`)%8)/e;
+
+=back
+
+=head2 Quote-Like Operators
+X<operator, quote-like>
+
+=over 4
+
  =item q/STRING/
  X<q> X<quote, single> X<'> X<''>
  
@@ -1262,64 +1488,6 @@ A double-quoted, interpolated string.
                 if /\b(tcl|java|python)\b/i;      # :-)
      $baz = "\n";               # a one-character string
  
-=item qr/STRING/imosx
-X<qr> X</i> X</m> X</o> X</s> X</x>
-
-This operator quotes (and possibly compiles) its I<STRING> as a regular
-expression.  I<STRING> is interpolated the same way as I<PATTERN>
-in C<m/PATTERN/>.  If "'" is used as the delimiter, no interpolation
-is done.  Returns a Perl value which may be used instead of the
-corresponding C</STRING/imosx> expression.
-
-For example,
-
-    $rex = qr/my.STRING/is;
-    s/$rex/foo/;
-
-is equivalent to
-
-    s/my.STRING/foo/is;
-
-The result may be used as a subpattern in a match:
-
-    $re = qr/$pattern/;
-    $string =~ /foo${re}bar/;  # can be interpolated in other patterns
-    $string =~ $re;            # or used standalone
-    $string =~ /$re/;          # or this way
-
-Since Perl may compile the pattern at the moment of execution of qr()
-operator, using qr() may have speed advantages in some situations,
-notably if the result of qr() is used standalone:
-
-    sub match {
-       my $patterns = shift;
-       my @compiled = map qr/$_/i, @$patterns;
-       grep {
-           my $success = 0;
-           foreach my $pat (@compiled) {
-               $success = 1, last if /$pat/;
-           }
-           $success;
-       } @_;
-    }
-
-Precompilation of the pattern into an internal representation at
-the moment of qr() avoids a need to recompile the pattern every
-time a match C</$pat/> is attempted.  (Perl has many other internal
-optimizations, but none would be triggered in the above example if
-we did not use qr() operator.)
-
-Options are:
-
-    i  Do case-insensitive pattern matching.
-    m  Treat string as multiple lines.
-    o  Compile pattern only once.
-    s  Treat string as single line.
-    x  Use extended regular expressions.
-
-See L<perlre> for additional information on valid syntax for STRING, and
-for a detailed look at the semantics of regular expressions.
-
  =item qx/STRING/
  X<qx> X<`> X<``> X<backtick>
  
@@ -1440,114 +1608,6 @@ put comments into a multi-line C<qw>-string.  For this reason, the
  C<use warnings> pragma and the B<-w> switch (that is, the C<$^W> variable)
  produces warnings if the STRING contains the "," or the "#" character.
  
-=item s/PATTERN/REPLACEMENT/egimosx
-X<substitute> X<substitution> X<replace> X<regexp, replace>
-X<regexp, substitute> X</e> X</g> X</i> X</m> X</o> X</s> X</x>
-
-Searches a string for a pattern, and if found, replaces that pattern
-with the replacement text and returns the number of substitutions
-made.  Otherwise it returns false (specifically, the empty string).
-
-If no string is specified via the C<=~> or C<!~> operator, the C<$_>
-variable is searched and modified.  (The string specified with C<=~> must
-be scalar variable, an array element, a hash element, or an assignment
-to one of those, i.e., an lvalue.)
-
-If the delimiter chosen is a single quote, no interpolation is
-done on either the PATTERN or the REPLACEMENT.  Otherwise, if the
-PATTERN contains a $ that looks like a variable rather than an
-end-of-string test, the variable will be interpolated into the pattern
-at run-time.  If you want the pattern compiled only once the first time
-the variable is interpolated, use the C</o> option.  If the pattern
-evaluates to the empty string, the last successfully executed regular
-expression is used instead.  See L<perlre> for further explanation on these.
-See L<perllocale> for discussion of additional considerations that apply
-when C<use locale> is in effect.
-
-Options are:
-
-    e  Evaluate the right side as an expression.
-    g  Replace globally, i.e., all occurrences.
-    i  Do case-insensitive pattern matching.
-    m  Treat string as multiple lines.
-    o  Compile pattern only once.
-    s  Treat string as single line.
-    x  Use extended regular expressions.
-
-Any non-alphanumeric, non-whitespace delimiter may replace the
-slashes.  If single quotes are used, no interpretation is done on the
-replacement string (the C</e> modifier overrides this, however).  Unlike
-Perl 4, Perl 5 treats backticks as normal delimiters; the replacement
-text is not evaluated as a command.  If the
-PATTERN is delimited by bracketing quotes, the REPLACEMENT has its own
-pair of quotes, which may or may not be bracketing quotes, e.g.,
-C<s(foo)(bar)> or C<< s<foo>/bar/ >>.  A C</e> will cause the
-replacement portion to be treated as a full-fledged Perl expression
-and evaluated right then and there.  It is, however, syntax checked at
-compile-time. A second C<e> modifier will cause the replacement portion
-to be C<eval>ed before being run as a Perl expression.
-
-Examples:
-
-    s/\bgreen\b/mauve/g;               # don't change wintergreen
-
-    $path =~ s|/usr/bin|/usr/local/bin|;
-
-    s/Login: $foo/Login: $bar/; # run-time pattern
-
-    ($foo = $bar) =~ s/this/that/;     # copy first, then change
-
-    $count = ($paragraph =~ s/Mister\b/Mr./g);  # get change-count
-
-    $_ = 'abc123xyz';
-    s/\d+/$&*2/e;              # yields 'abc246xyz'
-    s/\d+/sprintf("%5d",$&)/e; # yields 'abc  246xyz'
-    s/\w/$& x 2/eg;            # yields 'aabbcc  224466xxyyzz'
-
-    s/%(.)/$percent{$1}/g;     # change percent escapes; no /e
-    s/%(.)/$percent{$1} || $&/ge;      # expr now, so /e
-    s/^=(\w+)/&pod($1)/ge;     # use function call
-
-    # expand variables in $_, but dynamics only, using
-    # symbolic dereferencing
-    s/\$(\w+)/${$1}/g;
-
-    # Add one to the value of any numbers in the string
-    s/(\d+)/1 + $1/eg;
-
-    # This will expand any embedded scalar variable
-    # (including lexicals) in $_ : First $1 is interpolated
-    # to the variable name, and then evaluated
-    s/(\$\w+)/$1/eeg;
-
-    # Delete (most) C comments.
-    $program =~ s {
-       /\*     # Match the opening delimiter.
-       .*?     # Match a minimal number of characters.
-       \*/     # Match the closing delimiter.
-    } []gsx;
-
-    s/^\s*(.*?)\s*$/$1/;       # trim whitespace in $_, expensively
-
-    for ($variable) {          # trim whitespace in $variable, cheap
-       s/^\s+//;
-       s/\s+$//;
-    }
-
-    s/([^ ]*) *([^ ]*)/$2 $1/; # reverse 1st two fields
-
-Note the use of $ instead of \ in the last example.  Unlike
-B<sed>, we use the \<I<digit>> form in only the left hand side.
-Anywhere else it's $<I<digit>>.
-
-Occasionally, you can't use just a C</g> to get all the changes
-to occur that you might want.  Here are two common cases:
-
-    # put commas in the right places in an integer
-    1 while s/(\d)(\d\d\d)(?!\d)/$1,$2/g;
-
-    # expand tabs to 8-column spacing
-    1 while s/\t+/' ' x (length($&)*8 - length($`)%8)/e;
  
  =item tr/SEARCHLIST/REPLACEMENTLIST/cds
  X<tr> X<y> X<transliterate> X</c> X</d> X</s>
@@ -1569,7 +1629,7 @@ its own pair of quotes, which may or may not be bracketing quotes,
  e.g., C<tr[A-Z][a-z]> or C<tr(+\-*/)/ABCD/>.
  
  Note that C<tr> does B<not> do regular expression character classes
-such as C<\d> or C<[:lower:]>.  The <tr> operator is not equivalent to
+such as C<\d> or C<[:lower:]>.  The C<tr> operator is not equivalent to
  the tr(1) utility.  If you want to map strings between lower/upper
  cases, see L<perlfunc/lc> and L<perlfunc/uc>, and in general consider
  using the C<s> operator if you need regular expressions.
@@ -1798,28 +1858,56 @@ expectations much less frequently than this first one.
  Some passes discussed below are performed concurrently, but because
  their results are the same, we consider them individually.  For different
  quoting constructs, Perl performs different numbers of passes, from
-one to five, but these passes are always performed in the same order.
+one to four, but these passes are always performed in the same order.
  
  =over 4
  
  =item Finding the end
  
-The first pass is finding the end of the quoted construct, whether
-it be a multicharacter delimiter C<"EOF\n"> in the C<<<EOF>
-construct, a C</> that terminates a C<qq//> construct, a C<]> which
-terminates C<qq[]> construct, or a C<< > >> which terminates a
-fileglob started with C<< < >>.
-
-When searching for single-character non-pairing delimiters, such
-as C</>, combinations of C<\\> and C<\/> are skipped.  However,
-when searching for single-character pairing delimiter like C<[>,
-combinations of C<\\>, C<\]>, and C<\[> are all skipped, and nested
-C<[>, C<]> are skipped as well.  When searching for multicharacter
-delimiters like C<"EOF\n">, nothing is skipped, though the delimiter
-must start from the first column of the terminating line.
+The first pass is finding the end of the quoted construct, where
+the information about the delimiters is used in parsing.
+During this search, text between the starting and ending delimiters
+is copied to a safe location. The copied text is delimiter-independent.
+
+If the construct is a here-doc, the ending delimiter is a line
+that has the terminating string as its content. Therefore C<<<EOF> is
+terminated by C<EOF> immediately followed by C<"\n"> and starting
+from the first column of the terminating line.
+When searching for the terminating line of a here-doc, nothing
+is skipped. In other words, lines after the here-doc syntax
+are compared with the terminating string line by line.
+
+For constructs other than here-docs, single characters are used as starting
+and ending delimiters. If the starting delimiter is an opening punctuation
+(that is C<(>, C<[>, C<{>, or C<< < >>), the ending delimiter is the
+corresponding closing punctuation (that is C<)>, C<]>, C<}>, or C<< > >>).
+If the starting delimiter is an unpaired character like C</> or a closing
+punctuation, the ending delimiter is the same as the starting delimiter.
+Therefore a C</> terminates a C<qq//> construct, while a C<]> terminates
+C<qq[]> and C<qq]]> constructs.
+
+When searching for single-character delimiters, escaped delimiters
+and C<\\> are skipped. For example, while searching for terminating C</>,
+combinations of C<\\> and C<\/> are skipped.  If the delimiters are
+bracketing, nested pairs are also skipped.  For example, while searching
+for closing C<]> paired with the opening C<[>, combinations of C<\\>, C<\]>,
+and C<\[> are all skipped, and nested C<[> and C<]> are skipped as well.
+However, when backslashes are used as the delimiters (like C<qq\\> and
+C<tr\\\>), nothing is skipped.
+During the search for the end, backslashes that escape delimiters
+are removed (strictly speaking, they are not copied to the safe location).
  
  For constructs with three-part delimiters (C<s///>, C<y///>, and
  C<tr///>), the search is repeated once more.
+If the first delimiter is not an opening punctuation, all three delimiters must
+be the same, such as C<s!!!> and C<tr)))>, in which case the second delimiter
+terminates the left part and starts the right part at once.
+If the left part is delimited by bracketing punctuations (that is C<()>,
+C<[]>, C<{}>, or C<< <> >>), the right part needs another pair of
+delimiters such as C<s(){}> and C<tr[]//>.  In these cases, whitespaces
+and comments are allowed between both parts, though the comment must follow
+at least one whitespace; otherwise a character expected as the start of
+the comment may be regarded as the starting delimiter of the right part.
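
A minimal sketch of the three-part forms just described, assuming nothing beyond core Perl; the sample string and variable names are made up for illustration:

```perl
use strict;
use warnings;

my $text = "hello world";

# Same delimiter three times: the middle "!" ends the pattern
# and starts the replacement.
(my $same = $text) =~ s!world!perl!;

# Bracketed pattern, then a separate pair for the replacement.
# Whitespace and a comment may sit between the two parts, as long
# as whitespace comes before the "#".
(my $split = $text) =~ s{hello}   # comment between the parts
                        {goodbye};

# Bracketed left part followed by an unpaired delimiter on the right.
(my $upper = $text) =~ tr[a-z]/A-Z/;

print "$same\n$split\n$upper\n";
```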
  
  During this search no attention is paid to the semantics of the construct.
  Thus:
@@ -1843,18 +1931,6 @@ this search. Thus the second C<\> in C<qq/\c\/> is interpreted as a part
  of C<\/>, and the following C</> is not recognized as a delimiter.
  Instead, use C<\034> or C<\x1c> at the end of quoted constructs.
  
-=item Removal of backslashes before delimiters
-
-During the second pass, text between the starting and ending
-delimiters is copied to a safe location, and the C<\> is removed
-from combinations consisting of C<\> and delimiter--or delimiters,
-meaning both starting and ending delimiters will be handled,
-should these differ. This removal does not happen for multi-character
-delimiters. Note that the combination C<\\> is left intact.
-
-Starting from this step no information about the delimiters is
-used in parsing.
-
  =item Interpolation
  X<interpolation>
  
@@ -1866,21 +1942,28 @@ delimiter-independent.  There are multiple cases.
  =item C<<<'EOF'>
  
  No interpolation is performed.
+Note that the combination C<\\> is left intact, since escaped delimiters
+are not available for here-docs.
  
-=item  C<m''>, C<s'''>
+=item  C<m''>, the pattern of C<s'''>
  
-No interpolation is performed at this stage, see
-L</"Interpolation of regular expressions"> for comments on later
-processing of their contents.
+No interpolation is performed at this stage.
+Any backslashed sequences including C<\\> are handled at the stage
+of L</"parsing regular expressions">.
  
-=item C<''>, C<q//>
+=item C<''>, C<q//>, C<tr'''>, C<y'''>, the replacement of C<s'''>
  
  The only interpolation is removal of C<\> from pairs of C<\\>.
+Therefore C<-> in C<tr'''> and C<y'''> is treated literally
+as a hyphen and no character range is available.
+C<\1> in the replacement of C<s'''> does not work as C<$1>.
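
The following sketch contrasts single-quote delimiters with ordinary ones. It is only an illustration of the behavior stated above, with arbitrary sample strings, and is not taken from the patch:

```perl
use strict;
use warnings;

# With single-quote delimiters, "-" is an ordinary character, so this
# maps "a" to "x", "-" to "y", and "c" to "z"; "b" is left alone.
(my $literal = "a-c b") =~ tr'a-c'xyz';

# With ordinary delimiters, a-c is the range a, b, c.
(my $range = "a-c b") =~ tr/a-c/xyz/;

# In the replacement of s''', \1 is not turned into $1; it stays
# a backslash followed by a digit.
(my $single = "abc") =~ s'(b)'[\1]';

print "$literal\n$range\n$single\n";
```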
  
  =item C<tr///>, C<y///>
  
-No variable interpolation occurs. Escape sequences such as \200
-and the common escapes such as \t for tab are converted to literals.
+No variable interpolation occurs.  String-modifying combinations for
+case and quoting such as C<\Q>, C<\U>, and C<\E> are not recognized.
+The other escape sequences such as C<\200> and C<\t> and backslashed
+characters such as C<\\> and C<\-> are converted to appropriate literals.
  The character C<-> is treated specially and therefore C<\-> is treated
  as a literal C<->.
  
@@ -1889,7 +1972,9 @@ as a literal C<->.
  C<\Q>, C<\U>, C<\u>, C<\L>, C<\l> (possibly paired with C<\E>) are
  converted to corresponding Perl constructs.  Thus, C<"$foo\Qbaz$bar">
  is converted to C<$foo . (quotemeta("baz" . $bar))> internally.
-The other combinations are replaced with appropriate expansions.
+The other escape sequences such as C<\200> and C<\t> and backslashed
+characters such as C<\\> and C<\-> are replaced with appropriate
+expansions.
  
  Let it be stressed that I<whatever falls between C<\Q> and C<\E>>
  is interpolated in the usual way.  Something like C<"\Q\\E"> has
@@ -1933,32 +2018,44 @@ brackets.  because the outcome may be determined by voting based
  on heuristic estimators, the result is not strictly predictable.
  Fortunately, it's usually correct for ambiguous cases.
  
-=item C<?RE?>, C</RE/>, C<m/RE/>, C<s/RE/foo/>,
+=item the replacement of C<s///>
  
  Processing of C<\Q>, C<\U>, C<\u>, C<\L>, C<\l>, and interpolation
-happens (almost) as with C<qq//> constructs, but the substitution
-of C<\> followed by RE-special chars (including C<\>) is not
-performed.  Moreover, inside C<(?{BLOCK})>, C<(?# comment )>, and
+happens as with C<qq//> constructs.
+
+It is at this step that C<\1> is begrudgingly converted to C<$1> in
+the replacement text of C<s///>, in order to correct the incorrigible
+I<sed> hackers who haven't picked up the saner idiom yet.  A warning
+is emitted if the C<use warnings> pragma or the B<-w> command-line flag
+(that is, the C<$^W> variable) was set.
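
A small sketch of the C<\1> conversion, assuming the warning belongs to the C<syntax> category (silenced here so the example runs quietly); the variable names are illustrative only:

```perl
use strict;
use warnings;

# The recommended idiom: refer to the capture group as $1.
(my $modern = "abc") =~ s/(b)/[$1]/;

# The sed-style spelling is converted to $1 in the replacement text.
my $sed = "abc";
{
    # Assumption: this silences the "\1 better written as $1" warning.
    no warnings 'syntax';
    $sed =~ s/(b)/[\1]/;
}

print "$modern\n$sed\n";
```

Both substitutions produce the same string; only the first is free of the warning under C<use warnings>.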
+
+=item C<RE> in C<?RE?>, C</RE/>, C<m/RE/>, C<s/RE/foo/>,
+
+Processing of C<\Q>, C<\U>, C<\u>, C<\L>, C<\l>, C<\E>,
+and interpolation happens (almost) as with C<qq//> constructs.
+
+However, any other combination of C<\> followed by a character
+is not substituted but only skipped, in order to parse it
+as part of the regular expression at the following step.
+As C<\c> is skipped at this step, C<@> of C<\c@> in RE is possibly
+treated as an array symbol (for example C<@foo>),
+even though the same text in C<qq//> gives interpolation of C<\c@>.
+
+Moreover, inside C<(?{BLOCK})>, C<(?# comment )>, and
  a C<#>-comment in a C<//x>-regular expression, no processing is
  performed whatsoever.  This is the first step at which the presence
  of the C<//x> modifier is relevant.
  
-Interpolation has several quirks: C<$|>, C<$(>, and C<$)> are not
-interpolated, and constructs C<$var[SOMETHING]> are voted (by several
-different estimators) to be either an array element or C<$var>
-followed by an RE alternative.  This is where the notation
+Interpolation in patterns has several quirks: C<$|>, C<$(>, C<$)>, C<@+>
+and C<@-> are not interpolated, and constructs C<$var[SOMETHING]> are
+voted (by several different estimators) to be either an array element
+or C<$var> followed by an RE alternative.  This is where the notation
  C<${arr[$bar]}> comes handy: C</${arr[0-9]}/> is interpreted as
  array element C<-9>, not as a regular expression from the variable
  C<$arr> followed by a digit, which would be the interpretation of
  C</$arr[0-9]/>.  Since voting among different estimators may occur,
  the result is not predictable.
  
-It is at this step that C<\1> is begrudgingly converted to C<$1> in
-the replacement text of C<s///> to correct the incorrigible
-I<sed> hackers who haven't picked up the saner idiom yet.  A warning
-is emitted if the C<use warnings> pragma or the B<-w> command-line flag
-(that is, the C<$^W> variable) was set.
-
  The lack of processing of C<\\> creates specific restrictions on
  the post-processed text.  If the delimiter is C</>, one cannot get
  the combination C<\/> into the result of this step.  C</> will
@@ -1972,7 +2069,7 @@ alphanumeric char, as in:
    m m ^ a \s* b mmx;
  
  In the RE above, which is intentionally obfuscated for illustration, the
-delimiter is C<m>, the modifier is C<mx>, and after backslash-removal the
+delimiter is C<m>, the modifier is C<mx>, and after delimiter-removal the
  RE is the same as for C<m/ ^ a \s* b /mx>.  There's more than one
  reason you're encouraged to restrict your delimiters to non-alphanumeric,
  non-whitespace choices.
@@ -1982,13 +2079,13 @@ non-whitespace choices.
  This step is the last one for all constructs except regular expressions,
  which are processed further.
  
-=item Interpolation of regular expressions
-X<regexp, interpolation>
+=item parsing regular expressions
+X<regexp, parse>
  
  Previous steps were performed during the compilation of Perl code,
  but this one happens at run time--although it may be optimized to
  be calculated at compile time if appropriate.  After preprocessing
-described above, and possibly after evaluation if catenation,
+described above, and possibly after evaluation if concatenation,
  joining, casing translation, or metaquoting are involved, the
  resulting I<string> is passed to the RE engine for compilation.