pattern, substitution, or transliteration. The left argument is what is
supposed to be searched, substituted, or transliterated instead of the default
$_. When used in scalar context, the return value generally indicates the
-success of the operation. Not always though: the non-destructive substitution
-option (C</r>) causes the return value to be the result of the substition, for
-example. Behavior in list context depends on the particular operator. See
-L</"Regexp Quote-Like Operators"> for details and L<perlretut> for examples
-using these operators.
+success of the operation. The exception is substitution with the C</r>
+(non-destructive) option, which causes the return value to be the result of
+the substitution. Behavior in list context depends on the particular operator.
+See L</"Regexp Quote-Like Operators"> for details and L<perlretut> for
+examples using these operators.
If the right argument is an expression rather than a search pattern,
substitution, or transliteration, it is interpreted as a search pattern at run
Binary "!~" is just like "=~" except the return value is negated in
the logical sense.
-Binary "!~" is not permitted to bind to a non-destructive substitute (s///r).
+Binary "!~" with a non-destructive substitution (s///r) is a syntax error.
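As a rough illustration of the return values described above, the following sketch exercises C<=~> and C<!~> in scalar and list context. The variable names and the sample string are made up for the example, and C</r> assumes Perl 5.14 or later.

```perl
use 5.014;      # s///r needs Perl 5.14 or later
use strict;
use warnings;

my $text = "key=value";    # hypothetical sample data

# Scalar context: =~ reports whether the match succeeded.
my $matched = $text =~ /=/;

# !~ is the logical negation of =~.
my $not_matched = $text !~ /=/;

# List context: a match with capture groups returns the captures.
my ($k, $v) = $text =~ /^(\w+)=(\w+)$/;

# Without /r, s/// modifies its target and returns the number of
# substitutions made.
my $copy  = $text;
my $count = ($copy =~ s/e/E/g);

# With /r, the return value is the resulting string and $text is untouched.
my $upper = $text =~ s/e/E/gr;

say "matched"               if $matched;
say "negated match is false" unless $not_matched;
say "$k -> $v";
say "$count substitutions: $copy";
say "with /r: $upper (original: $text)";
```

Writing C<$text !~ s/e/E/r> instead would not compile, as noted above.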
=head2 Multiplicative Operators
X<operator, multiplicative>
The following escape sequences are available in constructs that interpolate
and in transliterations.
X<\t> X<\n> X<\r> X<\f> X<\b> X<\a> X<\e> X<\x> X<\0> X<\c> X<\N> X<\N{}>
-
- Sequence Note Description
- \t tab (HT, TAB)
- \n newline (NL)
- \r return (CR)
- \f form feed (FF)
- \b backspace (BS)
- \a alarm (bell) (BEL)
- \e escape (ESC)
- \033 octal char (example: ESC)
- \x1b hex char (example: ESC)
- \x{263a} wide hex char (example: SMILEY)
- \c[ [1] control char (example: chr(27))
- \N{name} [2] named Unicode character
- \N{U+263D} [3] Unicode character (example: FIRST QUARTER MOON)
+X<\o{}>
+
+    Sequence     Note   Description
+    \t                  tab               (HT, TAB)
+    \n                  newline           (NL)
+    \r                  return            (CR)
+    \f                  form feed         (FF)
+    \b                  backspace         (BS)
+    \a                  alarm (bell)      (BEL)
+    \e                  escape            (ESC)
+    \x{263a}     [1,8]  hex char          (example: SMILEY)
+    \x1b         [2,8]  restricted range hex char (example: ESC)
+    \N{name}     [3]    named Unicode character
+    \N{U+263D}   [4,8]  Unicode character (example: FIRST QUARTER MOON)
+    \c[          [5]    control char      (example: chr(27))
+    \o{23072}    [6,8]  octal char        (example: SMILEY)
+    \033         [7,8]  restricted range octal char (example: ESC)
=over 4
=item [1]
+The result is the character specified by the hexadecimal number between
+the braces. See L</[8]> below for details on which character.
+
+Only hexadecimal digits are valid between the braces. If an invalid
+character is encountered, a warning will be issued and the invalid
+character and all subsequent characters (valid or invalid) within the
+braces will be discarded.
+
+If there are no valid digits between the braces, the generated character is
+the NULL character (C<\x{00}>). However, an explicit empty brace (C<\x{}>)
+will not cause a warning.
+
+=item [2]
+
+The result is the character specified by the hexadecimal number in the range
+0x00 to 0xFF. See L</[8]> below for details on which character.
+
+Only hexadecimal digits are valid following C<\x>. When C<\x> is followed
+by fewer than two valid digits, any valid digits will be zero-padded. This
+means that C<\x7> will be interpreted as C<\x07> and C<\x> alone will be
+interpreted as C<\x00>. Except at the end of a string, having fewer than
+two valid digits will result in a warning. Note that while the warning
+says the illegal character is ignored, it is only ignored as part of the
+escape and will still be used as the subsequent character in the string.
+For example:
+
+    Original    Result    Warns?
+    "\x7"       "\x07"    no
+    "\x"        "\x00"    no
+    "\x7q"      "\x07q"   yes
+    "\xq"       "\x00q"   yes
+
+=item [3]
+
+The result is the Unicode character given by I<name>.
+See L<charnames>.
+
+=item [4]
+
+C<\N{U+I<hexadecimal number>}> means the Unicode character whose Unicode code
+point is I<hexadecimal number>.
+
+=item [5]
+
The character following C<\c> is mapped to some other character as shown in the
table:
To get platform independent controls, you can use C<\N{...}>.
-=item [2]
-
-For documentation of C<\N{name}>, see L<charnames>.
-
-=item [3]
-
-C<\N{U+I<wide hex char>}> means the Unicode character whose Unicode ordinal
-number is I<wide hex char>.
+=item [6]
+
+The result is the character specified by the octal number between the braces.
+See L</[8]> below for details on which character.
+
+If a character that isn't an octal digit is encountered, a warning is raised,
+and the value is based on the octal digits before it, discarding it and all
+following characters up to the closing brace. It is a fatal error if there are
+no octal digits at all.
+
+=item [7]
+
+The result is the character specified by the three digit octal number in the
+range 000 to 777 (but best to not use above 077, see next paragraph). See
+L</[8]> below for details on which character.
+
+Some contexts allow 2 or even 1 digit, but any usage without exactly
+three digits, the first being a zero, may give unintended results. (For
+example, see L<perlrebackslash/Octal escapes>.) Starting in Perl 5.14, you may
+use C<\o{}> instead which avoids all these problems. Otherwise, it is best to
+use this construct only for ordinals C<\077> and below, remembering to pad to
+the left with zeros to make three digits. For larger ordinals, either use
+C<\o{}>, or convert to something else, such as to hex and use C<\x{}>
+instead.
+
+A backslash followed by a non-octal digit in a bracketed character class
+(C<[\8]> or C<[\9]>) will be interpreted as a NULL character and the digit.
+
+Having fewer than 3 digits may lead to a misleading warning message that says
+that what follows is ignored. For example, C<"\128"> in the ASCII character set
+is equivalent to the two characters C<"\n8">, but the warning C<Illegal octal
+digit '8' ignored> will be thrown. To avoid this warning, make sure to pad
+your octal number with C<0>s: C<"\0128">.
+
+=item [8]
+
+Several of the constructs above specify a character by a number. That number
+gives the character's position in the character set encoding (indexed from 0).
+(This is called synonymously its ordinal, code position, or code point.) Perl
+works on platforms that have a native encoding currently of either ASCII/Latin1
+or EBCDIC, each of which allows specification of 256 characters. In general, if
+the number is 255 (0xFF, 0377) or below, Perl interprets this in the platform's
+native encoding. If the number is 256 (0x100, 0400) or above, Perl interprets
+it as a Unicode code point and the result is the corresponding Unicode
+character. For example, C<\x{50}> and C<\o{120}> both are the number 80 in
+decimal, which is less than 256, so the number is interpreted in the native
+character set encoding. In ASCII the character in the 80th position (indexed
+from 0) is the letter "P", and in EBCDIC it is the ampersand symbol "&".
+C<\x{100}> and C<\o{400}> are both 256 in decimal, so the number is interpreted
+as a Unicode code point no matter what the native encoding is. The name of the
+character in the 256th position (indexed from 0) in Unicode is
+C<LATIN CAPITAL LETTER A WITH MACRON>.
+
+There are a couple of exceptions to the above rule. C<\N{U+I<hex number>}> is
+always interpreted as a Unicode code point, so that C<\N{U+0050}> is "P" even
+on EBCDIC platforms. And if L<C<S<use encoding>>|encoding> is in effect, the
+number is considered to be in that encoding, and is translated from that into
+the platform's native encoding if there is a corresponding native character;
+otherwise to Unicode.
=back
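The numeric escapes in the notes above can be compared side by side. This is only a sketch: it assumes an ASCII/Latin1 platform (on EBCDIC the native values differ, as note [8] explains), it assumes no C<use encoding> is in effect, and C<\o{}> requires Perl 5.14 or later.

```perl
use 5.014;      # \o{} needs Perl 5.14 or later
use strict;
use warnings;

# Three spellings of ESC: restricted hex, three-digit octal, control char.
# On an ASCII platform all three are chr(27).
my $esc_hex   = "\x1b";
my $esc_octal = "\033";
my $esc_ctrl  = "\c[";
say "ESC forms agree"
    if $esc_hex eq $esc_octal && $esc_octal eq $esc_ctrl;

# Numbers below 256 are taken in the native encoding: 0x50 == 0120 == 80.
my $p_hex = "\x{50}";
my $p_oct = "\o{120}";
say "ordinals: ", ord($p_hex), " and ", ord($p_oct);

# \N{U+...} is always a Unicode code point, whatever the platform.
say "U+0050 is P" if "\N{U+0050}" eq "P";

# Numbers of 256 and above are Unicode code points on every platform.
my $a_macron = "\x{100}";
printf "U+%04X has ordinal %d\n", ord($a_macron), ord("\o{400}");
```

The braced forms C<\x{}> and C<\o{}> sidestep the digit-count pitfalls described in notes [2] and [7], which is why the sketch uses them for everything except the ESC comparison.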
If the C</r> (non-destructive) option is used then it will perform the
substitution on a copy of the string and return the copy whether or not a
substitution occurred. The original string will always remain unchanged in
-this case. The copy will always be a plain string, even If the input is an
+this case. The copy will always be a plain string, even if the input is an
object or a tied variable.
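A minimal sketch of that behavior, using made-up sample strings and assuming Perl 5.14 or later for C</r>:

```perl
use 5.014;      # s///r needs Perl 5.14 or later
use strict;
use warnings;

my $original = "hello world";    # hypothetical sample data

# A successful substitution returns the modified copy.
my $changed = $original =~ s/world/Perl/r;

# An unsuccessful one still returns a copy, identical to the input.
my $unchanged = $original =~ s/xyz/abc/r;

say $original;     # the input is left alone in both cases
say $changed;
say $unchanged;
```

Because the result is returned rather than stored in the target, C</r> substitutions can be used directly inside larger expressions, such as the block of a C<map>.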
If no string is specified via the C<=~> or C<!~> operator, the C<$_>