+New in v5.22, L<C<use re 'strict'>|re/'strict' mode> applies stricter
+rules than otherwise when compiling regular expression patterns. It can
+find things that, while legal, may not be what you intended.
+
+=head2 The Basics
+X<regular expression, version 8> X<regex, version 8> X<regexp, version 8>
+
+Regular expressions are strings with the very particular syntax and
+meaning described in this document and auxiliary documents referred to
+by this one. The strings are called "patterns". Patterns are used to
+determine if some other string, called the "target", has (or doesn't
+have) the characteristics specified by the pattern. We call this
+"matching" the target string against the pattern. Usually the match is
+done by having the target be the first operand, and the pattern be the
+second operand, of one of the two binary operators C<=~> and C<!~>,
+listed in L<perlop/Binding Operators>; and the pattern will have been
+converted from an ordinary string by one of the operators in
+L<perlop/"Regexp Quote-Like Operators">, like so:
+
+ $foo =~ m/abc/
+
+This evaluates to true if and only if the string in the variable C<$foo>
+contains, somewhere in it, the sequence of characters "a", "b", then "c".
+(The C<=~ m>, or match operator, is described in
+L<perlop/m/PATTERN/msixpodualngc>.)
+
+Patterns that aren't already stored in some variable must be delimited,
+at both ends, by delimiter characters. These are often, as in the
+example above, forward slashes, and the typical way a pattern is written
+in documentation is with those slashes. In most cases, the delimiter
+is the same character, fore and aft, but there are a few cases where a
+character looks like it has a mirror-image mate, where the opening
+version is the beginning delimiter, and the closing one is the ending
+delimiter, like
+
+ $foo =~ m<abc>
+
+Most times, the pattern is evaluated in double-quotish context, but it
+is possible to choose delimiters to force single-quotish, like
+
+ $foo =~ m'abc'
+
+If the pattern contains its delimiter within it, that delimiter must be
+escaped. Prefixing it with a backslash (I<e.g.>, C<"/foo\/bar/">)
+serves this purpose.
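+
+As a rough sketch of the delimiter rules above, the following script
+matches one target with three delimiter styles, then matches a pattern
+that contains a slash, first with an escape and then with a different
+delimiter. The variable names and sample strings are illustrative only.
+

```perl
use strict;
use warnings;

my $foo = "xxabcxx";

# The same pattern written with three different delimiter styles.
my $slashes  = $foo =~ m/abc/ ? 1 : 0;
my $brackets = $foo =~ m<abc> ? 1 : 0;
my $single   = $foo =~ m'abc' ? 1 : 0;

# A slash inside a slash-delimited pattern must be escaped ...
my $path    = "/usr/foo/bar";
my $escaped = $path =~ /foo\/bar/ ? 1 : 0;

# ... or another delimiter can be chosen so that no escape is needed.
my $alt_delim = $path =~ m{foo/bar} ? 1 : 0;

print "$slashes $brackets $single $escaped $alt_delim\n";   # 1 1 1 1 1
```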
+
+Any single character in a pattern matches that same character in the
+target string, unless the character is a I<metacharacter> with a special
+meaning described in this document. A sequence of non-metacharacters
+matches the same sequence in the target string, as we saw above with
+C<m/abc/>.
+
+Only a few characters (all of them being ASCII punctuation characters)
+are metacharacters. The most commonly used one is a dot C<".">, which
+normally matches almost any character (including a dot itself).
+
+You can cause characters that normally function as metacharacters to be
+interpreted literally by prefixing them with a C<"\">, just as the
+pattern's delimiter must be escaped if it also occurs within the
+pattern. Thus, C<"\."> matches just a literal dot, C<"."> instead of
+its normal meaning. This means that the backslash is also a
+metacharacter, so C<"\\"> matches a single C<"\">. And a sequence that
+contains an escaped metacharacter matches the same sequence (but without
+the escape) in the target string. So, the pattern C</blur\\fl/> would
+match any target string that contains the sequence C<"blur\fl">.
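+
+The effect of escaping can be checked with a few one-line matches. This
+is only a sketch; the sample strings are made up for illustration, and
+the target is single-quoted so that it holds exactly one backslash.
+

```perl
use strict;
use warnings;

# An unescaped dot matches almost any character; an escaped dot
# matches only a literal dot.
my $dot_any     = "a-c" =~ /a.c/  ? 1 : 0;   # 1
my $dot_literal = "a-c" =~ /a\.c/ ? 1 : 0;   # 0
my $dot_itself  = "a.c" =~ /a\.c/ ? 1 : 0;   # 1

# Single quotes keep the lone backslash in the target: blur\fl
my $target    = 'blur\fl';
my $backslash = $target =~ /blur\\fl/ ? 1 : 0;   # 1

print "$dot_any $dot_literal $dot_itself $backslash\n";
```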
+
+The metacharacter C<"|"> is used to match one thing or another. Thus
+
+ $foo =~ m/this|that/
+
+is TRUE if and only if C<$foo> contains either the sequence C<"this"> or
+the sequence C<"that">. Like all metacharacters, prefixing the C<"|">
+with a backslash makes it match the plain punctuation character; in its
+case, the VERTICAL LINE.
+
+ $foo =~ m/this\|that/
+
+is TRUE if and only if C<$foo> contains the sequence C<"this|that">.
+
+You aren't limited to just a single C<"|">.
+
+ $foo =~ m/fee|fie|foe|fum/
+
+is TRUE if and only if C<$foo> contains any of those 4 sequences from
+the children's story "Jack and the Beanstalk".
+
+As you can see, the C<"|"> binds less tightly than a sequence of
+ordinary characters. We can override this by using the grouping
+metacharacters, the parentheses C<"("> and C<")">.
+
+ $foo =~ m/th(is|at) thing/
+
+is TRUE if and only if C<$foo> contains either the sequence S<C<"this
+thing">> or the sequence S<C<"that thing">>. The portions of the string
+that match the portions of the pattern enclosed in parentheses are
+normally made available separately for use later in the pattern,
+substitution, or program. This is called "capturing", and it can get
+complicated. See L</Capture groups>.
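+
+A minimal sketch of grouping and capturing, assuming a made-up target
+string: the parentheses limit the alternation to C<"is"> or C<"at">,
+and the text they matched is afterwards available in C<$1>.
+

```perl
use strict;
use warnings;

my $foo = "I like that thing a lot";

my $captured = "";
if ($foo =~ m/th(is|at) thing/) {
    # $1 holds what the first pair of parentheses matched.
    $captured = $1;
}

print "captured: $captured\n";   # captured: at
```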
+
+The first alternative includes everything from the last pattern
+delimiter (C<"(">, C<"(?:"> (described later), I<etc>. or the beginning
+of the pattern) up to the first C<"|">, and the last alternative
+contains everything from the last C<"|"> to the next closing pattern
+delimiter. That's why it's common practice to include alternatives in
+parentheses: to minimize confusion about where they start and end.
+
+Alternatives are tried from left to right, so the first
+alternative found for which the entire expression matches is the one that
+is chosen. This means that alternatives are not necessarily greedy. For
+example: when matching C<foo|foot> against C<"barefoot">, only the C<"foo">
+part will match, as that is the first alternative tried, and it successfully
+matches the target string. (This might not seem important, but it is
+important when you are capturing matched text using parentheses.)
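+
+The left-to-right rule can be observed by capturing what matched. The
+sketch below uses the C<"barefoot"> example from the text; the third
+pattern adds a C<"$"> anchor to show that a later alternative is chosen
+when the earlier one does not let the entire expression match.
+

```perl
use strict;
use warnings;

my $target = "barefoot";

# "foo" is tried first and succeeds, so "foot" is never tried.
my ($first) = $target =~ /(foo|foot)/;       # "foo"

# Reversing the order lets the longer alternative win.
my ($second) = $target =~ /(foot|foo)/;      # "foot"

# With "$", "foo" fails to reach the end, so "foot" is chosen.
my ($anchored) = $target =~ /(foo|foot)$/;   # "foot"

print "$first $second $anchored\n";
```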
+
+Besides taking away the special meaning of a metacharacter, a prefixed
+backslash changes some letter and digit characters away from matching
+just themselves to instead have special meaning. These are called
+"escape sequences", and all such are described in L<perlrebackslash>. A
+backslash sequence (of a letter or digit) that doesn't currently have
+special meaning to Perl will raise a warning if warnings are enabled,
+as those are reserved for potential future use.
+
+One such sequence is C<\b>, which matches a boundary of some sort.
+C<\b{wb}> and a few others give specialized types of boundaries.
+(They are all described in detail starting at
+L<perlrebackslash/\b{}, \b, \B{}, \B>.) Note that these don't match
+characters, but the zero-width spaces between characters. They are an
+example of a L<zero-width assertion|/Assertions>. Consider again,
+
+ $foo =~ m/fee|fie|foe|fum/
+
+It evaluates to TRUE if, besides those 4 words, any of the sequences
+"feed", "field", "Defoe", "fume", and many others are in C<$foo>. By
+judicious use of C<\b> (or better, C<\b{wb}>, because it is designed to
+handle natural language), we can make sure that only the Giant's
+words are matched:
+
+ $foo =~ m/\b(fee|fie|foe|fum)\b/
+ $foo =~ m/\b{wb}(fee|fie|foe|fum)\b{wb}/
+
+The final example shows that the characters C<"{"> and C<"}"> are
+metacharacters.
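+
+To see the difference the boundaries make, the following sketch filters
+a made-up list of targets with and without C<\b>. It uses only plain
+C<\b> so that it also runs on perls older than v5.22, which lack
+C<\b{wb}>.
+

```perl
use strict;
use warnings;

my @targets = ("fee", "feed", "Defoe", "fume", "fie fum");

# Without boundaries, every target contains one of the sequences.
my @loose = grep { /fee|fie|foe|fum/ } @targets;

# With \b on both sides, only whole words are accepted.
my @tight = grep { /\b(fee|fie|foe|fum)\b/ } @targets;

print scalar(@loose), " loose, ", scalar(@tight), " tight\n";
```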
+
+Another use for escape sequences is to specify characters that cannot
+(or which you prefer not to) be written literally. These are described
+in detail in L<perlrebackslash/Character Escapes>, but the next three
+paragraphs briefly describe some of them.
+
+Various control characters can be written in C language style: C<"\n">
+matches a newline, C<"\t"> a tab, C<"\r"> a carriage return, C<"\f"> a
+form feed, I<etc>.
+
+More generally, C<\I<nnn>>, where I<nnn> is a string of three octal
+digits, matches the character whose native code point is I<nnn>. You
+can easily run into trouble if you don't have exactly three digits. So
+always use three, or since Perl 5.14, you can use C<\o{...}> to specify
+any number of octal digits.
+
+Similarly, C<\xI<nn>>, where I<nn> are hexadecimal digits, matches the
+character whose native ordinal is I<nn>. Again, not using exactly two
+digits is a recipe for disaster, but you can use C<\x{...}> to specify
+any number of hex digits.
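+
+As an illustration, each of the four patterns below names the same
+character. The sketch assumes an ASCII or Unicode platform, where
+C<"A"> has code point 65 (octal 101, hexadecimal 41), and a perl of at
+least 5.14 for C<\o{...}>.
+

```perl
use strict;
use warnings;

# Assumption: "A" is code point 65, which is not true on EBCDIC.
my @patterns = (qr/\101/, qr/\o{101}/, qr/\x41/, qr/\x{41}/);

# In scalar context, grep returns the number of patterns that matched.
my $hits = grep { "A" =~ $_ } @patterns;

print "$hits of ", scalar(@patterns), " patterns match\n";
```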
+
+Besides being a metacharacter, the C<"."> is an example of a "character
+class", something that can match any single character of a given set of
+them. In its case, the set is just about all possible characters. Perl
+predefines several character classes besides the C<".">; there is a
+separate reference page about just these, L<perlrecharclass>.
+
+You can define your own custom character classes by putting a list of
+all the characters you want in the set into your pattern at the
+appropriate place(s). You do this by enclosing the list within C<[]> bracket
+characters. These are called "bracketed character classes" when we are
+being precise, but often the word "bracketed" is dropped. (Dropping it
+usually doesn't cause confusion.) This means that the C<"["> character
+is another metacharacter. It doesn't match anything just by itself; it
+is used only to tell Perl that what follows it is a bracketed character
+class. If you want to match a literal left square bracket, you must
+escape it, like C<"\[">. The matching C<"]"> is also a metacharacter;
+again it doesn't match anything by itself, but just marks the end of
+your custom class to Perl. It is an example of a "sometimes
+metacharacter". It isn't a metacharacter if there is no corresponding
+C<"[">, and matches its literal self:
+
+ print "]" =~ /]/; # prints 1
+
+The list of characters within the character class gives the set of
+characters matched by the class. C<"[abc]"> matches a single "a" or "b"
+or "c". But if the first character after the C<"["> is C<"^">, the
+class instead matches any character not in the list. Within a list, the
+C<"-"> character specifies a range of characters, so that C<a-z>
+represents all characters between "a" and "z", inclusive. If you want
+either C<"-"> or C<"]"> itself to be a member of a class, put it at the
+start of the list (possibly after a C<"^">), or escape it with a
+backslash. C<"-"> is also taken literally when it is at the end of the
+list, just before the closing C<"]">. (The following all specify the
+same class of three characters: C<[-az]>, C<[az-]>, and C<[a\-z]>. All
+are different from C<[a-z]>, which specifies a class containing
+twenty-six characters, even on EBCDIC-based character sets.)
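+
+The class rules above can be sketched as follows. The test strings are
+illustrative; each of the three hyphen spellings is anchored so that it
+must account for the whole target.
+

```perl
use strict;
use warnings;

my $in_class = "b" =~ /[abc]/  ? 1 : 0;   # 1
my $negated  = "b" =~ /[^abc]/ ? 1 : 0;   # 0

# Three spellings of the class containing "-", "a", and "z".
my @three = (qr/^[-az]+$/, qr/^[az-]+$/, qr/^[a\-z]+$/);

# Each accepts "a-z" but rejects "m", which lies inside the range a-z.
my $same = grep { "a-z" =~ $_ && "m" !~ $_ } @three;

# The real range does match "m".
my $range = "m" =~ /[a-z]/ ? 1 : 0;       # 1

print "$in_class $negated $same $range\n";
```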
+
+There is lots more to bracketed character classes; full details are in
+L<perlrecharclass/Bracketed Character Classes>.
+
+=head3 Metacharacters
+X<metacharacter>
+X<\> X<^> X<.> X<$> X<|> X<(> X<()> X<[> X<[]>
+
+L</The Basics> introduced some of the metacharacters. This section
+gives them all. Most of them have the same meaning as in the I<egrep>
+command.
+
+Only the C<"\"> is always a metacharacter. The others are metacharacters
+just sometimes. The following table lists all of them, summarizes
+their use, and gives the contexts where they are metacharacters.
+Outside those contexts or if prefixed by a C<"\">, they match their
+corresponding punctuation character. In some cases, their meaning
+varies depending on various pattern modifiers that alter the default
+behaviors. See L</Modifiers>.
+
+
+            PURPOSE                                  WHERE
+ \   Escape the next character                    Always, except when
+                                                  escaped by another \
+ ^   Match the beginning of the string            Not in []
+       (or line, if /m is used)
+ ^   Complement the [] class                      At the beginning of []
+ .   Match any single character except newline    Not in []
+       (under /s, includes newline)
+ $   Match the end of the string                  Not in [], but can
+       (or before newline at the end of the       mean interpolate a
+       string; or before any newline if /m is     scalar
+       used)
+ |   Alternation                                  Not in []
+ ()  Grouping                                     Not in []
+ [   Start Bracketed Character class              Not in []
+ ]   End Bracketed Character class                Only in [], and
+                                                    not first
+ *   Matches the preceding element 0 or more      Not in []
+       times
+ +   Matches the preceding element 1 or more      Not in []
+       times
+ ?   Matches the preceding element 0 or 1         Not in []
+       times
+ {   Starts a sequence that gives number(s)       Not in []
+       of times the preceding element can be
+       matched
+ {   when following certain escape sequences
+       starts a modifier to the meaning of the
+       sequence
+ }   End sequence started by {
+ -   Indicates a range                            Only in [] interior
+ #   Beginning of comment, extends to line end    Only with /x modifier
+
+Notice that most of the metacharacters lose their special meaning when
+they occur in a bracketed character class, except C<"^"> has a different
+meaning when it is at the beginning of such a class. And C<"-"> and C<"]">
+are metacharacters only at restricted positions within bracketed
+character classes; while C<"}"> is a metacharacter only when closing a
+special construct started by C<"{">.
+
+In double-quotish context, as is usually the case, you need to be
+careful about C<"$"> and the non-metacharacter C<"@">. Those could
+interpolate variables, which may or may not be what you intended.
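+
+A short sketch of the interpolation pitfall, using a hypothetical
+variable C<$word>: with the usual delimiters its value becomes part of
+the pattern, while single-quote delimiters leave C<"$"> to the regex
+engine, which treats it as the end-of-string assertion. Escaping
+C<"$"> or C<"@"> makes them match literally.
+

```perl
use strict;
use warnings;

my $word = "cat";

# Double-quotish context: $word interpolates, so the pattern is /cat/.
my $interp = "concatenate" =~ /$word/ ? 1 : 0;    # 1

# Single-quote delimiters: no interpolation.  The pattern is "$"
# (end of string) followed by "word", which cannot match here.
my $literal = "concatenate" =~ m'$word' ? 1 : 0;  # 0

# Escaped, "$" and "@" stand for themselves.
my $price = 'costs $5' =~ /\$5/ ? 1 : 0;                     # 1
my $mail  = 'user@example.com' =~ /user\@example/ ? 1 : 0;   # 1

print "$interp $literal $price $mail\n";
```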
+
+These rules were designed for compactness of expression, rather than
+legibility and maintainability. The L</E<sol>x and E<sol>xx> pattern
+modifiers allow you to insert white space to improve readability. And
+use of S<C<L<re 'strict'|re/'strict' mode>>> adds extra checking to
+catch some typos that might silently compile into something unintended.
+
+By default, the C<"^"> character is guaranteed to match only the
+beginning of the string, the C<"$"> character only the end (or before the
+newline at the end), and Perl does certain optimizations with the
+assumption that the string contains only one line. Embedded newlines
+will not be matched by C<"^"> or C<"$">. You may, however, wish to treat a
+string as a multi-line buffer, such that the C<"^"> will match after any
+newline within the string (except if the newline is the last character in
+the string), and C<"$"> will match before any newline. At the
+cost of a little more overhead, you can do this by using the
+L</C<E<sol>m>> modifier on the pattern match operator. (Older programs
+did this by setting C<$*>, but this option was removed in perl 5.10.)
+X<^> X<$> X</m>
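+
+The following sketch contrasts the default anchors with their C</m>
+behavior on a made-up two-line string.
+

```perl
use strict;
use warnings;

my $text = "first\nsecond\n";

# By default "^" matches only at the very beginning of the string.
my $no_m   = $text =~ /^second/  ? 1 : 0;   # 0
my $with_m = $text =~ /^second/m ? 1 : 0;   # 1

# By default "$" still matches before a newline at the end.
my $at_end = $text =~ /second$/ ? 1 : 0;    # 1

# Under /m, "^" and "$" match at every line, so /g finds both words.
my @lines = $text =~ /^(\w+)$/mg;

print "$no_m $with_m $at_end @lines\n";
```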
+
+To simplify multi-line substitutions, the C<"."> character never matches a
+newline unless you use the L<C<E<sol>s>|/s> modifier, which in effect tells
+Perl to pretend the string is a single line--even if it isn't.
+X<.> X</s>
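+
+A minimal sketch of the difference C</s> makes, using a made-up string
+with an embedded newline:
+

```perl
use strict;
use warnings;

my $text = "one\ntwo";

# Without /s the dot refuses to match the embedded newline.
my $no_s = $text =~ /one.two/ ? 1 : 0;     # 0

# With /s the dot matches any character, newline included.
my $with_s = $text =~ /one.two/s ? 1 : 0;  # 1

print "$no_s $with_s\n";
```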