-Some of the character classes have a somewhat different behaviour depending
-on the internal encoding of the source string, and the locale that is
-in effect, and if the program is running on an EBCDIC platform.
-
-C<\w>, C<\d>, C<\s> and the POSIX character classes (and their negations,
-including C<\W>, C<\D>, C<\S>) suffer from this behaviour. (Since the backslash
-sequences C<\b> and C<\B> are defined in terms of C<\w> and C<\W>, they also are
-affected.)
-
-The rule is that if the source string is in UTF-8 format, the character
-classes match according to the Unicode properties. If the source string
-isn't, then the character classes match according to whatever locale or EBCDIC
-code page is in effect. If there is no locale nor EBCDIC, they match the ASCII
-defaults (0 to 9 for C<\d>; 52 letters, 10 digits and underscore for C<\w>;
-etc.).
-
-This usually means that if you are matching against characters whose C<ord()>
-values are between 128 and 255 inclusive, your character class may match
-or not depending on the current locale or EBCDIC code page, and whether the
-source string is in UTF-8 format. The string will be in UTF-8 format if it
-contains characters whose C<ord()> value exceeds 255. But a string may be in
-UTF-8 format without it having such characters. See L<perlunicode/The
-"Unicode Bug">.
-
-For portability reasons, it may be better to not use C<\w>, C<\d>, C<\s>
-or the POSIX character classes, and use the Unicode properties instead.
+Some of the character classes behave somewhat differently depending
+on the internal encoding of the source string, whether the regular
+expression is marked as having Unicode semantics, the locale that is in
+effect, and whether the program is running on an EBCDIC platform.
+
+C<\w>, C<\d>, C<\s> and the POSIX character classes (and their
+negations, including C<\W>, C<\D>, C<\S>) are affected. (Since the
+backslash sequences C<\b> and C<\B> are defined in terms of C<\w> and
+C<\W>, they are affected as well.)
+
+Starting in Perl 5.14, if the regular expression is compiled with the
+C</a> modifier, the behavior is the same regardless of any other
+factors. C<\d> matches only the 10 digits 0-9; C<\D> matches any
+character but those 10; C<\s> matches exactly the five characters
+"[ \f\n\r\t]"; C<\w> matches only the 63 characters "[A-Za-z0-9_]"; and
+the C<[[:posix:]]> classes match only the appropriate ASCII characters,
+the same characters as are matched by the corresponding C<\p{}>
+property given in the "ASCII-range Unicode" column in the table above.
+(The behavior of all of their complements follows the same paradigm.)
+
+Otherwise, a regular expression is marked for Unicode semantics if it is
+encoded in utf8 (usually as a result of including a literal character
+whose code point is above 255), or if it contains a C<\N{U+...}> or
+C<\N{I<name>}> construct, or (starting in Perl 5.14) if it was compiled
+in the scope of a C<S<use feature "unicode_strings">> pragma and not in
+the scope of a C<S<use locale>> pragma, or if it has the C</u> regular
+expression modifier (also new in Perl 5.14).
+
+Note that a default for any regular expression modifier can be set
+with, for example, C<S<use re '/l'>>, and that this takes precedence
+over either of the C<S<use feature "unicode_strings">> or
+C<S<use locale>> pragmas.
+
+The differences in behavior between locale and non-locale semantics
+can affect any character whose code point is 255 or less. The
+differences in behavior between Unicode and non-Unicode semantics
+affect only ASCII platforms, and only when matching against characters
+whose code points are between 128 and 255 inclusive. See
+L<perlunicode/The "Unicode Bug">.
+
+For portability reasons, unless the C</a> modifier is specified,
+it may be better not to use C<\w>, C<\d>, C<\s> or the POSIX character
+classes, and to use the Unicode properties instead.
+
+That way you can control whether to match only characters in the
+ASCII character set or to match Unicode characters as well.
+C<S<use feature "unicode_strings">> allows seamless Unicode behavior
+regardless of the internal encoding, but won't allow restricting
+matches to ASCII characters only.
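
The rules in the added text can be checked with a minimal sketch like the one below. It assumes Perl 5.14 or later on an ASCII platform; the two sample characters and the hash keys are illustrative choices, not part of the patch. The case with no modifier at all is deliberately left out, since its result depends on the pragmas and locale in effect.

```perl
use strict;
use warnings;

# The /a and /u modifiers need Perl 5.14 or later.
# Sample characters (chosen for this sketch, not taken from the patch):
my $e_acute    = "\xE9";      # U+00E9, a code point in the 128..255 range
my $arabic_one = "\x{661}";   # U+0661 ARABIC-INDIC DIGIT ONE, above 255

my %result = (
    # /a restricts \w and \d to their ASCII-range definitions
    w_a_e9       => ( $e_acute    =~ /\w/a ? 1 : 0 ),
    d_a_661      => ( $arabic_one =~ /\d/a ? 1 : 0 ),

    # /u asks for Unicode semantics whatever the internal encoding
    w_u_e9       => ( $e_acute    =~ /\w/u ? 1 : 0 ),
    d_u_661      => ( $arabic_one =~ /\d/u ? 1 : 0 ),

    # The property forms name the intended character set explicitly
    posixword_e9 => ( $e_acute    =~ /\p{PosixWord}/ ? 1 : 0 ),
    word_e9      => ( $e_acute    =~ /\p{Word}/      ? 1 : 0 ),
);

for my $name ( sort keys %result ) {
    printf "%-14s %s\n", $name,
        $result{$name} ? "matches" : "does not match";
}
```

Under those assumptions the C</a> and C<\p{PosixWord}> cases should not match, while the C</u> and C<\p{Word}> cases should, which is the distinction the final paragraphs of the patch describe.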