compatibility and chooses to use byte semantics.
When C<use locale> (but not C<use locale ':not_characters'>) is in
-effect, Perl uses the semantics associated with the current locale.
+effect, Perl uses the rules associated with the current locale.
(C<use locale> overrides C<use feature 'unicode_strings'> in the same scope;
while C<use locale ':not_characters'> effectively also selects
C<use feature 'unicode_strings'> in its scope; see L<perllocale>.)
Otherwise, Perl uses the platform's native
byte semantics for characters whose code points are less than 256, and
-Unicode semantics for those greater than 255. That means that non-ASCII
+Unicode rules for those greater than 255. That means that non-ASCII
characters are undefined except for their
ordinal numbers. This means that none have case (upper and lower), nor are any
a member of character classes, like C<[:alpha:]> or C<\w>. (But all do belong
=item [8]
-Should do C<^> and C<$> also on C<U+000B> (C<\v> in C), C<FF> (C<\f>), C<CR> (C<\r>), C<CRLF>
-(C<\r\n>), C<NEL> (C<U+0085>), C<LS> (C<U+2028>), and C<PS> (C<U+2029>); should also affect
-C<E<lt>E<gt>>, C<$.>, and script line numbers; should not split lines within C<CRLF>
-(i.e. there is no empty line between C<\r> and C<\n>). For C<CRLF>, try the
+Should do C<^> and C<$> also on C<U+000B> (C<\v> in C), C<FF> (C<\f>),
+C<CR> (C<\r>), C<CRLF> (C<\r\n>), C<NEL> (C<U+0085>), C<LS> (C<U+2028>),
+and C<PS> (C<U+2029>); should also affect C<E<lt>E<gt>>, C<$.>, and
+script line numbers; should not split lines within C<CRLF> (i.e. there
+is no empty line between C<\r> and C<\n>). For C<CRLF>, try the
C<:crlf> layer (see L<PerlIO>).
=item [9]
 use warnings FATAL => "non_unicode"
-(see L<perllexwarn>). In this mode of operation, Perl will raise the
+(see L<warnings>). In this mode of operation, Perl will raise the
warning for all matches against a non-Unicode code point (not just the
arguable ones), and it skips the optimizations that might cause the
warning to not be output. (It currently still won't warn if the match
=item *
-C<is_utf8_char_buf(buf, buf_end)> returns true if the pointer points to
+C<isUTF8_CHAR(buf, buf_end)> returns true if the pointer points to
a valid UTF-8 character.
=item *