perlrecharclass: Fix typo

diff --git a/pod/perllocale.pod b/pod/perllocale.pod

index d083c09..a44ffbc 100644
--- a/pod/perllocale.pod
+++ b/pod/perllocale.pod
@@ -239,7 +239,7 @@ is.
  =item *
  
  Regular expression patterns can be compiled using
-L<qrE<sol>E<sol>|perlop/qrE<sol>STRINGE<sol>msixpodual> with actual
+L<qrE<sol>E<sol>|perlop/qrE<sol>STRINGE<sol>msixpodualn> with actual
  matching deferred to later.  Again, it is whether or not the compilation
  was done within the scope of C<use locale> that determines the match
  behavior, not if the matches are done within such a scope or not.
@@ -298,8 +298,8 @@ C<ucfirst()>, and C<lcfirst()>) use C<LC_CTYPE>
  
  =item *
  
-The variables L<$!|perlvar/$ERRNO> (and its synonyms C<$ERRNO> and
-C<$OS_ERROR>) and L<$^E|perlvar/$EXTENDED_OS_ERROR> (and its synonym
+B<The variables L<$!|perlvar/$ERRNO>> (and its synonyms C<$ERRNO> and
+C<$OS_ERROR>) B<and L<$^E|perlvar/$EXTENDED_OS_ERROR>> (and its synonym
  C<$EXTENDED_OS_ERROR>) when used as strings use C<LC_MESSAGES>.
  
  =back
@@ -755,7 +755,7 @@ alphabets, but where do "E<aacute>" and "E<aring>" belong?  And while
  "color" follows "chocolate" in English, what about in traditional Spanish?
  
  The following collations all make sense and you may meet any of them
-if you "use locale".
+if you C<"use locale">.
  
         A B C D E a b c d e
         A a B b C c D d E e
@@ -792,7 +792,7 @@ C<$equal_in_locale> will be true if the collation locale specifies a
  dictionary-like ordering that ignores space characters completely and
  which folds case.
  
-Perl only supports single-byte locales for C<LC_COLLATE>.  This means
+Perl currently only supports single-byte locales for C<LC_COLLATE>.  This means
  that a UTF-8 locale likely will just give you machine-native ordering.
  Use L<Unicode::Collate> for the full implementation of the Unicode
  Collation Algorithm.
@@ -1005,7 +1005,7 @@ results.  Here are a few possibilities:
  
  Regular expression checks for safe file names or mail addresses using
  C<\w> may be spoofed by an C<LC_CTYPE> locale that claims that
-characters such as "E<gt>" and "|" are alphanumeric.
+characters such as C<"E<gt>"> and C<"|"> are alphanumeric.
  
  =item *
  
@@ -1466,9 +1466,12 @@ the characters in the upper half of the Latin-1 range (128 - 255)
  properly under C<LC_CTYPE>.  To see if a character is a particular type
  under a locale, Perl uses the functions like C<isalnum()>.  Your C
  library may not work for UTF-8 locales with those functions, instead
-only working under the newer wide library functions like C<iswalnum()>.
-However, they are treated like single-byte locales, and will have the
-restrictions described below.
+only working under the newer wide library functions like C<iswalnum()>,
+which Perl does not use.
+These multi-byte locales are treated like single-byte locales, and will
+have the restrictions described below.  Starting in Perl v5.22 a warning
+message is raised when Perl detects a multi-byte locale that it doesn't
+fully support.
  
  For single-byte locales,
  Perl generally takes the tack to use locale rules on code points that can fit
@@ -1488,7 +1491,7 @@ Unicode, C<\p{Alpha}> will never match it, regardless of locale.  A similar
  issue occurs with C<\N{...}>.  Prior to v5.20, It is therefore a bad
  idea to use C<\p{}> or
  C<\N{}> under plain C<use locale>--I<unless> you can guarantee that the
-locale will be a ISO8859-1.  Use POSIX character classes instead.
+locale will be ISO8859-1.  Use POSIX character classes instead.
  
  Another problem with this approach is that operations that cross the
  single byte/multiple byte boundary are not well-defined, and so are
@@ -1516,6 +1519,11 @@ Still another problem is that this approach can lead to two code
  points meaning the same character.  Thus in a Greek locale, both U+03A7
  and U+00D7 are GREEK CAPITAL LETTER CHI.
  
+Because of all these problems, starting in v5.22, Perl will raise a
+warning if a multi-byte (hence Unicode) code point is used when a
+single-byte locale is in effect.  (Although it doesn't check for this if
+doing so would unreasonably slow execution down.)
+
  Vendor locales are notoriously buggy, and it is difficult for Perl to test
  its locale-handling code because this interacts with code that Perl has no
  control over; therefore the locale-handling code in Perl may be buggy as
@@ -1541,8 +1549,8 @@ Pre-v5.12, it was somewhat haphazard; in v5.12 it was applied fairly
  consistently to regular expression matching except for bracketed
  character classes; in v5.14 it was extended to all regex matches; and in
  v5.16 to the casing operations such as C<\L> and C<uc()>.  For
-collation, in all releases, the system's C<strxfrm()> function is called,
-and whatever it does is what you get.
+collation, in all releases so far, the system's C<strxfrm()> function is
+called, and whatever it does is what you get.
  
  =head1 BUGS