From f7f5e97b7a9bb1015c2778e4f6b40b39f1459074 Mon Sep 17 00:00:00 2001
From: Karl Williamson <public@khwilliamson.com>
Date: Fri, 28 Jan 2011 09:01:05 -0700
Subject: [PATCH] perldiag.pod: Expand \p in locale description

---
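Reviewer note (not part of the patch): a minimal sketch, assuming Perl
5.14 or later, for checking the Unicode side of the two examples in the
new text.  It deliberately does not enable "use locale", since locale
availability varies by system; it only shows what \p and Unicode case
mapping report for the raw code points.  The variable names are
illustrative and do not come from the Perl source.

```perl
use strict;
use warnings;
use feature 'unicode_strings';   # Unicode rules for uc() on 0x80-0xFF

# No "use locale" here: this shows only the Unicode meaning that \p uses.
my $d7 = chr(0xD7);   # MULTIPLICATION SIGN in Unicode
my $ff = chr(0xFF);   # LATIN SMALL LETTER Y WITH DIAERESIS in Unicode

my $d7_is_alpha = $d7 =~ /\p{Alpha}/ ? 1 : 0;
my $ff_changes  = $ff =~ /\p{Changes_When_Uppercased}/ ? 1 : 0;
my $ff_upper    = ord(uc $ff);   # code point of the uppercase form

printf "U+00D7 matches \\p{Alpha}: %s\n", $d7_is_alpha ? "yes" : "no";
printf "U+00FF matches \\p{Changes_When_Uppercased}: %s\n",
    $ff_changes ? "yes" : "no";
printf "uc(U+00FF) is U+%04X, %s the Latin1 range\n",
    $ff_upper, $ff_upper > 0xFF ? "outside" : "inside";
```

If the sketch behaves as the new text describes, the first check fails
to match, the second matches, and the uppercase form lands above 0xFF.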
 pod/perldiag.pod | 23 +++++++++++++++++++----
 1 file changed, 19 insertions(+), 4 deletions(-)
diff --git a/pod/perldiag.pod b/pod/perldiag.pod
index 0ab9e92..bf22e1e 100644
--- a/pod/perldiag.pod
+++ b/pod/perldiag.pod
@@ -3331,10 +3331,25 @@ mixed-case attribute name, instead.  See L<attributes>.
 
 (W) You compiled a regular expression that contained a Unicode property
 match (C<\p> or C<\P>), but the regular expression is also being told to
-use the run-time locale, not Unicode.  It's best to not use these
-Unicode properties with locale, as only if the locale is a properly
-implemented ISO 8859-1 (Latin1) locale (which is supposed to be a subset
-of Unicode) will there not be any anomalies.
+use the run-time locale, not Unicode.  Instead, use a POSIX character
+class, which should know about the locale's rules.
+(See L<perlrecharclass/POSIX Character Classes>.)
+
+Even if the run-time locale is ISO 8859-1 (Latin1), which is a subset of
+Unicode, some properties will give results that are not valid for that
+subset.
+
+Here are a couple of examples to help you see what's going on.  If the
+locale is ISO 8859-7, the character at code point 0xD7 is the "GREEK
+CAPITAL LETTER CHI".  But in Unicode that code point means the
+"MULTIPLICATION SIGN" instead, and C<\p> always uses the Unicode
+meaning.  That means that C<\p{Alpha}> won't match, but C<[[:alpha:]]>
+should.  Only in the Latin1 locale are all the characters in the same
+positions as they are in Unicode.  But, even here, some properties give
+incorrect results.  An example is C<\p{Changes_When_Uppercased}>, which
+is true for "LATIN SMALL LETTER Y WITH DIAERESIS".  But the uppercase
+of that character is not in Latin1, so in that locale the character
+doesn't change when uppercased.
 
 =item pack/unpack repeat count overflow
 
-- 
1.8.3.1