Unicode::UCD: Add charprops_all() public function

diff --git a/pod/perlrebackslash.pod b/pod/perlrebackslash.pod

index 44b0e7d..230e76d 100644
--- a/pod/perlrebackslash.pod
+++ b/pod/perlrebackslash.pod
@@ -70,6 +70,7 @@ as C<Not in [].>
   \B                Not a word/non-word boundary.  Not in [].
   \cX               Control-X.
   \C                Single octet, even under UTF-8.  Not in [].
+                   (Deprecated)
   \d                Character class for digits.
   \D                Character class for non-digits.
   \e                Escape character.
@@ -575,11 +576,14 @@ categories above. These are:
  
  =item \C
  
-C<\C> always matches a single octet, even if the source string is encoded
+(Deprecated.) C<\C> always matches a single octet, even if the source
+string is encoded
  in UTF-8 format, and the character to be matched is a multi-octet character.
  This is very dangerous, because it violates
  the logical character abstraction and can cause UTF-8 sequences to become malformed.
  
+Use C<utf8::encode()> instead.
+
  Mnemonic: oI<C>tet.
  
  =item \K
@@ -643,15 +647,15 @@ Unicode, but one can be composed by using a G followed by a Unicode "COMBINING
  UPWARDS ARROW BELOW", and would be displayed by Unicode-aware software as if it
  were a single character.
  
+The match is greedy and non-backtracking, so that the cluster is never
+broken up into smaller components.
+
  Mnemonic: eI<X>tended Unicode character.
  
  =back
  
  =head4 Examples
  
- "\x{256}" =~ /^\C\C$/;    # Match as chr (0x256) takes 
-                           # 2 octets in UTF-8.
-
   $str =~ s/foo\Kbar/baz/g; # Change any 'bar' following a 'foo' to 'baz'
   $str =~ s/(.)\K\g1//g;    # Delete duplicated characters.