Integrate:

diff --git a/pod/perluniintro.pod b/pod/perluniintro.pod

index eadcedd..19bc82e 100644
--- a/pod/perluniintro.pod
+++ b/pod/perluniintro.pod
@@ -246,16 +246,14 @@ Note that both C<\x{...}> and C<\N{...}> are compile-time string
  constants: you cannot use variables in them.  If you want similar
  run-time functionality, use C<chr()> and C<charnames::vianame()>.
  
-Also note that if all the code points for pack "U" are below 0x100,
-bytes will be generated, just like if you were using C<chr()>.
-
-   my $bytes = pack("U*", 0x80, 0xFF);
-
  If you want to force the result to Unicode characters, use the special
  C<"U0"> prefix.  It consumes no arguments but forces the result to be
  in Unicode characters, instead of bytes.
  
-   my $chars = pack("U0U*", 0x80, 0xFF);
+   my $chars = pack("U0C*", 0x80, 0x42);
+
+Likewise, you can force the result to be bytes by using the special
+C<"C0"> prefix.
  
  =head2 Handling Unicode
  
@@ -265,7 +263,7 @@ C<substr()> will work on the Unicode characters; regular expressions
  will work on the Unicode characters (see L<perlunicode> and L<perlretut>).
  
  Note that Perl considers combining character sequences to be
-characters, so for example
+separate characters, so for example
  
      use charnames ':full';
      print length("\N{LATIN CAPITAL LETTER A}\N{COMBINING ACUTE ACCENT}"), "\n";
@@ -299,8 +297,8 @@ If that variable isn't set, the encoding pragma will fail.
  The C<Encode> module knows about many encodings and has interfaces
  for doing conversions between those encodings:
  
-    use Encode 'from_to';
-    from_to($data, "iso-8859-3", "utf-8"); # from legacy to utf-8
+    use Encode 'decode';
+    $data = decode("iso-8859-3", $data); # decode legacy bytes into Perl characters
  
  =head2 Unicode I/O
  
@@ -504,7 +502,7 @@ Yet another way would be to use the Devel::Peek module:
  
      perl -MDevel::Peek -e 'Dump(chr(0x100))'
  
-That shows the UTF8 flag in FLAGS and both the UTF-8 bytes
+That shows the C<UTF8> flag in FLAGS and both the UTF-8 bytes
  and Unicode characters in C<PV>.  See also later in this document
  the discussion about the C<utf8::is_utf8()> function.
  
@@ -638,7 +636,7 @@ C<$string>.  If the flag is off, the bytes in the scalar are interpreted
  as a single byte encoding.  If the flag is on, the bytes in the scalar
  are interpreted as the (multi-byte, variable-length) UTF-8 encoded code
  points of the characters.  Bytes added to a UTF-8 encoded string are
-automatically upgraded to UTF-8.  If mixed non-UTF8 and UTF-8 scalars
+automatically upgraded to UTF-8.  If mixed non-UTF-8 and UTF-8 scalars
  are merged (double-quoted interpolation, explicit concatenation, and
  printf/sprintf parameter substitution), the result will be UTF-8 encoded
  as if copies of the byte strings were upgraded to UTF-8: for example,