-number for every character" breaks down a bit: "at least one number
-for every character" is closer to truth. (This happens when the same
-character has been encoded in several legacy encodings.) The converse
-is also not true: not every code point has an assigned character.
-Firstly, there are unallocated code points within otherwise used
-blocks. Secondly, there are special Unicode control characters that
-do not represent true characters.
-
-A common myth about Unicode is that it would be "16-bit", that is,
-0x10000 (or 65536) characters from 0x0000 to 0xFFFF. B<This is untrue.>
-Since Unicode 2.0 Unicode has been defined all the way up to 21 bits
-(0x10FFFF), and since 3.1 characters have been defined beyond 0xFFFF.
-The first 0x10000 characters are called the I<Plane 0>, or the I<Basic
-Multilingual Plane> (BMP). With the Unicode 3.1, 17 planes in all are
-defined (but nowhere near full of defined characters yet).
-
-Another myth is that the 256-character blocks have something to do
-with languages: a block per language. B<Also this is untrue.>
-The division into the blocks exists but it is almost completely
-accidental, an artifact of how the characters have been historically
-allocated. Instead, there is a concept called I<scripts>, which may
-be more useful: there is C<Latin> script, C<Greek> script, and so on.
-Scripts usually span several parts of several blocks. For further
-information see L<Unicode::UCD>.
+number for every character" idea breaks down a bit: instead, there is
+"at least one number for every character". The same character could
+be represented differently in several legacy encodings. The
+converse is also not true: some code points do not have an assigned
+character. Firstly, there are unallocated code points within
+otherwise used blocks. Secondly, there are special Unicode control
+characters that do not represent true characters.
+
+A common myth about Unicode is that it is "16-bit", that is,
+Unicode is only representable as C<0x10000> (or 65536) characters from
+C<0x0000> to C<0xFFFF>. B<This is untrue.> Since Unicode 2.0 (July
+1996), Unicode has been defined all the way up to 21 bits (C<0x10FFFF>),
+and since Unicode 3.1 (March 2001), characters have been defined
+beyond C<0xFFFF>. The first C<0x10000> characters are called the
+I<Plane 0>, or the I<Basic Multilingual Plane> (BMP). With Unicode
+3.1, 17 (yes, seventeen) planes in all were defined--but they are
+nowhere near full of defined characters yet.
+
+Another myth is about Unicode blocks--that they have something to
+do with languages: that each block would define the characters used
+by a language or a set of languages. B<This is also untrue.>
+The division into blocks exists, but it is almost completely
+accidental--an artifact of how the characters have been and
+still are allocated. Instead, there is a concept called I<scripts>, which is
+more useful: there is C<Latin> script, C<Greek> script, and so on. Scripts
+usually span varied parts of several blocks. For more information about
+scripts, see L<perlunicode/Scripts>.