lib/locale.t: Don't calculate value unless needed

diff --git a/autodoc.pl b/autodoc.pl

index 4a55c3c..161310d 100644
--- a/autodoc.pl
+++ b/autodoc.pl
@@ -263,7 +263,7 @@ removed without notice.\n\n$docs" if $flags =~ /x/;
  sub sort_helper {
      # Do a case-insensitive dictionary sort, with only alphabetics
      # significant, falling back to using everything for determinacy
-    return (uc($a =~ s/[[^:alpha]]//r) cmp uc($b =~ s/[[^:alpha]]//r))
+    return (uc($a =~ s/[[:^alpha:]]//r) cmp uc($b =~ s/[[:^alpha:]]//r))
             || uc($a) cmp uc($b)
             || $a cmp $b;
  }
@@ -396,6 +396,11 @@ not part of the public API, and should not be used by extension writers at
  all.  For these reasons, blindly using functions listed in proto.h is to be
  avoided when writing extensions.
  
+In Perl, unlike C, a string of characters may generally contain embedded
+C<NUL> characters.  Sometimes in the documentation a Perl string is referred
+to as a "buffer" to distinguish it from a C string, but sometimes they are
+both just referred to as strings.
+
  Note that all Perl API global variables must be referenced with the C<PL_>
  prefix.  Again, those not listed here are not to be used by extension writers,
  and can be changed or removed without notice; same with macros.
@@ -407,6 +412,14 @@ whose ordinal numbers are in the range 0 - 127).
  And documentation and comments may still use the term ASCII, when
  sometimes in fact the entire range from 0 - 255 is meant.
  
+The non-ASCII characters below 256 can have various meanings, depending on
+various things.  (See, most notably, L<perllocale>.)  But usually the whole
+range can be referred to as ISO-8859-1.  Often, the term "Latin-1" (or
+"Latin1") is used as an equivalent for ISO-8859-1.  But some people treat
+"Latin1" as referring just to the characters in the range 128 through 255, or
+sometimes from 160 through 255.
+This documentation uses "Latin1" and "Latin-1" to refer to all 256 characters.
+
  Note that Perl can be compiled and run under either ASCII or EBCDIC (See
  L<perlebcdic>).  Most of the documentation (and even comments in the code)
  ignore the EBCDIC possibility.  
@@ -417,8 +430,8 @@ whenever this documentation refers to C<utf8>
  (and variants of that name, including in function names),
  it also (essentially transparently) means C<UTF-EBCDIC>.
  But the ordinals of characters differ between ASCII, EBCDIC, and
-the UTF- encodings, and a string encoded in UTF-EBCDIC may occupy more bytes
-than in UTF-8.
+the UTF- encodings, and a string encoded in UTF-EBCDIC may occupy a different
+number of bytes than in UTF-8.
  
  The listing below is alphabetical, case insensitive.