From: Karl Williamson
Date: Wed, 24 Apr 2013 21:39:08 +0000 (-0600)
Subject: perlclib.pod: Update character class macro descriptions
X-Git-Tag: v5.19.1~442
X-Git-Url: https://perl5.git.perl.org/perl5.git/commitdiff_plain/fa2b1084e3c2ca84300c9da8bdd7f808b78d35bf?hp=5b2821400d3dea1b132a955e865a58e89133c569

perlclib.pod: Update character class macro descriptions

Much has changed since this pod was last updated.
---

diff --git a/pod/perlclib.pod b/pod/perlclib.pod
index 4bb5ae8..0cdee24 100644
--- a/pod/perlclib.pod
+++ b/pod/perlclib.pod
@@ -150,28 +150,50 @@ macros, which have similar arguments to Zero():
 
 =head2 Character Class Tests
 
-There are two types of character class tests that Perl implements: one
-type deals in C<char>s and are thus B<not> Unicode aware (and hence
-deprecated unless you B<know> you should use them) and the other type
-deal in C<UV>s and know about Unicode properties. In the following
-table, C<c> is a C<char>, and C<u> is a Unicode codepoint.
-
-    Instead Of: Use:          But better use:
-
-    isalnum(c)  isALNUM(c)    isALNUM_uni(u)
-    isalpha(c)  isALPHA(c)    isALPHA_uni(u)
-    iscntrl(c)  isCNTRL(c)    isCNTRL_uni(u)
-    isdigit(c)  isDIGIT(c)    isDIGIT_uni(u)
-    isgraph(c)  isGRAPH(c)    isGRAPH_uni(u)
-    islower(c)  isLOWER(c)    isLOWER_uni(u)
-    isprint(c)  isPRINT(c)    isPRINT_uni(u)
-    ispunct(c)  isPUNCT(c)    isPUNCT_uni(u)
-    isspace(c)  isSPACE(c)    isSPACE_uni(u)
-    isupper(c)  isUPPER(c)    isUPPER_uni(u)
-    isxdigit(c) isXDIGIT(c)   isXDIGIT_uni(u)
-
-    tolower(c)  toLOWER(c)    toLOWER_uni(u)
-    toupper(c)  toUPPER(c)    toUPPER_uni(u)
+There are several types of character class tests that Perl implements.
+The only ones described here are those that directly correspond to C
+library functions that operate on 8-bit characters, but there are
+equivalents that operate on wide characters, and UTF-8 encoded strings.
+All are more fully described in L<perlapi/Character classes> and
+L<perlapi/Character case changing>.
+
+The C library routines listed in the table below return values based on
+the current locale. Use the entries in the final column for that
+functionality. The other two columns always assume a POSIX (or C)
+locale. The entries in the ASCII column are only meaningful for ASCII
+inputs, returning FALSE for anything else. Use these only when you
+B<know> that is what you want. The entries in the Latin1 column assume
+that the non-ASCII 8-bit characters are as Unicode defines them, the
+same as ISO-8859-1, often called Latin 1.
+
+    Instead Of: Use for ASCII:    Use for Latin1:      Use for locale:
+
+    isalnum(c)  isALPHANUMERIC(c) isALPHANUMERIC_L1(c) isALPHANUMERIC_LC(c)
+    isalpha(c)  isALPHA(c)        isALPHA_L1(c)        isALPHA_LC(c)
+    isascii(c)  isASCII(c)                             isASCII_LC(c)
+    isblank(c)  isBLANK(c)        isBLANK_L1(c)        isBLANK_LC(c)
+    iscntrl(c)  isCNTRL(c)        isCNTRL_L1(c)        isCNTRL_LC(c)
+    isdigit(c)  isDIGIT(c)        isDIGIT_L1(c)        isDIGIT_LC(c)
+    isgraph(c)  isGRAPH(c)        isGRAPH_L1(c)        isGRAPH_LC(c)
+    islower(c)  isLOWER(c)        isLOWER_L1(c)        isLOWER_LC(c)
+    isprint(c)  isPRINT(c)        isPRINT_L1(c)        isPRINT_LC(c)
+    ispunct(c)  isPUNCT(c)        isPUNCT_L1(c)        isPUNCT_LC(c)
+    isspace(c)  isSPACE(c)        isSPACE_L1(c)        isSPACE_LC(c)
+    isupper(c)  isUPPER(c)        isUPPER_L1(c)        isUPPER_LC(c)
+    isxdigit(c) isXDIGIT(c)       isXDIGIT_L1(c)       isXDIGIT_LC(c)
+
+    tolower(c)  toLOWER(c)        toLOWER_L1(c)        toLOWER_LC(c)
+    toupper(c)  toUPPER(c)                             toUPPER_LC(c)
+
+To emphasize that you are operating only on ASCII characters, you can
+append C<_A> to each of the macros in the ASCII column: C<isALPHA_A>,
+C<isDIGIT_A>, and so on.
+
+(There is no entry in the Latin1 column for C<isascii> even though there
+is an C<isASCII_L1>, which is identical to C<isASCII>; the
+latter name is clearer. There is no entry in the Latin1 column for
+C<toupper> because the result can be non-Latin1. You have to use
+C<toUPPER_uni>, as described in L<perlapi/Character case changing>.)
 
 =head2 F<stdlib.h> functions
 