always returned.
Variant C<isFOO_A> (e.g., C<isALPHA_A()>) will return TRUE only if the input is
-also in the ASCII character set. For ASCII platforms, the base function with
-no suffix and the one with the C<_A> suffix are identical. On EBCDIC
-platforms, the C<_A> suffix function will not return true unless the specified
-character also has an ASCII equivalent.
-
-Variant C<isFOO_L1> operates on the full Latin1 character set. For EBCDIC
-platforms, the base function with no suffix and the one with the C<_L1> suffix
-are identical. For ASCII platforms, the C<_L1> suffix imposes the Latin-1
-character set onto the platform. That is, the code points that are ASCII are
-unaffected, since ASCII is a subset of Latin-1. But the non-ASCII code points
-are treated as if they are Latin-1 characters. For example, C<isSPACE_L1()>
-will return true when called with the code point 0xA0, which is the Latin-1
-NO-BREAK SPACE.
+also in the ASCII character set. The base function with no suffix and the one
+with the C<_A> suffix are identical.
+
+Variant C<isFOO_L1> imposes the Latin-1 (or EBCDIC equivalent) character set
+onto the platform. That is, the code points that are ASCII are unaffected,
+since ASCII is a subset of Latin-1. But the non-ASCII code points are treated
+as if they are Latin-1 characters. For example, C<isWORDCHAR_L1()> will return
+true when called with the code point 0xDF, which is a word character in both
+ASCII and EBCDIC (though it represents different characters in each).
Variant C<isFOO_uni> is like the C<isFOO_L1> variant, but accepts any UV code
point as input. If the code point is larger than 255, Unicode rules are used
alphabetic character in the platform's native character set, analogous to
C<m/[[:alpha:]]/>.
See the L<top of this section|/Character classes> for an explanation of variants
-C<isALPHA_A>, C<isALPHA_L1>, C<isALPHA_uni>, C<isALPHA_utf8>, C<isALPHA_LC>
+C<isALPHA_A>, C<isALPHA_L1>, C<isALPHA_uni>, C<isALPHA_utf8>, C<isALPHA_LC>,
C<isALPHA_LC_uvchr>, and C<isALPHA_LC_utf8>.
=for apidoc Am|bool|isALPHANUMERIC|char ch
analogous to C<m/[[:alnum:]]/>.
See the L<top of this section|/Character classes> for an explanation of variants
C<isALPHANUMERIC_A>, C<isALPHANUMERIC_L1>, C<isALPHANUMERIC_uni>,
-C<isALPHANUMERIC_utf8>, C<isALPHANUMERIC_LC> C<isALPHANUMERIC_LC_uvchr>, and
+C<isALPHANUMERIC_utf8>, C<isALPHANUMERIC_LC>, C<isALPHANUMERIC_LC_uvchr>, and
C<isALPHANUMERIC_LC_utf8>.
=for apidoc Am|bool|isASCII|char ch
Returns a boolean indicating whether the specified character is one of the 128
characters in the ASCII character set, analogous to C<m/[[:ascii:]]/>.
-On non-ASCII platforms, it is if this
+On non-ASCII platforms, it returns TRUE iff this
character corresponds to an ASCII character. Variants C<isASCII_A()> and
C<isASCII_L1()> are identical to C<isASCII()>.
See the L<top of this section|/Character classes> for an explanation of variants
character considered to be a blank in the platform's native character set,
analogous to C<m/[[:blank:]]/>.
See the L<top of this section|/Character classes> for an explanation of variants
-C<isBLANK_A>, C<isBLANK_L1>, C<isBLANK_uni>, C<isBLANK_utf8>, C<isBLANK_LC>
+C<isBLANK_A>, C<isBLANK_L1>, C<isBLANK_uni>, C<isBLANK_utf8>, C<isBLANK_LC>,
C<isBLANK_LC_uvchr>, and C<isBLANK_LC_utf8>. Note, however, that some
platforms do not have the C library routine C<isblank()>. In these cases, the
variants whose names contain C<LC> are the same as the corresponding ones
control character in the platform's native character set,
analogous to C<m/[[:cntrl:]]/>.
See the L<top of this section|/Character classes> for an explanation of variants
-C<isCNTRL_A>, C<isCNTRL_L1>, C<isCNTRL_uni>, C<isCNTRL_utf8>, C<isCNTRL_LC>
+C<isCNTRL_A>, C<isCNTRL_L1>, C<isCNTRL_uni>, C<isCNTRL_utf8>, C<isCNTRL_LC>,
C<isCNTRL_LC_uvchr>, and C<isCNTRL_LC_utf8>.
=for apidoc Am|bool|isDIGIT|char ch
digit in the platform's native character set, analogous to C<m/[[:digit:]]/>.
Variants C<isDIGIT_A> and C<isDIGIT_L1> are identical to C<isDIGIT>.
See the L<top of this section|/Character classes> for an explanation of variants
-C<isDIGIT_uni>, C<isDIGIT_utf8>, C<isDIGIT_LC> C<isDIGIT_LC_uvchr>, and
+C<isDIGIT_uni>, C<isDIGIT_utf8>, C<isDIGIT_LC>, C<isDIGIT_LC_uvchr>, and
C<isDIGIT_LC_utf8>.
=for apidoc Am|bool|isGRAPH|char ch
graphic character in the platform's native character set, analogous to
C<m/[[:graph:]]/>.
See the L<top of this section|/Character classes> for an explanation of variants
-C<isGRAPH_A>, C<isGRAPH_L1>, C<isGRAPH_uni>, C<isGRAPH_utf8>, C<isGRAPH_LC>
+C<isGRAPH_A>, C<isGRAPH_L1>, C<isGRAPH_uni>, C<isGRAPH_utf8>, C<isGRAPH_LC>,
C<isGRAPH_LC_uvchr>, and C<isGRAPH_LC_utf8>.
=for apidoc Am|bool|isLOWER|char ch
lowercase character in the platform's native character set, analogous to
C<m/[[:lower:]]/>.
See the L<top of this section|/Character classes> for an explanation of variants
-C<isLOWER_A>, C<isLOWER_L1>, C<isLOWER_uni>, C<isLOWER_utf8>, C<isLOWER_LC>
+C<isLOWER_A>, C<isLOWER_L1>, C<isLOWER_uni>, C<isLOWER_utf8>, C<isLOWER_LC>,
C<isLOWER_LC_uvchr>, and C<isLOWER_LC_utf8>.
=for apidoc Am|bool|isOCTAL|char ch
straightforward as one might desire. See L<perlrecharclass/POSIX Character
Classes> for details.
See the L<top of this section|/Character classes> for an explanation of variants
-C<isPUNCT_A>, C<isPUNCT_L1>, C<isPUNCT_uni>, C<isPUNCT_utf8>, C<isPUNCT_LC>
+C<isPUNCT_A>, C<isPUNCT_L1>, C<isPUNCT_uni>, C<isPUNCT_utf8>, C<isPUNCT_LC>,
C<isPUNCT_LC_uvchr>, and C<isPUNCT_LC_utf8>.
=for apidoc Am|bool|isSPACE|char ch
in the non-locale variants, was that C<isSPACE()> did not match a vertical tab.
(See L</isPSXSPC> for a macro that matches a vertical tab in all releases.)
See the L<top of this section|/Character classes> for an explanation of variants
-C<isSPACE_A>, C<isSPACE_L1>, C<isSPACE_uni>, C<isSPACE_utf8>, C<isSPACE_LC>
+C<isSPACE_A>, C<isSPACE_L1>, C<isSPACE_uni>, C<isSPACE_utf8>, C<isSPACE_LC>,
C<isSPACE_LC_uvchr>, and C<isSPACE_LC_utf8>.
=for apidoc Am|bool|isPSXSPC|char ch
Otherwise they are identical. Thus this macro is analogous to what
C<m/[[:space:]]/> matches in a regular expression.
See the L<top of this section|/Character classes> for an explanation of variants
-C<isPSXSPC_A>, C<isPSXSPC_L1>, C<isPSXSPC_uni>, C<isPSXSPC_utf8>, C<isPSXSPC_LC>
+C<isPSXSPC_A>, C<isPSXSPC_L1>, C<isPSXSPC_uni>, C<isPSXSPC_utf8>, C<isPSXSPC_LC>,
C<isPSXSPC_LC_uvchr>, and C<isPSXSPC_LC_utf8>.
=for apidoc Am|bool|isUPPER|char ch
uppercase character in the platform's native character set, analogous to
C<m/[[:upper:]]/>.
See the L<top of this section|/Character classes> for an explanation of variants
-C<isUPPER_A>, C<isUPPER_L1>, C<isUPPER_uni>, C<isUPPER_utf8>, C<isUPPER_LC>
+C<isUPPER_A>, C<isUPPER_L1>, C<isUPPER_uni>, C<isUPPER_utf8>, C<isUPPER_LC>,
C<isUPPER_LC_uvchr>, and C<isUPPER_LC_utf8>.
=for apidoc Am|bool|isPRINT|char ch
printable character in the platform's native character set, analogous to
C<m/[[:print:]]/>.
See the L<top of this section|/Character classes> for an explanation of variants
-C<isPRINT_A>, C<isPRINT_L1>, C<isPRINT_uni>, C<isPRINT_utf8>, C<isPRINT_LC>
+C<isPRINT_A>, C<isPRINT_L1>, C<isPRINT_uni>, C<isPRINT_utf8>, C<isPRINT_LC>,
C<isPRINT_LC_uvchr>, and C<isPRINT_LC_utf8>.
=for apidoc Am|bool|isWORDCHAR|char ch
C<isXDIGIT_uni>, C<isXDIGIT_utf8>, C<isXDIGIT_LC>, C<isXDIGIT_LC_uvchr>, and
C<isXDIGIT_LC_utf8>.
+=for apidoc Am|bool|isIDFIRST|char ch
+Returns a boolean indicating whether the specified character can be the first
+character of an identifier. This is very close to, but not quite the same as,
+the official Unicode property C<XID_Start>. The difference is that this
+returns true only if the input character also matches L</isWORDCHAR>.
+See the L<top of this section|/Character classes> for an explanation of variants
+C<isIDFIRST_A>, C<isIDFIRST_L1>, C<isIDFIRST_uni>, C<isIDFIRST_utf8>,
+C<isIDFIRST_LC>, C<isIDFIRST_LC_uvchr>, and C<isIDFIRST_LC_utf8>.
+
+=for apidoc Am|bool|isIDCONT|char ch
+Returns a boolean indicating whether the specified character can be the
+second or succeeding character of an identifier. This is very close to, but
+not quite the same as, the official Unicode property C<XID_Continue>. The
+difference is that this returns true only if the input character also matches
+L</isWORDCHAR>. See the L<top of this section|/Character classes> for an
+explanation of variants C<isIDCONT_A>, C<isIDCONT_L1>, C<isIDCONT_uni>,
+C<isIDCONT_utf8>, C<isIDCONT_LC>, C<isIDCONT_LC_uvchr>, and
+C<isIDCONT_LC_utf8>.
+
+=for apidoc Am|bool|isVERTWS|char ch
+Returns a boolean indicating whether the specified character is considered
+to be vertical white space, such as C<"\n"> or C<"\f">. See the L<top of this
+section|/Character classes> for an explanation of variants
+C<isVERTWS_uni> and C<isVERTWS_utf8>.
+
=head1 Miscellaneous Functions
=for apidoc Am|U8|READ_XDIGIT|char str*
=cut
-XXX Still undocumented are VERTSPACE, and IDFIRST IDCONT, and the
-other toUPPER etc functions
+XXX Still undocumented are the other toUPPER etc. functions
Note that these macros are repeated in Devel::PPPort, so should also be
patched there. The file as of this writing is cpan/Devel-PPPort/parts/inc/misc
# define _CC_QUOTEMETA 20
# define _CC_NON_FINAL_FOLD 21
# define _CC_IS_IN_SOME_FOLD 22
-/* Unused: 23-31
+# define _CC_BACKSLASH_FOO_LBRACE_IS_META 31 /* temp, see mk_PL_charclass.pl */
+/* Unused: 23-30
* If more bits are needed, one could add a second word for non-64bit
* QUAD_IS_INT systems, using some #ifdefs to distinguish between having a 2nd
* word or not. The IS_IN_SOME_FOLD bit is the most easily expendable, as it