prefix. Some macros are provided for compatibility with the older,
unadorned names, but this support may be disabled in a future release.
-The listing is alphabetical, case insensitive.
+Perl was originally written to handle US-ASCII only (that is, characters
+whose ordinal numbers are in the range 0 - 127).
+Documentation and comments may still use the term ASCII even when
+the entire range from 0 - 255 is meant.
+
+Note that Perl can be compiled and run under EBCDIC (see L<perlebcdic>)
+or ASCII. Most of the documentation (and even comments in the code)
+ignores the EBCDIC possibility.
+For almost all purposes the differences are transparent.
+As an example, under EBCDIC,
+instead of UTF-8, UTF-EBCDIC is used to encode Unicode strings, and so
+whenever this documentation refers to C<utf8>
+(and variants of that name, including in function names),
+it also (essentially transparently) means C<UTF-EBCDIC>.
+But the ordinals of characters differ between ASCII, EBCDIC, and
+the UTF- encodings, and a string encoded in UTF-EBCDIC may occupy more bytes
+than in UTF-8.
+
+Also, on some EBCDIC machines, functions that are documented as operating on
+US-ASCII (or Basic Latin in Unicode terminology) may in fact operate on all
+256 characters in the EBCDIC range, not just the subset corresponding to
+US-ASCII.
+
+The listing below is alphabetical, case insensitive.
=head1 "Gimme" Values
=item get_av
X<get_av>
-Returns the AV of the specified Perl array. If C<create> is set and the
-Perl variable does not exist then it will be created. If C<create> is not
-set and the variable does not exist then NULL is returned.
+Returns the AV of the specified Perl array. C<flags> are passed to
+C<gv_fetchpv>. If C<GV_ADD> is set and the
+Perl variable does not exist then it will be created. If C<flags> is zero
+and the variable does not exist then NULL is returned.
NOTE: the perl_ form of this function is deprecated.
- AV* get_av(const char* name, I32 create)
+ AV* get_av(const char *name, I32 flags)
=for hackers
Found in file perl.c
=item isALNUM
X<isALNUM>
-Returns a boolean indicating whether the C C<char> is an ASCII alphanumeric
-character (including underscore) or digit.
+Returns a boolean indicating whether the C C<char> is a US-ASCII (Basic Latin)
+alphanumeric character (including underscore) or digit.
bool isALNUM(char ch)
=item isALPHA
X<isALPHA>
-Returns a boolean indicating whether the C C<char> is an ASCII alphabetic
-character.
+Returns a boolean indicating whether the C C<char> is a US-ASCII (Basic Latin)
+alphabetic character.
bool isALPHA(char ch)
=item isDIGIT
X<isDIGIT>
-Returns a boolean indicating whether the C C<char> is an ASCII
+Returns a boolean indicating whether the C C<char> is a US-ASCII (Basic Latin)
digit.
bool isDIGIT(char ch)
=item isLOWER
X<isLOWER>
-Returns a boolean indicating whether the C C<char> is a lowercase
-character.
+Returns a boolean indicating whether the C C<char> is a US-ASCII (Basic Latin)
+lowercase character.
bool isLOWER(char ch)
=item isSPACE
X<isSPACE>
-Returns a boolean indicating whether the C C<char> is whitespace.
+Returns a boolean indicating whether the C C<char> is a US-ASCII (Basic Latin)
+whitespace character.
bool isSPACE(char ch)
=item isUPPER
X<isUPPER>
-Returns a boolean indicating whether the C C<char> is an uppercase
-character.
+Returns a boolean indicating whether the C C<char> is a US-ASCII (Basic Latin)
+uppercase character.
bool isUPPER(char ch)
=item toLOWER
X<toLOWER>
-Converts the specified character to lowercase.
+Converts the specified character to lowercase. Characters outside the
+US-ASCII (Basic Latin) range are viewed as not having any case.
char toLOWER(char ch)
=item toUPPER
X<toUPPER>
-Converts the specified character to uppercase.
+Converts the specified character to uppercase. Characters outside the
+US-ASCII (Basic Latin) range are viewed as not having any case.
char toUPPER(char ch)
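
The ASCII-only behavior described for C<toLOWER> and C<toUPPER> can be sketched in plain C. This is an illustration only, not Perl's implementation: the names C<sketch_to_upper> and C<sketch_to_lower> are hypothetical, and the code assumes an ASCII platform where the letters are contiguous (which is not true under EBCDIC).

```c
/* Hypothetical helpers, not part of the Perl API. They assume an ASCII
 * platform, where 'a'..'z' and 'A'..'Z' are contiguous ranges. */

/* Uppercase only the US-ASCII letters 'a'..'z'; any other value,
 * including bytes in the range 128 - 255, is returned unchanged. */
static unsigned char sketch_to_upper(unsigned char ch)
{
    if (ch >= 'a' && ch <= 'z')
        return (unsigned char)(ch - ('a' - 'A'));
    return ch;
}

/* Lowercase only the US-ASCII letters 'A'..'Z'; any other value
 * is returned unchanged. */
static unsigned char sketch_to_lower(unsigned char ch)
{
    if (ch >= 'A' && ch <= 'Z')
        return (unsigned char)(ch + ('a' - 'A'));
    return ch;
}
```

With these helpers, C<sketch_to_upper('q')> yields C<'Q'>, while a Latin-1 byte such as 0xE9 passes through untouched because it is treated as having no case.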
=back
+=head1 Functions in file perl.h
+
+
+=over 8
+
+=item PERL_SYS_INIT
+X<PERL_SYS_INIT>
+
+Provides system-specific tune up of the C runtime environment necessary to
+run Perl interpreters. This should be called only once, before creating
+any Perl interpreters.
+
+ void PERL_SYS_INIT(int argc, char** argv)
+
+=for hackers
+Found in file perl.h
+
+=item PERL_SYS_INIT3
+X<PERL_SYS_INIT3>
+
+Provides system-specific tune up of the C runtime environment necessary to
+run Perl interpreters. This should be called only once, before creating
+any Perl interpreters.
+
+ void PERL_SYS_INIT3(int argc, char** argv, char** env)
+
+=for hackers
+Found in file perl.h
+
+=item PERL_SYS_TERM
+X<PERL_SYS_TERM>
+
+Provides system-specific clean up of the C runtime environment after
+running Perl interpreters. This should be called only once, after
+freeing any remaining Perl interpreters.
+
+ void PERL_SYS_TERM()
+
+=for hackers
+Found in file perl.h
+
+
+=back
+
=head1 Functions in file pp_ctl.c
=item get_hv
X<get_hv>
-Returns the HV of the specified Perl hash. If C<create> is set and the
-Perl variable does not exist then it will be created. If C<create> is not
-set and the variable does not exist then NULL is returned.
+Returns the HV of the specified Perl hash. C<flags> are passed to
+C<gv_fetchpv>. If C<GV_ADD> is set and the
+Perl variable does not exist then it will be created. If C<flags> is zero
+and the variable does not exist then NULL is returned.
NOTE: the perl_ form of this function is deprecated.
- HV* get_hv(const char* name, I32 create)
+ HV* get_hv(const char *name, I32 flags)
=for hackers
Found in file perl.c
Creates a constant sub equivalent to Perl C<sub FOO () { 123 }> which is
eligible for inlining at compile-time.
+Passing NULL for SV creates a constant sub equivalent to C<sub BAR () {}>,
+which won't be called if used as a destructor, but will suppress the overhead
+of a call to C<AUTOLOAD>. (This form, however, isn't eligible for inlining at
+compile time.)
+
CV* newCONSTSUB(HV* stash, const char* name, SV* sv)
=for hackers
=item get_sv
X<get_sv>
-Returns the SV of the specified Perl scalar. If C<create> is set and the
-Perl variable does not exist then it will be created. If C<create> is not
-set and the variable does not exist then NULL is returned.
+Returns the SV of the specified Perl scalar. C<flags> are passed to
+C<gv_fetchpv>. If C<GV_ADD> is set and the
+Perl variable does not exist then it will be created. If C<flags> is zero
+and the variable does not exist then NULL is returned.
NOTE: the perl_ form of this function is deprecated.
- SV* get_sv(const char* name, I32 create)
+ SV* get_sv(const char *name, I32 flags)
=for hackers
Found in file perl.c
X<SvIOKp>
Returns a U32 value indicating whether the SV contains an integer. Checks
-the B<private> setting. Use C<SvIOK>.
+the B<private> setting. Use C<SvIOK> instead.
U32 SvIOKp(SV* sv)
X<SvNIOKp>
Returns a U32 value indicating whether the SV contains a number, integer or
-double. Checks the B<private> setting. Use C<SvNIOK>.
+double. Checks the B<private> setting. Use C<SvNIOK> instead.
U32 SvNIOKp(SV* sv)
X<SvNOKp>
Returns a U32 value indicating whether the SV contains a double. Checks the
-B<private> setting. Use C<SvNOK>.
+B<private> setting. Use C<SvNOK> instead.
U32 SvNOKp(SV* sv)
X<SvPOKp>
Returns a U32 value indicating whether the SV contains a character string.
-Checks the B<private> setting. Use C<SvPOK>.
+Checks the B<private> setting. Use C<SvPOK> instead.
U32 SvPOKp(SV* sv)
=for hackers
Found in file sv.h
+=item sv_utf8_upgrade_nomg
+X<sv_utf8_upgrade_nomg>
+
+Like C<sv_utf8_upgrade>, but doesn't do magic on C<sv>.
+
+ STRLEN sv_utf8_upgrade_nomg(NN SV *sv)
+
+=for hackers
+Found in file sv.h
+
=back
X<sv_utf8_downgrade>
Attempts to convert the PV of an SV from characters to bytes.
-If the PV contains a character beyond byte, this conversion will fail;
+If the PV contains a character that cannot fit
+in a byte, this conversion will fail;
in this case, either returns false or, if C<fail_ok> is not
true, croaks.
Converts the PV of an SV to its UTF-8-encoded form.
Forces the SV to string form if it is not already.
+Will C<mg_get> on C<sv> if appropriate.
Always sets the SvUTF8 flag to avoid future validity checks even
-if all the bytes have hibit clear.
+if the string is identical whether or not it is encoded in UTF-8.
+Returns the number of bytes in the converted string.
This is not as a general purpose byte encoding to Unicode interface:
use the Encode extension for that.
Converts the PV of an SV to its UTF-8-encoded form.
Forces the SV to string form if it is not already.
Always sets the SvUTF8 flag to avoid future validity checks even
-if all the bytes have hibit clear. If C<flags> has C<SV_GMAGIC> bit set,
-will C<mg_get> on C<sv> if appropriate, else not. C<sv_utf8_upgrade> and
+if all the bytes are invariant in UTF-8. If C<flags> has C<SV_GMAGIC> bit set,
+will C<mg_get> on C<sv> if appropriate, else not.
+Returns the number of bytes in the converted string.
+C<sv_utf8_upgrade> and
C<sv_utf8_upgrade_nomg> are implemented in terms of this function.
This is not as a general purpose byte encoding to Unicode interface:
=for hackers
Found in file sv.c
+=item sv_utf8_upgrade_nomg
+X<sv_utf8_upgrade_nomg>
+
+Like C<sv_utf8_upgrade>, but doesn't do magic on C<sv>.
+
+ STRLEN sv_utf8_upgrade_nomg(SV *sv)
+
+=for hackers
+Found in file sv.c
+
=item sv_vcatpvf
X<sv_vcatpvf>
=item bytes_from_utf8
X<bytes_from_utf8>
-Converts a string C<s> of length C<len> from UTF-8 into byte encoding.
+Converts a string C<s> of length C<len> from UTF-8 into native byte encoding.
Unlike C<utf8_to_bytes> but like C<bytes_to_utf8>, returns a pointer to
the newly-created string, and updates C<len> to contain the new
length. Returns the original string if no conversion occurs, C<len>
is unchanged. Do nothing if C<is_utf8> points to 0. Sets C<is_utf8> to
-0 if C<s> is converted or contains all 7bit characters.
+0 if C<s> is converted or consists entirely of characters that are invariant
+in UTF-8 (i.e., US-ASCII on non-EBCDIC machines).
NOTE: this function is experimental and may change or be
removed without notice.
=item bytes_to_utf8
X<bytes_to_utf8>
-Converts a string C<s> of length C<len> from ASCII into UTF-8 encoding.
+Converts a string C<s> of length C<len> from the native encoding into UTF-8.
Returns a pointer to the newly-created string, and sets C<len> to
reflect the new length.
-If you want to convert to UTF-8 from other encodings than ASCII,
+A NUL character will be written after the end of the string.
+
+If you want to convert to UTF-8 from encodings other than
+the native (Latin1 or EBCDIC),
see sv_recode_to_utf8().
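
As a rough illustration of the conversion described for C<bytes_to_utf8>, the sketch below handles the Latin1 case only. It is not Perl's code: C<sketch_latin1_to_utf8> is a hypothetical name, it uses C<malloc> rather than Perl's allocator, and it does not cover EBCDIC.

```c
#include <stdlib.h>
#include <stddef.h>

/* Hypothetical sketch, not Perl's bytes_to_utf8(). Converts *len bytes of
 * Latin1 at s into a newly allocated UTF-8 buffer, writes a NUL after the
 * end, and updates *len to the UTF-8 length (not counting the NUL).
 * Returns NULL if allocation fails. The caller frees the result. */
static unsigned char *sketch_latin1_to_utf8(const unsigned char *s, size_t *len)
{
    /* Worst case: every byte expands to two bytes, plus the trailing NUL. */
    unsigned char *out = (unsigned char *)malloc(*len * 2 + 1);
    unsigned char *d = out;
    size_t i;

    if (out == NULL)
        return NULL;

    for (i = 0; i < *len; i++) {
        unsigned char c = s[i];
        if (c < 0x80) {
            *d++ = c;                                   /* invariant byte */
        } else {
            *d++ = (unsigned char)(0xC0 | (c >> 6));    /* lead byte */
            *d++ = (unsigned char)(0x80 | (c & 0x3F));  /* continuation byte */
        }
    }
    *d = '\0';
    *len = (size_t)(d - out);
    return out;
}
```

For example, the three Latin1 bytes C<'a'>, 0xE9, C<'z'> become the four UTF-8 bytes C<'a'>, 0xC3, 0xA9, C<'z'>, and the length is updated from 3 to 4.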
NOTE: this function is experimental and may change or be
X<is_utf8_char>
Tests if some arbitrary number of bytes begins in a valid UTF-8
-character. Note that an INVARIANT (i.e. ASCII) character is a valid
-UTF-8 character. The actual number of bytes in the UTF-8 character
-will be returned if it is valid, otherwise 0.
+character. Note that an INVARIANT (i.e. ASCII on non-EBCDIC machines)
+character is a valid UTF-8 character. The actual number of bytes in the UTF-8
+character will be returned if it is valid, otherwise 0.
STRLEN is_utf8_char(const U8 *s)
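
The return convention of C<is_utf8_char> (byte length if valid, otherwise 0) can be illustrated with a simplified check. The function below is a hypothetical sketch, not Perl's implementation: it covers only 1- to 4-byte sequences on a non-EBCDIC machine, does not reject overlong encodings or surrogates, and assumes the buffer is NUL-terminated so that reading continuation bytes stays in bounds.

```c
#include <stddef.h>

/* Hypothetical sketch, not Perl's is_utf8_char(). Returns the length in
 * bytes (1 to 4) of the UTF-8 character starting at s, or 0 if the lead
 * byte or any continuation byte is malformed. Overlong forms and
 * surrogates are NOT rejected. Assumes s is NUL-terminated. */
static size_t sketch_utf8_char_len(const unsigned char *s)
{
    size_t len, i;

    if (s[0] < 0x80)
        return 1;                     /* invariant (US-ASCII) character */
    else if ((s[0] & 0xE0) == 0xC0)
        len = 2;                      /* 110xxxxx lead byte */
    else if ((s[0] & 0xF0) == 0xE0)
        len = 3;                      /* 1110xxxx lead byte */
    else if ((s[0] & 0xF8) == 0xF0)
        len = 4;                      /* 11110xxx lead byte */
    else
        return 0;                     /* stray continuation or invalid lead */

    for (i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80)
            return 0;                 /* not a 10xxxxxx continuation byte */
    }
    return len;
}
```

An invariant character such as C<'A'> reports a length of 1, which mirrors the note above that such characters are valid UTF-8 on their own.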
=item utf8_to_bytes
X<utf8_to_bytes>
-Converts a string C<s> of length C<len> from UTF-8 into byte encoding.
+Converts a string C<s> of length C<len> from UTF-8 into native byte encoding.
Unlike C<bytes_to_utf8>, this over-writes the original string, and
updates len to contain the new length.
Returns zero on failure, setting C<len> to -1.
which is assumed to be in UTF-8 encoding; C<retlen> will be set to the
length, in bytes, of that character.
-This function should only be used when returned UV is considered
+This function should only be used when the returned UV is considered
an index into the Unicode semantic tables (e.g. swashes).
If C<s> does not point to a well-formed UTF-8 character, zero is
If you want to throw an exception object, assign the object to
C<$@> and then pass C<NULL> to croak():
- errsv = get_sv("@", TRUE);
+ errsv = get_sv("@", GV_ADD);
sv_setsv(errsv, exception_object);
croak(NULL);