value (UV), a double (NV), a string (PV), and another scalar (SV).
("PV" stands for "Pointer Value". You might think that it is misnamed
because it is described as pointing only to strings. However, it is
-possible to have it point to other things. For example, inversion
-lists, used in regular expression data structures, are scalars, each
-consisting of an array of UVs which are accessed through PVs. But,
+possible to have it point to other things. For example, it could point
+to an array of UVs. But,
using it for non-strings requires care, as the underlying assumption of
much of the internals is that PVs are just for strings. Often, for
example, a trailing NUL is tacked on automatically. The non-string use
F<config.h>) guaranteed to be large enough to represent the size of
any string that perl can handle.
-In the unlikely case of a SV requiring more complex initialisation, you
+In the unlikely case of a SV requiring more complex initialization, you
can create an empty SV with newSV(len). If C<len> is 0 an empty SV of
type NULL is returned, else an SV of type PV is returned with len + 1 (for
the NUL) bytes of storage allocated, accessible via SvPVX. In both cases
pointer. The efficiency comes by means of a little hack: instead of
actually removing the characters, C<sv_chop> sets the flag C<OOK>
(offset OK) to signal to other functions that the offset hack is in
-effect, and it puts the number of bytes chopped off into the IV field
-of the SV. It then moves the PV pointer (called C<SvPVX>) forward that
-many bytes, and adjusts C<SvCUR> and C<SvLEN>.
+effect, and it moves the PV pointer (called C<SvPVX>) forward
+by the number of bytes chopped off, and adjusts C<SvCUR> and C<SvLEN>
+accordingly. (A portion of the space between the old and new PV
+pointers is used to store the count of chopped bytes.)
Hence, at this point, the start of the buffer that we allocated lives
at C<SvPVX(sv) - SvIV(sv)> in memory and the PV pointer is pointing
Here are some other functions:
- I32 av_len(AV*);
+ I32 av_top_index(AV*);
SV** av_fetch(AV*, I32 key, I32 lval);
SV** av_store(AV*, I32 key, SV* val);
-The C<av_len> function returns the highest index value in an array (just
+The C<av_top_index> function returns the highest index value in an array (just
like $#array in Perl). If the array is empty, -1 is returned. The
C<av_fetch> function returns the value at index C<key>, but if C<lval>
is non-zero, then C<av_fetch> will store an undef value at that index.
This returns NULL if the variable does not exist.
-The hash algorithm is defined in the C<PERL_HASH(hash, key, klen)> macro:
+The hash algorithm is defined in the C<PERL_HASH> macro:
- hash = 0;
- while (klen--)
- hash = (hash * 33) + *key++;
- hash = hash + (hash >> 5); /* after 5.6 */
+ PERL_HASH(hash, key, klen)
-The last step was added in version 5.6 to improve distribution of
-lower bits in the resulting hash value.
+The exact implementation of this macro varies by architecture and version
+of perl, and the return value may differ from one run of perl to the next,
+so the value is only valid for the duration of a single perl process.
See L<Understanding the Magic of Tied Hashes and Arrays> for more
information on how to use the hash access functions on tied hashes.
The most useful types that will be returned are:
- SVt_IV Scalar
- SVt_NV Scalar
- SVt_PV Scalar
- SVt_RV Scalar
- SVt_PVAV Array
- SVt_PVHV Hash
- SVt_PVCV Code
- SVt_PVGV Glob (possibly a file handle)
- SVt_PVMG Blessed or Magical Scalar
+ < SVt_PVAV Scalar
+ SVt_PVAV Array
+ SVt_PVHV Hash
+ SVt_PVCV Code
+ SVt_PVGV Glob (possibly a file handle)
-See the F<sv.h> header file for more details.
+See L<perlapi/svtype> for more details.
=head2 Blessed References and Class Objects
# PERL_MAGIC_arylen vtbl_arylen Array length ($#ary)
% PERL_MAGIC_rhash (none) extra data for restricted
hashes
+ & PERL_MAGIC_proto (none) my sub prototype CV
. PERL_MAGIC_pos vtbl_pos pos() lvalue
: PERL_MAGIC_symtab (none) extra data for symbol
tables
element
E PERL_MAGIC_env vtbl_env %ENV hash
e PERL_MAGIC_envelem vtbl_envelem %ENV hash element
- f PERL_MAGIC_fm vtbl_regdata Formline
+ f PERL_MAGIC_fm vtbl_regexp Formline
('compiled' format)
g PERL_MAGIC_regex_global vtbl_mglob m//g target
H PERL_MAGIC_hints vtbl_hints %^H hash
extensions
u PERL_MAGIC_uvar_elem (none) Reserved for use by
extensions
- V PERL_MAGIC_vstring vtbl_vstring SV was vstring literal
+ V PERL_MAGIC_vstring (none) SV was vstring literal
v PERL_MAGIC_vec vtbl_vec vec() lvalue
w PERL_MAGIC_utf8 vtbl_utf8 Cached UTF-8 information
x PERL_MAGIC_substr vtbl_substr substr() lvalue
const char *subname = SvPVX(cv);
STRLEN name_length = SvCUR(cv); /* in bytes */
U32 is_utf8 = SvUTF8(cv);
-
+
C<SvPVX(cv)> contains just the sub name itself, not including the package.
For an AUTOLOAD routine in UNIVERSAL or one of its superclasses,
C<CvSTASH(cv)> returns NULL during a method call on a nonexistent package.
I32 call_sv(SV*, I32);
I32 call_pv(const char*, I32);
I32 call_method(const char*, I32);
- I32 call_argv(const char*, I32, register char**);
+ I32 call_argv(const char*, I32, char**);
The routine most often used is C<call_sv>. The C<SV*> argument
contains either the name of the Perl subroutine to be called, or a
static void my_peep(pTHX_ OP *o)
{
/* custom per-subroutine optimisation goes here */
- prev_peepp(o);
+ prev_peepp(aTHX_ o);
/* custom per-subroutine optimisation may also go here */
}
BOOT:
for(; o; o = o->op_next) {
/* custom per-op optimisation goes here */
}
- prev_rpeepp(orig_o);
+ prev_rpeepp(aTHX_ orig_o);
}
BOOT:
prev_rpeepp = PL_rpeepp;
the function Perl_GetVars(). The PERL_GLOBAL_STRUCT_PRIVATE goes
one step further, there is still a single struct (allocated in main()
either from heap or from stack) but there are no global data symbols
-pointing to it. In either case the global struct should be initialised
+pointing to it. In either case the global struct should be initialized
as the very first thing in main() using Perl_init_global_struct() and
correspondingly tear it down after perl_free() using Perl_free_global_struct(),
please see F<miniperlmain.c> for usage details. You may also need
In general, you either have to know what you're dealing with, or you
have to guess. The API function C<is_utf8_string> can help; it'll tell
you if a string contains only valid UTF-8 characters. However, it can't
-do the work for you. On a character-by-character basis, XXX C<is_utf8_char>
+do the work for you. On a character-by-character basis,
+C<is_utf8_char_buf>
will tell you whether the current character in a string is valid UTF-8.
=head2 How does UTF-8 represent Unicode characters?
All bytes in a multi-byte UTF-8 character will have the high bit set,
so you can test if you need to do something special with this
character like this (the UTF8_IS_INVARIANT() is a macro that tests
-whether the byte can be encoded as a single byte even in UTF-8):
+whether the byte is encoded as a single byte even in UTF-8):
U8 *utf;
U8 *utf_end; /* 1 beyond buffer pointed to by utf */
=head1 Custom Operators
-Custom operator support is a new experimental feature that allows you to
+Custom operator support is an experimental feature that allows you to
define your own ops. This is primarily to allow the building of
interpreters for other languages in the Perl core, but it also allows
optimizations through the creation of "macro-ops" (ops which perform the
a custom peephole optimizer with the C<optimize> module.
When you do this, you replace ordinary Perl ops with custom ops by
-creating ops with the type C<OP_CUSTOM> and the C<pp_addr> of your own
+creating ops with the type C<OP_CUSTOM> and the C<op_ppaddr> of your own
PP function. This should be defined in XS code, and should look like
the PP ops in C<pp_*.c>. You are responsible for ensuring that your op
takes the appropriate number of values from the stack, and you are