An SV can be created and loaded with one command. There are five types of
values that can be loaded: an integer value (IV), an unsigned integer
value (UV), a double (NV), a string (PV), and another scalar (SV).
+("PV" stands for "Pointer Value". You might think that it is misnamed
+because it is described as pointing only to strings. However, it is
+possible to have it point to other things. For example, inversion
+lists, used in regular expression data structures, are scalars, each
+consisting of an array of UVs which are accessed through PVs. But
+using a PV for non-strings requires care, because much of the internals
+assumes that PVs are just for strings. Often, for
+example, a trailing NUL is tacked on automatically. The non-string use
+is documented only in this paragraph.)
The seven routines are:
SV *s;
STRLEN len;
- char * ptr;
+ char *ptr;
ptr = SvPV(s, len);
foo(ptr, len);
Here are some other functions:
- I32 av_len(AV*);
+ I32 av_top(AV*);
SV** av_fetch(AV*, I32 key, I32 lval);
SV** av_store(AV*, I32 key, SV* val);
-The C<av_len> function returns the highest index value in an array (just
+The C<av_top> function returns the highest index value in an array (just
like $#array in Perl). If the array is empty, -1 is returned. The
C<av_fetch> function returns the value at index C<key>, but if C<lval>
is non-zero, then C<av_fetch> will store an undef value at that index.
This returns NULL if the variable does not exist.
-The hash algorithm is defined in the C<PERL_HASH(hash, key, klen)> macro:
+The hash algorithm is defined in the C<PERL_HASH> macro:
- hash = 0;
- while (klen--)
- hash = (hash * 33) + *key++;
- hash = hash + (hash >> 5); /* after 5.6 */
+ PERL_HASH(hash, key, klen)
-The last step was added in version 5.6 to improve distribution of
-lower bits in the resulting hash value.
+The exact implementation of this macro varies by architecture and version
+of perl, and the return value may change per invocation of perl, so the
+value is only valid for the duration of a single perl process.
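To illustrate why a hash value should not be stored across runs, here is a minimal standalone sketch. It is not Perl's C<PERL_HASH> implementation: the function C<toy_seeded_hash> and its C<seed> parameter are hypothetical, and the times-33 recurrence is borrowed from the removed lines purely for illustration. The point is only that the same key yields a different value once a per-process seed changes.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical illustration only: this is NOT Perl's PERL_HASH.
 * It mixes an assumed per-process seed into a simple times-33 hash. */
uint32_t toy_seeded_hash(uint32_t seed, const char *key, size_t klen)
{
    uint32_t hash = seed;

    /* Fold each key byte into the running value. */
    while (klen--)
        hash = (hash * 33u) + (unsigned char)*key++;

    return hash;
}
```

Calling C<toy_seeded_hash(seed, "foo", 3)> twice with the same seed gives the same value, while two different seeds will generally disagree, so a value computed in one process says nothing about the value in another.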
See L<Understanding the Magic of Tied Hashes and Arrays> for more
information on how to use the hash access functions on tied hashes.
The most useful types that will be returned are:
- SVt_IV Scalar
- SVt_NV Scalar
- SVt_PV Scalar
- SVt_RV Scalar
- SVt_PVAV Array
- SVt_PVHV Hash
- SVt_PVCV Code
- SVt_PVGV Glob (possibly a file handle)
- SVt_PVMG Blessed or Magical Scalar
+ < SVt_PVAV Scalar
+ SVt_PVAV Array
+ SVt_PVHV Hash
+ SVt_PVCV Code
+ SVt_PVGV Glob (possibly a file handle)
-See the F<sv.h> header file for more details.
+See L<perlapi/svtype> for more details.
=head2 Blessed References and Class Objects
=for mg_vtable.pl begin
mg_type
- (old-style char and macro) MGVTBL Type of magic
- -------------------------- ------ -------------
- \0 PERL_MAGIC_sv vtbl_sv Special scalar variable
- # PERL_MAGIC_arylen vtbl_arylen Array length ($#ary)
- % PERL_MAGIC_rhash (none) extra data for restricted
- hashes
- . PERL_MAGIC_pos vtbl_pos pos() lvalue
- : PERL_MAGIC_symtab (none) extra data for symbol
- tables
- < PERL_MAGIC_backref vtbl_backref for weak ref data
- @ PERL_MAGIC_arylen_p (none) to move arylen out of
- XPVAV
- A PERL_MAGIC_overload vtbl_amagic %OVERLOAD hash
- a PERL_MAGIC_overload_elem vtbl_amagicelem %OVERLOAD hash element
- B PERL_MAGIC_bm vtbl_regexp Boyer-Moore
- (fast string search)
- c PERL_MAGIC_overload_table vtbl_ovrld Holds overload table
- (AMT) on stash
- D PERL_MAGIC_regdata vtbl_regdata Regex match position data
- (@+ and @- vars)
- d PERL_MAGIC_regdatum vtbl_regdatum Regex match position data
- element
- E PERL_MAGIC_env vtbl_env %ENV hash
- e PERL_MAGIC_envelem vtbl_envelem %ENV hash element
- f PERL_MAGIC_fm vtbl_regdata Formline
- ('compiled' format)
- G PERL_MAGIC_study vtbl_regexp study()ed string
- g PERL_MAGIC_regex_global vtbl_mglob m//g target
- H PERL_MAGIC_hints vtbl_hints %^H hash
- h PERL_MAGIC_hintselem vtbl_hintselem %^H hash element
- I PERL_MAGIC_isa vtbl_isa @ISA array
- i PERL_MAGIC_isaelem vtbl_isaelem @ISA array element
- k PERL_MAGIC_nkeys vtbl_nkeys scalar(keys()) lvalue
- L PERL_MAGIC_dbfile (none) Debugger %_<filename
- l PERL_MAGIC_dbline vtbl_dbline Debugger %_<filename
- element
- N PERL_MAGIC_shared (none) Shared between threads
- n PERL_MAGIC_shared_scalar (none) Shared between threads
- o PERL_MAGIC_collxfrm vtbl_collxfrm Locale transformation
- P PERL_MAGIC_tied vtbl_pack Tied array or hash
- p PERL_MAGIC_tiedelem vtbl_packelem Tied array or hash element
- q PERL_MAGIC_tiedscalar vtbl_packelem Tied scalar or handle
- r PERL_MAGIC_qr vtbl_regexp precompiled qr// regex
- S PERL_MAGIC_sig (none) %SIG hash
- s PERL_MAGIC_sigelem vtbl_sigelem %SIG hash element
- t PERL_MAGIC_taint vtbl_taint Taintedness
- U PERL_MAGIC_uvar vtbl_uvar Available for use by
- extensions
- u PERL_MAGIC_uvar_elem (none) Reserved for use by
- extensions
- V PERL_MAGIC_vstring vtbl_vstring SV was vstring literal
- v PERL_MAGIC_vec vtbl_vec vec() lvalue
- w PERL_MAGIC_utf8 vtbl_utf8 Cached UTF-8 information
- x PERL_MAGIC_substr vtbl_substr substr() lvalue
- y PERL_MAGIC_defelem vtbl_defelem Shadow "foreach" iterator
- variable / smart parameter
- vivification
- ] PERL_MAGIC_checkcall (none) inlining/mutation of call
- to this CV
- ~ PERL_MAGIC_ext (none) Available for use by
- extensions
+ (old-style char and macro) MGVTBL Type of magic
+ -------------------------- ------ -------------
+ \0 PERL_MAGIC_sv vtbl_sv Special scalar variable
+ # PERL_MAGIC_arylen vtbl_arylen Array length ($#ary)
+ % PERL_MAGIC_rhash (none) extra data for restricted
+ hashes
+ & PERL_MAGIC_proto (none) my sub prototype CV
+ . PERL_MAGIC_pos vtbl_pos pos() lvalue
+ : PERL_MAGIC_symtab (none) extra data for symbol
+ tables
+ < PERL_MAGIC_backref vtbl_backref for weak ref data
+ @ PERL_MAGIC_arylen_p (none) to move arylen out of XPVAV
+ B PERL_MAGIC_bm vtbl_regexp Boyer-Moore
+ (fast string search)
+ c PERL_MAGIC_overload_table vtbl_ovrld Holds overload table
+ (AMT) on stash
+ D PERL_MAGIC_regdata vtbl_regdata Regex match position data
+ (@+ and @- vars)
+ d PERL_MAGIC_regdatum vtbl_regdatum Regex match position data
+ element
+ E PERL_MAGIC_env vtbl_env %ENV hash
+ e PERL_MAGIC_envelem vtbl_envelem %ENV hash element
+ f PERL_MAGIC_fm vtbl_regexp Formline
+ ('compiled' format)
+ g PERL_MAGIC_regex_global vtbl_mglob m//g target
+ H PERL_MAGIC_hints vtbl_hints %^H hash
+ h PERL_MAGIC_hintselem vtbl_hintselem %^H hash element
+ I PERL_MAGIC_isa vtbl_isa @ISA array
+ i PERL_MAGIC_isaelem vtbl_isaelem @ISA array element
+ k PERL_MAGIC_nkeys vtbl_nkeys scalar(keys()) lvalue
+ L PERL_MAGIC_dbfile (none) Debugger %_<filename
+ l PERL_MAGIC_dbline vtbl_dbline Debugger %_<filename
+ element
+ N PERL_MAGIC_shared (none) Shared between threads
+ n PERL_MAGIC_shared_scalar (none) Shared between threads
+ o PERL_MAGIC_collxfrm vtbl_collxfrm Locale transformation
+ P PERL_MAGIC_tied vtbl_pack Tied array or hash
+ p PERL_MAGIC_tiedelem vtbl_packelem Tied array or hash element
+ q PERL_MAGIC_tiedscalar vtbl_packelem Tied scalar or handle
+ r PERL_MAGIC_qr vtbl_regexp precompiled qr// regex
+ S PERL_MAGIC_sig (none) %SIG hash
+ s PERL_MAGIC_sigelem vtbl_sigelem %SIG hash element
+ t PERL_MAGIC_taint vtbl_taint Taintedness
+ U PERL_MAGIC_uvar vtbl_uvar Available for use by
+ extensions
+ u PERL_MAGIC_uvar_elem (none) Reserved for use by
+ extensions
+ V PERL_MAGIC_vstring (none) SV was vstring literal
+ v PERL_MAGIC_vec vtbl_vec vec() lvalue
+ w PERL_MAGIC_utf8 vtbl_utf8 Cached UTF-8 information
+ x PERL_MAGIC_substr vtbl_substr substr() lvalue
+ y PERL_MAGIC_defelem vtbl_defelem Shadow "foreach" iterator
+ variable / smart parameter
+ vivification
+ ] PERL_MAGIC_checkcall vtbl_checkcall inlining/mutation of call
+ to this CV
+ ~ PERL_MAGIC_ext (none) Available for use by
+ extensions
=for mg_vtable.pl end
I32 call_sv(SV*, I32);
I32 call_pv(const char*, I32);
I32 call_method(const char*, I32);
- I32 call_argv(const char*, I32, register char**);
+ I32 call_argv(const char*, I32, char**);
The routine most often used is C<call_sv>. The C<SV*> argument
contains either the name of the Perl subroutine to be called, or a
In general, you either have to know what you're dealing with, or you
have to guess. The API function C<is_utf8_string> can help; it'll tell
you if a string contains only valid UTF-8 characters. However, it can't
-do the work for you. On a character-by-character basis, C<is_utf8_char>
+do the work for you. On a character-by-character basis,
+C<is_utf8_char_buf>
will tell you whether the current character in a string is valid UTF-8.
=head2 How does UTF-8 represent Unicode characters?
whether the byte can be encoded as a single byte even in UTF-8):
U8 *utf;
+ U8 *utf_end; /* 1 beyond buffer pointed to by utf */
UV uv; /* Note: a UV, not a U8, not a char */
+ STRLEN len; /* length of character in bytes */
if (!UTF8_IS_INVARIANT(*utf))
/* Must treat this as UTF-8 */
- uv = utf8_to_uv(utf);
+ uv = utf8_to_uvchr_buf(utf, utf_end, &len);
else
/* OK to treat this character as a byte */
uv = *utf;
-You can also see in that example that we use C<utf8_to_uv> to get the
-value of the character; the inverse function C<uv_to_utf8> is available
+You can also see in that example that we use C<utf8_to_uvchr_buf> to get the
+value of the character; the inverse function C<uvchr_to_utf8> is available
for putting a UV into UTF-8:
if (!UTF8_IS_INVARIANT(uv))
/* Must treat this as UTF8 */
- utf8 = uv_to_utf8(utf8, uv);
+ utf8 = uvchr_to_utf8(utf8, uv);
else
/* OK to treat this character as a byte */
*utf8++ = uv;
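The two fragments above can be mirrored outside of perl with a small standalone sketch. The helper names (C<is_invariant_sketch>, C<encode_utf8_sketch>, C<decode_utf8_sketch>) are hypothetical and are not part of the Perl API. The sketch assumes an ASCII platform, where values below 0x80 are invariant, and it does not reject overlong forms or surrogates as a real decoder should. It only imitates the calling shape of the buffer-bounded decode and of an encoder that returns the advanced pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers, not Perl API. Assumes an ASCII platform,
 * where a value below 0x80 is the same byte with or without UTF-8. */
int is_invariant_sketch(uint32_t cp)
{
    return cp < 0x80;
}

/* Write cp as UTF-8 at dst and return a pointer one past the last
 * byte written. The caller must supply room for at least 4 bytes. */
uint8_t *encode_utf8_sketch(uint8_t *dst, uint32_t cp)
{
    assert(dst != NULL && cp <= 0x10FFFF);

    if (is_invariant_sketch(cp)) {
        *dst++ = (uint8_t)cp;                       /* single byte */
    } else if (cp < 0x800) {
        *dst++ = (uint8_t)(0xC0 | (cp >> 6));
        *dst++ = (uint8_t)(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        *dst++ = (uint8_t)(0xE0 | (cp >> 12));
        *dst++ = (uint8_t)(0x80 | ((cp >> 6) & 0x3F));
        *dst++ = (uint8_t)(0x80 | (cp & 0x3F));
    } else {
        *dst++ = (uint8_t)(0xF0 | (cp >> 18));
        *dst++ = (uint8_t)(0x80 | ((cp >> 12) & 0x3F));
        *dst++ = (uint8_t)(0x80 | ((cp >> 6) & 0x3F));
        *dst++ = (uint8_t)(0x80 | (cp & 0x3F));
    }
    return dst;
}

/* Decode one character from [s, end) and store its byte length in
 * *len. On truncated or malformed input, returns 0 with *len == 0. */
uint32_t decode_utf8_sketch(const uint8_t *s, const uint8_t *end,
                            size_t *len)
{
    uint32_t cp;
    size_t need, i;

    assert(s != NULL && end != NULL && len != NULL);
    *len = 0;
    if (s >= end)
        return 0;

    /* The lead byte determines how many bytes the character uses. */
    if (*s < 0x80)                { cp = *s;        need = 1; }
    else if ((*s & 0xE0) == 0xC0) { cp = *s & 0x1F; need = 2; }
    else if ((*s & 0xF0) == 0xE0) { cp = *s & 0x0F; need = 3; }
    else if ((*s & 0xF8) == 0xF0) { cp = *s & 0x07; need = 4; }
    else
        return 0;                 /* stray continuation or bad lead */

    if ((size_t)(end - s) < need)
        return 0;                 /* would read past the buffer */

    for (i = 1; i < need; i++) {
        if ((s[i] & 0xC0) != 0x80)
            return 0;             /* not a continuation byte */
        cp = (cp << 6) | (s[i] & 0x3F);
    }
    *len = need;
    return cp;
}
```

Passing the end of the buffer to the decoder is the design point here: without it, a truncated multi-byte character at the end of the input would cause a read beyond the buffer.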
=item *
-If a string is UTF-8, B<always> use C<utf8_to_uv> to get at the value,
+If a string is UTF-8, B<always> use C<utf8_to_uvchr_buf> to get at the value,
unless C<UTF8_IS_INVARIANT(*s)> in which case you can use C<*s>.
=item *
When writing a character C<uv> to a UTF-8 string, B<always> use
-C<uv_to_utf8>, unless C<UTF8_IS_INVARIANT(uv))> in which case
+C<uvchr_to_utf8>, unless C<UTF8_IS_INVARIANT(uv)> in which case
you can use C<*s = uv>.
=item *