=head1 NAME

perlguts - Perl's Internal Functions

=head1 DESCRIPTION

This document attempts to describe some of the internal functions of the
Perl executable. It is far from complete and probably contains many errors.
Please refer any questions or comments to the author below.

=head1 Datatypes

Perl has three typedefs that handle Perl's three main data types:

    SV  Scalar Value
    AV  Array Value
    HV  Hash Value

Each typedef has specific routines that manipulate the corresponding data
type.

=head2 What is an "IV"?

Perl uses a special typedef IV which is large enough to hold either an
integer or a pointer.

Perl also uses a special typedef I32 which will always be a 32-bit integer.

=head2 Working with SV's

An SV can be created and loaded with one command. There are four types of
values that can be loaded: an integer value (IV), a double (NV), a string
(PV), and another scalar (SV).

The four routines are:

    SV* newSViv(IV);
    SV* newSVnv(double);
    SV* newSVpv(char*, int);
    SV* newSVsv(SV*);

To change the value of an *already-existing* scalar, there are five routines:

    void sv_setiv(SV*, IV);
    void sv_setnv(SV*, double);
    void sv_setpvn(SV*, char*, int);
    void sv_setpv(SV*, char*);
    void sv_setsv(SV*, SV*);

Notice that you can choose to specify the length of the string to be
assigned by using C<sv_setpvn>, or allow Perl to calculate the length by
using C<sv_setpv>. Be warned, though, that C<sv_setpv> determines the
string's length by using C<strlen>, which depends on the string terminating
with a NUL character.

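The difference between the two string setters shows up as soon as the data
contains an embedded NUL. The following stand-alone C sketch does not use
the Perl API: C<struct toy_pv>, C<toy_setpvn> and C<toy_setpv> are
hypothetical names that only imitate the counted-length and C<strlen>
styles described above.

```c
#include <stddef.h>
#include <string.h>

#define BUF_MAX 64

/* Toy string holder: bytes plus an explicit length (not a real SV). */
struct toy_pv {
    char   buf[BUF_MAX];
    size_t cur;
};

/* sv_setpvn style: the caller supplies the length, so embedded NUL
 * bytes are copied like any other byte. Returns 0, or -1 if too long. */
int toy_setpvn(struct toy_pv *pv, const char *s, size_t len)
{
    if (len >= BUF_MAX)
        return -1;
    memcpy(pv->buf, s, len);
    pv->buf[len] = '\0';
    pv->cur = len;
    return 0;
}

/* sv_setpv style: the length comes from strlen, so copying stops at
 * the first NUL byte. */
int toy_setpv(struct toy_pv *pv, const char *s)
{
    return toy_setpvn(pv, s, strlen(s));
}
```

Given the five bytes C<"ab\0cd">, the counted version keeps all five,
while the C<strlen> version keeps only the first two.
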
To access the actual value that an SV points to, you can use the macros:

    SvIV(SV*)
    SvNV(SV*)
    SvPV(SV*, STRLEN len)

which will automatically coerce the actual scalar type into an IV, double,
or string.

In the C<SvPV> macro, the length of the string returned is placed into the
variable C<len> (this is a macro, so you do I<not> use C<&len>). If you do not
care what the length of the data is, use the global variable C<na>. Remember,
however, that Perl allows arbitrary strings of data that may both contain
NUL's and not be terminated by a NUL.

If you simply want to know if the scalar value is TRUE, you can use:

    SvTRUE(SV*)

Although Perl will automatically grow strings for you, if you need to force
Perl to allocate more memory for your SV, you can use the macro

    SvGROW(SV*, STRLEN newlen)

which will determine if more memory needs to be allocated. If so, it will
call the function C<sv_grow>. Note that C<SvGROW> can only increase, not
decrease, the allocated memory of an SV.

If you have an SV and want to know what kind of data Perl thinks is stored
in it, you can use the following macros to check the type of SV you have.

    SvIOK(SV*)
    SvNOK(SV*)
    SvPOK(SV*)

You can get and set the current length of the string stored in an SV with
the following macros:

    SvCUR(SV*)
    SvCUR_set(SV*, I32 val)

But note that these are valid only if C<SvPOK()> is true.

If you know the name of a scalar variable, you can get a pointer to its SV
by using the following:

    SV* perl_get_sv("varname", FALSE);

This returns NULL if the variable does not exist.

If you want to know if this variable (or any other SV) is actually defined,
you can call:

    SvOK(SV*)

The scalar C<undef> value is stored in an SV instance called C<sv_undef>. Its
address can be used whenever an C<SV*> is needed.

There are also the two values C<sv_yes> and C<sv_no>, which contain Boolean
TRUE and FALSE values, respectively. Like C<sv_undef>, their addresses can
be used whenever an C<SV*> is needed.

Do not be fooled into thinking that C<(SV *) 0> is the same as C<&sv_undef>.
Take this code:

    SV* sv = (SV*) 0;
    if (I-am-to-return-a-real-value) {
        sv = sv_2mortal(newSViv(42));
    }
    sv_setsv(ST(0), sv);

This code tries to return a new SV (which contains the value 42) if it should
return a real value, or undef otherwise. Instead it has returned a null
pointer which, somewhere down the line, will cause a segmentation violation,
or just weird results. Change the zero to C<&sv_undef> in the first line and
all will be well.

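The same idea can be shown outside the Perl API as a "sentinel object"
pattern. This is only a sketch under stated assumptions: C<struct toy_sv>,
C<toy_undef>, C<toy_setsv> and C<toy_pick> are hypothetical stand-ins, and
the real C<sv_undef> and C<sv_setsv> are considerably more involved.

```c
#include <stddef.h>

/* Toy scalar: a defined flag plus an integer payload (not a real SV). */
struct toy_sv {
    int  defined;   /* 0 for the undef sentinel */
    long iv;
};

/* One shared "undef" object, playing the role of sv_undef. */
static struct toy_sv toy_undef = { 0, 0 };

/* Copy src into dst. Like any routine that takes a pointer to a value,
 * it assumes src points at a real object and would crash on NULL. */
void toy_setsv(struct toy_sv *dst, const struct toy_sv *src)
{
    dst->defined = src->defined;
    dst->iv = src->iv;
}

/* Choose what to hand back: the real value, or the undef sentinel.
 * The result is never a null pointer. */
const struct toy_sv *toy_pick(const struct toy_sv *value, int want_value)
{
    return want_value ? value : &toy_undef;
}
```

Because C<toy_pick> always returns the address of a real object, the copy
routine never has to dereference a null pointer.
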
To free an SV that you've created, call C<SvREFCNT_dec(SV*)>. Normally this
call is not necessary. See the section on B<Mortality>.

=head2 Private and Public Values

Recall that the usual method of determining the type of scalar you have is
to use the C<Sv[INP]OK> macros. Since a scalar can be both a number and a
string, these macros will usually return TRUE, and calling the C<Sv[INP]V>
macros will do the appropriate conversion of string to integer/double or
integer/double to string.

If you I<really> need to know if you have an integer, double, or string
pointer in an SV, you can use the following three macros instead:

    SvIOKp(SV*)
    SvNOKp(SV*)
    SvPOKp(SV*)

These will tell you if you truly have an integer, double, or string pointer
stored in your SV.

In general, though, it's best to just use the C<Sv[INP]V> macros.

=head2 Working with AV's

There are two ways to create and load an AV. The first method just creates
an empty AV:

    AV* newAV();

The second method both creates the AV and initially populates it with SV's:

    AV* av_make(I32 num, SV **ptr);

The second argument points to an array containing C<num> C<SV*>'s.

Once the AV has been created, the following operations are possible on AV's:

    void av_push(AV*, SV*);
    SV*  av_pop(AV*);
    SV*  av_shift(AV*);
    void av_unshift(AV*, I32 num);

These should be familiar operations, with the exception of C<av_unshift>.
This routine adds C<num> elements at the front of the array with the C<undef>
value. You must then use C<av_store> (described below) to assign values
to these new elements.

Here are some other functions:

    I32   av_len(AV*);
            /* Returns the highest index in the array, or -1 if the
               array is empty */

    SV**  av_fetch(AV*, I32 key, I32 lval);
            /* Fetches value at key offset; a non-zero lval indicates
               that the fetch is part of a store operation */
    SV**  av_store(AV*, I32 key, SV* val);
            /* Stores val at offset key */

    void  av_clear(AV*);
            /* Clear out all elements, but leave the array */
    void  av_undef(AV*);
            /* Undefines the array, removing all elements */

If you know the name of an array variable, you can get a pointer to its AV
by using the following:

    AV* perl_get_av("varname", FALSE);

This returns NULL if the variable does not exist.

=head2 Working with HV's

To create an HV, you use the following routine:

    HV* newHV();

Once the HV has been created, the following operations are possible on HV's:

    SV** hv_store(HV*, char* key, U32 klen, SV* val, U32 hash);
    SV** hv_fetch(HV*, char* key, U32 klen, I32 lval);

The C<klen> parameter is the length of the key being passed in. The C<val>
argument contains the SV pointer to the scalar being stored, and C<hash> is
the pre-computed hash value (zero if you want C<hv_store> to calculate it
for you). The C<lval> parameter indicates whether this fetch is actually a
part of a store operation.

Remember that C<hv_store> and C<hv_fetch> return C<SV**>'s and not just
C<SV*>. In order to access the scalar value, you must first dereference
the return value. However, you should check to make sure that the return
value is not NULL before dereferencing it.

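The "check, then dereference" habit for a pointer-to-pointer return value
can be illustrated without the Perl API. In this sketch C<struct toy_hv>,
C<toy_fetch> and C<toy_fetch_iv> are hypothetical; the toy table also
assumes NUL-terminated keys for brevity, whereas C<hv_fetch> takes an
explicit C<klen>.

```c
#include <stddef.h>
#include <string.h>

#define TOY_SLOTS 4

struct toy_sv { long iv; };

/* Toy table: parallel arrays of keys and value pointers (not an HV). */
struct toy_hv {
    const char    *keys[TOY_SLOTS];
    struct toy_sv *vals[TOY_SLOTS];
    size_t         used;
};

/* Like hv_fetch, return the address of the slot that holds the value
 * pointer, or NULL when the key is absent. */
struct toy_sv **toy_fetch(struct toy_hv *hv, const char *key)
{
    size_t i;

    for (i = 0; i < hv->used; i++)
        if (strcmp(hv->keys[i], key) == 0)
            return &hv->vals[i];
    return NULL;
}

/* Caller-side pattern: test the returned pointer before using it. */
long toy_fetch_iv(struct toy_hv *hv, const char *key, long fallback)
{
    struct toy_sv **svp = toy_fetch(hv, key);

    if (svp == NULL || *svp == NULL)
        return fallback;
    return (*svp)->iv;
}
```
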
These two functions check whether a hash table entry exists and delete it,
respectively.

    bool hv_exists(HV*, char* key, U32 klen);
    SV*  hv_delete(HV*, char* key, U32 klen);

And more miscellaneous functions:

    void  hv_clear(HV*);
            /* Clears all entries in hash table */
    void  hv_undef(HV*);
            /* Undefines the hash table */

    I32   hv_iterinit(HV*);
            /* Prepares starting point to traverse hash table */
    HE*   hv_iternext(HV*);
            /* Get the next entry, and return a pointer to a
               structure that has both the key and value */
    char* hv_iterkey(HE* entry, I32* retlen);
            /* Get the key from an HE structure and also return
               the length of the key string */
    SV*   hv_iterval(HV*, HE* entry);
            /* Return a SV pointer to the value of the HE
               structure */

If you know the name of a hash variable, you can get a pointer to its HV
by using the following:

    HV* perl_get_hv("varname", FALSE);

This returns NULL if the variable does not exist.

The hash algorithm, for those who are interested, is:

    i = klen;
    hash = 0;
    s = key;
    while (i--)
        hash = hash * 33 + *s++;

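Wrapped in a function, the loop above might look like the following
sketch. The name C<toy_hash> and the C<uint32_t> result type are
assumptions made here so that the fragment compiles on its own; the bytes
are read through a plain C<char> pointer as in the listing, so results for
bytes above 127 depend on whether C<char> is signed on the platform.

```c
#include <stddef.h>
#include <stdint.h>

/* Times-33 hash over exactly klen bytes, following the loop above.
 * Embedded NUL bytes are hashed like any other byte. */
uint32_t toy_hash(const char *key, size_t klen)
{
    uint32_t hash = 0;
    const char *s = key;
    size_t i = klen;

    while (i--)
        hash = hash * 33 + *s++;
    return hash;
}
```

For the one-byte key C<"a"> this yields 97, and for C<"ab"> it yields
97 * 33 + 98 = 3299.
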
=head2 References

References are a special type of scalar that point to other scalar types
(including references). To treat an AV or HV as a scalar, it is simply
a matter of casting an C<AV*> or C<HV*> to an C<SV*>.

To create a reference, use the following command:

    SV* newRV((SV*) pointer);

Once you have a reference, you can use the following macro to dereference
it:

    SvRV(SV*)

then call the appropriate routines, casting the returned C<SV*> to either an
C<AV*> or C<HV*> if the referent is an array or a hash.

To determine, after dereferencing a reference, if you still have a reference,
you can use the following macro:

    SvROK(SV*)

=head1 XSUB's and the Argument Stack

The XSUB mechanism is a simple way for Perl programs to access C subroutines.
An XSUB routine will have a stack that contains the arguments from the Perl
program, and a way to map from the Perl data structures to a C equivalent.

The stack arguments are accessible through the C<ST(n)> macro, which returns
the C<n>'th stack argument. Argument 0 is the first argument passed in the
Perl subroutine call. These arguments are C<SV*>, and can be used anywhere
an C<SV*> is used.

Most of the time, output from the C routine can be handled through use of
the RETVAL and OUTPUT directives. However, there are some cases where the
argument stack is not already long enough to handle all the return values.
An example is the POSIX tzname() call, which takes no arguments, but returns
two: the local timezone's standard and summer time abbreviations.

To handle this situation, the PPCODE directive is used and the stack is
extended using the macro:

    EXTEND(sp, num);

where C<sp> is the stack pointer, and C<num> is the number of elements the
stack should be extended by.

Now that there is room on the stack, values can be pushed on it using the
macros to push IV's, doubles, strings, and SV pointers respectively:

    PUSHi(IV)
    PUSHn(double)
    PUSHp(char*, I32)
    PUSHs(SV*)

And now, in the Perl program calling C<tzname>, the two values will be
assigned as in:

    ($standard_abbrev, $summer_abbrev) = POSIX::tzname;

An alternate (and possibly simpler) method to pushing values on the stack is
to use the macros:

    XPUSHi(IV)
    XPUSHn(double)
    XPUSHp(char*, I32)
    XPUSHs(SV*)

These macros automatically adjust the stack for you, if needed.

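The division of labour between C<EXTEND>, the C<PUSH*> macros and the
C<XPUSH*> macros can be modelled with a small growable stack. This is a
toy model only: C<struct toy_stack>, C<toy_extend>, C<toy_push> and
C<toy_xpush> are hypothetical names, and nothing here is meant to reflect
how the Perl stack is actually laid out.

```c
#include <stdlib.h>

/* Toy value stack (not the Perl argument stack). */
struct toy_stack {
    long  *base;
    size_t sp;     /* number of values currently on the stack */
    size_t cap;    /* number of slots allocated */
};

/* EXTEND style: make sure room exists for n more values.
 * Returns 0 on success, -1 if memory could not be obtained. */
int toy_extend(struct toy_stack *st, size_t n)
{
    if (st->cap - st->sp < n) {
        size_t newcap = st->sp + n;
        long *p = realloc(st->base, newcap * sizeof *p);

        if (p == NULL)
            return -1;
        st->base = p;
        st->cap = newcap;
    }
    return 0;
}

/* PUSH style: assumes the caller has already extended the stack. */
void toy_push(struct toy_stack *st, long v)
{
    st->base[st->sp++] = v;
}

/* XPUSH style: extends by one slot if needed, then pushes. */
int toy_xpush(struct toy_stack *st, long v)
{
    if (toy_extend(st, 1) != 0)
        return -1;
    toy_push(st, v);
    return 0;
}
```

A caller that knows it will return two values can extend once and then use
the plain push twice; a caller that does not know the count in advance can
use the checking variant for every value.
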
=head1 Mortality

In Perl, values are normally "immortal" -- that is, they are not freed unless
this is done explicitly (via the Perl C<undef> call or other routines in Perl
itself).

In the above example with C<tzname>, we needed to create two new SV's to push
onto the argument stack, one for each of the two strings. However, we don't
want these new SV's to stick around forever because they will eventually be
copied into the SV's that hold the two scalar variables.

An SV (or AV or HV) that is "mortal" acts in all ways as a normal "immortal"
SV, AV, or HV, but is only valid in the "current context". When the Perl
interpreter leaves the current context, the mortal SV, AV, or HV is
automatically freed. Generally the "current context" means a single
Perl statement.

To create a mortal variable, use the functions:

    SV* sv_newmortal()
    SV* sv_2mortal(SV*)
    SV* sv_mortalcopy(SV*)

The first call creates a mortal SV, the second converts an existing SV to
a mortal SV, and the third creates a mortal copy of an existing SV.

The mortal routines are not just for SV's -- AV's and HV's can be made mortal
by passing their address (and casting them to C<SV*>) to the C<sv_2mortal> or
C<sv_mortalcopy> routines.

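One way to picture mortality is as a deferred release: the value is put on
a list, and everything on the list is released when the current context
ends. The sketch below models that idea with a reference count and a fixed
list. It is an assumption-laden toy, not a description of Perl's
implementation; C<toy_new>, C<toy_2mortal>, C<toy_free_tmps> and the other
names are hypothetical.

```c
#include <stdlib.h>

#define TMPS_MAX 16

struct toy_sv {
    int  refcnt;
    long iv;
};

/* Pending "mortal" values, released when the context ends. */
static struct toy_sv *toy_tmps[TMPS_MAX];
static size_t toy_tmps_n = 0;
static int toy_live = 0;      /* number of values not yet freed */

/* Create an "immortal" value with a reference count of one. */
struct toy_sv *toy_new(long iv)
{
    struct toy_sv *sv = malloc(sizeof *sv);

    if (sv == NULL)
        return NULL;
    sv->refcnt = 1;
    sv->iv = iv;
    toy_live++;
    return sv;
}

/* Drop one reference; free the value when none remain. */
void toy_refcnt_dec(struct toy_sv *sv)
{
    if (--sv->refcnt == 0) {
        free(sv);
        toy_live--;
    }
}

/* sv_2mortal style: schedule one release for later and hand the same
 * pointer back. A full list is silently ignored in this toy. */
struct toy_sv *toy_2mortal(struct toy_sv *sv)
{
    if (sv != NULL && toy_tmps_n < TMPS_MAX)
        toy_tmps[toy_tmps_n++] = sv;
    return sv;
}

/* Leaving the "current context": perform the scheduled releases. */
void toy_free_tmps(void)
{
    while (toy_tmps_n > 0)
        toy_refcnt_dec(toy_tmps[--toy_tmps_n]);
}
```

In this model a value created with C<toy_new> lives until it is released
by hand, while one passed through C<toy_2mortal> stays usable until
C<toy_free_tmps> runs.
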
=head1 Creating New Variables

To create a new Perl variable, which can be accessed from your Perl script,
use the following routines, depending on the variable type.

    SV* perl_get_sv("varname", TRUE);
    AV* perl_get_av("varname", TRUE);
    HV* perl_get_hv("varname", TRUE);

Notice the use of TRUE as the second parameter. The new variable can now
be set, using the routines appropriate to the data type.

=head1 Stashes and Objects

A stash is a hash table (associative array) that contains all of the
different objects that are contained within a package. Each key of the
hash table is a symbol name (shared by all the different types of
objects that have the same name), and each value in the hash table is
called a GV (for Glob Value). The GV in turn contains references to
the various objects of that name, including (but not limited to) the
following:

    Scalar Value
    Array Value
    Hash Value
    File Handle
    Directory Handle
    Format
    Subroutine

Perl stores various stashes in a GV structure (for global variable) but
represents them with an HV structure.

To get the HV pointer for a particular package, use the function:

    HV* gv_stashpv(char* name, I32 create)
    HV* gv_stashsv(SV*, I32 create)

The first function takes a literal string, the second uses the string stored
in the SV.

The name that C<gv_stash*v> wants is the name of the package whose symbol table
you want. The default package is called C<main>. If you have multiply nested
packages, it is legal to pass their names to C<gv_stash*v>, separated by
C<::> as in the Perl language itself.

Alternately, if you have an SV that is a blessed reference, you can find
out the stash pointer by using:

    HV* SvSTASH(SvRV(SV*));

then use the following to get the package name itself:

    char* HvNAME(HV* stash);

If you need to return a blessed value to your Perl script, you can use the
following function:

    SV* sv_bless(SV*, HV* stash)

where the first argument, an C<SV*>, must be a reference, and the second
argument is a stash. The returned C<SV*> can now be used in the same way
as any other SV.

=head1 Magic

[This section under construction]

=head1 Double-Typed SV's

Scalar variables normally contain only one type of value, an integer,
double, pointer, or reference. Perl will automatically convert the
actual scalar data from the stored type into the requested type.

Some scalar variables contain more than one type of scalar data. For
example, the variable C<$!> contains either the numeric value of C<errno>
or its string equivalent from C<sys_errlist[]>.

To force multiple data values into an SV, you must do two things: use the
C<sv_set*v> routines to add the additional scalar type, then set a flag
so that Perl will believe it contains more than one type of data. The
four macros to set the flags are:

    SvIOK_on
    SvNOK_on
    SvPOK_on
    SvROK_on

The particular macro you must use depends on which C<sv_set*v> routine
you called first. This is because every C<sv_set*v> routine turns on
only the bit for the particular type of data being set, and turns off
all the rest.

For example, to create a new Perl variable called "dberror" that contains
both the numeric and descriptive string error values, you could use the
following code:

    extern int dberror;
    extern char *dberror_list[];    /* one message per error number */

    SV* sv = perl_get_sv("dberror", TRUE);
    sv_setiv(sv, (IV) dberror);             /* sets only the integer flag */
    sv_setpv(sv, dberror_list[dberror]);    /* sets only the string flag */
    SvIOK_on(sv);                           /* turn the integer flag back on */

If the order of C<sv_setiv> and C<sv_setpv> had been reversed, then the
macro C<SvPOK_on> would need to be called instead of C<SvIOK_on>.

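The flag behaviour described in this section can be modelled in a few
lines of plain C. In this sketch the C<TOY_IOK> and C<TOY_POK> bits,
C<struct toy_sv>, and the C<toy_*> functions are all hypothetical; they
only imitate the rule that each setter turns on its own flag and turns off
the rest.

```c
#include <stddef.h>

#define TOY_IOK 0x1u   /* integer slot is valid */
#define TOY_POK 0x2u   /* string slot is valid */

/* Toy scalar with separate integer and string slots (not a real SV). */
struct toy_sv {
    unsigned    flags;
    long        iv;
    const char *pv;
};

/* Like the sv_set*v routines: store the value, turn on this type's
 * flag, and turn off all the others. */
void toy_setiv(struct toy_sv *sv, long iv)
{
    sv->iv = iv;
    sv->flags = TOY_IOK;
}

void toy_setpv(struct toy_sv *sv, const char *pv)
{
    sv->pv = pv;
    sv->flags = TOY_POK;
}

/* Like SvIOK_on: re-assert that the integer slot is still valid. */
void toy_iok_on(struct toy_sv *sv)
{
    sv->flags |= TOY_IOK;
}

/* Build a value that is both a number and a string. */
void toy_set_dual(struct toy_sv *sv, long code, const char *text)
{
    toy_setiv(sv, code);    /* flags: IOK */
    toy_setpv(sv, text);    /* flags: POK only; iv is still stored */
    toy_iok_on(sv);         /* flags: IOK | POK */
}
```

Reversing the two setter calls inside C<toy_set_dual> would leave only the
integer flag set, which is why the string flag would then have to be
turned back on instead.
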
=head1 Calling Perl Routines from within C Programs

There are four routines that can be used to call a Perl subroutine from
within a C program. These four are:

    I32 perl_call_sv(SV*, I32);
    I32 perl_call_pv(char*, I32);
    I32 perl_call_method(char*, I32);
    I32 perl_call_argv(char*, I32, register char**);

The routine most often used should be C<perl_call_sv>. The C<SV*> argument
contains either the name of the Perl subroutine to be called, or a reference
to the subroutine. The second argument tells the appropriate routine what,
if any, variables are being returned by the Perl subroutine.

All four routines return the number of arguments that the subroutine returned
on the Perl stack.

When using these four routines, the programmer must manipulate the Perl
stack, using the following macros and functions:

    dSP
    PUSHMARK()
    PUTBACK
    SPAGAIN
    ENTER
    SAVETMPS
    FREETMPS
    LEAVE
    XPUSH*()

For more information, consult L<perlcall>.

=head1 Memory Allocation

[This section under construction]

=head1 AUTHOR

Jeff Okamoto <okamoto@corp.hp.com>

With lots of help and suggestions from Dean Roehrich, Malcolm Beattie,
Andreas Koenig, Paul Hudson, Ilya Zakharevich, Paul Marquess, and Neil
Bowers.

=head1 DATE

Version 12: 1994/10/16