Commit | Line | Data |
---|---|---|
a0d0e21e LW |
1 | =head1 NAME |
2 | ||
3 | perlguts - Perl's Internal Functions | |
4 | ||
5 | =head1 DESCRIPTION | |
6 | ||
7 | This document attempts to describe some of the internal functions of the | |
8 | Perl executable. It is far from complete and probably contains many errors. | |
9 | Please refer any questions or comments to the author below. | |
10 | ||
11 | =head1 Datatypes | |
12 | ||
13 | Perl has three typedefs that handle Perl's three main data types: | |
14 | ||
15 | SV Scalar Value | |
16 | AV Array Value | |
17 | HV Hash Value | |
18 | ||
19 | Each typedef has specific routines that manipulate the various data types. | |
20 | ||
21 | =head2 What is an "IV"? | |
22 | ||
23 | Perl uses a special typedef IV which is large enough to hold either an | |
24 | integer or a pointer. | |
25 | ||
26 | Perl also uses a special typedef I32 which will always be a 32-bit integer. | |
27 | ||
28 | =head2 Working with SV's | |
29 | ||
30 | An SV can be created and loaded with one command. There are four types of | |
31 | values that can be loaded: an integer value (IV), a double (NV), a string | |
32 | (PV), and another scalar (SV). | |
33 | ||
34 | The four routines are: | |
35 | ||
36 | SV* newSViv(IV); | |
37 | SV* newSVnv(double); | |
38 | SV* newSVpv(char*, int); | |
39 | SV* newSVsv(SV*); | |
40 | ||
41 | To change the value of an I<already-existing> scalar, there are five routines: | |
42 | ||
43 | void sv_setiv(SV*, IV); | |
44 | void sv_setnv(SV*, double); | |
45 | void sv_setpvn(SV*, char*, int); | |
46 | void sv_setpv(SV*, char*); | |
47 | void sv_setsv(SV*, SV*); | |
48 | ||
49 | Notice that you can choose to specify the length of the string to be | |
50 | assigned by using C<sv_setpvn>, or allow Perl to calculate the length by | |
51 | using C<sv_setpv>. Be warned, though, that C<sv_setpv> determines the | |
52 | string's length by using C<strlen>, which depends on the string terminating | |
53 | with a NUL character. | |
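The difference between the two length conventions can be seen without any Perl API at all. The following is a minimal sketch in plain C; the helper name C<length_seen_by_strlen> and the sample buffer are hypothetical, chosen only for illustration. It shows that a C<strlen>-based length stops at the first NUL, which is why an explicit length (as taken by C<sv_setpvn>) is the safer choice for data that may contain NUL's.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical sample data: seven bytes with a NUL in the middle.
   (The compiler appends one more terminating NUL, hence the "- 1".) */
static const char sample[] = "abc\0def";
static const size_t sample_len = sizeof(sample) - 1;

/* Length as a strlen-based routine such as sv_setpv would compute it:
   counting stops at the first NUL byte. */
static size_t length_seen_by_strlen(const char *buf)
{
    return strlen(buf);
}
```

Here C<length_seen_by_strlen(sample)> yields 3 while C<sample_len> is 7, so a setter that relies on C<strlen> would see only the first three bytes of the buffer.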
54 | ||
55 | To access the actual value that an SV points to, you can use the macros: | |
56 | ||
57 | SvIV(SV*) | |
58 | SvNV(SV*) | |
59 | SvPV(SV*, STRLEN len) | |
60 | ||
61 | which will automatically coerce the actual scalar type into an IV, double, | |
62 | or string. | |
63 | ||
64 | In the C<SvPV> macro, the length of the string returned is placed into the | |
65 | variable C<len> (this is a macro, so you do I<not> use C<&len>). If you do not | |
66 | care what the length of the data is, use the global variable C<na>. Remember, | |
67 | however, that Perl allows arbitrary strings of data that may both contain | |
68 | NUL's and not be terminated by a NUL. | |
69 | ||
70 | If you simply want to know if the scalar value is TRUE, you can use: | |
71 | ||
72 | SvTRUE(SV*) | |
73 | ||
74 | Although Perl will automatically grow strings for you, if you need to force | |
75 | Perl to allocate more memory for your SV, you can use the macro | |
76 | ||
77 | SvGROW(SV*, STRLEN newlen) | |
78 | ||
79 | which will determine if more memory needs to be allocated. If so, it will | |
80 | call the function C<sv_grow>. Note that C<SvGROW> can only increase, not | |
81 | decrease, the allocated memory of an SV. | |
82 | ||
83 | If you have an SV and want to know what kind of data Perl thinks is stored | |
84 | in it, you can use the following macros to check the type of SV you have. | |
85 | ||
86 | SvIOK(SV*) | |
87 | SvNOK(SV*) | |
88 | SvPOK(SV*) | |
89 | ||
90 | You can get and set the current length of the string stored in an SV with | |
91 | the following macros: | |
92 | ||
93 | SvCUR(SV*) | |
94 | SvCUR_set(SV*, I32 val) | |
95 | ||
96 | But note that these are valid only if C<SvPOK()> is true. | |
97 | ||
98 | If you know the name of a scalar variable, you can get a pointer to its SV | |
99 | by using the following: | |
100 | ||
101 | SV* perl_get_sv("varname", FALSE); | |
102 | ||
103 | This returns NULL if the variable does not exist. | |
104 | ||
105 | If you want to know if this variable (or any other SV) is actually defined, | |
106 | you can call: | |
107 | ||
108 | SvOK(SV*) | |
109 | ||
110 | The scalar C<undef> value is stored in an SV instance called C<sv_undef>. Its | |
111 | address can be used whenever an C<SV*> is needed. | |
112 | ||
113 | There are also the two values C<sv_yes> and C<sv_no>, which contain Boolean | |
114 | TRUE and FALSE values, respectively. Like C<sv_undef>, their addresses can | |
115 | be used whenever an C<SV*> is needed. | |
116 | ||
117 | Do not be fooled into thinking that C<(SV *) 0> is the same as C<&sv_undef>. | |
118 | Take this code: | |
119 | ||
120 | SV* sv = (SV*) 0; | |
121 | if (I-am-to-return-a-real-value) { | |
122 | sv = sv_2mortal(newSViv(42)); | |
123 | } | |
124 | sv_setsv(ST(0), sv); | |
125 | ||
126 | This code tries to return a new SV (which contains the value 42) if it should | |
127 | return a real value, or undef otherwise. Instead it has returned a null | |
128 | pointer which, somewhere down the line, will cause a segmentation violation, | |
129 | or just weird results. Change the zero to C<&sv_undef> in the first line and | |
130 | all will be well. | |
131 | ||
132 | To free an SV that you've created, call C<SvREFCNT_dec(SV*)>. Normally this | |
133 | call is not necessary. See the section on B<Mortality>. | |
134 | ||
135 | =head2 Private and Public Values | |
136 | ||
137 | Recall that the usual method of determining the type of scalar you have is | |
138 | to use the C<Sv[INP]OK> macros. Since a scalar can be both a number and a string, | |
139 | these macros will usually return TRUE, and calling the C<Sv[INP]V> | |
140 | macros will do the appropriate conversion of string to integer/double or | |
141 | integer/double to string. | |
142 | ||
143 | If you I<really> need to know if you have an integer, double, or string | |
144 | pointer in an SV, you can use the following three macros instead: | |
145 | ||
146 | SvIOKp(SV*) | |
147 | SvNOKp(SV*) | |
148 | SvPOKp(SV*) | |
149 | ||
150 | These will tell you if you truly have an integer, double, or string pointer | |
151 | stored in your SV. | |
152 | ||
153 | In general, though, it's best to just use the C<Sv[INP]V> macros. | |
154 | ||
155 | =head2 Working with AV's | |
156 | ||
157 | There are two ways to create and load an AV. The first method just creates | |
158 | an empty AV: | |
159 | ||
160 | AV* newAV(); | |
161 | ||
162 | The second method both creates the AV and initially populates it with SV's: | |
163 | ||
164 | AV* av_make(I32 num, SV **ptr); | |
165 | ||
166 | The second argument points to an array containing C<num> C<SV*>'s. | |
167 | ||
168 | Once the AV has been created, the following operations are possible on AV's: | |
169 | ||
170 | void av_push(AV*, SV*); | |
171 | SV* av_pop(AV*); | |
172 | SV* av_shift(AV*); | |
173 | void av_unshift(AV*, I32 num); | |
174 | ||
175 | These should be familiar operations, with the exception of C<av_unshift>. | |
176 | This routine adds C<num> elements at the front of the array with the C<undef> | |
177 | value. You must then use C<av_store> (described below) to assign values | |
178 | to these new elements. | |
179 | ||
180 | Here are some other functions: | |
181 | ||
182 | I32 av_len(AV*); /* Returns highest index in array (-1 if empty) */ | |
183 | ||
184 | SV** av_fetch(AV*, I32 key, I32 lval); | |
185 | /* Fetches value at key offset; if lval is non-zero and | |
186 | no element exists there, a new one is created */ | |
187 | SV** av_store(AV*, I32 key, SV* val); | |
188 | /* Stores val at offset key */ | |
189 | ||
190 | void av_clear(AV*); | |
191 | /* Clear out all elements, but leave the array */ | |
192 | void av_undef(AV*); | |
193 | /* Undefines the array, removing all elements */ | |
194 | ||
195 | If you know the name of an array variable, you can get a pointer to its AV | |
196 | by using the following: | |
197 | ||
198 | AV* perl_get_av("varname", FALSE); | |
199 | ||
200 | This returns NULL if the variable does not exist. | |
201 | ||
202 | =head2 Working with HV's | |
203 | ||
204 | To create an HV, you use the following routine: | |
205 | ||
206 | HV* newHV(); | |
207 | ||
208 | Once the HV has been created, the following operations are possible on HV's: | |
209 | ||
210 | SV** hv_store(HV*, char* key, U32 klen, SV* val, U32 hash); | |
211 | SV** hv_fetch(HV*, char* key, U32 klen, I32 lval); | |
212 | ||
213 | The C<klen> parameter is the length of the key being passed in. The C<val> | |
214 | argument contains the SV pointer to the scalar being stored, and C<hash> is | |
215 | the pre-computed hash value (zero if you want C<hv_store> to calculate it | |
216 | for you). The C<lval> parameter indicates whether this fetch is actually a | |
217 | part of a store operation. | |
218 | ||
219 | Remember that C<hv_store> and C<hv_fetch> return C<SV**>'s and not just | |
220 | C<SV*>. In order to access the scalar value, you must first dereference | |
221 | the return value. However, you should check to make sure that the return | |
222 | value is not NULL before dereferencing it. | |
223 | ||
224 | These two functions check whether a hash table entry exists and delete it, respectively. | |
225 | ||
226 | bool hv_exists(HV*, char* key, U32 klen); | |
227 | SV* hv_delete(HV*, char* key, U32 klen); | |
228 | ||
229 | And more miscellaneous functions: | |
230 | ||
231 | void hv_clear(HV*); | |
232 | /* Clears all entries in hash table */ | |
233 | void hv_undef(HV*); | |
234 | /* Undefines the hash table */ | |
235 | ||
236 | I32 hv_iterinit(HV*); | |
237 | /* Prepares starting point to traverse hash table */ | |
238 | HE* hv_iternext(HV*); | |
239 | /* Get the next entry, and return a pointer to a | |
240 | structure that has both the key and value */ | |
241 | char* hv_iterkey(HE* entry, I32* retlen); | |
242 | /* Get the key from an HE structure and also return | |
243 | the length of the key string */ | |
244 | SV* hv_iterval(HV*, HE* entry); | |
245 | /* Return a SV pointer to the value of the HE | |
246 | structure */ | |
247 | ||
248 | If you know the name of a hash variable, you can get a pointer to its HV | |
249 | by using the following: | |
250 | ||
251 | HV* perl_get_hv("varname", FALSE); | |
252 | ||
253 | This returns NULL if the variable does not exist. | |
254 | ||
255 | The hash algorithm, for those who are interested, is: | |
256 | ||
257 | i = klen; | |
258 | hash = 0; | |
259 | s = key; | |
260 | while (i--) | |
261 | hash = hash * 33 + *s++; | |
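The loop above can be wrapped in a standalone function for experimentation. This is a minimal sketch, not the interpreter's own code: the function name C<hash33> is hypothetical, and C<uint32_t> is assumed here as a stand-in for Perl's C<U32>.

```c
#include <stddef.h>
#include <stdint.h>

/* Multiply-by-33 string hash, following the loop shown above.
   uint32_t stands in for Perl's U32 (an assumption of this sketch). */
static uint32_t hash33(const char *key, size_t klen)
{
    uint32_t hash = 0;
    const char *s = key;

    /* Visit exactly klen bytes; the key need not be NUL-terminated. */
    while (klen--)
        hash = hash * 33 + *s++;

    return hash;
}
```

For the two-byte key "ab" (byte values 97 and 98), the loop computes 97 * 33 + 98 = 3299. Because the length is passed explicitly, keys containing embedded NUL's are hashed in full.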
262 | ||
263 | =head2 References | |
264 | ||
265 | References are a special type of scalar that points to other data types | |
266 | (including references). To treat an AV or HV as a scalar, it is simply | |
267 | a matter of casting an AV or HV to an SV. | |
268 | ||
269 | To create a reference, use the following command: | |
270 | ||
271 | SV* newRV((SV*) pointer); | |
272 | ||
273 | Once you have a reference, you can use the following macro to dereference | |
274 | it: | |
275 | ||
276 | SvRV(SV*) | |
277 | ||
278 | then call the appropriate routines, casting the returned C<SV*> to either an | |
279 | C<AV*> or C<HV*>. | |
280 | ||
281 | To determine, after dereferencing a reference, if you still have a reference, | |
282 | you can use the following macro: | |
283 | ||
284 | SvROK(SV*) | |
285 | ||
286 | =head1 XSUB's and the Argument Stack | |
287 | ||
288 | The XSUB mechanism is a simple way for Perl programs to access C subroutines. | |
289 | An XSUB routine will have a stack that contains the arguments from the Perl | |
290 | program, and a way to map from the Perl data structures to a C equivalent. | |
291 | ||
292 | The stack arguments are accessible through the C<ST(n)> macro, which returns | |
293 | the C<n>'th stack argument. Argument 0 is the first argument passed in the | |
294 | Perl subroutine call. These arguments are C<SV*>, and can be used anywhere | |
295 | an C<SV*> is used. | |
296 | ||
297 | Most of the time, output from the C routine can be handled through use of | |
298 | the RETVAL and OUTPUT directives. However, there are some cases where the | |
299 | argument stack is not already long enough to handle all the return values. | |
300 | An example is the POSIX tzname() call, which takes no arguments, but returns | |
301 | two, the local timezone's standard and summer time abbreviations. | |
302 | ||
303 | To handle this situation, the PPCODE directive is used and the stack is | |
304 | extended using the macro: | |
305 | ||
306 | EXTEND(sp, num); | |
307 | ||
308 | where C<sp> is the stack pointer, and C<num> is the number of elements the | |
309 | stack should be extended by. | |
310 | ||
311 | Now that there is room on the stack, values can be pushed on it using the | |
312 | macros to push IV's, doubles, strings, and SV pointers respectively: | |
313 | ||
314 | PUSHi(IV) | |
315 | PUSHn(double) | |
316 | PUSHp(char*, I32) | |
317 | PUSHs(SV*) | |
318 | ||
319 | Now, in the Perl program calling C<tzname>, the two values will be assigned | |
320 | as in: | |
321 | ||
322 | ($standard_abbrev, $summer_abbrev) = POSIX::tzname; | |
323 | ||
324 | An alternate (and possibly simpler) method of pushing values on the stack is | |
325 | to use the macros: | |
326 | ||
327 | XPUSHi(IV) | |
328 | XPUSHn(double) | |
329 | XPUSHp(char*, I32) | |
330 | XPUSHs(SV*) | |
331 | ||
332 | These macros automatically adjust the stack for you, if needed. | |
333 | ||
334 | =head1 Mortality | |
335 | ||
336 | In Perl, values are normally "immortal" -- that is, they are not freed unless | |
337 | this is done explicitly (via the Perl C<undef> call or other routines in Perl | |
338 | itself). | |
339 | ||
340 | In the above example with C<tzname>, we needed to create two new SV's to push | |
341 | onto the argument stack, one for each of the two strings. However, we don't want | |
342 | these new SV's to stick around forever because they will eventually be | |
343 | copied into the SV's that hold the two scalar variables. | |
344 | ||
345 | An SV (or AV or HV) that is "mortal" acts in all ways as a normal "immortal" | |
346 | SV, AV, or HV, but is only valid in the "current context". When the Perl | |
347 | interpreter leaves the current context, the mortal SV, AV, or HV is | |
348 | automatically freed. Generally the "current context" means a single | |
349 | Perl statement. | |
350 | ||
351 | To create a mortal variable, use the functions: | |
352 | ||
353 | SV* sv_newmortal() | |
354 | SV* sv_2mortal(SV*) | |
355 | SV* sv_mortalcopy(SV*) | |
356 | ||
357 | The first call creates a mortal SV, the second converts an existing SV to | |
358 | a mortal SV, the third creates a mortal copy of an existing SV. | |
359 | ||
360 | The mortal routines are not just for SV's -- AV's and HV's can be made mortal | |
361 | by passing their address (and casting them to C<SV*>) to the C<sv_2mortal> or | |
362 | C<sv_mortalcopy> routines. | |
363 | ||
364 | =head1 Creating New Variables | |
365 | ||
366 | To create a new Perl variable, which can be accessed from your Perl script, | |
367 | use the following routines, depending on the variable type. | |
368 | ||
369 | SV* perl_get_sv("varname", TRUE); | |
370 | AV* perl_get_av("varname", TRUE); | |
371 | HV* perl_get_hv("varname", TRUE); | |
372 | ||
373 | Notice the use of TRUE as the second parameter. The new variable can now | |
374 | be set, using the routines appropriate to the data type. | |
375 | ||
376 | =head1 Stashes and Objects | |
377 | ||
378 | A stash is a hash table (associative array) that contains all of the | |
379 | different objects that are contained within a package. Each key of the | |
380 | hash table is a symbol name (shared by all the different types of | |
381 | objects that have the same name), and each value in the hash table is | |
382 | called a GV (for Glob Value). The GV in turn contains references to | |
383 | the various objects of that name, including (but not limited to) the | |
384 | following: | |
385 | ||
386 | Scalar Value | |
387 | Array Value | |
388 | Hash Value | |
389 | File Handle | |
390 | Directory Handle | |
391 | Format | |
392 | Subroutine | |
393 | ||
394 | Perl stores various stashes in a GV structure (for global variable) but | |
395 | represents them with an HV structure. | |
396 | ||
397 | To get the HV pointer for a particular package, use the function: | |
398 | ||
399 | HV* gv_stashpv(char* name, I32 create) | |
400 | HV* gv_stashsv(SV*, I32 create) | |
401 | ||
402 | The first function takes a literal string; the second uses the string stored | |
403 | in the SV. | |
404 | ||
405 | The name that C<gv_stash*v> wants is the name of the package whose symbol table | |
406 | you want. The default package is called C<main>. If you have multiply nested | |
407 | packages, it is legal to pass their names to C<gv_stash*v>, separated by | |
408 | C<::> as in the Perl language itself. | |
409 | ||
410 | Alternately, if you have an SV that is a blessed reference, you can find | |
411 | out the stash pointer by using: | |
412 | ||
413 | HV* SvSTASH(SvRV(SV*)); | |
414 | ||
415 | then use the following to get the package name itself: | |
416 | ||
417 | char* HvNAME(HV* stash); | |
418 | ||
419 | If you need to return a blessed value to your Perl script, you can use the | |
420 | following function: | |
421 | ||
422 | SV* sv_bless(SV*, HV* stash) | |
423 | ||
424 | where the first argument, an C<SV*>, must be a reference, and the second | |
425 | argument is a stash. The returned C<SV*> can now be used in the same way | |
426 | as any other SV. | |
427 | ||
428 | =head1 Magic | |
429 | ||
430 | [This section under construction] | |
431 | ||
432 | =head1 Double-Typed SV's | |
433 | ||
434 | Scalar variables normally contain only one type of value, an integer, | |
435 | double, pointer, or reference. Perl will automatically convert the | |
436 | actual scalar data from the stored type into the requested type. | |
437 | ||
438 | Some scalar variables contain more than one type of scalar data. For | |
439 | example, the variable C<$!> contains either the numeric value of C<errno> | |
440 | or its string equivalent from C<sys_errlist[]>. | |
441 | ||
442 | To force multiple data values into an SV, you must do two things: use the | |
443 | C<sv_set*v> routines to add the additional scalar type, then set a flag | |
444 | so that Perl will believe it contains more than one type of data. The | |
445 | four macros to set the flags are: | |
446 | ||
447 | SvIOK_on | |
448 | SvNOK_on | |
449 | SvPOK_on | |
450 | SvROK_on | |
451 | ||
452 | The particular macro you must use depends on which C<sv_set*v> routine | |
453 | you called first. This is because every C<sv_set*v> routine turns on | |
454 | only the bit for the particular type of data being set, and turns off | |
455 | all the rest. | |
456 | ||
457 | For example, to create a new Perl variable called "dberror" that contains | |
458 | both the numeric and descriptive string error values, you could use the | |
459 | following code: | |
460 | ||
461 | extern int dberror; | |
462 | extern char *dberror_list[]; | |
463 | ||
464 | SV* sv = perl_get_sv("dberror", TRUE); | |
465 | sv_setiv(sv, (IV) dberror); | |
466 | sv_setpv(sv, dberror_list[dberror]); | |
467 | SvIOK_on(sv); | |
468 | ||
469 | If the order of C<sv_setiv> and C<sv_setpv> had been reversed, then the | |
470 | macro C<SvPOK_on> would need to be called instead of C<SvIOK_on>. | |
471 | ||
472 | =head1 Calling Perl Routines from within C Programs | |
473 | ||
474 | There are four routines that can be used to call a Perl subroutine from | |
475 | within a C program. These four are: | |
476 | ||
477 | I32 perl_call_sv(SV*, I32); | |
478 | I32 perl_call_pv(char*, I32); | |
479 | I32 perl_call_method(char*, I32); | |
480 | I32 perl_call_argv(char*, I32, register char**); | |
481 | ||
482 | The routine most often used should be C<perl_call_sv>. The C<SV*> argument | |
483 | contains either the name of the Perl subroutine to be called, or a reference | |
484 | to the subroutine. The second argument tells the appropriate routine what, | |
485 | if any, variables are being returned by the Perl subroutine. | |
486 | ||
487 | All four routines return the number of values that the subroutine returned | |
488 | on the Perl stack. | |
489 | ||
490 | When using these four routines, the programmer must manipulate the Perl stack, | |
491 | using macros and functions that include the following: | |
492 | ||
493 | dSP | |
494 | PUSHMARK() | |
495 | PUTBACK | |
496 | SPAGAIN | |
497 | ENTER | |
498 | SAVETMPS | |
499 | FREETMPS | |
500 | LEAVE | |
501 | XPUSH*() | |
502 | ||
503 | For more information, consult L<perlcall>. | |
504 | ||
505 | =head1 Memory Allocation | |
506 | ||
507 | [This section under construction] | |
508 | ||
509 | =head1 AUTHOR | |
510 | ||
511 | Jeff Okamoto <okamoto@corp.hp.com> | |
512 | ||
513 | With lots of help and suggestions from Dean Roehrich, Malcolm Beattie, | |
514 | Andreas Koenig, Paul Hudson, Ilya Zakharevich, Paul Marquess, and Neil | |
515 | Bowers. | |
516 | ||
517 | =head1 DATE | |
518 | ||
519 | Version 12: 1994/10/16 | |
520 | ||
521 |