Commit | Line | Data |
---|---|---|
04c692a8 DR |
1 | =encoding utf8 |
2 | ||
3 | =for comment | |
4 | Consistent formatting of this file is achieved with: | |
5 | perl ./Porting/podtidy pod/perlhacktips.pod | |
6 | ||
7 | =head1 NAME | |
8 | ||
9 | perlhacktips - Tips for Perl core C code hacking | |
10 | ||
11 | =head1 DESCRIPTION | |
12 | ||
13 | This document will help you learn the best way to go about hacking on | |
14 | the Perl core C code. It covers common problems, debugging, profiling, | |
15 | and more. | |
16 | ||
17 | If you haven't read L<perlhack> and L<perlhacktut> yet, you might want | |
18 | to do that first. | |
19 | ||
20 | =head1 COMMON PROBLEMS | |
21 | ||
22 | Perl source plays by ANSI C89 rules: no C99 (or C++) extensions. In | |
23 | some cases we have to take pre-ANSI requirements into consideration. | |
24 | You don't care about some particular platform having broken Perl? I | |
25 | hear there is still a strong demand for J2EE programmers. | |
26 | ||
27 | =head2 Perl environment problems | |
28 | ||
29 | =over 4 | |
30 | ||
31 | =item * | |
32 | ||
33 | Not compiling with threading | |
34 | ||
35 | Compiling with threading (-Duseithreads) completely rewrites the | |
36 | function prototypes of Perl. You had better try your changes with that. | |
37 | Related to this is the difference between "Perl_-less" and "Perl_-ly" | |
38 | APIs, for example: | |
39 | ||
40 | Perl_sv_setiv(aTHX_ ...); | |
41 | sv_setiv(...); | |
42 | ||
43 | The first one explicitly passes in the context, which is needed for | |
44 | e.g. threaded builds. The second one does that implicitly; do not get | |
45 | them mixed. If you are not passing in an aTHX_, you will need to do a | |
46 | dTHX (or a dVAR) as the first thing in the function. | |
47 | ||
48 | See L<perlguts/"How multiple interpreters and concurrency are | |
49 | supported"> for further discussion about context. | |
50 | ||
51 | =item * | |
52 | ||
53 | Not compiling with -DDEBUGGING | |
54 | ||
55 | The DEBUGGING define exposes more code to the compiler, therefore more | |
56 | ways for things to go wrong. You should try it. | |
57 | ||
58 | =item * | |
59 | ||
60 | Introducing (non-read-only) globals | |
61 | ||
62 | Do not introduce any modifiable globals, truly global or file static. | |
63 | They are bad form and complicate multithreading and other forms of | |
64 | concurrency. The right way is to introduce them as new interpreter | |
65 | variables; see F<intrpvar.h> (at the very end for binary | |
66 | compatibility). | |
67 | ||
68 | Introducing read-only (const) globals is okay, as long as you verify | |
69 | with e.g. C<nm libperl.a|egrep -v ' [TURtr] '> (if your C<nm> has | |
70 | BSD-style output) that the data you added really is read-only. (If it | |
71 | is, it shouldn't show up in the output of that command.) | |
72 | ||
73 | If you want to have static strings, make them constant: | |
74 | ||
75 | static const char etc[] = "..."; | |
76 | ||
77 | If you want to have arrays of constant strings, note carefully the | |
78 | right combination of C<const>s: | |
79 | ||
80 | static const char * const yippee[] = | |
81 | {"hi", "ho", "silver"}; | |
82 | ||
83 | There is a way to completely hide any modifiable globals (they are all | |
84 | moved to the heap): the compilation setting | |
85 | C<-DPERL_GLOBAL_STRUCT_PRIVATE>. It is not normally used, but can be | |
86 | used for testing. Read more about it in L<perlguts/"Background and | |
87 | PERL_IMPLICIT_CONTEXT">. | |
88 | ||
89 | =item * | |
90 | ||
91 | Not exporting your new function | |
92 | ||
93 | Some platforms (Win32, AIX, VMS, OS/2, to name a few) require any | |
94 | function that is part of the public API (the shared Perl library) to be | |
95 | explicitly marked as exported. See the discussion about F<embed.pl> in | |
96 | L<perlguts>. | |
97 | ||
98 | =item * | |
99 | ||
100 | Exporting your new function | |
101 | ||
102 | The new shiny result of either genuine new functionality or your | |
103 | arduous refactoring is now ready and correctly exported. So what could | |
104 | possibly go wrong? | |
105 | ||
106 | Maybe simply that your function did not need to be exported in the | |
107 | first place. Perl has a long and not so glorious history of exporting | |
108 | functions that it should not have. | |
109 | ||
110 | If the function is used only inside one source code file, make it | |
111 | static. See the discussion about F<embed.pl> in L<perlguts>. | |
112 | ||
113 | If the function is used across several files, but intended only for | |
114 | Perl's internal use (and this should be the common case), do not export | |
115 | it to the public API. See the discussion about F<embed.pl> in | |
116 | L<perlguts>. | |
117 | ||
118 | =back | |
119 | ||
120 | =head2 Portability problems | |
121 | ||
122 | The following are common causes of compilation and/or execution | |
123 | failures, not specific to Perl as such. The C FAQ is good bedtime | |
124 | reading. Please test your changes with as many C compilers and | |
125 | platforms as possible; we will, anyway, and it's nice to save oneself | |
126 | from public embarrassment. | |
127 | ||
128 | If using gcc, you can add the C<-std=c89> option which will hopefully | |
129 | catch most of these unportabilities. (However it might also catch | |
130 | incompatibilities in your system's header files.) | |
131 | ||
132 | Use the Configure C<-Dgccansipedantic> flag to enable the gcc C<-ansi | |
133 | -pedantic> flags which enforce stricter ANSI rules. | |
134 | ||
135 | If using C<gcc -Wall>, note that not all the possible warnings (like | |
136 | C<-Wuninitialized>) are given unless you also compile with C<-O>. | |
137 | ||
138 | Note that if using gcc, starting from Perl 5.9.5 the Perl core source | |
139 | code files (the ones at the top level of the source code distribution, | |
140 | but not e.g. the extensions under ext/) are automatically compiled with | |
141 | as many as possible of the C<-std=c89>, C<-ansi>, C<-pedantic>, and a | |
142 | selection of C<-W> flags (see cflags.SH). | |
143 | ||
144 | Also study L<perlport> carefully to avoid any bad assumptions about the | |
145 | operating system, filesystems, and so forth. | |
146 | ||
147 | You may once in a while try a "make microperl" to see whether we can | |
148 | still compile Perl with just the bare minimum of interfaces. (See | |
149 | README.micro.) | |
150 | ||
151 | Do not assume an operating system indicates a certain compiler. | |
152 | ||
153 | =over 4 | |
154 | ||
155 | =item * | |
156 | ||
157 | Casting pointers to integers or casting integers to pointers | |
158 | ||
159 | void castaway(U8* p) | |
160 | { | |
161 | IV i = p; | |
162 | ||
163 | or | |
164 | ||
165 | void castaway(U8* p) | |
166 | { | |
167 | IV i = (IV)p; | |
168 | ||
169 | Both are bad, and broken, and unportable. Use the PTR2IV() macro that | |
170 | does it right. (Likewise, there are PTR2UV(), PTR2NV(), INT2PTR(), and | |
171 | NUM2PTR().) | |
172 | ||
173 | =item * | |
174 | ||
175 | Casting between function pointers and data pointers | |
176 | ||
177 | Technically speaking, casting between function pointers and data | |
178 | pointers is unportable and undefined. Practically speaking it seems to | |
179 | work, but you should still use the FPTR2DPTR() and DPTR2FPTR() macros. | |
180 | Sometimes you can also play games with unions. | |
181 | ||
182 | =item * | |
183 | ||
184 | Assuming sizeof(int) == sizeof(long) | |
185 | ||
186 | There are platforms where longs are 64 bits, and platforms where ints | |
187 | are 64 bits, and while we are out to shock you, even platforms where | |
188 | shorts are 64 bits. This is all legal according to the C standard. (In | |
189 | other words, "long long" is not a portable way to specify 64 bits, and | |
190 | "long long" is not even guaranteed to be any wider than "long".) | |
191 | ||
192 | Instead, use the definitions IV, UV, IVSIZE, I32SIZE, and so forth. | |
193 | Avoid things like I32 because they are B<not> guaranteed to be | |
194 | I<exactly> 32 bits, they are I<at least> 32 bits, nor are they | |
195 | guaranteed to be B<int> or B<long>. If you really explicitly need | |
196 | 64-bit variables, use I64 and U64, but only if guarded by HAS_QUAD. | |
197 | ||
198 | =item * | |
199 | ||
200 | Assuming one can dereference any type of pointer for any type of data | |
201 | ||
202 | char *p = ...; | |
203 | long pony = *(long *)p; /* BAD */ | |
204 | ||
205 | Many platforms, quite rightly so, will give you a core dump instead of | |
206 | a pony if the p happens not to be correctly aligned. | |
207 | ||
208 | =item * | |
209 | ||
210 | Lvalue casts | |
211 | ||
212 | (int)*p = ...; /* BAD */ | |
213 | ||
214 | Simply not portable. Get your lvalue to be of the right type, or maybe | |
215 | use temporary variables, or dirty tricks with unions. | |
216 | ||
217 | =item * | |
218 | ||
219 | Assuming B<anything> about structs (especially the ones you don't | |
220 | control, like the ones coming from the system headers) | |
221 | ||
222 | =over 8 | |
223 | ||
224 | =item * | |
225 | ||
226 | That a certain field exists in a struct | |
227 | ||
228 | =item * | |
229 | ||
230 | That no other fields exist besides the ones you know of | |
231 | ||
232 | =item * | |
233 | ||
234 | That a field is of certain signedness, sizeof, or type | |
235 | ||
236 | =item * | |
237 | ||
238 | That the fields are in a certain order | |
239 | ||
240 | =over 8 | |
241 | ||
242 | =item * | |
243 | ||
244 | While C guarantees the ordering specified in the struct definition, | |
245 | between different platforms the definitions might differ | |
246 | ||
247 | =back | |
248 | ||
249 | =item * | |
250 | ||
251 | That the sizeof(struct) or the alignments are the same everywhere | |
252 | ||
253 | =over 8 | |
254 | ||
255 | =item * | |
256 | ||
257 | There might be padding bytes between the fields to align the fields - | |
258 | the bytes can be anything | |
259 | ||
260 | =item * | |
261 | ||
262 | Structs are required to be aligned to the maximum alignment required by | |
263 | the fields - which for native types is usually equivalent to | |
264 | sizeof() of the field | |
265 | ||
266 | =back | |
267 | ||
268 | =back | |
269 | ||
270 | =item * | |
271 | ||
272 | Assuming the character set is ASCIIish | |
273 | ||
274 | Perl can compile and run under EBCDIC platforms. See L<perlebcdic>. | |
275 | This is transparent for the most part, but because the character sets | |
276 | differ, you shouldn't use numeric (decimal, octal, nor hex) constants | |
277 | to refer to characters. You can safely say 'A', but not 0x41. You can | |
278 | safely say '\n', but not \012. If a character doesn't have a trivial | |
279 | input form, you can create a #define for it in both C<utfebcdic.h> and | |
280 | C<utf8.h>, so that it resolves to different values depending on the | |
281 | character set being used. (There are three different EBCDIC character | |
282 | sets defined in C<utfebcdic.h>, so it might be best to insert the | |
283 | #define three times in that file.) | |
284 | ||
285 | Also, the range 'A' - 'Z' in ASCII is an unbroken sequence of 26 upper | |
286 | case alphabetic characters. That is not true in EBCDIC. Nor for 'a' to | |
287 | 'z'. But '0' - '9' is an unbroken range in both systems. Don't assume | |
288 | anything about other ranges. | |
289 | ||
290 | Many of the comments in the existing code ignore the possibility of | |
291 | EBCDIC, and may be wrong therefore, even if the code works. This is | |
292 | actually a tribute to the successful transparent insertion of being | |
293 | able to handle EBCDIC without having to change pre-existing code. | |
294 | ||
295 | UTF-8 and UTF-EBCDIC are two different encodings used to represent | |
296 | Unicode code points as sequences of bytes. Macros with the same names | |
297 | (but different definitions) in C<utf8.h> and C<utfebcdic.h> are used to | |
298 | allow the calling code to think that there is only one such encoding. | |
299 | This is almost always referred to as C<utf8>, but it means the EBCDIC | |
300 | version as well. Again, comments in the code may well be wrong even if | |
301 | the code itself is right. For example, the concept of C<invariant | |
302 | characters> differs between ASCII and EBCDIC. On ASCII platforms, only | |
303 | characters that do not have the high-order bit set (i.e. whose ordinals | |
304 | are strict ASCII, 0 - 127) are invariant, and the documentation and | |
305 | comments in the code may assume that, often referring to something | |
306 | like, say, C<hibit>. The situation differs and is not so simple on | |
307 | EBCDIC machines, but as long as the code itself uses the | |
308 | C<NATIVE_IS_INVARIANT()> macro appropriately, it works, even if the | |
309 | comments are wrong. | |
310 | ||
311 | =item * | |
312 | ||
313 | Assuming the character set is just ASCII | |
314 | ||
315 | ASCII is a 7 bit encoding, but bytes have 8 bits in them. The 128 extra | |
316 | characters have different meanings depending on the locale. Absent a | |
317 | locale, currently these extra characters are generally considered to be | |
318 | unassigned, and this has presented some problems. This is being changed | |
319 | starting in 5.12 so that these characters will be considered to be | |
320 | Latin-1 (ISO-8859-1). | |
321 | ||
322 | =item * | |
323 | ||
324 | Mixing #define and #ifdef | |
325 | ||
326 | #define BURGLE(x) ... \ | |
327 | #ifdef BURGLE_OLD_STYLE /* BAD */ | |
328 | ... do it the old way ... \ | |
329 | #else | |
330 | ... do it the new way ... \ | |
331 | #endif | |
332 | ||
333 | You cannot portably "stack" cpp directives. For example in the above | |
334 | you need two separate BURGLE() #defines, one for each #ifdef branch. | |
335 | ||
336 | =item * | |
337 | ||
338 | Adding non-comment stuff after #endif or #else | |
339 | ||
340 | #ifdef SNOSH | |
341 | ... | |
342 | #else !SNOSH /* BAD */ | |
343 | ... | |
344 | #endif SNOSH /* BAD */ | |
345 | ||
346 | The #endif and #else cannot portably have anything non-comment after | |
347 | them. If you want to document what is going on (which is a good idea | |
348 | especially if the branches are long), use (C) comments: | |
349 | ||
350 | #ifdef SNOSH | |
351 | ... | |
352 | #else /* !SNOSH */ | |
353 | ... | |
354 | #endif /* SNOSH */ | |
355 | ||
356 | The gcc option C<-Wendif-labels> warns about the bad variant (by | |
357 | default on starting from Perl 5.9.4). | |
358 | ||
359 | =item * | |
360 | ||
361 | Having a comma after the last element of an enum list | |
362 | ||
363 | enum color { | |
364 | CERULEAN, | |
365 | CHARTREUSE, | |
366 | CINNABAR, /* BAD */ | |
367 | }; | |
368 | ||
369 | is not portable. Leave out the last comma. | |
370 | ||
371 | Also note that whether enums are implicitly morphable to ints varies | |
372 | between compilers; you might need to cast with (int). | |
373 | ||
374 | =item * | |
375 | ||
376 | Using //-comments | |
377 | ||
378 | // This function bamfoodles the zorklator. /* BAD */ | |
379 | ||
380 | That is C99 or C++. Perl is C89. Using the //-comments is silently | |
381 | allowed by many C compilers but cranking up the ANSI C89 strictness | |
382 | (which we like to do) causes the compilation to fail. | |
383 | ||
384 | =item * | |
385 | ||
386 | Mixing declarations and code | |
387 | ||
388 | void zorklator() | |
389 | { | |
390 | int n = 3; | |
391 | set_zorkmids(n); /* BAD */ | |
392 | int q = 4; | |
393 | ||
394 | That is C99 or C++. Some C compilers allow that, but you shouldn't. | |
395 | ||
396 | The gcc option C<-Wdeclaration-after-statement> scans for such | |
397 | problems (by default on starting from Perl 5.9.4). | |
398 | ||
399 | =item * | |
400 | ||
401 | Introducing variables inside for() | |
402 | ||
403 | for(int i = ...; ...; ...) { /* BAD */ | |
404 | ||
405 | That is C99 or C++. While it would indeed be awfully nice to have that | |
406 | also in C89, to limit the scope of the loop variable, alas, we cannot. | |
407 | ||
408 | =item * | |
409 | ||
410 | Mixing signed char pointers with unsigned char pointers | |
411 | ||
412 | int foo(char *s) { ... } | |
413 | ... | |
414 | unsigned char *t = ...; /* Or U8* t = ... */ | |
415 | foo(t); /* BAD */ | |
416 | ||
417 | While this is legal practice, it is certainly dubious, and downright | |
418 | fatal in at least one platform: for example VMS cc considers this a | |
419 | fatal error. One cause for people often making this mistake is that a | |
420 | "naked char" and therefore dereferencing a "naked char pointer" have an | |
421 | undefined signedness: it depends on the compiler and the flags of the | |
422 | compiler and the underlying platform whether the result is signed or | |
423 | unsigned. For this very same reason using a 'char' as an array index is | |
424 | bad. | |
425 | ||
426 | =item * | |
427 | ||
428 | Macros that have string constants and their arguments as substrings of | |
429 | the string constants | |
430 | ||
431 | #define FOO(n) printf("number = %d\n", n) /* BAD */ | |
432 | FOO(10); | |
433 | ||
434 | Pre-ANSI semantics for that was equivalent to | |
435 | ||
436 | printf("10umber = %d\10"); | |
437 | ||
438 | which is probably not what you were expecting. Unfortunately at least | |
439 | one reasonably common and modern C compiler does "real backward | |
440 | compatibility" here: in AIX that is what still happens even though the | |
441 | rest of the AIX compiler is very happily C89. | |
442 | ||
443 | =item * | |
444 | ||
445 | Using printf formats for non-basic C types | |
446 | ||
447 | IV i = ...; | |
448 | printf("i = %d\n", i); /* BAD */ | |
449 | ||
450 | While this might by accident work on some platform (where IV happens to | |
451 | be an C<int>), in general it cannot. IV might be something larger. The | |
452 | situation is even worse with more specific types (defined by Perl's | |
453 | configuration step in F<config.h>): | |
454 | ||
455 | Uid_t who = ...; | |
456 | printf("who = %d\n", who); /* BAD */ | |
457 | ||
458 | The problem here is that Uid_t might be not only not C<int>-wide but it | |
459 | might also be unsigned, in which case large uids would be printed as | |
460 | negative values. | |
461 | ||
462 | There is no simple solution to this because of printf()'s limited | |
463 | intelligence, but for many types the right format is available with | |
464 | either an 'f' or '_f' suffix, for example: | |
465 | ||
466 | IVdf /* IV in decimal */ | |
467 | UVxf /* UV in hexadecimal */ | |
468 | ||
469 | printf("i = %"IVdf"\n", i); /* The IVdf is a string constant. */ | |
470 | ||
471 | Uid_t_f /* Uid_t in decimal */ | |
472 | ||
473 | printf("who = %"Uid_t_f"\n", who); | |
474 | ||
475 | Or you can try casting to a "wide enough" type: | |
476 | ||
477 | printf("i = %"IVdf"\n", (IV)something_very_small_and_signed); | |
478 | ||
479 | Also remember that the C<%p> format really does require a void pointer: | |
480 | ||
481 | U8* p = ...; | |
482 | printf("p = %p\n", (void*)p); | |
483 | ||
484 | The gcc option C<-Wformat> scans for such problems. | |
485 | ||
486 | =item * | |
487 | ||
488 | Blindly using variadic macros | |
489 | ||
490 | gcc has had them for a while with its own syntax, and C99 brought them | |
491 | with a standardized syntax. Don't use the former, and use the latter | |
492 | only if the HAS_C99_VARIADIC_MACROS is defined. | |
493 | ||
494 | =item * | |
495 | ||
496 | Blindly passing va_list | |
497 | ||
498 | Not all platforms support passing va_list to further varargs (stdarg) | |
499 | functions. The right thing to do is to copy the va_list using the | |
500 | Perl_va_copy() if the NEED_VA_COPY is defined. | |
501 | ||
502 | =item * | |
503 | ||
504 | Using gcc statement expressions | |
505 | ||
506 | val = ({...;...;...}); /* BAD */ | |
507 | ||
508 | While a nice extension, it's not portable. The Perl code does | |
509 | admittedly use them if available to gain some extra speed (essentially | |
510 | as a funky form of inlining), but you shouldn't. | |
511 | ||
512 | =item * | |
513 | ||
514 | Binding together several statements in a macro | |
515 | ||
516 | Use the macros STMT_START and STMT_END. | |
517 | ||
518 | STMT_START { | |
519 | ... | |
520 | } STMT_END | |
521 | ||
522 | =item * | |
523 | ||
524 | Testing for operating systems or versions when you should be testing for | |
525 | features | |
526 | ||
527 | #ifdef __FOONIX__ /* BAD */ | |
528 | foo = quux(); | |
529 | #endif | |
530 | ||
531 | Unless you know with 100% certainty that quux() is only ever available | |
532 | for the "Foonix" operating system B<and> that it is available B<and> | |
533 | correctly working for B<all> past, present, B<and> future versions of | |
534 | "Foonix", the above is very wrong. This is more correct (though still | |
535 | not perfect, because the below is a compile-time check): | |
536 | ||
537 | #ifdef HAS_QUUX | |
538 | foo = quux(); | |
539 | #endif | |
540 | ||
541 | How does the HAS_QUUX become defined where it needs to be? Well, if | |
542 | Foonix happens to be Unixy enough to be able to run the Configure | |
543 | script, and Configure has been taught about detecting and testing | |
544 | quux(), the HAS_QUUX will be correctly defined. In other platforms, the | |
545 | corresponding configuration step will hopefully do the same. | |
546 | ||
547 | In a pinch, if you cannot wait for Configure to be educated, or if you | |
548 | have a good hunch of where quux() might be available, you can | |
549 | temporarily try the following: | |
550 | ||
551 | #if (defined(__FOONIX__) || defined(__BARNIX__)) | |
552 | # define HAS_QUUX | |
553 | #endif | |
554 | ||
555 | ... | |
556 | ||
557 | #ifdef HAS_QUUX | |
558 | foo = quux(); | |
559 | #endif | |
560 | ||
561 | But in any case, try to keep the features and operating systems | |
562 | separate. | |
563 | ||
564 | =back | |
565 | ||
566 | =head2 Problematic System Interfaces | |
567 | ||
568 | =over 4 | |
569 | ||
570 | =item * | |
571 | ||
572 | malloc(0), realloc(0), calloc(0, 0) are non-portable. To be portable | |
573 | allocate at least one byte. (In general you should rarely need to work | |
574 | at this low level, but instead use the various malloc wrappers.) | |
575 | ||
576 | =item * | |
577 | ||
578 | snprintf() - the return type is unportable. Use my_snprintf() instead. | |
579 | ||
580 | =back | |
581 | ||
582 | =head2 Security problems | |
583 | ||
584 | Last but not least, here are various tips for safer coding. | |
585 | ||
586 | =over 4 | |
587 | ||
588 | =item * | |
589 | ||
590 | Do not use gets() | |
591 | ||
592 | Or we will publicly ridicule you. Seriously. | |
593 | ||
594 | =item * | |
595 | ||
596 | Do not use strcpy() or strcat() or strncpy() or strncat() | |
597 | ||
598 | Use my_strlcpy() and my_strlcat() instead: they either use the native | |
599 | implementation, or Perl's own implementation (borrowed from the public | |
600 | domain implementation of INN). | |
601 | ||
602 | =item * | |
603 | ||
604 | Do not use sprintf() or vsprintf() | |
605 | ||
606 | If you really want just plain byte strings, use my_snprintf() and | |
607 | my_vsnprintf() instead, which will try to use snprintf() and | |
608 | vsnprintf() if those safer APIs are available. If you want something | |
609 | fancier than a plain byte string, use SVs and Perl_sv_catpvf(). | |
610 | ||
611 | =back | |
612 | ||
613 | =head1 DEBUGGING | |
614 | ||
615 | You can compile a special debugging version of Perl, which allows you | |
616 | to use the C<-D> option of Perl to tell more about what Perl is doing. | |
617 | But sometimes there is no alternative but to dive in with a debugger, | |
618 | either to see the stack trace of a core dump (very useful in a bug | |
619 | report), or to figure out what went wrong before the core dump | |
620 | happened, or how we ended up with wrong or unexpected results. | |
621 | ||
622 | =head2 Poking at Perl | |
623 | ||
624 | To really poke around with Perl, you'll probably want to build Perl for | |
625 | debugging, like this: | |
626 | ||
627 | ./Configure -d -D optimize=-g | |
628 | make | |
629 | ||
630 | C<-g> is a flag to the C compiler to have it produce debugging | |
631 | information which will allow us to step through a running program, and | |
632 | to see in which C function we are at (without the debugging information | |
633 | we might see only the numerical addresses of the functions, which is | |
634 | not very helpful). | |
635 | ||
636 | F<Configure> will also turn on the C<DEBUGGING> compilation symbol | |
637 | which enables all the internal debugging code in Perl. There are a | |
638 | whole bunch of things you can debug with this: L<perlrun> lists them | |
639 | all, and the best way to find out about them is to play about with | |
640 | them. The most useful options are probably | |
641 | ||
642 | l Context (loop) stack processing | |
643 | t Trace execution | |
644 | o Method and overloading resolution | |
645 | c String/numeric conversions | |
646 | ||
647 | Some of the functionality of the debugging code can be achieved using | |
648 | XS modules. | |
649 | ||
650 | -Dr => use re 'debug' | |
651 | -Dx => use O 'Debug' | |
652 | ||
653 | =head2 Using a source-level debugger | |
654 | ||
655 | If the debugging output of C<-D> doesn't help you, it's time to step | |
656 | through perl's execution with a source-level debugger. | |
657 | ||
658 | =over 3 | |
659 | ||
660 | =item * | |
661 | ||
662 | We'll use C<gdb> for our examples here; the principles will apply to | |
663 | any debugger (many vendors call their debugger C<dbx>), but check the | |
664 | manual of the one you're using. | |
665 | ||
666 | =back | |
667 | ||
668 | To fire up the debugger, type | |
669 | ||
670 | gdb ./perl | |
671 | ||
672 | Or if you have a core dump: | |
673 | ||
674 | gdb ./perl core | |
675 | ||
676 | You'll want to do that in your Perl source tree so the debugger can | |
677 | read the source code. You should see the copyright message, followed by | |
678 | the prompt. | |
679 | ||
680 | (gdb) | |
681 | ||
682 | C<help> will get you into the documentation, but here are the most | |
683 | useful commands: | |
684 | ||
685 | =over 3 | |
686 | ||
687 | =item * run [args] | |
688 | ||
689 | Run the program with the given arguments. | |
690 | ||
691 | =item * break function_name | |
692 | ||
693 | =item * break source.c:xxx | |
694 | ||
695 | Tells the debugger that we'll want to pause execution when we reach | |
696 | either the named function (but see L<perlguts/Internal Functions>!) or | |
697 | the given line in the named source file. | |
698 | ||
699 | =item * step | |
700 | ||
701 | Steps through the program a line at a time. | |
702 | ||
703 | =item * next | |
704 | ||
705 | Steps through the program a line at a time, without descending into | |
706 | functions. | |
707 | ||
708 | =item * continue | |
709 | ||
710 | Run until the next breakpoint. | |
711 | ||
712 | =item * finish | |
713 | ||
714 | Run until the end of the current function, then stop again. | |
715 | ||
716 | =item * 'enter' | |
717 | ||
718 | Just pressing Enter will do the most recent operation again - it's a | |
719 | blessing when stepping through miles of source code. | |
720 | ||
721 | =item * print | |
722 | ||
723 | Execute the given C code and print its results. B<WARNING>: Perl makes | |
724 | heavy use of macros, and F<gdb> does not necessarily support macros | |
725 | (see later L</"gdb macro support">). You'll have to substitute them | |
726 | yourself, or to invoke cpp on the source code files (see L</"The .i | |
727 | Targets">). So, for instance, you can't say | |
728 | ||
729 | print SvPV_nolen(sv) | |
730 | ||
731 | but you have to say | |
732 | ||
733 | print Perl_sv_2pv_nolen(sv) | |
734 | ||
735 | =back | |
736 | ||
737 | You may find it helpful to have a "macro dictionary", which you can | |
738 | produce by saying C<cpp -dM perl.c | sort>. Even then, F<cpp> won't | |
739 | recursively apply those macros for you. | |
740 | ||
741 | =head2 gdb macro support | |
742 | ||
743 | Recent versions of F<gdb> have fairly good macro support, but in order | |
744 | to use it you'll need to compile perl with macro definitions included | |
745 | in the debugging information. Using F<gcc> version 3.1, this means | |
746 | configuring with C<-Doptimize=-g3>. Other compilers might use a | |
747 | different switch (if they support debugging macros at all). | |
748 | ||
749 | =head2 Dumping Perl Data Structures | |
750 | ||
751 | One way to get around this macro hell is to use the dumping functions | |
752 | in F<dump.c>; these work a little like an internal | |
753 | L<Devel::Peek|Devel::Peek>, but they also cover OPs and other | |
754 | structures that you can't get at from Perl. Let's take an example. | |
755 | We'll use the C<$a = $b + $c> we used before, but give it a bit of | |
756 | context: C<$b = "6XXXX"; $c = 2.3;>. Where's a good place to stop and | |
757 | poke around? | |
758 | ||
759 | What about C<pp_add>, the function we examined earlier to implement the | |
760 | C<+> operator: | |
761 | ||
762 | (gdb) break Perl_pp_add | |
763 | Breakpoint 1 at 0x46249f: file pp_hot.c, line 309. | |
764 | ||
765 | Notice we use C<Perl_pp_add> and not C<pp_add> - see | |
766 | L<perlguts/Internal Functions>. With the breakpoint in place, we can | |
767 | run our program: | |
768 | ||
769 | (gdb) run -e '$b = "6XXXX"; $c = 2.3; $a = $b + $c' | |
770 | ||
771 | Lots of junk will go past as gdb reads in the relevant source files and | |
772 | libraries, and then: | |
773 | ||
774 | Breakpoint 1, Perl_pp_add () at pp_hot.c:309 | |
775 | 309 dSP; dATARGET; tryAMAGICbin(add,opASSIGN); | |
776 | (gdb) step | |
777 | 311 dPOPTOPnnrl_ul; | |
778 | (gdb) | |
779 | ||
780 | We looked at this bit of code before, and we said that | |
781 | C<dPOPTOPnnrl_ul> arranges for two C<NV>s to be placed into C<left> and | |
782 | C<right> - let's slightly expand it: | |
783 | ||
784 | #define dPOPTOPnnrl_ul NV right = POPn; \ | |
785 | SV *leftsv = TOPs; \ | |
786 | NV left = USE_LEFT(leftsv) ? SvNV(leftsv) : 0.0 | |
787 | ||
788 | C<POPn> takes the SV from the top of the stack and obtains its NV | |
789 | either directly (if C<SvNOK> is set) or by calling the C<sv_2nv> | |
790 | function. C<TOPs> takes the next SV from the top of the stack - yes, | |
791 | C<POPn> uses C<TOPs> - but doesn't remove it. We then use C<SvNV> to | |
792 | get the NV from C<leftsv> in the same way as before - yes, C<POPn> uses | |
793 | C<SvNV>. | |
794 | ||
795 | Since we don't have an NV for C<$b>, we'll have to use C<sv_2nv> to | |
796 | convert it. If we step again, we'll find ourselves there: | |
797 | ||
798 | Perl_sv_2nv (sv=0xa0675d0) at sv.c:1669 | |
799 | 1669 if (!sv) | |
800 | (gdb) | |
801 | ||
802 | We can now use C<Perl_sv_dump> to investigate the SV: | |
803 | ||
804 | SV = PV(0xa057cc0) at 0xa0675d0 | |
805 | REFCNT = 1 | |
806 | FLAGS = (POK,pPOK) | |
807 | PV = 0xa06a510 "6XXXX"\0 | |
808 | CUR = 5 | |
809 | LEN = 6 | |
810 | $1 = void | |
811 | ||
812 | We know we're going to get C<6> from this, so let's finish the | |
813 | subroutine: | |
814 | ||
815 | (gdb) finish | |
816 | Run till exit from #0 Perl_sv_2nv (sv=0xa0675d0) at sv.c:1671 | |
817 | 0x462669 in Perl_pp_add () at pp_hot.c:311 | |
818 | 311 dPOPTOPnnrl_ul; | |
819 | ||
820 | We can also dump out this op: the current op is always stored in | |
821 | C<PL_op>, and we can dump it with C<Perl_op_dump>. This'll give us | |
822 | similar output to L<B::Debug|B::Debug>. | |
823 | ||
824 | { | |
825 | 13 TYPE = add ===> 14 | |
826 | TARG = 1 | |
827 | FLAGS = (SCALAR,KIDS) | |
828 | { | |
829 | TYPE = null ===> (12) | |
830 | (was rv2sv) | |
831 | FLAGS = (SCALAR,KIDS) | |
832 | { | |
833 | 11 TYPE = gvsv ===> 12 | |
834 | FLAGS = (SCALAR) | |
835 | GV = main::b | |
836 | } | |
837 | } | |
838 | ||
839 | # finish this later # | |
840 | ||
841 | =head1 SOURCE CODE STATIC ANALYSIS | |
842 | ||
843 | Various tools exist for analysing C source code B<statically>, as | |
844 | opposed to B<dynamically>, that is, without executing the code. It is | |
845 | possible to detect resource leaks, undefined behaviour, type | |
846 | mismatches, portability problems, code paths that would cause illegal | |
847 | memory accesses, and other similar problems by just parsing the C code | |
848 | and examining what the resulting graph tells us about the execution | |
849 | and data flows. As a matter of fact, this is exactly how C compilers | |
850 | know to give warnings about dubious code. | |
851 | ||
852 | =head2 lint, splint | |
853 | ||
854 | The good old C code quality inspector, C<lint>, is available on several | |
855 | platforms, but please be aware that there are several different | |
856 | implementations of it by different vendors, which means that the flags | |
857 | are not identical across different platforms. | |
858 | ||
859 | There is a lint variant called C<splint> (Secure Programming Lint) | |
860 | available from http://www.splint.org/ that should compile on any | |
861 | Unix-like platform. | |
862 | ||
863 | There are C<lint> and C<splint> targets in Makefile, but you may have to | |
864 | diddle with the flags (see above). | |
865 | ||
866 | =head2 Coverity | |
867 | ||
868 | Coverity (http://www.coverity.com/) is a product similar to lint. As a | |
869 | testbed for the product, the company periodically checks several open | |
870 | source projects, and gives open source developers accounts on the | |
871 | defect databases. | |
872 | ||
873 | =head2 cpd (cut-and-paste detector) | |
874 | ||
875 | The cpd tool detects cut-and-paste coding. If one instance of the | |
876 | cut-and-pasted code changes, all the other spots should probably be | |
877 | changed, too. Therefore such code should probably be turned into a | |
878 | subroutine or a macro. | |
879 | ||
880 | cpd (http://pmd.sourceforge.net/cpd.html) is part of the pmd project | |
881 | (http://pmd.sourceforge.net/). pmd was originally written for static | |
882 | analysis of Java code, but later the cpd part of it was extended to | |
883 | also parse C and C++. | |
884 | ||
885 | Download the pmd-bin-X.Y.zip from the SourceForge site, extract the | |
886 | pmd-X.Y.jar from it, and then run that on source code thusly: | |
887 | ||
888 | java -cp pmd-X.Y.jar net.sourceforge.pmd.cpd.CPD --minimum-tokens 100 --files /some/where/src --language c > cpd.txt | |
889 | ||
890 | You may run into memory limits, in which case you should use the -Xmx | |
891 | option: | |
892 | ||
893 | java -Xmx512M ... | |
894 | ||
895 | =head2 gcc warnings | |
896 | ||
897 | Though much can be written about the inconsistency and coverage | |
898 | problems of gcc warnings (like C<-Wall> not meaning "all the warnings", | |
899 | or some common portability problems not being covered by C<-Wall>, or | |
900 | C<-ansi> and C<-pedantic> both being a poorly defined collection of | |
901 | warnings, and so forth), gcc is still a useful tool in keeping our | |
902 | coding nose clean. | |
903 | ||
904 | C<-Wall> is on by default. | |
905 | ||
906 | The C<-ansi> flag (and its sidekick, C<-pedantic>) would be nice to have | |
907 | on always, but unfortunately they are not safe on all platforms: they | |
908 | can, for example, cause fatal conflicts with the system headers (Solaris | |
909 | being a prime example). If Configure C<-Dgccansipedantic> is used, the | |
910 | C<cflags> frontend selects C<-ansi -pedantic> for the platforms where | |
911 | they are known to be safe. | |
912 | ||
913 | Starting from Perl 5.9.4 the following extra flags are added: | |
914 | ||
915 | =over 4 | |
916 | ||
917 | =item * | |
918 | ||
919 | C<-Wendif-labels> | |
920 | ||
921 | =item * | |
922 | ||
923 | C<-Wextra> | |
924 | ||
925 | =item * | |
926 | ||
927 | C<-Wdeclaration-after-statement> | |
928 | ||
929 | =back | |
930 | ||
931 | The following flags would be nice to have but they would first need | |
932 | their own Augean stablemaster: | |
933 | ||
934 | =over 4 | |
935 | ||
936 | =item * | |
937 | ||
938 | C<-Wpointer-arith> | |
939 | ||
940 | =item * | |
941 | ||
942 | C<-Wshadow> | |
943 | ||
944 | =item * | |
945 | ||
946 | C<-Wstrict-prototypes> | |
947 | ||
948 | =back | |
949 | ||
950 | The C<-Wtraditional> is another example of the annoying tendency of gcc | |
951 | to bundle a lot of warnings under one switch (it would be impossible to | |
952 | deploy in practice because it would complain a lot) but it does contain | |
953 | some warnings that would be beneficial to have available on their own, | |
954 | such as the warning about string constants inside macros containing the | |
955 | macro arguments: this behaved differently pre-ANSI than it does in | |
956 | ANSI, and some C compilers are still in transition, AIX being an | |
957 | example. | |
958 | ||
959 | =head2 Warnings of other C compilers | |
960 | ||
961 | Other C compilers (yes, there B<are> other C compilers than gcc) often | |
962 | have their "strict ANSI" or "strict ANSI with some portability | |
963 | extensions" modes on: for example, Sun Workshop has its C<-Xa> mode | |
964 | on (though implicitly), and the DEC (these days, HP...) compiler has | |
965 | its C<-std1> mode on. | |
966 | ||
967 | =head1 MEMORY DEBUGGERS | |
968 | ||
969 | B<NOTE 1>: Running under memory debuggers such as Purify, valgrind, or | |
970 | Third Degree greatly slows down the execution: seconds become minutes, | |
971 | minutes become hours. For example as of Perl 5.8.1, the | |
972 | ext/Encode/t/Unicode.t takes extraordinarily long to complete under | |
973 | e.g. Purify, Third Degree, and valgrind. Under valgrind it takes more | |
974 | than six hours, even on a snappy computer. The said test must be doing | |
975 | something that is quite unfriendly for memory debuggers. If you don't | |
976 | feel like waiting, you can simply kill the perl process. | |
977 | ||
978 | B<NOTE 2>: To minimize the number of memory leak false alarms (see | |
979 | L</PERL_DESTRUCT_LEVEL> for more information), you have to set the | |
980 | environment variable PERL_DESTRUCT_LEVEL to 2. | |
981 | ||
982 | For csh-like shells: | |
983 | ||
984 | setenv PERL_DESTRUCT_LEVEL 2 | |
985 | ||
986 | For Bourne-type shells: | |
987 | ||
988 | PERL_DESTRUCT_LEVEL=2 | |
989 | export PERL_DESTRUCT_LEVEL | |
990 | ||
991 | In Unixy environments you can also use the C<env> command: | |
992 | ||
993 | env PERL_DESTRUCT_LEVEL=2 valgrind ./perl -Ilib ... | |
994 | ||
995 | B<NOTE 3>: There are known memory leaks when there are compile-time | |
996 | errors within eval or require; seeing C<S_doeval> in the call stack is | |
997 | a good sign of these. Fixing these leaks is non-trivial, unfortunately, | |
998 | but they must be fixed eventually. | |
999 | ||
1000 | B<NOTE 4>: L<DynaLoader> will not clean up after itself completely | |
1001 | unless Perl is built with the Configure option | |
1002 | C<-Accflags=-DDL_UNLOAD_ALL_AT_EXIT>. | |
1003 | ||
1004 | =head2 Rational Software's Purify | |
1005 | ||
1006 | Purify is a commercial tool that is helpful in identifying memory | |
1007 | overruns, wild pointers, memory leaks and other such badness. Perl must | |
1008 | be compiled in a specific way for optimal testing with Purify. Purify | |
1009 | is available under Windows NT, Solaris, HP-UX, SGI, and Siemens Unix. | |
1010 | ||
1011 | =head3 Purify on Unix | |
1012 | ||
1013 | On Unix, Purify creates a new Perl binary. To get the most benefit out | |
1014 | of Purify, you should create the perl to Purify using: | |
1015 | ||
1016 | sh Configure -Accflags=-DPURIFY -Doptimize='-g' \ | |
1017 | -Uusemymalloc -Dusemultiplicity | |
1018 | ||
1019 | where these arguments mean: | |
1020 | ||
1021 | =over 4 | |
1022 | ||
1023 | =item * -Accflags=-DPURIFY | |
1024 | ||
1025 | Disables Perl's arena memory allocation functions, as well as forcing | |
1026 | use of memory allocation functions derived from the system malloc. | |
1027 | ||
1028 | =item * -Doptimize='-g' | |
1029 | ||
1030 | Adds debugging information so that you see the exact source statements | |
1031 | where the problem occurs. Without this flag, all you will see is the | |
1032 | source filename of where the error occurred. | |
1033 | ||
1034 | =item * -Uusemymalloc | |
1035 | ||
1036 | Disable Perl's malloc so that Purify can more closely monitor | |
1037 | allocations and leaks. Using Perl's malloc will make Purify report most | |
1038 | leaks in the "potential" leaks category. | |
1039 | ||
1040 | =item * -Dusemultiplicity | |
1041 | ||
1042 | Enabling the multiplicity option allows perl to clean up thoroughly | |
1043 | when the interpreter shuts down, which reduces the number of bogus leak | |
1044 | reports from Purify. | |
1045 | ||
1046 | =back | |
1047 | ||
1048 | Once you've compiled a perl suitable for Purify'ing, then you can just: | |
1049 | ||
1050 | make pureperl | |
1051 | ||
1052 | which creates a binary named 'pureperl' that has been Purify'ed. This | |
1053 | binary is used in place of the standard 'perl' binary when you want to | |
1054 | debug Perl memory problems. | |
1055 | ||
1056 | As an example, to show any memory leaks produced during the standard | |
1057 | Perl testset you would create and run the Purify'ed perl as: | |
1058 | ||
1059 | make pureperl | |
1060 | cd t | |
1061 | ../pureperl -I../lib harness | |
1062 | ||
1063 | which would run Perl on test.pl and report any memory problems. | |
1064 | ||
1065 | Purify outputs messages in "Viewer" windows by default. If you don't | |
1066 | have a windowing environment or if you simply want the Purify output to | |
1067 | unobtrusively go to a log file instead of to the interactive window, | |
1068 | use the following options to output to the log file "perl.log": | |
1069 | ||
1070 | setenv PURIFYOPTIONS "-chain-length=25 -windows=no \ | |
1071 | -log-file=perl.log -append-logfile=yes" | |
1072 | ||
1073 | If you plan to use the "Viewer" windows, then you only need this | |
1074 | option: | |
1075 | ||
1076 | setenv PURIFYOPTIONS "-chain-length=25" | |
1077 | ||
1078 | In Bourne-type shells: | |
1079 | ||
1080 | PURIFYOPTIONS="..." | |
1081 | export PURIFYOPTIONS | |
1082 | ||
1083 | or if you have the "env" utility: | |
1084 | ||
1085 | env PURIFYOPTIONS="..." ../pureperl ... | |
1086 | ||
1087 | =head3 Purify on NT | |
1088 | ||
1089 | Purify on Windows NT instruments the Perl binary 'perl.exe' on the fly. | |
1090 | There are several options in the makefile you should change to get the | |
1091 | most use out of Purify: | |
1092 | ||
1093 | =over 4 | |
1094 | ||
1095 | =item * DEFINES | |
1096 | ||
1097 | You should add -DPURIFY to the DEFINES line so the DEFINES line looks | |
1098 | something like: | |
1099 | ||
1100 | DEFINES = -DWIN32 -D_CONSOLE -DNO_STRICT $(CRYPT_FLAG) -DPURIFY=1 | |
1101 | ||
1102 | to disable Perl's arena memory allocation functions, as well as to | |
1103 | force use of memory allocation functions derived from the system | |
1104 | malloc. | |
1105 | ||
1106 | =item * USE_MULTI = define | |
1107 | ||
1108 | Enabling the multiplicity option allows perl to clean up thoroughly | |
1109 | when the interpreter shuts down, which reduces the number of bogus leak | |
1110 | reports from Purify. | |
1111 | ||
1112 | =item * #PERL_MALLOC = define | |
1113 | ||
1114 | Disable Perl's malloc so that Purify can more closely monitor | |
1115 | allocations and leaks. Using Perl's malloc will make Purify report most | |
1116 | leaks in the "potential" leaks category. | |
1117 | ||
1118 | =item * CFG = Debug | |
1119 | ||
1120 | Adds debugging information so that you see the exact source statements | |
1121 | where the problem occurs. Without this flag, all you will see is the | |
1122 | source filename of where the error occurred. | |
1123 | ||
1124 | =back | |
1125 | ||
1126 | As an example, to show any memory leaks produced during the standard | |
1127 | Perl testset you would create and run Purify as: | |
1128 | ||
1129 | cd win32 | |
1130 | make | |
1131 | cd ../t | |
1132 | purify ../perl -I../lib harness | |
1133 | ||
1134 | which would instrument Perl in memory, run Perl on test.pl, then | |
1135 | finally report any memory problems. | |
1136 | ||
1137 | =head2 valgrind | |
1138 | ||
1139 | The excellent valgrind tool can be used to find out both memory leaks | |
1140 | and illegal memory accesses. As of version 3.3.0, Valgrind only | |
1141 | supports Linux on x86, x86-64 and PowerPC. The special "test.valgrind" | |
1142 | target can be used to run the tests under valgrind. Found errors and | |
1143 | memory leaks are logged in files named F<testfile.valgrind>. | |
1144 | ||
1145 | Valgrind also provides a cachegrind tool, invoked on perl as: | |
1146 | ||
1147 | VG_OPTS=--tool=cachegrind make test.valgrind | |
1148 | ||
1149 | As system libraries (most notably glibc) also trigger errors, | |
1150 | valgrind allows such errors to be suppressed using suppression files. The | |
1151 | default suppression file that comes with valgrind already catches a lot | |
1152 | of them. Some additional suppressions are defined in F<t/perl.supp>. | |
1153 | ||
1154 | To get valgrind and for more information see | |
1155 | ||
1156 | http://developer.kde.org/~sewardj/ | |
1157 | ||
1158 | =head1 PROFILING | |
1159 | ||
1160 | Depending on your platform there are various ways of profiling Perl. | |
1161 | ||
1162 | There are two commonly used techniques of profiling executables: | |
1163 | I<statistical time-sampling> and I<basic-block counting>. | |
1164 | ||
1165 | The first method periodically takes samples of the CPU program counter, | |
1166 | and since the program counter can be correlated with the code generated | |
1167 | for functions, we get a statistical view of which functions the | |
1168 | program is spending its time in. The caveats are that very small/fast | |
1169 | functions have a lower probability of showing up in the profile, and that | |
1170 | periodically interrupting the program (this is usually done rather | |
1171 | frequently, in the scale of milliseconds) imposes an additional | |
1172 | overhead that may skew the results. The first problem can be alleviated | |
1173 | by running the code for longer (in general this is a good idea for | |
1174 | profiling); the second problem is usually kept in check by the | |
1175 | profiling tools themselves. | |
1176 | ||
1177 | The second method divides up the generated code into I<basic blocks>. | |
1178 | Basic blocks are sections of code that are entered only at the | |
1179 | beginning and exited only at the end. For example, a conditional jump | |
1180 | starts a basic block. Basic block profiling usually works by | |
1181 | I<instrumenting> the code by adding I<enter basic block #nnnn> | |
1182 | book-keeping code to the generated code. During the execution of the | |
1183 | code the basic block counters are then updated appropriately. The | |
1184 | caveat is that the added extra code can skew the results: again, the | |
1185 | profiling tools usually try to factor their own effects out of the | |
1186 | results. | |
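As a rough illustration of basic-block counting, the instrumentation can be written by hand. This is only a sketch: real tools insert the counters automatically during compilation, and the names C<bb_hits>, C<ENTER_BB> and C<count_even> are made up for this example.

```c
#include <stddef.h>

/* One counter per hand-labelled basic block (hypothetical labels). */
enum { BB_ENTRY, BB_LOOP_BODY, BB_EVEN, BB_ODD, BB_EXIT, BB_COUNT };

static unsigned long bb_hits[BB_COUNT];

/* The "enter basic block #nnnn" book-keeping code. */
#define ENTER_BB(id) (bb_hits[(id)]++)

/* Counts the even numbers in 0 .. n-1, bumping a counter per block. */
static int count_even(int n)
{
    int i, evens = 0;

    ENTER_BB(BB_ENTRY);
    for (i = 0; i < n; i++) {
        ENTER_BB(BB_LOOP_BODY);
        if (i % 2 == 0) {        /* the conditional jump splits the flow */
            ENTER_BB(BB_EVEN);
            evens++;
        }
        else {
            ENTER_BB(BB_ODD);
        }
    }
    ENTER_BB(BB_EXIT);
    return evens;
}
```

After a run, C<bb_hits> holds exact execution counts for each block, unlike the estimates that time-sampling yields; the increments themselves are the added overhead mentioned above.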
1187 | ||
1188 | =head2 Gprof Profiling | |
1189 | ||
1190 | gprof is a profiling tool available on many Unix platforms; it uses | |
1191 | I<statistical time-sampling>. | |
1192 | ||
1193 | You can build a profiled version of perl called "perl.gprof" by | |
1194 | invoking the make target "perl.gprof". (What is required is that Perl | |
1195 | be compiled using the C<-pg> flag; you may need to re-Configure.) | |
1196 | Running the profiled version of Perl will create an output file called | |
1197 | F<gmon.out>, which contains the profiling data collected during the | |
1198 | execution. | |
1199 | ||
1200 | The gprof tool can then display the collected data in various ways. | |
1201 | Usually gprof understands the following options: | |
1202 | ||
1203 | =over 4 | |
1204 | ||
1205 | =item * -a | |
1206 | ||
1207 | Suppress statically defined functions from the profile. | |
1208 | ||
1209 | =item * -b | |
1210 | ||
1211 | Suppress the verbose descriptions in the profile. | |
1212 | ||
1213 | =item * -e routine | |
1214 | ||
1215 | Exclude the given routine and its descendants from the profile. | |
1216 | ||
1217 | =item * -f routine | |
1218 | ||
1219 | Display only the given routine and its descendants in the profile. | |
1220 | ||
1221 | =item * -s | |
1222 | ||
1223 | Generate a summary file called F<gmon.sum> which then may be given to | |
1224 | subsequent gprof runs to accumulate data over several runs. | |
1225 | ||
1226 | =item * -z | |
1227 | ||
1228 | Display routines that have zero usage. | |
1229 | ||
1230 | =back | |
1231 | ||
1232 | For a more detailed explanation of the available commands and output | |
1233 | formats, see your own local documentation of gprof. | |
1234 | ||
1235 | quick hint: | |
1236 | ||
1237 | $ sh Configure -des -Dusedevel -Doptimize='-pg' && make perl.gprof | |
1238 | $ ./perl.gprof someprog # creates gmon.out in current directory | |
1239 | $ gprof ./perl.gprof > out | |
1240 | $ view out | |
1241 | ||
1242 | =head2 GCC gcov Profiling | |
1243 | ||
1244 | Starting from GCC 3.0 I<basic block profiling> is officially available | |
1245 | for the GNU CC. | |
1246 | ||
1247 | You can build a profiled version of perl called F<perl.gcov> by | |
1248 | invoking the make target "perl.gcov" (what is required is that Perl | |
1249 | be compiled using gcc with the flags C<-fprofile-arcs -ftest-coverage>; | |
1250 | you may need to re-Configure). | |
1251 | ||
1252 | Running the profiled version of Perl will cause profile output to be | |
1253 | generated. For each source file an accompanying ".da" file will be | |
1254 | created. | |
1255 | ||
1256 | To display the results you use the "gcov" utility (which should be | |
1257 | installed if you have gcc 3.0 or newer installed). F<gcov> is run on | |
1258 | source code files, like this | |
1259 | ||
1260 | gcov sv.c | |
1261 | ||
1262 | which will cause F<sv.c.gcov> to be created. The F<.gcov> files contain | |
1263 | the source code annotated with relative frequencies of execution | |
1264 | indicated by "#" markers. | |
1265 | ||
1266 | Useful options of F<gcov> include C<-b> which will summarise the basic | |
1267 | block, branch, and function call coverage, and C<-c> which instead of | |
1268 | relative frequencies will use the actual counts. For more information | |
1269 | on the use of F<gcov> and basic block profiling with gcc, see the | |
1270 | latest GNU CC manual, as of GCC 3.0 see | |
1271 | ||
1272 | http://gcc.gnu.org/onlinedocs/gcc-3.0/gcc.html | |
1273 | ||
1274 | and its section titled "8. gcov: a Test Coverage Program" | |
1275 | ||
1276 | http://gcc.gnu.org/onlinedocs/gcc-3.0/gcc_8.html#SEC132 | |
1277 | ||
1278 | quick hint: | |
1279 | ||
1280 | $ sh Configure -des -Dusedevel -Doptimize='-g' \ | |
1281 | -Accflags='-fprofile-arcs -ftest-coverage' \ | |
1282 | -Aldflags='-fprofile-arcs -ftest-coverage' && make perl.gcov | |
1283 | $ rm -f regexec.c.gcov regexec.gcda | |
1284 | $ ./perl.gcov | |
1285 | $ gcov regexec.c | |
1286 | $ view regexec.c.gcov | |
1287 | ||
1288 | =head1 MISCELLANEOUS TRICKS | |
1289 | ||
1290 | =head2 PERL_DESTRUCT_LEVEL | |
1291 | ||
1292 | If you want to run any of the tests yourself manually using e.g. | |
1293 | valgrind, or the pureperl or perl.third executables, please note that | |
1294 | by default perl B<does not> explicitly clean up all the memory it has | |
1295 | allocated (such as global memory arenas) but instead lets the exit() of | |
1296 | the whole program "take care" of such allocations, also known as | |
1297 | "global destruction of objects". | |
1298 | ||
1299 | There is a way to tell perl to do complete cleanup: set the environment | |
1300 | variable PERL_DESTRUCT_LEVEL to a non-zero value. The t/TEST wrapper | |
1301 | sets this to 2, and this is what you need to do too if you don't | |
1302 | want to see the "global leaks". For example, for "third-degreed" Perl: | |
1303 | ||
1304 | env PERL_DESTRUCT_LEVEL=2 ./perl.third -Ilib t/foo/bar.t | |
1305 | ||
1306 | (Note: the mod_perl apache module also uses this environment variable | |
1307 | for its own purposes and extends its semantics. Refer to the mod_perl | |
1308 | documentation for more information. Also, spawned threads do the | |
1309 | equivalent of setting this variable to the value 1.) | |
1310 | ||
1311 | If, at the end of a run you get the message I<N scalars leaked>, you | |
1312 | can recompile with C<-DDEBUG_LEAKING_SCALARS>, which will cause the | |
1313 | addresses of all those leaked SVs to be dumped along with details as to | |
1314 | where each SV was originally allocated. This information is also | |
1315 | displayed by Devel::Peek. Note that the extra details recorded with | |
1316 | each SV increase memory usage, so it shouldn't be used in production | |
1317 | environments. It also converts C<new_SV()> from a macro into a real | |
1318 | function, so you can use your favourite debugger to discover where | |
1319 | those pesky SVs were allocated. | |
1320 | ||
1321 | If you see that you're leaking memory at runtime, but neither valgrind | |
1322 | nor C<-DDEBUG_LEAKING_SCALARS> will find anything, you're probably | |
1323 | leaking SVs that are still reachable and will be properly cleaned up | |
1324 | during destruction of the interpreter. In such cases, using the C<-Dm> | |
1325 | switch can point you to the source of the leak. If the executable was | |
1326 | built with C<-DDEBUG_LEAKING_SCALARS>, C<-Dm> will output SV | |
1327 | allocations in addition to memory allocations. Each SV allocation has a | |
1328 | distinct serial number that will be written on creation and destruction | |
1329 | of the SV. So if you're executing the leaking code in a loop, you need | |
1330 | to look for SVs that are created, but never destroyed between each | |
1331 | cycle. If such an SV is found, set a conditional breakpoint within | |
1332 | C<new_SV()> and make it break only when C<PL_sv_serial> is equal to the | |
1333 | serial number of the leaking SV. Then you will catch the interpreter in | |
1334 | exactly the state where the leaking SV is allocated, which is | |
1335 | sufficient in many cases to find the source of the leak. | |
1336 | ||
1337 | As C<-Dm> is using the PerlIO layer for output, it will by itself | |
1338 | allocate quite a bunch of SVs, which are hidden to avoid recursion. You | |
1339 | can bypass the PerlIO layer if you use the SV logging provided by | |
1340 | C<-DPERL_MEM_LOG> instead. | |
1341 | ||
1342 | =head2 PERL_MEM_LOG | |
1343 | ||
1344 | If compiled with C<-DPERL_MEM_LOG>, both memory and SV allocations go | |
1345 | through logging functions, which is handy for breakpoint setting. | |
1346 | ||
1347 | Unless C<-DPERL_MEM_LOG_NOIMPL> is also defined, the logging functions | |
1348 | read $ENV{PERL_MEM_LOG} to determine whether to log the event, and if | |
1349 | so how: | |
1350 | ||
1351 | $ENV{PERL_MEM_LOG} =~ /m/ Log all memory ops | |
1352 | $ENV{PERL_MEM_LOG} =~ /s/ Log all SV ops | |
1353 | $ENV{PERL_MEM_LOG} =~ /t/ Include timestamp in log | |
1354 | $ENV{PERL_MEM_LOG} =~ /^(\d+)/ write to FD given (default is 2) | |
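The matching rules above can be sketched in C as follows. This is a minimal sketch of how such a setting string might be interpreted, not the code perl actually uses: the C<memlog_opts> struct and the C<parse_memlog> function are hypothetical names invented for the illustration.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical container for the decoded setting. */
typedef struct {
    int log_mem;      /* 'm' present: log memory ops */
    int log_sv;       /* 's' present: log SV ops */
    int timestamp;    /* 't' present: include a timestamp */
    int fd;           /* leading digits, default 2 (stderr) */
} memlog_opts;

/* Decode a PERL_MEM_LOG-style string such as "3ms" or "st". */
static memlog_opts parse_memlog(const char *spec)
{
    memlog_opts o = { 0, 0, 0, 2 };
    const char *p = spec;

    if (spec == NULL)
        return o;

    /* /^(\d+)/ : a leading run of digits selects the file descriptor */
    if (isdigit((unsigned char)*p)) {
        o.fd = 0;
        while (isdigit((unsigned char)*p))
            o.fd = o.fd * 10 + (*p++ - '0');
    }

    /* /m/, /s/, /t/ : each letter may appear anywhere in the string */
    o.log_mem   = strchr(spec, 'm') != NULL;
    o.log_sv    = strchr(spec, 's') != NULL;
    o.timestamp = strchr(spec, 't') != NULL;
    return o;
}
```

With this sketch, C<parse_memlog("3ms")> selects file descriptor 3 with memory and SV logging on, while C<parse_memlog("st")> keeps the default descriptor 2 and turns on SV logging with timestamps.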
1355 | ||
1356 | Memory logging is somewhat similar to C<-Dm> but is independent of | |
1357 | C<-DDEBUGGING>, and at a higher level; all uses of Newx(), Renew(), and | |
1358 | Safefree() are logged with the caller's source code file and line | |
1359 | number (and C function name, if supported by the C compiler). In | |
1360 | contrast, C<-Dm> is directly at the point of C<malloc()>. SV logging is | |
1361 | similar. | |
1362 | ||
1363 | Since the logging doesn't use PerlIO, all SV allocations are logged and | |
1364 | no extra SV allocations are introduced by enabling the logging. If | |
1365 | compiled with C<-DDEBUG_LEAKING_SCALARS>, the serial number for each SV | |
1366 | allocation is also logged. | |
1367 | ||
1368 | =head2 DDD over gdb | |
1369 | ||
1370 | Those debugging perl with the DDD frontend over gdb may find the | |
1371 | following useful: | |
1372 | ||
1373 | You can extend the data conversion shortcuts menu, so for example you | |
1374 | can display an SV's IV value with one click, without doing any typing. | |
1375 | To do that simply edit the F<~/.ddd/init> file and add after: | |
1376 | ||
1377 | ! Display shortcuts. | |
1378 | Ddd*gdbDisplayShortcuts: \ | |
1379 | /t () // Convert to Bin\n\ | |
1380 | /d () // Convert to Dec\n\ | |
1381 | /x () // Convert to Hex\n\ | |
1382 | /o () // Convert to Oct\n\ | |
1383 | ||
1384 | the following two lines: | |
1385 | ||
1386 | ((XPV*) (())->sv_any )->xpv_pv // 2pvx\n\ | |
1387 | ((XPVIV*) (())->sv_any )->xiv_iv // 2ivx | |
1388 | ||
1389 | so now you can do ivx and pvx lookups, or you can plug in the sv_peek | |
1390 | "conversion": | |
1391 | ||
1392 | Perl_sv_peek(my_perl, (SV*)()) // sv_peek | |
1393 | ||
1394 | (The my_perl is for threaded builds.) Just remember that every line, | |
1395 | but the last one, should end with \n\ | |
1396 | ||
1397 | Alternatively edit the init file interactively via: 3rd mouse button -> | |
1398 | New Display -> Edit Menu | |
1399 | ||
1400 | Note: you can define up to 20 conversion shortcuts in the gdb section. | |
1401 | ||
1402 | =head2 Poison | |
1403 | ||
1404 | If you see in a debugger a memory area mysteriously full of 0xABABABAB | |
1405 | or 0xEFEFEFEF, you may be seeing the effect of the Poison() macros; see | |
1406 | L<perlclib>. | |
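To see why poisoned memory shows up as those repeated patterns, here is a toy sketch. It does not use Perl's actual macros: C<toy_poison> and C<toy_peek32> are hypothetical helpers, and the two byte values are simply the ones visible in the patterns above.

```c
#include <string.h>

/* The byte values behind the two patterns mentioned above. */
#define TOY_POISON_A 0xAB
#define TOY_POISON_B 0xEF

/* Fill n bytes at p with one repeated byte. */
static void toy_poison(void *p, size_t n, int byte)
{
    memset(p, byte, n);
}

/* Assemble the first four bytes into one value, the way a debugger's
 * hex view of a 32-bit word would present them. */
static unsigned long toy_peek32(const void *p)
{
    const unsigned char *b = (const unsigned char *)p;

    return ((unsigned long)b[0] << 24) | ((unsigned long)b[1] << 16)
         | ((unsigned long)b[2] << 8)  |  (unsigned long)b[3];
}
```

Because every byte is identical, the word reads the same regardless of byte order, which is what makes such memory easy to spot in a debugger.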
1407 | ||
1408 | =head2 Read-only optrees | |
1409 | ||
1410 | Under ithreads the optree is read only. If you want to enforce this, to | |
1411 | check for write accesses from buggy code, compile with | |
1412 | C<-DPL_OP_SLAB_ALLOC> to enable the OP slab allocator and | |
1413 | C<-DPERL_DEBUG_READONLY_OPS> to enable code that allocates op memory | |
1414 | via C<mmap>, and sets it read-only at run time. Any write access to an | |
1415 | op results in a C<SIGBUS> and abort. | |
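The underlying mechanism can be sketched with plain POSIX calls. This is an illustration of the general C<mmap> plus C<mprotect> technique under stated assumptions, not the code perl uses: the function name C<write_to_readonly_is_fatal> is made up, C<MAP_ANONYMOUS> is assumed to be available, and the exact signal delivered (C<SIGBUS> or C<SIGSEGV>) varies by platform. The faulting write is done in a forked child so that the caller survives to observe the result.

```c
#define _DEFAULT_SOURCE          /* for MAP_ANONYMOUS on glibc */
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if writing to the read-only page killed the child with a
 * signal, 0 if the write went through, -1 on setup failure. */
static int write_to_readonly_is_fatal(void)
{
    size_t len = (size_t)sysconf(_SC_PAGESIZE);
    char *page;
    pid_t pid;
    int status;

    page = mmap(NULL, len, PROT_READ | PROT_WRITE,
                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED)
        return -1;

    strcpy(page, "op");                      /* writable while "building" */

    if (mprotect(page, len, PROT_READ) != 0) {   /* now read-only */
        munmap(page, len);
        return -1;
    }

    pid = fork();
    if (pid < 0) {
        munmap(page, len);
        return -1;
    }
    if (pid == 0) {
        ((volatile char *)page)[0] = 'X';    /* the "buggy" write */
        _exit(0);                            /* reached only if it worked */
    }

    if (waitpid(pid, &status, 0) != pid) {
        munmap(page, len);
        return -1;
    }
    munmap(page, len);
    return WIFSIGNALED(status) ? 1 : 0;
}
```

On a typical Unix system the function is expected to return 1, since the child's write faults as soon as the page has been made read-only.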
1416 | ||
1417 | This code is intended for development only, and may not be portable | |
1418 | even to all Unix variants. Also, it is an 80% solution, in that it | |
1419 | isn't able to make all ops read only. Specifically it | |
1420 | ||
1421 | =over | |
1422 | ||
1423 | =item * 1 | |
1424 | ||
1425 | Only sets read-only on all slabs of ops at C<CHECK> time, hence ops | |
1426 | allocated later via C<require> or C<eval> will be read-write. | |
1427 | ||
1428 | =item * 2 | |
1429 | ||
1430 | Turns an entire slab of ops read-write if the refcount of any op in the | |
1431 | slab needs to be decreased. | |
1432 | ||
1433 | =item * 3 | |
1434 | ||
1435 | Turns an entire slab of ops read-write if any op from the slab is | |
1436 | freed. | |
1437 | ||
1438 | =back | |
1439 | ||
1440 | It's not possible to turn the slabs to read-only after an action | |
1441 | requiring read-write access, as either can happen during op tree | |
1442 | building time, so there may still be legitimate write access. | |
1443 | ||
1444 | However, as an 80% solution it is still effective, as currently it | |
1445 | catches a write access during the generation of F<Config.pm>, which | |
1446 | means that we can't yet build F<perl> with this enabled. | |
1447 | ||
1448 | =head2 The .i Targets | |
1449 | ||
1450 | You can expand the macros in a F<foo.c> file by saying | |
1451 | ||
1452 | make foo.i | |
1453 | ||
1454 | which will expand the macros using cpp. Don't be scared by the results. | |
1455 | ||
1456 | =head1 AUTHOR | |
1457 | ||
1458 | This document was originally written by Nathan Torkington, and is | |
1459 | maintained by the perl5-porters mailing list. |