| 1 | |
| 2 | =encoding utf8 |
| 3 | |
| 4 | =for comment |
| 5 | Consistent formatting of this file is achieved with: |
| 6 | perl ./Porting/podtidy pod/perlhacktips.pod |
| 7 | |
| 8 | =head1 NAME |
| 9 | |
| 10 | perlhacktips - Tips for Perl core C code hacking |
| 11 | |
| 12 | =head1 DESCRIPTION |
| 13 | |
| 14 | This document will help you learn the best way to go about hacking on |
| 15 | the Perl core C code. It covers common problems, debugging, profiling, |
| 16 | and more. |
| 17 | |
| 18 | If you haven't read L<perlhack> and L<perlhacktut> yet, you might want |
| 19 | to do that first. |
| 20 | |
| 21 | =head1 COMMON PROBLEMS |
| 22 | |
| 23 | Perl source plays by ANSI C89 rules: no C99 (or C++) extensions. In |
| 24 | some cases we have to take pre-ANSI requirements into consideration. |
| 25 | You don't care about some particular platform having broken Perl? I |
| 26 | hear there is still a strong demand for J2EE programmers. |
| 27 | |
| 28 | =head2 Perl environment problems |
| 29 | |
| 30 | =over 4 |
| 31 | |
| 32 | =item * |
| 33 | |
| 34 | Not compiling with threading |
| 35 | |
| 36 | Compiling with threading (-Duseithreads) completely rewrites the |
| 37 | function prototypes of Perl. You had better try your changes with that. |
| 38 | Related to this is the difference between "Perl_-less" and "Perl_-ly" |
| 39 | APIs, for example: |
| 40 | |
| 41 | Perl_sv_setiv(aTHX_ ...); |
| 42 | sv_setiv(...); |
| 43 | |
| 44 | The first one explicitly passes in the context, which is needed for |
| 45 | e.g. threaded builds. The second one does that implicitly; do not get |
| 46 | them mixed. If you are not passing in an aTHX_, you will need to do a |
| 47 | dTHX (or a dVAR) as the first thing in the function. |
| 48 | |
| 49 | See L<perlguts/"How multiple interpreters and concurrency are |
| 50 | supported"> for further discussion about context. |
| 51 | |
| 52 | =item * |
| 53 | |
| 54 | Not compiling with -DDEBUGGING |
| 55 | |
| 56 | The DEBUGGING define exposes more code to the compiler, therefore more |
| 57 | ways for things to go wrong. You should try it. |
| 58 | |
| 59 | =item * |
| 60 | |
| 61 | Introducing (non-read-only) globals |
| 62 | |
| 63 | Do not introduce any modifiable globals, truly global or file static. |
| 64 | They are bad form and complicate multithreading and other forms of |
| 65 | concurrency. The right way is to introduce them as new interpreter |
| 66 | variables, see F<intrpvar.h> (at the very end for binary |
| 67 | compatibility). |
| 68 | |
| 69 | Introducing read-only (const) globals is okay, as long as you verify |
| 70 | with e.g. C<nm libperl.a|egrep -v ' [TURtr] '> (if your C<nm> has |
| 71 | BSD-style output) that the data you added really is read-only. (If it |
| 72 | is, it shouldn't show up in the output of that command.) |
| 73 | |
| 74 | If you want to have static strings, make them constant: |
| 75 | |
| 76 | static const char etc[] = "..."; |
| 77 | |
| 78 | If you want to have arrays of constant strings, note carefully the |
| 79 | right combination of C<const>s: |
| 80 | |
| 81 | static const char * const yippee[] = |
| 82 | {"hi", "ho", "silver"}; |
| 83 | |
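As a small standalone sketch of why both C<const>s matter (the names C<demo_words> and C<demo_word_at> are hypothetical and not part of the Perl core): the first C<const> protects the characters, the second protects the pointers, so the whole table can end up in read-only storage.

```c
#include <stddef.h>

/* Hypothetical table: both the pointers and the characters are const,
 * so neither the array slots nor the strings can be modified. */
static const char * const demo_words[] = { "hi", "ho", "silver" };

/* Return the word at index i, or NULL if i is out of range. */
static const char *demo_word_at(size_t i)
{
    const size_t count = sizeof(demo_words) / sizeof(demo_words[0]);
    return i < count ? demo_words[i] : NULL;
}
```

Dropping the second C<const> would leave an array of modifiable pointers, which is exactly the kind of writable global the C<nm> check above is meant to catch.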
| 84 | There is a way to completely hide any modifiable globals (they are all |
| 85 | moved to heap), the compilation setting |
| 86 | C<-DPERL_GLOBAL_STRUCT_PRIVATE>. It is not normally used, but can be |
| 87 | used for testing, read more about it in L<perlguts/"Background and |
| 88 | PERL_IMPLICIT_CONTEXT">. |
| 89 | |
| 90 | =item * |
| 91 | |
| 92 | Not exporting your new function |
| 93 | |
| 94 | Some platforms (Win32, AIX, VMS, OS/2, to name a few) require any |
| 95 | function that is part of the public API (the shared Perl library) to be |
| 96 | explicitly marked as exported. See the discussion about F<embed.pl> in |
| 97 | L<perlguts>. |
| 98 | |
| 99 | =item * |
| 100 | |
| 101 | Exporting your new function |
| 102 | |
| 103 | The new shiny result of either genuine new functionality or your |
| 104 | arduous refactoring is now ready and correctly exported. So what could |
| 105 | possibly go wrong? |
| 106 | |
| 107 | Maybe simply that your function did not need to be exported in the |
| 108 | first place. Perl has a long and not so glorious history of exporting |
| 109 | functions that it should not have. |
| 110 | |
| 111 | If the function is used only inside one source code file, make it |
| 112 | static. See the discussion about F<embed.pl> in L<perlguts>. |
| 113 | |
| 114 | If the function is used across several files, but intended only for |
| 115 | Perl's internal use (and this should be the common case), do not export |
| 116 | it to the public API. See the discussion about F<embed.pl> in |
| 117 | L<perlguts>. |
| 118 | |
| 119 | =back |
| 120 | |
| 121 | =head2 Portability problems |
| 122 | |
| 123 | The following are common causes of compilation and/or execution |
| 124 | failures, not specific to Perl as such. The C FAQ is good bedtime |
| 125 | reading. Please test your changes with as many C compilers and |
| 126 | platforms as possible; we will, anyway, and it's nice to save oneself |
| 127 | from public embarrassment. |
| 128 | |
| 129 | If using gcc, you can add the C<-std=c89> option which will hopefully |
| 130 | catch most of these unportabilities. (However it might also catch |
| 131 | incompatibilities in your system's header files.) |
| 132 | |
| 133 | Use the Configure C<-Dgccansipedantic> flag to enable the gcc C<-ansi |
| 134 | -pedantic> flags which enforce stricter ANSI rules. |
| 135 | |
| 136 | If using the C<gcc -Wall> note that not all the possible warnings (like |
| 137 | C<-Wuninitialized>) are given unless you also compile with C<-O>. |
| 138 | |
| 139 | Note that if using gcc, starting from Perl 5.9.5 the Perl core source |
| 140 | code files (the ones at the top level of the source code distribution, |
| 141 | but not e.g. the extensions under ext/) are automatically compiled with |
| 142 | as many as possible of the C<-std=c89>, C<-ansi>, C<-pedantic>, and a |
| 143 | selection of C<-W> flags (see cflags.SH). |
| 144 | |
| 145 | Also study L<perlport> carefully to avoid any bad assumptions about the |
| 146 | operating system, filesystems, and so forth. |
| 147 | |
| 148 | You may once in a while try a "make microperl" to see whether we can |
| 149 | still compile Perl with just the bare minimum of interfaces. (See |
| 150 | README.micro.) |
| 151 | |
| 152 | Do not assume an operating system indicates a certain compiler. |
| 153 | |
| 154 | =over 4 |
| 155 | |
| 156 | =item * |
| 157 | |
| 158 | Casting pointers to integers or casting integers to pointers |
| 159 | |
| 160 | void castaway(U8* p) |
| 161 | { |
| 162 | IV i = p; |
| 163 | |
| 164 | or |
| 165 | |
| 166 | void castaway(U8* p) |
| 167 | { |
| 168 | IV i = (IV)p; |
| 169 | |
| 170 | Both are bad, and broken, and unportable. Use the PTR2IV() macro that |
| 171 | does it right. (Likewise, there are PTR2UV(), PTR2NV(), INT2PTR(), and |
| 172 | NUM2PTR().) |
| 173 | |
| 174 | =item * |
| 175 | |
| 176 | Casting between function pointers and data pointers |
| 177 | |
| 178 | Technically speaking, casting between function pointers and data |
| 179 | pointers is unportable and undefined. Practically speaking it seems to |
| 180 | work, but you should still use the FPTR2DPTR() and DPTR2FPTR() macros. |
| 181 | Sometimes you can also play games with unions. |
| 182 | |
| 183 | =item * |
| 184 | |
| 185 | Assuming sizeof(int) == sizeof(long) |
| 186 | |
| 187 | There are platforms where longs are 64 bits, and platforms where ints |
| 188 | are 64 bits, and while we are out to shock you, even platforms where |
| 189 | shorts are 64 bits. This is all legal according to the C standard. (In |
| 190 | other words, "long long" is not a portable way to specify 64 bits, and |
| 191 | "long long" is not even guaranteed to be any wider than "long".) |
| 192 | |
| 193 | Instead, use the definitions IV, UV, IVSIZE, I32SIZE, and so forth. |
| 194 | Avoid things like I32 because they are B<not> guaranteed to be |
| 195 | I<exactly> 32 bits, only I<at least> 32 bits, nor are they |
| 196 | guaranteed to be B<int> or B<long>. If you really explicitly need |
| 197 | 64-bit variables, use I64 and U64, but only if guarded by HAS_QUAD. |
| 198 | |
| 199 | =item * |
| 200 | |
| 201 | Assuming one can dereference any type of pointer for any type of data |
| 202 | |
| 203 | char *p = ...; |
| 204 | long pony = *(long *)p; /* BAD */ |
| 205 | |
| 206 | Many platforms, quite rightly so, will give you a core dump instead of |
| 207 | a pony if the p happens not to be correctly aligned. |
| 208 | |
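One common way to sidestep the alignment problem is to copy the bytes instead of dereferencing a cast pointer. The following is only a sketch in standalone C; C<demo_read_long> is a hypothetical helper, not a Perl core function.

```c
#include <string.h>

/* Read a long from a possibly unaligned address by copying its bytes
 * into a properly aligned local variable. */
static long demo_read_long(const char *p)
{
    long value;
    memcpy(&value, p, sizeof(value));
    return value;
}
```

Because C<memcpy()> works byte by byte, the source address needs no particular alignment; the price is that the value is read in the platform's native byte order.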
| 209 | =item * |
| 210 | |
| 211 | Lvalue casts |
| 212 | |
| 213 | (int)*p = ...; /* BAD */ |
| 214 | |
| 215 | Simply not portable. Get your lvalue to be of the right type, or maybe |
| 216 | use temporary variables, or dirty tricks with unions. |
| 217 | |
| 218 | =item * |
| 219 | |
| 220 | Assuming B<anything> about structs (especially the ones you don't |
| 221 | control, like the ones coming from the system headers) |
| 222 | |
| 223 | =over 8 |
| 224 | |
| 225 | =item * |
| 226 | |
| 227 | That a certain field exists in a struct |
| 228 | |
| 229 | =item * |
| 230 | |
| 231 | That no other fields exist besides the ones you know of |
| 232 | |
| 233 | =item * |
| 234 | |
| 235 | That a field is of certain signedness, sizeof, or type |
| 236 | |
| 237 | =item * |
| 238 | |
| 239 | That the fields are in a certain order |
| 240 | |
| 241 | =over 8 |
| 242 | |
| 243 | =item * |
| 244 | |
| 245 | While C guarantees the ordering specified in the struct definition, |
| 246 | between different platforms the definitions might differ |
| 247 | |
| 248 | =back |
| 249 | |
| 250 | =item * |
| 251 | |
| 252 | That the sizeof(struct) or the alignments are the same everywhere |
| 253 | |
| 254 | =over 8 |
| 255 | |
| 256 | =item * |
| 257 | |
| 258 | There might be padding bytes between the fields to align the fields - |
| 259 | the bytes can be anything |
| 260 | |
| 261 | =item * |
| 262 | |
| 263 | Structs are required to be aligned to the maximum alignment required by |
| 264 | the fields - which for native types is usually equivalent to |
| 265 | sizeof() of the field |
| 266 | |
| 267 | =back |
| 268 | |
| 269 | =back |
| 270 | |
| 271 | =item * |
| 272 | |
| 273 | Assuming the character set is ASCIIish |
| 274 | |
| 275 | Perl can compile and run under EBCDIC platforms. See L<perlebcdic>. |
| 276 | This is transparent for the most part, but because the character sets |
| 277 | differ, you shouldn't use numeric (decimal, octal, nor hex) constants |
| 278 | to refer to characters. You can safely say 'A', but not 0x41. You can |
| 279 | safely say '\n', but not \012. If a character doesn't have a trivial |
| 280 | input form, you should add it to the list in |
| 281 | F<regen/unicode_constants.pl>, and have Perl create #defines for you, |
| 282 | based on the current platform. |
| 283 | |
| 284 | Also, the range 'A' - 'Z' in ASCII is an unbroken sequence of 26 upper |
| 285 | case alphabetic characters. That is not true in EBCDIC. Nor for 'a' to |
| 286 | 'z'. But '0' - '9' is an unbroken range in both systems. Don't assume |
| 287 | anything about other ranges. |
| 288 | |
| 289 | Many of the comments in the existing code ignore the possibility of |
| 290 | EBCDIC, and may be wrong therefore, even if the code works. This is |
| 291 | actually a tribute to the successful transparent insertion of being |
| 292 | able to handle EBCDIC without having to change pre-existing code. |
| 293 | |
| 294 | UTF-8 and UTF-EBCDIC are two different encodings used to represent |
| 295 | Unicode code points as sequences of bytes. Macros with the same names |
| 296 | (but different definitions) in C<utf8.h> and C<utfebcdic.h> are used to |
| 297 | allow the calling code to think that there is only one such encoding. |
| 298 | This is almost always referred to as C<utf8>, but it means the EBCDIC |
| 299 | version as well. Again, comments in the code may well be wrong even if |
| 300 | the code itself is right. For example, the concept of C<invariant |
| 301 | characters> differs between ASCII and EBCDIC. On ASCII platforms, only |
| 302 | characters that do not have the high-order bit set (i.e. whose ordinals |
| 303 | are strict ASCII, 0 - 127) are invariant, and the documentation and |
| 304 | comments in the code may assume that, often referring to something |
| 305 | like, say, C<hibit>. The situation differs and is not so simple on |
| 306 | EBCDIC machines, but as long as the code itself uses the |
| 307 | C<NATIVE_IS_INVARIANT()> macro appropriately, it works, even if the |
| 308 | comments are wrong. |
| 309 | |
| 310 | =item * |
| 311 | |
| 312 | Assuming the character set is just ASCII |
| 313 | |
| 314 | ASCII is a 7 bit encoding, but bytes have 8 bits in them. The 128 extra |
| 315 | characters have different meanings depending on the locale. Absent a |
| 316 | locale, currently these extra characters are generally considered to be |
| 317 | unassigned, and this has presented some problems. This is being changed |
| 318 | starting in 5.12 so that these characters will be considered to be |
| 319 | Latin-1 (ISO-8859-1). |
| 320 | |
| 321 | =item * |
| 322 | |
| 323 | Mixing #define and #ifdef |
| 324 | |
| 325 | #define BURGLE(x) ... \ |
| 326 | #ifdef BURGLE_OLD_STYLE /* BAD */ |
| 327 | ... do it the old way ... \ |
| 328 | #else |
| 329 | ... do it the new way ... \ |
| 330 | #endif |
| 331 | |
| 332 | You cannot portably "stack" cpp directives. For example in the above |
| 333 | you need two separate BURGLE() #defines, one for each #ifdef branch. |
| 334 | |
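A portable arrangement puts the C<#ifdef> outside and gives each branch its own complete C<#define>. The sketch below uses arbitrary placeholder bodies for the "old" and "new" ways, and C<demo_burgle> is a hypothetical caller.

```c
/* The #ifdef selects between two complete definitions of BURGLE();
 * the macro bodies here are arbitrary placeholders for the demo. */
#ifdef BURGLE_OLD_STYLE
#  define BURGLE(x) ((x) + 1)   /* do it the old way */
#else
#  define BURGLE(x) ((x) * 2)   /* do it the new way */
#endif

/* Hypothetical caller: sees whichever definition was selected. */
static int demo_burgle(int x)
{
    return BURGLE(x);
}
```

Callers never need to know which branch was chosen, and no preprocessor directive appears inside a macro body.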
| 335 | =item * |
| 336 | |
| 337 | Adding non-comment stuff after #endif or #else |
| 338 | |
| 339 | #ifdef SNOSH |
| 340 | ... |
| 341 | #else !SNOSH /* BAD */ |
| 342 | ... |
| 343 | #endif SNOSH /* BAD */ |
| 344 | |
| 345 | The #endif and #else cannot portably have anything non-comment after |
| 346 | them. If you want to document what is going on (which is a good idea |
| 347 | especially if the branches are long), use (C) comments: |
| 348 | |
| 349 | #ifdef SNOSH |
| 350 | ... |
| 351 | #else /* !SNOSH */ |
| 352 | ... |
| 353 | #endif /* SNOSH */ |
| 354 | |
| 355 | The gcc option C<-Wendif-labels> warns about the bad variant (by |
| 356 | default on starting from Perl 5.9.4). |
| 357 | |
| 358 | =item * |
| 359 | |
| 360 | Having a comma after the last element of an enum list |
| 361 | |
| 362 | enum color { |
| 363 | CERULEAN, |
| 364 | CHARTREUSE, |
| 365 | CINNABAR, /* BAD */ |
| 366 | }; |
| 367 | |
| 368 | is not portable. Leave out the last comma. |
| 369 | |
| 370 | Also note that whether enums are implicitly morphable to ints varies |
| 371 | between compilers; you might need an explicit (int) cast. |
| 372 | |
| 373 | =item * |
| 374 | |
| 375 | Using //-comments |
| 376 | |
| 377 | // This function bamfoodles the zorklator. /* BAD */ |
| 378 | |
| 379 | That is C99 or C++. Perl is C89. Using the //-comments is silently |
| 380 | allowed by many C compilers but cranking up the ANSI C89 strictness |
| 381 | (which we like to do) causes the compilation to fail. |
| 382 | |
| 383 | =item * |
| 384 | |
| 385 | Mixing declarations and code |
| 386 | |
| 387 | void zorklator() |
| 388 | { |
| 389 | int n = 3; |
| 390 | set_zorkmids(n); /* BAD */ |
| 391 | int q = 4; |
| 392 | |
| 393 | That is C99 or C++. Some C compilers allow that, but you shouldn't. |
| 394 | |
| 395 | The gcc option C<-Wdeclaration-after-statement> scans for such |
| 396 | problems (by default on starting from Perl 5.9.4). |
| 397 | |
| 398 | =item * |
| 399 | |
| 400 | Introducing variables inside for() |
| 401 | |
| 402 | for(int i = ...; ...; ...) { /* BAD */ |
| 403 | |
| 404 | That is C99 or C++. While it would indeed be awfully nice to have that |
| 405 | also in C89, to limit the scope of the loop variable, alas, we cannot. |
| 406 | |
| 407 | =item * |
| 408 | |
| 409 | Mixing signed char pointers with unsigned char pointers |
| 410 | |
| 411 | int foo(char *s) { ... } |
| 412 | ... |
| 413 | unsigned char *t = ...; /* Or U8* t = ... */ |
| 414 | foo(t); /* BAD */ |
| 415 | |
| 416 | While this is legal practice, it is certainly dubious, and downright |
| 417 | fatal on at least one platform: for example VMS cc considers this a |
| 418 | fatal error. One cause for people often making this mistake is that a |
| 419 | "naked char" and therefore dereferencing a "naked char pointer" have an |
| 420 | undefined signedness: it depends on the compiler and the flags of the |
| 421 | compiler and the underlying platform whether the result is signed or |
| 422 | unsigned. For this very same reason using a 'char' as an array index is |
| 423 | bad. |
| 424 | |
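To make the array-index hazard concrete, here is a standalone sketch (the table C<demo_seen> and both helpers are hypothetical). Where plain C<char> is signed, a byte such as 0xE9 would become a negative index unless it is first converted to C<unsigned char>.

```c
#include <limits.h>

/* Hypothetical lookup table with one slot per possible byte value. */
static unsigned char demo_seen[UCHAR_MAX + 1];

/* Mark the byte c as seen; the cast keeps the index non-negative
 * even on platforms where plain char is signed. */
static void demo_mark(char c)
{
    demo_seen[(unsigned char)c] = 1;
}

/* Report whether the byte c was marked earlier. */
static int demo_was_seen(char c)
{
    return demo_seen[(unsigned char)c];
}
```

Within the Perl core the same effect is usually had by working with C<U8> values from the start rather than casting at each use.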
| 425 | =item * |
| 426 | |
| 427 | Macros that have string constants and their arguments as substrings of |
| 428 | the string constants |
| 429 | |
| 430 | #define FOO(n) printf("number = %d\n", n) /* BAD */ |
| 431 | FOO(10); |
| 432 | |
| 433 | Pre-ANSI semantics for that was equivalent to |
| 434 | |
| 435 | printf("10umber = %d\10"); |
| 436 | |
| 437 | which is probably not what you were expecting. Unfortunately at least |
| 438 | one reasonably common and modern C compiler does "real backward |
| 439 | compatibility" here, in AIX that is what still happens even though the |
| 440 | rest of the AIX compiler is very happily C89. |
| 441 | |
| 442 | =item * |
| 443 | |
| 444 | Using printf formats for non-basic C types |
| 445 | |
| 446 | IV i = ...; |
| 447 | printf("i = %d\n", i); /* BAD */ |
| 448 | |
| 449 | While this might by accident work on some platform (where IV happens to |
| 450 | be an C<int>), in general it cannot. IV might be something larger. The |
| 451 | situation is even worse with more specific types (defined by Perl's |
| 452 | configuration step in F<config.h>): |
| 453 | |
| 454 | Uid_t who = ...; |
| 455 | printf("who = %d\n", who); /* BAD */ |
| 456 | |
| 457 | The problem here is that Uid_t might be not only not C<int>-wide but it |
| 458 | might also be unsigned, in which case large uids would be printed as |
| 459 | negative values. |
| 460 | |
| 461 | There is no simple solution to this because of printf()'s limited |
| 462 | intelligence, but for many types the right format is available with |
| 463 | either an 'f' or '_f' suffix, for example: |
| 464 | |
| 465 | IVdf /* IV in decimal */ |
| 466 | UVxf /* UV in hexadecimal */ |
| 467 | |
| 468 | printf("i = %"IVdf"\n", i); /* The IVdf is a string constant. */ |
| 469 | |
| 470 | Uid_t_f /* Uid_t in decimal */ |
| 471 | |
| 472 | printf("who = %"Uid_t_f"\n", who); |
| 473 | |
| 474 | Or you can try casting to a "wide enough" type: |
| 475 | |
| 476 | printf("i = %"IVdf"\n", (IV)something_very_small_and_signed); |
| 477 | |
| 478 | Also remember that the C<%p> format really does require a void pointer: |
| 479 | |
| 480 | U8* p = ...; |
| 481 | printf("p = %p\n", (void*)p); |
| 482 | |
| 483 | The gcc option C<-Wformat> scans for such problems. |
| 484 | |
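The idea behind the C<IVdf> family can be sketched outside the Perl core as follows. Here C<demo_iv>, C<DEMO_IVdf> and C<demo_format_iv> are hypothetical analogues, and the example assumes a C library that provides C<snprintf()>; inside the core you would use the real format macros with C<my_snprintf()>.

```c
#include <stdio.h>

/* Hypothetical analogues of IV and IVdf: the typedef and its format
 * string are defined side by side so they cannot drift apart. */
typedef long demo_iv;
#define DEMO_IVdf "ld"

/* Format v into buf (of size len); returns what snprintf() returns. */
static int demo_format_iv(char *buf, size_t len, demo_iv v)
{
    /* Adjacent string literals are concatenated: "i = %" "ld". */
    return snprintf(buf, len, "i = %" DEMO_IVdf, v);
}
```

If the typedef is later widened, only the one format macro beside it has to change, and every call site stays correct.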
| 485 | =item * |
| 486 | |
| 487 | Blindly using variadic macros |
| 488 | |
| 489 | gcc has had them for a while with its own syntax, and C99 brought them |
| 490 | with a standardized syntax. Don't use the former, and use the latter |
| 491 | only if the HAS_C99_VARIADIC_MACROS is defined. |
| 492 | |
| 493 | =item * |
| 494 | |
| 495 | Blindly passing va_list |
| 496 | |
| 497 | Not all platforms support passing va_list to further varargs (stdarg) |
| 498 | functions. The right thing to do is to copy the va_list using the |
| 499 | Perl_va_copy() if the NEED_VA_COPY is defined. |
| 500 | |
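The copying pattern looks roughly like this in standalone C. The sketch uses the C99 C<va_copy()> macro as a stand-in for C<Perl_va_copy()>, and both function names are hypothetical.

```c
#include <stdarg.h>

/* Sum 'count' ints from a va_list by walking a private copy, so the
 * caller's va_list is left untouched. */
static int demo_vsum(int count, va_list ap)
{
    va_list copy;
    int total = 0;
    int i;

    va_copy(copy, ap);            /* C99; stand-in for Perl_va_copy() */
    for (i = 0; i < count; i++)
        total += va_arg(copy, int);
    va_end(copy);
    return total;
}

/* Walk the same arguments twice: once via the helper, once directly. */
static int demo_sum_twice(int count, ...)
{
    va_list ap;
    int total;
    int i;

    va_start(ap, count);
    total = demo_vsum(count, ap);
    for (i = 0; i < count; i++)
        total += va_arg(ap, int);
    va_end(ap);
    return total;
}
```

Because the helper only reads from its own copy, the caller can still iterate over C<ap> afterwards; every C<va_copy()> is paired with its own C<va_end()>.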
| 501 | =item * |
| 502 | |
| 503 | Using gcc statement expressions |
| 504 | |
| 505 | val = ({...;...;...}); /* BAD */ |
| 506 | |
| 507 | While a nice extension, it's not portable. The Perl code does |
| 508 | admittedly use them if available to gain some extra speed (essentially |
| 509 | as a funky form of inlining), but you shouldn't. |
| 510 | |
| 511 | =item * |
| 512 | |
| 513 | Binding together several statements in a macro |
| 514 | |
| 515 | Use the macros STMT_START and STMT_END. |
| 516 | |
| 517 | STMT_START { |
| 518 | ... |
| 519 | } STMT_END |
| 520 | |
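The reason for the wrapper is easiest to see next to an unbraced C<if>/C<else>. The sketch below uses hypothetical C<DEMO_> stand-ins and assumes the usual C<do { ... } while (0)> expansion; in the Perl core, use the real STMT_START and STMT_END.

```c
/* Hypothetical stand-ins, assuming the usual do { ... } while (0)
 * expansion; the Perl core supplies its own STMT_START/STMT_END. */
#define DEMO_STMT_START do
#define DEMO_STMT_END   while (0)

/* Swap two ints; the whole macro expands to exactly one statement. */
#define DEMO_SWAP(a, b) \
    DEMO_STMT_START {   \
        int tmp_ = (a); \
        (a) = (b);      \
        (b) = tmp_;     \
    } DEMO_STMT_END

/* Put the smaller value in *lo and the larger in *hi. */
static void demo_order(int *lo, int *hi)
{
    if (*lo > *hi)
        DEMO_SWAP(*lo, *hi);   /* safe without braces, even before else */
    else
        (void)0;
}
```

With a bare C<{ ... }> block instead, the semicolon after C<DEMO_SWAP(...)> would end the C<if> statement and the C<else> would fail to compile.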
| 521 | =item * |
| 522 | |
| 523 | Testing for operating systems or versions when you should be testing for |
| 524 | features |
| 525 | |
| 526 | #ifdef __FOONIX__ /* BAD */ |
| 527 | foo = quux(); |
| 528 | #endif |
| 529 | |
| 530 | Unless you know with 100% certainty that quux() is only ever available |
| 531 | for the "Foonix" operating system B<and> that it is available B<and> |
| 532 | correctly working for B<all> past, present, B<and> future versions of |
| 533 | "Foonix", the above is very wrong. This is more correct (though still |
| 534 | not perfect, because the below is a compile-time check): |
| 535 | |
| 536 | #ifdef HAS_QUUX |
| 537 | foo = quux(); |
| 538 | #endif |
| 539 | |
| 540 | How does the HAS_QUUX become defined where it needs to be? Well, if |
| 541 | Foonix happens to be Unixy enough to be able to run the Configure |
| 542 | script, and Configure has been taught about detecting and testing |
| 543 | quux(), the HAS_QUUX will be correctly defined. In other platforms, the |
| 544 | corresponding configuration step will hopefully do the same. |
| 545 | |
| 546 | In a pinch, if you cannot wait for Configure to be educated, or if you |
| 547 | have a good hunch of where quux() might be available, you can |
| 548 | temporarily try the following: |
| 549 | |
| 550 | #if (defined(__FOONIX__) || defined(__BARNIX__)) |
| 551 | # define HAS_QUUX |
| 552 | #endif |
| 553 | |
| 554 | ... |
| 555 | |
| 556 | #ifdef HAS_QUUX |
| 557 | foo = quux(); |
| 558 | #endif |
| 559 | |
| 560 | But in any case, try to keep the features and operating systems |
| 561 | separate. |
| 562 | |
| 563 | =back |
| 564 | |
| 565 | =head2 Problematic System Interfaces |
| 566 | |
| 567 | =over 4 |
| 568 | |
| 569 | =item * |
| 570 | |
| 571 | malloc(0), realloc(0), calloc(0, 0) are non-portable. To be portable |
| 572 | allocate at least one byte. (In general you should rarely need to work |
| 573 | at this low level, but instead use the various malloc wrappers.) |
| 574 | |
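If you do find yourself at this level, the "at least one byte" rule can be sketched as a tiny wrapper. C<demo_alloc> is a hypothetical name; within the core, prefer the existing malloc wrappers.

```c
#include <stdlib.h>

/* Hypothetical wrapper: never asks malloc() for zero bytes, so a
 * NULL result always means a genuine allocation failure. */
static void *demo_alloc(size_t size)
{
    return malloc(size ? size : 1);
}
```

The caller can then treat NULL uniformly as an out-of-memory condition and pass any non-NULL result to C<free()>.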
| 575 | =item * |
| 576 | |
| 577 | snprintf() - the return type is unportable. Use my_snprintf() instead. |
| 578 | |
| 579 | =back |
| 580 | |
| 581 | =head2 Security problems |
| 582 | |
| 583 | Last but not least, here are various tips for safer coding. |
| 584 | |
| 585 | =over 4 |
| 586 | |
| 587 | =item * |
| 588 | |
| 589 | Do not use gets() |
| 590 | |
| 591 | Or we will publicly ridicule you. Seriously. |
| 592 | |
| 593 | =item * |
| 594 | |
| 595 | Do not use strcpy() or strcat() or strncpy() or strncat() |
| 596 | |
| 597 | Use my_strlcpy() and my_strlcat() instead: they either use the native |
| 598 | implementation, or Perl's own implementation (borrowed from the public |
| 599 | domain implementation of INN). |
| 600 | |
| 601 | =item * |
| 602 | |
| 603 | Do not use sprintf() or vsprintf() |
| 604 | |
| 605 | If you really want just plain byte strings, use my_snprintf() and |
| 606 | my_vsnprintf() instead, which will try to use snprintf() and |
| 607 | vsnprintf() if those safer APIs are available. If you want something |
| 608 | fancier than a plain byte string, use |
| 609 | L<C<Perl_form>()|perlapi/form> or SVs and |
| 610 | L<C<Perl_sv_catpvf()>|perlapi/sv_catpvf>. |
| 611 | |
| 612 | Note that glibc C<printf()>, C<sprintf()>, etc. are buggy before glibc |
| 613 | version 2.17. They won't allow a C<%.s> format with a precision to |
| 614 | create a string that isn't valid UTF-8 if the current underlying locale |
| 615 | of the program is UTF-8. What happens is that the C<%s> and its operand are |
| 616 | simply skipped without any notice. |
| 617 | L<https://sourceware.org/bugzilla/show_bug.cgi?id=6530>. |
| 618 | |
| 619 | =back |
| 620 | |
| 621 | =head1 DEBUGGING |
| 622 | |
| 623 | You can compile a special debugging version of Perl, which allows you |
| 624 | to use the C<-D> option of Perl to tell more about what Perl is doing. |
| 625 | But sometimes there is no alternative but to dive in with a debugger, |
| 626 | either to see the stack trace of a core dump (very useful in a bug |
| 627 | report), or to figure out what went wrong before the core dump |
| 628 | happened, or how we ended up with wrong or unexpected results. |
| 629 | |
| 630 | =head2 Poking at Perl |
| 631 | |
| 632 | To really poke around with Perl, you'll probably want to build Perl for |
| 633 | debugging, like this: |
| 634 | |
| 635 | ./Configure -d -D optimize=-g |
| 636 | make |
| 637 | |
| 638 | C<-g> is a flag to the C compiler to have it produce debugging |
| 639 | information which will allow us to step through a running program, and |
| 640 | to see which C function we are in (without the debugging information |
| 641 | we might see only the numerical addresses of the functions, which is |
| 642 | not very helpful). |
| 643 | |
| 644 | F<Configure> will also turn on the C<DEBUGGING> compilation symbol |
| 645 | which enables all the internal debugging code in Perl. There are a |
| 646 | whole bunch of things you can debug with this: L<perlrun> lists them |
| 647 | all, and the best way to find out about them is to play about with |
| 648 | them. The most useful options are probably |
| 649 | |
| 650 | l Context (loop) stack processing |
| 651 | t Trace execution |
| 652 | o Method and overloading resolution |
| 653 | c String/numeric conversions |
| 654 | |
| 655 | Some of the functionality of the debugging code can be achieved using |
| 656 | XS modules. |
| 657 | |
| 658 | -Dr => use re 'debug' |
| 659 | -Dx => use O 'Debug' |
| 660 | |
| 661 | =head2 Using a source-level debugger |
| 662 | |
| 663 | If the debugging output of C<-D> doesn't help you, it's time to step |
| 664 | through perl's execution with a source-level debugger. |
| 665 | |
| 666 | =over 3 |
| 667 | |
| 668 | =item * |
| 669 | |
| 670 | We'll use C<gdb> for our examples here; the principles will apply to |
| 671 | any debugger (many vendors call their debugger C<dbx>), but check the |
| 672 | manual of the one you're using. |
| 673 | |
| 674 | =back |
| 675 | |
| 676 | To fire up the debugger, type |
| 677 | |
| 678 | gdb ./perl |
| 679 | |
| 680 | Or if you have a core dump: |
| 681 | |
| 682 | gdb ./perl core |
| 683 | |
| 684 | You'll want to do that in your Perl source tree so the debugger can |
| 685 | read the source code. You should see the copyright message, followed by |
| 686 | the prompt. |
| 687 | |
| 688 | (gdb) |
| 689 | |
| 690 | C<help> will get you into the documentation, but here are the most |
| 691 | useful commands: |
| 692 | |
| 693 | =over 3 |
| 694 | |
| 695 | =item * run [args] |
| 696 | |
| 697 | Run the program with the given arguments. |
| 698 | |
| 699 | =item * break function_name |
| 700 | |
| 701 | =item * break source.c:xxx |
| 702 | |
| 703 | Tells the debugger that we'll want to pause execution when we reach |
| 704 | either the named function (but see L<perlguts/Internal Functions>!) or |
| 705 | the given line in the named source file. |
| 706 | |
| 707 | =item * step |
| 708 | |
| 709 | Steps through the program a line at a time. |
| 710 | |
| 711 | =item * next |
| 712 | |
| 713 | Steps through the program a line at a time, without descending into |
| 714 | functions. |
| 715 | |
| 716 | =item * continue |
| 717 | |
| 718 | Run until the next breakpoint. |
| 719 | |
| 720 | =item * finish |
| 721 | |
| 722 | Run until the end of the current function, then stop again. |
| 723 | |
| 724 | =item * 'enter' |
| 725 | |
| 726 | Just pressing Enter will do the most recent operation again - it's a |
| 727 | blessing when stepping through miles of source code. |
| 728 | |
| 729 | =item * ptype |
| 730 | |
| 731 | Prints the C definition of the argument given. |
| 732 | |
| 733 | (gdb) ptype PL_op |
| 734 | type = struct op { |
| 735 | OP *op_next; |
| 736 | OP *op_sibling; |
| 737 | OP *(*op_ppaddr)(void); |
| 738 | PADOFFSET op_targ; |
| 739 | unsigned int op_type : 9; |
| 740 | unsigned int op_opt : 1; |
| 741 | unsigned int op_slabbed : 1; |
| 742 | unsigned int op_savefree : 1; |
| 743 | unsigned int op_static : 1; |
| 744 | unsigned int op_folded : 1; |
| 745 | unsigned int op_spare : 2; |
| 746 | U8 op_flags; |
| 747 | U8 op_private; |
| 748 | } * |
| 749 | |
| 750 | =item * print |
| 751 | |
| 752 | Execute the given C code and print its results. B<WARNING>: Perl makes |
| 753 | heavy use of macros, and F<gdb> does not necessarily support macros |
| 754 | (see later L</"gdb macro support">). You'll have to substitute them |
| 755 | yourself, or to invoke cpp on the source code files (see L</"The .i |
| 756 | Targets">). So, for instance, you can't say |
| 757 | |
| 758 | print SvPV_nolen(sv) |
| 759 | |
| 760 | but you have to say |
| 761 | |
| 762 | print Perl_sv_2pv_nolen(sv) |
| 763 | |
| 764 | =back |
| 765 | |
| 766 | You may find it helpful to have a "macro dictionary", which you can |
| 767 | produce by saying C<cpp -dM perl.c | sort>. Even then, F<cpp> won't |
| 768 | recursively apply those macros for you. |
| 769 | |
| 770 | =head2 gdb macro support |
| 771 | |
| 772 | Recent versions of F<gdb> have fairly good macro support, but in order |
| 773 | to use it you'll need to compile perl with macro definitions included |
| 774 | in the debugging information. Using F<gcc> version 3.1, this means |
| 775 | configuring with C<-Doptimize=-g3>. Other compilers might use a |
| 776 | different switch (if they support debugging macros at all). |
| 777 | |
| 778 | =head2 Dumping Perl Data Structures |
| 779 | |
| 780 | One way to get around this macro hell is to use the dumping functions |
| 781 | in F<dump.c>; these work a little like an internal |
| 782 | L<Devel::Peek|Devel::Peek>, but they also cover OPs and other |
| 783 | structures that you can't get at from Perl. Let's take an example. |
| 784 | We'll use the C<$a = $b + $c> we used before, but give it a bit of |
| 785 | context: C<$b = "6XXXX"; $c = 2.3;>. Where's a good place to stop and |
| 786 | poke around? |
| 787 | |
| 788 | What about C<pp_add>, the function we examined earlier to implement the |
| 789 | C<+> operator: |
| 790 | |
| 791 | (gdb) break Perl_pp_add |
| 792 | Breakpoint 1 at 0x46249f: file pp_hot.c, line 309. |
| 793 | |
| 794 | Notice we use C<Perl_pp_add> and not C<pp_add> - see |
| 795 | L<perlguts/Internal Functions>. With the breakpoint in place, we can |
| 796 | run our program: |
| 797 | |
| 798 | (gdb) run -e '$b = "6XXXX"; $c = 2.3; $a = $b + $c' |
| 799 | |
| 800 | Lots of junk will go past as gdb reads in the relevant source files and |
| 801 | libraries, and then: |
| 802 | |
| 803 | Breakpoint 1, Perl_pp_add () at pp_hot.c:309 |
| 804 | 309 dSP; dATARGET; tryAMAGICbin(add,opASSIGN); |
| 805 | (gdb) step |
| 806 | 311 dPOPTOPnnrl_ul; |
| 807 | (gdb) |
| 808 | |
| 809 | We looked at this bit of code before, and we said that |
| 810 | C<dPOPTOPnnrl_ul> arranges for two C<NV>s to be placed into C<left> and |
| 811 | C<right> - let's slightly expand it: |
| 812 | |
| 813 | #define dPOPTOPnnrl_ul NV right = POPn; \ |
| 814 | SV *leftsv = TOPs; \ |
| 815 | NV left = USE_LEFT(leftsv) ? SvNV(leftsv) : 0.0 |
| 816 | |
| 817 | C<POPn> takes the SV from the top of the stack and obtains its NV |
| 818 | either directly (if C<SvNOK> is set) or by calling the C<sv_2nv> |
| 819 | function. C<TOPs> takes the next SV from the top of the stack - yes, |
| 820 | C<POPn> uses C<TOPs> - but doesn't remove it. We then use C<SvNV> to |
| 821 | get the NV from C<leftsv> in the same way as before - yes, C<POPn> uses |
| 822 | C<SvNV>. |
| 823 | |
| 824 | Since we don't have an NV for C<$b>, we'll have to use C<sv_2nv> to |
| 825 | convert it. If we step again, we'll find ourselves there: |
| 826 | |
| 827 | (gdb) step |
| 828 | Perl_sv_2nv (sv=0xa0675d0) at sv.c:1669 |
| 829 | 1669 if (!sv) |
| 830 | (gdb) |
| 831 | |
| 832 | We can now use C<Perl_sv_dump> to investigate the SV: |
| 833 | |
| 834 | (gdb) print Perl_sv_dump(sv) |
| 835 | SV = PV(0xa057cc0) at 0xa0675d0 |
| 836 | REFCNT = 1 |
| 837 | FLAGS = (POK,pPOK) |
| 838 | PV = 0xa06a510 "6XXXX"\0 |
| 839 | CUR = 5 |
| 840 | LEN = 6 |
| 841 | $1 = void |
| 842 | |
| 843 | We know we're going to get C<6> from this, so let's finish the |
| 844 | subroutine: |
| 845 | |
| 846 | (gdb) finish |
| 847 | Run till exit from #0 Perl_sv_2nv (sv=0xa0675d0) at sv.c:1671 |
| 848 | 0x462669 in Perl_pp_add () at pp_hot.c:311 |
| 849 | 311 dPOPTOPnnrl_ul; |
| 850 | |
| 851 | We can also dump out this op: the current op is always stored in |
| 852 | C<PL_op>, and we can dump it with C<Perl_op_dump>. This'll give us |
| 853 | similar output to L<B::Debug|B::Debug>. |
| 854 | |
| 855 | (gdb) print Perl_op_dump(PL_op) |
| 856 | { |
| 857 | 13 TYPE = add ===> 14 |
| 858 | TARG = 1 |
| 859 | FLAGS = (SCALAR,KIDS) |
| 860 | { |
| 861 | TYPE = null ===> (12) |
| 862 | (was rv2sv) |
| 863 | FLAGS = (SCALAR,KIDS) |
| 864 | { |
| 865 | 11 TYPE = gvsv ===> 12 |
| 866 | FLAGS = (SCALAR) |
| 867 | GV = main::b |
| 868 | } |
| 869 | } |
| 870 | |
| 871 | # finish this later # |
| 872 | |
| 873 | =head2 Using gdb to look at specific parts of a program |
| 874 | |
| 875 | With the example above, you knew to look for C<Perl_pp_add>, but what if |
| 876 | there were multiple calls to it all over the place, or you didn't know what |
| 877 | the op was you were looking for? |
| 878 | |
| 879 | One way to do this is to inject a rare call somewhere near what you're looking |
| 880 | for. For example, you could add C<study> before your method: |
| 881 | |
| 882 | study; |
| 883 | |
| 884 | And in gdb do: |
| 885 | |
| 886 | (gdb) break Perl_pp_study |
| 887 | |
| 888 | And then step until you hit what you're looking for. This works |
| 889 | well in a loop if you only want to break at certain |
| 890 | iterations: |
| 891 | |
| 892 | for my $c (1..100) { |
| 893 | study if $c == 50; |
| 894 | } |
| 895 | |
| 896 | =head2 Using gdb to look at what the parser/lexer are doing |
| 897 | |
| 898 | If you want to see what perl is doing when parsing/lexing your code, you can |
| 899 | use C<BEGIN {}>: |
| 900 | |
| 901 | print "Before\n"; |
| 902 | BEGIN { study; } |
| 903 | print "After\n"; |
| 904 | |
| 905 | And in gdb: |
| 906 | |
| 907 | (gdb) break Perl_pp_study |
| 908 | |
| 909 | If you want to see what the parser/lexer is doing inside of C<if> blocks and |
| 910 | the like, you need to be a little trickier: |
| 911 | |
| 912 | if ($a && $b && do { BEGIN { study } 1 } && $c) { ... } |
| 913 | |
| 914 | =head1 SOURCE CODE STATIC ANALYSIS |
| 915 | |
| 916 | Various tools exist for analysing C source code B<statically>, as |
| 917 | opposed to B<dynamically>, that is, without executing the code. It is |
| 918 | possible to detect resource leaks, undefined behaviour, type |
| 919 | mismatches, portability problems, code paths that would cause illegal |
| 920 | memory accesses, and other similar problems by just parsing the C code |
| 921 | and examining what the resulting graph says about the execution and |
| 922 | data flows. As a matter of fact, this is exactly how C compilers know |
| 923 | to give warnings about dubious code. |
| 924 | |
| 925 | =head2 lint, splint |
| 926 | |
| 927 | The good old C code quality inspector, C<lint>, is available on several |
| 928 | platforms, but please be aware that there are several different |
| 929 | implementations of it by different vendors, which means that the flags |
| 930 | are not identical across different platforms. |
| 931 | |
| 932 | There is a lint variant called C<splint> (Secure Programming Lint) |
| 933 | available from http://www.splint.org/ that should compile on any |
| 934 | Unix-like platform. |
| 935 | |
| 936 | There are C<lint> and C<splint> targets in Makefile, but you may have to |
| 937 | diddle with the flags (see above). |
| 938 | |
| 939 | =head2 Coverity |
| 940 | |
| 941 | Coverity (http://www.coverity.com/) is a product similar to lint. As a |
| 942 | testbed for the product, the company periodically checks several open |
| 943 | source projects and gives open source developers accounts for the |
| 944 | defect databases. |
| 945 | |
| 946 | =head2 cpd (cut-and-paste detector) |
| 947 | |
| 948 | The cpd tool detects cut-and-paste coding. If one instance of the |
| 949 | cut-and-pasted code changes, all the other spots should probably be |
| 950 | changed, too. Therefore such code should probably be turned into a |
| 951 | subroutine or a macro. |
| 952 | |
| 953 | cpd (http://pmd.sourceforge.net/cpd.html) is part of the pmd project |
| 954 | (http://pmd.sourceforge.net/). pmd was originally written for static |
| 955 | analysis of Java code, but later the cpd part of it was extended to |
| 956 | also parse C and C++. |
| 957 | |
| 958 | Download the pmd-bin-X.Y.zip from the SourceForge site, extract the |
| 959 | pmd-X.Y.jar from it, and then run that on source code thusly: |
| 960 | |
| 961 | java -cp pmd-X.Y.jar net.sourceforge.pmd.cpd.CPD \ |
| 962 | --minimum-tokens 100 --files /some/where/src --language c > cpd.txt |
| 963 | |
| 964 | You may run into memory limits, in which case you should use the -Xmx |
| 965 | option: |
| 966 | |
| 967 | java -Xmx512M ... |
| 968 | |
| 969 | =head2 gcc warnings |
| 970 | |
| 971 | Though much can be written about the inconsistency and coverage |
| 972 | problems of gcc warnings (like C<-Wall> not meaning "all the warnings", |
| 973 | or some common portability problems not being covered by C<-Wall>, or |
| 974 | C<-ansi> and C<-pedantic> both being a poorly defined collection of |
| 975 | warnings, and so forth), gcc is still a useful tool in keeping our |
| 976 | coding nose clean. |
| 977 | |
| 978 | C<-Wall> is on by default. |
| 979 | |
| 980 | The C<-ansi> (and its sidekick, C<-pedantic>) would be nice to be on |
| 981 | always, but unfortunately they are not safe on all platforms: they can, |
| 982 | for example, cause fatal conflicts with the system headers (Solaris |
| 983 | being a prime example). If Configure C<-Dgccansipedantic> is used, the |
| 984 | C<cflags> frontend selects C<-ansi -pedantic> for the platforms where |
| 985 | they are known to be safe. |
| 986 | |
| 987 | Starting from Perl 5.9.4 the following extra flags are added: |
| 988 | |
| 989 | =over 4 |
| 990 | |
| 991 | =item * |
| 992 | |
| 993 | C<-Wendif-labels> |
| 994 | |
| 995 | =item * |
| 996 | |
| 997 | C<-Wextra> |
| 998 | |
| 999 | =item * |
| 1000 | |
| 1001 | C<-Wdeclaration-after-statement> |
| 1002 | |
| 1003 | =back |
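As an illustration of what C<-Wdeclaration-after-statement> enforces, here is a minimal sketch of a function laid out the C89 way, with every declaration ahead of the first statement of its block. The function C<sum_first> is a hypothetical example, not code from the perl source.

```c
#include <assert.h>
#include <stddef.h>

/* Sums the first n elements of values. Every declaration sits at the
 * top of its block, before any statement, as C89 requires. */
long sum_first(const int *values, size_t n)
{
    long total = 0;      /* declarations first ... */
    size_t i;

    assert(values != NULL || n == 0);   /* ... then statements */

    for (i = 0; i < n; i++) {
        long v = values[i];   /* fine: first thing in a nested block */
        total += v;
    }
    return total;
}
```

Moving the declaration of C<i> below the C<assert> would still compile as C99, but it is exactly the pattern this warning flags.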
| 1004 | |
| 1005 | The following flags would be nice to have but they would first need |
| 1006 | their own Augean stablemaster: |
| 1007 | |
| 1008 | =over 4 |
| 1009 | |
| 1010 | =item * |
| 1011 | |
| 1012 | C<-Wpointer-arith> |
| 1013 | |
| 1014 | =item * |
| 1015 | |
| 1016 | C<-Wshadow> |
| 1017 | |
| 1018 | =item * |
| 1019 | |
| 1020 | C<-Wstrict-prototypes> |
| 1021 | |
| 1022 | =back |
| 1023 | |
| 1024 | The C<-Wtraditional> is another example of the annoying tendency of gcc |
| 1025 | to bundle a lot of warnings under one switch (it would be impossible to |
| 1026 | deploy in practice because it would complain a lot) but it does contain |
| 1027 | some warnings that would be beneficial to have available on their own, |
| 1028 | such as the warning about string constants inside macros containing the |
| 1029 | macro arguments: this behaved differently pre-ANSI than it does in |
| 1030 | ANSI, and some C compilers are still in transition, AIX being an |
| 1031 | example. |
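The macro-argument case mentioned above can be sketched as follows. The macro names C<TOY_LITERAL> and C<TOY_STRINGIFY> are hypothetical, chosen for this illustration only.

```c
#include <assert.h>
#include <string.h>

/* In ANSI C the parameter name inside a string literal is left alone,
 * so this always expands to the one-character string "x". Some
 * pre-ANSI preprocessors substituted the argument here instead. */
#define TOY_LITERAL(x) "x"

/* The ANSI way to turn a macro argument into a string is the # operator. */
#define TOY_STRINGIFY(x) #x

const char *toy_literal(void)
{
    return TOY_LITERAL(hello);
}

const char *toy_stringified(void)
{
    return TOY_STRINGIFY(hello);
}

/* Self-check of both expansions under an ANSI preprocessor. */
void toy_macro_selfcheck(void)
{
    assert(strcmp(toy_literal(), "x") == 0);
    assert(strcmp(toy_stringified(), "hello") == 0);
}
```

Code that relies on the first form to embed its argument silently changes meaning between the two preprocessor generations, which is why a standalone warning for it would be useful.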
| 1032 | |
| 1033 | =head2 Warnings of other C compilers |
| 1034 | |
| 1035 | Other C compilers (yes, there B<are> other C compilers than gcc) often |
| 1036 | have their "strict ANSI" or "strict ANSI with some portability |
| 1037 | extensions" modes on, like for example the Sun Workshop has its C<-Xa> |
| 1038 | mode on (though implicitly), or the DEC (these days, HP...) has its |
| 1039 | C<-std1> mode on. |
| 1040 | |
| 1041 | =head1 MEMORY DEBUGGERS |
| 1042 | |
| 1043 | B<NOTE 1>: Running under older memory debuggers such as Purify, |
| 1044 | valgrind or Third Degree greatly slows down the execution: seconds |
| 1045 | become minutes, minutes become hours. For example as of Perl 5.8.1, the |
| 1046 | ext/Encode/t/Unicode.t takes extraordinarily long to complete under |
| 1047 | e.g. Purify, Third Degree, and valgrind. Under valgrind it takes more |
| 1048 | than six hours, even on a snappy computer. The said test must be doing |
| 1049 | something that is quite unfriendly for memory debuggers. If you don't |
| 1050 | feel like waiting, you can simply kill the perl process. Roughly, |
| 1051 | valgrind slows down execution by a factor of 10, AddressSanitizer by a |
| 1052 | factor of 2. |
| 1053 | |
| 1054 | B<NOTE 2>: To minimize the number of memory leak false alarms (see |
| 1055 | L</PERL_DESTRUCT_LEVEL> for more information), you have to set the |
| 1056 | environment variable PERL_DESTRUCT_LEVEL to 2. For example, like this: |
| 1057 | |
| 1058 | env PERL_DESTRUCT_LEVEL=2 valgrind ./perl -Ilib ... |
| 1059 | |
| 1060 | B<NOTE 3>: There are known memory leaks when there are compile-time |
| 1061 | errors within eval or require; seeing C<S_doeval> in the call stack is |
| 1062 | a good sign of these. Fixing these leaks is non-trivial, unfortunately, |
| 1063 | but they must be fixed eventually. |
| 1064 | |
| 1065 | B<NOTE 4>: L<DynaLoader> will not clean up after itself completely |
| 1066 | unless Perl is built with the Configure option |
| 1067 | C<-Accflags=-DDL_UNLOAD_ALL_AT_EXIT>. |
| 1068 | |
| 1069 | =head2 valgrind |
| 1070 | |
| 1071 | The valgrind tool can be used to find both memory leaks and illegal |
| 1072 | heap memory accesses. As of version 3.3.0, Valgrind only supports Linux |
| 1073 | on x86, x86-64 and PowerPC, and Darwin (OS X) on x86 and x86-64. The |
| 1074 | special "test.valgrind" target can be used to run the tests under |
| 1075 | valgrind. Found errors and memory leaks are logged in files named |
| 1076 | F<testfile.valgrind>. |
| 1077 | |
| 1078 | Valgrind also provides a cachegrind tool, invoked on perl as: |
| 1079 | |
| 1080 | VG_OPTS=--tool=cachegrind make test.valgrind |
| 1081 | |
| 1082 | As system libraries (most notably glibc) also trigger errors, |
| 1083 | valgrind allows such errors to be suppressed using suppression files. The |
| 1084 | default suppression file that comes with valgrind already catches a lot |
| 1085 | of them. Some additional suppressions are defined in F<t/perl.supp>. |
| 1086 | |
| 1087 | To get valgrind and for more information see |
| 1088 | |
| 1089 | http://valgrind.org/ |
| 1090 | |
| 1091 | =head2 AddressSanitizer |
| 1092 | |
| 1093 | AddressSanitizer is a clang and gcc extension, included in clang since |
| 1094 | v3.1 and gcc since v4.8. It checks illegal heap pointers, global |
| 1095 | pointers, stack pointers and use after free errors, and is fast enough |
| 1096 | that you can easily compile your debugging or optimized perl with it. |
| 1097 | It does not check memory leaks though. AddressSanitizer is available |
| 1098 | for Linux, Mac OS X and soon on Windows. |
| 1099 | |
| 1100 | To build perl with AddressSanitizer, your Configure invocation should |
| 1101 | look like: |
| 1102 | |
| 1103 | sh Configure -des -Dcc=clang \ |
| 1104 | -Accflags=-fsanitize=address -Aldflags=-fsanitize=address \ |
| 1105 | -Alddlflags=-shared\ -fsanitize=address |
| 1106 | |
| 1107 | where these arguments mean: |
| 1108 | |
| 1109 | =over 4 |
| 1110 | |
| 1111 | =item * -Dcc=clang |
| 1112 | |
| 1113 | This should be replaced by the full path to your clang executable if it |
| 1114 | is not in your path. |
| 1115 | |
| 1116 | =item * -Accflags=-fsanitize=address |
| 1117 | |
| 1118 | Compile perl and extensions sources with AddressSanitizer. Early clang releases spelled this flag C<-faddress-sanitizer>. |
| 1119 | |
| 1120 | =item * -Aldflags=-fsanitize=address |
| 1121 | |
| 1122 | Link the perl executable with AddressSanitizer. |
| 1123 | |
| 1124 | =item * -Alddlflags=-shared\ -fsanitize=address |
| 1125 | |
| 1126 | Link dynamic extensions with AddressSanitizer. You must manually |
| 1127 | specify C<-shared> because using C<-Alddlflags=-shared> will prevent |
| 1128 | Configure from setting a default value for C<lddlflags>, which usually |
| 1129 | contains C<-shared> (at least on Linux). |
| 1130 | |
| 1131 | =back |
| 1132 | |
| 1133 | See also |
| 1134 | L<http://code.google.com/p/address-sanitizer/wiki/AddressSanitizer>. |
| 1135 | |
| 1136 | |
| 1137 | =head1 PROFILING |
| 1138 | |
| 1139 | Depending on your platform there are various ways of profiling Perl. |
| 1140 | |
| 1141 | There are two commonly used techniques of profiling executables: |
| 1142 | I<statistical time-sampling> and I<basic-block counting>. |
| 1143 | |
| 1144 | The first method periodically takes samples of the CPU program counter, |
| 1145 | and since the program counter can be correlated with the code generated |
| 1146 | for functions, we get a statistical view of which functions the |
| 1147 | program is spending its time in. The caveats are that very small/fast |
| 1148 | functions have lower probability of showing up in the profile, and that |
| 1149 | periodically interrupting the program (this is usually done rather |
| 1150 | frequently, in the scale of milliseconds) imposes an additional |
| 1151 | overhead that may skew the results. The first problem can be alleviated |
| 1152 | by running the code for longer (in general this is a good idea for |
| 1153 | profiling); the second problem is usually kept in check by the |
| 1154 | profiling tools themselves. |
| 1155 | |
| 1156 | The second method divides up the generated code into I<basic blocks>. |
| 1157 | Basic blocks are sections of code that are entered only in the |
| 1158 | beginning and exited only at the end. For example, a conditional jump |
| 1159 | starts a basic block. Basic block profiling usually works by |
| 1160 | I<instrumenting> the code by adding I<enter basic block #nnnn> |
| 1161 | book-keeping code to the generated code. During the execution of the |
| 1162 | code the basic block counters are then updated appropriately. The |
| 1163 | caveat is that the added extra code can skew the results: again, the |
| 1164 | profiling tools usually try to factor their own effects out of the |
| 1165 | results. |
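To make the basic-block idea concrete, here is a minimal hand-instrumented sketch. Real tools insert the counters at compile time; in this sketch the block numbering, the C<ENTER_BB> macro and the C<toy_*> names are all hypothetical and written out by hand.

```c
#include <assert.h>

/* One counter per basic block of toy_abs_sum(); the numbering is ours. */
enum { BB_ENTRY, BB_LOOP_BODY, BB_NEGATE, BB_EXIT, BB_COUNT };
unsigned long toy_bb_hits[BB_COUNT];

/* The "enter basic block #nnnn" book-keeping, reduced to an increment. */
#define ENTER_BB(n) (toy_bb_hits[(n)]++)

/* Sums the absolute values of n integers, counting block entries. */
long toy_abs_sum(const int *values, int n)
{
    long total = 0;
    int i;

    assert(values != 0 || n == 0);

    ENTER_BB(BB_ENTRY);
    for (i = 0; i < n; i++) {
        long v = values[i];
        ENTER_BB(BB_LOOP_BODY);
        if (v < 0) {
            ENTER_BB(BB_NEGATE);
            v = -v;
        }
        total += v;
    }
    ENTER_BB(BB_EXIT);
    return total;
}
```

After a run, C<toy_bb_hits> holds exact execution counts per block, unlike the sampling approach, which only estimates where time is spent.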
| 1166 | |
| 1167 | =head2 Gprof Profiling |
| 1168 | |
| 1169 | I<gprof> is a profiling tool available on many Unix platforms which |
| 1170 | uses I<statistical time-sampling>. You can build a profiled version of |
| 1171 | F<perl> by compiling using gcc with the flag C<-pg>. Either edit |
| 1172 | F<config.sh> or re-run F<Configure>. Running the profiled version of |
| 1173 | Perl will create an output file called F<gmon.out> which contains the |
| 1174 | profiling data collected during the execution. |
| 1175 | |
| 1176 | quick hint: |
| 1177 | |
| 1178 | $ sh Configure -des -Dusedevel -Accflags='-pg' \ |
| 1179 | -Aldflags='-pg' -Alddlflags='-pg -shared' \ |
| 1180 | && make perl |
| 1181 | $ ./perl ... # creates gmon.out in current directory |
| 1182 | $ gprof ./perl > out |
| 1183 | $ less out |
| 1184 | |
| 1185 | (you probably need to add C<-shared> to the C<-Alddlflags> line until RT |
| 1186 | #118199 is resolved) |
| 1187 | |
| 1188 | The F<gprof> tool can then display the collected data in various ways. |
| 1189 | Usually F<gprof> understands the following options: |
| 1190 | |
| 1191 | =over 4 |
| 1192 | |
| 1193 | =item * -a |
| 1194 | |
| 1195 | Suppress statically defined functions from the profile. |
| 1196 | |
| 1197 | =item * -b |
| 1198 | |
| 1199 | Suppress the verbose descriptions in the profile. |
| 1200 | |
| 1201 | =item * -e routine |
| 1202 | |
| 1203 | Exclude the given routine and its descendants from the profile. |
| 1204 | |
| 1205 | =item * -f routine |
| 1206 | |
| 1207 | Display only the given routine and its descendants in the profile. |
| 1208 | |
| 1209 | =item * -s |
| 1210 | |
| 1211 | Generate a summary file called F<gmon.sum> which then may be given to |
| 1212 | subsequent gprof runs to accumulate data over several runs. |
| 1213 | |
| 1214 | =item * -z |
| 1215 | |
| 1216 | Display routines that have zero usage. |
| 1217 | |
| 1218 | =back |
| 1219 | |
| 1220 | For more detailed explanation of the available commands and output |
| 1221 | formats, see your own local documentation of F<gprof>. |
| 1222 | |
| 1223 | =head2 GCC gcov Profiling |
| 1224 | |
| 1225 | I<Basic block profiling> is officially available in gcc 3.0 and later. |
| 1226 | You can build a profiled version of F<perl> by compiling using gcc with |
| 1227 | the flags C<-fprofile-arcs -ftest-coverage>. Either edit F<config.sh> |
| 1228 | or re-run F<Configure>. |
| 1229 | |
| 1230 | quick hint: |
| 1231 | |
| 1232 | $ sh Configure -des -Dusedevel -Doptimize='-g' \ |
| 1233 | -Accflags='-fprofile-arcs -ftest-coverage' \ |
| 1234 | -Aldflags='-fprofile-arcs -ftest-coverage' \ |
| 1235 | -Alddlflags='-fprofile-arcs -ftest-coverage -shared' \ |
| 1236 | && make perl |
| 1237 | $ rm -f regexec.c.gcov regexec.gcda |
| 1238 | $ ./perl ... |
| 1239 | $ gcov regexec.c |
| 1240 | $ less regexec.c.gcov |
| 1241 | |
| 1242 | (you probably need to add C<-shared> to the C<-Alddlflags> line until RT |
| 1243 | #118199 is resolved) |
| 1244 | |
| 1245 | Running the profiled version of Perl will cause profile output to be |
| 1246 | generated. For each source file an accompanying F<.gcda> file will be |
| 1247 | created. |
| 1248 | |
| 1249 | To display the results you use the I<gcov> utility (which should be |
| 1250 | installed if you have gcc 3.0 or newer installed). F<gcov> is run on |
| 1251 | source code files, like this: |
| 1252 | |
| 1253 | gcov sv.c |
| 1254 | |
| 1255 | which will cause F<sv.c.gcov> to be created. The F<.gcov> files contain |
| 1256 | the source code annotated with relative frequencies of execution |
| 1257 | indicated by "#" markers. If you want to generate F<.gcov> files for |
| 1258 | all profiled object files, you can run something like this: |
| 1259 | |
| 1260 | for file in `find . -name \*.gcno` |
| 1261 | do sh -c "cd `dirname $file` && gcov `basename $file .gcno`" |
| 1262 | done |
| 1263 | |
| 1264 | Useful options of F<gcov> include C<-b> which will summarise the basic |
| 1265 | block, branch, and function call coverage, and C<-c> which instead of |
| 1266 | relative frequencies will use the actual counts. For more information |
| 1267 | on the use of F<gcov> and basic block profiling with gcc, see the |
| 1268 | latest GNU CC manual. As of gcc 4.8, this is at |
| 1269 | L<http://gcc.gnu.org/onlinedocs/gcc/Gcov-Intro.html#Gcov-Intro> |
| 1270 | |
| 1271 | =head1 MISCELLANEOUS TRICKS |
| 1272 | |
| 1273 | =head2 PERL_DESTRUCT_LEVEL |
| 1274 | |
| 1275 | If you want to run any of the tests yourself manually using e.g. |
| 1276 | valgrind, please note that by default perl B<does not> explicitly |
| 1277 | clean up all the memory it has allocated (such as global memory arenas) |
| 1278 | but instead lets the exit() of the whole program "take care" of such |
| 1279 | allocations, also known as "global destruction of objects". |
| 1280 | |
| 1281 | There is a way to tell perl to do complete cleanup: set the environment |
| 1282 | variable PERL_DESTRUCT_LEVEL to a non-zero value. The t/TEST wrapper |
| 1283 | does set this to 2, and this is what you need to do too, if you don't |
| 1284 | want to see the "global leaks". For example, for running under valgrind: |
| 1285 | |
| 1286 | env PERL_DESTRUCT_LEVEL=2 valgrind ./perl -Ilib t/foo/bar.t |
| 1287 | |
| 1288 | (Note: the mod_perl apache module also uses this environment variable |
| 1289 | for its own purposes and extends its semantics. Refer to the mod_perl |
| 1290 | documentation for more information. Also, spawned threads do the |
| 1291 | equivalent of setting this variable to the value 1.) |
| 1292 | |
| 1293 | If, at the end of a run you get the message I<N scalars leaked>, you |
| 1294 | can recompile with C<-DDEBUG_LEAKING_SCALARS>, which will cause the |
| 1295 | addresses of all those leaked SVs to be dumped along with details as to |
| 1296 | where each SV was originally allocated. This information is also |
| 1297 | displayed by Devel::Peek. Note that the extra details recorded with |
| 1298 | each SV increase memory usage, so it shouldn't be used in production |
| 1299 | environments. It also converts C<new_SV()> from a macro into a real |
| 1300 | function, so you can use your favourite debugger to discover where |
| 1301 | those pesky SVs were allocated. |
| 1302 | |
| 1303 | If you see that you're leaking memory at runtime, but neither valgrind |
| 1304 | nor C<-DDEBUG_LEAKING_SCALARS> will find anything, you're probably |
| 1305 | leaking SVs that are still reachable and will be properly cleaned up |
| 1306 | during destruction of the interpreter. In such cases, using the C<-Dm> |
| 1307 | switch can point you to the source of the leak. If the executable was |
| 1308 | built with C<-DDEBUG_LEAKING_SCALARS>, C<-Dm> will output SV |
| 1309 | allocations in addition to memory allocations. Each SV allocation has a |
| 1310 | distinct serial number that will be written on creation and destruction |
| 1311 | of the SV. So if you're executing the leaking code in a loop, you need |
| 1312 | to look for SVs that are created, but never destroyed between each |
| 1313 | cycle. If such an SV is found, set a conditional breakpoint within |
| 1314 | C<new_SV()> and make it break only when C<PL_sv_serial> is equal to the |
| 1315 | serial number of the leaking SV. Then you will catch the interpreter in |
| 1316 | exactly the state where the leaking SV is allocated, which is |
| 1317 | sufficient in many cases to find the source of the leak. |
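The "created but never destroyed" search described above can be sketched over a simplified log. This is a toy model under stated assumptions: the C<toy_event> record and C<toy_find_leaks> are hypothetical, and the real C<-Dm> output is text that would first have to be parsed into such records.

```c
#include <assert.h>
#include <stddef.h>

/* One log record: a toy SV was created (+1) or destroyed (-1). */
typedef struct {
    unsigned long serial;
    int delta;            /* +1 on creation, -1 on destruction */
} toy_event;

/* Writes the serials that were created but never destroyed into leaked[]
 * (at most max entries) and returns how many were stored. */
size_t toy_find_leaks(const toy_event *log, size_t n,
                      unsigned long *leaked, size_t max)
{
    size_t i, j, found = 0;

    assert(log != NULL || n == 0);

    for (i = 0; i < n; i++) {
        int balance = 0;
        int seen_before = 0;

        /* Handle each serial once, at its first appearance. */
        for (j = 0; j < i; j++)
            if (log[j].serial == log[i].serial)
                seen_before = 1;
        if (seen_before)
            continue;

        /* Add up creations and destructions for this serial. */
        for (j = i; j < n; j++)
            if (log[j].serial == log[i].serial)
                balance += log[j].delta;

        if (balance > 0 && found < max)
            leaked[found++] = log[i].serial;
    }
    return found;
}
```

The serial numbers this returns are the candidates to plug into the conditional breakpoint on C<PL_sv_serial>.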
| 1318 | |
| 1319 | As C<-Dm> is using the PerlIO layer for output, it will by itself |
| 1320 | allocate quite a bunch of SVs, which are hidden to avoid recursion. You |
| 1321 | can bypass the PerlIO layer if you use the SV logging provided by |
| 1322 | C<-DPERL_MEM_LOG> instead. |
| 1323 | |
| 1324 | =head2 PERL_MEM_LOG |
| 1325 | |
| 1326 | If compiled with C<-DPERL_MEM_LOG>, both memory and SV allocations go |
| 1327 | through logging functions, which is handy for breakpoint setting. |
| 1328 | |
| 1329 | Unless C<-DPERL_MEM_LOG_NOIMPL> is also compiled, the logging functions |
| 1330 | read $ENV{PERL_MEM_LOG} to determine whether to log the event, and if |
| 1331 | so how: |
| 1332 | |
| 1333 | $ENV{PERL_MEM_LOG} =~ /m/ Log all memory ops |
| 1334 | $ENV{PERL_MEM_LOG} =~ /s/ Log all SV ops |
| 1335 | $ENV{PERL_MEM_LOG} =~ /t/ Include timestamp in log |
| 1336 | $ENV{PERL_MEM_LOG} =~ /^(\d+)/ Write to FD given (default is 2) |
| 1337 | |
| 1338 | Memory logging is somewhat similar to C<-Dm> but is independent of |
| 1339 | C<-DDEBUGGING>, and at a higher level; all uses of Newx(), Renew(), and |
| 1340 | Safefree() are logged with the caller's source code file and line |
| 1341 | number (and C function name, if supported by the C compiler). In |
| 1342 | contrast, C<-Dm> is directly at the point of C<malloc()>. SV logging is |
| 1343 | similar. |
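The way the C<PERL_MEM_LOG> value is interpreted, as listed in the table above, can be sketched like this. It is not perl's actual implementation: C<toy_mem_log_opts> and C<toy_parse_mem_log> are hypothetical names, and the sketch only mirrors the four documented patterns.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

typedef struct {
    int log_mem;     /* 'm' present: log memory ops */
    int log_sv;      /* 's' present: log SV ops */
    int timestamp;   /* 't' present: include a timestamp */
    int fd;          /* leading digits, or 2 when there are none */
} toy_mem_log_opts;

/* Interprets a PERL_MEM_LOG-style value according to the table. */
toy_mem_log_opts toy_parse_mem_log(const char *value)
{
    toy_mem_log_opts o;
    const char *p = value;

    assert(value != NULL);

    o.log_mem   = strchr(value, 'm') != NULL;
    o.log_sv    = strchr(value, 's') != NULL;
    o.timestamp = strchr(value, 't') != NULL;
    o.fd        = 2;

    /* /^(\d+)/: digits count only at the very start of the string. */
    if (isdigit((unsigned char)*p)) {
        o.fd = 0;
        while (isdigit((unsigned char)*p))
            o.fd = o.fd * 10 + (*p++ - '0');
    }
    return o;
}
```

For instance, a value of C<3ms> would select file descriptor 3 with memory and SV logging and no timestamps, while C<st> keeps the default descriptor 2.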
| 1344 | |
| 1345 | Since the logging doesn't use PerlIO, all SV allocations are logged and |
| 1346 | no extra SV allocations are introduced by enabling the logging. If |
| 1347 | compiled with C<-DDEBUG_LEAKING_SCALARS>, the serial number for each SV |
| 1348 | allocation is also logged. |
| 1349 | |
| 1350 | =head2 DDD over gdb |
| 1351 | |
| 1352 | Those debugging perl with the DDD frontend over gdb may find the |
| 1353 | following useful: |
| 1354 | |
| 1355 | You can extend the data conversion shortcuts menu, so for example you |
| 1356 | can display an SV's IV value with one click, without doing any typing. |
| 1357 | To do that, simply edit the F<~/.ddd/init> file and add, after: |
| 1358 | |
| 1359 | ! Display shortcuts. |
| 1360 | Ddd*gdbDisplayShortcuts: \ |
| 1361 | /t () // Convert to Bin\n\ |
| 1362 | /d () // Convert to Dec\n\ |
| 1363 | /x () // Convert to Hex\n\ |
| 1364 | /o () // Convert to Oct\n\ |
| 1365 | |
| 1366 | the following two lines: |
| 1367 | |
| 1368 | ((XPV*) (())->sv_any )->xpv_pv // 2pvx\n\ |
| 1369 | ((XPVIV*) (())->sv_any )->xiv_iv // 2ivx |
| 1370 | |
| 1371 | so now you can do ivx and pvx lookups, or you can plug in the sv_peek |
| 1372 | "conversion": |
| 1373 | |
| 1374 | Perl_sv_peek(my_perl, (SV*)()) // sv_peek |
| 1375 | |
| 1376 | (The my_perl is for threaded builds.) Just remember that every line |
| 1377 | except the last one should end with \n\ |
| 1378 | |
| 1379 | Alternatively edit the init file interactively via: 3rd mouse button -> |
| 1380 | New Display -> Edit Menu |
| 1381 | |
| 1382 | Note: you can define up to 20 conversion shortcuts in the gdb section. |
| 1383 | |
| 1384 | =head2 Poison |
| 1385 | |
| 1386 | If you see in a debugger a memory area mysteriously full of 0xABABABAB |
| 1387 | or 0xEFEFEFEF, you may be seeing the effect of the Poison() macros, see |
| 1388 | L<perlclib>. |
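The effect can be sketched with a toy allocator that fills memory with the same two byte patterns. The C<toy_*> names are hypothetical and this is not how perl's Poison() macros are defined; it only shows why such fill patterns make stale or uninitialised memory easy to recognise in a debugger.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_POISON_NEW  0xAB   /* fresh, not yet initialised memory */
#define TOY_POISON_FREE 0xEF   /* memory about to be released */

/* Allocates n bytes and marks them as uninitialised. */
void *toy_alloc(size_t n)
{
    void *p = malloc(n);
    if (p != NULL)
        memset(p, TOY_POISON_NEW, n);
    return p;
}

/* Marks n bytes at p as dead, then releases them. */
void toy_free(void *p, size_t n)
{
    if (p != NULL) {
        memset(p, TOY_POISON_FREE, n);
        free(p);
    }
}

/* Returns 1 if all n bytes at p carry the given pattern. */
int toy_is_filled(const void *p, size_t n, unsigned char pattern)
{
    const unsigned char *b = (const unsigned char *)p;
    size_t i;

    assert(p != NULL || n == 0);

    for (i = 0; i < n; i++)
        if (b[i] != pattern)
            return 0;
    return 1;
}
```

Note that an optimising compiler may drop the C<memset> that immediately precedes C<free>, so a sketch like this is most informative in an unoptimised debugging build.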
| 1389 | |
| 1390 | =head2 Read-only optrees |
| 1391 | |
| 1392 | Under ithreads the optree is read only. If you want to enforce this, to |
| 1393 | check for write accesses from buggy code, compile with |
| 1394 | C<-Accflags=-DPERL_DEBUG_READONLY_OPS> to enable code that allocates |
| 1395 | op memory via C<mmap>, and sets it read-only when it is attached to a |
| 1396 | subroutine. Any write access to an op results in a C<SIGBUS> and |
| 1397 | abort. |
| 1398 | |
| 1399 | This code is intended for development only, and may not be portable |
| 1400 | even to all Unix variants. Also, it is an 80% solution, in that it |
| 1401 | isn't able to make all ops read only. Specifically it does not apply to |
| 1402 | op slabs belonging to C<BEGIN> blocks. |
| 1403 | |
| 1404 | However, as an 80% solution it is still effective, as it has caught |
| 1405 | bugs in the past. |
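The underlying technique, allocating with C<mmap> and then dropping write permission, can be sketched as below. This is a POSIX-only illustration with hypothetical C<toy_*> names, not the code perl enables with C<PERL_DEBUG_READONLY_OPS>; it assumes anonymous mappings are available.

```c
#ifndef _DEFAULT_SOURCE
#  define _DEFAULT_SOURCE 1   /* glibc: expose MAP_ANONYMOUS in strict modes */
#endif

#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#ifndef MAP_ANONYMOUS
#  define MAP_ANONYMOUS MAP_ANON
#endif

/* Copies len bytes of data into fresh pages and makes them read-only.
 * Returns NULL on failure; on success *maplen receives the mapped size. */
void *toy_freeze(const void *data, size_t len, size_t *maplen)
{
    long page = sysconf(_SC_PAGESIZE);
    size_t size;
    void *mem;

    assert(data != NULL && len > 0 && maplen != NULL);
    if (page <= 0)
        return NULL;

    /* Round up to whole pages, since protection applies per page. */
    size = ((len + (size_t)page - 1) / (size_t)page) * (size_t)page;

    mem = mmap(NULL, size, PROT_READ | PROT_WRITE,
               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return NULL;

    memcpy(mem, data, len);

    /* From here on, any store into these pages raises a signal. */
    if (mprotect(mem, size, PROT_READ) != 0) {
        munmap(mem, size);
        return NULL;
    }
    *maplen = size;
    return mem;
}

/* Unmaps pages obtained from toy_freeze(); returns 0 on success. */
int toy_release(void *mem, size_t maplen)
{
    return munmap(mem, maplen);
}
```

Reads from the frozen copy keep working, while a stray write stops the process at the faulting instruction, which is what makes the buggy code easy to find in a debugger.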
| 1406 | |
| 1407 | =head2 When is a bool not a bool? |
| 1408 | |
| 1409 | On pre-C99 compilers, C<bool> is defined as equivalent to C<char>. |
| 1410 | Consequently, assigning any larger type to a C<bool> is unsafe because the |
| 1411 | value may be truncated. The C<cBOOL> macro exists to cast it correctly. |
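A minimal sketch of the truncation problem follows. C<toy_bool> and C<TOY_CBOOL> are hypothetical stand-ins for a character-sized C<bool> and for C<cBOOL>; the sketch assumes 8-bit chars.

```c
#include <assert.h>

/* Stand-in for a pre-C99 build where bool is just a character type. */
typedef unsigned char toy_bool;

/* Same idea as cBOOL: collapse any scalar to exactly 0 or 1 first. */
#define TOY_CBOOL(x) ((x) ? (toy_bool)1 : (toy_bool)0)

/* A plain conversion keeps only the low bits, so 0x100 becomes 0. */
toy_bool toy_flag_unsafe(unsigned long flags)
{
    return (toy_bool)flags;
}

/* Testing for non-zero before converting preserves the truth value. */
toy_bool toy_flag_safe(unsigned long flags)
{
    return TOY_CBOOL(flags);
}

/* Self-check of the two behaviours (assumes 8-bit chars). */
void toy_bool_selfcheck(void)
{
    assert(toy_flag_unsafe(0x100UL) == 0);
    assert(toy_flag_safe(0x100UL) == 1);
}
```

A flag test such as C<flags & 0x100> is therefore true as an integer but false once it has been squeezed into a character-sized C<bool> without the macro.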
| 1412 | |
| 1413 | On those platforms and compilers where C<bool> really is a boolean (C++, |
| 1414 | C99), it is easy to forget the cast. You can force C<bool> to be a C<char> |
| 1415 | by compiling with C<-Accflags=-DPERL_BOOL_AS_CHAR>. You may also wish to |
| 1416 | run C<Configure> with something like |
| 1417 | |
| 1418 | -Accflags='-Wconversion -Wno-sign-conversion -Wno-shorten-64-to-32' |
| 1419 | |
| 1420 | or your compiler's equivalent to make it easier to spot any unsafe truncations |
| 1421 | that show up. |
| 1422 | |
| 1423 | =head2 The .i Targets |
| 1424 | |
| 1425 | You can expand the macros in a F<foo.c> file by saying |
| 1426 | |
| 1427 | make foo.i |
| 1428 | |
| 1429 | which will expand the macros using cpp. Don't be scared by the |
| 1430 | results. |
| 1431 | |
| 1432 | =head1 AUTHOR |
| 1433 | |
| 1434 | This document was originally written by Nathan Torkington, and is |
| 1435 | maintained by the perl5-porters mailing list. |