It's then up to you to apply these patches, using something like
- # last=`ls -t *.gz | sed q`
+ # last="`cat ../perl-current/.patch`.gz"
# rsync -avz rsync://public.activestate.com/perl-current-diffs/ .
# find . -name '*.gz' -newer $last -exec gzcat {} \; >blead.patch
# cd ../perl-current
7 call_method("PUSH", G_SCALAR|G_DISCARD);
8 LEAVE;
-The lines which concern the mark stack are the first, fifth and last
-lines: they save away, restore and remove the current position of the
-argument stack.
-
Let's examine the whole implementation, for practice:
1 PUSHMARK(SP);
5 PUTBACK;
-Next we tell Perl to make the change to the global stack pointer: C<dSP>
-only gave us a local copy, not a reference to the global.
+Next we tell Perl to update the global stack pointer from our internal
+variable: C<dSP> only gave us a local copy, not a reference to the global.
6 ENTER;
7 call_method("PUSH", G_SCALAR|G_DISCARD);
To actually do the magic method call, we have to call a subroutine in
Perl space: C<call_method> takes care of that, and it's described in
L<perlcall>. We call the C<PUSH> method in scalar context, and we're
-going to discard its return value.
+going to discard its return value. The call_method() function
+removes the top element of the mark stack, so there is nothing for
+the caller to clean up.
=item Save stack
which will expand the macros using cpp. Don't be scared by the results.
+=head1 SOURCE CODE STATIC ANALYSIS
+
+Various tools exist for analysing C source code B<statically>, that
+is, without executing the code, as opposed to B<dynamically>.
+It is possible to detect resource leaks, undefined behaviour, type
+mismatches, portability problems, code paths that would cause illegal
+memory accesses, and other similar problems just by parsing the C code
+and examining what the resulting graph says about the execution and
+data flows. As a matter of fact, this is exactly how C compilers know
+to give warnings about dubious code.
+
+=head2 lint, splint
+
+The good old C code quality inspector, C<lint>, is available on
+several platforms, but please be aware that there are several
+different implementations of it by different vendors, which means that
+the flags are not identical across different platforms.
+
+There is a lint variant called C<splint> (Secure Programming Lint)
+available from http://www.splint.org/ that should compile on any
+Unix-like platform.
+
+There are C<lint> and C<splint> targets in the Makefile, but you may
+have to diddle with the flags (see above).
+
+=head2 Coverity
+
+Coverity (http://www.coverity.com/) is a product similar to lint. As
+a testbed for their product they periodically check several open
+source projects, and they give open source developers accounts to
+the defect databases.
+
+=head2 cpd (cut-and-paste detector)
+
+The cpd tool detects cut-and-paste coding. If one instance of the
+cut-and-pasted code changes, all the other spots should probably be
+changed, too. Therefore such code should probably be turned into a
+subroutine or a macro.
+
+cpd (http://pmd.sourceforge.net/cpd.html) is part of the pmd project
+(http://pmd.sourceforge.net/). pmd was originally written for static
+analysis of Java code, but later the cpd part of it was extended to
+also parse C and C++.
+
+Download the pmd-X.Y.jar from the SourceForge site, and then run
+it on source code thusly:
+
+ java -cp pmd-X.Y.jar net.sourceforge.pmd.cpd.CPD --minimum-tokens 100 --files /some/where/src --language c > cpd.txt
+
+You may run into memory limits, in which case you should use the -Xmx option:
+
+ java -Xmx512M ...
+
+=head2 gcc warnings
+
+Though much can be written about the inconsistency and coverage
+problems of gcc warnings (like C<-Wall> not meaning "all the
+warnings", or some common portability problems not being covered by
+C<-Wall>, or C<-ansi> and C<-pedantic> both being a poorly defined
+collection of warnings, and so forth), gcc is still a useful tool in
+keeping our coding nose clean.
+
+C<-Wall> is on by default.
+
+It would be nice to have C<-ansi> (and its sidekick, C<-pedantic>) on
+always, but unfortunately they are not safe on all platforms: they can,
+for example, cause fatal conflicts with the system headers
+(Solaris being a prime example). The C<cflags> frontend selects
+C<-ansi -pedantic> for the platforms where they are known to be safe.
+
+Starting from Perl 5.9.4 the following extra flags are added:
+
+=over 4
+
+=item *
+
+C<-Wendif-labels>
+
+=item *
+
+C<-Wextra>
+
+=item *
+
+C<-Wdeclaration-after-statement>
+
+=back
+
+The following flags would be nice to have but they would first need
+their own Stygian stablemaster:
+
+=over 4
+
+=item *
+
+C<-Wpointer-arith>
+
+=item *
+
+C<-Wshadow>
+
+=item *
+
+C<-Wstrict-prototypes>
+
+=back
+
+C<-Wtraditional> is another example of the annoying tendency of
+gcc to bundle a lot of warnings under one switch. It would be
+impossible to deploy in practice because it would complain a lot, but
+it does contain some warnings that would be beneficial to have available
+on their own, such as the warning about string constants inside macros
+containing the macro arguments: this behaved differently pre-ANSI
+than it does in ANSI, and some C compilers are still in transition,
+AIX being an example.
+
+=head2 Warnings of other C compilers
+
+Other C compilers (yes, there B<are> other C compilers than gcc) often
+have their "strict ANSI" or "strict ANSI with some portability extensions"
+modes on: for example, Sun Workshop has its C<-Xa> mode on
+(though implicitly), and the DEC (these days, HP...) compiler has its
+C<-std1> mode on.
+
+=head2 DEBUGGING
+
+You can compile a special debugging version of Perl, which allows you
+to use the C<-D> option of Perl to tell more about what Perl is doing.
+But sometimes there is no alternative but to dive in with a debugger,
+either to see the stack trace of a core dump (very useful in a bug
+report), or to figure out what went wrong before the core dump
+happened, or how we ended up with wrong or unexpected results.
+
=head2 Poking at Perl
To really poke around with Perl, you'll probably want to build Perl for
make
C<-g> is a flag to the C compiler to have it produce debugging
-information which will allow us to step through a running program.
+information which will allow us to step through a running program,
+and to see which C function we are in (without the debugging
+information we might see only the numerical addresses of the functions,
+which is not very helpful).
+
F<Configure> will also turn on the C<DEBUGGING> compilation symbol which
enables all the internal debugging code in Perl. There are a whole bunch
of things you can debug with this: L<perlrun> lists them all, and the
=item *
-We'll use C<gdb> for our examples here; the principles will apply to any
-debugger, but check the manual of the one you're using.
+We'll use C<gdb> for our examples here; the principles will apply to
+any debugger (many vendors call their debugger C<dbx>), but check the
+manual of the one you're using.
=back
gdb ./perl
+Or if you have a core dump:
+
+ gdb ./perl core
+
You'll want to do that in your Perl source tree so the debugger can read
the source code. You should see the copyright message, followed by the
prompt.
=back
+=head2 Common problems when patching Perl source code
+
+Perl source plays by ANSI C89 rules: no C99 (or C++) extensions. In
+some cases we have to take pre-ANSI requirements into consideration.
+You don't care about some particular platform having broken Perl?
+I hear there is still a strong demand for J2EE programmers.
+
+=head2 Perl environment problems
+
+=over 4
+
+=item *
+
+Not compiling with threading
+
+Compiling with threading (-Duseithreads) completely rewrites
+the function prototypes of Perl. You had better try your changes
+with that. Related to this is the difference between "Perl_-less"
+and "Perl_-ly" APIs, for example:
+
+ Perl_sv_setiv(aTHX_ ...);
+ sv_setiv(...);
+
+The first one explicitly passes in the context, which is needed for
+e.g. threaded builds. The second one does that implicitly; do not get
+them mixed.
+
+See L<perlguts/"How multiple interpreters and concurrency are supported">
+for further discussion about context.
+
+=item *
+
+Not compiling with -DDEBUGGING
+
+The DEBUGGING define exposes more code to the compiler,
+therefore more ways for things to go wrong. You should try it.
+
+=item *
+
+Not exporting your new function
+
+Some platforms (Win32, AIX, VMS, OS/2, to name a few) require any
+function that is part of the public API (the shared Perl library)
+to be explicitly marked as exported. See the discussion about
+F<embed.pl> in L<perlguts>.
+
+=item *
+
+Exporting your new function
+
+The new shiny result of either genuine new functionality or your
+arduous refactoring is now ready and correctly exported. So what
+could possibly be wrong?
+
+Maybe simply that your function did not need to be exported in the
+first place. Perl has a long and not so glorious history of exporting
+functions that it should not have.
+
+If the function is used only inside one source code file, make it
+static. See the discussion about F<embed.pl> in L<perlguts>.
+
+If the function is used across several files, but intended only for
+Perl's internal use (and this should be the common case), do not
+export it to the public API. See the discussion about F<embed.pl>
+in L<perlguts>.
+
+=back
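+
+The relationship between the two API styles in the threading item
+above can be mimicked outside Perl. The following is a minimal sketch,
+not Perl's actual definitions: demo_interp, Demo_set_iv(),
+demo_set_iv() and my_ctx are hypothetical names standing in for the
+interpreter structure, a "Perl_-ly" function, its "Perl_-less" macro,
+and the context variable.
+
```c
#include <assert.h>

/* Hypothetical stand-in for the interpreter structure. */
typedef struct demo_interp {
    long last_iv;   /* last integer stored through the API */
    int  calls;     /* number of API calls seen by this interpreter */
} demo_interp;

/* "Long" name: the context is always the explicit first argument. */
static void Demo_set_iv(demo_interp *my_ctx, long value)
{
    my_ctx->last_iv = value;
    my_ctx->calls++;
}

/* "Short" name: a macro that silently passes whatever variable
 * called my_ctx is in scope at the call site. */
#define demo_set_iv(value) Demo_set_iv(my_ctx, (value))

/* A caller that receives the context may use either spelling. */
static long demo_store_twice(demo_interp *my_ctx, long value)
{
    Demo_set_iv(my_ctx, value);   /* explicit context */
    demo_set_iv(value + 1);       /* implicit context via the macro */
    return my_ctx->last_iv;
}
```
+
+Using demo_set_iv() in a function that has no my_ctx variable in scope
+fails to compile, which is roughly the kind of breakage that only
+shows up once a change is built with a threaded configuration.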
+
+=head2 Portability problems
+
+The following are common causes of compilation and/or execution
+failures, not specific to Perl as such. The C FAQ is good bedtime
+reading. Please test your changes with as many C compilers and
+platforms as possible -- we will, anyway, and it's nice to save
+oneself from public embarrassment.
+
+Also study L<perlport> carefully to avoid any bad assumptions
+about the operating system, filesystem, and so forth.
+
+Do not assume an operating system indicates a certain compiler.
+
+=over 4
+
+=item *
+
+Casting pointers to integers or casting integers to pointers
+
+ void castaway(U8* p)
+ {
+ IV i = p;
+
+or
+
+ void castaway(U8* p)
+ {
+ IV i = (IV)p;
+
+Both are bad, and broken, and unportable. Use the PTR2IV()
+macro that does it right. (Likewise, there are PTR2UV(), PTR2NV(),
+INT2PTR(), and NUM2PTR().)
+
+=item *
+
+Casting between function pointers and data pointers
+
+Technically speaking, casting between function pointers and data
+pointers is unportable and undefined. Practically speaking it seems
+to work, but you should still use the FPTR2DPTR() and DPTR2FPTR()
+macros. Sometimes you can also play games with unions.
+
+=item *
+
+Assuming sizeof(int) == sizeof(long)
+
+There are platforms where longs are 64 bits, and platforms where ints
+are 64 bits, and while we are out to shock you, even platforms where
+shorts are 64 bits. This is all legal according to the C standard.
+(In other words, "long long" is not a portable way to specify 64 bits,
+and "long long" is not even guaranteed to be any wider than "long".)
+Use the definitions IV, UV, IVSIZE, I32SIZE, and so forth. Avoid
+things like I32 because they are B<not> guaranteed to be I<exactly>
+32 bits (they are I<at least> 32 bits), nor are they guaranteed to
+be B<int> or B<long>. If you really explicitly need 64-bit variables,
+use I64 and U64, but only if guarded by HAS_QUAD.
+
+=item *
+
+Assuming one can dereference any type of pointer for any type of data
+
+ char *p = ...;
+ long pony = *p;
+
+Many platforms, quite rightly so, will give you a core dump instead
+of a pony if p happens not to be correctly aligned.
+
+=item *
+
+Lvalue casts
+
+ (int)*p = ...;
+
+Simply not portable. Get your lvalue to be of the right type,
+or maybe use temporary variables.
+
+=item *
+
+Mixing #define and #ifdef
+
+ #define BURGLE(x) ... \
+ #ifdef BURGLE_OLD_STYLE
+ ... do it the old way ... \
+ #else
+ ... do it the new way ... \
+ #endif
+
+You cannot portably "stack" cpp directives. For example in the
+above you need two separate #defines, one in each #ifdef branch.
+
+=item *
+
+Using //-comments
+
+ // This function bamfoodles the zorklator.
+
+That is C99 or C++. Perl is C89. Using the //-comments is silently
+allowed by many C compilers but cranking up the ANSI C89 strictness
+(which we like to do) causes the compilation to fail.
+
+=item *
+
+Mixing declarations and code
+
+ void zorklator()
+ {
+ int n = 3;
+ set_zorkmids(n);
+ int q = 4;
+
+That is C99 or C++. Some C compilers allow that, but you shouldn't.
+
+=item *
+
+Introducing variables inside for()
+
+ for(int i = ...; ...; ...)
+
+That is C99 or C++. While it would indeed be awfully nice to have that
+also in C89, to limit the scope of the loop variable, alas, we cannot.
+
+=item *
+
+Mixing signed char pointers with unsigned char pointers
+
+ int foo(char *s) { ... }
+ ...
+ unsigned char *t = ...; /* Or U8* t = ... */
+ foo(t);
+
+While this is legal practice, it is certainly dubious, and downright
+fatal on at least one platform: for example VMS cc considers this a
+fatal error. One cause for people often making this mistake is that a
+"naked char" and therefore dereferencing a "naked char pointer" have
+an implementation-defined signedness: it depends on the compiler and
+the platform whether the result is signed or unsigned.
+
+=item *
+
+Macros that have string constants and their arguments as substrings of
+the string constants
+
+ #define FOO(n) printf("number = %d\n", n)
+ FOO(10);
+
+Pre-ANSI semantics for that was equivalent to
+
+ printf("10umber = %d\10", 10);
+
+which is probably not what you were expecting. Unfortunately at least
+one reasonably common and modern C compiler does "real backward
+compatibility" here: in AIX that is what still happens even though the
+rest of the AIX compiler is very happily C89.
+
+=item *
+
+Blindly using variadic macros
+
+gcc has had them for a while with its own syntax, and C99
+brought them with a standardized syntax. Don't use the former,
+and use the latter only if HAS_C99_VARIADIC_MACROS is defined.
+
+=item *
+
+Blindly passing va_list
+
+Not all platforms support passing va_list to further varargs (stdarg)
+functions. The right thing to do is to copy the va_list using
+Perl_va_copy() if NEED_VA_COPY is defined.
+
+=back
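+
+Several of the items above have straightforward C89-clean
+alternatives. The sketch below is one possible illustration, assuming
+nothing beyond the standard C library: read_long_unaligned() and
+sum_bytes() are hypothetical helpers, not Perl functions. The first
+copies bytes with memcpy() instead of dereferencing a possibly
+misaligned pointer; the second uses an explicit unsigned char pointer,
+declares its variables before any statement, and keeps the loop
+variable out of the for().
+
```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: read a long stored at any byte offset.
 * memcpy() has no alignment requirement, unlike *(long *)p. */
static long read_long_unaligned(const char *p)
{
    long value;

    memcpy(&value, p, sizeof value);
    return value;
}

/* Hypothetical helper: add up bytes as values 0..255, whether or
 * not plain char is signed on this platform. */
static unsigned long sum_bytes(const char *s, size_t len)
{
    /* All declarations come before the first statement (C89). */
    const unsigned char *u = (const unsigned char *)s;
    unsigned long total = 0;
    size_t i;

    for (i = 0; i < len; i++) {   /* no "for (int i ..." in C89 */
        total += u[i];
    }
    return total;
}
```
+
+Inside the Perl source itself the types and macros described above
+(IV, UV, U8, PTR2IV() and friends) should be preferred over plain
+long and unsigned char; the standard types are used here only to keep
+the example self-contained.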
+
+=head2 Security problems
+
+Last but not least, here are various tips for safer coding.
+
+=over 4
+
+=item *
+
+Do not use gets()
+
+Or we will publicly ridicule you. Seriously.
+
+=item *
+
+Do not use strcpy() or strcat()
+
+While some uses of these still linger in the Perl source code,
+we have inspected them for safety and are very, very ashamed of them,
+and plan to get rid of them. In places where there are strlcpy()
+and strlcat() we prefer to use them, and there is a plan to integrate
+the strlcpy/strlcat implementation of INN.
+
+=item *
+
+Do not use sprintf() or vsprintf()
+
+If you really want just plain byte strings, use my_snprintf()
+and my_vsnprintf() instead, which will try to use snprintf() and
+vsnprintf() if those safer APIs are available. If you want something
+fancier than a plain byte string, use SVs and Perl_sv_catpvf().
+
+=back
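+
+As an illustration of the bounded-copy idea behind strlcpy(), here is
+a minimal C89 sketch. demo_bounded_copy() is a hypothetical helper
+written for this example; it is neither the implementation Perl uses
+nor the INN one.
+
```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper with strlcpy()-like semantics: copy at most
 * size-1 bytes, always NUL-terminate when size > 0, and return
 * strlen(src) so that the caller can detect truncation. */
static size_t demo_bounded_copy(char *dst, const char *src, size_t size)
{
    size_t srclen = strlen(src);

    if (size > 0) {
        /* Never write more than size-1 bytes plus the terminator. */
        size_t n = (srclen < size - 1) ? srclen : size - 1;

        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return srclen;
}
```
+
+A return value greater than or equal to the size passed in tells the
+caller that the string was truncated.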
+
=head1 EXTERNAL TOOLS FOR DEBUGGING PERL
Sometimes it helps to use external tools while debugging and
=back
-=head2 CONCLUSION
+=head1 CONCLUSION
-We've had a brief look around the Perl source, an overview of the stages
-F<perl> goes through when it's running your code, and how to use a
-debugger to poke at the Perl guts. We took a very simple problem and
-demonstrated how to solve it fully - with documentation, regression
-tests, and finally a patch for submission to p5p. Finally, we talked
-about how to use external tools to debug and test Perl.
+We've had a brief look around the Perl source, how to maintain quality
+of the source code, an overview of the stages F<perl> goes through
+when it's running your code, how to use debuggers to poke at the Perl
+guts, and how to analyse the execution of Perl. We took a very
+simple problem and demonstrated how to solve it fully, with
+documentation, regression tests, and a patch for submission to
+p5p. Finally, we talked about how to use external tools to debug and
+test Perl.
I'd now suggest you read over those references again, and then, as soon
as possible, get your hands dirty. The best way to learn is by doing,