Review assertions. Review syntax to combine assertions. Assertions could take
advantage of the lexical pragmas work. L</What hooks would assertions need?>
+=item *
+
+C<encoding::warnings> should be turned into a lexical pragma.
+C<encoding> should, too (probably).
+
=back
=head2 Needed for a 5.9.5 release
various times worked to cut it down. There is probably still fat to cut out -
for example POSIX passes Exporter some very memory hungry data structures.
+=head2 embed.pl/makedef.pl
+
+There is a script F<embed.pl> that generates several header files to prefix
+all of Perl's symbols in a consistent way, to provide some semblance of
+namespace support in C<C>. Functions are declared in F<embed.fnc>, variables
+in F<intrpvar.h> and F<thrdvar.h>. Quite a few of the functions and variables
+are conditionally declared there, using C<#ifdef>. However, F<embed.pl>
+doesn't understand the C macros, so the rules about which symbols are present
+when are duplicated in F<makedef.pl>. Writing things twice is bad, m'kay.
+It would be good to teach F<embed.pl> to understand the conditional
+compilation, and hence remove the duplication, and the mistakes it has caused.
Or if you prefer, tasks that you would learn from, and broaden your skills
base...
-=head2 Relocatable perl
-
-The C level patches needed to create a relocatable perl binary are done, as
-is the work on F<Config.pm>. All that's left to do is the C<Configure> tweaking
-to let people specify how they want to do the install.
-
=head2 make HTML install work
There is an C<installhtml> target in the Makefile. It's marked as
a binary distribution better describes the installed machine, when the
installed machine differs from the build machine in some significant way.
-=head2 make parallel builds work
-
-Currently parallel builds (such as C<make -j3>) don't work reliably. We believe
-that this is due to incomplete dependency specification in the F<Makefile>.
-It would be good if someone were able to track down the causes of these
-problems, so that parallel builds worked properly.
-
=head2 linker specification files
Some platforms mandate that you provide a list of a shared library's external
could have been removed, but now it has to remain in the 5.8.x releases to
keep the structure the same size, to retain binary compatibility.
-=head2 am I hot or not?
+It's probably worth checking whether all of these variables need to be the
+types they are. For example
+
+ PERLVAR(Ierror_count, I32) /* how many errors so far, max 10 */
+
+might work as well if stored in a signed (or unsigned) 8 bit value, if the
+comment is accurate. C<PL_multi_open> and C<PL_multi_close> can probably
+become C<char>s. Finding variables to downsize coupled with rearrangement
+could shrink the interpreter structure; a size saving which is multiplied by
+the number of threads running.
+
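The effect of downsizing and rearranging can be sketched in plain C. This is only an illustration: the two structs below are hypothetical stand-ins, not the real interpreter structure, and C<bufptr>/C<bufend> are invented pointer fields used to force alignment padding. The exact sizes depend on the platform's alignment rules, so the sketch only compares them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical "before" layout: 32-bit counters interleaved with pointers,
 * so most compilers insert padding after each counter on 64-bit platforms. */
struct interp_wide {
    int32_t error_count;   /* how many errors so far, max 10 */
    char   *bufptr;        /* invented field, for illustration only */
    int32_t multi_open;
    char   *bufend;        /* invented field, for illustration only */
    int32_t multi_close;
};

/* Hypothetical "after" layout: pointers first, then the downsized fields
 * packed together so they share a single word. */
struct interp_narrow {
    char   *bufptr;
    char   *bufend;
    uint8_t error_count;   /* fits if the "max 10" comment is accurate */
    char    multi_open;
    char    multi_close;
};

size_t wide_size(void)   { return sizeof(struct interp_wide); }
size_t narrow_size(void) { return sizeof(struct interp_narrow); }

/* Aborts if the rearranged layout is not smaller on this platform. */
void check_narrow_is_smaller(void)
{
    assert(narrow_size() < wide_size());
}
```

Because each thread carries its own copy of the interpreter structure, whatever difference C<wide_size() - narrow_size()> reports is paid once per running thread.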
+=head2 Profile Perl - am I hot or not?
-The idea of F<pp_hot.c> is that it contains the I<hot> ops, the ops that are
-most commonly used. The idea is that by grouping them, their object code will
-be adjacent in the executable, so they have a greater chance of already being
-in the CPU cache (or swapped in) due to being near another op already in use.
+The Perl source code is stable enough that it makes sense to profile it,
+identify and optimise the hotspots. It would be good to measure the
+performance of the Perl interpreter using free tools such as cachegrind,
+gprof, and dtrace, and work to reduce the bottlenecks they reveal.
+
+As part of this, the idea of F<pp_hot.c> is that it contains the I<hot> ops,
+the ops that are most commonly used. The idea is that by grouping them, their
+object code will be adjacent in the executable, so they have a greater chance
+of already being in the CPU cache (or swapped in) due to being near another op
+already in use.
Except that it's not clear if these really are the most commonly used ops. So
-anyone feeling like exercising their skill with coverage and profiling tools
-might want to determine what ops I<really> are the most commonly used. And in
-turn suggest evictions and promotions to achieve a better F<pp_hot.c>.
+as part of exercising your skills with coverage and profiling tools you might
+want to determine what ops I<really> are the most commonly used. And in turn
+suggest evictions and promotions to achieve a better F<pp_hot.c>.
+=head2 Shrink struct context
+In F<cop.h>, we have
+ struct context {
+ U32 cx_type; /* what kind of context this is */
+ union {
+ struct block cx_blk;
+ struct subst cx_subst;
+ } cx_u;
+ };
-=head1 Tasks that need a knowledge of XS
+There are fewer than 256 values for C<cx_type>, and the constituent parts
+C<struct block> and C<struct subst> both contain some C<U8> and C<U16> fields,
+so it should be possible to move them to the first word, and share space with
+a C<U8> C<cx_type>, saving 1 word.
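A rough sketch of the idea, assuming much-simplified versions of C<struct block> and C<struct subst>: every field name below is hypothetical and stands in for the real members in F<cop.h>. The "after" layout puts a C<U8> type tag at the start of each union member, where it occupies space that was previously padding next to the small fields.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* "Before": a separate 32-bit type word in front of the union. */
struct old_block { uint8_t gimme; uint16_t flags; void *oldcop; };
struct old_subst { uint8_t once;  uint16_t iters; void *targ;   };
struct old_context {
    uint32_t cx_type;
    union {
        struct old_block cx_blk;
        struct old_subst cx_subst;
    } cx_u;
};

/* "After": each member starts with an 8-bit type tag that shares the
 * first word with the other small fields. */
struct new_block { uint8_t type; uint8_t gimme; uint16_t flags; void *oldcop; };
struct new_subst { uint8_t type; uint8_t once;  uint16_t iters; void *targ;   };
union new_context {
    uint8_t          cx_type;   /* overlays the tag of either member */
    struct new_block cx_blk;
    struct new_subst cx_subst;
};

size_t old_context_size(void) { return sizeof(struct old_context); }
size_t new_context_size(void) { return sizeof(union new_context); }

/* Store a tag through one member and read it back through cx_type. */
uint8_t tag_roundtrip(uint8_t tag)
{
    union new_context cx;
    cx.cx_blk.type = tag;
    assert(cx.cx_type == tag);
    return cx.cx_type;
}
```

On common 32-bit and 64-bit ABIs the difference between the two sizes is one pointer-sized word, but that depends on the compiler's padding rules, so it is worth measuring rather than assuming.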
-These tasks would need C knowledge, and roughly the level of knowledge of
-the perl API that comes from writing modules that use XS to interface to
-C.
+=head2 Allocate OPs from arenas
-=head2 IPv6
+Currently all new OP structures are individually malloc()ed and free()d.
+All C<malloc> implementations have space overheads, and are not as fast as
+custom allocators, so it would use both less memory and less CPU to allocate
+the various OP structures from arenas. The SV arena code can probably be
+re-used for this.
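For readers unfamiliar with the technique, here is a minimal sketch of a fixed-size arena allocator with a free list. It is not the SV arena code and makes no claim about how perl would implement this; C<demo_op>, C<new_op>, C<free_op>, C<free_all_arenas> and C<ARENA_SLOTS> are all hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define ARENA_SLOTS 64          /* hypothetical number of OPs per arena */

/* Stand-in for a real OP structure. */
typedef struct demo_op { int op_type; void *op_next; } demo_op;

/* A slot is either a live OP or a link in the free list. */
typedef union op_slot { union op_slot *next_free; demo_op op; } op_slot;

typedef struct op_arena {
    struct op_arena *next;
    op_slot          slots[ARENA_SLOTS];
} op_arena;

static op_arena *arena_list = NULL;   /* every arena allocated so far */
static op_slot  *free_list  = NULL;   /* slots available for reuse */

/* One malloc() call yields ARENA_SLOTS slots, all pushed on the free list. */
static int more_ops(void)
{
    size_t i;
    op_arena *a = malloc(sizeof *a);
    if (!a)
        return 0;
    a->next = arena_list;
    arena_list = a;
    for (i = 0; i < ARENA_SLOTS; i++) {
        a->slots[i].next_free = free_list;
        free_list = &a->slots[i];
    }
    return 1;
}

/* Pop a slot from the free list, growing the arena set when it is empty. */
demo_op *new_op(int type)
{
    op_slot *s;
    if (!free_list && !more_ops())
        return NULL;
    s = free_list;
    free_list = s->next_free;
    s->op.op_type = type;
    s->op.op_next = NULL;
    return &s->op;
}

/* Return a slot to the free list; nothing goes back to malloc here. */
void free_op(demo_op *o)
{
    op_slot *s = (op_slot *)o;
    assert(s != NULL);
    s->next_free = free_list;
    free_list = s;
}

/* Release every arena, for example at interpreter shutdown. */
void free_all_arenas(void)
{
    while (arena_list) {
        op_arena *next = arena_list->next;
        free(arena_list);
        arena_list = next;
    }
    free_list = NULL;
}
```

The per-allocation bookkeeping that C<malloc> needs is paid once per arena instead of once per OP, and allocation and release reduce to a couple of pointer assignments. The cost is that memory only returns to the system when whole arenas are freed.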
-Clean this up. Check everything in core works
-=head2 shrink C<GV>s, C<CV>s
-By removing unused elements and careful re-ordering, the structures for C<AV>s
-and C<HV>s have recently been shrunk considerably. It's probable that the same
-approach would find savings in C<GV>s and C<CV>s, if not all the other
-larger-than-C<PVMG> types.
-=head2 UTF8 caching code
+=head1 Tasks that need a knowledge of XS
+
+These tasks would need C knowledge, and roughly the level of knowledge of
+the perl API that comes from writing modules that use XS to interface to
+C.
+
+=head2 shrink C<PVBM>s
-The string position/offset cache is not optional. It should be.
+By removing unused elements and careful re-ordering, the structures for C<AV>s,
+C<HV>s, C<CV>s and C<GV>s have recently been shrunk considerably. C<PVIO>s
+probably aren't worth it, as typical programs don't use more than 8, and
+(at least) C<Filter::Util::Call> uses C<SvPVX>/C<SvCUR>/C<SvLEN> on a C<PVIO>,
+so it would mean code changes to modules on CPAN. C<PVBM>s might have some
+savings to win.
=head2 Implicit Latin 1 => Unicode translation
debugger on a running Perl program, although I'm not sure how it would be
done." ssh and screen do this with named pipes in /tmp. Maybe we can too.
-=head2 Constant folding
-
-The peephole optimiser should trap errors during constant folding, and give
-up on the folding, rather than bailing out at compile time. It is quite
-possible that the unfoldable constant is in unreachable code, eg something
-akin to C<$a = 0/0 if 0;>
-
=head2 LVALUE functions for lists
The old perltodo notes that lvalue functions don't work for list or hash
C<my $foo if 0;> is deprecated, and should be replaced with
C<state $x = "initial value\n";> the syntax from Perl 6.
-
-=head2 @INC source filter to Filter::Simple
-
-The second return value from a sub in @INC can be a source filter. This isn't
-documented. It should be changed to use Filter::Simple, tested and documented.
+Rafael has sent a first cut patch to perl5-porters.
=head2 regexp optimiser optional
the full assertion support from a CPAN module, so that we aren't constraining
the imagination of future CPAN authors.
+=head2 Properly Unicode safe tokeniser and pads.
+
+The tokeniser isn't actually very UTF-8 clean. C<use utf8;> is a hack -
+variable names are stored in stashes as raw bytes, without the UTF-8 flag
+set. The pad API only takes a C<char *> pointer, so that's all bytes too. The
+tokeniser ignores the UTF-8-ness of C<PL_rsfp>, or any SVs returned from
+source filters. All this could be fixed.
+