perl5.git.perl.org Git - perl5.git/blame_incremental

Commit	Line	Data
	1	=head1 NAME
	2
	3	perltodo - Perl TO-DO List
	4
	5	=head1 DESCRIPTION
	6
	7	This is a list of wishes for Perl. The most up to date version of this file
	8	is at http://perl5.git.perl.org/perl.git/blob_plain/HEAD:/pod/perltodo.pod
	9
	10	The tasks we think are smaller or easier are listed first. Anyone is welcome
	11	to work on any of these, but it's a good idea to first contact
	12	I<perl5-porters@perl.org> to avoid duplication of effort, and to learn from
	13	any previous attempts. By all means contact a pumpking privately first if you
	14	prefer.
	15
	16	Whilst patches to make the list shorter are most welcome, ideas to add to
	17	the list are also encouraged. Check the perl5-porters archives for past
	18	ideas, and any discussion about them. One set of archives may be found at:
	19
	20	http://www.xray.mpe.mpg.de/mailing-lists/perl5-porters/
	21
	22	What can we offer you in return? Fame, fortune, and everlasting glory? Maybe
	23	not, but if your patch is incorporated, then we'll add your name to the
	24	F<AUTHORS> file, which ships in the official distribution. How many other
	25	programming languages offer you 1 line of immortality?
	26
	27	=head1 Tasks that only need Perl knowledge
	28
	29	=head2 Migrate t/ from custom TAP generation
	30
	31	Many tests below F<t/> still generate TAP by "hand", rather than using library
	32	functions. As explained in L<perlhack/Writing a test>, tests in F<t/> are
	33	written in a particular way to test that more complex constructions actually
	34	work before using them routinely. Hence they don't use C<Test::More>, but
	35	instead there is an intentionally simpler library, F<t/test.pl>. However,
	36	quite a few tests in F<t/> have not been refactored to use it. Refactoring
	37	any of these tests, one at a time, is a useful thing TODO.
	38
	39	The subdirectories F<base>, F<cmd> and F<comp>, that contain the most
	40	basic tests, should be excluded from this task.
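
A minimal sketch of the difference between the two styles. The C<plan>, C<ok> and C<is> subs below are small stand-ins written for this example so that it runs on its own; they only imitate the calling style of a test library such as F<t/test.pl> and are not its implementation.

```perl
use strict;
use warnings;

# Hand-rolled style: the test builds every TAP line itself and
# has to keep its own count of test numbers.
sub hand_rolled {
    my @tap = ("1..2");
    my $n   = 0;
    push @tap, (1 + 1 == 2         ? "ok " : "not ok ") . ++$n;
    push @tap, (lc("ABC") eq "abc" ? "ok " : "not ok ") . ++$n;
    return @tap;
}

# Stand-ins for this example only (hypothetical, not the code in t/test.pl).
my ($count, @out);

sub plan {
    my %arg = @_;
    $count = 0;
    @out   = ("1..$arg{tests}");
    return;
}

sub ok {
    my ($pass, $name) = @_;
    push @out, ($pass ? "ok " : "not ok ") . ++$count . " - $name";
    return $pass;
}

sub is {
    my ($got, $expected, $name) = @_;
    return ok($got eq $expected, $name);
}

# Library style: the test states what it expects and the helpers
# take care of numbering and formatting.
sub library_style {
    plan(tests => 2);
    ok(1 + 1 == 2, "addition works");
    is(lc("ABC"), "abc", "lc lowercases");
    return @out;
}

print "$_\n" for hand_rolled(), library_style();
```

In the second form the test body no longer tracks test numbers, which is most of what a refactoring of this kind removes.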
	41
	42	=head2 Test that regen.pl was run
	43
	44	There are various generated files shipped with the perl distribution, for
	45	things like header files generated from data. The generation scripts are
	46	written in perl, and all can be run by F<regen.pl>. However, because they're
	47	written in perl, we can't run them before we've built perl. We can't run them
	48	as part of the F<Makefile>, because changing files underneath F<make> confuses
	49	it completely, and we don't want to run them automatically anyway, as they
	50	change files shipped by the distribution, something we seek not to do.
	51
	52	If someone changes the data, but forgets to re-run F<regen.pl>, then the
	53	generated files are out of sync. It would be good to have a test in
	54	F<t/porting> that checks that the generated files are in sync, and fails
	55	otherwise, to alert someone before they make a poor commit. I suspect that this
	56	would require adapting the scripts run from F<regen.pl> to have dry-run
	57	options, and invoking them with these, or by refactoring them into a library
	58	that does the generation, which can be called by the scripts, and by the test.
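
One way the "library that does the generation" variant could look, as a rough sketch only. The generator C<generate_header>, its output format and the feature names are invented for this example; the real scripts run from F<regen.pl> are organised differently.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical generator, refactored so that it returns the text
# instead of writing it to disk.
sub generate_header {
    my (@features) = @_;
    my $text = "/* generated file, do not edit */\n";
    $text .= "#define HAS_" . uc($_) . " 1\n" for @features;
    return $text;
}

# True if the file at $path holds exactly what the generator would write now.
sub in_sync {
    my ($path, $expected) = @_;
    open my $fh, '<', $path or return 0;
    local $/;                      # slurp the whole file
    my $actual = <$fh>;
    return defined $actual && $actual eq $expected;
}

# Demonstration: write the "shipped" file, then change the data
# without regenerating.
my ($out, $path) = tempfile(UNLINK => 1);
print {$out} generate_header(qw(fork pipe));
close $out or die "close: $!";

my $same    = in_sync($path, generate_header(qw(fork pipe)));
my $changed = in_sync($path, generate_header(qw(fork pipe socket)));
print "unchanged data: ", ($same    ? "in sync" : "out of sync"), "\n";
print "changed data:   ", ($changed ? "in sync" : "out of sync"), "\n";
```

A porting test built this way never touches the shipped files; it only reads them and compares.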
	59
	60	=head2 Automate perldelta generation
	61
	62	The perldelta file accompanying each release summarises the major changes.
	63	It's mostly manually generated currently, but some of that could be
	64	automated with a bit of perl, specifically the generation of
	65
	66	=over
	67
	68	=item Modules and Pragmata
	69
	70	=item New Documentation
	71
	72	=item New Tests
	73
	74	=back
	75
	76	See F<Porting/how_to_write_a_perldelta.pod> for details.
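
As an illustration of the kind of automation meant for the "Modules and Pragmata" part, here is a small sketch that compares two module-to-version maps. The module names, version numbers and the wording of the generated sentences are all made up for this example; where the two maps would really come from is left open.

```perl
use strict;
use warnings;

# Compare two module => version maps and return one sentence per
# module that was added or whose version changed.
sub module_deltas {
    my ($old, $new) = @_;
    my @items;
    for my $module (sort keys %$new) {
        my $now = $new->{$module};
        if (!exists $old->{$module}) {
            push @items, "C<$module> version $now has been added.";
        }
        elsif ($old->{$module} ne $now) {
            push @items,
                "C<$module> has been upgraded from version $old->{$module} to $now.";
        }
    }
    return @items;
}

# Made-up data, for illustration only.
my %previous = ('Some::Module' => '1.00', 'Other::Module' => '0.50');
my %current  = (
    'Some::Module'  => '1.02',
    'Other::Module' => '0.50',
    'New::Module'   => '0.01',
);

print "$_\n" for module_deltas(\%previous, \%current);
```

Unchanged modules produce no output, so the result could be pasted into a draft and edited by hand.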
	77
	78	=head2 Remove duplication of test setup.
	79
	80	Schwern notes that there's duplication of code - lots and lots of tests have
	81	some variation on the big block of C<$Is_Foo> checks. We can safely put this
	82	into a file, change it to build an C<%Is> hash and require it. Maybe just put
	83	it into F<test.pl>. Throw in the handy tainting subroutines.
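
A possible shape for such a shared file, as a sketch. The helper name C<build_is> and the choice of keys are assumptions made for this example; only the C<$^O> values themselves are standard.

```perl
use strict;
use warnings;

# Hypothetical helper: turn an OS name, as found in $^O, into a hash
# of flags. The set of keys is illustrative, not the list the core
# tests actually check.
sub build_is {
    my ($osname) = @_;
    my %os_for = (
        Win32  => 'MSWin32',
        VMS    => 'VMS',
        Cygwin => 'cygwin',
        Darwin => 'darwin',
    );
    return map { $_ => ($osname eq $os_for{$_} ? 1 : 0) } keys %os_for;
}

our %Is = build_is($^O);

# Instead of separate $Is_Win32, $Is_VMS, ... variables in every test:
my @detected = grep { $Is{$_} } sort keys %Is;
print "Detected: ", (join(", ", @detected) || "none of the listed systems"), "\n";
```

A test would then C<require> the shared file once and consult C<$Is{Win32}> and friends, rather than repeating the block of checks.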
	84
	85	=head2 POD -E<gt> HTML conversion in the core still sucks
	86
	87	Which is crazy given just how simple POD purports to be, and how simple HTML
	88	can be. It's not actually I<as> simple as it sounds, particularly with the
	89	flexibility POD allows for C<=item>, but it would be good to improve the
	90	visual appeal of the HTML generated, and to avoid it having any validation
	91	errors. See also L</make HTML install work>, as the layout of the installation tree
	92	is needed to improve the cross-linking.
	93
	94	The addition of C<Pod::Simple> and its related modules may make this task
	95	easier to complete.
	96
	97	=head2 Make ExtUtils::ParseXS use strict;
	98
	99	F<lib/ExtUtils/ParseXS.pm> contains this line:
	100
	101	    # use strict;  # One of these days...
	102
	103	Simply uncomment it, and fix all the resulting issues :-)
	104
	105	The more practical approach, to break the task down into manageable chunks, is
	106	to work your way through the code from bottom to top, if necessary adding
	107	extra C<{ ... }> blocks, and turning on strict within them.
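
The block-at-a-time approach relies on C<use strict> being lexically scoped. A small sketch of that behaviour follows; the sub and variable names are invented and have nothing to do with the real ParseXS code.

```perl
use warnings;
no strict;    # deliberately off at file level, as in an unconverted module

our $setting = "on";

# Unconverted code: strict is off, so a symbolic reference works.
sub lookup_legacy {
    my ($name) = @_;
    my $ok = eval { my $value = $$name; 1 };
    return $ok ? "allowed" : "refused";
}

# Converted chunk: strict applies only inside these braces.
{
    use strict;

    sub lookup_strict {
        my ($name) = @_;
        my $ok = eval { my $value = $$name; 1 };   # dies under strict refs
        return $ok ? "allowed" : "refused";
    }
}

print "legacy: ", lookup_legacy("setting"), "\n";
print "strict: ", lookup_strict("setting"), "\n";
```

The two subs have identical bodies; only the one inside the block is held to strict rules, which is what lets the conversion proceed in small steps.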
	108
	109	=head2 Make Schwern poorer
	110
	111	We should have tests for everything. When all the core's modules are tested,
	112	Schwern has promised to donate $500 to TPF. We may need volunteers to
	113	hold him upside down and shake vigorously in order to actually extract the
	114	cash.
	115
	116	=head2 Improve the coverage of the core tests
	117
	118	Use Devel::Cover to ascertain the core modules' test coverage, then add
	119	tests that are currently missing.
	120
	121	=head2 test B
	122
	123	A full test suite for the B module would be nice.
	124
	125	=head2 A decent benchmark
	126
	127	C<perlbench> seems impervious to any recent changes made to the perl core. It
	128	would be useful to have a reasonable general benchmarking suite that roughly
	129	represented what current perl programs do, and measurably reported whether
	130	tweaks to the core improve, degrade or don't really affect performance, to
	131	guide people attempting to optimise the guts of perl. Gisle would welcome
	132	new tests for perlbench.
	133
	134	=head2 fix tainting bugs
	135
	136	Fix the bugs revealed by running the test suite with the C<-t> switch (via
	137	C<make test.taintwarn>).
	138
	139	=head2 Dual life everything
	140
	141	As part of the "dists" plan, anything that doesn't belong in the smallest perl
	142	distribution needs to be dual lifed. Anything else can be too. Figure out what
	143	changes would be needed to package that module and its tests up for CPAN, and
	144	do so. Test it with older perl releases, and fix the problems you find.
	145
	146	To make a minimal perl distribution, it's useful to look at
	147	F<t/lib/commonsense.t>.
	148
	149	=head2 Move dual-life pod/*.PL into ext
	150
	151	Nearly all the dual-life modules have been moved to F<ext>. However, we
	152	still need to move F<pod/*.PL> into their respective directories
	153	in F<ext/>. They're referenced by (at least) C<plextract> in F<Makefile.SH>
	154	and C<utils> in F<win32/Makefile> and F<win32/makefile.mk>, and listed
	155	explicitly in F<win32/pod.mak>, F<vms/descrip_mms.template> and F<utils.lst>.
	156
	157	=head2 POSIX memory footprint
	158
	159	Ilya observed that C<use POSIX;> eats memory like there's no tomorrow, and at
	160	various times worked to cut it down. There is probably still fat to cut out -
	161	for example POSIX passes Exporter some very memory hungry data structures.
	162
	163	=head2 embed.pl/makedef.pl
	164
	165	There is a script F<embed.pl> that generates several header files to prefix
	166	all of Perl's symbols in a consistent way, to provide some semblance of
	167	namespace support in C<C>. Functions are declared in F<embed.fnc>, variables
	168	in F<interpvar.h>. Quite a few of the functions and variables
	169	are conditionally declared there, using C<#ifdef>. However, F<embed.pl>
	170	doesn't understand the C macros, so the rules about which symbols are present
	171	when are duplicated in F<makedef.pl>. Writing things twice is bad, m'kay.
	172	It would be good to teach F<embed.pl> to understand the conditional
	173	compilation, and hence remove the duplication, and the mistakes it has caused.
	174
	175	=head2 use strict; and AutoLoad
	176
	177	Currently if you write
	178
	179	    package Whack;
	180	    use AutoLoader 'AUTOLOAD';
	181	    use strict;
	182	    1;
	183	    __END__
	184	    sub bloop {
	185	        print join (' ', No, strict, here), "!\n";
	186	    }
	187
	188	then C<use strict;> isn't in force within the autoloaded subroutines. It would
	189	be more consistent (and less surprising) to arrange for all lexical pragmas
	190	in force at the __END__ block to be in force within each autoloaded subroutine.
	191
	192	There's a similar problem with SelfLoader.
	193
	194	=head2 profile installman
	195
	196	The F<installman> script is slow. All it is doing is text processing, which we're
	197	told is something Perl is good at. So it would be nice to know what it is doing
	198	that is taking so much CPU, and where possible address it.
	199
	200	=head2 enable lexical enabling/disabling of individual warnings
	201
	202	Currently, warnings can only be enabled or disabled by category. There
	203	are times when it would be useful to quash a single warning, not a
	204	whole category.
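
To make the current granularity concrete, the sketch below counts warnings with and without C<no warnings 'uninitialized'>. The helper C<count_warnings> is written for this example only.

```perl
use strict;
use warnings;

# Run a code reference and return how many warnings it emitted.
sub count_warnings {
    my ($code) = @_;
    my @seen;
    local $SIG{__WARN__} = sub { push @seen, $_[0] };
    $code->();
    return scalar @seen;
}

# Two different warnings from the same category.
my $default = count_warnings(sub {
    my ($x, $y);
    my $s = "x=" . $x;    # uninitialized value in concatenation
    my $n = $y + 1;       # uninitialized value in addition
});

# Disabling the category silences both; there is no way to pick one.
my $silenced = count_warnings(sub {
    no warnings 'uninitialized';
    my ($x, $y);
    my $s = "x=" . $x;
    my $n = $y + 1;
});

print "default: $default, category disabled: $silenced\n";
```

Someone who wants to keep the addition warning while dropping the concatenation one has no lexical switch for that today.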
	205
	206	=head1 Tasks that need a little sysadmin-type knowledge
	207
	208	Or if you prefer, tasks that you would learn from, and broaden your skills
	209	base...
	210
	211	=head2 make HTML install work
	212
	213	There is an C<installhtml> target in the Makefile. It's marked as
	214	"experimental". It would be good to get this tested, make it work reliably, and
	215	remove the "experimental" tag. This would include
	216
	217	=over 4
	218
	219	=item 1
	220
	221	Checking that cross linking between various parts of the documentation works.
	222	In particular that links work between the modules (files with POD in F<lib/>)
	223	and the core documentation (files in F<pod/>)
	224
	225	=item 2
	226
	227	Work out how to split C<perlfunc> into chunks, preferably one per function
	228	group, preferably with general case code that could be used elsewhere.
	229	Challenges here are correctly identifying the groups of functions that go
	230	together, and making the right named external cross-links point to the right
	231	page. Things to be aware of are C<-X>, groups such as C<getpwnam> to
	232	C<endservent>, two or more C<=item>s giving the different parameter lists, such
	233	as
	234
	235	    =item substr EXPR,OFFSET,LENGTH,REPLACEMENT
	236	    =item substr EXPR,OFFSET,LENGTH
	237	    =item substr EXPR,OFFSET
	238
	239	and different parameter lists having different meanings (e.g. C<select>).
	240
	241	=back
	242
	243	=head2 compressed man pages
	244
	245	Be able to install them. This would probably need a configure test to see how
	246	the system does compressed man pages (same directory/different directory?
	247	same filename/different filename), as well as tweaking the F<installman> script
	248	to compress as necessary.
	249
	250	=head2 Add a code coverage target to the Makefile
	251
	252	Make it easy for anyone to run Devel::Cover on the core's tests. The steps
	253	to do this manually are roughly
	254
	255	=over 4
	256
	257	=item *
	258
	259	do a normal C<Configure>, but include Devel::Cover as a module to install
	260	(see L<INSTALL> for how to do this)
	261
	262	=item *
	263
	264	    make perl
	265
	266	=item *
	267
	268	    cd t; HARNESS_PERL_SWITCHES=-MDevel::Cover ./perl -I../lib harness
	269
	270	=item *
	271
	272	Process the resulting Devel::Cover database
	273
	274	=back
	275
	276	This just gives you the coverage of the F<.pm>s. To also get the C level
	277	coverage you need to
	278
	279	=over 4
	280
	281	=item *
	282
	283	Additionally tell C<Configure> to use the appropriate C compiler flags for
	284	C<gcov>
	285
	286	=item *
	287
	288	    make perl.gcov
	289
	290	(instead of C<make perl>)
	291
	292	=item *
	293
	294	After running the tests run C<gcov> to generate all the F<.gcov> files.
	295	(Including down in the subdirectories of F<ext/>.)
	296
	297	=item *
	298
	299	(From the top level perl directory) run C<gcov2perl> on all the C<.gcov> files
	300	to get their stats into the cover_db directory.
	301
	302	=item *
	303
	304	Then process the Devel::Cover database
	305
	306	=back
	307
	308	It would be good to add a single switch to C<Configure> to specify that you
	309	wanted to perform perl level coverage, and another to specify C level
	310	coverage, and have C<Configure> and the F<Makefile> do all the right things
	311	automatically.
	312
	313	=head2 Make Config.pm cope with differences between built and installed perl
	314
	315	Quite often vendors ship a perl binary compiled with their (pay-for)
	316	compilers. People install a free compiler, such as gcc. To work out how to
	317	build extensions, Perl interrogates C<%Config>, so in this situation
	318	C<%Config> describes compilers that aren't there, and extension building
	319	fails. This forces people into choosing between re-compiling perl themselves
	320	using the compiler they have, or only using modules that the vendor ships.
	321
	322	It would be good to find a way to teach C<Config.pm> about the installation setup,
	323	possibly involving probing at install time or later, so that the C<%Config> in
	324	a binary distribution better describes the installed machine, when the
	325	installed machine differs from the build machine in some significant way.
	326
	327	=head2 linker specification files
	328
	329	Some platforms mandate that you provide a list of a shared library's external
	330	symbols to the linker, so the core already has the infrastructure in place to
	331	do this for generating shared perl libraries. My understanding is that the
	332	GNU toolchain can accept an optional linker specification file, and restrict
	333	visibility just to symbols declared in that file. It would be good to extend
	334	F<makedef.pl> to support this format, and to provide a means within
	335	C<Configure> to enable it. This would allow Unix users to test that the
	336	export list is correct, and to build a perl that does not pollute the global
	337	namespace with private symbols, and will fail in the same way as msvc or mingw
	338	builds or when using PERL_DL_NONLAZY=1.
	339
	340	=head2 Cross-compile support
	341
	342	Currently C<Configure> understands the C<-Dusecrosscompile> option. This option
	343	arranges for building C<miniperl> for the TARGET machine, so this C<miniperl> is
	344	then assumed to be copied to the TARGET machine and used as a replacement for the
	345	full C<perl> executable.
	346
	347	This could be done a little differently. Namely, C<miniperl> should be built for
	348	HOST and then the full C<perl> with extensions should be compiled for TARGET.
	349	This, however, might require extra trickery for C<%Config>: we have one config
	350	first for HOST and then another for TARGET. Tools like MakeMaker will be
	351	mightily confused. Having two different types of executables and
	352	libraries around (HOST and TARGET) makes life interesting for Makefiles and
	353	shell (and Perl) scripts. There is C<$Config{run}>, normally empty, which
	354	can be used as an execution wrapper. Also note that in some
	355	cross-compilation/execution environments the HOST and the TARGET do
	356	not see the same filesystem(s), so C<$Config{run}> may need to do some
	357	file/directory copying back and forth.
	358
	359	=head2 roffitall
	360
	361	Make F<pod/roffitall> be updated by F<pod/buildtoc>.
	362
	363	=head2 Split "linker" from "compiler"
	364
	365	Right now, Configure probes for two commands, and sets two variables:
	366
	367	=over 4
	368
	369	=item * C<cc> (in F<cc.U>)
	370
	371	This variable holds the name of a command to execute a C compiler which
	372	can resolve multiple global references that happen to have the same
	373	name. Usual values are F<cc> and F<gcc>.
	374	Fervent ANSI compilers may be called F<c89>. AIX has F<xlc>.
	375
	376	=item * C<ld> (in F<dlsrc.U>)
	377
	378	This variable indicates the program to be used to link
	379	libraries for dynamic loading. On some systems, it is F<ld>.
	380	On ELF systems, it should be C<$cc>. Mostly, we'll try to respect
	381	the hint file setting.
	382
	383	=back
	384
	385	There is an implicit historical assumption from around Perl5.000alpha
	386	something, that C<$cc> is also the correct command for linking object files
	387	together to make an executable. This may be true on Unix, but it's not true
	388	on other platforms, and there is a maze of workarounds in other places (such
	389	as F<Makefile.SH>) to cope with this.
	390
	391	Ideally, we should create a new variable to hold the name of the executable
	392	linker program, probe for it in F<Configure>, and centralise all the special
	393	case logic there or in hints files.
	394
	395	A small bikeshed issue remains - what to call it, given that C<$ld> is already
	396	taken (arguably for the wrong thing now, but on SunOS 4.1 it is the command
	397	for creating dynamically-loadable modules) and C<$link> could be confused with
	398	the Unix command line executable of the same name, which does something
	399	completely different. Andy Dougherty makes the counter argument "In parrot, I
	400	tried to call the command used to link object files and libraries into an
	401	executable F<link>, since that's what my vaguely-remembered DOS and VMS
	402	experience suggested. I don't think any real confusion has ensued, so it's
	403	probably a reasonable name for perl5 to use."
	404
	405	"Alas, I've always worried that introducing it would make things worse,
	406	since now the module building utilities would have to look for
	407	C<$Config{link}> and institute a fall-back plan if it weren't found."
	408	Although I can see that as confusing, given that C<$Config{d_link}> is true
	409	when (hard) links are available.
	410
	411	=head2 Configure Windows using PowerShell
	412
	413	Currently, Windows uses hard-coded config files to build the
	414	F<config.h> for compiling Perl. Makefiles are also hard-coded and need to be
	415	hand edited prior to building Perl. While this makes it easy to create a perl.exe
	416	that works across multiple Windows versions, being able to accurately
	417	configure a perl.exe for a specific Windows version and VS C++ release would be
	418	a nice enhancement. With PowerShell available on Windows XP and up, this
	419	may now be possible. Step 1 might be to investigate whether this is possible
	420	and use this to clean up our current makefile situation. Step 2 would be to
	421	see if there would be a way to use our existing metaconfig units to configure a
	422	Windows Perl or whether we go in a separate direction and make it so. Of
	423	course, we all know what step 3 is.
	424
	425	=head2 decouple -g and -DDEBUGGING
	426
	427	Currently F<Configure> automatically adds C<-DDEBUGGING> to the C compiler
	428	flags if it spots C<-g> in the optimiser flags. The pre-processor directive
	429	C<DEBUGGING> enables F<perl>'s command line C<-D> options, but in the process
	430	makes F<perl> slower. It would be good to disentangle this logic, so that
	431	C-level debugging with C<-g> and Perl level debugging with C<-D> can easily
	432	be enabled independently.
	433
	434	=head1 Tasks that need a little C knowledge
	435
	436	These tasks would need a little C knowledge, but don't need any specific
	437	background or experience with XS, or how the Perl interpreter works.
	438
	439	=head2 Weed out needless PERL_UNUSED_ARG
	440
	441	The C code uses the macro C<PERL_UNUSED_ARG> to stop compilers warning about
	442	unused arguments. Often the arguments can't be removed, as there is an
	443	external constraint that determines the prototype of the function, so this
	444	approach is valid. However, there are some cases where C<PERL_UNUSED_ARG>
	445	could be removed. Specifically
	446
	447	=over 4
	448
	449	=item *
	450
	451	The prototypes of (nearly all) static functions can be changed
	452
	453	=item *
	454
	455	Unused arguments generated by short cut macros are wasteful - the short cut
	456	macro used can be changed.
	457
	458	=back
	459
	460	=head2 Modernize the order of directories in @INC
	461
	462	The way @INC is laid out by default, one cannot upgrade core (dual-life)
	463	modules without overwriting files. This causes problems for binary
	464	package builders. One possible proposal is laid out in this
	465	message:
	466	L<http://www.xray.mpe.mpg.de/mailing-lists/perl5-porters/2002-04/msg02380.html>.
	467
	468	=head2 -Duse32bit*
	469
	470	Natively 64-bit systems need neither -Duse64bitint nor -Duse64bitall.
	471	On these systems, it might be the default compilation mode, and there
	472	is currently no guarantee that passing no use64bitall option to the
	473	Configure process will build a 32bit perl. Implementing -Duse32bit*
	474	options would be nice for perl 5.12.
	475
	476	=head2 Profile Perl - am I hot or not?
	477
	478	The Perl source code is stable enough that it makes sense to profile it,
	479	identify and optimise the hotspots. It would be good to measure the
	480	performance of the Perl interpreter using free tools such as cachegrind,
	481	gprof, and dtrace, and work to reduce the bottlenecks they reveal.
	482
	483	As part of this, the idea of F<pp_hot.c> is that it contains the I<hot> ops,
	484	the ops that are most commonly used. The idea is that by grouping them, their
	485	object code will be adjacent in the executable, so they have a greater chance
	486	of already being in the CPU cache (or swapped in) due to being near another op
	487	already in use.
	488
	489	Except that it's not clear if these really are the most commonly used ops. So
	490	as part of exercising your skills with coverage and profiling tools you might
	491	want to determine what ops I<really> are the most commonly used. And in turn
	492	suggest evictions and promotions to achieve a better F<pp_hot.c>.
	493
	494	One piece of Perl code that might make a good testbed is F<installman>.
	495
	496	=head2 Allocate OPs from arenas
	497
	498	Currently all new OP structures are individually malloc()ed and free()d.
	499	All C<malloc> implementations have space overheads, and are not as fast as
	500	custom allocators, so it would both use less memory and less CPU to allocate
	501	the various OP structures from arenas. The SV arena code can probably be
	502	re-used for this.
	503
	504	Note that Configuring perl with C<-Accflags=-DPL_OP_SLAB_ALLOC> will use
	505	Perl_Slab_alloc() to pack optrees into a contiguous block, which is
	506	probably superior to the use of OP arenas, esp. from a cache locality
	507	standpoint. See L</Profile Perl - am I hot or not?>.
	508
	509	=head2 Improve win32/wince.c
	510
	511	Currently, numerous functions look virtually, if not completely,
	512	identical in both C<win32/wince.c> and C<win32/win32.c> files, which can't
	513	be good.
	514
	515	=head2 Use secure CRT functions when building with VC8 on Win32
	516
	517	Visual C++ 2005 (VC++ 8.x) deprecated a number of CRT functions on the basis
	518	that they were "unsafe" and introduced differently named secure versions of
	519	them as replacements, e.g. instead of writing
	520
	521	    FILE* f = fopen(__FILE__, "r");
	522
	523	one should now write
	524
	525	    FILE* f;
	526	    errno_t err = fopen_s(&f, __FILE__, "r");
	527
	528	Currently, the warnings about these deprecations have been disabled by adding
	529	-D_CRT_SECURE_NO_DEPRECATE to the CFLAGS. It would be nice to remove that
	530	warning suppressant and actually make use of the new secure CRT functions.
	531
	532	There is also a similar issue with POSIX CRT function names like fileno having
	533	been deprecated in favour of ISO C++ conformant names like _fileno. These
	534	warnings are also currently suppressed by adding -D_CRT_NONSTDC_NO_DEPRECATE. It
	535	might be nice to do as Microsoft suggest here too, although, unlike the secure
	536	functions issue, there is presumably little or no benefit in this case.
	537
	538	=head2 Fix POSIX::access() and chdir() on Win32
	539
	540	These functions currently take no account of DACLs and therefore do not behave
	541	correctly in situations where access is restricted by DACLs (as opposed to the
	542	read-only attribute).
	543
	544	Furthermore, POSIX::access() behaves differently for directories having the
	545	read-only attribute set depending on what CRT library is being used. For
	546	example, the _access() function in the VC6 and VC7 CRTs (wrongly) claims that
	547	such directories are not writable, whereas in fact all directories are writable
	548	unless access is denied by DACLs. (In the case of directories, the read-only
	549	attribute actually only means that the directory cannot be deleted.) This CRT
	550	bug is fixed in the VC8 and VC9 CRTs (but, of course, the directory may still
	551	not actually be writable if access is indeed denied by DACLs).
	552
	553	For the chdir() issue, see ActiveState bug #74552:
	554	http://bugs.activestate.com/show_bug.cgi?id=74552
	555
	556	Therefore, DACLs should be checked both for consistency across CRTs and for
	557	the correct answer.
	558
	559	(Note that perl's -w operator should not be modified to check DACLs. It has
	560	been written so that it reflects the state of the read-only attribute, even
	561	for directories (whatever CRT is being used), for symmetry with chmod().)
	562
	563	=head2 strcat(), strcpy(), strncat(), strncpy(), sprintf(), vsprintf()
	564
	565	Maybe create a utility that checks after each libperl.a creation that
	566	none of the above (nor sprintf(), vsprintf(), or SHUDDER gets())
	567	ever creep back to libperl.a.
	568
	569	    nm libperl.a | ./miniperl -alne '$o = $F[0] if /:$/; print "$o $F[1]" if $F[0] eq "U" && $F[1] =~ /^(?:strn?c(?:at|py)|v?sprintf|gets)$/'
	570
	571	Note, of course, that this will only tell whether B<your> platform
	572	is using those naughty interfaces.
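
For a utility rather than a one-liner, the same check could be written out as a sub. This is only a sketch: C<unsafe_refs> is a name chosen here, and the sample C<nm> lines are made up to show the input shape the check expects (an object name ending in a colon, followed by symbol lines where C<U> marks an undefined reference).

```perl
use strict;
use warnings;

# Given lines of nm output for an archive, return "object symbol"
# pairs for every undefined reference to one of the unsafe functions.
sub unsafe_refs {
    my (@nm_lines) = @_;
    my $object = '';
    my @hits;
    for my $line (@nm_lines) {
        my @f = split ' ', $line;
        next unless @f;
        if ($f[0] =~ /:$/) {       # "util.o:" starts a new object file
            $object = $f[0];
            next;
        }
        push @hits, "$object $f[1]"
            if @f >= 2
            && $f[0] eq 'U'
            && $f[1] =~ /^(?:strn?c(?:at|py)|v?sprintf|gets)$/;
    }
    return @hits;
}

# Made-up sample input, for illustration only.
my @sample = (
    "av.o:",
    "0000000000000000 T Perl_av_clear",
    "                 U memcpy",
    "util.o:",
    "                 U strcpy",
    "                 U vsprintf",
    "                 U strlen",
);

print "$_\n" for unsafe_refs(@sample);
```

Such a sub could be fed the real output with C<unsafe_refs(`nm libperl.a`)> and made to fail the build when the returned list is not empty.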
	573
	574	=head2 -D_FORTIFY_SOURCE=2, -fstack-protector
	575
	576	Recent glibcs support C<-D_FORTIFY_SOURCE=2> and recent gcc
	577	(4.1 onwards?) supports C<-fstack-protector>, both of which give
	578	protection against various kinds of buffer overflow problems.
	579	These should probably be used for compiling Perl whenever available.
	580	Configure and/or hints files should be adjusted to probe for the
	581	availability of these features and enable them as appropriate.
	582
	583	=head2 Arenas for GPs? For MAGIC?
	584
	585	C<struct gp> and C<struct magic> are both currently allocated by C<malloc>.
	586	It might be a speed or memory saving to change to using arenas. Or it might
	587	not. It would need some suitable benchmarking first. In particular, C<GP>s
	588	can probably be changed with minimal compatibility impact (probably nothing
	589	outside of the core, or even outside of F<gv.c> allocates them), but they
	590	probably aren't allocated/deallocated often enough for a speed saving. Whereas
	591	C<MAGIC> is allocated/deallocated more often, but in turn, is also something
	592	more externally visible, so changing the rules here may bite external code.
	593
	594	=head2 Shared arenas
	595
	596	Several SV body structs are now the same size, notably PVMG and PVGV, PVAV and
	597	PVHV, and PVCV and PVFM. It should be possible to allocate and return same
	598	sized bodies from the same actual arena, rather than maintaining one arena for
	599	each. This could save 4-6K per thread, of memory no longer tied up in the
	600	not-yet-allocated part of an arena.
	601
	602
	603	=head1 Tasks that need a knowledge of XS
	604
	605	These tasks would need C knowledge, and roughly the level of knowledge of
	606	the perl API that comes from writing modules that use XS to interface to
	607	C.
	608
	609	=head2 Write an XS cookbook
	610
	611	Create pod/perlxscookbook.pod with short, task-focused 'recipes' in XS that
	612	demonstrate common tasks and good practices. (Some of these might be
	613	extracted from perlguts.) The target audience should be XS novices, who need
	614	more examples than perlguts but something less overwhelming than perlapi.
	615	Recipes should provide "one pretty good way to do it" instead of TIMTOWTDI.
	616
	617	Rather than focusing on interfacing Perl to C libraries, such a cookbook
	618	should probably focus on how to optimize Perl routines by re-writing them
	619	in XS. This will likely be more motivating to those who mostly work in
	620	Perl but are looking to take the next step into XS.
	621
	622	Deconstructing and explaining some simpler XS modules could be one way to
	623	bootstrap a cookbook. (List::Util? Class::XSAccessor? Tree::Ternary_XS?)
	624	Another option could be deconstructing the implementation of some simpler
	625	functions in op.c.
	626
	627	=head2 Allow XSUBs to inline themselves as OPs
	628
	629	For a simple XSUB, often the subroutine dispatch takes more time than the
	630	XSUB itself. The tokeniser already has the ability to inline constant
	631	subroutines - it would be good to provide a way to inline other subroutines.
	632
	633	Specifically, the simplest approach looks to be to allow an XSUB to provide an
	634	alternative implementation of itself as a custom OP. A new flag bit in
	635	C<CvFLAGS()> would signal to the peephole optimiser to take an optree
	636	such as this:
	637
	638	    b  <@> leave[1 ref] vKP/REFC ->(end)
	639	    1     <0> enter ->2
	640	    2     <;> nextstate(main 1 -e:1) v:{ ->3
	641	    a     <2> sassign vKS/2 ->b
	642	    8        <1> entersub[t2] sKS/TARG,1 ->9
	643	    -           <1> ex-list sK ->8
	644	    3              <0> pushmark s ->4
	645	    4              <$> const(IV 1) sM ->5
	646	    6              <1> rv2av[t1] lKM/1 ->7
	647	    5                 <$> gv(*a) s ->6
	648	    -              <1> ex-rv2cv sK ->-
	649	    7                 <$> gv(*x) s/EARLYCV ->8
	650	    -        <1> ex-rv2sv sKRM*/1 ->a
	651	    9           <$> gvsv(*b) s ->a
	652
	653	perform the symbol table lookup of C<rv2cv> and C<gv(*x)>, locate the
	654	pointer to the custom OP that provides the direct implementation, and
	655	re-write the optree something like:
	656
	657	    b  <@> leave[1 ref] vKP/REFC ->(end)
	658	    1     <0> enter ->2
	659	    2     <;> nextstate(main 1 -e:1) v:{ ->3
	660	    a     <2> sassign vKS/2 ->b
	661	    7        <1> custom_x -> 8
	662	    -           <1> ex-list sK ->7
	663	    3              <0> pushmark s ->4
	664	    4              <$> const(IV 1) sM ->5
	665	    6              <1> rv2av[t1] lKM/1 ->7
	666	    5                 <$> gv(*a) s ->6
	667	    -              <1> ex-rv2cv sK ->-
	668	    -                 <$> ex-gv(*x) s/EARLYCV ->7
	669	    -        <1> ex-rv2sv sKRM*/1 ->a
	670	    8           <$> gvsv(*b) s ->a
	671
	672	I<i.e.> the C<gv(*x)> OP has been nulled and spliced out of the execution
	673	path, and the C<entersub> OP has been replaced by the custom op.
	674
	675	This approach should provide a measurable speed up to simple XSUBs inside
	676	tight loops. Initially one would have to write the OP alternative
	677	implementation by hand, but it's likely that this should be reasonably
	678	straightforward for the type of XSUB that would benefit the most. Longer
	679	term, once the run-time implementation is proven, it should be possible to
	680	progressively update ExtUtils::ParseXS to generate OP implementations for
	681	some XSUBs.
	682
	683	=head2 Remove the use of SVs as temporaries in dump.c
	684
	685	F<dump.c> contains debugging routines to dump out the contents of perl data
	686	structures, such as C<SV>s, C<AV>s and C<HV>s. Currently, the dumping code
	687	B<uses> C<SV>s for its temporary buffers, which was a logical initial
	688	implementation choice, as they provide ready made memory handling.
	689
	690	However, they also lead to a lot of confusion when it happens that what you're
	691	trying to debug is seen by the code in F<dump.c>, correctly or incorrectly, as
	692	a temporary scalar it can use for a temporary buffer. It's also not possible
	693	to dump scalars before the interpreter is properly set up, such as during
	694	ithreads cloning. It would be good to progressively replace the use of scalars
	695	as string accumulation buffers with something much simpler, directly allocated
	696	by C<malloc>. The F<dump.c> code is (or should be) only producing 7 bit
	697	US-ASCII, so output character sets are not an issue.
	698
	699	Producing and proving an internal simple buffer allocation would make it easier
	700	to re-write the internals of the PerlIO subsystem to avoid using C<SV>s for
	701	B<its> buffers, use of which can cause problems similar to those of F<dump.c>,
	702	at similar times.
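
As a rough illustration of the "something much simpler" suggested above, here is a minimal sketch of a C<malloc>-backed string accumulator. The names C<dump_buf>, C<dump_buf_cat> and C<dump_buf_free> are hypothetical and do not exist in F<dump.c>; the doubling growth policy is likewise only an assumption.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical accumulation buffer: plain heap memory, no SVs involved. */
typedef struct {
    char  *buf;   /* NUL-terminated contents, or NULL when empty */
    size_t len;   /* bytes used, excluding the trailing NUL */
    size_t cap;   /* bytes allocated */
} dump_buf;

/* Append the NUL-terminated string s. Returns 0 on success, or -1 if
 * memory could not be obtained (the buffer is then left unchanged). */
int dump_buf_cat(dump_buf *b, const char *s)
{
    size_t add;

    assert(b != NULL && s != NULL);
    add = strlen(s);
    if (b->len + add + 1 > b->cap) {
        size_t ncap = b->cap ? b->cap : 64;
        char *nbuf;
        while (ncap < b->len + add + 1)
            ncap *= 2;                    /* assumed growth policy */
        nbuf = realloc(b->buf, ncap);
        if (nbuf == NULL)
            return -1;
        b->buf = nbuf;
        b->cap = ncap;
    }
    memcpy(b->buf + b->len, s, add + 1);  /* copies the NUL as well */
    b->len += add;
    return 0;
}

/* Release the memory and return the buffer to its empty state. */
void dump_buf_free(dump_buf *b)
{
    assert(b != NULL);
    free(b->buf);
    b->buf = NULL;
    b->len = b->cap = 0;
}
```

A zero-initialised C<dump_buf> is a valid empty buffer, so no interpreter state is needed before the first append.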
	703
	704	=head2 safely supporting POSIX SA_SIGINFO
	705
	706	Some years ago Jarkko supplied patches to provide support for the POSIX
	707	SA_SIGINFO feature in Perl, passing the extra data to the Perl signal handler.
	708
	709	Unfortunately, it only works with "unsafe" signals, because under safe
	710	signals, by the time Perl gets to run the signal handler, the extra
	711	information has been lost. Moreover, it's not easy to store it somewhere,
	712	as you can't lock mutexes, or do anything else fancy, from inside a signal
	713	handler.
	714
	715	So it strikes me that we could provide safe SA_SIGINFO support as follows:
	716
	717	=over 4
	718
	719	=item 1
	720
	721	Provide global variables for two file descriptors
	722
	723	=item 2
	724
	725	When the first request is made via C<sigaction> for C<SA_SIGINFO>, create a
	726	pipe, store the reader in one, the writer in the other
	727
	728	=item 3
	729
	730	In the "safe" signal handler (C<Perl_csighandler()>/C<S_raise_signal()>), if
	731	the C<siginfo_t> pointer is non-C<NULL>, and the writer file descriptor is open,
	732
	733	=over 8
	734
	735	=item 1
	736
	737	serialise the signal number and the C<siginfo_t> (or at least the parts we
	738	care about) into a small automatic C<char> buffer
	739
	740	=item 2
	741
	742	C<write()> that (non-blocking) to the writer fd
	743
	744	=over 12
	745
	746	=item 1
	747
	748	if it writes 100%, flag the signal in a counter of "signals on the pipe" akin
	749	to the current per-signal-number counts
	750
	751	=item 2
	752
	753	if it writes 0%, assume the pipe is full. Flag the data as lost?
	754
	755	=item 3
	756
	757	if it writes partially, croak a panic, as your OS is broken.
	758
	759	=back
	760
	761	=back
	762
	763	=item 4
	764
	765	in the regular C<PERL_ASYNC_CHECK()> processing, if there are "signals on
	766	the pipe", read the data out, deserialise, build the Perl structures on
	767	the stack (code in C<Perl_sighandler()>, the "unsafe" handler), and call as
	768	usual.
	769
	770	=back
	771
	772	I think that this gets us decent C<SA_SIGINFO> support, without the current risk
	773	of running Perl code inside the signal handler context. (With all the dangers
	774	of things like C<malloc> corruption that that currently offers us.)
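
The numbered steps above can be sketched in plain C, outside the interpreter. This is only an illustration under stated assumptions: C<struct sig_record>, C<sig_pipe_install> and C<sig_pipe_drain> are hypothetical names, only three C<siginfo_t> fields are forwarded, and a short or failed write is simply counted as lost rather than treated as a panic.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <limits.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical wire format: only the siginfo_t fields we choose to forward. */
struct sig_record {
    int   signo;
    int   code;
    pid_t pid;
};

/* A write of at most PIPE_BUF bytes to a pipe is atomic, so a record this
 * small is written whole or not at all. */
static_assert(sizeof(struct sig_record) <= _POSIX_PIPE_BUF,
              "record must fit in one atomic pipe write");

static int sig_pipe_fd[2] = { -1, -1 };         /* step 1: reader, writer */
static volatile sig_atomic_t sigs_on_pipe = 0;  /* "signals on the pipe" */
static volatile sig_atomic_t sigs_lost = 0;     /* writes that failed */

/* Step 3: the handler only serialises and write()s; both the struct
 * assignment and write() are safe in signal context. */
static void sig_pipe_handler(int signo, siginfo_t *info, void *ctx)
{
    int saved_errno = errno;
    (void)ctx;
    if (info != NULL && sig_pipe_fd[1] != -1) {
        struct sig_record rec = { signo, info->si_code, info->si_pid };
        if (write(sig_pipe_fd[1], &rec, sizeof rec) == (ssize_t)sizeof rec)
            sigs_on_pipe++;
        else
            sigs_lost++;        /* pipe full (or error): data is dropped */
    }
    errno = saved_errno;
}

/* Step 2: create the non-blocking pipe on first use, then install the
 * SA_SIGINFO handler for signo. Returns 0 on success, -1 on failure. */
int sig_pipe_install(int signo)
{
    struct sigaction sa;
    if (sig_pipe_fd[0] == -1) {
        if (pipe(sig_pipe_fd) != 0)
            return -1;
        for (int i = 0; i < 2; i++) {
            int fl = fcntl(sig_pipe_fd[i], F_GETFL);
            if (fl == -1 || fcntl(sig_pipe_fd[i], F_SETFL, fl | O_NONBLOCK) == -1)
                return -1;
        }
    }
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = sig_pipe_handler;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(signo, &sa, NULL);
}

/* Step 4: called from ordinary (non-signal) context; reads up to max
 * records back out and returns how many were read. */
int sig_pipe_drain(struct sig_record *out, int max)
{
    int got = 0;
    sigset_t all, old;
    sigfillset(&all);
    while (got < max && sigs_on_pipe > 0) {
        if (read(sig_pipe_fd[0], &out[got], sizeof out[got])
                != (ssize_t)sizeof out[got])
            break;
        sigprocmask(SIG_BLOCK, &all, &old);   /* keep the decrement atomic */
        sigs_on_pipe--;
        sigprocmask(SIG_SETMASK, &old, NULL);
        got++;
    }
    return got;
}
```

In the interpreter, the equivalent of C<sig_pipe_drain> would run from C<PERL_ASYNC_CHECK()> and build the Perl-level arguments from each record before calling the handler.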
	775
	776	For more information see the thread starting with this message:
	777	http://www.xray.mpe.mpg.de/mailing-lists/perl5-porters/2008-03/msg00305.html
	778
	779	=head2 autovivification
	780
	781	Make all autovivification consistent with respect to LVALUE/RVALUE and strict/no strict.
	782
	783	This task is incremental - even a little bit of work on it will help.
	784
	785	=head2 Unicode in Filenames
	786
	787	chdir, chmod, chown, chroot, exec, glob, link, lstat, mkdir, open,
	788	opendir, qx, readdir, readlink, rename, rmdir, stat, symlink, sysopen,
	789	system, truncate, unlink, utime, -X. All these could potentially accept
	790	Unicode filenames either as input or output (and in the case of system
	791	and qx Unicode in general, as input or output to/from the shell).
	792	Whether a filesystem/operating system pair understands Unicode in
	793	filenames varies.
	794
	795	Known combinations that have some level of understanding include
	796	Microsoft NTFS, Apple HFS+ (in Mac OS 9 and X), Apple UFS (in Mac OS X),
	797	and of course Plan 9; NFS v4 is rumored to be Unicode. How to
	798	create Unicode filenames, what forms of Unicode are accepted and used
	799	(UCS-2, UTF-16, UTF-8), what (if any) is the normalization form used,
	800	and so on, varies. Finding the right level of interfacing to Perl
	801	requires some thought. Remember that an OS does not imply a particular
	802	filesystem.
	803
	804	(The Windows -C command flag "wide API support" has been at least
	805	temporarily retired in 5.8.1, and the -C has been repurposed, see
	806	L<perlrun>.)
	807
	808	Most probably the right way to do this would be this:
	809	L</"Virtualize operating system access">.
	810
	811	=head2 Unicode in %ENV
	812
	813	Currently the %ENV entries are always byte strings.
	814	See L</"Virtualize operating system access">.
	815
	816	=head2 Unicode and glob()
	817
	818	Currently glob patterns and filenames returned from File::Glob::glob()
	819	are always byte strings. See L</"Virtualize operating system access">.
	820
	821	=head2 use less 'memory'
	822
	823	Investigate trade offs to switch out perl's choices on memory usage.
	824	Particularly perl should be able to give memory back.
	825
	826	This task is incremental - even a little bit of work on it will help.
	827
	828	=head2 Re-implement C<:unique> in a way that is actually thread-safe
	829
	830	The old implementation made bad assumptions on several levels. A good 90%
	831	solution might be just to make C<:unique> work to share the string buffer
	832	of SvPVs. That way large constant strings can be shared between ithreads,
	833	such as the configuration information in F<Config>.
	834
	835	=head2 Make tainting consistent
	836
	837	Tainting would be easier to use if it didn't take documented shortcuts and
	838	allow taint to "leak" everywhere within an expression.
	839
	840	=head2 readpipe(LIST)
	841
	842	system() accepts a LIST syntax (and a PROGRAM LIST syntax) to avoid
	843	running a shell. readpipe() (the function behind qx//) could be similarly
	844	extended.
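
At the C level, the shell-free behaviour being asked for amounts to C<fork>, C<execvp> and reading from a pipe. The sketch below is not taken from the perl source; C<read_command_output> is a hypothetical helper, and it simplifies error handling (for instance, an interrupted C<read> is treated as end of output).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run argv[0] with the given argument vector, with no shell involved,
 * and copy up to outsz-1 bytes of its standard output into out.
 * Returns the number of bytes stored (out is NUL-terminated), or -1
 * if the pipe or the child process could not be created. */
ssize_t read_command_output(char *const argv[], char *out, size_t outsz)
{
    int fds[2];
    pid_t pid;
    size_t total = 0;
    ssize_t n;

    assert(argv != NULL && argv[0] != NULL && out != NULL && outsz > 0);
    if (pipe(fds) != 0)
        return -1;
    pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {                      /* child: stdout becomes the pipe */
        close(fds[0]);
        dup2(fds[1], STDOUT_FILENO);
        close(fds[1]);
        execvp(argv[0], argv);           /* arguments are passed verbatim */
        _exit(127);                      /* only reached if exec failed */
    }
    close(fds[1]);                       /* parent: read until EOF or full */
    while (total < outsz - 1 &&
           (n = read(fds[0], out + total, outsz - 1 - total)) > 0)
        total += (size_t)n;
    close(fds[0]);
    out[total] = '\0';
    waitpid(pid, NULL, 0);
    return (ssize_t)total;
}
```

Because no shell parses the arguments, a string such as C<$HOME; ls> reaches the program as one literal argument, which is the property that makes the LIST form of system() attractive.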
	845
	846	=head2 Audit the code for destruction ordering assumptions
	847
	848	Change 25773 notes
	849
	850	    /* Need to check SvMAGICAL, as during global destruction it may be that
	851	       AvARYLEN(av) has been freed before av, and hence the SvANY() pointer
	852	       is now part of the linked list of SV heads, rather than pointing to
	853	       the original body. */
	854	    /* FIXME - audit the code for other bugs like this one. */
	855
	856	adding the C<SvMAGICAL> check to
	857
	858	    if (AvARYLEN(av) && SvMAGICAL(AvARYLEN(av))) {
	859	        MAGIC *mg = mg_find (AvARYLEN(av), PERL_MAGIC_arylen);
	860
	861	Go through the core and look for similar assumptions that SVs have particular
	862	types, as all bets are off during global destruction.
	863
	864	=head2 Extend PerlIO and PerlIO::Scalar
	865
	866	PerlIO::Scalar doesn't know how to truncate(). Implementing this
	867	would require extending the PerlIO vtable.
	868
	869	Similarly the PerlIO vtable doesn't know about formats (write()), or
	870	about stat(), or chmod()/chown(), utime(), or flock().
	871
	872	(For PerlIO::Scalar it's hard to see what e.g. mode bits or ownership
	873	would mean.)
	874
	875	PerlIO doesn't do directories or symlinks, either: mkdir(), rmdir(),
	876	opendir(), closedir(), seekdir(), rewinddir(), glob(); symlink(),
	877	readlink().
	878
	879	See also L</"Virtualize operating system access">.
	880
	881	=head2 -C on the #! line
	882
	883	It should be possible to make -C work correctly if found on the #! line,
	884	given that all perl command line options are strict ASCII, and -C changes
	885	only the interpretation of non-ASCII characters, and not for the script file
	886	handle. To make it work needs some investigation of the ordering of function
	887	calls during startup, and (by implication) a bit of tweaking of that order.
	888
	889	=head2 Organize error messages
	890
	891	Perl's diagnostics (error messages, see L<perldiag>) could use
	892	reorganizing and formalizing so that each error message has its
	893	stable-for-all-eternity unique id, categorized by severity, type, and
	894	subsystem. (The error messages would be listed in a datafile outside
	895	of the Perl source code, and the source code would only refer to the
	896	messages by the id.) This clean-up and regularizing should apply
	897	for all croak() messages.
	898
	899	This would enable all sorts of things: easier translation/localization
	900	of the messages (though please do keep in mind the caveats of
	901	L<Locale::Maketext> about too straightforward approaches to
	902	translation), filtering by severity, and instead of grepping for a
	903	particular error message one could look for a stable error id. (Of
	904	course, changing the error messages by default would break all the
	905	existing software depending on some particular error message...)
	906
	907	This kind of functionality is known as I<message catalogs>. Look for
	908	inspiration for example in the catgets() system, and possibly even use it
	909	if available, but B<only> if available: not all platforms
	910	have catgets().
	911
	912	For the really pure at heart, consider extending this item to cover
	913	also the warning messages (see L<perllexwarn>, C<warnings.pl>).
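
To make the idea of stable ids concrete, here is a small sketch of what an in-memory view of such a catalog might look like. Everything in it is an assumption made for illustration: the C<PERL-NNNN> id scheme, the C<msg_entry> layout and C<msg_lookup> are hypothetical, and a real design would load the table from the external datafile rather than compile it in.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum { SEV_WARNING, SEV_FATAL } msg_severity;

/* One catalog row: the id never changes, the text may be reworded
 * or translated without breaking code that looks for the id. */
typedef struct {
    const char  *id;         /* hypothetical stable identifier */
    msg_severity severity;
    const char  *subsystem;
    const char  *text;       /* message template */
} msg_entry;

static const msg_entry msg_catalog[] = {
    { "PERL-0001", SEV_FATAL,   "syntax",  "syntax error" },
    { "PERL-0002", SEV_WARNING, "numeric", "Use of uninitialized value" },
    { "PERL-0003", SEV_FATAL,   "control", "Can't find label %s" },
};

/* Return the catalog entry for id, or NULL if the id is unknown. */
const msg_entry *msg_lookup(const char *id)
{
    size_t i;

    assert(id != NULL);
    for (i = 0; i < sizeof msg_catalog / sizeof msg_catalog[0]; i++)
        if (strcmp(msg_catalog[i].id, id) == 0)
            return &msg_catalog[i];
    return NULL;
}
```

Filtering by severity or subsystem then becomes a comparison on a field rather than a pattern match on message text.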
	914
	915	=head1 Tasks that need a knowledge of the interpreter
	916
	917	These tasks would need C knowledge, and knowledge of how the interpreter works,
	918	or a willingness to learn.
	919
	920	=head2 forbid labels with keyword names
	921
	922	Currently C<goto keyword> "computes" the label value:
	923
	924	    $ perl -e 'goto print'
	925	    Can't find label 1 at -e line 1.
	926
	927	It is debatable whether the right way to avoid the confusion is to forbid
	928	labels with keyword names, or whether it would be better to always treat
	929	bareword expressions after a "goto" as a label and never as a keyword.
	930
	931	=head2 truncate() prototype
	932
	933	The prototype of truncate() is currently C<$$>. It should probably
	934	be C<*$> instead. (This is changed in F<opcode.pl>)
	935
	936	=head2 decapsulation of smart match argument
	937
	938	Currently C<$foo ~~ $object> will die with the message "Smart matching a
	939	non-overloaded object breaks encapsulation". It would be nice to allow
	940	this to be bypassed by explicitly using the syntax C<$foo ~~ %$object> or
	941	C<$foo ~~ @$object>.
	942
	943	=head2 error reporting of [$a ; $b]
	944
	945	Using C<;> inside brackets is a syntax error, and we don't propose to change
	946	that by giving it any meaning. However, it's not reported very helpfully:
	947
	948	    $ perl -e '$a = [$b; $c];'
	949	    syntax error at -e line 1, near "$b;"
	950	    syntax error at -e line 1, near "$c]"
	951	    Execution of -e aborted due to compilation errors.
	952
	953	It should be possible to hook into the tokeniser or the lexer, so that when a
	954	C<;> is parsed where it is not legal as a statement terminator (I<i.e.> inside
	955	C<{}> used as a hashref, C<[]> or C<()>) it issues an error something like
	956	I<';' isn't legal inside an expression - if you need multiple statements use a
	957	do {...} block>. See the thread starting at
	958	http://www.xray.mpe.mpg.de/mailing-lists/perl5-porters/2008-09/msg00573.html
	959
	960	=head2 lexicals used only once
	961
	962	This warns:
	963
	964	    $ perl -we '$pie = 42'
	965	    Name "main::pie" used only once: possible typo at -e line 1.
	966
	967	This does not:
	968
	969	    $ perl -we 'my $pie = 42'
	970
	971	Logically all lexicals used only once should warn, if the user asks for
	972	warnings. An unworked RT ticket (#5087) has been open for almost seven
	973	years for this discrepancy.
	974
	975	=head2 UTF-8 revamp
	976
	977	The handling of Unicode is unclean in many places. For example, the regexp
	978	engine matches in Unicode semantics whenever the string or the pattern is
	979	flagged as UTF-8, but that should not be dependent on an internal storage
	980	detail of the string.
	981
	982	=head2 Properly Unicode safe tokeniser and pads.
	983
	984	The tokeniser isn't actually very UTF-8 clean. C<use utf8;> is a hack -
	985	variable names are stored in stashes as raw bytes, without the utf-8 flag
	986	set. The pad API only takes a C<char *> pointer, so that's all bytes too. The
	987	tokeniser ignores the UTF-8-ness of C<PL_rsfp>, or any SVs returned from
	988	source filters. All this could be fixed.
	989
	990	=head2 state variable initialization in list context
	991
	992	Currently this is illegal:
	993
	994	    state ($a, $b) = foo();
	995
	996	In Perl 6, C<state ($a) = foo();> and C<(state $a) = foo();> have different
	997	semantics, which is tricky to implement in Perl 5 as currently they produce
	998	the same opcode trees. The Perl 6 design is firm, so it would be good to
	999	implement the necessary code in Perl 5. There are comments in
	1000	C<Perl_newASSIGNOP()> that show the code paths taken by various assignment
	1001	constructions involving state variables.
	1002
	1003	=head2 Implement $value ~~ 0 .. $range
	1004
	1005	It would be nice to extend the syntax of the C<~~> operator to also
	1006	understand numeric (and maybe alphanumeric) ranges.
	1007
	1008	=head2 A does() built-in
	1009
	1010	Like ref(), only useful. It would call the C<DOES> method on objects; it
	1011	would also tell whether something can be dereferenced as an
	1012	array/hash/etc., or used as a regexp, etc.
	1013	L<http://www.xray.mpe.mpg.de/mailing-lists/perl5-porters/2007-03/msg00481.html>
	1014
	1015	=head2 Tied filehandles and write() don't mix
	1016
	1017	There is no method on tied filehandles to allow them to be called back by
	1018	formats.
	1019
	1020	=head2 Propagate compilation hints to the debugger
	1021
	1022	Currently a debugger started with -dE on the command-line doesn't see the
	1023	features enabled by -E. More generally hints (C<$^H> and C<%^H>) aren't
	1024	propagated to the debugger. Probably it would be a good thing to propagate
	1025	hints from the innermost non-C<DB::> scope: this would make code eval'ed
	1026	in the debugger see the features (and strictures, etc.) currently in
	1027	scope.
	1028
	1029	=head2 Attach/detach debugger from running program
	1030
	1031	The old perltodo notes "With C<gdb>, you can attach the debugger to a running
	1032	program if you pass the process ID. It would be good to do this with the Perl
	1033	debugger on a running Perl program, although I'm not sure how it would be
	1034	done." ssh and screen do this with named pipes in /tmp. Maybe we can too.
	1035
	1036	=head2 LVALUE functions for lists
	1037
	1038	The old perltodo notes that lvalue functions don't work for list or hash
	1039	slices. This would be good to fix.
	1040
	1041	=head2 regexp optimiser optional
	1042
	1043	The regexp optimiser is not optional. It should be configurable, to allow
	1044	its performance to be measured, and its bugs to be easily demonstrated.
	1045
	1046	=head2 C</w> regex modifier
	1047
	1048	That flag would enable whole-word matching, and would also interpolate
	1049	arrays as alternations. With it, C</P/w> would be roughly equivalent to:
	1050
	1051	    do { local $"='|'; /\b(?:P)\b/ }
	1052
	1053	See L<http://www.xray.mpe.mpg.de/mailing-lists/perl5-porters/2007-01/msg00400.html>
	1054	for the discussion.
	1055
	1056	=head2 optional optimizer
	1057
	1058	Make the peephole optimizer optional. Currently it performs two tasks as
	1059	it walks the optree - genuine peephole optimisations, and necessary fixups of
	1060	ops. It would be good to find an efficient way to switch out the
	1061	optimisations whilst keeping the fixups.
	1062
	1063	=head2 You WANT how many
	1064
	1065	Currently contexts are void, scalar and list. split has a special mechanism in
	1066	place to pass in the number of return values wanted. It would be useful to
	1067	have a general mechanism for this that is backwards compatible and incurs little speed hit.
	1068	This would allow proposals such as short circuiting sort to be implemented
	1069	as a module on CPAN.
	1070
	1071	=head2 lexical aliases
	1072
	1073	Allow lexical aliases (maybe via the syntax C<my \$alias = \$foo>).
	1074
	1075	=head2 entersub XS vs Perl
	1076
	1077	At the moment pp_entersub is huge, and has code to deal with entering both
	1078	perl and XS subroutines. Subroutine implementations rarely change between
	1079	perl and XS at run time, so investigate using 2 ops to enter subs (one for
	1080	XS, one for perl) and swap between if a sub is redefined.
	1081
	1082	=head2 Self-ties
	1083
	1084	Self-ties are currently illegal because they caused too many segfaults. Maybe
	1085	the causes of these could be tracked down and self-ties on all types
	1086	reinstated.
	1087
	1088	=head2 Optimize away @_
	1089
	1090	The old perltodo notes "Look at the "reification" code in C<av.c>".
	1091
	1092	=head2 Virtualize operating system access
	1093
	1094	Implement a set of "vtables" that virtualizes operating system access
	1095	(open(), mkdir(), unlink(), readdir(), getenv(), etc.). At the very
	1096	least these interfaces should take SVs as "name" arguments instead of
	1097	bare char pointers; probably the most flexible and extensible way
	1098	would be for the Perl-facing interfaces to accept HVs. The system
	1099	needs to be per-operating-system and per-file-system
	1100	hookable/filterable, preferably both from XS and Perl level
	1101	(L<perlport/"Files and Filesystems"> is good reading at this point,
	1102	in fact, all of L<perlport> is.)
	1103
	1104	This has actually already been implemented (but only for Win32),
	1105	take a look at F<iperlsys.h> and F<win32/perlhost.h>. While all Win32
	1106	variants go through a set of "vtables" for operating system access,
	1107	non-Win32 systems currently go straight for the POSIX/Unix-style
	1108	system/library call. A system similar to the Win32 one should be
	1109	implemented for all platforms. The existing Win32 implementation
	1110	probably does not need to survive alongside this proposed new
	1111	implementation, the approaches could be merged.
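
For readers unfamiliar with the "vtable" idea, the following sketch shows its shape in miniature. It is not the interface in F<iperlsys.h>; C<os_vtable>, C<os_native>, C<os_sandbox> and C<os_remove_file> are hypothetical names, and it still passes bare C<char> pointers, which is exactly the limitation the proposal above wants to lift by passing SVs.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical table of operating system entry points. */
typedef struct {
    char *(*getenv_fn)(const char *name);
    int   (*unlink_fn)(const char *path);
} os_vtable;

/* Default table: go straight to the C library, as non-Win32 perl does. */
static const os_vtable os_native = { getenv, unlink };

/* A toy "sandbox" table: hides the environment and refuses deletions. */
static char *sandbox_getenv(const char *name)
{
    (void)name;
    return NULL;
}

static int sandbox_unlink(const char *path)
{
    (void)path;
    errno = EPERM;
    return -1;
}

static const os_vtable os_sandbox = { sandbox_getenv, sandbox_unlink };

/* Callers never name unlink() directly; they go through whichever
 * table is in effect, so the behaviour can be swapped per platform,
 * per filesystem, or per thread. */
int os_remove_file(const os_vtable *os, const char *path)
{
    assert(os != NULL && path != NULL);
    return os->unlink_fn(path);
}
```

A Unicode-aware filename layer would be another table of the same shape, converting names before handing them to the native entry points.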
	1112
	1113	What would this give us? One often-asked-for feature this would
	1114	enable is using Unicode for filenames, and other "names" like %ENV,
	1115	usernames, hostnames, and so forth.
	1116	(See L<perlunicode/"When Unicode Does Not Happen">.)
	1117
	1118	But this kind of virtualization would also allow for things like
	1119	virtual filesystems, virtual networks, and "sandboxes" (though as long
	1120	as dynamic loading of random object code is allowed, not very safe
	1121	sandboxes, since external code of course knows nothing of Perl's vtables).
	1122	An example of a smaller "sandbox" is that this feature can be used to
	1123	implement per-thread working directories: Win32 already does this.
	1124
	1125	See also L</"Extend PerlIO and PerlIO::Scalar">.
	1126
	1127	=head2 Investigate PADTMP hash pessimisation
	1128
	1129	The peephole optimiser converts constants used for hash key lookups to shared
	1130	hash key scalars. Under ithreads, something is undoing this work.
	1131	See http://www.xray.mpe.mpg.de/mailing-lists/perl5-porters/2007-09/msg00793.html
	1132
	1133	=head2 Store the current pad in the OP slab allocator
	1134
	1135	=for clarification
	1136	I hope that I got that "current pad" part correct
	1137
	1138	Currently we leak ops in various cases of parse failure. I suggested that we
	1139	could solve this by always using the op slab allocator, and walking it to
	1140	free ops. Dave comments that as some ops are already freed during optree
	1141	creation one would have to mark which ops are freed, and not double free them
	1142	when walking the slab. He notes that one problem with this is that for some ops
	1143	you have to know which pad was current at the time of allocation, which does
	1144	change. I suggested storing a pointer to the current pad in the memory allocated
	1145	for the slab, and swapping to a new slab each time the pad changes. Dave thinks
	1146	that this would work.
	1147
	1148	=head2 repack the optree
	1149
	1150	Repacking the optree after execution order is determined could allow
	1151	removal of NULL ops, and optimal ordering of OPs with respect to cache-line
	1152	filling. The slab allocator could be reused for this purpose. I think that
	1153	the best way to do this is to make it an optional step just before the
	1154	completed optree is attached to anything else, and to use the slab allocator
	1155	unchanged, so that freeing ops is identical whether or not this step runs.
	1156	Note that the slab allocator allocates ops downwards in memory, so one would
	1157	have to actually "allocate" the ops in reverse-execution order to get them
	1158	contiguous in memory in execution order.
	1159
	1160	See http://www.nntp.perl.org/group/perl.perl5.porters/2007/12/msg131975.html
	1161
	1162	Note that running this copy, and then freeing all the old location ops would
	1163	cause their slabs to be freed, which would eliminate possible memory wastage if
	1164	the previous suggestion is implemented, and we swap slabs more frequently.
	1165
	1166	=head2 eliminate incorrect line numbers in warnings
	1167
	1168	This code
	1169
	1170	    use warnings;
	1171	    my $undef;
	1172
	1173	    if ($undef == 3) {
	1174	    } elsif ($undef == 0) {
	1175	    }
	1176
	1177	used to produce this output:
	1178
	1179	    Use of uninitialized value in numeric eq (==) at wrong.pl line 4.
	1180	    Use of uninitialized value in numeric eq (==) at wrong.pl line 4.
	1181
	1182	where the line of the second warning was misreported - it should be line 5.
	1183	Rafael fixed this - the problem arose because there was no nextstate OP
	1184	between the execution of the C<if> and the C<elsif>, hence C<PL_curcop> still
	1185	reports that the currently executing line is line 4. The solution was to inject
	1186	a nextstate OP for each C<elsif>, although it turned out that the nextstate
	1187	OP needed to be a nulled OP, rather than a live nextstate OP, else other line
	1188	numbers became misreported. (Jenga!)
	1189
	1190	The problem is more general than C<elsif> (although the C<elsif> case is the
	1191	most common and the most confusing). Ideally this code
	1192
	1193	    use warnings;
	1194	    my $undef;
	1195
	1196	    my $a = $undef + 1;
	1197	    my $b
	1198	      = $undef
	1199	      + 1;
	1200
	1201	would produce this output
	1202
	1203	    Use of uninitialized value $undef in addition (+) at wrong.pl line 4.
	1204	    Use of uninitialized value $undef in addition (+) at wrong.pl line 7.
	1205
	1206	(rather than lines 4 and 5), but this would seem to require every OP to carry
	1207	(at least) line number information.
	1208
	1209	What might work is to have an optional line number in memory just before the
	1210	BASEOP structure, with a flag bit in the op to say whether it's present.
	1211	Initially during compile every OP would carry its line number. Then add a late
	1212	pass to the optimiser (potentially combined with L</repack the optree>) which
	1213	looks at the two ops on every edge of the graph of the execution path. If
	1214	the line number changes, flag the destination OP with this information.
	1215	Once all paths are traced, replace every op with the flag with a
	1216	nextstate-light op (that just updates C<PL_curcop>), which in turn then passes
	1217	control on to the true op. All ops would then be replaced by variants that
	1218	do not store the line number. (Which, logically, is why it would work best in
	1219	conjunction with L</repack the optree>, as that is already copying/reallocating
	1220	all the OPs)
	1221
	1222	(Although I should note that we're not certain that doing this for the general
	1223	case is worth it)
	1224
	1225	=head2 optimize tail-calls
	1226
	1227	Tail-calls present an opportunity for broadly applicable optimization;
	1228	anywhere that C<< return foo(...) >> is called, the outer return can
	1229	be replaced by a goto, and foo will return directly to the outer
	1230	caller, saving (conservatively) 25% of perl's call&return cost, which
	1231	is relatively higher than in C. The Scheme language is known to do
	1232	this heavily. B::Concise provides good insight into where this
	1233	optimization is possible, I<i.e.> anywhere an entersub, leavesub op sequence
	1234	occurs.
	1235
	1236	    perl -MO=Concise,-exec,a,b,-main -e 'sub a{ 1 }; sub b {a()}; b(2)'
	1237
	1238	Bottom line on this is probably a new pp_tailcall function which
	1239	combines the code in pp_entersub, pp_leavesub. This should probably
	1240	be done first in XS, using B::Generate to patch the new OP into the
	1241	optrees.
	1242
	1243	=head2 Add C<0oddddd>
	1244
	1245	It has been proposed that octal constants be specifiable through the syntax
	1246	C<0oddddd>, parallel to the existing construct to specify hex constants
	1247	C<0xddddd>.
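
The proposed notation is simple to recognise. As a standalone sketch only (the real change would live in the tokeniser's numeric-literal code, and C<parse_0o_octal> is a hypothetical name), a parser for the lower-case C<0o> prefix could look like this; it does not handle the underscores that Perl permits inside numeric literals.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Parse a literal of the form 0oDDDD, where each D is an octal digit.
 * On success stores the value in *out and returns 0; returns -1 if the
 * prefix is missing, a digit is not octal, or the value overflows. */
int parse_0o_octal(const char *s, unsigned long *out)
{
    const char *digits;
    char *end;
    unsigned long value;

    assert(s != NULL && out != NULL);
    if (s[0] != '0' || s[1] != 'o')
        return -1;
    digits = s + 2;
    if (*digits < '0' || *digits > '7')   /* need at least one octal digit */
        return -1;
    errno = 0;
    value = strtoul(digits, &end, 8);     /* stops at the first non-octal char */
    if (errno != 0 || *end != '\0')
        return -1;
    *out = value;
    return 0;
}
```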
	1248
	1249	=head1 Big projects
	1250
	1251	Tasks that will get your name mentioned in the description of the "Highlights
	1252	of 5.12".
	1253
	1254	=head2 make ithreads more robust
	1255
	1256	Generally make ithreads more robust. See also L</iCOW>.
	1257
	1258	This task is incremental - even a little bit of work on it will help, and
	1259	will be greatly appreciated.
	1260
	1261	One bit would be to determine how to clone directory handles on systems
	1262	without a C<fchdir> function (in sv.c:Perl_dirp_dup).
	1263
	1264	Fix Perl_sv_dup, et al so that threads can return objects.
	1265
	1266	=head2 iCOW
	1267
	1268	Sarathy and Arthur have a proposal for an improved Copy On Write which
	1269	specifically will be able to COW new ithreads. If this can be implemented
	1270	it would be a good thing.
	1271
	1272	=head2 (?{...}) closures in regexps
	1273
	1274	Fix (or rewrite) the implementation of the C</(?{...})/> closures.
	1275
	1276	=head2 A re-entrant regexp engine
	1277
	1278	This will allow the use of a regex from inside (?{ }), (??{ }) and
	1279	(?(?{ })|) constructs.
	1280
	1281	=head2 Add class set operations to regexp engine
	1282
	1283	Apparently these are quite useful. Anyway, Jeffrey Friedl wants them.
	1284
	1285	demerphq has this on his todo list, but right at the bottom.
	1286
	1287
	1288	=head1 Tasks for microperl
	1289
	1290
	1291	[ Each and every one of these may be obsolete, but they were listed
	1292	in the old Todo.micro file]
	1293
	1294
	1295	=head2 make creating uconfig.sh automatic
	1296
	1297	=head2 make creating Makefile.micro automatic
	1298
	1299	=head2 do away with fork/exec/wait?
	1300
	1301	(system, popen should be enough?)
	1302
	1303	=head2 some of the uconfig.sh really needs to be probed (using cc) in buildtime:
	1304
	1305	(uConfigure? :-) native datatype widths and endianness come to mind
	1306