This is a live mirror of the Perl 5 development currently hosted at https://github.com/perl/perl5
[perl5.git] / pod / perltodo.pod
1=head1 NAME
2
3perltodo - Perl TO-DO List
4
5=head1 DESCRIPTION
e50bb9a1 6
7This is a list of wishes for Perl. The most up to date version of this file
8is at http://perl5.git.perl.org/perl.git/blob_plain/HEAD:/pod/perltodo.pod
9
10The tasks we think are smaller or easier are listed first. Anyone is welcome
11to work on any of these, but it's a good idea to first contact
12I<perl5-porters@perl.org> to avoid duplication of effort, and to learn from
13any previous attempts. By all means contact a pumpking privately first if you
14prefer.
e50bb9a1 15
16Whilst patches to make the list shorter are most welcome, ideas to add to
17the list are also encouraged. Check the perl5-porters archives for past
18ideas, and any discussion about them. One set of archives may be found at:
e50bb9a1 19
0bdfc961 20 http://www.xray.mpe.mpg.de/mailing-lists/perl5-porters/
938c8732 21
22What can we offer you in return? Fame, fortune, and everlasting glory? Maybe
23not, but if your patch is incorporated, then we'll add your name to the
24F<AUTHORS> file, which ships in the official distribution. How many other
25programming languages offer you 1 line of immortality?
938c8732 26
0bdfc961 27=head1 Tasks that only need Perl knowledge
e50bb9a1 28
29=head2 Migrate t/ from custom TAP generation
30
31Many tests below F<t/> still generate TAP by "hand", rather than using library
32functions. As explained in L<perlhack/Writing a test>, tests in F<t/> are
33written in a particular way to test that more complex constructions actually
34work before using them routinely. Hence they don't use C<Test::More>, but
35instead there is an intentionally simpler library, F<t/test.pl>. However,
36quite a few tests in F<t/> have not been refactored to use it. Refactoring
37any of these tests, one at a time, is a useful thing TODO.
38
39The subdirectories F<base>, F<cmd> and F<comp>, that contain the most
40basic tests, should be excluded from this task.
41
42=head2 Automate perldelta generation
43
The perldelta file accompanying each release summarises the major changes.
It's mostly manually generated currently, but some of that could be
automated with a bit of perl, specifically the generation of
47
48=over
49
50=item Modules and Pragmata
51
52=item New Documentation
53
54=item New Tests
55
56=back
57
58See F<Porting/how_to_write_a_perldelta.pod> for details.
59
60=head2 Remove duplication of test setup.
61
Schwern notes that there's duplication of code - lots and lots of tests have
some variation on the big block of C<$Is_Foo> checks. We can safely put this
into a file, change it to build an C<%Is> hash and require it. Maybe just put
it into F<test.pl>. Throw in the handy tainting subroutines.
66
87a942b1 67=head2 POD -E<gt> HTML conversion in the core still sucks
e50bb9a1 68
938c8732 69Which is crazy given just how simple POD purports to be, and how simple HTML
70can be. It's not actually I<as> simple as it sounds, particularly with the
71flexibility POD allows for C<=item>, but it would be good to improve the
72visual appeal of the HTML generated, and to avoid it having any validation
73errors. See also L</make HTML install work>, as the layout of installation tree
74is needed to improve the cross-linking.
938c8732 75
76The addition of C<Pod::Simple> and its related modules may make this task
77easier to complete.
78
79=head2 Make ExtUtils::ParseXS use strict;
80
81F<lib/ExtUtils/ParseXS.pm> contains this line
82
83 # use strict; # One of these days...
84
85Simply uncomment it, and fix all the resulting issues :-)
86
The more practical approach, to break the task down into manageable chunks, is
to work your way through the code from bottom to top, adding extra
C<{ ... }> blocks if necessary, and turning on strict within them.
90
0bdfc961 91=head2 Make Schwern poorer
e50bb9a1 92
We should have tests for everything. When all the core's modules are tested,
Schwern has promised to donate $500 to TPF. We may need volunteers to
hold him upside down and shake vigorously in order to actually extract the
cash.
3958b146 97
0bdfc961 98=head2 Improve the coverage of the core tests
e50bb9a1 99
e1020413 100Use Devel::Cover to ascertain the core modules' test coverage, then add
02f21748 101tests that are currently missing.
30222c0f 102
0bdfc961 103=head2 test B
e50bb9a1 104
0bdfc961 105A full test suite for the B module would be nice.
e50bb9a1 106
0bdfc961 107=head2 A decent benchmark
e50bb9a1 108
617eabfa 109C<perlbench> seems impervious to any recent changes made to the perl core. It
110would be useful to have a reasonable general benchmarking suite that roughly
111represented what current perl programs do, and measurably reported whether
112tweaks to the core improve, degrade or don't really affect performance, to
113guide people attempting to optimise the guts of perl. Gisle would welcome
114new tests for perlbench.
6168cf99 115
0bdfc961 116=head2 fix tainting bugs
6168cf99 117
118Fix the bugs revealed by running the test suite with the C<-t> switch (via
119C<make test.taintwarn>).
e50bb9a1 120
0bdfc961 121=head2 Dual life everything
e50bb9a1 122
123As part of the "dists" plan, anything that doesn't belong in the smallest perl
124distribution needs to be dual lifed. Anything else can be too. Figure out what
125changes would be needed to package that module and its tests up for CPAN, and
126do so. Test it with older perl releases, and fix the problems you find.
e50bb9a1 127
128To make a minimal perl distribution, it's useful to look at
129F<t/lib/commonsense.t>.
130
0bdfc961 131=head2 POSIX memory footprint
e50bb9a1 132
Ilya observed that C<use POSIX;> eats memory like there's no tomorrow, and at
various times worked to cut it down. There is probably still fat to cut out -
for example POSIX passes Exporter some very memory hungry data structures.
e50bb9a1 136
137=head2 embed.pl/makedef.pl
138
There is a script F<embed.pl> that generates several header files to prefix
all of Perl's symbols in a consistent way, to provide some semblance of
namespace support in C<C>. Functions are declared in F<embed.fnc>, variables
in F<interpvar.h>. Quite a few of the functions and variables
are conditionally declared there, using C<#ifdef>. However, F<embed.pl>
doesn't understand the C macros, so the rules about which symbols are present
when are duplicated in F<makedef.pl>. Writing things twice is bad, m'kay.
It would be good to teach F<embed.pl> to understand the conditional
compilation, and hence remove the duplication, and the mistakes it has caused.
e50bb9a1 148
149=head2 use strict; and AutoLoad
150
151Currently if you write
152
 package Whack;
 use AutoLoader 'AUTOLOAD';
 use strict;
 1;
 __END__
 sub bloop {
     print join (' ', No, strict, here), "!\n";
 }
161
162then C<use strict;> isn't in force within the autoloaded subroutines. It would
163be more consistent (and less surprising) to arrange for all lexical pragmas
164in force at the __END__ block to be in force within each autoloaded subroutine.
165
166There's a similar problem with SelfLoader.
167
168=head2 profile installman
169
The F<installman> script is slow. All it is doing is text processing, which we're
told is something Perl is good at. So it would be nice to know what it is doing
that is taking so much CPU, and where possible address it.
173
c69ca1d4 174=head2 enable lexical enabling/disabling of individual warnings
175
176Currently, warnings can only be enabled or disabled by category. There
177are times when it would be useful to quash a single warning, not a
178whole category.
91d0cbf6 179
0bdfc961 180=head1 Tasks that need a little sysadmin-type knowledge
e50bb9a1 181
182Or if you prefer, tasks that you would learn from, and broaden your skills
183base...
e50bb9a1 184
cd793d32 185=head2 make HTML install work
e50bb9a1 186
187There is an C<installhtml> target in the Makefile. It's marked as
188"experimental". It would be good to get this tested, make it work reliably, and
189remove the "experimental" tag. This would include
190
191=over 4
192
193=item 1
194
195Checking that cross linking between various parts of the documentation works.
196In particular that links work between the modules (files with POD in F<lib/>)
197and the core documentation (files in F<pod/>)
198
199=item 2
200
201Work out how to split C<perlfunc> into chunks, preferably one per function
202group, preferably with general case code that could be used elsewhere.
203Challenges here are correctly identifying the groups of functions that go
204together, and making the right named external cross-links point to the right
205page. Things to be aware of are C<-X>, groups such as C<getpwnam> to
206C<endservent>, two or more C<=items> giving the different parameter lists, such
207as
208
 =item substr EXPR,OFFSET,LENGTH,REPLACEMENT
 =item substr EXPR,OFFSET,LENGTH
 =item substr EXPR,OFFSET
212
213and different parameter lists having different meanings. (eg C<select>)
214
215=back
3a89a73c 216
217=head2 compressed man pages
218
219Be able to install them. This would probably need a configure test to see how
220the system does compressed man pages (same directory/different directory?
221same filename/different filename), as well as tweaking the F<installman> script
222to compress as necessary.
223
224=head2 Add a code coverage target to the Makefile
225
226Make it easy for anyone to run Devel::Cover on the core's tests. The steps
227to do this manually are roughly
228
229=over 4
230
231=item *
232
233do a normal C<Configure>, but include Devel::Cover as a module to install
f11a3063 234(see L<INSTALL> for how to do this)
235
236=item *
237
238 make perl
239
240=item *
241
242 cd t; HARNESS_PERL_SWITCHES=-MDevel::Cover ./perl -I../lib harness
243
244=item *
245
246Process the resulting Devel::Cover database
247
248=back
249
This just gives you the coverage of the F<.pm>s. To also get the C level
coverage you need to
252
253=over 4
254
255=item *
256
257Additionally tell C<Configure> to use the appropriate C compiler flags for
258C<gcov>
259
260=item *
261
262 make perl.gcov
263
264(instead of C<make perl>)
265
266=item *
267
After running the tests run C<gcov> to generate all the F<.gcov> files.
(Including down in the subdirectories of F<ext/>)
270
271=item *
272
273(From the top level perl directory) run C<gcov2perl> on all the C<.gcov> files
274to get their stats into the cover_db directory.
275
276=item *
277
278Then process the Devel::Cover database
279
280=back
281
282It would be good to add a single switch to C<Configure> to specify that you
283wanted to perform perl level coverage, and another to specify C level
284coverage, and have C<Configure> and the F<Makefile> do all the right things
285automatically.
286
02f21748 287=head2 Make Config.pm cope with differences between built and installed perl
288
289Quite often vendors ship a perl binary compiled with their (pay-for)
290compilers. People install a free compiler, such as gcc. To work out how to
291build extensions, Perl interrogates C<%Config>, so in this situation
292C<%Config> describes compilers that aren't there, and extension building
293fails. This forces people into choosing between re-compiling perl themselves
294using the compiler they have, or only using modules that the vendor ships.
295
It would be good to find a way to teach C<Config.pm> about the installation setup,
possibly involving probing at install time or later, so that the C<%Config> in
a binary distribution better describes the installed machine, when the
installed machine differs from the build machine in some significant way.
300
301=head2 linker specification files
302
303Some platforms mandate that you provide a list of a shared library's external
304symbols to the linker, so the core already has the infrastructure in place to
305do this for generating shared perl libraries. My understanding is that the
306GNU toolchain can accept an optional linker specification file, and restrict
307visibility just to symbols declared in that file. It would be good to extend
308F<makedef.pl> to support this format, and to provide a means within
309C<Configure> to enable it. This would allow Unix users to test that the
310export list is correct, and to build a perl that does not pollute the global
311namespace with private symbols, and will fail in the same way as msvc or mingw
312builds or when using PERL_DL_NONLAZY=1.
728f4ecd 313
314=head2 Cross-compile support
315
Currently C<Configure> understands the C<-Dusecrosscompile> option. This option
arranges for building C<miniperl> for the TARGET machine, so this C<miniperl> is
then assumed to be copied to the TARGET machine and used as a replacement for the
full C<perl> executable.
320
This could be done a little differently. Namely C<miniperl> should be built for
HOST and then full C<perl> with extensions should be compiled for TARGET.
This, however, might require extra trickery for %Config: we have one config
first for HOST and then another for TARGET. Tools like MakeMaker will be
mightily confused. Having around two different types of executables and
libraries (HOST and TARGET) makes life interesting for Makefiles and
shell (and Perl) scripts. There is $Config{run}, normally empty, which
can be used as an execution wrapper. Also note that in some
cross-compilation/execution environments the HOST and the TARGET do
not see the same filesystem(s), so the $Config{run} may need to do some
file/directory copying back and forth.
0bdfc961 332
333=head2 roffitall
334
335Make F<pod/roffitall> be updated by F<pod/buildtoc>.
336
337=head2 Split "linker" from "compiler"
338
339Right now, Configure probes for two commands, and sets two variables:
340
341=over 4
342
b91dd380 343=item * C<cc> (in F<cc.U>)
344
345This variable holds the name of a command to execute a C compiler which
346can resolve multiple global references that happen to have the same
347name. Usual values are F<cc> and F<gcc>.
348Fervent ANSI compilers may be called F<c89>. AIX has F<xlc>.
349
b91dd380 350=item * C<ld> (in F<dlsrc.U>)
351
352This variable indicates the program to be used to link
353libraries for dynamic loading. On some systems, it is F<ld>.
354On ELF systems, it should be C<$cc>. Mostly, we'll try to respect
355the hint file setting.
356
357=back
358
There is an implicit historical assumption from around Perl5.000alpha
something, that C<$cc> is also the correct command for linking object files
together to make an executable. This may be true on Unix, but it's not true
on other platforms, and there is a maze of workarounds in other places (such
as F<Makefile.SH>) to cope with this.
364
365Ideally, we should create a new variable to hold the name of the executable
366linker program, probe for it in F<Configure>, and centralise all the special
367case logic there or in hints files.
368
369A small bikeshed issue remains - what to call it, given that C<$ld> is already
370taken (arguably for the wrong thing now, but on SunOS 4.1 it is the command
371for creating dynamically-loadable modules) and C<$link> could be confused with
372the Unix command line executable of the same name, which does something
373completely different. Andy Dougherty makes the counter argument "In parrot, I
374tried to call the command used to link object files and libraries into an
375executable F<link>, since that's what my vaguely-remembered DOS and VMS
376experience suggested. I don't think any real confusion has ensued, so it's
377probably a reasonable name for perl5 to use."
378
379"Alas, I've always worried that introducing it would make things worse,
380since now the module building utilities would have to look for
381C<$Config{link}> and institute a fall-back plan if it weren't found."
382Although I can see that as confusing, given that C<$Config{d_link}> is true
383when (hard) links are available.
98fca0e8 384
385=head2 Configure Windows using PowerShell
386
Currently, Windows uses hard-coded config files to build the
config.h for compiling Perl. Makefiles are also hard-coded and need to be
hand edited prior to building Perl. While this makes it easy to create a perl.exe
that works across multiple Windows versions, being able to accurately
configure a perl.exe for a specific Windows version and VS C++ would be
a nice enhancement. With PowerShell available on Windows XP and up, this
may now be possible. Step 1 might be to investigate whether this is possible
and use this to clean up our current makefile situation. Step 2 would be to
see if there would be a way to use our existing metaconfig units to configure a
Windows Perl or whether we go in a separate direction and make it so. Of
course, we all know what step 3 is.
398
399=head2 decouple -g and -DDEBUGGING
400
401Currently F<Configure> automatically adds C<-DDEBUGGING> to the C compiler
402flags if it spots C<-g> in the optimiser flags. The pre-processor directive
eeab323f 403C<DEBUGGING> enables F<perl>'s command line C<-D> options, but in the process
404makes F<perl> slower. It would be good to disentangle this logic, so that
405C-level debugging with C<-g> and Perl level debugging with C<-D> can easily
406be enabled independently.
407
408=head1 Tasks that need a little C knowledge
409
These tasks would need a little C knowledge, but don't need any specific
background or experience with XS, or how the Perl interpreter works.
412
413=head2 Weed out needless PERL_UNUSED_ARG
414
415The C code uses the macro C<PERL_UNUSED_ARG> to stop compilers warning about
416unused arguments. Often the arguments can't be removed, as there is an
417external constraint that determines the prototype of the function, so this
418approach is valid. However, there are some cases where C<PERL_UNUSED_ARG>
419could be removed. Specifically
420
421=over 4
422
423=item *
424
425The prototypes of (nearly all) static functions can be changed
426
427=item *
428
429Unused arguments generated by short cut macros are wasteful - the short cut
430macro used can be changed.
431
432=back
433
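As a rough illustration of the first case, the sketch below contrasts a callback whose prototype is fixed externally with a static helper whose unused argument can simply be dropped. C<MY_UNUSED_ARG> is a stand-in for C<PERL_UNUSED_ARG> (the real macro is defined in perl's headers), and all function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for PERL_UNUSED_ARG; an assumption made for this sketch only. */
#define MY_UNUSED_ARG(x) ((void)(x))

/* Callback type imposed by an external constraint: it cannot be changed. */
typedef int (*visit_fn)(void *ctx, int value);

/* Must match visit_fn, so the unused ctx stays and has to be silenced. */
static int count_visit(void *ctx, int value)
{
    MY_UNUSED_ARG(ctx);
    return value > 0;
}

/* Before: a static helper that carries an argument it never reads. */
static int twice_old(void *ctx, int value)
{
    MY_UNUSED_ARG(ctx);
    return 2 * value;
}

/* After: the helper is static, so its prototype can just lose the argument. */
static int twice_new(int value)
{
    return 2 * value;
}

/* Caller that depends on the fixed callback prototype. */
static int apply(visit_fn fn, int value)
{
    return fn(NULL, value);
}
```

Only C<twice_old> is a candidate for cleanup here; C<count_visit> has to keep its signature because C<apply> calls it through the function pointer type.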
434=head2 Modernize the order of directories in @INC
435
436The way @INC is laid out by default, one cannot upgrade core (dual-life)
437modules without overwriting files. This causes problems for binary
438package builders. One possible proposal is laid out in this
439message:
440L<http://www.xray.mpe.mpg.de/mailing-lists/perl5-porters/2002-04/msg02380.html>.
fbf638cb 441
442=head2 -Duse32bit*
443
444Natively 64-bit systems need neither -Duse64bitint nor -Duse64bitall.
445On these systems, it might be the default compilation mode, and there
446is currently no guarantee that passing no use64bitall option to the
447Configure process will build a 32bit perl. Implementing -Duse32bit*
fd2dadea 448options would be nice for perl 5.14.
bcbaa2d5 449
fee0a0f7 450=head2 Profile Perl - am I hot or not?
62403a3c 451
452The Perl source code is stable enough that it makes sense to profile it,
453identify and optimise the hotspots. It would be good to measure the
454performance of the Perl interpreter using free tools such as cachegrind,
455gprof, and dtrace, and work to reduce the bottlenecks they reveal.
456
457As part of this, the idea of F<pp_hot.c> is that it contains the I<hot> ops,
458the ops that are most commonly used. The idea is that by grouping them, their
459object code will be adjacent in the executable, so they have a greater chance
460of already being in the CPU cache (or swapped in) due to being near another op
461already in use.
462
463Except that it's not clear if these really are the most commonly used ops. So
464as part of exercising your skills with coverage and profiling tools you might
465want to determine what ops I<really> are the most commonly used. And in turn
466suggest evictions and promotions to achieve a better F<pp_hot.c>.
62403a3c 467
468One piece of Perl code that might make a good testbed is F<installman>.
469
470=head2 Allocate OPs from arenas
471
Currently all new OP structures are individually malloc()ed and free()d.
All C<malloc> implementations have space overheads, and are not as fast as
custom allocators, so it would use both less memory and less CPU to allocate
the various OP structures from arenas. The SV arena code can probably be
re-used for this.
477
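To make the idea concrete, here is a minimal sketch of a fixed-size arena with a free list. It is not the SV arena code from the core; the type names, the 48-byte slot size and the 64-slot chunk size are all assumptions chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define ARENA_SLOTS 64            /* slots per chunk; arbitrary for the sketch */

/* One slot: either a link in the free list or storage for one structure. */
typedef union slot {
    union slot *next_free;
    char payload[48];             /* assumed size of the structure stored */
} slot;

/* A chunk is one malloc() call that supplies many slots. */
typedef struct chunk {
    struct chunk *prev;
    slot slots[ARENA_SLOTS];
} chunk;

typedef struct arena {
    chunk *chunks;                /* every chunk allocated so far */
    slot *free_list;              /* slots currently available */
} arena;

/* Hand out one slot, grabbing a new chunk only when the free list is empty. */
static void *arena_alloc(arena *a)
{
    slot *s;
    if (!a->free_list) {
        chunk *c = malloc(sizeof *c);
        size_t i;
        if (!c)
            return NULL;
        c->prev = a->chunks;
        a->chunks = c;
        for (i = 0; i < ARENA_SLOTS; i++) {
            c->slots[i].next_free = a->free_list;
            a->free_list = &c->slots[i];
        }
    }
    s = a->free_list;
    a->free_list = s->next_free;
    return s;
}

/* Return a slot to the free list; no call into free() is needed. */
static void arena_free(arena *a, void *p)
{
    slot *s = p;
    assert(s != NULL);
    s->next_free = a->free_list;
    a->free_list = s;
}

/* Release every chunk at once. */
static void arena_destroy(arena *a)
{
    while (a->chunks) {
        chunk *c = a->chunks;
        a->chunks = c->prev;
        free(c);
    }
    a->free_list = NULL;
}
```

Per-object overhead drops to zero once a chunk exists, and both allocation and release are a couple of pointer moves, which is where the memory and CPU savings would come from.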
478Note that Configuring perl with C<-Accflags=-DPL_OP_SLAB_ALLOC> will use
479Perl_Slab_alloc() to pack optrees into a contiguous block, which is
480probably superior to the use of OP arenas, esp. from a cache locality
481standpoint. See L<Profile Perl - am I hot or not?>.
482
a229ae3b 483=head2 Improve win32/wince.c
0bdfc961 484
a229ae3b 485Currently, numerous functions look virtually, if not completely,
02f21748 486identical in both C<win32/wince.c> and C<win32/win32.c> files, which can't
487be good.
488
489=head2 Use secure CRT functions when building with VC8 on Win32
490
491Visual C++ 2005 (VC++ 8.x) deprecated a number of CRT functions on the basis
492that they were "unsafe" and introduced differently named secure versions of
493them as replacements, e.g. instead of writing
494
 FILE* f = fopen(__FILE__, "r");
496
497one should now write
498
 FILE* f;
 errno_t err = fopen_s(&f, __FILE__, "r");
501
502Currently, the warnings about these deprecations have been disabled by adding
503-D_CRT_SECURE_NO_DEPRECATE to the CFLAGS. It would be nice to remove that
504warning suppressant and actually make use of the new secure CRT functions.
505
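One way such a change might be staged is to hide the difference behind a small wrapper, so that callers are written once. The sketch below is only an assumption about how that could look: C<my_fopen> is a hypothetical name, and the C<_MSC_VER E<gt>= 1400> test is used here to mean "VC8 or later".

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical wrapper: returns 0 on success, a non-zero code on failure,
 * and stores the stream (or NULL) in *fp. */
static int my_fopen(FILE **fp, const char *path, const char *mode)
{
    assert(fp != NULL);
#if defined(_MSC_VER) && _MSC_VER >= 1400
    /* Secure CRT variant, available from Visual C++ 2005 onwards. */
    return (int)fopen_s(fp, path, mode);
#else
    /* Everywhere else, fall back to the standard function. */
    errno = 0;
    *fp = fopen(path, mode);
    if (*fp)
        return 0;
    return errno ? errno : -1;
#endif
}
```

With a wrapper like this the C<-D_CRT_SECURE_NO_DEPRECATE> suppression could be dropped one call site at a time rather than in a single sweep.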
506There is also a similar issue with POSIX CRT function names like fileno having
507been deprecated in favour of ISO C++ conformant names like _fileno. These
26a6faa8 508warnings are also currently suppressed by adding -D_CRT_NONSTDC_NO_DEPRECATE. It
509might be nice to do as Microsoft suggest here too, although, unlike the secure
510functions issue, there is presumably little or no benefit in this case.
511
512=head2 Fix POSIX::access() and chdir() on Win32
513
514These functions currently take no account of DACLs and therefore do not behave
515correctly in situations where access is restricted by DACLs (as opposed to the
516read-only attribute).
517
Furthermore, POSIX::access() behaves differently for directories having the
read-only attribute set depending on what CRT library is being used. For
example, the _access() function in the VC6 and VC7 CRTs (wrongly) claims that
such directories are not writable, whereas in fact all directories are writable
unless access is denied by DACLs. (In the case of directories, the read-only
attribute actually only means that the directory cannot be deleted.) This CRT
bug is fixed in the VC8 and VC9 CRTs (but, of course, the directory may still
not actually be writable if access is indeed denied by DACLs).
526
527For the chdir() issue, see ActiveState bug #74552:
528http://bugs.activestate.com/show_bug.cgi?id=74552
529
530Therefore, DACLs should be checked both for consistency across CRTs and for
531the correct answer.
532
533(Note that perl's -w operator should not be modified to check DACLs. It has
534been written so that it reflects the state of the read-only attribute, even
535for directories (whatever CRT is being used), for symmetry with chmod().)
536
537=head2 strcat(), strcpy(), strncat(), strncpy(), sprintf(), vsprintf()
538
539Maybe create a utility that checks after each libperl.a creation that
540none of the above (nor sprintf(), vsprintf(), or *SHUDDER* gets())
541ever creep back to libperl.a.
542
 nm libperl.a | ./miniperl -alne '$o = $F[0] if /:$/; print "$o $F[1]" if $F[0] eq "U" && $F[1] =~ /^(?:strn?c(?:at|py)|v?sprintf|gets)$/'
544
545Note, of course, that this will only tell whether B<your> platform
546is using those naughty interfaces.
547
548=head2 -D_FORTIFY_SOURCE=2, -fstack-protector
549
Recent glibcs support C<-D_FORTIFY_SOURCE=2> and recent gcc
(4.1 onwards?) supports C<-fstack-protector>, both of which give
protection against various kinds of buffer overflow problems.
These should probably be used for compiling Perl whenever available.
Configure and/or hints files should be adjusted to probe for the
availability of these features and enable them as appropriate.
16815324 556
557=head2 Arenas for GPs? For MAGIC?
558
559C<struct gp> and C<struct magic> are both currently allocated by C<malloc>.
560It might be a speed or memory saving to change to using arenas. Or it might
561not. It would need some suitable benchmarking first. In particular, C<GP>s
562can probably be changed with minimal compatibility impact (probably nothing
563outside of the core, or even outside of F<gv.c> allocates them), but they
564probably aren't allocated/deallocated often enough for a speed saving. Whereas
565C<MAGIC> is allocated/deallocated more often, but in turn, is also something
566more externally visible, so changing the rules here may bite external code.
567
568=head2 Shared arenas
569
570Several SV body structs are now the same size, notably PVMG and PVGV, PVAV and
571PVHV, and PVCV and PVFM. It should be possible to allocate and return same
572sized bodies from the same actual arena, rather than maintaining one arena for
573each. This could save 4-6K per thread, of memory no longer tied up in the
574not-yet-allocated part of an arena.
575
8964cfe0 576
577=head1 Tasks that need a knowledge of XS
578
579These tasks would need C knowledge, and roughly the level of knowledge of
580the perl API that comes from writing modules that use XS to interface to
581C.
582
583=head2 Write an XS cookbook
584
585Create pod/perlxscookbook.pod with short, task-focused 'recipes' in XS that
586demonstrate common tasks and good practices. (Some of these might be
587extracted from perlguts.) The target audience should be XS novices, who need
588more examples than perlguts but something less overwhelming than perlapi.
589Recipes should provide "one pretty good way to do it" instead of TIMTOWTDI.
590
591Rather than focusing on interfacing Perl to C libraries, such a cookbook
592should probably focus on how to optimize Perl routines by re-writing them
593in XS. This will likely be more motivating to those who mostly work in
594Perl but are looking to take the next step into XS.
595
596Deconstructing and explaining some simpler XS modules could be one way to
597bootstrap a cookbook. (List::Util? Class::XSAccessor? Tree::Ternary_XS?)
598Another option could be deconstructing the implementation of some simpler
599functions in op.c.
600
601=head2 Allow XSUBs to inline themselves as OPs
602
603For a simple XSUB, often the subroutine dispatch takes more time than the
604XSUB itself. The tokeniser already has the ability to inline constant
605subroutines - it would be good to provide a way to inline other subroutines.
606
Specifically, the simplest approach looks to be to allow an XSUB to provide an
alternative implementation of itself as a custom OP. A new flag bit in
C<CvFLAGS()> would signal to the peephole optimiser to take an optree
such as this:
611
    b  <@> leave[1 ref] vKP/REFC ->(end)
    1     <0> enter ->2
    2     <;> nextstate(main 1 -e:1) v:{ ->3
    a     <2> sassign vKS/2 ->b
    8        <1> entersub[t2] sKS/TARG,1 ->9
    -           <1> ex-list sK ->8
    3              <0> pushmark s ->4
    4              <$> const(IV 1) sM ->5
    6              <1> rv2av[t1] lKM/1 ->7
    5                 <$> gv(*a) s ->6
    -              <1> ex-rv2cv sK ->-
    7                 <$> gv(*x) s/EARLYCV ->8
    -        <1> ex-rv2sv sKRM*/1 ->a
    9           <$> gvsv(*b) s ->a
626
627perform the symbol table lookup of C<rv2cv> and C<gv(*x)>, locate the
628pointer to the custom OP that provides the direct implementation, and re-
629write the optree something like:
630
    b  <@> leave[1 ref] vKP/REFC ->(end)
    1     <0> enter ->2
    2     <;> nextstate(main 1 -e:1) v:{ ->3
    a     <2> sassign vKS/2 ->b
    7        <1> custom_x -> 8
    -           <1> ex-list sK ->7
    3              <0> pushmark s ->4
    4              <$> const(IV 1) sM ->5
    6              <1> rv2av[t1] lKM/1 ->7
    5                 <$> gv(*a) s ->6
    -              <1> ex-rv2cv sK ->-
    -                 <$> ex-gv(*x) s/EARLYCV ->7
    -        <1> ex-rv2sv sKRM*/1 ->a
    8           <$> gvsv(*b) s ->a
645
646I<i.e.> the C<gv(*)> OP has been nulled and spliced out of the execution
647path, and the C<entersub> OP has been replaced by the custom op.
648
649This approach should provide a measurable speed up to simple XSUBs inside
650tight loops. Initially one would have to write the OP alternative
651implementation by hand, but it's likely that this should be reasonably
652straightforward for the type of XSUB that would benefit the most. Longer
653term, once the run-time implementation is proven, it should be possible to
654progressively update ExtUtils::ParseXS to generate OP implementations for
655some XSUBs.
656
657=head2 Remove the use of SVs as temporaries in dump.c
658
F<dump.c> contains debugging routines to dump out the contents of perl data
structures, such as C<SV>s, C<AV>s and C<HV>s. Currently, the dumping code
B<uses> C<SV>s for its temporary buffers, which was a logical initial
implementation choice, as they provide ready made memory handling.
663
664However, they also lead to a lot of confusion when it happens that what you're
665trying to debug is seen by the code in F<dump.c>, correctly or incorrectly, as
666a temporary scalar it can use for a temporary buffer. It's also not possible
667to dump scalars before the interpreter is properly set up, such as during
668ithreads cloning. It would be good to progressively replace the use of scalars
669as string accumulation buffers with something much simpler, directly allocated
670by C<malloc>. The F<dump.c> code is (or should be) only producing 7 bit
671US-ASCII, so output character sets are not an issue.
672
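A replacement buffer need not be elaborate. The following is a minimal sketch of a growable string accumulator that relies only on the C library; the C<strbuf> type and its functions are hypothetical names, not anything present in F<dump.c>.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical accumulation buffer: a NUL-terminated string that grows. */
typedef struct strbuf {
    char *s;        /* allocated storage, or NULL before first use */
    size_t len;     /* bytes used, not counting the trailing NUL */
    size_t cap;     /* bytes allocated */
} strbuf;

/* Append text, doubling the allocation as needed. Returns 0 on success,
 * -1 if memory could not be obtained (the buffer is left unchanged). */
static int strbuf_append(strbuf *b, const char *text)
{
    size_t n = strlen(text);
    if (b->len + n + 1 > b->cap) {
        size_t cap = b->cap ? b->cap : 64;
        char *p;
        while (cap < b->len + n + 1)
            cap *= 2;
        p = realloc(b->s, cap);   /* behaves like malloc() when b->s is NULL */
        if (!p)
            return -1;
        b->s = p;
        b->cap = cap;
    }
    memcpy(b->s + b->len, text, n + 1);   /* copies the NUL as well */
    b->len += n;
    return 0;
}

/* Release the storage and reset the buffer to its empty state. */
static void strbuf_free(strbuf *b)
{
    free(b->s);
    b->s = NULL;
    b->len = 0;
    b->cap = 0;
}
```

Because nothing here touches interpreter state, a buffer of this kind could in principle be used before the interpreter is fully set up, which is the situation the text describes for ithreads cloning.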
673Producing and proving an internal simple buffer allocation would make it easier
674to re-write the internals of the PerlIO subsystem to avoid using C<SV>s for
675B<its> buffers, use of which can cause problems similar to those of F<dump.c>,
676at similar times.
677
678=head2 safely supporting POSIX SA_SIGINFO
679
680Some years ago Jarkko supplied patches to provide support for the POSIX
681SA_SIGINFO feature in Perl, passing the extra data to the Perl signal handler.
682
Unfortunately, it only works with "unsafe" signals, because under safe
signals, by the time Perl gets to run the signal handler, the extra
information has been lost. Moreover, it's not easy to store it somewhere,
as you can't lock mutexes, or do anything else fancy, from inside a signal
handler.
688
689So it strikes me that we could provide safe SA_SIGINFO support
690
691=over 4
692
693=item 1
694
695Provide global variables for two file descriptors
696
697=item 2
698
699When the first request is made via C<sigaction> for C<SA_SIGINFO>, create a
700pipe, store the reader in one, the writer in the other
701
702=item 3
703
In the "safe" signal handler (C<Perl_csighandler()>/C<S_raise_signal()>), if
the C<siginfo_t> pointer is non-C<NULL>, and the writer file handle is open,
706
707=over 8
708
709=item 1
710
711serialise signal number, C<struct siginfo_t> (or at least the parts we care
712about) into a small auto char buff
713
714=item 2
715
716C<write()> that (non-blocking) to the writer fd
717
718=over 12
719
720=item 1
721
722if it writes 100%, flag the signal in a counter of "signals on the pipe" akin
723to the current per-signal-number counts
724
725=item 2
726
727if it writes 0%, assume the pipe is full. Flag the data as lost?
728
729=item 3
730
731if it writes partially, croak a panic, as your OS is broken.
732
733=back
734
735=back
736
737=item 4
738
739in the regular C<PERL_ASYNC_CHECK()> processing, if there are "signals on
740the pipe", read the data out, deserialise, build the Perl structures on
741the stack (code in C<Perl_sighandler()>, the "unsafe" handler), and call as
742usual.
743
744=back
745
746I think that this gets us decent C<SA_SIGINFO> support, without the current risk
747of running Perl code inside the signal handler context. (With all the dangers
748of things like C<malloc> corruption that that currently offers us)
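
The four steps above can be sketched in plain C, outside the interpreter. This
is only an illustration under stated assumptions: every identifier
(C<siginfo_rfd>, C<siginfo_record>, C<siginfo_install>, C<siginfo_next> and so
on) is made up for the example, only two fields of C<siginfo_t> are
serialised, and C<abort()> stands in for the "croak a panic" case. The handler
restricts itself to async-signal-safe calls:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Step 1: two global descriptors (hypothetical names). */
static int siginfo_rfd = -1;
static int siginfo_wfd = -1;

/* "Signals on the pipe" counter, plus a count of records that were lost. */
static volatile sig_atomic_t siginfo_pending = 0;
static volatile sig_atomic_t siginfo_lost = 0;

/* The parts of siginfo_t this sketch chooses to keep. */
typedef struct {
    int signo;
    int code;
} siginfo_record;

/* Step 3: runs in signal context. */
static void safe_siginfo_handler(int sig, siginfo_t *info, void *context)
{
    int saved_errno = errno;
    siginfo_record rec;
    ssize_t n;

    (void)context;
    if (info == NULL || siginfo_wfd < 0)
        return;
    rec.signo = sig;
    rec.code = info->si_code;

    n = write(siginfo_wfd, &rec, sizeof rec);
    if (n == (ssize_t)sizeof rec)
        siginfo_pending++;          /* wrote 100% */
    else if (n <= 0)
        siginfo_lost++;             /* wrote nothing: assume pipe is full */
    else
        abort();                    /* partial write: stand-in for a panic */
    errno = saved_errno;
}

/* Step 2: create the pipe on first use, then install the handler. */
static int siginfo_install(int sig)
{
    struct sigaction sa;

    if (siginfo_rfd < 0) {
        int fds[2];
        if (pipe(fds) != 0)
            return -1;
        if (fcntl(fds[0], F_SETFL, O_NONBLOCK) == -1
            || fcntl(fds[1], F_SETFL, O_NONBLOCK) == -1) {
            close(fds[0]);
            close(fds[1]);
            return -1;
        }
        siginfo_rfd = fds[0];
        siginfo_wfd = fds[1];
    }
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = safe_siginfo_handler;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(sig, &sa, NULL);
}

/* Step 4: called from normal context. Returns 1 if a record was read,
 * 0 if nothing is pending, -1 on a read error. */
static int siginfo_next(siginfo_record *out)
{
    sigset_t all, old;

    if (siginfo_pending == 0)
        return 0;
    if (read(siginfo_rfd, out, sizeof *out) != (ssize_t)sizeof *out)
        return -1;
    /* Block signals so the handler cannot race with the decrement. */
    sigfillset(&all);
    sigprocmask(SIG_BLOCK, &all, &old);
    siginfo_pending--;
    sigprocmask(SIG_SETMASK, &old, NULL);
    return 1;
}
```

In the real interpreter the C<siginfo_next> step would live in the
C<PERL_ASYNC_CHECK()> path and would rebuild the Perl-level structures from
the record rather than hand it back to C code.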

For more information see the thread starting with this message:
http://www.xray.mpe.mpg.de/mailing-lists/perl5-porters/2008-03/msg00305.html

=head2 autovivification

Make all autovivification consistent w.r.t. LVALUE/RVALUE and strict/no strict.

This task is incremental - even a little bit of work on it will help.

=head2 Unicode in Filenames

chdir, chmod, chown, chroot, exec, glob, link, lstat, mkdir, open,
opendir, qx, readdir, readlink, rename, rmdir, stat, symlink, sysopen,
system, truncate, unlink, utime, -X. All these could potentially accept
Unicode filenames either as input or output (and in the case of system
and qx Unicode in general, as input or output to/from the shell).
Whether a given filesystem/operating system pair understands Unicode in
filenames varies.

Known combinations that have some level of understanding include
Microsoft NTFS, Apple HFS+ (in Mac OS 9 and X), Apple UFS (in Mac
OS X), and of course Plan 9; NFS v4 is rumored to be Unicode. How to
create Unicode filenames, what forms of Unicode are accepted and used
(UCS-2, UTF-16, UTF-8), what (if any) is the normalization form used,
and so on, varies. Finding the right level of interfacing to Perl
requires some thought. Remember that an OS does not imply a particular
filesystem.

(The Windows -C command flag "wide API support" has been at least
temporarily retired in 5.8.1, and the -C has been repurposed, see
L<perlrun>.)

Most probably the right way to do this would be this:
L</"Virtualize operating system access">.

=head2 Unicode in %ENV

Currently the %ENV entries are always byte strings.
See L</"Virtualize operating system access">.

=head2 Unicode and glob()

Currently glob patterns and filenames returned from File::Glob::glob()
are always byte strings. See L</"Virtualize operating system access">.

=head2 use less 'memory'

Investigate trade-offs to switch out perl's choices on memory usage.
Particularly perl should be able to give memory back.

This task is incremental - even a little bit of work on it will help.

=head2 Re-implement C<:unique> in a way that is actually thread-safe

The old implementation made bad assumptions on several levels. A good 90%
solution might be just to make C<:unique> work to share the string buffer
of SvPVs. That way large constant strings can be shared between ithreads,
such as the configuration information in F<Config>.

=head2 Make tainting consistent

Tainting would be easier to use if it didn't take documented shortcuts and
allow taint to "leak" everywhere within an expression.

=head2 readpipe(LIST)

system() accepts a LIST syntax (and a PROGRAM LIST syntax) to avoid
running a shell. readpipe() (the function behind qx//) could be similarly
extended.

=head2 Audit the code for destruction ordering assumptions

Change 25773 notes

    /* Need to check SvMAGICAL, as during global destruction it may be that
       AvARYLEN(av) has been freed before av, and hence the SvANY() pointer
       is now part of the linked list of SV heads, rather than pointing to
       the original body. */
    /* FIXME - audit the code for other bugs like this one. */

adding the C<SvMAGICAL> check to

    if (AvARYLEN(av) && SvMAGICAL(AvARYLEN(av))) {
        MAGIC *mg = mg_find (AvARYLEN(av), PERL_MAGIC_arylen);

Go through the core and look for similar assumptions that SVs have particular
types, as all bets are off during global destruction.

=head2 Extend PerlIO and PerlIO::Scalar

PerlIO::Scalar doesn't know how to truncate(). Implementing this
would require extending the PerlIO vtable.

Similarly the PerlIO vtable doesn't know about formats (write()), or
about stat(), or chmod()/chown(), utime(), or flock().

(For PerlIO::Scalar it's hard to see what e.g. mode bits or ownership
would mean.)

PerlIO doesn't do directories or symlinks, either: mkdir(), rmdir(),
opendir(), closedir(), seekdir(), rewinddir(), glob(); symlink(),
readlink().

See also L</"Virtualize operating system access">.

=head2 -C on the #! line

It should be possible to make -C work correctly if found on the #! line,
given that all perl command line options are strict ASCII, and -C changes
only the interpretation of non-ASCII characters, and not for the script file
handle. To make it work needs some investigation of the ordering of function
calls during startup, and (by implication) a bit of tweaking of that order.

=head2 Organize error messages

Perl's diagnostics (error messages, see L<perldiag>) could use
reorganizing and formalizing so that each error message has its
stable-for-all-eternity unique id, categorized by severity, type, and
subsystem. (The error messages would be listed in a datafile outside
of the Perl source code, and the source code would only refer to the
messages by the id.) This clean-up and regularizing should apply
for all croak() messages.

This would enable all sorts of things: easier translation/localization
of the messages (though please do keep in mind the caveats of
L<Locale::Maketext> about too straightforward approaches to
translation), filtering by severity, and instead of grepping for a
particular error message one could look for a stable error id. (Of
course, changing the error messages by default would break all the
existing software depending on some particular error message...)

This kind of functionality is known as I<message catalogs>. Look for
inspiration for example in the catgets() system, possibly even use it
if available, but B<only> if available: B<not> all platforms will
have catgets().

For the really pure at heart, consider extending this item to cover
also the warning messages (see L<perllexwarn>, C<warnings.pl>).
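
As a rough illustration of what "refer to the messages by the id" might mean
on the C side, here is a small table-driven sketch. Everything in it is an
assumption made for the example: the id format (C<PERL-F-0001> and so on), the
C<msg_entry> layout, the subsystem names and the function names are invented.
The message texts are the diagnostics quoted elsewhere in this document. It
does not use catgets(); a translation layer could be slotted in where the
default text is returned:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum { SEV_WARNING, SEV_FATAL } msg_severity;

/* One catalogue entry. The id is the stable part; the text may change. */
typedef struct {
    const char  *id;         /* hypothetical stable identifier */
    msg_severity severity;
    const char  *subsystem;  /* hypothetical subsystem tag */
    const char  *text;       /* default format string */
} msg_entry;

/* In the proposal this table would be generated from an external datafile. */
static const msg_entry msg_catalog[] = {
    { "PERL-F-0001", SEV_FATAL,   "runtime",
      "Can't find label %s" },
    { "PERL-W-0001", SEV_WARNING, "compile",
      "Name \"%s\" used only once: possible typo" },
    { "PERL-W-0002", SEV_WARNING, "runtime",
      "Use of uninitialized value in %s" },
};

#define MSG_COUNT (sizeof msg_catalog / sizeof msg_catalog[0])

/* Look a message up by its stable id; returns NULL if the id is unknown. */
static const msg_entry *msg_lookup(const char *id)
{
    size_t i;

    assert(id != NULL);
    for (i = 0; i < MSG_COUNT; i++)
        if (strcmp(msg_catalog[i].id, id) == 0)
            return &msg_catalog[i];
    return NULL;
}

/* Filtering by severity: count the entries of a given severity. */
static size_t msg_count_severity(msg_severity sev)
{
    size_t i, n = 0;

    for (i = 0; i < MSG_COUNT; i++)
        if (msg_catalog[i].severity == sev)
            n++;
    return n;
}
```

With such a table, code that currently greps for the text of a diagnostic
could compare ids instead, and the wording could change without breaking it.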

=head1 Tasks that need a knowledge of the interpreter

These tasks would need C knowledge, and knowledge of how the interpreter works,
or a willingness to learn.

=head2 forbid labels with keyword names

Currently C<goto keyword> "computes" the label value:

    $ perl -e 'goto print'
    Can't find label 1 at -e line 1.

It is controversial whether the right way to avoid the confusion is to forbid
labels with keyword names, or whether it would be better to always treat
bareword expressions after a "goto" as a label and never as a keyword.

=head2 truncate() prototype

The prototype of truncate() is currently C<$$>. It should probably
be C<*$> instead. (This is changed in F<opcode.pl>.)

=head2 decapsulation of smart match argument

Currently C<$foo ~~ $object> will die with the message "Smart matching a
non-overloaded object breaks encapsulation". It would be nice to allow
bypassing this by explicitly using the syntax C<$foo ~~ %$object> or
C<$foo ~~ @$object>.

=head2 error reporting of [$a ; $b]

Using C<;> inside brackets is a syntax error, and we don't propose to change
that by giving it any meaning. However, it's not reported very helpfully:

    $ perl -e '$a = [$b; $c];'
    syntax error at -e line 1, near "$b;"
    syntax error at -e line 1, near "$c]"
    Execution of -e aborted due to compilation errors.

It should be possible to hook into the tokeniser or the lexer, so that when a
C<;> is parsed where it is not legal as a statement terminator (i.e. inside
C<{}> used as a hashref, C<[]> or C<()>) it issues an error something like
I<';' isn't legal inside an expression - if you need multiple statements use a
do {...} block>. See the thread starting at
http://www.xray.mpe.mpg.de/mailing-lists/perl5-porters/2008-09/msg00573.html

=head2 lexicals used only once

This warns:

    $ perl -we '$pie = 42'
    Name "main::pie" used only once: possible typo at -e line 1.

This does not:

    $ perl -we 'my $pie = 42'

Logically all lexicals used only once should warn, if the user asks for
warnings. An unworked RT ticket (#5087) has been open for almost seven
years for this discrepancy.

=head2 UTF-8 revamp

The handling of Unicode is unclean in many places. In the regex engine
there are especially many problems. The swash data structure could be
replaced by something better. Inversion lists and maps are likely
candidates. The whole Unicode database could be placed in-core for a
huge speed-up. Only minimal work was done on the optimizer when utf8
was added, with the result that the synthetic start class often will
fail to narrow down the possible choices when given non-Latin1 input.

=head2 Properly Unicode safe tokeniser and pads.

The tokeniser isn't actually very UTF-8 clean. C<use utf8;> is a hack -
variable names are stored in stashes as raw bytes, without the utf-8 flag
set. The pad API only takes a C<char *> pointer, so that's all bytes too. The
tokeniser ignores the UTF-8-ness of C<PL_rsfp>, or any SVs returned from
source filters. All this could be fixed.

=head2 state variable initialization in list context

Currently this is illegal:

    state ($a, $b) = foo();

In Perl 6, C<state ($a) = foo();> and C<(state $a) = foo();> have different
semantics, which is tricky to implement in Perl 5 as currently they produce
the same opcode trees. The Perl 6 design is firm, so it would be good to
implement the necessary code in Perl 5. There are comments in
C<Perl_newASSIGNOP()> that show the code paths taken by various assignment
constructions involving state variables.

=head2 Implement $value ~~ 0 .. $range

It would be nice to extend the syntax of the C<~~> operator to also
understand numeric (and maybe alphanumeric) ranges.

=head2 A does() built-in

Like ref(), only useful. It would call the C<DOES> method on objects; it
would also tell whether something can be dereferenced as an
array/hash/etc., or used as a regexp, etc.
L<http://www.xray.mpe.mpg.de/mailing-lists/perl5-porters/2007-03/msg00481.html>

=head2 Tied filehandles and write() don't mix

There is no method on tied filehandles to allow them to be called back by
formats.

=head2 Propagate compilation hints to the debugger

Currently a debugger started with -dE on the command-line doesn't see the
features enabled by -E. More generally hints (C<$^H> and C<%^H>) aren't
propagated to the debugger. Probably it would be a good thing to propagate
hints from the innermost non-C<DB::> scope: this would make code eval'ed
in the debugger see the features (and strictures, etc.) currently in
scope.

=head2 Attach/detach debugger from running program

The old perltodo notes "With C<gdb>, you can attach the debugger to a running
program if you pass the process ID. It would be good to do this with the Perl
debugger on a running Perl program, although I'm not sure how it would be
done." ssh and screen do this with named pipes in /tmp. Maybe we can too.

=head2 LVALUE functions for lists

The old perltodo notes that lvalue functions don't work for list or hash
slices. This would be good to fix.

=head2 regexp optimiser optional

The regexp optimiser is not optional. It should be configurable to be
optional, to allow its performance to be measured, and its bugs to be easily
demonstrated.

=head2 C</w> regex modifier

That flag would enable matching whole words, and also interpolating
arrays as alternations. With it, C</P/w> would be roughly equivalent to:

    do { local $"='|'; /\b(?:P)\b/ }

See L<http://www.xray.mpe.mpg.de/mailing-lists/perl5-porters/2007-01/msg00400.html>
for the discussion.

=head2 optional optimizer

Make the peephole optimizer optional. Currently it performs two tasks as
it walks the optree - genuine peephole optimisations, and necessary fixups of
ops. It would be good to find an efficient way to switch out the
optimisations whilst keeping the fixups.

=head2 You WANT *how* many

Currently contexts are void, scalar and list. split has a special mechanism in
place to pass in the number of return values wanted. It would be useful to
have a general mechanism for this, one that is backwards compatible and incurs
little speed hit. This would allow proposals such as short circuiting sort to
be implemented as a module on CPAN.

=head2 lexical aliases

Allow lexical aliases (maybe via the syntax C<my \$alias = \$foo>).

=head2 entersub XS vs Perl

At the moment pp_entersub is huge, and has code to deal with entering both
perl and XS subroutines. Subroutine implementations rarely change between
perl and XS at run time, so investigate using 2 ops to enter subs (one for
XS, one for perl) and swap between if a sub is redefined.

=head2 Self-ties

Self-ties are currently illegal because they caused too many segfaults. Maybe
the causes of these could be tracked down and self-ties on all types
reinstated.

=head2 Optimize away @_

The old perltodo notes "Look at the "reification" code in C<av.c>".

=head2 Virtualize operating system access

Implement a set of "vtables" that virtualizes operating system access
(open(), mkdir(), unlink(), readdir(), getenv(), etc.). At the very
least these interfaces should take SVs as "name" arguments instead of
bare char pointers; probably the most flexible and extensible way
would be for the Perl-facing interfaces to accept HVs. The system
needs to be per-operating-system and per-file-system
hookable/filterable, preferably both from XS and Perl level.
(L<perlport/"Files and Filesystems"> is good reading at this point;
in fact, all of L<perlport> is.)

This has actually already been implemented (but only for Win32),
take a look at F<iperlsys.h> and F<win32/perlhost.h>. While all Win32
variants go through a set of "vtables" for operating system access,
non-Win32 systems currently go straight for the POSIX/Unix-style
system/library call. A system similar to the Win32 one should be
implemented for all platforms. The existing Win32 implementation
probably does not need to survive alongside this proposed new
implementation, the approaches could be merged.

What would this give us? One often-asked-for feature this would
enable is using Unicode for filenames, and other "names" like %ENV,
usernames, hostnames, and so forth.
(See L<perlunicode/"When Unicode Does Not Happen">.)

But this kind of virtualization would also allow for things like
virtual filesystems, virtual networks, and "sandboxes" (though as long
as dynamic loading of random object code is allowed, not very safe
sandboxes, since external code of course knows nothing of Perl's vtables).
An example of a smaller "sandbox" is that this feature can be used to
implement per-thread working directories: Win32 already does this.

See also L</"Extend PerlIO and PerlIO::Scalar">.
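
To make the "vtable" idea concrete, here is a deliberately tiny sketch with
one table of function pointers and two implementations: one that goes
straight to the host's POSIX calls, and a "sandbox" that keeps its own notion
of the working directory (the per-thread working directory case mentioned
above). The C<os_vtable> layout and all function names are assumptions made
for this example; they are not the interfaces in F<iperlsys.h>, and the
sketch takes plain C<char *> names rather than SVs:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical table of operating system entry points. */
typedef struct {
    const char *(*env_get)(void *ctx, const char *name);
    int         (*dir_change)(void *ctx, const char *path);
    const char *(*dir_current)(void *ctx);
    void       *ctx;   /* implementation-private state */
} os_vtable;

/* Host implementation: forwards to the POSIX/Unix-style calls. */
static char host_cwd[4096];

static const char *host_env_get(void *ctx, const char *name)
{
    (void)ctx;
    return getenv(name);
}

static int host_dir_change(void *ctx, const char *path)
{
    (void)ctx;
    return chdir(path);
}

static const char *host_dir_current(void *ctx)
{
    (void)ctx;
    return getcwd(host_cwd, sizeof host_cwd);
}

static os_vtable os_host(void)
{
    os_vtable vt = { host_env_get, host_dir_change, host_dir_current, NULL };
    return vt;
}

/* Sandbox implementation: a private working directory and a fixed
 * environment. Nothing here touches the real process state. */
typedef struct {
    char cwd[256];
} sandbox;

static const char *sandbox_env_get(void *ctx, const char *name)
{
    (void)ctx;
    return strcmp(name, "HOME") == 0 ? "/sandbox" : NULL;
}

static int sandbox_dir_change(void *ctx, const char *path)
{
    sandbox *sb = ctx;

    /* This sketch accepts absolute paths only and does no validation. */
    if (path[0] != '/' || strlen(path) >= sizeof sb->cwd)
        return -1;
    strcpy(sb->cwd, path);
    return 0;
}

static const char *sandbox_dir_current(void *ctx)
{
    return ((sandbox *)ctx)->cwd;
}

static os_vtable os_sandbox(sandbox *sb)
{
    os_vtable vt = { sandbox_env_get, sandbox_dir_change,
                     sandbox_dir_current, NULL };

    assert(sb != NULL);
    strcpy(sb->cwd, "/");
    vt.ctx = sb;
    return vt;
}
```

Interpreter code would call through the table, for example
C<vt.dir_change(vt.ctx, path)>, and never call C<chdir> directly; two
sandboxes then hold independent working directories without changing the
process-wide one.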

=head2 Investigate PADTMP hash pessimisation

The peephole optimiser converts constants used for hash key lookups to shared
hash key scalars. Under ithreads, something is undoing this work.
See http://www.xray.mpe.mpg.de/mailing-lists/perl5-porters/2007-09/msg00793.html

=head2 Store the current pad in the OP slab allocator

=for clarification
I hope that I got that "current pad" part correct

Currently we leak ops in various cases of parse failure. I suggested that we
could solve this by always using the op slab allocator, and walking it to
free ops. Dave comments that as some ops are already freed during optree
creation one would have to mark which ops are freed, and not double free them
when walking the slab. He notes that one problem with this is that for some ops
you have to know which pad was current at the time of allocation, which does
change. I suggested storing a pointer to the current pad in the memory allocated
for the slab, and swapping to a new slab each time the pad changes. Dave thinks
that this would work.

=head2 repack the optree

Repacking the optree after execution order is determined could allow
removal of NULL ops, and optimal ordering of OPs with respect to cache-line
filling. The slab allocator could be reused for this purpose. I think that
the best way to do this is to make it an optional step just before the
completed optree is attached to anything else, and to use the slab allocator
unchanged, so that freeing ops is identical whether or not this step runs.
Note that the slab allocator allocates ops downwards in memory, so one would
have to actually "allocate" the ops in reverse-execution order to get them
contiguous in memory in execution order.

See http://www.nntp.perl.org/group/perl.perl5.porters/2007/12/msg131975.html

Note that running this copy, and then freeing all the old location ops would
cause their slabs to be freed, which would eliminate possible memory wastage if
the previous suggestion is implemented, and we swap slabs more frequently.
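
The point about allocating in reverse-execution order can be shown with a toy
model. The C<op> and C<slab> types below are simplified stand-ins invented for
this example (the real structures are much richer), and the fixed slab size is
arbitrary. The allocator hands out slots from the top of the slab downwards,
so the copy walks the chain once to record execution order, drops null ops,
and then allocates from the last op to the first:

```c
#include <assert.h>
#include <stddef.h>

#define SLAB_OPS 8
#define OP_NULL  0      /* stand-in for a nulled op */

/* Toy op: just a type and the next op in execution order. */
typedef struct op {
    struct op *next;
    int        type;
} op;

/* Toy slab: slots are handed out from the highest index downwards. */
typedef struct {
    op     slots[SLAB_OPS];
    size_t top;         /* one past the next slot to hand out */
} slab;

static void slab_init(slab *s)
{
    s->top = SLAB_OPS;
}

/* Each call returns a lower address than the previous one. */
static op *slab_alloc(slab *s)
{
    if (s->top == 0)
        return NULL;
    return &s->slots[--s->top];
}

/* Copy the chain starting at 'first' into 'dst', skipping null ops.
 * Returns the head of the copy, or NULL if it does not fit. */
static op *repack(slab *dst, const op *first)
{
    const op *order[SLAB_OPS];
    op *copy_next = NULL;
    size_t n = 0, i;

    assert(dst != NULL);
    for (; first; first = first->next) {
        if (first->type == OP_NULL)
            continue;                  /* removed by the repack */
        if (n == SLAB_OPS)
            return NULL;
        order[n++] = first;
    }
    /* Allocate last-to-first so the copies ascend in execution order. */
    for (i = n; i-- > 0; ) {
        op *copy = slab_alloc(dst);
        if (!copy)
            return NULL;
        copy->type = order[i]->type;
        copy->next = copy_next;
        copy_next = copy;
    }
    return copy_next;
}
```

After C<repack>, following C<next> from the returned head visits adjacent,
ascending addresses, which is the cache-friendly layout the text is after.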

=head2 eliminate incorrect line numbers in warnings

This code

    use warnings;
    my $undef;

    if ($undef == 3) {
    } elsif ($undef == 0) {
    }

used to produce this output:

    Use of uninitialized value in numeric eq (==) at wrong.pl line 4.
    Use of uninitialized value in numeric eq (==) at wrong.pl line 4.

where the line of the second warning was misreported - it should be line 5.
Rafael fixed this - the problem arose because there was no nextstate OP
between the execution of the C<if> and the C<elsif>, hence C<PL_curcop> still
reports that the currently executing line is line 4. The solution was to inject
a nextstate OP for each C<elsif>, although it turned out that the nextstate
OP needed to be a nulled OP, rather than a live nextstate OP, else other line
numbers became misreported. (Jenga!)

The problem is more general than C<elsif> (although the C<elsif> case is the
most common and the most confusing). Ideally this code

    use warnings;
    my $undef;

    my $a = $undef + 1;
    my $b
      = $undef
      + 1;

would produce this output

    Use of uninitialized value $undef in addition (+) at wrong.pl line 4.
    Use of uninitialized value $undef in addition (+) at wrong.pl line 7.

(rather than lines 4 and 5), but this would seem to require every OP to carry
(at least) line number information.

What might work is to have an optional line number in memory just before the
BASEOP structure, with a flag bit in the op to say whether it's present.
Initially during compile every OP would carry its line number. Then add a late
pass to the optimiser (potentially combined with L</repack the optree>) which
looks at the two ops on every edge of the graph of the execution path. If
the line number changes, flag the destination OP with this information.
Once all paths are traced, replace every flagged op with a
nextstate-light op (that just updates C<PL_curcop>), which in turn then passes
control on to the true op. All ops would then be replaced by variants that
do not store the line number. (Which, logically, is why it would work best in
conjunction with L</repack the optree>, as that is already copying/reallocating
all the OPs.)

(Although I should note that we're not certain that doing this for the general
case is worth it.)
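
The edge-walking pass described above might look something like the following
toy version. It is a sketch under heavy assumptions: the C<lop> structure, its
field names and C<flag_line_changes> are invented for the example, each op has
at most two successors, and all ops are assumed to sit in one array. It only
performs the flagging step; inserting the nextstate-light ops and stripping
the line numbers are left out:

```c
#include <assert.h>
#include <stddef.h>

/* Toy op carrying its compile-time line number. */
typedef struct lop {
    struct lop *next;         /* normal successor, or NULL */
    struct lop *other;        /* second successor of a branching op, or NULL */
    unsigned    line;         /* line recorded while compiling */
    int         needs_state;  /* set if some edge into this op changes line */
} lop;

/* Examine every edge (op -> successor). Where the line number differs
 * across the edge, flag the destination. Returns the number of ops flagged. */
static size_t flag_line_changes(lop *ops, size_t n)
{
    size_t i, j, flagged = 0;

    assert(ops != NULL || n == 0);
    for (i = 0; i < n; i++)
        ops[i].needs_state = 0;

    for (i = 0; i < n; i++) {
        lop *succ[2];

        succ[0] = ops[i].next;
        succ[1] = ops[i].other;
        for (j = 0; j < 2; j++) {
            if (succ[j] && succ[j]->line != ops[i].line
                && !succ[j]->needs_state) {
                succ[j]->needs_state = 1;
                flagged++;
            }
        }
    }
    return flagged;
}
```

Only the flagged ops would then need a nextstate-light op in front of them. An
op that is the entry point of the graph has no incoming edge, so a real pass
would have to treat it as flagged as well.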

=head2 optimize tail-calls

Tail-calls present an opportunity for broadly applicable optimization;
anywhere that C<< return foo(...) >> is called, the outer return can
be replaced by a goto, and foo will return directly to the outer
caller, saving (conservatively) 25% of perl's call&return cost, which
is relatively higher than in C. The Scheme language is known to do
this heavily. B::Concise provides good insight into where this
optimization is possible, i.e. anywhere an entersub, leavesub op sequence
occurs.

    perl -MO=Concise,-exec,a,b,-main -e 'sub a{ 1 }; sub b {a()}; b(2)'

Bottom line on this is probably a new pp_tailcall function which
combines the code in pp_entersub, pp_leavesub. This should probably
be done first in XS, and using B::Generate to patch the new OP into the
optrees.

=head2 Add C<0oddddd>

It has been proposed that octal constants be specifiable through the syntax
C<0oddddd>, parallel to the existing construct to specify hex constants
C<0xddddd>.
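
As a small illustration of the proposed spelling, the function below accepts
exactly a C<0o> prefix followed by one or more octal digits. It is a
standalone sketch, not tokeniser code: C<parse_0o> is a made-up name, and it
makes assumptions the proposal does not settle, such as rejecting an
upper-case C<0O> prefix and not allowing underscores between digits:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Parse "0o" followed by octal digits. On success stores the value in *out
 * and returns 0; returns -1 for anything else, including overflow. */
static int parse_0o(const char *s, unsigned long *out)
{
    unsigned long v = 0;
    size_t i;

    assert(s != NULL && out != NULL);
    if (s[0] != '0' || s[1] != 'o' || s[2] == '\0')
        return -1;
    for (i = 2; s[i] != '\0'; i++) {
        if (s[i] < '0' || s[i] > '7')
            return -1;                 /* not an octal digit */
        if (v > (ULONG_MAX >> 3))
            return -1;                 /* the next shift would overflow */
        v = (v << 3) | (unsigned long)(s[i] - '0');
    }
    *out = v;
    return 0;
}
```

For example, C<parse_0o("0o755", &v)> succeeds and stores 493, while the
existing spelling C<0755> is rejected by this particular function because it
lacks the new prefix.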

=head1 Big projects

Tasks that will get your name mentioned in the description of the "Highlights
of 5.14".

=head2 make ithreads more robust

Generally make ithreads more robust. See also L</iCOW>.

This task is incremental - even a little bit of work on it will help, and
will be greatly appreciated.

One bit would be to determine how to clone directory handles on systems
without a C<fchdir> function (in sv.c:Perl_dirp_dup).

Fix Perl_sv_dup, et al so that threads can return objects.

=head2 iCOW

Sarathy and Arthur have a proposal for an improved Copy On Write which
specifically will be able to COW new ithreads. If this can be implemented
it would be a good thing.

=head2 (?{...}) closures in regexps

Fix (or rewrite) the implementation of the C</(?{...})/> closures.

=head2 Add class set operations to regexp engine

Apparently these are quite useful. Anyway, Jeffrey Friedl wants them.

demerphq has this on his todo list, but right at the bottom.

=head1 Tasks for microperl

[ Each and every one of these may be obsolete, but they were listed
 in the old Todo.micro file]

=head2 make creating uconfig.sh automatic

=head2 make creating Makefile.micro automatic

=head2 do away with fork/exec/wait?

(system, popen should be enough?)

=head2 some of the uconfig.sh really needs to be probed (using cc) in buildtime:

(uConfigure? :-) native datatype widths and endianness come to mind
1278