Unicode support finish byte <-> utf8 and localencoding <-> utf8 conversions make substr($bytestr,0,0,$charstr) do the right conversion add Unicode::Map equivivalent to core add support for I/O disciplines - a way to specify disciplines when opening things: open(F, "<:crlf :utf16", $file) - a way to specify disciplines for an already opened handle: binmode(STDIN, ":slurp :raw") - a way to set default disciplines for all handle constructors: use open IN => ":any", OUT => ":utf8", SYS => ":utf16" eliminate need for "use utf8;" autoload byte.pm when byte:: is seen by the parser check uv_to_utf8() calls for buffer overflow make \uXXXX (and \u{XXXX}?) where XXXX are hex digits to work similarly to Unicode tech reports and Java notation \uXXXX (and already existing \x{XXXX))? more than four hexdigits? make also \U+XXXX work? overloadable regex assertions? e.g. in Thai \b cannot be deduced by any simple character class boundary rules, word boundaries must algorithmically computed see ext/Encode/Todo for notes and references about proper detection of malformed UTF-8 SCSU? http://www.unicode.org/unicode/reports/tr6/ Collation? http://www.unicode.org/unicode/reports/tr10/ Normalization? http://www.unicode.org/unicode/reports/tr15/ EBCDIC? http://www.unicode.org/unicode/reports/tr16/ Regexes? http://www.unicode.org/unicode/reports/tr18/ Case Mappings? http://www.unicode.org/unicode/reports/tr21/ See also "Locales", "Regexen", and "Miscellaneous". Multi-threading support "use Thread;" under useithreads add mechanism to: - create new interpreter in a different thread - exchange data between interpreters/threads - share namespaces between interpreters/threads work out consistent semantics for exit/die in threads support for externally created threads? Thread::Pool? Compiler auto-produce executable typed lexicals should affect B::CC::load_pad workarounds to help Win32 END blocks need saving in compiled output _AUTOLOAD prodding fix comppadlist (names in comppad_name can have fake SvCUR from where newASSIGNOP steals the field) Namespace cleanup CPP-space: restrict what we export from headers when !PERL_CORE header-space: move into CORE/perl/? API-space: complete the list of things that constitute public api Configure make configuring+building away from source directory work (VPATH et al) this is related to: cross-compilation configuring (see Todo) _r support (see Todo for mode detailed description) POSIX 1003.1 1996 Edition support--realtime stuff: POSIX semaphores, message queues, shared memory, realtime clocks, timers, signals (the metaconfig units mostly already exist for these) PREFERABLY AS AN EXTENSION UNIX98 support: reader-writer locks, realtime/asynchronous IO PREFERABLY AS AN EXTENSION IPv6 support: see RFC2292, RFC2553 PREFERABLY AS AN EXTENSION there already is Socket6 in CPAN Long doubles figure out where the PV->NV->PV conversion gets it wrong at least in AIX and Tru64 (V5.0 and onwards) when using long doubles: see the regexp tricks we had to insert to t/comp/use.t and t/lib/bigfltpm.t, (?:9|8999\d+) and the like. 64-bit support Configure probe for quad_t, uquad_t, and (argh) u_quad_t, they might be in some systems the only thing working as quadtype and uquadtype. more pain: long_long, u_long_long. Locales deprecate traditional/legacy locales? How do locales work across packages? figure out how to support Unicode locales suggestion: integrate the IBM Classes for Unicode (ICU) http://oss.software.ibm.com/developerworks/opensource/icu/project/ ICU is "portable, open-source Unicode library with: charset-independent locales (with multiple locales simultaneously supported in same thread; character conversions; formatting/parsing for numbers, currencies, date/time and messages; message catalogs (resources); transliteration, collation, normalization, and text boundaries (grapheme, word, line-break))". Check out also the Locale Converter: http://alphaworks.ibm.com/tech/localeconverter There is also the iconv interface, either from XPG4 or GNU (glibc). iconv is about character set conversions. Either ICU or iconv would be valuable to get integrated into Perl, Configure already probes for libiconv and . Regexen make RE engine thread-safe a way to do full character set arithmetics: now one can do addition, negate a whole class, and negate certain subclasses (e.g. \D, [:^digit:]), but a more generic way to add/subtract/ intersect characters/classes, like described in the Unicode technical report on Regular Expression Guidelines, http://www.unicode.org/unicode/reports/tr18/ (amusingly, the TR notes that difference and intersection can be done using "Perl-style look-ahead") difference syntax? maybe [[:alpha:][^abc]] meaning "all alphabetic expect a, b, and c"? or [[:alpha:]-[abc]]? (maybe bad, as we explicitly disallow such 'ranges') intersection syntax? maybe [[..]&[...]]? POSIX [=bar=] and [.zap.] would nice too but there's no API for them =bar= could be done with Unicode, though, see the Unicode TR #15 about normalization forms: http://www.unicode.org/unicode/reports/tr15/ this is also a part of the Unicode 3.0: http://www.unicode.org/unicode/uni2book/u2.html executive summary: there are several different levels of 'equivalence' trie optimization: factor out common suffixes (and prefixes?) from |-alternating groups (both for exact strings and character classes, use lookaheads?) approximate matching Security use fchown, fchmod (and futimes?) internally when possible use fchdir(how portable?) create secure reliable portable temporary file modules audit the standard utilities for security problems and fix them Reliable Signals custom opcodes alternate runops() for signal despatch figure out how to die() in delayed sighandler make Thread::Signal work under useithreads Win32 stuff sort out the spawnvp() mess for system('a','b','c') compatibility work out DLL versioning Miscellaneous introduce @( and @) because group names can have spaces add new modules (Archive::Tar, Compress::Zlib, CPAN::FTP?) sub-second sleep()? alarm()? time()? (integrate Time::HiRes? Configure doesn't yet probe for usleep/nanosleep/ualarm but the units exist) floating point handling: nans, infinities, fp exception masks, etc. At least the following interfaces exist: fp_classify(), fp_class(), class(), isinf(), isfinite(), finite(), isnormal(), unordered(), , (there are metaconfig units for all these), fp_setmask(), fp_getmask(), fp_setround(), fp_getround() (no metaconfig units yet for these). Don't forget finitel(), fp_classl(), fp_class_l(), (yes, both do, unfortunately, exist), and unorderedl(). PREFERABLY AS AN EXTENSION. As of 5.6.1 there is cpp macro Perl_isnan(). fix the basic arithmetics (+ - * / %) to preserve IVness/UVness if both arguments are IVs/UVs: it sucks that one cannot see the 'carry flag' (or equivalent) of the CPU from C, C is too high-level... replace pod2html with new PodtoHtml? (requires other modules from CPAN) automate testing with large parts of CPAN turn Cwd into an XS module? (Configure already probes for getcwd()) mmap for speeding up input? (Configure already probes for the mmap family) sendmsg, recvmsg? (Configure doesn't probe for these but the units exist) setitimer, getitimer? (the metaconfig units exist) Ongoing keep filenames 8.3 friendly, where feasible upgrade to newer versions of all independently maintained modules comprehensive perldelta.pod Documentation describe new age patterns update perl{guts,call,embed,xs} with additions, changes to API convert more examples to use autovivified filehandles document Win32 choices spot-check all new modules for completeness better docs for pack()/unpack() reorg tutorials vs. reference sections make roffitall to be dynamical about its pods and libs