Commit | Line | Data |
---|---|---|
cf6c151c RGS |
1 | =head1 NAME |
2 | ||
3 | perldelta - what is new for perl 5.10.0 | |
4 | ||
5 | =head1 DESCRIPTION | |
6 | ||
7 | This document describes the differences between the 5.8.8 release and | |
8 | the 5.10.0 release. | |
9 | ||
10 | Many of the bug fixes in 5.10.0 were already seen in the 5.8.X maintenance | |
11 | releases; they are not duplicated here and are documented in the set of | |
12 | man pages named perl58[1-8]?delta. | |
13 | ||
cf6c151c RGS |
14 | =head1 Core Enhancements |
15 | ||
16 | =head2 The C<feature> pragma | |
17 | ||
18 | The C<feature> pragma is used to enable new syntax that would break Perl's | |
19 | backwards-compatibility with older releases of the language. It's a lexical | |
20 | pragma, like C<strict> or C<warnings>. | |
21 | ||
22 | Currently the following new features are available: C<switch> (adds a | |
23 | switch statement), C<say> (adds a C<say> built-in function), and C<state> | |
24 | (adds a C<state> keyword for declaring "static" variables). Those | |
25 | features are described in their own sections of this document. | |
26 | ||
27 | The C<feature> pragma is also implicitly loaded when you require a minimal | |
28 | perl version (with the C<use VERSION> construct) greater than, or equal | |
29 | to, 5.9.5. See L<feature> for details. | |
30 | ||
31 | =head2 New B<-E> command-line switch | |
32 | ||
33 | B<-E> is equivalent to B<-e>, but it implicitly enables all | |
34 | optional features (like C<use feature ":5.10">). | |
35 | ||
36 | =head2 Defined-or operator | |
37 | ||
38 | A new operator C<//> (defined-or) has been implemented. | |
dbef3c66 | 39 | The following expression: |
cf6c151c RGS |
40 | |
41 | $a // $b | |
42 | ||
43 | is equivalent to | |
44 | ||
45 | defined $a ? $a : $b | |
46 | ||
dbef3c66 | 47 | and the statement |
cf6c151c RGS |
48 | |
49 | $c //= $d; | |
50 | ||
51 | can now be used instead of | |
52 | ||
53 | $c = $d unless defined $c; | |
54 | ||
55 | The C<//> operator has the same precedence and associativity as C<||>. | |
56 | Special care has been taken to ensure that this operator Does What You Mean | |
57 | while not breaking old code, but some edge cases involving the empty | |
58 | regular expression may now parse differently. See L<perlop> for | |
59 | details. | |
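
The difference between C<//> and C<||> can be sketched as follows. This is a minimal illustration only; the C<label> helper and the C<%opt> hash are hypothetical names invented for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: return $value, or 'none' if it is undefined.
sub label {
    my ($value) = @_;
    return $value // 'none';
}

# // only falls through on undef, so false-but-defined values survive.
my @with_dor = (label(0), label(''), label(undef));   # (0, '', 'none')

# || falls through on any false value.
my @with_or = (0 || 'none', '' || 'none');            # ('none', 'none')

my %opt;
$opt{depth} //= 3;   # assigned, because $opt{depth} was undefined
$opt{depth} //= 7;   # no effect: $opt{depth} is already defined
```

Defined-or is mostly useful for defaults where C<0> or the empty string are legitimate values.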
60 | ||
61 | =head2 Switch and Smart Match operator | |
62 | ||
63 | Perl 5 now has a switch statement. It's available when C<use feature | |
64 | 'switch'> is in effect. This feature introduces three new keywords, | |
65 | C<given>, C<when>, and C<default>: | |
66 | ||
67 | given ($foo) { | |
68 | when (/^abc/) { $abc = 1; } | |
69 | when (/^def/) { $def = 1; } | |
70 | when (/^xyz/) { $xyz = 1; } | |
71 | default { $nothing = 1; } | |
72 | } | |
73 | ||
74 | A more complete description of how Perl matches the switch variable | |
75 | against the C<when> conditions is given in L<perlsyn/"Switch statements">. | |
76 | ||
77 | This kind of match is called I<smart match>, and it's also possible to use | |
78 | it outside of switch statements, via the new C<~~> operator. See | |
79 | L<perlsyn/"Smart matching in detail">. | |
80 | ||
81 | This feature was contributed by Robin Houston. | |
82 | ||
83 | =head2 Regular expressions | |
84 | ||
85 | =over 4 | |
86 | ||
87 | =item Recursive Patterns | |
88 | ||
89 | It is now possible to write recursive patterns without using the C<(??{})> | |
90 | construct. This new way is more efficient, and in many cases easier to | |
91 | read. | |
92 | ||
93 | Each capturing parenthesis can now be treated as an independent pattern | |
94 | that can be entered by using the C<(?PARNO)> syntax (C<PARNO> standing for | |
95 | "parenthesis number"). For example, the following pattern will match | |
96 | nested balanced angle brackets: | |
97 | ||
98 | / | |
99 | ^ # start of line | |
100 | ( # start capture buffer 1 | |
101 | < # match an opening angle bracket | |
102 | (?: # match one of: | |
103 | (?> # don't backtrack over the inside of this group | |
104 | [^<>]+ # one or more non angle brackets | |
105 | ) # end non backtracking group | |
106 | | # ... or ... | |
107 | (?1) # recurse to bracket 1 and try it again | |
108 | )* # 0 or more times. | |
109 | > # match a closing angle bracket | |
110 | ) # end capture buffer one | |
111 | $ # end of line | |
112 | /x | |
113 | ||
114 | Note, users experienced with PCRE will find that the Perl implementation | |
115 | of this feature differs from the PCRE one in that it is possible to | |
116 | backtrack into a recursed pattern, whereas in PCRE the recursion is | |
117 | atomic or "possessive" in nature. (Yves Orton) | |
118 | ||
119 | =item Named Capture Buffers | |
120 | ||
121 | It is now possible to name capturing parentheses in a pattern and refer to | |
122 | the captured contents by name. The naming syntax is C<< (?<NAME>....) >>. | |
123 | It's possible to backreference to a named buffer with the C<< \k<NAME> >> | |
124 | syntax. In code, the new magical hashes C<%+> and C<%-> can be used to | |
125 | access the contents of the capture buffers. | |
126 | ||
127 | Thus, to replace all doubled chars, one could write | |
128 | ||
129 | s/(?<letter>.)\k<letter>/$+{letter}/g | |
130 | ||
131 | Only buffers with defined contents will be "visible" in the C<%+> hash, so | |
132 | it's possible to do something like | |
133 | ||
134 | foreach my $name (keys %+) { | |
135 | print "content of buffer '$name' is $+{$name}\n"; | |
136 | } | |
137 | ||
138 | The C<%-> hash is a bit more complete, since it will contain array refs | |
139 | holding values from all capture buffers similarly named, if there should | |
140 | be many of them. | |
141 | ||
142 | C<%+> and C<%-> are implemented as tied hashes through the new module | |
143 | C<Tie::Hash::NamedCapture>. | |
144 | ||
145 | Users exposed to the .NET regex engine will find that the perl | |
146 | implementation differs in that the numerical ordering of the buffers | |
147 | is sequential, and not "unnamed first, then named". Thus in the pattern | |
148 | ||
149 | /(A)(?<B>B)(C)(?<D>D)/ | |
150 | ||
151 | $1 will be 'A', $2 will be 'B', $3 will be 'C' and $4 will be 'D', rather | |
152 | than the $1 = 'A', $2 = 'C', $3 = 'B' and $4 = 'D' that a .NET programmer | |
153 | would expect. This is considered a feature. :-) (Yves Orton) | |
154 | ||
155 | =item Possessive Quantifiers | |
156 | ||
157 | Perl now supports the "possessive quantifier" syntax of the "atomic match" | |
158 | pattern. Basically a possessive quantifier matches as much as it can and never | |
159 | gives any back. Thus it can be used to control backtracking. The syntax is | |
160 | similar to non-greedy matching, except instead of using a '?' as the modifier | |
161 | the '+' is used. Thus C<?+>, C<*+>, C<++>, C<{min,max}+> are now legal | |
162 | quantifiers. (Yves Orton) | |
163 | ||
164 | =item Backtracking control verbs | |
165 | ||
166 | The regex engine now supports a number of special-purpose backtrack | |
167 | control verbs: (*THEN), (*PRUNE), (*MARK), (*SKIP), (*COMMIT), (*FAIL) | |
168 | and (*ACCEPT). See L<perlre> for their descriptions. (Yves Orton) | |
169 | ||
170 | =item Relative backreferences | |
171 | ||
172 | A new syntax C<\g{N}> or C<\gN> where "N" is a decimal integer allows a | |
173 | safer form of back-reference notation as well as allowing relative | |
174 | backreferences. This should make it easier to generate and embed patterns | |
175 | that contain backreferences. See L<perlre/"Capture buffers">. (Yves Orton) | |
176 | ||
177 | =item C<\K> escape | |
178 | ||
179 | The functionality of Jeff Pinyan's module Regexp::Keep has been added to | |
180 | the core. You can now use in regular expressions the special escape C<\K> | |
181 | as a way to do something like floating length positive lookbehind. It is | |
182 | also useful in substitutions like: | |
183 | ||
184 | s/(foo)bar/$1/g | |
185 | ||
186 | that can now be converted to | |
187 | ||
188 | s/foo\Kbar//g | |
189 | ||
190 | which is much more efficient. (Yves Orton) | |
191 | ||
192 | =item Vertical and horizontal whitespace, and linebreak | |
193 | ||
194 | Regular expressions now recognize the C<\v> and C<\h> escapes, which match | |
195 | vertical and horizontal whitespace, respectively. C<\V> and C<\H> | |
196 | logically match their complements. | |
197 | ||
198 | C<\R> matches a generic linebreak, that is, vertical whitespace, plus | |
199 | the multi-character sequence C<"\x0D\x0A">. | |
200 | ||
201 | =back | |
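
Several of the regular expression additions listed above can be tried together in a short script. This is only a sketch; the sample strings and variable names are assumptions made up for the illustration.

```perl
use strict;
use warnings;

# Named captures: %+ exposes the buffers by name after a successful match.
my %date;
if ('2007-12-18' =~ /^(?<year>\d{4})-(?<month>\d\d)-(?<day>\d\d)$/) {
    %date = %+;
}

# Named backreference: collapse doubled characters.
(my $collapsed = 'aabbcd') =~ s/(?<letter>.)\k<letter>/$+{letter}/g;   # "abcd"

# Relative backreference: \g{-1} refers to the most recently opened group.
my $repeated = 'xyzzy' =~ /(.)\g{-1}/ ? $1 : undef;   # "z"

# \K keeps everything matched so far out of the replaced text.
(my $kept = 'foobar foobaz') =~ s/foo\Kbar//g;   # "foo foobaz"

# Possessive quantifier: a++ never gives characters back, so this cannot match.
my $possessive = 'aaa' =~ /^a++a$/ ? 1 : 0;   # 0
```

With the ordinary greedy C<a+> in place of C<a++>, the last pattern would match, because the engine could backtrack and release one C<a>.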
202 | ||
203 | =head2 C<say()> | |
204 | ||
205 | say() is a new built-in, only available when C<use feature 'say'> is in | |
206 | effect, that is similar to print(), but that implicitly appends a newline | |
207 | to the printed string. See L<perlfunc/say>. (Robin Houston) | |
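
As a small sketch of the behaviour described above, the following writes to an in-memory filehandle so that the appended newline is easy to see. The variable names are arbitrary.

```perl
use strict;
use warnings;
use feature 'say';

# Capture output in a string so the effect of say() is easy to inspect.
my $out = '';
open my $fh, '>', \$out or die "cannot open in-memory handle: $!";
say {$fh} 'hello';     # same as: print {$fh} "hello\n";
say {$fh} 'a', 'b';    # list items are printed back to back, then "\n"
close $fh;
# $out is now "hello\nab\n"
```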
208 | ||
209 | =head2 Lexical C<$_> | |
210 | ||
211 | The default variable C<$_> can now be lexicalized, by declaring it like | |
212 | any other lexical variable, with a simple | |
213 | ||
214 | my $_; | |
215 | ||
216 | The operations that default on C<$_> will use the lexically-scoped | |
217 | version of C<$_> when it exists, instead of the global C<$_>. | |
218 | ||
219 | In a C<map> or a C<grep> block, if C<$_> was previously my'ed, then the | |
220 | C<$_> inside the block is lexical as well (and scoped to the block). | |
221 | ||
222 | In a scope where C<$_> has been lexicalized, you can still have access to | |
223 | the global version of C<$_> by using C<$::_>, or, more simply, by | |
597bb945 | 224 | overriding the lexical declaration with C<our $_>. (Rafael Garcia-Suarez) |
cf6c151c RGS |
225 | |
226 | =head2 The C<_> prototype | |
227 | ||
228 | A new prototype character has been added. C<_> is equivalent to C<$> (it | |
229 | denotes a scalar), but defaults to C<$_> if the corresponding argument | |
230 | isn't supplied. Due to the optional nature of the argument, you can only | |
231 | use it at the end of a prototype, or before a semicolon. | |
232 | ||
233 | This has a small incompatible consequence: the prototype() function has | |
234 | been adjusted to return C<_> for some built-ins in appropriate cases (for | |
235 | example, C<prototype('CORE::rmdir')>). (Rafael Garcia-Suarez) | |
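
A minimal sketch of a user-defined function using the C<_> prototype follows. The function name C<shout> is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical function: with the "_" prototype, a missing argument
# defaults to the current value of $_.
sub shout(_) {
    my ($text) = @_;
    return uc $text;
}

my $explicit = shout('abc');              # "ABC"
my @implicit = map { shout() } qw(x y);   # ("X", "Y")
```

Because prototypes are applied at compile time, the sub has to be declared before the calls that rely on the default.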
236 | ||
237 | =head2 UNITCHECK blocks | |
238 | ||
239 | C<UNITCHECK>, a new special code block, has been introduced, in addition to | |
240 | C<BEGIN>, C<CHECK>, C<INIT> and C<END>. | |
241 | ||
242 | C<CHECK> and C<INIT> blocks, while useful for some specialized purposes, | |
243 | are always executed at the transition between the compilation and the | |
244 | execution of the main program, and thus are useless whenever code is | |
245 | loaded at runtime. On the other hand, C<UNITCHECK> blocks are executed | |
246 | just after the unit which defined them has been compiled. See L<perlmod> | |
247 | for more information. (Alex Gough) | |
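
The point about code loaded at runtime can be illustrated with a string C<eval>, which is its own compilation unit. This is a sketch under that assumption; the C<@log> array is just a name chosen for the example.

```perl
use strict;
use warnings;

our @log;

# The string eval is compiled at runtime. Its UNITCHECK block runs as soon
# as the eval'd code has been compiled, before that code starts executing.
eval q{
    UNITCHECK { push @log, 'unitcheck' }
    push @log, 'run';
    1;
} or die $@;

# @log is now ('unitcheck', 'run')
```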
248 | ||
249 | =head2 New Pragma, C<mro> | |
250 | ||
251 | A new pragma, C<mro> (for Method Resolution Order), has been added. It | |
252 | lets you switch, on a per-class basis, the algorithm that perl uses to | |
dbef3c66 | 253 | find inherited methods in case of a multiple inheritance hierarchy. The |
cf6c151c RGS |
254 | default MRO hasn't changed (DFS, for Depth First Search). Another MRO is |
255 | available: the C3 algorithm. See L<mro> for more information. | |
256 | (Brandon Black) | |
257 | ||
dbef3c66 | 258 | Note that, due to changes in the implementation of class hierarchy search, |
cf6c151c RGS |
259 | code that used to undef the C<*ISA> glob will most probably break. Anyway, |
260 | undef'ing C<*ISA> had the side-effect of removing the magic on the @ISA | |
261 | array and should not have been done in the first place. | |
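
To see how the two resolution orders differ, consider a diamond-shaped hierarchy. The class names below are hypothetical and exist only for this sketch.

```perl
use strict;
use warnings;
use mro;

# Hypothetical diamond: Diamond inherits from Left and Right,
# which both inherit from Base.
{ package Base;    sub hello { 'Base' } }
{ package Left;    our @ISA = ('Base'); }
{ package Right;   our @ISA = ('Base'); sub hello { 'Right' } }
{ package Diamond; use mro 'c3'; our @ISA = ('Left', 'Right'); }

# Linearizations under C3 (the class's own setting) and under DFS.
my $c3_order  = join ' ', @{ mro::get_linear_isa('Diamond') };
my $dfs_order = join ' ', @{ mro::get_linear_isa('Diamond', 'dfs') };

# With C3, Right is searched before Base, so Right::hello is found.
my $found = Diamond->hello;
```

Under C3 the order is C<Diamond Left Right Base>; under the default DFS it would be C<Diamond Left Base Right>, and C<hello> would resolve to the one in C<Base>.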
262 | ||
263 | =head2 readpipe() is now overridable | |
264 | ||
265 | The built-in function readpipe() is now overridable. Overriding it also | |
266 | lets you override its operator counterpart, C<qx//> (a.k.a. C<``>). | |
267 | Moreover, it now defaults to C<$_> if no argument is provided. (Rafael | |
268 | Garcia-Suarez) | |
269 | ||
597bb945 | 270 | =head2 Default argument for readline() |
cf6c151c RGS |
271 | |
272 | readline() now defaults to C<*ARGV> if no argument is provided. (Rafael | |
273 | Garcia-Suarez) | |
274 | ||
275 | =head2 state() variables | |
276 | ||
277 | A new class of variables has been introduced. State variables are similar | |
278 | to C<my> variables, but are declared with the C<state> keyword in place of | |
279 | C<my>. They're visible only in their lexical scope, but their value is | |
280 | persistent: unlike C<my> variables, they're not undefined at scope entry, | |
281 | but retain their previous value. (Rafael Garcia-Suarez, Nicholas Clark) | |
282 | ||
283 | To use state variables, one needs to enable them by using | |
284 | ||
285 | use feature "state"; | |
286 | ||
287 | or by using the C<-E> command-line switch in one-liners. | |
288 | See L<perlsub/"Persistent variables via state()">. | |
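
A minimal sketch of a persistent counter built on a state variable; the sub name C<next_id> is hypothetical.

```perl
use strict;
use warnings;
use feature 'state';

# Hypothetical counter: $n is initialised once and keeps its value
# between calls, yet is visible only inside next_id().
sub next_id {
    state $n = 0;
    return ++$n;
}

my @ids = (next_id(), next_id(), next_id());   # (1, 2, 3)
```

Before state variables, the usual way to get this effect was a C<my> variable in an enclosing block shared with the sub through a closure.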
289 | ||
290 | =head2 Stacked filetest operators | |
291 | ||
292 | As a new form of syntactic sugar, it's now possible to stack up filetest | |
293 | operators. You can now write C<-f -w -x $file> in a row to mean | |
294 | C<-x $file && -w _ && -f _>. See L<perlfunc/-X>. | |
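
As a sketch, the stacked form and its longhand equivalent can be compared on a temporary file. The use of C<File::Temp> here is only a convenience for the example.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a scratch file owned by (and accessible to) the current user.
my ($fh, $file) = tempfile(UNLINK => 1);

# Stacked form: the operators are applied from right to left.
my $stacked = (-f -w -r $file) ? 1 : 0;

# Longhand form, reusing the stat buffer through the special "_" handle.
my $longhand = (-r $file && -w _ && -f _) ? 1 : 0;
```

Both variables end up with the same value, since the stacked form is only syntactic sugar for the chained tests.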
295 | ||
296 | =head2 UNIVERSAL::DOES() | |
297 | ||
298 | The C<UNIVERSAL> class has a new method, C<DOES()>. It has been added to | |
299 | solve semantic problems with the C<isa()> method. C<isa()> checks for | |
300 | inheritance, while C<DOES()> has been designed to be overridden when | |
301 | module authors use other types of relations between classes (in addition | |
302 | to inheritance). (chromatic) | |
303 | ||
304 | See L<< UNIVERSAL/"$obj->DOES( ROLE )" >>. | |
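
One way a class might override C<DOES()> is sketched below. The classes C<Animal> and C<Dog> and the role name C<Fetcher> are hypothetical.

```perl
use strict;
use warnings;

{ package Animal; sub new { return bless {}, shift } }
{
    package Dog;
    our @ISA = ('Animal');

    # Claim the hypothetical 'Fetcher' role without inheriting from it,
    # and defer to the default behaviour for everything else.
    sub DOES {
        my ($self, $role) = @_;
        return 1 if $role eq 'Fetcher';
        return $self->SUPER::DOES($role);
    }
}

my $dog = Dog->new;
my @answers     = map { $dog->DOES($_) ? 1 : 0 } qw(Animal Fetcher Cat);  # (1, 1, 0)
my $isa_fetcher = $dog->isa('Fetcher') ? 1 : 0;                           # 0
```

The default C<UNIVERSAL::DOES> falls back to C<isa()>, which is why the C<Animal> query still succeeds.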
305 | ||
306 | =head2 C<CLONE_SKIP()> | |
307 | ||
308 | Perl now has support for the C<CLONE_SKIP> special subroutine. Like | |
309 | C<CLONE>, C<CLONE_SKIP> is called once per package; however, it is called | |
310 | just before cloning starts, and in the context of the parent thread. If it | |
311 | returns a true value, then no objects of that class will be cloned. See | |
312 | L<perlmod> for details. (Contributed by Dave Mitchell.) | |
313 | ||
314 | =head2 Formats | |
315 | ||
316 | Formats were improved in several ways. A new field, C<^*>, can be used for | |
317 | variable-width, one-line-at-a-time text. Null characters are now handled | |
318 | correctly in picture lines. Using C<@#> and C<~~> together will now | |
319 | produce a compile-time error, as those format fields are incompatible. | |
320 | L<perlform> has been improved, and miscellaneous bugs fixed. | |
321 | ||
322 | =head2 Byte-order modifiers for pack() and unpack() | |
323 | ||
324 | There are two new byte-order modifiers, C<E<gt>> (big-endian) and C<E<lt>> | |
325 | (little-endian), that can be appended to most pack() and unpack() template | |
326 | characters and groups to force a certain byte-order for that type or group. | |
327 | See L<perlfunc/pack> and L<perlpacktut> for details. | |
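
The effect of the two modifiers can be sketched with a 32-bit value, dumping the packed bytes as hex. The sample value is arbitrary.

```perl
use strict;
use warnings;

my $value = 0x01020304;

# "L" is an unsigned 32-bit integer; the modifier forces the byte order
# regardless of the native order of the machine.
my $big    = unpack 'H*', pack('L>', $value);   # "01020304"
my $little = unpack 'H*', pack('L<', $value);   # "04030201"

# The same modifiers work when unpacking.
my ($n) = unpack 'S>', "\x01\x02";              # 258
```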
328 | ||
cf6c151c RGS |
329 | =head2 C<no VERSION> |
330 | ||
331 | You can now use C<no> followed by a version number to specify that you | |
332 | want to use a version of perl older than the specified one. | |
333 | ||
334 | =head2 C<chdir>, C<chmod> and C<chown> on filehandles | |
335 | ||
336 | C<chdir>, C<chmod> and C<chown> can now work on filehandles as well as | |
337 | filenames, if the system supports respectively C<fchdir>, C<fchmod> and | |
338 | C<fchown>, thanks to a patch provided by Gisle Aas. | |
339 | ||
340 | =head2 OS groups | |
341 | ||
342 | C<$(> and C<$)> now return groups in the order in which the OS returns them, | |
343 | thanks to Gisle Aas. This wasn't previously the case. | |
344 | ||
345 | =head2 Recursive sort subs | |
346 | ||
347 | You can now use recursive subroutines with sort(), thanks to Robin Houston. | |
348 | ||
349 | =head2 Exceptions in constant folding | |
350 | ||
351 | The constant folding routine is now wrapped in an exception handler, and | |
352 | if folding throws an exception (such as attempting to evaluate 0/0), perl | |
353 | now retains the current optree, rather than aborting the whole program. | |
354 | (Nicholas Clark, Dave Mitchell) | |
355 | ||
356 | =head2 Source filters in @INC | |
357 | ||
358 | It's possible to enhance the mechanism of subroutine hooks in @INC by | |
359 | adding a source filter on top of the filehandle opened and returned by the | |
360 | hook. This feature was planned a long time ago, but wasn't quite working | |
361 | until now. See L<perlfunc/require> for details. (Nicholas Clark) | |
362 | ||
363 | =head2 New internal variables | |
364 | ||
365 | =over 4 | |
366 | ||
367 | =item C<${^RE_DEBUG_FLAGS}> | |
368 | ||
369 | This variable controls what debug flags are in effect for the regular | |
370 | expression engine when running under C<use re "debug">. See L<re> for | |
371 | details. | |
372 | ||
373 | =item C<${^CHILD_ERROR_NATIVE}> | |
374 | ||
375 | This variable gives the native status returned by the last pipe close, | |
376 | backtick command, successful call to wait() or waitpid(), or from the | |
377 | system() operator. See L<perlrun> for details. (Contributed by Gisle Aas.) | |
378 | ||
597bb945 RGS |
379 | =item C<${^RE_TRIE_MAXBUF}> |
380 | ||
381 | See L</"Trie optimisation of literal string alternations">. | |
382 | ||
383 | =item C<${^WIN32_SLOPPY_STAT}> | |
384 | ||
385 | See L</"Sloppy stat on Windows">. | |
386 | ||
cf6c151c RGS |
387 | =back |
388 | ||
389 | =head2 Miscellaneous | |
390 | ||
391 | C<unpack()> now defaults to unpacking the C<$_> variable. | |
392 | ||
393 | C<mkdir()> without arguments now defaults to C<$_>. | |
394 | ||
395 | The internal dump output has been improved, so that non-printable characters | |
396 | such as newline and backspace are output in C<\x> notation, rather than | |
397 | octal. | |
398 | ||
399 | The B<-C> option can no longer be used on the C<#!> line. It wasn't | |
400 | working there anyway. | |
401 | ||
402 | =head2 UCD 5.0.0 | |
403 | ||
404 | The copy of the Unicode Character Database included in Perl 5 has | |
405 | been updated to version 5.0.0. | |
406 | ||
cf6c151c RGS |
407 | =head2 MAD |
408 | ||
409 | MAD, which stands for I<Misc Attribute Decoration>, is a | |
410 | still-in-development work leading to a Perl 5 to Perl 6 converter. To | |
411 | enable it, it's necessary to pass the argument C<-Dmad> to Configure. The | |
412 | obtained perl isn't binary compatible with a regular perl 5.9.4, and has | |
413 | space and speed penalties; moreover not all regression tests still pass | |
414 | with it. (Larry Wall, Nicholas Clark) | |
415 | ||
597bb945 RGS |
416 | =head1 Incompatible Changes |
417 | ||
418 | =head2 Packing and UTF-8 strings | |
419 | ||
420 | =for XXX update this | |
421 | ||
422 | The semantics of pack() and unpack() regarding UTF-8-encoded data have been | |
423 | changed. Processing is now by default character per character instead of | |
424 | byte per byte on the underlying encoding. Notably, code that used things | |
425 | like C<pack("a*", $string)> to see through the encoding of the string will now | |
426 | simply get back the original $string. Packed strings can also get upgraded | |
427 | during processing when you store upgraded characters. You can get the old | |
428 | behaviour by using C<use bytes>. | |
429 | ||
430 | To be consistent with pack(), the C<C0> in unpack() templates indicates | |
431 | that the data is to be processed in character mode, i.e. character by | |
432 | character; on the contrary, C<U0> in unpack() indicates UTF-8 mode, where | |
433 | the packed string is processed in its UTF-8-encoded Unicode form on a byte | |
434 | by byte basis. This is reversed with regard to perl 5.8.X. | |
435 | ||
436 | Moreover, C<C0> and C<U0> can also be used in pack() templates to specify | |
437 | respectively character and byte modes. | |
438 | ||
439 | C<C0> and C<U0> in the middle of a pack or unpack format now switch to the | |
440 | specified encoding mode, honoring parens grouping. Previously, parens were | |
441 | ignored. | |
442 | ||
443 | Also, there is a new pack() character format, C<W>, which is intended to | |
444 | replace the old C<C>. C<C> is kept for unsigned chars coded as bytes in | |
445 | the string's internal representation. C<W> represents unsigned (logical) | |
446 | character values, which can be greater than 255. It is therefore more | |
447 | robust when dealing with potentially UTF-8-encoded data (as C<C> will wrap | |
448 | values outside the range 0..255, and not respect the string encoding). | |
449 | ||
450 | In practice, that means that pack formats are now encoding-neutral, except | |
451 | C<C>. | |
452 | ||
453 | For consistency, C<A> in unpack() format now trims all Unicode whitespace | |
454 | from the end of the string. Before perl 5.9.2, it used to strip only the | |
455 | classical ASCII space characters. | |
456 | ||
457 | =head2 Byte/character count feature in unpack() | |
458 | ||
459 | A new unpack() template character, C<".">, returns the number of bytes or | |
460 | characters (depending on the selected encoding mode, see above) read so far. | |
461 | ||
462 | =head2 The C<$*> and C<$#> variables have been removed | |
463 | ||
464 | C<$*>, which was deprecated in favor of the C</s> and C</m> regexp | |
465 | modifiers, has been removed. | |
466 | ||
467 | The deprecated C<$#> variable (output format for numbers) has been | |
468 | removed. | |
469 | ||
f00638a2 | 470 | Two new severe warnings, C<$#/$* is no longer supported>, have been added. |
597bb945 RGS |
471 | |
472 | =head2 substr() lvalues are no longer fixed-length | |
473 | ||
474 | The lvalues returned by the three argument form of substr() used to be a | |
475 | "fixed length window" on the original string. In some cases this could | |
476 | cause surprising action at a distance or other undefined behaviour. Now the | |
477 | length of the window adjusts itself to the length of the string assigned to | |
478 | it. | |
479 | ||
480 | =head2 Parsing of C<-f _> | |
481 | ||
482 | The identifier C<_> is now forced to be a bareword after a filetest | |
483 | operator. This solves a number of misparsing issues when a global C<_> | |
484 | subroutine is defined. | |
485 | ||
486 | =head2 C<:unique> | |
487 | ||
488 | The C<:unique> attribute has been made a no-op, since its current | |
489 | implementation was fundamentally flawed and not threadsafe. | |
490 | ||
597bb945 RGS |
491 | =head2 Effect of pragmas in eval |
492 | ||
493 | The compile-time value of the C<%^H> hint variable can now propagate into | |
494 | eval("")uated code. This makes it more useful to implement lexical | |
495 | pragmas. | |
496 | ||
497 | As a side-effect of this, the overloaded-ness of constants now propagates | |
498 | into eval(""). | |
499 | ||
500 | =head2 chdir FOO | |
501 | ||
502 | A bareword argument to chdir() is now recognized as a file handle. | |
503 | Earlier releases interpreted the bareword as a directory name. | |
504 | (Gisle Aas) | |
505 | ||
506 | =head2 Handling of .pmc files | |
507 | ||
508 | An old feature of perl was that before C<require> or C<use> look for a | |
509 | file with a F<.pm> extension, they will first look for a similar filename | |
510 | with a F<.pmc> extension. If this file is found, it will be loaded in | |
511 | place of any potentially existing file ending in a F<.pm> extension. | |
512 | ||
513 | Previously, F<.pmc> files were loaded only if more recent than the | |
514 | matching F<.pm> file. Starting with 5.9.4, they'll always be loaded if | |
515 | they exist. | |
516 | ||
517 | =head2 @- and @+ in patterns | |
518 | ||
519 | The special arrays C<@-> and C<@+> are no longer interpolated in regular | |
520 | expressions. (Sadahiro Tomoyuki) | |
521 | ||
522 | =head2 $AUTOLOAD can now be tainted | |
523 | ||
524 | If you call a subroutine by a tainted name, and if it defers to an | |
525 | AUTOLOAD function, then $AUTOLOAD will be (correctly) tainted. | |
526 | (Rick Delaney) | |
527 | ||
528 | =head2 Tainting and printf | |
529 | ||
530 | When perl is run under taint mode, C<printf()> and C<sprintf()> will now | |
531 | reject any tainted format argument. (Rafael Garcia-Suarez) | |
532 | ||
533 | =head2 undef and signal handlers | |
534 | ||
535 | Undefining or deleting a signal handler via C<undef $SIG{FOO}> is now | |
536 | equivalent to setting it to C<'DEFAULT'>. (Rafael Garcia-Suarez) | |
537 | ||
538 | =head2 strictures and dereferencing in defined() | |
539 | ||
540 | C<use strict "refs"> was ignoring taking a hard reference in an argument | |
541 | to defined(), as in: | |
542 | ||
543 | use strict "refs"; | |
544 | my $x = "foo"; | |
545 | if (defined $$x) {...} | |
546 | ||
547 | This now correctly produces the run-time error C<Can't use string as a | |
548 | SCALAR ref while "strict refs" in use>. | |
549 | ||
550 | C<defined @$foo> and C<defined %$bar> are now also subject to C<strict | |
551 | 'refs'> (that is, C<$foo> and C<$bar> shall be proper references there.) | |
552 | (C<defined(@foo)> and C<defined(%bar)> are discouraged constructs anyway.) | |
553 | (Nicholas Clark) | |
554 | ||
555 | =head2 C<(?p{})> has been removed | |
556 | ||
557 | The regular expression construct C<(?p{})>, which was deprecated in perl | |
558 | 5.8, has been removed. Use C<(??{})> instead. (Rafael Garcia-Suarez) | |
559 | ||
560 | =head2 Pseudo-hashes have been removed | |
561 | ||
562 | Support for pseudo-hashes has been removed from Perl 5.9. (The C<fields> | |
563 | pragma remains here, but uses an alternate implementation.) | |
564 | ||
565 | =head2 Removal of the bytecode compiler and of perlcc | |
566 | ||
567 | C<perlcc>, the byteloader and the supporting modules (B::C, B::CC, | |
568 | B::Bytecode, etc.) are no longer distributed with the perl sources. Those | |
569 | experimental tools have never worked reliably, and, due to the lack of | |
570 | volunteers to keep them in line with the perl interpreter developments, it | |
571 | was decided to remove them instead of shipping a broken version of those. | |
572 | The last version of those modules can be found with perl 5.9.4. | |
573 | ||
574 | However, the B compiler framework stays supported in the perl core, as do | |
575 | the more useful modules it has made possible (among others, B::Deparse and | |
576 | B::Concise). | |
577 | ||
578 | =head2 Removal of the JPL | |
579 | ||
580 | The JPL (Java-Perl Lingo) has been removed from the perl sources tarball. | |
581 | ||
582 | =head2 Recursive inheritance detected earlier | |
583 | ||
584 | Perl will now immediately throw an exception if you modify any package's | |
585 | C<@ISA> in such a way that it would cause recursive inheritance. | |
586 | ||
587 | Previously, the exception would not occur until Perl attempted to make | |
588 | use of the recursive inheritance while resolving a method or doing a | |
589 | C<$foo-E<gt>isa($bar)> lookup. | |
590 | ||
cf6c151c | 591 | =head1 Modules and Pragmata |
c0c97549 | 592 | |
f0e260b8 RGS |
593 | =head2 Pragmata Changes |
594 | ||
595 | =over 4 | |
596 | ||
597 | =item C<feature> | |
598 | ||
599 | The new pragma C<feature> is used to enable new features that might break | |
600 | old code. See L</"The C<feature> pragma"> above. | |
601 | ||
602 | =item C<mro> | |
603 | ||
604 | This new pragma lets you change the algorithm used to resolve inherited | |
605 | methods. See L</"New Pragma, C<mro>"> above. | |
606 | ||
607 | =item Scoping of the C<sort> pragma | |
608 | ||
609 | The C<sort> pragma is now lexically scoped. Its effect used to be global. | |
610 | ||
611 | =item Scoping of C<bignum>, C<bigint>, C<bigrat> | |
612 | ||
613 | The three numeric pragmas C<bignum>, C<bigint> and C<bigrat> are now | |
614 | lexically scoped. (Tels) | |
615 | ||
616 | =item C<base> | |
617 | ||
618 | The C<base> pragma now warns if a class tries to inherit from itself. | |
619 | (Curtis "Ovid" Poe) | |
620 | ||
621 | =item C<strict> and C<warnings> | |
622 | ||
623 | C<strict> and C<warnings> will now complain loudly if they are loaded via | |
624 | incorrect casing (as in C<use Strict;>). (Johan Vromans) | |
625 | ||
6601a838 RGS |
626 | =item C<version> |
627 | ||
628 | The C<version> module provides support for version objects. | |
629 | ||
f0e260b8 RGS |
630 | =item C<warnings> |
631 | ||
632 | The C<warnings> pragma doesn't load C<Carp> anymore. That means that code | |
633 | that used C<Carp> routines without having loaded it at compile time might | |
634 | need to be adjusted; typically, the following (faulty) code won't work | |
635 | anymore, and will require parentheses to be added after the function name: | |
636 | ||
637 | use warnings; | |
638 | require Carp; | |
639 | Carp::confess "argh"; | |
640 | ||
641 | =item C<less> | |
642 | ||
643 | C<less> now does something useful (or at least it tries to). In fact, it | |
644 | has been turned into a lexical pragma. So, in your modules, you can now | |
645 | test whether your users have requested to use less CPU, or less memory, | |
646 | less magic, or maybe even less fat. See L<less> for more. (Joshua ben | |
647 | Jore) | |
648 | ||
649 | =back | |
650 | ||
0eece9c0 RGS |
651 | =head2 New modules |
652 | ||
653 | =over 4 | |
654 | ||
655 | =item * | |
656 | ||
657 | C<encoding::warnings>, by Audrey Tang, is a module to emit warnings | |
658 | whenever an ASCII character string containing high-bit bytes is implicitly | |
597bb945 RGS |
659 | converted into UTF-8. It's a lexical pragma since Perl 5.9.4; on older |
660 | perls, its effect is global. | |
0eece9c0 RGS |
661 | |
662 | =item * | |
663 | ||
664 | C<Module::CoreList>, by Richard Clamp, is a small handy module that tells | |
665 | you what versions of core modules ship with any versions of Perl 5. It | |
666 | comes with a command-line frontend, C<corelist>. | |
667 | ||
bd3831ee RGS |
668 | =item * |
669 | ||
670 | C<Math::BigInt::FastCalc> is an XS-enabled, and thus faster, version of | |
671 | C<Math::BigInt::Calc>. | |
672 | ||
673 | =item * | |
674 | ||
675 | C<Compress::Zlib> is an interface to the zlib compression library. It | |
676 | comes with a bundled version of zlib, so having a working zlib is not a | |
677 | prerequisite to install it. It's used by C<Archive::Tar> (see below). | |
678 | ||
679 | =item * | |
680 | ||
681 | C<IO::Zlib> is an C<IO::>-style interface to C<Compress::Zlib>. | |
682 | ||
683 | =item * | |
684 | ||
685 | C<Archive::Tar> is a module to manipulate C<tar> archives. | |
686 | ||
687 | =item * | |
688 | ||
689 | C<Digest::SHA>, a module used to calculate many types of SHA digests, | |
690 | has been included for SHA support in the CPAN module. | |
691 | ||
692 | =item * | |
693 | ||
694 | C<ExtUtils::CBuilder> and C<ExtUtils::ParseXS> have been added. | |
695 | ||
597bb945 RGS |
696 | =item * |
697 | ||
698 | C<Hash::Util::FieldHash>, by Anno Siegel, has been added. This module | |
699 | provides support for I<field hashes>: hashes that maintain an association | |
700 | of a reference with a value, in a thread-safe garbage-collected way. | |
701 | Such hashes are useful to implement inside-out objects. | |
702 | ||
703 | =item * | |
704 | ||
705 | C<Module::Build>, by Ken Williams, has been added. It's an alternative to | |
706 | C<ExtUtils::MakeMaker> to build and install perl modules. | |
707 | ||
708 | =item * | |
709 | ||
710 | C<Module::Load>, by Jos Boumans, has been added. It provides a single | |
711 | interface to load Perl modules and F<.pl> files. | |
712 | ||
713 | =item * | |
714 | ||
715 | C<Module::Loaded>, by Jos Boumans, has been added. It's used to mark | |
716 | modules as loaded or unloaded. | |
717 | ||
718 | =item * | |
719 | ||
720 | C<Package::Constants>, by Jos Boumans, has been added. It's a simple | |
721 | helper to list all constants declared in a given package. | |
722 | ||
723 | =item * | |
724 | ||
725 | C<Win32API::File>, by Tye McQueen, has been added (for Windows builds). | |
726 | This module provides low-level access to Win32 system API calls for | |
727 | files/dirs. | |
728 | ||
f0e260b8 RGS |
729 | =item * |
730 | ||
731 | C<Locale::Maketext::Simple>, needed by CPANPLUS, is a simple wrapper around | |
732 | C<Locale::Maketext::Lexicon>. Note that C<Locale::Maketext::Lexicon> isn't | |
733 | included in the perl core; the behaviour of C<Locale::Maketext::Simple> | |
734 | gracefully degrades when the latter isn't present. | |
735 | ||
736 | =item * | |
737 | ||
738 | C<Params::Check> implements a generic input parsing/checking mechanism. It | |
739 | is used by CPANPLUS. | |
740 | ||
741 | =item * | |
742 | ||
743 | C<Term::UI> simplifies the task of asking questions at a terminal prompt. | |
744 | ||
745 | =item * | |
746 | ||
747 | C<Object::Accessor> provides an interface to create per-object accessors. | |
748 | ||
749 | =item * | |
750 | ||
751 | C<Module::Pluggable> is a simple framework to create modules that accept | |
752 | pluggable sub-modules. | |
753 | ||
754 | =item * | |
755 | ||
756 | C<Module::Load::Conditional> provides simple ways to query and possibly | |
757 | load installed modules. | |
758 | ||
759 | =item * | |
760 | ||
761 | C<Time::Piece> provides an object oriented interface to time functions, | |
762 | overriding the built-ins localtime() and gmtime(). | |
763 | ||
764 | =item * | |
765 | ||
766 | C<IPC::Cmd> helps to find and run external commands, possibly | |
767 | interactively. | |
768 | ||
769 | =item * | |
770 | ||
771 | C<File::Fetch> provides a simple generic file fetching mechanism. | |
772 | ||
773 | =item * | |
774 | ||
775 | C<Log::Message> and C<Log::Message::Simple> are used by the log facility | |
776 | of C<CPANPLUS>. | |
777 | ||
778 | =item * | |
779 | ||
780 | C<Archive::Extract> is a generic archive extraction mechanism | |
781 | for F<.tar> (plain, gzipped or bzipped) or F<.zip> files. | |
782 | ||
783 | =item * | |
784 | ||
785 | C<CPANPLUS> provides an API and a command-line tool to access the CPAN | |
786 | mirrors. | |
787 | ||
788 | =back | |
789 | ||
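As an illustration of the inside-out object technique mentioned for C<Hash::Util::FieldHash>, here is a minimal sketch. The C<Counter> class and the C<%count> field are hypothetical names chosen for this example only; the single call assumed from the module is its exported C<fieldhash> function.

```perl
use strict;
use warnings;

package Counter;    # hypothetical class, for illustration only
use Hash::Util::FieldHash qw(fieldhash);

# One field hash per attribute; the keys are the objects themselves.
fieldhash my %count;

sub new {
    my ($class) = @_;
    my $self = bless \do { my $anon }, $class;
    $count{$self} = 0;
    return $self;
}

sub increment {
    my ($self) = @_;
    return ++$count{$self};
}

package main;

my $c = Counter->new;
$c->increment for 1 .. 3;
my $last = $c->increment;
print "count: $last\n";

# Dropping the last reference also removes the object's entry.
undef $c;
my $remaining = keys %count;
print "entries left: $remaining\n";
```

The object itself is an empty blessed scalar; all of its state lives in the field hash, which is why the entry has to disappear when the object does.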
790 | =head2 Selected Changes to Core Modules | |
791 | ||
792 | =over 4 | |
793 | ||
794 | =item C<Attribute::Handlers> | |
795 | ||
796 | C<Attribute::Handlers> can now report the caller's file and line number. | |
797 | (David Feldman) | |
798 | ||
799 | =item C<B::Lint> | |
800 | ||
801 | C<B::Lint> is now based on C<Module::Pluggable>, and so can be extended | |
802 | with plugins. (Joshua ben Jore) | |
803 | ||
804 | =item C<B> | |
805 | ||
806 | It's now possible to access the lexical pragma hints (C<%^H>) by using the | |
807 | method B::COP::hints_hash(). It returns a C<B::RHE> object, which in turn | |
808 | can be used to get a hash reference via the method B::RHE::HASH(). (Joshua | |
809 | ben Jore) | |
810 | ||
811 | =item C<Thread> | |
812 | ||
813 | As the old 5005thread threading model has been removed, in favor of the | |
814 | ithreads scheme, the C<Thread> module is now a compatibility wrapper, to | |
815 | be used in old code only. It has been removed from the default list of | |
816 | dynamic extensions. | |
817 | ||
0eece9c0 RGS |
818 | =back |
819 | ||
cf6c151c | 820 | =head1 Utility Changes |
c0c97549 RGS |
821 | |
822 | =over 4 | |
823 | ||
bd3831ee | 824 | =item perl -d |
c0c97549 RGS |
825 | |
826 | The Perl debugger can now save all debugger commands for sourcing later; | |
827 | notably, it can now emulate stepping backwards, by restarting and | |
828 | rerunning all bar the last command from a saved command history. | |
829 | ||
830 | It can also display the parent inheritance tree of a given class, with the | |
831 | C<i> command. | |
832 | ||
833 | Perl has a new -dt command-line flag, which enables threads support in the | |
834 | debugger. | |
835 | ||
bd3831ee RGS |
836 | =item ptar |
837 | ||
838 | C<ptar> is a pure perl implementation of C<tar>, that comes with | |
839 | C<Archive::Tar>. | |
840 | ||
841 | =item ptardiff | |
842 | ||
843 | C<ptardiff> is a small script used to generate a diff between the contents | |
844 | of a tar archive and a directory tree. Like C<ptar>, it comes with | |
845 | C<Archive::Tar>. | |
846 | ||
847 | =item shasum | |
848 | ||
849 | C<shasum> is a command-line utility, used to print or to check SHA | |
850 | digests. It comes with the new C<Digest::SHA> module. | |
851 | ||
852 | =item corelist | |
0eece9c0 RGS |
853 | |
854 | The C<corelist> utility is now installed with perl (see L</"New modules"> | |
855 | above). | |
856 | ||
bd3831ee | 857 | =item h2ph and h2xs |
0eece9c0 RGS |
858 | |
859 | C<h2ph> and C<h2xs> have been made a bit more robust with regard to | |
860 | "modern" C code. | |
861 | ||
bd3831ee RGS |
862 | C<h2xs> implements a new option C<--use-xsloader> to force use of |
863 | C<XSLoader> even in backwards compatible modules. | |
864 | ||
865 | The handling of authors' names that had apostrophes has been fixed. | |
866 | ||
867 | Any enums with negative values are now skipped. | |
868 | ||
869 | =item perlivp | |
870 | ||
871 | C<perlivp> no longer checks for F<*.ph> files by default. Use the new C<-a> | |
872 | option to run I<all> tests. | |
873 | ||
874 | =item find2perl | |
0eece9c0 RGS |
875 | |
876 | C<find2perl> now assumes C<-print> as a default action. Previously, it | |
877 | needed to be specified explicitly. | |
878 | ||
879 | Several bugs have been fixed in C<find2perl>, regarding C<-exec> and | |
880 | C<-eval>. Also the options C<-path>, C<-ipath> and C<-iname> have been | |
881 | added. | |
882 | ||
597bb945 RGS |
883 | =item config_data |
884 | ||
885 | C<config_data> is a new utility that comes with C<Module::Build>. It | |
886 | provides a command-line interface to the configuration of Perl modules | |
887 | that use Module::Build's framework of configurability (that is, | |
888 | C<*::ConfigData> modules that contain local configuration information for | |
889 | their parent modules.) | |
890 | ||
f00638a2 | 891 | =item cpanp |
f0e260b8 RGS |
892 | |
893 | C<cpanp>, the CPANPLUS shell, has been added. (C<cpanp-run-perl>, a | |
894 | helper for CPANPLUS operation, has been added too, but isn't intended for | |
895 | direct use). | |
896 | ||
f00638a2 | 897 | =item cpan2dist |
f0e260b8 RGS |
898 | |
899 | C<cpan2dist> is a new utility, that comes with CPANPLUS. It's a tool to | |
900 | create distributions (or packages) from CPAN modules. | |
901 | ||
f00638a2 | 902 | =item pod2html |
f0e260b8 RGS |
903 | |
904 | The output of C<pod2html> has been enhanced to be more customizable via | |
905 | CSS. Some formatting problems were also corrected. (Jari Aalto) | |
906 | ||
c0c97549 RGS |
907 | =back |
908 | ||
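The C<shasum> utility is a front end to C<Digest::SHA>. A rough sketch of computing the same kind of digest from Perl code follows; the input string is an arbitrary example, and only the module's functional interface (C<sha1_hex> and C<sha256_hex>) is assumed.

```perl
use strict;
use warnings;
use Digest::SHA qw(sha1_hex sha256_hex);

# Arbitrary sample input; replace with real data.
my $message = 'abc';

my $sha1   = sha1_hex($message);
my $sha256 = sha256_hex($message);

print "SHA-1:   $sha1\n";
print "SHA-256: $sha256\n";
```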
cf6c151c | 909 | =head1 New Documentation |
c0c97549 | 910 | |
597bb945 RGS |
911 | The L<perlpragma> manpage documents how to write one's own lexical |
912 | pragmas in pure Perl (something that is possible starting with 5.9.4). | |
913 | ||
bd3831ee RGS |
914 | The new L<perlglossary> manpage is a glossary of terms used in the Perl |
915 | documentation, technical and otherwise, kindly provided by O'Reilly Media, | |
916 | Inc. | |
917 | ||
597bb945 RGS |
918 | The L<perlreguts> manpage, courtesy of Yves Orton, describes internals of the |
919 | Perl regular expression engine. | |
920 | ||
921 | The L<perlunitut> manpage is a tutorial for programming with Unicode and | |
922 | string encodings in Perl, courtesy of Juerd Waalboer. | |
923 | ||
f0e260b8 RGS |
924 | A new manual page, L<perlunifaq> (the Perl Unicode FAQ), has been added |
925 | (Juerd Waalboer). | |
926 | ||
dbef3c66 RGS |
927 | The L<perlcommunity> manpage gives a description of the Perl community |
928 | on the Internet and in real life. (Edgar "Trizor" Bering) | |
929 | ||
f00638a2 RGS |
930 | The L<CORE> manual page documents the C<CORE::> namespace. (Tels) |
931 | ||
c0c97549 RGS |
932 | The long-existing feature of C</(?{...})/> regexps setting C<$_> and pos() |
933 | is now documented. | |
934 | ||
cf6c151c | 935 | =head1 Performance Enhancements |
c0c97549 | 936 | |
597bb945 | 937 | =head2 In-place sorting |
0eece9c0 | 938 | |
c0c97549 RGS |
939 | Sorting arrays in place (C<@a = sort @a>) is now optimized to avoid |
940 | making a temporary copy of the array. | |
941 | ||
0eece9c0 RGS |
942 | Likewise, C<reverse sort ...> is now optimized to sort in reverse, |
943 | avoiding the generation of a temporary intermediate list. | |
944 | ||
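The two forms that benefit from these optimisations look like this. It is only a sketch of the syntax involved; the array contents are made up, and the optimisation itself is not observable from Perl code.

```perl
use strict;
use warnings;

my @names = qw(pear apple fig);

# Assigning the sorted list back to the same array is the in-place form.
@names = sort @names;

# "reverse sort" is handled as a single descending sort.
my @descending = reverse sort @names;

print "@names\n";
print "@descending\n";
```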
597bb945 | 945 | =head2 Lexical array access |
0eece9c0 | 946 | |
c0c97549 RGS |
947 | Access to elements of lexical arrays via a numeric constant between 0 and |
948 | 255 is now faster. (This used to be only the case for global arrays.) | |
949 | ||
597bb945 | 950 | =head2 XS-assisted SWASHGET |
bd3831ee RGS |
951 | |
952 | Some pure-perl code that perl was using to retrieve Unicode properties and | |
953 | transliteration mappings has been reimplemented in XS. | |
954 | ||
597bb945 | 955 | =head2 Constant subroutines |
bd3831ee RGS |
956 | |
957 | The interpreter internals now support a far more memory efficient form of | |
958 | inlineable constants. Storing a reference to a constant value in a symbol | |
959 | table is equivalent to a full typeglob referencing a constant subroutine, | |
960 | but using about 400 bytes less memory. This proxy constant subroutine is | |
961 | automatically upgraded to a real typeglob with subroutine if necessary. | |
962 | The approach taken is analogous to the existing space optimisation for | |
963 | subroutine stub declarations, which are stored as plain scalars in place | |
964 | of the full typeglob. | |
965 | ||
966 | Several of the core modules have been converted to use this feature for | |
967 | their system dependent constants - as a result C<use POSIX;> now takes about | |
968 | 200K less memory. | |
969 | ||
597bb945 | 970 | =head2 C<PERL_DONT_CREATE_GVSV> |
bd3831ee RGS |
971 | |
972 | The new compilation flag C<PERL_DONT_CREATE_GVSV>, introduced as an option | |
973 | in perl 5.8.8, is turned on by default in perl 5.9.3. It prevents perl | |
974 | from creating an empty scalar with every new typeglob. See L<perl588delta> | |
975 | for details. | |
976 | ||
597bb945 | 977 | =head2 Weak references are cheaper |
bd3831ee RGS |
978 | |
979 | Weak reference creation is now I<O(1)> rather than I<O(n)>, courtesy of | |
980 | Nicholas Clark. Weak reference deletion remains I<O(n)>, but if deletion only | |
981 | happens at program exit, it may be skipped completely. | |
982 | ||
597bb945 | 983 | =head2 sort() enhancements |
bd3831ee RGS |
984 | |
985 | Salvador Fandiño provided improvements to reduce the memory usage of C<sort> | |
986 | and to speed up some cases. | |
987 | ||
597bb945 RGS |
988 | =head2 Memory optimisations |
989 | ||
990 | Several internal data structures (typeglobs, GVs, CVs, formats) have been | |
991 | restructured to use less memory. (Nicholas Clark) | |
992 | ||
993 | =head2 UTF-8 cache optimisation | |
994 | ||
995 | The UTF-8 caching code is now more efficient, and used more often. | |
996 | (Nicholas Clark) | |
997 | ||
998 | =head2 Sloppy stat on Windows | |
999 | ||
1000 | On Windows, perl's stat() function normally opens the file to determine | |
1001 | the link count and update attributes that may have been changed through | |
1002 | hard links. Setting ${^WIN32_SLOPPY_STAT} to a true value speeds up | |
1003 | stat() by not performing this operation. (Jan Dubois) | |
1004 | ||
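A minimal sketch of enabling the sloppy mode follows. The platform check and the choice of the current directory as the stat() target are assumptions made for the example; on non-Windows systems the assignment is skipped and stat() behaves as usual.

```perl
use strict;
use warnings;

# Only meaningful on Windows builds; skipped everywhere else.
local ${^WIN32_SLOPPY_STAT} = 1 if $^O eq 'MSWin32';

# stat() returns a 13-element list on success, an empty list on failure.
my @info = stat('.') or die "stat failed: $!";
print "last modified: ", scalar localtime($info[9]), "\n";
```

With the variable set, the link count and any attributes changed through hard links may be out of date, so this is best kept to code that does not depend on them.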
597bb945 RGS |
1005 | =head2 Regular expressions optimisations |
1006 | ||
1007 | =over 4 | |
1008 | ||
1009 | =item Engine de-recursivised | |
1010 | ||
1011 | The regular expression engine is no longer recursive, meaning that | |
1012 | patterns that used to overflow the stack will either die with useful | |
1013 | explanations, or run to completion, which, since they were able to blow | |
1014 | the stack before, will likely take a very long time to happen. If you were | |
1015 | experiencing the occasional stack overflow (or segfault) and upgrade to | |
1016 | discover that now perl apparently hangs instead, look for a degenerate | |
1017 | regex. (Dave Mitchell) | |
1018 | ||
1019 | =item Single char char-classes treated as literals | |
1020 | ||
1021 | Classes of a single character are now treated the same as if the character | |
1022 | had been used as a literal, meaning that code that uses char-classes as an | |
1023 | escaping mechanism will see a speedup. (Yves Orton) | |
1024 | ||
1025 | =item Trie optimisation of literal string alternations | |
1026 | ||
1027 | Alternations, where possible, are optimised into more efficient matching | |
1028 | structures. String literal alternations are merged into a trie and are | |
1029 | matched simultaneously. This means that instead of O(N) time for matching | |
1030 | N alternations at a given point, the new code performs in O(1) time. | |
1031 | A new special variable, ${^RE_TRIE_MAXBUF}, has been added to fine-tune | |
1032 | this optimization. (Yves Orton) | |
1033 | ||
1034 | B<Note:> Much code exists that works around perl's historic poor | |
1035 | performance on alternations. Often the tricks used to do so will disable | |
1036 | the new optimisations. Hopefully the utility modules used for this purpose | |
1037 | will be educated about these new optimisations by the time 5.10 is | |
1038 | released. | |
1039 | ||
1040 | =item Aho-Corasick start-point optimisation | |
1041 | ||
1042 | When a pattern starts with a trie-able alternation and there aren't | |
1043 | better optimisations available, the regex engine will use Aho-Corasick | |
1044 | matching to find the start point. (Yves Orton) | |
1045 | ||
0eece9c0 RGS |
1046 | =back |
1047 | ||
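The trie optimisation targets patterns such as the one sketched below: a plain alternation of literal strings, written without any hand-rolled workaround. The keyword list is hypothetical, and whether the engine actually builds a trie for a given pattern is an internal decision that this code does not inspect.

```perl
use strict;
use warnings;

# Hypothetical keyword list, for illustration only.
my @keywords = qw(foo foobar bar baz);

# A straightforward alternation of quoted literals.
my $alternation = join '|', map { quotemeta } @keywords;
my $re = qr/\b(?:$alternation)\b/;

my @hits = "a foobar and a baz" =~ /($re)/g;
print "@hits\n";
```

Matching semantics are unchanged by the optimisation: the alternatives are still tried in their written order, so C<foo> is attempted before C<foobar> and is rejected only because of the trailing C<\b>.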
cf6c151c | 1048 | =head1 Installation and Configuration Improvements |
c0c97549 | 1049 | |
597bb945 RGS |
1050 | =head2 Configuration improvements |
1051 | ||
1052 | =over 4 | |
1053 | ||
1054 | =item C<-Dusesitecustomize> | |
bd3831ee | 1055 | |
0eece9c0 | 1056 | Run-time customization of @INC can be enabled by passing the |
597bb945 | 1057 | C<-Dusesitecustomize> flag to Configure. When enabled, this will make perl |
0eece9c0 RGS |
1058 | run F<$sitelibexp/sitecustomize.pl> before anything else. This script can |
1059 | then be set up to add additional entries to @INC. | |
1060 | ||
597bb945 RGS |
1061 | =item Relocatable installations |
1062 | ||
1063 | There is now Configure support for creating a relocatable perl tree. If | |
1064 | you Configure with C<-Duserelocatableinc>, then the paths in @INC (and | |
1065 | everything else in %Config) can be optionally located via the path of the | |
1066 | perl executable. | |
1067 | ||
1068 | That means that, if the string C<".../"> is found at the start of any | |
1069 | path, it's substituted with the directory of $^X. So, the relocation can | |
1070 | be configured on a per-directory basis, although the default with | |
1071 | C<-Duserelocatableinc> is that everything is relocated. The initial | |
1072 | install is done to the original configured prefix. | |
1073 | ||
1074 | =item strlcat() and strlcpy() | |
1075 | ||
1076 | The configuration process now detects whether strlcat() and strlcpy() are | |
1077 | available. When they are not available, perl's own version is used (from | |
1078 | Russ Allbery's public domain implementation). Various places in the perl | |
1079 | interpreter now use them. (Steve Peters) | |
1080 | ||
f0e260b8 RGS |
1081 | =item C<d_pseudofork> and C<d_printf_format_null> |
1082 | ||
1083 | A new configuration variable, available as C<$Config{d_pseudofork}> in | |
1084 | the L<Config> module, has been added, to distinguish real fork() support | |
1085 | from fake pseudofork used on Windows platforms. | |
1086 | ||
1087 | A new configuration variable, C<d_printf_format_null>, has been added, | |
1088 | to see if printf-like formats are allowed to be NULL. | |
1089 | ||
1090 | =item Configure help | |
1091 | ||
1092 | C<Configure -h> has been extended with the most commonly used options. | |
1093 | ||
597bb945 RGS |
1094 | =back |
1095 | ||
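As a sketch of how C<d_pseudofork> might be consulted at run time, the following checks it alongside the long-standing C<d_fork> variable. The three descriptive strings are made up for the example.

```perl
use strict;
use warnings;
use Config;

# d_fork is set where a real fork() exists; d_pseudofork where it is emulated.
my $fork_kind =
      $Config{d_fork}       ? 'real fork()'
    : $Config{d_pseudofork} ? 'pseudofork emulation'
    :                         'no fork support';

print "This perl provides: $fork_kind\n";
```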
1096 | =head2 Compilation improvements | |
1097 | ||
1098 | =over 4 | |
1099 | ||
1100 | =item Parallel build | |
0eece9c0 | 1101 | |
bd3831ee RGS |
1102 | Parallel makes should work properly now, although there may still be problems |
1103 | if C<make test> is instructed to run in parallel. | |
1104 | ||
597bb945 RGS |
1105 | =item Borland's compilers support |
1106 | ||
bd3831ee RGS |
1107 | Building with Borland's compilers on Win32 should work more smoothly. In |
1108 | particular, Steve Hay has worked to sidestep many warnings emitted by their | |
1109 | compilers and at least one C compiler internal error. | |
1110 | ||
597bb945 RGS |
1111 | =item Static build on Windows |
1112 | ||
f0e260b8 RGS |
1113 | Perl extensions on Windows now can be statically built into the Perl DLL. |
1114 | ||
1115 | Also, it's now possible to build a C<perl-static.exe> that doesn't depend | |
1116 | on the Perl DLL on Win32. See the Win32 makefiles for details. | |
1117 | (Vadim Konovalov) | |
bd3831ee | 1118 | |
69d2c521 | 1119 | =item ppport.h files |
597bb945 RGS |
1120 | |
1121 | All F<ppport.h> files in the XS modules bundled with perl are now | |
1122 | autogenerated at build time. (Marcus Holland-Moritz) | |
1123 | ||
f0e260b8 RGS |
1124 | =item C++ compatibility |
1125 | ||
1126 | Efforts have been made to make perl and the core XS modules compilable | |
1127 | with various C++ compilers (although the situation is not perfect with | |
1128 | some of the compilers on some of the platforms tested.) | |
1129 | ||
597bb945 RGS |
1130 | =item Building XS extensions on Windows |
1131 | ||
1132 | Support for building XS extension modules with the free MinGW compiler has | |
1133 | been improved in the case where perl itself was built with the Microsoft | |
1134 | VC++ compiler. (ActiveState) | |
1135 | ||
1136 | =item Support for Microsoft 64-bit compiler | |
1137 | ||
1138 | Support for building perl with Microsoft's 64-bit compiler has been | |
1139 | improved. (ActiveState) | |
1140 | ||
f0e260b8 RGS |
1141 | =item Visual C++ |
1142 | ||
f00638a2 | 1143 | Perl now can be compiled with Microsoft Visual C++. |
f0e260b8 RGS |
1144 | |
1145 | =item Win32 builds | |
1146 | ||
1147 | All win32 builds (MS-Win, WinCE) have been merged and cleaned up. | |
1148 | ||
597bb945 RGS |
1149 | =back |
1150 | ||
1151 | =head2 Installation improvements | |
1152 | ||
1153 | =over 4 | |
1154 | ||
1155 | =item Module auxiliary files | |
1156 | ||
1157 | README files and changelogs for CPAN modules bundled with perl are no | |
1158 | longer installed. | |
1159 | ||
1160 | =back | |
1161 | ||
bd3831ee RGS |
1162 | =head2 New Or Improved Platforms |
1163 | ||
597bb945 | 1164 | Perl has been reported to work on Symbian OS. See L<perlsymbian> for more |
bd3831ee RGS |
1165 | information. |
1166 | ||
597bb945 RGS |
1167 | Many improvements have been made towards making Perl work correctly on |
1168 | z/OS. | |
1169 | ||
f0e260b8 | 1170 | Perl has been reported to work on DragonFlyBSD and MidnightBSD. |
597bb945 | 1171 | |
bd3831ee RGS |
1172 | The VMS port has been improved. See L<perlvms>. |
1173 | ||
d43695a1 RGS |
1174 | Support for Cray XT4 Catamount/Qk has been added. See |
1175 | F<hints/catamount.sh> in the source code distribution for more | |
1176 | information. | |
bd3831ee | 1177 | |
f0e260b8 RGS |
1178 | Vendor patches have been merged for RedHat and Gentoo. |
1179 | ||
1180 | DynaLoader::dl_unload_file() now works on Windows. | |
bd3831ee | 1181 | |
cf6c151c | 1182 | =head1 Selected Bug Fixes |
c0c97549 | 1183 | |
bd3831ee RGS |
1184 | =over 4 |
1185 | ||
1186 | =item strictures in regexp-eval blocks | |
1187 | ||
c0c97549 RGS |
1188 | C<strict> wasn't in effect in regexp-eval blocks (C</(?{...})/>). |
1189 | ||
bd3831ee RGS |
1190 | =item Calling CORE::require() |
1191 | ||
1192 | CORE::require() and CORE::do() were always parsed as require() and do() | |
1193 | when they were overridden. This is now fixed. | |
1194 | ||
1195 | =item Subscripts of slices | |
1196 | ||
1197 | You can now use a non-arrowed form for chained subscripts after a list | |
1198 | slice, like in: | |
1199 | ||
1200 | ({foo => "bar"})[0]{foo} | |
1201 | ||
1202 | This used to be a syntax error; a C<< -> >> was required. | |
1203 | ||
1204 | =item C<no warnings 'category'> works correctly with -w | |
1205 | ||
1206 | Previously, when running with warnings enabled globally via C<-w>, selective | |
1207 | disabling of specific warning categories would erroneously turn off all | |
1208 | warnings. This is now fixed; C<no warnings 'io';> will only turn off warnings | |
1209 | in the C<io> class. | |
1210 | ||
597bb945 | 1211 | =item threads improvements |
bd3831ee RGS |
1212 | |
1213 | Several memory leaks in ithreads were closed. Also, ithreads were made | |
1214 | less memory-intensive. | |
1215 | ||
597bb945 RGS |
1216 | C<threads> is now a dual-life module, also available on CPAN. It has been |
1217 | expanded in many ways. A kill() method is available for thread signalling. | |
1218 | One can get thread status, or the list of running or joinable threads. | |
1219 | ||
1220 | A new C<< threads->exit() >> method is used to exit from the application | |
1221 | (this is the default for the main thread) or from the current thread only | |
1222 | (this is the default for all other threads). On the other hand, the exit() | |
1223 | built-in now always causes the whole application to terminate. (Jerry | |
1224 | D. Hedden) | |
1225 | ||
bd3831ee RGS |
1226 | =item chr() and negative values |
1227 | ||
1228 | chr() on a negative value now gives C<\x{FFFD}>, the Unicode replacement | |
1229 | character, except when the C<bytes> pragma is in effect, in which case the | |
1230 | low eight bits of the value are used. | |
1231 | ||
597bb945 RGS |
1232 | =item PERL5SHELL and tainting |
1233 | ||
1234 | On Windows, the PERL5SHELL environment variable is now checked for | |
1235 | taintedness. (Rafael Garcia-Suarez) | |
1236 | ||
1237 | =item Using *FILE{IO} | |
1238 | ||
1239 | C<stat()> and C<-X> filetests now treat *FILE{IO} filehandles like *FILE | |
1240 | filehandles. (Steve Peters) | |
1241 | ||
1242 | =item Overloading and reblessing | |
1243 | ||
1244 | Overloading now works when references are reblessed into another class. | |
1245 | Internally, this has been implemented by moving the flag for "overloading" | |
1246 | from the reference to the referent, which logically is where it should | |
1247 | always have been. (Nicholas Clark) | |
1248 | ||
1249 | =item Overloading and UTF-8 | |
1250 | ||
1251 | A few bugs related to UTF-8 handling with objects that have | |
1252 | stringification overloaded have been fixed. (Nicholas Clark) | |
1253 | ||
1254 | =item eval memory leaks fixed | |
1255 | ||
1256 | Traditionally, C<eval 'syntax error'> has leaked badly. Many (but not all) | |
1257 | of these leaks have now been eliminated or reduced. (Dave Mitchell) | |
1258 | ||
1259 | =item Random device on Windows | |
1260 | ||
1261 | In previous versions, perl would read the file F</dev/urandom> if it | |
1262 | existed when seeding its random number generator. That file is unlikely | |
1263 | to exist on Windows, and if it did would probably not contain appropriate | |
1264 | data, so perl no longer tries to read it on Windows. (Alex Davies) | |
1265 | ||
1266 | =item PERLIO_DEBUG | |
1267 | ||
1268 | The C<PERLIO_DEBUG> environment variable no longer has any effect for | |
1269 | setuid scripts and for scripts run with B<-T>. | |
1270 | ||
1271 | Moreover, with a thread-enabled perl, using C<PERLIO_DEBUG> could lead to | |
1272 | an internal buffer overflow. This has been fixed. | |
1273 | ||
f0e260b8 RGS |
1274 | =item PerlIO::scalar and read-only scalars |
1275 | ||
1276 | PerlIO::scalar will now prevent writing to read-only scalars. Moreover, | |
1277 | seek() is now supported with PerlIO::scalar-based filehandles, the | |
1278 | underlying string being zero-filled as needed. (Rafael, Jarkko Hietaniemi) | |
1279 | ||
1280 | =item study() and UTF-8 | |
1281 | ||
1282 | study() never worked for UTF-8 strings, but could lead to false results. | |
1283 | It's now a no-op on UTF-8 data. (Yves Orton) | |
1284 | ||
1285 | =item Critical signals | |
1286 | ||
1287 | The signals SIGILL, SIGBUS and SIGSEGV are now always delivered in an | |
1288 | "unsafe" manner (contrary to other signals, that are deferred until the | |
1289 | perl interpreter reaches a reasonably stable state; see | |
1290 | L<perlipc/"Deferred Signals (Safe Signals)">). (Rafael) | |
1291 | ||
1292 | =item @INC-hook fix | |
1293 | ||
1294 | When a module or a file is loaded through an @INC-hook, and when this hook | |
1295 | has set a filename entry in %INC, __FILE__ is now set for this module | |
1296 | according to the contents of that %INC entry. (Rafael) | |
1297 | ||
1298 | =item C<-t> switch fix | |
1299 | ||
1300 | The C<-w> and C<-t> switches can now be used together without messing | |
1301 | up what categories of warnings are activated or not. (Rafael) | |
1302 | ||
1303 | =item Duping UTF-8 filehandles | |
1304 | ||
1305 | Duping a filehandle which has the C<:utf8> PerlIO layer set will now | |
1306 | properly carry that layer on the duped filehandle. (Rafael) | |
1307 | ||
1308 | =item Localisation of hash elements | |
1309 | ||
1310 | Localizing a hash element whose key was given as a variable didn't work | |
1311 | correctly if the variable was changed while the local() was in effect (as | |
1312 | in C<local $h{$x}; ++$x>). (Bo Lindbergh) | |
1313 | ||
bd3831ee | 1314 | =back |
0eece9c0 | 1315 | |
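Two of the fixes listed above, chained subscripts after a list slice and chr() on a negative value, can be exercised with a short sketch. The data is arbitrary, and the C<no warnings 'utf8'> line is there only to silence the warning that chr() also raises for a negative argument.

```perl
use strict;
use warnings;

# Chained subscript directly after a list slice, with no arrow.
my $value = ({ foo => "bar" })[0]{foo};
print "$value\n";

# chr() of a negative number yields the Unicode replacement character.
my $char = do { no warnings 'utf8'; chr(-1) };
printf "U+%04X\n", ord $char;
```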
cf6c151c | 1316 | =head1 New or Changed Diagnostics |
c0c97549 | 1317 | |
bd3831ee RGS |
1318 | =over 4 |
1319 | ||
d43695a1 RGS |
1320 | =item Use of uninitialized value |
1321 | ||
1322 | Perl will now try to tell you the name of the variable (if any) that was | |
1323 | undefined. | |
1324 | ||
bd3831ee RGS |
1325 | =item Deprecated use of my() in false conditional |
1326 | ||
c0c97549 RGS |
1327 | A new deprecation warning, I<Deprecated use of my() in false conditional>, |
1328 | has been added, to warn against the use of the dubious and deprecated | |
1329 | construct | |
1330 | ||
1331 | my $x if 0; | |
1332 | ||
1333 | See L<perldiag>. Use C<state> variables instead. | |
1334 | ||
bd3831ee RGS |
1335 | =item !=~ should be !~ |
1336 | ||
0eece9c0 RGS |
1337 | A new warning, C<!=~ should be !~>, is emitted to prevent this misspelling |
1338 | of the non-matching operator. | |
1339 | ||
bd3831ee RGS |
1340 | =item Newline in left-justified string |
1341 | ||
0eece9c0 RGS |
1342 | The warning I<Newline in left-justified string> has been removed. |
1343 | ||
bd3831ee RGS |
1344 | =item Too late for "-T" option |
1345 | ||
0eece9c0 RGS |
1346 | The error I<Too late for "-T" option> has been reformulated to be more |
1347 | descriptive. | |
1348 | ||
bd3831ee RGS |
1349 | =item "%s" variable %s masks earlier declaration |
1350 | ||
1351 | This warning is now emitted more consistently; in short, whenever one | |
1352 | of the declarations involved is a C<my> variable: | |
1353 | ||
1354 | my $x; my $x; # warns | |
1355 | my $x; our $x; # warns | |
1356 | our $x; my $x; # warns | |
1357 | ||
1358 | On the other hand, the following: | |
1359 | ||
1360 | our $x; our $x; | |
1361 | ||
1362 | now gives a C<"our" variable %s redeclared> warning. | |
1363 | ||
1364 | =item readdir()/closedir()/etc. attempted on invalid dirhandle | |
1365 | ||
1366 | These new warnings are now emitted when a dirhandle is used but is | |
1367 | either closed or not really a dirhandle. | |
1368 | ||
f0e260b8 RGS |
1369 | =item Opening dirhandle/filehandle %s also as a file/directory |
1370 | ||
1371 | Two deprecation warnings have been added: (Rafael) | |
1372 | ||
1373 | Opening dirhandle %s also as a file | |
1374 | Opening filehandle %s also as a directory | |
1375 | ||
f00638a2 RGS |
1376 | =item Use of -P is deprecated |
1377 | ||
1378 | Perl's command-line switch C<-P> is now deprecated. | |
1379 | ||
6601a838 RGS |
1380 | =item v-string in use/require is non-portable |
1381 | ||
1382 | Perl will warn you against potential backwards compatibility problems with | |
1383 | the C<use VERSION> syntax. | |
1384 | ||
bd3831ee RGS |
1385 | =item perl -V |
1386 | ||
0eece9c0 RGS |
1387 | C<perl -V> has several improvements, making it more useable from shell |
1388 | scripts to get the value of configuration variables. See L<perlrun> for | |
1389 | details. | |
1390 | ||
bd3831ee RGS |
1391 | =back |
1392 | ||
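For code that relied on C<my $x if 0> to get a persistent variable, a C<state> variable is the suggested replacement. A minimal sketch, with a hypothetical C<next_id> function:

```perl
use strict;
use warnings;
use feature 'state';

# Hypothetical helper: hands out increasing identifiers.
sub next_id {
    state $id = 0;    # initialised once, value kept across calls
    return ++$id;
}

my @ids = map { next_id() } 1 .. 3;
print "@ids\n";
```

Unlike the deprecated construct, the persistence here is documented behaviour, and the variable stays private to the subroutine.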
cf6c151c | 1393 | =head1 Changed Internals |
c0c97549 | 1394 | |
bd3831ee RGS |
1395 | In general, the source code of perl has been refactored, tidied up, and | |
1396 | optimized in many places. Also, memory management and allocation have been | |
1397 | improved in several places. | |
1398 | ||
c0c97549 RGS |
1399 | =head2 Reordering of SVt_* constants |
1400 | ||
1401 | The relative ordering of constants that define the various types of C<SV> | |
1402 | has changed; in particular, C<SVt_PVGV> has been moved before C<SVt_PVLV>, | |
1403 | C<SVt_PVAV>, C<SVt_PVHV> and C<SVt_PVCV>. This is unlikely to make any | |
1404 | difference unless you have code that explicitly makes assumptions about that | |
1405 | ordering. (The inheritance hierarchy of C<B::*> objects has been changed | |
1406 | to reflect this.) | |
1407 | ||
1408 | =head2 Removal of CPP symbols | |
1409 | ||
1410 | The C preprocessor symbols C<PERL_PM_APIVERSION> and | |
1411 | C<PERL_XS_APIVERSION>, which were supposed to give the version number of | |
1412 | the oldest perl binary-compatible (resp. source-compatible) with the | |
1413 | present one, were not used, and sometimes had misleading values. They have | |
1414 | been removed. | |
1415 | ||
1416 | =head2 Less space is used by ops | |
1417 | ||
1418 | The C<BASEOP> structure now uses less space. The C<op_seq> field has been | |
1419 | removed and replaced by the one-bit field C<op_opt>. C<op_type> is now 9 | |
1420 | bits long. (Consequently, the C<B::OP> class doesn't provide a C<seq> | |
1421 | method anymore.) | |
1422 | ||
1423 | =head2 New parser | |
1424 | ||
1425 | perl's parser is now generated by bison (it used to be generated by | |
1426 | byacc.) As a result, it seems to be a bit more robust. | |
1427 | ||
bd3831ee RGS |
1428 | Also, Dave Mitchell improved the lexer debugging output under C<-DT>. |
1429 | ||
1430 | =head2 Use of C<const> | |
1431 | ||
1432 | Andy Lester supplied many improvements to determine which function | |
1433 | parameters and local variables could actually be declared C<const> to the C | |
1434 | compiler. Steve Peters provided new C<*_set> macros and reworked the core to | |
1435 | use these rather than assigning to macros in LVALUE context. | |
1436 | ||
1437 | =head2 Mathoms | |
1438 | ||
1439 | A new file, F<mathoms.c>, has been added. It contains functions that are | |
1440 | no longer used in the perl core, but that remain available for binary or | |
1441 | source compatibility reasons. However, those functions will not be | |
1442 | compiled in if you add C<-DNO_MATHOMS> in the compiler flags. | |
1443 | ||
1444 | =head2 C<AvFLAGS> has been removed | |
1445 | ||
1446 | The C<AvFLAGS> macro has been removed. | |
1447 | ||
1448 | =head2 C<av_*> changes | |
1449 | ||
1450 | The C<av_*()> functions, used to manipulate arrays, no longer accept null | |
1451 | C<AV*> parameters. | |
1452 | ||
597bb945 RGS |
1453 | =head2 $^H and %^H |
1454 | ||
1455 | The implementation of the special variables $^H and %^H has changed, to | |
1456 | allow implementing lexical pragmas in pure perl. | |
1457 | ||
bd3831ee RGS |
1458 | =head2 B:: modules inheritance changed |
1459 | ||
1460 | The inheritance hierarchy of C<B::> modules has changed; C<B::NV> now | |
1461 | inherits from C<B::SV> (it used to inherit from C<B::IV>). | |
1462 | ||
f0e260b8 RGS |
1463 | =head2 Anonymous hash and array constructors |
1464 | ||
1465 | The anonymous hash and array constructors now take 1 op in the optree | |
1466 | instead of 3, now that pp_anonhash and pp_anonlist return a reference to | |
1467 | a hash/array when the op is flagged with OPf_SPECIAL (Nicholas Clark). | |
1468 | ||
1469 | =for p5p XXX have we some docs on how to create regexp engine plugins, since that's now possible ? (perlreguts) | |
1470 | ||
1471 | =for p5p XXX new BIND SV type, #29544, #29642 | |
1472 | ||
cf6c151c | 1473 | =head1 Known Problems |
c0c97549 RGS |
1474 | |
1475 | There's still a remaining problem in the implementation of the lexical | |
1476 | C<$_>: it doesn't work inside C</(?{...})/> blocks. (See the TODO test in | |
1477 | F<t/op/mydef.t>.) | |
1478 | ||
cf6c151c | 1479 | =head1 Platform Specific Problems |
c0c97549 | 1480 | |
cf6c151c RGS |
1481 | =head1 Reporting Bugs |
1482 | ||
1483 | =head1 SEE ALSO | |
1484 | ||
1485 | The F<Changes> file and the perl590delta to perl595delta man pages for | |
1486 | exhaustive details on what changed. | |
1487 | ||
1488 | The F<INSTALL> file for how to build Perl. | |
1489 | ||
1490 | The F<README> file for general stuff. | |
1491 | ||
1492 | The F<Artistic> and F<Copying> files for copyright information. | |
1493 | ||
1494 | =cut |