=head1 NAME

perldelta - what is new for perl v5.9.5

=head1 DESCRIPTION

This document describes differences between the 5.9.4 and the 5.9.5
development releases. See L<perl590delta>, L<perl591delta>,
L<perl592delta>, L<perl593delta> and L<perl594delta> for the differences
between 5.8.0 and 5.9.4.

=head1 Incompatible Changes

=head2 Tainting and printf

When perl is run under taint mode, C<printf()> and C<sprintf()> will now
reject any tainted format argument. (Rafael Garcia-Suarez)

=head2 undef and signal handlers

Undefining or deleting a signal handler via C<undef $SIG{FOO}> is now
equivalent to setting it to C<'DEFAULT'>.

=head2 Removal of the bytecode compiler and of perlcc

C<perlcc>, the byteloader and the supporting modules (B::C, B::CC,
B::Bytecode, etc.) are no longer distributed with the perl sources. Those
experimental tools have never worked reliably, and, due to the lack of
volunteers to keep them in line with the perl interpreter developments, it
was decided to remove them rather than ship a broken version.
The last version of those modules can be found with perl 5.9.4.

However, the B compiler framework stays supported in the perl core, as do
the more useful modules it has made possible (among others, B::Deparse and
B::Concise).

=head2 Removal of the JPL

The JPL (Java-Perl Lingo) has been removed from the perl sources tarball.

=head1 Core Enhancements

=head2 Regular expressions

=over 4

=item Recursive Patterns

It is now possible to write recursive patterns without using the C<(??{})>
construct. This new way is more efficient, and in many cases easier to
read.

Each capturing parenthesis can now be treated as an independent pattern
that can be entered by using the C<(?PARNO)> syntax (C<PARNO> standing for
"parenthesis number"). For example, the following pattern will match
nested balanced angle brackets:

    /
      ^                # start of line
      (                # start capture buffer 1
        <              #   match an opening angle bracket
        (?:            #   match one of:
          (?>          #     don't backtrack over the inside of this group
            [^<>]+     #       one or more non angle brackets
          )            #     end non backtracking group
        |              #     ... or ...
          (?1)         #     recurse to bracket 1 and try it again
        )*             #   0 or more times.
        >              #   match a closing angle bracket
      )                # end capture buffer one
      $                # end of line
    /x

Note, users experienced with PCRE will find that the Perl implementation
of this feature differs from the PCRE one in that it is possible to
backtrack into a recursed pattern, whereas in PCRE the recursion is
atomic or "possessive" in nature. (Yves Orton)
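
As a rough usage sketch, the pattern above can be compiled once with
C<qr//> and applied to a few inputs. The variable name C<$balanced> and
the sample strings are illustrative assumptions, not part of the change
itself.

```perl
use strict;
use warnings;

# Same pattern as above, compiled once; (?1) re-enters capture group 1.
my $balanced = qr/
    ^
    (
        <
        (?:
            (?> [^<>]+ )    # a run of non-bracket characters, matched atomically
        |
            (?1)            # a nested <...> group
        )*
        >
    )
    $
/x;

for my $text ('<a<b>c>', '<<>>', '<a<b>', 'a<b>') {
    printf "%-8s %s\n", $text, $text =~ $balanced ? 'balanced' : 'not balanced';
}
```

The first two strings are reported as balanced; the last two are not,
because one lacks a closing bracket and the other does not start with an
opening bracket.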

=item Named Capture Buffers

It is now possible to name capturing parentheses in a pattern and refer to
the captured contents by name. The naming syntax is C<< (?<NAME>....) >>.
It's possible to backreference to a named buffer with the C<< \k<NAME> >>
syntax. In code, the new magical hash C<%+> can be used to access the
contents of the buffers.

Thus, to replace all doubled chars, one could write

    s/(?<letter>.)\k<letter>/$+{letter}/g

Only buffers with defined contents will be "visible" in the hash, so
it's possible to do something like

    foreach my $name (keys %+) {
        print "content of buffer '$name' is $+{$name}\n";
    }

Users exposed to the .NET regex engine will find that the perl
implementation differs in that the numerical ordering of the buffers
is sequential, and not "unnamed first, then named". Thus in the pattern

    /(A)(?<B>B)(C)(?<D>D)/

$1 will be 'A', $2 will be 'B', $3 will be 'C' and $4 will be 'D', and not
the $1 is 'A', $2 is 'C', $3 is 'B' and $4 is 'D' ordering that a .NET
programmer would expect. This is considered a feature. :-) (Yves Orton)
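
The two points above can be tried together in a short script. This is
only a sketch; the sample strings and variable names are made up for
illustration.

```perl
use strict;
use warnings;

# Collapse doubled characters via a named buffer and its \k<NAME> backreference.
my $text = 'aabbcd';
(my $collapsed = $text) =~ s/(?<letter>.)\k<letter>/$+{letter}/g;
print "$collapsed\n";

# Named buffers are still numbered sequentially, left to right.
my @numbered;
my %named;
if ('ABCD' =~ /(A)(?<B>B)(C)(?<D>D)/) {
    @numbered = ($1, $2, $3, $4);
    %named    = (B => $+{B}, D => $+{D});
    print "numbered: @numbered\n";
    print "named:    $named{B} $named{D}\n";
}
```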

=item Possessive Quantifiers

Perl now supports the "possessive quantifier" syntax of the "atomic match"
pattern. Basically a possessive quantifier matches as much as it can and never
gives any back. Thus it can be used to control backtracking. The syntax is
similar to non-greedy matching, except instead of using a '?' as the modifier
the '+' is used. Thus C<?+>, C<*+>, C<++>, C<{min,max}+> are now legal
quantifiers. (Yves Orton)
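
A small sketch of the difference between a greedy and a possessive
quantifier follows. The string and the variable names are illustrative
only.

```perl
use strict;
use warnings;

my $text = 'aaa';

# Greedy: a+ first takes all three characters, then gives one back
# so that the final "a" can match.
my $greedy = $text =~ /^a+a$/ ? 1 : 0;

# Possessive: a++ takes all three characters and never gives any back,
# so the final "a" has nothing left to match.
my $possessive = $text =~ /^a++a$/ ? 1 : 0;

print "greedy matched: $greedy, possessive matched: $possessive\n";
```

The greedy pattern matches and the possessive one does not, which is the
same result that the atomic group C<< (?>a+) >> would give.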

=item Backtracking control verbs

The regex engine now supports a number of special purpose backtrack
control verbs: (*THEN), (*PRUNE), (*MARK), (*SKIP), (*COMMIT), (*FAIL)
and (*ACCEPT). See L<perlre> for their descriptions. (Yves Orton)
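
As a hedged illustration of two of these verbs, modelled on the counting
examples in L<perlre>, the sketch below uses a C<(?{ })> code block to
count how many times the engine reaches the end of the pattern. The
counter variable is an assumption made for this example.

```perl
use strict;
use warnings;

our $count;   # package variable so the (?{ ... }) code blocks can update it

# (*FAIL) forces backtracking, so every way of matching a+b? is tried.
$count = 0;
'aaab' =~ /a+b?(?{ $count++ })(*FAIL)/;
my $without_commit = $count;

# (*COMMIT) makes the whole match fail as soon as it is backtracked past.
$count = 0;
'aaab' =~ /a+b?(*COMMIT)(?{ $count++ })(*FAIL)/;
my $with_commit = $count;

print "attempts without (*COMMIT): $without_commit\n";
print "attempts with (*COMMIT):    $with_commit\n";
```

With C<(*COMMIT)> in place the engine gives up after its first attempt
instead of exploring the remaining alternatives.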

=item Relative backreferences

A new syntax C<\g{N}> or C<\gN> where "N" is a decimal integer allows a
safer form of back-reference notation as well as allowing relative
backreferences. This should make it easier to generate and embed patterns
that contain backreferences. (Yves Orton)
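
To show why a relative backreference is easier to embed, here is a
minimal sketch that builds the same fragment twice, once with C<\g{-1}>
and once with C<\1>, and interpolates each into a larger pattern. All
names and sample data are hypothetical.

```perl
use strict;
use warnings;

# A reusable fragment: \g{-1} always refers to the group just before it,
# however many groups precede the fragment in the final pattern.
my $doubled_relative = qr/(\w)\g{-1}/;

# The same fragment written with an absolute backreference.
my $doubled_absolute = qr/(\w)\1/;

my $text = 'xyzz';

# Embedded after two other groups, the relative form still finds "zz" ...
my $relative_ok = $text =~ /(x)(y)$doubled_relative/ ? 1 : 0;

# ... while \1 now points at (x), so the absolute form fails to match.
my $absolute_ok = $text =~ /(x)(y)$doubled_absolute/ ? 1 : 0;

print "relative: $relative_ok, absolute: $absolute_ok\n";
```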

=item Regexp::Keep internalized

The functionality of Jeff Pinyan's module Regexp::Keep has been added to
the core. In regular expressions you can now use the special escape C<\K>
as a way to do something like floating length positive lookbehind. It is
also useful in substitutions like:

    s/(foo)bar/$1/g

that can now be converted to

    s/foo\Kbar//g

which is much more efficient.

=back
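
The equivalence of the two substitutions can be checked with a short
script. This is a sketch only; the sample string and variable names are
assumptions.

```perl
use strict;
use warnings;

my $with_capture = 'foobar foobaz';
my $with_keep    = 'foobar foobaz';

# Older idiom: capture "foo" and put it back.
$with_capture =~ s/(foo)bar/$1/g;

# With \K: everything matched before \K is kept, only "bar" is replaced.
$with_keep =~ s/foo\Kbar//g;

print "$with_capture\n$with_keep\n";
```

Both lines print C<foo foobaz>.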

=head2 The C<_> prototype

A new prototype character has been added. C<_> is equivalent to C<$> (it
denotes a scalar), but defaults to C<$_> if the corresponding argument
isn't supplied. Due to the optional nature of the argument, you can only
use it at the end of a prototype, or before a semicolon.

This has a small incompatible consequence: the prototype() function has
been adjusted to return C<_> for some built-ins in appropriate cases (for
example, C<prototype('CORE::rmdir')>). (Rafael Garcia-Suarez)
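
A minimal sketch of a user-defined function with this prototype follows.
The function name C<shout> is hypothetical and exists only for this
example.

```perl
use strict;
use warnings;

# Hypothetical helper: the _ prototype makes the argument default to $_.
sub shout(_) {
    my ($text) = @_;
    return uc $text;
}

my $explicit = shout('hello');
my @implicit = map { shout() } qw(foo bar);   # no argument: $_ is used

print "$explicit @implicit\n";

# prototype() reports _ for built-ins that default to $_ in this way.
my $rmdir_proto = prototype('CORE::rmdir');
print "$rmdir_proto\n";
```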

=head2 UNITCHECK blocks

C<UNITCHECK>, a new special code block, has been introduced in addition to
C<BEGIN>, C<CHECK>, C<INIT> and C<END>.

C<CHECK> and C<INIT> blocks, while useful for some specialized purposes,
are always executed at the transition between the compilation and the
execution of the main program, and thus are useless whenever code is
loaded at runtime. On the other hand, C<UNITCHECK> blocks are executed
just after the unit which defined them has been compiled. See L<perlmod>
for more information. (Alex Gough)
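
The timing can be observed with code compiled at run time. The sketch
below uses a string C<eval> as the compilation unit; the C<@events> array
is an assumption made for the example.

```perl
use strict;
use warnings;

our @events;

# Code compiled at run time through a string eval. The UNITCHECK block
# runs as soon as this unit has been compiled, before any of it executes.
eval q{
    push @events, 'body runs';
    UNITCHECK { push @events, 'UNITCHECK runs' }
    1;
} or die $@;

print "$_\n" for @events;
```

Although the C<UNITCHECK> block appears after the C<push> in the source,
its entry is recorded first.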

=head2 readpipe() is now overridable

The built-in function readpipe() is now overridable. Overriding it also
overrides its operator counterpart, C<qx//> (a.k.a. C<``>). (Rafael
Garcia-Suarez)
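
One way to try this is a global override installed through
C<CORE::GLOBAL>. The sketch below is only an illustration: the
replacement records the command instead of running it, and the
C<@commands> array is a made-up name.

```perl
use strict;
use warnings;

# Install a global override at compile time. The replacement here is a
# stand-in that just records the command instead of running it.
our @commands;
BEGIN {
    no warnings 'once';
    *CORE::GLOBAL::readpipe = sub {
        my ($command) = @_;
        push @commands, $command;
        return "intercepted: $command\n";
    };
}

my $from_function = readpipe('echo hello');
my $from_operator = `echo world`;

print $from_function, $from_operator;
```

Because no external command is started, a stand-in like this can be
handy when testing code that shells out.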

=head2 UCD 5.0.0

The copy of the Unicode Character Database included in Perl 5.9 has
been updated to version 5.0.0.

=head1 Modules and Pragmas

=head2 New Core Modules

=over 4

=item *

C<Locale::Maketext::Simple>, needed by CPANPLUS, is a simple wrapper around
C<Locale::Maketext::Lexicon>. Note that C<Locale::Maketext::Lexicon> isn't
included in the perl core; the behaviour of C<Locale::Maketext::Simple>
gracefully degrades when the latter isn't present.

=item *

C<Params::Check> implements a generic input parsing/checking mechanism. It
is used by CPANPLUS.

=item *

C<Term::UI> simplifies the task of asking questions at a terminal prompt.

=item *

C<Object::Accessor> provides an interface to create per-object accessors.

=back

=head2 Module changes

=over 4

=item C<base>

The C<base> pragma now warns if a class tries to inherit from itself.

=item C<warnings>

The C<warnings> pragma doesn't load C<Carp> anymore. That means that code
that used C<Carp> routines without having loaded it at compile time might
need to be adjusted; typically, the following (faulty) code won't work
anymore, and will require parentheses to be added after the function name:

    use warnings;
    require Carp;
    Carp::confess "argh";

=back
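
For the C<warnings> change above, the adjusted form of the faulty
snippet might look like the following sketch. The surrounding C<eval> is
only there so that the example can report the error instead of dying.

```perl
use strict;
use warnings;

# Carp is loaded at run time, so its functions are not known as list
# operators when the call below is compiled: the parentheses are required.
require Carp;

my $ok    = eval { Carp::confess("argh"); 1 };
my $error = $@;

print "confess died with: ", (split /\n/, $error)[0], "\n" unless $ok;
```

Loading the module at compile time with C<use Carp;> is another way to
avoid the problem.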

=head1 Utility Changes

=head1 Documentation

=head1 Performance Enhancements

=head1 Installation and Configuration Improvements

=head2 C++ compatibility

Efforts have been made to make perl and the core XS modules compilable
with various C++ compilers (although the situation is not perfect with
some of the compilers on some of the platforms tested).

=head2 Ports

Perl has been reported to work on MidnightBSD.

=head1 Selected Bug Fixes

PerlIO::scalar will now prevent writing to read-only scalars. Moreover,
seek() is now supported with PerlIO::scalar-based filehandles, the
underlying string being zero-filled as needed.
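
The zero-filling can be seen with an in-memory filehandle. This is a
small sketch; the variable names and the offset are arbitrary.

```perl
use strict;
use warnings;

# Open an in-memory filehandle on an (initially empty) scalar.
my $buffer = '';
open(my $fh, '>', \$buffer) or die "cannot open in-memory handle: $!";

# Seek past the end of the string, then write one byte there.
seek($fh, 4, 0) or die "seek failed: $!";
print {$fh} 'X';
close($fh);

# The gap left by the seek is filled with NUL bytes.
my $hex = join ' ', map { sprintf '%02x', ord } split //, $buffer;
printf "length %d, bytes: %s\n", length $buffer, $hex;
```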

study() never worked for UTF-8 strings, but could lead to false results.
It's now a no-op on UTF-8 data. (Yves Orton)

The signals SIGILL, SIGBUS and SIGSEGV are now always delivered in an
"unsafe" manner (unlike other signals, which are deferred until the
perl interpreter reaches a reasonably stable state; see
L<perlipc/"Deferred Signals (Safe Signals)">).

When a module or a file is loaded through an @INC-hook, and when this hook
has set a filename entry in %INC, __FILE__ is now set for this module
according to the contents of that %INC entry.
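
A rough sketch of such a hook is shown below. The module name
C<My::Virtual>, its source text and the path stored in C<%INC> are all
invented for the example.

```perl
use strict;
use warnings;

# Hypothetical hook: serves the made-up module My::Virtual from a string
# and records a filename of its own choosing in %INC.
my $source = 'package My::Virtual; sub file { __FILE__ } 1;';

unshift @INC, sub {
    my ($self, $path) = @_;
    return unless $path eq 'My/Virtual.pm';
    $INC{$path} = '/virtual/My/Virtual.pm';
    open(my $fh, '<', \$source) or die "cannot open in-memory source: $!";
    return $fh;
};

require My::Virtual;
my $reported = My::Virtual::file();
print "$reported\n";
```

The value printed is the one the hook stored in C<%INC>.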

=head1 New or Changed Diagnostics

=head1 Changed Internals

The anonymous hash and array constructors now take 1 op in the optree
instead of 3, now that pp_anonhash and pp_anonlist return a reference to
a hash/array when the op is flagged with OPf_SPECIAL. (Nicholas Clark)

=head1 Known Problems

=head2 Platform Specific Problems

=head1 Reporting Bugs

If you find what you think is a bug, you might check the articles
recently posted to the comp.lang.perl.misc newsgroup and the perl
bug database at http://rt.perl.org/rt3/ . There may also be
information at http://www.perl.org/ , the Perl Home Page.

If you believe you have an unreported bug, please run the B<perlbug>
program included with your release. Be sure to trim your bug down
to a tiny but sufficient test case. Your bug report, along with the
output of C<perl -V>, will be sent off to perlbug@perl.org to be
analysed by the Perl porting team.

=head1 SEE ALSO

The F<Changes> file for exhaustive details on what changed.

The F<INSTALL> file for how to build Perl.

The F<README> file for general stuff.

The F<Artistic> and F<Copying> files for copyright information.

=cut