=head1 NAME

perl5004delta - what's new for perl5.004

=head1 DESCRIPTION

This document describes differences between the 5.003 release (as
documented in I<Programming Perl>, second edition--the Camel Book) and
this one.

=head1 Supported Environments

Perl5.004 builds out of the box on Unix, Plan 9, LynxOS, VMS, OS/2,
QNX, AmigaOS, and Windows NT.  Perl runs on Windows 95 as well, but it
cannot be built there, for lack of a reasonable command interpreter.

=head1 Core Changes

Most importantly, many bugs were fixed, including several security
problems.  See the F<Changes> file in the distribution for details.

=head2 List assignment to %ENV works

C<%ENV = ()> and C<%ENV = @list> now work as expected (except on VMS
where it generates a fatal error).

=head2 Change to "Can't locate Foo.pm in @INC" error

The error "Can't locate Foo.pm in @INC" now lists the contents of @INC
for easier debugging.

=head2 Compilation option: Binary compatibility with 5.003

There is a new Configure question that asks if you want to maintain
binary compatibility with Perl 5.003.  If you choose binary
compatibility, you do not have to recompile your extensions, but you
might have symbol conflicts if you embed Perl in another application,
just as in the 5.003 release.  By default, binary compatibility
is preserved at the expense of symbol table pollution.

=head2 $PERL5OPT environment variable

You may now put Perl options in the $PERL5OPT environment variable.
Unless Perl is running with taint checks, it will interpret this
variable as if its contents had appeared on a "#!perl" line at the
beginning of your script, except that hyphens are optional.  PERL5OPT
may only be used to set the following switches: B<-[DIMUdmw]>.

=head2 Limitations on B<-M>, B<-m>, and B<-T> options

The C<-M> and C<-m> options are no longer allowed on the C<#!> line of
a script.  If a script needs a module, it should invoke it with the
C<use> pragma.

The B<-T> option is also forbidden on the C<#!> line of a script,
unless it was present on the Perl command line.
Due to the way C<#!>
works, this usually means that B<-T> must be in the first argument.
Thus:

    #!/usr/bin/perl -T -w

will probably work for an executable script invoked as C<scriptname>,
while:

    #!/usr/bin/perl -w -T

will probably fail under the same conditions.  (Non-Unix systems will
probably not follow this rule.)  But C<perl scriptname> is guaranteed
to fail, since then there is no chance of B<-T> being found on the
command line before it is found on the C<#!> line.

=head2 More precise warnings

If you removed the B<-w> option from your Perl 5.003 scripts because it
made Perl too verbose, we recommend that you try putting it back when
you upgrade to Perl 5.004.  Each new perl version tends to remove some
undesirable warnings, while adding new warnings that may catch bugs in
your scripts.

=head2 Deprecated: Inherited C<AUTOLOAD> for non-methods

Before Perl 5.004, C<AUTOLOAD> functions were looked up as methods
(using the C<@ISA> hierarchy), even when the function to be autoloaded
was called as a plain function (e.g. C<Foo::bar()>), not a method
(e.g. C<< Foo->bar() >> or C<< $obj->bar() >>).

Perl 5.005 will use method lookup only for methods' C<AUTOLOAD>s.
However, there is a significant base of existing code that may be using
the old behavior.  So, as an interim step, Perl 5.004 issues an optional
warning when a non-method uses an inherited C<AUTOLOAD>.

The simple rule is:  Inheritance will not work when autoloading
non-methods.  The simple fix for old code is:  In any module that used to
depend on inheriting C<AUTOLOAD> for non-methods from a base class named
C<BaseClass>, execute C<*AUTOLOAD = \&BaseClass::AUTOLOAD> during startup.

=head2 Previously deprecated %OVERLOAD is no longer usable

Using %OVERLOAD to define overloading was deprecated in 5.003.
Overloading is now defined using the overload pragma.  %OVERLOAD is
still used internally but should not be used by Perl scripts.  See
L<overload> for more details.

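
As a minimal sketch of declaring overloading through the pragma rather
than through %OVERLOAD, consider the following.  The C<Vec2> class and
its C<add> and C<to_string> methods are hypothetical names invented for
this illustration; only the C<use overload> call itself comes from the
overload pragma.

```perl
use strict;

# Hypothetical class, used only to illustrate the pragma.
package Vec2;

# Declare the overloaded operators with the pragma instead of
# populating %OVERLOAD by hand.
use overload
    '+'  => \&add,
    '""' => \&to_string;

sub new {
    my ($class, $x, $y) = @_;
    return bless { x => $x, y => $y }, $class;
}

# Called for "$lhs + $rhs"; returns a new object.
sub add {
    my ($lhs, $rhs) = @_;
    return Vec2->new($lhs->{x} + $rhs->{x}, $lhs->{y} + $rhs->{y});
}

# Called whenever the object is used as a string.
sub to_string {
    my ($self) = @_;
    return "($self->{x}, $self->{y})";
}

package main;

my $sum = Vec2->new(1, 2) + Vec2->new(3, 4);
print "$sum\n";    # prints "(4, 6)"
```

Code that used to assign to %OVERLOAD directly would move its operator
table into a C<use overload> list along these lines.
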

=head2 Subroutine arguments created only when they're modified

In Perl 5.004, nonexistent array and hash elements used as subroutine
parameters are brought into existence only if they are actually
assigned to (via C<@_>).

Earlier versions of Perl vary in their handling of such arguments.
Perl versions 5.002 and 5.003 always brought them into existence.
Perl versions 5.000 and 5.001 brought them into existence only if
they were not the first argument (which was almost certainly a bug).
Earlier versions of Perl never brought them into existence.

For example, given this code:

    undef @a; undef %a;
    sub show { print $_[0] };
    sub change { $_[0]++ };
    show($a[2]);
    change($a{b});

After this code executes in Perl 5.004, $a{b} exists but $a[2] does
not.  In Perl 5.002 and 5.003, both $a{b} and $a[2] would have existed
(but $a[2]'s value would have been undefined).

=head2 Group vector changeable with C<$)>

The C<$)> special variable has always (well, in Perl 5, at least)
reflected not only the current effective group, but also the group list
as returned by the C<getgroups()> C function (if there is one).
However, until this release, there has not been a way to call the
C<setgroups()> C function from Perl.

In Perl 5.004, assigning to C<$)> is exactly symmetrical with examining
it: The first number in its string value is used as the effective gid;
if there are any numbers after the first one, they are passed to the
C<setgroups()> C function (if there is one).

=head2 Fixed parsing of $$<digit>, &$<digit>, etc.

Perl versions before 5.004 misinterpreted any type marker followed by
"$" and a digit.  For example, "$$0" was incorrectly taken to mean
"${$}0" instead of "${$0}".  This bug is (mostly) fixed in Perl 5.004.

However, the developers of Perl 5.004 could not fix this bug completely,
because at least two widely-used modules depend on the old meaning of
"$$0" in a string.  So Perl 5.004 still interprets "$$<digit>" in the
old (broken) way inside strings; but it generates this message as a
warning.
And in Perl 5.005, this special treatment will cease.

=head2 Fixed localization of $<digit>, $&, etc.

Perl versions before 5.004 did not always properly localize the
regex-related special variables.  Perl 5.004 does localize them, as
the documentation has always said it should.  This may result in $1,
$2, etc. no longer being set where existing programs use them.

=head2 No resetting of $. on implicit close

The documentation for Perl 5.0 has always stated that C<$.> is I<not>
reset when an already-open file handle is reopened with no intervening
call to C<close>.  Due to a bug, perl versions 5.000 through 5.003
I<did> reset C<$.> under that circumstance; Perl 5.004 does not.

=head2 C<wantarray> may return undef

The C<wantarray> operator returns true if a subroutine is expected to
return a list, and false otherwise.  In Perl 5.004, C<wantarray> can
also return the undefined value if a subroutine's return value will
not be used at all, which allows subroutines to avoid a time-consuming
calculation of a return value if it isn't going to be used.

=head2 C<eval EXPR> determines value of EXPR in scalar context

Perl (version 5) used to determine the value of EXPR inconsistently,
sometimes incorrectly using the surrounding context for the determination.
Now, the value of EXPR (before being parsed by eval) is always determined in
a scalar context.  Once parsed, it is executed as before, by providing
the context that the scope surrounding the eval provided.  This change
makes the behavior Perl4 compatible, besides fixing bugs resulting from
the inconsistent behavior.  This program:

    @a = qw(time now is time);
    print eval @a;
    print '|', scalar eval @a;

used to print something like "timenowis881399109|4", but now (and in perl4)
prints "4|4".

=head2 Changes to tainting checks

A bug in previous versions may have failed to detect some insecure
conditions when taint checks are turned on.  (Taint checks are used
in setuid or setgid scripts, or when explicitly turned on with the
C<-T> invocation option.)
Although it's unlikely, this may cause a
previously-working script to now fail, which should be construed
as a blessing since that indicates a potentially-serious security
hole was just plugged.

The new restrictions when tainting include:

=over 4

=item No glob() or <*>

These operators may spawn the C shell (csh), which cannot be made
safe.  This restriction will be lifted in a future version of Perl
when globbing is implemented without the use of an external program.

=item No spawning if tainted $CDPATH, $ENV, $BASH_ENV

These environment variables may alter the behavior of spawned programs
(especially shells) in ways that subvert security.  So now they are
treated as dangerous, in the manner of $IFS and $PATH.

=item No spawning if tainted $TERM doesn't look like a terminal name

Some termcap libraries do unsafe things with $TERM.  However, it would be
unnecessarily harsh to treat all $TERM values as unsafe, since only shell
metacharacters can cause trouble in $TERM.  So a tainted $TERM is
considered to be safe if it contains only alphanumerics, underscores,
dashes, and colons, and unsafe if it contains other characters (including
whitespace).

=back

=head2 New Opcode module and revised Safe module

A new Opcode module supports the creation, manipulation and
application of opcode masks.  The revised Safe module has a new API
and is implemented using the new Opcode module.  Please read the new
Opcode and Safe documentation.

=head2 Embedding improvements

In older versions of Perl it was not possible to create more than one
Perl interpreter instance inside a single process without leaking like a
sieve and/or crashing.  The bugs that caused this behavior have all been
fixed.  However, you still must take care when embedding Perl in a C
program.  See the updated perlembed manpage for tips on how to manage
your interpreters.

=head2 Internal change: FileHandle class based on IO::* classes

File handles are now stored internally as type IO::Handle.
The FileHandle module is still supported for backwards compatibility,
but it is now merely a front end to the IO::* modules, specifically
IO::Handle, IO::Seekable, and IO::File.  We suggest, but do not
require, that you use the IO::* modules in new code.

In harmony with this change, C<*GLOB{FILEHANDLE}> is now just a
backward-compatible synonym for C<*GLOB{IO}>.

=head2 Internal change: PerlIO abstraction interface

It is now possible to build Perl with AT&T's sfio IO package
instead of stdio.  See L<perlapio> for more details, and
the F<INSTALL> file for how to use it.

=head2 New and changed syntax

=over 4

=item $coderef->(PARAMS)

A subroutine reference may now be suffixed with an arrow and a
(possibly empty) parameter list.  This syntax denotes a call of the
referenced subroutine, with the given parameters (if any).

This new syntax follows the pattern of S<C<< $hashref->{FOO} >>> and
S<C<< $aryref->[$foo] >>>: You may now write S<C<&$subref($foo)>> as
S<C<< $subref->($foo) >>>.  All these arrow terms may be chained;
thus, S<C<< &{$table->{FOO}}($bar) >>> may now be written
S<C<< $table->{FOO}->($bar) >>>.

=back

=head2 New and changed builtin constants

=over 4

=item __PACKAGE__

The current package name at compile time, or the undefined value if
there is no current package (due to a C<package;> directive).  Like
C<__FILE__> and C<__LINE__>, C<__PACKAGE__> does I<not> interpolate
into strings.

=back

=head2 New and changed builtin variables

=over 4

=item $^E

Extended error message on some platforms.  (Also known as
$EXTENDED_OS_ERROR if you C<use English>).

=item $^H

The current set of syntax checks enabled by C<use strict>.  See the
documentation of C<strict> for more details.  Not actually new, but
newly documented.
Because it is intended for internal use by Perl core components,
there is no C<use English> long name for this variable.

=item $^M

By default, running out of memory is not trappable.  However, if
compiled for this, Perl may use the contents of C<$^M> as an emergency
pool after die()ing with this message.  Suppose that your Perl were
compiled with -DPERL_EMERGENCY_SBRK and used Perl's malloc.
Then

    $^M = 'a' x (1<<16);

would allocate a 64K buffer for use in an emergency.  See the
F<INSTALL> file for information on how to enable this option.  As a
disincentive to casual use of this advanced feature, there is no
C<use English> long name for this variable.

=back

=head2 New and changed builtin functions

=over 4

=item delete on slices

This now works.  (e.g. C<delete @ENV{'PATH', 'MANPATH'}>)

=item flock

is now supported on more platforms, prefers fcntl to lockf when
emulating, and always flushes before (un)locking.

=item printf and sprintf

Perl now implements these functions itself; it doesn't use the C
library function sprintf() any more, except for floating-point
numbers, and even then only known flags are allowed.  As a result, it
is now possible to know which conversions and flags will work, and
what they will do.

The new conversions in Perl's sprintf() are:

   %i   a synonym for %d
   %p   a pointer (the address of the Perl value, in hexadecimal)
   %n   special: *stores* the number of characters output so far
        into the next variable in the parameter list

The new flags that go between the C<%> and the conversion are:

   #    prefix octal with "0", hex with "0x"
   h    interpret integer as C type "short" or "unsigned short"
   V    interpret integer as Perl's standard integer type

Also, where a number would appear in the flags, an asterisk ("*") may
be used instead, in which case Perl uses the next item in the
parameter list as the given number (that is, as the field width or
precision).  If a field width obtained through "*" is negative, it has
the same effect as the '-' flag: left-justification.

See L<perlfunc/sprintf> for a complete list of conversions and flags.

=item keys as an lvalue

As an lvalue, C<keys> allows you to increase the number of hash buckets
allocated for the given hash.  This can gain you a measure of efficiency if
you know the hash is going to get big.  (This is similar to pre-extending
an array by assigning a larger number to $#array.)  If you say

    keys %hash = 200;

then C<%hash> will have at least 200 buckets allocated for it.
These
buckets will be retained even if you do C<%hash = ()>; use C<undef
%hash> if you want to free the storage while C<%hash> is still in scope.
You can't shrink the number of buckets allocated for the hash using
C<keys> in this way (but you needn't worry about doing this by accident,
as trying has no effect).

=item my() in Control Structures

You can now use my() (with or without the parentheses) in the control
expressions of control structures such as:

    while (defined(my $line = <>)) {
        $line = lc $line;
    } continue {
        print $line;
    }

    if ((my $answer = <STDIN>) =~ /^y(es)?$/i) {
        user_agrees();
    } elsif ($answer =~ /^n(o)?$/i) {
        user_disagrees();
    } else {
        chomp $answer;
        die "`$answer' is neither `yes' nor `no'";
    }

Also, you can declare a foreach loop control variable as lexical by
preceding it with the word "my".  For example, in:

    foreach my $i (1, 2, 3) {
        some_function();
    }

$i is a lexical variable, and the scope of $i extends to the end of
the loop, but not beyond it.

Note that you still cannot use my() on global punctuation variables
such as $_ and the like.

=item pack() and unpack()

A new format 'w' represents a BER compressed integer (as defined in
ASN.1).  Its format is a sequence of one or more bytes, each of which
provides seven bits of the total value, with the most significant
first.  Bit eight of each byte is set, except for the last byte, in
which bit eight is clear.

If 'p' or 'P' are given undef as values, they now generate a NULL
pointer.

Both pack() and unpack() now fail when their templates contain invalid
types.  (Invalid types used to be ignored.)

=item sysseek()

The new sysseek() operator is a variant of seek() that sets and gets the
file's system read/write position, using the lseek(2) system call.  It is
the only reliable way to seek before using sysread() or syswrite().  Its
return value is the new position, or the undefined value on failure.

=item use VERSION

If the first argument to C<use> is a number, it is treated as a version
number instead of a module name.
If the version of the Perl interpreter
is less than VERSION, then an error message is printed and Perl exits
immediately.  Because C<use> occurs at compile time, this check happens
immediately during the compilation process, unlike C<require>, which
waits until runtime for the check.  This is often useful if you need to
check the current Perl version before C<use>ing library modules which
have changed in incompatible ways from older versions of Perl.  (We try
not to do this more than we have to.)

=item use Module VERSION LIST

If the VERSION argument is present between Module and LIST, then the
C<use> will call the VERSION method in class Module with the given
version as an argument.  The default VERSION method, inherited from
the UNIVERSAL class, croaks if the given version is larger than the
value of the variable $Module::VERSION.  (Note that there is not a
comma after VERSION!)

This version-checking mechanism is similar to the one currently used
in the Exporter module, but it is faster and can be used with modules
that don't use the Exporter.  It is the recommended method for new
code.

=item prototype(FUNCTION)

Returns the prototype of a function as a string (or C<undef> if the
function has no prototype).  FUNCTION is a reference to or the name of the
function whose prototype you want to retrieve.
(Not actually new; just never documented before.)

=item srand

The default seed for C<srand>, which used to be C