X-Git-Url: https://perl5.git.perl.org/perl5.git/blobdiff_plain/86feb2c5020849c60df097178dd21ab793b7c689..287a962e687f7bb371dc3759b32ea8df45f0ba62:/pod/perlport.pod diff --git a/pod/perlport.pod b/pod/perlport.pod index 74cf721..9920a83 100644 --- a/pod/perlport.pod +++ b/pod/perlport.pod @@ -67,9 +67,9 @@ The important thing is to decide where the code will run and to be deliberate in your decision. The material below is separated into three main sections: main issues of -portability (L<"ISSUES">, platform-specific issues (L<"PLATFORMS">, and +portability (L<"ISSUES">), platform-specific issues (L<"PLATFORMS">), and built-in perl functions that behave differently on various ports -(L<"FUNCTION IMPLEMENTATIONS">. +(L<"FUNCTION IMPLEMENTATIONS">). This information should not be considered complete; it includes possibly transient information about idiosyncrasies of some of the ports, almost @@ -94,21 +94,9 @@ from) C<\015\012>, depending on whether you're reading or writing. Unix does the same thing on ttys in canonical mode. C<\015\012> is commonly referred to as CRLF. -A common cause of unportable programs is the misuse of chop() to trim -newlines: - - # XXX UNPORTABLE! - while() { - chop; - @array = split(/:/); - #... - } - -You can get away with this on Unix and Mac OS (they have a single -character end-of-line), but the same program will break under DOSish -perls because you're only chop()ing half the end-of-line. Instead, -chomp() should be used to trim newlines. The Dunce::Files module can -help audit your code for misuses of chop(). +To trim trailing newlines from text lines use chomp(). With default +settings that function looks for a trailing C<\n> character and thus +trims in a portable way. When dealing with binary files (or text files in binary mode) be sure to explicitly set $/ to the appropriate value for your file format @@ -224,6 +212,10 @@ them in big-endian mode. 
To avoid this problem in network (socket) connections use the C<pack> and C<unpack> formats C<n> and C<N>, the "network" orders. These are guaranteed to be portable. +As of perl 5.9.2, you can also use the C<E<gt>> and C<E<lt>> modifiers +to force big- or little-endian byte-order. This is useful if you want +to store signed integers or 64-bit integers, for example. + You can explore the endianness of your platform by unpacking a data structure packed in native format such as: @@ -356,7 +348,8 @@ Whitespace in filenames is tolerated on most systems, but not all, and even on systems where it might be tolerated, some utilities might become confused by such whitespace. -Many systems (DOS, VMS) cannot have more than one C<.> in their filenames. +Many systems (DOS, VMS ODS-2) cannot have more than one C<.> in their +filenames. Don't assume C<< > >> won't be the first character of a filename. Always use C<< < >> explicitly to open a file for reading, or even @@ -440,6 +433,16 @@ if you really have to, make it conditional on C<$^O ne 'VMS'> since in VMS the C<%ENV> table is much more than a per-process key-value string table. +On VMS, some entries in the %ENV hash are dynamically created when +their key is used on a read if they did not previously exist. The +values for C<$ENV{HOME}>, C<$ENV{TERM}>, C<$ENV{HOME}>, and C<$ENV{USER}>, +are known to be dynamically generated. The specific names that are +dynamically generated may vary with the version of the C library on VMS, +and more may exist than is documented. + +On VMS by default, changes to the %ENV hash are persistent after the process +exits. This can cause unintended issues. + Don't count on signals or C<%SIG> for anything. Don't count on filename globbing. Use C<opendir>, C<readdir>, and @@ -476,12 +479,14 @@ file name. 
To convert $^X to a file pathname, taking account of the requirements of the various operating system possibilities, say: + use Config; $thisperl = $^X; if ($^O ne 'VMS') {$thisperl .= $Config{_exe} unless $thisperl =~ m/$Config{_exe}$/i;} To convert $Config{perlpath} to a file pathname, say: + use Config; $thisperl = $Config{perlpath}; if ($^O ne 'VMS') @@ -497,19 +502,28 @@ to the public Internet. Don't assume that you can reach outside world through any other port than 80, or some web proxy. ftp is blocked by many firewalls. +Don't assume that you can send email by connecting to the local SMTP port. + Don't assume that you can reach yourself or any node by the name -'localhost'. The same goes for '127.0.0.1'. You will have to try -both. +'localhost'. The same goes for '127.0.0.1'. You will have to try both. Don't assume that the host has only one network card, or that it can't bind to many virtual IP addresses. Don't assume a particular network device name. -Don't assume that any particular port (service) will respond. +Don't assume a particular set of ioctl()s will work. Don't assume that you can ping hosts and get replies. +Don't assume that any particular port (service) will respond. + +Don't assume that Sys::Hostname (or any other API or command) +returns either a fully qualified hostname or a non-qualified hostname: +it all depends on how the system had been configured. Also remember +things like DHCP and NAT-- the hostname you get back might not be very +useful. + All the above "don't":s may look daunting, and they are -- but the key is to degrade gracefully if one cannot reach the particular network service one wants. Croaking or hanging do not look very professional. @@ -627,9 +641,6 @@ The value for C<$offset> in Unix will be C<0>, but in Mac OS will be some large number. C<$offset> can then be added to a Unix time value to get what should be the proper value on any system. -On Windows (at least), you shouldn't pass a negative value to C<gmtime> or -C<localtime>. 
- =head2 Character sets and character encoding Assume very little about character sets. @@ -643,9 +654,9 @@ Do not assume that the alphabetic characters are encoded contiguously Do not assume anything about the ordering of the characters. The lowercase letters may come before or after the uppercase letters; -the lowercase and uppercase may be interlaced so that both `a' and `A' -come before `b'; the accented and other international characters may -be interlaced so that E<auml> comes before `b'. +the lowercase and uppercase may be interlaced so that both "a" and "A" +come before "b"; the accented and other international characters may +be interlaced so that E<auml> comes before "b". =head2 Internationalisation @@ -668,12 +679,9 @@ ISO 8859-1 bytes beyond 0x7f into your strings might cause trouble later. If the bytes are native 8-bit bytes, you can use the C<bytes> pragma. If the bytes are in a string (regular expression being a curious string), you can often also use the C<\xHH> notation instead -of embedding the bytes as-is. If they are in some particular legacy -encoding (ether single-byte or something more complicated), you can -use the C<encoding> pragma. (If you want to write your code in UTF-8, -you can use either the C<utf8> pragma, or the C<encoding> pragma.) -The C<bytes> and C<utf8> pragmata are available since Perl 5.6.0, and -the C<encoding> pragma since Perl 5.8.0. +of embedding the bytes as-is. (If you want to write your code in UTF-8, +you can use the C<utf8>.) The C<bytes> and C<utf8> pragmata are +available since Perl 5.6.0. =head2 System Resources @@ -757,11 +765,17 @@ problems in their code that crop up because of lack of testing on other platforms; two, to provide users with information about whether a given module works on a given platform. 
+Also see: + =over 4 -=item Mailing list: cpan-testers@perl.org +=item * + +Mailing list: cpan-testers@perl.org + +=item * -=item Testing results: http://testers.cpan.org/ +Testing results: http://testers.cpan.org/ =back @@ -798,6 +812,7 @@ are a few of the more popular Unix flavors: dgux dgux AViiON-dgux DYNIX/ptx dynixptx i386-dynixptx FreeBSD freebsd freebsd-i386 + Haiku haiku BePC-haiku Linux linux arm-linux Linux linux i386-linux Linux linux i586-linux @@ -884,10 +899,11 @@ DOSish perls are as follows: Windows NT MSWin32 MSWin32-x86 2 4 xx Windows NT MSWin32 MSWin32-ALPHA 2 4 xx Windows NT MSWin32 MSWin32-ppc 2 4 xx - Windows 2000 MSWin32 MSWin32-x86 2 5 xx - Windows XP MSWin32 MSWin32-x86 2 ? + Windows 2000 MSWin32 MSWin32-x86 2 5 00 + Windows XP MSWin32 MSWin32-x86 2 5 01 + Windows 2003 MSWin32 MSWin32-x86 2 5 02 Windows CE MSWin32 ? 3 - Cygwin cygwin ? + Cygwin cygwin cygwin The various MSWin32 Perl's can distinguish the OS they are running on via the value of the fifth element of the list returned from @@ -1026,11 +1042,18 @@ The MacPerl Pages, http://www.macperl.com/ . The MacPerl mailing lists, http://lists.perl.org/ . +=item * + +MPW, ftp://ftp.apple.com/developer/Tool_Chest/Core_Mac_OS_Tools/ + =back =head2 VMS Perl on VMS is discussed in L in the perl distribution. + +The official name of VMS as of this writing is OpenVMS. + Perl on VMS can accept either VMS- or Unix-style file specifications as in either of the following: @@ -1067,39 +1090,112 @@ you are so inclined. For example: Do take care with C<$ ASSIGN/nolog/user SYS$COMMAND: SYS$INPUT> if your perl-in-DCL script expects to do things like C<< $read = ; >>. -Filenames are in the format "name.extension;version". The maximum -length for filenames is 39 characters, and the maximum length for +The VMS operating system has two filesystems, known as ODS-2 and ODS-5. + +For ODS-2, filenames are in the format "name.extension;version". 
The +maximum length for filenames is 39 characters, and the maximum length for extensions is also 39 characters. Version is a number from 1 to 32767. Valid characters are C</[A-Z0-9$_-]/>. -VMS's RMS filesystem is case-insensitive and does not preserve case. -C<readdir> returns lowercased filenames, but specifying a file for -opening remains case-insensitive. Files without extensions have a -trailing period on them, so doing a C<readdir> with a file named F<A.;5> -will return F<a.> (though that file could be opened with +The ODS-2 filesystem is case-insensitive and does not preserve case. +Perl simulates this by converting all filenames to lowercase internally. + +For ODS-5, filenames may have almost any character in them and can include +Unicode characters. Characters that could be misinterpreted by the DCL +shell or file parsing utilities need to be prefixed with the C<^> +character, or replaced with hexadecimal characters prefixed with the +C<^> character. Such prefixing is only needed when the pathnames are +in VMS format in applications. Programs that can accept the UNIX format +of pathnames do not need the escape characters. The maximum length for +filenames is 255 characters. The ODS-5 file system can handle both +a case preserved and a case sensitive mode. + +ODS-5 is only available on the OpenVMS for 64 bit platforms. + +Support for the extended file specifications is being done as optional +settings to preserve backward compatibility with Perl scripts that +assume the previous VMS limitations. + +In general routines on VMS that get a UNIX format file specification +should return it in a UNIX format, and when they get a VMS format +specification they should return a VMS format unless they are documented +to do a conversion. + +For routines that generate and return a file specification, a setting in +the C library on which Perl is built controls whether it is returned in VMS +format or in UNIX format. 
+ +With the ODS-2 file system, there is not much difference in syntax of +filenames without paths for VMS or UNIX. With the extended character +set available with ODS-5 there can be a significant difference. + +Because of this, existing Perl scripts written for VMS were sometimes +treating VMS and UNIX filenames interchangeably. Without the extended +character set enabled, this behavior will mostly be maintained for +backwards compatibility. + +When extended characters are enabled with ODS-5, the handling of +UNIX formatted file specifications is to that of a UNIX system. + +VMS file specifications without extensions have a trailing dot. An +equivalent UNIX file specification should not show the trailing dot. + +The result of all of this, is that for VMS, for portable scripts, you +can not depend on Perl to present the filenames in lowercase, to be +case sensitive, and that the filenames could be returned in either +UNIX or VMS format. + +And if a routine returns a file specification, unless it is intended to +convert it, it should return it in the same format as it found it. + +C<readdir> by default has traditionally returned lowercased filenames. +When the ODS-5 support is enabled, it will return the exact case of the +filename on the disk. + +Files without extensions have a trailing period on them, so doing a +C<readdir> in the default mode with a file named F<A.;5> will +return F<a.> when VMS is (though that file could be opened with C<open(FH, 'A')>). +With support for extended file specifications and if C<opendir> was +given a UNIX format directory, a file named F<A.;5> will return F<a> +and optionally in the exact case on the disk. When C<opendir> is given +a VMS format directory, then C<readdir> should return F<a.>, and +again optionally in the exact case. + RMS had an eight level limit on directory depths from any rooted logical -(allowing 16 levels overall) prior to VMS 7.2. Hence -C<PERL_ROOT:[LIB.2.3.4.5.6.7.8]> is a valid directory specification but -C<PERL_ROOT:[LIB.2.3.4.5.6.7.8.9]> is not. F<Makefile.PL> authors might -have to take this into account, but at least they can refer to the former -as C</PERL_ROOT/lib/2/3/4/5/6/7/8/>. 
+(allowing 16 levels overall) prior to VMS 7.2, and even with versions of +VMS on VAX up through 7.3. Hence C<PERL_ROOT:[LIB.2.3.4.5.6.7.8]> is a +valid directory specification but C<PERL_ROOT:[LIB.2.3.4.5.6.7.8.9]> is +not. F<Makefile.PL> authors might have to take this into account, but at +least they can refer to the former as C</PERL_ROOT/lib/2/3/4/5/6/7/8/>. + +Pumpkings and module integrators can easily see whether files with too many +directory levels have snuck into the core by running the following in the +top-level source directory: + + $ perl -ne "$_=~s/\s+.*//; print if scalar(split /\//) > 8;" < MANIFEST + The VMS::Filespec module, which gets installed as part of the build process on VMS, is a pure Perl module that can easily be installed on non-VMS platforms and can be helpful for conversions to and from RMS -native formats. +native formats. It is also now the only way that you should check to +see if VMS is in a case sensitive mode. What C<\n> represents depends on the type of file opened. It usually represents C<\012> but it could also be C<\015>, C<\012>, C<\015\012>, -C<\000>, C<\040>, or nothing depending on the file organiztion and +C<\000>, C<\040>, or nothing depending on the file organization and record format. The VMS::Stdio module provides access to the special fopen() requirements of files with unusual attributes on VMS. TCP/IP stacks are optional on VMS, so socket routines might not be implemented. UDP sockets may not be supported. +The TCP/IP library support for all current versions of VMS is dynamically +loaded if present, so even if the routines are configured, they may +return a status indicating that they are not implemented. + The value of C<$^O> on OpenVMS is "VMS". 
To determine the architecture that you are running on without resorting to loading all of C<%Config> you can examine the content of the C<@INC> array like so: @@ -1110,10 +1206,16 @@ you can examine the content of the C<@INC> array like so: } elsif (grep(/VMS_VAX/, @INC)) { print "I'm on VAX!\n"; + } elsif (grep(/VMS_IA64/, @INC)) { + print "I'm on IA64!\n"; + } else { print "I'm not so sure about where $^O is...\n"; } +In general, the significant differences should only be if Perl is running +on VMS_VAX or one of the 64 bit OpenVMS platforms. + On VMS, perl determines the UTC offset from the C logical name. Although the VMS epoch began at 17-NOV-1858 00:00:00.00, calls to C are adjusted to count offsets from @@ -1129,9 +1231,7 @@ F (installed as L), L =item * -vmsperl list, majordomo@perl.org - -(Put the words C in message body.) +vmsperl list, vmsperl-subscribe@perl.org =item * @@ -1157,7 +1257,8 @@ names, because the VOS port of Perl interprets it as a pathname delimiting character, VOS files, directories, or links whose names contain a slash character cannot be processed. Such files must be renamed before they can be processed by Perl. Note that VOS limits -file names to 32 or fewer characters. +file names to 32 or fewer characters, file names cannot start with a +C<-> character, or contain any character matching C<< tr/ !%&'()*+;<>?// >> The value of C<$^O> on VOS is "VOS". To determine the architecture that you are running on without resorting to loading all of C<%Config> you @@ -1275,8 +1376,6 @@ Also see: =item * -* - L, F, F, F, L. @@ -1286,7 +1385,7 @@ The perl-mvs@perl.org list is for discussion of porting issues as well as general usage issues for all EBCDIC Perls. Send a message body of "subscribe perl-mvs" to majordomo@perl.org. -=item * +=item * AS/400 Perl information at http://as400.rochester.ibm.com/ @@ -1474,16 +1573,17 @@ L for a full description of available variables. 
=over 8 -=item -X FILEHANDLE - -=item -X EXPR - =item -X C<-r>, C<-w>, and C<-x> have a limited meaning only; directories and applications are executable, and there are no uid/gid considerations. C<-o> is not supported. (S) +C<-w> only inspects the read-only file attribute (FILE_ATTRIBUTE_READONLY), +which determines whether the directory can be deleted, not whether it can +be written to. Directories always have read and write access unless denied +by discretionary access control lists (DACLs). (S) + C<-r>, C<-w>, C<-x>, and C<-o> tell whether the file is accessible, which may not reflect UIC-based file protections. (VMS) @@ -1500,9 +1600,11 @@ C<-x>, C<-o>. (S, Win32, VMS, S) C<-b>, C<-c>, C<-k>, C<-g>, C<-p>, C<-u>, C<-A> are not implemented. (S) -C<-g>, C<-k>, C<-l>, C<-p>, C<-u>, C<-A> are not particularly meaningful. +C<-g>, C<-k>, C<-l>, C<-u>, C<-A> are not particularly meaningful. (Win32, VMS, S) +C<-p> is not particularly meaningful. (VMS, S) + C<-d> is true if passed a device spec without an explicit directory. (VMS) @@ -1516,13 +1618,18 @@ suffixes. C<-S> is meaningless. (Win32) C<-x> (or C<-X>) determine if a file has an executable file type. (S) -=item alarm SECONDS +=item atan2 -=item alarm +Due to issues with various CPUs, math libraries, compilers, and standards, +results for C may vary depending on any combination of the above. +Perl attempts to conform to the Open Group/IEEE standards for the results +returned from C, but cannot force the issue if the system Perl is +run on does not allow it. (Tru64, HP-UX 10.20) -Not implemented. (Win32) +The current version of the standards for C is available at +L. -=item binmode FILEHANDLE +=item binmode Meaningless. (S, S) @@ -1533,7 +1640,7 @@ filehandle may be closed, or pointer may be in a different position. The value returned by C may be affected after the call, and the filehandle may be flushed. (Win32) -=item chmod LIST +=item chmod Only limited meaning. 
Disabling/enabling write permission is mapped to locking/unlocking the file. (S) @@ -1548,7 +1655,7 @@ Access permissions are mapped onto VOS access-control list changes. (VOS) The actual permissions set depend on the value of the C in the SYSTEM environment settings. (Cygwin) -=item chown LIST +=item chown Not implemented. (S, Win32, S, S) @@ -1556,34 +1663,32 @@ Does nothing, but won't fail. (Win32) A little funky, because VOS's notion of ownership is a little funky (VOS). -=item chroot FILENAME - =item chroot Not implemented. (S, Win32, VMS, S, S, VOS, VM/ESA) -=item crypt PLAINTEXT,SALT +=item crypt May not be available if library or source was not provided when building perl. (Win32) -=item dbmclose HASH +=item dbmclose Not implemented. (VMS, S, VOS) -=item dbmopen HASH,DBNAME,MODE +=item dbmopen Not implemented. (VMS, S, VOS) -=item dump LABEL +=item dump Not useful. (S, S) -Not implemented. (Win32) +Not supported. (Cygwin, Win32) Invokes VMS debugger. (VMS) -=item exec LIST +=item exec Not implemented. (S) @@ -1592,8 +1697,6 @@ Implemented via Spawn. (VM/ESA) Does not automatically flush output handles on some platforms. (SunOS, Solaris, HP-UX) -=item exit EXPR - =item exit Emulates UNIX exit() (which considers C to indicate an error) by @@ -1601,13 +1704,19 @@ mapping the C<1> to SS$_ABORT (C<44>). This behavior may be overridden with the pragma C. As with the CRTL's exit() function, C is also mapped to an exit status of SS$_NORMAL (C<1>); this mapping cannot be overridden. Any other argument to exit() -is used directly as Perl's exit status. (VMS) +is used directly as Perl's exit status. On VMS, unless the future +POSIX_EXIT mode is enabled, the exit code should always be a valid +VMS exit code and not a generic number. When the POSIX_EXIT mode is +enabled, a generic number will be encoded in a method compatible with +the C library _POSIX_EXIT macro so that it can be decoded by other +programs, particularly ones written in C, like the GNV package. 
(VMS) -=item fcntl FILEHANDLE,FUNCTION,SCALAR +=item fcntl -Not implemented. (Win32, VMS) +Not implemented. (Win32) +Some functions available based on the version of VMS. (VMS) -=item flock FILEHANDLE,OPERATION +=item flock Not implemented (S, VMS, S, VOS). @@ -1626,7 +1735,7 @@ Does not automatically flush output handles on some platforms. Not implemented. (S, S) -=item getpgrp PID +=item getpgrp Not implemented. (S, Win32, VMS, S) @@ -1634,43 +1743,43 @@ Not implemented. (S, Win32, VMS, S) Not implemented. (S, Win32, S) -=item getpriority WHICH,WHO +=item getpriority Not implemented. (S, Win32, VMS, S, VOS, VM/ESA) -=item getpwnam NAME +=item getpwnam Not implemented. (S, Win32) Not useful. (S) -=item getgrnam NAME +=item getgrnam Not implemented. (S, Win32, VMS, S) -=item getnetbyname NAME +=item getnetbyname Not implemented. (S, Win32, S) -=item getpwuid UID +=item getpwuid Not implemented. (S, Win32) Not useful. (S) -=item getgrgid GID +=item getgrgid Not implemented. (S, Win32, VMS, S) -=item getnetbyaddr ADDR,ADDRTYPE +=item getnetbyaddr Not implemented. (S, Win32, S) -=item getprotobynumber NUMBER +=item getprotobynumber Not implemented. (S) -=item getservbyport PORT,PROTO +=item getservbyport Not implemented. (S) @@ -1703,19 +1812,19 @@ Not implemented. (S, Win32, S) Not implemented. (Win32, S) -=item sethostent STAYOPEN +=item sethostent Not implemented. (S, Win32, S, S) -=item setnetent STAYOPEN +=item setnetent Not implemented. (S, Win32, S, S) -=item setprotoent STAYOPEN +=item setprotoent Not implemented. (S, Win32, S, S) -=item setservent STAYOPEN +=item setservent Not implemented. (S, Win32, S) @@ -1747,13 +1856,18 @@ Not implemented. (S, Win32) Not implemented. (S) -=item glob EXPR - =item glob This operator is implemented via the File::Glob extension on most platforms. See L for portability information. +=item gmtime + +In theory, gmtime() is reliable from -2**63 to 2**63-1. 
However, +because work arounds in the implementation use floating point numbers, +it will become inaccurate as the time gets larger. This is a bug and +will be fixed in the future. + =item ioctl FILEHANDLE,FUNCTION,SCALAR Not implemented. (VMS) @@ -1763,7 +1877,7 @@ in the Winsock API does. (Win32) Available only for socket handles. (S) -=item kill SIGNAL, LIST +=item kill C is implemented for the sake of taint checking; use with other signals is unimplemented. (S) @@ -1777,39 +1891,53 @@ and makes it exit immediately with exit status $sig. As in Unix, if $sig is 0 and the specified process exists, it returns true without actually terminating it. (Win32) -=item link OLDFILE,NEWFILE +C will terminate the process specified by $pid and +recursively all child processes owned by it. This is different from +the Unix semantics, where the signal will be delivered to all +processes in the same process group as the process specified by +$pid. (Win32) + +Is not supported for process identification number of 0 or negative +numbers. (VMS) -Not implemented. (S, MPE/iX, VMS, S) +=item link + +Not implemented. (S, MPE/iX, S) Link count not updated because hard links are not quite that hard (They are sort of half-way between hard and soft links). (AmigaOS) -Hard links are implemented on Win32 (Windows NT and Windows 2000) -under NTFS only. +Hard links are implemented on Win32 under NTFS only. They are +natively supported on Windows 2000 and later. On Windows NT they +are implemented using the Windows POSIX subsystem support and the +Perl process will need Administrator or Backup Operator privileges +to create hard links. + +Available on 64 bit OpenVMS 8.2 and later. (VMS) -=item lstat FILEHANDLE +=item localtime -=item lstat EXPR +localtime() has the same range as L, but because time zone +rules change its accuracy for historical and future times may degrade +but usually by no more than an hour. =item lstat -Not implemented. (VMS, S) +Not implemented. 
(S) Return values (especially for device and inode) may be bogus. (Win32) -=item msgctl ID,CMD,ARG +=item msgctl -=item msgget KEY,FLAGS +=item msgget -=item msgsnd ID,MSG,FLAGS +=item msgsnd -=item msgrcv ID,VAR,SIZE,TYPE,FLAGS +=item msgrcv Not implemented. (S, Win32, VMS, S, S, VOS) -=item open FILEHANDLE,EXPR - -=item open FILEHANDLE +=item open The C<|> variants are supported only if ToolServer is installed. (S) @@ -1819,17 +1947,19 @@ open to C<|-> and C<-|> are unsupported. (S, Win32, S) Opening a process does not automatically flush output handles on some platforms. (SunOS, Solaris, HP-UX) -=item pipe READHANDLE,WRITEHANDLE +=item pipe Very limited functionality. (MiNT) -=item readlink EXPR - =item readlink Not implemented. (Win32, VMS, S) -=item select RBITS,WBITS,EBITS,TIMEOUT +=item rename + +Can't move directories between directories on different logical volumes. (Win32) + +=item select Only implemented on sockets. (Win32, VMS) @@ -1837,11 +1967,11 @@ Only reliable on sockets. (S) Note that the C