From 7678ccedef3d2583c849cbd8e5a13ba36925ac4c Mon Sep 17 00:00:00 2001 From: Rafael Garcia-Suarez Date: Fri, 11 Mar 2005 11:12:31 +0000 Subject: [PATCH 1/1] FAQ sync p4raw-id: //depot/perl@24024 --- pod/perlfaq.pod | 10 ++-- pod/perlfaq1.pod | 133 +++++++++++++++++++++++++++++++++++-------------- pod/perlfaq2.pod | 34 ++++++++++--- pod/perlfaq3.pod | 116 +++++++++++++++++++++---------------------- pod/perlfaq4.pod | 89 +++++++++++++++------------------ pod/perlfaq5.pod | 32 ++++++++++-- pod/perlfaq6.pod | 148 ++++++++++++++++++++++++++++++++++++++----------------- pod/perlfaq7.pod | 44 ++++++++++------- pod/perlfaq8.pod | 6 +-- pod/perlfaq9.pod | 75 ++++++++++++++++------------ 10 files changed, 432 insertions(+), 255 deletions(-) diff --git a/pod/perlfaq.pod b/pod/perlfaq.pod index e97a59a..a77d8e4 100644 --- a/pod/perlfaq.pod +++ b/pod/perlfaq.pod @@ -1,6 +1,6 @@ =head1 NAME -perlfaq - frequently asked questions about Perl ($Date: 2004/10/05 22:15:44 $) +perlfaq - frequently asked questions about Perl ($Date: 2005/01/31 15:52:15 $) =head1 DESCRIPTION @@ -119,7 +119,7 @@ Which version of Perl should I use? =item * -What are perl4 and perl5? +What are perl4, perl5, or perl6? =item * @@ -377,7 +377,7 @@ Where can I learn about linking C with Perl? [h2xs, xsubpp] =item * -I've read perlembed, perlguts, etc., but I can't embed perl in +I've read perlembed, perlguts, etc., but I can't embed perl in my C program; what am I doing wrong? =item * @@ -727,6 +727,10 @@ How can I use Perl's C<-i> option from within a program? =item * +How can I copy a file? + +=item * + How do I make a temporary file name? 
=item * diff --git a/pod/perlfaq1.pod b/pod/perlfaq1.pod index 293aa7f..abde261 100644 --- a/pod/perlfaq1.pod +++ b/pod/perlfaq1.pod @@ -1,6 +1,6 @@ =head1 NAME -perlfaq1 - General Questions About Perl ($Revision: 1.15 $, $Date: 2004/10/11 05:06:29 $) +perlfaq1 - General Questions About Perl ($Revision: 1.17 $, $Date: 2005/01/31 15:52:15 $) =head1 DESCRIPTION @@ -56,39 +56,100 @@ users the informal support will more than suffice. See the answer to =head2 Which version of Perl should I use? -You should definitely use version 5. Version 4 is old, limited, and -no longer maintained; its last patch (4.036) was in 1992, long ago and -far away. Sure, it's stable, but so is anything that's dead; in fact, -perl4 had been called a dead, flea-bitten camel carcass. The most -recent production release is 5.8.2 (although 5.005_03 and 5.6.2 are -still supported). The most cutting-edge development release is 5.9. -Further references to the Perl language in this document refer to the -production release unless otherwise specified. There may be one or -more official bug fixes by the time you read this, and also perhaps -some experimental versions on the way to the next release. -All releases prior to 5.004 were subject to buffer overruns, a grave -security issue. - -=head2 What are perl4 and perl5? - -Perl4 and perl5 are informal names for different versions of the Perl -programming language. It's easier to say "perl5" than it is to say -"the 5(.004) release of Perl", but some people have interpreted this -to mean there's a language called "perl5", which isn't the case. -Perl5 is merely the popular name for the fifth major release (October 1994), -while perl4 was the fourth major release (March 1991). There was also a -perl1 (in January 1988), a perl2 (June 1988), and a perl3 (October 1989). - -The 5.0 release is, essentially, a ground-up rewrite of the original -perl source code from releases 1 through 4. 
It has been modularized, -object-oriented, tweaked, trimmed, and optimized until it almost doesn't -look like the old code. However, the interface is mostly the same, and -compatibility with previous releases is very high. -See L. - -To avoid the "what language is perl5?" confusion, some people prefer to -simply use "perl" to refer to the latest version of perl and avoid using -"perl5" altogether. It's not really that big a deal, though. +(contributed by brian d foy) + +There is often a matter of opinion and taste, and there isn't any +one answer that fits anyone. In general, you want to use either +the current stable release, or the stable release immediately prior +to that one. Currently, those are perl5.8.x and perl5.6.x, respectively. + +Beyond that, you have to consider several things and decide which +is best for you. + +=over 4 + +=item + +If things aren't broken, upgrading perl may break +them (or at least issue new warnings). + +=item + +The latest versions of perl have more bug fixes. + +=item + +The Perl community is geared toward supporting the most +recent releases, so you'll have an easier time finding help for +those. + +=item + +Versions prior to perl5.004 had serious security problems with +buffer overflows, and in some cases have CERT advisories (for +instance, http://www.cert.org/advisories/CA-1997-17.html ). + +=item + +The latest versions are probably the least deployed and +widely tested, so you may want to wait a few months after their +release and see what problems others have if you are risk averse. + +=item + +The immediate, previous releases (i.e. perl5.6.x ) are usually +maintained for a while, although not at the same level as the +current releases. + +=item + +No one is actively supporting perl4.x. Five years ago it was +a dead camel carcass (according to this document). Now it's barely +a skeleton as its whitewashed bones have fractured or eroded. + +=item + +There is no perl6.x for the next couple of years. 
Stay tuned, +but don't worry that you'll have to change major versions of Perl +soon (i.e. before 2006). + +=item + +There are really two tracks of perl development: a +maintenance version and an experimental version. The +maintenance versions are stable, and have an even number +as the minor release (i.e. perl5.8.x, where 8 is the minor +release). The experimental versions may include features that +don't make it into the stable versions, and have an odd number +as the minor release (i.e. perl5.9.x, where 9 is the minor release). + +=back + + +=head2 What are perl4, perl5, or perl6? + +(contributed by brian d foy) + +In short, perl4 is the past, perl5 is the present, and perl6 is the +future. + +The number after perl (i.e. the 5 after perl5) is the major release +of the perl interpreter as well as the version of the language. Each +major version has significant differences that earlier versions cannot +support. + +The current major release of Perl is perl5, and was released in 1994. +It can run scripts from the previous major release, perl4 (March 1991), +but has significant differences. It introduced the concept of references, +complex data structures, and modules. The perl5 interpreter was a +complete re-write of the previous perl sources. + +Perl6 is the next major version of Perl, but it's still in development +in both its syntax and design. The work started in 2002 and is still +ongoing. Many of the most interesting features have shown up in the +latest versions of perl5, and some perl5 modules allow you to use some +perl6 syntax in your programs. You can learn more about perl6 at +http://dev.perl.org/perl6/ . See L for a history of Perl revisions. @@ -334,8 +395,8 @@ but the most recommendable way is to upgrade to at least Perl 5.6.1. =head1 AUTHOR AND COPYRIGHT -Copyright (c) 1997, 1998, 1999, 2000, 2001 Tom Christiansen and Nathan -Torkington. All rights reserved. +Copyright (c) 1997-2005 Tom Christiansen, Nathan Torkington, and +other authors as noted. 
All rights reserved. This documentation is free; you can redistribute it and/or modify it under the same terms as Perl itself. diff --git a/pod/perlfaq2.pod b/pod/perlfaq2.pod index 4d41954..a0fdf30 100644 --- a/pod/perlfaq2.pod +++ b/pod/perlfaq2.pod @@ -1,6 +1,6 @@ =head1 NAME -perlfaq2 - Obtaining and Learning about Perl ($Revision: 1.29 $, $Date: 2004/10/25 18:37:23 $) +perlfaq2 - Obtaining and Learning about Perl ($Revision: 1.31 $, $Date: 2005/01/31 15:54:44 $) =head1 DESCRIPTION @@ -313,6 +313,11 @@ Recommended books on (or mostly on) Perl follow. =item Tutorials + Beginning Perl + by James Lee + ISBN 1-59059-391-X [2nd edition August 2004] + http://apress.com/book/bookDisplay.html?bID=344 + Elements of Programming with Perl by Andrew L. Johnson ISBN 1-884777-80-5 [1st edition October 1999] @@ -353,6 +358,11 @@ Recommended books on (or mostly on) Perl follow. =item Task-Oriented + Writing Perl Modules for CPAN + by Sam Tregar + ISBN 1-59059-018-X [1st edition Aug 2002] + http://apress.com/book/bookDisplay.html?bID=14 + The Perl Cookbook by Tom Christiansen and Nathan Torkington with foreword by Larry Wall @@ -364,30 +374,40 @@ Recommended books on (or mostly on) Perl follow. ISBN 0-201-41975-0 [1st edition 1998] http://www.awl.com/ + Real World SQL Server Administration with Perl + by Linchi Shea + ISBN 1-59059-097-X [1st edition July 2003] + http://apress.com/book/bookDisplay.html?bID=171 + =item Special Topics + Perl 6 Now: The Core Ideas Illustrated with Perl 5 + by Scott Walters + ISBN 1-59059-395-2 [1st edition December 2004] + http://apress.com/book/bookDisplay.html?bID=355 + Mastering Regular Expressions by Jeffrey E. F. Friedl ISBN 0-596-00289-0 [2nd edition July 2002] http://www.oreilly.com/catalog/regex2/ - Network Programming with Perl + Network Programming with Perl by Lincoln Stein ISBN 0-201-61571-1 [1st edition 2001] http://www.awlonline.com/ - Object Oriented Perl + Object Oriented Perl Damian Conway with foreword by Randal L.
Schwartz ISBN 1-884777-79-1 [1st edition August 1999] http://www.manning.com/Conway/ - Data Munging with Perl + Data Munging with Perl Dave Cross ISBN 1-930110-00-6 [1st edition 2001] http://www.manning.com/cross - Mastering Perl/Tk + Mastering Perl/Tk by Steve Lidie and Nancy Walsh ISBN 1-56592-716-8 [1st edition January 2002] http://www.oreilly.com/catalog/mastperltk/ @@ -523,8 +543,8 @@ the I question earlier in this document. =head1 AUTHOR AND COPYRIGHT -Copyright (c) 1997-2001 Tom Christiansen and Nathan Torkington. -All rights reserved. +Copyright (c) 1997-2005 Tom Christiansen, Nathan Torkington, and +other authors as noted. All rights reserved. This documentation is free; you can redistribute it and/or modify it under the same terms as Perl itself. diff --git a/pod/perlfaq3.pod b/pod/perlfaq3.pod index 33675bf..7dede4c 100644 --- a/pod/perlfaq3.pod +++ b/pod/perlfaq3.pod @@ -1,6 +1,6 @@ =head1 NAME -perlfaq3 - Programming Tools ($Revision: 1.41 $, $Date: 2004/11/03 22:45:32 $) +perlfaq3 - Programming Tools ($Revision: 1.46 $, $Date: 2005/02/13 02:36:09 $) =head1 DESCRIPTION @@ -407,8 +407,8 @@ no 32k limit). =item Affrus -is a full Perl development enivornment with full debugger support ( -http://www.latenightsw.com ). +is a full Perl development enivornment with full debugger support +( http://www.latenightsw.com ). =item Alpha @@ -649,23 +649,25 @@ everything works out right. =head2 How can I free an array or hash so my program shrinks? -You usually can't. On most operating systems, memory -allocated to a program can never be returned to the system. -That's why long-running programs sometimes re-exec -themselves. Some operating systems (notably, systems that -use mmap(2) for allocating large chunks of memory) can -reclaim memory that is no longer used, but on such systems, -perl must be configured and compiled to use the OS's malloc, -not perl's. 
- -However, judicious use of my() on your variables will help make sure -that they go out of scope so that Perl can free up that space for -use in other parts of your program. A global variable, of course, never -goes out of scope, so you can't get its space automatically reclaimed, -although undef()ing and/or delete()ing it will achieve the same effect. +(contributed by Michael Carman) + +You usually can't. Memory allocated to lexicals (i.e. my() variables) +cannot be reclaimed or reused even if they go out of scope. It is +reserved in case the variables come back into scope. Memory allocated +to global variables can be reused (within your program) by using +undef()ing and/or delete(). + +On most operating systems, memory allocated to a program can never be +returned to the system. That's why long-running programs sometimes re- +exec themselves. Some operating systems (notably, systems that use +mmap(2) for allocating large chunks of memory) can reclaim memory that +is no longer used, but on such systems, perl must be configured and +compiled to use the OS's malloc, not perl's. + In general, memory allocation and de-allocation isn't something you can -or should be worrying about much in Perl, but even this capability -(preallocation of data types) is in the works. +or should be worrying about much in Perl. + +See also "How can I make my Perl program take less memory?" =head2 How can I make my CGI script more efficient? @@ -753,41 +755,40 @@ you want to be sure your license's wording will stand up in court. =head2 How can I compile my Perl program into byte code or C? -Malcolm Beattie has written a multifunction backend compiler, -available from CPAN, that can do both these things. It is included -in the perl5.005 release, but is still considered experimental. -This means it's fun to play with if you're a programmer but not -really for people looking for turn-key solutions. 
- -Merely compiling into C does not in and of itself guarantee that your -code will run very much faster. That's because except for lucky cases -where a lot of native type inferencing is possible, the normal Perl -run-time system is still present and so your program will take just as -long to run and be just as big. Most programs save little more than -compilation time, leaving execution no more than 10-30% faster. A few -rare programs actually benefit significantly (even running several times -faster), but this takes some tweaking of your code. - -You'll probably be astonished to learn that the current version of the -compiler generates a compiled form of your script whose executable is -just as big as the original perl executable, and then some. That's -because as currently written, all programs are prepared for a full -eval() statement. You can tremendously reduce this cost by building a -shared I library and linking against that. See the -F podfile in the Perl source distribution for details. If -you link your main perl binary with this, it will make it minuscule. -For example, on one author's system, F is only 11k in -size! - -In general, the compiler will do nothing to make a Perl program smaller, -faster, more portable, or more secure. In fact, it can make your -situation worse. The executable will be bigger, your VM system may take -longer to load the whole thing, the binary is fragile and hard to fix, -and compilation never stopped software piracy in the form of crackers, -viruses, or bootleggers. The real advantage of the compiler is merely -packaging, and once you see the size of what it makes (well, unless -you use a shared I), you'll probably want a complete -Perl install anyway. +(contributed by brian d foy) + +In general, you can't do this. There are some things that may work +for your situation though. 
People usually ask this question +because they want to distribute their works without giving away +the source code, and most solutions trade disk space for convenience. +You probably won't see much of a speed increase either, since most +solutions simply bundle a Perl interpreter in the final product +(but see L). + +The Perl Archive Toolkit (http://par.perl.org/index.cgi) is +Perl's analog to Java's JAR. It's freely available and on +CPAN (http://search.cpan.org/dist/PAR/). + +The B::* namespace is often called "the Perl compiler", but is really a +way for Perl programs to peek at its innards rather than create +pre-compiled versions of your program. However, the B::Bytecode +module can turn your script into a bytecode format that could be +loaded later by the ByteLoader module and executed as a regular Perl +script. + +There are also some commercial products that may work for +you, although you have to buy a license for them. + +The Perl Dev Kit +(http://www.activestate.com/Products/Perl_Dev_Kit/) from +ActiveState can "Turn your Perl programs into ready-to-run +executables for HP-UX, Linux, Solaris and Windows." + +Perl2Exe (http://www.indigostar.com/perl2exe.htm) is a +command line program for converting perl scripts to +executable files. It targets both Windows and unix +platforms. + =head2 How can I compile Perl into Java? @@ -931,8 +932,7 @@ L. Don't forget that you can learn a lot from looking at how the authors of existing extension modules wrote their code and solved their problems. -=head2 I've read perlembed, perlguts, etc., but I can't embed perl in -my C program; what am I doing wrong? +=head2 I've read perlembed, perlguts, etc., but I can't embed perl in my C program; what am I doing wrong? Download the ExtUtils::Embed kit from CPAN and run `make test'. If the tests pass, read the pods again and again and again. If they @@ -964,8 +964,8 @@ information, see L. =head1 AUTHOR AND COPYRIGHT -Copyright (c) 1997-2002 Tom Christiansen and Nathan Torkington.
-All rights reserved. +Copyright (c) 1997-2005 Tom Christiansen, Nathan Torkington, and +other authors as noted. All rights reserved. This documentation is free; you can redistribute it and/or modify it under the same terms as Perl itself. diff --git a/pod/perlfaq4.pod b/pod/perlfaq4.pod index 815a9ea..05005cb 100644 --- a/pod/perlfaq4.pod +++ b/pod/perlfaq4.pod @@ -1,6 +1,6 @@ =head1 NAME -perlfaq4 - Data Manipulation ($Revision: 1.56 $, $Date: 2004/11/03 22:47:56 $) +perlfaq4 - Data Manipulation ($Revision: 1.60 $, $Date: 2005/02/14 18:24:01 $) =head1 DESCRIPTION @@ -454,26 +454,29 @@ and Date::Manip modules from CPAN. =head2 How can I find the Julian Day? -Use the Time::JulianDay module (part of the Time-modules bundle -available from CPAN.) - -Before you immerse yourself too deeply in this, be sure to verify that -it is the I Day you really want. Are you interested in a way -of getting serial days so that you just can tell how many days they -are apart or so that you can do also other date arithmetic? If you -are interested in performing date arithmetic, this can be done using -modules Date::Manip or Date::Calc. - -There is too many details and much confusion on this issue to cover in -this FAQ, but the term is applied (correctly) to a calendar now -supplanted by the Gregorian Calendar, with the Julian Calendar failing -to adjust properly for leap years on centennial years (among other -annoyances). The term is also used (incorrectly) to mean: [1] days in -the Gregorian Calendar; and [2] days since a particular starting time -or `epoch', usually 1970 in the Unix world and 1980 in the -MS-DOS/Windows world. If you find that it is not the first meaning -that you really want, then check out the Date::Manip and Date::Calc -modules. (Thanks to David Cassell for most of this text.) +(contributed by brian d foy and Dave Cross) + +You can use the Time::JulianDay module available on CPAN. 
Ensure that +you really want to find a Julian day, though, as many people have +different ideas about Julian days. See +http://www.hermetic.ch/cal_stud/jdn.htm for instance. + +You can also try the DateTime module, which can convert a date/time +to a Julian Day. + + $ perl -MDateTime -le'print DateTime->today->jd' + 2453401.5 + +Or the modified Julian Day + + $ perl -MDateTime -le'print DateTime->today->mjd' + 53401 + +Or even the day of the year (which is what some people think of as a +Julian day) + + $ perl -MDateTime -le'print DateTime->today->doy' + 31 =head2 How do I find yesterday's date? @@ -598,9 +601,6 @@ a subroutine call (in list context) into a string: print "My sub returned @{[mysub(1,2,3)]} that time.\n"; -See also ``How can I expand variables in text strings?'' in this -section of the FAQ. - =head2 How do I find matching/nesting anything? This isn't something that can be done in one regular expression, no @@ -804,7 +804,7 @@ case transformations: =head2 How can I split a [character] delimited string except when inside [character]? Several modules can handle this sort of pasing---Text::Balanced, -Text::CVS, Text::CVS_XS, and Text::ParseWords, among others. +Text::CSV, Text::CSV_XS, and Text::ParseWords, among others. Take the example case of trying to split a string that is comma-separated into its different fields. You can't use C @@ -938,31 +938,27 @@ you can use this kind of thing: =head2 How do I find the soundex value of a string? -Use the standard Text::Soundex module distributed with Perl. -Before you do so, you may want to determine whether `soundex' is in -fact what you think it is. Knuth's soundex algorithm compresses words -into a small space, and so it does not necessarily distinguish between -two words which you might want to appear separately. For example, the -last names `Knuth' and `Kant' are both mapped to the soundex code K530. 
-If Text::Soundex does not do what you are looking for, you might want -to consider the String::Approx module available at CPAN. +(contributed by brian d foy) + +You can use the Text::Soundex module. If you want to do fuzzy or close +matching, you might also try the String::Approx, and Text::Metaphone, +and Text::DoubleMetaphone modules. =head2 How can I expand variables in text strings? -Let's assume that you have a string like: +Let's assume that you have a string that contains placeholder +variables. $text = 'this has a $foo in it and a $bar'; -If those were both global variables, then this would -suffice: +You can use a substitution with a double evaluation. The +first /e turns C<$1> into C<$foo>, and the second /e turns +C<$foo> into its value. You may want to wrap this in an +C: if you try to get the value of an undeclared variable +while running under C, you get a fatal error. - $text =~ s/\$(\w+)/${$1}/g; # no /e needed - -But since they are probably lexicals, or at least, they could -be, you'd have to do this: - - $text =~ s/(\$\w+)/$1/eeg; - die if $@; # needed /ee, not /e + eval { $text =~ s/(\$\w+)/$1/eeg }; + die if $@; It's probably better in the general case to treat those variables as entries in some special hash. For example: @@ -973,9 +969,6 @@ variables as entries in some special hash. For example: ); $text =~ s/\$(\w+)/$user_defs{$1}/g; -See also ``How do I expand function calls in a string?'' in this section -of the FAQ. - =head2 What's wrong with always quoting "$vars"? The problem is that those double-quotes force stringification-- @@ -2088,8 +2081,8 @@ the PDL module from CPAN instead--it makes number-crunching easy. =head1 AUTHOR AND COPYRIGHT -Copyright (c) 1997-2002 Tom Christiansen and Nathan Torkington. -All rights reserved. +Copyright (c) 1997-2005 Tom Christiansen, Nathan Torkington, and +other authors as noted. All rights reserved. 
This documentation is free; you can redistribute it and/or modify it under the same terms as Perl itself. diff --git a/pod/perlfaq5.pod b/pod/perlfaq5.pod index ae71cd9..ab0ba26 100644 --- a/pod/perlfaq5.pod +++ b/pod/perlfaq5.pod @@ -1,6 +1,6 @@ =head1 NAME -perlfaq5 - Files and Formats ($Revision: 1.31 $, $Date: 2004/02/07 04:29:50 $) +perlfaq5 - Files and Formats ($Revision: 1.35 $, $Date: 2005/01/21 12:26:11 $) =head1 DESCRIPTION @@ -106,9 +106,31 @@ This block modifies all the C<.c> files in the current directory, leaving a backup of the original data from each file in a new C<.c.orig> file. +=head2 How can I copy a file? + +(contributed by brian d foy) + +Use the File::Copy module. It comes with Perl and can do a +true copy across file systems, and it does its magic in +a portable fashion. + + use File::Copy; + + copy( $original, $new_copy ) or die "Copy failed: $!"; + +If you can't use File::Copy, you'll have to do the work yourself: +open the original file, open the destination file, then print +to the destination file as you read the original. + =head2 How do I make a temporary file name? -Use the File::Temp module, see L for more information. +If you don't need to know the name of the file, you can use C +with C in place of the file name. The C function +creates an anonymous temporary file. + + open my $tmp, '+>', undef or die $!; + +Otherwise, you can use the File::Temp module. use File::Temp qw/ tempfile tempdir /; @@ -333,7 +355,7 @@ It is easier to see with comments: s/( ^[-+]? # beginning of number. - \d{1,3}? # first digits before first comma + \d+? # first digits before first comma (?= # followed by, (but not included in the match) : (?>(?:\d{3})+) # some positive multiple of three digits. (?!\d) # an *exact* multiple, not x * 3 + 1 or whatever. @@ -1047,8 +1069,8 @@ If your array contains lines, just print them: =head1 AUTHOR AND COPYRIGHT -Copyright (c) 1997-2002 Tom Christiansen and Nathan Torkington. -All rights reserved. 
+Copyright (c) 1997-2005 Tom Christiansen, Nathan Torkington, and +other authors as noted. All rights reserved. This documentation is free; you can redistribute it and/or modify it under the same terms as Perl itself. diff --git a/pod/perlfaq6.pod b/pod/perlfaq6.pod index 6b0f3bb..29e6903 100644 --- a/pod/perlfaq6.pod +++ b/pod/perlfaq6.pod @@ -1,6 +1,6 @@ =head1 NAME -perlfaq6 - Regular Expressions ($Revision: 1.27 $, $Date: 2004/11/03 22:52:16 $) +perlfaq6 - Regular Expressions ($Revision: 1.30 $, $Date: 2005/02/14 18:25:48 $) =head1 DESCRIPTION @@ -518,59 +518,115 @@ See the module String::Approx available from CPAN. =head2 How do I efficiently match many regular expressions at once? -The following is extremely inefficient: +( contributed by brian d foy ) + +Avoid asking Perl to compile a regular expression every time +you want to match it. In this example, perl must recompile +the regular expression for every iteration of the foreach() +loop since it has no way to know what $pattern will be. + + @patterns = qw( foo bar baz ); + + LINE: while( <> ) + { + foreach $pattern ( @patterns ) + { + print if /\b$pattern\b/i; + next LINE; + } + } - # slow but obvious way - @popstates = qw(CO ON MI WI MN); - while (defined($line = <>)) { - for $state (@popstates) { - if ($line =~ /\b$state\b/i) { - print $line; - last; - } - } - } +The qr// operator showed up in perl 5.005. It compiles a +regular expression, but doesn't apply it. When you use the +pre-compiled version of the regex, perl does less work. In +this example, I inserted a map() to turn each pattern into +its pre-compiled form. The rest of the script is the same, +but faster. + + @patterns = map { qr/\b$_\b/i } qw( foo bar baz ); + + LINE: while( <> ) + { + foreach $pattern ( @patterns ) + { + print if /\b$pattern\b/i; + next LINE; + } + } + +In some cases, you may be able to make several patterns into +a single regular expression. Beware of situations that require +backtracking though. 
-That's because Perl has to recompile all those patterns for each of -the lines of the file. As of the 5.005 release, there's a much better -approach, one which makes use of the new C operator: - - # use spiffy new qr// operator, with /i flag even - use 5.005; - @popstates = qw(CO ON MI WI MN); - @poppats = map { qr/\b$_\b/i } @popstates; - while (defined($line = <>)) { - for $patobj (@poppats) { - print $line if $line =~ /$patobj/; - } - } + $regex = join '|', qw( foo bar baz ); + + LINE: while( <> ) + { + print if /\b(?:$regex)\b/i; + } + +For more details on regular expression efficiency, see Mastering +Regular Expressions by Jeffrey Friedl. He explains how regular +expression engines work and why some patterns are surprisingly +inefficient. Once you understand how perl applies regular +expressions, you can tune them for individual situations. =head2 Why don't word-boundary searches with C<\b> work for me? -Two common misconceptions are that C<\b> is a synonym for C<\s+> and -that it's the edge between whitespace characters and non-whitespace -characters. Neither is correct. C<\b> is the place between a C<\w> -character and a C<\W> character (that is, C<\b> is the edge of a -"word"). It's a zero-width assertion, just like C<^>, C<$>, and all -the other anchors, so it doesn't consume any characters. L -describes the behavior of all the regex metacharacters. +(contributed by brian d foy) + +Ensure that you know what \b really does: it's the boundary between a +word character, \w, and something that isn't a word character. That +thing that isn't a word character might be \W, but it can also be the +start or end of the string. + +It's not (not!) the boundary between whitespace and non-whitespace, +and it's not the stuff between words we use to create sentences. + +In regex speak, a word boundary (\b) is a "zero width assertion", +meaning that it doesn't represent a character in the string, but a +condition at a certain position.
+ +For the regular expression, /\bPerl\b/, there has to be a word +boundary before the "P" and after the "l". As long as something other +than a word character precedes the "P" and follows the "l", the +pattern will match. These strings match /\bPerl\b/. + + "Perl" # no word char before P or after l + "Perl " # same as previous (space is not a word char) + "'Perl'" # the ' char is not a word char + "Perl's" # no word char before P, non-word char after "l" + +These strings do not match /\bPerl\b/. + + "Perl_" # _ is a word char! + "Perler" # no word char before P, but one after l + +You don't have to use \b to match words though. You can look for +non-word characters surrounded by word characters. These strings +match the pattern /\b'\b/. + + "don't" # the ' char is surrounded by "n" and "t" + "qep'a'" # the ' char is surrounded by "p" and "a" + +These strings do not match /\b'\b/. -Here are examples of the incorrect application of C<\b>, with fixes: + "foo'" # there is no word char after non-word ' + +You can also use the complement of \b, \B, to specify that there +should not be a word boundary. - "two words" =~ /(\w+)\b(\w+)/; # WRONG - "two words" =~ /(\w+)\s+(\w+)/; # right +In the pattern /\Bam\B/, there must be a word character before the "a" +and after the "m". These strings match /\Bam\B/: - " =matchless= text" =~ /\b=(\w+)=\b/; # WRONG - " =matchless= text" =~ /=(\w+)=/; # right + "llama" # "am" surrounded by word chars + "Samuel" # same + +These strings do not match /\Bam\B/: -Although they may not do what you thought they did, C<\b> and C<\B> -can still be quite useful. For an example of the correct use of -C<\b>, see the example of matching duplicate words over multiple -lines. + "Sam" # no word boundary before "a", but one after "m" + "I am Sam" # "am" surrounded by non-word chars -An example of using C<\B> is the pattern C<\Bis\B>. This will find -occurrences of "is" on the insides of words only, as in "thistle", but -not "this" or "island".
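The match and non-match lists in the new \b answer can be checked mechanically. The following is a minimal sketch and not part of the patch: the helper name `matches` and the table layout are assumptions made for illustration, while the strings and expectations are copied from the answer's own examples.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return 1 if $string matches $pattern, 0 otherwise.
# (Hypothetical helper, named here only for illustration.)
sub matches {
    my ( $pattern, $string ) = @_;
    return $string =~ $pattern ? 1 : 0;
}

# Each row: pattern, test string, expectation stated in the FAQ answer.
my @cases = (
    [ qr/\bPerl\b/, "Perl",     1 ],
    [ qr/\bPerl\b/, "Perl ",    1 ],
    [ qr/\bPerl\b/, "'Perl'",   1 ],
    [ qr/\bPerl\b/, "Perl's",   1 ],
    [ qr/\bPerl\b/, "Perl_",    0 ],
    [ qr/\bPerl\b/, "Perler",   0 ],
    [ qr/\b'\b/,    "don't",    1 ],
    [ qr/\b'\b/,    "qep'a'",   1 ],
    [ qr/\b'\b/,    "foo'",     0 ],
    [ qr/\Bam\B/,   "llama",    1 ],
    [ qr/\Bam\B/,   "Samuel",   1 ],
    [ qr/\Bam\B/,   "Sam",      0 ],
    [ qr/\Bam\B/,   "I am Sam", 0 ],
);

# Report whether each string behaves the way the answer says it does.
for my $case (@cases) {
    my ( $pattern, $string, $expected ) = @$case;
    my $got = matches( $pattern, $string );
    printf "%-20s %-12s %s\n", $pattern, qq("$string"),
        $got == $expected ? "as described" : "UNEXPECTED";
}
```

Keeping the expectations in a table makes it easy to add a string of your own and see on which side of the boundary rule it falls.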
=head2 Why does using $&, $`, or $' slow my program down? @@ -800,8 +856,8 @@ in L. =head1 AUTHOR AND COPYRIGHT -Copyright (c) 1997-2002 Tom Christiansen and Nathan Torkington. -All rights reserved. +Copyright (c) 1997-2005 Tom Christiansen, Nathan Torkington, and +other authors as noted. All rights reserved. This documentation is free; you can redistribute it and/or modify it under the same terms as Perl itself. diff --git a/pod/perlfaq7.pod b/pod/perlfaq7.pod index 54e91bd..19fa780 100644 --- a/pod/perlfaq7.pod +++ b/pod/perlfaq7.pod @@ -1,6 +1,6 @@ =head1 NAME -perlfaq7 - General Perl Language Issues ($Revision: 1.18 $, $Date: 2004/11/03 22:54:08 $) +perlfaq7 - General Perl Language Issues ($Revision: 1.21 $, $Date: 2005/01/21 12:10:22 $) =head1 DESCRIPTION @@ -176,20 +176,26 @@ If you're looking for something a bit more rigorous, try L. =head2 How do I create a module? -A module is a package that lives in a file of the same name. For -example, the Hello::There module would live in Hello/There.pm. For -details, read L. You'll also find L helpful. If -you're writing a C or mixed-language module with both C and Perl, then -you should study L. +(contributed by brian d foy) -The C program will create stubs for all the important stuff for you: +L, L, L explain modules +in all the gory details. L gives a brief +overview of the process along with a couple of suggestions +about style. - % h2xs -XA -n My::Module +If you need to include C code or C library interfaces in +your module, you'll need h2xs. h2xs will create the module +distribution structure and the initial interface files +you'll need. L and L explain the details. -The C<-X> switch tells C that you are not using C extension -code. The C<-A> switch tells C that you are not using the -AutoLoader, and the C<-n> switch specifies the name of the module. -See L for more details.
+If you don't need to use C code, other tools such as +ExtUtils::ModuleMaker and Module::Starter can help you +create a skeleton module distribution. + +You may also want to see Sam Tregar's "Writing Perl Modules +for CPAN" ( http://apress.com/book/bookDisplay.html?bID=14 ) +which is the best hands-on guide to creating module +distributions. =head2 How do I create a class? @@ -736,18 +742,22 @@ not necessarily the same as the one in which you were compiled): =head2 How can I comment out a large block of perl code? You can use embedded POD to discard it. Enclose the blocks you want -to comment out in POD markers, for example C<=for nobody> and C<=cut> -(which marks ends of POD blocks). +to comment out in POD markers. The C<=begin> directive marks a section +for a specific formatter. Use the C<comment> format, which no formatter +should claim to understand (by policy). Mark the end of the block +with C<=end>. # program is here - =for nobody + =begin comment all of this stuff here will be ignored by everyone + =end comment + =cut # program continues @@ -911,8 +921,8 @@ where you expect it so you need to adjust your shebang line. =head1 AUTHOR AND COPYRIGHT -Copyright (c) 1997-2002 Tom Christiansen and Nathan Torkington. -All rights reserved. +Copyright (c) 1997-2005 Tom Christiansen, Nathan Torkington, and +other authors as noted. All rights reserved. This documentation is free; you can redistribute it and/or modify it under the same terms as Perl itself. diff --git a/pod/perlfaq8.pod b/pod/perlfaq8.pod index ad07fa3..8152d49 100644 --- a/pod/perlfaq8.pod +++ b/pod/perlfaq8.pod @@ -1,6 +1,6 @@ =head1 NAME -perlfaq8 - System Interaction ($Revision: 1.22 $, $Date: 2004/10/05 22:13:49 $) +perlfaq8 - System Interaction ($Revision: 1.23 $, $Date: 2005/01/03 18:43:37 $) =head1 DESCRIPTION @@ -1252,8 +1252,8 @@ but other times it is not. Modern programs C instead. =head1 AUTHOR AND COPYRIGHT -Copyright (c) 1997-2003 Tom Christiansen and Nathan Torkington. -All rights reserved. 
+Copyright (c) 1997-2005 Tom Christiansen, Nathan Torkington, and +other authors as noted. All rights reserved. This documentation is free; you can redistribute it and/or modify it under the same terms as Perl itself. diff --git a/pod/perlfaq9.pod b/pod/perlfaq9.pod index fa59003..0dd6f1e 100644 --- a/pod/perlfaq9.pod +++ b/pod/perlfaq9.pod @@ -1,6 +1,6 @@ =head1 NAME -perlfaq9 - Networking ($Revision: 1.16 $, $Date: 2004/10/30 12:20:59 $) +perlfaq9 - Networking ($Revision: 1.19 $, $Date: 2005/01/21 12:14:12 $) =head1 DESCRIPTION @@ -227,6 +227,10 @@ through proxies: =head2 How do I automate an HTML form submission? +If you are doing something complex, such as moving through many pages +and forms on a web site, you can use C. See its +documentation for all the details. + If you're submitting values using the GET method, create a URL and encode the form using the C method: @@ -348,35 +352,42 @@ the Mail::Header module from CPAN (part of the MailTools package). =head2 How do I decode a CGI form? -You use a standard module, probably CGI.pm. Under no circumstances -should you attempt to do so by hand! - -You'll see a lot of CGI programs that blindly read from STDIN the number -of bytes equal to CONTENT_LENGTH for POSTs, or grab QUERY_STRING for -decoding GETs. These programs are very poorly written. They only work -sometimes. They typically forget to check the return value of the read() -system call, which is a cardinal sin. They don't handle HEAD requests. -They don't handle multipart forms used for file uploads. They don't deal -with GET/POST combinations where query fields are in more than one place. -They don't deal with keywords in the query string. - -In short, they're bad hacks. Resist them at all costs. Please do not be -tempted to reinvent the wheel. Instead, use the CGI.pm or CGI_Lite.pm -(available from CPAN), or if you're trapped in the module-free land -of perl1 .. 
perl4, you might look into cgi-lib.pl (available from -http://cgi-lib.stanford.edu/cgi-lib/ ). - -Make sure you know whether to use a GET or a POST in your form. -GETs should only be used for something that doesn't update the server. -Otherwise you can get mangled databases and repeated feedback mail -messages. The fancy word for this is ``idempotency''. This simply -means that there should be no difference between making a GET request -for a particular URL once or multiple times. This is because the -HTTP protocol definition says that a GET request may be cached by the -browser, or server, or an intervening proxy. POST requests cannot be -cached, because each request is independent and matters. Typically, -POST requests change or depend on state on the server (query or update -a database, send mail, or purchase a computer). +(contributed by brian d foy) + +Use the CGI.pm module that comes with Perl. It's quick, +it's easy, and it actually does quite a bit of work to +ensure things happen correctly. It handles GET, POST, and +HEAD requests, multipart forms, multivalued fields, query +string and message body combinations, and many other things +you probably don't want to think about. + +It doesn't get much easier: the CGI module automatically +parses the input and makes each value available through the +C<param> function. + + use CGI qw(:standard); + + my $total = param( "price" ) + param( "shipping" ); + + my @items = param( "item" ); # multiple values, same field name + +If you want an object-oriented approach, CGI.pm can do that too. + + use CGI; + + my $cgi = CGI->new(); + + my $total = $cgi->param( "price" ) + $cgi->param( "shipping" ); + + my @items = $cgi->param( "item" ); + +You might also try CGI::Minimal, which is a lightweight version +of the same thing. Other CGI::* modules on CPAN might work better +for you, too. + +Many people try to write their own decoder (or copy one from +another program) and then run into one of the many "gotchas" +of the task. 
It's much easier and less hassle to use CGI.pm. =head2 How do I check a valid mail address? @@ -632,8 +643,8 @@ an RPC stub generator and includes an RPC::ONC module. =head1 AUTHOR AND COPYRIGHT -Copyright (c) 1997-2002 Tom Christiansen and Nathan Torkington. -All rights reserved. +Copyright (c) 1997-2005 Tom Christiansen, Nathan Torkington, and +other authors as noted. All rights reserved. This documentation is free; you can redistribute it and/or modify it under the same terms as Perl itself. -- 1.8.3.1