| 1 | =head1 NAME |
| 2 | |
| 3 | perlmod - Perl modules (packages and symbol tables) |
| 4 | |
| 5 | =head1 DESCRIPTION |
| 6 | |
| 7 | =head2 Packages |
| 8 | X<package> X<namespace> X<variable, global> X<global variable> X<global> |
| 9 | |
| 10 | Perl provides a mechanism for alternative namespaces to protect |
| 11 | packages from stomping on each other's variables. In fact, there's |
| 12 | really no such thing as a global variable in Perl. The package |
| 13 | statement declares the compilation unit as being in the given |
| 14 | namespace. The scope of the package declaration is from the |
| 15 | declaration itself through the end of the enclosing block, C<eval>, |
| 16 | or file, whichever comes first (the same scope as the my() and |
| 17 | local() operators). Unqualified dynamic identifiers will be in |
| 18 | this namespace, except for those few identifiers that if unqualified, |
| 19 | default to the main package instead of the current one as described |
| 20 | below. A package statement affects only dynamic variables--including |
| 21 | those you've used local() on--but I<not> lexical variables created |
| 22 | with my(). Typically it would be the first declaration in a file |
| 23 | included by the C<do>, C<require>, or C<use> operators. You can |
| 24 | switch into a package in more than one place; it merely influences |
| 25 | which symbol table is used by the compiler for the rest of that |
| 26 | block. You can refer to variables and filehandles in other packages |
| 27 | by prefixing the identifier with the package name and a double |
| 28 | colon: C<$Package::Variable>. If the package name is null, the |
| 29 | C<main> package is assumed. That is, C<$::sail> is equivalent to |
| 30 | C<$main::sail>. |
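
As a minimal sketch of these rules (the package C<Boat> and every variable and subroutine name below are made up for illustration), the following shows a block-scoped package statement, qualified access from outside, the C<$::name> shorthand, and a lexical that ignores package boundaries:

```perl
use strict;
use warnings;

$main::sail = 'main sail';
my $crew = 3;                      # file-scoped lexical, not in any symbol table

{
    package Boat;                  # hypothetical package, in effect to the closing brace
    our $sail = 'boat sail';       # unqualified, so this is $Boat::sail
    sub where { return __PACKAGE__ }
    sub crew  { return $crew }     # lexicals ignore package boundaries
}

# The block has ended, so the current package is main again.
print __PACKAGE__, "\n";
print "$::sail\n";                 # $::sail is another spelling of $main::sail
print "$Boat::sail\n";             # qualified access from outside the package
print Boat::where(), " has a crew of ", Boat::crew(), "\n";
```

Moving the C<print> statements inside the braces would change what C<__PACKAGE__> reports, but not what C<$::sail> refers to.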
| 31 | |
| 32 | The old package delimiter was a single quote, but double colon is now the |
| 33 | preferred delimiter, in part because it's more readable to humans, and |
| 34 | in part because it's more readable to B<emacs> macros. It also makes C++ |
| 35 | programmers feel like they know what's going on--as opposed to using the |
| 36 | single quote as separator, which was there to make Ada programmers feel |
| 37 | like they knew what was going on. Because the old-fashioned syntax is still |
| 38 | supported for backwards compatibility, if you try to use a string like |
| 39 | C<"This is $owner's house">, you'll be accessing C<$owner::s>; that is, |
| 40 | the $s variable in package C<owner>, which is probably not what you meant. |
| 41 | Use braces to disambiguate, as in C<"This is ${owner}'s house">. |
| 42 | X<::> X<'> |
| 43 | |
| 44 | Packages may themselves contain package separators, as in |
| 45 | C<$OUTER::INNER::var>. This implies nothing about the order of |
| 46 | name lookups, however. There are no relative packages: all symbols |
| 47 | are either local to the current package, or must be fully qualified |
| 48 | from the outer package name down. For instance, there is nowhere |
| 49 | within package C<OUTER> that C<$INNER::var> refers to |
| 50 | C<$OUTER::INNER::var>. C<INNER> refers to a totally |
| 51 | separate global package. |
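
To illustrate, here is a short sketch using the C<OUTER> and C<INNER> names from above; the subroutine names are hypothetical:

```perl
use strict;
use warnings;

$OUTER::INNER::var = 'nested';       # stored in %OUTER::INNER::
$INNER::var        = 'top level';    # stored in %INNER::, an unrelated package

package OUTER;

# Being inside OUTER does not make INNER:: relative to it.
sub short_name { return $INNER::var }
sub full_name  { return $OUTER::INNER::var }

package main;

print OUTER::short_name(), "\n";     # top level
print OUTER::full_name(), "\n";      # nested
```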
| 52 | |
| 53 | Only identifiers starting with letters (or underscore) are stored |
| 54 | in a package's symbol table. All other symbols are kept in package |
| 55 | C<main>, including all punctuation variables, like $_. In addition, |
| 56 | when unqualified, the identifiers STDIN, STDOUT, STDERR, ARGV, |
| 57 | ARGVOUT, ENV, INC, and SIG are forced to be in package C<main>, |
| 58 | even when used for other purposes than their built-in ones. If you |
| 59 | have a package called C<m>, C<s>, or C<y>, then you can't use the |
| 60 | qualified form of an identifier because it would be instead interpreted |
| 61 | as a pattern match, a substitution, or a transliteration. |
| 62 | X<variable, punctuation> |
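
One way to check this is to compare references. In the sketch below, C<Elsewhere> is a hypothetical package name; each subroutine tests whether the unqualified name and the C<main::> name are the same thing:

```perl
use strict;
use warnings;

package Elsewhere;    # hypothetical package name

# Each sub compares the unqualified name with the main:: one by reference.
sub same_env    { return \%ENV    == \%main::ENV }
sub same_stdout { return \*STDOUT == \*main::STDOUT }
sub same_topic  { return \$_      == \$main::_ }

package main;

print "ENV shared\n"    if Elsewhere::same_env();
print "STDOUT shared\n" if Elsewhere::same_stdout();
print "\$_ shared\n"    if Elsewhere::same_topic();
```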
| 63 | |
| 64 | Variables beginning with underscore used to be forced into package |
| 65 | main, but we decided it was more useful for package writers to be able |
| 66 | to use leading underscore to indicate private variables and method names. |
| 67 | However, variables and functions named with a single C<_>, such as |
| 68 | $_ and C<sub _>, are still forced into the package C<main>. See also |
| 69 | L<perlvar/"The Syntax of Variable Names">. |
| 70 | |
| 71 | C<eval>ed strings are compiled in the package in which the eval() was |
| 72 | compiled. (Assignments to C<$SIG{}>, however, assume the signal |
| 73 | handler specified is in the C<main> package. Qualify the signal handler |
| 74 | name if you wish to have a signal handler in a package.) For an |
| 75 | example, examine F<perldb.pl> in the Perl library. It initially switches |
| 76 | to the C<DB> package so that the debugger doesn't interfere with variables |
| 77 | in the program you are trying to debug. At various points, however, it |
| 78 | temporarily switches back to the C<main> package to evaluate various |
| 79 | expressions in the context of the C<main> package (or wherever you came |
| 80 | from). See L<perldebug>. |
| 81 | |
| 82 | The special symbol C<__PACKAGE__> contains the current package, but cannot |
| 83 | (easily) be used to construct variable names. |
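
The sketch below, which assumes a hypothetical C<Counter> package, shows C<__PACKAGE__> as a plain value and one way to build a variable name from it with a symbolic reference:

```perl
use strict;
use warnings;

package Counter;    # hypothetical package name

our $count = 42;

# __PACKAGE__ is replaced by the current package name at compile time.
sub name { return __PACKAGE__ }

# It does not interpolate into strings or variable names, so building a
# variable name from it takes a symbolic reference.
sub count_by_name {
    no strict 'refs';
    my $name = __PACKAGE__ . '::count';
    return ${$name};
}

package main;

print Counter::name(), "\n";            # Counter
print Counter::count_by_name(), "\n";   # 42
```

Symbolic references require C<no strict 'refs'>, which is one reason the text above says "easily".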
| 84 | |
| 85 | See L<perlsub> for other scoping issues related to my() and local(), |
| 86 | and L<perlref> regarding closures. |
| 87 | |
| 88 | =head2 Symbol Tables |
| 89 | X<symbol table> X<stash> X<%::> X<%main::> X<typeglob> X<glob> X<alias> |
| 90 | |
| 91 | The symbol table for a package happens to be stored in the hash of that |
| 92 | name with two colons appended. The main symbol table's name is thus |
| 93 | C<%main::>, or C<%::> for short. Likewise the symbol table for the nested |
| 94 | package mentioned earlier is named C<%OUTER::INNER::>. |
| 95 | |
| 96 | The value in each entry of the hash is what you are referring to when you |
| 97 | use the C<*name> typeglob notation. |
| 98 | |
| 99 | local *main::foo = *main::bar; |
| 100 | |
| 101 | You can use this to print out all the variables in a package, for |
| 102 | instance. The standard but antiquated F<dumpvar.pl> library and |
| 103 | the CPAN module Devel::Symdump make use of this. |
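
A rough sketch of that idea, assuming a hypothetical package C<Demo>. It lists only the names in the symbol table, not their values, and reads the table without modifying it:

```perl
use strict;
use warnings;

package Demo;    # hypothetical package with one scalar, one array, one sub

our $x = 1;
our @y = (1, 2);
sub z { return 'z' }

package main;

# The keys of %Demo:: are the names in the package's symbol table.
# Names ending in "::" are nested packages, so skip those.
my @names = sort grep { !/::\z/ } keys %Demo::;

print "Demo contains: @names\n";
```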
| 104 | |
| 105 | The results of creating new symbol table entries directly or modifying any |
| 106 | entries that are not already typeglobs are undefined and subject to change |
| 107 | between releases of perl. |
| 108 | |
| 109 | Assignment to a typeglob performs an aliasing operation, i.e., |
| 110 | |
| 111 | *dick = *richard; |
| 112 | |
| 113 | causes variables, subroutines, formats, and file and directory handles |
| 114 | accessible via the identifier C<richard> also to be accessible via the |
| 115 | identifier C<dick>. If you want to alias only a particular variable or |
| 116 | subroutine, assign a reference instead: |
| 117 | |
| 118 | *dick = \$richard; |
| 119 | |
| 120 | This makes $richard and $dick the same variable, but leaves |
| 121 | @richard and @dick as separate arrays. Tricky, eh? |
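
Here is a small sketch of the difference, reusing the names above. The C<our> declaration is there only to satisfy C<use strict>:

```perl
use strict;
use warnings;

our ($richard, @richard, $dick, @dick);

$richard = 'scalar';
@richard = (1, 2, 3);

# Alias only the scalar slot of *dick.
*dick = \$richard;
$dick = 'changed via $dick';

print "$richard\n";              # changed via $dick
print scalar(@dick), "\n";       # 0, because @dick is still a separate array

# Alias the whole typeglob: every slot is now shared.
*dick = *richard;
print scalar(@dick), "\n";       # 3
```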
| 122 | |
| 123 | There is one subtle difference between the following statements: |
| 124 | |
| 125 | *foo = *bar; |
| 126 | *foo = \$bar; |
| 127 | |
| 128 | C<*foo = *bar> makes the typeglobs themselves synonymous while |
| 129 | C<*foo = \$bar> makes the SCALAR portions of two distinct typeglobs |
| 130 | refer to the same scalar value. This means that the following code: |
| 131 | |
| 132 | $bar = 1; |
| 133 | *foo = \$bar; # Make $foo an alias for $bar |
| 134 | |
| 135 | { |
| 136 |     local $bar = 2; # Restrict changes to block |
| 137 |     print $foo; # Prints '1'! |
| 138 | } |
| 139 | |
| 140 | would print '1', because C<$foo> holds a reference to the I<original> |
| 141 | C<$bar>, the one that was stuffed away by C<local()> and which will be |
| 142 | restored when the block ends. Because variables are accessed through the |
| 143 | typeglob, you can use C<*foo = *bar> to create an alias which can be |
| 144 | localized. (But be aware that this means you can't have a separate |
| 145 | C<@foo> and C<@bar>, etc.) |
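
For comparison, a sketch of the localizable form, assuming a helper subroutine named C<read_foo> that exists only for this example:

```perl
use strict;
use warnings;

our ($foo, $bar);
$bar = 1;

*foo = *bar;                 # whole-glob alias, so $foo follows local($bar)

sub read_foo { return $foo }

{
    local $bar = 2;
    print read_foo(), "\n";  # 2
}
print read_foo(), "\n";      # 1
```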
| 146 | |
| 147 | What makes all of this important is that the Exporter module uses glob |
| 148 | aliasing as the import/export mechanism. Whether or not you can properly |
| 149 | localize a variable that has been exported from a module depends on how |
| 150 | it was exported: |
| 151 | |
| 152 | @EXPORT = qw($FOO); # Usual form, can't be localized |
| 153 | @EXPORT = qw(*FOO); # Can be localized |
| 154 | |
| 155 | You can work around the first case by using the fully qualified name |
| 156 | (C<$Package::FOO>) where you need a local value, or by overriding it |
| 157 | by saying C<*FOO = *Package::FOO> in your script. |
| 158 | |
| 159 | The C<*x = \$y> mechanism may be used to pass and return cheap references |
| 160 | into or from subroutines if you don't want to copy the whole |
| 161 | thing. It only works when assigning to dynamic variables, not |
| 162 | lexicals. |
| 163 | |
| 164 | %some_hash = (); # can't be my() |
| 165 | *some_hash = fn( \%another_hash ); |
| 166 | sub fn { |
| 167 |     local *hashsym = shift; |
| 168 |     # now use %hashsym normally, and you |
| 169 |     # will affect the caller's %another_hash |
| 170 |     my %nhash = (); # do what you want |
| 171 |     return \%nhash; |
| 172 | } |
| 173 | |
| 174 | On return, the reference will overwrite the hash slot in the |
| 175 | symbol table specified by the *some_hash typeglob. This |
| 176 | is a somewhat tricky way of passing around references cheaply |
| 177 | when you don't want to have to remember to dereference variables |
| 178 | explicitly. |
| 179 | |
| 180 | Another use of symbol tables is for making "constant" scalars. |
| 181 | X<constant> X<scalar, constant> |
| 182 | |
| 183 | *PI = \3.14159265358979; |
| 184 | |
| 185 | Now you cannot alter C<$PI>, which is probably a good thing all in all. |
| 186 | This isn't the same as a constant subroutine, which is subject to |
| 187 | optimization at compile-time. A constant subroutine is one prototyped |
| 188 | to take no arguments and to return a constant expression. See |
| 189 | L<perlsub> for details on these. The C<use constant> pragma is a |
| 190 | convenient shorthand for these. |
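
As a sketch, the following shows the read-only scalar rejecting an assignment, next to a constant subroutine created with C<use constant>. The name C<E> and its value are illustrative only:

```perl
use strict;
use warnings;

our $PI;
*PI = \3.14159265358979;            # $PI now aliases a read-only value

# The assignment dies, so the eval block never reaches its final 1.
my $changed = eval { $PI = 3; 1 };
print "still $PI\n" unless $changed;

# A constant subroutine, here created with the constant pragma,
# can be inlined by the compiler.
use constant E => 2.71828182845905;
print "e is about ", E, "\n";
```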
| 191 | |
| 192 | You can say C<*foo{PACKAGE}> and C<*foo{NAME}> to find out what package and |
| 193 | name the *foo symbol table entry comes from. This may be useful |
| 194 | in a subroutine that gets passed typeglobs as arguments: |
| 195 | |
| 196 | sub identify_typeglob { |
| 197 |     my $glob = shift; |
| 198 |     print 'You gave me ', *{$glob}{PACKAGE}, '::', *{$glob}{NAME}, "\n"; |
| 199 | } |
| 200 | identify_typeglob *foo; |
| 201 | identify_typeglob *bar::baz; |
| 202 | |
| 203 | This prints |
| 204 | |
| 205 | You gave me main::foo |
| 206 | You gave me bar::baz |
| 207 | |
| 208 | The C<*foo{THING}> notation can also be used to obtain references to the |
| 209 | individual elements of *foo. See L<perlref>. |
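
For example, here is a minimal sketch that pulls the array and code slots out of one glob. The name C<items> is hypothetical:

```perl
use strict;
use warnings;

our @items = (1, 2, 3);
sub items { return 'called items()' }

# Each *name{THING} lookup returns a reference to one slot of the glob.
my $array_ref = *items{ARRAY};
my $code_ref  = *items{CODE};

print "same array\n" if $array_ref == \@items;
print $code_ref->(), "\n";       # called items()
```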
| 210 | |
| 211 | Subroutine definitions (and declarations, for that matter) need |
| 212 | not necessarily be situated in the package whose symbol table they |
| 213 | occupy. You can define a subroutine outside its package by |
| 214 | explicitly qualifying the name of the subroutine: |
| 215 | |
| 216 | package main; |
| 217 | sub Some_package::foo { ... } # &foo defined in Some_package |
| 218 | |
| 219 | This is just a shorthand for a typeglob assignment at compile time: |
| 220 | |
| 221 | BEGIN { *Some_package::foo = sub { ... } } |
| 222 | |
| 223 | and is I<not> the same as writing: |
| 224 | |
| 225 | { |
| 226 |     package Some_package; |
| 227 |     sub foo { ... } |
| 228 | } |
| 229 | |
| 230 | In the first two versions, the body of the subroutine is |
| 231 | lexically in the main package, I<not> in Some_package. So |
| 232 | something like this: |
| 233 | |
| 234 | package main; |
| 235 | |
| 236 | $Some_package::name = "fred"; |
| 237 | $main::name = "barney"; |
| 238 | |
| 239 | sub Some_package::foo { |
| 240 |     print "in ", __PACKAGE__, ": \$name is '$name'\n"; |
| 241 | } |
| 242 | |
| 243 | Some_package::foo(); |
| 244 | |
| 245 | prints: |
| 246 | |
| 247 | in main: $name is 'barney' |
| 248 | |
| 249 | rather than: |
| 250 | |
| 251 | in Some_package: $name is 'fred' |
| 252 | |
| 253 | This also has implications for the use of the SUPER:: qualifier |
| 254 | (see L<perlobj>). |
| 255 | |
| 256 | =head2 BEGIN, UNITCHECK, CHECK, INIT and END |
| 257 | X<BEGIN> X<UNITCHECK> X<CHECK> X<INIT> X<END> |
| 258 | |
| 259 | Five specially named code blocks are executed at the beginning and at |
| 260 | the end of a running Perl program. These are the C<BEGIN>, |
| 261 | C<UNITCHECK>, C<CHECK>, C<INIT>, and C<END> blocks. |
| 262 | |
| 263 | These code blocks can be prefixed with C<sub> to give the appearance of a |
| 264 | subroutine (although this is not considered good style). One should note |
| 265 | that these code blocks don't really exist as named subroutines (despite |
| 266 | their appearance). The thing that gives this away is the fact that you can |
| 267 | have B<more than one> of these code blocks in a program, and they will |
| 268 | B<all> get executed at the appropriate moment. So you can't execute any of |
| 269 | these code blocks by name. |
| 270 | |
| 271 | A C<BEGIN> code block is executed as soon as possible, that is, the moment |
| 272 | it is completely defined, even before the rest of the containing file (or |
| 273 | string) is parsed. You may have multiple C<BEGIN> blocks within a file (or |
| 274 | eval'ed string); they will execute in order of definition. Because a C<BEGIN> |
| 275 | code block executes immediately, it can pull in definitions of subroutines |
| 276 | and such from other files in time to be visible to the rest of the compile |
| 277 | and run time. Once a C<BEGIN> has run, it is immediately undefined and any |
| 278 | code it used is returned to Perl's memory pool. |
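
The following sketch records the order in which three pieces of code run. The C<@order> array is an assumption of the example, not anything Perl provides:

```perl
use strict;
use warnings;

our @order;

push @order, 'first statement';

BEGIN { push @order, 'BEGIN block' }   # runs as soon as its closing brace is parsed

push @order, 'second statement';

print join(' -> ', @order), "\n";
# BEGIN block -> first statement -> second statement
```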
| 279 | |
| 280 | An C<END> code block is executed as late as possible, that is, after |
| 281 | perl has finished running the program and just before the interpreter |
| 282 | exits, even if it is exiting as a result of a die() function. |
| 283 | (But not if it's morphing into another program via C<exec>, or |
| 284 | being blown out of the water by a signal--you have to trap that yourself |
| 285 | (if you can).) You may have multiple C<END> blocks within a file--they |
| 286 | will execute in reverse order of definition; that is: last in, first |
| 287 | out (LIFO). C<END> blocks are not executed when you run perl with the |
| 288 | C<-c> switch, or if compilation fails. |
| 289 | |
| 290 | Note that C<END> code blocks are B<not> executed at the end of a string |
| 291 | C<eval()>: if any C<END> code blocks are created in a string C<eval()>, |
| 292 | they will be executed just as any other C<END> code block of that package |
| 293 | in LIFO order just before the interpreter exits. |
| 294 | |
| 295 | Inside an C<END> code block, C<$?> contains the value that the program is |
| 296 | going to pass to C<exit()>. You can modify C<$?> to change the exit |
| 297 | value of the program. Beware of changing C<$?> by accident (e.g. by |
| 298 | running something via C<system>). |
| 299 | X<$?> |
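
One common way to guard against that is to localize C<$?> around the clean-up work. The sketch below assumes a hypothetical C<cleanup> helper and uses a child perl process as a stand-in for a real external command:

```perl
use strict;
use warnings;

# Hypothetical clean-up helper. system() overwrites $?, so localize it
# first; the previous value comes back when the sub returns.
sub cleanup {
    local $?;
    system($^X, '-e', 'exit 3');   # stand-in for a real external command
    return $? >> 8;                # the child's exit code, visible only in here
}

END {
    cleanup();                     # the program's exit status is left untouched
}
```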
| 300 | |
| 301 | Inside of an C<END> block, the value of C<${^GLOBAL_PHASE}> will be |
| 302 | C<"END">. |
| 303 | |
| 304 | C<UNITCHECK>, C<CHECK> and C<INIT> code blocks are useful to catch the |
| 305 | transition between the compilation phase and the execution phase of |
| 306 | the main program. |
| 307 | |
| 308 | C<UNITCHECK> blocks are run just after the unit which defined them has |
| 309 | been compiled. The main program file and each module it loads are |
| 310 | compilation units, as are string C<eval>s, run-time code compiled using the |
| 311 | C<(?{ })> construct in a regex, calls to C<do FILE>, C<require FILE>, |
| 312 | and code after the C<-e> switch on the command line. |
| 313 | |
| 314 | C<BEGIN> and C<UNITCHECK> blocks are not directly related to the phase of |
| 315 | the interpreter. They can be created and executed during any phase. |
| 316 | |
| 317 | C<CHECK> code blocks are run just after the B<initial> Perl compile phase ends |
| 318 | and before the run time begins, in LIFO order. C<CHECK> code blocks are used |
| 319 | in the Perl compiler suite to save the compiled state of the program. |
| 320 | |
| 321 | Inside of a C<CHECK> block, the value of C<${^GLOBAL_PHASE}> will be |
| 322 | C<"CHECK">. |
| 323 | |
| 324 | C<INIT> blocks are run just before the Perl runtime begins execution, in |
| 325 | "first in, first out" (FIFO) order. |
| 326 | |
| 327 | Inside of an C<INIT> block, the value of C<${^GLOBAL_PHASE}> will be C<"INIT">. |
| 328 | |
| 329 | The C<CHECK> and C<INIT> blocks in code compiled by C<require>, string C<do>, |
| 330 | or string C<eval> will not be executed if they occur after the end of the |
| 331 | main compilation phase; that can be a problem in mod_perl and other persistent |
| 332 | environments which use those functions to load code at runtime. |
| 333 | |
| 334 | When you use the B<-n> and B<-p> switches to Perl, C<BEGIN> and |
| 335 | C<END> work just as they do in B<awk>, as a degenerate case. |
| 336 | Both C<BEGIN> and C<CHECK> blocks are run when you use the B<-c> |
| 337 | switch for a compile-only syntax check, although your main code |
| 338 | is not. |
| 339 | |
| 340 | The B<begincheck> program makes it all clear, eventually: |
| 341 | |
| 342 | #!/usr/bin/perl |
| 343 | |
| 344 | # begincheck |
| 345 | |
| 346 | print "10. Ordinary code runs at runtime.\n"; |
| 347 | |
| 348 | END { print "16. So this is the end of the tale.\n" } |
| 349 | INIT { print " 7. INIT blocks run FIFO just before runtime.\n" } |
| 350 | UNITCHECK { |
| 351 |     print " 4. And therefore before any CHECK blocks.\n" |
| 352 | } |
| 353 | CHECK { print " 6. So this is the sixth line.\n" } |
| 354 | |
| 355 | print "11. It runs in order, of course.\n"; |
| 356 | |
| 357 | BEGIN { print " 1. BEGIN blocks run FIFO during compilation.\n" } |
| 358 | END { print "15. Read perlmod for the rest of the story.\n" } |
| 359 | CHECK { print " 5. CHECK blocks run LIFO after all compilation.\n" } |
| 360 | INIT { print " 8. Run this again, using Perl's -c switch.\n" } |
| 361 | |
| 362 | print "12. This is anti-obfuscated code.\n"; |
| 363 | |
| 364 | END { print "14. END blocks run LIFO at quitting time.\n" } |
| 365 | BEGIN { print " 2. So this line comes out second.\n" } |
| 366 | UNITCHECK { |
| 367 |     print " 3. UNITCHECK blocks run LIFO after each file is compiled.\n" |
| 368 | } |
| 369 | INIT { print " 9. You'll see the difference right away.\n" } |
| 370 | |
| 371 | print "13. It merely _looks_ like it should be confusing.\n"; |
| 372 | |
| 373 | __END__ |
| 374 | |
| 375 | =head2 Perl Classes |
| 376 | X<class> X<@ISA> |
| 377 | |
| 378 | There is no special class syntax in Perl, but a package may act |
| 379 | as a class if it provides subroutines to act as methods. Such a |
| 380 | package may also derive some of its methods from another class (package) |
| 381 | by listing the other package name(s) in its global @ISA array (which |
| 382 | must be a package global, not a lexical). |
| 383 | |
| 384 | For more on this, see L<perlootut> and L<perlobj>. |
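
A minimal sketch, assuming two hypothetical classes C<Animal> and C<Dog>:

```perl
use strict;
use warnings;

package Animal;    # hypothetical base class

sub new   { my ($class, %args) = @_; return bless { %args }, $class }
sub speak { my $self = shift; return ref($self) . ' says ' . $self->sound }
sub sound { return '...' }

package Dog;       # hypothetical subclass

our @ISA = ('Animal');     # the package variable @Dog::ISA, not a lexical
sub sound { return 'Woof' }

package main;

print Dog->new->speak, "\n";   # Dog says Woof
```

C<Dog> defines only C<sound>; C<new> and C<speak> are found in C<Animal> through C<@ISA>.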
| 385 | |
| 386 | =head2 Perl Modules |
| 387 | X<module> |
| 388 | |
| 389 | A module is just a set of related functions in a library file, i.e., |
| 390 | a Perl package with the same name as the file. It is specifically |
| 391 | designed to be reusable by other modules or programs. It may do this |
| 392 | by providing a mechanism for exporting some of its symbols into the |
| 393 | symbol table of any package using it, or it may function as a class |
| 394 | definition and make its semantics available implicitly through |
| 395 | method calls on the class and its objects, without explicitly |
| 396 | exporting anything. Or it can do a little of both. |
| 397 | |
| 398 | For example, to start a traditional, non-OO module called Some::Module, |
| 399 | create a file called F<Some/Module.pm> and start with this template: |
| 400 | |
| 401 | package Some::Module; # assumes Some/Module.pm |
| 402 | |
| 403 | use strict; |
| 404 | use warnings; |
| 405 | |
| 406 | BEGIN { |
| 407 |     require Exporter; |
| 408 | |
| 409 |     # set the version for version checking |
| 410 |     our $VERSION = '1.00'; |
| 411 | |
| 412 |     # Inherit from Exporter to export functions and variables |
| 413 |     our @ISA = qw(Exporter); |
| 414 | |
| 415 |     # Functions and variables which are exported by default |
| 416 |     our @EXPORT = qw(func1 func2); |
| 417 | |
| 418 |     # Functions and variables which can be optionally exported |
| 419 |     our @EXPORT_OK = qw($Var1 %Hashit func3); |
| 420 | } |
| 421 | |
| 422 | # exported package globals go here |
| 423 | our $Var1 = ''; |
| 424 | our %Hashit = (); |
| 425 | |
| 426 | # non-exported package globals go here |
| 427 | # (they are still accessible as $Some::Module::stuff) |
| 428 | our @more = (); |
| 429 | our $stuff = ''; |
| 430 | |
| 431 | # file-private lexicals go here, before any functions which use them |
| 432 | my $priv_var = ''; |
| 433 | my %secret_hash = (); |
| 434 | |
| 435 | # here's a file-private function as a closure, |
| 436 | # callable as $priv_func->(); |
| 437 | my $priv_func = sub { |
| 438 |     ... |
| 439 | }; |
| 440 | |
| 441 | # make all your functions, whether exported or not; |
| 442 | # remember to put something interesting in the {} stubs |
| 443 | sub func1 { ... } |
| 444 | sub func2 { ... } |
| 445 | |
| 446 | # this one isn't exported by default, but could be called directly |
| 447 | # as Some::Module::func3() |
| 448 | sub func3 { ... } |
| 449 | |
| 450 | END { ... } # module clean-up code here (global destructor) |
| 451 | |
| 452 | 1; # don't forget to return a true value from the file |
| 453 | |
| 454 | Then go on to declare and use your variables in functions without |
| 455 | any qualifications. See L<Exporter> and L<perlmodlib> for |
| 456 | details on mechanics and style issues in module creation. |
| 457 | |
| 458 | Perl modules are included into your program by saying |
| 459 | |
| 460 | use Module; |
| 461 | |
| 462 | or |
| 463 | |
| 464 | use Module LIST; |
| 465 | |
| 466 | This is exactly equivalent to |
| 467 | |
| 468 | BEGIN { require 'Module.pm'; 'Module'->import; } |
| 469 | |
| 470 | or |
| 471 | |
| 472 | BEGIN { require 'Module.pm'; 'Module'->import( LIST ); } |
| 473 | |
| 474 | As a special case |
| 475 | |
| 476 | use Module (); |
| 477 | |
| 478 | is exactly equivalent to |
| 479 | |
| 480 | BEGIN { require 'Module.pm'; } |
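
To make the equivalence concrete, here is a sketch that spells out the C<BEGIN> form by hand for two core modules. The choice of C<List::Util> and C<File::Spec> is arbitrary:

```perl
use strict;
use warnings;

# What "use List::Util qw(sum max);" amounts to.
BEGIN {
    require 'List/Util.pm';
    'List::Util'->import(qw(sum max));
}

# What "use File::Spec ();" amounts to: load the module, import nothing.
BEGIN { require 'File/Spec.pm'; }

print sum(1, 2, 3), "\n";        # 6
print max(4, 9, 2), "\n";        # 9

my $path = File::Spec->catfile('lib', 'Some', 'Module.pm');
print "$path\n";                 # separator depends on the platform
```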
| 481 | |
| 482 | All Perl module files have the extension F<.pm>. The C<use> operator |
| 483 | assumes this so you don't have to spell out "F<Module.pm>" in quotes. |
| 484 | This also helps to differentiate new modules from old F<.pl> and |
| 485 | F<.ph> files. Module names are also capitalized unless they're |
| 486 | functioning as pragmas; pragmas are in effect compiler directives, |
| 487 | and are sometimes called "pragmatic modules" (or even "pragmata" |
| 488 | if you're a classicist). |
| 489 | |
| 490 | The two statements: |
| 491 | |
| 492 | require SomeModule; |
| 493 | require "SomeModule.pm"; |
| 494 | |
| 495 | differ from each other in two ways. In the first case, any double |
| 496 | colons in the module name, such as C<Some::Module>, are translated |
| 497 | into your system's directory separator, usually "/". The second |
| 498 | case does not, and would have to be specified literally. The other |
| 499 | difference is that seeing the first C<require> clues in the compiler |
| 500 | that uses of indirect object notation involving "SomeModule", as |
| 501 | in C<$ob = purge SomeModule>, are method calls, not function calls. |
| 502 | (Yes, this really can make a difference.) |
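
The path translation can be observed in C<%INC>, which records each loaded file under the path that was searched for. In this sketch the two core modules are arbitrary choices:

```perl
use strict;
use warnings;

require File::Basename;      # bareword: "::" is translated into a path separator
require "File/Spec.pm";      # string: the path is taken literally

# Both forms end up as path-style keys in %INC.
print "$_\n" for grep { m{^File/} } sort keys %INC;
```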
| 503 | |
| 504 | Because the C<use> statement implies a C<BEGIN> block, the importing |
| 505 | of semantics happens as soon as the C<use> statement is compiled, |
| 506 | before the rest of the file is compiled. This is how it is able |
| 507 | to function as a pragma mechanism, and also how modules are able to |
| 508 | declare subroutines that are then visible as list or unary operators for |
| 509 | the rest of the current file. This will not work if you use C<require> |
| 510 | instead of C<use>. With C<require> you can get into this problem: |
| 511 | |
| 512 | require Cwd; # make Cwd:: accessible |
| 513 | $here = Cwd::getcwd(); |
| 514 | |
| 515 | use Cwd; # import names from Cwd:: |
| 516 | $here = getcwd(); |
| 517 | |
| 518 | require Cwd; # make Cwd:: accessible |
| 519 | $here = getcwd(); # oops! no main::getcwd() |
| 520 | |
| 521 | In general, C<use Module ()> is recommended over C<require Module>, |
| 522 | because it determines module availability at compile time, not in the |
| 523 | middle of your program's execution. An exception would be if two modules |
| 524 | each tried to C<use> each other, and each also called a function from |
| 525 | that other module. In that case, it's easy to use C<require> instead. |
| 526 | |
| 527 | Perl packages may be nested inside other package names, so we can have |
| 528 | package names containing C<::>. But if we used that package name |
| 529 | directly as a filename it would make for unwieldy or impossible |
| 530 | filenames on some systems. Therefore, if a module's name is, say, |
| 531 | C<Text::Soundex>, then its definition is actually found in the library |
| 532 | file F<Text/Soundex.pm>. |
| 533 | |
| 534 | Perl modules always have a F<.pm> file, but there may also be |
| 535 | dynamically linked executables (often ending in F<.so>) or autoloaded |
| 536 | subroutine definitions (often ending in F<.al>) associated with the |
| 537 | module. If so, these will be entirely transparent to the user of |
| 538 | the module. It is the responsibility of the F<.pm> file to load |
| 539 | (or arrange to autoload) any additional functionality. For example, |
| 540 | although the POSIX module happens to do both dynamic loading and |
| 541 | autoloading, the user can say just C<use POSIX> to get it all. |
| 542 | |
| 543 | =head2 Making your module threadsafe |
| 544 | X<threadsafe> X<thread safe> |
| 545 | X<module, threadsafe> X<module, thread safe> |
| 546 | X<CLONE> X<CLONE_SKIP> X<thread> X<threads> X<ithread> |
| 547 | |
| 548 | Since 5.6.0, Perl has had support for a new type of threads called |
| 549 | interpreter threads (ithreads). These threads can be used explicitly |
| 550 | and implicitly. |
| 551 | |
| 552 | Ithreads work by cloning the data tree so that no data is shared |
| 553 | between different threads. These threads are available through the C<threads> |
| 554 | module or by doing fork() on win32 (fake fork() support). When a |
| 555 | thread is cloned, all Perl data is cloned; however, non-Perl data cannot |
| 556 | be cloned automatically. Perl after 5.7.2 has support for the C<CLONE> |
| 557 | special subroutine. |
| 558 | In C<CLONE> you can do whatever you need to do, for example handle the |
| 559 | cloning of non-Perl data, if necessary. |
| 560 | C<CLONE> will be called once as a class method for every package that has it |
| 561 | defined (or inherits it). It will be called in the context of the new thread, |
| 562 | so all modifications are made in the new area. Currently CLONE is called with |
| 563 | no parameters other than the invocant package name, but code should not assume |
| 564 | that this will remain unchanged, as it is likely that in future extra parameters |
| 565 | will be passed in to give more information about the state of cloning. |
| 566 | |
| 567 | If you want to CLONE all objects you will need to keep track of them per |
| 568 | package. This is simply done using a hash and Scalar::Util::weaken(). |
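
A sketch of that bookkeeping follows. The class name C<Tracked>, the C<%REGISTRY> hash, and the C<live_count> helper are all assumptions made for this example, and the step that would duplicate non-Perl data is left as a comment:

```perl
use strict;
use warnings;

package Tracked;    # hypothetical class that wraps some non-Perl resource

use Scalar::Util qw(weaken refaddr);

my %REGISTRY;       # address => weak reference to each live object

sub new {
    my ($class, %args) = @_;
    my $self = bless { %args }, $class;
    # Weak, so the registry does not keep the object alive.
    weaken($REGISTRY{ refaddr($self) } = $self);
    return $self;
}

sub DESTROY {
    my $self = shift;
    delete $REGISTRY{ refaddr($self) };
}

sub live_count { return scalar grep { defined $_ } values %REGISTRY }

# Called as a class method in each new thread. The copied objects have
# new addresses there, so rebuild the registry around them.
sub CLONE {
    my ($class) = @_;
    my @objects = grep { defined $_ } values %REGISTRY;
    %REGISTRY = ();
    for my $object (@objects) {
        # ... duplicate the object's non-Perl data here ...
        weaken($REGISTRY{ refaddr($object) } = $object);
    }
    return;
}

package main;

my $object = Tracked->new(name => 'example');
print Tracked::live_count(), "\n";   # 1
```

The registry is emptied and refilled rather than copied, because copying a weak reference produces a strong one.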
| 569 | |
| 570 | Perl after 5.8.7 has support for the C<CLONE_SKIP> special subroutine. |
| 571 | Like C<CLONE>, C<CLONE_SKIP> is called once per package; however, it is |
| 572 | called just before cloning starts, and in the context of the parent |
| 573 | thread. If it returns a true value, then no objects of that class will |
| 574 | be cloned; or rather, they will be copied as unblessed, undef values. |
| 575 | For example: if in the parent there are two references to a single blessed |
| 576 | hash, then in the child there will be two references to a single undefined |
| 577 | scalar value instead. |
| 578 | This provides a simple mechanism for making a module threadsafe; just add |
| 579 | C<sub CLONE_SKIP { 1 }> at the top of the class, and C<DESTROY()> will |
| 580 | now only be called once per object. Of course, if the child thread needs |
| 581 | to make use of the objects, then a more sophisticated approach is |
| 582 | needed. |
| 583 | |
| 584 | Like C<CLONE>, C<CLONE_SKIP> is currently called with no parameters other |
| 585 | than the invocant package name, although that may change. Similarly, to |
| 586 | allow for future expansion, the return value should be a single C<0> or |
| 587 | C<1> value. |
| 588 | |
| 589 | =head1 SEE ALSO |
| 590 | |
| 591 | See L<perlmodlib> for general style issues related to building Perl |
| 592 | modules and classes, as well as descriptions of the standard library |
| 593 | and CPAN, L<Exporter> for how Perl's standard import/export mechanism |
| 594 | works, L<perlootut> for in-depth information on |
| 595 | creating classes, L<perlobj> for a hard-core reference document on |
| 596 | objects, L<perlsub> for an explanation of functions and scoping, |
| 597 | and L<perlxstut> and L<perlguts> for more information on writing |
| 598 | extension modules. |