| 1 | =head1 NAME |
| 2 | |
| 3 | perlmod - Perl modules (packages and symbol tables) |
| 4 | |
| 5 | =head1 DESCRIPTION |
| 6 | |
| 7 | =head2 Is this the document you were after? |
| 8 | |
| 9 | There are other documents which might contain the information that you're |
| 10 | looking for: |
| 11 | |
| 12 | =over 2 |
| 13 | |
| 14 | =item This doc |
| 15 | |
| 16 | Perl's packages, namespaces, and some info on classes. |
| 17 | |
| 18 | =item L<perlnewmod> |
| 19 | |
| 20 | Tutorial on making a new module. |
| 21 | |
| 22 | =item L<perlmodstyle> |
| 23 | |
| 24 | Best practices for making a new module. |
| 25 | |
| 26 | =back |
| 27 | |
| 28 | =head2 Packages |
| 29 | X<package> X<namespace> X<variable, global> X<global variable> X<global> |
| 30 | |
| 31 | Unlike Perl 4, in which all the variables were dynamic and shared one |
| 32 | global namespace, causing maintainability problems, Perl 5 provides two |
| 33 | mechanisms for protecting code from having its variables stomped on by |
| 34 | other code: lexically scoped variables created with C<my> or C<state> and |
| 35 | namespaced global variables, which are exposed via the C<vars> pragma, |
| 36 | or the C<our> keyword. Any global variable is considered to |
| 37 | be part of a namespace and can be accessed via a "fully qualified form". |
| 38 | Conversely, any lexically scoped variable is considered to be part of |
| 39 | its lexical scope, and does not have a "fully qualified form". |
| 40 | |
| 41 | In Perl, namespaces are called "packages", and |
| 42 | the C<package> declaration tells the compiler which |
| 43 | namespace to prefix to C<our> variables and unqualified dynamic names. |
| 44 | This both protects |
| 45 | against accidental stomping and provides an interface for deliberately |
| 46 | clobbering global dynamic variables declared and used in other scopes or |
| 47 | packages, when that is what you want to do. |
| 48 | |
| 49 | The scope of the C<package> declaration is from the |
| 50 | declaration itself through the end of the enclosing block, C<eval>, |
| 51 | or file, whichever comes first (the same scope as the my(), our(), state(), and |
| 52 | local() operators, and also the effect |
| 53 | of the experimental "reference aliasing," which may change), or until |
| 54 | the next C<package> declaration. Unqualified dynamic identifiers will be in |
| 55 | this namespace, except for those few identifiers that, if unqualified, |
| 56 | default to the main package instead of the current one as described |
| 57 | below. A C<package> statement affects only dynamic global |
| 58 | symbols, including subroutine names, and variables you've used local() |
| 59 | on, but I<not> lexical variables created with my(), our() or state(). |
| 60 | |
| 61 | Typically, a C<package> statement is the first declaration in a file |
| 62 | included in a program by one of the C<do>, C<require>, or C<use> operators. You can |
| 63 | switch into a package in more than one place: C<package> has no |
| 64 | effect beyond specifying which symbol table the compiler will use for |
| 65 | dynamic symbols for the rest of that block or until the next C<package> statement. |
| 66 | You can refer to variables and filehandles in other packages |
| 67 | by prefixing the identifier with the package name and a double |
| 68 | colon: C<$Package::Variable>. If the package name is null, the |
| 69 | C<main> package is assumed. That is, C<$::sail> is equivalent to |
| 70 | C<$main::sail>. |
| 71 | |
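As a minimal sketch of qualified names, the following shows one unqualified name resolving to different package variables depending on the current package. The package C<Boat> and the helper C<sails> are hypothetical names chosen for illustration; only C<$::sail> comes from the text above.

```perl
use strict;
use warnings;

{
    package Boat;                # hypothetical package, for illustration
    our $sail = "jib";           # unqualified name: this is $Boat::sail
}

package main;
our $sail = "mainsail";          # this is $main::sail

# Illustrative helper: returns the same-named variable from both
# packages, plus the null-package spelling of the main one.
sub sails {
    return ($Boat::sail, $main::sail, $::sail);
}

print join(", ", sails()), "\n";
```

The second and third elements are the same variable, because a null package name means C<main>.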
| 72 | The old package delimiter was a single quote, but double colon is now the |
| 73 | preferred delimiter, in part because it's more readable to humans, and |
| 74 | in part because it's more readable to B<emacs> macros. It also makes C++ |
| 75 | programmers feel like they know what's going on--as opposed to using the |
| 76 | single quote as separator, which was there to make Ada programmers feel |
| 77 | like they knew what was going on. Because the old-fashioned syntax is still |
| 78 | supported for backwards compatibility, if you try to use a string like |
| 79 | C<"This is $owner's house">, you'll be accessing C<$owner::s>; that is, |
| 80 | the $s variable in package C<owner>, which is probably not what you meant. |
| 81 | Use braces to disambiguate, as in C<"This is ${owner}'s house">. |
| 82 | X<::> X<'> |
| 83 | |
| 84 | Packages may themselves contain package separators, as in |
| 85 | C<$OUTER::INNER::var>. This implies nothing about the order of |
| 86 | name lookups, however. There are no relative packages: all symbols |
| 87 | are either local to the current package, or must be fully qualified |
| 88 | from the outer package name down. For instance, there is nowhere |
| 89 | within package C<OUTER> that C<$INNER::var> refers to |
| 90 | C<$OUTER::INNER::var>. C<INNER> refers to a totally |
| 91 | separate global package. The custom of treating package names as a |
| 92 | hierarchy is very strong, but the language in no way enforces it. |
| 93 | |
| 94 | Only identifiers starting with letters (or underscore) are stored |
| 95 | in a package's symbol table. All other symbols are kept in package |
| 96 | C<main>, including all punctuation variables, like $_. In addition, |
| 97 | when unqualified, the identifiers STDIN, STDOUT, STDERR, ARGV, |
| 98 | ARGVOUT, ENV, INC, and SIG are forced to be in package C<main>, |
| 99 | even when used for other purposes than their built-in ones. If you |
| 100 | have a package called C<m>, C<s>, or C<y>, then you can't use the |
| 101 | qualified form of an identifier because it would instead be interpreted |
| 102 | as a pattern match, a substitution, or a transliteration. |
| 103 | X<variable, punctuation> |
| 104 | |
| 105 | Variables beginning with underscore used to be forced into package |
| 106 | main, but we decided it was more useful for package writers to be able |
| 107 | to use leading underscore to indicate private variables and method names. |
| 108 | However, variables and functions named with a single C<_>, such as |
| 109 | $_ and C<sub _>, are still forced into the package C<main>. See also |
| 110 | L<perlvar/"The Syntax of Variable Names">. |
| 111 | |
| 112 | C<eval>ed strings are compiled in the package in which the eval() was |
| 113 | compiled. (Assignments to C<$SIG{}>, however, assume the signal |
| 114 | handler specified is in the C<main> package. Qualify the signal handler |
| 115 | name if you wish to have a signal handler in a package.) For an |
| 116 | example, examine F<perl5db.pl> in the Perl library. It initially switches |
| 117 | to the C<DB> package so that the debugger doesn't interfere with variables |
| 118 | in the program you are trying to debug. At various points, however, it |
| 119 | temporarily switches back to the C<main> package to evaluate various |
| 120 | expressions in the context of the C<main> package (or wherever you came |
| 121 | from). See L<perldebug>. |
| 122 | |
| 123 | The special symbol C<__PACKAGE__> contains the current package, but cannot |
| 124 | (easily) be used to construct variable names. After C<my($foo)> has hidden |
| 125 | package variable C<$foo>, it can still be accessed, without knowing what |
| 126 | package you are in, as C<${__PACKAGE__.'::foo'}>. |
| 127 | |
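A small sketch of that lookup, assuming a hypothetical helper named C<both_foos>. Because the name is built at run time, this is a symbolic reference, so under C<use strict> it needs C<no strict 'refs'> in the enclosing scope:

```perl
use strict;
use warnings;

our $foo = "package value";

# Illustrative helper (not from the text above): returns the lexical
# and the hidden package variable side by side.
sub both_foos {
    my $foo = "lexical value";      # hides the package variable $foo
    no strict 'refs';               # the lookup below is a symbolic reference
    return ($foo, ${ __PACKAGE__ . '::foo' });
}

print join(" / ", both_foos()), "\n";
```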
| 128 | See L<perlsub> for other scoping issues related to my() and local(), |
| 129 | and L<perlref> regarding closures. |
| 130 | |
| 131 | =head2 Symbol Tables |
| 132 | X<symbol table> X<stash> X<%::> X<%main::> X<typeglob> X<glob> X<alias> |
| 133 | |
| 134 | The symbol table for a package happens to be stored in the hash of that |
| 135 | name with two colons appended. The main symbol table's name is thus |
| 136 | C<%main::>, or C<%::> for short. Likewise the symbol table for the nested |
| 137 | package mentioned earlier is named C<%OUTER::INNER::>. |
| 138 | |
| 139 | The value in each entry of the hash is what you are referring to when you |
| 140 | use the C<*name> typeglob notation. |
| 141 | |
| 142 | local *main::foo = *main::bar; |
| 143 | |
| 144 | You can use this to print out all the variables in a package, for |
| 145 | instance. The standard but antiquated F<dumpvar.pl> library and |
| 146 | the CPAN module Devel::Symdump make use of this. |
| 147 | |
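For instance, the names in a package can be listed by walking its symbol table hash. This is only a sketch: C<Demo::Pkg> and C<symbol_names> are hypothetical, and it merely reads the hash keys rather than modifying any entries.

```perl
use strict;
use warnings;

{
    package Demo::Pkg;           # hypothetical package, for illustration
    our $count = 1;
    our @items = (1, 2);
    sub helper { return 1 }
}

# Illustrative helper: returns the sorted identifiers found in a
# package's symbol table, skipping nested packages (keys ending in "::").
sub symbol_names {
    my ($package) = @_;
    no strict 'refs';            # %{"Demo::Pkg::"} is a symbolic reference
    return sort grep { !/::\z/ } keys %{ $package . '::' };
}

print "$_\n" for symbol_names('Demo::Pkg');
```

A name appears once no matter how many of its slots (scalar, array, code, and so on) are in use.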
| 148 | The results of creating new symbol table entries directly or modifying any |
| 149 | entries that are not already typeglobs are undefined and subject to change |
| 150 | between releases of perl. |
| 151 | |
| 152 | Assignment to a typeglob performs an aliasing operation, i.e., |
| 153 | |
| 154 | *dick = *richard; |
| 155 | |
| 156 | causes variables, subroutines, formats, and file and directory handles |
| 157 | accessible via the identifier C<richard> also to be accessible via the |
| 158 | identifier C<dick>. If you want to alias only a particular variable or |
| 159 | subroutine, assign a reference instead: |
| 160 | |
| 161 | *dick = \$richard; |
| 162 | |
| 163 | This makes $richard and $dick the same variable, but leaves |
| 164 | @richard and @dick as separate arrays. Tricky, eh? |
| 165 | |
| 166 | There is one subtle difference between the following statements: |
| 167 | |
| 168 | *foo = *bar; |
| 169 | *foo = \$bar; |
| 170 | |
| 171 | C<*foo = *bar> makes the typeglobs themselves synonymous while |
| 172 | C<*foo = \$bar> makes the SCALAR portions of two distinct typeglobs |
| 173 | refer to the same scalar value. This means that the following code: |
| 174 | |
| 175 | $bar = 1; |
| 176 | *foo = \$bar; # Make $foo an alias for $bar |
| 177 | |
| 178 | { |
| 179 | local $bar = 2; # Restrict changes to block |
| 180 | print $foo; # Prints '1'! |
| 181 | } |
| 182 | |
| 183 | would print '1', because C<$foo> holds a reference to the I<original> |
| 184 | C<$bar>, the one that was stuffed away by C<local()> and which will be |
| 185 | restored when the block ends. Because variables are accessed through the |
| 186 | typeglob, you can use C<*foo = *bar> to create an alias which can be |
| 187 | localized. (But be aware that this means you can't have a separate |
| 188 | C<@foo> and C<@bar>, etc.) |
| 189 | |
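The two kinds of alias can be compared side by side. In this sketch the variable names and the helper C<seen_through_aliases> are made up for illustration; one pair is aliased through the scalar slot only and the other through the whole typeglob.

```perl
use strict;
use warnings;

our ($bar, $foo);     # pair aliased through the scalar slot only
our ($qux, $baz);     # pair aliased through the whole typeglob

$bar = 1;
*foo = \$bar;         # $foo is bound to the current $bar scalar
$qux = 1;
*baz = *qux;          # *baz and *qux now share every slot

# Illustrative helper: localizes both originals and reports what
# each alias sees while the local values are in effect.
sub seen_through_aliases {
    local $bar = 2;
    local $qux = 2;
    return ($foo, $baz);
}

print join(", ", seen_through_aliases()), "\n";
```

Here C<$foo> still sees the original value 1, while C<$baz> follows the localized value 2.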
| 190 | What makes all of this important is that the Exporter module uses glob |
| 191 | aliasing as the import/export mechanism. Whether or not you can properly |
| 192 | localize a variable that has been exported from a module depends on how |
| 193 | it was exported: |
| 194 | |
| 195 | @EXPORT = qw($FOO); # Usual form, can't be localized |
| 196 | @EXPORT = qw(*FOO); # Can be localized |
| 197 | |
| 198 | You can work around the first case by using the fully qualified name |
| 199 | (C<$Package::FOO>) where you need a local value, or by overriding it |
| 200 | by saying C<*FOO = *Package::FOO> in your script. |
| 201 | |
| 202 | The C<*x = \$y> mechanism may be used to pass and return cheap references |
| 203 | into or from subroutines if you don't want to copy the whole |
| 204 | thing. It only works when assigning to dynamic variables, not |
| 205 | lexicals. |
| 206 | |
| 207 | %some_hash = (); # can't be my() |
| 208 | *some_hash = fn( \%another_hash ); |
| 209 | sub fn { |
| 210 | local *hashsym = shift; |
| 211 | # now use %hashsym normally, and you |
| 212 | # will affect the caller's %another_hash |
| 213 | my %nhash = (); # do what you want |
| 214 | return \%nhash; |
| 215 | } |
| 216 | |
| 217 | On return, the reference will overwrite the hash slot in the |
| 218 | symbol table specified by the *some_hash typeglob. This |
| 219 | is a somewhat tricky way of passing around references cheaply |
| 220 | when you don't want to have to remember to dereference variables |
| 221 | explicitly. |
| 222 | |
| 223 | Another use of symbol tables is for making "constant" scalars. |
| 224 | X<constant> X<scalar, constant> |
| 225 | |
| 226 | *PI = \3.14159265358979; |
| 227 | |
| 228 | Now you cannot alter C<$PI>, which is probably a good thing all in all. |
| 229 | This isn't the same as a constant subroutine, which is subject to |
| 230 | optimization at compile-time. A constant subroutine is one prototyped |
| 231 | to take no arguments and to return a constant expression. See |
| 232 | L<perlsub> for details on these. The C<use constant> pragma is a |
| 233 | convenient shorthand for these. |
| 234 | |
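Both styles can be sketched together. The constant C<E> and the helper C<try_to_change_pi> are illustrative names, not part of any module; the helper simply traps the error perl raises when code assigns to the read-only scalar.

```perl
use strict;
use warnings;
use constant E => 2.718281828;     # constant subroutine via the pragma

our $PI;
*PI = \3.14159265358979;           # $PI now refers to a read-only value

# Illustrative helper: attempts an assignment and reports whether
# perl allowed it.
sub try_to_change_pi {
    return eval { $PI = 3; 1 } ? "changed" : "refused";
}

print "PI assignment: ", try_to_change_pi(), "\n";
print "E is ", E, "\n";
```

The assignment fails at run time, so C<$PI> keeps its value.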
| 235 | You can say C<*foo{PACKAGE}> and C<*foo{NAME}> to find out what name and |
| 236 | package the *foo symbol table entry comes from. This may be useful |
| 237 | in a subroutine that gets passed typeglobs as arguments: |
| 238 | |
| 239 | sub identify_typeglob { |
| 240 | my $glob = shift; |
| 241 | print 'You gave me ', *{$glob}{PACKAGE}, |
| 242 | '::', *{$glob}{NAME}, "\n"; |
| 243 | } |
| 244 | identify_typeglob *foo; |
| 245 | identify_typeglob *bar::baz; |
| 246 | |
| 247 | This prints |
| 248 | |
| 249 | You gave me main::foo |
| 250 | You gave me bar::baz |
| 251 | |
| 252 | The C<*foo{THING}> notation can also be used to obtain references to the |
| 253 | individual elements of *foo. See L<perlref>. |
| 254 | |
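A brief sketch of fetching individual slots this way, assuming a scalar, an array, and a subroutine that all happen to be named C<foo>:

```perl
use strict;
use warnings;

our $foo = 42;
our @foo = (1, 2, 3);
sub foo { return "called" }

# Each slot of *foo is fetched as an ordinary reference.
my $scalar_ref = *foo{SCALAR};
my $array_ref  = *foo{ARRAY};
my $code_ref   = *foo{CODE};

print "scalar: $$scalar_ref\n";
print "array:  @$array_ref\n";
print "code:   ", $code_ref->(), "\n";
```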
| 255 | Subroutine definitions (and declarations, for that matter) need |
| 256 | not necessarily be situated in the package whose symbol table they |
| 257 | occupy. You can define a subroutine outside its package by |
| 258 | explicitly qualifying the name of the subroutine: |
| 259 | |
| 260 | package main; |
| 261 | sub Some_package::foo { ... } # &foo defined in Some_package |
| 262 | |
| 263 | This is just a shorthand for a typeglob assignment at compile time: |
| 264 | |
| 265 | BEGIN { *Some_package::foo = sub { ... } } |
| 266 | |
| 267 | and is I<not> the same as writing: |
| 268 | |
| 269 | { |
| 270 | package Some_package; |
| 271 | sub foo { ... } |
| 272 | } |
| 273 | |
| 274 | In the first two versions, the body of the subroutine is |
| 275 | lexically in the main package, I<not> in Some_package. So |
| 276 | something like this: |
| 277 | |
| 278 | package main; |
| 279 | |
| 280 | $Some_package::name = "fred"; |
| 281 | $main::name = "barney"; |
| 282 | |
| 283 | sub Some_package::foo { |
| 284 | print "in ", __PACKAGE__, ": \$name is '$name'\n"; |
| 285 | } |
| 286 | |
| 287 | Some_package::foo(); |
| 288 | |
| 289 | prints: |
| 290 | |
| 291 | in main: $name is 'barney' |
| 292 | |
| 293 | rather than: |
| 294 | |
| 295 | in Some_package: $name is 'fred' |
| 296 | |
| 297 | This also has implications for the use of the SUPER:: qualifier |
| 298 | (see L<perlobj>). |
| 299 | |
| 300 | =head2 BEGIN, UNITCHECK, CHECK, INIT and END |
| 301 | X<BEGIN> X<UNITCHECK> X<CHECK> X<INIT> X<END> |
| 302 | |
| 303 | Five specially named code blocks are executed at the beginning and at |
| 304 | the end of a running Perl program. These are the C<BEGIN>, |
| 305 | C<UNITCHECK>, C<CHECK>, C<INIT>, and C<END> blocks. |
| 306 | |
| 307 | These code blocks can be prefixed with C<sub> to give the appearance of a |
| 308 | subroutine (although this is not considered good style). One should note |
| 309 | that these code blocks don't really exist as named subroutines (despite |
| 310 | their appearance). The thing that gives this away is the fact that you can |
| 311 | have B<more than one> of these code blocks in a program, and they will |
| 312 | B<all> be executed at the appropriate moment. So you can't execute any of |
| 313 | these code blocks by name. |
| 314 | |
| 315 | A C<BEGIN> code block is executed as soon as possible, that is, the moment |
| 316 | it is completely defined, even before the rest of the containing file (or |
| 317 | string) is parsed. You may have multiple C<BEGIN> blocks within a file (or |
| 318 | eval'ed string); they will execute in order of definition. Because a C<BEGIN> |
| 319 | code block executes immediately, it can pull in definitions of subroutines |
| 320 | and such from other files in time to be visible to the rest of the compile |
| 321 | and run time. Once a C<BEGIN> has run, it is immediately undefined and any |
| 322 | code it used is returned to Perl's memory pool. |
| 323 | |
| 324 | An C<END> code block is executed as late as possible, that is, after |
| 325 | perl has finished running the program and just before the interpreter |
| 326 | exits, even if it is exiting as a result of a die() function. |
| 327 | (But not if it's morphing into another program via C<exec>, or |
| 328 | being blown out of the water by a signal--you have to trap that yourself |
| 329 | (if you can).) You may have multiple C<END> blocks within a file--they |
| 330 | will execute in reverse order of definition; that is: last in, first |
| 331 | out (LIFO). C<END> blocks are not executed when you run perl with the |
| 332 | C<-c> switch, or if compilation fails. |
| 333 | |
| 334 | Note that C<END> code blocks are B<not> executed at the end of a string |
| 335 | C<eval()>: if any C<END> code blocks are created in a string C<eval()>, |
| 336 | they will be executed just as any other C<END> code block of that package |
| 337 | in LIFO order just before the interpreter exits. |
| 338 | |
| 339 | Inside an C<END> code block, C<$?> contains the value that the program is |
| 340 | going to pass to C<exit()>. You can modify C<$?> to change the exit |
| 341 | value of the program. Beware of changing C<$?> by accident (e.g. by |
| 342 | running something via C<system>). |
| 343 | X<$?> |
| 344 | |
| 345 | Inside an C<END> block, the value of C<${^GLOBAL_PHASE}> will be |
| 346 | C<"END">. |
| 347 | |
| 348 | C<UNITCHECK>, C<CHECK> and C<INIT> code blocks are useful to catch the |
| 349 | transition between the compilation phase and the execution phase of |
| 350 | the main program. |
| 351 | |
| 352 | C<UNITCHECK> blocks are run just after the unit which defined them has |
| 353 | been compiled. The main program file and each module it loads are |
| 354 | compilation units, as are string C<eval>s, run-time code compiled using the |
| 355 | C<(?{ })> construct in a regex, calls to C<do FILE>, C<require FILE>, |
| 356 | and code after the C<-e> switch on the command line. |
| 357 | |
| 358 | C<BEGIN> and C<UNITCHECK> blocks are not directly related to the phase of |
| 359 | the interpreter. They can be created and executed during any phase. |
| 360 | |
| 361 | C<CHECK> code blocks are run just after the B<initial> Perl compile phase ends |
| 362 | and before the run time begins, in LIFO order. C<CHECK> code blocks are used |
| 363 | in the Perl compiler suite to save the compiled state of the program. |
| 364 | |
| 365 | Inside of a C<CHECK> block, the value of C<${^GLOBAL_PHASE}> will be |
| 366 | C<"CHECK">. |
| 367 | |
| 368 | C<INIT> blocks are run just before the Perl runtime begins execution, in |
| 369 | "first in, first out" (FIFO) order. |
| 370 | |
| 371 | Inside of an C<INIT> block, the value of C<${^GLOBAL_PHASE}> will be C<"INIT">. |
| 372 | |
| 373 | The C<CHECK> and C<INIT> blocks in code compiled by C<require>, string C<do>, |
| 374 | or string C<eval> will not be executed if they occur after the end of the |
| 375 | main compilation phase; that can be a problem in mod_perl and other persistent |
| 376 | environments which use those functions to load code at runtime. |
| 377 | |
| 378 | When you use the B<-n> and B<-p> switches to Perl, C<BEGIN> and |
| 379 | C<END> work just as they do in B<awk>, as a degenerate case. |
| 380 | Both C<BEGIN> and C<CHECK> blocks are run when you use the B<-c> |
| 381 | switch for a compile-only syntax check, although your main code |
| 382 | is not. |
| 383 | |
| 384 | The B<begincheck> program makes it all clear, eventually: |
| 385 | |
| 386 | #!/usr/bin/perl |
| 387 | |
| 388 | # begincheck |
| 389 | |
| 390 | print "10. Ordinary code runs at runtime.\n"; |
| 391 | |
| 392 | END { print "16. So this is the end of the tale.\n" } |
| 393 | INIT { print " 7. INIT blocks run FIFO just before runtime.\n" } |
| 394 | UNITCHECK { |
| 395 | print " 4. And therefore before any CHECK blocks.\n" |
| 396 | } |
| 397 | CHECK { print " 6. So this is the sixth line.\n" } |
| 398 | |
| 399 | print "11. It runs in order, of course.\n"; |
| 400 | |
| 401 | BEGIN { print " 1. BEGIN blocks run FIFO during compilation.\n" } |
| 402 | END { print "15. Read perlmod for the rest of the story.\n" } |
| 403 | CHECK { print " 5. CHECK blocks run LIFO after all compilation.\n" } |
| 404 | INIT { print " 8. Run this again, using Perl's -c switch.\n" } |
| 405 | |
| 406 | print "12. This is anti-obfuscated code.\n"; |
| 407 | |
| 408 | END { print "14. END blocks run LIFO at quitting time.\n" } |
| 409 | BEGIN { print " 2. So this line comes out second.\n" } |
| 410 | UNITCHECK { |
| 411 | print " 3. UNITCHECK blocks run LIFO after each file is compiled.\n" |
| 412 | } |
| 413 | INIT { print " 9. You'll see the difference right away.\n" } |
| 414 | |
| 415 | print "13. It only _looks_ like it should be confusing.\n"; |
| 416 | |
| 417 | __END__ |
| 418 | |
| 419 | =head2 Perl Classes |
| 420 | X<class> X<@ISA> |
| 421 | |
| 422 | There is no special class syntax in Perl, but a package may act |
| 423 | as a class if it provides subroutines to act as methods. Such a |
| 424 | package may also derive some of its methods from another class (package) |
| 425 | by listing the other package name(s) in its global @ISA array (which |
| 426 | must be a package global, not a lexical). |
| 427 | |
| 428 | For more on this, see L<perlootut> and L<perlobj>. |
| 429 | |
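A minimal sketch of a package acting as a class and inheriting through C<@ISA>. The class names C<Animal> and C<Dog> and their methods are hypothetical, chosen only to illustrate the mechanism:

```perl
use strict;
use warnings;

{
    package Animal;                      # hypothetical base class
    sub new {
        my ($class, %args) = @_;
        return bless {%args}, $class;
    }
    sub speak {
        my ($self) = @_;
        return ref($self) . " makes a sound";
    }
}

{
    package Dog;                         # hypothetical subclass
    our @ISA = ('Animal');               # package global, not "my @ISA"
    sub fetch { return "fetching" }
}

my $dog = Dog->new(name => 'Rex');
print $dog->speak, "\n";                 # speak() is found via @ISA
print $dog->fetch, "\n";
```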
| 430 | =head2 Perl Modules |
| 431 | X<module> |
| 432 | |
| 433 | A module is just a set of related functions in a library file, i.e., |
| 434 | a Perl package with the same name as the file. It is specifically |
| 435 | designed to be reusable by other modules or programs. It may do this |
| 436 | by providing a mechanism for exporting some of its symbols into the |
| 437 | symbol table of any package using it, or it may function as a class |
| 438 | definition and make its semantics available implicitly through |
| 439 | method calls on the class and its objects, without explicitly |
| 440 | exporting anything. Or it can do a little of both. |
| 441 | |
| 442 | For example, to start a traditional, non-OO module called Some::Module, |
| 443 | create a file called F<Some/Module.pm> and start with this template: |
| 444 | |
| 445 | package Some::Module; # assumes Some/Module.pm |
| 446 | |
| 447 | use strict; |
| 448 | use warnings; |
| 449 | |
| 450 | # Get the import method from Exporter to export functions and |
| 451 | # variables |
| 452 | use Exporter 5.57 'import'; |
| 453 | |
| 454 | # set the version for version checking |
| 455 | our $VERSION = '1.00'; |
| 456 | |
| 457 | # Functions and variables which are exported by default |
| 458 | our @EXPORT = qw(func1 func2); |
| 459 | |
| 460 | # Functions and variables which can be optionally exported |
| 461 | our @EXPORT_OK = qw($Var1 %Hashit func3); |
| 462 | |
| 463 | # exported package globals go here |
| 464 | our $Var1 = ''; |
| 465 | our %Hashit = (); |
| 466 | |
| 467 | # non-exported package globals go here |
| 468 | # (they are still accessible as $Some::Module::stuff) |
| 469 | our @more = (); |
| 470 | our $stuff = ''; |
| 471 | |
| 472 | # file-private lexicals go here, before any functions which use them |
| 473 | my $priv_var = ''; |
| 474 | my %secret_hash = (); |
| 475 | |
| 476 | # here's a file-private function as a closure, |
| 477 | # callable as $priv_func->(); |
| 478 | my $priv_func = sub { |
| 479 | ... |
| 480 | }; |
| 481 | |
| 482 | # make all your functions, whether exported or not; |
| 483 | # remember to put something interesting in the {} stubs |
| 484 | sub func1 { ... } |
| 485 | sub func2 { ... } |
| 486 | |
| 487 | # this one isn't always exported, but could be called directly |
| 488 | # as Some::Module::func3() |
| 489 | sub func3 { ... } |
| 490 | |
| 491 | END { ... } # module clean-up code here (global destructor) |
| 492 | |
| 493 | 1; # don't forget to return a true value from the file |
| 494 | |
| 495 | Then go on to declare and use your variables in functions without |
| 496 | any qualifications. See L<Exporter> and L<perlmodlib> for |
| 497 | details on mechanics and style issues in module creation. |
| 498 | |
| 499 | Perl modules are included into your program by saying |
| 500 | |
| 501 | use Module; |
| 502 | |
| 503 | or |
| 504 | |
| 505 | use Module LIST; |
| 506 | |
| 507 | This is exactly equivalent to |
| 508 | |
| 509 | BEGIN { require 'Module.pm'; 'Module'->import; } |
| 510 | |
| 511 | or |
| 512 | |
| 513 | BEGIN { require 'Module.pm'; 'Module'->import( LIST ); } |
| 514 | |
| 515 | As a special case |
| 516 | |
| 517 | use Module (); |
| 518 | |
| 519 | is exactly equivalent to |
| 520 | |
| 521 | BEGIN { require 'Module.pm'; } |
| 522 | |
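These equivalences can be tried with two core modules. The sketch below spells out C<use List::Util qw(sum)> and C<use Scalar::Util ()> by hand; the helper C<total> is an illustrative name.

```perl
use strict;
use warnings;

# Spelled-out form of: use List::Util qw(sum);
BEGIN { require List::Util; List::Util->import('sum'); }

# Spelled-out form of: use Scalar::Util ();
BEGIN { require Scalar::Util; }

# Illustrative helper: sum() is callable unqualified because it was
# imported at compile time.
sub total { return sum(@_) }

print total(1, 2, 3), "\n";

# Nothing was imported from Scalar::Util, so its functions need the
# fully qualified name.
print defined(Scalar::Util::blessed([])) ? "blessed" : "plain", "\n";
```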
| 523 | All Perl module files have the extension F<.pm>. The C<use> operator |
| 524 | assumes this so you don't have to spell out "F<Module.pm>" in quotes. |
| 525 | This also helps to differentiate new modules from old F<.pl> and |
| 526 | F<.ph> files. Module names are also capitalized unless they're |
| 527 | functioning as pragmas; pragmas are in effect compiler directives, |
| 528 | and are sometimes called "pragmatic modules" (or even "pragmata" |
| 529 | if you're a classicist). |
| 530 | |
| 531 | The two statements: |
| 532 | |
| 533 | require SomeModule; |
| 534 | require "SomeModule.pm"; |
| 535 | |
| 536 | differ from each other in two ways. In the first case, any double |
| 537 | colons in the module name, such as C<Some::Module>, are translated |
| 538 | into your system's directory separator, usually "/". The second |
| 539 | case does not, and would have to be specified literally. The other |
| 540 | difference is that seeing the first C<require> clues in the compiler |
| 541 | that uses of indirect object notation involving "SomeModule", as |
| 542 | in C<$ob = purge SomeModule>, are method calls, not function calls. |
| 543 | (Yes, this really can make a difference.) |
| 544 | |
| 545 | Because the C<use> statement implies a C<BEGIN> block, the importing |
| 546 | of semantics happens as soon as the C<use> statement is compiled, |
| 547 | before the rest of the file is compiled. This is how it is able |
| 548 | to function as a pragma mechanism, and also how modules are able to |
| 549 | declare subroutines that are then visible as list or unary operators for |
| 550 | the rest of the current file. This will not work if you use C<require> |
| 551 | instead of C<use>. With C<require> you can get into this problem: |
| 552 | |
| 553 | require Cwd; # make Cwd:: accessible |
| 554 | $here = Cwd::getcwd(); |
| 555 | |
| 556 | use Cwd; # import names from Cwd:: |
| 557 | $here = getcwd(); |
| 558 | |
| 559 | require Cwd; # make Cwd:: accessible |
| 560 | $here = getcwd(); # oops! no main::getcwd() |
| 561 | |
| 562 | In general, C<use Module ()> is recommended over C<require Module>, |
| 563 | because it determines module availability at compile time, not in the |
| 564 | middle of your program's execution. An exception would be if two modules |
| 565 | each tried to C<use> each other, and each also called a function from |
| 566 | that other module. In that case, it's easy to use C<require> instead. |
| 567 | |
| 568 | Perl packages may be nested inside other package names, so we can have |
| 569 | package names containing C<::>. But if we used that package name |
| 570 | directly as a filename it would make for unwieldy or impossible |
| 571 | filenames on some systems. Therefore, if a module's name is, say, |
| 572 | C<Text::Soundex>, then its definition is actually found in the library |
| 573 | file F<Text/Soundex.pm>. |
| 574 | |
| 575 | Perl modules always have a F<.pm> file, but there may also be |
| 576 | dynamically linked executables (often ending in F<.so>) or autoloaded |
| 577 | subroutine definitions (often ending in F<.al>) associated with the |
| 578 | module. If so, these will be entirely transparent to the user of |
| 579 | the module. It is the responsibility of the F<.pm> file to load |
| 580 | (or arrange to autoload) any additional functionality. For example, |
| 581 | although the POSIX module happens to do both dynamic loading and |
| 582 | autoloading, the user can say just C<use POSIX> to get it all. |
| 583 | |
| 584 | =head2 Making your module threadsafe |
| 585 | X<threadsafe> X<thread safe> |
| 586 | X<module, threadsafe> X<module, thread safe> |
| 587 | X<CLONE> X<CLONE_SKIP> X<thread> X<threads> X<ithread> |
| 588 | |
| 589 | Perl supports a type of threads called interpreter threads (ithreads). |
| 590 | These threads can be used explicitly and implicitly. |
| 591 | |
| 592 | Ithreads work by cloning the data tree so that no data is shared |
| 593 | between different threads. These threads can be created with the C<threads> |
| 594 | module or by calling fork() on Win32 (fake fork() support). When a |
| 595 | thread is cloned, all Perl data is cloned; however, non-Perl data cannot |
| 596 | be cloned automatically. Perl after 5.8.0 has support for the C<CLONE> |
| 597 | special subroutine. In C<CLONE> you can do whatever you need |
| 598 | to do, such as handling the cloning of non-Perl data, if |
| 599 | necessary. |
| 600 | C<CLONE> will be called once as a class method for every package that has it |
| 601 | defined (or inherits it). It will be called in the context of the new thread, |
| 602 | so all modifications are made in the new area. Currently CLONE is called with |
| 603 | no parameters other than the invocant package name, but code should not assume |
| 604 | that this will remain unchanged, as it is likely that in future extra parameters |
| 605 | will be passed in to give more information about the state of cloning. |
| 606 | |
| 607 | If you want to CLONE all objects you will need to keep track of them per |
| 608 | package. This is simply done using a hash and Scalar::Util::weaken(). |
| 609 | |
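One possible layout for such tracking is sketched below. The class name C<Tracked>, the C<%REGISTRY> hash, the C<clones> field, and C<live_count> are all assumptions made for illustration; the per-object work inside C<CLONE> is a placeholder for whatever a real module needs to do.

```perl
use strict;
use warnings;

{
    package Tracked;                  # hypothetical class name
    use Scalar::Util qw(weaken refaddr);

    my %REGISTRY;                     # address => weak reference to object

    sub new {
        my ($class, %args) = @_;
        my $self = bless {%args}, $class;
        # Store a weak copy so the registry never keeps an object alive.
        weaken( $REGISTRY{ refaddr($self) } = $self );
        return $self;
    }

    sub DESTROY {
        my ($self) = @_;
        delete $REGISTRY{ refaddr($self) };
    }

    # Called as a class method in each new thread. Cloned objects have
    # new addresses, so rebuild the registry and fix up each object.
    sub CLONE {
        my ($class) = @_;
        my @objects = grep { defined } values %REGISTRY;
        %REGISTRY = ();
        for my $object (@objects) {
            weaken( $REGISTRY{ refaddr($object) } = $object );
            $object->{clones}++;      # placeholder for real per-object work
        }
        return;
    }

    # Illustrative helper: number of objects currently tracked.
    sub live_count { return scalar grep { defined } values %REGISTRY }
}
```

Because the registry holds only weak references, objects are still destroyed as soon as the last ordinary reference to them goes away.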
| 610 | Perl after 5.8.7 has support for the C<CLONE_SKIP> special subroutine. |
| 611 | Like C<CLONE>, C<CLONE_SKIP> is called once per package; however, it is |
| 612 | called just before cloning starts, and in the context of the parent |
| 613 | thread. If it returns a true value, then no objects of that class will |
| 614 | be cloned; or rather, they will be copied as unblessed, undef values. |
| 615 | For example: if in the parent there are two references to a single blessed |
| 616 | hash, then in the child there will be two references to a single undefined |
| 617 | scalar value instead. |
| 618 | This provides a simple mechanism for making a module threadsafe; just add |
| 619 | C<sub CLONE_SKIP { 1 }> at the top of the class, and C<DESTROY()> will |
| 620 | now only be called once per object. Of course, if the child thread needs |
| 621 | to make use of the objects, then a more sophisticated approach is |
| 622 | needed. |
| 623 | |
| 624 | Like C<CLONE>, C<CLONE_SKIP> is currently called with no parameters other |
| 625 | than the invocant package name, although that may change. Similarly, to |
| 626 | allow for future expansion, the return value should be a single C<0> or |
| 627 | C<1> value. |
| 628 | |
| 629 | =head1 SEE ALSO |
| 630 | |
| 631 | See L<perlmodlib> for general style issues related to building Perl |
| 632 | modules and classes, as well as descriptions of the standard library |
| 633 | and CPAN, L<Exporter> for how Perl's standard import/export mechanism |
| 634 | works, L<perlootut> for in-depth information on |
| 635 | creating classes, L<perlobj> for a hard-core reference document on |
| 636 | objects, L<perlsub> for an explanation of functions and scoping, |
| 637 | and L<perlxstut> and L<perlguts> for more information on writing |
| 638 | extension modules. |