Commit | Line | Data |
---|---|---|
a0d0e21e LW |
1 | =head1 NAME |
2 | ||
f102b883 | 3 | perlmod - Perl modules (packages and symbol tables) |
a0d0e21e LW |
4 | |
5 | =head1 DESCRIPTION | |
6 | ||
69520e41 E |
7 | =head2 Is this the document you were after? |
8 | ||
9 | There are other documents which might contain the information that you're | |
10 | looking for: | |
11 | ||
12 | =over 2 | |
13 | ||
14 | =item This doc | |
15 | ||
16 | Perl's packages, namespaces, and some info on classes. | |
17 | ||
18 | =item L<perlnewmod> | |
19 | ||
20 | Tutorial on making a new module. | |
21 | ||
22 | =item L<perlmodstyle> | |
23 | ||
24 | Best practices for making a new module. | |
25 | ||
26 | =back | |
27 | ||
a0d0e21e | 28 | =head2 Packages |
d74e8afc | 29 | X<package> X<namespace> X<variable, global> X<global variable> X<global> |
a0d0e21e | 30 | |
c2e08204 JK |
31 | Unlike Perl 4, in which all the variables were dynamic and shared one |
32 | global name space, causing maintainability problems, Perl 5 provides two | |
33 | mechanisms for protecting code from having its variables stomped on by | |
6bc3ceb8 YO |
34 | other code: lexically scoped variables created with C<my> or C<state> and |
35 | namespaced global variables, which are exposed via the C<vars> pragma, | |
36 | or the C<our> keyword. Any global variable is considered to | |
0ee4a8bd JK |
37 | be part of a namespace and can be accessed via a "fully qualified form". |
38 | Conversely, any lexically scoped variable is considered to be part of | |
6bc3ceb8 | 39 | that lexical-scope, and does not have a "fully qualified form". |
0ee4a8bd | 40 | |
6bc3ceb8 | 41 | In Perl, namespaces are called "packages" and |
0ee4a8bd JK |
42 | the C<package> declaration tells the compiler which |
43 | namespace to prefix to C<our> variables and unqualified dynamic names. | |
44 | This both protects | |
c2e08204 JK |
45 | against accidental stomping and provides an interface for deliberately |
46 | clobbering global dynamic variables declared and used in other scopes or | |
47 | packages, when that is what you want to do. | |
0ee4a8bd JK |
48 | |
49 | The scope of the C<package> declaration is from the | |
19799a22 | 50 | declaration itself through the end of the enclosing block, C<eval>, |
c2e08204 JK |
51 | or file, whichever comes first (the same scope as the my(), our(), state(), and |
52 | local() operators, and also the effect | |
53 | of the experimental "reference aliasing," which may change), or until | |
54 | the next C<package> declaration. Unqualified dynamic identifiers will be in | |
55 | this namespace, except for those few identifiers that, if unqualified, | |
19799a22 | 56 | default to the main package instead of the current one as described |
0ee4a8bd | 57 | below. A C<package> statement affects only dynamic global |
c2e08204 JK |
58 | symbols, including subroutine names, and variables you've used local() |
59 | on, but I<not> lexical variables created with my(), our() or state(). | |
0ee4a8bd JK |
60 | |
61 | Typically, a C<package> statement is the first declaration in a file | |
62 | included in a program by one of the C<do>, C<require>, or C<use> operators. You can | |
c2e08204 JK |
63 | switch into a package in more than one place: C<package> has no |
64 | effect beyond specifying which symbol table the compiler will use for | |
65 | dynamic symbols for the rest of that block or until the next C<package> statement. | |
66 | You can refer to variables and filehandles in other packages | |
19799a22 GS |
67 | by prefixing the identifier with the package name and a double |
68 | colon: C<$Package::Variable>. If the package name is null, the | |
69 | C<main> package is assumed. That is, C<$::sail> is equivalent to | |
70 | C<$main::sail>. | |
a0d0e21e | 71 | |
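As a minimal sketch of the qualification rules just described, the following shows two package variables with the same short name and how each can be reached from anywhere by its fully qualified name. The package name C<Inventory> and the sub C<describe_counts> are hypothetical, chosen only for illustration.

```perl
use strict;
use warnings;

{
    package Inventory;      # hypothetical package name
    our $count = 3;         # this is really $Inventory::count
}

our $count = 10;            # back in package main: $main::count

# Fully qualified names work from any package, and a null package
# name stands for main, so $::count is the same as $main::count.
sub describe_counts {
    return "Inventory=$Inventory::count main=$main::count short=$::count";
}

print describe_counts(), "\n";   # Inventory=3 main=10 short=10
```

The block around C<package Inventory> limits the scope of that declaration, so the code after the closing brace is compiled in C<main> again.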
d3ebb66b GS |
72 | The old package delimiter was a single quote, but double colon is now the |
73 | preferred delimiter, in part because it's more readable to humans, and | |
74 | in part because it's more readable to B<emacs> macros. It also makes C++ | |
75 | programmers feel like they know what's going on--as opposed to using the | |
76 | single quote as separator, which was there to make Ada programmers feel | |
14c715f4 | 77 | like they knew what was going on. Because the old-fashioned syntax is still |
d3ebb66b GS |
78 | supported for backwards compatibility, if you try to use a string like |
79 | C<"This is $owner's house">, you'll be accessing C<$owner::s>; that is, | |
80 | the $s variable in package C<owner>, which is probably not what you meant. | |
81 | Use braces to disambiguate, as in C<"This is ${owner}'s house">. | |
d74e8afc | 82 | X<::> X<'> |
a0d0e21e | 83 | |
19799a22 GS |
84 | Packages may themselves contain package separators, as in |
85 | C<$OUTER::INNER::var>. This implies nothing about the order of | |
86 | name lookups, however. There are no relative packages: all symbols | |
a0d0e21e LW |
87 | are either local to the current package, or must be fully qualified |
88 | from the outer package name down. For instance, there is nowhere | |
19799a22 | 89 | within package C<OUTER> that C<$INNER::var> refers to |
14c715f4 | 90 | C<$OUTER::INNER::var>. C<INNER> refers to a totally |
c2e08204 JK |
91 | separate global package. The custom of treating package names as a |
92 | hierarchy is very strong, but the language in no way enforces it. | |
19799a22 GS |
93 | |
94 | Only identifiers starting with letters (or underscore) are stored | |
95 | in a package's symbol table. All other symbols are kept in package | |
96 | C<main>, including all punctuation variables, like $_. In addition, | |
97 | when unqualified, the identifiers STDIN, STDOUT, STDERR, ARGV, | |
98 | ARGVOUT, ENV, INC, and SIG are forced to be in package C<main>, | |
14c715f4 | 99 | even when used for other purposes than their built-in ones. If you |
19799a22 GS |
100 | have a package called C<m>, C<s>, or C<y>, then you can't use the |
101 | qualified form of an identifier because it would be instead interpreted | |
102 | as a pattern match, a substitution, or a transliteration. | |
d74e8afc | 103 | X<variable, punctuation> |
19799a22 GS |
104 | |
105 | Variables beginning with underscore used to be forced into package | |
a0d0e21e | 106 | main, but we decided it was more useful for package writers to be able |
cb1a09d0 | 107 | to use leading underscore to indicate private variables and method names. |
b58b0d99 AT |
108 | However, variables and functions named with a single C<_>, such as |
109 | $_ and C<sub _>, are still forced into the package C<main>. See also | |
96090e4f | 110 | L<perlvar/"The Syntax of Variable Names">. |
a0d0e21e | 111 | |
19799a22 | 112 | C<eval>ed strings are compiled in the package in which the eval() was |
a0d0e21e | 113 | compiled. (Assignments to C<$SIG{}>, however, assume the signal |
748a9306 | 114 | handler specified is in the C<main> package. Qualify the signal handler |
a0d0e21e LW |
115 | name if you wish to have a signal handler in a package.) For an |
116 | example, examine F<perldb.pl> in the Perl library. It initially switches | |
117 | to the C<DB> package so that the debugger doesn't interfere with variables | |
19799a22 | 118 | in the program you are trying to debug. At various points, however, it |
a0d0e21e LW |
119 | temporarily switches back to the C<main> package to evaluate various |
120 | expressions in the context of the C<main> package (or wherever you came | |
121 | from). See L<perldebug>. | |
122 | ||
f102b883 | 123 | The special symbol C<__PACKAGE__> contains the current package, but cannot |
c2e08204 JK |
124 | (easily) be used to construct variable names. After C<my($foo)> has hidden |
125 | package variable C<$foo>, it can still be accessed, without knowing what | |
126 | package you are in, as C<${__PACKAGE__.'::foo'}>. | |
f102b883 | 127 | |
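The following is a small sketch of that technique, assuming the code runs in package C<main>. The sub name C<both_foos> is hypothetical. Note that building a variable name from a string is a symbolic reference, so C<strict 'refs'> has to be switched off for that expression.

```perl
use strict;
use warnings;

our $foo = 'package value';       # the package variable $main::foo

sub both_foos {
    my $foo = 'lexical value';    # hides the package variable by name
    no strict 'refs';             # a symbolic reference is required here
    return ($foo, ${ __PACKAGE__ . '::foo' });
}

print join(' / ', both_foos()), "\n";   # lexical value / package value
```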
5f05dabc | 128 | See L<perlsub> for other scoping issues related to my() and local(), |
f102b883 | 129 | and L<perlref> regarding closures. |
cb1a09d0 | 130 | |
a0d0e21e | 131 | =head2 Symbol Tables |
d74e8afc | 132 | X<symbol table> X<stash> X<%::> X<%main::> X<typeglob> X<glob> X<alias> |
a0d0e21e | 133 | |
aa689395 | 134 | The symbol table for a package happens to be stored in the hash of that |
135 | name with two colons appended. The main symbol table's name is thus | |
5803be0d | 136 | C<%main::>, or C<%::> for short. Likewise the symbol table for the nested |
aa689395 | 137 | package mentioned earlier is named C<%OUTER::INNER::>. |
138 | ||
139 | The value in each entry of the hash is what you are referring to when you | |
8c44bff1 | 140 | use the C<*name> typeglob notation. |
a0d0e21e | 141 | |
f102b883 | 142 | local *main::foo = *main::bar; |
bc8df162 | 143 | |
a0d0e21e | 144 | You can use this to print out all the variables in a package, for |
4375e838 | 145 | instance. The standard but antiquated F<dumpvar.pl> library and |
19799a22 | 146 | the CPAN module Devel::Symdump make use of this. |
a0d0e21e | 147 | |
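For instance, a rough sketch of walking a symbol table might look like the following. The package C<Demo> and the helper C<sub_names> are hypothetical. The code only reads the symbol table; it does not create or modify entries.

```perl
use strict;
use warnings;

{
    package Demo;               # hypothetical package with a few symbols
    our $alpha = 1;
    our @beta  = (1, 2);
    sub gamma { 'g' }
}

# Return the sorted names in a package that have a defined subroutine.
sub sub_names {
    my ($package) = @_;
    no strict 'refs';           # the stash is looked up by name
    my $stash = \%{"${package}::"};
    my @names = grep { defined &{"${package}::$_"} } keys %$stash;
    return sort @names;
}

print "all names: ", join(', ', sort keys %Demo::), "\n";
print "subs only: ", join(', ', sub_names('Demo')), "\n";
```

The keys of C<%Demo::> are bare identifiers without sigils, which is why the helper has to probe the code slot to tell subroutines from variables.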
993e39b1 | 148 | The results of creating new symbol table entries directly or modifying any |
fa4ec284 | 149 | entries that are not already typeglobs are undefined and subject to change |
993e39b1 FC |
150 | between releases of perl. |
151 | ||
cb1a09d0 | 152 | Assignment to a typeglob performs an aliasing operation, i.e., |
a0d0e21e LW |
153 | |
154 | *dick = *richard; | |
155 | ||
5a964f20 TC |
156 | causes variables, subroutines, formats, and file and directory handles |
157 | accessible via the identifier C<richard> also to be accessible via the | |
158 | identifier C<dick>. If you want to alias only a particular variable or | |
19799a22 | 159 | subroutine, assign a reference instead: |
a0d0e21e LW |
160 | |
161 | *dick = \$richard; | |
162 | ||
5a964f20 | 163 | This makes $richard and $dick the same variable, but leaves |
a0d0e21e LW |
164 | @richard and @dick as separate arrays. Tricky, eh? |
165 | ||
5e76a0e2 MC |
166 | There is one subtle difference between the following statements: |
167 | ||
168 | *foo = *bar; | |
169 | *foo = \$bar; | |
170 | ||
171 | C<*foo = *bar> makes the typeglobs themselves synonymous while | |
172 | C<*foo = \$bar> makes the SCALAR portions of two distinct typeglobs | |
173 | refer to the same scalar value. This means that the following code: | |
174 | ||
175 | $bar = 1; | |
176 | *foo = \$bar; # Make $foo an alias for $bar | |
177 | ||
178 | { | |
179 | local $bar = 2; # Restrict changes to block | |
180 | print $foo; # Prints '1'! | |
181 | } | |
182 | ||
183 | would print '1', because C<$foo> holds a reference to the I<original> | |
ac036724 | 184 | C<$bar>, the one that was stuffed away by C<local()> and which will be |
5e76a0e2 MC |
185 | restored when the block ends. Because variables are accessed through the |
186 | typeglob, you can use C<*foo = *bar> to create an alias which can be | |
187 | localized. (But be aware that this means you can't have a separate | |
188 | C<@foo> and C<@bar>, etc.) | |
189 | ||
190 | What makes all of this important is that the Exporter module uses glob | |
191 | aliasing as the import/export mechanism. Whether or not you can properly | |
192 | localize a variable that has been exported from a module depends on how | |
193 | it was exported: | |
194 | ||
195 | @EXPORT = qw($FOO); # Usual form, can't be localized | |
196 | @EXPORT = qw(*FOO); # Can be localized | |
197 | ||
14c715f4 | 198 | You can work around the first case by using the fully qualified name |
5e76a0e2 MC |
199 | (C<$Package::FOO>) where you need a local value, or by overriding it |
200 | by saying C<*FOO = *Package::FOO> in your script. | |
201 | ||
202 | The C<*x = \$y> mechanism may be used to pass and return cheap references | |
5803be0d | 203 | into or from subroutines if you don't want to copy the whole |
5a964f20 TC |
204 | thing. It only works when assigning to dynamic variables, not |
205 | lexicals. | |
cb1a09d0 | 206 | |
5a964f20 | 207 | %some_hash = (); # can't be my() |
cb1a09d0 AD |
208 | *some_hash = fn( \%another_hash ); |
209 | sub fn { | |
210 | local *hashsym = shift; | |
211 | # now use %hashsym normally, and you | |
212 | # will affect the caller's %another_hash | |
213 | my %nhash = (); # do what you want | |
5f05dabc | 214 | return \%nhash; |
cb1a09d0 AD |
215 | } |
216 | ||
5f05dabc | 217 | On return, the reference will overwrite the hash slot in the |
cb1a09d0 | 218 | symbol table specified by the *some_hash typeglob. This |
c36e9b62 | 219 | is a somewhat tricky way of passing around references cheaply |
5803be0d | 220 | when you don't want to have to remember to dereference variables |
cb1a09d0 AD |
221 | explicitly. |
222 | ||
19799a22 | 223 | Another use of symbol tables is for making "constant" scalars. |
d74e8afc | 224 | X<constant> X<scalar, constant> |
cb1a09d0 AD |
225 | |
226 | *PI = \3.14159265358979; | |
227 | ||
bc8df162 | 228 | Now you cannot alter C<$PI>, which is probably a good thing all in all. |
5a964f20 | 229 | This isn't the same as a constant subroutine, which is subject to |
5803be0d | 230 | optimization at compile-time. A constant subroutine is one prototyped |
14c715f4 | 231 | to take no arguments and to return a constant expression. See |
5803be0d | 232 | L<perlsub> for details on these. The C<use constant> pragma is a |
5a964f20 | 233 | convenient shorthand for these. |
cb1a09d0 | 234 | |
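To illustrate the difference, here is a short sketch that sets up both kinds of constant. The constant name C<E> and the helper C<pi_is_read_only> are hypothetical. The helper relies on Perl raising a "Modification of a read-only value attempted" error when code assigns to the aliased scalar.

```perl
use strict;
use warnings;
use constant E => 2.718281828;   # constant subroutine, resolved at compile time

our $PI;
*PI = \3.14159265358979;         # alias $PI to a read-only literal

# Try to assign to $PI and report whether Perl refused.
sub pi_is_read_only {
    my $ok = eval { $PI = 3; 1 };
    return (!$ok && $@ =~ /read-only/) ? 1 : 0;
}

print "PI is read-only\n" if pi_is_read_only();
print "E = ", E, "\n";           # E = 2.718281828
```

The scalar form interpolates in strings like any other variable, while the subroutine form does not; which one suits a module better depends on how the value is used.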
55497cff | 235 | You can say C<*foo{PACKAGE}> and C<*foo{NAME}> to find out what name and |
236 | package the *foo symbol table entry comes from. This may be useful | |
5a964f20 | 237 | in a subroutine that gets passed typeglobs as arguments: |
55497cff | 238 | |
239 | sub identify_typeglob { | |
240 | my $glob = shift; | |
555bd962 BG |
241 | print 'You gave me ', *{$glob}{PACKAGE}, |
242 | '::', *{$glob}{NAME}, "\n"; | |
55497cff | 243 | } |
244 | identify_typeglob *foo; | |
245 | identify_typeglob *bar::baz; | |
246 | ||
247 | This prints | |
248 | ||
249 | You gave me main::foo | |
250 | You gave me bar::baz | |
251 | ||
19799a22 | 252 | The C<*foo{THING}> notation can also be used to obtain references to the |
5803be0d | 253 | individual elements of *foo. See L<perlref>. |
55497cff | 254 | |
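As a brief sketch of that notation, the following pulls three different slots out of one typeglob. The name C<greeting> and the helper C<slot_summary> are hypothetical.

```perl
use strict;
use warnings;

our $greeting = 'hello';
our @greeting = ('hi', 'hey');
sub greeting { 'howdy' }

# Fetch references to individual slots of the *greeting typeglob.
sub slot_summary {
    my $scalar_ref = *greeting{SCALAR};
    my $array_ref  = *greeting{ARRAY};
    my $code_ref   = *greeting{CODE};
    return join ' ', $$scalar_ref, scalar(@$array_ref), $code_ref->();
}

print slot_summary(), "\n";      # hello 2 howdy
```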
9263d47b GS |
255 | Subroutine definitions (and declarations, for that matter) need |
256 | not necessarily be situated in the package whose symbol table they | |
257 | occupy. You can define a subroutine outside its package by | |
258 | explicitly qualifying the name of the subroutine: | |
259 | ||
260 | package main; | |
261 | sub Some_package::foo { ... } # &foo defined in Some_package | |
262 | ||
263 | This is just a shorthand for a typeglob assignment at compile time: | |
264 | ||
265 | BEGIN { *Some_package::foo = sub { ... } } | |
266 | ||
267 | and is I<not> the same as writing: | |
268 | ||
269 | { | |
270 | package Some_package; | |
271 | sub foo { ... } | |
272 | } | |
273 | ||
274 | In the first two versions, the body of the subroutine is | |
275 | lexically in the main package, I<not> in Some_package. So | |
276 | something like this: | |
277 | ||
278 | package main; | |
279 | ||
280 | $Some_package::name = "fred"; | |
281 | $main::name = "barney"; | |
282 | ||
283 | sub Some_package::foo { | |
284 | print "in ", __PACKAGE__, ": \$name is '$name'\n"; | |
285 | } | |
286 | ||
287 | Some_package::foo(); | |
288 | ||
289 | prints: | |
290 | ||
291 | in main: $name is 'barney' | |
292 | ||
293 | rather than: | |
294 | ||
295 | in Some_package: $name is 'fred' | |
296 | ||
297 | This also has implications for the use of the SUPER:: qualifier | |
298 | (see L<perlobj>). | |
299 | ||
3c10abe3 AG |
300 | =head2 BEGIN, UNITCHECK, CHECK, INIT and END |
301 | X<BEGIN> X<UNITCHECK> X<CHECK> X<INIT> X<END> | |
ac90fb77 | 302 | |
3c10abe3 AG |
303 | Five specially named code blocks are executed at the beginning and at |
304 | the end of a running Perl program. These are the C<BEGIN>, | |
305 | C<UNITCHECK>, C<CHECK>, C<INIT>, and C<END> blocks. | |
ac90fb77 EM |
306 | |
307 | These code blocks can be prefixed with C<sub> to give the appearance of a | |
308 | subroutine (although this is not considered good style). One should note | |
309 | that these code blocks don't really exist as named subroutines (despite | |
310 | their appearance). The thing that gives this away is the fact that you can | |
311 | have B<more than one> of these code blocks in a program, and they will B<all> | |
312 | get executed at the appropriate moment. So you can't execute any of | |
313 | these code blocks by name. | |
314 | ||
315 | A C<BEGIN> code block is executed as soon as possible, that is, the moment | |
316 | it is completely defined, even before the rest of the containing file (or | |
317 | string) is parsed. You may have multiple C<BEGIN> blocks within a file (or | |
ac036724 | 318 | eval'ed string); they will execute in order of definition. Because a C<BEGIN> |
ac90fb77 EM |
319 | code block executes immediately, it can pull in definitions of subroutines |
320 | and such from other files in time to be visible to the rest of the compile | |
321 | and run time. Once a C<BEGIN> has run, it is immediately undefined and any | |
322 | code it used is returned to Perl's memory pool. | |
323 | ||
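A minimal sketch of this ordering: the two C<BEGIN> blocks below appear after an ordinary statement in the file, yet both run first because they execute during compilation. The array name C<@order> is hypothetical.

```perl
use strict;
use warnings;

our @order;                          # declared first so every block can see it

push @order, 'runtime';              # ordinary statement, runs at run time

BEGIN { push @order, 'begin 1' }     # runs as soon as it has been compiled
BEGIN { push @order, 'begin 2' }     # BEGIN blocks run in order of definition

print join(', ', @order), "\n";      # begin 1, begin 2, runtime
```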
ac90fb77 | 324 | An C<END> code block is executed as late as possible, that is, after |
4f25aa18 GS |
325 | perl has finished running the program and just before the interpreter |
326 | exits, even if it is exiting as a result of a die() function. | |
3bf5301d | 327 | (But not if it's morphing into another program via C<exec>, or |
4f25aa18 GS |
328 | being blown out of the water by a signal--you have to trap that yourself |
329 | (if you can).) You may have multiple C<END> blocks within a file--they | |
330 | will execute in reverse order of definition; that is: last in, first | |
331 | out (LIFO). C<END> blocks are not executed when you run perl with the | |
db517d64 | 332 | C<-c> switch, or if compilation fails. |
a0d0e21e | 333 | |
ac90fb77 EM |
334 | Note that C<END> code blocks are B<not> executed at the end of a string |
335 | C<eval()>: if any C<END> code blocks are created in a string C<eval()>, | |
336 | they will be executed just as any other C<END> code block of that package | |
337 | in LIFO order just before the interpreter exits. | |
338 | ||
339 | Inside an C<END> code block, C<$?> contains the value that the program is | |
c36e9b62 | 340 | going to pass to C<exit()>. You can modify C<$?> to change the exit |
19799a22 | 341 | value of the program. Beware of changing C<$?> by accident (e.g. by |
c36e9b62 | 342 | running something via C<system>). |
d74e8afc | 343 | X<$?> |
c36e9b62 | 344 | |
191f4b8c CO |
345 | Inside of an C<END> block, the value of C<${^GLOBAL_PHASE}> will be | |
346 | C<"END">. | |
347 | ||
3c10abe3 AG |
348 | C<UNITCHECK>, C<CHECK> and C<INIT> code blocks are useful to catch the |
349 | transition between the compilation phase and the execution phase of | |
350 | the main program. | |
351 | ||
352 | C<UNITCHECK> blocks are run just after the unit which defined them has | |
353 | been compiled. The main program file and each module it loads are | |
68e2671b | 354 | compilation units, as are string C<eval>s, run-time code compiled using the |
3c10abe3 AG |
355 | C<(?{ })> construct in a regex, calls to C<do FILE>, C<require FILE>, |
356 | and code after the C<-e> switch on the command line. | |
ca62f0fc | 357 | |
191f4b8c CO |
358 | C<BEGIN> and C<UNITCHECK> blocks are not directly related to the phase of |
359 | the interpreter. They can be created and executed during any phase. | |
360 | ||
ac90fb77 EM |
361 | C<CHECK> code blocks are run just after the B<initial> Perl compile phase ends |
362 | and before the run time begins, in LIFO order. C<CHECK> code blocks are used | |
363 | in the Perl compiler suite to save the compiled state of the program. | |
ca62f0fc | 364 | |
191f4b8c CO |
365 | Inside of a C<CHECK> block, the value of C<${^GLOBAL_PHASE}> will be |
366 | C<"CHECK">. | |
367 | ||
ca62f0fc | 368 | C<INIT> blocks are run just before the Perl runtime begins execution, in |
59f521f4 | 369 | "first in, first out" (FIFO) order. |
4f25aa18 | 370 | |
191f4b8c CO |
371 | Inside of an C<INIT> block, the value of C<${^GLOBAL_PHASE}> will be C<"INIT">. |
372 | ||
9e923162 CO |
373 | The C<CHECK> and C<INIT> blocks in code compiled by C<require>, string C<do>, |
374 | or string C<eval> will not be executed if they occur after the end of the | |
375 | main compilation phase; that can be a problem in mod_perl and other persistent | |
376 | environments which use those functions to load code at runtime. | |
98107fc7 | 377 | |
19799a22 | 378 | When you use the B<-n> and B<-p> switches to Perl, C<BEGIN> and |
4375e838 GS |
379 | C<END> work just as they do in B<awk>, as a degenerate case. |
380 | Both C<BEGIN> and C<CHECK> blocks are run when you use the B<-c> | |
381 | switch for a compile-only syntax check, although your main code | |
382 | is not. | |
a0d0e21e | 383 | |
055634da TP |
384 | The B<begincheck> program makes it all clear, eventually: |
385 | ||
386 | #!/usr/bin/perl | |
387 | ||
388 | # begincheck | |
389 | ||
3c10abe3 | 390 | print "10. Ordinary code runs at runtime.\n"; |
055634da | 391 | |
3c10abe3 AG |
392 | END { print "16. So this is the end of the tale.\n" } |
393 | INIT { print " 7. INIT blocks run FIFO just before runtime.\n" } | |
394 | UNITCHECK { | |
395 | print " 4. And therefore before any CHECK blocks.\n" | |
396 | } | |
397 | CHECK { print " 6. So this is the sixth line.\n" } | |
055634da | 398 | |
3c10abe3 | 399 | print "11. It runs in order, of course.\n"; |
055634da TP |
400 | |
401 | BEGIN { print " 1. BEGIN blocks run FIFO during compilation.\n" } | |
3c10abe3 AG |
402 | END { print "15. Read perlmod for the rest of the story.\n" } |
403 | CHECK { print " 5. CHECK blocks run LIFO after all compilation.\n" } | |
404 | INIT { print " 8. Run this again, using Perl's -c switch.\n" } | |
055634da | 405 | |
3c10abe3 | 406 | print "12. This is anti-obfuscated code.\n"; |
055634da | 407 | |
3c10abe3 | 408 | END { print "14. END blocks run LIFO at quitting time.\n" } |
055634da | 409 | BEGIN { print " 2. So this line comes out second.\n" } |
3c10abe3 AG |
410 | UNITCHECK { |
411 | print " 3. UNITCHECK blocks run LIFO after each file is compiled.\n" | |
412 | } | |
413 | INIT { print " 9. You'll see the difference right away.\n" } | |
055634da | 414 | |
555bd962 | 415 | print "13. It only _looks_ like it should be confusing.\n"; |
055634da TP |
416 | |
417 | __END__ | |
418 | ||
a0d0e21e | 419 | =head2 Perl Classes |
d74e8afc | 420 | X<class> X<@ISA> |
a0d0e21e | 421 | |
19799a22 | 422 | There is no special class syntax in Perl, but a package may act |
5a964f20 TC |
423 | as a class if it provides subroutines to act as methods. Such a |
424 | package may also derive some of its methods from another class (package) | |
14c715f4 | 425 | by listing the other package name(s) in its global @ISA array (which |
5a964f20 | 426 | must be a package global, not a lexical). |
4633a7c4 | 427 | |
82e1c0d9 | 428 | For more on this, see L<perlootut> and L<perlobj>. |
a0d0e21e LW |
429 | |
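As a small sketch of a package acting as a class, the following defines a base class and a subclass that inherits through C<@ISA>. The class names C<Animal> and C<Dog> and their methods are hypothetical.

```perl
use strict;
use warnings;

{
    package Animal;                    # hypothetical base class
    sub new   { my ($class, %args) = @_; return bless { %args }, $class }
    sub name  { my $self = shift; return $self->{name} }
    sub speak { my $self = shift; return $self->name . ' makes a sound' }
}

{
    package Dog;                       # hypothetical subclass
    our @ISA = ('Animal');             # a package global, not a lexical
    sub speak { my $self = shift; return $self->name . ' barks' }
}

my $dog = Dog->new(name => 'Rex');     # new() is inherited from Animal
print $dog->speak, "\n";               # Rex barks
```

Because C<@ISA> is assigned at run time here, the blocks have to execute before the first method call; real modules usually avoid that ordering concern with C<use parent>.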
430 | =head2 Perl Modules | |
d74e8afc | 431 | X<module> |
a0d0e21e | 432 | |
5803be0d | 433 | A module is just a set of related functions in a library file, i.e., |
14c715f4 | 434 | a Perl package with the same name as the file. It is specifically |
5803be0d GS |
435 | designed to be reusable by other modules or programs. It may do this |
436 | by providing a mechanism for exporting some of its symbols into the | |
14c715f4 | 437 | symbol table of any package using it, or it may function as a class |
19799a22 GS |
438 | definition and make its semantics available implicitly through |
439 | method calls on the class and its objects, without explicitly | |
4375e838 | 440 | exporting anything. Or it can do a little of both. |
a0d0e21e | 441 | |
19799a22 GS |
442 | For example, to start a traditional, non-OO module called Some::Module, |
443 | create a file called F<Some/Module.pm> and start with this template: | |
9607fc9c | 444 | |
445 | package Some::Module; # assumes Some/Module.pm | |
446 | ||
447 | use strict; | |
9f1b1f2d | 448 | use warnings; |
9607fc9c | 449 | |
450 | BEGIN { | |
01d915c0 | 451 | require Exporter; |
9607fc9c | 452 | |
453 | # set the version for version checking | |
01d915c0 | 454 | our $VERSION = '1.00'; |
9607fc9c | 455 | |
01d915c0 MS |
456 | # Inherit from Exporter to export functions and variables |
457 | our @ISA = qw(Exporter); | |
9607fc9c | 458 | |
01d915c0 MS |
459 | # Functions and variables which are exported by default |
460 | our @EXPORT = qw(func1 func2); | |
461 | ||
462 | # Functions and variables which can be optionally exported | |
463 | our @EXPORT_OK = qw($Var1 %Hashit func3); | |
9607fc9c | 464 | } |
9607fc9c | 465 | |
3da4c8f2 | 466 | # exported package globals go here |
01d915c0 MS |
467 | our $Var1 = ''; |
468 | our %Hashit = (); | |
3da4c8f2 | 469 | |
9607fc9c | 470 | # non-exported package globals go here |
01d915c0 MS |
471 | # (they are still accessible as $Some::Module::stuff) |
472 | our @more = (); | |
473 | our $stuff = ''; | |
9607fc9c | 474 | |
01d915c0 | 475 | # file-private lexicals go here, before any functions which use them |
9607fc9c | 476 | my $priv_var = ''; |
477 | my %secret_hash = (); | |
478 | ||
479 | # here's a file-private function as a closure, | |
01d915c0 | 480 | # callable as $priv_func->(); |
9607fc9c | 481 | my $priv_func = sub { |
01d915c0 | 482 | ... |
9607fc9c | 483 | }; |
484 | ||
485 | # make all your functions, whether exported or not; | |
486 | # remember to put something interesting in the {} stubs | |
01d915c0 MS |
487 | sub func1 { ... } |
488 | sub func2 { ... } | |
9607fc9c | 489 | |
01d915c0 MS |
490 | # this one isn't exported, but could be called directly |
491 | # as Some::Module::func3() | |
492 | sub func3 { ... } | |
4633a7c4 | 493 | |
01d915c0 | 494 | END { ... } # module clean-up code here (global destructor) |
19799a22 GS |
495 | |
496 | 1; # don't forget to return a true value from the file | |
497 | ||
498 | Then go on to declare and use your variables in functions without | |
499 | any qualifications. See L<Exporter> and L<perlmodlib> for | |
500 | details on mechanics and style issues in module creation. | |
4633a7c4 LW |
501 | |
502 | Perl modules are included into your program by saying | |
a0d0e21e LW |
503 | |
504 | use Module; | |
505 | ||
506 | or | |
507 | ||
508 | use Module LIST; | |
509 | ||
510 | This is exactly equivalent to | |
511 | ||
76503c97 | 512 | BEGIN { require 'Module.pm'; 'Module'->import; } |
a0d0e21e LW |
513 | |
514 | or | |
515 | ||
76503c97 | 516 | BEGIN { require 'Module.pm'; 'Module'->import( LIST ); } |
a0d0e21e | 517 | |
cb1a09d0 AD |
518 | As a special case |
519 | ||
520 | use Module (); | |
521 | ||
522 | is exactly equivalent to | |
523 | ||
76503c97 | 524 | BEGIN { require 'Module.pm'; } |
cb1a09d0 | 525 | |
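To make the equivalence concrete, here is a sketch that spells out by hand what C<use List::Util qw(sum max)> would do, using the bareword form of C<require>. List::Util is a core module; the sub name C<total_and_top> is hypothetical.

```perl
use strict;
use warnings;

# Roughly what "use List::Util qw(sum max);" does:
BEGIN {
    require List::Util;
    List::Util->import(qw(sum max));
}

# The import ran at compile time, so sum and max are already known
# here and can be called as list operators, without parentheses.
sub total_and_top {
    my $total = sum 1, 2, 3;
    my $top   = max 4, 9, 2;
    return ($total, $top);
}

my ($total, $top) = total_and_top();
print "total=$total top=$top\n";     # total=6 top=9
```

Without the C<BEGIN> wrapper the names would only be imported at run time, and the calls without parentheses would fail to compile under C<strict>.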
19799a22 GS |
526 | All Perl module files have the extension F<.pm>. The C<use> operator |
527 | assumes this so you don't have to spell out "F<Module.pm>" in quotes. | |
528 | This also helps to differentiate new modules from old F<.pl> and | |
529 | F<.ph> files. Module names are also capitalized unless they're | |
530 | functioning as pragmas; pragmas are in effect compiler directives, | |
531 | and are sometimes called "pragmatic modules" (or even "pragmata" | |
532 | if you're a classicist). | |
a0d0e21e | 533 | |
5a964f20 TC |
534 | The two statements: |
535 | ||
536 | require SomeModule; | |
14c715f4 | 537 | require "SomeModule.pm"; |
5a964f20 TC |
538 | |
539 | differ from each other in two ways. In the first case, any double | |
540 | colons in the module name, such as C<Some::Module>, are translated | |
541 | into your system's directory separator, usually "/". The second | |
19799a22 GS |
542 | case does not, and would have to be specified literally. The other |
543 | difference is that seeing the first C<require> clues in the compiler | |
544 | that uses of indirect object notation involving "SomeModule", as | |
545 | in C<$ob = purge SomeModule>, are method calls, not function calls. | |
546 | (Yes, this really can make a difference.) | |
547 | ||
548 | Because the C<use> statement implies a C<BEGIN> block, the importing | |
549 | of semantics happens as soon as the C<use> statement is compiled, | |
a0d0e21e LW |
550 | before the rest of the file is compiled. This is how it is able |
551 | to function as a pragma mechanism, and also how modules are able to | |
19799a22 | 552 | declare subroutines that are then visible as list or unary operators for |
a0d0e21e | 553 | the rest of the current file. This will not work if you use C<require> |
19799a22 | 554 | instead of C<use>. With C<require> you can get into this problem: |
a0d0e21e LW |
555 | |
556 | require Cwd; # make Cwd:: accessible | |
54310121 | 557 | $here = Cwd::getcwd(); |
a0d0e21e | 558 | |
5f05dabc | 559 | use Cwd; # import names from Cwd:: |
a0d0e21e LW |
560 | $here = getcwd(); |
561 | ||
562 | require Cwd; # make Cwd:: accessible | |
563 | $here = getcwd(); # oops! no main::getcwd() | |
564 | ||
5a964f20 TC |
565 | In general, C<use Module ()> is recommended over C<require Module>, |
566 | because it determines module availability at compile time, not in the | |
567 | middle of your program's execution. An exception would be if two modules | |
568 | each tried to C<use> each other, and each also called a function from | |
14c715f4 | 569 | that other module. In that case, it's easy to use C<require> instead. |
cb1a09d0 | 570 | |
a0d0e21e LW |
571 | Perl packages may be nested inside other package names, so we can have |
572 | package names containing C<::>. But if we used that package name | |
5803be0d | 573 | directly as a filename it would make for unwieldy or impossible |
a0d0e21e LW |
574 | filenames on some systems. Therefore, if a module's name is, say, |
575 | C<Text::Soundex>, then its definition is actually found in the library | |
576 | file F<Text/Soundex.pm>. | |
577 | ||
19799a22 GS |
578 | Perl modules always have a F<.pm> file, but there may also be |
579 | dynamically linked executables (often ending in F<.so>) or autoloaded | |
5803be0d | 580 | subroutine definitions (often ending in F<.al>) associated with the |
19799a22 GS |
581 | module. If so, these will be entirely transparent to the user of |
582 | the module. It is the responsibility of the F<.pm> file to load | |
583 | (or arrange to autoload) any additional functionality. For example, | |
584 | although the POSIX module happens to do both dynamic loading and | |
5803be0d | 585 | autoloading, the user can say just C<use POSIX> to get it all. |
a0d0e21e | 586 | |
f2fc0a40 | 587 | =head2 Making your module threadsafe |
d74e8afc ITB |
588 | X<threadsafe> X<thread safe> |
589 | X<module, threadsafe> X<module, thread safe> | |
590 | X<CLONE> X<CLONE_SKIP> X<thread> X<threads> X<ithread> | |
f2fc0a40 | 591 | |
8f416bb0 BF |
592 | Perl supports a type of threads called interpreter threads (ithreads). |
593 | These threads can be used explicitly and implicitly. | |
f2fc0a40 AB |
594 | |
595 | Ithreads work by cloning the data tree so that no data is shared | |
14c715f4 | 596 | between different threads. You can use these threads through the C<threads> |
4ebc451b JH |
597 | module or by calling fork() on Win32 (fake fork() support). When a |
598 | thread is cloned, all Perl data is cloned; however, non-Perl data cannot | |
8f416bb0 | 599 | be cloned automatically. Perl after 5.8.0 has support for the C<CLONE> |
4d5ff0dd | 600 | special subroutine. In C<CLONE> you can do whatever |
9660f481 | 601 | you need to do, |
4ebc451b | 602 | such as handling the cloning of non-Perl data, if necessary. |
38e4e52d NC |
603 | C<CLONE> will be called once as a class method for every package that has it |
604 | defined (or inherits it). It will be called in the context of the new thread, | |
605 | so all modifications are made in the new area. Currently CLONE is called with | |
7698aede | 606 | no parameters other than the invocant package name, but code should not assume |
38e4e52d NC |
607 | that this will remain unchanged, as it is likely that in future extra parameters |
608 | will be passed in to give more information about the state of cloning. | |
f2fc0a40 AB |
609 | |
610 | If you want to CLONE all objects you will need to keep track of them per | |
611 | package. This is simply done using a hash and Scalar::Util::weaken(). | |
612 | ||
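One way this bookkeeping might look is sketched below. It is only an illustration under stated assumptions: the class name C<Counter>, its registry layout, and the C<live_count> method are all hypothetical, and the C<CLONE> body simply rekeys the registry, since object addresses differ in a new thread. The code runs without threads; C<CLONE> would be called by Perl itself when a thread is created.

```perl
use strict;
use warnings;

{
    package Counter;                       # hypothetical class
    use Scalar::Util qw(weaken refaddr);

    my %REGISTRY;                          # address => weak ref to live object

    sub new {
        my ($class, %args) = @_;
        my $self = bless { %args }, $class;
        # Weak reference: the registry must not keep the object alive.
        weaken($REGISTRY{ refaddr $self } = $self);
        return $self;
    }

    sub DESTROY {
        my $self = shift;
        delete $REGISTRY{ refaddr $self };
    }

    sub live_count { return scalar grep { defined $_ } values %REGISTRY }

    # Called as a class method in the new thread: rebuild the registry
    # under the cloned objects' new addresses.
    sub CLONE {
        my $class = shift;
        my @objects = grep { defined $_ } values %REGISTRY;
        %REGISTRY = ();
        weaken($REGISTRY{ refaddr $_ } = $_) for @objects;
    }
}

my $first  = Counter->new(id => 1);
my $second = Counter->new(id => 2);
print Counter->live_count, "\n";           # 2
undef $second;                             # DESTROY removes its entry
print Counter->live_count, "\n";           # 1
```

A real module would do its per-object work, such as duplicating non-Perl resources, inside the loop in C<CLONE>.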
4d5ff0dd | 613 | Perl after 5.8.7 has support for the C<CLONE_SKIP> special subroutine. |
9660f481 DM |
614 | Like C<CLONE>, C<CLONE_SKIP> is called once per package; however, it is |
615 | called just before cloning starts, and in the context of the parent | |
616 | thread. If it returns a true value, then no objects of that class will | |
617 | be cloned; or rather, they will be copied as unblessed, undef values. | |
33de8e4a DM |
618 | For example: if in the parent there are two references to a single blessed |
619 | hash, then in the child there will be two references to a single undefined | |
620 | scalar value instead. | |
9660f481 | 621 | This provides a simple mechanism for making a module threadsafe; just add |
bca52ca1 | 622 | C<sub CLONE_SKIP { 1 }> at the top of the class, and C<DESTROY()> will |
9660f481 DM |
623 | now only be called once per object. Of course, if the child thread needs |
624 | to make use of the objects, then a more sophisticated approach is | |
625 | needed. | |
626 | ||
627 | Like C<CLONE>, C<CLONE_SKIP> is currently called with no parameters other | |
7698aede | 628 | than the invocant package name, although that may change. Similarly, to |
9660f481 DM |
629 | allow for future expansion, the return value should be a single C<0> or |
630 | C<1> value. | |
631 | ||
f102b883 | 632 | =head1 SEE ALSO |
cb1a09d0 | 633 | |
f102b883 | 634 | See L<perlmodlib> for general style issues related to building Perl |
19799a22 GS |
635 | modules and classes, as well as descriptions of the standard library |
636 | and CPAN, L<Exporter> for how Perl's standard import/export mechanism | |
82e1c0d9 | 637 | works, L<perlootut> for a tutorial introduction to |
19799a22 GS |
638 | creating classes, L<perlobj> for a hard-core reference document on |
639 | objects, L<perlsub> for an explanation of functions and scoping, | |
640 | and L<perlxstut> and L<perlguts> for more information on writing | |
641 | extension modules. |