This is a live mirror of the Perl 5 development currently hosted at https://github.com/perl/perl5
=head1 NAME

perlmod - Perl modules (packages and symbol tables)

=head1 DESCRIPTION

=head2 Packages

Perl provides a mechanism for alternative namespaces to protect packages
from stomping on each other's variables. In fact, there's really no such
thing as a global variable in Perl (although some identifiers default
to the main package instead of the current one). The package statement
declares the compilation unit as
being in the given namespace. The scope of the package declaration
is from the declaration itself through the end of the enclosing block,
C<eval>, C<sub>, or end of file, whichever comes first (the same scope
as the my() and local() operators). All further unqualified dynamic
identifiers will be in this namespace. A package statement only affects
dynamic variables--including those you've used local() on--but
I<not> lexical variables created with my(). Typically it would be
the first declaration in a file to be included by the C<require> or
C<use> operator. You can switch into a package in more than one place;
it merely influences which symbol table is used by the compiler for the
rest of that block. You can refer to variables and filehandles in other
packages by prefixing the identifier with the package name and a double
colon: C<$Package::Variable>. If the package name is null, the C<main>
package is assumed. That is, C<$::sail> is equivalent to C<$main::sail>.

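The scoping and qualification rules above can be sketched as follows. This is a minimal illustration, not code from the Perl distribution; the package name C<Boat> and the values are made up for the example.

```perl
use strict;
use warnings;

our $sail = 'mainsail';          # this is $main::sail

{
    package Boat;                # in effect until the closing brace
    our $sail = 'jib';           # this is $Boat::sail
    sub describe { return "Boat has a $sail; main has a $::sail" }
}

# Back in package main here.
my $report = Boat::describe();
print "$report\n";
print "$Boat::sail / $main::sail\n";
```

Because the package statement sits inside a bare block, the code after the closing brace is compiled in C<main> again.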
The old package delimiter was a single quote, but double colon is now the
preferred delimiter, in part because it's more readable to humans, and
in part because it's more readable to B<emacs> macros. It also makes C++
programmers feel like they know what's going on--as opposed to using the
single quote as separator, which was there to make Ada programmers feel
like they knew what was going on. Because the old-fashioned syntax is still
supported for backwards compatibility, if you try to use a string like
C<"This is $owner's house">, you'll be accessing C<$owner::s>; that is,
the $s variable in package C<owner>, which is probably not what you meant.
Use braces to disambiguate, as in C<"This is ${owner}'s house">.

Packages may be nested inside other packages: C<$OUTER::INNER::var>. This
implies nothing about the order of name lookups, however. All symbols
are either local to the current package, or must be fully qualified
from the outer package name down. For instance, there is nowhere
within package C<OUTER> that C<$INNER::var> refers to C<$OUTER::INNER::var>.
It would treat package C<INNER> as a totally separate global package.

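A short sketch of the point about name lookups, reusing the C<OUTER> and C<INNER> names from the text. The subroutine names C<short_name> and C<full_name> are hypothetical and exist only for this illustration.

```perl
use strict;
use warnings;

$OUTER::INNER::var = 'nested';
$INNER::var        = 'top-level';

package OUTER;
# No relative lookup happens: this names package INNER, not OUTER::INNER.
sub short_name { return $INNER::var }
# The nested package has to be spelled out from the outermost name down.
sub full_name  { return $OUTER::INNER::var }

package main;
my ($short, $full) = (OUTER::short_name(), OUTER::full_name());
print "short: $short, full: $full\n";
```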
Only identifiers starting with letters (or underscore) are stored in a
package's symbol table. All other symbols are kept in package C<main>,
including all of the punctuation variables like $_. In addition, when
unqualified, the identifiers STDIN, STDOUT, STDERR, ARGV, ARGVOUT, ENV,
INC, and SIG are forced to be in package C<main>, even when used for other
purposes than their builtin one. Note also that, if you have a package
called C<m>, C<s>, or C<y>, then you can't use the qualified form of an
identifier because it will be interpreted instead as a pattern match,
a substitution, or a transliteration.

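One way to check the rule about forced identifiers is to compare references, as in this sketch. The package C<Elsewhere> and its subroutine names are invented for the example.

```perl
use strict;
use warnings;

package Elsewhere;

# Unqualified %ENV and @INC here still name the variables in package main,
# so the references compare equal.
sub env_is_main { return \%ENV == \%main::ENV }
sub inc_is_main { return \@INC == \@main::INC }

our $setting = 'per-package';    # an ordinary name lives in Elsewhere

package main;
my @checks = (Elsewhere::env_is_main(), Elsewhere::inc_is_main());
print "ENV and INC are shared with main\n" unless grep { !$_ } @checks;
```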
(Variables beginning with underscore used to be forced into package
main, but we decided it was more useful for package writers to be able
to use leading underscore to indicate private variables and method names.
$_ is still global though.)

Eval()ed strings are compiled in the package in which the eval() was
compiled. (Assignments to C<$SIG{}>, however, assume the signal
handler specified is in the C<main> package. Qualify the signal handler
name if you wish to have a signal handler in a package.) For an
example, examine F<perldb.pl> in the Perl library. It initially switches
to the C<DB> package so that the debugger doesn't interfere with variables
in the script you are trying to debug. At various points, however, it
temporarily switches back to the C<main> package to evaluate various
expressions in the context of the C<main> package (or wherever you came
from). See L<perldebug>.

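The first sentence can be observed with a tiny sketch: a string eval called from C<main> still compiles its string in the package where the eval itself was compiled. The package C<Sandbox> and the subroutine C<where_am_i> are hypothetical names.

```perl
use strict;
use warnings;

package Sandbox;
# The string is compiled in Sandbox, because that is where this eval
# was compiled, regardless of who calls the subroutine.
sub where_am_i { return eval '__PACKAGE__' }

package main;
my $compiled_in = Sandbox::where_am_i();
print "$compiled_in\n";
```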
The special symbol C<__PACKAGE__> contains the current package, but cannot
(easily) be used to construct variables.

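As a rough illustration of why this is not easy: C<__PACKAGE__> does not interpolate in strings, so a variable name has to be built by concatenation and reached through a symbolic reference. The package C<My::Tool> and its subroutines are assumptions made for this sketch.

```perl
use strict;
use warnings;

package My::Tool;
our $setting = 'on';

sub whoami { return __PACKAGE__ }

# Build the fully qualified name by hand and dereference it symbolically,
# which requires switching off strict refs for that one expression.
sub setting_by_name {
    no strict 'refs';
    return ${ __PACKAGE__ . '::setting' };
}

package main;
my $name  = My::Tool::whoami();
my $value = My::Tool::setting_by_name();
print "$name has setting $value\n";
```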
See L<perlsub> for other scoping issues related to my() and local(),
and L<perlref> regarding closures.

=head2 Symbol Tables

The symbol table for a package happens to be stored in the hash of that
name with two colons appended. The main symbol table's name is thus
C<%main::>, or C<%::> for short. Likewise, the symbol table for the nested
package mentioned earlier is named C<%OUTER::INNER::>.

The value in each entry of the hash is what you are referring to when you
use the C<*name> typeglob notation. In fact, the following have the same
effect, though the first is more efficient because it does the symbol
table lookups at compile time:

    local *main::foo    = *main::bar;
    local $main::{foo}  = $main::{bar};

You can use this to print out all the variables in a package, for
instance. The standard F<dumpvar.pl> library and the CPAN module
Devel::Symdump make use of this.

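A minimal sketch of walking a symbol table is shown below. It is not how F<dumpvar.pl> or Devel::Symdump are implemented; the package C<Demo> and its contents are invented, and the loop only reports the scalar, array, hash and code slots.

```perl
use strict;
use warnings;

package Demo;
our $count = 3;
our @items = (1, 2);
sub hello { return 'hi' }

package main;

my @found;
for my $name (sort keys %Demo::) {
    no strict 'refs';                      # needed for the symbolic glob lookup
    my $glob = \*{"Demo::$name"};
    push @found, '$' . $name if defined ${ *{$glob}{SCALAR} };
    push @found, '@' . $name if defined *{$glob}{ARRAY};
    push @found, '%' . $name if defined *{$glob}{HASH};
    push @found, '&' . $name if defined *{$glob}{CODE};
}
print "$_\n" for @found;
```

Looking the glob up by name, rather than using the hash value directly, sidesteps the fact that an entry in the symbol table hash is not guaranteed to be a full typeglob on every version of Perl.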
Assignment to a typeglob performs an aliasing operation, i.e.,

    *dick = *richard;

causes variables, subroutines, formats, and file and directory handles
accessible via the identifier C<richard> also to be accessible via the
identifier C<dick>. If you want to alias only a particular variable or
subroutine, you can assign a reference instead:

    *dick = \$richard;

This makes $richard and $dick the same variable, but leaves
@richard and @dick as separate arrays. Tricky, eh?

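The difference between aliasing a whole glob and aliasing one slot can be seen in a sketch like this one, which reuses the names from the text. The values are arbitrary.

```perl
use strict;
use warnings;

our ($richard, @richard, $dick, @dick);
$richard = 'Richard';
@richard = (1, 2, 3);

*dick = \$richard;          # alias only the scalar slot

$dick = 'Dick';             # changes $richard as well
push @dick, 'separate';     # @richard is left alone

print "$richard\n";
print scalar(@richard), " elements in \@richard, ",
      scalar(@dick), " in \@dick\n";
```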
This mechanism may be used to pass and return cheap references
into or from subroutines if you don't want to copy the whole
thing. It only works when assigning to dynamic variables, not
lexicals.

    %some_hash = ();                    # can't be my()
    *some_hash = fn( \%another_hash );
    sub fn {
        local *hashsym = shift;
        # now use %hashsym normally, and you
        # will affect the caller's %another_hash
        my %nhash = (); # do what you want
        return \%nhash;
    }

On return, the reference will overwrite the hash slot in the
symbol table specified by the *some_hash typeglob. This
is a somewhat tricky way of passing around references cheaply
when you don't want to have to remember to dereference variables
explicitly.

Another use of symbol tables is for making "constant" scalars.

    *PI = \3.14159265358979;

Now you cannot alter $PI, which is probably a good thing all in all.
This isn't the same as a constant subroutine, which is subject to
optimization at compile-time. This isn't. A constant subroutine is one
prototyped to take no arguments and to return a constant expression.
See L<perlsub> for details on these. The C<use constant> pragma is a
convenient shorthand for these.

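A small sketch of what "cannot alter" means in practice: the assignment dies at run time, so it is wrapped in an C<eval> here to capture the error. The variable names C<$ok> and C<$error> are just for the illustration.

```perl
use strict;
use warnings;

our $PI;
*PI = \3.14159265358979;    # $PI now aliases a read-only literal

# Attempting to assign raises a run-time exception.
my $ok    = eval { $PI = 3; 1 };
my $error = $@;

print "assignment failed: $error" unless $ok;
print "PI is still $PI\n";
```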
You can say C<*foo{PACKAGE}> and C<*foo{NAME}> to find out what name and
package the *foo symbol table entry comes from. This may be useful
in a subroutine that gets passed typeglobs as arguments:

    sub identify_typeglob {
        my $glob = shift;
        print 'You gave me ', *{$glob}{PACKAGE}, '::', *{$glob}{NAME}, "\n";
    }
    identify_typeglob *foo;
    identify_typeglob *bar::baz;

This prints

    You gave me main::foo
    You gave me bar::baz

The *foo{THING} notation can also be used to obtain references to the
individual elements of *foo, see L<perlref>.

=head2 Package Constructors and Destructors

There are two special subroutine definitions that function as package
constructors and destructors. These are the C<BEGIN> and C<END>
routines. The C<sub> is optional for these routines.

A C<BEGIN> subroutine is executed as soon as possible, that is, the moment
it is completely defined, even before the rest of the containing file
is parsed. You may have multiple C<BEGIN> blocks within a file--they
will execute in order of definition. Because a C<BEGIN> block executes
immediately, it can pull in definitions of subroutines and such from other
files in time to be visible to the rest of the file. Once a C<BEGIN>
has run, it is immediately undefined and any code it used is returned to
Perl's memory pool. This means you can't ever explicitly call a C<BEGIN>.

An C<END> subroutine is executed as late as possible, that is, when
the interpreter is being exited, even if it is exiting as a result of
a die() function. (But not if it's polymorphing into another program
via C<exec>, or being blown out of the water by a signal--you have to
trap that yourself (if you can).) You may have multiple C<END> blocks
within a file--they will execute in reverse order of definition; that is:
last in, first out (LIFO).

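The ordering rules can be sketched as below. The array C<@order> is a hypothetical name used to record when each piece of code runs; the C<END> blocks simply print, since they run only as the interpreter exits.

```perl
use strict;
use warnings;

our @order;

push @order, 'runtime';                    # runs after every BEGIN block

BEGIN { push @order, 'first BEGIN' }       # runs as soon as it is compiled
BEGIN { push @order, 'second BEGIN' }      # runs next, in definition order

END { print "END defined first, runs last\n" }
END { print "END defined second, runs first\n" }

print join(', ', @order), "\n";
```

Although the C<push> of C<'runtime'> appears first in the file, both C<BEGIN> blocks have already run by the time it executes.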
Inside an C<END> subroutine, C<$?> contains the value that the script is
going to pass to C<exit()>. You can modify C<$?> to change the exit
value of the script. Beware of changing C<$?> by accident (e.g. by
running something via C<system>).

Note that when you use the B<-n> and B<-p> switches to Perl, C<BEGIN> and
C<END> work just as they do in B<awk>, as a degenerate case. As currently
implemented (and subject to change, since it's inconvenient at best),
both C<BEGIN> I<and> C<END> blocks are run when you use the B<-c> switch
for a compile-only syntax check, although your main code is not.

=head2 Perl Classes

There is no special class syntax in Perl, but a package may function
as a class if it provides subroutines to act as methods. Such a
package may also derive some of its methods from another class (package)
by listing the other package name in its global @ISA array (which
must be a package global, not a lexical).

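A minimal sketch of a package acting as a class, with inheritance through C<@ISA>. The classes C<Animal> and C<Dog> and their methods are invented for the illustration.

```perl
use strict;
use warnings;

package Animal;
sub new   { my ($class, %args) = @_; return bless {%args}, $class }
sub name  { my $self = shift; return $self->{name} }
sub speak { my $self = shift; return $self->name . ' makes a sound' }

package Dog;
our @ISA = ('Animal');      # a package global, not a my() variable
sub speak { my $self = shift; return $self->name . ' barks' }

package main;
my $dog  = Dog->new(name => 'Rex');   # new() is inherited from Animal
my $said = $dog->speak;               # speak() is overridden in Dog
print "$said\n";
```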
For more on this, see L<perltoot> and L<perlobj>.

=head2 Perl Modules

A module is just a package that is defined in a library file of
the same name, and is designed to be reusable. It may do this by
providing a mechanism for exporting some of its symbols into the symbol
table of any package using it. Or it may function as a class
definition and make its semantics available implicitly through method
calls on the class and its objects, without explicit exportation of any
symbols. Or it can do a little of both.

For example, to start a normal module called Some::Module, create
a file called Some/Module.pm and start with this template:

    package Some::Module;  # assumes Some/Module.pm

    use strict;

    BEGIN {
        use Exporter ();
        use vars qw($VERSION @ISA @EXPORT @EXPORT_OK %EXPORT_TAGS);

        # set the version for version checking
        $VERSION = 1.00;
        # if using RCS/CVS, this may be preferred
        $VERSION = do { my @r = (q$Revision: 2.21 $ =~ /\d+/g); sprintf "%d."."%02d" x $#r, @r }; # must be all one line, for MakeMaker

        @ISA = qw(Exporter);
        @EXPORT = qw(&func1 &func2 &func4);
        %EXPORT_TAGS = ( );     # eg: TAG => [ qw!name1 name2! ],

        # your exported package globals go here,
        # as well as any optionally exported functions
        @EXPORT_OK = qw($Var1 %Hashit &func3);
    }
    use vars @EXPORT_OK;

    # non-exported package globals go here
    use vars qw(@more $stuff);

    # initialize package globals, first exported ones
    $Var1 = '';
    %Hashit = ();

    # then the others (which are still accessible as $Some::Module::stuff)
    $stuff = '';
    @more = ();

    # all file-scoped lexicals must be created before
    # the functions below that use them.

    # file-private lexicals go here
    my $priv_var = '';
    my %secret_hash = ();

    # here's a file-private function as a closure,
    # callable as &$priv_func; it cannot be prototyped.
    my $priv_func = sub {
        # stuff goes here.
    };

    # make all your functions, whether exported or not;
    # remember to put something interesting in the {} stubs
    sub func1 {}        # no prototype
    sub func2() {}      # proto'd void
    sub func3($$) {}    # proto'd to 2 scalars

    # this one isn't exported, but could be called!
    sub func4(\%) {}    # proto'd to 1 hash ref

    END { }     # module clean-up code here (global destructor)

Then go on to declare and use your variables in functions
without any qualifications.
See L<Exporter> and the L<perlmodlib> for details on
mechanics and style issues in module creation.

Perl modules are included into your program by saying

    use Module;

or

    use Module LIST;

This is exactly equivalent to

    BEGIN { require Module; import Module; }

or

    BEGIN { require Module; import Module LIST; }

As a special case

    use Module ();

is exactly equivalent to

    BEGIN { require Module; }

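The equivalences above can be tried out with two core modules, Cwd and File::Basename, as in this sketch. The method-call form C<< Cwd->import(...) >> is used here in place of the indirect-object form C<import Module LIST> shown above; the variable names are arbitrary.

```perl
use strict;
use warnings;

# Spelled-out form of "use Cwd qw(getcwd);".
BEGIN { require Cwd; Cwd->import('getcwd'); }

# The imported name is usable because the BEGIN block ran at compile time.
my $here = getcwd();

# Spelled-out form of "use File::Basename ();": load, but import nothing.
BEGIN { require File::Basename; }

# Nothing was imported, so the function must be fully qualified.
my $base = File::Basename::basename('/tmp/example.txt');

print "cwd: $here, base: $base\n";
```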
All Perl module files have the extension F<.pm>. C<use> assumes this so
that you don't have to spell out "F<Module.pm>" in quotes. This also
helps to differentiate new modules from old F<.pl> and F<.ph> files.
Module names are also capitalized unless they're functioning as pragmas.
"Pragmas" are in effect compiler directives, and are sometimes called
"pragmatic modules" (or even "pragmata" if you're a classicist).

The two statements:

    require SomeModule;
    require "SomeModule.pm";

differ from each other in two ways. In the first case, any double
colons in the module name, such as C<Some::Module>, are translated
into your system's directory separator, usually "/". The second
case does not, and would have to be specified literally. The other difference
is that seeing the first C<require> clues in the compiler that uses of
indirect object notation involving "SomeModule", as in C<$ob = purge SomeModule>,
are method calls, not function calls. (Yes, this really can make a difference.)

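The first difference can be illustrated with two core modules standing in for C<SomeModule>. This is only a sketch; it checks C<%INC>, where Perl records each loaded file under its relative path.

```perl
use strict;
use warnings;

require File::Basename;      # bareword: "::" is translated, ".pm" is appended
require "File/Spec.pm";      # string: the relative path is taken literally

# Both forms end up recorded in %INC under the file's relative path.
my @loaded = grep { exists $INC{$_} } qw(File/Basename.pm File/Spec.pm);
print "loaded: @loaded\n";
```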
Because the C<use> statement implies a C<BEGIN> block, the importation
of semantics happens at the moment the C<use> statement is compiled,
before the rest of the file is compiled. This is how it is able
to function as a pragma mechanism, and also how modules are able to
declare subroutines that are then visible as list operators for
the rest of the current file. This will not work if you use C<require>
instead of C<use>. With require you can get into this problem:

    require Cwd;                # make Cwd:: accessible
    $here = Cwd::getcwd();

    use Cwd;                    # import names from Cwd::
    $here = getcwd();

    require Cwd;                # make Cwd:: accessible
    $here = getcwd();           # oops! no main::getcwd()

In general, C<use Module ()> is recommended over C<require Module>,
because it determines module availability at compile time, not in the
middle of your program's execution. An exception would be if two modules
each tried to C<use> each other, and each also called a function from
that other module. In that case, it's easy to use C<require>s instead.

Perl packages may be nested inside other package names, so we can have
package names containing C<::>. But if we used that package name
directly as a filename it would make for unwieldy or impossible
filenames on some systems. Therefore, if a module's name is, say,
C<Text::Soundex>, then its definition is actually found in the library
file F<Text/Soundex.pm>.

Perl modules always have a F<.pm> file, but there may also be dynamically
linked executables or autoloaded subroutine definitions associated with
the module. If so, these will be entirely transparent to the user of
the module. It is the responsibility of the F<.pm> file to load (or
arrange to autoload) any additional functionality. The POSIX module
happens to do both dynamic loading and autoloading, but the user can
say just C<use POSIX> to get it all.

For more information on writing extension modules, see L<perlxstut>
and L<perlguts>.

=head1 SEE ALSO

See L<perlmodlib> for general style issues related to building Perl
modules and classes as well as descriptions of the standard library and
CPAN, L<Exporter> for how Perl's standard import/export mechanism works,
L<perltoot> for an in-depth tutorial on creating classes, L<perlobj>
for a hard-core reference document on objects, and L<perlsub> for an
explanation of functions and scoping.