=head1 NAME

perlsub - Perl subroutines

=head1 SYNOPSIS

To declare subroutines:

    sub NAME;                       # A "forward" declaration.
    sub NAME(PROTO);                #  ditto, but with prototypes
    sub NAME : ATTRS;               #  with attributes
    sub NAME(PROTO) : ATTRS;        #  with attributes and prototypes

    sub NAME BLOCK                  # A declaration and a definition.
    sub NAME(PROTO) BLOCK           #  ditto, but with prototypes
    sub NAME : ATTRS BLOCK          #  with attributes
    sub NAME(PROTO) : ATTRS BLOCK   #  with prototypes and attributes

To define an anonymous subroutine at runtime:

    $subref = sub BLOCK;                    # no proto
    $subref = sub (PROTO) BLOCK;            # with proto
    $subref = sub : ATTRS BLOCK;            # with attributes
    $subref = sub (PROTO) : ATTRS BLOCK;    # with proto and attributes

To import subroutines:

    use MODULE qw(NAME1 NAME2 NAME3);

To call subroutines:

    NAME(LIST);     # & is optional with parentheses.
    NAME LIST;      # Parentheses optional if predeclared/imported.
    &NAME(LIST);    # Circumvent prototypes.
    &NAME;          # Makes current @_ visible to called subroutine.

=head1 DESCRIPTION

Like many languages, Perl provides for user-defined subroutines.
These may be located anywhere in the main program, loaded in from
other files via the C<do>, C<require>, or C<use> keywords, or
generated on the fly using C<eval> or anonymous subroutines.
You can even call a function indirectly using a variable containing
its name or a CODE reference.

The Perl model for function call and return values is simple: all
functions are passed as parameters one single flat list of scalars, and
all functions likewise return to their caller one single flat list of
scalars.  Any arrays or hashes in these call and return lists will
collapse, losing their identities--but you may always use
pass-by-reference instead to avoid this.  Both call and return lists may
contain as many or as few scalar elements as you'd like.  (Often a
function without an explicit return statement is called a subroutine, but
there's really no difference from Perl's perspective.)

Any arguments passed in show up in the array C<@_>.  Therefore, if
you called a function with two arguments, those would be stored in
C<$_[0]> and C<$_[1]>.  The array C<@_> is a local array, but its
elements are aliases for the actual scalar parameters.  In particular,
if an element C<$_[0]> is updated, the corresponding argument is
updated (or an error occurs if it is not updatable).  If an argument
is an array or hash element which did not exist when the function
was called, that element is created only when (and if) it is modified
or a reference to it is taken.  (Some earlier versions of Perl
created the element whether or not the element was assigned to.)
Assigning to the whole array C<@_> removes that aliasing, and does
not update any arguments.

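As a minimal sketch of the aliasing just described (the subroutine
names C<double_first> and C<no_effect> are made up for this
illustration), the first function below changes its caller's variable
through C<$_[0]>, while the second assigns to the whole of C<@_> first
and therefore leaves the caller's variable alone:

```perl
use strict;
use warnings;

# Hypothetical helper: modifies its first argument in place.
sub double_first {
    $_[0] *= 2;         # $_[0] is an alias for the caller's scalar
}

# Hypothetical helper: assigning to all of @_ removes the aliasing.
sub no_effect {
    @_ = (0) x @_;      # @_ now holds fresh scalars, not aliases
    $_[0] = 99;         # changes only the new element
}

my $n = 21;
double_first($n);       # $n is now 42
no_effect($n);          # $n is still 42
print "$n\n";
```
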
The return value of a subroutine is the value of the last expression
evaluated.  More explicitly, a C<return> statement may be used to exit the
subroutine, optionally specifying the returned value, which will be
evaluated in the appropriate context (list, scalar, or void) depending
on the context of the subroutine call.  If you specify no return value,
the subroutine returns an empty list in list context, the undefined
value in scalar context, or nothing in void context.  If you return
one or more aggregates (arrays and hashes), these will be flattened
together into one large indistinguishable list.

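The following sketch illustrates both points; the names C<nothing>
and C<pair> are hypothetical and exist only for this example.  A bare
C<return> adapts to the caller's context, and returned aggregates
arrive as one flat list:

```perl
use strict;
use warnings;

# A bare return: empty list in list context, undef in scalar context.
sub nothing { return }

my @list   = nothing();     # @list is empty
my $scalar = nothing();     # $scalar is undef

# Returning an array and a hash flattens them into one list.
sub pair {
    my @a = (1, 2);
    my %h = (k => 'v');
    return (@a, %h);
}

my @flat = pair();          # four elements: 1, 2, 'k', 'v'
print scalar(@flat), "\n";
```
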
Perl does not have named formal parameters.  In practice all you
do is assign to a C<my()> list of these.  Variables that aren't
declared to be private are global variables.  For gory details
on creating private variables, see L<"Private Variables via my()">
and L<"Temporary Values via local()">.  To create protected
environments for a set of functions in a separate package (and
probably a separate file), see L<perlmod/"Packages">.

Example:

    sub max {
        my $max = shift(@_);
        foreach $foo (@_) {
            $max = $foo if $max < $foo;
        }
        return $max;
    }
    $bestday = max($mon,$tue,$wed,$thu,$fri);

Example:

    # get a line, combining continuation lines
    #  that start with whitespace

    sub get_line {
        $thisline = $lookahead;  # global variables!
        LINE: while (defined($lookahead = <STDIN>)) {
            if ($lookahead =~ /^[ \t]/) {
                $thisline .= $lookahead;
            }
            else {
                last LINE;
            }
        }
        return $thisline;
    }

    $lookahead = <STDIN>;       # get first line
    while (defined($line = get_line())) {
        ...
    }

Assigning to a list of private variables to name your arguments:

    sub maybeset {
        my($key, $value) = @_;
        $Foo{$key} = $value unless $Foo{$key};
    }

Because the assignment copies the values, this also has the effect
of turning call-by-reference into call-by-value.  Otherwise a
function is free to do in-place modifications of C<@_> and change
its caller's values.

    upcase_in($v1, $v2);  # this changes $v1 and $v2
    sub upcase_in {
        for (@_) { tr/a-z/A-Z/ }
    }

You aren't allowed to modify constants in this way, of course.  If an
argument were actually literal and you tried to change it, you'd take a
(presumably fatal) exception.  For example, this won't work:

    upcase_in("frederick");

It would be much safer if the C<upcase_in()> function
were written to return a copy of its parameters instead
of changing them in place:

    ($v3, $v4) = upcase($v1, $v2);  # this doesn't change $v1 and $v2
    sub upcase {
        return unless defined wantarray;  # void context, do nothing
        my @parms = @_;
        for (@parms) { tr/a-z/A-Z/ }
        return wantarray ? @parms : $parms[0];
    }

Notice how this (unprototyped) function doesn't care whether it was
passed real scalars or arrays.  Perl sees all arguments as one big,
long, flat parameter list in C<@_>.  This is one area where
Perl's simple argument-passing style shines.  The C<upcase()>
function would work perfectly well without changing the C<upcase()>
definition even if we fed it things like this:

    @newlist   = upcase(@list1, @list2);
    @newlist   = upcase( split /:/, $var );

Do not, however, be tempted to do this:

    (@a, @b)   = upcase(@list1, @list2);

Like the flattened incoming parameter list, the return list is also
flattened on return.  So all you have managed to do here is store
everything in C<@a> and make C<@b> empty.  See
L<Pass by Reference> for alternatives.

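One such alternative, sketched here under the assumption that the
caller is happy to work with array references, is to pass and return
references so that each array keeps its identity.  The name
C<upcase_both> is hypothetical, and C<uc> is used in place of
C<tr/a-z/A-Z/> purely for brevity:

```perl
use strict;
use warnings;

# Hypothetical variant of upcase() that takes two array references
# and returns two new array references.
sub upcase_both {
    my ($first, $second) = @_;
    my @up1 = map { uc } @$first;
    my @up2 = map { uc } @$second;
    return (\@up1, \@up2);      # two scalars, so nothing is merged
}

my @list1 = qw(a b);
my @list2 = qw(c d e);
my ($x_ref, $y_ref) = upcase_both(\@list1, \@list2);

print "@$x_ref\n";      # elements derived from @list1 only
print "@$y_ref\n";      # elements derived from @list2 only
```
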
A subroutine may be called using an explicit C<&> prefix.  The
C<&> is optional in modern Perl, as are parentheses if the
subroutine has been predeclared.  The C<&> is I<not> optional
when just naming the subroutine, such as when it's used as
an argument to defined() or undef().  Nor is it optional when you
want to do an indirect subroutine call with a subroutine name or
reference using the C<&$subref()> or C<&{$subref}()> constructs,
although the C<< $subref->() >> notation solves that problem.
See L<perlref> for more about all that.

Subroutines may be called recursively.  If a subroutine is called
using the C<&> form, the argument list is optional, and if omitted,
no C<@_> array is set up for the subroutine: the C<@_> array at the
time of the call is visible to the subroutine instead.  This is an
efficiency mechanism that new users may wish to avoid.

    &foo(1,2,3);        # pass three arguments
    foo(1,2,3);         # the same

    foo();              # pass a null list
    &foo();             # the same

    &foo;               # foo() gets current args, like foo(@_) !!
    foo;                # like foo() IFF sub foo predeclared, else "foo"

Not only does the C<&> form make the argument list optional, it also
disables any prototype checking on arguments you do provide.  This
is partly for historical reasons, and partly for having a convenient way
to cheat if you know what you're doing.  See L<Prototypes> below.

Subroutines whose names are in all upper case are reserved to the Perl
core, as are modules whose names are in all lower case.  A subroutine in
all capitals is a loosely-held convention meaning it will be called
indirectly by the run-time system itself, usually due to a triggered event.
Subroutines that do special, pre-defined things include C<AUTOLOAD>, C<CLONE>,
C<DESTROY> plus all functions mentioned in L<perltie> and L<PerlIO::via>.

The C<BEGIN>, C<CHECK>, C<INIT> and C<END> subroutines are not so much
subroutines as named special code blocks, of which you can have more
than one in a package, and which you can B<not> call explicitly.  See
L<perlmod/"BEGIN, CHECK, INIT and END">.

=head2 Private Variables via my()

Synopsis:

    my $foo;            # declare $foo lexically local
    my (@wid, %get);    # declare list of variables local
    my $foo = "flurp";  # declare $foo lexical, and init it
    my @oof = @bar;     # declare @oof lexical, and init it
    my $x : Foo = $y;   # similar, with an attribute applied

B<WARNING>: The use of attribute lists on C<my> declarations is still
evolving.  The current semantics and interface are subject to change.
See L<attributes> and L<Attribute::Handlers>.

The C<my> operator declares the listed variables to be lexically
confined to the enclosing block, conditional (C<if/unless/elsif/else>),
loop (C<for/foreach/while/until/continue>), subroutine, C<eval>,
or C<do/require/use>'d file.  If more than one value is listed, the
list must be placed in parentheses.  All listed elements must be
legal lvalues.  Only alphanumeric identifiers may be lexically
scoped--magical built-ins like C<$/> must currently be C<local>ized
with C<local> instead.

Unlike dynamic variables created by the C<local> operator, lexical
variables declared with C<my> are totally hidden from the outside
world, including any called subroutines.  This is true if it's the
same subroutine called from itself or elsewhere--every call gets
its own copy.

This doesn't mean that a C<my> variable declared in a statically
enclosing lexical scope would be invisible.  Only dynamic scopes
are cut off.   For example, the C<bumpx()> function below has access
to the lexical $x variable because both the C<my> and the C<sub>
occurred at the same scope, presumably file scope.

    my $x = 10;
    sub bumpx { $x++ }

An C<eval()>, however, can see lexical variables of the scope it is
being evaluated in, so long as the names aren't hidden by declarations within
the C<eval()> itself.  See L<perlref>.

The parameter list to my() may be assigned to if desired, which allows you
to initialize your variables.  (If no initializer is given for a
particular variable, it is created with the undefined value.)  Commonly
this is used to name input parameters to a subroutine.  Examples:

    $arg = "fred";        # "global" variable
    $n = cube_root(27);
    print "$arg thinks the root is $n\n";
 fred thinks the root is 3

    sub cube_root {
        my $arg = shift;  # name doesn't matter
        $arg **= 1/3;
        return $arg;
    }

The C<my> is simply a modifier on something you might assign to.  So when
you do assign to variables in its argument list, C<my> doesn't
change whether those variables are viewed as a scalar or an array.  So

    my ($foo) = <STDIN>;                # WRONG?
    my @FOO = <STDIN>;

both supply a list context to the right-hand side, while

    my $foo = <STDIN>;

supplies a scalar context.  But the following declares only one variable:

    my $foo, $bar = 1;                  # WRONG

That has the same effect as

    my $foo;
    $bar = 1;

The declared variable is not introduced (is not visible) until after
the current statement.  Thus,

    my $x = $x;

can be used to initialize a new $x with the value of the old $x, and
the expression

    my $x = 123 and $x == 123

is false unless the old $x happened to have the value C<123>.

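The rule that a new lexical only becomes visible after the statement
that declares it can be seen in a short sketch.  The variable names
here are chosen for illustration, and the outer $x is assumed to be a
package variable declared with C<our>:

```perl
use strict;
use warnings;

our $x = 10;                # the "old" $x, a package variable
my @seen;
{
    my $x = $x + 1;         # the right-hand $x is still the old one
    push @seen, $x;         # the new lexical, initialized from the old
}
push @seen, $x;             # outside the block the old $x is unchanged
print "@seen\n";
```
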
Lexical scopes of control structures are not bounded precisely by the
braces that delimit their controlled blocks; control expressions are
part of that scope, too.  Thus in the loop

    while (my $line = <>) {
        $line = lc $line;
    } continue {
        print $line;
    }

the scope of $line extends from its declaration throughout the rest of
the loop construct (including the C<continue> clause), but not beyond
it.  Similarly, in the conditional

    if ((my $answer = <STDIN>) =~ /^yes$/i) {
        user_agrees();
    } elsif ($answer =~ /^no$/i) {
        user_disagrees();
    } else {
        chomp $answer;
        die "'$answer' is neither 'yes' nor 'no'";
    }

the scope of $answer extends from its declaration through the rest
of that conditional, including any C<elsif> and C<else> clauses,
but not beyond it.  See L<perlsyn/"Simple statements"> for information
on the scope of variables in statements with modifiers.

The C<foreach> loop defaults to scoping its index variable dynamically
in the manner of C<local>.  However, if the index variable is
prefixed with the keyword C<my>, or if there is already a lexical
by that name in scope, then a new lexical is created instead.  Thus
in the loop

    for my $i (1, 2, 3) {
        some_function();
    }

the scope of $i extends to the end of the loop, but not beyond it,
rendering the value of $i inaccessible within C<some_function()>.

Some users may wish to encourage the use of lexically scoped variables.
As an aid to catching implicit uses of package variables,
which are always global, if you say

    use strict 'vars';

then any variable mentioned from there to the end of the enclosing
block must either refer to a lexical variable, be predeclared via
C<our> or C<use vars>, or else must be fully qualified with the package name.
A compilation error results otherwise.  An inner block may countermand
this with C<no strict 'vars'>.

A C<my> has both a compile-time and a run-time effect.  At compile
time, the compiler takes notice of it.  The principal usefulness
of this is to quiet C<use strict 'vars'>, but it is also essential
for generation of closures as detailed in L<perlref>.  Actual
initialization is delayed until run time, though, so it gets executed
at the appropriate time, such as each time through a loop, for
example.

Variables declared with C<my> are not part of any package and are therefore
never fully qualified with the package name.  In particular, you're not
allowed to try to make a package variable (or other global) lexical:

    my $pack::var;      # ERROR!  Illegal syntax
    my $_;              # also illegal (currently)

In fact, dynamic variables (also known as package or global variables)
are still accessible using the fully qualified C<::> notation even while a
lexical of the same name is also visible:

    package main;
    local $x = 10;
    my    $x = 20;
    print "$x and $::x\n";

That will print out C<20> and C<10>.

You may declare C<my> variables at the outermost scope of a file
to hide any such identifiers from the world outside that file.  This
is similar in spirit to C's static variables when they are used at
the file level.  To do this with a subroutine requires the use of
a closure (an anonymous function that accesses enclosing lexicals).
If you want to create a private subroutine that cannot be called
from outside that block, you can declare a lexical variable containing
an anonymous sub reference:

    my $secret_version = '1.001-beta';
    my $secret_sub = sub { print $secret_version };
    &$secret_sub();

As long as the reference is never returned by any function within the
module, no outside module can see the subroutine, because its name is not in
any package's symbol table.  Remember that it's not I<REALLY> called
C<$some_pack::secret_version> or anything; it's just $secret_version,
unqualified and unqualifiable.

This does not work with object methods, however; all object methods
have to be in the symbol table of some package to be found.  See
L<perlref/"Function Templates"> for something of a work-around to
this.

=head2 Persistent Private Variables

Just because a lexical variable is lexically (also called statically)
scoped to its enclosing block, C<eval>, or C<do> FILE, this doesn't mean that
within a function it works like a C static.  It normally works more
like a C auto, but with implicit garbage collection.

Unlike local variables in C or C++, Perl's lexical variables don't
necessarily get recycled just because their scope has exited.
If something more permanent is still aware of the lexical, it will
stick around.  So long as something else references a lexical, that
lexical won't be freed--which is as it should be.  You wouldn't want
memory being freed until you were done using it, or kept around once you
were done.  Automatic garbage collection takes care of this for you.

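A small sketch of a lexical outliving its scope follows; the function
name C<make_list> is hypothetical.  Each call creates a fresh C<@items>,
and the returned reference keeps that particular array alive after the
subroutine has exited:

```perl
use strict;
use warnings;

# Hypothetical constructor: returns a reference to its own lexical.
sub make_list {
    my @items = @_;     # a new @items on every call
    return \@items;     # the reference keeps it from being freed
}

my $first  = make_list(1, 2, 3);
my $second = make_list('a', 'b');
push @$first, 4;        # affects only the first array

print "@$first\n";
print "@$second\n";
```
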
This means that you can pass back or save away references to lexical
variables, whereas to return a pointer to a C auto is a grave error.
It also gives us a way to simulate C's function statics.  Here's a
mechanism for giving a function private variables with both lexical
scoping and a static lifetime.  If you do want to create something like
C's static variables, just enclose the whole function in an extra block,
and put the static variable outside the function but in the block.

    {
        my $secret_val = 0;
        sub gimme_another {
            return ++$secret_val;
        }
    }
    # $secret_val now becomes unreachable by the outside
    # world, but retains its value between calls to gimme_another

If this function is being sourced in from a separate file
via C<require> or C<use>, then this is probably just fine.  If it's
all in the main program, you'll need to arrange for the C<my>
to be executed early, either by putting the whole block above
your main program, or more likely, placing merely a C<BEGIN>
code block around it to make sure it gets executed before your program
starts to run:

    BEGIN {
        my $secret_val = 0;
        sub gimme_another {
            return ++$secret_val;
        }
    }

See L<perlmod/"BEGIN, CHECK, INIT and END"> about the
special triggered code blocks, C<BEGIN>, C<CHECK>, C<INIT> and C<END>.

If declared at the outermost scope (the file scope), then lexicals
work somewhat like C's file statics.  They are available to all
functions in that same file declared below them, but are inaccessible
from outside that file.  This strategy is sometimes used in modules
to create private variables that the whole module can see.

=head2 Temporary Values via local()

B<WARNING>: In general, you should be using C<my> instead of C<local>, because
it's faster and safer.  Exceptions to this include the global punctuation
variables, global filehandles and formats, and direct manipulation of the
Perl symbol table itself.  C<local> is mostly used when the current value
of a variable must be visible to called subroutines.

Synopsis:

    # localization of values

    local $foo;                 # make $foo dynamically local
    local (@wid, %get);         # make list of variables local
    local $foo = "flurp";       # make $foo dynamic, and init it
    local @oof = @bar;          # make @oof dynamic, and init it

    local $hash{key} = "val";   # sets a local value for this hash entry
    local ($cond ? $v1 : $v2);  # several types of lvalues support
                                # localization

    # localization of symbols

    local *FH;                  # localize $FH, @FH, %FH, &FH  ...
    local *merlyn = *randal;    # now $merlyn is really $randal, plus
                                #     @merlyn is really @randal, etc
    local *merlyn = 'randal';   # SAME THING: promote 'randal' to *randal
    local *merlyn = \$randal;   # just alias $merlyn, not @merlyn etc

A C<local> modifies its listed variables to be "local" to the
enclosing block, C<eval>, or C<do FILE>--and to I<any subroutine
called from within that block>.  A C<local> just gives temporary
values to global (meaning package) variables.  It does I<not> create
a local variable.  This is known as dynamic scoping.  Lexical scoping
is done with C<my>, which works more like C's auto declarations.

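The difference between dynamic and lexical scoping can be sketched as
follows.  All names here (C<$level>, C<show>, C<with_local>,
C<with_my>) are hypothetical and serve only this illustration:

```perl
use strict;
use warnings;

our $level = 'global';              # a package variable

sub show { return $level }          # always reads the package variable

sub with_local {
    local $level = 'local';         # temporary value, seen by callees
    return show();
}

sub with_my {
    my $level = 'lexical';          # new variable, invisible to callees
    return show();
}

my @results = (with_local(), with_my(), show());
print "@results\n";
```

Under these assumptions C<with_local()> yields C<local>, because
C<show()> sees the temporary value, whereas C<with_my()> and the final
C<show()> both yield C<global>.
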
Some types of lvalues can be localized as well: hash and array elements
and slices, conditionals (provided that their result is always
localizable), and symbolic references.  As for simple variables, this
creates new, dynamically scoped values.

If more than one variable or expression is given to C<local>, they must be
placed in parentheses.  This operator works
by saving the current values of those variables in its argument list on a
hidden stack and restoring them upon exiting the block, subroutine, or
eval.  This means that called subroutines can also reference the local
variable, but not the global one.  The argument list may be assigned to if
desired, which allows you to initialize your local variables.  (If no
initializer is given for a particular variable, it is created with an
undefined value.)

Because C<local> is a run-time operator, it gets executed each time
through a loop.  Consequently, it's more efficient to localize your
variables outside the loop.

=head3 Grammatical note on local()

A C<local> is simply a modifier on an lvalue expression.  When you assign to
a C<local>ized variable, the C<local> doesn't change whether its list is viewed
as a scalar or an array.  So

    local($foo) = <STDIN>;
    local @FOO = <STDIN>;

both supply a list context to the right-hand side, while

    local $foo = <STDIN>;

supplies a scalar context.

=head3 Localization of special variables

If you localize a special variable, you'll be giving a new value to it,
but its magic won't go away.  That means that all side-effects related
to this magic still work with the localized value.

This feature allows code like this to work:

    # Read the whole contents of FILE in $slurp
    { local $/ = undef; $slurp = <FILE>; }

Note, however, that this restricts localization of some values; for
example, the following statement dies, as of perl 5.9.0, with the error
I<Modification of a read-only value attempted>, because the $1 variable is
magical and read-only:

    local $1 = 2;

Similarly, but in a way more difficult to spot, the following snippet will
die in perl 5.9.0:

    sub f { local $_ = "foo"; print }
    for ($1) {
        # now $_ is aliased to $1, thus is magic and readonly
        f();
    }

See the next section for an alternative to this situation.

B<WARNING>: Localization of tied arrays and hashes does not currently
work as described.
This will be fixed in a future release of Perl; in the meantime, avoid
code that relies on any particular behaviour of localising tied arrays
or hashes (localising individual elements is still okay).
See L<perl58delta/"Localising Tied Arrays and Hashes Is Broken"> for more
details.

=head3 Localization of globs

The construct

    local *name;

creates a whole new symbol table entry for the glob C<name> in the
current package.  That means that all variables in its glob slot ($name,
@name, %name, &name, and the C<name> filehandle) are dynamically reset.

This implies, among other things, that any magic eventually carried by
those variables is locally lost.  In other words, saying C<local */>
will not have any effect on the internal value of the input record
separator.

Notably, if you want to work with a brand new value of the default scalar
$_, and avoid the potential problem listed above about $_ previously
carrying a magic value, you should use C<local *_> instead of C<local $_>.

=head3 Localization of elements of composite types

It's also worth taking a moment to explain what happens when you
C<local>ize a member of a composite type (i.e. an array or hash element).
In this case, the element is C<local>ized I<by name>.  This means that
when the scope of the C<local()> ends, the saved value will be
restored to the hash element whose key was named in the C<local()>, or
the array element whose index was named in the C<local()>.  If that
element was deleted while the C<local()> was in effect (e.g. by a
C<delete()> from a hash or a C<shift()> of an array), it will spring
back into existence, possibly extending an array and filling in the
skipped elements with C<undef>.  For instance, if you say

    %hash = ( 'This' => 'is', 'a' => 'test' );
    @ary  = ( 0..5 );
    {
        local($ary[5]) = 6;
        local($hash{'a'}) = 'drill';
        while (my $e = pop(@ary)) {
            print "$e . . .\n";
            last unless $e > 3;
        }
        if (@ary) {
            $hash{'only a'} = 'test';
            delete $hash{'a'};
        }
    }
    print join(' ', map { "$_ $hash{$_}" } sort keys %hash),".\n";
    print "The array has ",scalar(@ary)," elements: ",
          join(', ', map { defined $_ ? $_ : 'undef' } @ary),"\n";

Perl will print

    6 . . .
    4 . . .
    3 . . .
    This is a test only a test.
    The array has 6 elements: 0, 1, 2, undef, undef, 5

The behavior of local() on non-existent members of composite
types is subject to change in future.

=head2 Lvalue subroutines

B<WARNING>: Lvalue subroutines are still experimental and the
implementation may change in future versions of Perl.

It is possible to return a modifiable value from a subroutine.
To do this, you have to declare the subroutine to return an lvalue.

    my $val;
    sub canmod : lvalue {
        # return $val; this doesn't work, don't say "return"
        $val;
    }
    sub nomod {
        $val;
    }

    canmod() = 5;   # assigns to $val
    nomod()  = 5;   # ERROR

The scalar/list context for the subroutine and for the right-hand
side of assignment is determined as if the subroutine call is replaced
by a scalar.  For example, consider:

    data(2,3) = get_data(3,4);

Both subroutines here are called in a scalar context, while in:

    (data(2,3)) = get_data(3,4);

and in:

    (data(2),data(3)) = get_data(3,4);

all the subroutines are called in a list context.

=over 4

=item Lvalue subroutines are EXPERIMENTAL

They appear to be convenient, but there are several reasons to be
circumspect.

You can't use the C<return> keyword; you must pass out the value before
falling out of subroutine scope (see the comment in the example above).
This is usually not a problem, but it disallows an explicit return out of a
deeply nested loop, which is sometimes a nice way out.

They violate encapsulation.  A normal mutator can check the supplied
argument before setting the attribute it is protecting; an lvalue
subroutine never gets that chance.  Consider:

    my $some_array_ref = [];    # protected by mutators ??

    sub set_arr {               # normal mutator
        my $val = shift;
        die("expected array, you supplied ", ref $val)
            unless ref $val eq 'ARRAY';
        $some_array_ref = $val;
    }
    sub set_arr_lv : lvalue {   # lvalue mutator
        $some_array_ref;
    }

    # set_arr_lv cannot stop this !
    set_arr_lv() = { a => 1 };

=back

cb1a09d0
AD
703=head2 Passing Symbol Table Entries (typeglobs)
704
19799a22
GS
705B<WARNING>: The mechanism described in this section was originally
706the only way to simulate pass-by-reference in older versions of
707Perl. While it still works fine in modern versions, the new reference
708mechanism is generally easier to work with. See below.
a0d0e21e
LW
709
710Sometimes you don't want to pass the value of an array to a subroutine
711but rather the name of it, so that the subroutine can modify the global
712copy of it rather than working with a local copy. In perl you can
cb1a09d0 713refer to all objects of a particular name by prefixing the name
5f05dabc 714with a star: C<*foo>. This is often known as a "typeglob", because the
a0d0e21e
LW
715star on the front can be thought of as a wildcard match for all the
716funny prefix characters on variables and subroutines and such.
717
55497cff 718When evaluated, the typeglob produces a scalar value that represents
5f05dabc 719all the objects of that name, including any filehandle, format, or
a0d0e21e 720subroutine. When assigned to, it causes the name mentioned to refer to
19799a22 721whatever C<*> value was assigned to it. Example:
a0d0e21e
LW
722
723 sub doubleary {
724 local(*someary) = @_;
725 foreach $elem (@someary) {
726 $elem *= 2;
727 }
728 }
729 doubleary(*foo);
730 doubleary(*bar);
731
19799a22 732Scalars are already passed by reference, so you can modify
a0d0e21e 733scalar arguments without using this mechanism by referring explicitly
1fef88e7 734to C<$_[0]> etc. You can modify all the elements of an array by passing
f86cebdf
GS
735all the elements as scalars, but you have to use the C<*> mechanism (or
736the equivalent reference mechanism) to C<push>, C<pop>, or change the size of
a0d0e21e
LW
737an array. It will certainly be faster to pass the typeglob (or reference).
738
739Even if you don't want to modify an array, this mechanism is useful for
5f05dabc 740passing multiple arrays in a single LIST, because normally the LIST
a0d0e21e 741mechanism will merge all the array values so that you can't extract out
55497cff 742the individual arrays. For more on typeglobs, see
2ae324a7 743L<perldata/"Typeglobs and Filehandles">.
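
As a minimal sketch of the point about changing an array's size and keeping
two arrays distinct, the following passes two typeglobs and appends one
array to the other. The names C<append_all>, C<@queue>, and C<@pending> are
made up for illustration and do not appear elsewhere in this document.

```perl
use strict;
use warnings;

our (@dst, @src);                # package arrays that the sub aliases
our @queue   = (1, 2);           # hypothetical caller data
our @pending = (3, 4, 5);

sub append_all {
    local (*dst, *src) = @_;     # alias both caller arrays by typeglob
    push @dst, @src;             # grows the caller's array in place
    return scalar @dst;
}

my $count = append_all(*queue, *pending);
print "queue now holds (@queue), $count elements\n";
```

Because the aliases are made with C<local>, they vanish when the
subroutine returns, and C<@pending> is left untouched.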
cb1a09d0 744
745=head2 When to Still Use local()
746
Despite the existence of C<my>, there are three places where the
C<local> operator still shines. In fact, in these three places, you
I<must> use C<local> instead of C<my>.
750
13a2d996 751=over 4
5a964f20 752
753=item 1.
754
755You need to give a global variable a temporary value, especially $_.
5a964f20 756
757The global variables, like C<@ARGV> or the punctuation variables, must be
758C<local>ized with C<local()>. This block reads in F</etc/motd>, and splits
5a964f20 759it up into chunks separated by lines of equal signs, which are placed
f86cebdf 760in C<@Fields>.
761
762 {
763 local @ARGV = ("/etc/motd");
764 local $/ = undef;
765 local $_ = <>;
766 @Fields = split /^\s*=+\s*$/;
767 }
768
In particular, it's important to C<local>ize C<$_> in any routine that assigns
to it. Look out for implicit assignments in C<while> conditionals.
771
772=item 2.
773
774You need to create a local file or directory handle or a local function.
5a964f20 775
776A function that needs a filehandle of its own must use
777C<local()> on a complete typeglob. This can be used to create new symbol
778table entries:
779
780 sub ioqueue {
781 local (*READER, *WRITER); # not my!
17b63f68 782 pipe (READER, WRITER) or die "pipe: $!";
783 return (*READER, *WRITER);
784 }
785 ($head, $tail) = ioqueue();
786
787See the Symbol module for a way to create anonymous symbol table
788entries.
789
790Because assignment of a reference to a typeglob creates an alias, this
791can be used to create what is effectively a local function, or at least,
792a local alias.
793
    {
        local *grow = \&shrink; # only until this block exits
        grow();                 # really calls shrink()
        move();                 # if move() grow()s, it shrink()s too
    }
    grow();                     # get the real grow() again
800
801See L<perlref/"Function Templates"> for more about manipulating
802functions by name in this way.
803
804=item 3.
805
806You want to temporarily change just one element of an array or hash.
5a964f20 807
f86cebdf 808You can C<local>ize just one element of an aggregate. Usually this
809is done on dynamics:
810
811 {
812 local $SIG{INT} = 'IGNORE';
813 funct(); # uninterruptible
814 }
815 # interruptibility automatically restored here
816
817But it also works on lexically declared aggregates. Prior to 5.005,
818this operation could on occasion misbehave.
819
820=back
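
The first and third cases can be sketched together. This is only an
illustration: C<$level>, C<%config>, C<report>, and C<with_overrides> are
hypothetical names, and the hash is deliberately a lexical one to show that
a single element of it can still be C<local>ized.

```perl
use strict;
use warnings;

our $level = 0;                      # hypothetical package global
my %config = (verbose => 0);         # lexical hash; one element gets localized

sub report { return "level=$level verbose=$config{verbose}" }

sub with_overrides {
    local $level = $level + 1;       # temporary value, visible to callees
    local $config{verbose} = 1;      # just this one element
    return report();
}

my $inside  = with_overrides();      # sees the temporary values
my $outside = report();              # old values restored automatically
print "$inside\n$outside\n";
```

The called routine sees the temporary values because C<local> is dynamically
scoped; a C<my> variable declared in C<with_overrides> would be invisible to
C<report>.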
821
822=head2 Pass by Reference
823
55497cff 824If you want to pass more than one array or hash into a function--or
825return them from it--and have them maintain their integrity, then
826you're going to have to use an explicit pass-by-reference. Before you
827do that, you need to understand references as detailed in L<perlref>.
c07a80fd 828This section may not make much sense to you otherwise.
cb1a09d0 829
Here are a few simple examples. First, let's pass in several arrays
to a function and have it C<pop> all of them, returning a new list
of all their former last elements:
833
834 @tailings = popmany ( \@a, \@b, \@c, \@d );
835
836 sub popmany {
837 my $aref;
838 my @retlist = ();
839 foreach $aref ( @_ ) {
840 push @retlist, pop @$aref;
54310121 841 }
cb1a09d0 842 return @retlist;
54310121 843 }
cb1a09d0 844
54310121 845Here's how you might write a function that returns a
846list of keys occurring in all the hashes passed to it:
847
54310121 848 @common = inter( \%foo, \%bar, \%joe );
849 sub inter {
850 my ($k, $href, %seen); # locals
851 foreach $href (@_) {
852 while ( $k = each %$href ) {
853 $seen{$k}++;
54310121 854 }
855 }
cb1a09d0 856 return grep { $seen{$_} == @_ } keys %seen;
54310121 857 }
cb1a09d0 858
5f05dabc 859So far, we're using just the normal list return mechanism.
54310121 860What happens if you want to pass or return a hash? Well,
861if you're using only one of them, or you don't mind them
cb1a09d0 862concatenating, then the normal calling convention is ok, although
54310121 863a little expensive.
864
865Where people get into trouble is here:
866
867 (@a, @b) = func(@c, @d);
868or
869 (%a, %b) = func(%c, %d);
870
That syntax simply won't work. It sets just C<@a> or C<%a> and
clears the C<@b> or C<%b>. Plus the function didn't receive
two separate arrays or hashes: it got one long list in C<@_>,
874as always.
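
The flattening is easy to observe. In this small sketch, C<passthrough> is
a hypothetical helper that simply returns its arguments, and the array names
are chosen only for the demonstration.

```perl
use strict;
use warnings;

sub passthrough { return @_ }        # hypothetical helper: returns its args

my @c = (1, 2);
my @d = (3, 4, 5);
my (@first, @second) = passthrough(@c, @d);   # @first swallows everything

printf "first has %d elements, second has %d\n",
    scalar @first, scalar @second;
```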
875
If you can arrange for everyone to deal with this through references, it's
cleaner code, although not so nice to look at. Here's a function that
takes two array references as arguments, returning the two array references
ordered so that the array with more elements comes first:
880
881 ($aref, $bref) = func(\@c, \@d);
882 print "@$aref has more than @$bref\n";
883 sub func {
884 my ($cref, $dref) = @_;
885 if (@$cref > @$dref) {
886 return ($cref, $dref);
887 } else {
c07a80fd 888 return ($dref, $cref);
54310121 889 }
890 }
891
892It turns out that you can actually do this also:
893
894 (*a, *b) = func(\@c, \@d);
895 print "@a has more than @b\n";
896 sub func {
897 local (*c, *d) = @_;
898 if (@c > @d) {
899 return (\@c, \@d);
900 } else {
901 return (\@d, \@c);
54310121 902 }
903 }
904
905Here we're using the typeglobs to do symbol table aliasing. It's
19799a22 906a tad subtle, though, and also won't work if you're using C<my>
09bef843 907variables, because only globals (even in disguise as C<local>s)
19799a22 908are in the symbol table.
5f05dabc 909
If you're passing around filehandles, you could usually just use the bare
typeglob, like C<*STDOUT>, but typeglob references work, too.
For example:
5f05dabc 913
914 splutter(\*STDOUT);
915 sub splutter {
916 my $fh = shift;
917 print $fh "her um well a hmmm\n";
918 }
919
920 $rec = get_rec(\*STDIN);
921 sub get_rec {
922 my $fh = shift;
923 return scalar <$fh>;
924 }
925
If you're planning on generating new filehandles, you could do this.
Notice that it passes back just the bare C<*FH>, not its reference.
5f05dabc 928
929 sub openit {
19799a22 930 my $path = shift;
5f05dabc 931 local *FH;
e05a3a1e 932 return open (FH, $path) ? *FH : undef;
54310121 933 }
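
On Perl 5.6 and later, C<open> can also autovivify a handle into an
undefined lexical scalar, which avoids the typeglob altogether. The
following is a sketch of that alternative, not a replacement for the
example above; C<open_lexical> is a hypothetical name, and an in-memory
string stands in for a real file so that the example is self-contained.

```perl
use strict;
use warnings;

sub open_lexical {
    my $source = shift;              # a path, or a scalar ref for in-memory data
    open(my $fh, '<', $source) or return;
    return $fh;                      # the handle lives as long as the reference
}

my $text = "first line\nsecond line\n";   # demo data instead of a real file
my $fh   = open_lexical(\$text) or die "open: $!";
my $line = <$fh>;
close $fh;
print $line;
```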
5f05dabc 934
935=head2 Prototypes
936
937Perl supports a very limited kind of compile-time argument checking
938using function prototyping. If you declare
939
940 sub mypush (\@@)
941
942then C<mypush()> takes arguments exactly like C<push()> does. The
943function declaration must be visible at compile time. The prototype
944affects only interpretation of new-style calls to the function,
945where new-style is defined as not using the C<&> character. In
946other words, if you call it like a built-in function, then it behaves
947like a built-in function. If you call it like an old-fashioned
948subroutine, then it behaves like an old-fashioned subroutine. It
949naturally falls out from this rule that prototypes have no influence
950on subroutine references like C<\&foo> or on indirect subroutine
c47ff5f1 951calls like C<&{$subref}> or C<< $subref->() >>.
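
To make the new-style versus old-style distinction concrete, here is one
possible body for C<mypush>. The body is an assumption; only the prototype
comes from the text above.

```perl
use strict;
use warnings;

sub mypush (\@@) {
    my ($aref, @items) = @_;         # first argument arrives as an array ref
    push @$aref, @items;
    return scalar @$aref;
}

my @stack = (1);
mypush @stack, 2, 3;                 # new-style call: the prototype applies
&mypush(\@stack, 4);                 # old-style call: pass the reference yourself
print "@stack\n";
```

In the C<&> form the prototype is ignored, so the caller has to supply the
reference that the prototype would otherwise have created.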
c07a80fd 952
953Method calls are not influenced by prototypes either, because the
954function to be called is indeterminate at compile time, since
955the exact code called depends on inheritance.
cb1a09d0 956
957Because the intent of this feature is primarily to let you define
958subroutines that work like built-in functions, here are prototypes
959for some other functions that parse almost exactly like the
960corresponding built-in.
961
962 Declared as Called as
963
964 sub mylink ($$) mylink $old, $new
965 sub myvec ($$$) myvec $var, $offset, 1
966 sub myindex ($$;$) myindex &getstring, "substr"
967 sub mysyswrite ($$$;$) mysyswrite $buf, 0, length($buf) - $off, $off
968 sub myreverse (@) myreverse $a, $b, $c
969 sub myjoin ($@) myjoin ":", $a, $b, $c
970 sub mypop (\@) mypop @array
971 sub mysplice (\@$$@) mysplice @array, @array, 0, @pushme
972 sub mykeys (\%) mykeys %{$hashref}
973 sub myopen (*;$) myopen HANDLE, $name
974 sub mypipe (**) mypipe READHANDLE, WRITEHANDLE
975 sub mygrep (&@) mygrep { /foo/ } $a, $b, $c
976 sub myrand ($) myrand 42
977 sub mytime () mytime
cb1a09d0 978
c07a80fd 979Any backslashed prototype character represents an actual argument
6e47f808 980that absolutely must start with that character. The value passed
981as part of C<@_> will be a reference to the actual argument given
982in the subroutine call, obtained by applying C<\> to that argument.
c07a80fd 983
984You can also backslash several argument types simultaneously by using
985the C<\[]> notation:
986
987 sub myref (\[$@%&*])
988
989will allow calling myref() as
990
991 myref $var
992 myref @array
993 myref %hash
994 myref &sub
995 myref *glob
996
997and the first argument of myref() will be a reference to
998a scalar, an array, a hash, a code, or a glob.
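
A quick way to see which kind of reference arrives is to return C<ref> of
the first argument. The names C<kind_of> and C<helper> below are
hypothetical, chosen only for this sketch.

```perl
use strict;
use warnings;

sub kind_of (\[$@%&*]) { return ref $_[0] }   # hypothetical name

sub helper { 1 }                     # something to take a code reference to
my $scalar = 0;
my @array;
my %hash;

my @kinds = (
    kind_of($scalar),
    kind_of(@array),
    kind_of(%hash),
    kind_of(&helper),
    kind_of(*STDOUT),
);
print "@kinds\n";
```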
999
c07a80fd 1000Unbackslashed prototype characters have special meanings. Any
19799a22 1001unbackslashed C<@> or C<%> eats all remaining arguments, and forces
1002list context. An argument represented by C<$> forces scalar context. An
1003C<&> requires an anonymous subroutine, which, if passed as the first
1004argument, does not require the C<sub> keyword or a subsequent comma.
1005
1006A C<*> allows the subroutine to accept a bareword, constant, scalar expression,
1007typeglob, or a reference to a typeglob in that slot. The value will be
1008available to the subroutine either as a simple scalar, or (in the latter
1009two cases) as a reference to the typeglob. If you wish to always convert
1010such arguments to a typeglob reference, use Symbol::qualify_to_ref() as
1011follows:
1012
1013 use Symbol 'qualify_to_ref';
1014
1015 sub foo (*) {
1016 my $fh = qualify_to_ref(shift, caller);
1017 ...
1018 }
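
Filling in the C<...> with one possible body gives a runnable sketch. The
subroutine name C<emit>, the handle C<OUT>, and the use of an in-memory
file are assumptions made for the demonstration.

```perl
use strict;
use warnings;
use Symbol 'qualify_to_ref';

sub emit (*$) {
    my $fh = qualify_to_ref(shift, caller);   # normalize to a glob reference
    print {$fh} shift;
}

my $buffer = '';
open(OUT, '>', \$buffer) or die "open: $!";   # in-memory handle for the demo
emit(OUT,   "bareword\n");
emit(*OUT,  "typeglob\n");
emit(\*OUT, "globref\n");
close OUT;
print $buffer;
```

All three spellings end up writing to the same handle, which is the point
of normalizing the argument inside the subroutine.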
c07a80fd 1019
1020A semicolon separates mandatory arguments from optional arguments.
19799a22 1021It is redundant before C<@> or C<%>, which gobble up everything else.
cb1a09d0 1022
1023Note how the last three examples in the table above are treated
1024specially by the parser. C<mygrep()> is parsed as a true list
1025operator, C<myrand()> is parsed as a true unary operator with unary
1026precedence the same as C<rand()>, and C<mytime()> is truly without
1027arguments, just like C<time()>. That is, if you say
1028
1029 mytime +2;
1030
f86cebdf 1031you'll get C<mytime() + 2>, not C<mytime(2)>, which is how it would be parsed
19799a22 1032without a prototype.
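
The difference in parsing can be checked directly. In this sketch C<answer>
and C<count_args> are hypothetical subroutines; the first has an empty
prototype and the second has none.

```perl
use strict;
use warnings;

sub answer () { 40 }                 # () prototype: takes no arguments
sub count_args { return scalar @_ }  # no prototype: ordinary list operator

my $sum   = answer +2;               # parsed as answer() + 2
my $count = count_args +2;           # parsed as count_args(+2)
print "sum=$sum count=$count\n";
```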
cb1a09d0 1033
1034The interesting thing about C<&> is that you can generate new syntax with it,
1035provided it's in the initial position:
cb1a09d0 1036
6d28dffb 1037 sub try (&@) {
1038 my($try,$catch) = @_;
1039 eval { &$try };
1040 if ($@) {
1041 local $_ = $@;
1042 &$catch;
1043 }
1044 }
55497cff 1045 sub catch (&) { $_[0] }
1046
1047 try {
1048 die "phooey";
1049 } catch {
1050 /phooey/ and print "unphooey\n";
1051 };
1052
f86cebdf 1053That prints C<"unphooey">. (Yes, there are still unresolved
19799a22 1054issues having to do with visibility of C<@_>. I'm ignoring that
f86cebdf 1055question for the moment. (But note that if we make C<@_> lexically
cb1a09d0 1056scoped, those anonymous subroutines can act like closures... (Gee,
5f05dabc 1057is this sounding a little Lispish? (Never mind.))))
cb1a09d0 1058
19799a22 1059And here's a reimplementation of the Perl C<grep> operator:
1060
1061 sub mygrep (&@) {
1062 my $code = shift;
1063 my @result;
1064 foreach $_ (@_) {
6e47f808 1065 push(@result, $_) if &$code;
1066 }
1067 @result;
1068 }
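
The same C<(&@)> prototype lends itself to other block-taking functions.
As a further sketch in the same style, here is a hypothetical
C<first_match> that returns the first element for which the block is true.

```perl
use strict;
use warnings;

sub first_match (&@) {
    my $code = shift;
    foreach (@_) {
        return $_ if $code->();      # the block sees the element in $_
    }
    return;                          # nothing matched
}

my $hit  = first_match { $_ > 10 } 3, 8, 15, 42;
my $miss = first_match { $_ > 99 } 3, 8, 15, 42;
print "hit=$hit\n";
```

As with C<mygrep>, no C<sub> keyword and no comma are needed after the
block because it is the first argument.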
a0d0e21e 1069
1070Some folks would prefer full alphanumeric prototypes. Alphanumerics have
1071been intentionally left out of prototypes for the express purpose of
1072someday in the future adding named, formal parameters. The current
1073mechanism's main goal is to let module writers provide better diagnostics
1074for module users. Larry feels the notation quite understandable to Perl
1075programmers, and that it will not intrude greatly upon the meat of the
1076module, nor make it harder to read. The line noise is visually
1077encapsulated into a small pill that's easy to swallow.
1078
1079If you try to use an alphanumeric sequence in a prototype you will
1080generate an optional warning - "Illegal character in prototype...".
1081Unfortunately earlier versions of Perl allowed the prototype to be
1082used as long as its prefix was a valid prototype. The warning may be
1083upgraded to a fatal error in a future version of Perl once the
1084majority of offending code is fixed.
1085
1086It's probably best to prototype new functions, not retrofit prototyping
1087into older ones. That's because you must be especially careful about
1088silent impositions of differing list versus scalar contexts. For example,
1089if you decide that a function should take just one parameter, like this:
1090
1091 sub func ($) {
1092 my $n = shift;
1093 print "you gave me $n\n";
54310121 1094 }
1095
1096and someone has been calling it with an array or expression
1097returning a list:
1098
1099 func(@foo);
1100 func( split /:/ );
1101
19799a22 1102Then you've just supplied an automatic C<scalar> in front of their
f86cebdf 1103argument, which can be more than a bit surprising. The old C<@foo>
cb1a09d0 1104which used to hold one thing doesn't get passed in. Instead,
1105C<func()> now gets passed in a C<1>; that is, the number of elements
1106in C<@foo>. And the C<split> gets called in scalar context so it
1107starts scribbling on your C<@_> parameter list. Ouch!
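
The array case can be demonstrated with two otherwise identical
subroutines. The names C<with_proto> and C<without_proto> are made up for
this sketch.

```perl
use strict;
use warnings;

sub with_proto ($) { return $_[0] }  # imposes scalar context on its argument
sub without_proto  { return $_[0] }  # receives the flattened list

my @foo = ('apple');
my $new = with_proto(@foo);          # gets the element count
my $old = without_proto(@foo);       # gets the first element
print "with prototype: $new, without: $old\n";
```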
cb1a09d0 1108
5f05dabc 1109This is all very powerful, of course, and should be used only in moderation
54310121 1110to make the world a better place.
44a8e56a 1111
1112=head2 Constant Functions
1113
1114Functions with a prototype of C<()> are potential candidates for
1115inlining. If the result after optimization and constant folding
1116is either a constant or a lexically-scoped scalar which has no other
54310121 1117references, then it will be used in place of function calls made
1118without C<&>. Calls made using C<&> are never inlined. (See
1119F<constant.pm> for an easy way to declare most constants.)
44a8e56a 1120
5a964f20 1121The following functions would all be inlined:
44a8e56a 1122
1123 sub pi () { 3.14159 } # Not exact, but close.
1124 sub PI () { 4 * atan2 1, 1 } # As good as it gets,
1125 # and it's inlined, too!
44a8e56a 1126 sub ST_DEV () { 0 }
1127 sub ST_INO () { 1 }
1128
1129 sub FLAG_FOO () { 1 << 8 }
1130 sub FLAG_BAR () { 1 << 9 }
1131 sub FLAG_MASK () { FLAG_FOO | FLAG_BAR }
54310121 1132
1133 sub OPT_BAZ () { not (0x1B58 & FLAG_MASK) }
44a8e56a 1134 sub BAZ_VAL () {
1135 if (OPT_BAZ) {
1136 return 23;
1137 }
1138 else {
1139 return 42;
1140 }
1141 }
cb1a09d0 1142
54310121 1143 sub N () { int(BAZ_VAL) / 3 }
1144 BEGIN {
1145 my $prod = 1;
1146 for (1..N) { $prod *= $_ }
1147 sub N_FACTORIAL () { $prod }
1148 }
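
For comparison, here is a short sketch that mixes a hand-written constant
function with one declared through F<constant.pm>. The name C<DEBUG> and
the arithmetic are only illustrative.

```perl
use strict;
use warnings;
use constant DEBUG => 0;             # constant.pm builds a sub with a () prototype

sub PI () { 4 * atan2 1, 1 }

my $area = PI * 2 ** 2;              # PI takes no arguments, so * is multiplication
my $mode = DEBUG ? 'debug' : 'quiet';
printf "%.4f %s\n", $area, $mode;
```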
1149
5a964f20 1150If you redefine a subroutine that was eligible for inlining, you'll get
1151a mandatory warning. (You can use this warning to tell whether or not a
1152particular subroutine is considered constant.) The warning is
1153considered severe enough not to be optional because previously compiled
1154invocations of the function will still be using the old value of the
19799a22 1155function. If you need to be able to redefine the subroutine, you need to
4cee8e80 1156ensure that it isn't inlined, either by dropping the C<()> prototype
19799a22 1157(which changes calling semantics, so beware) or by thwarting the
1158inlining mechanism in some other way, such as
1159
4cee8e80 1160 sub not_inlined () {
54310121 1161 23 if $];
1162 }
1163
19799a22 1164=head2 Overriding Built-in Functions
a0d0e21e 1165
19799a22 1166Many built-in functions may be overridden, though this should be tried
5f05dabc 1167only occasionally and for good reason. Typically this might be
19799a22 1168done by a package attempting to emulate missing built-in functionality
1169on a non-Unix system.
1170
1171Overriding may be done only by importing the name from a module at
1172compile time--ordinary predeclaration isn't good enough. However, the
1173C<use subs> pragma lets you, in effect, predeclare subs
1174via the import syntax, and these names may then override built-in ones:
1175
1176 use subs 'chdir', 'chroot', 'chmod', 'chown';
1177 chdir $somewhere;
1178 sub chdir { ... }
1179
1180To unambiguously refer to the built-in form, precede the
1181built-in name with the special package qualifier C<CORE::>. For example,
1182saying C<CORE::open()> always refers to the built-in C<open()>, even
fb73857a 1183if the current package has imported some other subroutine called
19799a22 1184C<&open()> from elsewhere. Even though it looks like a regular
09bef843 1185function call, it isn't: you can't take a reference to it, such as
19799a22 1186the incorrect C<\&CORE::open> might appear to produce.
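
Putting the two pieces together, an override can wrap the built-in and
defer to it through C<CORE::>. This is a sketch only; the logging array
C<@visited> is a hypothetical addition.

```perl
use strict;
use warnings;
use subs 'chdir';                    # predeclare via import so it overrides

my @visited;                         # hypothetical log for the demo
sub chdir {
    my $dir = shift;
    push @visited, $dir;             # record the request...
    return CORE::chdir($dir);        # ...then defer to the real built-in
}

chdir '.';                           # calls our wrapper
CORE::chdir '.';                     # always the built-in; nothing is logged
print "logged: @visited\n";
```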
fb73857a 1187
1188Library modules should not in general export built-in names like C<open>
1189or C<chdir> as part of their default C<@EXPORT> list, because these may
a0d0e21e 1190sneak into someone else's namespace and change the semantics unexpectedly.
19799a22 1191Instead, if the module adds that name to C<@EXPORT_OK>, then it's
1192possible for a user to import the name explicitly, but not implicitly.
1193That is, they could say
1194
1195 use Module 'open';
1196
19799a22 1197and it would import the C<open> override. But if they said
1198
1199 use Module;
1200
19799a22 1201they would get the default imports without overrides.
a0d0e21e 1202
The foregoing mechanism for overriding built-ins is restricted, quite
deliberately, to the package that requests the import. There is a second
method that is sometimes applicable when you wish to override a built-in
everywhere, without regard to namespace boundaries. This is achieved by
importing a sub into the special namespace C<CORE::GLOBAL::>. Here is an
example that quite brazenly replaces the C<glob> operator with something
that understands regular expressions.
1210
1211 package REGlob;
1212 require Exporter;
1213 @ISA = 'Exporter';
1214 @EXPORT_OK = 'glob';
1215
1216 sub import {
1217 my $pkg = shift;
1218 return unless @_;
1219 my $sym = shift;
1220 my $where = ($sym =~ s/^GLOBAL_// ? 'CORE::GLOBAL' : caller(0));
1221 $pkg->export($where, $sym, @_);
1222 }
1223
1224 sub glob {
1225 my $pat = shift;
1226 my @got;
1227 local *D;
1228 if (opendir D, '.') {
1229 @got = grep /$pat/, readdir D;
1230 closedir D;
1231 }
1232 return @got;
1233 }
1234 1;
1235
1236And here's how it could be (ab)used:
1237
1238 #use REGlob 'GLOBAL_glob'; # override glob() in ALL namespaces
1239 package Foo;
1240 use REGlob 'glob'; # override glob() in Foo:: only
1241 print for <^[a-z_]+\.pm\$>; # show all pragmatic modules
1242
19799a22 1243The initial comment shows a contrived, even dangerous example.
95d94a4f 1244By overriding C<glob> globally, you would be forcing the new (and
19799a22 1245subversive) behavior for the C<glob> operator for I<every> namespace,
1246without the complete cognizance or cooperation of the modules that own
1247those namespaces. Naturally, this should be done with extreme caution--if
1248it must be done at all.
1249
The C<REGlob> example above does not implement all the support needed to
cleanly override perl's C<glob> operator. The built-in C<glob> has
different behaviors depending on whether it appears in a scalar or list
context, but our C<REGlob> doesn't. Indeed, many perl built-ins have such
context-sensitive behaviors, and these must be adequately supported by
a properly written override. For a fully functional example of overriding
C<glob>, study the implementation of C<File::DosGlob> in the standard
library.
1258
1259When you override a built-in, your replacement should be consistent (if
1260possible) with the built-in native syntax. You can achieve this by using
1261a suitable prototype. To get the prototype of an overridable built-in,
1262use the C<prototype> function with an argument of C<"CORE::builtin_name">
1263(see L<perlfunc/prototype>).
1264
1265Note however that some built-ins can't have their syntax expressed by a
1266prototype (such as C<system> or C<chomp>). If you override them you won't
1267be able to fully mimic their original syntax.
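
The following sketch queries the prototypes of three built-ins. The choice
of C<rename>, C<time>, and C<system> is only illustrative; an undefined
result means the built-in's syntax cannot be expressed as a prototype.

```perl
use strict;
use warnings;

my $rename = prototype("CORE::rename");   # two scalar arguments
my $time   = prototype("CORE::time");     # empty string: no arguments
my $system = prototype("CORE::system");   # undef: not expressible

for my $pair ([rename => $rename], [time => $time], [system => $system]) {
    my ($name, $proto) = @$pair;
    printf "%-6s %s\n", $name, defined $proto ? "($proto)" : "no prototype";
}
```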
1268
fe854a6f 1269The built-ins C<do>, C<require> and C<glob> can also be overridden, but due
1270to special magic, their original syntax is preserved, and you don't have
1271to define a prototype for their replacements. (You can't override the
1272C<do BLOCK> syntax, though).
1273
1274C<require> has special additional dark magic: if you invoke your
1275C<require> replacement as C<require Foo::Bar>, it will actually receive
1276the argument C<"Foo/Bar.pm"> in @_. See L<perlfunc/require>.
1277
1278And, as you'll have noticed from the previous example, if you override
593b9c14 1279C<glob>, the C<< <*> >> glob operator is overridden as well.
77bc9082 1280
1281In a similar fashion, overriding the C<readline> function also overrides
1282the equivalent I/O operator C<< <FILEHANDLE> >>.
1283
fe854a6f 1284Finally, some built-ins (e.g. C<exists> or C<grep>) can't be overridden.
77bc9082 1285
1286=head2 Autoloading
1287
1288If you call a subroutine that is undefined, you would ordinarily
1289get an immediate, fatal error complaining that the subroutine doesn't
1290exist. (Likewise for subroutines being used as methods, when the
1291method doesn't exist in any base class of the class's package.)
1292However, if an C<AUTOLOAD> subroutine is defined in the package or
1293packages used to locate the original subroutine, then that
1294C<AUTOLOAD> subroutine is called with the arguments that would have
1295been passed to the original subroutine. The fully qualified name
1296of the original subroutine magically appears in the global $AUTOLOAD
1297variable of the same package as the C<AUTOLOAD> routine. The name
1298is not passed as an ordinary argument because, er, well, just
1299because, that's why. (As an exception, a method call to a nonexistent
1300C<import> or C<unimport> method is just skipped instead.)
1301
1302Many C<AUTOLOAD> routines load in a definition for the requested
1303subroutine using eval(), then execute that subroutine using a special
1304form of goto() that erases the stack frame of the C<AUTOLOAD> routine
1305without a trace. (See the source to the standard module documented
1306in L<AutoLoader>, for example.) But an C<AUTOLOAD> routine can
1307also just emulate the routine and never define it. For example,
1308let's pretend that a function that wasn't defined should just invoke
1309C<system> with those arguments. All you'd do is:
1310
1311 sub AUTOLOAD {
1312 my $program = $AUTOLOAD;
1313 $program =~ s/.*:://;
1314 system($program, @_);
54310121 1315 }
cb1a09d0 1316 date();
6d28dffb 1317 who('am', 'i');
1318 ls('-l');
1319
1320In fact, if you predeclare functions you want to call that way, you don't
1321even need parentheses:
1322
1323 use subs qw(date who ls);
1324 date;
1325 who "am", "i";
593b9c14 1326 ls '-l';
1327
1328A more complete example of this is the standard Shell module, which
19799a22 1329can treat undefined subroutine calls as calls to external programs.
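
The other style mentioned above, defining the routine on first use and then
jumping to it with C<goto>, might look like the following. It is a sketch
under assumptions: the package C<Counter> and its C<%count> table are
invented for the example, and the routine is built from a closure rather
than with C<eval>.

```perl
use strict;
use warnings;

package Counter;                     # hypothetical package for the demo

our $AUTOLOAD;
my %count = (hits => 0, misses => 0);

sub AUTOLOAD {
    my $name = $AUTOLOAD;
    $name =~ s/.*:://;               # strip the package qualifier
    return if $name eq 'DESTROY';
    die "No such routine: $name\n" unless exists $count{$name};

    my $code = sub { return ++$count{$name} };
    no strict 'refs';
    *{$AUTOLOAD} = $code;            # install it, so AUTOLOAD runs only once
    goto &$code;                     # replace this frame with the new sub
}

package main;

Counter::hits() for 1 .. 2;
my $third     = Counter::hits();
my $installed = defined &Counter::hits ? 1 : 0;
print "third call returned $third; installed: $installed\n";
```

After the first call the real subroutine exists in the symbol table, so
later calls never reach C<AUTOLOAD>.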
a0d0e21e 1330
Mechanisms are available to help module writers split their modules
into autoloadable files. See the standard AutoLoader module
described in L<AutoLoader> and in L<AutoSplit>, the standard
SelfLoader module in L<SelfLoader>, and the document on adding C
functions to Perl code in L<perlxs>.
cb1a09d0 1336
1337=head2 Subroutine Attributes
1338
1339A subroutine declaration or definition may have a list of attributes
1340associated with it. If such an attribute list is present, it is
0120eecf 1341broken up at space or colon boundaries and treated as though a
1342C<use attributes> had been seen. See L<attributes> for details
1343about what attributes are currently supported.
1344Unlike the limitation with the obsolescent C<use attrs>, the
1345C<sub : ATTRLIST> syntax works to associate the attributes with
1346a pre-declaration, and not just with a subroutine definition.
1347
1348The attributes must be valid as simple identifier names (without any
1349punctuation other than the '_' character). They may have a parameter
1350list appended, which is only checked for whether its parentheses ('(',')')
1351nest properly.
1352
1353Examples of valid syntax (even though the attributes are unknown):
1354
1355 sub fnord (&\%) : switch(10,foo(7,3)) : expensive ;
1356 sub plugh () : Ugly('\(") :Bad ;
1357 sub xyzzy : _5x5 { ... }
1358
1359Examples of invalid syntax:
1360
1361 sub fnord : switch(10,foo() ; # ()-string not balanced
1362 sub snoid : Ugly('(') ; # ()-string not balanced
1363 sub xyzzy : 5x5 ; # "5x5" not a valid identifier
1364 sub plugh : Y2::north ; # "Y2::north" not a simple identifier
0120eecf 1365 sub snurt : foo + bar ; # "+" not a colon or space
1366
1367The attribute list is passed as a list of constant strings to the code
1368which associates them with the subroutine. In particular, the second example
1369of valid syntax above currently looks like this in terms of how it's
1370parsed and invoked:
1371
1372 use attributes __PACKAGE__, \&plugh, q[Ugly('\(")], 'Bad';
1373
1374For further details on attribute lists and their manipulation,
a0ae32d3 1375see L<attributes> and L<Attribute::Handlers>.
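
As a small sketch of an attribute that perl itself understands, the
following declares a subroutine with the built-in C<lvalue> attribute and
reads the attribute list back with C<attributes::get>. The subroutine name
C<slot> and the variable C<$value> are hypothetical.

```perl
use strict;
use warnings;
use attributes ();

my $value = 1;
sub slot : lvalue { $value }         # 'lvalue' is a built-in attribute

slot() = 5;                          # assigns through the sub to $value
my @attrs = attributes::get(\&slot);
print "value=$value attrs=@attrs\n";
```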
09bef843 1376
cb1a09d0 1377=head1 SEE ALSO
a0d0e21e 1378
1379See L<perlref/"Function Templates"> for more about references and closures.
1380See L<perlxs> if you'd like to learn about calling C subroutines from Perl.
a2293a43 1381See L<perlembed> if you'd like to learn about calling Perl subroutines from C.
1382See L<perlmod> to learn about bundling up your functions in separate files.
1383See L<perlmodlib> to learn what library modules come standard on your system.
1384See L<perltoot> to learn how to make object method calls.