=head1 NAME
X<subroutine> X<function>

perlsub - Perl subroutines

=head1 SYNOPSIS

To declare subroutines:
X<subroutine, declaration> X<sub>

    sub NAME;                       # A "forward" declaration.
    sub NAME(PROTO);                #  ditto, but with prototypes
    sub NAME : ATTRS;               #  with attributes
    sub NAME(PROTO) : ATTRS;        #  with attributes and prototypes

    sub NAME BLOCK                  # A declaration and a definition.
    sub NAME(PROTO) BLOCK           #  ditto, but with prototypes
    sub NAME SIG BLOCK              #  with signature
    sub NAME : ATTRS BLOCK          #  with attributes
    sub NAME(PROTO) : ATTRS BLOCK   #  with prototypes and attributes
    sub NAME : ATTRS SIG BLOCK      #  with attributes and signature

To define an anonymous subroutine at runtime:
X<subroutine, anonymous>

    $subref = sub BLOCK;                 # no proto
    $subref = sub (PROTO) BLOCK;         # with proto
    $subref = sub SIG BLOCK;             # with signature
    $subref = sub : ATTRS BLOCK;         # with attributes
    $subref = sub (PROTO) : ATTRS BLOCK; # with proto and attributes
    $subref = sub : ATTRS SIG BLOCK;     # with attribs and signature

To import subroutines:
X<import>

    use MODULE qw(NAME1 NAME2 NAME3);

To call subroutines:
X<subroutine, call> X<call>

    NAME(LIST);    # & is optional with parentheses.
    NAME LIST;     # Parentheses optional if predeclared/imported.
    &NAME(LIST);   # Circumvent prototypes.
    &NAME;         # Makes current @_ visible to called subroutine.

=head1 DESCRIPTION

Like many languages, Perl provides for user-defined subroutines.
These may be located anywhere in the main program, loaded in from
other files via the C<do>, C<require>, or C<use> keywords, or
generated on the fly using C<eval> or anonymous subroutines.
You can even call a function indirectly using a variable containing
its name or a CODE reference.

The Perl model for function call and return values is simple: all
functions are passed as parameters one single flat list of scalars, and
all functions likewise return to their caller one single flat list of
scalars.  Any arrays or hashes in these call and return lists will
collapse, losing their identities--but you may always use
pass-by-reference instead to avoid this.  Both call and return lists may
contain as many or as few scalar elements as you'd like.  (Often a
function without an explicit return statement is called a subroutine, but
there's really no difference from Perl's perspective.)
X<subroutine, parameter> X<parameter>

Any arguments passed in show up in the array C<@_>.
(They may also show up in lexical variables introduced by a signature;
see L</Signatures> below.)  Therefore, if
you called a function with two arguments, those would be stored in
C<$_[0]> and C<$_[1]>.  The array C<@_> is a local array, but its
elements are aliases for the actual scalar parameters.  In particular,
if an element C<$_[0]> is updated, the corresponding argument is
updated (or an error occurs if it is not updatable).  If an argument
is an array or hash element which did not exist when the function
was called, that element is created only when (and if) it is modified
or a reference to it is taken.  (Some earlier versions of Perl
created the element whether or not the element was assigned to.)
Assigning to the whole array C<@_> removes that aliasing, and does
not update any arguments.
X<subroutine, argument> X<argument> X<@_>

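The aliasing rules above can be illustrated with a minimal sketch.  The
two helper names, C<double_in_place> and C<double_copy>, are hypothetical
and exist only for this illustration; the second one replaces the whole
of C<@_> before modifying it, which is the case the last sentence above
describes.

```perl
use strict;
use warnings;

# Hypothetical helper: writes through the alias in @_.
sub double_in_place {
    $_[0] *= 2;           # updates the caller's variable
    return;
}

# Hypothetical helper: assigns to the whole of @_ first,
# which replaces the aliases with fresh copies.
sub double_copy {
    my @copy = @_;
    @_ = @copy;           # @_ no longer aliases the caller's arguments
    $_[0] *= 2;           # changes only the copy
    return;
}

my $n = 21;
double_in_place($n);      # $n has been doubled

my $m = 21;
double_copy($m);          # $m is unchanged
```
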
A C<return> statement may be used to exit a subroutine, optionally
specifying the returned value, which will be evaluated in the
appropriate context (list, scalar, or void) depending on the context of
the subroutine call.  If you specify no return value, the subroutine
returns an empty list in list context, the undefined value in scalar
context, or nothing in void context.  If you return one or more
aggregates (arrays and hashes), these will be flattened together into
one large indistinguishable list.

If no C<return> is found and if the last statement is an expression, its
value is returned.  If the last statement is a loop control structure
like a C<foreach> or a C<while>, the returned value is unspecified.  The
empty sub returns the empty list.
X<subroutine, return value> X<return value> X<return>

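As a small sketch of these return rules (the sub names C<nothing> and
C<last_expr> are made up for the illustration), a bare C<return> yields
different results depending on the calling context, and a sub without
C<return> hands back the value of its last expression:

```perl
use strict;
use warnings;

# Hypothetical sub with a bare return.
sub nothing { return }

# Hypothetical sub with no return statement: the value of the
# last expression evaluated becomes the return value.
sub last_expr {
    my ($x) = @_;
    $x * 2;
}

my @list   = nothing();       # list context: empty list
my $scalar = nothing();       # scalar context: undef
my $count  = () = nothing();  # number of elements returned in list context
my $twice  = last_expr(21);   # value of the final expression
```
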
Aside from an experimental facility (see L</Signatures> below),
Perl does not have named formal parameters.  In practice all you
do is assign to a C<my()> list of these.  Variables that aren't
declared to be private are global variables.  For gory details
on creating private variables, see L<"Private Variables via my()">
and L<"Temporary Values via local()">.  To create protected
environments for a set of functions in a separate package (and
probably a separate file), see L<perlmod/"Packages">.
X<formal parameter> X<parameter, formal>

Example:

    sub max {
        my $max = shift(@_);
        foreach $foo (@_) {
            $max = $foo if $max < $foo;
        }
        return $max;
    }
    $bestday = max($mon,$tue,$wed,$thu,$fri);

Example:

    # get a line, combining continuation lines
    #  that start with whitespace

    sub get_line {
        $thisline = $lookahead;  # global variables!
        LINE: while (defined($lookahead = <STDIN>)) {
            if ($lookahead =~ /^[ \t]/) {
                $thisline .= $lookahead;
            }
            else {
                last LINE;
            }
        }
        return $thisline;
    }

    $lookahead = <STDIN>;       # get first line
    while (defined($line = get_line())) {
        ...
    }

Assigning to a list of private variables to name your arguments:

    sub maybeset {
        my($key, $value) = @_;
        $Foo{$key} = $value unless $Foo{$key};
    }

Because the assignment copies the values, this also has the effect
of turning call-by-reference into call-by-value.  Otherwise a
function is free to do in-place modifications of C<@_> and change
its caller's values.
X<call-by-reference> X<call-by-value>

    upcase_in($v1, $v2);  # this changes $v1 and $v2
    sub upcase_in {
        for (@_) { tr/a-z/A-Z/ }
    }

You aren't allowed to modify constants in this way, of course.  If an
argument were actually literal and you tried to change it, you'd take a
(presumably fatal) exception.   For example, this won't work:
X<call-by-reference> X<call-by-value>

    upcase_in("frederick");

It would be much safer if the C<upcase_in()> function
were written to return a copy of its parameters instead
of changing them in place:

    ($v3, $v4) = upcase($v1, $v2);  # this doesn't change $v1 and $v2
    sub upcase {
        return unless defined wantarray;  # void context, do nothing
        my @parms = @_;
        for (@parms) { tr/a-z/A-Z/ }
        return wantarray ? @parms : $parms[0];
    }

Notice how this (unprototyped) function doesn't care whether it was
passed real scalars or arrays.  Perl sees all arguments as one big,
long, flat parameter list in C<@_>.  This is one area where
Perl's simple argument-passing style shines.  The C<upcase()>
function would work perfectly well without changing the C<upcase()>
definition even if we fed it things like this:

    @newlist   = upcase(@list1, @list2);
    @newlist   = upcase( split /:/, $var );

Do not, however, be tempted to do this:

    (@a, @b)   = upcase(@list1, @list2);

Like the flattened incoming parameter list, the return list is also
flattened on return.  So all you have managed to do here is store
everything in C<@a> and make C<@b> empty.  See
L</Pass by Reference> for alternatives.

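One way to keep the two lists apart is to pass and return array
references.  The following is only a sketch of that idea, not the
document's own solution: C<upcase_lists> is a hypothetical name, and it
uses C<uc> where C<upcase()> above uses C<tr/a-z/A-Z/>.

```perl
use strict;
use warnings;

# Hypothetical variant of upcase() that takes array references and
# returns one new array reference per input list.
sub upcase_lists {
    my @results;
    for my $list (@_) {
        push @results, [ map { uc $_ } @$list ];
    }
    return @results;            # a flat list, but of references
}

my @list1 = qw(a b);
my @list2 = qw(c d e);

my ($aref, $bref) = upcase_lists(\@list1, \@list2);
my @a = @$aref;                 # upper-cased copy of @list1
my @b = @$bref;                 # upper-cased copy of @list2
```
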
A subroutine may be called using an explicit C<&> prefix.  The
C<&> is optional in modern Perl, as are parentheses if the
subroutine has been predeclared.  The C<&> is I<not> optional
when just naming the subroutine, such as when it's used as
an argument to defined() or undef().  Nor is it optional when you
want to do an indirect subroutine call with a subroutine name or
reference using the C<&$subref()> or C<&{$subref}()> constructs,
although the C<< $subref->() >> notation solves that problem.
See L<perlref> for more about all that.
X<&>

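The forms mentioned above can be seen side by side in a short sketch.
The sub C<greet> and the name C<no_such_sub> are hypothetical and serve
only to illustrate the syntax.

```perl
use strict;
use warnings;

sub greet { return "hello, $_[0]" }    # hypothetical sub

my $by_ref  = \&greet;                 # & is required to name the sub
my $exists  = defined &greet;          # true: the sub is defined
my $missing = defined &no_such_sub;    # false; nothing is called

my $r1 = &$by_ref("world");            # indirect call with &
my $r2 = &{$by_ref}("world");          # the same, with braces
my $r3 = $by_ref->("world");           # arrow notation, no & needed
```
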
Subroutines may be called recursively.  If a subroutine is called
using the C<&> form, the argument list is optional, and if omitted,
no C<@_> array is set up for the subroutine: the C<@_> array at the
time of the call is visible to the subroutine instead.  This is an
efficiency mechanism that new users may wish to avoid.
X<recursion>

    &foo(1,2,3);        # pass three arguments
    foo(1,2,3);         # the same

    foo();              # pass a null list
    &foo();             # the same

    &foo;               # foo() gets current args, like foo(@_) !!
    foo;                # like foo() IFF sub foo predeclared, else "foo"

Not only does the C<&> form make the argument list optional, it also
disables any prototype checking on arguments you do provide.  This
is partly for historical reasons, and partly for having a convenient way
to cheat if you know what you're doing.  See L</Prototypes> below.
X<&>

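The difference between C<&foo;> and C<&foo();> is easy to miss, so here
is a minimal sketch of it.  Both C<show_args> and C<wrapper> are
hypothetical names chosen for the illustration.

```perl
use strict;
use warnings;

sub show_args { return scalar @_ }     # hypothetical: counts its arguments

sub wrapper {
    my $shared = &show_args;           # no list: callee sees wrapper's @_
    my $empty  = &show_args();         # explicit empty list: a fresh @_
    return ($shared, $empty);
}

my ($shared, $empty) = wrapper(1, 2, 3);
```

Here C<$shared> counts the three arguments that were passed to
C<wrapper>, while C<$empty> counts none.
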
Since Perl 5.16.0, the C<__SUB__> token is available under C<use feature
'current_sub'> and C<use 5.16.0>.  It will evaluate to a reference to the
currently-running sub, which allows for recursive calls without knowing
your subroutine's name.

    use 5.16.0;
    my $factorial = sub {
        my ($x) = @_;
        return 1 if $x == 1;
        return($x * __SUB__->( $x - 1 ) );
    };

The behaviour of C<__SUB__> within a regex code block (such as C</(?{...})/>)
is subject to change.

Subroutines whose names are in all upper case are reserved to the Perl
core, as are modules whose names are in all lower case.  A subroutine in
all capitals is a loosely-held convention meaning it will be called
indirectly by the run-time system itself, usually due to a triggered event.
Subroutines whose names start with a left parenthesis are also reserved in
the same way.  The following is a list of some subroutines that currently do
special, pre-defined things.

=over

=item documented later in this document

C<AUTOLOAD>

=item documented in L<perlmod>

C<CLONE>, C<CLONE_SKIP>

=item documented in L<perlobj>

C<DESTROY>

=item documented in L<perltie>

C<BINMODE>, C<CLEAR>, C<CLOSE>, C<DELETE>, C<DESTROY>, C<EOF>, C<EXISTS>,
C<EXTEND>, C<FETCH>, C<FETCHSIZE>, C<FILENO>, C<FIRSTKEY>, C<GETC>,
C<NEXTKEY>, C<OPEN>, C<POP>, C<PRINT>, C<PRINTF>, C<PUSH>, C<READ>,
C<READLINE>, C<SCALAR>, C<SEEK>, C<SHIFT>, C<SPLICE>, C<STORE>,
C<STORESIZE>, C<TELL>, C<TIEARRAY>, C<TIEHANDLE>, C<TIEHASH>,
C<TIESCALAR>, C<UNSHIFT>, C<UNTIE>, C<WRITE>

=item documented in L<PerlIO::via>

C<BINMODE>, C<CLEARERR>, C<CLOSE>, C<EOF>, C<ERROR>, C<FDOPEN>, C<FILENO>,
C<FILL>, C<FLUSH>, C<OPEN>, C<POPPED>, C<PUSHED>, C<READ>, C<SEEK>,
C<SETLINEBUF>, C<SYSOPEN>, C<TELL>, C<UNREAD>, C<UTF8>, C<WRITE>

=item documented in L<perlfunc>

L<< C<import> | perlfunc/use >>, L<< C<unimport> | perlfunc/use >>,
L<< C<INC> | perlfunc/require >>

=item documented in L<UNIVERSAL>

C<VERSION>

=item documented in L<perldebguts>

C<DB::DB>, C<DB::sub>, C<DB::lsub>, C<DB::goto>, C<DB::postponed>

=item undocumented, used internally by the L<overload> feature

any starting with C<(>

=back

The C<BEGIN>, C<UNITCHECK>, C<CHECK>, C<INIT> and C<END> subroutines
are not so much subroutines as named special code blocks, of which you
can have more than one in a package, and which you can B<not> call
explicitly.  See L<perlmod/"BEGIN, UNITCHECK, CHECK, INIT and END">.

=head2 Signatures

B<WARNING>: Subroutine signatures are experimental.  The feature may be
modified or removed in future versions of Perl.

Perl has an experimental facility to allow a subroutine's formal
parameters to be introduced by special syntax, separate from the
procedural code of the subroutine body.  The formal parameter list
is known as a I<signature>.  The facility must be enabled first by a
pragmatic declaration, C<use feature 'signatures'>, and it will produce
a warning unless the "experimental::signatures" warnings category is
disabled.

The signature is part of a subroutine's body.  Normally the body of a
subroutine is simply a braced block of code.  When using a signature,
the signature is a parenthesised list that goes immediately before
the braced block.  The signature declares lexical variables that are
in scope for the block.  When the subroutine is called, the signature
takes control first.  It populates the signature variables from the
list of arguments that were passed.  If the argument list doesn't meet
the requirements of the signature, then it will throw an exception.
When the signature processing is complete, control passes to the block.

Positional parameters are handled by simply naming scalar variables in
the signature.  For example,

    sub foo ($left, $right) {
        return $left + $right;
    }

takes two positional parameters, which must be filled at runtime by
two arguments.  By default the parameters are mandatory, and it is
not permitted to pass more arguments than expected.  So the above is
equivalent to

    sub foo {
        die "Too many arguments for subroutine" unless @_ <= 2;
        die "Too few arguments for subroutine" unless @_ >= 2;
        my $left = $_[0];
        my $right = $_[1];
        return $left + $right;
    }

An argument can be ignored by omitting the main part of the name from
a parameter declaration, leaving just a bare C<$> sigil.  For example,

    sub foo ($first, $, $third) {
        return "first=$first, third=$third";
    }

Although the ignored argument doesn't go into a variable, it is still
mandatory for the caller to pass it.

A positional parameter is made optional by giving a default value,
separated from the parameter name by C<=>:

    sub foo ($left, $right = 0) {
        return $left + $right;
    }

The above subroutine may be called with either one or two arguments.
The default value expression is evaluated when the subroutine is called,
so it may provide different default values for different calls.  It is
only evaluated if the argument was actually omitted from the call.
For example,

    my $auto_id = 0;
    sub foo ($thing, $id = $auto_id++) {
        print "$thing has ID $id";
    }

automatically assigns distinct sequential IDs to things for which no
ID was supplied by the caller.  A default value expression may also
refer to parameters earlier in the signature, making the default for
one parameter vary according to the earlier parameters.  For example,

    sub foo ($first_name, $surname, $nickname = $first_name) {
        print "$first_name $surname is known as \"$nickname\"";
    }

An optional parameter can be nameless just like a mandatory parameter.
For example,

    sub foo ($thing, $ = 1) {
        print $thing;
    }

The parameter's default value will still be evaluated if the corresponding
argument isn't supplied, even though the value won't be stored anywhere.
This is in case evaluating it has important side effects.  However, it
will be evaluated in void context, so if it doesn't have side effects
and is not trivial it will generate a warning if the "void" warning
category is enabled.  If a nameless optional parameter's default value
is not important, it may be omitted just as the parameter's name was:

    sub foo ($thing, $=) {
        print $thing;
    }

Optional positional parameters must come after all mandatory positional
parameters.  (If there are no mandatory positional parameters then an
optional positional parameter can be the first thing in the signature.)
If there are multiple optional positional parameters and not enough
arguments are supplied to fill them all, they will be filled from left
to right.

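To see the left-to-right filling of optional parameters in action,
consider the following sketch.  The sub C<endpoint> and its parameter
names are hypothetical; the C<no warnings> line merely silences the
experimental-feature warning described above.

```perl
use strict;
use warnings;
use feature 'signatures';
no warnings 'experimental::signatures';

# Hypothetical sub: $port and $scheme are optional and are filled
# from left to right by whatever arguments are supplied.
sub endpoint ($host, $port = 80, $scheme = 'http') {
    return "${scheme}://${host}:${port}";
}

my $e1 = endpoint('example.org');                # both defaults used
my $e2 = endpoint('example.org', 8080);          # only $scheme defaulted
my $e3 = endpoint('example.org', 443, 'https');  # no defaults used
my $ok = eval { endpoint(); 1 };                 # mandatory $host missing
```

The last call omits the mandatory parameter, so the signature throws an
exception and C<$ok> stays undefined.
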
After positional parameters, additional arguments may be captured in a
slurpy parameter.  The simplest form of this is just an array variable:

    sub foo ($filter, @inputs) {
        print $filter->($_) foreach @inputs;
    }

With a slurpy parameter in the signature, there is no upper limit on how
many arguments may be passed.  A slurpy array parameter may be nameless
just like a positional parameter, in which case its only effect is to
turn off the argument limit that would otherwise apply:

    sub foo ($thing, @) {
        print $thing;
    }

A slurpy parameter may instead be a hash, in which case the arguments
available to it are interpreted as alternating keys and values.
There must be as many keys as values: if there is an odd argument then
an exception will be thrown.  Keys will be stringified, and if there are
duplicates then the later instance takes precedence over the earlier,
as with standard hash construction.

    sub foo ($filter, %inputs) {
        print $filter->($_, $inputs{$_}) foreach sort keys %inputs;
    }

A slurpy hash parameter may be nameless just like other kinds of
parameter.  It still insists that the number of arguments available to
it be even, even though they're not being put into a variable.

    sub foo ($thing, %) {
        print $thing;
    }

A slurpy parameter, either array or hash, must be the last thing in the
signature.  It may follow mandatory and optional positional parameters;
it may also be the only thing in the signature.  Slurpy parameters cannot
have default values: if no arguments are supplied for them then you get
an empty array or empty hash.

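A slurpy hash is a convenient way to accept named options.  The
following sketch, built around a hypothetical sub C<describe>, shows the
duplicate-key and odd-argument rules described above.

```perl
use strict;
use warnings;
use feature 'signatures';
no warnings 'experimental::signatures';

# Hypothetical sub: one positional parameter followed by key/value pairs.
sub describe ($name, %attr) {
    return join ' ', $name, map { $_ . '=' . $attr{$_} } sort keys %attr;
}

my $d1 = describe('box', width => 2, height => 3);  # keys come out sorted
my $d2 = describe('box', size => 1, size => 9);     # later duplicate wins
my $ok = eval { describe('box', 'width'); 1 };      # odd count: exception
```
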
A signature may be entirely empty, in which case all it does is check
that the caller passed no arguments:

    sub foo () {
        return 123;
    }

When using a signature, the arguments are still available in the special
array variable C<@_>, in addition to the lexical variables of the
signature.  There is a difference between the two ways of accessing the
arguments: C<@_> I<aliases> the arguments, but the signature variables
get I<copies> of the arguments.  So writing to a signature variable
only changes that variable, and has no effect on the caller's variables,
but writing to an element of C<@_> modifies whatever the caller used to
supply that argument.

There is a potential syntactic ambiguity between signatures and prototypes
(see L</Prototypes>), because both start with an opening parenthesis and
both can appear in some of the same places, such as just after the name
in a subroutine declaration.  For historical reasons, when signatures
are not enabled, any opening parenthesis in such a context will trigger
very forgiving prototype parsing.  Most signatures will be interpreted
as prototypes in those circumstances, but won't be valid prototypes.
(A valid prototype cannot contain any alphabetic character.)  This will
lead to somewhat confusing error messages.

To avoid ambiguity, when signatures are enabled the special syntax
for prototypes is disabled.  There is no attempt to guess whether a
parenthesised group was intended to be a prototype or a signature.
To give a subroutine a prototype under these circumstances, use a
L<prototype attribute|attributes/Built-in Attributes>.  For example,

    sub foo :prototype($) { $_[0] }

It is entirely possible for a subroutine to have both a prototype and
a signature.  They do different jobs: the prototype affects compilation
of calls to the subroutine, and the signature puts argument values into
lexical variables at runtime.  You can therefore write

    sub foo :prototype($$) ($left, $right) {
        return $left + $right;
    }

The prototype attribute, and any other attributes, must come before
the signature.  The signature always immediately precedes the block of
the subroutine's body.

=head2 Private Variables via my()
X<my> X<variable, lexical> X<lexical> X<lexical variable> X<scope, lexical>
X<lexical scope> X<attributes, my>

Synopsis:

    my $foo;            # declare $foo lexically local
    my (@wid, %get);    # declare list of variables local
    my $foo = "flurp";  # declare $foo lexical, and init it
    my @oof = @bar;     # declare @oof lexical, and init it
    my $x : Foo = $y;   # similar, with an attribute applied

B<WARNING>: The use of attribute lists on C<my> declarations is still
evolving.  The current semantics and interface are subject to change.
See L<attributes> and L<Attribute::Handlers>.

The C<my> operator declares the listed variables to be lexically
confined to the enclosing block, conditional (C<if/unless/elsif/else>),
loop (C<for/foreach/while/until/continue>), subroutine, C<eval>,
or C<do/require/use>'d file.  If more than one value is listed, the
list must be placed in parentheses.  All listed elements must be
legal lvalues.  Only alphanumeric identifiers may be lexically
scoped--magical built-ins like C<$/> must currently be C<local>ized
with C<local> instead.

Unlike dynamic variables created by the C<local> operator, lexical
variables declared with C<my> are totally hidden from the outside
world, including any called subroutines.  This is true if it's the
same subroutine called from itself or elsewhere--every call gets
its own copy.
X<local>

This doesn't mean that a C<my> variable declared in a statically
enclosing lexical scope would be invisible.  Only dynamic scopes
are cut off.   For example, the C<bumpx()> function below has access
to the lexical $x variable because both the C<my> and the C<sub>
occurred at the same scope, presumably file scope.

    my $x = 10;
    sub bumpx { $x++ }

An C<eval()>, however, can see lexical variables of the scope it is
being evaluated in, so long as the names aren't hidden by declarations within
the C<eval()> itself.  See L<perlref>.
X<eval, scope of>

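Both points, the file-scoped lexical seen by C<bumpx()> and the
visibility of lexicals inside a string C<eval>, can be checked with a
short sketch.  The variables C<$seen> and C<$hidden> are illustrative
names only.

```perl
use strict;
use warnings;

my $x = 10;
sub bumpx { return $x++ }        # declared in the same scope as $x

# A string eval sees the lexical $x of the scope it is evaluated in.
my $seen = eval q{ $x + 1 };

# A declaration inside the eval hides the outer $x from the code after it.
my $hidden = eval q{ my $x = 0; $x + 1 };

my $before = bumpx();            # returns the old value, then increments
```
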
The parameter list to my() may be assigned to if desired, which allows you
to initialize your variables.  (If no initializer is given for a
particular variable, it is created with the undefined value.)  Commonly
this is used to name input parameters to a subroutine.  Examples:

    $arg = "fred";        # "global" variable
    $n = cube_root(27);
    print "$arg thinks the root is $n\n";
    # prints: fred thinks the root is 3

    sub cube_root {
        my $arg = shift;  # name doesn't matter
        $arg **= 1/3;
        return $arg;
    }

The C<my> is simply a modifier on something you might assign to.  So when
you do assign to variables in its argument list, C<my> doesn't
change whether those variables are viewed as a scalar or an array.  So

    my ($foo) = <STDIN>;                # WRONG?
    my @FOO = <STDIN>;

both supply a list context to the right-hand side, while

    my $foo = <STDIN>;

supplies a scalar context.  But the following declares only one variable:

    my $foo, $bar = 1;                  # WRONG

That has the same effect as

    my $foo;
    $bar = 1;

The declared variable is not introduced (is not visible) until after
the current statement.  Thus,

    my $x = $x;

can be used to initialize a new $x with the value of the old $x, and
the expression

    my $x = 123 and $x == 123

is false unless the old $x happened to have the value C<123>.

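The C<my $x = $x> idiom is sketched below.  The example assumes a
package variable declared with C<our> as the "old" C<$x>, and the sub
name C<shadow> is hypothetical.

```perl
use strict;
use warnings;

our $x = 'outer';              # the "old" $x: a package variable

sub shadow {
    my $x = $x;                # the right-hand $x is still the old one
    $x .= ' (copy)';           # changes only the new lexical
    return $x;
}

my $result = shadow();         # the package variable is left untouched
```
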
Lexical scopes of control structures are not bounded precisely by the
braces that delimit their controlled blocks; control expressions are
part of that scope, too.  Thus in the loop

    while (my $line = <>) {
        $line = lc $line;
    } continue {
        print $line;
    }

the scope of $line extends from its declaration throughout the rest of
the loop construct (including the C<continue> clause), but not beyond
it.  Similarly, in the conditional

    if ((my $answer = <STDIN>) =~ /^yes$/i) {
        user_agrees();
    } elsif ($answer =~ /^no$/i) {
        user_disagrees();
    } else {
        chomp $answer;
        die "'$answer' is neither 'yes' nor 'no'";
    }

the scope of $answer extends from its declaration through the rest
of that conditional, including any C<elsif> and C<else> clauses,
but not beyond it.  See L<perlsyn/"Simple Statements"> for information
on the scope of variables in statements with modifiers.

The C<foreach> loop defaults to scoping its index variable dynamically
in the manner of C<local>.  However, if the index variable is
prefixed with the keyword C<my>, or if there is already a lexical
by that name in scope, then a new lexical is created instead.  Thus
in the loop
X<foreach> X<for>

    for my $i (1, 2, 3) {
        some_function();
    }

the scope of $i extends to the end of the loop, but not beyond it,
rendering the value of $i inaccessible within C<some_function()>.
X<foreach> X<for>

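The contrast between the two kinds of loop variable can be sketched as
follows.  The sub C<peek> is a hypothetical stand-in for
C<some_function()>, and the package variable is declared with C<our> so
that the example runs under C<use strict>.

```perl
use strict;
use warnings;

our $i = 'global';                 # package variable

sub peek { return $i }             # reads the package variable $i

my @lexical;
for my $i (1, 2) {                 # new lexical $i, invisible to peek()
    push @lexical, peek();
}

my @dynamic;
for $i (1, 2) {                    # package $i, scoped in the manner of local
    push @dynamic, peek();
}
```

With the lexical index, C<peek()> keeps seeing the untouched package
variable; with the dynamically scoped index it sees each loop value, and
the original value is restored once the loop ends.
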
Some users may wish to encourage the use of lexically scoped variables.
As an aid to catching implicit uses of package variables,
which are always global, if you say

    use strict 'vars';

then any variable mentioned from there to the end of the enclosing
block must either refer to a lexical variable, be predeclared via
C<our> or C<use vars>, or else must be fully qualified with the package name.
A compilation error results otherwise.  An inner block may countermand
this with C<no strict 'vars'>.

A C<my> has both a compile-time and a run-time effect.  At compile
time, the compiler takes notice of it.  The principal usefulness
of this is to quiet C<use strict 'vars'>, but it is also essential
for generation of closures as detailed in L<perlref>.  Actual
initialization is delayed until run time, though, so it gets executed
at the appropriate time, such as each time through a loop, for
example.

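The run-time half of C<my> is what makes closures created in a loop
independent of one another.  A minimal sketch, with illustrative
variable names:

```perl
use strict;
use warnings;

my @subs;
for my $n (1 .. 3) {
    my $square = $n * $n;               # a fresh $square on every iteration
    push @subs, sub { return $square }; # each closure captures its own one
}

my @values = map { $_->() } @subs;      # one value per closure
```
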
Variables declared with C<my> are not part of any package and are therefore
never fully qualified with the package name.  In particular, you're not
allowed to try to make a package variable (or other global) lexical:

    my $pack::var;      # ERROR!  Illegal syntax

In fact, dynamic variables (also known as package or global variables)
are still accessible using the fully qualified C<::> notation even while a
lexical of the same name is also visible:

    package main;
    local $x = 10;
    my    $x = 20;
    print "$x and $::x\n";

That will print out C<20> and C<10>.

You may declare C<my> variables at the outermost scope of a file
to hide any such identifiers from the world outside that file.  This
is similar in spirit to C's static variables when they are used at
the file level.  To do this with a subroutine requires the use of
a closure (an anonymous function that accesses enclosing lexicals).
If you want to create a private subroutine that cannot be called
from outside that block, you can declare a lexical variable containing
an anonymous sub reference:

    my $secret_version = '1.001-beta';
    my $secret_sub = sub { print $secret_version };
    &$secret_sub();

As long as the reference is never returned by any function within the
module, no outside module can see the subroutine, because its name is not in
any package's symbol table.  Remember that it's not I<REALLY> called
C<$some_pack::secret_version> or anything; it's just $secret_version,
unqualified and unqualifiable.

This does not work with object methods, however; all object methods
have to be in the symbol table of some package to be found.  See
L<perlref/"Function Templates"> for something of a work-around to
this.

=head2 Persistent Private Variables
X<state> X<state variable> X<static> X<variable, persistent> X<variable, static> X<closure>

There are two ways to build persistent private variables in Perl 5.10
and later.  First, you can simply use the C<state> feature.  Or, you can
use closures, if you want to stay compatible with releases older than 5.10.

=head3 Persistent variables via state()

Beginning with Perl 5.10.0, you can declare variables with the C<state>
keyword in place of C<my>.  For that to work, though, you must have
enabled that feature beforehand, either by using the C<feature> pragma, or
by using C<-E> on one-liners (see L<feature>).  Beginning with Perl 5.16,
the C<CORE::state> form does not require the
C<feature> pragma.

ad0cc46c
FC
714The C<state> keyword creates a lexical variable (following the same scoping
715rules as C<my>) that persists from one subroutine call to the next. If a
716state variable resides inside an anonymous subroutine, then each copy of
717the subroutine has its own copy of the state variable. However, the value
718of the state variable will still persist between calls to the same copy of
719the anonymous subroutine. (Don't forget that C<sub { ... }> creates a new
720subroutine each time it is executed.)
721
ba1f8e91
RGS
722For example, the following code maintains a private counter, incremented
723each time the gimme_another() function is called:
724
725 use feature 'state';
726 sub gimme_another { state $x; return ++$x }
727
ad0cc46c
FC
728And this example uses anonymous subroutines to create separate counters:
729
730 use feature 'state';
731 sub create_counter {
732 return sub { state $x; return ++$x }
733 }
734
ba1f8e91
RGS
735Also, since C<$x> is lexical, it can't be reached or modified by any Perl
736code outside.
737
f292fc7a
RS
738When combined with variable declaration, simple scalar assignment to C<state>
739variables (as in C<state $x = 42>) is executed only the first time. When such
740statements are evaluated subsequent times, the assignment is ignored. The
741behavior of this sort of assignment to non-scalar variables is undefined.
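
As a minimal sketch of that one-time initialization (the name C<next_ticket> is hypothetical and chosen only for illustration), the assignment below takes effect on the first call and is skipped afterwards:

```perl
use strict;
use warnings;
use feature 'state';

# next_ticket() is a hypothetical name used only for illustration.
sub next_ticket {
    state $ticket = 100;    # the assignment runs on the first call only
    return $ticket++;
}

my @tickets = map { next_ticket() } 1 .. 3;
print "@tickets\n";         # prints "100 101 102"
```

Had the declaration used C<my> instead of C<state>, every call would start again from 100.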
ba1f8e91
RGS
742
743=head3 Persistent variables with closures
5a964f20
TC
744
745Just because a lexical variable is lexically (also called statically)
f86cebdf 746scoped to its enclosing block, C<eval>, or C<do> FILE, this doesn't mean that
5a964f20
TC
747within a function it works like a C static. It normally works more
748like a C auto, but with implicit garbage collection.
749
Unlike local variables in C or C++, Perl's lexical variables don't
necessarily get recycled just because their scope has exited.
If something more permanent is still aware of the lexical, it will
stick around. So long as something else references a lexical, that
lexical won't be freed--which is as it should be. You wouldn't want
memory being freed until you were done using it, or kept around once you
were done. Automatic garbage collection takes care of this for you.
757
758This means that you can pass back or save away references to lexical
759variables, whereas to return a pointer to a C auto is a grave error.
760It also gives us a way to simulate C's function statics. Here's a
761mechanism for giving a function private variables with both lexical
762scoping and a static lifetime. If you do want to create something like
763C's static variables, just enclose the whole function in an extra block,
764and put the static variable outside the function but in the block.
cb1a09d0
AD
765
766 {
54310121 767 my $secret_val = 0;
cb1a09d0
AD
768 sub gimme_another {
769 return ++$secret_val;
54310121 770 }
771 }
cb1a09d0
AD
772 # $secret_val now becomes unreachable by the outside
773 # world, but retains its value between calls to gimme_another
774
54310121 775If this function is being sourced in from a separate file
cb1a09d0 776via C<require> or C<use>, then this is probably just fine. If it's
19799a22 777all in the main program, you'll need to arrange for the C<my>
cb1a09d0 778to be executed early, either by putting the whole block above
f86cebdf 779your main program, or more likely, placing merely a C<BEGIN>
ac90fb77 780code block around it to make sure it gets executed before your program
cb1a09d0
AD
781starts to run:
782
ac90fb77 783 BEGIN {
54310121 784 my $secret_val = 0;
cb1a09d0
AD
785 sub gimme_another {
786 return ++$secret_val;
54310121 787 }
788 }
cb1a09d0 789
3c10abe3
AG
790See L<perlmod/"BEGIN, UNITCHECK, CHECK, INIT and END"> about the
791special triggered code blocks, C<BEGIN>, C<UNITCHECK>, C<CHECK>,
792C<INIT> and C<END>.
cb1a09d0 793
19799a22
GS
794If declared at the outermost scope (the file scope), then lexicals
795work somewhat like C's file statics. They are available to all
796functions in that same file declared below them, but are inaccessible
797from outside that file. This strategy is sometimes used in modules
798to create private variables that the whole module can see.
5a964f20 799
cb1a09d0 800=head2 Temporary Values via local()
d74e8afc
ITB
801X<local> X<scope, dynamic> X<dynamic scope> X<variable, local>
802X<variable, temporary>
cb1a09d0 803
19799a22 804B<WARNING>: In general, you should be using C<my> instead of C<local>, because
6d28dffb 805it's faster and safer. Exceptions to this include the global punctuation
325192b1
RGS
806variables, global filehandles and formats, and direct manipulation of the
807Perl symbol table itself. C<local> is mostly used when the current value
808of a variable must be visible to called subroutines.
cb1a09d0
AD
809
810Synopsis:
811
325192b1
RGS
812 # localization of values
813
555bd962
BG
814 local $foo; # make $foo dynamically local
815 local (@wid, %get); # make list of variables local
816 local $foo = "flurp"; # make $foo dynamic, and init it
817 local @oof = @bar; # make @oof dynamic, and init it
325192b1 818
555bd962
BG
819 local $hash{key} = "val"; # sets a local value for this hash entry
820 delete local $hash{key}; # delete this entry for the current block
821 local ($cond ? $v1 : $v2); # several types of lvalues support
822 # localization
325192b1
RGS
823
824 # localization of symbols
cb1a09d0 825
555bd962
BG
826 local *FH; # localize $FH, @FH, %FH, &FH ...
827 local *merlyn = *randal; # now $merlyn is really $randal, plus
828 # @merlyn is really @randal, etc
829 local *merlyn = 'randal'; # SAME THING: promote 'randal' to *randal
830 local *merlyn = \$randal; # just alias $merlyn, not @merlyn etc
cb1a09d0 831
19799a22
GS
832A C<local> modifies its listed variables to be "local" to the
833enclosing block, C<eval>, or C<do FILE>--and to I<any subroutine
834called from within that block>. A C<local> just gives temporary
835values to global (meaning package) variables. It does I<not> create
836a local variable. This is known as dynamic scoping. Lexical scoping
837is done with C<my>, which works more like C's auto declarations.
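
The difference from lexical scoping is easiest to see from a called subroutine. The following is a small sketch, with hypothetical names (C<$mode>, C<current_mode>, C<run_quietly>), of a callee observing the temporary value installed by its caller:

```perl
use strict;
use warnings;

our $mode = 'global';               # package variable

sub current_mode { return $mode }   # sees whatever value is in effect

sub run_quietly {
    local $mode = 'quiet';          # temporary value, restored on exit
    return current_mode();          # the callee sees 'quiet'
}

print run_quietly(), "\n";          # prints "quiet"
print current_mode(), "\n";         # prints "global"
```

A C<my $mode> inside C<run_quietly> would have been invisible to C<current_mode>, which would then have returned C<global> both times.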
cb1a09d0 838
ceb12f1f 839Some types of lvalues can be localized as well: hash and array elements
325192b1
RGS
840and slices, conditionals (provided that their result is always
841localizable), and symbolic references. As for simple variables, this
842creates new, dynamically scoped values.
843
844If more than one variable or expression is given to C<local>, they must be
845placed in parentheses. This operator works
cb1a09d0 846by saving the current values of those variables in its argument list on a
5f05dabc 847hidden stack and restoring them upon exiting the block, subroutine, or
cb1a09d0
AD
848eval. This means that called subroutines can also reference the local
849variable, but not the global one. The argument list may be assigned to if
850desired, which allows you to initialize your local variables. (If no
851initializer is given for a particular variable, it is created with an
325192b1 852undefined value.)
cb1a09d0 853
19799a22 854Because C<local> is a run-time operator, it gets executed each time
325192b1
RGS
855through a loop. Consequently, it's more efficient to localize your
856variables outside the loop.
857
858=head3 Grammatical note on local()
d74e8afc 859X<local, context>
cb1a09d0 860
f86cebdf
GS
861A C<local> is simply a modifier on an lvalue expression. When you assign to
862a C<local>ized variable, the C<local> doesn't change whether its list is viewed
cb1a09d0
AD
863as a scalar or an array. So
864
865 local($foo) = <STDIN>;
866 local @FOO = <STDIN>;
867
5f05dabc 868both supply a list context to the right-hand side, while
cb1a09d0
AD
869
870 local $foo = <STDIN>;
871
872supplies a scalar context.
873
325192b1 874=head3 Localization of special variables
d74e8afc 875X<local, special variable>
3e3baf6d 876
325192b1
RGS
877If you localize a special variable, you'll be giving a new value to it,
878but its magic won't go away. That means that all side-effects related
879to this magic still work with the localized value.
3e3baf6d 880
325192b1
RGS
This feature allows code like this to work:
882
883 # Read the whole contents of FILE in $slurp
884 { local $/ = undef; $slurp = <FILE>; }
885
Note, however, that this restricts localization of some values; for
example, the following statement dies, as of perl 5.10.0, with the error
I<Modification of a read-only value attempted>, because the C<$1> variable is
magical and read-only:
890
891 local $1 = 2;
892
658a9f31
JD
893One exception is the default scalar variable: starting with perl 5.14
894C<local($_)> will always strip all magic from $_, to make it possible
895to safely reuse $_ in a subroutine.
325192b1
RGS
896
897B<WARNING>: Localization of tied arrays and hashes does not currently
898work as described.
fd5a896a
DM
899This will be fixed in a future release of Perl; in the meantime, avoid
900code that relies on any particular behaviour of localising tied arrays
901or hashes (localising individual elements is still okay).
325192b1 902See L<perl58delta/"Localising Tied Arrays and Hashes Is Broken"> for more
fd5a896a 903details.
d74e8afc 904X<local, tie>
fd5a896a 905
325192b1 906=head3 Localization of globs
d74e8afc 907X<local, glob> X<glob>
3e3baf6d 908
325192b1
RGS
909The construct
910
911 local *name;
912
913creates a whole new symbol table entry for the glob C<name> in the
914current package. That means that all variables in its glob slot ($name,
915@name, %name, &name, and the C<name> filehandle) are dynamically reset.
916
This implies, among other things, that any magic those variables may
carry is locally lost. In other words, saying C<local */>
will not have any effect on the internal value of the input record
separator.
921
325192b1 922=head3 Localization of elements of composite types
d74e8afc 923X<local, composite type element> X<local, array element> X<local, hash element>
3e3baf6d 924
6ee623d5 925It's also worth taking a moment to explain what happens when you
f86cebdf
GS
926C<local>ize a member of a composite type (i.e. an array or hash element).
927In this case, the element is C<local>ized I<by name>. This means that
6ee623d5
GS
928when the scope of the C<local()> ends, the saved value will be
929restored to the hash element whose key was named in the C<local()>, or
930the array element whose index was named in the C<local()>. If that
931element was deleted while the C<local()> was in effect (e.g. by a
932C<delete()> from a hash or a C<shift()> of an array), it will spring
933back into existence, possibly extending an array and filling in the
934skipped elements with C<undef>. For instance, if you say
935
936 %hash = ( 'This' => 'is', 'a' => 'test' );
937 @ary = ( 0..5 );
938 {
939 local($ary[5]) = 6;
940 local($hash{'a'}) = 'drill';
941 while (my $e = pop(@ary)) {
942 print "$e . . .\n";
943 last unless $e > 3;
944 }
945 if (@ary) {
946 $hash{'only a'} = 'test';
947 delete $hash{'a'};
948 }
949 }
950 print join(' ', map { "$_ $hash{$_}" } sort keys %hash),".\n";
951 print "The array has ",scalar(@ary)," elements: ",
952 join(', ', map { defined $_ ? $_ : 'undef' } @ary),"\n";
953
954Perl will print
955
956 6 . . .
957 4 . . .
958 3 . . .
959 This is a test only a test.
960 The array has 6 elements: 0, 1, 2, undef, undef, 5
961
19799a22 962The behavior of local() on non-existent members of composite
7185e5cc
GS
963types is subject to change in future.
964
d361fafa
VP
965=head3 Localized deletion of elements of composite types
966X<delete> X<local, composite type element> X<local, array element> X<local, hash element>
967
968You can use the C<delete local $array[$idx]> and C<delete local $hash{key}>
969constructs to delete a composite type entry for the current block and restore
970it when it ends. They return the array/hash value before the localization,
971which means that they are respectively equivalent to
972
973 do {
974 my $val = $array[$idx];
975 local $array[$idx];
976 delete $array[$idx];
977 $val
978 }
979
980and
981
982 do {
983 my $val = $hash{key};
984 local $hash{key};
985 delete $hash{key};
986 $val
987 }
988
989except that for those the C<local> is scoped to the C<do> block. Slices are
990also accepted.
991
    my %hash = (
        a => [ 7, 8, 9 ],
        b => 1,
    );

    {
        my $a = delete local $hash{a};
        # $a is [ 7, 8, 9 ]
        # %hash is (b => 1)

        {
            my @nums = delete local @$a[0, 2];
            # @nums is (7, 9)
            # $a is [ undef, 8 ]

            $a->[0] = 999; # will be erased when the scope ends
        }
        # $a is back to [ 7, 8, 9 ]

    }
    # %hash is back to its original state
1013
cd06dffe 1014=head2 Lvalue subroutines
d74e8afc 1015X<lvalue> X<subroutine, lvalue>
cd06dffe 1016
cd06dffe
GS
1017It is possible to return a modifiable value from a subroutine.
1018To do this, you have to declare the subroutine to return an lvalue.
1019
1020 my $val;
1021 sub canmod : lvalue {
4a904372 1022 $val; # or: return $val;
cd06dffe
GS
1023 }
1024 sub nomod {
1025 $val;
1026 }
1027
1028 canmod() = 5; # assigns to $val
1029 nomod() = 5; # ERROR
1030
1031The scalar/list context for the subroutine and for the right-hand
1032side of assignment is determined as if the subroutine call is replaced
1033by a scalar. For example, consider:
1034
1035 data(2,3) = get_data(3,4);
1036
1037Both subroutines here are called in a scalar context, while in:
1038
1039 (data(2,3)) = get_data(3,4);
1040
1041and in:
1042
1043 (data(2),data(3)) = get_data(3,4);
1044
1045all the subroutines are called in a list context.
1046
Lvalue subroutines are convenient, but you have to keep in mind that,
when used with objects, they may violate encapsulation. A normal
mutator can check the supplied argument before setting the attribute
it is protecting; an lvalue subroutine cannot. If you require any
special processing when storing and retrieving the values, consider
using the CPAN module Sentinel or something similar.
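
To make the encapsulation concern concrete, here is a sketch using a hypothetical C<Thermostat> class (the class, its field and its method names are all invented for this example). The lvalue accessor stores whatever it is given, while the conventional mutator gets a chance to refuse:

```perl
use strict;
use warnings;

package Thermostat;    # hypothetical class

sub new { return bless { celsius => 20 }, shift }

# Lvalue accessor: callers assign straight into the hash slot.
sub celsius : lvalue { $_[0]{celsius} }

# Conventional mutator: can validate before storing.
sub set_celsius {
    my ($self, $value) = @_;
    die "temperature below absolute zero\n" if $value < -273.15;
    $self->{celsius} = $value;
    return $self;
}

package main;

my $t = Thermostat->new;
$t->celsius = -500;                        # accepted; nothing can object
print $t->celsius, "\n";                   # prints "-500"

my $ok = eval { $t->set_celsius(-500); 1 };
print "rejected: $@" unless $ok;           # prints the die message
```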
e6a32221 1053
ca40957e
FC
1054=head2 Lexical Subroutines
1055X<my sub> X<state sub> X<our sub> X<subroutine, lexical>
1056
441078c2
FC
1057B<WARNING>: Lexical subroutines are still experimental. The feature may be
1058modified or removed in future versions of Perl.
ca40957e
FC
1059
1060Lexical subroutines are only available under the C<use feature
1061'lexical_subs'> pragma, which produces a warning unless the
f1d34ca8 1062"experimental::lexical_subs" warnings category is disabled.
ca40957e
FC
1063
1064Beginning with Perl 5.18, you can declare a private subroutine with C<my>
1065or C<state>. As with state variables, the C<state> keyword is only
1066available under C<use feature 'state'> or C<use 5.010> or higher.
1067
1068These subroutines are only visible within the block in which they are
1069declared, and only after that declaration:
1070
f1d34ca8 1071 no warnings "experimental::lexical_subs";
ca40957e
FC
1072 use feature 'lexical_subs';
1073
1074 foo(); # calls the package/global subroutine
1075 state sub foo {
1076 foo(); # also calls the package subroutine
1077 }
1078 foo(); # calls "state" sub
1079 my $ref = \&foo; # take a reference to "state" sub
1080
1081 my sub bar { ... }
1082 bar(); # calls "my" sub
1083
1084To use a lexical subroutine from inside the subroutine itself, you must
1085predeclare it. The C<sub foo {...}> subroutine definition syntax respects
1086any previous C<my sub;> or C<state sub;> declaration.
1087
1088 my sub baz; # predeclaration
1089 sub baz { # define the "my" sub
1090 baz(); # recursive call
1091 }
1092
1093=head3 C<state sub> vs C<my sub>
1094
What is the difference between "state" subs and "my" subs? Each time that
execution enters a block in which "my" subs are declared, a new copy of each
sub is created. "State" subroutines persist from one execution of the
containing block to the next.
1099
1100So, in general, "state" subroutines are faster. But "my" subs are
1101necessary if you want to create closures:
1102
f1d34ca8 1103 no warnings "experimental::lexical_subs";
ca40957e
FC
1104 use feature 'lexical_subs';
1105
1106 sub whatever {
1107 my $x = shift;
1108 my sub inner {
1109 ... do something with $x ...
1110 }
1111 inner();
1112 }
1113
1114In this example, a new C<$x> is created when C<whatever> is called, and
1115also a new C<inner>, which can see the new C<$x>. A "state" sub will only
1116see the C<$x> from the first call to C<whatever>.
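
A runnable variant of the same idea, assuming Perl 5.18 or later and using the hypothetical names C<greeting_for> and C<greet>, shows each call getting a fresh "my" sub that closes over that call's variable:

```perl
use strict;
use warnings;
use feature 'lexical_subs';
no warnings 'experimental::lexical_subs';

# greeting_for() is a hypothetical name used only for illustration.
sub greeting_for {
    my $name = shift;
    my sub greet { return "hello, $name" }   # closes over this call's $name
    return greet();
}

print greeting_for('alice'), "\n";   # prints "hello, alice"
print greeting_for('bob'), "\n";     # prints "hello, bob"
```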
1117
1118=head3 C<our> subroutines
1119
1120Like C<our $variable>, C<our sub> creates a lexical alias to the package
1121subroutine of the same name.
1122
1123The two main uses for this are to switch back to using the package sub
1124inside an inner scope:
1125
f1d34ca8 1126 no warnings "experimental::lexical_subs";
ca40957e
FC
1127 use feature 'lexical_subs';
1128
1129 sub foo { ... }
1130
1131 sub bar {
1132 my sub foo { ... }
1133 {
1134 # need to use the outer foo here
1135 our sub foo;
1136 foo();
1137 }
1138 }
1139
1140and to make a subroutine visible to other packages in the same scope:
1141
1142 package MySneakyModule;
1143
f1d34ca8 1144 no warnings "experimental::lexical_subs";
ca40957e
FC
1145 use feature 'lexical_subs';
1146
1147 our sub do_something { ... }
1148
1149 sub do_something_with_caller {
1150 package DB;
1151 () = caller 1; # sets @DB::args
1152 do_something(@args); # uses MySneakyModule::do_something
1153 }
1154
cb1a09d0 1155=head2 Passing Symbol Table Entries (typeglobs)
d74e8afc 1156X<typeglob> X<*>
cb1a09d0 1157
19799a22
GS
1158B<WARNING>: The mechanism described in this section was originally
1159the only way to simulate pass-by-reference in older versions of
1160Perl. While it still works fine in modern versions, the new reference
1161mechanism is generally easier to work with. See below.
a0d0e21e
LW
1162
1163Sometimes you don't want to pass the value of an array to a subroutine
1164but rather the name of it, so that the subroutine can modify the global
1165copy of it rather than working with a local copy. In perl you can
cb1a09d0 1166refer to all objects of a particular name by prefixing the name
5f05dabc 1167with a star: C<*foo>. This is often known as a "typeglob", because the
a0d0e21e
LW
1168star on the front can be thought of as a wildcard match for all the
1169funny prefix characters on variables and subroutines and such.
1170
55497cff 1171When evaluated, the typeglob produces a scalar value that represents
5f05dabc 1172all the objects of that name, including any filehandle, format, or
a0d0e21e 1173subroutine. When assigned to, it causes the name mentioned to refer to
19799a22 1174whatever C<*> value was assigned to it. Example:
a0d0e21e
LW
1175
1176 sub doubleary {
1177 local(*someary) = @_;
1178 foreach $elem (@someary) {
1179 $elem *= 2;
1180 }
1181 }
1182 doubleary(*foo);
1183 doubleary(*bar);
1184
19799a22 1185Scalars are already passed by reference, so you can modify
a0d0e21e 1186scalar arguments without using this mechanism by referring explicitly
1fef88e7 1187to C<$_[0]> etc. You can modify all the elements of an array by passing
f86cebdf
GS
1188all the elements as scalars, but you have to use the C<*> mechanism (or
1189the equivalent reference mechanism) to C<push>, C<pop>, or change the size of
a0d0e21e
LW
1190an array. It will certainly be faster to pass the typeglob (or reference).
1191
1192Even if you don't want to modify an array, this mechanism is useful for
5f05dabc 1193passing multiple arrays in a single LIST, because normally the LIST
a0d0e21e 1194mechanism will merge all the array values so that you can't extract out
55497cff 1195the individual arrays. For more on typeglobs, see
2ae324a7 1196L<perldata/"Typeglobs and Filehandles">.
cb1a09d0 1197
5a964f20 1198=head2 When to Still Use local()
d74e8afc 1199X<local> X<variable, local>
5a964f20 1200
Despite the existence of C<my>, there are still three places where the
C<local> operator shines. In fact, in these three places, you
I<must> use C<local> instead of C<my>.
1204
13a2d996 1205=over 4
5a964f20 1206
551e1d92
RB
1207=item 1.
1208
1209You need to give a global variable a temporary value, especially $_.
5a964f20 1210
f86cebdf
GS
1211The global variables, like C<@ARGV> or the punctuation variables, must be
1212C<local>ized with C<local()>. This block reads in F</etc/motd>, and splits
5a964f20 1213it up into chunks separated by lines of equal signs, which are placed
f86cebdf 1214in C<@Fields>.
5a964f20
TC
1215
1216 {
1217 local @ARGV = ("/etc/motd");
1218 local $/ = undef;
1219 local $_ = <>;
1220 @Fields = split /^\s*=+\s*$/;
1221 }
1222
In particular, it's important to C<local>ize C<$_> in any routine that assigns
to it. Look out for implicit assignments in C<while> conditionals.
1225
551e1d92
RB
1226=item 2.
1227
1228You need to create a local file or directory handle or a local function.
5a964f20 1229
09bef843
SB
1230A function that needs a filehandle of its own must use
1231C<local()> on a complete typeglob. This can be used to create new symbol
5a964f20
TC
1232table entries:
1233
1234 sub ioqueue {
1235 local (*READER, *WRITER); # not my!
17b63f68 1236 pipe (READER, WRITER) or die "pipe: $!";
5a964f20
TC
1237 return (*READER, *WRITER);
1238 }
1239 ($head, $tail) = ioqueue();
1240
1241See the Symbol module for a way to create anonymous symbol table
1242entries.
1243
1244Because assignment of a reference to a typeglob creates an alias, this
1245can be used to create what is effectively a local function, or at least,
1246a local alias.
1247
1248 {
4a46e268 1249 local *grow = \&shrink; # only until this block exits
555bd962
BG
1250 grow(); # really calls shrink()
1251 move(); # if move() grow()s, it shrink()s too
5a964f20 1252 }
555bd962 1253 grow(); # get the real grow() again
5a964f20
TC
1254
1255See L<perlref/"Function Templates"> for more about manipulating
1256functions by name in this way.
1257
551e1d92
RB
1258=item 3.
1259
1260You want to temporarily change just one element of an array or hash.
5a964f20 1261
f86cebdf 1262You can C<local>ize just one element of an aggregate. Usually this
5a964f20
TC
1263is done on dynamics:
1264
1265 {
1266 local $SIG{INT} = 'IGNORE';
1267 funct(); # uninterruptible
1268 }
1269 # interruptibility automatically restored here
1270
9d42615f 1271But it also works on lexically declared aggregates.
5a964f20
TC
1272
1273=back
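
The last point in the list, localizing one element of a lexically declared aggregate, can be sketched as follows. The hash C<%option> and the two subroutine names are hypothetical:

```perl
use strict;
use warnings;

my %option = (verbose => 0);          # lexical aggregate

sub is_verbose { return $option{verbose} }

sub run_verbosely {
    local $option{verbose} = 1;       # only this element is localized
    return is_verbose();
}

print run_verbosely(), "\n";          # prints "1"
print is_verbose(), "\n";             # prints "0"
```

The hash itself is still a C<my> variable; only the value stored under the key C<verbose> is saved and restored.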
1274
cb1a09d0 1275=head2 Pass by Reference
d74e8afc 1276X<pass by reference> X<pass-by-reference> X<reference>
cb1a09d0 1277
55497cff 1278If you want to pass more than one array or hash into a function--or
1279return them from it--and have them maintain their integrity, then
1280you're going to have to use an explicit pass-by-reference. Before you
1281do that, you need to understand references as detailed in L<perlref>.
c07a80fd 1282This section may not make much sense to you otherwise.
cb1a09d0 1283
Here are a few simple examples. First, let's pass in several arrays
to a function and have it C<pop> all of them, returning a new list
of all their former last elements:
cb1a09d0
AD
1287
1288 @tailings = popmany ( \@a, \@b, \@c, \@d );
1289
1290 sub popmany {
1291 my $aref;
1292 my @retlist = ();
1293 foreach $aref ( @_ ) {
1294 push @retlist, pop @$aref;
54310121 1295 }
cb1a09d0 1296 return @retlist;
54310121 1297 }
cb1a09d0 1298
54310121 1299Here's how you might write a function that returns a
cb1a09d0
AD
1300list of keys occurring in all the hashes passed to it:
1301
54310121 1302 @common = inter( \%foo, \%bar, \%joe );
cb1a09d0
AD
1303 sub inter {
1304 my ($k, $href, %seen); # locals
1305 foreach $href (@_) {
1306 while ( $k = each %$href ) {
1307 $seen{$k}++;
54310121 1308 }
1309 }
cb1a09d0 1310 return grep { $seen{$_} == @_ } keys %seen;
54310121 1311 }
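
As a usage sketch, here is the same technique restated so that it runs on its own under C<use strict>, with made-up sample data. Only the key present in all three hashes survives the C<grep>:

```perl
use strict;
use warnings;

# Same technique as inter() above, restated so the sketch is self-contained.
sub inter {
    my %seen;
    for my $href (@_) {
        $seen{$_}++ for keys %$href;    # count the hashes containing each key
    }
    return grep { $seen{$_} == @_ } keys %seen;
}

# Sample data invented for this example.
my %foo = (apple => 1, pear  => 2);
my %bar = (pear  => 3, plum  => 4);
my %joe = (pear  => 5, apple => 6);

my @common = inter(\%foo, \%bar, \%joe);
print "@common\n";    # prints "pear"
```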
cb1a09d0 1312
5f05dabc 1313So far, we're using just the normal list return mechanism.
54310121 1314What happens if you want to pass or return a hash? Well,
1315if you're using only one of them, or you don't mind them
cb1a09d0 1316concatenating, then the normal calling convention is ok, although
54310121 1317a little expensive.
cb1a09d0
AD
1318
1319Where people get into trouble is here:
1320
1321 (@a, @b) = func(@c, @d);
1322or
1323 (%a, %b) = func(%c, %d);
1324
That syntax simply won't work. It sets just C<@a> or C<%a> and
clears the C<@b> or C<%b>. Plus, the function didn't receive
two separate arrays or hashes: it got one long list in C<@_>,
as always.
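
A short sketch makes the flattening visible. The subroutine C<passthrough> is hypothetical and simply returns its argument list:

```perl
use strict;
use warnings;

sub passthrough { return @_ }    # hypothetical: returns its flat argument list

my @c = (1, 2);
my @d = (3, 4);

my (@first, @second) = passthrough(@c, @d);

print scalar(@first),  "\n";     # prints "4": @first took every element
print scalar(@second), "\n";     # prints "0": nothing was left for @second
```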
cb1a09d0
AD
1329
If you can arrange for everyone to deal with this through references, it's
cleaner code, although not so nice to look at. Here's a function that
takes two array references as arguments, returning the two array references
in order of how many elements they have in them:
1334
1335 ($aref, $bref) = func(\@c, \@d);
1336 print "@$aref has more than @$bref\n";
1337 sub func {
1338 my ($cref, $dref) = @_;
1339 if (@$cref > @$dref) {
1340 return ($cref, $dref);
1341 } else {
c07a80fd 1342 return ($dref, $cref);
54310121 1343 }
1344 }
cb1a09d0
AD
1345
1346It turns out that you can actually do this also:
1347
1348 (*a, *b) = func(\@c, \@d);
1349 print "@a has more than @b\n";
1350 sub func {
1351 local (*c, *d) = @_;
1352 if (@c > @d) {
1353 return (\@c, \@d);
1354 } else {
1355 return (\@d, \@c);
54310121 1356 }
1357 }
cb1a09d0
AD
1358
1359Here we're using the typeglobs to do symbol table aliasing. It's
19799a22 1360a tad subtle, though, and also won't work if you're using C<my>
09bef843 1361variables, because only globals (even in disguise as C<local>s)
19799a22 1362are in the symbol table.
5f05dabc 1363
If you're passing around filehandles, you could usually just use the bare
typeglob, like C<*STDOUT>, but typeglob references work, too.
For example:
5f05dabc 1367
1368 splutter(\*STDOUT);
1369 sub splutter {
1370 my $fh = shift;
1371 print $fh "her um well a hmmm\n";
1372 }
1373
1374 $rec = get_rec(\*STDIN);
1375 sub get_rec {
1376 my $fh = shift;
1377 return scalar <$fh>;
1378 }
1379
If you're planning on generating new filehandles, you could do this.
Be sure to pass back just the bare C<*FH>, not its reference.
5f05dabc 1382
1383 sub openit {
19799a22 1384 my $path = shift;
5f05dabc 1385 local *FH;
e05a3a1e 1386 return open (FH, $path) ? *FH : undef;
54310121 1387 }
5f05dabc 1388
cb1a09d0 1389=head2 Prototypes
d74e8afc 1390X<prototype> X<subroutine, prototype>
cb1a09d0 1391
19799a22 1392Perl supports a very limited kind of compile-time argument checking
eedb00fa
PM
1393using function prototyping. This can be declared in either the PROTO
1394section or with a L<prototype attribute|attributes/Built-in Attributes>.
30d9c59b 1395If you declare either of
cb1a09d0 1396
cba5a3b0 1397 sub mypush (+@)
30d9c59b
Z
1398 sub mypush :prototype(+@)
1399
1400then C<mypush()> takes arguments exactly like C<push()> does.
1401
1402If subroutine signatures are enabled (see L</Signatures>), then
1403the shorter PROTO syntax is unavailable, because it would clash with
1404signatures. In that case, a prototype can only be declared in the form
1405of an attribute.
cb1a09d0 1406
The function declaration must be visible at compile time. The prototype
1409affects only interpretation of new-style calls to the function,
1410where new-style is defined as not using the C<&> character. In
1411other words, if you call it like a built-in function, then it behaves
1412like a built-in function. If you call it like an old-fashioned
1413subroutine, then it behaves like an old-fashioned subroutine. It
1414naturally falls out from this rule that prototypes have no influence
1415on subroutine references like C<\&foo> or on indirect subroutine
c47ff5f1 1416calls like C<&{$subref}> or C<< $subref->() >>.
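
The following sketch, built around a hypothetical C<first_arg> subroutine, shows the same argument list being treated differently depending on how the call is written:

```perl
use strict;
use warnings;

# first_arg() is a hypothetical name; ($) imposes scalar context on its argument.
sub first_arg ($) { return $_[0] }

my @list = (10, 20, 30);

print first_arg(@list), "\n";     # prints "3": @list evaluated in scalar context
print &first_arg(@list), "\n";    # prints "10": & bypasses the prototype

my $ref = \&first_arg;
print $ref->(@list), "\n";        # prints "10": indirect calls ignore it too
```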
c07a80fd 1417
1418Method calls are not influenced by prototypes either, because the
19799a22
GS
1419function to be called is indeterminate at compile time, since
1420the exact code called depends on inheritance.
cb1a09d0 1421
19799a22
GS
1422Because the intent of this feature is primarily to let you define
1423subroutines that work like built-in functions, here are prototypes
1424for some other functions that parse almost exactly like the
1425corresponding built-in.
cb1a09d0 1426
555bd962
BG
1427 Declared as Called as
1428
1429 sub mylink ($$) mylink $old, $new
1430 sub myvec ($$$) myvec $var, $offset, 1
1431 sub myindex ($$;$) myindex &getstring, "substr"
1432 sub mysyswrite ($$$;$) mysyswrite $buf, 0, length($buf) - $off, $off
1433 sub myreverse (@) myreverse $a, $b, $c
1434 sub myjoin ($@) myjoin ":", $a, $b, $c
1435 sub mypop (+) mypop @array
1436 sub mysplice (+$$@) mysplice @array, 0, 2, @pushme
1437 sub mykeys (+) mykeys %{$hashref}
1438 sub myopen (*;$) myopen HANDLE, $name
1439 sub mypipe (**) mypipe READHANDLE, WRITEHANDLE
1440 sub mygrep (&@) mygrep { /foo/ } $a, $b, $c
1441 sub myrand (;$) myrand 42
1442 sub mytime () mytime
cb1a09d0 1443
c07a80fd 1444Any backslashed prototype character represents an actual argument
ae7a3cfa 1445that must start with that character (optionally preceded by C<my>,
b91b7d1a
FC
1446C<our> or C<local>), with the exception of C<$>, which will
1447accept any scalar lvalue expression, such as C<$foo = 7> or
74083ec6 1448C<< my_function()->[0] >>. The value passed as part of C<@_> will be a
ae7a3cfa
FC
1449reference to the actual argument given in the subroutine call,
1450obtained by applying C<\> to that argument.
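
For instance, a subroutine declared with C<\@> receives a reference to the caller's array and can therefore modify it in place. This is only a sketch, and C<append_total> is a hypothetical name:

```perl
use strict;
use warnings;

# append_total() is a hypothetical name. The \@ prototype makes Perl pass
# a reference to the caller's array instead of flattening it.
sub append_total (\@) {
    my $aref  = shift;
    my $total = 0;
    $total += $_ for @$aref;
    push @$aref, $total;
    return $total;
}

my @values = (1, 2, 3);
append_total @values;            # called like a built-in; no explicit \ needed
print "@values\n";               # prints "1 2 3 6"
```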
c07a80fd 1451
c035a075
DG
1452You can use the C<\[]> backslash group notation to specify more than one
1453allowed argument type. For example:
5b794e05
JH
1454
1455 sub myref (\[$@%&*])
1456
1457will allow calling myref() as
1458
1459 myref $var
1460 myref @array
1461 myref %hash
1462 myref &sub
1463 myref *glob
1464
1465and the first argument of myref() will be a reference to
1466a scalar, an array, a hash, a code, or a glob.
1467
c07a80fd 1468Unbackslashed prototype characters have special meanings. Any
19799a22 1469unbackslashed C<@> or C<%> eats all remaining arguments, and forces
f86cebdf
GS
1470list context. An argument represented by C<$> forces scalar context. An
1471C<&> requires an anonymous subroutine, which, if passed as the first
0df79f0c
GS
1472argument, does not require the C<sub> keyword or a subsequent comma.
1473
1474A C<*> allows the subroutine to accept a bareword, constant, scalar expression,
648ca4f7
GS
1475typeglob, or a reference to a typeglob in that slot. The value will be
1476available to the subroutine either as a simple scalar, or (in the latter
0df79f0c
GS
1477two cases) as a reference to the typeglob. If you wish to always convert
1478such arguments to a typeglob reference, use Symbol::qualify_to_ref() as
1479follows:
1480
1481 use Symbol 'qualify_to_ref';
1482
1483 sub foo (*) {
1484 my $fh = qualify_to_ref(shift, caller);
1485 ...
1486 }
c07a80fd 1487
c035a075
DG
1488The C<+> prototype is a special alternative to C<$> that will act like
1489C<\[@%]> when given a literal array or hash variable, but will otherwise
1490force scalar context on the argument. This is useful for functions which
1491should accept either a literal array or an array reference as the argument:
1492
cba5a3b0 1493 sub mypush (+@) {
c035a075
DG
1494 my $aref = shift;
1495 die "Not an array or arrayref" unless ref $aref eq 'ARRAY';
1496 push @$aref, @_;
1497 }
1498
1499When using the C<+> prototype, your function must check that the argument
1500is of an acceptable type.
1501
859a4967 1502A semicolon (C<;>) separates mandatory arguments from optional arguments.
19799a22 1503It is redundant before C<@> or C<%>, which gobble up everything else.
cb1a09d0 1504
34daab0f
RGS
1505As the last character of a prototype, or just before a semicolon, a C<@>
1506or a C<%>, you can use C<_> in place of C<$>: if this argument is not
1507provided, C<$_> will be used instead.
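
A brief sketch of the C<_> prototype, using the hypothetical name C<shout>: when the argument is omitted, the subroutine operates on C<$_>, much as C<uc> itself does.

```perl
use strict;
use warnings;

# shout() is a hypothetical name; (_) makes the argument default to $_.
sub shout (_) { return uc $_[0] }

print shout("hi"), "\n";                    # prints "HI"

my @loud = map { shout() } qw(one two);     # no argument: $_ is used
print "@loud\n";                            # prints "ONE TWO"
```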
859a4967 1508
19799a22
GS
1509Note how the last three examples in the table above are treated
1510specially by the parser. C<mygrep()> is parsed as a true list
1511operator, C<myrand()> is parsed as a true unary operator with unary
1512precedence the same as C<rand()>, and C<mytime()> is truly without
1513arguments, just like C<time()>. That is, if you say
cb1a09d0
AD
1514
1515 mytime +2;
1516
f86cebdf 1517you'll get C<mytime() + 2>, not C<mytime(2)>, which is how it would be parsed
3a8944db
FC
1518without a prototype. If you want to force a unary function to have the
1519same precedence as a list operator, add C<;> to the end of the prototype:
1520
1521 sub mygetprotobynumber($;);
1522 mygetprotobynumber $a > $b; # parsed as mygetprotobynumber($a > $b)
cb1a09d0 1523
The interesting thing about C<&> is that you can generate new syntax with it,
provided it's in the initial position:
X<&>

    sub try (&@) {
        my($try,$catch) = @_;
        eval { &$try };
        if ($@) {
            local $_ = $@;
            &$catch;
        }
    }
    sub catch (&) { $_[0] }

    try {
        die "phooey";
    } catch {
        /phooey/ and print "unphooey\n";
    };

That prints C<"unphooey">. (Yes, there are still unresolved
issues having to do with visibility of C<@_>. I'm ignoring that
question for the moment. (But note that if we make C<@_> lexically
scoped, those anonymous subroutines can act like closures... (Gee,
is this sounding a little Lispish? (Never mind.))))

And here's a reimplementation of the Perl C<grep> operator:
X<grep>

    sub mygrep (&@) {
        my $code = shift;
        my @result;
        foreach $_ (@_) {
            push(@result, $_) if &$code;
        }
        @result;
    }

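To show what the C<(&@)> prototype buys the caller, here is a
self-contained sketch. The name C<select_if> is hypothetical; its body
follows the same pattern as the C<grep> reimplementation, and the point
is the call syntax, which needs neither the C<sub> keyword nor a comma
after the block.

```perl
use strict;
use warnings;

# select_if is a hypothetical name; the body mirrors the grep sketch.
sub select_if (&@) {
    my $code = shift;
    my @result;
    for (@_) {                        # aliases $_ to each element in turn
        push @result, $_ if $code->();
    }
    return @result;
}

# The leading block is turned into an anonymous sub by the prototype.
my @odd  = select_if { $_ % 2 } 1 .. 6;
my @long = select_if { length($_) > 3 } qw(a list of words);

print "@odd\n";
print "@long\n";
```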
Some folks would prefer full alphanumeric prototypes. Alphanumerics have
been intentionally left out of prototypes for the express purpose of
someday in the future adding named, formal parameters. The current
mechanism's main goal is to let module writers provide better diagnostics
for module users. Larry feels the notation quite understandable to Perl
programmers, and that it will not intrude greatly upon the meat of the
module, nor make it harder to read. The line noise is visually
encapsulated into a small pill that's easy to swallow.

If you try to use an alphanumeric sequence in a prototype you will
generate an optional warning - "Illegal character in prototype...".
Unfortunately earlier versions of Perl allowed the prototype to be
used as long as its prefix was a valid prototype. The warning may be
upgraded to a fatal error in a future version of Perl once the
majority of offending code is fixed.

It's probably best to prototype new functions, not retrofit prototyping
into older ones. That's because you must be especially careful about
silent impositions of differing list versus scalar contexts. For example,
if you decide that a function should take just one parameter, like this:

    sub func ($) {
        my $n = shift;
        print "you gave me $n\n";
    }

and someone has been calling it with an array or expression
returning a list:

    func(@foo);
    func( split /:/ );

Then you've just supplied an automatic C<scalar> in front of their
argument, which can be more than a bit surprising. The old C<@foo>
which used to hold one thing doesn't get passed in. Instead,
C<func()> now gets passed in a C<1>; that is, the number of elements
in C<@foo>. And the C<split> gets called in scalar context so it
starts scribbling on your C<@_> parameter list. Ouch!

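The silent change of context is easy to reproduce. In this sketch the
function name C<count_or_value> is hypothetical; it simply hands back
whatever single argument it was given, which makes the imposed C<scalar>
visible.

```perl
use strict;
use warnings;

# Hypothetical function, prototyped after the fact to take one scalar.
sub count_or_value ($) { return $_[0] }

my @one   = ('apple');
my @three = qw(a b c);

my $got_one   = count_or_value(@one);      # really scalar(@one)
my $got_three = count_or_value(@three);    # really scalar(@three)
my $bypassed  = &count_or_value(@one);     # & skips the prototype

print "$got_one $got_three $bypassed\n";
```

Only the call made with C<&> sees the array's contents; the other two
receive element counts.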
If a sub has both a PROTO and a BLOCK, the prototype is not applied
until after the BLOCK is completely defined. This means that a recursive
function with a prototype has to be predeclared for the prototype to take
effect, like so:

    sub foo($$);
    sub foo($$) {
        foo 1, 2;
    }

This is all very powerful, of course, and should be used only in moderation
to make the world a better place.

=head2 Constant Functions
X<constant>

Functions with a prototype of C<()> are potential candidates for
inlining. If the result after optimization and constant folding
is either a constant or a lexically-scoped scalar which has no other
references, then it will be used in place of function calls made
without C<&>. Calls made using C<&> are never inlined. (See
F<constant.pm> for an easy way to declare most constants.)

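As a brief sketch of both routes, the following declares two flags with
the C<constant> pragma and one derived value with a hand-written C<()>
prototype. C<FLAG_ALL> is a hypothetical name added for this illustration.

```perl
use strict;
use warnings;

# Declared through constant.pm.
use constant {
    FLAG_FOO => 1 << 8,
    FLAG_BAR => 1 << 9,
};
use constant FLAG_MASK => FLAG_FOO | FLAG_BAR;

# Hand-written equivalent: empty prototype, constant expression body.
sub FLAG_ALL () { FLAG_MASK | 0xFF }

printf "%#x %#x\n", FLAG_MASK, FLAG_ALL;
```

Because each name takes no arguments, it can be used in expressions
without parentheses, just like a literal.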
The following functions would all be inlined:

    sub pi ()           { 3.14159 }             # Not exact, but close.
    sub PI ()           { 4 * atan2 1, 1 }      # As good as it gets,
                                                # and it's inlined, too!
    sub ST_DEV ()       { 0 }
    sub ST_INO ()       { 1 }

    sub FLAG_FOO ()     { 1 << 8 }
    sub FLAG_BAR ()     { 1 << 9 }
    sub FLAG_MASK ()    { FLAG_FOO | FLAG_BAR }

    sub OPT_BAZ ()      { not (0x1B58 & FLAG_MASK) }

    sub N () { int(OPT_BAZ) / 3 }

    sub FOO_SET () { 1 if FLAG_MASK & FLAG_FOO }

Be aware that these will not be inlined; as they contain inner scopes,
the constant folding doesn't reduce them to a single constant:

    sub foo_set () { if (FLAG_MASK & FLAG_FOO) { 1 } }

    sub baz_val () {
        if (OPT_BAZ) {
            return 23;
        }
        else {
            return 42;
        }
    }

If you redefine a subroutine that was eligible for inlining, you'll get
a warning by default. (You can use this warning to tell whether or not a
particular subroutine is considered inlinable.) The warning is
considered severe enough not to be affected by the B<-w>
switch (or its absence) because previously compiled
invocations of the function will still be using the old value of the
function. If you need to be able to redefine the subroutine, you need to
ensure that it isn't inlined, either by dropping the C<()> prototype
(which changes calling semantics, so beware) or by thwarting the
inlining mechanism in some other way, such as

    sub not_inlined () {
        23 if $];
    }

=head2 Overriding Built-in Functions
X<built-in> X<override> X<CORE> X<CORE::GLOBAL>

Many built-in functions may be overridden, though this should be tried
only occasionally and for good reason. Typically this might be
done by a package attempting to emulate missing built-in functionality
on a non-Unix system.

Overriding may be done only by importing the name from a module at
compile time--ordinary predeclaration isn't good enough. However, the
C<use subs> pragma lets you, in effect, predeclare subs
via the import syntax, and these names may then override built-in ones:

    use subs 'chdir', 'chroot', 'chmod', 'chown';
    chdir $somewhere;
    sub chdir { ... }

To unambiguously refer to the built-in form, precede the
built-in name with the special package qualifier C<CORE::>. For example,
saying C<CORE::open()> always refers to the built-in C<open()>, even
if the current package has imported some other subroutine called
C<&open()> from elsewhere. Even though it looks like a regular
function call, it isn't: the C<CORE::> prefix in that case is part of Perl's
syntax, and works for any keyword, regardless of what is in the CORE
package. Taking a reference to it, that is, C<\&CORE::open>, only works
for some keywords. See L<CORE>.

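Putting the two ideas together, the sketch below overrides C<chdir> in
the current package and uses C<CORE::chdir> inside the replacement to
reach the real built-in. The C<@visited> log is a hypothetical addition
that exists only to show which version was called.

```perl
use strict;
use warnings;
use subs 'chdir';          # predeclare via import, so our chdir wins

my @visited;               # hypothetical log of requested directories

sub chdir {
    my ($dir) = @_;
    push @visited, $dir;
    return CORE::chdir($dir);        # always the real built-in
}

my $ok_override = chdir('.');        # calls the replacement above
my $ok_builtin  = CORE::chdir('.');  # bypasses the replacement

print "logged: @visited\n";
```

Only the first call is logged, because the C<CORE::> form never reaches
the replacement.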
Library modules should not in general export built-in names like C<open>
or C<chdir> as part of their default C<@EXPORT> list, because these may
sneak into someone else's namespace and change the semantics unexpectedly.
Instead, if the module adds that name to C<@EXPORT_OK>, then it's
possible for a user to import the name explicitly, but not implicitly.
That is, they could say

    use Module 'open';

and it would import the C<open> override. But if they said

    use Module;

they would get the default imports without overrides.

The foregoing mechanism for overriding built-ins is restricted, quite
deliberately, to the package that requests the import. There is a second
method that is sometimes applicable when you wish to override a built-in
everywhere, without regard to namespace boundaries. This is achieved by
importing a sub into the special namespace C<CORE::GLOBAL::>. Here is an
example that quite brazenly replaces the C<glob> operator with something
that understands regular expressions.

    package REGlob;
    require Exporter;
    @ISA = 'Exporter';
    @EXPORT_OK = 'glob';

    sub import {
        my $pkg = shift;
        return unless @_;
        my $sym = shift;
        my $where = ($sym =~ s/^GLOBAL_// ? 'CORE::GLOBAL' : caller(0));
        $pkg->export($where, $sym, @_);
    }

    sub glob {
        my $pat = shift;
        my @got;
        if (opendir my $d, '.') {
            @got = grep /$pat/, readdir $d;
            closedir $d;
        }
        return @got;
    }
    1;

And here's how it could be (ab)used:

    #use REGlob 'GLOBAL_glob';      # override glob() in ALL namespaces
    package Foo;
    use REGlob 'glob';              # override glob() in Foo:: only
    print for <^[a-z_]+\.pm\$>;     # show all pragmatic modules

The initial comment shows a contrived, even dangerous example.
By overriding C<glob> globally, you would be forcing the new (and
subversive) behavior for the C<glob> operator for I<every> namespace,
without the complete cognizance or cooperation of the modules that own
those namespaces. Naturally, this should be done with extreme caution--if
it must be done at all.

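For a smaller illustration of the same mechanism, the sketch below
installs a global replacement for C<sleep> that records the request and
returns immediately. The C<@naps> log and the package name
C<Some::Other::Package> are hypothetical; the glob assignment happens in a
C<BEGIN> block because the override has to exist before the calling code
is compiled.

```perl
use strict;
use warnings;

our @naps;     # hypothetical log of requested sleep durations

BEGIN {
    # Must be installed before the calling code is compiled.
    no warnings 'once';
    *CORE::GLOBAL::sleep = sub {
        my ($seconds) = @_;
        push @naps, $seconds;
        return $seconds;           # pretend we slept the whole time
    };
}

package Some::Other::Package;      # hypothetical unrelated namespace

sleep(5);                          # recorded, returns at once
CORE::sleep(0);                    # the real built-in is still reachable

package main;

print "requested naps: @naps\n";
```

Note that the other package never asked for this behavior, which is
exactly why global overrides call for caution.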
The C<REGlob> example above does not implement all the support needed to
cleanly override perl's C<glob> operator. The built-in C<glob> has
different behaviors depending on whether it appears in a scalar or list
context, but our C<REGlob> doesn't. Indeed, many perl built-ins have such
context-sensitive behaviors, and these must be adequately supported by
a properly written override. For a fully functional example of overriding
C<glob>, study the implementation of C<File::DosGlob> in the standard
library.

When you override a built-in, your replacement should be consistent (if
possible) with the built-in native syntax. You can achieve this by using
a suitable prototype. To get the prototype of an overridable built-in,
use the C<prototype> function with an argument of C<"CORE::builtin_name">
(see L<perlfunc/prototype>).

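A short sketch of that lookup follows. It asks for the prototypes of
three built-ins chosen for illustration; an undefined result means the
built-in's syntax cannot be expressed as a prototype.

```perl
use strict;
use warnings;

for my $name (qw(open time system)) {
    # prototype() returns undef when no prototype can describe the syntax.
    my $proto = prototype("CORE::$name");
    printf "%-6s %s\n", $name,
        defined $proto ? "($proto)" : 'not expressible as a prototype';
}
```

A replacement for C<open>, for instance, would be declared with whatever
string this reports for it.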
Note however that some built-ins can't have their syntax expressed by a
prototype (such as C<system> or C<chomp>). If you override them you won't
be able to fully mimic their original syntax.

The built-ins C<do>, C<require> and C<glob> can also be overridden, but due
to special magic, their original syntax is preserved, and you don't have
to define a prototype for their replacements. (You can't override the
C<do BLOCK> syntax, though.)

C<require> has special additional dark magic: if you invoke your
C<require> replacement as C<require Foo::Bar>, it will actually receive
the argument C<"Foo/Bar.pm"> in C<@_>. See L<perlfunc/require>.

And, as you'll have noticed from the previous example, if you override
C<glob>, the C<< <*> >> glob operator is overridden as well.

In a similar fashion, overriding the C<readline> function also overrides
the equivalent I/O operator C<< <FILEHANDLE> >>. Likewise, overriding
C<readpipe> overrides the operators C<``> and C<qx//>.

Finally, some built-ins (e.g. C<exists> or C<grep>) can't be overridden.

=head2 Autoloading
X<autoloading> X<AUTOLOAD>

If you call a subroutine that is undefined, you would ordinarily
get an immediate, fatal error complaining that the subroutine doesn't
exist. (Likewise for subroutines being used as methods, when the
method doesn't exist in any base class of the class's package.)
However, if an C<AUTOLOAD> subroutine is defined in the package or
packages used to locate the original subroutine, then that
C<AUTOLOAD> subroutine is called with the arguments that would have
been passed to the original subroutine. The fully qualified name
of the original subroutine magically appears in the global C<$AUTOLOAD>
variable of the same package as the C<AUTOLOAD> routine. The name
is not passed as an ordinary argument because, er, well, just
because, that's why. (As an exception, a method call to a nonexistent
C<import> or C<unimport> method is just skipped instead. Also, if
the C<AUTOLOAD> subroutine is an XSUB, there are other ways to retrieve the
subroutine name. See L<perlguts/Autoloading with XSUBs> for details.)

Many C<AUTOLOAD> routines load in a definition for the requested
subroutine using eval(), then execute that subroutine using a special
form of goto() that erases the stack frame of the C<AUTOLOAD> routine
without a trace. (See the source to the standard module documented
in L<AutoLoader>, for example.) But an C<AUTOLOAD> routine can
also just emulate the routine and never define it. For example,
let's pretend that a function that wasn't defined should just invoke
C<system> with those arguments. All you'd do is:

    sub AUTOLOAD {
        my $program = $AUTOLOAD;
        $program =~ s/.*:://;
        system($program, @_);
    }
    date();
    who('am', 'i');
    ls('-l');

In fact, if you predeclare functions you want to call that way, you don't
even need parentheses:

    use subs qw(date who ls);
    date;
    who "am", "i";
    ls '-l';

A more complete example of this is the Shell module on CPAN, which
can treat undefined subroutine calls as calls to external programs.

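The other style mentioned above, where C<AUTOLOAD> defines the missing
subroutine and then jumps to it with C<goto>, can be sketched as follows.
The C<Counter> class and its C<hits> and C<misses> fields are hypothetical,
and here the definition is installed by glob assignment instead of being
loaded with C<eval>.

```perl
use strict;
use warnings;

package Counter;     # hypothetical class, used only for illustration

our $AUTOLOAD;

sub new {
    my $class = shift;
    return bless { hits => 0, misses => 0 }, $class;
}

sub AUTOLOAD {
    my $self = $_[0];
    (my $name = $AUTOLOAD) =~ s/.*:://;        # strip the package part
    return if $name eq 'DESTROY';              # never autoload destructors
    die "No such method: $name\n"
        unless ref($self) && exists $self->{$name};

    # Define the accessor once, so later calls bypass AUTOLOAD.
    no strict 'refs';
    *{$AUTOLOAD} = sub {
        my $obj = shift;
        $obj->{$name} = shift if @_;
        return $obj->{$name};
    };
    goto &{$AUTOLOAD};                         # replaces this frame; @_ is kept
}

package main;

my $c = Counter->new;
$c->hits(3);                 # first call goes through AUTOLOAD
print $c->hits, "\n";        # now an ordinary method call
print Counter->can('misses') ? "defined\n" : "not yet defined\n";
```

Since C<misses> has not been called yet, no accessor exists for it at the
time of the final check.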
Mechanisms are available to help module writers split their modules
into autoloadable files. See the standard AutoLoader module
described in L<AutoLoader> and in L<AutoSplit>, the standard
SelfLoader module in L<SelfLoader>, and the document on adding C
functions to Perl code in L<perlxs>.

=head2 Subroutine Attributes
X<attribute> X<subroutine, attribute> X<attrs>

A subroutine declaration or definition may have a list of attributes
associated with it. If such an attribute list is present, it is
broken up at space or colon boundaries and treated as though a
C<use attributes> had been seen. See L<attributes> for details
about what attributes are currently supported.
Unlike the limitation with the obsolescent C<use attrs>, the
C<sub : ATTRLIST> syntax works to associate the attributes with
a pre-declaration, and not just with a subroutine definition.

The attributes must be valid as simple identifier names (without any
punctuation other than the '_' character). They may have a parameter
list appended, which is only checked for whether its parentheses ('(',')')
nest properly.

Examples of valid syntax (even though the attributes are unknown):

    sub fnord (&\%) : switch(10,foo(7,3)) : expensive;
    sub plugh () : Ugly('\(") :Bad;
    sub xyzzy : _5x5 { ... }

Examples of invalid syntax:

    sub fnord : switch(10,foo();  # ()-string not balanced
    sub snoid : Ugly('(');        # ()-string not balanced
    sub xyzzy : 5x5;              # "5x5" not a valid identifier
    sub plugh : Y2::north;        # "Y2::north" not a simple identifier
    sub snurt : foo + bar;        # "+" not a colon or space

The attribute list is passed as a list of constant strings to the code
which associates them with the subroutine. In particular, the second example
of valid syntax above currently looks like this in terms of how it's
parsed and invoked:

    use attributes __PACKAGE__, \&plugh, q[Ugly('\(")], 'Bad';

For further details on attribute lists and their manipulation,
see L<attributes> and L<Attribute::Handlers>.

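As a small runnable sketch using an attribute that perl does know about,
the following declares a sub with the built-in C<lvalue> attribute and
then reads the attribute list back with C<attributes::get>. The sub name
C<setting> and the variable behind it are hypothetical.

```perl
use strict;
use warnings;
use attributes ();

my $setting = 0;

# "lvalue" is a built-in attribute; the sub name is hypothetical.
sub setting : lvalue { $setting }

setting() = 42;                           # assign through the sub

my @attrs = attributes::get(\&setting);   # attributes attached to the sub
print "$setting [@attrs]\n";
```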
=head1 SEE ALSO

See L<perlref/"Function Templates"> for more about references and closures.
See L<perlxs> if you'd like to learn about calling C subroutines from Perl.
See L<perlembed> if you'd like to learn about calling Perl subroutines from C.
See L<perlmod> to learn about bundling up your functions in separate files.
See L<perlmodlib> to learn what library modules come standard on your system.
See L<perlootut> to learn how to make object method calls.