=head1 NAME
X<subroutine> X<function>

perlsub - Perl subroutines

=head1 SYNOPSIS

To declare subroutines:
X<subroutine, declaration> X<sub>

    sub NAME;                       # A "forward" declaration.
    sub NAME(PROTO);                #  ditto, but with prototypes
    sub NAME : ATTRS;               #  with attributes
    sub NAME(PROTO) : ATTRS;        #  with attributes and prototypes

    sub NAME BLOCK                  # A declaration and a definition.
    sub NAME(PROTO) BLOCK           #  ditto, but with prototypes
    sub NAME : ATTRS BLOCK          #  with attributes
    sub NAME(PROTO) : ATTRS BLOCK   #  with prototypes and attributes

    use feature 'signatures';
    sub NAME(SIG) BLOCK                     # with signature
    sub NAME :ATTRS (SIG) BLOCK             # with signature, attributes
    sub NAME :prototype(PROTO) (SIG) BLOCK  # with signature, prototype

To define an anonymous subroutine at runtime:
X<subroutine, anonymous>

    $subref = sub BLOCK;                    # no proto
    $subref = sub (PROTO) BLOCK;            # with proto
    $subref = sub : ATTRS BLOCK;            # with attributes
    $subref = sub (PROTO) : ATTRS BLOCK;    # with proto and attributes

    use feature 'signatures';
    $subref = sub (SIG) BLOCK;          # with signature
    $subref = sub : ATTRS(SIG) BLOCK;   # with signature, attributes

To import subroutines:
X<import>

    use MODULE qw(NAME1 NAME2 NAME3);

To call subroutines:
X<subroutine, call> X<call>

    NAME(LIST);     # & is optional with parentheses.
    NAME LIST;      # Parentheses optional if predeclared/imported.
    &NAME(LIST);    # Circumvent prototypes.
    &NAME;          # Makes current @_ visible to called subroutine.

=head1 DESCRIPTION

Like many languages, Perl provides for user-defined subroutines.
These may be located anywhere in the main program, loaded in from
other files via the C<do>, C<require>, or C<use> keywords, or
generated on the fly using C<eval> or anonymous subroutines.
You can even call a function indirectly using a variable containing
its name or a CODE reference.

The Perl model for function call and return values is simple: all
functions are passed as parameters one single flat list of scalars, and
all functions likewise return to their caller one single flat list of
scalars. Any arrays or hashes in these call and return lists will
collapse, losing their identities--but you may always use
pass-by-reference instead to avoid this. Both call and return lists may
contain as many or as few scalar elements as you'd like. (Often a
function without an explicit return statement is called a subroutine, but
there's really no difference from Perl's perspective.)
X<subroutine, parameter> X<parameter>

Any arguments passed in show up in the array C<@_>.
(They may also show up in lexical variables introduced by a signature;
see L</Signatures> below.) Therefore, if
you called a function with two arguments, those would be stored in
C<$_[0]> and C<$_[1]>. The array C<@_> is a local array, but its
elements are aliases for the actual scalar parameters. In particular,
if an element C<$_[0]> is updated, the corresponding argument is
updated (or an error occurs if it is not updatable). If an argument
is an array or hash element which did not exist when the function
was called, that element is created only when (and if) it is modified
or a reference to it is taken. (Some earlier versions of Perl
created the element whether or not the element was assigned to.)
Assigning to the whole array C<@_> removes that aliasing, and does
not update any arguments.
X<subroutine, argument> X<argument> X<@_>
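
The aliasing rules above can be seen in a small sketch. The helper names
(C<double_in_place>, C<no_effect>, C<untouched>) are hypothetical and exist
only for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: modifies its first argument in place via $_[0].
sub double_in_place {
    $_[0] *= 2;          # $_[0] is an alias for the caller's variable
}

# Hypothetical helper: assigning to the whole of @_ breaks the aliasing.
sub no_effect {
    @_ = (0);            # @_ now holds a fresh scalar, not the caller's
    $_[0] = 99;          # changes only the fresh scalar
}

# Hypothetical helper: never modifies or takes a reference to $_[0].
sub untouched { return scalar @_ }

my $n = 21;
double_in_place($n);     # $n is now 42
no_effect($n);           # $n is still 42
print "$n\n";

my %h;
untouched($h{missing});  # the element is not created by the call
print exists $h{missing} ? "created\n" : "not created\n";
```

Copying C<@_> into C<my> variables, as shown later in this document, is the
usual way to avoid changing the caller's values by accident.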

A C<return> statement may be used to exit a subroutine, optionally
specifying the returned value, which will be evaluated in the
appropriate context (list, scalar, or void) depending on the context of
the subroutine call. If you specify no return value, the subroutine
returns an empty list in list context, the undefined value in scalar
context, or nothing in void context. If you return one or more
aggregates (arrays and hashes), these will be flattened together into
one large indistinguishable list.

If no C<return> is found and if the last statement is an expression, its
value is returned. If the last statement is a loop control structure
like a C<foreach> or a C<while>, the returned value is unspecified. The
empty sub returns the empty list.
X<subroutine, return value> X<return value> X<return>
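
As a minimal sketch of how a bare C<return> behaves in different contexts,
consider the following. The function name C<find_even> is hypothetical:

```perl
use strict;
use warnings;

# Hypothetical function: returns the first even argument, or uses a
# bare return (empty list or undef, depending on context) if none.
sub find_even {
    for my $n (@_) {
        return $n if $n % 2 == 0;
    }
    return;
}

my @list   = find_even(1, 3, 5);   # list context: empty list
my $scalar = find_even(1, 3, 5);   # scalar context: undef
my $hit    = find_even(1, 4, 5);   # first even value

print scalar(@list), "\n";
print defined $scalar ? "defined" : "undef", "\n";
print "$hit\n";
```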

Aside from an experimental facility (see L</Signatures> below),
Perl does not have named formal parameters. In practice all you
do is assign to a C<my()> list of these. Variables that aren't
declared to be private are global variables. For gory details
on creating private variables, see L</"Private Variables via my()">
and L</"Temporary Values via local()">. To create protected
environments for a set of functions in a separate package (and
probably a separate file), see L<perlmod/"Packages">.
X<formal parameter> X<parameter, formal>

Example:

    sub max {
        my $max = shift(@_);
        foreach $foo (@_) {
            $max = $foo if $max < $foo;
        }
        return $max;
    }
    $bestday = max($mon,$tue,$wed,$thu,$fri);

Example:

    # get a line, combining continuation lines
    #  that start with whitespace

    sub get_line {
        $thisline = $lookahead;  # global variables!
        LINE: while (defined($lookahead = <STDIN>)) {
            if ($lookahead =~ /^[ \t]/) {
                $thisline .= $lookahead;
            }
            else {
                last LINE;
            }
        }
        return $thisline;
    }

    $lookahead = <STDIN>;       # get first line
    while (defined($line = get_line())) {
        ...
    }

Assigning to a list of private variables to name your arguments:

    sub maybeset {
        my($key, $value) = @_;
        $Foo{$key} = $value unless $Foo{$key};
    }

Because the assignment copies the values, this also has the effect
of turning call-by-reference into call-by-value. Otherwise a
function is free to do in-place modifications of C<@_> and change
its caller's values.
X<call-by-reference> X<call-by-value>

    upcase_in($v1, $v2);  # this changes $v1 and $v2
    sub upcase_in {
        for (@_) { tr/a-z/A-Z/ }
    }

You aren't allowed to modify constants in this way, of course. If an
argument were actually literal and you tried to change it, you'd take a
(presumably fatal) exception. For example, this won't work:
X<call-by-reference> X<call-by-value>

    upcase_in("frederick");

It would be much safer if the C<upcase_in()> function
were written to return a copy of its parameters instead
of changing them in place:

    ($v3, $v4) = upcase($v1, $v2);  # this doesn't change $v1 and $v2
    sub upcase {
        return unless defined wantarray;  # void context, do nothing
        my @parms = @_;
        for (@parms) { tr/a-z/A-Z/ }
        return wantarray ? @parms : $parms[0];
    }

Notice how this (unprototyped) function doesn't care whether it was
passed real scalars or arrays. Perl sees all arguments as one big,
long, flat parameter list in C<@_>. This is one area where
Perl's simple argument-passing style shines. The C<upcase()>
function would work perfectly well without changing the C<upcase()>
definition even if we fed it things like this:

    @newlist = upcase(@list1, @list2);
    @newlist = upcase( split /:/, $var );

Do not, however, be tempted to do this:

    (@a, @b) = upcase(@list1, @list2);

Like the flattened incoming parameter list, the return list is also
flattened on return. So all you have managed to do here is store
everything in C<@a> and make C<@b> empty. See
L</Pass by Reference> for alternatives.
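
One way to keep two lists distinct is to pass and return array references.
The following is only a sketch; C<upcase_lists> is a hypothetical variant
of C<upcase()>, and it uses C<uc> rather than C<tr/a-z/A-Z/>:

```perl
use strict;
use warnings;

# Hypothetical variant of upcase() that takes and returns array
# references, so each list keeps its identity.
sub upcase_lists {
    my @results;
    for my $aref (@_) {
        push @results, [ map { uc } @$aref ];
    }
    return @results;     # a flat list of array references
}

my @list1 = qw(a b);
my @list2 = qw(c d e);
my ($first, $second) = upcase_lists(\@list1, \@list2);

print "@$first\n";
print "@$second\n";
```

The return list is still flat, but each of its elements is a single scalar
(a reference), so nothing is lost when it is assigned to two scalars.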

A subroutine may be called using an explicit C<&> prefix. The
C<&> is optional in modern Perl, as are parentheses if the
subroutine has been predeclared. The C<&> is I<not> optional
when just naming the subroutine, such as when it's used as
an argument to defined() or undef(). Nor is it optional when you
want to do an indirect subroutine call with a subroutine name or
reference using the C<&$subref()> or C<&{$subref}()> constructs,
although the C<< $subref->() >> notation solves that problem.
See L<perlref> for more about all that.
X<&>

Subroutines may be called recursively. If a subroutine is called
using the C<&> form, the argument list is optional, and if omitted,
no C<@_> array is set up for the subroutine: the C<@_> array at the
time of the call is visible to the subroutine instead. This is an
efficiency mechanism that new users may wish to avoid.
X<recursion>

    &foo(1,2,3);        # pass three arguments
    foo(1,2,3);         # the same

    foo();              # pass a null list
    &foo();             # the same

    &foo;               # foo() gets current args, like foo(@_) !!
    use strict 'subs';
    foo;                # like foo() iff sub foo predeclared, else
                        # a compile-time error
    no strict 'subs';
    foo;                # like foo() iff sub foo predeclared, else
                        # a literal string "foo"
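
A short sketch of the difference between C<&foo;> and C<foo()> when called
from inside another subroutine. The names C<show_args>, C<wrapper_amp> and
C<wrapper_paren> are hypothetical:

```perl
use strict;
use warnings;

sub show_args { return join ",", @_ }

sub wrapper_amp   { return &show_args }     # callee sees wrapper's @_
sub wrapper_paren { return show_args() }    # callee gets an empty @_

print wrapper_amp(1, 2, 3), "\n";     # the wrapper's arguments
print wrapper_paren(1, 2, 3), "\n";   # an empty line
```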

Not only does the C<&> form make the argument list optional, it also
disables any prototype checking on arguments you do provide. This
is partly for historical reasons, and partly for having a convenient way
to cheat if you know what you're doing. See L</Prototypes> below.
X<&>

Since Perl 5.16.0, the C<__SUB__> token is available under C<use feature
'current_sub'> and C<use 5.16.0>. It will evaluate to a reference to the
currently-running sub, which allows for recursive calls without knowing
your subroutine's name.

    use 5.16.0;
    my $factorial = sub {
        my ($x) = @_;
        return 1 if $x == 1;
        return($x * __SUB__->( $x - 1 ) );
    };

The behavior of C<__SUB__> within a regex code block (such as C</(?{...})/>)
is subject to change.

Subroutines whose names are in all upper case are reserved to the Perl
core, as are modules whose names are in all lower case. A subroutine in
all capitals is a loosely-held convention meaning it will be called
indirectly by the run-time system itself, usually due to a triggered event.
Subroutines whose names start with a left parenthesis are also reserved in
the same way. The following is a list of some subroutines that currently do
special, pre-defined things.

=over

=item documented later in this document

C<AUTOLOAD>

=item documented in L<perlmod>

C<CLONE>, C<CLONE_SKIP>

=item documented in L<perlobj>

C<DESTROY>, C<DOES>

=item documented in L<perltie>

C<BINMODE>, C<CLEAR>, C<CLOSE>, C<DELETE>, C<DESTROY>, C<EOF>, C<EXISTS>,
C<EXTEND>, C<FETCH>, C<FETCHSIZE>, C<FILENO>, C<FIRSTKEY>, C<GETC>,
C<NEXTKEY>, C<OPEN>, C<POP>, C<PRINT>, C<PRINTF>, C<PUSH>, C<READ>,
C<READLINE>, C<SCALAR>, C<SEEK>, C<SHIFT>, C<SPLICE>, C<STORE>,
C<STORESIZE>, C<TELL>, C<TIEARRAY>, C<TIEHANDLE>, C<TIEHASH>,
C<TIESCALAR>, C<UNSHIFT>, C<UNTIE>, C<WRITE>

=item documented in L<PerlIO::via>

C<BINMODE>, C<CLEARERR>, C<CLOSE>, C<EOF>, C<ERROR>, C<FDOPEN>, C<FILENO>,
C<FILL>, C<FLUSH>, C<OPEN>, C<POPPED>, C<PUSHED>, C<READ>, C<SEEK>,
C<SETLINEBUF>, C<SYSOPEN>, C<TELL>, C<UNREAD>, C<UTF8>, C<WRITE>

=item documented in L<perlfunc>

L<< C<import> | perlfunc/use >>, L<< C<unimport> | perlfunc/use >>,
L<< C<INC> | perlfunc/require >>

=item documented in L<UNIVERSAL>

C<VERSION>

=item documented in L<perldebguts>

C<DB::DB>, C<DB::sub>, C<DB::lsub>, C<DB::goto>, C<DB::postponed>

=item undocumented, used internally by the L<overload> feature

any starting with C<(>

=back

The C<BEGIN>, C<UNITCHECK>, C<CHECK>, C<INIT> and C<END> subroutines
are not so much subroutines as named special code blocks, of which you
can have more than one in a package, and which you can B<not> call
explicitly. See L<perlmod/"BEGIN, UNITCHECK, CHECK, INIT and END">.

=head2 Signatures

B<WARNING>: Subroutine signatures are experimental. The feature may be
modified or removed in future versions of Perl.

Perl has an experimental facility to allow a subroutine's formal
parameters to be introduced by special syntax, separate from the
procedural code of the subroutine body. The formal parameter list
is known as a I<signature>. The facility must be enabled first by a
pragmatic declaration, C<use feature 'signatures'>, and it will produce
a warning unless the "experimental::signatures" warnings category is
disabled.

The signature is part of a subroutine's body. Normally the body of a
subroutine is simply a braced block of code, but when using a signature,
the signature is a parenthesised list that goes immediately before the
block, after any name or attributes.

For example,

    sub foo :lvalue ($a, $b = 1, @c) { .... }

The signature declares lexical variables that are
in scope for the block. When the subroutine is called, the signature
takes control first. It populates the signature variables from the
list of arguments that were passed. If the argument list doesn't meet
the requirements of the signature, then it will throw an exception.
When the signature processing is complete, control passes to the block.

Positional parameters are handled by simply naming scalar variables in
the signature. For example,

    sub foo ($left, $right) {
        return $left + $right;
    }

takes two positional parameters, which must be filled at runtime by
two arguments. By default the parameters are mandatory, and it is
not permitted to pass more arguments than expected. So the above is
equivalent to

    sub foo {
        die "Too many arguments for subroutine" unless @_ <= 2;
        die "Too few arguments for subroutine" unless @_ >= 2;
        my $left = $_[0];
        my $right = $_[1];
        return $left + $right;
    }

An argument can be ignored by omitting the main part of the name from
a parameter declaration, leaving just a bare C<$> sigil. For example,

    sub foo ($first, $, $third) {
        return "first=$first, third=$third";
    }

Although the ignored argument doesn't go into a variable, it is still
mandatory for the caller to pass it.

A positional parameter is made optional by giving a default value,
separated from the parameter name by C<=>:

    sub foo ($left, $right = 0) {
        return $left + $right;
    }

The above subroutine may be called with either one or two arguments.
The default value expression is evaluated when the subroutine is called,
so it may provide different default values for different calls. It is
only evaluated if the argument was actually omitted from the call.
For example,

    my $auto_id = 0;
    sub foo ($thing, $id = $auto_id++) {
        print "$thing has ID $id";
    }

automatically assigns distinct sequential IDs to things for which no
ID was supplied by the caller. A default value expression may also
refer to parameters earlier in the signature, making the default for
one parameter vary according to the earlier parameters. For example,

    sub foo ($first_name, $surname, $nickname = $first_name) {
        print "$first_name $surname is known as \"$nickname\"";
    }

An optional parameter can be nameless just like a mandatory parameter.
For example,

    sub foo ($thing, $ = 1) {
        print $thing;
    }

The parameter's default value will still be evaluated if the corresponding
argument isn't supplied, even though the value won't be stored anywhere.
This is in case evaluating it has important side effects. However, it
will be evaluated in void context, so if it doesn't have side effects
and is not trivial it will generate a warning if the "void" warning
category is enabled. If a nameless optional parameter's default value
is not important, it may be omitted just as the parameter's name was:

    sub foo ($thing, $=) {
        print $thing;
    }

Optional positional parameters must come after all mandatory positional
parameters. (If there are no mandatory positional parameters then an
optional positional parameter can be the first thing in the signature.)
If there are multiple optional positional parameters and not enough
arguments are supplied to fill them all, they will be filled from left
to right.
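
A minimal sketch of left-to-right filling, assuming a Perl new enough to
support signatures (5.20 or later). The function C<describe> and its
defaults are hypothetical:

```perl
use strict;
use warnings;
use feature 'signatures';
no warnings 'experimental::signatures';

# Hypothetical function with one mandatory and two optional parameters.
sub describe ($name, $colour = "red", $size = "medium") {
    return "$name: $colour, $size";
}

print describe("hat"), "\n";                   # both defaults used
print describe("hat", "blue"), "\n";           # $colour filled first
print describe("hat", "blue", "large"), "\n";  # no defaults used
```

A single extra argument always goes to C<$colour>; there is no way to
supply C<$size> positionally while leaving C<$colour> to its default.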

After positional parameters, additional arguments may be captured in a
slurpy parameter. The simplest form of this is just an array variable:

    sub foo ($filter, @inputs) {
        print $filter->($_) foreach @inputs;
    }

With a slurpy parameter in the signature, there is no upper limit on how
many arguments may be passed. A slurpy array parameter may be nameless
just like a positional parameter, in which case its only effect is to
turn off the argument limit that would otherwise apply:

    sub foo ($thing, @) {
        print $thing;
    }

A slurpy parameter may instead be a hash, in which case the arguments
available to it are interpreted as alternating keys and values.
There must be as many keys as values: if there is an odd argument then
an exception will be thrown. Keys will be stringified, and if there are
duplicates then the later instance takes precedence over the earlier,
as with standard hash construction.

    sub foo ($filter, %inputs) {
        print $filter->($_, $inputs{$_}) foreach sort keys %inputs;
    }
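
The slurpy hash rules can be sketched as follows, again assuming Perl 5.20
or later. The function C<configure> and its option names are hypothetical:

```perl
use strict;
use warnings;
use feature 'signatures';
no warnings 'experimental::signatures';

# Hypothetical function taking named options through a slurpy hash.
sub configure ($name, %opts) {
    return join ", ", $name, map { "$_=$opts{$_}" } sort keys %opts;
}

print configure("svc", port => 80, host => "example"), "\n";

# A later duplicate key takes precedence over the earlier one.
print configure("svc", port => 80, port => 8080), "\n";

# An odd number of arguments for the hash throws an exception.
my $ok = eval { configure("svc", "port"); 1 };
print "odd argument list rejected\n" unless $ok;
```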

A slurpy hash parameter may be nameless just like other kinds of
parameter. It still insists that the number of arguments available to
it be even, even though they're not being put into a variable.

    sub foo ($thing, %) {
        print $thing;
    }

A slurpy parameter, either array or hash, must be the last thing in the
signature. It may follow mandatory and optional positional parameters;
it may also be the only thing in the signature. Slurpy parameters cannot
have default values: if no arguments are supplied for them then you get
an empty array or empty hash.

A signature may be entirely empty, in which case all it does is check
that the caller passed no arguments:

    sub foo () {
        return 123;
    }

When using a signature, the arguments are still available in the special
array variable C<@_>, in addition to the lexical variables of the
signature. There is a difference between the two ways of accessing the
arguments: C<@_> I<aliases> the arguments, but the signature variables
get I<copies> of the arguments. So writing to a signature variable
only changes that variable, and has no effect on the caller's variables,
but writing to an element of C<@_> modifies whatever the caller used to
supply that argument.

There is a potential syntactic ambiguity between signatures and prototypes
(see L</Prototypes>), because both start with an opening parenthesis and
both can appear in some of the same places, such as just after the name
in a subroutine declaration. For historical reasons, when signatures
are not enabled, any opening parenthesis in such a context will trigger
very forgiving prototype parsing. Most signatures will be interpreted
as prototypes in those circumstances, but won't be valid prototypes.
(A valid prototype cannot contain any alphabetic character.) This will
lead to somewhat confusing error messages.

To avoid ambiguity, when signatures are enabled the special syntax
for prototypes is disabled. There is no attempt to guess whether a
parenthesised group was intended to be a prototype or a signature.
To give a subroutine a prototype under these circumstances, use a
L<prototype attribute|attributes/Built-in Attributes>. For example,

    sub foo :prototype($) { $_[0] }

It is entirely possible for a subroutine to have both a prototype and
a signature. They do different jobs: the prototype affects compilation
of calls to the subroutine, and the signature puts argument values into
lexical variables at runtime. You can therefore write

    sub foo :prototype($$) ($left, $right) {
        return $left + $right;
    }

The prototype attribute, and any other attributes, must come before
the signature. The signature always immediately precedes the block of
the subroutine's body.

=head2 Private Variables via my()
X<my> X<variable, lexical> X<lexical> X<lexical variable> X<scope, lexical>
X<lexical scope> X<attributes, my>

Synopsis:

    my $foo;            # declare $foo lexically local
    my (@wid, %get);    # declare list of variables local
    my $foo = "flurp";  # declare $foo lexical, and init it
    my @oof = @bar;     # declare @oof lexical, and init it
    my $x : Foo = $y;   # similar, with an attribute applied

B<WARNING>: The use of attribute lists on C<my> declarations is still
evolving. The current semantics and interface are subject to change.
See L<attributes> and L<Attribute::Handlers>.

The C<my> operator declares the listed variables to be lexically
confined to the enclosing block, conditional
(C<if>/C<unless>/C<elsif>/C<else>), loop
(C<for>/C<foreach>/C<while>/C<until>/C<continue>), subroutine, C<eval>,
or C<do>/C<require>/C<use>'d file. If more than one value is listed, the
list must be placed in parentheses. All listed elements must be
legal lvalues. Only alphanumeric identifiers may be lexically
scoped--magical built-ins like C<$/> must currently be C<local>ized
with C<local> instead.

Unlike dynamic variables created by the C<local> operator, lexical
variables declared with C<my> are totally hidden from the outside
world, including any called subroutines. This is true if it's the
same subroutine called from itself or elsewhere--every call gets
its own copy.
X<local>

This doesn't mean that a C<my> variable declared in a statically
enclosing lexical scope would be invisible. Only dynamic scopes
are cut off. For example, the C<bumpx()> function below has access
to the lexical $x variable because both the C<my> and the C<sub>
occurred at the same scope, presumably file scope.

    my $x = 10;
    sub bumpx { $x++ }

An C<eval()>, however, can see lexical variables of the scope it is
being evaluated in, so long as the names aren't hidden by declarations within
the C<eval()> itself. See L<perlref>.
X<eval, scope of>
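
A brief sketch of a string C<eval> seeing, and then hiding, an enclosing
lexical. The variable names are chosen only for illustration:

```perl
use strict;
use warnings;

my $x = 10;

# The string is compiled in the current lexical scope, so it sees $x.
my $sum = eval '$x + 1';

# A declaration inside the eval hides the outer $x within the eval only.
my $inner = eval 'my $x = 1; $x + 1';

print "$sum $inner $x\n";
```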

The parameter list to my() may be assigned to if desired, which allows you
to initialize your variables. (If no initializer is given for a
particular variable, it is created with the undefined value.) Commonly
this is used to name input parameters to a subroutine. Examples:

    $arg = "fred";          # "global" variable
    $n = cube_root(27);
    print "$arg thinks the root is $n\n";
    fred thinks the root is 3

    sub cube_root {
        my $arg = shift;  # name doesn't matter
        $arg **= 1/3;
        return $arg;
    }

The C<my> is simply a modifier on something you might assign to. So when
you do assign to variables in its argument list, C<my> doesn't
change whether those variables are viewed as a scalar or an array. So

    my ($foo) = <STDIN>;                # WRONG?
    my @FOO = <STDIN>;

both supply a list context to the right-hand side, while

    my $foo = <STDIN>;

supplies a scalar context. But the following declares only one variable:

    my $foo, $bar = 1;                  # WRONG

That has the same effect as

    my $foo;
    $bar = 1;

The declared variable is not introduced (is not visible) until after
the current statement. Thus,

    my $x = $x;

can be used to initialize a new $x with the value of the old $x, and
the expression

    my $x = 123 and $x == 123

is false unless the old $x happened to have the value C<123>.

Lexical scopes of control structures are not bounded precisely by the
braces that delimit their controlled blocks; control expressions are
part of that scope, too. Thus in the loop

    while (my $line = <>) {
        $line = lc $line;
    } continue {
        print $line;
    }

the scope of $line extends from its declaration throughout the rest of
the loop construct (including the C<continue> clause), but not beyond
it. Similarly, in the conditional

    if ((my $answer = <STDIN>) =~ /^yes$/i) {
        user_agrees();
    } elsif ($answer =~ /^no$/i) {
        user_disagrees();
    } else {
        chomp $answer;
        die "'$answer' is neither 'yes' nor 'no'";
    }

the scope of $answer extends from its declaration through the rest
of that conditional, including any C<elsif> and C<else> clauses,
but not beyond it. See L<perlsyn/"Simple Statements"> for information
on the scope of variables in statements with modifiers.

The C<foreach> loop defaults to scoping its index variable dynamically
in the manner of C<local>. However, if the index variable is
prefixed with the keyword C<my>, or if there is already a lexical
by that name in scope, then a new lexical is created instead. Thus
in the loop
X<foreach> X<for>

    for my $i (1, 2, 3) {
        some_function();
    }

the scope of $i extends to the end of the loop, but not beyond it,
rendering the value of $i inaccessible within C<some_function()>.
X<foreach> X<for>

Some users may wish to encourage the use of lexically scoped variables.
As an aid to catching implicit uses of package variables,
which are always global, if you say

    use strict 'vars';

then any variable mentioned from there to the end of the enclosing
block must either refer to a lexical variable, be predeclared via
C<our> or C<use vars>, or else must be fully qualified with the package name.
A compilation error results otherwise. An inner block may countermand
this with C<no strict 'vars'>.

A C<my> has both a compile-time and a run-time effect. At compile
time, the compiler takes notice of it. The principal usefulness
of this is to quiet C<use strict 'vars'>, but it is also essential
for generation of closures as detailed in L<perlref>. Actual
initialization is delayed until run time, though, so it gets executed
at the appropriate time, such as each time through a loop, for
example.
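
The run-time effect matters for closures: a C<my> inside a loop body yields
a fresh variable each time through. A minimal sketch, with illustrative
names only:

```perl
use strict;
use warnings;

my @subs;
for my $i (1 .. 3) {
    my $square = $i * $i;            # a fresh $square each time through
    push @subs, sub { return $square };
}

# Each closure remembers the $square from its own iteration.
print join(",", map { $_->() } @subs), "\n";
```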

Variables declared with C<my> are not part of any package and are therefore
never fully qualified with the package name. In particular, you're not
allowed to try to make a package variable (or other global) lexical:

    my $pack::var;      # ERROR!  Illegal syntax

In fact, dynamic variables (also known as package or global variables)
are still accessible using the fully qualified C<::> notation even while a
lexical of the same name is also visible:

    package main;
    local $x = 10;
    my    $x = 20;
    print "$x and $::x\n";

That will print out C<20> and C<10>.

You may declare C<my> variables at the outermost scope of a file
to hide any such identifiers from the world outside that file. This
is similar in spirit to C's static variables when they are used at
the file level. To do this with a subroutine requires the use of
a closure (an anonymous function that accesses enclosing lexicals).
If you want to create a private subroutine that cannot be called
from outside that block, you can declare a lexical variable containing
an anonymous sub reference:

    my $secret_version = '1.001-beta';
    my $secret_sub = sub { print $secret_version };
    &$secret_sub();

As long as the reference is never returned by any function within the
module, no outside module can see the subroutine, because its name is not in
any package's symbol table. Remember that it's not I<REALLY> called
C<$some_pack::secret_version> or anything; it's just $secret_version,
unqualified and unqualifiable.

This does not work with object methods, however; all object methods
have to be in the symbol table of some package to be found. See
L<perlref/"Function Templates"> for something of a work-around to
this.

=head2 Persistent Private Variables
X<state> X<state variable> X<static> X<variable, persistent> X<variable, static> X<closure>

There are two ways to build persistent private variables in Perl 5.10
and later. First, you can simply use the C<state> feature. Or, you can
use closures, if you want to stay compatible with releases older than 5.10.

=head3 Persistent variables via state()

Beginning with Perl 5.10.0, you can declare variables with the C<state>
keyword in place of C<my>. For that to work, though, you must have
enabled that feature beforehand, either by using the C<feature> pragma, or
by using C<-E> on one-liners (see L<feature>). Beginning with Perl 5.16,
the C<CORE::state> form does not require the
C<feature> pragma.

The C<state> keyword creates a lexical variable (following the same scoping
rules as C<my>) that persists from one subroutine call to the next. If a
state variable resides inside an anonymous subroutine, then each copy of
the subroutine has its own copy of the state variable. However, the value
of the state variable will still persist between calls to the same copy of
the anonymous subroutine. (Don't forget that C<sub { ... }> creates a new
subroutine each time it is executed.)

For example, the following code maintains a private counter, incremented
each time the gimme_another() function is called:

    use feature 'state';
    sub gimme_another { state $x; return ++$x }

And this example uses anonymous subroutines to create separate counters:

    use feature 'state';
    sub create_counter {
        return sub { state $x; return ++$x }
    }

Also, since C<$x> is lexical, it can't be reached or modified by any Perl
code outside.
754
f99042c8 755When combined with variable declaration, simple assignment to C<state>
f292fc7a
RS
756variables (as in C<state $x = 42>) is executed only the first time. When such
757statements are evaluated subsequent times, the assignment is ignored. The
f99042c8
Z
758behavior of assignment to C<state> declarations where the left hand side
759of the assignment involves any parentheses is currently undefined.
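
As a rough illustration of the two points above (one-time initialization,
and one state variable per copy of an anonymous subroutine), here is a small
sketch. The names C<next_id>, C<create_counter> and C<$step> are made up for
this example and are not part of any API.

```perl
use strict;
use warnings;
use feature 'state';

# Hypothetical helper: the initializer runs only on the first call.
sub next_id {
    state $id = 100;          # assignment executed once
    return $id++;
}

# Each anonymous sub returned here is a separate closure over $step,
# and each one gets its own copy of the state variable $n.
sub create_counter {
    my ($step) = @_;
    return sub { state $n = 0; return $n += $step };
}

my @ids = (next_id(), next_id(), next_id());

my $c1 = create_counter(1);
my $c2 = create_counter(10);
my @seen = ($c1->(), $c1->(), $c2->());

print "@ids | @seen\n";    # 100 101 102 | 1 2 10
```

The second counter starts from zero again because it is a different copy
of the anonymous subroutine.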

=head3 Persistent variables with closures

Just because a lexical variable is lexically (also called statically)
scoped to its enclosing block, C<eval>, or C<do> FILE, this doesn't mean that
within a function it works like a C static. It normally works more
like a C auto, but with implicit garbage collection.

Unlike local variables in C or C++, Perl's lexical variables don't
necessarily get recycled just because their scope has exited.
If something more permanent is still aware of the lexical, it will
stick around. So long as something else references a lexical, that
lexical won't be freed--which is as it should be. You wouldn't want
memory being freed until you were done using it, or kept around once you
were done. Automatic garbage collection takes care of this for you.

This means that you can pass back or save away references to lexical
variables, whereas to return a pointer to a C auto is a grave error.
It also gives us a way to simulate C's function statics. Here's a
mechanism for giving a function private variables with both lexical
scoping and a static lifetime. If you do want to create something like
C's static variables, just enclose the whole function in an extra block,
and put the static variable outside the function but in the block.

    {
        my $secret_val = 0;
        sub gimme_another {
            return ++$secret_val;
        }
    }
    # $secret_val now becomes unreachable by the outside
    # world, but retains its value between calls to gimme_another

If this function is being sourced in from a separate file
via C<require> or C<use>, then this is probably just fine. If it's
all in the main program, you'll need to arrange for the C<my>
to be executed early, either by putting the whole block above
your main program, or more likely, placing merely a C<BEGIN>
code block around it to make sure it gets executed before your program
starts to run:

    BEGIN {
        my $secret_val = 0;
        sub gimme_another {
            return ++$secret_val;
        }
    }

See L<perlmod/"BEGIN, UNITCHECK, CHECK, INIT and END"> about the
special triggered code blocks, C<BEGIN>, C<UNITCHECK>, C<CHECK>,
C<INIT> and C<END>.

If declared at the outermost scope (the file scope), then lexicals
work somewhat like C's file statics. They are available to all
functions in that same file declared below them, but are inaccessible
from outside that file. This strategy is sometimes used in modules
to create private variables that the whole module can see.
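
The same block technique extends to several functions sharing one private
variable. The following is only a sketch; C<bump_count> and C<reset_count>
are hypothetical names chosen for illustration.

```perl
use strict;
use warnings;

# BEGIN ensures $count is initialized before any code that calls
# these functions runs, even if the block appears late in the file.
BEGIN {
    my $count = 0;                      # private to this block
    sub bump_count  { return ++$count }
    sub reset_count { $count = 0; return }
}

bump_count() for 1 .. 3;
my $before = bump_count();
reset_count();
my $after  = bump_count();
print "$before $after\n";    # 4 1
```

Both functions close over the same C<$count>, and nothing outside the
block can name it.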

=head2 Temporary Values via local()
X<local> X<scope, dynamic> X<dynamic scope> X<variable, local>
X<variable, temporary>

B<WARNING>: In general, you should be using C<my> instead of C<local>, because
it's faster and safer. Exceptions to this include the global punctuation
variables, global filehandles and formats, and direct manipulation of the
Perl symbol table itself. C<local> is mostly used when the current value
of a variable must be visible to called subroutines.

Synopsis:

    # localization of values

    local $foo;                # make $foo dynamically local
    local (@wid, %get);        # make list of variables local
    local $foo = "flurp";      # make $foo dynamic, and init it
    local @oof = @bar;         # make @oof dynamic, and init it

    local $hash{key} = "val";  # sets a local value for this hash entry
    delete local $hash{key};   # delete this entry for the current block
    local ($cond ? $v1 : $v2); # several types of lvalues support
                               # localization

    # localization of symbols

    local *FH;                 # localize $FH, @FH, %FH, &FH ...
    local *merlyn = *randal;   # now $merlyn is really $randal, plus
                               # @merlyn is really @randal, etc
    local *merlyn = 'randal';  # SAME THING: promote 'randal' to *randal
    local *merlyn = \$randal;  # just alias $merlyn, not @merlyn etc

A C<local> modifies its listed variables to be "local" to the
enclosing block, C<eval>, or C<do FILE>--and to I<any subroutine
called from within that block>. A C<local> just gives temporary
values to global (meaning package) variables. It does I<not> create
a local variable. This is known as dynamic scoping. Lexical scoping
is done with C<my>, which works more like C's auto declarations.

Some types of lvalues can be localized as well: hash and array elements
and slices, conditionals (provided that their result is always
localizable), and symbolic references. As for simple variables, this
creates new, dynamically scoped values.

If more than one variable or expression is given to C<local>, they must be
placed in parentheses. This operator works
by saving the current values of those variables in its argument list on a
hidden stack and restoring them upon exiting the block, subroutine, or
eval. This means that called subroutines can also reference the local
variable, but not the global one. The argument list may be assigned to if
desired, which allows you to initialize your local variables. (If no
initializer is given for a particular variable, it is created with an
undefined value.)

Because C<local> is a run-time operator, it gets executed each time
through a loop. Consequently, it's more efficient to localize your
variables outside the loop.
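
To make the dynamic scoping described above concrete, here is a minimal
sketch. The variable C<$level> and the subroutines C<report> and
C<with_local> are invented for the example.

```perl
use strict;
use warnings;

our $level = "global";

sub report { return $level }         # sees whatever value is current

sub with_local {
    local $level = "temporary";      # old value saved, restored on exit
    return report();                 # the callee sees "temporary"
}

my @trace = (report(), with_local(), report());
print "@trace\n";    # global temporary global
```

Note that C<report> never mentions C<local>; it picks up the temporary
value only because it is called while C<with_local> is still running.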

=head3 Grammatical note on local()
X<local, context>

A C<local> is simply a modifier on an lvalue expression. When you assign to
a C<local>ized variable, the C<local> doesn't change whether its list is viewed
as a scalar or an array. So

    local($foo) = <STDIN>;
    local @FOO = <STDIN>;

both supply a list context to the right-hand side, while

    local $foo = <STDIN>;

supplies a scalar context.
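
One way to observe this without reading from a filehandle is to put a
context-reporting subroutine on the right-hand side. This is only a sketch;
C<context> is a hypothetical helper built on C<wantarray>.

```perl
use strict;
use warnings;

our ($foo, @FOO);

# Hypothetical helper that reports the context it was called in.
sub context { return wantarray ? "list" : "scalar" }

my @got;
{
    local($foo) = context();    # parenthesized: list assignment
    push @got, $foo;
    local @FOO  = context();    # array: list assignment
    push @got, $FOO[0];
}
{
    local $foo  = context();    # bare scalar: scalar assignment
    push @got, $foo;
}
print "@got\n";    # list list scalar
```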

=head3 Localization of special variables
X<local, special variable>

If you localize a special variable, you'll be giving a new value to it,
but its magic won't go away. That means that all side-effects related
to this magic still work with the localized value.

This feature allows code like this to work:

    # Read the whole contents of FILE in $slurp
    { local $/ = undef; $slurp = <FILE>; }

Note, however, that this restricts localization of some values; for
example, the following statement dies, as of perl 5.10.0, with an error
I<Modification of a read-only value attempted>, because the $1 variable is
magical and read-only:

    local $1 = 2;

One exception is the default scalar variable: starting with perl 5.14
C<local($_)> will always strip all magic from $_, to make it possible
to safely reuse $_ in a subroutine.

B<WARNING>: Localization of tied arrays and hashes does not currently
work as described.
This will be fixed in a future release of Perl; in the meantime, avoid
code that relies on any particular behavior of localising tied arrays
or hashes (localising individual elements is still okay).
See L<perl58delta/"Localising Tied Arrays and Hashes Is Broken"> for more
details.
X<local, tie>

=head3 Localization of globs
X<local, glob> X<glob>

The construct

    local *name;

creates a whole new symbol table entry for the glob C<name> in the
current package. That means that all variables in its glob slot ($name,
@name, %name, &name, and the C<name> filehandle) are dynamically reset.

This implies, among other things, that any magic those variables may
carry is locally lost. In other words, saying C<local */>
will not have any effect on the internal value of the input record
separator.

=head3 Localization of elements of composite types
X<local, composite type element> X<local, array element> X<local, hash element>

It's also worth taking a moment to explain what happens when you
C<local>ize a member of a composite type (i.e. an array or hash element).
In this case, the element is C<local>ized I<by name>. This means that
when the scope of the C<local()> ends, the saved value will be
restored to the hash element whose key was named in the C<local()>, or
the array element whose index was named in the C<local()>. If that
element was deleted while the C<local()> was in effect (e.g. by a
C<delete()> from a hash or a C<shift()> of an array), it will spring
back into existence, possibly extending an array and filling in the
skipped elements with C<undef>. For instance, if you say

    %hash = ( 'This' => 'is', 'a' => 'test' );
    @ary  = ( 0..5 );
    {
        local($ary[5]) = 6;
        local($hash{'a'}) = 'drill';
        while (my $e = pop(@ary)) {
            print "$e . . .\n";
            last unless $e > 3;
        }
        if (@ary) {
            $hash{'only a'} = 'test';
            delete $hash{'a'};
        }
    }
    print join(' ', map { "$_ $hash{$_}" } sort keys %hash),".\n";
    print "The array has ",scalar(@ary)," elements: ",
          join(', ', map { defined $_ ? $_ : 'undef' } @ary),"\n";

Perl will print

    6 . . .
    4 . . .
    3 . . .
    This is a test only a test.
    The array has 6 elements: 0, 1, 2, undef, undef, 5

The behavior of local() on non-existent members of composite
types is subject to change in the future. The behavior of local()
on array elements specified using negative indexes is particularly
surprising, and is very likely to change.

=head3 Localized deletion of elements of composite types
X<delete> X<local, composite type element> X<local, array element> X<local, hash element>

You can use the C<delete local $array[$idx]> and C<delete local $hash{key}>
constructs to delete a composite type entry for the current block and restore
it when it ends. They return the array/hash value before the localization,
which means that they are respectively equivalent to

    do {
        my $val = $array[$idx];
        local  $array[$idx];
        delete $array[$idx];
        $val
    }

and

    do {
        my $val = $hash{key};
        local  $hash{key};
        delete $hash{key};
        $val
    }

except that for those the C<local> is
scoped to the C<do> block. Slices are
also accepted.

    my %hash = (
        a => [ 7, 8, 9 ],
        b => 1,
    );

    {
        my $a = delete local $hash{a};
        # $a is [ 7, 8, 9 ]
        # %hash is (b => 1)

        {
            my @nums = delete local @$a[0, 2];
            # @nums is (7, 9)
            # $a is [ undef, 8 ]

            $a->[0] = 999; # will be erased when the scope ends
        }
        # $a is back to [ 7, 8, 9 ]

    }
    # %hash is back to its original state

=head2 Lvalue subroutines
X<lvalue> X<subroutine, lvalue>

It is possible to return a modifiable value from a subroutine.
To do this, you have to declare the subroutine to return an lvalue.

    my $val;
    sub canmod : lvalue {
        $val;  # or:  return $val;
    }
    sub nomod {
        $val;
    }

    canmod() = 5;   # assigns to $val
    nomod()  = 5;   # ERROR

The scalar/list context for the subroutine and for the right-hand
side of assignment is determined as if the subroutine call were replaced
by a scalar. For example, consider:

    data(2,3) = get_data(3,4);

Both subroutines here are called in a scalar context, while in:

    (data(2,3)) = get_data(3,4);

and in:

    (data(2),data(3)) = get_data(3,4);

all the subroutines are called in a list context.

Lvalue subroutines are convenient, but you have to keep in mind that,
when used with objects, they may violate encapsulation. A normal
mutator can check the supplied argument before setting the attribute
it is protecting; an lvalue subroutine cannot. If you require any
special processing when storing and retrieving the values, consider
using the CPAN module Sentinel or something similar.
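
The encapsulation concern can be sketched as follows. The hash C<%config>
and the subroutines C<retries> and C<set_retries> are hypothetical and exist
only to contrast the two styles.

```perl
use strict;
use warnings;

my %config = (retries => 3);

# Lvalue accessor: the caller assigns directly, so nothing is checked.
sub retries :lvalue { $config{retries} }

# Ordinary mutator: can reject bad input before storing it.
sub set_retries {
    my ($n) = @_;
    die "retries must be a non-negative integer\n"
        unless defined $n && $n =~ /^\d+$/;
    return $config{retries} = $n;
}

retries() = "lots";                        # accepted without complaint
my $unchecked = $config{retries};

my $ok = eval { set_retries("lots"); 1 };  # dies, so $ok stays undef
set_retries(5);
print "$unchecked $config{retries}\n";     # lots 5
```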

=head2 Lexical Subroutines
X<my sub> X<state sub> X<our sub> X<subroutine, lexical>

Beginning with Perl 5.18, you can declare a private subroutine with C<my>
or C<state>. As with state variables, the C<state> keyword is only
available under C<use feature 'state'> or C<use 5.010> or higher.

Prior to Perl 5.26, lexical subroutines were deemed experimental and were
available only under the C<use feature 'lexical_subs'> pragma. They also
produced a warning unless the "experimental::lexical_subs" warnings
category was disabled.

These subroutines are only visible within the block in which they are
declared, and only after that declaration:

    # Include these two lines if your code is intended to run under Perl
    # versions earlier than 5.26.
    no warnings "experimental::lexical_subs";
    use feature 'lexical_subs';

    foo();              # calls the package/global subroutine
    state sub foo {
        foo();          # also calls the package subroutine
    }
    foo();              # calls "state" sub
    my $ref = \&foo;    # take a reference to "state" sub

    my sub bar { ... }
    bar();              # calls "my" sub

You can't (directly) write a recursive lexical subroutine:

    # WRONG
    my sub baz {
        baz();
    }

This example fails because C<baz()> refers to the package/global subroutine
C<baz>, not the lexical subroutine currently being defined.

The solution is to use L<C<__SUB__>|perlfunc/__SUB__>, which is available
under C<use feature 'current_sub'> or C<use v5.16> or higher:

    my sub baz {
        __SUB__->();    # calls itself
    }

It is possible to predeclare a lexical subroutine. The C<sub foo {...}>
subroutine definition syntax respects any previous C<my sub;> or C<state sub;>
declaration. Using this to define recursive subroutines is a bad idea,
however:

    my sub baz;         # predeclaration
    sub baz {           # define the "my" sub
        baz();          # WRONG: calls itself, but leaks memory
    }

Just like C<< my $f; $f = sub { $f->() } >>, this example leaks memory. The
name C<baz> is a reference to the subroutine, and the subroutine uses the name
C<baz>; they keep each other alive (see L<perlref/Circular References>).

=head3 C<state sub> vs C<my sub>

What is the difference between "state" subs and "my" subs? Each time that
execution enters a block in which "my" subs are declared, a new copy of each
sub is created. "State" subroutines persist from one execution of the
containing block to the next.

So, in general, "state" subroutines are faster. But "my" subs are
necessary if you want to create closures:

    sub whatever {
        my $x = shift;
        my sub inner {
            ... do something with $x ...
        }
        inner();
    }

In this example, a new C<$x> is created when C<whatever> is called, and
also a new C<inner>, which can see the new C<$x>. A "state" sub will only
see the C<$x> from the first call to C<whatever>.
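
A runnable variant of the closure case might look like the sketch below.
The names C<greeter> and C<greet> are illustrative only; the two pragma
lines are included so the code also runs on Perl versions before 5.26.

```perl
use strict;
use warnings;
use feature 'lexical_subs';
no warnings "experimental::lexical_subs";   # only needed before Perl 5.26

sub greeter {
    my $name = shift;
    my sub greet { return "hello, $name" }  # new closure on every call
    return greet();
}

my @out = (greeter("alice"), greeter("bob"));
print join("; ", @out), "\n";    # hello, alice; hello, bob
```

Each call to C<greeter> creates a fresh C<greet> that sees that call's
C<$name>.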

=head3 C<our> subroutines

Like C<our $variable>, C<our sub> creates a lexical alias to the package
subroutine of the same name.

The two main uses for this are to switch back to using the package sub
inside an inner scope:

    sub foo { ... }

    sub bar {
        my sub foo { ... }
        {
            # need to use the outer foo here
            our sub foo;
            foo();
        }
    }

and to make a subroutine visible to other packages in the same scope:

    package MySneakyModule;

    our sub do_something { ... }

    sub do_something_with_caller {
        package DB;
        () = caller 1;          # sets @DB::args
        do_something(@args);    # uses MySneakyModule::do_something
    }

=head2 Passing Symbol Table Entries (typeglobs)
X<typeglob> X<*>

B<WARNING>: The mechanism described in this section was originally
the only way to simulate pass-by-reference in older versions of
Perl. While it still works fine in modern versions, the new reference
mechanism is generally easier to work with. See below.

Sometimes you don't want to pass the value of an array to a subroutine
but rather the name of it, so that the subroutine can modify the global
copy of it rather than working with a local copy. In perl you can
refer to all objects of a particular name by prefixing the name
with a star: C<*foo>. This is often known as a "typeglob", because the
star on the front can be thought of as a wildcard match for all the
funny prefix characters on variables and subroutines and such.

When evaluated, the typeglob produces a scalar value that represents
all the objects of that name, including any filehandle, format, or
subroutine. When assigned to, it causes the name mentioned to refer to
whatever C<*> value was assigned to it. Example:

    sub doubleary {
        local(*someary) = @_;
        foreach $elem (@someary) {
            $elem *= 2;
        }
    }
    doubleary(*foo);
    doubleary(*bar);

Scalars are already passed by reference, so you can modify
scalar arguments without using this mechanism by referring explicitly
to C<$_[0]> etc. You can modify all the elements of an array by passing
all the elements as scalars, but you have to use the C<*> mechanism (or
the equivalent reference mechanism) to C<push>, C<pop>, or change the size of
an array. It will certainly be faster to pass the typeglob (or reference).

Even if you don't want to modify an array, this mechanism is useful for
passing multiple arrays in a single LIST, because normally the LIST
mechanism will merge all the array values so that you can't extract out
the individual arrays. For more on typeglobs, see
L<perldata/"Typeglobs and Filehandles">.
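
For comparison, the "equivalent reference mechanism" mentioned above could
be sketched like this. C<double_ary> is a hypothetical counterpart to
C<doubleary>; unlike the typeglob version it also works on C<my> arrays.

```perl
use strict;
use warnings;

sub double_ary {
    my ($aref) = @_;
    $_ *= 2 for @$aref;      # modifies the caller's array in place
    push @$aref, 0;          # resizing works too, via the reference
    return;
}

my @foo = (1, 2, 3);
double_ary(\@foo);
print "@foo\n";    # 2 4 6 0
```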

=head2 When to Still Use local()
X<local> X<variable, local>

Despite the existence of C<my>, there are three places where the
C<local> operator still shines. In fact, in these three places, you
I<must> use C<local> instead of C<my>.

=over 4

=item 1.

You need to give a global variable a temporary value, especially $_.

The global variables, like C<@ARGV> or the punctuation variables, must be
C<local>ized with C<local()>. This block reads in F</etc/motd>, and splits
it up into chunks separated by lines of equal signs, which are placed
in C<@Fields>.

    {
        local @ARGV = ("/etc/motd");
        local $/ = undef;
        local $_ = <>;
        @Fields = split /^\s*=+\s*$/m;
    }

In particular, it's important to C<local>ize $_ in any routine that assigns
to it. Look out for implicit assignments in C<while> conditionals.

=item 2.

You need to create a local file or directory handle or a local function.

A function that needs a filehandle of its own must use
C<local()> on a complete typeglob. This can be used to create new symbol
table entries:

    sub ioqueue {
        local  (*READER, *WRITER);    # not my!
        pipe    (READER,  WRITER) or die "pipe: $!";
        return (*READER, *WRITER);
    }
    ($head, $tail) = ioqueue();

See the Symbol module for a way to create anonymous symbol table
entries.

Because assignment of a reference to a typeglob creates an alias, this
can be used to create what is effectively a local function, or at least,
a local alias.

    {
        local *grow = \&shrink; # only until this block exits
        grow();                 # really calls shrink()
        move();                 # if move() grow()s, it shrink()s too
    }
    grow();                     # get the real grow() again

See L<perlref/"Function Templates"> for more about manipulating
functions by name in this way.

=item 3.

You want to temporarily change just one element of an array or hash.

You can C<local>ize just one element of an aggregate. Usually this
is done on dynamics:

    {
        local $SIG{INT} = 'IGNORE';
        funct();                            # uninterruptible
    }
    # interruptibility automatically restored here

But it also works on lexically declared aggregates.

=back
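
The last point of item 3, localizing an element of a lexically declared
aggregate, can be sketched as below. C<%opt> and C<is_verbose> are
hypothetical names used only for this illustration.

```perl
use strict;
use warnings;

my %opt = (verbose => 0);            # a lexical, not a package variable

sub is_verbose { return $opt{verbose} }

my @seen = (is_verbose());
{
    local $opt{verbose} = 1;         # only this element is localized
    push @seen, is_verbose();
}
push @seen, is_verbose();            # old value is back
print "@seen\n";    # 0 1 0
```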

=head2 Pass by Reference
X<pass by reference> X<pass-by-reference> X<reference>

If you want to pass more than one array or hash into a function--or
return them from it--and have them maintain their integrity, then
you're going to have to use an explicit pass-by-reference. Before you
do that, you need to understand references as detailed in L<perlref>.
This section may not make much sense to you otherwise.

Here are a few simple examples. First, let's pass in several arrays
to a function and have it C<pop> all of them, returning a new list
of all their former last elements:

    @tailings = popmany ( \@a, \@b, \@c, \@d );

    sub popmany {
        my $aref;
        my @retlist;
        foreach $aref ( @_ ) {
            push @retlist, pop @$aref;
        }
        return @retlist;
    }

Here's how you might write a function that returns a
list of keys occurring in all the hashes passed to it:

    @common = inter( \%foo, \%bar, \%joe );
    sub inter {
        my ($k, $href, %seen); # locals
        foreach $href (@_) {
            while ( $k = each %$href ) {
                $seen{$k}++;
            }
        }
        return grep { $seen{$_} == @_ } keys %seen;
    }

So far, we're using just the normal list return mechanism.
What happens if you want to pass or return a hash? Well,
if you're using only one of them, or you don't mind them
concatenating, then the normal calling convention is ok, although
a little expensive.

Where people get into trouble is here:

    (@a, @b) = func(@c, @d);
or
    (%a, %b) = func(%c, %d);

That syntax simply won't work. It sets just C<@a> or C<%a> and
clears the C<@b> or C<%b>. Plus the function wasn't passed
two separate arrays or hashes: it got one long list in C<@_>,
as always.
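
The flattening is easy to see by counting arguments. This is a small
sketch; C<count_args> is a made-up helper.

```perl
use strict;
use warnings;

sub count_args { return scalar @_ }       # how many scalars arrived?

my @c = (1, 2);
my @d = (3, 4, 5);

my $flat = count_args(@c, @d);            # one flattened list
my $refs = count_args(\@c, \@d);          # two array references

my (@a, @b) = (@c, @d);                   # @a takes everything
print "$flat $refs ", scalar(@a), " ", scalar(@b), "\n";   # 5 2 5 0
```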

If you can arrange for everyone to deal with this through references, it's
cleaner code, although not so nice to look at. Here's a function that
takes two array references as arguments, returning the two array references
in order of how many elements they have in them:

    ($aref, $bref) = func(\@c, \@d);
    print "@$aref has more than @$bref\n";
    sub func {
        my ($cref, $dref) = @_;
        if (@$cref > @$dref) {
            return ($cref, $dref);
        } else {
            return ($dref, $cref);
        }
    }

It turns out that you can actually do this also:

    (*a, *b) = func(\@c, \@d);
    print "@a has more than @b\n";
    sub func {
        local (*c, *d) = @_;
        if (@c > @d) {
            return (\@c, \@d);
        } else {
            return (\@d, \@c);
        }
    }

Here we're using the typeglobs to do symbol table aliasing. It's
a tad subtle, though, and also won't work if you're using C<my>
variables, because only globals (even in disguise as C<local>s)
are in the symbol table.

If you're passing around filehandles, you could usually just use the bare
typeglob, like C<*STDOUT>, but typeglob references work, too.
For example:

    splutter(\*STDOUT);
    sub splutter {
        my $fh = shift;
        print $fh "her um well a hmmm\n";
    }

    $rec = get_rec(\*STDIN);
    sub get_rec {
        my $fh = shift;
        return scalar <$fh>;
    }

If you're planning on generating new filehandles, you could do this.
Note that it passes back just the bare *FH, not a reference to it.

    sub openit {
        my $path = shift;
        local *FH;
        return open (FH, $path) ? *FH : undef;
    }
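
As an alternative that the text above does not cover, the same job can be
done with a lexical filehandle and three-argument C<open>, which avoids
touching the symbol table at all. The sketch below is an assumption about
how one might write it, not a replacement for the typeglob technique; the
scratch file is created with the core File::Temp module purely so the
example has something to read.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Returns a lexical filehandle, or undef if the open fails.
sub openit {
    my $path = shift;
    open(my $fh, '<', $path) or return undef;
    return $fh;
}

# Create a scratch file so the sketch is self-contained.
my ($tmp, $tmpname) = tempfile(UNLINK => 1);
print {$tmp} "first line\n";
close $tmp or die "close: $!";

my $in = openit($tmpname) or die "open $tmpname: $!";
my $line = <$in>;
close $in;
print $line;    # first line
```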

=head2 Prototypes
X<prototype> X<subroutine, prototype>

Perl supports a very limited kind of compile-time argument checking
using function prototyping. This can be declared in either the PROTO
section or with a L<prototype attribute|attributes/Built-in Attributes>.
If you declare either of

    sub mypush (\@@)
    sub mypush :prototype(\@@)

then C<mypush()> takes arguments exactly like C<push()> does.

If subroutine signatures are enabled (see L</Signatures>), then
the shorter PROTO syntax is unavailable, because it would clash with
signatures. In that case, a prototype can only be declared in the form
of an attribute.

The
function declaration must be visible at compile time. The prototype
affects only interpretation of new-style calls to the function,
where new-style is defined as not using the C<&> character. In
other words, if you call it like a built-in function, then it behaves
like a built-in function. If you call it like an old-fashioned
subroutine, then it behaves like an old-fashioned subroutine. It
naturally falls out from this rule that prototypes have no influence
on subroutine references like C<\&foo> or on indirect subroutine
calls like C<&{$subref}> or C<< $subref->() >>.

Method calls are not influenced by prototypes either, because the
function to be called is indeterminate at compile time, since
the exact code called depends on inheritance.

Because the intent of this feature is primarily to let you define
subroutines that work like built-in functions, here are prototypes
for some other functions that parse almost exactly like the
corresponding built-in.

    Declared as             Called as

    sub mylink ($$)         mylink $old, $new
    sub myvec ($$$)         myvec $var, $offset, 1
    sub myindex ($$;$)      myindex &getstring, "substr"
    sub mysyswrite ($$$;$)  mysyswrite $buf, 0, length($buf) - $off, $off
    sub myreverse (@)       myreverse $a, $b, $c
    sub myjoin ($@)         myjoin ":", $a, $b, $c
    sub mypop (\@)          mypop @array
    sub mysplice (\@$$@)    mysplice @array, 0, 2, @pushme
    sub mykeys (\[%@])      mykeys %{$hashref}
    sub myopen (*;$)        myopen HANDLE, $name
    sub mypipe (**)         mypipe READHANDLE, WRITEHANDLE
    sub mygrep (&@)         mygrep { /foo/ } $a, $b, $c
    sub myrand (;$)         myrand 42
    sub mytime ()           mytime

Any backslashed prototype character represents an actual argument
that must start with that character (optionally preceded by C<my>,
C<our> or C<local>), with the exception of C<$>, which will
accept any scalar lvalue expression, such as C<$foo = 7> or
C<< my_function()->[0] >>. The value passed as part of C<@_> will be a
reference to the actual argument given in the subroutine call,
obtained by applying C<\> to that argument.
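
A minimal sketch of a backslashed prototype in action follows. The body of
C<mypush> here is one possible implementation written for this example; it
shows both a new-style call and an C<&> call that bypasses the prototype.

```perl
use strict;
use warnings;

# Declared before use, so the prototype is visible at compile time.
sub mypush (\@@) {
    my ($aref, @new) = @_;     # first argument arrives as \@array
    push @$aref, @new;
    return scalar @$aref;
}

my @stack = (1, 2);
my $len = mypush @stack, 3, 4;         # called like the built-in
my $old = &mypush(\@stack, 5);         # & bypasses the prototype
print "@stack ($len, $old)\n";    # 1 2 3 4 5 (4, 5)
```

With the C<&> form the caller has to take the reference explicitly, because
no prototype is applied to that call.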
c07a80fd 1484
c035a075 1485You can use the C<\[]> backslash group notation to specify more than one
b77865f5 1486allowed argument type. For example:
5b794e05
JH
1487
1488 sub myref (\[$@%&*])
1489
1490will allow calling myref() as
1491
1492 myref $var
1493 myref @array
1494 myref %hash
1495 myref &sub
1496 myref *glob
1497
1498and the first argument of myref() will be a reference to
1499a scalar, an array, a hash, a code, or a glob.
1500
c07a80fd 1501Unbackslashed prototype characters have special meanings. Any
19799a22 1502unbackslashed C<@> or C<%> eats all remaining arguments, and forces
f86cebdf
GS
1503list context. An argument represented by C<$> forces scalar context. An
1504C<&> requires an anonymous subroutine, which, if passed as the first
0df79f0c
GS
1505argument, does not require the C<sub> keyword or a subsequent comma.
1506
1507A C<*> allows the subroutine to accept a bareword, constant, scalar expression,
648ca4f7
GS
1508typeglob, or a reference to a typeglob in that slot. The value will be
1509available to the subroutine either as a simple scalar, or (in the latter
0df79f0c
GS
1510two cases) as a reference to the typeglob. If you wish to always convert
1511such arguments to a typeglob reference, use Symbol::qualify_to_ref() as
1512follows:
1513
1514 use Symbol 'qualify_to_ref';
1515
1516 sub foo (*) {
1517 my $fh = qualify_to_ref(shift, caller);
1518 ...
1519 }
c07a80fd 1520
c035a075
DG
1521The C<+> prototype is a special alternative to C<$> that will act like
1522C<\[@%]> when given a literal array or hash variable, but will otherwise
1523force scalar context on the argument. This is useful for functions which
1524should accept either a literal array or an array reference as the argument:
1525
cba5a3b0 1526 sub mypush (+@) {
c035a075
DG
1527 my $aref = shift;
1528 die "Not an array or arrayref" unless ref $aref eq 'ARRAY';
1529 push @$aref, @_;
1530 }
1531
1532When using the C<+> prototype, your function must check that the argument
1533is of an acceptable type.
1534
859a4967 1535A semicolon (C<;>) separates mandatory arguments from optional arguments.
19799a22 1536It is redundant before C<@> or C<%>, which gobble up everything else.
cb1a09d0 1537
1538As the last character of a prototype, or just before a semicolon, a C<@>
1539or a C<%>, you can use C<_> in place of C<$>: if this argument is not
1540provided, C<$_> will be used instead.
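
For instance, a one-argument function that falls back to C<$_> might be
sketched like this. The name C<shout> is invented for the example.

```perl
use strict;
use warnings;

# With "_" the argument defaults to $_ when the caller omits it.
sub shout (_) {
    my $text = shift;
    return uc $text;
}

print shout("hello"), "\n";      # HELLO

for ('a', 'b') {
    print shout(), "\n";         # A, then B: $_ was used
}
```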
859a4967 1541
1542Note how the last three examples in the table above are treated
1543specially by the parser. C<mygrep()> is parsed as a true list
1544operator, C<myrand()> is parsed as a true unary operator with unary
1545precedence the same as C<rand()>, and C<mytime()> is truly without
1546arguments, just like C<time()>. That is, if you say
1547
1548 mytime +2;
1549
f86cebdf 1550you'll get C<mytime() + 2>, not C<mytime(2)>, which is how it would be parsed
1551without a prototype. If you want to force a unary function to have the
1552same precedence as a list operator, add C<;> to the end of the prototype:
1553
1554 sub mygetprotobynumber($;);
1555 mygetprotobynumber $a > $b; # parsed as mygetprotobynumber($a > $b)
cb1a09d0 1556
1557The interesting thing about C<&> is that you can generate new syntax with it,
1558provided it's in the initial position:
d74e8afc 1559X<&>
cb1a09d0 1560
6d28dffb 1561 sub try (&@) {
1562 my($try,$catch) = @_;
1563 eval { &$try };
1564 if ($@) {
1565 local $_ = $@;
1566 &$catch;
1567 }
1568 }
55497cff 1569 sub catch (&) { $_[0] }
1570
1571 try {
1572 die "phooey";
1573 } catch {
1574 /phooey/ and print "unphooey\n";
1575 };
1576
f86cebdf 1577That prints C<"unphooey">. (Yes, there are still unresolved
19799a22 1578issues having to do with visibility of C<@_>. I'm ignoring that
f86cebdf 1579question for the moment. (But note that if we make C<@_> lexically
cb1a09d0 1580scoped, those anonymous subroutines can act like closures... (Gee,
5f05dabc 1581is this sounding a little Lispish? (Never mind.))))
cb1a09d0 1582
19799a22 1583And here's a reimplementation of the Perl C<grep> operator:
d74e8afc 1584X<grep>
1585
1586 sub mygrep (&@) {
1587 my $code = shift;
1588 my @result;
1589 foreach $_ (@_) {
6e47f808 1590 push(@result, $_) if &$code;
1591 }
1592 @result;
1593 }
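
Called the way the built-in is called, the reimplementation could be
exercised as in this sketch. The definition is repeated so that the
snippet stands alone, and the sample data is arbitrary.

```perl
use strict;
use warnings;

sub mygrep (&@) {
    my $code = shift;
    my @result;
    foreach $_ (@_) {
        push(@result, $_) if &$code;
    }
    @result;
}

# The leading block needs neither "sub" nor a trailing comma.
my @evens   = mygrep { $_ % 2 == 0 } 1 .. 10;
my @builtin = grep   { $_ % 2 == 0 } 1 .. 10;

print "@evens\n";       # 2 4 6 8 10
print "@builtin\n";     # 2 4 6 8 10
```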
a0d0e21e 1594
1595Some folks would prefer full alphanumeric prototypes. Alphanumerics have
1596been intentionally left out of prototypes for the express purpose of
1597someday in the future adding named, formal parameters. The current
1598mechanism's main goal is to let module writers provide better diagnostics
1599for module users. Larry feels the notation quite understandable to Perl
1600programmers, and that it will not intrude greatly upon the meat of the
1601module, nor make it harder to read. The line noise is visually
1602encapsulated into a small pill that's easy to swallow.
1603
1604If you try to use an alphanumeric sequence in a prototype you will
1605generate an optional warning - "Illegal character in prototype...".
1606Unfortunately earlier versions of Perl allowed the prototype to be
1607used as long as its prefix was a valid prototype. The warning may be
1608upgraded to a fatal error in a future version of Perl once the
1609majority of offending code is fixed.
1610
1611It's probably best to prototype new functions, not retrofit prototyping
1612into older ones. That's because you must be especially careful about
1613silent impositions of differing list versus scalar contexts. For example,
1614if you decide that a function should take just one parameter, like this:
1615
1616 sub func ($) {
1617 my $n = shift;
1618 print "you gave me $n\n";
54310121 1619 }
1620
1621and someone has been calling it with an array or expression
1622returning a list:
1623
1624 func(@foo);
f2606479 1625 func( $text =~ /\w+/g );
cb1a09d0 1626
19799a22 1627Then you've just supplied an automatic C<scalar> in front of their
f86cebdf 1628argument, which can be more than a bit surprising. The old C<@foo>
cb1a09d0 1629which used to hold one thing doesn't get passed in. Instead,
19799a22 1630C<func()> now gets passed in a C<1>; that is, the number of elements
1631in C<@foo>. And the C<m//g> gets called in scalar context so instead of a
1632list of words it returns a boolean result and advances C<pos($text)>. Ouch!
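
The effect can be seen in a small sketch like the following, where func()
returns its message instead of printing it so the results are easy to
compare. The sample data is made up. The C<&func(...)> call is included
only to show what the caller used to get before the prototype was added.

```perl
use strict;
use warnings;

sub func ($) {
    my $n = shift;
    return "you gave me $n";
}

my @foo = ('apple');
print func(@foo),  "\n";    # you gave me 1     (scalar(@foo))
print &func(@foo), "\n";    # you gave me apple (prototype bypassed)

my $text = "one two three";
print func( $text =~ /\w+/g ), "\n";    # you gave me 1 (match succeeded)
print "pos is now ", pos($text), "\n";  # pos is now 3
```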
cb1a09d0 1633
1634If a sub has both a PROTO and a BLOCK, the prototype is not applied
1635until after the BLOCK is completely defined. This means that a recursive
1636function with a prototype has to be predeclared for the prototype to take
1637effect, like so:
1638
1639 sub foo($$);
1640 sub foo($$) {
1641 foo 1, 2;
1642 }
1643
5f05dabc 1644This is all very powerful, of course, and should be used only in moderation
54310121 1645to make the world a better place.
44a8e56a 1646
1647=head2 Constant Functions
d74e8afc 1648X<constant>
44a8e56a 1649
1650Functions with a prototype of C<()> are potential candidates for
1651inlining. If the result after optimization and constant folding
1652is either a constant or a lexically-scoped scalar which has no other
54310121 1653references, then it will be used in place of function calls made
1654without C<&>. Calls made using C<&> are never inlined. (See
1655F<constant.pm> for an easy way to declare most constants.)
44a8e56a 1656
5a964f20 1657The following functions would all be inlined:
44a8e56a 1658
1659 sub pi () { 3.14159 } # Not exact, but close.
1660 sub PI () { 4 * atan2 1, 1 } # As good as it gets,
1661 # and it's inlined, too!
44a8e56a 1662 sub ST_DEV () { 0 }
1663 sub ST_INO () { 1 }
1664
1665 sub FLAG_FOO () { 1 << 8 }
1666 sub FLAG_BAR () { 1 << 9 }
1667 sub FLAG_MASK () { FLAG_FOO | FLAG_BAR }
54310121 1668
1669 sub OPT_BAZ () { not (0x1B58 & FLAG_MASK) }
1670
1671 sub N () { int(OPT_BAZ) / 3 }
1672
1673 sub FOO_SET () { 1 if FLAG_MASK & FLAG_FOO }
d3c633ba 1674 sub FOO_SET2 () { if (FLAG_MASK & FLAG_FOO) { 1 } }
88267271 1675
1676(Be aware that the last example was not always inlined in Perl 5.20 and
1677earlier, which did not behave consistently with subroutines containing
1678inner scopes.) You can countermand inlining by using an explicit
1679C<return>:
1680
1681 sub baz_val () {
44a8e56a 1682 if (OPT_BAZ) {
1683 return 23;
1684 }
1685 else {
1686 return 42;
1687 }
1688 }
d3c633ba 1689 sub bonk_val () { return 12345 }
cb1a09d0 1690
1691As alluded to earlier you can also declare inlined subs dynamically at
1692BEGIN time if their body consists of a lexically-scoped scalar which
b77865f5 1693has no other references. Only the first example here will be inlined:
1694
1695 BEGIN {
1696 my $var = 1;
1697 no strict 'refs';
1698 *INLINED = sub () { $var };
1699 }
1700
1701 BEGIN {
1702 my $var = 1;
1703 my $ref = \$var;
1704 no strict 'refs';
1705 *NOT_INLINED = sub () { $var };
1706 }
1707
1708A not so obvious caveat with this (see [RT #79908]) is that the
1709variable will be immediately inlined, and will stop behaving like a
1710normal lexical variable, e.g. this will print C<79907>, not C<79908>:
1711
1712 BEGIN {
1713 my $x = 79907;
1714 *RT_79908 = sub () { $x };
1715 $x++;
1716 }
1717 print RT_79908(); # prints 79907
1718
1719As of Perl 5.22, this buggy behavior, while preserved for backward
1720compatibility, is detected and emits a deprecation warning. If you want
1721the subroutine to be inlined (with no warning), make sure the variable is
1722not used in a context where it could be modified aside from where it is
1723declared.
1724
1725 # Fine, no warning
1726 BEGIN {
1727 my $x = 54321;
1728 *INLINED = sub () { $x };
1729 }
1730 # Warns. Future Perl versions will stop inlining it.
1731 BEGIN {
1732 my $x;
1733 $x = 54321;
1734 *ALSO_INLINED = sub () { $x };
1735 }
1736
1737Perl 5.22 also introduces the experimental "const" attribute as an
1738alternative. (Disable the "experimental::const_attr" warnings if you want
1739to use it.) When applied to an anonymous subroutine, it forces the sub to
1740be called when the C<sub> expression is evaluated. The return value is
1741captured and turned into a constant subroutine:
1742
1743 my $x = 54321;
1744 *INLINED = sub : const { $x };
1745 $x++;
1746
The return value of C<INLINED> in this example will always be 54321,
regardless of later modifications to C<$x>. You can also put any arbitrary
code inside the sub, and it will be executed immediately and its return
value captured the same way.
1751
1752If you really want a subroutine with a C<()> prototype that returns a
1753lexical variable you can easily force it to not be inlined by adding
1754an explicit C<return>:
1755
1756 BEGIN {
1757 my $x = 79907;
1758 *RT_79908 = sub () { return $x };
1759 $x++;
1760 }
1761 print RT_79908(); # prints 79908
1762
1763The easiest way to tell if a subroutine was inlined is by using
d3c633ba 1764L<B::Deparse>. Consider this example of two subroutines returning
1765C<1>, one with a C<()> prototype causing it to be inlined, and one
1766without (with deparse output truncated for clarity):
1767
1768 $ perl -MO=Deparse -le 'sub ONE { 1 } if (ONE) { print ONE if ONE }'
1769 sub ONE {
1770 1;
1771 }
1772 if (ONE ) {
1773 print ONE() if ONE ;
1774 }
1775 $ perl -MO=Deparse -le 'sub ONE () { 1 } if (ONE) { print ONE if ONE }'
1776 sub ONE () { 1 }
1777 do {
1778 print 1
1779 };
1780
If you redefine a subroutine that was eligible for inlining, you'll
get a warning by default. You can use this warning to tell whether or
not a particular subroutine is considered inlinable, since it's
different from the warning for overriding non-inlined subroutines:
1785
1786 $ perl -e 'sub one () {1} sub one () {2}'
1787 Constant subroutine one redefined at -e line 1.
1788 $ perl -we 'sub one {1} sub one {2}'
1789 Subroutine one redefined at -e line 1.
1790
1791The warning is considered severe enough not to be affected by the
1792B<-w> switch (or its absence) because previously compiled invocations
1793of the function will still be using the old value of the function. If
1794you need to be able to redefine the subroutine, you need to ensure
1795that it isn't inlined, either by dropping the C<()> prototype (which
1796changes calling semantics, so beware) or by thwarting the inlining
1797mechanism in some other way, e.g. by adding an explicit C<return>, as
1798mentioned above:
1799
1800 sub not_inlined () { return 23 }
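
The same check can be made from inside a program. The following is only a
sketch: it assumes that capturing warnings through C<$SIG{__WARN__}> and
compiling the definitions with a string C<eval> is acceptable for your
purposes, and the subroutine names C<one> and C<two> are placeholders.

```perl
use strict;
use warnings;

# Collect every warning emitted while the snippets are compiled.
my @warnings;
$SIG{__WARN__} = sub { push @warnings, $_[0] };

# Inlinable: empty prototype, constant body.
eval q{ sub one () { 1 } sub one () { 2 } 1 } or die $@;

# Not inlinable: no prototype.
eval q{ sub two { 1 } sub two { 2 } 1 } or die $@;

my $one_inlinable = grep { /^Constant subroutine one redefined/ } @warnings;
my $two_inlinable = grep { /^Constant subroutine two redefined/ } @warnings;

print "one was ", ($one_inlinable ? "" : "not "), "inlinable\n";
print "two was ", ($two_inlinable ? "" : "not "), "inlinable\n";
```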
4cee8e80 1801
19799a22 1802=head2 Overriding Built-in Functions
d74e8afc 1803X<built-in> X<override> X<CORE> X<CORE::GLOBAL>
a0d0e21e 1804
19799a22 1805Many built-in functions may be overridden, though this should be tried
5f05dabc 1806only occasionally and for good reason. Typically this might be
19799a22 1807done by a package attempting to emulate missing built-in functionality
1808on a non-Unix system.
1809
1810Overriding may be done only by importing the name from a module at
1811compile time--ordinary predeclaration isn't good enough. However, the
1812C<use subs> pragma lets you, in effect, predeclare subs
1813via the import syntax, and these names may then override built-in ones:
1814
1815 use subs 'chdir', 'chroot', 'chmod', 'chown';
1816 chdir $somewhere;
1817 sub chdir { ... }
1818
1819To unambiguously refer to the built-in form, precede the
1820built-in name with the special package qualifier C<CORE::>. For example,
1821saying C<CORE::open()> always refers to the built-in C<open()>, even
fb73857a 1822if the current package has imported some other subroutine called
19799a22 1823C<&open()> from elsewhere. Even though it looks like a regular
1824function call, it isn't: the CORE:: prefix in that case is part of Perl's
1825syntax, and works for any keyword, regardless of what is in the CORE
1826package. Taking a reference to it, that is, C<\&CORE::open>, only works
1827for some keywords. See L<CORE>.
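
To make the interplay between C<use subs> and C<CORE::> concrete, here is
a minimal sketch that overrides C<hex> in the current package and still
reaches the built-in from inside the replacement. The choice of C<hex> and
the output format are arbitrary; the example avoids C<chdir> and friends
only so that it has no side effects.

```perl
use strict;
use warnings;
use subs 'hex';    # predeclare via import so the override takes effect

# Replacement: delegate to the real built-in, then decorate the result.
sub hex {
    my $digits = shift;
    return "0x$digits = " . CORE::hex($digits);
}

print hex("ff"), "\n";          # 0xff = 255   (our override)
print CORE::hex("ff"), "\n";    # 255          (always the built-in)
```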
fb73857a 1828
1829Library modules should not in general export built-in names like C<open>
1830or C<chdir> as part of their default C<@EXPORT> list, because these may
a0d0e21e 1831sneak into someone else's namespace and change the semantics unexpectedly.
19799a22 1832Instead, if the module adds that name to C<@EXPORT_OK>, then it's
1833possible for a user to import the name explicitly, but not implicitly.
1834That is, they could say
1835
1836 use Module 'open';
1837
19799a22 1838and it would import the C<open> override. But if they said
1839
1840 use Module;
1841
19799a22 1842they would get the default imports without overrides.
a0d0e21e 1843
The foregoing mechanism for overriding built-ins is restricted, quite
deliberately, to the package that requests the import. There is a second
method that is sometimes applicable when you wish to override a built-in
everywhere, without regard to namespace boundaries. This is achieved by
importing a sub into the special namespace C<CORE::GLOBAL::>. Here is an
example that quite brazenly replaces the C<glob> operator with something
that understands regular expressions.
1851
1852 package REGlob;
1853 require Exporter;
1854 @ISA = 'Exporter';
1855 @EXPORT_OK = 'glob';
1856
1857 sub import {
1858 my $pkg = shift;
1859 return unless @_;
1860 my $sym = shift;
1861 my $where = ($sym =~ s/^GLOBAL_// ? 'CORE::GLOBAL' : caller(0));
1862 $pkg->export($where, $sym, @_);
1863 }
1864
1865 sub glob {
1866 my $pat = shift;
1867 my @got;
1868 if (opendir my $d, '.') {
1869 @got = grep /$pat/, readdir $d;
1870 closedir $d;
1871 }
1872 return @got;
1873 }
1874 1;
1875
1876And here's how it could be (ab)used:
1877
1878 #use REGlob 'GLOBAL_glob'; # override glob() in ALL namespaces
1879 package Foo;
1880 use REGlob 'glob'; # override glob() in Foo:: only
1881 print for <^[a-z_]+\.pm\$>; # show all pragmatic modules
1882
19799a22 1883The initial comment shows a contrived, even dangerous example.
95d94a4f 1884By overriding C<glob> globally, you would be forcing the new (and
19799a22 1885subversive) behavior for the C<glob> operator for I<every> namespace,
1886without the complete cognizance or cooperation of the modules that own
1887those namespaces. Naturally, this should be done with extreme caution--if
1888it must be done at all.
1889
The C<REGlob> example above does not implement all the support needed to
cleanly override Perl's C<glob> operator. The built-in C<glob> has
different behaviors depending on whether it appears in a scalar or list
context, but our C<REGlob> doesn't. Indeed, many Perl built-ins have such
context-sensitive behaviors, and these must be adequately supported by
a properly written override. For a fully functional example of overriding
C<glob>, study the implementation of C<File::DosGlob> in the standard
library.
1898
1899When you override a built-in, your replacement should be consistent (if
1900possible) with the built-in native syntax. You can achieve this by using
1901a suitable prototype. To get the prototype of an overridable built-in,
1902use the C<prototype> function with an argument of C<"CORE::builtin_name">
1903(see L<perlfunc/prototype>).
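
As a quick sketch of that lookup, the loop below asks for the prototypes
of a few built-ins. The exact strings returned for C<push> and C<hex> can
differ between Perl versions, so they are printed rather than assumed;
C<system> and C<chomp> are expected to yield C<undef> because their syntax
cannot be expressed as a prototype.

```perl
use strict;
use warnings;

for my $name (qw(push hex system chomp)) {
    my $proto = prototype("CORE::$name");
    printf "%-6s %s\n", $name,
        defined $proto ? $proto : '(no prototype)';
}
```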
1904
1905Note however that some built-ins can't have their syntax expressed by a
1906prototype (such as C<system> or C<chomp>). If you override them you won't
1907be able to fully mimic their original syntax.
1908
The built-ins C<do>, C<require> and C<glob> can also be overridden, but due
to special magic, their original syntax is preserved, and you don't have
to define a prototype for their replacements. (You can't override the
C<do BLOCK> syntax, though.)
1913
1914C<require> has special additional dark magic: if you invoke your
1915C<require> replacement as C<require Foo::Bar>, it will actually receive
1916the argument C<"Foo/Bar.pm"> in @_. See L<perlfunc/require>.
1917
1918And, as you'll have noticed from the previous example, if you override
593b9c14 1919C<glob>, the C<< <*> >> glob operator is overridden as well.
77bc9082 1920
In a similar fashion, overriding the C<readline> function also overrides
the equivalent I/O operator C<< <FILEHANDLE> >>. Likewise, overriding
C<readpipe> overrides the operators C<``> and C<qx//>.
9b3023bc 1924
fe854a6f 1925Finally, some built-ins (e.g. C<exists> or C<grep>) can't be overridden.
77bc9082 1926
a0d0e21e 1927=head2 Autoloading
d74e8afc 1928X<autoloading> X<AUTOLOAD>
a0d0e21e 1929
1930If you call a subroutine that is undefined, you would ordinarily
1931get an immediate, fatal error complaining that the subroutine doesn't
1932exist. (Likewise for subroutines being used as methods, when the
1933method doesn't exist in any base class of the class's package.)
1934However, if an C<AUTOLOAD> subroutine is defined in the package or
1935packages used to locate the original subroutine, then that
1936C<AUTOLOAD> subroutine is called with the arguments that would have
1937been passed to the original subroutine. The fully qualified name
1938of the original subroutine magically appears in the global $AUTOLOAD
1939variable of the same package as the C<AUTOLOAD> routine. The name
1940is not passed as an ordinary argument because, er, well, just
593b9c14 1941because, that's why. (As an exception, a method call to a nonexistent
80ee23cd 1942C<import> or C<unimport> method is just skipped instead. Also, if
1943the AUTOLOAD subroutine is an XSUB, there are other ways to retrieve the
1944subroutine name. See L<perlguts/Autoloading with XSUBs> for details.)
80ee23cd 1945
1946
1947Many C<AUTOLOAD> routines load in a definition for the requested
1948subroutine using eval(), then execute that subroutine using a special
1949form of goto() that erases the stack frame of the C<AUTOLOAD> routine
1950without a trace. (See the source to the standard module documented
1951in L<AutoLoader>, for example.) But an C<AUTOLOAD> routine can
1952also just emulate the routine and never define it. For example,
1953let's pretend that a function that wasn't defined should just invoke
1954C<system> with those arguments. All you'd do is:
1955
1956 sub AUTOLOAD {
1957 our $AUTOLOAD; # keep 'use strict' happy
1958 my $program = $AUTOLOAD;
1959 $program =~ s/.*:://;
1960 system($program, @_);
54310121 1961 }
cb1a09d0 1962 date();
33666205 1963 who();
1964 ls('-l');
1965
19799a22
GS
1966In fact, if you predeclare functions you want to call that way, you don't
1967even need parentheses:
1968
1969 use subs qw(date who ls);
1970 date;
33666205 1971 who;
593b9c14 1972 ls '-l';
cb1a09d0 1973
13058d67 1974A more complete example of this is the Shell module on CPAN, which
19799a22 1975can treat undefined subroutine calls as calls to external programs.
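
The other style mentioned above, defining the missing subroutine on first
use and then jumping to it with C<goto>, might look roughly like this.
This is a sketch, not how AutoLoader itself works: it installs a closure
through a typeglob instead of using C<eval>, and the package name C<Lazy>
and the generated behavior are invented for illustration.

```perl
use strict;
use warnings;

package Lazy;

our $AUTOLOAD;

sub AUTOLOAD {
    my $name = $AUTOLOAD;
    $name =~ s/.*:://;
    return if $name eq 'DESTROY';    # never autoload destructors

    # Install a real subroutine under the requested name...
    no strict 'refs';
    *{$AUTOLOAD} = sub { return "$name(" . join(', ', @_) . ")" };

    # ...then replace this stack frame with a call to it, keeping @_.
    goto &{$AUTOLOAD};
}

package main;

print Lazy::frobnicate(1, 2), "\n";    # frobnicate(1, 2)

# The second call no longer goes through AUTOLOAD.
print defined(&Lazy::frobnicate) ? "now defined\n" : "still undefined\n";
```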
a0d0e21e 1976
Mechanisms are available to help module writers split their modules
into autoloadable files. See the standard AutoLoader module
described in L<AutoLoader> and in L<AutoSplit>, the standard
SelfLoader module in L<SelfLoader>, and the document on adding C
functions to Perl code in L<perlxs>.
cb1a09d0 1982
09bef843 1983=head2 Subroutine Attributes
d74e8afc 1984X<attribute> X<subroutine, attribute> X<attrs>
1985
1986A subroutine declaration or definition may have a list of attributes
1987associated with it. If such an attribute list is present, it is
0120eecf 1988broken up at space or colon boundaries and treated as though a
1989C<use attributes> had been seen. See L<attributes> for details
1990about what attributes are currently supported.
1991Unlike the limitation with the obsolescent C<use attrs>, the
1992C<sub : ATTRLIST> syntax works to associate the attributes with
1993a pre-declaration, and not just with a subroutine definition.
1994
1995The attributes must be valid as simple identifier names (without any
1996punctuation other than the '_' character). They may have a parameter
1997list appended, which is only checked for whether its parentheses ('(',')')
1998nest properly.
1999
2000Examples of valid syntax (even though the attributes are unknown):
2001
2002 sub fnord (&\%) : switch(10,foo(7,3)) : expensive;
2003 sub plugh () : Ugly('\(") :Bad;
2004 sub xyzzy : _5x5 { ... }
2005
2006Examples of invalid syntax:
2007
2008 sub fnord : switch(10,foo(); # ()-string not balanced
2009 sub snoid : Ugly('('); # ()-string not balanced
2010 sub xyzzy : 5x5; # "5x5" not a valid identifier
2011 sub plugh : Y2::north; # "Y2::north" not a simple identifier
2012 sub snurt : foo + bar; # "+" not a colon or space
2013
2014The attribute list is passed as a list of constant strings to the code
2015which associates them with the subroutine. In particular, the second example
2016of valid syntax above currently looks like this in terms of how it's
2017parsed and invoked:
2018
2019 use attributes __PACKAGE__, \&plugh, q[Ugly('\(")], 'Bad';
2020
2021For further details on attribute lists and their manipulation,
a0ae32d3 2022see L<attributes> and L<Attribute::Handlers>.
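
To see the strings that actually reach the attribute machinery, a package
can define a C<MODIFY_CODE_ATTRIBUTES> handler, as described in
L<attributes>. The sketch below merely records what it is given. The
attribute names C<Expensive> and C<Cache> and the C<@seen> array are
invented for the example and have no built-in meaning.

```perl
use strict;
use warnings;

our @seen;    # package variable, filled in while xyzzy is compiled

# Called at compile time with the package name, a reference to the
# subroutine, and the attribute strings that are not built in.
sub MODIFY_CODE_ATTRIBUTES {
    my ($package, $coderef, @attrs) = @_;
    push @seen, @attrs;
    return;    # an empty list means every attribute was accepted
}

sub xyzzy : Expensive Cache(10) { return 42 }

print join(', ', @seen), "\n";    # Expensive, Cache(10)
```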
09bef843 2023
cb1a09d0 2024=head1 SEE ALSO
a0d0e21e 2025
2026See L<perlref/"Function Templates"> for more about references and closures.
2027See L<perlxs> if you'd like to learn about calling C subroutines from Perl.
a2293a43 2028See L<perlembed> if you'd like to learn about calling Perl subroutines from C.
2029See L<perlmod> to learn about bundling up your functions in separate files.
2030See L<perlmodlib> to learn what library modules come standard on your system.
82e1c0d9 2031See L<perlootut> to learn how to make object method calls.