=head1 NAME
X<subroutine> X<function>

perlsub - Perl subroutines

=head1 SYNOPSIS

To declare subroutines:
X<subroutine, declaration> X<sub>

    sub NAME;                       # A "forward" declaration.
    sub NAME(PROTO);                #  ditto, but with prototypes
    sub NAME : ATTRS;               #  with attributes
    sub NAME(PROTO) : ATTRS;        #  with attributes and prototypes

    sub NAME BLOCK                  # A declaration and a definition.
    sub NAME(PROTO) BLOCK           #  ditto, but with prototypes
    sub NAME SIG BLOCK              #  with signature
    sub NAME : ATTRS BLOCK          #  with attributes
    sub NAME(PROTO) : ATTRS BLOCK   #  with prototypes and attributes
    sub NAME : ATTRS SIG BLOCK      #  with attributes and signature

To define an anonymous subroutine at runtime:
X<subroutine, anonymous>

    $subref = sub BLOCK;                 # no proto
    $subref = sub (PROTO) BLOCK;         # with proto
    $subref = sub SIG BLOCK;             # with signature
    $subref = sub : ATTRS BLOCK;         # with attributes
    $subref = sub (PROTO) : ATTRS BLOCK; # with proto and attributes
    $subref = sub : ATTRS SIG BLOCK;     # with attribs and signature

To import subroutines:
X<import>

    use MODULE qw(NAME1 NAME2 NAME3);

To call subroutines:
X<subroutine, call> X<call>

    NAME(LIST);    # & is optional with parentheses.
    NAME LIST;     # Parentheses optional if predeclared/imported.
    &NAME(LIST);   # Circumvent prototypes.
    &NAME;         # Makes current @_ visible to called subroutine.

=head1 DESCRIPTION

Like many languages, Perl provides for user-defined subroutines.
These may be located anywhere in the main program, loaded in from
other files via the C<do>, C<require>, or C<use> keywords, or
generated on the fly using C<eval> or anonymous subroutines.
You can even call a function indirectly using a variable containing
its name or a CODE reference.

The Perl model for function call and return values is simple: all
functions are passed as parameters one single flat list of scalars, and
all functions likewise return to their caller one single flat list of
scalars.  Any arrays or hashes in these call and return lists will
collapse, losing their identities--but you may always use
pass-by-reference instead to avoid this.  Both call and return lists may
contain as many or as few scalar elements as you'd like.  (Often a
function without an explicit return statement is called a subroutine, but
there's really no difference from Perl's perspective.)
X<subroutine, parameter> X<parameter>

Any arguments passed in show up in the array C<@_>.
(They may also show up in lexical variables introduced by a signature;
see L</Signatures> below.)  Therefore, if
you called a function with two arguments, those would be stored in
C<$_[0]> and C<$_[1]>.  The array C<@_> is a local array, but its
elements are aliases for the actual scalar parameters.  In particular,
if an element C<$_[0]> is updated, the corresponding argument is
updated (or an error occurs if it is not updatable).  If an argument
is an array or hash element which did not exist when the function
was called, that element is created only when (and if) it is modified
or a reference to it is taken.  (Some earlier versions of Perl
created the element whether or not the element was assigned to.)
Assigning to the whole array C<@_> removes that aliasing, and does
not update any arguments.
X<subroutine, argument> X<argument> X<@_>

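As a minimal sketch of the aliasing rules just described, the following
contrasts writing to an element of C<@_> with assigning to the whole
array.  The subroutine names C<double_in_place>, C<no_effect> and C<peek>
are hypothetical and exist only for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: writes through the alias, so the caller's
# variable is changed.
sub double_in_place { $_[0] *= 2 }

# Hypothetical helper: assigning to the whole of @_ drops the aliasing,
# so the caller's variable is left alone.
sub no_effect {
    @_ = (99);
    return;
}

# Hypothetical helper: only reads its argument.
sub peek { return defined $_[0] }

my $n = 21;
double_in_place($n);    # $n is now 42
no_effect($n);          # $n is still 42

my %h;
peek($h{missing});                  # the element is only read ...
my $created = exists $h{missing};   # ... so it has not been created
```

Copying C<@_> into C<my> variables first, as shown later in this
document, is the usual way to avoid changing the caller's data by
accident.
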
A C<return> statement may be used to exit a subroutine, optionally
specifying the returned value, which will be evaluated in the
appropriate context (list, scalar, or void) depending on the context of
the subroutine call.  If you specify no return value, the subroutine
returns an empty list in list context, the undefined value in scalar
context, or nothing in void context.  If you return one or more
aggregates (arrays and hashes), these will be flattened together into
one large indistinguishable list.

If no C<return> is found and if the last statement is an expression, its
value is returned.  If the last statement is a loop control structure
like a C<foreach> or a C<while>, the returned value is unspecified.  The
empty sub returns the empty list.
X<subroutine, return value> X<return value> X<return>

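The return rules above can be sketched as follows.  The subroutine names
(C<nothing>, C<implicit>, C<pair_of_lists>) are hypothetical and chosen
only for this example.

```perl
use strict;
use warnings;

sub nothing  { return }                  # bare return, no value given
sub implicit { my ($x) = @_; $x + 1 }    # no return: value of last expression

my @list   = nothing();     # empty list in list context
my $scalar = nothing();     # undefined value in scalar context
my $seven  = implicit(6);   # 7

# Returned aggregates are flattened into one list.
sub pair_of_lists {
    my @first  = (1, 2);
    my @second = (3, 4);
    return (@first, @second);
}
my @flat = pair_of_lists();   # (1, 2, 3, 4): the array boundary is gone
```
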
Aside from an experimental facility (see L</Signatures> below),
Perl does not have named formal parameters.  In practice all you
do is assign to a C<my()> list of these.  Variables that aren't
declared to be private are global variables.  For gory details
on creating private variables, see L<"Private Variables via my()">
and L<"Temporary Values via local()">.  To create protected
environments for a set of functions in a separate package (and
probably a separate file), see L<perlmod/"Packages">.
X<formal parameter> X<parameter, formal>

Example:

    sub max {
        my $max = shift(@_);
        foreach $foo (@_) {
            $max = $foo if $max < $foo;
        }
        return $max;
    }
    $bestday = max($mon,$tue,$wed,$thu,$fri);

Example:

    # get a line, combining continuation lines
    #  that start with whitespace

    sub get_line {
        $thisline = $lookahead;  # global variables!
        LINE: while (defined($lookahead = <STDIN>)) {
            if ($lookahead =~ /^[ \t]/) {
                $thisline .= $lookahead;
            }
            else {
                last LINE;
            }
        }
        return $thisline;
    }

    $lookahead = <STDIN>;       # get first line
    while (defined($line = get_line())) {
        ...
    }

Assigning to a list of private variables to name your arguments:

    sub maybeset {
        my($key, $value) = @_;
        $Foo{$key} = $value unless $Foo{$key};
    }

Because the assignment copies the values, this also has the effect
of turning call-by-reference into call-by-value.  Otherwise a
function is free to do in-place modifications of C<@_> and change
its caller's values.
X<call-by-reference> X<call-by-value>

    upcase_in($v1, $v2);  # this changes $v1 and $v2
    sub upcase_in {
        for (@_) { tr/a-z/A-Z/ }
    }

You aren't allowed to modify constants in this way, of course.  If an
argument were actually literal and you tried to change it, you'd take a
(presumably fatal) exception.  For example, this won't work:
X<call-by-reference> X<call-by-value>

    upcase_in("frederick");

It would be much safer if the C<upcase_in()> function
were written to return a copy of its parameters instead
of changing them in place:

    ($v3, $v4) = upcase($v1, $v2);  # this doesn't change $v1 and $v2
    sub upcase {
        return unless defined wantarray;  # void context, do nothing
        my @parms = @_;
        for (@parms) { tr/a-z/A-Z/ }
        return wantarray ? @parms : $parms[0];
    }

Notice how this (unprototyped) function doesn't care whether it was
passed real scalars or arrays.  Perl sees all arguments as one big,
long, flat parameter list in C<@_>.  This is one area where
Perl's simple argument-passing style shines.  The C<upcase()>
function would work perfectly well without changing the C<upcase()>
definition even if we fed it things like this:

    @newlist = upcase(@list1, @list2);
    @newlist = upcase( split /:/, $var );

Do not, however, be tempted to do this:

    (@a, @b) = upcase(@list1, @list2);

Like the flattened incoming parameter list, the return list is also
flattened on return.  So all you have managed to do here is store
everything in C<@a> and make C<@b> empty.  See
L<Pass by Reference> for alternatives.

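One way to keep two lists apart, sketched here under the assumption that
the caller is happy to work with array references, is to pass and return
references instead of the arrays themselves.  The name C<upcase_both> is
hypothetical, and C<uc> stands in for the C<tr/a-z/A-Z/> used above.

```perl
use strict;
use warnings;

# Hypothetical variant of upcase(): takes two array references and
# returns two new array references, so neither list is flattened
# into the other.
sub upcase_both {
    my ($first, $second) = @_;
    my @up1 = map { uc } @$first;
    my @up2 = map { uc } @$second;
    return (\@up1, \@up2);
}

my @list1 = qw(a b);
my @list2 = qw(c d e);
my ($aref, $bref) = upcase_both(\@list1, \@list2);
# @$aref holds (A, B) and @$bref holds (C, D, E); @list1 and @list2
# are unchanged because map built new lists.
```
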
A subroutine may be called using an explicit C<&> prefix.  The
C<&> is optional in modern Perl, as are parentheses if the
subroutine has been predeclared.  The C<&> is I<not> optional
when just naming the subroutine, such as when it's used as
an argument to defined() or undef().  Nor is it optional when you
want to do an indirect subroutine call with a subroutine name or
reference using the C<&$subref()> or C<&{$subref}()> constructs,
although the C<< $subref->() >> notation solves that problem.
See L<perlref> for more about all that.
X<&>

Subroutines may be called recursively.  If a subroutine is called
using the C<&> form, the argument list is optional, and if omitted,
no C<@_> array is set up for the subroutine: the C<@_> array at the
time of the call is visible to the subroutine instead.  This is an
efficiency mechanism that new users may wish to avoid.
X<recursion>

    &foo(1,2,3);        # pass three arguments
    foo(1,2,3);         # the same

    foo();              # pass a null list
    &foo();             # the same

    &foo;               # foo() gets current args, like foo(@_) !!
    foo;                # like foo() IFF sub foo predeclared, else "foo"

Not only does the C<&> form make the argument list optional, it also
disables any prototype checking on arguments you do provide.  This
is partly for historical reasons, and partly for having a convenient way
to cheat if you know what you're doing.  See L</Prototypes> below.
X<&>

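The difference between C<&foo;> and C<&foo();> can be seen in a small
sketch.  The names C<count_args> and C<outer> are hypothetical.

```perl
use strict;
use warnings;

sub count_args { return scalar @_ }

sub outer {
    my $shared = &count_args;      # no argument list: callee sees outer's @_
    my $empty  = &count_args();    # explicit empty list: callee gets its own @_
    return ($shared, $empty);
}

my ($shared, $empty) = outer('a', 'b', 'c');   # (3, 0)
```

Because the callee shares the caller's C<@_> in the first form, anything
it does to that array (a C<shift>, for instance) is visible to the caller
afterwards.
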
Since Perl 5.16.0, the C<__SUB__> token is available under C<use feature
'current_sub'> and C<use 5.16.0>.  It will evaluate to a reference to the
currently-running sub, which allows for recursive calls without knowing
your subroutine's name.

    use 5.16.0;
    my $factorial = sub {
        my ($x) = @_;
        return 1 if $x == 1;
        return($x * __SUB__->( $x - 1 ) );
    };

The behaviour of C<__SUB__> within a regex code block (such as C</(?{...})/>)
is subject to change.

Subroutines whose names are in all upper case are reserved to the Perl
core, as are modules whose names are in all lower case.  A subroutine in
all capitals is a loosely-held convention meaning it will be called
indirectly by the run-time system itself, usually due to a triggered event.
Subroutines whose names start with a left parenthesis are also reserved in
the same way.  The following is a list of some subroutines that currently do
special, pre-defined things.

=over

=item documented later in this document

C<AUTOLOAD>

=item documented in L<perlmod>

C<CLONE>, C<CLONE_SKIP>

=item documented in L<perlobj>

C<DESTROY>

=item documented in L<perltie>

C<BINMODE>, C<CLEAR>, C<CLOSE>, C<DELETE>, C<DESTROY>, C<EOF>, C<EXISTS>,
C<EXTEND>, C<FETCH>, C<FETCHSIZE>, C<FILENO>, C<FIRSTKEY>, C<GETC>,
C<NEXTKEY>, C<OPEN>, C<POP>, C<PRINT>, C<PRINTF>, C<PUSH>, C<READ>,
C<READLINE>, C<SCALAR>, C<SEEK>, C<SHIFT>, C<SPLICE>, C<STORE>,
C<STORESIZE>, C<TELL>, C<TIEARRAY>, C<TIEHANDLE>, C<TIEHASH>,
C<TIESCALAR>, C<UNSHIFT>, C<UNTIE>, C<WRITE>

=item documented in L<PerlIO::via>

C<BINMODE>, C<CLEARERR>, C<CLOSE>, C<EOF>, C<ERROR>, C<FDOPEN>, C<FILENO>,
C<FILL>, C<FLUSH>, C<OPEN>, C<POPPED>, C<PUSHED>, C<READ>, C<SEEK>,
C<SETLINEBUF>, C<SYSOPEN>, C<TELL>, C<UNREAD>, C<UTF8>, C<WRITE>

=item documented in L<perlfunc>

L<< C<import> | perlfunc/use >>, L<< C<unimport> | perlfunc/use >>,
L<< C<INC> | perlfunc/require >>

=item documented in L<UNIVERSAL>

C<VERSION>

=item documented in L<perldebguts>

C<DB::DB>, C<DB::sub>, C<DB::lsub>, C<DB::goto>, C<DB::postponed>

=item undocumented, used internally by the L<overload> feature

any starting with C<(>

=back

The C<BEGIN>, C<UNITCHECK>, C<CHECK>, C<INIT> and C<END> subroutines
are not so much subroutines as named special code blocks, of which you
can have more than one in a package, and which you can B<not> call
explicitly.  See L<perlmod/"BEGIN, UNITCHECK, CHECK, INIT and END">.

=head2 Signatures

B<WARNING>: Subroutine signatures are experimental.  The feature may be
modified or removed in future versions of Perl.

Perl has an experimental facility to allow a subroutine's formal
parameters to be introduced by special syntax, separate from the
procedural code of the subroutine body.  The formal parameter list
is known as a I<signature>.  The facility must be enabled first by a
pragmatic declaration, C<use feature 'signatures'>, and it will produce
a warning unless the "experimental::signatures" warnings category is
disabled.

The signature is part of a subroutine's body.  Normally the body of a
subroutine is simply a braced block of code.  When using a signature,
the signature is a parenthesised list that goes immediately before
the braced block.  The signature declares lexical variables that are
in scope for the block.  When the subroutine is called, the signature
takes control first.  It populates the signature variables from the
list of arguments that were passed.  If the argument list doesn't meet
the requirements of the signature, then it will throw an exception.
When the signature processing is complete, control passes to the block.

Positional parameters are handled by simply naming scalar variables in
the signature.  For example,

    sub foo ($left, $right) {
        return $left + $right;
    }

takes two positional parameters, which must be filled at runtime by
two arguments.  By default the parameters are mandatory, and it is
not permitted to pass more arguments than expected.  So the above is
equivalent to

    sub foo {
        die "Too many arguments for subroutine" unless @_ <= 2;
        die "Too few arguments for subroutine" unless @_ >= 2;
        my $left = $_[0];
        my $right = $_[1];
        return $left + $right;
    }

An argument can be ignored by omitting the main part of the name from
a parameter declaration, leaving just a bare C<$> sigil.  For example,

    sub foo ($first, $, $third) {
        return "first=$first, third=$third";
    }

Although the ignored argument doesn't go into a variable, it is still
mandatory for the caller to pass it.

A positional parameter is made optional by giving a default value,
separated from the parameter name by C<=>:

    sub foo ($left, $right = 0) {
        return $left + $right;
    }

The above subroutine may be called with either one or two arguments.
The default value expression is evaluated when the subroutine is called,
so it may provide different default values for different calls.  It is
only evaluated if the argument was actually omitted from the call.
For example,

    my $auto_id = 0;
    sub foo ($thing, $id = $auto_id++) {
        print "$thing has ID $id";
    }

automatically assigns distinct sequential IDs to things for which no
ID was supplied by the caller.  A default value expression may also
refer to parameters earlier in the signature, making the default for
one parameter vary according to the earlier parameters.  For example,

    sub foo ($first_name, $surname, $nickname = $first_name) {
        print "$first_name $surname is known as \"$nickname\"";
    }

An optional parameter can be nameless just like a mandatory parameter.
For example,

    sub foo ($thing, $ = 1) {
        print $thing;
    }

The parameter's default value will still be evaluated if the corresponding
argument isn't supplied, even though the value won't be stored anywhere.
This is in case evaluating it has important side effects.  However, it
will be evaluated in void context, so if it doesn't have side effects
and is not trivial it will generate a warning if the "void" warning
category is enabled.  If a nameless optional parameter's default value
is not important, it may be omitted just as the parameter's name was:

    sub foo ($thing, $=) {
        print $thing;
    }

Optional positional parameters must come after all mandatory positional
parameters.  (If there are no mandatory positional parameters then an
optional positional parameter can be the first thing in the signature.)
If there are multiple optional positional parameters and not enough
arguments are supplied to fill them all, they will be filled from left
to right.

After positional parameters, additional arguments may be captured in a
slurpy parameter.  The simplest form of this is just an array variable:

    sub foo ($filter, @inputs) {
        print $filter->($_) foreach @inputs;
    }

With a slurpy parameter in the signature, there is no upper limit on how
many arguments may be passed.  A slurpy array parameter may be nameless
just like a positional parameter, in which case its only effect is to
turn off the argument limit that would otherwise apply:

    sub foo ($thing, @) {
        print $thing;
    }

A slurpy parameter may instead be a hash, in which case the arguments
available to it are interpreted as alternating keys and values.
There must be as many keys as values: if there is an odd argument then
an exception will be thrown.  Keys will be stringified, and if there are
duplicates then the later instance takes precedence over the earlier,
as with standard hash construction.

    sub foo ($filter, %inputs) {
        print $filter->($_, $inputs{$_}) foreach sort keys %inputs;
    }

A slurpy hash parameter may be nameless just like other kinds of
parameter.  It still insists that the number of arguments available to
it be even, even though they're not being put into a variable.

    sub foo ($thing, %) {
        print $thing;
    }

A slurpy parameter, either array or hash, must be the last thing in the
signature.  It may follow mandatory and optional positional parameters;
it may also be the only thing in the signature.  Slurpy parameters cannot
have default values: if no arguments are supplied for them then you get
an empty array or empty hash.

A signature may be entirely empty, in which case all it does is check
that the caller passed no arguments:

    sub foo () {
        return 123;
    }

When using a signature, the arguments are still available in the special
array variable C<@_>, in addition to the lexical variables of the
signature.  There is a difference between the two ways of accessing the
arguments: C<@_> I<aliases> the arguments, but the signature variables
get I<copies> of the arguments.  So writing to a signature variable
only changes that variable, and has no effect on the caller's variables,
but writing to an element of C<@_> modifies whatever the caller used to
supply that argument.

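The copy-versus-alias distinction can be sketched with two hypothetical
subroutines, C<bump_copy> (which uses a signature) and C<bump_alias>
(which writes to C<$_[0]> and has no signature).  The example assumes a
Perl new enough to support C<use feature 'signatures'>.

```perl
use strict;
use warnings;
use feature 'signatures';
no warnings 'experimental::signatures';

# $n is a copy of the argument, so the caller's variable is untouched.
sub bump_copy ($n) {
    $n++;
    return $n;
}

# No signature here: $_[0] is an alias for the caller's variable.
sub bump_alias {
    $_[0]++;
    return $_[0];
}

my $value       = 1;
my $from_copy   = bump_copy($value);    # returns 2
my $after_copy  = $value;               # still 1
my $from_alias  = bump_alias($value);   # returns 2
my $after_alias = $value;               # now 2
```
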
There is a potential syntactic ambiguity between signatures and prototypes
(see L</Prototypes>), because both start with an opening parenthesis and
both can appear in some of the same places, such as just after the name
in a subroutine declaration.  For historical reasons, when signatures
are not enabled, any opening parenthesis in such a context will trigger
very forgiving prototype parsing.  Most signatures will be interpreted
as prototypes in those circumstances, but won't be valid prototypes.
(A valid prototype cannot contain any alphabetic character.)  This will
lead to somewhat confusing error messages.

To avoid ambiguity, when signatures are enabled the special syntax
for prototypes is disabled.  There is no attempt to guess whether a
parenthesised group was intended to be a prototype or a signature.
To give a subroutine a prototype under these circumstances, use a
L<prototype attribute|attributes/Built-in Attributes>.  For example,

    sub foo :prototype($) { $_[0] }

It is entirely possible for a subroutine to have both a prototype and
a signature.  They do different jobs: the prototype affects compilation
of calls to the subroutine, and the signature puts argument values into
lexical variables at runtime.  You can therefore write

    sub foo :prototype($$) ($left, $right) {
        return $left + $right;
    }

The prototype attribute, and any other attributes, must come before
the signature.  The signature always immediately precedes the block of
the subroutine's body.

=head2 Private Variables via my()
X<my> X<variable, lexical> X<lexical> X<lexical variable> X<scope, lexical>
X<lexical scope> X<attributes, my>

Synopsis:

    my $foo;            # declare $foo lexically local
    my (@wid, %get);    # declare list of variables local
    my $foo = "flurp";  # declare $foo lexical, and init it
    my @oof = @bar;     # declare @oof lexical, and init it
    my $x : Foo = $y;   # similar, with an attribute applied

B<WARNING>: The use of attribute lists on C<my> declarations is still
evolving.  The current semantics and interface are subject to change.
See L<attributes> and L<Attribute::Handlers>.

The C<my> operator declares the listed variables to be lexically
confined to the enclosing block, conditional (C<if/unless/elsif/else>),
loop (C<for/foreach/while/until/continue>), subroutine, C<eval>,
or C<do/require/use>'d file.  If more than one value is listed, the
list must be placed in parentheses.  All listed elements must be
legal lvalues.  Only alphanumeric identifiers may be lexically
scoped--magical built-ins like C<$/> must currently be C<local>ized
with C<local> instead.

Unlike dynamic variables created by the C<local> operator, lexical
variables declared with C<my> are totally hidden from the outside
world, including any called subroutines.  This is true if it's the
same subroutine called from itself or elsewhere--every call gets
its own copy.
X<local>

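A short sketch may make the contrast with C<local> concrete.  The package
variable C<$level> and the subroutines C<show_level>, C<with_local> and
C<with_my> are hypothetical names used only here.

```perl
use strict;
use warnings;

our $level = 'outer';                # a package (dynamic) variable

sub show_level { return $level }     # reads the package variable

sub with_local {
    local $level = 'localized';      # temporary value, visible to callees
    return show_level();
}

sub with_my {
    my $level = 'private';           # a new lexical, hidden from callees
    return show_level();
}

my $seen_local = with_local();       # 'localized'
my $seen_my    = with_my();          # 'outer'
```
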
This doesn't mean that a C<my> variable declared in a statically
enclosing lexical scope would be invisible.  Only dynamic scopes
are cut off.   For example, the C<bumpx()> function below has access
to the lexical $x variable because both the C<my> and the C<sub>
occurred at the same scope, presumably file scope.

    my $x = 10;
    sub bumpx { $x++ }

An C<eval()>, however, can see lexical variables of the scope it is
being evaluated in, so long as the names aren't hidden by declarations within
the C<eval()> itself.  See L<perlref>.
X<eval, scope of>

The parameter list to my() may be assigned to if desired, which allows you
to initialize your variables.  (If no initializer is given for a
particular variable, it is created with the undefined value.)  Commonly
this is used to name input parameters to a subroutine.  Examples:

    $arg = "fred";          # "global" variable
    $n = cube_root(27);
    print "$arg thinks the root is $n\n";
    fred thinks the root is 3

    sub cube_root {
        my $arg = shift;  # name doesn't matter
        $arg **= 1/3;
        return $arg;
    }

The C<my> is simply a modifier on something you might assign to.  So when
you do assign to variables in its argument list, C<my> doesn't
change whether those variables are viewed as a scalar or an array.  So

    my ($foo) = <STDIN>;                # WRONG?
    my @FOO = <STDIN>;

both supply a list context to the right-hand side, while

    my $foo = <STDIN>;

supplies a scalar context.  But the following declares only one variable:

    my $foo, $bar = 1;                  # WRONG

That has the same effect as

    my $foo;
    $bar = 1;

The declared variable is not introduced (is not visible) until after
the current statement.  Thus,

    my $x = $x;

can be used to initialize a new $x with the value of the old $x, and
the expression

    my $x = 123 and $x == 123

is false unless the old $x happened to have the value C<123>.

Lexical scopes of control structures are not bounded precisely by the
braces that delimit their controlled blocks; control expressions are
part of that scope, too.  Thus in the loop

    while (my $line = <>) {
        $line = lc $line;
    } continue {
        print $line;
    }

the scope of $line extends from its declaration throughout the rest of
the loop construct (including the C<continue> clause), but not beyond
it.  Similarly, in the conditional

    if ((my $answer = <STDIN>) =~ /^yes$/i) {
        user_agrees();
    } elsif ($answer =~ /^no$/i) {
        user_disagrees();
    } else {
        chomp $answer;
        die "'$answer' is neither 'yes' nor 'no'";
    }

the scope of $answer extends from its declaration through the rest
of that conditional, including any C<elsif> and C<else> clauses,
but not beyond it.  See L<perlsyn/"Simple Statements"> for information
on the scope of variables in statements with modifiers.

The C<foreach> loop defaults to scoping its index variable dynamically
in the manner of C<local>.  However, if the index variable is
prefixed with the keyword C<my>, or if there is already a lexical
by that name in scope, then a new lexical is created instead.  Thus
in the loop
X<foreach> X<for>

    for my $i (1, 2, 3) {
        some_function();
    }

the scope of $i extends to the end of the loop, but not beyond it,
rendering the value of $i inaccessible within C<some_function()>.
X<foreach> X<for>

Some users may wish to encourage the use of lexically scoped variables.
As an aid to catching implicit uses of package variables,
which are always global, if you say

    use strict 'vars';

then any variable mentioned from there to the end of the enclosing
block must either refer to a lexical variable, be predeclared via
C<our> or C<use vars>, or else must be fully qualified with the package name.
A compilation error results otherwise.  An inner block may countermand
this with C<no strict 'vars'>.

A C<my> has both a compile-time and a run-time effect.  At compile
time, the compiler takes notice of it.  The principal usefulness
of this is to quiet C<use strict 'vars'>, but it is also essential
for generation of closures as detailed in L<perlref>.  Actual
initialization is delayed until run time, though, so it gets executed
at the appropriate time, such as each time through a loop, for
example.

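The run-time half of C<my> is what lets each pass through a loop hand a
distinct variable to a closure.  A minimal sketch, using the hypothetical
names C<@subs>, C<$square> and C<@results>:

```perl
use strict;
use warnings;

my @subs;
for my $i (1 .. 3) {
    my $square = $i * $i;                  # a fresh $square on every pass
    push @subs, sub { return $square };    # each closure captures its own
}

my @results = map { $_->() } @subs;        # (1, 4, 9)
```

Had C<$square> been a single variable declared once outside the loop,
all three closures would have shared it and returned the same value.
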
Variables declared with C<my> are not part of any package and are therefore
never fully qualified with the package name.  In particular, you're not
allowed to try to make a package variable (or other global) lexical:

    my $pack::var;      # ERROR!  Illegal syntax

In fact, dynamic variables (also known as package or global variables)
are still accessible using the fully qualified C<::> notation even while a
lexical of the same name is also visible:

    package main;
    local $x = 10;
    my    $x = 20;
    print "$x and $::x\n";

That will print out C<20> and C<10>.

You may declare C<my> variables at the outermost scope of a file
to hide any such identifiers from the world outside that file.  This
is similar in spirit to C's static variables when they are used at
the file level.  To do this with a subroutine requires the use of
a closure (an anonymous function that accesses enclosing lexicals).
If you want to create a private subroutine that cannot be called
from outside that block, you can declare a lexical variable containing
an anonymous sub reference:

    my $secret_version = '1.001-beta';
    my $secret_sub = sub { print $secret_version };
    &$secret_sub();

As long as the reference is never returned by any function within the
module, no outside module can see the subroutine, because its name is not in
any package's symbol table.  Remember that it's not I<REALLY> called
C<$some_pack::secret_version> or anything; it's just $secret_version,
unqualified and unqualifiable.

This does not work with object methods, however; all object methods
have to be in the symbol table of some package to be found.  See
L<perlref/"Function Templates"> for something of a work-around to
this.

=head2 Persistent Private Variables
X<state> X<state variable> X<static> X<variable, persistent> X<variable, static> X<closure>

There are two ways to build persistent private variables in Perl 5.10
and later.  First, you can simply use the C<state> feature.  Or, you can
use closures, if you want to stay compatible with releases older than 5.10.

=head3 Persistent variables via state()

Beginning with Perl 5.10.0, you can declare variables with the C<state>
keyword in place of C<my>.  For that to work, though, you must have
enabled that feature beforehand, either by using the C<feature> pragma, or
by using C<-E> on one-liners (see L<feature>).  Beginning with Perl 5.16,
the C<CORE::state> form does not require the
C<feature> pragma.

The C<state> keyword creates a lexical variable (following the same scoping
rules as C<my>) that persists from one subroutine call to the next.  If a
state variable resides inside an anonymous subroutine, then each copy of
the subroutine has its own copy of the state variable.  However, the value
of the state variable will still persist between calls to the same copy of
the anonymous subroutine.  (Don't forget that C<sub { ... }> creates a new
subroutine each time it is executed.)

For example, the following code maintains a private counter, incremented
each time the gimme_another() function is called:

    use feature 'state';
    sub gimme_another { state $x; return ++$x }

And this example uses anonymous subroutines to create separate counters:

    use feature 'state';
    sub create_counter {
        return sub { state $x; return ++$x }
    }

Also, since C<$x> is lexical, it can't be reached or modified by any Perl
code outside.

When combined with variable declaration, simple scalar assignment to C<state>
variables (as in C<state $x = 42>) is executed only the first time.  When such
statements are evaluated subsequent times, the assignment is ignored.  The
behavior of this sort of assignment to non-scalar variables is undefined.

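The once-only initialisation can be sketched by comparing a C<state>
variable with an otherwise identical C<my> variable.  The subroutine
names C<next_id> and C<same_id> are hypothetical.

```perl
use strict;
use warnings;
use feature 'state';

sub next_id {
    state $id = 100;    # assignment runs on the first call only
    return $id++;
}

sub same_id {
    my $id = 100;       # assignment runs on every call
    return $id++;
}

my @with_state = (next_id(), next_id(), next_id());   # (100, 101, 102)
my @with_my    = (same_id(), same_id(), same_id());   # (100, 100, 100)
```
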
=head3 Persistent variables with closures

Just because a lexical variable is lexically (also called statically)
scoped to its enclosing block, C<eval>, or C<do> FILE, this doesn't mean that
within a function it works like a C static.  It normally works more
like a C auto, but with implicit garbage collection.

Unlike local variables in C or C++, Perl's lexical variables don't
necessarily get recycled just because their scope has exited.
If something more permanent is still aware of the lexical, it will
stick around.  So long as something else references a lexical, that
lexical won't be freed--which is as it should be.  You wouldn't want
memory being freed until you were done using it, or kept around once you
were done.  Automatic garbage collection takes care of this for you.

This means that you can pass back or save away references to lexical
variables, whereas to return a pointer to a C auto is a grave error.
It also gives us a way to simulate C's function statics.  Here's a
mechanism for giving a function private variables with both lexical
scoping and a static lifetime.  If you do want to create something like
C's static variables, just enclose the whole function in an extra block,
and put the static variable outside the function but in the block.

    {
        my $secret_val = 0;
        sub gimme_another {
            return ++$secret_val;
        }
    }
    # $secret_val now becomes unreachable by the outside
    # world, but retains its value between calls to gimme_another

If this function is being sourced in from a separate file
via C<require> or C<use>, then this is probably just fine.  If it's
all in the main program, you'll need to arrange for the C<my>
to be executed early, either by putting the whole block above
your main program, or more likely, placing merely a C<BEGIN>
code block around it to make sure it gets executed before your program
starts to run:

    BEGIN {
        my $secret_val = 0;
        sub gimme_another {
            return ++$secret_val;
        }
    }

See L<perlmod/"BEGIN, UNITCHECK, CHECK, INIT and END"> about the
special triggered code blocks, C<BEGIN>, C<UNITCHECK>, C<CHECK>,
C<INIT> and C<END>.

If declared at the outermost scope (the file scope), then lexicals
work somewhat like C's file statics.  They are available to all
functions in that same file declared below them, but are inaccessible
from outside that file.  This strategy is sometimes used in modules
to create private variables that the whole module can see.

800=head2 Temporary Values via local()
801X<local> X<scope, dynamic> X<dynamic scope> X<variable, local>
802X<variable, temporary>
803
804B<WARNING>: In general, you should be using C<my> instead of C<local>, because
805it's faster and safer. Exceptions to this include the global punctuation
806variables, global filehandles and formats, and direct manipulation of the
807Perl symbol table itself. C<local> is mostly used when the current value
808of a variable must be visible to called subroutines.
809
810Synopsis:
811
812 # localization of values
813
814 local $foo; # make $foo dynamically local
815 local (@wid, %get); # make list of variables local
816 local $foo = "flurp"; # make $foo dynamic, and init it
817 local @oof = @bar; # make @oof dynamic, and init it
818
819 local $hash{key} = "val"; # sets a local value for this hash entry
820 delete local $hash{key}; # delete this entry for the current block
821 local ($cond ? $v1 : $v2); # several types of lvalues support
822 # localization
823
824 # localization of symbols
825
826 local *FH; # localize $FH, @FH, %FH, &FH ...
827 local *merlyn = *randal; # now $merlyn is really $randal, plus
828 # @merlyn is really @randal, etc
829 local *merlyn = 'randal'; # SAME THING: promote 'randal' to *randal
830 local *merlyn = \$randal; # just alias $merlyn, not @merlyn etc
831
832A C<local> modifies its listed variables to be "local" to the
833enclosing block, C<eval>, or C<do FILE>--and to I<any subroutine
834called from within that block>. A C<local> just gives temporary
835values to global (meaning package) variables. It does I<not> create
836a local variable. This is known as dynamic scoping. Lexical scoping
837is done with C<my>, which works more like C's auto declarations.
838
839Some types of lvalues can be localized as well: hash and array elements
840and slices, conditionals (provided that their result is always
841localizable), and symbolic references. As for simple variables, this
842creates new, dynamically scoped values.
843
844If more than one variable or expression is given to C<local>, they must be
845placed in parentheses. This operator works
846by saving the current values of those variables in its argument list on a
847hidden stack and restoring them upon exiting the block, subroutine, or
848eval. This means that called subroutines can also reference the local
849variable, but not the global one. The argument list may be assigned to if
850desired, which allows you to initialize your local variables. (If no
851initializer is given for a particular variable, it is created with an
852undefined value.)
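To make the contrast with C<my> concrete, here is a small sketch; the subroutine names are hypothetical. A value set with C<local> is seen by any subroutine called from the block, while a C<my> variable of the same name is not:

```perl
use strict;
use warnings;

our $level = "global";

sub show { return $level }    # always reads the package variable

sub with_local {
    local $level = "local";   # temporary value, seen by callees
    return show();
}

sub with_my {
    my $level = "my";         # new lexical, invisible to show()
    return show();
}

my @seen = (with_local(), with_my(), show());
print "@seen\n";
```

The last call to C<show> confirms that the global value is restored once C<with_local> has returned.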
853
854Because C<local> is a run-time operator, it gets executed each time
855through a loop. Consequently, it's more efficient to localize your
856variables outside the loop.
857
858=head3 Grammatical note on local()
859X<local, context>
860
861A C<local> is simply a modifier on an lvalue expression. When you assign to
862a C<local>ized variable, the C<local> doesn't change whether its list is viewed
863as a scalar or an array. So
864
865 local($foo) = <STDIN>;
866 local @FOO = <STDIN>;
867
868both supply a list context to the right-hand side, while
869
870 local $foo = <STDIN>;
871
872supplies a scalar context.
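The same rule can be observed with C<wantarray>. In this sketch the helper C<context> is an assumption introduced for illustration; it merely reports how it was called:

```perl
use strict;
use warnings;

our ($one, $two, @many);

# Reports the context in which it was called.
sub context { return wantarray ? "list" : "scalar" }

my @seen;
{
    local $one   = context();    # scalar assignment
    local ($two) = context();    # list assignment
    local @many  = context();    # list assignment
    @seen = ($one, $two, $many[0]);
}
print "@seen\n";
```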
873
874=head3 Localization of special variables
875X<local, special variable>
876
877If you localize a special variable, you'll be giving a new value to it,
878but its magic won't go away. That means that all side-effects related
879to this magic still work with the localized value.
880
881This feature allows code like this to work:
882
883 # Read the whole contents of FILE in $slurp
884 { local $/ = undef; $slurp = <FILE>; }
885
886Note, however, that this restricts localization of some values; for
887example, the following statement dies, as of perl 5.10.0, with an error
888I<Modification of a read-only value attempted>, because the $1 variable is
889magical and read-only:
890
891 local $1 = 2;
892
893One exception is the default scalar variable: starting with perl 5.14
894C<local($_)> will always strip all magic from $_, to make it possible
895to safely reuse $_ in a subroutine.
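A self-contained variant of the slurp idiom shown earlier in this section might look like this. It reads from an in-memory filehandle (an assumption made so that the sketch needs no external file) and checks that C<$/> keeps its magic inside the block and gets its old value back afterwards:

```perl
use strict;
use warnings;

my $data = "one\ntwo\nthree\n";
open my $fh, '<', \$data or die "cannot open in-memory file: $!";

my $first = <$fh>;               # reads one line: $/ is "\n"
my $rest;
{
    local $/;                    # undef: read everything that is left
    $rest = <$fh>;
}
my $restored = $/ eq "\n";       # true once the block has exited
close $fh;
```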
896
897B<WARNING>: Localization of tied arrays and hashes does not currently
898work as described.
899This will be fixed in a future release of Perl; in the meantime, avoid
900code that relies on any particular behaviour of localising tied arrays
901or hashes (localising individual elements is still okay).
902See L<perl58delta/"Localising Tied Arrays and Hashes Is Broken"> for more
903details.
904X<local, tie>
905
906=head3 Localization of globs
907X<local, glob> X<glob>
908
909The construct
910
911 local *name;
912
913creates a whole new symbol table entry for the glob C<name> in the
914current package. That means that all variables in its glob slot ($name,
915@name, %name, &name, and the C<name> filehandle) are dynamically reset.
916
917This implies, among other things, that any magic eventually carried by
918those variables is locally lost. In other words, saying C<local */>
919will not have any effect on the internal value of the input record
920separator.
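The effect on the individual slots can be sketched like this, assuming a package in which the name C<name> is used for a scalar, an array and a subroutine at once:

```perl
use strict;
use warnings;

our $name = "scalar";
our @name = (1, 2, 3);
sub name { return "code" }

my @inside;
{
    local *name;                        # fresh, empty glob
    @inside = (
        defined($name)  ? 1 : 0,        # scalar slot is empty
        scalar(@name),                  # array slot is empty
        defined(&name)  ? 1 : 0,        # code slot is empty
    );
}
my @outside = (
    defined($name) ? 1 : 0,
    scalar(@name),
    defined(&name) ? 1 : 0,
);
```

Inside the block every slot appears unset; once the block exits, the original glob and all of its slots are back.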
921
922=head3 Localization of elements of composite types
923X<local, composite type element> X<local, array element> X<local, hash element>
924
925It's also worth taking a moment to explain what happens when you
926C<local>ize a member of a composite type (i.e. an array or hash element).
927In this case, the element is C<local>ized I<by name>. This means that
928when the scope of the C<local()> ends, the saved value will be
929restored to the hash element whose key was named in the C<local()>, or
930the array element whose index was named in the C<local()>. If that
931element was deleted while the C<local()> was in effect (e.g. by a
932C<delete()> from a hash or a C<shift()> of an array), it will spring
933back into existence, possibly extending an array and filling in the
934skipped elements with C<undef>. For instance, if you say
935
936 %hash = ( 'This' => 'is', 'a' => 'test' );
937 @ary = ( 0..5 );
938 {
939 local($ary[5]) = 6;
940 local($hash{'a'}) = 'drill';
941 while (my $e = pop(@ary)) {
942 print "$e . . .\n";
943 last unless $e > 3;
944 }
945 if (@ary) {
946 $hash{'only a'} = 'test';
947 delete $hash{'a'};
948 }
949 }
950 print join(' ', map { "$_ $hash{$_}" } sort keys %hash),".\n";
951 print "The array has ",scalar(@ary)," elements: ",
952 join(', ', map { defined $_ ? $_ : 'undef' } @ary),"\n";
953
954Perl will print
955
956 6 . . .
957 4 . . .
958 3 . . .
959 This is a test only a test.
960 The array has 6 elements: 0, 1, 2, undef, undef, 5
961
962The behavior of local() on non-existent members of composite
963types is subject to change in future.
964
965=head3 Localized deletion of elements of composite types
966X<delete> X<local, composite type element> X<local, array element> X<local, hash element>
967
968You can use the C<delete local $array[$idx]> and C<delete local $hash{key}>
969constructs to delete a composite type entry for the current block and restore
970it when it ends. They return the array/hash value before the localization,
971which means that they are respectively equivalent to
972
973 do {
974 my $val = $array[$idx];
975 local $array[$idx];
976 delete $array[$idx];
977 $val
978 }
979
980and
981
982 do {
983 my $val = $hash{key};
984 local $hash{key};
985 delete $hash{key};
986 $val
987 }
988
989except that for those the C<local> is scoped to the C<do> block. Slices are
990also accepted.
991
992 my %hash = (
993 a => [ 7, 8, 9 ],
994 b => 1,
995 );
996
997 {
998 my $a = delete local $hash{a};
999 # $a is [ 7, 8, 9 ]
1000 # %hash is (b => 1)
1001
1002 {
1003 my @nums = delete local @$a[0, 2];
1004 # @nums is (7, 9)
1005 # $a is [ undef, 8 ]
1006
1007 $$a[0] = 999; # will be erased when the scope ends
1008 }
1009 # $a is back to [ 7, 8, 9 ]
1010
1011 }
1012 # %hash is back to its original state
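A shorter runnable sketch of the same idea, using a hypothetical C<%config> hash and a helper that only checks whether the key exists:

```perl
use strict;
use warnings;

our %config = (debug => 1, level => 3);

sub debug_is_set { return exists $config{debug} ? 1 : 0 }

my ($old, $during);
{
    $old    = delete local $config{debug};   # returns the previous value
    $during = debug_is_set();                # callee sees no such key
}
my $after = debug_is_set();                  # entry is back
```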
1013
1014=head2 Lvalue subroutines
1015X<lvalue> X<subroutine, lvalue>
1016
1017It is possible to return a modifiable value from a subroutine.
1018To do this, you have to declare the subroutine to return an lvalue.
1019
1020 my $val;
1021 sub canmod : lvalue {
1022 $val; # or: return $val;
1023 }
1024 sub nomod {
1025 $val;
1026 }
1027
1028 canmod() = 5; # assigns to $val
1029 nomod() = 5; # ERROR
1030
1031The scalar/list context for the subroutine and for the right-hand
1032side of assignment is determined as if the subroutine call is replaced
1033by a scalar. For example, consider:
1034
1035 data(2,3) = get_data(3,4);
1036
1037Both subroutines here are called in a scalar context, while in:
1038
1039 (data(2,3)) = get_data(3,4);
1040
1041and in:
1042
1043 (data(2),data(3)) = get_data(3,4);
1044
1045all the subroutines are called in a list context.
1046
1047Lvalue subroutines are convenient, but you have to keep in mind that,
1048when used with objects, they may violate encapsulation. A normal
1049mutator can check the supplied argument before setting the attribute
1050it is protecting; an lvalue subroutine cannot. If you require any
1051special processing when storing and retrieving the values, consider
1052using the CPAN module Sentinel or something similar.
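The encapsulation concern can be illustrated with a sketch. The C<Thermometer> class and its methods are hypothetical; the point is only that the lvalue accessor has no opportunity to inspect the value being stored, while the ordinary mutator does:

```perl
use strict;
use warnings;

package Thermometer;    # hypothetical class, for illustration only

sub new { return bless { celsius => 20 }, shift }

# Lvalue accessor: the caller assigns directly to the attribute.
sub celsius : lvalue { $_[0]{celsius} }

# Ordinary mutator: the value can be checked before it is stored.
sub set_celsius {
    my ($self, $value) = @_;
    die "below absolute zero\n" if $value < -273.15;
    $self->{celsius} = $value;
    return $self;
}

package main;

my $t = Thermometer->new;
$t->celsius = -500;                          # stored without any check
my $stored   = $t->celsius;
my $accepted = eval { $t->set_celsius(-500); 1 } ? 1 : 0;
my $error    = $@;
```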
1053
1054=head2 Lexical Subroutines
1055X<my sub> X<state sub> X<our sub> X<subroutine, lexical>
1056
1057B<WARNING>: Lexical subroutines are still experimental. The feature may be
1058modified or removed in future versions of Perl.
1059
1060Lexical subroutines are only available under the C<use feature
1061'lexical_subs'> pragma, which produces a warning unless the
1062"experimental::lexical_subs" warnings category is disabled.
1063
1064Beginning with Perl 5.18, you can declare a private subroutine with C<my>
1065or C<state>. As with state variables, the C<state> keyword is only
1066available under C<use feature 'state'> or C<use 5.010> or higher.
1067
1068These subroutines are only visible within the block in which they are
1069declared, and only after that declaration:
1070
1071 no warnings "experimental::lexical_subs";
1072 use feature 'lexical_subs';
1073
1074 foo(); # calls the package/global subroutine
1075 state sub foo {
1076 foo(); # also calls the package subroutine
1077 }
1078 foo(); # calls "state" sub
1079 my $ref = \&foo; # take a reference to "state" sub
1080
1081 my sub bar { ... }
1082 bar(); # calls "my" sub
1083
1084To use a lexical subroutine from inside the subroutine itself, you must
1085predeclare it. The C<sub foo {...}> subroutine definition syntax respects
1086any previous C<my sub;> or C<state sub;> declaration.
1087
1088 my sub baz; # predeclaration
1089 sub baz { # define the "my" sub
1090 baz(); # recursive call
1091 }
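A complete version of that pattern might look like the following sketch, which assumes Perl 5.18 or later; C<factorial> is a hypothetical name:

```perl
use strict;
use warnings;
no warnings "experimental::lexical_subs";
use feature 'lexical_subs';

my sub factorial;               # predeclaration
sub factorial {                 # defines the "my" sub declared above
    my $n = shift;
    return $n <= 1 ? 1 : $n * factorial($n - 1);
}

my $result  = factorial(5);
my $visible = defined(&main::factorial) ? 1 : 0;   # no package sub exists
```

The final check is there to show that the definition went to the lexical subroutine and did not create a package subroutine of the same name.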
1092
1093=head3 C<state sub> vs C<my sub>
1094
1095What is the difference between "state" subs and "my" subs? Each time that
1096execution enters a block in which "my" subs are declared, a new copy of each
1097sub is created. "State" subroutines persist from one execution of the
1098containing block to the next.
1099
1100So, in general, "state" subroutines are faster. But "my" subs are
1101necessary if you want to create closures:
1102
1103 no warnings "experimental::lexical_subs";
1104 use feature 'lexical_subs';
1105
1106 sub whatever {
1107 my $x = shift;
1108 my sub inner {
1109 ... do something with $x ...
1110 }
1111 inner();
1112 }
1113
1114In this example, a new C<$x> is created when C<whatever> is called, and
1115also a new C<inner>, which can see the new C<$x>. A "state" sub will only
1116see the C<$x> from the first call to C<whatever>.
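A runnable sketch of the "my" case, with hypothetical names and again assuming Perl 5.18 or later, shows that each call gets an C<inner> that sees the current variable:

```perl
use strict;
use warnings;
no warnings "experimental::lexical_subs";
use feature 'lexical_subs';

sub greet {
    my $name = shift;
    my sub inner { return "hello, $name" }   # re-created on each call
    return inner();
}

my @greetings = (greet("alice"), greet("bob"));
print "$_\n" for @greetings;
```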
1117
1118=head3 C<our> subroutines
1119
1120Like C<our $variable>, C<our sub> creates a lexical alias to the package
1121subroutine of the same name.
1122
1123The two main uses for this are to switch back to using the package sub
1124inside an inner scope:
1125
1126 no warnings "experimental::lexical_subs";
1127 use feature 'lexical_subs';
1128
1129 sub foo { ... }
1130
1131 sub bar {
1132 my sub foo { ... }
1133 {
1134 # need to use the outer foo here
1135 our sub foo;
1136 foo();
1137 }
1138 }
1139
1140and to make a subroutine visible to other packages in the same scope:
1141
1142 package MySneakyModule;
1143
1144 no warnings "experimental::lexical_subs";
1145 use feature 'lexical_subs';
1146
1147 our sub do_something { ... }
1148
1149 sub do_something_with_caller {
1150 package DB;
1151 () = caller 1; # sets @DB::args
1152 do_something(@args); # uses MySneakyModule::do_something
1153 }
1154
1155=head2 Passing Symbol Table Entries (typeglobs)
1156X<typeglob> X<*>
1157
1158B<WARNING>: The mechanism described in this section was originally
1159the only way to simulate pass-by-reference in older versions of
1160Perl. While it still works fine in modern versions, the new reference
1161mechanism is generally easier to work with. See below.
1162
1163Sometimes you don't want to pass the value of an array to a subroutine
1164but rather the name of it, so that the subroutine can modify the global
1165copy of it rather than working with a local copy. In Perl you can
1166refer to all objects of a particular name by prefixing the name
1167with a star: C<*foo>. This is often known as a "typeglob", because the
1168star on the front can be thought of as a wildcard match for all the
1169funny prefix characters on variables and subroutines and such.
1170
1171When evaluated, the typeglob produces a scalar value that represents
1172all the objects of that name, including any filehandle, format, or
1173subroutine. When assigned to, it causes the name mentioned to refer to
1174whatever C<*> value was assigned to it. Example:
1175
1176 sub doubleary {
1177 local(*someary) = @_;
1178 foreach $elem (@someary) {
1179 $elem *= 2;
1180 }
1181 }
1182 doubleary(*foo);
1183 doubleary(*bar);
1184
1185Scalars are already passed by reference, so you can modify
1186scalar arguments without using this mechanism by referring explicitly
1187to C<$_[0]> etc. You can modify all the elements of an array by passing
1188all the elements as scalars, but you have to use the C<*> mechanism (or
1189the equivalent reference mechanism) to C<push>, C<pop>, or change the size of
1190an array. It will certainly be faster to pass the typeglob (or reference).
1191
1192Even if you don't want to modify an array, this mechanism is useful for
1193passing multiple arrays in a single LIST, because normally the LIST
1194mechanism will merge all the array values so that you can't extract out
1195the individual arrays. For more on typeglobs, see
1196L<perldata/"Typeglobs and Filehandles">.
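Under C<use strict>, the typeglob technique from this section could be sketched as below. The subroutine name C<double_and_grow> is hypothetical, and the arrays are declared with C<our> because typeglobs only reach package variables:

```perl
use strict;
use warnings;

our (@someary, @foo, @bar);     # package arrays; typeglobs need globals
@foo = (1, 2, 3);
@bar = (10, 20);

sub double_and_grow {
    local (*someary) = @_;      # @someary now aliases the caller's array
    $_ *= 2 for @someary;       # modify the elements in place
    push @someary, 0;           # changing the size works too
    return;
}

double_and_grow(*foo);
double_and_grow(*bar);

print "@foo\n@bar\n";
```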
1197
1198=head2 When to Still Use local()
1199X<local> X<variable, local>
1200
1201Despite the existence of C<my>, there are three places where the
1202C<local> operator still shines. In fact, in these three places, you
1203I<must> use C<local> instead of C<my>.
1204
1205=over 4
1206
1207=item 1.
1208
1209You need to give a global variable a temporary value, especially $_.
1210
1211The global variables, like C<@ARGV> or the punctuation variables, must be
1212C<local>ized with C<local()>. This block reads in F</etc/motd>, and splits
1213it up into chunks separated by lines of equal signs, which are placed
1214in C<@Fields>.
1215
1216 {
1217 local @ARGV = ("/etc/motd");
1218 local $/ = undef;
1219 local $_ = <>;
1220 @Fields = split /^\s*=+\s*$/;
1221 }
1222
1223In particular, it's important to C<local>ize $_ in any routine that assigns
1224to it. Look out for implicit assignments in C<while> conditionals.
1225
1226=item 2.
1227
1228You need to create a local file or directory handle or a local function.
1229
1230A function that needs a filehandle of its own must use
1231C<local()> on a complete typeglob. This can be used to create new symbol
1232table entries:
1233
1234 sub ioqueue {
1235 local (*READER, *WRITER); # not my!
1236 pipe (READER, WRITER) or die "pipe: $!";
1237 return (*READER, *WRITER);
1238 }
1239 ($head, $tail) = ioqueue();
1240
1241See the Symbol module for a way to create anonymous symbol table
1242entries.
1243
1244Because assignment of a reference to a typeglob creates an alias, this
1245can be used to create what is effectively a local function, or at least,
1246a local alias.
1247
1248 {
1249 local *grow = \&shrink; # only until this block exits
1250 grow(); # really calls shrink()
1251 move(); # if move() grow()s, it shrink()s too
1252 }
1253 grow(); # get the real grow() again
1254
1255See L<perlref/"Function Templates"> for more about manipulating
1256functions by name in this way.
1257
1258=item 3.
1259
1260You want to temporarily change just one element of an array or hash.
1261
1262You can C<local>ize just one element of an aggregate. Usually this
1263is done on dynamics:
1264
1265 {
1266 local $SIG{INT} = 'IGNORE';
1267 funct(); # uninterruptible
1268 }
1269 # interruptibility automatically restored here
1270
1271But it also works on lexically declared aggregates.
1272
1273=back
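As a sketch of the first case above, consider two hypothetical line counters that differ only in whether they localize C<$_>. Both use a C<while> loop that assigns to C<$_> implicitly, and both are called from a C<for> loop in which C<$_> is aliased to an array element:

```perl
use strict;
use warnings;

my $text = "x\ny\n";

sub count_lines_unsafe {
    open my $fh, '<', \$text or die "open: $!";
    my $n = 0;
    while (<$fh>) { $n++ }      # implicit assignment to the caller's $_
    return $n;
}

sub count_lines_safe {
    local $_;                   # callers keep their own $_
    open my $fh, '<', \$text or die "open: $!";
    my $n = 0;
    while (<$fh>) { $n++ }
    return $n;
}

my @clobbered = ("keep");
count_lines_unsafe() for @clobbered;    # $_ is aliased to $clobbered[0]

my @intact = ("keep");
count_lines_safe() for @intact;
```

After the first loop the array element has been overwritten by the reads in the callee; after the second it is untouched.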
1274
1275=head2 Pass by Reference
1276X<pass by reference> X<pass-by-reference> X<reference>
1277
1278If you want to pass more than one array or hash into a function--or
1279return them from it--and have them maintain their integrity, then
1280you're going to have to use an explicit pass-by-reference. Before you
1281do that, you need to understand references as detailed in L<perlref>.
1282This section may not make much sense to you otherwise.
1283
1284Here are a few simple examples. First, let's pass in several arrays
1285to a function and have it C<pop> all of them, returning a new list
1286of all their former last elements:
1287
1288 @tailings = popmany ( \@a, \@b, \@c, \@d );
1289
1290 sub popmany {
1291 my $aref;
1292 my @retlist = ();
1293 foreach $aref ( @_ ) {
1294 push @retlist, pop @$aref;
1295 }
1296 return @retlist;
1297 }
1298
1299Here's how you might write a function that returns a
1300list of keys occurring in all the hashes passed to it:
1301
1302 @common = inter( \%foo, \%bar, \%joe );
1303 sub inter {
1304 my ($k, $href, %seen); # locals
1305 foreach $href (@_) {
1306 while ( $k = each %$href ) {
1307 $seen{$k}++;
1308 }
1309 }
1310 return grep { $seen{$_} == @_ } keys %seen;
1311 }
1312
1313So far, we're using just the normal list return mechanism.
1314What happens if you want to pass or return a hash? Well,
1315if you're using only one of them, or you don't mind them
1316concatenating, then the normal calling convention is ok, although
1317a little expensive.
1318
1319Where people get into trouble is here:
1320
1321 (@a, @b) = func(@c, @d);
1322or
1323 (%a, %b) = func(%c, %d);
1324
1325That syntax simply won't work. It sets just C<@a> or C<%a> and
1326clears the C<@b> or C<%b>. Plus the function didn't get passed
1327two separate arrays or hashes: it got one long list in C<@_>,
1328as always.
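The flattening on both sides can be seen in a short sketch; C<count_args> is a hypothetical helper that just reports how many arguments arrived:

```perl
use strict;
use warnings;

sub count_args { return scalar @_ }

my @c = (1, 2);
my @d = (3, 4, 5);

my $flat  = count_args(@c, @d);      # one flattened list
my $byref = count_args(\@c, \@d);    # one reference per array

my (@a, @b) = (@c, @d);              # @a takes everything, @b stays empty
my @sizes = (scalar @a, scalar @b);

print "flat=$flat byref=$byref sizes=@sizes\n";
```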
1329
1330If you can arrange for everyone to deal with this through references, it's
1331cleaner code, although not so nice to look at. Here's a function that
1332takes two array references as arguments, returning the two references
1333ordered by how many elements their arrays hold, larger first:
1334
1335 ($aref, $bref) = func(\@c, \@d);
1336 print "@$aref has more than @$bref\n";
1337 sub func {
1338 my ($cref, $dref) = @_;
1339 if (@$cref > @$dref) {
1340 return ($cref, $dref);
1341 } else {
1342 return ($dref, $cref);
1343 }
1344 }
1345
1346It turns out that you can actually do this also:
1347
1348 (*a, *b) = func(\@c, \@d);
1349 print "@a has more than @b\n";
1350 sub func {
1351 local (*c, *d) = @_;
1352 if (@c > @d) {
1353 return (\@c, \@d);
1354 } else {
1355 return (\@d, \@c);
1356 }
1357 }
1358
1359Here we're using the typeglobs to do symbol table aliasing. It's
1360a tad subtle, though, and also won't work if you're using C<my>
1361variables, because only globals (even in disguise as C<local>s)
1362are in the symbol table.
1363
1364If you're passing around filehandles, you could usually just use the bare
1365typeglob, like C<*STDOUT>, but typeglob references work, too.
1366For example:
1367
1368 splutter(\*STDOUT);
1369 sub splutter {
1370 my $fh = shift;
1371 print $fh "her um well a hmmm\n";
1372 }
1373
1374 $rec = get_rec(\*STDIN);
1375 sub get_rec {
1376 my $fh = shift;
1377 return scalar <$fh>;
1378 }
1379
1380If you're planning on generating new filehandles, you could do this.
1381Note that it passes back just the bare *FH, not a reference to it.
1382
1383 sub openit {
1384 my $path = shift;
1385 local *FH;
1386 return open (FH, $path) ? *FH : undef;
1387 }
1388
1389=head2 Prototypes
1390X<prototype> X<subroutine, prototype>
1391
1392Perl supports a very limited kind of compile-time argument checking
1393using function prototyping. This can be declared in either the PROTO
1394section or with a L<prototype attribute|attributes/Built-in Attributes>.
1395If you declare either of
1396
1397 sub mypush (+@)
1398 sub mypush :prototype(+@)
1399
1400then C<mypush()> takes arguments exactly like C<push()> does.
1401
1402If subroutine signatures are enabled (see L</Signatures>), then
1403the shorter PROTO syntax is unavailable, because it would clash with
1404signatures. In that case, a prototype can only be declared in the form
1405of an attribute.
1406
1407The
1408function declaration must be visible at compile time. The prototype
1409affects only interpretation of new-style calls to the function,
1410where new-style is defined as not using the C<&> character. In
1411other words, if you call it like a built-in function, then it behaves
1412like a built-in function. If you call it like an old-fashioned
1413subroutine, then it behaves like an old-fashioned subroutine. It
1414naturally falls out from this rule that prototypes have no influence
1415on subroutine references like C<\&foo> or on indirect subroutine
1416calls like C<&{$subref}> or C<< $subref->() >>.
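The following sketch, with a hypothetical C<show_args>, contrasts a new-style call with the two ways of calling that ignore the prototype:

```perl
use strict;
use warnings;

sub show_args ($) { return "@_" }

my @list = (10, 20, 30);

my $new_style = show_args(@list);         # prototype applies: scalar(@list)
my $old_style = &show_args(@list);        # & bypasses the prototype
my $via_ref   = (\&show_args)->(@list);   # references ignore it as well

print "$new_style | $old_style | $via_ref\n";
```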
1417
1418Method calls are not influenced by prototypes either, because the
1419function to be called is indeterminate at compile time, since
1420the exact code called depends on inheritance.
1421
1422Because the intent of this feature is primarily to let you define
1423subroutines that work like built-in functions, here are prototypes
1424for some other functions that parse almost exactly like the
1425corresponding built-in.
1426
1427 Declared as Called as
1428
1429 sub mylink ($$) mylink $old, $new
1430 sub myvec ($$$) myvec $var, $offset, 1
1431 sub myindex ($$;$) myindex &getstring, "substr"
1432 sub mysyswrite ($$$;$) mysyswrite $buf, 0, length($buf) - $off, $off
1433 sub myreverse (@) myreverse $a, $b, $c
1434 sub myjoin ($@) myjoin ":", $a, $b, $c
1435 sub mypop (+) mypop @array
1436 sub mysplice (+$$@) mysplice @array, 0, 2, @pushme
1437 sub mykeys (+) mykeys %{$hashref}
1438 sub myopen (*;$) myopen HANDLE, $name
1439 sub mypipe (**) mypipe READHANDLE, WRITEHANDLE
1440 sub mygrep (&@) mygrep { /foo/ } $a, $b, $c
1441 sub myrand (;$) myrand 42
1442 sub mytime () mytime
1443
1444Any backslashed prototype character represents an actual argument
1445that must start with that character (optionally preceded by C<my>,
1446C<our> or C<local>), with the exception of C<$>, which will
1447accept any scalar lvalue expression, such as C<$foo = 7> or
1448C<< my_function()->[0] >>. The value passed as part of C<@_> will be a
1449reference to the actual argument given in the subroutine call,
1450obtained by applying C<\> to that argument.
1451
1452You can use the C<\[]> backslash group notation to specify more than one
1453allowed argument type. For example:
1454
1455 sub myref (\[$@%&*])
1456
1457will allow calling myref() as
1458
1459 myref $var
1460 myref @array
1461 myref %hash
1462 myref &sub
1463 myref *glob
1464
1465and the first argument of myref() will be a reference to
1466a scalar, an array, a hash, a subroutine, or a glob.
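A reduced sketch of the backslash group, limited to three types and using a hypothetical function C<kind>, reports what kind of reference arrived in C<@_>:

```perl
use strict;
use warnings;

sub kind (\[$@%]) { return ref $_[0] }

my ($scalar, @array, %hash);

my @kinds = (kind($scalar), kind(@array), kind(%hash));
print "@kinds\n";
```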
1467
1468Unbackslashed prototype characters have special meanings. Any
1469unbackslashed C<@> or C<%> eats all remaining arguments, and forces
1470list context. An argument represented by C<$> forces scalar context. An
1471C<&> requires an anonymous subroutine, which, if passed as the first
1472argument, does not require the C<sub> keyword or a subsequent comma.
1473
1474A C<*> allows the subroutine to accept a bareword, constant, scalar expression,
1475typeglob, or a reference to a typeglob in that slot. The value will be
1476available to the subroutine either as a simple scalar, or (in the latter
1477two cases) as a reference to the typeglob. If you wish to always convert
1478such arguments to a typeglob reference, use Symbol::qualify_to_ref() as
1479follows:
1480
1481 use Symbol 'qualify_to_ref';
1482
1483 sub foo (*) {
1484 my $fh = qualify_to_ref(shift, caller);
1485 ...
1486 }
1487
1488The C<+> prototype is a special alternative to C<$> that will act like
1489C<\[@%]> when given a literal array or hash variable, but will otherwise
1490force scalar context on the argument. This is useful for functions which
1491should accept either a literal array or an array reference as the argument:
1492
1493 sub mypush (+@) {
1494 my $aref = shift;
1495 die "Not an array or arrayref" unless ref $aref eq 'ARRAY';
1496 push @$aref, @_;
1497 }
1498
1499When using the C<+> prototype, your function must check that the argument
1500is of an acceptable type.
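Here is how the C<mypush> above behaves with the three kinds of first argument it can receive. This is a sketch, assuming Perl 5.14 or later for the C<+> prototype; the return value has been added for convenience:

```perl
use strict;
use warnings;

sub mypush (+@) {
    my $aref = shift;
    die "Not an array or arrayref\n" unless ref $aref eq 'ARRAY';
    push @$aref, @_;
    return scalar @$aref;
}

my @list = (1);
my $ref  = [1];
my %hash;

mypush @list, 2, 3;     # literal array: passed as \@list
mypush $ref,  2, 3;     # array reference: passed through unchanged

# A literal hash is passed as \%hash, so the type check must reject it.
my $rejected = eval { mypush %hash, 2, 3; 1 } ? 0 : 1;
```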
1501
1502A semicolon (C<;>) separates mandatory arguments from optional arguments.
1503It is redundant before C<@> or C<%>, which gobble up everything else.
1504
1505As the last character of a prototype, or just before a semicolon, a C<@>
1506or a C<%>, you can use C<_> in place of C<$>: if this argument is not
1507provided, C<$_> will be used instead.
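A minimal sketch of the C<_> prototype, using a hypothetical function C<shout>:

```perl
use strict;
use warnings;

sub shout (_) { return uc $_[0] }

my $explicit = shout("quiet");     # argument given
my $implicit;
for ("whisper") {
    $implicit = shout();           # argument omitted: $_ is used
}

print "$explicit $implicit\n";
```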
1508
1509Note how the last three examples in the table above are treated
1510specially by the parser. C<mygrep()> is parsed as a true list
1511operator, C<myrand()> is parsed as a true unary operator with unary
1512precedence the same as C<rand()>, and C<mytime()> is truly without
1513arguments, just like C<time()>. That is, if you say
1514
1515 mytime +2;
1516
1517you'll get C<mytime() + 2>, not C<mytime(2)>, which is how it would be parsed
1518without a prototype. If you want to force a unary function to have the
1519same precedence as a list operator, add C<;> to the end of the prototype:
1520
1521 sub mygetprotobynumber($;);
1522 mygetprotobynumber $a > $b; # parsed as mygetprotobynumber($a > $b)
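The C<mytime +2> case can be checked directly. In this sketch C<mytime> returns a fixed number and C<anyargs> is a hypothetical unprototyped function that counts its arguments:

```perl
use strict;
use warnings;

sub mytime () { 100 }                  # no arguments, like time()
sub anyargs   { return scalar @_ }     # no prototype: a list operator

my $with_proto    = mytime +2;     # parsed as mytime() + 2
my $without_proto = anyargs +2;    # parsed as anyargs(+2)

print "$with_proto $without_proto\n";
```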
1523
1524The interesting thing about C<&> is that you can generate new syntax with it,
1525provided it's in the initial position:
1526X<&>
1527
1528 sub try (&@) {
1529 my($try,$catch) = @_;
1530 eval { &$try };
1531 if ($@) {
1532 local $_ = $@;
1533 &$catch;
1534 }
1535 }
1536 sub catch (&) { $_[0] }
1537
1538 try {
1539 die "phooey";
1540 } catch {
1541 /phooey/ and print "unphooey\n";
1542 };
1543
1544That prints C<"unphooey">. (Yes, there are still unresolved
1545issues having to do with visibility of C<@_>. I'm ignoring that
1546question for the moment. (But note that if we make C<@_> lexically
1547scoped, those anonymous subroutines can act like closures... (Gee,
1548is this sounding a little Lispish? (Never mind.))))
1549
1550And here's a reimplementation of the Perl C<grep> operator:
1551X<grep>
1552
1553 sub mygrep (&@) {
1554 my $code = shift;
1555 my @result;
1556 foreach $_ (@_) {
1557 push(@result, $_) if &$code;
1558 }
1559 @result;
1560 }
1561
1562Some folks would prefer full alphanumeric prototypes. Alphanumerics have
1563been intentionally left out of prototypes for the express purpose of
1564someday in the future adding named, formal parameters. The current
1565mechanism's main goal is to let module writers provide better diagnostics
1566for module users. Larry feels the notation quite understandable to Perl
1567programmers, and that it will not intrude greatly upon the meat of the
1568module, nor make it harder to read. The line noise is visually
1569encapsulated into a small pill that's easy to swallow.
1570
1571If you try to use an alphanumeric sequence in a prototype you will
1572generate an optional warning - "Illegal character in prototype...".
1573Unfortunately earlier versions of Perl allowed the prototype to be
1574used as long as its prefix was a valid prototype. The warning may be
1575upgraded to a fatal error in a future version of Perl once the
1576majority of offending code is fixed.
1577
1578It's probably best to prototype new functions, not retrofit prototyping
1579into older ones. That's because you must be especially careful about
1580silent impositions of differing list versus scalar contexts. For example,
1581if you decide that a function should take just one parameter, like this:
1582
1583 sub func ($) {
1584 my $n = shift;
1585 print "you gave me $n\n";
1586 }
1587
1588and someone has been calling it with an array or expression
1589returning a list:
1590
1591 func(@foo);
1592 func( split /:/ );
1593
1594Then you've just supplied an automatic C<scalar> in front of their
1595argument, which can be more than a bit surprising. The old C<@foo>
1596which used to hold one thing doesn't get passed in. Instead,
1597C<func()> now gets passed in a C<1>; that is, the number of elements
1598in C<@foo>. And the C<split> gets called in scalar context so it
1599starts scribbling on your C<@_> parameter list. Ouch!
1600
1601If a sub has both a PROTO and a BLOCK, the prototype is not applied
1602until after the BLOCK is completely defined. This means that a recursive
1603function with a prototype has to be predeclared for the prototype to take
1604effect, like so:
1605
1606 sub foo($$);
1607 sub foo($$) {
1608 foo 1, 2;
1609 }
1610
1611This is all very powerful, of course, and should be used only in moderation
1612to make the world a better place.
1613
1614=head2 Constant Functions
1615X<constant>
1616
1617Functions with a prototype of C<()> are potential candidates for
1618inlining. If the result after optimization and constant folding
1619is either a constant or a lexically-scoped scalar which has no other
1620references, then it will be used in place of function calls made
1621without C<&>. Calls made using C<&> are never inlined. (See
1622F<constant.pm> for an easy way to declare most constants.)
1623
1624The following functions would all be inlined:
1625
1626 sub pi () { 3.14159 } # Not exact, but close.
1627 sub PI () { 4 * atan2 1, 1 } # As good as it gets,
1628 # and it's inlined, too!
1629 sub ST_DEV () { 0 }
1630 sub ST_INO () { 1 }
1631
1632 sub FLAG_FOO () { 1 << 8 }
1633 sub FLAG_BAR () { 1 << 9 }
1634 sub FLAG_MASK () { FLAG_FOO | FLAG_BAR }
1635
1636 sub OPT_BAZ () { not (0x1B58 & FLAG_MASK) }
1637
1638 sub N () { int(OPT_BAZ) / 3 }
1639
1640 sub FOO_SET () { 1 if FLAG_MASK & FLAG_FOO }
1641
Be aware that these will not be inlined. Because they contain inner
scopes, constant folding doesn't reduce them to a single constant:

    sub foo_set () { if (FLAG_MASK & FLAG_FOO) { 1 } }

    sub baz_val () {
        if (OPT_BAZ) {
            return 23;
        }
        else {
            return 42;
        }
    }

If you redefine a subroutine that was eligible for inlining, you'll get
a warning by default. (You can use this warning to tell whether or not a
particular subroutine is considered inlinable.) The warning is
considered severe enough not to be affected by the B<-w>
switch (or its absence) because previously compiled
invocations of the function will still be using the old value of the
function. If you need to be able to redefine the subroutine, you need to
ensure that it isn't inlined, either by dropping the C<()> prototype
(which changes calling semantics, so beware) or by thwarting the
inlining mechanism in some other way, such as

    sub not_inlined () {
        23 if $];
    }

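The effect of redefining an inlinable subroutine can be observed
directly. The following is a sketch, not a recommended practice: it
redefines such a sub at run time with a string C<eval> and captures
the resulting warning. The names C<ANSWER> and C<old_value> are made
up for the illustration.

```perl
use strict;
use warnings;

# Collect warnings instead of letting them go to STDERR.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

sub ANSWER () { 42 }

# ANSWER is inlined here, at the time this sub is compiled.
sub old_value { return ANSWER }

# Redefine ANSWER at run time; this is what triggers the warning.
eval q{ sub ANSWER () { 43 } 1 } or die $@;

print "already compiled call: ", old_value(), "\n";   # still the old value
print "call made with &:      ", &ANSWER(),   "\n";   # the new definition
print "warning: $_" for @warnings;
```
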
=head2 Overriding Built-in Functions
X<built-in> X<override> X<CORE> X<CORE::GLOBAL>

Many built-in functions may be overridden, though this should be tried
only occasionally and for good reason. Typically this might be
done by a package attempting to emulate missing built-in functionality
on a non-Unix system.

Overriding may be done only by importing the name from a module at
compile time--ordinary predeclaration isn't good enough. However, the
C<use subs> pragma lets you, in effect, predeclare subs
via the import syntax, and these names may then override built-in ones:

    use subs 'chdir', 'chroot', 'chmod', 'chown';
    chdir $somewhere;
    sub chdir { ... }

To unambiguously refer to the built-in form, precede the
built-in name with the special package qualifier C<CORE::>. For example,
saying C<CORE::open()> always refers to the built-in C<open()>, even
if the current package has imported some other subroutine called
C<&open()> from elsewhere. Even though it looks like a regular
function call, it isn't: the C<CORE::> prefix in that case is part of
Perl's syntax, and works for any keyword, regardless of what is in the
C<CORE> package. Taking a reference to it, that is, C<\&CORE::open>,
only works for some keywords. See L<CORE>.

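To make the two mechanisms above concrete, here is a small sketch that
overrides C<time> in the current package with C<use subs> and still
reaches the original through C<CORE::>. Choosing C<time> and returning
a fixed value are assumptions made only so the result is easy to see.

```perl
use strict;
use warnings;

# Predeclare time() via the import mechanism so that it overrides
# the built-in within this package.
use subs 'time';

# Hypothetical replacement: a frozen clock, handy in tests.
sub time { return 1_000_000_000 }

my $frozen = time();         # calls the sub defined above
my $real   = CORE::time();   # always the built-in

print "frozen: $frozen\n";
print "real:   $real\n";
```
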
Library modules should not in general export built-in names like C<open>
or C<chdir> as part of their default C<@EXPORT> list, because these may
sneak into someone else's namespace and change the semantics unexpectedly.
Instead, if the module adds that name to C<@EXPORT_OK>, then it's
possible for a user to import the name explicitly, but not implicitly.
That is, they could say

    use Module 'open';

and it would import the C<open> override. But if they said

    use Module;

they would get the default imports without overrides.

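A single-file sketch of this convention follows. The module name
C<My::Sleep> and its functions are hypothetical, and the module is
defined inline in a C<BEGIN> block only so that the example is
self-contained. It offers a recording replacement for C<sleep> through
C<@EXPORT_OK>, so only packages that ask for it by name get the override.

```perl
use strict;
use warnings;

BEGIN {
    package My::Sleep;                 # hypothetical module
    use Exporter 'import';
    our @EXPORT    = ('naps');         # harmless default export
    our @EXPORT_OK = ('sleep');        # the override must be requested

    our @requested;
    sub naps  { return scalar @requested }
    sub sleep { push @requested, $_[0]; return $_[0] }   # records, never waits

    $INC{'My/Sleep.pm'} = __FILE__;    # pretend the file was loaded
}

package Plain;
use My::Sleep;                         # default imports only
my $plain = sleep 0;                   # still the built-in sleep

package Mocked;
use My::Sleep 'sleep';                 # explicit request for the override
my $mocked = sleep 30;                 # recorded, returns immediately

package main;
print "requests recorded: ", My::Sleep::naps(), "\n";   # prints 1
```
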
The foregoing mechanism for overriding built-ins is restricted, quite
deliberately, to the package that requests the import. There is a second
method that is sometimes applicable when you wish to override a built-in
everywhere, without regard to namespace boundaries. This is achieved by
importing a sub into the special namespace C<CORE::GLOBAL::>. Here is an
example that quite brazenly replaces the C<glob> operator with something
that understands regular expressions.

    package REGlob;
    require Exporter;
    @ISA = 'Exporter';
    @EXPORT_OK = 'glob';

    sub import {
        my $pkg = shift;
        return unless @_;
        my $sym = shift;
        my $where = ($sym =~ s/^GLOBAL_// ? 'CORE::GLOBAL' : caller(0));
        $pkg->export($where, $sym, @_);
    }

    sub glob {
        my $pat = shift;
        my @got;
        if (opendir my $d, '.') {
            @got = grep /$pat/, readdir $d;
            closedir $d;
        }
        return @got;
    }
    1;

And here's how it could be (ab)used:

    #use REGlob 'GLOBAL_glob';      # override glob() in ALL namespaces
    package Foo;
    use REGlob 'glob';              # override glob() in Foo:: only
    print for <^[a-z_]+\.pm\$>;     # show all pragmatic modules

The initial comment shows a contrived, even dangerous example.
By overriding C<glob> globally, you would be forcing the new (and
subversive) behavior for the C<glob> operator for I<every> namespace,
without the complete cognizance or cooperation of the modules that own
those namespaces. Naturally, this should be done with extreme caution--if
it must be done at all.

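A smaller global override than C<REGlob> may make the mechanism easier
to see. The sketch below installs a replacement for C<time> in
C<CORE::GLOBAL::> from a C<BEGIN> block, so that it is in place before
any later code is compiled. The variable C<$fake_now> and the package
C<Some::Library> are illustrative assumptions, not part of any real
module.

```perl
use strict;
use warnings;

our $fake_now;    # when defined, every time() call returns this value

BEGIN {
    no warnings 'once';
    # Must happen at compile time: calls compiled earlier are unaffected.
    *CORE::GLOBAL::time = sub () {
        return defined $fake_now ? $fake_now : CORE::time();
    };
}

package Some::Library;            # hypothetical, unaware of the override
sub stamp { return time() }

package main;

$fake_now = 1_234_567_890;
print Some::Library::stamp(), "\n";   # prints 1234567890

undef $fake_now;                      # back to the real clock
print Some::Library::stamp(), "\n";
```
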
The C<REGlob> example above does not implement all the support needed to
cleanly override Perl's C<glob> operator. The built-in C<glob> has
different behaviors depending on whether it appears in a scalar or list
context, but our C<REGlob> doesn't. Indeed, many Perl built-ins have such
context-sensitive behaviors, and these must be adequately supported by
a properly written override. For a fully functional example of overriding
C<glob>, study the implementation of C<File::DosGlob> in the standard
library.

When you override a built-in, your replacement should be consistent (if
possible) with the built-in native syntax. You can achieve this by using
a suitable prototype. To get the prototype of an overridable built-in,
use the C<prototype> function with an argument of C<"CORE::builtin_name">
(see L<perlfunc/prototype>).

Note however that some built-ins can't have their syntax expressed by a
prototype (such as C<system> or C<chomp>). If you override them you won't
be able to fully mimic their original syntax.

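A quick way to see what prototype a replacement would need is to ask
for it. This sketch queries a few built-ins; the choice of names is
arbitrary, and because the exact prototype strings may differ between
Perl versions, none are listed here.

```perl
use strict;
use warnings;

for my $name (qw(open push sleep system)) {
    my $proto = prototype("CORE::$name");
    if (defined $proto) {
        print "$name has prototype ($proto)\n";
    }
    else {
        # prototype() returns undef when the syntax cannot be expressed
        print "$name cannot be described by a prototype\n";
    }
}
```
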
The built-ins C<do>, C<require> and C<glob> can also be overridden, but due
to special magic, their original syntax is preserved, and you don't have
to define a prototype for their replacements. (You can't override the
C<do BLOCK> syntax, though.)

C<require> has special additional dark magic: if you invoke your
C<require> replacement as C<require Foo::Bar>, it will actually receive
the argument C<"Foo/Bar.pm"> in C<@_>. See L<perlfunc/require>.

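The path conversion can be observed with a deliberately trivial
override. This is only a sketch: the replacement records its argument
and loads nothing, which is an assumption made to keep the example
self-contained, and C<@seen> is a name invented for it.

```perl
use strict;
use warnings;

our @seen;

BEGIN {
    no warnings 'once';
    # Hypothetical replacement: record the request and load nothing.
    *CORE::GLOBAL::require = sub { push @seen, $_[0]; return 1 };
}

require Foo::Bar;     # no such module is needed; the override handles it

print "$seen[0]\n";   # prints Foo/Bar.pm
```
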
And, as you'll have noticed from the previous example, if you override
C<glob>, the C<< <*> >> glob operator is overridden as well.

In a similar fashion, overriding the C<readline> function also overrides
the equivalent I/O operator C<< <FILEHANDLE> >>. Likewise, overriding
C<readpipe> overrides the operators C<``> and C<qx//>.

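As a sketch of the last point, the following installs a global
C<readpipe> replacement that reports the command instead of running it,
and then uses both operator forms. The replacement's behavior is
invented for the example.

```perl
use strict;
use warnings;

BEGIN {
    no warnings 'once';
    # Hypothetical replacement: never runs anything.
    *CORE::GLOBAL::readpipe = sub { return "would run: $_[0]" };
}

my $dir = '/tmp';
print `ls $dir`, "\n";        # prints "would run: ls /tmp"
print qx{echo hello}, "\n";   # prints "would run: echo hello"
```
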
Finally, some built-ins (e.g. C<exists> or C<grep>) can't be overridden.

=head2 Autoloading
X<autoloading> X<AUTOLOAD>

If you call a subroutine that is undefined, you would ordinarily
get an immediate, fatal error complaining that the subroutine doesn't
exist. (Likewise for subroutines being used as methods, when the
method doesn't exist in any base class of the class's package.)
However, if an C<AUTOLOAD> subroutine is defined in the package or
packages used to locate the original subroutine, then that
C<AUTOLOAD> subroutine is called with the arguments that would have
been passed to the original subroutine. The fully qualified name
of the original subroutine magically appears in the global C<$AUTOLOAD>
variable of the same package as the C<AUTOLOAD> routine. The name
is not passed as an ordinary argument because, er, well, just
because, that's why. (As an exception, a method call to a nonexistent
C<import> or C<unimport> method is just skipped instead. Also, if
the C<AUTOLOAD> subroutine is an XSUB, there are other ways to retrieve
the subroutine name. See L<perlguts/Autoloading with XSUBs> for details.)

Many C<AUTOLOAD> routines load in a definition for the requested
subroutine using eval(), then execute that subroutine using a special
form of goto() that erases the stack frame of the C<AUTOLOAD> routine
without a trace. (See the source to the standard module documented
in L<AutoLoader>, for example.) But an C<AUTOLOAD> routine can
also just emulate the routine and never define it. For example,
let's pretend that a function that wasn't defined should just invoke
C<system> with those arguments. All you'd do is:

    sub AUTOLOAD {
        my $program = $AUTOLOAD;
        $program =~ s/.*:://;
        system($program, @_);
    }
    date();
    who('am', 'i');
    ls('-l');

In fact, if you predeclare functions you want to call that way, you don't
even need parentheses:

    use subs qw(date who ls);
    date;
    who "am", "i";
    ls '-l';

A more complete example of this is the Shell module on CPAN, which
can treat undefined subroutine calls as calls to external programs.

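The other style mentioned earlier, in which C<AUTOLOAD> installs a
definition and then jumps to it with C<goto>, can be sketched as
follows. This version builds the subroutine as a closure rather than
with eval(), and the class C<Record> and its field names are
hypothetical.

```perl
use strict;
use warnings;

package Record;                       # hypothetical class

our $AUTOLOAD;
my %is_field = map { $_ => 1 } qw(name size);   # accessors we allow

sub new {
    my ($class, %args) = @_;
    return bless {%args}, $class;
}

sub AUTOLOAD {
    my $name = $AUTOLOAD;
    $name =~ s/.*:://;
    return if $name eq 'DESTROY';     # never autoload the destructor
    die "No method $name" unless $is_field{$name};

    no strict 'refs';
    # Install a real sub so AUTOLOAD is not consulted again for it...
    *{$AUTOLOAD} = sub { return $_[0]{$name} };
    # ...then jump to it, replacing this stack frame.
    goto &$AUTOLOAD;
}

package main;

my $rec = Record->new(name => 'widget', size => 3);
print $rec->name, "\n";    # prints widget (first call goes via AUTOLOAD)
print $rec->size, "\n";    # prints 3
```

After the first call to each accessor, C<< Record->can('name') >> is
true, so later calls bypass C<AUTOLOAD> entirely.
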
Mechanisms are available to help module writers split their modules
into autoloadable files. See the standard AutoLoader module
described in L<AutoLoader> and in L<AutoSplit>, the standard
SelfLoader module in L<SelfLoader>, and the document on adding C
functions to Perl code in L<perlxs>.

=head2 Subroutine Attributes
X<attribute> X<subroutine, attribute> X<attrs>

A subroutine declaration or definition may have a list of attributes
associated with it. If such an attribute list is present, it is
broken up at space or colon boundaries and treated as though a
C<use attributes> had been seen. See L<attributes> for details
about what attributes are currently supported.
Unlike the limitation with the obsolescent C<use attrs>, the
C<sub : ATTRLIST> syntax works to associate the attributes with
a pre-declaration, and not just with a subroutine definition.

The attributes must be valid as simple identifier names (without any
punctuation other than the '_' character). They may have a parameter
list appended, which is only checked for whether its parentheses ('(',')')
nest properly.

Examples of valid syntax (even though the attributes are unknown):

    sub fnord (&\%) : switch(10,foo(7,3)) : expensive;
    sub plugh () : Ugly('\(") :Bad;
    sub xyzzy : _5x5 { ... }

Examples of invalid syntax:

    sub fnord : switch(10,foo();    # ()-string not balanced
    sub snoid : Ugly('(');          # ()-string not balanced
    sub xyzzy : 5x5;                # "5x5" not a valid identifier
    sub plugh : Y2::north;          # "Y2::north" not a simple identifier
    sub snurt : foo + bar;          # "+" not a colon or space

The attribute list is passed as a list of constant strings to the code
which associates them with the subroutine. In particular, the second example
of valid syntax above currently looks like this in terms of how it's
parsed and invoked:

    use attributes __PACKAGE__, \&plugh, q[Ugly('\(")], 'Bad';

For further details on attribute lists and their manipulation,
see L<attributes> and L<Attribute::Handlers>.

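To show the strings arriving at "the code which associates them with
the subroutine", here is a minimal sketch using the
C<MODIFY_CODE_ATTRIBUTES> hook described in L<attributes>. The package
C<Tagged>, the attribute names, and the C<@seen> array are all invented
for the illustration.

```perl
use strict;
use warnings;

package Tagged;                        # hypothetical package

our @seen;    # one [ coderef, attribute strings... ] entry per sub

# Called at compile time for each sub in this package that carries
# attributes the core does not recognize.
sub MODIFY_CODE_ATTRIBUTES {
    my ($package, $coderef, @attrs) = @_;
    push @seen, [ $coderef, @attrs ];
    return;                            # empty list: nothing rejected
}

sub fetch : Expensive Cached(60) { return 42 }

package main;

for my $entry (@Tagged::seen) {
    my ($coderef, @attrs) = @$entry;
    print join(', ', @attrs), "\n";    # prints Expensive, Cached(60)
}
```
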
=head1 SEE ALSO

See L<perlref/"Function Templates"> for more about references and closures.
See L<perlxs> if you'd like to learn about calling C subroutines from Perl.
See L<perlembed> if you'd like to learn about calling Perl subroutines from C.
See L<perlmod> to learn about bundling up your functions in separate files.
See L<perlmodlib> to learn what library modules come standard on your system.
See L<perlootut> to learn how to make object method calls.