=head1 NAME
X<subroutine> X<function>

perlsub - Perl subroutines

=head1 SYNOPSIS

To declare subroutines:
X<subroutine, declaration> X<sub>

    sub NAME;                       # A "forward" declaration.
    sub NAME(PROTO);                #  ditto, but with prototypes
    sub NAME : ATTRS;               #  with attributes
    sub NAME(PROTO) : ATTRS;        #  with attributes and prototypes

    sub NAME BLOCK                  # A declaration and a definition.
    sub NAME(PROTO) BLOCK           #  ditto, but with prototypes
    sub NAME : ATTRS BLOCK          #  with attributes
    sub NAME(PROTO) : ATTRS BLOCK   #  with prototypes and attributes

To define an anonymous subroutine at runtime:
X<subroutine, anonymous>

    $subref = sub BLOCK;                 # no proto
    $subref = sub (PROTO) BLOCK;         # with proto
    $subref = sub : ATTRS BLOCK;         # with attributes
    $subref = sub (PROTO) : ATTRS BLOCK; # with proto and attributes

To import subroutines:
X<import>

    use MODULE qw(NAME1 NAME2 NAME3);

To call subroutines:
X<subroutine, call> X<call>

    NAME(LIST);    # & is optional with parentheses.
    NAME LIST;     # Parentheses optional if predeclared/imported.
    &NAME(LIST);   # Circumvent prototypes.
    &NAME;         # Makes current @_ visible to called subroutine.

=head1 DESCRIPTION

Like many languages, Perl provides for user-defined subroutines.
These may be located anywhere in the main program, loaded in from
other files via the C<do>, C<require>, or C<use> keywords, or
generated on the fly using C<eval> or anonymous subroutines.
You can even call a function indirectly using a variable containing
its name or a CODE reference.

The Perl model for function call and return values is simple: all
functions are passed as parameters one single flat list of scalars, and
all functions likewise return to their caller one single flat list of
scalars. Any arrays or hashes in these call and return lists will
collapse, losing their identities--but you may always use
pass-by-reference instead to avoid this. Both call and return lists may
contain as many or as few scalar elements as you'd like. (Often a
function without an explicit return statement is called a subroutine, but
there's really no difference from Perl's perspective.)
X<subroutine, parameter> X<parameter>

Any arguments passed in show up in the array C<@_>. Therefore, if
you called a function with two arguments, those would be stored in
C<$_[0]> and C<$_[1]>. The array C<@_> is a local array, but its
elements are aliases for the actual scalar parameters. In particular,
if an element C<$_[0]> is updated, the corresponding argument is
updated (or an error occurs if it is not updatable). If an argument
is an array or hash element which did not exist when the function
was called, that element is created only when (and if) it is modified
or a reference to it is taken. (Some earlier versions of Perl
created the element whether or not the element was assigned to.)
Assigning to the whole array C<@_> removes that aliasing, and does
not update any arguments.
X<subroutine, argument> X<argument> X<@_>

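The aliasing behaviour of C<@_> can be seen in a small sketch. The
subroutine and variable names here (C<double_in_place>, C<no_effect>,
C<count_args>) are hypothetical and exist only for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: modifies its arguments in place, because each
# element of @_ is an alias for the caller's variable.
sub double_in_place {
    $_ *= 2 for @_;
    return;
}

# Hypothetical helper: assigning to the whole of @_ breaks the
# aliasing, so the caller's variables are left untouched.
sub no_effect {
    @_ = map { $_ * 2 } @_;
    return;
}

# Hypothetical helper: only counts its arguments, never modifies them.
sub count_args { return scalar @_ }

my ($a1, $b1) = (1, 2);
double_in_place($a1, $b1);      # $a1 is now 2, $b1 is now 4

my ($c1, $d1) = (1, 2);
no_effect($c1, $d1);            # $c1 and $d1 are unchanged

my %h;
count_args($h{missing});        # the element is not created
my $created = exists $h{missing};
```
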
A C<return> statement may be used to exit a subroutine, optionally
specifying the returned value, which will be evaluated in the
appropriate context (list, scalar, or void) depending on the context of
the subroutine call. If you specify no return value, the subroutine
returns an empty list in list context, the undefined value in scalar
context, or nothing in void context. If you return one or more
aggregates (arrays and hashes), these will be flattened together into
one large indistinguishable list.

If no C<return> is found and if the last statement is an expression, its
value is returned. If the last statement is a loop control structure
like a C<foreach> or a C<while>, the returned value is unspecified. The
empty sub returns the empty list.
X<subroutine, return value> X<return value> X<return>

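As a brief illustration of these return rules, the following sketch
uses three hypothetical subroutines (C<nothing>, C<last_expr>, and
C<both>) that are not part of any real module:

```perl
use strict;
use warnings;

# A bare return adapts to the caller's context.
sub nothing { return }

# With no explicit return, the value of the last expression
# evaluated is returned.
sub last_expr {
    my ($n) = @_;
    $n + 1;
}

# Returning two arrays flattens them into a single list.
sub both {
    my @x = (1, 2);
    my @y = (3);
    return (@x, @y);
}

my @list   = nothing();       # empty list in list context
my $scalar = nothing();       # undef in scalar context
my $next   = last_expr(41);   # 42
my @flat   = both();          # (1, 2, 3)
```
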
Perl does not have named formal parameters. In practice all you
do is assign the contents of C<@_> to a C<my()> list. Variables that
aren't declared to be private are global variables. For gory details
on creating private variables, see L<"Private Variables via my()">
and L<"Temporary Values via local()">. To create protected
environments for a set of functions in a separate package (and
probably a separate file), see L<perlmod/"Packages">.
X<formal parameter> X<parameter, formal>

Example:

    sub max {
        my $max = shift(@_);
        foreach $foo (@_) {
            $max = $foo if $max < $foo;
        }
        return $max;
    }
    $bestday = max($mon,$tue,$wed,$thu,$fri);

Example:

    # get a line, combining continuation lines
    #  that start with whitespace

    sub get_line {
        $thisline = $lookahead;  # global variables!
        LINE: while (defined($lookahead = <STDIN>)) {
            if ($lookahead =~ /^[ \t]/) {
                $thisline .= $lookahead;
            }
            else {
                last LINE;
            }
        }
        return $thisline;
    }

    $lookahead = <STDIN>;       # get first line
    while (defined($line = get_line())) {
        ...
    }

Assigning to a list of private variables to name your arguments:

    sub maybeset {
        my($key, $value) = @_;
        $Foo{$key} = $value unless $Foo{$key};
    }

Because the assignment copies the values, this also has the effect
of turning call-by-reference into call-by-value. Otherwise a
function is free to do in-place modifications of C<@_> and change
its caller's values.
X<call-by-reference> X<call-by-value>

    upcase_in($v1, $v2);  # this changes $v1 and $v2
    sub upcase_in {
        for (@_) { tr/a-z/A-Z/ }
    }

You aren't allowed to modify constants in this way, of course. If an
argument were actually a literal and you tried to change it, you'd take a
(presumably fatal) exception. For example, this won't work:
X<call-by-reference> X<call-by-value>

    upcase_in("frederick");

It would be much safer if the C<upcase_in()> function
were written to return a copy of its parameters instead
of changing them in place:

    ($v3, $v4) = upcase($v1, $v2);  # this doesn't change $v1 and $v2
    sub upcase {
        return unless defined wantarray;  # void context, do nothing
        my @parms = @_;
        for (@parms) { tr/a-z/A-Z/ }
        return wantarray ? @parms : $parms[0];
    }

Notice how this (unprototyped) function doesn't care whether it was
passed real scalars or arrays. Perl sees all arguments as one big,
long, flat parameter list in C<@_>. This is one area where
Perl's simple argument-passing style shines. The C<upcase()>
function would work perfectly well without changing the C<upcase()>
definition even if we fed it things like this:

    @newlist   = upcase(@list1, @list2);
    @newlist   = upcase( split /:/, $var );

Do not, however, be tempted to do this:

    (@a, @b)   = upcase(@list1, @list2);

Like the flattened incoming parameter list, the return list is also
flattened on return. So all you have managed to do here is store
everything in C<@a> and make C<@b> empty. See
L<Pass by Reference> for alternatives.

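One way to keep two lists distinct is to pass and return array
references. The following is only a sketch: C<upcase_lists> is a
hypothetical variant of C<upcase()>, and it uses C<uc> rather than
C<tr/a-z/A-Z/> for brevity.

```perl
use strict;
use warnings;

# Hypothetical variant of upcase(): takes array references and
# returns one new array reference per input, so the lists keep
# their identities instead of being flattened together.
sub upcase_lists {
    my @results;
    for my $aref (@_) {
        push @results, [ map { uc $_ } @$aref ];
    }
    return @results;
}

my @list1 = qw(a b);
my @list2 = qw(c d e);
my ($up1, $up2) = upcase_lists(\@list1, \@list2);
# @$up1 holds the upcased copy of @list1, @$up2 that of @list2;
# the originals are unchanged.
```
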
A subroutine may be called using an explicit C<&> prefix. The
C<&> is optional in modern Perl, as are parentheses if the
subroutine has been predeclared. The C<&> is I<not> optional
when just naming the subroutine, such as when it's used as
an argument to defined() or undef(). Nor is it optional when you
want to do an indirect subroutine call with a subroutine name or
reference using the C<&$subref()> or C<&{$subref}()> constructs,
although the C<< $subref->() >> notation solves that problem.
See L<perlref> for more about all that.
X<&>

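The different spellings can be compared side by side. In this sketch
C<greet> is a hypothetical subroutine used only for illustration:

```perl
use strict;
use warnings;

# Hypothetical example subroutine.
sub greet { return "hello, $_[0]" }

my $subref     = \&greet;               # & is required to name the sub
my $is_defined = defined &greet;        # ... and here as well
my $via_amp    = &$subref('world');     # indirect call with &
my $via_brace  = &{$subref}('world');   # same call, with braces
my $via_arrow  = $subref->('world');    # arrow notation, no & needed
```

All three calls invoke the same subroutine with the same argument list.
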
Subroutines may be called recursively. If a subroutine is called
using the C<&> form, the argument list is optional, and if omitted,
no C<@_> array is set up for the subroutine: the C<@_> array at the
time of the call is visible to the subroutine instead. This is an
efficiency mechanism that new users may wish to avoid.
X<recursion>

    &foo(1,2,3);        # pass three arguments
    foo(1,2,3);         # the same

    foo();              # pass a null list
    &foo();             # the same

    &foo;               # foo() gets current args, like foo(@_) !!
    foo;                # like foo() IFF sub foo predeclared, else "foo"

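The difference between C<&foo;> and C<foo();> is easy to overlook. A
minimal sketch, using hypothetical subroutines named C<count_args>,
C<outer_amp>, and C<outer_paren>:

```perl
use strict;
use warnings;

# Hypothetical helper that reports how many arguments it can see.
sub count_args { return scalar @_ }

# Called with & and no parentheses: count_args sees this sub's @_.
sub outer_amp   { return &count_args; }

# Called with an explicit empty list: count_args gets its own empty @_.
sub outer_paren { return count_args(); }

my $shared = outer_amp(1, 2, 3);     # 3
my $empty  = outer_paren(1, 2, 3);   # 0
```
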
Not only does the C<&> form make the argument list optional, it also
disables any prototype checking on arguments you do provide. This
is partly for historical reasons, and partly for having a convenient way
to cheat if you know what you're doing. See L</Prototypes> below.
X<&>

Since Perl 5.16.0, the C<__SUB__> token is available under C<use feature
'current_sub'> and C<use 5.16.0>. It will evaluate to a reference to the
currently-running sub, which allows for recursive calls without knowing
your subroutine's name.

    use 5.16.0;
    my $factorial = sub {
        my ($x) = @_;
        return 1 if $x == 1;
        return($x * __SUB__->( $x - 1 ) );
    };

The behaviour of C<__SUB__> within a regex code block (such as C</(?{...})/>)
is subject to change.

Subroutines whose names are in all upper case are reserved to the Perl
core, as are modules whose names are in all lower case. A subroutine in
all capitals is a loosely-held convention meaning it will be called
indirectly by the run-time system itself, usually due to a triggered event.
Subroutines whose names start with a left parenthesis are also reserved in
the same way. The following is a list of some subroutines that currently do
special, pre-defined things.

=over

=item documented later in this document

C<AUTOLOAD>

=item documented in L<perlmod>

C<CLONE>, C<CLONE_SKIP>

=item documented in L<perlobj>

C<DESTROY>

=item documented in L<perltie>

C<BINMODE>, C<CLEAR>, C<CLOSE>, C<DELETE>, C<DESTROY>, C<EOF>, C<EXISTS>,
C<EXTEND>, C<FETCH>, C<FETCHSIZE>, C<FILENO>, C<FIRSTKEY>, C<GETC>,
C<NEXTKEY>, C<OPEN>, C<POP>, C<PRINT>, C<PRINTF>, C<PUSH>, C<READ>,
C<READLINE>, C<SCALAR>, C<SEEK>, C<SHIFT>, C<SPLICE>, C<STORE>,
C<STORESIZE>, C<TELL>, C<TIEARRAY>, C<TIEHANDLE>, C<TIEHASH>,
C<TIESCALAR>, C<UNSHIFT>, C<UNTIE>, C<WRITE>

=item documented in L<PerlIO::via>

C<BINMODE>, C<CLEARERR>, C<CLOSE>, C<EOF>, C<ERROR>, C<FDOPEN>, C<FILENO>,
C<FILL>, C<FLUSH>, C<OPEN>, C<POPPED>, C<PUSHED>, C<READ>, C<SEEK>,
C<SETLINEBUF>, C<SYSOPEN>, C<TELL>, C<UNREAD>, C<UTF8>, C<WRITE>

=item documented in L<perlfunc>

L<< C<import> | perlfunc/use >>, L<< C<unimport> | perlfunc/use >>,
L<< C<INC> | perlfunc/require >>

=item documented in L<UNIVERSAL>

C<VERSION>

=item documented in L<perldebguts>

C<DB::DB>, C<DB::sub>, C<DB::lsub>, C<DB::goto>, C<DB::postponed>

=item undocumented, used internally by the L<overload> feature

any starting with C<(>

=back

The C<BEGIN>, C<UNITCHECK>, C<CHECK>, C<INIT> and C<END> subroutines
are not so much subroutines as named special code blocks, of which you
can have more than one in a package, and which you can B<not> call
explicitly. See L<perlmod/"BEGIN, UNITCHECK, CHECK, INIT and END">.

=head2 Private Variables via my()
X<my> X<variable, lexical> X<lexical> X<lexical variable> X<scope, lexical>
X<lexical scope> X<attributes, my>

Synopsis:

    my $foo;            # declare $foo lexically local
    my (@wid, %get);    # declare list of variables local
    my $foo = "flurp";  # declare $foo lexical, and init it
    my @oof = @bar;     # declare @oof lexical, and init it
    my $x : Foo = $y;   # similar, with an attribute applied

B<WARNING>: The use of attribute lists on C<my> declarations is still
evolving. The current semantics and interface are subject to change.
See L<attributes> and L<Attribute::Handlers>.

The C<my> operator declares the listed variables to be lexically
confined to the enclosing block, conditional (C<if/unless/elsif/else>),
loop (C<for/foreach/while/until/continue>), subroutine, C<eval>,
or C<do/require/use>'d file. If more than one value is listed, the
list must be placed in parentheses. All listed elements must be
legal lvalues. Only alphanumeric identifiers may be lexically
scoped--magical built-ins like C<$/> must currently be C<local>ized
with C<local> instead.

Unlike dynamic variables created by the C<local> operator, lexical
variables declared with C<my> are totally hidden from the outside
world, including any called subroutines. This is true if it's the
same subroutine called from itself or elsewhere--every call gets
its own copy.
X<local>

This doesn't mean that a C<my> variable declared in a statically
enclosing lexical scope would be invisible. Only dynamic scopes
are cut off. For example, the C<bumpx()> function below has access
to the lexical $x variable because both the C<my> and the C<sub>
occurred at the same scope, presumably file scope.

    my $x = 10;
    sub bumpx { $x++ }

An C<eval()>, however, can see lexical variables of the scope it is
being evaluated in, so long as the names aren't hidden by declarations within
the C<eval()> itself. See L<perlref>.
X<eval, scope of>

The parameter list to my() may be assigned to if desired, which allows you
to initialize your variables. (If no initializer is given for a
particular variable, it is created with the undefined value.) Commonly
this is used to name input parameters to a subroutine. Examples:

    $arg = "fred";          # "global" variable
    $n = cube_root(27);
    print "$arg thinks the root is $n\n";
 fred thinks the root is 3

    sub cube_root {
        my $arg = shift;  # name doesn't matter
        $arg **= 1/3;
        return $arg;
    }

The C<my> is simply a modifier on something you might assign to. So when
you do assign to variables in its argument list, C<my> doesn't
change whether those variables are viewed as a scalar or an array. So

    my ($foo) = <STDIN>;                # WRONG?
    my @FOO = <STDIN>;

both supply a list context to the right-hand side, while

    my $foo = <STDIN>;

supplies a scalar context. But the following declares only one variable:

    my $foo, $bar = 1;                  # WRONG

That has the same effect as

    my $foo;
    $bar = 1;

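The context that a C<my> declaration supplies to its right-hand side
can be observed with C<wantarray>. A small sketch, in which C<context>
is a hypothetical helper:

```perl
use strict;
use warnings;

# Hypothetical helper that reports the context it was called in.
sub context { return wantarray ? 'list' : 'scalar' }

my ($in_parens) = context();   # parentheses: list context
my @in_array    = context();   # array: list context
my $bare        = context();   # bare scalar: scalar context
```
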
The declared variable is not introduced (is not visible) until after
the current statement. Thus,

    my $x = $x;

can be used to initialize a new $x with the value of the old $x, and
the expression

    my $x = 123 and $x == 123

is false unless the old $x happened to have the value C<123>.

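A short sketch of this rule, assuming the "old" $x is a package
variable declared with C<our>; the C<$inner> variable is only there to
carry the result out of the block:

```perl
use strict;
use warnings;

our $x = 5;             # the "old" $x, a package variable
my $inner;
{
    my $x = $x + 1;     # the right-hand side still sees the old $x
    $inner = $x;        # the new lexical $x is visible from here on
}
# $inner is 6; the package variable $x is still 5
```
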
Lexical scopes of control structures are not bounded precisely by the
braces that delimit their controlled blocks; control expressions are
part of that scope, too. Thus in the loop

    while (my $line = <>) {
        $line = lc $line;
    } continue {
        print $line;
    }

the scope of $line extends from its declaration throughout the rest of
the loop construct (including the C<continue> clause), but not beyond
it. Similarly, in the conditional

    if ((my $answer = <STDIN>) =~ /^yes$/i) {
        user_agrees();
    } elsif ($answer =~ /^no$/i) {
        user_disagrees();
    } else {
        chomp $answer;
        die "'$answer' is neither 'yes' nor 'no'";
    }

the scope of $answer extends from its declaration through the rest
of that conditional, including any C<elsif> and C<else> clauses,
but not beyond it. See L<perlsyn/"Simple Statements"> for information
on the scope of variables in statements with modifiers.

The C<foreach> loop defaults to scoping its index variable dynamically
in the manner of C<local>. However, if the index variable is
prefixed with the keyword C<my>, or if there is already a lexical
by that name in scope, then a new lexical is created instead. Thus
in the loop
X<foreach> X<for>

    for my $i (1, 2, 3) {
        some_function();
    }

the scope of $i extends to the end of the loop, but not beyond it,
rendering the value of $i inaccessible within C<some_function()>.
X<foreach> X<for>

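The two kinds of loop variable can be contrasted in a sketch. The
subroutine C<peek> and the arrays C<@seen_my> and C<@seen_pkg> are
hypothetical names chosen for this example:

```perl
use strict;
use warnings;

our $i = 'outer';

# Hypothetical helper: reads the package variable $i.
sub peek { return $i }

# Lexical loop variable: invisible to the called subroutine.
my @seen_my;
for my $i (1, 2) {
    push @seen_my, peek();
}

# Package loop variable: scoped dynamically, so peek() sees each value.
my @seen_pkg;
for $i (1, 2) {
    push @seen_pkg, peek();
}
# After the loop, $i has its original value again.
```
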
Some users may wish to encourage the use of lexically scoped variables.
As an aid to catching implicit uses of package variables,
which are always global, if you say

    use strict 'vars';

then any variable mentioned from there to the end of the enclosing
block must either refer to a lexical variable, be predeclared via
C<our> or C<use vars>, or else must be fully qualified with the package name.
A compilation error results otherwise. An inner block may countermand
this with C<no strict 'vars'>.

A C<my> has both a compile-time and a run-time effect. At compile
time, the compiler takes notice of it. The principal usefulness
of this is to quiet C<use strict 'vars'>, but it is also essential
for generation of closures as detailed in L<perlref>. Actual
initialization is delayed until run time, though, so it gets executed
at the appropriate time, such as each time through a loop, for
example.

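Because the initialization runs each time through a loop, every
iteration gets a fresh variable that a closure can capture. A minimal
sketch, with hypothetical names C<@subs>, C<$square>, and C<@values>:

```perl
use strict;
use warnings;

my @subs;
for my $n (1 .. 3) {
    my $square = $n * $n;                # a fresh $square per iteration
    push @subs, sub { return $square };  # each closure keeps its own copy
}

my @values = map { $_->() } @subs;       # (1, 4, 9)
```
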
Variables declared with C<my> are not part of any package and are therefore
never fully qualified with the package name. In particular, you're not
allowed to try to make a package variable (or other global) lexical:

    my $pack::var;      # ERROR!  Illegal syntax

In fact, a dynamic variable (also known as a package or global variable)
is still accessible using the fully qualified C<::> notation even while a
lexical of the same name is also visible:

    package main;
    local $x = 10;
    my    $x = 20;
    print "$x and $::x\n";

That will print out C<20> and C<10>.

You may declare C<my> variables at the outermost scope of a file
to hide any such identifiers from the world outside that file. This
is similar in spirit to C's static variables when they are used at
the file level. To do this with a subroutine requires the use of
a closure (an anonymous function that accesses enclosing lexicals).
If you want to create a private subroutine that cannot be called
from outside that block, you can declare a lexical variable containing
an anonymous sub reference:

    my $secret_version = '1.001-beta';
    my $secret_sub = sub { print $secret_version };
    &$secret_sub();

As long as the reference is never returned by any function within the
module, no outside module can see the subroutine, because its name is not in
any package's symbol table. Remember that it's not I<REALLY> called
C<$some_pack::secret_version> or anything; it's just $secret_version,
unqualified and unqualifiable.

This does not work with object methods, however; all object methods
have to be in the symbol table of some package to be found. See
L<perlref/"Function Templates"> for something of a work-around to
this.

=head2 Persistent Private Variables
X<state> X<state variable> X<static> X<variable, persistent> X<variable, static> X<closure>

There are two ways to build persistent private variables in Perl 5.10.
First, you can simply use the C<state> feature. Or, you can use closures,
if you want to stay compatible with releases older than 5.10.

=head3 Persistent variables via state()

Beginning with Perl 5.10.0, you can declare variables with the C<state>
keyword in place of C<my>. For that to work, though, you must have
enabled that feature beforehand, either by using the C<feature> pragma, or
by using C<-E> on one-liners (see L<feature>). Beginning with Perl 5.16,
the C<CORE::state> form does not require the
C<feature> pragma.

The C<state> keyword creates a lexical variable (following the same scoping
rules as C<my>) that persists from one subroutine call to the next. If a
state variable resides inside an anonymous subroutine, then each copy of
the subroutine has its own copy of the state variable. However, the value
of the state variable will still persist between calls to the same copy of
the anonymous subroutine. (Don't forget that C<sub { ... }> creates a new
subroutine each time it is executed.)

For example, the following code maintains a private counter, incremented
each time the gimme_another() function is called:

    use feature 'state';
    sub gimme_another { state $x; return ++$x }

And this example uses anonymous subroutines to create separate counters:

    use feature 'state';
    sub create_counter {
        return sub { state $x; return ++$x }
    }

Also, since C<$x> is lexical, it can't be reached or modified by any Perl
code outside.

When combined with variable declaration, simple scalar assignment to C<state>
variables (as in C<state $x = 42>) is executed only the first time. When such
statements are evaluated subsequent times, the assignment is ignored. The
behavior of this sort of assignment to non-scalar variables is undefined.

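The one-time initialization and the per-closure copies can be seen
together in a sketch. Here C<create_counter_from> is a hypothetical
variant of C<create_counter> that takes a starting value; because the
anonymous sub closes over C<$start>, each call returns a distinct
closure with its own state variable.

```perl
use strict;
use warnings;
use feature 'state';

# Hypothetical variant of create_counter() with a starting value.
sub create_counter_from {
    my ($start) = @_;
    return sub {
        state $x = $start;   # runs only on the first call of each closure
        return ++$x;
    };
}

my $c1 = create_counter_from(10);
my $c2 = create_counter_from(100);

my @first  = map { $c1->() } 1 .. 3;   # (11, 12, 13)
my $second = $c2->();                  # 101, independent of $c1
```
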
=head3 Persistent variables with closures

Just because a lexical variable is lexically (also called statically)
scoped to its enclosing block, C<eval>, or C<do> FILE, this doesn't mean that
within a function it works like a C static. It normally works more
like a C auto, but with implicit garbage collection.

Unlike local variables in C or C++, Perl's lexical variables don't
necessarily get recycled just because their scope has exited.
If something more permanent is still aware of the lexical, it will
stick around. So long as something else references a lexical, that
lexical won't be freed--which is as it should be. You wouldn't want
memory being freed until you were done using it, or kept around once you
were done. Automatic garbage collection takes care of this for you.

This means that you can pass back or save away references to lexical
variables, whereas to return a pointer to a C auto is a grave error.
It also gives us a way to simulate C's function statics. Here's a
mechanism for giving a function private variables with both lexical
scoping and a static lifetime. If you do want to create something like
C's static variables, just enclose the whole function in an extra block,
and put the static variable outside the function but in the block.

    {
        my $secret_val = 0;
        sub gimme_another {
            return ++$secret_val;
        }
    }
    # $secret_val now becomes unreachable by the outside
    # world, but retains its value between calls to gimme_another

If this function is being sourced in from a separate file
via C<require> or C<use>, then this is probably just fine. If it's
all in the main program, you'll need to arrange for the C<my>
to be executed early, either by putting the whole block above
your main program, or more likely, placing merely a C<BEGIN>
code block around it to make sure it gets executed before your program
starts to run:

    BEGIN {
        my $secret_val = 0;
        sub gimme_another {
            return ++$secret_val;
        }
    }

See L<perlmod/"BEGIN, UNITCHECK, CHECK, INIT and END"> about the
special triggered code blocks, C<BEGIN>, C<UNITCHECK>, C<CHECK>,
C<INIT> and C<END>.

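The same technique lets several subroutines share one private
variable. A minimal sketch, with hypothetical subroutines C<next_id>
and C<reset_id>; the block is placed before the code that calls them,
so no C<BEGIN> is needed here:

```perl
use strict;
use warnings;

{
    my $count = 0;                        # private to this block
    sub next_id  { return ++$count }      # both subs share $count
    sub reset_id { $count = 0; return }
}

my $first  = next_id();     # 1
my $second = next_id();     # 2
reset_id();
my $after_reset = next_id();   # 1 again
```
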
If declared at the outermost scope (the file scope), then lexicals
work somewhat like C's file statics. They are available to all
functions in that same file declared below them, but are inaccessible
from outside that file. This strategy is sometimes used in modules
to create private variables that the whole module can see.

=head2 Temporary Values via local()
X<local> X<scope, dynamic> X<dynamic scope> X<variable, local>
X<variable, temporary>

B<WARNING>: In general, you should be using C<my> instead of C<local>, because
it's faster and safer. Exceptions to this include the global punctuation
variables, global filehandles and formats, and direct manipulation of the
Perl symbol table itself. C<local> is mostly used when the current value
of a variable must be visible to called subroutines.

Synopsis:

    # localization of values

    local $foo;                # make $foo dynamically local
    local (@wid, %get);        # make list of variables local
    local $foo = "flurp";      # make $foo dynamic, and init it
    local @oof = @bar;         # make @oof dynamic, and init it

    local $hash{key} = "val";  # sets a local value for this hash entry
    delete local $hash{key};   # delete this entry for the current block
    local ($cond ? $v1 : $v2); # several types of lvalues support
                               # localization

    # localization of symbols

    local *FH;                 # localize $FH, @FH, %FH, &FH  ...
    local *merlyn = *randal;   # now $merlyn is really $randal, plus
                               #     @merlyn is really @randal, etc
    local *merlyn = 'randal';  # SAME THING: promote 'randal' to *randal
    local *merlyn = \$randal;  # just alias $merlyn, not @merlyn etc

A C<local> modifies its listed variables to be "local" to the
enclosing block, C<eval>, or C<do FILE>--and to I<any subroutine
called from within that block>. A C<local> just gives temporary
values to global (meaning package) variables. It does I<not> create
a local variable. This is known as dynamic scoping. Lexical scoping
is done with C<my>, which works more like C's auto declarations.

Some types of lvalues can be localized as well: hash and array elements
and slices, conditionals (provided that their result is always
localizable), and symbolic references. As for simple variables, this
creates new, dynamically scoped values.

If more than one variable or expression is given to C<local>, they must be
placed in parentheses. This operator works
by saving the current values of those variables in its argument list on a
hidden stack and restoring them upon exiting the block, subroutine, or
eval. This means that called subroutines can also reference the local
variable, but not the global one. The argument list may be assigned to if
desired, which allows you to initialize your local variables. (If no
initializer is given for a particular variable, it is created with an
undefined value.)

Because C<local> is a run-time operator, it gets executed each time
through a loop. Consequently, it's more efficient to localize your
variables outside the loop.

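The contrast between dynamic and lexical scoping can be sketched as
follows. The names C<$dyn>, C<show_dyn>, C<with_local>, and C<with_my>
are hypothetical and serve only to illustrate the point:

```perl
use strict;
use warnings;

our $dyn = 'global';

# Hypothetical helper: always reads the package variable $dyn.
sub show_dyn { return $dyn }

sub with_local {
    local $dyn = 'localized';   # temporary value, seen by called subs
    return show_dyn();
}

sub with_my {
    my $dyn = 'lexical';        # a new variable, hidden from called subs
    return show_dyn();
}

my $from_local = with_local();  # 'localized'
my $from_my    = with_my();     # 'global'
# Once with_local() has returned, $dyn is 'global' again.
```
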
=head3 Grammatical note on local()
X<local, context>

A C<local> is simply a modifier on an lvalue expression. When you assign to
a C<local>ized variable, the C<local> doesn't change whether its list is viewed
as a scalar or an array. So

    local($foo) = <STDIN>;
    local @FOO = <STDIN>;

both supply a list context to the right-hand side, while

    local $foo = <STDIN>;

supplies a scalar context.

=head3 Localization of special variables
X<local, special variable>

If you localize a special variable, you'll be giving a new value to it,
but its magic won't go away. That means that all side-effects related
to this magic still work with the localized value.

This feature allows code like this to work:

    # Read the whole contents of FILE in $slurp
    { local $/ = undef; $slurp = <FILE>; }

Note, however, that this restricts localization of some values; for
example, the following statement dies, as of perl 5.10.0, with an error
I<Modification of a read-only value attempted>, because the $1 variable is
magical and read-only:

    local $1 = 2;

One exception is the default scalar variable: starting with perl 5.14
C<local($_)> will always strip all magic from $_, to make it possible
to safely reuse $_ in a subroutine.

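A self-contained version of the slurp idiom might look like the
following sketch. It reads from an in-memory filehandle so that it does
not depend on any file; the names C<$data>, C<$fh>, and
C<$sep_restored> are assumptions made for this example.

```perl
use strict;
use warnings;

my $data = "line one\nline two\n";
open my $fh, '<', \$data
    or die "Cannot open in-memory filehandle: $!";

my $slurp;
{
    local $/;           # undefined separator: read to end of file
    $slurp = <$fh>;     # the whole contents in one read
}
close $fh;

# Outside the block, the input record separator has its old value.
my $sep_restored = defined $/ && $/ eq "\n";
```
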
B<WARNING>: Localization of tied arrays and hashes does not currently
work as described.
This will be fixed in a future release of Perl; in the meantime, avoid
code that relies on any particular behaviour of localising tied arrays
or hashes (localising individual elements is still okay).
See L<perl58delta/"Localising Tied Arrays and Hashes Is Broken"> for more
details.
X<local, tie>

=head3 Localization of globs
X<local, glob> X<glob>

The construct

    local *name;

creates a whole new symbol table entry for the glob C<name> in the
current package. That means that all variables in its glob slot ($name,
@name, %name, &name, and the C<name> filehandle) are dynamically reset.

This implies, among other things, that any magic eventually carried by
those variables is locally lost. In other words, saying C<local */>
will not have any effect on the internal value of the input record
separator.

325192b1 721=head3 Localization of elements of composite types
d74e8afc 722X<local, composite type element> X<local, array element> X<local, hash element>
3e3baf6d 723
6ee623d5 724It's also worth taking a moment to explain what happens when you
f86cebdf
GS
725C<local>ize a member of a composite type (i.e. an array or hash element).
726In this case, the element is C<local>ized I<by name>. This means that
6ee623d5
GS
727when the scope of the C<local()> ends, the saved value will be
728restored to the hash element whose key was named in the C<local()>, or
729the array element whose index was named in the C<local()>. If that
730element was deleted while the C<local()> was in effect (e.g. by a
731C<delete()> from a hash or a C<shift()> of an array), it will spring
732back into existence, possibly extending an array and filling in the
733skipped elements with C<undef>. For instance, if you say
734
735 %hash = ( 'This' => 'is', 'a' => 'test' );
736 @ary = ( 0..5 );
737 {
738 local($ary[5]) = 6;
739 local($hash{'a'}) = 'drill';
740 while (my $e = pop(@ary)) {
741 print "$e . . .\n";
742 last unless $e > 3;
743 }
744 if (@ary) {
745 $hash{'only a'} = 'test';
746 delete $hash{'a'};
747 }
748 }
749 print join(' ', map { "$_ $hash{$_}" } sort keys %hash),".\n";
750 print "The array has ",scalar(@ary)," elements: ",
751 join(', ', map { defined $_ ? $_ : 'undef' } @ary),"\n";
752
753Perl will print
754
755 6 . . .
756 4 . . .
757 3 . . .
758 This is a test only a test.
759 The array has 6 elements: 0, 1, 2, undef, undef, 5
760
19799a22 761The behavior of local() on non-existent members of composite
7185e5cc
GS
762types is subject to change in the future.
763
d361fafa
VP
764=head3 Localized deletion of elements of composite types
765X<delete> X<local, composite type element> X<local, array element> X<local, hash element>
766
767You can use the C<delete local $array[$idx]> and C<delete local $hash{key}>
768constructs to delete a composite type entry for the current block and restore
769it when it ends. They return the array/hash value before the localization,
770which means that they are respectively equivalent to
771
772 do {
773 my $val = $array[$idx];
774 local $array[$idx];
775 delete $array[$idx];
776 $val
777 }
778
779and
780
781 do {
782 my $val = $hash{key};
783 local $hash{key};
784 delete $hash{key};
785 $val
786 }
787
788except that for those the C<local> is scoped to the C<do> block. Slices are
789also accepted.
790
791 my %hash = (
792 a => [ 7, 8, 9 ],
793 b => 1,
794 );
795
796 {
797 my $a = delete local $hash{a};
798 # $a is [ 7, 8, 9 ]
799 # %hash is (b => 1)
800
801 {
802 my @nums = delete local @$a[0, 2];
803 # @nums is (7, 9)
804 # $a is [ undef, 8 ]
805
806 $a->[0] = 999; # will be erased when the scope ends
807 }
808 # $a is back to [ 7, 8, 9 ]
809
810 }
811 # %hash is back to its original state
812
cd06dffe 813=head2 Lvalue subroutines
d74e8afc 814X<lvalue> X<subroutine, lvalue>
cd06dffe 815
cd06dffe
GS
816It is possible to return a modifiable value from a subroutine.
817To do this, you have to declare the subroutine to return an lvalue.
818
819 my $val;
820 sub canmod : lvalue {
4a904372 821 $val; # or: return $val;
cd06dffe
GS
822 }
823 sub nomod {
824 $val;
825 }
826
827 canmod() = 5; # assigns to $val
828 nomod() = 5; # ERROR
829
830The scalar/list context for the subroutine and for the right-hand
831side of assignment is determined as if the subroutine call is replaced
832by a scalar. For example, consider:
833
834 data(2,3) = get_data(3,4);
835
836Both subroutines here are called in a scalar context, while in:
837
838 (data(2,3)) = get_data(3,4);
839
840and in:
841
842 (data(2),data(3)) = get_data(3,4);
843
844all the subroutines are called in a list context.
845
771cc755
JV
846Lvalue subroutines are convenient, but you have to keep in mind that,
847when used with objects, they may violate encapsulation. A normal
848mutator can check the supplied argument before setting the attribute
849it is protecting; an lvalue subroutine cannot. If you require any
850special processing when storing and retrieving the values, consider
851using the CPAN module Sentinel or something similar.
e6a32221 852
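The encapsulation concern can be sketched as follows. The C<Counter> class, its C<count> lvalue accessor and its C<set_count> mutator are hypothetical names invented for this illustration; the point is only that the lvalue form has no place to validate the value being stored.

```perl
use strict;
use warnings;

package Counter;    # hypothetical class, for illustration only

sub new { my ($class) = @_; return bless { count => 0 }, $class }

# Lvalue accessor: the caller assigns straight into the attribute.
sub count : lvalue { $_[0]{count} }

# Conventional mutator: the argument is checked before it is stored.
sub set_count {
    my ($self, $n) = @_;
    die "count must be a non-negative integer\n"
        unless defined $n && $n =~ /^\d+\z/;
    $self->{count} = $n;
    return $self;
}

package main;

my $c = Counter->new;
$c->count = -5;                                  # accepted without any check
my $stored   = $c->count;                        # the invalid value is now stored
my $rejected = !eval { $c->set_count(-5); 1 };   # the mutator dies instead
```

The assignment through the lvalue accessor succeeds silently, whereas the ordinary mutator gets a chance to refuse the same value.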
ca40957e
FC
853=head2 Lexical Subroutines
854X<my sub> X<state sub> X<our sub> X<subroutine, lexical>
855
441078c2
FC
856B<WARNING>: Lexical subroutines are still experimental. The feature may be
857modified or removed in future versions of Perl.
ca40957e
FC
858
859Lexical subroutines are only available under the C<use feature
860'lexical_subs'> pragma, which produces a warning unless the
f1d34ca8 861"experimental::lexical_subs" warnings category is disabled.
ca40957e
FC
862
863Beginning with Perl 5.18, you can declare a private subroutine with C<my>
864or C<state>. As with state variables, the C<state> keyword is only
865available under C<use feature 'state'> or C<use 5.010> or higher.
866
867These subroutines are only visible within the block in which they are
868declared, and only after that declaration:
869
f1d34ca8 870 no warnings "experimental::lexical_subs";
ca40957e
FC
871 use feature 'lexical_subs';
872
873 foo(); # calls the package/global subroutine
874 state sub foo {
875 foo(); # also calls the package subroutine
876 }
877 foo(); # calls "state" sub
878 my $ref = \&foo; # take a reference to "state" sub
879
880 my sub bar { ... }
881 bar(); # calls "my" sub
882
883To use a lexical subroutine from inside the subroutine itself, you must
884predeclare it. The C<sub foo {...}> subroutine definition syntax respects
885any previous C<my sub;> or C<state sub;> declaration.
886
887 my sub baz; # predeclaration
888 sub baz { # define the "my" sub
889 baz(); # recursive call
890 }
891
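As a slightly fuller sketch of the same predeclaration pattern, here is a recursive lexical subroutine that returns a value. The name C<factorial> is chosen only for this example, and Perl 5.18 or later is assumed.

```perl
use strict;
use warnings;
use feature 'lexical_subs';
no warnings 'experimental::lexical_subs';

my sub factorial;               # predeclaration makes the name visible below
sub factorial {                 # defines the "my" sub declared above
    my $n = shift;
    return $n <= 1 ? 1 : $n * factorial($n - 1);   # recursive call
}

my $result = factorial(5);
```

Without the predeclaration, the call inside the body would look for a package subroutine named C<factorial> instead.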
892=head3 C<state sub> vs C<my sub>
893
894What is the difference between "state" subs and "my" subs? Each time that
895execution enters a block in which "my" subs are declared, a new copy of each
896sub is created. "State" subroutines persist from one execution of the
897containing block to the next.
898
899So, in general, "state" subroutines are faster. But "my" subs are
900necessary if you want to create closures:
901
f1d34ca8 902 no warnings "experimental::lexical_subs";
ca40957e
FC
903 use feature 'lexical_subs';
904
905 sub whatever {
906 my $x = shift;
907 my sub inner {
908 ... do something with $x ...
909 }
910 inner();
911 }
912
913In this example, a new C<$x> is created when C<whatever> is called, and
914also a new C<inner>, which can see the new C<$x>. A "state" sub will only
915see the C<$x> from the first call to C<whatever>.
916
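A runnable variant of the closure example might look like this. The names C<make_label> and C<inner> are made up for the illustration, and Perl 5.18 or later is assumed.

```perl
use strict;
use warnings;
use feature 'lexical_subs';
no warnings 'experimental::lexical_subs';

sub make_label {
    my $x = shift;
    my sub inner {              # a fresh inner() is created on every call
        return "x=$x";          # and it closes over that call's $x
    }
    return inner();
}

my @labels = (make_label(1), make_label(2));
```

Each call to C<make_label> gets its own C<inner>, so the second label reflects the second argument rather than the first.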
917=head3 C<our> subroutines
918
919Like C<our $variable>, C<our sub> creates a lexical alias to the package
920subroutine of the same name.
921
922The two main uses for this are to switch back to using the package sub
923inside an inner scope:
924
f1d34ca8 925 no warnings "experimental::lexical_subs";
ca40957e
FC
926 use feature 'lexical_subs';
927
928 sub foo { ... }
929
930 sub bar {
931 my sub foo { ... }
932 {
933 # need to use the outer foo here
934 our sub foo;
935 foo();
936 }
937 }
938
939and to make a subroutine visible to other packages in the same scope:
940
941 package MySneakyModule;
942
f1d34ca8 943 no warnings "experimental::lexical_subs";
ca40957e
FC
944 use feature 'lexical_subs';
945
946 our sub do_something { ... }
947
948 sub do_something_with_caller {
949 package DB;
950 () = caller 1; # sets @DB::args
951 do_something(@args); # uses MySneakyModule::do_something
952 }
953
cb1a09d0 954=head2 Passing Symbol Table Entries (typeglobs)
d74e8afc 955X<typeglob> X<*>
cb1a09d0 956
19799a22
GS
957B<WARNING>: The mechanism described in this section was originally
958the only way to simulate pass-by-reference in older versions of
959Perl. While it still works fine in modern versions, the new reference
960mechanism is generally easier to work with. See below.
a0d0e21e
LW
961
962Sometimes you don't want to pass the value of an array to a subroutine
963but rather the name of it, so that the subroutine can modify the global
964copy of it rather than working with a local copy. In perl you can
cb1a09d0 965refer to all objects of a particular name by prefixing the name
5f05dabc 966with a star: C<*foo>. This is often known as a "typeglob", because the
a0d0e21e
LW
967star on the front can be thought of as a wildcard match for all the
968funny prefix characters on variables and subroutines and such.
969
55497cff 970When evaluated, the typeglob produces a scalar value that represents
5f05dabc 971all the objects of that name, including any filehandle, format, or
a0d0e21e 972subroutine. When assigned to, it causes the name mentioned to refer to
19799a22 973whatever C<*> value was assigned to it. Example:
a0d0e21e
LW
974
975 sub doubleary {
976 local(*someary) = @_;
977 foreach $elem (@someary) {
978 $elem *= 2;
979 }
980 }
981 doubleary(*foo);
982 doubleary(*bar);
983
19799a22 984Scalars are already passed by reference, so you can modify
a0d0e21e 985scalar arguments without using this mechanism by referring explicitly
1fef88e7 986to C<$_[0]> etc. You can modify all the elements of an array by passing
f86cebdf
GS
987all the elements as scalars, but you have to use the C<*> mechanism (or
988the equivalent reference mechanism) to C<push>, C<pop>, or change the size of
a0d0e21e
LW
989an array. It will certainly be faster to pass the typeglob (or reference).
990
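For comparison, the reference-based equivalent of C<doubleary> can be sketched as below. The names C<double_array> and C<append_sum> are invented for this example. Unlike the typeglob version, this form also works on C<my> arrays.

```perl
use strict;
use warnings;

# Modify every element in place through the reference.
sub double_array {
    my ($aref) = @_;
    $_ *= 2 for @$aref;
    return;
}

# Changing the size of the caller's array also needs the reference.
sub append_sum {
    my ($aref) = @_;
    my $sum = 0;
    $sum += $_ for @$aref;
    push @$aref, $sum;
    return;
}

my @foo = (1, 2, 3);
double_array(\@foo);    # @foo is now (2, 4, 6)
append_sum(\@foo);      # @foo is now (2, 4, 6, 12)
```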
991Even if you don't want to modify an array, this mechanism is useful for
5f05dabc 992passing multiple arrays in a single LIST, because normally the LIST
a0d0e21e 993mechanism will merge all the array values so that you can't extract out
55497cff 994the individual arrays. For more on typeglobs, see
2ae324a7 995L<perldata/"Typeglobs and Filehandles">.
cb1a09d0 996
5a964f20 997=head2 When to Still Use local()
d74e8afc 998X<local> X<variable, local>
5a964f20 999
19799a22
GS
1000Despite the existence of C<my>, there are still three places where the
1001C<local> operator shines. In fact, in these three places, you
5a964f20
TC
1002I<must> use C<local> instead of C<my>.
1003
13a2d996 1004=over 4
5a964f20 1005
551e1d92
RB
1006=item 1.
1007
1008You need to give a global variable a temporary value, especially $_.
5a964f20 1009
f86cebdf
GS
1010The global variables, like C<@ARGV> or the punctuation variables, must be
1011C<local>ized with C<local()>. This block reads in F</etc/motd>, and splits
5a964f20 1012it up into chunks separated by lines of equal signs, which are placed
f86cebdf 1013in C<@Fields>.
5a964f20
TC
1014
1015 {
1016 local @ARGV = ("/etc/motd");
1017 local $/ = undef;
1018 local $_ = <>;
1019 @Fields = split /^\s*=+\s*$/;
1020 }
1021
19799a22 1022In particular, it's important to C<local>ize $_ in any routine that assigns
5a964f20
TC
1023to it. Look out for implicit assignments in C<while> conditionals.
1024
551e1d92
RB
1025=item 2.
1026
1027You need to create a local file or directory handle or a local function.
5a964f20 1028
09bef843
SB
1029A function that needs a filehandle of its own must use
1030C<local()> on a complete typeglob. This can be used to create new symbol
5a964f20
TC
1031table entries:
1032
1033 sub ioqueue {
1034 local (*READER, *WRITER); # not my!
17b63f68 1035 pipe (READER, WRITER) or die "pipe: $!";
5a964f20
TC
1036 return (*READER, *WRITER);
1037 }
1038 ($head, $tail) = ioqueue();
1039
1040See the Symbol module for a way to create anonymous symbol table
1041entries.
1042
1043Because assignment of a reference to a typeglob creates an alias, this
1044can be used to create what is effectively a local function, or at least,
1045a local alias.
1046
1047 {
4a46e268 1048 local *grow = \&shrink; # only until this block exits
555bd962
BG
1049 grow(); # really calls shrink()
1050 move(); # if move() grow()s, it shrink()s too
5a964f20 1051 }
555bd962 1052 grow(); # get the real grow() again
5a964f20
TC
1053
1054See L<perlref/"Function Templates"> for more about manipulating
1055functions by name in this way.
1056
551e1d92
RB
1057=item 3.
1058
1059You want to temporarily change just one element of an array or hash.
5a964f20 1060
f86cebdf 1061You can C<local>ize just one element of an aggregate. Usually this
5a964f20
TC
1062is done on dynamics:
1063
1064 {
1065 local $SIG{INT} = 'IGNORE';
1066 funct(); # uninterruptible
1067 }
1068 # interruptibility automatically restored here
1069
9d42615f 1070But it also works on lexically declared aggregates.
5a964f20
TC
1071
1072=back
1073
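To illustrate the first case in the list above, here is a minimal sketch of a routine that assigns to C<$_> implicitly through a C<while> conditional. The routine name C<count_lines> and the in-memory filehandle are assumptions made for the example. Without the C<local $_>, the loop would overwrite the elements of the caller's array, because C<foreach> aliases C<$_> to each element.

```perl
use strict;
use warnings;

sub count_lines {
    my ($text) = @_;
    local $_;                           # protect the caller's $_
    open(my $fh, '<', \$text) or die "cannot open in-memory file: $!";
    my $n = 0;
    while (<$fh>) {                     # implicitly assigns each line to $_
        $n++;
    }
    close $fh;
    return $n;
}

my @names = ('alpha', 'beta');
my @report;
for (@names) {                          # $_ is aliased to each element
    my $lines = count_lines("one\ntwo\n");
    push @report, "$_:$lines";
}
```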
cb1a09d0 1074=head2 Pass by Reference
d74e8afc 1075X<pass by reference> X<pass-by-reference> X<reference>
cb1a09d0 1076
55497cff 1077If you want to pass more than one array or hash into a function--or
1078return them from it--and have them maintain their integrity, then
1079you're going to have to use an explicit pass-by-reference. Before you
1080do that, you need to understand references as detailed in L<perlref>.
c07a80fd 1081This section may not make much sense to you otherwise.
cb1a09d0 1082
19799a22
GS
1083Here are a few simple examples. First, let's pass in several arrays
1084to a function and have it C<pop> all of them, returning a new list
1085of all their former last elements:
cb1a09d0
AD
1086
1087 @tailings = popmany ( \@a, \@b, \@c, \@d );
1088
1089 sub popmany {
1090 my $aref;
1091 my @retlist = ();
1092 foreach $aref ( @_ ) {
1093 push @retlist, pop @$aref;
54310121 1094 }
cb1a09d0 1095 return @retlist;
54310121 1096 }
cb1a09d0 1097
54310121 1098Here's how you might write a function that returns a
cb1a09d0
AD
1099list of keys occurring in all the hashes passed to it:
1100
54310121 1101 @common = inter( \%foo, \%bar, \%joe );
cb1a09d0
AD
1102 sub inter {
1103 my ($k, $href, %seen); # locals
1104 foreach $href (@_) {
1105 while ( $k = each %$href ) {
1106 $seen{$k}++;
54310121 1107 }
1108 }
cb1a09d0 1109 return grep { $seen{$_} == @_ } keys %seen;
54310121 1110 }
cb1a09d0 1111
5f05dabc 1112So far, we're using just the normal list return mechanism.
54310121 1113What happens if you want to pass or return a hash? Well,
1114if you're using only one of them, or you don't mind them
cb1a09d0 1115concatenating, then the normal calling convention is ok, although
54310121 1116a little expensive.
cb1a09d0
AD
1117
1118Where people get into trouble is here:
1119
1120 (@a, @b) = func(@c, @d);
1121or
1122 (%a, %b) = func(%c, %d);
1123
19799a22
GS
1124That syntax simply won't work. It sets just C<@a> or C<%a> and
1125clears the C<@b> or C<%b>. Plus, the function wasn't passed
1126two separate arrays or hashes: it got one long list in C<@_>,
1127as always.
cb1a09d0
AD
1128
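The flattening is easy to observe directly. In this small sketch, C<arg_count> is a made-up helper that simply reports how many arguments it received.

```perl
use strict;
use warnings;

sub arg_count { return scalar @_ }

my @c = (1, 2);
my @d = (3, 4, 5);

my $flattened = arg_count(@c, @d);      # one flat list of five scalars
my $by_ref    = arg_count(\@c, \@d);    # exactly two references

my (@first, @second);
(@first, @second) = (@c, @d);           # @first takes everything; @second is left empty
```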
1129If you can arrange for everyone to deal with this through references, it's
1130cleaner code, although not so nice to look at. Here's a function that
1131takes two array references as arguments, returning the two array references
1132in order of how many elements they have in them:
1133
1134 ($aref, $bref) = func(\@c, \@d);
1135 print "@$aref has more than @$bref\n";
1136 sub func {
1137 my ($cref, $dref) = @_;
1138 if (@$cref > @$dref) {
1139 return ($cref, $dref);
1140 } else {
c07a80fd 1141 return ($dref, $cref);
54310121 1142 }
1143 }
cb1a09d0
AD
1144
1145It turns out that you can actually do this also:
1146
1147 (*a, *b) = func(\@c, \@d);
1148 print "@a has more than @b\n";
1149 sub func {
1150 local (*c, *d) = @_;
1151 if (@c > @d) {
1152 return (\@c, \@d);
1153 } else {
1154 return (\@d, \@c);
54310121 1155 }
1156 }
cb1a09d0
AD
1157
1158Here we're using the typeglobs to do symbol table aliasing. It's
19799a22 1159a tad subtle, though, and also won't work if you're using C<my>
09bef843 1160variables, because only globals (even in disguise as C<local>s)
19799a22 1161are in the symbol table.
5f05dabc 1162
1163If you're passing around filehandles, you could usually just use the bare
19799a22
GS
1164typeglob, like C<*STDOUT>, but typeglob references work, too.
1165For example:
5f05dabc 1166
1167 splutter(\*STDOUT);
1168 sub splutter {
1169 my $fh = shift;
1170 print $fh "her um well a hmmm\n";
1171 }
1172
1173 $rec = get_rec(\*STDIN);
1174 sub get_rec {
1175 my $fh = shift;
1176 return scalar <$fh>;
1177 }
1178
19799a22
GS
1179If you're planning on generating new filehandles, you could do this.
1180Note that it passes back just the bare *FH, not a reference to it.
5f05dabc 1181
1182 sub openit {
19799a22 1183 my $path = shift;
5f05dabc 1184 local *FH;
e05a3a1e 1185 return open (FH, $path) ? *FH : undef;
54310121 1186 }
5f05dabc 1187
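A different technique from the one shown above, and one that avoids the symbol table altogether, is to let C<open> autovivify a lexical filehandle and return that. The following is only a sketch: C<open_for_reading> is a hypothetical name, and the temporary file created with the core C<File::Temp> module exists solely to make the example self-contained.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

sub open_for_reading {
    my ($path) = @_;
    open(my $fh, '<', $path) or return;   # undef in scalar context on failure
    return $fh;
}

# Create a small file to read back.
my ($tmp_fh, $tmp_path) = tempfile(UNLINK => 1);
print {$tmp_fh} "first line\n";
close $tmp_fh;

my $in = open_for_reading($tmp_path) or die "cannot open $tmp_path: $!";
my $line = <$in>;
close $in;

my $missing = open_for_reading("$tmp_path.does-not-exist");
```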
cb1a09d0 1188=head2 Prototypes
d74e8afc 1189X<prototype> X<subroutine, prototype>
cb1a09d0 1190
19799a22 1191Perl supports a very limited kind of compile-time argument checking
eedb00fa
PM
1192using function prototyping. This can be declared in either the PROTO
1193section or with a L<prototype attribute|attributes/Built-in Attributes>.
1194If you declare
cb1a09d0 1195
cba5a3b0 1196 sub mypush (+@)
cb1a09d0 1197
19799a22
GS
1198then C<mypush()> takes arguments exactly like C<push()> does. The
1199function declaration must be visible at compile time. The prototype
1200affects only interpretation of new-style calls to the function,
1201where new-style is defined as not using the C<&> character. In
1202other words, if you call it like a built-in function, then it behaves
1203like a built-in function. If you call it like an old-fashioned
1204subroutine, then it behaves like an old-fashioned subroutine. It
1205naturally falls out from this rule that prototypes have no influence
1206on subroutine references like C<\&foo> or on indirect subroutine
c47ff5f1 1207calls like C<&{$subref}> or C<< $subref->() >>.
c07a80fd 1208
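The difference between the calling styles can be seen in a short sketch. C<show_args> is a made-up function whose C<($)> prototype imposes scalar context on its argument, but only for new-style calls.

```perl
use strict;
use warnings;

sub show_args ($) { return join ',', @_ }

my @list = (10, 20, 30);
my $code = \&show_args;

my $new_style = show_args(@list);     # prototype applies: @list in scalar context
my $old_style = &show_args(@list);    # prototype ignored: @list is flattened
my $via_ref   = $code->(@list);       # prototype ignored here as well
```

The new-style call receives the element count, while the other two receive the three elements themselves.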
1209Method calls are not influenced by prototypes either, because the
19799a22
GS
1210function to be called is indeterminate at compile time, since
1211the exact code called depends on inheritance.
cb1a09d0 1212
19799a22
GS
1213Because the intent of this feature is primarily to let you define
1214subroutines that work like built-in functions, here are prototypes
1215for some other functions that parse almost exactly like the
1216corresponding built-in.
cb1a09d0 1217
555bd962
BG
1218 Declared as Called as
1219
1220 sub mylink ($$) mylink $old, $new
1221 sub myvec ($$$) myvec $var, $offset, 1
1222 sub myindex ($$;$) myindex &getstring, "substr"
1223 sub mysyswrite ($$$;$) mysyswrite $buf, 0, length($buf) - $off, $off
1224 sub myreverse (@) myreverse $a, $b, $c
1225 sub myjoin ($@) myjoin ":", $a, $b, $c
1226 sub mypop (+) mypop @array
1227 sub mysplice (+$$@) mysplice @array, 0, 2, @pushme
1228 sub mykeys (+) mykeys %{$hashref}
1229 sub myopen (*;$) myopen HANDLE, $name
1230 sub mypipe (**) mypipe READHANDLE, WRITEHANDLE
1231 sub mygrep (&@) mygrep { /foo/ } $a, $b, $c
1232 sub myrand (;$) myrand 42
1233 sub mytime () mytime
cb1a09d0 1234
c07a80fd 1235Any backslashed prototype character represents an actual argument
ae7a3cfa 1236that must start with that character (optionally preceded by C<my>,
b91b7d1a
FC
1237C<our> or C<local>), with the exception of C<$>, which will
1238accept any scalar lvalue expression, such as C<$foo = 7> or
74083ec6 1239C<< my_function()->[0] >>. The value passed as part of C<@_> will be a
ae7a3cfa
FC
1240reference to the actual argument given in the subroutine call,
1241obtained by applying C<\> to that argument.
c07a80fd 1242
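As a minimal sketch of a backslashed prototype, the hypothetical function C<combined_length> below is called with two plain arrays yet receives two array references.

```perl
use strict;
use warnings;

sub combined_length (\@\@) {
    my ($first, $second) = @_;        # two array references, not flattened lists
    return @$first + @$second;
}

my @short = (1, 2);
my @long  = (3, 4, 5);
my $total = combined_length(@short, @long);   # no backslashes needed at the call site
```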
c035a075
DG
1243You can use the C<\[]> backslash group notation to specify more than one
1244allowed argument type. For example:
5b794e05
JH
1245
1246 sub myref (\[$@%&*])
1247
1248will allow calling myref() as
1249
1250 myref $var
1251 myref @array
1252 myref %hash
1253 myref &sub
1254 myref *glob
1255
1256and the first argument of myref() will be a reference to
1257a scalar, an array, a hash, a code, or a glob.
1258
c07a80fd 1259Unbackslashed prototype characters have special meanings. Any
19799a22 1260unbackslashed C<@> or C<%> eats all remaining arguments, and forces
f86cebdf
GS
1261list context. An argument represented by C<$> forces scalar context. An
1262C<&> requires an anonymous subroutine, which, if passed as the first
0df79f0c
GS
1263argument, does not require the C<sub> keyword or a subsequent comma.
1264
1265A C<*> allows the subroutine to accept a bareword, constant, scalar expression,
648ca4f7
GS
1266typeglob, or a reference to a typeglob in that slot. The value will be
1267available to the subroutine either as a simple scalar, or (in the latter
0df79f0c
GS
1268two cases) as a reference to the typeglob. If you wish to always convert
1269such arguments to a typeglob reference, use Symbol::qualify_to_ref() as
1270follows:
1271
1272 use Symbol 'qualify_to_ref';
1273
1274 sub foo (*) {
1275 my $fh = qualify_to_ref(shift, caller);
1276 ...
1277 }
c07a80fd 1278
c035a075
DG
1279The C<+> prototype is a special alternative to C<$> that will act like
1280C<\[@%]> when given a literal array or hash variable, but will otherwise
1281force scalar context on the argument. This is useful for functions which
1282should accept either a literal array or an array reference as the argument:
1283
cba5a3b0 1284 sub mypush (+@) {
c035a075
DG
1285 my $aref = shift;
1286 die "Not an array or arrayref" unless ref $aref eq 'ARRAY';
1287 push @$aref, @_;
1288 }
1289
1290When using the C<+> prototype, your function must check that the argument
1291is of an acceptable type.
1292
859a4967 1293A semicolon (C<;>) separates mandatory arguments from optional arguments.
19799a22 1294It is redundant before C<@> or C<%>, which gobble up everything else.
cb1a09d0 1295
34daab0f
RGS
1296As the last character of a prototype, or just before a semicolon, a C<@>
1297or a C<%>, you can use C<_> in place of C<$>: if this argument is not
1298provided, C<$_> will be used instead.
859a4967 1299
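A small sketch of the C<_> prototype, assuming Perl 5.10 or later; C<shout> is a name invented for the example.

```perl
use strict;
use warnings;

sub shout (_) { return uc $_[0] }

my $explicit = shout("hello");    # argument supplied by the caller

my @defaults;
for (qw(foo bar)) {
    push @defaults, shout;        # no argument given, so $_ is passed
}
```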
19799a22
GS
1300Note how the last three examples in the table above are treated
1301specially by the parser. C<mygrep()> is parsed as a true list
1302operator, C<myrand()> is parsed as a true unary operator with unary
1303precedence the same as C<rand()>, and C<mytime()> is truly without
1304arguments, just like C<time()>. That is, if you say
cb1a09d0
AD
1305
1306 mytime +2;
1307
f86cebdf 1308you'll get C<mytime() + 2>, not C<mytime(2)>, which is how it would be parsed
3a8944db
FC
1309without a prototype. If you want to force a unary function to have the
1310same precedence as a list operator, add C<;> to the end of the prototype:
1311
1312 sub mygetprotobynumber($;);
1313 mygetprotobynumber $a > $b; # parsed as mygetprotobynumber($a > $b)
cb1a09d0 1314
19799a22
GS
1315The interesting thing about C<&> is that you can generate new syntax with it,
1316provided it's in the initial position:
d74e8afc 1317X<&>
cb1a09d0 1318
6d28dffb 1319 sub try (&@) {
cb1a09d0
AD
1320 my($try,$catch) = @_;
1321 eval { &$try };
1322 if ($@) {
1323 local $_ = $@;
1324 &$catch;
1325 }
1326 }
55497cff 1327 sub catch (&) { $_[0] }
cb1a09d0
AD
1328
1329 try {
1330 die "phooey";
1331 } catch {
1332 /phooey/ and print "unphooey\n";
1333 };
1334
f86cebdf 1335That prints C<"unphooey">. (Yes, there are still unresolved
19799a22 1336issues having to do with visibility of C<@_>. I'm ignoring that
f86cebdf 1337question for the moment. (But note that if we make C<@_> lexically
cb1a09d0 1338scoped, those anonymous subroutines can act like closures... (Gee,
5f05dabc 1339is this sounding a little Lispish? (Never mind.))))
cb1a09d0 1340
19799a22 1341And here's a reimplementation of the Perl C<grep> operator:
d74e8afc 1342X<grep>
cb1a09d0
AD
1343
1344 sub mygrep (&@) {
1345 my $code = shift;
1346 my @result;
1347 foreach $_ (@_) {
6e47f808 1348 push(@result, $_) if &$code;
cb1a09d0
AD
1349 }
1350 @result;
1351 }
a0d0e21e 1352
cb1a09d0
AD
1353Some folks would prefer full alphanumeric prototypes. Alphanumerics have
1354been intentionally left out of prototypes for the express purpose of
1355someday in the future adding named, formal parameters. The current
1356mechanism's main goal is to let module writers provide better diagnostics
1357for module users. Larry feels the notation quite understandable to Perl
1358programmers, and that it will not intrude greatly upon the meat of the
1359module, nor make it harder to read. The line noise is visually
1360encapsulated into a small pill that's easy to swallow.
1361
420cdfc1
ST
1362If you try to use an alphanumeric sequence in a prototype you will
1363generate an optional warning - "Illegal character in prototype...".
1364Unfortunately earlier versions of Perl allowed the prototype to be
1365used as long as its prefix was a valid prototype. The warning may be
1366upgraded to a fatal error in a future version of Perl once the
1367majority of offending code is fixed.
1368
cb1a09d0
AD
1369It's probably best to prototype new functions, not retrofit prototyping
1370into older ones. That's because you must be especially careful about
1371silent impositions of differing list versus scalar contexts. For example,
1372if you decide that a function should take just one parameter, like this:
1373
1374 sub func ($) {
1375 my $n = shift;
1376 print "you gave me $n\n";
54310121 1377 }
cb1a09d0
AD
1378
1379and someone has been calling it with an array or expression
1380returning a list:
1381
1382 func(@foo);
1383 func( split /:/ );
1384
19799a22 1385Then you've just supplied an automatic C<scalar> in front of their
f86cebdf 1386argument, which can be more than a bit surprising. The old C<@foo>
cb1a09d0 1387which used to hold one thing doesn't get passed in. Instead,
19799a22
GS
1388C<func()> now gets passed in a C<1>; that is, the number of elements
1389in C<@foo>. And the C<split> gets called in scalar context so it
1390starts scribbling on your C<@_> parameter list. Ouch!
cb1a09d0 1391
eb40d2ca
PM
1392If a sub has both a PROTO and a BLOCK, the prototype is not applied
1393until after the BLOCK is completely defined. This means that a recursive
1394function with a prototype has to be predeclared for the prototype to take
1395effect, like so:
1396
1397 sub foo($$);
1398 sub foo($$) {
1399 foo 1, 2;
1400 }
1401
5f05dabc 1402This is all very powerful, of course, and should be used only in moderation
54310121 1403to make the world a better place.
44a8e56a 1404
1405=head2 Constant Functions
d74e8afc 1406X<constant>
44a8e56a 1407
1408Functions with a prototype of C<()> are potential candidates for
19799a22
GS
1409inlining. If the result after optimization and constant folding
1410is either a constant or a lexically-scoped scalar which has no other
54310121 1411references, then it will be used in place of function calls made
19799a22
GS
1412without C<&>. Calls made using C<&> are never inlined. (See
1413F<constant.pm> for an easy way to declare most constants.)
44a8e56a 1414
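For instance, some of the flag constants used in this section could be declared with the C<constant> pragma roughly as follows. This is a sketch; the subroutines it creates have an empty prototype, which is what makes them candidates for inlining.

```perl
use strict;
use warnings;

use constant {
    FLAG_FOO => 1 << 8,
    FLAG_BAR => 1 << 9,
};
use constant FLAG_MASK => FLAG_FOO | FLAG_BAR;

my $mask    = FLAG_MASK;
my $foo_set = (FLAG_MASK & FLAG_FOO) ? 1 : 0;
my $proto   = prototype(\&FLAG_FOO);    # the empty string, i.e. a () prototype
```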
5a964f20 1415The following functions would all be inlined:
44a8e56a 1416
699e6cd4
TP
1417 sub pi () { 3.14159 } # Not exact, but close.
1418 sub PI () { 4 * atan2 1, 1 } # As good as it gets,
1419 # and it's inlined, too!
44a8e56a 1420 sub ST_DEV () { 0 }
1421 sub ST_INO () { 1 }
1422
1423 sub FLAG_FOO () { 1 << 8 }
1424 sub FLAG_BAR () { 1 << 9 }
1425 sub FLAG_MASK () { FLAG_FOO | FLAG_BAR }
54310121 1426
1427 sub OPT_BAZ () { not (0x1B58 & FLAG_MASK) }
88267271
PZ
1428
1429 sub N () { int(OPT_BAZ) / 3 }
1430
1431 sub FOO_SET () { 1 if FLAG_MASK & FLAG_FOO }
1432
1433Be aware that these will not be inlined; as they contain inner scopes,
1434the constant folding doesn't reduce them to a single constant:
1435
1436 sub foo_set () { if (FLAG_MASK & FLAG_FOO) { 1 } }
1437
1438 sub baz_val () {
44a8e56a 1439 if (OPT_BAZ) {
1440 return 23;
1441 }
1442 else {
1443 return 42;
1444 }
1445 }
cb1a09d0 1446
5a964f20 1447If you redefine a subroutine that was eligible for inlining, you'll get
2dc1f7e5 1448a warning by default. (You can use this warning to tell whether or not a
e4fde5ca 1449particular subroutine is considered inlinable.) The warning is
2dc1f7e5
FC
1450considered severe enough not to be affected by the B<-w>
1451switch (or its absence) because previously compiled
4cee8e80 1452invocations of the function will still be using the old value of the
19799a22 1453function. If you need to be able to redefine the subroutine, you need to
4cee8e80 1454ensure that it isn't inlined, either by dropping the C<()> prototype
19799a22 1455(which changes calling semantics, so beware) or by thwarting the
4cee8e80
CS
1456inlining mechanism in some other way, such as
1457
4cee8e80 1458 sub not_inlined () {
54310121 1459 23 if $];
4cee8e80
CS
1460 }
1461
19799a22 1462=head2 Overriding Built-in Functions
d74e8afc 1463X<built-in> X<override> X<CORE> X<CORE::GLOBAL>
a0d0e21e 1464
19799a22 1465Many built-in functions may be overridden, though this should be tried
5f05dabc 1466only occasionally and for good reason. Typically this might be
19799a22 1467done by a package attempting to emulate missing built-in functionality
a0d0e21e
LW
1468on a non-Unix system.
1469
163e3a99
JP
1470Overriding may be done only by importing the name from a module at
1471compile time--ordinary predeclaration isn't good enough. However, the
19799a22
GS
1472C<use subs> pragma lets you, in effect, predeclare subs
1473via the import syntax, and these names may then override built-in ones:
a0d0e21e
LW
1474
1475 use subs 'chdir', 'chroot', 'chmod', 'chown';
1476 chdir $somewhere;
1477 sub chdir { ... }
1478
19799a22
GS
1479To unambiguously refer to the built-in form, precede the
1480built-in name with the special package qualifier C<CORE::>. For example,
1481saying C<CORE::open()> always refers to the built-in C<open()>, even
fb73857a 1482if the current package has imported some other subroutine called
19799a22 1483C<&open()> from elsewhere. Even though it looks like a regular
4aaa4757
FC
1484function call, it isn't: the CORE:: prefix in that case is part of Perl's
1485syntax, and works for any keyword, regardless of what is in the CORE
1486package. Taking a reference to it, that is, C<\&CORE::open>, only works
1487for some keywords. See L<CORE>.
fb73857a 1488
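Putting the two mechanisms together, here is a minimal sketch of an override that delegates to the built-in through C<CORE::>. The choice of C<hex> and the C<wrapped:> prefix are arbitrary and serve only to show which version was called.

```perl
use strict;
use warnings;

use subs 'hex';                     # predeclare via import so the override takes effect

sub hex {
    my ($str) = @_;
    return 'wrapped:' . CORE::hex($str);   # CORE:: always reaches the built-in
}

my $overridden = hex('ff');         # calls the subroutine above
my $builtin    = CORE::hex('ff');   # calls the real built-in
```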
19799a22
GS
1489Library modules should not in general export built-in names like C<open>
1490or C<chdir> as part of their default C<@EXPORT> list, because these may
a0d0e21e 1491sneak into someone else's namespace and change the semantics unexpectedly.
19799a22 1492Instead, if the module adds that name to C<@EXPORT_OK>, then it's
a0d0e21e
LW
1493possible for a user to import the name explicitly, but not implicitly.
1494That is, they could say
1495
1496 use Module 'open';
1497
19799a22 1498and it would import the C<open> override. But if they said
a0d0e21e
LW
1499
1500 use Module;
1501
19799a22 1502they would get the default imports without overrides.
a0d0e21e 1503
19799a22 1504The foregoing mechanism for overriding built-ins is restricted, quite
95d94a4f 1505deliberately, to the package that requests the import. There is a second
19799a22 1506method that is sometimes applicable when you wish to override a built-in
95d94a4f
GS
1507everywhere, without regard to namespace boundaries. This is achieved by
1508importing a sub into the special namespace C<CORE::GLOBAL::>. Here is an
1509example that quite brazenly replaces the C<glob> operator with something
1510that understands regular expressions.
1511
1512 package REGlob;
1513 require Exporter;
1514 @ISA = 'Exporter';
1515 @EXPORT_OK = 'glob';
1516
1517 sub import {
1518 my $pkg = shift;
1519 return unless @_;
1520 my $sym = shift;
1521 my $where = ($sym =~ s/^GLOBAL_// ? 'CORE::GLOBAL' : caller(0));
1522 $pkg->export($where, $sym, @_);
1523 }
1524
1525 sub glob {
1526 my $pat = shift;
1527 my @got;
7b815c67
RGS
1528 if (opendir my $d, '.') {
1529 @got = grep /$pat/, readdir $d;
1530 closedir $d;
19799a22
GS
1531 }
1532 return @got;
95d94a4f
GS
1533 }
1534 1;
1535
1536And here's how it could be (ab)used:
1537
1538 #use REGlob 'GLOBAL_glob'; # override glob() in ALL namespaces
1539 package Foo;
1540 use REGlob 'glob'; # override glob() in Foo:: only
1541 print for <^[a-z_]+\.pm\$>; # show all pragmatic modules
1542
19799a22 1543The initial comment shows a contrived, even dangerous example.
95d94a4f 1544By overriding C<glob> globally, you would be forcing the new (and
19799a22 1545subversive) behavior for the C<glob> operator for I<every> namespace,
95d94a4f
GS
1546without the complete cognizance or cooperation of the modules that own
1547those namespaces. Naturally, this should be done with extreme caution--if
1548it must be done at all.
1549
1550The C<REGlob> example above does not implement all the support needed to
19799a22 1551cleanly override perl's C<glob> operator. The built-in C<glob> has
95d94a4f 1552different behaviors depending on whether it appears in a scalar or list
19799a22 1553context, but our C<REGlob> doesn't. Indeed, many perl built-ins have such
95d94a4f
GS
1554context-sensitive behaviors, and these must be adequately supported by
1555a properly written override. For a fully functional example of overriding
1556C<glob>, study the implementation of C<File::DosGlob> in the standard
1557library.
1558
77bc9082
RGS
1559When you override a built-in, your replacement should be consistent (if
1560possible) with the built-in native syntax. You can achieve this by using
1561a suitable prototype. To get the prototype of an overridable built-in,
1562use the C<prototype> function with an argument of C<"CORE::builtin_name">
1563(see L<perlfunc/prototype>).
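
As a small illustration, this loop asks C<prototype> about a handful of
built-ins. The choice of names is arbitrary. A built-in whose syntax cannot
be expressed by a prototype yields C<undef>.

```perl
use strict;
use warnings;

for my $name (qw(time sysopen system chomp)) {
    my $proto = prototype("CORE::$name");
    printf "%-8s %s\n", $name,
        defined $proto
            ? "($proto)"
            : 'no prototype can express its syntax';
}
```

The string returned for a given built-in can then be reused verbatim in the
declaration of its replacement.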
1564
1565Note however that some built-ins can't have their syntax expressed by a
1566prototype (such as C<system> or C<chomp>). If you override them you won't
1567be able to fully mimic their original syntax.
1568
fe854a6f 1569The built-ins C<do>, C<require> and C<glob> can also be overridden, but due
77bc9082
RGS
1570to special magic, their original syntax is preserved, and you don't have
1571to define a prototype for their replacements. (You can't override the
1572C<do BLOCK> syntax, though.)
1573
1574C<require> has special additional dark magic: if you invoke your
1575C<require> replacement as C<require Foo::Bar>, it will actually receive
1576the argument C<"Foo/Bar.pm"> in @_. See L<perlfunc/require>.
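
The following sketch makes that translation visible. It assumes a
hypothetical log array C<@REQUESTED> and installs a global C<require>
replacement that records its argument before handing over to the real
C<require>.

```perl
use strict;
use warnings;

our @REQUESTED;    # hypothetical log of what the override was handed

BEGIN {
    *CORE::GLOBAL::require = sub {
        my ($arg) = @_;
        push @REQUESTED, $arg;
        return CORE::require($arg);    # delegate to the built-in
    };
}

require File::Basename;

# The first entry is the file name, not the bareword module name.
print "$REQUESTED[0]\n";
```

Any C<require> or C<use> performed while loading the module passes through
the same replacement, so the log may hold more than one entry.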
1577
1578And, as you'll have noticed from the previous example, if you override
593b9c14 1579C<glob>, the C<< <*> >> glob operator is overridden as well.
77bc9082 1580
9b3023bc 1581In a similar fashion, overriding the C<readline> function also overrides
e3f73d4e
RGS
1582the equivalent I/O operator C<< <FILEHANDLE> >>. Likewise, overriding
1583C<readpipe> overrides the operators C<``> and C<qx//>.
9b3023bc 1584
fe854a6f 1585Finally, some built-ins (e.g. C<exists> or C<grep>) can't be overridden.
77bc9082 1586
a0d0e21e 1587=head2 Autoloading
d74e8afc 1588X<autoloading> X<AUTOLOAD>
a0d0e21e 1589
19799a22
GS
1590If you call a subroutine that is undefined, you would ordinarily
1591get an immediate, fatal error complaining that the subroutine doesn't
1592exist. (Likewise for subroutines being used as methods, when the
1593method doesn't exist in any base class of the class's package.)
1594However, if an C<AUTOLOAD> subroutine is defined in the package or
1595packages used to locate the original subroutine, then that
1596C<AUTOLOAD> subroutine is called with the arguments that would have
1597been passed to the original subroutine. The fully qualified name
1598of the original subroutine magically appears in the global $AUTOLOAD
1599variable of the same package as the C<AUTOLOAD> routine. The name
1600is not passed as an ordinary argument because, er, well, just
593b9c14 1601because, that's why. (As an exception, a method call to a nonexistent
80ee23cd 1602C<import> or C<unimport> method is just skipped instead. Also, if
5b36e945
FC
1603the AUTOLOAD subroutine is an XSUB, there are other ways to retrieve the
1604subroutine name. See L<perlguts/Autoloading with XSUBs> for details.)
80ee23cd 1605
19799a22
GS
1606
1607Many C<AUTOLOAD> routines load in a definition for the requested
1608subroutine using eval(), then execute that subroutine using a special
1609form of goto() that erases the stack frame of the C<AUTOLOAD> routine
1610without a trace. (See the source to the standard module documented
1611in L<AutoLoader>, for example.) But an C<AUTOLOAD> routine can
1612also just emulate the routine and never define it. For example,
1613let's pretend that a function that wasn't defined should just invoke
1614C<system> with those arguments. All you'd do is:
cb1a09d0
AD
1615
1616 sub AUTOLOAD {
1617 my $program = $AUTOLOAD;
1618 $program =~ s/.*:://;
1619 system($program, @_);
54310121 1620 }
cb1a09d0 1621 date();
6d28dffb 1622 who('am', 'i');
cb1a09d0
AD
1623 ls('-l');
1624
19799a22
GS
1625In fact, if you predeclare functions you want to call that way, you don't
1626even need parentheses:
cb1a09d0
AD
1627
1628 use subs qw(date who ls);
1629 date;
1630 who "am", "i";
593b9c14 1631 ls '-l';
cb1a09d0 1632
13058d67 1633A more complete example of this is the Shell module on CPAN, which
19799a22 1634can treat undefined subroutine calls as calls to external programs.
a0d0e21e 1635
19799a22
GS
1636Mechanisms are available to help module writers split their modules
1637into autoloadable files. See the standard AutoLoader module
6d28dffb 1638described in L<AutoLoader> and in L<AutoSplit>, the standard
1639SelfLoader module in L<SelfLoader>, and the document on adding C
19799a22 1640functions to Perl code in L<perlxs>.
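
The other style mentioned earlier in this section, where C<AUTOLOAD>
defines the missing subroutine and then jumps to it with C<goto>, might be
sketched like this. The class C<Counter> and its field list are
hypothetical, and a closure is installed in the symbol table here rather
than a string compiled with C<eval>. After the first call the accessor
exists as a real subroutine, so C<AUTOLOAD> is not consulted for it again.

```perl
use strict;
use warnings;

package Counter;    # hypothetical class with two autoloaded accessors

our $AUTOLOAD;
my %FIELD = map { $_ => 1 } qw(name count);

sub new {
    my ($class, %args) = @_;
    return bless {%args}, $class;
}

sub AUTOLOAD {
    my $name = $AUTOLOAD;
    $name =~ s/.*:://;
    return if $name eq 'DESTROY';    # do not autoload destructors
    die "No such method: $AUTOLOAD" unless $FIELD{$name};

    no strict 'refs';
    # Define the accessor under its fully qualified name ...
    *{$AUTOLOAD} = sub {
        my $self = shift;
        $self->{$name} = shift if @_;
        return $self->{$name};
    };
    # ... then replace this stack frame with a call to it, keeping @_.
    goto &{$AUTOLOAD};
}

package main;

my $c = Counter->new(name => 'hits', count => 3);
printf "%s %d\n", $c->name, $c->count;    # prints "hits 3"
```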
cb1a09d0 1641
09bef843 1642=head2 Subroutine Attributes
d74e8afc 1643X<attribute> X<subroutine, attribute> X<attrs>
09bef843
SB
1644
1645A subroutine declaration or definition may have a list of attributes
1646associated with it. If such an attribute list is present, it is
0120eecf 1647broken up at space or colon boundaries and treated as though a
09bef843
SB
1648C<use attributes> had been seen. See L<attributes> for details
1649about what attributes are currently supported.
1650Unlike the limitation with the obsolescent C<use attrs>, the
1651C<sub : ATTRLIST> syntax works to associate the attributes with
1652a pre-declaration, and not just with a subroutine definition.
1653
1654The attributes must be valid as simple identifier names (without any
1655punctuation other than the '_' character). They may have a parameter
1656list appended, which is only checked for whether its parentheses ('(',')')
1657nest properly.
1658
1659Examples of valid syntax (even though the attributes are unknown):
1660
4358a253
SS
1661 sub fnord (&\%) : switch(10,foo(7,3)) : expensive;
1662 sub plugh () : Ugly('\(") :Bad;
09bef843
SB
1663 sub xyzzy : _5x5 { ... }
1664
1665Examples of invalid syntax:
1666
4358a253
SS
1667 sub fnord : switch(10,foo(); # ()-string not balanced
1668 sub snoid : Ugly('('); # ()-string not balanced
1669 sub xyzzy : 5x5; # "5x5" not a valid identifier
1670 sub plugh : Y2::north; # "Y2::north" not a simple identifier
1671 sub snurt : foo + bar; # "+" not a colon or space
09bef843
SB
1672
1673The attribute list is passed as a list of constant strings to the code
1674which associates them with the subroutine. In particular, the second example
1675of valid syntax above currently looks like this in terms of how it's
1676parsed and invoked:
1677
1678 use attributes __PACKAGE__, \&plugh, q[Ugly('\(")], 'Bad';
1679
1680For further details on attribute lists and their manipulation,
a0ae32d3 1681see L<attributes> and L<Attribute::Handlers>.
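
To show the strings arriving, here is a minimal sketch that relies on the
C<MODIFY_CODE_ATTRIBUTES> hook documented in L<attributes>. The hash
C<%SEEN> is hypothetical. The hook must be defined before the attributed
subroutine is compiled, and it returns the list of attributes it does not
recognize, which is empty here.

```perl
use strict;
use warnings;

our %SEEN;    # hypothetical record: package name => attribute strings

sub MODIFY_CODE_ATTRIBUTES {
    my ($package, $coderef, @attrs) = @_;
    push @{ $SEEN{$package} }, @attrs;
    return;    # empty list: every attribute was accepted
}

sub plugh () : Ugly('\(") :Bad { return 'plugh' }

print "$_\n" for @{ $SEEN{main} };
```

Without such a hook, the same declaration would be rejected at compile time
because neither attribute is built in.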
09bef843 1682
cb1a09d0 1683=head1 SEE ALSO
a0d0e21e 1684
19799a22
GS
1685See L<perlref/"Function Templates"> for more about references and closures.
1686See L<perlxs> if you'd like to learn about calling C subroutines from Perl.
a2293a43 1687See L<perlembed> if you'd like to learn about calling Perl subroutines from C.
19799a22
GS
1688See L<perlmod> to learn about bundling up your functions in separate files.
1689See L<perlmodlib> to learn what library modules come standard on your system.
82e1c0d9 1690See L<perlootut> to learn how to make object method calls.