Commit | Line | Data |
---|---|---|
a0d0e21e | 1 | =head1 NAME |
d74e8afc | 2 | X<subroutine> X<function> |
a0d0e21e LW |
3 | |
4 | perlsub - Perl subroutines | |
5 | ||
6 | =head1 SYNOPSIS | |
7 | ||
8 | To declare subroutines: | |
d74e8afc | 9 | X<subroutine, declaration> X<sub> |
a0d0e21e | 10 | |
09bef843 SB |
11 | sub NAME; # A "forward" declaration. |
12 | sub NAME(PROTO); # ditto, but with prototypes | |
13 | sub NAME : ATTRS; # with attributes | |
14 | sub NAME(PROTO) : ATTRS; # with attributes and prototypes | |
cb1a09d0 | 15 | |
09bef843 SB |
16 | sub NAME BLOCK # A declaration and a definition. |
17 | sub NAME(PROTO) BLOCK # ditto, but with prototypes | |
18 | sub NAME : ATTRS BLOCK # with attributes | |
19 | sub NAME(PROTO) : ATTRS BLOCK # with prototypes and attributes | |
894f226e DM |
20 | |
21 | use feature 'signatures'; | |
22 | sub NAME(SIG) BLOCK # with signature | |
23 | sub NAME :ATTRS (SIG) BLOCK # with signature, attributes | |
24 | sub NAME :prototype(PROTO) (SIG) BLOCK # with signature, prototype | |
a0d0e21e | 25 | |
748a9306 | 26 | To define an anonymous subroutine at runtime: |
d74e8afc | 27 | X<subroutine, anonymous> |
748a9306 | 28 | |
09bef843 SB |
29 | $subref = sub BLOCK; # no proto |
30 | $subref = sub (PROTO) BLOCK; # with proto | |
31 | $subref = sub : ATTRS BLOCK; # with attributes | |
32 | $subref = sub (PROTO) : ATTRS BLOCK; # with proto and attributes | |
894f226e DM |
33 | |
34 | use feature 'signatures'; | |
35 | $subref = sub (SIG) BLOCK; # with signature | |
36 | $subref = sub : ATTRS(SIG) BLOCK; # with signature, attributes | |
748a9306 | 37 | |
a0d0e21e | 38 | To import subroutines: |
d74e8afc | 39 | X<import> |
a0d0e21e | 40 | |
19799a22 | 41 | use MODULE qw(NAME1 NAME2 NAME3); |
a0d0e21e LW |
42 | |
43 | To call subroutines: | |
d74e8afc | 44 | X<subroutine, call> X<call> |
a0d0e21e | 45 | |
5f05dabc | 46 | NAME(LIST); # & is optional with parentheses. |
54310121 | 47 | NAME LIST; # Parentheses optional if predeclared/imported. |
19799a22 | 48 | &NAME(LIST); # Circumvent prototypes. |
5a964f20 | 49 | &NAME; # Makes current @_ visible to called subroutine. |
a0d0e21e LW |
50 | |
51 | =head1 DESCRIPTION | |
52 | ||
19799a22 GS |
53 | Like many languages, Perl provides for user-defined subroutines. |
54 | These may be located anywhere in the main program, loaded in from | |
55 | other files via the C<do>, C<require>, or C<use> keywords, or | |
be3174d2 | 56 | generated on the fly using C<eval> or anonymous subroutines. |
19799a22 GS |
57 | You can even call a function indirectly using a variable containing |
58 | its name or a CODE reference. | |
cb1a09d0 AD |
59 | |
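The indirect calls mentioned above can be sketched as follows. This is a minimal illustration, not part of the original text: the sub name C<greet> and the variable names are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper used only for this illustration.
sub greet { return "hello, $_[0]" }

# Indirect call through a CODE reference.
my $code    = \&greet;
my $via_ref = $code->('world');

# Indirect call through a variable holding the sub's name.
# This is a symbolic reference, so strict 'refs' is relaxed locally.
my $name     = 'greet';
my $via_name = do { no strict 'refs'; &$name('again') };
```

The CODE reference form is usually preferable, since it works under C<use strict> without any exemption.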
60 | The Perl model for function call and return values is simple: all | |
61 | functions are passed as parameters one single flat list of scalars, and | |
62 | all functions likewise return to their caller one single flat list of | |
63 | scalars. Any arrays or hashes in these call and return lists will | |
64 | collapse, losing their identities--but you may always use | |
65 | pass-by-reference instead to avoid this. Both call and return lists may | |
66 | contain as many or as few scalar elements as you'd like. (Often a | |
67 | function without an explicit return statement is called a subroutine, but | |
19799a22 | 68 | there's really no difference from Perl's perspective.) |
d74e8afc | 69 | X<subroutine, parameter> X<parameter> |
19799a22 | 70 | |
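As a rough sketch of the flattening described above, compare passing two arrays directly with passing references to them. The sub names C<count_flat> and C<count_refs> are assumptions made for this example.

```perl
use strict;
use warnings;

# Hypothetical subs for illustration only.
sub count_flat { return scalar @_ }    # sees one flat list
sub count_refs {
    my ($x, $y) = @_;                  # two array references
    return (scalar @$x, scalar @$y);
}

my @first  = (1, 2);
my @second = (3, 4, 5);

# The two arrays collapse into a single five-element list.
my $flat = count_flat(@first, @second);

# Passing references keeps each array's identity.
my @per_array = count_refs(\@first, \@second);
```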
30d9c59b Z |
71 | Any arguments passed in show up in the array C<@_>. |
72 | (They may also show up in lexical variables introduced by a signature; | |
73 | see L</Signatures> below.) Therefore, if | |
19799a22 GS |
74 | you called a function with two arguments, those would be stored in |
75 | C<$_[0]> and C<$_[1]>. The array C<@_> is a local array, but its | |
76 | elements are aliases for the actual scalar parameters. In particular, | |
77 | if an element C<$_[0]> is updated, the corresponding argument is | |
78 | updated (or an error occurs if it is not updatable). If an argument | |
79 | is an array or hash element which did not exist when the function | |
80 | was called, that element is created only when (and if) it is modified | |
81 | or a reference to it is taken. (Some earlier versions of Perl | |
82 | created the element whether or not the element was assigned to.) | |
83 | Assigning to the whole array C<@_> removes that aliasing, and does | |
84 | not update any arguments. | |
d74e8afc | 85 | X<subroutine, argument> X<argument> X<@_> |
19799a22 | 86 | |
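The aliasing rules above can be illustrated with a short sketch. The sub names here are hypothetical and exist only for the demonstration.

```perl
use strict;
use warnings;

sub touch_nothing { return }              # never modifies its argument
sub set_first     { $_[0] = 'set' }       # writes through the alias
sub replace_all   { @_ = ('ignored') }    # assigns to @_ as a whole

my %h;

# A nonexistent hash element passed as an argument is not created
# unless the sub modifies it or takes a reference to it.
touch_nothing($h{a});
my $untouched_exists = exists $h{a} ? 1 : 0;

# Modifying $_[0] creates and updates the caller's element.
set_first($h{b});
my $b_value = $h{b};

# Assigning to the whole array @_ breaks the aliasing.
my $v = 'original';
replace_all($v);
```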
dbb128be XN |
87 | A C<return> statement may be used to exit a subroutine, optionally |
88 | specifying the returned value, which will be evaluated in the | |
89 | appropriate context (list, scalar, or void) depending on the context of | |
90 | the subroutine call. If you specify no return value, the subroutine | |
91 | returns an empty list in list context, the undefined value in scalar | |
92 | context, or nothing in void context. If you return one or more | |
93 | aggregates (arrays and hashes), these will be flattened together into | |
94 | one large indistinguishable list. | |
95 | ||
96 | If no C<return> is found and if the last statement is an expression, its | |
b77865f5 FC |
97 | value is returned. If the last statement is a loop control structure |
98 | like a C<foreach> or a C<while>, the returned value is unspecified. The | |
9a989771 | 99 | empty sub returns the empty list. |
d74e8afc | 100 | X<subroutine, return value> X<return value> X<return> |
19799a22 | 101 | |
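A small sketch of the return rules just described, using made-up sub names:

```perl
use strict;
use warnings;

sub bare_return { return }                  # no return value specified
sub implicit    { my ($n) = @_; $n * 2 }    # value of the last expression
sub empty_sub   { }                         # the empty sub

my @list   = bare_return();    # empty list in list context
my $scalar = bare_return();    # undef in scalar context
my $twice  = implicit(21);     # implicit return of $n * 2
my @empty  = empty_sub();      # empty list
```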
30d9c59b | 102 | Aside from an experimental facility (see L</Signatures> below), |
19799a22 GS |
103 | Perl does not have named formal parameters. In practice all you |
104 | do is assign to a C<my()> list of these. Variables that aren't | |
105 | declared to be private are global variables. For gory details | |
5a0de581 LM |
106 | on creating private variables, see L</"Private Variables via my()"> |
107 | and L</"Temporary Values via local()">. To create protected | |
19799a22 GS |
108 | environments for a set of functions in a separate package (and |
109 | probably a separate file), see L<perlmod/"Packages">. | |
d74e8afc | 110 | X<formal parameter> X<parameter, formal> |
a0d0e21e LW |
111 | |
112 | Example: | |
113 | ||
cb1a09d0 AD |
114 | sub max { |
115 | my $max = shift(@_); | |
a0d0e21e LW |
116 | foreach $foo (@_) { |
117 | $max = $foo if $max < $foo; | |
118 | } | |
cb1a09d0 | 119 | return $max; |
a0d0e21e | 120 | } |
cb1a09d0 | 121 | $bestday = max($mon,$tue,$wed,$thu,$fri); |
a0d0e21e LW |
122 | |
123 | Example: | |
124 | ||
125 | # get a line, combining continuation lines | |
126 | # that start with whitespace | |
127 | ||
128 | sub get_line { | |
19799a22 | 129 | $thisline = $lookahead; # global variables! |
54310121 | 130 | LINE: while (defined($lookahead = <STDIN>)) { |
a0d0e21e LW |
131 | if ($lookahead =~ /^[ \t]/) { |
132 | $thisline .= $lookahead; | |
133 | } | |
134 | else { | |
135 | last LINE; | |
136 | } | |
137 | } | |
19799a22 | 138 | return $thisline; |
a0d0e21e LW |
139 | } |
140 | ||
141 | $lookahead = <STDIN>; # get first line | |
19799a22 | 142 | while (defined($line = get_line())) { |
a0d0e21e LW |
143 | ... |
144 | } | |
145 | ||
09bef843 | 146 | Assigning to a list of private variables to name your arguments: |
a0d0e21e LW |
147 | |
148 | sub maybeset { | |
149 | my($key, $value) = @_; | |
cb1a09d0 | 150 | $Foo{$key} = $value unless $Foo{$key}; |
a0d0e21e LW |
151 | } |
152 | ||
19799a22 GS |
153 | Because the assignment copies the values, this also has the effect |
154 | of turning call-by-reference into call-by-value. Otherwise a | |
155 | function is free to do in-place modifications of C<@_> and change | |
156 | its caller's values. | |
d74e8afc | 157 | X<call-by-reference> X<call-by-value> |
cb1a09d0 AD |
158 | |
159 | upcase_in($v1, $v2); # this changes $v1 and $v2 | |
160 | sub upcase_in { | |
54310121 | 161 | for (@_) { tr/a-z/A-Z/ } |
162 | } | |
cb1a09d0 AD |
163 | |
164 | You aren't allowed to modify constants in this way, of course. If an | |
165 | argument were actually literal and you tried to change it, you'd take a | |
166 | (presumably fatal) exception. For example, this won't work: | |
d74e8afc | 167 | X<call-by-reference> X<call-by-value> |
cb1a09d0 AD |
168 | |
169 | upcase_in("frederick"); | |
170 | ||
f86cebdf | 171 | It would be much safer if the C<upcase_in()> function |
cb1a09d0 AD |
172 | were written to return a copy of its parameters instead |
173 | of changing them in place: | |
174 | ||
19799a22 | 175 | ($v3, $v4) = upcase($v1, $v2); # this doesn't change $v1 and $v2 |
cb1a09d0 | 176 | sub upcase { |
54310121 | 177 | return unless defined wantarray; # void context, do nothing |
cb1a09d0 | 178 | my @parms = @_; |
54310121 | 179 | for (@parms) { tr/a-z/A-Z/ } |
c07a80fd | 180 | return wantarray ? @parms : $parms[0]; |
54310121 | 181 | } |
cb1a09d0 | 182 | |
19799a22 | 183 | Notice how this (unprototyped) function doesn't care whether it was |
a2293a43 | 184 | passed real scalars or arrays. Perl sees all arguments as one big, |
19799a22 GS |
185 | long, flat parameter list in C<@_>. This is one area where |
186 | Perl's simple argument-passing style shines. The C<upcase()> | |
187 | function would work perfectly well without changing the C<upcase()> | |
188 | definition even if we fed it things like this: | |
cb1a09d0 AD |
189 | |
190 | @newlist = upcase(@list1, @list2); | |
191 | @newlist = upcase( split /:/, $var ); | |
192 | ||
193 | Do not, however, be tempted to do this: | |
194 | ||
195 | (@a, @b) = upcase(@list1, @list2); | |
196 | ||
19799a22 GS |
197 | Like the flattened incoming parameter list, the return list is also |
198 | flattened on return. So all you have managed to do here is store | |
17b63f68 | 199 | everything in C<@a> and make C<@b> empty. See |
5a0de581 | 200 | L</Pass by Reference> for alternatives. |
19799a22 GS |
201 | |
202 | A subroutine may be called using an explicit C<&> prefix. The | |
203 | C<&> is optional in modern Perl, as are parentheses if the | |
204 | subroutine has been predeclared. The C<&> is I<not> optional | |
205 | when just naming the subroutine, such as when it's used as | |
206 | an argument to defined() or undef(). Nor is it optional when you | |
207 | want to do an indirect subroutine call with a subroutine name or | |
208 | reference using the C<&$subref()> or C<&{$subref}()> constructs, | |
c47ff5f1 | 209 | although the C<< $subref->() >> notation solves that problem. |
19799a22 | 210 | See L<perlref> for more about all that. |
d74e8afc | 211 | X<&> |
19799a22 GS |
212 | |
213 | Subroutines may be called recursively. If a subroutine is called | |
214 | using the C<&> form, the argument list is optional, and if omitted, | |
215 | no C<@_> array is set up for the subroutine: the C<@_> array at the | |
216 | time of the call is visible to the subroutine instead. This is an | |
217 | efficiency mechanism that new users may wish to avoid. | |
d74e8afc | 218 | X<recursion> |
a0d0e21e LW |
219 | |
220 | &foo(1,2,3); # pass three arguments | |
221 | foo(1,2,3); # the same | |
222 | ||
223 | foo(); # pass a null list | |
224 | &foo(); # the same | |
a0d0e21e | 225 | |
cb1a09d0 | 226 | &foo; # foo() gets current args, like foo(@_) !! |
54310121 | 227 | foo; # like foo() IFF sub foo predeclared, else "foo" |
cb1a09d0 | 228 | |
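To make the difference between C<&foo;> and C<foo()> concrete, here is a minimal sketch. The three sub names are hypothetical.

```perl
use strict;
use warnings;

sub count_args    { return scalar @_ }
sub forward_amp   { return &count_args; }     # callee sees this sub's @_
sub forward_plain { return count_args(); }    # callee gets a fresh, empty @_

my $shared = forward_amp(1, 2, 3);      # count_args sees three arguments
my $fresh  = forward_plain(1, 2, 3);    # count_args sees none
```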
19799a22 GS |
229 | Not only does the C<&> form make the argument list optional, it also |
230 | disables any prototype checking on arguments you do provide. This | |
c07a80fd | 231 | is partly for historical reasons, and partly for having a convenient way |
9688be67 | 232 | to cheat if you know what you're doing. See L</Prototypes> below. |
d74e8afc | 233 | X<&> |
c07a80fd | 234 | |
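As an illustration of how the C<&> form sidesteps a prototype, consider this sketch. The sub C<show_args> and its C<($)> prototype are chosen purely for the example.

```perl
use strict;
use warnings;

# The ($) prototype imposes scalar context on the single argument.
sub show_args($) { return join ',', @_ }

my @items = (10, 20, 30);

# Prototype applies: @items is evaluated in scalar context (its length).
my $checked = show_args(@items);

# Prototype ignored: @items is flattened into @_ as usual.
my $unchecked = &show_args(@items);
```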
977616ef RS |
235 | Since Perl 5.16.0, the C<__SUB__> token is available under C<use feature |
236 | 'current_sub'> and C<use 5.16.0>. It will evaluate to a reference to the | |
906024c7 FC |
237 | currently-running sub, which allows for recursive calls without knowing |
238 | your subroutine's name. | |
977616ef RS |
239 | |
240 | use 5.16.0; | |
241 | my $factorial = sub { | |
242 | my ($x) = @_; | |
243 | return 1 if $x <= 1; | |
244 | return($x * __SUB__->( $x - 1 ) ); | |
245 | }; | |
246 | ||
89d1beed | 247 | The behavior of C<__SUB__> within a regex code block (such as C</(?{...})/>) |
a453e28a DM |
248 | is subject to change. |
249 | ||
ac90fb77 EM |
250 | Subroutines whose names are in all upper case are reserved to the Perl |
251 | core, as are modules whose names are in all lower case. A subroutine in | |
252 | all capitals is a loosely-held convention meaning it will be called | |
253 | indirectly by the run-time system itself, usually due to a triggered event. | |
bf5513e0 | 254 | Subroutines whose name start with a left parenthesis are also reserved the |
b77865f5 | 255 | same way. The following is a list of some subroutines that currently do |
bf5513e0 ZA |
256 | special, pre-defined things. |
257 | ||
258 | =over | |
259 | ||
260 | =item documented later in this document | |
261 | ||
262 | C<AUTOLOAD> | |
263 | ||
264 | =item documented in L<perlmod> | |
265 | ||
8b7906d1 | 266 | C<CLONE>, C<CLONE_SKIP> |
bf5513e0 ZA |
267 | |
268 | =item documented in L<perlobj> | |
269 | ||
8b7906d1 | 270 | C<DESTROY>, C<DOES> |
bf5513e0 ZA |
271 | |
272 | =item documented in L<perltie> | |
273 | ||
274 | C<BINMODE>, C<CLEAR>, C<CLOSE>, C<DELETE>, C<DESTROY>, C<EOF>, C<EXISTS>, | |
275 | C<EXTEND>, C<FETCH>, C<FETCHSIZE>, C<FILENO>, C<FIRSTKEY>, C<GETC>, | |
276 | C<NEXTKEY>, C<OPEN>, C<POP>, C<PRINT>, C<PRINTF>, C<PUSH>, C<READ>, | |
277 | C<READLINE>, C<SCALAR>, C<SEEK>, C<SHIFT>, C<SPLICE>, C<STORE>, | |
278 | C<STORESIZE>, C<TELL>, C<TIEARRAY>, C<TIEHANDLE>, C<TIEHASH>, | |
279 | C<TIESCALAR>, C<UNSHIFT>, C<UNTIE>, C<WRITE> | |
280 | ||
281 | =item documented in L<PerlIO::via> | |
282 | ||
283 | C<BINMODE>, C<CLEARERR>, C<CLOSE>, C<EOF>, C<ERROR>, C<FDOPEN>, C<FILENO>, | |
284 | C<FILL>, C<FLUSH>, C<OPEN>, C<POPPED>, C<PUSHED>, C<READ>, C<SEEK>, | |
285 | C<SETLINEBUF>, C<SYSOPEN>, C<TELL>, C<UNREAD>, C<UTF8>, C<WRITE> | |
286 | ||
ec2eb8a9 TC |
287 | =item documented in L<perlfunc> |
288 | ||
289 | L<< C<import> | perlfunc/use >>, L<< C<unimport> | perlfunc/use >>, | |
290 | L<< C<INC> | perlfunc/require >> | |
291 | ||
292 | =item documented in L<UNIVERSAL> | |
293 | ||
294 | C<VERSION> | |
295 | ||
296 | =item documented in L<perldebguts> | |
297 | ||
298 | C<DB::DB>, C<DB::sub>, C<DB::lsub>, C<DB::goto>, C<DB::postponed> | |
299 | ||
bf5513e0 ZA |
300 | =item undocumented, used internally by the L<overload> feature |
301 | ||
302 | any starting with C<(> | |
303 | ||
304 | =back | |
ac90fb77 | 305 | |
3c10abe3 AG |
306 | The C<BEGIN>, C<UNITCHECK>, C<CHECK>, C<INIT> and C<END> subroutines |
307 | are not so much subroutines as named special code blocks, of which you | |
308 | can have more than one in a package, and which you can B<not> call | |
309 | explicitly. See L<perlmod/"BEGIN, UNITCHECK, CHECK, INIT and END">. | |
5a964f20 | 310 | |
30d9c59b Z |
311 | =head2 Signatures |
312 | ||
313 | B<WARNING>: Subroutine signatures are experimental. The feature may be | |
314 | modified or removed in future versions of Perl. | |
315 | ||
316 | Perl has an experimental facility to allow a subroutine's formal | |
317 | parameters to be introduced by special syntax, separate from the | |
318 | procedural code of the subroutine body. The formal parameter list | |
319 | is known as a I<signature>. The facility must be enabled first by a | |
320 | pragmatic declaration, C<use feature 'signatures'>, and it will produce | |
321 | a warning unless the "experimental::signatures" warnings category is | |
322 | disabled. | |
323 | ||
324 | The signature is part of a subroutine's body. Normally the body of a | |
894f226e DM |
325 | subroutine is simply a braced block of code, but when using a signature, |
326 | the signature is a parenthesised list that goes immediately before the | |
327 | block, after any name or attributes. | |
328 | ||
329 | For example, | |
330 | ||
331 | sub foo :lvalue ($a, $b = 1, @c) { .... } | |
332 | ||
333 | The signature declares lexical variables that are | |
30d9c59b Z |
334 | in scope for the block. When the subroutine is called, the signature |
335 | takes control first. It populates the signature variables from the | |
336 | list of arguments that were passed. If the argument list doesn't meet | |
337 | the requirements of the signature, then it will throw an exception. | |
338 | When the signature processing is complete, control passes to the block. | |
339 | ||
340 | Positional parameters are handled by simply naming scalar variables in | |
341 | the signature. For example, | |
342 | ||
343 | sub foo ($left, $right) { | |
344 | return $left + $right; | |
345 | } | |
346 | ||
347 | takes two positional parameters, which must be filled at runtime by | |
348 | two arguments. By default the parameters are mandatory, and it is | |
349 | not permitted to pass more arguments than expected. So the above is | |
350 | equivalent to | |
351 | ||
352 | sub foo { | |
353 | die "Too many arguments for subroutine" unless @_ <= 2; | |
354 | die "Too few arguments for subroutine" unless @_ >= 2; | |
355 | my $left = $_[0]; | |
356 | my $right = $_[1]; | |
357 | return $left + $right; | |
358 | } | |
359 | ||
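The argument-count checking described above can be observed by trapping the exception with C<eval>. This is a sketch under stated assumptions: it needs a Perl that supports signatures (5.20 or later), the sub name C<add> is hypothetical, and the exact wording of the error message varies between Perl versions, so only its leading words are relied on here.

```perl
use strict;
use warnings;
use feature 'signatures';
no warnings 'experimental::signatures';

sub add ($left, $right) { return $left + $right }

my $sum = add(2, 3);

# Both parameters are mandatory, so wrong counts throw at call time.
my $too_few  = eval { add(1);       1 } ? '' : $@;
my $too_many = eval { add(1, 2, 3); 1 } ? '' : $@;
```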
360 | An argument can be ignored by omitting the main part of the name from | |
361 | a parameter declaration, leaving just a bare C<$> sigil. For example, | |
362 | ||
363 | sub foo ($first, $, $third) { | |
364 | return "first=$first, third=$third"; | |
365 | } | |
366 | ||
367 | Although the ignored argument doesn't go into a variable, it is still | |
368 | mandatory for the caller to pass it. | |
369 | ||
370 | A positional parameter is made optional by giving a default value, | |
371 | separated from the parameter name by C<=>: | |
372 | ||
373 | sub foo ($left, $right = 0) { | |
374 | return $left + $right; | |
375 | } | |
376 | ||
377 | The above subroutine may be called with either one or two arguments. | |
378 | The default value expression is evaluated when the subroutine is called, | |
379 | so it may provide different default values for different calls. It is | |
380 | only evaluated if the argument was actually omitted from the call. | |
381 | For example, | |
382 | ||
383 | my $auto_id = 0; | |
384 | sub foo ($thing, $id = $auto_id++) { | |
385 | print "$thing has ID $id"; | |
386 | } | |
387 | ||
388 | automatically assigns distinct sequential IDs to things for which no | |
389 | ID was supplied by the caller. A default value expression may also | |
390 | refer to parameters earlier in the signature, making the default for | |
391 | one parameter vary according to the earlier parameters. For example, | |
392 | ||
393 | sub foo ($first_name, $surname, $nickname = $first_name) { | |
394 | print "$first_name $surname is known as \"$nickname\""; | |
395 | } | |
396 | ||
397 | An optional parameter can be nameless just like a mandatory parameter. | |
398 | For example, | |
399 | ||
400 | sub foo ($thing, $ = 1) { | |
401 | print $thing; | |
402 | } | |
403 | ||
404 | The parameter's default value will still be evaluated if the corresponding | |
405 | argument isn't supplied, even though the value won't be stored anywhere. | |
406 | This is in case evaluating it has important side effects. However, it | |
407 | will be evaluated in void context, so if it doesn't have side effects | |
408 | and is not trivial it will generate a warning if the "void" warning | |
409 | category is enabled. If a nameless optional parameter's default value | |
410 | is not important, it may be omitted just as the parameter's name was: | |
411 | ||
412 | sub foo ($thing, $=) { | |
413 | print $thing; | |
414 | } | |
415 | ||
416 | Optional positional parameters must come after all mandatory positional | |
417 | parameters. (If there are no mandatory positional parameters then an | |
418 | optional positional parameter can be the first thing in the signature.) | |
419 | If there are multiple optional positional parameters and not enough | |
420 | arguments are supplied to fill them all, they will be filled from left | |
421 | to right. | |
422 | ||
423 | After positional parameters, additional arguments may be captured in a | |
424 | slurpy parameter. The simplest form of this is just an array variable: | |
425 | ||
426 | sub foo ($filter, @inputs) { | |
427 | print $filter->($_) foreach @inputs; | |
428 | } | |
429 | ||
430 | With a slurpy parameter in the signature, there is no upper limit on how | |
431 | many arguments may be passed. A slurpy array parameter may be nameless | |
432 | just like a positional parameter, in which case its only effect is to | |
433 | turn off the argument limit that would otherwise apply: | |
434 | ||
435 | sub foo ($thing, @) { | |
436 | print $thing; | |
437 | } | |
438 | ||
439 | A slurpy parameter may instead be a hash, in which case the arguments | |
440 | available to it are interpreted as alternating keys and values. | |
441 | There must be as many keys as values: if there is an odd argument then | |
442 | an exception will be thrown. Keys will be stringified, and if there are | |
443 | duplicates then the later instance takes precedence over the earlier, | |
444 | as with standard hash construction. | |
445 | ||
446 | sub foo ($filter, %inputs) { | |
447 | print $filter->($_, $inputs{$_}) foreach sort keys %inputs; | |
448 | } | |
449 | ||
450 | A slurpy hash parameter may be nameless just like other kinds of | |
451 | parameter. It still insists that the number of arguments available to | |
452 | it be even, even though they're not being put into a variable. | |
453 | ||
454 | sub foo ($thing, %) { | |
455 | print $thing; | |
456 | } | |
457 | ||
458 | A slurpy parameter, either array or hash, must be the last thing in the | |
459 | signature. It may follow mandatory and optional positional parameters; | |
460 | it may also be the only thing in the signature. Slurpy parameters cannot | |
461 | have default values: if no arguments are supplied for them then you get | |
462 | an empty array or empty hash. | |
463 | ||
464 | A signature may be entirely empty, in which case all it does is check | |
465 | that the caller passed no arguments: | |
466 | ||
467 | sub foo () { | |
468 | return 123; | |
469 | } | |
470 | ||
8c13e946 RS |
471 | When using a signature, the arguments are still available in the special |
472 | array variable C<@_>, in addition to the lexical variables of the | |
473 | signature. There is a difference between the two ways of accessing the | |
30d9c59b Z |
474 | arguments: C<@_> I<aliases> the arguments, but the signature variables |
475 | get I<copies> of the arguments. So writing to a signature variable | |
476 | only changes that variable, and has no effect on the caller's variables, | |
477 | but writing to an element of C<@_> modifies whatever the caller used to | |
478 | supply that argument. | |
479 | ||
480 | There is a potential syntactic ambiguity between signatures and prototypes | |
481 | (see L</Prototypes>), because both start with an opening parenthesis and | |
482 | both can appear in some of the same places, such as just after the name | |
483 | in a subroutine declaration. For historical reasons, when signatures | |
484 | are not enabled, any opening parenthesis in such a context will trigger | |
485 | very forgiving prototype parsing. Most signatures will be interpreted | |
486 | as prototypes in those circumstances, but won't be valid prototypes. | |
487 | (A valid prototype cannot contain any alphabetic character.) This will | |
488 | lead to somewhat confusing error messages. | |
489 | ||
490 | To avoid ambiguity, when signatures are enabled the special syntax | |
491 | for prototypes is disabled. There is no attempt to guess whether a | |
492 | parenthesised group was intended to be a prototype or a signature. | |
493 | To give a subroutine a prototype under these circumstances, use a | |
494 | L<prototype attribute|attributes/Built-in Attributes>. For example, | |
495 | ||
496 | sub foo :prototype($) { $_[0] } | |
497 | ||
498 | It is entirely possible for a subroutine to have both a prototype and | |
499 | a signature. They do different jobs: the prototype affects compilation | |
500 | of calls to the subroutine, and the signature puts argument values into | |
501 | lexical variables at runtime. You can therefore write | |
502 | ||
894f226e | 503 | sub foo :prototype($$) ($left, $right) { |
30d9c59b Z |
504 | return $left + $right; |
505 | } | |
506 | ||
894f226e DM |
507 | The prototype attribute, and any other attributes, must come before |
508 | the signature. The signature always immediately precedes the block of | |
509 | the subroutine's body. | |
30d9c59b | 510 | |
b687b08b | 511 | =head2 Private Variables via my() |
d74e8afc ITB |
512 | X<my> X<variable, lexical> X<lexical> X<lexical variable> X<scope, lexical> |
513 | X<lexical scope> X<attributes, my> | |
cb1a09d0 AD |
514 | |
515 | Synopsis: | |
516 | ||
517 | my $foo; # declare $foo lexically local | |
518 | my (@wid, %get); # declare list of variables local | |
519 | my $foo = "flurp"; # declare $foo lexical, and init it | |
520 | my @oof = @bar; # declare @oof lexical, and init it | |
09bef843 SB |
521 | my $x : Foo = $y; # similar, with an attribute applied |
522 | ||
a0ae32d3 JH |
523 | B<WARNING>: The use of attribute lists on C<my> declarations is still |
524 | evolving. The current semantics and interface are subject to change. | |
525 | See L<attributes> and L<Attribute::Handlers>. | |
cb1a09d0 | 526 | |
19799a22 | 527 | The C<my> operator declares the listed variables to be lexically |
f185f654 KW |
528 | confined to the enclosing block, conditional |
529 | (C<if>/C<unless>/C<elsif>/C<else>), loop | |
530 | (C<for>/C<foreach>/C<while>/C<until>/C<continue>), subroutine, C<eval>, | |
531 | or C<do>/C<require>/C<use>'d file. If more than one value is listed, the | |
19799a22 GS |
532 | list must be placed in parentheses. All listed elements must be |
533 | legal lvalues. Only alphanumeric identifiers may be lexically | |
325192b1 | 534 | scoped--magical built-ins like C<$/> must currently be C<local>ized |
19799a22 GS |
535 | with C<local> instead. |
536 | ||
537 | Unlike dynamic variables created by the C<local> operator, lexical | |
538 | variables declared with C<my> are totally hidden from the outside | |
539 | world, including any called subroutines. This is true if it's the | |
540 | same subroutine called from itself or elsewhere--every call gets | |
541 | its own copy. | |
d74e8afc | 542 | X<local> |
19799a22 GS |
543 | |
544 | This doesn't mean that a C<my> variable declared in a statically | |
545 | enclosing lexical scope would be invisible. Only dynamic scopes | |
546 | are cut off. For example, the C<bumpx()> function below has access | |
547 | to the lexical $x variable because both the C<my> and the C<sub> | |
548 | occurred at the same scope, presumably file scope. | |
5a964f20 TC |
549 | |
550 | my $x = 10; | |
551 | sub bumpx { $x++ } | |
552 | ||
19799a22 GS |
553 | An C<eval()>, however, can see lexical variables of the scope it is |
554 | being evaluated in, so long as the names aren't hidden by declarations within | |
555 | the C<eval()> itself. See L<perlref>. | |
d74e8afc | 556 | X<eval, scope of> |
cb1a09d0 | 557 | |
19799a22 | 558 | The parameter list to my() may be assigned to if desired, which allows you |
cb1a09d0 AD |
559 | to initialize your variables. (If no initializer is given for a |
560 | particular variable, it is created with the undefined value.) Commonly | |
19799a22 | 561 | this is used to name input parameters to a subroutine. Examples: |
cb1a09d0 AD |
562 | |
563 | $arg = "fred"; # "global" variable | |
564 | $n = cube_root(27); | |
565 | print "$arg thinks the root is $n\n"; | |
566 | fred thinks the root is 3 | |
567 | ||
568 | sub cube_root { | |
569 | my $arg = shift; # name doesn't matter | |
570 | $arg **= 1/3; | |
571 | return $arg; | |
54310121 | 572 | } |
cb1a09d0 | 573 | |
19799a22 GS |
574 | The C<my> is simply a modifier on something you might assign to. So when |
575 | you do assign to variables in its argument list, C<my> doesn't | |
6cc33c6d | 576 | change whether those variables are viewed as a scalar or an array. So |
cb1a09d0 | 577 | |
5a964f20 | 578 | my ($foo) = <STDIN>; # WRONG? |
cb1a09d0 AD |
579 | my @FOO = <STDIN>; |
580 | ||
5f05dabc | 581 | both supply a list context to the right-hand side, while |
cb1a09d0 AD |
582 | |
583 | my $foo = <STDIN>; | |
584 | ||
5f05dabc | 585 | supplies a scalar context. But the following declares only one variable: |
748a9306 | 586 | |
5a964f20 | 587 | my $foo, $bar = 1; # WRONG |
748a9306 | 588 | |
cb1a09d0 | 589 | That has the same effect as |
748a9306 | 590 | |
cb1a09d0 AD |
591 | my $foo; |
592 | $bar = 1; | |
a0d0e21e | 593 | |
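The context rules and the parenthesization pitfall above can be sketched like this, with illustrative variable names:

```perl
use strict;
use warnings;

my @source = ('a', 'b', 'c');

my ($first) = @source;    # list assignment: takes the first element
my $count   = @source;    # scalar assignment: takes the element count

# Parentheses are required to declare more than one variable at once.
my ($foo, $bar) = (undef, 1);
```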
cb1a09d0 AD |
594 | The declared variable is not introduced (is not visible) until after |
595 | the current statement. Thus, | |
596 | ||
597 | my $x = $x; | |
598 | ||
19799a22 | 599 | can be used to initialize a new $x with the value of the old $x, and |
cb1a09d0 AD |
600 | the expression |
601 | ||
602 | my $x = 123 and $x == 123 | |
603 | ||
19799a22 | 604 | is false unless the old $x happened to have the value C<123>. |
cb1a09d0 | 605 | |
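A brief sketch of that visibility rule, using an inner block so that the two C<$x> variables live in different scopes:

```perl
use strict;
use warnings;

my $x = 10;
my $inner;
{
    # The $x on the right-hand side is still the outer variable,
    # because the new $x is not visible until the next statement.
    my $x = $x + 1;
    $inner = $x;
}
# Here $x refers to the outer variable again, which was never changed.
```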
55497cff | 606 | Lexical scopes of control structures are not bounded precisely by the |
607 | braces that delimit their controlled blocks; control expressions are | |
19799a22 | 608 | part of that scope, too. Thus in the loop |
55497cff | 609 | |
19799a22 | 610 | while (my $line = <>) { |
55497cff | 611 | $line = lc $line; |
612 | } continue { | |
613 | print $line; | |
614 | } | |
615 | ||
19799a22 | 616 | the scope of $line extends from its declaration throughout the rest of |
55497cff | 617 | the loop construct (including the C<continue> clause), but not beyond |
618 | it. Similarly, in the conditional | |
619 | ||
620 | if ((my $answer = <STDIN>) =~ /^yes$/i) { | |
621 | user_agrees(); | |
622 | } elsif ($answer =~ /^no$/i) { | |
623 | user_disagrees(); | |
624 | } else { | |
625 | chomp $answer; | |
626 | die "'$answer' is neither 'yes' nor 'no'"; | |
627 | } | |
628 | ||
19799a22 GS |
629 | the scope of $answer extends from its declaration through the rest |
630 | of that conditional, including any C<elsif> and C<else> clauses, | |
96090e4f | 631 | but not beyond it. See L<perlsyn/"Simple Statements"> for information |
457b36cb | 632 | on the scope of variables in statements with modifiers. |
55497cff | 633 | |
5f05dabc | 634 | The C<foreach> loop defaults to scoping its index variable dynamically |
19799a22 GS |
635 | in the manner of C<local>. However, if the index variable is |
636 | prefixed with the keyword C<my>, or if there is already a lexical | |
637 | by that name in scope, then a new lexical is created instead. Thus | |
638 | in the loop | |
d74e8afc | 639 | X<foreach> X<for> |
55497cff | 640 | |
641 | for my $i (1, 2, 3) { | |
642 | some_function(); | |
643 | } | |
644 | ||
19799a22 GS |
645 | the scope of $i extends to the end of the loop, but not beyond it, |
646 | rendering the value of $i inaccessible within C<some_function()>. | |
d74e8afc | 647 | X<foreach> X<for> |
55497cff | 648 | |
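The difference between the two kinds of loop variable can be seen from inside a called subroutine. In this sketch the sub C<peek> and the package variable C<$main::i> are assumptions introduced for the demonstration.

```perl
use strict;
use warnings;

$main::i = 'global';
sub peek { return $main::i }    # always reads the package variable

my (@dynamic, @lexical);

# Package variable as index: localized, so peek() sees each value.
for $main::i (1, 2) { push @dynamic, peek() }

# Lexical index: invisible to peek(), which still sees the global.
for my $i (1, 2) { push @lexical, peek() }
```

After the first loop finishes, C<$main::i> holds its original value again.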
cb1a09d0 | 649 | Some users may wish to encourage the use of lexically scoped variables. |
19799a22 GS |
650 | As an aid to catching implicit uses of package variables, | |
651 | which are always global, if you say | |
cb1a09d0 AD |
652 | |
653 | use strict 'vars'; | |
654 | ||
19799a22 GS |
655 | then any variable mentioned from there to the end of the enclosing |
656 | block must either refer to a lexical variable, be predeclared via | |
77ca0c92 | 657 | C<our> or C<use vars>, or else must be fully qualified with the package name. |
19799a22 GS |
658 | A compilation error results otherwise. An inner block may countermand |
659 | this with C<no strict 'vars'>. | |
660 | ||
661 | A C<my> has both a compile-time and a run-time effect. At compile | |
8593bda5 | 662 | time, the compiler takes notice of it. The principal usefulness |
19799a22 GS |
663 | of this is to quiet C<use strict 'vars'>, but it is also essential |
664 | for generation of closures as detailed in L<perlref>. Actual | |
665 | initialization is delayed until run time, though, so it gets executed | |
666 | at the appropriate time, such as each time through a loop, for | |
667 | example. | |
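The run-time half of that behaviour can be sketched as follows. The names C<@subs> and C<$scaled> are illustrative only. Because the C<my> is executed afresh on every pass through the loop, each closure captures its own variable.

```perl
use strict;
use warnings;

my @subs;
for my $n (1 .. 3) {
    my $scaled = $n * 10;           # a fresh $scaled on every iteration
    push @subs, sub { return $scaled };
}

my @values = map { $_->() } @subs;  # (10, 20, 30)
```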
668 | ||
669 | Variables declared with C<my> are not part of any package and are therefore | |
cb1a09d0 AD |
670 | never fully qualified with the package name. In particular, you're not |
671 | allowed to try to make a package variable (or other global) lexical: | |
672 | ||
673 | my $pack::var; # ERROR! Illegal syntax | |
cb1a09d0 AD |
674 | |
675 | In fact, dynamic variables (also known as package or global variables) | |
f86cebdf | 676 | are still accessible using the fully qualified C<::> notation even while a |
cb1a09d0 AD |
677 | lexical of the same name is also visible: |
678 | ||
679 | package main; | |
680 | local $x = 10; | |
681 | my $x = 20; | |
682 | print "$x and $::x\n"; | |
683 | ||
f86cebdf | 684 | That will print out C<20> and C<10>. |
cb1a09d0 | 685 | |
19799a22 GS |
686 | You may declare C<my> variables at the outermost scope of a file |
687 | to hide any such identifiers from the world outside that file. This | |
688 | is similar in spirit to C's static variables when they are used at | |
689 | the file level. To do this with a subroutine requires the use of | |
690 | a closure (an anonymous function that accesses enclosing lexicals). | |
691 | If you want to create a private subroutine that cannot be called | |
692 | from outside that block, you can declare a lexical variable containing | |
693 | an anonymous sub reference: | |
cb1a09d0 AD |
694 | |
695 | my $secret_version = '1.001-beta'; | |
696 | my $secret_sub = sub { print $secret_version }; | |
697 | &$secret_sub(); | |
698 | ||
699 | As long as the reference is never returned by any function within the | |
5f05dabc | 700 | module, no outside module can see the subroutine, because its name is not in |
cb1a09d0 | 701 | any package's symbol table. Remember that it's not I<REALLY> called |
19799a22 | 702 | C<$some_pack::secret_version> or anything; it's just $secret_version, |
cb1a09d0 AD |
703 | unqualified and unqualifiable. |
704 | ||
19799a22 GS |
705 | This does not work with object methods, however; all object methods |
706 | have to be in the symbol table of some package to be found. See | |
707 | L<perlref/"Function Templates"> for something of a work-around to | |
708 | this. | |
cb1a09d0 | 709 | |
c2611fb3 | 710 | =head2 Persistent Private Variables |
ba1f8e91 RGS |
711 | X<state> X<state variable> X<static> X<variable, persistent> X<variable, static> X<closure> |
712 | ||
713 | There are two ways to build persistent private variables in Perl 5.10. | |
b77865f5 | 714 | First, you can simply use the C<state> feature. Or, you can use closures, |
ba1f8e91 RGS |
715 | if you want to stay compatible with releases older than 5.10. |
716 | ||
717 | =head3 Persistent variables via state() | |
718 | ||
9d42615f | 719 | Beginning with Perl 5.10.0, you can declare variables with the C<state> |
4a904372 | 720 | keyword in place of C<my>. For that to work, though, you must have |
ba1f8e91 | 721 | enabled that feature beforehand, either by using the C<feature> pragma, or |
4a904372 | 722 | by using C<-E> on one-liners (see L<feature>). Beginning with Perl 5.16, |
47d235f1 | 723 | the C<CORE::state> form does not require the |
4a904372 | 724 | C<feature> pragma. |
ba1f8e91 | 725 | |
ad0cc46c FC |
726 | The C<state> keyword creates a lexical variable (following the same scoping |
727 | rules as C<my>) that persists from one subroutine call to the next. If a | |
728 | state variable resides inside an anonymous subroutine, then each copy of | |
729 | the subroutine has its own copy of the state variable. However, the value | |
730 | of the state variable will still persist between calls to the same copy of | |
731 | the anonymous subroutine. (Don't forget that C<sub { ... }> creates a new | |
732 | subroutine each time it is executed.) | |
733 | ||
ba1f8e91 RGS |
734 | For example, the following code maintains a private counter, incremented |
735 | each time the gimme_another() function is called: | |
736 | ||
737 | use feature 'state'; | |
738 | sub gimme_another { state $x; return ++$x } | |
739 | ||
ad0cc46c FC |
740 | And this example uses anonymous subroutines to create separate counters: |
741 | ||
742 | use feature 'state'; | |
743 | sub create_counter { | |
744 | return sub { state $x; return ++$x } | |
745 | } | |
746 | ||
ba1f8e91 RGS |
747 | Also, since C<$x> is lexical, it can't be reached or modified by any Perl |
748 | code outside. | |
749 | ||
f99042c8 | 750 | When combined with variable declaration, simple assignment to C<state> |
f292fc7a RS |
751 | variables (as in C<state $x = 42>) is executed only the first time. When such |
752 | statements are evaluated subsequent times, the assignment is ignored. The | |
f99042c8 Z |
753 | behavior of assignment to C<state> declarations where the left hand side |
754 | of the assignment involves any parentheses is currently undefined. | |
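The following sketch pulls these points together. C<next_id()> and C<create_stepper()> are hypothetical names; the latter is a variant of C<create_counter()> that takes a step size, so that each returned subroutine is clearly a distinct closure.

```perl
use strict;
use warnings;
use feature 'state';

# The initializer runs on the first call only.
sub next_id {
    state $n = 100;
    return $n++;
}

# Each returned closure has its own copy of $total.
sub create_stepper {
    my ($step) = @_;
    return sub {
        state $total = 0;
        $total += $step;
        return $total;
    };
}

my @ids  = (next_id(), next_id(), next_id());           # (100, 101, 102)
my $by1  = create_stepper(1);
my $by10 = create_stepper(10);
my @got  = ($by1->(), $by1->(), $by10->(), $by1->());   # (1, 2, 10, 3)
```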
ba1f8e91 RGS |
755 | |
756 | =head3 Persistent variables with closures | |
5a964f20 TC |
757 | |
758 | Just because a lexical variable is lexically (also called statically) | |
f86cebdf | 759 | scoped to its enclosing block, C<eval>, or C<do> FILE, this doesn't mean that |
5a964f20 TC |
760 | within a function it works like a C static. It normally works more |
761 | like a C auto, but with implicit garbage collection. | |
762 | ||
763 | Unlike local variables in C or C++, Perl's lexical variables don't | |
764 | necessarily get recycled just because their scope has exited. | |
765 | If something more permanent is still aware of the lexical, it will | |
766 | stick around. So long as something else references a lexical, that | |
767 | lexical won't be freed--which is as it should be. You wouldn't want | |
768 | memory being freed until you were done using it, or kept around once you | |
769 | were done. Automatic garbage collection takes care of this for you. | |
770 | ||
771 | This means that you can pass back or save away references to lexical | |
772 | variables, whereas to return a pointer to a C auto is a grave error. | |
773 | It also gives us a way to simulate C's function statics. Here's a | |
774 | mechanism for giving a function private variables with both lexical | |
775 | scoping and a static lifetime. If you do want to create something like | |
776 | C's static variables, just enclose the whole function in an extra block, | |
777 | and put the static variable outside the function but in the block. | |
cb1a09d0 AD |
778 | |
779 | { | |
54310121 | 780 | my $secret_val = 0; |
cb1a09d0 AD |
781 | sub gimme_another { |
782 | return ++$secret_val; | |
54310121 | 783 | } |
784 | } | |
cb1a09d0 AD |
785 | # $secret_val now becomes unreachable by the outside |
786 | # world, but retains its value between calls to gimme_another | |
787 | ||
54310121 | 788 | If this function is being sourced in from a separate file |
cb1a09d0 | 789 | via C<require> or C<use>, then this is probably just fine. If it's |
19799a22 | 790 | all in the main program, you'll need to arrange for the C<my> |
cb1a09d0 | 791 | to be executed early, either by putting the whole block above |
f86cebdf | 792 | your main program, or more likely, placing merely a C<BEGIN> |
ac90fb77 | 793 | code block around it to make sure it gets executed before your program |
cb1a09d0 AD |
794 | starts to run: |
795 | ||
ac90fb77 | 796 | BEGIN { |
54310121 | 797 | my $secret_val = 0; |
cb1a09d0 AD |
798 | sub gimme_another { |
799 | return ++$secret_val; | |
54310121 | 800 | } |
801 | } | |
cb1a09d0 | 802 | |
3c10abe3 AG |
803 | See L<perlmod/"BEGIN, UNITCHECK, CHECK, INIT and END"> about the |
804 | special triggered code blocks, C<BEGIN>, C<UNITCHECK>, C<CHECK>, | |
805 | C<INIT> and C<END>. | |
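As a hedged illustration of why the C<BEGIN> matters, the sketch below calls the function before the block appears in the file. The names C<next_ticket()> and C<peek_ticket()> are made up for this example; both subroutines share the same private variable.

```perl
use strict;
use warnings;

# This call appears before the block in the file, yet it sees the
# initialized value because BEGIN has already run at compile time.
my $first = next_ticket();      # 42

BEGIN {
    my $ticket = 41;            # private to the two subs below
    sub next_ticket { return ++$ticket }
    sub peek_ticket { return $ticket }
}

my $second = next_ticket();     # 43
my $seen   = peek_ticket();     # 43
```

Without the C<BEGIN>, the assignment C<my $ticket = 41> would run only when execution reached the block, that is, after the first call had already incremented the variable.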
cb1a09d0 | 806 | |
19799a22 GS |
807 | If declared at the outermost scope (the file scope), then lexicals |
808 | work somewhat like C's file statics. They are available to all | |
809 | functions in that same file declared below them, but are inaccessible | |
810 | from outside that file. This strategy is sometimes used in modules | |
811 | to create private variables that the whole module can see. | |
5a964f20 | 812 | |
cb1a09d0 | 813 | =head2 Temporary Values via local() |
d74e8afc ITB |
814 | X<local> X<scope, dynamic> X<dynamic scope> X<variable, local> |
815 | X<variable, temporary> | |
cb1a09d0 | 816 | |
19799a22 | 817 | B<WARNING>: In general, you should be using C<my> instead of C<local>, because |
6d28dffb | 818 | it's faster and safer. Exceptions to this include the global punctuation |
325192b1 RGS |
819 | variables, global filehandles and formats, and direct manipulation of the |
820 | Perl symbol table itself. C<local> is mostly used when the current value | |
821 | of a variable must be visible to called subroutines. | |
cb1a09d0 AD |
822 | |
823 | Synopsis: | |
824 | ||
325192b1 RGS |
825 | # localization of values |
826 | ||
555bd962 BG |
827 | local $foo; # make $foo dynamically local |
828 | local (@wid, %get); # make list of variables local | |
829 | local $foo = "flurp"; # make $foo dynamic, and init it | |
830 | local @oof = @bar; # make @oof dynamic, and init it | |
325192b1 | 831 | |
555bd962 BG |
832 | local $hash{key} = "val"; # sets a local value for this hash entry |
833 | delete local $hash{key}; # delete this entry for the current block | |
834 | local ($cond ? $v1 : $v2); # several types of lvalues support | |
835 | # localization | |
325192b1 RGS |
836 | |
837 | # localization of symbols | |
cb1a09d0 | 838 | |
555bd962 BG |
839 | local *FH; # localize $FH, @FH, %FH, &FH ... |
840 | local *merlyn = *randal; # now $merlyn is really $randal, plus | |
841 | # @merlyn is really @randal, etc | |
842 | local *merlyn = 'randal'; # SAME THING: promote 'randal' to *randal | |
843 | local *merlyn = \$randal; # just alias $merlyn, not @merlyn etc | |
cb1a09d0 | 844 | |
19799a22 GS |
845 | A C<local> modifies its listed variables to be "local" to the |
846 | enclosing block, C<eval>, or C<do FILE>--and to I<any subroutine | |
847 | called from within that block>. A C<local> just gives temporary | |
848 | values to global (meaning package) variables. It does I<not> create | |
849 | a local variable. This is known as dynamic scoping. Lexical scoping | |
850 | is done with C<my>, which works more like C's auto declarations. | |
cb1a09d0 | 851 | |
ceb12f1f | 852 | Some types of lvalues can be localized as well: hash and array elements |
325192b1 RGS |
853 | and slices, conditionals (provided that their result is always |
854 | localizable), and symbolic references. As for simple variables, this | |
855 | creates new, dynamically scoped values. | |
856 | ||
857 | If more than one variable or expression is given to C<local>, they must be | |
858 | placed in parentheses. This operator works | |
cb1a09d0 | 859 | by saving the current values of those variables in its argument list on a |
5f05dabc | 860 | hidden stack and restoring them upon exiting the block, subroutine, or |
cb1a09d0 AD |
861 | eval. This means that called subroutines can also reference the local |
862 | variable, but not the global one. The argument list may be assigned to if | |
863 | desired, which allows you to initialize your local variables. (If no | |
864 | initializer is given for a particular variable, it is created with an | |
325192b1 | 865 | undefined value.) |
cb1a09d0 | 866 | |
19799a22 | 867 | Because C<local> is a run-time operator, it gets executed each time |
325192b1 RGS |
868 | through a loop. Consequently, it's more efficient to localize your |
869 | variables outside the loop. | |
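A minimal sketch of dynamic scoping, using the hypothetical names C<$depth>, C<report()> and C<descend()>: the temporary value is visible to any subroutine called while the C<local> is in effect, and the old value comes back when the enclosing scope exits.

```perl
use strict;
use warnings;

our $depth = 0;                         # package variable

sub report { return "depth=$depth" }    # sees whatever value is current

sub descend {
    local $depth = $depth + 1;          # temporary value for this call
    return report();                    # the callee sees the local value
}

my $inside  = descend();    # "depth=1"
my $outside = report();     # "depth=0": old value restored on exit
```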
870 | ||
871 | =head3 Grammatical note on local() | |
d74e8afc | 872 | X<local, context> |
cb1a09d0 | 873 | |
f86cebdf GS |
874 | A C<local> is simply a modifier on an lvalue expression. When you assign to |
875 | a C<local>ized variable, the C<local> doesn't change whether its list is viewed | |
cb1a09d0 AD |
876 | as a scalar or an array. So |
877 | ||
878 | local($foo) = <STDIN>; | |
879 | local @FOO = <STDIN>; | |
880 | ||
5f05dabc | 881 | both supply a list context to the right-hand side, while |
cb1a09d0 AD |
882 | |
883 | local $foo = <STDIN>; | |
884 | ||
885 | supplies a scalar context. | |
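The same rule can be checked without reading from C<STDIN>. The sketch below substitutes a hypothetical C<context()> function that reports how it was called.

```perl
use strict;
use warnings;

our ($foo, @FOO);

# Hypothetical stand-in for <STDIN> that reports its calling context.
sub context { return wantarray ? 'list' : 'scalar' }

my @seen;
{ local ($foo) = context(); push @seen, $foo;    }   # list
{ local @FOO   = context(); push @seen, $FOO[0]; }   # list
{ local $foo   = context(); push @seen, $foo;    }   # scalar
```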
886 | ||
325192b1 | 887 | =head3 Localization of special variables |
d74e8afc | 888 | X<local, special variable> |
3e3baf6d | 889 | |
325192b1 RGS |
890 | If you localize a special variable, you'll be giving a new value to it, |
891 | but its magic won't go away. That means that all side-effects related | |
892 | to this magic still work with the localized value. | |
3e3baf6d | 893 | |
325192b1 RGS |
894 | This feature allows code like this to work: | |
895 | ||
896 | # Read the whole contents of FILE in $slurp | |
897 | { local $/ = undef; $slurp = <FILE>; } | |
898 | ||
899 | Note, however, that this restricts localization of some values; for | |
9d42615f | 900 | example, the following statement dies, as of perl 5.10.0, with an error |
325192b1 RGS |
901 | I<Modification of a read-only value attempted>, because the $1 variable is |
902 | magical and read-only: | |
903 | ||
904 | local $1 = 2; | |
905 | ||
658a9f31 JD |
906 | One exception is the default scalar variable: starting with perl 5.14 |
907 | C<local($_)> will always strip all magic from $_, to make it possible | |
908 | to safely reuse $_ in a subroutine. | |
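The slurp idiom shown earlier in this section can be tried without a file on disk. This sketch assumes an in-memory filehandle opened on a string; the variable names are illustrative.

```perl
use strict;
use warnings;

my $text = "line 1\nline 2\nline 3\n";
open my $fh, '<', \$text or die "Cannot open in-memory file: $!";

my $slurp;
{
    local $/ = undef;       # no record separator: read everything at once
    $slurp = <$fh>;
}
close $fh;

my $line_count = () = $slurp =~ /\n/g;   # 3
# Outside the block $/ is "\n" again.
```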
325192b1 RGS |
909 | |
910 | B<WARNING>: Localization of tied arrays and hashes does not currently | |
911 | work as described. | |
fd5a896a | 912 | This will be fixed in a future release of Perl; in the meantime, avoid |
89d1beed | 913 | code that relies on any particular behavior of localising tied arrays |
fd5a896a | 914 | or hashes (localising individual elements is still okay). |
325192b1 | 915 | See L<perl58delta/"Localising Tied Arrays and Hashes Is Broken"> for more |
fd5a896a | 916 | details. |
d74e8afc | 917 | X<local, tie> |
fd5a896a | 918 | |
325192b1 | 919 | =head3 Localization of globs |
d74e8afc | 920 | X<local, glob> X<glob> |
3e3baf6d | 921 | |
325192b1 RGS |
922 | The construct |
923 | ||
924 | local *name; | |
925 | ||
926 | creates a whole new symbol table entry for the glob C<name> in the | |
927 | current package. That means that all variables in its glob slot ($name, | |
928 | @name, %name, &name, and the C<name> filehandle) are dynamically reset. | |
929 | ||
930 | This implies, among other things, that any magic eventually carried by | |
931 | those variables is locally lost. In other words, saying C<local */> | |
932 | will not have any effect on the internal value of the input record | |
933 | separator. | |
934 | ||
325192b1 | 935 | =head3 Localization of elements of composite types |
d74e8afc | 936 | X<local, composite type element> X<local, array element> X<local, hash element> |
3e3baf6d | 937 | |
6ee623d5 | 938 | It's also worth taking a moment to explain what happens when you |
f86cebdf | 939 | C<local>ize a member of a composite type (i.e. an array or hash element). |
b77865f5 | 940 | In this case, the element is C<local>ized I<by name>. This means that |
6ee623d5 GS |
941 | when the scope of the C<local()> ends, the saved value will be |
942 | restored to the hash element whose key was named in the C<local()>, or | |
943 | the array element whose index was named in the C<local()>. If that | |
944 | element was deleted while the C<local()> was in effect (e.g. by a | |
945 | C<delete()> from a hash or a C<shift()> of an array), it will spring | |
946 | back into existence, possibly extending an array and filling in the | |
947 | skipped elements with C<undef>. For instance, if you say | |
948 | ||
949 | %hash = ( 'This' => 'is', 'a' => 'test' ); | |
950 | @ary = ( 0..5 ); | |
951 | { | |
952 | local($ary[5]) = 6; | |
953 | local($hash{'a'}) = 'drill'; | |
954 | while (my $e = pop(@ary)) { | |
955 | print "$e . . .\n"; | |
956 | last unless $e > 3; | |
957 | } | |
958 | if (@ary) { | |
959 | $hash{'only a'} = 'test'; | |
960 | delete $hash{'a'}; | |
961 | } | |
962 | } | |
963 | print join(' ', map { "$_ $hash{$_}" } sort keys %hash),".\n"; | |
964 | print "The array has ",scalar(@ary)," elements: ", | |
965 | join(', ', map { defined $_ ? $_ : 'undef' } @ary),"\n"; | |
966 | ||
967 | Perl will print | |
968 | ||
969 | 6 . . . | |
970 | 4 . . . | |
971 | 3 . . . | |
972 | This is a test only a test. | |
973 | The array has 6 elements: 0, 1, 2, undef, undef, 5 | |
974 | ||
19799a22 | 975 | The behavior of local() on non-existent members of composite |
7185e5cc GS |
976 | types is subject to change in future. |
977 | ||
d361fafa VP |
978 | =head3 Localized deletion of elements of composite types |
979 | X<delete> X<local, composite type element> X<local, array element> X<local, hash element> | |
980 | ||
981 | You can use the C<delete local $array[$idx]> and C<delete local $hash{key}> | |
982 | constructs to delete a composite type entry for the current block and restore | |
b77865f5 | 983 | it when it ends. They return the array/hash value before the localization, |
d361fafa VP |
984 | which means that they are respectively equivalent to |
985 | ||
986 | do { | |
987 | my $val = $array[$idx]; | |
988 | local $array[$idx]; | |
989 | delete $array[$idx]; | |
990 | $val | |
991 | } | |
992 | ||
993 | and | |
994 | ||
995 | do { | |
996 | my $val = $hash{key}; | |
997 | local $hash{key}; | |
998 | delete $hash{key}; | |
999 | $val | |
1000 | } | |
1001 | ||
b77865f5 FC |
1002 | except that for those the C<local> is |
1003 | scoped to the C<do> block. Slices are | |
d361fafa VP |
1004 | also accepted. |
1005 | ||
1006 | my %hash = ( | |
1007 | a => [ 7, 8, 9 ], | |
1008 | b => 1, | |
1009 | ); | |
1010 | ||
1011 | { | |
1012 | my $a = delete local $hash{a}; | |
1013 | # $a is [ 7, 8, 9 ] | |
1014 | # %hash is (b => 1) | |
1015 | ||
1016 | { | |
1017 | my @nums = delete local @$a[0, 2]; | |
1018 | # @nums is (7, 9) | |
1019 | # $a is [ undef, 8 ] | |
1020 | ||
1021 | $a->[0] = 999; # will be erased when the scope ends | |
1022 | } | |
1023 | # $a is back to [ 7, 8, 9 ] | |
1024 | ||
1025 | } | |
1026 | # %hash is back to its original state | |
1027 | ||
cd06dffe | 1028 | =head2 Lvalue subroutines |
d74e8afc | 1029 | X<lvalue> X<subroutine, lvalue> |
cd06dffe | 1030 | |
cd06dffe GS |
1031 | It is possible to return a modifiable value from a subroutine. |
1032 | To do this, you have to declare the subroutine to return an lvalue. | |
1033 | ||
1034 | my $val; | |
1035 | sub canmod : lvalue { | |
4a904372 | 1036 | $val; # or: return $val; |
cd06dffe GS |
1037 | } |
1038 | sub nomod { | |
1039 | $val; | |
1040 | } | |
1041 | ||
1042 | canmod() = 5; # assigns to $val | |
1043 | nomod() = 5; # ERROR | |
1044 | ||
1045 | The scalar/list context for the subroutine and for the right-hand | |
1046 | side of assignment is determined as if the subroutine call were replaced | |
b77865f5 | 1047 | by a scalar. For example, consider: |
cd06dffe GS |
1048 | |
1049 | data(2,3) = get_data(3,4); | |
1050 | ||
1051 | Both subroutines here are called in a scalar context, while in: | |
1052 | ||
1053 | (data(2,3)) = get_data(3,4); | |
1054 | ||
1055 | and in: | |
1056 | ||
1057 | (data(2),data(3)) = get_data(3,4); | |
1058 | ||
1059 | all the subroutines are called in a list context. | |
1060 | ||
771cc755 | 1061 | Lvalue subroutines are convenient, but you have to keep in mind that, |
b77865f5 | 1062 | when used with objects, they may violate encapsulation. A normal |
771cc755 | 1063 | mutator can check the supplied argument before setting the attribute |
b77865f5 | 1064 | it is protecting; an lvalue subroutine cannot. If you require any | |
771cc755 JV |
1065 | special processing when storing and retrieving the values, consider |
1066 | using the CPAN module Sentinel or something similar. | |
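As a further sketch, an lvalue subroutine can take arguments and hand back an element of an aggregate. The accessor C<cell()> and the array C<@cells> are hypothetical names chosen for illustration.

```perl
use strict;
use warnings;

my @cells = (0, 0, 0);

# Hypothetical accessor: returns the array element itself, not a copy.
sub cell : lvalue {
    my ($index) = @_;
    $cells[$index];
}

cell(1) = 5;        # assigns to $cells[1]
cell(1) += 2;       # compound assignment works too
cell(2) = 1;
# @cells is now (0, 7, 1)
```

Note that nothing here validates the assigned value, which is the encapsulation concern described above.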
e6a32221 | 1067 | |
ca40957e FC |
1068 | =head2 Lexical Subroutines |
1069 | X<my sub> X<state sub> X<our sub> X<subroutine, lexical> | |
1070 | ||
ca40957e FC |
1071 | Beginning with Perl 5.18, you can declare a private subroutine with C<my> |
1072 | or C<state>. As with state variables, the C<state> keyword is only | |
1073 | available under C<use feature 'state'> or C<use 5.010> or higher. | |
1074 | ||
06c4bad0 FC |
1075 | Prior to Perl 5.26, lexical subroutines were deemed experimental and were |
1076 | available only under the C<use feature 'lexical_subs'> pragma. They also | |
1077 | produced a warning unless the "experimental::lexical_subs" warnings | |
1078 | category was disabled. | |
1079 | ||
ca40957e FC |
1080 | These subroutines are only visible within the block in which they are |
1081 | declared, and only after that declaration: | |
1082 | ||
06c4bad0 FC |
1083 | # Include these two lines if your code is intended to run under Perl |
1084 | # versions earlier than 5.26. | |
f1d34ca8 | 1085 | no warnings "experimental::lexical_subs"; |
ca40957e FC |
1086 | use feature 'lexical_subs'; |
1087 | ||
67bf5a37 | 1088 | foo(); # calls the package/global subroutine |
ca40957e | 1089 | state sub foo { |
67bf5a37 | 1090 | foo(); # also calls the package subroutine |
ca40957e | 1091 | } |
67bf5a37 LM |
1092 | foo(); # calls "state" sub |
1093 | my $ref = \&foo; # take a reference to "state" sub | |
ca40957e FC |
1094 | |
1095 | my sub bar { ... } | |
67bf5a37 | 1096 | bar(); # calls "my" sub |
ca40957e | 1097 | |
67bf5a37 | 1098 | You can't (directly) write a recursive lexical subroutine: |
ca40957e | 1099 | |
67bf5a37 LM |
1100 | # WRONG |
1101 | my sub baz { | |
1102 | baz(); | |
ca40957e FC |
1103 | } |
1104 | ||
67bf5a37 LM |
1105 | This example fails because C<baz()> refers to the package/global subroutine |
1106 | C<baz>, not the lexical subroutine currently being defined. | |
1107 | ||
1108 | The solution is to use L<C<__SUB__>|perlfunc/__SUB__>: | |
1109 | ||
1110 | my sub baz { | |
1111 | __SUB__->(); # calls itself | |
1112 | } | |
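A runnable instance of this pattern, assuming Perl 5.26 or later so that C<use v5.26> enables C<__SUB__> (the C<current_sub> feature) and lexical subroutines need no extra pragma. The C<factorial> name is illustrative.

```perl
use v5.26;
use warnings;

my sub factorial {
    my ($n) = @_;
    # __SUB__ is a reference to the sub currently running.
    return $n <= 1 ? 1 : $n * __SUB__->($n - 1);
}

my $result = factorial(5);      # 120
```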
1113 | ||
1114 | It is possible to predeclare a lexical subroutine. The C<sub foo {...}> | |
1115 | subroutine definition syntax respects any previous C<my sub;> or C<state sub;> | |
1116 | declaration. Using this to define recursive subroutines is a bad idea, | |
1117 | however: | |
1118 | ||
1119 | my sub baz; # predeclaration | |
1120 | sub baz { # define the "my" sub | |
1121 | baz(); # WRONG: calls itself, but leaks memory | |
1122 | } | |
1123 | ||
1124 | Just like C<< my $f; $f = sub { $f->() } >>, this example leaks memory. The | |
1125 | name C<baz> is a reference to the subroutine, and the subroutine uses the name | |
1126 | C<baz>; they keep each other alive (see L<perlref/Circular References>). | |
1127 | ||
ca40957e FC |
1128 | =head3 C<state sub> vs C<my sub> |
1129 | ||
1130 | What is the difference between "state" subs and "my" subs? Each time that | |
1131 | execution enters a block in which "my" subs are declared, a new copy of each | |
1132 | sub is created. "State" subroutines persist from one execution of the | |
1133 | containing block to the next. | |
1134 | ||
1135 | So, in general, "state" subroutines are faster. But "my" subs are | |
1136 | necessary if you want to create closures: | |
1137 | ||
ca40957e FC |
1138 | sub whatever { |
1139 | my $x = shift; | |
1140 | my sub inner { | |
1141 | ... do something with $x ... | |
1142 | } | |
1143 | inner(); | |
1144 | } | |
1145 | ||
1146 | In this example, a new C<$x> is created when C<whatever> is called, and | |
1147 | also a new C<inner>, which can see the new C<$x>. A "state" sub will only | |
1148 | see the C<$x> from the first call to C<whatever>. | |
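A concrete version of the C<my sub> case, again assuming Perl 5.26 or later. C<greet()> is a hypothetical stand-in for C<whatever>.

```perl
use v5.26;
use warnings;

sub greet {
    my $name = shift;
    my sub inner {              # re-created on each call to greet()
        return "hello, $name";
    }
    return inner();
}

my @greetings = (greet('Ann'), greet('Bob'));
# ("hello, Ann", "hello, Bob")
```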
1149 | ||
1150 | =head3 C<our> subroutines | |
1151 | ||
1152 | Like C<our $variable>, C<our sub> creates a lexical alias to the package | |
1153 | subroutine of the same name. | |
1154 | ||
1155 | The two main uses for this are to switch back to using the package sub | |
1156 | inside an inner scope: | |
1157 | ||
ca40957e FC |
1158 | sub foo { ... } |
1159 | ||
1160 | sub bar { | |
1161 | my sub foo { ... } | |
1162 | { | |
1163 | # need to use the outer foo here | |
1164 | our sub foo; | |
1165 | foo(); | |
1166 | } | |
1167 | } | |
1168 | ||
1169 | and to make a subroutine visible to other packages in the same scope: | |
1170 | ||
1171 | package MySneakyModule; | |
1172 | ||
ca40957e FC |
1173 | our sub do_something { ... } |
1174 | ||
1175 | sub do_something_with_caller { | |
1176 | package DB; | |
1177 | () = caller 1; # sets @DB::args | |
1178 | do_something(@args); # uses MySneakyModule::do_something | |
1179 | } | |
1180 | ||
cb1a09d0 | 1181 | =head2 Passing Symbol Table Entries (typeglobs) |
d74e8afc | 1182 | X<typeglob> X<*> |
cb1a09d0 | 1183 | |
19799a22 GS |
1184 | B<WARNING>: The mechanism described in this section was originally |
1185 | the only way to simulate pass-by-reference in older versions of | |
1186 | Perl. While it still works fine in modern versions, the new reference | |
1187 | mechanism is generally easier to work with. See below. | |
a0d0e21e LW |
1188 | |
1189 | Sometimes you don't want to pass the value of an array to a subroutine | |
1190 | but rather the name of it, so that the subroutine can modify the global | |
1191 | copy of it rather than working with a local copy. In perl you can | |
cb1a09d0 | 1192 | refer to all objects of a particular name by prefixing the name |
5f05dabc | 1193 | with a star: C<*foo>. This is often known as a "typeglob", because the |
a0d0e21e LW |
1194 | star on the front can be thought of as a wildcard match for all the |
1195 | funny prefix characters on variables and subroutines and such. | |
1196 | ||
55497cff | 1197 | When evaluated, the typeglob produces a scalar value that represents |
5f05dabc | 1198 | all the objects of that name, including any filehandle, format, or |
a0d0e21e | 1199 | subroutine. When assigned to, it causes the name mentioned to refer to |
19799a22 | 1200 | whatever C<*> value was assigned to it. Example: |
a0d0e21e LW |
1201 | |
1202 | sub doubleary { | |
1203 | local(*someary) = @_; | |
1204 | foreach $elem (@someary) { | |
1205 | $elem *= 2; | |
1206 | } | |
1207 | } | |
1208 | doubleary(*foo); | |
1209 | doubleary(*bar); | |
1210 | ||
19799a22 | 1211 | Scalars are already passed by reference, so you can modify |
a0d0e21e | 1212 | scalar arguments without using this mechanism by referring explicitly |
1fef88e7 | 1213 | to C<$_[0]> etc. You can modify all the elements of an array by passing |
f86cebdf GS |
1214 | all the elements as scalars, but you have to use the C<*> mechanism (or |
1215 | the equivalent reference mechanism) to C<push>, C<pop>, or change the size of | |
a0d0e21e LW |
1216 | an array. It will certainly be faster to pass the typeglob (or reference). |
1217 | ||
1218 | Even if you don't want to modify an array, this mechanism is useful for | |
5f05dabc | 1219 | passing multiple arrays in a single LIST, because normally the LIST |
a0d0e21e | 1220 | mechanism will merge all the array values so that you can't extract out |
55497cff | 1221 | the individual arrays. For more on typeglobs, see |
2ae324a7 | 1222 | L<perldata/"Typeglobs and Filehandles">. |
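For comparison, here is a sketch of the reference-based equivalent of C<doubleary()>. The name C<double_array()> is hypothetical. Passing C<\@foo> lets the subroutine both modify elements and change the size of the caller's array.

```perl
use strict;
use warnings;

# Reference-based counterpart of doubleary().
sub double_array {
    my ($aref) = @_;
    $_ *= 2 for @$aref;             # modifies the caller's array in place
    push @$aref, scalar @$aref;     # resizing works too: append the old length
    return;
}

my @foo = (1, 2, 3);
double_array(\@foo);
# @foo is now (2, 4, 6, 3)
```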
cb1a09d0 | 1223 | |
5a964f20 | 1224 | =head2 When to Still Use local() |
d74e8afc | 1225 | X<local> X<variable, local> |
5a964f20 | 1226 | |
19799a22 GS |
1227 | Despite the existence of C<my>, there are still three places where the |
1228 | C<local> operator shines. In fact, in these three places, you | |
5a964f20 TC |
1229 | I<must> use C<local> instead of C<my>. |
1230 | ||
13a2d996 | 1231 | =over 4 |
5a964f20 | 1232 | |
551e1d92 RB |
1233 | =item 1. |
1234 | ||
1235 | You need to give a global variable a temporary value, especially $_. | |
5a964f20 | 1236 | |
f86cebdf GS |
1237 | The global variables, like C<@ARGV> or the punctuation variables, must be |
1238 | C<local>ized with C<local()>. This block reads in F</etc/motd>, and splits | |
5a964f20 | 1239 | it up into chunks separated by lines of equal signs, which are placed |
f86cebdf | 1240 | in C<@Fields>. |
5a964f20 TC |
1241 | |
1242 | { | |
1243 | local @ARGV = ("/etc/motd"); | |
1244 | local $/ = undef; | |
1245 | local $_ = <>; | |
1246 | @Fields = split /^\s*=+\s*$/; | |
1247 | } | |
1248 | ||
19799a22 | 1249 | In particular, it's important to C<local>ize $_ in any routine that assigns | |
5a964f20 TC |
1250 | to it. Look out for implicit assignments in C<while> conditionals. |
1251 | ||
551e1d92 RB |
1252 | =item 2. |
1253 | ||
1254 | You need to create a local file or directory handle or a local function. | |
5a964f20 | 1255 | |
09bef843 SB |
1256 | A function that needs a filehandle of its own must use |
1257 | C<local()> on a complete typeglob. This can be used to create new symbol | |
5a964f20 TC |
1258 | table entries: |
1259 | ||
1260 | sub ioqueue { | |
1261 | local (*READER, *WRITER); # not my! | |
17b63f68 | 1262 | pipe (READER, WRITER) or die "pipe: $!"; |
5a964f20 TC |
1263 | return (*READER, *WRITER); |
1264 | } | |
1265 | ($head, $tail) = ioqueue(); | |
1266 | ||
1267 | See the Symbol module for a way to create anonymous symbol table | |
1268 | entries. | |
1269 | ||
1270 | Because assignment of a reference to a typeglob creates an alias, this | |
1271 | can be used to create what is effectively a local function, or at least, | |
1272 | a local alias. | |
1273 | ||
1274 | { | |
4a46e268 | 1275 | local *grow = \&shrink; # only until this block exits |
555bd962 BG |
1276 | grow(); # really calls shrink() |
1277 | move(); # if move() grow()s, it shrink()s too | |
5a964f20 | 1278 | } |
555bd962 | 1279 | grow(); # get the real grow() again |
5a964f20 TC |
1280 | |
1281 | See L<perlref/"Function Templates"> for more about manipulating | |
1282 | functions by name in this way. | |
1283 | ||
551e1d92 RB |
1284 | =item 3. |
1285 | ||
1286 | You want to temporarily change just one element of an array or hash. | |
5a964f20 | 1287 | |
f86cebdf | 1288 | You can C<local>ize just one element of an aggregate. Usually this |
5a964f20 TC |
1289 | is done on dynamics: |
1290 | ||
1291 | { | |
1292 | local $SIG{INT} = 'IGNORE'; | |
1293 | funct(); # uninterruptible | |
1294 | } | |
1295 | # interruptibility automatically restored here | |
1296 | ||
9d42615f | 1297 | But it also works on lexically declared aggregates. |
5a964f20 TC |
1298 | |
1299 | =back | |
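The last point in the list above, localizing one element of a lexically declared aggregate, can be sketched like this. C<%config> and C<status()> are illustrative names.

```perl
use strict;
use warnings;

my %config = (verbose => 0);            # a lexical aggregate

sub status { return "verbose=$config{verbose}" }

my $during;
{
    local $config{verbose} = 1;         # temporary value for one element
    $during = status();                 # "verbose=1"
}
my $after = status();                   # "verbose=0"
```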
1300 | ||
cb1a09d0 | 1301 | =head2 Pass by Reference |
d74e8afc | 1302 | X<pass by reference> X<pass-by-reference> X<reference> |
cb1a09d0 | 1303 | |
55497cff | 1304 | If you want to pass more than one array or hash into a function--or |
1305 | return them from it--and have them maintain their integrity, then | |
1306 | you're going to have to use an explicit pass-by-reference. Before you | |
1307 | do that, you need to understand references as detailed in L<perlref>. | |
c07a80fd | 1308 | This section may not make much sense to you otherwise. |
cb1a09d0 | 1309 | |
19799a22 GS |
1310 | Here are a few simple examples. First, let's pass in several arrays |
1311 | to a function and have it C<pop> all of them, returning a new list | |
1312 | of all their former last elements: | |
cb1a09d0 AD |
1313 | |
1314 | @tailings = popmany ( \@a, \@b, \@c, \@d ); | |
1315 | ||
1316 | sub popmany { | |
1317 | my $aref; | |
8b7906d1 | 1318 | my @retlist; |
cb1a09d0 AD |
1319 | foreach $aref ( @_ ) { |
1320 | push @retlist, pop @$aref; | |
54310121 | 1321 | } |
cb1a09d0 | 1322 | return @retlist; |
54310121 | 1323 | } |
cb1a09d0 | 1324 | |
54310121 | 1325 | Here's how you might write a function that returns a |
cb1a09d0 AD |
1326 | list of keys occurring in all the hashes passed to it: |
1327 | ||
54310121 | 1328 | @common = inter( \%foo, \%bar, \%joe ); |
cb1a09d0 AD |
1329 | sub inter { |
1330 | my ($k, $href, %seen); # locals | |
1331 | foreach $href (@_) { | |
1332 | while ( $k = each %$href ) { | |
1333 | $seen{$k}++; | |
54310121 | 1334 | } |
1335 | } | |
cb1a09d0 | 1336 | return grep { $seen{$_} == @_ } keys %seen; |
54310121 | 1337 | } |
cb1a09d0 | 1338 | |
5f05dabc | 1339 | So far, we're using just the normal list return mechanism. |
54310121 | 1340 | What happens if you want to pass or return a hash? Well, |
1341 | if you're using only one of them, or you don't mind them | |
cb1a09d0 | 1342 | concatenating, then the normal calling convention is ok, although |
54310121 | 1343 | a little expensive. |
cb1a09d0 AD |
1344 | |
1345 | Where people get into trouble is here: | |
1346 | ||
1347 | (@a, @b) = func(@c, @d); | |
1348 | or | |
1349 | (%a, %b) = func(%c, %d); | |
1350 | ||
19799a22 GS |
1351 | That syntax simply won't work. It sets just C<@a> or C<%a> and |
1352 | clears the C<@b> or C<%b>. Plus the function wasn't passed | |
1353 | two separate arrays or hashes: it got one long list in C<@_>, | |
1354 | as always. | |
cb1a09d0 AD |
1355 | |
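To make the flattening concrete, here is a small sketch. The helper C<count_args> is hypothetical and only reports how many arguments reached C<@_>; the arrays C<@x> and C<@y> stand in for C<@a> and C<@b> above.

```perl
use strict;
use warnings;

# Hypothetical helper: reports how many arguments arrived in @_.
sub count_args { return scalar @_ }

my @c = (1, 2, 3);
my @d = (4, 5);

my $flat = count_args(@c, @d);      # both arrays flattened into one list
my $refs = count_args(\@c, \@d);    # two references, structure preserved

my (@x, @y) = (@c, @d);             # @x takes everything, @y stays empty

print "$flat $refs ", scalar(@x), " ", scalar(@y), "\n";   # prints "5 2 5 0"
```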
1356 | If you can arrange for everyone to deal with this through references, it's | |
1357 | cleaner code, although not so nice to look at. Here's a function that | |
1358 | takes two array references as arguments, returning the two array references | |
1359 | in order of how many elements the arrays have in them: | |
1360 | ||
1361 | ($aref, $bref) = func(\@c, \@d); | |
1362 | print "@$aref has more than @$bref\n"; | |
1363 | sub func { | |
1364 | my ($cref, $dref) = @_; | |
1365 | if (@$cref > @$dref) { | |
1366 | return ($cref, $dref); | |
1367 | } else { | |
c07a80fd | 1368 | return ($dref, $cref); |
54310121 | 1369 | } |
1370 | } | |
cb1a09d0 AD |
1371 | |
1372 | It turns out that you can actually do this also: | |
1373 | ||
1374 | (*a, *b) = func(\@c, \@d); | |
1375 | print "@a has more than @b\n"; | |
1376 | sub func { | |
1377 | local (*c, *d) = @_; | |
1378 | if (@c > @d) { | |
1379 | return (\@c, \@d); | |
1380 | } else { | |
1381 | return (\@d, \@c); | |
54310121 | 1382 | } |
1383 | } | |
cb1a09d0 AD |
1384 | |
1385 | Here we're using the typeglobs to do symbol table aliasing. It's | |
19799a22 | 1386 | a tad subtle, though, and also won't work if you're using C<my> |
09bef843 | 1387 | variables, because only globals (even in disguise as C<local>s) |
19799a22 | 1388 | are in the symbol table. |
5f05dabc | 1389 | |
1390 | If you're passing around filehandles, you could usually just use the bare | |
19799a22 GS |
1391 | typeglob, like C<*STDOUT>, but typeglob references work, too. | |
1392 | For example: | |
5f05dabc | 1393 | |
1394 | splutter(\*STDOUT); | |
1395 | sub splutter { | |
1396 | my $fh = shift; | |
1397 | print $fh "her um well a hmmm\n"; | |
1398 | } | |
1399 | ||
1400 | $rec = get_rec(\*STDIN); | |
1401 | sub get_rec { | |
1402 | my $fh = shift; | |
1403 | return scalar <$fh>; | |
1404 | } | |
1405 | ||
19799a22 GS |
1406 | If you're planning on generating new filehandles, you could do this. |
1407 | Note that it passes back just the bare *FH, not a reference to it. | |
5f05dabc | 1408 | |
1409 | sub openit { | |
19799a22 | 1410 | my $path = shift; |
5f05dabc | 1411 | local *FH; |
e05a3a1e | 1412 | return open (FH, $path) ? *FH : undef; |
54310121 | 1413 | } |
5f05dabc | 1414 | |
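Where lexical filehandles are available (Perl 5.6 and later), a common alternative to C<local *FH> is to let C<open> autovivify a handle into a C<my> variable and return that. The following is only a sketch: C<openit_lexical> and C<first_line> are hypothetical names, and the demo reads from an in-memory string (which needs Perl 5.8 or later) so that it does not depend on any file existing.

```perl
use strict;
use warnings;

# Hypothetical variant of openit(): returns a lexical handle, or nothing
# on failure.
sub openit_lexical {
    my $path = shift;
    open(my $fh, '<', $path) or return;
    return $fh;
}

# The handle is an ordinary scalar, so it is passed like any other argument.
sub first_line {
    my $fh = shift;
    return scalar <$fh>;
}

# For a self-contained demo, "open" an in-memory string instead of a file
# by passing a scalar reference as the path.
my $text = "alpha\nbeta\n";
my $fh   = openit_lexical(\$text) or die "cannot open: $!";
my $line = first_line($fh);
print $line;    # prints "alpha"
```

A handle created this way is closed automatically when the last reference to it goes away.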
cb1a09d0 | 1415 | =head2 Prototypes |
d74e8afc | 1416 | X<prototype> X<subroutine, prototype> |
cb1a09d0 | 1417 | |
19799a22 | 1418 | Perl supports a very limited kind of compile-time argument checking |
eedb00fa PM |
1419 | using function prototyping. This can be declared in either the PROTO |
1420 | section or with a L<prototype attribute|attributes/Built-in Attributes>. | |
30d9c59b | 1421 | If you declare either of |
cb1a09d0 | 1422 | |
26230909 AC |
1423 | sub mypush (\@@) |
1424 | sub mypush :prototype(\@@) | |
30d9c59b Z |
1425 | |
1426 | then C<mypush()> takes arguments exactly like C<push()> does. | |
1427 | ||
1428 | If subroutine signatures are enabled (see L</Signatures>), then | |
1429 | the shorter PROTO syntax is unavailable, because it would clash with | |
1430 | signatures. In that case, a prototype can only be declared in the form | |
1431 | of an attribute. | |
cb1a09d0 | 1432 | |
30d9c59b | 1433 | The |
19799a22 GS |
1434 | function declaration must be visible at compile time. The prototype |
1435 | affects only interpretation of new-style calls to the function, | |
1436 | where new-style is defined as not using the C<&> character. In | |
1437 | other words, if you call it like a built-in function, then it behaves | |
1438 | like a built-in function. If you call it like an old-fashioned | |
1439 | subroutine, then it behaves like an old-fashioned subroutine. It | |
1440 | naturally falls out from this rule that prototypes have no influence | |
1441 | on subroutine references like C<\&foo> or on indirect subroutine | |
c47ff5f1 | 1442 | calls like C<&{$subref}> or C<< $subref->() >>. |
c07a80fd | 1443 | |
1444 | Method calls are not influenced by prototypes either, because the | |
19799a22 GS |
1445 | function to be called is indeterminate at compile time, since |
1446 | the exact code called depends on inheritance. | |
cb1a09d0 | 1447 | |
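The difference between new-style and old-style calls can be sketched as follows. This C<mypush> is an illustrative implementation, not one taken from the text above; note that in the calls that bypass the prototype the caller has to supply the reference explicitly.

```perl
use strict;
use warnings;

# Illustrative body for the mypush (\@@) declaration discussed above.
sub mypush (\@@) {
    my $aref = shift;
    push @$aref, @_;
    return scalar @$aref;
}

my @stack = (1);

mypush @stack, 2, 3;      # new-style call: prototype applies, \@stack is passed
&mypush(\@stack, 4);      # old-style call: prototype ignored, pass the ref yourself

my $code = \&mypush;
$code->(\@stack, 5);      # indirect call: prototype ignored as well

print "@stack\n";         # prints "1 2 3 4 5"
```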
19799a22 GS |
1448 | Because the intent of this feature is primarily to let you define |
1449 | subroutines that work like built-in functions, here are prototypes | |
1450 | for some other functions that parse almost exactly like the | |
1451 | corresponding built-in. | |
cb1a09d0 | 1452 | |
555bd962 BG |
1453 | Declared as Called as |
1454 | ||
1455 | sub mylink ($$) mylink $old, $new | |
1456 | sub myvec ($$$) myvec $var, $offset, 1 | |
1457 | sub myindex ($$;$) myindex &getstring, "substr" | |
1458 | sub mysyswrite ($$$;$) mysyswrite $buf, 0, length($buf) - $off, $off | |
1459 | sub myreverse (@) myreverse $a, $b, $c | |
1460 | sub myjoin ($@) myjoin ":", $a, $b, $c | |
26230909 AC |
1461 | sub mypop (\@) mypop @array |
1462 | sub mysplice (\@$$@) mysplice @array, 0, 2, @pushme | |
1463 | sub mykeys (\[%@]) mykeys %{$hashref} | |
555bd962 BG |
1464 | sub myopen (*;$) myopen HANDLE, $name |
1465 | sub mypipe (**) mypipe READHANDLE, WRITEHANDLE | |
1466 | sub mygrep (&@) mygrep { /foo/ } $a, $b, $c | |
1467 | sub myrand (;$) myrand 42 | |
1468 | sub mytime () mytime | |
cb1a09d0 | 1469 | |
c07a80fd | 1470 | Any backslashed prototype character represents an actual argument |
ae7a3cfa | 1471 | that must start with that character (optionally preceded by C<my>, |
b91b7d1a FC |
1472 | C<our> or C<local>), with the exception of C<$>, which will |
1473 | accept any scalar lvalue expression, such as C<$foo = 7> or | |
b77865f5 | 1474 | C<< my_function()->[0] >>. The value passed as part of C<@_> will be a |
ae7a3cfa FC |
1475 | reference to the actual argument given in the subroutine call, |
1476 | obtained by applying C<\> to that argument. | |
c07a80fd | 1477 | |
c035a075 | 1478 | You can use the C<\[]> backslash group notation to specify more than one |
b77865f5 | 1479 | allowed argument type. For example: |
5b794e05 JH |
1480 | |
1481 | sub myref (\[$@%&*]) | |
1482 | ||
1483 | will allow calling myref() as | |
1484 | ||
1485 | myref $var | |
1486 | myref @array | |
1487 | myref %hash | |
1488 | myref &sub | |
1489 | myref *glob | |
1490 | ||
1491 | and the first argument of myref() will be a reference to | |
1492 | a scalar, an array, a hash, a subroutine, or a glob. | |
1493 | ||
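A short sketch of what the subroutine actually receives with a backslash group. The function name C<kind_of> is hypothetical; it simply reports the type of the reference it was handed.

```perl
use strict;
use warnings;

# Hypothetical function: accepts a scalar, array or hash variable and
# returns the type of the reference that the prototype produced.
sub kind_of (\[$@%]) { return ref $_[0] }

my ($scalar, @array, %hash);

my @kinds = (kind_of($scalar), kind_of(@array), kind_of(%hash));
print "@kinds\n";    # prints "SCALAR ARRAY HASH"
```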
c07a80fd | 1494 | Unbackslashed prototype characters have special meanings. Any |
19799a22 | 1495 | unbackslashed C<@> or C<%> eats all remaining arguments, and forces |
f86cebdf GS |
1496 | list context. An argument represented by C<$> forces scalar context. An |
1497 | C<&> requires an anonymous subroutine, which, if passed as the first | |
0df79f0c GS |
1498 | argument, does not require the C<sub> keyword or a subsequent comma. |
1499 | ||
1500 | A C<*> allows the subroutine to accept a bareword, constant, scalar expression, | |
648ca4f7 GS |
1501 | typeglob, or a reference to a typeglob in that slot. The value will be |
1502 | available to the subroutine either as a simple scalar, or (in the latter | |
0df79f0c GS |
1503 | two cases) as a reference to the typeglob. If you wish to always convert |
1504 | such arguments to a typeglob reference, use Symbol::qualify_to_ref() as | |
1505 | follows: | |
1506 | ||
1507 | use Symbol 'qualify_to_ref'; | |
1508 | ||
1509 | sub foo (*) { | |
1510 | my $fh = qualify_to_ref(shift, caller); | |
1511 | ... | |
1512 | } | |
c07a80fd | 1513 | |
c035a075 DG |
1514 | The C<+> prototype is a special alternative to C<$> that will act like |
1515 | C<\[@%]> when given a literal array or hash variable, but will otherwise | |
1516 | force scalar context on the argument. This is useful for functions which | |
1517 | should accept either a literal array or an array reference as the argument: | |
1518 | ||
cba5a3b0 | 1519 | sub mypush (+@) { |
c035a075 DG |
1520 | my $aref = shift; |
1521 | die "Not an array or arrayref" unless ref $aref eq 'ARRAY'; | |
1522 | push @$aref, @_; | |
1523 | } | |
1524 | ||
1525 | When using the C<+> prototype, your function must check that the argument | |
1526 | is of an acceptable type. | |
1527 | ||
859a4967 | 1528 | A semicolon (C<;>) separates mandatory arguments from optional arguments. |
19799a22 | 1529 | It is redundant before C<@> or C<%>, which gobble up everything else. |
cb1a09d0 | 1530 | |
34daab0f RGS |
1531 | As the last character of a prototype, or just before a semicolon, a C<@> |
1532 | or a C<%>, you can use C<_> in place of C<$>: if this argument is not | |
1533 | provided, C<$_> will be used instead. | |
859a4967 | 1534 | |
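For instance, a function with a C<_> prototype can be called with or without an argument. The name C<shout> below is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical function: upper-cases its argument, or $_ if none is given.
sub shout (_) { return uc $_[0] }

my @out;
for ('abc', 'def') {
    push @out, shout();      # no argument supplied, so $_ is passed
}
push @out, shout('xyz');     # explicit argument

print "@out\n";              # prints "ABC DEF XYZ"
```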
19799a22 GS |
1535 | Note how the last three examples in the table above are treated |
1536 | specially by the parser. C<mygrep()> is parsed as a true list | |
1537 | operator, C<myrand()> is parsed as a true unary operator with unary | |
1538 | precedence the same as C<rand()>, and C<mytime()> is truly without | |
1539 | arguments, just like C<time()>. That is, if you say | |
cb1a09d0 AD |
1540 | |
1541 | mytime +2; | |
1542 | ||
f86cebdf | 1543 | you'll get C<mytime() + 2>, not C<mytime(2)>, which is how it would be parsed |
3a8944db FC |
1544 | without a prototype. If you want to force a unary function to have the |
1545 | same precedence as a list operator, add C<;> to the end of the prototype: | |
1546 | ||
1547 | sub mygetprotobynumber($;); | |
1548 | mygetprotobynumber $a > $b; # parsed as mygetprotobynumber($a > $b) | |
cb1a09d0 | 1549 | |
19799a22 GS |
1550 | The interesting thing about C<&> is that you can generate new syntax with it, |
1551 | provided it's in the initial position: | |
d74e8afc | 1552 | X<&> |
cb1a09d0 | 1553 | |
6d28dffb | 1554 | sub try (&@) { |
cb1a09d0 AD |
1555 | my($try,$catch) = @_; |
1556 | eval { &$try }; | |
1557 | if ($@) { | |
1558 | local $_ = $@; | |
1559 | &$catch; | |
1560 | } | |
1561 | } | |
55497cff | 1562 | sub catch (&) { $_[0] } |
cb1a09d0 AD |
1563 | |
1564 | try { | |
1565 | die "phooey"; | |
1566 | } catch { | |
1567 | /phooey/ and print "unphooey\n"; | |
1568 | }; | |
1569 | ||
f86cebdf | 1570 | That prints C<"unphooey">. (Yes, there are still unresolved |
19799a22 | 1571 | issues having to do with visibility of C<@_>. I'm ignoring that |
f86cebdf | 1572 | question for the moment. (But note that if we make C<@_> lexically |
cb1a09d0 | 1573 | scoped, those anonymous subroutines can act like closures... (Gee, |
5f05dabc | 1574 | is this sounding a little Lispish? (Never mind.)))) |
cb1a09d0 | 1575 | |
19799a22 | 1576 | And here's a reimplementation of the Perl C<grep> operator: |
d74e8afc | 1577 | X<grep> |
cb1a09d0 AD |
1578 | |
1579 | sub mygrep (&@) { | |
1580 | my $code = shift; | |
1581 | my @result; | |
1582 | foreach $_ (@_) { | |
6e47f808 | 1583 | push(@result, $_) if &$code; |
cb1a09d0 AD |
1584 | } |
1585 | @result; | |
1586 | } | |
a0d0e21e | 1587 | |
cb1a09d0 AD |
1588 | Some folks would prefer full alphanumeric prototypes. Alphanumerics have |
1589 | been intentionally left out of prototypes for the express purpose of | |
1590 | someday in the future adding named, formal parameters. The current | |
1591 | mechanism's main goal is to let module writers provide better diagnostics | |
1592 | for module users. Larry feels the notation quite understandable to Perl | |
1593 | programmers, and that it will not intrude greatly upon the meat of the | |
1594 | module, nor make it harder to read. The line noise is visually | |
1595 | encapsulated into a small pill that's easy to swallow. | |
1596 | ||
420cdfc1 ST |
1597 | If you try to use an alphanumeric sequence in a prototype you will |
1598 | generate an optional warning - "Illegal character in prototype...". | |
1599 | Unfortunately earlier versions of Perl allowed the prototype to be | |
1600 | used as long as its prefix was a valid prototype. The warning may be | |
1601 | upgraded to a fatal error in a future version of Perl once the | |
1602 | majority of offending code is fixed. | |
1603 | ||
cb1a09d0 AD |
1604 | It's probably best to prototype new functions, not retrofit prototyping |
1605 | into older ones. That's because you must be especially careful about | |
1606 | silent impositions of differing list versus scalar contexts. For example, | |
1607 | if you decide that a function should take just one parameter, like this: | |
1608 | ||
1609 | sub func ($) { | |
1610 | my $n = shift; | |
1611 | print "you gave me $n\n"; | |
54310121 | 1612 | } |
cb1a09d0 AD |
1613 | |
1614 | and someone has been calling it with an array or expression | |
1615 | returning a list: | |
1616 | ||
1617 | func(@foo); | |
f2606479 | 1618 | func( $text =~ /\w+/g ); |
cb1a09d0 | 1619 | |
19799a22 | 1620 | Then you've just supplied an automatic C<scalar> in front of their |
f86cebdf | 1621 | argument, which can be more than a bit surprising. The old C<@foo> |
cb1a09d0 | 1622 | which used to hold one thing doesn't get passed in. Instead, |
19799a22 | 1623 | C<func()> now gets passed in a C<1>; that is, the number of elements |
f2606479 LM |
1624 | in C<@foo>. And the C<m//g> gets called in scalar context so instead of a |
1625 | list of words it returns a boolean result and advances C<pos($text)>. Ouch! | |
cb1a09d0 | 1626 | |
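The change in behavior is easy to reproduce. In this sketch C<plain> and C<one_arg> are hypothetical names for the same function without and with the C<($)> prototype.

```perl
use strict;
use warnings;

sub plain       { return "got: @_" }   # no prototype: list context
sub one_arg ($) { return "got: @_" }   # ($) imposes scalar context

my @foo = ('only');

my $before = plain(@foo);      # the element itself is passed
my $after  = one_arg(@foo);    # scalar(@foo), the element count, is passed

print "$before\n";    # prints "got: only"
print "$after\n";     # prints "got: 1"
```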
eb40d2ca PM |
1627 | If a sub has both a PROTO and a BLOCK, the prototype is not applied |
1628 | until after the BLOCK is completely defined. This means that a recursive | |
1629 | function with a prototype has to be predeclared for the prototype to take | |
1630 | effect, like so: | |
1631 | ||
1632 | sub foo($$); | |
1633 | sub foo($$) { | |
1634 | foo 1, 2; | |
1635 | } | |
1636 | ||
5f05dabc | 1637 | This is all very powerful, of course, and should be used only in moderation |
54310121 | 1638 | to make the world a better place. |
44a8e56a | 1639 | |
1640 | =head2 Constant Functions | |
d74e8afc | 1641 | X<constant> |
44a8e56a | 1642 | |
1643 | Functions with a prototype of C<()> are potential candidates for | |
19799a22 GS |
1644 | inlining. If the result after optimization and constant folding |
1645 | is either a constant or a lexically-scoped scalar which has no other | |
54310121 | 1646 | references, then it will be used in place of function calls made |
19799a22 GS |
1647 | without C<&>. Calls made using C<&> are never inlined. (See |
1648 | F<constant.pm> for an easy way to declare most constants.) | |
44a8e56a | 1649 | |
5a964f20 | 1650 | The following functions would all be inlined: |
44a8e56a | 1651 | |
699e6cd4 TP |
1652 | sub pi () { 3.14159 } # Not exact, but close. |
1653 | sub PI () { 4 * atan2 1, 1 } # As good as it gets, | |
1654 | # and it's inlined, too! | |
44a8e56a | 1655 | sub ST_DEV () { 0 } |
1656 | sub ST_INO () { 1 } | |
1657 | ||
1658 | sub FLAG_FOO () { 1 << 8 } | |
1659 | sub FLAG_BAR () { 1 << 9 } | |
1660 | sub FLAG_MASK () { FLAG_FOO | FLAG_BAR } | |
54310121 | 1661 | |
1662 | sub OPT_BAZ () { not (0x1B58 & FLAG_MASK) } | |
88267271 PZ |
1663 | |
1664 | sub N () { int(OPT_BAZ) / 3 } | |
1665 | ||
1666 | sub FOO_SET () { 1 if FLAG_MASK & FLAG_FOO } | |
d3c633ba | 1667 | sub FOO_SET2 () { if (FLAG_MASK & FLAG_FOO) { 1 } } |
88267271 | 1668 | |
d3c633ba FC |
1669 | (Be aware that the last example was not always inlined in Perl 5.20 and |
1670 | earlier, which did not behave consistently with subroutines containing | |
1671 | inner scopes.) You can countermand inlining by using an explicit | |
1672 | C<return>: | |
88267271 PZ |
1673 | |
1674 | sub baz_val () { | |
44a8e56a | 1675 | if (OPT_BAZ) { |
1676 | return 23; | |
1677 | } | |
1678 | else { | |
1679 | return 42; | |
1680 | } | |
1681 | } | |
d3c633ba | 1682 | sub bonk_val () { return 12345 } |
cb1a09d0 | 1683 | |
fe39f0d5 AB |
1684 | As alluded to earlier you can also declare inlined subs dynamically at |
1685 | BEGIN time if their body consists of a lexically-scoped scalar which | |
b77865f5 | 1686 | has no other references. Only the first example here will be inlined: |
fe39f0d5 AB |
1687 | |
1688 | BEGIN { | |
1689 | my $var = 1; | |
1690 | no strict 'refs'; | |
1691 | *INLINED = sub () { $var }; | |
1692 | } | |
1693 | ||
1694 | BEGIN { | |
1695 | my $var = 1; | |
1696 | my $ref = \$var; | |
1697 | no strict 'refs'; | |
1698 | *NOT_INLINED = sub () { $var }; | |
1699 | } | |
1700 | ||
1701 | A not so obvious caveat with this (see [RT #79908]) is that the | |
1702 | variable will be immediately inlined, and will stop behaving like a | |
1703 | normal lexical variable, e.g. this will print C<79907>, not C<79908>: | |
1704 | ||
1705 | BEGIN { | |
1706 | my $x = 79907; | |
1707 | *RT_79908 = sub () { $x }; | |
1708 | $x++; | |
1709 | } | |
1710 | print RT_79908(); # prints 79907 | |
1711 | ||
d3c633ba FC |
1712 | As of Perl 5.22, this buggy behavior, while preserved for backward |
1713 | compatibility, is detected and emits a deprecation warning. If you want | |
1714 | the subroutine to be inlined (with no warning), make sure the variable is | |
1715 | not used in a context where it could be modified aside from where it is | |
1716 | declared. | |
1717 | ||
1718 | # Fine, no warning | |
1719 | BEGIN { | |
1720 | my $x = 54321; | |
1721 | *INLINED = sub () { $x }; | |
1722 | } | |
1723 | # Warns. Future Perl versions will stop inlining it. | |
1724 | BEGIN { | |
1725 | my $x; | |
1726 | $x = 54321; | |
1727 | *ALSO_INLINED = sub () { $x }; | |
1728 | } | |
1729 | ||
99734069 FC |
1730 | Perl 5.22 also introduces the experimental "const" attribute as an |
1731 | alternative. (Disable the "experimental::const_attr" warnings if you want | |
1732 | to use it.) When applied to an anonymous subroutine, it forces the sub to | |
1733 | be called when the C<sub> expression is evaluated. The return value is | |
1734 | captured and turned into a constant subroutine: | |
1735 | ||
1736 | my $x = 54321; | |
1737 | *INLINED = sub : const { $x }; | |
1738 | $x++; | |
1739 | ||
1740 | The return value of C<INLINED> in this example will always be 54321, | |
1741 | regardless of later modifications to C<$x>. You can also put any arbitrary | |
1742 | code inside the sub, as it will be executed immediately and its return | |
1743 | value captured the same way. | |
1744 | ||
fe39f0d5 AB |
1745 | If you really want a subroutine with a C<()> prototype that returns a |
1746 | lexical variable you can easily force it to not be inlined by adding | |
1747 | an explicit C<return>: | |
1748 | ||
1749 | BEGIN { | |
1750 | my $x = 79907; | |
1751 | *RT_79908 = sub () { return $x }; | |
1752 | $x++; | |
1753 | } | |
1754 | print RT_79908(); # prints 79908 | |
1755 | ||
1756 | The easiest way to tell if a subroutine was inlined is by using | |
d3c633ba | 1757 | L<B::Deparse>. Consider this example of two subroutines returning |
fe39f0d5 AB |
1758 | C<1>, one with a C<()> prototype causing it to be inlined, and one |
1759 | without (with deparse output truncated for clarity): | |
1760 | ||
cb07e2f2 KW |
1761 | $ perl -MO=Deparse -le 'sub ONE { 1 } if (ONE) { print ONE if ONE }' |
1762 | sub ONE { | |
1763 | 1; | |
1764 | } | |
1765 | if (ONE ) { | |
1766 | print ONE() if ONE ; | |
1767 | } | |
1768 | $ perl -MO=Deparse -le 'sub ONE () { 1 } if (ONE) { print ONE if ONE }' | |
1769 | sub ONE () { 1 } | |
1770 | do { | |
1771 | print 1 | |
1772 | }; | |
fe39f0d5 AB |
1773 | |
1774 | If you redefine a subroutine that was eligible for inlining, you'll | |
b77865f5 | 1775 | get a warning by default. You can use this warning to tell whether or |
fe39f0d5 AB |
1776 | not a particular subroutine is considered inlinable, since it's |
1777 | different than the warning for overriding non-inlined subroutines: | |
1778 | ||
1779 | $ perl -e 'sub one () {1} sub one () {2}' | |
1780 | Constant subroutine one redefined at -e line 1. | |
1781 | $ perl -we 'sub one {1} sub one {2}' | |
1782 | Subroutine one redefined at -e line 1. | |
1783 | ||
1784 | The warning is considered severe enough not to be affected by the | |
1785 | B<-w> switch (or its absence) because previously compiled invocations | |
1786 | of the function will still be using the old value of the function. If | |
1787 | you need to be able to redefine the subroutine, you need to ensure | |
1788 | that it isn't inlined, either by dropping the C<()> prototype (which | |
1789 | changes calling semantics, so beware) or by thwarting the inlining | |
d3c633ba FC |
1790 | mechanism in some other way, e.g. by adding an explicit C<return>, as |
1791 | mentioned above: | |
fe39f0d5 AB |
1792 | |
1793 | sub not_inlined () { return 23 } | |
4cee8e80 | 1794 | |
19799a22 | 1795 | =head2 Overriding Built-in Functions |
d74e8afc | 1796 | X<built-in> X<override> X<CORE> X<CORE::GLOBAL> |
a0d0e21e | 1797 | |
19799a22 | 1798 | Many built-in functions may be overridden, though this should be tried |
5f05dabc | 1799 | only occasionally and for good reason. Typically this might be |
19799a22 | 1800 | done by a package attempting to emulate missing built-in functionality |
a0d0e21e LW |
1801 | on a non-Unix system. |
1802 | ||
163e3a99 JP |
1803 | Overriding may be done only by importing the name from a module at |
1804 | compile time--ordinary predeclaration isn't good enough. However, the | |
19799a22 GS |
1805 | C<use subs> pragma lets you, in effect, predeclare subs |
1806 | via the import syntax, and these names may then override built-in ones: | |
a0d0e21e LW |
1807 | |
1808 | use subs 'chdir', 'chroot', 'chmod', 'chown'; | |
1809 | chdir $somewhere; | |
1810 | sub chdir { ... } | |
1811 | ||
19799a22 GS |
1812 | To unambiguously refer to the built-in form, precede the |
1813 | built-in name with the special package qualifier C<CORE::>. For example, | |
1814 | saying C<CORE::open()> always refers to the built-in C<open()>, even | |
fb73857a | 1815 | if the current package has imported some other subroutine called |
19799a22 | 1816 | C<&open()> from elsewhere. Even though it looks like a regular |
4aaa4757 FC |
1817 | function call, it isn't: the CORE:: prefix in that case is part of Perl's |
1818 | syntax, and works for any keyword, regardless of what is in the CORE | |
1819 | package. Taking a reference to it, that is, C<\&CORE::open>, only works | |
1820 | for some keywords. See L<CORE>. | |
fb73857a | 1821 | |
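Putting C<use subs> and the C<CORE::> qualifier together, a package-local override might look like the sketch below. The package name C<Frozen> and the fixed return value are purely illustrative.

```perl
package Frozen;          # hypothetical package name
use strict;
use warnings;

use subs 'time';         # predeclare via import, so the name can override

sub time { return 42 }   # replacement, visible only in package Frozen

my $fake = time;         # calls Frozen::time
my $real = CORE::time;   # always the built-in

print "$fake\n";         # prints "42"
print "built-in differs\n" if $real != $fake;
```

Code in other packages still sees the built-in C<time>, since the override is confined to the package that requested it.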
19799a22 GS |
1822 | Library modules should not in general export built-in names like C<open> |
1823 | or C<chdir> as part of their default C<@EXPORT> list, because these may | |
a0d0e21e | 1824 | sneak into someone else's namespace and change the semantics unexpectedly. |
19799a22 | 1825 | Instead, if the module adds that name to C<@EXPORT_OK>, then it's |
a0d0e21e LW |
1826 | possible for a user to import the name explicitly, but not implicitly. |
1827 | That is, they could say | |
1828 | ||
1829 | use Module 'open'; | |
1830 | ||
19799a22 | 1831 | and it would import the C<open> override. But if they said |
a0d0e21e LW |
1832 | |
1833 | use Module; | |
1834 | ||
19799a22 | 1835 | they would get the default imports without overrides. |
a0d0e21e | 1836 | |
19799a22 | 1837 | The foregoing mechanism for overriding built-ins is restricted, quite |
95d94a4f | 1838 | deliberately, to the package that requests the import. There is a second |
19799a22 | 1839 | method that is sometimes applicable when you wish to override a built-in |
95d94a4f GS |
1840 | everywhere, without regard to namespace boundaries. This is achieved by |
1841 | importing a sub into the special namespace C<CORE::GLOBAL::>. Here is an | |
1842 | example that quite brazenly replaces the C<glob> operator with something | |
1843 | that understands regular expressions. | |
1844 | ||
1845 | package REGlob; | |
1846 | require Exporter; | |
1847 | @ISA = 'Exporter'; | |
1848 | @EXPORT_OK = 'glob'; | |
1849 | ||
1850 | sub import { | |
1851 | my $pkg = shift; | |
1852 | return unless @_; | |
1853 | my $sym = shift; | |
1854 | my $where = ($sym =~ s/^GLOBAL_// ? 'CORE::GLOBAL' : caller(0)); | |
1855 | $pkg->export($where, $sym, @_); | |
1856 | } | |
1857 | ||
1858 | sub glob { | |
1859 | my $pat = shift; | |
1860 | my @got; | |
7b815c67 RGS |
1861 | if (opendir my $d, '.') { |
1862 | @got = grep /$pat/, readdir $d; | |
1863 | closedir $d; | |
19799a22 GS |
1864 | } |
1865 | return @got; | |
95d94a4f GS |
1866 | } |
1867 | 1; | |
1868 | ||
1869 | And here's how it could be (ab)used: | |
1870 | ||
1871 | #use REGlob 'GLOBAL_glob'; # override glob() in ALL namespaces | |
1872 | package Foo; | |
1873 | use REGlob 'glob'; # override glob() in Foo:: only | |
1874 | print for <^[a-z_]+\.pm\$>; # show all pragmatic modules | |
1875 | ||
19799a22 | 1876 | The initial comment shows a contrived, even dangerous example. |
95d94a4f | 1877 | By overriding C<glob> globally, you would be forcing the new (and |
19799a22 | 1878 | subversive) behavior for the C<glob> operator for I<every> namespace, |
95d94a4f GS |
1879 | without the complete cognizance or cooperation of the modules that own |
1880 | those namespaces. Naturally, this should be done with extreme caution--if | |
1881 | it must be done at all. | |
1882 | ||
1883 | The C<REGlob> example above does not implement all the support needed to | |
19799a22 | 1884 | cleanly override perl's C<glob> operator. The built-in C<glob> has |
95d94a4f | 1885 | different behaviors depending on whether it appears in a scalar or list |
19799a22 | 1886 | context, but our C<REGlob> doesn't. Indeed, many perl built-ins have such |
95d94a4f GS |
1887 | context sensitive behaviors, and these must be adequately supported by |
1888 | a properly written override. For a fully functional example of overriding | |
1889 | C<glob>, study the implementation of C<File::DosGlob> in the standard | |
1890 | library. | |
1891 | ||
77bc9082 RGS |
1892 | When you override a built-in, your replacement should be consistent (if |
1893 | possible) with the built-in native syntax. You can achieve this by using | |
1894 | a suitable prototype. To get the prototype of an overridable built-in, | |
1895 | use the C<prototype> function with an argument of C<"CORE::builtin_name"> | |
1896 | (see L<perlfunc/prototype>). | |
1897 | ||
1898 | Note however that some built-ins can't have their syntax expressed by a | |
1899 | prototype (such as C<system> or C<chomp>). If you override them you won't | |
1900 | be able to fully mimic their original syntax. | |
1901 | ||
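A quick way to see which case a given built-in falls into is to ask C<prototype> directly. This sketch only prints what the running perl reports; the exact prototype strings can differ between Perl versions, so none are assumed here.

```perl
use strict;
use warnings;

my %proto;
for my $name (qw(push lc system)) {
    # Returns a prototype string, or undef when the built-in's syntax
    # cannot be expressed as a prototype.
    $proto{$name} = prototype("CORE::$name");
    printf "%-6s %s\n", $name,
        defined $proto{$name} ? "($proto{$name})" : "no prototype";
}
```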
fe854a6f | 1902 | The built-ins C<do>, C<require> and C<glob> can also be overridden, but due |
77bc9082 RGS |
1903 | to special magic, their original syntax is preserved, and you don't have |
1904 | to define a prototype for their replacements. (You can't override the | |
1905 | C<do BLOCK> syntax, though). | |
1906 | ||
1907 | C<require> has special additional dark magic: if you invoke your | |
1908 | C<require> replacement as C<require Foo::Bar>, it will actually receive | |
1909 | the argument C<"Foo/Bar.pm"> in C<@_>. See L<perlfunc/require>. | |
1910 | ||
1911 | And, as you'll have noticed from the previous example, if you override | |
593b9c14 | 1912 | C<glob>, the C<< <*> >> glob operator is overridden as well. |
77bc9082 | 1913 | |
9b3023bc | 1914 | In a similar fashion, overriding the C<readline> function also overrides |
b77865f5 | 1915 | the equivalent I/O operator C<< <FILEHANDLE> >>. Likewise, overriding |
e3f73d4e | 1916 | C<readpipe> overrides the operators C<``> and C<qx//>. |
9b3023bc | 1917 | |
fe854a6f | 1918 | Finally, some built-ins (e.g. C<exists> or C<grep>) can't be overridden. |
77bc9082 | 1919 | |
a0d0e21e | 1920 | =head2 Autoloading |
d74e8afc | 1921 | X<autoloading> X<AUTOLOAD> |
a0d0e21e | 1922 | |
19799a22 GS |
1923 | If you call a subroutine that is undefined, you would ordinarily |
1924 | get an immediate, fatal error complaining that the subroutine doesn't | |
1925 | exist. (Likewise for subroutines being used as methods, when the | |
1926 | method doesn't exist in any base class of the class's package.) | |
1927 | However, if an C<AUTOLOAD> subroutine is defined in the package or | |
1928 | packages used to locate the original subroutine, then that | |
1929 | C<AUTOLOAD> subroutine is called with the arguments that would have | |
1930 | been passed to the original subroutine. The fully qualified name | |
1931 | of the original subroutine magically appears in the global $AUTOLOAD | |
1932 | variable of the same package as the C<AUTOLOAD> routine. The name | |
1933 | is not passed as an ordinary argument because, er, well, just | |
593b9c14 | 1934 | because, that's why. (As an exception, a method call to a nonexistent |
80ee23cd | 1935 | C<import> or C<unimport> method is just skipped instead. Also, if |
5b36e945 FC |
1936 | the AUTOLOAD subroutine is an XSUB, there are other ways to retrieve the |
1937 | subroutine name. See L<perlguts/Autoloading with XSUBs> for details.) | |
80ee23cd | 1938 | |
19799a22 GS |
1939 | |
1940 | Many C<AUTOLOAD> routines load in a definition for the requested | |
1941 | subroutine using eval(), then execute that subroutine using a special | |
1942 | form of goto() that erases the stack frame of the C<AUTOLOAD> routine | |
1943 | without a trace. (See the source to the standard module documented | |
1944 | in L<AutoLoader>, for example.) But an C<AUTOLOAD> routine can | |
1945 | also just emulate the routine and never define it. For example, | |
1946 | let's pretend that a function that wasn't defined should just invoke | |
1947 | C<system> with those arguments. All you'd do is: | |
cb1a09d0 AD |
1948 | |
1949 | sub AUTOLOAD { | |
33666205 EK |
1950 | our $AUTOLOAD; # keep 'use strict' happy |
1951 | my $program = $AUTOLOAD; | |
1952 | $program =~ s/.*:://; | |
1953 | system($program, @_); | |
54310121 | 1954 | } |
cb1a09d0 | 1955 | date(); |
33666205 | 1956 | who(); |
cb1a09d0 AD |
1957 | ls('-l'); |
1958 | ||
19799a22 GS |
1959 | In fact, if you predeclare functions you want to call that way, you don't |
1960 | even need parentheses: | |
cb1a09d0 AD |
1961 | |
1962 | use subs qw(date who ls); | |
1963 | date; | |
33666205 | 1964 | who; |
593b9c14 | 1965 | ls '-l'; |
cb1a09d0 | 1966 | |
13058d67 | 1967 | A more complete example of this is the Shell module on CPAN, which |
19799a22 | 1968 | can treat undefined subroutine calls as calls to external programs. |
a0d0e21e | 1969 | |
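The define-then-C<goto> pattern mentioned earlier in this section can be sketched as follows. This is not how AutoLoader itself works: instead of loading code with C<eval>, the sketch installs a closure into the symbol table, and the C<Counter> class with its hash-key accessors is invented for the example.

```perl
package Counter;         # hypothetical class
use strict;
use warnings;

our $AUTOLOAD;

sub new { return bless { hits => 0, misses => 0 }, shift }

sub AUTOLOAD {
    my $name = $AUTOLOAD;
    $name =~ s/.*:://;
    return if $name eq 'DESTROY';            # never autoload the destructor
    die "No method $name"
        unless ref $_[0] && exists $_[0]{$name};

    no strict 'refs';
    # Define the accessor once, so later calls bypass AUTOLOAD entirely.
    *{$AUTOLOAD} = sub {
        my $self = shift;
        $self->{$name} = shift if @_;
        return $self->{$name};
    };
    goto &{$AUTOLOAD};                       # re-dispatch with the same @_
}

package main;

my $c = Counter->new;
$c->hits(3);                                 # first call goes through AUTOLOAD
my $hits = $c->hits;                         # now an ordinary method call
print "$hits\n";                             # prints "3"
print Counter->can('hits') ? "installed\n" : "missing\n";   # prints "installed"
```

Because of the C<goto>, the newly defined subroutine sees the caller's frame rather than that of C<AUTOLOAD>, so functions such as C<caller> behave as if it had been called directly.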
19799a22 GS |
1970 | Mechanisms are available to help module writers split their modules | |
1971 | into autoloadable files. See the standard AutoLoader module | |
6d28dffb | 1972 | described in L<AutoLoader> and in L<AutoSplit>, the standard |
1973 | SelfLoader module in L<SelfLoader>, and the document on adding C | |
19799a22 | 1974 | functions to Perl code in L<perlxs>. |
cb1a09d0 | 1975 | |
09bef843 | 1976 | =head2 Subroutine Attributes |
d74e8afc | 1977 | X<attribute> X<subroutine, attribute> X<attrs> |
09bef843 SB |
1978 | |
1979 | A subroutine declaration or definition may have a list of attributes | |
1980 | associated with it. If such an attribute list is present, it is | |
0120eecf | 1981 | broken up at space or colon boundaries and treated as though a |
09bef843 SB |
1982 | C<use attributes> had been seen. See L<attributes> for details |
1983 | about what attributes are currently supported. | |
1984 | Unlike the limitation with the obsolescent C<use attrs>, the | |
1985 | C<sub : ATTRLIST> syntax works to associate the attributes with | |
1986 | a pre-declaration, and not just with a subroutine definition. | |
1987 | ||
1988 | The attributes must be valid as simple identifier names (without any | |
1989 | punctuation other than the '_' character). They may have a parameter | |
1990 | list appended, which is only checked for whether its parentheses ('(',')') | |
1991 | nest properly. | |
1992 | ||
1993 | Examples of valid syntax (even though the attributes are unknown): | |
1994 | ||
4358a253 SS |
1995 | sub fnord (&\%) : switch(10,foo(7,3)) : expensive; |
1996 | sub plugh () : Ugly('\(") :Bad; | |
09bef843 SB |
1997 | sub xyzzy : _5x5 { ... } |
1998 | ||
1999 | Examples of invalid syntax: | |
2000 | ||
4358a253 SS |
2001 | sub fnord : switch(10,foo(); # ()-string not balanced |
2002 | sub snoid : Ugly('('); # ()-string not balanced | |
2003 | sub xyzzy : 5x5; # "5x5" not a valid identifier | |
2004 | sub plugh : Y2::north; # "Y2::north" not a simple identifier | |
2005 | sub snurt : foo + bar; # "+" not a colon or space | |
09bef843 SB |
2006 | |
2007 | The attribute list is passed as a list of constant strings to the code | |
2008 | which associates them with the subroutine. In particular, the second example | |
2009 | of valid syntax above currently looks like this in terms of how it's | |
2010 | parsed and invoked: | |
2011 | ||
2012 | use attributes __PACKAGE__, \&plugh, q[Ugly('\(")], 'Bad'; | |
2013 | ||
2014 | For further details on attribute lists and their manipulation, | |
a0ae32d3 | 2015 | see L<attributes> and L<Attribute::Handlers>. |
09bef843 | 2016 | |
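To see the attribute strings arrive as described, a package can define a C<MODIFY_CODE_ATTRIBUTES> handler, which L<attributes> calls for each subroutine in that package that carries attributes it does not recognise as built in. The following is a rough sketch: the package name C<Demo>, the C<@seen> array, and the attribute names C<Switch> and C<Expensive> are all made up for illustration.

```perl
use strict;
use warnings;

package Demo;

our @seen;    # hypothetical store for the attribute strings we receive

# Called at compile time for each sub in this package that has attributes.
# Returning an empty list tells attributes.pm that all of them are accepted.
sub MODIFY_CODE_ATTRIBUTES {
    my ($package, $coderef, @attrs) = @_;
    push @seen, @attrs;
    return;
}

# The parameter list is only checked for balanced parentheses.
sub fnord : Switch(10,foo(7,3)) : Expensive { return 42 }

package main;

print "$_\n" for @Demo::seen;
```

Each attribute, including its parameter list, reaches the handler as a single uninterpreted string; giving the parameters a meaning is left to the handler.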
cb1a09d0 | 2017 | =head1 SEE ALSO |
a0d0e21e | 2018 | |
19799a22 GS |
2019 | See L<perlref/"Function Templates"> for more about references and closures. |
2020 | See L<perlxs> if you'd like to learn about calling C subroutines from Perl. | |
a2293a43 | 2021 | See L<perlembed> if you'd like to learn about calling Perl subroutines from C. |
19799a22 GS |
2022 | See L<perlmod> to learn about bundling up your functions in separate files. |
2023 | See L<perlmodlib> to learn what library modules come standard on your system. | |
82e1c0d9 | 2024 | See L<perlootut> to learn how to make object method calls. |